There's No Fire Alarm for Artificial General Intelligence



What is the function of a fire alarm?

One might think that the function of a fire alarm is to provide you with important evidence about a fire existing, allowing you to change your policy accordingly and exit the building.

In the classic experiment by Latane and Darley in 1968, eight groups of three students each were asked to fill out a questionnaire in a room that shortly afterward began filling up with smoke. Five out of the eight groups didn't react or report the smoke, even as it became dense enough to make them start coughing. Subsequent manipulations showed that a lone student will respond 75% of the time, while a student accompanied by two actors instructed to feign apathy will respond only 10% of the time. This experiment and others seemed to pin down that what's happening is pluralistic ignorance: We don't want to look panicky by being afraid of what isn't an emergency, so we try to look calm while glancing out of the corners of our eyes to see how others are reacting, but of course they are also trying to look calm.

(I’ve read a number of replications and variations on this research, and the effect size is blatant. I would not expect this to be one of the results that dies to the replication crisis, and I haven’t yet heard about the replication crisis touching it. But we have to put a maybe-not marker on everything now.)

A fire alarm creates common knowledge, in the you-know-I-know sense, that there is a fire; after which it is socially safe to react. When the fire alarm goes off, you know that everyone else knows there is a fire, you know you won’t lose face if you proceed to exit the building.

The fire alarm doesn't tell us with certainty that a fire is there. In fact, I can't recall one time in my life when, exiting a building on a fire alarm, there was an actual fire. Really, a fire alarm is weaker evidence of fire than smoke coming from under a door.

But the fire alarm tells us that it’s socially okay to react to the fire. It promises us with certainty that we won’t be embarrassed if we now proceed to exit in an orderly fashion.

It seems to me that this is a case of people holding mistaken beliefs about their own beliefs, like when somebody will loudly endorse their city's team to win the big game but back down when asked to bet. They haven't consciously distinguished the rewarding exhilaration of shouting that the team will win from the feeling of anticipating that the team will win.

When people look at the smoke coming from under the door, I think they think their uncertain wobbling feeling comes from not assigning the fire a high-enough probability of really being there, and that they’re reluctant to act for fear of wasting effort and time. If so, I think they’re interpreting their own feelings mistakenly. If that was so, they’d get the same wobbly feeling on hearing the fire alarm, or even more so, because fire alarms correlate to fire less than does smoke coming from under a door. The uncertain wobbling feeling comes from the worry that others believe differently, not the worry that the fire isn’t there. The reluctance to act is the reluctance to be seen looking foolish, not the reluctance to waste effort. That’s why the student alone in the room does something about the fire 75% of the time, and why people have no trouble reacting to the much weaker evidence presented by fire alarms.


It's now and then proposed that we ought to react to the problem of artificial general intelligence only later (background here), because, it is said, we are so far away from it that it just isn't possible to do productive work on it today.

(For direct argument about there being things doable today, see: Soares and Fallenstein (2014/2017); Amodei, Olah, Steinhardt, Christiano, Schulman, and Mané (2016); or Taylor, Yudkowsky, LaVictoire, and Critch (2016).)

(If none of those papers existed or if you were an AI researcher who’d read them but thought they were all garbage, and you wished you could work on alignment but knew of nothing you could do, the wise next step would be to sit down and spend two hours by the clock sincerely trying to think of possible approaches. Preferably without self-sabotage that makes sure you don’t come up with anything plausible; as might happen if, hypothetically speaking, you would actually find it much more comfortable to believe there was nothing you ought to be working on today, because e.g. then you could work on other things that interested you more.)

(But never mind.)

So if AGI seems far-ish away, and you think the conclusion licensed by this is that you can't do any productive work on AGI alignment yet, then the implicit alternative strategy on offer is: Wait for some unspecified future event that tells us AGI is coming near; and then we'll all know that it's okay to start working on AGI alignment.

This seems to me to be wrong on a number of grounds. Here are some of them.

One: As Stuart Russell observed, if you get radio signals from space and spot a spaceship there with your telescopes, and you know the aliens are landing in thirty years, you still start thinking about that today.

You're not like, "Meh, that's thirty years off, whatever." You certainly don't casually say, "Well, there's nothing we can do until they're closer." Not without spending two hours, or at least five minutes by the clock, brainstorming about whether there is anything you ought to be starting now.

If you said the aliens were coming in thirty years, and you therefore proposed doing nothing today… well, if these were more effective times, somebody would ask for a schedule of what you thought ought to be done, starting when, how long before the aliens arrive. If you didn't have that schedule ready, they'd know that you weren't operating according to a worked table of timed responses, but just procrastinating and doing nothing; and they'd correctly infer that you probably hadn't searched very hard for things that could be done today.

In Bryan Caplan's terms, anyone who seems quite casual about the fact that "nothing can be done now to prepare" is missing a mood; they should be much more alarmed at not being able to think of any way to prepare. And maybe ask if somebody else has come up with any ideas? But never mind.

Two: History shows that for the general public, and even for scientists not in the key inner circle, and even for scientists in that key circle, it is very often the case that key technological developments still seem decades away, five years before they show up.

In 1901, two years before helping to build the first heavier-than-air flyer, Wilbur Wright told his brother that powered flight was fifty years away.

In 1939, three years before he personally oversaw the first critical chain reaction in a pile of uranium bricks, Enrico Fermi voiced 90% confidence that it was impossible to use uranium to sustain a fission chain reaction. I believe Fermi also said, a year after that, that if net power from fission was even possible (as he then granted some greater plausibility) then it would be fifty years off; but for this I neglected to keep the citation.

And of course, if you're not the Wright Brothers or Enrico Fermi, you will be even more surprised. Most of the world learned that atomic weapons were now a thing when they woke up to the headlines about Hiroshima. There were esteemed intellectuals saying, four years after the Wright Flyer, that heavier-than-air flight was impossible, because knowledge propagated more slowly back then.

Were there events that, in hindsight, we can see today as signs that heavier-than-air flight or nuclear energy were nearing? Sure, but if you go back and read the actual newspapers from that time and see what people actually said about it then, you'll see that they did not know that these were signs, or that they were very uncertain that these might be signs. Some playing the part of Excited Futurists proclaimed that big changes were imminent, I expect, and others playing the part of Sober Scientists tried to pour cold water on all that childish enthusiasm; I expect that part was more or less exactly the same decades earlier. If somewhere in that din was a superforecaster who said "decades" when it was decades and "5 years" when it was five, good luck noticing them amid all the noise. More likely, the superforecasters were the ones who said "Could be tomorrow, could be decades" both when the big development was a day away and when it was decades away.

One of the major modes by which hindsight bias makes us feel that the past was more predictable than anyone was actually able to predict at the time is that, in hindsight, we know what we ought to have noticed, and we fixate on only one possible interpretation of what each piece of evidence indicated. If you look at what people actually said at the time, they usually had no clue what was about to happen three months before it happened, because they didn't know which signs were which.

I mean, you can say the words "AGI is 50 years away" and have those words happen to be true. People were also saying that powered flight was decades away when it was in fact decades away, and those people happened to be right. The problem is that everything looks the same to you either way, if you are actually living history instead of reading about it afterwards.

It’s not that whenever somebody says “fifty years” the thing always happens in two years. It’s that this confident prediction of things being far away corresponds to an epistemic state about the technology that feels the same way internally until you are very very close to the big development. It’s the epistemic state of “Well, I don’t see how to do the thing” and sometimes you say that fifty years off from the big development, and sometimes you say it two years away, and sometimes you say it while the Wright Flyer is flying somewhere out of your sight.

Three: Progress is driven by peak knowledge, not average knowledge.

If Fermi and the Wrights couldn’t see it coming three years out, imagine how hard it must be for anyone else to see it.

If you're not at the global peak of knowledge of how to do the thing, and looped in on all the progress being made at what will turn out to be the leading project, you aren't going to be able to see of your own knowledge at all that the big development is imminent. Unless you are very good at perspective-taking in a way that wasn't necessary in a hunter-gatherer tribe, and very good at realizing that other people may know techniques and ideas of which you have no inkling even that you do not know them. If you don't consciously compensate for the lessons of history in this regard, then you will promptly say the decades-off thing. Fermi wasn't still thinking that net nuclear energy was impossible or decades away by the time he got to three months before he built the first pile, because at that point Fermi was looped in on everything and saw how to do it. But anyone not looped in probably still felt like it was fifty years away while the actual pile was fizzing away in a squash court at the University of Chicago.

People don't seem to automatically compensate for the fact that the timing of the big development is a function of the peak knowledge in the field, a threshold touched by the people who know the most and have the best ideas; while they themselves have average knowledge; and that therefore what they themselves know is not strong evidence about when the big development will happen. I think they aren't thinking about that at all, and they just eyeball it using their own sense of difficulty. If they are thinking anything more deliberate and reflective than that, and incorporating real work into correcting for the factors that might bias their lenses, they haven't bothered writing down their reasoning anywhere I can read it.

To know that AGI is decades away, we would need enough understanding of AGI to know which puzzle pieces are missing, and how hard those pieces are to obtain; and that insight is unlikely to be available until the puzzle is complete. This is also to say that the puzzle will look more incomplete to anyone beyond the leading edge than it looks at that edge. The leading project might publish its theories before proving them out, though I'd hope not. But there are unproven theories now, too.

Again, this is not to say that people saying "fifty years" is a sign that something is happening in a squash court; they were saying "fifty years" sixty years ago, too. It's saying that anyone who thinks technological timelines are actually forecastable, in advance, by people who are not looped in to the leading project's progress reports and who don't share all the best ideas about exactly how to do the thing and how much effort is required for that, is learning the wrong lesson from history. In particular, from reading history books that neatly lay out lines of progress and the visible signs that we all know now were important and evidential. It's sometimes possible to say useful conditional things about the consequences of the big development, but it's rarely possible to make confident predictions about the timing of those developments, beyond a horizon of one or two years. And if you are one of the rare people who can call the timing, if people like that even exist, nobody else knows to pay attention to you and not to the Excited Futurists or Sober Skeptics.

Four: The future uses different tools, and can therefore easily do things that are very hard now, or do with difficulty things that are impossible now.

Why do we know that AGI is decades away? In popular articles penned by heads of AI research labs and the like, there are typically three prominent reasons given:

(a) The author does not know how to build AGI using present technology. The author does not know where to start.

(b) The author thinks it is really very hard to do the impressive things that modern AI technology does, they have to slave long hours over a hot GPU farm tweaking hyperparameters to get it done. They think that the public does not appreciate how hard it is to get anything done right now, and is panicking prematurely because the public thinks anyone can just fire up Tensorflow and build a robotic car.

(c) The author spends a lot of time interacting with AI systems and therefore is able to personally appreciate all the ways in which they are still stupid and lack common sense.

We've already considered some aspects of argument (a). Let's consider argument (b) for a moment.

Suppose I say: "It is now possible for one comp-sci grad to do in a week anything that N+ years ago the research community could do with neural networks at all." How big is N?

I got some answers to this on Twitter from people whose credentials I don't know, but the most common answer was five, which sounds about right to me based on my own acquaintance with machine learning. (Though obviously not as a literal universal, because reality is never that neat.) If you could do something in 2012, period, you can probably do it fairly straightforwardly with modern GPUs, Tensorflow, Xavier initialization, batch normalization, ReLUs, and Adam or RMSprop or just stochastic gradient descent with momentum. The modern techniques are just that much better. To be sure, there are things we can't do now with just those simple methods, things that require tons more work, but those things were not possible at all in 2012.
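To make the point about tools concrete, here is a minimal sketch (an editorial illustration, not part of the original essay) of what "fairly straightforwardly" looks like today: a small image classifier built from exactly the post-2012 toolbox named above, using the Keras API bundled with TensorFlow 2.x. The dataset and hyperparameters are arbitrary choices for illustration.

```python
# Minimal sketch: a small classifier using the post-2012 toolbox
# (ReLUs, batch normalization, Glorot/Xavier initialization, Adam).
# Assumes TensorFlow 2.x with its bundled Keras API.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    # Glorot (Xavier) initialization, once a research result, is now a default.
    tf.keras.layers.Dense(256, kernel_initializer="glorot_uniform"),
    tf.keras.layers.BatchNormalization(),  # stabilizes training
    tf.keras.layers.Activation("relu"),    # ReLU instead of sigmoid/tanh
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Adam, RMSprop, or SGD with momentum: each is a one-line swap here.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=3, batch_size=128,
          validation_data=(x_test, y_test))
```

The point is not this particular dataset, which was solvable well before 2012, but where the difficulty now lives: the pieces that once demanded careful hand-tuning (initialization scales, update rules, stabilization tricks) have become library defaults or one-line swaps.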

In machine learning, when you can do something at all, you are probably at most a few years away from being able to do it easily using the future’s much superior tools. From this standpoint, argument B, “You don’t understand how hard it is to do what we do,” is something of a non-sequitur when it comes to timing.

Statement B sounds to me like the same sentiment voiced by Rutherford in 1933, when he called net energy from atomic fission "moonshine". If you were a nuclear physicist in 1933, you had to split all your atoms by hand, by bombarding them with other particles, and it was a laborious business. If somebody talked about getting net energy from atoms, maybe it made you feel that you were unappreciated, that people thought your job was easy.

But of course this will always be the lived experience for AI engineers on serious frontier projects. You don't get paid big bucks to do what a grad student can do in a week (unless you're working for a bureaucracy with no clue about AI; but that's not Google or FB). Your personal experience will always be that what you are paid to spend months doing is difficult. A change in this personal experience is therefore not something you can use as a fire alarm.

Those playing the part of wiser sober skeptical scientists would obviously agree in the abstract that our tools will improve; but in the popular articles they pen, they just talk about the painstaking difficulty of this year's tools. I think that when they're in that mode they are not even trying to forecast what the tools will be like in 5 years; they haven't written down any such arguments as part of the articles I've read. I think that when they tell you that AGI is decades off, they are literally giving an estimate of how long it feels to them like it would take to build AGI using their current tools and knowledge. Which is why they emphasize how hard it is to stir the heap of linear algebra until it spits out good answers; I think they are not imagining, at all, into how this experience may change over considerably less than fifty years. If they've explicitly considered the bias of estimating future tech timelines based on their present subjective sense of difficulty, and tried to compensate for that bias, they haven't written that reasoning down anywhere I've read it. Nor have I ever heard of that forecasting method giving good results historically.

Five: Okay, let's be blunt here. I don't think most of the discourse about AGI being far away (or that it's near, for that matter) is being generated by models of future progress in machine learning. I don't think we're looking at wrong models; I think we're looking at no models.

I was once at a conference where there was a panel full of famous AI luminaries, and most of the luminaries were nodding and agreeing with each other that of course AGI was very far off, except for two famous AI luminaries who stayed quiet and let the others take the microphone.

I got up in Q&A and said: "Okay, you've all told us that progress won't be all that fast. But let's be more concrete and specific. I'd like to know what's the least impressive accomplishment that you are very confident cannot be done in the next two years."

There was a silence.

Eventually, two people on the panel ventured replies, spoken in a rather more tentative tone than they'd been using to pronounce that AGI was decades out. They named "a robot puts away the dishes from a dishwasher without breaking them," and Winograd schemas. (A Winograd schema is a sentence like "The trophy doesn't fit in the suitcase because it is too big," where figuring out what "it" refers to requires commonsense understanding rather than surface statistics.) Specifically: "I feel quite confident about the Winograd schemas, where we recently had a result in the 50, 60% range; in the next two years we will not get to 80, 90% on that, no matter what techniques people use."

A few months after that panel, there was unexpectedly a big breakthrough on Winograd schemas. The breakthrough didn't crack 80%, so three cheers for wide credibility intervals with error margin, but I expect the predictor might be feeling slightly more nervous now with one year left to go. (I don't think it was the breakthrough I remember reading about, but Rob turned up this paper as an example of one that could have been submitted at most 44 days after the above conference and gets up to 70%.)

But that's not the point. The point is the silence that fell after my question, and that eventually I got only two replies, spoken in tentative tones. When I asked for concrete feats that were impossible in the next two years, I think that that's when the luminaries on that panel switched to trying to build a mental model of future progress in machine learning, asking themselves what they could or couldn't predict, what they knew or didn't know. And to their credit, most of them did know their profession well enough to realize that forecasting future boundaries around a rapidly moving field is actually really hard, that nobody knows what will appear on arXiv next month, and that they needed to put wide credibility intervals on how much progress might take place after twenty-four more months' worth of arXiv papers.

(Also, Demis Hassabis was present, so they all knew that if they named something insufficiently impossible, Demis would have DeepMind go and do it.)

The question I asked was in a completely different genre from the panel discussion, requiring a mental context switch: the assembled luminaries actually had to try to consult their rough, scarce-formed intuitive models of progress in machine learning and figure out what future experiences, if any, their model of the field definitely prohibited within a two-year time horizon. Instead of, well, emitting socially desirable verbal behavior meant to kill that darned hype about AGI and get some predictable applause from the audience.

And I'll say it outright: I don't think that confident long-termism had been thought out at all. If your model has the extraordinary power to say what is impossible ten years from now, after another one hundred and twenty months of arXiv papers, then you should be able to say much weaker things that are impossible in two years, and you should have those predictions queued up and ready to go when asked, rather than falling into nervous silence.

In reality, the two-year problem is hard, and the ten-year problem is laughably hard. The future is hard to predict in general, our predictive grasp on a rapidly changing and advancing field of science and engineering is very weak indeed, and it does not permit narrow credible intervals on what cannot be done.

Grace et al. (2017) surveyed the predictions of 352 presenters at ICML and NIPS 2015. Respondents' aggregate forecast was that the proposition "all occupations are fully automatable" (in the sense that "for any occupation, machines could be built to carry out the task better and more cheaply than human workers") will not reach 50% probability until 121 years hence. Except that a randomized subset of respondents were instead asked the slightly different question of "when unaided machines can accomplish every task better and more cheaply than human workers", and in this case held that this was 50% likely to occur within 44 years.

This is what happens when you ask people for estimates they have no way of estimating, in a social context where there's a shared sense of what the desirable verbal behavior is.


When I observe that there’s no fire alarm for AGI, I’m not saying that there’s no possible equivalent of smoke appearing from under a door.

What I'm saying is that the smoke under the door will always be arguable; it is not going to be an obvious, undeniable, absolute sign of fire; and so there will never be a fire alarm producing common knowledge that action is now due and socially acceptable.

There’s an old trope saying that as soon as something is actually done, it ceases to be called AI. People who work in AI and are in a broad sense pro-accelerationist and techno-enthusiast, what you might call the Kurzweilian camp (of which I am not a member), will sometimes rail against this as unfairness in judgment, as moving goalposts.

This overlooks the real and important phenomenon of adverse selection against AI accomplishments: if you could do something impressive-sounding with AI in 1974, that's because the thing turned out to be doable in some cheap, cheaty way, not because 1974 was so amazingly great at AI. We are uncertain about how much cognitive effort it takes to perform tasks, and about how easy it is to cheat at them, and the first "impressive" tasks to be accomplished will be those where we were most wrong about how much effort was required. There was a time when some people thought that a computer winning the world chess championship would require progress in the direction of AGI, and that this would count as a sign of AGI drawing nearer. When Deep Blue beat Kasparov in 1997, we did learn something about AI progress in a Bayesian sense, but we also learned something about chess being easy. Given the technology used to construct Deep Blue, most of what we learned was "it is surprisingly possible to play chess with techniques that don't generalize well" and not so much "surprising progress has been made toward AGI."

Was AlphaGo smoke coming from under the door, a sign of AGI within ten years or less? People had previously given Go as an example of what you would see before the end.

Looking over the paper describing AlphaGo's architecture, it seemed to me that we were mostly learning that available AI techniques were likely to go further towards generality than expected, rather than that Go was surprisingly easy to achieve with fairly narrow and ad-hoc approaches. Not that the method scales to AGI, obviously; but AlphaGo did look like a product of relatively general insights and techniques being turned on the special case of Go, in a way that Deep Blue wasn't. I also updated significantly on "The general learning capabilities of the human cortical algorithm are less impressive, less difficult to capture with a ton of gradient descent and a zillion GPUs, than I thought," because if there were anywhere we expected an impressive hard-to-match highly-natural-selected but-still-general cortical algorithm to come into play, it would be in humans playing Go.

Maybe if we’d seen a thousand Earths undergoing similar events, we’d gather the statistics and find that a computer winning the planetary Go championship is a reliable ten-year-harbinger of AGI. But I don’t actually know that. Neither do you. Certainly, anyone can publicly argue that we just learned Go was easier to achieve with strictly narrow techniques than expected, as was true many times in the past. There’s no possible sign short of actual AGI, no case of smoke from under the door, for which we know that this is definitely serious fire and now AGI is 10, 5, or 2 years away. Let alone a sign where we know everyone else will believe it.

In any case, multiple leading scientists in machine learning have already published articles telling us their criterion for a fire alarm. They will believe artificial general intelligence is imminent:

(a) When they personally see how to construct AGI using their current tools. This is what they are always saying is not currently true in order to castigate the folly of those who think AGI might be near.

(b) When their personal jobs stop giving them a sense of everything being difficult. This, they say, is the key piece of knowledge that the ignorant laypeople who think AGI might be near don't have; the laypeople believe what they believe only because they've never stayed up until 2 a.m. trying to get a generative adversarial network to stabilize.

(c) When they are personally impressed by how smart an AI seems relative to a human, in the parts that still feel magical to them, as opposed to the parts they know how to engineer, which no longer seem magical; aka, the AI seeming pretty smart in interaction and conversation; aka, the AI actually being an AGI already.

So there isn’t going to be a fire alarm. Period.

There is never going to be a time before the end when you can look around nervously, and see that it is now clearly common knowledge that you can talk about AGI being imminent, and take action and exit the building in an orderly fashion, without fear of looking stupid or frightened.


So far as I can presently estimate, now that we’ve had AlphaGo and a couple of other maybe/maybe-not shots across the bow, and seen a huge explosion of effort invested into machine learning and an enormous flood of papers, we are probably going to occupy our present epistemic state until very near the end.

By saying we're probably going to be in roughly this epistemic state until almost the end, I don't mean to say that we know AGI is imminent, or that there won't be important new breakthroughs in AI in the intervening time. I mean that it's hard to guess how many further insights are needed for AGI, or how long it will take to reach those insights. After the next breakthrough, we still won't know how many more breakthroughs are needed, leaving us in pretty much the same epistemic state as before. Whatever discoveries and milestones come next, it will probably continue to be hard to guess how many further insights are needed, and timelines will continue to be similarly murky. Maybe researcher enthusiasm and funding will rise further, and we'll be able to say that timelines are shortening; or maybe we'll hit another AI winter, and we'll know that's a sign that things will take longer than they otherwise would; but we still won't know how long.

At some point we might see a sudden flood of arXiv papers in which genuinely interesting, fundamental, and scary cognitive challenges seem to be getting done at an increasing pace. As the flood accelerates, even some of those who think of themselves as sober and skeptical may be unnerved to the point of venturing that maybe AGI is only fifteen years away now, maybe, possibly. The signs might become so blatant, very shortly before the end, that it starts to be socially acceptable to suggest that AGI might be only ten years off. Though the signs would have to be pretty darned blatant, if they're to overcome the social barrier posed by luminaries who are estimating arrival times to AGI using their personal knowledge and personal difficulties, as well as all the historical bad feelings about AI winters caused by hype.

But even if it becomes socially acceptable to say that AGI is 15 years out, in those last couple of years or months, I would still expect there to be disagreement. There will still be others protesting that, as much as associative memory and human-equivalent cerebellar coordination (or whatever) are now solved problems, they still don’t know how to construct AGI. They will note that there are no AIs writing computer science papers, or holding a truly sensible conversation with a human, and castigate the senseless alarmism of those who talk as if we already knew how to do that. They will explain that foolish laypeople don’t realize how much pain and tweaking it takes to get the current systems to work. (Although those modern methods can easily do almost anything that was possible in 2017, and any grad student knows how to roll a stable GAN on the first try using the tf.unsupervised module in Tensorflow 5.3.1.)

When all the pieces are ready and in place, lacking only the last piece to be assembled by the very peak of knowledge and creativity across the whole world, it will still seem to the average ML person that AGI is an enormous challenge looming in the distance, because they still won't personally know how to construct an AGI system. Prestigious heads of major AI research groups will still be writing articles decrying the folly of fretting about the total destruction of all Earthly life and all future value it could have achieved, and saying that we should not let this distract us from real, respectable concerns like loan-approval systems accidentally absorbing human biases.

Of course, the future is very hard to predict in detail. It's so hard that not only do I confess my own inability, I make the far stronger positive statement that nobody else can do it either. The "flood of groundbreaking arXiv papers" scenario is one way things could possibly go, but it's an implausibly specific scenario that I made up for the sake of concreteness. It's certainly not based on my extensive experience watching other Earthlike civilizations develop AGI. I do put a significant chunk of probability mass on "there's not much sign visible outside a Manhattan Project until Hiroshima," because that scenario is simple. Anything more complex is just one more story full of burdensome details that aren't likely to all be true.

But however the details play out, I do predict, in a very general sense, that there will be no fire alarm that is not an actual running AGI: no unmistakable sign beforehand that everyone knows and agrees on, one that lets people act without feeling nervous about whether they're worrying too early. That's just not how the history of technology has usually played out in much simpler cases like flight and nuclear engineering, let alone a case like this one where all the signs and models are disputed. We already know enough about the uncertainty and low quality of discussion surrounding this topic to be able to say with confidence that there will be no unarguable socially accepted sign of AGI arriving 10 years, 5 years, or 2 years beforehand. If there's any general social panic it will be by coincidence, based on terrible reasoning, uncorrelated with real timelines except by total coincidence, set off by a Hollywood movie, and focused on relatively trivial dangers.

It is no coincidence that nobody has given any actual description of such a fire alarm, and argued convincingly about how much time it would mean we had left, and which projects we should start then and not before. If someone did write up that proposal, the next person to write one would say something completely different. And probably neither of them will succeed at convincing me that they know anything prophetic about timelines, or that they've identified any sensible angle of attack that is (a) worth pursuing at all and (b) not worth starting to work on right now.


Deferring all action until a wholly unspecified future alarm goes off seems to me to imply an order of recklessness great enough that the Law of Continued Failure comes into play.

The Law of Continued Failure is the rule that says that if your country is incompetent enough to use plaintext nine-numeric-digit passwords on all of your bank accounts and credit applications, your country is not competent enough to correct course after the next disaster in which a hundred million passwords are revealed. A civilization competent enough to correct course in response to that prod, to react to it the way you'd want it to react, is competent enough not to make the mistake in the first place. When a system fails massively and obviously, rather than subtly and at the very edges of competence, the next prod is not going to cause the system to suddenly snap into doing things intelligently.

The law of continued failure is especially important to keep in mind when you are dealing with big powerful systems or high-status people that you might feel nervous about derogating, because you may be tempted to say, “Well, it’s flawed now, but as soon as a future prod comes along, everything will snap into place and everything will be all right.” The systems about which this fond hope is actually warranted look like they are mostly doing all the important things right already, and only failing in one or two steps of cognition. The fond hope is almost never warranted when a person or organization or government or social subsystem is currently falling massively short.

The folly required to ignore the prospect of aliens landing in thirty years is already great enough that the other flawed elements of the debate should come as no surprise.

And since all of these problems are visible today, we should predict that the same systems and incentives will not produce correct outputs after receiving some uncertain sign that the aliens might land in five years. The Law of Continued Failure says that if existing authorities are presently failing in enough different ways at once that they think it makes sense to respond by saying the real problem is the safety of self-driving cars, the default expectation is that they will still be saying silly things later.

People who are making large numbers of simultaneous mistakes don't usually have all of their incorrect thoughts conveniently labeled "incorrect" in their own minds. Even given an incentive, they can't suddenly flip to skillfully executing all-correct reasoning steps instead. Yes, we have various experiments showing that monetary incentives can reduce overconfidence and political bias, but (a) that's reduction rather than elimination, (b) it's with extremely clear short-term direct incentives, not the nebulous and politicizable incentive of "a lot being at stake", and (c) that doesn't mean a switch is flipping all the way to "carry out complicated correct reasoning". If someone's brain contains a switch that can flip to enable complicated correct reasoning at all, it's got enough internal precision and skill to think mostly-correct thoughts now instead of later; at least to the degree that some conservatism and double-checking gets built into examining the conclusions that people know will get them killed if they're wrong about them.

There is no sign, no portent, no threshold crossed, that suddenly causes people to wake up and start doing things systematically correctly. People who can react that competently to any sign at all, let alone a less-than-perfectly-certain, not-totally-agreed item of evidence that might be a wake-up call, have probably already done the things that such a sign would supposedly prompt. They've already thought ahead to the future sign and gone on to think the sensible thoughts earlier, just as Stuart Russell said: "If you know the aliens are landing in thirty years, it's still a big deal now."


Back in the funding-starved early days of what is now MIRI, I learned that people who donated last year were likely to donate this year, and people who last year were planning to donate "next year" would quite often this year be planning to donate "next year". Of course there were genuine transitions from zero to one; everything that happens needs to happen for a first time. There were college students who said "later" and gave nothing for a long time in a genuinely strategically wise way, and went on to get nice jobs and start donating. But I also learned well that, like many cheap and easy solaces, saying the word "later" is addictive; and that this luxury is available to the rich as well as the poor.

I don't expect it to be any different with AGI alignment work. The people who are trying to get a grip on the alignment problem now will have a much better grip next year, partly because of everything they grasped the year before (plus, yes, whatever general field progress happened in the meantime, which can help a little or a lot). The people who want to defer that work until later, until there's a better understanding of AI and AGI, will, after the next year's worth of advancement in AI and AGI, want to defer work until a still better future understanding of AI and AGI.

Some people really do want alignment to get done, and so are now trying to wrap their brains around how to get something like a reinforcement learner to reliably identify particular elements of a causal model of its environment rather than a sensory reward term, or how to defeat the seeming tautologicalness of updated (non-)deference. Other people would rather be working on other things, and so they will declare that there is no work that can possibly be done today, without spending two hours quietly thinking it over first before making that declaration. Tomorrow won't be any different, unless tomorrow is the day we wake up to some interesting newspaper headlines, and maybe not even then. The luxury of saying "later" is not restricted to those whose choice of "later" is genuinely wise.

After a while, I started telling effective altruists in college: "If you're planning to earn to give later, then right now, donate around $5 every three months. And never give exactly the same amount twice in a row, or to the same organization twice in a row, so that you practice the mental habit of re-evaluating causes and re-evaluating donation amounts on a regular schedule. Don't practice the mental habit of always saying 'later'."

Similarly, if somebody were actually going to work on AGI alignment "later", I'd tell them to spend a couple of hours every six months devising the best current scheme they can for aligning AGI and doing useful work on that scenario, supposing that the AGI they had to work with was a technology resembling current technology. At least in the sense of posting it to Facebook, publishing their least-bad ideas; so that they would feel the potential embarrassment of naming a scheme that didn't look like somebody had spent two hours trying to think of the best bad approach.

We will understand AI better in the future, and what we learn may indeed give us more confidence that particular research approaches will be relevant to AGI. There may be further sociological developments akin to Nick Bostrom publishing Superintelligence, Elon Musk tweeting about it and thereby heaving a rock through the Overton window, or more respectable luminaries like Stuart Russell openly coming on board. The future will hold more AlphaGo-like events to publicly and privately highlight new ground-level advances in ML technique; and it may somehow be that this does not leave us in the same epistemic state as having already seen AlphaGo and GANs and the like. It could happen! I can't see exactly how, but the future does have the capacity to pull surprises in that regard.

But before waiting on that surprise, you should ask whether what you feel about AGI timelines is really uncertainty at all. If it feels to you that guessing AGI might have a 50% probability in N years is not enough knowledge to act upon, if that feels scarily uncertain and you want to wait for more evidence before making any decisions… then ask yourself how you'd feel if you believed the probability was 50% in N years, and everyone else on Earth also believed it was 50% in N years, and everyone believed it was right and proper to carry out policy P when AGI has a 50% probability of arriving in N years. If that visualization feels very different, then any nervous "uncertainty" you feel about doing P is not really about whether AGI takes much longer than N years to arrive.

And you are almost surely going to be stuck with that feeling of “uncertainty” no matter how close AGI gets; because no matter how close AGI gets, whatever signs appear will almost surely not produce common, shared, agreed-on public knowledge that AGI has a 50% chance of arriving in N years, nor any agreement that it is therefore right and proper to react by doing P.

And if all that did become common knowledge, then P is unlikely to still be a neglected intervention, or AI alignment a neglected issue; so you will have waited until sadly late to help.

But more likely the common knowledge just won't be there, and so it will always feel nervously "uncertain" to consider acting.

You can nonetheless act, or not act. Not act until it's too late to help much, in the best case; not act at all until after it's essentially over, in the average case.

I don’t think it’s wise to wait on an unspecified epistemic miracle to change how we feel. In all probability, you’re going to be in this mental state for a while—including any nervous-feeling “uncertainty”. If you handle this mental state by saying “later”, that general policy is not likely to have good results for Earth.


Further resources: