Friendly AI Research as Effective Altruism

||Analysis

MIRI was founded in 2000 on the premise1 that creating Friendly AI might be a way to do as much good as possible.

Some developments since then include:

  • The field of “effective altruism” — trying not just to do good but to do as much good as possible2 — is more widely promoted and better researched than ever before, especially by GiveWell, the Centre for Effective Altruism, philosopher Peter Singer, and the community at Less Wrong.3
  • In his recent PhD dissertation, Nick Beckstead has clarified the assumptions behind the claim that shaping the far future (e.g. via Friendly AI) is overwhelmingly important.
  • Due to research conducted by MIRI, the Future of Humanity Institute (FHI), and others, our strategic situation with regard to machine superintelligence is more clearly understood, and FHI’s Nick Bostrom has organized much of this work in a forthcoming book.4
  • MIRI’s Eliezer Yudkowsky has begun to describe in more detail which open research problems constitute “Friendly AI research,” in his view.

Given these developments, we are in a better position than ever before to evaluate the value of Friendly AI research as effective altruism.

Still, it is a difficult question. It is challenging enough to evaluate the cost-effectiveness of anti-malaria nets or direct cash transfers. Evaluating the cost-effectiveness of attempts to shape the far future (e.g. via Friendly AI) is even harder than that. Hence, this short post sketches an argument that can be given in favor of Friendly AI research as effective altruism, to enable future discussion, and is not intended as a thorough analysis.

Read more »


  1. In this post, I talk about the value of humanity in general creating Friendly AI, whereas MIRI co-founder Eliezer Yudkowsky usually talks about MIRI in particular — or at least, a functional equivalent — creating Friendly AI. This is because I am not as confident as Yudkowsky that it is best for MIRI to attempt to build Friendly AI. When updating MIRI’s bylaws in early 2013, Yudkowsky and I came to a compromise on the language of MIRI’s mission statement, which now reads: “[MIRI] exists to ensure that the creation of smarter-than-human intelligence has a positive impact. Thus, the charitable purpose of [MIRI] is to: (a) perform research relevant to ensuring that smarter-than-human intelligence has a positive impact; (b) raise awareness of this important issue; (c) advise researchers, leaders and laypeople around the world; and (d) as necessary, implement a smarter-than-human intelligence with humane, stable goals” (emphasis added). My own hope is that it will not be necessary for MIRI (or a functional equivalent) to attempt to build Friendly AI itself. But of course I must remain open to the possibility that this will be the wisest course of action as the first creation of AI draws nearer. There is also the question of capability: few people think that a non-profit research organization has much chance of being the first to build AI. I worry, however, that the world’s elites will not find it fashionable to take this problem seriously until the creation of AI is only a few decades away, at which time it will be especially difficult to develop the mathematics of Friendly AI in time, and humanity will be forced to take a gamble on its very survival with powerful AIs we have little reason to trust.
  2. One might think of effective altruism as the straightforward application of decision theory to the subject of philanthropy. Philanthropic agents of all kinds (individuals, groups, foundations, etc.) ask themselves: “How can we choose philanthropic acts (e.g. donations) which (in expectation) will do as much good as possible, given what we care about?” The consensus recommendation for all kinds of choices under uncertainty, including philanthropic choices, is to maximize expected utility (Chater & Oaksford 2012; Peterson 2004; Stein 1996; Schmidt 1998: 19); a minimal formalization of this criterion is sketched after these notes. Different philanthropic agents value different things, but decision theory suggests that each of them can get the most of what they want if they each maximize their expected utility. Choices which maximize expected utility are in this sense “optimal,” and thus another term for effective altruism is “optimal philanthropy.” Note that effective altruism in this sense is not quite the same as earlier approaches to philanthropy such as high-impact philanthropy (making “the biggest difference given the amount of capital invested”), strategic philanthropy, effective philanthropy, and wise philanthropy. Note also that effective altruism does not say that a philanthropic agent should specify complete utility and probability functions over outcomes and then compute the philanthropic act with the highest expected utility — that is impractical for bounded agents. We must keep in mind the distinction between normative, descriptive, and prescriptive models of decision-making (Baron 2007): “normative models tell us how to evaluate… decisions in terms of their departure from an ideal standard. Descriptive models specify what people in a particular culture actually do and how they deviate from the normative models. Prescriptive models are designs or inventions, whose purpose is to bring the results of actual thinking into closer conformity to the normative model.” The prescriptive question — about what bounded philanthropic agents should do to maximize expected utility with their philanthropic choices — tends to be extremely complicated, and is the subject of most of the research performed by the effective altruism community.
  3. See, for example: Efficient Charity, Efficient Charity: Do Unto Others, Politics as Charity, Heuristics and Biases in Charity, Public Choice and the Altruist’s Burden, On Charities and Linear Utility, Optimal Philanthropy for Human Beings, Purchase Fuzzies and Utilons Separately, Money: The Unit of Caring, Optimizing Fuzzies and Utilons: The Altruism Chip Jar, Efficient Philanthropy: Local vs. Global Approaches, The Effectiveness of Developing World Aid, Against Cryonics & For Cost-Effective Charity, Bayesian Adjustment Does Not Defeat Existential Risk Charity, How to Save the World, and What Is Optimal Philanthropy?
  4. I believe Beckstead and Bostrom have done the research community a great service by creating a framework, a shared language, for discussing trajectory changes, existential risk, and machine superintelligence. When discussing these topics with colleagues, it is common to spend the first hour of a conversation just trying to understand what the other person is saying — how they are using the terms and concepts they have adopted. The recent work by Beckstead and Bostrom should enable clearer, more efficient communication between researchers, and therefore greater research productivity. Though I am not aware of any controlled experimental studies on the effects of shared language on research productivity, a shared language is widely thought to be of great benefit to any field of research, and I will give a few examples of this point being made in print. Fuzzi et al. (2006): “The use of inconsistent terminology can easily lead to misunderstanding and confusion in the communication between experts from different [disciplines] of atmospheric and climate research, and may thus inhibit scientific progress.” Hinkel (2008): “Technical languages enable their users, e.g. the members of a scientific discipline, to communicate efficiently about the domain of interest.” Madin et al. (2007): “terminological ambiguity slows scientific progress, leads to redundant research efforts, and ultimately impedes advances towards a unified foundation for ecological science.”
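A minimal formalization of the expected-utility criterion discussed in note 2 (the notation is illustrative, not drawn from the cited sources): write $A$ for the set of philanthropic acts available to an agent, $O$ for the possible outcomes, $P$ for the agent’s beliefs, and $U$ for the agent’s utility function. Then

$$\mathrm{EU}(a) \;=\; \sum_{o \in O} P(o \mid a)\, U(o), \qquad a^{*} \;=\; \arg\max_{a \in A} \mathrm{EU}(a).$$

This is the normative standard; the prescriptive question raised at the end of note 2 is how a bounded agent can approximate $a^{*}$ without ever writing down $P$ and $U$ in full.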

MIRI’s May Newsletter: Intelligence Explosion Microeconomics and Other Publications

||Newsletters

Greetings from the Executive Director

Dear friends,

It’s been a busy month!

Mostly, we’ve been busy publishing things. As detailed below, Singularity Hypotheses has now been published, including four chapters by MIRI researchers or research associates. We also released two new technical reports, one on decision theory and another on intelligence explosion microeconomics, along with several new blog post analyses of various issues related to the future of AI. Finally, we added four older articles to the research page, including Ideal Advisor Theories and Personal CEV (2012).

In our April newsletter we mentioned our April 11th party in San Francisco, celebrating our relaunch as the Machine Intelligence Research Institute and our transition to mathematical research. Additional photos from the event are now available as a Facebook photo album. We also uploaded a video from the event, in which I spend 2 minutes explaining MIRI’s relaunch and some tentative results from the April workshop. After that, visiting researcher Qiaochu Yuan spends 4 minutes explaining one of MIRI’s core research problems: the Löbian obstacle to self-modifying systems.

Some of the research from our April workshop will be published in June, so if you’d like to read those results right away, you may want to …

Cheers!

Luke Muehlhauser

Executive Director

Read more »

Sign Up for DAGGRE to Improve Science & Technology Forecasting

||News

In When Will AI Be Created?, I named four methods that might improve our forecasts of AI and other important technologies. Two of those methods were explicit quantification and leveraging aggregation, as exemplified by IARPA’s ACE program, which aims to “dramatically enhance the accuracy, precision, and timeliness of… forecasts for a broad range of event types, through the development of advanced techniques that elicit, weight, and combine the judgments of many… analysts.”

GMU’s DAGGRE program, one of five teams participating in ACE, recently announced a transition from geopolitical forecasting to science & technology forecasting:

DAGGRE will continue, but it will transition from geopolitical forecasting to science and technology (S&T) forecasting to make better use of its combinatorial capabilities. We will have a brand new shiny, friendly, and informative interface co-designed with Inkling Markets, the ability for you to submit your own forecasting questions, and more!

Another exciting development is that our S&T forecasting prediction market will be open to everyone in the world who is at least eighteen years old. We’re going global!

If you’d like to help improve humanity’s ability to forecast important technological developments, sign up for DAGGRE’s new S&T forecasting website here.

I did.

Four Articles Added to the Research Page

||Papers

Four older articles have been added to our research page.

The first is an early draft of Christiano et al.’s “Definability of ‘Truth’ in Probabilistic Logic,” previously discussed here and here. The draft was last updated on April 2, 2013.

The second paper is a cleaned-up version of an article originally published in December 2012 by Luke Muehlhauser and Chris Williamson to Less Wrong: “Ideal Advisor Theories and Personal CEV.”

The third and fourth papers were originally published by Bill Hibbard in the AGI 2012 conference proceedings: “Avoiding Unintended AI Behaviors” and “Decision Support for Safe AI Design.” Hibbard wrote these articles before becoming a MIRI research associate, but he has allowed us to include them on our research page because (1) he became a MIRI research associate at the AGI-12 conference at which the articles were published, (2) the articles were partly inspired by a public dialogue with Luke Muehlhauser, and (3) the articles build on MIRI’s paper “Intelligence Explosion and Machine Ethics.”

As mentioned in our December 2012 newsletter, “Avoiding Unintended AI Behaviors” won MIRI’s $1000 Turing Prize for the best AGI safety paper. The prize was given in honor of Alan Turing, who not only discovered some of the key ideas of machine intelligence, but also grasped their importance, writing: “…it seems probable that once [human-level machine thinking] had started, it would not take long to outstrip our feeble powers… At some stage therefore we should have to expect the machines to take control…”

When Will AI Be Created?

||Analysis

Strong AI appears to be this week’s topic. Kevin Drum at Mother Jones thinks AIs will be as smart as humans by 2040. Karl Smith at Forbes and “M.S.” at The Economist seem to roughly agree with Drum on this timeline. Moshe Vardi, editor-in-chief of the world’s most-read computer science magazine, predicts that “by 2045 machines will be able to do if not any work that humans can do, then a very significant fraction of the work that humans can do.”

But AI is much harder to predict than many people think.

To explore these difficulties, let’s start with a 2009 Bloggingheads.tv conversation between MIRI researcher Eliezer Yudkowsky and MIT computer scientist Scott Aaronson, author of the excellent Quantum Computing Since Democritus. Early in that dialogue, Yudkowsky asked:

It seems pretty obvious to me that at some point in [one to ten decades] we’re going to build an AI smart enough to improve itself, and [it will] “foom” upward in intelligence, and when it exhausts available avenues for improvement it will be a “superintelligence” relative to us. Do you feel this is obvious?

Aaronson replied:

The idea that we could build computers that are smarter than us… and that those computers could build still smarter computers… until we reach the physical limits of what kind of intelligence is possible… that we could build things that are to us as we are to ants — all of this is compatible with the laws of physics… and I can’t find a reason of principle that it couldn’t eventually come to pass…

The main thing we disagree about is the time scale… a few thousand years [before AI] seems more reasonable to me.

These two estimates (several decades vs. “a few thousand years”) have very different policy implications.

If there is a good chance that AI will replace humans at the steering wheel of history in the next few decades, then we had better put on our gloves and get to work making sure that this event has a positive rather than a negative impact. But if we can be very confident that AI is thousands of years away, then we need not worry about AI for now, and we should focus on other global priorities. Thus it appears that “When will AI be created?” is a question with high value of information for our species.
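One standard way to make the “value of information” framing precise (a decision-theoretic sketch with assumed notation, not part of the original post): let $x$ range over possible answers to the question $Q$ = “When will AI be created?”, and let $a$ range over the actions we might take (e.g. prioritize AI safety work now versus focus on other global priorities). Then

$$\mathrm{VOI}(Q) \;=\; \mathbb{E}_{x}\!\left[\,\max_{a} \mathbb{E}[U \mid a, x]\,\right] \;-\; \max_{a}\, \mathbb{E}[U \mid a].$$

The question has high value of information exactly when learning the answer would change which action looks best; if the same action is best under every answer, the two terms coincide and the value of information is zero.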

Let’s take a moment to review the forecasting work that has already been done, and see what conclusions we might draw about when AI is likely to be created.

Read more »

Advise MIRI with Your Domain-Specific Expertise

||News

MIRI currently has dozens of volunteer advisors on a wide range of subjects, but we need more! If you’d like to help MIRI pursue its mission more effectively, please sign up to be a MIRI advisor.

If you sign up, we will occasionally ask you questions, or send you early drafts of upcoming writings for feedback.

We don’t always want technical advice (“Well, you could do that with the relevant arithmetical hierarchy…”); often, we just want to learn how different groups of experts react to our writing (“The tone of this paragraph rubs me the wrong way, because…”).

Right now, we are most in need of advisors on the following subjects:

Even if you don’t have much time to help, please sign up! We will of course respect your own limits on availability.

Five Theses, Two Lemmas, and a Couple of Strategic Implications

||Analysis

MIRI’s primary concern about self-improving AI isn’t that it might be created by “bad” actors rather than “good” actors in the global sphere; rather, most of our concern is about remedying the situation in which no one at all knows how to create a self-modifying AI with known, stable preferences. (This is why we see the main problem in terms of performing research and encouraging others to perform relevant research, rather than trying to stop “bad” actors from creating AI.)

This, and a number of other basic strategic views, can be summed up as a consequence of 5 theses about purely factual questions about AI, and 2 lemmas we think are implied by them, as follows:

Intelligence explosion thesis. A sufficiently smart AI will be able to realize large, reinvestable cognitive returns from things it can do on a short timescale, like improving its own cognitive algorithms or purchasing/stealing lots of server time. The intelligence explosion will hit very high levels of intelligence before it runs out of things it can do on a short timescale. See: Chalmers (2010); Muehlhauser & Salamon (2013); Yudkowsky (2013).

Orthogonality thesis. Mind design space is vast enough to contain agents with almost any set of preferences, and such agents can be instrumentally rational in pursuing those preferences while having great computational power. For example, mind design space theoretically contains powerful, instrumentally rational agents which act as expected paperclip maximizers, always choosing the option that leads to the greatest number of expected paperclips. See: Bostrom (2012); Armstrong (2013).

Convergent instrumental goals thesis. Most utility functions will generate a subset of instrumental goals which follow from most possible final goals. For example, if you want to build a galaxy full of happy sentient beings, you will need matter and energy, and the same is true if you want to make paperclips. This is why we worry about very powerful entities even if they have no explicit dislike of us: “The AI does not love you, nor does it hate you, but you are made of atoms it can use for something else.” Note, though, that by the orthogonality thesis you can always have an agent which explicitly, terminally prefers not to do any particular thing, such as an AI which does love you and does not want to take you apart for spare atoms. See: Omohundro (2008); Bostrom (2012).

Complexity of value thesis. It takes a large chunk of Kolmogorov complexity to describe idealized human preferences. That is, what we “should” do is a computationally complex mathematical object, even after we apply reflective equilibrium (judging your own thought processes) and other standard normative theories. A superintelligence with a randomly generated utility function would not do anything we regard as worthwhile with the galaxy, because it is unlikely to accidentally hit on final preferences for a civilization of diverse sentient beings leading interesting lives. See: Yudkowsky (2011); Muehlhauser & Helm (2013).

Fragility of value thesis. Getting a goal system 90% right does not give you 90% of the value, any more than correctly dialing 9 out of the 10 digits of my phone number will connect you to somebody who is 90% similar to Eliezer Yudkowsky. There are multiple dimensions of value for which eliminating that dimension would eliminate almost all value from the future. For example, an alien species which shared almost all human values, except that its parameter setting for “boredom” was much lower, might devote most of its computational power to replaying a single peak, optimal experience over and over again with slightly different pixel colors (or the equivalent thereof). Friendly AI is more like a satisficing threshold than something where we are trying to eke out successive 10% improvements. See: Yudkowsky (2009, 2011).
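One toy way to formalize the fragility claim (an illustrative model, not taken from the cited papers): suppose the value of a future decomposes over $n$ dimensions $v_1, \dots, v_n \in [0,1]$, and compare a multiplicative aggregation with an additive one:

$$V_{\text{fragile}} \;=\; \prod_{i=1}^{n} v_i \qquad \text{vs.} \qquad V_{\text{additive}} \;=\; \frac{1}{n} \sum_{i=1}^{n} v_i.$$

In the additive model, zeroing out a single dimension (such as the “boredom” parameter above) costs only about $1/n$ of the total value; in the multiplicative model it drives the total to zero even if every other dimension is satisfied perfectly. The fragility thesis is the claim that our situation is closer to the multiplicative case, which is why a goal system that is “90% right” can still capture almost none of the value.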

These five theses seem to imply two important lemmas:

Indirect normativity. Programming a self-improving machine intelligence to implement a grab-bag of things-that-seem-like-good-ideas will lead to a bad outcome, regardless of how good the apple pie and motherhood sounded. E.g., if you give the AI a final goal to “make people happy” it’ll just turn people’s pleasure centers up to maximum. “Indirectly normative” is Bostrom’s term for an AI that calculates the ‘right’ thing to do via, e.g., looking at human beings and modeling their decision processes and idealizing those decision processes (e.g. what you would-want if you knew everything the AI knew and understood your own decision processes, reflective equilibria, ideal advisor theories, and so on), rather than being told a direct set of ‘good ideas’ by the programmers. Indirect normativity is how you deal with Complexity and Fragility. If you can succeed at indirect normativity, then small variances in essentially good intentions may not matter much — that is, if two different projects do indirect normativity correctly, but one project has 20% nicer and kinder researchers, we could still hope that the end results would be of around equal expected value. See: Muehlhauser & Helm (2013).

Extra difficulty of Friendliness. You can build a Friendly AI by the orthogonality thesis, but it takes a great deal of extra work and cleverness to get the goal system right. More importantly, the rest of the AI needs to meet a higher standard of cleanness in order for the goal system to remain invariant through a billion sequential self-modifications. Any AI smart enough to do clean self-modification will tend to do so, but the problem is that the intelligence explosion may begin with AIs substantially less intelligent than that, e.g. AIs which rewrite themselves using genetic algorithms or other such means that don’t preserve a set of consequentialist preferences. In this case, building a Friendly AI could mean that our AI has to be smarter about self-modification than the minimal AI that could undergo an intelligence explosion. See: Yudkowsky (2008); Yudkowsky (2013).

These lemmas, in turn, have two major strategic implications:

  1. We have a lot of work to do on things like indirect normativity and stable self-improvement. At this stage a lot of this work looks really foundational — that is, we can’t describe how to do these things using infinite computing power, let alone finite computing power. We should get started on this work as early as possible, since basic research often takes a lot of time.
  2. There needs to be a Friendly AI project that has some sort of boost over competing projects which don’t live up to a (very) high standard of Friendly AI work — a project which can successfully build a stable-goal-system self-improving AI, before a less-well-funded project hacks together a much sloppier self-improving AI. Giant supercomputers may be less important to this than being able to bring together the smartest researchers (see the open question posed inYudkowsky 2013) but the required advantage cannot be left up to chance. Leaving things to default means that projects less careful about self-modification would have an advantage greater than casual altruism is likely to overcome.