December 2017 Newsletter

December 6, 2017|Rob Bensinger|Newsletters

Our annual fundraiser is live. Discussed in the fundraiser post:

News: What MIRI’s researchers have been working on lately, and more.
Goals: We plan to grow our research team 2x in 2018–2019. If we raise $850k this month, we think we can do that without dipping below a 1.5-year runway.
Actual goals: A bigger-picture outline of what we think is the likeliest sequence of events that could lead to good global outcomes.

Our funding drive will be running until December 31st.

Research updates

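New at IAFF: Reward Learning Summary; Reflective Oracles as a Solution to the Converse Lawvere Problem; Policy Selection Solves Most Problems
We ran a workshop on Paul Christiano’s research agenda.
We’ve hired the first members of our new engineering team, including math PhD Jesse Liptrap and former Quixey lead architect Nick Tarleton! If you’d like to join the team, apply here!
I’m also happy to announce that Blake Borgeson has come on in an advisory role to help establish our engineering program. Blake is a Nature-published computational biologist who co-founded Recursion Pharmaceuticals, where he leads the biotech company’s machine learning work.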

General updates

“Security Mindset and Ordinary Paranoia”: a new dialogue from Eliezer Yudkowsky. See also part 2.
The Open Philanthropy Project has awarded MIRI a three-year, $3.75 million grant!
We received $48,132 in donations during Facebook’s Giving Tuesday event, of which $11,371 (from speedy donors who made the event’s 85-second cutoff!) will be matched by the Bill and Melinda Gates Foundation.
MIRI staff writer Matthew Graves has published an article in Skeptic magazine: “Why We Should Be Concerned About Artificial Superintelligence.”
Yudkowsky’s new book Inadequate Equilibria is out! See other recent discussion of modest epistemology and inadequacy analysis by Scott Aaronson, Robin Hanson, Abram Demski, Gregory Lewis, and Scott Alexander.

MIRI’s 2017 Fundraiser

December 1, 2017|Malo Bourgon|MIRI Strategy, News

Update 2017-12-27: We’ve blown past our 3rd and final target, and reached the matching cap of $300,000 for the $2 million Matching Challenge! Thanks so much to everyone who supported us!

All donations made before 23:59 PST on Dec 31st will continue to be counted towards our fundraiser total. The fundraiser total includes projected matching funds from the Challenge.

MIRI’s 2017 fundraiser is live through the end of December! Our progress so far (updated live):

Target 1: $625,000 (Completed)
Target 2: $850,000 (Completed)
Target 3: $1,250,000 (Completed)

$2,504,625 raised in total!

358 donors contributed


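MIRI is a research nonprofit based in Berkeley, California with a mission of ensuring that smarter-than-human AI technology has a positive impact on the world. You can learn more about our work at “Why AI Safety?” or via MIRI Executive Director Nate Soares’ Google talk on AI alignment.

In 2015, we discussed our interest in potentially branching out to explore multiple research programs simultaneously once we could support a larger team. Following recent changes to our overall picture of the strategic landscape, we’re now moving ahead on that goal and starting to explore new research directions while also continuing to push on our agent foundations agenda. For more on our new views, see “There’s No Fire Alarm for Artificial General Intelligence” and our 2017 strategic update. We plan to expand on our relevant strategic thinking more in the coming weeks.

Our expanded research focus means that our research team may grow substantially, and grow quickly. Our current goal is to hire around ten new research staff over the next two years, mostly software engineers. If we succeed, our point estimate is that our 2018 budget will be $2.8M and our 2019 budget will be $3.5M, up from roughly $1.9M in 2017.¹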

We’ve set our fundraiser targets by estimating how quickly we could grow while maintaining a 1.5-year runway, on the simplifying assumption that about 1/3 of the donations we receive between now and the beginning of 2019 will come during our current fundraiser.²

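Hitting Target 1 ($625k) then lets us act on our growth plans in 2018 (but not in 2019); Target 2 ($850k) lets us act on our full two-year growth plan; and in the case where our hiring goes better than expected, Target 3 ($1.25M) would allow us to add new members to our team quickly, or pay higher salaries for new research staff as needed.

We discuss more details below, both in terms of our current organizational activities and how we see our work fitting into the larger strategic space.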

Security Mindset and the Logistic Success Curve

November 26, 2017|Eliezer Yudkowsky|Analysis

Follow-up to: Security Mindset and Ordinary Paranoia

(Two days later, Amber returns with another question.)

AMBER: Uh, say, Coral. How important is security mindset when you’re building a whole new kind of system (say, one subject to potentially adverse optimization pressures, where you want it to have some sort of robustness property)?

CORAL: How novel is the system?

AMBER: Very novel.

CORAL: Novel enough that you’d have to invent your own new best practices instead of looking them up?

AMBER: Right.

CORAL: That’s serious business. If you’re building a very simple Internet-connected system, maybe a smart ordinary paranoid could look up how we usually guard against adversaries, use as much off-the-shelf software as possible that was checked over by real security professionals, and not do too horribly. But if you’re doing something qualitatively new and complicated that has to be robust against adverse optimization, well… mostly I’d think you were operating in almost impossibly dangerous territory, and I’d advise you to figure out what to do after your first try failed. But if you wanted to actually succeed, ordinary paranoia absolutely would not do it.

AMBER: In other words, projects to build mission-critical systems ought to have advisors with the full security mindset, so that the advisor can say what the system builders really need to do to ensure security.

CORAL: (laughs sadly) No.

AMBER: No?

Security Mindset and Ordinary Paranoia

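November 25, 2017|Eliezer Yudkowsky|Analysis

The following is a fictional dialogue building off of AI Alignment: Why It’s Hard, and Where to Start.

(AMBER, a philanthropist interested in a more reliable Internet, and CORAL, a computer security professional, are at a conference hotel together discussing what Coral insists is a difficult and important issue: the difficulty of building “secure” software.)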

AMBER: So, Coral, I understand that you believe it is very important, when creating software, to make that software be what you call “secure”.

CORAL: Especially if it’s connected to the Internet, or if it controls money or other valuables. But yes, that’s right.

AMBER: I find it hard to believe that this needs to be a separate topic in computer science. In general, programmers need to figure out how to make computers do what they want. The people building operating systems surely won’t want them to give access to unauthorized users, just like they won’t want those computers to crash. Why is one problem so much more difficult than the other?

CORAL: That’s a deep question, but to give a partial deep answer: When you expose a device to the Internet, you’re potentially exposing it to intelligent adversaries who can find special, weird interactions with the system that make the pieces behave in weird ways that the programmers did not think of. When you’re dealing with that kind of problem, you’ll use a different set of methods and tools.

AMBER: Any system that crashes is behaving in a way the programmer didn’t expect, and programmers already need to stop that from happening. How is this case different?

CORAL: Okay, so… imagine that your system is going to take in one kilobyte of input per session. (Although that itself is the sort of assumption we’d question and ask what happens if it gets a megabyte of input instead, but never mind.) If the input is one kilobyte, then there are 2^{8,000} possible inputs, or about 10^{2,400} or so. Again, for the sake of extending the simple visualization, imagine that a computer gets a billion inputs per second. Suppose that only a googol, 10^{100}, out of the 10^{2,400} possible inputs, cause the system to behave in a way the original designers didn’t intend.

If the system is getting inputs in a way that’s uncorrelated with whether the input is a misbehaving one, it won’t hit on a misbehaving state before the end of the universe. If there’s an intelligent adversary who understands the system, on the other hand, they may be able to find one of the very rare inputs that makes the system misbehave. So a piece of the system that would literally never in a million years misbehave on random inputs, may break when an intelligent adversary tries deliberately to break it.
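Coral’s numbers can be checked with a few lines of arithmetic. The sketch below is only an illustration of the estimate, working in base-10 logarithms because the quantities are far too large for floating point. The figures for input size, number of misbehaving inputs, and input rate come from the dialogue; the age of the universe (roughly 4.35e17 seconds) and all variable names are assumptions added for this example.

```python
import math

# Figures taken from the dialogue.
INPUT_BITS = 8 * 1000            # one kilobyte of input per session
BAD_INPUTS_LOG10 = 100           # a googol (10^100) misbehaving inputs
INPUTS_PER_SECOND_LOG10 = 9      # a billion random inputs per second

# Assumption (not from the dialogue): rough age of the universe in seconds.
UNIVERSE_AGE_SECONDS_LOG10 = math.log10(4.35e17)

# log10 of the number of possible inputs, 2^8000.
total_inputs_log10 = INPUT_BITS * math.log10(2)

# log10 of the chance that a single uniformly random input misbehaves.
hit_probability_log10 = BAD_INPUTS_LOG10 - total_inputs_log10

# log10 of the expected number of seconds before random inputs hit one.
expected_seconds_log10 = -hit_probability_log10 - INPUTS_PER_SECOND_LOG10

# The same wait, measured in ages of the universe.
universe_ages_log10 = expected_seconds_log10 - UNIVERSE_AGE_SECONDS_LOG10

print(f"possible inputs      ~ 10^{total_inputs_log10:.0f}")
print(f"chance per input     ~ 10^{hit_probability_log10:.0f}")
print(f"expected wait        ~ 10^{expected_seconds_log10:.0f} seconds")
print(f"in universe lifetimes ~ 10^{universe_ages_log10:.0f}")
```

Under these assumptions the expected wait is more than 10^2,000 lifetimes of the universe, which is the sense in which random inputs “never” find the misbehaving states while a deliberate search might.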

AMBER: So you’re saying that it’s more difficult because the programmer is pitting their wits against an adversary who may be more intelligent than themselves.

CORAL: That’s an almost-right way of putting it. What matters isn’t so much the “adversary” part as the optimization part. There are systematic, nonrandom forces strongly selecting for particular outcomes, causing pieces of the system to go down weird execution paths and occupy unexpected states. If your system literally has no misbehavior modes at all, it doesn’t matter if you have IQ 140 and the enemy has IQ 160; it’s not an arm-wrestling contest. It’s just very much harder to build a system that doesn’t enter weird states when the weird states are being selected for in a correlated way, rather than happening only by accident. The weirdness-selecting forces can search through parts of the larger state space that you yourself failed to imagine. Beating that does indeed require new skills and a different mode of thinking, what Bruce Schneier called “security mindset”.

AMBER: Ah, and what is this security mindset?

CORAL: I can say one or two things about it, but keep in mind we are dealing with a quality of thinking that is not entirely effable. If I could give you a handful of platitudes about security mindset, and that would actually cause you to be able to design secure software, the Internet would look very different from how it presently does. That said, it seems to me that what has been called “security mindset” can be divided into two components, one of which is much less difficult than the other. And this can fool people into overestimating their own safety, because they can get the easier half of security mindset and overlook the other half. The less difficult component, I will call by the term “ordinary paranoia”.

AMBER: Ordinary paranoia?

CORAL: Lots of programmers have the ability to imagine adversaries trying to threaten them. They imagine how likely it is that the adversaries are able to attack them a particular way, and then they try to block off the adversaries from threatening that way. Imagining attacks, including weird or clever attacks, and parrying them with measures you imagine will stop the attack; that is ordinary paranoia.

AMBER: Isn’t that what security is all about? What do you claim is the other half?

CORAL: To put it as a platitude, I might say… defending against mistakes in your own assumptions rather than against external adversaries.

Announcing “Inadequate Equilibria”

November 16, 2017|Rob Bensinger|News

MIRI Senior Research Fellow Eliezer Yudkowsky has a new book out today: Inadequate Equilibria: Where and How Civilizations Get Stuck, a discussion of societal dysfunction, exploitability, and self-evaluation. From the preface:

Inadequate Equilibria is a book about a generalized notion of efficient markets, and how we can use this notion to guess where society will or won’t be effective at pursuing some widely desired goal. An efficient market is one where smart individuals should generally doubt that they can spot overpriced or underpriced assets. We can ask an analogous question, however, about the “efficiency” of other human endeavors.

Suppose, for example, that someone thinks they can easily build a much better and more profitable social network than Facebook, or easily come up with a new treatment for a widespread medical condition. Should they question whatever clever reasoning led them to that conclusion, in the same way that most smart individuals should question any clever reasoning that causes them to think AAPL stock is underpriced? Should they question whether they can “beat the market” in these areas, or whether they can even spot major in-principle improvements to the status quo? How “efficient,” or adequate, should we expect civilization to be at various tasks?

As always, there is a good way and a bad way to reason about these questions. This book is about both.

The book is available from Amazon (in print and Kindle), on iBooks, as a pay-what-you-want digital download, and as a web book at equilibriabook.com. The book has also been posted to Less Wrong 2.0.

The book’s contents are:

1. Inadequacy and Modesty

A comparison of two “wildly different, nearly cognitively nonoverlapping” approaches to thinking about outperformance: modest epistemology, and inadequacy analysis.

2. An Equilibrium of No Free Energy

How, in principle, can society end up neglecting obvious low-hanging fruit?

3. Moloch’s Toolbox

Why does our civilization actually end up neglecting low-hanging fruit?

4. Living in an Inadequate World

How can we best take civilizational inadequacy into account in our decision-making?

5. Blind Empiricism

Three examples of modesty in practical settings.

6. Against Modest Epistemology

An argument against the “epistemological core” of modesty: that we shouldn’t take our own reasoning and meta-reasoning at face value in the face of disagreements or novelties.

7. Status Regulation and Anxious Underconfidence

On causal accounts of modesty.

Although Inadequate Equilibria isn’t about AI, I consider it one of MIRI’s most important nontechnical publications to date, as it helps explain some of the most basic tools and background models we use when we evaluate how promising a potential project, research program, or general strategy is.

A major grant from the Open Philanthropy Project

November 8, 2017|Malo Bourgon|News

I’m thrilled to announce that the Open Philanthropy Project has awarded MIRI a three-year $3.75 million general support grant ($1.25 million per year). This grant is, by far, the largest contribution MIRI has received to date, and will have a major effect on our plans going forward.

This grant follows a $500,000 grant we received from the Open Philanthropy Project in 2016. The Open Philanthropy Project’s announcement for the new grant notes that they are “now aiming to support about half of MIRI’s annual budget”.¹ The annual $1.25 million represents 50% of a conservative estimate we provided to the Open Philanthropy Project of the amount of funds we expect to be able to usefully spend in 2018–2020.

This expansion in support was also conditional on our ability to raise the other 50% from other supporters. For that reason, I sincerely thank all of the past and current supporters who have helped us get to this point.

The Open Philanthropy Project has expressed openness to potentially increasing their support if MIRI is in a position to usefully spend more than our conservative estimate, if they believe that this increase in spending is sufficiently high-value, and if we are able to secure additional outside support to ensure that the Open Philanthropy Project isn’t providing more than half of our total funding.

We’ll be going into more detail on our future organizational plans in a follow-up post on December 1, where we’ll also discuss our end-of-the-year fundraising goals.

In their write-up, the Open Philanthropy Project notes that they have updated favorably about our technical output since 2016, following our logical induction paper:

We received a very positive review of MIRI’s work on “logical induction” by a machine learning researcher who (i) is interested in AI safety, (ii) is rated as an outstanding researcher by at least one of our close advisors, and (iii) is generally regarded as outstanding by the ML community. As mentioned above, we previously had difficulty evaluating the technical quality of MIRI’s research, and we previously could find no one meeting criteria (i)-(iii) to a comparable extent who was comparably excited about MIRI’s technical research. While we would not generally offer a comparable grant to any lab on the basis of this consideration alone, we consider this a significant update in the context of the original case for the [2016] grant (especially MIRI’s thoughtfulness on this set of issues, value alignment with us, distinctive perspectives, and history of work in this area). While the balance of our technical advisors’ opinions and arguments still leaves us skeptical of the value of MIRI’s research, the case for the statement “MIRI’s research has a nontrivial chance of turning out to be extremely valuable (when taking into account how different it is from other research on AI safety)” appears much more robust than it did before we received this review.

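The announcement also states, “In the time since our initial grant to MIRI, we have made several more grants within this focus area, and are therefore less concerned that a larger grant will signal an outsized endorsement of MIRI’s approach.”

We’re enormously grateful for the Open Philanthropy Project’s support, and for their deep engagement with the AI safety field as a whole. To learn more about our discussions with the Open Philanthropy Project and their active work in this space, see the group’s previous AI safety grants, our conversation with Daniel Dewey on the Effective Altruism Forum, and the research problems outlined in the Open Philanthropy Project’s recent AI fellows program description.

The Open Philanthropy Project usually prefers not to provide more than half of an organization’s funding, to facilitate funder coordination and ensure that organizations it supports maintain their independence. From a March blog post: “We typically avoid situations in which we provide >50% of an organization’s funding, so as to avoid creating a situation in which an organization’s total funding is ‘fragile’ as a result of being overly dependent on us.”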

November 2017 Newsletter

November 3, 2017|Rob Bensinger|Newsletters

Eliezer Yudkowsky has written a new book on civilizational dysfunction and outperformance: Inadequate Equilibria: Where and How Civilizations Get Stuck. The full book will be available in print and electronic formats November 16. To preorder the ebook or sign up for updates, visit equilibriabook.com.

We’re posting the full contents online in stages over the next two weeks. The first two chapters are:

Inadequacy and Modesty (discussion: LessWrong, EA Forum, Hacker News)
An Equilibrium of No Free Energy (discussion: LessWrong, EA Forum)

Research updates

A new paper: “Functional Decision Theory: A New Theory of Instrumental Rationality” (arXiv), by Eliezer Yudkowsky and Nate Soares.
New research write-ups and discussions: Comparing Logical Inductor CDT and Logical Inductor EDT; Logical Updatelessness as a Subagent Alignment Problem; Mixed-Strategy Ratifiability Implies CDT=EDT
New from AI Impacts: Computing Hardware Performance Data Collections
The Workshop on Reliable Artificial Intelligence took place at ETH Zürich, hosted by MIRIxZürich.

General updates

DeepMind announces a new version of AlphaGo that uses 4 TPUs and no human training data, and that achieves superhuman performance within three days. Eliezer Yudkowsky argues that AlphaGo Zero provides supporting evidence for his position in the AI foom debate; Robin Hanson responds. See also Paul Christiano on AlphaGo Zero and capability amplification.
Yudkowsky on AGI ethics: “The ethics of bridge-building is to not have your bridge fall down and kill people and there is a frame of mind in which this obviousness is obvious enough. How not to have the bridge fall down is hard.”
Nate Soares gave his ensuring smarter-than-human AI has a positive outcome talk at the O’Reilly AI Conference (slides).

News and links

“Protecting Against AI’s Existential Threat”: a Wall Street Journal op-ed by OpenAI’s Ilya Sutskever and Dario Amodei.
OpenAI announces “a hierarchical reinforcement learning algorithm that learns high-level actions useful for solving a range of tasks”.
DeepMind’s Viktoriya Krakovna reports on the first Tokyo AI and Society Symposium.
Nick Bostrom speaks and CSER submits written evidence to the UK Parliament’s Artificial Intelligence Committee.
Rob Wiblin interviews Nick Beckstead for the 80,000 Hours podcast.

New paper: “Functional Decision Theory”

October 22, 2017|Matthew Graves|Papers

MIRI senior researcher Eliezer Yudkowsky and executive director Nate Soares have a new introductory paper out on decision theory: “Functional decision theory: A new theory of instrumental rationality.”

Abstract:

This paper describes and motivates a new decision theory known as functional decision theory (FDT), as distinct from causal decision theory and evidential decision theory.

Functional decision theorists hold that the normative principle for action is to treat one’s decision as the output of a fixed mathematical function that answers the question, “Which output of this very function would yield the best outcome?” Adhering to this principle delivers a number of benefits, including the ability to maximize wealth in an array of traditional decision-theoretic and game-theoretic problems where CDT and EDT perform poorly. Using one simple and coherent decision rule, functional decision theorists (for example) achieve more utility than CDT on Newcomb’s problem, more utility than EDT on the smoking lesion problem, and more utility than both in Parfit’s hitchhiker problem.

In this paper, we define FDT, explore its prescriptions in a number of different decision problems, compare it to CDT and EDT, and give philosophical justifications for FDT as a normative theory of decision-making.
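The Newcomb’s problem claim in the abstract can be made concrete with a small payoff comparison. This is a minimal sketch for illustration only, not the paper’s formalism: the payoffs ($1,000,000 in the opaque box, $1,000 in the transparent box) are the conventional ones rather than figures from the abstract, and the 99% predictor accuracy and the function name `expected_payoff` are assumptions made for this example. It treats the agent’s choice as the output of a fixed decision function that the predictor also models.

```python
from fractions import Fraction

# Conventional Newcomb payoffs; assumed here, not taken from the abstract.
OPAQUE_BOX = 1_000_000    # filled only if the predictor expects one-boxing
TRANSPARENT_BOX = 1_000   # always available to a two-boxer


def expected_payoff(policy: str, accuracy: Fraction) -> Fraction:
    """Expected payoff of a fixed policy ("one-box" or "two-box").

    The predictor models the same decision function the agent runs, so it
    predicts the agent's actual policy with probability `accuracy`.
    """
    if policy == "one-box":
        # The opaque box is full whenever the prediction was correct.
        return accuracy * OPAQUE_BOX
    if policy == "two-box":
        # The opaque box is full only when the predictor wrongly
        # expected one-boxing.
        return TRANSPARENT_BOX + (1 - accuracy) * OPAQUE_BOX
    raise ValueError(f"unknown policy: {policy}")


accuracy = Fraction(99, 100)  # hypothetical predictor accuracy
one_box = expected_payoff("one-box", accuracy)
two_box = expected_payoff("two-box", accuracy)
print(f"one-box: ${float(one_box):,.0f}, two-box: ${float(two_box):,.0f}")
```

Under these assumptions the one-boxing policy expects $990,000 and the two-boxing policy $11,000. CDT nonetheless recommends two-boxing, because the contents of the boxes are already fixed at the moment of choice; that gap is what the abstract means by achieving more utility than CDT on this problem.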

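Our previous introductory paper on FDT, “Cheating Death in Damascus”, focused on comparing FDT’s performance to that of CDT and EDT in fairly high-level terms. Yudkowsky and Soares’ new paper puts a much larger focus on FDT’s mechanics and motivations, making “Functional Decision Theory” the most complete stand-alone introduction to the theory.¹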