December 2017 Newsletter


Our annual fundraiser is live. Discussed in the fundraiser post:

  • News: What MIRI’s researchers have been working on lately, and more.
  • Goals: We plan to grow our research team 2x in 2018–2019. If we raise $850k this month, we think we can do that without dipping below a 1.5-year runway.
  • Actual goals: A bigger-picture outline of what we think is the likeliest sequence of events that could lead to good global outcomes.

Our funding drive will be running until December 31st.


General updates

MIRI’s 2017 Fundraiser


Update 2017-12-27: We’ve blown past our 3rd and final target, and reached the matching cap of $300,000 for the $2 million Matching Challenge! Thanks so much to everyone who supported us!

All donations made before 23:59 PST on Dec 31st will continue to be counted towards our fundraiser total. The fundraiser total includes projected matching funds from the Challenge.

MIRI’s 2017 fundraiser is live through the end of December! Our progress so far (updated live):

$2,504,625 raised in total!


MIRI is a research nonprofit based in Berkeley, California with a mission of ensuring that smarter-than-human AI technology has a positive impact on the world. You can learn more about our work at “Why AI Safety?” or via MIRI Executive Director Nate Soares’ Google talk on AI alignment.

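In 2015, we discussed our interest in potentially branching out to explore multiple research programs simultaneously once we could support a larger team. Following recent changes to our overall picture of the strategic landscape, we’re now moving ahead on that goal and starting to explore new research directions while also continuing to push on our agent foundations agenda. For more on our new views, see “There’s No Fire Alarm for Artificial General Intelligence” and our 2017 strategic update. We plan to expand on our relevant strategic thinking more in the coming weeks.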

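Our expanded research focus means that our research team can potentially grow big, and grow fast. Our current goal is to hire around ten new research staff over the next two years, mostly software engineers. If we succeed, our point estimate is that our 2018 budget will be $2.8M and our 2019 budget will be $3.5M, up from roughly $1.9M in 2017.1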

We’ve set our fundraiser targets by estimating how quickly we could grow while maintaining a 1.5-year runway, on the simplifying assumption that about 1/3 of the donations we receive between now and the beginning of 2019 will come during our current fundraiser.2

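Hitting Target 1 ($625k) then lets us act on our growth plans in 2018 (but not in 2019); Target 2 ($850k) lets us act on our full two-year growth plan; and in the case where our hiring goes better than expected, Target 3 ($1.25M) would allow us to add new members to our team more quickly, or pay higher salaries for new research staff as needed.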



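  1. Note that this $1.9M is significantly below the $2.1–2.5M we predicted for the year in April. Personnel costs are MIRI’s most significant expense, and higher research staff turnover in 2017 meant that we had fewer net additions to the team this year than we’d budgeted for. We went under budget by a relatively small margin in 2016, spending $1.73M versus a predicted $1.83M.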

    Our 2018–2019 budget estimates are highly uncertain, with most of the uncertainty coming from substantial uncertainty about how quickly we’ll be able to take on new research staff.

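  2. This is roughly in line with our experience in previous years, when excluding expected grants and large one-time surprise donations. We’ve accounted for the former in our targets but not the latter, since we think it unwise to bank on unpredictable windfalls.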




Follow-up to: Security Mindset and Ordinary Paranoia

(Two days later, Amber returns with another question.)

Amber: Uh, say, Coral. How important is security mindset when you’re building a whole new kind of system, say, one subject to potentially adverse optimization pressures, where you want it to have some sort of robustness property?

Coral: How novel is the system?

Amber: Very novel.



Coral: That’s serious business. If you’re building a very simple Internet-connected system, maybe a smart ordinary paranoid could look up how we usually guard against adversaries, use as much off-the-shelf software as possible that was checked over by real security professionals, and not do too horribly. But if you’re doing something qualitatively new and complicated that has to be robust against adverse optimization, well… mostly I’d think you were operating in almost impossibly dangerous territory, and I’d advise you to figure out what to do after your first try failed. But if you wanted to actually succeed, ordinary paranoia absolutely would not do it.


Coral: (laughs sadly) No.



Security Mindset and Ordinary Paranoia


The following is a fictional dialogue building off of AI Alignment: Why It’s Hard, and Where to Start.


Amber: So, Coral, I understand that you believe it is very important, when creating software, to make that software be what you call “secure”.



Coral: That’s a deep question, but to give a partial deep answer: When you expose a device to the Internet, you’re potentially exposing it to intelligent adversaries who can find special, weird interactions with the system that make the pieces behave in weird ways that the programmers did not think of. When you’re dealing with that kind of problem, you’ll use a different set of methods and tools.

Amber: Any system that crashes is behaving in a way the programmer didn’t expect, and programmers already need to stop that from happening. How is this case different?

Coral: Okay, so… imagine that your system is going to take in one kilobyte of input per session. (Although that itself is the sort of assumption we’d question and ask what happens if it gets a megabyte of input instead, but never mind.) If the input is one kilobyte, then there are 2^8000 possible inputs, or about 10^2400 or so. Again, for the sake of extending the simple visualization, imagine that a computer gets a billion inputs per second. Suppose that only a googol, 10^100, out of the 10^2400 possible inputs would cause the system to behave in a certain way the original designer didn’t intend.

If the system is getting inputs in a way that’s uncorrelated with whether the input is a misbehaving one, it won’t hit on a misbehaving state before the end of the universe. If there’s an intelligent adversary who understands the system, on the other hand, they may be able to find one of the very rare inputs that makes the system misbehave. So a piece of the system that would literally never in a million years misbehave on random inputs, may break when an intelligent adversary tries deliberately to break it.
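As a rough check on Coral’s numbers, the arithmetic can be worked through in a few lines of Python. This is only a sketch: the one-kilobyte input, the googol of misbehaving inputs, and the billion inputs per second come from the dialogue, while the age of the universe (about 4.35e17 seconds) is an outside approximation added for comparison, and the function name is made up for illustration.

```python
import math

# Assumption (not from the dialogue): the universe is roughly 13.8 billion
# years old, i.e. about 4.35e17 seconds.
LOG10_AGE_OF_UNIVERSE_SECONDS = math.log10(4.35e17)


def input_space_estimates(input_bits=8000, log10_bad_inputs=100,
                          log10_inputs_per_second=9):
    """Return base-10 logarithms of the quantities in Coral's example.

    Everything stays in log10 because the raw counts (around 10^2400)
    are far too large for a float.
    """
    # log10 of the number of distinct inputs, 2**input_bits.
    log10_total_inputs = input_bits * math.log10(2)
    # log10 of the fraction of inputs that trigger misbehavior.
    log10_fraction_bad = log10_bad_inputs - log10_total_inputs
    # With uniformly random inputs, the expected number of trials before the
    # first bad one is about 1 / fraction_bad; dividing by the input rate
    # converts trials into seconds.
    log10_expected_seconds = -log10_fraction_bad - log10_inputs_per_second
    return log10_total_inputs, log10_fraction_bad, log10_expected_seconds


total, fraction_bad, seconds = input_space_estimates()
print(f"possible inputs:         about 10^{total:.0f}")
print(f"fraction misbehaving:    about 10^{fraction_bad:.0f}")
print(f"expected wait (seconds): about 10^{seconds:.0f}")
print(f"age of universe (s):     about 10^{LOG10_AGE_OF_UNIVERSE_SECONDS:.1f}")
```

The exact exponent for 2^8000 comes out nearer 2,408 than 2,400, which does not change the conclusion: under random inputs the expected wait exceeds the age of the universe by thousands of orders of magnitude, whereas an adversary who understands the system does not have to search at random.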

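Amber: So you’re saying that it’s more difficult because the programmer is pitting their wits against an adversary who may be more intelligent than themselves.

Coral: That’s an almost-right way of putting it. What matters isn’t so much the “adversary” part as the optimization part. There are systematic, nonrandom forces strongly selecting for particular outcomes, causing pieces of the system to go down weird execution paths and occupy unexpected states. If your system literally has no misbehavior modes at all, it doesn’t matter if you have IQ 140 and the enemy has IQ 160; it’s not an arm-wrestling contest. It’s just very much harder to build a system that doesn’t enter weird states when the weird states are being selected for in a correlated way, rather than happening only by accident. The weirdness-selecting forces can search through parts of the larger state space that you yourself failed to imagine. Beating that does indeed require new skills and a different mode of thinking, what Bruce Schneier called “security mindset”.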


Coral: I can say one or two things about it, but keep in mind we are dealing with a quality of thinking that is not entirely effable. If I could give you a handful of platitudes about security mindset, and that would actually cause you to be able to design secure software, the Internet would look very different from how it presently does. That said, it seems to me that what has been called “security mindset” can be divided into two components, one of which is much less difficult than the other. And this can fool people into overestimating their own safety, because they can get the easier half of security mindset and overlook the other half. The less difficult component, I will call by the term “ordinary paranoia”.


Coral: Lots of programmers have the ability to imagine adversaries trying to threaten them. They imagine how likely it is that the adversaries are able to attack them a particular way, and then they try to block off the adversaries from threatening that way. Imagining attacks, including weird or clever attacks, and parrying them with measures you imagine will stop the attack; that is ordinary paranoia.

Amber: Isn’t that what security is all about? What do you claim is the other half?

Coral: To put it as a platitude, I might say… defending against mistakes in your own assumptions rather than against external adversaries.



MIRI Senior Research Fellow Eliezer Yudkowsky has a new book out today: Inadequate Equilibria: Where and How Civilizations Get Stuck, a discussion of societal dysfunction, exploitability, and self-evaluation. From the preface:

Inadequate Equilibria is a book about a generalized notion of efficient markets, and how we can use this notion to guess where society will or won’t be effective at pursuing some widely desired goal. An efficient market is one where smart individuals should generally doubt that they can spot overpriced or underpriced assets. We can ask an analogous question, however, about the “efficiency” of other human endeavors.

Suppose, for example, that someone thinks they can easily build a much better and more profitable social network than Facebook, or easily come up with a new treatment for a widespread medical condition. Should they question whatever clever reasoning led them to that conclusion, in the same way that most smart individuals should question any clever reasoning that causes them to think AAPL stock is underpriced? Should they question whether they can “beat the market” in these areas, or whether they can even spot major in-principle improvements to the status quo? How “efficient,” or adequate, should we expect civilization to be at various tasks?


The book is available from Amazon (in print and Kindle), on iBooks, as a pay-what-you-want digital download, and as a web book. The book has also been posted to Less Wrong 2.0.

The book’s contents are:


1. Inadequacy and Modesty

A comparison of two “wildly different, nearly cognitively nonoverlapping” approaches to thinking about outperformance: modest epistemology, and inadequacy analysis.

2. An Equilibrium of No Free Energy


3. Moloch’s Toolbox

Why does our civilization actually end up neglecting low-hanging fruit?



5. Blind Empiricism

Three examples of modesty in practical settings.

6. Against Modest Epistemology

An argument against the “epistemological core” of modesty: that we shouldn’t take our own reasoning and meta-reasoning at face value in the face of disagreements or novelties.



Although Inadequate Equilibria isn’t about AI, I consider it one of MIRI’s most important nontechnical publications to date, as it helps explain some of the most basic tools and background models we use when we evaluate how promising a potential project, research program, or general strategy is.

A major grant from the Open Philanthropy Project


I’m thrilled to announce that the Open Philanthropy Project has awarded MIRI a three-year $3.75 million general support grant ($1.25 million per year). This grant is, by far, the largest contribution MIRI has received to date, and will have a major effect on our plans going forward.

This grant follows a $500,000 grant we received from the Open Philanthropy Project in 2016. The Open Philanthropy Project’s announcement for the new grant notes that they are “now aiming to support half of MIRI’s annual budget”.1 The annual $1.25 million represents 50% of a conservative estimate we provided to the Open Philanthropy Project of the amount of funds we expect to be able to usefully spend in 2018–2020.


The Open Philanthropy Project has expressed openness to potentially increasing their support if MIRI is in a position to usefully spend more than our conservative estimate, if they believe that this increase in spending is sufficiently high-value, and if we are able to secure additional outside support to ensure that the Open Philanthropy Project isn’t providing more than half of our total funding.

We’ll be going into more detail on our future organizational plans in a follow-up post on December 1, where we’ll also discuss our end-of-the-year fundraising goals.

In their write-up, the Open Philanthropy Project notes that they have updated favorably about our technical output since 2016, following our logical induction paper:

We received a very positive review of MIRI’s work on “logical induction” by a machine learning researcher who (i) is interested in AI safety, (ii) is rated as an outstanding researcher by at least one of our close advisors, and (iii) is generally regarded as outstanding by the ML community. As mentioned above, we previously had difficulty evaluating the technical quality of MIRI’s research, and we previously could find no one meeting criteria (i)–(iii) to a comparable extent who was comparably excited about MIRI’s technical research. While we would not generally offer a comparable grant to any lab on the basis of this consideration alone, we consider this a significant update in the context of the original case for the [2016] grant (especially MIRI’s thoughtfulness on this set of issues, value alignment with us, distinctive perspectives, and history of work in this area). While the balance of our technical advisors’ opinions and arguments still leaves us skeptical of the value of MIRI’s research, the case for the statement “MIRI’s research has a nontrivial chance of turning out to be extremely valuable (when taking into account how different it is from other research on AI safety)” appears much more robust than it did before we received this review.

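The announcement also states, “In the time since our initial grant to MIRI, we have made several more grants within this focus area, and are therefore less concerned that a larger grant will signal an outsized endorsement of MIRI’s approach.”

We’re enormously grateful for the Open Philanthropy Project’s support, and for their deep engagement with the AI safety field as a whole. To learn more about our discussions with the Open Philanthropy Project and their active work in this space, see the group’s previous AI safety grants, our conversation with Daniel Dewey on the Effective Altruism Forum, and the research problems raised in the Open Philanthropy Project’s recent AI fellows program description.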

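  1. The Open Philanthropy Project usually prefers not to provide more than half of an organization’s funding, to facilitate funder coordination and ensure that the organizations it supports maintain their independence. From a March blog post: “We typically avoid situations in which we provide >50% of an organization’s funding, so as to avoid creating a situation in which an organization’s total funding is ‘fragile’ as a result of being overly dependent on us.”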



Eliezer Yudkowsky has written a new book on civilizational dysfunction and outperformance: Inadequate Equilibria: Where and How Civilizations Get Stuck. The full book will be available in print and electronic formats November 16. To preorder the ebook or sign up for updates,


  1. Inadequacy and Modesty (discussion: LessWrong, EA Forum, Hacker News)
  2. An Equilibrium of No Free Energy (discussion: LessWrong, EA Forum)


General updates

News and links

New paper: “Functional Decision Theory”



MIRI Senior Research Fellow Eliezer Yudkowsky and Executive Director Nate Soares have a new introductory paper out on decision theory: “Functional decision theory: A new theory of instrumental rationality.”


This paper describes and motivates a new decision theory known as functional decision theory (FDT), as distinct from causal decision theory and evidential decision theory.

Functional decision theorists hold that the normative principle for action is to treat one’s decision as the output of a fixed mathematical function that answers the question, “Which output of this very function would yield the best outcome?” Adhering to this principle delivers a number of benefits, including the ability to maximize wealth in an array of traditional decision-theoretic and game-theoretic problems where CDT and EDT perform poorly. Using one simple and coherent decision rule, functional decision theorists (for example) achieve more utility than CDT on Newcomb’s problem, more utility than EDT on the smoking lesion problem, and more utility than both in Parfit’s hitchhiker problem.

In this paper, we define FDT, explore its prescriptions in a number of different decision problems, compare it to CDT and EDT, and give philosophical justifications for FDT as a normative theory of decision-making.
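To make the Newcomb’s problem comparison in the abstract concrete, here is a minimal Python sketch of the expected payoffs. It is not an implementation of FDT or CDT; it only compares the two policies those theories are usually described as recommending (one-boxing for FDT, two-boxing for CDT). The $1,000,000 and $1,000 prizes and the 99% predictor accuracy are the conventional textbook values and are assumptions here, not figures taken from this post, and the function name is illustrative.

```python
def newcomb_expected_payoff(one_box: bool, predictor_accuracy: float,
                            opaque_prize: float = 1_000_000.0,
                            transparent_prize: float = 1_000.0) -> float:
    """Expected payoff of a fixed policy in Newcomb's problem.

    Assumes the predictor guesses the agent's policy correctly with
    probability predictor_accuracy, and fills the opaque box only when
    it predicts one-boxing.
    """
    if not 0.0 <= predictor_accuracy <= 1.0:
        raise ValueError("predictor_accuracy must be between 0 and 1")
    if one_box:
        # The opaque box is full exactly when the prediction was correct.
        return predictor_accuracy * opaque_prize
    # Two-boxing always collects the transparent prize, plus the opaque
    # prize only when the predictor wrongly expected one-boxing.
    return transparent_prize + (1.0 - predictor_accuracy) * opaque_prize


accuracy = 0.99  # assumed predictor accuracy
one_boxing = newcomb_expected_payoff(True, accuracy)
two_boxing = newcomb_expected_payoff(False, accuracy)
print(f"one-boxing: ${one_boxing:,.0f}")
print(f"two-boxing: ${two_boxing:,.0f}")
```

Under these assumed payoffs, one-boxing has the higher expected payoff whenever the predictor’s accuracy is above 50.05%, which is the sense in which a policy of one-boxing ends up with more money on average than a policy of two-boxing.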

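Our previous introductory paper on FDT, “Cheating Death in Damascus”, focused on comparing FDT’s performance to that of CDT and EDT in fairly high-level terms. Yudkowsky and Soares’ new paper puts a much larger focus on FDT’s mechanics and motivations, making “Functional Decision Theory” the most complete stand-alone introduction to the theory.1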


  1. “Functional Decision Theory” was originally drafted prior to “Cheating Death in Damascus,” and was significantly longer before we received various rounds of feedback from the philosophical community. “Cheating Death in Damascus” was produced from material that was cut from early drafts; other cut material included a discussion of proof-based decision theory, and some Death in Damascus variants left on the cutting room floor for being needlessly cruel to CDT.