Updates to the research team, and a major donation

July 4, 2017|Malo Bourgon|News

We have several major announcements to make, covering new developments in the two months since our 2017 strategy update:

1. On May 30th, we received a surprise $1.01 million donation from an Ethereum cryptocurrency investor. This is the single largest contribution we have received to date by a large margin, and will have a substantial effect on our plans over the coming year.

2. Two new full-time researchers are joining MIRI: Tsvi Benson-Tilsen and Abram Demski. This comes in the wake of Sam Eisenstat and Marcello Herreshoff’s addition to the team in May. We’ve also begun working with engineers on a trial basis for our new slate of software engineer job openings.

3. Two of our researchers have recently left: Patrick LaVictoire and Jessica Taylor, researchers previously heading work on our “Alignment for Advanced Machine Learning Systems” research agenda.

For more details, see below.

June 2017 Newsletter

June 16, 2017|Rob Bensinger|Newsletters

Research updates

A new AI Impacts paper: “When Will AI Exceed Human Performance?” News coverage at Digital Trends and MIT Technology Review.
New at IAFF: Cooperative Oracles; Jessica Taylor on the agenda; An Approach to Logically Updateless Decisions
Our 2014 technical agenda, “Agent Foundations for Aligning Machine Intelligence with Human Interests,” is now available as a book chapter in the anthology The Technological Singularity: Managing the Journey.

General updates

readthesequences.com: supporters have put together a web version of Eliezer Yudkowsky’s Rationality: From AI to Zombies.
The Oxford Prioritisation Project publishes a model of MIRI’s work as an existential risk intervention.

News and links

From MIT Technology Review: “Why Google’s CEO Is Excited About Automating Artificial Intelligence.”
A new alignment paper from researchers at Australian National University and DeepMind: “Reinforcement Learning with a Corrupted Reward Channel.”
New from OpenAI: Baselines, a tool for reproducing reinforcement learning algorithms.
The Future of Humanity Institute and Centre for the Future of Intelligence join the Partnership on AI alongside twenty other groups.
New AI safety job postings include research roles at the Future of Humanity Institute and the Center for Human-Compatible AI, as well as a UCLA PULSE fellowship for studying AI’s potential large-scale consequences and appropriate preparations and responses.

May 2017 Newsletter

May 10, 2017|Rob Bensinger|Newsletters

Research updates

New at IAFF: The Ubiquitous Converse Lawvere Problem; Two Major Obstacles for Logical Inductor Decision Theory; Generalizing Foundations of Decision Theory II.
New at AI Impacts: Guide to Pages on AI Timeline Predictions
“Decisions Are For Making Bad Outcomes Inconsistent”: Nate Soares dialogues on some of the deeper issues raised by our “Cheating Death in Damascus” paper.
We ran a machine learning workshop in early April.
“Ensuring Smarter-Than-Human Intelligence Has a Positive Outcome”: Nate’s talk at Google (video) provides probably the best general introduction to MIRI’s work on AI alignment.

General updates

Our strategy update discusses changes to our AI forecasts and research priorities, new outreach goals, a MIRI/DeepMind collaboration, and other news.
MIRI is hiring software engineers! If you’re a programmer who’s passionate about MIRI’s mission and wants to directly support our research efforts, apply here to trial with us.
MIRI Assistant Research Fellow Ryan Carey has taken on an additional affiliation with the Centre for the Study of Existential Risk, and is also helping edit an issue of Informatica on superintelligence.

News and links

DeepMind researcher Viktoriya Krakovna lists security highlights from ICLR.
DeepMind is seeking applicants for a policy research position “to carry out research on the social and economic impacts of AI”.
The Center for Human-Compatible AI is hiring an assistant director. Interested parties may also wish to apply for the event coordinator position at the new Berkeley Existential Risk Initiative, which will help support work at CHAI and elsewhere.
80,000 Hours lists other potentially high-impact openings, including ones at Stanford’s AI Index project, the White House OSTP, IARPA, and IVADO.
New papers: “One-Shot Imitation Learning” and “Stochastic Gradient Descent as Approximate Bayesian Inference.”
The Open Philanthropy Project summarizes its findings on early field growth.
The Centre for Effective Altruism is collecting donations for the Effective Altruism Funds in a range of cause areas.

2017 Updates and Strategy

April 30, 2017|Rob Bensinger|MIRI Strategy

In our last strategy update (August 2016), Nate wrote that MIRI’s priorities were to make progress on our agent foundations agenda and begin work on our new “Alignment for Advanced Machine Learning Systems” agenda, to collaborate and communicate with other researchers, and to grow our research and ops teams.

Since then, senior staff at MIRI have reassessed their views on how far off artificial general intelligence (AGI) is and concluded that shorter timelines are more likely than they previously thought. A few lines of recent evidence point in this direction, such as:¹

AI research is becoming more visibly exciting and well-funded. This suggests that more top talent (in the next generation as well as the current generation) will probably turn their attention to AI.
AGI is attracting more scholarly attention as an idea, and is the stated goal of top AI groups like DeepMind, OpenAI, and FAIR. In particular, many researchers seem more open to thinking about general intelligence now than they did a few years ago.
Research groups associated with AGI are showing much clearer external signs of profitability.
AI successes like AlphaGo indicate that it’s easier to outperform top humans in domains like Go (without any new conceptual breakthroughs) than might have been expected.² This lowers our estimate for the number of significant conceptual breakthroughs needed to rival humans in other domains.

There’s no consensus among MIRI researchers on how long timelines are, and our aggregated estimate puts medium-to-high probability on scenarios in which the research community hasn’t developed AGI by, e.g., 2035. On average, however, research staff now assign moderately higher probability to AGI’s being developed before 2035 than we did a year or two ago. This has a few implications for our strategy:

1. Our relationships with current key players in AGI safety and capabilities play a larger role in our strategic thinking. Short-timeline scenarios reduce the expected number of important new players who will enter the space before we hit AGI, and increase how much influence current players are likely to have.

2. Our research priorities are somewhat different, since shorter timelines change what research paths are likely to pay out before we hit AGI, and also concentrate our probability mass more on scenarios where AGI shares various features in common with present-day machine learning systems.

Both updates represent directions we’ve already been trending in for various reasons.³ However, we’re moving in these two directions more quickly and confidently than we were last year. As an example, Nate is spending less time on staff management and other administrative duties than in the past (having handed these off to MIRI COO Malo Bourgon) and less time on broad communications work (having delegated a fair amount of this to me), allowing him to spend more time on object-level research, research prioritization work, and more targeted communications.⁴

I’ll lay out what these updates mean for our plans in more concrete detail below.

¹ Note that this list is far from exhaustive.
² Relatively general algorithms (plus copious compute) were able to surpass human performance on Go, going from incapable of winning against the worst human professionals in standard play to dominating the very best professionals in the space of a few months. The relevant development here wasn’t “AlphaGo represents a large conceptual advance over previously known techniques,” but rather “contemporary techniques run into surprisingly few obstacles when scaled to tasks as pattern-recognition-reliant and difficult (for humans) as professional Go”.
³ The publication of “Concrete Problems in AI Safety” last year, for example, caused us to reduce the time we were spending on broad-based outreach to the AI community at large in favor of spending more time building stronger collaborations with researchers we knew at OpenAI, Google Brain, DeepMind, and elsewhere.
⁴ Nate continues to set MIRI’s organizational strategy, and is responsible for the ideas in this post.

Software Engineer Internship / Staff Openings

April 30, 2017|Alex Vermeer|News

The Machine Intelligence Research Institute is looking for highly capable software engineers to directly support our AI alignment research efforts, with a focus on projects related to machine learning. We’re seeking engineers with strong programming skills who are passionate about MIRI’s mission and looking for challenging and intellectually engaging work.

While our goal is to hire full-time, we are initially looking for paid interns. Successful internships may then transition into staff positions.

About the Internship Program

The start time for interns is flexible, but we’re aiming for May or June. We will likely run several batches of internships, so if you are interested but unable to start in the next few months, do still apply. The length of the internship is flexible, but we’re aiming for 2–3 months.

Examples of the kinds of work you’ll do during the internship:

Replicate recent machine learning papers, and implement variations.
Learn about and implement machine learning tools (including results in the fields of deep learning, convex optimization, etc.).
Run various coding experiments and projects, either independently or in small groups.
Rapidly prototype, implement, and test AI alignment ideas related to machine learning (after demonstrating successes in the above points).

For MIRI, the benefit of this program is that it’s a great way to get to know you and assess you for a potential hire. For applicants, the benefits are that this is an excellent opportunity to get your hands dirty and level up your machine learning skills, and to get to the cutting edge of the AI safety field, with a potential to stay in a full-time engineering role after the internship concludes.

Our goal is to trial many more people than we expect to hire, so our threshold for keeping on engineers long-term as full staff will be higher than for accepting applicants to our internship.

The Ideal Candidate

Some qualities of the ideal candidate:

Extensive breadth and depth of programming skills. Machine learning experience is not required, though it is a plus.
Highly familiar with basic ideas related to AI alignment.
Able to work independently with minimal supervision, and in team/group settings.
Willing to accept a below-market rate. Since MIRI is a non-profit, we can’t compete with the Big Names in the Bay Area.
Enthusiastic about the prospect of working at MIRI and helping advance the field of AI alignment.
Not looking for a “generic” software engineering position.

Working at MIRI

We strive to make working at MIRI a rewarding experience.

Modern Work Spaces — Many of us have adjustable standing desks with large external monitors. We consider workspace ergonomics important, and try to rig up work stations to be as comfortable as possible. Free snacks, drinks, and meals are also provided at our office.
Flexible Hours — We don’t have strict office hours, and we don’t limit employees’ vacation days. Our goal is to make rapid progress on our research agenda, and we would prefer that staff take a day off than that they extend tasks to fill an extra day.
Living in the Bay Area — MIRI’s office is located in downtown Berkeley, California. From our office, you’re a 30-second walk to the BART (Bay Area Rapid Transit), which can get you around the Bay Area; a 3-minute walk to UC Berkeley campus; and a 30-minute BART ride to downtown San Francisco.

EEO & Employment Eligibility

MIRI is an equal opportunity employer. We are committed to making employment decisions based on merit and value. This commitment includes complying with all federal, state, and local laws. We desire to maintain a work environment free of harassment or discrimination due to sex, race, religion, color, creed, national origin, sexual orientation, citizenship, physical or mental disability, marital status, familial status, ethnicity, ancestry, status as a victim of domestic violence, age, or any other status protected by federal, state, or local laws.

Apply

If interested, click here to apply. For questions or comments, email engineering@www.gqpatrol.com.

Update (December 2017): We’re now putting less emphasis on finding interns and looking for highly skilled engineers available for full-time work. Updated job post here.

Ensuring smarter-than-human intelligence has a positive outcome

April 12, 2017|Nate Soares|Analysis, Video

I recently gave a talk at Google on the problem of aligning smarter-than-human AI with operators’ goals:

The talk was inspired by “AI Alignment: Why It’s Hard, and Where to Start,” and serves as an introduction to the subfield of alignment research in AI. A modified transcript follows.

Talk outline (slides):

1. Overview

2. Simple bright ideas going wrong

2.1. Task: Fill a cauldron
2.2. Subproblem: Suspend buttons

3. The big picture

3.1. Alignment priorities
3.2. Four key propositions

4. Fundamental difficulties

Decisions are for making bad outcomes inconsistent

April 7, 2017|Rob Bensinger|Conversations

Nate Soares’ recent decision theory paper with Ben Levinstein, “Cheating Death in Damascus,” prompted some valuable questions and comments from an acquaintance (anonymized here). I’ve put together edited excerpts from the commenter’s email below, with Nate’s responses.

The discussion concerns functional decision theory (FDT), a newly proposed alternative to causal decision theory (CDT) and evidential decision theory (EDT). Where EDT says “choose the most auspicious action” and CDT says “choose the action that has the best effects,” FDT says “choose the output of one’s decision algorithm that has the best effects across all instances of that algorithm.”

FDT usually behaves similarly to CDT. In a one-shot prisoner’s dilemma between two agents who know they are following FDT, however, FDT parts ways with CDT and prescribes cooperation, on the grounds that each agent runs the same decision-making procedure, and that therefore each agent is effectively choosing for both agents at once.¹
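The contrast can be made concrete with a toy sketch. It is not taken from the paper: the payoff numbers and the function names `cdt_choice` and `fdt_choice` are illustrative assumptions, and FDT is reduced here to the special case of an agent that knows it faces an exact copy of its own decision procedure.

```python
# Payoffs for "me": (my_action, their_action) -> my utility.
# The numbers are illustrative assumptions with the usual
# prisoner's dilemma ordering (5 > 3 > 1 > 0).
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}
ACTIONS = ("C", "D")


def cdt_choice(p_other_cooperates: float) -> str:
    """CDT-style choice: treat the other agent's action as causally
    independent of mine, fix a probability for it, and pick the action
    with the best expected payoff."""
    def expected(action: str) -> float:
        return (p_other_cooperates * PAYOFF[(action, "C")]
                + (1 - p_other_cooperates) * PAYOFF[(action, "D")])
    return max(ACTIONS, key=expected)


def fdt_choice() -> str:
    """FDT-style choice against a known copy: both agents run this same
    procedure, so selecting an output sets both actions at once."""
    return max(ACTIONS, key=lambda output: PAYOFF[(output, output)])


if __name__ == "__main__":
    # Defection dominates for the CDT-style agent at every belief.
    print([cdt_choice(p) for p in (0.0, 0.5, 1.0)])
    # The FDT-style agent compares (C, C) with (D, D) and cooperates.
    print(fdt_choice())
```

Under these assumed payoffs, the CDT-style agent defects whatever it believes about the other player, while the FDT-style agent only ever compares the two outcomes in which both copies act alike.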

Below, Nate provides some of his own perspective on why FDT generally achieves higher utility than CDT and EDT. Some of the stances he sketches out here are stronger than the assumptions needed to justify FDT, but should shed some light on why researchers at MIRI think FDT can help resolve a number of longstanding puzzles in the foundations of rational action.

Anonymous: This is great stuff! I’m behind on reading loads of papers and books for my research, but this came across my path and hooked me, which speaks highly of how interesting the content is and the sense that this paper is making progress.

My general take is that you are right that these kinds of problems need to be specified in more detail. However, my guess is that once you do so, game theorists would get the right answer. Perhaps that’s what FDT is: it’s an approach to clarifying ambiguous games that leads to a formalism where people like Pearl and myself can use our standard approaches to get the right answer.

I know there’s a lot of inertia in the “decision theory” language, so probably it doesn’t make sense to change. But if there were no such sunk costs, I would recommend a different framing. It’s not that people’s decision theories are wrong; it’s that they are unable to correctly formalize problems in which there are high-performance predictors. You show how to do that, using the idea of intervening on (i.e., choosing between putative outputs of) the algorithm, rather than intervening on actions. Everything else follows from a sufficiently precise and non-contradictory statement of the decision problem.

Probably the easiest move this line of work could make to ease this knee-jerk response of mine in defense of mainstream Bayesian game theory is to just be clear that CDT is not meant to capture mainstream Bayesian game theory. Rather, it is a model of one response to a class of problems not normally considered and for which existing approaches are ambiguous.

Nate Soares: I don’t accept this view. My view is more like: When you add accurate predictors to the Rube Goldberg machine that is the universe (which can in fact be done), the future of that universe can be determined by the behavior of the algorithm being predicted. The algorithm that we put in the “thing-being-predicted” slot can do significantly better if its reasoning on the subject of which actions to output respects the universe’s downstream causal structure (which is something CDT and FDT do, but which EDT neglects), and it can do better again if its reasoning also respects the world’s global logical structure (which is done by FDT alone).

We don’t know exactly how to respect this wider class of dependencies in general yet, but we do know how to do it in many simple cases. While FDT agrees with modern decision theory and game theory in many simple situations, its prescriptions do seem to differ in non-trivial applications.
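One simple case of this kind can be sketched in a few lines. The setup below is the standard Newcomb's problem, which this exchange does not itself spell out: the dollar amounts, the `accuracy` parameter, and the function names are assumptions made for illustration. A predictor fills an opaque box with a large prize only if it forecasts that the agent will take that box alone. The sketch treats the forecast as tracking the output of the agent's decision algorithm, which is the kind of dependence the reply describes.

```python
# Assumed prizes: opaque box (if filled) and transparent box.
BIG, SMALL = 1_000_000, 1_000
POLICIES = ("one-box", "two-box")


def expected_payoff(policy: str, accuracy: float) -> float:
    """Expected winnings when the predictor forecasts the agent's policy
    correctly with probability `accuracy`."""
    if policy == "one-box":
        # The opaque box is full only when the forecast was correct.
        return accuracy * BIG
    if policy == "two-box":
        # The opaque box is full only when the forecast was wrong.
        return (1 - accuracy) * BIG + SMALL
    raise ValueError(f"unknown policy: {policy}")


def best_policy(accuracy: float) -> str:
    """Pick the algorithm output with the higher expected payoff, letting
    the forecast vary along with the output."""
    return max(POLICIES, key=lambda p: expected_payoff(p, accuracy))


if __name__ == "__main__":
    for accuracy in (0.5, 0.6, 0.99):
        print(accuracy, best_policy(accuracy))
```

With these assumed prizes, one-boxing comes out ahead once the predictor is only slightly better than chance. An agent that instead holds the box contents fixed, as CDT does, finds that two-boxing always adds `SMALL` and so takes both boxes. Note that EDT also one-boxes in this particular problem, so the sketch separates FDT from CDT only.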

The main case where we can easily see that FDT is not just a better tool for formalizing game theorists’ traditional intuitions is in prisoner’s dilemmas. Game theory is pretty adamant about the fact that it’s rational to defect in a one-shot PD, whereas two FDT agents facing off in a one-shot PD will cooperate.

In particular, classical game theory employs a “common knowledge of shared rationality” assumption which, when you look closely at it, cashes out more or less as “common knowledge that all parties are using CDT and this axiom.” Game theory where common knowledge of shared rationality is defined to mean “common knowledge that all parties are using FDT and this axiom” gives substantially different results, such as cooperation in one-shot PDs.

¹ CDT prescribes defection in this dilemma, on the grounds that one’s action cannot cause the other agent to cooperate. FDT outperforms CDT in Newcomblike dilemmas like these, while also outperforming EDT in other dilemmas, such as the smoking lesion problem and XOR blackmail.

April 2017 Newsletter

April 6, 2017|Rob Bensinger|Newsletters

Our newest paper, “Cheating Death in Damascus,” makes the case for functional decision theory, our general framework for thinking about rational choice and counterfactual reasoning.

In other news, our research team is expanding! Sam Eisenstat and Marcello Herreshoff, both previously at Google, join MIRI this month.

Research updates

New at IAFF: “Formal Open Problem in Decision Theory”
New at AI Impacts: “Trends in Algorithmic Progress”; “Progress in General-Purpose Factoring”
We ran a weekend workshop on agent foundations and AI safety.

General updates

Our annual review covers our research progress, fundraiser outcomes, and other take-aways from 2016.
We attended the Colloquium on Catastrophic and Existential Risk.
Nate Soares weighs in on the Future of Life Institute’s Risk Principle.
“Elon Musk’s Billion-Dollar Crusade to Stop the AI Apocalypse” features quotes from Eliezer Yudkowsky, Demis Hassabis, Mark Zuckerberg, Peter Thiel, Stuart Russell, and others.

News and links

The Open Philanthropy Project and OpenAI begin a partnership: Holden Karnofsky joins Elon Musk and Sam Altman on OpenAI’s Board of Directors, and Open Philanthropy contributes $30M to OpenAI’s research program.
Open Philanthropy has also awarded $2M to the Future of Humanity Institute.
Modeling Agents with Probabilistic Programs: a new book by Owain Evans, Andreas Stuhlmüller, John Salvatier, and Daniel Filan.
New from OpenAI: “Evolution Strategies as a Scalable Alternative to Reinforcement Learning”; “Learning to Communicate”; “One-Shot Imitation Learning”; and from Paul Christiano, “Benign Model-Free RL.”
Chris Olah and Shan Carter discuss research debt as an obstacle to clear thinking and the transmission of ideas, and propose Distill as a solution.
Andrew Trask proposes encrypting deep learning algorithms during training.
Roman Yampolskiy seeks submissions for a book on AI safety and security.
80,000 Hours has updated their problem profile on positively shaping the development of AI, a solid introduction to AI risk, which 80K now ranks as the most urgent problem in the world. See also 80K’s write-up on in-demand skill sets at effective altruism organizations.
