ICLR Safe ML Workshop Report

This year the ICLR conference hosted topic-based workshops for the first time (as opposed to a single track for workshop papers), and I co-organized the Safe ML workshop. One of the main goals was to bring together the near-term and long-term safety research communities. The workshop was structured according to a taxonomy that incorporates both near […]

AI Safety: Measuring and Avoiding Side Effects Using Relative Reachability

This article was originally published on the Deep Safety blog. A major challenge in AI safety is reliably specifying human preferences to AI systems. An incorrect or incomplete specification of the objective can result in undesirable behavior like specification gaming or causing negative side effects. There are various ways to make the notion of a “side effect” more precise […]

Is There a Trade-off Between Immediate and Longer-term AI Safety Efforts?

Something I often hear in the machine learning community and media articles is “Worries about superintelligence are a distraction from the *real* problem X that we are facing today with AI” (where X = algorithmic bias, technological unemployment, interpretability, data privacy, etc). This competitive attitude gives the impression that immediate and longer-term safety concerns are […]

Deep Safety: NIPS 2017 Report

This year’s NIPS gave me a general sense that near-term AI safety is now mainstream and long-term safety is slowly going mainstream. On the near-term side, I particularly enjoyed Kate Crawford’s keynote on neglected problems in AI fairness, the ML security workshops, and the Interpretable ML symposium debate that addressed the “do we even need interpretability?” question in a somewhat sloppy […]

Tokyo AI & Society Symposium

I just spent a week in Japan to speak at the inaugural symposium on AI & Society – my first conference in Asia. It was inspiring to take part in an increasingly global conversation about AI impacts, and interesting to see how the Japanese AI community thinks about these issues. Overall, Japanese researchers seemed more […]

Portfolio Approach to AI Safety Research

Long-term AI safety is an inherently speculative research area, aiming to ensure the safety of advanced future systems despite uncertainty about their design, algorithms, or objectives. It thus seems particularly important to have different research teams tackle the problems from different perspectives and under different assumptions. While some fraction of the research might not end […]

Machine Learning Security at ICLR 2017

The overall theme of the ICLR conference setting this year could be summarized as “finger food and ships”. More importantly, there were a lot of interesting papers, especially on machine learning security, which will be the focus of this post. (Here is a great overview of the topic.)

AI Safety Highlights from NIPS 2016

This year’s Neural Information Processing Systems (NIPS) conference was larger than ever, with almost 6000 people attending, hosted in a huge convention center in Barcelona, Spain. The conference started off with two exciting announcements on open-sourcing collections of environments for training and testing general AI capabilities – the DeepMind Lab and the OpenAI Universe. Among […]

OpenAI Unconference on Machine Learning

The following post originally appeared here. Last weekend, I attended OpenAI’s self-organizing conference on machine learning (SOCML 2016), meta-organized by Ian Goodfellow (thanks Ian!). It was held at OpenAI’s new office, with several floors of large open spaces. The unconference format was intended to encourage people to present current ideas alongside completed work. The […]

New AI Safety Research Agenda From Google Brain

Google Brain just released an inspiring research agenda, Concrete Problems in AI Safety, co-authored by researchers from OpenAI, Berkeley and Stanford. This document is a milestone in setting concrete research objectives for keeping reinforcement learning agents and other AI systems robust and beneficial. The problems studied are relevant both to near-term and long-term AI safety, […]

Introductory Resources on AI Safety Research

A reading list to get up to speed on the main ideas in the field. The resources are selected for relevance and/or brevity, and the list is not meant to be comprehensive. [Updated on 15 August 2017.] Motivation (for a popular audience): Cade Metz, 2017, New York Times: Teaching A.I. Systems to Behave Themselves; FLI, AI risk background and FAQ. At the bottom […]

Risks From General Artificial Intelligence Without an Intelligence Explosion

“An ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind.” – Computer scientist I. J. Good, 1965. Artificial intelligence systems we have today can be referred to as narrow AI – they perform well at specific tasks, like playing […]

ITIF panel on superintelligence with Russell and Soares

The Information Technology and Innovation Foundation held a panel discussion on June 30, “Are Superintelligent Computers Really A Threat to Humanity?“. The panelists were Stuart Russell (FLI board member and grant recipient), Nate Soares (MIRI executive director), Manuela Veloso (AI researcher and FLI grant recipient), Ronald Arkin (AI researcher), and Robert Atkinson (ITIF President). The […]

Jaan Tallinn on existential risks

An excellent piece about existential risks by FLI co-founder Jaan Tallinn on Edge.org: “The reasons why I’m engaged in trying to lower the existential risks has to do with the fact that I’m a convinced consequentialist. We have to take responsibility for modeling the consequences of our actions, and then pick the actions that yield […]

Recent AI discussions

1. Brookings Institution post on Understanding Artificial Intelligence, discussing technological unemployment, regulation, and other issues. 2. A recap of the Science Friday episode with Stuart Russell, Eric Horvitz and Max Tegmark. 3. Ryan Calo on What Ex Machina’s Alex Garland Gets Wrong About Artificial Intelligence.