Machine Ethics Annotated Bibliography

Since before the advent of the term “artificial intelligence,” humanity has been fascinated with the prospect of sharing its rational ability with another species. The initial attempts at this came in the form of computers doing things that seem like human activities, such as playing chess. Recently, however, a higher bar has been set for the rationality of machines: the ability to learn. This ability is known as “general artificial intelligence,” or machine learning. If successful, a learning machine will be able to update its beliefs, learn from its mistakes, and be taught how to behave without the need for explicit reprogramming. This technology, while technically an engineering problem, has unavoidable ties to ethics, one of the most well-worn philosophical problems. Accelerating progress toward intelligent machines is forcing engineers to implement ethical systems that are murky or nonexistent. As these machines become more integrated into human society, they will gain positions that put the wellbeing of humans into their metaphorical, or perhaps literal, hands.

This situation requires the close attention, perhaps for the first time, of both engineering and philosophy. The technical professions are eager to push the limits of what can be achieved in machine learning, and are pushed further by investors as consumers itch for exciting high-tech products. Unfortunately, these engineers are rarely, if ever, involved in conversations about ethics, let alone about what unforeseen moral implications their software might have. Philosophers and ethicists, on the other hand, are formally trained in moral theory and the qualities of moral agents. Understandably, millennia of progress in ethics have only ever dealt with ethics as it relates to human beings. Never formalized in a way intelligible to machines, ethical theories now need to be reimagined by both professions to fit an entirely new form. Because the technology is advancing exceedingly fast, the two disciplines will need to formalize machine ethics immediately. The failure to do so will, at best, threaten the wellbeing of the most vulnerable humans, and at worst bring an end to human autonomy as it is currently known.

This annotated bibliography will guide the technically and philosophically savvy reader to an understanding of the current state of the problem of machine ethics. The first section outlines the interdisciplinary nature of machine ethics and explains why it is not only a worthwhile question but an urgent one. The second section focuses on the concept of granting moral agency1 to machines, as it is granted to humans or animals. The third section suggests that attempting to encode ethics into machines is fundamentally flawed. The remaining sources discuss what has been proposed as possible pragmatic solutions to engineering ethical beliefs into machine learning models. Because this is an interdisciplinary topic, a glossary of some especially technical and philosophical terms is provided at the end.

  1. Action for Machine Ethics is Urgent

    • Bonsignorio, Fabio. “The New Experimental Science of Physical Cognitive Systems: AI, Robotics, Neuroscience and Cognitive Sciences under a New Name with the Old Philosophical Problems?” Philosophy and Theory of Artificial Intelligence, Sapere (2013): 133-150. doi: 10.1007/978-3-642-31674-6_10.

      Bonsignorio argues in this article that the new “hard science” fields of artificial intelligence are repurposed problems that philosophers have tackled for ages. He builds the foundation of this idea on mappings between the disciplines: sentience, cognition, epistemology. He further supports this with case studies of past scientific discoveries that were typically controversial in their time and born from schools of philosophy, like Newtonian physics and 20th century psychology. This underlines the importance of the disciplines joining forces on these problems. Bonsignorio suggests that it will be not the specialized but the open-minded and interdisciplinary figures who see machine learning and related fields to fruition. Whether these new systems are based on the philosophical theories of old or on new foundations, the problems have been unsolved all along.

    • Goodall, Noah. “Machine Ethics and Automated Vehicles.” Road Vehicle Automation, Springer (2014): 93-102. doi: 10.1007/978-3-319-05990-7_9.

      This article delivers two important but separate arguments. First, Goodall outlines and disputes nine key criticisms of the need for machine ethics in autonomous vehicles. His responses conclude that autonomous vehicles will crash, will not always be able to consult a driver, and will make choices about the lives of their passengers and of the passengers and pedestrians around them. Goodall posits that automated vehicles ought not be deployed without a sober consideration of these ethical problems. Second, Goodall briefly enumerates failed deontological2 and utilitarian3 approaches, as well as inconclusive attempts at machine learning approaches, concluding that much work remains. This remaining work, coupled with a dire call for applied ethics for autonomous vehicles, highlights that both disciplines – technological and philosophical – have met at an impasse that needs urgent resolution.

    • West, John. “Microsoft’s Disastrous Tay Experiment Shows the Hidden Dangers of AI.” Quartz Media (2016). https://qz.com/.

      West reports on Microsoft’s failed machine learning experiment, which trained its beliefs on unsupervised inputs from Twitter conversations. The report notes examples of racist, classist, and sexist opinions expressed by the machine as well as Microsoft’s swift shutdown of the project, citing a need for “some adjustments.” West highlights that, while some may consider the machine’s ability to learn a technical success, its actualization as a tool for harassment emphasizes the need for more perspectives and planning when creating artificial intelligences. The failed Tay experiment is an example of a learning machine without a code of ethics, or any kind of thoughtful belief-updating functionality, the results of which were to the detriment of vulnerable human beings. While many individuals doubt the feasibility of creating morally autonomous machines, real disasters like these demonstrate the grave need for such systems as learning machines continue to integrate into society.

  2. Moral Agency for Machines

    • Weber, Karsten. “What Is It Like to Encounter an Autonomous Artificial Agent?” AI & Society (2013) 28: 483-489. doi: 10.1007/s00146-013-0453-3.

      Weber published this article as a response to Thomas Nagel’s 1974 paper, “What is it like to be a bat?,” and the famous Turing essay outlining the Turing Test. The Turing Test4 is also known as the “imitation game,” where the mark of agency for a machine is simply that it can fool a human into believing she is interacting with another human. The article argues that if humans cannot distinguish the presence of true sentience in animals, such as a bat, or even in one another, then the question of sentience in machines seems pragmatically uninteresting. Weber warns that, if humans perceive machines as being morally autonomous, as modern psychology research on Turing Test-like scenarios suggests they do, designers of such machines will be enticed to accentuate such features so as not to be blamed for inevitable mishaps. This perceived moral authority must not be allowed to block the ethical updating of machines that others in this list are calling for.

    • Bostrom, Nick & Yudkowsky, Eliezer. “The Ethics of Artificial Intelligence.” Cambridge University Press (2011): forthcoming. http://faculty.smcm.edu.

      The authors of this article, writing from Oxford and the Singularity Institute for Artificial Intelligence, explore the kinds of questions that need to be answered to understand why ethics ought to be applied to machines. The authors outline the ways humanity generally classifies moral status and how, if technologies begin to exhibit such qualities, humanity will need to consider what status machines warrant among humans, animals, etc. Similarly, engineers of machines will need to instill a preference for good in machines that reach superintelligence or that make decisions for humans concerning the wellbeing of other moral agents. The authors conclude that, if the ideal states (like those of a chess game) cannot be known a priori, then a system of transparency will be crucial for machines. These interests bolster worries about current machine learning models that leave engineers unable to evaluate a machine’s rationale in making a complicated decision.

    • Arnold, Thomas & Scheutz, Matthias. “Against the Moral Turing Test: Accountable Design and the Moral Reasoning of Autonomous Systems.” Ethics and Information Technology (2016) 18: 103-115. doi: 10.1007/s10676-016-9389-x.

      The authors of this article take an opposing stance to Weber on the implications of the Turing Test. They argue that imitation does not constitute understanding, and therefore does not grant the moral agency that is afforded to human beings. Further, they dispute Total Turing Tests, where the machine is “brought into the room,” highlighting that imitation may be more an art of trickery than of validation. If individuals are to treat machines as moral agents in the way they treat humans, verifying real sentience – pain, passion, etc. – is the test machines must pass, and not one of simulation. This emphasizes the shifting expectations from artificial intelligence to machine learning, in that something with the ability to learn (Microsoft’s Tay) requires stricter tests than something with simply the ability to do (toaster, chess player).

  3. Traditional Ethics Are Not Suited for Computation

    • Oesterheld, Casper. “Formalizing Preference Utilitarianism in Physical World Models.” Synthese (2016) 193: 2747–2759. doi: 10.1007/s11229-015-0883-1.

      Oesterheld published this article at the University of Bremen, Germany as a researcher of ethics and theory of computation. The article discusses the practicality of creating a system that computationally satisfies the maxim of welfare of all agents in a utilitarian universe. This theoretical model would be bound to the physical laws of the universe and have predictive power over past and future models of the environment and its agents, thus giving it power to make deterministic decisions about maximizing utility. Oesterheld finds this model incomputable in practice and warns that it may stray from popular moral intuitions that are strictly non-utilitarian. The author’s formalization of a traditionally informal topic, and its subsequent practical failure, shed light on the problem of encoding ethical maxims in general. They also lend support to the notion that the results, or lack thereof, in formalizing ethics will help ethicists better understand the shortcomings of their theories.

    • Goodall, Noah. “Ethical Decision Making During Automated Vehicle Crashes.” Transportation Research Board of the National Academies (2014): 58-65. doi: 10.3141/2424-07.

      Goodall, a respected researcher within the subject of self-driving cars, writes a comprehensive article on the problem of machines making ethical decisions in the event of a crash. Goodall explains why deontological and consequential5 models of ethics are not well suited for cars, and argues instead for a “three phase” approach. The basis and first phase is a set of ethics inherited from agreed-upon laws. Second, machine learning will incorporate those fundamentals with the specifics of crash scenarios to decide how to act. Finally, an intelligible explanation will be provided by the machine for why it acted. Goodall believes this method is crucial to safe and trustworthy vehicles. Unfortunately, as is explained in other articles here, the synthesis of an explanation from machine learning methods such as neural networks is exceedingly complex, making correction of such systems virtually impossible.

    • Anderson, Susan Leigh. “Asimov’s ‘Three Laws of Robotics’ and Machine Metaethics.” AI & Society (2008) 22: 477-493. doi: 10.1007/s00146-007-0094-5.

      Though this article is slightly dated, Anderson is one of the first and most cited researchers in the field of machine ethics, and her analysis of Asimov’s original 1950s solution to machine ethics is paramount to understanding why problems remain. She explains why the simple three-step deontological ethics for machines is insufficient, as Asimov himself showed in his fiction, for the safe and reliable operation of a learning machine. Anderson shows that Asimov’s laws do not grant moral agency to machines, and in fact render them slaves. Even if this is acceptable, Asimov’s laws, and feasibly any deontology, cannot generate reliable ethical decision making in machines. While the laws do not allow machines to physically harm humans, they do not outline complete ethical decision making. Anderson wonders, as Yampolskiy does below, if machines can ever be expected to act on ethical decisions.

  4. Moving Forward Pragmatically

    • Powers, Thomas. “Incremental Machine Ethics.” IEEE (2011): 51-58. doi: 10.1109/MRA.2010.940152.

      Powers points out in this article that the techniques of machine learning in computer systems engineering, though advancing, are experimental and widely disagreed upon. Similarly, ethics is a question that is hugely disagreed upon. From this, Powers proposes that machine ethics take an incremental approach. Specifically, he means that, just as ethics has changed incrementally and served new purposes as needed, machine ethics will change incrementally, incorporating new statutes as they are needed. This proposal contrasts with that of Goodall, who believes that advancement of such systems requires answered questions of ethics. Powers instead suggests that humanity treat this growth of machine ethics like a parent treats a child: setting guidelines for what they know, correcting when the child misbehaves, and learning curious insights from the child themselves. This method is optimistic, and suggestive of machine ethics being a tool, rather than an application, for discovering better moral frameworks than are currently understood.

    • Armstrong, Stuart & Sandberg, Anders & Bostrom, Nick. “Thinking Inside the Box: Controlling and Using an Oracle AI.” Minds & Machines (2012) 22: 299-324. doi: 10.1007/s11023-012-9282-2.

      The authors of this article, all fairly prominent in the field of machine learning, suggest that the problem of an ethically murky machine stems from the machine being able to interact with the world and update its beliefs based on its perceptions. This observation leads to the thought of an oracle machine, which would not interact with the world, but simply answer questions. All other AI, then, could simply defer moral questions to the oracle, which should never be corrupted by private interests. This potentially solves the critical problem of Microsoft’s Tay and the hypothetical problem of autonomous vehicles learning unethical maneuvers. The authors point out that, while implementation is far from simple and an oracle machine may only be a complication and not a solution, the prospects for an oracle machine are fundamentally simpler than integrating ethics into general AI.

    • Yampolskiy, Roman. “Artificial Intelligence Safety Engineering: Why Machine Ethics Is a Wrong Approach.” Springer (2013): 335–347.

      Yampolskiy holds the opinion that ethical norms are not universal, and thus any attempt to apply any ethical code to machines will not satisfy all of humanity. He also explains that the Moral Turing Test is too weak a requirement, for if a machine acts like a human, it will often make immoral choices, which is unacceptable. From there, the article pushes back on the broad notion of applying traditional ethics to machines, arguing that attention to safety and law is all that is necessary. Yampolskiy points out that not implementing ethics in machines will mean they will not be deployed in situations where they make ethical decisions, at most offering ethical advice to human decision makers. This, though an argument against implementing actionable ethics in machines, is also an optimistic proposal for the future of limited, but inherently safer, machine learning.


  1. moral agency:

    The ability of an individual to make judgements about what is right and wrong, good and bad. Moral agents are also generally obligated to treat one another morally.

  2. deontological ethics:

    Ethical models that focus on laws that must be followed, regardless of anticipated outcomes.

  3. utilitarian ethics:

    Ethical models that maximize good among all involved moral agents.

  4. Turing Test:

    One of the original tests of machine agency, famously proposed by Alan Turing, which pragmatically takes as its criterion a human’s inability to distinguish the machine from another human.

  5. consequentialist ethics:

    Ethical models that focus on anticipated outcomes, ignoring predefined laws for actions.