Taking Principles Seriously: A Hybrid Approach to Value Alignment in Artificial Intelligence

Authors: Tae Wan Kim, John Hooker, Thomas Donaldson

JAIR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | An important step in the development of value alignment (VA) systems in artificial intelligence (AI) is understanding how VA can reflect valid ethical principles. We propose that designers of VA systems incorporate ethics by utilizing a hybrid approach in which both ethical reasoning and empirical observation play a role. [...] Using quantified modal logic, we precisely formulate principles derived from deontological ethics and show how they imply particular test propositions for any given action plan in an AI rule base. The action plan is ethical only if the test proposition is empirically true, a judgment that is made on the basis of empirical VA. This permits empirical VA to integrate seamlessly with independently justified ethical principles. [...] We have sketched, in response, a proposal for understanding moral reasoning in machines, one that highlights how deontological ethical principles can interact with factual states of affairs.
Researcher Affiliation | Academia | Tae Wan Kim (EMAIL), John Hooker (EMAIL), Tepper School of Business, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA; Thomas Donaldson (EMAIL), Wharton School of Business, University of Pennsylvania, 3730 Walnut Street, Philadelphia, PA 19104-6340, USA
Pseudocode | No | The paper focuses on logically formulating ethical principles and their interaction with empirical observation. Although it notes that the structure of action plans is convenient for coding rules, it explicitly states: "While it is not our purpose to address engineering aspects of deontically-grounded VA, we can take note of some implementation issues that arise." No pseudocode or algorithm blocks are provided.
Open Source Code | No | The paper does not mention the release of any source code, nor does it link to repositories or supplementary materials containing code. It focuses on the theoretical framework for value alignment.
Open Datasets | No | The paper uses hypothetical examples (e.g., theft, ambulance usage) to illustrate its theoretical framework. It mentions how empirical VA *could* work by collecting responses or popular views, but it neither uses nor provides access to any publicly available dataset for its own analysis.
Dataset Splits | No | The paper is theoretical and presents no experiments on datasets, so no dataset splits are reported.
Hardware Specification | No | The paper outlines a theoretical framework for value alignment in AI and describes no experiments; consequently, no hardware specifications are mentioned.
Software Dependencies | No | The paper describes a theoretical approach using quantified modal logic to formulate ethical principles. It does not name specific software, libraries, or solvers (with version numbers) that would be needed to replicate experiments.
Experiment Setup | No | The paper is theoretical and describes no experiments, so no experimental setup, hyperparameters, or training configuration is reported.
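Since the paper itself provides no pseudocode, the hybrid criterion stated in the abstract (a deontological principle implies a test proposition for an action plan, and the plan counts as ethical only if that proposition is empirically true) can be sketched as follows. This is a minimal illustration, not the authors' formalism: the `ActionPlan` structure, the `is_ethical` function, and the "universalizable" flag standing in for an empirically estimated judgment are all invented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ActionPlan:
    """A named action plan paired with the test proposition that a
    deontological principle assigns to it (hypothetical structure)."""
    name: str
    test_proposition: Callable[[Dict[str, bool]], bool]

def is_ethical(plan: ActionPlan, observations: Dict[str, bool]) -> bool:
    """Hybrid criterion: the plan passes only if its test proposition
    is empirically true of the observations supplied by empirical VA."""
    return plan.test_proposition(observations)

# Invented example: a generalization-style test for a theft plan; the
# "universalizable" flag stands in for what empirical VA would estimate.
theft = ActionPlan(
    name="take unattended goods",
    test_proposition=lambda obs: obs.get("universalizable", False),
)

print(is_ethical(theft, {"universalizable": False}))  # prints: False
```

The point of the sketch is only the division of labor: the principle fixes *which* proposition must hold, while empirical observation decides *whether* it holds.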