A few years back, the Allen Institute for Artificial Intelligence released Delphi, an AI system built to distinguish right from wrong. Prompted with “Cheating on an exam,” Delphi judges the act unethical; given “Cheating on an exam to save someone’s life,” it deems the act acceptable. Delphi recognizes that running a lawnmower while your neighbors are asleep is inconsiderate, but that doing so while they are away is fine. For all this apparent sophistication, Delphi has clear limitations. The cognitive scientist Tomer Ullman showed that a couple of misleading adverbs are enough to fool it: asked about “Gently and sweetly pressing a pillow over the face of a sleeping baby,” Delphi approves.
As a researcher in moral psychology, I find Delphi’s failures telling. Human moral judgment is complex, emerging from the interplay of reason and emotion, and that depth is hard to reproduce in large language models, which operate on statistical patterns in text rather than genuine comprehension. This gap between human morality and machine behavior has long been a source of concern: the challenge of ensuring that AI systems pursue goals compatible with human values is known as the “value alignment problem.”
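Delphi’s inner workings aside, the point about probabilities is easy to make concrete with any off-the-shelf language model. What follows is a minimal sketch, using GPT-2 via the Hugging Face transformers library purely as an illustrative stand-in (Delphi itself is a different, purpose-trained system): given a moral prompt, all the model actually produces is a probability distribution over possible next tokens.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in for any autoregressive language model here;
# it is not Delphi, just an illustration of the same mechanism.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Cheating on an exam is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# The model's entire "judgment" is a distribution over next tokens:
# pattern completion over text, not moral reasoning.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r}: {p.item():.3f}")
```

Whatever verdict gets sampled from that distribution reflects the statistics of the training text, which is why a few well-placed adverbs can tip it.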
As artificial intelligence becomes woven into daily life, the nearer-term risks are coming into focus. Incidents like Google Photos notoriously mislabeling photos of Black people, or chatbots dispensing harmful advice, underscore why ethics must figure in AI development. There are also worries about AI systems making decisions autonomously and producing unintended, harmful outcomes. Ensuring that these systems align with human values and societal well-being is essential to preventing misuse and harm.
Talk of aligning AI with human values raises an obvious question: whose values? Some moral principles appear to be widely shared, but moral beliefs and practices vary enormously across cultures and societies. And because AI systems are trained on data drawn disproportionately from particular demographics and cultures, they can absorb biases that put them out of step with broader societal values. Proposed remedies include instilling general moral principles in AI systems and bringing more diverse perspectives into their development.
In the quest for ethical AI, the goal should be not merely to mimic human values but to aim at ethically sound ones, values that take the well-being of all sentient beings seriously. AI systems might even guide us toward new moral insights, a provocative twist on their role in shaping our ethical frameworks. Ultimately, the effort to align AI with human values is part of a broader societal conversation about morality, ethics, and the future of artificial intelligence.