A recent study finds that OpenAI’s GPT-4o can provide moral explanations and advice that people rate more highly than those from a recognized ethics expert.

Researchers from the University of North Carolina at Chapel Hill and the Allen Institute for Artificial Intelligence investigated whether large language models (LLMs) could be considered “moral experts.” They conducted two studies comparing the moral reasoning of GPT models with that of humans.

Study One: GPT-3.5-turbo vs. Human Participants

In the first study, 501 U.S. adults evaluated moral explanations written by GPT-3.5-turbo and by other human participants. Raters found GPT’s explanations more morally correct, trustworthy, and thoughtful than the human-written ones, and they tended to agree more with the AI’s assessments than with those of other people. Although the differences were small, the key finding was that the AI could match, and in some respects surpass, human-level moral reasoning.

Study Two: GPT-4o vs. Ethics Expert

The second study compared advice from GPT-4o, OpenAI’s latest GPT model at the time, with that of renowned ethicist Kwame Anthony Appiah, who writes The New York Times’ “The Ethicist” column. Nine hundred participants rated the quality of advice on 50 ethical dilemmas. GPT-4o outperformed the human expert on nearly every measure: participants rated the AI-generated advice as more morally correct, trustworthy, thoughtful, and accurate. The only dimension showing no significant difference was perceived nuance.

Analysis and Implications

Text analysis showed that GPT-4o used more moral and positive language in its advice than the human expert, which could partly explain the higher ratings, though it was not the only factor. The researchers argue that these results show AI can pass a “Comparative Moral Turing Test” (cMTT). Interestingly, raters could often tell which advice was AI-generated, suggesting the model still fails the classic Turing test of passing as human in conversation, although other studies indicate that GPT-4 can pass a conversational Turing test.

The study’s authors note limitations, such as the exclusively U.S. participant pool, and call for further research into cultural differences in how people perceive AI-generated moral reasoning. They also note that participants were not told that some of the advice came from an AI, which might have influenced their ratings.

Overall, the study suggests that modern AI systems can provide moral reasoning and advice at a level comparable to, or even better than, that of human experts. This finding has significant implications for the integration of AI in fields requiring complex ethical decisions, such as therapy, legal advice, and personal care.
