Minds in Machines? On the Psychology and Deception Capabilities of Large Language Models

Abstract: Large language models (LLMs) are at the forefront of intertwining AI systems with human communication and everyday life. It is therefore of great importance to thoroughly assess and scrutinize their capabilities. Because today's LLMs exhibit increasingly complex and novel behaviors, one way to do this is to treat them as participants in psychological experiments originally designed to test humans. The talk will outline how psychology can inform behavioral tests for LLMs, and how such tests can uncover emergent abilities in them. As a deep dive, the talk will focus on the phenomenon of deception in LLMs, an ability that carries significant implications for alignment and safety.

Short bio: Dr. Thilo Hagendorff is an expert in AI safety, AI ethics, machine behavior in generative models, and the intersection of machine learning and psychology. He works as a Research Group Leader at the University of Stuttgart. Previously, he worked for the Cluster of Excellence “Machine Learning” at the University of Tuebingen and was a visiting scholar at Stanford University and UC San Diego. He also teaches at the Hasso Plattner Institute in Potsdam, among other institutions.

Presenter: Dr. Thilo Hagendorff (University of Stuttgart, Germany)

Date: 2025-05-09 10:30 (CEST)

Location: ELLIS Alicante offices, Muelle Pte., 5 – Edificio A, 03001 Alicante, Spain

Online: https://teams.microsoft.com/l/meetup-join/19%3ameeting_OTJkZWJjMjQtMGRlNS00NTU0LTkyNDYtMDE5YWZjZWYzNzI3%40thread.v2/0?context=%7b%22Tid%22%3a%22bb758050-7db8-403e-bffa-5643855efdb1%22%2c%22Oid%22%3a%22f63862bc-031d-4058-8533-000ceb056c4c%22%7d