Human-Centric Machine Learning (HCML) Reading Group

A reading group on Human-Centric Machine Learning (HCML), organized by the PhD students of the ELLIS Unit Alicante.

The HCML reading group aims to bring together researchers and students who want either a broad overview of the field or a deeper understanding of it. To understand clearly how algorithmic and human decisions influence each other, we will discuss papers on different topics within HCML and also delve into new problems, different approaches, and sources of bias.

Communication (in English):

  • Google Group, where all group communication takes place. Important links, suggested papers, and similar announcements are posted in this channel.
  • ELLIS PhD & Postdoc Slack channel: #rg-human-centric-ml
  • Signal channel: we have moved the conversation from Signal to the Google Group

Please read and respect our code of conduct!


  • Annotations from speech and heart rate: impact on multimodal emotion recognition

    Authors: Kaushal Sharma and Guillaume Chanel (2023)

    Article link:

    Abstract: The focus of multimodal emotion recognition has often been on the analysis of several fusion strategies. However, little attention has been paid to the effect of emotional cues, such as physiological and audio cues, on external annotations used to generate the Ground Truths (GTs). In our study, we analyze this effect by collecting six continuous arousal annotations for three groups of emotional cues: speech only, heartbeat sound only and their combination. Our results indicate significant differences between the three groups of annotations, thus giving three distinct cue-specific GTs. The relevance of these GTs is estimated by training multimodal machine learning models to regress speech, heart rate and their multimodal fusion on arousal. Our analysis shows that a cue(s)-specific GT is better predicted by the corresponding modality(s). In addition, the fusion of several emotional cues for the definition of GTs makes it possible to reach a similar performance for both unimodal models and multimodal fusion. In conclusion, our results indicate that heart rate is an efficient cue for the generation of a physiological GT, and that combining several emotional cues for GT generation is as important as performing input multimodal fusion for emotion prediction.

    Presenter: Kaushal Sharma

    Date: 2024-03-26 15:00 (CET)


  • Fairness and Inclusivity in Urban Transportation Design Using Reinforcement Learning

    Article link:

    Abstract: Public transportation networks are the foundation of urban living. Designing transportation networks, however, is a complex task that involves physical, social, political, and legal constraints. This complexity is further compounded when considering the trade-off between efficiency and fairness. While efficient lines can boost ridership and reduce car dependency, thereby contributing to environmental sustainability, they may also prioritize densely populated central areas while neglecting other potentially underserved communities, exacerbating existing inequalities. It is therefore crucial to develop tools that address these challenges and prioritize fairness. Recent advancements in Artificial Intelligence offer promising solutions. In this presentation, I will showcase our work in using Reinforcement Learning to design public transportation networks. I will highlight the potential unfairness it can cause and propose strategies to mitigate them. Finally, I will introduce a conceptual framework aimed at fostering an inclusive design process that uses input from local communities and adapts its behaviour accordingly.

    Presenter: Dimitris Michailidis

    Date: 2024-02-27 15:00 (CET)


  • Unprocessing Seven Years of Algorithmic Fairness

    Authors: André F. Cruz and Moritz Hardt (2023)

    Article link:

    Abstract: Seven years ago, researchers proposed a postprocessing method to equalize the error rates of a model across different demographic groups. The work launched hundreds of papers purporting to improve over the postprocessing baseline. We empirically evaluate these claims through thousands of model evaluations on several tabular datasets. We find that the fairness-accuracy Pareto frontier achieved by postprocessing contains all other methods we were feasibly able to evaluate. In doing so, we address two common methodological errors that have confounded previous observations. One relates to the comparison of methods with different unconstrained base models. The other concerns methods achieving different levels of constraint relaxation. At the heart of our study is a simple idea we call unprocessing that roughly corresponds to the inverse of postprocessing. Unprocessing allows for a direct comparison of methods using different underlying models and levels of relaxation.

    Presenter: André F. Cruz

    Date: 2024-01-30 15:00 (CET)


  • Benchmarking the Generation of Fact Checking Explanations

    Authors: Daniel Russo, Serra Sinem Tekiroğlu, Marco Guerini (2023)

    Article link:

    Abstract: Fighting misinformation is a challenging, yet crucial, task. Despite the growing number of experts being involved in manual fact-checking, this activity is time-consuming and cannot keep up with the ever-increasing amount of fake news produced daily. Hence, automating this process is necessary to help curb misinformation. Thus far, researchers have mainly focused on claim veracity classification. In this paper, instead, we address the generation of justifications (textual explanation of why a claim is classified as either true or false) and benchmark it with novel datasets and advanced baselines. In particular, we focus on summarization approaches over unstructured knowledge (i.e., news articles) and we experiment with several extractive and abstractive strategies. We employed two datasets with different styles and structures, in order to assess the generalizability of our findings. Results show that in justification production summarization benefits from the claim information, and, in particular, that a claim-driven extractive step improves abstractive summarization performances. Finally, we show that although cross-dataset experiments suffer from performance degradation, a unique model trained on a combination of the two datasets is able to retain style information in an efficient manner.

    Presenter: Daniel Russo

    Date: 2023-11-28 15:00 (CET)


  • Bias and Fairness in Large Language Models: A Survey

    Authors: Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen K. Ahmed (2023)

    Article link:

    Abstract: Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, namely metrics and datasets, and one for mitigation. Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy of datasets for bias evaluation categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly-available datasets for improved access. Our third taxonomy of techniques for bias mitigation classifies methods by their intervention during pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide of the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs.

    Presenter: Erik Derner

    Date: 2023-10-24 15:00 (CEST)


  • AI Art Curation: Re-imagining the city of Helsinki in occasion of its Biennial

    Authors: Ludovica Schaerf, Pepe Ballesteros, Valentine Bernasconi, Iacopo Neri, Dario Negueruela del Castillo (2023)

    Article link:

    Abstract: Art curatorial practice is characterized by the presentation of an art collection in a knowledgeable way. Machine processes are characterized by their capacity to manage and analyze large amounts of data. This paper envisages AI curation and audience interaction to explore the implications of contemporary machine learning models for the curatorial world. This project was developed for the occasion of the 2023 Helsinki Art Biennial, entitled New Directions May Emerge. We use the Helsinki Art Museum (HAM) collection to re-imagine the city of Helsinki through the lens of machine perception. We use visual-textual models to place indoor artworks in public spaces, assigning fictional coordinates based on similarity scores. We transform the space that each artwork inhabits in the city by generating synthetic 360 art panoramas. We guide the generation estimating depth values from 360 panoramas at each artwork location, and machine-generated prompts of the artworks. The result of this project is an AI curation that places the artworks in their imagined physical space, blurring the lines of artwork, context, and machine perception. The work is virtually presented as a web-based installation on this link, where users can navigate an alternative version of the city while exploring and interacting with its cultural heritage at scale.

    Presenter: Ludovica Schaerf

    Date: 2023-09-26 15:00 (CEST)


  • Human-Aligned Calibration for AI-Assisted Decision Making

    Authors: Nina L. Corvelo Benz, Manuel Gomez Rodriguez (2023)

    Article link:

    Abstract: Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. Then, the decision maker is supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has been often argued that the confidence value should correspond to a well calibrated estimate of the probability that the predicted label matches the ground truth label. However, multiple lines of empirical evidence suggest that decision makers have difficulties at developing a good sense on when to trust a prediction using these confidence values. In this paper, our goal is first to understand why and then investigate how to construct more useful confidence values. We first argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is, in general, unlikely to discover the optimal decision policy using the above confidence values -- an optimal decision maker would need to sometimes place more (less) trust on predictions with lower (higher) confidence values. However, we then show that, if the confidence values satisfy a natural alignment property with respect to the decision maker's confidence on her own predictions, there always exists an optimal decision policy under which the level of trust the decision maker would need to place on predictions is monotone on the confidence values, facilitating its discoverability. Further, we show that multicalibration with respect to the decision maker's confidence on her own predictions is a sufficient condition for alignment. Experiments on four different AI-assisted decision making tasks where a classifier provides decision support to real human experts validate our theoretical results and suggest that alignment may lead to better decisions.

    Presenter: Nina L. Corvelo Benz

    Date: 2023-08-22 15:00 (CEST)


  • Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do

    Authors: Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf and Kristian Kersting (2023)

    Article link:

    Abstract: Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, its variants, GPT-2/3, and others. Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended state of the art for many NLP tasks and shown that they capture not only linguistic knowledge but also retain general knowledge implicitly present in the data. Unfortunately, LMs trained on unfiltered text corpora suffer from degenerated and biased behaviour. While this is well established, we show that recent LMs also contain human-like biases of what is right and wrong to do, some form of ethical and moral norms of the society —they bring a “moral direction” to surface. That is, we show that these norms can be captured geometrically by a direction, which can be computed, e.g., by a PCA, in the embedding space, reflecting well the agreement of phrases to social norms implicitly expressed in the training texts and providing a path for attenuating or even preventing toxic degeneration in LMs. Being able to rate the (non-)normativity of arbitrary phrases without explicitly training the LM for this task, we demonstrate the capabilities of the “moral direction” for guiding (even other) LMs towards producing normative text and showcase it on RealToxicityPrompts testbed, preventing the neural toxic degeneration in GPT-2.

    Presenter: Mona Schirmer

    Date: 2023-07-25 15:00 (CEST)


  • On the Richness of Calibration

    Authors: Benedikt Höltgen, Robert C Williamson (2023)

    Article link:

    Abstract: Probabilistic predictions can be evaluated through comparisons with observed label frequencies, that is, through the lens of calibration. Recent scholarship on algorithmic fairness has started to look at a growing variety of calibration-based objectives under the name of multi-calibration but has still remained fairly restricted. In this paper, we explore and analyse forms of evaluation through calibration by making explicit the choices involved in designing calibration scores. We organise these into three grouping choices and a choice concerning the agglomeration of group errors. This provides a framework for comparing previously proposed calibration scores and helps to formulate novel ones with desirable mathematical properties. In particular, we explore the possibility of grouping datapoints based on their input features rather than on predictions and formally demonstrate advantages of such approaches. We also characterise the space of suitable agglomeration functions for group errors, generalising previously proposed calibration scores. Complementary to such population-level scores, we explore calibration scores at the individual level and analyse their relationship to choices of grouping. We draw on these insights to introduce and axiomatise fairness deviation measures for population-level scores. We demonstrate that with appropriate choices of grouping, these novel global fairness scores can provide notions of (sub-)group or individual fairness.

    Presenter: Benedikt Höltgen

    Date: 2023-06-27 15:00 (CEST)


  • Latent Space Smoothing for Individually Fair Representations

    Authors: Momchil Peychev, Anian Ruoss, Mislav Balunović, Maximilian Baader, Martin Vechev (2022)

    Article link:

    Abstract: Fair representation learning transforms user data into a representation that ensures fairness and utility regardless of the downstream application. However, learning individually fair representations, i.e., guaranteeing that similar individuals are treated similarly, remains challenging in high-dimensional settings such as computer vision. In this work, we introduce LASSI, the first representation learning method for certifying individual fairness of high-dimensional data. Our key insight is to leverage recent advances in generative modeling to capture the set of similar individuals in the generative latent space. This enables us to learn individually fair representations that map similar individuals close together by using adversarial training to minimize the distance between their representations. Finally, we employ randomized smoothing to provably map similar individuals close together, in turn ensuring that local robustness verification of the downstream application results in end-to-end fairness certification. Our experimental evaluation on challenging real-world image data demonstrates that our method increases certified individual fairness by up to 90% without significantly affecting task utility.

    Presenter: Jonas Klesen

    Date: 2023-05-30 15:00 (CEST)


  • Uncalibrated Models Can Improve Human-AI Collaboration

    Authors: Kailas Vodrahalli, Tobias Gerstenberg, James Zou (2022)

    Article link:

    Abstract: In many practical applications of AI, an AI model is used as a decision aid for human users. The AI provides advice that a human (sometimes) incorporates into their decision-making process. The AI advice is often presented with some measure of 'confidence' that the human can use to calibrate how much they depend on or trust the advice. In this paper, we present an initial exploration that suggests showing AI models as more confident than they actually are, even when the original AI is well-calibrated, can improve human-AI performance (measured as the accuracy and confidence of the human's final prediction after seeing the AI advice). We first train a model to predict human incorporation of AI advice using data from thousands of human-AI interactions. This enables us to explicitly estimate how to transform the AI's prediction confidence, making the AI uncalibrated, in order to improve the final human prediction. We empirically validate our results across four different tasks--dealing with images, text and tabular data--involving hundreds of human participants. We further support our findings with simulation analysis. Our findings suggest the importance of jointly optimizing the human-AI system as opposed to the standard paradigm of optimizing the AI model alone.

    Presenter: Kajetan Schweighofer

    Date: 2023-04-25 15:00 (CEST)


  • Generating stereotypes from implicitly hateful posts with Influence Functions

    Abstract: Substantial progress has been made on detecting explicit forms of hate, while implicitly hateful posts containing, e.g., microaggressions and condescension, still pose a major challenge. In light of high error rates, explanations accompanying model decisions are especially important. Since implicit abuse cannot be put down to the use of an individual slur, but arises out of the wider sentence context, highlighting individual tokens as an explanation is of limited use. In this paper, we generate full-text verbalisations of stereotypes that underlie implicitly hateful posts. We test the hypothesis that providing more context to the model - such as a small set of related samples - will lower the bar for generating the implied stereotype. For a given post, instance attribution methods, such as Influence Functions, are used to source similar examples from the training data. Then BART is trained to generate the underlying stereotype from an original input and its most similar neighbours.

    Presenter: Alina Leidinger

    Date: 2023-03-21 15:00 (CET)


  • Translating Principles into Practices of Digital Ethics: Five Risks of Being Unethical

    Authors: Luciano Floridi (2019)

    Article link:

    Abstract: It has taken a very long time, but today, the debate on the ethical impact and implications of digital technologies has reached the front pages of newspapers. This is understandable: digital technologies, from web-based services to Artificial Intelligence (AI) solutions, increasingly affect the daily lives of billions of people, so there are many hopes but also concerns about their design, development, and deployment.

    Presenter: Adrián Arnaiz Rodriguez

    Date: 2023-02-28 15:00 (CET)


  • Extracting Training Data from Diffusion Models

    Authors: Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace (2023)

    Article link:

    Abstract: Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models, ranging from photographs of individual people to trademarked company logos. We also train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy. Overall, our results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.

    Presenter: Piera Riccio

    Date: 2023-02-14 15:00 (CET)


  • Multi Scale Ethics—Why We Need to Consider the Ethics of AI in Healthcare at Different Scales

    Authors: Melanie Smallman (2022)

    Article link:

    Abstract: Many researchers have documented how AI and data driven technologies have the potential to have profound effects on our lives—in ways that make these technologies stand out from those that went before. Around the world, we are seeing a significant growth in interest and investment in AI in healthcare. This has been coupled with rising concerns about the ethical implications of these technologies and an array of ethical guidelines for the use of AI and data in healthcare has arisen. Nevertheless, the question of if and how AI and data technologies can be ethical remains open to debate. This paper aims to contribute to this debate by considering the wide range of implications that have been attributed to these technologies and asking whether current ethical guidelines take these factors into account. In particular, the paper argues that while current ethics guidelines for AI in healthcare effectively account for the four key issues identified in the ethics literature (transparency; fairness; responsibility and privacy), they have largely neglected wider issues relating to the way in which these technologies shape institutional and social arrangements. This, I argue, has given current ethics guidelines a strong focus on evaluating the impact of these technologies on the individual, while not accounting for the powerful social shaping effects of these technologies. To address this, the paper proposes a Multiscale Ethics Framework, which aims to help technology developers and ethical evaluations to consider the wider implications of these technologies.

    Presenter: Kaylin Bolt

    Date: 2023-01-31 15:00 (CET)


  • The Grey Hoodie Project: Big Tobacco, Big Tech, and the Threat on Academic Integrity

    Authors: Mohamed Abdalla, Moustafa Abdalla (2021)

    Article link:

    Abstract: As governmental bodies rely on academics' expert advice to shape policy regarding Artificial Intelligence, it is important that these academics not have conflicts of interests that may cloud or bias their judgement. Our work explores how Big Tech can actively distort the academic landscape to suit its needs. By comparing the well-studied actions of another industry (Big Tobacco) to the current actions of Big Tech we see similar strategies employed by both industries. These strategies enable either industry to sway and influence academic and public discourse. We examine the funding of academic research as a tool used by Big Tech to put forward a socially responsible public image, influence events hosted by and decisions made by funded universities, influence the research questions and plans of individual scientists, and discover receptive academics who can be leveraged. We demonstrate how Big Tech can affect academia from the institutional level down to individual researchers. Thus, we believe that it is vital, particularly for universities and other institutions of higher learning, to discuss the appropriateness and the tradeoffs of accepting funding from Big Tech, and what limitations or conditions should be put in place.

    Presenter: Gergely D. Németh

    Date: 2023-01-17 15:00 (CET)


  • What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods

    Authors: Julien Colin, Thomas Fel, Remi Cadene, Thomas Serre (2022)

    Article link:

    Abstract: A multitude of explainability methods has been described to try to help users better understand how modern AI systems make decisions. However, most performance metrics developed to evaluate these methods have remained largely theoretical -- without much consideration for the human end-user. In particular, it is not yet clear (1) how useful current explainability methods are in real-world scenarios; and (2) whether current performance metrics accurately reflect the usefulness of explanation methods for the end user. To fill this gap, we conducted psychophysics experiments at scale to evaluate the usefulness of representative attribution methods in three real-world scenarios. Our results demonstrate that the degree to which individual attribution methods help human participants better understand an AI system varies widely across these scenarios. This suggests the need to move beyond quantitative improvements of current attribution methods, towards the development of complementary approaches that provide qualitatively different sources of information to human end-users.

    Presenter: Julien Colin

    Date: 2022-12-13 15:00 (CET)


  • BIASeD: Bringing Irrationality into Automated System Design

    Authors: Aditya Gulati, Miguel Angel Lozano, Bruno Lepri, Nuria Oliver (2022)

    Article link:

    Abstract: Human perception, memory and decision-making are impacted by tens of cognitive biases and heuristics that influence our actions and decisions. Despite the pervasiveness of such biases, they are generally not leveraged by today's Artificial Intelligence (AI) systems that model human behavior and interact with humans. In this theoretical paper, we claim that the future of human-machine collaboration will entail the development of AI systems that model, understand and possibly replicate human cognitive biases. We propose the need for a research agenda on the interplay between human cognitive biases and Artificial Intelligence. We categorize existing cognitive biases from the perspective of AI systems, identify three broad areas of interest and outline research directions for the design of AI systems that have a better understanding of our own biases.

    Presenter: Aditya Gulati

    Date: 2022-11-08 15:00 (CET)


  • Understanding and Creating Art with AI: Review and Outlook

    Authors: Eva Cetinic, James She (2022)

    Article link:

    Abstract: Technologies related to artificial intelligence (AI) have a strong impact on the changes of research and creative practices in visual arts. The growing number of research initiatives and creative applications that emerge in the intersection of AI and art motivates us to examine and discuss the creative and explorative potentials of AI technologies in the context of art. This article provides an integrated review of two facets of AI and art: (1) AI is used for art analysis and employed on digitized artwork collections, or (2) AI is used for creative purposes and generating novel artworks. In the context of AI-related research for art understanding, we present a comprehensive overview of artwork datasets and recent works that address a variety of tasks such as classification, object detection, similarity retrieval, multimodal representations, and computational aesthetics, among others. In relation to the role of AI in creating art, we address various practical and theoretical aspects of AI Art and consolidate related works that deal with those topics in detail. Finally, we provide a concise outlook on the future progression and potential impact of AI technologies on our understanding and creation of art.

    Presenter: Nuria Oliver

    Date: 2022-07-14 15:00 (CEST)


  • Psychoanalyzing artificial intelligence: the case of Replika

    Authors: Luca M. Possati (2022)

    Article link:

    Abstract: The central thesis of this paper is that human unconscious processes influence the behavior and design of artificial intelligence (AI). This thesis is discussed through the case study of a chatbot called Replika, which intends to provide psychological assistance and friendship but has been accused of inciting murder and suicide. Replika originated from a trauma and a work of mourning lived by its creator. The traces of these unconscious dynamics can be detected in the design of the app and the narratives about it. Therefore, a process of de-psychologization and de-humanization of the unconscious takes place through AI. This psychosocial approach helps criticize and overcome the so-called “standard model of intelligence” shared by most AI researchers. It facilitates a new interpretation of some classic problems in AI, such as control and responsibility.

    Presenter: Erik Derner

    Date: 2022-06-16 15:00 (CEST)


  • Performative Power

    Authors: Moritz Hardt, Meena Jagadeesan and Celestine Mendler-Dünner (2022)

    Article link:

    Abstract: We introduce the notion of performative power, which measures the ability of a firm operating an algorithmic system, such as a digital content recommendation platform, to steer a population. We relate performative power to the economic theory of market power. Traditional economic concepts are well known to struggle with identifying anti-competitive patterns in digital platforms—a core challenge is the difficulty of defining the market, its participants, products, and prices. Performative power sidesteps the problem of market definition by focusing on a directly observable statistical measure instead. High performative power enables a platform to profit from steering participant behavior, whereas low performative power ensures that learning from historical data is close to optimal. Our first general result shows that under low performative power, a firm cannot do better than standard supervised learning on observed data. We draw an analogy with a firm being a price-taker, an economic condition that arises under perfect competition in classical market models. We then contrast this with a market where performative power is concentrated and show that the equilibrium state can differ significantly. We go on to study performative power in a concrete setting of strategic classification where participants can switch between competing firms. We show that monopolies maximize performative power and disutility for the participant, while competition and outside options decrease performative power. We end on a discussion of connections to measures of market power in economics and of the relationship with ongoing antitrust debates.

    Presenter: Miriam Rateike

    Date: 2022-06-02 15:00 (CEST)


  • An Introduction to AI Safety

    Abstract: A discussion about why powerful AI systems can be very dangerous and why it is important for everyone in the Machine Learning community to at least understand the basic problems.

    Presenter: Marius Hobbhahn

    Date: 2022-05-19 15:00 (CEST)


  • Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction

    Authors: Nina Grgic-Hlaca, Elissa M. Redmiles, Krishna P. Gummadi & Adrian Weller (2018)

    Article link:

    Abstract: As algorithms are increasingly used to make important decisions that affect human lives, ranging from social benefit assignment to predicting risk of criminal recidivism, concerns have been raised about the fairness of algorithmic decision making. Most prior works on algorithmic fairness normatively prescribe how fair decisions ought to be made. In contrast, here, we descriptively survey users for how they perceive and reason about fairness in algorithmic decision making. A key contribution of this work is the framework we propose to understand why people perceive certain features as fair or unfair to be used in algorithms. Our framework identifies eight properties of features, such as relevance, volitionality and reliability, as latent considerations that inform people's moral judgments about the fairness of feature use in decision-making algorithms. We validate our framework through a series of scenario-based surveys with 576 people. We find that, based on a person's assessment of the eight latent properties of a feature in our exemplar scenario, we can accurately (> 85%) predict if the person will judge the use of the feature as fair. Our findings have important implications. At a high-level, we show that people's unfairness concerns are multi-dimensional and argue that future studies need to address unfairness concerns beyond discrimination. At a low-level, we find considerable disagreements in people's fairness judgments. We identify root causes of the disagreements, and note possible pathways to resolve them.

    Presenter: Stratis Tsirtsis

    Date: 2022-05-05 15:00 (CEST)


  • Lessons for artificial intelligence from the study of natural stupidity

    Authors: Alexander S. Rich & Todd M. Gureckis (2019)

    Article link:

    Abstract: Artificial intelligence and machine learning systems are increasingly replacing human decision makers in commercial, healthcare, educational and government contexts. But rather than eliminate human errors and biases, these algorithms have in some cases been found to reproduce or amplify them. We argue that to better understand how and why these biases develop, and when they can be prevented, machine learning researchers should look to the decades-long literature on biases in human learning and decision-making. We examine three broad causes of bias: small and incomplete datasets, learning from the results of your decisions, and biased inference and evaluation processes. For each, findings from the psychology literature are introduced along with connections to the machine learning literature. We argue that rather than viewing machine systems as being universal improvements over human decision makers, policymakers and the public should acknowledge that these systems share many of the same limitations that frequently inhibit human judgement, for many of the same reasons.

    Presenter: Aditya Gulati

    Date: 2022-03-31 15:00 (CEST)


  • An Introduction to Reliable and Robust AI

    Abstract: An introduction to the design of reliable and robust AI systems in computer vision based on a review paper by Francesco Galati.

    Presenter: Francesco Galati

    Date: 2022-03-17 15:00 (CET)


  • Improving human decision-making with machine learning

    Authors: Hamsa Bastani, Osbert Bastani, Wichinpong Park Sinchaisri

    Article link:

    Abstract: A key aspect of human intelligence is their ability to convey their knowledge to others in succinct forms. However, despite their predictive power, current machine learning models are largely blackboxes, making it difficult for humans to extract useful insights. Focusing on sequential decision-making, we design a novel machine learning algorithm that conveys its insights to humans in the form of interpretable 'tips'. Our algorithm selects the tip that best bridges the gap in performance between human users and the optimal policy. We evaluate our approach through a series of randomized controlled user studies where participants manage a virtual kitchen. Our experiments show that the tips generated by our algorithm can significantly improve human performance relative to intuitive baselines. In addition, we discuss a number of empirical insights that can help inform the design of algorithms intended for human-AI interfaces. For instance, we find evidence that participants do not simply blindly follow our tips; instead, they combine them with their own experience to discover additional strategies for improving performance.

    Presenter: Putra Manggala

    Date: 2022-03-03 15:00 (CET)


  • Guest talk by Qualcomm AI Research: Natural Graph Networks

    Authors: Pim de Haan, Taco Cohen, Max Welling (2020)

    Article link:

    Abstract: On 17th February at 15.00 CET, the ELLIS Human-Centric Machine Learning reading group will host its first guest session, welcoming distinguished researchers from Qualcomm AI Research and ELLIS Scholars. Pim de Haan, Research Associate at Qualcomm AI Research, will present his paper Natural Graph Networks, which explores how we can use the local symmetries of graphs to build more expressive graph networks. We will further explore the relationship between graph structured data and human-centric problems and applications in a round table made up of Manuel Gómez Rodríguez (MPI-SWS), Carlos Castillo (UPF) and Efstratios Gavves (director of the Qualcomm-UvA Deep Vision Lab). This relationship mainly arises from the fact that a lot of human interaction data is expressed as network structured data. Additionally, many advantages of GNNs, such as capturing complex structures between data or information flow, could lead to GNNs being an outstanding tool for addressing HCML problems. This session will be held online and is open to everyone interested in it.

    Presenter: Pim de Haan (Qualcomm AI Research), Round Table: Manuel Gómez Rodríguez (MPI-SWS), Carlos Castillo (UPF) and Efstratios Gavves (Qualcomm-UvA Deep Vision Lab)

    Date: 2022-02-17 15:00 (CET)


  • CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities

    Authors: Mina Lee, Percy Liang, Qian Yang (2022)

    Article link:

    Abstract: Large language models (LMs) offer unprecedented language generation capabilities and exciting opportunities for interaction design. However, their highly context-dependent capabilities are difficult to grasp and are often subjectively interpreted. In this paper, we argue that by curating and analyzing large interaction datasets, the HCI community can foster more incisive examinations of LMs' generative capabilities. Exemplifying this approach, we present CoAuthor, a dataset designed for revealing GPT-3's capabilities in assisting creative and argumentative writing. CoAuthor captures rich interactions between 63 writers and four instances of GPT-3 across 1445 writing sessions. We demonstrate that CoAuthor can address questions about GPT-3's language, ideation, and collaboration capabilities, and reveal its contribution as a writing 'collaborator' under various definitions of good collaboration. Finally, we discuss how this work may facilitate a more principled discussion around LMs' promises and pitfalls in relation to interaction design. The dataset and an interface for replaying the writing sessions are publicly available at

    Presenter: Gergely D. Németh

    Date: 2022-02-03 15:00 (CET)


  • Towards a Theory of Justice for Artificial Intelligence

    Authors: Iason Gabriel (2021)

    Article link:

    Abstract: This paper explores the relationship between artificial intelligence and principles of distributive justice. Drawing upon the political philosophy of John Rawls, it holds that the basic structure of society should be understood as a composite of socio-technical systems, and that the operation of these systems is increasingly shaped and influenced by AI. As a consequence, egalitarian norms of justice apply to the technology when it is deployed in these contexts. These norms entail that the relevant AI systems must meet a certain standard of public justification, support citizens' rights, and promote substantively fair outcomes -- something that requires specific attention be paid to the impact they have on the worst-off members of society.

    Presenter: Nazaal Ibrahim

    Date: 2022-01-20 15:00 (CET)


  • Rethinking of Marxist perspectives on big data, artificial intelligence (AI) and capitalist economic development

    Authors: Nigel Walton, Bhabani Shankar Nayak (2021)

    Article link:

    Abstract: AI and big data are not ideologically neutral scientific knowledge that drives economic development and social change. AI is a tool of capitalism which transforms our societies within an environment of technological singularity that helps in the expansion of the capitalist model of economic development. Such a development process ensures the precarity of labour. This article highlights the limits of traditional Marxist conceptualisation of labour, value, property and production relations. It argues for the rethinking of Marxist perspectives on AI led economic development by focusing on conceptual new interpretation of bourgeois and proletariat in the information driven data-based society. This is a conceptual paper which critically outlines different debates and challenges around AI driven big data and its implications. It particularly focuses on the theoretical challenges faced by labour theory of value and its social and economic implications from a critical perspective. It also offers alternatives by analysing future trends and developments for the sustainable use of AI. It argues for developing policies on the use of AI and big data to protect labour, advance human development and enhance social welfare by reducing risks.

    Presenter: Bhargav Srinivasa Desikan

    Date: 2021-12-16 15:00 (CET)


  • Machine Learning for the Developing World

    Authors: De-Arteaga, M., Herlands, W., Neill, D. B., & Dubrawski, A. (2018)

    Article link:

    Abstract: Researchers from across the social and computer sciences are increasingly using machine learning to study and address global development challenges. This article examines the burgeoning field of machine learning for the developing world (ML4D). First, we present a review of prominent literature. Next, we suggest best practices drawn from the literature for ensuring that ML4D projects are relevant to the advancement of development objectives. Finally, we discuss how developing world challenges can motivate the design of novel machine learning methodologies. This article provides insights into systematic differences between ML4D and more traditional machine learning applications. It also discusses how technical complications of ML4D can be treated as novel research questions, how ML4D can motivate new research directions, and where machine learning can be most useful.

    Presenter: Felix Grimberg

    Date: 2021-12-02 15:00 (CET)