Chatbots with Human Likeness

What it is

This example is about a design choice rather than a single product: building a system that presents itself with a real person's likeness, such as their face, their voice, or their actual words. The likeness is what makes these tools compelling, and also what makes them potentially problematic.

Context

A clear, careful instance of this from another domain is the USC Shoah Foundation's Dimensions in Testimony, which lets visitors ask questions and receive answers from pre-recorded video interviews with Holocaust survivors. This system is not generative: it uses language processing to match a visitor's question to the survivor's own recorded answers, so the words you hear were genuinely spoken by that person. The project has been well received, and we are not singling it out for criticism. It has nonetheless prompted serious scholarly discussion of exactly these questions, including the concern that an interactive, user-driven format can shift the experience away from the survivor's own agency and toward whatever a visitor chooses to ask (Holocaust Studies, 2022).

Still, it gave some of us pause, and it is worth asking why. Many of us happily accept a system "pretending" to be a person in a video game. Why does it feel different here? Two things seem to matter. The first is the history and trauma involved. The stakes of misrepresenting a Holocaust survivor are not the stakes of a game character. The second is that the system stands in for a real, specific person. Even when the answers are pre-recorded, the system surfaces them in response to whatever a visitor asks, which creates room to take a person's words out of context. With a generative system the risk is worse still: it can put words in a real person's mouth that they never said.

Another participant described a project in which a native speaker actively wanted her voice turned into a bot. The research team was careful to walk her through the risks, and she decided that the potential benefit outweighed them.

TODO: This example, along with negative examples like RoboMeryl, was provided by Rolando Coto-Solano, but I don't have the details to write it up or cite it more clearly. Need to follow up to flesh this part out.

Upsides

  • More personal. A face, a voice, or a name makes a digital tool feel personal and engaging in a way that plain text does not.
  • More accessible. It can make history (or a language) approachable, letting people interact with testimony or speech rather than just read about it.

Downsides

  • Cultural considerations. Representing a real person can violate cultural norms, especially if the person has died or belongs to a culture with taboos around the images or voices of the dead.
  • Words taken out of context. Because the system answers whatever a user asks, it can surface a real person's words (or, in generative systems, words they never said) in contexts they never intended and could not consent to.
Supported by National Science Foundation Award 2542375.