The Workshop

AI & Indigenous Language Revitalization Workshop

June 8–10, 2026; Loyola Marymount University, Los Angeles

What it is

Sunaŵi brought together people from very different backgrounds (Indigenous community members, computer scientists, AI experts, linguists, archivists, language teachers, nonprofit leaders, education experts, students, industry representatives, and junior and senior faculty) to surface shared concerns and move toward defining what responsible AI looks like in the context of Indigenous language revitalization. It went a long way toward breaking down some of the barriers that prevent us all from working more closely together.

This wiki is the workshop's living output. See the people who took part.

How it came to be

The idea for this workshop goes all the way back to why I chose computer science as a major in the first place: I wanted to create a "Rosetta Stone" (the language-learning software) for my tribe's critically endangered language, Owens Valley Paiute. During my studies, my research went in a very different direction (distributed computing), but I never lost sight of that original motivation. I helped create online tools (like an online version of Glenn Nelson's Owens Valley Paiute dictionary) to make the language more accessible to my community, but it wasn't until ChatGPT exploded in 2022 that I saw, for the first time, an opportunity to do research in this area as well.

I started working with linguists, archivists, and other scholars in language documentation, preservation, and revitalization. Until then, my only real exposure to the field was through my own tribe's language program, and most of that was interacting with historical materials. As many of the workshop participants can attest, the traditional language documentation field was largely extractive. Linguists would come into communities, gather data from speakers, produce potentially valuable resources (dictionaries, grammars, etc.), and then publish them behind academic paywalls or in ways that were otherwise inaccessible to the communities that produced the data.

When I started studying Owens Valley Paiute seriously, I came across a grammar that I didn't know existed and found it locked behind UC San Diego's academic paywall. As a Ph.D. student at USC, I was able to get access to it, but it remains largely inaccessible to the rest of my community. While reading the first few pages, I was stunned to learn that Ida Stewart (my great-great-grandmother) was the source of most of the data in it. I had no idea that this resource existed until I was in graduate school, and none of the family members I talked to knew about it either. This was my experience coming into this work.

I learned, however, that the field has come a long way toward making documentation and preservation a much more community-centered practice. In fact, in every conversation I've observed or taken part in with documentation scholars and experts, "community-centered" is at the top of everyone's mind.

I have not seen the same focus in AI research, which I think is still largely in an extractive phase of its lifespan (and hopefully it is just a phase). So the idea behind this workshop was to bring together people from very different backgrounds (Indigenous communities, language programs, linguistics, archival work, computer science, and AI) to surface shared concerns and move toward defining what responsible AI looks like in this space.

Jared Coleman, organizer

Supported by National Science Foundation Award 2542375.