Adaptable, community-controlled language technologies for language maintenance


Lori Levin, Language Technologies Institute, Carnegie Mellon University

Endangered languages may require more flexible language technologies than stable languages do. They may not be standardized, and they may vary in everything from spelling to grammar. Their speakers may have to make up for lost words or coin new words for new things. Although an older form of a language should be documented, it should not be prescribed to a language community in the process of revitalization, and a static technology that comes out of a laboratory is probably not appropriate. In this talk, I will argue that the coverage and content of language technologies should be in the hands of the speech community, and that these technologies need to be adaptable and to learn from users. This calls for new approaches, possibly based on active learning, that allow language technologies to be as flexible and changeable as languages generally are. The talk also addresses ways in which the development of a machine translation system can be initiated when resources are scarce, including the process of language elicitation and automatic rule learning developed by the AVENUE project at Carnegie Mellon University. The talk will conclude with a proposal: machine translation between related low-resource languages should be undertaken in earnest as a way of augmenting the body of resources available to all of the related languages. Examples will be presented from indigenous languages of the Western Hemisphere, including Mapudungun (Chile/Argentina), Iñupiaq (Alaska), and Ojibwe (Michigan/Canada).