On Friday 26 July, Anne-Laure Dotte and Stéphanie Geneix-Rabault of the ERALO team, together with their colleague Diane Wejë Bae of the Académie des Langues Kanak, will present a paper at the workshop Innovative approaches to speech and language technologies for Oceania, the world’s most linguistically diverse region (programme). The event, held in person at our partner university Flinders in Adelaide (Australia), is also accessible online.
C-LARA: disseminating resources for Iaai, an under-documented Oceanic language
by Anne-Laure Dotte & Stéphanie Geneix-Rabault, Université de la Nouvelle-Calédonie, ERALO, and Diane Wejë Bae, Académie des Langues Kanak, New Caledonia
Iaai, an Oceanic language of the Austronesian family, is one of the Kanak languages spoken in the archipelago of New Caledonia, more specifically in the centre and north-east of Ouvéa island, but also in Nouméa, Greater Nouméa and other parts of Grande Terre (Dotte et al., 2017). With around 3,700 speakers (ISEE, 2021), Iaai is a living language that is taught as an optional subject in the local education system, but for which teaching resources and written productions are still rare. On the Internet, resources are even scarcer, even though demand from teachers, artists and the community itself is strong. This demand is linked to the fear of seeing the language disappear, and to the isolation sometimes felt by speakers of this Kanak language, who find themselves in an environment where their mother tongue is a minority language, dominated by other Kanak languages as well as by French.
In this context, we developed a collection of resources with the C-LARA tool, which is innovative and relevant not only for young learners of the language, but also for language revitalisation efforts. The collection is composed of resources from oral literature, recorded with two main Iaai speakers in Ouvéa, and digitised by researchers from the University of New Caledonia. Thanks to a fruitful international collaboration with a team of colleagues from Flinders University and the University of South Australia, we were able to upload seven texts to the C-LARA platform (‘Fonds Pacifique’ grant from the Agence Française de Développement, Maizonniaux et al., 2023-2024). We faced several challenges, which will be addressed in this paper: the particularity of using C-LARA with human-recorded audio, sung voice segmentation issues, the orthographic choices for a language in the process of being standardised, the relevance of the images proposed by DALL-E 3 (ChatGPT) to illustrate a rarely documented context, the prospects for pedagogical use of these resources and their collaborative enrichment, and the current scarcity, within AI systems, of resources on our languages and cultural context.
The C-LARA resources in Iaai are freely accessible here!