This symposium will be co-located with the LFG conference, where we expect at least 25 scholars and researchers from around the world to take part.

South Asia—comprising Afghanistan, Bangladesh, Bhutan, India, Maldives, Nepal, Pakistan, and Sri Lanka—is home to over two billion people and represents one of the richest linguistic regions in the world. With more than 700 languages and approximately 25 major writing systems, the region embodies extraordinary cultural and linguistic diversity. In addition, over 50 million South Asians live in diaspora communities worldwide.

Despite this richness, South Asian languages remain significantly underrepresented in language technologies. Most contemporary large language models (LLMs) are trained on only minimal data from the region, and fundamental challenges persist, ranging from encoding and orthography to data scarcity and cultural representation. While most South Asian scripts are encoded in Unicode, orthographic complexity, inconsistent rendering, and limited input methods continue to hinder practical use. Linguistic features such as diglossia, dialectal variation, code-mixing, and close language contact further complicate natural language processing (NLP) tasks.

Focus on Tamil and Sinhala

This symposium places a special emphasis on Tamil and Sinhala, two major languages of Sri Lanka with long literary traditions, rich sociolinguistic variation, and growing relevance in language technology. Both languages pose unique challenges for NLP and speech technologies, including complex morphology, script-specific orthographic rules, dialectal diversity, and limited high-quality digital resources. At the same time, they offer important opportunities for advancing inclusive and culturally grounded computational linguistics research.

Background and Motivation

The first edition of the CHiPSAL (Challenges in Processing South Asian Languages) workshop, co-located with COLING 2025, was well received by both academia and industry, attracting strong participation from South Asia and beyond. The second edition will be co-located with LREC 2026.

CHiPSAL has provided a much-needed forum for exchanging challenges, insights, and solutions related to South Asian language processing. Building on this momentum, the workshop continues to address linguistic, cultural, and technical challenges, with the broader goal of preserving and promoting South Asian linguistic heritage through responsible and inclusive language technologies.

Symposium Format

Unlike traditional paper presentation sessions, the CHiPSAL symposium adopts a roundtable-style format designed to foster interaction, discussion, and collaborative feedback.

Rather than presenting finished work, participants are encouraged to share ongoing research ideas, early-stage projects, or emerging concepts. The emphasis is on dialogue—helping researchers refine their ideas, identify challenges, and explore potential collaborations.

Keynote Addresses

In addition to the roundtable discussions, keynote addresses will be delivered by the following speakers:

Session Structure

  • 10-minute presentation outlining the research idea
  • 15-minute moderated discussion with the audience and invited experts

This format creates a supportive yet critical environment, particularly beneficial for early-stage researchers, students, and researchers working on low-resource languages such as Tamil and Sinhala.

Call for Synopses

We invite submissions of short synopses (maximum 500 words) describing a research idea relevant to the theme.

Each synopsis should include:

  • Title of the Research Idea
    • A concise and descriptive title.
  • Author(s) and Affiliation(s)
  • Synopsis
    • A brief description of the research idea, including the problem being addressed, the proposed approach, and the potential impact. Preliminary results may be included if available.
  • Discussion Points
    • A short list of questions or topics the author would like to discuss with the audience.

Evaluation Criteria

Synopses will be selected based on:

  • Potential to stimulate lively and productive discussion
  • Originality of the idea
  • Relevance to South Asian languages, with particular attention to Tamil and Sinhala
  • Alignment with the broader goals of CHiPSAL

Important Dates

  • Submission Deadline: 30 June 2026
  • Submission Link: Will be announced soon

Selected synopses will be shared with registered participants in advance to encourage pre-reading and informed discussion.

We warmly invite researchers, practitioners, and students to contribute their ideas and take part in these meaningful conversations.

Organisers

  • EYA Charles, Department of Computer Science, University of Jaffna
  • K Sarveswaran, Department of Computer Science, University of Jaffna