Executive Overview
The Challenge
Artificial intelligence systems increasingly shape how knowledge is produced, accessed, and distributed. Yet the global AI ecosystem remains heavily skewed toward a small set of dominant languages.
Somali, spoken by over 25 million people across the Horn of Africa and the global diaspora, remains largely excluded from high-quality, AI-ready language datasets.
Without deliberate intervention, Somali speakers risk growing exclusion from AI-enabled education, digital public services, research ecosystems, and emerging knowledge economies.
The Opportunity
Underrepresented languages are not merely at risk of exclusion — they represent a major opportunity for inclusive innovation.
Expanding high-quality language infrastructure for Somali can:
- Enable meaningful participation in the AI and digital knowledge economy
- Support education, governance, and development applications
- Contribute valuable linguistic diversity to global AI systems
- Create new pathways for youth skills development and applied research
The Program: SLIP
The Somali Language Infrastructure Program (SLIP) is a Somalia-based pilot designed to address this gap through a practical, capacity-driven approach. SLIP combines youth training, high-quality language data production, and public-interest governance to lay the foundations for sustained Somali participation in AI and digital knowledge systems.
Over a 12-month pilot phase, SLIP will:
Youth Training
Train Somali youth in linguistics-informed language data production
Dataset Production
Produce high-quality, documented, machine-ready Somali language datasets
Resource Development
Develop open documentation, terminology resources, and standards
Ethical Framework
Establish ethical safeguards and stewardship pathways for long-term sustainability
SLIP is intentionally designed as a foundational intervention — testing scalable models that can later support expanded fellowship cohorts and replication across other underrepresented languages.
Why Bookoob
Bookoob Ltd serves as the implementing partner because of its demonstrated capacity in Somali knowledge production and language workflows.
Translation Excellence
Produced hundreds of high-quality Somali translations of complex nonfiction content
Terminology Development
Developed consistent Somali terminology across economics, science, and social thought
Quality Workflows
Established multilingual editorial, quality assurance, and documentation workflows
Digital Archives
Built and maintained structured digital archives of Somali language content
Expert Networks
Cultivated networks of translators, editors, narrators, and reviewers
Public Interest
All donor-supported outputs are governed by explicit public-good safeguards
Through its daily operations, Bookoob already generates near-clean linguistic data. SLIP builds on this foundation rather than starting from scratch, turning existing practice into a structured, public-interest infrastructure program.
Public-Good Safeguards and Stewardship
A core principle of SLIP is that donor-funded outputs serve the public interest.
Open Licensing
All datasets produced with donor support will be non-exclusively licensed for research, education, and public-interest use
Governed Stewardship
Outputs will be contributed to a governed stewardship framework rather than privately owned
Transparent Documentation
Every dataset will carry transparent documentation, provenance tracking, and ethical-use guidelines
Following the pilot phase, stewardship of SLIP-generated datasets will transition to an independent, mission-driven cooperative model designed to manage access, licensing, and sustainability in line with public-interest principles.
Partnership Pathways
Donors & Foundations
Support pilot implementation, youth training, and infrastructure development
Universities & Research
Engage through applied research collaboration and curriculum linkage
AI Labs & Tech Companies
Sponsor applied fellows under clear ethical terms, or support open language infrastructure
All partnerships are structured to maintain academic independence, ethical boundaries, and community benefit.
Why Now
AI systems are rapidly advancing, and language inclusion decisions made today will shape who participates in future knowledge economies. For Somali and other underrepresented languages, the window to build foundational infrastructure is narrowing.
SLIP represents a timely, practical intervention — grounded in Somalia's linguistic reality, aligned with global public-interest principles, and designed to deliver durable institutional value beyond a single project cycle.
Get Involved with SLIP
The Somali Language Infrastructure Program offers a credible, scalable pathway to ensuring that Somali remains present, usable, and valuable in the AI era.