BanglaLLM Research

Building language technology for Bangla

BanglaLLM is an independent, open research effort building language models for Bangla. We think there's a real difference between treating a language as an afterthought and designing for it from day one.

31+ Models, 7+ Datasets, 3+ Papers

What We Work On

Research Themes

Our work spans foundation models, benchmarks, data infrastructure, and real-world applications.

Foundation Models

New tokenization, continued pre-training, and instruction-tuning for Bangla, built on Llama and Qwen. The BanglaLlama family ranges from 3B to 33B parameters, and all models are openly released on Hugging Face.

Evaluation & Benchmarks

Measuring how well models perform in Bangla is still largely an open question. We're building benchmarks around political-bias detection, mathematical reasoning, and test-time scaling.

Data Infrastructure

Good models need good data, and for Bangla we've built most of it ourselves: news crawlers, translated instruction datasets (Bangla-Alpaca, Bangla-Orca), and math datasets, all openly released.

Research to Product

Research that reaches people matters more than research that stays on a shelf. Drishtikon, a news-literacy platform for Bangladesh, is built on this lab's work.

Publications

Research Output

Published

LoResLM @ EACL 2026

BanglaLlama: LLaMA for Bangla Language

Abdullah Khan Zehady, Shubhashis Roy Dipta, Naymul Islam, Safi Al Mamun, Santu Karmaker

Introduces Bangla-Alpaca (52k) and Bangla-Orca (172k) instruction datasets, plus 5 open BanglaLlama model variants.

BLP @ IJCNLP-AACL 2025

Read Between the Lines: A Benchmark for Uncovering Political Bias in Bangla News Articles

Nusrat Jahan Lia, Shubhashis Roy Dipta, Abdullah Khan Zehady, Naymul Islam, Madhusodan Chakraborty, Abdullah Al Wasif

BanglaBias, a 200-article benchmark with three-way labels (gov-leaning / gov-critique / neutral), evaluated across 28 LLMs.
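A three-way labeling scheme like the one BanglaBias uses can be scored with standard classification metrics. The snippet below is a minimal sketch, not the paper's evaluation code: the choice of accuracy and macro-F1, the function name `score_three_way`, and the example predictions are all assumptions made for illustration. Only the three label names come from the description above.

```python
# Hypothetical scoring sketch for three-way bias labels.
# The metric choice (accuracy, macro-F1) is an assumption, not the paper's stated protocol.
LABELS = ("gov-leaning", "gov-critique", "neutral")


def score_three_way(gold, pred):
    """Return accuracy and macro-F1 for parallel lists of gold and predicted labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    pairs = list(zip(gold, pred))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)

    f1_scores = []
    for label in LABELS:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        # A label that never appears in gold or pred contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return {"accuracy": accuracy, "macro_f1": sum(f1_scores) / len(LABELS)}


# Made-up example: four articles, one misclassified.
example_gold = ["gov-leaning", "gov-critique", "neutral", "neutral"]
example_pred = ["gov-leaning", "neutral", "neutral", "neutral"]
example_result = score_three_way(example_gold, example_pred)
```

A model output that falls outside the three labels (for example, a refusal) simply counts as an error under this sketch; a real evaluation would need to decide how to handle such cases.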

In Progress

TutorLM

Building tutoring-oriented Bengali models.

Preprint coming soon

Team

Research Team

Researchers and advisors building language technology for Bangla.

Abdullah Khan Zehady

Research Lead

Founder, Perspectivity

Shubhashis Roy Dipta

Research Lead

PhD Student, UMBC

Naymul Islam

Research Lead

BanglaLLM

Santu Karmaker

Research Advisor

Assistant Professor, UCF / Bridge-AI Lab

Safi Al Mamun

Researcher

BanglaLLM

Nusrat Jahan Lia

Researcher

BanglaLLM

Madhusodan Chakraborty

Researcher

BanglaLLM

Sibgat Zehady

Researcher

BanglaLLM

Research in Production

Powering Real-World Impact

Perspectivity

Multi-perspective analysis platform for understanding complex information. Real-time insights powered by research-grade language models.

Visit Perspectivity

Drishtikon

Bengali news-literacy platform with real-time bias detection. Multi-perspective analysis and source transparency for informed readers.

Visit Drishtikon

Get Involved

Collaborate with BanglaLLM

We're an open research group. Contributions, collaborations, and feedback are always welcome. The easiest way to get started is to open a GitHub issue or send a pull request.