A from-scratch transformer model trained on Assamese, English, and Hinglish. Purpose-built for the 50M+ speakers of Northeast India. Deployed at zero cost.
Rotary Position Embeddings — encodes position through vector rotation. Extrapolates to unseen lengths.
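A minimal numpy sketch of the idea (not the repo's implementation): each pair of embedding dimensions is rotated by an angle proportional to the token's position, so relative offsets are encoded in the geometry itself. The function name and the split-halves pairing convention are illustrative assumptions.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Position Embeddings to x of shape (seq_len, dim).

    Pairs dimension i with dimension i + dim/2 and rotates each pair
    by position * base^(-2i/dim). Rotation preserves vector norms, and
    position 0 gets angle 0 (identity).
    """
    seq_len, dim = x.shape
    half = dim // 2
    # one rotation frequency per dimension pair
    inv_freq = base ** (-np.arange(half) / half)
    # angle for every (position, pair) combination
    theta = np.outer(np.arange(seq_len), inv_freq)
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[:, :half], x[:, half:]
    # standard 2D rotation applied pairwise
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Because position enters only through these rotations, relative distances between tokens stay meaningful at sequence lengths longer than those seen in training.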
Root Mean Square Normalization — drops LayerNorm's mean subtraction and bias, ~15% faster with the same training stability.
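A quick sketch of why RMSNorm is cheaper (illustrative, not the repo's code): it only rescales by the root-mean-square, with no mean subtraction and no bias term.

```python
import numpy as np

def rmsnorm(x, gain=1.0, eps=1e-6):
    """RMSNorm over the last axis: divide by the RMS, then scale by a
    learned gain. Fewer ops than LayerNorm (no mean, no bias), which is
    where the speedup comes from."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain
```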
Swish-Gated Linear Unit — outperforms GELU and ReLU across all model sizes.
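A toy SwiGLU feed-forward block, sketched in numpy under assumed dimensions (the real weights and sizes live in the model): a Swish-activated path gates a parallel linear path before the down-projection.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_ff = 8, 16  # illustrative sizes, not the model's actual config

# three weight matrices: gate (W), value (V), down-projection (W2)
W = rng.standard_normal((d_model, d_ff)) * 0.1
V = rng.standard_normal((d_model, d_ff)) * 0.1
W2 = rng.standard_normal((d_ff, d_model)) * 0.1

def swish(z):
    return z / (1.0 + np.exp(-z))  # z * sigmoid(z), a.k.a. SiLU

def swiglu_ffn(x):
    # Swish-gated path elementwise-multiplies the plain linear path,
    # then projects back down to d_model
    return (swish(x @ W) * (x @ V)) @ W2
```

The gate lets the network modulate each hidden feature smoothly, which is the source of the gains over plain GELU/ReLU feed-forwards.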
Grouped Query Attention — 8 query heads share 4 KV heads. Half the KV-cache memory, same quality.
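A numpy sketch of the 8-query / 4-KV arrangement (head dimension and shapes are illustrative): each pair of query heads attends against one shared K/V head, so only half as many K/V tensors need to be stored.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).

    Each group of n_q_heads // n_kv_heads query heads reuses one KV
    head, shrinking the KV cache by that ratio (2x here for 8 vs 4).
    """
    group = q.shape[0] // k.shape[0]
    # broadcast each shared KV head to its group of query heads
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # softmax over the key axis
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ v
```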
Key-Value caching for autoregressive inference. 10-50x faster generation.
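A minimal single-head sketch of the caching idea (assumed names, not the repo's API): each decode step appends its K/V row once, so attention for token t reuses all earlier projections instead of recomputing them — that reuse is the source of the speedup.

```python
import numpy as np

class KVCache:
    """Append-only KV cache for one attention head during decoding."""

    def __init__(self):
        self.k, self.v = [], []

    def step(self, q_t, k_t, v_t):
        # store this step's K/V once; past steps are never recomputed
        self.k.append(k_t)
        self.v.append(v_t)
        K, V = np.stack(self.k), np.stack(self.v)
        # attend over everything cached so far
        scores = K @ q_t / np.sqrt(q_t.shape[0])
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ V
```

Without the cache, step t would redo t matrix products; with it, each step is O(1) new projection work plus one attention pass over the cache.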
Custom tokenizer trained on Assamese-script text. 3.1x compression vs character-level tokenization. Full Unicode support.
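The tokenizer's exact algorithm isn't stated here; assuming a BPE-style subword scheme, compression comes from repeatedly merging the most frequent adjacent pair into one token. A single merge step looks like this (toy example, not the trained vocabulary):

```python
def bpe_merge_step(tokens, pair):
    """One BPE merge: replace every adjacent occurrence of `pair` with
    its concatenation. Repeated merges turn single characters (or
    Unicode codepoints, e.g. Assamese letters) into subword tokens,
    which is what yields the compression over character-level."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out
```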
Trained for free on Kaggle's 2x T4 GPUs. 7 training techniques: label smoothing, z-loss, EMA weight averaging, stochastic depth, curriculum learning, token dropout, GQA. Running live right now.
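Two of those techniques combine naturally in the loss function. A hedged sketch (function name, smoothing value, and z-loss weight are illustrative, not the training config): label smoothing spreads a little target mass over all classes, and z-loss penalizes the squared log-partition to keep logit magnitudes from drifting.

```python
import numpy as np

def smoothed_ce_with_zloss(logits, target, smoothing=0.1, z_weight=1e-4):
    """Cross-entropy with label smoothing plus a z-loss regularizer."""
    # log-partition (logsumexp), computed stably
    m = logits.max()
    logz = np.log(np.sum(np.exp(logits - m))) + m
    log_probs = logits - logz
    n = len(logits)
    # label smoothing: (1 - smoothing) mass on the label, rest uniform
    soft = np.full(n, smoothing / n)
    soft[target] += 1.0 - smoothing
    ce = -np.sum(soft * log_probs)
    # z-loss nudges log Z toward 0, keeping logits in a healthy range
    return ce + z_weight * logz ** 2
```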
Next, with Startup India funding: an NVIDIA DGX Spark (128 GB unified memory), roughly 40x more compute. Production-grade Northeast India AI, same zero-cost deployment.