Startup India Certified

India's First Foundational AI for Northeast India

A from-scratch transformer model trained on Assamese, English, and Hinglish. Purpose-built for 50M+ Northeast India speakers. Deployed at zero cost.

Assamese Native · LLaMA 3-Class Architecture · Trained in Guwahati · $0/month Deployment
42M Parameters
8,192 BPE Tokens
3 Languages
3.1x Compression
$0 Monthly Cost

NortheastLM Chat

RoPE + RMSNorm + SwiGLU + GQA
Namaskar! I am NortheastLM, India's first foundational AI model for Northeast India, built by NECISS in Guwahati. Ask me about Assam, Kaziranga, Bihu, tea, silk, business ideas, or the entire Northeast region. I understand Assamese, English, and Hinglish.
NortheastLM v1.0 · 42M params · ONNX INT8 on ARM

Architecture — What Powers NortheastLM

RoPE

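Rotary Position Embeddings: encode position by rotating query and key vectors, so attention scores depend on the relative offset between tokens. Can extrapolate to unseen lengths, though quality typically degrades beyond the trained context without further tuning.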
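To make the rotation idea concrete, here is a minimal sketch in plain Python. It assumes the common RoPE convention of rotating consecutive dimension pairs with a base of 10000; the function names, the base value, and the toy vectors are illustrative assumptions, not NortheastLM's actual code.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    d = len(vec)
    assert d % 2 == 0, "head dimension must be even"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # later pairs rotate more slowly
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (hypothetical values).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# The same offset (3) at two different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))
print(score_near, score_far)
```

The two printed scores agree up to floating-point error, which is the property that makes the attention score a function of relative position only.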

RMSNorm

Root Mean Square Normalization: skips LayerNorm's mean subtraction, for a reported speedup of about 15% with comparable training stability.
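A minimal sketch of the normalization itself, in plain Python. The `eps` value and the all-ones weight vector are assumptions for illustration; the model's real hyperparameters are not stated on this page.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

x = [2.0, -4.0, 4.0, 0.0]          # toy activations; RMS is 3
y = rms_norm(x, [1.0] * len(x))    # gain of 1 leaves only the normalization
print(y)
```

Unlike LayerNorm, nothing is subtracted from `x` before scaling, which is where the saved computation comes from.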

SwiGLU

Swish-Gated Linear Unit: a gated feed-forward layer that typically outperforms GELU and ReLU variants at comparable parameter counts.
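The gating can be sketched in a few lines of plain Python. The weight matrices below are tiny made-up values (model width 2, hidden width 3), and the names `W_gate`, `W_up`, and `W_down` are assumptions chosen for readability.

```python
import math

def silu(z):
    """Swish / SiLU activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def swiglu_ffn(x, W_gate, W_up, W_down):
    """Feed-forward block: down( silu(gate(x)) * up(x) )."""
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]  # element-wise gating
    return matvec(W_down, hidden)

# Hypothetical toy weights.
W_gate = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
W_up = [[0.2, 0.7], [-0.6, 0.1], [0.9, -0.3]]
W_down = [[0.3, -0.5, 0.2], [0.6, 0.4, -0.1]]

out = swiglu_ffn([1.0, -2.0], W_gate, W_up, W_down)
print(out)
```

The extra `W_gate` projection is the cost of the design: a SwiGLU block uses three weight matrices where a plain GELU or ReLU feed-forward block uses two.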

GQA

Grouped Query Attention: 8 query heads share 4 KV heads (2 query heads per KV head). Halves KV-cache memory with little to no quality loss.
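The head sharing and the resulting memory saving can be worked out directly. The head counts come from the description above; the head dimension, layer count, and sequence length are hypothetical placeholders, since the page does not state them.

```python
N_Q_HEADS = 8    # from the description above
N_KV_HEADS = 4   # from the description above
GROUP = N_Q_HEADS // N_KV_HEADS  # query heads per KV head

def kv_head_for(q_head):
    """Index of the KV head that a given query head reads from."""
    return q_head // GROUP

mapping = [kv_head_for(h) for h in range(N_Q_HEADS)]

def kv_cache_entries(seq_len, n_kv_heads, head_dim, n_layers):
    """Number of cached values: keys plus values, per token, head, and layer."""
    return 2 * seq_len * n_kv_heads * head_dim * n_layers

# Hypothetical sizes, for illustration only.
HEAD_DIM, N_LAYERS, SEQ_LEN = 64, 8, 512

mha = kv_cache_entries(SEQ_LEN, N_Q_HEADS, HEAD_DIM, N_LAYERS)   # one KV head per query head
gqa = kv_cache_entries(SEQ_LEN, N_KV_HEADS, HEAD_DIM, N_LAYERS)  # shared KV heads
ratio = mha / gqa
print(mapping, ratio)
```

The ratio depends only on the two head counts, so the 2x figure holds whatever the real head dimension and depth turn out to be.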

KV-Cache

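Key-value caching for autoregressive inference: keys and values of earlier tokens are stored and reused instead of being recomputed at every step. 10-50x faster generation, with the gain growing as sequences get longer.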
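A minimal single-head sketch of the caching idea in plain Python. For brevity it assumes the query, key, and value of each token are the token vector itself (identity projections); the `KVCache` class and the toy tokens are illustrative, not the deployed implementation.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def attend(q, keys, values):
    """Scaled dot-product attention for one query over all stored keys and values."""
    scale = 1.0 / math.sqrt(len(q))
    weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]

class KVCache:
    """Stores each token's key and value once and reuses them on later steps."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

tokens = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]  # toy token vectors

cache = KVCache()
for x in tokens:
    cached_out = cache.step(x, x, x)  # only the new token's k and v are added

full_out = attend(tokens[-1], tokens, tokens)  # recompute over the whole sequence

def kv_computations(n_tokens, use_cache):
    """Key/value computations needed to generate n_tokens, one token at a time."""
    return n_tokens if use_cache else n_tokens * (n_tokens + 1) // 2

print(cached_out == full_out, kv_computations(100, False), kv_computations(100, True))
```

The cached and recomputed outputs match, while the count of key/value computations grows linearly with the cache and quadratically without it.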

Custom BPE

Tokenizer trained on Assamese script. 3.1x compression versus character-level tokenization (about 3.1 characters per token on average). Full Unicode support.
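How a compression figure like this is measured can be shown with a toy BPE trainer in plain Python. This is a simplified sketch: it merges Unicode code points over a single short string, whereas production tokenizers usually add pre-tokenization and often work on bytes. The sample text and merge count are assumptions, and the resulting ratio says nothing about the real 8,192-token vocabulary.

```python
from collections import Counter

def train_bpe(text, num_merges):
    """Greedy BPE: repeatedly merge the most frequent adjacent token pair."""
    tokens = list(text)  # start at character level (Unicode code points)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so further merges would not compress
        merges.append((a, b))
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

text = "নমস্কাৰ অসম নমস্কাৰ অসম নমস্কাৰ"  # toy Assamese sample
tokens, merges = train_bpe(text, num_merges=10)

# Compression versus character-level: characters per BPE token.
compression = len(text) / len(tokens)
print(len(text), len(tokens), round(compression, 2))
```

Joining the tokens back together reproduces the input exactly, so the compression is lossless.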

Scaling Roadmap — From Proof-of-Concept to Production

Current: 42M

Proof-of-concept trained on a free Google Colab T4 GPU. Demonstrates the architecture, tokenizer, and deployment pipeline. Running live right now.

Target: 1.5B

Planned with Startup India funding for an NVIDIA DGX Spark (128GB unified memory). About 36x more parameters than the current model (1.5B vs 42M). Production-grade Northeast India AI. Same zero-cost deployment.
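The size of the jump from 42M to 1.5B parameters can be checked with simple arithmetic. The sketch below counts weight storage only and assumes 1 byte per parameter for INT8 and 2 bytes for FP16; activations, KV-cache, and runtime overhead are ignored, and the helper name is illustrative.

```python
CURRENT_PARAMS = 42_000_000       # 42M, current model
TARGET_PARAMS = 1_500_000_000     # 1.5B, roadmap target

scale = TARGET_PARAMS / CURRENT_PARAMS  # parameter-count ratio

def weight_megabytes(n_params, bytes_per_param):
    """Approximate weight storage in MB (1 MB = 1e6 bytes)."""
    return n_params * bytes_per_param / 1e6

current_int8_mb = weight_megabytes(CURRENT_PARAMS, 1)
target_int8_mb = weight_megabytes(TARGET_PARAMS, 1)
target_fp16_mb = weight_megabytes(TARGET_PARAMS, 2)

print(round(scale, 1), current_int8_mb, target_int8_mb, target_fp16_mb)
```

Parameter count is only a proxy for capability, so the ratio describes model size rather than a measured quality gain.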