Notes from the founder's desk.
Long-form writing, field notes, and benchmarks from Kamalakar Devaki — on edge AI, sovereign intelligence, the cloud-tax problem, and what it actually takes to build a full-stack AI company from the transistor up.
Fresh from the desk.
EdgeMatrix v0.0.5 — Throughput and Efficiency Gains
EdgeMatrix v0.0.5 ships throughput and efficiency improvements across multiple LLMs on NVIDIA L40S GPUs versus vLLM — with FlashAttention 4, context parallelism, and 190+ model families now supported.
All articles are published on LinkedIn — read, comment, and share them there.
Filter by topic.
50 pieces across edge AI, model architecture, silicon, runtime, and strategy. Use the filters to narrow by topic.
Enterprise AI Has a Token Leakage Problem
Enterprise AI bills aren't high because models are expensive — they're high because tokens leak. Hallucinations, poor orchestration, context overload. A full-stack approach (model strategy + runtime + monitoring) cuts token consumption 30–40%.
Why the Next 100× AI Returns May Not Be in Models
The inference inflection point has arrived. Competitive advantage is shifting from model size to deployment efficiency — cost per inference, energy, latency.
Shakti Architecture: Designing Language & Vision Models for the Real World
How the Shakti family is designed for real-world enterprise constraints — architecture choices that make compact models punch above their weight.
Shakti-4B — A Production-Grade Vision-Language Model
Shakti-4B as a production-ready VLM — engineered for the document-intelligence workloads enterprises actually run, not for leaderboard hill-climbing.
The Edge AI Systems Problem — Here's How We're Solving It
The edge isn't a smaller cloud — it's a different problem. SandLogic's approach to silicon, runtime, and models as a co-designed system.
Shakti-4B + OCR — Beating DeepSeek
Shakti-4B benchmarked against DeepSeek on OCR — quick field note on where the comparison lands.
Raising the Bar on On-Device AI — ExSLerate v2 Benchmarks
New benchmarks for ExSLerate v2 — what the numbers mean for on-device AI deployment and how they compare to incumbent options.
Benchmarking LMCache vs EdgeMatrix — Why Caching Alone Isn't Enough
Prefix caching is necessary but not sufficient. Why hybrid KV-cache reuse beats prefix-only approaches in multi-tenant inference workloads.
Core/Edge Energy Performance in AI Chips
How to think about energy-per-inference at the core/edge boundary — the metric that actually matters at scale.
Redefining LLM Inference — How EdgeMatrix Outperforms vLLM
EdgeMatrix vs vLLM head-to-head on enterprise SLMs — the architectural choices behind a 73% throughput lift on the L40S.
Engineering Scalable Edge AI — The Semiconductor Stack
What the semiconductor stack for scalable edge AI actually looks like — silicon, compiler, runtime, and how they need to be designed together.
Building the Full-Stack AI Future — Chip, Runtime, Models
The thesis statement for SandLogic — why full-stack vertical integration is the right shape for an AI company in 2025, not a luxury.
Why AI Chip Makers Need In-House Research, Now More Than Ever
Chip companies that outsource research are running an open-loop strategy. The case for the closed-loop alternative — and what it changes.
Lexicons, Nexons, Shakti — A Continuum of Intelligence
Introducing the SandLogic model continuum: Lexicons (curated open-source, quantized), Nexons (open foundations refined with our datasets), Shakti (in-house, ground-up).
Shakti LLM Series — Post 2: Built or Borrowed?
When do you build a sovereign language model and when do you start from open weights? Trade-offs from the SandLogic decision tree.
Shakti LLM Series — Post 1: Why We Built a Sovereign Language Model
The founding rationale for Shakti — why sovereignty, language coverage, and edge deployment forced the in-house path.
GenAI · Edge AI · Multi-Modal LLMs — Field Note
Short field-note on multi-modal LLMs running on the edge — what works, what doesn't yet.
Escape the Cloud Tax
Companion post for the "Escape the Cloud Tax" series — making the case for on-prem inference economics.
Escape the Cloud Tax — Post 5: Serve Faster, Spend Smarter, Scale
Final post in the "Escape the Cloud Tax" series — how to design inference for speed, cost, and scale simultaneously.
LLM Inference Acceleration — Field Note
Short post on the practical mechanics of LLM inference acceleration on the edge.
LLM MLOps on the Edge — Field Note
Operating LLMs at the edge — what an MLOps stack looks like when there is no cloud safety net.
LLM Inference MLOps — Notes
Quick notes on inference MLOps — observability, model swapping, drift detection at production scale.
EdgeMatrix vs the Cloud Tax — Field Note
How EdgeMatrix maps to the "escape the cloud tax" thesis — the economics in one chart.
ExSLerate — On-Chip AI for the Edge
Introducing the ExSLerate IP family — what makes an AI accelerator chip "edge-native" rather than a shrunk-down data-center part.
EdgeMatrix — Scaling 70B-Parameter Models for Enterprise AI
How EdgeMatrix scales to 70B-parameter LLMs without the cloud sticker shock — the engineering trade-offs explained.
Shakti-4B's OCR Capabilities — Comprehensive Evaluation
Comprehensive evaluation of Shakti-4B's OCR performance — datasets, methodology, and head-to-head benchmarks.
How EdgeMatrix Is Redefining Enterprise AI — More for Less Cost
Concrete enterprise economics — what "more for less" looks like when the inference layer is engineered for the workload.
Shakti-4B — Multi-Modal AI Model Powering Intelligence
Shakti-4B as a multi-modal foundation — what it can do today across vision and language.
Shakti-1B — Vision-Language Model Built for Enterprise
Shakti-1B as the right size for many enterprise document workflows — fast, accurate, and edge-deployable.
LingoForge — Revolutionizing How Enterprises Harness AI
LingoForge as the agent-orchestration layer enterprises actually need — and what "actually need" means in regulated industries.
Revolutionizing ASR with Samba-ASR
Samba-ASR — the Mamba-based architecture under Sruthi-S that beats Whisper-large-v3 on average WER. Linear complexity, frontier accuracy.
Speech Recognition Innovation — Field Note
Quick note on speech-recognition innovation — what changed and where it goes next.
Shakti LLM · Generative AI — Recognition
Recognition for Shakti LLM in generative-AI excellence rankings.
Real-World Applications of Shakti LLMs — Revolutionizing AI
How Shakti models show up in real-world enterprise deployments — concrete use cases across verticals.
Shakti LLMs Driving On-Device AI Workplace Agents
On-device workplace AI agents powered by Shakti — what becomes possible when the model lives on the device.
Precision & Power — Shakti's Blueprint for AI Excellence
The blueprint behind the Shakti family — where precision and power meet to define the model architecture.
Harnessing the Power of Shakti — LLM Series
A walk-through of the Shakti family for builders — what to deploy, where, and how to think about model selection.
From Edge to Excellence — Shakti LLM Revolution for Enterprise
Shakti from the edge to enterprise excellence — how the model line evolved from edge-first design to enterprise scale.
NASSCOM DeepTech Club — Startup Badge Awarded
SandLogic recognized by NASSCOM's DeepTech Club — a milestone moment.
Make in India · AI for Good · Enterprise AI
On building sovereign AI for India — Make in India meets AI for Good meets enterprise reality.
Shakti-2.5B — Live on Hugging Face
Announcing the Shakti-2.5B Hugging Face Space — explore the model interactively.
Shakti — A 2.5-Billion-Parameter Small Language Model
The first wide-audience announcement of Shakti-2.5B — what it is, why it's small on purpose, and what it beats.
Revolutionizing UI Localization Testing with LLMs
How LLMs change the economics of UI localization testing — a vertical use case for compact language models.
Shakti LLM · Responsible AI — Field Note
On building responsible AI guardrails into a sovereign LLM — the design choices behind HaluMon.
Introducing LexiQ — Your AI-Powered Assistant for KPI & Power BI
LexiQ — a Lexicon-built assistant for KPI and Power BI workflows. Domain-specialized AI shipped as a product, not a demo.
Optimized Llama3-Med42-8B GGUF SandLogic Lexicon
A medical-domain Lexicon — Llama3-Med42-8B optimized via SandLogic's quantization recipe. What enterprise-ready open-source looks like.
Unlocking Bilingual AI — A SandLogic Lexicon-Based Approach
Bilingual AI via curated Lexicons — the engineering and the dataset choices behind production-grade language coverage.
Turbocharge Your AI with SandLogic Lexicons
Why curated open-source — quantized, packaged, and benchmarked — beats raw model downloads for enterprise teams.
Introducing HaluMon — Ensuring Language-Model Reliability
The launch post for HaluMon — what reliable LLM deployment looks like in regulated industries, and the four-metric scoring that makes it auditable.
Eight years. One thesis.

Kamalakar Devaki · Founder & CEO
Kamalakar Devaki founded SandLogic in 2018 on the bet that intelligence belongs on the device, not rented from the cloud. Eight years later, the silicon, the runtime, the models, and the applications have all shipped under one roof.
The writing on this page is not marketing. It's the working record: engineering decisions, benchmark numbers, strategic bets, and the occasional unfiltered opinion. It all lives on LinkedIn because that's where the audience reads, comments, and shares. This page curates the archive in one place.
The thesis hasn't changed. The execution has.