Sruthi.
ASR engine
for production speech.
Two ASR engines, one API. Sruthi-T — a transformer model that's best-in-class on English long-form (13.27% WER, beating Deepgram, Google, Azure, ElevenLabs). Sruthi-S — a Samba state-space model with 3.65% average WER across LibriSpeech, GigaSpeech, and SPGISpeech, beating Whisper-large-v3 (7.44%) and CrisperWhisper (4.69%).
Pick the model that fits the workload.
Same SDK, same streaming protocol, same telephony adapters — two architectures behind it. Choose the transformer for multilingual production today; reach for Samba-ASR when the workload is English-heavy and accuracy is the only thing that matters.
Transformer · multilingual
Best-in-class on English YouTube long-form. Production-ready in English and Hindi today.
- · 13.27% WER on English long-form (best of 6 systems tested)
- · 16.50% WER on Hindi long-form (2nd of 6, behind Deepgram nova-2)
- · Streaming + batch · code-switching native
- · The default engine in Lingo and IRA deployments
Samba · state-space
Mamba-based encoder-decoder. Linear-complexity attention replacement. English research SOTA, multilingual on roadmap.
- · 3.65% average WER (LibriSpeech, GigaSpeech, SPGISpeech)
- · 1.17% on LibriSpeech clean · 1.84% on SPGISpeech
- · Beats Whisper-large-v3 (7.44%) and CrisperWhisper (4.69%)
- · arXiv 2501.02832 — Shakhadri, Kruthika, Angadi (2025)
Best in class on English long-form.
118 English audio samples, drawn from a 345-sample (~10 hours) corpus of publicly available YouTube videos. Six commercial ASR systems compared head-to-head on word error rate (WER) and character error rate (CER).
Full data table with CER →
| Model | WER % | CER % |
|---|---|---|
| SandLogic STT | 13.27 | 11.36 |
| Sarvam Saaras v3 | 13.55 | 13.41 |
| Deepgram nova-3 | 17.53 | 10.16 |
| Microsoft Azure | 21.93 | 8.45 |
| ElevenLabs Scribe v2 | 23.19 | 10.16 |
| Google Chirp 3 | 24.47 | 12.24 |
Source: llms.sandlogic.com/asr-benchmarks · 118 English samples · WER and CER measured on identical reference transcripts.
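The WER figures above are the standard word-level edit-distance metric; CER is the same computation over characters instead of words. As a reference for how the numbers are scored, here is a minimal WER implementation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

Production scoring pipelines also normalize case and punctuation before alignment; per the note above, all six systems here were scored against identical reference transcripts.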
Top-two on Hindi long-form.
Same evaluation methodology — 227 Hindi samples drawn from publicly available YouTube videos. Sruthi-T finishes second of six; among the systems evaluated, only Deepgram nova-2 scores lower.
Full data table with CER →
| Model | WER % | CER % |
|---|---|---|
| Deepgram nova-2 | 13.80 | 7.75 |
| SandLogic STT | 16.50 | 10.95 |
| Sarvam Saaras v3 | 17.52 | 13.51 |
| ElevenLabs Scribe v2 | 19.99 | 10.22 |
| Microsoft Azure | 29.35 | 12.29 |
| Google Chirp 3 | 29.55 | 10.83 |
State-space models beat transformers on average WER.
Sruthi-S is built on the Samba-ASR architecture — a Mamba-based encoder-decoder that swaps quadratic self-attention for selective state-space recurrence. The result: linear computational complexity and lower average WER than the leading transformer ASR systems on standard English benchmarks.
Per-test-set breakdown →
| Test set | Samba-ASR WER % | Whisper-large-v3 WER % | Canary-1B WER % | Note |
|---|---|---|---|---|
| LibriSpeech clean | 1.17 | — | — | frontier |
| LibriSpeech other | 2.48 | — | — | frontier |
| GigaSpeech | 9.12 | — | — | |
| SPGISpeech | 1.84 | — | — | financial domain |
| Average WER | 3.65 | 7.44 | 4.15 | |
Source: arXiv 2501.02832 — SAMBA-ASR: State-of-the-Art Speech Recognition Leveraging Structured State-Space Models · Shakhadri, Kruthika, Angadi (SandLogic, 2025) · Trained on LibriSpeech (460h), GigaSpeech (10,000h), SPGISpeech (5,000h).
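The complexity argument is easy to see in miniature. The toy recurrence below is not the Samba/Mamba architecture — the paper's selective state-space layers use input-dependent, learned matrices over high-dimensional states — but it shows why a state-space scan is linear in sequence length: each step folds one input into a fixed-size state, with no pairwise attention scores.

```python
def ssm_scan(x: list[float], a: float, b: float, c: float) -> list[float]:
    """Toy 1-D state-space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
    One constant-cost state update per step gives O(T) total work, versus
    the O(T^2) pairwise score matrix of transformer self-attention."""
    h, ys = 0.0, []
    for x_t in x:
        h = a * h + b * x_t
        ys.append(c * h)
    return ys
```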
Six things production ASR needs.
Streaming transcription
Sub-300ms first-token latency. Partial hypotheses surface as the speaker talks — built for live agent assist, IVRs, and real-time captions.
Code-switching native
Handles mid-sentence Hindi-English shifts without resetting the decoder. Trained on real Indian call-center audio, not dubbed corpora.
Long-form robustness
Best-in-class WER on noisy YouTube long-form English (13.27%) — beats Deepgram nova-3, Google Chirp 3, Microsoft Azure, ElevenLabs Scribe v2.
Diarization & overlap
Speaker separation, turn-taking labels, and overlap detection out of the box. No second-pass models, no external diarizer.
Voice biometrics
Speaker verification scoring on the same audio frame as transcription — useful for fraud detection and authentication flows.
On-prem & edge
Same binary runs on Krsna SoC, NVIDIA, AMD, Intel, ARM. Air-gapped deployment supported. No cloud round-trip.
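As a sketch of how a client might consume the streaming feed above: the message schema here ("partial"/"final" events carrying text) is an illustrative assumption, not the documented Sruthi wire format. Partial hypotheses update the live display and may be revised; finals are committed at end-of-utterance.

```python
import json

def apply_event(committed: str, event_json: str) -> tuple[str, str]:
    """Fold one streaming ASR event into the transcript.
    Assumed schema: {"type": "partial" | "final", "text": "..."}.
    Returns (committed_text, display_text)."""
    ev = json.loads(event_json)
    if ev["type"] == "final":
        # End-of-utterance: the hypothesis becomes permanent.
        committed = (committed + " " + ev["text"]).strip()
        return committed, committed
    # Partial: show it live, but don't commit -- the next event may revise it.
    return committed, (committed + " " + ev["text"]).strip()
```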
Production-ready today. Multilingual on roadmap.
Benchmarked production
English · Hindi — head-to-head benchmarks against six commercial systems on long-form audio.
Production rollout
Tamil · Telugu · Marathi · Bengali · Kannada · Malayalam · Punjabi · Gujarati — deployed in customer pilots, public benchmarks coming.
Roadmap
Remaining 14 Indic languages and 40 foreign languages are on the multilingual roadmap. Samba-ASR multilingual extension is research-track.
Hear it on your audio.
Send us a sample call from your stack and we'll return a transcript, diarization, and a head-to-head WER comparison against your incumbent within 48 hours.
No NDA needed for the first sample. We benchmark against your incumbent and report numbers — even if they're not in our favor.
Email us a sample
Drop-in replacement for your existing ASR.
REST + WebSocket
OpenAI-compatible HTTP for batch jobs. Persistent WebSocket for streaming with partial hypotheses and end-of-utterance signals.
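A minimal sketch of what "OpenAI-compatible" means for batch jobs: the path below follows the OpenAI audio API convention (`/v1/audio/transcriptions`), while the base URL, key, and model name are placeholders rather than documented Sruthi values. The request is assembled but not sent, so it can be inspected offline; in a real call, the model name and audio file go in a multipart/form-data body.

```python
import urllib.request

def build_transcription_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Assemble (but don't send) a batch transcription request against an
    OpenAI-compatible HTTP endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/transcriptions",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Because the shape matches the OpenAI convention, existing OpenAI-style client libraries can usually be pointed at the endpoint by overriding their base URL.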
Telephony adapters
Native connectors for Asterisk, FreeSWITCH, Twilio, Genesys, and Avaya. SIPREC tap supported for call-recording inspection.
Lingo & IRA bundled
Sruthi is the speech layer underneath Lingo and IRA. If you deploy either, the engine ships with them — no separate procurement.