May 27, 2025

Introducing Optrix-1

Optrix-1 is SVECTOR’s efficient 1B-parameter language model for real-time text generation in Voice Mode.

[Figure: Optrix-1 model banner and Optrix Voice Mode UI]

Model Specifications

  • Architecture: Transformer with Grouped-Query Attention (GQA)
  • Parameters: 1 Billion
  • Embedding Dim: 2048 · Layers: 16 · Heads: 32
  • Max Position Embeddings: 131,072
  • Vocabulary Size: 128,256
  • Languages: English, German, Hindi, Spanish, French, and more
  • Activation: GELU · Positional Encoding: Rotary (dynamic scaled)
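
Taken together, these hyperparameters map onto a configuration object along the following lines. This is only an illustrative sketch: the field names follow common open-model conventions rather than any published Optrix code, and the number of key/value heads used by GQA is not stated in the spec, so the value below is a placeholder.

```python
from dataclasses import dataclass

@dataclass
class OptrixConfig:
    # Values taken from the published spec list above.
    vocab_size: int = 128_256
    hidden_size: int = 2048            # embedding dimension
    num_hidden_layers: int = 16
    num_attention_heads: int = 32      # per-head dim = 2048 / 32 = 64
    max_position_embeddings: int = 131_072
    hidden_act: str = "gelu"
    # Not published: GQA uses fewer K/V heads than query heads.
    # 8 is purely an assumption, reused in the sketches below.
    num_key_value_heads: int = 8
```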

Architecture Overview

The Optrix-1 model begins with input tokenization using the Optrix Tokenizer, which converts raw text into a sequence of token IDs. The embed_tokens layer then maps each ID to a dense vector.
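
As a minimal sketch of this step, assuming a plain PyTorch embedding table and placeholder token IDs standing in for the Optrix Tokenizer's output:

```python
import torch
import torch.nn as nn

vocab_size, hidden_size = 128_256, 2048

# embed_tokens maps each token ID to a 2048-dimensional vector.
embed_tokens = nn.Embedding(vocab_size, hidden_size)

# Placeholder IDs; in practice these come from the Optrix Tokenizer.
input_ids = torch.tensor([[101, 42_315, 7, 998]])  # (batch=1, seq=4)
hidden_states = embed_tokens(input_ids)            # (1, 4, 2048)
```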

The embedded tokens are passed through a stack of decoder layers, each built around Grouped Query Attention. In this mechanism, queries (Q) are computed per attention head, while keys (K) and values (V) are shared across groups of query heads, which shrinks the key/value cache. Q, K, and V each undergo projection (q_proj, k_proj, v_proj), and the queries and keys are additionally normalized (q_norm, k_norm) before entering scaled dot-product attention.
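
The sketch below shows the grouped-query pattern using PyTorch's built-in scaled dot-product attention. The 32 query heads come from the spec; the 8 key/value heads are an assumption (not published), and LayerNorm stands in for q_norm/k_norm since the exact normalization variant is not stated.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden, n_q_heads, n_kv_heads, head_dim = 2048, 32, 8, 64  # 8 KV heads assumed

q_proj = nn.Linear(hidden, n_q_heads * head_dim, bias=False)
k_proj = nn.Linear(hidden, n_kv_heads * head_dim, bias=False)
v_proj = nn.Linear(hidden, n_kv_heads * head_dim, bias=False)
q_norm = nn.LayerNorm(head_dim)  # stand-in for the model's q_norm
k_norm = nn.LayerNorm(head_dim)  # stand-in for the model's k_norm

x = torch.randn(1, 16, hidden)   # (batch, seq, hidden)
B, T, _ = x.shape

# Project, then split into heads: 32 query heads but only 8 K/V heads.
q = q_norm(q_proj(x).view(B, T, n_q_heads, head_dim)).transpose(1, 2)
k = k_norm(k_proj(x).view(B, T, n_kv_heads, head_dim)).transpose(1, 2)
v = v_proj(x).view(B, T, n_kv_heads, head_dim).transpose(1, 2)

# Each group of 32 / 8 = 4 query heads shares one K/V head.
k = k.repeat_interleave(n_q_heads // n_kv_heads, dim=1)
v = v.repeat_interleave(n_q_heads // n_kv_heads, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
out = out.transpose(1, 2).reshape(B, T, hidden)   # back to (1, 16, 2048)
```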

Rotary Positional Embeddings are applied to the queries and keys before the attention dot product, encoding sequence order directly into the attention scores. Each decoder layer then continues with layer normalization, followed by a feedforward network that combines a gate_proj, an up_proj, a GELU activation, and a down_proj, with another normalization step after the block.
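
The feedforward block reduces to a few lines. In the sketch below, the module names and the GELU activation come from the description above, while the intermediate size of 8192 is an assumption, since the spec does not state it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OptrixMLP(nn.Module):
    """Gated feedforward: down_proj(GELU(gate_proj(x)) * up_proj(x))."""

    def __init__(self, hidden_size: int = 2048, intermediate_size: int = 8192):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The GELU-activated gate modulates the up projection elementwise.
        return self.down_proj(F.gelu(self.gate_proj(x)) * self.up_proj(x))

mlp = OptrixMLP()
y = mlp(torch.randn(1, 16, 2048))  # shape preserved: (1, 16, 2048)
```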

Finally, the output is normalized once more and passed through lm_head, producing logits over the vocabulary that are converted into next-token probabilities (for example, assigning high probability to "cat").
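
The output step amounts to a projection plus a softmax, as in this sketch (random stand-in hidden states; LayerNorm again stands in for the unspecified final normalization, and greedy argmax is just one decoding choice):

```python
import torch
import torch.nn as nn

hidden_size, vocab_size = 2048, 128_256

norm = nn.LayerNorm(hidden_size)                  # final normalization
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

hidden_states = torch.randn(1, 16, hidden_size)   # stand-in decoder output
logits = lm_head(norm(hidden_states))             # (1, 16, 128256)

# Next-token distribution at the last position; a token like "cat" is
# emitted when it receives the highest probability.
probs = torch.softmax(logits[:, -1, :], dim=-1)
next_token = probs.argmax(dim=-1)                 # greedy decoding
```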

This structure, and in particular the reduced key/value cache that GQA brings, lets Optrix-1 generate text with high throughput and a low memory footprint.

[Figure: Optrix-1 architecture diagram]