July 16, 2021
Introducing Spec-1-Mini
A compact, transformer-based 130M-parameter language model designed for efficient training and inference on commodity hardware.
Model Architecture
- Architecture: Decoder-only Transformer
- Parameters: 130 million
- Layers: 12 transformer blocks
- Hidden size: 768
- Attention heads: 12
- Max context length: 1024 tokens
- Tokenizer: Byte-Pair Encoding (BPE) with a 50,000-token vocabulary (see the configuration sketch after this list)
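These hyperparameters map onto a standard decoder-only configuration. Below is a minimal sketch, assuming a GPT-2-style block layout in Hugging Face Transformers; the config class and field names are illustrative, not the actual Spec-1-Mini source.

```python
# Minimal configuration sketch, assuming a GPT-2-style decoder-only block
# layout in Hugging Face Transformers; not the actual Spec-1-Mini source.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=50_000,   # BPE vocabulary size
    n_positions=1024,    # max context length in tokens
    n_embd=768,          # hidden size
    n_layer=12,          # transformer blocks
    n_head=12,           # attention heads
)
model = GPT2LMHeadModel(config)

# Sanity check on the parameter count (the exact total depends on details
# such as weight tying and bias terms).
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")
```

With tied input/output embeddings, this configuration lands close to the quoted 130M parameters; the exact count depends on implementation details such as bias terms and the output head.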
Training Details
- Training corpus: Under 10 GB of curated multilingual text
- Domains: Open-domain web text, books, documentation, conversations
- Training hardware: Low-end CPU clusters (8th-gen Intel i5/i7 equivalents)
- Epochs: 8
- Effective batch size: 256, reached via gradient accumulation (see the training sketch after this list)
- Precision: FP32 with manual memory optimization
- Frameworks: PyTorch + Hugging Face Transformers
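To make the gradient-accumulation setup concrete, here is a sketch of a plain PyTorch training loop on CPU: 8 epochs, FP32, and an effective batch of 256 built from small micro-batches. The micro-batch size, learning rate, and dataset are placeholders, and `model` is the instance from the configuration sketch above.

```python
# CPU training-loop sketch: 8 epochs, FP32, effective batch size 256 via
# gradient accumulation. Dataset, micro-batch size, and learning rate are
# placeholders; `model` comes from the architecture sketch above.
import torch
from torch.utils.data import DataLoader, TensorDataset

micro_batch = 8                     # what fits comfortably in CPU memory
accum_steps = 256 // micro_batch    # accumulate up to the effective batch of 256

# Placeholder corpus: random token ids standing in for the curated text.
dummy_ids = torch.randint(0, 50_000, (1024, 128))
loader = DataLoader(TensorDataset(dummy_ids), batch_size=micro_batch, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # lr is an assumption

model.train()
for epoch in range(8):
    optimizer.zero_grad()
    for step, (input_ids,) in enumerate(loader):
        # Causal LM loss: the model shifts the labels internally.
        loss = model(input_ids=input_ids, labels=input_ids).loss
        (loss / accum_steps).backward()   # scale so accumulated gradients average
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```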
Use Cases
- Text generation for chatbots and interfaces
- Semantic search and document summarization
- Offline inference on mobile and edge devices (see the example after this list)
- Proof-of-concept NLP pipelines for startups
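As a sketch of the offline-inference use case, the snippet below loads the model with the standard Transformers auto classes and generates a short completion on CPU. The identifier "spec-1-mini" is a placeholder for wherever the weights and tokenizer files live locally, not a published model id.

```python
# Offline CPU text generation sketch. "spec-1-mini" is a placeholder path to
# locally stored weights and tokenizer files.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("spec-1-mini")
model = AutoModelForCausalLM.from_pretrained("spec-1-mini")
model.eval()

prompt = "Summarize the following document:"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```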
Comparison to Industry Models
- Memory footprint: < 500 MB model size on disk
- Inference speed: Real-time responses on CPU (< 1 s per response; see the latency sketch below)
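For context, a rough way to reproduce the latency figure on a given machine, reusing the model and tokenizer from the previous snippet; the thread count and token budget are assumptions, and actual numbers depend on hardware and prompt length.

```python
# Rough CPU latency check; results vary with hardware, thread count,
# prompt length, and the number of generated tokens.
import time
import torch

torch.set_num_threads(4)  # assumption: a typical quad-core desktop CPU

inputs = tokenizer("Hello, Spec-1-Mini!", return_tensors="pt")
with torch.no_grad():
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=32)
    elapsed = time.perf_counter() - start
print(f"Generated 32 tokens in {elapsed:.2f}s")
```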