Model Pricing
Pay-as-you-go pricing per 1M tokens. Filter by capability and size tier.
Flagship hybrid MoE+SSM model. 35B total parameters with only 3.6B active per token. Native 262K context with reasoning and thinking capabilities.
Aggressively tuned 35B MoE with zero refusals. Same architecture as Qwepus with custom imatrix quantization for maximum quality.
Dense 40B model expanded from Qwen3.6-27B architecture with 96 layers. Deep thinking and uncensored reasoning capabilities.
42B MoE thinking coder with only 3B active parameters. Built on Qwen3-30B-A3B-2507 with Brainstorm 20x enhancements for superior coding.
Dense 31B multimodal model with sliding window attention and grouped-query attention (GQA). Supports both text and image understanding.
True chain-of-thought reasoner distilled from DeepSeek R1. Dense 32B Qwen2.5 architecture with deep reasoning capabilities.
GLM-4.7-based MoE flash model with only 2B active parameters. Optimized for speed with specialized imatrix quantization.
Compact reasoning model distilled from DeepSeek R1. Qwen2.5-14B architecture with chain-of-thought capabilities at lower compute cost.
Qwen2.5 Coder 14B optimized for code generation. Produces complete, runnable solutions across multiple programming languages.
Compact multimodal model with vision support. Gemma4 architecture with sliding window attention for efficient inference.
Enhanced Gemma4-12B with improved instruction following and expanded knowledge. Tuned for superior task completion.
Josiefied Qwen3-8B tuned for unrestricted general-purpose assistance. Strong instruction following with creative capabilities.
Qwen3-VL vision-language model for image understanding and analysis. Processes both text and images in a single conversation.
Claude-Mythos hybrid 9B model. Combines conversational depth with broad knowledge for engaging, unrestricted dialogue.
Creative variant of the Qwen3 9B line. Optimized for narrative generation, roleplay, and creative writing tasks.
Llama3.3-8B trained to produce an explicit thinking and reasoning chain for reasoning-heavy tasks. Uncensored distillation of Claude 4.5 Opus.
Marco DeepResearch model specialized for in-depth research and analysis tasks. Produces thorough, well-structured reports.
Professional writer model tuned for long-form content, storytelling, and creative prose. Produces publication-quality text.
Narrative-focused model with merged heretic tuning. Excels at worldbuilding, character development, and immersive storytelling.
Fully unrestricted variant of Mythos 9B. Zero content filtering for maximum creative freedom in narrative generation.
Lightweight agentic model with 16K context. Fast inference with surprisingly capable output for its size class.
Llama 3.2 3B Instruct. Compact and fast model for simple tasks, quick queries, and lightweight automation.
Ultra-lightweight nano model with 32K context. Ideal for high-throughput tasks, classification, and structured data extraction.
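
Since pricing is quoted per 1M tokens, the cost of a single request can be estimated by scaling the token count against that rate. The sketch below is illustrative only: the function name `request_cost`, the split into separate input and output rates, and the example prices are assumptions, not values taken from this page.

```python
from decimal import Decimal

# Prices on this page are quoted per 1M tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(input_tokens: int,
                 output_tokens: int,
                 input_price_per_1m: Decimal,
                 output_price_per_1m: Decimal) -> Decimal:
    """Estimate the pay-as-you-go cost of one request.

    Assumption: input and output tokens are billed at separate
    per-1M-token rates. If a model has a single rate, pass the
    same price for both arguments.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * input_price_per_1m / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * output_price_per_1m / TOKENS_PER_UNIT
    return input_cost + output_cost


# Hypothetical rates, chosen only to show the arithmetic.
example = request_cost(
    input_tokens=12_000,
    output_tokens=3_000,
    input_price_per_1m=Decimal("0.20"),
    output_price_per_1m=Decimal("0.60"),
)
print(example)
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many requests.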