Model Pricing

Pay-as-you-go pricing per 1M tokens. Filter by capability and size tier.
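
As a rough illustration of how per-1M-token pricing turns into a bill, here is a minimal sketch. It assumes the two figures in each "Pricing / 1M" entry are the input price and the output price, in that order; the page does not state this explicitly. The function name `estimate_cost` and the token counts are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate the USD cost of a workload.

    Assumption: input_price and output_price are the first and second
    figures of a "Pricing / 1M" entry (input / output). Prices are passed
    as strings so Decimal keeps them exact.
    """
    total = (
        Decimal(input_tokens) * Decimal(input_price)
        + Decimal(output_tokens) * Decimal(output_price)
    )
    return total / TOKENS_PER_UNIT


# Hypothetical workload on a model listed at $0.80 / $2.40:
# 200,000 input tokens and 50,000 output tokens.
cost = estimate_cost(200_000, 50_000, "0.80", "2.40")
print(f"${cost:.2f}")  # $0.28
```

Under that assumption, output tokens cost three times as much as input tokens at every tier listed here, so generation-heavy workloads dominate the total.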

qwepus-35b-a3b

Flagship hybrid MoE+SSM model. 35B total parameters with only 3.6B active per token. Native 262K context with reasoning and thinking capabilities.

reasoning, thinking, vision, agentic
Context Limit: 131K
Pricing / 1M: $3.00 / $9.00
Size Tier: Flagship

qwen3.6-35b-pro

Aggressively tuned 35B MoE with zero refusals. Same architecture as Qwepus with custom imatrix quantization for maximum quality.

agentic, vision
Context Limit: 131K
Pricing / 1M: $3.00 / $9.00
Size Tier: Flagship

Deckard 40B

40B Dense
deckard-40b

Dense 40B model expanded from Qwen3.6-27B architecture with 96 layers. Deep thinking and uncensored reasoning capabilities.

thinking, reasoning
Context Limit: 16K
Pricing / 1M: $3.00 / $9.00
Size Tier: Flagship

total-recall-42b

42B MoE thinking coder with only 3B active parameters. Built on Qwen3-30B-A3B-2507 with Brainstorm 20x enhancements for superior coding.

thinking, coding, reasoning
Context Limit: 32K
Pricing / 1M: $3.00 / $9.00
Size Tier: Flagship

Gemma 4 31B

31B Dense
gemma4-31b

Dense 31B multimodal model with sliding window attention. Supports both text and image understanding with GQA architecture.

vision, multimodal
Context Limit: 8K
Pricing / 1M: $2.00 / $6.00
Size Tier: Large

deepseek-r1-32b

True chain-of-thought reasoner distilled from DeepSeek R1. Dense 32B Qwen2.5 architecture with deep reasoning capabilities.

reasoning, chain-of-thought
Context Limit: 16K
Pricing / 1M: $2.00 / $6.00
Size Tier: Large

Flash 30B

30B MoE
flash-30b

GLM-4.7 based MoE flash model with only 2B active parameters. Optimized for speed with specialized imatrix quantization.

fast, coding
Context Limit: 16K
Pricing / 1M: $2.00 / $6.00
Size Tier: Large

deepseek-r1-14b

Compact reasoning model distilled from DeepSeek R1. Qwen-14B architecture with chain-of-thought capabilities at lower compute cost.

reasoning
Context Limit: 8K
Pricing / 1M: $0.80 / $2.40
Size Tier: Medium

Coder 14B

14B Dense
coder-14b

Qwen2.5 Coder 14B optimized for code generation. Produces complete, runnable solutions across multiple programming languages.

coding
Context Limit: 8K
Pricing / 1M: $0.80 / $2.40
Size Tier: Medium

Gemma 4 12B

12B Dense
gemma4-12b

Compact multimodal model with vision support. Gemma4 architecture with sliding window attention for efficient inference.

vision, multimodal
Context Limit: 8K
Pricing / 1M: $0.80 / $2.40
Size Tier: Medium

supergemma-12b

Enhanced Gemma4-12B with improved instruction following and expanded knowledge. Tuned for superior task completion.

general
Context Limit: 8K
Pricing / 1M: $0.80 / $2.40
Size Tier: Medium

Josie 8B

8B Dense
josie-8b

Josiefied Qwen3-8B tuned for unrestricted general-purpose assistance. Strong instruction following with creative capabilities.

general
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Vision 8B

8B Dense
vision-8b

Qwen3-VL vision-language model for image understanding and analysis. Processes both text and images in a single conversation.

vision, multimodal
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Qwythos 9B

9B Dense
qwythos-9b

Claude-Mythos hybrid 9B model. Combines conversational depth with broad knowledge for engaging, unrestricted dialogue.

conversational
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Qwable 9B

9B Dense
qwable-9b

Creative variant of the Qwen3 9B line. Optimized for narrative generation, roleplay, and creative writing tasks.

creative
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Thinker 8B

8B Dense
thinker-8b

Llama3.3-8B trained with thinking and reasoning chains for reasoning-heavy tasks. Uncensored distillation of Claude 4.5 Opus.

thinking, reasoning
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

deepresearch-8b

Marco DeepResearch model specialized for in-depth research and analysis tasks. Produces thorough, well-structured reports.

research, analysis
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Writer 9B

9B Dense
writer-9b

Professional writer model tuned for long-form content, storytelling, and creative prose. Produces publication-quality text.

creative, writing
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Mythos 9B

9B Dense
mythos-9b

Narrative-focused model with merged heretic tuning. Excels at worldbuilding, character development, and immersive storytelling.

narrative, creative
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

mythos-9b-unhinged

Fully unrestricted variant of Mythos 9B. Zero content filtering for maximum creative freedom in narrative generation.

narrative, creative
Context Limit: 8K
Pricing / 1M: $0.30 / $0.90
Size Tier: Standard

Fable 3B

3B Dense
fable-3b

Lightweight agentic model with 16K context. Fast inference with surprisingly capable output for its size class.

fast, agentic
Context Limit: 16K
Pricing / 1M: $0.05 / $0.15
Size Tier: Nano

Llama 3B

3B Dense
llama-3b

Llama 3.2 3B Instruct. Compact and fast model for simple tasks, quick queries, and lightweight automation.

fast, general
Context Limit: 16K
Pricing / 1M: $0.05 / $0.15
Size Tier: Nano

Cyber 1.5B

1.5B Dense
cyber-1.5b

Ultra-lightweight nano model with 32K context. Ideal for high-throughput tasks, classification, and structured data extraction.

fast, nano
Context Limit: 32K
Pricing / 1M: $0.05 / $0.15
Size Tier: Nano
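
For readers who want to reproduce the capability and size-tier filtering offline, the following is a minimal sketch over a handful of entries copied from the listing above. The `Model` record, the `filter_models` helper, and the in-memory `CATALOG` are hypothetical and not part of any published API. The tag values assume each run-together tag string on the page splits into separate tags (for example, "fast" and "coding").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    tags: tuple       # capability tags as shown on each card
    context_k: int    # context limit in thousands of tokens
    size_tier: str


# A small, hand-copied subset of the listing (not the full catalog).
CATALOG = [
    Model("deepseek-r1-32b", ("reasoning", "chain-of-thought"), 16, "Large"),
    Model("flash-30b", ("fast", "coding"), 16, "Large"),
    Model("coder-14b", ("coding",), 8, "Medium"),
    Model("fable-3b", ("fast", "agentic"), 16, "Nano"),
    Model("cyber-1.5b", ("fast", "nano"), 32, "Nano"),
]


def filter_models(catalog, tag=None, size_tier=None, min_context_k=0):
    """Return the models matching every filter that was supplied."""
    return [
        m for m in catalog
        if (tag is None or tag in m.tags)
        and (size_tier is None or m.size_tier == size_tier)
        and m.context_k >= min_context_k
    ]


fast_nano = [m.model_id for m in filter_models(CATALOG, tag="fast", size_tier="Nano")]
coders = [m.model_id for m in filter_models(CATALOG, tag="coding")]
long_context = [m.model_id for m in filter_models(CATALOG, min_context_k=32)]
print(fast_nano, coders, long_context)
```

Each filter is optional, so combining a tag with a tier narrows the result the same way stacking filters on the page would.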