Model Pricing

Pay-as-you-go pricing per 1M tokens. Filter by capability and size tier.
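Since prices are quoted per 1M tokens, the cost of a request scales linearly with its token counts. The sketch below shows that arithmetic. It assumes the two figures on each card (for example `$3.00 / $9.00`) are the input and output prices respectively; the page does not label them, so treat that reading, and the helper name `estimate_cost`, as illustrative assumptions.

```python
from decimal import Decimal

TOKENS_PER_UNIT = 1_000_000  # prices on this page are quoted per 1M tokens


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate the USD cost of one request.

    ASSUMPTION: the first listed price applies to input tokens and the
    second to output tokens. The page itself does not say which is which.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_UNIT
    return input_cost + output_cost


# Example: 200K input tokens and 50K output tokens on a card listed at $3.00 / $9.00.
cost = estimate_cost(200_000, 50_000, "3.00", "9.00")
print(f"${cost:.2f}")  # prints $1.05
```

Decimal arithmetic is used here so that small per-request amounts do not pick up floating-point rounding noise when summed over many requests.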

Qwepus 35B A3B

35B MoE

qwepus-35b-a3b

Flagship hybrid MoE+SSM model. 35B total parameters with only 3.6B active per token. Native 262K context with reasoning and thinking capabilities.

reasoning, thinking, vision, agentic

Context Limit: 131K

Pricing / 1M: $3.00 / $9.00

Size Tier: Flagship

Qwen 3.6 35B Pro

35B MoE

qwen3.6-35b-pro

Aggressively tuned 35B MoE with zero refusals. Same architecture as Qwepus with custom imatrix quantization for maximum quality.

agentic, vision

Context Limit: 131K

Pricing / 1M: $3.00 / $9.00

Size Tier: Flagship

Deckard 40B

40B Dense

deckard-40b

Dense 40B model expanded from Qwen3.6-27B architecture with 96 layers. Deep thinking and uncensored reasoning capabilities.

thinking, reasoning

Context Limit: 16K

Pricing / 1M: $3.00 / $9.00

Size Tier: Flagship

Total Recall 42B

42B MoE

total-recall-42b

42B MoE thinking coder with only 3B active parameters. Built on Qwen3-30B-A3B-2507 with Brainstorm 20x enhancements for superior coding.

thinking, coding, reasoning

Context Limit: 32K

Pricing / 1M: $3.00 / $9.00

Size Tier: Flagship

Gemma 4 31B

31B Dense

gemma4-31b

Dense 31B multimodal model with sliding window attention. Supports both text and image understanding with GQA architecture.

vision, multimodal

Context Limit: 8K

Pricing / 1M: $2.00 / $6.00

Size Tier: Large

DeepSeek R1 32B

32B Dense

deepseek-r1-32b

True chain-of-thought reasoner distilled from DeepSeek R1. Dense 32B Qwen2.5 architecture with deep reasoning capabilities.

reasoning, chain-of-thought

Context Limit: 16K

Pricing / 1M: $2.00 / $6.00

Size Tier: Large

Flash 30B

30B MoE

flash-30b

GLM-4.7 based MoE flash model with only 2B active parameters. Optimized for speed with specialized imatrix quantization.

fast, coding

Context Limit: 16K

Pricing / 1M: $2.00 / $6.00

Size Tier: Large

DeepSeek R1 14B

14B Dense

deepseek-r1-14b

Compact reasoning model distilled from DeepSeek R1. Qwen-14B architecture with chain-of-thought capabilities at lower compute cost.

reasoning

Context Limit: 8K

Pricing / 1M: $0.80 / $2.40

Size Tier: Medium

Coder 14B

14B Dense

coder-14b

Qwen2.5 Coder 14B optimized for code generation. Produces complete, runnable solutions across multiple programming languages.

coding

Context Limit: 8K

Pricing / 1M: $0.80 / $2.40

Size Tier: Medium

Gemma 4 12B

12B Dense

gemma4-12b

Compact multimodal model with vision support. Gemma4 architecture with sliding window attention for efficient inference.

vision, multimodal

Context Limit: 8K

Pricing / 1M: $0.80 / $2.40

Size Tier: Medium

SuperGemma 12B

12B Dense

supergemma-12b

Enhanced Gemma4-12B with improved instruction following and expanded knowledge. Tuned for superior task completion.

general

Context Limit: 8K

Pricing / 1M: $0.80 / $2.40

Size Tier: Medium

Josie 8B

8B Dense

josie-8b

Josiefied Qwen3-8B tuned for unrestricted general-purpose assistance. Strong instruction following with creative capabilities.

general

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Vision 8B

8B Dense

vision-8b

Qwen3-VL vision-language model for image understanding and analysis. Processes both text and images in a single conversation.

vision, multimodal

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Qwythos 9B

9B Dense

qwythos-9b

Claude-Mythos hybrid 9B model. Combines conversational depth with broad knowledge for engaging, unrestricted dialogue.

conversational

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Qwable 9B

9B Dense

qwable-9b

Creative variant of the Qwen3 9B line. Optimized for narrative generation, roleplay, and creative writing tasks.

creative

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Thinker 8B

8B Dense

thinker-8b

Llama3.3-8B trained with thinking and reasoning chains for reasoning-heavy tasks. Uncensored distillation of Claude 4.5 Opus.

thinking, reasoning

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

DeepResearch 8B

8B Dense

deepresearch-8b

Marco DeepResearch model specialized for in-depth research and analysis tasks. Produces thorough, well-structured reports.

research, analysis

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Writer 9B

9B Dense

writer-9b

Professional writer model tuned for long-form content, storytelling, and creative prose. Produces publication-quality text.

creative, writing

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Mythos 9B

9B Dense

mythos-9b

Narrative-focused model with merged heretic tuning. Excels at worldbuilding, character development, and immersive storytelling.

narrative, creative

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Mythos 9B Unhinged

9B Dense

mythos-9b-unhinged

Fully unrestricted variant of Mythos 9B. Zero content filtering for maximum creative freedom in narrative generation.

narrative, creative

Context Limit: 8K

Pricing / 1M: $0.30 / $0.90

Size Tier: Standard

Fable 3B

3B Dense

fable-3b

Lightweight agentic model with 16K context. Fast inference with surprisingly capable output for its size class.

fast, agentic

Context Limit: 16K

Pricing / 1M: $0.05 / $0.15

Size Tier: Nano

Llama 3B

3B Dense

llama-3b

Llama 3.2 3B Instruct. Compact and fast model for simple tasks, quick queries, and lightweight automation.

fast, general

Context Limit: 16K

Pricing / 1M: $0.05 / $0.15

Size Tier: Nano

Cyber 1.5B

1.5B Dense

cyber-1.5b

Ultra-lightweight nano model with 32K context. Ideal for high-throughput tasks, classification, and structured data extraction.

fast, nano

Context Limit: 32K

Pricing / 1M: $0.05 / $0.15

Size Tier: Nano
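
The capability and size-tier filters described at the top of the page can be reproduced offline over a local copy of the catalog. The sketch below uses four entries copied from the cards above; the dictionary layout and the `filter_models` helper are hypothetical and are not part of any published API.

```python
from typing import Optional

# A few entries copied from the cards above. The dict layout is an
# illustrative assumption, not a documented schema.
CATALOG = [
    {"id": "total-recall-42b", "capabilities": {"thinking", "coding", "reasoning"}, "tier": "Flagship"},
    {"id": "flash-30b", "capabilities": {"fast", "coding"}, "tier": "Large"},
    {"id": "coder-14b", "capabilities": {"coding"}, "tier": "Medium"},
    {"id": "fable-3b", "capabilities": {"fast", "agentic"}, "tier": "Nano"},
]


def filter_models(catalog, capability: Optional[str] = None, tier: Optional[str] = None):
    """Return the ids of models matching a capability tag and/or a size tier.

    A filter left as None is not applied, so calling with no filters
    returns every model id in catalog order.
    """
    matches = []
    for model in catalog:
        if capability is not None and capability not in model["capabilities"]:
            continue
        if tier is not None and model["tier"] != tier:
            continue
        matches.append(model["id"])
    return matches


coding_models = filter_models(CATALOG, capability="coding")
flagship_coders = filter_models(CATALOG, capability="coding", tier="Flagship")
print(coding_models)    # ['total-recall-42b', 'flash-30b', 'coder-14b']
print(flagship_coders)  # ['total-recall-42b']
```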