WebX11 AI
Back to Chat Refresh

AI Models

All available models on WebX11 — cloud hosted via Ollama Cloud and local GGUF models. Each model includes use-case recommendations to help you pick the right one.

Cloud API Local GGUF Recommended
0 models total 0 cloud 0 local

Chat and General Purpose 0

deepseek-v4-flash cloud default
~70Bvery fast
Fastest cloud model — low latency, strong reasoning. The perfect daily driver for quick chats, tool use, automation, and general conversation.
daily-driverfasttool-useautomation
deepseek-v4-pro cloud
~670B MoEfast
DeepSeek flagship — powerful reasoning, architecture planning, code review, and complex multi-step tasks. Deeper than flash.
complex-reasoningarchitecturecode-reviewplanning
kimi-k2.6 cloud
largemedium speed
Excellent for analytical questions, research synthesis, and long-context understanding. Strong bilingual capability.
researchanalysislong-contextbilingual
minimax-m2.7 cloud
largemedium speed
Strong creative writing, nuanced prose, and Turkish content. Good fallback for a different creative angle.
creative-writingturkishprose
gemma4:31b cloud
31Bmedium speed
Google's latest — excellent instruction following, clean structured output. Great for reliable general tasks.
instruction-followingstructured-outputreliable
glm-5.1 / 4.7 cloud
largemedium speed
Zhipu GLM series — strong multilingual including Turkish. Good for long-form content, creative tasks, and vision (5.1).
multilingualturkishcreativevision
gemma3:27b cloud
27Bmedium speed
Google Gemma 3 — solid all-rounder. Good balance of speed and quality for general conversation.
generalbalancedchat
minimax-m2.5 cloud
largefast
Fast MiniMax variant — good for quick creative tasks and alternative perspectives.
creativefastalternative

Coding and Technical 0

qwen3-coder:480b cloud best for code
480B MoEmedium
Dedicated coding model — specialized for complex code generation, debugging large codebases, and systems engineering.
code-generationdebuggingarchitecturerefactoring
deepseek-v3.2 cloud
~670B MoEfast
DeepSeek V3 — excellent for code review, architectural decisions, refactoring, and technical deep-dives.
code-reviewarchitecturerefactoringtechnical
devstral-2:123b cloud
123Bmedium
Purpose-built for technical reasoning — math, logic puzzles, scientific analysis. Great for hard problems.
mathlogicsciencetechnical-deep-dive
qwen2.5-coder:7b local
7B4.7 GB
Local coding model — no API needed, zero latency. Good for quick code completions and offline work.
offlinelightweightquick-code
qwen2.5:7b-instruct local
7B2.2 GB
Lighter local instruct model — good general-purpose fallback when cloud is unavailable.
offlinefallbacklight

Reasoning and Deep Thought 0

cogito-2.1:671b cloud heavy duty
671Bslow
The biggest model — 671B parameters. For the hardest problems: deep scientific reasoning, mathematical proofs, complex multi-step logic.
deep-reasoningscientificmath-proofsheavy
kimi-k2-thinking cloud
largeslow
Kimi with explicit chain-of-thought — ideal for multi-step problem solving, structured reasoning, and step-by-step verification.
chain-of-thoughtstep-by-stepverification

Vision and Multimodal 0

qwen3-vl:235b cloud best vision
235Bmedium
Dedicated vision-language model — understands images, screenshots, diagrams, and documents. Perfect for OCR and visual QA.
visionOCRscreenshotsdiagramsdocument-analysis
glm-5.1 cloud
largemedium
GLM 5.1 has built-in vision support — good all-rounder for both text and image tasks without switching models.
visionmultimodalgeneral

Local Models (Offline) 0

gemma4:e2b local best local
~27B7.2 GB
Best local model — Google Gemma 4. Solid general performance, good instruction following, completely offline.
offlinegeneralbest-local
qwen3:14b local
14B9.3 GB
Largest local Qwen — good reasoning capability for offline use. Requires ~10GB RAM.
offlinereasoningqwen
gemma3:12b local
12B8.1 GB
Google Gemma 3 — good general local model. Balanced quality and resource usage.
offlinebalancedgemma
qwen3: 8b / 4b / 1.7b / 0.6b local
0.6B-8B0.5-5.2 GB
Qwen 3 family from ultra-light (0.6B, 522MB) to decent (8B, 5.2GB). Pick based on your RAM.
offlinelightweightflexibleembedded
llama3.2:1b / 3b local
1-3B1.3-2 GB
Meta's small models — fast inference, tiny footprint. Good for simple tasks and constrained environments.
offlinetinyfastfallback
phi3 local
3.8B2.2 GB
Microsoft Phi-3 — surprisingly capable for its size. Lightweight general tasks.
offlinesmallphi
mistral-nemo / llama3.1 / codellama / mistral:7b local
7B-12B3.8-7.1 GB
Legacy local models — still functional but superseded by newer Qwen 3 and Gemma 4.
offlinelegacyfallback

Quick Pick Guide

TaskBest ModelRunner-upOffline Option
Quick chat, daily usedeepseek-v4-flashgemma4:31bgemma4:e2b
Complex codingqwen3-coder:480bdeepseek-v3.2qwen2.5-coder:7b
Deep reasoningcogito-2.1:671bdeepseek-v4-proqwen3:14b
Vision / Imagesqwen3-vl:235bglm-5.1--
Turkish contentglm-5.1minimax-m2.7--
Creative writingminimax-m2.7gemma4:31bgemma4:e2b
Code reviewdeepseek-v4-proqwen3-coder:480bqwen3:14b
Lightweight offlinegemma4:e2bqwen3:8bphi3 / llama3.2:1b