01

AI Models

Compare 90 AI models across benchmarks. Explore performance metrics, capabilities, and find the right model for your use case.

18 Multimodal
65 Large Language Models
4 Vision Models
2 Embedding Models
1 Audio Model
02

Benchmark Leaderboards

MMLU Professional

reasoning
Higher is better

Instruction Following Eval

reasoning
Higher is better

WinoGrande

reasoning
Higher is better

Mostly Basic Python Problems

coding
Higher is better

MathVista

math
Higher is better

Massive Text Embedding Benchmark

embedding
Higher is better

LibriSpeech Clean Test

speech
Lower is better

LibriSpeech Other Test

speech
Lower is better
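
Because the two speech benchmarks are marked "Lower is better" while the rest are "Higher is better", a ranking has to take each benchmark's direction into account. The following is a minimal sketch of that idea, not this site's implementation: the `Benchmark` class, the `rank` helper, and the placeholder model names and scores are all assumptions made for illustration. Only the benchmark names, categories, and directions come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    name: str
    category: str
    higher_is_better: bool


# Names, categories, and directions copied from the leaderboard list above.
BENCHMARKS = [
    Benchmark("MMLU Professional", "reasoning", True),
    Benchmark("Instruction Following Eval", "reasoning", True),
    Benchmark("WinoGrande", "reasoning", True),
    Benchmark("Mostly Basic Python Problems", "coding", True),
    Benchmark("MathVista", "math", True),
    Benchmark("Massive Text Embedding Benchmark", "embedding", True),
    Benchmark("LibriSpeech Clean Test", "speech", False),
    Benchmark("LibriSpeech Other Test", "speech", False),
]


def rank(benchmark: Benchmark, scores: dict[str, float]) -> list[str]:
    """Return model names ordered best-first for this benchmark (hypothetical helper)."""
    # Sort descending when a higher score is better, ascending otherwise.
    return sorted(scores, key=lambda m: scores[m], reverse=benchmark.higher_is_better)


if __name__ == "__main__":
    # Placeholder model names and scores, invented purely for illustration.
    scores = {"model-a": 4.2, "model-b": 2.9, "model-c": 3.5}
    for bench in (BENCHMARKS[2], BENCHMARKS[6]):
        print(bench.name, rank(bench, scores))
```

Storing the direction as a flag on each benchmark keeps the sorting logic in one place, so adding another "lower is better" metric would not require a second code path.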
03

Browse by Model Type

Large Language Models (65 models)

Claude 3 Haiku

Claude 3 Haiku is Anthropic’s smallest and fastest Claude 3 model, optimized for low-latency, high-throughput text and lightweight vision tasks. It ta...

Claude 3 Opus

Claude 3 Opus is Anthropic’s flagship Claude 3 series large language model optimized for high-level reasoning, complex analysis, and long-context unde...

Claude 3 Sonnet

Claude 3 Sonnet is Anthropic's balanced Claude 3-series model that targets a strong mix of intelligence, speed, and cost-efficiency for general-purpos...

Claude 3.5 Haiku

Claude 3.5 Haiku is Anthropic’s lightweight Claude 3.5-series model optimized for speed and low cost while retaining strong reasoning and coding capab...

Claude 3.5 Opus

Claude 3.5 Opus is Anthropic’s flagship Claude 3.5 series model optimized for high-end reasoning, coding, and complex analysis. It offers strong perfo...

Claude 3.5 Sonnet

Claude 3.5 Sonnet is Anthropic’s mid-tier Claude 3.5-series large language model, available through the Claude.ai web interface and the Anthropic API...

Claude 4 Opus

Anthropic's Claude 4 flagship model with advanced reasoning...

Claude 4 Sonnet

Balanced Claude 4 model for everyday tasks...

Claude Haiku 4.5

Claude Opus 4.1

Enhanced Claude 4 with improved agentic capabilities...

Claude Opus 4.5

Anthropic's most capable model with extended thinking...

Claude Opus 4.7

Claude Sonnet 4.5

Claude Sonnet 4.5 with breakthrough coding performance...

Claude Sonnet 4.6

Codestral

Codestral is Mistral AI’s open-weight, 22B-parameter code-specialized language model optimized for software development workflows such as code generat...

Cohere Embed v3

Cohere Embed v3 is a production text embedding model for search and retrieval that maps queries and documents to vectors so they can be compared by relevance...

Command R+

Cohere Command R+ is a production-grade large language model optimized for enterprise workloads, retrieval-augmented generation (RAG), and tool use. I...

Command R+ 08-2024

Updated Command R+ with improved performance...

DeepSeek R1

DeepSeek's reasoning model trained with pure RL...

DeepSeek R1-0528

Updated DeepSeek R1 with improved math reasoning...

DeepSeek V3

State-of-the-art open-source MoE model with 671B parameters...

DeepSeek V3.1

DeepSeek V3.1 with improved capabilities...

DeepSeek V3.2

DeepSeek V3.2 experimental release...

ERNIE 5.0

Baidu's ERNIE 5.0 preview...

GLM-4.5

Zhipu's GLM-4.5 bilingual model...

GLM-4.6

Zhipu's latest GLM model...

GPT-3.5 Turbo

GPT-3.5 Turbo is an OpenAI chat-optimized language model, exposed through OpenAI's hosted API as a general-purpose text-in/text-out service. It supports ...

GPT-4

GPT-4 is a flagship OpenAI large language model and the base of the GPT-4 family. It is designed for ...

GPT-4.1

OpenAI GPT-4.1 with 1M context window...

GPT-4.5

OpenAI's GPT-4.5 preview model...

GPT-5

OpenAI's GPT-5 with unified reasoning capabilities...

GPT-5 High

GPT-5 with maximum reasoning effort...

GPT-5.1

Updated GPT-5 with improved capabilities...

GPT-5.4 mini

GPT-5.4 nano

GPT-5.5

Gemini 2.0 Flash Thinking

Gemini 2.0 Flash with extended reasoning capabilities...

Grok 4

xAI's Grok 4 with breakthrough reasoning capabilities...

Grok 4.1

Updated Grok 4 with enhanced thinking...

Grok 4.1 Thinking

Grok 4.1 with extended reasoning mode...

Grok-2

xAI's flagship model with real-time knowledge...

Grok-2 mini

Efficient version of Grok-2...

Kimi K2

Moonshot's Kimi K2 with advanced reasoning...

Kimi K2 Thinking

Kimi K2 with extended thinking capabilities...

Llama 3 70B

Meta Llama 3 is Meta’s third-generation open large language model family, released in 8B and 70B parameter sizes. It is optimized for instruction foll...

Llama 3.1 70B

Llama 3.1 70B is a large-scale open-weight language model from Meta designed to provide near frontier-level performance in reasoning, coding, and gene...

Llama 3.3 70B

Meta's efficient 70B model, reported to approach Llama 3.1 405B performance...

Llama 4 Maverick

Meta's Llama 4 Maverick with 1M context...

Mistral 7B

Mistral 7B is a dense 7B-parameter open-weight language model and Mixtral 8x7B is a sparse Mixture-of-Experts model with eight 7B experts, both from M...

Mistral Large

Mistral Large is Mistral AI's flagship proprietary language model, exposed via the Mistral API...

Mistral Large 2

Mistral's flagship model with 123B parameters...

Mistral Large 3

Mistral's flagship 2025 model...

Mistral Small

Mistral Small is a lightweight proprietary instruction-tuned language model from Mistral AI, designed to offer strong reasoning and coding performance...

Mixtral 8x22B

Mistral 7B and Mixtral 8x22B are open-weight large language models from Mistral AI, designed for efficient, high-quality text generation and reasoning...

Mixtral 8x7B

Mixtral 8x7B is a sparse mixture-of-experts large language model by Mistral AI, combining eight 7B expert networks with conditional routing for high e...

Nemotron 70B

NVIDIA's open model optimized for inference...

OpenAI o1

OpenAI's reasoning model with extended thinking capabilities...

OpenAI o1-mini

Smaller, cost-efficient reasoning model optimized for coding...

OpenAI o1-preview

Preview version of OpenAI's o1 reasoning model...

Qwen 2.5 72B

Alibaba's flagship open model with strong multilingual support...

Qwen 2.5 Coder 32B

Specialized coding model rivaling GPT-4o on code tasks...

Qwen 3 Max

Alibaba's Qwen 3 flagship model...

Yi Lightning

01.AI's ultra-fast inference model...

o3

OpenAI's o3 advanced reasoning model...

o4-mini

Efficient reasoning model optimized for speed...