AI Leaderboard

This leaderboard tracks how accurately each AI model ranks products across all categories. Accuracy measures how closely each AI's picks align with the multi-model consensus.
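The page does not publish its scoring formula, so the following is only a minimal sketch of one way alignment with a multi-model consensus could be computed. It assumes the consensus for a category is simply the most common pick across models, and that a model's score is the share of its picks that match. The function names, the `picks_by_model` structure, and the sample data are all hypothetical. Note that the Accuracy column does not equal Consensus/Total in the table, so the site's real metric evidently differs from this simple ratio.

```python
from collections import Counter


def consensus_picks(picks_by_model):
    """Return the most common pick per category across all models (assumed rule)."""
    by_category = {}
    for picks in picks_by_model.values():
        for category, product in picks.items():
            by_category.setdefault(category, Counter())[product] += 1
    # most_common(1) resolves ties in favor of the pick seen first
    return {c: counts.most_common(1)[0][0] for c, counts in by_category.items()}


def agreement_rate(model_picks, consensus):
    """Share of a model's picks that match the consensus pick."""
    if not model_picks:
        return 0.0
    hits = sum(1 for c, p in model_picks.items() if consensus.get(c) == p)
    return hits / len(model_picks)


# Hypothetical sample data: three models, two product categories.
picks_by_model = {
    "model_a": {"laptops": "X", "phones": "P"},
    "model_b": {"laptops": "X", "phones": "Q"},
    "model_c": {"laptops": "Y", "phones": "Q"},
}

consensus = consensus_picks(picks_by_model)
rates = {m: agreement_rate(p, consensus) for m, p in picks_by_model.items()}
```

In this toy data the consensus is X for laptops and Q for phones, so `model_b` agrees on both picks while the other two agree on one each.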

Active AIs: 21
Average Accuracy: 67.8%
Best Score: 78.0%

Rank  AI Model             Accuracy  Projects  Consensus/Total  Badges
🥇    Nemotron 3 Super     78.0%     1         9 / 10
🥈    Qwen3.5 397B         77.3%     7         57 / 60          🏅 Bullseye
🥉    Palmyra X5           76.4%     7         70 / 71
#4    Jamba 1.7            75.7%     9         99 / 102
#5    Claude Opus 4.6      75.4%     21        246 / 259        🏅 Diamond Standard, 🏅 Bullseye
#6    Grok 4.20            73.9%     9         80 / 81          🏅 Bullseye
#7    Kimi K2.5            72.9%     9         98 / 104
#8    Claude Sonnet 4.6    72.7%     6         52 / 66          🏅 Bullseye
#9    Mistral Large        72.4%     20        231 / 247        🏅 Bullseye, 🏅 Diamond Standard
#10   GPT-4o               72.2%     5         40 / 40
#11   Gemini 3.1 Pro       71.7%     7         62 / 64
#12   Mini Max M2.1        71.5%     21        236 / 251        🏅 Diamond Standard, 🏅 Bullseye
#13   GPT-5.4              71.3%     12        106 / 125
#14   Command A            70.8%     9         83 / 84
#15   Grok                 70.7%     9         107 / 110        🏅 Bullseye
#16   DeepSeek             70.4%     21        226 / 242        🏅 Bullseye, 🏅 Diamond Standard
#17   Solar Pro 3          69.3%     12        116 / 127
#18   Inflection 3         68.3%     4         21 / 21
#19   GPT - 5.2            68.0%     7         76 / 80          🏅 Bullseye
#20   Gemini 2.5 Flash     68.0%     9         114 / 116
#21   GLM 4.7              67.4%     12        140 / 155        🏅 Bullseye, 🏅 Diamond Standard
#22   meta.ai (Llama)      67.2%     9         106 / 106
#23   Gemini 3.1 Pro       66.6%     11        110 / 127
#24   Nemotron 3 Nano      66.4%     1         6 / 7
#25   Claude Sonnet 4.5    65.8%     9         106 / 110
#26   Phi 4                65.3%     4         38 / 41
#27   DeepSeek R1          64.8%     6         87 / 87
#28   Amazon Nova Premier  63.2%     11        77 / 84
#29   Gemini 2.5 Pro       62.8%     8         86 / 93
#30   Grok 4.1 Fast        62.6%     3         23 / 30
#31   Llama 4 Maverick     62.3%     21        223 / 243
#32   Codestral            62.2%     6         56 / 57
#33   Gemini 3 Flash       59.0%     4         35 / 50
#34   Seed 1.6 Flash       58.5%     18        194 / 208
#35   Qwen3 235B           53.3%     5         65 / 66          🏅 Bullseye
#36   Perplexity           44.7%     2         26 / 27
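
The Average Accuracy and Best Score figures can be cross-checked against the Accuracy column. The sketch below assumes the average is an unweighted mean over all 36 ranked rows, rounded half-up to one decimal place. The rounding mode and the `summarize` helper are assumptions for illustration, not something the page states. Under those assumptions the listed values reproduce the published 67.8% and 78.0%.

```python
from decimal import Decimal, ROUND_HALF_UP

# Accuracy column copied from the 36 ranked rows, in rank order.
ACCURACIES = [
    "78.0", "77.3", "76.4", "75.7", "75.4", "73.9", "72.9", "72.7", "72.4",
    "72.2", "71.7", "71.5", "71.3", "70.8", "70.7", "70.4", "69.3", "68.3",
    "68.0", "68.0", "67.4", "67.2", "66.6", "66.4", "65.8", "65.3", "64.8",
    "63.2", "62.8", "62.6", "62.3", "62.2", "59.0", "58.5", "53.3", "44.7",
]


def summarize(accuracies):
    """Return (unweighted mean rounded to 0.1, best score) as Decimals."""
    values = [Decimal(a) for a in accuracies]
    # Decimal avoids binary floating-point drift when rounding at the .05 boundary.
    mean = (sum(values) / len(values)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )
    return mean, max(values)


mean_accuracy, best_score = summarize(ACCURACIES)
print(f"Average Accuracy: {mean_accuracy}%")  # Average Accuracy: 67.8%
print(f"Best Score: {best_score}%")           # Best Score: 78.0%
```

The mean is unweighted, so a model scored on a single project counts as much as one scored on 21. Weighting by the Projects column would give a different figure.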