LLM Leaderboard - Open WebUI

Leaderboard

366 models

Rank	Model	Rating	Won (% and count)	Lost (% and count)
1	askleion-test	1180	100.0% 18	0.0% 0
2	anthropic.claude-3-haiku-20240307	1166	63.9% 85	36.1% 48
3	llama3.2:3b-instruct-fp16	1163	64.7% 55	35.3% 30
4	deepseek-coder-v2:16b-lite-instruct-q8_0	1161	93.8% 15	6.3% 1
5	qwen:7b	1158	80.8% 21	19.2% 5
6	llama-3.2-90b-text-preview	1157	54.9% 50	45.1% 41
7	hf.co/theprint/VanRossum-Qwen2.5-Coder-3B:Q8_0	1154	76.9% 20	23.1% 6
8	gpt-4o	1153	60.3% 35	39.7% 23
9	gpt-4o-mini	1148	63.7% 135	36.3% 77
10	gemma2:9b-instruct-q5_K_M	1138	54.7% 35	45.3% 29
11	ReWiz-Llama-3.2-3B.Q8_0.gguf:latest	1130	66.7% 28	33.3% 14
12	llama3.1:8b-instruct-q5_K_M	1130	68.2% 30	31.8% 14
13	grok-beta	1126	65.9% 56	34.1% 29
14	llama3.1:8b-instruct-q8_0	1123	62.2% 56	37.8% 34
15	mistralai/mistral-large-2411	1111	100.0% 7	0.0% 0
16	openrouter-.google/gemini-pro-1.5	1109	59.0% 23	41.0% 16
17	Vo.claude-3-5-sonnet-20240620	1102	81.8% 9	18.2% 2
18	nemotron-mini:4b-instruct-fp16	1102	48.6% 35	51.4% 37
19	smollm2-1.7b-instruct	1099	66.7% 16	33.3% 8
20	google/palm-2-codechat-bison	1096	100.0% 6	0.0% 0
21	marco-o1:latest	1094	56.8% 21	43.2% 16
22	based-dolphin-mixtral:latest	1093	87.5% 7	12.5% 1
23	Vo.claude-3-opus-20240229	1092	81.8% 9	18.2% 2
24	closex/neuraldaredevil-8b-abliterated:Q6_K	1089	67.4% 31	32.6% 15
25	qwen2.5:14b-instruct-q3_K_S	1086	57.9% 22	42.1% 16
26	Vo.gemini-1.5-pro-latest	1083	81.8% 9	18.2% 2
27	gemini-exp-1121	1082	100.0% 5	0.0% 0
28	chatgpt-4o-latest	1072	50.6% 41	49.4% 40
29	x-ai/grok-beta	1068	85.7% 6	14.3% 1
30	llama3:latest	1067	64.3% 9	35.7% 5
31	phi4:latest	1066	73.7% 14	26.3% 5
32	llama3.2-vision:latest	1066	75.0% 6	25.0% 2
33	Boptruth-NeuralMonarch-7B-unsloth.Q6_K.gguf:latest	1066	56.5% 13	43.5% 10
34	gpt-4-turbo	1066	100.0% 4	0.0% 0
35	mistral-nemo:12b-instruct-2407-q3_K_M	1064	59.4% 41	40.6% 28
36	llama3.2:3b-instruct-q8_0	1058	60.6% 66	39.4% 43
37	ReWiz-7B.Q4_K_S.gguf:latest	1057	62.2% 28	37.8% 17
38	llama3.1:8b-instruct-fp16	1057	83.3% 5	16.7% 1
39	granite3.1-dense:8b-instruct-q5_K_M	1055	77.8% 7	22.2% 2
40	llama3.1-latest	1055	83.3% 5	16.7% 1
41	mistral-nemo:12b-instruct-2407-q5_K_M	1054	64.0% 16	36.0% 9
42	qwen2.5-coder:latest	1050	58.8% 10	41.2% 7
43	grok-vision-beta	1050	100.0% 3	0.0% 0
44	claude-3-haiku-with-system-prompt	1048	50.0% 3	50.0% 3
45	granite3.1-moe:3b-instruct-q5_K_M	1046	100.0% 3	0.0% 0
46	yi-lightning	1046	64.0% 16	36.0% 9
47	gemma-7b-it	1046	80.0% 4	20.0% 1
48	misty	1046	100.0% 3	0.0% 0
49	mistralclaudefree20b-clone	1046	100.0% 3	0.0% 0
50	mistral:7b	1043	80.0% 4	20.0% 1
51	askleion-v1-qwen-25-7b	1042	100.0% 4	0.0% 0
52	anthropic/claude-3.5-sonnet	1042	66.7% 6	33.3% 3
53	qwen2.5:latest	1041	56.3% 9	43.8% 7
54	deepseek-coder-v2:latest	1040	66.7% 6	33.3% 3
55	gemini-exp-1114	1040	80.0% 4	20.0% 1
56	starcoder2:3b	1038	100.0% 2	0.0% 0
57	openai/gpt-4o	1037	66.7% 6	33.3% 3
58	askleion-v1-gemma9b	1036	66.7% 16	33.3% 8
59	Llama-3.1-Nemotron-70B-Instruct:latest	1036	80.0% 4	20.0% 1
60	mistral-nemo:latest	1035	50.9% 28	49.1% 27
61	llama3.1:70b	1034	55.0% 11	45.0% 9
62	vanilj/llama3.1-70b-iquants:IQ4_XS	1034	75.0% 3	25.0% 1
63	Qwen/Qwen2.5-72B-Instruct	1033	100.0% 2	0.0% 0
64	claude-3-5-sonnet-20241022	1032	100.0% 2	0.0% 0
65	ollama.com/library/medllama2:latest	1032	100.0% 2	0.0% 0
66	ollama.com/library/llama2-uncensored:7b	1032	100.0% 2	0.0% 0
67	ReWiz-Nemo-12B-Instruct-GGUF.Q4_K_M.gguf:latest	1032	51.5% 17	48.5% 16
68	Qwen/Qwen2.5-72B-Instruct-128K	1032	75.0% 3	25.0% 1
69	Qwen/Qwen2-7B-Instruct	1032	100.0% 2	0.0% 0
70	phi3.5:3.8b-mini-instruct-q8_0	1031	100.0% 2	0.0% 0
71	nemotron-mini:4b-instruct-q8_0	1031	100.0% 2	0.0% 0
72	senecallm	1031	100.0% 2	0.0% 0
73	vanessa	1031	100.0% 2	0.0% 0
74	multi-agent:latest	1030	55.6% 5	44.4% 4
75	gemma2:latest	1029	60.0% 6	40.0% 4
76	llama3.2:3b	1029	60.0% 9	40.0% 6
77	hermes3:latest	1029	75.0% 3	25.0% 1
78	hf.co/bartowski/Mistral-Small-Instruct-2409-GGUF:IQ3_M	1026	60.0% 6	40.0% 4
79	llama3:instruct	1022	100.0% 1	0.0% 0
80	qwen2.5:32b	1021	55.6% 5	44.4% 4
81	anthropic.claude-3-5-sonnet-20240620	1021	100.0% 1	0.0% 0
82	qwen2.5:14b	1018	57.1% 4	42.9% 3
83	argon	1018	48.8% 20	51.2% 21
84	llama3-uncensored:latest	1017	100.0% 1	0.0% 0
85	Athena:latest	1016	62.5% 5	37.5% 3
86	deepseek-r1:7b	1016	100.0% 1	0.0% 0
87	falcon3:7b	1016	100.0% 1	0.0% 0
88	llama3.3:70b-instruct-q4_K_M	1016	100.0% 1	0.0% 0
89	sonnet	1016	100.0% 1	0.0% 0
90	anthropic.claude-3-5-sonnet-latest	1016	100.0% 1	0.0% 0
91	Qwen/QwQ-32B-Preview	1016	100.0% 1	0.0% 0
92	doubao-pro-128k	1016	100.0% 1	0.0% 0
93	claude-3-5-sonnet-20240620	1016	66.7% 2	33.3% 1
94	llama-3.2-90b-vision-preview	1016	41.4% 12	58.6% 17
95	reflection	1016	100.0% 1	0.0% 0
96	gpt-4-turbo-2024-04-09	1016	100.0% 1	0.0% 0
97	hf.co/TheBloke/dolphin-2.6-mistral-7B-dpo-laser-GGUF:Q5_K_M	1016	100.0% 1	0.0% 0
98	llama3.2:70b-text-fp16	1016	100.0% 1	0.0% 0
99	mythomax-l2-13b	1016	100.0% 1	0.0% 0
100	playground-v2.5	1016	100.0% 1	0.0% 0
101	blackboxai-pro	1016	100.0% 1	0.0% 0
102	wizardlm-2-8x22b	1016	100.0% 1	0.0% 0
103	anthropic/claude-3.5-sonnet:beta	1016	100.0% 1	0.0% 0
104	hermes3:8b-llama3.1-q8_0	1016	100.0% 1	0.0% 0
105	test	1016	100.0% 1	0.0% 0
106	qwen2.5:1.5b	1015	57.1% 4	42.9% 3
107	llama3.1:latest	1015	60.2% 97	39.8% 64
108	hermes-3-llama-3.2-3b	1015	100.0% 1	0.0% 0
109	mistral:7b-instruct	1015	66.7% 2	33.3% 1
110	coders	1015	66.7% 2	33.3% 1
111	mistral-small:latest	1014	53.3% 8	46.7% 7
112	gemini-1.5-flash-exp-0827	1013	100.0% 1	0.0% 0
113	qwen2.5:7b	1011	45.7% 42	54.3% 50
114	gpt-4o-mini-2024-07-18	1009	50.0% 7	50.0% 7
115	small-models	1007	50.0% 3	50.0% 3
116	llama-3.1-8b-instruct:latest	1005	50.0% 4	50.0% 4
117	nemotron:latest	1002	50.0% 7	50.0% 7
118	mistral_small_obliterated_22b:latest	1002	50.0% 2	50.0% 2
119	falcon-mamba-instruct-7b-0	1001	50.0% 1	50.0% 1
120	phi3:3.8b	1001	50.0% 2	50.0% 2
121	mental-health-assistant	1001	50.0% 1	50.0% 1
122	incept5/llama3.1-claude:latest	1000	50.0% 2	50.0% 2
123	hf.co/bartowski/Llama-3.1-WhiteRabbitNeo-2-8B-GGUF:Q6_K_L	1000	50.0% 1	50.0% 1
124	t.e-8.1-iq-imatrix-request	1000	50.0% 1	50.0% 1
125	athenaa	1000	50.0% 1	50.0% 1
126	chatgpt-4-uncensored:latest	1000	50.0% 2	50.0% 2
127	shirka-ict-o1	1000	50.0% 1	50.0% 1
128	llama-3.2-11b-vision-preview	1000	50.0% 2	50.0% 2
129	ollama.com/library/meditron:7b	1000	50.0% 1	50.0% 1
130	gemini-1.5-pro-002	1000	50.0% 1	50.0% 1
131	visual_tree_of_thoughts.mcts-nomic-embed-text:latest	1000	50.0% 1	50.0% 1
132	sarah-lovely-caring-girlfriend:latest	1000	50.0% 3	50.0% 3
133	starcoder2:15b	1000	50.0% 5	50.0% 5
134	deal-reg-generator	1000	45.5% 10	54.5% 12
135	CognitiveComputations/dolphin-mistral-nemo:12b-v2.9.3-Q5_K_M	1000	50.0% 2	50.0% 2
136	testout-helper	1000	50.0% 1	50.0% 1
137	RasLike-Qwen2.5-3B.Q4_K_M.gguf:latest	999	50.0% 1	50.0% 1
138	fever-dreams	999	50.0% 1	50.0% 1
139	gemma2:2b	998	27.7% 18	72.3% 47
140	original-models	998	50.0% 1	50.0% 1
141	gemma2:27b	996	50.0% 2	50.0% 2
142	llama3.2:latest	995	58.5% 48	41.5% 34
143	qwen2.5:3b	994	53.8% 14	46.2% 12
144	arena	993	40.0% 2	60.0% 3
145	llama3.2-vision:11b	990	44.4% 4	55.6% 5
146	hf.co/theprint/RuDolph-Hermes-7B-Q6_K-GGUF:latest	990	43.8% 14	56.3% 18
147	atsi-ai	989	0.0% 0	100.0% 1
148	qwen2.5-coder:7b-instruct-q4_0	988	0.0% 0	100.0% 1
149	llama3.2:3b-instruct-q5_K_M	988	42.9% 3	57.1% 4
150	llama3-70b-8192	988	57.1% 52	42.9% 39
151	Vo.gpt-4o	987	40.0% 2	60.0% 3
152	arena-random-model	987	33.3% 1	66.7% 2
153	Qwen/Qwen2.5-Math-72B-Instruct	987	0.0% 0	100.0% 1
154	llama-3.2-3b-preview	987	33.3% 1	66.7% 2
155	Vo.flux-pro-max	986	40.0% 2	60.0% 3
156	openrouter-.google/gemini-exp-1121:free	986	0.0% 0	100.0% 1
157	hank	985	0.0% 0	100.0% 1
158	intui	985	0.0% 0	100.0% 1
159	o1-preview	984	0.0% 0	100.0% 1
160	llama-3.3-70b-versatile	984	0.0% 0	100.0% 1
161	gpt-4o-2024-05-13	984	0.0% 0	100.0% 1
162	Vo.gemini-ultra	984	40.0% 2	60.0% 3
163	codellama:latest	984	0.0% 0	100.0% 1
164	haiku	984	0.0% 0	100.0% 1
165	ERNIE-Speed-128K	984	0.0% 0	100.0% 1
166	leon	984	0.0% 0	100.0% 1
167	hf.co/TheBloke/dolphin-2.6-mistral-7B-dpo-laser-GGUF:Q6_K	984	0.0% 0	100.0% 1
168	gemma2-9b-it	984	0.0% 0	100.0% 1
169	cohere/command-r-03-2024	984	0.0% 0	100.0% 1
170	mid-large-perf	984	0.0% 0	100.0% 1
171	over	984	0.0% 0	100.0% 1
172	qwen2.5:14b-instruct-q4_K_S	984	40.0% 8	60.0% 12
173	meta-llama/llama-3.1-405b-instruct:free	984	0.0% 0	100.0% 1
174	anthropic/claude-1.2	984	0.0% 0	100.0% 1
175	nousresearch/nous-hermes-2-mixtral-8x7b-dpo	984	0.0% 0	100.0% 1
176	general	984	0.0% 0	100.0% 1
177	microsoft/wizardlm-2-8x22b	984	0.0% 0	100.0% 1
178	openai/gpt-4o-2024-08-06	984	0.0% 0	100.0% 1
179	google/palm-2-chat-bison	984	0.0% 0	100.0% 1
180	test2	984	0.0% 0	100.0% 1
181	deepseek-coder:6.7b-base	983	25.0% 1	75.0% 3
182	anamnesetest	983	42.9% 3	57.1% 4
183	internclaudefree20b	983	0.0% 0	100.0% 1
184	joey	983	33.3% 1	66.7% 2
185	ministral-8b-latest	982	50.0% 5	50.0% 5
186	hf.co/theprint/Llama-3.2-3B-VanRossum:Q8_0	981	48.6% 18	51.4% 19
187	internlm2:20b	981	42.9% 3	57.1% 4
188	lamma-resumos-de-vdeo	980	48.1% 13	51.9% 14
189	qwen2.5:72b-instruct-q4_K_S	977	46.7% 7	53.3% 8
190	CursorCore-Yi-9B.Q6_K.gguf:latest	976	33.3% 2	66.7% 4
191	CognitiveComputations/dolphin-phi-3:medium-v2.9.2-q5_k_m	973	0.0% 0	100.0% 2
192	wizard-vicuna-uncensored:13b	972	0.0% 0	100.0% 2
193	glm-4-plus	971	0.0% 0	100.0% 2
194	home-turf	970	39.0% 16	61.0% 25
195	WorldBuilder-7B-GGUF.Q5_K_M.gguf:latest	970	44.4% 8	55.6% 10
196	qwen2.5:72b-instruct-q4_0	970	0.0% 0	100.0% 2
197	codegemma:latest	969	0.0% 0	100.0% 2
198	CleverBoi-Llama-3.2-3B-Instruct.Q8_0.gguf:latest	969	49.1% 27	50.9% 28
199	llama-3.1-70b-versatile	968	20.0% 1	80.0% 4
200	THUDM/glm-4-9b-chat	968	0.0% 0	100.0% 2
201	llama-3.2-3b-instruct	967	25.0% 1	75.0% 3
202	hf.co/theprint/CleverBoi-Llama-3.1-8B-Instruct:Q5_K_M	966	37.5% 3	62.5% 5
203	llava:latest	966	37.5% 3	62.5% 5
204	openrouter-.anthropic/claude-3.5-sonnet	965	47.7% 21	52.3% 23
205	gemini-1.5-pro-exp-0827	965	0.0% 0	100.0% 2
206	mixtral-8x7b-32768	965	39.0% 30	61.0% 47
207	openhermes:7b-mistral-v2-q5_K_M	965	44.8% 13	55.2% 16
208	llama3.1:8b-instruct-q4_0	965	0.0% 0	100.0% 2
209	gpt-4o-2024-08-06	964	28.6% 2	71.4% 5
210	openrouter-.meta-llama/llama-3.1-405b-instruct	961	46.8% 22	53.2% 25
211	CleverBoi-Nemo-12B-v2.Q4_K_S.gguf:latest	960	45.2% 14	54.8% 17
212	hf.co/theprint/ReWiz-Qwen-2.5-14B:Q4_K_S	959	33.3% 2	66.7% 4
213	doubao-pro-4k	957	0.0% 0	100.0% 3
214	askleion-v1-mistral-7bv03q4km	955	37.8% 17	62.2% 28
215	granite-3.1-8b-instruct	954	42.1% 8	57.9% 11
216	ollama.com/bengt0/em_german_leo_mistral:latest	953	0.0% 0	100.0% 3
217	minicpm-v:latest	953	27.3% 12	72.7% 32
218	big-models	952	0.0% 0	100.0% 3
219	gemma2:9b	952	43.1% 31	56.9% 41
220	llama3.2-vision:90b	948	25.0% 2	75.0% 6
221	llama3.1:8b	945	16.7% 1	83.3% 5
222	google/gemini-exp-1114	944	33.3% 4	66.7% 8
223	doubao-pro-32k	943	0.0% 0	100.0% 4
224	llama3.2:1b	942	51.0% 26	49.0% 25
225	exclude	940	0.0% 0	100.0% 4
226	gpt-4	939	0.0% 0	100.0% 4
227	llama-3.2-1b-preview	935	44.0% 11	56.0% 14
228	falcon3:10b	934	34.5% 10	65.5% 19
229	deepseek-coder-v2:16b-lite-instruct-q4_0	933	48.1% 26	51.9% 28
230	qwen2.5-coder-1.5b-instruct-mlx	927	35.0% 7	65.0% 13
231	gemini-1.5-pro	924	0.0% 0	100.0% 5
232	hf.co/theprint/ReWiz-Llama-3.2-1B:latest	919	41.3% 19	58.7% 27
233	gemini-1.5-flash-002	917	11.1% 1	88.9% 8
234	gemma2-2b	915	33.3% 8	66.7% 16
235	phi3:latest	914	55.2% 37	44.8% 30
236	askleion-v1-mistral-nemo	914	33.3% 8	66.7% 16
237	mistral:7b-instruct-q4_K_S	914	40.0% 16	60.0% 24
238	mistral:latest	903	38.7% 12	61.3% 19
239	anthropic/claude-3-sonnet:beta	901	11.1% 1	88.9% 8
240	llama-3.1-8b-instant	900	33.9% 19	66.1% 37
241	deepseek-ai/DeepSeek-V2.5	897	9.1% 1	90.9% 10
242	gpt-3.5-turbo	896	10.0% 1	90.0% 9
243	google/gemini-pro-1.5	896	0.0% 0	100.0% 7
244	x/llama3.2-vision:latest	889	38.2% 13	61.8% 21
245	Vo.o1-preview	885	9.1% 1	90.9% 10
246	deepseek-coder:latest	883	0.0% 0	100.0% 11
247	Vo.o1-mini	881	9.1% 1	90.9% 10
248	granite3-dense:8b-instruct-q5_K_M	875	34.4% 22	65.6% 42
249	llama3-8b-8192	874	24.2% 31	75.8% 97
250	ReWiz-Worldbuilder-7B-GGUF.Q5_K_M.gguf:latest	872	28.6% 10	71.4% 25
251	mistral-nemo:12b-instruct-2407-q8_0	870	47.6% 40	52.4% 44
252	arena-model	864	39.6% 59	60.4% 90
253	openrouter-.mistralai/mistral-large-2411	863	33.3% 10	66.7% 20
254	askleion-v1-phi35	855	32.1% 9	67.9% 19
255	nemotron-mini:latest	849	28.2% 11	71.8% 28
256	Nerdish-Llama-3.1-8B.Q4_K_M.gguf:latest	847	18.2% 4	81.8% 18
257	codellama:13b	841	8.3% 1	91.7% 11
258	granite3-dense:8b	832	18.8% 3	81.3% 13
259	granite3-moe:3b-instruct-q6_K	827	26.1% 12	73.9% 34
260	qwen:latest	698	22.7% 15	77.3% 51
-	-assistant	-	-	-
-	70b-models	-	-	-
-	ai-bot-expert-programmer	-	-	-
-	aia/dolphin-llama3.1:latest	-	-	-
-	aminadaven/dictalm2.0-instruct:q5_k_m	-	-	-
-	anamnesebot	-	-	-
-	anthropic.claude-3-sonnet-20240229	-	-	-
-	azure_openai	-	-	-
-	azure-admin-expert	-	-	-
-	bespoke-minicheck:latest	-	-	-
-	brxce/stable-diffusion-prompt-generator:latest	-	-	-
-	cmdt-escalona	-	-	-
-	cmdt-escalona-o1-mini	-	-	-
-	code-companion:latest	-	-	-
-	codegemma:7b	-	-	-
-	codellama:7b	-	-	-
-	codestral-mamba-latest	-	-	-
-	cohere_dbscan	-	-	-
-	cohere.command-r-plus	-	-	-
-	cryptonautslab/Ego:latest	-	-	-
-	cyberwald/llama-3.1-sauerkrautlm-8b-instruct:latest	-	-	-
-	ddg/claude-3-haiku	-	-	-
-	deepseek-chat	-	-	-
-	deepseek-coder:6.7b	-	-	-
-	deepseek-v2:latest	-	-	-
-	dolphin-llama3:8b-256k-v2.9-q6_K	-	-	-
-	dolphin-mistral:latest	-	-	-
-	dolphin-mixtral:latest	-	-	-
-	dolphin-phi:2.7b	-	-	-
-	eva	-	-	-
-	excel-assistant	-	-	-
-	excelv2	-	-	-
-	exer/laser-dolphin-mixtral:2x7b-dpo-q6_K	-	-	-
-	F-0x:latest	-	-	-
-	finalend/llama-3.1-storm:8b-q6_K	-	-	-
-	funktion-test	-	-	-
-	gemini-pro	-	-	-
-	gemma:2b	-	-	-
-	glm-4	-	-	-
-	google_genai.gemini-2.0-flash-exp	-	-	-
-	google_genai.gemini-exp-1121	-	-	-
-	gpt-3.5-turbo-0125	-	-	-
-	gpt-3.5-turbo-1106	-	-	-
-	gpt-3.5-turbo-16k	-	-	-
-	gpt-4-0613	-	-	-
-	granite3-moe:3b-instruct-q5_K_M	-	-	-
-	granite3.1-dense:latest	-	-	-
-	graveler	-	-	-
-	hermes3:3b	-	-	-
-	hf.co/LiuWoodsCode/Llama32Cyn:latest	-	-	-
-	hungama-fusion-ai	-	-	-
-	image-analysis	-	-	-
-	it-norms-expert-germma27b	-	-	-
-	llama-3.1-405b	-	-	-
-	Llama-3.1-8B-Lexi-Uncensored.Q4_K_S:latest	-	-	-
-	llama-arena	-	-	-
-	llama-guard-3-8b	-	-	-
-	llama2-uncensored:latest	-	-	-
-	llama3-groq-tool-use:latest	-	-	-
-	llama3.2-vision:11b-instruct-q8_0	-	-	-
-	llama3.2:3b-instruct-q6_K	-	-	-
-	llava:13b-v1.6-vicuna-q5_K_M	-	-	-
-	malyki-vision-v1	-	-	-
-	MalyKIv3:latest	-	-	-
-	marco-o1:7b-q4_K_M	-	-	-
-	medical	-	-	-
-	meta-llama/llama-3.1-70b-instruct:free	-	-	-
-	ministral-3b-2410	-	-	-
-	mistral-large-latest	-	-	-
-	mistral-medium-2312	-	-	-
-	mistral-small-2402	-	-	-
-	mythosmith	-	-	-
-	nemotron-mini:4b	-	-	-
-	o1-mini	-	-	-
-	o1-mini-2024-09-12	-	-	-
-	o1-preview-2024-09-12	-	-	-
-	ollama.com/library/mistral-nemo:latest	-	-	-
-	openai/gpt-3.5-turbo-instruct	-	-	-
-	phi3:14b-medium-128k-instruct-q5_K_M	-	-	-
-	phi3.5:3.8b-mini-instruct-q5_K_M	-	-	-
-	phi3.5:latest	-	-	-
-	pixtral-12b-2409	-	-	-
-	professor-de-biologia-em-portugues:latest	-	-	-
-	Qwen/Qwen2.5-7B-Instruct	-	-	-
-	Qwen/Qwen2.5-Coder-32B-Instruct	-	-	-
-	qwen2-math:7b	-	-	-
-	qwen2-math:latest	-	-	-
-	qwen2:latest	-	-	-
-	qwen2.5-coder:1.5b	-	-	-
-	qwen2.5-coder:3b	-	-	-
-	qwen2.5-coder:7b	-	-	-
-	qwen2.5:72b	-	-	-
-	random	-	-	-
-	run_se	-	-	-
-	siliconflow_api_connection.Qwen/Qwen2-VL-72B-Instruct	-	-	-
-	socialnetwooky/llama3.2-abliterated:3b_q8_0	-	-	-
-	solar:latest	-	-	-
-	sroecker/sauerkrautlm-7b-hero:latest	-	-	-
-	stablelm2:12b-chat-q4_K_M	-	-	-
-	starcoder2:7b	-	-	-
-	summarizer	-	-	-
-	tarot-with-images	-	-	-
-	tencent_hunyuanai.hunyuan-pro	-	-	-
-	undi95/remm-slerp-l2-13b	-	-	-
-	vicuna:13b	-	-	-
-	wizard-vicuna-uncensored:latest	-	-	-

The evaluation leaderboard is based on the Elo rating system and is updated in real time.

The leaderboard is currently in beta, and we may adjust the rating calculations as we refine the algorithm.
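
For readers unfamiliar with Elo, the following is a minimal sketch of the textbook update rule, not the leaderboard's actual implementation. The starting rating of 1000 and the K-factor of 32 are assumptions: the page does not state them, although rows with a single win at 1016 and a single loss at 984 are what these values would give against an equally rated opponent. The function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one comparison.

    k=32 is an assumed K-factor; the leaderboard's real value is not documented here.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    # Elo is zero-sum: the winner gains exactly what the loser gives up.
    return winner + gain, loser - gain


# Two models that both start at the assumed initial rating of 1000.
new_winner, new_loser = update_elo(1000.0, 1000.0)
print(new_winner, new_loser)  # 1016.0 984.0
```

Because the gain scales with how unexpected the result was, beating a higher-rated model moves a rating more than beating a lower-rated one. This is why a model with few comparisons can rank above one with many wins.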

Feedback History

1678 entries

Models	Result	Updated At
cryptonautslab/Ego:latest	Won	12 hours ago
cryptonautslab/Ego:latest	Won	2 days ago
cryptonautslab/Ego:latest	Won	4 days ago
cryptonautslab/Ego:latest	Won	6 days ago
cryptonautslab/Ego:latest	Won	7 days ago
Athena:latest	Won	8 days ago
cryptonautslab/Ego:latest	Lost	9 days ago
cryptonautslab/Ego:latest	Won	9 days ago
cryptonautslab/Ego:latest	Won	12 days ago
deepseek-r1:7b qwen2.5:1.5b	Won	16 hours ago

...