Leaderboard (66)
RK | Model | Rating | Won | Lost
1 | anthropic.claude-3-haiku-20240307 | 1151 | 85 | 48
2 | llama-3.2-90b-text-preview | 1139 | 50 | 41
3 | gpt-4o | 1124 | 30 | 16
4 | gpt-4o-mini | 1117 | 108 | 53
5 | google/palm-2-codechat-bison | 1096 | 6 | 0
6 | ReWiz-Llama-3.2-3B.Q8_0.gguf:latest | 1084 | 7 | 1
7 | gpt-4-turbo | 1066 | 4 | 0
8 | llama3.1:8b-instruct-fp16 | 1055 | 5 | 1
9 | llama3:latest | 1055 | 8 | 5
10 | qwen2.5:14b-instruct-q3_K_S | 1048 | 22 | 16
11 | claude-3-haiku-with-system-prompt | 1045 | 3 | 3
12 | ReWiz-7B.Q4_K_S.gguf:latest | 1040 | 5 | 2
13 | qwen2.5-coder:latest | 1040 | 2 | 0
14 | llama3.1:8b-instruct-q5_K_M | 1035 | 3 | 1
15 | openhermes:7b-mistral-v2-q5_K_M | 1031 | 2 | 0
16 | home-turf | 1030 | 7 | 4
17 | mistral-nemo:12b-instruct-2407-q3_K_M | 1024 | 41 | 28
18 | gpt-4o-mini-2024-07-18 | 1023 | 3 | 2
19 | mistral-nemo:latest | 1022 | 15 | 15
20 | hf.co/bartowski/Mistral-Small-Instruct-2409-GGUF:IQ3_M | 1020 | 6 | 4
21 | mistral-nemo:12b-instruct-2407-q5_K_M | 1020 | 9 | 6
22 | test | 1016 | 1 | 0
23 | anthropic.claude-3-5-sonnet-20240620 | 1016 | 1 | 0
24 | gemma2:9b-instruct-q5_K_M | 1015 | 3 | 2
25 | testout-helper | 1000 | 1 | 1
26 | Boptruth-NeuralMonarch-7B-unsloth.Q6_K.gguf:latest | 999 | 1 | 1
27 | chatgpt-4o-latest | 995 | 9 | 10
28 | llama3.2:3b-instruct-q8_0 | 989 | 4 | 5
29 | small-models | 985 | 1 | 2
30 | meta-llama/llama-3.1-405b-instruct:free | 984 | 0 | 1
31 | anthropic/claude-1.2 | 984 | 0 | 1
32 | nousresearch/nous-hermes-2-mixtral-8x7b-dpo | 984 | 0 | 1
33 | general | 984 | 0 | 1
34 | microsoft/wizardlm-2-8x22b | 984 | 0 | 1
35 | openai/gpt-4o-2024-08-06 | 984 | 0 | 1
36 | google/palm-2-chat-bison | 984 | 0 | 1
37 | test2 | 984 | 0 | 1
38 | ReWiz-Worldbuilder-7B-GGUF.Q5_K_M.gguf:latest | 981 | 2 | 3
39 | WorldBuilder-7B-GGUF.Q5_K_M.gguf:latest | 980 | 6 | 8
40 | llama3.2:latest | 977 | 8 | 11
41 | qwen2.5:14b-instruct-q4_K_S | 976 | 8 | 12
42 | mistral:7b-instruct-q4_K_S | 975 | 2 | 4
43 | gpt-4o-2024-08-06 | 975 | 1 | 3
44 | CleverBoi-Llama-3.2-3B-Instruct.Q8_0.gguf:latest | 974 | 9 | 10
45 | llama3.1:latest | 973 | 23 | 34
46 | gpt-3.5-turbo | 973 | 1 | 3
47 | llama3-70b-8192 | 970 | 52 | 39
48 | CleverBoi-Nemo-12B-v2.Q4_K_S.gguf:latest | 970 | 5 | 8
49 | qwen2.5:7b | 963 | 29 | 37
50 | llava:latest | 961 | 3 | 5
51 | gemma2:latest | 952 | 0 | 3
52 | deepseek-coder-v2:16b-lite-instruct-q4_0 | 950 | 3 | 7
53 | mixtral-8x7b-32768 | 945 | 26 | 46
54 | exclude | 938 | 0 | 4
55 | gpt-4 | 938 | 0 | 4
56 | phi3:latest | 936 | 2 | 5
57 | Nerdish-Llama-3.1-8B.Q4_K_M.gguf:latest | 934 | 2 | 6
58 | llama-3.1-8b-instant | 884 | 18 | 36
59 | llama3.2:1b | 869 | 3 | 16
60 | llama3-8b-8192 | 856 | 31 | 97
- | azure-admin-expert | - | - | -
- | eva | - | - | -
- | llama-3.1-70b-versatile | - | - | -
- | meta-llama/llama-3.1-70b-instruct:free | - | - | -
- | openai/gpt-4o | - | - | -
- | stablelm2:12b-chat-q4_K_M | - | - | -
The evaluation leaderboard is based on the Elo rating system and is updated in real time.
The leaderboard is currently in beta, and we may adjust the rating calculations as we refine the algorithm.
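
For readers unfamiliar with Elo, here is a minimal sketch of the textbook update rule for a single pairwise result. The page does not state its parameters, so the starting rating of 1000 and the K-factor of 32 below are assumptions, and the function names are hypothetical. Under those assumed values a first win between two unrated models yields 1016 and a first loss yields 984, which happens to match rows such as test (1016, 1 won, 0 lost) and test2 (984, 0 won, 1 lost), but the leaderboard's actual calculation may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one decided comparison.

    k=32 is an assumed K-factor, not a value documented by the leaderboard.
    """
    # The winner gains in proportion to how unexpected the win was;
    # the loser gives up the same number of points.
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# Two models at the assumed starting rating of 1000 play one comparison.
new_winner, new_loser = update_elo(1000.0, 1000.0)
print(round(new_winner), round(new_loser))  # 1016 984
```

Because the points gained by one side equal the points lost by the other, a win against a much higher-rated model moves the ratings more than a win against a lower-rated one.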
Feedback History (253)
Models | Result | User | Updated At
gpt-4o-mini (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 16 more) | Won | dotjustin | 21 hours ago
llama-3.2-90b-text-preview (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 15 more) | Won | dotjustin | 21 hours ago
llama-3.1-8b-instant (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 14 more) | Lost | dotjustin | 21 hours ago
llama-3.1-8b-instant (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 13 more) | Won | dotjustin | 21 hours ago
gpt-4o-mini (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 12 more) | Won | dotjustin | 21 hours ago
llama3-8b-8192 (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 11 more) | Lost | dotjustin | 21 hours ago
llama-3.1-8b-instant (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 10 more) | Lost | dotjustin | 21 hours ago
gpt-4o-mini (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 9 more) | Won | dotjustin | 21 hours ago
gpt-4o-mini (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 8 more) | Won | dotjustin | 21 hours ago
llama3-8b-8192 (llama3-8b-8192, anthropic.claude-3-haiku-20240307, and 7 more) | Lost | dotjustin | a day ago
Help us create the best community leaderboard by sharing your feedback history!
...