Who will have the best LLM at the end of 2024 (as decided by ChatBot Arena)?
💎
Premium
773
Ṁ670k
Dec 31
51%
OpenAI
45%
Google
3%
xAI
1.6%
Anthropic

I was browsing Twitter, and I saw a post by Karpathy positively talking about ChatBot Arena, which is a platform for ranking LLMs based on human ratings. As expected, OpenAI is holding positions 1, 2, and 3. I wonder which company will be #1 at the end of 2024.


Screenshot of the rankings table taken on the 13th of December:


Get
Ṁ1,000
and
S3.00
Sort by:

@traders Based on the comments below, I think it makes sense to resolve this question based on the ELO rating in case of a tie in "rank." When I created this question, a tie was not an option, so I doubt anyone even traded based on this assumption.

I created a similar question that only uses the rank. Feel free to trade on it.


Google deepmind was and is severely underrated by this market. The odds are looking more reasonable now though

@AJama The rumor is that OpenAI will release GPT-4.5 soon.

Gemini 1206 is now top 1 model in all categories by a small margin yet people think OAI will be better at the end of the year (59% at the moment). Do people believe in new release? GPT4.5?

@mathvc I think it's more a question of how often the leader board is updated.

I agree with your stance, I just don't know if I want more exposure to this market with my novice level of understanding of the subject.

is openai planning on doing another update before end of year? they used to be like every 2 weeks, but google lately has started that schedule

@NoahRich gpt 4.5 will be released this year

@Soli it is not announced anywhere officially

bought Ṁ50 YES

@mathvc i know, its a prediction

damn, it didnt even take the next 2 weekly openai update for the gemini model to drop below 1.

boughtṀ250 YES

@JasonDavies Google limit order of NO at 45%

reposted

interesting shift

bought Ṁ500 YES

@NoahRich i don’t think it is that interesting, if anything it shows google is out of the race for spot #1 this year. openai will pass them with the next minor update to 4o. they won’t even need to release a new model to pass google.

@Soli both could release another minor update in the time. there have even been reports shared here previously suggesting the potential lol

besides its interesting that google in its own has even reached this point. up until now they have been pretty far off. especially given their compute potential. interesting if they are finally starting to make use of their leg up in funding potential and compute

@NoahRich google reached this point already in july/august when they were ranked #1 for 1-2 weeks (see this other market that resolved yes) so imo no new information here that would be relevant for 2024 since there is a still a large enough gap between openai and google that can’t be closed this year. However for 2025 it is a different story and Google indeed might fully catch up instead of being 1-2 months behind.

/Soli/which-companies-will-outrank-openai

/Soli/who-will-have-the-best-llm-at-the-e-382ae559b471

@Soli Must've been right before I joined Manifold then! I joined in late August I think and at that time OpenAI was already leading. Thanks for sharing.

bought Ṁ550 YES
bought Ṁ100 YES

@JasonDavies wow! big update

Elon you need to try harder. Your enemies have figured out distributed training. You need to go faster.

https://x.com/elonmusk/status/1850991323010261230

bought Ṁ250 YES

Google reportedly releasing in December

https://9to5google.com/2024/10/25/gemini-2-0-december/

@inar same article mentions openai releasing in december too

bought Ṁ300 NO

I'm surprised you guys don't think that any other lab can hardcode a scratchpad/think step by step prompt to their flagship models

In fact, I would be very surprised if Opus 3.5 and the next QWENs and Geminis don't ship with a more expensive version with prethinking mode

@PeterBuyukliev i think anthropic will release something that still comes short of beating openai

Comment hidden
© Manifold Markets, Inc.Terms + Mana-only TermsPrivacyRules