Resolution Criteria
This market resolves to YES if Daniel Kokotajlo publicly states or writes that the state-of-the-art large language model release(s) in 2025 after April 3rd have caused him to increase his estimated timeline for the development of artificial general intelligence (AGI). This must be a clear statement attributing the timeline extension specifically to 2025 LLM releases.
(Here, "increase" means a longer timeline: he expects AGI to take more time to arrive, i.e., progress will be slower.)
The market resolves to NO if:
- Kokotajlo does not make such a statement by the end of 2025 
- He explicitly states that 2025 LLM releases have not changed or have decreased his AGI timeline estimate 
- He makes no public comment on how 2025 LLM releases affect his AGI timeline 
- Kokotajlo's timeline increases, but not because of disappointing LLM releases. For example, a war in Taiwan or a market crash might slow AGI development relative to his current expectations.
This market resolves to NA if:
- He makes no public comment on how 2025 LLM releases affect his AGI timeline 
- He makes conflicting public comments that make it difficult to determine his overall view 
I will wait until the end of 2025 to resolve, because he may change his mind at various points during the year. For example, GPT-5 might initially be disappointing, but Gemini 3.0 might then exceed expectations later in 2025.
Background
Daniel Kokotajlo is a researcher known for his work on AI alignment and forecasting AI development timelines. He has previously published analyses and predictions regarding the development of artificial general intelligence.
State-of-the-art (SOTA) large language models are the most advanced AI language systems available at a given time. GPT-5, Claude 4, DeepSeek-R2, Gemini 3.0, and Grok 4 are all models that will likely be released in 2025.
Daniel co-authored https://ai-2027.com, published on April 3rd, 2025. This question essentially asks, "Will the models released after that report be less good than he expected?"
- Update 2025-04-10 (PST) (AI summary of creator comment): Clarifications from the creator:
  - Statements where Kokotajlo indicates that the release has influenced (i.e., increased) his median forecast should be taken as a YES resolution if he uses terms such as "slightly influenced", "moderately influenced", "significantly influenced", or "completely influenced".
  - The term "just barely influenced" is not considered sufficient to trigger a YES resolution.
 
This might be a bit vague. For example, shortly after the GPT-4.5 release, Daniel said that he had raised his median forecast from 2027 to 2028. He said that it was not a direct consequence of this release but was significantly influenced by it, although he is struggling to estimate exactly how much (IIRC).
Would a similar statement be enough for YES?
@qumeric Hmmm yeah I think "significantly influenced" passes the threshold for YES.
But you are right that he could say anything in the range of "GPT5 just barely/slightly/moderately/significantly/completely influenced my median forecast."
Any ideas? I'm leaning towards resolving any of the above YES, except maybe "just barely."
@AdamK Good point, what I really meant was any new models released from this point in time going forward. Although I guess Llama 4 was released only a few days ago, after the AI-2027 document. So I think this question specifically refers to models released after April 3rd. I will update the question.
I find the "increase"/"decrease" wording unclear. Does "increase" mean a longer timeline or a faster timeline?