This will be evaluated according to the AI Safety Levels (ASL) v1.0 standard defined by Anthropic here. See this market for criteria for determining a system to be ASL-3 for the purposes of this market.
Unlike in that market, the date in question is the date of the first public report containing credible evidence that a model is ASL-3. That date may be later than the date the model was trained, and earlier than the date a consensus forms that the evidence qualifies the model as ASL-3.
Feel free to add new answer choices. Valid choices must be in the format YYYY QQ.
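For anyone adding an answer, here is a minimal sketch of how a report date could map to an answer label. It assumes "YYYY QQ" means a four-digit year, a space, and then Q1 through Q4 (for example "2025 Q2"); the function names `quarter_label` and `is_valid_answer` are hypothetical and not part of the market itself.

```python
import re
from datetime import date

# Assumption: "YYYY QQ" is a four-digit year, a space, then Q1..Q4, e.g. "2025 Q2".
ANSWER_PATTERN = re.compile(r"\d{4} Q[1-4]")


def is_valid_answer(label: str) -> bool:
    """Return True if the label matches the assumed 'YYYY QQ' format."""
    return ANSWER_PATTERN.fullmatch(label) is not None


def quarter_label(report_date: date) -> str:
    """Map a calendar date to its assumed 'YYYY QQ' answer label."""
    quarter = (report_date.month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
    return f"{report_date.year} Q{quarter}"


if __name__ == "__main__":
    label = quarter_label(date(2025, 5, 25))
    print(label, is_valid_answer(label))
```

Under this reading, a first public report dated in May of a given year would fall under that year's Q2 answer.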
- Update 2025-05-25 (PST) (AI summary of creator comment):
  - The six-month countdown period from the referenced market will be used.
  - If Anthropic makes a provisional ASL-3 assessment, the market will, by default, resolve to an answer of Q2 (for the relevant year).
    - This default resolution to Q2 is contingent on Anthropic not retracting the provisional assessment within that six-month period.