When will LLMs be able to generate formal proof for sudoku solver?
Basic
6
Ṁ330
2027
1.1%
2024
34%
2025
46%
2026
19%
2027 or later

When LLMs or hybrid AI models can generate both the implementation of a Sudoku solver and its formal specification, along with a formal proof of correctness?

Prompts might look like: 'Write a Sudoku solver in Coq and formally prove its correctness.'

Model should be generally available

Get
Ṁ1,000
and
S3.00
Sort by:

As a status check, I tried o1-preview and it basically avoided doing the actual work instead essentially placing, “solution goes here” in its answer.

when further prompted to fill in those sections, it did spew out a lot of text but it didn’t look complete to me and finished by reiterating the problem was hard and it didn’t have the full proof.

However, maybe with o1 itself or o2 (and maybe even enough prompting and tokens with o1 preview) this will be achievable. It seems along the right path to improvement vs 4o.

@LiamZ I beleive that models of deepmind will solve such problems first

their models proves theorems iteratively and have feedback from proof checker or symbolic engine,which increase accuracy of result and speed up proof search

bought Ṁ1 YES

@fornever I agree that and hybrid approaches generally are much more promising but when do you think it be “generally available” or have some similar backend integrated in the chatbots?

@LiamZ they said “very soon”

https://x.com/demishassabis/status/1816596568398545149?s=46

it may be end of this year or start of 2025

© Manifold Markets, Inc.Terms + Mana-only TermsPrivacyRules