[Carlini questions] SOTA AI systems still regularly "hallucinate" incorrect solutions to problems
5 traders · Ṁ52 volume · closes 2030

On Jan 1st 2027: 61%
On Jan 1st 2030: 42%

Resolution Criteria:

For this question, a "hallucination" is the model making up something completely detached from reality. Making a mistake is not a hallucination; but if you ask for a citation and it invents one from a book that doesn't exist, that would be a hallucination. "Regularly" means models do it in a high enough fraction of cases that it matters in practice. If good hallucination benchmarks exist, I will rely on those. If not, I'll go mainly on whether the research community as a whole believes the problem still exists.

Motivation and Context:

Today's models hallucinate a lot. They make up facts; they make up events. Ask for a summary of a book that doesn't exist, and some fraction of the time they'll tell you what they think it says given the title and author. This is a massive problem, and it prevents these models from being deployed in safety-critical settings. I want to know whether hallucination will remain a big problem.

Question copied from: https://nicholas.carlini.com/writing/2024/forecasting-ai-future.html
