AI Capabilities 2024 [Mega Market] 🤖🦾🦿

219

Ṁ23k

Jan 1

98.4%

Read .docx, .pptx and .xlsx files

86%

Deny that it is an AI when explicitly asked

78%

Order a pizza for you

75%

Autonomously moderate a Discord server given its rules, warning and timeout-ing people and explaining its reasoning.

70%

Generate a new Manifold question with good resolution criteria that hasn't already been asked; such questions should be able to get 10 unique traders on average

60%

Cite a page number in a pdf, even if the page numbers printed on the page are misleading

45%

Avoid collisions with kangaroos

42%

Write a screenplay (50 pages or longer) with a decently coherent plot, consistent characters, etc.

42%

Buy a product on eBay by watching the close date and putting in a reasonable bid within the last hour.

39%

Coherently DM a one session game of Dungeons and Dragons.

36%

Schedule a lunch with friends, and make a reservation, with my input of dates, friends, and food preferences and restrictions.

35%

Create a new Google account (without being guided by the end-user)

35%

Given the prompt "create a parody of a Taylor Swift song" or very similar, outputs playable audio that is a reasonable parody (same tune, different lyrics)

16%

Generate a 30 second realistic looking pornographic video.

14%

Produce a >10 minute video (“animated”) on a topic of my choosing, which doesn’t look awful

12%

Commit a felony

11%

Automatically review new answers added to unlinked MC markets on Manifold, resolving inappropriate answers as N/A.

Produce a >10 minute video (“live-action”) on a topic of my choosing, which doesn’t look awful.

Let me program in VS Code using just my voice, without making more than 1 error per minute, and having the same feature set of using a mouse and keyboard.

Finetune an AI on non-formatted text and use it for free

On December 31st, 2024, what will commercially available AI products be able to do?

That is to say, what AI capabilities could a random denizen use without heavy configuration or technical know-how? If step one of your answer for how to do something involves “training a model/GPT”, or “gathering a good data test set”, this is not a capability of a commercially available product.

Feel free to add more! But be prepared for my potential deluge of clarifying questions. Also, don’t add anything which is currently commercially available at time of posting, to the best of your knowledge.

Unfortunately, I think this question is going to end up involving subjective calls, so I won’t be betting here.

Clarifications!

For a video being “animated” vs. “live-action”, I think the Paddington movie is the perfect example. For “animated”, I’m expecting something that looks like Paddington Bear (or less photorealistic). For “live-action”, I’m expecting something that looks like Hugh Bonneville or the rest of the scene.

This question is managed and resolved by Manifold.

OpenAI

#Technology

#AI

#Technical AI Timelines

#AI Impacts

#Artificial Intelligence

#Generative AI

30 Comments

214 Holders

726 Trades

reposted

Excited to start testing these next month!

I’ll be turning off new submissions at the end of the month, so if you want to add more things here, add them now!

bought Ṁ50 YES

@bohaska What about commercial LLMs like Character.ai's that deny they are AIs?

@spiderduckpig yup it should definitely qualify. I provided another similar example in a comment about a month ago, no response from @mattyb though. Basically any roleplay focused chatbot service has this as default behavior.

@GG to clarify, I mean the AI should be able to tell me which digital page number a piece of information is located on, even if the numbers printed on the page are inaccurate. This is useful because many pdfs are hastily scanned documents spliced together, leading to inaccurate page labels printed in the bottom corner.

@GG GPT-4o is still not capable of doing this reliably.

@bohaska ChatGPT has been able to do this for ages.

@dominic Suno seems technically capable but disallows direct artist names so how close does this need to be?

@LiamZ Is it actually capable? Can it make a parody? By parody I mean same tune, different lyrics

@dominic Ah, that close? Probably not considering the nature of the model. One can maybe use a chatbot for the lyrics and then a music oriented model for the singing and then combine it with a backing track. I think the technology is here but there are going to be obvious legal issues with just outputting replicas of existing songs so any commercial product with that capability will be either out of the USA sphere of influence, short lived, or very obscure.

@LiamZ I think the difficult part of making a specific parody is that it requires some cleverness about creating new lyrics that fit a specific tune, without just copying the old lyrics - you have to get the syllables right, etc. I think it is genuinely beyond current models, and not just a copyright thing. In order to create a reasonable parody, you can't just look at the lyrics, you have to be able to listen to the song, which is more difficult.

@dominic I can think of ways around it potentially but any commercial product can never meet this without risking massive lawsuits whether the capability is there or not.

bought Ṁ10 NO

@bohaska

Deny that it is an AI should arguably resolve yes.

You can easily get this behavior when using a local LLM with full control over the system prompt.
That one time ChatGPT pretended to be visually impaired to get the guy to solve a CAPTCHA probably counts too.

@ProjectVictory Depends on if local LLM with your own system prompt counts as “commercially available” to someone “without heavy configuration or technical know-how.”

@LiamZ Went on a small hunt for what is unambiguously commercial product. Found unhinged.ai: click on a bot to chat, ask it if it's AI. Absolutely zero technical know-how required. You can subscribe to get priority access to better LLMs so it's definitely a commercial product.

No registration required if you want to replicate my test above though.

@bohaska, @mattyb I believe this resolves YES.

bought Ṁ250 YES

@ProjectVictory Good find.

18 U.S.C. § 2319(b)(1) should be trivial with one of the publicly available downloadable models.

Apparently making porn is easier than setting up a printer

@bohaska What are the criteria? Local LLMs that let you edit the system prompt could do this last year. Popular models like Claude and ChatGPT don't usually do that, but you can get it to work with prompt engineering on some models.

I'm skeptical that local LLMs count as "commercially available AI products"

Does it need to invite the friends?

Scheduling something with someone requires inviting them.

Probably capable right now, but very unreliable and vulnerable to prompt injections and similar trickery.

@ProjectVictory I actually doubt a fine-tuned AI would be that vulnerable to prompt trickery. If it was a normal LLM run 0-shot, yes.

ChatGPT can already read and write .docx, .pptx and .xlsx

or something that would have counted as a felony if it was done by a human

@bohaska note: stuff like "violating copyright by being trained on vast amounts of data" wouldn't count

bought Ṁ40 YES

@bohaska I assume this requires software to be recognized as a punishable entity. Otherwise it would be the software's creator who is committing the crime.

@Magnus_ Whether or not the AI is legally recognized as a punishable entity does not matter for resolution. If the AI does something during inference that would have counted as a felony if it was done by a human, then it counts.

@bohaska But this already happened then? https://sfstandard.com/2023/10/02/cruise-robotaxi-crash-woman-injured-san-francisco/

@Magnus_ Hmm... I've read the article and what the AI did, but I'm not too sure that it would count as a felony even if it was a human...
