We are not feeling the AGI - Part 2

David Szabo-Stuban

Dec 22, 2024

The world is not ready for o3 yet.

7 Comments

Mike

Dec 23

Exceptional piece. Well done!

David Szabo-Stuban

Dec 23

thank you! appreciate it! 🙏

Philippe Delanghe

Dec 22

Yes, the distance between model capabilities and what most companies are able to do is still huge, because things are always messy in organizations.

David Szabo-Stuban

Dec 22

yep. that's the big opportunity here i think

Jurgen Gravestein

Dec 22

Man, what a great piece! So many nuggets of wisdom. I found myself nodding in agreement throughout.

David Szabo-Stuban

Dec 22

thank you!

yo-cuddles

Dec 25

This is one of the best pieces I've seen from the business angle. You earned another subscriber. I'll be reading more of your material when I get the time, bravo!

Just to gesture in a direction, and perhaps you're already knowledgeable on this because you're clearly ahead of the pack, but in case this isn't your wheelhouse:

Another thing holding back AI from subbing in for humans in even entry-level workflows is that o3 just isn't doing "intelligence" the way we want it to. Its score dropped from human level on the ARC benchmark to 65 points below human level on the ARC-2 test, which wasn't designed adversarially against o3. It's safe to say that the model is horrifically overfit and becomes straightforwardly unfit for the task out of distribution.

LLMs aren't useful unless they are incredibly reliable. A 90 percent accurate system is an absolute non-starter for an automated workflow; it needs to be near perfect, both to substitute for the remarkable reliability of human agents and to overcome the inevitable regulatory and institutional barriers to making this happen. LLMs would also need to develop an ability they totally lack right now: knowing what they don't know. If you can't tell (with near absolute reliability) that you don't have the necessary information, it is just not going to work.

Lumberjack