AI - An enthusiast's example of why AI is not a given


FrancisA

Original Poster:

152 posts

21 months

Thursday 6th February
Hi All

I just finished watching Mo Gawdat’s presentation on AI https://www.youtube.com/watch?v=a4dP6ItHS_Y
(for those who don't know him, it's worth looking up his name).

Following that video I decided to challenge AI. Below is a link to the thread of a conversation about Porsches. Part of his guidance is to not automatically accept the answer from the AI tool. Having said that, the results will depend on the tool in question (ChatGPT in this case).

I thought I would share this in a context that enthusiasts would understand (note my challenges to the responses and how the AI tool changes its response).

https://chatgpt.com/share/67a3ff09-e930-800e-9b76-...


Kawasicki

13,727 posts

247 months

Thursday 6th February
AI isn’t intelligent. It’s just collecting phrases together based on your questions. Your question largely determines the answer.

Orangecurry

7,607 posts

218 months

Thursday 6th February
FrancisA said:
It is part of his guidance to not automatically accept the answer from the AI tool.
No kidding.

I know Trump and Musk making up reality is now the norm, but common sense really has left the building if you use AI for anything, IMHO.

cseven

270 posts

248 months

Thursday 6th February
Orangecurry said:
No kidding.

I know Trump and Musk making up reality is now the norm, but common sense really has left the building if you use AI for anything, IMHO.
The rate of development is quite astonishing - think of the race to the moon.

c3m

322 posts

163 months

Thursday 6th February
It's important to understand that LLMs are just very good bullst generators.

By design, they're untrustworthy because their output is based on the statistics of their training data - there's no notion of "truth" or "accuracy" or anything like that.

You might have heard "LLMs hallucinate some small % of the time", except that's inaccurate - they hallucinate 100% of the time; it's just that the most likely answer (i.e., the statistical output based on training data) is right a lot of the time - except when it isn't and it's total bullst.
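As a toy sketch of that idea (everything here is made up for illustration), a "language model" boiled down to its essence just samples the statistically likely next word from its training data - truth never enters into the process:

```python
import random
from collections import Counter

# Toy "training data" - the model knows nothing beyond word co-occurrence counts.
corpus = "the car is fast the car is red the car is fast".split()

# Count which word follows "is" in the corpus: {'fast': 2, 'red': 1}.
follows_is = Counter(nxt for cur, nxt in zip(corpus, corpus[1:]) if cur == "is")

# "Generation" is weighted sampling from those counts: "fast" comes out twice
# as often as "red", regardless of what the car actually is.
words, weights = zip(*follows_is.items())
print(random.choices(words, weights=weights, k=1)[0])
```

Real LLMs condition on vastly more context, but the principle is the same: the output is the plausible continuation, not the true one.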

LLMs cannot reason, and that's clearly evidenced by the numerous examples where even basic answers are wrong when the inputs are changed to something statistically unlikely (e.g., the father-son car accident riddle, the crossing-the-river riddle). Researchers at Apple showed that LLMs perform dramatically worse on basic algebra questions when you add in irrelevant information.

LLMs can be good at use cases where there's no "right answer" or accuracy is not critical - for example, generating stories or music, or transcribing audio/video.

Then there's the thorny issue of copyright: LLM companies claim that using all data on the Internet is fair use and does not require licensing, yet they've struck data licensing deals with other big companies - why would they strike licensing deals if they believe it's fair use?

Eerke Boiten, Professor of Cyber Security at De Montfort University Leicester, put it very well:
Boiten said:
From the perspective of software engineering, current AI systems are unmanageable, and as a consequence their use in serious contexts is irresponsible. For foundational reasons (rather than any temporary technology deficit), the tools we have to manage complexity and scale are just not applicable.
So, why are all the big tech companies throwing hundreds of billions into this? It's because Silicon Valley has been on the lookout for the "next big thing" for more than a decade now without anything in sight - those stock prices must be pumped, and infinite growth needs to come from somewhere. After the collapse of the NFT idiocy, they needed something else to latch onto, so here we are today. There are so many vested interests in keeping the hype train going, so be aware of the incentives driving people's behaviour.

Does that mean there's nothing useful in LLMs? No, they provide value in specific contexts (but probably 10% of the currently hyped use cases).

Does that mean that AI/Machine Learning won't be useful in the future? No, we should expect to see more progress here in the future.

But LLMs are not the answer that delivers intelligence; we will need a new architecture for that - e.g., JEPA.

As Yann LeCun, one of the Godfathers of AI, thinks: "AI Is Dumber Than a Cat".

Maxym

2,311 posts

248 months

Thursday 6th February
An LLM is a 'large language model', if any of you had forgotten or were unfamiliar with jargon...

I didn't have a fking clue until I Googled it. AI provided the answer. hehe

Edited by Maxym on Thursday 6th February 09:34

FrancisA

Original Poster:

152 posts

21 months

Thursday 6th February
c3m said:
It's important to understand that LLMs are just very good bullst generators.
Absolutely bang on response. The statistical aspect of LLMs is not fully understood or appreciated. The black-or-white response explains why it changed its answer when I challenged it. This is important. However, the point of the experiment was to demonstrate how AI can present inaccurate responses in a confident manner.

ChrisW.

7,415 posts

267 months

Thursday 6th February
Which is why mining information and matching questions and answers from multiple sources has a fighting chance of creating the consensus answer ... it's like asking a friend who will tell you what he or she has heard without actually knowing themselves.

I asked what the kWh equivalent was of a gallon of petrol. A lot of the answer was correct but it didn't correctly calculate the litre equivalent of a UK gallon. When I replied explaining the correct answer, it just replied YES.
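For reference, the arithmetic being fumbled is straightforward. The gallon conversions below are exact definitions, while the ~9.5 kWh per litre energy density of petrol is an approximate, commonly quoted figure:

```python
UK_GALLON_LITRES = 4.54609        # imperial gallon, exact by definition
US_GALLON_LITRES = 3.785411784    # US gallon - the value often substituted by mistake
KWH_PER_LITRE = 9.5               # approximate energy content of petrol (~34 MJ/L)

print(f"UK gallon: {UK_GALLON_LITRES * KWH_PER_LITRE:.1f} kWh")  # 43.2 kWh
print(f"US gallon: {US_GALLON_LITRES * KWH_PER_LITRE:.1f} kWh")  # 36.0 kWh
```

Mixing up the two gallons is exactly the sort of plausible-looking slip that sails through unnoticed.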

I hope I was right ... because it might now repeat this answer to the next idiot that asks ... smile

c3m

322 posts

163 months

Thursday 6th February
FrancisA said:
However, the point of the experiment was to demonstrate how AI can present inaccurate responses in a confident manner.
You're absolutely correct and that's the actual danger of this technology – we're on the precipice of raising a young generation being fed bullst at scale and made to believe it's true because "it's Artificial Intelligence".

LLMs will answer questions with authority and confidence all the time - even when they're wrong.

Juniors at my workplace would come to me with obviously wrong "facts" and be very puzzled when I told them they were absolutely wrong. Their response is usually "Oh, but AI told me this, how can it be wrong? They know everything!". Paradoxically, the higher the accuracy of LLMs, the more dangerous they become, as people start trusting them more - and in those 1%/2%/5% of cases where it really matters, they will make a catastrophic mistake. And worst of all, young people will not develop the skills to think critically but will just trust blindly.

As in all things in life, this phenomenon is driven by very powerful incentives and trillions of dollars. We would have to go through the stage of "We deployed this AI system, results were terrible" for companies to realise that it's not a magic wand to be deployed everywhere.

Similar to what happened here:

TechMonitor said:
The UK government has reportedly discontinued six artificial intelligence (AI) prototype projects intended to modernise its welfare system, it has emerged. Freedom of Information (FoI) requests revealed that these pilots, aimed at improving services at Job Centre, staff training, communication systems, and disability benefit processing, faced challenges in scalability, reliability, and testing.
In the near term, expect AI to be shoved down our throats everywhere. It's now at the top of every Google search, in Microsoft Office, etc. Why? Because these companies can then report "Usage of our AI products has increased by thousands of %, so those billions in CapEx are totally worth it" in the hope of keeping their stock prices up. Except the AI businesses are major money-losing operations hoping to recoup their investments by taking over the world once they invent AGI/super-intelligence and lay off most employees.

You have a small number of live players and the rest are grifters who have just come along for the journey having left the crypto hype train to now become AI bros.

LLMs have their use cases and can be leveraged to provide value. I personally use Whisper to transcribe audio/video and it's absolutely amazing - even if it makes a few mistakes here and there, I can easily correct them and it's not critical. The danger is when those LLMs get deployed in places where they have no place.

Time will tell. In the meantime, enjoy the ride and remember - "It is difficult to get a man to understand something when his salary depends on his not understanding it".

c3m

322 posts

163 months

Thursday 6th February
ChrisW. said:
Which is why mining information and matching questions and answers from multiple sources has a fighting chance of creating the consensus answer ... it's like asking a friend who will tell you what he or she has heard without actually knowing themselves.
I think it depends on the use case. Do I want to collect the opinion of thousands of unverified sources about the reinforcements needed to ensure a bridge doesn't collapse? Or do I want to task a tiny subset of civil engineers to work through this problem?

The other aspect is that LLM output embeds the implicit bias of its training data. Have to be super careful if such systems get deployed to make decisions which affect people's lives - in the extreme, think Minority Report-style.

Some of my friends have children at university, and it sounds like a lot of students use AI to write essays, do coursework, etc. without actually learning anything. Teaching staff try to detect AI usage, but that's not simple either, so it becomes a bit of a cat-and-mouse game of who can cheat without getting caught. Having said that, it might be a signal that the education system needs to be rethought, but that's not something that can be done quickly and effectively.

ChrisW. said:
I asked what the kWh equivalent was of a gallon of petrol. A lot of the answer was correct but it didn't correctly calculate the litre equivalent of a UK gallon. When I replied explaining the correct answer, it just replied YES.
And that's the danger of these systems - they're mostly right on topics which have a lot of training data (the worse the training data, the worse the answers get - try something very obscure and you get mostly garbage). Because they're mostly right, people start trusting them fully without realising parts of the answers can be completely wrong.

In your case, you caught the inaccuracy, but maybe others would not. They would then proceed to use the incorrect litre equivalent, with unknown consequences. Now scale this to billions of people being actively misled by such systems. But hey, there's tons of money to be made, so who cares about the consequences.

Orangecurry

7,607 posts

218 months

Thursday 6th February
c3m said:
It's important to understand that LLMs are just very good bullst generators.

LLMs can be good at use cases where there's no "right answer" or accuracy is not critical - for example, generating stories or music, or transcribing audio/video.
Someone I know runs a publishing company (let's call it xPC).

(this is the abridged version hehe )

One of their signed authors recently promised a new six-book deal - when the delivery timescales stated in the contract were weighed against the money to be paid, xPC wrote back asking 'are you using AI for any of the six books?'

Author: No absolutely not.

xPC: we can't see how you can produce this much work in that time - are you sure you're not using AI?

Author: No absolutely not.

xPC: we've now run this past our technical experts, and they can't see how this can be done - are you sure you're not using AI?

Author: ........... ok yes I am

xPC: that's fine, but fees will be substantially reduced and we need to see the AI generated material before we agree to anything.

Edited by Orangecurry on Thursday 6th February 12:44

c3m

322 posts

163 months

Thursday 6th February
Orangecurry said:
Author: ........... ok yes I am

xPC: that's fine, but fees will be substantially reduced and we need to see the AI generated material before we agree to anything.

Edited by Orangecurry on Thursday 6th February 12:44
This behaviour is rife and only getting worse - there's a term for it: "AI slop", or just "slop", similar to spam. The Internet and platforms are getting absolutely polluted with this garbage and it's a one-way street.

There are large-scale operations publishing "fake" or LLM-generated books on Amazon. Short-term incentives mean Amazon don't care too much about fixing the problem - more books mean more sales, which they take a % cut of.

Then you get the LLM generated fake reviews on Amazon, Google, everywhere. In the end, you won't be able to trust anything because LLMs have significantly lowered the cost of generating plausible text.

Spotify themselves are playing a sneaky game - they're generating music with AI, and the royalties for it go back to themselves rather than to artists. The more AI music generated by Spotify gets played, the more of the monthly fees they keep for themselves - the incentives are obvious here. A lot of people don't understand how Spotify splits the revenue from a user's monthly fee - it doesn't go to the artists that user listens to but gets pooled and then split by total streams.

For example, suppose you pay £10 a month and only listen to a single obscure indie band on Spotify. How much of that goes to the indie band? Almost nothing. Here's how it works - Spotify counts the band's streams, say 1,000 in a month, calculates what percentage that is of all streams on the platform, and the band gets paid according to that proportion. So most of your £10 goes to the popular artists even though you don't listen to them.
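That pro-rata arithmetic can be sketched in a few lines (all the numbers here are invented for illustration):

```python
def pro_rata_payout(artist_streams: int, total_streams: int, pool_gbp: float) -> float:
    """Pro-rata model: pooled fees are split by share of ALL platform streams."""
    return pool_gbp * artist_streams / total_streams

# Your £10 joins the pool; the indie band's cut of it depends on their share of
# every stream on the platform, not on the fact that you only listen to them.
my_fee = 10.0
band_streams = 1_000               # the band's streams this month
platform_streams = 5_000_000_000   # all streams on the platform this month

share = pro_rata_payout(band_streams, platform_streams, my_fee)
print(f"Of my £10, the band's share is £{share:.6f}")  # £0.000002
```

Under a user-centric split the band would get (nearly) the whole £10; under pro-rata they get a fraction of a penny.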

Per the article:

TheRegulatoryReview said:
Most streaming services, including Spotify, use a pro-rata system in which all subscription money is pooled into a single pot, and then a percentage of it is distributed to artists proportionate to their number of streams. For example, if Beyoncé gets five percent of the total streams in a month, Beyoncé’s label will get five percent of the total money paid to all content providers that month.
You might wonder – why does Spotify not pay "fairly", i.e., splitting the fees according to who listens to the music? The answer is the power of big labels who control the most popular music. The current system favours big labels as they hold the license to the most popular music.

The reason they can force Spotify to use such a revenue sharing system is that Spotify needed/needs the big labels, as otherwise no one would subscribe to Spotify if the most popular music is not on the platform. This creates yet another incentive for Spotify to reduce their dependence on big labels if they can shift their users' listening patterns to LLM generated music. Spotify has the levers to change listening patterns by using AI generated playlists, DJ X, etc.

f6box

179 posts

9 months

Thursday 6th February
Orangecurry said:
No kidding.

I know Trump and Musk making up reality is now the norm, but common sense really has left the building if you use AI for anything, IMHO.
Slightly luddite view. LLMs have serious limitations, but they have plenty of sensible uses. A copywriter, for instance, might plug in a few tag lines and ask for alternatives guided by various parameters and prompts. It typically won't spit out something ready-baked you'd want to use, but it might spark an idea or new direction. And it'll do so in an instant, for little or no cost, and with no immediate risk of malign outcomes. Takes about 30 seconds, why wouldn't you?

Moreover, lots of people use it sensibly as a labour-saving device for writing. They'll plug in a list of factual content and ask for it to be strung together in some format or other. Provided you proofread the outcome - and you'll typically want to make adjustments anyway - there's little obvious downside. It's certainly a help for anyone for whom writing isn't a core skill or much of a pleasure.

Personally, I've tried to use it for my work but for various reasons it's not fit for purpose for now. But, actually, all it takes is a little common sense to use AI fruitfully. It's just a tool, for now. Much depends on correct usage.

Edited by f6box on Thursday 6th February 13:32

f6box

179 posts

9 months

Thursday 6th February
c3m said:
Juniors at my workplace would come to me with obviously wrong "facts" and be very puzzled when I told them they were absolutely wrong. Their response is usually "Oh, but AI told me this, how can it be wrong? They know everything!". Paradoxically, the higher the accuracy of LLMs, the more dangerous they become, as people start trusting them more - and in those 1%/2%/5% of cases where it really matters, they will make a catastrophic mistake. And worst of all, young people will not develop the skills to think critically but will just trust blindly.
That smacks of ye olde "sigh, young people, today". I mean, you give them digital calculators, now nobody can do maths. It's the end of civilisation. They'll be watching TV next.

Even cursory exposure to LLMs will reveal their capacity for error. The problem isn't the AI if the user ignores the obvious in that regard, it's the user. It's a new technology and no doubt we haven't worked out widely agreed best practice for using it yet. And the lazy will use it lazily. But then they'll use everything else lazily, too.

Maybe you need better juniors or maybe they just need a little training up on this matter just as they do on any number of matters.

ChrisW.

7,415 posts

267 months

Thursday 6th February
It's a tool for people who know their subject and can use AI material as a short-cut ... padding their own content and then correcting AI material with which they disagree.

My concern is that as a replacement for user knowledge it is currently incompetent, it also risks attaching fiction to fact ... making it dangerous in inexperienced hands.

Further, there are those who would re-write history ... Jan 6th 2021 was clearly when Trump was cheated of the Presidency by Pence ... Trump supporters were pardoned and the police sanctioned for their behaviour in the matter ?

Orangecurry

7,607 posts

218 months

Thursday 6th February
f6box said:
Takes about 30 seconds, why wouldn't you?
And therein lies the problem.

It's a slippery slope.

80sMatchbox

3,913 posts

188 months

Thursday 6th February

I was at a cars and coffee meet recently and got chatting to the owner of a Porsche specialist. He said he'd started using AI and didn't know anyone else who wasn't using it for business.

His reason for using it was legitimate to me. He said he was often accused of being a bit blunt when advising customers what work needed to be done to their cars. Now he writes what he usually does but asks AI to word it better.

He said it's been great for this and has had customers positively comment on his email correspondence with them.

Not relying on it for information, just for wording, seems very useful and pretty much foolproof right now.

BertBert

20,110 posts

223 months

Thursday 6th February
80sMatchbox said:
I was at a cars and coffee meet recently and got chatting to the owner of a Porsche specialist. He said he'd started using AI and didn't know anyone else who wasn't using it for business.

His reason for using it was legitimate to me. He said he was often accused of being a bit blunt when advising customers what work needed to be done to their cars. Now he writes what he usually does but asks AI to word it better.

He said it's been great for this and has had customers positively comment on his email correspondence with them.

Not relying on it for information, just for wording, seems very useful and pretty much foolproof right now.
It's great until he gets blasé and doesn't properly check what the AI has written - promising a load of things to be fixed for free.

I did a similar test to the OP and asked AI how to set the valve clearances on a Ford Cosworth BDA engine. Firstly, it got the wrong engine. I told it, and it got the right engine. Then it got completely the wrong method for setting the clearances (wibbling on about adjusters). I told it, and it got essentially the right method but with important details wrong. It's a bit like reading a newspaper report of an event that you have actual factual knowledge of: the newspaper has the right theme, but the facts are completely wrong. Same with AI LLMs.

FrancisA

Original Poster:

152 posts

21 months

Thursday 6th February
I have been reading the responses to my original post and I am relieved to see that so many people understand the limitations of LLMs. As has been mentioned, the key is using the AI tool intelligently to improve efficiency, as opposed to becoming dependent on it.

Unfortunately we live in a world where large portions of the populace appear to refuse to discern. Rather, they accept... the usual "I read it on Facebook" explanation.

Going forward we need to push back on absolute reliance on these tools and recognise their limitations. However, as someone pointed out, it is an uphill battle, as the tech companies have positioned AI as the must-have for the future and have bet their stock prices on it. It was with some amusement that I watched Nvidia lose $600 billion in market value last week as DeepSeek revealed their AI tool. I am not being anti-US, but anyone with a modicum of financial understanding could comprehend the valuation of OpenAI being based on a conditioned and limited result set.

gtsralph

1,269 posts

156 months

Thursday 6th February
GIGO