Using AI to automate business tasks
Discussion
I regularly utilise GPT for reviewing and amending various written materials, including extensive emails, to optimise my administrative activities. I'm delving into customised GPTs that incorporate specialised knowledge bases such as documents or PDFs. To date, we have incorporated manuals into our system, providing users with swift access to information. However, I've noticed inconsistencies in accuracy and instances where it seems to (sometimes) disregard the appended data or correctly added instructions, making things up or ignoring the information supplied.
Has anyone tried combining GPT with knowledge bases, specifically for applications like customer service (e.g. Gmail) or instructional material? I’m keen to learn more, particularly about any API integrations that work well with AI helping out.
Thanks
BGARK said:
I've noticed inconsistencies in accuracy and instances where it seems to (sometimes) disregard the appended data or correctly added instructions, making things up or ignoring the information supplied.
This is the big problem with LLMs/GPT. They aren’t capable of reliably telling the truth yet, and once you understand what the algorithm behind them does, it’s clear to see why this is the case.
They are great at producing plausible prose but not yet at getting it 100% correct.
I personally wouldn’t be rushing into integrating them with my own knowledge base / documentation until I could see something that changes this.
I treat these systems like an intern fresh out of college. They'll do the work with enthusiasm (well, most do) but they will give you sub-par work, i.e. it will be 80% correct, but this means you need to check 100% of it. If you want a "starter for 10" then that's great (and this is what I do) and it certainly helps with the process of creating something. I guess it depends on how perfect you need the work to be!
fat80b said:
This is the big problem with LLMs/GPT.
They aren’t capable of reliably telling the truth yet, and once you understand what the algorithm behind them does, it’s clear to see why this is the case.
They are great at producing plausible prose but not yet at getting it 100% correct.
I personally wouldn’t be rushing into integrating them with my own knowledge base / documentation until I could see something that changes this.
They are pretty good if you can spend the time tuning them and only make them focus on one thing at a time.
I've put them into production (not GPT). You do have to accept that there will be errors and decide if the error rate and severity are acceptable for the use case.
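One way to make that decision concrete is to hand-check a sample of outputs and tally the errors by severity. Below is a minimal Python sketch of that idea. It assumes each reviewed output is recorded as "ok", "minor" or "major"; the labels, the helper names and the thresholds are all hypothetical and would need choosing for your own use case.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical limits: the largest share of reviewed outputs allowed
# at each severity level. Pick values that suit your own use case.
LIMITS: Dict[str, float] = {"minor": 0.10, "major": 0.01}


def error_rates(reviews: List[str]) -> Dict[str, float]:
    """Share of reviewed outputs at each severity level listed in LIMITS."""
    counts = Counter(reviews)
    return {level: counts[level] / len(reviews) for level in LIMITS}


def acceptable(reviews: List[str]) -> bool:
    """True only if every severity level is within its limit."""
    rates = error_rates(reviews)
    return all(rates[level] <= LIMITS[level] for level in LIMITS)


if __name__ == "__main__":
    # Made-up review results for 100 sampled outputs.
    sample = ["ok"] * 93 + ["minor"] * 6 + ["major"] * 1
    print(error_rates(sample), acceptable(sample))
```

Keeping the severities separate, rather than using one overall percentage, means a handful of serious mistakes cannot hide behind a low overall error rate.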
fat80b said:
BGARK said:
I've noticed inconsistencies in accuracy and instances where it seems to (sometimes) disregard the appended data or correctly added instructions, making things up or ignoring the information supplied.
This is the big problem with LLMs/GPT. They aren’t capable of reliably telling the truth yet, and once you understand what the algorithm behind them does, it’s clear to see why this is the case.
They are great at producing plausible prose but not yet at getting it 100% correct.
I personally wouldn’t be rushing into integrating them with my own knowledge base / documentation until I could see something that changes this.
Hoofy said:
I treat these systems like an intern fresh out of college. They'll do the work with enthusiasm (well, most do) but they will give you sub-par work, i.e. it will be 80% correct, but this means you need to check 100% of it. If you want a "starter for 10" then that's great (and this is what I do) and it certainly helps with the process of creating something. I guess it depends on how perfect you need the work to be!
Agreed, and they are improving rapidly; it's already proven that for general written English these outperform the majority of humans.
Give it another 6 months...
Claude just overtook GPT:
https://youtu.be/jnUhpLAuaBA?si=2nM8cNeEmVELhN8W
https://www.anthropic.com/news/claude-3-family
BGARK said:
Ok thanks, what specifically do you automate, any examples?
Zapier pulls data from an old web-based system which doesn't offer an API. Power Automate shifts data and spreadsheets around.
I am not keen on either of these personally. They are dirty tools for dirty jobs, but sometimes it's better to use these than to take on a big, expensive development project.
BGARK said:
Agreed, and they are improving rapidly; it's already proven that for general written English these outperform the majority of humans.
Give it another 6 months...
Claude just overtook GPT:
https://youtu.be/jnUhpLAuaBA?si=2nM8cNeEmVELhN8W
https://www.anthropic.com/news/claude-3-family
Claude 3 is only better than GPT-4 if you use the most basic prompts or you can use the API to control the output. The numerical benchmarks are a bit of a waste of time because they only apply to the specific cases that are being measured. Some developers will tune their models specifically to those tests to get attention, whereas the better ones will develop a much better all-round model. So the only way to judge them is to try a few similar tests that meet your requirements (in other words, create your own benchmarks for your own purposes).
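As a rough illustration of building your own benchmark, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `score_models` helper, the test cases and the stand-in "models" are made up, and in practice each callable would wrap whichever API or tool you are comparing.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical test cases: (prompt, phrase the answer must contain).
# Replace these with questions taken from your own manuals or emails.
CASES: List[Tuple[str, str]] = [
    ("What is the returns window?", "30 days"),
    ("Which form covers warranty claims?", "form w2"),
]


def score_models(
    models: Dict[str, Callable[[str], str]],
    cases: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the fraction of cases each model answers acceptably."""
    scores = {}
    for name, ask in models.items():
        passed = sum(
            1 for prompt, expected in cases
            if expected.lower() in ask(prompt).lower()
        )
        scores[name] = passed / len(cases)
    return scores


# Stand-in "models" so the sketch runs without any API key.
def model_a(prompt: str) -> str:
    if "returns" in prompt:
        return "Returns are accepted within 30 days."
    return "Use Form W2."


def model_b(prompt: str) -> str:
    return "Please contact support."


if __name__ == "__main__":
    print(score_models({"model_a": model_a, "model_b": model_b}, CASES))
```

A substring check is a crude pass/fail test, but even a dozen cases drawn from your real workload will tell you more about fit than a published leaderboard.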
BGARK said:
Hoofy said:
I treat these systems like an intern fresh out of college. They'll do the work with enthusiasm (well, most do) but they will give you sub-par work, i.e. it will be 80% correct, but this means you need to check 100% of it. If you want a "starter for 10" then that's great (and this is what I do) and it certainly helps with the process of creating something. I guess it depends on how perfect you need the work to be!
Agreed, and they are improving rapidly; it's already proven that for general written English these outperform the majority of humans.
Give it another 6 months...
Claude just overtook GPT:
https://youtu.be/jnUhpLAuaBA?si=2nM8cNeEmVELhN8W
https://www.anthropic.com/news/claude-3-family
Have you found Claude to be better? I've heard it is but I can't tell either way.