u/JackStrawWitchita

While I understand the use case for most RAQ systems is to allow LLMs to intelligently interrogate existing data/documents, but we can also see that's where the common problems occur.

I'm an old school IT guy and, back in the day, we always used the term 'garbage in, garbage out' when talking about systems. And from years of experience, it's nearly always crappy data that causes problems, not the solution itself.

So when I talk to clients about new systems, I immediately start talking about accuracy of retrieval. This is when I hit them with the 'garbage in, garbage out' talk and include how AI isn't a magic bullet to improve data accuracy. I start talking to them about how to spend considerable effort completely re-doing the data they want to interrogate, explaining how this effort will pay off in accuracy of retrieval. In one case, we started out with a blank spreadsheet where the client started adding in the data they wanted to interrogate as text organised into chunks.

This transparency helps the client understand the challenges. It also gives the client ownership of their data. Plus the exercise of transforming their old datastores into something designed for AI helps the client become more familiar with their own data, plus the 'cleaned data' is a new business asset to be used in other facets of the business.

And, it makes developing a RAG system much easier, tweakable, and reliable.

But I don't hear many people talking about challenging the client to clean their data. The emphasis seems to be on making the RAG jump through hoops (badly) to deal with crappy data. Am I just lucky to find amenable clients interested in clean data?

A sharecropper and his wife seated with their nine children inside their home in Laurel, Mississippi, 1939. Small American flags hang above the window behind them.

Am I alone in telling my RAG clients to re-do their data from scratch?

This is why stutterers who constantly berate themselves for their stutter live in misery. But you can change your thinking to be happy, despite your stutter, if you change your internal self-talk.

England must harvest rainfall and take action on water usage, Lords warn | Water

Prisoner numbers at record high despite 600 releases in six months

Youngest of seven kids on a bed in Iron City Michigan, 1937

Record numbers of UK renters crowdfunding to cover bills | Social exclusion

On the streets of Ashton hours after it was thrust into a political firestorm

Paris 1934, photo by Fred Stein

Starmer loves Palantir

Led Zeppelin's Robert Plant backs 'Stairway to Heaven' bid by Welsh market town

Britain to lose 163,000 jobs amid Iran war fallout

High street giant in place since 1982 closing 190 shops

Jan., 1948: Sunroof @ the Senator Hotel, Atlantic City, N.J.

Taking the good with the bad...

Children of May Avenue (Okla.) camp family, July, 1939

It's great how even on the busiest trains people step aside to let a little kid push the 'door open' button.