That is what they do.
Male, female, man, woman, boy, girl are sex categories, not gender categories; that is, they predate the very idea of gender as distinct from sex.
Sports categories never had anything to do with gender.
The other differences of sexual development are different sexes.
I have proclaimed RAG is dead many times, and I stand by it.
RAG is Dead! Long Live Agentic RAG! || Long Live putting stuff in databases where it damn well belongs!
I think you agree with the people saying RAG is Dead, or at least you agree with me and I say RAG is Dead, when you say "Simply using docling and transforming PDFs to markdown and have a vector database doing the rest is ridiculous."
I fully agree, but that was the promise of RAG: chunk your documents into little bits, find the bit that is closest to the user's query, and add it to the context, maybe leaving a little overlap on the chunks. That is how RAG was initially presented, and how many vendors implement RAG, looking at tools like Amazon Bedrock Knowledge Bases here.
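To make that recipe concrete, here is a minimal sketch of the "chunk, find the closest bit, stuff it in the context" approach. Everything in it is an assumption for illustration: the `embed` function is a toy bag-of-words stand-in rather than a real embedding model, and `chunk_text`, `retrieve`, and the sample document are made-up names and data, not any vendor's API.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=8, overlap=2):
    """Split text into windows of `size` words; each shares `overlap` words with the previous one."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks


def embed(text):
    """Toy stand-in for an embedding model (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks):
    """Return the single chunk most similar to the query."""
    q = embed(query)
    return max(chunks, key=lambda c: cosine(q, embed(c)))


# Made-up sample document, for illustration only.
DOC = (
    "Revenue in 2023 was 10 million dollars. "
    "Revenue in 2024 was 12 million dollars. "
    "Headcount grew to 40 people in 2024."
)

chunks = chunk_text(DOC)
context = retrieve("what was revenue in 2024", chunks)
prompt = f"Context: {context}\n\nQuestion: what was revenue in 2024"
```

Note that chunk boundaries ignore sentence boundaries, so a retrieved chunk can easily carry fragments about more than one year. Nothing in this pipeline knows which number is the current one.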
When I want to know the latest <important financial number>, I want that pulled from the source of truth for that data, not to hopefully get the latest, and not last year's, number from some document chunk.
So, when people, or at least when I, say RAG is Dead, it's shorthand for: this is really damn complex, and vector search doesn't replace decades of information theory, storage and retrieval patterns.
Hell, I've worked with teams trying to extract everything from databases to push it into vector stores so the LLM can use the data.
First, it often failed because they had chunks with multiple rows of data, and the LLM got confused as to which row actually mattered. They hadn't realized that the full chunk would be returned and not just the row they were interested in.
Second, the use cases being worked on by these teams were usually well defined, that is, the required data could be deterministically defined before going to the LLM and pulled from a database using a simple script, no similarity required, but that's not the cool way to do it.
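For contrast, a rough sketch of the boring way to do it, assuming a hypothetical `metrics` table with made-up rows: the script picks the exact row before the LLM is ever involved, so there is no similarity search and no multi-row chunk to confuse the model. The table layout, `latest_metric`, and the prompt wording are all illustrative assumptions.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (name TEXT, period TEXT, value REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [
        ("revenue", "2023", 10.0),
        ("revenue", "2024", 12.0),
        ("headcount", "2024", 40.0),
    ],
)


def latest_metric(conn, name):
    """Fetch exactly one row: the most recent period for the named metric."""
    return conn.execute(
        "SELECT period, value FROM metrics WHERE name = ? ORDER BY period DESC LIMIT 1",
        (name,),
    ).fetchone()


# The required data is defined deterministically, then handed to the model.
period, value = latest_metric(conn, "revenue")
prompt = f"The latest revenue figure ({period}) is {value}. Summarise it for the user."
```

The query returns one row from the source of truth, so last year's number cannot leak into the context by accident.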
I agree with you that simple vector search + context stuffing is dead as a method, but I think it's ridiculous to reserve the term "RAG" for just the earliest most basic implementation. The definition of Retrieval Augmented Generation is any method that tries to give the LLM relevant data dynamically as opposed to relying purely on it memorising training data, or giving it everything it could possibly need and relying on long context windows.
The RAG system you mentioned is just RAG done badly, but doing it properly doesn't require a fundamentally different technique.
> it's ridiculous to reserve the term "RAG" for just the earliest most basic implementation
Whether we like it or not, dumb semantic search became the colloquial definition of RAG.
And when you hear someone saying "we use RAG here" 95% of the time this is exactly what they mean.
When you inject the user's name into the system prompt, technically you're doing RAG - but nobody thinks about it that way. I think it's one of those cases where the colloquial definition is actually more useful than the formal one.
> doing it properly doesn't require a fundamentally different technique
Then what do you call RAG done well? You need a term for it.
> And when you hear someone saying "we use RAG here" 95% of the time this is exactly what they mean.
That's just Sturgeon's law in action. 95% of every implementation is crap. Back in the 90s, you might have heard "we use OOP here" and come to a similar conclusion, but that doesn't mean you need to invent a new word for doing OOP properly.
> But agentic RAG is fundamentally different.
From an implementation POV, absolutely not.
I've personally gradually converted a dumb semantic search to a more fully featured agentic RAG in small steps like these:
- Have a separate LLM call write the query instead of just using the user's message.
- Make the RAG search a synthetic injected tool call, instead of appending it to the system prompt.
- Improve the search endpoint by using an LLM to pre-process the data into structured chunks with hierarchical categories, tags, and possible search queries, embedding the search queries separately from the desired information (versus originally just having a raw blob).
- Have the LLM be able to search both with a semantic sentence, and a list of tags.
- Have the LLM view and navigate the hierarchy in a tree-like manner.
- Make the original LLM able to call the search on its own instead of being automatically injected using a separate query rewriting call, letting it search in multiple rounds and refine its own queries.
When did the system go from RAG to "not RAG"? Because fundamentally, all you need to do to make an agentic RAG is to have the LLM be able to write/rewrite its own search queries (possibly in multiple passes) as opposed to just passing the user's message(s) directly.
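The last step can be sketched roughly like this. It is only an illustration under loud assumptions: `fake_llm` is a scripted stand-in for a real model call, `search` is a hypothetical keyword lookup over a two-document toy corpus, and the ("search", query) / ("answer", text) protocol is invented for the example rather than taken from any framework.

```python
# Toy corpus, keyed by topic (made-up data).
DOCS = {
    "billing": "Invoices are issued monthly. Refund rules are covered in the refunds document.",
    "refunds": "Refunds are allowed within 30 days of purchase.",
}


def search(query):
    """Hypothetical search tool: return documents whose key appears as a word in the query."""
    terms = set(query.lower().split())
    return [text for key, text in DOCS.items() if key in terms]


def fake_llm(question, notes):
    """Scripted stand-in for a model call. Returns ("search", query) or ("answer", text)."""
    if not notes:
        return ("search", "billing")
    if not any("30 days" in note for note in notes):
        return ("search", "refunds")  # refine the query based on what came back
    return ("answer", "Refunds are allowed within 30 days of purchase.")


def agentic_rag(question, llm, max_rounds=4):
    """Let the model write its own queries over several rounds until it chooses to answer."""
    notes = []
    for _ in range(max_rounds):
        action, payload = llm(question, notes)
        if action == "answer":
            return payload, notes
        notes.extend(search(payload))  # the query came from the model, not the user
    return None, notes  # gave up after max_rounds


answer, notes = agentic_rag("Can I get a refund?", fake_llm)
```

Swapping `fake_llm` for a real model and `search` for the existing search endpoint is the only structural change; the loop itself stays the same.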
I like the audacity of parent poster that equates 95% of implementations he has seen with 95% of all there is. When it easily could have been 0.01% of all there is. World is much bigger than we think :)
>all you need to do to make an agentic RAG is to have the LLM be able to write/rewrite its own search queries (possibly in multiple passes)
I think this is a huge oversimplification, the term "search query" is doing a lot of heavy lifting here.
When Claude Code calls something like
find . -type d -maxdepth 3 -not -path '*/node_modules/*'
to understand the project hierarchy before doing any of the grep calls, I don't think it's fair to call it just a "search query", it's more like "analyze query". Just because text goes in and out in both cases, doesn't mean that it's all the same.
When you give the agent the ability to query the nature of the data (e.g. hierarchy), and not just the data itself, it means that you need to design your product around it. Agentic RAG has an entirely different implementation, product implications, cost, latency, and primarily, outcomes. I don't think it's useful to pretend that it's just a different flavor of the same thing, simply because at the end of the day it's just some text flying over the network.
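As a rough illustration of a tool that answers "what does the data look like" rather than "find me this text", here is a sketch of a directory-hierarchy tool in the spirit of that find call. `list_dirs` is a hypothetical helper, and it is not an exact equivalent: it prunes `node_modules` entirely, whereas the find command only excludes paths underneath it.

```python
import os
import tempfile


def list_dirs(root, max_depth=3, skip=("node_modules",)):
    """Return directories under root, at most max_depth levels deep, pruning skipped names."""
    found = []
    root = os.path.abspath(root)
    for current, dirnames, _ in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Editing dirnames in place controls which subdirectories os.walk descends into.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        if depth >= max_depth:
            dirnames[:] = []  # do not descend any further
        found.append(rel)
    return found


# Demo on a throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src", "app", "ui", "widgets"))
    os.makedirs(os.path.join(tmp, "node_modules", "left-pad"))
    tree = list_dirs(tmp)
```

The output describes the shape of the project, which the agent can then use to decide where to grep. That is a different kind of question from a similarity lookup.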
I don't think we should undersell that transformers and semantic search are really powerful information retrieval tools, and they are extremely potent for solving search problems. That being said, I think I agree with you that RAG is fundamentally just search, and the hype (like any hype) elides the fact that you still have to solve all of the normal, difficult search problems.
That non-crime hate incident goes on your criminal record and if you need an enhanced criminal records check, it will show up, and can be used to deny you employment. It's not just intimidation.
There are many to choose from; you can google for more. But here's what got Lucy Connolly a 31-month sentence:
"Mass deportations, now, set fire to all the fucking hotels full of the bastards for all I care, if that makes me a racist, so be it".
Racist maybe, although she doesn't seem to care about race.
Offensive, yeah, it seems that it could be interpreted as offensive, but that's not technically illegal (the high court has repeatedly affirmed the right to be offensive).
Inciting violence (the offense she was convicted of), no, not at all; she was stating her political opinion and her belief that the lives of immigrants are worth less than those of British children.
People will point out that she admitted guilt, but the threat of significant pre-trial imprisonment was used a lot at this time to force guilty pleas.
She called for hotels housing immigrants to be burned in the middle of a riot. Hotels suspected of housing immigrants were, in fact, burned during the course of that riot.
She clearly understood that her actions were wrong, and went on to try to cover her tracks and "play the mental health card".
This is a really poor example to use of censorship - there are very few countries in the world where this wouldn't have been against the law. Even the USA, with its famed first amendment rights, makes it unlawful to "organize, promote, encourage, participate in, or carry on a riot".
If your rights are contingent on circumstance, they're not rights.
I don't see anything there encouraging a riot. There is no call to action.
We should know this isn't enough to convict, since a Labour councillor who called for far-right activists' throats to be cut at an anti-racism rally [0], actually inciting violence, was cleared of wrongdoing.
From the article, you'll notice politicians calling out the situation:
Shadow home secretary Chris Philp said of the decision: "It is astonishing that this Labour councillor, who was caught on video calling for throats to be slit, is let off scot-free, whereas Lucy Connolly got 31 months prison for posting something no worse."
I agree, both should have been charged. Only one was. You could argue that the councillor committed the greater offence, as he/she is in a position of authority.
Thanks for the info. What disturbs me most is the polarization and increasing intolerance of different/opposing ideas and opinions. I'm referring to "slit their throats" kinds of reactions and "set [it] on fire". There's no "let's agree to disagree and meet halfway". No compromise. That's seen as weak.
That's the position I came to based on these rulings, or lack thereof.
I think the strongest of all the reasons open source shouldn't accept AI-created code is that it can't be protected, and that has the potential to threaten the whole project.
OpenClaw, for instance, has an MIT license [0], but, per the creator's own words, they didn't even review the code. OpenClaw isn't really MIT licensed: the MIT license relies on copyright, and because there was not even human review of the majority of the code, and so no substantial human input, that code base can't be copyrighted.
No need to steal AI code, it doesn't belong to anyone.
In this case isn't it more that:
Every sculpture that is made, every picture drawn, every bed left unmade is, in the final sense, a theft from those who hunger and are not fed, those who are cold and are not clothed.
From where I'm sitting, this is theft. It's forced wealth redistribution, from people that are potentially already struggling, to people that choose to slum it as artists. It's not even means tested; this really will result in money transferring from those on the edge of poverty to rich art school kids.
There are currently 16,000 homeless / at-risk people in Ireland, including 5,000 children [0]. I can think of at least one better use for that money.
Yeah, it's almost as if the knives aren't the problem. The gang members will use whatever gives them an advantage: guns, knives, acid, bats, bricks. We can't ban everything; we should possibly tackle the cause instead of the symptom...
But don't worry, in the meantime they're coming for our regular knives.
The BBC has already rolled out Idris Elba to explain that kitchen knives shouldn't have points [0]. Yes, this has been picked up by politicians, with the minister for policing at the time calling it an interesting idea [1].
No, and the blades created would, because of the methods used, likely not be covered by the legislation anyway. There's a carve-out for antiques and weapons made using traditional methods (now define traditional methods, because the law doesn't, but hammer and anvil would seem to be the most obvious traditional approach).
However, in practice the police continually take and often destroy legally owned antiques claiming they are zombie swords.
The law is written in such a way that the police can take anything, and you have to prove to a judge that it isn't illegal.
Members of a team create a report explaining the state of their small section of the business, usually a 2x2 grid of boxes to fill.
This is then reviewed, usually in an in person meeting that requires full team participation.
These are joined together to create a weekly business review, that will require another meeting to review.
Each month the WBRs are combined to create the monthly business review, with a massive meeting requiring participation by multiple teams.
The pyramid of documents and meetings continues all the way up to the CEO.
I should probably point out, none of this information is unavailable at any level; it's copied and pasted from system to 2x2, then copied from doc to doc. It's a spectacle that needs to be seen to be believed.
And that's just the reporting; planning is another exercise in multiple report writing that I'll save for another day. But hopefully you get the idea.
Amazon is 90% internal document writing and 70% work (9-5 does not really exist; it could, it just doesn't).
It's essentially a massive jobs program for middle management that aren't capable enough to join the TSA and that's being unfair to the TSA.
The only reason I can think for the existence of the reporting is to give managers something to do between pipping staff.
I'm curious, how do you think other large companies operate with regards to reporting progress/status/results up the management chain?
At least at companies where the upper management is aware enough of the details to make good judgements, and the business is critical enough for some reason that low level management can't just be entrusted to yeet/yolo-things into production?
His way with words and his way of highlighting the absurdity of situations are first class.
My favorite is the Celestial Emporium of Benevolent Knowledge. It's a critique of the classification used by the Institute of Bibliography which he considered nonsensical. He claims to have found the list in an ancient Chinese encyclopaedia:
It's such a wonderful thing to be reminded of how silly it is to take language seriously. IMO it's prickles and goo[1] all the way down - and the prickles help us share meaning and exchange information, but there is no project of exactitude to be completed.
The hubris it takes to maintain the view that we can just keep figuring things out if we are rational enough is also sometimes overwhelming to me. It's not that we can't understand things better through analysis, just that it sometimes seems foolish to me to try to get all of it through system-2 type behavior. We will always miss something crucial[2].
An algorithm written in a well-specified language with precise semantics might have bugs. A "logical" argument made with natural language is orders of magnitude less precise.
What I've always wondered, though, is whether that lack of precision is what allows for meaning to arise in the first place. In the gap between language and - this - .