When history is cached conversations tend not to be slower, because the LLM can 'continue' from a previous state.
So if there was already A + A1 + B + B1 + C + C1 and you asking 'D' ... well, [A->C1] is saved as state. It costs 10ms to prepare. Then, they add 'D' as your question and that will be done 'all tokens at once' in bulk - which is fast.
Then - they they generate D1 (the response) they have to do it one token at a time, which is slow. Each token has to be processed separately.
Also - even if they had to redo- all of [A->C1] 'from scratch' - its not that slow, because the entire block of tokens can be processed in one pass.
'prefill' (aka A->C1) is fast, which by the way is why it's 10x cheaper.
So prefill is 10x faster than generation, and cache is 10x cheaper than prefill as a very general rule of thumb.
Prefill is 10x faster than generation without caching, and 100x faster with caching - as a very crude measure. So it's not a matter of 'only the case'. Those are different scenarios. Some hosts are better than others with respect to managing caching, but the better one's provide decent SLA on that.
Were your table is stored shouldn't matter that much if you have proper indezes which you need and if you change anything, your db is rebuilding the indezes anyway
1) no one cares if it works. No one cared before how your code looked as long as you are not a known and well used opensource project.
2) there are plenty of services which do not require state or login and can't be hacked. So still plenty of use cases you can explore. But yes i do agree that Security for production live things are still the biggest worry. But lets be honest, if you do not have a real security person on your team, the shit outthere is not secure anyway. Small companies do not know how to build securely.
> 1) no one cares if it works. No one cared before how your code looked as long as you are not a known and well used opensource project.
Forgive me if this is overly blunt, but this is such a novice/junior mindset. There are many real world examples of things that "worked" but absolutely should not have, and when it blows up, can easily take out an entire company. Unprotected/unrestricted firebase keys living in the client are all the rage right now, yea they "work"until someone notices "hey, I technically have read/write god mode access to their entire prod DB", and then all of a sudden it definitely doesn't work and you've possibly opened yourself to a huge array of legal problems.
The more regulated the industry and the more sensitive the business data, the worse this is exacerbated. Even worse if you're completely oblivious to the possibility of these kinds of things.
> Forgive me if this is overly blunt, but this is such a novice/junior mindset.
Unfortunately the reality is there are far more applications written (not just today but for many years now) by developer teams that will include a dozen dependencies with zero code review because feature XYZ will get done in a few days instead of a few weeks.
And yes, that often comes back to bite the team (mostly in terms of maintenance burden down the road, leading to another full rebuild), but it usually doesn't affect the programmers who are making the decisions, or the project managers who ship the first version.
Its still not a Hype, its still crazy what is possible today and we still have no clear at all if this progress continues as it does or not with the implication, that if it continues, it has major implications.
My wife, who has no clue about coding at all, chatgpted a very basic android app only with guidance of chatgpt. She would never ever been able to do this in 5 hours or so without my guidance. I DID NOT HELP HER at all.
I'm 'vibecoding' stuff small stuff for sure, non critical things for sure but lets be honest, i'm transforming a handfull of sentences and requirements into real working code, today.
Gemini 3 and Claude Opus 4.5 def feel better than their prevous versions.
Do they still fail? Yeah for sure but thats not the point.
The industry continues to progress on every single aspect of this: Tooling like claude CLI, Gemini CLI, Intellij integration, etc., Context length, compute, inferencing time, quality, depth of thinking etc. there is no current plateau visible at all.
And its not just LLMs, its the whole ecosystem of Machine Learning stuff: Highhly efficient weather model from google, Alpha fold, AlphaZero, Roboticsmovement, Environment detection, Image segmentation, ...
And the power of claude for example, you will only get with learning how to use it. Like telling it your coding style, your expectations regarding tests etc. We often assume, that an LLM should just be the magic work collegue 10x programmer but its everything an dnothing. If you don't communicate well enough it is not helpful.
And LLMs are not just good in coding, its great in reformulating emails, analysing error messages, writing basic SVG files, explaining kubernetes cluster status, being a friend for some people (see character.ai), explaining research paper, finding research, summarizing text, the list is way to long.
Alone 2026 there will go so many new datacenters live which will add so much more compute again, that the research will continue to be faster and more efficient.
There is also no current bubble to burst, Google fights against Microsoft, Antrophic and co. while on a global level USA competets with China and the EU on this technology. The richest companies on the planet are investing in this tech and they did not do this with bitcoins because they understod that bitcoin is stupid. But AI is not stupid.
Or Machine learing is not stupid.
Do not underestimate the current status of AI tools we have, do not underestimate the speed, continues progress and potential exponential growth of this.
My timespan expecation for obvious advancments in AI is 5-15 years. Experts in this field predict already 2027/2030.
But to iterate over this: a few years ago no one would have had a good idea how we could transform basic text into complex code in such a robust way, which such diverse input (different language, missing specs, ...) . No one. Even 'just generating a website'.
> My wife, who has no clue about coding at all, chatgpted a very basic android app only with guidance of chatgpt. She would never ever been able to do this in 5 hours or so without my guidance. I DID NOT HELP HER at all.
I know what app builders are but she talked to a computer system, told it what she wanted and it WALKED HER through all the steps and i think she used android studio.
She did all of this in a few hours and doesn't work with computers.
Literally all much of the business world needed was a slightly more capable VB but now we have a bunch of crappy, web browser based SaaS platforms instead.
I think it really depends how a person judges the progress from chatgpt 3.5, 3 years ago to Opus 4.5.
In one light it is super impressive and amazing progress, in another light it is not impressive at all and totally over hyped.
Using the Hubert Dreyfus analogy. It is impressive if the goal is to climb as high as we can up giant tree. The height we have reached isn't impressive at all though if we are trying to climb the tree to get to the moon.
Even if we assume for a moment everything you are saying is true and/or reasonable, can't you see how comments like these paint your position here in a bad light? It just reads a little desperate!
It might be just different viewpoints people don't understand?
I'm advocating for spending time with AI because it works already good enough and it continues to progress surprisingly fast. Unexperienced fast for me tbh.
If i say "AI is great" i also know when AI is also stupid but i'm already/stil so impressed that i can transform basic text into working go/java whatever code, that i accept that its not perfect just because I highly appreciate how fast we got this.
And it feels weird too tbh. It doesn't feel special to talk to an LLM and get code back somehow while this was unthinkable just a few years back.
Somethimes it likes you just forget about all these facts and have to remind yourself that this is something new.
Just because it’s “new” doesn’t mean it’s going to fulfill all our wildest fantasies. It’s becoming clear that these things are just… tools. Useful in certain situations, but ultimately not the game-changer they’re sold as.
For me its getting clear that there is so much broad ways of continues improvement, and constant speed, that if this continues like this, even a basic LLM can do a lot of jobs.
Just today cursor released a blog were they run an agent for a week and it build a browser.
Claude Code with subagents and hooks are really good too.
And it takes time to just get it and roll it out to everyone, it takes time to do research, experiments, it takes time to install GPUs and make them etc.
We are currently only limited by things we can progress on.
Journalists are the backbone of a healthy democracy.
FU USA FU
And just to be clear: The biggest military force of the world threatens denmark, scrambles the economy around the world due to sudden politic changes (tarifs) and destroys its own integrity as an ally
We're just powerless to do anything, as the (probably) legally elected administration runs this ship like its own personal party barge into ... everything in sight.
Our Checks refuse to speak up in Congress, and our Balances keep voting to make the (current) POTUS immune to the law.
Frankly, both parties feel like the "elected administration runs this ship like its own personal party barge" when they're out of power.
If you don't like that, the only solution is to push for limited government next time you're in power.
Whatever power you put into the hands of the government is guaranteed to fall into your enemy's hands some day. This is a deliberate design feature of the US political system. It's the only way to get people to wake up for the need to limit government power.
A good start would be ending selective prosecution by restoring the original role of grand juries: to decide whether or not to hire a contract prosecutor for a single case. Public Prosecutors can be just like Public Defenders -- contractors of the court, with no discretionary powers.
> Whatever power you put into the hands of the government is guaranteed to fall into your enemy's hands some day.
Only if there's a functioning system of checks and balances. Unfortunately, there is not. This Court is willing to use motivated reasoning to achieve its preferred outcomes; to slow-walking favorable rulings for Democrats while expediting favorable rulings for Republicans (often without explanation via the "shadow docket"); and to throw out decades of precedent in the process by ignoring stare decisis, a bedrock legal principle which ensures stability of the judicial process.
Just to give an example, consider the ban on universal (national) injunctions. One might be surprised to learn that it was the Biden administration that initially petitioned the Court for the ban. However, the Court found such a ban unnecessary then (i.e., when lower Courts were blocking the Biden administration's agenda), but conveniently found it necessary during the second Trump administration (when lower courts started blocking the Trump administration's agenda). And just as another kick in the balls, they used the birthright citizen case as a vehicle to bring the matter to Court, strengthening the President without even deigning to address the Trump administration's obviously illegal executive order.
The result of this mess is that, if the Trump administration is eventually voted out, it is highly unlikely that an incoming Democratic administration would be able to capitalize on the expansion of executive powers that this Court has given to this President. We see a similar situation in Poland. After ~a decade in power the Law & Justice party was voted out, but the new coalition government has not inherited the same ability to government, with its agenda constantly curtailed by Law & Justice appointees embedded throughout the government (including the highest court).
Trump doesn't take the normal route as any other president did.
Of course with his second term, at least people can't complain how he interacts with your ex allies like us germany. Thats fair to do, shitty and short viewed but hey.
But certain things like his fraud coins etc. this is bluntly illegal and he did not do this shit in his first term.
Not really. In the West, they are just parrots for the wealthy.
Also the US has always kind of been the biggest threat to world peace, with the exception of Nazi Germany. It only feels outrageous now because White people are being attacked too instead of the usual targets
I always say "most companies today are IT companies, they just don't know it".
I would argue that Disney is definitily a big IT company and relies a lot on that tech.
Perfect storytelling might need flaweless execution to not distract. cirque du soleil for example are also experts in every single aspect relevant to their show/business. Check out the YT video from their sound manager "
Inside the Sound of Cirque du Soleil: Drawn to Life" this is so crazy but it explains so much especially how they control the audiance clapping.
For me, being an "IT Company" or "Tech Company" means the tech is what drives the business decisions.
Where I work, it's a b2b service company. We've had CIOs get up and say we're a tech company, but when push comes to shove, the IT org always loses to "the business". The business solutions are what are being sold, they really don't care what the tech is under the hood... even if the tech enables every product to exist at this point.
This. In the same way that Pre 2000s Boeing was an Engineering company, driven by making a good product 'the right way'. Rather than a business company that happens to rely on engineering.
When I worked at an insurance company I heard this all the time, that we were “an IT company selling insurance”. They had 3k IT staff, more staff than any other department.
We were also still operating T1s, lotus notes, windows xp in 2014. So I always took it with a grain of salt.
Context is limited.
You do not want the cloud provider running a context compaction if you can control it a lot better.
There are even tips on when to ask the question like "send first the content then ask the question" vs. "ask the question then send the content"