I built a product solving this problem about a year ago, basically a serverless, container-based, NATed VScode where you can eg "run Claude Code" (or this) in your browser on a remote container.
There's a reason I basically stopped marketing it: Cursor took off so much around then, and now people are running Claude/Codex locally. First, this is something people only actually start to care about once they've been bitten by it hard enough to remember how much it hurt, and most people haven't got there yet (but it will happen more as the models get better).
Also, the people who simultaneously care a lot about security and systems work AND are AI enthusiasts AND generally highly capable are potentially building in the space, but not really customers. The people who care a lot about security and systems work aren't generally decision makers or enthusiastic adopters of AI products (only just now are they starting to do so), and the people who are super enthusiastic about AI generally aren't interested in spending a lot of time on security stuff. To the extent they do care about security, they want it to Just Work and let them keep building super fast. The people who are decision makers but less on the security/AI trains need to see this happen more, and hear about the problem from other executives, before they're willing to spend on it.
To the extent most people actually care about this, they still want things to Just Work like they do now and either keep building super fast or not think about AI at all. It's actually extremely difficult to give granular access to agents because the entire point is them acting autonomously or keeping you in a flow state. You either need a threat model that's really compatible with doing so (eg open source work, developer credentials only used for development and kept separate from production/corp/customer data), spend a lot of time setting things up so that agents can work within your constraints (which also requires a willingness to commit serious amounts of time or resources to security, and an understanding of it), or spend a lot of time approving things and nannying it.
So right now everybody is just saying, fuck it, I trust Anthropic or Microsoft or OpenAI or Cursor enough to just take my chances with them. And people who care about security are of course appalled at the idea of just giving another company full filesystem access and developer credentials in enterprises where the lack of development velocity and high process/overhead culture was actually of load-bearing importance. But really it's just that secure agentic development requires significant upfront investment in changing the way developers work, which nobody is willing to pay for yet, and has no perfect solutions yet. Dev containers were always a good idea and not that much adopted either, btw.
Actually providing good permissions/security for agent development environments still takes a lot more investment too, which even the big companies are still working on. And I am still working on it as well. There's just not that much demand for it, but I think it's close.
> In reality you could host the entire thing for well under $50k/year in hardware
I could pay Anthropic $400 to write more code than you have in your entire lifetime.
Sure, you're able to operate a website acting as essentially the most important and highest volume venue for sharing academic research in the world, but come on, why couldn't I just ask Claude Code or some web developer in a foreign country to do the same thing?
Every time I try to build a castle in my swamp, it gets to a certain height and then it just sinks?
STOP telling me about civil engineering, we fucking invented that shit. And NO, we have to build it in the swamp, it feeds us and keeps us safe, and I'm darned proud to say we invented that too.
Their 40 year old boss will be younger than many of the 20 something, 30 something, 40 something entrepreneurs who already, now, at this moment (me included) would find the idea of moving to Active Directory and stocking the company with Wintel laptops equally farcical.
> A startup may be all-Mac. Eventually if it's successful though, it'll be too big to use 'consumer' tools only... likewise they have no realistic counterpart for Active Directory, nor business email or collaboration (at least Google has that though).
Between the two, they have those needs pretty much completely covered (also, Apple does have increasingly good support for MDM now). To me this reads more as a complaint that neither of them is trying to execute the same bundling/business model as Microsoft, or selling the same kind of security model as what makes sense for an old school IT shop that literally could never leave Microsoft products if it even wanted to.
Every single mobile device in "Enterprise" is using MDM provided and supported by those two companies for business users at multiple layers of the stack required to provide that functionality, they just don't make a business out of selling it directly as a Serious Enterprise Product to IT departments (the least important part of the stack, ie where a guy in a collared shirt with a web app takes a middle manager out for a steak dinner).
I set up MDM for the first time while standing in line for a flight at the airport, on my iphone and for my iphone. My company uses an enterprise IdP with a zero trust security model, which I saw executed firsthand by both Microsoft and Google for their own companies, neither of which made a fuss about giving me a mac device to work with. Somehow, it worked.
So far it's better for equivalent Qwen 3.5 workloads, and much less expensive. As you mention Qwen spends way too much time/tokens reasoning, so it ends up being more expensive than you'd think based on its model card (also IME, flaky).
I actually think this model is a Big Deal because there's a whole world out there of people building on top of Qwen and other Chinese models, and now Mistral has just released one of the best generalist FOSS models in its price/size range at an excellent price ($0.60/1M output is a steal). Mistral could potentially grab a lot of that.
Personally I am going to build off of it and invest in their ecosystem now, with this model, because it's definitely worth paying for at the current price. Whether Mistral or some other venture comes out with the next big thing in that category is anybody's guess, but now that labs are starting to converge on more rapid release cycles, I'm hoping Mistral won't be far behind.
The main thing for me though is that for small model use cases, it just doesn't make sense to pay a ton for Haiku/Gemini and other expensive small models that you can't self-host or finetune or generally build upon. They cost too much and can't be tinkered with. Also, the cases in which you'd want the incrementally better performance of something like Haiku over Mistral, but not enough to think about the benefits of tuning or self-hosting inference, are few for me. But at the same time, if you're going to invest in building on top of someone else's product, you want them to be trustworthy, long-term partners.
Very cool! I’ve been flirting with the idea of biting the bullet and moving more towards language extensions around protobuf/grpc vs just tools so it’s really great to see projects on the other side of that kind of decision shipping and what choices they made
Why the square brackets in particular? Notation is such an annoying part of this stuff, I’m actually leaning towards pushing a lot of structure to the filesystem
The people who pay for operating systems are paying for a private entity to decide what the operating system should do. They're paying for someone to compile it from source and get it to run on their computer and maintain it.
That's the whole point. Paying someone for that thing you also know how to do so they can consider that problem solved and focus on the things they know how to do.
The difference between a 10% agent and a 30-60% subcontractor is what's being purchased, and from whom. Actors and other famous creatives are selling their particular work, which is unique and demanded by clients mostly independently of details like who their agent is. When a client pays 2x to an agency that pays the subcontractor implementing the work 1x to complete it, what's being purchased is the agency's work - working directly with the client, finding developers to complete the work, and managing the process of delivery (and all the related bits: making sure their subcontractors know what they're doing and are appropriate fits for the project, keeping work on track, being accountable for delivery/operational execution to the client).
If that extra 20-50% were so easy/useless that it can be grabbed "without lifting a finger", why aren't you finding enough work on your own to keep yourself busy, or, why are you still working with that third-party company to begin with? Oh, you would, if you "had any interest" in doing that. That level of accountability to the client and attention to their needs is literally what clients are paying the agency for, and why they're the ones handling the demand for work rather than their subcontractors.
If clients aren't seeking out your particular involvement in their project, you're the guy working the mic, not the movie star.
> making sure their subcontractors know what they're doing and are appropriate fits for the project, keeping work on track, ...
I think this is where the reality falls apart. Often agencies are just skimming their percentage without adding real value to the project.
I.e. all those scenarios where a consulting company requires a PM and BA be billed, but the dev+customer are doing 99% of the communication and work directly
> being accountable for delivery/operational execution to the client
This is actually what most VPs are paying for: being able to pick up a phone and chew IBM GCS, TCS, CG, etc. out when the schedule slips.
In my experience that’s the opposite: the sweatshops are extracting the most.
I remember early in my career learning from the client I had been posted at for 2 years that they were paying 6 times my daily rate for my expertise. I was furious, to say the least; the next day I demanded a 3x raise or threatened to quit, which they had no choice but to grant. 15 years later, I won't allow these egregious amounts of markup anymore, but 2x is routine and I have to swallow it if I want to get any work at all.
Let me be clear: these companies have no role in the day-to-day, just introducing the contractor and just sitting back collecting money with no effort at all. Absolutely disgraceful.
> these companies have no role in the day-to-day, just introducing the contractor and just sitting back collecting money with no effort at all
The company is also responsible for the job getting done if you get hit by a bus or say "Fuck it" and disappear.
But in long-term customer contracts with the same personnel, there should absolutely be a decrease in the gap between billable rate and salary.
If the customer keeps someone around that long: they're working with that person, not the company.
The other legitimate reason for markup (on both sides) is optionality around immediate, arbitrary firing. Even in US "at-will employment" states, firing can be messy. Severing contracts is much less messy/quicker.
So some of the premium is for being able to treat contractors like contracts rather than employees/people.
Having worked on Cloud Run/Cloud Functions, I think almost every company that isn't itself a cloud provider could be in category 1, with moderately more featureful implementations that actually competed with K8s.
Kubernetes is a huge problem. IMO it's a shitty prototype that industry ran away with (because Google tried to throw a wrench at Docker/AWS when containers and cloud were the hot new things, pretending Kubernetes is basically the same as Borg), then the community calcified around the prototype state and bought all this SAAS/structured their production environments around it. Now all these SAAS providers and Platform Engineers/DevOps people who make a living off of milking money out of Kubernetes users are guarding their gold mines.
Part of the K8s marketing push was rebranding Infrastructure Engineering = building atop Kubernetes (vs operating at the layers at and beneath it), and K8s leaks abstractions/exposes an enormous configuration surface area, so you just get K8s But More Configuration/Leaks. Also, You Need A Platform, so do Platform Engineering too, for your totally unique use case of connecting git to CI to slackbot/email/2FA to our release scripts.
At my new company we're working on fixing this but it'll probably be 1-2 more years until we can open source it (mostly because it's not generalized enough yet and I don't want to make the same mistake as Kubernetes. But we will open source it). The problem is mostly multitenancy, better primitives, modeling the whole user story in the platform itself, and getting rid of false dichotomies/bad abstractions regarding scaling and state (including the entire control plane). Also, more official tooling, and you have to put on a dunce cap if YAML gets within 2 network hops of any zone.
In your example, I think
1. you shouldn't have to think about scaling and provisioning at this level of granularity, it should always be at the multitenant zonal level, this is one of the cardinal sins Kubernetes made that Borg handled much better
2. YAML is indeed garbage, but availability reporting and alerting need better official support; it doesn't make sense for every ecommerce shop and bank to be building this stuff
3. a huge amount of alerts and configs could actually be expressed in business logic if cloud platforms exposed synchronous/real-time billing with the scaling speed of Cloud Run.
If you think about it, so so so many problems devops teams deal with are literally just
1. We need to be able to handle scaling events
2. We need to control costs
3. Sometimes these conflict and we struggle to translate between the two.
4. Nobody lets me set hard billing limits/enforcement at the platform level.
(I implemented enforcement for something close to this for Run/Appengine/Functions, it truly is a very difficult problem, but I do think it's possible. Real time usage->billing->balance debits was one of the first things we implemented on our platform).
5. For some reason scaling and provisioning are different things (partly because the cloud provider is slow, partly because Kubernetes is single-tenant)
6. Our ops team's job is to translate between business logic and resource logic, and half our alerts are basically asking a human to manually make some cost/scaling analysis or tradeoff, because we can't automate that, because the underlying resource model/platform makes it impossible.
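To make point 4 concrete, here's a toy sketch (hypothetical names, nothing like our actual implementation) of what hard billing enforcement can look like when scaling decisions debit a prepaid balance synchronously, so the scheduler itself refuses capacity the tenant can't pay for:

```python
from dataclasses import dataclass, field
from threading import Lock

@dataclass
class TenantLedger:
    """Prepaid balance, debited synchronously on every scaling decision."""
    balance_microcents: int
    lock: Lock = field(default_factory=Lock)

    def try_debit(self, amount: int) -> bool:
        # Hard enforcement: the check and the debit are one atomic step,
        # so concurrent scale-ups can never outrun the budget.
        with self.lock:
            if self.balance_microcents < amount:
                return False
            self.balance_microcents -= amount
            return True

def admit_scale_up(ledger: TenantLedger, instances: int,
                   price_per_instance_hour: int, horizon_hours: int) -> bool:
    """Reserve the cost of running `instances` for `horizon_hours` up front.
    If the tenant can't cover it, the scale-up is rejected instead of
    turning into a surprise invoice."""
    cost = instances * price_per_instance_hour * horizon_hours
    return ledger.try_debit(cost)

ledger = TenantLedger(balance_microcents=1_000_000)
assert admit_scale_up(ledger, instances=5, price_per_instance_hour=100_000, horizon_hours=1)
# The second request would cost 1,000,000 but only 500,000 remains: rejected.
assert not admit_scale_up(ledger, instances=10, price_per_instance_hour=100_000, horizon_hours=1)
```

The hard part in reality is that the debit has to happen at cloud-provider scale without the ledger becoming a centralized availability bottleneck; this sketch shows only the admission-control shape, not that part.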
Since you are developing in this domain: our challenge with both Lambdas and Cloud Run-type managed solutions is that they seem incompatible with our service mesh. Cloud Run and Lambdas can only be incorporated into a service mesh if the mesh is managed through GCP as well. Anything custom is out of the question. Since we require end-to-end mTLS in our setup, we cannot use Cloud Run.
To me this shows that Cloud Run is more of an end product than a building block, and it hinders adoption, as we'd basically need to replicate most of Cloud Run ourselves just to add that tiny bit of also running our sidecar.
> Cloud run and lambdas can not be incorporated with gcp service mesh, but only if it is managed through gcp as well
I'm not exactly sure what this means; a few different interpretations make sense to me. If this is purely a Run <-> other GCP product in a VPC problem, I'm not sure how much info about that is considered proprietary and shareable, or even whether my understanding of it is still accurate. If it's that Cloud Run can't run in your service mesh, then the answer is just that these are both managed services. But yes, I do think it's possible to run into a situation/configuration that is impossible to express in Run but doesn't seem like it should be.
This is why designing around multitenancy is important. I think with hierarchical namespacing and a transparent resource model you could offer better escape hatches for integrating managed services/products that don't know how to talk to each other. Even though your project may be a single "tenant", because these managed services are probably implemented in different ways under the hood and have opaque resource models (ie run doesn't fully expose all underlying primitives), they end up basically being multitenant relative to each other.
That being said, I don't see why you couldn't use mTLS to talk to Cloud Run instances, you just might have to implement it differently from how you're doing it elsewhere? This almost just sounds like a shortcoming of your service mesh implementation that it doesn't bundle something exposing run-like semantics by default (which is basically what we're doing), because why would it know how to talk to a proprietary third party managed service?
There are plenty of PaaS components that run on k8s if you want to use them. I'm not a fan, because I think giving developers direct access to k8s is the better pattern.
Managed k8s services like EKS have been super reliable the last few years.
YAML is fine, it's just a configuration language.
> you shouldn't have to think about scaling and provisioning at this level of granularity, it should always be at the multitenant zonal level, this is one of the cardinal sins Kubernetes made that Borg handled much better
I'm not sure what you mean here. Managed k8s services, and even k8s clusters you deploy yourself, can autoscale across AZs. This has been a feature for many years now. You just set a topology key on your pod template spec, your pods will spread across the AZs, easy.
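For example, a minimal pod template fragment (illustrative names, not from any real deployment) using the standard zone topology key:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web            # hypothetical app
spec:
  replicas: 6
  selector:
    matchLabels: {app: web}
  template:
    metadata:
      labels: {app: web}
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels: {app: web}
      containers:
        - name: web
          image: nginx   # placeholder image
```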
Most tasks you would want to do to deploy an application, there's an out of the box solution for k8s that already exists. There have been millions of labor-hours poured into k8s as a platform, unless you have some extremely niche use case, you are wasting your time building an alternative.
I will just say, based on recent experience: the fix is not "Kubernetes bad". It's that Kubernetes is not a product platform; it's a substrate, and most orgs actually want a platform.
We recently ripped out a barebones Kubernetes product (like Rancher but not Rancher). It was hosting a lot of our software development apps like GitLab, Nexus, KeyCloak, etc
But in order to run those things, you have to build an entire platform and wire it all together. This is on premises running on vxRail.
We ended up discovering that our company had an internal software development platform based on EKS-A and it comes with auto installers with all the apps and includes ArgoCD to maintain state and orchestrate new deployments.
The previous team did a shitty job DIY-ing the prior platform. So we switched to something more maintainable.
If someone made a product like that then I am sure a lot of people would buy it.
This is one of the things that excites me about TigerBeetle; the reason why so much billing by cloud providers is reported at only hourly granularity at best is that the underlying systems are running batch jobs to calculate final billed sums. Having a billing database that is efficient enough to keep up in real time is a game-changer, and we've barely scratched the surface of what it makes possible.
Thanks for mentioning them, we're doing quite similar debit-credit stuff as https://docs.tigerbeetle.com/concepts/debit-credit/ but reading https://docs.tigerbeetle.com/concepts/performance/ they are definitely thinking about the problem differently from us. You need much more prescribed entities (eg resources and skus) on the modelling side and different choices on the performance side (for something like a usage pricing system) for a cloud platform.
This feels like a single-tenant, centralized ACH, but I think what you actually want for a multitenant, multizonal cloud platform is not ACH but something more capability-based. The problem is that cloud resources are billed as subscriptions/rates, and you can't centralize anything on the hot path (like this does), because then any availability issue in the zone hosting that node causes a lack of availability for everything else. Also, the business logic for computing an actual final bill for a cloud customer's usage is quite complex because it's reliant on so many different kinds of things, including pricing models which can get very complex or bespoke, and it doesn't seem like TigerBeetle wants calculating prices to be part of their transactions (I think).
The way we're modelling this is with hierarchical sub-ledgers (eg per-zone, per-tenant, per-resourcegroup) and something which you could think of as a line of credit. In my opinion the pricing and resource modelling + integration with the billing tx are much more challenging because they need to be able to handle a lot of business logic. Anyway, if someone chooses to opt-in to invoice billing there's an escape hatch and way for us to handle things we can't express yet.
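As a toy illustration of the sub-ledger idea (hypothetical names, and eliding all the pricing/resource-modelling complexity, which is the genuinely hard part): each level carves a credit line out of its parent once, so per-request debits stay local to one node and one zone's ledger can't block another's.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubLedger:
    """One node in a hierarchy like tenant -> zone -> resource group.
    Each node holds a local credit line so the hot path never has to
    call up to a central ledger."""
    name: str
    credit: int                       # locally available spend, in microcents
    parent: Optional["SubLedger"] = None
    spent: int = 0

    def child(self, name: str, credit: int) -> "SubLedger":
        # Carving out a child's credit line is the only cross-level
        # operation; after this, per-request debits are node-local.
        assert self.try_debit(credit), "parent cannot cover child's credit line"
        return SubLedger(name=name, credit=credit, parent=self)

    def try_debit(self, amount: int) -> bool:
        if self.spent + amount > self.credit:
            return False
        self.spent += amount
        return True

tenant = SubLedger("tenant-a", credit=1_000)
zone = tenant.child("tenant-a/us-east1", credit=400)
# Zone-local debits never touch the tenant ledger, so losing one zone's
# ledger doesn't take billing availability down everywhere else.
assert zone.try_debit(300)
assert not zone.try_debit(200)     # exceeds the zone's credit line
assert tenant.try_debit(500)       # tenant still has 1000 - 400 = 600 left
```

A real version needs credit-line rebalancing between siblings and reconciliation back up the tree, which is where the line-of-credit framing (rather than a single balance) earns its keep.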
Every time I’ve pushed for cloud run at jobs that were on or leaning towards k8s I was looked at as a very unserious person. Like you can’t be a “real” engineer if you’re not battling yaml configs and argoCD all day (and all night).
It does have real tradeoffs/flaws/limitations, chief among them, Run isn't allowed to "become" Kubernetes, you're expected to "graduate". There's been an immense marketing push for Kubernetes and Platform Engineering and all the associated SAAS sending the same message (also, notice how much less praise you hear about it now that the marketing has died down?).
The incentives are just really messed up all around. Think about all the actual people working in devops who have their careers/job tied to Kubernetes, and how many developers get drawn in by the allure and marketing because it lets them work on more fun problems than their actual job, and all the provisioned instances and vendor software and certs and conferences, and all the money that represents.
npm pack builds the file locally, then we expose it to the container filesystem where we do a build and check the outputs. You can move dependencies to bundledDependencies in npm to embed them in the image.
However, this is assuming you're rebuilding the static site generator itself every time. If you just want to build a site using an existing static site generator, it's much easier provided that the site generator itself is easy to use (for example, ours has a one-liner to take you all the way from nothing to a local static site running on localhost, see https://statue.dev)
If you aren't changing the SSG code itself between container runs, you'd just mount the markdown into the container and pre-install the SSG in the Dockerfile itself. For statue.dev that would be a Dockerfile almost exactly the same as the one we use already, except you'd use your own script, and RUN this in the Dockerfile itself:

  yes | npx sv create . --template minimal --types ts --no-add-ons --install npm && npm install statue-ssg && npx statue init && npm install

In your script you'd just npm run build, then do whatever it is you want to do to send the files somewhere, and whatever starts the container would mount the markdown with something like -v "pathtomymarkdown/foo:/test-package/" - not sure how to do this in GitHub runners.
Depending on how interested you/other people are in doing this with statue.dev, we could prob get something like this (where the markdown is parameterized, not the npm package) working by Tuesday. We're building out the sandbox/build features on our cloud platform as we speak, this could be one of the first use cases/defaults.