I’ve found shared app, shared database completely workable by utilizing Postgres...

ccleve · on Aug 30, 2020

There's another benefit to using a tenant_id. You can partition your tables using Postgres partitioning. That keeps the tenant's data together and keeps queries fast.
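A minimal sketch of that idea, as a Python helper that only builds the DDL strings. The `orders` table, its columns, and both function names are hypothetical; `PARTITION BY LIST` and `PARTITION OF ... FOR VALUES IN` are Postgres declarative partitioning syntax.

```python
# Sketch only: builds Postgres DDL for list-partitioning a table by tenant_id.
# The table name, columns, and helper names are made up for illustration.


def parent_table_ddl(table: str) -> str:
    """DDL for a parent table partitioned by tenant_id."""
    return (
        f"CREATE TABLE {table} (\n"
        "    tenant_id integer NOT NULL,\n"
        "    id bigint NOT NULL,\n"
        "    payload text,\n"
        # On a partitioned table the primary key must include the partition key.
        "    PRIMARY KEY (tenant_id, id)\n"
        ") PARTITION BY LIST (tenant_id);"
    )


def tenant_partition_ddl(table: str, tenant_id: int) -> str:
    """DDL for the partition that holds a single tenant's rows."""
    tenant_id = int(tenant_id)  # refuse anything that is not an integer
    return (
        f"CREATE TABLE {table}_t{tenant_id} PARTITION OF {table} "
        f"FOR VALUES IN ({tenant_id});"
    )


if __name__ == "__main__":
    print(parent_table_ddl("orders"))
    print(tenant_partition_ddl("orders", 7))
```

One partition per tenant is only one possible layout; with a very large number of tenants, hash partitioning on `tenant_id` may be the more practical choice.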

It's also amenable to distributed processing if you use something like Citus.

I'm really looking for an alternative to Citus, though. Citus itself is a bit tricky to use, and the SaaS version of it is owned by Microsoft, which means Azure-only. Also, Microsoft makes the SaaS version insanely expensive. If they had Citus-like features on Amazon Aurora I'd be there in a heartbeat.

jaxn · on Aug 31, 2020

I use Citus. Their team is great, but it is very expensive.

The move from AWS to Azure came with a reduction of IOPS per instance. We have not had the same level of performance since migrating to Azure.

I am considering running our own Citus instances.

number6 · on Aug 31, 2020

What about using Postgres schemas to isolate tenants?

https://www.postgresql.org/docs/current/ddl-schemas.html

treis · on Aug 31, 2020

These guys wrote the most popular Ruby on Rails multitenant gem and eventually moved away from it:

https://influitive.io/our-multi-tenancy-journey-with-postgre...

Some things were mistakes but others look like pretty fundamental flaws. Performance is a problem, database changes are a problem, and you aren't able to query across tenants.

pvorb · on Aug 31, 2020

I'm using this in practice and it has served me well for years.

isuckatcoding · on Aug 30, 2020

100% this works. Just be careful to ensure all your queries and indexes include the tenant ID.
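A small runnable sketch of that rule. SQLite stands in for Postgres so it runs anywhere, and the `invoices` table, its columns, and the `invoices_for` helper are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in for Postgres; the table and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    " tenant_id INTEGER NOT NULL,"
    " id INTEGER NOT NULL,"
    " customer TEXT NOT NULL,"
    " PRIMARY KEY (tenant_id, id))"
)
# tenant_id leads the index, so a tenant-scoped lookup can use it directly.
conn.execute(
    "CREATE INDEX invoices_tenant_customer ON invoices (tenant_id, customer)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1, 10, "acme"), (1, 11, "globex"), (2, 20, "acme")],
)


def invoices_for(conn, tenant_id, customer):
    """Every query takes tenant_id as a mandatory filter."""
    rows = conn.execute(
        "SELECT id FROM invoices"
        " WHERE tenant_id = ? AND customer = ? ORDER BY id",
        (tenant_id, customer),
    )
    return [row[0] for row in rows]
```

Both tenants have a customer named "acme", and each call only sees the rows for the tenant it was given.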

nogabebop23 · on Aug 30, 2020

This is essentially the key (haha) to all shared/shared scenarios, regardless of what tech they are implemented with. The challenge is that migrating from single tenant to this can be fairly impactful, depending on how you built your original solution.

yabai_yatsu · on Aug 30, 2020

Not really. Your queries don't have to contain the tenant ID if each tenant's context includes a connection string to that tenant's own DB.

jawr · on Aug 30, 2020

Wouldn’t that mean you’re holding lots of non-reusable connections?

I have recently been using row level security with a transaction middleware where I set the tenant ID.

Nice article regarding it - https://aws.amazon.com/blogs/database/multi-tenant-data-isol...
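A rough sketch of what such a transaction middleware might look like. It assumes a DB-API driver with `%s` placeholders and implicit transactions (psycopg2-style); the `orders` table, the `tenant_isolation` policy, the `app.tenant_id` setting name, and `tenant_transaction` are all hypothetical.

```python
from contextlib import contextmanager

# Hypothetical one-time setup, normally run by a migration.
RLS_SETUP = """
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.tenant_id')::integer);
"""


@contextmanager
def tenant_transaction(conn, tenant_id):
    """Run one transaction with app.tenant_id set for its duration only."""
    cur = conn.cursor()
    try:
        # set_config(name, value, is_local => true) behaves like SET LOCAL:
        # the value is dropped at commit or rollback, so a pooled connection
        # does not carry it over to the next request.
        cur.execute(
            "SELECT set_config('app.tenant_id', %s, true)",
            (str(int(tenant_id)),),
        )
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```

Request handlers would then do their work inside `with tenant_transaction(conn, tenant_id) as cur:`. Note that table owners and superusers bypass row level security unless `FORCE ROW LEVEL SECURITY` is set, so the application should connect as a separate, non-owner role.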

yabai_yatsu · on Aug 31, 2020

We have about 6 physical DB servers, each client has their own db/schema on one of those boxes. Several thousand clients each with 10s of users per client. It brings in $15m ARR.

So, it works.

We're wanting to move off that architecture to something more future proof, but it's not our biggest pain point at this point in time.

treis · on Aug 30, 2020

The issue I've seen with this is that to get true security you need 1 db user per tenant. That makes connection pooling difficult and presents its own security issues on limiting the tenant to only their DB user.

rubber_duck · on Aug 30, 2020

>That makes connection pooling difficult and presents its own security issues on limiting the tenant to only their DB user.

Can you use SET LOCAL ROLE <user> on each transaction?
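A sketch of how the statement might be built per transaction. Role names are identifiers, so they cannot be passed as ordinary bind parameters and have to be quoted instead; `quote_ident` and `set_local_role_sql` are hypothetical helpers, not part of any driver.

```python
def quote_ident(name: str) -> str:
    """Quote a Postgres identifier: wrap it in double quotes and
    double any double quote that appears inside it."""
    if "\x00" in name:
        raise ValueError("identifier must not contain NUL")
    return '"' + name.replace('"', '""') + '"'


def set_local_role_sql(role: str) -> str:
    """Statement to run as the first command inside a transaction."""
    return f"SET LOCAL ROLE {quote_ident(role)}"


if __name__ == "__main__":
    print(set_local_role_sql("tenant_42"))  # SET LOCAL ROLE "tenant_42"
```

`SET LOCAL` only takes effect inside a transaction block and reverts at commit or rollback, and the pooled login role has to be a member of every tenant role it switches to.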

treis · on Aug 30, 2020

If you do it that way then you don't gain much security. Any SQL exploit would just need to add its own SET LOCAL ROLE to break out of the tenant row level security. Any code error would (probably) still allow unauthorized access, because that error will likely also set the incorrect user.

It adds a layer of security, so it might prevent some bugs from leading to exploits. But in itself it is not enough to rely on to separate tenants.

rubber_duck · on Aug 30, 2020

Well if you have SQL injection bugs then you have bigger issues to worry about - I've used this to enforce multi-tenancy on database access level (like another poster said - preventing queries accessing wrong data by accident, which is far more common I think).

treis · on Aug 31, 2020

A SQL injection bug is (probably) not that big of a deal as long as the tenant boundary isn't crossed. They'd be stealing their own data.

rubber_duck · on Aug 31, 2020

True, I'm just not sure that I'd trust the DB isolation once the user has SQL injection. I haven't seen a SQL injection report on a project since the PHP days (ORMs solved this for the most part), but I did see multiple instances of accidental data leaks from bugs on different projects.

It looks like you could also use SET SESSION AUTHORIZATION for this, but I haven't used it, so I don't know how it works with data access/pooling.

kevincox · on Aug 30, 2020

If you are running a copy of the same software for each tenant anyways it doesn't matter much as a SQL injection for one tenant is most likely available on all tenants.

I think for this use case security is focused on accidentally returning the wrong tenant's data (fully or partially).

lixtra · on Aug 30, 2020

> available on all tenants

Yes, but typically not across tenants. Maybe the flaw is only exploitable by admins of each tenant, and they shouldn’t see other tenants' data.

I.e. https://news.ycombinator.com/item?id=24216009

vosper · on Aug 31, 2020

Thanks, this really helped me understand how row level security can be implemented effectively to partition tenants. It probably seems an obvious idea to many, but I appreciate it nonetheless.

pvorb · on Aug 30, 2020

My strategy is using a separate, but equivalent schema for every tenant and setting the tenant context with every request before running queries.

ccleve · on Aug 30, 2020

This works and solves a lot of problems. The downside is that schema changes are cumbersome because you have to make them in many places. If you want to roll out a new feature in a shared app which depends on a schema change, it's hard to do without downtime or complicated feature flags.

hinkley · on Aug 30, 2020

It seems like some continuous deployment strategies could work here, as long as you limit the number of schema changes in flight to a handful.

Customer A is running the new schema, so gets one cluster. Customer Z will get migrated in the next hour, and then none of the old system is running.
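A toy sketch of that kind of rolling, tenant-by-tenant migration. Each tenant gets its own in-memory SQLite database as a stand-in for a per-tenant Postgres schema, and SQLite's `PRAGMA user_version` plays the role of a schema version table. The migrations and function names are hypothetical.

```python
import sqlite3

# Ordered, made-up migrations; list index + 1 is the schema version.
MIGRATIONS = [
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)",
    "ALTER TABLE notes ADD COLUMN pinned INTEGER NOT NULL DEFAULT 0",
]


def current_version(db):
    return db.execute("PRAGMA user_version").fetchone()[0]


def migrate(db, target=len(MIGRATIONS)):
    """Apply pending migrations to one tenant; calling it again is a no-op."""
    for version in range(current_version(db), target):
        db.execute(MIGRATIONS[version])
        # PRAGMA values cannot be bound as parameters; version is an int.
        db.execute(f"PRAGMA user_version = {version + 1}")
        db.commit()
    return current_version(db)


def rolling_migrate(tenants):
    """Migrate tenants one at a time and report each tenant's version."""
    return {name: migrate(db) for name, db in tenants.items()}
```

While a rollout like this is in progress, the application has to cope with tenants on either schema version, which is where the feature flags or per-version clusters come in.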

infogulch · on Aug 30, 2020

Ideally, setting the tenant context happens early during request authorization, is required to get access to a database connection, and is configured outside the scope of any request business logic.

pvorb · on Aug 31, 2020

Yeah, I'm using Java, so in practice this means I extended my JDBC DataSource. On every call to getConnection() it sets the schema.
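A rough Python analogue of that pattern, as a sketch rather than the actual Java code. `TenantConnections`, its `connect` argument, and the schema naming rule are assumptions; the only Postgres feature relied on is `SET search_path`.

```python
import re

# Assumed naming rule for tenant schemas: lowercase letters, digits, underscores.
_SCHEMA_RE = re.compile(r"[a-z_][a-z0-9_]*")


class TenantConnections:
    """Hands out connections only with a tenant schema already selected."""

    def __init__(self, connect):
        # connect: any zero-argument callable returning a DB-API connection,
        # for example a pool's "borrow a connection" function.
        self._connect = connect

    def get_connection(self, tenant_schema: str):
        # Schema names are identifiers and cannot be bound as parameters,
        # so validate strictly before interpolating.
        if not _SCHEMA_RE.fullmatch(tenant_schema):
            raise ValueError(f"invalid tenant schema: {tenant_schema!r}")
        conn = self._connect()
        cur = conn.cursor()
        cur.execute(f'SET search_path TO "{tenant_schema}"')
        cur.close()
        return conn
```

Because there is no way to obtain a connection without naming a tenant, business logic cannot forget to set the context. One caveat: in Postgres a plain `SET` issued inside a transaction is undone if that transaction rolls back, so with a driver that opens transactions implicitly the statement may need to run in autocommit mode or be reissued per transaction.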

christophilus · on Aug 30, 2020

Same. This is the sanest way, in my experience. It’s pretty easy to move a tenant out of this model and into isolation if needed (never had to do it, but I dry-ran it). It’s harder to go the other way. Deployments, total system queries for analysis, etc. are all simpler with this approach.