Hacker News
Things I learned after getting users (basementcommunity.bearblog.dev)
249 points by HermanMartinus on March 13, 2023 | hide | past | favorite | 149 comments


"this is mostly because i relied on a SQL ORM which in short is a tool that makes writing SQL easier to pick up and faster to develop. the biggest downside is that it might execute 50 queries to your database to get a list of information, when it probably only needs 1, which will cause slowdown."

I appreciate this honesty. Listen to this old man's advice: learn SQL properly. It's not that hard. Focus on it for a few weeks intensely and you've mastered it for life. Then just write SQL directly.

I've had weekends ruined troubleshooting my "highly productive ORM layer" after it nuked a production database, even though, functionally speaking, my ORM code was in no way incorrect. I'm talking differences of a thousandfold in query load depending on how one expresses the ORM calls.

You can then become proficient in reasoning about and predicting what your ORM calls do in the actual database, but when you're several joins deep, this becomes near impossible. At that point you become the ORM, and might as well just write SQL.
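The thousandfold difference above usually comes down to the N+1 pattern. A minimal sketch using Python's stdlib sqlite3 as a stand-in database (schema and data invented for illustration), counting the statements each approach actually issues:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'ann'), (2, 'bob'), (3, 'cam');
    INSERT INTO post VALUES (1, 1, 'a'), (2, 2, 'b'), (3, 3, 'c');
""")

# Record every SQL statement the connection executes.
queries = []
conn.set_trace_callback(queries.append)

# N+1 pattern: one query for the parent rows, then one per parent row.
authors = conn.execute("SELECT id, name FROM author").fetchall()
for author_id, _name in authors:
    conn.execute("SELECT title FROM post WHERE author_id = ?",
                 (author_id,)).fetchall()
n_plus_one = len(queries)   # 1 + 3 = 4 statements

queries.clear()
# Single-query equivalent: one JOIN.
conn.execute("""
    SELECT author.name, post.title
    FROM author JOIN post ON post.author_id = author.id
""").fetchall()
joined = len(queries)       # 1 statement
```

With three parent rows the loop issues four statements while the JOIN issues one; with thousands of rows that gap is exactly the kind of difference in query load described above.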


Honestly, I jumped on the ORM hate train when I first started and wrote my own queries in an effort to be "performant", but my frankenstein of direct queries and reusable abstractions became basically a super shitty ORM that was a nightmare to maintain in a large codebase lol. I don't think there's anything wrong with an ORM, and for 90% of CRUD actions it's what you should choose. By the time you hit performance issues you've probably thought of something else anyway.


I am the same; the advice to "learn SQL" is still very important. Most ORMs have the escape hatch of just mapping the results of a straight SQL query to an expected structure (thinking of typed languages like C#/Java here). 95% of the time I can just use the ORM to make simple queries in an easy-to-maintain way for your application. For the 5% of queries that are either complex or can't be easily represented via your ORM of choice, declare the expected response type and use that escape hatch to write your performant SQL.
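A sketch of that escape hatch in Python with stdlib sqlite3 (the dataclass, table names, and data are invented for illustration; typed languages like C#/Java do the same with a declared record type): hand-written SQL for the hard query, mapped straight into a declared type.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class OrderSummary:
    customer: str
    total: float

def fetch_order_summaries(conn: sqlite3.Connection) -> list[OrderSummary]:
    # Hand-written SQL for the query the ORM can't express well;
    # results are mapped straight into the declared type.
    rows = conn.execute("""
        SELECT c.name AS customer, SUM(o.amount) AS total
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY total DESC
    """)
    return [OrderSummary(*row) for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")
summaries = fetch_order_summaries(conn)
```

The rest of the application keeps using the ORM; only this one query drops down to SQL, and the declared type keeps the boundary explicit.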

I've been on projects where the team has re-invented an ORM organically, and I've never seen it go well. Likewise with projects that are ORM 'purists', bending over backwards to use the ORM for a query that can't easily be represented by whatever query syntax it has.

This is why I still think lightweight ORMs are the best of both worlds: they usually provide a pretty good experience for common/easy queries, and when things get tricky, it's best to just use SQL and map the results to an app-language data structure. I mostly use OrmLite [0] in dotnet and have found it strikes a balance that has served me well for years (note: I now work at ServiceStack, who built OrmLite).

[0] https://docs.servicestack.net/ormlite/ormlite-apis#query-exa...


"95% of the time I can just use the ORM to make simple queries in an easy-to-maintain way for your application. For the 5% of queries that are either complex or can't be easily represented via your ORM of choice, declare the expected response type and use that escape hatch to write your performant SQL."

site creator here - yeah this is the approach i'm taking now. the ORM is useful for sure and there's still a benefit to using it, but anything that needs to read from a few different tables, i'm definitely going with raw SQL


I felt the same way until I found sqlx (https://github.com/launchbadge/sqlx). The package lets you write sql in your app with compile-time checks. Was so much easier to get up and running than learning a new ORM.


Need this for Go. Been considering writing a static checker for ages.


Yeah, I am old enough to have started back when ORMs did not exist yet. A world with an actual tested ORM is way more maintainable, way easier to deal with. You can write your own queries when you really need to... but you should do it only when you really, really need to.

Writing queries is easy part. Maintaining that thing is the hard part.


I'm wired differently.

Even if I'd run a database operation just on a local machine and it takes 100x more load than an optimized SQL query, I still care. It's craftsmanship.

Sounds like you may have a case of the abstraction disease. You keep reaching for abstractions that never accomplish the actual benefits of a truly solid abstraction.


> I'm talking differences of a thousand fold in query load depending on how one expresses the ORM calls

That doesn’t sound like an ORM problem… more like an N+1 problem. Lazy loading makes N+1 easy to hit with ORMs, but it’s easy to avoid when you know what to look for.

ORMs are designed to reduce querying, not increase it a thousand-fold :)


N+1 is a common problem with many ORMs, and a classic example of the object-relation impedance mismatch.

Even the notion of eager vs implied-lazy loading suggests N+1, just spread out over time. Granted that might be optimal for a whole lot of use cases! But it’s definitely not optimal for a use case where you need a join upfront and your ORM does it in memory.

Also granted many ORMs can handle this in a lot of general cases if you know how to use them, and know their limitations. But they’re inherently highly dynamic, and inevitably deoptimize for some cases where just querying the database directly will be much more effective. That’s not even an admonishment to “learn SQL” as is the common retort, it’s just the generic “abstractions are leaky and sometimes it’s better to bail out a layer or more”.


Upfront: I really dislike ORMs. They're an unnecessary abstraction and a performance headache as an extra layer of middleware to worry about keeping up to date. I'd never use one in production.

But we do always need a clean way to handle CRUD between apps and DBs, one that ideally doesn't require custom calls for each and every data view. Here's the thing: probably 90% of CRUD can be handled with generic updates and inserts. Raw JSON reads can be passed back to the client; let the client know how to cast or structure those into complex data types. The rest, which the server has to manipulate, can be cast/structured using the same classes the client uses, if in node. The actual meat of really complex reads or really optimized writes should always be exceptional and done by hand.
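The "generic updates and inserts" idea above can be sketched in a few lines of Python with stdlib sqlite3 (function, table, and column names invented for illustration). The important caveat is in the comment: identifiers must come from a trusted allow-list, since only values can be parameterized.

```python
import sqlite3

def generic_insert(conn, table, data: dict):
    # NOTE: `table` and the column names must come from a trusted
    # allow-list, never from user input; only values are parameterized.
    cols = ", ".join(data)
    marks = ", ".join("?" for _ in data)
    conn.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})",
                 tuple(data.values()))

def generic_update(conn, table, key, data: dict):
    sets = ", ".join(f"{col} = ?" for col in data)
    conn.execute(f"UPDATE {table} SET {sets} WHERE id = ?",
                 (*data.values(), key))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)")
generic_insert(conn, "widgets", {"id": 1, "name": "gear", "qty": 3})
generic_update(conn, "widgets", 1, {"qty": 5})
row = conn.execute("SELECT name, qty FROM widgets WHERE id = 1").fetchone()
```

Two small helpers like these cover most routine CRUD without pulling in a full ORM, leaving the hand-written SQL for the genuinely hard reads and writes.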


Right. In fact any ORM worth its salt will have methods specifically designed to let you address the N+1 problem easily.

E.G: django "select_related"

Not to mention they will have ways to use raw SQL through the lib infrastructure, which is still better than querying the database manually and gives you full control of the query.

Also, WordPress shows that you can write manual SQL and still have terrible performance because your application is badly structured, which an ORM helps with.

In the end, I don't think there are fewer traps with raw SQL; the traps are just different. It's not a matter of "better".


It's hard to explain to developers why an ORM isn't really that great a solution, but at a high level:

* your database is the core of your app. Abstraction implies that you don't care about the details of your data store, which eventually will lead to two major problems:

1. When things are slow you won't be able to debug it, because you've abstracted away your database and don't understand what queries your ORM is running to do things, and

2. when your ORM migration fails you won't know what to do, because you don't understand your database. In general this happens with every ORM product, ever: you didn't add a constraint initially because you didn't know what it meant, then you added it later, and the migration failed because the constraint was being violated. Or the migration fails because the tool uses some attributes and not others for drift detection, so it tries to delete your table/database because it's detected drift.

* the way you deal with data in code and in a database are different. Code iterates over objects. If you just use your ORM naively you'll end up doing some ridiculous number of selects because in code it looks like you're just iterating over objects...when in reality the ORM is doing a select/join for each one of those objects. And if you start getting more complicated there are structures that are just harder to do with ORMs...because you have to map what you're trying to do in SQL to the way your ORM works. At that point why not ditch your ORM?

SQL isn't really that hard... but thinking about how to do queries is harder than you'd expect. It's almost an order of magnitude faster to do joins in the database than to join stuff in code... but that also assumes your schema isn't screwed up, that it has indexes, etc.

Lastly, a lot of ORMs don't put indexes on columns for some reason, which kills performance. You'd think that would be a detail that would be abstracted away for you at the ORM level, but it isn't. I mean it knows what you're using to select data, so it should auto-create indexes for you, right?


Ad 2: Every single real-world Java project I have seen used Hibernate (ORM) with Flyway. I have never seen any project in production where some underlying tech would delete tables autonomously.

Ad 1: People do debug ORMs, actually, including query efficiency.


Ad 2 -> I've run into lots of Hibernate apps. They also tend to use Spring, and the team often gets lost trying to figure out which one is the cause of the problem. It's usually both.

Hibernate and Prisma migrations will delete tables if you allow them to.

Ad 1 -> I'm sure they do... but usually they call in a DBA, who basically says "this is a disaster" and then recommends they refactor everything, because their database schema is fucked up: it's based on some weird object hierarchy instead of RDBMS principles.


I don't find either of your responses consistent with reality. First, database administrators do not debug ORMs. Second, it does not take a specialist to debug them; developers debug them. Third, the recommendation is rarely "refactor everything"; usually it is adding an index, changing the query, or some other minor tweak. Fourth, "weird object hierarchy instead of RDBMS principles" does not even mean anything.

Fifth, Spring has nothing to do with the database whatsoever. Sixth, every production codebase I have ever seen used Flyway.


I think you're mixing up someone who's decided to use an ORM rather than some hodgepodge of string concatenation with someone who has no idea what a database is, is unable to set it up effectively, and is unable for some reason to debug the sql emitted by ORM queries or use their ORM at all effectively to avoid N+1 problems. Not really a fair comparison.


> I mean it knows what you're using to select data, so it should auto-create indexes for you, right?

Curious, how does it know what you're using to select data?


Because you're using the ORM to do selects.

I'll give you an example: prisma has a "find/findMany." As part of the build process it should examine your usage and put indexes in the column that you're using in find/findMany

Indexes are an implementation detail - but to use RDBMSs successfully you need indexes. ORMs try to hide those implementation details, but without those kinds of details you get bad performance.

Prisma just fixed another problem it had with transaction isolation levels. To even understand the problem you need to understand SQL, how databases work, and how each database implements that. It's another important implementation detail that's abstracted away...until it bites you in the ass.

Prisma also finally allowed you to use database-specific types instead of just using varchar(191), which was its default for strings.

I'm picking on prisma because I happen to be using it right now. There are good things about it, like the way it does migrations and handles schema changes. But there are times its abstractions get in the way, like when you're trying to do aggregated group by (which is easier in SQL than it is in prisma). And its migrations often fail, and to understand why you really need to understand SQL.

Hibernate was worse, because it provides a transparent object-level backing store.


> As part of the build process it should examine your usage and put indexes in the column that you're using in find/findMany

This is not as easy as it sounds, though. Reliably inspecting source code is hard, and definitely not in the scope of an ORM. It also can't catch dynamically generated queries (think user-configurable filters). It might be a good idea for a third-party product that analyses your code and suggests adding indexes to models.

You know who knows best about your queries? It's the RDBMS. There are tools that can add/suggest indexes by searching your database logs for slow running queries.
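That "ask the RDBMS" point is easy to demonstrate: most databases will tell you the plan they'd use for a query. A sketch with Python's stdlib sqlite3 and EXPLAIN QUERY PLAN (table names invented; the exact plan wording varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)")

def plan(sql):
    # EXPLAIN QUERY PLAN asks SQLite how it would execute the statement;
    # the last column of each row is the human-readable plan detail.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM events WHERE user_id = 42"
before = plan(query)   # without an index: a full table scan of events
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
after = plan(query)    # with the index: a search using idx_events_user
```

The same idea scales up: Postgres's EXPLAIN ANALYZE and the slow-query logs of MySQL/Postgres are the inputs that index-suggestion tools actually work from.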


Learn SQL and use the ORM is usually the correct answer. Let it print out the SQL queries and how long they take (ORMs do that out of the box), then check for indexes and other standard issues. If there’s some unusual complex query to be written, sure, write that by hand.


If writing SQL directly, what process do you use to update your queries during schema changes? Do you rely on a test suite to catch errors then update queries by hand? Are you using compile-time checks through libraries like sqlx [1]?

[1]: https://github.com/launchbadge/sqlx


Finally something that bucks my instinct that integration tests are a waste of time. If I’m deploying a SQL query, everything that gets in or out gets thorough automated testing. Regardless of your attitude on the subject, directly interfacing a database is business logic. Test it like you would anything else that interacts with an external dependency. If you made a schema change and it broke something, you’ll either get test failures, or learn where your coverage is incomplete, or learn where you don’t care about a particular failure mode.

Normally I’d prefer testing more isolated units, but if it’s your own data access, just write the integration tests. They’re at least implicitly part of your “unit” anyway.


I recommend taking a step back before making a backward-incompatible schema change. Do you really need this? Chances are, it's just faffing about out of a sense of neatness and orderliness and you don't actually need to break the schema.


Sorry to butt heads, but what ORM automatically handles schema changes? I have seen so many teams with highly abstracted ORMs telling me that no-downtime schema changes are impossible, no matter how trivial the changes are, or would be in e.g. Hibernate.

And the only team capable of zero-downtime schema changes uses a minimal DSL to SQL lib.


There are two parts to the schema change, the table changes (migrations) and the code changes. I think the person you were replying to was asking about the code part. Every ORM I've used or looked at[0] does that automatically in the sense that you make a change to a model and all queries generated by that model will now have that change. Depending on the change they could break, but the point is the changes propagate. If you're not using an ORM and are just using string templates, this won't happen. There are query builders, too, which are sort of in between.

[0] Django, SQLAlchemy, Active Record, Piccolo, Tortoise, SQLModel, peewee


^yes, that's indeed what I had in mind, and you've articulated the position very clearly.


Thanks, I’m glad! It’s a topic that pops up for me a lot. Related: https://news.ycombinator.com/item?id=26905647


I don't know of any ORM that handles schema changes automatically (edit: for the most strict definition of "automatically" -- I've used ORMs that generate migration files and am generally happy with them).

My experience here is quite limited, so I'm inclined to defer to yours. With that caveat, I find that ORMs play more nicely with my IDE (e.g. I can rename a column name and automatically update all call sites) and often raise early warnings if their declarative classes are out of sync with the underlying table schema.


The persistent[0] library can automatically generate and run migrations to match schema changes.

[0]: https://hackage.haskell.org/package/persistent-2.14.5.0/docs...


This problem seems foreign to me. Are your teams not using Laravel? If not, maybe it's time to upgrade. https://laravel.com/docs/10.x/eloquent


This is exactly why I find no solution fits all here. For me, I use an ORM as a catch-all and then for certain applications I manually write the queries. It's best of both worlds, and I know what components in my app have custom queries so I can test against them.


I'm happy to write my own SELECT queries, but it sucks to be manually schlepping row[0], row[1], etc. into the fields of my domain entities, or making sure I provide the parameters to the parameterized INSERT statement in the right order. ORMs usually let you drop down to SQL as desired, and I do that when I have a nontrivial query to express, but CRUD plumbing is nice.
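For what it's worth, even the bare drivers often have a middle ground. In Python's stdlib sqlite3, for instance, a row factory plus named parameters removes both the row[0] schlepping and the positional-INSERT hazard (table and column names invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row   # rows become name-addressable, not just row[0], row[1]
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Named parameters: the dict's key order no longer matters.
conn.execute("INSERT INTO users (id, email) VALUES (:id, :email)",
             {"email": "a@example.com", "id": 1})

row = conn.execute("SELECT id, email FROM users").fetchone()
email = row["email"]   # access by column name instead of index
```

Other stacks have close equivalents (DictCursor in psycopg2, @Results mappers in Go libraries), which is most of what people actually want from the "M" in ORM.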


Not sure which programming language you are using. For Java, I find Spring JDBC templates with Row Mappers more productive than ORMs. DataClassRowMapper will automatically populate the domain entity.


Go, Java (language level 8), Python. All have an approach, usually via some third-party library, but the bone-stock database drivers deal in field indexes. And people call those libraries "ORMs" whether or not they technically are.


The core problem with ORMs is that you end up having to learn the ORM on top of learning SQL and the particular database. It ends up being like working with the database through an opaque indirect layer.


Better yet, learn stored procedures, then call them just like any API with a well established set of inputs and expected return typed outputs. You'll thank me later once you decide to split your DB from your app server.


It's been at least 12 years since I had a DB on the same server as the app or service that called it... I can't fathom why anyone would do that these days. But either way, the stored procedures would work just as well. The issue you're alluding to (I think) is the time to set up connections for a dozen calls between the backend and the DB if they're on separate servers. But that's actually less of a bottleneck now than it used to be, with persistent pooled connections and most things hosted in the same VPN zones. I'm not arguing for ORMs... but appside processing [edit: "joins"] can still be appropriate if you're dealing with a monolithic DB and don't want certain giant queries or procedures to block rapid reads. The point is more to understand your infrastructure, and ORMs obfuscate it to a point that makes you irresponsible as a developer in some cases.

Also, with lateral joins and window functions in MySQL 8 you can really control whether you prefer your execution plan to loop or scan, which takes away most of the rest of the argument for appside processing.


In my limited experience, the best thing about ORM isn't the querying, but the mapping to objects. This is why I really like ActiveRecord's `find_by_sql`[1], which allows to write a raw SQL query and get an array of objects.

[1]: https://guides.rubyonrails.org/v3.2/active_record_querying.h...


> learn SQL properly. It's not that hard. Focus on it for a few weeks intensely and you've mastered it for life.

Exactly the same is true of ORMs. I find most of the people who advocate "just use SQL" have a bizarre aversion to applying the same learning effort to their ORM.

> I've had weekends ruined troubleshooting my "highly productive ORM layer" that nuked a production database. Whilst functionally speaking my ORM code was in no way incorrect. I'm talking differences of a thousand fold in query load depending on how one expresses the ORM calls.

That kind of problem happens with regular SQL all the time, and the tools to test/investigate it are less available.


> bizarre aversion to applying the same learning effort to their ORM

Oh, sure, instead of just learning SQL I'll just learn SQL, the ORM, the ORM's weird edge case features that actually support the SQL I want, the undocumented ORM internals that prevent the good query from actually being generated, and then I'll commit a patch to the open source project to fix the undocumented internals and shepherd a custom dependency for 6 months while it gets into a numbered release. Then I'll do it all over again when I switch stacks and have to learn a completely new ORM.

Or I could just use SQL.


> the ORM, the ORM's weird edge case features that actually support the SQL I want, the undocumented ORM internals that prevent the good query from actually being generated

ORMs are much less bad than SQL at that, IME. If the ORM generated a particular query there is usually documentation for why, often an option you can change. If the SQL engine decided not to use the right index for this query... tough, there's literally nothing you can do.


"SQL is worse" is a nonsensical comparison. The problems of ORMs add to the problems of SQL and compound the problems of SQL, they do not substitute the problems of SQL.

I'm glad you found an ORM you like, but it seems we have very different experiences with ORMs.


Depends. Sometimes the ORM knows about the pitfalls and works around them automatically. Same as how using a compile-to-JS language can often end up being much nicer than writing javascript yourself.


I would always use an ORM over raw SQL, but this doesn't make any sense: your ORM is still going to run a query on the same DB engine. It is much harder to optimize an ORM query, but if you're only writing very simple operations you don't typically need to. The power of ORMs (at least lightweight ones) is compile-time types and parameters by default.


> If the SQL engine decided not to use the right index for this query... tough, there's literally nothing you can do.

MySQL Index Hints: https://dev.mysql.com/doc/refman/8.0/en/index-hints.html

MariaDB Index Hints: https://mariadb.com/kb/en/use-index/

PostgreSQL: No. More information: https://stackoverflow.com/questions/309786/how-do-i-force-po...

SQL Server INDEX hint: https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...

Oracle INDEX Hint: https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_e...

Of course, knowing exactly what is wrong and what hint you need to use (whether the issue is the index in particular or something else) is also a bit of work to figure out.

I recall working on a project that had Oracle as the RDBMS of choice and there was this one query that took approx. 45 minutes to execute, which wasn't acceptable. Merely a SELECT that JOINed a few tables and checked the data with some EXISTS constructions and such. If memory serves me right, looking through the AWR reports and adding a single NO_EXPAND hint made the query execute in not 45 minutes but around 3 seconds.

Of course, ORMs don't necessarily solve the aforementioned types of problems either, since it's still SQL statements being executed under the hood and figuring out how to add hints to those might be a bit more challenging. I've had cases where you need to drop down to native query level when using ORMs for particular queries, because there were no elegant ways of doing it otherwise (apart from creating a DB view and mapping against that in some cases, for read only access).

I think both using ORMs and using SQL directly suck, just in their own unique ways. That said, I wouldn't outright dismiss either: depending on what you're doing, one or the other is going to be a good enough tool for the job. You occasionally also see some pretty interesting projects in the ORM space, which is nice.

For example, jOOQ allows you to use a type safe fluent API for constructing your queries: https://www.jooq.org/

Oh, and MyBatis lets you generate your own SQL dynamically, which is an interesting approach: https://mybatis.org/mybatis-3/dynamic-sql.html


How do I map my complex join into a list of structs with embedded associations? I'd probably need to write a good bit of code just to unmarshal the data, right? How are you doing this every time you write a new query?


Can you give an example?


> have a bizarre aversion to applying the same learning effort to their ORM.

SQL is a lingua franca while an ORM is specific to a stack.

If you know that you will always, say, be using SQLAlchemy on Python, then learning it well might be a good investment, but I don't find learning multiple ORMs to be a good time investment over just learning SQL. Most every stack has good, lightweight query builders.


Yes, both SQL and ORMs can give you problems, but I have seen far more serious, hard to diagnose trouble from ORMs than straight SQL.

Learning straight SQL is far more transferable across databases than learning an ORM.

There are reasons to have ORMs, for e.g. to keep a large code base consistent. It's useful for keeping relatively simple queries consistent across an ERP system, stuff like that.

However, learning SQL is going to have a far higher payoff across your career.


> Learning straight SQL is far more transferable across databases than learning an ORM.

Not my experience. For example a recursive/tree query usually has one way to do it in an ORM, but three or four different ways to express it in SQL depending on the database, some of which seem like they're wilfully screwing you over (one popular database requires you to only write "WITH" and won't recognise "WITH RECURSIVE"; another requires you to write "WITH RECURSIVE", even though the error message it gives you when you only write "WITH" tells you that it knew exactly what you meant).
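For reference, the recursive form itself is small once a dialect accepts it. A sketch with Python's stdlib sqlite3, which takes the WITH RECURSIVE spelling (schema invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO category VALUES
        (1, NULL, 'root'), (2, 1, 'child'), (3, 2, 'grandchild');
""")

# Walk the tree downward from the root, tracking depth.
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM category WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, tree.depth + 1
        FROM category c JOIN tree ON c.parent_id = tree.id
    )
    SELECT name, depth FROM tree ORDER BY depth
""").fetchall()
```

The dialect friction described above is real, though: the CTE body ports across databases, but the WITH vs WITH RECURSIVE keyword and a few edge-case rules do not.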


I’m not sure I understand the consistency argument… Do you have an example, even if it’s just a silly one?


The example is from a python ERP that I modified a few years ago. All the db interaction was done through an ORM and I thought that it was an OK strategy for keeping the SQL consistent and manageable.

There were other problems with the ORM approach -- mostly that it ran tens or hundreds of extra queries for relatively straightforward SQL queries. But the software was only used by a few people at a time, so the performance wasn't that big of a deal.


"That kind of problem happens with regular SQL all the time"

Well, no, it doesn't. I told you to learn SQL properly. When you do, you know what a query plan is, which is something you'd rarely need if you know how to design a database and its indices.

All of this would be covered in every beginner's book.


I'm aware of what a query plan is. I can see that the query plan is wrong. There's nothing I can do about it, since I took the standard advice on HN and used PostgreSQL as my database.

Meanwhile if you'd applied the same "learning it properly" standard to your ORM you would never have had the problem you described.


An alternative to learning SQL is to use a non-relational database that has more of an impedance match with your host language to start with (e.g. XTDB or another Datalog DB in the case of Clojure).


I'm pretty proficient with SQL, and I completely agree that you need to know it well if you use a relational DB for something that isn't trivial, but I still love using ActiveRecord. In most web applications I worked on I needed to manually optimise some queries, either massaging AR a bit or directly writing SQL when necessary, but I would say that's much less than 1% of all interactions with the DB. In all the other situations, using AR saved me a lot of time.


ORMs are great for avoiding boilerplate and adding some typesafety. But once you start doing complicated things like JOINs, it's better to drop into raw SQL.

> Listen to this old man's advise: learn SQL properly. It's not that hard.

I couldn't agree more with the first half, or disagree more with the second :) Once you're operating at scale, it takes a lot of fine-tuning, know-how, experimentation, and reading docs to get things moving efficiently.


What are some examples of fine-tuning or trial-and-error you might need at scale?


Postgres has a ton of parameters that can greatly affect performance. I've used https://pgtune.leopard.in.ua/ to help w/ that.

In terms of schemas, where you put your indexes matters a TON. Indexes make reads faster and writes slower (to oversimplify).

Figuring out where to break out a separate table vs keep everything in one table is also a common issue.


True, all of these would still be an issue with ORMs.


Honestly, ORM is an unnecessary layer most of the time.

You can have database schemas, and you can have software schemas, and they don't necessarily need to coincide as long as your database schema allows efficient queries. The rest of your relational-manager-logic can be embedded in the post-processing methods you apply to the query results.


What do you think of "micro ORMs" like knex.js, or Dapper for C#? I personally agree with you and my personal projects just use stored procedures, but these micro orms have always piqued my interest since they're just a SQL abstraction for building single queries.


I would consider those "query builders" rather than fully fledged ORMs.


Yeah no. A micro ORM still expects you to write the full, raw SQL.

The raison d'être for a micro ORM is to capture the results of that SQL query into a simple, flat collection of objects in a way that has a better UX than your framework's default behavior.

Micro ORMs are the perfect middle ground for debates like these.


I personally like those. They are qualitatively different from ORMs. They give you three things:

- sensible ways of composing queries beyond direct string concatenation

- safe parametrization

- return results into usable data structures (not objects)

None of these things carry the problems of ORMs. They don’t impose a paradigm that isn’t SQL. They simply let you interface with SQL in an ergonomic manner.
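A toy illustration of those three points in Python with stdlib sqlite3 (function and schema names invented): queries are composed from fragments, values are always parameterized, and results come back as plain data.

```python
import sqlite3

def build_select(table, columns, filters: dict):
    # Compose the SQL from fragments; values never touch the SQL string.
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params = []
    if filters:
        clauses = []
        for col, value in filters.items():
            clauses.append(f"{col} = ?")   # placeholder, not the value itself
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany("INSERT INTO books VALUES (?, ?, ?)",
                 [(1, "A", 1999), (2, "B", 2005), (3, "C", 1999)])

sql, params = build_select("books", ["title"], {"year": 1999})
titles = [row[0] for row in conn.execute(sql, params)]
```

Real query builders (knex.js, Dapper's helpers) are this idea with more polish: you still read and write SQL semantics, there's just no string concatenation of values and no hidden object graph.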


This is typically as far as I go with an ORM, CRUD operations on single tables that give you some type safety at compile time. If you need more complex data shapes I usually create a view or occasional stored procedure but that's about it. I use Entity Framework in my day job and do like the migrations though...


These days I build applications that actually use SQL.

Typically a single statement to get the job done for any query.


who you callin old man i'm still in my 20s!


One foot in the grave. Practically a fossil.


> when the site first got a surge of users from hacker news, there was one poster in particular who came to the site, registered a bunch of offensive, racist usernames and proceeded to post and create threads that were just full of dumb slurs. this was definitely a learning experience because i had to act quickly, so i tried a bunch of different methods to get rid of him.

it's sad that people like this exist in the world. what could possibly motivate someone to spend their time doing this?


I dealt with a similar basket case once.

It seems like parasocial relationships can swing both ways. You know how some fans develop a creepy, obsessive sort of love for creators? Well, the same goes for hatred. They feel slighted by that person that doesn't know them, and they retaliate from behind their keyboard.


The thing that really surprised me was that even when he implemented IP blocking, the user used a VPN to continue abusing the site. That's a step beyond "casual" trolling that someone might do to test a site's security (not that this is justifiable behavior) and enters the territory of targeted harassment.


I don't find it surprising. This happens in every project I run. It's almost a guarantee that when you ban someone they will come back with a new account and when you ban their IP they start using VPNs. It's such a common occurrence and they get more aggressive for every step where they consider it a victory when you manually take action against them. I often need to resort to banning whole IP ranges.


Richard Bartle has a taxonomy of what he calls "players", but it works for any social media[1]. One of the types is the "killer", with a sub-type of "griefer": those whose "... vague aim is to get a big, bad reputation". So, from that perspective, they actually do get something out of it.

[1]https://en.wikipedia.org/wiki/Bartle_taxonomy_of_player_type...


Griefers. Some people get enjoyment from hurting/annoying others.

First time I bumped into this was in the DAOC MMORPG. Some players were deliberately annoying others, to the point of "wasting" their online time doing stupid shit that annoyed people. It really shocked me.

Thing is, there's lots of griefers in any online game. Which means there's lots of people out there in the real world who would do this if they could get away with it. Anonymity allows them to do this online with very little repercussion, so it shows how many there really are. But now every time I do an interview, or look at a rental, I'm thinking "is this person a griefer? are they going to enjoy making my life a misery?"


> I said that's life

> (That's life)

> And as funny as it may seem

> Some people get their kicks

> Stomping on a dream

> But I don't let it, let it get me down

> Cause this fine old world, it keeps spinnin' around

- That’s Life

By Dean Kay and Kelly Gordon. Most famously recorded by Frank Sinatra.



> what could possibly motivate someone to spend their time doing this?

Probably being between the ages of 10-14 years old. Bartle Killer-explorer?


The hardcore persistent trolls making it their life's mission to ruin my site have always been 20+ after I've managed to identify them. Kid trolls just spam, get banned and move on.


Attention seeking + lack of moral compass.


A lot of them are edgy underage children who don't know any better. They think they have a dark sense of humor, but really they've lived a life disconnected from those words, so they like the idea of pushing other people's buttons at no cost to their selfish existence.


not only attention seeking; sometimes the motivation is "to spread the truth", at least as they perceive it, and sometimes it's people who get triggered and are unable to stop their rants


Had a similar experience recently. A random person started repeatedly filling out contact forms for one of our clients. They were doing it manually, and they did it for several days straight until finally blocked.

It also left me wondering why that person would spend an hour or two each day for several days in a row, filling out online forms. What’s the motivation?

My suspicion is that the person doing it works at the company and was trying to mess with their systems. But I’ll never know for sure.


I spent some time working in a prison, it was very eye opening for someone like me from a pretty sheltered background. Most of the inmates were decent enough people that made bad choices but there were a few who were just horrible sadistic bullies.


Whitehat moderation tester


Mental illness


i'm more sad to see founders repeatedly put out ugc-oriented software or communities without any design priority for safety.

safety is the hard part of ugc products but is left to figure out after scale and ossification.


How would you plan for 'safety' in UGC-oriented software? I'm actually interested in this.


I have a lot of thoughts on this that I plan to start writing on under indiedevstack.com before long, for now you can follow my one biz social where I will promote it once ready @manabiSRS

But some quick references... One approach is to avoid algorithmic surfacing of UGC outside of one's own network (or secondary connections etc), which makes discovery harder (must be compensated in other ways). Twitter may explore similar ideas with pluggable algorithms (though I don't trust them). Another approach is to constrain UGC: eg a music/audio community which has barriers to sharing new audio outside your network, but which freely allows remixing and promoting remixed audio of known good audio without as much safety control over the remixes because the operations allowed on the "good" source material make it difficult to subvert. This kind of idea is at the core of a product concept, not an additional layer to tack on later. I believe that finding these kind of cheaper ways of managing UGC and lending discovery to UGC can be a huge competitive edge.


Hey man, thanks. Really interesting. I've added indiedevstack.com to my bookmarks, so I'll check in that once you've launched


Cheers, good luck with your own work


competitor obviously


"Is my girlfriend pregante?"


Alcohol.


> some users have suggested pretty smart features that i've since implemented, like this back-to-top button to quickly get back to the top of the page

To me all position:fixed elements (headers, footers, this back-to-top button, etc) feel like a kind of annoying dirt on the screen. Their absence is a big part of why I love the web 1.0 aesthetic.


My god yes. There is literally a key on a full keyboard dedicated to this function (Home), smaller keyboards have a chord for it (like Fn-Up), and "mobile" touchscreen UI has a global mechanism for this (on iOS: touching the status bar)... to plop down a position:fixed button on top of the content to make it even easier to access this feature that is already extremely easy to access is just gratuitous.


Does Android have this? I had an Android phone for some time and the thing I missed most was the back-to-top shortcut


Android doesn’t have this, and I missed it at first as well, but the scroll acceleration is so fast (much faster than on iOS) that you can nevertheless scroll to the top very quickly.


On MacOS Fn+left arrow = back to top, fn+up arrow = up one window.


I have this in a bookmarklet:

https://github.com/t-mart/kill-sticky


Me too, but I still added a table of contents button on my long, structured articles. It's very helpful in my opinion.


Wikipedia does this now, and I find it annoying, in particular the changing “current section” highlighting, and the fact that it hides when the browser window is a bit narrower. I’d rather press Home to get to the TOC again when it scrolled off.


Agreed, seeing something changing out of the corner of my eye is very off-putting.


Hm, I can see where the criticism comes from. Is there a way to keep this feature that would make it more acceptable?


Personally I don’t think it’s worth the benefits over just having a TOC at the top of the page, as is otherwise customary. It’s different if this is a web site or web app where you have an account, and where users can permanently enable/disable it as an option when they prefer it. But on a public web page for a general audience, it always introduces friction by being visually distracting (because it doesn‘t scroll with the rest of the page), or by hiding due to responsive layout (requiring the browser width to be adjusted, or having to toggle it by mouse instead of scrolling to the top by keyboard), and so on.


Fair point. I personally end up using those a lot, especially in articles that I skim (instead of reading end to end).

I'll start collecting data on its use, because people on the orangey site (including me) tend to have opinions that don't represent the average user.


I don't see it on my phone. Do you mean the desktop table of contents? I have one of those too. I use it all the time on other websites.

I think it's a really good feature, but the Wikipedia implementation needs work.


Yes, there are numerous automated and human-powered nuisance traffic streams.

1. CMS sites are constant maintenance, as most are an endless supply of issues. However, some have content caching to reduce the SQL workload.

2. Delayed registration with CAPTCHA and a brief explanation of why you are there. Quiet-ban IP filter applied to the list to boot pending users who enter emails that bounce or fail to authenticate.

3. Firewall blacklist areas of the world where you don't do business (better yet, whitelist the ISPs in the regions you do business in), blacklist proxy/tor/spam IP ranges, add port tripwires, and set up rate-limited traffic per IP (see slowloris mitigation methods if you are not using nginx).

4. add peer site content blocker for forum spammers/bots i.e. share exploit probes preemptively with the rest of the net.

5. add email filter for mention of bitcoin/BTC, and black-hole the entire IP block if in an irrelevant region.

6. lookup same-origin enforcement for your web-server, add Subresource Integrity Hash to your core, and re-scale/watermark/scrub all media to protect users from themselves.

7. fail2ban rules for common site security scanners, known exploit attempts, and common email scams.

You owe nonpaying users nothing, so the collateral cost of blanket bans is $0 in hostile regions. Remote traffic monitoring is also recommended if you have a game engine running.
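The per-IP rate limiting in point 3 can be sketched as a token bucket. This is a generic Python illustration, not this commenter's actual setup; in production you'd more likely use nginx's limit_req or equivalent, and the class and parameter names here are invented:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-IP rate limiter: each IP may make `rate` requests per second,
    with bursts of up to `burst` requests."""
    def __init__(self, rate=5.0, burst=10):
        self.rate = rate
        self.burst = burst
        # Each IP starts with a full bucket; first access records its clock.
        self.tokens = defaultdict(lambda: burst)
        self.last = defaultdict(time.monotonic)

    def allow(self, ip):
        now = time.monotonic()
        elapsed = now - self.last[ip]
        self.last[ip] = now
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens[ip] = min(self.burst, self.tokens[ip] + elapsed * self.rate)
        if self.tokens[ip] >= 1:
            self.tokens[ip] -= 1
            return True
        return False
```

A burst of requests from one IP drains its bucket and gets rejected, while other IPs are unaffected.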

On day 2 we can look at how BTC tumblers/launderers fund most of these issues, and whether it is OK to also preemptively blanket-ban most cloud/hosting providers (costs under 7% of your users in most cases). Remember, adversaries will often pretend to be from wherever they wish to inflict harm, and time does not have an associated cost in the 3rd world.

Have a gloriously wonderful day =)


Hey there! Thanks for your comment. What do you specifically mean by rescale/watermark/scrub content to protect users from themselves?


This is done mainly to strip off meta/EXIF/GPS Data, and mitigate other format violations often used to cause "trouble". Resampling/transcoding also tends to corrupt steganography utilities, thrash user hardware thumbprints, and standardize the web experience by excluding unsafe formats in favor of well-tested efficient traditional codecs.

Specifics preclude the multifaceted nature of the policy.

https://www.youtube.com/watch?v=cJMwBwFj5nQ

Happy computing =)


I'm surprised that they're surprised about trolls. As someone who's been doing PC gaming a long time, I always assume there'll be people who just want to ruin things for everyone else. What they're attracted to is popularity, so the better you do, the more you'll have to deal with them (they tend to grow slightly superlinearly relative to overall user growth).

This is basically every game or internet forum that acquires even a little popularity: there will be some (few) people who just wanna ruin everything, and I'm always surprised by how many people are surprised by this even when they're the technically literate sort.

For example, some Japanese fighting game devs still try to count disconnects during a match as different from losses for someone's record. One guess as to what this encourages as far as player behavior goes.


Man, on the abuse front—it’s amazing the lengths that people will go through to put spam on the web. There are apparently canned solutions for pushing stuff to any Mediawiki site, although I found that a really stupid captcha¹ was enough to bring that down almost to zero, but early on with rejectionwiki, I had the same sort of chronic abuser things happening that are described in the article.

1. Basically a set of really obvious questions, like “Who wrote Hamlet?” and what’s “Shakespeare’s first name?” that any writer (for whom the site is targeted) should be able to answer.
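A question captcha like the one described is only a few lines server-side. This is a hypothetical sketch (not rejectionwiki's actual code), with a made-up question pool and simple answer normalization:

```python
import random

# Hypothetical question pool; each question maps to a set of accepted answers.
QUESTIONS = {
    "Who wrote Hamlet?": {"shakespeare", "william shakespeare"},
    "What's Shakespeare's first name?": {"william"},
}

def pick_question():
    """Choose a random question to show on the registration form."""
    return random.choice(list(QUESTIONS))

def check_answer(question, answer):
    """Match case-insensitively, ignoring surrounding whitespace
    and trailing punctuation."""
    return answer.strip().lower().rstrip(".!") in QUESTIONS[question]
```

The point isn't being hard to answer; it's that canned spam software fills the field with garbage and fails.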


> Who wrote Hamlet?

Sir Francis Bacon :smirk:


> that any writer (for whom the site is targeted) should be able to answer

hey now, not all writers who discuss things in English are necessarily familiar with the anglo literary tradition

(it's probably a higher overlap than average, and shouldn't be too hard for them to search, but be careful throwing the assumption around)


(Ooh, three-em dash. Fancy! I use two-em dash very occasionally in prose, and —⁂— as a section and footnote divider, but haven’t found anywhere I really wanted to use three-em dash, though I did still add a Compose key binding for it.)


> Basically a set of really obvious questions, like “Who wrote Hamlet?” and what’s “Shakespeare’s first name?” that any writer (for whom the site is targeted) should be able to answer.

Given that ChatGPT exists now, I assume these questions will need to be replaced with something harder to automate.


ChatGPT would be an extremely expensive way to answer these questions.


At MOST that is a 100 token prompt/response, so that is like $0.0002 to answer with gpt-turbo. Hardly going to break the bank...


Really? I thought these types of questions would be limited. They seem hand-crafted.

If you cache the answers, you're probably looking at 10 queries or so until the site admin gives up on that idea and tries something different.


Given that the spammers seem to use some sort of canned software, it might have been enough to figure out how to change one or two internal URLs in MediaWiki, actually.


Yep. I think writing code is quite difficult for most spammers. They often depend on hacked together scripts and things other people have written. Making those obvious scripts fail can make a massive difference. I suppose people who can write good code can usually make a lot more money by getting a real programming job.

Years ago I heard of a simple anti-spam technique where you add extra form fields to a web form. Then use CSS to make those fields invisible. Put a check in your backend where if you see any content in those form fields, you respond with 200 OK but ignore the request.

The programmer in me can immediately think of 10 ways to get around that - the most obvious being to fill in spam using a real web browser, automated via webdriver or something. But apparently that one trick removed ~95% of spam on their site.
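The honeypot technique described above boils down to one server-side check. A minimal sketch, with invented field names (the fields would be hidden via CSS so humans leave them empty):

```python
# Fields rendered in the form but hidden with CSS; real users never fill them.
HONEYPOT_FIELDS = ("website", "fax_number")

def is_spam(form: dict) -> bool:
    """Return True if any honeypot field was filled in.
    The caller should still respond 200 OK and silently drop the
    request, so the bot doesn't learn it was caught."""
    return any(form.get(field) for field in HONEYPOT_FIELDS)
```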


>listen to your users. they might have better ideas than you!

So true. My products have improved greatly from listening to (some!) user feedback.


And if you're going to do it, make it easy. Many sites have missed out on my thoughtful feedback (not to say I'm 'right', but even if they don't want to act on it, it's considered and reasoned) simply because the process, although offered, requires a login (i.e. sign-up), or because the last time I tried I got some dumb email back thanking me for my question: it's not really possible, but I can work around it by (x, y, z that I already said would be easier if w), and do get in touch if I have any more (sic) questions.

If you're going to solicit feedback, just dump it in an ideas bucket, no need to reply, certainly don't funnel it through the support channel for bugs/questions.


I always reply to feedback. Seems only polite.


Certainly, I've had some great even more thoughtful/in-depth replies too, it's just the ~zero effort template support-channel 'thanks for your question'/'unfortunately that's not possible today' type response - I know, that's why I gave the feedback; all it does is tell me it didn't get anywhere near the right person, which is a souring experience (and waste of my time & effort) and makes me wish there just hadn't been a feedback button to begin.


If you can recognize good feedback and manage expectations well.

From a user’s perspective, there is typically a tipping point where a thing becomes popular enough that user feedback turns into useless, superficial, lowest-common-denominator crap that doesn't understand the value prop, the quality standards, or the implications of change vs stability.

I believe this type of feedback often bubbles up for reasons similar to why bikeshedding becomes a problem, and is then perpetuated through social media.

At this point one needs a filter.


Users that give great feedback also generally give great feedback in the future.


hey, i'm the author of this site! going through comments now, but looks like i still have some work to do because all this traffic caused some slowdown :(. looks like adding a caching layer is next.


> someone is going to abuse your site

Would gating access with Google Sign-in, or Facebook sign-in, etc, be sufficient for rate limiting bad actors?


Google and Facebook both have too much data about me already; why would I voluntarily give them more? This probably isn't that common an attitude, though.


> this worked for a little bit longer, but he proceeded to get on a VPN, and then another when i blocked that IP, then another when i blocked that IP, etc, etc.

Beyond VPNs, I've even seen attackers leverage residential IP networks which makes VPN detection ineffective as well [1]. If you ever need a more permanent identifier to ban users on, consider using a device/browser fingerprinting tool [2]. It helps avoid the whack-a-mole issue of more sophisticated attackers churning IPs/emails/user agents/etc.

[1] https://brightdata.com/proxy-types/residential-proxies [2] https://stytch.com/products/device-fingerprinting (I'm admittedly biased towards our solution as I work at Stytch)


Although difficult and not widely known about, fingerprinting can be randomized[1]. I have only been successful at creating a random fingerprint on Brave so far, and I did need to tweak some browser settings.

[1] Check your fingerprint details here: https://coveryourtracks.eff.org/


Are you paying for Sentry? What type of monitoring does it provide? I'm working on a project I'd like to add some monitoring so I'm on the lookout for a good solution. Looking for something free though until there's a need to have better insight than I can get without paying for it


I'm using the Sentry free tier and it is great. The main value is how it manages repeated errors. You can group different exceptions for the "same" error, resolve errors that have been fixed, or ignore known errors until they occur for more users, more often, or what have you.

If you are good about squashing errors you can make it very far on the free plan. Plus they have some burst detection built in. Just make sure that "expected" errors aren't just ignored in the UI; stop emitting them in the app itself so they don't count towards your quota (and it keeps your logs tidy).

I haven't been using their tracing or anything because their Rust SDK doesn't seem to support it despite claiming that it does (or I have set it up wrong).


Thank you for elaborating, this is useful :)


My team got really far using something called GlitchTip. It's compatible with the Sentry SDKs, but really cheap, so it felt like there were minimal consequences to switch if we ever had the need to. We only switched to Sentry cause our company was acquired and they already had a Sentry subscription going.


Hadn't heard of GlitchTip, but will look into it. Thanks!


You can self-host Sentry in a docker container. It's pretty easy to get running, and essentially free.


i am not! right now, i'm mostly just using it for performance monitoring - how many ms my queries are taking and how long my first-contentful paints are on the front-end

also for errors from the BE and FE


The best way to deal with point one: ghosting. It was sad to see three largely ineffective methods, and one desperate one, used to get rid of the abusive spammer. I had experience running a community-driven portal, and trust me, the best way to exhaust the spammer is to hide his posts from others. Don't give him a single clue you acted on him; just let him continue posting in a special vacuum prison crafted exactly for him.
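The "vacuum prison" amounts to filtering shadow-banned users' posts out of everyone else's view while leaving them visible to their author. A minimal sketch with invented names:

```python
# Usernames currently shadow-banned (hypothetical storage; a real site
# would keep this in the database).
SHADOW_BANNED = {"troll42"}

def visible_posts(posts, viewer):
    """Shadow-banned authors still see their own posts, so they get no
    clue they were acted on; nobody else sees them."""
    return [p for p in posts
            if p["author"] not in SHADOW_BANNED or p["author"] == viewer]
```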


Currently I'm using a code method, where registration requires a "code". I share this code lightly, and can change it at any time. Plus there's a reCAPTCHA on both the register and login pages to annoy abusers a bit more.


I have a site that will likely need a denylist for usernames. Do you have any resources on implementing that? I mean, it sounds obvious how to do it, but if it's already been done, I'd rather just have a list to work from.


the denylist on the site is actually pretty simple. for now, it's just a list of chunks of words, and the form validation compares the username against that list, where each chunk in the list is used as a regexp
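The chunks-as-regexps approach described here looks roughly like this (a sketch based on the description, not the site's actual code; the example patterns are invented):

```python
import re

# Each chunk is treated as a regexp and matched anywhere in the
# lowercased username; character classes catch simple letter swaps.
DENYLIST = [r"badword", r"sl[u0]r"]

def username_allowed(username: str) -> bool:
    """Reject a username if any denylist pattern matches it."""
    name = username.lower()
    return not any(re.search(pattern, name) for pattern in DENYLIST)
```

Lowercasing first keeps the patterns simple; `re.search` (rather than `re.match`) catches chunks embedded anywhere in the name.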


You could also consider shadow banning these users. If you're immediately preventing them from creating accounts with certain usernames, they'll probably just get more creative with the profanity or save it for their posts.


Hey hey! Been loving your forum so far. It's been great chatting with folks. Hoping for many years to come.


A good way to combat abuse is to not feed the trolls: don't engage beyond a warning or two, simply delete, and definitely don't argue publicly.


AKA blacklists are useful.


Blacklist please.

Blocklist if you absolutely have to.

"denylist" is an abomination.

Oh I see, a goon.



