For some time now, we’ve been working with SQL to communicate with database systems. What we learned in these years is that SQL is not a good way to query data, and I’m going to explain why.
SQL should be a standard way of querying data, but most programmers have learned (probably the hard way) that most of databases implement SQL in a different way. What is means in practice is that any time we need to change databases we will face lots of incompatibilities and queries that simply won’t work as we expect. But this is only the beginning of our problems…
We tried lots of ways to solve this kind of problem, one of them migrating to ORMs. But, ORMs in fact solve a different problem – the one that relational databases work with row-column structures, and our programming languages use objects, hash-maps, records, and other richer ways of representing data. Ruby’s ActiveRecord was a huge step forward, promising us to deliver value simplifying our relational-object mapping, but in the end we faced the same problems – incompatible queries, SQL fragments being thrown in the code, and in the end, we ended up with another huge kind of problems – performance, complexity, and separation of concerns problems (a single ActiveRecord mapping is responsible for validation, for queries, and to define business logic). Even worse, the Arel promise (a complete library to abstract every possible SQL query) was underused – it’s now an internal library to ActiveRecord, it doesn’t really have a stable public API, and in every minor version, something changes in a bizarre and incompatible way.
So, I’ve started a simple project named relational. In the beginning, it was just a playground to learn Scala. But, right now, and faced with modern problems (I’m working with Clojure, and it doesn’t really have a good way to query relational databases – Korma is incomplete in multiple ways, HoneySQL doesn’t really delivers what I want, and other libs are just wrappers around string queries), I’m implementing a version of Relational in Clojure, and the reason I’ve started working on it is kinda simple…
SQL isn’t a standard.
Okay, if we just want to query all data from a single database, inner-joining with other, just listing the fields, it’s completely fine. Add SQL functions and pagination, and we’re in a pinch – for instance, the standard way of limiting the result to just 100
rows is:
SELECT * FROM table FETCH FIRST 100 ROWS ONLY
I don’t know a single person who wrote this kind of query, simply because almost no database supports the standard – in PostgreSQL, MySQL and Sqlite, it’s written as:
SELECT * FROM table LIMIT 100
In Oracle, it is
SELECT * FROM table WHERE rownum < 100
In Microsoft SQL Server, it is
SELECT TOP 100 * FROM table
And don't even start with GROUP_CONCAT
or other strange SQL functions…
(more…)