SQL is about as underrated as CSS. To be entirely forthcoming, I'm not sure how controversial an opinion that is, but all the same. Nowadays, no-SQL
and kinda-SQL
solutions are everywhere. I use Mongo almost daily, and I love it, but I'm often left searching for features that aren't there.
For blogs, personal projects, and a litany of other micro to mid-size projects, Mongo can't be beaten. But once you dip your toes into data-centric projects you'll find a lot more value in relational databases like PostgreSQL.
First, we'll quickly tackle the basics, then I'll share some of my favorite tips and tricks to manipulate tables and views in SQL. The goal is to have the data in the exact format you're expecting before you push and/or fetch it from the database.
I know it's redundant to a strange degree, but I've used local Postgres DBs to format my data before pushing it to a cloud-hosted database... Whatever works, right?
...Right?
What's SQL?
Structured Query Language, or SQL, is a language built to gather specific data from large datasets. Put simply, you tell the database what you want, and it gives you only that.
SQL's one of those oddball languages. It only has a handful of declarations, but the interactivity between them is staggering. Let's start by creating two tables.
1 CREATE TABLE IF NOT EXISTS Customers (2 id varchar PRIMARY KEY,3 name varchar,4 email varchar,5 dateOfPurchase date6 );78 CREATE TABLE IF NOT EXISTS Sales (9 salesId varchar PRIMARY KEY,10 name varchar,11 sold int12 );13
In a Mongo database, we'd have no way of easily and directly joining these two tables. We'd have to settle for a large nested array that includes the sales data. With Postgres, we can compare, contrast, and even create a new table or view out of our queried data.
To set a baseline, let's take a look at JOINS
. SQL offers four options to specify how you want to merge the two tables. You'll primarily use LEFT JOIN
and INNER JOIN
, but the other two have their place.
CROSS JOIN
- returns all items that belong to BOTH tables
LEFT JOIN
- returns all records from the left table, and matched rows from the right table
RIGHT JOIN
- returns all rows from the right table and matched rows from the left
INNER JOIN
- returns all rows that have matching records in both tables
1 SELECT sum(sold) as numSold,2 date_trunc('day', dateOfPurchase) as dt3 FROM Customers c4 LEFT JOIN Sales s on c.id = s.salesId5 GROUP BY 26 ORDER BY 2 DESC7
That returns a list of the sum of all items sold on each date, ordered by date in descending order. The nice thing about SQL is that the vast majority is completely human readable, and does exactly what you'd expect.
SQL and the wonders of CASE WHEN
Dune Analytics is my favorite place to practice SQL. Even if you have zero interest in crypto, a blockchain is, at its simplest, a distributed database. This makes it ideal for showing the full extent of what a relational database can do.
Dune is just that. Sets of large tables, many of which can be joined using an id
, hash
, or date
. Take this query. It calculates the totals borrowed and lent by a protocol and separates these users into tiers based on dollar value. It also showcases one of my most used SQL patterns, CASE WHEN
.
Note: In SQL casing doesn't matter, and formatting only vaguely matters. We'll discuss formatting rules in a later, more expansive post.
1 select distinct borrower,2 date_trunc('day', "block_time") as dt,3 (case when "usd_value" > 0 and "usd_value" <= 1004 then 'a. Minnow <100'5 when "usd_value" > 100 and "usd_value" <= 10006 then 'b. Shrimp 100-1k'7 when "usd_value" > 1000 and "usd_value" <= 50008 then 'c. Crab 1k-5k'9 when "usd_value" > 5000 and "usd_value" <= 1000010 then 'd. Fish 5k-10k'11 when "usd_value" > 10000 and "usd_value" <= 10000012 then 'e. Dolphin 10k-100k'13 when "usd_value" > 100000 and "usd_value" <= 100000014 then 'f. Shark 100k-1M'15 when "usd_value" > 100000016 then 'g. Whale >1M' end) as tier,17 count("borrower") as holders,18 sum(usd_value) as amount19 from lending."repay"20 where project = 'MakerDAO'21 and "block_time" >= now() - interval '90 days'22 and "usd_value" > 02324 group by 1, 2, 325 order by tier desc;26
This data can then be easily charted to give interested parties a granular view of the users interacting with the protocol. Now, imagine parsing that out in JS, even Python... it's really not all that fun.
SQL is fantastic at handling the heavy lifting, so you can do more of what you enjoy: building, researching, exploring, sharing, and other -ing verbs. They're all back on the table as efficiently as possible.
This serves as a primer. I'm not entirely sure how much interest there is in this kind of backend fun, but for now, we'll give it a fair shot.
Thanks for reading!