I said the last SQL piece was experimental, but that was before it cruised into my top posts. You could point to the obscene clickbait title. That's both logical and entirely fair, but hear me out. SQL's not the most popular kid in school, and that air of mystique gave her a fighting chance...
Either way, full-stack fun is back on the menu! For an introduction to SQL's time-bending nature, start with Part I.
Today, we dive deeper, covering the complex interactions that make SQL brilliant. We tackle the pros and cons, the power of JOINS
, and the modularity of UNION ALL
. But first, let's start with some Javascript.
The JS Approach
Let's start with a problem. We need to parse and track the net liquidity of the U.S. Federal Reserve. We have access to all required data but need to find the percent change between rows.
- FED Net Liquidity = (Federal Reserve Balance) - (Treasury General Account + Reverse Repo Agreements)
- Don't worry about what these mean, but the process is an interesting (and deep) rabbit hole if you're so inclined.
The initial data would look something like:
1 const fedBalance = [2 // Your data goes here:3 { date: '2023-01-01', fed: 100, tga: 50, rrp: 30, net_liquidity: 70, spx: 200 },4 { date: '2023-01-02', fed: 110, tga: 55, rrp: 35, net_liquidity: 75, spx: 210 },5 // More data...6 ];7
Remember, we need to find the percent change, or the difference between the current and proceeding rows, divided by the proceeding row, then multiplied by 100. JS offers a range of approaches, but map
does the trick.
1 const calculatePercentageChange = (prev, current) => {2 return ((current - prev) / prev) * 100;3 };45 const fedBalanceFull = fedBalance.map((row, index, arr) => {6 if (index === 0) {7 return {8 ...row,9 net_change_fed: null,10 net_change_tga: null,11 net_change_rrp: null,12 net_change_net_liquidity: null,13 };14 }1516 const prevRow = arr[index - 1];1718 return {19 ...row,20 net_change_fed: calculatePercentageChange(prevRow.fed, row.fed),21 net_change_tga: calculatePercentageChange(prevRow.tga, row.tga),22 net_change_rrp: calculatePercentageChange(prevRow.rrp, row.rrp),23 net_change_net_liquidity: calculatePercentageChange(prevRow.net_liquidity, row.net_liquidity),24 };25 });26
This gives us what we need, but it took a hefty dose of repetition to get there. It also adds another layer of complexity to your front and/or backend implementations, which should be limited by design.
All this to say, I prefer formatting as much as possible before it hits my application.
A Glance Through the Window
True to its name, SQL is a querying language specifically designed to handle and manipulate large datasets. The below features one of SQL's heavy lifters:
Window functions
perform calculations across a set of rows related to the current row. Three main types exist:
- Aggregate, or function like
SUM()
,MAX()
, orCOUNT()
- Ranking, or
RANK()
andDENSE_RANK()
- Value-based, or functions like
LAG()
andLEAD()
A full rundown is outside this guide's scope but will be thoroughly covered in a follow-up post. For now, focus on LAG()
. Note that window functions
include an OVER
clause, a PARTITION BY,
and/or an ORDER BY
. But the OVER
clause distinguishes them from the average SQL query.
As window functions
are row-based operations, we must specify how the data relates to the rows around it. OVER
acts as that guide. In plain English, a SQL function lets us ask for data. A window function extends that, enabling us to ask for calculated data in a specific order.
Where SUM(amount)
returns the total sum of an entire column in our dataset, SUM(amount) OVER (ORDER BY date)
returns a row with the summed totals of each day. The below showcases LAG(),
which finds the difference between a row and the row immediately preceding it.
1 CREATE TABLE IF NOT EXISTS fed_balance_full AS (2 SELECT date,3 fed,4 fed - LAG(fed, 1) over (ORDER BY date) AS net_change_fed,5 tga,6 tga - LAG(tga, 1) over (ORDER BY date) AS net_change_tga,7 rrp,8 rrp - LAG(rrp, 1) over (ORDER BY date) AS net_change_rrp,9 net_liquidity,10 net_liquidity - LAG(net_liquidity, 1) over (ORDER BY date) AS net_change_net_liquidity,11 spx,12 FROM fed_balance13 );14
Here we ask for the difference between the current and preceeding rows, ordered by date. We get the same result, but less wordy and more human-readable.
I tested both operations. I ran the JS locally in VSCode and ran the query via a local-hosted PostgreSQL database on the same machine. The SQL took under 50 milliseconds to process and deliver, while Javascript took around 0.253 seconds. So, the SQL query processed around 5.06 times faster than the equivalent JS.
The tests were (admittedly) isolated, but it's not all about speed and brevity. SQL handles complex queries, mutations, and interactions with outright class.
Recruit More Data
SQL's a maestro when it comes to manipulating and querying relational data. This becomes invaluable in operations that cross-reference multiple tables, allowing you to pull insights from seemingly disparate data. In a business context, it's often a unique ID that serves as the bridge between tables, linking, say, sales data to specific customers or salespersons.
For example, consider two tables, sales and salesTeam, where we link them via a salesId:
1 -- NOTE: While SQL offers a `money` data type, we use `numeric`2 -- here for compatibility with JavaScript data types in React.3 CREATE TABLE IF NOT EXISTS sales (4 id varchar PRIMARY KEY,5 salesDate date,6 salesAmount numeric7 );89 CREATE TABLE IF NOT EXISTS salesTeam (10 salesId varchar REFERENCES sales(id),11 salesDate date,12 salesTotal numeric,13 PRIMARY KEY(salesId)14 );1516
Use JOIN
to find the matching records between two or more tables. We covered the basics of each type in the first article, but here we'll use LEFT JOIN
.
In our two tables, sales.salesId
matches the values held by salesTeam.id
. We include this information in our ON
to join the salesTeam table on the id
match. Now, let's get to joining:
1 SELECT sales.id, sales.salesDate, SUM(salesTotal) as netSales2 FROM sales3 LEFT JOIN salesTeam ON sales.id = salesTeam.salesId4 GROUP BY 1, 25 ORDER BY 2 DESC6
With this, we can query the total sales for each salesperson in a given period.
UNIONS
Let's ramp things up. We've been cruising at ground level with flat queries, but it's time to soar. Enter SQL's UNIONS
and nested queries.
SQL's UNION
declaration joins multiple queries using related tables. SQL also allows for nested
queries. Think of nesting
like JS for loops
. The comparison isn't 1:1, but the mental model is similar.
For this, let's head to Dune Analytics. I said it before, but you don't need to be into crypto to appreciate Dune. It's the ideal spot to hone your SQL, as the entire site is one massive relational database, which you can query to your heart's content.
We'll start with UNIONS
, which you'll typically see as UNION ALL
.
UNION ALL
literally stacks two or more result sets on top of each other, keeping duplicate rows.
UNION
does the same but eliminates duplicate rows, which can be less efficient if you need those duplicates for your analysis.
It has myriad uses, but the below showcases my favorite.
Now, hold on tight; we're diving deep. The query below utilizes each technique we've learned so far. At its core, it calculates the Total Value Locked (TVL) in a lending protocol by summing up deposits and subtracting withdrawals. Take a moment to absorb the data flow from the inner queries to the top-level aggregate.
It has numerous things going on:
- We assign a name to each query and subquery using
WITH
.
- We nest a
SELECT
query in ourFROM
clause.
- We merge the two queries with
UNION
.
1 -- Nesting in the `FROM` clause.2 WITH borrows AS (3 SELECT dt,4 project,5 token,6 asset_symbol,7 sum(delta) as "TVL"8 FROM (9 SELECT DATE_TRUNC('day', evt_block_time) as dt,10 project,11 asset_address as token,12 asset_symbol,13 sum(usd_value) as delta14 FROM lending.borrow15 GROUP BY 1, 2, 3, 41617 UNION ALL1819 SELECT DATE_TRUNC('day', evt_block_time) as dt,20 project,21 asset_address as token,22 asset_symbol,23 -sum(usd_value) as delta -- negative for $ out24 FROM lending.repay25 GROUP BY 1, 2, 3, 426 -- Nested `FROM` clauses need to be aliased.27 -- You can use `AS` or type away.28 ) sub2930 GROUP BY 1, 2, 3, 431 ORDER BY 1 DESC32 );33
Now, we hit the true time savers. The above could likely be done in Javascript, but good luck. That's the beauty of SQL. It offers unparalleled efficiency when handling or performing complex computations on large datasets.
We have a few moving pieces to work with here.
The top-level SELECT
aggregates the below queries to find the TVL (Total Value Locked), or the sum of all deposits minus any withdrawals. Our nested FROM
names four columns in each query but focus on the last item.
sum(usd_value) as delta
-sum(usd_value) as delta -- negative for $ out
The result? The Total Value Locked, or the sum of all deposits minus withdrawals. It's a powerful metric in just a few lines of SQL, outperforming similar JS-based solutions in speed and simplicity.
Wrapping Up
If you're still with me, well done! I know it was a ton to absorb, but SQL gets simpler with time. SQL doesn't have many declarations, but the depth of each can be staggering.
Fortunately, the majority of web projects require fewer intricate queries. But if complex data's your swing, nothing beats SQL.
Now, to quickly recap:
- We compared SQL to a Javascript approach
- Introduced
window functions
, which allow for absurdly simple data sorting and manipulation
- Covered complex querying through the use of
JOIN
- And learned the power of nested queries or SQLs modular
for
loop
So, what's next? We'll keep peeling back the layers of SQL, diving into stored procedures, triggers, and more advanced window functions. Feel free to message me on Twitter if you have specific questions or scenarios you'd like covered. I'm more than happy to explore them in future posts!
In this ever-changing world of web development, keeping your skills sharp is crucial. If you're looking to gain an edge, particularly with data-heavy applications, the depth and power of SQL should not be underestimated.
Like CSS, love it or hate it, it's always there to help!
Cheers!