The average `GET` request holds a few kilobytes of data, primarily consisting of headers. That's all it takes to fetch a database of questionably guarded data. It doesn't matter if you work on the front or back end, sensitive data should be handled with care.
That said, I've noticed some confusion, and it's entirely understandable. I shared many of the same questions until I got into the nitty gritty of web security. In this article, we'll explore various algorithms and tackle when and why we use each solution.
But first, let's cover a little history and define a few terms.
A (very) Brief History of Modern Cryptography
If you already know your history or want to get coding, click here. That said, I think it's easier to appreciate the tools you have once you know where they came from!
From Caesar to Modern Ciphers
Ciphers date back to ancient times, but for the sake of modern history, let's fast-forward to the Caesar cipher. Think of it as the "grandparent" of modern cryptography.
I know, I know, I said modern history, but it'll be a brief dip into times past. Promise.
Caesar's cipher is simple. Each letter in a message shifts along the alphabet by a fixed number of positions. Say we wanted to encrypt the message, "His name was Robert Paulson". Using the Caesar cipher with a shift of five characters, we get `mnx sfrj bfx wtgjwy ufzqxts`.
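Here's a minimal sketch of the shift in plain JavaScript. The `caesarShift` name is my own, and it only handles the 26 unaccented Latin letters, so treat it as an illustration and nothing more.

```javascript
// Shift each letter by `shift` positions, wrapping around the alphabet.
// Non-letters (spaces, punctuation) pass through unchanged.
function caesarShift(message, shift) {
  const offset = ((shift % 26) + 26) % 26; // normalize negative shifts
  return message.replace(/[a-z]/gi, (char) => {
    const base = char === char.toLowerCase() ? 97 : 65; // char code of 'a' or 'A'
    return String.fromCharCode(((char.charCodeAt(0) - base + offset) % 26) + base);
  });
}

const secret = caesarShift("his name was robert paulson", 5);
console.log(secret);                  // mnx sfrj bfx wtgjwy ufzqxts
console.log(caesarShift(secret, -5)); // his name was robert paulson
```

Decrypting is just shifting back by the same amount, which is also why trying every possible shift cracks it so quickly.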
The cipher works for personal or time-sensitive data, but with only 25 possible shifts, it's a breeze to crack with modern technology. Which leads us to the age of connectivity.
Symmetric Keys: DES as the Trailblazer
The early '70s gave us DES (Data Encryption Standard).
DES used a single symmetric key for both encryption and decryption. It laid the groundwork for what would become the modern encryption stack.
For the first time, the cipher wasn't preset.
Asymmetric Keys: The RSA Paradigm
The late '70s saw RSA (Rivest-Shamir-Adleman) introduce the asymmetric method. With two keys, one for encryption and one for decryption, RSA added an extra layer of security: the encryption key can be shared openly while the decryption key stays private.
Think of them like mathematical puzzle pieces where the public key only fits with the correct private key.
Unlike DES, it still sees heavy use today. Its asymmetry is particularly suited to applications heavily reliant on secure key exchange, like blockchains, email services, digital signatures, SSL/TLS for web security, and VPNs.
- Quick Takeaway: DES laid the groundwork for symmetric encryption, while RSA advanced the field by introducing asymmetric methods for secure key pairing.
SSL & TLS Equip Us for the Digital Age
Fast-forward to the '90s, and SSL (Secure Sockets Layer) became the web standard. Its successor, TLS (Transport Layer Security), arrived in 1999 and gradually took over. SSL still saw regular use until around 2014, when serious flaws in SSL 3.0 pushed it into retirement.
TLS came equipped for the ever-morphing digital landscape. On the web, we can't assume honest intentions. Quite the opposite. Every TLS connection opens with a handshake, which lets the client confirm the server's identity (and, optionally, the server confirm the client's) before any data is transferred.
- Boiled down, TLS is a secret handshake between your machine and a web server, so all involved know they're among friends.
Today, most sites are served over HTTPS, which is the standard HTTP protocol layered on top of TLS. So, any site using HTTPS relies on TLS to encrypt and decrypt information.
AES Enables Security at Scale
AES, or the Advanced Encryption Standard, was selected in 2000, standardized in 2001, and quickly became the go-to algorithm for myriad web applications.
Like DES, it utilizes symmetric keys, meaning the sender and receiver share the same key for encryption and decryption. Its speed makes AES ideal for securing data in transit and at rest. Which leads us to 2023.
AES and RSA still see heavy use in myriad applications. With its ties to TLS, AES reigns supreme for data transfer and storage, particularly on the backend. But the internet of today requires more finesse than days past.
Today, much of the web runs on tokens. AES typically (not always, as we'll soon see) handles database storage and API data transfers, but that's only half the picture. What about the front end?
The Age of Tokens
One of the areas where RSA shines is in token-based authentication systems like JWT (JSON Web Tokens) and OAuth 2.0. RSA is slower than AES, but it's well-suited to signing: the issuer signs with a private key, and anyone holding the public key can verify the result.
- JWT enables claims to be transmitted between two parties. A token can be signed to make tampering detectable and, optionally, encrypted to keep its contents private. In JWT, RSA is often used to sign these tokens so that the recipient can trust the source and integrity of the content.
- OAuth 2.0 is an authorization framework commonly used for token-based authentication in modern web applications. OAuth 2.0 doesn't mandate a token format, but when tokens are issued as signed JWTs, for example at the end of an `Authorization Code` flow, RSA is a common choice for the signature. The protocol itself relies on TLS to keep the token exchange secure.
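To make the signing idea concrete, here's a rough sketch of an RSA-signed, JWT-shaped token built with Node's `crypto` module. The `signToken` and `verifyToken` helpers and the claims are hypothetical, and the key pair is generated on the spot. A real application would lean on a maintained JWT library, which also checks expiry, audience, and the declared algorithm.

```javascript
import crypto from 'crypto';

// Throwaway key pair for the demo; a real issuer loads its private key from secure storage.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

const base64url = (value) => Buffer.from(JSON.stringify(value)).toString('base64url');

// Build "header.payload" and sign it with RSA + SHA-256 (the "RS256" algorithm).
function signToken(payload) {
  const unsigned = `${base64url({ alg: 'RS256', typ: 'JWT' })}.${base64url(payload)}`;
  const signature = crypto.sign('sha256', Buffer.from(unsigned), privateKey).toString('base64url');
  return `${unsigned}.${signature}`;
}

// Anyone holding the public key can check the signature.
function verifyToken(token) {
  const [header, payload, signature] = token.split('.');
  return crypto.verify(
    'sha256',
    Buffer.from(`${header}.${payload}`),
    publicKey,
    Buffer.from(signature, 'base64url')
  );
}

const token = signToken({ sub: 'user-42', role: 'trainer' });
console.log(verifyToken(token)); // true

// Swap in a different payload while keeping the old signature.
const [header, , signature] = token.split('.');
const forged = `${header}.${base64url({ sub: 'user-42', role: 'admin' })}.${signature}`;
console.log(verifyToken(forged)); // false
```

Note that the payload is only encoded, not hidden. Signing proves who issued the token and that nobody altered it; it doesn't keep the contents secret.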
JWT and OAuth 2.0 are specifically tailored for user authentication, but that's worthy of a separate post. Now, we only skimmed the surface, but we hit the major takeaway:
Internet requests are encrypted via the TLS standard, which commonly uses AES to encrypt the data and RSA to authenticate the server and exchange keys.
What is Encryption?
In web development, encryption enables the secure transfer of messages and data. To paint in broad strokes, let's say we make a `GET` request to the Pokemon API. The simplified flow looks like this:
- The user requests Dragonite data, initiating a TLS handshake with the server.
- The server sends back its certificate, which includes its public key, for verification.
- After verifying, the user generates a pre-master secret, encrypts it with the server's public key, and sends it back to the server.
- The server decrypts the pre-master secret with its private key. Both parties then use this secret to independently generate the same session key.
- Any further messages between the user and server are encrypted with the freshly generated session key.
- The user receives the requested (and glorious) Dragonite data.
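The key-exchange steps above can be imitated in a few lines of Node. This is a toy model, not real TLS: there's no certificate, and the hypothetical `deriveSessionKey` helper uses HKDF as a stand-in for the protocol's own key derivation, which also mixes in random values from both sides. It's also worth knowing that this RSA-based exchange belongs to older TLS versions; TLS 1.3 uses an ephemeral Diffie-Hellman exchange instead.

```javascript
import crypto from 'crypto';

// "Server" key pair; in real TLS the public key arrives inside a certificate.
const server = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Hypothetical helper: turn the shared secret into a 32-byte session key.
const deriveSessionKey = (secret) =>
  Buffer.from(crypto.hkdfSync('sha256', secret, Buffer.alloc(0), 'demo session key', 32));

// Client: generate a pre-master secret and encrypt it with the server's public key.
const preMasterSecret = crypto.randomBytes(48);
const wrapped = crypto.publicEncrypt(server.publicKey, preMasterSecret);

// Server: recover the secret with its private key.
const unwrapped = crypto.privateDecrypt(server.privateKey, wrapped);

// Both sides derive the same session key independently.
const clientKey = deriveSessionKey(preMasterSecret);
const serverKey = deriveSessionKey(unwrapped);
console.log(clientKey.equals(serverKey)); // true
```

The session key itself never crosses the wire. Only the wrapped secret does, and only the holder of the private key can unwrap it.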
But that's enough talk. Let's look at some code. Node provides a default `crypto` library covering most of our use cases. Alternatives with lower learning curves exist, but if we can avoid a dependency, then why not?
First, we'll need to generate an encryption key. There are several ways to handle this, but I like using `openssl`.

```shell
$ openssl rand -hex 32
```
You'll see a long hex string. There's your random key. We'll add this to a `.env` or `.env.local` file to avoid accidentally showing it to the world.
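If `openssl` isn't handy, Node can produce an equivalent key on its own. The sketch below also includes a hypothetical `loadKey` guard (my own addition, not part of any library) that fails fast when the configured key is missing or the wrong length.

```javascript
import crypto from 'crypto';

// 32 random bytes = 256 bits, the key size AES-256 expects.
const key = crypto.randomBytes(32).toString('hex');
console.log(key.length); // 64 hex characters

// Hypothetical guard: reject a missing or malformed key before it reaches the cipher.
function loadKey(hex) {
  if (!/^[0-9a-f]{64}$/i.test(hex ?? '')) {
    throw new Error('ENCRYPTION_KEY must be 64 hex characters (32 bytes)');
  }
  return Buffer.from(hex, 'hex');
}

console.log(loadKey(key).length); // 32 bytes
```

Calling something like `loadKey(process.env.ENCRYPTION_KEY)` at startup surfaces a misconfigured environment immediately instead of at the first encryption call.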
But how do we go about creating and maintaining our own encryption logic?
```javascript
import crypto from 'crypto';

const KEY = process.env.ENCRYPTION_KEY;

export async function encryptData (message) {
  /*
    Generate a 12-byte IV (Initialization Vector) using `randomBytes()`.
    This size is commonly recommended for the GCM mode of AES.

    Create a cipher using the AES-256-GCM algorithm with the help of the
    `createCipheriv()` function. We initialize it with the encryption `KEY`
    and the generated IV.

    Note: the IV isn't secret, but it must be unique for every message
    encrypted with the same key. It's returned alongside the ciphertext
    so the decrypting side can reuse it.
  */

  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv(
    "aes-256-gcm",
    Buffer.from(KEY, "hex"),
    iv,
    { authTagLength: 16 }
  );

  /*
    Encrypt the message using the cipher and output it as
    hex-encoded data. Then retrieve the authentication tag,
    which provides integrity and authenticity.
  */

  let encrypted = cipher.update(message, "utf-8", "hex");
  encrypted += cipher.final("hex");
  const authTag = cipher.getAuthTag().toString("hex");

  return {
    iv: iv.toString("hex"),
    encryptedData: encrypted,
    authTag: authTag,
  };
};
```
But encryption's only half the story. At some point, we need to decrypt. Fortunately, it works much the same but reversed. As a quick example:
```javascript
export async function decryptData ({ iv, encryptedData, authTag }) {
  const decipher = crypto.createDecipheriv(
    "aes-256-gcm",
    Buffer.from(KEY, "hex"),
    Buffer.from(iv, "hex"),
    { authTagLength: 16 }
  );

  decipher.setAuthTag(Buffer.from(authTag, "hex"));

  let decrypted = decipher.update(encryptedData, "hex", "utf-8");
  decrypted += decipher.final("utf-8");

  return decrypted;
}
```
When Do We Use Encryption?
Encryption's major appeal stems from reversibility. Use encryption for any data that needs to be later decrypted, like payment information. You don't want the world to see it, but you may need to use it later.
Payment processing aside, there's plenty that needs encrypting. Chat applications encrypt messages end-to-end, ensuring only the sender and receiver can read them (I mean... ideally). Smart devices use it to encrypt the data flow from server to machine.
Really, there's no end to the practical examples. Whether the data's at rest or flowing free, it should be encrypted. Only decrypt when absolutely necessary.
Algorithms like AES-GCM handle encryption and produce authentication tags (`authTag`) that ensure the data hasn't been tampered with, like digital seals.
You can see those in action in the code above.
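To watch the seal do its job, here's a compact, self-contained sketch of an AES-256-GCM round trip followed by a tampering attempt. The `sealMessage` and `openMessage` names are mine, and the key is generated on the spot rather than read from the environment.

```javascript
import crypto from 'crypto';

// Demo key generated on the spot; normally it would come from ENCRYPTION_KEY.
const key = crypto.randomBytes(32);

function sealMessage(message) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(message, 'utf8'), cipher.final()]);
  return { iv, data, authTag: cipher.getAuthTag() };
}

function openMessage({ iv, data, authTag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(authTag);
  // final() throws if the tag doesn't match the ciphertext.
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}

const sealed = sealMessage('Dragonite uses Hyper Beam');
console.log(openMessage(sealed)); // Dragonite uses Hyper Beam

// Flip a single bit of the ciphertext and try again.
const tampered = { ...sealed, data: Buffer.from(sealed.data) };
tampered.data[0] ^= 1;

let rejected = false;
try {
  openMessage(tampered);
} catch {
  rejected = true;
}
console.log(rejected); // true
```

One flipped bit is enough for decryption to fail outright, so altered data never comes back looking legitimate.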
What is Hashing?
Unlike encryption, the hashing process is irreversible. Once hashed, a message can't simply be unhashed. At first glance, that doesn't seem that useful, but most data should never see the light of day: user credentials, social security numbers, and on down the list.
Hashing is perfect for these scenarios. As a developer, you have no need for a cleartext password. Instead, hash the incoming password and compare the result with the stored password hash. If they're the same, you'll know it.
SHA-256 never ceases to amaze me. It's a hashing algorithm that returns 256 bits, written as 64 hexadecimal characters, regardless of input length.
It doesn't matter if you type "hey" or paste in a transcript of every conversation ever held. The resulting hash will always be 64 characters long.
That's peak efficiency, and it's absolutely glowing.
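You can check the fixed length yourself with a couple of lines. The `sha256` helper is just a local convenience wrapper around Node's `createHash`.

```javascript
import crypto from 'crypto';

// Hash any string with SHA-256 and return the digest as hex.
const sha256 = (input) => crypto.createHash('sha256').update(input).digest('hex');

console.log(sha256('hey').length);                // 64
console.log(sha256('hey'.repeat(100000)).length); // 64
```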
Alongside its irreversibility, hashing offers "data fingerprinting", integrity checking, and speedy lookups. We won't go too deeply into these categories, but we've already touched on the first two.
In the above password example, I mentioned comparing the hash of the incoming password to the stored password hash. This functions as a validation and integrity check, as any slight deviation to the contents changes the hash entirely.
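As a small illustration of that integrity check, the sketch below fingerprints a message and then compares it against a copy with a single character changed. The messages and the `fingerprint` helper are made up for the example.

```javascript
import crypto from 'crypto';

const fingerprint = (data) => crypto.createHash('sha256').update(data).digest('hex');

const original = 'Transfer 100 coins to Ash';
const altered = 'Transfer 900 coins to Ash';
const expected = fingerprint(original);

console.log(fingerprint(original) === expected); // true, content is intact
console.log(fingerprint(altered) === expected);  // false, one character changed
```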
Just Add Salt
"Salting" is a term you've likely come across, typically regarding password protection. Salting is separate but tangential to hashing. It may seem a bit arcane at first. It sure did for me, but it's as simple as it is clever.
Say a user inputs the plaintext password, `#password123`. If you hash this directly, anyone with the same password would end up with the same hash. This is problematic.
If shadowy super-coders got their hands on the hashed passwords, they could precompute hashes for common passwords to find matches.
This is where we add a little salt. The "salting" process appends random data, known as salt, to the plaintext before hashing. So, the input to the hash for `#password123` would be something like `#password123+random_salt`. This salt is randomly generated for each entry, so even if two users share a password, their hashes differ due to the unique salts.
Here's how that looks in code:
```javascript
import crypto from 'crypto';

// Generate a 16 byte 'salt'
const salt = crypto.randomBytes(16).toString('hex');
const password = "#password123";

// Merge the password with the salt, then hash.
// 1000 iterations keeps the demo quick; production systems use far more.
const hashedPassword = crypto.pbkdf2Sync(
  password, salt, 1000, 64, 'sha512').toString('hex');
```
Then store the `salt` and `hashedPassword` in your database. This protects the data from reverse engineering and brute-force attacks. As each salt is unique, hashes need to be computed on a per-user level, making for an incredibly inefficient attack target.
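Checking a login attempt later means repeating the hash with the stored salt and comparing the results. Here's a rough sketch. The `hashPassword` and `verifyPassword` helpers are hypothetical, and the iteration count matches the demo above rather than a production setting.

```javascript
import crypto from 'crypto';

// Demo work factor, same as the example above; raise it substantially in production.
const ITERATIONS = 1000;

// Hash a password with a fresh salt, or with a stored one when verifying.
function hashPassword(password, salt = crypto.randomBytes(16).toString('hex')) {
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, 64, 'sha512').toString('hex');
  return { salt, hash };
}

function verifyPassword(attempt, stored) {
  const { hash } = hashPassword(attempt, stored.salt);
  // Constant-time comparison avoids leaking how much of the hash matched.
  return crypto.timingSafeEqual(Buffer.from(hash, 'hex'), Buffer.from(stored.hash, 'hex'));
}

const alice = hashPassword('#password123');
const bob = hashPassword('#password123');

console.log(alice.hash === bob.hash);               // false, different salts
console.log(verifyPassword('#password123', alice)); // true
console.log(verifyPassword('#password124', alice)); // false
```

The cleartext password is never stored or compared directly. Only the salt and the derived hash live in the database.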
Putting it All Together
To better encapsulate the concept, I built a little component depicting each part of the encryption process. Typically, the keys return as Objects or `Promise[Object]`, so the below strings have been converted to `base64` for your viewing pleasure.
- Generate a public and private key.
- Enter a message into the `textarea` and hit 'Encrypt.'
- Then decrypt to view the unsalted result.
- Finally, add salt to view your freshly seasoned message!
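The component itself isn't reproduced here, but a rough Node equivalent of those four steps might look like the sketch below. I'm assuming an RSA key pair and the same PBKDF2 settings used earlier; the component may well use different primitives under the hood.

```javascript
import crypto from 'crypto';

// 1. Generate a public and private key, exporting the public one as base64 for display.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const publicKeyBase64 = publicKey.export({ type: 'spki', format: 'der' }).toString('base64');

// 2. Encrypt a short message with the public key.
const message = 'His name was Robert Paulson';
const encrypted = crypto.publicEncrypt(publicKey, Buffer.from(message, 'utf8'));

// 3. Decrypt it with the private key.
const decrypted = crypto.privateDecrypt(privateKey, encrypted).toString('utf8');

// 4. Add salt and hash the message.
const salt = crypto.randomBytes(16).toString('hex');
const salted = crypto.pbkdf2Sync(message, salt, 1000, 64, 'sha512').toString('hex');

console.log(publicKeyBase64.slice(0, 32) + '...');
console.log(encrypted.toString('base64').slice(0, 32) + '...');
console.log(decrypted); // His name was Robert Paulson
console.log(salted.length); // 128 hex characters
```

Steps two and three are reversible, while step four is not: there's no function that turns `salted` back into the message.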
To Conclude
- TL;DR: Encrypt data that needs to be read later and hash data that should never be read.
Encryption and hashing are related but separate methods used to secure data on the web. Today, data flows via the TLS standard, allowing the sender and the receiver to verify one another before exchanging data. TLS commonly utilizes the AES cipher for encryption and RSA for secure key exchange.
Never pass sensitive information like passwords, payment data, and user credentials in plaintext. Encrypt what you'll need to read again, and hash the rest. Hashing is an irreversible process which scrambles incoming data and spits out a fixed-length string, regardless of the length of the input. The data is salted before hashing, and both the hash and generated salt are stored on the server. The stored hash can then be compared against the hash of an incoming value to confirm its validity.
So, while encryption, salting, and hashing are related, they each play essential roles in the greater security process.
Today, we covered the basics of web encryption using Node's default `crypto` library. To quickly recap:
- We handled the creation of a unique encryption key.
- We created a unique AES cipher and IV to encrypt and decrypt our data.
- We covered the hashing and salting process, ensuring unique data hashes and closing attack vectors.
- We watched the process from start to finish to better understand each step.
With our feet sufficiently wet, the follow-up article dives into user authentication and the tokenized web! Until then, thanks for reading, scoffing, or just stopping by!
Cheers!