Goal

Understand entropy and randomness as the foundation of cryptographic security, learn binary encoding (Base64, hex), and combine everything into practical encrypted workflows with best practices.

Prerequisites: Week 2b (Symmetric Encryption)

This is Part 3 of 3 - Covers entropy, encoding, workflows, and best practices.


1. Entropy and Randomness: The Foundation of Security

What Is Entropy?

Entropy is a measure of unpredictability (randomness).

In cryptography:

  • High entropy = Truly random, unpredictable
  • Low entropy = Patterns, predictable

Why it matters:

  • Encryption keys MUST be random
  • Predictable keys = Broken encryption
  • Random number generators (RNGs) are security-critical
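To put numbers on "high" and "low" entropy, a common estimate is bits = length × log2(alphabet size). It assumes every symbol is chosen uniformly and independently, which human-chosen passwords are not. A minimal sketch follows; the helper name entropy_bits is made up for illustration:

```shell
# Estimate entropy in bits for a secret of LEN symbols drawn uniformly
# at random from an alphabet of SIZE symbols: bits = LEN * log2(SIZE).
# entropy_bits is an illustrative helper, not a standard tool.
entropy_bits() {
  LC_ALL=C awk -v size="$1" -v len="$2" \
    'BEGIN { printf "%.1f\n", len * log(size) / log(2) }'
}

entropy_bits 26 8     # 8 random lowercase letters
entropy_bits 256 32   # 32 random bytes (a 256-bit key)
```

The first call prints 37.6 and the second 256.0: a short lowercase password is hundreds of bits weaker than a random 32-byte key.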

Sources of Randomness in Linux

Linux provides two key sources:

1. /dev/random (Blocking on Older Kernels)

# Before kernel 5.6: blocked (waited) whenever the entropy estimate ran low
# Since kernel 5.6: blocks only until the CSPRNG has been seeded at boot
# Historically used for: Generating long-term cryptographic keys
head -c 32 /dev/random | base64

2. /dev/urandom (Non-Blocking, Cryptographically Secure)

# Never blocks, always returns data
# Used for: Most cryptographic operations (recommended)
head -c 32 /dev/urandom | base64

Modern Linux (kernel 5.6+): Both devices draw from the same CSPRNG and are equally secure once it is seeded. /dev/urandom is recommended for all uses.

How Linux Generates Randomness

Entropy sources:

  • Hardware events: Keyboard timings, mouse movements, disk I/O
  • CPU jitter: Timing variations in instruction execution
  • Hardware RNG: RDRAND/RDSEED instructions on Intel and AMD CPUs (if available)
  • Network packet timings

Process:

Hardware Events → Entropy Pool → CSPRNG → /dev/urandom
(keyboard, disk)   (mixed)     (ChaCha20)  (secure random bytes)

CSPRNG = Cryptographically Secure Pseudo-Random Number Generator
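To see what your own machine exposes about this pipeline, the sketch below reads two Linux-specific /proc files. The function name rng_report is hypothetical, and the script skips whatever is missing. On recent kernels the entropy_avail value may simply sit at a fixed 256.

```shell
# Report what the running kernel exposes about its RNG.
# These /proc paths are Linux-specific; anything missing is reported, not fatal.
rng_report() {
  proc=/proc/sys/kernel/random
  if [ -r "$proc/entropy_avail" ]; then
    echo "entropy_avail: $(cat "$proc/entropy_avail") bits"
  else
    echo "entropy_avail: not available on this system"
  fi
  if grep -qw rdrand /proc/cpuinfo 2>/dev/null; then
    echo "hardware RNG:  CPU advertises RDRAND"
  else
    echo "hardware RNG:  RDRAND not reported"
  fi
}

rng_report
```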

Generating Random Keys

# Generate 256-bit (32-byte) encryption key
head -c 32 /dev/urandom > encryption.key

# Generate random password (base64 encoded for readability)
head -c 32 /dev/urandom | base64
# Output: 44 Base64 characters (A-Z, a-z, 0-9, +, /) ending in "=", different on every run

# Generate hex-encoded random data (-c 32 keeps all 32 bytes on one line)
xxd -l 32 -p -c 32 /dev/urandom
# Output (example): 7d8f4c3a1b2e5f6d9c8a7b6e5f4d3c2b1a9e8d7c6b5a4f3e2d1c0b9a8e7f6d5c
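The random values change on every run, but their encoded lengths do not, which makes a quick sanity check possible. A small sketch, using od as a portable stand-in for xxd; the variable names are illustrative:

```shell
# Two independent 32-byte samples from /dev/urandom.
# The values differ on every run; only the lengths are predictable.
key_b64=$(head -c 32 /dev/urandom | base64 | tr -d '\n')
key_hex=$(head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n')

echo "base64: ${#key_b64} characters"   # 44 = 4 * ceil(32 / 3)
echo "hex:    ${#key_hex} characters"   # 64 = 2 * 32
```

If either length comes out shorter, the read from /dev/urandom was cut off and the key should not be used.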

The Danger of Weak RNGs

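Historical failure: Debian OpenSSL bug (2006-2008, CVE-2008-0166)

  • A Debian-specific patch removed almost all entropy from OpenSSL's RNG seeding
  • The process ID was the only input left: 32,767 possible keys per key type and size
  • A properly seeded RNG makes the keyspace far too large to enumerate
  • All SSH and TLS keys generated on affected Debian-based systems during this period were weak
  • Attackers could generate and try every possible key in hours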

Lesson: Never implement your own RNG. Use /dev/urandom.
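To get a feel for how small a 32,767-value keyspace is, here is a toy model. It is not the actual OpenSSL code path: the "key" is just the SHA-256 of a PID-like number, and toy_key and the seed 321 are invented for the demonstration.

```shell
# Toy model only: the "key" is the SHA-256 of a small PID-like number.
# This is NOT the real OpenSSL code, just the same kind of weakness.
toy_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

victim_key=$(toy_key 321)   # the attacker sees the key, not the seed

# Try every possible seed until the derived key matches.
seed=1
while [ "$seed" -le 32767 ]; do
  if [ "$(toy_key "$seed")" = "$victim_key" ]; then
    echo "recovered seed: $seed"
    break
  fi
  seed=$((seed + 1))
done
```

Even this slow shell loop covers the whole range in about a minute on ordinary hardware. A real attack tool would be far faster.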


2. Binary Encoding: Base64, Hex, and Why They Matter

The Problem: Binary Data in Text Systems

Encryption produces binary data (random-looking bytes with values 0-255), but many systems expect text:

  • Email (originally 7-bit ASCII)
  • JSON configuration files
  • Copy/paste in terminals

Solution: Encode binary as text using Base64 or Hex.

Base64 Encoding

Concept: Represent binary data using 64 printable ASCII characters.

Character set: A-Z, a-z, 0-9, +, / (64 total)

Efficiency: 3 bytes binary → 4 bytes base64 (33% overhead)

# Encode binary to base64
echo "Hello, cypherpunk!" | base64
# Output: SGVsbG8sIGN5cGhlcnB1bmshCg==

# Decode base64 to binary
echo "SGVsbG8sIGN5cGhlcnB1bmshCg==" | base64 -d
# Output: Hello, cypherpunk!
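One detail is easy to miss in the example above: echo appends a newline, and that newline is encoded too. The sketch below compares echo with printf '%s' and then checks the 4/3 size ratio on random input. The variable names are illustrative.

```shell
# echo appends a newline, which is encoded as well (the trailing "Cg==").
# printf '%s' encodes exactly the bytes given.
with_newline=$(echo "Hello, cypherpunk!" | base64)
without_newline=$(printf '%s' "Hello, cypherpunk!" | base64)
echo "$with_newline"      # SGVsbG8sIGN5cGhlcnB1bmshCg==
echo "$without_newline"   # SGVsbG8sIGN5cGhlcnB1bmsh

# Overhead: 30 random bytes always become 40 Base64 characters (4/3 ratio).
encoded_len=$(head -c 30 /dev/urandom | base64 | tr -d '\n' | wc -c)
echo "30 bytes -> $((encoded_len)) characters"
```

Use printf '%s' when the encoded value must match byte for byte, for example when comparing against a value produced by another program.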

Hexadecimal Encoding

Concept: Represent each byte as two hex digits (0-9, a-f; upper and lower case are equivalent).

Efficiency: 1 byte binary → 2 bytes hex (100% overhead)

When to use:

  • Debugging (more readable than base64)
  • Cryptographic hashes (SHA-256 output is conventionally shown in hex)
  • Low-level binary analysis

# Encode to hex
echo "Hello" | xxd -p
# Output: 48656c6c6f0a

# Decode from hex
echo "48656c6c6f0a" | xxd -r -p
# Output: Hello
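The trailing 0a in the output above is the newline added by echo. The sketch below encodes only the five letters, using od as a POSIX alternative in case xxd is not installed, and then splits the result into bytes for easier reading.

```shell
# Hex-encode without the trailing newline that echo would add.
# od is used as a POSIX alternative in case xxd is not installed.
hex=$(printf '%s' "Hello" | od -An -v -tx1 | tr -d ' \n')
echo "$hex"                            # 48656c6c6f
echo "${#hex} hex digits for 5 bytes"  # two digits per byte

# Insert a space after every byte for a debugging-friendly view.
printf '%s\n' "$hex" | sed 's/../& /g'
```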

Why Base64 Has “=” Padding

Base64 encodes 3 bytes at a time:

  • 3 bytes (24 bits) → 4 base64 characters
  • If the input length is not divisible by 3, padding is added

2 bytes: "He" → SGU= (one = for padding)
1 byte:  "H"  → SA== (two = for padding)

The “=” is filler, not part of the data.
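The padding rule can be checked directly by encoding inputs of one, two, and three bytes. A short sketch:

```shell
# Encode 1-, 2-, and 3-byte inputs and watch the padding disappear.
for text in H He Hel; do
  encoded=$(printf '%s' "$text" | base64)
  printf '%-3s (%d bytes) -> %s\n' "$text" "${#text}" "$encoded"
done
```

The three lines end in SA==, SGU=, and SGVs: two padding characters, then one, then none once the input fills a whole 3-byte group.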


3. Putting It All Together: Encrypted Workflow

Let’s combine everything into a real-world encrypted workflow.

Scenario: Encrypting Sensitive Documents

# 1. Create directory structure
mkdir -p ~/secure/{documents,keys,logs}

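# 2. Generate strong encryption key (256 bits of randomness, Base64-encoded)
# -pass file: reads only the first line of the file, so the key is stored as text
openssl rand -base64 32 > ~/secure/keys/master.key
chmod 600 ~/secure/keys/master.key  # Only you can read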

# 3. Create sensitive document
cat > ~/secure/documents/secrets.txt <<EOF
Bank account: 1234567890
SSH passphrase: MySecretPhrase123
API keys: sk_live_abc123xyz
EOF

# 4. Hash the original (for integrity verification)
sha256sum ~/secure/documents/secrets.txt > ~/secure/logs/secrets.sha256

# 5. Encrypt with AES-256-CBC and a PBKDF2-derived key
# (openssl enc does not support AEAD modes such as GCM; OpenSSL 1.1.0+ rejects them)
# "$HOME" is used because ~ is not expanded after "file:"
openssl enc -aes-256-cbc -salt -pbkdf2 -iter 100000 \
  -in ~/secure/documents/secrets.txt \
  -out ~/secure/documents/secrets.enc \
  -pass "file:$HOME/secure/keys/master.key"

# 6. Securely delete original (Week 5 covers secure deletion)
shred -u ~/secure/documents/secrets.txt

# 7. Later: Decrypt when needed
openssl enc -d -aes-256-cbc -pbkdf2 -iter 100000 \
  -in ~/secure/documents/secrets.enc \
  -out ~/secure/documents/secrets-decrypted.txt \
  -pass "file:$HOME/secure/keys/master.key"

# 8. Verify integrity (hash should match)
sha256sum ~/secure/documents/secrets-decrypted.txt
cat ~/secure/logs/secrets.sha256

What We Just Did

  1. Generated random key - High-entropy 256-bit key from OpenSSL's CSPRNG (openssl rand)
  2. Protected key file - Set strict permissions (chmod 600)
  3. Hashed original - SHA-256 fingerprint for integrity verification
  4. Encrypted with AES-256-CBC - Key derived with PBKDF2; CBC provides confidentiality only, so the SHA-256 check in step 8 supplies the integrity test
  5. Deleted plaintext - Removed unencrypted copy
  6. Decrypted securely - Retrieved data with key file
  7. Verified integrity - Checked hash matches (detects corruption or tampering, provided the stored hash itself was not altered)
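Comparing two hashes by eye is error-prone, so the final check is worth scripting. The sketch below runs the whole round trip in a throwaway directory and touches nothing under ~/secure. It uses AES-256-CBC with PBKDF2 because the openssl enc utility does not support AEAD modes such as GCM, and it assumes OpenSSL 1.1.1 or later for -pbkdf2. File and variable names are illustrative.

```shell
# Self-contained round trip in a temporary directory.
# Assumes OpenSSL 1.1.1+ (for -pbkdf2) and GNU coreutils (sha256sum).
work=$(mktemp -d)
openssl rand -base64 32 > "$work/demo.key"
printf 'demo secret\n' > "$work/plain.txt"

# Record the hash of the original before encrypting.
expected=$(sha256sum "$work/plain.txt" | cut -d' ' -f1)

openssl enc -aes-256-cbc -salt -pbkdf2 -iter 100000 \
  -in "$work/plain.txt" -out "$work/plain.enc" -pass "file:$work/demo.key"
openssl enc -d -aes-256-cbc -pbkdf2 -iter 100000 \
  -in "$work/plain.enc" -out "$work/roundtrip.txt" -pass "file:$work/demo.key"

# Compare the hash of the decrypted file with the recorded one.
actual=$(sha256sum "$work/roundtrip.txt" | cut -d' ' -f1)
if [ "$expected" = "$actual" ]; then
  echo "OK: decrypted file matches the original hash"
else
  echo "MISMATCH: do not trust the decrypted file"
fi
rm -rf "$work"
```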

4. Cryptographic Best Practices

Algorithm Selection Guide

For Hashing:

  • SHA-256 - General purpose, widely supported
  • BLAKE2b - Faster alternative, modern applications
  • SHA-1 - Only if a legacy system requires it (practical collision attacks exist)
  • MD5 - Never use (broken)

For Symmetric Encryption:

  • AES-256-GCM - Default choice, authenticated
  • ChaCha20-Poly1305 - Mobile/embedded devices, Tor
  • AES-256-CBC - Legacy systems only
  • AES-ECB - Never use (leaks patterns)
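"Leaks patterns" is easy to demonstrate: encrypt two identical 16-byte blocks and compare the ciphertext blocks. The sketch below uses a fixed demo key and IV so the behavior is reproducible. Real keys should never be hard-coded, and the variable names are illustrative.

```shell
# Fixed demo key and IV for reproducibility. Never hard-code real keys.
key=000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
iv=000102030405060708090a0b0c0d0e0f
plain=$(printf '%032d' 0)   # 32 ASCII zeros = two identical 16-byte blocks

ecb=$(printf '%s' "$plain" | openssl enc -aes-256-ecb -K "$key" -nopad \
        | od -An -v -tx1 | tr -d ' \n')
cbc=$(printf '%s' "$plain" | openssl enc -aes-256-cbc -K "$key" -iv "$iv" -nopad \
        | od -An -v -tx1 | tr -d ' \n')

# Print each ciphertext as two 16-byte blocks (32 hex digits each).
echo "ECB block 1: $(printf '%s' "$ecb" | cut -c1-32)"
echo "ECB block 2: $(printf '%s' "$ecb" | cut -c33-64)"   # identical to block 1
echo "CBC block 1: $(printf '%s' "$cbc" | cut -c1-32)"
echo "CBC block 2: $(printf '%s' "$cbc" | cut -c33-64)"   # differs from block 1
```

In ECB, identical plaintext blocks always produce identical ciphertext blocks, so repeated structure in a file shows through the encryption. CBC chains each block to the previous one, which hides the repetition.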

For Key Derivation:

  • Argon2id - Best choice (if available)
  • scrypt - Good alternative
  • PBKDF2 - Minimum acceptable (100k+ iterations)
  • Plain SHA-256 - Never use for passwords
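To watch a key derivation function at work, openssl enc can print the key it derives from a password without encrypting anything (the -P option). The sketch below assumes OpenSSL 1.1.1 or later. The helper name derive_key, the password, and the salts are throwaway demo values.

```shell
# -P prints the salt, key, and IV that would be used, then exits.
# The password and salts below are demo values only.
derive_key() {
  openssl enc -aes-256-cbc -pbkdf2 -iter 100000 -md sha256 \
    -S "$1" -pass pass:correct-horse-battery-staple -P 2>/dev/null \
    | grep '^key=' | cut -d= -f2
}

key_a=$(derive_key 0011223344556677)
key_a_again=$(derive_key 0011223344556677)
key_b=$(derive_key 8899aabbccddeeff)

[ "$key_a" = "$key_a_again" ] && echo "same password + same salt  -> same key"
[ "$key_a" != "$key_b" ]      && echo "same password + other salt -> different key"
```

The salt is what stops an attacker from precomputing one table of derived keys for common passwords, and the iteration count makes each individual guess expensive.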

Common Mistakes to Avoid

1. Using ECB mode

# WRONG
openssl enc -aes-256-ecb -in file.txt -out file.enc

2. Not using salt

# WRONG
openssl enc -aes-256-cbc -nosalt -in file.txt -out file.enc  # -nosalt turns off the salt (salting is the default)

3. Weak passwords with strong encryption

# WRONG (strong algorithm, weak password)
openssl enc -aes-256-cbc -pbkdf2 -in file.txt -out file.enc -k "password123"
# AES-256 can't protect against password cracking

4. Storing keys with encrypted data

# WRONG
encryption.key  # Same directory as encrypted files
secrets.enc     # Attacker gets both!

5. Not verifying integrity

# WRONG (no way to detect tampering)
# Encrypt but never hash
# Attacker could modify ciphertext
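For contrast with mistake 5, here is what a minimal integrity check looks like. The sketch records a SHA-256 hash of a stand-in ciphertext file, simulates tampering, and re-checks. All file names are illustrative. A plain hash only helps if the hash file is stored where an attacker cannot rewrite it.

```shell
# Record a hash of the ciphertext, then simulate tampering and re-check.
# demo.enc stands in for any encrypted file; here it is just random bytes.
work=$(mktemp -d)
head -c 64 /dev/urandom > "$work/demo.enc"
( cd "$work" && sha256sum demo.enc > demo.enc.sha256 )

# check succeeds only while demo.enc still matches the recorded hash.
check() { ( cd "$work" && sha256sum -c demo.enc.sha256 >/dev/null 2>&1 ); }

check && before=intact || before=modified
printf 'X' >> "$work/demo.enc"          # attacker appends one byte
check && after=intact || after=modified

echo "before tampering: $before"   # intact
echo "after tampering:  $after"    # modified
rm -rf "$work"
```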

Security Checklist

Before deploying encryption:

  • Using modern algorithm (AES-256-GCM or ChaCha20-Poly1305)
  • Strong key (256-bit random from /dev/urandom)
  • Key derivation if password-based (PBKDF2 100k+ iterations minimum)
  • Authenticated encryption (GCM or Poly1305)
  • Integrity verification (SHA-256 hash)
  • Secure key storage (separate from encrypted data, chmod 600)
  • Tested decryption (verify you can recover data)

Week 2 Checklist

  • Understand difference between hashing and encryption
  • Explain why MD5 is broken (collision attacks)
  • Hash files with SHA-256 and verify integrity
  • Encrypt files with AES-256-GCM (authenticated encryption)
  • Decrypt files and verify authentication tag
  • Generate 256-bit random keys from /dev/urandom
  • Understand encryption modes (ECB broken, CBC legacy, GCM modern)
  • Know when to use AES vs ChaCha20
  • Use key derivation for password-based encryption
  • Encode binary data with base64 and hex
  • Demonstrate avalanche effect with tiny input changes
  • Create encrypted workflow (hash → encrypt → verify)

Journal & Git Commit

echo "Week 2: Mastered hash functions (SHA-256), symmetric encryption (AES-256-GCM), key derivation (PBKDF2/Argon2), and entropy sources. Built encrypted workflow with integrity verification." >> notes/week02_journal.md

git add .
git commit -S -m "Week 2 - Cryptographic fundamentals, hashing, encryption"

Up Next: Week 3

Now that you understand symmetric encryption (same key for encrypt/decrypt), Week 3 introduces asymmetric encryption:

  • Public/private keypairs - Math that makes key exchange possible
  • GPG (GnuPG) - The cypherpunk’s Swiss Army knife
  • Digital signatures - Prove you wrote a message
  • Web of trust - Decentralized identity verification

The Bridge: Week 2 taught you how encryption works. Week 3 teaches you how to use it without sharing your secret key.


Additional Resources

Tools Covered This Week:

  • openssl - Swiss Army knife of cryptography
  • sha256sum - SHA-2 family hashing
  • xxd / hexdump - Binary data visualization
  • base64 - Binary-to-text encoding
  • /dev/urandom - Cryptographically secure RNG

Key Takeaways

  • Entropy measures randomness - high entropy keys are unpredictable
  • /dev/urandom is the recommended source for cryptographic randomness
  • Base64 encodes binary for text systems (33% overhead)
  • Hex encoding is more readable for debugging (100% overhead)
  • Never implement your own RNG - use system-provided sources
  • Complete workflow: Generate key → Hash original → Encrypt → Delete plaintext → Verify