Networking Interview Prep
Application Protocols

HTTPS

Secure Web Communication

LinkedIn Hook

Most developers treat the padlock icon as a checkbox.

"It has HTTPS — must be safe." Wrong assumption.

HTTPS does not mean the website is trustworthy. It means the connection is encrypted. A malicious site selling fake goods can have a valid padlock too.

Here is what HTTPS actually guarantees — and what most developers miss in interviews:

  • Encryption — no one in the middle can read what you send (not your ISP, not airport Wi-Fi, not a nation-state intercept)
  • Authentication — you are actually talking to the real server, not an imposter (verified by a Certificate Authority)
  • Integrity — data was not tampered with in transit (no silent byte-flipping by a proxy)

And none of that means the destination is honest — only the tunnel is secure.

Understand this distinction and you will answer HTTPS interview questions better than 90% of candidates.

Read the full lesson → [link]

#Networking #HTTPS #WebSecurity #TLS #InterviewPrep #BackendEngineering




What You'll Learn

  • Why HTTPS is simply HTTP with a TLS layer wrapped around it
  • The three concrete security problems HTTPS solves
  • How the TLS handshake works (simplified, interview-ready version)
  • What an X.509 certificate contains and why it matters
  • The Certificate Authority chain of trust
  • Self-signed certificates vs CA-signed certificates
  • What the browser padlock icon really means (and does not mean)
  • How Let's Encrypt provides free certificates via the ACME protocol
  • What mixed content is and why it breaks your HTTPS guarantees

The Analogy That Makes This Click

Imagine you want to send a message to your bank.

HTTP is a postcard. You write your account number on it, drop it in a mailbox, and it travels through dozens of postal workers — every one of them can read it. Any postal worker can also cross out your message and write something different before it arrives. And when it arrives, you have no way to confirm it actually came from your bank and not someone pretending to be them.

HTTPS is a sealed, certified letter. Before the first word is exchanged:

  1. Your bank proves its identity using a certificate stamped by a trusted authority (like a notary)
  2. You and the bank agree on a private code that only the two of you know
  3. The letter is sealed with that code — no one in transit can read it or alter it without the seal breaking

The postman (your ISP, the router at the coffee shop, any middleman) sees that a sealed letter went from you to your bank. That is all they know.


HTTPS = HTTP + TLS

HTTPS is not a new protocol. It is the same HTTP protocol you already know — requests, responses, status codes, headers — but the entire conversation is wrapped inside a TLS (Transport Layer Security) tunnel before it travels over the network.

Application Layer:  HTTP  (request/response — same as always)
                     |
Security Layer:     TLS   (encrypts and authenticates the HTTP data)
                     |
Transport Layer:    TCP   (reliable delivery)
                     |
Network Layer:      IP    (routing)

The default port for HTTPS is 443. When your browser connects to https://example.com, it:

  1. Establishes a TCP connection to port 443
  2. Runs the TLS handshake to create a secure channel
  3. Sends and receives normal HTTP messages through that channel

Everything below the application layer is invisible to your code. fetch("https://...") handles all of this for you.
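To make the three steps concrete, here is a minimal Python sketch that performs them by hand with the standard library instead of relying on a high-level client. The function names (build_request, fetch_status_line) are illustrative choices, not part of any API mentioned in this lesson, and the sketch reads only the first response line.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """Plain HTTP/1.1 request bytes; identical for HTTP and HTTPS."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_status_line(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the first line of the HTTP response fetched over TLS."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()

    # Step 1: TCP connection to port 443
    with socket.create_connection((host, port), timeout=timeout) as tcp_sock:
        # Step 2: TLS handshake (performed inside wrap_socket)
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Step 3: ordinary HTTP bytes sent through the encrypted channel
            tls_sock.sendall(build_request(host))
            response = b""
            while b"\r\n" not in response:
                chunk = tls_sock.recv(4096)
                if not chunk:
                    break
                response += chunk
    return response.split(b"\r\n", 1)[0].decode("iso-8859-1")
```

Calling fetch_status_line("example.com") requires network access. If the certificate fails verification, wrap_socket raises ssl.SSLCertVerificationError before any HTTP bytes are sent.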


The Three Problems HTTPS Solves

1. Encryption — No Eavesdropping

Without encryption, any device your packets pass through can read the exact contents: passwords, session tokens, credit card numbers, private messages.

With HTTPS, all data is encrypted using a session key that only the client and server know. Even if an attacker captures every packet, they see random bytes they cannot decrypt.

Real-world consequence: On an HTTP site, connecting from a coffee shop Wi-Fi exposes everything you send. On HTTPS, an attacker on the same Wi-Fi sees only that you connected to a particular server — not what you said.

2. Authentication — No Impersonation

Without authentication, nothing stops an attacker from routing your traffic to a server they control (DNS spoofing, BGP hijacking, ARP poisoning on local networks). You would happily send your credentials to the attacker's server thinking it was your bank.

With HTTPS, the server presents a digital certificate that proves its identity. Your browser verifies it was signed by a trusted Certificate Authority. If the certificate does not match the domain, the browser refuses to connect and shows an error page.

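Real-world consequence: An attacker who intercepts the connection cannot impersonate the server without a certificate the browser accepts. In practice that means compromising or deceiving a trusted CA, or getting a rogue root certificate installed on the victim's device.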

3. Integrity — No Tampering

Without integrity protection, a middleman (your ISP, a router) can silently modify data in transit. Some ISPs historically injected ads into HTTP responses. Attackers on local networks can alter login forms to redirect your credentials.

With HTTPS, every TLS record carries a cryptographic authentication tag: a MAC (Message Authentication Code), or the built-in tag of an AEAD cipher such as AES-GCM. If a single byte is changed in transit, the check fails and the connection is terminated.

Real-world consequence: What you send is exactly what arrives. What the server sends is exactly what your browser renders.
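The tamper check can be illustrated with a standalone HMAC. This is a sketch of the idea only: real TLS computes the tag per record inside the cipher suite, and the key and messages below are made up for the example.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical session key known only to client and server
session_key = secrets.token_bytes(32)

message = b"transfer 100 to account 12345"
tag = make_tag(session_key, message)

# A middleman changes one byte but cannot recompute the tag without the key
tampered = b"transfer 900 to account 12345"

print(verify(session_key, message, tag))   # True
print(verify(session_key, tampered, tag))  # False
```

Because the attacker does not hold the session key, any modification leaves the old tag invalid, which is what causes the receiver to drop the connection.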


The TLS Handshake (Simplified)

Before any HTTP data flows, the client and server run the TLS handshake to agree on encryption parameters and authenticate the server. Here is the interview-ready version:

Client                                          Server
  |                                               |
  |--- ClientHello -------------------------------->
  |    (TLS versions supported, cipher suites,    |
  |     random number)                            |
  |                                               |
  |<-- ServerHello --------------------------------|
  |    (chosen TLS version, chosen cipher suite,  |
  |     random number)                            |
  |                                               |
  |<-- Certificate --------------------------------|
  |    (server's X.509 cert: domain, public key,  |
  |     CA signature, expiry)                     |
  |                                               |
  |    [Client verifies certificate against       |
  |     trusted CA list]                          |
  |                                               |
  |--- Key Exchange ------------------------------->
  |    (client generates pre-master secret,       |
  |     encrypts with server's public key)        |
  |                                               |
  |    [Both derive the same session key          |
  |     from the pre-master secret]               |
  |                                               |
  |--- Finished (encrypted) ----------------------->
  |<-- Finished (encrypted) -----------------------|
  |                                               |
  |    [Handshake complete — switch to            |
  |     symmetric encryption for all data]        |
  |                                               |
  |=== Encrypted HTTP data flows =================>

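Key insight for interviews: The handshake uses asymmetric cryptography (public/private keys) only to authenticate the server and establish a shared secret. All application data then uses symmetric encryption because it is far faster. The diagram above shows the classic RSA key exchange, in which the client encrypts the pre-master secret with the server's public key; this mode exists only up to TLS 1.2 and lacks forward secrecy. Modern deployments use an ephemeral Diffie-Hellman exchange (ECDHE) instead, and the server's private key is used only to sign the handshake and prove identity.

TLS 1.3 (the current standard) completes the handshake in 1 round trip (1-RTT) instead of the 2 round trips of TLS 1.2. It also removes RSA key exchange and other legacy algorithms, leaving only cipher suites with forward secrecy and authenticated encryption.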
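The step "both derive the same session key" can be sketched as follows. This is a deliberately simplified derivation using HMAC-SHA256; real TLS uses its own key schedule (a PRF in TLS 1.2, HKDF in TLS 1.3), and the label and sizes here are assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets


def derive_session_key(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Toy derivation: HMAC-SHA256 keyed by the shared secret over both randoms."""
    return hmac.new(
        pre_master,
        b"session key" + client_random + server_random,
        hashlib.sha256,
    ).digest()


client_random = secrets.token_bytes(32)  # sent in the clear in ClientHello
server_random = secrets.token_bytes(32)  # sent in the clear in ServerHello
pre_master = secrets.token_bytes(48)     # shared secret from the key exchange

# Each side runs the same function on the same inputs, independently
client_key = derive_session_key(pre_master, client_random, server_random)
server_key = derive_session_key(pre_master, client_random, server_random)

# An eavesdropper sees both randoms but has to guess the shared secret
eavesdropper_key = derive_session_key(secrets.token_bytes(48), client_random, server_random)
```

The two randoms travel unencrypted, so the security of the session key rests entirely on the shared secret staying private.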


X.509 Certificates — What Is in the Padlock

A certificate is a digital document that binds a public key to an identity (your domain name). It is what makes authentication possible.

An X.509 certificate contains:

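+-------------------+-------------------------------------+-------------------------------------+
| Field             | Example                             | Purpose                             |
+-------------------+-------------------------------------+-------------------------------------+
| Subject           | CN=example.com                      | The domain this cert is for         |
| Issuer            | Let's Encrypt Authority X3          | The CA that signed it               |
| Public Key        | RSA 2048-bit or EC P-256            | Used during the key exchange        |
| Validity Period   | 2025-01-01 to 2025-07-01            | Not-before / not-after dates        |
| Serial Number     | 03:A1:B2:...                        | Unique identifier                   |
| CA Signature      | Digital signature blob              | Proves the CA vouched for this cert |
| Subject Alt Names | example.com, www.example.com        | All domains this cert covers        |
| Key Usage         | Digital Signature, Key Encipherment | What the key can be used for        |
+-------------------+-------------------------------------+-------------------------------------+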

When your browser receives a certificate, it checks:

  1. Is the domain in the certificate's Subject/SAN fields?
  2. Is it within the validity period?
  3. Is the CA signature valid and from a CA the browser trusts?
  4. Has the certificate been revoked (CRL / OCSP check)?

If any check fails, the browser shows an error. If all pass, the padlock appears.
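Checks 1 and 2 can be sketched in Python against the dictionary shape that ssl.SSLSocket.getpeercert() returns. The sample certificate is hand-written for illustration, the helper names are hypothetical, and the wildcard rule is simplified; checks 3 and 4 (signature and revocation) are handled by the TLS library and are not reproduced here.

```python
import ssl
from datetime import datetime, timezone


def hostname_matches(pattern: str, hostname: str) -> bool:
    """Exact match, or a leftmost '*' covering exactly one label."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        _, _, parent = hostname.partition(".")
        return bool(parent) and parent == pattern[2:]
    return pattern == hostname


def check_certificate(cert: dict, hostname: str, now: datetime) -> list[str]:
    """Return a list of problems; an empty list means checks 1 and 2 passed."""
    problems = []

    # Check 1: is the hostname listed in the Subject Alternative Names?
    san_names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    if not any(hostname_matches(name, hostname) for name in san_names):
        problems.append("hostname not covered by certificate")

    # Check 2: is the current time inside the validity window?
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    if not (not_before <= now.timestamp() <= not_after):
        problems.append("outside validity period")

    return problems


# Hand-written sample in the getpeercert() format (illustrative values)
sample_cert = {
    "subjectAltName": (("DNS", "example.com"), ("DNS", "www.example.com")),
    "notBefore": "Jan  1 00:00:00 2025 GMT",
    "notAfter": "Jul  1 00:00:00 2025 GMT",
}
in_window = datetime(2025, 3, 1, tzinfo=timezone.utc)
after_expiry = datetime(2025, 8, 1, tzinfo=timezone.utc)
```

In real code there is rarely a reason to write these checks yourself: ssl.create_default_context() already enforces them during the handshake.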


Certificate Authority (CA) and Chain of Trust

How does your browser know which CAs to trust? They are pre-installed — your operating system and browser ship with a list of ~150 trusted Root CAs (Mozilla, Google, Apple, and Microsoft each maintain their own list).

The trust chain works as a hierarchy:

Root CA  (self-signed, built into your OS/browser — offline, highly secured)
  |
  |--- Intermediate CA  (signs leaf certs on behalf of the Root)
          |
          |--- Leaf Certificate  (your website's actual cert)

Why Intermediate CAs? Root CA keys are kept offline in hardware security modules (HSMs): air-gapped and physically secured. If a Root CA's private key were compromised, every certificate that chains up to that root would have to be distrusted, and the root itself would have to be removed from every trust store. Intermediate CAs do the day-to-day signing. If an Intermediate CA is compromised, only the certificates it signed need to be revoked; the root remains safe.

When a browser verifies your certificate, it walks the chain:

  1. Is the leaf cert signed by an Intermediate CA I recognize?
  2. Is that Intermediate CA signed by a Root CA I trust?
  3. Is the Root CA in my trusted store?

All three must be true. This is why servers must send the full certificate chain (leaf + intermediates) in the TLS handshake — not just the leaf certificate.

# Inspecting the full certificate chain with openssl
openssl s_client -connect example.com:443 -showcerts

# Output includes:
# Certificate chain
#  0 s:CN=example.com              (leaf)
#  1 s:CN=Let's Encrypt R3         (intermediate)
#  2 s:CN=ISRG Root X1             (root)
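
The chain walk can be modelled with a toy data structure. This sketch compares issuer and subject names only; a real verifier also checks each cryptographic signature, validity dates, and constraints. The Cert class and chain_is_trusted function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # who the certificate is for
    issuer: str   # who signed it


def chain_is_trusted(chain: list[Cert], trusted_roots: set[str]) -> bool:
    """chain[0] is the leaf. Every cert must be issued by the next cert's
    subject, and the last cert must be issued by a root in the trust store."""
    if not chain:
        return False
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
    return chain[-1].issuer in trusted_roots


trusted_roots = {"ISRG Root X1"}
leaf = Cert(subject="example.com", issuer="Let's Encrypt R3")
intermediate = Cert(subject="Let's Encrypt R3", issuer="ISRG Root X1")
self_signed = Cert(subject="paypal.com", issuer="paypal.com")

print(chain_is_trusted([leaf, intermediate], trusted_roots))  # True
print(chain_is_trusted([leaf], trusted_roots))                # False
print(chain_is_trusted([self_signed], trusted_roots))         # False
```

The second call shows why a server that sends only the leaf can fail validation: without the intermediate, the client has no path from the leaf to a root it trusts.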

Self-Signed vs CA-Signed Certificates

A self-signed certificate is one where the issuer and subject are the same — the server signed its own certificate using its own private key. There is no third party vouching for the identity.

# Generate a self-signed cert (for local development only)
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
  -days 365 -nodes -subj "/CN=localhost"
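
+-------------------------+-----------------------+-----------------------------------+
|                         | Self-Signed           | CA-Signed                         |
+-------------------------+-----------------------+-----------------------------------+
| Cost                    | Free                  | Free (Let's Encrypt) or paid      |
| Trusted by browsers     | No (security warning) | Yes                               |
| Proves identity         | No                    | Yes (domain validated at minimum) |
| Suitable for production | Never                 | Yes                               |
| Suitable for local dev  | Yes                   | Unnecessary                       |
+-------------------------+-----------------------+-----------------------------------+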

When a browser encounters a self-signed certificate, it cannot verify the identity claim. Anyone can generate a self-signed cert claiming to be paypal.com. The browser shows a full-page warning: "Your connection is not private." Visitors must manually click through to proceed — which most will not, and should not.

Rule of thumb: Self-signed certificates are for local development and internal tooling only. Never use them in production.


The Padlock Icon — What It Really Means

Clicking the padlock in Chrome or Firefox shows:

  • "Connection is secure" — the TLS handshake succeeded, the certificate is valid, and all data is encrypted
  • The certificate's issuer and validity dates
  • Whether the page loaded with any mixed content

What the padlock does NOT mean:

  • The website is legitimate or trustworthy
  • The business behind it is real
  • The content is safe or accurate
  • Your data is protected on their servers (only in transit)

Phishing sites routinely have valid HTTPS certificates. Getting a domain-validated certificate (DV cert) is free and takes seconds — no identity verification of the actual organization is required. A site with a padlock that sells fake goods is still fraudulent.

For stronger identity assurance, look for OV (Organization Validated) or EV (Extended Validation) certificates, which require the CA to verify the legal existence of the organization. However, browsers no longer visually distinguish EV certificates with a green bar (they removed this around 2019).


Let's Encrypt — Free, Automated Certificates

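Before Let's Encrypt (public beta in late 2015, generally available in 2016), SSL/TLS certificates typically cost $50–$300/year and required manual processes. Let's Encrypt changed that: certificates are free, issuance is automated, and each certificate is valid for 90 days and renewed automatically.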

How it works — the ACME protocol:

1. Your server runs certbot (an ACME client)

2. certbot requests a certificate for your-domain.com

3. Let's Encrypt issues a challenge to prove you control the domain:
   - HTTP-01 challenge: "Place this token at http://your-domain.com/.well-known/acme-challenge/<token>"
   - DNS-01 challenge: "Add this TXT record to your DNS: _acme-challenge.your-domain.com"

4. certbot completes the challenge

5. Let's Encrypt verifies the domain is under your control

6. Let's Encrypt issues the certificate (valid 90 days)

7. certbot installs it and sets up auto-renewal (renews at ~60 days)

# Install certbot and get a certificate (Ubuntu + Nginx)
sudo apt install certbot python3-certbot-nginx
sudo certbot --nginx -d example.com -d www.example.com

# Certbot automatically:
# - Obtains the certificate from Let's Encrypt
# - Configures Nginx to use it
# - Sets up a cron job / systemd timer for auto-renewal

# Test auto-renewal
sudo certbot renew --dry-run
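
What an ACME client publishes for each challenge can be sketched as below, following my reading of RFC 8555: the served value is a "key authorization" made of the token and a thumbprint of the account key, and the DNS-01 TXT record holds a base64url-encoded SHA-256 digest of it. The function names, token, and thumbprint are placeholders, and certbot does all of this for you.

```python
import base64
import hashlib


def http01_challenge(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return (URL the CA will fetch, body the server must return there)."""
    url = f"http://{domain}/.well-known/acme-challenge/{token}"
    key_authorization = f"{token}.{account_thumbprint}"
    return url, key_authorization


def dns01_record(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return (TXT record name, TXT record value) for the DNS-01 challenge."""
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


# Placeholder values for illustration only
url, body = http01_challenge("example.com", "tok123", "thumb456")
txt_name, txt_value = dns01_record("example.com", "tok123", "thumb456")
```

HTTP-01 proves control of the web server on port 80, while DNS-01 proves control of the DNS zone, which is why only DNS-01 can be used for wildcard certificates.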

Let's Encrypt is operated by the Internet Security Research Group (ISRG), a non-profit. It has issued over 3 billion certificates and is now the world's largest CA by volume.


HTTP vs HTTPS — Full Comparison

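+----------------------+---------------------------------+---------------------------------------+
| Feature              | HTTP                            | HTTPS                                 |
+----------------------+---------------------------------+---------------------------------------+
| Default Port         | 80                              | 443                                   |
| Encryption           | None (plaintext)                | TLS (encrypted)                       |
| Authentication       | None                            | Server identity verified via cert     |
| Data Integrity       | None                            | Cryptographic MAC on every message    |
| Certificate Required | No                              | Yes (from trusted CA for production)  |
| Performance          | Slightly faster (no handshake)  | Negligible overhead with TLS 1.3      |
| SEO                  | No ranking boost                | Lightweight Google ranking signal     |
| Browser Warning      | "Not secure" label shown        | No warning (padlock shown)            |
| HTTP/2 Support       | Rarely                          | Yes (HTTP/2 requires TLS in practice) |
| Use Cases            | Internal tooling (never public) | All public-facing sites               |
+----------------------+---------------------------------+---------------------------------------+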

Performance note: A full TLS 1.3 handshake adds one round trip of latency to a new connection (TLS 1.2 adds two). After that, data flows at near-identical speed. TLS 1.3 can reduce this further with 0-RTT session resumption for returning visitors. The performance argument against HTTPS no longer holds, and the large majority of web traffic now uses HTTPS.
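
The round-trip arithmetic is easy to work through. The sketch below assumes a 50 ms round-trip time (an arbitrary example value) and counts one round trip for the TCP handshake plus the TLS round trips for each scenario; it ignores server processing time and packet loss.

```python
def time_to_first_request_ms(rtt_ms: float, tls_round_trips: int) -> float:
    """Delay before the first HTTP request can be sent on a new connection."""
    tcp_round_trips = 1  # TCP three-way handshake
    return rtt_ms * (tcp_round_trips + tls_round_trips)


rtt = 50.0  # assumed round-trip time in milliseconds

scenarios = {
    "plain HTTP": 0,
    "TLS 1.2 full handshake": 2,
    "TLS 1.3 full handshake": 1,
    "TLS 1.3 0-RTT resumption": 0,
}

for name, round_trips in scenarios.items():
    print(f"{name}: {time_to_first_request_ms(rtt, round_trips):.0f} ms")
# plain HTTP: 50 ms
# TLS 1.2 full handshake: 150 ms
# TLS 1.3 full handshake: 100 ms
# TLS 1.3 0-RTT resumption: 50 ms
```

The cost is paid once per connection, so keep-alive and connection reuse spread it across many requests.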


Mixed Content

Mixed content occurs when an HTTPS page loads resources (images, scripts, stylesheets, iframes) over HTTP.

<!-- You are on https://example.com -->

<!-- SAFE: resource also served over HTTPS -->
<script src="https://cdn.example.com/app.js"></script>

<!-- MIXED CONTENT: resource served over HTTP on an HTTPS page -->
<script src="http://cdn.example.com/app.js"></script>
<img src="http://example.com/logo.png" />

Why it matters:

  • An attacker can intercept the HTTP resource and replace it with malicious code
  • A malicious <script> loaded over HTTP on your HTTPS page has full access to the page's DOM, cookies (if not HttpOnly), and form values — the HTTPS protection of the main page means nothing
  • Browsers block active mixed content (scripts, stylesheets, iframes) and show a warning or blank the page
  • Browsers may display a warning or partial padlock for passive mixed content (images, videos) but still load them in some cases

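The fix: Load every resource over HTTPS. Prefer absolute HTTPS URLs. Protocol-relative URLs (//example.com/resource) also work on an HTTPS page because they inherit the page's scheme, but they fall back to HTTP whenever the page itself is served over HTTP. As a safety net, send the Content-Security-Policy header: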

Content-Security-Policy: upgrade-insecure-requests

This header instructs the browser to automatically upgrade all HTTP sub-resource requests to HTTPS before sending them — a simple safety net for legacy mixed content issues.
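
To find mixed content before the browser does, a page can be scanned for http:// sub-resources. The sketch below uses Python's standard html.parser; the class name and the active/passive classification are simplified assumptions (for example, every link element is treated as active), not a reproduction of any browser's rules.

```python
from html.parser import HTMLParser

# Tags whose resources can run code or restyle the page (simplified list)
ACTIVE_TAGS = {"script", "iframe", "link"}


class MixedContentFinder(HTMLParser):
    """Collect sub-resources that an HTTPS page would load over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.findings = []  # list of (kind, tag, url)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return  # ordinary links navigate away; they are not sub-resources
        for name, value in attrs:
            if name in ("src", "href") and value and value.lower().startswith("http://"):
                kind = "active" if tag in ACTIVE_TAGS else "passive"
                self.findings.append((kind, tag, value))


page = """
<script src="https://cdn.example.com/app.js"></script>
<script src="http://cdn.example.com/app.js"></script>
<img src="http://example.com/logo.png" />
"""

finder = MixedContentFinder()
finder.feed(page)
for kind, tag, url in finder.findings:
    print(f"{kind} mixed content: <{tag}> {url}")
```

A scan like this only sees URLs present in the static HTML; resources requested by scripts at runtime need browser developer tools or CSP reporting to detect.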


Common Mistakes

  • Thinking HTTPS means the website is safe or legitimate. HTTPS encrypts the connection. It says nothing about whether the destination is honest, the business is real, or the content is trustworthy. Phishing sites have HTTPS. Always verify the domain, not just the padlock.

  • Using self-signed certificates in production. Browsers will show a full-page security warning for self-signed certs. Users will bounce. Search engines will penalize you. Let's Encrypt provides free, browser-trusted certificates with automated renewal — there is no valid reason to use self-signed in production.

  • Not redirecting HTTP to HTTPS — leaving both active. If your server accepts both HTTP and HTTPS, users who type http:// or follow an old link get an unencrypted connection. Always redirect HTTP (port 80) to HTTPS (port 443) with a 301 Moved Permanently response, and use the Strict-Transport-Security (HSTS) header so browsers remember to always use HTTPS going forward.

# Nginx: redirect all HTTP to HTTPS
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name example.com www.example.com;

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    # HSTS: tell browsers to always use HTTPS for 1 year
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
}
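
The HSTS header value in the config above can be unpacked with a small parser, which helps when writing a test that checks a deployment sends the policy you expect. This is a minimal sketch; parse_hsts is a hypothetical helper, and it handles only the three directives commonly seen in practice.

```python
def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into a small dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in header_value.split(";"):
        directive = directive.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


policy = parse_hsts("max-age=31536000; includeSubDomains")
one_year_seconds = 365 * 24 * 60 * 60

print(policy["max_age"] == one_year_seconds)  # True
print(policy["include_subdomains"])           # True
```

Browsers honor this header only when it arrives over HTTPS, which is why it belongs in the port 443 server block and not in the redirect block.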

Interview Questions

Q: What three problems does HTTPS solve that HTTP cannot?

HTTPS solves three fundamental security problems. First, encryption — all data in transit is encrypted with TLS, so no eavesdropper (ISP, router, attacker on the same Wi-Fi) can read the contents. Second, authentication — the server presents an X.509 certificate signed by a trusted Certificate Authority, proving the server is who it claims to be and preventing impersonation attacks. Third, integrity — a cryptographic MAC on every message ensures data cannot be tampered with in transit; any alteration causes the connection to fail. HTTP provides none of these guarantees — data travels in plaintext, the server's identity is unverifiable, and content can be modified by any intermediary.


Q: What is a Certificate Authority and why do browsers trust them?

A Certificate Authority (CA) is an organization that verifies identities and issues digital certificates that bind a domain name to a public key. To get a certificate, you prove to the CA that you control the domain (via a challenge like placing a file at a known URL or adding a DNS record). The CA then signs your certificate with its own private key.

Browsers trust specific CAs because their Root CA certificates are pre-installed in the operating system and browser trust store — maintained by Mozilla, Google, Apple, and Microsoft. When a browser sees a certificate signed by one of these trusted roots (or an intermediate they signed), it trusts the certificate. The CA model works because the browser trusts these ~150 root organizations to only issue certificates after proper domain verification.


Q: What is the difference between a self-signed certificate and a CA-signed certificate?

A self-signed certificate is generated and signed by the server itself — the subject and issuer are the same entity. There is no third party verifying that the claimed identity is genuine. A CA-signed certificate has been issued by a trusted Certificate Authority after verifying that the requester actually controls the domain.

The practical difference: browsers trust CA-signed certificates and show a padlock. Browsers do not trust self-signed certificates and show a full-page security warning, because anyone can generate a self-signed certificate claiming to be any domain with zero verification. Self-signed certificates are appropriate for local development and internal infrastructure only. For any public-facing site, use a CA-signed certificate — Let's Encrypt provides them for free.


Q: What is mixed content and why is it a security risk?

Mixed content is when an HTTPS page loads sub-resources (scripts, images, stylesheets, iframes) over HTTP. It is a security risk because the HTTP resources are not encrypted or authenticated — an attacker who can intercept the HTTP request can replace the resource with malicious content. If a <script> tag loads over HTTP on an HTTPS page, that script runs in the page's context with full access to the DOM, form values, and cookies. The HTTPS protection of the main page is completely bypassed. Browsers block active mixed content (scripts, stylesheets) entirely. The fix is to load all resources over HTTPS and use the Content-Security-Policy: upgrade-insecure-requests header.


Quick Reference — Cheat Sheet

The Three HTTPS Guarantees

+------------------+--------------------------------------------+
| Guarantee        | What It Means                              |
+------------------+--------------------------------------------+
| Encryption       | Data unreadable by anyone in transit       |
|                  | (ISP, router, attacker on same network)    |
+------------------+--------------------------------------------+
| Authentication   | Server proved its identity via a cert      |
|                  | signed by a CA your browser trusts         |
+------------------+--------------------------------------------+
| Integrity        | Data was not modified in transit;          |
|                  | any alteration breaks the connection       |
+------------------+--------------------------------------------+

HTTP vs HTTPS

+---------------------+-----------+----------------------------+
| Feature             | HTTP      | HTTPS                      |
+---------------------+-----------+----------------------------+
| Default Port        | 80        | 443                        |
| Encryption          | None      | TLS (AES-128/256-GCM)      |
| Authentication      | None      | X.509 Certificate + CA     |
| Integrity           | None      | Cryptographic MAC          |
| Certificate         | Not needed| Required (CA-signed)       |
| Browser Padlock     | No        | Yes (if cert is valid)     |
| SEO                 | No boost  | Ranking signal             |
| HTTP/2              | Rarely    | Required in practice       |
| Plaintext in transit| Yes       | No                         |
+---------------------+-----------+----------------------------+

TLS Handshake (One-Line Summary Per Step)

ClientHello  →  "Here are the TLS versions and ciphers I support"
ServerHello  ←  "I chose TLS 1.2 and AES-256-GCM"
Certificate  ←  "Here is my cert — verify my identity"
KeyExchange  →  "Here is the pre-master secret, encrypted to you"
Finished     ↔  "We both derived the same session key — let's go"
[Data]       ↔  All HTTP data flows encrypted with symmetric key

Certificate Chain of Trust

[Root CA]          — Self-signed, in your OS/browser trust store
     |
[Intermediate CA]  — Signed by Root, used for day-to-day issuance
     |
[Leaf Cert]        — Your site's cert, signed by Intermediate
                     Contains: domain, public key, expiry, CA sig

Padlock Means / Does Not Mean

DOES mean:          Connection is encrypted
                    Server proved domain ownership to a CA
                    Data was not tampered with in transit

DOES NOT mean:      Website is legitimate or trustworthy
                    Business behind the site is real
                    Your data is safe on their servers

Previous: Lesson 5.1 — HTTP (01-http.md)
Next: Lesson 5.3 — TLS/SSL (03-tls-ssl.md)


This is Lesson 5.2 of the Networking Interview Prep Course — 8 chapters, 32 lessons.
