Concepts: Hashing, Digital Signatures & Certificates
Hashing — A Fingerprint for Data
Deepa is curious. “What exactly is a hash? Rohan keeps mentioning it.”
A hash function takes any input — a word, a document, an entire video file — and produces a fixed-length output called a hash (or digest). The output is always the same length, regardless of how long the input is.
Think of it like a fingerprint. Every person has a unique fingerprint. You can take someone’s fingerprint without knowing anything else about them. And if you compare two fingerprints, you know instantly whether they belong to the same person.
A hash function does the same for data.
SHA-256 in action:
SHA-256 (Secure Hash Algorithm 256-bit) is the hash function used in TLS certificates, digital signatures, Bitcoin, and many other systems. Let us look at what it produces:
| Input | SHA-256 Hash |
|---|---|
| Hello | 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969 |
| hello | 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 |
| Hello! | 334d016f755cd6dc58c53a86e183882f8ec14f52fb05345887c8a5edd42c87b7 |
Notice:
- Hello (capital H) and hello (lowercase h) produce completely different hashes, even though only one character changed
- All three hashes are exactly 64 characters long, regardless of input length
- There is no way to look at the hash and reconstruct the original text
This is the avalanche effect — a tiny change in the input causes a completely different output. One character different → completely different hash.
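The table above can be reproduced with Python's standard `hashlib` module. This is a minimal sketch; the helper name `sha256_hex` is made up for illustration, and the text is assumed to be UTF-8 encoded before hashing.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` (UTF-8 encoded) as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


for word in ["Hello", "hello", "Hello!"]:
    digest = sha256_hex(word)
    # Every digest is 64 hex characters (256 bits), whatever the input length.
    print(f"{word!r:10} {len(digest)} chars  {digest}")
```

Running it prints one line per input, and the three digests share no visible pattern even though the inputs differ by a single character.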
The Four Properties of a Good Hash Function
A hash function used in security must have four properties:
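1. One-way — You can compute the hash from the input, but there is no practical way to compute the input from the hash. Given only 2cf24dba..., an attacker can do no better than guess inputs and hash each guess until one matches, which is hopeless for long, unpredictable inputs. This is why banks store a hash of your password, not the password itself. If their database is stolen, the attacker gets hashes, not passwords. (Because short or common passwords can still be guessed, real password systems also add a random salt and use a deliberately slow hash.)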
2. Deterministic — The same input always produces the same output. hello will always hash to 2cf24dba... — on any computer, anywhere, any time. This is essential for verification: Deepa hashes a document and gets a value; her professor can hash the received document and get the same value if it is unchanged.
3. Avalanche effect — A small change in the input produces a completely different output. Change one bit and the entire hash changes. This makes it impossible to modify a document slightly and hope the hash stays the same.
4. Fixed length — SHA-256 always produces a 256-bit output, usually written as 64 hexadecimal characters. Hash a single letter or hash a 4GB video file — the output is the same length. This makes hashes practical for comparison and storage.
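Three of these properties can be checked directly in a few lines of Python. The sketch below uses the standard `hashlib` module; the helper names `digest` and `differing_bits` are hypothetical. The one-way property cannot be demonstrated by running code, since it is a statement about what no known program can do.

```python
import hashlib


def digest(data: bytes) -> bytes:
    """SHA-256 digest of `data` as raw bytes (32 bytes = 256 bits)."""
    return hashlib.sha256(data).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Deterministic: hashing the same input twice gives the same digest.
assert digest(b"hello") == digest(b"hello")

# Fixed length: one byte or one megabyte, the digest is always 32 bytes.
assert len(digest(b"h")) == len(digest(b"h" * 1_000_000)) == 32

# Avalanche effect: changing one letter's case changes many of the
# 256 output bits (typically around half of them).
changed = differing_bits(digest(b"hello"), digest(b"Hello"))
print(f"{changed} of 256 bits differ")
```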
“Can two different inputs produce the same hash?”
In theory, yes — this is called a collision. But with SHA-256, the probability is so astronomically small that it is considered impossible in practice. The output space has 2²⁵⁶ possible values. For comparison, there are roughly 2²⁶⁶ atoms in the observable universe. Finding a deliberate collision in SHA-256 is beyond any computing power that exists or is expected to exist.
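To put a rough number on "astronomically small", the sketch below uses the birthday approximation: among n random digests, the chance that any two collide is about n(n-1)/2 divided by 2²⁵⁶. The function name and the figure of 10¹⁸ hashed items are assumptions chosen only for illustration, and the approximation is valid only while the result is far below 1.

```python
from fractions import Fraction

OUTPUT_SPACE = 2 ** 256  # number of possible SHA-256 digests


def collision_probability(n: int) -> float:
    """Approximate chance that any two of n random digests collide.

    Uses the birthday approximation p ~ n(n-1)/2 / 2**256, which is
    accurate only while the result is much smaller than 1.
    """
    pairs = n * (n - 1) // 2
    return float(Fraction(pairs, OUTPUT_SPACE))


# Illustrative figure (an assumption): a billion billion hashed items.
print(collision_probability(10 ** 18))
```

Even at that scale the estimate stays below 10⁻⁴⁰, which is why accidental collisions are ignored in practice.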
Try it yourself at sha256.online. Type any text, click Hash, and see the SHA-256 output. Then change one letter and hash again. Watch the entire output change. This is the avalanche effect, live. It takes milliseconds — but the same computation secures every TLS certificate on the internet.
Digital Signatures — Proving Who Sent It
Now that we understand hashing, digital signatures become straightforward. A digital signature answers three questions at once:
- Did this come from who it claims to come from? (Authentication)
- Has it been modified since it was signed? (Integrity)
- Can the signer deny having signed it? (Non-repudiation: they cannot)
How Signing Works (3 Steps)
Rohan wants to sign his project document before sending it to his professor.
Step 1 — Hash the document
document.pdf → SHA-256 → "a3f82c..." (the document's fingerprint)
Step 2 — Encrypt the hash with Rohan's private key
"a3f82c..." + Rohan's Private Key → Digital Signature
Step 3 — Attach the signature to the document
Send: document.pdf + Digital Signature together
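In this simplified picture, the signature is just the hash transformed with the private key. With RSA this step is commonly described as encrypting the hash; other signature schemes, such as ECDSA, work differently internally but give the same guarantees. The signature proves two things: it came from whoever holds that private key (Rohan), and it was computed from this exact version of the document.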
How Verification Works (3 Steps)
The professor receives the document and the signature. She wants to verify it is genuinely from Rohan and has not been changed.
Step 1 — Hash the received document
received_document.pdf → SHA-256 → "a3f82c..."
Step 2 — Decrypt the signature with Rohan's public key
Digital Signature + Rohan's Public Key → "a3f82c..."
Step 3 — Compare the two hashes
"a3f82c..." == "a3f82c..." → Signature is VALID ✓
If the document was changed after signing, the hash in Step 1 would be different from the hash in Step 2, and the signature check would fail.
If the signature was not created with Rohan’s private key, decrypting it with Rohan’s public key would produce nonsense — and the comparison would fail.
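The signing and verification steps can be sketched with "textbook" RSA using only the Python standard library. Everything here is a toy under loud assumptions: the key is built from two well-known Mersenne primes purely to avoid a prime search, there is no padding, and the names `sign`, `verify`, and `hash_as_int` are hypothetical. Real systems use a vetted library with a padded scheme such as RSA-PSS, or a different algorithm such as ECDSA.

```python
import hashlib

# Toy key generation. ASSUMPTION for illustration only: two Mersenne primes
# are used so the example needs no prime search. Real keys use random primes.
P = 2 ** 521 - 1
Q = 2 ** 607 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def hash_as_int(document: bytes) -> int:
    """Hash the document and read the 256-bit digest as an integer."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big")


def sign(document: bytes, private_exponent: int = D) -> int:
    """Transform the hash with the private key (textbook RSA, no padding)."""
    return pow(hash_as_int(document), private_exponent, N)


def verify(document: bytes, signature: int) -> bool:
    """Recover the hash with the public key and compare it to a fresh hash."""
    return pow(signature, E, N) == hash_as_int(document)


document = b"Rohan's project report, final version"
signature = sign(document)

print(verify(document, signature))                 # True: unchanged document
print(verify(document + b" (edited)", signature))  # False: modified after signing
print(verify(document, sign(document, D + 2)))     # False: signed with a different key
```

The second and third calls correspond to the two failure cases described above: a document changed after signing, and a signature made with a key other than Rohan's.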
Three Guarantees
| Property | What It Means | How |
|---|---|---|
| Authentication | This came from Rohan | Only Rohan’s private key could create a valid signature for his public key |
| Integrity | The document was not changed | If changed, the hashes would not match |
| Non-repudiation | Rohan cannot deny signing it | Only his private key creates signatures that his public key verifies |
Digital signatures are legally binding in India under the Information Technology Act 2000, Sections 3–5. A document signed with a certified digital signature has the same legal standing as a handwritten signature. Your employer’s IT department may issue you a Digital Signature Certificate (DSC) — it lets you sign official documents electronically. The same PKI technology that secures your browser session secures legally binding contracts.
Digital Certificates — Linking a Public Key to an Identity
Here is the problem: Rohan’s professor has his public key. But how did she get it? How does she know that key actually belongs to Rohan and not someone impersonating him?
This is exactly the problem that digital certificates solve on the internet.
A digital certificate is an electronic document that links a public key to an identity. It says, in effect: “This public key belongs to sbi.co.in, and we — DigiCert — vouch for that.”
Think of it like a government-issued ID card. Your Aadhaar card says “This photo and biometrics belong to this person, and the Government of India vouches for it.” A digital certificate says “This public key belongs to this domain, and this Certificate Authority vouches for it.”
What a Real Certificate Contains
Here is what the certificate for sbi.co.in actually contains (simplified from a real certificate):
| Field | Value for sbi.co.in |
|---|---|
| Subject | sbi.co.in |
| Issuer | DigiCert TLS RSA SHA256 2020 CA1 |
| Valid From | 2024-03-15 |
| Valid To | 2025-03-14 |
| Public Key | RSA 2048-bit (the server’s public key) |
| Signature Algorithm | SHA-256 with RSA |
| Thumbprint (SHA-256) | 3a:f9:2b:... (hash of the entire certificate) |
The Thumbprint (also called Fingerprint) is the SHA-256 hash of the entire certificate. If any field in the certificate is changed — by even one character — the thumbprint changes completely. This makes tampering detectable.
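A thumbprint of this kind can be computed with Python's standard `ssl` and `hashlib` modules. The sketch below assumes the fingerprint is taken over the certificate's binary (DER) encoding, which is what browsers display; the function names `thumbprint` and `fetch_thumbprint` are hypothetical.

```python
import hashlib
import ssl


def thumbprint(der_bytes: bytes) -> str:
    """SHA-256 of the DER-encoded certificate, as colon-separated hex pairs."""
    hex_digest = hashlib.sha256(der_bytes).hexdigest()
    return ":".join(hex_digest[i:i + 2] for i in range(0, len(hex_digest), 2))


def fetch_thumbprint(host: str, port: int = 443) -> str:
    """Download a server's certificate (needs network access) and fingerprint it."""
    pem = ssl.get_server_certificate((host, port))
    return thumbprint(ssl.PEM_cert_to_DER_cert(pem))
```

Calling `fetch_thumbprint("sbi.co.in")` on a machine with network access should give the same value the browser shows as the SHA-256 fingerprint, provided both received the same certificate from the server.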
“How do I read a real certificate? Is it just a text file?”
Certificates are stored in a format called X.509. Your browser can show you the certificate for any HTTPS site — click the padlock icon in the address bar. In Exercise 1, you will read a real one yourself.
Certificate Authorities — The Internet’s Notary System
A Certificate Authority (CA) is an organisation that verifies identities and issues certificates. When sbi.co.in applies for a certificate, the CA checks that they actually own the domain. Once satisfied, the CA issues a certificate with their digital signature on it — exactly like the signing process Rohan used, but now it is the CA signing the certificate.
The notary analogy: when you sign a property document, you need a notary public to stamp it. The notary checks your identity and then puts their official stamp on the document. Anyone who sees the stamp knows the notary vouched for the signer. CAs are the internet’s notaries.
Well-known CAs include DigiCert, Sectigo, and GlobalSign (commercial, paid), and Let’s Encrypt (free, automated — used by millions of websites including smaller Indian businesses and NGOs).
The Chain of Trust
No browser trusts individual website certificates directly. Instead, there is a hierarchy:
Root CA (e.g., DigiCert Global Root G2)
└── Intermediate CA (e.g., DigiCert TLS RSA SHA256 2020 CA1)
    └── Site Certificate (e.g., sbi.co.in)
Root CAs are the top-level authorities. Their certificates come pre-installed in your device’s operating system or browser: Apple, Google, Microsoft, and Mozilla each maintain a list of Root CAs they trust. Getting onto this list requires passing rigorous audits. There are fewer than 200 Root CAs trusted globally.
Intermediate CAs sit between Root CAs and individual websites. Root CAs issue certificates to Intermediate CAs, vouching for them. Intermediate CAs then issue the certificates that websites use. This structure protects Root CA keys — if a Root CA key were ever compromised, it would be catastrophic. Keeping Root CA keys offline and rarely used reduces that risk.
Site Certificates are what individual websites hold. sbi.co.in has a certificate issued by an Intermediate CA.
When your browser connects to sbi.co.in, it:
- Receives the site certificate and the intermediate certificate
- Checks that the intermediate certificate is signed by a Root CA it already trusts
- Checks that the site certificate is signed by that intermediate CA
- Verifies the site certificate is for the domain you are visiting (sbi.co.in)
- Verifies the certificate has not expired
- Verifies the certificate has not been revoked
All of this happens before the page loads.
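The order of these checks can be sketched as a small Python model. It is a deliberate simplification: "signed by" is modelled as the issuer name matching, where a real client verifies the issuer's cryptographic signature, and the `Cert` class, the `check_chain` function, and the intermediate's validity dates are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Cert:
    subject: str        # who the certificate is for
    issuer: str         # who vouches for it
    valid_from: date
    valid_to: date


def check_chain(site: Cert, intermediate: Cert, trusted_roots: set,
                hostname: str, today: date, revoked: frozenset = frozenset()) -> list:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if intermediate.issuer not in trusted_roots:
        problems.append("certificate authority is not trusted")
    if site.issuer != intermediate.subject:
        problems.append("site certificate not issued by the presented intermediate")
    if site.subject != hostname:
        problems.append("certificate is not valid for this domain")
    if not (site.valid_from <= today <= site.valid_to):
        problems.append("certificate has expired or is not yet valid")
    if site.subject in revoked:
        problems.append("certificate has been revoked")
    return problems


roots = {"DigiCert Global Root G2"}
intermediate = Cert("DigiCert TLS RSA SHA256 2020 CA1", "DigiCert Global Root G2",
                    date(2020, 1, 1), date(2030, 1, 1))  # dates are made up
site = Cert("sbi.co.in", "DigiCert TLS RSA SHA256 2020 CA1",
            date(2024, 3, 15), date(2025, 3, 14))

print(check_chain(site, intermediate, roots, "sbi.co.in", date(2024, 6, 1)))  # []
print(check_chain(site, intermediate, roots, "sbi.co.in", date(2025, 6, 1)))
```

The second call uses a date after the certificate's Valid To field, so it reports the expiry problem instead of an empty list.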
“Who decided that DigiCert is trustworthy? Why should I trust them?”
Root CA trust lists are maintained by Apple, Google, Microsoft, and Mozilla. To get included, a CA must pass annual security audits, follow strict policies, and demonstrate they verify identities properly. If a CA misbehaves — and this has happened — they get removed from the trust list, instantly destroying trust in all certificates they have issued. The system is not perfect, but it is audited and accountable.
What Happens When Verification Fails?
You have probably seen browser certificate warnings. Now you know what they mean:
- “Certificate has expired” — the validity period on the certificate is in the past. The site owner forgot to renew it. This does not necessarily mean the site is malicious, but you cannot verify its identity today.
- “Certificate is not valid for this domain” — the certificate was issued for sbi.co.in but you are connecting to sbi.co.in.update.xyz. The domain does not match.
- “Certificate authority is not trusted” — the certificate was not signed by any CA in your device’s trust list. This could be a self-signed certificate (no CA vouched for it).
- “Certificate has been revoked” — the CA has explicitly cancelled this certificate, usually because the private key was compromised.
These are not suggestions. On sites that enforce HSTS (HTTP Strict Transport Security), as many banking and high-security sites do, your browser treats them as hard blocks and offers no way to proceed.
When a browser shows you a certificate error, it means it cannot verify the identity of the server. Clicking “Advanced → Proceed anyway” means you are connecting to a server whose identity you cannot confirm. For banking, government, or any sensitive site — do not proceed. Leave the site.
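The domain mismatch case is worth a closer look, because a lookalike name can begin with the genuine one. The sketch below shows a simplified version of the comparison: names are matched label by label across the whole name, never by prefix. The function `hostname_matches` is hypothetical and covers only exact names and a single leading wildcard; real clients follow the fuller rules in RFC 6125.

```python
def hostname_matches(cert_name: str, hostname: str) -> bool:
    """Simplified matching: exact match, or a '*' covering exactly one leftmost label."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        # The wildcard stands in for one label; the rest must match exactly.
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


print(hostname_matches("sbi.co.in", "sbi.co.in"))             # True
print(hostname_matches("sbi.co.in", "sbi.co.in.update.xyz"))  # False
print(hostname_matches("*.sbi.co.in", "api.sbi.co.in"))       # True
```

The second call fails because sbi.co.in.update.xyz is a subdomain of update.xyz, a different domain altogether, even though it starts with the bank's name.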
PKI — Public Key Infrastructure
The whole system we have described — certificate authorities, trust chains, public and private keys, digital signatures, and hash functions — is called PKI (Public Key Infrastructure). It is the backbone of internet security.
PKI is not one company or one piece of software. It is a system of policies, organisations, software, and hardware that together make secure communication possible. When you use HTTPS, you are relying on PKI. When you use WhatsApp’s end-to-end encryption, you are relying on the same public-key cryptography, although there the keys are confirmed through security codes rather than CA-issued certificates. When you use a Digital Signature Certificate to sign a tax document, that is PKI too.
Putting It Together: Deepa Opens Her Banking App
When Deepa opens her SBI app:
- The app connects to SBI’s API server
- The server presents its certificate: “I am api.sbi.co.in, here is my certificate signed by DigiCert”
- The app hashes the certificate and checks the CA’s digital signature to verify the certificate has not been tampered with
- The app walks the chain of trust: DigiCert → Intermediate → api.sbi.co.in
- The app verifies DigiCert is in its trusted root list (pre-installed on her Android phone by Google)
- The app verifies the domain name matches
- The app verifies the certificate has not expired
- The TLS handshake completes — an encrypted session begins
- Deepa sees her balance
If any step fails — if the chain is broken, the domain does not match, or the certificate has expired — the app would refuse to connect. Not warn. Refuse.
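The "refuse, not warn" behaviour is what a client gets by default from Python's standard `ssl` module. This is a sketch of a generic strict client, not SBI's actual app code; the function name `connect_strict` is hypothetical. Note one gap relative to the steps above: the default context validates the chain, validity dates, and hostname, but does not check revocation unless configured to.

```python
import socket
import ssl

# The default context enforces chain validation against the system's
# trusted roots, validity dates, and hostname matching.
context = ssl.create_default_context()


def connect_strict(host: str, port: int = 443) -> str:
    """Open a TLS connection and return the negotiated protocol version.

    Raises ssl.SSLCertVerificationError if any certificate check fails,
    so the connection is refused instead of continuing with a warning.
    """
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

With network access, `connect_strict("sbi.co.in")` would return a version string such as `TLSv1.3` only after every check has passed; a broken chain, a mismatched name, or an expired certificate raises an exception before any application data is exchanged.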