For a long time, HTTP was the standard protocol for the World Wide Web. But this protocol communicates in plain text.
So, if someone can intercept the communication channel:
- They can read the requests and responses, breaching data privacy
- They can tamper with the requests and responses, compromising data integrity
HTTPS was introduced to address these issues of data integrity and privacy. It is plain HTTP running on top of TLS (Transport Layer Security), which encrypts the traffic.
Let’s talk about my understanding of HTTPS.
Then there comes another issue: how do I know I am communicating with the right server? TLS solves this problem as well. It uses certificates to ensure authenticity.
If I want to break down an HTTPS connection, I can divide it into four consecutive processes:
- TCP Handshake
- Certificate validation
- Key exchange
- Data transmission
In this article, I will mostly talk about certificate validation and key exchange, the key components of the TLS handshake.
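To make the four processes a bit more concrete, here is a minimal sketch using Python's standard `socket` and `ssl` modules. The function name `open_https_connection` and the returned dictionary are my own choices for illustration, and the comments mark roughly where each phase happens. Note that the library performs certificate validation and key exchange together inside one TLS handshake call.

```python
import socket
import ssl


def open_https_connection(host: str, port: int = 443) -> dict:
    """Walk through the four phases and return a few facts about the session."""
    # create_default_context() loads the system's trusted root CAs and turns on
    # certificate and hostname verification.
    context = ssl.create_default_context()

    # Phase 1: TCP handshake.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # Phases 2 and 3: the TLS handshake, which covers certificate
        # validation and key exchange. It raises ssl.SSLCertVerificationError
        # if the certificate cannot be validated.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Phase 4: data transmission over the encrypted channel.
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            status_line = tls_sock.recv(4096).split(b"\r\n", 1)[0]
            return {
                "tls_version": tls_sock.version(),
                "cipher": tls_sock.cipher(),
                "status_line": status_line.decode("ascii", "replace"),
            }
```

Calling `open_https_connection("example.com")` (a placeholder host, and it needs network access) should return the negotiated TLS version, the cipher suite, and the first line of the HTTP response.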
Certificates
Let’s talk about certificates and why we need them in the first place.
Whenever we request something, we need to ensure that responses are coming from the right place. Certificates validate that the responses are coming from the right source, that is, that the servers are indeed what they claim to be.
An SSL certificate (strictly speaking a TLS certificate these days, but the old name has stuck) is a digital certificate that authenticates a website’s identity and enables an encrypted connection.
Digital certificates are small, verifiable data files that contain identity credentials to help websites, people, and devices represent their authentic online identity (authentic because the CA has verified the identity).
Trusted Certificate Authorities (CAs) issue digital certificates and sign them using asymmetric cryptography.
Certificate Authority
Here is what ssl.com says about certificate authority.
A certificate authority (CA), also sometimes referred to as a certification authority, is a company or organization that acts to validate the identities of entities (such as websites, email addresses, companies, or individual persons) and bind them to cryptographic keys through the issuance of electronic documents known as digital certificates.
A CSR (Certificate Signing Request) is an encoded text file that includes the public key and other information that will be included in the certificate.
After generating the CSR, the applicant sends it to a CA, who independently verifies that the information it contains is correct and, if so, digitally signs the certificate with an issuing private key and sends it to the applicant.
Unlike the public key, the applicant’s private key is kept secure and should never be shown to the CA (or anyone else).
Certificate fields
A digital certificate provides information about the subject of the certificate, the validity of the certificate, and applications and services that can use the certificate.
The figure below shows the contents of X.509 version 3 certificates
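The fields are easier to picture on a concrete example, so here is a small sketch. `SAMPLE_CERT` is a made-up certificate, shaped like the dictionary that Python's `ssl.SSLSocket.getpeercert()` returns; every value in it is hypothetical. The helpers `flatten_name` and `summarize` are my own names, not part of any library.

```python
import ssl
from datetime import datetime, timezone

# Hypothetical certificate, shaped like the dict that Python's
# ssl.SSLSocket.getpeercert() returns. All values are made up.
SAMPLE_CERT = {
    "version": 3,
    "serialNumber": "0A1B2C3D",
    "subject": ((("commonName", "www.example.com"),),),
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Example CA"),),
        (("commonName", "Example Intermediate CA"),),
    ),
    "notBefore": "Jan  1 00:00:00 2024 GMT",
    "notAfter": "Jan  1 00:00:00 2025 GMT",
    "subjectAltName": (("DNS", "www.example.com"), ("DNS", "example.com")),
}


def flatten_name(name) -> dict:
    """Turn the nested tuples used for subject/issuer into a flat dict."""
    return {key: value for rdn in name for key, value in rdn}


def to_datetime(cert_time: str) -> datetime:
    """Convert the certificate's time string into an aware UTC datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(cert_time), tz=timezone.utc)


def summarize(cert: dict) -> dict:
    """Pick out the fields that matter most during validation."""
    return {
        "subject": flatten_name(cert["subject"]).get("commonName"),
        "issuer": flatten_name(cert["issuer"]).get("commonName"),
        "valid_from": to_datetime(cert["notBefore"]),
        "valid_until": to_datetime(cert["notAfter"]),
        "dns_names": [v for kind, v in cert.get("subjectAltName", ()) if kind == "DNS"],
    }


summary = summarize(SAMPLE_CERT)
for field, value in summary.items():
    print(f"{field}: {value}")
```

The subject, the issuer, and the validity window are exactly the pieces the validation process looks at later on.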
Validation Process
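We need to talk a little bit about asymmetric cryptography, also known as public-key cryptography, before talking about certificate validation on a third-party client (a browser).
So, basically, every party has a key pair. Data encrypted with the public key can only be decrypted with the private key. In the other direction, a signature created with the private key can be verified by anyone who has the public key, and this second direction is the one certificates rely on.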
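Here is a toy sketch of how the two keys relate, using textbook RSA with tiny primes. The numbers and the helper `apply_key` are purely illustrative assumptions on my side. Real keys are thousands of bits long and real implementations add padding, so please do not treat this as usable cryptography.

```python
# Toy key pair built from tiny primes. Real keys use primes that are
# hundreds of digits long; these numbers are for illustration only.
p, q = 61, 53
n = p * q                 # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(value: int, key: tuple) -> int:
    """Textbook RSA: raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65  # must be smaller than n in this toy setup

# Direction 1, confidentiality: encrypt with the public key,
# and only the private key can decrypt.
ciphertext = apply_key(message, public_key)
assert apply_key(ciphertext, private_key) == message

# Direction 2, signatures: sign with the private key,
# and anyone holding the public key can verify.
signature = apply_key(message, private_key)
assert apply_key(signature, public_key) == message

# The signature does not verify against a different message.
assert apply_key(signature, public_key) != message + 1
print("encrypt/decrypt and sign/verify both round-trip")
```

A CA uses the second direction: it signs the certificate with its private key, and the browser checks that signature with the CA's public key.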
During certificate validation, the server sends its certificate to the client.
The client receives the certificate, and the validation process starts.
Now we need to talk about a concept called chain of trust.
A chain of trust is a setup where the holder of a trusted certificate (a CA) can issue further certificates, so that every certificate can be traced back to the original root certificate, which is self-signed. We use path validation to trace back to the root. If every link verifies and the root is trusted, we can be sure about the whole certificate chain.
Basic certificate processing must satisfy each of the following requirements for every certificate in the chain:
- The signature can be verified using the issuer’s public key.
- The certificate is not expired.
- The certificate is not revoked.
- The certificate’s issuer name matches the subject distinguished name of the next certificate in the chain (the issuer’s own certificate).
The validation keeps repeating until it reaches the last certificate, which is a self-signed root certificate.
Root certificate authorities are trusted by browsers out of the box, so we don’t need to worry about that. Browsers and operating systems ship with the root certificates (and therefore the public keys) of these CAs, so the browser can verify the chain without contacting the CA.
Now, the browser has validated the certificate and is sure about the identity of the server. Trust is established.
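The validation loop described above can be sketched with a toy model. Everything here is an assumption for illustration: the `Cert` class, the tiny textbook-RSA keys, the made-up CA names, and revocation modeled as a plain set of subject names (real clients use CRLs or OCSP). It is not how a browser implements this, only the same four checks in miniature.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Cert:
    subject: str
    issuer: str
    not_after: datetime
    public_key: tuple   # toy RSA public key (e, n) belonging to the subject
    signature: int = 0  # filled in by the issuer

    def digest(self, modulus: int) -> int:
        """Hash the signed fields, reduced so it fits the toy modulus."""
        fields = f"{self.subject}|{self.issuer}|{self.not_after.isoformat()}|{self.public_key}"
        return int.from_bytes(hashlib.sha256(fields.encode()).digest(), "big") % modulus


def make_keys(p: int, q: int, e: int = 17):
    """Return (public, private) toy RSA keys from two small primes."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)


def issue(subject, issuer, not_after, subject_public, issuer_private) -> Cert:
    """Build a certificate and sign it with the issuer's private key."""
    cert = Cert(subject, issuer, not_after, subject_public)
    d, n = issuer_private
    cert.signature = pow(cert.digest(n), d, n)
    return cert


def validate_chain(chain, trust_store, revoked, now):
    """Check a chain ordered leaf first, root last. Returns (ok, reason)."""
    for position, cert in enumerate(chain):
        # The root is self-signed, so it acts as its own issuer.
        issuer_cert = chain[position + 1] if position + 1 < len(chain) else cert
        e, n = issuer_cert.public_key
        if pow(cert.signature, e, n) != cert.digest(n):
            return False, f"bad signature on {cert.subject}"
        if now > cert.not_after:
            return False, f"{cert.subject} is expired"
        if cert.subject in revoked:
            return False, f"{cert.subject} is revoked"
        if cert.issuer != issuer_cert.subject:
            return False, f"issuer name mismatch on {cert.subject}"
    root = chain[-1]
    if trust_store.get(root.subject) != root.public_key:
        return False, "root is not in the trust store"
    return True, "ok"


root_pub, root_priv = make_keys(61, 53)
inter_pub, inter_priv = make_keys(89, 97)
leaf_pub, _ = make_keys(67, 71)

expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
root = issue("Toy Root CA", "Toy Root CA", expiry, root_pub, root_priv)
inter = issue("Toy Intermediate CA", "Toy Root CA", expiry, inter_pub, root_priv)
leaf = issue("www.example.com", "Toy Intermediate CA", expiry, leaf_pub, inter_priv)

trust_store = {"Toy Root CA": root_pub}  # stands in for what the browser ships with
now = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(validate_chain([leaf, inter, root], trust_store, set(), now))
print(validate_chain([leaf, inter, root], trust_store, {"Toy Intermediate CA"}, now))
```

The first call passes every check. The second one fails at the intermediate certificate, because one bad link is enough to reject the whole chain.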
Key exchange
Once the client is sure about the identity of the server, we can start communicating. But we can’t just communicate in plain text, right?! What if someone intercepts the packets? A very serious problem it is.
Simple solution: we can encrypt the packets. Right?
If we use asymmetric encryption like before and encrypt using our private key, it won’t work, because decrypting only needs the public key, which is publicly available. Anyone can decrypt the packets. Asymmetric operations are also much slower than symmetric ones, which matters once we start moving real data.
We will use symmetric encryption in this particular case. One single secret key will be used to encrypt and decrypt.
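To show what "one single secret key" means in practice, here is a toy sketch built only from Python's standard library. It stretches the key into a keystream with SHA-256 and XORs it with the data. This construction is my own simplification for illustration. Real TLS uses ciphers such as AES-GCM or ChaCha20-Poly1305, which also protect integrity, so please do not use this for anything real.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stretch key + nonce into `length` pseudo-random bytes with SHA-256."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt plaintext or decrypt ciphertext: XOR is its own inverse."""
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


shared_key = secrets.token_bytes(32)  # both sides must already hold this
nonce = secrets.token_bytes(12)       # fresh per message, can be sent in the clear

plaintext = b"GET /account HTTP/1.1"
ciphertext = xor_with_keystream(shared_key, nonce, plaintext)
recovered = xor_with_keystream(shared_key, nonce, ciphertext)

# The same key that encrypted the message also decrypts it.
assert recovered == plaintext
assert ciphertext != plaintext

# A different key produces garbage instead of the original message.
wrong_key = secrets.token_bytes(32)
assert xor_with_keystream(wrong_key, nonce, ciphertext) != plaintext
print("same key encrypts and decrypts")
```

Notice that `shared_key` simply appears on both sides here. Getting it there safely is the actual hard part.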
But there is another problem. How do we exchange the secret key without anyone knowing about it?
If we generate a key and then pass it to the server, the key itself is vulnerable to interception, as at this point we have no channel other than plain text.
We can use the Diffie-Hellman key exchange or RSA to address this issue. The image below demonstrates how keys are exchanged using the Diffie-Hellman key exchange.
Client and Server publicly agree to use a modulus N = 23 and base g = 5.
The client chooses a secret integer C = 4, then sends the Server $C_x = g^C \bmod N = 5^4 \bmod 23 = 4$.
The server chooses a secret integer S = 3, then sends the Client $S_x = g^S \bmod N = 5^3 \bmod 23 = 10$.
Client computes $C_y = S_x^C \bmod N = 10^4 \bmod 23 = 18$.
Server computes $S_y = C_x^S \bmod N = 4^3 \bmod 23 = 18$.
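Client and Server now both share a secret number 18. Both Client and Server have arrived at the same value because of Math!! Raising g to the power C and then S gives the same result as raising it to S and then C: $(g^S)^C \equiv (g^C)^S \equiv g^{CS} \pmod N$.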
Once the key exchange is done, the Client and Server can use the key to encrypt and decrypt messages. Hence a secure connection is established.
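The exchange above is easy to replay in a few lines of Python, using the same numbers. The last step, hashing the shared number into a 32-byte key, is my own simplification to show how a number becomes a usable symmetric key. Real TLS uses much larger parameters (or elliptic curves) and a proper key derivation function.

```python
import hashlib

N = 23  # public modulus
g = 5   # public base

client_secret = 4  # C, never leaves the client
server_secret = 3  # S, never leaves the server

# Each side sends only its public value over the wire.
cx = pow(g, client_secret, N)  # client -> server
sx = pow(g, server_secret, N)  # server -> client

# Each side combines the other's public value with its own secret.
cy = pow(sx, client_secret, N)  # computed by the client
sy = pow(cx, server_secret, N)  # computed by the server

print(cx, sx, cy, sy)  # 4 10 18 18
assert cy == sy

# Assumption for illustration: turn the shared number into a 32-byte key
# by hashing it. This stands in for the real key derivation step.
session_key = hashlib.sha256(str(cy).encode()).digest()
print(len(session_key), "byte session key derived on both sides")
```

An eavesdropper sees N, g, and the two public values 4 and 10, but never the two secret integers. With numbers this small the secrets are trivial to brute-force, which is why real deployments use very large values.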