Email Basics: The Internet Message Format

Note: This is part 2 of a two-part segment on Email Basics. You can also read part 1 of this segment, which deals with the Simple Mail Transfer Protocol.

In part 1 of this two-part segment, I mentioned that the goal of SMTP was to ensure the responsible delivery of a digital package, but also that SMTP does not care what kind of data the package contains. In fact, a good implementation of SMTP should accept any sequence of bits as a valid package so long as it ends in <CRLF>.<CRLF>, which is 0D 0A 2E 0D 0A in hexadecimal. In practice, this package of data usually follows the RFC822 standard or its successors, which define the format of “Internet Text Messages”, more commonly thought of as Email.

RFC822 and SMTP were designed to be very robust protocols. They were designed to:

  • keep memory usage contained in the case of Email servers that may run for thousands of days
  • be human-readable for debugging purposes
  • support backward-compatibility as new features are added

These properties will become clear as we talk about RFC822. Before I start, you should know that you can view the full, original RFC822 version of any email you have ever received. These source messages are available in Gmail with the “Show Original” dropdown menu option and in Yahoo Mail with the “Show Headers” option. If you open up one of your emails, you will notice a few things:

  • The actual email is at the bottom. The top has a bunch of Key: Value pairs, some of which you recognize and all of which are in English.
  • At the bottom, your email is repeated twice. One version looks like a normal text email, but the other one has a bunch of HTML-like tags embedded in it.
  • (If there are any attachments) There is a huge chunk of gibberish at the bottom.

(If you’ve ever worked with HTTP headers, then Email headers should look very familiar to you. They work mostly the same way, with some small differences.)

Email headers are bits of information that describe an email. They appear at the top of an email in Key: Value pairs. They can span more than 1 line by prefixing the continuing lines with one or more whitespace characters. Email headers work on the Collapsing White Space (CWS) idea, like HTML does. Any number of spaces, tabs, CRLF’s, or sequences of whitespace are treated as a single space character. Email headers include the familiar From:, To:, Subject:, and Date: headers, but they also include these important pieces of information: (try to find these in your own emails!)

  • Message-ID – Uniquely identifies the email so that we can have things like threads using emails. They look like email addresses, but have a more restrictive syntax.
  • References or In-Reply-To – List of Message-IDs that this email is related to. Used for making threads.
  • Received – This header is added on to the beginning of the email headers whenever the email changes hands (by means of SMTP or other means). It contains information about the receiving server and the sending server, as well as the time and protocol variables. The newest ones are always at the top, not the bottom.
  • DKIM-Signature – A PKI signature by the Email’s sender’s mail server that guarantees the email has not been tampered with since it left its origin. More on this later.
  • X-* – Any header that begins with X- is a non-standard header. However, non-standard headers may serve some very important functions. Sometimes, standards begin as X- non-standard headers and slowly make their way into global adoption.

After the email headers is a completely empty line. This marks the border between the headers and the email body. Most emails today are transferred using the multipart/alternative content-type. The body of an email is sent in 2 forms: one is a plain text version that can be read on command-line email clients like mutt, and the other is a HTML version that provides greater functionality in graphical email clients like a web application or desktop email application. It is up to the email program to decide which one to show.

Applications are typically encoded in Base64 before they can be attached to an email. Base64 contains 64 different characters to represent data: lowercase and uppercase letters, numbers, and the + and / symbols. Attachments just follow as another block in the multipart/alternative encoding of email bodies. (Because of this, attachments are actually 4 times as large as their originals when they are sent with an email.)

Here’s an example RFC822 email that demonstrates everything up to now:

Received: by castle from treehouse for
    <lemongrab@castle>; Tue, 25 Jun ...
Message-ID: <haha+man@treehouse>
Date: Tue, 25 Jun 2013 11:29:00 +0000
Subject: Hey!
Return-Path: Neptr <neptr@mybasement>
From: Finn the Human <fth@treehouse>
To: Lemongrab <lemongrab@castle>
Content-Type: multipart/alternative; boundary=front

--front
Content-Type: text/plain; charset=UTF-8

mhm.. mhmm...
You smell like dog buns.

--front
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<p>mhm.. mhmm...<br />
You smell like dog buns.</p>

--front--

When your mail server receives an email, it has no way of knowing if the sender is really who he or she claims to be. All the server knows is who is currently delivering the email. To help discourage people from forging these email headers, two infrastructural standards were developed: SPF and DKIM.

Both of these standards depend on special pieces of information made available in a domain’s DNS records. SPF (Sender Protection Framework) lets a domain define which IP addresses are permitted to deliver email on its behalf. (This works because, in reality, a TCP connection is usually established directly between the origin and the destination servers without any intermediate servers.) Email from a server that isn’t in a domain’s SPF record is considered suspicious. Gmail displays the result of a successful SPF verification by saying an email was Mailed by: ... a domain.

To see a server’s SPF record, perform the following:

$ nslookup -q=txt tumblr.com
> ...

DKIM is public-key cryptography applied to email. A server releases one or more public keys through its DNS records. Each public key is available through a TXT DNS lookup at key._domainkey.example.com.

$ nslookup -q=txt 20120113._domainkey.gmail.com
> 20120113._domainkey.gmail.com	text = "k=rsa\; p=MIIB
> ...

The origin server declares which headers are “signed headers” and then applies its signature to both the signed headers and the email body. Gmail displays the result of a successful DKIM verification by saying an email was Signed by: ... a domain.

It is important to note that any Internet-connected computer can act as a origin server for email. However, only servers designated by a DNS MX record can be configured to act as the receiving server. When you are designing your own email-enabled web applications, it is important to keep these authentication mechanisms in mind to ensure that your email comes only from trustworthy SMTP origin servers and that you take proper actions to prevent your emails from being classified as spam.