Before email (short for electronic mail, or e-mail) when dinosaurs walked the Earth, it was difficult to avoid “phone tag”. To leave someone a message, you needed to physically travel to their office and leave a slip of paper. (There were no cell phones in those days, and even answering machines were rare.) At some point it was realized that (at some organizations anyway) people had computer terminals in their offices, and that it was possible to leave a message in a file at a given location. When a user returned to their office they could check if that file existed and contained any new messages.
To make this easier, a simple program was used. To send email to someone, the original mail program could be used this way: “mail user”. Any text you typed thereafter would be appended to that user's mailbox file, sometimes called the inbox. (The standard location back then was /usr/spool/mail/username.) When done entering the message, you would indicate EOF by hitting Control+D. (You could also pipe into this command.) To read your mail (if any), a user would just type “mail”. This would dump the contents of that user's mailbox file to the screen.
Over time this command became more sophisticated: it automatically added a header with the sender's username and the time the message was sent. This scheme allowed the mailbox to hold multiple messages, since the header separated one from the next. That file is called a mail folder (or mailbox), but it is a single text file containing one message after another. The newer mail command also displayed one message at a time and gave the user options to save the message, delete the message, and even reply to it (that is, send a message back to the person who sent you a message). The reply message usually has the same “subject” as the original, with “Re: ” prepended. Sometimes the original message is quoted in the new message body as well.
Email continues to be very popular. The marketing research firm Radicati reports over 182 billion email messages were sent every day in 2013, over 215 billion in 2016, and over 290 billion in 2019. That comes to over 3.3 million email messages sent or received every second! (Sadly, other estimates indicate that over 90% of that is spam or other forms of malware such as viruses.)
The invention of email is generally credited to V.A. Shiva Ayyadurai in 1978, while he was a high school student in New Jersey. However, this isn't true at all; email was in use in the early 1970s. Apparently, this guy invented a program he called “EMAIL”, and the press got confused.
An email message contains three parts: the envelope which identifies the sender and recipients, the headers, and the message body. The headers and body together are referred to as the message.
It is important to understand that only the recipients listed on the envelope will receive an email message; mail servers never look at any headers to determine this! Also, the envelope is not part of the message that gets saved in a user's mailbox. So once some email message is delivered to you, you can't tell who else was listed on the envelope.
The special header mentioned above that identifies the start of an email message in a mailbox looks like this:
From sender date
(Note the space and no colon after “From ”. This special “From ” header is part of the mailbox format common to most systems, known as MBOX or Berkeley mailbox format, and is not a standard email header at all. See RFC-4155 and the mbox(5) man page for details.) This “From ” header is generated automatically from the envelope from address. Even if a “From ” header was provided, it is over-written by most mail servers with the real sender. This header is always the first one of an email message. When a mail program is reading a mailbox, an email message begins with this header and continues until the next “From ” header, or until the end of the file.
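What happens if the message body contains a line starting with “From ”? The mail program will typically insert a character in front of that line so it is not mistaken for the start of a new message. The traditional MBOX convention inserts a “>” (giving “>From ”); format=flowed text (RFC-3676) instead inserts an ASCII space, a technique called space-stuffing. When displaying messages, a mail program that understands the convention removes the extra character.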
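To make the mailbox format concrete, here is a minimal sketch of how a mail program might split an MBOX file into messages. It assumes the common “>From ” quoting convention, and the function name split_mbox and the sample messages are made up for illustration; real MBOX readers handle more corner cases.

```python
def split_mbox(text):
    """Split MBOX-format text into messages (each a list of lines)."""
    messages = []
    for line in text.splitlines():
        if line.startswith("From "):
            # A "From " line (space, no colon) marks the start of a message.
            messages.append([line])
        elif messages:
            # Assumption: body lines were quoted as ">From " when stored.
            if line.startswith(">From "):
                line = line[1:]
            messages[-1].append(line)
    return messages


# Hypothetical mailbox holding two messages.
sample = (
    "From alice Mon Jan  1 10:00:00 2024\n"
    "Subject: lunch\n"
    "\n"
    ">From what I hear, the cafe is closed.\n"
    "\n"
    "From bob Mon Jan  1 11:00:00 2024\n"
    "Subject: Re: lunch\n"
    "\n"
    "OK, somewhere else then.\n"
)

msgs = split_mbox(sample)
for m in msgs:
    print(m[0])  # the "From " line that starts each message
```

Note that the quoted body line does not start a third message, and it is restored to its original form when read.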
Eventually other headers were allowed as well such as “Subject:”. A problem is, how to tell the difference between the headers added by the mail program and the message entered by the sender? The answer is to have all the headers at the beginning (hence the name), followed by one blank line, and that followed by the message entered by the user. That part is known as the message body.
Recall the “To:” header does not determine to whom the email gets delivered. The addresses passed to the mail server (the envelope addresses) and not the ones listed in any mail headers determine who receives the email. Since the various headers in the message do not determine who receives the message, they may be faked easily.
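The “headers, one blank line, then body” layout is easy to see with Python's standard email package. This is only a sketch; the addresses and text are invented for the example.

```python
from email import message_from_string

# A tiny message: headers first, one blank line, then the body.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: lunch\n"
    "\n"
    "Noon at the usual place?\n"
)

msg = message_from_string(raw)
headers = dict(msg.items())   # everything before the blank line
body = msg.get_payload()      # everything after the blank line

print(headers["Subject"])
print(body)
```

Nothing in the parsed message says who was on the envelope; the “To:” header shown here is just text that the sender's program chose to include.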
There are many standard headers that can be used, such as: Subject:, To:, From:, Cc: (carbon copy), Date:, etc. The carbon copy list is the same as the “To:” list, just more recipients to add to the envelope. The only difference is that somehow being listed on the To: list confers more status than being listed on the Cc: list.
Note that a Bcc: header is normally not delivered with the message: a “Blind carbon copy” recipient is listed on the envelope, and the Bcc: line is removed before the message is sent to the other recipients.
See RFC-2076 and RFC-4021 for a description of standard email headers.
Often you need to add your name, title, contact information, and sometimes a legal notice to some or all of the email messages you send. This information is known as a signature block, signature line, sig block, or just a signature. (This should not be confused with digital signatures, discussed below.) That can get tedious to type in for each message! Many mail programs (MUAs) and mail servers (MTAs) have a feature where you can set a signature to be automatically appended to the body of all email messages.
There are rules of “netiquette” (network etiquette) for email signatures. They should always begin with a line only containing two dashes and a space. This signature separator is called sig dashes, signature cut line, or sig-marker. (It is recognized automatically by most email programs, which can treat signatures specially when replying to a message and quoting the message body.) The other rules are that the signature should be plain text, with no more than 4 lines; each line should be at most 80 columns long.
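As a rough illustration of how an email program can recognize the sig dashes, the sketch below splits a message body at the first line that is exactly two dashes and a space, then checks the signature against the 4-line, 80-column rules. The function names split_signature and signature_ok are hypothetical, not part of any mail program.

```python
def split_signature(body):
    """Return (text, signature), splitting at the first "-- " line."""
    lines = body.split("\n")
    for i, line in enumerate(lines):
        if line == "-- ":  # two dashes and one space, nothing else
            return "\n".join(lines[:i]), "\n".join(lines[i + 1:])
    return body, ""  # no signature separator found


def signature_ok(signature):
    """Check the netiquette rules: at most 4 lines of at most 80 columns."""
    lines = signature.split("\n") if signature else []
    return len(lines) <= 4 and all(len(line) <= 80 for line in lines)


body = "See you Monday.\n-- \nHymie Piffl\nhpiffl@example.com"
text, sig = split_signature(body)
print(repr(text))
print(repr(sig), signature_ok(sig))
```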
There are many rules of “netiquette” for the body of email. When using traditional, plain text email there are rules for formatting signatures, quoting material, line length, line wrapping, and so on. These are defined in RFC-3676 (Text/Plain Format). This also defines when to use “space stuffing” (adding a space to lines that start with “From ”, a space, or a “>”).
One useful convention you should follow is to quote URLs with angle-brackets (“<” and “>”). This allows an MUA to recognize a URL even when wrapped over multiple lines.
Modern email addresses look like this:
username@hostname
The hostname is a host or computer on a network. The “@hostname” part is optional; if missing, usually localhost (that is, the current system) is assumed. The hostname should be a valid DNS host name, or an IP address enclosed in square-brackets (e.g., “hymie@[192.0.2.3]”). It can also be another name defined in the DNS, in an MX record. A common example is to use an organization's domain name only and not the name of any particular host; for example “user@example.com” and not “user@mailserver.example.com”. The username should be a valid account on that system (or a defined alias such as “webmaster”).
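The address rules above can be sketched in a few lines. This is a simplified illustration only (real address syntax has many more cases); the function name parse_address and the three labels it returns are assumptions made for the example.

```python
import ipaddress


def parse_address(addr):
    """Split an address into (username, host, kind), using simplified rules."""
    user, sep, host = addr.rpartition("@")
    if not sep:
        # No "@hostname" part: assume the current system.
        return addr, "localhost", "local"
    if host.startswith("[") and host.endswith("]"):
        # An IP address in square-brackets; ip_address() raises
        # ValueError if the text inside is not a valid address.
        ipaddress.ip_address(host[1:-1])
        return user, host[1:-1], "ip"
    # Otherwise treat it as a DNS name (a host name or an MX name).
    return user, host, "dns"


for example in ("hymie", "hymie@[192.0.2.3]", "user@example.com"):
    print(parse_address(example))
```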
The credit for the invention of this form of email address goes to Raymond Samuel Tomlinson, while working on an extension to the localhost only email program, SNDMSG. The result was the first email program that could send messages to a user on another host. Tomlinson wrote this for the early ARPANET.
First you (let's call you “User A”) start up your MUA (email client software), compose a message, add some headers (such as Subject:), and state to whom the email should be sent (let's say “User B”). When you are finished (and click on “Send”), the program (usually) adds some additional standard headers and then sends the mail message to your outgoing email server, either an MSA (Mail Submission Agent) or an MTA (Mail Transport Agent), which may scan or modify the email before sending it out.
The mail gets routed from one mail server to another (nowadays very few hops are needed), with the servers in the middle relaying the mail to the next server (or “hop”) along the path from the source to the destination. (Each mail server that receives an email will add a “Received:” header to it. By looking at these headers you can see the path a message took.)
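Here is a small sketch of reading the path from the “Received:” headers, using Python's standard email package. The host names are invented and the header text is much simpler than what real servers write. It assumes each server adds its header at the front of the message, so the newest hop comes first.

```python
from email import message_from_string

# Hypothetical message that passed through two mail servers.
raw = (
    "Received: from mx.example.net by mail.example.org; Mon, 1 Jan 2024 10:00:03 +0000\n"
    "Received: from client.example.net by mx.example.net; Mon, 1 Jan 2024 10:00:01 +0000\n"
    "From: a@example.net\n"
    "To: b@example.org\n"
    "Subject: hops\n"
    "\n"
    "Hello\n"
)

msg = message_from_string(raw)
hops = msg.get_all("Received", [])

# Reverse the list to get the hops in the order the message travelled,
# then pull out the name of the server that received it at each hop.
path = [hop.split(" by ")[1].split(";")[0] for hop in reversed(hops)]
print(" -> ".join(path))
```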
The mail arrives and is accepted by the recipient's mail server. The mail may be filtered, forwarded, or sorted into different mail folders, or simply delivered to the user's inbox.
Note: when sending email to many users on the same system, all the users are listed on the envelope but the actual email is sent only once to that (remote) system.
The recipient is somehow notified of the arrived email, and eventually reads it. The recipient uses their email client software to talk with their mail server. The process looks like this:
(Note the diagram shows the optional MSA and MFA.) The MTAs (mail servers) work in a store and forward manner. An MTA receives an email message and stores it on a (local) disk. Then the message is relayed to another MTA or handed to the local MDA (Mail Delivery Agent, the software that actually delivers email once received) for delivery. Various problems can cause messages to not get delivered. A message that is returned to the sender for this reason is usually known as a bounced message.
Email is submitted, transported, accepted, delivered, and read using a number of server components. These are collectively known as mail servers. All of the pieces, the MUAs, MTAs, MSAs, MDAs, and MAAs, communicate with each other using various protocols. These protocols can be proprietary or public; for example, the common public protocol that MTAs understand is known as SMTP (Simple Mail Transfer Protocol), or the modern enhanced version (ESMTP).
Mailer-daemon is the usual name of an MTA or MDA when it generates error email messages to return to the sender. Common causes include: a bad user name (the destination MTA will bounce the email, which means return it to the sender with an explanation of what happened); a bad hostname (the sender's MTA will bounce it); and the destination MTA being down (the sender's MTA, or more precisely the MTA immediately upstream of the destination, will retry for a while and then send a warning; if the destination server never responds, that MTA will eventually give up and bounce the email).
A mail store is where email is stored on disk, in one or more mailboxes (or mail folders). The MDA stores email there, and the MAA (or sometimes the recipient’s MUA) fetches email from there. The default, standard location for a user's incoming email is a file named /var/mail/name, but this can be configured to a different location (as long as all the software used, MUA, MDA, and MAA, uses the same location and format).
The original (and still common) mailbox format is known as MBOX, which has one file for each mailbox or mail folder. Newer formats include Maildir and Maildir++. These (and other) formats use one folder (directory) per mailbox, with each message in a separate file. (Microsoft Outlook uses proprietary file formats such as “PST”.) Email may also be stored in databases.
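The difference between the two kinds of mail store can be seen with Python's standard mailbox module, which supports both MBOX and Maildir. This sketch stores the same two made-up messages in each format inside a temporary directory; the file and directory names are arbitrary choices for the example.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def make_message(subject):
    """Build a small test message (addresses are invented)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = subject
    msg.set_content("Hello from " + subject)
    return msg


with tempfile.TemporaryDirectory() as tmp:
    # MBOX: every message is appended to one file.
    mbox_path = os.path.join(tmp, "inbox")
    box = mailbox.mbox(mbox_path)
    box.add(make_message("one"))
    box.add(make_message("two"))
    box.close()
    with open(mbox_path) as f:
        mbox_from_lines = sum(1 for line in f if line.startswith("From "))

    # Maildir: a directory per mailbox, one file per message.
    maildir_path = os.path.join(tmp, "Maildir")
    md = mailbox.Maildir(maildir_path)
    md.add(make_message("one"))
    md.add(make_message("two"))
    maildir_files = os.listdir(os.path.join(maildir_path, "new"))

print("From lines in the single mbox file:", mbox_from_lines)
print("message files in Maildir/new:", len(maildir_files))
```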
Whenever your email is stored on a server, it is subject to some email policy (such as quota limits). You can view HCC's email policy at www.hccfl.edu/oit/email-storage-policy.aspx.
The various mail components communicate between each other using protocols. Examples of some used for email are SMTP, POP, and IMAP. (Demo SMTP protocol by sending to wpollock@hccfl.edu. Show email “Received” headers to show the hops taken, by logging into Outlook, select the email, Actions→View original message. In the pop up window, click on the Message Details button.)
In the old days email was plain ASCII text, which takes only seven bits of each byte. Much of the early Internet dropped the 8th bit of every byte to gain a 12.5% speedup. Naturally this won't work with binary files such as GIFs, binary data files, or programs. So these needed to be encoded (in essence adding a zero bit after every seventh bit) when sent, and decoded by the recipient. Here's how this was done:
uuencode filename filename > file.uu
mail recipient < file.uu
The encoded file is copied into the body of the outgoing email. Once delivered the recipient would have to save the body, edit it to remove all but the encoded file, and decode the file:
mail
... save received email body in “file.uu” ...
uudecode file.uu
What a pain! Besides this problem, plain text email is... plain-looking. Early business adopters of email wanted better looking email with features such as bold, italics, underline, justified text, and color.
MIME (Multipurpose Internet Mail Extensions) is a protocol (actually, an encoding, or type of formatting, of the email body) that MUAs use to provide the styled text demanded by business users of early email. Today that isn't important (as we now use HTML for email with styles, graphics, and fancy formatting). But most importantly MIME supports multi-part email messages. This is when the body of the message is split into several parts, separated by a MIME separator string, where each part contains its own headers and is automatically encoded and decoded. Each of these parts is known as an attachment. Note that MIME is invisible to MTAs, MDAs, and MAAs, which only see a single message body with some weird stuff in it. (Virus scanners do know about attachments, of course.)
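To see what a multi-part message looks like to a program, here is a minimal sketch using Python's standard email package. The addresses, the file name data.bin, and the attachment bytes are all made up for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "a@example.com"
msg["To"] = "b@example.com"
msg["Subject"] = "report"
msg.set_content("The report is attached.")

# Adding binary data turns the message into a multi-part message.
data = b"\x89PNG\x00\x01\x02"  # arbitrary bytes standing in for a file
msg.add_attachment(data, maintype="application",
                   subtype="octet-stream", filename="data.bin")

# Each part has its own headers, including how it was encoded.
for part in msg.walk():
    print(part.get_content_type(), part.get("Content-Transfer-Encoding"))

# The receiving side decodes the part back to the original bytes.
attachment = next(msg.iter_attachments())
print(attachment.get_filename(), attachment.get_content() == data)
```

Printing msg.as_string() shows the whole thing the way an MTA sees it: one body containing the separator strings and the encoded parts.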
Today much of the Internet handles all eight bits of each byte, but not all of it does, so encoding is still used. MIME uses a technique known as Base-64 encoding (RFC-4648). MIME is defined by RFC-2045 through RFC-2049.
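A quick sketch of Base-64 using Python's standard base64 module: three arbitrary bytes (two of them with the 8th bit set) become four plain ASCII characters, which survive a 7-bit path and decode back to the original bytes.

```python
import base64

data = bytes([0xFF, 0x00, 0x80])    # not safe to send over a 7-bit path
encoded = base64.b64encode(data)    # every 3 bytes become 4 ASCII characters
decoded = base64.b64decode(encoded)

print(len(data), "bytes ->", len(encoded), "characters")
print(all(byte < 128 for byte in encoded))  # only 7-bit values remain
print(decoded == data)
```

The price is size: the encoded form is about one third larger than the original.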
View a sample email message that uses MIME attachments.
Even though HTML or PDF attachments make for attractive email, the information they contain should always be available as plain text in the body. Aside from security concerns about opening attachments (even simple graphics), the bodies of email messages can be searched; attachments generally cannot.
Many old command line MUAs exist; the most popular of those is mailx. However old mailx doesn't know about MAAs or MIME. An updated compatible MUA, nail, is available (nail = new mail?). nail is often installed under the name mailx on Linux and some Unix systems.
These older MUAs are still valuable because some of them (mailx but not alpine) can be used non-interactively to send mail from a shell script, and because system administrators often use command line access to Unix/Linux servers via SSH and may need to read or send mail directly from that server.
The use of a MUA such as alpine should be easy to learn since it is menu-driven. alpine is the current version of the pine MUA. (The developers wanted to change the license before continuing development, and they couldn't with the old name.) In alpine, at any point you can examine the menus to see what you can do. These are context-sensitive menus. For instance hitting ^J (control+J) when a header field is highlighted means to add an attachment; if the message body is highlighted this means to justify the text. In message body use ^R to read a file and paste its contents into the mail body.
There are three ways to get stuff into the body of an email message you're composing, aside from typing it in, including:
While in the message body, use “^R” to read and copy the file into the message body.
Run “cat file | alpine ub00” to start composing a message to ub00, with the contents of file already in the body. (You can pipe other commands too: “ls -l”.)
Spam is usually sent from personal computers infected by a virus, making them part of a “botnet”. The Kelihos botnet controlled 41,000 computers worldwide, and was capable of sending 3.8 billion spam e-mails per day. To see where spam comes from, you can view the latest numbers from www.SpamRankings.net.
One type of nasty spam that is popular is a message that appears to be from someone or some organization you know. The attacker tries to trick you into clicking a link (in the email body) that will either run nasty software on your computer or take you to a fake website that looks real, to collect your personal information. This is known as phishing and can be very hard to detect. (Avoid clicking on links in email; use a browser's bookmark/favorite instead to be sure you're going to the correct site.)
In the U.S. and a few other countries it is legal for users to encrypt their emails to protect their privacy. (See below for details on how to do this.) However recent legislation (such as the “Sarbanes-Oxley Act”, or “SOX”) may require that for any organization that allows encrypted email, the organization must be able to recover the private key to be able to turn over emails when legally requested to do so by the proper authorities.
In other parts of the world it is illegal to encrypt email.
A related issue is censorship of emails by employers, ISPs, or governments. Today there is wide-spread censoring or delaying of some Internet traffic by many ISPs. Some ISPs have a “premium” level of service where they promise not to delay or block your Internet traffic if you pay extra. (To me this seems a kind of protection racket: “Nice email you got there buddy! If you pay I can make sure nothing bad will happen to it”.)
As an example of ethical issues, consider the case of U.S. Circuit Judge Randall Rader. Once the top patent judge in the nation, Rader announced his resignation in June 2014, following an admission that he made an ethical “lapse” when he sent an e-mail praising an attorney who appears frequently before his court. The attorney got into trouble as well for forwarding the email.
Do you think this was an ethical lapse by the judge? Was it serious enough to warrant resignation? Does it matter that the attorney sent copies to his clients?
Another meaning for mail-bomb is an email that, just by reading it, will install malicious software on your computer or do other terrible things. This is a half-truth. Since plain email is just text, reading it with a basic MUA can never hurt your computer. Unfortunately some advanced MUAs (including reading email with a web browser) will accept instructions embedded in the email (perhaps using MIME) which can be abused. Personally I turn off HTML and JavaScript in my MUA, as well as any auto-loading of documents in attachments. Then, unless I download and run something that I found in an email, I am fairly safe from this threat.
Return-Receipt-To: "Hymie Piffl" <hpiffl@example.com>
Or this official header (RFC 3798):
Disposition-Notification-To: J Smith <jsmith@example.com>
For more details, see wikipedia.org/wiki/Return_receipt.
A list of e-mail addresses identified by a single name such as students@hccfl.edu is called a mailing list. When an e-mail message is sent to the mailing list name it gets sent to all the addresses in the list. Each list also has a special email address for configuring your use of the list (for example, add or remove yourself from the list) and another address for the (human) owner of the list. A good resource for learning about mailing lists is “Understanding Mailing Lists” by Harley Hahn (the author of our textbook).
Users can configure the procmail MDA to manage mail logs, to spam check, to filter email, to automatically reply to some mail, etc. See the man pages for procmailrc and procmailex. The file ~wpollock/.procmailrc contains examples of MDA return-receipts and of spam filtering using regular expressions and spamassassin (www.SpamAssassin.org).
While you have no guarantee of privacy with your email, you are allowed (in the U.S. anyway) to protect your email by encrypting the message. Such a message can't be read or tampered with by unauthorized parties.
The older technology for encryption is called symmetric (or shared) key: you and I share a key (password). Qu: how do we do that securely? Qu: what about doing business say with Amazon.com using this?
The old method is efficient but the problems are too difficult to make this technology useful on a wide scale. A newer technology is called public key encryption. With this method a pair of keys is made for each party. One is kept secret (the private key) and one is published (in email messages, in flyers, on web sites, on key servers, etc.) called the public key. To send a message to you I encrypt the message with your public key. Only you can decrypt it since this requires your private key and only you have it.
You reply to me by encrypting your reply with my public key (which only I can decrypt, using my private key). As you can see, four keys are used altogether. Public key encryption is the technology behind secure web sites that we all rely on (the web sites using the HTTPS protocol).
Public key encryption is much, much slower than symmetric key encryption. To make this technology practical, rather than encrypt a lengthy email message (body) a very large random number is generated by the sender. This number is used as a symmetric key and the message is encrypted with it. Only this session key gets encrypted using the public key method.
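The session key idea can be sketched with toy numbers. Everything here is an assumption made for illustration and is NOT secure: the key pair is textbook RSA built from two small known primes, and the “symmetric cipher” is just an XOR with a SHA-256 based keystream. Real software such as GPG uses vetted algorithms and much larger keys.

```python
import hashlib
import secrets

# Toy key pair for the recipient (far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1           # two known Mersenne primes
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (needs Python 3.8+)


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a keystream derived from key."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(f"{key}:{offset}".encode()).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt(message):
    session_key = secrets.randbits(128)    # the big random number
    wrapped_key = pow(session_key, E, N)   # only the key uses the public key
    return wrapped_key, keystream_xor(session_key, message)


def decrypt(wrapped_key, ciphertext):
    session_key = pow(wrapped_key, D, N)   # only the private key recovers it
    return keystream_xor(session_key, ciphertext)


message = b"A lengthy email message body would go here, many lines long."
wrapped, ciphertext = encrypt(message)
print(decrypt(wrapped, ciphertext) == message)
```

The slow public key operation is applied only to the short session key, while the fast symmetric step handles the message itself, however long it is.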
A digital signature is created for a message by encrypting it with the sender's private key. This doesn't protect the confidentiality of the message since anyone with the sender's public key can decode the message. (If privacy is also desired, the sender encrypts this encrypted message with the recipient's public key.) If the sender's public key decodes the message correctly, it is strong proof that their private key was used to encrypt it in the first place. Since only the sender has their private key, only the sender could have sent the message. Note this encryption with the sender's private key is a digital signature; a GIF graphic of a hand-written signature is not! (View sample digitally signed email.)
The U.S. federal government now treats digital signatures just as binding as a hand-written (or holographic) signature. See the Electronic Signatures in Global and National Commerce (ESIGN) Act passed in June of 2000, for details. Also many state governments have passed laws treating digitally signed emails as equivalent to documents signed holographically (by a person).
In practice this takes too long (even on modern computers) so a checksum (or digest or hash) of the message is encrypted with the private key instead, and this is appended to the (unencrypted) message. The recipient also computes a message digest of the email, then decrypts the message digest sent with the message body and compares the two. If the two digests differ the message was altered or forged.
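Signing a digest instead of the whole message can be sketched the same way. This again uses an assumed toy key pair (textbook RSA from two small known primes) and is NOT secure; it only shows the sequence of steps. The function names digest, sign, and verify are made up for the example.

```python
import hashlib

# Toy key pair for the sender (far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1           # two known Mersenne primes
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (needs Python 3.8+)


def digest(message):
    """Message digest as a number; reduced mod N only because N is tiny."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message):
    """Sender: encrypt the digest with the private key."""
    return pow(digest(message), D, N)


def verify(message, signature):
    """Recipient: decrypt with the public key and compare the two digests."""
    return pow(signature, E, N) == digest(message)


message = b"Pay Hymie Piffl $10."
signature = sign(message)   # appended to the (unencrypted) message

print(verify(message, signature))
print(verify(b"Pay Hymie Piffl $1000.", signature))
```

The first check succeeds; the second fails because the altered message produces a different digest.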
Because the private keys used to encrypt email must often be made available to organizations (to comply with laws), separate sets of keys are often used for digital signatures and for email encryption.
Issue of Trust: A public key can be digitally signed by a trusted third party (such as VeriSign). This third party has a well-known public key. Most web browsers and email clients come with a built-in list of such well-known public keys.
People can use PGP/GPG (Pretty Good Privacy, written by Phil Zimmermann, and Gnu Privacy Guard) to encrypt, decrypt, compute message digests, and to digitally sign messages. PGP was written first; GPG is a later, free implementation of the same standard, written by Werner Koch. GPG is sometimes called GnuPG. The standard used by any of these programs is called OpenPGP; most people use all these terms interchangeably. (See the man page for gpg for details.)
GPG provides easy email integration with modern MUAs. But to use this technology people must generate a pair of keys and publish their public key. (You also need a third party to sign your key so others will trust it.) These complexities have hampered the widespread adoption of encryption and digital signatures. You can find easy to follow, step by step directions for using GPG with Thunderbird MUA at EmailSelfDefense.FSF.org.
The secure web sites with URLs such as HTTPS://www.example.com/ work by exchanging public keys between server and browser, which then verifies these by using the trusted third party's public key to validate the key. Next the browser and web server exchange a session key (a big random number encrypted with the public keys) to use for symmetric key encryption for the rest of that session. (This is an over-simplification of what really happens but should provide some idea of the process.)
The original command line email program was mail. The modern version of this is called mailx. These tools are still useful to allow mail to be sent from a shell script (although some modern tools allow that as well).
The default location of a user's mailbox is /var/mail/username. This can be changed by configuring the MDA and MAA (both of which need to know where the email is kept). The MUA must also know this location; the MAIL environment variable is often used for this.
In an MBOX mailbox, each message begins with a line of the form “From user date”.
Standard headers include From:, To:, Cc:, Subject:, etc. (However, a Bcc: header is normally removed before the message is delivered.)
Each mail server that receives an email adds a “Received:” header to the front of the email.
In the past the uuencode and uudecode utilities were common; now MIME's Base-64 encoding is used.
alpine has an easy to use menu-driven interface. Although it supports attachments, there are several ways to copy files (or the output of commands) into the body of an email message.
Email can be encrypted and digitally signed using gpg.