Email system: Difference between revisions

From Citizendium
imported>Eric M Gearhart
No edit summary
imported>John Stephenson
m (moved Email system/Draft to Email system: citable version policy)
 
(30 intermediate revisions by 5 users not shown)
{{subpages}}
{{subpages}}
{| class="wikitable sortable" width="35%" align="right"
{{seealso|Email processes and protocols}}
! Term
The '''email system''' is the network of computers handling electronic mail ([[email]]) on the Internet.  This system includes user machines running [[Email user programs|programs]] that compose, send, retrieve, and view messages, and agent machines that are part of the mail handling system. Like other complex systems, the email system is best explained by looking separately at different perspectives, applying the principle of [[Separation of concerns|separation of concerns]]. There are two coequal ways of looking at email systems - the administrative perspective (who does what), and the process perspective (how it flows). The administrative perspective presented in this article is the simplest. It can be understood without any technical background. The process perspective presented in "[[Email processes and protocols]]" provides more technical depth, and should be understood by anyone involved in the design or operation of email systems.
! Definition
|-
| MTA
| Message Transfer Agent; usually describes the software on the server side for moving email messages around
|-
| SMTP
| Simple Mail Transfer Protocol; the protocol used to transfer mail from one mail system to another. Port 25 is used for transfer between mail servers, and port 587 for message submission by mail clients.
|-
| POP
| Post Office Protocol; A protocol where a client connects, downloads mail from the server and then deletes that mail from the server. Mail that is downloaded then "sticks" on the computer the user retrieves their mail from. Contrast with IMAP.
|-
| IMAP
| Internet Message Access Protocol; IMAP differs from POP in that messages are left on the server; this allows a user to "float" between different clients at different locations but still have access to all their mail
|}


This article provides a basic description of how the Internet email system works.  See the related articles and bibliography for more on the protocols used in [[Email message transfer|message transfer]] or the [[Email message formats|format of messages]].  Good elementary discussions of these topics can also be found in most texts on computer networks.
In the '''process perspective,''' the mail handling system can be modeled as a sequence of [[Relay (computers)|relay]] processes, each temporarily storing the message, performing some specialized function, and passing it on to the next relay using the [[Simple Mail Transfer Protocol|SMTP protocol]].<ref>
<ref> See Bibliography [PnD07] and [Stevens04].</ref>
Do not confuse [[Simple Mail Transfer Protocol | SMTP Relays]] with [[Router | routers]] or packet switches.  In this article and its subtopics, Relay will always mean an SMTP relay.  Relays use SMTP/TCP/IP, and the functionality of routers is entirely encapsulated within the IP layer of this [[Internet Protocol Suite | protocol stack]].  We can ignore routers in this discussion.  They are "transparent" to SMTP.
<!-- <ref name=PnD>{{citation
</ref>
  | author = L. Peterson, B. Davie
You can tell how many relays handled a message by looking at the lines labeled "Received:" in the [[Email message headers|message header]]. There should be one for each relay. Relays are not our focus in this article, however. We can ignore them in higher-level models, just as routers and physical links can be ignored in discussing relays.
| title = Computer Networks: A Systems Approach
| edition = 4th
| year = 2007
| contribution = Sect. 9.1.1 Electronic Mail}}</ref> -->


Internet mail has evolved without much central planning to a collection of very diverse and astonishingly complex systems.  Like the [[Internet]] itself, it is helpful to study these systems the way a biologist would study an organism, or a social scientist the behavior of a group.  Who are the Actors in a typical email system?  What are their roles and responsibilities in handling the mail?  What are their relationships with each other?  What are their motivations?  How can we build better security systems?
In the '''administrative perspective,''' the principal entities are actors, their roles, and their relationships.  Who are the actors in a typical email system?  What are their roles and responsibilities in handling the mail?  What are their relationships with each other?  What are their motivations?  How can we build better security systems? A basic understanding of the administrative perspective should help answer these questions.  This article provides that understanding.


A typical mail handling system has a network of Relays,
=== System architecture ===
<ref>Do not confuse [[SMTP]] Relays with [[router]]s or packet switches.  Relays use SMTP/TCP/IP, and the functionality of routers is entirely encapsulated within the IP layer of this protocol stack.  We can ignore routers in this discussion.  They are "transparent" to SMTP.
{{Image|EmailSystem.png|right|700px|'''Figure 1 Actors (users and agents) and their roles in an ideal email system.'''}}
</ref>
each temporarily storing the message, performing some specialized function, and passing it on to the next Relay using the [[SMTP]] protocol.  You can tell how many Relays handled a message by looking at the Received lines in the message header.  There should be one for each Relay.


{|align="right" cellpadding="10" style="background-color:#FFFFCC; width:300px; border: 1px solid #aaa; margin:10px; font-size: 100%;"
|
<big>'''Actors and Roles:'''</big><br />
Actors include users and [[Email agents | agents]].<br />
Agents may play more than one role.<br />
Typical roles include Transmitter, Receiver, and Forwarder.<br />
|}


==A typical Email System==
Figure 1 shows an ideal system with the machines grouped into functional blocks. In this diagram, we have named the blocks by their role in processing a message.  The actors (users or agents) are shown in italic text.  The MSA role, for example, is played by a ''Mail Submission Agent'', which performs all functions related to message submission.  In this ideal system, we have assigned each role to a different actor.  In real systems, however, an actor can have multiple blocks, a block can have multiple machines, and a machine can host multiple relays running as independent [[Daemon (computer software)|daemon processes]].
Figure 1 shows a typical system with the Relays grouped into functional blocks. In this diagram, we have named the blocks by the role they play in processing a message, and assigned each role to a different Actor (User or Agent).  However, each Actor can have multiple blocks, each block can have multiple hosts, and each host can have multiple Relays running as independent daemon processes.  A Transmitter might have a dozen Relays, operating in parallel to handle a large mailflow, or widely dispersed to serve users all over the world.  An MDA might have a process dedicated to managing a large mailstore, another running a POP/IMAP server, and another providing a webmail interface.


{{Image|EmailSystem.png|left|700px|'''Figure 1  Actors (Users and Agents) and roles (functional blocks) in a typical email system.'''}}
A small Internet Service Provider (ISP) might perform the roles of MSA/Transmitter using two relays running on one machine. An agent performing the role of [[Email agents|Transmitter]] might have a dozen relays, operating in parallel to handle a large mailflow, or widely dispersed to serve users all over the world. A Mail Delivery Agent might have a process dedicated to managing a large mailstore, another running a [[Messaging application protocols|POP/IMAP]] server, and another providing [[Webmail|webmail]] via HTTP to the Recipient's browser.


To understand a mail handling system, including its security vulnerabilities, we need to focus on the roles and responsibilities of each Actor and the relationships between them.  Figure 1 is a simplified model of just one system.  There are many other possibilities. We might add a Forwarder between the Receiver and the MDA, or an Open Relay floating in the cloud. We might join two blocks under one Agent. We might add another layer of organization, showing a group of Actors organized as a Mail Receiving Network
There are many other possibilities. We might add a Forwarder between the Receiver and the MDA. We might show contractual relationships between the agents or their affiliation with particular networks.<ref>
<ref>{{citation
Networks of routers and links are organized into [[Autonomous System | Autonomous Systems]].  As with routers, however, it is much simpler to think of this layer as transparent to the level we are modeling (even if the actors in our model happen to be also network owners).  This is a frequent point of confusion, particularly for people who would like to hold network owners responsible for the content of the messages they carry.
| author = K. Moore
  | title = Recommendations for Submission of Email and Relaying of Email Between Mail Networks
| year = 2005}}, http://www.cs.utk.edu/~moore/opinions/email-submission-recommendations.html
</ref>
</ref>
or an Administrative Management Domain.
A diagram like Figure 1 could get quite complex. A shorthand notation will allow us to show the relevant networks, actors, roles, and relationships. Here is a basic system with four actors (two users and two agents), organized as two networks:
<ref> Administrative Management Domain (ADMD) is the more general term proposed in [Crocker08].  This could include a Mail Receiving Network or any other group of Actors with some pre-arranged relationship.</ref>
<!-- <ref>{{citation
| author = D. Crocker
| title = Internet Mail Architecture
| year = 2008}}, http://tools.ietf.org/html/draft-crocker-email-arch-11</ref> -->
 
Yet another layer could be shown by grouping the Relays according to who owns the equipment.
<ref> These are the [[Autonomous System]]s of the physical network.  As with routers, however, it is much simpler to think of this layer as transparent to the level we are modeling (even if the Actors in our model happen to be also network owners).  This is a frequent point of confusion, particularly for people who would like to hold network owners responsible for the content of the messages they carry.</ref>
A diagram like Figure 1 could get quite complex. A shorthand notation will allow us to show the relevant networks, actors, roles, and relationships. Here is a basic system with four Actors (two Users and two Agents), organized as two networks:
<ref> These "networks" are of Actors and their relationships, not routers and data links.</ref>
<!-- To avoid confusion, we could use the more precise but less familiar term, ADMD, proposed in [Crocker08].</ref> -->
<!-- but less familiar terminology, ADMD, proposed in D. Crocker, "Internet Mail Architecture", 2008, http://tools.ietf.org/html/draft-crocker-email-arch-11.</ref> -->


  |--- Sender's Network ---|          |-- Recipient's Network -|
|--- Sender's Network ---|          |-- Recipient's Network -|
                                  /
                                /
  Author ==> MSA/Transmitter --> / --> Receiver/MDA ==> Recipient
Author ==> MSA/Transmitter --> / --> Receiver/MDA ==> Recipient
                                /
                              /
                            Border
                            Border


The double arrow shows a direct relationship between Actors (e.g. a contract between the Author and his Email Service Provider ('''ESP''')). The single arrow shows only the direction of mail flow. There is no relationship between Agents across the Border to the open Internet. The / shows multiple roles being played by one Actor. Using these diagrams, we can model almost any system, and include a lot of detail on relationships, but not lose the simplicity of Figure 1. The elements of the model (Actor's roles) are the fundamental building blocks.
To understand a mail handling system, including its security vulnerabilities, we need to focus on the roles and responsibilities of each actor and the relationships between them. The double arrow shows a direct relationship between actors (e.g. a contract between the Author and his ISP). The single arrow shows only the direction of mail flow. There is no relationship between agents across the Border to the open Internet. The / shows multiple roles being played by one actor. Using these diagrams, we can model almost any system, and include a lot of detail on relationships, but not lose the simplicity of Figure 1. The elements of the model (actors' roles) are the fundamental building blocks.  See [[Email agents]] for more example systems.
<!--
<ref>The roles of Transmitter, Receiver, and Forwarder can be abbreviated as XMTR, RCVR, and FWDR.  This will make the diagrams more compact, and avoid any confusion between roles and actors.  In this discussion, however, there is little confusion, and the precise terminology is too awkward.  Thus, when we say "authenticate the Transmitter" we mean "authenticate the Agent playing the role of XMTR".
</ref> -->


Here is an extension of the basic system, adding a Forwarder role, played by the same Actor as the Receiver. Both the Receiver/Forwarder and the MDA have a direct relationship with the Recipient, so they have an indirect relationship with each other. These details are important in discussions of authentication protocols.
Here is an extension of the basic system, adding a Forwarder role, played by the same actor as the Receiver. Both the Receiver/Forwarder and the MDA have a direct relationship with the Recipient, so they have an indirect relationship (wavy arrow) with each other. These details are important in discussions of [[Email authentication|authentication protocols]].  
          |-------- Recipient's Network ---------|
      /
--> / --> Receiver/Forwarder ~~> MDA ==> Recipient
    /
  Border


            |-------- Recipient's Network ---------|
If we wonder why email continues to be such an insecure system, we can study this last example. An MDA is quite frequently a Receiver/MDA that is unaware when an incoming message has been forwarded. If the MDA runs the most common [[Email authentication|authentication]] checks on the incoming message, it may be rejected as a forgery.  The problem is that the Transmitter's domain name no longer correlates with the IP address seen on the incoming connection from the Forwarder.
      /
  --> / --> Receiver/Forwarder ~~> MDA ==> Recipient
    /
  Border


If we wonder why email continues to be such an insecure system, we can study this last example.  Authentication protocols that try to correlate the Transmitter's domain name to the connecting IP address can fail when a Forwarder is involved.  We cannot just dismiss Forwarding as an "edge case", however.  It is important for a user who changes jobs or ESPs, and would like to continue receiving mail at his old address.
This is a good example of how difficult it is for security protocols to keep up with evolving system designs and changing environments.  Forwarding is much more prevalent now than when the most common email authentication protocols were designed.  We can no longer dismiss Forwarding as just an "edge case".  It is important for a user who changes jobs or ISPs, and would like to continue receiving mail at her old address.


Let's follow a message from start to finish.  The scenario begins with an Author composing a message using his mail client.  There are countless mail clients available, just like there are many web browsers to choose from.  In fact, most web browsers now include a mail client, or at least a mechanism to invoke the user's preferred client when he clicks a mailto: link in a webpage.
=== Message handling ===


When the Author clicks SEND, his mail client connects to an MSA at his ESP, and the message is transferred using [[SMTP]]. A key responsibility of the MSA is to authenticate the Author. This can be done with a password, by assigning the client machine a static IP address, or by having the client connect through the MSA's local network, not through the Internet.
Let's follow a message from start to finish. The scenario begins with an Author composing a message using a [[Email user programs|mail client]] on a home computer. There are numerous mail clients available, just like there are many web browsers to choose from. In fact, many web browsers now include a mail client, or at least a mechanism to invoke the user's preferred client when he clicks a mailto: link in a webpage.
When the Author clicks SEND, his mail client connects to an MSA machine at his ISP. A key responsibility of the MSA is to authenticate the Author. This can be done with a password, by assigning the client machine a static IP address, or by having the client connect through the MSA's local network, not through the Internet.  If it is necessary to connect through the open Internet, use of a special [[Email port 587|TCP port 587]] helps to segregate requests for an authenticated connection from the flood of fraudulent attempts on port 25. After authentication, the message is transferred using SMTP.


Most large ESPs operate their own transmitter Relays, but smaller companies, and organizations with a lot of bulk mail, often subcontract this specialized role to another Agent.  The Transmitter's responsibilities include prevention of outgoing spam, and providing some means to prove their identity to unrelated Receivers.  It isn't enough to say "HELO, this is trustme.com".  Any spammer can do that.  The Transmitter must provide some "out-of-band" data using a service like DNS that is more trusted than email.
Most large ISPs operate their own transmitter relays, but smaller companies, and organizations with a lot of bulk mail, often subcontract this specialized role to another agent.  Such agents advertise their services under the name "SMTP Relay", but in this article we will use the more specific term Transmitter when we mean a role or agent, and transmitter relay when we mean the relay at the sender's side of the Border.


The Receiver's responsibilities include a number of functions we might call "Border defense" – blocking a DoS attack, authenticating the sender, and various spam-blocking strategies, including whitelisting, blacklisting, statistical analysis of message content, and use of heuristic rulesets that have proven effective in separating spam from legitimate mail. Border defense should be done at the Border.  Loss of mail due to violations of this principle is common. A forwarded message can look like a forgery, and then the MDA has a tough choice – drop the message with no notice to the alleged sender, or send the notice and risk being reported for "bounce spam".
The Transmitter's responsibilities include prevention of outgoing spam, and providing some means to prove their identity to unrelated Receivers. It isn't enough to say "Hello, this is trustme.com". Any criminal can do that, and identity fraud has become a major problem on the Internet. The Transmitter must provide some "out-of-band" data using a service like [[Domain Name System|DNS]] that is more secure than email. DNS records can be used to publish a public key, a list of IP addresses, or some other data that the Receiver can use to run one or more [[Email authentication|authentication methods]].


The problems with mis-configured mailsystems can be avoided if all Actors understand their roles and relationships. When a Recipient sets up forwarding from his old Receiver/Forwarder to his new MDA, he should make sure that the Forwarder is whitelisted by the MDA. Forwarders should make sure that Recipients (non-expert users) understand this. MDAs should understand that forwarding is a common need, and make it easy for Recipients to whitelist their Forwarders.
The Receiver's responsibilities include a number of functions we might call "border defense" – blocking [[Denial of service|Denial of Service]] (DoS) attacks, authenticating the sender, and various [[Anti-spam techniques|spam-filtering]] strategies, including whitelisting, blacklisting, statistical analysis of message content, and use of heuristic rulesets that have proven effective in separating spam from legitimate mail. Border defense should be done at the Border. Loss of mail due to violations of this principle is common. A forwarded message may be rejected as a forgery, and then the Forwarder has a tough choice – drop the message with no notice to the alleged sender, or send the notice and risk being reported for [[Anti-spam techniques|"bounce spam"]].
The problems with mis-configured mail systems can be avoided if all actors understand their roles and responsibilities. When a Recipient sets up forwarding from his old Receiver/Forwarder to his new MDA, he should make sure that the Forwarder is whitelisted by the MDA. Forwarders should make sure that Recipients (non-expert users) understand this. MDAs should understand that forwarding is a common need, and make it easy for Recipients to whitelist their Forwarders.


== Notes ==
=== Notes ===
{{reflist|2}}
<references/>

Latest revision as of 06:07, 29 August 2013

This article has a Citable Version.
See also: Email processes and protocols

The email system is the network of computers handling electronic mail (email) on the Internet. This system includes user machines running programs that compose, send, retrieve, and view messages, and agent machines that are part of the mail handling system. Like other complex systems, the email system is best explained by looking separately at different perspectives, applying the principle of separation of concerns. There are two coequal ways of looking at email systems: the administrative perspective (who does what) and the process perspective (how it flows). The administrative perspective presented in this article is the simpler of the two. It can be understood without any technical background. The process perspective presented in "Email processes and protocols" provides more technical depth, and should be understood by anyone involved in the design or operation of email systems.

In the process perspective, the mail handling system can be modeled as a sequence of relay processes, each temporarily storing the message, performing some specialized function, and passing it on to the next relay using SMTP.[1] You can tell how many relays handled a message by looking at the lines labeled "Received:" in the message header. There should be one for each relay. Relays are not our focus in this article, however. We can ignore them in higher-level models, just as routers and physical links can be ignored in discussing relays.
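
The "Received:" count can be checked with a few lines of Python. This is only a minimal sketch using the standard library's email parser; the raw message, host names, and timestamps below are invented for illustration and do not come from a real mail system.

```python
from email import message_from_string

# Hypothetical raw message. Host names and dates are illustrative only.
# Each relay adds its Received: line at the top, so the newest is first.
RAW_MESSAGE = """\
Received: from receiver.example.net by mda.example.net; Mon, 1 Jan 2024 10:00:03 +0000
Received: from msa.example.com by receiver.example.net; Mon, 1 Jan 2024 10:00:02 +0000
Received: from author-pc.example.com by msa.example.com; Mon, 1 Jan 2024 10:00:01 +0000
From: author@example.com
To: recipient@example.net
Subject: Hello

Body text.
"""


def count_relays(raw: str) -> int:
    """Return the number of Received: header lines in a raw message."""
    msg = message_from_string(raw)
    # get_all() returns None when the header is absent.
    return len(msg.get_all("Received") or [])


relay_count = count_relays(RAW_MESSAGE)
print(relay_count)
```

Reading the lines from bottom to top gives the order in which the relays handled the message.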

In the administrative perspective, the principal entities are actors, their roles, and their relationships. Who are the actors in a typical email system? What are their roles and responsibilities in handling the mail? What are their relationships with each other? What are their motivations? How can we build better security systems? A basic understanding of the administrative perspective should help answer these questions. This article provides that understanding.

System architecture

(CC) Image: David MacQuigg
Figure 1 Actors (users and agents) and their roles in an ideal email system.

Actors and Roles:
Actors include users and agents.
Agents may play more than one role.
Typical roles include Transmitter, Receiver, and Forwarder.

Figure 1 shows an ideal system with the machines grouped into functional blocks. In this diagram, we have named the blocks by their role in processing a message. The actors (users or agents) are shown in italic text. The MSA role, for example, is played by a Mail Submission Agent, which performs all functions related to message submission. In this ideal system, we have assigned each role to a different actor. In real systems, however, an actor can have multiple blocks, a block can have multiple machines, and a machine can host multiple relays running as independent daemon processes.

A small Internet Service Provider (ISP) might perform the roles of MSA/Transmitter using two relays running on one machine. An agent performing the role of Transmitter might have a dozen relays, operating in parallel to handle a large mailflow, or widely dispersed to serve users all over the world. A Mail Delivery Agent might have a process dedicated to managing a large mailstore, another running a POP/IMAP server, and another providing webmail via HTTP to the Recipient's browser.

There are many other possibilities. We might add a Forwarder between the Receiver and the MDA. We might show contractual relationships between the agents or their affiliation with particular networks.[2] A diagram like Figure 1 could get quite complex. A shorthand notation will allow us to show the relevant networks, actors, roles, and relationships. Here is a basic system with four actors (two users and two agents), organized as two networks:

|--- Sender's Network ---|           |-- Recipient's Network -|
                                /
Author ==> MSA/Transmitter --> / --> Receiver/MDA ==> Recipient
                              /
                           Border

To understand a mail handling system, including its security vulnerabilities, we need to focus on the roles and responsibilities of each actor and the relationships between them. The double arrow shows a direct relationship between actors (e.g. a contract between the Author and his ISP). The single arrow shows only the direction of mail flow. There is no relationship between agents across the Border to the open Internet. The / shows multiple roles being played by one actor. Using these diagrams, we can model almost any system, and include a lot of detail on relationships, but not lose the simplicity of Figure 1. The elements of the model (actors' roles) are the fundamental building blocks. See Email agents for more example systems.
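
For readers who prefer code to diagrams, the shorthand notation can be modeled as a small data structure. The sketch below is one possible encoding, not part of the notation itself; the class name `Actor`, the constants, and the `render` function are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Actor:
    """A user or agent. An agent may play more than one role."""
    roles: Tuple[str, ...]

    def label(self) -> str:
        # The "/" shows multiple roles being played by one actor.
        return "/".join(self.roles)


DIRECT = "==>"        # direct relationship between actors, e.g. a contract
BORDER = "--> / -->"  # mail flow only, crossing the Border


def render(chain) -> str:
    """Render a list that alternates Actor, arrow, Actor, ... as one line."""
    parts = [item.label() if isinstance(item, Actor) else item for item in chain]
    return " ".join(parts)


basic_system = [
    Actor(("Author",)), DIRECT,
    Actor(("MSA", "Transmitter")), BORDER,
    Actor(("Receiver", "MDA")), DIRECT,
    Actor(("Recipient",)),
]

line = render(basic_system)
print(line)
```

The printed line reproduces the middle row of the basic system diagram, with two users and two agents.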

Here is an extension of the basic system, adding a Forwarder role, played by the same actor as the Receiver. Both the Receiver/Forwarder and the MDA have a direct relationship with the Recipient, so they have an indirect relationship (wavy arrow) with each other. These details are important in discussions of authentication protocols.

         |-------- Recipient's Network ---------|
     /
--> / --> Receiver/Forwarder ~~> MDA ==> Recipient
   /
 Border

If we wonder why email continues to be such an insecure system, we can study this last example. An MDA is quite frequently a Receiver/MDA that is unaware when an incoming message has been forwarded. If the MDA runs the most common authentication checks on the incoming message, the message may be rejected as a forgery. The problem is that the Transmitter's domain name no longer correlates with the IP address seen on the incoming connection from the Forwarder.
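
The failure can be reproduced with a toy check. The sketch below assumes a simplified rule in which a message passes only if the connecting IP address belongs to a network published for the claimed sender domain; the domain name, the addresses (taken from documentation ranges), and the function name are all hypothetical.

```python
import ipaddress

# Hypothetical published data: domain -> networks allowed to transmit for it.
AUTHORIZED_NETWORKS = {
    "sender.example": [ipaddress.ip_network("192.0.2.0/24")],
}


def ip_matches_domain(claimed_domain: str, connecting_ip: str) -> bool:
    """True if the connecting IP is in a network published for the domain."""
    ip = ipaddress.ip_address(connecting_ip)
    return any(ip in net for net in AUTHORIZED_NETWORKS.get(claimed_domain, []))


TRANSMITTER_IP = "192.0.2.10"   # the sender's transmitter relay
FORWARDER_IP = "198.51.100.7"   # the Recipient's old Receiver/Forwarder

# Direct delivery: the Receiver sees the Transmitter's own address.
direct_ok = ip_matches_domain("sender.example", TRANSMITTER_IP)

# Forwarded delivery: the MDA sees the Forwarder's address, but the
# message still claims the original sender's domain.
forwarded_ok = ip_matches_domain("sender.example", FORWARDER_IP)

print(direct_ok, forwarded_ok)
```

The same legitimate message passes when delivered directly and fails after forwarding, because only the connecting address has changed.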

This is a good example of how difficult it is for security protocols to keep up with evolving system designs and changing environments. Forwarding is much more prevalent now than when the most common email authentication protocols were designed. We can no longer dismiss Forwarding as just an "edge case". It is important for a user who changes jobs or ISPs, and would like to continue receiving mail at her old address.

Message handling

Let's follow a message from start to finish. The scenario begins with an Author composing a message using a mail client on a home computer. There are numerous mail clients available, just like there are many web browsers to choose from. In fact, many web browsers now include a mail client, or at least a mechanism to invoke the user's preferred client when he clicks a mailto: link in a webpage.

When the Author clicks SEND, his mail client connects to an MSA machine at his ISP. A key responsibility of the MSA is to authenticate the Author. This can be done with a password, by assigning the client machine a static IP address, or by having the client connect through the MSA's local network rather than through the Internet. If it is necessary to connect through the open Internet, using TCP port 587, the port designated for message submission, helps to segregate requests for an authenticated connection from the flood of fraudulent attempts on port 25. After authentication, the message is transferred using SMTP.
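
A mail client's side of this exchange might look like the following sketch, written with Python's standard smtplib module. The host name, account name, and password are placeholders, and the STARTTLS step is an assumption added here to protect the password in transit; the article itself only requires authentication before transfer.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Compose a simple text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(msg: EmailMessage, host: str, username: str, password: str,
           port: int = 587) -> None:
    """Authenticate to the MSA on the submission port, then transfer the message."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()                  # assumption: the MSA offers STARTTLS
        smtp.login(username, password)   # password authentication of the Author
        smtp.send_message(msg)           # transfer using SMTP


message = build_message(
    "author@example.com", "recipient@example.net", "Hello", "Hello from the Author."
)

# Needs a real MSA and real credentials, so the call is left commented out:
# submit(message, "msa.example.com", "author", "not-a-real-password")
```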

Most large ISPs operate their own transmitter relays, but smaller companies, and organizations with a lot of bulk mail, often subcontract this specialized role to another agent. Such agents advertise their services under the name "SMTP Relay", but in this article we will use the more specific term Transmitter when we mean a role or agent, and transmitter relay when we mean the relay at the sender's side of the Border.

The Transmitter's responsibilities include prevention of outgoing spam, and providing some means to prove its identity to unrelated Receivers. It isn't enough to say "Hello, this is trustme.com". Any criminal can do that, and identity fraud has become a major problem on the Internet. The Transmitter must provide some "out-of-band" data using a service like DNS that is more secure than email. DNS records can be used to publish a public key, a list of IP addresses, or some other data that the Receiver can use to run one or more authentication methods.
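
As an illustration of the "list of IP addresses" case, the sketch below parses a TXT record string written in SPF-style syntax. The article does not name a particular method, so the record format is an assumption made for this example, the record contents are invented, and the DNS lookup itself is left out: the record is supplied as a literal string.

```python
import ipaddress


def parse_ip4_terms(txt_record: str):
    """Extract the ip4: networks from an SPF-style TXT record string."""
    networks = []
    for term in txt_record.split():
        if term.startswith("ip4:"):
            # A bare address such as 198.51.100.5 becomes a /32 network.
            networks.append(ipaddress.ip_network(term[4:], strict=False))
    return networks


def receiver_check(txt_record: str, connecting_ip: str) -> bool:
    """True if the connecting IP is listed in the published record."""
    ip = ipaddress.ip_address(connecting_ip)
    return any(ip in net for net in parse_ip4_terms(txt_record))


# Hypothetical record that the Transmitter's domain might publish in DNS.
PUBLISHED = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.5 -all"

listed = receiver_check(PUBLISHED, "192.0.2.77")
unlisted = receiver_check(PUBLISHED, "203.0.113.9")
print(listed, unlisted)
```

A real Receiver would fetch the record with a DNS query and handle the other mechanisms a record may contain; this sketch covers only the address list.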

The Receiver's responsibilities include a number of functions we might call "border defense" – blocking Denial of Service (DoS) attacks, authenticating the sender, and various spam-filtering strategies, including whitelisting, blacklisting, statistical analysis of message content, and use of heuristic rulesets that have proven effective in separating spam from legitimate mail. Border defense should be done at the Border. Loss of mail due to violations of this principle is common. A forwarded message may be rejected as a forgery, and then the Forwarder has a tough choice – drop the message with no notice to the alleged sender, or send the notice and risk being reported for "bounce spam".

The problems with misconfigured mail systems can be avoided if all actors understand their roles and responsibilities. When a Recipient sets up forwarding from his old Receiver/Forwarder to his new MDA, he should make sure that the Forwarder is whitelisted by the MDA. Forwarders should make sure that Recipients (non-expert users) understand this. MDAs should understand that forwarding is a common need, and make it easy for Recipients to whitelist their Forwarders.
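
A per-Recipient whitelist of this kind could be sketched as follows. The class and method names are hypothetical, the addresses are from documentation ranges, and the authentication result is passed in as a plain boolean rather than computed.

```python
from typing import Dict, Set


class MDA:
    """Toy Mail Delivery Agent with a Forwarder whitelist per Recipient."""

    def __init__(self) -> None:
        self._whitelists: Dict[str, Set[str]] = {}

    def whitelist_forwarder(self, recipient: str, forwarder_ip: str) -> None:
        """Let a Recipient register the address of a trusted Forwarder."""
        self._whitelists.setdefault(recipient, set()).add(forwarder_ip)

    def accepts(self, recipient: str, connecting_ip: str, auth_passed: bool) -> bool:
        """Accept mail from a whitelisted Forwarder even if authentication fails."""
        if connecting_ip in self._whitelists.get(recipient, set()):
            # Border defense is assumed to have been done by the Forwarder.
            return True
        return auth_passed


mda = MDA()
before = mda.accepts("recipient@example.net", "198.51.100.7", auth_passed=False)
mda.whitelist_forwarder("recipient@example.net", "198.51.100.7")
after = mda.accepts("recipient@example.net", "198.51.100.7", auth_passed=False)
print(before, after)
```

Skipping the check for a whitelisted Forwarder is reasonable only because that Forwarder sits at the Border and is expected to perform border defense there.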

Notes

  1. Do not confuse SMTP Relays with routers or packet switches. In this article and its subtopics, Relay will always mean an SMTP relay. Relays use SMTP/TCP/IP, and the functionality of routers is entirely encapsulated within the IP layer of this protocol stack. We can ignore routers in this discussion. They are "transparent" to SMTP.
  2. Networks of routers and links are organized into Autonomous Systems. As with routers, however, it is much simpler to think of this layer as transparent to the level we are modeling (even if the actors in our model also happen to be network owners). This is a frequent point of confusion, particularly for people who would like to hold network owners responsible for the content of the messages they carry.