Domain Name System: Difference between revisions

From Citizendium
imported>Howard C. Berkowitz
m (fixed SOA RR sentence)
mNo edit summary
 
(66 intermediate revisions by 11 users not shown)
Line 1: Line 1:
{{subpages}}
{{subpages}}
On the [[Internet]], the '''[[Domain Name System]] (DNS)''' is a critically important [[directory service]] that translates between a raw [[IP address]] (such as ''207.46.197.32'') and a domain name (such as ''microsoft.com'').  This allows people to interact with software via domain names, which are easier to remember than numerical IP addresses.


In the Internet, the '''Domain Name System (DNS)''' is a critically important [[directory service]] that translates between a raw [[IP address]] (such as ''207.46.197.32'') and a domain name (such as ''microsoft.com'').  This allows people to interact with software via easier-to-remember domain names instead of numerical IP addresses.
More importantly, it allows computer-friendly but user-unfriendly IP addresses to change without affecting human users. Thus people can still expect to find the same information behind the user-friendly ''domain names'', and need not be concerned if Microsoft Corporation changes the IP address on one of its host computers, as the domain name ''microsoft.com'' is sufficient, thanks to DNS, to find their computers regardless of which IP address the Microsoft administrator has assigned to those hosts.   


More importantly, it allows information to move around on the internet, from host to host, while people can still expect to find the information via its ''domain name''. For example, a user need not care if Microsoft Corporation changes the IP address on one of its host computers; all the user needs to know is the domain name ''microsoft.com'', and he or she can then (thanks to DNS) find Microsoft's computers regardless of which IP address has been assigned to those computers this year.
DNS is a hierarchical [[federated database]], distributed widely across many host computers on the public Internet, and it also has a set of application protocols for interacting with the database. DNS names must comply with standards on the public Internet, but need not do so in a private internet where DNS is still useful.  The original purpose of DNS was to translate a domain name to an IP address ('''forward DNS'''), and an IP address to a domain name ('''[[reverse mapping|reverse DNS]]'''),<ref name=RFC1034>{{citation
 
DNS is a hierarchical database, distributed widely across many host computers on the internet, and it also has a set of application protocols for interacting with the database.  The original purpose of DNS was to translate a domain name to an IP address ('''forward DNS'''), and an IP address to a domain name ('''[[reverse mapping|reverse DNS]]''')<ref name=RFC1034>{{citation
  | id = RFC1034  
  | id = RFC1034  
  | title = Domain names - concepts and facilities
  | title = Domain names - concepts and facilities
Line 12: Line 11:
  | date = November 1987
  | date = November 1987
  | publisher = Internet Engineering Task Force
  | publisher = Internet Engineering Task Force
}}</ref>, but in recent years there have been ongoing attempts to expand the purpose and functionality of DNS in the internet.  Further, because the lookup process for DNS superficially resembles the lookup process for searching on the World Wide Web, it has become easy to confuse the purpose of a DNS lookup with that of a search-engine lookup.  These two kinds of lookups have very different goals and occur at vastly different levels within the internet protocol stack.  This article will explain the functions and purposes of the internet Domain Name System, the nature of its distributed and hierarchical database, and the protocols for accessing it.  It will also note how the functions of DNS differ markedly from those of search engines, since this seems to be a matter of frequent confusion on the part of learners.  In lay terms, you might think of DNS as the ''white pages'' in a traditional phone book, and search engines as more like the ''yellow pages''.
}}</ref> but in recent years there have been ongoing attempts to expand the purpose and functionality of DNS in the public Internet.  Further, because the lookup process for DNS superficially resembles the lookup process for searching on the World Wide Web, it has become easy to confuse the purpose of a DNS lookup with that of a search-engine lookup.  These two kinds of lookups have very different goals and occur at vastly different levels within the internet protocol stack.  This article will explain the functions and purposes of the Domain Name System, the nature of its distributed and hierarchical database, and the protocols for accessing it.  It will also note how the functions of DNS differ markedly from those of search engines, since this seems to be a matter of frequent confusion on the part of learners.  In lay terms, you might think of DNS as the ''white pages'' in a traditional phone book, and search engines as more like the ''yellow pages''.
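
As a rough illustration of the reverse direction described above, the following Python sketch builds the name that a reverse lookup would ask about for the example address. It assumes the standard <code>in-addr.arpa</code> convention for IPv4 reverse mapping, uses only the standard-library <code>ipaddress</code> module, and does not contact any name server; the function name <code>reverse_query_name</code> is made up for this example.

```python
import ipaddress


def reverse_query_name(address: str) -> str:
    """Return the domain name queried for a reverse (PTR) lookup of address."""
    # reverse_pointer reverses the octets and appends the in-addr.arpa suffix
    # (for IPv6 it reverses the nibbles and appends ip6.arpa instead).
    return ipaddress.ip_address(address).reverse_pointer


print(reverse_query_name("207.46.197.32"))
```

A forward lookup starts from the name and asks for an address; the reverse lookup starts from this specially constructed name and asks for the host name.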


As the ''white page'' type lookup service of the internet, DNS has been attacked by hostile programs either attempting to disrupt internet traffic or divert users to illicit host machines.  The distributed and simplistic approach taken by DNS has proved, historically, surprisingly resilient against such attacks, but as the size and importance of the internet have grown, so have the security concerns related to DNS.  This article, or its related sub-articles, will also tackle security issues surrounding DNS.
As the ''white page'' type lookup service of the public Internet, DNS has been attacked by hostile programs either attempting to disrupt Internet traffic or divert users to illicit host machines.  The distributed and simplistic approach taken by DNS has proved, historically, surprisingly resilient against such attacks, but as the size and importance of the public Internet have grown, so have the security concerns related to DNS.  This article, or its related sub-articles, will also address basic [[DNS security]] issues.


{{TOC|right}}
{{TOC|right}}


==History==
==History==
DNS was first introduced for use on the internet in 1983, with the first specification written by Paul Mockapetris.<ref name=RFC882>{{citation
DNS was first introduced for use on the Internet in 1983, with the first specification written by Paul Mockapetris.<ref name=RFC882>{{citation
  | author = Mockapetris, P.V.
  | author = Mockapetris, P.V.
  | id = RFC882  
  | id = RFC882  
Line 31: Line 30:


'''Note well: all DNS was designed to do was replace the <code>hosts.txt</code> file that had the name to address mappings for <u>every</u> computer in the ARPANET.''' That's all. '''DNS was not designed to be a [[search engine]].''' Search engines hadn't been invented, since, after all, the Web had not been invented.  
'''Note well: all DNS was designed to do was replace the <code>hosts.txt</code> file that had the name to address mappings for <u>every</u> computer in the ARPANET.''' That's all. '''DNS was not designed to be a [[search engine]].''' Search engines hadn't been invented, since, after all, the Web had not been invented.  


{| class="wikitable"
{| class="wikitable"
Line 55: Line 52:


==New requirements==
==New requirements==
[[Image:Security scope.png|thumb|350px|DNS security responsibilities]]
Over the years, DNS has taken on more technical and administrative roles. These include providing additional information for the names and addresses, especially for security; the DNS infrastructure itself needed to be enhanced to be secure and trusted.<ref name=RFC4033>{{citation
Over the years, DNS has taken on more technical and administrative roles. These include providing additional information for the names and addresses, especially for security; the DNS infrastructure itself needed to be enhanced to be secure and trusted.<ref name=RFC4033>{{citation
  | id = RFC4033  
  | id = RFC4033  
Line 68: Line 66:
  | title = Representing Internet Protocol version 6 (IPv6) Addresses in the  Domain Name System (DNS)
  | title = Representing Internet Protocol version 6 (IPv6) Addresses in the  Domain Name System (DNS)
  | author = Bush, R. ''et al.''
  | author = Bush, R. ''et al.''
  | url = http://www.ietf.org/rfc/rfc1034.txt
  | url = http://www.ietf.org/rfc/rfc3363.txt
  | date = August 2002
  | date = August 2002
  | publisher = Internet Engineering Task Force
  | publisher = Internet Engineering Task Force
}}</ref>  Berkeley Internet Name Domain (BIND), first deployed in BSD 4.3 UNIX and written by Kevin Dunlap, was the first widespread DNS implementation. BIND is now public domain code supported by the Internet Systems Consortium <ref>http://www.isc.org/index.pl</ref>.
}}</ref>  Berkeley Internet Name Domain (BIND), first deployed in BSD 4.3 UNIX and written by Kevin Dunlap, was the first widespread DNS implementation. BIND is now public domain code supported by the Internet Systems Consortium.<ref>{{citation
| url = https://www.isc.org/software/bind
| publisher = Internet Systems Consortium
| title = BIND}}</ref>


In the years DNS has served, Internet technology and operational issues have changed. When the new IPv6 address format came into use, name-to-address mapping tools understandably had to change to handle that format.
In the years DNS has served, Internet technology and operational issues have changed. When the new IPv6 address format came into use, name-to-address mapping tools understandably had to change to handle that format.


Less obvious, but still necessary, is the new requirement to have a capability to track dynamically assigned addresses when there is no central address server. [[Domain Name System dynamic update]] can do such tracking, but dynamic update at this level is a security vulnerability. Address assignment spoofing is, by no means, the only threat to DNS, and an entire set of [[Domain Name System security]] ([[DNSSEC]]) extensions are being deployed.<ref name=RFC4033>{{citation
Less obvious, but still necessary, is the new requirement to have a capability to track dynamically assigned addresses when there is no central address server. [[Domain Name System dynamic update]] can do such tracking, but dynamic update at this level is a security vulnerability. Address assignment spoofing is, by no means, the only threat to DNS, and an entire set of [[Domain Name System security]] ([[DNSSEC]]) extensions are being deployed.<ref name=RFC4033 />  
| id = RFC4033
| title =DNS Security Introduction and Requirements
| author =  R. Arends, R. Austein, M. Larson, D. Massey, S. Rose
| date = March 2005
| url = http://www.ietf.org/rfc/rfc4033.txt}}</ref>
[[Image:Security scope.png|thumb|DNS security responsibilities]]


The U.S. government is requiring DNSSEC for all Federal information systems by December 2009.<ref name=OMB-DNSSEC>{{citation
The U.S. government now requires DNSSEC for all Federal information systems, effective December 2009.<ref name=OMB-DNSSEC>{{citation
  | url = http://www.whitehouse.gov/omb/memoranda/fy2008/m08-23.pdf
  | url = http://www.whitehouse.gov/omb/memoranda/fy2008/m08-23.pdf
  | date = August 22, 2008
  | date = August 22, 2008
Line 90: Line 85:


==Domain name structure and schema==
==Domain name structure and schema==
[[Image:Domain Tree.png|thumb|left|Domain Name System tree section]]
[[Image:RevUKDNS-1.png|thumb|left|350px|Domain Name System tree section]]
The DNS namespace is hierarchical. Individual domain and host names within it have a textual representation, from right to left, which mirrors the tree that makes up the schema of the DNS:
The DNS namespace is hierarchical. Individual domain and host names within it have a textual representation, from right to left, which mirrors the tree that makes up the schema of the DNS:


Line 96: Line 91:


appears to have three components, but actually has four. The naming hierarchy is a tree, with increasingly specific levels reading right to left.  
appears to have three components, but actually has four. The naming hierarchy is a tree, with increasingly specific levels reading right to left.  


From what can be seen in the textual example,
From what can be seen in the textual example,
*'''.com''' is a '''top-level domain (TLD)''' under the authority of a TLD registry.
*'''.org''' is a '''top-level domain (TLD)''' under the authority of a TLD registry.
*'''.citizendium''' is a '''second-level domain''' under the authority of a SLD registry (SLD)
*'''.citizendium''' is a '''second-level domain (SLD)''' under the authority of an SLD registry
*'''.en''' identifies either a subdomain or a host, as defined by the <code>citizendium.com</code> technical administrator.
*'''.en''' identifies either a subdomain or a host, as defined by the <code>citizendium.org</code> technical administrator.
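
The right-to-left reading of a name can be sketched in a few lines of Python. This is only an illustration: the helper name <code>hierarchy</code> is invented here, and the sketch assumes that the fourth, unwritten component is the root of the tree, shown as a single dot.

```python
def hierarchy(name: str) -> list[str]:
    """List the components of a domain name from the root down to the most specific."""
    labels = name.rstrip(".").split(".")
    # The root has an empty label, conventionally written as a single dot.
    return ["."] + labels[::-1]


print(hierarchy("en.citizendium.org"))
```

Reading the returned list from left to right walks down the tree: root, top-level domain, second-level domain, then the subdomain or host.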
Line 122: Line 113:
====DNS registries====
====DNS registries====
{{seealso|Domain Name System non-technical policy issues}}
{{seealso|Domain Name System non-technical policy issues}}
DNS registries' fundamental role is to operate the database for their top-level domain (TLD), and authorize registrars as "retail" agents to provide customer service. The bulk of TLDs are national, and use [[International Organization for Standardization]] (ISO) two-letter country codes (e.g., Canada = .ca, China = .cn, Germany = .de, United Kingdom = .uk). A few, such as Tuvalu's .tv, form attractive branding, and the country has few internal registrants but considerable income from outside registrants. 
DNS registries' fundamental role is to operate the database for their top-level domain (TLD), and authorize registrars as "retail" agents to provide customer service. The bulk of TLDs are national, and use [[International Organization for Standardization]] (ISO) two-letter country codes (e.g., Canada = <tt>'''.ca'''</tt>, China = <tt>'''.cn'''</tt>, Germany = <tt>'''.de'''</tt>). In the majority of cases these country codes must be from the ISO 3166-1 list. However, there have been a few exceptions, usually for historical reasons. For example, the ISO 3166-1 code for the United Kingdom is <tt>'''gb'''</tt>, but for historical reasons the assigned TLD is <tt>'''.uk'''</tt>. While the <tt>'''.gb'''</tt> TLD does exist, it has only one subdomain and does not accept new registrations. A few country codes, such as Tuvalu's <tt>'''.tv'''</tt>, form attractive branding, and the country has few internal registrants but considerable income from outside registrants. 


Country codes were not, at first, used, and the majority of registrations still go into the best-known .com. Some countries have a rational system where they use the "traditional" major suffix, or a variant of it, as a second-level domain, such as .co.uk, or .edu.uk. This has not always been done in an intuitive manner; would a relatively naive user expect .com.uk or '''.co.uk''', or '''.org.uk''' vs. or.uk ? <ref name=>{{citation
New TLDs are created by the [[Internet Corporation for Assigned Names and Numbers]] (ICANN), which then delegates the registry function to an organization that contracts with ICANN. Some new or proposed TLDs have been quite controversial, such as the [[.xxx domain|<tt>'''.xxx'''</tt> domain]] for [[pornography]]. Others, which offer some competitive commercial service, may take much time and effort to create, since multiple organizations may want to be the registry.
 
Remember that the public Internet, while international from the start, began as a U.S. project. A small set of non-national TLDs were created for early convenience. Country codes were not, at first, used, and the majority of registrations still go into the best-known <tt>'''.com'''</tt>. While the "<tt>.cc</tt>" country codes had gradually been used, they were formalized in the 1998 U.S. Department of Commerce White Paper about moving the U.S. government out of Internet operations.
 
Some countries have a rational system where they use the "traditional" major suffix, or a variant of it, as a second-level domain, such as <tt>'''.co.uk'''</tt>, or <tt>'''.ac.uk'''</tt>. This, however, has not always been done in an intuitive or consistent manner. A relatively naive user might expect <tt>.com.uk</tt> to be correct in line with the international <tt>'''.com'''</tt>, but <tt>'''.co.uk'''</tt> is in fact correct. Based on this the user may then think that <tt>.or.uk</tt> would be the equivalent of <tt>'''.org'''</tt>, but in this case <tt>'''.org.uk'''</tt> is correct.<ref name=>{{citation
  | url = http://www.nominet.org.uk/digitalAssets/3257_dotukrevisited.pdf
  | url = http://www.nominet.org.uk/digitalAssets/3257_dotukrevisited.pdf
  | author = Dyer, Stephen
  | author = Dyer, Stephen
  | title = .UK – Revisited
  | title = .UK – Revisited
  | date = October 1, 2004}}</ref>
  | date = October 1, 2004}}</ref> Similarly, one would expect that either <tt>.edu.uk</tt> or <tt>.ed.uk</tt> would correspond to <tt>'''.edu'''</tt>. But neither of these is correct; instead, <tt>'''.ac.uk'''</tt> is used for higher education colleges and universities, and <tt>'''.sch.uk'''</tt> for primary and secondary schools.


{| class="wikitable"
{| class="wikitable"
Line 169: Line 164:
There is a continuing business, political, and technical argument about the desirability of more TLDs, especially from those that want TLDs that are suggestive of the business purpose of a registrant.  From a technical standpoint, while a proliferation of TLDs would not, as once suspected, seriously impact DNS performance, it would be likely to increase customer support cost due to the likelihood of making mistakes and getting the wrong domain.
There is a continuing business, political, and technical argument about the desirability of more TLDs, especially from those that want TLDs that are suggestive of the business purpose of a registrant.  From a technical standpoint, while a proliferation of TLDs would not, as once suspected, seriously impact DNS performance, it would be likely to increase customer support cost due to the likelihood of making mistakes and getting the wrong domain.


Another argument, the details of which involve [[intellectual property]] issues beyond the scope of this article, is the legal theory that a [[trademark]] must be "defended" or risks going into the public domain. If a second-level domain is identical to a trademarked company name, does the company have exclusive rights to it? Intellectual property attorneys have often argued that a well-known-company is not "defending" its trademark if it allows a domain to be created with its name, so there has been a tendency that whenever some TLD ".new" is created, trademark holders rush to register "well-known-company.new".  Speculators, meanwhile, rush to do so before the trademark holder can do so, and, if successful, sell the rights to the domain at a very high price.
There are also [[#legal issues|legal issues]] of [[intellectual property]] involved in domain disputes.
 
One especially hotly argued and unresolved issue is whether sexually-oriented businesses should have a .xxx TLD; some of those arguing against it also want to restrict access to sexually-oriented content, which would be identified by the TLD. Obviously, there would be no way to enforce keeping sexually-oriented content in .xxx, but it could reasonably be assumed that if a domain were in .xxx, it was sexually-oriented.


====DNS registrars====
====DNS registrars====
Registrars are the "retail" side of DNS operation. In .com and many other TLDs, they are profit-making entities. They deal with organizations that wish to acquire particular domain names, verifying the name is available, and then handling the administrative interaction with the domain registry.
Registrars are the "retail" side of DNS operation. In .com and many other TLDs, they are profit-making entities. They deal with organizations that wish to acquire particular domain names, verifying the name is available, and then handling the administrative interaction with the domain registry.


Most registrars are reasonable and ethical. They may be subdivisions of companies that can sell additional services, such as web server hosting, to domain registrants. Frequently, they have user support functions that will help new DNS administrators set up their zone files, or they may actually operate name servers on behalf of registrants. If there is a dispute over the rights to a domain name, one's registrar can be a valuable ally.
Most registrars are reasonable and ethical. They may be subdivisions of companies that can sell additional services, such as web server hosting, to domain registrants. Frequently, they have user support functions that will help new DNS administrators set up their zone files, or they may actually operate name servers on behalf of registrants. If there is a dispute over the rights to a domain name, one's registrar can be a valuable ally.
Line 180: Line 173:
There are registrars that compete for the business of large hosting centers and other organizations that need many domain names, typically discounting the registration fee to multiple-domain customers.  It is to the advantage of a registrar to keep its existing customers, as most domains will be renewed, producing a continuing income stream. Registrars want to avoid "churn", a name for customers changing to other registrars.
There are registrars that compete for the business of large hosting centers and other organizations that need many domain names, typically discounting the registration fee to multiple-domain customers.  It is to the advantage of a registrar to keep its existing customers, as most domains will be renewed, producing a continuing income stream. Registrars want to avoid "churn", a name for customers changing to other registrars.


Some registrars, unfortunately, act against the original Internet tradition of it being a shared resource, and DNS being a service. Domain registrations expire annually, although one can pay the registrar to renew it automatically. It is not uncommon for certain registrars to look for domain names that expire in the near term, domains that were registered by a different registrar, and send the domain administrators what appear to be legitimate renewal notices. If completed and returned with payment, such a registrar will indeed renew the domain name &mdash; but transfer it away from the existing registrar.  
Some registrars, unfortunately, act against the original Internet tradition of it being a shared resource, and DNS being a service. Domain registrations expire annually, although one can pay the registrar to renew it automatically. It is not uncommon for certain registrars to look for domain names that expire in the near term, domains that were registered by a different registrar, and send the domain administrators what appear to be legitimate renewal notices. If completed and returned with payment, such a registrar will indeed renew the domain name &mdash; but transfer it away from the existing registrar.
 
===Legal and business issues associated with domain names===
===Legal and business issues associated with domain names===
When the [[ARPANET]], and then the [[Internet]], were new, DNS was seen as a simple mechanism to avoid memorizing or typing host addresses. As the Internet became more commercial, domain names acquired business value, since new users were apt to look for "company" at <code>company.com</code>. Indeed, as unpleasant to the DNS-knowledgeable ear as it may be, there are a substantial number of enterprises that have "dot-com", or sometimes other TLDs, as part of their corporate name.
When the [[ARPANET]], and then the [[Internet]], were new, DNS was seen as a simple mechanism to avoid memorizing or typing host addresses. As the Internet became more commercial, domain names acquired business value, since new users were apt to look for "company" at <code>company.com</code>. Indeed, as unpleasant to the DNS-knowledgeable ear as it may be, there are a substantial number of enterprises that have "dot-com", or sometimes other TLDs, as part of their corporate name.
Another argument, the details of which involve [[intellectual property]] issues beyond the scope of this article, is the legal theory that a [[trademark]] must be "defended" or risks going into the public domain. If a second-level domain is identical to a trademarked company name, does the company have exclusive rights to it? Intellectual property attorneys have often argued that a well-known-company is not "defending" its trademark if it allows a domain to be created with its name, so there has been a tendency that whenever some TLD "<tt>.new</tt>" is created, trademark holders rush to register "<tt>well-known-company.new</tt>".  Speculators, meanwhile, rush to do so before the trademark holder can do so, and, if successful, sell the rights to the domain at a very high price.
One especially hotly argued issue is whether sexually-oriented businesses should have a <tt>[[.xxx]]</tt> TLD; some of those arguing for it also want to restrict access to sexually-oriented content, which would be identified by the TLD. Obviously, there would be no way to enforce keeping sexually-oriented content in <tt>.xxx</tt>, but it could reasonably be assumed that, if a domain were in <tt>.xxx</tt>, it was sexually-oriented. After six years of debate the <tt>.xxx</tt> TLD was approved in June 2010, and is expected to be launched in early 2011.<ref>ICM Registry (June 25, 2010), [http://www.icmregistry.com/blog/?p=306 ICM Registry welcomes approval of .xxx]</ref>
==Name servers and zone files==
==Name servers and zone files==
A [sub]domain is a '''name space''' that need not have names in it. The basic source of name information that goes into a particular space is a '''zone file''', created manually or with software assistance.  
One of the most confusing things to newcomers to DNS is the difference between a domain and a zone. One way to look at it is that a domain declares a range of potential names, while the zone defines the names actually in use.  Formally, a [sub]domain is a '''namespace''' that need not have names in it. The basic source of name information that goes into a particular space is a '''zone file''', created manually or with software assistance.  
[[Image:Initial population.png|thumb|Populating a primary name server]]
 
Let us consider <tt>citizendium.org</tt>, which could have every valid character string as a subdomain, from <tt>aaaa.citizendium.org</tt> to <tt>zzzz.citizendium.org</tt> (shortened here for illustration). Those are domains, comparable to the Citizendium namespaces such as Main, Talk, User, and CZ, in the sense that, ignoring lengths, the Main or Talk namespaces can have articles from Aaaa to Zzzz. Not all those article names, however, are meaningful.
 
If, however, the only actual hosts are named <tt>en.citizendium.org</tt>, <tt>test.citizendium.org</tt>, <tt>reid.citizendium.org</tt>, and <tt>locke.citizendium.org</tt>, Citizendium's zone file would have only four host entries.  To continue the analogy with CZ namespaces, the zone file would be the set of articles, in each namespace, which actually exist.  Main: Zzzz is not an article; Main: Zero is an article.
 
[[Image:Initial population.png|thumb|250px|Populating a primary name server]]
Just as the DNS namespace is a tree of domains, the actual information in that namespace can be regarded as a tree of zone files.
Just as the DNS namespace is a tree of domains, the actual information in that namespace can be regarded as a tree of zone files.


Line 197: Line 201:


*Iterative: the server refers the client to another server and lets the client pursue the query; the client is aware of multiple nameservers but is only interacting with one at a time
*Iterative: the server refers the client to another server and lets the client pursue the query; the client is aware of multiple nameservers but is only interacting with one at a time
*recursive: the first server pursues the query for the client at another server; the client is aware of only one DNS server
*Recursive: the first server pursues the query for the client at another server; the client is aware of only one DNS server
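
The difference between the two query styles can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the server names, the referral chain, and the address (taken from the 192.0.2.0/24 documentation range) are invented, and no real DNS messages are exchanged.

```python
# Toy data: each "server" either knows the answer or refers to another server.
SERVERS = {
    "root": {"referral": "org-server"},
    "org-server": {"referral": "citizendium-server"},
    "citizendium-server": {"answer": "192.0.2.10"},
}


def query(server: str) -> dict:
    """Ask one server; it returns either an answer or a referral."""
    return SERVERS[server]


def iterative_lookup(start: str) -> tuple[str, list[str]]:
    """The client follows each referral itself, one server at a time."""
    contacted = []
    server = start
    while True:
        contacted.append(server)
        reply = query(server)
        if "answer" in reply:
            return reply["answer"], contacted
        server = reply["referral"]


def recursive_query(server: str) -> str:
    """The server pursues the query at other servers on the client's behalf."""
    reply = query(server)
    if "answer" in reply:
        return reply["answer"]
    return recursive_query(reply["referral"])


def recursive_lookup(start: str) -> tuple[str, list[str]]:
    """The client contacts only the first server and waits for the final answer."""
    return recursive_query(start), [start]


print(iterative_lookup("root"))
print(recursive_lookup("root"))
```

Both styles reach the same answer; what differs is the list of servers the client itself had to contact.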
===Domains versus zones===
===Domains versus zones===
At each of these levels is an abstract '''namespace'''.  No other second-level domain could have '''notcz.citizendium.com''', but the administrator of '''citizendium.com''' is not obligated to have any number of subordinate hosts or domains. There is a subtle distinction between the abstraction of a name space, and a '''zone file''' that actually defines the hosts and subdomains in the zone.
At each of the levels of the DNS hierarchy &mdash; top-level, second level, etc. &mdash; is an abstract '''namespace'''.  No other second-level domain could have <tt>notcz.citizendium.org</tt>, but the administrator of <tt>citizendium.org</tt> is not obligated to have any number of subordinate hosts or domains. There is a subtle distinction between the abstraction of a name space, and a '''zone file''' that actually defines the hosts and subdomains in the zone. Name spaces define possible records; zone files contain actual records within that space, plus a few special cases such as "glue" records to name servers outside that space.  <tt>wikipedia.citizendium.org</tt> is part of the <tt>citizendium.org</tt> namespace, but, since there is no such host, it is not in any zone file.
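
The namespace/zone distinction can be shown with a small Python sketch. The set of host names below is hypothetical zone content chosen for the example, and the two helper functions are illustrative, not part of any DNS library.

```python
ZONE_ORIGIN = "citizendium.org"
# Hypothetical zone contents: only names that actually have records.
ZONE_HOSTS = {"en", "test", "reid", "locke"}


def in_namespace(name: str) -> bool:
    """True if name falls under the domain, whether or not it exists."""
    name = name.rstrip(".").lower()
    return name == ZONE_ORIGIN or name.endswith("." + ZONE_ORIGIN)


def in_zone(name: str) -> bool:
    """True only if the zone actually defines a record for name."""
    name = name.rstrip(".").lower()
    suffix = "." + ZONE_ORIGIN
    if not name.endswith(suffix):
        return False
    host = name[: -len(suffix)]
    return host in ZONE_HOSTS


print(in_namespace("wikipedia.citizendium.org"), in_zone("wikipedia.citizendium.org"))
print(in_namespace("en.citizendium.org"), in_zone("en.citizendium.org"))
```

A name can belong to the namespace without appearing in the zone, but never the other way round.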
 
===Resource records===
===Resource records===
Zone files are made up of '''resource records (RR)'''. All RRs have several common properties:
Zone files are made up of '''resource records (RR)'''. All RRs have several common properties:
*'''owner''': the domain in which the authoritative RR resides. This is often implicitly derived from context, perhaps relative to the current domain name
*'''owner''': the domain in which the authoritative RR resides. This is often implicitly derived from context, perhaps relative to the current domain name
*'''type''': an encoded 16 bit value that defines the type of resource defined by the current records.  Some types are obsolete, while others continue to be added for new DNS functions.  
*'''type''': an encoded 16 bit value that defines the type of resource defined by the current records.  Some types are obsolete, while others continue to be added for new DNS functions.  
*'''class''': an obsolete but required field, it is a 16 bit value for the protocol family with which the RR is associated. The only value used is the Internet, textually represented as '''IN'''
*'''class''': an obsolete but required field, it is a 16 bit value for the protocol family with which the RR is associated. The only value used is the "''Internet''", textually represented as '''IN'''
*'''time to live''': commonly called '''TTL''', this parameter specifies how long the RR may be kept in a cache and assumed to be valid. It is a 32 bit integer, whose value is measured in seconds
*'''time to live''': commonly called '''TTL''', this parameter specifies how long the RR may be kept in a cache and assumed to be valid. It is a 32 bit integer, whose value is measured in seconds
*'''RDATA''': type-specific data about the resource
*'''RDATA''': type-specific data about the resource
Line 212: Line 217:
For example, the RR defining the address associated with the name XX.LCS.MIT.EDU<ref>Note that the actual RR has a terminal period that does not appear when the DNS name is written in other uses</ref>
For example, the RR defining the address associated with the name XX.LCS.MIT.EDU<ref>Note that the actual RR has a terminal period that does not appear when the DNS name is written in other uses</ref>
<center><code>'''XX.LCS.MIT.EDU. IN      A      10.0.0.44'''</code></center>
<center><code>'''XX.LCS.MIT.EDU. IN      A      10.0.0.44'''</code></center>
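
To make the common RR properties concrete, here is a minimal Python sketch that splits a line like the one above into its fields. It assumes the simple layout "owner [TTL] class type RDATA"; real zone files allow more variations (omitted owners, other field orders, multi-line records) that this sketch does not handle, and the names <code>ResourceRecord</code> and <code>parse_rr</code> are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceRecord:
    owner: str
    rr_class: str
    rr_type: str
    rdata: str
    ttl: Optional[int] = None  # seconds; None when the line omits it


def parse_rr(line: str) -> ResourceRecord:
    """Parse 'owner [ttl] class type rdata' from one zone-file style line."""
    fields = line.split()
    owner, rest = fields[0], fields[1:]
    ttl = None
    # An all-digit field right after the owner is taken to be the TTL.
    if rest[0].isdigit():
        ttl, rest = int(rest[0]), rest[1:]
    rr_class, rr_type = rest[0], rest[1]
    rdata = " ".join(rest[2:])
    return ResourceRecord(owner, rr_class, rr_type, rdata, ttl)


rr = parse_rr("XX.LCS.MIT.EDU. IN      A      10.0.0.44")
print(rr)
```

Here the owner keeps its terminal period, the class is <code>IN</code>, the type is <code>A</code>, and the RDATA is the address.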


{| class="wikitable"
{| class="wikitable"
Line 236: Line 239:
| Address [[IPv6]]
| Address [[IPv6]]
| Specifies the IPv6 address for a host
| Specifies the IPv6 address for a host
| IPv4 Address
| IPv6 Address
|-
|-
| PTR
| PTR
Line 250: Line 253:
| NS
| NS
| Name server
| Name server
| (usually) an address of a name server one level of domain hierarchy above the current domain
| (usually) An address of a name server one level of domain hierarchy above the current domain
| Address
| Address
|-
|-
Line 256: Line 259:
| Mail exchanger
| Mail exchanger
| Identifies a host that accepts mail for the owner domain
| Identifies a host that accepts mail for the owner domain
| a 16 bit preference value (lower is  better) followed by a host name willing  to act as a mail exchange for the owner    domain.
| A 16 bit preference value (lower is  better) followed by a host name willing  to act as a mail exchange for the owner    domain.
|-
|-
|}
|}
===Wildcards in Resource Records===
===Wildcards in Resource Records===
An additional complexity of RRs is that they may contain [[regular expression|wildcards]]. The simplest example is a "*" character in a name expression will match any string. In specific situations, this is an extremely useful function, but it can complicate troubleshooting.<ref name=RFC4592>{{citation
An additional complexity of RRs is that they may contain [[regular expression|wildcards]]. The simplest example is a " <tt>*</tt> " character that will match any string in a name expression. In specific situations, this is an extremely useful function, but it can complicate troubleshooting.<ref name=RFC4592>{{citation
  | id=RFC4592  
  | id=RFC4592  
  | title = The Role of Wildcards in the Domain Name System
  | title = The Role of Wildcards in the Domain Name System
Line 270: Line 274:
  | url = http://www.icann.org/en/topics/wildcard-history.html
  | url = http://www.icann.org/en/topics/wildcard-history.html
  | title = Verisign's Wildcard Service Deployment
  | title = Verisign's Wildcard Service Deployment
  | author = [[Internet Corporation for Assigned Names and Numbers]]}}</ref> If the [[World Wide Web]] alone were the only function on the [[Internet]], this might, although revenue-generating, have been useful. Unfortuntately, there are many other functions on the Internet. In particular, [[messaging application protocols]] such as the [[Simple Mail Transfer Protocol]] (SMTP) would use the "host not found" information to conclude that mail to that host was undeliverable.  
  | author = [[Internet Corporation for Assigned Names and Numbers]]}}</ref> If the [[World Wide Web]] alone were the only function on the [[Internet]], this might, although revenue-generating, have been useful. Unfortunately, there are many other functions on the Internet. In particular, [[messaging application protocols]] such as the [[Simple Mail Transfer Protocol]] (SMTP) would use the "host not found" information to conclude that mail to that host was undeliverable.  


A legitimate use for a wildcard, however, would be in a [[split DNS]] application, with different name resolution policies on different sides of a firewall. On the public [[Internet]] side of the firewall, the DNS server for <code>example.com</code> would have explicit records for the organization's public web server, mail server, and other public servers. Any reference to "inside" addresses, however, would be handled by the record:
A legitimate use for a wildcard, however, would be in a [[split DNS]] application, with different name resolution policies on different sides of a firewall. On the public [[Internet]] side of the firewall, the DNS server for <code>example.com</code> would have explicit records for the organization's public web server, mail server, and other public servers. Any reference to "inside" addresses, however, would be handled by the record:
Line 276: Line 280:
<center><code>'''*.example.com  IN A [outside address of the firewall]'''</code></center>
<center><code>'''*.example.com  IN A [outside address of the firewall]'''</code></center>
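
The split DNS behavior described above can be sketched as a lookup that prefers explicit records and falls back to a wildcard. This is a simplified illustration, not the full matching procedure of RFC 4592; the <code>PUBLIC_ZONE</code> table, the <code>lookup</code> function, and all addresses are hypothetical.

```python
# Hypothetical public-side zone data; 192.0.2.x are documentation addresses.
PUBLIC_ZONE = {
    "www.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
    "*.example.com": "192.0.2.1",   # stands in for the firewall's outside address
}


def lookup(zone, name):
    """Return the address for name, preferring an exact match over a wildcard."""
    name = name.lower().rstrip(".")
    if name in zone:
        return zone[name]
    labels = name.split(".")
    # Replace the leftmost labels, one more at each step, with "*" and retry.
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in zone:
            return zone[candidate]
    return None                      # no explicit record and no wildcard


print(lookup(PUBLIC_ZONE, "www.example.com."))      # explicit record
print(lookup(PUBLIC_ZONE, "payroll.example.com."))  # falls back to the wildcard
```

Because every otherwise unknown name now receives an answer, a client never sees "host not found" for this zone, which is the troubleshooting complication noted earlier.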


 
[[Domain Name System security]], however, does not have a complete solution to working with wildcarded RRs.
[[Domain Name System security]], however, does not have a complete solution to working with wildcarded RRs.


==Deploying DNS==
==Deploying DNS==
To understand basic DNS, assume that it is being used in a single organization, which has one technical and administrative authority in control. In other words, the domain and its subdomains are homogeneous. While there may be minor exceptions due to the existence of temporarily cached data in individual clients and servers, and not all clients and servers may be able to view all parts of the highest-level domain, a single organization's DNS is essentially a [[distributed data base]], where there are multiple copies of a single "golden copy" of information.
To understand basic DNS, assume that it is being used in a single organization, which has one technical and administrative authority in control. In other words, the domain and its subdomains are homogeneous. While there may be minor exceptions due to the existence of temporarily cached data in individual clients and servers, and not all clients and servers may be able to view all parts of the highest-level domain, a single organization's DNS is essentially a [[distributed database]], where there are multiple copies of a single "golden copy" of information.


Once one starts interconnecting domains under different authority, as in the Internet, both administrative and technical aspects change. First, it is understood that while the total collection of all domains conceptually have access to all public name information, no one domain will have a copy of all information. Rather than being a distributed data base, it has become a [[federated data base]], where there is a common indexing and retrieval model, but requests may need to go to multiple servers, in multiple domains and subdomains, before the request is satisfied.
Once one starts interconnecting domains under different authority, as in the Internet, both administrative and technical aspects change. First, it is understood that while the total collection of all domains conceptually has access to all public name information, no single domain will have a copy of all information. Rather than being a distributed data base, it has become a [[federated data base]], where there is a common indexing and retrieval model, but requests may need to go to multiple servers, in multiple domains and subdomains, before the request is satisfied.


Second, even between well-recognized business partner organizations, there are trust issues. Third, there are [[miscreant]]s actively attacking the DNS, for reasons from ideology to technical status to pure criminal revenue.
Second, even between well-recognized business partner organizations, there are trust issues. Third, there are [[miscreant]]s actively attacking the DNS, for reasons from ideology to technical status to pure criminal revenue.
===Basic Implementation===
===Basic Implementation===
The administrator of a homegeneous domain (and its subdomains) starts by building a zone file that defines the names and addresses of hosts in that zone, optional additional information to be added to the responses, and to a higher-level nameserver that helps connect the domain of the zone to other domains. For example, if one was in <code>'''a.com'''</code> , one would have to go to the nameserver of <code>'''.com'''</code> to find the address of the <code>'''b.com'''</code> nameserver.
The administrator of a homogeneous domain (and its subdomains) starts by building a zone file that defines the names and addresses of hosts in that zone, supplies optional additional information to be added to the responses, and points to a higher-level nameserver that helps connect the domain of the zone to other domains. For example, if one were in <code>'''a.com'''</code>, one would have to go to the nameserver of <code>'''.com'''</code> to find the address of the <code>'''b.com'''</code> nameserver.


====SOA RR====
====SOA Resource Record====
The zone/domain name starts the record; it must end with a trailing period. Assume that it is <code>sub.example.com.</code>
The zone/domain name starts the record; it must end with a trailing period. Assume that it is <code>sub.example.com.</code>


In the resource data, the first field is the primary name server that is <u>''in</u>'' this domain, as opposed to the name server in the ''NS'' record, which is <u>''above and outside</u>'' the current domain. In this case, it might be <code>ns1.sub.example.com.</code>  
In the resource data, the first field is the primary name server that is <u>''in</u>'' this domain, as opposed to the name server in the ''NS'' record, which is <u>''above and outside</u>'' the current domain. In this case, it might be <code>ns1.sub.example.com.</code>


Next comes the mail address of the person or [[role]] responsible for the data in this domain, written not in the conventional <code>user@domain</code>, but in the syntax of a DNS name in a zone file. To create a mail address, replace the leftmost period with an "@" symbol and remove the trailing period. <br>
Next comes the mail address of the person or [[role]] responsible for the data in this domain, written not in the conventional <code>user@domain</code>, but in the syntax of a DNS name in a zone file. To create a mail address, replace the leftmost period with an "@" symbol and remove the trailing period. <br>
<code>administrator.sub.example.com.</code> is changed to <code>administrator@sub.example.com</code>
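
The conversion rule just described can be sketched in a few lines. The function name <code>soa_rname_to_email</code> is hypothetical, and the sketch assumes the mailbox part contains no escaped periods, which a complete implementation would have to handle.

```python
def soa_rname_to_email(rname: str) -> str:
    """Convert the SOA responsible-person name to user@domain form."""
    # Remove the trailing period that marks the name as fully qualified.
    name = rname[:-1] if rname.endswith(".") else rname
    # The leftmost period separates the mailbox from the mail domain.
    local, sep, domain = name.partition(".")
    if not sep or not local or not domain:
        raise ValueError(f"not a usable mailbox name: {rname!r}")
    return f"{local}@{domain}"


print(soa_rname_to_email("administrator.sub.example.com."))
```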
" <code>administrator.sub.example.com.</code> " is changed to " <code>administrator@sub.example.com</code> ".


Following the administrator are several parameters that may have defaults, but should be known. The first is the serial number of this version of the zone file, which will increase whenever this file is updated.
Following the administrator are several parameters that may have defaults, but should be known. The first is the serial number of this version of the zone file, which will increase whenever this file is updated.
Line 304: Line 307:
*'''TTL''': The default TTL for RRs in this zone. An appropriate TTL is controversial, and may be quite different on an internal nameserver versus one accessible from the Internet. The shorter the interval, the more accurate is the data, and, further, the better it is for name-based load distribution schemes. The longer the interval, the less DNS traffic is generated
*'''TTL''': The default TTL for RRs in this zone. An appropriate TTL is controversial, and may be quite different on an internal nameserver versus one accessible from the Internet. The shorter the interval, the more accurate is the data, and, further, the better it is for name-based load distribution schemes. The longer the interval, the less DNS traffic is generated


====NS RR====
====Other Resource Records====
gives the IP address of a hierarchically higher name server to which the name server goes when it cannot complete a name-to-address or address-to-name mapping based on its own information.
; NS : gives the IP address of a hierarchically higher name server to which the name server goes when it cannot complete a name-to-address or address-to-name mapping based on its own information.
====A and AAAA RR====
; A ''and'' AAAA : code the authoritative host name and its address, and, optionally, the TTL if different from the zone TTL.
Code the authoritative host name and its address, and, optionally, the TTL if different from the zone TTL.
; PTR : code an address and the corresponding host name, and, optionally, the TTL if different from the zone TTL.
====PTR RR====
; CNAME : code an alternative host name and its address, and, optionally, the TTL if different from the zone TTL.
Code an address and the corresponding host name, and, optionally, the TTL if different from the zone TTL.
 
====CNAME RR====
Code an alternative host name and its address, and, optionally, the TTL if different from the zone TTL.
===Resource Record sets (RRsets)===
===Resource Record sets (RRsets)===
While no two RRs should have the same label and type and data all equal, it is perfectly possible to have RRs with the same label and type, but different RDATA. For example, a physically multihomed server could have four network interface cards (NIC), each on a different subnet. The set of addresses for this host name (i.e., label) would reasonably form a set of four A records with different address data. Such a set of records is called a  '''Resource Record Set''' (RRSet). <ref name=RFC2181>{{citation
While no two RRs should have the same label and type and data all equal, it is perfectly possible to have RRs with the same label and type, but different RDATA. For example, a physically multihomed server could have four network interface cards (NIC), each on a different subnet. The set of addresses for this host name (i.e., label) would reasonably form a set of four A records with different address data. Such a set of records is called a  '''Resource Record Set''' (RRSet). <ref name=RFC2181>{{citation
  | id = RFC 2181
  | id = RFC2181
  | title = Clarifications to the DNS Specification  
  | title = Clarifications to the DNS Specification  
  | author = R. Elz, R. Bush
  | author = R. Elz, R. Bush
Line 320: Line 321:
  | date = July 1997
  | date = July 1997
  | publisher = Internet Engineering Task Force
  | publisher = Internet Engineering Task Force
}}</ref>
}}</ref>
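
The grouping of records into RRsets can be sketched as follows. The host name, the four addresses, and the <code>build_rrsets</code> function are hypothetical and serve only to illustrate the multihomed server example.

```python
from collections import defaultdict

# Hypothetical records for a server with four interfaces; the last entry
# repeats an earlier RR exactly and should not appear twice in the RRset.
RECORDS = [
    ("multi.example.com.", "A", "10.1.0.5"),
    ("multi.example.com.", "A", "10.2.0.5"),
    ("multi.example.com.", "A", "10.3.0.5"),
    ("multi.example.com.", "A", "10.4.0.5"),
    ("multi.example.com.", "A", "10.1.0.5"),
]


def build_rrsets(records):
    """Group (label, type, rdata) tuples into RRsets keyed by (label, type)."""
    rrsets = defaultdict(list)
    for label, rr_type, rdata in records:
        key = (label.lower(), rr_type.upper())
        if rdata not in rrsets[key]:   # skip RRs identical in label, type and data
            rrsets[key].append(rdata)
    return dict(rrsets)


rrsets = build_rrsets(RECORDS)
print(rrsets[("multi.example.com.", "A")])
```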


===Obtaining root information===
===Obtaining root information===
The root name server zone file is expected to be retrieved, by anonymous [[FTP]], from various well-known sites approved by ICANN. In practice, most DNS implementations ship with a recent copy. Root servers remain very busy. <ref name=DNS-BIND /> In fact, while the root server zone file mentioned above will give the names and addresses of root servers in the general form  
The root name server zone file is expected to be retrieved, by anonymous [[FTP]], from various well-known sites approved by ICANN. In practice, most DNS implementations ship with a recent copy. Root servers remain very busy. <ref name=DNS-BIND /> In fact, while the root server zone file mentioned above will give the names and addresses of root servers in the general form  


<center><code>a.root-servers.net</code></center>
<center><code>a.root-servers.net</code></center>


the address of a particular server is of the [[anycast]] type; <ref>{{citation  
the address of a particular server is of the [[anycast]] type; <ref>{{citation  
Line 335: Line 334:


For each domain, there must be at least one, and preferably more than one '''name server''' that holds the zone files. '''Primary''' domain servers have the authoritative zone files, and '''secondary''' domain servers keep an exact copy of the primary's zone file. Both types are assumed to have a disk or other storage from which they can restore the domain information.
For each domain, there must be at least one, and preferably more than one '''name server''' that holds the zone files. '''Primary''' domain servers have the authoritative zone files, and '''secondary''' domain servers keep an exact copy of the primary's zone file. Both types are assumed to have a disk or other storage from which they can restore the domain information.
[[Image:Initial population with trusted externals.png|thumb|left|Zone transfer adds to populating a server database]]
[[Image:Initial population with trusted externals.png|thumb|300px|left|Zone transfer adds to populating a server database]]
 
 
 


A secondary server will use a '''zone transfer''' to obtain the primary zone file for its domain. There are various operational reasons why a physical server might act as primary and secondary for multiple zones; the important point here is that a zone transfer, as opposed to ordinary DNS retrieval, alters the contents of the definitions and must be treated as a sensitive operation.
A secondary server will use a '''zone transfer''' to obtain the primary zone file for its domain. There are various operational reasons why a physical server might act as primary and secondary for multiple zones; the important point here is that a zone transfer, as opposed to ordinary DNS retrieval, alters the contents of the definitions and must be treated as a sensitive operation.
[[Image:Including dynamic updateV2.png|thumb|Adding trusted dynamic updates]]
[[Image:Including dynamic updateV2.png|250px|thumb|Adding trusted dynamic updates]]
The nameserver also can take dynamic transfers, which, strictly speaking, do not have to be secured, but dynamic update, especially in a IPv6 environment, is so open an invitation to miscreants that it should never be considered without being secured. DNS security is the normal way this might be done, but there are other alternatives, such as an encrypted link between the update source and the nameserver.
The nameserver can also accept dynamic updates. Strictly speaking, these do not have to be secured, but dynamic update, especially in an IPv6 environment, is so open an invitation to miscreants that it should never be considered without being secured. DNS security is the normal way this might be done, but there are other alternatives, such as an encrypted link between the update source and the nameserver.
 
 
 


There are also '''caching-only''' servers that contain only the names and addresses that have been recently looked up, and are still valid with respect to the TTL parameter in the relevant records.
There are also '''caching-only''' servers that contain only the names and addresses that have been recently looked up, and are still valid with respect to the TTL parameter in the relevant records.
[[Image:Distribution within zone.png|left|thumb|Resolvers, their caches, and their information sources]]
[[Image:Distribution within zone.png|300px|left|thumb|Resolvers, their caches, and their information sources]]
The program on a host that acts as the client of DNS servers is most often called a '''resolver'''.  Depending on the local network architectural implementation, a resolver may go to a caching-only server, a secondary server, or the primary server for its information. It may retain a cache of recently retrieved DNS information, clearing items from cache as their TTLs expire.
The program on a host that acts as the client of DNS servers is most often called a '''resolver'''.  Depending on the local network architectural implementation, a resolver may go to a caching-only server, a secondary server, or the primary server for its information. It may retain a cache of recently retrieved DNS information, clearing items from cache as their TTLs expire.
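
The cache behavior described above, in which entries are discarded once their TTL has elapsed, might look like the following minimal sketch. The <code>ResolverCache</code> class and its injectable <code>clock</code> parameter are hypothetical; real resolvers also handle negative answers, RRsets, and size limits.

```python
import time


class ResolverCache:
    """Minimal TTL-based cache of resolved records (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so that expiry can be tested
        self._entries = {}       # (name, rr_type) -> (expires_at, rdata)

    def put(self, name, rr_type, rdata, ttl):
        """Store rdata; it is considered valid for ttl seconds from now."""
        self._entries[(name.lower(), rr_type)] = (self._clock() + ttl, rdata)

    def get(self, name, rr_type):
        """Return cached rdata, or None if it is absent or has expired."""
        key = (name.lower(), rr_type)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, rdata = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # TTL has run out: clear the item
            return None
        return rdata


cache = ResolverCache()
cache.put("www.example.com.", "A", "192.0.2.10", ttl=300)
print(cache.get("www.example.com.", "A"))
```

A miss (a return value of <code>None</code>) is the point at which a resolver would go back to a caching-only, secondary, or primary server.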


===Heterogeneous DNS===
===Heterogeneous DNS===
{{main|Split DNS|}}
While there will be different federated databases, DNS is certainly not limited to the public Internet. It is quite common for organizations to have '''split DNS''' "inside the firewall" and "outside the firewall".  An inside user will query local DNS for the address of an internal machine and get the address of the actual host, but, if it asks for the address of <code>citizendium.com</code>, the address returned by DNS may well be that of the "inside" interface of a [[firewall]], or other security [[middlebox]]<ref name=RFC3303>{{citation
While there will be different federated databases, DNS is certainly not limited to the public Internet. It is quite common for organizations to have '''split DNS''' "inside the firewall" and "outside the firewall".  An inside user will query local DNS for the address of an internal machine and get the address of the actual host, but, if it asks for the address of <code>citizendium.com</code>, the address returned by DNS may well be that of the "inside" interface of a [[firewall]], or other security [[middlebox]]<ref name=RFC3303>{{citation
  | id = RFC3303
  | id = RFC3303
Line 374: Line 368:
  | title = Dynamic Updates in the Domain Name System (DNS UPDATE)
  | title = Dynamic Updates in the Domain Name System (DNS UPDATE)
  | editor = Vixie, P.
  | editor = Vixie, P.
  | url = http://www.ietf.org/rfc/rfc4033.txt
  | url = http://www.ietf.org/rfc/rfc2136.txt
  | date = April 1997
  | date = April 1997
  | publisher = Internet Engineering Task Force
  | publisher = Internet Engineering Task Force
Line 380: Line 374:


==Extended applications==
==Extended applications==
These include [[Domain Name System dynamic update]], use of the DNS as a data base in [[Public Key Infrastructure]] for security, [[Domain Name System security]] ([[DNSSEC]]) and name-based routing and load distribution.  
These include [[Domain Name System dynamic update]], use of the DNS as a data base in [[public key infrastructure|Public Key Infrastructure (PKI)]] for general security, [[Domain Name System security]] ([[DNSSEC]]) and name-based routing and load distribution.
 
==References==
==References==
{{reflist|2}}
{{reflist|2}}
[[Category:Flagged for Review]][[Category:Suggestion Bot Tag]]

Latest revision as of 06:01, 8 August 2024

This article has a Citable Version.
This editable Main Article has an approved citable version (see its Citable Version subpage). While we have done conscientious work, we cannot guarantee that this Main Article, or its citable version, is wholly free of mistakes. By helping to improve this editable Main Article, you will help the process of generating a new, improved citable version.

On the Internet, the Domain Name System (DNS) is a critically important directory service that translates to and from a raw IP address (such as 207.46.197.32) and a domain name (such as microsoft.com). This allows people to interact with software via domain names, which are easier to remember than numerical IP addresses.

More importantly, it allows computer-friendly but user-unfriendly IP addresses to change without affecting human users. Thus people can still expect to find the same information behind the user-friendly domain names, and need not be concerned if Microsoft Corporation changes the IP address on one of its host computers, as the domain name microsoft.com is sufficient, thanks to DNS, to find their computers regardless of which IP address the Microsoft administrator has assigned to those hosts.

DNS is a hierarchical federated database, distributed widely across many host computers on the public Internet, and it also has a set of application protocols for interacting with the database. DNS names must comply with standards on the public Internet, but need not do so in a private internet where DNS is still useful. The original purpose of DNS was to translate a domain name to an IP address (forward DNS), and an IP address to a domain name (reverse DNS),[1] but in recent years there have been ongoing attempts to expand the purpose and functionality of DNS in the public Internet. Further, because the lookup process for DNS superficially appears to resemble the lookup process for searching on the world wide web, it has become easy to confuse the purposes of a DNS lookup with a search-engine lookup. These two kinds of lookups have very different goals and occur at vastly different levels within the internet protocol stack. This article will explain the functions and purposes of the Domain Name System, the nature of its distributed and hierarchical database, and the protocols for accessing it. It will also note how the functions of DNS differ markedly from those of search engines, since this seems to be a matter of frequent confusion on the part of learners. In lay terms, you might think of DNS as like the white pages in a traditional phone book, and search engines as more like the yellow pages.

As the white page type lookup service of the public Internet, DNS has been attacked by hostile programs either attempting to disrupt Internet traffic or divert users to illicit host machines. The distributed and simplistic approach taken by DNS has proved, historically, surprisingly resilient against such attacks, but as the size and importance of the public Internet has grown, so have the security concerns related to DNS. This article, or its related sub-articles, will also address basic DNS security issues.

History

DNS was first introduced for use on the Internet in 1983, with the first specification written by Paul Mockapetris.[2] Mockapetris' first DNS implementation was called JEEVES. DNS replaced the arrangement used on the ARPANET (the pre-Internet environment), which had few enough computers that a single file, hosts.txt, was sufficient to contain all connected computer names and their numeric addresses.[3] Its designers, however, did not think of it as anything like a search engine, able to seek a name corresponding to an idea (e.g. "pizza"); it was meant to work with explicit names already known by the application. Manually maintaining and sharing host files became impractical as the scale of the Internet grew, and DNS was designed and implemented as the solution to the problem of scalable host name resolution.

Note well: all DNS was designed to do was replace the hosts.txt file that had the name to address mappings for every computer in the ARPANET. That's all. DNS was not designed to be a search engine. Search engines hadn't been invented, since, after all, the Web had not been invented.

Original design goals for DNS
{| class="wikitable"
! Protocol designers
! Name & address authorities
! System administrators
|-
| Standard formats for resource data
| Addresses for the root servers
| The definition of zone boundaries
|-
| Standard methods for querying the database
| Unique assignments of domain names
| Master files of data (i.e., sets of Resource Records (RRs))
|-
| Standard methods for name servers to refresh local data from foreign name servers
| Operation, perhaps with delegation, of the root servers and top-level domain servers
| Statements of the refresh policies desired
|}

New requirements

DNS security responsibilities

Over the years, DNS has taken on more technical and administrative roles. These include providing additional information for the names and addresses, especially for security; the DNS infrastructure itself needed to be enhanced to be secure and trusted.[4] DNS originally was manually configured, but there have been a variety of extensions to allow dynamic operation, such as the temporary binding of an address to a name.

The domain name space, as well as the address spaces both for Internet Protocol version 4 (IPv4) and Internet Protocol version 6 (IPv6), are under the authority of the Internet Corporation for Assigned Names and Numbers (ICANN), with much delegation of administration. The original system only handled IPv4, so one of the first steps for IPv6 support was defining how to represent IPv6 addresses in DNS. [5] Berkeley Internet Name Domain (BIND), first deployed in BSD 4.3 UNIX and written by Kevin Dunlap, was the first widespread DNS implementation. BIND is now public domain code supported by the Internet Software Consortium [6]

In the years DNS has served, Internet technology and operational issues changed. When the new IPv6 address format came into use, the need to change name-to-address mapping tools to handle that format is understandable.

Less obvious, but still necessary, is the new requirement to have a capability to track dynamically assigned addresses when there is no central address server. Domain Name System dynamic update can do such tracking, but dynamic update at this level is a security vulnerability. Address assignment spoofing is, by no means, the only threat to DNS, and an entire set of Domain Name System security (DNSSEC) extensions are being deployed.[4]

The U.S. government now requires DNSSEC for all Federal information systems, effective December 2009.[7]

Domain name structure and schema

Domain Name System tree section

The DNS namespace is hierarchical. Individual domain and host names within it have a textual representation, from right to left, which mirrors the tree that makes up the schema of the DNS:

en.citizendium.com


appears to have three components, but actually has four. The naming hierarchy is a tree, with increasingly specific levels reading right to left.

From what can be seen in the textual example,

  • .com is a top-level domain (TLD) under the authority of a TLD registry.
  • .citizendium is a second-level domain (SLD) under the authority of an SLD registry.
  • .en identifies either a subdomain or a host, as defined by the citizendium.com technical administrator.

What cannot be seen is the hierarchically highest, "zeroth" part: the root. If this usually suppressed part were displayed, the name would read:

en.citizendium.com.

The rightmost dot identifies the root of the DNS tree. In actual practice, there are multiple root servers, for which addresses are in an explicit file, a representative of which is found at http://www.internic.net/zones/named.root

It is defined as:

This file holds the information on root name servers needed to initialize cache of Internet domain name servers (e.g. reference this file in the "cache . <file>" configuration file of BIND domain name servers).

A fully qualified domain name can be traced from the hierarchically lowest host name to the root. For example, en.citizendium.org goes from the host en all the way up to the top-level domain .org, which is connected to the root.

A computer within the second-level domain citizendium.org could refer to the subdomain en, which would be a relative domain name; most DNS applications would append the current domain to the right of the host name. k12.en.citizendium.org is a hypothetical subdomain of en.citizendium.org; an arbitrary host could be larry.en.citizendium.org and the DNS software would understand if it is dealing with a host or a domain.
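
The two ideas above, tracing a fully qualified name up to the root and completing a relative name with the current domain, can be sketched as follows. The function names <code>ancestors</code> and <code>qualify</code> are hypothetical, and the sketch ignores details such as search lists that real resolvers may apply.

```python
def ancestors(fqdn: str):
    """List a name and each enclosing domain, ending with the root (".")."""
    labels = [label for label in fqdn.rstrip(".").split(".") if label]
    # Drop one leftmost label per step: host, then each parent domain.
    chain = [".".join(labels[i:]) + "." for i in range(len(labels))]
    chain.append(".")                 # the normally suppressed root
    return chain


def qualify(name: str, current_domain: str) -> str:
    """Make a relative name fully qualified by appending the current domain."""
    if name.endswith("."):
        return name                   # already fully qualified
    return f"{name}.{current_domain.rstrip('.')}."


print(ancestors("en.citizendium.org"))
print(qualify("en", "citizendium.org"))
```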

Domain name authority and issues

Name assignment

The administrative process of DNS name assignment involves both DNS registries and DNS registrars.

DNS registries

See also: Domain Name System non-technical policy issues

DNS registries' fundamental role is to operate the data base for their top-level domain (TLD), and authorize registrars as "retail" agents to provide customer service. The bulk of TLDs are national, and use International Organization for Standardization (ISO) two-letter country codes (e.g., Canada = .ca, China = .cn, Germany = .de). In the majority of cases these country codes must be from the ISO 3166-1 list. However, there have been a few exceptions, usually for historical reasons. For example the ISO 3166-1 code for the United Kingdom is gb, but for historical reasons the assigned TLD is .uk. While the .gb TLD does exist, it has only one subdomain and does not accept new registrations. A few country codes, such as Tuvalu's .tv, form attractive branding, and the country has few internal registrants but considerable income from outside registrants.

New TLDs are created by the Internet Corporation for Assigned Names and Numbers (ICANN), who then delegates the registry function to an organization that contracts with ICANN. Some new or proposed TLDs have been quite controversial, such as the .xxx domain for pornography. Others, which offer some competitive commercial service, may take much time and effort to create, since multiple organizations may want to be the registry.

Remember that the public Internet, while international from the start, began as a U.S. project. A small set of non-national TLDs were created for early convenience. Country codes were not, at first, used, and the majority of registrations still go into the best-known .com. While the ".cc" country codes had gradually been used, they were formalized in the 1998 U.S. Department of Commerce White Paper about moving the U.S. government out of Internet operations.

Some countries have a rational system where they use the "traditional" major suffix, or a variant of it, as a second-level domain, such as .co.uk, or .ac.uk. This, however, has not always been done in an intuitive or consistent manner. A relatively naive user might expect .com.uk to be correct in line with the international .com, but .co.uk is in fact correct. Based on this the user may then think that .or.uk would be the equivalent of .org, but in this case .org.uk is correct.[8] Similarly one would expect that either .edu.uk or .ed.uk would correspond to .edu. But neither of these are correct, and instead .ac.uk is used for higher education colleges and universities, and .sch.uk for primary and secondary schools.

Representative non-national TLD registries

Top-level domain | Registry | Comments
.aero | Société Internationale de Télécommunications Aéronautiques SC (SITA) | Sponsored by air transport industry
.com | Verisign | Unsponsored
.edu | Educause | Under U.S. government agreement, ending in 2011
.net | Verisign | Unsponsored
.mil | Defense Information Systems Agency | U.S. government agency
.org | Public Interest Registry (PIR) | Unsponsored; not-for-profit
.biz | NeuLevel, Inc. | Unsponsored

There is a continuing business, political, and technical argument about the desirability of more TLDs, especially from those that want TLDs that are suggestive of the business purpose of a registrant. From a technical standpoint, while a proliferation of TLDs would not, as once suspected, seriously impact DNS performance, it would be likely to increase customer support cost due to the likelihood of making mistakes and getting the wrong domain.

There are also legal issues of intellectual property involved in domain disputes.

DNS registrars

Registrars are the "retail" side of DNS operation. In .com and many other TLDs, they are profit-making entities. They deal with organizations that wish to acquire particular domain names, verifying the name is available, and then handling the administrative interaction with the domain registry.

Most registrars are reasonable and ethical. They may be subdivisions of companies that can sell additional services, such as web server hosting, to domain registrants. Frequently, they have user support functions that will help new DNS administrators set up their zone files, or they may actually operate name servers on behalf of registrants. If there is a dispute over the rights to a domain name, one's registrar can be a valuable ally.

There are registrars that compete for the business of large hosting centers and other organizations that need many domain names, typically discounting the registration fee to multiple-domain customers. It is to the advantage of a registrar to keep its existing customers, as most domains will be renewed, producing a continuing income stream. Registrars want to avoid "churn", a name for customers changing to other registrars.

Some registrars, unfortunately, act against the original Internet tradition that the network is a shared resource and DNS a service. Domain registrations expire annually, although one can pay the registrar to renew them automatically. It is not uncommon for certain registrars to look for domain names, registered through a different registrar, that are about to expire, and to send the domain administrators what appear to be legitimate renewal notices. If the notice is completed and returned with payment, such a registrar will indeed renew the domain name, but will also transfer it away from the existing registrar.

Legal and business issues associated with domain names

When the ARPANET, and then the Internet, were new, DNS was seen as a simple mechanism to avoid memorizing or typing host addresses. As the Internet became more commercial, domain names acquired business value, since new users were apt to look for "company" at company.com. Indeed, as unpleasant to the DNS-knowledgeable ear as it may be, there are a substantial number of enterprises that have "dot-com", or sometimes other TLDs, as part of their corporate name.

Another argument, the details of which involve intellectual property issues beyond the scope of this article, is the legal theory that a trademark must be "defended" or risk going into the public domain. If a second-level domain is identical to a trademarked company name, does the company have exclusive rights to it? Intellectual property attorneys have often argued that a well-known company is not "defending" its trademark if it allows a domain to be created with its name, so whenever some TLD ".new" is created, trademark holders tend to rush to register "well-known-company.new". Speculators, meanwhile, rush to register the name before the trademark holder can, and, if successful, sell the rights to the domain at a very high price.

One especially hotly argued issue is whether sexually-oriented businesses should have a .xxx TLD; some of those arguing for it also want to restrict access to sexually-oriented content, which would be identified by the TLD. Obviously, there would be no way to enforce keeping sexually-oriented content in .xxx, but it could reasonably be assumed that, if a domain were in .xxx, it was sexually-oriented. After six years of debate the .xxx TLD was approved in June 2010, and is expected to be launched in early 2011.[9]

Name servers and zone files

One of the most confusing things to newcomers to DNS is the difference between a domain and a zone. One way to look at it is that a domain declares a range of potential names, while the zone defines the names actually in use. Formally, a [sub]domain is a namespace that need not have names in it. The basic source of name information that goes into a particular space is a zone file, created manually or with software assistance.

Let us consider citizendium.org, which could have every valid character string as a subdomain, from, say, aaaa.citizendium.org to zzzz.citizendium.org. Those are domains, comparable to the Citizendium name spaces such as Main, Talk, User, and CZ, in the sense that, ignoring lengths, the Main or Talk name spaces can have articles from Aaaa to Zzzz. Not all those article names, however, are meaningful.

If, however, the only actual hosts are en.citizendium.org, test.citizendium.org, reid.citizendium.org, and locke.citizendium.org, Citizendium's zone file would have only four host entries. To continue the analogy with CZ name spaces, the zone file would be the set of articles, in each name space, which actually exist. Main: Zzzz is not an article; Main: Zero is an article.

Populating a primary name server

Just as the DNS namespace is a tree of domains, the actual information in that namespace can be regarded as a tree of zone files.

Name servers are computers that contain information about domains, all the way up to the root. Be sure to understand the difference between the abstraction of a domain or subdomain namespace, and the zone file that describes the contents of that namespace and actually runs in a name server. The primary name server is authoritative for domains, and contains the master copy of the zone file for that domain.

Name servers can contain more than one zone file; indeed, this is the usual case when there are domains with subdomains.

Depending on the implementation, a name server may cache information in addition to what it learned from the zone file. For example, a local cache file in a name server could contain data about name-address relationships outside the domain, but which have been needed by a client within that domain. The name server may also contain limited-lifetime dynamic name updates, which might or might not be accessible from outside the domain.

RFC1034, the basic DNS conceptual specification, describes two ways of looking up names: iterative, which every name server must support, and recursive, which is optional.[10] The same logic is relevant inside a domain that has caching nameservers.

  • Iterative: the server refers the client to another server and lets the client pursue the query; the client is aware of multiple nameservers but is only interacting with one at a time
  • Recursive: the first server pursues the query for the client at another server; the client is aware of only one DNS server
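
The difference between the two lookup styles can be illustrated with a toy model. The following Python sketch is not a real resolver: the three "servers" are in-memory dictionaries, and the server names and the address 192.0.2.10 are hypothetical values chosen for illustration.

```python
# Toy model: each "server" holds some answers and some referrals.
# Server names and addresses are hypothetical illustration data.
SERVERS = {
    "root": {"referrals": {"org.": "org-server"}, "answers": {}},
    "org-server": {"referrals": {"citizendium.org.": "cz-server"}, "answers": {}},
    "cz-server": {"referrals": {}, "answers": {"en.citizendium.org.": "192.0.2.10"}},
}


def query(server, name):
    """Ask one server once.

    Returns ("answer", address), ("referral", next_server) or ("nxdomain", None).
    """
    data = SERVERS[server]
    if name in data["answers"]:
        return ("answer", data["answers"][name])
    for suffix, next_server in data["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("nxdomain", None)


def resolve_iterative(name, start="root"):
    """Iterative: the client follows each referral itself, one server at a time."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind == "referral":
            server = value
        else:
            return value, contacted


def resolve_recursive(server, name):
    """Recursive: the server pursues referrals, so the client sees one server only."""
    kind, value = query(server, name)
    if kind == "referral":
        return resolve_recursive(value, name)
    return value
```

In the iterative case the client ends up knowing every server it was referred to; in the recursive case it only ever addresses the first one.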

Domains versus zones

At each of the levels of the DNS hierarchy — top-level, second level, etc. — is an abstract namespace. No other second-level domain could have notcz.citizendium.org, but the administrator of citizendium.org is not obligated to have any number of subordinate hosts or domains. There is a subtle distinction between the abstraction of a name space, and a zone file that actually defines the hosts and subdomains in the zone. Name spaces define possible records; zone files contain actual records within that space, plus a few special cases such as "glue" records to name servers outside that space. wikipedia.citizendium.org is part of the citizendium.org namespace, but, since there is no such host, it is not in any zone file.
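
The namespace/zone distinction can be made concrete with a short Python sketch. It assumes the four host names used as examples earlier in this article, and models the zone as nothing more than a set of labels; a real zone file holds full resource records.

```python
DOMAIN = "citizendium.org"
# Hosts assumed to be defined in the zone (the article's illustrative examples).
ZONE_HOSTS = {"en", "test", "reid", "locke"}


def in_namespace(name, domain=DOMAIN):
    """True if the name falls inside the domain's namespace, whether or not it exists."""
    name = name.rstrip(".").lower()
    return name == domain or name.endswith("." + domain)


def in_zone(name, domain=DOMAIN, hosts=ZONE_HOSTS):
    """True only if the zone data actually contains an entry for the name."""
    name = name.rstrip(".").lower()
    if not in_namespace(name, domain):
        return False
    label = name[: -len(domain)].rstrip(".")  # part to the left of the domain
    return label in hosts
```

Here wikipedia.citizendium.org passes the first test but fails the second: it is a possible name, not an actual one.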

Resource records

Zone files are made up of resource records (RR). All RRs have several common properties:

  • owner: the domain in which the authoritative RR resides. This is often implicitly derived from context, perhaps relative to the current domain name
  • type: an encoded 16 bit value that defines the type of resource defined by the current records. Some types are obsolete, while others continue to be added for new DNS functions.
  • class: an obsolete but required field, it is a 16 bit value for the protocol family with which the RR is associated. The only value used is the "Internet", textually represented as IN
  • time to live: commonly called TTL, this parameter specifies how long the RR may be kept in a cache and assumed to be valid. It is a 32 bit integer, whose value is measured in seconds
  • RDATA: type-specific data about the resource

While there are many graphic tools for creating RRs, the basic textual syntax is:

[owner] [TTL] IN [type] [rdata]

where IN is the class and the TTL may be omitted. For example, the RR defining the address associated with the name XX.LCS.MIT.EDU[11] is:

XX.LCS.MIT.EDU. IN A 10.0.0.44

RR types in current use

Type | RR name | Function | Typical RDATA
SOA | Start Of Authority | Defines the start of a zone or a subzone; subordinate records inherit parameters | Multiple fields
A | Address (IPv4) | Specifies the IPv4 address for a host | IPv4 address
AAAA | Address (IPv6) | Specifies the IPv6 address for a host | IPv6 address
PTR | Pointer | Reverse mapping of address to name | Name
CNAME | Canonical name | Specifies that the owner name is an alias for another (canonical) name | Name
NS | Name server | Names a name server that is authoritative for the zone; in a parent zone, delegates a subdomain to its name servers | Name
MX | Mail exchanger | Identifies a host willing to act as a mail exchange for the owner domain | A 16 bit preference value (lower is better) followed by a host name
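
As an illustration of the common RR fields, the following Python sketch parses one line of the textual form. It assumes a deliberately simplified layout, owner, optional numeric TTL, the class IN, type, then RDATA, and ignores the rest of real zone file syntax (comments, parentheses, relative names, directives). The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRecord:
    owner: str
    ttl: Optional[int]  # None means "use the zone default"
    rr_class: str
    rr_type: str
    rdata: str


def parse_rr(line: str) -> ResourceRecord:
    """Parse 'owner [ttl] IN type rdata' (a simplified, assumed grammar)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty record")
    owner, rest = fields[0], fields[1:]
    ttl = None
    if rest and rest[0].isdigit():
        ttl, rest = int(rest[0]), rest[1:]
    if len(rest) < 3 or rest[0].upper() != "IN":
        raise ValueError(f"expected 'IN <type> <rdata>' in: {line!r}")
    # Everything after the type is kept verbatim as type-specific RDATA.
    return ResourceRecord(owner, ttl, "IN", rest[1].upper(), " ".join(rest[2:]))
```

For example, parse_rr("XX.LCS.MIT.EDU. IN A 10.0.0.44") yields a record of type A with no explicit TTL, while an MX line keeps both the preference value and the host name in its RDATA.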

Wildcards in Resource Records

An additional complexity of RRs is that they may contain wildcards. The simplest example is a " * " character that will match any string in a name expression. In specific situations, this is an extremely useful function, but it can complicate troubleshooting.[12]

In 2003, Verisign, which operates the .com registry, inserted a wildcard into the master DNS files, so that an undefined name, rather than returning an error message, would be redirected to one of the registry's commercial search engines.[13] If the World Wide Web were the only function on the Internet, this might, although revenue-generating, have been useful. Unfortunately, there are many other functions on the Internet. In particular, messaging application protocols such as the Simple Mail Transfer Protocol (SMTP) use the "host not found" information to conclude that mail to that host is undeliverable; with the wildcard in place, undefined names resolved to an address and that signal was lost.

A legitimately useful application of a wildcard, however, is in a split DNS deployment, with different name resolution policies on different sides of a firewall. On the public Internet side of the firewall, the DNS server for example.com would have explicit records for the organization's public web server, mail server, and other public servers. Any reference to "inside" addresses, however, would be handled by the record:

*.example.com IN A [outside address of the firewall]
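
The way such a record catches otherwise undefined names can be sketched in Python. This is a simplification: explicit records win, and otherwise the lookup tries "*." plus each successively shorter parent domain. It does not implement the full closest-encloser rules of RFC4592, and the addresses are hypothetical documentation values.

```python
# Hypothetical public-side records; 198.51.100.1 stands in for the
# outside address of the firewall.
RECORDS = {
    "www.example.com.": "198.51.100.10",
    "mail.example.com.": "198.51.100.25",
    "*.example.com.": "198.51.100.1",
}


def lookup(name, records=RECORDS):
    """Return the address for name, preferring explicit records over wildcards."""
    name = name.lower()
    if name in records:
        return records[name]
    labels = name.split(".")
    # Try "*.<parent>" for each successively shorter parent domain.
    for i in range(1, len(labels) - 1):
        candidate = "*." + ".".join(labels[i:])
        if candidate in records:
            return records[candidate]
    return None
```

The same mechanism explains the troubleshooting difficulty: a mistyped name under example.com no longer fails, it silently resolves to the wildcard's address.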

Domain Name System security, however, does not have a complete solution to working with wildcarded RRs.

Deploying DNS

To understand basic DNS, assume that it is being used in a single organization, which has one technical and administrative authority in control. In other words, the domain and its subdomains are homogeneous. While there may be minor exceptions due to the existence of temporarily cached data in individual clients and servers, and not all clients and servers may be able to view all parts of the highest-level domain, a single organization's DNS is essentially a distributed database, where there are multiple copies of a single "golden copy" of information.

Once one starts interconnecting domains under different authority, as in the Internet, both administrative and technical aspects change. First, it is understood that while the total collection of all domains conceptually has access to all public name information, no single domain will have a copy of all information. Rather than being a distributed data base, it has become a federated data base, where there is a common indexing and retrieval model, but requests may need to go to multiple servers, in multiple domains and subdomains, before the request is satisfied.

Second, even between well-recognized business partner organizations, there are trust issues. Third, there are miscreants actively attacking the DNS, for reasons from ideology to technical status to pure criminal revenue.

Basic Implementation

The administrator of a homogeneous domain (and its subdomains) starts by building a zone file that defines the names and addresses of hosts in that zone, optional additional information to be added to the responses, and a reference to a higher-level nameserver that helps connect the domain of the zone to other domains. For example, a host in a.com would have to go to the nameserver of .com to find the address of the b.com nameserver.

SOA Resource Record

The zone/domain name starts the record; it must end with a trailing period. Assume that it is sub.example.com.

In the resource data, the first field is the primary name server for this zone, the one that holds the master copy of the zone file; NS records, by contrast, list all of the name servers that are authoritative for the zone. In this case, the primary might be ns1.sub.example.com.

Next comes the mail address of the person or role responsible for the data in this domain, written not in the conventional user@domain, but in the syntax of a DNS name in a zone file. To create a mail address, replace the leftmost period with an "@" symbol and remove the trailing period.
" administrator.sub.example.com. " is changed to " administrator@sub.example.com ".

Following the administrator are several parameters that may have defaults, but should be known. The first is the serial number of this version of the zone file, which will increase whenever this file is updated.

The next four are timers for the domain, specified in seconds:

  • refresh interval: Secondary name servers in the domain should check the primary for new data after this number of seconds expires
  • retry interval: If the secondary was unable to get an update when the refresh interval expires, this parameter tells the secondary how long to wait before retrying. The value in this field is usually less than the refresh interval
  • expire interval: If the secondary was unable to get an update before this timer expires, it should assume that all of the RR information in its copy of the zone file is no longer valid. If this timer triggers, the secondary server will stop responding to DNS requests for the zone
  • TTL: The default TTL for RRs in this zone. An appropriate TTL is controversial, and may be quite different on an internal nameserver versus one accessible from the Internet. The shorter the interval, the more accurate is the data, and, further, the better it is for name-based load distribution schemes. The longer the interval, the less DNS traffic is generated
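
The SOA fields described above (responsible mailbox, serial number, and timers) can be sketched in Python. This is an illustration, not the behaviour of any particular name server: the function names and the idea of returning "check", "wait", or "expired" are assumptions, the mailbox conversion ignores escaped periods in the local part, and the serial comparison ignores wrap-around.

```python
from dataclasses import dataclass


def rname_to_email(rname: str) -> str:
    """Replace the leftmost period with '@' and drop the trailing period."""
    name = rname.strip().rstrip(".")
    local, sep, domain = name.partition(".")
    if not sep or not domain:
        raise ValueError(f"not a mailbox-style DNS name: {rname!r}")
    return f"{local}@{domain}"


@dataclass
class Soa:
    mname: str    # primary name server
    rname: str    # responsible mailbox, in DNS-name syntax
    serial: int
    refresh: int  # seconds between routine checks of the primary
    retry: int    # seconds to wait after a failed check
    expire: int   # seconds after which unrefreshed data is no longer served
    ttl: int


def secondary_action(soa, now, last_success, last_attempt):
    """Decide what a secondary should do at time `now` (all times in seconds)."""
    if now - last_success >= soa.expire:
        return "expired"  # stop answering for the zone
    if last_attempt > last_success:
        # The most recent check failed, so the retry interval applies.
        return "check" if now - last_attempt >= soa.retry else "wait"
    return "check" if now - last_success >= soa.refresh else "wait"


def needs_transfer(local_serial, primary_serial):
    """A higher serial on the primary means the secondary's copy is stale."""
    return primary_serial > local_serial
```

With a refresh of 3600 seconds and a retry of 600 seconds, a secondary that failed to reach the primary at the one-hour mark would try again ten minutes later rather than waiting another full hour.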

Other Resource Records

NS
names a name server that is authoritative for a zone. In a parent zone, NS records delegate a subdomain to its own name servers, which is how a query is referred from one level of the hierarchy to the next.
A and AAAA
code the authoritative host name and its address, and, optionally, the TTL if different from the zone TTL.
PTR
codes an address and the corresponding host name, and, optionally, the TTL if different from the zone TTL.
CNAME
codes an alternative host name (an alias) and the canonical name to which it refers, and, optionally, the TTL if different from the zone TTL.
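
For PTR records, the address is written as a name in a special reverse-mapping domain (in-addr.arpa for IPv4, ip6.arpa for IPv6). A short Python sketch using the standard ipaddress module shows how the owner name of a PTR record can be derived; the helper function names are hypothetical.

```python
import ipaddress


def ptr_owner(address: str) -> str:
    """Return the owner name under which a PTR record for this address is stored."""
    return ipaddress.ip_address(address).reverse_pointer + "."


def ptr_record(address: str, hostname: str) -> str:
    """Format a PTR record in the simple textual form used in this article."""
    return f"{ptr_owner(address)} IN PTR {hostname}"
```

For the address 10.0.0.44, the octets are reversed, so the PTR record is owned by 44.0.0.10.in-addr.arpa.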

Resource Record sets (RRsets)

While no two RRs should have the same label and type and data all equal, it is perfectly possible to have RRs with the same label and type, but different RDATA. For example, a physically multihomed server could have four network interface cards (NIC), each on a different subnet. The set of addresses for this host name (i.e., label) would reasonably form a set of four A records with different address data. Such a set of records is called a Resource Record Set (RRSet). [14]
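
Grouping records into RRsets is straightforward to sketch in Python. The host name and addresses below are hypothetical; the point is only that records sharing a label and type collapse into one set, and that an exact duplicate adds nothing.

```python
from collections import defaultdict


def build_rrsets(records):
    """Group (owner, type, rdata) tuples into RRsets keyed by (owner, type)."""
    rrsets = defaultdict(set)
    for owner, rr_type, rdata in records:
        # Using a set silently drops records whose label, type and data all match.
        rrsets[(owner.lower(), rr_type.upper())].add(rdata)
    return dict(rrsets)


# A multihomed host with four interfaces, plus one accidental duplicate.
EXAMPLE = [
    ("multi.example.com.", "A", "192.0.2.1"),
    ("multi.example.com.", "A", "192.0.2.2"),
    ("multi.example.com.", "A", "192.0.2.3"),
    ("multi.example.com.", "A", "192.0.2.4"),
    ("multi.example.com.", "A", "192.0.2.1"),
    ("multi.example.com.", "AAAA", "2001:db8::1"),
]
```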

Obtaining root information

The root name server zone file is expected to be retrieved, by anonymous FTP, from various well-known sites approved by ICANN. In practice, most DNS implementations ship with a recent copy. Root servers remain very busy. [3] In fact, while the root server zone file mentioned above will give the names and addresses of root servers in the general form

a.root-servers.net

the address of a particular server is of the anycast type; [15] there are multiple physical computers with that address, for fault tolerance and load sharing.

For each domain, there must be at least one, and preferably more than one name server that holds the zone files. Primary domain servers have the authoritative zone files, and secondary domain servers keep an exact copy of the primary's zone file. Both types are assumed to have a disk or other storage from which they can restore the domain information.

Zone transfer adds to populating a server database

A secondary server will use a zone transfer to obtain the primary zone file for its domain. There are various operational reasons why a physical server might act as primary and secondary for multiple zones; the important point here is that a zone transfer, as opposed to ordinary DNS retrieval, alters the contents of the definitions and must be treated as a sensitive operation.

Adding trusted dynamic updates

The nameserver can also accept dynamic updates, which, strictly speaking, do not have to be secured, but dynamic update, especially in an IPv6 environment, is so open an invitation to miscreants that it should never be considered without being secured. DNS security is the normal way this might be done, but there are other alternatives, such as an encrypted link between the update source and the nameserver.

There are also caching-only servers that contain only the names and addresses that have been recently looked up, and are still valid with respect to the TTL parameter in the relevant records.

Resolvers, their caches, and their information sources

The program, on a host, which is the client of DNS servers is most often called a resolver. Depending on the local network architectural implementation, a resolver may go to a caching-only server, a secondary server, or the primary server for its information. It may retain a cache of recently retrieved DNS information, clearing items from cache as their TTLs expire.
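
A resolver cache that honours TTLs can be sketched in a few lines of Python. The class name and interface are hypothetical, and the clock is injectable only so that expiry can be demonstrated without waiting; real resolvers also handle negative answers and cache size limits.

```python
import time


class ResolverCache:
    """Minimal TTL-based cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, type) -> (rdata, expiry time)

    def put(self, name, rr_type, rdata, ttl):
        """Store an answer; it stays valid for ttl seconds from now."""
        self._entries[(name.lower(), rr_type)] = (rdata, self._clock() + ttl)

    def get(self, name, rr_type):
        """Return cached rdata, or None if absent or past its TTL."""
        key = (name.lower(), rr_type)
        entry = self._entries.get(key)
        if entry is None:
            return None
        rdata, expires = entry
        if self._clock() >= expires:
            del self._entries[key]  # TTL ran out: clear the item from the cache
            return None
        return rdata
```

A cache miss, whether because the name was never looked up or because its TTL expired, is the point at which the resolver goes back to a name server.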

Heterogeneous DNS

For more information, see: Split DNS.

While there will be different federated databases, DNS is certainly not limited to the public Internet. It is quite common for organizations to have split DNS "inside the firewall" and "outside the firewall". An inside user will query local DNS for the address of an internal machine and get the address of the actual host, but, if it asks for the address of citizendium.com, the address returned by DNS may well be that of the "inside" interface of a firewall, or other security middlebox.[16] Depending on the firewall implementation, it may deny access, or create a proxy connection to the outside host. To establish that connection, the middlebox will query an "outside" DNS, which contains the addresses of the organization's public hosts, but primarily contains the addresses of external hosts. In some cases, that outside DNS enjoys some trust with an external organization, and may do secured zone transfers. More often, however, the outside DNS is primarily a cache of name-address information that it obtained by queries to the nameservers of other domains.
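
The core of split DNS, answering the same question differently depending on who asks, can be sketched in Python with the standard ipaddress module. The internal network range, host names, and addresses below are all hypothetical, and a real deployment would normally configure this in the name server rather than in application code.

```python
import ipaddress

# Hypothetical internal address range and per-side record sets.
INSIDE_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]

INSIDE_VIEW = {
    "intranet.example.com.": "10.1.2.3",
    "www.example.com.": "10.1.2.80",
}
OUTSIDE_VIEW = {
    "www.example.com.": "198.51.100.10",
}


def select_view(client_ip):
    """Pick the record set according to which side of the firewall the client is on."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in INSIDE_NETWORKS):
        return INSIDE_VIEW
    return OUTSIDE_VIEW


def split_lookup(client_ip, name):
    """Resolve name as seen from client_ip; None if the name is not visible there."""
    return select_view(client_ip).get(name.lower())
```

An outside client asking for intranet.example.com. gets no answer at all, while an inside client gets the internal address.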

DNS protocols

The most basic DNS protocols are the lookup service, which runs over port 53 of the connectionless User Datagram Protocol, and the zone transfer service, which runs over port 53 of the connection-oriented Transmission Control Protocol.[17] Lookup is a read-only function, while a zone transfer copies an entire zone and replaces the data held by the receiving server, so it should be implemented as a privileged, authenticated operation. Otherwise any client on a DNS server's network could request a zone transfer, and receive a complete copy of a zone file, which is a security risk.
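
To give a sense of what the lookup service carries in a UDP datagram, the following Python sketch builds the bytes of a simple query for an A record, following the message layout in RFC1035: a 12-byte header, then a question made of length-prefixed labels, a type, and a class. It only constructs the message and does not send it; the function names are hypothetical.

```python
import struct

QTYPE_A = 1    # type code for an IPv4 address record
QCLASS_IN = 1  # class code for "Internet"


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"bad label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, query_id: int, recursion_desired: bool = True) -> bytes:
    """Build a one-question DNS query message for an A record."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit of the flags word
    # Header: ID, flags, then question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPE_A, QCLASS_IN)
    return header + question
```

The recursion-desired flag in the header is how a client asks a server for the recursive style of lookup rather than a referral.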

There are also protocols for dynamic update, so that network clients can automatically update their DNS servers to reflect correct hostnames (e.g. if they dynamically receive a different IP address via DHCP). This concept is also known as Dynamic DNS. [18]

Extended applications

These include Domain Name System dynamic update, use of the DNS as a data base in Public Key Infrastructure (PKI) for general security, Domain Name System security (DNSSEC) and name-based routing and load distribution.

References

  1. Mockapetris, P.V. (November 1987), Domain names - concepts and facilities, Internet Engineering Task Force, RFC1034
  2. Mockapetris, P.V. (November 1983), Domain names: Concepts and facilities, Internet Engineering Task Force, RFC882
  3. Albitz, Paul & Cricket Liu (1997), DNS and BIND, second edition, O'Reilly, p. 9
  4. Arends, R. et al. (March 2005), DNS Security Introduction and Requirements, Internet Engineering Task Force, RFC4033
  5. Bush, R. et al. (August 2002), Representing Internet Protocol version 6 (IPv6) Addresses in the Domain Name System (DNS), Internet Engineering Task Force, RFC3363
  6. BIND, Internet Software Consortium
  7. Evans, Karen (August 22, 2008), Securing the Federal Government’s Domain Name System Infrastructure (Submission of Draft Agency Plans Due by September 5, 2008)
  8. Dyer, Stephen (October 1, 2004), .UK – Revisited
  9. ICM Registry (June 25, 2010), ICM Registry welcomes approval of .xxx
  10. RFC1034, pp. 3-4
  11. Note that the actual RR has a terminal period that does not appear when the DNS name is written in other uses
  12. E. Lewis (July 2006), The Role of Wildcards in the Domain Name System, RFC4592
  13. Internet Corporation for Assigned Names and Numbers, Verisign's Wildcard Service Deployment
  14. R. Elz, R. Bush (July 1997), Clarifications to the DNS Specification, Internet Engineering Task Force, RFC2181
  15. Liman, Lars-Johan et al, Operation of the Root Name Servers
  16. P. Srisuresh, J. Kuthan, J. Rosenberg, A. Molitor, A. Rayhan (August 2002), Middlebox communication architecture and framework., RFC3303
  17. Mockapetris, P.V. (November 1987), Domain names - implementation and specification, Internet Engineering Task Force, RFC1035
  18. Vixie, P., ed. (April 1997), Dynamic Updates in the Domain Name System (DNS UPDATE), Internet Engineering Task Force, RFC2136