You are viewing a plain text version of this content. The canonical link for it is here.
Posted to server-dev@james.apache.org by ba...@apache.org on 2006/07/29 15:16:01 UTC
svn commit: r426798 [6/30] - in /james/server/trunk/src/site/resources/rfclist: ./ basic/ imap4/ ldap/ nntp/ pop3/ smtp/

Added: james/server/trunk/src/site/resources/rfclist/basic/rfc2046.txt
URL: http://svn.apache.org/viewvc/james/server/trunk/src/site/resources/rfclist/basic/rfc2046.txt?rev=426798&view=auto
==============================================================================
--- james/server/trunk/src/site/resources/rfclist/basic/rfc2046.txt (added)
+++ james/server/trunk/src/site/resources/rfclist/basic/rfc2046.txt Sat Jul 29 06:15:59 2006
@@ -0,0 +1,2468 @@
+
+
+
+
+
+Network Working Group                                          N. Freed
+Request for Comments: 2046                                     Innosoft
+Obsoletes: 1521, 1522, 1590                               N. Borenstein
+Category: Standards Track                                 First Virtual
+                                                          November 1996
+
+
+                 Multipurpose Internet Mail Extensions
+                            (MIME) Part Two:
+                              Media Types
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Abstract
+
+   STD 11, RFC 822 defines a message representation protocol specifying
+   considerable detail about US-ASCII message headers, but which leaves
+   the message content, or message body, as flat US-ASCII text.  This
+   set of documents, collectively called the Multipurpose Internet Mail
+   Extensions, or MIME, redefines the format of messages to allow for
+
+    (1)   textual message bodies in character sets other than
+          US-ASCII,
+
+    (2)   an extensible set of different formats for non-textual
+          message bodies,
+
+    (3)   multi-part message bodies, and
+
+    (4)   textual header information in character sets other than
+          US-ASCII.
+
+   These documents are based on earlier work documented in RFC 934, STD
+   11, and RFC 1049, but extends and revises them.  Because RFC 822 said
+   so little about message bodies, these documents are largely
+   orthogonal to (rather than a revision of) RFC 822.
+
+   The initial document in this set, RFC 2045, specifies the various
+   headers used to describe the structure of MIME messages. This second
+   document defines the general structure of the MIME media typing
+   system and defines an initial set of media types. The third document,
+   RFC 2047, describes extensions to RFC 822 to allow non-US-ASCII text
+
+
+
+Freed & Borenstein          Standards Track                     [Page 1]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   data in Internet mail header fields. The fourth document, RFC 2048,
+   specifies various IANA registration procedures for MIME-related
+   facilities.  The fifth and final document, RFC 2049, describes MIME
+   conformance criteria as well as providing some illustrative examples
+   of MIME message formats, acknowledgements, and the bibliography.
+
+   These documents are revisions of RFCs 1521 and 1522, which themselves
+   were revisions of RFCs 1341 and 1342.  An appendix in RFC 2049
+   describes differences and changes from previous versions.
+
+Table of Contents
+
+   1. Introduction .........................................    3
+   2. Definition of a Top-Level Media Type .................    4
+   3. Overview Of The Initial Top-Level Media Types ........    4
+   4. Discrete Media Type Values ...........................    6
+   4.1 Text Media Type .....................................    6
+   4.1.1 Representation of Line Breaks .....................    7
+   4.1.2 Charset Parameter .................................    7
+   4.1.3 Plain Subtype .....................................   11
+   4.1.4 Unrecognized Subtypes .............................   11
+   4.2 Image Media Type ....................................   11
+   4.3 Audio Media Type ....................................   11
+   4.4 Video Media Type ....................................   12
+   4.5 Application Media Type ..............................   12
+   4.5.1 Octet-Stream Subtype ..............................   13
+   4.5.2 PostScript Subtype ................................   14
+   4.5.3 Other Application Subtypes ........................   17
+   5. Composite Media Type Values ..........................   17
+   5.1 Multipart Media Type ................................   17
+   5.1.1 Common Syntax .....................................   19
+   5.1.2 Handling Nested Messages and Multiparts ...........   24
+   5.1.3 Mixed Subtype .....................................   24
+   5.1.4 Alternative Subtype ...............................   24
+   5.1.5 Digest Subtype ....................................   26
+   5.1.6 Parallel Subtype ..................................   27
+   5.1.7 Other Multipart Subtypes ..........................   28
+   5.2 Message Media Type ..................................   28
+   5.2.1 RFC822 Subtype ....................................   28
+   5.2.2 Partial Subtype ...................................   29
+   5.2.2.1 Message Fragmentation and Reassembly ............   30
+   5.2.2.2 Fragmentation and Reassembly Example ............   31
+   5.2.3 External-Body Subtype .............................   33
+   5.2.4 Other Message Subtypes ............................   40
+   6. Experimental Media Type Values .......................   40
+   7. Summary ..............................................   41
+   8. Security Considerations ..............................   41
+   9. Authors' Addresses ...................................   42
+
+
+
+Freed & Borenstein          Standards Track                     [Page 2]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   A. Collected Grammar ....................................   43
+
+1.  Introduction
+
+   The first document in this set, RFC 2045, defines a number of header
+   fields, including Content-Type. The Content-Type field is used to
+   specify the nature of the data in the body of a MIME entity, by
+   giving media type and subtype identifiers, and by providing auxiliary
+   information that may be required for certain media types.  After the
+   type and subtype names, the remainder of the header field is simply a
+   set of parameters, specified in an attribute/value notation.  The
+   ordering of parameters is not significant.
+
+   In general, the top-level media type is used to declare the general
+   type of data, while the subtype specifies a specific format for that
+   type of data.  Thus, a media type of "image/xyz" is enough to tell a
+   user agent that the data is an image, even if the user agent has no
+   knowledge of the specific image format "xyz".  Such information can
+   be used, for example, to decide whether or not to show a user the raw
+   data from an unrecognized subtype -- such an action might be
+   reasonable for unrecognized subtypes of "text", but not for
+   unrecognized subtypes of "image" or "audio".  For this reason,
+   registered subtypes of "text", "image", "audio", and "video" should
+   not contain embedded information that is really of a different type.
+   Such compound formats should be represented using the "multipart" or
+   "application" types.
+
+   Parameters are modifiers of the media subtype, and as such do not
+   fundamentally affect the nature of the content.  The set of
+   meaningful parameters depends on the media type and subtype.  Most
+   parameters are associated with a single specific subtype.  However, a
+   given top-level media type may define parameters which are applicable
+   to any subtype of that type.  Parameters may be required by their
+   defining media type or subtype or they may be optional.  MIME
+   implementations must also ignore any parameters whose names they do
+   not recognize.
+
+   MIME's Content-Type header field and media type mechanism has been
+   carefully designed to be extensible, and it is expected that the set
+   of media type/subtype pairs and their associated parameters will grow
+   significantly over time.  Several other MIME facilities, such as
+   transfer encodings and "message/external-body" access types, are
+   likely to have new values defined over time.  In order to ensure that
+   the set of such values is developed in an orderly, well-specified,
+   and public manner, MIME sets up a registration process which uses the
+   Internet Assigned Numbers Authority (IANA) as a central registry for
+   MIME's various areas of extensibility.  The registration process for
+   these areas is described in a companion document, RFC 2048.
+
+
+
+Freed & Borenstein          Standards Track                     [Page 3]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   The initial seven standard top-level media type are defined and
+   described in the remainder of this document.
+
+2.  Definition of a Top-Level Media Type
+
+   The definition of a top-level media type consists of:
+
+    (1)   a name and a description of the type, including
+          criteria for whether a particular type would qualify
+          under that type,
+
+    (2)   the names and definitions of parameters, if any, which
+          are defined for all subtypes of that type (including
+          whether such parameters are required or optional),
+
+    (3)   how a user agent and/or gateway should handle unknown
+          subtypes of this type,
+
+    (4)   general considerations on gatewaying entities of this
+          top-level type, if any, and
+
+    (5)   any restrictions on content-transfer-encodings for
+          entities of this top-level type.
+
+3.  Overview Of The Initial Top-Level Media Types
+
+   The five discrete top-level media types are:
+
+    (1)   text -- textual information.  The subtype "plain" in
+          particular indicates plain text containing no
+          formatting commands or directives of any sort. Plain
+          text is intended to be displayed "as-is". No special
+          software is required to get the full meaning of the
+          text, aside from support for the indicated character
+          set. Other subtypes are to be used for enriched text in
+          forms where application software may enhance the
+          appearance of the text, but such software must not be
+          required in order to get the general idea of the
+          content.  Possible subtypes of "text" thus include any
+          word processor format that can be read without
+          resorting to software that understands the format.  In
+          particular, formats that employ embeddded binary
+          formatting information are not considered directly
+          readable. A very simple and portable subtype,
+          "richtext", was defined in RFC 1341, with a further
+          revision in RFC 1896 under the name "enriched".
+
+
+
+
+
+Freed & Borenstein          Standards Track                     [Page 4]
+
+RFC 2046                      Media Types                  November 1996
+
+
+    (2)   image -- image data.  "Image" requires a display device
+          (such as a graphical display, a graphics printer, or a
+          FAX machine) to view the information. An initial
+          subtype is defined for the widely-used image format
+          JPEG. .  subtypes are defined for two widely-used image
+          formats, jpeg and gif.
+
+    (3)   audio -- audio data.  "Audio" requires an audio output
+          device (such as a speaker or a telephone) to "display"
+          the contents.  An initial subtype "basic" is defined in
+          this document.
+
+    (4)   video -- video data.  "Video" requires the capability
+          to display moving images, typically including
+          specialized hardware and software.  An initial subtype
+          "mpeg" is defined in this document.
+
+    (5)   application -- some other kind of data, typically
+          either uninterpreted binary data or information to be
+          processed by an application.  The subtype "octet-
+          stream" is to be used in the case of uninterpreted
+          binary data, in which case the simplest recommended
+          action is to offer to write the information into a file
+          for the user.  The "PostScript" subtype is also defined
+          for the transport of PostScript material.  Other
+          expected uses for "application" include spreadsheets,
+          data for mail-based scheduling systems, and languages
+          for "active" (computational) messaging, and word
+          processing formats that are not directly readable.
+          Note that security considerations may exist for some
+          types of application data, most notably
+          "application/PostScript" and any form of active
+          messaging.  These issues are discussed later in this
+          document.
+
+   The two composite top-level media types are:
+
+    (1)   multipart -- data consisting of multiple entities of
+          independent data types.  Four subtypes are initially
+          defined, including the basic "mixed" subtype specifying
+          a generic mixed set of parts, "alternative" for
+          representing the same data in multiple formats,
+          "parallel" for parts intended to be viewed
+          simultaneously, and "digest" for multipart entities in
+          which each part has a default type of "message/rfc822".
+
+
+
+
+
+
+Freed & Borenstein          Standards Track                     [Page 5]
+
+RFC 2046                      Media Types                  November 1996
+
+
+    (2)   message -- an encapsulated message.  A body of media
+          type "message" is itself all or a portion of some kind
+          of message object.  Such objects may or may not in turn
+          contain other entities.  The "rfc822" subtype is used
+          when the encapsulated content is itself an RFC 822
+          message.  The "partial" subtype is defined for partial
+          RFC 822 messages, to permit the fragmented transmission
+          of bodies that are thought to be too large to be passed
+          through transport facilities in one piece.  Another
+          subtype, "external-body", is defined for specifying
+          large bodies by reference to an external data source.
+
+   It should be noted that the list of media type values given here may
+   be augmented in time, via the mechanisms described above, and that
+   the set of subtypes is expected to grow substantially.
+
+4.  Discrete Media Type Values
+
+   Five of the seven initial media type values refer to discrete bodies.
+   The content of these types must be handled by non-MIME mechanisms;
+   they are opaque to MIME processors.
+
+4.1.  Text Media Type
+
+   The "text" media type is intended for sending material which is
+   principally textual in form.  A "charset" parameter may be used to
+   indicate the character set of the body text for "text" subtypes,
+   notably including the subtype "text/plain", which is a generic
+   subtype for plain text.  Plain text does not provide for or allow
+   formatting commands, font attribute specifications, processing
+   instructions, interpretation directives, or content markup.  Plain
+   text is seen simply as a linear sequence of characters, possibly
+   interrupted by line breaks or page breaks.  Plain text may allow the
+   stacking of several characters in the same position in the text.
+   Plain text in scripts like Arabic and Hebrew may also include
+   facilitites that allow the arbitrary mixing of text segments with
+   opposite writing directions.
+
+   Beyond plain text, there are many formats for representing what might
+   be known as "rich text".  An interesting characteristic of many such
+   representations is that they are to some extent readable even without
+   the software that interprets them.  It is useful, then, to
+   distinguish them, at the highest level, from such unreadable data as
+   images, audio, or text represented in an unreadable form. In the
+   absence of appropriate interpretation software, it is reasonable to
+   show subtypes of "text" to the user, while it is not reasonable to do
+   so with most nontextual data. Such formatted textual data should be
+   represented using subtypes of "text".
+
+
+
+Freed & Borenstein          Standards Track                     [Page 6]
+
+RFC 2046                      Media Types                  November 1996
+
+
+4.1.1.  Representation of Line Breaks
+
+   The canonical form of any MIME "text" subtype MUST always represent a
+   line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
+   MIME "text" MUST represent a line break.  Use of CR and LF outside of
+   line break sequences is also forbidden.
+
+   This rule applies regardless of format or character set or sets
+   involved.
+
+   NOTE: The proper interpretation of line breaks when a body is
+   displayed depends on the media type. In particular, while it is
+   appropriate to treat a line break as a transition to a new line when
+   displaying a "text/plain" body, this treatment is actually incorrect
+   for other subtypes of "text" like "text/enriched" [RFC-1896].
+   Similarly, whether or not line breaks should be added during display
+   operations is also a function of the media type. It should not be
+   necessary to add any line breaks to display "text/plain" correctly,
+   whereas proper display of "text/enriched" requires the appropriate
+   addition of line breaks.
+
+   NOTE: Some protocols defines a maximum line length.  E.g. SMTP [RFC-
+   821] allows a maximum of 998 octets before the next CRLF sequence.
+   To be transported by such protocols, data which includes too long
+   segments without CRLF sequences must be encoded with a suitable
+   content-transfer-encoding.
+
+4.1.2.  Charset Parameter
+
+   A critical parameter that may be specified in the Content-Type field
+   for "text/plain" data is the character set.  This is specified with a
+   "charset" parameter, as in:
+
+     Content-type: text/plain; charset=iso-8859-1
+
+   Unlike some other parameter values, the values of the charset
+   parameter are NOT case sensitive.  The default character set, which
+   must be assumed in the absence of a charset parameter, is US-ASCII.
+
+   The specification for any future subtypes of "text" must specify
+   whether or not they will also utilize a "charset" parameter, and may
+   possibly restrict its values as well.  For other subtypes of "text"
+   than "text/plain", the semantics of the "charset" parameter should be
+   defined to be identical to those specified here for "text/plain",
+   i.e., the body consists entirely of characters in the given charset.
+   In particular, definers of future "text" subtypes should pay close
+   attention to the implications of multioctet character sets for their
+   subtype definitions.
+
+
+
+Freed & Borenstein          Standards Track                     [Page 7]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   The charset parameter for subtypes of "text" gives a name of a
+   character set, as "character set" is defined in RFC 2045.  The rules
+   regarding line breaks detailed in the previous section must also be
+   observed -- a character set whose definition does not conform to
+   these rules cannot be used in a MIME "text" subtype.
+
+   An initial list of predefined character set names can be found at the
+   end of this section.  Additional character sets may be registered
+   with IANA.
+
+   Other media types than subtypes of "text" might choose to employ the
+   charset parameter as defined here, but with the CRLF/line break
+   restriction removed.  Therefore, all character sets that conform to
+   the general definition of "character set" in RFC 2045 can be
+   registered for MIME use.
+
+   Note that if the specified character set includes 8-bit characters
+   and such characters are used in the body, a Content-Transfer-Encoding
+   header field and a corresponding encoding on the data are required in
+   order to transmit the body via some mail transfer protocols, such as
+   SMTP [RFC-821].
+
+   The default character set, US-ASCII, has been the subject of some
+   confusion and ambiguity in the past.  Not only were there some
+   ambiguities in the definition, there have been wide variations in
+   practice.  In order to eliminate such ambiguity and variations in the
+   future, it is strongly recommended that new user agents explicitly
+   specify a character set as a media type parameter in the Content-Type
+   header field. "US-ASCII" does not indicate an arbitrary 7-bit
+   character set, but specifies that all octets in the body must be
+   interpreted as characters according to the US-ASCII character set.
+   National and application-oriented versions of ISO 646 [ISO-646] are
+   usually NOT identical to US-ASCII, and in that case their use in
+   Internet mail is explicitly discouraged.  The omission of the ISO 646
+   character set from this document is deliberate in this regard.  The
+   character set name of "US-ASCII" explicitly refers to the character
+   set defined in ANSI X3.4-1986 [US- ASCII].  The new international
+   reference version (IRV) of the 1991 edition of ISO 646 is identical
+   to US-ASCII.  The character set name "ASCII" is reserved and must not
+   be used for any purpose.
+
+   NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier
+   version of the American Standard.  Insofar as one of the purposes of
+   specifying a media type and character set is to permit the receiver
+   to unambiguously determine how the sender intended the coded message
+   to be interpreted, assuming anything other than "strict ASCII" as the
+   default would risk unintentional and incompatible changes to the
+   semantics of messages now being transmitted.  This also implies that
+
+
+
+Freed & Borenstein          Standards Track                     [Page 8]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   messages containing characters coded according to other versions of
+   ISO 646 than US-ASCII and the 1991 IRV, or using code-switching
+   procedures (e.g., those of ISO 2022), as well as 8bit or multiple
+   octet character encodings MUST use an appropriate character set
+   specification to be consistent with MIME.
+
+   The complete US-ASCII character set is listed in ANSI X3.4- 1986.
+   Note that the control characters including DEL (0-31, 127) have no
+   defined meaning in apart from the combination CRLF (US-ASCII values
+   13 and 10) indicating a new line.  Two of the characters have de
+   facto meanings in wide use: FF (12) often means "start subsequent
+   text on the beginning of a new page"; and TAB or HT (9) often (though
+   not always) means "move the cursor to the next available column after
+   the current position where the column number is a multiple of 8
+   (counting the first column as column 0)."  Aside from these
+   conventions, any use of the control characters or DEL in a body must
+   either occur
+
+    (1)   because a subtype of text other than "plain"
+          specifically assigns some additional meaning, or
+
+    (2)   within the context of a private agreement between the
+          sender and recipient. Such private agreements are
+          discouraged and should be replaced by the other
+          capabilities of this document.
+
+   NOTE: An enormous proliferation of character sets exist beyond US-
+   ASCII.  A large number of partially or totally overlapping character
+   sets is NOT a good thing.  A SINGLE character set that can be used
+   universally for representing all of the world's languages in Internet
+   mail would be preferrable.  Unfortunately, existing practice in
+   several communities seems to point to the continued use of multiple
+   character sets in the near future.  A small number of standard
+   character sets are, therefore, defined for Internet use in this
+   document.
+
+   The defined charset values are:
+
+    (1)   US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].
+
+    (2)   ISO-8859-X -- where "X" is to be replaced, as
+          necessary, for the parts of ISO-8859 [ISO-8859].  Note
+          that the ISO 646 character sets have deliberately been
+          omitted in favor of their 8859 replacements, which are
+          the designated character sets for Internet mail.  As of
+          the publication of this document, the legitimate values
+          for "X" are the digits 1 through 10.
+
+
+
+
+Freed & Borenstein          Standards Track                     [Page 9]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   Characters in the range 128-159 has no assigned meaning in ISO-8859-
+   X.  Characters with values below 128 in ISO-8859-X have the same
+   assigned meaning as they do in US-ASCII.
+
+   Part 6 of ISO 8859 (Latin/Arabic alphabet) and part 8 (Latin/Hebrew
+   alphabet) includes both characters for which the normal writing
+   direction is right to left and characters for which it is left to
+   right, but do not define a canonical ordering method for representing
+   bi-directional text.  The charset values "ISO-8859-6" and "ISO-8859-
+   8", however, specify that the visual method is used [RFC-1556].
+
+   All of these character sets are used as pure 7bit or 8bit sets
+   without any shift or escape functions.  The meaning of shift and
+   escape sequences in these character sets is not defined.
+
+   The character sets specified above are the ones that were relatively
+   uncontroversial during the drafting of MIME.  This document does not
+   endorse the use of any particular character set other than US-ASCII,
+   and recognizes that the future evolution of world character sets
+   remains unclear.
+
+   Note that the character set used, if anything other than US- ASCII,
+   must always be explicitly specified in the Content-Type field.
+
+   No character set name other than those defined above may be used in
+   Internet mail without the publication of a formal specification and
+   its registration with IANA, or by private agreement, in which case
+   the character set name must begin with "X-".
+
+   Implementors are discouraged from defining new character sets unless
+   absolutely necessary.
+
+   The "charset" parameter has been defined primarily for the purpose of
+   textual data, and is described in this section for that reason.
+   However, it is conceivable that non-textual data might also wish to
+   specify a charset value for some purpose, in which case the same
+   syntax and values should be used.
+
+   In general, composition software should always use the "lowest common
+   denominator" character set possible.  For example, if a body contains
+   only US-ASCII characters, it SHOULD be marked as being in the US-
+   ASCII character set, not ISO-8859-1, which, like all the ISO-8859
+   family of character sets, is a superset of US-ASCII.  More generally,
+   if a widely-used character set is a subset of another character set,
+   and a body contains only characters in the widely-used subset, it
+   should be labelled as being in that subset.  This will increase the
+   chances that the recipient will be able to view the resulting entity
+   correctly.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 10]
+
+RFC 2046                      Media Types                  November 1996
+
+
+4.1.3.  Plain Subtype
+
+   The simplest and most important subtype of "text" is "plain".  This
+   indicates plain text that does not contain any formatting commands or
+   directives. Plain text is intended to be displayed "as-is", that is,
+   no interpretation of embedded formatting commands, font attribute
+   specifications, processing instructions, interpretation directives,
+   or content markup should be necessary for proper display.  The
+   default media type of "text/plain; charset=us-ascii" for Internet
+   mail describes existing Internet practice.  That is, it is the type
+   of body defined by RFC 822.
+
+   No other "text" subtype is defined by this document.
+
+4.1.4.  Unrecognized Subtypes
+
+   Unrecognized subtypes of "text" should be treated as subtype "plain"
+   as long as the MIME implementation knows how to handle the charset.
+   Unrecognized subtypes which also specify an unrecognized charset
+   should be treated as "application/octet- stream".
+
+4.2.  Image Media Type
+
+   A media type of "image" indicates that the body contains an image.
+   The subtype names the specific image format.  These names are not
+   case sensitive. An initial subtype is "jpeg" for the JPEG format
+   using JFIF encoding [JPEG].
+
+   The list of "image" subtypes given here is neither exclusive nor
+   exhaustive, and is expected to grow as more types are registered with
+   IANA, as described in RFC 2048.
+
+   Unrecognized subtypes of "image" should at a miniumum be treated as
+   "application/octet-stream".  Implementations may optionally elect to
+   pass subtypes of "image" that they do not specifically recognize to a
+   secure and robust general-purpose image viewing application, if such
+   an application is available.
+
+   NOTE: Using of a generic-purpose image viewing application this way
+   inherits the security problems of the most dangerous type supported
+   by the application.
+
+4.3.  Audio Media Type
+
+   A media type of "audio" indicates that the body contains audio data.
+   Although there is not yet a consensus on an "ideal" audio format for
+   use with computers, there is a pressing need for a format capable of
+   providing interoperable behavior.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 11]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   The initial subtype of "basic" is specified to meet this requirement
+   by providing an absolutely minimal lowest common denominator audio
+   format.  It is expected that richer formats for higher quality and/or
+   lower bandwidth audio will be defined by a later document.
+
+   The content of the "audio/basic" subtype is single channel audio
+   encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz.
+
+   Unrecognized subtypes of "audio" should at a miniumum be treated as
+   "application/octet-stream".  Implementations may optionally elect to
+   pass subtypes of "audio" that they do not specifically recognize to a
+   robust general-purpose audio playing application, if such an
+   application is available.
+
+4.4.  Video Media Type
+
+   A media type of "video" indicates that the body contains a time-
+   varying-picture image, possibly with color and coordinated sound.
+   The term 'video' is used in its most generic sense, rather than with
+   reference to any particular technology or format, and is not meant to
+   preclude subtypes such as animated drawings encoded compactly.  The
+   subtype "mpeg" refers to video coded according to the MPEG standard
+   [MPEG].
+
+   Note that although in general this document strongly discourages the
+   mixing of multiple media in a single body, it is recognized that many
+   so-called video formats include a representation for synchronized
+   audio, and this is explicitly permitted for subtypes of "video".
+
+   Unrecognized subtypes of "video" should at a minumum be treated as
+   "application/octet-stream".  Implementations may optionally elect to
+   pass subtypes of "video" that they do not specifically recognize to a
+   robust general-purpose video display application, if such an
+   application is available.
+
+4.5.  Application Media Type
+
+   The "application" media type is to be used for discrete data which do
+   not fit in any of the other categories, and particularly for data to
+   be processed by some type of application program.  This is
+   information which must be processed by an application before it is
+   viewable or usable by a user.  Expected uses for the "application"
+   media type include file transfer, spreadsheets, data for mail-based
+   scheduling systems, and languages for "active" (computational)
+   material.  (The latter, in particular, can pose security problems
+   which must be understood by implementors, and are considered in
+   detail in the discussion of the "application/PostScript" media type.)
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 12]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   For example, a meeting scheduler might define a standard
+   representation for information about proposed meeting dates.  An
+   intelligent user agent would use this information to conduct a dialog
+   with the user, and might then send additional material based on that
+   dialog.  More generally, there have been several "active" messaging
+   languages developed in which programs in a suitably specialized
+   language are transported to a remote location and automatically run
+   in the recipient's environment.
+
+   Such applications may be defined as subtypes of the "application"
+   media type. This document defines two subtypes:
+
+   octet-stream, and PostScript.
+
+   The subtype of "application" will often be either the name or include
+   part of the name of the application for which the data are intended.
+   This does not mean, however, that any application program name may be
+   used freely as a subtype of "application".
+
+4.5.1.  Octet-Stream Subtype
+
+   The "octet-stream" subtype is used to indicate that a body contains
+   arbitrary binary data.  The set of currently defined parameters is:
+
+    (1)   TYPE -- the general type or category of binary data.
+          This is intended as information for the human recipient
+          rather than for any automatic processing.
+
+    (2)   PADDING -- the number of bits of padding that were
+          appended to the bit-stream comprising the actual
+          contents to produce the enclosed 8bit byte-oriented
+          data.  This is useful for enclosing a bit-stream in a
+          body when the total number of bits is not a multiple of
+          8.
+
+   Both of these parameters are optional.
+
+   An additional parameter, "CONVERSIONS", was defined in RFC 1341 but
+   has since been removed.  RFC 1341 also defined the use of a "NAME"
+   parameter which gave a suggested file name to be used if the data
+   were to be written to a file.  This has been deprecated in
+   anticipation of a separate Content-Disposition header field, to be
+   defined in a subsequent RFC.
+
+   The recommended action for an implementation that receives an
+   "application/octet-stream" entity is to simply offer to put the data
+   in a file, with any Content-Transfer-Encoding undone, or perhaps to
+   use it as input to a user-specified process.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 13]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   To reduce the danger of transmitting rogue programs, it is strongly
+   recommended that implementations NOT implement a path-search
+   mechanism whereby an arbitrary program named in the Content-Type
+   parameter (e.g., an "interpreter=" parameter) is found and executed
+   using the message body as input.
+
+4.5.2.  PostScript Subtype
+
+   A media type of "application/postscript" indicates a PostScript
+   program.  Currently two variants of the PostScript language are
+   allowed; the original level 1 variant is described in [POSTSCRIPT]
+   and the more recent level 2 variant is described in [POSTSCRIPT2].
+
+   PostScript is a registered trademark of Adobe Systems, Inc.  Use of
+   the MIME media type "application/postscript" implies recognition of
+   that trademark and all the rights it entails.
+
+   The PostScript language definition provides facilities for internal
+   labelling of the specific language features a given program uses.
+   This labelling, called the PostScript document structuring
+   conventions, or DSC, is very general and provides substantially more
+   information than just the language level.  The use of document
+   structuring conventions, while not required, is strongly recommended
+   as an aid to interoperability.  Documents which lack proper
+   structuring conventions cannot be tested to see whether or not they
+   will work in a given environment.  As such, some systems may assume
+   the worst and refuse to process unstructured documents.
+
+   The execution of general-purpose PostScript interpreters entails
+   serious security risks, and implementors are discouraged from simply
+   sending PostScript bodies to "off- the-shelf" interpreters.  While it
+   is usually safe to send PostScript to a printer, where the potential
+   for harm is greatly constrained by typical printer environments,
+   implementors should consider all of the following before they add
+   interactive display of PostScript bodies to their MIME readers.
+
+   The remainder of this section outlines some, though probably not all,
+   of the possible problems with the transport of PostScript entities.
+
+    (1)   Dangerous operations in the PostScript language
+          include, but may not be limited to, the PostScript
+          operators "deletefile", "renamefile", "filenameforall",
+          and "file".  "File" is only dangerous when applied to
+          something other than standard input or output.
+          Implementations may also define additional nonstandard
+          file operators; these may also pose a threat to
+          security. "Filenameforall", the wildcard file search
+          operator, may appear at first glance to be harmless.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 14]
+
+RFC 2046                      Media Types                  November 1996
+
+
+          Note, however, that this operator has the potential to
+          reveal information about what files the recipient has
+          access to, and this information may itself be
+          sensitive.  Message senders should avoid the use of
+          potentially dangerous file operators, since these
+          operators are quite likely to be unavailable in secure
+          PostScript implementations.  Message receiving and
+          displaying software should either completely disable
+          all potentially dangerous file operators or take
+          special care not to delegate any special authority to
+          their operation.  These operators should be viewed as
+          being done by an outside agency when interpreting
+          PostScript documents.  Such disabling and/or checking
+          should be done completely outside of the reach of the
+          PostScript language itself; care should be taken to
+          insure that no method exists for re-enabling full-
+          function versions of these operators.
+
+    (2)   The PostScript language provides facilities for exiting
+          the normal interpreter, or server, loop.  Changes made
+          in this "outer" environment are customarily retained
+          across documents, and may in some cases be retained
+          semipermanently in nonvolatile memory.  The operators
+          associated with exiting the interpreter loop have the
+          potential to interfere with subsequent document
+          processing.  As such, their unrestrained use
+          constitutes a threat of service denial.  PostScript
+          operators that exit the interpreter loop include, but
+          may not be limited to, the exitserver and startjob
+          operators.  Message sending software should not
+          generate PostScript that depends on exiting the
+          interpreter loop to operate, since the ability to exit
+          will probably be unavailable in secure PostScript
+          implementations.  Message receiving and displaying
+          software should completely disable the ability to make
+          retained changes to the PostScript environment by
+          eliminating or disabling the "startjob" and
+          "exitserver" operations.  If these operations cannot be
+          eliminated or completely disabled the password
+          associated with them should at least be set to a hard-
+          to-guess value.
+
+    (3)   PostScript provides operators for setting system-wide
+          and device-specific parameters.  These parameter
+          settings may be retained across jobs and may
+          potentially pose a threat to the correct operation of
+          the interpreter.  The PostScript operators that set
+          system and device parameters include, but may not be
+
+
+
+Freed & Borenstein          Standards Track                    [Page 15]
+
+RFC 2046                      Media Types                  November 1996
+
+
+          limited to, the "setsystemparams" and "setdevparams"
+          operators.  Message sending software should not
+          generate PostScript that depends on the setting of
+          system or device parameters to operate correctly.  The
+          ability to set these parameters will probably be
+          unavailable in secure PostScript implementations.
+          Message receiving and displaying software should
+          disable the ability to change system and device
+          parameters.  If these operators cannot be completely
+          disabled the password associated with them should at
+          least be set to a hard-to-guess value.
+
+    (4)   Some PostScript implementations provide nonstandard
+          facilities for the direct loading and execution of
+          machine code.  Such facilities are quite obviously open
+          to substantial abuse.  Message sending software should
+          not make use of such features.  Besides being totally
+          hardware-specific, they are also likely to be
+          unavailable in secure implementations of PostScript.
+          Message receiving and displaying software should not
+          allow such operators to be used if they exist.
+
+    (5)   PostScript is an extensible language, and many, if not
+          most, implementations of it provide a number of their
+          own extensions.  This document does not deal with such
+          extensions explicitly since they constitute an unknown
+          factor.  Message sending software should not make use
+          of nonstandard extensions; they are likely to be
+          missing from some implementations.  Message receiving
+          and displaying software should make sure that any
+          nonstandard PostScript operators are secure and don't
+          present any kind of threat.
+
+    (6)   It is possible to write PostScript that consumes huge
+          amounts of various system resources.  It is also
+          possible to write PostScript programs that loop
+          indefinitely.  Both types of programs have the
+          potential to cause damage if sent to unsuspecting
+          recipients.  Message-sending software should avoid the
+          construction and dissemination of such programs, which
+          is antisocial.  Message receiving and displaying
+          software should provide appropriate mechanisms to abort
+          processing after a reasonable amount of time has
+          elapsed. In addition, PostScript interpreters should be
+          limited to the consumption of only a reasonable amount
+          of any given system resource.
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 16]
+
+RFC 2046                      Media Types                  November 1996
+
+
+    (7)   It is possible to include raw binary information inside
+          PostScript in various forms.  This is not recommended
+          for use in Internet mail, both because it is not
+          supported by all PostScript interpreters and because it
+          significantly complicates the use of a MIME Content-
+          Transfer-Encoding.  (Without such binary, PostScript
+          may typically be viewed as line-oriented data.  The
+          treatment of CRLF sequences becomes extremely
+          problematic if binary and line-oriented data are mixed
+          in a single Postscript data stream.)
+
+    (8)   Finally, bugs may exist in some PostScript interpreters
+          which could possibly be exploited to gain unauthorized
+          access to a recipient's system.  Apart from noting this
+          possibility, there is no specific action to take to
+          prevent this, apart from the timely correction of such
+          bugs if any are found.
+
+4.5.3.  Other Application Subtypes
+
+   It is expected that many other subtypes of "application" will be
+   defined in the future.  MIME implementations must at a minimum treat
+   any unrecognized subtypes as being equivalent to "application/octet-
+   stream".
+
+5.  Composite Media Type Values
+
+   The remaining two of the seven initial Content-Type values refer to
+   composite entities.  Composite entities are handled using MIME
+   mechanisms -- a MIME processor typically handles the body directly.
+
+5.1.  Multipart Media Type
+
+   In the case of multipart entities, in which one or more different
+   sets of data are combined in a single body, a "multipart" media type
+   field must appear in the entity's header.  The body must then contain
+   one or more body parts, each preceded by a boundary delimiter line,
+   and the last one followed by a closing boundary delimiter line.
+   After its boundary delimiter line, each body part then consists of a
+   header area, a blank line, and a body area.  Thus a body part is
+   similar to an RFC 822 message in syntax, but different in meaning.
+
+   A body part is an entity and hence is NOT to be interpreted as
+   actually being an RFC 822 message.  To begin with, NO header fields
+   are actually required in body parts.  A body part that starts with a
+   blank line, therefore, is allowed and is a body part for which all
+   default values are to be assumed.  In such a case, the absence of a
+   Content-Type header usually indicates that the corresponding body has
+
+
+
+Freed & Borenstein          Standards Track                    [Page 17]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   a content-type of "text/plain; charset=US-ASCII".
+
+   The only header fields that have defined meaning for body parts are
+   those the names of which begin with "Content-".  All other header
+   fields may be ignored in body parts.  Although they should generally
+   be retained if at all possible, they may be discarded by gateways if
+   necessary.  Such other fields are permitted to appear in body parts
+   but must not be depended on.  "X-" fields may be created for
+   experimental or private purposes, with the recognition that the
+   information they contain may be lost at some gateways.
+
+   NOTE:  The distinction between an RFC 822 message and a body part is
+   subtle, but important.  A gateway between Internet and X.400 mail,
+   for example, must be able to tell the difference between a body part
+   that contains an image and a body part that contains an encapsulated
+   message, the body of which is a JPEG image.  In order to represent
+   the latter, the body part must have "Content-Type: message/rfc822",
+   and its body (after the blank line) must be the encapsulated message,
+   with its own "Content-Type: image/jpeg" header field.  The use of
+   similar syntax facilitates the conversion of messages to body parts,
+   and vice versa, but the distinction between the two must be
+   understood by implementors.  (For the special case in which parts
+   actually are messages, a "digest" subtype is also defined.)
+
+   As stated previously, each body part is preceded by a boundary
+   delimiter line that contains the boundary delimiter.  The boundary
+   delimiter MUST NOT appear inside any of the encapsulated parts, on a
+   line by itself or as the prefix of any line.  This implies that it is
+   crucial that the composing agent be able to choose and specify a
+   unique boundary parameter value that does not contain the boundary
+   parameter value of an enclosing multipart as a prefix.
+
+   All present and future subtypes of the "multipart" type must use an
+   identical syntax.  Subtypes may differ in their semantics, and may
+   impose additional restrictions on syntax, but must conform to the
+   required syntax for the "multipart" type.  This requirement ensures
+   that all conformant user agents will at least be able to recognize
+   and separate the parts of any multipart entity, even those of an
+   unrecognized subtype.
+
+   As stated in the definition of the Content-Transfer-Encoding field
+   [RFC 2045], no encoding other than "7bit", "8bit", or "binary" is
+   permitted for entities of type "multipart".  The "multipart" boundary
+   delimiters and header fields are always represented as 7bit US-ASCII
+   in any case (though the header fields may encode non-US-ASCII header
+   text as per RFC 2047) and data within the body parts can be encoded
+   on a part-by-part basis, with Content-Transfer-Encoding fields for
+   each appropriate body part.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 18]
+
+RFC 2046                      Media Types                  November 1996
+
+
+5.1.1.  Common Syntax
+
+   This section defines a common syntax for subtypes of "multipart".
+   All subtypes of "multipart" must use this syntax.  A simple example
+   of a multipart message also appears in this section.  An example of a
+   more complex multipart message is given in RFC 2049.
+
+   The Content-Type field for multipart entities requires one parameter,
+   "boundary". The boundary delimiter line is then defined as a line
+   consisting entirely of two hyphen characters ("-", decimal value 45)
+   followed by the boundary parameter value from the Content-Type header
+   field, optional linear whitespace, and a terminating CRLF.
+
+   NOTE:  The hyphens are for rough compatibility with the earlier RFC
+   934 method of message encapsulation, and for ease of searching for
+   the boundaries in some implementations.  However, it should be noted
+   that multipart messages are NOT completely compatible with RFC 934
+   encapsulations; in particular, they do not obey RFC 934 quoting
+   conventions for embedded lines that begin with hyphens.  This
+   mechanism was chosen over the RFC 934 mechanism because the latter
+   causes lines to grow with each level of quoting.  The combination of
+   this growth with the fact that SMTP implementations sometimes wrap
+   long lines made the RFC 934 mechanism unsuitable for use in the event
+   that deeply-nested multipart structuring is ever desired.
+
+   WARNING TO IMPLEMENTORS:  The grammar for parameters on the Content-
+   type field is such that it is often necessary to enclose the boundary
+   parameter values in quotes on the Content-type line.  This is not
+   always necessary, but never hurts. Implementors should be sure to
+   study the grammar carefully in order to avoid producing invalid
+   Content-type fields.  Thus, a typical "multipart" Content-Type header
+   field might look like this:
+
+     Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p
+
+   But the following is not valid:
+
+     Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p
+
+   (because of the colon) and must instead be represented as
+
+     Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"
+
+   This Content-Type value indicates that the content consists of one or
+   more parts, each with a structure that is syntactically identical to
+   an RFC 822 message, except that the header area is allowed to be
+   completely empty, and that the parts are each preceded by the line
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 19]
+
+RFC 2046                      Media Types                  November 1996
+
+
+     --gc0pJq0M:08jU534c0p
+
+   The boundary delimiter MUST occur at the beginning of a line, i.e.,
+   following a CRLF, and the initial CRLF is considered to be attached
+   to the boundary delimiter line rather than part of the preceding
+   part.  The boundary may be followed by zero or more characters of
+   linear whitespace. It is then terminated by either another CRLF and
+   the header fields for the next part, or by two CRLFs, in which case
+   there are no header fields for the next part.  If no Content-Type
+   field is present it is assumed to be "message/rfc822" in a
+   "multipart/digest" and "text/plain" otherwise.
+
+   NOTE:  The CRLF preceding the boundary delimiter line is conceptually
+   attached to the boundary so that it is possible to have a part that
+   does not end with a CRLF (line  break).  Body parts that must be
+   considered to end with line breaks, therefore, must have two CRLFs
+   preceding the boundary delimiter line, the first of which is part of
+   the preceding body part, and the second of which is part of the
+   encapsulation boundary.
+
+   Boundary delimiters must not appear within the encapsulated material,
+   and must be no longer than 70 characters, not counting the two
+   leading hyphens.
+
+   The boundary delimiter line following the last body part is a
+   distinguished delimiter that indicates that no further body parts
+   will follow.  Such a delimiter line is identical to the previous
+   delimiter lines, with the addition of two more hyphens after the
+   boundary parameter value.
+
+     --gc0pJq0M:08jU534c0p--
+
+   NOTE TO IMPLEMENTORS:  Boundary string comparisons must compare the
+   boundary value with the beginning of each candidate line.  An exact
+   match of the entire candidate line is not required; it is sufficient
+   that the boundary appear in its entirety following the CRLF.
+
+   There appears to be room for additional information prior to the
+   first boundary delimiter line and following the final boundary
+   delimiter line.  These areas should generally be left blank, and
+   implementations must ignore anything that appears before the first
+   boundary delimiter line or after the last one.
+
+   NOTE:  These "preamble" and "epilogue" areas are generally not used
+   because of the lack of proper typing of these parts and the lack of
+   clear semantics for handling these areas at gateways, particularly
+   X.400 gateways.  However, rather than leaving the preamble area
+   blank, many MIME implementations have found this to be a convenient
+
+
+
+Freed & Borenstein          Standards Track                    [Page 20]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   place to insert an explanatory note for recipients who read the
+   message with pre-MIME software, since such notes will be ignored by
+   MIME-compliant software.
+
+   NOTE:  Because boundary delimiters must not appear in the body parts
+   being encapsulated, a user agent must exercise care to choose a
+   unique boundary parameter value.  The boundary parameter value in the
+   example above could have been the result of an algorithm designed to
+   produce boundary delimiters with a very low probability of already
+   existing in the data to be encapsulated without having to prescan the
+   data.  Alternate algorithms might result in more "readable" boundary
+   delimiters for a recipient with an old user agent, but would require
+   more attention to the possibility that the boundary delimiter might
+   appear at the beginning of some line in the encapsulated part.  The
+   simplest boundary delimiter line possible is something like "---",
+   with a closing boundary delimiter line of "-----".
+
+   As a very simple example, the following multipart message has two
+   parts, both of them plain text, one of them explicitly typed and one
+   of them implicitly typed:
+
+     From: Nathaniel Borenstein <ns...@bellcore.com>
+     To: Ned Freed <ne...@innosoft.com>
+     Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST)
+     Subject: Sample message
+     MIME-Version: 1.0
+     Content-type: multipart/mixed; boundary="simple boundary"
+
+     This is the preamble.  It is to be ignored, though it
+     is a handy place for composition agents to include an
+     explanatory note to non-MIME conformant readers.
+
+     --simple boundary
+
+     This is implicitly typed plain US-ASCII text.
+     It does NOT end with a linebreak.
+     --simple boundary
+     Content-type: text/plain; charset=us-ascii
+
+     This is explicitly typed plain US-ASCII text.
+     It DOES end with a linebreak.
+
+     --simple boundary--
+
+     This is the epilogue.  It is also to be ignored.
+
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 21]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   The use of a media type of "multipart" in a body part within another
+   "multipart" entity is explicitly allowed.  In such cases, for obvious
+   reasons, care must be taken to ensure that each nested "multipart"
+   entity uses a different boundary delimiter.  See RFC 2049 for an
+   example of nested "multipart" entities.
+
+   The use of the "multipart" media type with only a single body part
+   may be useful in certain contexts, and is explicitly permitted.
+
+   NOTE: Experience has shown that a "multipart" media type with a
+   single body part is useful for sending non-text media types.  It has
+   the advantage of providing the preamble as a place to include
+   decoding instructions.  In addition, a number of SMTP gateways move
+   or remove the MIME headers, and a clever MIME decoder can take a good
+   guess at multipart boundaries even in the absence of the Content-Type
+   header and thereby successfully decode the message.
+
+   The only mandatory global parameter for the "multipart" media type is
+   the boundary parameter, which consists of 1 to 70 characters from a
+   set of characters known to be very robust through mail gateways, and
+   NOT ending with white space. (If a boundary delimiter line appears to
+   end with white space, the white space must be presumed to have been
+   added by a gateway, and must be deleted.)  It is formally specified
+   by the following BNF:
+
+     boundary := 0*69<bchars> bcharsnospace
+
+     bchars := bcharsnospace / " "
+
+     bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
+                      "+" / "_" / "," / "-" / "." /
+                      "/" / ":" / "=" / "?"
+
+   Overall, the body of a "multipart" entity may be specified as
+   follows:
+
+     dash-boundary := "--" boundary
+                      ; boundary taken from the value of
+                      ; boundary parameter of the
+                      ; Content-Type field.
+
+     multipart-body := [preamble CRLF]
+                       dash-boundary transport-padding CRLF
+                       body-part *encapsulation
+                       close-delimiter transport-padding
+                       [CRLF epilogue]
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 22]
+
+RFC 2046                      Media Types                  November 1996
+
+
+     transport-padding := *LWSP-char
+                          ; Composers MUST NOT generate
+                          ; non-zero length transport
+                          ; padding, but receivers MUST
+                          ; be able to handle padding
+                          ; added by message transports.
+
+     encapsulation := delimiter transport-padding
+                      CRLF body-part
+
+     delimiter := CRLF dash-boundary
+
+     close-delimiter := delimiter "--"
+
+     preamble := discard-text
+
+     epilogue := discard-text
+
+     discard-text := *(*text CRLF) *text
+                     ; May be ignored or discarded.
+
+     body-part := MIME-part-headers [CRLF *OCTET]
+                  ; Lines in a body-part must not start
+                  ; with the specified dash-boundary and
+                  ; the delimiter must not appear anywhere
+                  ; in the body part.  Note that the
+                  ; semantics of a body-part differ from
+                  ; the semantics of a message, as
+                  ; described in the text.
+
+     OCTET := <any 0-255 octet value>
+
+   IMPORTANT:  The free insertion of linear-white-space and RFC 822
+   comments between the elements shown in this BNF is NOT allowed since
+   this BNF does not specify a structured header field.
+
+   NOTE:  In certain transport enclaves, RFC 822 restrictions such as
+   the one that limits bodies to printable US-ASCII characters may not
+   be in force. (That is, the transport domains may exist that resemble
+   standard Internet mail transport as specified in RFC 821 and assumed
+   by RFC 822, but without certain restrictions.) The relaxation of
+   these restrictions should be construed as locally extending the
+   definition of bodies, for example to include octets outside of the
+   US-ASCII range, as long as these extensions are supported by the
+   transport and adequately documented in the Content- Transfer-Encoding
+   header field.  However, in no event are headers (either message
+   headers or body part headers) allowed to contain anything other than
+   US-ASCII characters.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 23]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   NOTE:  Conspicuously missing from the "multipart" type is a notion of
+   structured, related body parts. It is recommended that those wishing
+   to provide more structured or integrated multipart messaging
+   facilities should define subtypes of multipart that are syntactically
+   identical but define relationships between the various parts. For
+   example, subtypes of multipart could be defined that include a
+   distinguished part which in turn is used to specify the relationships
+   between the other parts, probably referring to them by their
+   Content-ID field.  Old implementations will not recognize the new
+   subtype if this approach is used, but will treat it as
+   multipart/mixed and will thus be able to show the user the parts that
+   are recognized.
+
+5.1.2.  Handling Nested Messages and Multiparts
+
+   The "message/rfc822" subtype defined in a subsequent section of this
+   document has no terminating condition other than running out of data.
+   Similarly, an improperly truncated "multipart" entity may not have
+   any terminating boundary marker, and can turn up operationally due to
+   mail system malfunctions.
+
+   It is essential that such entities be handled correctly when they are
+   themselves imbedded inside of another "multipart" structure.  MIME
+   implementations are therefore required to recognize outer level
+   boundary markers at ANY level of inner nesting.  It is not sufficient
+   to only check for the next expected marker or other terminating
+   condition.
+
+5.1.3.  Mixed Subtype
+
+   The "mixed" subtype of "multipart" is intended for use when the body
+   parts are independent and need to be bundled in a particular order.
+   Any "multipart" subtypes that an implementation does not recognize
+   must be treated as being of subtype "mixed".
+
+5.1.4.  Alternative Subtype
+
+   The "multipart/alternative" type is syntactically identical to
+   "multipart/mixed", but the semantics are different.  In particular,
+   each of the body parts is an "alternative" version of the same
+   information.
+
+   Systems should recognize that the content of the various parts are
+   interchangeable.  Systems should choose the "best" type based on the
+   local environment and references, in some cases even through user
+   interaction.  As with "multipart/mixed", the order of body parts is
+   significant.  In this case, the alternatives appear in an order of
+   increasing faithfulness to the original content.  In general, the
+
+
+
+Freed & Borenstein          Standards Track                    [Page 24]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   best choice is the LAST part of a type supported by the recipient
+   system's local environment.
+
+   "Multipart/alternative" may be used, for example, to send a message
+   in a fancy text format in such a way that it can easily be displayed
+   anywhere:
+
+     From: Nathaniel Borenstein <ns...@bellcore.com>
+     To: Ned Freed <ne...@innosoft.com>
+     Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST)
+     Subject: Formatted text mail
+     MIME-Version: 1.0
+     Content-Type: multipart/alternative; boundary=boundary42
+
+     --boundary42
+     Content-Type: text/plain; charset=us-ascii
+
+       ... plain text version of message goes here ...
+
+     --boundary42
+     Content-Type: text/enriched
+
+       ... RFC 1896 text/enriched version of same message
+           goes here ...
+
+     --boundary42
+     Content-Type: application/x-whatever
+
+       ... fanciest version of same message goes here ...
+
+     --boundary42--
+
+   In this example, users whose mail systems understood the
+   "application/x-whatever" format would see only the fancy version,
+   while other users would see only the enriched or plain text version,
+   depending on the capabilities of their system.
+
+   In general, user agents that compose "multipart/alternative" entities
+   must place the body parts in increasing order of preference, that is,
+   with the preferred format last.  For fancy text, the sending user
+   agent should put the plainest format first and the richest format
+   last.  Receiving user agents should pick and display the last format
+   they are capable of displaying.  In the case where one of the
+   alternatives is itself of type "multipart" and contains unrecognized
+   sub-parts, the user agent may choose either to show that alternative,
+   an earlier alternative, or both.
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 25]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   NOTE: From an implementor's perspective, it might seem more sensible
+   to reverse this ordering, and have the plainest alternative last.
+   However, placing the plainest alternative first is the friendliest
+   possible option when "multipart/alternative" entities are viewed
+   using a non-MIME-conformant viewer.  While this approach does impose
+   some burden on conformant MIME viewers, interoperability with older
+   mail readers was deemed to be more important in this case.
+
+   It may be the case that some user agents, if they can recognize more
+   than one of the formats, will prefer to offer the user the choice of
+   which format to view.  This makes sense, for example, if a message
+   includes both a nicely- formatted image version and an easily-edited
+   text version.  What is most critical, however, is that the user not
+   automatically be shown multiple versions of the same data.  Either
+   the user should be shown the last recognized version or should be
+   given the choice.
+
+   THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE:  Each part of a
+   "multipart/alternative" entity represents the same data, but the
+   mappings between the two are not necessarily without information
+   loss.  For example, information is lost when translating ODA to
+   PostScript or plain text.  It is recommended that each part should
+   have a different Content-ID value in the case where the information
+   content of the two parts is not identical.  And when the information
+   content is identical -- for example, where several parts of type
+   "message/external-body" specify alternate ways to access the
+   identical data -- the same Content-ID field value should be used, to
+   optimize any caching mechanisms that might be present on the
+   recipient's end.  However, the Content-ID values used by the parts
+   should NOT be the same Content-ID value that describes the
+   "multipart/alternative" as a whole, if there is any such Content-ID
+   field.  That is, one Content-ID value will refer to the
+   "multipart/alternative" entity, while one or more other Content-ID
+   values will refer to the parts inside it.
+
+5.1.5.  Digest Subtype
+
+   This document defines a "digest" subtype of the "multipart" Content-
+   Type.  This type is syntactically identical to "multipart/mixed", but
+   the semantics are different.  In particular, in a digest, the default
+   Content-Type value for a body part is changed from "text/plain" to
+   "message/rfc822".  This is done to allow a more readable digest
+   format that is largely compatible (except for the quoting convention)
+   with RFC 934.
+
+   Note: Though it is possible to specify a Content-Type value for a
+   body part in a digest which is other than "message/rfc822", such as a
+   "text/plain" part containing a description of the material in the
+
+
+
+Freed & Borenstein          Standards Track                    [Page 26]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   digest, actually doing so is undesireble. The "multipart/digest"
+   Content-Type is intended to be used to send collections of messages.
+   If a "text/plain" part is needed, it should be included as a seperate
+   part of a "multipart/mixed" message.
+
+   A digest in this format might, then, look something like this:
+
+     From: Moderator-Address
+     To: Recipient-List
+     Date: Mon, 22 Mar 1994 13:34:51 +0000
+     Subject: Internet Digest, volume 42
+     MIME-Version: 1.0
+     Content-Type: multipart/mixed;
+                   boundary="---- main boundary ----"
+
+     ------ main boundary ----
+
+       ...Introductory text or table of contents...
+
+     ------ main boundary ----
+     Content-Type: multipart/digest;
+                   boundary="---- next message ----"
+
+     ------ next message ----
+
+     From: someone-else
+     Date: Fri, 26 Mar 1993 11:13:32 +0200
+     Subject: my opinion
+
+       ...body goes here ...
+
+     ------ next message ----
+
+     From: someone-else-again
+     Date: Fri, 26 Mar 1993 10:07:13 -0500
+     Subject: my different opinion
+
+       ... another body goes here ...
+
+     ------ next message ------
+
+     ------ main boundary ------
+
+5.1.6.  Parallel Subtype
+
+   This document defines a "parallel" subtype of the "multipart"
+   Content-Type.  This type is syntactically identical to
+   "multipart/mixed", but the semantics are different.  In particular,
+
+
+
+Freed & Borenstein          Standards Track                    [Page 27]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   in a parallel entity, the order of body parts is not significant.
+
+   A common presentation of this type is to display all of the parts
+   simultaneously on hardware and software that are capable of doing so.
+   However, composing agents should be aware that many mail readers will
+   lack this capability and will show the parts serially in any event.
+
+5.1.7.  Other Multipart Subtypes
+
+   Other "multipart" subtypes are expected in the future.  MIME
+   implementations must in general treat unrecognized subtypes of
+   "multipart" as being equivalent to "multipart/mixed".
+
+5.2.  Message Media Type
+
+   It is frequently desirable, in sending mail, to encapsulate another
+   mail message.  A special media type, "message", is defined to
+   facilitate this.  In particular, the "rfc822" subtype of "message" is
+   used to encapsulate RFC 822 messages.
+
+   NOTE:  It has been suggested that subtypes of "message" might be
+   defined for forwarded or rejected messages.  However, forwarded and
+   rejected messages can be handled as multipart messages in which the
+   first part contains any control or descriptive information, and a
+   second part, of type "message/rfc822", is the forwarded or rejected
+   message.  Composing rejection and forwarding messages in this manner
+   will preserve the type information on the original message and allow
+   it to be correctly presented to the recipient, and hence is strongly
+   encouraged.
+
+   Subtypes of "message" often impose restrictions on what encodings are
+   allowed.  These restrictions are described in conjunction with each
+   specific subtype.
+
+   Mail gateways, relays, and other mail handling agents are commonly
+   known to alter the top-level header of an RFC 822 message.  In
+   particular, they frequently add, remove, or reorder header fields.
+   These operations are explicitly forbidden for the encapsulated
+   headers embedded in the bodies of messages of type "message."
+
+5.2.1.  RFC822 Subtype
+
+   A media type of "message/rfc822" indicates that the body contains an
+   encapsulated message, with the syntax of an RFC 822 message.
+   However, unlike top-level RFC 822 messages, the restriction that each
+   "message/rfc822" body must include a "From", "Date", and at least one
+   destination header is removed and replaced with the requirement that
+   at least one of "From", "Subject", or "Date" must be present.
+
+
+
+Freed & Borenstein          Standards Track                    [Page 28]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   It should be noted that, despite the use of the numbers "822", a
+   "message/rfc822" entity isn't restricted to material in strict
+   conformance to RFC822, nor are the semantics of "message/rfc822"
+   objects restricted to the semantics defined in RFC822. More
+   specifically, a "message/rfc822" message could well be a News article
+   or a MIME message.
+
+   No encoding other than "7bit", "8bit", or "binary" is permitted for
+   the body of a "message/rfc822" entity.  The message header fields are
+   always US-ASCII in any case, and data within the body can still be
+   encoded, in which case the Content-Transfer-Encoding header field in
+   the encapsulated message will reflect this.  Non-US-ASCII text in the
+   headers of an encapsulated message can be specified using the
+   mechanisms described in RFC 2047.
+
+5.2.2.  Partial Subtype
+
+   The "partial" subtype is defined to allow large entities to be
+   delivered as several separate pieces of mail and automatically
+   reassembled by a receiving user agent.  (The concept is similar to IP
+   fragmentation and reassembly in the basic Internet Protocols.)  This
+   mechanism can be used when intermediate transport agents limit the
+   size of individual messages that can be sent.  The media type
+   "message/partial" thus indicates that the body contains a fragment of
+   a larger entity.
+
+   Because data of type "message" may never be encoded in base64 or
+   quoted-printable, a problem might arise if "message/partial" entities
+   are constructed in an environment that supports binary or 8bit
+   transport.  The problem is that the binary data would be split into
+   multiple "message/partial" messages, each of them requiring binary
+   transport.  If such messages were encountered at a gateway into a
+   7bit transport environment, there would be no way to properly encode
+   them for the 7bit world, aside from waiting for all of the fragments,
+   reassembling the inner message, and then encoding the reassembled
+   data in base64 or quoted-printable.  Since it is possible that
+   different fragments might go through different gateways, even this is
+   not an acceptable solution.  For this reason, it is specified that
+   entities of type "message/partial" must always have a content-
+   transfer-encoding of 7bit (the default).  In particular, even in
+   environments that support binary or 8bit transport, the use of a
+   content- transfer-encoding of "8bit" or "binary" is explicitly
+   prohibited for MIME entities of type "message/partial". This in turn
+   implies that the inner message must not use "8bit" or "binary"
+   encoding.
+
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 29]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   Because some message transfer agents may choose to automatically
+   fragment large messages, and because such agents may use very
+   different fragmentation thresholds, it is possible that the pieces of
+   a partial message, upon reassembly, may prove themselves to comprise
+   a partial message.  This is explicitly permitted.
+
+   Three parameters must be specified in the Content-Type field of type
+   "message/partial":  The first, "id", is a unique identifier, as close
+   to a world-unique identifier as possible, to be used to match the
+   fragments together. (In general, the identifier is essentially a
+   message-id; if placed in double quotes, it can be ANY message-id, in
+   accordance with the BNF for "parameter" given in RFC 2045.)  The
+   second, "number", an integer, is the fragment number, which indicates
+   where this fragment fits into the sequence of fragments.  The third,
+   "total", another integer, is the total number of fragments.  This
+   third subfield is required on the final fragment, and is optional
+   (though encouraged) on the earlier fragments.  Note also that these
+   parameters may be given in any order.
+
+   Thus, the second piece of a 3-piece message may have either of the
+   following header fields:
+
+     Content-Type: Message/Partial; number=2; total=3;
+                   id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
+
+     Content-Type: Message/Partial;
+                   id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
+                   number=2
+
+   But the third piece MUST specify the total number of fragments:
+
+     Content-Type: Message/Partial; number=3; total=3;
+                   id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
+
+   Note that fragment numbering begins with 1, not 0.
+
+   When the fragments of an entity broken up in this manner are put
+   together, the result is always a complete MIME entity, which may have
+   its own Content-Type header field, and thus may contain any other
+   data type.
+
+5.2.2.1.  Message Fragmentation and Reassembly
+
+   The semantics of a reassembled partial message must be those of the
+   "inner" message, rather than of a message containing the inner
+   message.  This makes it possible, for example, to send a large audio
+   message as several partial messages, and still have it appear to the
+   recipient as a simple audio message rather than as an encapsulated
+
+
+
+Freed & Borenstein          Standards Track                    [Page 30]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   message containing an audio message.  That is, the encapsulation of
+   the message is considered to be "transparent".
+
+   When generating and reassembling the pieces of a "message/partial"
+   message, the headers of the encapsulated message must be merged with
+   the headers of the enclosing entities.  In this process the following
+   rules must be observed:
+
+    (1)   Fragmentation agents must split messages at line
+          boundaries only. This restriction is imposed because
+          splits at points other than the ends of lines in turn
+          depends on message transports being able to preserve
+          the semantics of messages that don't end with a CRLF
+          sequence. Many transports are incapable of preserving
+          such semantics.
+
+    (2)   All of the header fields from the initial enclosing
+          message, except those that start with "Content-" and
+          the specific header fields "Subject", "Message-ID",
+          "Encrypted", and "MIME-Version", must be copied, in
+          order, to the new message.
+
+    (3)   The header fields in the enclosed message which start
+          with "Content-", plus the "Subject", "Message-ID",
+          "Encrypted", and "MIME-Version" fields, must be
+          appended, in order, to the header fields of the new
+          message.  Any header fields in the enclosed message
+          which do not start with "Content-" (except for the
+          "Subject", "Message-ID", "Encrypted", and "MIME-
+          Version" fields) will be ignored and dropped.
+
+    (4)   All of the header fields from the second and any
+          subsequent enclosing messages are discarded by the
+          reassembly process.
+
+5.2.2.2.  Fragmentation and Reassembly Example
+
+   If an audio message is broken into two pieces, the first piece might
+   look something like this:
+
+     X-Weird-Header-1: Foo
+     From: Bill@host.com
+     To: joe@otherhost.com
+     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
+     Subject: Audio mail (part 1 of 2)
+     Message-ID: <id...@host.com>
+     MIME-Version: 1.0
+     Content-type: message/partial; id="ABC@host.com";
+
+
+
+Freed & Borenstein          Standards Track                    [Page 31]
+
+RFC 2046                      Media Types                  November 1996
+
+
+                   number=1; total=2
+
+     X-Weird-Header-1: Bar
+     X-Weird-Header-2: Hello
+     Message-ID: <an...@foo.com>
+     Subject: Audio mail
+     MIME-Version: 1.0
+     Content-type: audio/basic
+     Content-transfer-encoding: base64
+
+       ... first half of encoded audio data goes here ...
+
+   and the second half might look something like this:
+
+     From: Bill@host.com
+     To: joe@otherhost.com
+     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
+     Subject: Audio mail (part 2 of 2)
+     MIME-Version: 1.0
+     Message-ID: <id...@host.com>
+     Content-type: message/partial;
+                   id="ABC@host.com"; number=2; total=2
+
+       ... second half of encoded audio data goes here ...
+
+   Then, when the fragmented message is reassembled, the resulting
+   message to be displayed to the user should look something like this:
+
+     X-Weird-Header-1: Foo
+     From: Bill@host.com
+     To: joe@otherhost.com
+     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
+     Subject: Audio mail
+     Message-ID: <an...@foo.com>
+     MIME-Version: 1.0
+     Content-type: audio/basic
+     Content-transfer-encoding: base64
+
+       ... first half of encoded audio data goes here ...
+       ... second half of encoded audio data goes here ...
+
+   The inclusion of a "References" field in the headers of the second
+   and subsequent pieces of a fragmented message that references the
+   Message-Id on the previous piece may be of benefit to mail readers
+   that understand and track references.  However, the generation of
+   such "References" fields is entirely optional.
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 32]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   Finally, it should be noted that the "Encrypted" header field has
+   been made obsolete by Privacy Enhanced Messaging (PEM) [RFC-1421,
+   RFC-1422, RFC-1423, RFC-1424], but the rules above are nevertheless
+   believed to describe the correct way to treat it if it is encountered
+   in the context of conversion to and from "message/partial" fragments.
+
+5.2.3.  External-Body Subtype
+
+   The external-body subtype indicates that the actual body data are not
+   included, but merely referenced.  In this case, the parameters
+   describe a mechanism for accessing the external data.
+
+   When a MIME entity is of type "message/external-body", it consists of
+   a header, two consecutive CRLFs, and the message header for the
+   encapsulated message.  If another pair of consecutive CRLFs appears,
+   this of course ends the message header for the encapsulated message.
+   However, since the encapsulated message's body is itself external, it
+   does NOT appear in the area that follows.  For example, consider the
+   following message:
+
+     Content-type: message/external-body;
+                   access-type=local-file;
+                   name="/u/nsb/Me.jpeg"
+
+     Content-type: image/jpeg
+     Content-ID: <id...@guppylake.bellcore.com>
+     Content-Transfer-Encoding: binary
+
+     THIS IS NOT REALLY THE BODY!
+
+   The area at the end, which might be called the "phantom body", is
+   ignored for most external-body messages.  However, it may be used to
+   contain auxiliary information for some such messages, as indeed it is
+   when the access-type is "mail- server".  The only access-type defined
+   in this document that uses the phantom body is "mail-server", but
+   other access-types may be defined in the future in other
+   specifications that use this area.
+
+   The encapsulated headers in ALL "message/external-body" entities MUST
+   include a Content-ID header field to give a unique identifier by
+   which to reference the data.  This identifier may be used for caching
+   mechanisms, and for recognizing the receipt of the data when the
+   access-type is "mail-server".
+
+   Note that, as specified here, the tokens that describe external-body
+   data, such as file names and mail server commands, are required to be
+   in the US-ASCII character set.
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 33]
+
+RFC 2046                      Media Types                  November 1996
+
+
+   If this proves problematic in practice, a new mechanism may be
+   required as a future extension to MIME, either as newly defined
+   access-types for "message/external-body" or by some other mechanism.
+
+   As with "message/partial", MIME entities of type "message/external-
+   body" MUST have a content-transfer-encoding of 7bit (the default).
+   In particular, even in environments that support binary or 8bit
+   transport, the use of a content- transfer-encoding of "8bit" or
+   "binary" is explicitly prohibited for entities of type
+   "message/external-body".
+
+5.2.3.1.  General External-Body Parameters
+
+   The parameters that may be used with any "message/external- body"
+   are:
+
+    (1)   ACCESS-TYPE -- A word indicating the supported access
+          mechanism by which the file or data may be obtained.
+          This word is not case sensitive.  Values include, but
+          are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL-
+          FILE", and "MAIL-SERVER".  Future values, except for
+          experimental values beginning with "X-", must be
+          registered with IANA, as described in RFC 2048.
+          This parameter is unconditionally mandatory and MUST be
+          present on EVERY "message/external-body".
+
+    (2)   EXPIRATION -- The date (in the RFC 822 "date-time"
+          syntax, as extended by RFC 1123 to permit 4 digits in
+          the year field) after which the existence of the
+          external data is not guaranteed.  This parameter may be
+          used with ANY access-type and is ALWAYS optional.
+
+    (3)   SIZE -- The size (in octets) of the data.  The intent
+          of this parameter is to help the recipient decide
+          whether or not to expend the necessary resources to
+          retrieve the external data.  Note that this describes
+          the size of the data in its canonical form, that is,
+          before any Content-Transfer-Encoding has been applied
+          or after the data have been decoded.  This parameter
+          may be used with ANY access-type and is ALWAYS
+          optional.
+
+    (4)   PERMISSION -- A case-insensitive field that indicates
+          whether or not it is expected that clients might also
+          attempt to overwrite the data.  By default, or if
+          permission is "read", the assumption is that they are
+          not, and that if the data is retrieved once, it is
+          never needed again.  If PERMISSION is "read-write",
+
+
+
+Freed & Borenstein          Standards Track                    [Page 34]
+
+RFC 2046                      Media Types                  November 1996
+
+
+          this assumption is invalid, and any local copy must be
+          considered no more than a cache.  "Read" and "Read-
+          write" are the only defined values of permission.  This
+          parameter may be used with ANY access-type and is
+          ALWAYS optional.
+
+   The precise semantics of the access-types defined here are described
+   in the sections that follow.
+
+5.2.3.2.  The 'ftp' and 'tftp' Access-Types
+
+   An access-type of FTP or TFTP indicates that the message body is
+   accessible as a file using the FTP [RFC-959] or TFTP [RFC- 783]
+   protocols, respectively.  For these access-types, the following
+   additional parameters are mandatory:
+
+    (1)   NAME -- The name of the file that contains the actual
+          body data.
+
+    (2)   SITE -- A machine from which the file may be obtained,
+          using the given protocol.  This must be a fully
+          qualified domain name, not a nickname.
+
+    (3)   Before any data are retrieved, using FTP, the user will
+          generally need to be asked to provide a login id and a
+          password for the machine named by the site parameter.
+          For security reasons, such an id and password are not
+          specified as content-type parameters, but must be
+          obtained from the user.
+

[... 530 lines stripped ...]


---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org