.lf 1 - .lf 1 mparse.0 .\" Description: mparse man page .\" $Id: ~|^` @(#)mparse.0 0.106 2006/08/06 00:37:26 copyright 2001, 2002, 2003, 2004, 2005, 2006 Bruce Lilly \\ $ .\" common man macros to V7, V10, DWB2 (unique ones omitted, differences noted) .\" .TH n c x V7,10 begin page n of chapter c; x is extra commentary .\" .TH t s c n DWB2 beg. pg. t of sect. s; c=extra comment, n=new man. name .\" c appears at bottom center of page, n at top center .\" .SH text subhead .\" .SS text sub-subhead .\" .B text make text bold .\" .I text make text italic .\" .SM text make text 1 point smaller than default .\" .RI a b concatenate and alternate Roman, Italic fonts <=6 args .\" .IR .RB .BR .IB .BI similar to .RI .\" .PP new paragraph .\" .HP in hanging paragraph with indent in .\" .TP in indented paragraph with hanging tag (on next line) .\" .IP t in indented paragraph with hanging tag t (arg 1) .\" .RS in increase relative indent by in .\" .RE k return to kth relative indent level (1-based) .\" .DT default tab settings .\" .PD v inter-paragraph spacing v (default 0.4v troff, 1v nroff) .\" \*R registered symbol (Reg.) .\" \*S change to default type size .\" prevent hyphenation of function names, etc. 
.lf 1 mparse.hw .hw -lmailparse .hw mailparse .hw -lmparse .hw mparse .hw parameters .hw memset .hw initialize_lexer .hw add_mode .hw clear_modes .hw in_mode .hw hook_received_by_domain_validate .hw hook_received_from_domain_validate .hw moderated .hw unsigned .hw token .hw Content .hw Content-Base .hw Content-Transfer-Encoding .hw mailbox .hw message .hw multipart .hw ioerror .hw encapsulated .hw data .hw encword .hw remove_mode .hw separator .hw arguments .hw BRACKETS .hw EMULTIADDRESS .hw process_message .hw parameter_string .hw Content-Type .hw epilogue .hw strerror .hw preamble .hw addr-specs .hw MIME-Version .hw Version .hw install_and_process_field .hw append_body_from_file .hw append_body_line .hw message/delivery-status .hw delivery-status .hw delivery .hw local-part .hw local .hw Followup-To .hw Followup .hw multipart/voice-message .hw voice-message .hw protocol_status .hw quoted-printable .hw experimental .hw const .hw Auto-Submitted .hw Auto .hw MPARSE_PRIMARY_ROLE_ACCESS .hw MPARSE_PRIMARY_ROLE_GENERATION .hw MPARSE_PRIMARY_ROLE_TRANSPORT .hw MPARSE_PRIMARY_ROLE_VALIDATION .hw MPARSE_SECONDARY_ROLE_TRANSFORM .hw MPARSE_SECONDARY_ROLE_REPAIR .hw MPARSE_STRING_MODE_UNFOLD .hw MPARSE_INTERPOLATE_AT_DELIMITER .hw MPARSE_INTERPOLATE_DISPLAY_NAME .hw MPARSE_INTERPOLATE_INCLUDE_COMMENTS .hw application/octet-stream .hw also .hw canonicalize .hw MPARSE_REQUIREMENTS_MUST .hw MPARSE_ERROR_COUNT_NO_COUNT .hw messages .hw MPARSE_ENCODED_WORD_IN_COMMENT .hw Disposition-Notification-Options .hw MPARSE_SUFFIX_NO_SUFFIX .hw Newsgroups .hw insert_field .hw mparse_insert_field .hw mparse_initialize_lexer .hw mparse_add_mode .hw mparse_clear_modes .hw mparse_in_mode .hw mparse_remove_mode .hw mparse_process_message .hw mparse_parameter_string .hw mparse_strerror .hw mparse_install_and_process_field .hw mparse_append_body_from_file .hw mparse_append_body_line .hw mparse_protocol_status .lf 26 mparse.0 .\" .DS .DE for -man ############################################## .lf 
1 keepmacro.s .\" .DS .DE for -man ############################################## .\" $Id: ~|^` @(#) keepmacro.s 0.99.0.0 2005/05/01 23:03:49 Bruce Lilly extracted 202006/08/05 23:19:19\ $ .nr dD 0 \" not in keep .de DS \" keep (display) start .ie \\n(dD .tm WARNING: \\n(.F line \\n(.c: nested .DS .el \{.nr dD 1 \" in keep .nr dI \\n(IN \" current indent .nr dL \\n(LL \" current line length .if !\\n(dL .nr dL 6.5i \" make sure line length is not zero (default is 6.5i) .br \" finish current output .ev 1 \" set new environment for keep .in \\n(dIu \" set environment indent .ll \\n(dLu \" and line length same as page .di dC\} \" divert content into diversion dC .. .de DE \" keep (display) end .ie !\\n(dD .tm WARNING: \\n(.F line \\n(.c: .DE without .DS .el \{.br .di \" end keep diversion .ev \" pop keep environment .nr dH \\n(dn \" diversion height .nr dD 0 \" not in keep 'ne \\n(dHu \" break page if diversion won't fit on current page .ev 2 \" new environment for interpolation .in 0i \" zero indent for interpolating formatted diversion .ll \\n(dLu \" line length same as page .nf \" no fill mode for interpolating formatted content 'dC \" interpolate diverted content .br \" finish output .rm dC \" finished with diverted content .ev\} \" pop interpolation environment .. 
.\" ########################### end of keep macros ########################### .lf 28 mparse.0 .lg 0 .\" avoid groff's butt-ugly ligatures .ds ]W \" no 7th Edition designation .TH mparse 3 "release 0.106" .\" .TS [H], .TH [N] (redefinition), .TE for tables with -man ####### .lf 1 tblmacro.s .\" .TS [H], .TH [N] (redefinition), .TE for tables with -man ####### .\" @(#) tblmacro.s 1.2.0.0 202005/04/26 14:11:01 Bruce Lilly .de TS \" table start .rn TH tH \" move title heading out of way for table heading .rn xH TH \" table heading in effect for .TH .sp \\n(PDu \" inter-paragraph space (paragraph distance) .nr tI \\n(IN \" current indent .nr tL \\n(LL \" current line length .ev 1 \" set new environment for table; avoid gtbl polluting page environment .in \\n(tIu \" set environment indent .ll \\n(tLu \" and line length same as page .if x\\$1xHx \{'ne 5v \" .TS H .wh 1i rH \" 1i page trap after new page header .di hT\} \" divert heading into diversion hT .. .de xH \" becomes TH for tables .di \" end diversion hT .ie x\\$1xNx .br \" .TH N means no header 1st time .el \{.ev 2 \" new environment .in 0i \" zero indent for interpolating formatted diversion .nf \" no fill mode for interpolating formatted header .hT \" interpolate diverted table header .ev\} \" pop environment .nr iT 1 \" in table (with header) .. .de rH \" repeat header via page trap .if \\n(iT \{.ev 2 \" do nothing if not in table .in 0i \" zero indent for interpolating formatted diversion .ch rH 0.5i \" avoid retriggering page trap .sp |1i \" ensure correct position .nf \" no fill mode for interpolating formatted header .hT \" interpolate table header .ch rH 1i \" restore page trap .ev\} \" pop environment .. .de TE \" table end .rn TH xH \" hide table heading macro .rn tH TH \" restore title heading macro for .TH .rm hT \" finished with diverted heading .nr iT 0 \" not in table .ev \" pop table environment .sp \\n(PDu \" inter-paragraph space (paragraph distance) .. 
.\" ########################### end of table macros ########################### .lf 33 mparse.0 .SH NAME \" 1 line name \- explanatory text .B mparse \- parse and generate Internet text messages and their components .SH SYNOPSIS \fB#include \fP .PP .B int mparse_parse(struct mparse_message *, FILE *, FILE *); .PP .B . . . .PP .SH DESCRIPTION .B Mparse is a library of functions for parsing and generating Internet Message Format messages. Such messages are used by a number of applications, such as email, Internet fax, voice messaging, EDI, and Usenet news. Each message consists of a header, which contains one or more fields, and usually some body content (a more detailed description of message format is provided below). .PP This software is OSI Certified Open Source Software. OSI Certified is a certification mark of the Open Source Initiative. .PP To understand the role of messages in applications, as well as the role of mparse, some definitions of terms are in order: .PP A format is an arrangement of content according to a combination of defined syntax and semantics for transferring said content in a manner which facilitates processing by machine. There are several types of formats, some of which interact in complex ways. The Internet Message Format breaks content into three broad categories: the message header, which has defined fields with specified syntax; a separator line, which serves to distinguish message header from message body; and finally the message body, which consists of lines of text. The header fields each consist of a field name, a colon character which separates the field name from the field body, and finally the field body, which has a defined syntax. MIME is a collection of standards which describe how to represent various types of complex text, non-textual content, and combinations of content within the Internet Message Format (and which have been extended to other contexts such as HTTP). 
MIME defines a number of media types, including message/rfc822 which can be used to encapsulate an Internet Message Format message within another Internet Message. Various other media types are defined by MIME and extensions to the MIME standards. These fit into two broad categories: discrete media types used to represent a particular type of content, and composite media types which represent potentially complex collections of content. The composite types are the message and multipart categories, each with several subtypes; discrete types include various subtypes of text, audio, image, video, and model types, as well as application-specific types. Many media types have formally specified formats. .PP A network protocol is a defined procedure for regulating data transfer. Examples of network protocols include Simple Mail Transfer Protocol (SMTP), Local Mail Transfer Protocol (LMTP), Network News Transfer Protocol (NNTP), File Transfer Protocol (FTP), Internet Message Access Protocol (IMAP), Message Posting Protocol (MPP), and Post Office Protocol (POP). .PP An application consists of a collection of programs that implement some function, possibly using network protocols to transfer some content in a defined format from a source to one or more destinations. Examples of applications include email, Usenet news, Internet fax, voice messaging, and Electronic Data Interchange (EDI). An application is an end-to-end process. .DS .PP The relationship between applications, the Internet Message Format, transport protocols, etc. 
can be seen in the following diagram, which is based on the OSI model: .lf 115 mparse.0 .PS 2.750i 6.000i .\" 0 -2.25 6 0.5 .\" 0.000i 2.750i 6.000i 0.000i .nr 00 \n(.u .nf .nr 0x 1 \h'6.000i' .sp -1 \D't -1.000p'\h'1.000p' .sp -1 \h'6.000i'\v'1.000i'\D'p0.000i -1.000i -6.000i 0.000i 0.000i 1.000i' .sp -1 .lf 116 \h'0.050i'\v'0.500i-(0v/2u)+0v+0.22m'Application .sp -1 \h'6.000i'\v'0.250i'\D'p0.000i -0.250i -5.000i 0.000i 0.000i 0.250i' .sp -1 .lf 117 \h'3.500i-(\w'application-specific and media-specific fields and media types'u/2u)'\v'0.125i-(0v/2u)+0v+0.22m'application-specific and media-specific fields and media types .sp -1 \h'6.000i'\v'0.500i'\D'p0.000i -0.250i -5.000i 0.000i 0.000i 0.250i' .sp -1 .lf 118 \h'3.500i-(\w'MIME message structure, media types, and fields'u/2u)'\v'0.375i-(0v/2u)+0v+0.22m'MIME message structure, media types, and fields .sp -1 \h'6.000i'\v'0.750i'\D'p0.000i -0.250i -5.000i 0.000i 0.000i 0.250i' .sp -1 .lf 119 \h'3.500i-(\w'Internet Message Format (header fields, separator, body)'u/2u)'\v'0.625i-(0v/2u)+0v+0.22m'Internet Message Format (header fields, separator, body) .sp -1 \h'6.000i'\v'1.000i'\D'p0.000i -0.250i -5.000i 0.000i 0.000i 0.250i' .sp -1 .lf 120 \h'3.500i-(\w'low-level field body components'u/2u)'\v'0.875i-(0v/2u)+0v+0.22m'low-level field body components .sp -1 \h'6.000i'\v'1.750i'\D'p0.000i -0.750i -6.000i 0.000i 0.000i 0.750i' .sp -1 .lf 121 \h'0.050i'\v'1.250i-(0v/2u)+0v+0.22m'Presentation .sp -1 .lf 121 \h'0.050i'\v'1.500i-(0v/2u)+0v+0.22m'Session .sp -1 \h'6.000i'\v'1.750i'\D'p0.000i -0.750i -5.000i 0.000i 0.000i 0.750i' .sp -1 .lf 122 \h'3.500i-(\w'encoding, encryption, packaging'u/2u)'\v'1.375i-(3v/2u)+0v+0.22m'encoding, encryption, packaging .sp -1 .lf 122 \h'3.500i-(\w'LMTP, SMTP, MPP, MTP, EMSD, NNTP, FTP, IMAP, POP, rmail, rnews, submission'u/2u)'\v'1.375i-(3v/2u)+1v+0.22m'LMTP, SMTP, MPP, MTP, EMSD, NNTP, FTP, IMAP, POP, rmail, rnews, submission .sp -1 .lf 122 \h'3.500i-(\w'supplementary notification messages: 
MDN, DSN, MTSN, bounces, rejections'u/2u)'\v'1.375i-(3v/2u)+2v+0.22m'supplementary notification messages: MDN, DSN, MTSN, bounces, rejections .sp -1 .lf 122 \h'3.500i-(\w'SIP'u/2u)'\v'1.375i-(3v/2u)+3v+0.22m'SIP .sp -1 \h'0.000i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.106i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.211i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.317i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.422i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.528i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.633i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.739i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.844i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'0.950i'\v'1.375i'\D'l0.050i 0.000i' .sp -1 \h'1.000i'\v'2.000i'\D'p0.000i -0.250i -1.000i 0.000i 0.000i 0.250i' .sp -1 .lf 124 \h'0.050i'\v'1.875i-(0v/2u)+0v+0.22m'Transport .sp -1 \h'4.000i'\v'2.000i'\D'p0.000i -0.250i -3.000i 0.000i 0.000i 0.250i' .sp -1 .lf 125 \h'2.500i-(\w'TCP'u/2u)'\v'1.875i-(0v/2u)+0v+0.22m'TCP .sp -1 \h'1.000i'\v'2.250i'\D'p0.000i -0.250i -1.000i 0.000i 0.000i 0.250i' .sp -1 .lf 126 \h'0.050i'\v'2.125i-(0v/2u)+0v+0.22m'Network .sp -1 \h'4.000i'\v'2.250i'\D'p0.000i -0.250i -3.000i 0.000i 0.000i 0.250i' .sp -1 .lf 127 \h'2.500i-(\w'IP'u/2u)'\v'2.125i-(0v/2u)+0v+0.22m'IP .sp -1 \h'6.000i'\v'2.250i'\D'p0.000i -0.500i -2.000i 0.000i 0.000i 0.500i' .sp -1 .lf 128 \h'5.000i-(\w'UUCP'u/2u)'\v'2.000i-(0v/2u)+0v+0.22m'UUCP .sp -1 \h'6.000i'\v'2.500i'\D'p0.000i -0.250i -6.000i 0.000i 0.000i 0.250i' .sp -1 .lf 129 \h'0.050i'\v'2.375i-(0v/2u)+0v+0.22m'Data Link .sp -1 \h'6.000i'\v'2.750i'\D'p0.000i -0.250i -6.000i 0.000i 0.000i 0.250i' .sp -1 .lf 130 \h'0.050i'\v'2.625i-(0v/2u)+0v+0.22m'Physical .sp -1 .sp 2.750i+1 .if \n(00 .fi .br .nr 0x 0 .lf 131 .PE .lf 132 .DE .PP Mparse handles the message format shown at the Application layer, except for media-specific body content (with some notable cases which are handled), and assists with the interface to the next level. 
That is, mparse handles the Internet Message Format, MIME message structure media types, fields defined in a few specific MIME media types, and the low-level field body components used by the Internet Message Format and MIME and their extensions. The criteria for handling MIME media types which include fields are that the media type must be defined as a group of fields optionally followed by a separator line and other content, and that all fields defined must be compatible with RFC 2822 field syntax. Examples of media types that fail to conform to those criteria include message/http and message/sip (which begin with a "start-line" rather than with fields), and message/CPIM (whose syntax for several fields conflicts with RFC 2822 fields). Such non-conforming media type content is treated by mparse as opaque content (much like application/octet-stream). Application authors interested in parsing such media type content may of course do so using application-specific methods. .DS .PP The message format (excluding media-specific message body content) is specified in the following groups of RFCs: .PP Overall non-MIME message structure and related issues: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 561 Standardizing Network Mail Headers (obsolete) 724 T{ Proposed Official Standard for the Format of ARPA Network Messages (obsoleted by RFC 733) T} 733 T{ STANDARD FOR THE FORMAT OF ARPA NETWORK TEXT MESSAGES (obsoleted by RFC 822) T} 822 T{ Standard for the format of ARPA Internet text messages (see also 2822) T} 2368 The mailto URL scheme 2822 Internet Message Format 4096 T{ Policy\-Mandated Labels Such as "Adv:" in Email Subject Headers Considered Ineffective At Best T} .TE .DE .DS .PP Non-MIME extension message header fields, and supplementary message header field definitions: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 
788 Simple Mail Transfer Protocol (see also 821, 2821) 821 Simple Mail Transfer Protocol (see also 2821) 850 T{ Standard for interchange of USENET messages (obsoleted by 1036) T} 886 Proposed Standard for Message Header Munging 987 T{ Mapping Between X.400 and RFC 822 (updated by 1026, obsoleted by 1148) T} 1026 T{ Addendum to RFC 987 (Mapping between X.400 and RFC-822) (obsoleted by 1327, 1148) T} 1036 Standard for interchange of USENET messages 1049 A CONTENT-TYPE HEADER FIELD FOR INTERNET MESSAGES 1138 T{ Mapping between X.400(1988) / ISO 10021 and RFC 822 (obsoleted by 1148) T} 1148 T{ Mapping between X.400(1988) / ISO 10021 and RFC 822 (obsoleted by 2156) T} 1154 T{ Encoding Header Field for Internet Messages (obsoleted by 1505) T} 1327 T{ Mapping between X.400(1988) / ISO 10021 and RFC 822 (obsoleted by 2156) T} 1505 Encoding Header Field for Internet Messages 2156 T{ MIXER (Mime Internet X.400 Enhanced Relay): Mapping between X.400 and RFC 822/MIME T} 2369 T{ .na The Use of URLs as Meta-Syntax for Core Mail List Commands and their Transport through Message Header Fields T} 2821 Simple Mail Transfer Protocol 2919 T{ .na List-Id: A Structured Field and Namespace for the Identification of Mailing Lists T} 3834 T{ Recommendations for Automatic Responses to Electronic Mail T} 3458 Message Context for Internet Mail 3865 T{ A No Soliciting Simple Mail Transfer Protocol (SMTP) Service Extension T} 3939 T{ Calling Line Identification for Voice Mail Messages T} 4021 T{ Registration of Mail and MIME Header Fields T} .TE .DE .DS .PP Low-level field components: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 
1034 Domain names - concepts and facilities (updated by 2181, 4343, 4592) 1035 Domain names - implementation and specification (updated by 2181, 4343) 1123 T{ Requirements for Internet Hosts - Application and Support (updated by 2181) T} 1342 T{ Representation of Non-ASCII Text in Internet Message Headers (obsoleted by 1522) T} .\" errors in EBCDIC and EBCDIC-derived charset tables, no errata 1345 Character Mnemonics & Character Sets 1893 Enhanced Mail System Status Codes (obsoleted by 3463) 1958 Architectural Principles of the Internet 2181 Clarifications to the DNS Specification (updated by 4343) 2277 IETF Policy on Character Sets and Languages 2373 IP Version 6 Addressing Architecture 2396 Uniform Resource Identifiers (URI): Generic Syntax 2978 IANA Charset Registration Procedures 3066 Tags for the Identification of Languages 3463 Enhanced Mail System Status Codes 3692 T{ Assigning Experimental and Testing Numbers Considered Useful T} 3848 ESMTP and LMTP Transmission Types Registration 3938 Video-Message Message-Context 4343 T{ Domain Name System (DNS) Case Insensitivity Clarification T} 4592 T{ The Role of Wildcards in the Domain Name System T} .TE .DE .DS .PP MIME message structure and fields, MIME enhancements to fields, MIME extension field definitions, and definition of MIME media types requiring specific Content-Type parameters: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 
1341 T{ .na MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies (obsoleted by 1521) T} 1344 Implications of MIME for Internet Mail Gateways 1521 T{ .na MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies (obsoleted by 2045, 2046, 2047, 2048, 2049; updated by 1590) T} 1522 T{ .na MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text (obsoleted by 2045, 2046, 2047, 2048, 2049; updated by 1590) T} 1847 T{ Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted T} 1864 The Content-MD5 Header Field 2017 Definition of the URL MIME External-Body Access-Type 2045 T{ Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies T} 2046 T{ Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types T} 2047 T{ .na Multipurpose Internet Mail Extensions (MIME) Part Two: Message Header Extensions for Non-ASCII Text T} 2049 T{ Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples T} 2184 T{ .na MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations (obsoleted by 2231) T} 2231 T{ .na MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations T} 2311 S/MIME Version 2 Message Specification 2387 The MIME Multipart/Related Content-type 2421 Voice Profile for Internet Mail - version 2 2424 T{ Content Duration MIME Header Definition (obsoleted by 3803) T} 2425 A MIME Content-Type for Directory Information 2480 Gateways and MIME Security Multiparts 2533 T{ A Syntax for Describing Media Feature Sets (updated by 2738) T} 2557 T{ MIME Encapsulation of Aggregate Documents, such as HTML (MHTML) T} 2586 The Audio/L16 MIME content type 2616 Hypertext Transfer Protocol -- HTTP/1.1 2633 T{ S/MIME Version 3 Message Specification (obsoleted by 
3851) T} 2634 Enhanced Security Services for S/MIME 2652 T{ MIME Object Definitions for the Common Indexing Protocol (CIP) T} 2660 The Secure HyperText Transfer Protocol 2738 T{ Corrections to "A Syntax for Describing Media Feature Sets" T} 2912 Indicating Media Features for MIME Content 2913 MIME Content Types in Media Feature Expressions 3156 MIME Security with OpenPGP 3282 Content Language Headers 3801 Voice Profile for Internet Mail - version 2 (VPIMv2) 3803 Content Duration MIME Header Definition 3804 Voice Profile for Internet Mail (VPIM) Addressing 3851 T{ .na Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.1 Message Specification T} 4194 T{ The S Hexdump Format T} .TE .DE .DS .PP Notification message structure and applications: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 1892 T{ The Multipart/Report Content Type for the Reporting of Mail System Administrative Messages (obsoleted by 3462) T} 1894 T{ An Extensible Message Format for Delivery Status Notifications (obsoleted by 3464) T} 2298 T{ An Extensible Message Format for Message Disposition Notifications (obsoleted by 3798) T} 2530 T{ Indicating Supported Media Features using Extensions to DSN and MDN T} 2634 Enhanced Security Services for S/MIME 3297 Content Negotiation for Messaging Services 3335 T{ .na MIME-based Secure Peer-to-Peer Business Data Interchange over the Internet T} 3462 T{ The Multipart/Report Content Type for the Reporting of Mail System Administrative Messages T} 3464 T{ An Extensible Message Format for Delivery Status Notifications T} 3798 T{ An Extensible Message Format for Message Disposition Notifications T} 3834 T{ Recommendations for Automatic Responses to Electronic Mail T} 3886 T{ An Extensible Message Format for Message Tracking Responses T} .TE .DE .PP RFC 822 was preceded by RFCs 561, 724, and 733. 
Message content (header and body) according to those older RFCs is recognized and parsed by .B mparse to the extent possible consistent with modern syntax; however, many constructs provided for by those RFCs are no longer valid, and a few are incompatible with modern syntax. Some are hopelessly ambiguous. .DS .PP .B Mparse also assists with network message transfer protocols that return status responses. Applicable network message transfer protocols are defined in the following RFCs: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 765 FILE TRANSFER PROTOCOL (obsoleted by 959) 772 MAIL TRANSFER PROTOCOL (obsoleted by 780) 780 T{ MAIL TRANSFER PROTOCOL (see also 784, 785, 786; obsoleted by 788) T} 788 Simple Mail Transfer Protocol (see also 821, 2821) 821 Simple Mail Transfer Protocol (see also 2821) 977 Network News Transfer Protocol 1204 Message Posting Protocol (MPP) 2033 Local Mail Transfer Protocol 2476 Message Submission (obsoleted by 4409) 2821 Simple Mail Transfer Protocol 4409 T{ Message Submission for Mail T} .TE .PP Note that some RFCs define both message format and message transfer protocols. .DE .DS .PP Note that some RFCs contain errors. There is an errata page at .BR http://www.rfc-editor.org/errata.html . .PP The database at .BR http://www.iana.org/assignments/numbers.html , which was formerly in RFC 1700, is also applicable. .DE .DS .PP The remaining lists of RFCs are provided for reference. .DE .DS .PP Other network protocols (message access, transfer of messages without status responses, non-message file transfer, extensions) that may be of interest include: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 
850 T{ Standard for interchange of USENET messages (obsoleted by 1036) T} 918 POST OFFICE PROTOCOL (obsoleted by 937) 937 POST OFFICE PROTOCOL - VERSION 2 (historic) 959 FILE TRANSFER PROTOCOL (FTP) 976 UUCP Mail Interchange Format Standard 1036 Standard for interchange of USENET messages 1064 T{ INTERACTIVE MAIL ACCESS PROTOCOL- VERSION 2 (obsoleted by 1176, 1203) T} 1081 Post Office Protocol - Version 3 (obsoleted by 1225) 1082 T{ Post Office Protocol - Version 3 Extended Service Offerings T} 1123 T{ Requirements for Internet Hosts -- Application and Support T} 1176 T{ INTERACTIVE MAIL ACCESS PROTOCOL- VERSION 2 (experimental) T} 1203 T{ INTERACTIVE MAIL ACCESS PROTOCOL- VERSION 3 (historic) T} 1225 Post Office Protocol - Version 3 (obsoleted by 1460) 1425 SMTP Service Extensions (obsoleted by 1651) 1426 T{ SMTP Service Extension for 8bit-Mime transport (obsoleted by 1652) T} 1427 T{ SMTP Service Extension for Message Size Declaration (obsoleted by 1653) T} 1460 Post Office Protocol - Version 3 (obsoleted by 1725) 1651 SMTP Service Extensions (obsoleted by 1869) 1652 SMTP Service Extension for 8bit-Mime transport 1653 T{ SMTP Service Extension for Message Size Declaration (obsoleted by 1870) T} 1725 Post Office Protocol - Version 3 (obsoleted by 1939) 1730 T{ INTERNET MESSAGE ACCESS PROTOCOL- VERSION 4 (obsoleted by 2060, 2061) T} 1734 POP3 AUTHentication command 1830 T{ SMTP Service Extensions for Transmission of Large and Binary MIME Messages (obsoleted by 3030) T} 1845 T{ SMTP Service Extension for Checkpoint/Restart (experimental) T} 1846 SMTP 521 Reply Code 1854 T{ SMTP Service Extension for Command Pipelining (obsoleted by 2197) T} 1869 SMTP Service Extensions (obsoleted by 2821) 1891 T{ SMTP Service Extension for Delivery Status Notifications (obsoleted by 3461) T} 1893 Enhanced Mail System Status Codes (obsoleted by 3463) 1939 T{ Post Office Protocol - Version 3 (updated by 1957, 2449) T} 1985 T{ SMTP Service Extension for Remote Message Queue Starting T} 
2034 T{ SMTP Service Extension for Returning Enhanced Error Codes T} 2060 T{ INTERNET MESSAGE ACCESS PROTOCOL- VERSION 4rev1 (obsoleted by 3501) T} 2197 T{ SMTP Service Extension for Command Pipelining (obsoleted by 2920) T} 2449 POP3 Extension Mechanism 2524 T{ Neda's Efficient Mail Submission and Delivery (EMSD) Protocol Specification Version 1.3 T} 2920 SMTP Service Extension for Command Pipelining 3030 T{ SMTP Service Extensions for Transmission of Large and Binary MIME Messages T} 3207 T{ SMTP Service Extension for Secure SMTP over Transport Layer Security T} 3461 T{ Simple Mail Transfer Protocol (SMTP) Service Extension for Delivery Status Notifications (DSNs) T} 3463 Enhanced Mail System Status Codes 3501 INTERNET MESSAGE ACCESS PROTOCOL- VERSION 4rev1 (updated by 4466) 3885 SMTP Service Extension for Message Tracking 3887 Message Tracking Query Protocol 3888 Message Tracking Model and Requirements 4466 T{ Collected Extensions to IMAP4 ABNF T} 4496 T{ Open Pluggable Edge Services (OPES) SMTP Use Cases T} .TE .DE .DS .PP The following RFCs specify MIME media types which have no processing implications for .BR mparse . The list is provided for reference only. .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 
1523 T{ The text/enriched MIME Content-type (Obsoleted by RFC1563, RFC1896) T} 1563 T{ The text/enriched MIME Content-type (Obsoleted by RFC1896) T} 1740 MIME Encapsulation of Macintosh Files - MacMIME 1741 MIME Content Type for BinHex Encoded Files 1767 MIME Encapsulation of EDI Objects 1872 The MIME Multipart/Related Content-type 1896 The text/enriched MIME Content-type 1927 T{ Suggested Additional MIME Types for Associating Documents (1 April 1996) T} 2112 The MIME Multipart/Related Content-type 2159 A MIME Body Part for FAX 2161 A MIME Body Part for ODA 2301 File Format for Internet Fax (obsoleted by 3949) 2302 T{ Tag Image File Format (TIFF) - image/tiff MIME Sub-type T} 2422 T{ Toll Quality Voice - 32 kbit/s ADPCM MIME Sub-type Registration T} 2423 VPIM Voice Message MIME Sub-type Registration 2426 vCard MIME Directory Profile 2503 MIME Types for Use with the ISO ILL Protocol 2646 The Text/Plain Format Parameter (obsoleted by 3676) 2927 MIME Directory Profile for LDAP Schema 3009 Registration of parityfec MIME types 3028 Sieve: A Mail Filtering Language 3073 T{ Portable Font Resource (PFR) - application/font-tdpfr MIME Sub-type Registration T} 3204 MIME media types for ISUP and QSIG Objects 3240 T{ Digital Imaging and Communications in Medicine (DICOM) - Application/dicom MIME Sub-type Registration T} 3250 T{ Tag Image File Format Fax eXtended (TIFF-FX) - image/tiff-fx MIME Sub-type Registration (obsoleted by 3950) T} 3302 T{ Tag Image File Format (TIFF) - image/tiff MIME Sub-type Registration T} 3362 T{ Real-time Facsimile (T.38) - image/t38 MIME Sub-type Registration T} 3391 The MIME Application/Vnd.pwg-multiplexed Content-Type 3459 T{ Critical Content Multi-purpose Internet Mail Extensions (MIME) Parameter (Updates RFC3204) T} 3555 MIME Type Registration of RTP Payload Formats 3676 The Text/Plain Format and DelSp Parameters 3745 MIME Type Registrations for JPEG 2000 (ISO/IEC 15444) 3778 The application/pdf Media Type 3802 T{ .na Toll Quality Voice - 32 
kbit/s Adaptive Differential Pulse Code Modulation (ADPCM) MIME Sub-type Registration (obsoletes 2422) T} 3823 T{ MIME Media Type for the Systems Biology Markup Language (SBML) T} 3839 T{ .na MIME Type Registrations for 3rd Generation Partnership Project (3GPP) Multimedia files T} 3949 T{ File Format for Internet Fax. R. Buckley (Obsoletes RFC2301) T} 3950 T{ Tag Image File Format Fax eXtended (TIFF-FX) - image/tiff-fx MIME Sub-type Registration (Obsoletes RFC3250) T} 3870 application/rdf+xml Media Type Registration 3902 The "application/soap+xml" media type 4027 Domain Name System Media Types 4047 T{ MIME Sub-type Registrations for Flexible Image Transport System (FITS) T} 4155 T{ The application/mbox Media Type T} 4180 T{ Common Format and MIME Type for Comma-Separated Values (CSV) Files T} 4263 T{ Media Subtype Registration for Media Type text/troff T} 4329 T{ Scripting Media Types T} 4337 T{ MIME Type Registration for MPEG-4 T} 4374 T{ The application/xv+xml Media Type T} 4393 T{ MIME Type Registrations for 3GPP2 Multimedia Files T} 4536 T{ The application/smil and application/smil+xml Media Types T} 4539 T{ Media Type Registration for the Society of Motion Picture and Television Engineers (SMPTE) Material Exchange Format (MXF) T} 4573 T{ MIME Type Registration for RTP Payload Format for H.224 T} 4627 T{ The application/json Media Type for JavaScript Object Notation (JSON) T} .TE .DE .DS .PP The following RFCs contain information useful in interpreting RFCs, and the list is provided for reference: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 
1958 Architectural Principles of the Internet 2026 The Internet Standards Process -- Revision 3 2028 T{ The Organizations Involved in the IETF Standards Process T} 2119 T{ Key words for use in RFCs to Indicate Requirement Levels T} 2277 IETF Policy on Character Sets and Languages 3117 On the Design of Application Protocols 3160 T{ The Tao of IETF - A Novice's Guide to the Internet Engineering Task Force T} 3536 Terminology Used in Internationalization in the IETF 3692 T{ Assigning Experimental and Testing Numbers Considered Useful T} 3930 T{ The Protocol versus Document Points of View in Computer Protocols T} 3935 A Mission Statement for the IETF 3967 T{ Clarifying when Standards Track Documents may Refer Normatively to Documents at a Lower Level T} .TE .DE .DS .PP The following RFCs contain information which may be useful to implementors using .BR mparse , and the list is provided for reference: .TS H expand; lw(0.5i)fB cw(5.3i)fB . number description _ .TH .T& np-2fB lp-2 . 1344 Implications of MIME for Internet Mail Gateways 1428 T{ Transition of Internet Mail from Just-Send-8 to 8bit-SMTP/MIME T} 1496 T{ .na Rules for Downgrading Messages from X.400/88 to X.400/84 When MIME Content-Types are Present in the Messages T} 1556 Handling of Bi-directional Texts in MIME 1820 T{ Multimedia E-mail (MIME) User Agent Checklist (obsoleted by 1844) T} 1844 Multimedia E-mail (MIME) User Agent Checklist 1865 EDI Meets the Internet 2048 T{ Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures (obsoleted by RFCs 4288 and 4289) T} 2305 T{ A Simple Mode of Facsimile Using Internet Mail (obsoleted by 3965) T} 2532 T{ Extended Facsimile Using Internet Mail T} 3805 Printer MIB v2 3808 IANA Charset MIB 3864 Registration Procedures for Message Header Fields 3965 A Simple Mode of Facsimile Using Internet Mail 4134 T{ Examples of S/MIME Messages T} 4143 T{ Facsimile Using Internet Mail (IFAX) Service of ENUM T} 4160 T{ Internet Fax Gateway Requirements T} 4161 
T{ Guidelines for Optional Services for Internet Fax Gateways T} 4239 T{ Internet Voice Messaging (IVM) T} 4249 T{ Implementer-Friendly Specification of Message and MIME-Part Header Fields and Field Components T} 4270 T{ Attacks on Cryptographic Hashes in Internet Protocols T} 4288 T{ Media Type Specifications and Registration Procedures T} 4289 T{ Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures T} 4355 T{ IANA Registration for Enumservices email, fax, mms, ems, and sms T} 4356 T{ Mapping Between the Multimedia Messaging Service (MMS) and Internet Mail T} 4367 T{ What's in a Name: False Assumptions about DNS Names T} 4550 T{ Internet Email to Support Diverse Service Environments (Lemonade) Profile T} .TE .DE .PP .B Mparse can serve as the skeleton for applications which need to be able to generate or process text messages (mail and news user agents, submission agents, transfer agents, delivery agents, spam filters, cancelbots, etc.). .B Mparse recognizes and reports syntax errors and provides hooks for user processing of message components (header lines, addresses, body sections, etc.). A program using .B mparse must be compiled with .B -lmparse on the .B cc command line. .PP .B Mparse is implemented as a lexical analyzer, a parser, and a number of support functions and keyword databases. The primary goal of .B mparse is correct parsing and generation of Internet messages. Because the Internet message format is specified in terms of a modified BNF, a parser generator using source which can be constructed from the modified BNF is the central component of .BR mparse . The parser and associated lexical analyzer are designed to accept a superset of legal Internet messages, so as to be able to accommodate common syntax errors. Support functions then check the parsed message components for such syntax errors and deprecated constructs, providing the capability of producing informative error messages and warnings. 
.PP
A secondary goal of
.B mparse
is maintainability.
Keywords which may change, such as registered values for charsets, language tags, etc.,
are maintained in separate source code files, most of which can be generated
automatically from a reference source.
.PP
A third goal of
.B mparse
is flexibility.
.B Mparse
provides a mechanism by which an application can configure different processing modes,
and is designed to call application-provided functions for processing message
components and for handling exceptional conditions (errors and warnings).
As such,
.B mparse
can be used by diverse applications.
Implementation of
.B mparse
as a library of functions also facilitates realization of this goal.
.PP
Performance was also given consideration in the design of
.BR mparse ,
but not at the expense of the other goals.
For example, highly-tuned hash tables are used for efficient case-insensitive
matching of keywords such as header field names, day-of-week and month names,
charset tags, etc.
The use of a formal parser and lexical analyzer does have performance implications,
but the goals of correctness and maintainability, which are enhanced via the
formal parser, were considered more important than raw performance.
.PP
A novel approach is used in generation of message components such as header fields.
The component is generated from supplied text, then is parsed to detect syntax errors.
Each error is flagged, and those errors which are correctable are repaired.
.PP
Before describing the details of
.B mparse
operation, some context is necessary.
Simple text messages consist of three parts: header fields, an empty line acting
as a separator, and body text.
A simple message might consist of header fields alone, i.e. with no body text
(in that case there might or might not be an empty line following the header fields).
At least some header fields must be present.
MIME multipart messages (see RFC 2046) consist of several sections separated by
boundary delimiters, which are specially formatted text lines.
Between these delimiters are encapsulated body parts, each of which consists of
optional MIME header fields, a mandatory empty separator line, and some content
(possibly encoded as text for transport).
MIME also provides for message encapsulation, in which there are generally MIME
header fields, a mandatory empty separator line, and the encapsulated message
(header fields, separator, and body).
Some exceptions are delivery status notification (DSN) and message tracking status
notification (MTSN) messages, which have multiple sets of fields separated by empty lines.
Other exceptions include media types message/sip, message/sipfrag, message/CPIM,
message/http, and message/s-http, which do not have normal message header fields
in the encapsulation.
.DS
.PP
Messages received via SMTP, POP, or NNTP protocols may have been byte-stuffed,
i.e. lines beginning with a '.' character may have had an additional '.' inserted.
Likewise, messages which must be passed via SMTP, POP, or NNTP protocols should
be byte-stuffed in the same way.
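.PP
As a minimal sketch of the byte-stuffing convention described above, the following C fragment adds and removes the extra '.' on a single line.
The helper names stuff_line and unstuff_line are hypothetical and are not part of mparse; the sketch also assumes that each line is passed without its line terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; they are not part of mparse. */

/*
 * Byte-stuff one line: if the line begins with '.', prepend another '.'.
 * Writes the NUL-terminated result to out (capacity outsz) and returns
 * its length, or -1 if out is too small.
 */
int stuff_line(const char *line, char *out, size_t outsz)
{
    size_t len, extra;

    assert(line != NULL && out != NULL);
    len = strlen(line);
    extra = (line[0] == '.') ? 1 : 0;
    if (len + extra + 1 > outsz)
        return -1;
    if (extra)
        out[0] = '.';
    memcpy(out + extra, line, len + 1);  /* copies the terminating NUL too */
    return (int)(len + extra);
}

/*
 * Remove byte-stuffing from one line.
 * Returns 1 if the line is a lone "." (end of message); otherwise returns 0
 * after pointing *unstuffed at the line with one leading '.' skipped, if any.
 */
int unstuff_line(const char *line, const char **unstuffed)
{
    assert(line != NULL && unstuffed != NULL);
    if (strcmp(line, ".") == 0)
        return 1;
    *unstuffed = (line[0] == '.') ? line + 1 : line;
    return 0;
}
```

A caller would apply stuff_line to each body line before sending and unstuff_line to each line received, stopping when unstuff_line returns 1.
.PP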
.B Mparse
processes messages according to the following model:
.PP
.nf
input \(-> optional unstuff \(-> parse \(-> optional byte-stuff \(-> output
.fi
.PP
When byte-stuffing is being removed from the input, a line consisting of a lone '.'
is considered to be the end of the message (consistent with POP RETR command
response, for example).
When byte-stuffing is applied at the output, a line containing a lone '.'
is output after the message body (consistent with the SMTP DATA command). .DE .PP Input to .B mparse might come from a file or pipe or from a socket. .B Mparse provides for input timeout specifically for socket input. .PP .B Mparse can copy its input to output, or such copying can be suppressed. Additional output can be generated for errors in the input message, and application-provided functions can be called, which might generate additional output. Separate .I stdio .B FILE pointers are used for copied output, for general .B mparse function errors, and for error messages generated in response to errors in the input. .DS .PP The latter error messages can be suppressed, and if not suppressed appear in three varieties: .TS expand; lw(1.0i)fB lw(5.0i)fB . line description _ .T& l l . X-NG: T{ An RFC violation. The offending input is pinpointed if possible. T} X-Warning: T{ .na Context-sensitive constructs which may violate RFCs under some conditions. The offending input is pinpointed if possible. T} X-Err: T{ An error in the number or type of header fields. Issued after the last header line. T} .TE .DE .DS .PP Overall operation of .B mparse is controlled by setting members of a .B mparse_message structure, which is pointed to by a function call argument to one of the .B mparse functions. That structure is: .TS expand; lw(1.8i)fB lw(3.8i)fB . member (struct mparse_message) description _ .T& lp-2fB lp-2 . 
void *userptr; T{ .na user pointer; not set or used by parser, may be used by applications to pass data to hooks T} struct mparse_entity *top; top-level entity struct mparse_hooks *hooks; application hooks unsigned int context; message processing context const struct mparse_charset *charset; T{ default charset to use for generation and/or repair T} const char *language_tag; T{ default language tag for generation and/or repair T} double timeout; input timeout in seconds void *r_flex; T{ .na pointer to data used by reentrant lexical analyzer T} int start_cond; lexical analyzer start condition FILE *merr; T{ X-NG, X-Err, X-Warning header fields go here T} struct mparse_debug *dbg; for debugging char modes[] T{ .na bitmap of RFC processing modes. Use mparse_add_mode(), mparse_clear_modes(), mparse_in_mode(), mparse_remove_mode() to read/modify. T} int language_index; language for warning and error messages int linelen T{ desired maximum body line length when generating content T} unsigned int edebug : 1; T{ debug storage associated with error structures T} unsigned int gdebug : 1; T{ provide debugging information related to parsing T} unsigned int hdebug : 1; debug storage associated with fields unsigned int ldebug : 1; T{ provide debugging information related to lexical analysis T} unsigned int ndebug : 1; debug storage associated with entity unsigned int pdebug : 1; T{ debug storage associated with MIME mparse_parameters T} unsigned int rdebug : 1; T{ debug storage associated with protocol status T} unsigned int sdebug : 1; debug storage associated with lists unsigned int tdebug : 1; debug storage associated with tokens unsigned int internal_parse : 1; internal use unsigned int ioerr : 1; input timeout or other input error unsigned int canonicalize : 1; T{ convert strings to canonical form when generating fields T} unsigned int byte_stuff : 1; T{ .na byte-stuff output (body lines beginning with '.' have an additional '.' 
prepended) T} unsigned int byte_unstuff : 1; T{ remove byte stuffing at input (strip '.' at beginning of body lines) T} unsigned int experimental : 1; T{ support experimental and private-use names T} unsigned int header_only : 1; don't process body text unsigned int no_copy : 1; suppress output unsigned int suppress_errors : 1; T{ suppress error X- header field generation T} unsigned int suppress_warnings : 1; T{ suppress warning X- header field generation T} .TE .DE The .B mparse_message structure provides a user pointer which may be used to pass arbitrary application data to application-defined functions. .DS .PP The bitmap of .B modes should be accessed and modified through the library functions .BR mparse_add_mode , .BR mparse_clear_modes , .BR mparse_in_mode , and .BR mparse_remove_mode : .PP .B int mparse_add_mode(struct mparse_message *, int); .PP .B int mparse_clear_modes(struct mparse_message *); .PP .B int mparse_in_mode(struct mparse_message *, int); .PP .B int mparse_remove_mode(struct mparse_message *, int); .PP The integer arguments to .BR mparse_add_mode , .BR mparse_in_mode , and .B mparse_remove_mode are RFC numbers; e.g. .B mparse_add_mode(p,\ 822) sets processing to report errors relevant to RFC 822. Specifying 0 includes ``common-sense'' errors. If a NULL pointer or invalid RFC number is supplied, .B mparse_add_mode returns -1 and sets .I errno to EINVAL. .B mparse_in_mode returns 1 if processing for the specified RFC is in effect, 0 if not, and -1 (with .I errno set to EINVAL) on error. .B mparse_clear_modes turns off all processing. .DE .DS .PP Many .B mparse functions return an integer or a pointer. In event of an error, such as described above, these functions return a negative integer or a .I NULL pointer and set the global variable .I errno to indicate the error. 
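.PP
The calling convention of the mode functions and the error-return convention just described can be illustrated with a small stand-in.
This is only a sketch under stated assumptions: the demo_ names, the list of accepted RFC numbers, the layout of the bitmap, and the return value of 0 on success are all hypothetical and do not describe how mparse is actually implemented.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the modes bitmap; not the mparse implementation. */
static const int demo_rfcs[] = { 0, 822, 1036, 2045, 2822 };  /* arbitrary demo selection */
#define DEMO_NRFCS (sizeof demo_rfcs / sizeof demo_rfcs[0])

struct demo_message {
    unsigned char modes[(DEMO_NRFCS + CHAR_BIT - 1) / CHAR_BIT];
};

/* Map an RFC number to a bit position, or -1 if the number is not known. */
static int demo_bit(int rfc)
{
    size_t i;

    for (i = 0; i < DEMO_NRFCS; i++)
        if (demo_rfcs[i] == rfc)
            return (int)i;
    return -1;
}

int demo_clear_modes(struct demo_message *m)
{
    if (m == NULL) { errno = EINVAL; return -1; }
    memset(m->modes, 0, sizeof m->modes);
    return 0;
}

int demo_add_mode(struct demo_message *m, int rfc)
{
    int bit = demo_bit(rfc);

    if (m == NULL || bit < 0) { errno = EINVAL; return -1; }
    assert((size_t)bit < sizeof m->modes * CHAR_BIT);
    m->modes[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    return 0;  /* success value is an assumption of this sketch */
}

int demo_remove_mode(struct demo_message *m, int rfc)
{
    int bit = demo_bit(rfc);

    if (m == NULL || bit < 0) { errno = EINVAL; return -1; }
    m->modes[bit / CHAR_BIT] &= (unsigned char)~(1u << (bit % CHAR_BIT));
    return 0;
}

/* Returns 1 if the mode is set, 0 if not, -1 with errno == EINVAL on error. */
int demo_in_mode(const struct demo_message *m, int rfc)
{
    int bit = demo_bit(rfc);

    if (m == NULL || bit < 0) { errno = EINVAL; return -1; }
    return (m->modes[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}
```

With real code the same pattern applies: check for a negative return value, then inspect errno before making any other call that might change it.
.PP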
Some of the .I errno values used are the same as those which may be set by system calls (such as EINVAL in the above functions), while there are message-specific values which are used where applicable: .TS expand; lw(1.8i)fB lw(4.0i)fB . symbolic name description _ .T& lp-2fB lp-2 . MPARSE_ERRNO_ESIGNED signed message or entity MPARSE_ERRNO_EORIGINAL original message or entity MPARSE_ERRNO_EENCRYPTED encrypted message or entity MPARSE_ERRNO_ENOADDRESS no address MPARSE_ERRNO_EMULTIADDRESS T{ multiple addresses (where only one is appropriate) T} MPARSE_ERRNO_ECOMPOSITE composite media type MPARSE_ERRNO_EINCOMPATIBLE incompatible type, subtype, charset, etc. MPARSE_ERRNO_ECACHED information carried in field is cached MPARSE_ERRNO_ENODEFAULT T{ cached field information has no default value T} .TE .DE A string corresponding to the .I errno value may be obtained by calling the function .BR mparse_strerror : .PP \fBconst char *mparse_strerror(const struct mparse_message *, int);\fP .PP giving the pointer to the affected .B mparse_message structure as the first argument, and the value of .I errno as the second argument. If the .I errno value is one of the above messaging-specific values, a language-dependent string is returned; otherwise the system .B strerror function is called. Use of the .I language_index member of the .B mparse_message structure allows dynamically changing the language, or changing it on a per-message basis without repeated calls to change a system-dependent "locale" (which might not be available on non-POSIX systems). Note that such calls are necessary, where available, to change the language for the system .B strerror function, and further note that error messages generated by the parser-generator (bison) code are in a fixed language which is determined at the time that the parser is built. 
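.PP
The lookup behaviour described above (a language-dependent string for message-specific values, the system strerror otherwise) might be sketched as follows.
Everything named demo_ is hypothetical: the numeric error values, the two-language table, and the function itself are assumptions made for illustration, and the real values and strings are those supplied by mparse.h and the library.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical error values; the real MPARSE_ERRNO_* values are not reproduced here. */
enum {
    DEMO_ENOADDRESS = 10001,
    DEMO_EMULTIADDRESS = 10002,
    DEMO_ERRNO_FIRST = DEMO_ENOADDRESS,
    DEMO_ERRNO_LAST = DEMO_EMULTIADDRESS
};

/* Hypothetical language indices, playing the role of language_index. */
enum { DEMO_LANG_EN, DEMO_LANG_DE, DEMO_NLANGS };

static const char *const demo_text[DEMO_NLANGS][2] = {
    { "no address", "multiple addresses" },
    { "keine Adresse", "mehrere Adressen" },
};

/*
 * Return a string for errnum.  Message-specific values are looked up in the
 * table for the requested language; all other values go to the system
 * strerror(), whose language depends on the locale rather than on
 * language_index.
 */
const char *demo_strerror(int language_index, int errnum)
{
    assert(language_index >= 0 && language_index < DEMO_NLANGS);
    if (errnum >= DEMO_ERRNO_FIRST && errnum <= DEMO_ERRNO_LAST)
        return demo_text[language_index][errnum - DEMO_ERRNO_FIRST];
    return strerror(errnum);
}
```

Because the language is an ordinary argument, it can differ from one call to the next without touching the process locale, which is the property the language_index member is described as providing.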
.PP
The function
.PP
.B int mparse_set_msg_language(struct mparse_message *message, const char *str, unsigned int len);
.PP
returns zero after setting the
.I language_index
member of the
.B mparse_message
structure to the appropriate value for the language described by
.I str
if it is recognized.
It returns a positive integer (which is always suitable for use as the
.I language_index
member, but which must be set by the application) if the language is not
recognized, and a negative integer (with
.I errno
set appropriately) if
.I message
or
.I str
is a
.I NULL
pointer.
.DS
.PP
.B Mparse
can be called from diverse applications for parsing or generating messages,
for message syntax validation, etc.
The
.I context
member of the
.B mparse_message
structure should be set to a value that indicates the processing environment.
The value is made up of two parts, the first of which indicates one of the following primary processing roles:
.TS
expand;
lw(2.4i)fB lw(3.2i)fB .
symbolic name	description
_
.T&
lp-2fB lp-2 .
MPARSE_PRIMARY_ROLE_GENERATION	T{
.na
generation of messages, content errors are repaired if possible; e.g. user agent
T}
MPARSE_PRIMARY_ROLE_VALIDATION	T{
.na
content validity checks, e.g. mail submission agent, news injection agent, gateway
T}
MPARSE_PRIMARY_ROLE_TRANSPORT	T{
.na
content transport; existing content is not altered (except under MPARSE_SECONDARY_ROLE_TRANSFORM, etc.), though some content (trace fields) may be added
T}
MPARSE_PRIMARY_ROLE_ACCESS	T{
content access for display, filtering, content extraction; no modification
T}
.TE
.PP
Only one of the above primary roles should be specified.
In addition to the primary role, one or more secondary roles may be indicated by
bitwise ORing one or more of the following values with the primary role value:
.TS
expand;
lw(2.4i)fB lw(3.2i)fB .
symbolic name	description
_
.T&
lp-2fB lp-2 .
MPARSE_SECONDARY_ROLE_REPAIR	T{
.na
repair content errors during MPARSE_PRIMARY_ROLE_GENERATION, MPARSE_PRIMARY_ROLE_VALIDATION, or at gateways
T}
MPARSE_SECONDARY_ROLE_TRANSFORM	T{
.na
content transformations (transcoding, reassembly, etc.) during MPARSE_PRIMARY_ROLE_TRANSPORT or at gateways
T}
MPARSE_SECONDARY_ROLE_TRANSFORM_7BIT	T{
.na
encode to 7bit domain for MPARSE_PRIMARY_ROLE_TRANSPORT or at gateways
T}
.TE
.PP
The default value for
.I context
is
.BR MPARSE_PRIMARY_ROLE_ACCESS .
.DE
.PP
When generating a message, the
.I linelen
member of the
.B mparse_message
structure may be set to the desired maximum line length for body content.
It is ignored when not generating message content.
.PP
Default operation of
.B mparse
is to pass the message at its input to its output, emitting lines documenting RFC violations in input header fields.
(Technically, these lines are written to a separate
.IR "stdio FILE" ,
which may refer to the same file or to the standard output stream.)
.PP
The
.B canonicalize
flag, when set, causes protocol "names" (RFC 1958) to be presented with canonical capitalization as shown in the relevant RFC.
This flag should only be set in MPARSE_PRIMARY_ROLE_GENERATION
.I context
or with the
.I context
flag MPARSE_SECONDARY_ROLE_REPAIR, but
.B mparse
does not enforce correspondence between
.I canonicalize
and
.IR context .
.PP
The
.B byte_stuff
and
.B byte_unstuff
flags control processing according to the model presented earlier.
.PP
RFC 3692 requires that experimental and private-use names not be recognized by default.
If the
.B experimental
flag is non-zero, experimental and private-use names can be recognized via the hooks provided for extension names (see below).
.PP
.B ldebug
and
.B gdebug
enable verbose debugging output for the lexical analyzer and parser, respectively.
Changes made to the
.B ldebug
flag after one of the
.B mparse
functions is called have no effect on lexical analyzer debugging output.
To change lexical analyzer debugging output after
.B mparse
has been called, the
.B m_flex_debug
member of the structure pointed to by
.B r_flex
(see below) must be modified via the
.B flex
function
.B mparse_set_debug
(refer to the flex documentation for
.B yyset_debug
for details).
.PP
.BR edebug ,
.BR hdebug ,
.BR pdebug ,
.BR sdebug ,
and
.B tdebug
control debugging output associated with allocated memory for the structures used by
.BR mparse .
.B Mparse
itself has been tested to ensure that there are no memory leaks, but user code that calls allocation and/or deallocation functions should be tested.
.PP
If the flag
.B header_only
is set, body text is not parsed; processing ends after the header fields.
This mode may be useful if only header content is of interest.
This flag should only be set in MPARSE_PRIMARY_ROLE_ACCESS
.IR context ,
but
.B mparse
does not enforce correspondence between
.I header_only
and
.IR context .
.PP
The
.B suppress_errors
flag causes generation of X-Err and X-NG header fields to be suppressed from output.
.PP
The
.B suppress_warnings
flag causes warnings about context-dependent constructs to be suppressed from output.
.PP
The
.B no_copy
flag suppresses the normal echoing of the input message to the stream pointed to by
.BR mout .
.PP
While echoing of input is sent (unless suppressed) to
.BR mout ,
error and warning messages are sent to the
.B merr
stream.
If
.B merr
is not initialized before one of the
.B mparse
functions is called, it is set to the standard error stream
.IR stderr .
It is possible to set
.B merr
to the same stream as
.BR mout ;
however, that may technically violate some RFCs which prohibit adding header fields in some contexts.
Debugging output and miscellaneous error messages are always sent to
.IR stderr .
.PP
When reading input from a stream associated with a socket connection (e.g.
to a POP or NNTP server), it may be desirable to set a time limit for input operations. If blocking input is used, a dropped connection may result in the process hanging, waiting for more input. If non-blocking input is used (e.g. via the .B O_NONBLOCK flag used with .IR fcntl (2)), .B EOF may be returned if there are network delays. .PP .B Mparse provides for a timeout for input. If .B timeout is a positive non-zero value, read operations that return EOF after a partial line, as may be the case when there are network delays and .B O_NONBLOCK is used, will cause .B mparse to check for availability of more input for up to the number of seconds (and partial seconds) given by .BR timeout . The .B ioerr flag described earlier will be set if there is no input within the specified timeout window. .SS Lexical Analysis .PP As an input message is processed, the first step is breaking the input stream into tokens. These tokens correspond to a piece of input text which has some significance in the message formats. .DS .PP Some information is stored with each lexical token returned to the parser by the lexical analyzer. This information is stored in a .B mparse_token structure, defined in the header file .BR mparse.h : .TS expand; lw(2.3i)fB lw(3.8i)fB . member (struct mparse_token) description _ .T& lp-2fB lp-2 . 
char *tok; allocated copy of token text size_t len; T{ length (not including terminating '\e0' (len = 1 for a MPARSE_TOKEN_NUL token)) T} int col; T{ input stream starting column (0\-based) of token (== line length for MPARSE_TOKEN_CRLF or MPARSE_TOKEN_EOH) T} int type; T{ token type, as defined in mparse.tab.h T} int val; T{ value associated with token, if any T} struct mparse_token *next; T{ singly\-linked list link to next lexical token in logical construct T} struct mparse_token *next2; T{ forward link of doubly\-linked list link; all tokens (to MPARSE_TOKEN_EOH in fields, to end in body) T} struct mparse_token *prev2; T{ reverse link of doubly\-linked list of tokens in field or body T} struct mparse_token *trailer; trailing CFWS after token struct mparse_token *close; T{ .na matching close delimiter (double quote, right parentheses, right bracket, '>', semicolon terminating group, etc.) T} struct mparse_list *list; for list navigation struct mparse_field *field; T{ pointer to enclosing field structure T} struct mparse_error *error; error(s) for this token struct mparse_parameter *parameter; T{ MIME\-parameters parameter if token is in a parameter T} void *userptr; T{ user pointer; not set or used by parser, may be used to associate data with token T} unsigned int address_has_comment : 1; T{ address or mailbox containing token has or is adjacent to a comment T} unsigned int contains_group : 1; T{ address or address list contains a group T} unsigned int has_cfws : 1; T{ angle\-addr (or addr\-spec in Received field for component) has internal CFWS T} unsigned int has_route : 1; T{ angle\-addr specified with route (not valid as msg\-id) T} unsigned int in_phrase : 1; T{ in phrase (RFC 2047 encoding permitted) T} unsigned int is_comment : 1; T{ token is part of a comment (including parentheses) T} unsigned int is_dlit : 1; T{ token is part of the interior of a domain literal T} unsigned int is_quoted : 1; T{ token is part of the interior of a quoted string T} 
unsigned int is_route : 1; token is in obsolete source route unsigned int is_utext : 1; token is part of unstructured text unsigned int non_ascii:1; token contains a non\-ascii octet unsigned int fold_val : 3; folding preference 0 - 7 .TE .DE .DS .PP The .B list member of the .B mparse_token structure points to a .B mparse_list structure if the token is part of a list construct: .TS expand; lw(1.9i)fB lw(3.9i)fB . member (struct mparse_list) description _ .T& lp-2fB lp-2 . struct mparse_token *head; token at head of list struct mparse_token *delimiter; next delimiter separating list elements struct mparse_token *element; next list element token struct mparse_token *next; token following list end struct mparse_list *sublist; for nested lists void *aux; additional data (internal use) int val; additional data (internal use) unsigned int nelements; T{ number of elements in list (valid only in head) T} unsigned int ndelimiters; T{ number of delimiters in list (valid only in head) T} unsigned int is_element:1; this list item is a list element unsigned int is_empty:1; T{ this list item is a delimiter with no corresponding list element T} .TE .DE .DS .PP The use of the pointers and flags is illustrated by an example of a To field. 
The header field line: .PP .B To: (foo)Group: Name , c@d.edu;, e@f.gov .PP is parsed into the following tokens and a rather complex structure of pointers: .PP .lf 1720 .PS 8.562i 6.050i .\" -0.35 -8.05 5.7 0.5125 .\" 0.000i 8.562i 6.050i 0.000i .nr 00 \n(.u .nf .nr 0x 1 \h'6.050i' .sp -1 \h'0.850i'\v'0.713i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1723 \h'0.600i-(\w'To'u/2u)'\v'0.513i-(1v/2u)+0v+0.22m'To .sp -1 .lf 1723 \h'0.600i-(\w'type TO'u/2u)'\v'0.513i-(1v/2u)+1v+0.22m'type TO .sp -1 \h'1.750i'\v'0.713i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1724 \h'1.500i-(\w':'u/2u)'\v'0.513i-(1v/2u)+0v+0.22m': .sp -1 .lf 1724 \h'1.500i-(\w\(tstype ':'\(tsu/2u)'\v'0.513i-(1v/2u)+1v+0.22m'type ':' .sp -1 \h'0.850i'\v'0.483i'\D'l0.400i 0.000i' .sp -1 \D'f 1000u'\h'-1000u' .sp -1 \h'1.250i'\v'0.483i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'1.250i'\v'0.483i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1725 \h'1.050i-(\w'next2'u/2u)'\v'0.483i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'1.250i'\v'0.543i'\D'l-0.400i 0.000i' .sp -1 \h'0.850i'\v'0.543i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'0.850i'\v'0.543i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1726 \h'1.050i-(\w'prev2'u/2u)'\v'0.543i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'2.650i'\v'0.713i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1727 \h'2.400i-(\w' 'u/2u)'\v'0.513i-(1v/2u)+0v+0.22m' .sp -1 .lf 1727 \h'2.400i-(\w'type WS'u/2u)'\v'0.513i-(1v/2u)+1v+0.22m'type WS .sp -1 \h'1.750i'\v'0.483i'\D'l0.400i 0.000i' .sp -1 \h'2.150i'\v'0.483i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'2.150i'\v'0.483i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1728 \h'1.950i-(\w'next2'u/2u)'\v'0.483i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'2.150i'\v'0.543i'\D'l-0.400i 0.000i' .sp -1 \h'1.750i'\v'0.543i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'1.750i'\v'0.543i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1729 \h'1.950i-(\w'prev2'u/2u)'\v'0.543i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 
\h'3.750i'\v'0.763i'\D'p0.000i -0.500i -0.700i 0.000i 0.000i 0.500i' .sp -1 .lf 1730 \h'3.400i-(\w'('u/2u)'\v'0.513i-(2v/2u)+0v+0.22m'( .sp -1 .lf 1730 \h'3.400i-(\w\(tstype '('\(tsu/2u)'\v'0.513i-(2v/2u)+1v+0.22m'type '(' .sp -1 .lf 1730 \h'3.400i-(\w'is_comment'u/2u)'\v'0.513i-(2v/2u)+2v+0.22m'is_comment .sp -1 \h'2.650i'\v'0.483i'\D'l0.400i 0.000i' .sp -1 \h'3.050i'\v'0.483i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'3.050i'\v'0.483i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1731 \h'2.850i-(\w'next2'u/2u)'\v'0.483i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'3.050i'\v'0.543i'\D'l-0.400i 0.000i' .sp -1 \h'2.650i'\v'0.543i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'2.650i'\v'0.543i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1732 \h'2.850i-(\w'prev2'u/2u)'\v'0.543i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'4.950i'\v'0.763i'\D'p0.000i -0.500i -0.800i 0.000i 0.000i 0.500i' .sp -1 .lf 1733 \h'4.550i-(\w'foo'u/2u)'\v'0.513i-(2v/2u)+0v+0.22m'foo .sp -1 .lf 1733 \h'4.550i-(\w'type STRING'u/2u)'\v'0.513i-(2v/2u)+1v+0.22m'type STRING .sp -1 .lf 1733 \h'4.550i-(\w'is_comment'u/2u)'\v'0.513i-(2v/2u)+2v+0.22m'is_comment .sp -1 \h'3.750i'\v'0.483i'\D'l0.400i 0.000i' .sp -1 \h'4.150i'\v'0.483i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.150i'\v'0.483i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1734 \h'3.950i-(\w'next2'u/2u)'\v'0.483i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.150i'\v'0.543i'\D'l-0.400i 0.000i' .sp -1 \h'3.750i'\v'0.543i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.750i'\v'0.543i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1735 \h'3.950i-(\w'prev2'u/2u)'\v'0.543i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'6.050i'\v'0.763i'\D'p0.000i -0.500i -0.700i 0.000i 0.000i 0.500i' .sp -1 .lf 1736 \h'5.700i-(\w')'u/2u)'\v'0.513i-(2v/2u)+0v+0.22m') .sp -1 .lf 1736 \h'5.700i-(\w\(tstype ')'\(tsu/2u)'\v'0.513i-(2v/2u)+1v+0.22m'type ')' .sp -1 .lf 1736 \h'5.700i-(\w'is_comment'u/2u)'\v'0.513i-(2v/2u)+2v+0.22m'is_comment .sp -1 \h'4.950i'\v'0.483i'\D'l0.400i 0.000i' .sp -1 
\h'5.350i'\v'0.483i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'5.350i'\v'0.483i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1737 \h'5.150i-(\w'next2'u/2u)'\v'0.483i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'5.350i'\v'0.543i'\D'l-0.400i 0.000i' .sp -1 \h'4.950i'\v'0.543i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.950i'\v'0.543i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1738 \h'5.150i-(\w'prev2'u/2u)'\v'0.543i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'1.500i'\v'0.313i'\D'~0.500i -0.300i 0.400i 0.300i' .sp -1 \h'2.400i'\v'0.313i'\D'P-0.095i -0.040i 0.030i -0.040i' .sp -1 \h'2.400i'\v'0.313i'\D'p-0.095i -0.040i 0.030i -0.040i' .sp -1 .lf 1739 \h'1.950i-(\w'trailer'u/2u)'\v'0.313i-(0v/2u)+0v+0.22m-.5v'trailer .sp -1 \h'2.400i'\v'0.313i'\D'~0.500i -0.300i 0.500i 0.250i' .sp -1 \h'3.400i'\v'0.263i'\D'P-0.101i -0.022i 0.022i -0.045i' .sp -1 \h'3.400i'\v'0.263i'\D'p-0.101i -0.022i 0.022i -0.045i' .sp -1 .lf 1740 \h'2.900i-(\w'next'u/2u)'\v'0.288i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'3.400i'\v'0.263i'\D'~0.500i -0.300i 0.650i 0.300i' .sp -1 \h'4.550i'\v'0.263i'\D'P-0.101i -0.019i 0.021i -0.045i' .sp -1 \h'4.550i'\v'0.263i'\D'p-0.101i -0.019i 0.021i -0.045i' .sp -1 .lf 1741 \h'3.975i-(\w'next'u/2u)'\v'0.263i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'4.550i'\v'0.263i'\D'~0.500i -0.300i 0.650i 0.300i' .sp -1 \h'5.700i'\v'0.263i'\D'P-0.101i -0.019i 0.021i -0.045i' .sp -1 \h'5.700i'\v'0.263i'\D'p-0.101i -0.019i 0.021i -0.045i' .sp -1 .lf 1742 \h'5.125i-(\w'next'u/2u)'\v'0.263i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'3.400i'\v'0.263i'\D'~0.350i -0.300i 1.600i 0.000i 0.350i 0.300i' .sp -1 \h'5.700i'\v'0.263i'\D'P-0.092i -0.046i 0.033i -0.038i' .sp -1 \h'5.700i'\v'0.263i'\D'p-0.092i -0.046i 0.033i -0.038i' .sp -1 .lf 1744 \h'4.550i-(\w'close'u/2u)'\v'0.063i-(0v/2u)+0v+0.22m'close .sp -1 \h'1.250i'\v'2.013i'\D'p0.000i -0.900i -0.900i 0.000i 0.000i 0.900i' .sp -1 .lf 1745 \h'0.800i-(\w'Group'u/2u)'\v'1.562i-(4v/2u)+0v+0.22m'Group .sp -1 .lf 1745 \h'0.800i-(\w'type 
STRING'u/2u)'\v'1.562i-(4v/2u)+1v+0.22m'type STRING .sp -1 .lf 1745 \h'0.800i-(\w'contains_group'u/2u)'\v'1.562i-(4v/2u)+2v+0.22m'contains_group .sp -1 .lf 1745 \h'0.800i-(\w'is_element'u/2u)'\v'1.562i-(4v/2u)+3v+0.22m'is_element .sp -1 .lf 1745 \h'0.800i-(\w'in_phrase'u/2u)'\v'1.562i-(4v/2u)+4v+0.22m'in_phrase .sp -1 \h'2.150i'\v'1.763i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1746 \h'1.900i-(\w':'u/2u)'\v'1.562i-(1v/2u)+0v+0.22m': .sp -1 .lf 1746 \h'1.900i-(\w\(tstype ':'\(tsu/2u)'\v'1.562i-(1v/2u)+1v+0.22m'type ':' .sp -1 \h'1.250i'\v'1.533i'\D'l0.400i 0.000i' .sp -1 \h'1.650i'\v'1.533i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'1.650i'\v'1.533i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1747 \h'1.450i-(\w'next2'u/2u)'\v'1.533i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'1.650i'\v'1.593i'\D'l-0.400i 0.000i' .sp -1 \h'1.250i'\v'1.593i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'1.250i'\v'1.593i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1748 \h'1.450i-(\w'prev2'u/2u)'\v'1.593i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'3.050i'\v'1.763i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1749 \h'2.800i-(\w' 'u/2u)'\v'1.562i-(1v/2u)+0v+0.22m' .sp -1 .lf 1749 \h'2.800i-(\w'type WS'u/2u)'\v'1.562i-(1v/2u)+1v+0.22m'type WS .sp -1 \h'2.150i'\v'1.533i'\D'l0.400i 0.000i' .sp -1 \h'2.550i'\v'1.533i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'2.550i'\v'1.533i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1750 \h'2.350i-(\w'next2'u/2u)'\v'1.533i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'2.550i'\v'1.593i'\D'l-0.400i 0.000i' .sp -1 \h'2.150i'\v'1.593i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'2.150i'\v'1.593i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1751 \h'2.350i-(\w'prev2'u/2u)'\v'1.593i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'4.250i'\v'1.913i'\D'p0.000i -0.700i -0.800i 0.000i 0.000i 0.700i' .sp -1 .lf 1752 \h'3.850i-(\w'Name'u/2u)'\v'1.562i-(3v/2u)+0v+0.22m'Name .sp -1 .lf 1752 \h'3.850i-(\w'type STRING'u/2u)'\v'1.562i-(3v/2u)+1v+0.22m'type STRING .sp 
-1 .lf 1752 \h'3.850i-(\w'is_element'u/2u)'\v'1.562i-(3v/2u)+2v+0.22m'is_element .sp -1 .lf 1752 \h'3.850i-(\w'in_phrase'u/2u)'\v'1.562i-(3v/2u)+3v+0.22m'in_phrase .sp -1 \h'3.050i'\v'1.533i'\D'l0.400i 0.000i' .sp -1 \h'3.450i'\v'1.533i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'3.450i'\v'1.533i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1753 \h'3.250i-(\w'next2'u/2u)'\v'1.533i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'3.450i'\v'1.593i'\D'l-0.400i 0.000i' .sp -1 \h'3.050i'\v'1.593i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.050i'\v'1.593i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1754 \h'3.250i-(\w'prev2'u/2u)'\v'1.593i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.150i'\v'1.763i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1755 \h'4.900i-(\w' 'u/2u)'\v'1.562i-(1v/2u)+0v+0.22m' .sp -1 .lf 1755 \h'4.900i-(\w'type WS'u/2u)'\v'1.562i-(1v/2u)+1v+0.22m'type WS .sp -1 \h'4.250i'\v'1.533i'\D'l0.400i 0.000i' .sp -1 \h'4.650i'\v'1.533i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.650i'\v'1.533i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1756 \h'4.450i-(\w'next2'u/2u)'\v'1.533i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.650i'\v'1.593i'\D'l-0.400i 0.000i' .sp -1 \h'4.250i'\v'1.593i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.250i'\v'1.593i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1757 \h'4.450i-(\w'prev2'u/2u)'\v'1.593i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'6.050i'\v'1.763i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1758 \h'5.800i-(\w'<'u/2u)'\v'1.562i-(1v/2u)+0v+0.22m'< .sp -1 .lf 1758 \h'5.800i-(\w\(tstype '<'\(tsu/2u)'\v'1.562i-(1v/2u)+1v+0.22m'type '<' .sp -1 \h'5.150i'\v'1.533i'\D'l0.400i 0.000i' .sp -1 \h'5.550i'\v'1.533i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'5.550i'\v'1.533i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1759 \h'5.350i-(\w'next2'u/2u)'\v'1.533i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'5.550i'\v'1.593i'\D'l-0.400i 0.000i' .sp -1 \h'5.150i'\v'1.593i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 
\h'5.150i'\v'1.593i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1760 \h'5.350i-(\w'prev2'u/2u)'\v'1.593i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'6.050i'\v'0.763i'\D'l-4.800i 0.350i' .sp -1 \h'1.250i'\v'1.113i'\D'P0.098i -0.032i 0.004i 0.050i' .sp -1 \h'1.250i'\v'1.113i'\D'p0.098i -0.032i 0.004i 0.050i' .sp -1 .lf 1761 \h'3.650i-(\w'next2'u/2u)'\v'0.938i-(0v/2u)+0v+0.22m'next2 .sp -1 \h'0.350i'\v'1.113i'\D'l5.000i -0.350i' .sp -1 \h'5.350i'\v'0.763i'\D'P-0.098i 0.032i -0.003i -0.050i' .sp -1 \h'5.350i'\v'0.763i'\D'p-0.098i 0.032i -0.003i -0.050i' .sp -1 .lf 1762 \h'2.850i-(\w'prev2'u/2u)'\v'0.938i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'1.900i'\v'1.363i'\D'~0.500i -0.300i 0.400i 0.300i' .sp -1 \h'2.800i'\v'1.363i'\D'P-0.095i -0.040i 0.030i -0.040i' .sp -1 \h'2.800i'\v'1.363i'\D'p-0.095i -0.040i 0.030i -0.040i' .sp -1 .lf 1763 \h'2.350i-(\w'trailer'u/2u)'\v'1.363i-(0v/2u)+0v+0.22m-.5v'trailer .sp -1 \h'3.850i'\v'1.212i'\D'~0.500i -0.300i 0.550i 0.450i' .sp -1 \h'4.900i'\v'1.363i'\D'P-0.093i -0.044i 0.032i -0.039i' .sp -1 \h'4.900i'\v'1.363i'\D'p-0.093i -0.044i 0.032i -0.039i' .sp -1 .lf 1764 \h'4.375i-(\w'trailer'u/2u)'\v'1.288i-(0v/2u)+0v+0.22m-.5v'trailer .sp -1 \h'1.200i'\v'3.062i'\D'p0.000i -0.550i -0.800i 0.000i 0.000i 0.550i' .sp -1 .lf 1765 \h'0.800i-(\w'a'u/2u)'\v'2.788i-(2v/2u)+0v+0.22m'a .sp -1 .lf 1765 \h'0.800i-(\w'type STRING'u/2u)'\v'2.788i-(2v/2u)+1v+0.22m'type STRING .sp -1 .lf 1765 \h'0.800i-(\w'in_phrase'u/2u)'\v'2.788i-(2v/2u)+2v+0.22m'in_phrase .sp -1 \h'2.100i'\v'2.988i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1766 \h'1.850i-(\w'@'u/2u)'\v'2.788i-(1v/2u)+0v+0.22m'@ .sp -1 .lf 1766 \h'1.850i-(\w\(tstype '@'\(tsu/2u)'\v'2.788i-(1v/2u)+1v+0.22m'type '@' .sp -1 \h'1.200i'\v'2.758i'\D'l0.400i 0.000i' .sp -1 \h'1.600i'\v'2.758i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'1.600i'\v'2.758i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1767 \h'1.400i-(\w'next2'u/2u)'\v'2.758i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'1.600i'\v'2.817i'\D'l-0.400i 
0.000i' .sp -1 \h'1.200i'\v'2.817i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'1.200i'\v'2.817i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1768 \h'1.400i-(\w'prev2'u/2u)'\v'2.817i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'3.300i'\v'2.988i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1769 \h'2.900i-(\w'b'u/2u)'\v'2.788i-(1v/2u)+0v+0.22m'b .sp -1 .lf 1769 \h'2.900i-(\w'type STRING'u/2u)'\v'2.788i-(1v/2u)+1v+0.22m'type STRING .sp -1 \h'2.100i'\v'2.758i'\D'l0.400i 0.000i' .sp -1 \h'2.500i'\v'2.758i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'2.500i'\v'2.758i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1770 \h'2.300i-(\w'next2'u/2u)'\v'2.758i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'2.500i'\v'2.817i'\D'l-0.400i 0.000i' .sp -1 \h'2.100i'\v'2.817i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'2.100i'\v'2.817i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1771 \h'2.300i-(\w'prev2'u/2u)'\v'2.817i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'4.200i'\v'2.988i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1772 \h'3.950i-(\w'.'u/2u)'\v'2.788i-(1v/2u)+0v+0.22m'. .sp -1 .lf 1772 \h'3.950i-(\w\(tstype '.'\(tsu/2u)'\v'2.788i-(1v/2u)+1v+0.22m'type '.' 
.sp -1 \h'3.300i'\v'2.758i'\D'l0.400i 0.000i' .sp -1 \h'3.700i'\v'2.758i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'3.700i'\v'2.758i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1773 \h'3.500i-(\w'next2'u/2u)'\v'2.758i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'3.700i'\v'2.817i'\D'l-0.400i 0.000i' .sp -1 \h'3.300i'\v'2.817i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.300i'\v'2.817i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1774 \h'3.500i-(\w'prev2'u/2u)'\v'2.817i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.400i'\v'2.988i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1775 \h'5.000i-(\w'com'u/2u)'\v'2.788i-(1v/2u)+0v+0.22m'com .sp -1 .lf 1775 \h'5.000i-(\w'type STRING'u/2u)'\v'2.788i-(1v/2u)+1v+0.22m'type STRING .sp -1 \h'4.200i'\v'2.758i'\D'l0.400i 0.000i' .sp -1 \h'4.600i'\v'2.758i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.600i'\v'2.758i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1776 \h'4.400i-(\w'next2'u/2u)'\v'2.758i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.600i'\v'2.817i'\D'l-0.400i 0.000i' .sp -1 \h'4.200i'\v'2.817i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.200i'\v'2.817i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1777 \h'4.400i-(\w'prev2'u/2u)'\v'2.817i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.800i'\v'1.763i'\D'l-4.600i 0.750i' .sp -1 \h'1.200i'\v'2.513i'\D'P0.095i -0.041i 0.008i 0.049i' .sp -1 \h'1.200i'\v'2.513i'\D'p0.095i -0.041i 0.008i 0.049i' .sp -1 .lf 1778 \h'3.500i-(\w' next2'u/2u)'\v'2.138i-(0v/2u)+0v+0.22m' next2 .sp -1 \h'0.800i'\v'2.513i'\D'l4.750i -0.750i' .sp -1 \h'5.550i'\v'1.763i'\D'P-0.095i 0.040i -0.008i -0.049i' .sp -1 \h'5.550i'\v'1.763i'\D'p-0.095i 0.040i -0.008i -0.049i' .sp -1 .lf 1779 \h'3.175i-(\w'prev2 'u/2u)'\v'2.138i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'3.800i'\v'4.088i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1780 \h'3.550i-(\w'>'u/2u)'\v'3.888i-(1v/2u)+0v+0.22m'> .sp -1 .lf 1780 \h'3.550i-(\w\(tstype '>'\(tsu/2u)'\v'3.888i-(1v/2u)+1v+0.22m'type '>' .sp -1 \h'4.700i'\v'4.088i'\D'p0.000i -0.400i 
-0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1781 \h'4.450i-(\w','u/2u)'\v'3.888i-(1v/2u)+0v+0.22m', .sp -1 .lf 1781 \h'4.450i-(\w\(tstype ','\(tsu/2u)'\v'3.888i-(1v/2u)+1v+0.22m'type ',' .sp -1 \h'3.800i'\v'3.858i'\D'l0.400i 0.000i' .sp -1 \h'4.200i'\v'3.858i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.200i'\v'3.858i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1782 \h'4.000i-(\w'next2'u/2u)'\v'3.858i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.200i'\v'3.918i'\D'l-0.400i 0.000i' .sp -1 \h'3.800i'\v'3.918i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.800i'\v'3.918i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1783 \h'4.000i-(\w'prev2'u/2u)'\v'3.918i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.600i'\v'4.088i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1784 \h'5.350i-(\w' 'u/2u)'\v'3.888i-(1v/2u)+0v+0.22m' .sp -1 .lf 1784 \h'5.350i-(\w'type WS'u/2u)'\v'3.888i-(1v/2u)+1v+0.22m'type WS .sp -1 \h'4.700i'\v'3.858i'\D'l0.400i 0.000i' .sp -1 \h'5.100i'\v'3.858i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'5.100i'\v'3.858i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1785 \h'4.900i-(\w'next2'u/2u)'\v'3.858i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'5.100i'\v'3.918i'\D'l-0.400i 0.000i' .sp -1 \h'4.700i'\v'3.918i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.700i'\v'3.918i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1786 \h'4.900i-(\w'prev2'u/2u)'\v'3.918i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'4.450i'\v'3.688i'\D'~0.500i -0.300i 0.400i 0.300i' .sp -1 \h'5.350i'\v'3.688i'\D'P-0.095i -0.040i 0.030i -0.040i' .sp -1 \h'5.350i'\v'3.688i'\D'p-0.095i -0.040i 0.030i -0.040i' .sp -1 .lf 1787 \h'4.900i-(\w'trailer'u/2u)'\v'3.688i-(0v/2u)+0v+0.22m-.5v'trailer .sp -1 \h'0.800i'\v'3.062i'\D'~0.500i 0.300i 0.550i -0.375i' .sp -1 \h'1.850i'\v'2.988i'\D'P-0.069i 0.077i -0.028i -0.041i' .sp -1 \h'1.850i'\v'2.988i'\D'p-0.069i 0.077i -0.028i -0.041i' .sp -1 .lf 1788 \h'1.325i-(\w'next'u/2u)'\v'3.025i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'1.850i'\v'2.988i'\D'~0.500i 0.300i 0.550i -0.300i' .sp 
-1 \h'2.900i'\v'2.988i'\D'P-0.076i 0.070i -0.024i -0.044i' .sp -1 \h'2.900i'\v'2.988i'\D'p-0.076i 0.070i -0.024i -0.044i' .sp -1 .lf 1789 \h'2.375i-(\w'next'u/2u)'\v'2.988i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'2.900i'\v'2.988i'\D'~0.500i 0.300i 0.550i -0.300i' .sp -1 \h'3.950i'\v'2.988i'\D'P-0.076i 0.070i -0.024i -0.044i' .sp -1 \h'3.950i'\v'2.988i'\D'p-0.076i 0.070i -0.024i -0.044i' .sp -1 .lf 1790 \h'3.425i-(\w'next'u/2u)'\v'2.988i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'3.950i'\v'2.587i'\D'~0.500i -0.300i 0.550i 0.300i' .sp -1 \h'5.000i'\v'2.587i'\D'P-0.100i -0.026i 0.024i -0.044i' .sp -1 \h'5.000i'\v'2.587i'\D'p-0.100i -0.026i 0.024i -0.044i' .sp -1 .lf 1791 \h'4.475i-(\w'next'u/2u)'\v'2.587i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'4.250i'\v'1.913i'\D'~1.650i 0.575i 0.000i 0.700i -1.200i 0.200i -0.500i 0.300i' .sp -1 \h'4.200i'\v'3.688i'\D'P0.073i -0.073i 0.026i 0.043i' .sp -1 \h'4.200i'\v'3.688i'\D'p0.073i -0.073i 0.026i 0.043i' .sp -1 .lf 1793 \h'5.400i-(\w'delimiter'u/2u)'\v'3.388i-(0v/2u)+0v+0.22m'delimiter .sp -1 \h'5.000i'\v'2.988i'\D'l-1.450i 0.700i' .sp -1 \h'3.550i'\v'3.688i'\D'P0.079i -0.066i 0.022i 0.045i' .sp -1 \h'3.550i'\v'3.688i'\D'p0.079i -0.066i 0.022i 0.045i' .sp -1 .lf 1794 \h'4.275i-(\w' next2'u/2u)'\v'3.338i-(0v/2u)+0v+0.22m' next2 .sp -1 \h'3.300i'\v'3.688i'\D'l1.300i -0.700i' .sp -1 \h'4.600i'\v'2.988i'\D'P-0.076i 0.069i -0.024i -0.044i' .sp -1 \h'4.600i'\v'2.988i'\D'p-0.076i 0.069i -0.024i -0.044i' .sp -1 .lf 1795 \h'3.950i-(\w'prev2 'u/2u)'\v'3.338i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'5.800i'\v'1.763i'\D'~-0.200i 1.425i -0.900i 0.100i -0.900i 0.400i' .sp -1 \h'3.800i'\v'3.688i'\D'P0.081i -0.063i 0.020i 0.046i' .sp -1 \h'3.800i'\v'3.688i'\D'p0.081i -0.063i 0.020i 0.046i' .sp -1 .lf 1797 \h'5.700i-(\w'close'u/2u)'\v'2.188i-(0v/2u)+0v+0.22m'close .sp -1 \h'1.200i'\v'5.413i'\D'p0.000i -0.650i -0.800i 0.000i 0.000i 0.650i' .sp -1 .lf 1798 \h'0.800i-(\w'c'u/2u)'\v'5.088i-(3v/2u)+0v+0.22m'c .sp -1 .lf 1798 \h'0.800i-(\w'type 
STRING'u/2u)'\v'5.088i-(3v/2u)+1v+0.22m'type STRING .sp -1 .lf 1798 \h'0.800i-(\w'is_element'u/2u)'\v'5.088i-(3v/2u)+2v+0.22m'is_element .sp -1 .lf 1798 \h'0.800i-(\w'in_phrase'u/2u)'\v'5.088i-(3v/2u)+3v+0.22m'in_phrase .sp -1 \h'2.100i'\v'5.288i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1799 \h'1.850i-(\w'@'u/2u)'\v'5.088i-(1v/2u)+0v+0.22m'@ .sp -1 .lf 1799 \h'1.850i-(\w\(tstype '@'\(tsu/2u)'\v'5.088i-(1v/2u)+1v+0.22m'type '@' .sp -1 \h'1.200i'\v'5.058i'\D'l0.400i 0.000i' .sp -1 \h'1.600i'\v'5.058i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'1.600i'\v'5.058i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1800 \h'1.400i-(\w'next2'u/2u)'\v'5.058i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'1.600i'\v'5.118i'\D'l-0.400i 0.000i' .sp -1 \h'1.200i'\v'5.118i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'1.200i'\v'5.118i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1801 \h'1.400i-(\w'prev2'u/2u)'\v'5.118i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'3.300i'\v'5.288i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1802 \h'2.900i-(\w'd'u/2u)'\v'5.088i-(1v/2u)+0v+0.22m'd .sp -1 .lf 1802 \h'2.900i-(\w'type STRING'u/2u)'\v'5.088i-(1v/2u)+1v+0.22m'type STRING .sp -1 \h'2.100i'\v'5.058i'\D'l0.400i 0.000i' .sp -1 \h'2.500i'\v'5.058i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'2.500i'\v'5.058i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1803 \h'2.300i-(\w'next2'u/2u)'\v'5.058i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'2.500i'\v'5.118i'\D'l-0.400i 0.000i' .sp -1 \h'2.100i'\v'5.118i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'2.100i'\v'5.118i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1804 \h'2.300i-(\w'prev2'u/2u)'\v'5.118i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'4.200i'\v'5.288i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1805 \h'3.950i-(\w'.'u/2u)'\v'5.088i-(1v/2u)+0v+0.22m'. .sp -1 .lf 1805 \h'3.950i-(\w\(tstype '.'\(tsu/2u)'\v'5.088i-(1v/2u)+1v+0.22m'type '.' 
.sp -1 \h'3.300i'\v'5.058i'\D'l0.400i 0.000i' .sp -1 \h'3.700i'\v'5.058i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'3.700i'\v'5.058i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1806 \h'3.500i-(\w'next2'u/2u)'\v'5.058i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'3.700i'\v'5.118i'\D'l-0.400i 0.000i' .sp -1 \h'3.300i'\v'5.118i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.300i'\v'5.118i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1807 \h'3.500i-(\w'prev2'u/2u)'\v'5.118i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.400i'\v'5.288i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1808 \h'5.000i-(\w'edu'u/2u)'\v'5.088i-(1v/2u)+0v+0.22m'edu .sp -1 .lf 1808 \h'5.000i-(\w'type STRING'u/2u)'\v'5.088i-(1v/2u)+1v+0.22m'type STRING .sp -1 \h'4.200i'\v'5.058i'\D'l0.400i 0.000i' .sp -1 \h'4.600i'\v'5.058i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.600i'\v'5.058i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1809 \h'4.400i-(\w'next2'u/2u)'\v'5.058i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.600i'\v'5.118i'\D'l-0.400i 0.000i' .sp -1 \h'4.200i'\v'5.118i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.200i'\v'5.118i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1810 \h'4.400i-(\w'prev2'u/2u)'\v'5.118i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'0.800i'\v'5.413i'\D'~0.500i 0.300i 0.550i -0.425i' .sp -1 \h'1.850i'\v'5.288i'\D'P-0.064i 0.081i -0.031i -0.040i' .sp -1 \h'1.850i'\v'5.288i'\D'p-0.064i 0.081i -0.031i -0.040i' .sp -1 .lf 1811 \h'1.325i-(\w'next'u/2u)'\v'5.350i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'1.850i'\v'5.288i'\D'~0.500i 0.300i 0.550i -0.300i' .sp -1 \h'2.900i'\v'5.288i'\D'P-0.076i 0.070i -0.024i -0.044i' .sp -1 \h'2.900i'\v'5.288i'\D'p-0.076i 0.070i -0.024i -0.044i' .sp -1 .lf 1812 \h'2.375i-(\w'next'u/2u)'\v'5.288i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'2.900i'\v'5.288i'\D'~0.500i 0.300i 0.550i -0.300i' .sp -1 \h'3.950i'\v'5.288i'\D'P-0.076i 0.070i -0.024i -0.044i' .sp -1 \h'3.950i'\v'5.288i'\D'p-0.076i 0.070i -0.024i -0.044i' .sp -1 .lf 1813 
\h'3.425i-(\w'next'u/2u)'\v'5.288i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'3.950i'\v'4.888i'\D'~0.500i -0.300i 0.550i 0.300i' .sp -1 \h'5.000i'\v'4.888i'\D'P-0.100i -0.026i 0.024i -0.044i' .sp -1 \h'5.000i'\v'4.888i'\D'p-0.100i -0.026i 0.024i -0.044i' .sp -1 .lf 1814 \h'4.475i-(\w'next'u/2u)'\v'4.888i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'3.850i'\v'1.913i'\D'~-3.450i 0.550i -0.050i 0.050i 0.000i 0.550i 0.050i 1.700i' .sp -1 \h'0.400i'\v'4.763i'\D'P-0.028i -0.099i 0.050i -0.001i' .sp -1 \h'0.400i'\v'4.763i'\D'p-0.028i -0.099i 0.050i -0.001i' .sp -1 .lf 1816 \h'0.500i-(\w'element'u/2u)'\v'3.888i-(0v/2u)+0v+0.22m'element .sp -1 \h'4.450i'\v'4.088i'\D'l-4.050i 0.675i' .sp -1 \h'0.400i'\v'4.763i'\D'P0.095i -0.041i 0.008i 0.049i' .sp -1 \h'0.400i'\v'4.763i'\D'p0.095i -0.041i 0.008i 0.049i' .sp -1 .lf 1817 \h'2.425i-(\w'element'u/2u)'\v'4.425i-(0v/2u)+0v+0.22m'element .sp -1 \h'3.950i'\v'6.388i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1818 \h'3.700i-(\w';'u/2u)'\v'6.188i-(1v/2u)+0v+0.22m'; .sp -1 .lf 1818 \h'3.700i-(\w\(tstype ';'\(tsu/2u)'\v'6.188i-(1v/2u)+1v+0.22m'type ';' .sp -1 \h'4.850i'\v'6.388i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1819 \h'4.600i-(\w','u/2u)'\v'6.188i-(1v/2u)+0v+0.22m', .sp -1 .lf 1819 \h'4.600i-(\w\(tstype ','\(tsu/2u)'\v'6.188i-(1v/2u)+1v+0.22m'type ',' .sp -1 \h'3.950i'\v'6.158i'\D'l0.400i 0.000i' .sp -1 \h'4.350i'\v'6.158i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.350i'\v'6.158i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1820 \h'4.150i-(\w'next2'u/2u)'\v'6.158i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.350i'\v'6.218i'\D'l-0.400i 0.000i' .sp -1 \h'3.950i'\v'6.218i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.950i'\v'6.218i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1821 \h'4.150i-(\w'prev2'u/2u)'\v'6.218i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.750i'\v'6.388i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1822 \h'5.500i-(\w' 'u/2u)'\v'6.188i-(1v/2u)+0v+0.22m' .sp -1 .lf 1822 
\h'5.500i-(\w'type WS'u/2u)'\v'6.188i-(1v/2u)+1v+0.22m'type WS .sp -1 \h'4.850i'\v'6.158i'\D'l0.400i 0.000i' .sp -1 \h'5.250i'\v'6.158i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'5.250i'\v'6.158i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1823 \h'5.050i-(\w'next2'u/2u)'\v'6.158i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'5.250i'\v'6.218i'\D'l-0.400i 0.000i' .sp -1 \h'4.850i'\v'6.218i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.850i'\v'6.218i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1824 \h'5.050i-(\w'prev2'u/2u)'\v'6.218i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'4.600i'\v'5.988i'\D'~0.500i -0.300i 0.400i 0.300i' .sp -1 \h'5.500i'\v'5.988i'\D'P-0.095i -0.040i 0.030i -0.040i' .sp -1 \h'5.500i'\v'5.988i'\D'p-0.095i -0.040i 0.030i -0.040i' .sp -1 .lf 1825 \h'5.050i-(\w'trailer'u/2u)'\v'5.988i-(0v/2u)+0v+0.22m-.5v'trailer .sp -1 \h'5.000i'\v'5.288i'\D'l-1.300i 0.700i' .sp -1 \h'3.700i'\v'5.988i'\D'P0.076i -0.069i 0.024i 0.044i' .sp -1 \h'3.700i'\v'5.988i'\D'p0.076i -0.069i 0.024i 0.044i' .sp -1 .lf 1826 \h'4.350i-(\w' next2'u/2u)'\v'5.638i-(0v/2u)+0v+0.22m' next2 .sp -1 \h'3.450i'\v'5.988i'\D'l1.150i -0.700i' .sp -1 \h'4.600i'\v'5.288i'\D'P-0.072i 0.073i -0.026i -0.043i' .sp -1 \h'4.600i'\v'5.288i'\D'p-0.072i 0.073i -0.026i -0.043i' .sp -1 .lf 1827 \h'4.025i-(\w'prev2 'u/2u)'\v'5.638i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'1.900i'\v'1.763i'\D'~4.150i 0.625i 0.000i 1.500i 0.000i 1.400i -2.100i 0.700i' .sp -1 \h'3.950i'\v'5.988i'\D'P0.087i -0.055i 0.016i 0.047i' .sp -1 \h'3.950i'\v'5.988i'\D'p0.087i -0.055i 0.016i 0.047i' .sp -1 .lf 1829 \h'5.900i-(\w'close'u/2u)'\v'3.888i-(0v/2u)+0v+0.22m'close .sp -1 \h'5.600i'\v'4.088i'\D'l-4.400i 0.675i' .sp -1 \h'1.200i'\v'4.763i'\D'P0.095i -0.040i 0.008i 0.049i' .sp -1 \h'1.200i'\v'4.763i'\D'p0.095i -0.040i 0.008i 0.049i' .sp -1 .lf 1830 \h'3.400i-(\w' next2'u/2u)'\v'4.425i-(0v/2u)+0v+0.22m' next2 .sp -1 \h'0.800i'\v'4.763i'\D'l4.300i -0.675i' .sp -1 \h'5.100i'\v'4.088i'\D'P-0.095i 0.040i -0.008i -0.049i' .sp -1 
\h'5.100i'\v'4.088i'\D'p-0.095i 0.040i -0.008i -0.049i' .sp -1 .lf 1831 \h'2.950i-(\w'prev2 'u/2u)'\v'4.425i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'1.200i'\v'7.663i'\D'p0.000i -0.650i -0.800i 0.000i 0.000i 0.650i' .sp -1 .lf 1832 \h'0.800i-(\w'e'u/2u)'\v'7.338i-(3v/2u)+0v+0.22m'e .sp -1 .lf 1832 \h'0.800i-(\w'type STRING'u/2u)'\v'7.338i-(3v/2u)+1v+0.22m'type STRING .sp -1 .lf 1832 \h'0.800i-(\w'is_element'u/2u)'\v'7.338i-(3v/2u)+2v+0.22m'is_element .sp -1 .lf 1832 \h'0.800i-(\w'in_phrase'u/2u)'\v'7.338i-(3v/2u)+3v+0.22m'in_phrase .sp -1 \h'2.100i'\v'7.538i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1833 \h'1.850i-(\w'@'u/2u)'\v'7.338i-(1v/2u)+0v+0.22m'@ .sp -1 .lf 1833 \h'1.850i-(\w\(tstype '@'\(tsu/2u)'\v'7.338i-(1v/2u)+1v+0.22m'type '@' .sp -1 \h'1.200i'\v'7.308i'\D'l0.400i 0.000i' .sp -1 \h'1.600i'\v'7.308i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'1.600i'\v'7.308i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1834 \h'1.400i-(\w'next2'u/2u)'\v'7.308i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'1.600i'\v'7.368i'\D'l-0.400i 0.000i' .sp -1 \h'1.200i'\v'7.368i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'1.200i'\v'7.368i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1835 \h'1.400i-(\w'prev2'u/2u)'\v'7.368i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'3.300i'\v'7.538i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1836 \h'2.900i-(\w'f'u/2u)'\v'7.338i-(1v/2u)+0v+0.22m'f .sp -1 .lf 1836 \h'2.900i-(\w'type STRING'u/2u)'\v'7.338i-(1v/2u)+1v+0.22m'type STRING .sp -1 \h'2.100i'\v'7.308i'\D'l0.400i 0.000i' .sp -1 \h'2.500i'\v'7.308i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'2.500i'\v'7.308i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1837 \h'2.300i-(\w'next2'u/2u)'\v'7.308i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'2.500i'\v'7.368i'\D'l-0.400i 0.000i' .sp -1 \h'2.100i'\v'7.368i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'2.100i'\v'7.368i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1838 \h'2.300i-(\w'prev2'u/2u)'\v'7.368i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 
\h'4.200i'\v'7.538i'\D'p0.000i -0.400i -0.500i 0.000i 0.000i 0.400i' .sp -1 .lf 1839 \h'3.950i-(\w'.'u/2u)'\v'7.338i-(1v/2u)+0v+0.22m'. .sp -1 .lf 1839 \h'3.950i-(\w\(tstype '.'\(tsu/2u)'\v'7.338i-(1v/2u)+1v+0.22m'type '.' .sp -1 \h'3.300i'\v'7.308i'\D'l0.400i 0.000i' .sp -1 \h'3.700i'\v'7.308i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'3.700i'\v'7.308i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1840 \h'3.500i-(\w'next2'u/2u)'\v'7.308i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'3.700i'\v'7.368i'\D'l-0.400i 0.000i' .sp -1 \h'3.300i'\v'7.368i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.300i'\v'7.368i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1841 \h'3.500i-(\w'prev2'u/2u)'\v'7.368i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'5.400i'\v'7.538i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1842 \h'5.000i-(\w'gov'u/2u)'\v'7.338i-(1v/2u)+0v+0.22m'gov .sp -1 .lf 1842 \h'5.000i-(\w'type STRING'u/2u)'\v'7.338i-(1v/2u)+1v+0.22m'type STRING .sp -1 \h'4.200i'\v'7.308i'\D'l0.400i 0.000i' .sp -1 \h'4.600i'\v'7.308i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.600i'\v'7.308i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 .lf 1843 \h'4.400i-(\w'next2'u/2u)'\v'7.308i-(0v/2u)+0v+0.22m-.5v'next2 .sp -1 \h'4.600i'\v'7.368i'\D'l-0.400i 0.000i' .sp -1 \h'4.200i'\v'7.368i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'4.200i'\v'7.368i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 1844 \h'4.400i-(\w'prev2'u/2u)'\v'7.368i-(0v/2u)+0v+0.22m+.5v'prev2 .sp -1 \h'0.800i'\v'7.663i'\D'~0.500i 0.300i 0.550i -0.425i' .sp -1 \h'1.850i'\v'7.538i'\D'P-0.064i 0.081i -0.031i -0.040i' .sp -1 \h'1.850i'\v'7.538i'\D'p-0.064i 0.081i -0.031i -0.040i' .sp -1 .lf 1845 \h'1.325i-(\w'next'u/2u)'\v'7.600i-(0v/2u)+0v+0.22m+.5v'next .sp -1 \h'1.850i'\v'7.538i'\D'~0.500i 0.300i 0.550i -0.300i' .sp -1 \h'2.900i'\v'7.538i'\D'P-0.076i 0.070i -0.024i -0.044i' .sp -1 \h'2.900i'\v'7.538i'\D'p-0.076i 0.070i -0.024i -0.044i' .sp -1 .lf 1846 \h'2.375i-(\w'next'u/2u)'\v'7.538i-(0v/2u)+0v+0.22m+.5v'next .sp -1 
\h'2.900i'\v'7.138i'\D'~0.500i -0.300i 0.550i 0.300i' .sp -1 \h'3.950i'\v'7.138i'\D'P-0.100i -0.026i 0.024i -0.044i' .sp -1 \h'3.950i'\v'7.138i'\D'p-0.100i -0.026i 0.024i -0.044i' .sp -1 .lf 1847 \h'3.425i-(\w'next'u/2u)'\v'7.138i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'3.950i'\v'7.138i'\D'~0.500i -0.300i 0.550i 0.300i' .sp -1 \h'5.000i'\v'7.138i'\D'P-0.100i -0.026i 0.024i -0.044i' .sp -1 \h'5.000i'\v'7.138i'\D'p-0.100i -0.026i 0.024i -0.044i' .sp -1 .lf 1848 \h'4.475i-(\w'next'u/2u)'\v'7.138i-(0v/2u)+0v+0.22m-.5v'next .sp -1 \h'0.800i'\v'2.013i'\D'~-0.600i 0.300i 0.000i 0.750i 0.000i 1.700i 0.000i 0.850i 1.000i 0.200i 2.250i -0.025i 0.900i 0.200i' .sp -1 \h'4.350i'\v'5.988i'\D'P-0.103i 0.003i 0.011i -0.049i' .sp -1 \h'4.350i'\v'5.988i'\D'p-0.103i 0.003i 0.011i -0.049i' .sp -1 .lf 1850 \h'2.900i-(\w'delimiter'u/2u)'\v'5.688i-(0v/2u)+0v+0.22m'delimiter .sp -1 \h'0.350i'\v'2.013i'\D'~-0.350i 0.500i 0.000i 2.900i 0.000i 1.600i 0.400i 0.325i' .sp -1 \h'0.400i'\v'7.338i'\D'P-0.093i -0.044i 0.032i -0.039i' .sp -1 \h'0.400i'\v'7.338i'\D'p-0.093i -0.044i 0.032i -0.039i' .sp -1 .lf 1852 \h'0.200i-(\w'element'u/2u)'\v'7.438i-(0v/2u)+0v+0.22m'element .sp -1 \h'4.600i'\v'6.388i'\D'l-4.200i 0.625i' .sp -1 \h'0.400i'\v'7.013i'\D'P0.095i -0.039i 0.007i 0.049i' .sp -1 \h'0.400i'\v'7.013i'\D'p0.095i -0.039i 0.007i 0.049i' .sp -1 .lf 1853 \h'2.500i-(\w'element'u/2u)'\v'6.700i-(0v/2u)+0v+0.22m'element .sp -1 \h'5.750i'\v'6.388i'\D'l-4.550i 0.625i' .sp -1 \h'1.200i'\v'7.013i'\D'P0.096i -0.038i 0.007i 0.050i' .sp -1 \h'1.200i'\v'7.013i'\D'p0.096i -0.038i 0.007i 0.050i' .sp -1 .lf 1854 \h'3.475i-(\w' next2'u/2u)'\v'6.700i-(0v/2u)+0v+0.22m' next2 .sp -1 \h'0.800i'\v'7.013i'\D'l4.450i -0.625i' .sp -1 \h'5.250i'\v'6.388i'\D'P-0.096i 0.039i -0.007i -0.050i' .sp -1 \h'5.250i'\v'6.388i'\D'p-0.096i 0.039i -0.007i -0.050i' .sp -1 .lf 1855 \h'3.025i-(\w'prev2 'u/2u)'\v'6.700i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'1.200i'\v'8.562i'\D'p0.000i -0.400i -0.800i 0.000i 0.000i 0.400i' .sp -1 .lf 1856 
\h'0.800i-(\w'\er\en'u/2u)'\v'8.363i-(1v/2u)+0v+0.22m'\er\en .sp -1 .lf 1856 \h'0.800i-(\w'type EOH'u/2u)'\v'8.363i-(1v/2u)+1v+0.22m'type EOH .sp -1 \h'5.000i'\v'7.538i'\D'l-3.800i 0.625i' .sp -1 \h'1.200i'\v'8.162i'\D'P0.095i -0.041i 0.008i 0.049i' .sp -1 \h'1.200i'\v'8.162i'\D'p0.095i -0.041i 0.008i 0.049i' .sp -1 .lf 1857 \h'3.100i-(\w' next2'u/2u)'\v'7.850i-(0v/2u)+0v+0.22m' next2 .sp -1 \h'0.800i'\v'8.162i'\D'l3.800i -0.625i' .sp -1 \h'4.600i'\v'7.538i'\D'P-0.095i 0.041i -0.008i -0.049i' .sp -1 \h'4.600i'\v'7.538i'\D'p-0.095i 0.041i -0.008i -0.049i' .sp -1 .lf 1858 \h'2.700i-(\w'prev2 'u/2u)'\v'7.850i-(0v/2u)+0v+0.22m'prev2 .sp -1 \h'4.200i'\v'4.088i'\D'~-1.000i 0.100i -2.950i -1.025i 0.000i -0.375i 0.050i -0.375i 0.750i -0.400i' .sp -1 \h'1.050i'\v'2.013i'\D'P-0.076i 0.069i -0.024i -0.044i' .sp -1 \h'1.050i'\v'2.013i'\D'p-0.076i 0.069i -0.024i -0.044i' .sp -1 .lf 1860 \h'1.850i-(\w'head'u/2u)'\v'3.688i-(0v/2u)+0v+0.22m'head .sp -1 \h'0.400i'\v'7.013i'\D'~-0.250i -0.250i 0.000i -3.600i 0.000i -0.750i 0.300i -0.400i' .sp -1 \h'0.450i'\v'2.013i'\D'P-0.040i 0.095i -0.040i -0.030i' .sp -1 \h'0.450i'\v'2.013i'\D'p-0.040i 0.095i -0.040i -0.030i' .sp -1 .lf 1862 \h'0.500i-(\w'head'u/2u)'\v'6.713i-(0v/2u)+0v+0.22m'head .sp -1 \h'0.800i'\v'4.763i'\D'~2.400i -1.175i 0.650i -1.675i' .sp -1 \h'3.850i'\v'1.913i'\D'P-0.013i 0.102i -0.047i -0.018i' .sp -1 \h'3.850i'\v'1.913i'\D'p-0.013i 0.102i -0.047i -0.018i' .sp -1 .lf 1864 \h'1.850i-(\w'head'u/2u)'\v'4.188i-(0v/2u)+0v+0.22m'head .sp -1 .sp 8.562i+1 .if \n(00 .fi .br .nr 0x 0 .lf 1867 .PE .lf 1868 .DE
.PP
This To field consists of a list of RFC 2822
.BR addresses .
They are the named
.B group
.I Group
and the
.B mailbox
.IR e@f.gov .
The named
.B group
consists of a list of
.BR mailboxes ,
.I a@b.com
and
.IR c@d.edu .
Navigating these lists is simple; follow the
.B element
pointers in the
.B mparse_list
structure.
Simple constructs such as an RFC 2822
.B addr-spec
are linked using the
.B next
pointers.
Comments, line folding, and whitespace (CFWS) are also linked together with the
.B next
pointers in structured fields, but are isolated from other tokens by being linked via the
.B trailer
pointers; the one exception being whitespace in an RFC 2822
.BR phrase ,
which is linked via the
.B next
pointers with the other parts of the
.BR phrase .
.PP
Like the
.B mparse_message
structure, the
.B mparse_token
structure provides a user pointer which may be used to store arbitrary application data associated with a token.
.DS
.SS Fields and Body Content
.PP
An additional structure is used to hold field lines and to hold the body section.
This
.B mparse_field
structure also holds a user pointer, several flags, a pointer to the token list, and pointers to other
.B mparse_field
structures to form a doubly-linked list.
There is also a pointer to a collection of useful field characteristics (unused for body sections).
.TS
expand;
lw(1.7i)fB lw(4.1i)fB .
member (struct mparse_field)	description
_
.T&
lp-2fB lp-2 .
struct mparse_token *tokens;	first token in logical line
struct mparse_token *last_token;	T{
most recently encountered token (used internally)
T}
struct mparse_field *prev;	T{
doubly-linked list link to previous field or body section "line"
T}
struct mparse_field *next;	T{
doubly-linked list link to next field or body section "line"
T}
const struct mparse_field_state *state;	T{
pointer to field characteristics
T}
struct mparse_entity *entity;	T{
pointer to enclosing entity structure
T}
void *userptr;	T{
user pointer; not set or used by parser, may be used to associate data with field
T}
struct mparse_error *error;	T{
error(s) for this field (entity type, etc.)
T}
int linelen;	line length (max seen in body)
time_t dt;	date-time stamp value as time_t
int tokenpos;	T{
token start column (for error messages)
T}
unsigned int bit8 : 1;	T{
found a non-ASCII character (used internally)
T}
unsigned int lonecr : 1;	found a lone CR (used internally)
unsigned int lonelf : 1;	found a lone LF (used internally)
unsigned int nul : 1;	T{
found an ASCII NUL (used internally)
T}
unsigned int wsonly : 1;	T{
found a whitespace-only continuation line (used internally)
T}
unsigned int token_errors : 1;	T{
some field token has an error (used internally)
T}
unsigned int token_warnings : 1;	T{
set if any token warnings (used internally)
T}
.TE
.PP
The
.B mparse_field
structure is defined in the header file
.BR mparse.h .
.DE
.DS
.SS Errors and Warnings
.PP
.B Mparse
detects errors and potential errors (i.e. warnings) during parsing and stores relevant information which identifies the nature of the error.
Error information is stored in a structure which contains the following members:
.TS
expand;
lw(1.7i)fB lw(4.1i) .
member (struct mparse_error)	description
_
.T&
lp-2fB lp-2 .
struct mparse_error *prev;	T{
doubly-linked circular list link to prev error struct
T}
struct mparse_error *next;	T{
doubly-linked circular list link to next error struct
T}
struct mparse_token *token;	token responsible for error
struct mparse_field *field;	field responsible for error
struct mparse_entity *entity;	entity responsible for error
const struct mparse_charset *cs;	charset of error message
struct mparse_protocol_status *status;	T{
head of singly-linked list of transfer/access protocol RFCs and relevant status reply codes
T}
char *str;	T{
reference string for error/warning
T}
int rfc;	T{
for sorting error messages by rfc number
T}
const char *sect;	reference section string
int msgstr_index;	T{
partial index into language-dependent error message string array
T}
int count;	T{
optional count for error message
T}
int sufstr_index;	T{
partial index into language-dependent suffix string array
T}
int type;	MPARSE_REQUIREMENTS_MUST_NOT ... MPARSE_REQUIREMENTS_MUST
int val;	T{
internal use (e.g. alternate value for repair)
T}
int len;	length of str
unsigned int errors;	bitwise OR of MPARSE_ERR_* values
unsigned int remedies;	T{
bitwise OR of MPARSE_FIX_* and/or MPARSE_FUBAR values
T}
.TE
.DE
.PP
The
.B next
and
.B prev
pointers provide a doubly-linked circular list of
.B mparse_error
structures (more than one error message might be associated with an input error).
.PP
Errors might be associated with an input token, a field (or message body), or with the particular chunk of a message which is referenced by an
.B mparse_entity
structure (described below); there is provision in the
.B mparse_error
structure for pointers back to the relevant offending item.
.PP
A description of the error and the relevant standards documents is provided by the
.B str
member, whose length in bytes is given by
.BR len .
.PP
RFCs generally provide for a few categories of specifications: some things are expressly forbidden, usually indicated in the RFC text by a "MUST NOT" statement.
Conversely, absolute requirements are usually specified with a "MUST" statement. Between these extremes, a strong recommendation which falls short of an absolute requirement is generally indicated by a "SHOULD" statement or "SHOULD NOT" statement. Relevant terms are defined in RFC 2119. The .B type member records the relevant category for the error or warning. .PP The .B errors member is an unsigned integer consisting of 1-bit flags which indicate the detected error conditions. The flags are determined by the requirements, prohibitions, and recommendations given in the RFCs relevant to text messages. .PP When generating message components, some types of errors can be automatically corrected (errors found during parsing are not corrected, as that would change the message). Details of the type of correction to be applied and any corresponding replacement value vary with the nature of the error. These are recorded in the .B remedies and .B val members. .PP While the .B rfc structure member records the applicable message format RFC corresponding to an error, there may be separate transfer protocol RFCs which have provision for a specific status response code or extended status response. The .B status structure member is a pointer to the head of a linked list of structures which record relevant status and extended status response values for various transfer protocols. The .B mparse_protocol_status structure is described below. .DS .PP The function .PP .B "void mparse_count_errors(const struct mparse_message *, const struct mparse_token *, unsigned int *, unsigned int *, unsigned int *, unsigned int *, unsigned int *, unsigned int *, unsigned int *, unsigned int *, unsigned int, unsigned int);" .PP may be called to return total counts of errors and warnings. 
The first and second arguments point to a .B mparse_message structure and a .B mparse_token structure which is either the token of interest or any .B mparse_token structure in the .BR field , .BR entity , or .B message of interest. The error and warning counts are returned in unsigned integers pointed to by the third through tenth arguments. These consist of four pairs; hard errors and warnings for each of four categories: RFCs in effect per the message \fBmodes\fP, RFCs not in effect, internet drafts supported by .BR mparse , and miscellaneous non-RFC issues. The eleventh argument is composed of bitwise ORed .I MPARSE_COUNT_ macros defined in the header file .BR mparse.h . The twelfth argument is composed of bitwise ORed .I MPARSE_ERR_ macros defined in the header file .BR mparse.h , and defines which error types are to be counted. A total count of all errors and warnings in a message may be obtained by: .PP .ft CW .ps 8 .vs 9 .nf unsigned int errcount, warncount, other_errs, other_warnings; mparse_count_errors(message, message->top->fields->tokens, &errcount, &warncount, \0\0\0\0&other_errs, &other_warnings, &other_errs, &other_warnings, \0\0\0\0&other_errs, &other_warnings, \0\0\0\0MPARSE_COUNT_ALL | MPARSE_COUNT_TOKENS | MPARSE_COUNT_FIELDS | \0\0\0\0MPARSE_COUNT_ENTITIES | MPARSE_COUNT_INSTANCES, ~(0U)); .fi .ft .ps .vs .PP which should be called from the .I hook_end_of_message application hook (described below). Types of errors or warnings may be ignored by passing a .I NULL pointer in place of the appropriate unsigned integer pointer. As in the example above, a single unsigned integer may be used to accumulate multiple types of errors or warnings. .DE .DS .PP As noted above, transfer protocols may provide for specific status or extended status responses when an error is detected in the message during transfer. The .B mparse_protocol_status structure associates status and extended status values with a transfer protocol RFC for a specific error. 
The structure has the following members: .TS expand; lw(2.5i)fB lw(3.0i) . member (struct mparse_protocol_status) description _ .T& lp-2fB lp-2 . struct mparse_error *error; pointer to associated error structure struct mparse_protocol_status *next; T{ singly-linked list pointer to next structure T} int rfc; applicable transport protocol RFC unsigned int status; T{ status code for protocol, if applicable T} unsigned int es_class; T{ class for extended status (RFCs 2034, 3463), if applicable T} unsigned int es_subject; T{ subject for extended status (RFCs 2034, 3463), if applicable T} unsigned int es_detail; T{ detail for extended status (RFCs 2034, 3463), if applicable T} .TE .DE .DS .PP The status or extended status values appropriate for a given transfer protocol RFC and error are obtained through use of several functions: .PP .B unsigned int mparse_status(struct mparse_protocol_status *p); .PP returns the .B status member of the .B mparse_protocol_status structure pointed to by .IR p . It returns zero with .I errno set appropriately in the event of an error. .DE .DS .PP .B int mparse_extended_status_string(struct mparse_protocol_status *p, char *buf, size_t sz); .PP interpolates an extended status string into the character array pointed to by .IR buf , and whose size is .IR sz . It returns -1 on error with errno set appropriately. Otherwise, it returns the number of characters written to .I buf if that array is sufficiently large, or the minimum number of characters necessary to hold the extended status string plus a terminating '\e0' character if .I buf is a .I NULL pointer or points to an array which is too small (as specified by .IR sz ). .DE .DS .PP If a specific error is known, the appropriate .B mparse_protocol_status structure may be found by walking the list pointed to by the .I status member of the .B mparse_error structure, looking for the .B mparse_protocol_status structure with the .I rfc member corresponding to the transfer protocol in question. 
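.PP
The list walk described above might be sketched as follows. This is only an illustration: the structure is redeclared locally so that the fragment compiles on its own (a real program would include mparse.h instead), the helper name find_status_for_rfc is hypothetical and not part of mparse, and the sketch assumes that the singly-linked list ends with a NULL next pointer.

```c
#include <stddef.h>

/* Local stand-in for the structure documented above, so that this sketch
 * is self-contained; a real program would include mparse.h instead. */
struct mparse_error;                      /* opaque in this sketch */

struct mparse_protocol_status {
    struct mparse_error *error;           /* associated error structure */
    struct mparse_protocol_status *next;  /* next structure in the list */
    int rfc;                              /* transport protocol RFC */
    unsigned int status;
    unsigned int es_class;
    unsigned int es_subject;
    unsigned int es_detail;
};

/* Hypothetical helper (not an mparse function): return the first entry
 * whose rfc member equals the requested RFC number, or NULL if no entry
 * matches. Assumes the list is terminated by a NULL next pointer. */
static const struct mparse_protocol_status *
find_status_for_rfc(const struct mparse_protocol_status *head, int rfc)
{
    const struct mparse_protocol_status *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->rfc == rfc)
            return p;
    }
    return NULL;
}
```

The head pointer passed to such a helper would be the status member of the mparse_error structure of interest.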
.PP More likely a status or extended status response must be provided after a message or message component has been received and parsed. .PP .B struct mparse_protocol_status *mparse_protocol_status(struct mparse_message *m, struct mparse_entity *e, struct mparse_field *f, struct mparse_token *t, int rfc); .PP may be used to return a pointer to an appropriate .B mparse_protocol_status structure for a given message, entity, field, or token corresponding to a particular transfer protocol RFC. If a token .I t is specified, the corresponding structure for the most serious error associated with that token, its field, and its entity's field or body (as appropriate) is returned. If .I t is a .I NULL pointer, all tokens in the field, entity, or message are checked according to which of those pointers are not .IR NULL . If a field .I f is specified, the structure corresponding to the most serious error associated with that field or any of its tokens is returned. If .I t and .I f are both .I NULL and .I e points to an .B entity structure, then the .B mparse_protocol_status structure corresponding to the most serious error in that entity, its fields or body, and all of the tokens therein is returned. Likewise, if all of the pointers except .I m are .IR NULL , all entities, fields, bodies, and tokens in the message are considered. In each case, the structure corresponding to the specified transport protocol .I rfc is returned, if one exists. If there is no such structure .RI ( e.g. if there is no error), a .I NULL pointer is returned. A .I NULL pointer is also returned in case of an error .RI ( e.g. all pointers are .IR NULL ), however in that case .I errno will be set appropriately (N.B. errno is not altered if there is no error). 
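.PP
Because a NULL return can mean either "no matching status" or "the call failed", a caller has to inspect errno to tell the two apart. The fragment below is a minimal sketch of that pattern; the enumeration and the helper classify_lookup are hypothetical names introduced only for illustration, and the caller is assumed to clear errno before the lookup.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical classification of a lookup result (not part of mparse). */
enum lookup_outcome {
    LOOKUP_FOUND,     /* non-NULL result */
    LOOKUP_NOTHING,   /* NULL and errno untouched: nothing to report */
    LOOKUP_FAILED     /* NULL and errno set: the call itself failed */
};

/* Classify a result that follows the convention described above.
 * errno_after_call is the value of errno read immediately after the
 * lookup, errno having been set to zero immediately before it.
 *
 * Intended use (shown as a comment because it needs the library):
 *
 *     errno = 0;
 *     ps = mparse_protocol_status(m, NULL, NULL, NULL, rfc);
 *     switch (classify_lookup(ps, errno)) { ... }
 */
static enum lookup_outcome
classify_lookup(const void *result, int errno_after_call)
{
    if (result != NULL)
        return LOOKUP_FOUND;
    return (errno_after_call != 0) ? LOOKUP_FAILED : LOOKUP_NOTHING;
}
```

Clearing errno first matters because the text above states that errno is left unaltered when there is no error, so a stale value from an earlier call would otherwise be misread as a failure.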
.DE .DS .PP Application-specific errors may be set for tokens, fields, entity header, entity body, or overall entity by calling one of the following functions, each of which returns a pointer to the error structure which (if not NULL due to some error) has been attached to the corresponding token, field, etc.: .PP \s-2\fBstruct mparse_error *mparse_set_token_error(const struct mparse_message *, struct mparse_token *, unsigned int, unsigned int, int, const char *, int, int, int, int, int);\fP\s0 .PP sets an .B mparse_error structure attached to the specified token in the specified message. The two unsigned integer arguments specify the error types and remedies. The remaining seven arguments specify the RFC number containing the relevant specification, a language-neutral section specification string, the requirement level (MPARSE_REQUIREMENTS_MUST_NOT, ... MPARSE_REQUIREMENTS_MUST), an index into the possibly language-dependent array of strings describing the error (msg_lang.gperf and msg.h), an optional count (use MPARSE_ERROR_COUNT_NO_COUNT to cause it to be ignored), an index into a possibly language-dependent array of strings used as a suffix to the count (msg_lang.gperf and msg.h; MPARSE_SUFFIX_NO_SUFFIX to ignore it), and an integer value which may be used in some error messages or in repairing some types of errors. .PP Sometimes there is a string associated with an error message which cannot be referenced conveniently via an index. In such cases, the function .PP \s-2\fBstruct mparse_error *mparse_set_token_error_string(const struct mparse_message *, struct mparse_token *, unsigned int, unsigned int, int, const char *, const char *, int, const char *, const struct mparse_charset *, int, const char *, int);\fP\s0 .PP may be used. 
The initial arguments are the same as above. Either an RFC number can be provided, or if one of the symbolic constants in mparse.h is used, a string reference may be provided (it is ignored if the rfc number is not negative). Instead of indices for the description and suffix, character string pointers are provided by the caller. The character set should be specified. Integer arguments for requirement level, count, and optional value are used as above. .DE .DS .PP Corresponding functions for field errors are: .PP \s-2\fBstruct mparse_error *mparse_set_field_error_string(const struct mparse_message *, struct mparse_field *, unsigned int, unsigned int, int, const char *, const char *, int, const char *, const struct mparse_charset *, int, const char *, int);\fP\s0 .PP and .PP \s-2\fBstruct mparse_error *mparse_set_field_error(const struct mparse_message *, struct mparse_field *, unsigned int, unsigned int, int, const char *, int, int, int, int, int);\fP\s0 .PP Functions for entity-related errors are also provided; they do not need a message structure pointer: .PP \s-2\fBstruct mparse_error *mparse_set_entity_error_string(struct mparse_entity *, unsigned int, unsigned int, int, const char *, const char *, int, const char *, const struct mparse_charset *, int, const char *, int);\fP\s0 .PP \s-2\fBstruct mparse_error *mparse_set_entity_error(struct mparse_entity *, unsigned int, unsigned int, int, const char *, int, int, int, int, int);\fP\s0 .PP \s-2\fBstruct mparse_error *mparse_set_entity_body_error_string(struct mparse_entity *, unsigned int, unsigned int, int, const char *, const char *, int, const char *, const struct mparse_charset *, int, const char *, int);\fP\s0 .PP \s-2\fBstruct mparse_error *mparse_set_entity_body_error(struct mparse_entity *, unsigned int, unsigned int, int, const char *, int, int, int, int, int);\fP\s0 .PP \s-2\fBstruct mparse_error *mparse_set_entity_header_error_string(struct mparse_entity *, unsigned int, unsigned int, int, const char *, 
const char *, int, const char *, const struct mparse_charset *, int, const char *, int);\fP\s0 .PP \s-2\fBstruct mparse_error *mparse_set_entity_header_error(struct mparse_entity *, unsigned int, unsigned int, int, const char *, int, int, int, int, int);\fP\s0 .DE .DS .PP Accumulated error and warning messages may be handled by calling the following functions: .PP \s-2\fBsize_t mparse_token_error_messages(const struct mparse_token *, size_t (*)(const char *, size_t, va_list), ...);\fP\s0 .PP \s-2\fBsize_t mparse_field_error_messages(const struct mparse_message *, size_t (*)(const char *, size_t, va_list), ...);\fP\s0 .PP \s-2\fBsize_t mparse_entity_body_error_messages(const struct mparse_entity *, size_t (*)(const char *, size_t, va_list), ...);\fP\s0 .PP \s-2\fBsize_t mparse_entity_header_error_messages(const struct mparse_entity *, size_t (*)(const char *, size_t, va_list), ...);\fP\s0 .PP \s-2\fBsize_t mparse_missing_close_delimiter_error_messages(const struct mparse_entity *, size_t (*)(const char *, size_t, va_list), ...);\fP\s0 .PP \s-2\fBsize_t mparse_missing_separator_error_messages(const struct mparse_entity *, size_t (*)(const char *, size_t, va_list), ...);\fP\s0 .PP Each function handles error messages by calling a function with a string of known length for each error message, and with any supplied arguments. The called function takes a pointer to a string, the length of the string, and a variable number of arguments (passed from the supplied arguments as a va_list). .PP .B Mparse includes the function .PP \s-2\fBsize_t mparse_fwrite_wrapper(const char *, size_t, va_list);\fP\s0 .PP which takes a .I stdio .B "FILE *" as an additional argument via the va_list. A typical call to output token error messages to .I stderr might be: .PP .ft CW .ps 8 .vs 9 .nf #include size_t i; struct mparse_token *token; /* ... 
*/ i = mparse_token_error_messages(token, mparse_fwrite_wrapper, stderr); .fi .ft .ps .vs .PP Applications could supply alternative functions to support handling error messages via .I syslog or other methods. .DE .DS .PP Status and extended status may be set for errors by calling the following functions, passing pointers to the message structure and corresponding error structure: .PP \s-2\fBint mparse_success_status(const struct mparse_message *, struct mparse_error *);\fP\s0 .PP sets a success status code .PP \s-2\fBint mparse_address_syntax_status(const struct mparse_message *, struct mparse_error *);\fP\s0 .PP sets status for known protocols appropriately for an address syntax error, .PP \s-2\fBint mparse_general_transient_status(const struct mparse_message *, struct mparse_error *);\fP\s0 .PP sets a transient status code for conditions that can be expected to be temporary, such as timeouts, temporary unavailability of resources, etc., and .PP \s-2\fBint mparse_general_syntax_status(const struct mparse_message *, struct mparse_error *);\fP\s0 .PP sets an appropriate status for general (not address-specific) syntax errors. .DE .DS .SS Message Parts .PP .B Mparse groups message content into chunks which may contain a boundary delimiter line, a set of header fields (possibly empty), a separator line, and body text. A simple message has no delimiter, does have header fields, may have body text, and has a separator line if there is body text (and might or might not have such a line if there is no body). Such a structure is sufficient to hold an entire simple message. 
It can be represented thus: .lf 2522 .PS 1.833i 4.000i .\" 0 -1.75 4 0.0833335 .\" 0.000i 1.833i 4.000i 0.000i .nr 00 \n(.u .nf .nr 0x 1 \h'4.000i' .sp -1 \h'3.000i'\v'0.167i'\D'p0.000i -0.167i -3.000i 0.000i 0.000i 0.167i' .sp -1 \h'3.000i'\v'0.083i'\D'l1.000i 0.000i' .sp -1 \D'f 1000u'\h'-1000u' .sp -1 \h'3.000i'\v'0.083i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.000i'\v'0.083i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 2526 \h'4.000i'\v'0.083i-(0v/2u)+0v+0.22m'delimiter (none in this example) .sp -1 \h'3.000i'\v'0.833i'\D'p0.000i -0.667i -3.000i 0.000i 0.000i 0.667i' .sp -1 .lf 2529 \h'0.000i'\v'0.250i-(0v/2u)+0v+0.22m'From: a@b.org .sp -1 .lf 2531 \h'0.000i'\v'0.417i-(0v/2u)+0v+0.22m'To: c@d.edu .sp -1 .lf 2533 \h'0.000i'\v'0.583i-(0v/2u)+0v+0.22m'Date: 8 Oct 1999 12:34:56 -0700 .sp -1 .lf 2535 \h'0.000i'\v'0.750i-(0v/2u)+0v+0.22m'Subject: Hi .sp -1 \h'3.000i'\v'0.500i'\D'l1.000i 0.000i' .sp -1 \h'3.000i'\v'0.500i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.000i'\v'0.500i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 2536 \h'4.000i'\v'0.500i-(0v/2u)+0v+0.22m'header fields .sp -1 \h'3.000i'\v'1.000i'\D'p0.000i -0.167i -3.000i 0.000i 0.000i 0.167i' .sp -1 \h'3.000i'\v'0.917i'\D'l1.000i 0.000i' .sp -1 \h'3.000i'\v'0.917i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.000i'\v'0.917i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 2538 \h'4.000i'\v'0.917i-(0v/2u)+0v+0.22m'empty separator line .sp -1 \h'3.000i'\v'1.833i'\D'p0.000i -0.833i -3.000i 0.000i 0.000i 0.833i' .sp -1 .lf 2541 \h'0.000i'\v'1.083i-(0v/2u)+0v+0.22m'This is the body text. .sp -1 \h'3.000i'\v'1.417i'\D'l1.000i 0.000i' .sp -1 \h'3.000i'\v'1.417i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.000i'\v'1.417i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .lf 2542 \h'4.000i'\v'1.417i-(0v/2u)+0v+0.22m'body .sp -1 .sp 1.833i+1 .if \n(00 .fi .br .nr 0x 0 .lf 2543 .PE .lf 2544 .DE .PP MIME messages are represented as linked collections of these structures. 
A MIME multipart message consists of a first section which has no delimiter, has regular and MIME header fields, may have body text (called a .IR preamble ), and has an empty separator line if there is a preamble, but might not have one if there is no preamble. Following that initial section comes one or more encapsulated parts, each of which has a delimiter line, may have MIME header fields, has a mandatory separator line, and has some body text. A MIME multipart message ends with a final section which has a specially formatted close delimiter line, no header fields, may have body text (called an .IR epilogue ), and has an empty separator line if there is an epilogue, but might not have one if there is no epilogue. .PP The structures are linked together in two dimensions using left (previous), right (next), up (parent), and down (child) pointers. .DS .PP Many of the .B mparse functions take a pointer to a structure, called an .B mparse_entity structure, which is the structure described above. .\" This structure should not be confused with the .\" MIME term .\" .IR entity . The data and links which are held in the .B mparse_entity structure are: .TS expand; lw(1.8i)fB lw(4.0i)fB . member (struct mparse_entity) description _ .T& lp-2fB lp-2 . void *userptr; T{ .na user pointer; not set or used by parser, may be used by applications to pass data to hooks T} struct mparse_message *message; message enclosing entity struct mparse_token *delimiter; T{ delimiter token (beginning of text associated with entity section) T} struct mparse_field *fields; T{ message or MIME entity or MDN or DSN fields T} struct mparse_field *last_field; last field seen so far struct mparse_token *separator; T{ .na CRLF token separating header from body, unless missing (e.g. 
in a malformed message) T} struct mparse_field *body; T{ body section (or preamble, or epilogue, or phantom body) T} struct mparse_field *last_body; last body "line" seen so far struct mparse_entity *parent; T{ .na points to enclosing entity in a MIME multipart entity; null pointer in top-level message T} struct mparse_entity *child; T{ .na points to first (in order of appearance unless modified by an RFC 2387 \fIstart\fP parameter) enclosed MIME multipart entity T} struct mparse_entity *next_sibling; T{ .na points to next MIME multipart entity enclosed by same parent. T} struct mparse_entity *prev_sibling; T{ .na points to previous MIME multipart entity enclosed by same parent. T} struct mparse_cache cache; MIME information cache unsigned int mimepart; T{ .na mime section number; 0 for preamble, then incremented at each delimiter, 0 for epilogue T} struct mparse_error *field_errors; missing fields, etc. struct mparse_error *body_errors; T{ excessively long lines, illegal content, etc. T} unsigned int h_processed:1; header_end called (used internally) unsigned int b_processed:1; body_end called (used internally) .TE .PP .TS expand; lw(2.0i)fB lw(3.2i)fB . member (struct mparse_cache) description _ .T& lp-2fB lp-2 . 
struct mparse_token *content_type; MIME content type const struct mparse_type *media_type; MIME media type information struct mparse_token *content_subtype; MIME content subtype const struct mparse_subtype *subtype; MIME media subtype information struct mparse_token *content_parameters; MIME content-type parameters struct mparse_token *content_id; MIME content-id struct mparse_token *description; MIME Content-Description struct mparse_token *content_disposition; T{ MIME Content-Disposition disposition type T} struct mparse_disposition *disposition; MIME disposition information struct mparse_token *disposition_parameters; T{ MIME Content-Disposition parameters T} struct mparse_token *duration; MIME Content-Duration struct mparse_token *encoding; T{ MIME content-transfer-encoding T} const struct mparse_encoding *enc; MIME encoding/domain information struct mparse_token *languages; MIME Content-Language struct mparse_token *location; MIME Content-Location URI struct mparse_token *md5; MIME Content-MD5 .TE .DE .PP The details of the structures pointed to by these members are discussed below. .PP The .B userptr may be used to point to arbitrary data needed by the application. .PP A pointer to the enclosing .B mparse_message structure is held in the .B message member. .PP The .B last_field member points to the most-recently added field, and may be used within user functions (discussed below) which are called when a field is recognized in the input stream. .PP A similar member is the .B last_body pointer, however that is only used internally when generating body content in a piecemeal manner; normal parsing of a complete message produces a single .B body section stored in a .B mparse_field structure. 
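.PP
The parent, child, and sibling members listed in the mparse_entity table above lend themselves to a depth-first walk. The fragment below is a sketch under stated assumptions: the structure is a local stand-in that holds only the four link members, the helper count_entities is hypothetical and not part of mparse, and sibling chains are assumed to end with a NULL next_sibling pointer (this page does not say whether they are NULL-terminated).

```c
#include <stddef.h>

/* Local stand-in holding only the link members from the table above;
 * the real structure in mparse.h has many more members. */
struct mparse_entity {
    struct mparse_entity *parent;
    struct mparse_entity *child;
    struct mparse_entity *next_sibling;
    struct mparse_entity *prev_sibling;
};

/* Hypothetical helper (not an mparse function): count the entity e
 * together with every entity reachable below it through child and
 * next_sibling links. Assumes NULL-terminated sibling chains. */
static size_t count_entities(const struct mparse_entity *e)
{
    const struct mparse_entity *c;
    size_t n;

    if (e == NULL)
        return 0;
    n = 1;                                   /* count e itself */
    for (c = e->child; c != NULL; c = c->next_sibling)
        n += count_entities(c);              /* then each subtree */
    return n;
}
```

The same loop shape could visit each entity instead of counting it, for example to inspect the cached MIME information of every part.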
.PP Some MIME information contained in the message or MIME-part header fields (or default information in the absence of a relevant MIME field) is cached in the .B mparse_cache structure. Examples include the .BR content_type , .BR content_subtype , .BR media_type , and .B subtype members. .B content_type and .B content_subtype point to the field tokens (structure details below) if a MIME Content-Type field is present. The .B media_type and .B subtype members point to structures described above, which are determined from the MIME Content-Type field if present, and are set to appropriate default values otherwise. This is an important distinction between the .I content_ and .I media_ member types; the .I content_ pointers may be .I NULL (in the absence of a MIME Content-Type field) or may point to tokens with non-standard type or subtype, whereas the .I media_ pointers (once initialized by mparse) will point to a structure for the corresponding standard type and subtype (defaults in the absence of a Content-Type field and applicable values used when non-standard values are provided in a Content-Type field are standardized by RFCs 2045 and 2046). .PP The difference between the struct mparse_token pointer .B encoding and the other struct pointer, .BR enc , for the MIME media encoding information is that the struct mparse_token pointer points to the token in the input message header field, while the other pointer points to information which may be a default (in the absence of an explicit field specification, in which case the struct mparse_token pointer is a \fINULL\fP pointer), or it may be set to a different type (e.g. if the specified type is unrecognized) according to the rules in RFCs 2045 and 2046. .PP There are two pointers to lists of .B mparse_error structures provided in the .B mparse_entity structure: .B field_errors and .BR body_errors . 
These point to errors for the message chunk field section and body, respectively, where the errors in question are not specifically tied to a particular input token or field. .PP A relatively simple MIME multipart message might look like this: .DS .ps 6 .lf 2820 .PS 3.000i 6.000i .\" 0 -2.35 6 0.65 .\" 0.000i 3.000i 6.000i 0.000i .nr 00 \n(.u .nf .nr 0x 1 \h'6.000i' .sp -1 \h'1.500i'\v'0.100i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'1.500i'\v'0.700i'\D'p0.000i -0.600i -1.500i 0.000i 0.000i 0.600i' .sp -1 \h'1.500i'\v'0.800i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'1.500i'\v'1.300i'\D'p0.000i -0.500i -1.500i 0.000i 0.000i 0.500i' .sp -1 .lf 2851 \h'0.000i'\v'0.150i-(0v/2u)+0v+0.22m'From: a@b.org .sp -1 .lf 2852 \h'0.000i'\v'0.250i-(0v/2u)+0v+0.22m'To: c@d.edu .sp -1 .lf 2853 \h'0.000i'\v'0.350i-(0v/2u)+0v+0.22m'Date: 8 Oct 1999 12:34:56 -0700 .sp -1 .lf 2854 \h'0.000i'\v'0.450i-(0v/2u)+0v+0.22m'Subject: Hi .sp -1 .lf 2855 \h'0.000i'\v'0.550i-(0v/2u)+0v+0.22m'MIME-Version: 1.0 .sp -1 .lf 2856 \h'0.000i'\v'0.650i-(0v/2u)+0v+0.22m'Content-Type: multipart/mixed;boundary=x .sp -1 .lf 2857 \h'0.000i'\v'0.850i-(0v/2u)+0v+0.22m'This is the preamble. .sp -1 \h'1.500i'\v'1.800i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'1.500i'\v'2.400i'\D'p0.000i -0.600i -1.500i 0.000i 0.000i 0.600i' .sp -1 \h'1.500i'\v'2.500i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'1.500i'\v'3.000i'\D'p0.000i -0.500i -1.500i 0.000i 0.000i 0.500i' .sp -1 .lf 2859 \h'0.000i'\v'1.750i-(0v/2u)+0v+0.22m'--x .sp -1 .lf 2860 \h'0.000i'\v'2.550i-(0v/2u)+0v+0.22m'This is default plain text. 
.sp -1 \h'3.750i'\v'1.800i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'3.750i'\v'2.400i'\D'p0.000i -0.600i -1.500i 0.000i 0.000i 0.600i' .sp -1 \h'3.750i'\v'2.500i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'3.750i'\v'3.000i'\D'p0.000i -0.500i -1.500i 0.000i 0.000i 0.500i' .sp -1 .lf 2862 \h'2.250i'\v'1.750i-(0v/2u)+0v+0.22m'--x .sp -1 .lf 2863 \h'2.250i'\v'1.850i-(0v/2u)+0v+0.22m'Content-Type: text/X-awful .sp -1 .lf 2864 \h'2.250i'\v'1.950i-(0v/2u)+0v+0.22m'Content-ID: .sp -1 .lf 2865 \h'2.250i'\v'2.050i-(0v/2u)+0v+0.22m'Content-Description: cruft .sp -1 .lf 2866 \h'2.250i'\v'2.150i-(0v/2u)+0v+0.22m'Content-Language: x-nonsense .sp -1 .lf 2867 \h'2.250i'\v'2.250i-(0v/2u)+0v+0.22m'Content-Transfer-Encoding: .sp -1 .lf 2868 \h'2.250i'\v'2.350i-(0v/2u)+0v+0.22m'\0\0\0quoted\-printable .sp -1 .lf 2869 \h'2.250i'\v'2.550i-(0v/2u)+0v+0.22m' .sp -1 .lf 2870 \h'2.250i'\v'2.650i-(0v/2u)+0v+0.22m' = .sp -1 .lf 2871 \h'2.250i'\v'2.750i-(0v/2u)+0v+0.22m'blurfl grimble pritz farkle = .sp -1 .lf 2872 \h'2.250i'\v'2.850i-(0v/2u)+0v+0.22m' .sp -1 .lf 2873 \h'2.250i'\v'2.950i-(0v/2u)+0v+0.22m' .sp -1 \h'6.000i'\v'1.800i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'6.000i'\v'2.400i'\D'p0.000i -0.600i -1.500i 0.000i 0.000i 0.600i' .sp -1 \h'6.000i'\v'2.500i'\D'p0.000i -0.100i -1.500i 0.000i 0.000i 0.100i' .sp -1 \h'6.000i'\v'3.000i'\D'p0.000i -0.500i -1.500i 0.000i 0.000i 0.500i' .sp -1 .lf 2875 \h'4.500i'\v'1.750i-(0v/2u)+0v+0.22m'--x-- .sp -1 .lf 2876 \h'4.500i'\v'2.550i-(0v/2u)+0v+0.22m'This is the epilogue. 
.sp -1 \h'0.500i'\v'1.300i'\D'l0.000i 0.400i' .sp -1 \D'f 1000u'\h'-1000u' .sp -1 \h'0.500i'\v'1.700i'\D'P-0.025i -0.100i 0.050i 0.000i' .sp -1 \h'0.500i'\v'1.700i'\D'p-0.025i -0.100i 0.050i 0.000i' .sp -1 \h'1.000i'\v'1.700i'\D'l0.000i -0.400i' .sp -1 \h'1.000i'\v'1.300i'\D'P0.025i 0.100i -0.050i 0.000i' .sp -1 \h'1.000i'\v'1.300i'\D'p0.025i 0.100i -0.050i 0.000i' .sp -1 \h'3.250i'\v'1.700i'\D'l-2.250i -0.400i' .sp -1 \h'1.000i'\v'1.300i'\D'P0.103i -0.007i -0.009i 0.049i' .sp -1 \h'1.000i'\v'1.300i'\D'p0.103i -0.007i -0.009i 0.049i' .sp -1 \h'5.500i'\v'1.700i'\D'l-4.500i -0.400i' .sp -1 \h'1.000i'\v'1.300i'\D'P0.102i -0.016i -0.004i 0.050i' .sp -1 \h'1.000i'\v'1.300i'\D'p0.102i -0.016i -0.004i 0.050i' .sp -1 \h'1.500i'\v'2.133i'\D'l0.750i 0.000i' .sp -1 \h'2.250i'\v'2.133i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'2.250i'\v'2.133i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'3.750i'\v'2.133i'\D'l0.750i 0.000i' .sp -1 \h'4.500i'\v'2.133i'\D'P-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.500i'\v'2.133i'\D'p-0.100i 0.025i 0.000i -0.050i' .sp -1 \h'4.500i'\v'2.567i'\D'l-0.750i 0.000i' .sp -1 \h'3.750i'\v'2.567i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'3.750i'\v'2.567i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 \h'2.250i'\v'2.567i'\D'l-0.750i 0.000i' .sp -1 \h'1.500i'\v'2.567i'\D'P0.100i -0.025i 0.000i 0.050i' .sp -1 \h'1.500i'\v'2.567i'\D'p0.100i -0.025i 0.000i 0.050i' .sp -1 .sp 3.000i+1 .if \n(00 .fi .br .nr 0x 0 .lf 2885 .PE .lf 2886 .ps .DE .PP .B Mparse will construct such a linked structure as it parses a message, or functions can be called to build such a structure for generating a message. In the latter case, mparse takes care of some of the gory details of building a MIME message so that the programmer need not worry too much about those details, but can instead concentrate on the content. .DS .PP In addition to multipart composite MIME entities, MIME message composite media types are provided for encapsulating messages. 
As already mentioned, several of these have unusual characteristics. Perhaps the simplest MIME message media type is the message/rfc822 type defined in RFC 2046. The encapsulation consists of MIME\-part header fields followed by an empty separator line. There is no delimiter and no body \fIper se\fP. Following the separator is the encapsulated message (header fields, separator, body). Note that the encapsulated message need not be a simple message; it may be a MIME message of arbitrary complexity. Because the structure discussed above does not provide for multiple sets of fields, the encapsulation links to a separate structure (or linked group of structures) for the encapsulated message. Here's a simple example: .ps 9 .lf 2925 .PS 3.200i 2.250i .\" 0 -2.45 2.25 0.75 .\" 0.000i 3.200i 2.250i 0.000i .nr 00 \n(.u .nf .nr 0x 1 \h'2.250i' .sp -1 \h'2.250i'\v'0.150i'\D'p0.000i -0.150i -2.250i 0.000i 0.000i 0.150i' .sp -1 \h'2.250i'\v'1.050i'\D'p0.000i -0.900i -2.250i 0.000i 0.000i 0.900i' .sp -1 \h'2.250i'\v'1.200i'\D'p0.000i -0.150i -2.250i 0.000i 0.000i 0.150i' .sp -1 \h'2.250i'\v'1.500i'\D'p0.000i -0.300i -2.250i 0.000i 0.000i 0.300i' .sp -1 .lf 2943 \h'0.000i'\v'0.225i-(0v/2u)+0v+0.22m'From: a@b.org .sp -1 .lf 2944 \h'0.000i'\v'0.375i-(0v/2u)+0v+0.22m'To: c@d.edu .sp -1 .lf 2945 \h'0.000i'\v'0.525i-(0v/2u)+0v+0.22m'Date: 8 Oct 1999 12:34:56 -0700 .sp -1 .lf 2946 \h'0.000i'\v'0.675i-(0v/2u)+0v+0.22m'Subject: Forwarded message .sp -1 .lf 2947 \h'0.000i'\v'0.825i-(0v/2u)+0v+0.22m'MIME-Version: 1.0 .sp -1 .lf 2948 \h'0.000i'\v'0.975i-(0v/2u)+0v+0.22m'Content-Type: message/rfc822 .sp -1 \h'2.250i'\v'1.850i'\D'p0.000i -0.150i -2.250i 0.000i 0.000i 0.150i' .sp -1 \h'2.250i'\v'2.750i'\D'p0.000i -0.900i -2.250i 0.000i 0.000i 0.900i' .sp -1 \h'2.250i'\v'2.900i'\D'p0.000i -0.150i -2.250i 0.000i 0.000i 0.150i' .sp -1 \h'2.250i'\v'3.200i'\D'p0.000i -0.300i -2.250i 0.000i 0.000i 0.300i' .sp -1 .lf 2950 \h'0.000i'\v'1.925i-(0v/2u)+0v+0.22m'From: e@f.net .sp -1 .lf 2951 
\h'0.000i'\v'2.075i-(0v/2u)+0v+0.22m'To: a@b.org .sp -1 .lf 2952 \h'0.000i'\v'2.225i-(0v/2u)+0v+0.22m'Date: 8 Sep 1999 12:34:56 -0700 .sp -1 .lf 2953 \h'0.000i'\v'2.375i-(0v/2u)+0v+0.22m'Subject: Hi .sp -1 .lf 2954 \h'0.000i'\v'2.525i-(0v/2u)+0v+0.22m'Message-ID: .sp -1 .lf 2955 \h'0.000i'\v'2.975i-(0v/2u)+0v+0.22m'This is a brief note to say hi. .sp -1 \h'0.500i'\v'1.500i'\D'~0.000i 0.200i' .sp -1 \D'f 1000u'\h'-1000u' .sp -1 \h'0.500i'\v'1.700i'\D'P-0.025i -0.100i 0.050i 0.000i' .sp -1 \h'0.500i'\v'1.700i'\D'p-0.025i -0.100i 0.050i 0.000i' .sp -1 \h'1.750i'\v'1.700i'\D'~0.000i -0.200i' .sp -1 \h'1.750i'\v'1.500i'\D'P0.025i 0.100i -0.050i 0.000i' .sp -1 \h'1.750i'\v'1.500i'\D'p0.025i 0.100i -0.050i 0.000i' .sp -1 .sp 3.200i+1 .if \n(00 .fi .br .nr 0x 0 .lf 2958 .PE .lf 2959 .ps .DE .PP RFC 2046 also provides a message/partial media type. The first part of a series of message parts looks similar to the message/rfc822. Subsequent parts, however, do not have header fields associated with the encapsulated partial message (all of the header fields are in the first part) so the body of the encapsulated part is stored with the encapsulation (which, as noted above, does not have its own body). .PP Message/external-body is similar to message/rfc822 except that the encapsulated body (called a phantom body) may contain instructions for automated retrieval of the external body. The structure is the same as used for message/rfc822. .PP As mentioned earlier, some message media types are peculiar because the encapsulated content does not begin with structured header fields as defined in RFCs 822 and 2822. Instead, the content is in a specific format which will need to be processed separately by applications due to the incompatibilities with RFCs 822 and 2822. Since the encapsulation does not have RFC 822/2822 header fields, it is stored with the encapsulation, just as with continuation portions of message/partial. 
.PP RFC 2298 (superseded by RFC 3798) defined a message/disposition-notification media type. The encapsulated entity (the disposition notification) consists entirely of structured fields. As with message/rfc822, the encapsulated entity is stored in a structure linked to the encapsulation. The structured fields are stored as fields. .DS .PP The most complex message media type is the message/delivery\-status type originally defined in RFC 1894 (currently defined in RFC 3464). The encapsulated entity consists of a series of groups of structured fields separated by empty lines. As noted above, the structure used does not provide for multiple sets of independent fields per structure, so these groups (per-message fields, and one or more groups of per-recipient fields) are each stored in a separate structure. The links between structures resemble those of a multipart entity; the per-message fields are held in the encapsulation's child structure, and the per-recipient fields are held in structures linked to the right of the per-message fields' structure. 
Here's an example (from RFC 3464, Multi-Recipient DSN example): .ps 5 .lf 3043 .PS 1.840i 6.010i .\" 0 -1.455 6.01 0.385 .\" 0.000i 1.840i 6.010i 0.000i .nr 00 \n(.u .nf .nr 0x 1 \h'6.010i' .sp -1 \h'1.100i'\v'0.070i'\D'p0.000i -0.070i -1.100i 0.000i 0.000i 0.070i' .sp -1 \h'1.100i'\v'0.560i'\D'p0.000i -0.490i -1.100i 0.000i 0.000i 0.490i' .sp -1 \h'1.100i'\v'0.630i'\D'p0.000i -0.070i -1.100i 0.000i 0.000i 0.070i' .sp -1 \h'1.100i'\v'0.770i'\D'p0.000i -0.140i -1.100i 0.000i 0.000i 0.140i' .sp -1 .lf 3071 \h'0.000i'\v'0.105i-(0v/2u)+0v+0.22m'content-type: message/delivery-status .sp -1 \h'0.905i'\v'1.140i'\D'p0.000i -0.070i -0.905i 0.000i 0.000i 0.070i' .sp -1 \h'0.905i'\v'1.630i'\D'p0.000i -0.490i -0.905i 0.000i 0.000i 0.490i' .sp -1 \h'0.905i'\v'1.700i'\D'p0.000i -0.070i -0.905i 0.000i 0.000i 0.070i' .sp -1 \h'0.905i'\v'1.840i'\D'p0.000i -0.140i -0.905i 0.000i 0.000i 0.140i' .sp -1 .lf 3074 \h'0.000i'\v'1.175i-(0v/2u)+0v+0.22m'Reporting-MTA: dns; cs.utk.edu .sp -1 \h'2.775i'\v'1.140i'\D'p0.000i -0.070i -1.800i 0.000i 0.000i 0.070i' .sp -1 \h'2.775i'\v'1.630i'\D'p0.000i -0.490i -1.800i 0.000i 0.000i 0.490i' .sp -1 \h'2.775i'\v'1.700i'\D'p0.000i -0.070i -1.800i 0.000i 0.000i 0.070i' .sp -1 \h'2.775i'\v'1.840i'\D'p0.000i -0.140i -1.800i 0.000i 0.000i 0.140i' .sp -1 .lf 3077 \h'0.975i'\v'1.175i-(0v/2u)+0v+0.22m'Original-Recipient: rfc822;arathib@vnet.ibm.com .sp -1 .lf 3078 \h'0.975i'\v'1.245i-(0v/2u)+0v+0.22m'Final-Recipient: rfc822;arathib@vnet.ibm.com> .sp -1 .lf 3079 \h'0.975i'\v'1.315i-(0v/2u)+0v+0.22m'Action: failed .sp -1 .lf 3080 \h'0.975i'\v'1.385i-(0v/2u)+0v+0.22m'Status: 5.0.0 (permanent failure) .sp -1 .lf 3081 \h'0.975i'\v'1.455i-(0v/2u)+0v+0.22m'Diagnostic-Code: smtp; .sp -1 .lf 3082 \h'0.975i'\v'1.525i-(0v/2u)+0v+0.22m'\0\&550 'arathib@vnet.IBM.COM' is not a registered gateway user .sp -1 .lf 3083 \h'0.975i'\v'1.595i-(0v/2u)+0v+0.22m'Remote-MTA: dns; vnet.ibm.com .sp -1 \h'4.475i'\v'1.140i'\D'p0.000i -0.070i -1.630i 0.000i 0.000i 0.070i' .sp -1 
\h'4.475i'\v'1.630i'\D'p0.000i -0.490i -1.630i 0.000i 0.000i 0.490i' .sp -1 \h'4.475i'\v'1.700i'\D'p0.000i -0.070i -1.630i 0.000i 0.000i 0.070i' .sp -1 \h'4.475i'\v'1.840i'\D'p0.000i -0.140i -1.630i 0.000i 0.000i 0.140i' .sp -1 .lf 3086 \h'2.845i'\v'1.175i-(0v/2u)+0v+0.22m'Original-Recipient: rfc822;johnh@hpnjld.njd.hp.com .sp -1 .lf 3087 \h'2.845i'\v'1.245i-(0v/2u)+0v+0.22m'Final-Recipient: rfc822;johnh@hpnjld.njd.hp.com .sp -1 .lf 3088 \h'2.845i'\v'1.315i-(0v/2u)+0v+0.22m'Action: delayed .sp -1 .lf 3089 \h'2.845i'\v'1.385i-(0v/2u)+0v+0.22m'Status: 4.0.0 (hpnjld.njd.jp.com: host name lookup failure) .sp -1 \h'6.010i'\v'1.140i'\D'p0.000i -0.070i -1.465i 0.000i 0.000i 0.070i' .sp -1 \h'6.010i'\v'1.630i'\D'p0.000i -0.490i -1.465i 0.000i 0.000i 0.490i' .sp -1 \h'6.010i'\v'1.700i'\D'p0.000i -0.070i -1.465i 0.000i 0.000i 0.070i' .sp -1 \h'6.010i'\v'1.840i'\D'p0.000i -0.140i -1.465i 0.000i 0.000i 0.140i' .sp -1 .lf 3092 \h'4.545i'\v'1.175i-(0v/2u)+0v+0.22m'Original-Recipient: rfc822;wsnell@sdcc13.ucsd.edu .sp -1 .lf 3093 \h'4.545i'\v'1.245i-(0v/2u)+0v+0.22m'Final-Recipient: rfc822;wsnell@sdcc13.ucsd.edu .sp -1 .lf 3094 \h'4.545i'\v'1.315i-(0v/2u)+0v+0.22m'Action: failed .sp -1 .lf 3095 \h'4.545i'\v'1.385i-(0v/2u)+0v+0.22m'Status: 5.0.0 .sp -1 .lf 3096 \h'4.545i'\v'1.455i-(0v/2u)+0v+0.22m'Diagnostic-Code: smtp; 550 user unknown .sp -1 .lf 3097 \h'4.545i'\v'1.525i-(0v/2u)+0v+0.22m'Remote-MTA: dns; sdcc13.ucsd.edu .sp -1 \h'0.367i'\v'0.770i'\D'l-0.065i 0.300i' .sp -1 \D'f 1000u'\h'-1000u' .sp -1 \h'0.302i'\v'1.070i'\D'P-0.008i -0.032i 0.029i 0.006i' .sp -1 \h'0.302i'\v'1.070i'\D'p-0.008i -0.032i 0.029i 0.006i' .sp -1 \h'0.603i'\v'1.070i'\D'l0.130i -0.300i' .sp -1 \h'0.733i'\v'0.770i'\D'P0.002i 0.033i -0.028i -0.012i' .sp -1 \h'0.733i'\v'0.770i'\D'p0.002i 0.033i -0.028i -0.012i' .sp -1 \h'2.175i'\v'1.070i'\D'l-1.442i -0.300i' .sp -1 \h'0.733i'\v'0.770i'\D'P0.032i -0.009i -0.006i 0.029i' .sp -1 \h'0.733i'\v'0.770i'\D'p0.032i -0.009i -0.006i 0.029i' .sp -1 
\h'3.932i'\v'1.070i'\D'l-3.198i -0.300i' .sp -1 \h'0.733i'\v'0.770i'\D'P0.031i -0.012i -0.003i 0.030i' .sp -1 \h'0.733i'\v'0.770i'\D'p0.031i -0.012i -0.003i 0.030i' .sp -1 \h'5.522i'\v'1.070i'\D'l-4.788i -0.300i' .sp -1 \h'0.733i'\v'0.770i'\D'P0.031i -0.013i -0.002i 0.030i' .sp -1 \h'0.733i'\v'0.770i'\D'p0.031i -0.013i -0.002i 0.030i' .sp -1 \h'0.905i'\v'1.327i'\D'l0.070i 0.000i' .sp -1 \h'0.975i'\v'1.327i'\D'P-0.030i 0.015i 0.000i -0.030i' .sp -1 \h'0.975i'\v'1.327i'\D'p-0.030i 0.015i 0.000i -0.030i' .sp -1 \h'2.775i'\v'1.327i'\D'l0.070i 0.000i' .sp -1 \h'2.845i'\v'1.327i'\D'P-0.030i 0.015i 0.000i -0.030i' .sp -1 \h'2.845i'\v'1.327i'\D'p-0.030i 0.015i 0.000i -0.030i' .sp -1 \h'4.475i'\v'1.327i'\D'l0.070i 0.000i' .sp -1 \h'4.545i'\v'1.327i'\D'P-0.030i 0.015i 0.000i -0.030i' .sp -1 \h'4.545i'\v'1.327i'\D'p-0.030i 0.015i 0.000i -0.030i' .sp -1 \h'4.545i'\v'1.583i'\D'l-0.070i 0.000i' .sp -1 \h'4.475i'\v'1.583i'\D'P0.030i -0.015i 0.000i 0.030i' .sp -1 \h'4.475i'\v'1.583i'\D'p0.030i -0.015i 0.000i 0.030i' .sp -1 \h'2.845i'\v'1.583i'\D'l-0.070i 0.000i' .sp -1 \h'2.775i'\v'1.583i'\D'P0.030i -0.015i 0.000i 0.030i' .sp -1 \h'2.775i'\v'1.583i'\D'p0.030i -0.015i 0.000i 0.030i' .sp -1 \h'0.975i'\v'1.583i'\D'l-0.070i 0.000i' .sp -1 \h'0.905i'\v'1.583i'\D'P0.030i -0.015i 0.000i 0.030i' .sp -1 \h'0.905i'\v'1.583i'\D'p0.030i -0.015i 0.000i 0.030i' .sp -1 .sp 1.840i+1 .if \n(00 .fi .br .nr 0x 0 .lf 3111 .PE .lf 3112 .ps .DE .DS .SS MIME Media Types and Subtypes .PP The media types registered by the Internet Assigned Numbers Authority (IANA) are recognized by .BR mparse. A program may enumerate those media types (e.g. to create a menu) by calling the function .PP .B const struct mparse_type *mparse_media_type(int n); .PP with successive integer arguments beginning with 1. Each call will return a pointer to a structure (see below) which represents a media type; when all media types have been enumerated, .B mparse_media_type returns a NULL pointer. 
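.PP
The enumeration loop described above might be sketched as follows. This is only an illustration: the structure and the mparse_media_type function defined here are stand-ins (with a made-up three-entry table) so that the fragment compiles without the library, and list_media_types is a hypothetical helper name. A real program would include mparse.h and drop the stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* STAND-IN for the declaration in mparse.h: only the member used
 * below is modelled here. */
struct mparse_type {
    const char *type_name;   /* canonical type tag */
};

/* STAND-IN for the library function: a three-entry table, not the
 * IANA registry.  Indices start at 1; NULL marks the end. */
static const struct mparse_type *mparse_media_type(int n)
{
    static const struct mparse_type stub[] = {
        { "application" }, { "message" }, { "text" }
    };

    if (n < 1 || (size_t)n > sizeof stub / sizeof stub[0])
        return NULL;
    return &stub[n - 1];
}

/* Hypothetical helper: print one type tag per line (e.g. as menu
 * entries) and return the number of types enumerated. */
static int list_media_types(FILE *out)
{
    const struct mparse_type *t;
    int n;

    assert(out != NULL);
    /* Successive integers beginning with 1, until NULL is returned. */
    for (n = 1; (t = mparse_media_type(n)) != NULL; n++)
        fprintf(out, "%d. %s\n", n, t->type_name);
    return n - 1;
}
```

Calling list_media_types(stdout) writes a numbered list of type tags and returns the count; because termination is signalled by the NULL return, the caller does not need to know the number of registered types in advance.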
.PP Information regarding a registered media type tag can be obtained by calling: .PP .B const struct mparse_type *mparse_type_entry(const char *str, unsigned int len); .PP giving a pointer to the media type tag as .B str and its length (only the primary type tag) as .BR len . The string pointed to by .B str need not be null-terminated: one could call .B mparse_type_entry("message/rfc822", 7U) to obtain the information for the .B message media type. If the tag is not a registered type, a .I NULL pointer is returned. Otherwise, a pointer to the structure defined in .B mparse.h is returned. The members of that structure are: .TS expand; lw(3.1i)fB lw(3.0i)fB . member (struct mparse_type) description _ .T& lp-2fB lp-2 . const char *type_name; canonical type tag unsigned int flags; T{ flags defined in \fBmparse.h\fP T} const struct mparse_subtype *(*e)(int); T{ .na pointer to a function useful for enumerating subtypes of the type T} const struct mparse_subtype *(*f)(const char *, unsigned int); T{ .na pointer to a function returning information about subtypes of the type T} .TE .DE .PP Likewise, information about subtypes may be obtained by calling the appropriate functions referenced by the .I mparse_type structure; they return a similar structure containing only the canonical subtype tag and flags. Note that the illegal "example" type will return a structure with .I NULL pointers for the subtype enumeration and information functions; there are no subtypes of the "example" type. User-defined types and subtypes can be supported via hooks provided for extension types and subtypes; these are application-provided functions which operate like the ones described above, but for the user-defined types (the extension subtype function also takes a pointer to the type). They are described below along with other application hooks. 
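.PP
As a rough sketch of the lookup and subtype enumeration described above: the fragment below passes a pointer into a longer "type/subtype" string together with the length of the primary type tag, then walks the subtypes through the e member. Everything needed to compile it is stubbed locally. The stub functions, the two-entry subtype table, the subtype_name member name, and the helper count_subtypes are assumptions made for illustration, as is the premise that the subtype enumerator starts at 1 and returns NULL at the end in the same way as mparse_media_type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* STAND-INS for mparse.h.  struct mparse_type follows the member
 * table above; the member name subtype_name is an assumption. */
struct mparse_subtype {
    const char *subtype_name;   /* hypothetical member name */
    unsigned int flags;
};

struct mparse_type {
    const char *type_name;
    unsigned int flags;
    const struct mparse_subtype *(*e)(int);
    const struct mparse_subtype *(*f)(const char *, unsigned int);
};

static const struct mparse_subtype stub_subtypes[] = {
    { "rfc822", 0U }, { "delivery-status", 0U }
};
#define STUB_COUNT (sizeof stub_subtypes / sizeof stub_subtypes[0])

/* Stub enumerator: assumed to count from 1 and end with NULL. */
static const struct mparse_subtype *stub_message_subtype(int n)
{
    if (n < 1 || (size_t)n > STUB_COUNT)
        return NULL;
    return &stub_subtypes[n - 1];
}

/* Stub information function (exact-case match, for brevity). */
static const struct mparse_subtype *stub_message_entry(const char *s,
                                                       unsigned int len)
{
    size_t i;

    for (i = 0; i < STUB_COUNT; i++) {
        const char *name = stub_subtypes[i].subtype_name;
        if (strlen(name) == len && memcmp(name, s, len) == 0)
            return &stub_subtypes[i];
    }
    return NULL;
}

static const struct mparse_type stub_message_type = {
    "message", 0U, stub_message_subtype, stub_message_entry
};

/* STAND-IN for the library function: knows only "message". */
static const struct mparse_type *mparse_type_entry(const char *str,
                                                   unsigned int len)
{
    if (len == 7U && memcmp(str, "message", 7) == 0)
        return &stub_message_type;
    return NULL;
}

/* Hypothetical helper: count the subtypes of the primary type named
 * in a "type/subtype" string; -1 if the type is not registered. */
static int count_subtypes(const char *media)
{
    const char *slash;
    size_t len;
    const struct mparse_type *t;
    int n;

    assert(media != NULL);
    slash = strchr(media, '/');
    len = slash ? (size_t)(slash - media) : strlen(media);

    /* str need not be null-terminated: only len bytes are examined. */
    t = mparse_type_entry(media, (unsigned int)len);
    if (t == NULL)
        return -1;
    if (t->e == NULL)        /* e.g. the "example" type */
        return 0;
    for (n = 1; t->e(n) != NULL; n++)
        ;
    return n - 1;
}
```

Checking the e member for NULL before calling through it covers the "example" type, whose structure carries NULL pointers for both subtype functions.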
.PP In addition to accessing the media subtype enumeration and information functions via the .B mparse_type structure elements, it is possible to directly access those functions. For each media type except for the illegal "example" registered type, the name of the enumeration function is composed of the "mparse_" prefix, the type tag, and a "_subtype" suffix, and the name of the information function is composed of the "mparse_" prefix, the type tag, and an "_entry" suffix. So, for example, the functions for the .B video media type are .B mparse_video_subtype (for enumeration) and .BR mparse_video_entry . .PP If the media type for an entity is not explicitly specified via a Content-Type field, there is a default type. Even if a type is specified via a Content-Type field, some other type might be in effect (e.g. if an unrecognized transfer coding is specified). The functions .PP .B const struct mparse_type *mparse_default_type(struct mparse_entity *, int); .PP and .PP .B const struct mparse_subtype *mparse_default_subtype(struct mparse_entity *, int); .PP return pointers to the default media type and subtype for the specified entity. If the integer argument is less than zero, indicating some error, the type and subtype returned correspond to application/octet-stream. .DS .SS Application Hooks .PP Many of the application-defined functions take a pointer to the current .B mparse_entity structure as their first argument. Note that the pointer may point to a different location on different calls to a function; typically this happens as a result of a multipart MIME message. Functions called for fields may access the field via the .B last_field pointer in the .B mparse_entity structure. .PP Several of the application-defined functions take one or more pointers to a .B struct mparse_field or a .B struct mparse_token (both are defined in the header file .BR mparse.h ). 
.PP If the .I userptr should be copied when an .B mparse_entity structure is copied or allocated, or if it points to allocated storage which should be freed when an .B mparse_entity structure is freed, the appropriate functions for copying and freeing should be set in the user application function hooks. Likewise for .B userptr members of the .B mparse_token and .B mparse_field structures. .DE .ne 4.0i .PP These function hooks are held in a structure pointed to by the .I hooks member of the .B mparse_message structure. It is defined in the header file .B mparse.h and contains: .TS H expand; lw(3.4i)fB lw(2.6i)fB . function (member of struct mparse_hooks) description _ .TH .T& lp-5fB lp-5 . T{ .na const void (*hook_accept_language)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Accept-Language field T} T{ .na void (*hook_action)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Action field T} T{ .na void (*hook_alt_recipient)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Alternate-Recipient field T} T{ .na void (*hook_approved)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Approved field T} T{ .na void (*hook_archive)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Archive field T} T{ .na void (*hook_archived_at)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Archived-At field T} T{ .na void (*hook_arrival_date)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Arrival-Date field T} T{ .na void (*hook_autoforwarded)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Autoforwarded field T} T{ .na void (*hook_autosubmitted)(int, const struct mparse_field *, const struct mparse_token *); T} T{ 
.na do something with Autosubmitted field T} T{ .na void (*hook_auto_submitted) (int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Auto\-Submitted field T} T{ .na void (*hook_bad_field)(const struct mparse_field *); T} T{ .na do something with bad mparse_field (use next2 pointers) T} T{ .na void (*hook_bcc)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Bcc field T} T{ .na void (*hook_bilateral_info)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Bilateral-Info field T} T{ .na void (*hook_body_section)(const struct mparse_entity *); T} T{ .na do something with body section (Use next2 pointers) T} T{ .na void (*hook_body_section_end)(const struct mparse_entity *); T} T{ .na do something after body section T} T{ .na void (*hook_body_section_end_of_MIME_fields)(const struct mparse_entity *); T} T{ .na do something at end of body section MIME fields T} T{ .na void (*hook_body_section_end_of_fields)(const struct mparse_entity *); T} T{ .na do something at end of body section fields T} T{ .na void (*hook_body_section_start)(const struct mparse_entity *); T} T{ .na prepare for body section (may include fields) T} T{ .na void (*hook_cancel)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process cancel control message T} T{ .na void (*hook_caller_id)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process Caller-ID field T} T{ .na void (*hook_caller_name)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Caller-Name field T} T{ .na void (*hook_cc)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Cc field T} T{ .na void (*hook_checkgroups)(int, const struct mparse_field *, 
const struct mparse_token *, const struct mparse_token *); T} T{ .na process checkgroups control message T} T{ .na void (*hook_comments)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Comments field T} T{ .na void (*hook_content_alternative)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process content alternative filter T} T{ .na void (*hook_content_base)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process content base T} T{ .na void (*hook_content_description)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process content description T} T{ .na void (*hook_content_disposition)(int, const struct mparse_field *); T} T{ .na process content disposition T} T{ .na void (*hook_content_duration)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process content duration T} T{ .na void (*hook_content_features)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process content features T} T{ .na void (*hook_content_id)(int, const struct mparse_field *); T} T{ .na process content ID T} T{ void (*hook_content_language)(int, const struct mparse_field *, const struct mparse_token *); .na T} T{ .na process content language T} T{ .na void (*hook_content_location)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process content location T} T{ .na void (*hook_content_md5)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Content-MD5 field T} T{ .na void (*hook_content_type)(int, const struct mparse_field *, struct mparse_token *, struct mparse_token *, struct mparse_token *, struct mparse_token *); T} T{ .na process Content-Type field. 
RFC 1049 Content-Type fields may be distinguished from MIME Content-Type fields as the type of the second '\" .B causes unexpected point size changes \fBstruct mparse_token *\fP in the latter is '/' (\fIi.e.\fP a slash character), but not in the former (where it may be a '\" .I causes unexpected point size changes \fINULL\fP pointer or may point to a token with some other value). MIME media type, subtype, and mparse_parameters are cached; RFC 1049 Content-Type field body content is not cached. T} T{ .na void (*hook_conv_w_loss)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Conversion\-With\-Loss field T} T{ .na void (*hook_conversion)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Conversion field T} T{ .na void *(*hook_copy_token_user_data)(const struct mparse_token *); T} T{ .na return a pointer to (an allocated, if hook_free_token_userptr is used) copy of token userptr T} T{ .na void *(*hook_copy_entity_user_data)(const struct mparse_entity *); T} T{ .na return a pointer to (an allocated, if userptr is to be freed) copy of userptr T} T{ .na void *(*hook_copy_field_user_data)(const struct mparse_field *); T} T{ .na return a pointer to (an allocated, if userptr is to be freed) copy of userptr T} T{ .na void (*hook_cte)(int, const struct mparse_field *); T} T{ .na process content-transfer-encoding T} T{ .na void (*hook_date)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Date field T} T{ .na void (*hook_date_received)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Date-Received field T} T{ .na void (*hook_def_delivery)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Deferred-Delivery field T} T{ .na void (*hook_delivery_date)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Delivery-Date field T} T{ .na void 
(*hook_diag_code)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Diagnostic-Code field T} T{ .na void (*hook_discarded_x400_ipms_extensions)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Discarded-X400-IPMS-Extensions field T} T{ .na void (*hook_discarded_x400_mts_extensions)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Discarded-X400-MTS-Extensions field T} T{ .na void (*hook_disclose_recip)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Disclose-Recipients field T} T{ .na void (*hook_disposition)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process disposition T} T{ .na void (*hook_disposition_notification_opts)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Disposition-Notification-Options field T} T{ .na void (*hook_disposition_notification_to)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Disposition-Notification-To field T} T{ .na void (*hook_distribution)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Distribution field T} T{ .na int (*hook_distribution_validate)(const struct mparse_field *, const struct mparse_token *); T} T{ .na validate (0: OK) distribution name T} T{ .na void (*hook_dl_exp_history)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with DL-Expansion-History field T} T{ .na void (*hook_drc_bi)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Delivery-Report-Content-Billing-Information field T} T{ .na void (*hook_drc_it)(int, const struct mparse_field *, const struct 
mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Delivery-Report-Content-Intermediate-Trace field T} T{ .na void (*hook_drc_rri)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Delivery-Report-Content-Reported-Recipient-Info field T} T{ .na void (*hook_drc_ua_c_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Delivery-Report-Content-UA-Content-ID field T} T{ .na void (*hook_drc_original)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Delivery-Report-Content-Original field T} T{ .na void (*hook_dsn_gateway)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with DSN-Gateway field T} T{ .na void (*hook_encoding)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Encoding field T} T{ .na void (*hook_encrypted)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process (obsolete) Encrypted field T} T{ .na int (*hook_end_of_message)(const struct mparse_message *); T} T{ .na do something at end of message, return zero for normal mparse_status, non-zero to indicate an abnormal condition T} T{ .na void (*hook_end_of_mdn_fields)(const struct mparse_entity *); T} T{ .na do something at end of MDN T} T{ .na void (*hook_end_of_per_message_fields)(const struct mparse_entity *); T} T{ .na do something at end of DSN per-message fields T} T{ .na void (*hook_end_of_per_recipient_fields)(const struct mparse_entity *); 
T} T{ .na do something at end of DSN per-recipient fields T} T{ .na void (*hook_error)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process RFC 2298, 3798 Error field T} T{ .na void (*hook_error_message)(const struct mparse_message *, const char *); T} T{ .na do something with error message before output T} T{ .na void (*hook_expires)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process Expires field T} T{ .na void (*hook_extension)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process extension field (contents) T} T{ .na const char *(*hook_extension_MTA_name_type) (const char *, unsigned int); T} T{ .na support x- and unregistered MTA-name-types T} T{ .na const struct mparse_name_val *(*hook_extension_access_type)(const char *, unsigned int); T} T{ .na support x- and unregistered access-type for message/external-body T} T{ .na const char *(*hook_extension_address_type) (const char *, unsigned int); T} T{ .na support x- and unregistered address-types T} T{ .na const struct mparse_name_val *(*hook_extension_auto_submitted) (const char *, unsigned int); T} T{ .na support x\- and unregistered auto\-submitted tokens T} T{ .na const struct mparse_charset *(*hook_extension_charset)(const char *, unsigned int); T} T{ .na support x- and unregistered charsets T} T{ .na const char *(*hook_extension_diagnostic_type) (const char *, unsigned int); T} T{ .na support x- and unregistered diagnostic-types T} T{ .na const struct mparse_disposition *(*hook_extension_disposition)(const char *, unsigned int); T} T{ .na support x- and unregistered dispositions T} T{ .na const struct mparse_name_val *(*hook_extension_disposition_modifier)(const char *, unsigned int); T} T{ .na support x- and unregistered disposition modifiers T} T{ .na const struct mparse_name_val *(*hook_extension_disposition_notification_option)(const char *, unsigned int); T} T{ .na support x- and 
unregistered disposition notification options T} T{ .na const struct mparse_language *(*hook_extension_language)(const char *, unsigned int); T} T{ .na support x- and unregistered languages T} T{ .na const struct mparse_subtype *(*hook_extension_media_subtype)(const struct mparse_type *, const char *, unsigned int); T} T{ .na support x- and unregistered media type subtypes T} T{ .na const struct mparse_type *(*hook_extension_media_type)(const char *, unsigned int); T} T{ .na support x- and unregistered media types T} T{ .na const char *(*hook_extension_numbering_plan)(const char *, unsigned int); T} T{ .na support x- and unregistered numbering plans T} T{ .na const struct mparse_encoding *(*hook_extension_transfer_encoding)(const char *, unsigned int); T} T{ .na support x- and unregistered transfer encodings T} T{ .na void (*hook_failure)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process RFC 2298, 3798 Failure field T} T{ .na void (*hook_fcc)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Fcc field T} T{ .na void (*hook_final_log_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Final-Log-ID field T} T{ .na void (*hook_final_recipient)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process RFC 2298, 3798 final-recipient field T} T{ .na void (*hook_followup_to)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Followup-To field T} T{ .na void (*hook_free_entity_userptr)(const struct mparse_entity *, void *); T} T{ .na free allocated userptr in entity structure T} T{ .na void (*hook_free_field_userptr)(const struct mparse_field *, void *); T} T{ .na free allocated userptr in field structure T} T{ .na void (*hook_free_token_userptr)(const struct mparse_token *, void *); T} T{ .na free allocated userptr in token structure 
T} T{ .na void (*hook_from)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with From field T} T{ .na void (*hook_gen_del_rept)(int, const struct mparse_field *); T} T{ .na do something with Generate-Delivery-Report field T} T{ .na void (*hook_field)(const struct mparse_field *); T} T{ .na do something with logical field line (use next2 links) T} T{ .na void (*hook_ihave)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process ihave control message (optional message-id series, system name) T} T{ .na void (*hook_illegal_field)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Illegal-Field field T} T{ .na void (*hook_illegal_object)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Illegal-Object field T} T{ .na void (*hook_importance)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Importance field T} T{ .na void (*hook_in_reply_to)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process In-Reply-To field T} T{ .na void (*hook_incomplete_copy)(int, const struct mparse_field *); T} T{ .na do something with Incomplete-Copy field T} T{ .na void (*hook_injection_date)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Injection-Date field T} T{ .na void (*hook_injector_info)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Injector-Info field T} T{ .na void (*hook_keywords)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Keywords field T} T{ .na void (*hook_last_attempt_date)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Last-Attempt-Date field T} T{ .na void (*hook_latest_del_time)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na 
do something with Latest-Delivery-Time field T} T{ .na void (*hook_lines)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Lines field T} T{ .na void (*hook_list_archive)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process List-Archive field T} T{ .na void (*hook_list_help)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process List-Help field T} T{ .na void (*hook_list_id)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process List-ID field T} T{ .na void (*hook_list_owner)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process List-Owner field T} T{ .na void (*hook_list_post)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process List-Post field T} T{ .na void (*hook_list_subscribe)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process List-Subscribe field T} T{ .na void (*hook_list_unsubscribe)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process List-Unsubscribe field T} T{ .na void (*hook_mail_from)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Mail-From field T} T{ .na int (*hook_mailbox_domain_validate)(const struct mparse_field *, const struct mparse_token *); T} T{ .na validate (0: OK) domain token list T} T{ .na int (*hook_mailbox_validate)(const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na validate (0: OK) mailbox (local-part, domain) T} T{ .na void (*hook_mdn_gateway)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process MDN-Gateway field T} T{ .na void (*hook_media_accept_features)(int, const struct mparse_field *, const struct 
mparse_token *); T} T{ .na process media-accept-features T} T{ .na void (*hook_message_context)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Message-Context T} T{ .na void (*hook_message_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Message-ID T} T{ .na void (*hook_message_type)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Message-Type field T} T{ .na void (*hook_mime_close_delimiter)(const struct mparse_entity *, const struct mparse_token *); T} T{ .na process MIME close delimiter boundary T} T{ .na void (*hook_mime_delimiter)(const struct mparse_entity *, const struct mparse_token *); T} T{ .na process MIME delimiter boundary T} T{ .na void (*hook_mime_encapsulation)(const struct mparse_entity *); T} T{ .na set up child entity (hooks, etc.) T} T{ .na void (*hook_mime_external_body)(const struct mparse_entity *); T} T{ .na process (e.g. fetch, display) MIME external-body message T} T{ .na void (*hook_mime_multipart)(const struct mparse_entity *); T} T{ .na set up child entity (hooks, etc.) 
T} T{ .na void (*hook_mime_version)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process MIME\-version field T} T{ .na void (*hook_mvgroup)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process mvgroup control message (newsgroup names) T} T{ .na void (*hook_newgroup)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process newgroup control message (newsgroup name, optional "moderated" keyword) T} T{ .na void (*hook_newsgroups)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Newsgroups field T} T{ .na int (*hook_newsgroup_validate)(const struct mparse_field *, const struct mparse_token *); T} T{ .na validate (0: OK) newsgroup name T} T{ .na void (*hook_obsoletes) (int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Obsoletes field T} T{ .na void (*hook_organization)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Organization field T} T{ .na void (*hook_orig_env_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Original-Envelope-ID field T} T{ .na void (*hook_orig_ret_addr)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Originator-Return-Address field T} T{ .na void (*hook_original_encoded_information_types)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Original-Encoded-Information-Types field T} T{ .na void (*hook_original_message_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process RFC 2298, 3798 original-message-id field T} T{ .na void (*hook_original_recipient)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process RFC 2298, 3798 
original-recipient field T} T{ .na void (*hook_p1_content_type)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with P1-Content-Type field T} T{ .na void (*hook_p1_message_id)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with P1-Message-ID field T} T{ .na void (*hook_p1_recipient)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with P1-Recipient field T} T{ .na void (*hook_path)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Path line T} T{ .na void (*hook_posting_version)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process Posting\-Version field T} T{ .na void (*hook_prev_nondel_rep)(int, const struct mparse_field *); T} T{ .na do something with Prevent\-Nondelivery\-Report field T} T{ .na void (*hook_priority)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Priority field T} T{ .na void (*hook_received)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Received field T} T{ .na void (*hook_received_content_mic)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Received\-content\-MIC field T} T{ .na int (*hook_received_by_domain_validate)(const struct mparse_field *, const struct mparse_token *); T} T{ .na validate (0: OK) Received by domain T} T{ .na int (*hook_received_from_domain_validate)(const struct mparse_field *, const struct mparse_token *); T} T{ .na validate (0: OK) Received from domain T} T{ .na void (*hook_rec_from_mta)(int, const struct mparse_field *, const struct mparse_token *, const 
struct mparse_token *); T} T{ .na do something with Received\-From\-MTA field T} T{ .na void (*hook_references)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process References field T} T{ .na void (*hook_relay_version)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process Relay\-Version field T} T{ .na void (*hook_remote_mta)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Remote\-MTA field T} T{ .na void (*hook_reply_by)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Reply\-By field T} T{ .na void (*hook_reply_to)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Reply\-To field T} T{ .na void (*hook_reporting_mta)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Reporting\-MTA field T} T{ .na void (*hook_reporting_ua)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Reporting\-UA field T} T{ .na void (*hook_resent_bcc)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Resent\-Bcc field T} T{ .na void (*hook_resent_cc)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Resent\-Cc field T} T{ .na void (*hook_resent_date)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Resent\-Date field T} T{ .na void (*hook_resent_from)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Resent-From field T} T{ .na void (*hook_resent_message_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Resent-Message-ID T} T{ .na void (*hook_resent_reply_to)(int, 
const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with (deprecated) Resent-Reply-To field T} T{ .na void (*hook_resent_sender)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Resent-Sender field T} T{ .na void (*hook_resent_to)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Resent-To field T} T{ .na void (*hook_return_path)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Return-Path field T} T{ .na void (*hook_rmgroup)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process rmgroup control message (newsgroup name) T} T{ .na void (*hook_sender)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Sender field T} T{ .na void (*hook_sendme)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na process sendme control message (optional message-id series, system name) T} T{ .na void (*hook_sendsys)(int, const struct mparse_field *); T} T{ .na process sendsys control message T} T{ .na void (*hook_sensitivity)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Sensitivity field T} T{ .na void (*hook_separator)(const struct mparse_entity *); T} T{ .na process separator between header fields and body section T} T{ .na void (*hook_start_of_message)(const struct mparse_entity *); T} T{ .na do something at start of message T} T{ .na void (*hook_status)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Status field T} T{ .na void (*hook_subject)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Subject field T} T{ .na void (*hook_summary)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process Summary field T} T{ .na void (*hook_supersedes)(int, const struct mparse_field *, 
const struct mparse_token *); T} T{ .na do something with Supersedes field T} T{ .na void (*hook_to)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with To field T} T{ .na void (*hook_ua_c_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with UA-Content-ID field T} T{ .na void (*hook_undefined_control)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process undefined control message (mail to local "usenet" account?) T} T{ .na void (*hook_user_agent)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with User-Agent field T} T{ .na void (*hook_user_defined)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process user_defined (X-) mparse_field (contents) T} T{ .na void (*hook_version)(int, const struct mparse_field *); T} T{ .na process version control message T} T{ .na void (*hook_warning)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na process RFC 2298, 3798 Warning field T} T{ .na void (*hook_will_retry_until)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with Will-Retry-Until field T} T{ .na void (*hook_x400_cont_id)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with X400-Content-Identifier field T} T{ .na void (*hook_x400_cont_ret)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with X400-Content-Return field T} T{ .na void (*hook_x400_content_type)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with X400-Content-Type field T} T{ .na void (*hook_x400_mts_identifier)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with X400-MTS-Identifier field T} T{ .na void (*hook_x400_originator)(int, const struct mparse_field *, const struct mparse_token *, 
int); T} T{ .na do something with X400-Originator field T} T{ .na void (*hook_x400_received)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with X400-Received field T} T{ .na void (*hook_x400_recipients)(int, const struct mparse_field *, const struct mparse_token *, int); T} T{ .na do something with X400-Recipients field T} T{ .na void (*hook_x400_trace)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with X400-Trace field T} T{ .na void (*hook_xref)(int, const struct mparse_field *, const struct mparse_token *, const struct mparse_token *, const struct mparse_token *); T} T{ .na do something with Xref line T} T{ .na void (*hook_x_archived_at)(int, const struct mparse_field *, const struct mparse_token *); T} T{ .na do something with X-Archived-At field T} .TE .DS .SS Parser Initialization The .B mparse_message structure should first be initialized to all zero values either by using .BR calloc (3) or .BR memset (3). Flags, file pointers, and pointers to application-defined functions can then be set, prior to calling one of the .B mparse functions (described in detail below). .DE .DS .SS Message Parsing .PP The simplest call to .B mparse would be .PP .ft CW .ps 8 .vs 9 .nf mparse_parse(0,\ 0,\ 0); .fi .ft .ps .vs .PP which automatically allocates a .B mparse_message structure and associated .B r_flex structure, reads input from .IR stdin , sends output to .IR stdout , and error and warning output to .IR stderr . Since no flags or function hooks are set in the allocated .B mparse_message structure, all processing is the default. 
In order to obtain behavior other than the default, the caller should provide a pointer to a .B mparse_message structure which has been initialized with the desired function hooks, modes, and flags. A typical method would be: .PP .na .nf .ft CW .ps 8 .vs 9 #include <mparse.h> /* ... */ \0\0\0\0int\0ret; \0\0\0\0struct mparse_message\0message; \0\0\0\0struct mparse_debug\0debug; \0\0\0\0memset(&message, 0, sizeof(struct mparse_message)); \0\0\0\0memset(&debug, 0, sizeof(struct mparse_debug)); \0\0\0\0message.dbg = &debug; \0\0\0\0/* set function hooks */ \0\0\0\0/* set modes using mparse_add_mode(), etc. */ \0\0\0\0/* set flags, e.g.: */ \0\0\0\0message.no_copy = 1U; \0\0\0\0ret = mparse_parse(&message, 0, 0); .vs .ps .ft .fi .ad .DE .PP .fi Of course, it is also possible to pass other FILE pointers; this is done in the usual way by calling .B fopen or related functions. When called with an initialized .B mparse_message structure, .B mparse_parse will process input according to the flags, generate warning and error messages according to the modes, and call the specified functions at the appropriate points. Processing can be performed on-the-fly by functions called, or the calling application can defer processing until one or more of the hook_end_of_... functions is called, or a mix of both types of processing may be used. .PP It is also possible to parse a message held in a character array rather than a file; use .PP .B int mparse_parse_string(struct mparse_message *, const char *, FILE *); .PP Because C strings are terminated by an ASCII .I NUL character, .B mparse_parse_string might fail if the message has an embedded ASCII .IR NUL . In that case, one can use .PP .B int mparse_parse_buffer(struct mparse_message *, char *, unsigned int, FILE *); .PP where the message length in bytes is given by the unsigned integer argument. Refer to the .I flex documentation for details on parsing input from such a .IR buffer .
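.PP
The choice between the string and buffer entry points comes down to how the input length is determined. The following is a minimal sketch in plain C that does not call any mparse functions; the helper names visible_length and hidden_bytes are hypothetical and exist only for illustration. It shows that a NUL-terminated scan stops at the first embedded NUL, whereas an explicit byte count covers the whole message:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of mparse: reports how many bytes a
 * NUL-terminated string interface would see in a buffer whose real
 * size (nbytes) is known to the caller. */
static size_t visible_length(const char *buf, size_t nbytes)
{
    const char *nul;

    assert(buf != NULL);
    /* memchr locates the first NUL within the first nbytes bytes, if any */
    nul = memchr(buf, '\0', nbytes);
    return nul ? (size_t)(nul - buf) : nbytes;
}

/* Number of trailing bytes that a string-based interface would never see. */
static size_t hidden_bytes(const char *buf, size_t nbytes)
{
    return nbytes - visible_length(buf, nbytes);
}
```

If hidden_bytes reports a non-zero value for a message, the message contains an embedded NUL, and the length-taking interface (with the full byte count as the unsigned integer argument) is the appropriate one to use.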
.PP Token lists are generated for each field, each delimiter, each empty separator line, and each body (or preamble or epilogue) section. The delimiter and separator lists are referenced directly by the .B delimiter and .B separator pointers in the .B mparse_entity structure. .DS .SS Termination After .B mparse_parse returns, data (e.g. ioerr) in the .B mparse_message structure may be examined. If one wishes to examine parsed message structure, that should be done via the .I hook_end_of_message application hook described above, .I i.e. before mparse frees allocated structures and returns. .DE .DS .SS Support Functions .PP There are a number of support functions available for working with the .B mparse_token structure, which stores information about the token returned by the lexical analyzer, and provides pointers which comprise linked lists. One list links tokens in the order they are recognized, another is used to group tokens into a logical construct, such as an address, which may exclude some input tokens such as whitespace, line folding, and/or comments. Other pointers are used for navigating lists and bracketed constructs. .PP The .B mparse_print_token and .B mparse_print_tokens functions may be used to print a list of logically-grouped tokens (mparse_print_token) or a complete list of tokens (mparse_print_tokens) to the specified FILE. Character strings which precede and follow the tokens may be specified as the second and fourth arguments: .PP .B size_t mparse_print_token(FILE *, const char *, const struct mparse_token *, const char *); .PP .B size_t mparse_print_tokens(FILE *, const char *, const struct mparse_token *, const char *); .PP Both functions use .B fwrite to produce output, so there is no problem with binary data, including ASCII NUL. Byte-stuffing is included if specified for the message (if any) associated with the token or tokens. Each function returns the count of octets written. 
If a .I NULL FILE * argument is provided, no output is actually written, but the count returned is what would have been written. .DE .DS .PP Alternatively, the token lists may be expanded into an array of characters (with no provision for byte stuffing). The functions .B mparse_token_string and .B mparse_tokens_string may be used for this purpose: .PP .B size_t mparse_token_string(const struct mparse_token *, char *, size_t, unsigned int, unsigned int); .PP .B size_t mparse_tokens_string(const struct mparse_token *, char *, size_t, unsigned int, unsigned int); .PP The second argument points to the array of characters to be filled, and the third argument gives the size of the array. The return value gives the number of bytes written (not including the terminating '\e0') if the buffer is large enough, or the required minimum size if the buffer is not large enough. To determine how large an array is needed, therefore, one can first call one of these functions with either a NULL character pointer or a zero size. The fourth argument provides information on the context of the first argument. It is constructed by bitwise ORing zero or more of the following: .TS expand; lw(2.0i)fB lw(2.9i)fB . symbolic name description _ .T& lfB l . MPARSE_CONTEXT_CFWS token is part of CFWS MPARSE_CONTEXT_DID_SPACE whitespace exists at end of output buffer MPARSE_CONTEXT_QUOTED token is in a quoted-string .TE MPARSE_CONTEXT_DID_SPACE is primarily used for recursive calls and when .B mparse_tokens_string calls .BR mparse_token_string . The fifth argument controls how comments, line folding, and whitespace (collectively .BR CFWS ) are treated, how non-ASCII and ASCII control bytes are handled, and whether or not quoted and backslash-escaped sequences are canonicalized. It is constructed by bitwise ORing zero or more of the following: .TS expand; lw(3.2i)fB lw(2.3i)fB . symbolic name description _ .T& lfB l .
MPARSE_STRING_MODE_CANONICALIZE eliminate unnecessary quoting MPARSE_STRING_MODE_COMMENT_SPACE T{ use a single space character in place of each comment T} MPARSE_STRING_MODE_ENCODE_BIT8 T{ encode all non-ASCII octets using a quoted\-printable-like encoding T} MPARSE_STRING_MODE_ENCODE_CTLS T{ encode all ASCII control characters using a quoted\-printable-like encoding T} MPARSE_STRING_MODE_NORMALIZE_AT use @ in place of RFC 733 " at " MPARSE_STRING_MODE_REMOVE_WS ignore all non-essential whitespace MPARSE_STRING_MODE_SQUEEZE_WS T{ use a single space character to represent any run of whitespace T} MPARSE_STRING_MODE_DELETE_TRAILING_WS delete trailing whitespace MPARSE_STRING_MODE_ENCODE_TRAILING_WS encode trailing whitespace MPARSE_STRING_MODE_UNFOLD ignore line folding .TE There is interaction between these values; MPARSE_STRING_MODE_UNFOLD | MPARSE_STRING_MODE_COMMENT_SPACE | MPARSE_STRING_MODE_SQUEEZE_WS will replace all runs of CFWS with a single space character, and MPARSE_STRING_MODE_UNFOLD | MPARSE_STRING_MODE_COMMENT_SPACE | MPARSE_STRING_MODE_REMOVE_WS will ignore all non-essential CFWS. .PP Canonicalization of quoting may be useful when comparing local-parts or domains in addresses or message-ids; the following are all equivalent: .PP <foo.bar@[1.2.3.4]> (canonical form) .PP <"foo.bar"@[1\e.2.3.4]> .PP <"f\eoo.bar"@[1\e.2.3.4]> .PP <"f\eoo\e.bar"@[1\e.2.3.4]> .PP <"f\eoo\e.bar"@[1\e.\e2.3.4]> .DE .PP Normally, storage for tokens is allocated and released automatically. In some situations, it may be necessary to free space allocated for tokens which are no longer required, such as when replacing a body section.
The functions .B mparse_free_token and .B mparse_free_tokens are used for this purpose: .PP .B void mparse_free_token(const struct mparse_message *, struct mparse_token *); .PP .B void mparse_free_tokens(const struct mparse_message *, struct mparse_token *); .DS .PP A token may be created with a call to .PP .B struct mparse_token *mparse_new_token(const struct mparse_message *, const char *, int, int, int, int); .PP The token created uses an allocated copy of the string pointed to by the second argument as the token .B tok member; the four integer arguments are used for the .BR len ", " col ", " type ", and " val members, respectively. If .B len is zero, the appropriate length will be calculated using the usual C string rule that a zero valued character terminates the string. .DE .PP A copy of an existing related set of tokens may be made by calling .PP .B struct mparse_token *mparse_token_copy(const struct mparse_message *, const struct mparse_field *, const struct mparse_token *); .PP or .PP .B struct mparse_token *mparse_tokens_copy(const struct mparse_message *, const struct mparse_field *, const struct mparse_token *, struct mparse_token **); .PP The former copies tokens linked by the .B next and .B trailer pointers, while the latter uses .B trailer and .B next2 pointers. .B tokens_copy can also return the address of the last token copied by supplying a .RI non- NULL pointer as the last argument. Both functions assign the copied tokens to the specified field structure. .PP A token or group of related tokens may be inserted into an existing linked structure of tokens using .PP .B void mparse_insert_tokens(struct mparse_token *, struct mparse_token *); .PP The first argument points to a token in an existing linked structure; the token or linked structure of tokens pointed to by the second argument will be inserted after the token pointed to by the first argument.
.PP Various types of pointers may be established between tokens using .PP .B struct mparse_token *mparse_link_tokens(int, struct mparse_token *, struct mparse_token *); .PP The first argument specifies the types of pointer and may be bitwise ORed from the values given by the macros .BR MPARSE_LINK_NEXT ", " MPARSE_LINK_NEXT2 ", " MPARSE_LINK_TRAILER ", and " MPARSE_LINK_CLOSE . The remaining two arguments are pointers to .B mparse_token structures to be linked. .PP A pointer to the first token of a field body (i.e. after the field name, colon, and any CFWS) may be obtained by calling .PP .B struct mparse_token *mparse_field_body(const struct mparse_field *fld); .PP with a pointer to the .I mparse_field structure. .PP When token string text is changed, or tokens are inserted or deleted from a token stream, the column numbers associated with tokens may change. While the above functions automatically correct for that situation, it may sometimes be necessary for applications to make corrections. That may be accomplished by calling the function .PP .B int mparse_adjust_col(struct mparse_token *); .PP with the argument pointing to the last token known to have a correct column number. Subsequent tokens' column numbers through the next line ending are corrected. The return value is the column of the last token corrected, or a negative value if no adjustment could be made. .DS .PP An instance of a particular field type in an entity may be located by calling the function .PP \s-1\fBstruct mparse_field *mparse_find_field(struct mparse_entity *, int, int, unsigned int);\fP\s0 .PP where the first integer argument is either zero or the symbolic constant MPARSE_RESENT_FIELD to indicate whether an ordinary field or a Resent\- version (if applicable) is to be returned. The second integer argument indicates the type of field, and is the token value in the associated field_state structure. 
The unsigned integer indicates how many instances should be skipped: .PP .ft CW .ps 8 .vs 9 .nf mparse_find_field(entity, 0, MPARSE_FIELD_RECEIVED, 2U); .fi .ft .ps .vs .PP returns a pointer to the 3rd Received field in the specified entity (two Received fields are skipped). .DE .DS .PP Given a token value for a field name, the function .PP .B const char *mparse_field_name(int); .PP returns a pointer to the canonical field name string. .DE .DS .PP The token value associated with a field name can also be used to return a pointer to a constant (read-only) .B field_state structure for that field name by calling .PP .B const struct mparse_field_state *mparse_field_state(int); .DE .DS .PP A copy of a field may be appended to the fields of an entity by calling the function .PP .B int mparse_copy_field(const struct mparse_field *, struct mparse_entity *, struct mparse_field *); .PP which allocates a copy of the field and inserts it in the fields in the specified entity, before the field (in the destination entity) specified by the last argument. It returns a negative value with .I errno set appropriately if bad arguments are supplied or in the event of a system error. It returns zero on success. .DE .SS Message Body Decoding .PP MIME provides two methods of encoding body information for transport: base64 encoding and quoted\-printable encoding as defined in RFC 2045. The functions .B decode_b64 and .B decode_qp decode body sections which have been encoded by these methods, replacing the original (encoded) body by a sequence of tokens representing the decoded content. .PP .B void mparse_decode_b64(struct mparse_entity *); .PP .B void mparse_decode_qp(struct mparse_entity *); .PP Message bodies may contain other coded content which has been coded for reasons other than transport robustness, e.g. HTML or other page description languages. 
Such encoding is beyond the scope of what .B mparse has been designed to handle; .B mparse will collect the content and may be used to reverse transport encoding, but further processing (e.g. rendering HTML or another page description language to a display) must be performed by the calling application via the provided function hooks. .DS .PP In addition to the transformations effected by base64 and quoted\-printable encoding, MIME provides specification of the domain of body content via the Content-Transfer-Encoding field. The valid encoding types are documented in RFC 2045. The .B mparse_encoding structure holds information about the message body domain for an encoding: .TS expand; lw(1.5i)fB lw(2.9i)fB . member (struct mparse_encoding) description _ .T& lfB l . const char *encoding_name; canonical encoding tag unsigned int domain; domain (7bit, 8bit, or binary) .TE .PP Registered encodings are recognized automatically by .BR mparse ; extensions can be recognized via the hook described above. .DE .DS .PP Names of registered encodings may also be recognized by calling .PP \s-2\fBconst struct mparse_encoding *mparse_encoding_entry(const char *, unsigned int);\fP\s0 .PP which returns a const .B mparse_encoding structure pointer for recognized encodings or a .I NULL pointer if the string is not a registered encoding name. 
.DE .DS The standard encoding values are available by calling the following functions, each of which returns a pointer to a constant encoding structure: .PP \s-2\fBconst struct mparse_encoding *mparse_encoding_7bit(void);\fP\s0 .PP \s-2\fBconst struct mparse_encoding *mparse_encoding_8bit(void);\fP\s0 .PP \s-2\fBconst struct mparse_encoding *mparse_encoding_binary(void);\fP\s0 .PP \s-2\fBconst struct mparse_encoding *mparse_encoding_base64(void);\fP\s0 .PP \s-2\fBconst struct mparse_encoding *mparse_encoding_qp(void);\fP\s0 .DE .DS .PP The function .PP \s-2\fBconst struct mparse_encoding *mparse_self_encoding(struct mparse_entity *);\fP\s0 .PP returns a pointer to a constant encoding structure based on the domain of the actual content of the entity specified. .DE .DS .SS Message Generation .PP .B Mparse provides functions for generating low-level and high-level message components, including complete MIME messages. Recall that simple messages are stored in an .B mparse_entity structure; an empty one may be created by calling the function .BR mparse_new_entity : .PP .B struct mparse_entity *mparse_new_entity(struct mparse_message *); .DE .PP If it is desired to copy an .B mparse_entity structure, including content, .B entity_copy may be used: .PP .B struct mparse_entity *mparse_entity_copy(struct mparse_message *, const struct mparse_entity *); .PP The copied entity is assigned to the message specified, if not .IR NULL , or to the same message as the original. .PP Having an .B mparse_entity structure, empty or otherwise, it is possible to add either header or body content. The .B mparse_insert_field function is provided for adding fields: .PP \s-2\fBint mparse_insert_field(struct mparse_entity *, struct mparse_field *, const char *, int, const char *, int, size_t, const char *, const char *, ...);\fP\s0 .PP The .B mparse_insert_field function accepts a variable number of character string arguments and will generate a string in a character array. 
The first two arguments to .B mparse_insert_field are the .B mparse_entity structure which will contain the field and a pointer to the field before which the new field will be inserted. If a .I NULL pointer is given for the field, the new field is appended to any existing fields. The third and fourth arguments point to a string naming the charset used for the field contents and give the length of that string. A .I NULL pointer or non-positive length indicates the default US-ASCII charset (it is then an error if any non-ASCII characters are provided in the field strings). Language information is provided via the fifth argument. The sixth argument is a flag which indicates whether to use the canonical capitalization for the field name tag, etc. The seventh argument gives the maximum line length and controls folding of long lines. The string pointer supplied as the eighth argument is used as the field tag, and should not include a colon. Following strings are used to build the field body content; a zero (NULL) pointer must be passed as the last argument. An empty Bcc field can be appended with .PP .ft CW .ps 8 .vs 9 .nf mparse_insert_field(entity, 0, 0, 0, 0, 1, 78, "bCc", 0); .fi .ft .ps .vs .PP which will generate the field "Bcc:\er\en". .PP To insert a field at the start of the field list, use the .B fields pointer from the .B mparse_entity structure. For example: .PP .ft CW .ps 8 .vs 9 .nf int c; char buf[64]; mparse_gen_date_local(buf, sizeof(buf)); c = mparse_insert_field(entity, entity->fields, 0, 0, 0, 1, 78, "Date", buf, 0); .fi .ps .vs .ft .PP will insert a Date field at the beginning of the list of fields in the .B mparse_entity structure pointed to by .I entity .RB ( gen_date_local will be described below). 
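.PP
To make the shape of a generated field line concrete, here is a minimal, self-contained sketch in plain C. It is not the mparse implementation and calls no mparse functions; the names sketch_put and sketch_field_line are hypothetical. It assumes a simple capitalization rule (upper-case the first letter of each hyphen-separated word, lower-case the rest) and a single space after the colon when a body is present. Real canonical names such as Message-ID do not follow that rule and would need a lookup table; line folding is also omitted.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Store one character if it fits (leaving room for the terminating NUL)
 * and always advance the running count of bytes needed. */
static void sketch_put(char *out, size_t outsize, size_t *n, char ch)
{
    if (out != NULL && *n + 1 < outsize)
        out[*n] = ch;
    (*n)++;
}

/* Hypothetical stand-in, not an mparse function. Builds "Tag: body" or,
 * for an empty body, "Tag:", followed by CRLF. Returns the number of
 * bytes needed, excluding the terminating NUL; output is truncated but
 * still NUL-terminated when outsize is too small. */
static size_t sketch_field_line(char *out, size_t outsize,
                                const char *tag, const char *body)
{
    size_t n = 0;
    int start_of_word = 1;
    const char *p;

    assert(tag != NULL && body != NULL);

    for (p = tag; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        /* assumed rule: capitalize the first letter after each hyphen */
        sketch_put(out, outsize, &n,
                   (char)(start_of_word ? toupper(c) : tolower(c)));
        start_of_word = (c == '-');
    }
    sketch_put(out, outsize, &n, ':');
    if (*body != '\0') {
        sketch_put(out, outsize, &n, ' ');
        for (p = body; *p != '\0'; p++)
            sketch_put(out, outsize, &n, *p);
    }
    sketch_put(out, outsize, &n, '\r');
    sketch_put(out, outsize, &n, '\n');

    if (out != NULL && outsize > 0)
        out[n < outsize ? n : outsize - 1] = '\0';
    return n;
}
```

Called with the tag "bCc" and an empty body, the sketch produces the same "Bcc:" line terminated by CRLF as the empty Bcc example above. Passing a null output pointer with a zero size returns only the required length, which allows the caller to size a buffer before a second call.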
.DS .PP A low-level method of creating a field with no content is .PP .B struct mparse_field *mparse_new_field(const struct mparse_entity *); .PP which allocates a structure and associates it with the specified entity, returning a pointer to the allocated structure. .DE .DS .PP .B int mparse_header_end(struct mparse_entity *entity); .PP should be called after all header fields have been generated. .B mparse_header_end summarizes information about the header fields and cross\-checks fields for consistency and number. It returns a negative value if there is a very serious error, otherwise a count of the number of errors found. .DE .DS .PP A time-stamp line (a.k.a. Received field) may be inserted in a message by calling: .PP .B int mparse_time_stamp(struct mparse_message *message, struct sockaddr_in *peer, const char *name, int esmtp, const char *id); .PP where .B message points to the message, .B peer points to a .I sockaddr_in structure for the sender socket connection if available, otherwise .B name gives the domain name of the sender (preferably as determined by a reliable mechanism; the content of the HELO or EHLO SMTP command is unreliable (easily forged)), the .B esmtp flag indicates whether ESMTP protocol is being used (esmtp > 0), SMTP is being used (esmtp == 0), or neither protocol applies (esmtp < 0), and .B id gives a string for the optional id part of the field (NULL to omit the id, which will likely be elided in any case due to irreconcilable conflicts between the relevant RFCs). The generated time stamp line will be inserted at the beginning of the top-level message header fields as required by the SMTP RFCs. .DE .PP SMTP servers insert a return path line when making "final" delivery. Such a line may be inserted in a message by calling: .PP .B int mparse_return_path(struct mparse_message *message, const char *aa); .PP where .B message points to the message and .B aa is a character string containing the angle-bracketed return path.
The Return-Path field is prepended to the top-level message header fields, and if there are no errors any pre-existing Return-Path fields are removed. .PP SMTP MTAs supporting DSN extensions may need to insert an Original-Recipient field, which may be accomplished by calling: .PP .B int mparse_original_recipient(struct mparse_message *message, const char *addr_type, const char *or); .PP where, as above, .B message points to the message, .B addr_type points to a character string giving the (registered) address type, and .B or points to a character string giving the original recipient address. The field is added near the beginning of the top-level message header fields, between the Return-Path and Received fields. On success, any pre-existing Original-Recipient top-level header fields are removed, as described in RFC 2298 section 2.3 (also RFC 3798). .DS .PP Body content may be appended to an .B mparse_entity structure using the .BR mparse_append_body_line , .BR mparse_append_body_from_file , or .B mparse_append_body_from_buffer functions: .PP .B int mparse_append_body_line(struct mparse_entity *, const char *, const char *, unsigned int, const char *, unsigned int, const char *, unsigned int, const char *, const char *, unsigned int, const char *, const char *, const char *, unsigned int); .PP .B int mparse_append_body_from_file(struct mparse_entity *, FILE *, const char *, unsigned int, const char *, unsigned int, const char *, unsigned int, const char *, const char *, unsigned int, const char *, const char *, const char *, unsigned int); .PP .B int mparse_append_body_from_buffer(struct mparse_entity *, char *, unsigned int, const char *, unsigned int, const char *, unsigned int, const char *, unsigned int, const char *, const char *, unsigned int, const char *, const char *, const char *, unsigned int); .PP Each function's first argument is the .B mparse_entity structure which is to contain the appended content.
.B mparse_append_body_line takes a const char * as its second argument, while .B mparse_append_body_from_file takes a .I stdio .BR "FILE *" , and .B mparse_append_body_from_buffer takes a pointer to a character array and an unsigned integer giving the number of octets of interest. The remaining arguments provide information which is used to properly tag the content, and consist of: .TS expand; lw(0.5i)fB lw(3.9i)fB . argument description _ .T& lp-2fB lp-2 . const char *type content media type unsigned int typelen length of \fBtype\fP string const char *subtype content media subtype unsigned int subtypelen length of \fBsubtype\fP string const char *cst charset (if applicable) unsigned int cslen length of \fBcharset\fP string const char *type_other_parameters T{ .na any additional parameters for the Content-Type field (use \fImparse_parameter_string()\fP) T} const char *disp disposition (if applicable) unsigned int displen length of \fBdisposition\fP string const char *disp_other_parameters T{ .na any additional parameters for the Content-Disposition field (use \fImparse_parameter_string()\fP) T} const char *filename filename for Content-Disposition const char *languages content languages (if applicable) unsigned int linelen maximum line length for folding content .TE .DE .PP It is possible to build up a message body with multiple calls to .BR mparse_append_body_line , or multiple lines may be appended at one time by separating them with \er\en in the string. The file pointer should be opened and positioned properly before calling .BR mparse_append_body_from_file . This function makes it possible to read arbitrary binary content from files. To preserve binary content, open the file with a binary mode if your implementation of .I fopen supports it. .PP Body content may be copied from one entity to another with .PP .B int mparse_copy_body(const struct mparse_entity *src, struct mparse_entity *dst); .PP which copies body fields from .I src to .IR dst .
.DS .PP Application programmers who wish to enumerate standard disposition values (for example, to construct a menu) can call: .PP .B const struct mparse_disposition *mparse_disposition(int n); .PP sequentially with an integer argument beginning with 1. .B mparse_disposition will return a pointer to the structure in sequence according to the disposition value (see .BR mparse.h ), returning a .I NULL pointer when the end of the list is reached. Not all dispositions are appropriate for all messages; application authors may examine the disposition value and avoid presenting inappropriate values. .DE .DS .B const struct mparse_disposition *mparse_disposition_entry(const char *, unsigned int); .PP validates a particular string with specified length as a known disposition keyword. A .I NULL pointer is returned if the string is not a valid disposition keyword, otherwise a pointer to a structure as noted above is returned. .DE .DS .PP There are also some low\-level functions for putting content into an .B mparse_field structure, which can then be inserted in either the header or body section of an .B entity structure. .PP .B struct mparse_field *mparse_headerize_string(struct mparse_entity *, const char *, int, int); .PP returns a pointer to a newly allocated field structure which may be associated with the specified entity. Content is taken from the string pointed to by the second argument. The third argument indicates whether or not message content is to be canonicalized (non-zero to perform canonicalization), and the last argument indicates whether or not the content is part of the entity body (non-zero for body content).
.DE .DS .PP Similarly, content can be taken from a stdio FILE using: .PP .B struct mparse_field *mparse_headerize_file(struct mparse_entity *, FILE *, int, int); .DE .DS .PP Finally, content may be taken from a character array with specified length: .PP .B struct mparse_field *mparse_headerize_buffer(struct mparse_entity *, char *, unsigned int, int, int); .PP The unsigned integer argument specifies the length in octets. .DE .DS .PP A field and its allocated storage may be discarded by calling .PP .B void mparse_free_field(const struct mparse_message *message, struct mparse_field *h) .PP .DE .DS .PP .B void mparse_body_end(struct mparse_entity *entity); .PP should be called when all body content has been inserted. It consolidates multiple body .B struct mparse_field structures into a single structure, calls the user hooks for processing the body section, and checks for errors. .DE .DS .PP The above functions suffice for generating simple messages and discrete MIME messages. Composite MIME messages involve encapsulating messages in a MIME message entity or combining one or more simple body parts into a MIME multipart entity. .DE .DS .PP Sometimes it may be necessary to split a body into multiple .B struct mparse_field structures. .PP .B int mparse_split_body(struct mparse_entity *entity, unsigned int nlines); .PP splits the body into structures containing at most .I nlines lines. .DE .DS .PP It is possible to free the .B struct mparse_field structures associated with the body. That can be done by calling .PP .B void mparse_free_body(struct mparse_entity *); .DE .DS .PP Encapsulating a message can be performed with the function: .PP .B struct mparse_entity *mparse_encapsulate(struct mparse_entity *, const char *, const char *); .PP The three arguments are: the .B mparse_entity structure corresponding to the message to be encapsulated, a character string giving the message media subtype, and a character string giving any parameters (see \fBparameter_string\fP below). 
For example, to encapsulate a simple message as a MIME message/rfc822 type .PP .ft CW .ps 9 .na mparse_encapsulate(entity, "rfc822", 0); .ad .ps .ft .PP would return a pointer to the .B mparse_entity structure encapsulating the message. .PP .ft CW .ps 9 .na mparse_encapsulate(entity, "external-body", ";access-type=local-file;name=foo"); .ad .ps .ft .PP is another example. Details of the message subtypes and parameters may be found in RFC 2046. .DE .PP Application programmers who wish to enumerate standard external-body access-types (for example, to construct a menu) can call: .PP .B const struct mparse_name_val *mparse_access_type(int n); .PP sequentially with an integer argument beginning with 1. .B mparse_access_type will return a pointer to the structure in sequence according to the access-type name returning a .I NULL pointer when the end of the list is reached. .PP MIME multipart entities are initially constructed with the .B mparse_create_multipart function: .PP \s-2\fBstruct mparse_entity *mparse_create_multipart(struct mparse_entity *, const char *, const char *, const char *, const char *, const char *);\fP\s0 .PP The first argument is a pointer to the .B mparse_entity structure representing the entity to be placed within a multipart enclosure. The second argument is a character string for the multipart media subtype. The third argument is for the .I boundary parameter which is required for all multipart entities. The fourth argument is for any additional parameters. The fifth argument is a string for the optional preamble, and the sixth argument is a string for the epilogue. For example: .PP .ft CW .ps 7 .vs 8 mparse_create_multipart(entity, "mixed", "x", 0, "This is the preamble", "This is the epilogue.\er\en"); .ps .vs .ft .PP might have been used in the generation of the multipart example given in the earlier diagram. .PP Both .B mparse_encapsulate and .B mparse_create_multipart return a pointer to the enclosing .B mparse_entity structure. 
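.PP
To make the roles of the boundary, preamble, and epilogue arguments concrete, the following sketch renders the RFC 2046 multipart framing into a caller-supplied buffer. It illustrates the wire format only and is not the library's implementation: the function names are hypothetical, and each part is assumed to be supplied already formatted (fields, empty line, body) without a trailing CRLF.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy s into buf at offset *pos if it fits, and always advance *pos so
 * that the caller learns the total length required. */
static void put(char *buf, size_t sz, size_t *pos, const char *s)
{
    size_t len = strlen(s);

    if (buf != NULL && *pos + len < sz)
        memcpy(buf + *pos, s, len);
    *pos += len;
}

/* Hypothetical illustration of RFC 2046 framing:
 *   [preamble CRLF] { "--" boundary CRLF part CRLF }... "--" boundary "--" CRLF [epilogue]
 * Returns the length needed (excluding the terminating '\0'); the buffer
 * content is valid only when the return value is less than sz. */
static size_t render_multipart(char *buf, size_t sz, const char *boundary,
                               const char *preamble, const char *const *parts,
                               size_t nparts, const char *epilogue)
{
    size_t pos = 0, i;

    assert(boundary != NULL && nparts > 0);
    if (preamble != NULL) {
        put(buf, sz, &pos, preamble);
        put(buf, sz, &pos, "\r\n");
    }
    for (i = 0; i < nparts; i++) {
        put(buf, sz, &pos, "--");
        put(buf, sz, &pos, boundary);
        put(buf, sz, &pos, "\r\n");
        put(buf, sz, &pos, parts[i]);
        put(buf, sz, &pos, "\r\n");
    }
    put(buf, sz, &pos, "--");
    put(buf, sz, &pos, boundary);
    put(buf, sz, &pos, "--\r\n");
    if (epilogue != NULL)
        put(buf, sz, &pos, epilogue);
    if (buf != NULL && sz > 0)
        buf[pos < sz ? pos : 0] = '\0';   /* empty string on overflow */
    return pos;
}
```

.PP
With boundary "x", the preamble and epilogue strings from the example above, and a single text/plain part, the rendered text begins with the preamble line, continues with a "--x" delimiter line and the part, and ends with the "--x--" close delimiter followed by the epilogue.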
.PP Additional entities represented by .B mparse_entity structures may be inserted in existing encapsulations or enclosures with the .B mparse_insert_entity function: .PP .B int mparse_insert_entity(struct mparse_entity *, struct mparse_entity *, struct mparse_entity *, struct mparse_token *); .PP The first argument points to the .B mparse_entity structure representing the entity to be enclosed, while the second argument points to the .B mparse_entity structure of the enclosing composite entity. The third argument points to an entity within the enclosure; the new entity (specified by the first argument) will precede the one specified by the third argument (a third argument of zero places the new entity last). The fourth argument is for a delimiter and is intended for use in parsing text messages; it should be a zero (NULL) pointer when building messages using .BR mparse . .PP A string specifying MIME parameters may be generated with a call to: .PP \fB\s-2int mparse_parameter_string(const struct mparse_message *message, const char *attribute, const char *value, unsigned int vlen, const struct mparse_charset *cs, const unsigned char *language, char *buf, int sz);\s0\fP .PP which generates (in buf) a string of the form " ; attribute=value", taking into consideration the content of \fIvalue\fP, and any specified charset (required for non-ASCII content in \fIvalue\fP) and/or language. The generated string is compliant with RFC 2231. The value returned by .B mparse_parameter_string is -1 on error (e.g. non-ASCII content with no charset specified), the number of bytes written to buf (not including the terminating '\e0') if \fIbuf\fP is large enough (size specified by \fIsz\fP), or the required size of a buffer in bytes if the specified buffer is too small (or if a NULL pointer is given for \fIbuf\fP). Any required encoding, quoting, or continuation of long \fIvalue\fPs is handled automatically by .BR mparse_parameter_string . 
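.PP
The return convention just described lends itself to a two-pass calling pattern: query the required size with a NULL buffer, allocate, then generate. The sketch below demonstrates that pattern against a stand-in function, since the library itself is not assumed to be available here. The stand-in and the wrapper are hypothetical names; the stand-in performs no quoting, encoding, or continuation, and it assumes that the reported required size includes the terminating '\e0'.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in with the documented return convention (NOT the library's code):
 * -1 on non-ASCII input, bytes written (excluding '\0') if buf is large
 * enough, otherwise the buffer size required (assumed to include '\0'). */
static int demo_parameter_string(const char *attribute, const char *value,
                                 char *buf, int sz)
{
    const unsigned char *p;
    int need;

    assert(attribute != NULL && value != NULL);
    for (p = (const unsigned char *)value; *p != '\0'; p++)
        if (*p > 127)
            return -1;        /* no charset support in this stand-in */
    need = snprintf(NULL, 0, "; %s=%s", attribute, value) + 1;
    if (buf == NULL || sz < need)
        return need;
    return snprintf(buf, (size_t)sz, "; %s=%s", attribute, value);
}

/* Two-pass pattern: size query with a NULL buffer, allocate, then generate.
 * Returns a malloc'd string that the caller must free, or NULL on failure. */
static char *alloc_parameter_string(const char *attribute, const char *value)
{
    int need = demo_parameter_string(attribute, value, NULL, 0);
    char *buf;

    if (need < 0)
        return NULL;
    buf = malloc((size_t)need);
    if (buf == NULL)
        return NULL;
    if (demo_parameter_string(attribute, value, buf, need) != need - 1) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

.PP
Under the stated assumption, a successful return is always smaller than the supplied size and a "too small" return is always larger, so the two outcomes cannot be confused.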
.DS .PP A string suitable for use as a phrase (such as appears in Keywords fields, and as the display name for a mailbox or named group) may be generated with a call to: .PP \fB\s-2int mparse_phrase_string(const struct mparse_message *message, const char *phrase, unsigned int len, const struct mparse_charset *cs, const unsigned char *language, const unsigned char *encoding, char *buf, int sz);\s0\fP .PP which generates (in buf) a string for the phrase, taking into consideration the content of \fIphrase\fP, and any specified charset (required for non-ASCII and/or certain ASCII (NUL, lone CR or LF) content in \fIphrase\fP) and/or language, and encoding. The value returned by .B mparse_phrase_string is -1 on error (e.g. non-ASCII content with no charset specified), the number of bytes written to buf (not including the terminating '\e0') if \fIbuf\fP is large enough (size specified by \fIsz\fP), or the required size of a buffer in bytes if the specified buffer is too small (or if a NULL pointer is given for \fIbuf\fP). Any required encoding or quoting is handled automatically by .BR mparse_phrase_string . If a language is specified, the entire phrase will be encoded, as that is the only way that the entire phrase can be language-tagged. Encoding may be specified as one of the standard encoding types (B or Q), or a NULL pointer may be supplied, in which case .B mparse_phrase_string will select the encoding based on content. .DE .DS .PP A quoted-string may be generated with a call to: .PP \fB\s-2int mparse_quote_string(const char *s, unsigned int len, char *buf, int sz);\s0\fP .PP which quotes the content in \fIs\fP, backslash-quoting any double quotes or backslashes. The value returned by .B mparse_quote_string is -1 on error (e.g. 
non-ASCII content), the number of bytes written to buf (not including the terminating '\e0') if \fIbuf\fP is large enough (size specified by \fIsz\fP), or the required size of a buffer in bytes if the specified buffer is too small (or if a NULL pointer is given for \fIbuf\fP). .DE .PP Multiple strings can be concatenated into a single string via: .PP \fB\s-1size_t mparse_build_string(char *, size_t, int, const char *, const char *, ...);\s0\fP .PP The first and second arguments give a buffer for the result and its size. The third argument specifies whether a MPARSE_TOKEN_CRLF pair is to be appended (if nonzero). The fourth argument is a string giving a field name, and is used when generating a field string. The field name is followed in the generated string by a colon and space (which should not be supplied in the strings). When not generating a field string, the fourth argument should be a NULL pointer. The remaining arguments are pointers to the strings to be concatenated (after the field name, colon, and space if generating a field), and must end with a NULL pointer. .SS Message Output and Processing .PP Just as it is possible to print a list of tokens, it is possible to print an entire message consisting of linked .B mparse_entity structures. .PP .B size_t mparse_print_message(FILE *, const char *, struct mparse_message *, const char *, FILE *); .PP will print the entire message referenced by the .B mparse_message structure to the specified .IR FILE s (message proper to first .IR FILE , errors and warnings to the second). .B mparse_print_tokens is called to output each part in turn, so the comments above regarding the advantages of .B fwrite still apply. If byte-stuffing is specified, it will be applied where appropriate and a line with a lone dot will be output at the end of the message. 
Any prefix string pointed to by the first .I const char * argument is printed before the message, and any suffix string pointed to by the second .I const char * argument is printed after the message (and byte-stuffing lone dot line). Both prefix and suffix are printed to the same .I FILE as the message. .B mparse_print_message returns the number of bytes written to the .I FILE used for the message output (including bytes for errors and warnings if the two .IR FILE s are not .I NULL and are the same or if they use the same file descriptor). If a .I FILE * is .IR NULL , no output is actually produced, but the byte counts are computed. .DS .PP .B size_t mparse_print_entity(FILE *, const char *, const struct mparse_entity *, unsigned int, const char *, FILE *); .PP is similar to .BR mparse_print_message , except that it outputs the contents of a single .B mparse_entity structure. The fourth argument (the unsigned integer) controls which parts of the structure are output, and is constructed by bitwise ORing the following values according to the desired parts: .TS expand; lw(3.3i)fB lw(2.4i)fB . symbolic name description _ .T& lp-2fB lp-2 . MPARSE_ENTITY_COMPONENT_DELIMITER MIME delimiter MPARSE_ENTITY_COMPONENT_HEADER fields MPARSE_ENTITY_COMPONENT_SEPARATOR header/body separator (empty line) MPARSE_ENTITY_COMPONENT_BODY body .TE .PP The symbolic value .B MPARSE_ENTITY_COMPONENT_ALL may be used to select all of the above parts. .PP The error and warning fields described above are also output by .B mparse_print_message and .B mparse_print_entity unless the .B suppress_errors and/or .B suppress_warnings flags are set in the .B mparse_message structure. .PP Likewise, .PP .B size_t mparse_print_field(FILE *, const char *, const struct mparse_field *, const char *, FILE *) .PP outputs a single field. 
.DE .DS .PP Generic processing of a message may be performed by calling .PP .B int mparse_process_message(int (*f)(struct mparse_entity *), struct mparse_message *, int); .PP passing a function which processes a single .B mparse_entity structure. The integer argument controls the order of processing of .B mparse_entity structures in the message: .TS expand; lw(0.4i)fB cw(4.3i)fB . argument description _ .T& lp-2fB lp-2 . > 0 T{ message's siblings (eldest first and including message in order) and their subtrees T} < 0 T{ starting with youngest descendant and working up through eldest sibling, reverse order T} 0 T{ message and its subtree first, then siblings (eldest -> youngest) and their subtrees T} .TE .DE .PP The fields in an .B mparse_entity structure may be processed by calling .PP .B int mparse_process_header(int (*f) (struct mparse_field *), struct mparse_entity *); .PP Likewise, the body may be processed via .PP .B int mparse_process_body(int (*f) (struct mparse_field *), struct mparse_entity *); .PP And the tokens comprising a field may be processed by calling the function .PP .B int mparse_process_field(int (*f) (struct mparse_token *), struct mparse_field *); .PP .DS If a copy of a message has been made, the specified application hooks can be called automatically and the allocated structures freed by calling: .PP \s-1\fBint mparse_process_and_free_message(struct mparse_message *, int);\fP\s0 .PP The integer argument, if non-zero, indicates abnormal termination and prevents message repair (if specified) and calls to the application-specified end-of-message hook. The return value is zero under normal conditions, but may be non-zero if some error occurred (e.g. if the supplied integer argument was non-zero, or if the end-of-message hook returned non-zero). .DE .SS Message Modification .PP In addition to building up composite MIME entities, it is sometimes necessary to decompose them. 
Several functions are provided for this purpose: .PP .B int mparse_unlink_entity(struct mparse_entity *); .PP .B void mparse_free_entity(struct mparse_entity *); .PP .B int mparse_v_free_entity(struct mparse_entity *, va_list); .PP .B int mparse_collapse_composite(struct mparse_entity *); .PP The function .B mparse_unlink_entity removes the .B mparse_entity structure from its context, closing up any links where it was removed. The structure still exists and can be referenced, printed, etc. but it is isolated from all other structures. The function .B mparse_free_entity removes the structure from its context, deletes all content, and frees allocated space. .B mparse_v_free_entity is like .I mparse_free_entity but takes an unused va_list, and can therefore be used with .I mparse_process_message to clean up in the event of an error. When a composite MIME entity has a single body part (not counting multipart preamble and epilogue), .B mparse_collapse_composite may be called to remove the MIME enclosure, substituting the contents. While there may be some esoteric cases where a single-part "multipart" entity makes sense, usually it is better to present the content without the multipart wrapper. .PP Each of the three int-returning functions above returns zero on success. A non-zero return means that the structure could not be unlinked because that would make some other structure unreachable and would therefore lead to a memory leak. 
.PP The .B mparse_free_entity function isolates, deletes content, and frees allocated storage for the referenced .B entity structure and all structures linked via child and next_sibling links: .PP .B void mparse_free_entity(struct mparse_entity *); .\" .PP .\" Fields in one .\" .B struct mparse_entity .\" structure may be copied to another .\" .B struct mparse_entity .\" by calling: .\" .PP .\" .B int mparse_copy_fields(struct mparse_entity *src, struct mparse_entity *dst); .PP A field may be removed by calling: .PP .B int mparse_delete_field(struct mparse_field *); .PP Message body content can be encoded for transport as described earlier. The following two functions perform the inverse of the two decoding functions: .PP .B int mparse_encode_b64(struct mparse_entity *); .PP .B int mparse_encode_qp(struct mparse_entity *); .PP and the function .PP .B int mparse_encode_body(struct mparse_entity *); .PP determines which encoding function produces the more compact encoding and applies that function. .PP Encoding can be applied to discrete media types within a message which require encoding by calling the function .PP .B int mparse_encode_message(struct mparse_message *); .PP .SS Message Copies .PP Sometimes the original message should be kept intact, but it is desired to make a copy which can be modified or otherwise manipulated without affecting the original. The function .B mparse_copy_message makes a copy of a message, including content and linked structures: .PP .B struct mparse_entity *mparse_copy_message(struct mparse_message *, struct mparse_entity *); .PP The result is a copy of the tree of entities beginning with the specified entity, including content. Each .B entity in the copy points to the specified .B mparse_message structure. .SS Message Cleanup .PP After encoding or decoding a body section, or after enclosing an entity in a composite structure, it may be necessary to change the specified transport encoding of parent structures. 
Use: .PP .B struct mparse_encoding *mparse_adjust_encoding(struct mparse_entity *); .PP which revises the encoding in the .B mparse_entity structure and returns a pointer to the .B mparse_encoding structure (described above) which holds information about the encoding and its domain. .PP There are defaults for MIME media types, charset, and transport encoding. It is not always necessary to provide explicit fields for these. When an entity is enclosed in a composite structure or its transport encoding has changed, it may be possible to elide some MIME fields. The function .PP .B int mparse_minimize_mime_fields(struct mparse_entity *); .PP checks for this condition and removes unnecessary MIME fields. It also ensures that there is a MIME\-Version field as required by RFC 2045 section 4. .DS .SS Message Fragmentation and Reassembly .PP It may be necessary or desirable to split large messages into smaller pieces for transport and to reassemble the fragments into a complete message. .PP .B int mparse_fragment_message(struct mparse_message *message, unsigned int frag_sz, struct mparse_message ***pmsg_ptr_array) .PP takes an original .I message and splits it into fragments with size no greater than .I frag_sz octets in accordance with the rules in RFC 2046, allocating message structures as needed and placing an array of pointers to those message structures in allocated memory whose address is stored in the location given by .I pmsg_ptr_array (if not .IR NULL ), returning a negative value on error (with .I errno set appropriately), otherwise returning the number of fragment messages. It is the caller's responsibility to release allocated storage associated with the fragment messages, their content, and the array of pointers to the message structures. 
.DE .DS .PP An application author may reconstitute a fragmented message by calling .PP .B int mparse_combine_partial(struct mparse_message *message, unsigned int nmessages, struct mparse_message **messages); .PP supplying a pointer to an .B mparse_message structure for the reassembled message (which should have a .I context value including .BR MPARSE_SECONDARY_ROLE_TRANSFORM_REASSEMBLE ), a count of the number of fragment messages, and an array of pointers to the fragment messages. On error, a negative value is returned with .I errno set appropriately. Otherwise, zero is returned. .DE .DS .SS Building and Bursting Message Digests .PP The MIME media type multipart/digest is useful for transporting collections of messages. A digest may be created from a set of messages by calling the function .PP .B struct mparse_entity *mparse_build_digest(unsigned int nmessages, struct mparse_message **messages, const char *boundary, const char *other_parameters, const char *preamble, const char *epilogue); .PP supplying a count of the number of messages, an array of pointers to the messages, an optional message boundary delimiter string, optional additional parameters, optional preamble text, and optional epilogue text. On error, a .I NULL pointer is returned with .I errno set appropriately. Otherwise, a pointer to the multipart/digest wrapper entity is returned. That entity may be further wrapped in an enclosing media type (e.g. to provide a table of contents) or it may be inserted in a message structure. .DE .DS .PP A digest may be split into individual messages by calling the library function .PP .B int mparse_burst_digest(struct mparse_entity *entity, struct mparse_message ***pmsg_ptr_array); .PP supplying a pointer to the multipart/digest wrapper entity and the address of a location which can hold a pointer to an array of .B mparse_message structures. On error, a negative value is returned with .I errno set appropriately. 
On success, a count of the number of individual messages is returned. If .I pmsg_ptr_array is not .IR NULL , the individual messages will have been placed in allocated message structures and a pointer to an array of .B mparse_message structure pointers will be placed in the location specified by .IR pmsg_ptr_array . It is the caller's responsibility to free allocated storage for the individual messages and their contents and the array of pointers. .DE .DS .SS Low-level Message Component Generation .PP A number of functions are provided for generating RFC-compliant message components. .PP There are four variants which generate date-time components, according to whether UTC or local time plus offset is used, and whether or not the optional day-of-week is included. Each function takes a pointer to a character array for the result, and the size of the array in bytes. Return value is the number of bytes written (excluding the terminating '\e0'), or, if the buffer is too small, the size of the buffer necessary to hold the result (including the terminating '\e0'), or a negative value if a serious error occurred. A separate function is also provided which can be made equivalent to any of the other four by specifying zero\-valued or non\-zero arguments to specify inclusion of day\-of\-week and local time. .PP .B int mparse_gen_date(char *, int, unsigned int, unsigned int); .PP .B int mparse_gen_date_local(char *, int); .PP .B int mparse_gen_date_utc(char *, int); .PP .B int mparse_gen_dow_date_local(char *, int); .PP .B int mparse_gen_dow_date_utc(char *, int); .DE .DS .PP A domain literal for an Internet address may be obtained by a call to .BR mparse_gen_ip_literal . The function takes a pointer to a character array for the result, and the size of the array in bytes. 
Return value is the number of bytes written (excluding the terminating '\e0'), or, if the buffer is too small, the size of the buffer necessary to hold the result (including the terminating '\e0'), or a negative value if a serious error occurred. The function .B mparse_gen_ip_name is similar, except it returns a domain name rather than a domain literal. .PP .B int mparse_gen_ip_literal(const struct sockaddr_in *, char *, int); .PP .B int mparse_gen_ip_name(const struct sockaddr_in *, char *, int); .DE .DS .PP The domain name for the host running .B mparse may be obtained by a call to the function .BR mparse_gen_hostname . The function takes a pointer to a character array for the result, and the size of the array in bytes. Return value is the number of bytes written (excluding the terminating '\e0'), or, if the buffer is too small, the size of the buffer necessary to hold the result (including the terminating '\e0'), or a negative value if a serious error occurred. The function .B mparse_gethostaddr returns the current host's IP address in the structure pointed to by the first argument (a struct sockaddr), and the second argument gives its size. Preference is given to routable addresses on a host with multiple addresses. .PP .B int mparse_gen_hostname(char *, int); .PP .B int mparse_gethostaddr(struct sockaddr *, int *); .DE .DS .PP Some IP addresses are reserved for private use, examples, etc. and are not generally routable through the Internet. The function .B mparse_is_routable returns zero for such addresses, non-zero for routable addresses. .PP .B int mparse_is_routable(const struct sockaddr_in *); .DE .DS .PP A message-id component (including the angle brackets) can be generated by calling .BR mparse_gen_message_id . The function takes a pointer to a character array for the result, and the size of the array in bytes. 
Return value is the number of bytes written (excluding the terminating '\e0'), or, if the buffer is too small, the size of the buffer necessary to hold the result (including the terminating '\e0'), or a negative value if a serious error occurred. The calling syntax is: .PP .B int mparse_gen_message_id(char *, int); .DE .DS .SS Header Fields and Information .PP The function .PP .B const struct mparse_field_state *mparse_field_entry(const char *, int); .PP returns a pointer to a structure defined in the header file \fBmparse.h\fP. The members of that structure are: .TS expand; lw(1.3i)fB lw(3.9i)fB . member (struct mparse_field_state) description _ .T& lp-2fB lp-2 . const char *field_name; Canonical field name int token; T{ integer value returned by lexical analyzer and stored in \fItype\fP member of \fBmparse_token\fP structure T} int next_state; used internally by lexical analyzer unsigned int resent_ok : 1; T{ flag indicates whether Resent- version of field is legal T} unsigned int mime_content : 1; T{ indicates whether field is a MIME Content- field T} unsigned int cached : 1; T{ indicates whether information is cached in the \fBmparse_cache\fP structure T} unsigned int defaultable : 1; T{ indicates whether a default value exists in the absence of the field T} unsigned int multiple_ok : 1; T{ are multiple instances of the field permitted in an entity T} unsigned int unstructured : 1; T{ field is unstructured (RFC 2047 encoding is permitted) T} unsigned int usefor_inherit : 1; T{ "inheritable" field as defined in usefor draft T} unsigned int usefor_variant : 1; "variant" field as defined in usefor draft unsigned int obsolete_name:1; field name has been changed unsigned int hook_tokens : 4; T{ number of token pointers (<= 15) passed to corresponding application hook function T} unsigned int modes : 8; T{ bitmap of permitted entity types for this field T} .TE .DE .DS .PP The arguments to .B mparse_field_entry are a (case-insensitive) character string and its length, 
for example as obtained from a .B token structure's .I tok and .I len members. .DE .DS .PP Application programmers who wish to enumerate known fields (for example, to construct a menu) can call: .PP .B const struct mparse_field_state *mparse_field(int n); .PP sequentially with an integer argument beginning with 1. .B mparse_field will return a pointer to the structure in case-insensitive alphabetic sequence, returning a .I NULL pointer when the end of the list is reached. Fields which are defined as having obsolete names are not included in the returned data. Note that not all fields are valid in all contexts; application authors may wish to examine the .I modes member of the .B mparse_field_state pointed to by the return value to determine whether or not to present a given field to the user. .DE .DS .PP RFC 2047 (as amended by RFC 2231 and errata) defines an encoded-word sequence which can represent human-readable text in character sets beyond those which may appear in message fields. The character set is specified and language may be specified. The function .PP \s-1\fBint mparse_is_encoded_word(const unsigned char *s, int location);\fP\s0 .PP is provided to indicate whether a character string matches the syntax for such an encoded-word. .B location specifies the context of the character string and is constructed from the following values: .TS expand; lw(3.1i)fB cw(2.7i)fB . symbolic name description _ .T& lp-2fB lp-2 . MPARSE_ENCODED_WORD_LOCATION_UNKNOWN location cannot be determined (used alone) MPARSE_ENCODED_WORD_IN_COMMENT T{ string is part of a comment in a structured field T} MPARSE_ENCODED_WORD_IN_PHRASE T{ string is a word in a phrase in a structured field T} MPARSE_ENCODED_WORD_IN_UTEXT string is part of unstructured text .TE Note that a comment might occur within a phrase (in which case .B MPARSE_ENCODED_WORD_IN_COMMENT suffices); otherwise the .B location values above are mutually exclusive. 
If the string matches the syntax for an encoded-word, the function returns a non-zero value. .DE .DS .PP It is possible that a token matching the syntax for an encoded-word is not a valid encoded-word. The function .B int mparse_bad_word(struct mparse_entity *entity, struct mparse_token *t); performs more stringent tests on a token and returns a non-zero value if the token is not a valid encoded-word, setting appropriate errors in the process. Context is determined automatically from the token. .DE .PP An encoded word may be decoded by calling the function .PP \s-2\fBint mparse_decode_encoded_word(struct mparse_entity *entity, struct mparse_token *t, unsigned char *buf, int len, const struct mparse_charset **pcs, const unsigned char **plang, unsigned int *planglen, const unsigned char **penc, unsigned int *penclen);\fP\s0 .PP where .I buf and .I len provide a buffer of length .I len for the decoded text (\fIbuf\fP may be \fINULL\fP to determine a suitable length), and .I entity and .I t point to the structures containing the encoded-word token. The function returns the length of the decoded text (not including the terminating '\e0' character unless the supplied buffer is too small). If .I pcs is not NULL, a pointer to the .B mparse_charset structure will be stored. Likewise, if .I plang and .I planglen are not NULL and a language is specified, the start of the language tag and its length will be stored. Finally, if .I penc and .I penclen are not NULL the start of the encoding tag and its length will be stored. .DS .PP Application programmers who wish to enumerate standard MIME-compatible charsets (for example, to construct a menu) can call: .PP .B const struct mparse_charset *mparse_charset(int n); .PP sequentially with an integer argument beginning with 1. .B mparse_charset will return a pointer to the structure in MIB enum sequence (see the IANA character-sets registry) returning a .I NULL pointer when the end of the list is reached. 
Only MIME-compatible charsets are listed, and only the structure for the preferred name for each distinct charset is returned. .DE .DS .PP The charset tag given by string .I s having length .I len may be validated by calling the function .PP \s-2\fBint mparse_bad_charset(struct mparse_message *message, struct mparse_token *t, const char *s, size_t len, const struct mparse_charset **pcs, unsigned int errors, unsigned int remedy, int rfc, const char *ref, const char *sect, int x, const char *str);\fP\s0 .PP which returns non-zero if the charset is not valid. In that case it may also use the .I struct mparse_token *t to indicate the offending token .RI ( t will usually be the token containing .IR s ). Because a charset may be specified in a number of contexts (in an encoded-word, in an extended-initial-value in a parameter, or as the value corresponding to a "charset" parameter in a Content-Type header field), .B mparse_bad_charset has provision for specifying the relevant RFC etc. for a possible error message. If the charset is valid and .I const struct mparse_charset **pcs is not .IR NULL , .B mparse_bad_charset will store a pointer to the relevant struct mparse_charset there. .PP The function .PP .B const struct mparse_charset *mparse_charset_entry(const char *, unsigned int); .PP returns a pointer to a structure (defined in mparse.h) for a charset name with a specified length if the charset is registered, or a .I NULL pointer if there is no registered charset with the specified name. .DE .DS .PP The default charset in messages is US-ASCII. A pointer to the .B mparse_charset structure corresponding to US-ASCII may be obtained by calling the function .PP .B const struct mparse_charset *mparse_ascii_charset(void); .PP Charsets may have a number of aliases. 
The function .PP .B int mparse_charset_is_ascii(const struct mparse_charset *); .PP returns a positive integer if the specified charset is equivalent to US-ASCII, zero if it is not, and a negative integer if there is some error. .PP There are a number of restrictions on charsets which may be used with certain media types. .PP .B int mparse_incompatible_charset(const struct mparse_entity *, const struct mparse_charset *); .PP returns a positive integer if the specified charset is incompatible with the specified entity, zero if there is no incompatibility, and a negative value in the event of an error in arguments. .PP The function .PP .B const struct mparse_charset *mparse_preferred_charset(const struct mparse_charset *); .PP returns a pointer to the preferred .B mparse_charset structure for a given charset, if there is one. .DE .DS .PP A language tag may be validated by calling .PP .B int mparse_bad_language(struct mparse_message *message, struct mparse_token *t, const char *s, size_t len); .PP A language tag (as defined in RFC 3066) may consist of a raw ISO 639 language code possibly followed by a hyphen separator and an ISO 3166 country code. IANA-registered language tags are also defined. .B mparse_bad_language checks these if present. It is also possible to check them separately by calling one of: .PP .B const struct mparse_language *mparse_language_entry(const char *str, unsigned int len); .PP .B const struct mparse_country *mparse_country_entry(const char *str, unsigned int len); .PP which return a .I NULL pointer if the string pointed to by the first argument (with length given by the second argument) is not a valid code of the corresponding type. Otherwise each returns a pointer to a structure defined in .BR mparse.h . 
Each structure holds a pointer to the code tag, a pointer to the English name(s) corresponding to the tag, a sequence number according to alphabetic order by English name, a pointer to the French name(s) corresponding to the tag, and a sequence number according to alphabetic order by French name (English and French being the languages of the ISO documents). [The IANA-registered language tags have descriptions in English only, but these appear in the English and French members.] .\" .PP .\" If the charset or language extension hooks described above are .\" If the language extension hook described above is .\" provided, .\" and the message \fIexperimental\fP flag is set, .\" .B mparse_bad_charset .\" and .\" .B bad_language .\" will use those hooks. .\" will use that hook. .PP Application programmers who wish to enumerate the country or language codes (for example, to construct a menu) can call one of: .PP .B const struct mparse_country *mparse_country_en(int n); .PP .B const struct mparse_country *mparse_country_fr(int n); .PP .B const struct mparse_language *mparse_language_en(int n); .PP .B const struct mparse_language *mparse_language_fr(int n); .PP sequentially with an integer argument beginning with 1. Each function will return a pointer to the structure in the corresponding alphabetic sequence, returning a .I NULL pointer when the end of the list is reached. The IANA-registered tags appear after the last ISO tag. .DE .DS .SS MIME Boundaries and Other Parameters .PP There are restrictions on the length and character set for MIME boundaries. The function .PP .B int mparse_boundary_ok(const char *); .PP returns 0 if the string is not suitable as a boundary, 1 if it is suitable as supplied, and 2 if it must be quoted when supplied as a parameter value in a Content-Type field. .DE .DS .PP RFC 2231 provides MIME extensions for parameter continuation and for specifying parameter charset and language. 
MIME parameter information is stored in a .B mparse_parameter structure having the following members: .TS expand; lw(1.5i)fB lw(3.9i)fB . member (struct mparse_parameter) description _ .T& lp-2fB lp-2 . struct mparse_parameter *initial; head of singly-linked list struct mparse_parameter *next; singly-linked list struct mparse_token *attribute; parameter name struct mparse_token *value; value struct mparse_token *importance; T{ .na importance (disposition-notification-options parameters) T} const struct mparse_charset *charset; charset for encoding const struct mparse_language *language; language for tagging unsigned int section; continuation fragment number unsigned int extended : 1; parameter is extended unsigned int quoted : 1; value is quoted .TE .DE .DS .PP MIME parameters can be extracted from a parameter list by a call to .BR mparse_get_parameter : .PP \s-1\fBint mparse_get_parameter(const struct mparse_message *, const struct mparse_token *, const char *, char *, size_t, struct mparse_parameter **);\fP\s0 .PP The arguments are: a pointer to the current .B mparse_message structure, a pointer to the .B mparse_token list comprising the MIME parameters, (usually the .I content_parameters member of the .B mparse_entity structure) a character string corresponding to the desired parameter name (e.g. "boundary"), a pointer to a character array where the corresponding parameter value is to be written, and the size of the array, and a pointer to a .B mparse_parameter structure pointer which will be updated to point to the .I initial .B mparse_parameter structure (for charset and language information; a NULL pointer may be passed to .B mparse_get_parameter if this information is not required). The return value from .B mparse_get_parameter is the number of characters returned (excluding the terminating '\e0'), or, if the buffer is too small, the size of the buffer (including terminating '\e0') required to hold the parameter value string. 
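.PP
The return convention described for
.B mparse_get_parameter
(the value length on success, or the required size including the terminator when the buffer is too small) suggests a measure-then-allocate pattern.
The following is a minimal sketch only: demo_get_value and demo_fetch are hypothetical stand-ins that merely imitate that convention so the example is self-contained.
They are not part of the library and do not call it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in imitating the documented return convention:
 * returns strlen(value) when value fits in buf together with its '\0',
 * otherwise returns the required size, strlen(value) + 1. */
static int demo_get_value(const char *value, char *buf, size_t sz)
{
    size_t need = strlen(value) + 1;

    if (buf == NULL || sz < need)
        return (int)need;           /* too small: report required size */
    memcpy(buf, value, need);
    return (int)(need - 1);         /* success: length without '\0' */
}

/* Fetch a value into freshly allocated storage (caller frees).
 * With a non-NULL buffer, a result r means "fitted" exactly when r < sz:
 * a "too small" result is always greater than the size that was offered. */
static char *demo_fetch(const char *value)
{
    char small[8];
    char *out;
    int r = demo_get_value(value, small, sizeof small);

    assert(r >= 0);
    if ((size_t)r < sizeof small) {         /* fitted on the first try */
        out = malloc((size_t)r + 1);
        if (out != NULL)
            memcpy(out, small, (size_t)r + 1);
        return out;
    }
    out = malloc((size_t)r);                /* r already counts the '\0' */
    if (out != NULL && demo_get_value(value, out, (size_t)r) != r - 1) {
        free(out);                          /* second call should fit */
        out = NULL;
    }
    return out;
}
```

Comparing the result against the offered size, rather than against zero, is what distinguishes the two cases, since both outcomes are reported as non-negative counts.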
.DE .DS .PP The integer returned by .PP .B int mparse_parameters(const struct mparse_token *); .PP is a count of the number of parameters in the linked .B mparse_token list comprising the MIME parameters, which is supplied as an argument (usually the .I content_parameters member of the .B mparse_entity structure). .DE .DS .SS Convenience and Utility Functions .PP The following function provides information regarding a .BR message : .PP .B int mparse_ends_with_crlf(struct mparse_message *); .PP returns non-zero if the message ends with a CRLF. MIME multipart messages do not necessarily end with a CRLF; the closing boundary delimiter itself does not end with CRLF, and the optional epilogue following the close delimiter also might not end with CRLF. .DE .DS .PP The following functions provide quick tests for information regarding an .B mparse_entity structure: .PP .B int mparse_has_close_delimiter(const struct mparse_entity *); .PP returns non-zero for the last part of a multipart entity. .DE .DS .PP .B int mparse_is_application(const struct mparse_entity *); .PP returns non-zero for an application MIME media type entity. .DE .DS .PP .B int mparse_is_audio(const struct mparse_entity *); .PP returns non-zero for an audio MIME media type entity. .PP .DE .DS .B int mparse_is_composite(const struct mparse_entity *); .PP returns non-zero for a multipart or message composite MIME entity. .DE .DS .PP .B int mparse_is_dsn(const struct mparse_entity *); .PP returns non-zero for a Delivery Status Notification message. Return value is 1 for the per-message fields structure, and 2 for the per-recipient fields structures. Note that 0 is returned for the enclosing message/delivery-status structure, which is not considered part of the DSN \fIper se\fP. .DE .DS .PP .B int mparse_is_epilogue(const struct mparse_entity *); .PP returns non-zero for the last part of a multipart entity if an epilogue is present.
.DE .DS .PP .B int mparse_is_external(const struct mparse_entity *); .PP returns non-zero for a message/external-body MIME media type. .DE .DS .PP .B int mparse_is_mdn(const struct mparse_entity *); .PP returns non-zero for a Message Disposition Notification message (not its enclosure). .DE .DS .PP .B int mparse_is_mdn_report(const struct mparse_entity *); .PP returns non-zero for a multipart/report holding a report-type of disposition-notification. .DE .DS .PP .B int mparse_is_message_type(const struct mparse_entity *); .PP returns non-zero for a MIME message composite media type. .DE .DS .PP .B int mparse_is_multipart(const struct mparse_entity *); .PP returns non-zero for a MIME multipart composite media type. .DE .DS .PP .B int mparse_is_plain(const struct mparse_entity *); .PP returns non-zero for a text/plain media type. .DE .DS .PP .B int mparse_is_preamble(const struct mparse_entity *); .PP returns non-zero for the first part of a multipart entity if a preamble is present. .DE .DS .PP .B int mparse_is_report(const struct mparse_entity *); .PP returns non-zero for a MIME multipart/report media type. .DE .DS .PP .B int mparse_is_rfc822(const struct mparse_entity *entity) .PP returns non-zero if the .B entity structure is encapsulated in a message/rfc822 encapsulation or equivalent (\fIe.g.\fP the obsolete "message/news" media type). .DE .DS .PP .B int mparse_is_signed(const struct mparse_entity *); .PP returns non-zero for any media type enclosed (at any level) as the first part of a multipart/signed entity. These may not be modified as that would prevent signature verification. Application code should operate on a copy of such an .B mparse_entity structure if it is necessary to modify it (e.g. decode transfer encoding). .DE .DS .PP .B int mparse_is_text(const struct mparse_entity *); .PP returns non-zero for any text media type. 
.DE .DS .PP .B int mparse_is_mtsn(const struct mparse_entity *); .PP returns non-zero for a Message Tracking Status Notification message. Return value is 1 for the per-message fields structure, and 2 for the per-recipient fields structures. Note that 0 is returned for the enclosing message/tracking-status structure, which is not considered part of the MTSN \fIper se\fP. .DE .DS .PP .B int mparse_is_multiheader(const struct mparse_entity *); .PP returns non-zero for an entity which consists of one of a set of multiple fields (DSNs and MTSNs). .DE .DS .PP .B int mparse_part(const struct mparse_entity *); .PP returns -1 if the type is not message/partial. Returns 0 (N.B.) for the encapsulated part of the first part of a partial message series (the encapsulation of the first part yields a return of 1), and returns an integer corresponding to the part number of subsequent parts. .DE .DS .PP .B int mparse_report_part(const struct mparse_entity *); .PP returns 0 if the structure pointed to is not part of a multipart/report. Returns a negative number if the multipart/report is malformed (fewer than two or more than three parts). Returns 1 for the first part (intended to be human readable text; see RFC 1892). Returns 2 for the machine-readable part (DSN or MDN). Returns 3 for the optional returned message or returned header fields. Note that zero is returned for the multipart/report enclosure. .DE .DS .PP The following function provides information regarding a field: .PP .B int mparse_in_body(const struct mparse_field *); .PP returns 1 if the field is part of an .B entity body, 0 if it is not. .DE .DS The following functions provide information regarding a token: .PP .B int mparse_is_ws(const struct mparse_token *); .PP returns 1 if the token is whitespace, 0 if it is not. .DE .DS .PP .B int mparse_has_whitespace(const struct mparse_token *); .PP returns 1 if the logical token contains any whitespace tokens, 0 if there are none. 
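.PP
The return values documented above for
.B mparse_report_part
map naturally onto a switch.
The sketch below is illustrative only: describe_report_part is a hypothetical helper that takes an integer already obtained from the library and returns a label based solely on the description above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: maps a value already returned by
 * mparse_report_part() to a label, following the text above. */
static const char *describe_report_part(int rc)
{
    if (rc < 0)                     /* fewer than two or more than three parts */
        return "malformed multipart/report";
    switch (rc) {
    case 0:  return "not a report part (or the enclosure itself)";
    case 1:  return "human-readable text";
    case 2:  return "machine-readable part";
    case 3:  return "returned message or header fields";
    default: return "unexpected value";
    }
}
```

Testing for a negative value before the switch keeps the malformed case from being mistaken for an ordinary part number.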
.DE .DS .PP .B int mparse_is_fws(const struct mparse_token *); .PP returns 1 if the token is whitespace or the MPARSE_TOKEN_CRLF which is part of line folding, 0 if it is not. .DE .DS .PP .B int mparse_has_folding(const struct mparse_token *); .PP returns 1 if the logical token contains any line folding tokens, 0 if there are none. .DE .DS .PP .B int mparse_is_cfws(const struct mparse_token *); .PP returns 1 if the token is part of CFWS, 0 if it is not. .DE .DS .PP .B int mparse_has_comment(const struct mparse_token *); .PP returns 1 if the logical token contains any comments, 0 if there are none. .DE .DS .PP .B int mparse_is_trailing_ws(const struct mparse_token *); .PP returns 1 if the token is whitespace at the end of a line or at the end of a message body, 0 if it is not. .DE .DS .PP A string with specified length may be characterized with the following functions: .PP .B int mparse_is_atom(const unsigned char *c, size_t len); .PP returns 1 if the string is a valid atom, 0 if it is not. .PP .B int mparse_is_qtext(const unsigned char *c, size_t len); .PP returns 1 if the string is valid qtext, 0 if not. .PP .B int mparse_is_mime_charset(const unsigned char *c, unsigned int len); .PP returns 1 if the string is a valid name for a MIME charset, else it returns 0. .DE .DS .PP The following functions return information about a single character, much like the C library .I ctype functions: .PP .B int mparse_isspecial(int c); .PP returns nonzero if the character is a special character as defined in RFCs 822 and 2822. .DE .DS .PP .B int mparse_isaspecial(int c); .PP returns nonzero if the character is an aspecial other than the percent sign, %. .DE .DS .PP .B int mparse_istspecial(int c); .PP returns nonzero if the character is a tspecial (i.e. must be quoted if used in a parameter value). .DE .DS .PP .B int mparse_isldh(int c); .PP returns non-zero for any letter, digit, or hyphen (i.e. valid characters for a domain name component). 
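.PP
To make the character classes above concrete, the following sketch re-implements two of them independently of the library.
All names beginning with demo_ are hypothetical.
The tspecials set is taken from RFC 2045, and the label rules (at most 63 octets, no leading or trailing hyphen) from RFC 1035 and RFC 1123.
The library's own functions may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Non-zero if c is one of the RFC 2045 tspecials. */
static int demo_istspecial(int c)
{
    return c > 0 && c < 128 && strchr("()<>@,;:\\\"/[]?=", c) != NULL;
}

/* Non-zero for ASCII letters, digits, and hyphen. */
static int demo_isldh(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '-';
}

/* 1 if s[0..len-1] is a plausible domain name label, else 0. */
static int demo_is_ldh_label(const char *s, size_t len)
{
    size_t i;

    assert(s != NULL);
    if (len == 0 || len > 63 || s[0] == '-' || s[len - 1] == '-')
        return 0;
    for (i = 0; i < len; i++)
        if (!demo_isldh((unsigned char)s[i]))
            return 0;
    return 1;
}

/* 1 if a parameter value must be quoted because it is not an
 * RFC 2045 token (empty, or contains a space, control character,
 * non-ASCII octet, or tspecial), else 0. */
static int demo_needs_quoting(const char *s, size_t len)
{
    size_t i;

    assert(s != NULL);
    if (len == 0)
        return 1;
    for (i = 0; i < len; i++) {
        int c = (unsigned char)s[i];

        if (c <= ' ' || c >= 127 || demo_istspecial(c))
            return 1;
    }
    return 0;
}
```

Casting each octet through unsigned char before classification avoids negative values for bytes above 127 on platforms where char is signed.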
.DE .DS .PP .B int mparse_islwsp(int c); .PP returns nonzero for linear whitespace characters. .DE .DS .PP .B int mparse_is_ew_char(int c, unsigned int context); .PP returns nonzero if the character is valid in an encoded-word in the given context. .DE .DS .PP .B int mparse_isnewsgc(int c); .PP returns nonzero if the character is valid in a newsgroup or distribution name component. .DE .DS .PP .B int mparse_istokenc(int c); .PP returns nonzero if the character is valid in a MIME token. .DE .DS .PP .B int mparse_isboundaryc(int c); .PP returns nonzero if the character is valid in a MIME boundary string. .DE .DS .PP .B int mparse_is_uri_reserved(int c); .PP returns nonzero if the character is a URI reserved character (RFC 2396). .PP .DE .DS .B int mparse_is_uri_excluded(int c); .PP returns nonzero if the character is not permitted unescaped in a URI (RFC 2396). .DE .DS .PP .B int mparse_is_uri_authority_reserved(int c); .PP returns nonzero if the character is a URI reserved character in the authority URI component (RFC 2396). .DE .DS .PP .B int mparse_is_uri_path_reserved(int c); .PP returns nonzero if the character is a URI reserved character in the path URI component (RFC 2396). .DE .DS .PP .B int mparse_is_uri_query_reserved(int c); .PP returns nonzero if the character is a URI reserved character in the query URI component (RFC 2396). .DE .DS .PP The following functions provide access to base64 encoding and decoding translation: .PP .B unsigned char mparse_translate_b64(int); .PP returns the base64 character corresponding to the integer argument (range 0 to 63). .PP .B int mparse_encode_b64_word(const unsigned char *, unsigned int, unsigned int, unsigned int, char *, int); .PP encodes a word pointed to by the first argument, with length given by the second argument into a buffer with size given by the last two arguments. The two other unsigned integer arguments specify the number of leading and trailing space characters to be encoded with the word. 
The return value is the number of octets in the result, or if the buffer is too small, the number of octets necessary to hold the result. .PP .B int mparse_decode_b64_word(struct mparse_entity *, struct mparse_token *, char *, int, const char *, int); .PP Given an optional entity and token (for error reporting) and a buffer and its length (first four arguments), the base64-encoded text with length given by the last two arguments is decoded into the buffer. The return value is the number of octets in the result, or if the buffer is too small, the number of octets necessary to hold the result. .DE .DS .PP There is a required .B micalg parameter used with the multipart/signed media type. It is also used in conjunction with the Received-content-MIC header field. .PP Application programmers who wish to enumerate standard micalg values (for example, to construct a menu) can call: .PP .B const struct mparse_name_val *mparse_micalg(int n); .PP sequentially with an integer argument beginning with 1. .B mparse_micalg will return a pointer to the structure in sequence according to the micalg name returning a .I NULL pointer when the end of the list is reached. .DE .DS .PP IANA has a registry of context values for the Message-Context header field. .PP Application programmers who wish to enumerate standard context values (for example, to construct a menu) can call: .PP .B const struct mparse_name_val *mparse_context(int n); .PP sequentially with an integer argument beginning with 1. .B mparse_context will return a pointer to the structure in sequence according to the context value name, returning a .I NULL pointer when the end of the list is reached. .DE .DS .SS Mailboxes and Message-IDs .B int mparse_count_mailboxes(const struct mparse_token *list); .PP returns the number of mailboxes (or message-ids) in .IR list , which can be a series or delimited list or a single item. 
.DE .DS .PP \s-2\fBint mparse_mailbox_components(const struct mparse_token *list, int n, const struct mparse_token **display_name, const struct mparse_token **bracket, const struct mparse_token **route, const struct mparse_token **local_part, const struct mparse_token **at, const struct mparse_token **domain);\fP\s0 .PP finds the .IR n th mailbox (or message-id) in .I list and sets pointers to the components (if they exist): display name, opening angle bracket, route (which may be an RFC 822 list or an RFC 733 string of domains), local-part, @ delimiter, and domain. It returns -1 on serious errors with .I errno set, and returns .I n on success. If there are fewer than .I n mailboxes (or message-ids), it returns the number present in .IR list . .DE .DS .PP .B int mparse_interpolate_components(struct mparse_token *t, int n, char *buf, int sz, unsigned int components); .PP may be used to interpolate mailbox (or message-id) components into a buffer .IR buf . The .I components argument determines which components of the .IR n th mailbox are used according to the following values (defined in .BR mparse.h ) which should be bitwise ORed: .TS expand; lw(2.5i)fB lw(3.0i)fB . symbolic name description _ .T& lp-2fB lp-2 . 
MPARSE_INTERPOLATE_DISPLAY_NAME the display name (if present) MPARSE_INTERPOLATE_BRACKETS angle brackets MPARSE_INTERPOLATE_ROUTE RFC 822 source route or extra domains (RFC 733) MPARSE_INTERPOLATE_LOCAL_PART the local-part of the address MPARSE_INTERPOLATE_AT_DELIMITER the @ delimiter between local-part and domain MPARSE_INTERPOLATE_DOMAIN the domain part of the address MPARSE_INTERPOLATE_INCLUDE_COMMENTS T{ any comments associated with the mailbox components listed above T} MPARSE_INTERPOLATE_EXCESS_WS T{ .na any excess whitespace (otherwise compressed to a single space character outside brackets or eliminated within brackets including within local-part, domain, and around the @) T} .TE It is possible to get nonsense by specifying certain combinations of values; normally the values (MPARSE_INTERPOLATE_BRACKETS | MPARSE_INTERPOLATE_LOCAL_PART | MPARSE_INTERPOLATE_AT_DELIMITER | MPARSE_INTERPOLATE_DOMAIN) would be specified to obtain a full address for use with a transport protocol such as SMTP, MPARSE_INTERPOLATE_DOMAIN alone to obtain the domain name for use with DNS (e.g. for MX lookups), and (MPARSE_INTERPOLATE_DISPLAY_NAME | MPARSE_INTERPOLATE_BRACKETS | MPARSE_INTERPOLATE_LOCAL_PART | MPARSE_INTERPOLATE_AT_DELIMITER | MPARSE_INTERPOLATE_DOMAIN) for use in header fields. MPARSE_INTERPOLATE_DISPLAY_NAME alone (or possibly with MPARSE_INTERPOLATE_INCLUDE_COMMENTS) might be used by user agents to extract the display name for presentation. .PP The return value is the number of octets in the result, or if the buffer is too small, the number of octets necessary to hold the result. .DE .DS .SS Replies and Follow-ups .PP Sometimes one would like to know if a message is or is not a reply to another message. 
The function .PP .B "int mparse_is_reply(struct mparse_message *message);" .PP returns 0 if .I message is definitely not a reply, returns 1 if .I message is definitely a reply to another message, and returns -1 if it is not possible to definitively determine if .I message is or is not a reply. .DE .DS .PP Likewise .PP .B "int mparse_is_auto_reply(struct mparse_message *message);" .PP similarly indicates whether .I message is an automatic reply (not counting DSNs and MDNs). .DE .DS .PP The following functions may be used to extract addresses for replies to a message or for follow-ups to newsgroups: .PP .B int mparse_envelope_sender(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP places the envelope sender address (from the Return-Path field of .IR message ) in .IR buf , returning the number of characters copied if .I buf is sufficiently large (as specified by .IR sz ), or the required size if .I buf is a .I NULL pointer or is too small, or a negative value (with errno set) on error. The .I mode argument provides some control over interpolation of the address (see the description for the .I mparse_tokens_string function), though unfolding and normalization of "at" are always performed. .DE .DS .PP .B int mparse_error_addresses(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP places the address for an error reply in .IR buf , as above. The error reply address is the envelope sender address if present, the Sender field if there is no envelope sender address, or the From field if there is no envelope sender or Sender. .DE .DS .PP .B int mparse_sender_mailbox(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP copies the sender mailbox to .IR buf , as above. The sender mailbox is the mailbox specified in the Sender field if present or the From field if there is no Sender field. 
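.PP
The fallback order described above (envelope sender, then Sender, then From for error replies; Sender, then From for the sender mailbox) can be sketched as a first-present selection.
This is an illustration under stated assumptions, not the library's code: the demo_ functions are hypothetical, operate on strings that have already been extracted, and treat only a NULL pointer as "absent" (an empty string is still considered present in this sketch).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: first non-NULL candidate, or NULL if none. */
static const char *demo_first_present(const char *const *cand, size_t n)
{
    size_t i;

    assert(cand != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (cand[i] != NULL)
            return cand[i];
    return NULL;
}

/* Error replies: envelope sender, else Sender, else From. */
static const char *demo_error_address(const char *envelope_sender,
                                      const char *sender,
                                      const char *from)
{
    const char *cand[3];

    cand[0] = envelope_sender;
    cand[1] = sender;
    cand[2] = from;
    return demo_first_present(cand, 3);
}

/* Sender mailbox: Sender, else From. */
static const char *demo_sender_mailbox(const char *sender, const char *from)
{
    const char *cand[2];

    cand[0] = sender;
    cand[1] = from;
    return demo_first_present(cand, 2);
}
```

Keeping the precedence in one ordered array makes each rule a single line and keeps the selection logic in one place.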
.DE .DS .PP .B int mparse_author_mailboxes(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP copies the authors' mailboxes to .IR buf , as above. The authors' mailboxes are the mailboxes specified in the From field if present. .DE .DS .PP .B int mparse_reply_addresses(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP copies normal reply primary addresses to .IR buf , as above. Normal replies go to the addresses specified in the Reply-To field if present, or the From field if there is no Reply-To field. .DE .DS .PP .B int mparse_cc_mailboxes(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP copies the Cc field mailboxes to .IR buf , as above. .DE .DS .PP .B int mparse_to_mailboxes(struct mparse_message *, char *, int, unsigned int); .PP copies the To field mailboxes to .IR buf , as above. .DE .DS .PP .B int mparse_mdn_addresses(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP extracts the addresses from the Disposition-Notification-To field into .IR buf , as above. .DE .DS .PP .B int mparse_auto_response_address(struct mparse_message *message, char *buf, int sz, int is_service, unsigned int mode); .PP uses the Return-Path address (if present), primarily for use when generating reply fields for use with an Auto\-Submitted field. If .I is_service is non-zero, the From field will be used if it contains a single mailbox and there is no Return\-Path. If there are multiple mailboxes in the From field, .I errno is set to .BR MPARSE_ERRNO_EMULTIADDRESS , the addresses are placed in buf (if not NULL and if sufficiently large) and a negative value (-1 times the number of characters) is returned. .DE .DS .PP .B int mparse_list_post_uri(struct mparse_message *message, char *buf, int sz, unsigned int mode, unsigned int nskip); .PP places a list-post URI in .IR buf , as above.
The unsigned integer argument .I nskip indicates how many bracketed URIs should be skipped, starting at the beginning of the list in the List-Post field, if present. Zero is returned if no bracketed URI is available (including if there is no List-Post field). .DE .DS .PP .B int mparse_followup_newsgroups(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP places a list of followup newsgroup names in .IR buf , as above. There are none if the message contains a Followup-To field with only the "poster" keyword. If there is a Followup-To field .\" without the "poster" keyword, it specifies the followup newsgroups. If there is no Followup-To field, the followup newsgroups are taken from the Newsgroups field if present. In this case, default processing (extendable via .IR mode ) includes replacement of each comment with a space character. .DE .DS .PP .B int mparse_followup_addresses(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP copies followup addresses to .IR buf , as above. Addresses are taken from the Followup-To field if present and if it contains addresses, and/or addresses are taken from the Reply-To or From fields if the Followup-To field contains the "poster" keyword. If no Followup-To field is present, reply addresses are taken from Reply-To or From fields as described above for reply addresses. .DE .DS .PP .B int mparse_message_identifier(struct mparse_message *message, char *buf, int sz, unsigned int mode); .PP copies the message identifier from the Message-ID header field to .IR buf , as above. .DE .DS .SS Repair .PP When generating messages and when the context MPARSE_SECONDARY_ROLE_REPAIR flag is in effect, some errors may be repaired. There may be some errors which are beyond the ability to repair.
Others can be repaired, and the following functions initiate repairs on message components: .PP .B struct mparse_token *mparse_fix_token_errors(struct mparse_entity *, struct mparse_field *, struct mparse_token *, struct mparse_token **, int, const struct mparse_charset *, const char *, unsigned int, unsigned int, unsigned int, unsigned int); .PP repairs certain errors and/or eliminates warnings recorded for a specified token. The entity and field for the token should be specified along with the token. A pointer to a token structure pointer must be supplied; if the line containing the token must be folded, the start of the following line will be at the token pointed to by that pointer on successful return. The charset and language tag string may be supplied to indicate charset and/or language in case some content must be encoded. The final four unsigned integer arguments indicate, in turn, whether the token is in a message body part (non-zero), the MPARSE_FIX_* values to be repaired, whether to repair items for which warnings (but no errors) have been issued (if non-zero), and whether to consider issues relative to all RFCs (if non-zero) or only the RFC modes in effect for the message (if zero). Because a token might be deleted in the process of making repairs, the return value is a pointer to the next physical token after the specified token after repairs have been made. .PP .B int mparse_fix_field_errors(struct mparse_entity *, struct mparse_field *, int, const struct mparse_charset *, const char *, unsigned int, unsigned int, unsigned int, unsigned int); .PP operates similarly for fields (including handling all tokens in the field). The entity and field are specified, followed by a maximum line length for line folding. The remaining arguments (beginning with charset) are as above.
.PP .B int mparse_fix_entity_field_errors(struct mparse_entity *, int, const struct mparse_charset *, const char *, unsigned int, unsigned int, unsigned int); .PP makes repairs for all fields in a given entity (and including all tokens in all fields). The entity, line length limit, etc. are specified as above. .PP .B int mparse_fix_entity_body_errors(struct mparse_entity *, int, const struct mparse_charset *, const char *, unsigned int, unsigned int, unsigned int); .PP does the same for body lines and tokens within an entity. .PP .B int mparse_fix_entity_misc_errors(struct mparse_entity *, int, const struct mparse_charset *, const char *, unsigned int, unsigned int, unsigned int); .PP makes repairs for entity delimiter and separator lines, fields, tokens, and other entity error conditions (e.g. missing fields). .PP .B int mparse_fix_entity_errors(struct mparse_entity *, int, const struct mparse_charset *, const char *, unsigned int, unsigned int, unsigned int); .PP repairs all of the above types of errors in an entire entity. .DE .DS .PP It may be desirable to locate an .B mparse_error structure corresponding to some remedy which should be applied. .PP .B struct mparse_error *mparse_find_field_remedy(const struct mparse_message *, const struct mparse_field *, unsigned int, unsigned int, int); .PP returns a pointer to such a structure corresponding to a specified field if one exists. The first unsigned integer argument is constructed from the MPARSE_FIX* values to be considered. The other unsigned integer argument determines whether to consider issues relative to all RFCs (if non-zero) or only the RFC modes in effect for the message (if zero). The last argument sets a minimum severity (MPARSE_REQUIREMENTS_SHOULD, MPARSE_REQUIREMENTS_MUST, etc.) to be considered.
.PP .B int mparse_field_needs_remedy(const struct mparse_message *, const struct mparse_field *, unsigned int, unsigned int, int); .PP simply returns a positive non-zero value if there exists an .B mparse_error structure corresponding to the arguments as detailed above. It returns zero if there is no such structure and a negative value if the supplied arguments are not valid. .DE .DS .PP The functions .PP .B struct mparse_error *mparse_find_token_remedy(const struct mparse_message *, const struct mparse_token *, unsigned int, unsigned int, int); .PP and .PP .B int mparse_token_needs_remedy(const struct mparse_message *, const struct mparse_token *, unsigned int, unsigned int, int); .PP operate similarly for tokens. .DE .DS .SS Miscellany .PP Some fields may contain Uniform Resource Identifiers (URIs) as defined in RFC 2396. RFC 2396 Appendix B describes parsing a URI into component parts, which may be performed by calling: .PP \s-2\fBint mparse_parse_uri(const char *s, char *pscheme, size_t *pscheme_sz, char *pauthority, size_t *pauthority_sz, char *ppath, size_t *ppath_sz, char *pquery, size_t *pquery_sz, char *pfragment, size_t *pfragment_sz);\fP\s0 .PP The URI is supplied as a nul-terminated character string .IR s . Component strings are copied to corresponding char * arguments (if not NULL), up to the size given by the corresponding size_t * arguments (which must not be NULL and into which the length (not including terminating '\e0') is written if the supplied size was sufficient; if insufficient, the required size (including space for a terminating '\e0') is written). Return value is -1 on error, with .I errno set, zero if all sizes are adequate, and a positive value if one or more sizes are insufficient, or if there is an error involving regular expressions for parsing. N.B. path may include path parameters. 
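.PP
The component split described in RFC 2396 Appendix B can be sketched with the regular expression given there and the POSIX regex interface.
This is an independent illustration, not the implementation of
.BR mparse_parse_uri :
the names demo_split_uri, copy_match, and struct demo_uri are hypothetical, fixed-size buffers replace the size_t * arguments, and an absent component is simply reported as an empty string.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Regular expression from RFC 2396 Appendix B, in POSIX extended syntax.
 * Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment. */
static const char uri_re[] =
    "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?";

struct demo_uri {
    char scheme[32];
    char authority[128];
    char path[256];
    char query[256];
    char fragment[128];
};

/* Copy one sub-match of s into out (size sz).
 * Returns 1 if present and copied, 0 if absent, -1 if out is too small. */
static int copy_match(const char *s, const regmatch_t *m, char *out, size_t sz)
{
    size_t len;

    assert(out != NULL && sz > 0);
    out[0] = '\0';
    if (m->rm_so < 0)
        return 0;
    len = (size_t)(m->rm_eo - m->rm_so);
    if (len + 1 > sz)
        return -1;
    memcpy(out, s + m->rm_so, len);
    out[len] = '\0';
    return 1;
}

/* Split s into components; returns 0 on success, -1 on a regular
 * expression failure or if any component does not fit. */
static int demo_split_uri(const char *s, struct demo_uri *u)
{
    regex_t re;
    regmatch_t m[10];
    int rc = -1;

    if (regcomp(&re, uri_re, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, s, 10, m, 0) == 0 &&
        copy_match(s, &m[2], u->scheme, sizeof u->scheme) >= 0 &&
        copy_match(s, &m[4], u->authority, sizeof u->authority) >= 0 &&
        copy_match(s, &m[5], u->path, sizeof u->path) >= 0 &&
        copy_match(s, &m[7], u->query, sizeof u->query) >= 0 &&
        copy_match(s, &m[9], u->fragment, sizeof u->fragment) >= 0)
        rc = 0;
    regfree(&re);
    return rc;
}
```

For the RFC's own example, http://www.ics.uci.edu/pub/ietf/uri/#Related, the sketch yields scheme "http", authority "www.ics.uci.edu", path "/pub/ietf/uri/", an empty query, and fragment "Related".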
.DE .DS .PP There are functions for conversion between messages and .I mailto URIs as defined in RFC 2368: .PP .B int mparse_instantiate_mailto(const char *s, struct mparse_message *message, unsigned int is_html); .PP turns a .I mailto URI in character string .B s into message content in .BR message . If the .I mailto URI has been taken from HTML context, .B is_html should be set to a non-zero value. The return value is negative, with .I errno set appropriately in the event of an error. A zero return value indicates that all went well with the conversion, while a positive return value indicates that some content in the resulting message has generated at least one error or warning with respect to the RFCs enabled via the .B modes set in .BR message . .DE .DS .PP .B int mparse_prepare_mailto(const struct mparse_message *message, char *buf, size_t sz, unsigned int is_html); .PP generates a .I mailto URI in .B buf from the message content in .BR message . If the .I mailto URI will be used in an HTML context, .B is_html should be set to a non-zero value. The return value is negative with .I errno set appropriately in the event of an error. Otherwise the number of characters placed into .B buf is returned (not including the terminating ASCII NUL if the buffer is sufficiently large); if that value is greater than the supplied .B sz value, the buffer is too small to hold the .I mailto URI, and the return value indicates the minimum buffer size that will accommodate the URI (including the terminating ASCII NUL). .DE .DS .PP Application programmers who wish to enumerate message MIME parts may use .PP .B int mparse_part_string(const struct mparse_entity *, char *, size_t); .PP which places an RFC 3501 IMAP-compatible part number for the .B struct mparse_entity pointed to by the first argument in the buffer pointed to by the second argument, and with size given by the third argument. 
The return value is the number of characters in the string, or the required buffer size if the buffer is too small. .DE .DS .PP Various registered protocol element names can be tested, and in some cases enumerated, by calling functions similar in operation to those already described for specific protocol elements. The functions described below are automatically called during parsing of messages (including when fields are generated via mparse_insert_field) and need not be called by application authors. These functions are provided: .PP .B const char *mparse_MTA_name_typ_entry (register const char *str, register unsigned int len); .PP validates str (with length len) as a registered MTA-name-type (RFC 3464). It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP MIME message/external-body provides for access-types. A potential access-type can be validated by calling .PP .B const struct mparse_name_val *mparse_access_entry(const char *, unsigned int); .PP It returns a structure which holds the canonical name and some additional information (see mparse.h). .PP .B const struct mparse_name_val *mparse_access_type(int n); .PP may be used to enumerate access-types. .DE .DS .PP Message Disposition Notifications (RFC 3798) provide for action-modes in the Disposition MDN field. .PP .B const char *mparse_act_mode_entry(const char *, unsigned int); .PP may be called to determine if a string of a given length is a valid action-mode value. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP There are registered action-value names associated with the Action per-recipient field of Delivery Status Notifications (RFC 3464) and Message Tracking Status Notifications (RFC 3XXX). .PP .B const struct mparse_name_val *mparse_action_entry(const char *, unsigned int); .PP validates strings as action-value names.
It returns a structure which holds the canonical name and some additional information (see mparse.h). A program may enumerate those action-value names (e.g. to create a menu) by calling the function .PP .B const struct mparse_type *mparse_action(int n); .PP with successive integer arguments beginning with 1. Each call will return a pointer to a .I mparse_name_val structure which represents an action-value; when all action-values have been enumerated, .B mparse_action returns a NULL pointer. .DE .DS .PP DSN and MTSN Original-Recipient and Final-Recipient fields use an address-type value. .PP .B const char *mparse_address_type_entry(const char *, unsigned int); .PP validates strings as address-type names. It returns a pointer to the canonical name. There is no corresponding enumeration function. .DE .\" .DS .\" .PP .\" .PP .\" .B const struct mparse_name_val *mparse_auto_sub_entry(const char *, unsigned int); .\" .PP .\" .DE .DS .PP The Autosubmitted header field defined in RFC 2156 contains an autosubmitted value. .PP .B const struct mparse_name_val *mparse_autosub_entry(const char *, unsigned int); .PP validates a string with a given length as an autosubmitted value name. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP Boolean values, i.e. the strings "true" and "false", may appear in some contexts. The function .PP .B const struct mparse_name_val *mparse_boolean_entry(const char *, unsigned int); .PP recognizes those strings and returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP RFC 3458 defines a Message-Context header field which can contain registered context values. 
.PP .B const struct mparse_name_val *mparse_context_entry(const char *, unsigned int); .PP recognizes those strings and returns a structure which holds the canonical name and some additional information (see mparse.h). .PP .B const struct mparse_name_val *mparse_context_type(int); .PP is the corresponding enumeration function. .DE .DS .PP Usenet control messages use a control keyword. .PP .B const struct mparse_control *mparse_control_entry(const char *, unsigned int); .PP recognizes those keywords and returns a structure which holds the canonical name and some additional information (see mparse.h). .PP .B const char *mparse_control_name(int); .PP is the corresponding enumeration function. .DE .DS .PP Specification of dates using the date-time production of RFC 822 and RFC 2822 grammar provides for optional specification of the day-of-week. .PP .B const struct mparse_name_val *mparse_day_entry(const char *, unsigned int); .PP recognizes valid day-of-week strings and returns a structure which holds the canonical name and some additional information (see mparse.h). While there is no corresponding enumeration function, the global character string array .I mparse_dows provides access to the canonical strings (index 0 corresponds to Sunday, etc.). .DE .DS .PP DSN Diagnostic-Code fields use a diagnostic-type value. .PP .B const char *mparse_diagnostic_t_entry(const char *, unsigned int); .PP may be called to determine if a string of a given length is a valid diagnostic-type value. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP Message Disposition Notifications (RFC 3798) provides for disposition-modifiers in the Disposition MDN field. .PP .B const struct mparse_name_val *mparse_disp_mod_entry(const char *, unsigned int); .PP may be called to determine if a string of a given length is a valid disposition-modifier value.
It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MDN RFC also provides for Disposition-Notification-Options parameters which are registered with IANA. .PP .B const struct mparse_name_val *mparse_disp_not_opt_entry(const char *, unsigned int); .PP may be called to determine if a string of a given length is a valid Disposition-Notification-Options parameter. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MDN RFC provides for Disposition types which are registered with IANA. .PP .B const char *mparse_dtype_entry(const char *, unsigned int); .PP validates a string against known disposition types. It returns a pointer to the canonical form of the name. .DE .DS .PP The keywords WHERE and END may be used in constructing filters (RFC 2533). .PP .B const struct mparse_name_val *mparse_filter_entry(const char *, unsigned int); .PP recognizes these keywords. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP Media feature tags are also defined in RFC 2533 and are registered by IANA. .PP .B const char *mparse_ftag_entry(const char *, unsigned int); .PP recognizes registered media feature keywords. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP Disposition-Notification-Options parameters use "importance" keywords to distinguish required and optional support for parameters. .PP .B const char *mparse_importance_entry(const char *, unsigned int); .PP recognizes valid importance keywords. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs defined an Importance header field.
.PP .B const struct mparse_name_val *mparse_importance2_entry(const char *, unsigned int); .PP recognizes valid keywords used with that field. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs also defined a Message-Type header field. .PP .B const struct mparse_name_val *mparse_message_type_entry(const char *, unsigned int); .PP recognizes valid phrases used with that field. It returns a structure which holds the canonical phrase and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP RFC 3335 defines a Received-content-MIC header field which contains a message integrity check (MIC) algorithm (micalg) name. These micalg names may also appear in a Disposition-Notification-Options header field which is used in conjunction with MDNs for secure business data interchange. .PP .B const struct mparse_name_val *mparse_micalg_entry(const char *, unsigned int); .PP recognizes valid micalg names. It returns a structure which holds the canonical name and some additional information (see mparse.h). .PP .B const struct mparse_name_val *mparse_micalg(int); .PP is the corresponding enumeration function. .DE .DS .PP Standardized alphabetic names are used to designate months in date-time specifications. .PP .B const struct mparse_name_val *mparse_month_entry(const char *, unsigned int); .PP recognizes valid month names and returns a structure which holds the canonical name and some additional information (see mparse.h). While there is no corresponding enumeration function, the global character string array .I mparse_mons provides access to the canonical strings (index 0 corresponds to January, etc.). .DE .DS .PP Newsgroup names have certain restrictions on allowable newsgroup components, and there are some keywords that may be used in place of newsgroup names in some contexts. 
.PP .B const struct mparse_name_val *mparse_newsgroups_entry(const char *, unsigned int); .PP determines if a string is such a keyword or forbidden name or component. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs defined a Priority header field. .PP .B const struct mparse_name_val *mparse_priority_entry(const char *, unsigned int); .PP recognizes valid names used with that field. It returns a structure which holds the canonical phrase and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs also defined several header fields which are used to indicate whether or not some end-to-end signaling feature is to be prohibited. .PP .B const struct mparse_name_val *mparse_prohibition_entry(const char *, unsigned int); .PP recognizes valid names used with those fields. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The original SMTP RFC, RFC 788, specified several protocol names to be used with the Mail-From header field. .PP .B const char *mparse_protocol_entry(const char *, unsigned int); .PP recognizes valid protocol names for that context. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP Uniform Resource Identifiers (URIs) specify a particular access scheme. Such schemes are registered by IANA. .PP .B const char *mparse_schemes_entry(const char *, unsigned int); .PP recognizes registered scheme names. It returns a pointer to the canonical form of the name. .PP .B const char *mparse_scheme(int); .PP is the corresponding enumeration function. .DE .DS .PP Message Disposition Notifications (RFC 3798) provides for sending-modes in the Disposition MDN field.
.PP .B const char *mparse_sending_entry(const char *, unsigned int); .PP may be called to determine if a string of a given length is a valid sending-mode value. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs defined a Sensitivity header field. .PP .B const struct mparse_name_val *mparse_sensitivity_entry(const char *, unsigned int); .PP recognizes valid names used with that field. It returns a structure which holds the canonical phrase and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs also defined an X400-Received header field. .PP .B const struct mparse_name_val *mparse_x400act_entry(const char *, unsigned int); .PP recognizes valid action names used with that field. It returns a structure which holds the canonical phrase and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs also defined an X400-Content-Type header field. .PP .B const char *mparse_x400ct_entry(const char *, unsigned int); .PP recognizes the sole valid name used with that field. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP The MIXER RFCs defined encoded-information-type keywords. .PP .B const struct mparse_name_val *mparse_x400eit_entry(const char *, unsigned int); .PP recognizes valid encoded-information-type keyword names. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .\" .DS .\" .PP .\" .PP .\" .B const struct mparse_name_val *mparse_yesno_entry(const char *, unsigned int); .\" .PP .\" .DE .DS .PP Numeric time zone offsets (from UTC) and standardized alphabetic names are used in date-time specifications.
.PP .B const struct mparse_name_val *mparse_zones_entry(const char *, unsigned int); .PP recognizes valid offsets and standardized names used in date-time specifications. It returns a structure which holds the canonical name and some additional information (see mparse.h). There is no corresponding enumeration function. .DE .DS .PP Calling Line Identification for Voice Mail Messages (RFC 3939) provides for a numbering plan option in the Caller-ID field. .PP .B const char *mparse_numbering_entry(const char *str, unsigned int len); .PP validates str (with length len) as a registered numbering plan. It returns a pointer to the canonical form of the name. There is no corresponding enumeration function. .DE .DS .PP Two functions are provided for interpolating numbers into a buffer without the performance overhead of .IR snprintf . .PP .B size_t mparse_n_print(char *, size_t, int); .PP puts a string representation of a decimal integer into the character array given by the first two arguments (start and length). .PP .B size_t mparse_un_print(char *, size_t, unsigned int); .PP does the same for unsigned integers. .PP .B const char mparse_digits[]; .PP is a global character array where each element is the ASCII decimal digit corresponding to the array index. .DE .DS .SS Hacking .PP Suppose that you wish to add support for some new field. The first thing to do is to determine if that's really wise. There is a tendency for novices to approach any messaging issue with ``let's invent a header field to do...''. There are frequently better approaches. The second step is to determine whether to provide support directly in mparse, or at the application level via the hooks for extension and user-defined fields. Unless there is a high probability that the field will be standardized, it may be best to use the hooks rather than hack .BR mparse . Assuming that one decides to proceed, the next thing required is a syntax specification for the field in a form similar to RFC 2822's ABNF.
Additional required information is an understanding of where and how often the field may appear (e.g. it may be a mandatory DSN per-recipient field). Other field characteristics, such as whether or not a Resent- form of the field is to be recognized (see the mparse_field_state structure flags in mparse.h) should also be determined. Having collected the necessary information, one can proceed to modify the related files: .IP mparse.h.in: 0.8i Processing hooks may be added; these are called when the field is processed. The hook function should take an integer (indication of errors in the field as parsed) as its first argument, a pointer to the relevant .I mparse_field structure as the second argument, and may have additional arguments as pointers to interesting .I mparse_token structures. .IP hooktest.c: 0.8i Arrangements should be made to recognize added hooks and to generate some appropriate output (often simply echoing the field). .IP mparse.y: 0.8i Grammar rules for the field need to be added, and should be referenced by an entry in the list of fields which appears prior to the individual field grammar rules. See the existing field grammar rules for examples. Typically there should be an error-handling rule and one or more rules for normal field recognition. Each of the rules will typically call the function .I mparse_install_and_process_field with appropriate arguments (see the function definition in the file .B parse2.c or the prototype in the header file .BR mparse.h ). Ideally, the grammar should not include a lot of C code in the corresponding rule action; the idea is to put processing code in .B parse2.c and reserve .B mparse.y for grammar rules per se. Of course, there are exceptional cases where some detailed action code may be necessary. There are a number of non-terminal productions defined in the grammar file; these should be used where applicable rather than attempting to reinvent the wheel.
Another principle to bear in mind when formulating the grammar rules is ``In general, an implementation must be conservative in its sending behavior, and liberal in its receiving behavior'' (RFC 791). In .BR mparse , this is achieved by making the grammar as liberal as practical and detecting and flagging errors and dubious constructs (which can be repaired during field generation). .IP parse2.c: 0.8i Detailed field-specific actions, error-checking, etc. are handled in the function .IR mparse_install_and_process_field . See the existing code there for standard fields for examples of the type of processing performed. A check for proper context of the field may be added to the .I check_field_context function, and field count checks and cross-checks may be added to the .I mparse_header_end function. .IP fields.gperf: 0.8i The data corresponding to the mparse_field_state structure is entered on a line. The lexical analyzer start condition is determined by the field grammar; see the existing field definitions and the lexical analyzer rules (mlex.l). .IP mparse.0: 0.8i The hooks and any new functions related to the field should be documented. .IP fix_fold.c: 0.8i Handling of the field during repair should be incorporated into the fix_* functions. .PP If there are any potential errors that should be flagged, it may be necessary to provide error message strings and/or references to a particular section of a particular standards document. Macros used to reference internationalized strings are found in file .BR msg.h ; the macros expand to an index integer which is used to retrieve the corresponding text from the appropriate language variant of the messages stored in .B msg_lang.gperf for text that may appear in various languages. .PP In rare cases, it may be necessary to modify or extend the lexical analyzer (mlex.l). .PP .B mparse should be rebuilt and the regression tests run after modifying the relevant files.
That is accomplished by entering the command .BR "make regression" . The program .B hooktest should be used to test proper recognition of the field (including detection of and suitable diagnostics for illegal input). .DE .DS .SS Debugging .PP .B void mparse_dump_token(FILE *, const struct mparse_token *); .PP may be used to print information about a physical token and its attributes to the specified stdio FILE. .PP .B void mparse_dump_tokens(FILE *, const struct mparse_token *); .PP prints information for a logical token series. .DE .DS .SS Maintenance .PP IANA and other authorities maintain sets of keywords, language tags, charset names, etc. which may be periodically updated. .B Mparse can be rebuilt to incorporate such updates. The .I makefile (specifically the file .I make.file which contains the recipes for building .BR mparse ) automates much of the update, provided suitable tools are available. The high-performance keyword hash tables which are generated from the reference sets are regenerated. Any differences due to updates are displayed and the rebuild of .B mparse stops. That is intentional so that the differences can be reviewed. Occasionally a typographical error is introduced into the reference information, and some of the tables incorporate ancillary information (e.g. flags indicating required Content-Type parameters associated with media types) which cannot be automatically generated from the reference information (somebody needs to manually review the defining registration information and determine whether flags need to be updated). When the files automatically generated from the reference information have been reviewed (and manually updated if necessary), the build process may be restarted to continue the update. 
.DE .DS .SH EXAMPLE .PP The following example demonstrates construction of a delivery status notification: .DE .na .nf .ft CW .ps 7 .vs 8 .lf 1 dsntest.c
/* Description: demonstration program: generate a Delivery Status Notification message */
#define optind EFFIN_GCC_NONSENSE       /* work around gcc silliness */
#include <stdio.h>
#include <errno.h>              /* errno */
#include <string.h>             /* strerror, strlen */
#include <stdlib.h>             /* atoi */
#include <unistd.h>             /* access, R_OK */
#include <mparse.h>             /* mparse API; header name assumed from the references to mparse.h in this manual */
#undef optind

#define OPTSTRING "a:r:s:t:"
#define USAGE "-r recipient -a action -s status -t to [file]"

struct counts {
    unsigned int errcount;
    unsigned int warncount;
};

static char *setopt(char *s, char **argv, int *poptind, unsigned int *perr,
                    char **popt, const char *name)
{
    if (!*++s)
        s = argv[++*poptind];
    if (!s)
        (*perr)++;
    else {
        if (*popt) {
            (void)fprintf(stderr, "%s: %s already specified: %s\n", argv[0], name, *popt);
            (*perr)++;
        } else
            *popt = s;
        for (; *s; s++);
        s--;
    }
    return s;
}

static int field_err(struct mparse_entity *entity, FILE *s, int line)
{
    if (entity) {
        struct mparse_field *f = entity->last_field;
        struct mparse_message *message = entity->message;

        (void)fprintf(s, "bad %s:\n", f->state ? f->state->field_name : f->tokens->tok);
        message->suppress_errors = message->suppress_warnings = 0U;
        (void)mparse_field_error_messages(f, mparse_fwrite_wrapper, s);
        (void)mparse_process_and_free_message(message, -1);
    }
    return line;
}

static int header_err(struct mparse_entity *entity, FILE *s, int line)
{
    if (entity) {
        struct mparse_message *message = entity->message;

        (void)fprintf(s, "bad fields:\n");
        message->suppress_errors = message->suppress_warnings = 0U;
        (void)mparse_entity_header_error_messages(entity, mparse_fwrite_wrapper, s);
        (void)mparse_process_and_free_message(message, -1);
    }
    return line;
}

static int copy_err(struct mparse_entity *entity, FILE *s, int line)
{
    (void)fprintf(s, "message copy error\n");
    if (entity)
        (void)mparse_process_and_free_message(entity->message, -1);
    return line;
}

static int insertion_err(struct mparse_entity *entity1, struct mparse_entity *entity2,
                         FILE *s, int line)
{
    (void)fprintf(s, "insertion error\n");
    if (entity1)
        (void)mparse_process_and_free_message(entity1->message, -1);
    if (entity2)
        (void)mparse_process_and_free_message(entity2->message, -1);
    return line;
}

static int hook_end_of_message(struct mparse_message *message)
{
    struct mparse_entity *cpy, *msg, *top = (struct mparse_entity *)(message->userptr);

    cpy = mparse_copy_message(message, message->top);
    if (!cpy)
        return copy_err(top, stderr, __LINE__);
    else if (mparse_insert_entity(msg = mparse_encapsulate(cpy, "rfc822", 0), top, 0, 0))
        return insertion_err(msg, top, stderr, __LINE__);
    return 0;
}

/* args: -r recipient, -a action, -s status, -t to, [file] */
int main(int argc, char **argv)
{
    char buf[1024], buf2[128], *act, *recipient, *st, *to, *s;
    int c, optind;
    FILE *in;
    struct counts counts;
    struct mparse_field *h;
    struct mparse_message message;
    struct mparse_entity *p, *q, *r;
    struct mparse_debug debug;

    setvbuf(stdout, NULL, _IOLBF, 0);
    memset(&message, 0, sizeof(struct mparse_message));
    memset(&debug, 0, sizeof(struct mparse_debug));
    message.dbg = &debug;
    memset(&counts, 0, sizeof(struct counts));
    act = recipient = st = to = 0;
    /* parse option arguments; emulate getopt (too many implementation differences to rely on it) */
    for (optind = 1; (optind < argc) && (argv[optind][0] == '-'); optind++) {
        if (!strcmp(argv[optind], "--")) {
            optind++;
            break;
        }
        for (s = argv[optind] + 1; s && ((c = *s) != '\0'); s++)
            switch (c) {
            case 'a':
                s = setopt(s, argv, &optind, &counts.errcount, &act, "action");
                break;
            case 'r':
                s = setopt(s, argv, &optind, &counts.errcount, &recipient, "recipient");
                break;
            case 's':
                s = setopt(s, argv, &optind, &counts.errcount, &st, "status");
                break;
            case 't':
                s = setopt(s, argv, &optind, &counts.errcount, &to, "to");
                break;
            default:
                counts.errcount++;
                break;
            }
    }
    /* make sure mandatory arguments have been specified */
    if (!act || !recipient || !st || !to)
        counts.errcount++;
    else if (argc - optind > 1) {       /* check file argument(s) */
        (void)fprintf(stderr, "%s: too many file arguments\n", argv[0]);
        counts.errcount++;
    } else
        for (c = optind; c < argc; c++) {       /* pass 0: quick check for readable input file */
            if (!strcmp(argv[c], "-"))
                continue;
            errno = 0;
            if (access(argv[c], R_OK)) {
                (void)fprintf(stderr, "%s: can't read %s: %s\n", argv[0], argv[c], strerror(errno));
                counts.errcount++;
            }
        }
    if (counts.errcount) {
        (void)fprintf(stderr, "%s: usage: %s %s\n", argv[0], argv[0], USAGE);
        return __LINE__;
    }
    s = getenv("YYDEBUG");      /* same */
    if (s)                      /* as */
        message.gdebug = (atoi(s) ? 1U : 0U);   /* Berserkeley yacc */
    message.no_copy = 1U;       /* don't echo; will print_message after assembly */
    message.suppress_errors = message.suppress_warnings = 1U;
    message.linelen = MPARSE_LIMIT_RECLINELEN;
    message.context = MPARSE_PRIMARY_ROLE_GENERATION;
    q = mparse_new_entity(&message);
    p = mparse_encapsulate(q, "delivery-status", 0);    /* encapsulate before adding fields */
    mparse_gen_hostname(buf, sizeof(buf));
    if (mparse_insert_field(q, 0, 0, 0, 0, 1, MPARSE_LIMIT_RECLINELEN,
                            "Reporting-MTA", " dns", "; ", buf, 0))
        return field_err(q, stderr, __LINE__);
    /* could add more per-message fields here */
    if (mparse_header_end(q))
        return header_err(q, stderr, __LINE__);
    q = mparse_new_entity(&message);
    if (mparse_insert_entity(q, p, 0, 0))       /* insert in encapsulation before adding fields */
        return insertion_err(q, p, stderr, __LINE__);
    if (mparse_insert_field(q, 0, 0, 0, 0, 1, MPARSE_LIMIT_ENCLINELEN,
                            "Final-Recipient", " ", "rfc822", " ; ", recipient, 0))
        return field_err(q, stderr, __LINE__);
    if (mparse_insert_field(q, 0, 0, 0, 0, 1, MPARSE_LIMIT_RECLINELEN,
                            "Action", " ", act, 0))
        return field_err(q, stderr, __LINE__);
    if (mparse_insert_field(q, 0, 0, 0, 0, 1, MPARSE_LIMIT_RECLINELEN,
                            "Status", " ", st, 0))
        return field_err(q, stderr, __LINE__);
    /* could add more per-recipient fields here */
    if (mparse_header_end(q))
        return header_err(q, stderr, __LINE__);
    q = mparse_new_entity(&message);
    (void)mparse_append_body_line(q, "This is a DSN example.", "text", 4U, "plain", 5U,
                                  "us-ascii", 8U, 0, 0, 0U, 0, 0, 0, MPARSE_LIMIT_RECLINELEN);
    mparse_body_end(q);
    (void)mparse_parameter_string(&message, "report-type", "delivery-status", 15U, 0, 0,
                                  buf2, sizeof(buf2));
    r = mparse_create_multipart(p, "report", "report delimiter", buf2, 0, 0);
    if (mparse_insert_entity(q, r, p, 0))
        return insertion_err(q, r, stderr, __LINE__);
    /* fields for report */
    if (mparse_insert_field(r, r->fields, 0, 0, 0, 1, MPARSE_LIMIT_ENCLINELEN,
                            "From", " postmaster@", buf, 0))
        return field_err(r, stderr, __LINE__);
    h = r->fields->next;
    if (mparse_insert_field(r, h, 0, 0, 0, 1, MPARSE_LIMIT_ENCLINELEN,
                            "To", " ", to, 0))
        return field_err(r, stderr, __LINE__);
    mparse_gen_date_local(buf, sizeof(buf));
    if (mparse_insert_field(r, h, 0, 0, 0, 1, MPARSE_LIMIT_RECLINELEN,
                            "Date", " ", buf, 0))
        return field_err(r, stderr, __LINE__);
    if (mparse_insert_field(r, h, 0, 0, 0, 1, MPARSE_LIMIT_RECLINELEN,
                            "Subject", " Delivery Status Notification: ", act, 0))
        return field_err(r, stderr, __LINE__);
    mparse_gen_message_id(buf, sizeof(buf));
    if (mparse_insert_field(r, h, 0, 0, 0, 1, MPARSE_LIMIT_RECLINELEN,
                            "message-id", " ", buf, 0))
        return field_err(r, stderr, __LINE__);
    if (optind < argc) {
        struct mparse_message orig;
        struct mparse_hooks hooks;

        /* read and parse input file */
        memset(&orig, 0, sizeof(struct mparse_message));
        orig.context = MPARSE_PRIMARY_ROLE_ACCESS;
        orig.dbg = &debug;
        orig.no_copy = 1U;      /* don't echo; will print_message after assembly */
        orig.suppress_errors = orig.suppress_warnings = 1U;
        orig.userptr = r;
        orig.timeout = 10.0;
        memset(orig.hooks = &hooks, 0, sizeof(struct mparse_hooks));
        hooks.hook_end_of_message = hook_end_of_message;
        if (!strcmp(argv[optind], "-"))
            in = stdin;
        else {
            in = fopen(argv[optind], "rb");
            if (!in) {
                (void)fprintf(stderr, "%s: can't open %s: %s\n", argv[0], argv[optind], strerror(errno));
                (void)mparse_process_and_free_message(&message, -1);
                return __LINE__;
            }
        }
        (void)mparse_parse(&orig, in, stderr);
        if (in != stdin)
            fclose(in);
    }
    if (!message.top)
        return __LINE__;
    (void)mparse_adjust_encoding(r);
    (void)mparse_minimize_mime_fields(r);
    if (mparse_header_end(r))
        return header_err(q, stderr, __LINE__);
    (void)mparse_print_message(stdout, 0, &message, 0, stderr);
    return mparse_process_and_free_message(&message, 0);
}
.lf 8770 mparse.0 .vs .ps .ft .fi .ad .DS .SH RETURN VALUE .B mparse_parse returns zero unless a serious parsing error occurred.
Examine .BR ioerr in the .B mparse_message structure to determine whether an I/O error such as a timeout was encountered while processing the message. .DE .DS .SH BUGS and CAVEATS .PP There are some conflicts between RFCs 821, 822, 2821, 2822, 1034, 1123, 1036, 2045, 2425, etc. and quite a few gray areas. These may be addressed as the standards are revised. .DE .PP RFC 2822 is rather generous in allowing certain constructs. This parser warns about invalid old dates, invalid time zones, domain names and domain literals that are syntactically legal per RFC 2822, but which are meaningless. These warnings are associated with an RFC number of zero. Printing such warnings via the demonstration program is enabled via the -0 (zero) command-line option, which sets RFC .B mode 0. .PP RFC 2822 eliminated the distinction between RFC 822 extension fields and user-defined (those beginning with "X-") fields. The RFC 822 scheme reduced the likelihood of namespace clashes. Mparse does distinguish between the two types, although they are treated similarly (some differences are mandated by RFC 2047). Separate user hooks are provided for the two types. .PP RFC 2184 (obsoleted by RFC 2231) used the number 1 for the first part of a parameter continuation. RFC 2231 uses 0. RFC 2184 (in effect from August to November 1997) parameter continuation numbering is not supported. This is a conflict between the two RFCs which cannot be readily resolved. .PP RFC 2821 permits a mix of angle bracketed addresses and mailbox specifications in a Received field .I for component, as well as permitting multiple mailboxes there. RFC 2822 does not permit a mix or multiple addr-specs. The mix and multiple mailboxes permitted by the 2821 specification lead to parsing conflicts and are therefore not supported. 
.PP The rules in RFC 2046 for message/partial reassembly require that an original message's Date field be placed in the initial piece message header when fragmenting in order to preserve that field through the fragmentation/reassembly process. That conflicts with RFC 2822 section 3.6.1 regarding the Date field semantics. .B Mparse preserves the Date information from the original message in the initial piece message header when generating that piece. .PP RFCs 850 and 1036 have some sections which effectively impose structure on the (RFC 822) unstructured Subject field. That contradicts RFC 822, and as both RFCs explicitly state that RFC 822 has precedence in the case of conflicts, the purported structural suggestions are not supported. .DS .PP A number of extension fields have been defined in several RFCs. The following are not yet fully supported, or are supported only to the extent of non-conflicting specifications.
.TS
expand tab(|);
lw(2.45i)fB lw(1.85i)fB lw(1.5i)fB .
field|description|resolution
_
.T&
lp-2fB lp-2 lp-2 .
T{
.na
Received (RFCs 821, 822, 886, 2821, 2822)
T}|T{
.na
RFC 886 advocates using unregistered values in the 'with' component,
contrary to the other RFCs and RFC 1958 which require an IANA-registered protocol.
T}|T{
.na
Unregistered values are considered errors.
T}
T{
.na
Delivery-Report-Content-Reported-Recipient-Info (RFC 987)
T}|T{
.na
Use of *text in the case of drc-failure leads to ambiguities
that cannot be resolved without extensive lookahead.
T}|T{
.na
A phrase_list is recognized where RFC 987 specifies *text.
T}
T{
.na
Content-Location (RFC 2557, 2616, 2912), also Content-Base (RFC 2110)
T}|T{
.na
These can't be unambiguously parsed as currently defined in the RFCs.
Also, encoding of URIs and handling of long URIs are not appropriately defined.
No provision for #fragments.
T}|T{
.na
Comments disallowed (parentheses are part of URI).
URI encoding should be used if required.
Long URIs assembled from portions separated by FWS.
#fragments parsed per RFC 2396.
T}
Reporting-UA (RFCs 2298, 3798)|Specification is ambiguous.|T{
.na
\fIua-name\fP parsed as RFC 2822 phrase not *text.
T}
T{
.na
Received-content-MIC (RFC 3335)
T}|T{
.na
Syntax uses MIME parameter value specification outside of a parameter.
Permissible CFWS locations not specified.
Min/max number of fields not specified.
T}|T{
.na
Parsed as specified, but unlikely to be what the authors intended
as the value might include RFC 2231 charset, language, and/or encoding.
Liberal acceptance of CFWS.
No min/max count enforced.
T}
.TE
.DE .DS .PP A number of additional MIME types have been defined in multiple RFCs. At least the following may have parsing implications which are not fully supported, or are supported only to the extent of non-conflicting specifications:
.TS
expand tab(|);
lw(1.2i)fB lw(2.95i)fB lw(1.5i)fB .
media type|description|resolution
_
.T&
lp-2fB lp-2 lp-2 .
T{
.na
application/dicom (RFC 3240)
T}|T{
.na
the required \fBid\fP parameter may be optional according to RFC 3240
T}|T{
.na
parameters are not checked
T}
T{
.na
multipart/voice\-message (RFCs 2421, 2423)
T}|T{
.na
Requires text/directory be present.
Requires certain content in text/directory.
This poses a couple of problems for \fBmparse\fP:
the text/directory, being contained within the multipart/voice\-message wrapper,
will not have been detected when the multipart/voice\-message Content-Type field is seen,
and \fBmparse\fP does not attempt to interpret text body content as that is a display issue.
T}|T{
.na
Content is not checked.
Application code requiring conformance with the relevant RFCs should provide checks.
N.B. RFC 3801 (obsoletes 2421) has dropped the text/directory requirements.
T}
T{
.na
text/directory (RFCs 2425, 2927)
T}|T{
.na
Requires charset parameter.
Default encoding 8bit (illegal per 2045 [6.1];
encoding always defaults to 7bit in the absence of a Content-Transfer-Encoding field).
T}|T{
.na
\fBcharset\fP parameter is checked for presence.
RFC 2045 encoding rules are used.
Content is uninterpreted.
T}
.TE
.DE .DS .PP Many field definitions have since been obsoleted (e.g. due to name changes from RFC 1327 to 2156). .B Mparse recognizes the obsoleted names as equivalent to the current names. .DE .DS .PP It should also be noted that undefined fields (\fIi.e.\fP those not defined in a stable, public document like an RFC) cannot be checked for syntax since there is no official syntax definition. They are treated as extension fields which may contain arbitrary text. Applications which are intended to support user-defined or undefined fields are responsible for syntax checks, field count checks, and recognizing the field from among other user-defined or extension fields. .DE .DS .SH SEE ALSO The file .B generating for guidelines on generating RFC-compliant messages. .DE .DS .SH AUTHOR Bruce Lilly .DE