Implements a decoder for Apple BinHex 4.0-encoded content.
TIdDecoderBinHex4 = class(TIdDecoder4to3);
TIdDecoderBinHex4 is a TIdDecoder4to3 descendant that implements a decoder for Apple BinHex 4.0-encoded content, as described in the Internet Standards document:
RFC 1741 - MIME Content Type for BinHex Encoded Files
TIdDecoderBinHex4 is a variant of a 4-byte-to-3-byte decoder, but the format uses the byte value $90 (hexadecimal) as a marker for sequences of repeating characters. This allows some compression, but it also prevents the format from being handled as just another 4-to-3 decoder.
As per the RFC, BinHex-encoded data must be encapsulated in a MIME part; it cannot be coded directly inline in an email "body". The part is strictly defined to have a header entry (with the appropriate file name in place of "myfile.ext"):
Content-Type: application/mac-binhex40; name="myfile.ext"
After the header, the part must start with the text (not indented):
(This file must be converted with BinHex 4.0)
This leaves two ways to identify BinHex content, and some ambiguity between them: the Content-Type or the initial text line. However, the RFC also states that any text before the specified text line must be ignored, implying that the line does not have to come first, which is an apparent contradiction. The encoded file then follows, split by CRLFs (to avoid lines that are too long for email) that must be discarded.
The encoded file starts with a colon (:), followed by a header and the file contents, and ends with another colon.
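The framing described above can be sketched as follows. This is a minimal Python illustration, not Indy code; the function name extract_binhex_payload is hypothetical, and the sketch assumes that no colon appears between the identification line and the starting colon.

```python
IDENT = "(This file must be converted with BinHex 4.0)"

def extract_binhex_payload(text: str) -> str:
    """Return the encoded characters between the starting and ending colons."""
    # Any text before the identification line is ignored. The line itself is
    # treated as optional, so the search starts at 0 when it is absent.
    pos = text.find(IDENT)
    search_from = pos + len(IDENT) if pos >= 0 else 0

    start = text.find(":", search_from)
    if start < 0:
        raise ValueError("no starting colon found")
    end = text.find(":", start + 1)
    if end < 0:
        raise ValueError("no ending colon found")

    # Line breaks only exist to keep lines short for email; discard them.
    return text[start + 1:end].replace("\r", "").replace("\n", "")
```

The returned string still has to go through the 6-bit decoding and run-length expansion steps before the header and forks can be read.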
There is also an interesting article on the web, "BinHex 4.0 Definition by Peter N Lewis, Aug 1991", which has very useful information on what is implemented in practice, and seems to come with the good provenance of bitter experience.
The BinHex format
Here is a description of the Hqx7 (7 bit format as implemented in BinHex 4.0) formats for Macintosh Application and File transfers.
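The main features of the format are:
1) Error checking even using ASCII download
2) Compression of repetitive characters
3) 7 bit encoding for ASCII download
BinHex format is processed at three different levels: 8 bit encoding of the file (the layout in the table that follows), compression of repetitive characters, and 7 bit encoding.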
Data Type | Description
Byte | Length of the FileName (1..63)
Bytes | FileName bytes (up to "Length" bytes)
Byte | Version
Long | Type
Long | Creator
Word | Flags (And $F800)
Long | Length of Data Fork
Long | Length of Resource Fork
Word | CRC
Bytes | Data Fork ("Data Length" bytes)
Word | CRC
Bytes | Resource Fork ("Rsrc Length" bytes)
Word | CRC
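As an illustration of the layout in the table, the header could be read from the fully decoded byte stream roughly like this. This is a Python sketch, not Indy code: the names BinHexHeader and parse_header are hypothetical, "Long" and "Word" are assumed to be 4-byte and 2-byte big-endian values (the Macintosh convention), and the one-byte field after the file name is read as a version byte. The CRC is read but not verified.

```python
import struct
from typing import NamedTuple

class BinHexHeader(NamedTuple):
    name: bytes
    version: int
    file_type: bytes
    creator: bytes
    flags: int
    data_len: int
    rsrc_len: int
    header_crc: int
    size: int  # total header bytes consumed, including the header CRC

# Fixed-size fields that follow the file name (assumed big-endian, no padding):
# Byte, Long (Type), Long (Creator), Word, Long, Long, Word
_FIXED = struct.Struct(">B4s4sHIIH")

def parse_header(buf: bytes) -> BinHexHeader:
    name_len = buf[0]
    if not 1 <= name_len <= 63:
        raise ValueError("file name length must be in 1..63")
    name = buf[1:1 + name_len]
    version, ftype, creator, flags, dlen, rlen, crc = _FIXED.unpack_from(buf, 1 + name_len)
    return BinHexHeader(name, version, ftype, creator, flags, dlen, rlen, crc,
                        1 + name_len + _FIXED.size)
```

The data fork would then start at offset size in the decoded stream, followed by its 2-byte CRC and the resource fork.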
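Runs of repeating characters are compressed using $90 as the marker, followed by the repeat count; a literal $90 is written as 90 00:
00 11 22 33 44 55 66 77 -> 00 11 22 33 44 55 66 77
11 22 22 22 22 22 22 33 -> 11 22 90 06 33
11 22 90 33 44 -> 11 22 90 00 33 44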
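Read from right to left, the examples above show what a decoder has to undo. A minimal Python sketch of that expansion follows; it is an illustration rather than the Indy implementation, and the name rle_expand is hypothetical. It assumes the count byte gives the total length of the run, including the byte already written, which is what the second example implies.

```python
RLE_MARKER = 0x90

def rle_expand(data: bytes) -> bytes:
    """Expand $90 run-length sequences in an already 6-bit-decoded stream."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        i += 1
        if b != RLE_MARKER:
            out.append(b)
            continue
        if i >= len(data):
            raise ValueError("marker at end of input has no count byte")
        count = data[i]
        i += 1
        if count == 0:
            # 90 00 stands for a literal $90 byte.
            out.append(RLE_MARKER)
        else:
            if not out:
                raise ValueError("run marker with no preceding byte")
            # The previous byte is already in the output once, so add count - 1 more.
            out.extend(out[-1:] * (count - 1))
    return bytes(out)
```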
The whole file is considered as a stream of bits. This stream will be divided in blocks of 6 bits and then converted to one of 64 characters contained in a table. The characters in this table have been chosen for maximum noise protection. The format will start with a ":" (first character on a line) and end with a ":". There will be a maximum of 64 characters on a line. It must be preceded by this comment, starting in column 1 (it does not start in column 1 in this document):
(This file must be converted with BinHex 4.0)
Any text before this comment is to be ignored.
Use GBinHex4CodeTable to access the characters used in the encoded file format.
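The 6-bit step can be sketched in Python as below. This is an illustration only: the alphabet is the 64-character table listed in RFC 1741, while the names BINHEX4_TABLE and decode_6bit are hypothetical and are not the Indy identifiers. The sketch assumes the colons and line breaks have already been removed, and it silently drops any leftover bits at the end of the input.

```python
# The 64 characters of the BinHex 4.0 alphabet, in value order (0..63).
BINHEX4_TABLE = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"

def decode_6bit(chars: str) -> bytes:
    """Turn BinHex characters back into bytes, 6 bits per character."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for ch in chars:
        value = BINHEX4_TABLE.find(ch)
        if value < 0:
            raise ValueError(f"character {ch!r} is not in the BinHex alphabet")
        acc = (acc << 6) | value
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    return bytes(out)
```

The result of this step is still run-length encoded, so the $90 sequences have to be expanded before the header is parsed.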
Implementation Notes
There are older variants referred to in RFC 1741, but I have only come across encodings in current use as separate MIME parts, which is what this implementation targets.
When encoding into BinHex4, you do NOT have to implement the run-length encoding (the $90 marker for sequences of repeating characters), and this encoder does not do it. The CRC values generated in the header have NOT been tested (because this decoder ignores them).
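To illustrate encoding without run-length compression, here is a small Python sketch of the payload step only; it is not the Indy encoder, and the names are hypothetical. Even without compressing runs, a literal $90 byte still has to be written as 90 00 so that a decoder does not mistake it for a run marker. Header construction, CRCs, the colons, and line wrapping are left out, and the final partial group is assumed to be padded with zero bits.

```python
BINHEX4_TABLE = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"

def encode_without_rle(data: bytes) -> str:
    """6-bit encode data, escaping literal $90 bytes but compressing nothing."""
    escaped = data.replace(b"\x90", b"\x90\x00")
    acc = 0
    nbits = 0
    chars = []
    for b in escaped:
        acc = (acc << 8) | b
        nbits += 8
        while nbits >= 6:
            nbits -= 6
            chars.append(BINHEX4_TABLE[(acc >> nbits) & 0x3F])
        acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    if nbits:
        # Pad the last partial group with zero bits on the right.
        chars.append(BINHEX4_TABLE[(acc << (6 - nbits)) & 0x3F])
    return "".join(chars)
```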
The decoder has to allow for the run-length encoding. It works whether or not the data is preceded by the identification string (GBinHex4IdentificationString below). The string to be decoded must include the starting and ending colons, and it may contain embedded CRs and LFs.
Unlike base64 and quoted-printable, BinHex cannot be decoded cleanly line by line. The lines do not contain a whole number of 4-byte blocks, because the first line starts with a colon, leaving 63 bytes on that line; the run-length encoding and the header also have to be dealt with.
If the attachment has only a data fork, that fork is saved; if it has only a resource fork, that fork is saved; if it has both, only the data fork is saved. The decoder does NOT check that the CRC values are correct.
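The fork handling described above can be sketched in Python as follows. This is an illustration rather than the Indy code; split_forks and fork_to_save are hypothetical names, and header_size is assumed to count every header byte up to and including the header CRC. As in the decoder described here, the CRC words are skipped without being checked.

```python
from typing import Tuple

def split_forks(decoded: bytes, header_size: int,
                data_len: int, rsrc_len: int) -> Tuple[bytes, bytes]:
    """Slice the data and resource forks out of the fully decoded stream."""
    data_start = header_size
    data_fork = decoded[data_start:data_start + data_len]
    # A 2-byte CRC follows the data fork; skip it without verifying.
    rsrc_start = data_start + data_len + 2
    rsrc_fork = decoded[rsrc_start:rsrc_start + rsrc_len]
    if len(data_fork) != data_len or len(rsrc_fork) != rsrc_len:
        raise ValueError("decoded stream is shorter than the header claims")
    return data_fork, rsrc_fork

def fork_to_save(data_fork: bytes, rsrc_fork: bytes) -> bytes:
    """Prefer the data fork; fall back to the resource fork if it is empty."""
    return data_fork if data_fork else rsrc_fork
```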
Indy units use the content-type to decide if the part is BinHex4:
Content-Type: application/mac-binhex40; name="myfile.ext"
WARNING: This code only implements BinHex4.0 when used as a part in a MIME-encoded email. To have a part encoded, set the value in the content transfer property for the message part:
ContentTransfer := 'binhex40';
Copyright © 1993-2006, Chad Z. Hower (aka Kudzu) and the Indy Pit Crew. All rights reserved.