HOW UUENCODING WORKS

UU-encoding is a way to encode a file, which may contain any byte values,
into a standard set of printable characters (upper case letters, digits and
punctuation) so it can be reliably sent over diverse networks.

The basic scheme is to break groups of 3 eight-bit characters (24 bits) into
4 six-bit characters and then add 32 dec (a space) to each six-bit character,
which maps it to a readily transmittable character.  As some transmission
mechanisms compress or remove spaces, spaces are changed into back-quote
characters (96 dec).  (A better scheme might be to use a bias of 33 dec, so
the space is not created, but this is not done.)
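
As an illustration of the 3-to-4 split described above, here is a minimal
Python sketch.  The function name encode_group is made up for this example;
it is not taken from the UUENCODE program itself.

```python
def encode_group(b0: int, b1: int, b2: int) -> str:
    """Turn three 8-bit values into four printable characters."""
    # Split the 24 bits into four 6-bit values, high bits first.
    sextets = [
        b0 >> 2,
        ((b0 & 0x03) << 4) | (b1 >> 4),
        ((b1 & 0x0F) << 2) | (b2 >> 6),
        b2 & 0x3F,
    ]
    out = []
    for s in sextets:
        c = s + 32          # add the 32 bias
        if c == 32:         # a space would be produced ...
            c = 96          # ... so use a back-quote instead
        out.append(chr(c))
    return "".join(out)


print(encode_group(0x43, 0x61, 0x74))   # the bytes of "Cat"; prints 0V%T
```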

A small number (usually 45) of eight bit characters are encoded in a single
line and a count of them, with the same 32 bias added, is put at the start
of the line.  (Most lines in an encoded file start with "M", which is
decimal 77; subtracting the 32 bias gives 45.)
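
A sketch of how one line might be built, count character first.  The name
encode_line is hypothetical, and padding a short final group with zero bytes
is an assumption on my part; the text above does not say how a partial group
is filled.

```python
def encode_line(chunk: bytes) -> str:
    """Encode up to 45 bytes as one line: count character, then data."""
    if len(chunk) > 45:
        raise ValueError("at most 45 bytes per line")
    # Assumption: pad the last group to a multiple of 3 with zero bytes.
    padded = chunk + b"\0" * (-len(chunk) % 3)
    sextets = [len(chunk)]                      # the count leads the line
    for i in range(0, len(padded), 3):
        n = int.from_bytes(padded[i:i + 3], "big")
        sextets += [(n >> 18) & 63, (n >> 12) & 63, (n >> 6) & 63, n & 63]
    # Add the 32 bias; a zero value becomes a back-quote, not a space.
    return "".join(chr(s + 32) if s else "`" for s in sextets)


print(encode_line(b"Cat"))          # prints #0V%T
print(encode_line(bytes(45))[0])    # a full line starts with M
```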

This UUENCODE program puts a checksum character at the end of each line.
The checksum is the sum of all the encoded characters, before adding the 32
offset, modulo 64.
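
The checksum rule can be sketched as below.  Two things here are assumptions
rather than facts from this document: that only the data characters (not the
count character) are summed, and that the checksum itself is written with
the usual 32 bias and back-quote substitution.

```python
def line_checksum(data_chars: str) -> str:
    """Checksum character for the encoded data characters of one line."""
    # Remove the 32 bias from each character; "& 63" also maps a
    # back-quote (96) to the zero value it stands for.
    total = sum((ord(c) - 32) & 63 for c in data_chars) % 64
    return chr(total + 32) if total else "`"


print(line_checksum("0V%T"))    # 16 + 54 + 5 + 52 = 127, mod 64 = 63; prints _
```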

Note: Horton 9/1/87 UUENCODE has a bug in the checksum algorithm; it uses
the sum of the original, not the encoded characters.  This decode program
accepts either his checksums or the correct form.
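
Accepting either form of checksum might look something like this.  The
function name checksum_ok is hypothetical, and the sketch assumes the
checksum covers the data characters only.

```python
def checksum_ok(raw: bytes, data_chars: str, received: str) -> bool:
    """True if the received checksum character matches either variant."""
    got = (ord(received) - 32) & 63
    # Correct form: sum of the encoded six-bit values.
    correct = sum((ord(c) - 32) & 63 for c in data_chars) % 64
    # Horton 9/1/87 form: sum of the original eight-bit characters.
    horton = sum(raw) % 64
    return got in (correct, horton)


print(checksum_ok(b"Cat", "0V%T", "_"))   # correct form; prints True
print(checksum_ok(b"Cat", "0V%T", "8"))   # Horton form;  prints True
```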

The lines of encoded data can be preceded by comments and by network
addressing information.  The encoded data is directly preceded by a line
containing:

            begin <mode-number> <file-name>

This line is created by the encoding program.  The decode program scans the
file looking for "begin" in column 1.
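
A minimal sketch of that scan.  Reading the mode number as octal follows the
usual UNIX convention for file modes; the function name find_begin and the
exact pattern are illustrative only.

```python
import re

# "begin" must start in column 1, followed by a mode and a file name.
BEGIN_RE = re.compile(r"^begin\s+([0-7]+)\s+(.+?)\s*$")


def find_begin(lines):
    """Return (line index, mode, file name) of the first begin line."""
    for index, line in enumerate(lines):
        match = BEGIN_RE.match(line)
        if match:
            return index, int(match.group(1), 8), match.group(2)
    return None     # no begin line: nothing to decode


mail = ["From: someone", "Subject: prog", "", "begin 644 prog.zip", "M..."]
print(find_begin(mail))     # prints (3, 420, 'prog.zip')
```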

The final end of encoded data is an encoded line with zero encoded
characters followed by a line containing "end".
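
Recognizing this end marker could be sketched as follows.  Treating a
completely empty line as a zero count (in case a lone space was stripped in
transit) is my assumption, not something stated here.

```python
def is_end_marker(line: str, next_line: str) -> bool:
    """True for a zero-count encoded line followed by an "end" line."""
    # Remove the 32 bias from the count character; a back-quote (96)
    # and a space (32) both decode to zero.
    count = (ord(line[0]) - 32) & 63 if line else 0
    return count == 0 and next_line.strip() == "end"


print(is_end_marker("`", "end"))        # prints True
print(is_end_marker("#0V%T", "end"))    # prints False
```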

For integrity checking, this UUENCODE program puts a line containing the
number of bytes in the original file - "size nnnnn" - after the "end" line.
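
The size check might be sketched like this.  The helper name size_matches is
hypothetical; it returns None when no size line is present, since older
encoders do not write one.

```python
import re

SIZE_RE = re.compile(r"^size\s+(\d+)\s*$")


def size_matches(lines_after_end, decoded_length):
    """Compare a "size nnnnn" line, if any, with the decoded byte count."""
    for line in lines_after_end:
        match = SIZE_RE.match(line)
        if match:
            return int(match.group(1)) == decoded_length
    return None     # no size line: no integrity check possible


print(size_matches(["size 1234"], 1234))    # prints True
print(size_matches(["size 1234"], 1200))    # prints False
```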

Long files are broken into several sections before transmission. Some
networks require files to be less than 64K bytes.  Most UUDECODE programs
require a single file as input, so these sections must be combined into one
large file before decoding.  This program pair handles multi-section files
automatically: the UUENCODE can generate them and the UUDECODE can find and
decode them back into the original file.

This UUENCODE program puts a header line, containing the section number and
file name, in front of every section:
        "section <number> of uuencode of file <file name>"

At the end of a section this UUENCODE program creates a "section size nnnn"
line.  This gives the number of bytes of decoded data in the section.
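
A sketch of recognizing the two section lines.  The patterns follow the
wording shown above literally; the exact spacing the real UUENCODE writes is
an assumption, and parse_section_line is a made-up name.

```python
import re

HEADER_RE = re.compile(r"^section\s+(\d+)\s+of uuencode of file\s+(\S+)")
SECTION_SIZE_RE = re.compile(r"^section size\s+(\d+)")


def parse_section_line(line):
    """Classify a line as a section header, a section size, or neither."""
    match = SECTION_SIZE_RE.match(line)
    if match:
        return ("size", int(match.group(1)))
    match = HEADER_RE.match(line)
    if match:
        return ("header", int(match.group(1)), match.group(2))
    return None


print(parse_section_line("section 2 of uuencode of file PRG.ZIP"))
print(parse_section_line("section size 4096"))
```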

All the "integrity fields" (the checksum, the size lines, and the section
header line) are inserted so that they are ignored by older UUDECODE
programs that cannot handle them.  This decode program does not require any
of these fields; if they are not present, integrity checking is not done.
This program pair is 100% downward compatible!

When UUDECODE encounters a premature end-of-file or some data which is not
decodable it assumes the end of a file section.  UUDECODE is conservative
when it encounters data it cannot decode.

Usually this undecodable data is valid "trailer" data put at the end of the
file for data transmission purposes.  However, the file may also be bad.  So
UUDECODE continues to scan the file.  If UUDECODE then encounters a line
which is decodable, it assumes the file is bad.  Likewise, if there are more
than 30 "trailer" lines remaining in the file, UUDECODE assumes the file is
bad.
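
The trailer rule can be sketched roughly as follows.  The test for whether a
line "looks decodable" is a crude stand-in of my own; the real UUDECODE may
well judge this differently.

```python
MAX_TRAILER_LINES = 30


def looks_decodable(line: str) -> bool:
    """Crude guess: legal characters and enough data for the count."""
    if not line or not all(32 <= ord(c) <= 96 for c in line):
        return False
    count = (ord(line[0]) - 32) & 63
    needed = 4 * ((count + 2) // 3)     # data characters for count bytes
    return count > 0 and len(line) - 1 >= needed


def trailer_is_acceptable(remaining_lines) -> bool:
    """Apply the two rules: no decodable line, at most 30 lines."""
    if len(remaining_lines) > MAX_TRAILER_LINES:
        return False
    return not any(looks_decodable(line) for line in remaining_lines)


print(trailer_is_acceptable(["-- ", "Joe User", "joe@example.com"]))  # True
print(trailer_is_acceptable(["-- ", "#0V%T"]))                        # False
```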

Sometimes UUDECODE may reject a valid file.  If this happens, the file must
be edited to make it acceptable to UUDECODE.

When UUDECODE encounters a valid end of file section it must get the next
file in sequence.  If the file name ends with a number, UUDECODE tries the
next file in sequence.  Otherwise UUDECODE asks you for a file name.  If the
file it tries does not contain decodable data, UUDECODE asks for another
file name to try.
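
Working out the next file name in sequence might look like this.  Keeping
leading zeros in the number is an assumption, and next_section_name is a
hypothetical name; a None result stands for "ask the user".

```python
import os
import re


def next_section_name(filename):
    """Bump a trailing number in the base name, or return None."""
    base, ext = os.path.splitext(filename)
    match = re.search(r"(\d+)$", base)
    if not match:
        return None                     # no number: must ask for a name
    digits = match.group(1)
    bumped = str(int(digits) + 1).zfill(len(digits))
    return base[:match.start()] + bumped + ext


print(next_section_name("PRG1.UUE"))    # prints PRG2.UUE
```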

Sometimes files come across in shell archives that automatically check
sequencing and call uudecode for you on UNIX systems.  If you prefer to
download the raw files to MS-DOS, this UUDECODE will filter through the
current form of shell script and decode the data automatically.  That is,
just use this UUDECODE and do not mess with the shell script.

There is one more rarely used feature of UUENCODE: many input files can be
encoded into one large encode file.  I have never seen this used.  The end
of an input file is a zero length encoded line, followed by another "begin"
line instead of by an "end" line.  This UUDECODE will decode this sort of
file, but the UUENCODE will only handle a single input file.

UUDECODE is called with options and one or more parameters:

         First:  the input file name.  This must be supplied.
                 filename. (note period):    extension is spaces
                 filename  (no period here): extension is defaulted to UUE

         Second: Optionally, the result file name which overrides the
                 name on the first "begin" line in the input file.

         Next:   Result file names for other files of a multi-file encode.
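
The extension rule for the first parameter can be sketched as below.  The
function name resolve_input_name is made up, and the handling of directory
paths is an assumption.

```python
def resolve_input_name(arg: str) -> str:
    """Apply the default extension rule to the input file parameter."""
    if arg.endswith("."):
        return arg[:-1]                 # "filename." : blank extension
    # Look for a period only in the last path component.
    name = arg.replace("/", "\\").rsplit("\\", 1)[-1]
    if "." in name:
        return arg                      # explicit extension: keep it
    return arg + ".UUE"                 # "filename"  : default to UUE


print(resolve_input_name("PRG1"))       # prints PRG1.UUE
print(resolve_input_name("PRG1."))      # prints PRG1
```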


This works well for me.  On UNIX I find a program I want in three sections:
             PRG1, PRG2, PRG3.
I copy the three files down to my PC as PRG1.UUE, PRG2.UUE, and PRG3.UUE.  I
then just enter UUDECODE PRG1 and the thing decodes.


Done privately and not for profit.  Suggestions appreciated.

Richard Marks
931 Sulgrave Lane
Bryn Mawr, PA 19010