Posted to commits@corinthia.apache.org by ja...@apache.org on 2015/03/23 17:19:51 UTC
[63/83] [abbrv] incubator-corinthia git commit: removed zlib
http://git-wip-us.apache.org/repos/asf/incubator-corinthia/blob/1a48f7c3/DocFormats/platform/3rdparty/zlib-1.2.8/doc/rfc1952.txt
----------------------------------------------------------------------
diff --git a/DocFormats/platform/3rdparty/zlib-1.2.8/doc/rfc1952.txt b/DocFormats/platform/3rdparty/zlib-1.2.8/doc/rfc1952.txt
deleted file mode 100644
index a8e51b4..0000000
--- a/DocFormats/platform/3rdparty/zlib-1.2.8/doc/rfc1952.txt
+++ /dev/null
@@ -1,675 +0,0 @@
-
-
-
-
-
-
-Network Working Group P. Deutsch
-Request for Comments: 1952 Aladdin Enterprises
-Category: Informational May 1996
-
-
- GZIP file format specification version 4.3
-
-Status of This Memo
-
- This memo provides information for the Internet community. This memo
- does not specify an Internet standard of any kind. Distribution of
- this memo is unlimited.
-
-IESG Note:
-
- The IESG takes no position on the validity of any Intellectual
- Property Rights statements contained in this document.
-
-Notices
-
- Copyright (c) 1996 L. Peter Deutsch
-
- Permission is granted to copy and distribute this document for any
- purpose and without charge, including translations into other
- languages and incorporation into compilations, provided that the
- copyright notice and this notice are preserved, and that any
- substantive changes or deletions from the original are clearly
- marked.
-
- A pointer to the latest version of this and related documentation in
- HTML format can be found at the URL
- <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>.
-
-Abstract
-
- This specification defines a lossless compressed data format that is
- compatible with the widely used GZIP utility. The format includes a
- cyclic redundancy check value for detecting data corruption. The
- format presently uses the DEFLATE method of compression but can be
- easily extended to use other compression methods. The format can be
- implemented readily in a manner not covered by patents.
-
-
-
-
-
-
-
-
-
-
-Deutsch Informational [Page 1]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
-Table of Contents
-
- 1. Introduction ................................................... 2
- 1.1. Purpose ................................................... 2
- 1.2. Intended audience ......................................... 3
- 1.3. Scope ..................................................... 3
- 1.4. Compliance ................................................ 3
- 1.5. Definitions of terms and conventions used ................. 3
- 1.6. Changes from previous versions ............................ 3
- 2. Detailed specification ......................................... 4
- 2.1. Overall conventions ....................................... 4
- 2.2. File format ............................................... 5
- 2.3. Member format ............................................. 5
- 2.3.1. Member header and trailer ........................... 6
- 2.3.1.1. Extra field ................................... 8
- 2.3.1.2. Compliance .................................... 9
- 3. References .................................................. 9
- 4. Security Considerations .................................... 10
- 5. Acknowledgements ........................................... 10
- 6. Author's Address ........................................... 10
- 7. Appendix: Jean-Loup Gailly's gzip utility .................. 11
- 8. Appendix: Sample CRC Code .................................. 11
-
-1. Introduction
-
- 1.1. Purpose
-
- The purpose of this specification is to define a lossless
- compressed data format that:
-
- * Is independent of CPU type, operating system, file system,
- and character set, and hence can be used for interchange;
- * Can compress or decompress a data stream (as opposed to a
- randomly accessible file) to produce another data stream,
- using only an a priori bounded amount of intermediate
- storage, and hence can be used in data communications or
- similar structures such as Unix filters;
- * Compresses data with efficiency comparable to the best
- currently available general-purpose compression methods,
- and in particular considerably better than the "compress"
- program;
- * Can be implemented readily in a manner not covered by
- patents, and hence can be practiced freely;
- * Is compatible with the file format produced by the current
- widely used gzip utility, in that conforming decompressors
- will be able to read data produced by the existing gzip
- compressor.
-
-
-
-
-Deutsch Informational [Page 2]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- The data format defined by this specification does not attempt to:
-
- * Provide random access to compressed data;
- * Compress specialized data (e.g., raster graphics) as well as
- the best currently available specialized algorithms.
-
- 1.2. Intended audience
-
- This specification is intended for use by implementors of software
- to compress data into gzip format and/or decompress data from gzip
- format.
-
- The text of the specification assumes a basic background in
- programming at the level of bits and other primitive data
- representations.
-
- 1.3. Scope
-
- The specification specifies a compression method and a file format
- (the latter assuming only that a file can store a sequence of
- arbitrary bytes). It does not specify any particular interface to
- a file system or anything about character sets or encodings
- (except for file names and comments, which are optional).
-
- 1.4. Compliance
-
- Unless otherwise indicated below, a compliant decompressor must be
- able to accept and decompress any file that conforms to all the
- specifications presented here; a compliant compressor must produce
- files that conform to all the specifications presented here. The
- material in the appendices is not part of the specification per se
- and is not relevant to compliance.
-
- 1.5. Definitions of terms and conventions used
-
- byte: 8 bits stored or transmitted as a unit (same as an octet).
- (For this specification, a byte is exactly 8 bits, even on
- machines which store a character on a number of bits different
- from 8.) See below for the numbering of bits within a byte.
-
- 1.6. Changes from previous versions
-
- There have been no technical changes to the gzip format since
- version 4.1 of this specification. In version 4.2, some
- terminology was changed, and the sample CRC code was rewritten for
- clarity and to eliminate the requirement for the caller to do pre-
- and post-conditioning. Version 4.3 is a conversion of the
- specification to RFC style.
-
-
-
-Deutsch Informational [Page 3]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
-2. Detailed specification
-
- 2.1. Overall conventions
-
- In the diagrams below, a box like this:
-
- +---+
- | | <-- the vertical bars might be missing
- +---+
-
- represents one byte; a box like this:
-
- +==============+
- | |
- +==============+
-
- represents a variable number of bytes.
-
- Bytes stored within a computer do not have a "bit order", since
- they are always treated as a unit. However, a byte considered as
- an integer between 0 and 255 does have a most- and least-
- significant bit, and since we write numbers with the most-
- significant digit on the left, we also write bytes with the most-
- significant bit on the left. In the diagrams below, we number the
- bits of a byte so that bit 0 is the least-significant bit, i.e.,
- the bits are numbered:
-
- +--------+
- |76543210|
- +--------+
-
- This document does not address the issue of the order in which
- bits of a byte are transmitted on a bit-sequential medium, since
- the data format described here is byte- rather than bit-oriented.
-
- Within a computer, a number may occupy multiple bytes. All
- multi-byte numbers in the format described here are stored with
- the least-significant byte first (at the lower memory address).
- For example, the decimal number 520 is stored as:
-
- 0 1
- +--------+--------+
- |00001000|00000010|
- +--------+--------+
- ^ ^
- | |
- | + more significant byte = 2 x 256
- + less significant byte = 8
-
-
-
-Deutsch Informational [Page 4]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- 2.2. File format
-
- A gzip file consists of a series of "members" (compressed data
- sets). The format of each member is specified in the following
- section. The members simply appear one after another in the file,
- with no additional information before, between, or after them.
-
- 2.3. Member format
-
- Each member has the following structure:
-
- +---+---+---+---+---+---+---+---+---+---+
- |ID1|ID2|CM |FLG| MTIME |XFL|OS | (more-->)
- +---+---+---+---+---+---+---+---+---+---+
-
- (if FLG.FEXTRA set)
-
- +---+---+=================================+
- | XLEN |...XLEN bytes of "extra field"...| (more-->)
- +---+---+=================================+
-
- (if FLG.FNAME set)
-
- +=========================================+
- |...original file name, zero-terminated...| (more-->)
- +=========================================+
-
- (if FLG.FCOMMENT set)
-
- +===================================+
- |...file comment, zero-terminated...| (more-->)
- +===================================+
-
- (if FLG.FHCRC set)
-
- +---+---+
- | CRC16 |
- +---+---+
-
- +=======================+
- |...compressed blocks...| (more-->)
- +=======================+
-
- 0 1 2 3 4 5 6 7
- +---+---+---+---+---+---+---+---+
- | CRC32 | ISIZE |
- +---+---+---+---+---+---+---+---+
-
-
-
-
-Deutsch Informational [Page 5]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- 2.3.1. Member header and trailer
-
- ID1 (IDentification 1)
- ID2 (IDentification 2)
- These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139
- (0x8b, \213), to identify the file as being in gzip format.
-
- CM (Compression Method)
- This identifies the compression method used in the file. CM
- = 0-7 are reserved. CM = 8 denotes the "deflate"
- compression method, which is the one customarily used by
- gzip and which is documented elsewhere.
-
- FLG (FLaGs)
- This flag byte is divided into individual bits as follows:
-
- bit 0 FTEXT
- bit 1 FHCRC
- bit 2 FEXTRA
- bit 3 FNAME
- bit 4 FCOMMENT
- bit 5 reserved
- bit 6 reserved
- bit 7 reserved
-
- If FTEXT is set, the file is probably ASCII text. This is
- an optional indication, which the compressor may set by
- checking a small amount of the input data to see whether any
- non-ASCII characters are present. In case of doubt, FTEXT
- is cleared, indicating binary data. For systems which have
- different file formats for ASCII text and binary data, the
- decompressor can use FTEXT to choose the appropriate format.
- We deliberately do not specify the algorithm used to set
- this bit, since a compressor always has the option of
- leaving it cleared and a decompressor always has the option
- of ignoring it and letting some other program handle issues
- of data conversion.
-
- If FHCRC is set, a CRC16 for the gzip header is present,
- immediately before the compressed data. The CRC16 consists
- of the two least significant bytes of the CRC32 for all
- bytes of the gzip header up to and not including the CRC16.
- [The FHCRC bit was never set by versions of gzip up to
- 1.2.4, even though it was documented with a different
- meaning in gzip 1.2.4.]
-
- If FEXTRA is set, optional extra fields are present, as
- described in a following section.
-
-
-
-Deutsch Informational [Page 6]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- If FNAME is set, an original file name is present,
- terminated by a zero byte. The name must consist of ISO
- 8859-1 (LATIN-1) characters; on operating systems using
- EBCDIC or any other character set for file names, the name
- must be translated to the ISO LATIN-1 character set. This
- is the original name of the file being compressed, with any
- directory components removed, and, if the file being
- compressed is on a file system with case insensitive names,
- forced to lower case. There is no original file name if the
- data was compressed from a source other than a named file;
- for example, if the source was stdin on a Unix system, there
- is no file name.
-
- If FCOMMENT is set, a zero-terminated file comment is
- present. This comment is not interpreted; it is only
- intended for human consumption. The comment must consist of
- ISO 8859-1 (LATIN-1) characters. Line breaks should be
- denoted by a single line feed character (10 decimal).
-
- Reserved FLG bits must be zero.
-
- MTIME (Modification TIME)
- This gives the most recent modification time of the original
- file being compressed. The time is in Unix format, i.e.,
- seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this
- may cause problems for MS-DOS and other systems that use
- local rather than Universal time.) If the compressed data
- did not come from a file, MTIME is set to the time at which
- compression started. MTIME = 0 means no time stamp is
- available.
-
- XFL (eXtra FLags)
- These flags are available for use by specific compression
- methods. The "deflate" method (CM = 8) sets these flags as
- follows:
-
- XFL = 2 - compressor used maximum compression,
- slowest algorithm
- XFL = 4 - compressor used fastest algorithm
-
- OS (Operating System)
- This identifies the type of file system on which compression
- took place. This may be useful in determining end-of-line
- convention for text files. The currently defined values are
- as follows:
-
-
-
-
-
-
-Deutsch Informational [Page 7]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- 0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)
- 1 - Amiga
- 2 - VMS (or OpenVMS)
- 3 - Unix
- 4 - VM/CMS
- 5 - Atari TOS
- 6 - HPFS filesystem (OS/2, NT)
- 7 - Macintosh
- 8 - Z-System
- 9 - CP/M
- 10 - TOPS-20
- 11 - NTFS filesystem (NT)
- 12 - QDOS
- 13 - Acorn RISCOS
- 255 - unknown
-
- XLEN (eXtra LENgth)
- If FLG.FEXTRA is set, this gives the length of the optional
- extra field. See below for details.
-
- CRC32 (CRC-32)
- This contains a Cyclic Redundancy Check value of the
- uncompressed data computed according to CRC-32 algorithm
- used in the ISO 3309 standard and in section 8.1.1.6.2 of
- ITU-T recommendation V.42. (See http://www.iso.ch for
- ordering ISO documents. See gopher://info.itu.ch for an
- online version of ITU-T V.42.)
-
- ISIZE (Input SIZE)
- This contains the size of the original (uncompressed) input
- data modulo 2^32.
-
- 2.3.1.1. Extra field
-
- If the FLG.FEXTRA bit is set, an "extra field" is present in
- the header, with total length XLEN bytes. It consists of a
- series of subfields, each of the form:
-
- +---+---+---+---+==================================+
- |SI1|SI2| LEN |... LEN bytes of subfield data ...|
- +---+---+---+---+==================================+
-
- SI1 and SI2 provide a subfield ID, typically two ASCII letters
- with some mnemonic value. Jean-Loup Gailly
- <gz...@prep.ai.mit.edu> is maintaining a registry of subfield
- IDs; please send him any subfield ID you wish to use. Subfield
- IDs with SI2 = 0 are reserved for future use. The following
- IDs are currently defined:
-
-
-
-Deutsch Informational [Page 8]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- SI1 SI2 Data
- ---------- ---------- ----
- 0x41 ('A') 0x70 ('P') Apollo file type information
-
- LEN gives the length of the subfield data, excluding the 4
- initial bytes.
-
- 2.3.1.2. Compliance
-
- A compliant compressor must produce files with correct ID1,
- ID2, CM, CRC32, and ISIZE, but may set all the other fields in
- the fixed-length part of the header to default values (255 for
- OS, 0 for all others). The compressor must set all reserved
- bits to zero.
-
- A compliant decompressor must check ID1, ID2, and CM, and
- provide an error indication if any of these have incorrect
- values. It must examine FEXTRA/XLEN, FNAME, FCOMMENT and FHCRC
- at least so it can skip over the optional fields if they are
- present. It need not examine any other part of the header or
- trailer; in particular, a decompressor may ignore FTEXT and OS
- and always produce binary output, and still be compliant. A
- compliant decompressor must give an error indication if any
- reserved bit is non-zero, since such a bit could indicate the
- presence of a new field that would cause subsequent data to be
- interpreted incorrectly.
-
-3. References
-
- [1] "Information Processing - 8-bit single-byte coded graphic
- character sets - Part 1: Latin alphabet No.1" (ISO 8859-1:1987).
- The ISO 8859-1 (Latin-1) character set is a superset of 7-bit
- ASCII. Files defining this character set are available as
- iso_8859-1.* in ftp://ftp.uu.net/graphics/png/documents/
-
- [2] ISO 3309
-
- [3] ITU-T recommendation V.42
-
- [4] Deutsch, L.P.,"DEFLATE Compressed Data Format Specification",
- available in ftp://ftp.uu.net/pub/archiving/zip/doc/
-
- [5] Gailly, J.-L., GZIP documentation, available as gzip-*.tar in
- ftp://prep.ai.mit.edu/pub/gnu/
-
- [6] Sarwate, D.V., "Computation of Cyclic Redundancy Checks via Table
- Look-Up", Communications of the ACM, 31(8), pp.1008-1013.
-
-
-
-
-Deutsch Informational [Page 9]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- [7] Schwaderer, W.D., "CRC Calculation", April 85 PC Tech Journal,
- pp.118-133.
-
- [8] ftp://ftp.adelaide.edu.au/pub/rocksoft/papers/crc_v3.txt,
- describing the CRC concept.
-
-4. Security Considerations
-
- Any data compression method involves the reduction of redundancy in
- the data. Consequently, any corruption of the data is likely to have
- severe effects and be difficult to correct. Uncompressed text, on
- the other hand, will probably still be readable despite the presence
- of some corrupted bytes.
-
- It is recommended that systems using this data format provide some
- means of validating the integrity of the compressed data, such as by
- setting and checking the CRC-32 check value.
-
-5. Acknowledgements
-
- Trademarks cited in this document are the property of their
- respective owners.
-
- Jean-Loup Gailly designed the gzip format and wrote, with Mark Adler,
- the related software described in this specification. Glenn
- Randers-Pehrson converted this document to RFC and HTML format.
-
-6. Author's Address
-
- L. Peter Deutsch
- Aladdin Enterprises
- 203 Santa Margarita Ave.
- Menlo Park, CA 94025
-
- Phone: (415) 322-0103 (AM only)
- FAX: (415) 322-1734
- EMail: <gh...@aladdin.com>
-
- Questions about the technical content of this specification can be
- sent by email to:
-
- Jean-Loup Gailly <gz...@prep.ai.mit.edu> and
- Mark Adler <ma...@alumni.caltech.edu>
-
- Editorial comments on this specification can be sent by email to:
-
- L. Peter Deutsch <gh...@aladdin.com> and
- Glenn Randers-Pehrson <ra...@alumni.rpi.edu>
-
-
-
-Deutsch Informational [Page 10]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
-7. Appendix: Jean-Loup Gailly's gzip utility
-
- The most widely used implementation of gzip compression, and the
- original documentation on which this specification is based, were
- created by Jean-Loup Gailly <gz...@prep.ai.mit.edu>. Since this
- implementation is a de facto standard, we mention some more of its
- features here. Again, the material in this section is not part of
- the specification per se, and implementations need not follow it to
- be compliant.
-
- When compressing or decompressing a file, gzip preserves the
- protection, ownership, and modification time attributes on the local
- file system, since there is no provision for representing protection
- attributes in the gzip file format itself. Since the file format
- includes a modification time, the gzip decompressor provides a
- command line switch that assigns the modification time from the file,
- rather than the local modification time of the compressed input, to
- the decompressed output.
-
-8. Appendix: Sample CRC Code
-
- The following sample code represents a practical implementation of
- the CRC (Cyclic Redundancy Check). (See also ISO 3309 and ITU-T V.42
- for a formal specification.)
-
- The sample code is in the ANSI C programming language. Non C users
- may find it easier to read with these hints:
-
- & Bitwise AND operator.
- ^ Bitwise exclusive-OR operator.
- >> Bitwise right shift operator. When applied to an
- unsigned quantity, as here, right shift inserts zero
- bit(s) at the left.
- ! Logical NOT operator.
- ++ "n++" increments the variable n.
- 0xNNN 0x introduces a hexadecimal (base 16) constant.
- Suffix L indicates a long value (at least 32 bits).
-
- /* Table of CRCs of all 8-bit messages. */
- unsigned long crc_table[256];
-
- /* Flag: has the table been computed? Initially false. */
- int crc_table_computed = 0;
-
- /* Make the table for a fast CRC. */
- void make_crc_table(void)
- {
- unsigned long c;
-
-
-
-Deutsch Informational [Page 11]
-
-RFC 1952 GZIP File Format Specification May 1996
-
-
- int n, k;
- for (n = 0; n < 256; n++) {
- c = (unsigned long) n;
- for (k = 0; k < 8; k++) {
- if (c & 1) {
- c = 0xedb88320L ^ (c >> 1);
- } else {
- c = c >> 1;
- }
- }
- crc_table[n] = c;
- }
- crc_table_computed = 1;
- }
-
- /*
- Update a running crc with the bytes buf[0..len-1] and return
- the updated crc. The crc should be initialized to zero. Pre- and
- post-conditioning (one's complement) is performed within this
- function so it shouldn't be done by the caller. Usage example:
-
- unsigned long crc = 0L;
-
- while (read_buffer(buffer, length) != EOF) {
- crc = update_crc(crc, buffer, length);
- }
- if (crc != original_crc) error();
- */
- unsigned long update_crc(unsigned long crc,
- unsigned char *buf, int len)
- {
- unsigned long c = crc ^ 0xffffffffL;
- int n;
-
- if (!crc_table_computed)
- make_crc_table();
- for (n = 0; n < len; n++) {
- c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
- }
- return c ^ 0xffffffffL;
- }
-
- /* Return the CRC of the bytes buf[0..len-1]. */
- unsigned long crc(unsigned char *buf, int len)
- {
- return update_crc(0L, buf, len);
- }
-
-
-
-
-Deutsch Informational [Page 12]
-
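The member header layout in section 2.3 of the removed rfc1952.txt can be sketched in C. This is a minimal illustration, not code from zlib or from this commit: the name gz_header_length and the GZ_* macros are hypothetical. Given a buffer, it checks ID1, ID2, CM and the reserved FLG bits, reads MTIME least-significant byte first, and skips the optional fields to find where the compressed blocks would begin. It does not verify the header CRC16 or inspect the trailer.

```c
#include <stddef.h>
#include <stdint.h>

/* FLG bits as listed in section 2.3.1 (macro names are illustrative). */
#define GZ_FHCRC    0x02
#define GZ_FEXTRA   0x04
#define GZ_FNAME    0x08
#define GZ_FCOMMENT 0x10
#define GZ_RESERVED 0xe0

/* Read a 16-bit value stored least-significant byte first. */
static unsigned gz_le16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Read a 32-bit value stored least-significant byte first. */
static uint32_t gz_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Skip a zero-terminated field; return the new offset or 0 if truncated. */
static size_t gz_skip_string(const unsigned char *buf, size_t len, size_t pos)
{
    while (pos < len && buf[pos] != 0)
        pos++;
    return pos < len ? pos + 1 : 0;
}

/*
 * Hypothetical helper: return the offset of the first byte after the
 * member header, or -1 if the header is invalid or truncated.
 * If mtime is non-NULL, the MTIME field is stored through it.
 */
long gz_header_length(const unsigned char *buf, size_t len, uint32_t *mtime)
{
    size_t pos = 10;                /* fixed-length part: ID1..OS */
    unsigned flg, xlen;

    if (len < 10)
        return -1;
    if (buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
        return -1;                  /* wrong ID1, ID2 or CM */
    flg = buf[3];
    if (flg & GZ_RESERVED)
        return -1;                  /* reserved FLG bits must be zero */
    if (mtime != NULL)
        *mtime = gz_le32(buf + 4);

    if (flg & GZ_FEXTRA) {
        if (len - pos < 2)
            return -1;
        xlen = gz_le16(buf + pos);
        pos += 2;
        if (len - pos < xlen)
            return -1;
        pos += xlen;                /* skip the subfields unexamined */
    }
    if (flg & GZ_FNAME) {
        pos = gz_skip_string(buf, len, pos);
        if (pos == 0)
            return -1;
    }
    if (flg & GZ_FCOMMENT) {
        pos = gz_skip_string(buf, len, pos);
        if (pos == 0)
            return -1;
    }
    if (flg & GZ_FHCRC) {
        if (len - pos < 2)
            return -1;
        pos += 2;                   /* CRC16 is skipped, not verified */
    }
    return (long)pos;
}
```

Rejecting any header with a reserved FLG bit set follows the compliance rule in section 2.3.1.2: such a bit could announce a field this parser does not know how to skip.
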
http://git-wip-us.apache.org/repos/asf/incubator-corinthia/blob/1a48f7c3/DocFormats/platform/3rdparty/zlib-1.2.8/doc/txtvsbin.txt
----------------------------------------------------------------------
diff --git a/DocFormats/platform/3rdparty/zlib-1.2.8/doc/txtvsbin.txt b/DocFormats/platform/3rdparty/zlib-1.2.8/doc/txtvsbin.txt
deleted file mode 100644
index 3d0f063..0000000
--- a/DocFormats/platform/3rdparty/zlib-1.2.8/doc/txtvsbin.txt
+++ /dev/null
@@ -1,107 +0,0 @@
-A Fast Method for Identifying Plain Text Files
-==============================================
-
-
-Introduction
-------------
-
-Given a file coming from an unknown source, it is sometimes desirable
-to find out whether the format of that file is plain text. Although
-this may appear to be a simple task, a fully accurate detection of the
-file type requires heavy-duty semantic analysis on the file contents.
-It is, however, possible to obtain satisfactory results by employing
-various heuristics.
-
-Previous versions of PKZip and other zip-compatible compression tools
-were using a crude detection scheme: if more than 80% (4/5) of the bytes
-found in a certain buffer are within the range [7..127], the file is
-labeled as plain text, otherwise it is labeled as binary. A prominent
-limitation of this scheme is the restriction to Latin-based alphabets.
-Other alphabets, like Greek, Cyrillic or Asian, make extensive use of
-the bytes within the range [128..255], and texts using these alphabets
-are most often misidentified by this scheme; in other words, the rate
-of false negatives is sometimes too high, which means that the recall
-is low. Another weakness of this scheme is a reduced precision, due to
-the false positives that may occur when binary files containing large
-amounts of textual characters are misidentified as plain text.
-
-In this article we propose a new, simple detection scheme that features
-a much increased precision and a near-100% recall. This scheme is
-designed to work on ASCII, Unicode and other ASCII-derived alphabets,
-and it handles single-byte encodings (ISO-8859, MacRoman, KOI8, etc.)
-and variable-sized encodings (ISO-2022, UTF-8, etc.). Wider encodings
-(UCS-2/UTF-16 and UCS-4/UTF-32) are not handled, however.
-
-
-The Algorithm
--------------
-
-The algorithm works by dividing the set of bytecodes [0..255] into three
-categories:
-- The white list of textual bytecodes:
- 9 (TAB), 10 (LF), 13 (CR), 32 (SPACE) to 255.
-- The gray list of tolerated bytecodes:
- 7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB), 27 (ESC).
-- The black list of undesired, non-textual bytecodes:
- 0 (NUL) to 6, 14 to 31.
-
-If a file contains at least one byte that belongs to the white list and
-no byte that belongs to the black list, then the file is categorized as
-plain text; otherwise, it is categorized as binary. (The boundary case,
-when the file is empty, automatically falls into the latter category.)
-
-
-Rationale
----------
-
-The idea behind this algorithm relies on two observations.
-
-The first observation is that, although the full range of 7-bit codes
-[0..127] is properly specified by the ASCII standard, most control
-characters in the range [0..31] are not used in practice. The only
-widely-used, almost universally-portable control codes are 9 (TAB),
-10 (LF) and 13 (CR). There are a few more control codes that are
-recognized on a reduced range of platforms and text viewers/editors:
-7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB) and 27 (ESC); but these
-codes are rarely (if ever) used alone, without being accompanied by
-some printable text. Even the newer, portable text formats such as
-XML avoid using control characters outside the list mentioned here.
-
-The second observation is that most of the binary files tend to contain
-control characters, especially 0 (NUL). Even though the older text
-detection schemes observe the presence of non-ASCII codes from the range
-[128..255], the precision rarely has to suffer if this upper range is
-labeled as textual, because the files that are genuinely binary tend to
-contain both control characters and codes from the upper range. On the
-other hand, the upper range needs to be labeled as textual, because it
-is used by virtually all ASCII extensions. In particular, this range is
-used for encoding non-Latin scripts.
-
-Since there is no counting involved, other than simply observing the
-presence or the absence of some byte values, the algorithm produces
-consistent results, regardless of what alphabet encoding is being used.
-(If counting were involved, it could be possible to obtain different
-results on a text encoded, say, using ISO-8859-16 versus UTF-8.)
-
-There is an extra category of plain text files that are "polluted" with
-one or more black-listed codes, either by mistake or by peculiar design
-considerations. In such cases, a scheme that tolerates a small fraction
-of black-listed codes would provide an increased recall (i.e. more true
-positives). This, however, incurs a reduced precision overall, since
-false positives are more likely to appear in binary files that contain
-large chunks of textual data. Furthermore, "polluted" plain text should
-be regarded as binary by general-purpose text detection schemes, because
-general-purpose text processing algorithms might not be applicable.
-Under this premise, it is safe to say that our detection method provides
-a near-100% recall.
-
-Experiments have been run on many files coming from various platforms
-and applications. We tried plain text files, system logs, source code,
-formatted office documents, compiled object code, etc. The results
-confirm the optimistic assumptions about the capabilities of this
-algorithm.
-
-
---
-Cosmin Truta
-Last updated: 2006-May-28
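The three-list rule in the removed txtvsbin.txt can be sketched in a few lines of C. This is an illustrative reading of the article, not zlib's own implementation, and the function name looks_like_text is hypothetical. It scans a buffer once, returns binary as soon as a black-listed byte appears, and otherwise reports text only if at least one white-listed byte was seen.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if buf[0..len-1] is classified as plain
 * text by the white/gray/black list scheme, 0 if it is classified as
 * binary. An empty buffer is binary, as in the article's boundary case.
 */
int looks_like_text(const unsigned char *buf, size_t len)
{
    int seen_white = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char b = buf[i];

        if (b == 9 || b == 10 || b == 13 || b >= 32) {
            /* White list: TAB, LF, CR, and 32 (SPACE) to 255. */
            seen_white = 1;
        } else if (b == 7 || b == 8 || b == 11 || b == 12 ||
                   b == 26 || b == 27) {
            /* Gray list: tolerated, but not evidence of text. */
        } else {
            /* Black list: 0 to 6 and the rest of 14 to 31. */
            return 0;
        }
    }
    return seen_white;
}
```

Because the test only observes presence or absence of byte values, the result is the same whether a given text is stored in a single-byte encoding or in UTF-8, which is the consistency property the article argues for.
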
http://git-wip-us.apache.org/repos/asf/incubator-corinthia/blob/1a48f7c3/DocFormats/platform/3rdparty/zlib-1.2.8/examples/README.examples
----------------------------------------------------------------------
diff --git a/DocFormats/platform/3rdparty/zlib-1.2.8/examples/README.examples b/DocFormats/platform/3rdparty/zlib-1.2.8/examples/README.examples
deleted file mode 100644
index 56a3171..0000000
--- a/DocFormats/platform/3rdparty/zlib-1.2.8/examples/README.examples
+++ /dev/null
@@ -1,49 +0,0 @@
-This directory contains examples of the use of zlib and other relevant
-programs and documentation.
-
-enough.c
- calculation and justification of ENOUGH parameter in inftrees.h
- - calculates the maximum table space used in inflate tree
- construction over all possible Huffman codes
-
-fitblk.c
- compress just enough input to nearly fill a requested output size
- - zlib isn't designed to do this, but fitblk does it anyway
-
-gun.c
- uncompress a gzip file
- - illustrates the use of inflateBack() for high speed file-to-file
- decompression using call-back functions
- - is approximately twice as fast as gzip -d
- - also provides Unix uncompress functionality, again twice as fast
-
-gzappend.c
- append to a gzip file
- - illustrates the use of the Z_BLOCK flush parameter for inflate()
- - illustrates the use of deflatePrime() to start at any bit
-
-gzjoin.c
- join gzip files without recalculating the crc or recompressing
- - illustrates the use of the Z_BLOCK flush parameter for inflate()
- - illustrates the use of crc32_combine()
-
-gzlog.c
-gzlog.h
- efficiently and robustly maintain a message log file in gzip format
- - illustrates use of raw deflate, Z_PARTIAL_FLUSH, deflatePrime(),
- and deflateSetDictionary()
- - illustrates use of a gzip header extra field
-
-zlib_how.html
- painfully comprehensive description of zpipe.c (see below)
- - describes in excruciating detail the use of deflate() and inflate()
-
-zpipe.c
- reads and writes zlib streams from stdin to stdout
- - illustrates the proper use of deflate() and inflate()
- - deeply commented in zlib_how.html (see above)
-
-zran.c
- index a zlib or gzip stream and randomly access it
- - illustrates the use of Z_BLOCK, inflatePrime(), and
- inflateSetDictionary() to provide random access
http://git-wip-us.apache.org/repos/asf/incubator-corinthia/blob/1a48f7c3/DocFormats/platform/3rdparty/zlib-1.2.8/examples/enough.c
----------------------------------------------------------------------
diff --git a/DocFormats/platform/3rdparty/zlib-1.2.8/examples/enough.c b/DocFormats/platform/3rdparty/zlib-1.2.8/examples/enough.c
deleted file mode 100644
index b991144..0000000
--- a/DocFormats/platform/3rdparty/zlib-1.2.8/examples/enough.c
+++ /dev/null
@@ -1,572 +0,0 @@
-/* enough.c -- determine the maximum size of inflate's Huffman code tables over
- * all possible valid and complete Huffman codes, subject to a length limit.
- * Copyright (C) 2007, 2008, 2012 Mark Adler
- * Version 1.4 18 August 2012 Mark Adler
- */
-
-/* Version history:
- 1.0 3 Jan 2007 First version (derived from codecount.c version 1.4)
- 1.1 4 Jan 2007 Use faster incremental table usage computation
- Prune examine() search on previously visited states
- 1.2 5 Jan 2007 Comments clean up
- As inflate does, decrease root for short codes
- Refuse cases where inflate would increase root
- 1.3 17 Feb 2008 Add argument for initial root table size
- Fix bug for initial root table size == max - 1
- Use a macro to compute the history index
- 1.4 18 Aug 2012 Avoid shifts more than bits in type (caused endless loop!)
- Clean up comparisons of different types
- Clean up code indentation
- */
-
-/*
- Examine all possible Huffman codes for a given number of symbols and a
- maximum code length in bits to determine the maximum table size for zlib's
- inflate. Only complete Huffman codes are counted.
-
- Two codes are considered distinct if the vectors of the number of codes per
- length are not identical. So permutations of the symbol assignments result
- in the same code for the counting, as do permutations of the assignments of
- the bit values to the codes (i.e. only canonical codes are counted).
-
- We build a code from shorter to longer lengths, determining how many symbols
- are coded at each length. At each step, we have how many symbols remain to
- be coded, what the last code length used was, and how many bit patterns of
- that length remain unused. Then we add one to the code length and double the
- number of unused patterns to graduate to the next code length. We then
- assign all portions of the remaining symbols to that code length that
- preserve the properties of a correct and eventually complete code. Those
- properties are: we cannot use more bit patterns than are available; and when
- all the symbols are used, there are exactly zero possible bit patterns
- remaining.
-
- The inflate Huffman decoding algorithm uses two-level lookup tables for
- speed. There is a single first-level table to decode codes up to root bits
- in length (root == 9 in the current inflate implementation). The table
- has 1 << root entries and is indexed by the next root bits of input. Codes
- shorter than root bits have replicated table entries, so that the correct
- entry is pointed to regardless of the bits that follow the short code. If
- the code is longer than root bits, then the table entry points to a second-
- level table. The size of that table is determined by the longest code with
- that root-bit prefix. If that longest code has length len, then the table
- has size 1 << (len - root), to index the remaining bits in that set of
- codes. Each subsequent root-bit prefix then has its own sub-table. The
- total number of table entries required by the code is calculated
- incrementally as the number of codes at each bit length is populated. When
- all of the codes are shorter than root bits, then root is reduced to the
- longest code length, resulting in a single, smaller, one-level table.
-
- The inflate algorithm also provides for small values of root (relative to
- the log2 of the number of symbols), where the shortest code has more bits
- than root. In that case, root is increased to the length of the shortest
- code. This program, by design, does not handle that case, so it is verified
- that the number of symbols is less than 2^(root + 1).
-
- In order to speed up the examination (by about ten orders of magnitude for
- the default arguments), the intermediate states in the build-up of a code
- are remembered and previously visited branches are pruned. The memory
- required for this will increase rapidly with the total number of symbols and
- the maximum code length in bits. However this is a very small price to pay
- for the vast speedup.
-
- First, all of the possible Huffman codes are counted, and reachable
- intermediate states are noted by a non-zero count in a saved-results array.
- Second, the intermediate states that lead to (root + 1) bit or longer codes
- are used to look at all sub-codes from those junctures for their inflate
- memory usage. (The amount of memory used is not affected by the number of
- codes of root bits or less in length.) Third, the visited states in the
- construction of those sub-codes and the associated calculation of the table
- size is recalled in order to avoid recalculating from the same juncture.
- Beginning the code examination at (root + 1) bit codes, which is enabled by
- identifying the reachable nodes, accounts for about six of the orders of
- magnitude of improvement for the default arguments. About another four
- orders of magnitude come from not revisiting previous states. Out of
- approximately 2x10^16 possible Huffman codes, only about 2x10^6 sub-codes
- need to be examined to cover all of the possible table memory usage cases
- for the default arguments of 286 symbols limited to 15-bit codes.
-
- Note that an unsigned long long type is used for counting. It is quite easy
- to exceed the capacity of an eight-byte integer with a large number of
- symbols and a large maximum code length, so multiple-precision arithmetic
- would need to replace the unsigned long long arithmetic in that case. This
- program will abort if an overflow occurs. The big_t type identifies where
- the counting takes place.
-
- An unsigned long long type is also used for calculating the number of
- possible codes remaining at the maximum length. This limits the maximum
- code length to the number of bits in a long long minus the number of bits
- needed to represent the symbols in a flat code. The code_t type identifies
- where the bit pattern counting takes place.
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <assert.h>
-
-#define local static
-
-/* special data types */
-typedef unsigned long long big_t; /* type for code counting */
-typedef unsigned long long code_t; /* type for bit pattern counting */
-struct tab { /* type for been here check */
- size_t len; /* length of bit vector in char's */
- char *vec; /* allocated bit vector */
-};
-
-/* The array for saving results, num[], is indexed with this triplet:
-
- syms: number of symbols remaining to code
- left: number of available bit patterns at length len
- len: number of bits in the codes currently being assigned
-
- Those indices are constrained thusly when saving results:
-
- syms: 3..totsym (totsym == total symbols to code)
- left: 2..syms - 1, but only the evens (so syms == 8 -> 2, 4, 6)
- len: 1..max - 1 (max == maximum code length in bits)
-
- syms == 2 is not saved since that immediately leads to a single code. left
- must be even, since it represents the number of available bit patterns at
- the current length, which is double the number at the previous length.
- left ends at syms-1 since left == syms immediately results in a single code.
- (left > syms is not allowed since that would result in an incomplete code.)
- len is less than max, since the code completes immediately when len == max.
-
- The offset into the array is calculated for the three indices with the
- first one (syms) being outermost, and the last one (len) being innermost.
- We build the array with length max-1 lists for the len index, with syms-3
- of those for each symbol. There are totsym-2 of those, with each one
- varying in length as a function of syms. See the calculation of index in
- count() for the index, and the calculation of size in main() for the size
- of the array.
-
- For the deflate example of 286 symbols limited to 15-bit codes, the array
- has 284,284 entries, taking up 2.17 MB for an 8-byte big_t. More than
- half of the space allocated for saved results is actually used -- not all
- possible triplets are reached in the generation of valid Huffman codes.
- */
-
-/* The array for tracking visited states, done[], is itself indexed identically
- to the num[] array as described above for the (syms, left, len) triplet.
- Each element in the array is further indexed by the (mem, rem) doublet,
- where mem is the amount of inflate table space used so far, and rem is the
- remaining unused entries in the current inflate sub-table. Each indexed
- element is simply one bit indicating whether the state has been visited or
- not. Since the ranges for mem and rem are not known a priori, each bit
- vector is of a variable size, and grows as needed to accommodate the visited
- states. mem and rem are used to calculate a single index in a triangular
- array. Since the range of mem is expected in the default case to be about
- ten times larger than the range of rem, the array is skewed to reduce the
- memory usage, with eight times the range for mem than for rem. See the
- calculations for offset and bit in beenhere() for the details.
-
- For the deflate example of 286 symbols limited to 15-bit codes, the bit
- vectors grow to total approximately 21 MB, in addition to the 4.3 MB done[]
- array itself.
- */
-
-/* Globals to avoid propagating constants or constant pointers recursively */
-local int max; /* maximum allowed bit length for the codes */
-local int root; /* size of base code table in bits */
-local int large; /* largest code table so far */
-local size_t size; /* number of elements in num and done */
-local int *code; /* number of symbols assigned to each bit length */
-local big_t *num; /* saved results array for code counting */
-local struct tab *done; /* states already evaluated array */
-
-/* Index function for num[] and done[] */
-#define INDEX(i,j,k) (((size_t)((i-1)>>1)*((i-2)>>1)+(j>>1)-1)*(max-1)+k-1)
-
-/* Free allocated space. Uses globals code, num, and done. */
-local void cleanup(void)
-{
- size_t n;
-
- if (done != NULL) {
- for (n = 0; n < size; n++)
- if (done[n].len)
- free(done[n].vec);
- free(done);
- }
- if (num != NULL)
- free(num);
- if (code != NULL)
- free(code);
-}
-
-/* Return the number of possible Huffman codes using bit patterns of lengths
- len through max inclusive, coding syms symbols, with left bit patterns of
- length len unused -- return -1 if there is an overflow in the counting.
- Keep a record of previous results in num to prevent repeating the same
- calculation. Uses the globals max and num. */
-local big_t count(int syms, int len, int left)
-{
- big_t sum; /* number of possible codes from this juncture */
- big_t got; /* value returned from count() */
- int least; /* least number of syms to use at this juncture */
- int most; /* most number of syms to use at this juncture */
- int use; /* number of bit patterns to use in next call */
- size_t index; /* index of this case in *num */
-
- /* see if only one possible code */
- if (syms == left)
- return 1;
-
- /* note and verify the expected state */
- assert(syms > left && left > 0 && len < max);
-
- /* see if we've done this one already */
- index = INDEX(syms, left, len);
- got = num[index];
- if (got)
- return got; /* we have -- return the saved result */
-
- /* we need to use at least this many bit patterns so that the code won't be
- incomplete at the next length (more bit patterns than symbols) */
- least = (left << 1) - syms;
- if (least < 0)
- least = 0;
-
- /* we can use at most this many bit patterns, lest there not be enough
- available for the remaining symbols at the maximum length (if there were
- no limit to the code length, this would become: most = left - 1) */
- most = (((code_t)left << (max - len)) - syms) /
- (((code_t)1 << (max - len)) - 1);
-
- /* count all possible codes from this juncture and add them up */
- sum = 0;
- for (use = least; use <= most; use++) {
- got = count(syms - use, len + 1, (left - use) << 1);
- sum += got;
- if (got == (big_t)0 - 1 || sum < got) /* overflow */
- return (big_t)0 - 1;
- }
-
- /* verify that all recursive calls are productive */
- assert(sum != 0);
-
- /* save the result and return it */
- num[index] = sum;
- return sum;
-}
-
-/* Return true if we've been here before, set to true if not. Set a bit in a
- bit vector to indicate visiting this state. Each (syms,len,left) state
- has a variable size bit vector indexed by (mem,rem). The bit vector is
- lengthened if needed to allow setting the (mem,rem) bit. */
-local int beenhere(int syms, int len, int left, int mem, int rem)
-{
- size_t index; /* index for this state's bit vector */
- size_t offset; /* offset in this state's bit vector */
- int bit; /* mask for this state's bit */
- size_t length; /* length of the bit vector in bytes */
- char *vector; /* new or enlarged bit vector */
-
- /* point to vector for (syms,left,len), bit in vector for (mem,rem) */
- index = INDEX(syms, left, len);
- mem -= 1 << root;
- offset = (mem >> 3) + rem;
- offset = ((offset * (offset + 1)) >> 1) + rem;
- bit = 1 << (mem & 7);
-
- /* see if we've been here */
- length = done[index].len;
- if (offset < length && (done[index].vec[offset] & bit) != 0)
- return 1; /* done this! */
-
- /* we haven't been here before -- set the bit to show we have now */
-
- /* see if we need to lengthen the vector in order to set the bit */
- if (length <= offset) {
- /* if we have one already, enlarge it, zero out the appended space */
- if (length) {
- do {
- length <<= 1;
- } while (length <= offset);
- vector = realloc(done[index].vec, length);
- if (vector != NULL)
- memset(vector + done[index].len, 0, length - done[index].len);
- }
-
- /* otherwise we need to make a new vector and zero it out */
- else {
- length = 1 << (len - root);
- while (length <= offset)
- length <<= 1;
- vector = calloc(length, sizeof(char));
- }
-
- /* in either case, bail if we can't get the memory */
- if (vector == NULL) {
- fputs("abort: unable to allocate enough memory\n", stderr);
- cleanup();
- exit(1);
- }
-
- /* install the new vector */
- done[index].len = length;
- done[index].vec = vector;
- }
-
- /* set the bit */
- done[index].vec[offset] |= bit;
- return 0;
-}
-
-/* Examine all possible codes from the given node (syms, len, left). Compute
- the amount of memory required to build inflate's decoding tables, where the
- number of code structures used so far is mem, and the number remaining in
- the current sub-table is rem. Uses the globals max, code, root, large, and
- done. */
-local void examine(int syms, int len, int left, int mem, int rem)
-{
- int least; /* least number of syms to use at this juncture */
- int most; /* most number of syms to use at this juncture */
- int use; /* number of bit patterns to use in next call */
-
- /* see if we have a complete code */
- if (syms == left) {
- /* set the last code entry */
- code[len] = left;
-
- /* complete computation of memory used by this code */
- while (rem < left) {
- left -= rem;
- rem = 1 << (len - root);
- mem += rem;
- }
- assert(rem == left);
-
- /* if this is a new maximum, show the entries used and the sub-code */
- if (mem > large) {
- large = mem;
- printf("max %d: ", mem);
- for (use = root + 1; use <= max; use++)
- if (code[use])
- printf("%d[%d] ", code[use], use);
- putchar('\n');
- fflush(stdout);
- }
-
- /* remove entries as we drop back down in the recursion */
- code[len] = 0;
- return;
- }
-
- /* prune the tree if we can */
- if (beenhere(syms, len, left, mem, rem))
- return;
-
- /* we need to use at least this many bit patterns so that the code won't be
- incomplete at the next length (more bit patterns than symbols) */
- least = (left << 1) - syms;
- if (least < 0)
- least = 0;
-
- /* we can use at most this many bit patterns, lest there not be enough
- available for the remaining symbols at the maximum length (if there were
- no limit to the code length, this would become: most = left - 1) */
- most = (((code_t)left << (max - len)) - syms) /
- (((code_t)1 << (max - len)) - 1);
-
- /* occupy least table spaces, creating new sub-tables as needed */
- use = least;
- while (rem < use) {
- use -= rem;
- rem = 1 << (len - root);
- mem += rem;
- }
- rem -= use;
-
- /* examine codes from here, updating table space as we go */
- for (use = least; use <= most; use++) {
- code[len] = use;
- examine(syms - use, len + 1, (left - use) << 1,
- mem + (rem ? 1 << (len - root) : 0), rem << 1);
- if (rem == 0) {
- rem = 1 << (len - root);
- mem += rem;
- }
- rem--;
- }
-
- /* remove entries as we drop back down in the recursion */
- code[len] = 0;
-}
-
-/* Look at all sub-codes starting with root + 1 bits. Look at only the valid
- intermediate code states (syms, left, len). For each completed code,
- calculate the amount of memory required by inflate to build the decoding
- tables. Find the maximum amount of memory required and show the code that
- requires that maximum. Uses the globals max, root, and num. */
-local void enough(int syms)
-{
- int n; /* number of remaining symbols for this node */
- int left; /* number of unused bit patterns at this length */
- size_t index; /* index of this case in *num */
-
- /* clear code */
- for (n = 0; n <= max; n++)
- code[n] = 0;
-
- /* look at all (root + 1) bit and longer codes */
- large = 1 << root; /* base table */
- if (root < max) /* otherwise, there's only a base table */
- for (n = 3; n <= syms; n++)
- for (left = 2; left < n; left += 2)
- {
- /* look at all reachable (root + 1) bit nodes, and the
- resulting codes (complete at root + 2 or more) */
- index = INDEX(n, left, root + 1);
- if (root + 1 < max && num[index]) /* reachable node */
- examine(n, root + 1, left, 1 << root, 0);
-
- /* also look at root bit codes with completions at root + 1
- bits (not saved in num, since complete), just in case */
- if (num[index - 1] && n <= left << 1)
- examine((n - left) << 1, root + 1, (n - left) << 1,
- 1 << root, 0);
- }
-
- /* done */
- printf("done: maximum of %d table entries\n", large);
-}
-
-/*
- Examine and show the total number of possible Huffman codes for a given
- maximum number of symbols, initial root table size, and maximum code length
- in bits -- those are the command arguments in that order. The default
- values are 286, 9, and 15 respectively, for the deflate literal/length code.
- The possible codes are counted for each number of coded symbols from two to
- the maximum. The counts for each of those and the total number of codes are
- shown. The maximum number of inflate table entries is then calculated
- across all possible codes. Each new maximum number of table entries and the
- associated sub-code (starting at root + 1 == 10 bits) is shown.
-
- To count and examine Huffman codes that are not length-limited, provide a
- maximum length equal to the number of symbols minus one.
-
- For the deflate literal/length code, use "enough". For the deflate distance
- code, use "enough 30 6".
-
- This uses the %llu printf format to print big_t numbers, which assumes that
- big_t is an unsigned long long. If the big_t type is changed (for example
- to a multiple precision type), the method of printing will also need to be
- updated.
- */
-int main(int argc, char **argv)
-{
- int syms; /* total number of symbols to code */
- int n; /* number of symbols to code for this run */
- big_t got; /* return value of count() */
- big_t sum; /* accumulated number of codes over n */
- code_t word; /* for counting bits in code_t */
-
- /* set up globals for cleanup() */
- code = NULL;
- num = NULL;
- done = NULL;
-
- /* get arguments -- default to the deflate literal/length code */
- syms = 286;
- root = 9;
- max = 15;
- if (argc > 1) {
- syms = atoi(argv[1]);
- if (argc > 2) {
- root = atoi(argv[2]);
- if (argc > 3)
- max = atoi(argv[3]);
- }
- }
- if (argc > 4 || syms < 2 || root < 1 || max < 1) {
- fputs("invalid arguments, need: [sym >= 2 [root >= 1 [max >= 1]]]\n",
- stderr);
- return 1;
- }
-
- /* if not restricting the code length, the longest is syms - 1 */
- if (max > syms - 1)
- max = syms - 1;
-
- /* determine the number of bits in a code_t */
- for (n = 0, word = 1; word; n++, word <<= 1)
- ;
-
- /* make sure that the calculation of most will not overflow */
- if (max > n || (code_t)(syms - 2) >= (((code_t)0 - 1) >> (max - 1))) {
- fputs("abort: code length too long for internal types\n", stderr);
- return 1;
- }
-
- /* reject impossible code requests */
- if ((code_t)(syms - 1) > ((code_t)1 << max) - 1) {
- fprintf(stderr, "%d symbols cannot be coded in %d bits\n",
- syms, max);
- return 1;
- }
-
- /* allocate code vector */
- code = calloc(max + 1, sizeof(int));
- if (code == NULL) {
- fputs("abort: unable to allocate enough memory\n", stderr);
- return 1;
- }
-
- /* determine size of saved results array, checking for overflows,
- allocate and clear the array (set all to zero with calloc()) */
- if (syms == 2) /* iff max == 1 */
- num = NULL; /* won't be saving any results */
- else {
- size = syms >> 1;
- if (size > ((size_t)0 - 1) / (n = (syms - 1) >> 1) ||
- (size *= n, size > ((size_t)0 - 1) / (n = max - 1)) ||
- (size *= n, size > ((size_t)0 - 1) / sizeof(big_t)) ||
- (num = calloc(size, sizeof(big_t))) == NULL) {
- fputs("abort: unable to allocate enough memory\n", stderr);
- cleanup();
- return 1;
- }
- }
-
- /* count possible codes for all numbers of symbols, add up counts */
- sum = 0;
- for (n = 2; n <= syms; n++) {
- got = count(n, 1, 2);
- sum += got;
- if (got == (big_t)0 - 1 || sum < got) { /* overflow */
- fputs("abort: can't count that high!\n", stderr);
- cleanup();
- return 1;
- }
- printf("%llu %d-codes\n", got, n);
- }
- printf("%llu total codes for 2 to %d symbols", sum, syms);
- if (max < syms - 1)
- printf(" (%d-bit length limit)\n", max);
- else
- puts(" (no length limit)");
-
- /* allocate and clear done array for beenhere() */
- if (syms == 2)
- done = NULL;
- else if (size > ((size_t)0 - 1) / sizeof(struct tab) ||
- (done = calloc(size, sizeof(struct tab))) == NULL) {
- fputs("abort: unable to allocate enough memory\n", stderr);
- cleanup();
- return 1;
- }
-
- /* find and show maximum inflate table usage */
- if (root > max) /* reduce root to max length */
- root = max;
- if ((code_t)syms < ((code_t)1 << (root + 1)))
- enough(syms);
- else
- puts("cannot handle minimum code lengths > root");
-
- /* done */
- cleanup();
- return 0;
-}
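
Note on the counting recursion in enough.c above: the following is a minimal
sketch, not part of the deleted file, of how count() enumerates complete
length-limited codes. The name count_codes and the explicit max parameter are
assumptions made for illustration; the original uses a global max, memoizes
results in num[], and checks for overflow, none of which is done here, so this
is only practical for small arguments.

```c
#include <assert.h>

/* Hypothetical helper, not part of enough.c: count the complete prefix codes
   for syms symbols, given left unused bit patterns at length len and no code
   longer than max bits.  No memoization or overflow check, so keep the
   arguments small.  The caller must ensure 2 <= syms <= 2^max. */
static unsigned long long count_codes(int syms, int len, int left, int max)
{
    unsigned long long sum = 0;
    long long span, most;
    int least, use;

    /* only one way to finish: every remaining symbol gets length len */
    if (syms == left)
        return 1;

    /* same state check as the original count() */
    assert(syms > left && left > 0 && len < max);

    /* fewest symbols to code at this length, so that the next length does
       not end up with more bit patterns than symbols */
    least = (left << 1) - syms;
    if (least < 0)
        least = 0;

    /* most symbols to code at this length, so that the remaining symbols
       still fit at the maximum length */
    span = 1LL << (max - len);
    most = (left * span - syms) / (span - 1);

    /* try every allowed split and add up the resulting counts */
    for (use = least; use <= most; use++)
        sum += count_codes(syms - use, len + 1, (left - use) << 1, max);
    return sum;
}
```

Calling count_codes(n, 1, 2, max) mirrors the way main() calls count(n, 1, 2).
For example, count_codes(5, 1, 2, 4) gives 3, one for each of the code length
sets {2,2,2,3,3}, {1,3,3,3,3} and {1,2,3,4,4}.
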
http://git-wip-us.apache.org/repos/asf/incubator-corinthia/blob/1a48f7c3/DocFormats/platform/3rdparty/zlib-1.2.8/examples/fitblk.c
----------------------------------------------------------------------
diff --git a/DocFormats/platform/3rdparty/zlib-1.2.8/examples/fitblk.c b/DocFormats/platform/3rdparty/zlib-1.2.8/examples/fitblk.c
deleted file mode 100644
index c61de5c..0000000
--- a/DocFormats/platform/3rdparty/zlib-1.2.8/examples/fitblk.c
+++ /dev/null
@@ -1,233 +0,0 @@
-/* fitblk.c: example of fitting compressed output to a specified size
- Not copyrighted -- provided to the public domain
- Version 1.1 25 November 2004 Mark Adler */
-
-/* Version history:
- 1.0 24 Nov 2004 First version
- 1.1 25 Nov 2004 Change deflateInit2() to deflateInit()
- Use fixed-size, stack-allocated raw buffers
- Simplify code moving compression to subroutines
- Use assert() for internal errors
- Add detailed description of approach
- */
-
-/* Approach to just fitting a requested compressed size:
-
- fitblk performs three compression passes on a portion of the input
- data in order to determine how much of that input will compress to
- nearly the requested output block size. The first pass generates
- enough deflate blocks to produce output to fill the requested
- output size plus a specified excess amount (see the EXCESS define
- below). The last deflate block may go quite a bit past that, but
- is discarded. The second pass decompresses and recompresses just
- the compressed data that fit in the requested plus excess sized
- buffer. The deflate process is terminated after that amount of
- input, which is less than the amount consumed on the first pass.
- The last deflate block of the result will be of a comparable size
- to the final product, so that the header for that deflate block and
- the compression ratio for that block will be about the same as in
- the final product. The third compression pass decompresses the
- result of the second step, but only the compressed data up to the
- requested size minus an amount to allow the compressed stream to
- complete (see the MARGIN define below). That will result in a
- final compressed stream whose length is less than or equal to the
- requested size. Assuming sufficient input and a requested size
- greater than a few hundred bytes, the shortfall will typically be
- less than ten bytes.
-
- If the input is short enough that the first compression completes
- before filling the requested output size, then that compressed
- stream is returned with no recompression.
-
- EXCESS is chosen to be just greater than the shortfall seen in a
- two pass approach similar to the above. That shortfall is due to
- the last deflate block compressing more efficiently with a smaller
- header on the second pass. EXCESS is set to be large enough so
- that there is enough uncompressed data for the second pass to fill
- out the requested size, and small enough so that the final deflate
- block of the second pass will be close in size to the final deflate
- block of the third and final pass. MARGIN is chosen to be just
- large enough to assure that the final compression has enough room
- to complete in all cases.
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <assert.h>
-#include "zlib.h"
-
-#define local static
-
-/* print nastygram and leave */
-local void quit(char *why)
-{
- fprintf(stderr, "fitblk abort: %s\n", why);
- exit(1);
-}
-
-#define RAWLEN 4096 /* intermediate uncompressed buffer size */
-
-/* compress from file to def until provided buffer is full or end of
- input reached; return last deflate() return value, or Z_ERRNO if
- there was a read error on the file */
-local int partcompress(FILE *in, z_streamp def)
-{
- int ret, flush;
- unsigned char raw[RAWLEN];
-
- flush = Z_NO_FLUSH;
- do {
- def->avail_in = fread(raw, 1, RAWLEN, in);
- if (ferror(in))
- return Z_ERRNO;
- def->next_in = raw;
- if (feof(in))
- flush = Z_FINISH;
- ret = deflate(def, flush);
- assert(ret != Z_STREAM_ERROR);
- } while (def->avail_out != 0 && flush == Z_NO_FLUSH);
- return ret;
-}
-
-/* recompress from inf's input to def's output; the input for inf and
- the output for def are set in those structures before calling;
- return last deflate() return value, or Z_MEM_ERROR if inflate()
- was not able to allocate enough memory when it needed to */
-local int recompress(z_streamp inf, z_streamp def)
-{
- int ret, flush;
- unsigned char raw[RAWLEN];
-
- flush = Z_NO_FLUSH;
- do {
- /* decompress */
- inf->avail_out = RAWLEN;
- inf->next_out = raw;
- ret = inflate(inf, Z_NO_FLUSH);
- assert(ret != Z_STREAM_ERROR && ret != Z_DATA_ERROR &&
- ret != Z_NEED_DICT);
- if (ret == Z_MEM_ERROR)
- return ret;
-
- /* compress what was decompressed until done or no room */
- def->avail_in = RAWLEN - inf->avail_out;
- def->next_in = raw;
- if (inf->avail_out != 0)
- flush = Z_FINISH;
- ret = deflate(def, flush);
- assert(ret != Z_STREAM_ERROR);
- } while (ret != Z_STREAM_END && def->avail_out != 0);
- return ret;
-}
-
-#define EXCESS 256 /* empirically determined stream overage */
-#define MARGIN 8 /* amount to back off for completion */
-
-/* compress from stdin to fixed-size block on stdout */
-int main(int argc, char **argv)
-{
- int ret; /* return code */
- unsigned size; /* requested fixed output block size */
- unsigned have; /* bytes written by deflate() call */
- unsigned char *blk; /* intermediate and final stream */
- unsigned char *tmp; /* close to desired size stream */
- z_stream def, inf; /* zlib deflate and inflate states */
-
- /* get requested output size */
- if (argc != 2)
- quit("need one argument: size of output block");
- ret = strtol(argv[1], argv + 1, 10);
- if (argv[1][0] != 0)
- quit("argument must be a number");
- if (ret < 8) /* 8 is minimum zlib stream size */
- quit("need positive size of 8 or greater");
- size = (unsigned)ret;
-
- /* allocate memory for buffers and compression engine */
- blk = malloc(size + EXCESS);
- def.zalloc = Z_NULL;
- def.zfree = Z_NULL;
- def.opaque = Z_NULL;
- ret = deflateInit(&def, Z_DEFAULT_COMPRESSION);
- if (ret != Z_OK || blk == NULL)
- quit("out of memory");
-
- /* compress from stdin until output full, or no more input */
- def.avail_out = size + EXCESS;
- def.next_out = blk;
- ret = partcompress(stdin, &def);
- if (ret == Z_ERRNO)
- quit("error reading input");
-
- /* if it all fit, then size was undersubscribed -- done! */
- if (ret == Z_STREAM_END && def.avail_out >= EXCESS) {
- /* write block to stdout */
- have = size + EXCESS - def.avail_out;
- if (fwrite(blk, 1, have, stdout) != have || ferror(stdout))
- quit("error writing output");
-
- /* clean up and print results to stderr */
- ret = deflateEnd(&def);
- assert(ret != Z_STREAM_ERROR);
- free(blk);
- fprintf(stderr,
- "%u bytes unused out of %u requested (all input)\n",
- size - have, size);
- return 0;
- }
-
- /* it didn't all fit -- set up for recompression */
- inf.zalloc = Z_NULL;
- inf.zfree = Z_NULL;
- inf.opaque = Z_NULL;
- inf.avail_in = 0;
- inf.next_in = Z_NULL;
- ret = inflateInit(&inf);
- tmp = malloc(size + EXCESS);
- if (ret != Z_OK || tmp == NULL)
- quit("out of memory");
- ret = deflateReset(&def);
- assert(ret != Z_STREAM_ERROR);
-
- /* do first recompression close to the right amount */
- inf.avail_in = size + EXCESS;
- inf.next_in = blk;
- def.avail_out = size + EXCESS;
- def.next_out = tmp;
- ret = recompress(&inf, &def);
- if (ret == Z_MEM_ERROR)
- quit("out of memory");
-
- /* set up for next recompression */
- ret = inflateReset(&inf);
- assert(ret != Z_STREAM_ERROR);
- ret = deflateReset(&def);
- assert(ret != Z_STREAM_ERROR);
-
- /* do second and final recompression (third compression) */
- inf.avail_in = size - MARGIN; /* assure stream will complete */
- inf.next_in = tmp;
- def.avail_out = size;
- def.next_out = blk;
- ret = recompress(&inf, &def);
- if (ret == Z_MEM_ERROR)
- quit("out of memory");
- assert(ret == Z_STREAM_END); /* otherwise MARGIN too small */
-
- /* done -- write block to stdout */
- have = size - def.avail_out;
- if (fwrite(blk, 1, have, stdout) != have || ferror(stdout))
- quit("error writing output");
-
- /* clean up and print results to stderr */
- free(tmp);
- ret = inflateEnd(&inf);
- assert(ret != Z_STREAM_ERROR);
- ret = deflateEnd(&def);
- assert(ret != Z_STREAM_ERROR);
- free(blk);
- fprintf(stderr,
- "%u bytes unused out of %u requested (%lu input)\n",
- size - have, size, def.total_in);
- return 0;
-}
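
Note on the size argument in fitblk.c above: main() parses the requested block
size with strtol(), reusing argv[1] as the end pointer and storing the long
result in an int. The following is a sketch of a more explicit variant, not
part of the deleted file. The name parse_size and its return convention are
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of fitblk.c: parse a decimal block size of at
   least min bytes.  Returns 0 and stores the size on success, -1 otherwise. */
static int parse_size(const char *arg, unsigned min, unsigned *size)
{
    char *end;
    long val;

    assert(arg != NULL && size != NULL);

    errno = 0;
    val = strtol(arg, &end, 10);
    if (end == arg || *end != '\0')     /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE || val < 0 || (unsigned long)val > UINT_MAX)
        return -1;                      /* does not fit in an unsigned */
    if ((unsigned long)val < min)
        return -1;                      /* too small to hold a stream */
    *size = (unsigned)val;
    return 0;
}
```

With this helper, the checks in main() would reduce to one call such as
parse_size(argv[1], 8, &size), where 8 is the minimum zlib stream size cited
in the original comment.
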