LZMA compression
----------------
Version: 24.07

This file describes LZMA encoding and decoding functions written in C language.

LZMA is an improved version of the famous LZ77 compression algorithm.
It was improved to maximize the compression ratio while keeping
high decompression speed and low memory requirements for
decompression.

Note: you can also read the LZMA Specification (lzma-specification.txt from LZMA SDK)

You can also look at the source code for LZMA encoding and decoding:
  C/Util/Lzma/LzmaUtil.c


LZMA compressed file format
---------------------------
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data



ANSI-C LZMA Decoder
~~~~~~~~~~~~~~~~~~~

Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58.
If you want to use the old interfaces, you can download a previous version of LZMA SDK
from the sourceforge.net site.

To use ANSI-C LZMA Decoder you need the following files:
1) LzmaDec.h + LzmaDec.c + 7zTypes.h + Precomp.h + Compiler.h

See example code:
  C/Util/Lzma/LzmaUtil.c


Memory requirements for LZMA decoding
-------------------------------------

Stack usage of LZMA decoding function for local variables is not
larger than 200-400 bytes.

LZMA Decoder uses a dictionary buffer and an internal state structure.
The internal state structure consumes
  state_size = (4 + (1.5 << (lc + lp))) KB
where (1.5 << n) means 1.5 * 2^n.
With default settings (lc=3, lp=0), state_size = 16 KB.
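
The header layout and the state_size formula above can be sketched in plain C as follows.
This is only an illustration and not SDK code: LzmaHeaderInfo, parse_lzma_header and
lzma_state_size_bytes are hypothetical names, and the rule for unpacking the properties
byte, props = (pb * 5 + lp) * 9 + lc, is taken from the LZMA Specification, not from this file.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for this sketch; it is not part of the LZMA SDK. */
typedef struct
{
  unsigned lc, lp, pb;
  uint32_t dictSize;
  uint64_t unpackSize;   /* all bits set (-1) means unknown size */
} LzmaHeaderInfo;

/* Parse the 13-byte header described in "LZMA compressed file format".
   Returns 0 on success, -1 if the properties byte is out of range. */
static int parse_lzma_header(const unsigned char h[13], LzmaHeaderInfo *info)
{
  unsigned d = h[0];
  int i;
  if (d >= 9 * 5 * 5)
    return -1;
  /* props = (pb * 5 + lp) * 9 + lc  (assumption: rule from the LZMA Specification) */
  info->lc = d % 9;
  d /= 9;
  info->lp = d % 5;
  info->pb = d / 5;
  info->dictSize = 0;
  for (i = 0; i < 4; i++)
    info->dictSize |= (uint32_t)h[1 + i] << (8 * i);    /* little endian */
  info->unpackSize = 0;
  for (i = 0; i < 8; i++)
    info->unpackSize |= (uint64_t)h[5 + i] << (8 * i);  /* little endian */
  return 0;
}

/* state_size in bytes: (4 + 1.5 * 2^(lc + lp)) KB = 4096 + 1536 * 2^(lc + lp) */
static unsigned long lzma_state_size_bytes(unsigned lc, unsigned lp)
{
  assert(lc <= 8 && lp <= 4);   /* ranges implied by the encoding above */
  return 4096UL + (1536UL << (lc + lp));
}
```

For a header that starts with the bytes 5D 00 00 10 00 followed by eight FF bytes, this
sketch gives lc=3, lp=0, pb=2, a 1 MB dictionary and unknown uncompressed size, and
lzma_state_size_bytes(3, 0) gives 16384 bytes (16 KB), which matches the default value above.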


How To decompress data
----------------------

LZMA Decoder (ANSI-C version) now supports 2 interfaces:
1) Single-call Decompressing
2) Multi-call State Decompressing (zlib-like interface)

You must use an external allocator.
Example:
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };

You can use the "p = p;" statement to disable compiler warnings about the unused parameter.


Single-call Decompressing
-------------------------
When to use: RAM->RAM decompressing
Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
Compile defines: no defines
Memory Requirements:
  - Input buffer: compressed size
  - Output buffer: uncompressed size
  - LZMA Internal Structures: state_size (16 KB for default settings)

Interface:
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END when you know that
                           the current output buffer covers the last bytes of the stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Return value:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).

  If the LZMA decoder sees end_marker before reaching the output limit, it returns an OK result,
  and the output value of destLen will be less than the output buffer size limit.

  You can use multiple checks to test data integrity after full decompression:
    1) Check Result and "status" variable.
    2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
    3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
       You must use correct finish mode in that case.


Multi-call State Decompressing (zlib-like interface)
----------------------------------------------------

When to use: file->file decompressing
Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h

Memory Requirements:
 - Buffer for input stream: any size (for example, 16 KB)
 - Buffer for output stream: any size (for example, 16 KB)
 - LZMA Internal Structures: state_size (16 KB for default settings)
 - LZMA dictionary (dictionary size is encoded in LZMA properties header)

1) Read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header:
   unsigned char header[LZMA_PROPS_SIZE + 8];
   ReadFile(inFile, header, sizeof(header));

2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties

  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;

3) Init LzmaDec structure before any new LZMA stream, and call LzmaDec_DecodeToBuf in a loop

  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
        const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
    ...
  }


4) Free all allocated structures
  LzmaDec_Free(&state, &g_Alloc);

See example code:
  C/Util/Lzma/LzmaUtil.c


How To compress data
--------------------

Compile files:
  7zTypes.h
  Threads.h
  Threads.c
  LzmaEnc.h
  LzmaEnc.c
  LzFind.h
  LzFind.c
  LzFindMt.h
  LzFindMt.c
  LzFindOpt.c
  LzHash.h

Memory Requirements:
  - (dictSize * 11.5 + 6 MB) + state_size

LZMA Encoder can use two memory allocators:
1) alloc - for small arrays.
2) allocBig - for big arrays.

For example, you can use Large RAM Pages (2 MB) in allocBig allocator for
better compression speed. Note that Windows has a bad implementation of
Large RAM Pages.
It's OK to use the same allocator for alloc and allocBig.


Single-call Compression with callbacks
--------------------------------------

See example code:
  C/Util/Lzma/LzmaUtil.c

When to use: file->file compressing

1) You must implement callback structures for interfaces:
ISeqInStream
ISeqOutStream
ICompressProgress
ISzAlloc

static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;


2) Create CLzmaEncHandle object:

  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;


3) Initialize CLzmaEncProps properties:

  CLzmaEncProps props;
  LzmaEncProps_Init(&props);

  Then you can change some properties in that structure.

4) Send LZMA properties to LZMA Encoder

  res = LzmaEnc_SetProps(enc, &props);

5) Write encoded properties to header

    Byte header[LZMA_PROPS_SIZE + 8];
    size_t headerSize = LZMA_PROPS_SIZE;
    UInt64 fileSize;
    int i;

    res = LzmaEnc_WriteProperties(enc, header, &headerSize);
    fileSize = MyGetFileLength(inFile);
    for (i = 0; i < 8; i++)
      header[headerSize++] = (Byte)(fileSize >> (8 * i));
    MyWriteFileAndCheck(outFile, header, headerSize);

6) Call encoding function:
      res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);

7) Destroy LZMA Encoder Object
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);


If a callback function returns an error code, LzmaEnc_Encode also returns that code,
or it can return a code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.


Single-call RAM->RAM Compression
--------------------------------

Single-call RAM->RAM Compression is similar to Compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:

SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
    const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
    ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);

Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)



Defines
-------

Z7_LZMA_SIZE_OPT - Enable some code size optimizations in LZMA Decoder to get smaller executable code.

Z7_LZMA_PROB32   - It can increase the speed on some 32-bit CPUs, but memory usage for
                   some structures will be doubled in that case.

Z7_DECL_Int32_AS_long  - Define it if int is 16-bit on your compiler and long is 32-bit.

Z7_DECL_SizeT_AS_unsigned_int  - Define it if you don't want to use size_t type.


Defines for 7z decoder written in C
-----------------------------------
These defines are for 7zDec.c only (the decoder in C).
The C++ 7z decoder doesn't use these macros.

Z7_PPMD_SUPPORT        - define it if you need PPMD method support.
Z7_NO_METHODS_FILTERS  - do not use filters (except for the BCJ2 filter).
Z7_USE_NATIVE_BRANCH_FILTER - use filter for native ISA:
                                use x86 filter, if compiled to x86 executable,
                                use arm64 filter, if compiled to arm64 executable.


C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code uses COM-like interfaces. So if you want to use it,
you can study the basics of COM/OLE.
C++ LZMA code is just a wrapper over ANSI-C code.


C++ Notes
~~~~~~~~~
If you use some C++ code folders in 7-Zip (for example, C++ code for 7z archive handling),
you must check that you work correctly with the "new" operator.
7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp" that redefines the "new" operator,
if compiled by old MSVC compilers (MSVC before version VS 2010):

void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (!p)
    throw CNewException();
  return p;
}

If the compiler is VS 2010 or newer, NewHandler.cpp doesn't redefine the "new" operator.
So if you use a new compiler (VS 2010 or newer), you can still include "NewHandler.cpp"
in the compilation, and it will not redefine operator new.
You can also compile without "NewHandler.cpp" with new compilers.
If 7-Zip doesn't redefine operator "new", the standard exception will be used instead of CNewException.
Some code of 7-Zip catches any exception in internal code and converts it to an HRESULT code.
So you don't need to catch CNewException, if you call COM interfaces of 7-Zip.

---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html