A small tool to extract pak files
Find a file
Daniel Gibson 55d4ca2007 Update README to mention that Sin and Daikatana
and to describe commandline options
2017-07-05 19:13:28 +02:00
LICENSE Add LICENSE 2012-04-29 11:24:08 +02:00
Makefile Support Daikatana and Quake(2) .pak files in same binary, -dk switch 2016-04-02 12:49:04 +02:00
pakextract.c Support Sin .sin pak files 2017-04-08 22:57:06 +02:00
README Update README to mention that Sin and Daikatana 2017-07-05 19:13:28 +02:00

                      pakextract
                      ----------

pakextract is a small tool to extract or list the contents 
of a Quake/Quake II or Daikatana pak file or Sin .sin file
into the current directory. Usage:
 
Usage: pakextract [-l] [-dk] [-o output dir] pakfile
   -l     don't extract, just list contents
   -dk    Daikatana pak format (Quake is default,
            Sin is detected automatically)
   -o     directory to extract to (default is $PWD)

Examples:
  ./pakextract /path/to/pakfile.pak
  ./pakextract -o /path/to/output_dir /path/to/pakfile.sin
  ./pakextract -l -dk /path/to/daikatana_pakfile.pak

Only Archives from Quake, Quake II, Daikatana and Sin
are supported. Other pak formats may work but it's
untested and unsupported.

-------------------------------------------------------

            The Quake II Pak File Format
            ----------------------------

A Quake II pak file consists of 3 parts:
 - Header
 - Directory
 - Data

The header is written right after the start
of the file and consists of 3 parts in the
following order:
 - A 4 byte identification string "PACK" (in ASCII)
 - A 4 byte integer value defining the offset to the
   directory in bytes
 - A 4 byte iteger giving the length of the directory
   in bytes. Since every directory entry is 64 bytes 
   long this value modulo 64 must be 0: 
     (dir_length % 64) == 0;
   Also this means that the number of directory entries
   can be calculated with dir_length / 64

The directory can be anywere in the file but most times
it's written to the end. In consists of datablocks, 
written one after the other without any space between
them. A directory entry is 64 bytes long has entries
in the following order:
 - A 56 byte file name (in ASCII)
 - A 4 byte integer value defining the position if the file
   as an offset in bytes to the start of the pak file.
 - A 4 byte integer giving the length of the file
   in bytes.

------------------------------------------------------

            The Sin Pak File Format (.sin)
            ------------------------------

It's very similar to the Quake II format, there are only
small differences:
 - The header on the start of the file starts with "SPAK"
   instead of "PACK"
 - The filename in the directory entry is 120 bytes long
   => a directory entry is 128 bytes long instead of 64
   => the number of directory entries is dir_length / 128
      instead of dir_length / 64


-------------------------------------------------------

            The Daikatana Pak File Format
            -----------------------------

The Daikatana pak file format is similar to the Quake II one, but
additionally supports compressing files within the pak.
Daikatana only did that for .tga .bmp .wal .pcx and .bsp files and might
expect other files to be uncompressed (in case you want to write a compressor).

In Daikatana directory entries are 72 bytes long, because they have two
additional 4byte integer fields at the end:
 - A 4 byte integer giving the compressed length of the file in bytes,
   if it is compressed.
 - A 4 byte integer indicating whether the file is compressed
   (0: it's not compressed, else it's compressed)

If the file is compressed, the file length field (the one also defined in
Quake II pak) indicates the uncompressed length of the file, while the
additional "compressed length" field indicates the number of bytes the
compressed data of that file takes in the .pak.

Compressed files are decompressed like this:
	while not done:
		read a byte (unsigned char) _x_.
		// x is never written to output, only used to determine what to do
		if      x < 64:
			x+1 bytes of uncompressed data follow (just read+write them as they are)
		else if x < 128:
			// run-length encoded zeros
			write (x - 62) zero-bytes to output
		else if x < 192:
			// run-length encoded data
			read one byte, write it (x-126) times to output
		else if x < 254:
			// this references previously uncompressed data
			read one byte to get _offset_
			read (x-190) bytes from the already uncompressed and written output data,
				starting at (offset+2) bytes before the current write position
				(and add them to output, of course)
		else if x == 255:
			you're done decompressing (used as terminator)
			// but I'd also abort once compressed_length bytes are read, to be sure

See also https://gist.github.com/DanielGibson/8bde6241c93e5efe8b75e5e00d0b9858
("Description of Daikatana .pak format")