Documentation for SMUSH files

Index

  1. Introduction
  2. General Format of a SMUSH file
  3. Format of the ANIM chunk
  4. Format of the AHDR chunk
  5. Format of the FRME chunk
  6. Format of the NPAL chunk
  7. Format of the FOBJ chunk
  8. Format of the PSAD chunk
  9. Format of the TRES chunk
  10. Format of the XPAL chunk
  11. Format of the IACT chunk
  12. Format of the STOR chunk
  13. Format of the FTCH chunk
  14. Format of the SKIP chunk
  15. Format of the SAUD chunk
  16. Format of the STRK chunk
  17. Format of the SDAT chunk
  18. Format of the SMRK chunk
  19. Format of the SHDR chunk
  20. Format of the iMUS chunk
  21. Format of the MAP  chunk
  22. Format of the FRMT chunk
  23. Format of the TEXT chunk
  24. Format of the REGN chunk
  25. Format of the STOP chunk
  26. Format of the DATA chunk
  27. Codec description
    1. Codec 1 description
    2. Codec 3 description
    3. Codec 21 description
    4. Codec 44 description
    5. Codec 37 description

Introduction

Here I will mostly explain the content of the SMUSH files used in The Dig and Full Throttle. Most of this information may also apply to other LEC games that use the SMUSH format.

The SMUSH files in SPU(tm) games are used for three purposes:

  1. Full frame animations
  2. Font descriptions
  3. Repositories of icons or small images/animations

This description tries to be as informative as possible. Each bit of information that could be understood or decoded by the author from the files is described here. What is not at all described is how to use this information to write a SMUSH file reader.

General Format of a SMUSH file

The SMUSH format follows the rules of other LEC file formats: a file is composed of chunks. Each chunk is laid out like this:
Offset Length Type Description
0 4 Big Endian unsigned int (or char[4]) Type of the chunk
4 4 Big Endian unsigned int Length of the chunk's data (in bytes)
8 n char [] chunk's data

Some chunks contain meaningful data, whereas others are simply containers for other sub-chunks.

As noted above, this chunk layout is used by all data files of LEC games. A SMUSH file contains only one top-level chunk, of type ANIM. The ANIM chunk is a container chunk, i.e. it contains sub-chunks.
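
The chunk layout above can be read with a few lines of C. This is only a minimal sketch, not a full reader: the names smush_chunk, read_be32 and smush_read_chunk are hypothetical helpers invented for this example, and the function assumes the whole chunk is already in memory.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: one parsed chunk header plus a view of its data. */
typedef struct {
    char tag[5];          /* 4-character chunk type, NUL-terminated for printing */
    uint32_t size;        /* length of the chunk's data, in bytes */
    const uint8_t *data;  /* points at offset 8 of the chunk */
} smush_chunk;

/* Read a Big Endian 32-bit unsigned value. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse the chunk starting at buf. Returns 0 on success, -1 if the buffer
   is too short for the header or for the announced data size. */
int smush_read_chunk(const uint8_t *buf, size_t avail, smush_chunk *out) {
    if (avail < 8) return -1;
    memcpy(out->tag, buf, 4);
    out->tag[4] = '\0';
    out->size = read_be32(buf + 4);
    if (out->size > avail - 8) return -1;
    out->data = buf + 8;
    return 0;
}
```

For a container chunk such as ANIM, the same function can be called again on the chunk's data to walk its sub-chunks.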

Format of the ANIM chunk

Description

This is the generic chunk that identifies a SMUSH animation file.

Container?

Yes

List of valid sub-chunk types

There is only one AHDR chunk, and it is always the first one. It is followed by one or more FRME chunks.

Format of the AHDR chunk

Description

This chunk contains information about the SMUSH animation file (the name probably stands for Animation HeaDeR).

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 2 Little Endian unsigned short Version number of the animation
2 2 Little Endian unsigned short Number of frames in the animation
4 2 Little Endian unsigned short unknown
6 768 unsigned char[256][3] Starting palette of the animation, in RGB format
The following is optional (see comment below)
774 4 Little Endian unsigned int Secondary version number (???)
778 4 Little Endian unsigned int unknown
782 4 Little Endian unsigned int sound sampling frequency
786 8 char [8] unknown, always 0

The version number can be 0, 1 or 2. The optional part is present if and only if version == 2.

The secondary version number may be a bitfield rather than a version number. Observed values for this field are 0, 10, 12, 14 and 15, which would mean that bits 0, 1, 2 and 3 each have a meaning, but I don't know it yet...

The sound sampling frequency can be 0, 11025 or 22050. Notice that an animation that has 11025 here can still contain sound tracks with a sampling rate of 22050.

The chunk's data size is either 774 or 794, depending on the version number.

I have not yet been able to understand the meaning of the two unknown fields. Both can be either 0 or some other value.
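
As an illustration of the table and the version rule above, here is a minimal C sketch that reads an AHDR chunk's data. The struct smush_ahdr and the function smush_parse_ahdr are hypothetical names made up for this example; the field meanings are only those given in the table, and the unknown fields are simply carried along.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of the AHDR fields described above. */
typedef struct {
    uint16_t version;           /* 0, 1 or 2 */
    uint16_t frame_count;
    uint16_t unknown;
    uint8_t palette[256][3];    /* RGB triplets */
    int has_extension;          /* non-zero when the optional part was read */
    uint32_t secondary_version; /* maybe a bitfield */
    uint32_t unknown2;
    uint32_t sample_rate;       /* 0, 11025 or 22050 */
} smush_ahdr;

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* data/size describe the AHDR chunk's data (after the 8-byte chunk header).
   Returns 0 on success, -1 if the data is too short for the version. */
int smush_parse_ahdr(const uint8_t *data, size_t size, smush_ahdr *out) {
    if (size < 774) return -1;
    out->version = read_le16(data);
    out->frame_count = read_le16(data + 2);
    out->unknown = read_le16(data + 4);
    memcpy(out->palette, data + 6, 768);
    out->has_extension = (out->version == 2);
    out->secondary_version = out->unknown2 = out->sample_rate = 0;
    if (out->has_extension) {
        if (size < 794) return -1;
        out->secondary_version = read_le32(data + 774);
        out->unknown2 = read_le32(data + 778);
        out->sample_rate = read_le32(data + 782);
    }
    return 0;
}
```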

Format of the FRME chunk

Description

This is the generic chunk that acts as a container for the content of one frame of the SMUSH animation.

Container?

Yes

List of valid sub-chunk types

Format of the NPAL chunk

Description

This chunk changes the palette of the animation by specifying a new palette.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 768 unsigned char[256][3] New palette of the animation, in RGB format

The new palette is valid as soon as the chunk is read.

Animations use this chunck when they reach a cut.

Format of the FOBJ chunk

Description

This chunk contains the graphic data of the frame.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 2 Little Endian unsigned short codec used for data compression
2 2 Little Endian unsigned short Horizontal position of the start of the frame object
4 2 Little Endian unsigned short Vertical position of the start of the frame object
6 2 Little Endian unsigned short width of the frame object
8 2 Little Endian unsigned short height of the frame object
10 2 Little Endian unsigned short unknown
12 2 Little Endian unsigned short unknown
14 n char [] compressed frame's content

Several different codecs are used; they are described below.
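
The FOBJ header can be split from its compressed payload as in the following C sketch. The names smush_fobj, smush_parse_fobj and smush_fobj_codec_is_documented are hypothetical; the list of codec numbers is simply the set of codecs described in this document, not necessarily every codec that exists.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical in-memory form of the FOBJ header fields. */
typedef struct {
    uint16_t codec;
    uint16_t x, y;           /* position of the start of the frame object */
    uint16_t width, height;
    const uint8_t *payload;  /* compressed frame content (offset 14) */
    size_t payload_size;
} smush_fobj;

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the chunk data is shorter than the header. */
int smush_parse_fobj(const uint8_t *data, size_t size, smush_fobj *out) {
    if (size < 14) return -1;
    out->codec = read_le16(data);
    out->x = read_le16(data + 2);
    out->y = read_le16(data + 4);
    out->width = read_le16(data + 6);
    out->height = read_le16(data + 8);
    /* offsets 10 and 12 hold two unknown shorts; they are skipped here */
    out->payload = data + 14;
    out->payload_size = size - 14;
    return 0;
}

/* Non-zero for the codec numbers described in this document. */
int smush_fobj_codec_is_documented(uint16_t codec) {
    return codec == 1 || codec == 3 || codec == 21 || codec == 44 || codec == 37;
}
```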

Format of the PSAD chunk

Description

This chunk contains sound information for Full Throttle only. See the IACT chunk for The Dig.

This is a progressive container chunk. A progressive container chunk acts as a container over the long run: each frame may hold several PSAD chunks, each PSAD chunk has a track identifier, and the data of all PSAD chunks of the same track can be concatenated to rebuild a single chunk.

Container?

Yes, as progressive

Structure of the chunk's data

Offset Length Type Description
0 2 Little Endian unsigned short Track identifier
2 2 Little Endian unsigned short progressive index
4 2 Little Endian unsigned short number of progressive indexes ?
6 2 Little Endian unsigned short flags for the track
8 1 signed byte (maybe unsigned) volume of the track
9 1 signed byte balance of the track (0 is centered, negative means left, positive right)
10 n char [] Progressive chunk's content

If the flags field equals 127, the track is voice sound (useful if you want to mute voices). If it equals 64, the track is background music.

The maximum known value of balance is 99, so I think it represents a percentage (where 100 means completely right, -100 completely left, and 0 centered).

Progressive chunk's content

The progressive chunk contains one and only one SAUD chunk.

Format of the TRES chunk

Description

This chunk describes the position and identifier of a text to be drawn in the current frame.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 2 Little Endian signed short Horizontal position of the subtitle
2 2 Little Endian signed short Vertical position of the subtitle
4 2 Little Endian unsigned short flags
6 2 Little Endian signed short Left border of the subtitle bounding rectangle
8 2 Little Endian signed short Top border of the subtitle bounding rectangle
10 2 Little Endian signed short Width of the subtitle bounding rectangle
12 2 Little Endian signed short Height of the subtitle bounding rectangle
14 2 Little Endian unsigned short unknown
16 2 Little Endian unsigned short subtitle identifier

The subtitle is identified by a number. The text itself is contained in another file, which is game dependent: DIGTXT.TRS for The Dig, and for Full Throttle a file with the same name as the animation but with a .TRS extension instead of .SAN.

Bit 3 of the flags field indicates a subtitle, so such text can be skipped when subtitles are disabled. I'm not sure about the meaning of the other bits yet.

When bit 0 is not set, the text seems to be positioned absolutely within the frame, i.e. rendering starts from the position given in the first two fields of the TRES chunk. When it is set, the bounding rectangle is to be used, and may be shifted to "best-fit" the position given in the first two fields.

Bit 1 does not seem to be used (I haven't found any instance of this bit being set in any of the 4-5 anims used to understand the TRES chunk content).

Bit 2 has an unclear meaning... Maybe it says that the string can be split if it's too big to fit in the given position/bounding box...

My implementation right now does not have exactly the same positioning as the original game, but it's a good approximation.

No font or color information is given in the chunk. This information may be found in the text itself (using escape codes).

For Full Throttle, the fonts used are SCUMMFNT.NUT and TITLFNT.NUT (the latter only in CREDITS.SAN). For The Dig, FONT0.NUT to FONT3.NUT are used.
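
The TRES fields and flag bits discussed above can be decoded as in this C sketch. All names (smush_tres, the TRES_FLAG_* constants, smush_parse_tres, smush_tres_should_draw) are hypothetical, and the constants only restate the guesses made in this section: bit 2 in particular is uncertain.

```c
#include <stdint.h>
#include <stddef.h>

/* Flag bits as guessed in this section (names are made up for the example). */
enum {
    TRES_FLAG_BEST_FIT = 1 << 0, /* use the bounding rectangle */
    TRES_FLAG_BIT1     = 1 << 1, /* never seen set */
    TRES_FLAG_SPLIT    = 1 << 2, /* unclear, maybe "string may be split" */
    TRES_FLAG_SUBTITLE = 1 << 3  /* text is a subtitle */
};

typedef struct {
    int16_t x, y;
    uint16_t flags;
    int16_t left, top, width, height; /* bounding rectangle */
    uint16_t unknown;
    uint16_t string_id;               /* index into the .TRS text file */
} smush_tres;

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the chunk data is too short. */
int smush_parse_tres(const uint8_t *data, size_t size, smush_tres *out) {
    if (size < 18) return -1;
    out->x = (int16_t)read_le16(data);
    out->y = (int16_t)read_le16(data + 2);
    out->flags = read_le16(data + 4);
    out->left = (int16_t)read_le16(data + 6);
    out->top = (int16_t)read_le16(data + 8);
    out->width = (int16_t)read_le16(data + 10);
    out->height = (int16_t)read_le16(data + 12);
    out->unknown = read_le16(data + 14);
    out->string_id = read_le16(data + 16);
    return 0;
}

/* Subtitles are skipped when the player has them disabled. */
int smush_tres_should_draw(const smush_tres *t, int subtitles_enabled) {
    if ((t->flags & TRES_FLAG_SUBTITLE) && !subtitles_enabled) return 0;
    return 1;
}
```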

Format of the XPAL chunk

Description

This chunk performs palette transitions inside the animation.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 2 Little Endian unsigned short size of the palette (???)
2 2 Little Endian unsigned short unknown
The following is dependent on the size of the chunk's data (see comment below)
4 2 Little Endian unsigned short index of the transition ???
The following is dependent on the size of the chunk's data (see comment below)
4 1536 Little Endian signed short [256][3] Palette delta value
1540 768 unsigned char [256][3] Starting Palette in RGB format

If the chunk's data size equals 2308, then the second part must be read; otherwise the size is 6 and the first part is to be read.

In a SMUSH animation, the first XPAL chunk always has a size of 2308 (type 2); the XPAL chunks in the following frames have a size of 6 (type 1).

Animations use this chunk to do fade-ins, fade-outs or other palette manipulations.

The type 2 chunk gives several pieces of information. It probably gives the size of the palette to use; I'm not sure about this, because this value is always 256. It also gives the starting palette, and a list of delta values to apply to the palette at each occurrence of the following XPAL chunks.
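
The two XPAL layouts can be told apart by the data size alone, as this C sketch shows. The names smush_xpal, XPAL_STEP, XPAL_SETUP and smush_parse_xpal are hypothetical. The sketch only extracts the fields; how exactly the deltas are combined with the palette is not described here, so it is deliberately left out.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

enum { XPAL_STEP = 1, XPAL_SETUP = 2 }; /* "type 1" and "type 2" above */

typedef struct {
    int type;
    uint16_t palette_size;  /* always 256 in known files */
    uint16_t unknown;
    uint16_t step_index;    /* type 1 only */
    int16_t delta[256][3];  /* type 2 only */
    uint8_t start[256][3];  /* type 2 only, RGB */
} smush_xpal;

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the size is neither 6 nor 2308. */
int smush_parse_xpal(const uint8_t *data, size_t size, smush_xpal *out) {
    if (size != 6 && size != 2308) return -1;
    memset(out, 0, sizeof *out);
    out->palette_size = read_le16(data);
    out->unknown = read_le16(data + 2);
    if (size == 6) {
        out->type = XPAL_STEP;
        out->step_index = read_le16(data + 4);
    } else {
        const uint8_t *p = data + 4;
        out->type = XPAL_SETUP;
        for (int i = 0; i < 256; i++)
            for (int c = 0; c < 3; c++, p += 2)
                out->delta[i][c] = (int16_t)read_le16(p);
        memcpy(out->start, data + 1540, 768);
    }
    return 0;
}
```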

Format of the IACT chunk

Description

This chunk contains sound information, mainly for The Dig. See the PSAD chunk for Full Throttle.

This is a progressive container chunk. Progressive container chunks act as containers over the long run.

For each frame, several IACT chunks may be found. Each IACT chunk has a track identifier.

The data of all IACT chunks of the same track can be concatenated to create a new chunk.

In The Dig, the chunk is used as a container for sound data. In Full Throttle, it is used for other purposes (not yet known, but probably to give information about the sound: volume, balance, other?).

Container?

Yes, as progressive

Structure of the chunk's data

Offset Length Type Description
0 2 Little Endian unsigned short operation code ???
2 2 Little Endian unsigned short flags ???
4 2 Little Endian unsigned short unknown
6 4 Little Endian unsigned int Track Identifier

The track identifier seems to be split into two shorts. I don't know the meaning of each sub-value. I suppose the first identifies the main track number, and the other identifies the current sample within the track.

The same track can be used several times in a single SMUSH animation.

Possible meanings of the operation codes:
0 Prepare sound playback of track ???
1 Start playback of track ???
2 Unknown; a code 2 always precedes a series of code 4
3 Unknown; a code 3 always follows a series of code 4
4 Unknown ???
5 Some kind of update of track with a sequence ???
6 Some kind of sequence ???
7 Some kind of update of track ???
8 Feed sample data to track (used by The Dig)

If the code is 8, then the rest of the chunk's content has the following meaning:

Offset Length Type Description
10 2 Little Endian unsigned short progressive index
12 2 Little Endian unsigned short number of progressive indexes
14 4 Little Endian unsigned int remaining size of progressive content
18 n char [] Progressive chunk's content

The "progressive index" value ranges from 0 to "number of progressive indexes" - 1.

"number of progressive indexes" has the same value in every IACT chunk of a given track.

The "remaining size of progressive content" is very interesting: when index == 0, this value gives the complete size of the progressive chunk, and when this value is equal to n, this is the last IACT chunk for the track.

Progressive chunk's content

The progressive chunk contains one iMUS chunk.
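
The first/last rules for code 8 IACT chunks can be expressed as in the following C sketch. The names smush_iact, smush_parse_iact, smush_iact_is_first and smush_iact_is_last are hypothetical; a real player would also append each content slice to a per-track buffer, which is omitted here.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint16_t code, flags, unknown;
    uint32_t track_id;
    /* The fields below are only filled in when code == 8. */
    uint16_t index, index_count;
    uint32_t remaining;       /* remaining size of progressive content */
    const uint8_t *content;   /* this chunk's slice of the progressive chunk */
    size_t content_size;      /* "n" in the table above */
} smush_iact;

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the chunk data is too short. */
int smush_parse_iact(const uint8_t *data, size_t size, smush_iact *out) {
    if (size < 10) return -1;
    out->code = read_le16(data);
    out->flags = read_le16(data + 2);
    out->unknown = read_le16(data + 4);
    out->track_id = read_le32(data + 6);
    out->index = out->index_count = 0;
    out->remaining = 0;
    out->content = NULL;
    out->content_size = 0;
    if (out->code == 8) {
        if (size < 18) return -1;
        out->index = read_le16(data + 10);
        out->index_count = read_le16(data + 12);
        out->remaining = read_le32(data + 14);
        out->content = data + 18;
        out->content_size = size - 18;
    }
    return 0;
}

/* First slice: "remaining" holds the total size of the progressive chunk. */
int smush_iact_is_first(const smush_iact *c) {
    return c->code == 8 && c->index == 0;
}

/* Last slice: what remains is exactly what this chunk carries. */
int smush_iact_is_last(const smush_iact *c) {
    return c->code == 8 && c->remaining == c->content_size;
}
```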

Format of the STOR chunk

Description

This chunk is probably an instruction to the SMUSH engine to store the current frame for further processing...

Only used in two anims from Full Throttle: DAZED.SAN and MO_FUME.SAN.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 unsigned char [4] unknown

Format of the FTCH chunk

Description

This chunk is probably an instruction to the SMUSH engine to retrieve the frame previously stored...

Only used in two anims from Full Throttle: DAZED.SAN and MO_FUME.SAN. The frames that contain these chunks do not contain any FOBJ chunks.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 2 Little endian signed short unknown (some kind of counter ?)
2 4 unsigned char [6] unknown (always 0)

Format of the SKIP chunk

Description

The SKIP chunk instructs the SMUSH player to render the following frame object (FOBJ) chunk only if the named flag is set.

This chunk is always followed by a frame object chunk.

The SMUSH player should decide whether to draw the frame object depending on its ID and some game state.

To better explain, here is an example:
The TORANCH.SAN animation contains 3 identifiers. The first one should be set when the truck has been destroyed by the Cavefish. The second should be set when the truck has been pushed by Ben, and the last one only when Bolus and the other bad guy have crashed their cars.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 Little Endian unsigned int Identifier of the next frame object.

Format of the SAUD chunk

Description

This is the generic chunk that acts as a container for the content of a sound track of the SMUSH animation.

Container?

Yes

List of valid sub-chunk types

The chunk starts with a STRK sub-chunk, then an SDAT. An SMRK chunk is optional, but the SAUD chunk is always terminated by a SHDR chunk.

Format of the STRK chunk

Description

Unknown

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 1 unsigned byte unknown (1 or 6)
1 1 unsigned byte unknown (8 or 12)
2 4 unsigned int unknown (always 0)
6 4 Big Endian unsigned int sample size ???
The following is optional
10 4 Big Endian unsigned int sampling rate ???

Format of the SDAT chunk

Description

Sound data, unsigned 8-bit mono.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 n unsigned char[] Sound sample data.

Format of the SMRK chunk

Description

Sound Marker ????

Container?

No

Structure of the chunk's data

Offset Length Type Description
The following is repeated until the end of the chunck
n 1 unsigned byte Flags or instruction ???
n+1 1 unsigned byte Position ???
n+2 2 unsigned short unknown (0) ???
n+4 2 unsigned short instruction ???

Format of the SHDR chunk

Description

Sound header ?? If it is a header, why is it at the end of the sound?

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 Little Endian unsigned int Sound sample rate.

Format of the iMUS chunk

Description

This is the generic chunk that acts as a container for the content of an iMUSE track of the SMUSH animation.

Container?

Yes

List of valid sub-chunk types

Format of the MAP  chunk

Description

This chunk is a container for the mapping (??) of an iMUSE track of the SMUSH animation.

Notice the space character at the end of the chunk's name.

Container?

Yes

List of valid sub-chunk types

Format of the FRMT chunk

Description

Description of the sound's format.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 Big Endian unsigned int Sound start position ???
4 4 Big Endian unsigned int unknown.
8 4 Big Endian unsigned int Bit size of the sound sample.
12 4 Big Endian unsigned int rate of the sound sample.
16 4 Big Endian unsigned int number of channels of the sound sample.

Format of the TEXT chunk

Description

Name of the sound track ??

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 Little Endian unsigned int Sound start position ???
4 n unsigned char[] zero-terminated ASCII string

Format of the REGN chunk

Description

Sound region ??

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 Little Endian unsigned int Sound start position
4 4 Little Endian unsigned int Sound size ??

Format of the STOP chunk

Description

Descriptor of when to stop the sound ??

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 4 Little Endian unsigned int Sound stop position ??

Format of the DATA chunk

Description

Sound data.

Container?

No

Structure of the chunk's data

Offset Length Type Description
0 n unsigned char[] Sound sample data.

Codec description

Codec 1

Codec 1 as implemented in ScummVM (as of 30/07/2002) is incorrect!

Structure of the codec's data
Offset Length Type Description
The following is repeated for each line of the frame object
n 2 Little Endian unsigned short Size of the line's compressed data, in bytes
The following is repeated until the size of the line is reached
n + 2 1 unsigned byte RLE code byte. (see comment below)
The following is valid if bit 0 is set
n + 3 1 unsigned byte color index
The following is valid if bit 0 is not set
n + 3 RLE length unsigned byte[] color indexes

The RLE code byte contains two pieces of information. Bits 1-7 contain the length of the decompressed data minus one, and bit 0 is a flag that indicates whether the following data is a single byte to be repeated or uncompressed data. In pseudo-C, the decompression algorithm looks like this:

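for(each line) {
	int line_size = next_word();	// size of this line's compressed data, in bytes
	while(line_size > 0) {
		int code = next_byte();
		line_size -= 1;
		int len = (code >> 1) + 1;	// bits 1-7: decompressed length minus one
		if(code & 1) {
			// bit 0 set: one color index repeated len times
			int value = next_byte();
			line_size -= 1;
			if(value != 0)
				while(len--) put_byte(value);
			else
				skip(len);	// index 0 is transparent
		} else {
			// bit 0 not set: len uncompressed color indexes
			line_size -= len;
			while(len--) {
				int value = next_byte();
				if(value) put_byte(value);
				else skip(1);
			}
		}
	}
}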

Codec 1 is transparent, with 0 as the transparency index.
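
For readers who prefer compilable code, the codec 1 algorithm can be written in plain C roughly as follows. This is a sketch under my own assumptions: the function name codec1_decode and its parameters are hypothetical, the destination is a simple 8-bit buffer with a given pitch, and runs that would overflow the given width are rejected rather than clipped.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Decode `height` codec 1 lines from src into dst (8-bit pixels, `pitch`
   bytes per row). Pixels with color index 0 are left untouched in dst.
   Returns the number of source bytes consumed, or -1 on malformed input. */
long codec1_decode(const uint8_t *src, size_t src_size,
                   uint8_t *dst, int width, int height, int pitch) {
    size_t pos = 0;
    for (int y = 0; y < height; y++) {
        if (src_size - pos < 2) return -1;
        size_t line_size = (size_t)src[pos] | ((size_t)src[pos + 1] << 8);
        pos += 2;
        if (line_size > src_size - pos) return -1;
        const uint8_t *p = src + pos;
        const uint8_t *end = p + line_size;
        uint8_t *row = dst + (size_t)y * (size_t)pitch;
        int x = 0;
        while (p < end) {
            int code = *p++;
            int len = (code >> 1) + 1;
            if (x + len > width) return -1;
            if (code & 1) {
                /* one color index repeated len times */
                if (p >= end) return -1;
                uint8_t value = *p++;
                if (value != 0) memset(row + x, value, (size_t)len);
            } else {
                /* len literal color indexes */
                if (end - p < len) return -1;
                for (int i = 0; i < len; i++)
                    if (p[i] != 0) row[x + i] = p[i];
                p += len;
            }
            x += len;
        }
        pos += line_size;
    }
    return (long)pos;
}
```

Because index 0 is skipped rather than written, the caller decides what shows through: the destination can hold the previous frame or a background.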

Codec 3

Codec 3 behaves exactly like codec 1.

Codec 21

Codec 21 is used mostly for fonts.

In pseudo-C, the decompression algorithm looks like this :

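for(each line) {
	int line_size = next_word();	// size of this line's compressed data, in bytes
	boolean zero = true;	// each line starts with a run of skipped pixels
	while(line_size > 0) {
		int len = next_word();
		line_size -= 2;
		if(zero == true) {
			skip(len);
		} else {
			len++;
			line_size -= len;	// the literal bytes also count against the line size
			while(len--) {
				int value = next_byte();
				if(value) put_byte(value);
				else skip(1);
			}
		}
		zero = ! zero;
	}
}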

WARNING: The coded frame object is one pixel bigger in width and height than the value set in the FOBJ chunk header. This is only true for codecs 21 and 44.
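
A compilable C sketch of the codec 21 line loop is given here. The name codec21_decode_line and its parameters are hypothetical. It assumes that the line size word counts every byte of the line (skip words, length words and literal bytes), and it silently clips pixels that fall outside the row, which is my own choice and not something the files confirm.

```c
#include <stdint.h>
#include <stddef.h>

/* Decode one codec 21 line. `line` points just after the 2-byte line size
   word and `line_size` is the value of that word. `row` receives the
   pixels; zero bytes are transparent and leave `row` untouched.
   Returns 0 on success, -1 if the line data is truncated. */
int codec21_decode_line(const uint8_t *line, size_t line_size,
                        uint8_t *row, size_t row_width) {
    size_t pos = 0, x = 0;
    int skipping = 1; /* a line starts with a skip count */
    while (line_size - pos >= 2) {
        size_t len = (size_t)line[pos] | ((size_t)line[pos + 1] << 8);
        pos += 2;
        if (skipping) {
            x += len;
        } else {
            len++; /* stored as length minus one */
            if (len > line_size - pos) return -1;
            for (size_t i = 0; i < len; i++) {
                uint8_t value = line[pos + i];
                if (value != 0 && x + i < row_width) row[x + i] = value;
            }
            pos += len;
            x += len;
        }
        skipping = !skipping;
    }
    return pos == line_size ? 0 : -1;
}
```

Given the warning above, a caller would presumably pass the FOBJ width plus one as row_width.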

Codec 44

Codec 44 behaves exactly like codec 21.

Codec 37

This codec is used for the full frame animations. This is the big one and the most difficult to understand.

The codec works with two buffers. Images are generated either by reading encoded data or from the previous frame.

The codec splits images into small blocks. Each block corresponds to a square of 4 pixels in width and height.

Structure of the codec's data
Offset Length Type Description
0 1 unsigned char subcodec identifier
1 1 unsigned byte table index ??
2 2 Little Endian unsigned short frame sequence number
4 4 Little Endian unsigned int size of decoded data
8 4 Little Endian unsigned int size of encoded data
12 4 Little Endian unsigned int flags ??
16 n unsigned char[] encoded data

There are 5 subcodec values:

  1. subcodec 0 means the data is a full uncompressed frame.
  2. subcodec 1 uses previous frame information, and is block based. See below.
  3. subcodec 2 means the data is a full frame compressed with an RLE algorithm similar to codec 1, except that there is no concept of line. See below.
  4. subcodec 3 uses previous frame information, and is block based. See below.
  5. subcodec 4 uses previous frame information, and is block based. See below.

subcodec 1 information

Subcodec 1 uses a double compression. The first decompression step is a simple RLE decoding. In pseudo-C, the decompression algorithm looks like this:

while(compressed data available) {
	int code = next_byte();
	int len = (code >> 1) + 1;
	if(code & 1) {
		int value = next_byte();
		while(len--) put_byte(value);
	} else {
		while(len--) {
			int value = next_byte();
			put_byte(value);
		}
	}
}

Once the data has been decoded this first time, a second decompression takes place. In pseudo-C, the decompression algorithm looks like this:

while(compressed data available) {
	int code = next_byte();
	if(code == 0xFF) {
		char data[8];
		read_a_block(data); // This reads a block from the compressed data
		put_block(data);
	} else {
		int offset = OFFSET_TABLE[code];
		put_block_from_offset(offset); // See below for more information about OFFSET_TABLE
	}
}

Subcodec 1 is used only in some Full Throttle animations that are available on the CD but are never played during the game. These animations are probably working versions of other animations. An interesting feature is that the RLE compressor was buggy, and the animations are 384*242 in size, instead of 320*200.

subcodec 2 information

Subcodec 2 uses a simple RLE encoding. In pseudo-C, the decompression algorithm looks like this:

while(compressed data available) {
	int code = next_byte();
	int len = (code >> 1) + 1;
	if(code & 1) {
		int value = next_byte();
		while(len--) put_byte(value);
	} else {
		while(len--) {
			int value = next_byte();
			put_byte(value);
		}
	}
}

Subcodec 2 is used at cuts in the animation to set up a new frame.
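
The line-less RLE used by subcodec 2 (and by the first stage of subcodec 1) can be written as a bounds-checked C function. This is a sketch: the name codec37_rle_decode and its signature are hypothetical, and I assume the caller sizes the destination from the "size of decoded data" field of the codec 37 header.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Expand the RLE stream in src into dst. Unlike codec 1 there are no
   per-line size words and no transparent index: every byte is written.
   Returns the number of bytes written, or -1 if the input is truncated
   or the output would not fit in dst_size bytes. */
long codec37_rle_decode(const uint8_t *src, size_t src_size,
                        uint8_t *dst, size_t dst_size) {
    size_t in = 0, out = 0;
    while (in < src_size) {
        int code = src[in++];
        size_t len = (size_t)(code >> 1) + 1;
        if (len > dst_size - out) return -1;
        if (code & 1) {
            /* bit 0 set: one byte repeated len times */
            if (in >= src_size) return -1;
            memset(dst + out, src[in++], len);
        } else {
            /* bit 0 clear: len literal bytes */
            if (len > src_size - in) return -1;
            memcpy(dst + out, src + in, len);
            in += len;
        }
        out += len;
    }
    return (long)out;
}
```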

subcodec 3 information

The behaviour of subcodec 3 changes depending on the value of bit 0 of the flags field of the codec 37 header.

subcodec 4 information