Skip to content

Extract data from binary .nif file using the nif.xml file description

neomonkeus edited this page Feb 8, 2016 · 6 revisions

Introduction

In this document, we will present how to read data from the Nif files.
It is assumed you have basic knowledge of low-level programmation : what a string is, what a byte is, what a condition is and what a variable is.

Tools

We recommend to equip yourself with a Binary reader first. The HxD tool is a free Windows tool to read such files. You can also consider 010, which has a 30-day trial but adds a bunch more options such as templates to see what is what in a file.

If you decide to go with 010, you can find templates made for the Nif files on the NifTools ecosystem.

Reading the NIF file

All the examples in this document will be based on the file "OutfitCaitGO", from the Fallout 4 game.

Where to start

The Nif header contains all the necessary information to read the file. It mostly contains the file version, the list of the actual data to read and sometimes - it depends of the version - the strings used in the file.

To read the header, go to the nif.xml file and search for the "compound" object named Header.

<compound name="Header">
	The NIF file header.
	<add name="Header String" type="HeaderString">[...]</add>
	<add name="Copyright" type="LineString" arr1="3" ver2="3.1" />
	<add name="Version" type="FileVersion" default="0x04000002" ver1="3.3.0.13">[...]</add>
	[...]
</compound>

Then, open the binary file and begin to read the described XML object.

Here is how it works :

Image describing how the XML and the Binary data correlates

A compound (it's also true for a niobject) describes the contents of the bloc in order.
Each content is called a member of the compound or a member of the niobject.

For this specific bloc, you'll first want to read the "HeaderString" member.
To find what an "HeaderString" is, go to the nif.xml file again and find the basic xml node with a "name" attribute of "HeaderString".

<basic name="HeaderString" count="0" niflibtype="HeaderString" nifskopetype="headerstring">
	A variable length string that ends with a newline character (0x0A).  The string starts as follows depending on the version:
	Version &lt;= 10.0.1.0:  &#039;NetImmerse File Format&#039;
	Version &gt;= 10.1.0.0:  &#039;Gamebryo File Format&#039;
</basic>

Then, read the bloc content to know what the member actually is.
Here, it's a series of characters ended by the 0A value.
Finally, read all the bytes from the file until you read the value 0A. All the bytes before are the actual chars you have to gather.

You then should have a resulting string like this : Gamebryo File Format, Version 20.2.0.7

Save your current postion, as it is supposed to be your current position in the file, and continue to read from there.

Version condition

In compounds or objects, some members have an associated attribute called ver1 or ver2. These are version checking conditions.

Basically, the version condition is verified when ver1 < version < ver2.
If either ver1 (resp. ver2) isn't specified, then the lower (resp. higher) bound isn't checked.

Some examples:

  • version = 10.0.0.3 ; ver1 = 3.0.0.1 ; ver2 = 10.0.0.2
    The condition isn't checked
  • version = 20.0.0.7 ; ver1 = 3.0.0.1
    The condition is checked
  • version = 20.0.0.7 ; ver2 = 3.1
    The condition isn't checked

Let's go back to our example header. We had the HeaderString Gamebryo File Format, Version 20.2.0.7, and we're asked to read the next member:

<compound name="Header">
	The NIF file header.
	<add name="Header String" type="HeaderString">[...]</add>
	<add name="Copyright" type="LineString" arr1="3" ver2="3.1" />
	<add name="Version" type="FileVersion" default="0x04000002" ver1="3.3.0.13">[...]</add>
	[...]
</compound>

The next member is then Copyright. However, it has a ver2 attribute of 3.1.
We already extracted version from the HeaderString, and the version value is 20.2.0.7.
Because 20.2.0.7 > 3.1, the version condition isn't met.

Therefore, we don't read the Copyright member.

However, we will read the next member, <add name="Version" type="FileVersion" default="0x04000002" ver1="3.3.0.13">[...]</add>, as its version check (ver1 < version) is met by our current version.

Arrays

The idea of an array is that the array size is either static (as seen in Copyright) or dynamic. A dynamic size will be defined by either a memberor an ARG. We will see ARGs later, but for array lengths it works the same way a member size would work.

An array is defined by the arr1 attribute. In Copyright, arr1 was associated to 3. That meant that the intended value was three consecutive instances of the Copyright type, LineString.

Let's go further into the header :

<compound name="Header">
	[...]
	<add name="Num Block Types" type="ushort" ver1="10.0.1.0">[...]</add>
	<add name="Block Types" type="SizedString" arr1="Num Block Types" ver1="10.0.1.0">[...]</add>
	<add name="Block Type Index" type="BlockTypeIndex" arr1="Num Blocks" ver1="10.0.1.0">[...]</add>
	<add name="Block Size" type="uint" arr1="Num Blocks" ver1="20.2.0.7">[...]</add>
	[...]
</compound>

Here, we can see that the Num Block Types value is used for arr1 for all the next members.

What that basically means is that the value acquired from the file when reading Num Block Types is actually the size of the array.

To go back to our example, we have the following data :

Image describing the reading of an array

The first value to read is a ushort that describes the length. As seen at the beginning of the file, a ushort is an 'unsigned 16-bits integer', which means it is a 2-byte object.

Here, the ushort Num Block Types is 7. We will read 7 array components.

Each array component is of the following format :

    <compound name="SizedString" niflibtype="string" nifskopetype="sizedstring">
        A string of given length.
        <add name="Length" type="uint">The string length.</add>
        <add name="Value" type="char" arr1="Length">The string itself.</add>
    </compound>

The structure of an array 'component' is therefore a Length value, which according to the beginning of the file is a 32-bits (4-bytes) number.
Then, there is Length consecutive char values, each char being an 8-bits (1-byte) value. A char can also be seen as a letter.

We can then acquire the data : we will do the operation "read a SizedString" 7 times in a row, each operation being "read the length" then "read the chars".

Cond

Some members, in addition to the ver1, ver2 and arr1 attributes, have a cond attribute.
The cond attribute describes a condition for the member to exist, much like ver1 and ver2 do.
It has the ability to be custom described and much like arr1 attributes it can take the value of another member as a variable.

Arg

The ARG value is in reality a value passed from a compound or object to a member. It isn't used in the header, but later in the file you might find the "arg" attribute.

The value you pass as the arg attribute is used in the underlying member as an ARG variable. It can therefore be used as an array size or in a cond.

Block reading

The only thing always read in the nif.xml document is the header. The idea is that all other data in the file is first described in the header itself.

In the header, we already saw there is a Block Types array:

<compound name="Header">
	[...]
	<add name="Num Block Types" type="ushort" ver1="10.0.1.0">[...]</add>
	<add name="Block Types" type="SizedString" arr1="Num Block Types" ver1="10.0.1.0">[...]</add>
	[...]
</compound>

The idea is that the array describes all the available types in the file.

There is also a Block Type Index array in the header :

<compound name="Header">
	[...]
	<add name="Num Blocks" type="ulittle32" ver1="3.3.0.13">[...]</add>
	[...]
	<add name="Block Type Index" type="BlockTypeIndex" arr1="Num Blocks" ver1="10.0.1.0">[...]</add>
	[...]
</compound>

This array is a list of all the blocks in the file. Each member of this array is a "BlockTypeIndex" :

<basic name="BlockTypeIndex" count="0" niflibtype="unsigned short" nifskopetype="blocktypeindex">
	A 16-bit (signed?) integer, which is used in the header to refer to a particular object type in a object type string array.
	The upper bit appears to be a flag used for PhysX block types.
</basic>

These values are in reality positions in the header array.

Let's see the values we have in our header example:

Image describing the values in the header of the file

We have 7 values, which are basically just numbers from 0 to 6. They are the positions in the array, and describe all the blocks we have to read. Here, the file is quite simple, as there is only one instance for each bloc. But keep in mind that in other files it could be more complex with repeating values.

Now, once you've finished to read the header, you can read each block you found one by one from the file.

Just read the niobject that corresponds to the Bloc Type (e.g. here you'll first read the NiNode object, because the first object is of type 00 which is the type NiNode)

Strings

As from version 20.1.0.3, Nif files contains their strings in the header. The idea is that each bloc needing a string will read it from the head of the file.

String reading is much like Block Type reading, and you'll in reality have a "reference" in the form of a value. This value is the position in the header String array.

What now ?

Well, hopefully you now know everything you need to try and read Nif files yourself.

Of course, there is several utils by the NifTool community already able to do that automatically:

  • NifSkope is the most known util to read Nif files.
  • PyFFI supports Nif files, and is used as part of the Belnder plugin
  • NifLib is a library made specifically to read and write Nif files, and is used as part of several tools (the 3DsMax and Maya plugins, Outfit Studio).

But knowing how the Nif files works at a basic level is an useful skill when a new version comes out or when you try to decode unknown parts of the file.

You can always refer to the full Nif.XML reference here

Clone this wiki locally