Thursday, February 6, 2014

Parsing Creatures 2 .S16 files (Extracting/Manipulating any of the game's graphics).

"Back then", few easy-to-use image manipulation libraries were available, which made extracting game image data an intricate process (one that most Creatures fans and developers tackled in an easy but not very user-friendly way, as we will see). Nowadays things have become easier.

In this post I will explain the Creatures 2 .s16 file format and show how you can now very easily extract any picture from the game using Python.
Not only that, but we can also harness the power of modern image manipulation libraries to convert the game sprites to any file format we like, whereas most legacy tools only export images as BMPs and leave the conversion work to us.





The s16 file format

In C2, all images are stored inside .s16 files (s16 standing for 16 bit Sprites).
Each s16 file contains one or more images.
The file format is quite simple and is explained (quite messily, by a trainee?) on the CDN.

Let's make a bit more sense of all that, as some of the information found there is slightly incorrect and most of it is hard to read.

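The general format for an s16 file is as such:

  Flags (4 bytes) | Sprite count (2 bytes) | Sprite headers (8 bytes each) | Image data (one block per sprite)
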
Where flags is all zeroes except possibly the last bit.
If the last bit is 0, the file is in "555" format; if it is 1, the file is in "565" format.
 
Each sprite header is composed as such:

  Offset for data (4 bytes) | Image width (2 bytes) | Image height (2 bytes)

Where "Offset for data" is the position from the beginning of the whole file where data for the corresponding image is found.
The size of the data field in bytes can be easily computed as "image width x image height x 2", since each pixel takes 2 bytes.
This makes indexing the file easy and, contrary to the other file formats studied so far, we can read single pictures out of s16 files without needing to parse the whole file first.
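
As a minimal sketch of that random access, assuming a 4 byte flags field, a 2 byte sprite count and 8 byte sprite headers (4 byte offset, 2 byte width, 2 byte height), all little-endian, the hypothetical helper read_sprite_data below jumps straight to one header and then to its data without touching the other sprites. The tiny demo file is hand-made for illustration, not real game data.

```python
import io
import struct

def read_sprite_data(f, index):
    """Return (width, height, raw bytes) for sprite number `index` (0-based)."""
    # File header: 4 byte flags + 2 byte sprite count = 6 bytes.
    # Each sprite header: 4 byte offset + 2 byte width + 2 byte height = 8 bytes.
    f.seek(6 + 8 * index)
    offset, width, height = struct.unpack("<LHH", f.read(8))
    # The offset is counted from the beginning of the file.
    f.seek(offset)
    return width, height, f.read(width * height * 2)

# Hand-made file with two sprites (1x1 and 2x1) to exercise the helper.
header = struct.pack("<LH", 1, 2)
headers = struct.pack("<LHH", 22, 1, 1) + struct.pack("<LHH", 24, 2, 1)
pixels = b"\x1f\x00" + b"\x00\xf8\xe0\x07"
demo = io.BytesIO(header + headers + pixels)
print(read_sprite_data(demo, 1))  # (2, 1, b'\x00\xf8\xe0\x07')
```

With a real file you would pass an object returned by open(filename, "rb") instead of the BytesIO demo.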

So what is that 565 and 555 stuff ?

You've probably come across the terms if you've spent some time in the Creatures community, but what are they about?
Basically, the 565 and 555 modes describe the way the image data is stored inside each sprite's "data" chunk.

Color images are most often described by packing a red, blue, and green value together to represent each pixel.
All images in C2 are stored in a "16bit" format.
What this means is that each pixel of the final image is stored in 16 bits (aka 2 bytes, aka 1 "word").
The 16 bit boundary is imposed by the underlying computer architecture, which "likes" to operate on values that are powers of two (2, 4, 8, 16, 32, 64, 128... 1024...). Those values will most likely be familiar to you, as they're found all over the computer world (memory sizes, connection speeds...).

The problem is that 16 cannot be divided into 3 equal parts for storing the red, green and blue values of each pixel.
Two common approaches were thus retained for storing those 3 components in 16 bit fields:
  • The "555" approach means "5 bits for red, 5 bits for green, 5 bits for blue". This wastes one bit (the leftmost one), which is always set to 0, so that each pixel fits a 16 bit entry.
  • The "565" approach means "5 bits for red, 6 bits for green, 5 bits for blue". This doesn't waste any bits to reach the 16 bits per entry boundary, but seems to give the green channel additional range compared to the other components. We will end up wasting that additional range anyway by never using those values.
Each pixel is stored on 16 bits:

  555:  0 RRRRR GGGGG BBBBB
  565:  RRRRR GGGGGG BBBBB


If you are wondering about those wasted bits: yes, it remains much more efficient to waste one bit per pixel than to pack the values more tightly and have them fall outside the power of two boundaries your computer's processor is designed to manipulate.
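
To make the two layouts concrete, here is a small sketch that splits one 16 bit pixel into its three components with shifts and masks. It assumes red sits in the highest bits and blue in the lowest, as described in the list above; the function name decode_pixel is made up for this illustration.

```python
def decode_pixel(word, is_565):
    """Split a 16 bit pixel into 8 bit (red, green, blue) values."""
    if is_565:
        # 5 bits red, 6 bits green, 5 bits blue
        r, g, b = (word >> 11) & 0x1F, (word >> 5) & 0x3F, word & 0x1F
        g_max = 63
    else:
        # unused top bit, then 5 bits for each of red, green, blue
        r, g, b = (word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F
        g_max = 31
    # Rescale each channel from its 5 or 6 bit range to 0..255.
    return (r * 255 // 31, g * 255 // g_max, b * 255 // 31)

print(decode_pixel(0xF800, True))   # pure red in 565: (255, 0, 0)
print(decode_pixel(0x7FFF, False))  # white in 555: (255, 255, 255)
```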

Those two methods for storing RGB data using 16 bits per pixel are not specific to Creatures; this is exactly how image data is stored inside 16 bit BMP files.
We will come back to this as a conclusion, but for now let's just fire up Python and parse a .s16 file to extract its data.

Parsing an actual .s16 file

As we've seen in a previous article, the "History" folder of the C2 game contains files named "Photo_XXXX.s16", where XXXX is any creature's 4 character moniker.
Those are where each creature's photo album is stored.
By parsing those files we can get the pictures out, which might be useful for a lot of things, such as browsing a creature's photo album even after its death. (Also think about how automatically extracting files and putting them in neatly organised directories could save you a lot of time while designing a new Creatures breed.)

We won't even have to dive into the ugly bit-level stuff early programmers had to deal with.
With the Python "PIL" library (distributed as "Pillow" nowadays), all of the hard stuff is taken care of for us, and reading the images out of a .s16 file becomes as simple as:

import struct
from PIL import Image  # PIL is distributed as the "Pillow" package nowadays

# All values in the file are little-endian, hence the "<" prefix, which also
# forces fixed sizes (a bare "L" can be 8 bytes long on some platforms).
def readbyte(readfromfile):
    return struct.unpack("<B", readfromfile.read(1))[0]

def readLong(readfromfile):
    return struct.unpack("<L", readfromfile.read(struct.calcsize("<L")))[0]

def readWord(readfromfile):
    return struct.unpack("<H", readfromfile.read(struct.calcsize("<H")))[0]

def readSpriteHeader(readfromfile):
    header = {}
    header["Offset"] = readLong(readfromfile)
    header["Width"] = readWord(readfromfile)
    header["Height"] = readWord(readfromfile)
    return header

#Replace this with your own file:

fic = open("Photo_0BLT.s16", "rb")

TheFile = {}

flags = readLong(fic)
TheFile["Flags"] = flags
#Only the last bit of the flags tells 555 (0) from 565 (1)
print("The file is in " + ["555", "565"][flags & 1] + " format.")

SpriteCount = readWord(fic)
TheFile["SpriteCount"] = SpriteCount
print("The file has %d sprites" % SpriteCount)

for i in range(1, TheFile["SpriteCount"] + 1):
    TheFile[i] = readSpriteHeader(fic)
    print("Image N° %d starts at %d and is %d x %d" % (i, TheFile[i]["Offset"], TheFile[i]["Width"], TheFile[i]["Height"]))

print(TheFile)

#For each sprite:
for i in range(1, TheFile["SpriteCount"] + 1):
    #Jump to the offset given in the header and read the data (2 bytes per pixel):
    fic.seek(TheFile[i]["Offset"])
    TheFile[i]["data"] = fic.read(TheFile[i]["Width"] * TheFile[i]["Height"] * 2)

    #BGR;16 is the 565 format,  BGR;15 is for 555
    if flags & 1:
        im = Image.frombytes("RGB", (TheFile[i]["Width"], TheFile[i]["Height"]), TheFile[i]["data"], "raw", "BGR;16")
    else:
        im = Image.frombytes("RGB", (TheFile[i]["Width"], TheFile[i]["Height"]), TheFile[i]["data"], "raw", "BGR;15")

    im.show()

fic.close()

Easy, wasn't it? We didn't even need to make sense of the data section. You can grab a working copy of this script here.
We could also convert the extracted data into all sorts of image formats just by changing the following line:


im.show()

to:

im.save("Image_"+str(i)+".png")

Yay! PNGs extracted from .s16 files!


Notice how the Python magic operates here: PIL automatically converts our image data to the expected format, which it guesses from the extension of the output filename we give :)
You could use any common output format such as BMP, JPEG, PDF...
We really have it easy nowadays. I sometimes feel like I'm cheating when using Python.

But how was all this done with the earliest Creatures tools?

The 565/555 easy BMP hack

If you have ever used any of the Creatures s16 conversion or manipulation tools, you might have noticed that most of them only accept "16 bit BMPs".

The reason behind that is quite simple.
As I've mentioned earlier, the way the pixel data section is stored inside the s16 files is as a series of 16 bit entries, one entry per pixel.

This happens to be the exact same scheme used to store 16 bit BMP file data.
It then made sense for developers not to waste precious CPU cycles converting back and forth between a format that standard tools can handle and a proprietary one.

Converting a .s16 sprite into the corresponding BMP file is as simple as:
  • Parsing the s16 file to get the width, the height and the data blob out of it.
  • Composing a working BMP header for the corresponding image size and mode (this is just a matter of filling in a dozen field values).
  • Blindly pasting the raw data from the s16 file into the data section of the new BMP.
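
The three steps above can be sketched as follows. This is one possible way of doing it, not necessarily what the legacy tools did: the helper name s16_sprite_to_bmp is made up, the sprite rows are assumed to be stored top to bottom (hence the negative height in the BMP header), and the colour layout is declared through BMP bit masks. The only thing that keeps the paste from being completely blind is that BMP rows must be padded to a multiple of 4 bytes, which matters for odd widths.

```python
import struct

def s16_sprite_to_bmp(width, height, data, is_565):
    """Wrap raw 16 bit sprite data in a BMP header and return the BMP bytes."""
    if is_565:
        masks = (0xF800, 0x07E0, 0x001F)  # red, green, blue bit masks
    else:
        masks = (0x7C00, 0x03E0, 0x001F)
    # BMP rows must be a multiple of 4 bytes long: odd widths get 2 padding bytes.
    row = width * 2
    pad = b"\x00" * (-row % 4)
    body = b"".join(data[y * row:(y + 1) * row] + pad for y in range(height))
    # 14 byte file header + 40 byte info header + 3 masks of 4 bytes each.
    data_offset = 14 + 40 + 12
    file_header = struct.pack("<2sLHHL", b"BM", data_offset + len(body), 0, 0, data_offset)
    # Negative height: rows run top to bottom. 1 plane, 16 bits per pixel,
    # compression 3 (BI_BITFIELDS: the colour masks follow the header).
    info_header = struct.pack("<LllHHLLllLL", 40, width, -height, 1, 16, 3, len(body), 0, 0, 0, 0)
    return file_header + info_header + struct.pack("<LLL", *masks) + body

# A hand-made 1x2 sprite in 565 format: one red pixel above one green pixel.
bmp = s16_sprite_to_bmp(1, 2, b"\x00\xf8\xe0\x07", True)
print(len(bmp))  # 74
```

Writing the returned bytes to a file opened in "wb" mode gives a .bmp file.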

Converting back from 16 bit BMPs to .s16 files is the exact same process, only with a much simpler header to generate on the .s16 side.
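
To give an idea of how simple the .s16 side is, here is a rough sketch of the header generation. It assumes the same layout as the parsing script (4 byte flags, 2 byte sprite count, then one 8 byte header per sprite, all little-endian) and takes pixel data that is already in raw 16 bit form; build_s16 is a made-up name, not an existing tool.

```python
import struct

def build_s16(sprites, is_565=True):
    """Build .s16 file bytes from a list of (width, height, raw pixel bytes)."""
    # The image data starts right after the file header and all sprite headers.
    offset = 6 + 8 * len(sprites)
    headers, blobs = [], []
    for width, height, data in sprites:
        if len(data) != width * height * 2:
            raise ValueError("expected 2 bytes per pixel")
        headers.append(struct.pack("<LHH", offset, width, height))
        blobs.append(data)
        offset += len(data)
    flags = 1 if is_565 else 0
    return struct.pack("<LH", flags, len(sprites)) + b"".join(headers) + b"".join(blobs)

s16 = build_s16([(1, 1, b"\x1f\x00"), (2, 1, b"\x00\xf8\xe0\x07")])
print(len(s16))  # 28
```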

This is why most of the image manipulation tools insist on you using only 16 bit BMPs: most of them don't really need to make actual sense of the data they manipulate, they just transpose it back and forth between gutted BMP and s16 templates.

What's next ?

Using PIL is not only about extracting data: the library also offers many image manipulation primitives to draw shapes, resize and crop, apply various filters... Once you've generated a PIL "Image" object you can use all those methods to manipulate your image and write it back.
We will cover the generation of .s16 files later on. Meanwhile, here's a small list of possibilities opened up by the ability to manipulate the game's graphics data:

  • Extracting Norn pictures from their photo album files. This can be useful for recovering pictures of dead Norns, as the game won't let you access their album after their death.
  • Understanding which object any given classifier corresponds to. In the game scriptorium, objects are referred to by classifier numbers and never by any explicit name, so mapping a script to the corresponding object can be really tedious. By reading the "Enter scope" script for a given classifier, extracting the CAOS command describing which picture is used, and then visualising that picture, one could easily get a visual clue of what each classifier is. This critical feature for easing Albian explorations is currently being implemented as a "Wikipedia Nornica" applet inside my upcoming CKC tool (along with automated descriptions of each COB's effects, extracted dynamically from the game scripts. No more wondering whether eating a given plant will be good or bad, just look it up!).
  • Extracting pictures from photo albums to incorporate them directly in our family tree making experiments.
  • Automating image extraction and repacking from/to neat directory structures while working on a new breed. This could save a lot of time while designing sprites, and make the overall workflow easier and more user-friendly.
  • Reconstructing creatures' tombstones in outside applications from data found in the history files and photo albums.
  • Automatically exporting your game's photo albums to web pages where visitors can follow your world.
  • Quickly visualising or automatically generating attachment data files on the corresponding sprites while designing a breed.
  • Generating a one-piece image of the Albian background from the background .s16 file. (The file contains hundreds of small sections of the whole world that are stitched together at runtime, but one cannot easily get the whole game background without resorting to third party tools.)
  • Of course, any other image related work such as extraction, modification, repacking...
I hope you found this useful and that adding this ability to your skillset will open new perspectives in your Albian explorations.

Also, as I'm writing all of this, I'm packing all the code snippets into a more robust and usable Python library for manipulating the various Creatures games' files. I will make it available on the site's GitHub as soon as it gets usable enough to be worth sharing.


