Thursday, September 6, 2018

Kowalski's Image Format (KIF)



I have developed a new lossless image format that is consistently (but not always) better than the .png format.

The files referred to in this post are here:
kifany.exe   convert.exe 

The purpose originally was to have my own compressed-ish image format so that I didn’t need to rely on others’ libraries, for use in the development of my game - with working title: “Kow’s Balls”.

Result, in brief, is a consistent improvement of, on average, 6% over the same sample images saved as .png files.

This statistic comes from a varied sample of 24 images that ranged from cartoonish and flat to sharp and detailed.  The range of improvement compared to the same files saved as .pngs was: -13% to (+)20%.

Give it a go.

The compression work is done by my own library (that is image format agnostic) that does not use any external libraries, that however does use a front end to shuttle image data between the world and itself (the compression lib).

I’ve created three variants of execuable each with a different (internal) front-end to handle this image i/o to a greater or lesser extent, some of which do use an existing image handling library (again, only for the initial loading/ final saving of images, and not for any part of the compression):
  • kif.exe             – converts between kif files and targa files only
  • kifbmp.exe     – converts between kif files and bitmaps (in addition to targas) using the CImg library 
  • Kifany.exe (+ convert.exe) – converts between kif files and pretty much any other image format, using the CImg library that itself calls ImageMagick’s convert.exe (tested with png, jpg, tiff, pcx, but probably works with a bunch of others) in addition to targas and bmps

To just test the compression, use kif.exe or kifbmp.exe, and compare the output .kif to a .png of the same original file.  For greatest flexibility,use kifany.exe (that itself calls convert.exe). 

-Usage-
 
Where you see “kif.exe”, interpret that as either “kif.exe”, “kifbmp.exe” or “kifany.exe”, depending on the kinds of file you’re working with.
  • kif.exe [filename].[img ext]                   - outputs [filename].kif
  • kif.exe [filename].kif                             - outputs [filename]_.bmp (default) 
  • kif.exe [filename].kif - out [img ext]      - outputs [filename]_.[img ext]
  • kif.exe [...] -p                         - prints some progress and metrics to console during execution

Notice output files, other than kifs, get an extra underscore after the filename. 

-Some examples-

From the console, (don’t type “<console>”):

<console> kif mycat.tga             (outputs mycat.kif)
<console> kif mycat.kif              (output mycat_.tga)
<console> kifbmp mycat.bmp    (outputs mycat.kif)
<console> kifbmp mycat.kif       (output mycat_.bmp)
<console> kifany mycat.jpg       (outputs mycat.kif)
<console> kifany mycat.kif -out png     (outputs mycat_.png)

The most direct way of comparing the compression between pngs and kifs, is to directly process a png and see how big the kif is that gets output.

<console> kifany mycat.png    (outputs mycat.kif)


As is, this is a ‘proof of concept’ free-standing program, nothing ‘installs’. Just put the exe/s where you have some images and run (from console) from there.   Alternatively, put the executable/s somewhere and point your Path to that location, or put the files where Path already points. Then run from a console anywhere you have images.   Got it?  Good.

-About targas-  

I’ve noticed that at least one program saves targas in a way that can produce strange results.  If you compress then uncompress a targa and the image is flipped with weird colouring, try saving the original targa with another program and try again, or use bmps instead.
 
-----------------------------------------------------------------------------------------------------------------
About the programming environment 

All done in C++(11) with gcc version 5.1.0 (tdm64-1), using the Code::Blocks IDE.

I make extensive use of the C++ Standard Template Library.

The kif library including its (my) dependant libraries has ~1550 lines of code.  The front-ends each have about 100 to 200.  So pretty compact, though it does seem to compile to a relatively large executable...

I’ve built for Windows only for now. But there’s nothing system specific that I know of (endianness wrt bit-packing?) that would inhibit compiling for Linux or Apple – when the day comes.
 
-----------------------------------------------------------------------------------------------------------------
Method (short version)

Identify all distinct two pixel aligned patterns and replace them with Huffman codes, then bit-pack everything. 

-Why only two pixels?-

I initially supposed that longer patterns would be better. I tried 4, 8, and longer, and some in-between.  I tried rectangular regions too.  The result of using longer patterns is that there will be fewer distinct patterns, which produces a corresponding increase in the number of codes, which in terms of the Huffman coding used, means more longer codes, which finally means less compression. 

-Why aligned?-
 
Because that ‘fixes’ where to perform processing, making everything that little bit easier.



-----------------------------------------------------------------------------------------------------------------
Special features and or known bugs


What I know or have otherwise noticed about the program:
  • Purpose built for 24 and 32 bit images; (for those formats that support 32 bit)  
  • Not yet tolerant of 8 bit indexed images. Program currently treats these as 24 bit  
  • Images must have even width and height.  Because I just happened to be build-testing with images that were even and so didn’t notice that I hadn’t accounted for odd sizes until running my statistics and I can’t be bothered attending to that now because the program does what I want it to do as is (mostly) and I’m happy to require myself to use even images until I care enough otherwise so there  
  • All serial processing; no multi-threading nor parallel/ GPU processing
I intend to address these in due course.
 
-----------------------------------------------------------------------------------------------------------------
Magic Source, two kinds

Very broadly, there are two main technologies involved; Huffman-coding and bit-packing.  Each is its own comp sci topic but here’s a brief note on each for now. 

-Huffman coding-
   
Huffman coding is the process of generating (variable length) codes based on the probability of the items occurring, which will result in shorter codes for more common items. That’s important. And where no code is a segment of another code.  That is, if one code is “234”, then there can’t be another code such as “2345”.  This ensures that when you have a datum string that looks like a code, it is that code and not part of another longer code.  That’s the second point of Huffman coding.  In the kif case, I’m coding for 16 bit chunks (2 x 8 bit colour values).  A common pattern is {0, 0} (no change).  In some cases (but not all) this pattern (of 16 bits) is represented by a Huffman code of 1 bit; ‘0’ or ‘1’.  The result is the replacement of 16 bits with 1 bit!  If there’s a lot of not much changing, then the compression can be significant.

Below is some data on an image that was reduced by 36% (compressed to 64% of original size).  It is marginally better than the .png (reduced by 35%), and is about half way down my list of 24 images in terms of compression rate, so it’s a reasonable pick to investigate.

Here’s the length of the H codes against how many codes there are of that length, and that count as a percentage of the total.  In other words these are the stats for the 'alphabet' of codes.


Here are the codes again but this time showing how many times each occurrs in the (same) image

  
  
What have I shown here?  The top table and chart shows that most of the generated codes are long but the bottom chart and table shows that those longer codes don’t actually occur much, that it’s the smaller codes that occur a lot. In fact that one single code of 3 bits constitutes almost 10% of all code occurrences in the image, repeating more than 86,000 times from over 870,000 total instances.   (So that’s 3 bits replacing 16 bits 10% of the time.)  The bottom data also shows that more than half (~56%) of the codes used are 8 or fewer bits long, that is, more than half the data has been replaced by codes half as long as the original (16 bit) raw data.

-Bit packing-
   
Suppose you have some numbers; 3, 10, 45, 154, 27, 79. Below is a series of bytes with the values, in binary as they are stored in memory.

 
Notice I put a box around the segment that contains the bits required to represent the value; the significant bits. 

Below are the significant bits packed into bytes.  The colours represent the distinct values, and the boxes are the byte boundaries.  You can see how some of the bit strings pack into one byte while others span byte boundaries.  And obviously the total number of bytes is smaller.  That’s because the bits have been packed.



To unpack, given that the 'natural' boundaries have been stripped, we must either know how many bits are to be expanded to make up a proper item, or have a way of checking against known patterns (like say, by using Huffman codes!).  I use the former for the code table and the latter for the encoded image body. 

For the image scrutinised:
  • Raw bitmap size is 1710 kB.
  • 8 kB for the code table + 885 kB for the packed body + 199 kB for the raw un-encoded data (+ some for the file header; width, height, etc) = 1092 kB.
  • 1092 / 1710 = ~0.64. As mentioned up top, the compressed file is reduced to 64% of the original file size.
-----------------------------------------------------------------------------------------------------------------
That's it


I’m slightly proud of what I’ve done, and in less than a month!
  • created a library for Huffman coding from scratch
  • created a library for bit packing – both for known length items and for variable length items to be recognised against bit string patterns, from scratch
  • created an image compression/ decompression program using these tools that, from what I can see, is consistently better than the .png format that with the zlib library it uses have been around for decades?
If you are also proud of me, you’re welcome to drop a coin (or a note) in the Donation box. I live in Brisbane, Australia, and am a former, since made redundant, data analyst, looking after my daughter myself, and currently productively fulfilling the role of ‘unemployed dude’ hoping to make a living selling the Android-based future-soon smash “Kow’s Balls” (that may or may not be renamed to “Zaf’s Asteroids”).

Zaf.


No comments:

Post a Comment