Saturday, June 29, 2019

Interpolation for image resizing

I wanted my own image resizing method (specifically for image enlargement) for use in a project, so here we are.

Early on I wanted to learn how the Lanczos filter works, but I couldn’t find a readable explanation.  I still don’t know how the Lanczos filter works.  Given the close similarity of the results, I may have just re-invented the Lanczos filter (after a fashion), but I don’t know.

Here is an image, originally 500 x 500 pixels.



Below is a part of the image after resizing by a factor of 3 using Lanczos.




Below is the same part of the image using the method being described.
 


As is evident, my method is hardly distinguishable from Lanczos.  My result here is slightly lighter, or lower in contrast, but that can be adjusted with a parameter passed in during resizing.

My method employs two steps, though both steps are functional on their own.  These are linear interpolation and a special kind of contrast enhancement.

I’ll only detail the interpolation part in this post.  



Linear interpolation – core method


Given two values, A and B, and a proportional position prop (between 0 and 1) between them, the interpolated value is found as follows.

    Apart = A * (1.0 - prop)
    Bpart = B * prop
    interp_val = Apart + Bpart


When prop is large, the position is closer to B, so A’s value contributes proportionally less (hence the 1.0 - prop).

Or as a function (for later use):

float interp (float A, float B, float prop) {
    float Apart = A * (1.0 - prop);
    float Bpart = B * prop;
    return Apart + Bpart;
}


or just

float interp (float A, float B, float prop) {
    return (A * (1.0 - prop)) + (B * prop);
}


Linear interpolation of an image


The general method (to be improved on further down) is:

determine the inverse scaling factor

for each dest pixel position
    get decimal position in source (src)
    get integer part of decimal value (srci)
    get fractional part of decimal value (srcf)
    interpolate for values at srci and srci+1 with srcf as prop

Or more specifically (and without any bounds checking):

Where scale is a parameter passed in as a decimal multiple of the original image size, inp is the input image, and outp is the output image, presumed already correctly sized to accommodate the new data.

float inv_scale = 1.0 / scale;
float src, srcf;
int srci;

for (int x=0; x<outp.width(); x++) {
    src = x * inv_scale;
    srci = floor (src);
    srcf = src - srci;
    outp (x) = interp (inp (srci), inp (srci+1), srcf);
}
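
To make the mapping concrete, here is one iteration of the loop above traced by hand, assuming scale = 2.0 (so inv_scale = 0.5):

    // x = 5:  src  = 5 * 0.5 = 2.5
    //         srci = 2, srcf = 0.5
    //         outp (5) = interp (inp (2), inp (3), 0.5), i.e. halfway between pixels 2 and 3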

Where the pixel’s value lies


The previous method presumes that each pixel’s value sits on the pixel boundary.

It is however more accurate to interpret the discrete (pixel) data as a continuum, the value of a pixel then being in the abstract middle between the discrete pixel boundaries.


If each pixel’s value is in its middle then both the source and destination references must be to their respective middles.

So instead of
    src = x * inv_scale

this   
    src = (x + 0.5) * inv_scale


Further, if srcf is less than 0.5 then the position is between that pixel’s middle and the previous pixel’s middle, not between that pixel’s middle and the next pixel’s.  This is important.

And if that is the case then the value of the proportional position changes too.  If it was 0.2 (that is, 0.3 to the left of the middle of the pixel at srci) then the proportional position is 0.7 from the middle of the previous pixel.

So
    if (srcf < 0.5) {
        srci = srci - 1;
        srcf = 0.5 + srcf;
    }

However if srcf is not less than 0.5 then the proportion still needs to change, but differently.  So we also have:

    else {
        srcf = srcf - 0.5;
    }



Edge cases


If the source position is to the left of the middle of the first pixel then there is nothing to interpolate with, so use that pixel’s value as the dest value.  Similarly, if it is to the right of the middle of the last pixel then use that pixel’s value as the dest value.
Left edge

    if (srci == 0 and srcf < 0.5)
        dst_val = inp (srci);


Right edge

    if (srci == width-1 and srcf > 0.5)
        dst_val = inp (srci);

Complete(r) code


void resize (float scale, image & inp, image & outp) {
    float inv_scale = 1.0 / scale;
    float src, srcf;
    int srci;

    for (int x=0; x<outp.width(); x++) {
        src = (x + 0.5) * inv_scale;

        srci = floor (src);
        srcf = src - srci;

        // left edge: nothing further left to interpolate with
        if (srci == 0 and srcf < 0.5) {
            outp (x) = inp (srci);
            continue;
        }

        // right edge: nothing further right to interpolate with
        if (srci == inp.width()-1 and srcf >= 0.5) {
            outp (x) = inp (srci);
            continue;
        }

        if (srcf < 0.5) {    // between this pixel's middle and the previous one's
            srci = srci - 1;
            srcf = 0.5 + srcf;
        }
        else {               // between this pixel's middle and the next one's
            srcf = srcf - 0.5;
        }

        outp (x) = interp (inp (srci), inp (srci+1), srcf);
    }
}
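
For anyone wanting to try this without the image class (which is left abstract here), below is a minimal self-contained sketch of the same routine over plain float arrays.  The name resize_1d and the array-based interface are mine, for illustration only.

#include <cmath>
#include <cstdio>
#include <vector>

float interp (float A, float B, float prop) {
    return (A * (1.0 - prop)) + (B * prop);
}

// 1-D resize; out is presumed already sized to in.size() * scale
void resize_1d (float scale, const std::vector<float> & in, std::vector<float> & out) {
    float inv_scale = 1.0 / scale;
    int width = in.size ();

    for (int x = 0; x < (int) out.size (); x++) {
        float src  = (x + 0.5) * inv_scale;
        int   srci = (int) std::floor (src);
        float srcf = src - srci;

        // edges: nothing beyond the first/last pixel middle to interpolate with
        if ((srci == 0 and srcf < 0.5) or (srci == width-1 and srcf >= 0.5)) {
            out [x] = in [srci];
            continue;
        }
        if (srcf < 0.5) { srci = srci - 1; srcf = 0.5 + srcf; }
        else            { srcf = srcf - 0.5; }

        out [x] = interp (in [srci], in [srci+1], srcf);
    }
}

int main () {
    std::vector<float> in = { 0, 100, 0, 100 };
    std::vector<float> out (in.size () * 2);    // scale by 2
    resize_1d (2.0, in, out);
    for (float v : out) printf ("%g ", v);      // prints: 0 25 75 75 25 25 75 100
    printf ("\n");
}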

Blocking and cumulative scaling


Scaling by factors greater than two results in apparent (soft) blocks of colour.  This is because, generally, many dest pixels get their values from the same source pixels.

This can be addressed by cumulatively resizing an image, limiting each run to a maximum scaling factor of 2 (and a minimum of 1).

If scaling by 750%, or 7.5, then

    7.5 / 2 = 3.75      // scale by 2
    3.75 / 2 = 1.875    // scale by 2
    1.875               // scale by 1.875

Sanity check: 2 * 2 * 1.875 = 7.5

Generally:
    while 1
        if scale >= 2.0
            interp by 2.0
            scale = scale / 2.0
        else if scale > 1.0
            interp by scale
            break
        else
            break    // scale has hit exactly 1.0 (e.g. started as a power of 2)

void cum_resize (float scale, image & inp, image & outp) {
    while (1) {
        if (scale >= 2.0) {
            resize (2.0, inp, outp);
            inp = outp;             // output of this pass feeds the next
            scale = scale / 2.0;
        }
        else if (scale > 1.0) {
            resize (scale, inp, outp);
            break;
        }
        else {                      // scale has hit exactly 1.0
            outp = inp;
            break;
        }
    }
}
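
As a standalone check of the decomposition (just printing the factor each pass would use):

#include <cstdio>

int main () {
    float scale = 7.5;
    while (scale > 1.0) {
        float f = (scale >= 2.0) ? 2.0 : scale;   // cap each pass at 2
        printf ("%g\n", f);                       // prints 2, 2, 1.875
        scale = scale / f;
    }
}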

Resizing down


The method generally works for image reductions too, so long as the scale is > 0.5.  If the factor is <= 0.5 then some source pixels never contribute to any dest pixel.

Note however that there is a better way to resize down.  (The Lanczos filter produces ‘nicer’ down-sized images than this method.)  And that better way may be the topic of another post.

In the meantime, cum_resize () needs to be extended to accommodate cumulative reductions for scaling factors <= 0.5:

    if (scale < 1) {
        while (1) {
            if (scale <= 0.5) {
                resize (0.5, inp, outp);
                inp = outp;
                scale = scale / 0.5;
            }
            else if (scale < 1.0) {
                resize (scale, inp, outp);
                break;
            }
            else {                  // scale has hit exactly 1.0
                outp = inp;
                break;
            }
        }
    }
    else if (scale > 1) {
        while (1) {
            if (scale >= 2.0) {...}
            // ... as in cum_resize () above
        }
    }
    else {  // scale == 1, so nothing to do but pass inp through as outp
        outp = inp;
    }
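
For example, scale = 0.2 resolves to passes of 0.5, 0.5 and then 0.8 (sanity check: 0.5 * 0.5 * 0.8 = 0.2).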

Images are two dimensional


Indeed they are.  The solution is to recognise that scaling is separable, meaning that scaling by x and y separately is fine (not just fine but correct).  My personal method is to cumulatively scale by xscale, rotate (transpose) the image, cumulatively scale by yscale, rotate the image back.

Generally:

    cum_resize (xscale)
    rotate left

    cum_resize (yscale)
    rotate right
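
Or, put together as a sketch - the rotate_left / rotate_right (transpose) helpers are assumed to exist, and the names here are mine:

void resize_2d (float xscale, float yscale, image & inp, image & outp) {
    image tmp;
    cum_resize (xscale, inp, tmp);     // scale along x
    rotate_left (tmp);                 // rows become columns
    cum_resize (yscale, tmp, outp);    // scale along (the original) y
    rotate_right (outp);               // restore the original orientation
}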



Conclusion


That's all.

Saturday, June 22, 2019

Static Ordering (A Better Ordered Container?)

[For the record, the development here described was done through October 2018.]

How to insert data into a container such that the values are maintained in order is a classic problem in computer science.

A standard method is to store values as nodes of a tree.  The tree is usually adjusted on an as-needed basis to maintain some structural rules.  Common C++ standard library implementations use Red-Black Trees for the set and map containers.

My thinking on this is that there could be a way to put values in order that relies on the implicit structure of each value.  Say, given value 245, could each of 2, 4 and 5 be used as nodes on a tree?

I have built such a contraption. 

The basic concept is to link numerals together.  So 23 and 24 would share a common 2 node that joins to respective 3 and 4 nodes.  Each node has as many potential child nodes as the numeric base of the tree (a parameter that can be set at tree creation or hard-coded).
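
For illustration, 23 and 24 sharing their 2 node (base 10):

    2
    ├── 3    (23 ends here)
    └── 4    (24 ends here)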

Insertion durations for the same random numbers are given below (compared to a C++ set).  Lower is better.




Why is this faster than an RB tree for insertion?


Because comparisons are not made between values (as they are in C++ sets), and no nodes are ever re-linked to maintain structural rules (as also happens in C++ sets).  In SO trees the initial placement is perpetually optimal.
 

Some history


I initially converted a value to a string then got elements by accessing each character and converting that to a number.  Easy but slow.

I then dropped the string conversion.  Instead, the initial value was converted to an alternate base, storing each numeral along the way.  Base 2 was found to be competitive with, but not better than, a C++ set.



Current Version


Converting bases requires some division, which is slow.  The current version uses the value’s raw binary chunks directly.  The optimal span was found to be 4 bits (base 16).
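
A minimal sketch of that layout (illustrative only - the real code differs, and details like memory management are glossed over): a 32-bit value is walked in eight 4-bit chunks, most-significant first, so every stored value sits at a fixed depth of 8.

#include <cstdint>

struct Node {
    Node * child [16] = {};    // one potential child per 4-bit numeral
};

void insert (Node * root, uint32_t value) {
    Node * cur = root;
    for (int shift = 28; shift >= 0; shift -= 4) {    // most-significant chunk first
        unsigned d = (value >> shift) & 0xF;
        if (!cur->child [d])
            cur->child [d] = new Node;
        cur = cur->child [d];
    }
    // every value bottoms out at depth 8, so the existence of a
    // depth-8 node is itself the mark that the value is present
}

bool contains (Node * root, uint32_t value) {
    Node * cur = root;
    for (int shift = 28; shift >= 0; shift -= 4) {
        cur = cur->child [(value >> shift) & 0xF];
        if (!cur)
            return false;
    }
    return true;
}

Note that insert () makes no value-to-value comparisons and never re-links a node, and that visiting the 16 child slots of each node in order yields the stored values in ascending order.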
 

Iterating through container


RB trees are maintained such that adjacent nodes have the next smaller or next larger value.  In SO trees, adjacent nodes are not necessarily the next smaller or larger valued nodes; the next value (smaller or larger) must be discovered by traversing the tree from the reference node.  This requires a method to manage the traversal, and also a method to manage the edges of the tree (so that traversal can continue onto the opposite side, lower or higher, if need be).

I’ve tried describing how this is implemented, but it ends up not much clearer than the source code for it, which could be described, at the moment, as some variation on the theme of grotesque.  So I’ve spared you.  You’re welcome.

I haven’t timed traversing SO trees.  It’s probably really, really bad.  But I will likely do this at some point and post a follow-up.



Data types supported


Only implemented for 32-bit integers.  The thing can however be trivially extended to chars, shorts, long ints, etc., or anything that can be represented by ints, like, say, pointers.

Floating point value storage has been thought about.  Just so you know.



Finally


Given that my no-doubt amateurish implementation does insertions at nearly double the speed of the industry-standard library, there’s probably some potential in this model.