Wavelet Analysis and Synthesis in G'MIC

Figure 1: One cycle of the DWT, each 2˟2 quadruple of pixels, Pjk, in the input maps to four pixels, one in each quadrant of the analysis, each placed in the same relative location as the quadruple in the original image. These image pixels report on conditions around the neighborhood of the quadruple. Northwest: average intensity, northeast, gradients along the width, southwest, gradients along the height, southeast: gradients along the diagonal. The formulas are for image pixels from the quadrupal of pixels clustered around pixel Pjk.

Similar to Fourier analysis and synthesis (-fft, ifft), a wavelet analysis transforms an image into another representation that reflects its frequency, or spectral, content. Unlike Fourier analysis, the wavelet analysis retains spatial relationships. One may view a DWT analysis and recognize scaled versions of the original image, locate versions at a particular scale, and alter them, When reconstituted by the synthesizer, -ihaar, these will reflect the changes applied only to the selected levels of detail. Thus, as in Fourier analysis, one can undertake frequency-based editing, but – unlike the Fourier transform – may readily locate particular places in the image to work an effect.

Unlike Fourier synthesis, there are myriad basis functions which can perform wavelet analysis and synthesis, these tailored to meet particular requirements. G'MIC uses Haar wavelets, the most compact and simplest to implement, and the oldest, first described by Alfréd Haar in 1909.

The -haar command progressively reduces an original image into sets of three axially oriented differences and one base image at each scale, in aggregate forming an image pyramid. At each cycle, or scaling, the command operates on an intermediary from the previous cycle, or, when at the start, the original. Each cycle produces an analysis pertinent to image detail at that scale. The image pyramid contains all possible analyses stacked one on top of the other.

Transformation Cycles

For clarity, the following narrative describes the Discrete Wavelet Transform in two dimensions, though the extension to the third is straightforward. One reads “octant” for “quadrant” which has one base cube and seven difference regions, instead of three for the two dimensional quadrant. G'MIC's implementation will analyze along width, height and depth dimensions when presented with a multi-slice original.

During one cycle, or scaling of the Discrete Wavelet Transform, the analysis for that scale takes the form of four reduced and modified versions of the input image. Scaled at 50% of the input by length, these modified versions occupy the same buffer as the original, with each situated in one of four quadrants, and, following a scaling, each conveying information about average intensities (the base image) or axially oriented gradients (the three difference images) that might be present at the particular scale.

  1. The northwest quadrant is the base image and conveys information about average intensities observed in the input image. However, as the DWT progresses, the northwest quadrant becomes the input for the next cycle, so that at every scale except the last, the northwest quadrant is replaced by images of higher scalings. This gives the impression of objects receding off to infinity, the signature aspect of a set of DWT analyses. The transform progresses until the northwest quadrant acquires an odd dimension, has been reduced to one pixel, or has reached a preset limit. The base image at this highest scale then becomes the starting point for re-synthesis of the original by the inverse Haar command (-ihaar)
  2. The northeast quadrant conveys rates of intensity change observed across columns in the input image.
  3. The southwest quadrant conveys rates of intensity change observed across rows in the input image.
  4. The southeast quadrant conveys rates of intensity change observed across diagonals in the input image.

Formally, one cannot further scale an image if, at some intermediary scaling, one or two dimensions becomes indivisible by two, but in practice, one may pad the width and height of originals with zero-valued pixels as necessary to obtain images with power of two dimensions. See -resize_pow2.

Computations

Each quadrant conveys a class of information about the input, its constituent pixels reporting the same class of information from 2 ? 2 pixel neighborhoods situated in the same relative position in the input image. During an analysis, these neighborhoods are "dissected" into four single pixels, each migrating to each of the quadrants.

  1. The first pixel, destined for the northwest quadrant, conveys the sum of the four pixels composing the neighborhood, divided by two. This quantity indicates the average intensity of the neighborhood.
  2. The second pixel, destined for the northeast quadrant, conveys the sum of differences of the left pixel column, minus the right pixel column, divided by two. This difference indicates the rate of change of intensity or gradient across image columns within the neighborhood.
  3. The third pixel, destined for the southwest quadrant, conveys the difference between the sums of the bottom row or pixels, minus the sums of the top row of pixels, divided by two. This difference indicates the rate of change of intensity or gradient across the image rows within the neighborhood.
  4. The fourth pixel, destined for the southeast quadrant, conveys the difference of the bottom right pixel, minus the bottom left, and the top right pixel, minus the top left, divided by two. This "difference of differences" indicates the diagonal gradient across the image rows and columns within the neighborhood.

Walking Through The G'MIC Implementation

G'MIC implements wavelet analysis through a compact and efficient procedure which mainly entails differencing shifted images, image addition, subtraction and multiplication. As before, we will mainly concentrate on the two dimensional narrative; the three dimensional case is a straightforward extension of concepts present in the two dimensional case.

The accompanying G'MIC code has been adapted from the implementation of the -haar command:

1

The command first separates the input image along its width into even and odd column images. The 'even' image possesses columns 0, 2, 4,...; the 'odd' image possesses columns 1, 3, 5,...
 gmic hface.cimg \
--shift -1      \
-resize 50%   # Other dimensions default to 100%; NB: Nearest neighbor         \
              # interpolation (default) decimates every other row              \
              # at precisely 50% reductions, leaving even rows on the          \
              # left and odd rows on the right. We are guaranteed this         \
              # separation by the one pixel leftward shift of the image on the \
              # right. With the every-other-row deletion arising from the      \
              # nearest neighbor interpolation, the columns retained in the    \
              # shifted version are just those deleted in the unshifted        \
              # version, and vice-versa.                                       \

2

Makes a second image, diff_col, by subtracting the even columns from the odd (oddeven)
-sub[1] [0] \

3

Makes a third image, sum_col, by adding the odd and even columns (odd + even).
Appends together sum_col and diff_col along the x axis and scales the pair by 1 2 1 over sqrt 2 .
-add[0,1]        \
-div '{sqrt(2)}' \
-append x        \

The left half of the image (sum_col) consists of pairwise column sums, the right half (diff_col) pairwise differences, the whole scaled by the square root of two.

Though not obvious in the math or script, this stage reflects a decimation and two convolutions of the initial data set. In effect, the left hand side of the image evolves from a two pixel wide box filter – the Haar scaling function – convolving a decimated version of the input data. Correspondingly, the right hand side of the image evolves from the Haar wavelet convolving over the same data set. The Haar wavelet is zero everywhere except for a one unit positive impulse followed by a one unit negative impulse – a "square wave".

4

The command repeats steps 1 through 3, but by separating sum_col diff_col into even and odd rows and finding sums and differences of these rows along the height axis, again scaling the results by 1 2 1 over sqrt 2 .
Twice scaled by this quantity, the overall dataset has now been divided by 2, and represent pairwise averages of sums and/or differences characteristic of each quadrant.
# Computations along the height axis \
# mirror those along the width.      \
--shift 0,-1     \
-resize 100%,50% \
--sub[1] [0]     \
-add[0,1]        \
-div '{sqrt(2)}' \
-append y        \

5

At the conclusion of one scaling step, every 2 × 2 quadruple of pixels from the input image recurs as a single pixel in each quadrant of the analysis, four single pixel “images” of the quadruple in all. See the beginning of this article for a larger version of this diagram.

 — 

A pixel, Pj,k in row j and column k, and its immediate neighbors, Pj,k+1 ,Pj+1,k, and Pj+1,k+1 “merge together” in four different ways after one DWT scaling, producing four new pixels. These “migrate” to each of the quadrants of the analysis.

Each arise from a particular combination of pairwise sums and differences that characterize the quadrant. These combinations are:

  1. ( P j + 1, k + 1 + P j + 1, k ) + ( P j , k + 1 + P j , k ) 2 {(P_{j+1,k+1} + P_{j+1,k}) + (P_{j,k+1} + P_{j,k})} over 2 in the northwest quadrant. Pixels here are all averages of row sums of column sums. This specific quadrant becomes the initial image in the next round of DWT scaling. Put another way, these pixels all arise from box filter convolutions running width- and height-wise across the original input image.
  2. ( P j + 1, k + 1 − P j + 1, k ) + ( P j , k + 1 − P j , k ) 2 {(P_{j+1,k+1} - P_{j+1,k}) + (P_{j,k+1} - P_{j,k})} over 2 in the northeast quadrant. Pixels here are all averages of row sums of column differences. Another way to put it is that these pixels all arise from a width-wise Haar wavelet convolution and a height-wise box filter convolution.
  3. ( P j + 1, k + 1 + P j + 1, k ) − ( P j , k + 1 + P j , k ) 2 {(P_{j+1,k+1} + P_{j+1,k}) - (P_{j,k+1} + P_{j,k})} over 2 in the southwest quadrant. Pixels here are all averages of row differences of column sums. Put another way, they arise from a width-wise box filter convolution and a height-wise Haar wavelet convolution.
  4. ( P j + 1, k + 1 − P j + 1, k ) − ( P j , k + 1 − P j , k ) 2 {(P_{j+1,k+1} - P_{j+1,k}) - (P_{j,k+1} - P_{j,k})} over 2 in the southeast quadrant. Pixels here are all averages of row differences of column differences and arise from width- and height-wise convolutions of Haar wavelets.