Figure 1: One cycle of the DWT, each 2˟2 quadruple of pixels, Pjk, in the input maps to four pixels, one in each quadrant of the analysis, each placed in the same relative location as the quadruple in the original image. These image pixels report on conditions around the neighborhood of the quadruple. Northwest: average intensity, northeast, gradients along the width, southwest, gradients along the height, southeast: gradients along the diagonal. The formulas are for image pixels from the quadrupal of pixels clustered around pixel Pjk.
Similar to Fourier analysis and synthesis (-fft, ifft), a wavelet analysis transforms an image into another representation that reflects its frequency, or spectral, content. Unlike Fourier analysis, the wavelet analysis retains spatial relationships. One may view a DWT analysis and recognize scaled versions of the original image, locate versions at a particular scale, and alter them, When reconstituted by the synthesizer, -ihaar, these will reflect the changes applied only to the selected levels of detail. Thus, as in Fourier analysis, one can undertake frequency-based editing, but – unlike the Fourier transform – may readily locate particular places in the image to work an effect.
Unlike Fourier synthesis, there are myriad basis functions which can perform wavelet analysis and synthesis, these tailored to meet particular requirements. G'MIC uses Haar wavelets, the most compact and simplest to implement, and the oldest, first described by Alfréd Haar in 1909.
The -haar command progressively reduces an original image into sets of three axially oriented differences and one base image at each scale, in aggregate forming an image pyramid. At each cycle, or scaling, the command operates on an intermediary from the previous cycle, or, when at the start, the original. Each cycle produces an analysis pertinent to image detail at that scale. The image pyramid contains all possible analyses stacked one on top of the other.
For clarity, the following narrative describes the Discrete Wavelet Transform in two dimensions, though the extension to the third is straightforward. One reads “octant” for “quadrant” which has one base cube and seven difference regions, instead of three for the two dimensional quadrant. G'MIC's implementation will analyze along width, height and depth dimensions when presented with a multi-slice original.
During one cycle, or scaling of the Discrete Wavelet Transform, the analysis for that scale takes the form of four reduced and modified versions of the input image. Scaled at 50% of the input by length, these modified versions occupy the same buffer as the original, with each situated in one of four quadrants, and, following a scaling, each conveying information about average intensities (the base image) or axially oriented gradients (the three difference images) that might be present at the particular scale.
Formally, one cannot further scale an image if, at some intermediary scaling, one or two dimensions becomes indivisible by two, but in practice, one may pad the width and height of originals with zero-valued pixels as necessary to obtain images with power of two dimensions. See -resize_pow2.
Each quadrant conveys a class of information about the input, its constituent pixels reporting the same class of information from 2 ? 2 pixel neighborhoods situated in the same relative position in the input image. During an analysis, these neighborhoods are "dissected" into four single pixels, each migrating to each of the quadrants.
G'MIC implements wavelet analysis through a compact and efficient procedure which mainly entails differencing shifted images, image addition, subtraction and multiplication. As before, we will mainly concentrate on the two dimensional narrative; the three dimensional case is a straightforward extension of concepts present in the two dimensional case.
The accompanying G'MIC code has been adapted from the implementation of the -haar command:
1The command first separates the input image along its width into even and odd column images. The 'even' image possesses columns 0, 2, 4,...; the 'odd' image possesses columns 1, 3, 5,... |
|
gmic hface.cimg \ --shift -1 \ -resize 50% # Other dimensions default to 100%; NB: Nearest neighbor \ # interpolation (default) decimates every other row \ # at precisely 50% reductions, leaving even rows on the \ # left and odd rows on the right. We are guaranteed this \ # separation by the one pixel leftward shift of the image on the \ # right. With the every-other-row deletion arising from the \ # nearest neighbor interpolation, the columns retained in the \ # shifted version are just those deleted in the unshifted \ # version, and vice-versa. \ |
|
2Makes a second image, diff_col, by subtracting the even columns from the odd (odd – even) |
|
-sub[1] [0] \ | |
3Makes a third image, sum_col, by adding the odd and even columns (odd + even).Appends together sum_col and diff_col along the x axis and scales the pair by . |
|
-add[0,1] \ -div '{sqrt(2)}' \ -append x \ |
The left half of the image (sum_col) consists of pairwise column sums, the right half (diff_col) pairwise differences, the whole scaled by the square root of two.
Though not obvious in the math or script, this stage reflects a decimation and two convolutions of the initial data set. In effect, the left hand side of the image evolves from a two pixel wide box filter – the Haar scaling function – convolving a decimated version of the input data. Correspondingly, the right hand side of the image evolves from the Haar wavelet convolving over the same data set. The Haar wavelet is zero everywhere except for a one unit positive impulse followed by a one unit negative impulse – a "square wave".
4The command repeats steps 1 through 3, but by separating sum_col diff_col into even and odd rows and finding sums and differences of these rows along the height axis, again scaling the results by .Twice scaled by this quantity, the overall dataset has now been divided by 2, and represent pairwise averages of sums and/or differences characteristic of each quadrant. |
|
# Computations along the height axis \ # mirror those along the width. \ --shift 0,-1 \ -resize 100%,50% \ --sub[1] [0] \ -add[0,1] \ -div '{sqrt(2)}' \ -append y \ |
|
5At the conclusion of one scaling step, every 2 × 2 quadruple of pixels from the input image recurs as a single pixel in each quadrant of the analysis, four single pixel “images” of the quadruple in all. See the beginning of this article for a larger version of this diagram. |
|
— |
A pixel, Pj,k in row j and column k, and its immediate neighbors, Pj,k+1 ,Pj+1,k, and Pj+1,k+1 “merge together” in four different ways after one DWT scaling, producing four new pixels. These “migrate” to each of the quadrants of the analysis.
Each arise from a particular combination of pairwise sums and differences that characterize the quadrant. These combinations are: