Overview
Under point processing, a single input sample is processed to produce a single output sample. In regional processing, the value of the output sample depends on the values of samples in close proximity to the input sample. Exactly how the region is selected and how samples within the region affect the output depends on the desired effect. Convolution and correlation are two important regional operations.
Convolution
Convolution establishes a strong connection between the spatial and frequency domains and can be used to achieve a wide variety of effects. Convolution employs a rectangular grid of coefficients, known as a kernel.
- The kernel determines which elements of an image are in the neighborhood of a particular pixel.
- The kernel dictates how these neighborhood elements affect the result.
The kernel is a set of weights applied to the corresponding input samples; the weighted samples are summed to produce the output sample. The "key element" of the kernel corresponds to the sample that is being processed.
Definition
Given a WxH grayscale image I and an MxN kernel K such that M and N are odd, the convolution of I with K is

$$(I \otimes K)(x, y) = \sum_{j=-\lfloor M/2 \rfloor}^{\lfloor M/2 \rfloor} \; \sum_{k=-\lfloor N/2 \rfloor}^{\lfloor N/2 \rfloor} K(j, k) \, I(x - j,\; y - k)$$

where x is in [0, W-1], y is in [0, H-1], and the kernel is indexed so that K(0, 0) is its center (key) element. Note that the (x-j) term can be understood as a reflection of the kernel about its central vertical axis, and the (y-k) term as a reflection of the kernel about its central horizontal axis. The kernel weights are multiplied by the corresponding image samples and then summed together.
Convolution
The value of a pixel is determined by computing a weighted sum of nearby pixels. Given a "kernel" of weights to be centered on the pixel of interest, compute the destination value of the center pixel by "overlaying" the kernel on the source samples and computing the weighted sum.

Kernel:
-1  0  1
-2  0  2
-1  0  1

[Figure: grid of source samples with the kernel overlaid on the pixel of interest]
Convolution
The formula translates easily into Java code.
- Assume that I is a BufferedImage.
- Assume that K is an MxN array of floats.
- Assume that (x, y) is given as a Location object.
- The summations become 'for' loops.
How many loops are there? Four: for every x, for every y, for every j, for every k. The nested loop structure is hidden by our RasterScanner iterator.
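The four loops can be sketched as follows. This is a minimal illustration rather than the course's own code: it uses plain nested loops in place of the RasterScanner and Location classes, assumes a single-band 8-bit image, stores K(j, k) in a hypothetical `kernel[k + N/2][j + M/2]` layout, and returns raw float sums so that no transfer-type decision is needed. Samples whose neighborhood leaves the image are simply left at zero.

```java
import java.awt.image.BufferedImage;
import java.awt.image.Raster;

class BruteForceConvolution {

    /**
     * Convolves band 0 of src with kernel and returns the raw float sums.
     * kernel[k + N/2][j + M/2] holds K(j, k); both dimensions are assumed odd.
     * Output samples whose neighborhood leaves the image stay at zero.
     */
    static float[][] convolve(BufferedImage src, float[][] kernel) {
        int w = src.getWidth(), h = src.getHeight();
        int halfN = kernel.length / 2;      // vertical reach of the kernel
        int halfM = kernel[0].length / 2;   // horizontal reach of the kernel
        Raster in = src.getRaster();
        float[][] out = new float[h][w];

        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (x < halfM || x >= w - halfM || y < halfN || y >= h - halfN) {
                    continue; // kernel falls off the image: leave the output at 0
                }
                float sum = 0f;
                for (int k = -halfN; k <= halfN; k++) {
                    for (int j = -halfM; j <= halfM; j++) {
                        // Reading I(x - j, y - k) reflects the kernel about both axes.
                        sum += kernel[k + halfN][j + halfM] * in.getSample(x - j, y - k, 0);
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A 3x3 horizontal ramp: column 0 is 10, column 1 is 20, column 2 is 30.
        BufferedImage img = new BufferedImage(3, 3, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < 3; y++) {
            for (int x = 0; x < 3; x++) {
                img.getRaster().setSample(x, y, 0, 10 * (x + 1));
            }
        }
        float[][] k = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
        System.out.println(convolve(img, k)[1][1]); // prints -80.0
    }
}
```

The result is negative even though the ramp increases to the right, because the reflection pairs the kernel's +1 column with the darker left-hand samples. Without the reflection (correlation) the same data would give +80.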
Convolution: The Edge Problem
When the kernel is centered on a sample near the image boundary, part of the kernel falls outside the source image, where no samples exist. Solutions are to
- redefine convolution at the edge boundary, or
- eliminate edges from the source image by making the source image infinitely large.
Specifically:
- Convolution is redefined to produce zero when the kernel falls off the boundary. If the kernel extends beyond the source image when centered on a sample I(x, y), the output sample is set to zero.
- Convolution is redefined to produce I(x, y) when the kernel falls off the boundary. If the kernel extends beyond the source image when centered on a sample I(x, y), the output sample is defined as I(x, y).
- Extend the source image with a color border. The source image is infinitely extended in all directions by adding zero-valued samples so that samples exist off the edge of the source. This is often referred to as zero padding. Since the padded image is infinite in extent, convolution is well defined at all points within the bounds of the unpadded source.
- Extend the source image by circular indexing. The source image is infinitely extended in all directions by tiling the source image so that samples exist off the edge of the source.
- Extend the source image by reflective indexing. The source image is infinitely extended in all directions by mirroring the source image so that samples exist off the edge of the source. This mirroring is achieved through the use of reflective indexing.
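The two indexing schemes reduce to small pieces of integer arithmetic on one coordinate at a time. The sketch below is illustrative only. The class and method names are hypothetical, and the reflective variant shown repeats the edge sample when mirroring (index -1 maps to 0), which is one common convention and an assumption here.

```java
class EdgeIndexing {

    /** Circular indexing: the image is tiled, so index n wraps back to 0. */
    static int circular(int i, int n) {
        return Math.floorMod(i, n);
    }

    /** Reflective indexing: the image is mirrored at each edge (edge sample repeated). */
    static int reflective(int i, int n) {
        int m = Math.floorMod(i, 2 * n);   // position within one forward-plus-mirrored period
        return m < n ? m : 2 * n - 1 - m;  // second half of the period runs backwards
    }

    public static void main(String[] args) {
        int n = 4; // a row of four samples with valid indices 0..3
        for (int i = -3; i <= 6; i++) {
            System.out.println(i + " -> circular " + circular(i, n)
                    + ", reflective " + reflective(i, n));
        }
    }
}
```

`Math.floorMod` is used instead of `%` because `%` yields negative results for negative indices in Java.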
ImagePadder
Of course, we don't create a BufferedImage of infinite size. Instead, we intercept calls to the 'getSample' method of the image's Raster and redirect the indices appropriately back onto the source. The ImagePadder interface extends an image via its own version of getSample. Implementations extend the image differently:
- zero padding
- circular indexing
- reflected indexing
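A possible shape for such an interface is sketched below. The `getSample(src, col, row, band)` signature and the implementation class names are assumptions made for illustration, since the slide names only the ImagePadder interface and its getSample method. Two of the three strategies are shown; a reflective padder would follow the same pattern with mirrored indices.

```java
import java.awt.image.BufferedImage;

class PadderSketch {

    /** Assumed interface shape: a getSample that accepts any (col, row), in bounds or not. */
    interface ImagePadder {
        int getSample(BufferedImage src, int col, int row, int band);
    }

    /** Zero padding: every location outside the image reads as 0. */
    static class ZeroPadder implements ImagePadder {
        public int getSample(BufferedImage src, int col, int row, int band) {
            if (col < 0 || col >= src.getWidth() || row < 0 || row >= src.getHeight()) {
                return 0;
            }
            return src.getRaster().getSample(col, row, band);
        }
    }

    /** Circular indexing: out-of-range indices wrap around, as if the image were tiled. */
    static class TiledPadder implements ImagePadder {
        public int getSample(BufferedImage src, int col, int row, int band) {
            int c = Math.floorMod(col, src.getWidth());
            int r = Math.floorMod(row, src.getHeight());
            return src.getRaster().getSample(c, r, band);
        }
    }
}
```

A convolution routine written against `ImagePadder` never needs a bounds check of its own: it asks the padder for I(x - j, y - k) and the padder decides what lies beyond the edge.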
Design Observation on the Padders
Note that each of the padders is a stateless object with a single method. Also note that the source image is not modified by getSample, so the padders are thread safe. This suggests use of the singleton pattern: only one of each type of padder should ever be constructed. The code can be modified to achieve this.
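One way to apply the singleton pattern to a padder is sketched below. The interface signature and the `getInstance` accessor are assumptions for illustration. The essential parts are the private constructor and the single shared instance, which is safe to share precisely because the padder holds no state.

```java
import java.awt.image.BufferedImage;

class SingletonPadderSketch {

    /** Assumed interface shape, repeated here so the sketch is self-contained. */
    interface ImagePadder {
        int getSample(BufferedImage src, int col, int row, int band);
    }

    /** Stateless zero padder of which exactly one instance is ever created. */
    static final class ZeroPadder implements ImagePadder {
        private static final ZeroPadder INSTANCE = new ZeroPadder();

        private ZeroPadder() { }  // prevents construction from outside the class

        static ZeroPadder getInstance() {
            return INSTANCE;
        }

        public int getSample(BufferedImage src, int col, int row, int band) {
            if (col < 0 || col >= src.getWidth() || row < 0 || row >= src.getHeight()) {
                return 0;
            }
            return src.getRaster().getSample(col, row, band);
        }
    }
}
```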
Convolution: The Transfer Type Problem
The second issue relating to convolution deals with the transfer type and color depth of the resulting image.
- Kernel coefficients are real values.
- There are no limits on the values of the weights.
- Hence, both the type and color depth of the output image are determined by the kernel and may be different from those of the source.
Example: an 8-bit grayscale source image is convolved with a 3x3 kernel where all coefficients have a value of 1/4 (0.25).
- Each output sample will be real valued, since multiplication of an integer (the 8-bit input sample) and a real (the kernel coefficient) produces a real (the output sample).
- Each output sample will fall within the range [0, 573.75], since when the kernel is convolved with an all-white region of the source, the output is 9 x 255 / 4 = 573.75.
- The transfer type of the output image is float and the color depth is on the order of 10 bits.
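The range in the example can be computed for any kernel by summing the negative and positive coefficients separately and scaling each sum by the largest sample value. The sketch below is an illustration with hypothetical names, and its bit count ignores the sign bit a kernel with negative coefficients would also need.

```java
class KernelRange {

    /** Returns {min, max} of the possible output when every sample lies in [0, maxSample]. */
    static float[] outputRange(float[] weights, int maxSample) {
        float negative = 0f, positive = 0f;
        for (float w : weights) {
            if (w < 0) {
                negative += w;   // worst case: white under every negative weight
            } else {
                positive += w;   // best case: white under every positive weight
            }
        }
        return new float[] { negative * maxSample, positive * maxSample };
    }

    /** Bits needed to represent the largest magnitude in the range (sign bit not counted). */
    static int bitsNeeded(float[] range) {
        int largest = (int) Math.ceil(Math.max(Math.abs(range[0]), Math.abs(range[1])));
        return 32 - Integer.numberOfLeadingZeros(largest);
    }

    public static void main(String[] args) {
        float[] quarter = new float[9];
        java.util.Arrays.fill(quarter, 0.25f);          // the 3x3 kernel from the example
        float[] range = outputRange(quarter, 255);
        System.out.println(range[0] + " " + range[1]);  // prints 0.0 573.75
        System.out.println(bitsNeeded(range));          // prints 10
    }
}
```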
Convolution: The Transfer Type Problem
BufferedImage does not fully support floats as a transfer type: the float type is not reasonably supported throughout the rendering pipeline, so in practice sample values cannot be floats. Solutions:
- Rescale the output to the same transfer type as the source.
- Rescale the kernel before convolution so that the output stays within the range of the source's transfer type. This can only be done for kernels that contain only non-negative coefficients.
- Clamp the output to the source's transfer type.
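Convolution: The Transfer Type Problem
Rescale the kernel by making the coefficients sum to 1.
- The effect of the kernel is unchanged apart from a constant scale factor, and the output now stays within the range of the source.
- Because convolution is linear, rescaling the kernel is equivalent to rescaling the convolved image.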
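Two of the remedies, kernel rescaling and output clamping, can be sketched in a few lines. The helper names are hypothetical and the clamp assumes an 8-bit destination. The normalization rejects negative coefficients, matching the restriction that rescaling applies only to non-negative kernels.

```java
class KernelScaling {

    /** Returns a copy of the weights divided by their sum, so that the copy sums to 1. */
    static float[] normalize(float[] weights) {
        float sum = 0f;
        for (float w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("kernel has a negative coefficient");
            }
            sum += w;
        }
        if (sum == 0f) {
            throw new IllegalArgumentException("coefficients sum to zero");
        }
        float[] scaled = new float[weights.length];
        for (int i = 0; i < weights.length; i++) {
            scaled[i] = weights[i] / sum;
        }
        return scaled;
    }

    /** Rounds a real-valued result and clamps it into the 8-bit range [0, 255]. */
    static int clamp(float value) {
        return Math.max(0, Math.min(255, Math.round(value)));
    }
}
```

For the 3x3 kernel of 0.25 coefficients, the sum is 2.25, so each normalized weight is 1/9 and an all-white region now produces 255 rather than 573.75. Clamping the unnormalized result would also give 255, but every output above 255 would be flattened to the same value.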
Convolution Complexity
What is the computational complexity of the 'brute-force' approach to convolving a WxH image with an MxN kernel?
- Computing each destination sample requires M*N multiplications and additions.
- There are W*H samples.
- Convolution is therefore on the order of W*H*M*N operations.
This is slow, and large kernels are impractical.
Convolution: Analysis
Convolution is a computationally expensive operation when implemented in a straightforward fashion. Consider a WxH single-band image and an MxN kernel.
- Each sample must be convolved with the kernel, which requires MxN multiplications and MxN additions.
- The source image contains WxH samples.
- The total number of arithmetic operations required for convolution of an image is on the order of WxHxMxN.
The computational effort is therefore directly proportional to the size of the kernel and the size of the image. Large kernels should be used sparingly.
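To put numbers on this, consider a 640x480 single-band image. The image size and kernel sizes here are illustrative values, not figures from the slides. Counting one multiplication and one addition as a pair:

```latex
\begin{align*}
3 \times 3 \text{ kernel:} \quad & 640 \cdot 480 \cdot 3 \cdot 3 = 2{,}764{,}800 \text{ multiply-add pairs} \\
15 \times 15 \text{ kernel:} \quad & 640 \cdot 480 \cdot 15 \cdot 15 = 69{,}120{,}000 \text{ multiply-add pairs}
\end{align*}
```

Making the kernel five times wider and taller multiplies the work by 25.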
Convolution: Separability
Computational efficiency can be dramatically improved if the kernel is separable. An MxN kernel K is separable iff there exists an Mx1 column vector A and a 1xN row vector B such that K = AB. The benefit derives from realizing that convolution with K can be performed by convolving I with A and then convolving the result with B, which reduces the work per sample from on the order of M*N operations to on the order of M+N.
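The sketch below checks the claim on a small example. It is an illustration with hypothetical names, working on plain float arrays with zero padding rather than on a BufferedImage. The 3x3 kernel built from the vector [1, 2, 1] is an example chosen here, not one taken from the slides. Since the order of the two 1D passes does not affect the result, the row pass is applied first.

```java
import java.util.Arrays;

class SeparableConvolution {

    /** Builds the full kernel K = A B, where K[r][c] = col[r] * row[c]. */
    static float[][] outerProduct(float[] col, float[] row) {
        float[][] k = new float[col.length][row.length];
        for (int r = 0; r < col.length; r++) {
            for (int c = 0; c < row.length; c++) {
                k[r][c] = col[r] * row[c];
            }
        }
        return k;
    }

    /** Direct 2D convolution; samples outside the image are treated as zero. */
    static float[][] convolve(float[][] img, float[][] k) {
        int h = img.length, w = img[0].length;
        int halfRows = k.length / 2, halfCols = k[0].length / 2;
        float[][] out = new float[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                float sum = 0f;
                for (int r = -halfRows; r <= halfRows; r++) {
                    for (int c = -halfCols; c <= halfCols; c++) {
                        int yy = y - r, xx = x - c;   // reflected kernel offsets
                        if (yy >= 0 && yy < h && xx >= 0 && xx < w) {
                            sum += k[r + halfRows][c + halfCols] * img[yy][xx];
                        }
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }

    /** Two 1D passes: the row vector (1xN), then the column vector (Mx1). */
    static float[][] convolveSeparable(float[][] img, float[] col, float[] row) {
        float[][] rowKernel = { row };
        float[][] colKernel = new float[col.length][1];
        for (int r = 0; r < col.length; r++) {
            colKernel[r][0] = col[r];
        }
        return convolve(convolve(img, rowKernel), colKernel);
    }

    public static void main(String[] args) {
        float[][] img = { {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12} };
        float[] v = {1, 2, 1};
        float[][] direct = convolve(img, outerProduct(v, v));
        float[][] twoPass = convolveSeparable(img, v, v);
        System.out.println(Arrays.deepEquals(direct, twoPass)); // prints true
    }
}
```

For this 3x3 kernel the direct form uses 9 multiplications per sample and the two-pass form uses 6. The gap widens quickly as the kernel grows.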
Convolution in Java
Java has built-in classes to support convolution: the Kernel and ConvolveOp classes.
- The code is typically (at least on Windows boxes) implemented in native code (usually C).
- The code never takes advantage of separable kernels.
- The code clamps the destination image to 8 bits.
- At the boundary, the code either sets the destination edge pixels to zero (EDGE_ZERO_FILL) or copies the source edge pixels unchanged (EDGE_NO_OP). There are no other options.
- The Kernel is a raster-like object that makes a 1D array of floats into a 2D entity.
- ConvolveOp implements the BufferedImageOp interface.
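A minimal use of the built-in classes might look like the following. The `Kernel(width, height, data)` constructor, the `ConvolveOp(kernel, edgeCondition, hints)` constructor, the two edge constants, and `filter` are part of `java.awt.image`. The wrapper class, the `blur` helper, and the choice of weights are illustrative assumptions.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;

class BuiltInConvolution {

    /** Blurs src with a 3x3 kernel using the given ConvolveOp edge condition. */
    static BufferedImage blur(BufferedImage src, int edgeCondition) {
        // Row-major weights in a flat array; they sum to 1 so overall brightness is kept.
        float[] weights = {
            1 / 16f, 2 / 16f, 1 / 16f,
            2 / 16f, 4 / 16f, 2 / 16f,
            1 / 16f, 2 / 16f, 1 / 16f
        };
        Kernel kernel = new Kernel(3, 3, weights);
        ConvolveOp op = new ConvolveOp(kernel, edgeCondition, null);
        return op.filter(src, null);  // null destination: a compatible image is created
    }

    public static void main(String[] args) {
        // A uniform gray 5x5 image makes the edge handling easy to see.
        BufferedImage src = new BufferedImage(5, 5, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < 5; y++) {
            for (int x = 0; x < 5; x++) {
                src.getRaster().setSample(x, y, 0, 100);
            }
        }
        BufferedImage zeroFilled = blur(src, ConvolveOp.EDGE_ZERO_FILL);
        BufferedImage copied = blur(src, ConvolveOp.EDGE_NO_OP);
        System.out.println("corner, EDGE_ZERO_FILL: " + zeroFilled.getRaster().getSample(0, 0, 0));
        System.out.println("corner, EDGE_NO_OP:     " + copied.getRaster().getSample(0, 0, 0));
        System.out.println("interior:               " + copied.getRaster().getSample(2, 2, 0));
    }
}
```

According to the documented edge conditions, the corner sample should read 0 under EDGE_ZERO_FILL and keep its source value under EDGE_NO_OP, while interior samples of a uniform image stay essentially unchanged by a kernel that sums to 1.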
Convolution: Custom Solution
Let's create a set of classes that perform convolution, are flexible in boundary handling, and leverage separability whenever possible.
- Design our own Kernel2D class. Use an abstract base class for managing dimensions, and create concrete subclasses: one that is used for separable kernels and one that is used for non-separable kernels.
- Write a ConvolutionOp that uses a Kernel2D to support separability and an ImagePadder to address the edge handling problem.
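The kernel hierarchy could be sketched as follows. Only the name Kernel2D comes from the slide. The subclass names, the method names, and the center-relative `getValue(j, k)` indexing are assumptions made for this illustration, and the ConvolutionOp itself is not shown.

```java
class KernelDesignSketch {

    /** Abstract base class: manages the kernel dimensions only. */
    static abstract class Kernel2D {
        private final int width, height;

        Kernel2D(int width, int height) {
            if (width % 2 == 0 || height % 2 == 0) {
                throw new IllegalArgumentException("kernel dimensions must be odd");
            }
            this.width = width;
            this.height = height;
        }

        int getWidth()  { return width; }
        int getHeight() { return height; }

        /** Weight at column offset j and row offset k, both measured from the center. */
        abstract float getValue(int j, int k);

        abstract boolean isSeparable();
    }

    /** Stores all width * height weights in row-major order. */
    static class NonSeparableKernel extends Kernel2D {
        private final float[] weights;

        NonSeparableKernel(int width, int height, float[] weights) {
            super(width, height);
            this.weights = weights.clone();
        }

        float getValue(int j, int k) {
            return weights[(k + getHeight() / 2) * getWidth() + (j + getWidth() / 2)];
        }

        boolean isSeparable() { return false; }
    }

    /** Stores only a column vector and a row vector; the full kernel is their product. */
    static class SeparableKernel extends Kernel2D {
        private final float[] col, row;

        SeparableKernel(float[] col, float[] row) {
            super(row.length, col.length);
            this.col = col.clone();
            this.row = row.clone();
        }

        float getValue(int j, int k) {
            return col[k + getHeight() / 2] * row[j + getWidth() / 2];
        }

        boolean isSeparable() { return true; }

        float[] getColVector() { return col.clone(); }
        float[] getRowVector() { return row.clone(); }
    }
}
```

With this split, a convolution operation can ask `isSeparable()` and run two 1D passes over the row and column vectors when the answer is true, falling back to the full M*N loop otherwise.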