Spitzer: MOPEX User's Guide

Spitzer Home > Data Analysis & Tools > Tools > MOPEX > MOPEX User's Guide

MOPEX User's Guide

8.2 Outlier Detection

Outlier rejection is central to the mosaicking process. MOPEX employs four algorithms to detect radiation hits and moving objects:

1. Single Frame Outlier Detection, implemented as the Detect Radhit module. It represents spatial filtering of input images. It requires the background-subtracted images created by MedFilter.

2. Multiframe Temporal Outlier Detection implemented by the Mosaic Outlier module. This algorithm is best suited for high-coverage data (i.e. many input frames per pixel).

3. Dual Outlier Detection: a more complicated algorithm that uses both spatial and temporal information. The implementation involves a series of modules: Detect Outlier, Mosaic Projection, Mosaic Dual Outlier, and Level. It requires the background-subtracted images created by MedFilter. It is best suited for low-to-medium coverage data.

4. Box Outlier Detection: a method designed to use both the temporal and spatial information like the Dual Outlier, but using the statistical analysis of the kind used by the Multiframe Temporal Outlier. It is best suited for low-to-medium coverage data.

The four detection results can be combined into one mask - the RMask - by module Mosaic RMask. Warning: running the modules for these outlier schemes does not mean that they are automatically used to create the RMask file. The default rejection scheme used to create the RMask is Single Frame Outlier Detection. In order to use the other schemes you must check the relevant boxes in the Mosaic RMask module (Use Outlier for Rmask; Use Dual Outlier for Rmask; Use Box Outlier for Rmask), and set the RMask Fatal Mask Bit Pattern in the Initial Setup to use the relevant bits (see §8.11: Fatal Mask Bit Patterns for more information). The equivalent on the command line is to set RMask_Fatal_Bit_Pattern and the following lines in the General Settings section of your namelist:

USE_OUTLIER_FOR_RMASK = 1 (use Multiframe Temporal Outlier Detection)

USE_DUAL_OUTLIER_FOR_RMASK = 1 (use Dual Outlier Detection)

USE_BOX_OUTLIER_FOR_RMASK = 1 (use Box Outlier Detection)

The RMask Fatal Mask Bit Pattern is set as a separate step to allow users to compare the different outlier rejection schemes without having to run all of the outlier algorithms several times. Instead of re-running all of the modules, users can run the whole flow once, with all of the outlier rejection schemes enabled to create the RMask. Once the RMask has been created, the outlier schemes can be compared by turning off all of the Mosaic modules except for Mosaic Reinterpolate, Mosaic Coadd and Mosaic Combine, and re-running Mosaic (mosaic.pl) with different RMask Fatal Mask Bit Pattern settings. You can also adjust the thresholds for some of the outlier schemes within RMask itself. Since the outlier rejection modules are among the most complex steps, this can save a lot of time, especially for large datasets.

The module Mosaic Coverage is run prior to Mosaic RMask, because the latter uses coverage thresholds to include the results of the temporal and dual outlier detection.

8.2.1 Single Frame Outlier Rejection

The single frame radhit detection represents spatial filtering of input images. It is implemented in the Detect Radhit module. It is a variant of image segmentation and is based on the idea that when a relatively low detection threshold is applied, for each bright point source a large number of pixels will be detected above the threshold. Bright detections with small areas are, therefore, classified as radhits. It should be applied to background subtracted images.

Figure 8.2: Cartoon of single-frame outlier detection.

It has three input parameters: Segmentation Threshold, Detection Max Area, and Radhit Threshold. It performs image segmentation based on a low threshold Segmentation Threshold and then applies two conditions to weed out point sources. The first condition is that the total area in pixels should be no greater than a user specified Detection Max Area. The second condition is that at least one pixel in the cluster should be greater than another user specified Radhit Threshold, which should be set much higher than Segmentation Threshold.

Figure 8.3: An illustration of how Detect Radhit selects a radhit over two point sources.

Figure 8.3 illustrates the above algorithm. The three objects depicted there are a bright point source, a faint point source, and a radhit. After segmentation with a low Segmentation Threshold is performed, the bright point source creates a cluster of pixels with the number of pixels greater than Detection Max Area. Then the second criterion of having at least one pixel greater than Radhit Threshold weeds out the faint point source.

8.2.2 MultiFrame Temporal Outlier Detection

Figure 8.4: Cartoon of temporal outlier detection.

The second outlier detection method carries out temporal filtering of the interpolated images. It is performed by the Mosaic Outlier module. The input parameters used for this module are Bottom Threshold, Top Threshold, and Min Pix Number.

Important: Using this module does not mean that MOPEX will automatically use the results for outlier detection. In order to use the results from this module, you must check the Use Outlier for RMask box in the Mosaic RMask module and set the RMask Fatal Mask Bit Pattern in the Initial Setup module to use bit 1. If you are using the command line, include the USE_OUTLIER_FOR_RMASK trigger in the namelist and set RMask_Fatal_BitPattern to use bit 1. Note that this is not the same as setting it to a value of 1. See §8.11: Fatal Mask Bit Patterns for more information.

The input images are interpolated onto a common grid, such that the pixels in the interpolated images are perfectly lined up and create a stack of pixels. Each spatial location, i.e. a pixel in the common grid, is filtered in the time domain and outliers are found (see Figure 8.5).

If pixel value I_k for pixel k in the stack satisfies either of the following conditions:

I_k < M - Bottom Threshold *σ

I_k > M + Top Threshold *σ

pixel k is declared an outlier. M and σ are estimates of the mean and scatter of the pixel values in the stack. If pixel uncertainty images are provided σ above is set to the greater of two values: the scatter and the smallest pixel uncertainty in the stack. The scatter cannot be reliably estimated with only a handful of samples. If the stack size is less than Min Pix Number, then σ is set to the smallest pixel uncertainty in the stack.

Figure 8.5: Interpolated images (red) are stacked up and shown here in the side view. The underlying grid is shown in black.

Two ways to estimate the mean M and scatter σ of the pixel values in the stack are implemented. It is controlled by the namelist variable THRESH_OPTION, but this variable is only available on the command line. The GUI uses the default value of THRESH_OPTION = 1, which is highly recommended for all users.

THRESH_OPTION = 1 is the more robust and highly recommended way of sigma estimation. Under this option,

M = median(I_k), σ=MAD(I_k) = median(|I_k-median(I_i)|)/0.6745

where MAD stands for the median absolute deviation. The factor of 0.6745 here represents the inverse of the third quartile of the normal distribution. The division is necessary because the numerator alone tends to underestimate the standard deviation, so dividing by this value makes the MAD more accurate.

THRESH_OPTION = 2 is less robust and based on the minimum difference between the values of two consecutive pixels in the stack. We strongly recommend that users do not use this option.

The product of this step is an outlier map, where only outlier pixels have a non-zero value. The pixel value O_k in an outlier map is the deviation of the pixel k in the input images from the mean in the stack in terms of the number of standard deviations σ:

Equation 8.1

The thresholds - Bottom Threshold and Top Threshold - can be set to 0, which is the default. In this case the decision of declaring a particular pixel an outlier is put off until running the Mosaic RMask module. The advantage of doing it this way is that the user can experiment with different thresholds for outliers without having to rerun Mosaic Outlier.

The main limitation of temporal outlier rejection is the cases of shallow coverage.

8.2.3 Dual Outlier Detection

Figure 8.6: Cartoon of dual outlier detection.

The third method represents a more complicated dual spatial-temporal filtering.

Dual outlier detection consists of two-stage filtering. At the first stage, all spatial pixel outliers are detected and saved as detection maps. These detection maps include point sources and radhits. Detection maps are interpolated to a common grid. Then for each spatial location, the values of the interpolated pixels in the detection maps are compared. If in the majority of the detection maps this pixel location has not been detected by spatial filtering, then it is declared an outlier in the images where this pixel location has been detected. This method is expected to be reliable for a small number of images, i.e. in the shallow coverage case. Therefore it represents a suitable supplement of the multiframe temporal outlier detection. Below are the four steps of the dual outlier detection in more detail.

Important: Using this module does not mean that MOPEX will automatically use the results for outlier detection. In order to use the results from this module, you must check the Use Dual Outlier for RMask box in the Mosaic RMask module and set the RMask Fatal Mask Bit Pattern in the Initial Setup module to use bit 2. If you are using the command line, include the USE_DUAL_OUTLIER_FOR_RMASK trigger in the namelist and set RMask_Fatal_BitPattern to use bit 2. Note that this is not the same as setting it to a value of 2. See §8.11: Fatal Mask Bit Patterns for more information.

Spatial detection

The Detect module, applied to background subtracted images, finds all the pixels with values greater than a specified number of standard deviations. The input parameters used here are Detection Max Area, Detection Min Area, and Detection Threshold. The product of this step is called a detection map. Each contiguous cluster is assigned a number and the values of all the pixels in the cluster are set to this number.

Detection map interpolation

The map interpolation is carried out by the Mosaic Projection module. The process is different from image interpolation. The non-zero input pixels set the values of the output pixels they overlap without any weighting. The process is ambiguous if two clusters are close, and there are output pixels that overlap input pixels from both clusters. The products of this step are the interpolated detection maps (see Figure 8.7).

Figure 8.7: A sample interpolated detection map. Clusters 1, 2 and 5 are point sources, clusters 3 and 4 are radhits. Other pixels are set either to 0 if they overlap non-outlier pixels in the input detection map, or to NaN (not a number) if they overlap no pixels in the input detection map.

Multiframe Comparison of Detection Maps

The comparison is performed by the Mosaic Dual Outlier module. The input parameters that are used are Max Outlier Image and Max Outlier Fraction. Interpolated detection maps are stacked up. For each stack, the number of spatial outlier pixels is found. If the fraction of outliers in the stack is smaller or equal to Max Outlier Fraction and the number of spatial outliers in the stack is smaller or equal to Max Outlier Image, then these pixels are now classified as dual (spatial and temporal) outliers, and the sign of the dual outlier pixels is changed. For example, in Figure 8.8, suppose Max Outlier Fraction = 0.5 and Max Outlier Image = 3. Stack A is the perfect case when only one pixel in the stack is a spatial outlier (value of -12). Its sign is changed from positive to negative. Stack B has 4 pixels that are spatial outliers (1, 12, 5, 8). The fraction of spatial outliers is 0.4 < Max Outlier Fraction, but 4 > Max Outlier Image, so since the stack does not satisfy both conditions, their signs are unchanged and remain positive. In Stack C the fraction of spatial outlier is 0.6 > Max Outlier Fraction, so the signs remain unchanged. Stack D is the perfect case of a pixel in a point source detected in all overlapping images. The fraction of spatial outliers is 1. The products of this step are dual detection maps (see Figure 8.9).

Figure 8.8: Suppose Max Outlier Fraction = 0.5 and Max Outlier Image = 3. Stack A is the perfect case when only one pixel in the stack is a spatial outlier (value of -12). Its sign is changed from positive to negative. Stack B has 4 pixels that are spatial outliers (1, 12, 5, 8). The fraction of spatial outliers is 0.4 < Max Outlier Fraction, but 4 > Max Outlier Image, so since the stack does not satisfy both conditions, their signs are unchanged and remain positive. In Stack C the fraction of spatial outlier is 0.6 > Max Outlier Fraction, so the signs remain unchanged. Stack D is the perfect case of a pixel in a point source detected in all overlapping images. The fraction of spatial outliers is 1.

Figure 8.9: As a result of temporal detection, the sign of the pixels of clusters 3 and 4 has been changed. Also some of the pixels of cluster 2 have their sign changed. Cluster 5 is probably a moving source - the spatial detection suggests that it is real, but the temporal detection shows that it is no longer at the same coordinates in subsequent frames. One of the pixels in cluster 5 has its sign unchanged. This could happen if an overlapping frame has a radhit affecting the corresponding pixel.

Dual Detection Map Correction

The processing is done by the Level module. The dual detection map is processed in order to eliminate detection of the outskirts of legitimate point sources as dual outliers. If a dual outlier belongs to a cluster where the majority of pixels are not dual outliers the chances are the pixel has been wrongly marked because it is on the edge of the point source.

The sign of wrongly marked pixels is flipped based on the namelist parameter Threshold Ratio. If the number of negative pixels N- in a cluster with N pixels is smaller than the threshold

Equation 8.2

then their signs are flipped. If the number of positive pixels N+ in a cluster with N pixels is smaller than the threshold

Equation 8.3

then their signs are flipped. If Threshold Ratio = 0.5, which is the default, the pixels within each cluster will have the same sign.

The products of this step are the corrected dual detection maps (see Figure 8.10).

Figure 8.10: Suppose Threshold Ratio = 0.5. In Figure 4, cluster 2 has outlier fraction of 0.25, which is less than Threshold Ratio. The corrected dual map has all the pixels in cluster 2 set positive. Cluster 5 in Figure 4 has an outlier fraction of 0.9, which is greater than Threshold Ratio. The single positive pixel in cluster 5 has been set negative.

8.2.4 Box Outlier Detection

The fourth outlier detection method is designed to use both the temporal and spatial information like the dual outlier, but it uses the statistical analysis of the kind used by the temporal outlier. It is implemented by the Mosaic Box Outlier module.

Important: Using this module does not mean that MOPEX will automatically use the results for outlier detection. In order to use the results from this module, you must check the Use Box Outlier for RMask box in the Mosaic RMask module and set the RMask Fatal Mask Bit Pattern in the Initial Setup module to use bit 3. If you are using the command line, include the USE_BOX_OUTLIER_FOR_RMASK trigger in the namelist and set RMask_Fatal_BitPattern to use bit 3. Note that this is not the same as setting it to a value of 3. See §8.11: Fatal Mask Bit Patterns for more information.

For each mosaic pixel position, Mosaic Box Outlier creates two stacks of pixel values. The first stack is the same as the one created for the regular outlier detection method. It includes pixels from each interpolated image corresponding to the mosaic pixel position. This pixel in each interpolated image will be referred to as the main pixel. In each interpolated image a box of the user defined size is created centered on the main pixel. Input parameters Box X and Box Y define the size of the box. The second stack -- the box stack -- includes pixels from the above box in each interpolated image. The values of M and σ are computer in the box stack:

M = biased_median(I_k)

σ = MAD(I_k) = biased_median(|I_k biased_median(I_i)|) / 0.6745

The biased_median is defined as follows. If there are N pixels in the stack, then the biased median is equal to the N/2-Bias element of the stack:

biased_median(I_k) = I[N/2 - Bias]

The input parameter Box Median Bias controls the value of Bias.

The product of this step is an outlier map. The pixel value O_k in an outlier map is the deviation of the pixel k in the input images from the mean M in the stack in terms of the number of standard deviations:

Equation 8.4

If all the pixels in the main stack are outside of the M +/- σ range, the program falls back on the temporal outlier method 1 using the main stack and the uncertainty images, if the latter are available.

Inclusion of the neighboring pixels allows for outlier detection, even in the case of coverage of two, where the simple temporal detection is completely powerless. Clearly, a single pixel radhit on a relatively constant background can be easily identified as outlier, even with the smallest box size of 3. However, even radhit in the shape of wide and long streaks can be successfully detected by this method owing to the Box Median Bias. Suppose, the box size is 3, so the stack consists of 18 pixels, and the size of the radhit is at least 3x3 pixels. Out of the 18 pixels 9 background pixels have values relatively close to each other. The 9 radhit pixels will in general have greater scatter, then the 9 background pixels. Because of the bias, the median will be equal to the highest value of the background pixels. The difference between the highest and the lowest values of the background pixels are expected to be smaller, than the difference between the highest value of the background pixel and the lowest value of the radhit pixels. Consequently, the biased median deviation will be equal to the difference between the highest and lowest values of the background pixels.

Figure 8.11: The 18 pixel values shown here are from two images and the box size of 3. The 9 lower pixel values are the background and the 9 higher pixel values are from a radhit.

The problem with temporal outlier rejection is that it sometimes rejects valid pixels in the center of bright sources. It is especially acute for the undersampled data. Interpolation of such data results in significant errors and some interpolated pixels have values higher than the uncertainties and the scatter in the stack. Including the neighboring pixels increases the estimate of the scatter and prevents identifying such pixels as outliers.

Figure 8.12: The pixels inside the red rectangle are included in the box stack. The red rectangle is centered around the solid red pixel, which is the mosaic pixel for which the estimation is performed. The solid red pixel is part of the main stack. In the figure BOX X = 5 and BOX Y = 3.

Figure 8.13: Interpolated images (red) are stacked up and shown here in the side view. The underlying FIF grid is shown in black. The solid red pixels comprise the main stack and the solid red and pink pixels comprise the box stack. In this figure BOX X = 3 and BOX Y is not shown.