VI. Analysis of the 2MASS All-Sky Data Release
1. Source Reliability
a. Single Band Sources in the PSC
Davy Kirkpatrick's analysis of 2MASS sources without optical counterparts in the Sloan Digital Sky Survey Early Release area shows that the majority of non-confirming sources that are not associated with asteroids or with bright star confusion in the optical data are H-band only sources in the PSC. Based on these findings, Mike Skrutskie examined the distribution and statistics of H-band sources on the sky. He found a significant number of H-only sources in the PSC with ph_qual values of 'UAU', which otherwise signifies the highest quality photometry. The majority of H-only sources are concentrated towards the Galactic plane, and are most likely associated with confusion. However, there are a significant number (~48,000) of H-only, high quality sources at high galactic latitude. Among these, many more are present in data taken from the southern 2MASS telescope than from the northern observatory.
H-band-only sources are difficult to explain astrophysically. Single-band J sources should be relatively common in the high latitude sky because of the intrinsic colors of stars and the relative sensitivity between the 2MASS bands. Single-band Ks sources can result from intrinsically very red objects, or those obscured by dust, such as extreme AGB and carbon stars, a number of which are found even at high latitude. The best way to produce an H-band "peaked" source would be by the presence of strong emission lines in the H-band. This might be possible with the Fe emission lines in some planetary nebulae, or perhaps in z~1.5 QSOs if the H-alpha line is redshifted into the H-band window. Both of these cases are expected to be rare, though, so we conclude that the majority of H-only detections at high latitude are spurious. The asymmetry between observatories also suggests that their occurrence is related to a characteristic of the instruments.
This page presents statistics and general discussion of all single-band sources in the PSC to put the H-band-only sources into context.
b. Basic Statistics
Table 1 contains some basic statistics for the high galactic latitude PSC single-band sources. Single band sources with photometric quality of ph_qual='A' make up 0.143, 0.163 and 0.04% of all high-quality J, H and Ks sources in the |b|>30° PSC, respectively. If most of the H-band only sources are spurious, as suggested by the SDSS comparison, then this implies a source reliability below the 99.95% specified for the Level 1 Science Requirements. Even if all of the high-latitude Ks-only sources are spurious, which is known not to be the case, the Ks reliability requirements are met.
Table 1 - Statistics for the High Latitude PSC
Characteristic | Selection Criteria | Total | North | South |
---|---|---|---|---|
High Latitude PSC | abs(glat)>30 | 48163068 | 16338802 | 31824266 |
... high quality any band | ... AND ph_qual MATCHES '*A*' | 33163550 | 11316044 | 21847506 |
High Latitude PSC, all J-band detections | abs(glat)>30 AND rd_flg MATCHES '[1-6]??' | 47774906 | 16180008 | 31594899 |
... highest quality J band | ... AND ph_qual MATCHES 'A??' | 32770441 | 11187232 | 21583209 |
High Latitude PSC, all H-band detections | abs(glat)>30 AND rd_flg MATCHES '?[1-6]?' | 46227096 | 15608464 | 30618632 |
... highest quality H band | ... AND ph_qual MATCHES '?A?' | 26828899 | 9171390 | 17657509 |
High Latitude PSC, all Ks band detections | abs(glat)>30 AND rd_flg MATCHES '??[1-6]' | 40981199 | 26703141 | 14278058 |
... highest quality Ks band | ... AND ph_qual MATCHES '??A' | 20203160 | 7310781 | 12892379 |
Single-band J Read_2 | rd_flg='200' and abs(glat)>30 | 1314471 | 425129 | 889342 |
... high quality | ... AND ph_qual = 'AUU' | 46989 | 18306 | 28683 |
Single-band H Read_2 | rd_flg='020' and abs(glat)>30 | 186500 | 45968 | 140532 |
... high quality | ... AND ph_qual = 'UAU' | 43677 | 3217 | 40460 |
Single-band Ks, Read_2 | rd_flg='002' and abs(glat)>30 | 113011 | 70279 | 42732 |
... high quality | ... AND ph_qual = 'UUA' | 7929 | 3680 | 4249 |
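As a cross-check, the single-band fractions can be recomputed directly from the counts in Table 1. The short Python sketch below does this; the dictionary layout and names are illustrative only, and the 0.05% budget is simply the complement of the 99.95% reliability requirement. Comparing a fraction with the budget gives a worst case, since it treats every single-band source in that band as spurious.

```python
# Counts copied from Table 1 (|b| > 30 deg).
# Each entry: (high-quality single-band sources, all high-quality detections in the band)
counts = {
    "J": (46989, 32770441),   # ph_qual='AUU' vs. ph_qual MATCHES 'A??'
    "H": (43677, 26828899),   # ph_qual='UAU' vs. ph_qual MATCHES '?A?'
    "Ks": (7929, 20203160),   # ph_qual='UUA' vs. ph_qual MATCHES '??A'
}

# Unreliability allowed by a 99.95% reliability requirement, in percent.
UNRELIABILITY_BUDGET_PCT = 0.05


def single_band_percent(band):
    """Percentage of high-quality detections in a band that are single-band."""
    single, total = counts[band]
    return 100.0 * single / total


for band in counts:
    pct = single_band_percent(band)
    # Worst case: assume every single-band source in the band is spurious.
    verdict = "over" if pct > UNRELIABILITY_BUDGET_PCT else "within"
    print(f"{band}: {pct:.3f}% single-band ({verdict} the 0.05% budget)")
```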
How reliable/unreliable are the high galactic latitude single band sources? Table 2 below gives the number of ph_qual='A' single band sources in each band, followed by the percentage of those that have optical counterparts within 2", those that are flagged as extended sources (ext_key NOT null) or that are flagged as contaminated by an extended source (gal_contam>0). The numbers are also listed separately for each hemisphere.
Association with an optical counterpart within 2" is a good secondary reliability indicator for high latitude 2MASS sources. Of course, an intrinsically red source may very well appear as a Ks-only object without an optical counterpart, and such objects are among the most interesting sources to some of us! Sources which are extended and/or confused with nearby extended sources are also prone to bandmerge confusion, or single-band detection. In either case, they do correspond to real sources on the sky, albeit ones for which the point source photometry does not measure the flux that well. Another cause for single-band detections of real sources is confusion of close multiple stars. Unfortunately, there isn't a reliable measure of these.
The final line for each band gives the combined percentage of all sources that either have an optical counterpart, are an extended source, or are confused with an extended source. This gives a lower limit to the reliability for the single band sources. At least half of the J-only sources are in this category and are likely reliable, with the fraction roughly consistent between hemispheres. About 43% of the Ks-only sources are likely reliable, with the southern hemisphere about 10 percentage points lower than the north. The H-only sources are the real problem, with a lower limit of only 6.6% reliability as a whole. This is dominated by the southern H-only sources, for which the lower limit is only 3.75%. Thus, H-only single-band sources seriously compromise the Level 1 reliability requirements.
Table 2 - Reliability Estimates for High Latitude, Single-Band Sources
Band/Selection | Subset | All | North | South |
---|---|---|---|---|
# J ph_qual='AUU' | All | 46989 | 18306 | 28683 |
... | Percentage w/opt < 2" | 49.0 | 41.2 | 53.9 |
... | Percentage extended, or w/gal_contam>0 | 7.4 | 11.9 | 4.5 |
... | Percentage w/opt,ext,gal_contam | 54.5 | 50.0 | 57.4 |
# H ph_qual='UAU' | All | 43677 | 3217 | 40460 |
... | Percentage w/opt < 2" | 3.8 | 15.3 | 2.9 |
... | Percentage extended, or w/gal_contam>0 | 3.3 | 31.3 | 1.1 |
... | Percentage w/opt,ext,gal_contam | 6.6 | 42.4 | 3.8 |
# Ks ph_qual='UUA' | All | 7929 | 3680 | 4249 |
... | Percentage w/opt < 2" | 28.5 | 26.1 | 30.4 |
... | Percentage extended, or w/gal_contam>0 | 18.2 | 27.6 | 10.0 |
... | Percentage w/opt,ext,gal_contam | 43.1 | 48.5 | 38.5 |
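The combined percentage in Table 2 is smaller than the sum of the two individual percentages because some sources fall in both categories. The sketch below, using the "All" column of Table 2, recovers that overlap by inclusion-exclusion and converts the combined percentage into an approximate count of sources with no supporting association. The function names are illustrative, and because the table percentages are rounded, the results are approximate. An unconfirmed source is not necessarily spurious; the combined percentage is only a lower limit to the reliability.

```python
# "All" column of Table 2:
# (number of sources, % with optical counterpart within 2",
#  % extended or gal_contam>0, % in either category)
table2_all = {
    "J": (46989, 49.0, 7.4, 54.5),
    "H": (43677, 3.8, 3.3, 6.6),
    "Ks": (7929, 28.5, 18.2, 43.1),
}


def overlap_percent(band):
    """Percentage of sources in both categories (inclusion-exclusion)."""
    _, opt, ext, combined = table2_all[band]
    return opt + ext - combined


def unconfirmed_count(band):
    """Approximate number of sources with neither kind of association."""
    n, _, _, combined = table2_all[band]
    return round(n * (1.0 - combined / 100.0))


for band in table2_all:
    print(f"{band}: overlap {overlap_percent(band):.1f}%, "
          f"unconfirmed ~{unconfirmed_count(band)}")
```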
c. Sky Distribution
Figures 1-3 show the distribution of the high latitude single-band sources in pseudo-Aitoff equatorial and galactic coordinates. In the case of the J-band sources, only the ph_qual='AUU' sources are plotted to limit the number of points. These can be compared to John Carpenter's all-sky maps of single band sources.
- The J-band maps show that there appear to be more uniformly distributed single-band sources in the north than in the south, contrary to the numbers in Table 1. However, the increasing concentration of sources towards the Galactic plane and in the Magellanic Clouds dominates the lower density of sources in the field. The concentration of sources towards the Plane and the Magellanic Clouds indicates that many of the J-only sources are real, as expected. Such a gradient is not obvious in the H or Ks maps, but there is a condensation of Ks-only points on the LMC and SMC where both intrinsically red and reddened stars are expected, as is confusion leading to missed bandmerges and source blending.
- Asymmetries in the number of single-band sources between observatories are visible in all three bands, indicating that there is some instrumental influence on single-band sources. This is most striking in the high-quality H-only sources, where about 12.5 times more sources are found in the southern data than in the north.
- Although it is difficult to see in Figures 2a and 2b, there are excess H-only sources along the scan declination boundaries. Closer examination shows an excess along the scan RA boundaries, as well. This suggests that the H-only sources are non-repeating between scans and make it into the catalog as use_src=dup_src=0 sources.
d. Cross-Scan Distribution
Figures 4-9 show the distribution of cross-scan positions (x_scan) for the single-band sources in the high latitude PSC. For each band, each hemisphere is plotted separately, and for each hemisphere, the distributions for all sources and sources with ph_qual='AUU' are shown. For the northern H-band, the distributions for the old and new arrays are plotted in different colors. All of the figures can be seen full size together on this page.
If all single-band sources were associated with real sources on the sky, then they should be uniformly distributed in cross-scan. The actual distributions in all three bands show a number of discrete, narrow peaks, indicating that many of the single-band sources cannot be real.
- The peaks with the highest contrast are seen in the southern H-band data. Figures 7a and 7b show that the majority of the H-only sources in the PSC must be concentrated in one peak near cross-scan position +154.
- There is a discontinuity at the array quadrant boundary in the distributions of several of the arrays, with the southern J-only data showing it most prominently. In the southern J-only data, the discontinuity is accentuated by a narrow trough of sources just to the negative side of the boundary. It would be interesting to see if there is a corresponding discontinuity/trough in the distribution of multi-band sources.
- Peaks near x_scan=0 are common to nearly all of the distributions. This may be related to read out problems with the first columns in the quadrants. Since sources near the scan edges are trimmed out of the PSC, the corresponding problem at the edge of the other quadrants will not be seen.
- Not all peaks seen in the plots of the full single band samples are seen in the high-quality plots, and the amplitude of the peaks often differs between the two. This suggests that some of the peaks produce fainter and/or noisier sources than others, and that these do not satisfy the criteria for the highest photometric quality.
- The northern Ks data show the largest number of peaks. Only 8-9 of the peaks are seen in the ph_qual='UUA' distributions, though.
- The southern J, H and Ks distributions all show a low-level alternating "jail-bar" pattern in the background counts (the pattern in the H-distribution can be seen in a zoomed version of the plot). The difference between hemispheres suggests that the different sets of camera electronics may have an effect.
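The kind of narrow cross-scan peak described above can be picked out of a histogram automatically. The following is a minimal sketch, not the procedure used for the figures: it bins x_scan values, takes the median bin count as the background, and flags bins that exceed it by a chosen number of Poisson standard deviations. The function name, the 1" bin width, the 5-sigma threshold, and the synthetic data (a uniform background over an assumed cross-scan range plus a 2"-wide spike) are all assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import median


def flag_excess_bins(x_scan_values, bin_width=1.0, n_sigma=5.0):
    """Return left edges of bins whose counts exceed the median bin
    count by more than n_sigma Poisson standard deviations."""
    # Floor division keeps negative x_scan values in the correct bin.
    counts = Counter(int(x // bin_width) for x in x_scan_values)
    background = median(counts.values())
    threshold = background + n_sigma * background ** 0.5
    return sorted(b * bin_width for b, c in counts.items() if c > threshold)


# Synthetic illustration only: uniform background plus a narrow spike.
random.seed(0)
sample = [random.uniform(-250.0, 250.0) for _ in range(50000)]
sample += [random.uniform(153.0, 155.0) for _ in range(2000)]

flagged = flag_excess_bins(sample)
print(flagged)
```

Using the median rather than the mean keeps the background estimate from being pulled up by the peaks themselves, which matters when a single peak holds a large share of the sources.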
d. Chi-Squared Distribution
Figures 10-15 show the profile-fit chi-squared values for all single-band R2 sources in each band and hemisphere. Plots showing the distributions for all single-band sources with ph_qual='A' are provided for each case.
- All of the chi-squared vs. x_scan distributions are characterized by large populations near chi-squared of unity punctuated by narrow fingers of high chi-squared sources. The high chi-squared peaks are coincident with the peaks seen in the x_scan histograms in Figures 4-9.
- The discontinuity in the number of southern J-only sources seen in Figure 5a is confirmed in Figure 11a.
- The high chi-squared fingers come in two "flavors". The first class shows an extensive distribution in chi-squared space, reaching ?_psfchi values of 5 or more. The worst of these have chi-squared distributions clearly extending above 10, the cutoff for selection for the PSC. The second class of high chi-squared sources has ?_psfchi values peaking near 2, with a much narrower range of values. These are quite prominent in the northern Ks array.
e. Cause of the H-only Sources
The distribution of the H-only sources indicates that they are detections of features in the arrays, rather than astrophysical sources. The isolated peaks are most likely caused by "hot" pixels in the arrays. The broader excess of H-only sources in the old northern H-array data is probably associated with the low coverage due to the growing number of dead pixels concentrated in the center of that array.
Noisy pixels were usually masked automatically by the DARKS and DFLAT subsystems in the pipeline. In those systems, pixels with consistently aberrantly large dispersions in the dark and sky-flat stacks were identified and turned off in the processing. This was usually quite effective, and succeeded in suppressing the large number of known hot pixels in the northern Ks array. Gene Kopan has found the probable mechanism that allowed these spurious noise detections to slip by the masking procedure. These are most likely hot pixels situated next to masked, dead pixels.
f. Identifying Spurious Sources Associated with the Southern H-array Hot Pixel
The combination of an unmasked hot pixel and a nearby dead pixel suggests that at least the spurious H-only detections associated with the hot pixel at cross-scan position ~154" can be detected by a combination of cross-scan position, chi-squared and the ndet parameter. Figures 14 and 15 show histograms of h_psfchi values for the southern H-only sources in the worst cross-scan excess region, 153<x_scan<155. Different values of ndet are coded in different colors. Immediately apparent is the fact that the bulk of the high chi-squared sources have ndet='000500', meaning that these are sources for which one frame of coverage was lost, and no >3-sigma aperture measurements were achieved. This is what is expected from the situation Gene found. The frame with the hot pixel is rejected from the aperture measurement set, hence M=5, and there are no detections on the other frames (N=0). Profile-fitting does use the frame with the masked pixel since it works around such pixels, so there is a profile-fit measurement, but it has a high chi-squared value since there is signal on only one frame. There are also a small number of high chi-squared sources with ndet='000400', which are expected since there will be a small fraction of second frames lost due to random events such as radiation hits.
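To make the ndet values quoted here easier to read, the sketch below splits an ndet string into per-band (N, M) pairs. It assumes the six characters are ordered as N and M for J, then H, then Ks, where N is the number of frames with a >3-sigma aperture detection and M is the number of frames available; this ordering is consistent with the examples in the text but the helper itself (decode_ndet) is hypothetical.

```python
def decode_ndet(ndet):
    """Split a six-character ndet string into per-band (N, M) pairs.

    Assumed layout: N and M for J, then H, then Ks.
    N = frames with a >3-sigma aperture detection, M = frames available.
    """
    if len(ndet) != 6 or not ndet.isdigit():
        raise ValueError(f"expected six digits, got {ndet!r}")
    digits = [int(ch) for ch in ndet]
    return {
        "J": (digits[0], digits[1]),
        "H": (digits[2], digits[3]),
        "Ks": (digits[4], digits[5]),
    }


# The dominant value in the 153<x_scan<155 excess: H-only, N=0, M=5.
print(decode_ndet("000500"))
```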
Figure 16 shows the x_scan histogram of all high latitude southern H-only sources, along with those having ndet='000500', zoomed into the region around x_scan=154. The excess sources are confined to a width of 2.0", as expected since that is the array pixel size.
A proposed formula for identifying the vast majority of spurious H-band only sources associated with this hot pixel is:
This query run on the entire provisional PSC yields a count of 68,815 sources. Of those, 41,173 are found at |b|>30°. The general properties of the sample of H-only hot pixel candidate sources are discussed in the following memo:
Properties of Southern H-array Hot Pixel "Sources" to be Deleted from the PSC
Figure 14 - South H-only, h_psfchi and ndet
Figure 15 - South H-only, h_psfchi and ndet - Log Scale
Figure 16 - South H-only x_scan histogram zoom-in on x_scan=154
To investigate if any reliable sources would be filtered out using the criterion described above, I selected southern H-only sources in the 90<x_scan<110 range that had ndet='000500'. That cross-scan region doesn't show any excess of sources in Figures 7a and 7b. The chi-squared criterion is intentionally left out to investigate if it is useful to discriminate between real and unreliable sources.
There are 17 H-only sources that satisfy these criteria. None have ph_qual='UAU', six have ph_qual='UBU', and seven have ph_qual='UCU' in that range. The images for each candidate were examined, and this annotated table gives the results. The table also contains four sources in that cross-scan range that have ndet='000400'. Here is a summary:
- 10 sources appear to be real objects on the sky. Most of these were visible in other bands, and/or on DSS images. 4 of these were in confused and/or crowded environments including the Magellanic Clouds and a globular cluster.
- 1 source was a detection on the disk of an edge-on spiral galaxy. This source correctly had gal_contam=2.
- 3 sources were unflagged detections on relatively bright diffraction spikes. (Two of these are associated with R Dor, which is one of the variables known to have left residual diffraction spike artifacts in the PSC.)
- 1 source was a detection on a residual meteor trail.
- 2 sources were blank on the 2MASS and DSS images.
All of the real sources, the galaxy disk, and two of the diffraction spike detections have h_psfchi<2. The two blank field sources, the meteor trail detection and one of the three diffraction spike artifacts have h_psfchi>2. Thus, retaining sources in the affected columns with ?_psfchi<2 preserves the reliable detections. It also leaves in a few artifacts.
Interestingly, all of the sources with h_psfchi>2 in this test region of x_scan space, which has no known hot pixel, are unreliable. One could conclude from this limited test that all faint, single-band sources with high chi-squared values have a strong chance of being unreliable.
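The criteria discussed in this section (cross-scan position, chi-squared and ndet) can be combined into a single test. The sketch below is an illustration assembled from values mentioned in the text (153<x_scan<155, h_psfchi>2, ndet of '000500' or '000400', southern H-only sources); it is not the query actually proposed above, whose exact form is not reproduced here. The Source record, the hemis field with values 'n'/'s', and the function name are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Source:
    rd_flg: str      # read flag, e.g. '020' for H-only
    hemis: str       # assumed hemisphere code: 'n' or 's'
    x_scan: float    # cross-scan position in arcsec
    h_psfchi: float  # H-band profile-fit chi-squared
    ndet: str        # six-character frame-detection string


def is_hot_pixel_candidate(src, x_lo=153.0, x_hi=155.0, chi_min=2.0,
                           ndet_values=("000500", "000400")):
    """Illustrative test for the southern H-array hot pixel near x_scan=154."""
    return (
        src.rd_flg == "020"
        and src.hemis == "s"
        and x_lo < src.x_scan < x_hi
        and src.h_psfchi > chi_min
        and src.ndet in ndet_values
    )


# Made-up example records, not catalog entries.
suspect = Source("020", "s", 154.1, 7.5, "000500")
control = Source("020", "s", 100.3, 1.2, "000500")
print(is_hot_pixel_candidate(suspect), is_hot_pixel_candidate(control))
```

Keeping the chi-squared threshold as a parameter makes it easy to repeat the test described above, in which the chi-squared criterion was deliberately left out.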
g. Identifying Spurious Sources Associated with Other Hot Pixels
Can the criteria used to isolate sources associated with the worst of the southern H-array hot pixels be generalized to find all unreliable hot pixel sources? The table below links to sets of plots showing cross-scan histograms and ?_psfchi vs. x_scan plots for all high galactic latitude, single band sources from each array used in the survey. Separate plots are provided for different values of ndet.
- All of the arrays show narrow bands of excess sources in the cross-scan histograms. Most of these are also seen in the chi-squared vs. cross-scan plots, and so are the signatures of hot or noisy pixels.
- High chi-squared excesses are seen in the distribution of sources with ndet='[1-7]?' (i.e. >3-sigma detection in one or more of the available frames) in all of the arrays, and not just in the N=0 values of ndet. The peaks containing the largest number of these sources are in the southern J-array data. For the worst peak, with -218<x_scan<-200, virtually all (97%) of the sources have ndet='160000', so these are predominantly single-frame events.
- The amplitude of the excess peaks differs between different values of ndet. Some peaks are apparent only for one value, and some persist through many values.
- Comparison of the cross-scan histograms in these plots with those of the ph_qual='A' distributions shown in Figures 4b-9b shows that many of the high chi-squared excess peaks do not propagate into the high SNR regime.
- The northern Ks-band array shows the largest number of high chi-squared excess peaks. Most of these peaks show chi-squared distributions that peak around 2-3, though, and are somewhat different from the peaks seen in the other arrays that extend to very large chi-squared values. It is the sources in these numerous, moderate chi-squared peaks that account for the excess of Ks-only sources seen in John Carpenter's all-sky maps of single band sources. Raymond Tam has examined some of the sources in one of these peaks and reports that most are not real objects on the sky.
Table 3 - Cross-Scan Histograms and ?_psfchi vs. x_scan Plots for Different ndet Combinations in Each Array
Hemisphere | J | H | Ks |
---|---|---|---|
North | X | X | X |
North - new |  | X |  |
South | X | X | X |
The x_scan position and width of each of the cross-scan histogram peaks with corresponding high chi-squared populations were measured, and an estimate of the number of "excess" sources in each peak was made by summing the total in each peak and subtracting the mean "background" counts. The "background" in this case corresponds to the mean number of uniformly distributed, presumably reliable sources. In some cases, estimating the background was not simple, so expect some errors here. Table 4 contains a summary for each detector of the cross-scan position of each peak, and the number of excess sources. This is a measure of the peaks from all of the high latitude single-band sources. Also given in the table are the numbers of excess sources in each of the peaks in the distributions of only ph_qual='A' sources. The fraction of all single-band sources represented by the total of all "excess" (presumably unreliable) sources is also provided.
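The excess estimate described above amounts to summing the counts inside a peak and subtracting the mean background per bin times the number of peak bins. A minimal sketch follows; the function name, the choice of neighboring bins as the background sample, and the histogram counts are all made up for illustration and are not taken from the survey data.

```python
def excess_in_peak(bin_counts, peak_bins, background_bins):
    """Peak total minus the mean background per bin times the peak width.

    bin_counts maps a bin label (here the left edge in arcsec) to a count.
    """
    mean_background = (
        sum(bin_counts.get(b, 0) for b in background_bins) / len(background_bins)
    )
    peak_total = sum(bin_counts.get(b, 0) for b in peak_bins)
    return peak_total - mean_background * len(peak_bins)


# Hypothetical 1"-bin histogram around a peak (invented numbers).
histogram = {150: 98, 151: 103, 152: 101, 153: 1100,
             154: 1095, 155: 99, 156: 102, 157: 97}

excess = excess_in_peak(
    histogram,
    peak_bins=[153, 154],
    background_bins=[150, 151, 152, 155, 156, 157],
)
print(excess)
```

When the background is not flat on either side of a peak, the choice of background bins changes the result, which is one reason the estimates can carry errors.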
Table 4 - Location and Numbers of Sources in Cross-Scan Excess, High Chi-Squared Peaks
Table 5 gives a summary of the numbers of hot-pixel candidate sources for each detector. The final line of the table lists the percentages of all high latitude sources in a given band that the hot-pixel candidates represent. The percentages are listed for all detections in a given band, and for all detections in a band with ph_qual='A' in that band. See Table 1 for a summary of the total number of sources in each band.
For the high-quality (ph_qual='A') PSC, the H-band reliability is seriously compromised by the hot-pixel detections. In J and Ks, the hot pixel candidate sources represent a much smaller fraction of sources than our unreliability budget (0.05%). The unreliable fractions are larger for the lower SNR component of the PSC, as expected.
Table 5 - Number of Hot Pixel Candidate Sources in High Latitude PSC
Hemisphere | J - all | J - 'AUU' | H - all | H - 'UAU' | Ks - all | Ks - 'UUA' |
---|---|---|---|---|---|---|
North | 2272 | 371 | 4125 | 175 | 15132 | 651 |
North (new H array) |  |  | 1235 | 182 |  |  |
South | 28941 | 402 | 84279 | 37575 | 4852 | 366 |
Total | 31213 | 773 | 89639 | 37932 | 19984 | 1017 |
Percentage of all Band Detections | 0.065% | 0.002% | 0.194% | 0.141% | 0.049% | 0.005% |
h. Summary
The reliability requirement for the PSC (99.95% in unconfused areas) is currently not satisfied for H-band ph_qual='A' detections because of a large number of spurious detections caused by a single unmasked hot pixel in the southern H-band array.
All of the survey detectors exhibit some evidence for hot pixels that produce spurious detections, but all are much less severe than the southern H-band instance. Moreover, most of the probable unreliable detections of the hot pixel events produce "sources" with lower photometric quality detections, so do not compromise the strict survey requirements.
The unreliable detections of the southern H-band hot pixel can be reliably identified using a source's cross-scan position, ndet and chi-squared values, so they can be easily flagged in or removed from the PSC. Accounting for just the 68566 sources associated with the one southern hot pixel would improve the H-band reliability of the high photometric quality component of the PSC from ~99.86% to ~99.99% (ignoring other sources of unreliability). This clean-up would remove very few if any reliable sources.
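The before and after reliability figures can be reproduced from the high latitude counts in Tables 1 and 5. The sketch below treats hot-pixel candidates as the only source of unreliability, as the text does, and assumes that the southern H 'UAU' candidates in Table 5 are the ones that would be removed; the names are illustrative.

```python
# High latitude, ph_qual='A' H-band counts.
TOTAL_H_HIGH_QUALITY = 26828899   # Table 1, highest quality H band
HOT_PIXEL_H_ALL = 37932           # Table 5, H 'UAU' total
HOT_PIXEL_H_SOUTH = 37575         # Table 5, H 'UAU' south


def reliability_percent(spurious, total):
    """Reliability in percent, counting only the given spurious sources."""
    return 100.0 * (1.0 - spurious / total)


before = reliability_percent(HOT_PIXEL_H_ALL, TOTAL_H_HIGH_QUALITY)
# Assumption: all southern H 'UAU' candidates are removed or flagged.
after = reliability_percent(HOT_PIXEL_H_ALL - HOT_PIXEL_H_SOUTH,
                            TOTAL_H_HIGH_QUALITY)

print(f"before: {before:.2f}%")
print(f"after exceeds 99.99%: {after > 99.99}")
```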
The potential exists for identifying and/or removing sources associated with all of the candidate hot pixels from all of the arrays. This would improve the reliability of both the high and lower photometric quality components of the PSC.
[Last Updated: 2003 May 22; by R. Cutri]