V. Catalog Generation


3. Source Selection

The source selection criteria for 2MASS Catalog Generation, described in the following sections, are designed to draw from the Working Databases lists of reliable sources with accurate photometry and positions. Because reliability and completeness trade directly, though, the criteria have been tuned so as not to compromise the completeness of the Catalogs.

a. SNR Limits

Sources selected for the Second Incremental Point and Extended Source Catalogs are required to have a signal-to-noise ratio (SNR) and brightness (for XSC sources) in excess of:

In this case, SNR is derived from the photometric measurement uncertainty, σ_mag, expressed in magnitudes (SNR = 1.0857 / σ_mag).
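The conversion between a magnitude uncertainty and SNR can be sketched as follows. The constant 1.0857 is 2.5 / ln(10), which comes from the small-error approximation to the magnitude scale. The function names here are illustrative and are not part of any 2MASS software.

```python
import math

# 2.5 / ln(10): relates a fractional flux error to a magnitude error
# in the small-error approximation (~1.0857).
MAG_PER_FRACTIONAL_ERROR = 2.5 / math.log(10.0)


def snr_from_mag_uncertainty(sigma_mag: float) -> float:
    """Return the SNR equivalent to a magnitude uncertainty (in mag)."""
    if sigma_mag <= 0.0:
        raise ValueError("sigma_mag must be positive")
    return MAG_PER_FRACTIONAL_ERROR / sigma_mag


def mag_uncertainty_from_snr(snr: float) -> float:
    """Return the magnitude uncertainty (in mag) equivalent to an SNR."""
    if snr <= 0.0:
        raise ValueError("snr must be positive")
    return MAG_PER_FRACTIONAL_ERROR / snr


if __name__ == "__main__":
    # An uncertainty of 0.36 mag corresponds to an SNR of about 3.
    print(round(snr_from_mag_uncertainty(0.36), 2))
```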

i. Reliability

The source detection thresholds used in 2MASS pipeline processing (cf. IV.4a) are set low enough to ensure that the completeness requirements of the Survey are met. The SNR and brightness limits given above limit the contents of the Catalogs to reliable sources, per the 2MASS reliability requirements.

This is illustrated in Figures 1, 2 and 3. The lower right panels in these Figures show, for J, H and Ks respectively, the distribution of photometric measurement uncertainties for all point sources in the Working Database with b>+85°. The equivalent SNR=7, 4, and 3 levels are indicated by the horizontal lines in the lower right and lower left panels; the lower left panel shows the photometric measurement uncertainties plotted against source brightness. Two indicators of high latitude point source reliability for 2MASS are detection in more than one band (particularly for H and Ks) and association with an optical source from the ACT or USNO-A catalog (cf. IV.4g). The red and blue curves in the Figures show the distributions for multi-band detections and sources with optical counterparts, respectively. The distribution of putatively reliable multi-band detected sources begins to fall away from the distribution of all sources (in black) for SNR levels below 7; this is well above the nominal detection threshold of SNR~3.5 indicated by the peak in the uncertainty histogram near 0.3 mag. The SNR~7 threshold generally occurs at a brightness level fainter than the nominal completeness level for the Survey, as can be seen in the differential source count vs. magnitude (dlogN, dM) curves in the top left panels of the Figures. Therefore, the SNR thresholds do not compromise the completeness requirements for 2MASS.

The 2MASS PSC and XSC will contain many sources with SNR < 7 in some bands because the SNR and brightness limits require that a source satisfy them in only one band. Most frequently a higher-SNR J-band measurement will "pull along" fainter detections in the H and Ks bands.
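The "only one band" rule can be sketched as below. This is a minimal illustration covering the SNR test only (not the XSC brightness limit); the function name, the dictionary layout, and the use of None for a non-detection are assumptions made for the example, and the limit is passed in by the caller rather than taken from the Catalog definition.

```python
from typing import Mapping, Optional


def passes_one_band_snr(band_snr: Mapping[str, Optional[float]],
                        snr_limit: float) -> bool:
    """Return True if the SNR in at least one band exceeds snr_limit.

    band_snr maps a band name to its SNR, or to None when the source
    was not detected in that band (a convention assumed for this sketch).
    """
    return any(snr is not None and snr > snr_limit
               for snr in band_snr.values())


if __name__ == "__main__":
    # A strong J-band measurement carries the fainter H and Ks detections.
    source = {"J": 9.2, "H": 5.1, "Ks": 3.8}
    print(passes_one_band_snr(source, snr_limit=7.0))
```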

Figure 1, Figure 2, Figure 3

ii. Flux Overestimation Bias

The second purpose of the SNR thresholds for the Second Incremental Release is to filter out sources from the PSC and XSC which are most affected by flux overestimation bias. This bias is a natural consequence of selecting a flux-limited sample of sources with non-zero measurement uncertainties. A source with an intrinsic brightness near the sensitivity limit of a measurement is more likely to be detected if noise drives up the measured brightness as opposed to driving it down. Therefore, sources detected near the sensitivity limit will have, on average, a measured brightness higher than their true brightness, or equivalently a higher SNR than their true value. Such sources will also have measurement errors that do not accurately represent their true SNR. The closer a measured brightness is to the detection limit, the larger the amplitude of the statistical overestimation. A simulation based on pure Gaussian noise statistics shows that the flux overestimation is still 5% at the SNR~7 level, but diminishes rapidly for brighter sources. Because the noise is rarely Gaussian, this is a lower limit for the expected flux overestimation.
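The mechanism can be illustrated with a deliberately simplified model: a single source of fixed true flux, pure Gaussian noise, and a hard detection threshold. The mean noise among detections is then the mean of a truncated Gaussian. This is not the simulation referred to above and is not expected to reproduce its 5% figure, which depends on details not given here; the function name and parameters are assumptions for the sketch.

```python
import math


def fractional_flux_bias(true_snr: float, detection_snr: float) -> float:
    """Mean fractional flux overestimate among detections of one source.

    Fluxes are in units of the noise sigma, so a source with true flux
    true_snr is detected when true_snr + noise > detection_snr, with
    noise drawn from a unit Gaussian. The mean noise among detections
    is pdf(a) / P(noise > a), where a = detection_snr - true_snr.
    """
    a = detection_snr - true_snr
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(a / math.sqrt(2.0))  # P(noise > a)
    return (pdf / tail) / true_snr


if __name__ == "__main__":
    # The bias shrinks quickly as the true SNR rises above the threshold.
    for snr in (3.5, 5.0, 7.0):
        print(snr, fractional_flux_bias(snr, detection_snr=3.5))
```

In this toy model the bias is largest for sources sitting right at the threshold and falls off rapidly for brighter sources, which is the qualitative behavior described in the text.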

The flux overestimation bias is especially troublesome for distributions in which the number of sources rises rapidly with decreasing brightness, such as most astronomical source distributions. There are more sources fainter than a detection threshold than brighter, so more sources are scattered above the threshold than will be scattered below it. Therefore, at low SNR, sources will "pile-up" in the faintest magnitude bins, causing an apparent excess in source count curves. Because of its statistical nature, it is impossible to avoid this bias entirely, but limiting the Catalogs to higher SNR sources minimizes its impact.
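The asymmetry between sources scattered above and below a threshold can be checked with a small Monte Carlo sketch. The power-law source counts (cumulative slope 1.5), the unit Gaussian noise, the minimum flux, and the threshold value are all assumptions chosen for illustration; they are not 2MASS parameters.

```python
import random


def count_threshold_scatter(n_sources: int, threshold: float,
                            slope: float = 1.5, s_min: float = 1.0,
                            seed: int = 0) -> tuple:
    """Count sources scattered up and down across a flux threshold.

    True fluxes follow N(>S) proportional to S**(-slope) above s_min
    (in units of the noise sigma); each measurement adds unit Gaussian
    noise. Returns (scattered_up, scattered_down).
    """
    rng = random.Random(seed)
    up = down = 0
    for _ in range(n_sources):
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids u == 0
        true_flux = s_min * u ** (-1.0 / slope)
        measured = true_flux + rng.gauss(0.0, 1.0)
        if true_flux < threshold <= measured:
            up += 1      # truly fainter than the threshold, measured above it
        elif measured < threshold <= true_flux:
            down += 1    # truly brighter than the threshold, measured below it
    return up, down


if __name__ == "__main__":
    up, down = count_threshold_scatter(100_000, threshold=5.0)
    print("scattered up:", up, "scattered down:", down)
```

Because faint sources outnumber bright ones, more sources cross the threshold upward than downward, producing the apparent excess in the faintest bins.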

b. PSC - Goodness-of-fit and Frame-detection Limits

Because they do not persist on the sky, cosmic ray strikes and meteor trails generally cause spurious detections in only one out of the six frames covering a particular spot on the sky. Isolated single-frame "events" are filtered out during production of the Atlas Images, and this avoids many detections of cosmic ray strikes. However, cosmic ray strikes near true sources, those affecting many pixels (i.e., grazing hits), and meteor trails largely persist into the coadded Atlas Images, and will therefore trigger spurious detections. These spurious sources can have any brightness from the saturation limit down to the faint detection limit.

To filter out these spurious detections, candidate sources for the Point Source Catalog are required to have:

The reduced χ² goodness-of-fit parameter (<band>_psfchi in the PSC records) tends to unity for sources with profiles consistent with the noise model on all frames, but will have a value >>1 if the source profile does not match the point-spread-function or is dramatically different between frames in the stack of six. Extended sources, such as close multiple stars, galaxies and Galactic nebulae, also have values >1. It is the intention that there be a PSC entry corresponding to virtually all sources in the XSC, so in order to preserve extended sources in the PSC the goodness-of-fit threshold is set conservatively high.

The frame-detection thresholds are designed primarily to filter out bright spurious detections of cosmic rays and meteor trails. The 2MASS scanning strategy covers each point on the sky with six (and occasionally seven) frames. A high SNR source should be detectable on all frames covering its location. However, a small fraction of the time, masked pixels, cosmic rays and proximity to a frame edge will render one or more of the samples of a particular source unusable. For an accurate measurement, we require at least three useful coverages of a source. A reliable high SNR source should be detected on at least 40% of the possible frames that covered its position. "Frame-detection" in this context means that an aperture measurement on an individual frame at the position of a source returns a measurement uncertainty of < 0.36 mag (SNR>3). Source records in the PSC include the ndet_flg, which gives for each band the number of SNR>3 frame detections and the number of frames on which a source could have been detected. Faint sources detected on the coadded stack of six frames (Atlas Images) can be undetectable on the individual frames, so the threshold is applied only for high SNR sources. Thus, the ndet_flg parameter often lists zero detections for otherwise valid faint sources.
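The frame-detection logic described above can be sketched as a simple predicate for one band. The function name and arguments are hypothetical, the SNR above which the test applies is left as a caller-supplied parameter because it is not stated here, and treating "useful coverages" as the number of frames on which the source could have been detected is an interpretation made for this sketch.

```python
def passes_frame_detection(n_detected: int, n_possible: int,
                           snr: float, high_snr_cut: float) -> bool:
    """Apply the frame-detection test to one band of one source.

    n_detected   -- frames with an SNR>3 aperture detection
    n_possible   -- frames on which the source could have been detected
    snr          -- SNR of the source on the coadded image
    high_snr_cut -- SNR above which the test is applied (assumed input)
    """
    if snr < high_snr_cut:
        # Faint sources may be undetectable on single frames,
        # so the test is not applied to them.
        return True
    if n_possible < 3:
        # Fewer than three useful coverages.
        return False
    # Detected on at least 40% of possible frames (integer arithmetic:
    # n_detected / n_possible >= 2 / 5).
    return 5 * n_detected >= 2 * n_possible


if __name__ == "__main__":
    print(passes_frame_detection(3, 6, snr=20.0, high_snr_cut=10.0))
    print(passes_frame_detection(2, 6, snr=20.0, high_snr_cut=10.0))
```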

c. XSC - Extended Source Classification, Untracked Seeing and the Galactic Center

i. The e_score and g_score

Candidate sources for the Second Incremental Release XSC are required to have:

The Working Extended Source Database contains candidate objects selected to be extended with respect to the observed point-spread-function within a Tile. These candidates are a combination of truly resolved objects (galaxies and nebulae) and "false" extended objects, primarily close multiple stars. As discussed in Section IV.5, objects in the Working Database are further classified using an Oblique Decision Tree that incorporates three radial extent attributes, three symmetry attributes and four photometric attributes, and that assigns a confidence score to each band for a source. The final scores, the e_score and g_score, are SNR-weighted averages of the three wavelength band scores. The e_score is tuned to finding any resolved sources, and the g_score is optimized to identify galaxies among the candidates using color information. A point source has e_score=2.0 while a clearly resolved source has e_score=1.0.

The upper panel of Figure 4 shows the e_score for extended source candidates from the Working Database plotted as a function of the integrated J magnitude. The classification of each source in this figure was determined using either visual inspection of optical or non-2MASS infrared imaging data, or with spectroscopy. Verified galaxies are denoted by filled white circles, double stars by red triangles and higher multiples of stars by cyan cross symbols. Galaxies cluster in two places, either at a score of 1.0 or around 1.1 to 1.4 at the faint end (J~14, Ks~13 mag). The galaxy clustering is due to the weighted averaging; the score value jumps from 1, if all three bands agree, to an intermediate value of about 1.3, if the source failed in one band. "False" galaxies are predominantly located at 2.0 with clustering around 1.5 to 1.8. The lower panel of Figure 4 shows the g_score for extended source candidates from the Working Database plotted as a function of the integrated Ks magnitude. Verified galaxies tend to have a g_score that ranges from 1.0 to 1.4. This score does a slightly better job of discriminating galaxies from non-galaxies than does the e_score.
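The clustering near 1.3 follows directly from the averaging, as the sketch below shows. It assumes the per-band scores are weighted linearly by SNR; the exact weighting used in the pipeline is not specified here, and the function name is illustrative.

```python
from typing import Sequence


def snr_weighted_score(band_scores: Sequence[float],
                       band_snrs: Sequence[float]) -> float:
    """Combine per-band scores into one score using SNR as the weight."""
    if len(band_scores) != len(band_snrs):
        raise ValueError("scores and SNRs must have the same length")
    total_weight = sum(band_snrs)
    if total_weight <= 0.0:
        raise ValueError("total SNR weight must be positive")
    weighted_sum = sum(score * snr
                       for score, snr in zip(band_scores, band_snrs))
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Resolved (1.0) in two bands, point-like (2.0) in the third,
    # with equal SNR in all three bands.
    print(round(snr_weighted_score([1.0, 1.0, 2.0], [5.0, 5.0, 5.0]), 2))
```

With equal weights, one dissenting band moves the score from 1.0 to 4/3; unequal SNRs spread the values around that point, consistent with the 1.1 to 1.4 range seen at the faint end.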

Figure 4

ii. Untracked Seeing

When the atmospheric seeing varies during an observation of a 2MASS Tile faster than the pipeline's ability to track it accurately, then the ability to discriminate reliably between point and extended sources is compromised. Most Tile observations that have significantly untracked seeing are identified during the Quality Assurance review and are given low quality scores that schedule them for reobservation. However, there are a few Tiles that are retained in the Second Incremental Data release in which the seeing was untracked for short periods. There are 255 extended source candidates in the regions of the affected Tiles that were excluded from the XSC, even if they otherwise met all other nominal Catalog selection criteria, because of the high probability of unreliability.

iii. The Galactic Center

The stellar source density in the vicinity of the Galactic Center overwhelms any automated star-galaxy discriminators due to the presence of so many multiple groupings of stars. Fewer than 1% of the ~17,000 candidate extended sources in this region identified by 2MASS scan pipeline processing are known to be reliably extended. This reliability is so low that all XSC candidates within an elliptical region having a semi-major axis of 12.8° parallel to the Galactic Plane, and an axis ratio of 0.47 have been omitted from the Catalog.
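A membership test for such an elliptical region can be sketched as below. The semi-major axis and axis ratio are taken from the text; centering the ellipse at l = 0°, b = 0° and treating Galactic coordinates as a flat plane near the center are assumptions made for this sketch, as is the function name.

```python
SEMI_MAJOR_DEG = 12.8   # semi-major axis, parallel to the Galactic Plane
AXIS_RATIO = 0.47       # minor / major
SEMI_MINOR_DEG = SEMI_MAJOR_DEG * AXIS_RATIO


def in_galactic_center_exclusion(l_deg: float, b_deg: float) -> bool:
    """Return True if (l, b) lies inside the assumed exclusion ellipse.

    The ellipse is assumed to be centered at l = 0, b = 0, and the sky
    is treated as flat, which is adequate only near the center.
    """
    # Wrap longitude into [-180, 180) so that l = 350 becomes -10.
    l_wrapped = (l_deg + 180.0) % 360.0 - 180.0
    return ((l_wrapped / SEMI_MAJOR_DEG) ** 2
            + (b_deg / SEMI_MINOR_DEG) ** 2) <= 1.0


if __name__ == "__main__":
    print(in_galactic_center_exclusion(350.0, 0.0))
    print(in_galactic_center_exclusion(0.0, 7.0))
```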

d. Bright Star Artifact Identification

As discussed in Sections IV.5 and IV.7, both spurious detections caused by optical artifacts of bright stars and real source detections affected by those artifacts are identified during scan pipeline processing and in the Catalog Generation phase. Sources in the Working Databases that are not identified as spurious detections are candidates for the PSC and XSC. Sources that are believed to be real astrophysical objects, but may have photometry affected by the artifacts, are passed to the Catalogs and are flagged as possibly corrupted using the "cc_flg" (cf. I.6.b and I.6.c).

e. Additional Flagging

During the Catalog Generation phase, additional effort was made to identify residual artifacts that persist into the Catalogs even though they meet the nominal criteria for Catalog selection defined above. PSC and XSC sources so identified are flagged with cc_flg="U" indicating that they are "unreliable," but they have been left in the Catalogs so as to not compromise the objective selection criteria for catalog generation.

[Last Update: 2000 March 1, by R. Cutri, T. Jarrett and T. Chester]

