SUMMARY
Within the context of the maximum-likelihood estimation procedure used
in PROPHOT, the computation time involved in active deblending depends
critically on the search technique used to arrive at the initial guesses
of the source component positions. As an example, the processing time
for an N=2 blend, expressed as a multiple of the single-source processing
time, would be approximately 40 in the case of brute-force search, or
as little as 8 if a guided search is used. For a 2MASS scan of high
source density, the deblending operation could easily increase the total
processing time for the scan by a factor of 4 (brute-force search) or 1.5
(guided search) assuming that the criterion for an attempted deblend is
rchisq > 3.
INTRODUCTION
PROPHOT has already been designed with the inherent capability of simultaneous
fits for an arbitrarily large number, N, of point sources. All that is
required is to provide the algorithm with initial-guess positions of the N
sources, and it will then perform a maximum likelihood estimation of the source
parameters and their associated uncertainties. If the sources are sufficiently
far apart that they produce separate peaks on the coadded image,
the initial positions can be obtained from the locations of those peaks,
and this case (referred to as "passive deblending") is dealt with
satisfactorily by the current version of PROPHOT.
If, however, the sources are blurred into a single peak (the "active
deblending" case), the determination of the initial positions is nontrivial.
It is nevertheless a necessary step, since nonlinearities in the measurement model
lead to a multi-valley structure for the chi-squared function being minimized,
and poor initial guesses for the source locations could prevent the attainment
of the global minimum. This will require some further development in PROPHOT,
and the purpose of this memo is to outline some of the issues involved, and to
provide some estimates of the impact on run time.
OBTAINING THE INITIAL POSITIONS
There are many different possible approaches to this problem, some of which involve steep increases of computation time with increasing N, and some for which the computation time is independent of N, but for which there is a substantial computational overhead (e.g. Richardson and Marsh 1991).

In the case of 2MASS, for which PSF errors limit the feasible range of N to values of at most 2 or 3, the most efficient scheme would probably be one which involves an initial search on a coarse grid of source locations, calculating the value of the chi-squared function at each location. The grid would actually be a phase space of dimensionality 2N, corresponding to a pair of coordinates per source. At each grid point, a limited (flux-only) maximum likelihood estimation would be performed, and the phase-space point at which the chi-squared function is minimized would then correspond to the set of initial positions for the subsequent full maximum likelihood fit (positions and fluxes).

Such a search could be either brute-force (a complete search of a restricted volume of phase space) or a guided search of some kind.
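Since the fluxes enter the measurement model linearly, the flux-only estimation at each phase-space point reduces to a weighted linear least-squares problem. The following sketch (Python; not PROPHOT code) illustrates the idea, where psf_at(pos), a helper returning the PSF sampled on the image grid and centered at pos, and the per-pixel noise level sigma are hypothetical placeholders:

    import numpy as np

    def chisq_flux_only(image, sigma, psf_at, positions):
        # Flux-only ML fit at fixed candidate positions: the model is
        # linear in the fluxes, so the fit is weighted linear least
        # squares, and chi-squared follows from the residuals.
        A = np.column_stack([(psf_at(p) / sigma).ravel() for p in positions])
        b = (image / sigma).ravel()
        fluxes, *_ = np.linalg.lstsq(A, b, rcond=None)
        resid = b - A @ fluxes
        return float(resid @ resid)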
In the brute-force case, the search grid should cover a range of positions somewhat in excess of the critical source separation at which the two respective response functions combine to produce a single peak; in practical terms this corresponds to the FWHM of the PSF. The sampling interval should be such that the PSF behaves roughly quadratically from one sample to the next. For 2MASS images, these requirements would be satisfied by a 5 x 5 grid with a half-pixel sampling interval. For N sources, the number of phase-space search points is then C(25,N), the number of ways of choosing N grid points from 25, which can obviously get quite large.
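The brute-force enumeration itself is then straightforward; the following sketch reuses chisq_flux_only and the image, sigma, and psf_at placeholders from above, with an arbitrary peak position for illustration:

    import itertools
    import numpy as np

    # 5 x 5 grid at half-pixel spacing, centered on the blended peak.
    peak = np.array([10.0, 10.0])     # placeholder peak position
    offsets = 0.5 * np.arange(-2, 3)  # -1.0 ... +1.0 pixels
    grid = [peak + (dx, dy) for dx in offsets for dy in offsets]

    # Exhaustive search over all C(25,N) N-tuples of grid points; the
    # minimizing tuple supplies the initial guesses for the full fit.
    N = 2
    initial_positions = min(
        itertools.combinations(grid, N),
        key=lambda pos: chisq_flux_only(image, sigma, psf_at, pos))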
A guided search could dramatically reduce the number of required search
points. For example, in the N=2 case, one only need search along the axis
of elongation in the coadded image. Asymmetries in the PSF could, however,
compromise this procedure. In addition, it would still leave us unprepared
for the N>2 cases. Another possibility is to use an iterative procedure
such as the one proposed by Elliott (1998), whereby we start with the N=1 case,
and then successively add extra sources, with the search being restricted to
the most recently added component at each iteration. After each
single-component search, a full maximum likelihood fit is performed in
order to adjust the positions of previously found components. This scheme
would have the advantage that the computation time increases roughly
quadratically with N, thus avoiding the combinatorial explosion associated
with the brute-force approach.
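A sketch of this iterative scheme, using the same placeholder conventions as above (full_ml_fit(positions) stands in for the full positions-and-fluxes fit and is assumed to return the refined positions):

    def guided_initial_positions(grid, n_sources, chisq, full_ml_fit):
        # Elliott-style iteration: add one source at a time, searching
        # the grid only for the newest component, then re-fit all
        # components found so far.
        positions = []
        for _ in range(n_sources):
            best = min(grid, key=lambda p: chisq(positions + [p]))
            positions.append(best)
            positions = list(full_ml_fit(positions))
        return positions

Each pass over the 25-point grid costs 25 flux-only evaluations, giving the 25N total quoted in the next section.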
ESTIMATES OF COMPUTATION TIME PER BLEND
Factors involved in assessing the computation time required for active deblending are:
(1) The number of initial search points. For a 5 x 5 grid, this would involve a total of C(25,N) evaluations of the chi-squared function in the case of a brute-force search, and 25N in the case of the guided search procedure (counts worked out in the short sketch following this list).
(2) The number of arithmetic operations involved in an N-source maximum likelihood solution. This is roughly proportional to N^2.
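For concreteness, the search-point counts in factor (1) work out as follows:

    from math import comb

    for n in (2, 3):
        print(n, comb(25, n), 25 * n)
    # N=2:  300 brute-force points vs. 50 guided
    # N=3: 2300 brute-force points vs. 75 guided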
Taking these factors into consideration, the estimated factor by which
the computation time is increased for an N-blend over a single source
is:
     N     Brute-force search     Guided search (Elliott 1998)
   ------------------------------------------------------------
     2             40                           8
     3            500                          20
EFFECT ON COMPUTATION TIME FOR 2MASS SCANS
To assess the impact on the computation time for a complete scan, it is necessary to determine the fraction of the sources to be considered as unresolved blends; it would seem reasonable to base this on the value of reduced chi-squared (rchisq).
Based on an analysis of 6 science scans from the night of 990928n, covering a wide range of source densities (specifically, from about 7000 to 37000 sources per scan), the fraction of extracted sources whose rchisq exceeded 3.0 in scans of low source density was of the order of 1%. For the scans of highest source density, it reached 7%. Assuming these to be representative, the condition rchisq > 3 would seem a reasonable criterion for attempting active deblending. This being the case, we can use the values in the above table to estimate the increase in computation time for processing a scan. The following predictions are based on N=2:
For a low-density scan (in which the majority of the 1% of blend candidates would probably turn out to be single sources), the scan processing time would increase by a factor of 1.4 for the brute-force search, and 1.07 for the guided search procedure.
For a high-density scan (in which the majority of the 7% of blend candidates
would probably turn out to be genuine blends), the scan processing time would
increase by a factor of 3.7 for the brute-force search, and 1.5 for the guided
search procedure.
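These factors follow from a simple mixing argument: if a fraction f of the extracted sources trigger a deblend attempt costing k times a single-source extraction, the scan processing time scales by (1 - f) + f*k. A quick check:

    def scan_factor(f, k):
        # f: fraction of sources attempted as blends;
        # k: per-blend cost relative to a single-source extraction.
        return (1.0 - f) + f * k

    for f in (0.01, 0.07):
        print(f, round(scan_factor(f, 40), 2), round(scan_factor(f, 8), 2))
    # 0.01 -> 1.39 (brute-force), 1.07 (guided)
    # 0.07 -> 3.73 (brute-force), 1.49 (guided)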
COMMENT
From considerations of computation time, a guided search is obviously
desirable. It will be necessary to do some numerical simulations to
make sure that it is robust for the parameters involved in 2MASS source
extraction.
REFERENCES
Elliott, D. G. 1998, "Crowded-Field Photometry for WIRE," JPL Report No.
D-15360.
Richardson, J. M. and Marsh, K. A. 1991, "Point-Process Theory and the
Surveillance of Many Objects," in Maximum Entropy and Bayesian Methods,
C. R. Smith et al. (eds.), Kluwer Academic Publishers, pp. 213-220.