Version 3 POSMAN Upgrade Plan

H.L. McCallon 05-22-00

Contained herein is a brief description of the proposed plan for POSMAN-related upgrades to support 2MAPPS v3.0 processing. Despite the overall high quality of the positions resulting from version 2 processing, some problems remain. By taking advantage of all the information now available, these problems can be eliminated and the overall position errors reduced significantly. Achieving this improvement involves some significant changes from version 2 processing. In a nutshell, the frame positions from version 2 processing are refined using information from overlaps and Tycho-2 residuals prior to version 3 processing. POSFRM is changed to allow acceptance of the refined frame positions without modification during version 3 processing. This will save considerable CPU time and (more importantly) avoid the risk of diverging from the already-achieved global solution.

The principal advantage of this approach is that it puts all the available information, including Tycho-2 and the scan overlaps, into the version 3 positions. Furthermore, all the auxiliary position information (such as optical ID and asteroid position differences) coming out of the pipeline will be consistent. If one were to run version 3 with a slightly modified POSFRM (say with Tycho-2 plus distortions) and then Martinize afterward to pick up the overlap information, previously noted consistency problems arise with the auxiliary position data.

Based on the results from a large-scale (entire 2nd Release) prototype test along with the recent public release of two new astrometric catalogs (Tycho-2 and UCAC), it is believed that significant improvements can be obtained with rather low risk. In the following paragraphs the plan is broken down into eight steps, followed by a discussion of advantages versus risks.

I. Correct for Distortion:

In order to avoid introducing biases, version 2 source positions should be corrected for distortion prior to their use in position refinement. Any effects due to changes in distortion as a function of in-scan "y" position within a frame will be averaged over many different "y" positions due to frame stacking. The differences with band are, of course, also averaged into the final bandmerged positions. Thus, it should be possible to remove most of the effect of distortion on block adjustments using bandmerged positions to fit a model of in-scan and x-scan distortion as a function of x-scan only. Computing distortions for sources already bandmerged has been previously investigated and the southern results can now be reasonably well confirmed in-scan and x-scan using 6.6 million matches with the newly available UCAC.
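As a rough illustration of modeling in-scan and x-scan distortion as a function of x-scan only, the sketch below bins reference-catalog residuals by x-scan position and subtracts the bin mean from a source position. The binned-mean table is only a stand-in for whatever functional form is actually fitted. The function names, the tuple layout, and the units are assumptions made for this example.

```python
from statistics import mean


def fit_xscan_distortion(matches, x_min, x_max, n_bins):
    """Tabulate mean residuals in bins of x-scan position.

    matches: iterable of (x_scan, d_in_scan, d_x_scan), where the residuals
    are (2MASS minus reference catalog) for bandmerged sources.
    Returns one (mean d_in_scan, mean d_x_scan) pair per bin; bins with no
    sources get (0.0, 0.0), i.e. no correction (an assumption of this sketch).
    """
    width = (x_max - x_min) / n_bins
    bins = [([], []) for _ in range(n_bins)]
    for x, d_in, d_x in matches:
        if not (x_min <= x <= x_max):
            continue  # ignore sources outside the modeled x-scan range
        k = min(int((x - x_min) / width), n_bins - 1)
        bins[k][0].append(d_in)
        bins[k][1].append(d_x)
    return [(mean(a), mean(b)) if a else (0.0, 0.0) for a, b in bins]


def correct_position(x_scan, in_scan, table, x_min, x_max):
    """Subtract the tabulated distortion at this x-scan position."""
    n_bins = len(table)
    width = (x_max - x_min) / n_bins
    k = max(0, min(int((x_scan - x_min) / width), n_bins - 1))
    d_in, d_x = table[k]
    return x_scan - d_x, in_scan - d_in
```

Because the table depends on x-scan only, any variation with in-scan "y" position is left to average out through frame stacking, as described above.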

Keep in mind that, even if the frame positions coming out of POSFRM are perfect, distortion continues to affect the generation of band-scan positions downstream. Band-frame level distortion, where in-scan and x-scan adjustments for each band are modeled as a function of "x" and "y" positions within each frame, has been fitted. The initial computations were made for sample nights in the north and the south using special scans over Stone astrometric fields. It was later demonstrated in both northern hemisphere and southern hemisphere tests that the band-frame level distortion can also be well determined using the lower-accuracy, higher-density and (most importantly) globally-available USNO-A2.0. Possible variations with time (particularly in the north) need further investigation, but there's little doubt that the tools are at hand to generate the band-frame model.

A 05/23/00 meeting of most of the affected CogE's resulted in the following proposed plan for handling distortion in processors downstream of POSFRM:

PROPHOT- Do nothing. The smearing induced is "in the noise" compared to other sources of smearing. An alternative approach would be to uniformly translate the pixel positions for each of the six single-frame apparitions of each source according to the distortion at the nominal source position, just before entering the chi-square minimization loop. This would give better correction, at the cost of greater complexity. The corresponding section of PSFMAKE would also need to be modified in the same way. If this method were adopted, a correction for the remaining frame-level differential refraction could be done at the same point.

BANDMERGE- Assuming distortion is not corrected in PROPHOT, in-scan and x-scan corrections can be applied as a function of x-scan position separately for each band prior to bandmerge.

PICMAN- Effect is very small, placing it at the bottom of the task priority list.

GALWORKS- Assuming distortion is not corrected in PICMAN, in-scan and x-scan corrections can be applied to the bandmerged extended-source positions as a function of x-scan position.

II. Generate Scan-Segment Overlap and Tycho-2 Difference Files:

First, scan overlap matches are retrieved from the "tmass" (full working) database. It is important that it be done with "tmass", as opposed to "tmasss" (catalog release database), so as not to constrain future decisions regarding which scan should be used for a given tile in the version 3 processing. Since everything is set up to work with scans, rather than tiles, having multiple scans of the same tile does not present a problem. Using the "tmass" database, in fact, brings more information to bear. As is the case with virtually all database retrieval of large datasets, this will be a time-consuming task. We need to start early on this task and figure out a way to make incremental additions as nights are processed. Once the data is available, each scan is divided in-scan into a dozen 1/2 degree segments. Trimmed-mean differences w.r.t. all overlapping scan-segments are computed using high quality distortion-corrected sources from the version 2 processing.
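The segmentation and trimmed-mean step might look roughly like the sketch below for one pair of overlapping scans. The segment length and count come from the description above. The trim fraction, the tuple layout of the matches, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

SEGMENT_LEN_DEG = 0.5  # from the text: 1/2 degree segments
N_SEGMENTS = 12        # from the text: a dozen segments per scan


def segment_index(in_scan_deg):
    """Map an in-scan offset (degrees from scan start) to a segment number."""
    k = int(in_scan_deg // SEGMENT_LEN_DEG)
    return max(0, min(k, N_SEGMENTS - 1))


def trimmed_mean(values, trim_fraction=0.1):
    """Mean after dropping trim_fraction of the values from each tail."""
    if not values:
        raise ValueError("no values to average")
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def overlap_differences(matches, trim_fraction=0.1):
    """Trimmed-mean differences per pair of overlapping scan-segments.

    matches: iterable of (in_scan_a_deg, in_scan_b_deg, d_in_scan, d_x_scan)
    for sources matched between scan a and scan b (hypothetical layout).
    Returns {(segment_a, segment_b): (mean d_in_scan, mean d_x_scan)}.
    """
    groups = defaultdict(lambda: ([], []))
    for a_deg, b_deg, d_in, d_x in matches:
        key = (segment_index(a_deg), segment_index(b_deg))
        groups[key][0].append(d_in)
        groups[key][1].append(d_x)
    return {
        key: (trimmed_mean(d_in, trim_fraction), trimmed_mean(d_x, trim_fraction))
        for key, (d_in, d_x) in groups.items()
    }
```

Trimming both tails keeps a handful of mismatched sources from dominating the mean difference for a segment pair.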

It will also be necessary to extract from the database 2MASS matches to the Tycho-2 and UCAC catalogs. This may also be a time-consuming task and should be approached as described above. Files which provide quick access by scan-segment number, for both the mean overlap differences and the differences w.r.t. Tycho-2 and UCAC (where available), are built for later use.

III. Identify and Re-Run Worst-Case Scans Using POSFRM with Tycho-2:

Potential problem scans can be quickly identified by searching for large overlap differences and/or systematically large Tycho-2 differences generated back in step II. The list can be further refined by checking overlap difference consistency on the east and west sides of a scan. Once the worst scans are identified, POSFRM can be rerun for them using the newly available Tycho-2 as the reference catalog. This capability has already been incorporated in POSFRM, but is not turned on for mainstream pipeline processing. It has been demonstrated to be of considerable benefit to problem scans. POSFRM can be executed in stand-alone mode without need to rerun the entire pipeline on these scans. Once the worst-case scans have been rerun, the step II difference files are adjusted accordingly.

While this step is not absolutely necessary to the plan, eliminating the worst-case offenders up front reduces the chances of mistakenly attributing some of their errors to adjacent scans. The prudent course would be to rank the scans in decreasing order of apparent position deviation and then rerun the first "n" scans. The value of "n" will likely need to be driven by resource limitations.
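The ranking itself is simple. A minimal sketch follows, assuming each scan has already been reduced to a single "apparent position deviation" figure. How the overlap and Tycho-2 differences are combined into that figure is not specified here, and the function name and scan identifiers are hypothetical.

```python
def select_rerun_scans(deviation_by_scan, n):
    """Return the n scan IDs with the largest apparent position deviation.

    deviation_by_scan: dict mapping scan ID to a non-negative deviation
    (hypothetical summary of the step II differences for that scan).
    """
    ranked = sorted(deviation_by_scan, key=deviation_by_scan.get, reverse=True)
    return ranked[:max(0, n)]


# Example with made-up scan IDs and deviations (arcsec):
rerun_list = select_rerun_scans({"scan_a": 0.2, "scan_b": 1.5, "scan_c": 0.7}, n=2)
```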

IV. Compute Pre-Adjustment Scan-Segment Position Uncertainties:

Scan-segments are assigned an initial uncertainty based on the mean extraction uncertainties of high-quality extractions within the segment. The resulting variances are multiplied by variance factors which nominally start at one. The initial values of variance factors for scan-segments known to have reconstruction problems can be manually adjusted upward. Chi-square values are computed for all overlapping scan-segments. A small adjustment is made to the variance factor assigned to each scan-segment in the direction which drives the weighted mean of its overlap chi-squares toward one. As each scan-segment variance factor is changed it affects the chi-square values differently for each of its overlaps. All chi-square values are redetermined and the process is repeated iteratively, with no segment uncertainty allowed to fall below a specified minimum. This type of technique to determine position uncertainties was first used for the Sampler and is further described in a later URL. Early plans were to solve a set of simultaneous equations to get the desired variance factor adjustments. But given the complicated overlap possibilities, the dimension of the problem and various other considerations, iteration turned out to be a more effective technique. This process has been used in slightly modified forms to assign position uncertainties for all releases to date and has shown no problems with convergence.
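The iteration can be sketched in one dimension as follows. This is not the production algorithm. The multiplicative update with a small "gain" exponent, the per-overlap weights, and the default parameter values are all assumptions chosen to illustrate the idea of nudging each variance factor until the weighted mean of its overlap chi-squares approaches one.

```python
def tune_variance_factors(base_var, overlaps, min_var=1e-4, gain=0.2, n_iter=200):
    """Iteratively adjust per-segment variance factors.

    base_var: list of initial variances, one per scan-segment.
    overlaps: list of (i, j, diff, weight), where diff is the mean position
              difference between segments i and j and weight is the weight of
              that overlap in the mean chi-square (e.g. number of matches).
    Returns the list of variance factors (all start at 1.0).
    """
    n = len(base_var)
    factors = [1.0] * n
    for _ in range(n_iter):
        chi_sum = [0.0] * n
        w_sum = [0.0] * n
        # Recompute every overlap chi-square with the current factors.
        for i, j, diff, weight in overlaps:
            var_ij = factors[i] * base_var[i] + factors[j] * base_var[j]
            chi2 = diff * diff / var_ij
            for k in (i, j):
                chi_sum[k] += weight * chi2
                w_sum[k] += weight
        # Nudge each factor toward a weighted mean chi-square of one.
        for k in range(n):
            if w_sum[k] == 0.0:
                continue  # segment has no overlaps; leave its factor alone
            mean_chi2 = chi_sum[k] / w_sum[k]
            factors[k] *= mean_chi2 ** gain
            # Enforce the minimum allowed segment variance.
            factors[k] = max(factors[k], min_var / base_var[k])
    return factors
```

For two segments of unit variance with a single overlap difference of 2, the factors both settle at 2, giving a chi-square of 4 / (2 + 2) = 1. A small gain keeps each step modest, which matters because changing one factor alters the chi-squares of all its neighbors.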

V. Adjust Scan-Segment Positions to Globally Minimize Differences:

In a manner very similar to that described for the uncertainties, scan-segment positions are iteratively adjusted to minimize (weighted least-squares) overlap differences as well as Tycho-2 residuals. In this case the uncertainties from step IV provide the initial inverse-variance weighting factors. This type of process is often referred to as "block adjustment", but is more popularly known in 2MASS as "Martinizing" after Martin Weinberg who first suggested it for 2MASS.
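A one-dimensional sketch of such an iterative block adjustment is given below. It uses Gauss-Seidel-style sweeps in which each scan-segment correction is reset to the inverse-variance weighted value that best satisfies its overlaps and its Tycho-2 residual, given the current corrections of its neighbors. The sweep scheme, data layout, and names are assumptions for illustration and are not taken from the prototype code.

```python
def martinize(n_seg, overlaps, tycho, n_iter=500):
    """Iteratively solve for per-segment position corrections (one axis).

    overlaps: list of (i, j, diff, var), diff = pos_i - pos_j before correction.
    tycho:    dict {segment: (resid, var)}, resid = pos_segment - Tycho-2.
    Returns corrections c such that diff + c[i] - c[j] and resid + c[k]
    are minimized in the weighted least-squares sense.
    """
    # Build per-segment neighbor lists: (other, pos_k - pos_other, weight).
    neighbors = [[] for _ in range(n_seg)]
    for i, j, diff, var in overlaps:
        w = 1.0 / var
        neighbors[i].append((j, diff, w))
        neighbors[j].append((i, -diff, w))

    corr = [0.0] * n_seg
    for _ in range(n_iter):
        for k in range(n_seg):
            num = 0.0
            den = 0.0
            for other, diff, w in neighbors[k]:
                # Overlap is satisfied when diff + corr[k] - corr[other] == 0.
                num += w * (corr[other] - diff)
                den += w
            if k in tycho:
                resid, var = tycho[k]
                # Reference tie is satisfied when resid + corr[k] == 0.
                num += -resid / var
                den += 1.0 / var
            if den > 0.0:
                corr[k] = num / den
    return corr
```

The Tycho-2 terms anchor the solution to the reference frame. With overlaps alone, the corrections would be determined only up to a common offset.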

A prototype test on approximately half the survey (2nd Release) has demonstrated both the feasibility and desirability of this approach. The test is classed as "prototype" in the sense that it was missing the following important items:

1) Distortion was not corrected prior to Martinizing

2) The ACT rather than the Tycho-2 was used as the reference catalog

3) The UCAC was not available as a truth table for parameter tuning

4) Only half the sky was available for processing

Despite these handicaps the test results show a marked improvement in position reconstruction. This can be seen in the before/after overlap difference histograms for the entire release and is confirmed by the 2MASS-UCAC difference histograms for a significant portion of the Release which overlaps UCAC in the south. In each case the dashed red lines show position accuracy as released and the solid black lines show accuracy after correction. These histograms reflect large sample statistics, with the former based on 11 million 2MASS scan overlap matches and the latter on 6.6 million 2MASS/UCAC matches. The selection criteria for the sources going into the overlap difference stats were more stringent, requiring all to be clean 3-band sources. The improvement can only increase as the four liens of the prototype test listed above are removed.

VI. Compute Uncertainties for Adjusted Scan-Segment Positions:

It should be possible to repeat the technique described in step IV using the refined differences. This gets a little sticky because at this point one would expect the overlapping scans to be correlated. However, given that adjacent scans will generally be using different Tycho-2 stars and overlaps are coming from all sides in the Martinizing process, the scan-to-scan errors may not be as correlated as first thought. More work is needed here to determine whether the correlation is significant, and if so, how to incorporate that into the algorithm.

VII. Generate "Super FPOS" Files with Refined Frame Positions and Uncertainties:

The results from steps V and VI provide most of the information needed to update the Frame POSition (FPOS) files from version 2. Frame corrections can be obtained via linear interpolation of the scan-segment position correction and uncertainty files. Small additional adjustments will need to be made to each of the three FPOS files (J,H,K) to account for differences in distortion with band at the origin of each frame. The adjustments needed will be provided by the band-frame distortion analysis previously discussed. It is also important that the BANDMERGE code be changed to actually use the uncertainties in the FPOS files, rather than taking them from namelist.
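The interpolation from scan-segment values to individual frames could be sketched as below. Placing each segment's value at the segment center and holding the end values constant beyond the first and last centers are assumptions of this example, as are the function and argument names.

```python
from bisect import bisect_right


def interpolate_to_frame(frame_in_scan, seg_centers, seg_values):
    """Linearly interpolate a per-segment quantity to a frame position.

    frame_in_scan: in-scan coordinate of the frame origin.
    seg_centers:   increasing in-scan coordinates of the segment centers.
    seg_values:    correction (or uncertainty) for each segment.
    """
    if frame_in_scan <= seg_centers[0]:
        return seg_values[0]   # hold the first value before the first center
    if frame_in_scan >= seg_centers[-1]:
        return seg_values[-1]  # hold the last value past the last center
    i = bisect_right(seg_centers, frame_in_scan) - 1
    t = (frame_in_scan - seg_centers[i]) / (seg_centers[i + 1] - seg_centers[i])
    return seg_values[i] + t * (seg_values[i + 1] - seg_values[i])
```

The same routine can serve for both the position corrections and the uncertainties, called once per axis and once per frame.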

VIII. Modify PFPREP and POSFRM Processors:

The PFPREP and POSFRM processors will be modified to, under namelist control, accept the super FPOS files from step VII as definitive. No attempt will be made to further adjust the frame positions. Since a large fraction of the code for these processors is devoted to this task, many CPU cycles will be saved. Far more importantly, the risk of diverging from the already achieved global solution will be eliminated. Remember that a POSFRM reconstruction (even with Tycho-2) works on an isolated scan. It does not have the scan overlap information available to it, whereas the Martinized solution has both Tycho-2 and the overlaps. Another reason for preventing POSFRM from making further changes is an apparent POSFRM bug (so far resisting diagnosis) which, on infrequent occasion, allows the frame solution to systematically move away from the reference stars. The Martinizing process should remove the effect and we want to be sure POSFRM doesn't have a chance to re-introduce it.

POSFRM will be reduced to a single pass and that pass will run faster than either pass from the version 2 processing. PFPREP should also run faster, with two of the remaining POSMAN processors (PFPOST and POSPTS) staying about the same. PFPOST may need some minor changes for special handling of Read1 saturated sources but that shouldn't affect runtime. It's likely that REFPOS can be removed altogether. All files currently produced by the POSMAN subsystem for use by downstream processors and quality assurance will continue to be generated. Plots of residual distortion will be added to the quality assurance output to verify successful correction.

IX. Discussion of Advantages vs Risks:

The advantage of using all available information, including Tycho-2 and the scan overlaps, while providing consistent auxiliary position outputs has already been discussed. Another important advantage of the approach is that it virtually eliminates the possibility of scans with really bad position reconstruction slipping through. It provides an environment with global visibility of problem scans and the capability to make adjustments and rerun the global solution in a timely manner. Control variables include overlap, Tycho-2 and segment-to-segment difference weighting, with the UCAC providing an excellent truth table over almost half the sky for performance evaluation. The only type of error which would not show up during the Martinizing process would be a "processing mismatch". This refers to a case where the inputs to the Martinizing algorithm come from one processing of a scan and the FPOS files to be corrected from a different processing of the scan. In order to avoid this possible mishap, unique date-and-time-of-processing codes have been added for each scan in the database and are also present in the header of each FPOS file.
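A guard against a processing mismatch could be as simple as the following sketch, which compares the date-and-time-of-processing code stored in the database for each scan against the code read from the corresponding FPOS header. The dictionary layout, the code format, and the function name are hypothetical.

```python
def find_processing_mismatches(db_codes, fpos_codes):
    """Return scan IDs whose FPOS header code differs from the database code.

    db_codes:   {scan_id: processing code from the database}
    fpos_codes: {scan_id: processing code from the FPOS file header}
    A scan missing from the database is also reported as a mismatch.
    """
    return sorted(
        scan_id
        for scan_id, code in fpos_codes.items()
        if db_codes.get(scan_id) != code
    )


# Example with made-up scan IDs and codes:
bad = find_processing_mismatches(
    {"scan_a": "000522.1010", "scan_b": "000523.0930"},
    {"scan_a": "000522.1010", "scan_b": "000601.1200", "scan_c": "000601.1300"},
)
```

Running such a check before any corrections are applied to the FPOS files would catch the one error class that the Martinizing process itself cannot reveal.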

An obvious disadvantage to the approach is that it requires a lot of work on version 2 output data in order to prepare the needed inputs for version 3. Fortunately much of the work can be done before all the version 2 data is available. With proper planning we should be able to "hit the ground running" when the version 2 processing is complete, so the remaining tasks can be completed in a timely manner. Along these lines, careful attention to scheduling the database retrieval tasks is probably the most critical. The final Martinizing will have to wait until all the data is available, but it should be possible to work out the kinks well in advance of that. Even size-related problems could probably be worked out when 90% of the sky is available.

One further point to keep in mind is that a significant fraction of the effort (steps II and IV) will need to be done in any case to provide position uncertainties for the remaining version 2 release(s).

http://spider.ipac.caltech.edu/staff/hlm/2mass/v3plan/v3plan.html
Comments to: Howard McCallon
Last update: 06 June 2000