Spitzer Data Analysis Cookbook

Recipe 8. Bandmerge: A How-To Example

This recipe provides a step-by-step guide on how to use the Bandmerge software.

8.1 Requirements

Bandmerge was designed to work on tables produced from Spitzer data by the SSC's MOPEX/APEX software, which performs photometry on images. The software can be downloaded from the SSC web site (http://irsa.ipac.caltech.edu/data/SPITZER/docs/dataanalysistools/tools/bandmerge/).

This recipe uses a sample dataset included in the Bandmerge package.

8.2 What does Bandmerge do?

Bandmerge merges source photometry lists from different wavelengths by matching source positions. It can read in source list tables for two to seven different wavelength bands and produce a merged list.

Bandmerge was designed primarily to work on photometry tables produced from Spitzer IRAC and MIPS data by the SSC's MOPEX/APEX software. Strictly speaking, APEX is not required, but Bandmerge does expect certain keywords and columns, and when it recognizes a Spitzer band, certain parameters are "hard-wired". For instance, Bandmerge labels Spitzer bands with an Outband number based on Spitzer wavelength order: IRAC (3.6, 4.5, 5.8, 8.0 microns) = 1 - 4 and MIPS (24, 70, 160 microns) = 5 - 7. Bandmerge has some limited ability to work on other kinds of data, but this demo assumes you are working on Spitzer data, either IRAC or MIPS, or both. Sample APEX tables are provided.
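
The Outband numbering described above can be written out as a small lookup table. The Python sketch below is purely illustrative: the channel ID strings follow the Instrument_Channel values used in this recipe's namelist, while the table name OUTBAND and the helper outband() are hypothetical and not part of Bandmerge.

```python
# Outband numbers in Spitzer wavelength order, as described in the text:
# IRAC (3.6, 4.5, 5.8, 8.0 microns) = 1-4, MIPS (24, 70, 160 microns) = 5-7.
# Keys use the Instrument_Channel spelling from this recipe's namelist.
OUTBAND = {
    "IRAC_1": (1, 3.6),
    "IRAC_2": (2, 4.5),
    "IRAC_3": (3, 5.8),
    "IRAC_4": (4, 8.0),
    "MIPS_1": (5, 24.0),
    "MIPS_2": (6, 70.0),
    "MIPS_3": (7, 160.0),
}


def outband(channel_id):
    """Hypothetical helper: Outband number for a Spitzer channel ID."""
    return OUTBAND[channel_id][0]


for channel in ("IRAC_1", "IRAC_3", "MIPS_1"):
    number, microns = OUTBAND[channel]
    print(f"{channel}: Outband {number} ({microns} microns)")
```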

Several features make Bandmerge more sophisticated than a simple position-matcher. An important one is that Bandmerge takes positional uncertainties into account when matching positions. It also estimates any systematic offset between two bands, corrects for it, and then repeats the source matching, so that the matches become more accurate with each iteration.
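
To give a feel for what matching with positional uncertainties means, here is a minimal Python sketch of a chi-square position test between two sources. It assumes independent Gaussian position errors with no cross term, which is a simplification. It is not Bandmerge's actual algorithm, and the function names are hypothetical. The default threshold of 24.0 is simply the Chi_Sq_Max value from the demo namelist; using it this way is an assumption.

```python
def position_chi_sq(x1, y1, sx1, sy1, x2, y2, sx2, sy2):
    """Chi-square of the (x, y) separation of two sources.

    Assumes independent errors, so the variances of the two sources add.
    """
    dx = x2 - x1
    dy = y2 - y1
    return dx * dx / (sx1 ** 2 + sx2 ** 2) + dy * dy / (sy1 ** 2 + sy2 ** 2)


def is_match(chi_sq, chi_sq_max=24.0):
    """Accept a candidate pair if its chi-square is below the cut."""
    return chi_sq <= chi_sq_max


# Two sources 0.3 pixels apart in x, each with 0.1-pixel uncertainties.
chi_sq = position_chi_sq(0.0, 0.0, 0.1, 0.1, 0.3, 0.0, 0.1, 0.1)
print(chi_sq, is_match(chi_sq), is_match(chi_sq, chi_sq_max=4.0))
```

A smaller cut keeps only the tightest pairs, which is the idea behind using a smaller Chi_Sq_Max on the first pass.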

For a more general, though less sophisticated, "closest-match" position-matching tool, see the included file README_mgsa.

Bandmerge is just a "best" position-matcher. It should work for uncrowded fields. Its weakness is close multiples, because it tries to make unique one-to-one matches. A common failure might be a close pair at the shortest wavelength, a smeared source at the next longer wavelength, and a large blob at the longest. Bandmerge might assign the smeared source to one member of the pair and the large blob to the other. The user must analyze close multiples further, taking fluxes into account.

Bandmerge consists of a Perl wrapper script, bandmerge.pl, which is called from the command line. It in turn calls C and Fortran binaries. The Bandmerge Perl script is most easily controlled by a namelist, e.g. bandmerge.nl, which needs to be in the ./cdf sub-directory of the working directory.

Bandmerge will typically read in source positions in sky coordinates (RA, Dec) and project them onto a common fiducial image plane with corresponding (x,y) positions. The bandmerging is performed in the fiducial plane in (x,y), and the merged catalog with (x,y) positions is converted back to sky coordinates. The fiducial plane can be specified by the user with an FIF.tbl, like those generated by MOPEX, or Bandmerge can compute an FIF.tbl if the user provides a reference FITS image.
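
The projection onto the fiducial plane can be illustrated with the standard gnomonic (TAN) formulas. The Python sketch below assumes CROTA2 = 0 and the usual FITS pixel convention. Whether Bandmerge's own conversion module follows exactly this convention is an assumption, and sky_to_plane() is a hypothetical name. The reference values are copied from the example FIF.tbl shown in section 8.4.2.

```python
import math


def sky_to_plane(ra, dec, crval1, crval2, crpix1, crpix2, cdelt1, cdelt2):
    """Gnomonic (TAN) projection of (ra, dec) in degrees to pixel (x, y).

    Assumes CROTA2 = 0 and the usual FITS convention
    x = CRPIX1 + xi / CDELT1, y = CRPIX2 + eta / CDELT2.
    """
    ra_r, dec_r = math.radians(ra), math.radians(dec)
    ra0, dec0 = math.radians(crval1), math.radians(crval2)
    dra = ra_r - ra0
    # Cosine of the angular distance from the tangent point.
    denom = (math.sin(dec0) * math.sin(dec_r)
             + math.cos(dec0) * math.cos(dec_r) * math.cos(dra))
    xi = math.cos(dec_r) * math.sin(dra) / denom
    eta = (math.cos(dec0) * math.sin(dec_r)
           - math.sin(dec0) * math.cos(dec_r) * math.cos(dra)) / denom
    x = crpix1 + math.degrees(xi) / cdelt1
    y = crpix2 + math.degrees(eta) / cdelt2
    return x, y


# Reference values copied from the example FIF.tbl in this recipe.
FIF = dict(crval1=150.092822, crval2=2.182686,
           crpix1=-3569.926508, crpix2=-3450.384412,
           cdelt1=-0.000166667, cdelt2=0.000166667)

# The tangent point itself, then a point one pixel step to the north.
print(sky_to_plane(150.092822, 2.182686, **FIF))
print(sky_to_plane(150.092822, 2.182686 + 0.000166667, **FIF))
```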

A required input file is called the band-pair registration uncertainty file. As mentioned above, Bandmerge will try to check for offsets between bands iteratively. It needs to start with a file, e.g. bpru.tbl, with some nominal initial guesses for the offset uncertainties. It should be in cdf/.

In the namelist file, you set how many iterations of this (bandmerge) - (get offsets) - (bandmerge) step you want to run. For each iteration, a separate namelist, bmg1.nl, bmg2.nl, etc., will control Bandmerge parameters. These namelists are in cdf/.

8.3 Setting up Bandmerge software and the example dataset

Within the downloaded Bandmerge package are both the software and a sample dataset.

8.3.1 Set up the environment for Bandmerge

Before starting Bandmerge, we need to set some environment variables. In the bandmerge directory, edit the first line of bandmerge.csh (the setenv MOPEX_INSTALLATION line) so that it points to your bandmerge directory.

setenv MOPEX_INSTALLATION /yourpath/bandmerge
setenv SIRTF_JAVA ${MOPEX_INSTALLATION}/java
setenv SIRTF_BIN ${MOPEX_INSTALLATION}/bins
setenv WRAPPER_UTILS ${MOPEX_INSTALLATION}/bins
setenv WRAPDIR ${MOPEX_INSTALLATION}/bins
setenv SIRTF_ANC ${MOPEX_INSTALLATION}/source
set path = ( ${MOPEX_INSTALLATION}/bins $path )
if ($?LD_LIBRARY_PATH) then
setenv LD_LIBRARY_PATH ${MOPEX_INSTALLATION}/libs:${LD_LIBRARY_PATH}
else
setenv LD_LIBRARY_PATH ${MOPEX_INSTALLATION}/libs
endif
if ($?PERL_PATH) then
else
setenv PERL_PATH `which perl | sed "s/\(.*\)perl/\1/" `
endif

setenv SIRTF_QA cdf
setenv SIRTF_CDF cdf
setenv SIRTF_CAL cal

Then, in a C shell (csh or tcsh), type: source bandmerge.csh

NOTE: The bandmerge.csh file "borrows" some of the same environment variables used by MOPEX. If you also use MOPEX, it's best to start a new shell to run Bandmerge and exit it when done.
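
After sourcing the script, it can be useful to confirm that the variables are actually set in the shell from which Bandmerge will be run. The following Python sketch is one way to do that. The variable names are taken from bandmerge.csh above, but the check itself, including the names REQUIRED and missing_vars(), is a hypothetical convenience and not part of the Bandmerge package.

```python
import os

# Variables set by bandmerge.csh, as listed in this recipe.
REQUIRED = [
    "MOPEX_INSTALLATION", "SIRTF_JAVA", "SIRTF_BIN", "WRAPPER_UTILS",
    "WRAPDIR", "SIRTF_ANC", "LD_LIBRARY_PATH", "PERL_PATH",
    "SIRTF_QA", "SIRTF_CDF", "SIRTF_CAL",
]


def missing_vars(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_vars()
print("Missing variables:", ", ".join(missing) if missing else "none")
```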

One can run the Perl script from the bandmerge directory, but it is often clearer to move to a separate "working" directory. For our example this is working_demo/.

For the demo: cd ./working_demo.

Note the sub-directories data/ and cdf/. The data can be anywhere, provided a path is given. The namelists, bandmerge_demo.nl and bmg1.nl and bmg2.nl, are in cdf/. The initial band-pair registration uncertainty file, bpru.tbl, is there as well. Log files for the bandmerge and getoff modules will also go into cdf/. These log files are very useful for identifying the sources of failures. The final merged source list, newtbl.tbl, will be in the user-specified Output directory.

8.3.2 Dataset for this example

In data/ are sections of images in IRAC channels 1, 3, and MIPS 24 microns from the S-COSMOS Legacy Science Program (PI: D. Sanders). The dataset includes images, coverage and uncertainty maps, with names like irac1.fits, irac1_cov.fits, and irac1_unc.fits. We will choose irac1.fits to be the reference image for the fiducial plane. There are also APEX extraction tables, e.g. irac1_extract.tbl.

8.4 How to use Bandmerge

8.4.1 The input source files

Bandmerge required input file format:

Bandmerge can only take input data in IPAC table format with the first 8 required columns shown below. Additional columns are allowed. The Perl script does a check of the headers. Specifically, it will count the actual number of sources in the input file, and internally add or update the keyword Total_PS_Number if it is not consistent. Similarly, keywords CDELT1,2 and NAXIS1,2 are taken from the reference FITS image header or the FIF table. The Bandmerge namelist also has a block of parameters which maps out the instruments and channel numbers, and keywords INSTRUME and CHNLNUM are added internally as needed.

Aperture fluxes are usually part of an APEX run and one generally wants them in the final bandmerged table. So Bandmerge will carry along the number of apertures given by N_Apertures. Both the flux value for aperture n "aperture(n)" and the bad area in pixels "bad_pix(n)" will be carried along.

An optional input column is flux signal-to-noise ratio, SNR, which Bandmerge can use to bin sources into different groups while computing statistics.

APEX tips for Bandmerging (not required for demo):

The most direct way to generate input files with the appropriate format is to use the SSC's photometry extraction software APEX. For descriptions on how to use APEX, refer to the MOPEX documentation (http://irsa.ipac.caltech.edu/data/SPITZER/docs/dataanalysistools/tools/mopex/). As described above, Bandmerge requires certain columns of data. With APEX, make sure the "select" module is allowing these columns through to the final output extract.tbl. (If using the GUI on Mac computers, use Apple-click to toggle selected columns; for Unix systems, use Control-left mouse.)

8.4.2 Prepare to run Bandmerge

Defining the fiducial image plane

In this test case, we do not have an FIF.tbl file, so we will have Bandmerge generate one using the IRAC channel 1 image as the master reference. To do this, comment out the FIF_FILE_NAME and specify in the namelist:

REFERENCE_FITS_FILENAME = data/irac1.fits

#FIF_FILE_NAME = myfif.tbl

This will produce an output FIF.tbl file in the working directory like the following:

\char comment=fif.tbl created by make_fif in bandmerge script
\real CRVAL1=150.092822
\real CRVAL2=2.1826860
\real CRPIX1=-3569.926508
\real CRPIX2=-3450.384412
\int NAXIS1=501
\int NAXIS2=501
\real CDELT1=-0.000166667
\real CDELT2=0.000166667
\real CROTA2=0
\char CTYPE1=RA---TAN
\char CTYPE2=DEC--TAN
\char PROJTYPE=TAN
\real EXTENT_X=0.083500167
\real EXTENT_Y=0.083500167
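
The header keywords of such a table are easy to read programmatically. The Python sketch below parses the keyword lines and checks that, for this example, EXTENT_X and EXTENT_Y equal NAXIS times the absolute value of CDELT. That relation is inferred from the numbers shown here, not taken from the Bandmerge documentation, and parse_ipac_keywords() is a hypothetical helper.

```python
def parse_ipac_keywords(text):
    """Collect '\\type KEY=value' header lines of an IPAC table into a dict."""
    converters = {"int": int, "real": float}
    keywords = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\") or "=" not in line:
            continue
        head, value = line[1:].split("=", 1)
        parts = head.split()
        if len(parts) != 2:          # e.g. a free-form "\ comment" line
            continue
        type_name, key = parts
        convert = converters.get(type_name, str)
        keywords[key] = convert(value.strip())
    return keywords


# A subset of the FIF.tbl keywords shown in this recipe.
FIF_TEXT = r"""
\real CDELT1=-0.000166667
\real CDELT2=0.000166667
\int NAXIS1=501
\int NAXIS2=501
\char PROJTYPE=TAN
\real EXTENT_X=0.083500167
\real EXTENT_Y=0.083500167
"""

fif = parse_ipac_keywords(FIF_TEXT)
for axis, n, step in (("X", "NAXIS1", "CDELT1"), ("Y", "NAXIS2", "CDELT2")):
    computed = fif[n] * abs(fif[step])
    print(f"EXTENT_{axis}: table {fif['EXTENT_' + axis]:.9f}, computed {computed:.9f}")
```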

Band-pair registration uncertainty file:

Bandmerge requires an initial band-pair registration uncertainty file, bpru.tbl, in cdf/. Columns 1 and 2 specify the pair of bands; columns 3, 4, and 5 give the positional shift uncertainties between the two bands in X, in Y, and in a cross term (XY). A default file with values typical for the Spitzer bands is provided as part of the Bandmerge package, and it should work for most users. The file can contain initial values for every pair of the 7 bands, even if you are using fewer bands.

\char comment = Spitzer band-pair registration uncertainties
\ Generated by getoff vsn 1.91 A60816 on Thu Jan 11 12:16:05 2007
|seed_index  |cand_index  |X_sigma  |Y_sigma  |XY_csd   |
|int         |int         |real     |real     |real     |
1           2           0.15     0.15     0.0
1           3           0.15     0.15     0.0
1           4           0.15     0.15     0.0
1           5           0.15     0.15     0.0
1           6           2.00     2.00     0.0
1           7           2.00     2.00     0.0
2           3           0.15     0.15     0.0
2           4           0.15     0.15     0.0
2           5           0.15     0.15     0.0
2           6           2.00     2.00     0.0
2           7           2.00     2.00     0.0
3           4           0.15     0.15     0.0
3           5           0.15     0.15     0.0
3           6           2.00     2.00     0.0
3           7           2.00     2.00     0.0
4           5           0.15     0.15     0.0
4           6           2.00     2.00     0.0
4           7           2.00     2.00     0.0
5           6           2.00     2.00     0.0
5           7           2.00     2.00     0.0
6           7           2.00     2.00     0.0

The uncertainty values will be recalculated by Bandmerge on each iteration and new bpru.tbl files written.
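
To inspect a bpru.tbl file, or to compare the files written on successive iterations, the table can be read into a dictionary keyed by band pair. The Python sketch below uses a three-row excerpt of the default file. parse_bpru() is a hypothetical helper, and it assumes whitespace-separated data rows as in the listing above.

```python
from itertools import combinations


def parse_bpru(text):
    """Map (seed_index, cand_index) -> (X_sigma, Y_sigma, XY_csd)."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, "\" header lines and "|" column-definition lines.
        if len(fields) != 5 or line.lstrip()[0] in "\\|":
            continue
        seed, cand = int(fields[0]), int(fields[1])
        table[(seed, cand)] = tuple(float(v) for v in fields[2:])
    return table


SAMPLE = r"""
\char comment = Spitzer band-pair registration uncertainties
|seed_index  |cand_index  |X_sigma  |Y_sigma  |XY_csd   |
|int         |int         |real     |real     |real     |
1           2           0.15     0.15     0.0
1           6           2.00     2.00     0.0
5           7           2.00     2.00     0.0
"""

bpru = parse_bpru(SAMPLE)
all_pairs = list(combinations(range(1, 8), 2))   # every pair of 7 bands
print(len(all_pairs), "pairs for 7 bands;", len(bpru), "in this sample")
print("pair (1, 6):", bpru[(1, 6)])
```

The full default file lists all 21 pairs of the 7 Spitzer bands, which is why it can be reused unchanged when fewer bands are merged.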

8.4.3 Set Bandmerge Parameters:

Bandmerge is controlled by a set of namelists, which are stored in the cdf/ directory. The first, e.g. bandmerge_demo.nl, controls the run. If you specify in bandmerge_demo.nl that you want to run the bandmerge - getoffset - bandmerge step iteratively 2 times, you will need another two namelists (with fixed names), bmg1.nl and bmg2.nl, also in cdf/.
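
Because the per-iteration namelists have fixed names, it is straightforward to check that cdf/ holds everything a run will need. The Python sketch below lists the expected files for a given number of iterations and reports any that are missing. Both function names are hypothetical, and the check is a convenience that Bandmerge itself does not provide.

```python
from pathlib import Path


def iteration_namelists(number_of_iterations):
    """Fixed namelist names needed for the given Number_of_iterations."""
    return [f"bmg{i}.nl" for i in range(1, number_of_iterations + 1)]


def missing_cdf_files(workdir, control_namelist, number_of_iterations,
                      bpru_name="bpru.tbl"):
    """Return the expected cdf/ files that do not exist under workdir."""
    cdf = Path(workdir) / "cdf"
    expected = [control_namelist, bpru_name]
    expected += iteration_namelists(number_of_iterations)
    return [name for name in expected if not (cdf / name).is_file()]


print(iteration_namelists(2))
print("missing:", missing_cdf_files(".", "bandmerge_demo.nl", 2))
```

Run from working_demo/, an empty "missing" list suggests the control namelist, the initial bpru.tbl, and the iteration namelists are all in place.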

Most of the namelist parameters should give acceptable results as is. In the listings below, arrows (----->) mark comments on some important parameters. For other data, edit the PointSourceList_Filename entries, REFERENCE_FITS_FILENAME, and the Instrument_Channel IDs.

8.4.4 bandmerge.nl

This namelist is required to run bandmerge.pl. It looks like the following:

convert_sky_to_cartesian = 1   -----> this switch converts (RA DEC) into (X Y) positions
run_bandmerge = 1             -----> this turns on/off the bandmerge module.
convert_cartesian_to_sky = 1   -----> this switch converts (X Y) to (RA DEC) in the final merged output.

#input files, PointSourceList_Filename has to be sequential
#
PointSourceList_Filename1 = data/irac1_extract.tbl ---> you can use the absolute path if the data are not in the current path
PointSourceList_Filename2 = data/irac3_extract.tbl
PointSourceList_Filename3 = data/irac4_extract.tbl

# input to define the fiducial plane
# bandmerge perl script will look for either one of these two input parameters,
# if neither parameter is found, bandmerge will stop.
REFERENCE_FITS_FILENAME = data/irac1.fits              ---> you can use the absolute path if the data are not in the current path
#FIF_FILE_NAME = fif.tbl                 ---> this has been commented out since we are using an image as reference

# map out the channels with Spitzer ID's
Instrument_Channel1 = IRAC_1      -----> IRAC_1 = 3.6, IRAC_2 = 4.5, IRAC_3 = 5.8, IRAC_4 = 8.0 (microns)
Instrument_Channel2 = IRAC_3
Instrument_Channel3 = IRAC_4      -----> MIPS_1 = 24, MIPS_2 = 70, MIPS_3 = 160 (microns)

Input_BPRU_Filename = bpru.tbl    -----> this is the initial band pair registration uncertainty file.
                                  -----> It can contain more bands than actually used.

#needed for iterations. For no iteration, set Number_of_iterations = 0
Number_of_iterations = 2          ----> number of iteration of bandmerge-getoffset-bandmerge. Cannot be zero.
#correction type used in iterations
clean_type = 3          ----> value of 1 means that the derived offsets are to be applied to
                                  ----> the source positions, value of 2 allows updates in (X,Y)
                                  ----> positional variances, value of 3 allows updates in both (X,Y)
                                  ----> positions and uncertainties.

#default is 's2c'
s2c_prefix = 'cnv'                  ----> the prefix to the files after (RA,Dec) have been converted to (X,Y).
OUTPUT_DIR = output_bandmerge_demo ----> give absolute path if not in the current path
#file of merged sources
OUTPUT_FILE_NAME = 'newtbl.tbl'   ----> The final output file is called newtbl.tbl; the intermediate output files are called iter1_newtbl.tbl and iter2_newtbl.tbl. This file will be in the output directory.

# the following parameter block is for bandmerge module.
# Most parameters can be set as the default values as shown here
# Chi_Sq_Max can be tuned to perform more stringent matches.
#
&BANDMERGEIN
Comment = 'Generic namelist file for bandmerge - default values.',
Output_STAT_Filename = 'statfile.tbl',
Output_DUMP_Filename_Base = 'bm_dump',
Output_SPCMB_Filename = 'spcmbfile.bm',
Project = 1,
ChiSq_Denom_Min = 1.0e-06,
Chi_Sq_Max = 24.0,                 -----> Maximum allowed chi-square value used to specify the match.
Pseudo_Chi_Sq_Max = 24.0,
LR_Scale_Fact = 2.5,
Epsilon = -1.0e+39,
SigMin = 0.03,
Pseudo_Pos_SF = 1.0,
Pseudo_Phot_SF = 0.0,
X_Window = 20.0,
Y_Window = 20.0,
Status_Check_Flag = 0,
Status_Threshold = 0,
Dump = 0,
Trace_PS1 = 0,
Trace_PS2 = 0,
Trace_PS3 = 0,
Trace_PS4 = 0,
Trace_PS5 = 0,
Trace_PS6 = 0,
Trace_PS7 = 0,
SNR_Threshold1 = 10.0,
SNR_Threshold2 = 10.0,
SNR_Threshold3 = 10.0,
SNR_Threshold4 = 10.0,
SNR_Threshold5 = 10.0,
SNR_Threshold6 = 10.0,
SNR_Threshold7 = 10.0,
#Input_Table_Column1 = 'SNR',       -----> To bin by SNR, put SNR column in input files and uncomment
SNR_Bin1 = 20.0,
SNR_Bin2 = 50.0,
SNR_Bin3 = 100.0,
SNR_Bin4 = 200.0,
SNR_Bin5 = 500.0,
SNR_Bin6 = 1000.0,
SNR_Bin7 = 2000.0,
SNR_Bin8 = 5000.0,
SNR_Bin9 = 10000.0,
NL_print = 0,
&END

8.4.5 bmg1.nl and bmg2.nl --- two additional namelists for subsequent iterations 1 and 2.

During iterations to determine band-pair systematic offsets, the bandmerge script is controlled by namelists with fixed names: bmg1.nl, bmg2.nl, and so on. In our example the only difference between these files is that the Chi_Sq_Max parameter is smaller for the first pass of bandmerge than for the second pass. The reason is that, to get reliable estimates of the systematic offsets between pairs of bands, it is better to use only very good matches, and thus a smaller chi-square value. After any offsets between the band pairs have been computed and applied, one can allow larger chi-square values to get more matches. The two example files are shown below:

8.4.6 bmg1.nl

&BANDMERGEIN
Project               = 1,
ChiSq_Denom_Min       = 1.0e-06,
Chi_Sq_Max            = 12.0,
Pseudo_Chi_Sq_Max     = 12.0,
Epsilon               = -1.0e+39,
SigMin                = 0.03,
Pseudo_Pos_SF         = 1.0,
Pseudo_Phot_SF        = 0.0,
LR_Scale_Fact         = 2.5,
X_Window              = 20.0,
Y_Window              = 20.0,
Status_Check_Flag     = 0,
Status_Threshold      = 0,
SNR_Threshold1        = 10.0,
SNR_Threshold2        = 10.0,
SNR_Threshold3        = 10.0,
SNR_Threshold4        = 10.0,
SNR_Threshold5        = 10.0,
SNR_Threshold6        = 10.0,
SNR_Threshold7        = 10.0,
SNR_Bin1              = 20.0,
SNR_Bin2              = 50.0,
SNR_Bin3              = 100.0,
SNR_Bin4              = 200.0,
SNR_Bin5              = 500.0,
SNR_Bin6              = 1000.0,
SNR_Bin7              = 2000.0,
SNR_Bin8              = 5000.0,
SNR_Bin9              = 10000.0,
Warning_Max           = 10,
NL_print              = 1,
&END

8.4.7 bmg2.nl

&BANDMERGEIN
Project               = 1,
ChiSq_Denom_Min       = 1.0e-06,
Chi_Sq_Max            = 24.0,
Pseudo_Chi_Sq_Max     = 24.0,
Epsilon               = -1.0e+39,
SigMin                = 0.03,
Pseudo_Pos_SF         = 1.0,
Pseudo_Phot_SF        = 0.0,
LR_Scale_Fact         = 2.5,
X_Window              = 20.0,
Y_Window              = 20.0,
Status_Check_Flag     = 0,
Status_Threshold      = 0,
SNR_Threshold1        = 10.0,
SNR_Threshold2        = 10.0,
SNR_Threshold3        = 10.0,
SNR_Threshold4        = 10.0,
SNR_Threshold5        = 10.0,
SNR_Threshold6        = 10.0,
SNR_Threshold7        = 10.0,
SNR_Bin1              = 20.0,
SNR_Bin2              = 50.0,
SNR_Bin3              = 100.0,
SNR_Bin4              = 200.0,
SNR_Bin5              = 500.0,
SNR_Bin6              = 1000.0,
SNR_Bin7              = 2000.0,
SNR_Bin8              = 5000.0,
SNR_Bin9              = 10000.0,
Warning_Max           = 10,
NL_print              = 1,
&END
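
With long namelists it is easy to lose track of what differs between passes. The Python sketch below parses the &BANDMERGEIN block of two namelists and reports the parameters whose values differ. The listings are abbreviated copies of bmg1.nl and bmg2.nl. The parser is a deliberately simple, hypothetical helper and not a general Fortran namelist reader; in particular it does not strip the arrow annotations used elsewhere in this recipe.

```python
def parse_namelist_block(text, block="BANDMERGEIN"):
    """Return {name: value-string} for 'name = value,' lines in &block ... &END."""
    values = {}
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.upper() == "&" + block:
            inside = True
        elif line.upper() == "&END":
            inside = False
        elif inside and "=" in line and not line.startswith("#"):
            name, value = line.split("=", 1)
            values[name.strip()] = value.strip().rstrip(",").strip()
    return values


def namelist_diff(a, b):
    """Parameters whose values differ between two parsed namelists."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}


# Abbreviated copies of the two listings above.
BMG1 = """&BANDMERGEIN
Chi_Sq_Max            = 12.0,
Pseudo_Chi_Sq_Max     = 12.0,
SigMin                = 0.03,
&END"""
BMG2 = """&BANDMERGEIN
Chi_Sq_Max            = 24.0,
Pseudo_Chi_Sq_Max     = 24.0,
SigMin                = 0.03,
&END"""

print(namelist_diff(parse_namelist_block(BMG1), parse_namelist_block(BMG2)))
```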

8.5 Output from Bandmerge

In our example, we have irac1_extract.tbl, irac3_extract.tbl, irac4_extract.tbl as the input files to Bandmerge, and they are in working_demo/data/. The namelist for bandmerge is ./cdf/bandmerge_demo.nl. Now we run Bandmerge, by typing (in working_demo/):

prompt% bandmerge.pl -n bandmerge_demo.nl

This will give screen output like:

prompt% bandmerge.pl -n bandmerge_demo.nl
=================================================================
Wrapper-script bandmerge.pl, Version 1.5

path /stage/ssc-pipe/work/set11/postbcd/irk/mopex/tst_bins
input ref fits = irac1sci.fits
naxes = 3246 4103
crvals = 150.092822 2.182686
ctypes   RA---TAN DEC--TAN
cdelts = -0.000166667 0.000166667 crota2 = 0.
proj = TAN extents = 0.541001082 0.683834701
Executables pathname = /stage/ssc-pipe/work/set11/postbcd/irk/mopex/tst_bins/
Ancillary-data pathname = /stage/ssc-pipe/work/set11/postbcd/irk/mopex/source/
-r file name = fif.tbl
-i file name = irac1sci_extract.tbl
-o file name = Output_1/cnv_irac1sci_extract.tbl

/stage/ssc-pipe/work/set11/postbcd/irk/mopex/tst_bins/s2c_trans -r fif.tbl -i irac1sci_extract.tbl -o Output_1/cnv_irac1sci_extract.tbl

Pipeline Module S2C_TRANS Version 1.8
Processing time Tue Jul 8 09:30:17 2008
                   .
                   .
                   .
Pipeline Module C2S_TRANS Version 2.0
Processing time Tue Jul 8 09:31:26 2008

c2s_trans_parse_namelist: Information only: Namelist not specified.
Time = 8.420 seconds
exit code 0

Perl system() return code = 0
c2s_trans terminated normally.

System Exit Code (                     c2s_trans): 0

Wrapper-script bandmerge.pl terminated normally.

Note: On Mac OS X and Linux systems, ignore the messages "m: command not found" near the end. These are just some timing scripts that are not vital.

After executing the Bandmerge Perl script, let us look at the output. All of the log files are stored in the cdf/ directory. Look at these to help diagnose errors. The science outputs are stored in the Output directory which was specified in the namelist, output_bandmerge_demo/. These outputs are:

8.5.1 Intermediate output files:

In the example the intermediate source lists have a prefix given in the namelist, cnv_*, for example, cnv_irac1_extract.tbl.

Bandmerge derives any systematic offsets between pairs of bands and corrects the input positions and positional uncertainties. The corrected files are called, for example, iter1_cnv_irac1_extract.tbl after the first iteration of bandmerge and getoff, and iter2_* after the second iteration.

8.5.2 Merged source lists:

The name of the merged source photometry file is determined by the user's input in the namelist. In this example, we specified newtbl.tbl in the namelist. After the first iteration, the merged source photometry file is called iter1_newtbl.tbl. Only the final merged source list is called newtbl.tbl.

The final merged source list has a format like the following. Note how Bandmerge assigns the Outband number based on Spitzer wavelength order (IRAC 1-4, MIPS 5-7):

[Image: sample of the final merged source list. For a larger version of this image, please see http://irsa.ipac.caltech.edu/data/SPITZER/docs/dataanalysistools/cookbook/files/Recipe5.gif]

How to check merged entries in the final merged list:

The 7th column, CnfFlag, indicates the source matching results among the three bands. The flag has a width of Nbands*(Nbands-1) digits. In the above example, Source 1 has CnfFlag=000000. The first 2 digits (00) mean that the shortest-wavelength band, IRAC-3.6um (Outband 1), had no matches in IRAC-5.8um (Outband 3) or MIPS 24um (Outband 5). This is why srcid3 and srcid5 are zero. For source 274, CnfFlag=101000: the first two digits (10) mean that IRAC-3.6um got a match in IRAC-5.8um but not in MIPS 24um; the second two digits (10) mean that IRAC-5.8um got a match in IRAC-3.6um but not in MIPS 24um; and the third two digits (00) mean no MIPS 24um detection.

Extrapolating the above definition to 4 bands, if a source has CnfFlag=111111111111, the first set of digits (111) means that the shortest-wavelength channel has a unique match in each of the other 3 bands, the second set (111) means that the second shortest-wavelength channel has a match in the first band and in the two longest bands, and so on.
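
The layout of the flag can be made concrete with a short decoder. The Python sketch below follows the description above: each band owns Nbands-1 consecutive digits, one for each of the other bands in wavelength order. It treats only the digit 1 as a match, since 0 and 1 are the only values this recipe describes; decode_cnf_flag() is a hypothetical helper, not a Bandmerge routine.

```python
def decode_cnf_flag(flag, bands):
    """Decode a CnfFlag string into {band: {other_band: matched}}.

    bands lists the Outband numbers in wavelength order, e.g. [1, 3, 5].
    """
    n = len(bands)
    width = n * (n - 1)
    flag = str(flag).zfill(width)       # restore leading zeros if lost
    if len(flag) != width:
        raise ValueError(f"expected {width} digits, got {len(flag)}")
    result = {}
    for i, band in enumerate(bands):
        digits = flag[i * (n - 1):(i + 1) * (n - 1)]
        others = [b for b in bands if b != band]
        result[band] = {o: d == "1" for o, d in zip(others, digits)}
    return result


# The two flags discussed in the text, for Outbands 1, 3 and 5.
for flag in ("000000", "101000"):
    print(flag, decode_cnf_flag(flag, [1, 3, 5]))
```

The zero-padding step matters if the flag column has been read as an integer, because leading zeros are then lost.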

To select sources with a given match, the user can either parse the CnfFlag (called CF if only 2 bands), or read in fluxes and drop the ones that are -9.9e+09. Close multiples are difficult for Bandmerge. The user may need to check sources for nearby companions and examine the matches, perhaps taking fluxes into account.
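
The flux-based selection can be sketched in a few lines of Python. The placeholder value -9.9e+09 is the one quoted above. The rows and the column names srcid, flux1, and flux3 are hypothetical stand-ins, since the actual column names depend on the merged table.

```python
import math

NO_FLUX = -9.9e+09   # placeholder value quoted in the text


def detected(flux):
    """True if the flux is a real measurement rather than the placeholder."""
    return not math.isclose(flux, NO_FLUX, rel_tol=1e-6)


# Hypothetical rows; real column names depend on the merged table.
rows = [
    {"srcid": 1, "flux1": 12.3, "flux3": NO_FLUX},
    {"srcid": 274, "flux1": 45.6, "flux3": 30.1},
]
both = [r["srcid"] for r in rows if detected(r["flux1"]) and detected(r["flux3"])]
print("sources with both fluxes:", both)
```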

8.5.3 Diagnostic files:

Bandmerge also outputs two statistical files in the Output dir. The first one is called spcmbfile.bm (the name can be adjusted in the namelist). It gives the number of matched sources in each band combination:

Sp.Comb. Count
======== =====
    001   50471
    010    3315
    011   11186
    100     156
    101     833
    110     359
    111    2299
===============
Total:    68619

The digits in the first column correspond to bands in descending order, with 0 for no match and 1 for a match. Here, a total of 2299 sources were matched in all three bands, and 50471 sources had only IRAC channel 1 detections.
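
The combination table is simple to read back and check. The Python sketch below parses the listing above, confirms that the counts add up to the reported total, and translates a combination string into input-list numbers, taking the rightmost digit as the first input list (consistent with 001 meaning IRAC channel 1 only). Both helper names are hypothetical.

```python
def parse_spcmb(text):
    """Return ({combination: count}, reported_total) from spcmbfile text."""
    counts, total = {}, None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        if fields[0] == "Total:":
            total = int(fields[1])
        elif set(fields[0]) <= {"0", "1"}:
            counts[fields[0]] = int(fields[1])
    return counts, total


def bands_in(combination):
    """Input-list numbers (1 = first list) flagged in a combination string."""
    return [i for i, d in enumerate(reversed(combination), start=1) if d == "1"]


SPCMB = """Sp.Comb. Count
======== =====
    001   50471
    010    3315
    011   11186
    100     156
    101     833
    110     359
    111    2299
===============
Total:    68619
"""

counts, total = parse_spcmb(SPCMB)
print("sum of counts:", sum(counts.values()), "reported total:", total)
for combination in ("001", "011", "111"):
    print(combination, "-> input lists", bands_in(combination))
```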

The second diagnostic file is statfile.tbl, whose name can also be adjusted in the namelist. This file gives some statistics about each two-band matched set in the x,y plane. The second column indicates that no binning by SNR was done - these are statistics for the Total set.