IRSA VO Table Access Protocol (TAP) Instructions

IRSA offers program-friendly interfaces to all of its catalogs. An Application Program Interface (API) allows access to IRSA catalogs (within a script or on the command line) without the need to go through a browser. IRSA supports three kinds of APIs for catalogs (more on IRSA's Catalog Search APIs).

This page describes IRSA's implementation of the VO Table Access Protocol, or TAP (TAP protocol document). TAP allows a rich variety of searches, including cone, box, polygon, or all-sky. You can upload a table with multiple positions. The output can be a VOTable, an IPAC table, a FITS table, or several other formats. TAP also lets you select output columns and apply functions to the results.

Using IRSA TAP services from Python

Using IRSA TAP services from the command line

Here is an example of how to do a command-line cone search of the 2MASS Point Source Catalog ("fp_psc") with the commonly available curl and wget commands. The units of RA, Dec, and search radius are decimal degrees. The result in "out.xml" is a VOTable, the default format.

Curl is preferred as it handles table upload more easily. Note the "+" signs which indicate spaces within a URL. These will be omitted in the descriptions below for clarity. They are not needed with the "-F" option shown below.

How do I get IRSA's Catalog Names and Column Names?

Ready to go and just need the IRSA catalog and/or column names? See the sections labeled "table" and "columns" below.

Constructing a TAP Query

The format for the TAP query depends upon whether you make a synchronous or asynchronous query. A synchronous query runs until it completes and streams the results back -- use this for small-area searches. An asynchronous query will run in the background and give you a place to check whether your job is done, rather than streaming results back directly. For both cases, the main query parameters are QUERY, FORMAT (optional), and UPLOAD (for table upload), described below.

For synchronous calls (small-area searches):

For asynchronous calls (large searches):

You will be given a web address that you can use to check on your job. For example, if the address is:

Then you can check the progress of your job by checking the "phase" associated with this job:

This will return either QUEUED, EXECUTING, COMPLETED, ERROR, or ABORTED. When the job is completed, you get your results like this:


QUERY:

The QUERY clause follows this pattern:

QUERY=SELECT {columns} FROM {table} WHERE {geometric constraint} AND ({sql constraints}) {group by} {having} {order by}

In particular,

  • the query contains at most one geometric constraint.
  • the operator between geometric constraint and sql constraints, if both exist, is 'AND'.
  • multiple sql constraints are grouped together and enclosed within parentheses.

    keyword description
    columns

    Comma-separated list of column names, or functions of columns to be returned.

    Supported functions are those that are both defined in ADQL 2.0 and implemented by database software providers Oracle (IRSA catalogs) or PostgreSQL (other IRSA tables).

    Examples (just for illustration -- don't run without constraints):

    1. Output all columns.

    • SELECT *

    2. Output selected columns.

    • SELECT ra,dec,j_m

    3. Output the date of observation from the main 2MASS catalog, but round to the nearest day.

    • SELECT round(jdate, 0) FROM fp_psc

    4. Rename the output column returned by "round(jdate,0)" to rounded_jdate.

    • SELECT round(jdate, 0) as rounded_jdate FROM fp_psc

    5. Prefix the ra with the table name. This is advisable when performing operations on multiple tables.

    • SELECT fp_psc.ra FROM fp_psc

    6. Limit the number of rows in the result table. This can be done either with a "SELECT TOP" command within the "QUERY" clause or by means of the MAXREC parameter. If both "TOP" and "MAXREC" are present, "TOP" is ignored.

    • QUERY=SELECT+TOP+5+ra+FROM+fp_psc
    • QUERY=SELECT+ra+FROM+fp_psc&MAXREC=5

    Note on Column Names: To get the available column names, say for the catalog "fp_psc" (retrieving the catalog table name is described in the following section):

    • curl -o out.xml "https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+TAP_SCHEMA.columns+WHERE+table_name='fp_psc'"

    table

    Specifies the catalog table to search.

    Example (just for illustration):

    Select all columns from the COSMOS Photometry catalog.

    • SELECT * FROM cosmos_phot

    Note on Catalog Table Names: To obtain the string needed for the "table" parameter, you can download a VOTable of IRSA's available catalogs. The string needed is in the column labeled "table_name". There is also a useful column called "description".

    • curl -o out.xml "https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+TAP_SCHEMA.tables"

    Some Popular Catalogs

    table_name          Description
    allwise_p3as_psd    AllWISE Source Catalog
    fp_psc              2MASS Point Source Catalog
    glimpse_s07         GLIMPSE I Spring 07 Catalog (Spitzer)
    cosmos_phot         COSMOS Photometry Catalog
    iraspsc             IRAS Point Source Catalog

    geometric constraint

    The geometric constraint is typically a CONTAINS function (= 1 for true) operating on a POINT and a shape. Note that only a single geometric constraint is supported per query. Please see tip below.

    The POINT's two elements must be (possibly qualified) column names of which the final unqualified parts are "ra" and "dec" respectively. There are no restrictions on column names in the shape clause.

    There are three supported shape functions: CIRCLE, BOX, and POLYGON. All coordinates and angular sizes are in decimal degrees. Shape arguments may be constants or names of columns. Binary arithmetic expressions, i.e. "A op B" where "op" is one of '+', '-', '*', and '/', may be supplied in many places where constants are expected. Depending on how the call is made, a '+' might have to be encoded as "%2B".
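The encoding point above can be illustrated with a minimal Python sketch that uses only the standard library. The helper name `encode_query` is hypothetical, and the choice of characters left unescaped is an assumption made to keep the URL readable; it is not a requirement of the service.

```python
from urllib.parse import quote_plus


def encode_query(adql: str) -> str:
    """Encode an ADQL string for use as the QUERY value in a URL.

    quote_plus turns each space into '+' and each literal '+' into '%2B',
    so arithmetic such as 'ra+0.5' is not read back as 'ra 0.5'.
    Parentheses, commas, '*', '=' and quotes are left as-is for readability
    (an illustrative choice).
    """
    return quote_plus(adql, safe="(),*='")


# A literal '+' inside an arithmetic expression becomes %2B.
print(encode_query("SELECT ra+0.5 FROM fp_psc"))
```

The printed value can be appended after `QUERY=` in a sync or async URL.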

    As of ADQL version 2.1, specifying a coordinate system in geometry constructors is deprecated. IRSA catalogs use equatorial coordinates with ra and dec usually consistent with the J2000 system. If another coordinate system such as ICRS was used in the catalog creation, the search will assume the search parameters are ICRS.

    Circle

    This requires the coordinates of the center and the radius. A 1-degree cone search around M101 would be:

    • SELECT * FROM fp_psc WHERE CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1

    Box

    This requires the coordinates of the center of the box, the width, and the height. The box will be aligned with the coordinate system. A 1 degree by 1 degree box around M101 would be:

    • SELECT * FROM fp_psc WHERE CONTAINS(POINT(ra,dec),BOX(210.80225,54.34894,1.0,1.0))=1

    Polygon

    This requires the concatenated coordinates of the vertices of the polygon. A triangle search around M101 would be:

    • SELECT * FROM fp_psc WHERE CONTAINS(POINT(ra,dec),POLYGON(209.80225,53.34894,209.80225,55.34894,211.80225,54.34894))=1

    The region searched is the convex hull of the supplied coordinates, so the order of coordinates does not matter. This is different from the ADQL specification, which allows arbitrary polygons.
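As a rough illustration of the three shapes, the following Python sketch builds the geometric constraint strings shown above. The function names (`circle`, `box`, `polygon`, `contains`) are hypothetical helpers, not part of any IRSA library; they only format text and do not contact the service.

```python
def circle(ra, dec, radius):
    """CIRCLE(center_ra, center_dec, radius), all in decimal degrees."""
    return f"CIRCLE({ra},{dec},{radius})"


def box(ra, dec, width, height):
    """BOX(center_ra, center_dec, width, height), all in decimal degrees."""
    return f"BOX({ra},{dec},{width},{height})"


def polygon(vertices):
    """POLYGON with the (ra, dec) vertex pairs concatenated in order."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    coords = ",".join(f"{ra},{dec}" for ra, dec in vertices)
    return f"POLYGON({coords})"


def contains(shape, ra_col="ra", dec_col="dec"):
    """Wrap a shape in the CONTAINS(POINT(...), shape)=1 constraint."""
    return f"CONTAINS(POINT({ra_col},{dec_col}),{shape})=1"


# The triangle around M101 from the example above.
triangle = polygon([(209.80225, 53.34894), (209.80225, 55.34894), (211.80225, 54.34894)])
print("SELECT * FROM fp_psc WHERE " + contains(triangle))
```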

    Note: Some functions (e.g. count()) do not work when there is a {geometric constraint}.

    Tip:

    As mentioned above, no more than one geometric constraint is supported per query. Therefore, queries like the one below, in which the user tries to query two locations on the sky at once, will not succeed. Users wishing to query more than one location on the sky must submit multiple queries. For example, this query will fail...

    • SELECT * FROM fp_psc WHERE (CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1 OR CONTAINS(POINT(ra,dec),CIRCLE( 110.52208,44.49843,1.0))=1)

    ...but the combined results of these two queries will provide the desired data:

    • SELECT * FROM fp_psc WHERE CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1
    • SELECT * FROM fp_psc WHERE CONTAINS(POINT(ra,dec),CIRCLE(110.52208,44.49843,1.0))=1
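When there are many positions, the one-query-per-position approach can be scripted. This is a minimal Python sketch; `cone_queries` is a hypothetical helper that only generates the ADQL strings, leaving submission and merging of results to the caller.

```python
def cone_queries(table, positions, radius_deg):
    """Return one cone-search query per (ra, dec) position.

    Each query carries exactly one geometric constraint, in line with
    the one-constraint-per-query restriction described above.
    """
    return [
        f"SELECT * FROM {table} WHERE "
        f"CONTAINS(POINT(ra,dec),CIRCLE({ra},{dec},{radius_deg}))=1"
        for ra, dec in positions
    ]


positions = [(210.80225, 54.34894), (110.52208, 44.49843)]
for query in cone_queries("fp_psc", positions, 1.0):
    print(query)
```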

    sql constraints

    SQL constraints can be any constraints expressible in ADQL 2.0, with the restriction that functions must be supported by the Oracle (most tables) or Informix (some WISE tables) backend.

    Note: If the WHERE clause contains a geometric constraint as well as multiple SQL constraints, the SQL constraints must be grouped together and enclosed within parentheses.

    Examples:

    Query the 2MASS Point Source Catalog, and return all columns for records that fall within a certain time range.

    • SELECT * FROM fp_psc WHERE (jdate>=2451500 and jdate<=2451700)

    Perform a 1-degree cone search around M101, restricting to records that fall within a certain time range.

    • SELECT * FROM fp_psc WHERE CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1 and (jdate>=2451500 and jdate<=2451700)

    order by

    A comma-separated list of column names specifying which to use for sorting the output table rows, in order of priority.

    Example:

    Query the 2MASS Point Source Catalog. Return all columns for records within a specified time range, and order the results by the observation date. If multiple records have identical observation dates, then order by right ascension.

    • SELECT * FROM fp_psc WHERE (jdate>=2451500 and jdate<=2451700) order by jdate, ra

    group by

    Groups the returned records by the specified columns, which allows you to apply aggregate functions to each group. Every selected column must either appear in the "group by" list or be wrapped in an aggregate function.

    Example:

    Query the 2MASS Point Source Catalog. For records within a specified time range, group the records by observation date and count the records in each group.

    • SELECT jdate, count(*) FROM fp_psc WHERE (jdate>=2451500 and jdate<=2451700) group by jdate

    having

    Constrains the groups produced by "group by", typically with a condition on an aggregate function. Unlike WHERE, which filters individual rows before grouping, "having" filters the groups afterwards.

    Example:

    Query the 2MASS Point Source Catalog. For records within a specified time range, group the records by date, then keep only the dates whose brightest source has j_m < 8.

    • SELECT jdate, count(*) FROM fp_psc WHERE (jdate>=2451500 and jdate<=2451700) group by jdate having min(j_m) < 8

    Note: Other ADQL 2.0 functions may also be implemented, but this page is not meant to be a complete description of SQL functions.
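The rules for assembling a QUERY can be sketched in Python as follows. `build_adql` is a hypothetical helper, not an IRSA API. It accepts a single geometric constraint string (so a query can never carry more than one), wraps the SQL constraints in parentheses, joins the two with AND, and emits the trailing clauses in standard SQL order (GROUP BY, HAVING, ORDER BY).

```python
def build_adql(columns, table, geometric=None, sql_constraints=(),
               group_by=(), having=None, order_by=()):
    """Assemble an ADQL query string from its parts."""
    query = f"SELECT {','.join(columns)} FROM {table}"

    where_parts = []
    if geometric:
        # At most one geometric constraint: only a single string is accepted.
        where_parts.append(geometric)
    if sql_constraints:
        # SQL constraints are grouped together inside one pair of parentheses.
        where_parts.append("(" + " AND ".join(sql_constraints) + ")")
    if where_parts:
        query += " WHERE " + " AND ".join(where_parts)

    if group_by:
        query += " GROUP BY " + ",".join(group_by)
    if having:
        query += " HAVING " + having
    if order_by:
        query += " ORDER BY " + ",".join(order_by)
    return query


print(build_adql(
    ["*"], "fp_psc",
    geometric="CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1",
    sql_constraints=["jdate>=2451500", "jdate<=2451700"],
    order_by=["jdate"],
))
```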


    MAXREC:

    Limit the number of rows in the result table. This can be done either with a "SELECT TOP" command within the "QUERY" clause (please see columns) or by means of the "MAXREC" parameter. If both "TOP" and "MAXREC" are present, "TOP" is ignored.

    To retrieve only the highest ranked N rows meeting the search criteria, specify both (1) a ranking or sort order (via SQL's "ORDER BY" keyword), and (2) the limit N (via either SQL's "SELECT TOP" or the "MAXREC" parameter).
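A minimal Python sketch of the "highest ranked N rows" recipe, using only the standard library: the query supplies the sort order and MAXREC supplies the limit. The helper name `top_n_url` is hypothetical, and sorting by ascending j_m (brightest first) is just an illustrative ranking.

```python
from urllib.parse import urlencode

SYNC_URL = "https://irsa.ipac.caltech.edu/TAP/sync"


def top_n_url(adql_with_order_by: str, n: int) -> str:
    """Build a sync URL that returns at most n rows via MAXREC.

    The caller is expected to include an ORDER BY clause in the query,
    so that the n rows returned are the highest ranked ones.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # urlencode percent-encodes the query and turns spaces into '+'.
    return SYNC_URL + "?" + urlencode({"QUERY": adql_with_order_by, "MAXREC": n})


adql = (
    "SELECT ra,dec,j_m FROM fp_psc "
    "WHERE CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1 "
    "ORDER BY j_m"
)
url = top_n_url(adql, 5)
print(url)
```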


    FORMAT:

    The TAP service will by default return a VOTable. Other formats are also possible by setting the FORMAT keyword. The supported output formats are:

    keyword       description
    VOTABLE       VO Table - a type of XML
    CSV           Comma Separated Value table
    TSV           Tab Separated Value table
    IPAC_TABLE    IPAC Table Format
    HTML          HyperText Markup Language
    FITS          Flexible Image Transport System Binary Table
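The FORMAT keyword can be set programmatically along these lines. This Python sketch checks the keyword against the formats listed above and suggests an output file name; the helper name `sync_request` and the file extensions are assumptions chosen for illustration, not something the service prescribes.

```python
from urllib.parse import urlencode

# Supported FORMAT keywords, each paired with an assumed file extension.
FORMAT_EXTENSIONS = {
    "VOTABLE": ".xml",
    "CSV": ".csv",
    "TSV": ".tsv",
    "IPAC_TABLE": ".tbl",
    "HTML": ".html",
    "FITS": ".fits",
}


def sync_request(adql, fmt="VOTABLE", stem="out"):
    """Return (url, suggested_filename) for a sync query in the given format."""
    key = fmt.upper()
    if key not in FORMAT_EXTENSIONS:
        raise ValueError(f"unsupported FORMAT: {fmt}")
    url = "https://irsa.ipac.caltech.edu/TAP/sync?" + urlencode(
        {"QUERY": adql, "FORMAT": key}
    )
    return url, stem + FORMAT_EXTENSIONS[key]


url, filename = sync_request("SELECT TOP 5 ra,dec FROM fp_psc", "fits")
print(url)
print(filename)
```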

    UPLOAD:

    UPLOAD={name},{URI}

    keyword description
    name

    The name of the table within TAP (e.g. my_favorite_quasars). Prefix this name with TAP_UPLOAD to refer to columns in the query (e.g. TAP_UPLOAD.my_favorite_quasars.ra).

    Currently the uploaded table needs to be a VOTable, though a special syntax allows an IPAC table on the local machine. See Example 7 below.

    URI

    Location of table to upload. If it is an http or https URL, then the TAP service will attempt to fetch it over the network. To upload a file along with the query, use the special URL scheme "param". This indicates that the value after the colon will be the name of the inline content. The content type used is multipart/form-data, using a "file" type input element, e.g. the -F option of curl. The "name" attribute in the file input must match that used in the UPLOAD parameter. See Example 7 below.

    Examples

    1. Cone Search

      • https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1

    2. Cone Search returning a FITS table file

      • https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1&format=fits

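    3. Cone Search with only ra, dec, and date rounded to the nearest day

      • https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+ra,dec,round(jdate,0)+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1

    4. Cone Search with a date filter, ordered by date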

      • https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1+and+(jdate>=2451500+and+jdate<=2451700)+order+by+jdate

    5. Box Search

      • https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),BOX(210.80225,54.34894,1.0,1.0))=1

    6. Polygon Search

      • https://irsa.ipac.caltech.edu/TAP/sync?QUERY=SELECT+*+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),POLYGON(209.80225,53.34894,209.80225,55.34894,211.80225,54.34894))=1

    7. Upload local file

      If you have a file "upload.vo", then this command will run a match against objects in the 2MASS catalog. Note that the POINT clause refers to columns from the catalog and the CIRCLE clause to columns from the uploaded table.

      • curl -o fp_psc.xml -F "UPLOAD=my_table,param:table" -F "table=@upload.vo" -F "QUERY=SELECT fp_psc.ra, fp_psc.dec FROM fp_psc WHERE CONTAINS(POINT(ra,dec), CIRCLE(TAP_UPLOAD.my_table.ra, TAP_UPLOAD.my_table.dec, 0.01)) =1" https://irsa.ipac.caltech.edu/TAP/sync

      If you have an IPAC table "upload.tbl" with "ra" and "dec" columns named "my_ra" and "my_dec" respectively, then this command will run a match against objects in the 2MASS catalog, returning an IPAC table.

      • curl -o fp_psc.tbl -F "UPLOAD=my_table,param:table.tbl" -F "table.tbl=@upload.tbl" -F "FORMAT=IPAC_TABLE" -F "QUERY=SELECT fp_psc.ra, fp_psc.dec FROM fp_psc WHERE CONTAINS(POINT(ra,dec), CIRCLE(TAP_UPLOAD.my_table.my_ra, TAP_UPLOAD.my_table.my_dec, 0.01)) =1" https://irsa.ipac.caltech.edu/TAP/sync

      Note that the name of the uploaded table is omitted from the FROM clause in the examples above. Including the uploaded table's name in that clause is also valid, and it is required if you wish to introduce an alias for the upload table.

      • curl -o fp_psc.tbl -F "UPLOAD=my_table,param:table.tbl" -F "table.tbl=@upload.tbl" -F "FORMAT=IPAC_TABLE" -F "QUERY=SELECT cat_table.ra, cat_table.dec FROM TAP_UPLOAD.my_table AS my_table, fp_psc AS cat_table WHERE CONTAINS(POINT(ra,dec), CIRCLE(my_table.my_ra, my_table.my_dec, 0.01)) =1" https://irsa.ipac.caltech.edu/TAP/sync
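For readers who prefer a script to curl, the multipart upload can be approximated in Python with the standard library alone. This is an untested sketch against the live service: `build_upload_request` is a hypothetical helper, and the part layout (plain form fields plus one file part whose name matches the "param:" value) is an assumption modelled on what curl's -F option sends.

```python
import uuid
from urllib.request import Request


def build_upload_request(adql, table_bytes, upload_name="my_table",
                         part_name="table", fmt=None,
                         url="https://irsa.ipac.caltech.edu/TAP/sync"):
    """Build (but do not send) a multipart/form-data POST with an inline table."""
    boundary = uuid.uuid4().hex
    fields = {"UPLOAD": f"{upload_name},param:{part_name}", "QUERY": adql}
    if fmt:
        fields["FORMAT"] = fmt

    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    # The file part's name must match the name given after "param:".
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{part_name}"; '
        f'filename="{part_name}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        table_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    body = b"\r\n".join(lines)
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(url, data=body, headers=headers, method="POST")
```

To send the request, pass the returned object to `urllib.request.urlopen` and write the response bytes to a file. For an IPAC table, a `part_name` such as "table.tbl" would mirror the curl example above.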

    8. Upload remote file

      This will run a simple cone search on WISE and pipe it directly into a cross match with 2MASS.

      • curl -o fp_psc.xml -F "UPLOAD=my_table,https://irsa.ipac.caltech.edu/SCS?table=allwise_p3as_psd&RA=210.80225&DEC=54.34894&SR=0.01" -F "QUERY=SELECT fp_psc.ra, fp_psc.dec FROM fp_psc WHERE CONTAINS(POINT(ra,dec), CIRCLE(TAP_UPLOAD.my_table.ra, TAP_UPLOAD.my_table.dec, 0.01)) =1" https://irsa.ipac.caltech.edu/TAP/sync

    9. Asynchronous Cone Search

      Submit the query.

      • curl -v "https://irsa.ipac.caltech.edu/TAP/async?QUERY=SELECT+*+FROM+fp_psc+WHERE+CONTAINS(POINT(ra,dec),CIRCLE(210.80225,54.34894,1.0))=1"

      The output should include a line like:

      • < Location: https://irsa.ipac.caltech.edu/TAP/async/10

      Check the progress of the job.

      • curl "https://irsa.ipac.caltech.edu/TAP/async/10/phase"

      This query completes fairly quickly, so you should see "COMPLETED". To get the results:

      • curl -o fp_psc.xml "https://irsa.ipac.caltech.edu/TAP/async/10/results/result"
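The check-then-fetch steps of an asynchronous job can be scripted as a simple polling loop. The Python sketch below is a minimal version under stated assumptions: `wait_for_job` and `fetch_text` are hypothetical names, the polling interval and attempt limit are arbitrary, and the job URL is whatever the Location header returned for your own submission.

```python
import time
from urllib.request import urlopen

# Phases after which the job will not change any further. UWS names the
# aborted phase ABORTED; ABORT is accepted too in case it is reported that way.
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORT", "ABORTED"}


def fetch_text(url):
    """Fetch a URL and return its body as stripped text."""
    with urlopen(url) as response:
        return response.read().decode().strip()


def wait_for_job(job_url, fetch=fetch_text, poll_seconds=5.0, max_polls=120):
    """Poll {job_url}/phase until the job finishes; return the results URL."""
    base = job_url.rstrip("/")
    for _ in range(max_polls):
        phase = fetch(base + "/phase")
        if phase in TERMINAL_PHASES:
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"job still running after {max_polls} polls")
    if phase != "COMPLETED":
        raise RuntimeError(f"job ended in phase {phase}")
    return base + "/results/result"
```

A call such as `wait_for_job("https://irsa.ipac.caltech.edu/TAP/async/10")` would return the results URL once the phase reads COMPLETED; the `fetch` argument exists so the loop can be exercised without network access.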