Duplicate Source Resolution
Multiple Source Resolution Validation




I. Multiple Source Resolution

The objective of multiple source resolution processing in 2MASS final product generation is to select one apparition of sources detected more than once because they fall in scan overlap regions. This process was carried out for the full Point and Extended Source Catalog Generation (CatGen) DBs.

The result of multiple source resolution is that every source in the CatGen DBs has values assigned to the use_src and dup_src flags, which are used in determining whether or not a source will be used in the final release catalog. A source is accepted for the final catalog if:

use_src = 1 OR (use_src = 0 AND dup_src = 0)

and other final selection criteria are met for the Catalogs.
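
The flag condition above can be written as a small predicate. This is only an illustrative sketch: the function name is hypothetical, it assumes both flags take only the values 0 and 1, and it does not cover the other final selection criteria, which are not described here.

```python
def passes_duplicate_flags(use_src: int, dup_src: int) -> bool:
    """Return True if the flag pair satisfies the condition quoted above.

    Hypothetical helper; assumes use_src and dup_src are each 0 or 1.
    """
    return use_src == 1 or (use_src == 0 and dup_src == 0)


# Enumerate all four flag combinations under the 0/1 assumption.
flag_table = {
    (use_src, dup_src): passes_duplicate_flags(use_src, dup_src)
    for use_src in (0, 1)
    for dup_src in (0, 1)
}
```

Under that assumption, the only combination that fails the condition is use_src = 0 with dup_src = 1.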

II. Statistics



III. Validation

1. < 2" Pairs

Multiple source resolution should produce a source list that contains no pairs of sources from different scans with separations of < 2". The source proximity calculation for point sources nevertheless reveals 19 unique pairs of sources from different scans with separation < 2". The list of pairs is given below (note that each pair is listed twice).

List of < 2" Separation Source Pairs

Eighteen of the pairs are associated with one scan, 990227s scan 053 (tile 307281), and one is associated with 970724n scan 048 (tile 8333). As noted in night processing QA, both of these scans were offset in RA relatively far from the nominal tile positions: 990227s s053 was offset by -442.4" and 970724n s048 by -159.4". Because of these large offsets, some of the overlaps between these two scans and adjacent scans were omitted from the multiple source resolution processing. As a result, the 19 pairs were not associated and did not have the proper values of use_src and dup_src set.

The proper values of use_src and dup_src for these sources are given here.
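
A check of this kind can be sketched as a brute-force search for close pairs drawn from different scans. This is not the actual source proximity calculation, whose implementation is not described here. The field names (id, scan_key, ra, dec) and the demo sources are made up for illustration, and a real catalog would need a spatial index rather than an all-pairs loop.

```python
import math
from itertools import combinations

ARCSEC_PER_DEG = 3600.0


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine form, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * ARCSEC_PER_DEG


def cross_scan_pairs(sources, max_sep_arcsec=2.0):
    """Return (id_a, id_b, sep) for sources from different scans closer than the limit."""
    pairs = []
    for a, b in combinations(sources, 2):
        if a["scan_key"] == b["scan_key"]:
            continue  # only pairs from different scans are of interest
        sep = separation_arcsec(a["ra"], a["dec"], b["ra"], b["dec"])
        if sep < max_sep_arcsec:
            pairs.append((a["id"], b["id"], sep))
    return pairs


# Made-up sources, for illustration only.
demo_sources = [
    {"id": "A", "scan_key": 1, "ra": 10.0, "dec": 0.0},
    {"id": "B", "scan_key": 2, "ra": 10.0, "dec": 1.0 / 3600.0},   # 1" from A
    {"id": "C", "scan_key": 1, "ra": 10.0, "dec": 0.5 / 3600.0},   # same scan as A
    {"id": "D", "scan_key": 3, "ra": 10.0, "dec": 10.0 / 3600.0},  # far from the rest
]
demo_pairs = cross_scan_pairs(demo_sources)
```

Unlike the list referenced above, this sketch reports each pair once. After a complete multiple source resolution pass, running such a check on the accepted sources would be expected to return an empty list.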

2. < 4" Pairs (R. Stiening - 8/21/02)

The 469 million row public data release contains 909,985 rows in which the nearest neighbor is within 4" and the nearest neighbor is in a different scan. I have used these rows to study the duplicate resolution algorithm used by IPAC.

Figure 1 shows the distribution in separation of the sources with neighbors in different scans. There are a few sources with separations < 2". These have been identified as missed duplicates caused by a processing error that was limited to two scans. The other significant feature shown in this plot is the peak at 2" separation. There are approximately 50,000 sources in the public release in the peak. I did not find evidence that this peak is populated by sources with large proper motion. Figure 2 shows the Julian Day difference between the observations of sources which have a nearest neighbor in a different scan with a separation between 1.8 and 2.2 arcseconds. The peaks separated by 360 days are an artifact created by the observatory scheduling software. The distribution of the sources in the 2" peak by scan shows that they are preferentially located in scans at low galactic latitude. They appear to be a consequence of confusion.

Table 1.  The scan distribution of sources with neighbors between 1.8 and
2.2 arc seconds.  Only the scans with the largest number of sources are
listed in the table.

 scan_key | count |   glon   |   glat
----------+-------+----------+----------
    71794 |    25 |   1.7965 |   0.9086
    10379 |    24 |   2.2865 |   0.0981
    10376 |    23 |    2.104 |    0.402
    10384 |    22 |    2.587 |    -0.41
    35377 |    22 |  29.5378 |   0.2025
    27425 |    21 | 330.3194 |   1.7017
    10377 |    21 |    2.165 |   0.3001
    26097 |    21 | 339.6748 |   0.8628
    10380 |    20 |   2.3471 |  -0.0034
     2220 |    20 |   9.8373 |  -1.0266
    10383 |    20 |   2.5272 |  -0.3085
    11315 |    20 | 354.6753 |   0.9304
    26096 |    20 | 339.5987 |     0.95
    71795 |    19 |   1.8582 |   0.8077
    25518 |    19 |   9.0275 |   0.4326
    10375 |    19 |   2.0433 |    0.503
    34529 |    19 |   9.6649 |  -0.7102
    26742 |    19 | 340.4313 |  -0.0164
    25957 |    18 |  29.4861 |   0.3008
    26750 |    18 | 341.0194 |   -0.728
    49649 |    17 | 316.0657 |   -3.148
    36757 |    17 |  29.5919 |   0.0973
    23365 |    17 |  30.1928 |  -1.0729
    23364 |    17 |   30.137 |  -0.9671
    11313 |    17 | 354.5445 |   1.1256
    33182 |    17 |  43.7972 |  -1.2612
    10374 |    17 |    1.982 |   0.6051

The maximum number of sources in the 1.8-2.2 arcsecond peak in scans where the galactic latitude is greater than 10 degrees is 8.
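
The per-scan tally behind Table 1 and the high-latitude maximum can be sketched as follows. This is a minimal illustration, not the query actually used. The row layout (scan_key, sep_arcsec), the helper names, and the demo values are hypothetical, and the latitude cut is applied to the absolute value of glat, which is an assumption about how the 10 degree limit was meant.

```python
from collections import Counter


def peak_counts_by_scan(rows, lo=1.8, hi=2.2):
    """Count, per scan, the rows whose nearest-neighbor separation lies in [lo, hi] arcsec."""
    counts = Counter()
    for row in rows:
        if lo <= row["sep_arcsec"] <= hi:
            counts[row["scan_key"]] += 1
    return counts


def max_count_high_latitude(counts, scan_glat, min_abs_glat=10.0):
    """Largest per-scan count among scans with |glat| above the limit (assumed cut)."""
    high = [n for key, n in counts.items() if abs(scan_glat[key]) > min_abs_glat]
    return max(high, default=0)


# Made-up rows and scan latitudes, for illustration only.
demo_rows = [
    {"scan_key": 1, "sep_arcsec": 1.9},
    {"scan_key": 1, "sep_arcsec": 2.0},
    {"scan_key": 1, "sep_arcsec": 2.1},
    {"scan_key": 2, "sep_arcsec": 2.0},
    {"scan_key": 2, "sep_arcsec": 3.5},   # outside the 1.8-2.2 window
    {"scan_key": 3, "sep_arcsec": 1.85},
    {"scan_key": 3, "sep_arcsec": 2.15},
]
demo_glat = {1: 0.5, 2: 45.0, 3: -30.0}

demo_counts = peak_counts_by_scan(demo_rows)
demo_high_lat_max = max_count_high_latitude(demo_counts, demo_glat)
```

Sorting the counts with demo_counts.most_common() gives a listing in the same descending order as Table 1.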

I conclude from the above that the 2" search radius used for duplicate resolution is a conservative choice and that the number of missed duplicates in the public release is insignificant.

IV. Analyses Using the Multiple Source Resolution Statistics


Last Updated: 21 August 2002
R. Cutri (IPAC), S. Wheelock (IPAC), R. Stiening (UMass)