Extended SRC Database and Final Catalog Generation

T. Jarrett, IPAC
(980717)

The 2MASS extended source database (ESDB) contains information derived for sources deemed "extended" by the 2MASS GALWORKS subprocessor. The criteria for discriminating between point sources and extended sources are slightly relaxed in order to err on the side of extended source completeness. The ESDB, therefore, contains a mix of real galaxies, real galactic fuzzies (e.g., nebulae, H II regions, etc.), stars, double stars, triple stars, and "artifacts" (generally pieces of meteor streaks, but also pieces of bright stars). It is the job of the final product generator to cull the ESDB of "false" extended sources, leaving a final catalog that is 99% reliable for most of the sky.

For a description of the level-1 (final catalog) specifications for extended sources, see T. Chester's memo, Brief Summary of 2MASS Facts, or simply SURVEY LEVEL 1 REQUIREMENTS.

At this time, most of the northern and southern RTB scans have been fully processed and analyzed, so we now have a very good idea of what is going into the ESDB. The tables and plots below give a high-level summary of what is in the ESDB. We will first concentrate upon the high GLAT fields (low stellar number density), which are relevant to the level-1 specs. The last section of this memo addresses the issue of how we may build a "final product generator" to construct a clean ES catalog. The first scheduled release of 2MASS ES data is in late March of 1999.

T. Chester has sketched a plan for building and verifying a final product generator (generalized to both the point and extended source catalogs) as well as the documentation that goes with it (the mini-supplement). See Final Product Generator Tasks and The Mini-Explanatory Supplement.


Summary of the Extended Source Database Completeness and Reliability (C & R)

Low Stellar Density Scans

Low stellar density here means a star count density at Ks <= 14 of less than 3.1 in the log. The following table breaks down the classification results for all sources going into the ESDB (RTB fields), covering over 250 sq. degrees and ~14,000 total sources. These are the raw numbers: we have not applied any limits to cull false sources, except that we have eliminated duplicate observations (dupes occur at both inscan and cross-scan overlaps). See the Appendix for a table that includes dupes.

Classification Results for All Sources Going into the ESDB (no dupes)

J- J+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 1 .00 1 0 0 0 0 .000 .000 .000
9.00 10.00 0 4 .00 3 1 0 0 0 .000 .000 .000
10.00 11.00 3 29 .09 2 27 0 0 0 .011 .005 .017
11.00 12.00 52 47 .53 0 45 0 2 0 .185 .160 .211
12.00 13.00 227 119 .66 1 109 1 8 0 .809 .756 .863
13.00 13.50 417 135 .76 6 116 5 8 1 2.973 2.828 3.119
13.50 14.00 853 238 .78 15 202 5 16 7 6.082 5.874 6.290
14.00 14.50 1830 418 .81 70 310 12 26 48 13.048 12.743 13.353
14.50 15.00 3388 1028 .77 413 531 44 40 327 24.157 23.742 24.572
15.00 15.50 2571 488 .84 286 169 22 11 459 18.332 17.970 18.693
15.50 16.00 309 34 .90 17 8 9 0 90 2.203 2.078 2.329
- - - - - - - - - - - - -
- - - - - - - - - - - - -
H- H+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 1 .00 1 0 0 0 0 .000 .000 .000
9.00 10.00 1 12 .08 4 8 0 0 0 .004 .000 .007
10.00 11.00 32 37 .46 0 36 0 1 0 .114 .094 .134
11.00 12.00 174 68 .72 0 64 0 4 0 .620 .573 .667
12.00 13.00 819 205 .80 4 190 0 11 2 2.920 2.818 3.022
13.00 13.50 1204 242 .83 11 213 6 12 23 8.585 8.337 8.832
13.50 14.00 2512 498 .83 110 331 25 32 93 17.911 17.554 18.268
14.00 14.50 3889 1071 .78 501 499 34 37 583 27.729 27.284 28.174
14.50 15.00 975 368 .73 188 151 17 12 233 6.952 6.729 7.175
15.00 15.50 77 61 .56 27 25 9 0 27 .549 .486 .612
15.50 16.00 11 24 .31 10 7 6 1 10 .078 .055 .102
- - - - - - - - - - - - -
- - - - - - - - - - - - -
K- K+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 1 .00 1 0 0 0 0 .000 .000 .000
9.00 10.00 5 13 .28 4 9 0 0 0 .018 .010 .026
10.00 11.00 55 39 .59 0 38 0 1 0 .196 .170 .223
11.00 12.00 278 82 .77 0 78 0 4 0 .991 .932 1.051
12.00 13.00 1551 255 .86 8 220 10 17 12 5.529 5.389 5.670
13.00 13.50 2370 357 .87 46 269 24 18 73 16.898 16.551 17.246
13.50 14.00 4256 864 .83 365 429 36 34 539 30.346 29.881 30.811
14.00 14.50 1079 791 .58 388 361 15 27 314 7.693 7.459 7.928
14.50 15.00 95 225 .30 124 85 10 6 43 .677 .608 .747
15.00 15.50 7 74 .09 34 34 4 2 8 .050 .031 .069
15.50 16.00 5 25 .17 11 8 5 1 3 .036 .020 .052
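
The derived columns in the table above can be reproduced from the raw counts. The sketch below is a reading inferred from the tabulated numbers, not a formula stated in this memo: it assumes "reliab" is nG / (nG + nBog), with nBog the sum of nS, nD, nA and ntr, and that the last three columns are galaxy counts per magnitude per square degree with +/- sqrt(N) Poisson bounds. The area of 280.5 sq. degrees is also an assumption, backed out from the table rows rather than quoted anywhere.

```python
import math

AREA_DEG2 = 280.5  # assumption: inferred from the table, not quoted in the memo


def reliability(n_gal, n_bogus):
    """Fraction of classified sources that are galaxies: nG / (nG + nBog)."""
    total = n_gal + n_bogus
    return n_gal / total if total else 0.0


def density_with_errors(n_gal, bin_width_mag, area_deg2):
    """Galaxy counts per mag per deg^2, with -sqrt(N) and +sqrt(N) bounds."""
    norm = bin_width_mag * area_deg2
    err = math.sqrt(n_gal)
    return n_gal / norm, max(n_gal - err, 0.0) / norm, (n_gal + err) / norm


# J band, 13.00-13.50 row of the table above: nG = 417, nBog = 135
print(round(reliability(417, 135), 2))                                  # 0.76
print([round(v, 3) for v in density_with_errors(417, 0.5, AREA_DEG2)])  # [2.973, 2.828, 3.119]
```

Under these assumptions the printed values match the tabulated reliab, density, elow and ehigh entries for that row.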

It can be seen that the contamination rate is on the order of 20%, dominated by "double stars". These false galaxies are mostly blue in color (J-K < 1.0) and thus are easily eliminated using a color criterion (more on this later). Artifacts, unlike double stars, are not easily discriminated from galaxies because they are genuinely non-point-like (e.g., meteor streaks, pieces of bright stars, etc.). The only hope of eliminating them is to discern their "correlatedness": pieces of meteor streaks align both in coordinate position and in position angle. Fortunately, their total numbers are generally <= 1% of the galaxies detected.
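
One way the "correlatedness" idea might be tested is sketched below. This is only an illustration under stated assumptions, not the method used by GALWORKS or the final product generator: the (x, y, pa_deg) record layout, the flat local frame, the position angle convention, the 5 degree tolerance and the requirement of at least two aligned partners are all hypothetical choices.

```python
import math


def angle_diff_deg(a, b):
    """Smallest difference between two orientations, modulo 180 degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def aligned(src_a, src_b, tol_deg=5.0):
    """True if both sources are elongated along the line that joins them.

    Each source is (x, y, pa_deg) in a flat local frame; the position angle
    is measured from the +y axis toward +x (an assumed convention).
    """
    xa, ya, pa_a = src_a
    xb, yb, pa_b = src_b
    joining_pa = math.degrees(math.atan2(xb - xa, yb - ya)) % 180.0
    return (angle_diff_deg(pa_a, joining_pa) <= tol_deg
            and angle_diff_deg(pa_b, joining_pa) <= tol_deg)


def flag_streak_candidates(sources, tol_deg=5.0):
    """Indices of sources aligned with at least two other sources."""
    flagged = []
    for i, a in enumerate(sources):
        partners = sum(1 for j, b in enumerate(sources)
                       if j != i and aligned(a, b, tol_deg))
        if partners >= 2:
            flagged.append(i)
    return flagged


# Made-up example: three pieces along a 45 degree line plus one unrelated source.
pieces = [(0.0, 0.0, 45.0), (10.0, 10.0, 46.0), (20.0, 20.0, 44.0), (5.0, 30.0, 120.0)]
print(flag_streak_candidates(pieces))  # [0, 1, 2]
```

A pairwise test like this scales as the square of the number of sources, so in practice it would presumably be restricted to sources within a single scan or coadd.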

For the level-1 spec, we want to generate a catalog that is 90% complete and 99% reliable. The spec refers to galaxies with Ks < 13.5, H < 14.3 and J < 15.0, for which the "sh" value is larger than 0.5 (see the small print in the level-1 spec document; link given above). The "sh" limit is designed to restrict consideration to sources that are extended beyond the PSF (that is, sources for which a significant difference between the radial extent of the galaxy and the PSF can be measured). A raw "sh" value of 0.5 corresponds to a "sh" score of ~10.0. The latter parameter is what is written to the ESDB. The following table shows the classification breakdown (compare with the first table, above) with a "sh" score limit of 10.0.
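
The spec sample described above can be expressed as a simple per-band filter. The sketch below is illustrative only: the (band, magnitude, sh score) record layout and the function name are hypothetical, the example values are made up, and treating each band's limit independently is an assumed reading of the spec.

```python
# Level-1 spec magnitude limits quoted above, per band.
SPEC_MAG_LIMIT = {"J": 15.0, "H": 14.3, "K": 13.5}
SH_SCORE_MIN = 10.0  # "sh" score corresponding to a raw "sh" value of about 0.5


def in_spec_sample(band, mag, sh_score):
    """True if a source falls inside the level-1 spec sample for one band."""
    return mag < SPEC_MAG_LIMIT[band] and sh_score >= SH_SCORE_MIN


# Hypothetical records: (band, magnitude, sh score).
sources = [("K", 13.2, 14.1), ("K", 13.2, 6.0), ("J", 15.4, 20.0)]
kept = [s for s in sources if in_spec_sample(*s)]
print(kept)  # [('K', 13.2, 14.1)]
```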

Classification Results "sh" score >= 10.0

J- J+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 0 0 .00 0 0 0 0 0 .000 .000 .000
10.00 11.00 3 9 .25 0 9 0 0 0 .011 .005 .017
11.00 12.00 51 16 .76 0 15 0 1 0 .182 .156 .207
12.00 13.00 226 47 .83 1 41 1 4 0 .806 .752 .859
13.00 13.50 416 51 .89 1 41 5 4 0 2.966 2.821 3.112
13.50 14.00 838 86 .91 1 69 3 13 2 5.975 5.769 6.181
14.00 14.50 1775 145 .92 4 114 7 20 26 12.656 12.356 12.956
14.50 15.00 3101 356 .90 48 242 36 30 211 22.111 21.713 22.508
15.00 15.50 2139 157 .93 53 76 21 7 268 15.251 14.922 15.581
15.50 16.00 163 13 .93 2 3 8 0 39 1.162 1.071 1.253
- - - - - - - - - - - - -
- - - - - - - - - - - - -
H- H+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 1 1 .50 0 1 0 0 0 .004 .000 .007
10.00 11.00 32 11 .74 0 11 0 0 0 .114 .094 .134
11.00 12.00 173 21 .89 0 19 0 2 0 .617 .570 .664
12.00 13.00 807 76 .91 2 71 0 3 1 2.877 2.776 2.978
13.00 13.50 1156 85 .93 1 71 5 8 12 8.242 8.000 8.485
13.50 14.00 2274 173 .93 19 114 19 21 56 16.214 15.874 16.554
14.00 14.50 3237 380 .89 104 219 32 25 321 23.080 22.675 23.486
14.50 15.00 597 87 .87 19 48 14 6 98 4.257 4.082 4.431
15.00 15.50 15 14 .52 1 6 7 0 8 .107 .079 .135
15.50 16.00 1 0 1.00 0 0 0 0 1 .007 .000 .014
- - - - - - - - - - - - -
- - - - - - - - - - - - -
K- K+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 5 1 .83 0 1 0 0 0 .018 .010 .026
10.00 11.00 55 9 .86 0 9 0 0 0 .196 .170 .223
11.00 12.00 274 25 .92 0 22 0 3 0 .977 .918 1.036
12.00 13.00 1368 71 .95 2 52 10 7 6 4.877 4.745 5.009
13.00 13.50 1841 125 .94 6 87 23 9 33 13.127 12.821 13.432
13.50 14.00 2811 272 .91 68 152 31 21 245 20.043 19.665 20.421
14.00 14.50 490 195 .72 63 106 12 14 97 3.494 3.336 3.652
14.50 15.00 20 22 .48 5 14 3 0 5 .143 .111 .174
15.00 15.50 0 4 .00 0 2 2 0 0 .000 .000 .000
15.50 16.00 0 0 .00 0 0 0 0 0 .000 .000 .000

The reliability now hovers between 75 and 95%. Looking only at the last half-magnitude bin relevant to the spec (e.g., 13.0 < Ks < 13.5), the reliability is around 90%. The true completeness is difficult to measure (one needs a "truth" set independently measured and verified), but based on repeatability tests, the completeness is likely to be well above 95%. See Coma Repeatibility from 10 scans of 970521n and GALWORKS Performance on the Spiral-Rich Cluster Hercules and 2MASS Validation Fields.

We therefore require more stringent limits on the star-galaxy discrimination parameters in the ESDB to cull additional false galaxies and bring the reliability up to >98%.


Methods Toward Producing a Reliable Extended Source Catalog

In order to cull the ESDB of unwanted double and triple stars, we need to look more carefully at the star-galaxy discrimination parameters. The following link shows the results for all sources going into the ESDB sans dupes.

It can be seen that the galaxies and double stars are most easily separated using the "wsh" parameter and J-K color (or alternatively, the JHK color score). The following tables summarize the C&R with a color-color criterion applied and with a combination of color-color and "wsh".

Classification Results "sh" score >= 10.0 and "color" score >= -0.25

J- J+ nG CJ nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
10.00 11.00 3 1.00 2 .60 0 2 0 0 0 .011 .005 .017
11.00 12.00 51 1.00 6 .89 0 5 0 1 0 .182 .156 .207
12.00 13.00 226 1.00 18 .93 0 16 0 2 0 .806 .752 .859
13.00 13.50 416 1.00 22 .95 1 19 0 2 0 2.966 2.821 3.112
13.50 14.00 838 1.00 35 .96 0 27 1 7 2 5.975 5.769 6.181
14.00 14.50 1770 1.00 87 .95 3 70 5 9 24 12.620 12.320 12.920
14.50 15.00 3048 .98 196 .94 29 125 25 17 196 21.733 21.339 22.126
15.00 15.50 2090 .98 93 .96 28 45 14 6 243 14.902 14.576 15.228
15.50 16.00 158 .97 12 .93 2 3 7 0 39 1.127 1.037 1.216
- - - - - - - - - - - - - -
H- H+ nG CH nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 1 1.00 0 1.00 0 0 0 0 0 .004 .000 .007
10.00 11.00 32 1.00 5 .86 0 5 0 0 0 .114 .094 .134
11.00 12.00 172 .99 7 .96 0 5 0 2 0 .613 .566 .660
12.00 13.00 807 1.00 34 .96 1 31 0 2 1 2.877 2.776 2.978
13.00 13.50 1155 1.00 45 .96 0 36 4 5 12 8.235 7.993 8.478
13.50 14.00 2265 1.00 104 .96 14 63 16 11 54 16.150 15.810 16.489
14.00 14.50 3175 .98 227 .93 60 128 25 14 302 22.638 22.236 23.040
14.50 15.00 567 .95 41 .93 8 21 9 3 85 4.043 3.873 4.213
15.00 15.50 12 .80 3 .80 0 0 3 0 8 .086 .061 .110
15.50 16.00 1 1.00 0 1.00 0 0 0 0 1 .007 .000 .014
- - - - - - - - - - - - - -
K- K+ nG CK nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 5 1.00 0 1.00 0 0 0 0 0 .018 .010 .026
10.00 11.00 55 1.00 3 .95 0 3 0 0 0 .196 .170 .223
11.00 12.00 274 1.00 12 .96 0 10 0 2 0 .977 .918 1.036
12.00 13.00 1368 1.00 44 .97 1 28 10 5 6 4.877 4.745 5.009
13.00 13.50 1841 1.00 94 .95 6 59 22 7 33 13.127 12.821 13.432
13.50 14.00 2808 1.00 241 .92 68 127 29 17 242 20.021 19.644 20.399
14.00 14.50 458 .93 148 .76 60 70 10 8 97 3.266 3.113 3.418
14.50 15.00 15 .75 7 .68 2 5 0 0 2 .107 .079 .135
15.00 15.50 0 .00 1 .00 0 1 0 0 0 .000 .000 .000
15.50 16.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000

Classification Results for "sh" score >= 10.0 and "color" score + "wsh" >= 0.25

J- J+ nG CJ nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
10.00 11.00 3 1.00 0 1.00 0 0 0 0 0 .011 .005 .017
11.00 12.00 51 1.00 2 .96 0 1 0 1 0 .182 .156 .207
12.00 13.00 226 1.00 1 1.00 0 1 0 0 0 .806 .752 .859
13.00 13.50 416 1.00 10 .98 1 5 2 2 0 2.966 2.821 3.112
13.50 14.00 838 1.00 19 .98 0 14 1 4 2 5.975 5.769 6.181
14.00 14.50 1775 1.00 85 .95 4 61 7 13 24 12.656 12.356 12.956
14.50 15.00 3099 1.00 250 .93 36 154 35 25 206 22.096 21.699 22.493
15.00 15.50 2135 1.00 119 .95 37 57 18 7 264 15.223 14.893 15.552
15.50 16.00 163 1.00 12 .93 2 3 7 0 39 1.162 1.071 1.253
- - - - - - - - - - - - - -
H- H+ nG CH nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 1 1.00 0 1.00 0 0 0 0 0 .004 .000 .007
10.00 11.00 32 1.00 0 1.00 0 0 0 0 0 .114 .094 .134
11.00 12.00 173 1.00 0 1.00 0 0 0 0 0 .617 .570 .664
12.00 13.00 807 1.00 14 .98 1 11 0 2 1 2.877 2.776 2.978
13.00 13.50 1155 1.00 29 .98 0 22 4 3 12 8.235 7.993 8.478
13.50 14.00 2274 1.00 107 .96 17 57 19 14 55 16.214 15.874 16.554
14.00 14.50 3234 1.00 278 .92 81 146 31 20 318 23.059 22.653 23.464
14.50 15.00 596 1.00 61 .91 9 35 13 4 96 4.250 4.075 4.424
15.00 15.50 14 .93 10 .58 0 6 4 0 8 .100 .073 .127
15.50 16.00 1 1.00 0 1.00 0 0 0 0 1 .007 .000 .014
- - - - - - - - - - - - - -
K- K+ nG CK nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000
9.00 10.00 5 1.00 0 1.00 0 0 0 0 0 .018 .010 .026
10.00 11.00 55 1.00 0 1.00 0 0 0 0 0 .196 .170 .223
11.00 12.00 274 1.00 2 .99 0 2 0 0 0 .977 .918 1.036
12.00 13.00 1368 1.00 25 .98 1 10 10 4 6 4.877 4.745 5.009
13.00 13.50 1841 1.00 82 .96 6 49 22 5 33 13.127 12.821 13.432
13.50 14.00 2811 1.00 240 .92 68 122 31 19 244 20.043 19.665 20.421
14.00 14.50 489 1.00 163 .75 60 80 11 12 97 3.487 3.329 3.644
14.50 15.00 20 1.00 20 .50 5 12 3 0 5 .143 .111 .174
15.00 15.50 0 .00 4 .00 0 2 2 0 0 .000 .000 .000
15.50 16.00 0 .00 0 .00 0 0 0 0 0 .000 .000 .000

Thus a simple color-color cutoff plus "wsh" score results in a vast improvement in reliability without significantly affecting the internal completeness. The level-1 spec of 99% still remains elusive, however. The last couple of percent in reliability will require more sophisticated methods, either using the other star-galaxy parameters in some clever combination, or employing an OBDT method that uses all information in a complicated (non-intuitive) fashion.
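
The internal completeness and reliability quoted in the tables above can be computed for any candidate cut as sketched below. This assumes, as the tabulated numbers suggest but the memo does not state, that the CJ, CH and CK columns are the number of galaxies surviving the cut divided by the number in the "sh"-only table. The dictionary field names are hypothetical and the per-source score values are placeholders; only the counts (K band, 14.00-14.50 bin) come from the tables.

```python
def evaluate_cut(sources, keep):
    """Internal completeness and reliability after applying a cut.

    sources: list of dicts with a boolean "is_galaxy" label plus scores.
    keep:    predicate returning True for sources that survive the cut.
    """
    n_gal_before = sum(1 for s in sources if s["is_galaxy"])
    kept = [s for s in sources if keep(s)]
    n_gal = sum(1 for s in kept if s["is_galaxy"])
    n_bogus = len(kept) - n_gal
    completeness = n_gal / n_gal_before if n_gal_before else 0.0
    reliability = n_gal / (n_gal + n_bogus) if kept else 0.0
    return completeness, reliability


# Counts from the K band 14.00-14.50 rows: 490 galaxies and 195 bogus sources
# before the color cut, 458 galaxies and 148 bogus sources after it.
# The color_score values (0.0 passes, -1.0 fails) are placeholders.
sources = (
    [{"is_galaxy": True, "color_score": 0.0}] * 458
    + [{"is_galaxy": True, "color_score": -1.0}] * 32
    + [{"is_galaxy": False, "color_score": 0.0}] * 148
    + [{"is_galaxy": False, "color_score": -1.0}] * 47
)
c, r = evaluate_cut(sources, lambda s: s["color_score"] >= -0.25)
print(round(c, 2), round(r, 2))  # 0.93 0.76
```

Passing the cut as a predicate keeps the bookkeeping identical when trying other parameter combinations, such as a combined color and "wsh" criterion.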


Tasks to Produce the 2MASS Extended Source Catalog

See also Final Product Generator Tasks and Extended Source Catalog Criteria: Proposal.




Appendix

The following table shows the classification breakdown for all sources in the ESDB (located along sites of low stellar number density), including duplicates (there should be between 10 and 15% dupes due to cross-scan and inscan coadd overlap).

Classification Results for All Sources Going into the ESDB (w/ dupes)

J- J+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 1 .00 1 0 0 0 0 .000 .000 .000
9.00 10.00 0 5 .00 4 1 0 0 0 .000 .000 .000
10.00 11.00 3 32 .09 2 30 0 0 0 .011 .005 .017
11.00 12.00 87 50 .64 0 48 0 2 2 .310 .277 .343
12.00 13.00 468 131 .78 1 120 1 9 0 1.668 1.591 1.746
13.00 13.50 676 147 .82 7 127 5 8 1 4.820 4.635 5.005
13.50 14.00 1237 262 .83 15 223 5 19 8 8.820 8.569 9.071
14.00 14.50 2598 448 .85 75 330 13 30 70 18.524 18.161 18.887
14.50 15.00 4513 1103 .80 424 587 44 48 464 32.178 31.699 32.657
15.00 15.50 3561 540 .87 300 205 23 12 619 25.390 24.965 25.816
15.50 16.00 446 39 .92 20 9 10 0 114 3.180 3.029 3.331
- - - - - - - - - - - - -
- - - - - - - - - - - - -
H- H+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 1 .00 1 0 0 0 0 .000 .000 .000
9.00 10.00 1 13 .07 5 8 0 0 0 .004 .000 .007
10.00 11.00 46 40 .53 0 39 0 1 0 .164 .140 .188
11.00 12.00 369 80 .82 0 75 0 5 3 1.316 1.247 1.384
12.00 13.00 1305 219 .86 4 204 0 11 2 4.652 4.524 4.781
13.00 13.50 1708 267 .86 11 234 7 15 35 12.178 11.884 12.473
13.50 14.00 3453 535 .87 116 357 25 37 134 24.620 24.201 25.039
14.00 14.50 5126 1141 .82 518 545 34 44 758 36.549 36.039 37.060
14.50 15.00 1519 409 .79 199 179 18 13 333 10.831 10.553 11.109
15.00 15.50 112 73 .61 28 36 9 0 41 .799 .723 .874
15.50 16.00 13 25 .34 10 7 7 1 11 .093 .067 .118
- - - - - - - - - - - - -
- - - - - - - - - - - - -
K- K+ nG nBog reliab nS nD nA ntr nU nG/mag/deg^2 elow ehigh
5.00 9.00 0 1 .00 1 0 0 0 0 .000 .000 .000
9.00 10.00 6 14 .30 5 9 0 0 0 .021 .013 .030
10.00 11.00 98 42 .70 0 41 0 1 3 .349 .314 .385
11.00 12.00 531 95 .85 0 90 0 5 0 1.893 1.811 1.975
12.00 13.00 2281 273 .89 9 236 10 18 13 8.132 7.962 8.302
13.00 13.50 3292 391 .89 47 297 25 22 103 23.472 23.063 23.881
13.50 14.00 5582 917 .86 380 461 37 39 704 39.800 39.268 40.333
14.00 14.50 1693 836 .67 399 392 15 30 446 12.071 11.778 12.365
14.50 15.00 174 258 .40 135 104 10 9 57 1.241 1.147 1.335
15.00 15.50 13 84 .13 34 43 5 2 11 .093 .067 .118
15.50 16.00 6 30 .17 12 12 5 1 4 .043 .025 .060