2MASS Photometric Redshift Project

Methods & Calibration

L. Johnson & T. Jarrett
99/06/23


I. Intro to Photo-Redshift Method

It has long been known that broad-band photometry can be used to estimate redshift. Baum first experimented with the idea in 1962, using nine passbands to detect the redshifted 4000 angstrom break in a galaxy's SED; from this key break in the continuum he was able to calibrate a photometric-redshift relation and obtain rough estimates of z for individual galaxies. Since Baum's day many new methods have been applied to calibrate photometric-redshift relations. The incentive is that broad-band photometry is orders of magnitude less time consuming than spectroscopy. Furthermore, photometry is available for faint galaxies that are not spectroscopically accessible, if only because of finite telescope time. With the release of the Hubble Deep Field (HDF), and an ever-present interest in the large-scale structure of the universe, came an explosion of interest in the usefulness of photometrically derived redshifts. With quick estimates of redshift, astronomers are able to investigate galaxy evolution and other (to be trite) mysteries of the universe as illuminated by the HDF. Indeed Sawicki et al. (1997), Lanzetta et al. (1996), Gwyn & Hartwick (1996), and Mobasher et al. (1996) are only a representative few who use photometrically derived redshifts as a fundamental part of studying the HDF. The use of photometric redshifts is certainly not, however, restricted to the HDF. Brunner et al. (1997), Connolly et al. (1995), and Kodama et al. (1999), among many others, have worked simply (or not so simply, as the case actually is) to refine photometric-redshift relations alongside other scientific agendas. Whatever the atmosphere surrounding the task, estimating redshifts from broad-band photometry proves tricky.
Often there is no sufficient training set from which to build an empirical relation, and theoretical relations generally incorporate a galaxy evolution model that is only one of the many 'loose ends' needed to build a set of theoretical color or SED templates against which to compare actual photometric data. Nonetheless, results have been encouraging. As far back as 1985 Koo was able to derive the redshifts of galaxies with z < 0.30 to within an uncertainty of less than 0.07 (although for z > 0.3 the uncertainty increases and is non-Gaussian). This project presents the trials and tribulations of calibrating yet another photometric-redshift relation, this time for the 2MASS colors of galaxy clusters.

2MASS photometry provides exciting inspiration for deriving a color-redshift relation. The theory behind a 2MASS relation is sound: with increasing distance the stellar luminosity of a galaxy (or cluster) is redshifted toward the K band (the K-correction), and this redshifting is particularly apparent in the 2MASS H-K color. See 2MASS Galaxy Colors: Hercules Cluster and 2MASS Galaxy Cluster Catalog, as well as the thumbnail at the bottom of this section, for a derivation and better understanding of this 'theory'. We set out, at least initially, to build an empirical color-redshift relation for clusters with z less than 0.1, and 2MASS offers ideal data. Much of the uncertainty of past photometric-redshift relations has been attributed to uncertainty in the photometric data and, as mentioned, few empirical relations exist for lack of a sufficient training set. 2MASS, being an all-sky survey, solves both problems. The survey provides an unprecedented database that offers both precise photometric data and a huge selection of galaxy clusters from which to calibrate a color-redshift relation.





II. Data & Methodology

In order to determine the redshift of a galaxy cluster from the J-H and H-K colors of its principal members, we must first be able to distinguish accurately between cluster members and 'other' sources. This fundamental task is initially tackled by the sorting algorithm GALWORKS. GALWORKS sorts through 2MASS data and discriminates between extended sources (primarily galaxies), point sources (primarily stars), double and triple stars, and artifacts (meteor streaks, bright star artifacts, etc.), with a reliability and repeatability discussed in the preceding link. Following this distinction, and thus left with a database of 'only' extended sources, we are faced with the challenge of isolating cluster members (or, more essentially, their colors and redshifts) from foreground and background extended sources. We do so with a computer program written by TJ that incorporates both a photometric-redshift relation and a redshift-sorting algorithm. The remainder of this project focuses first on establishing the best set of default parameters for the redshift-sorting algorithm and second on deriving an empirical photometric-redshift curve using an iterative approach with the established default parameters.

TARGET CLUSTERS
To establish the best default parameters for the redshift-sorting algorithm we chose sixteen target galaxy clusters, ranging in redshift from 0.016 to 0.103, as a training set for the algorithm. These sixteen clusters were chosen from eleven different sets, each corresponding to a different part of the northern sky with an area ranging from 400 to 1000 square degrees. The eleven sets were chosen sufficiently far from the galactic plane to minimize the confusion inherent to the plane (an exciting challenge to be met in the future). For each target cluster we load filtered 2MASS data, corresponding to roughly three square degrees of sky centered on the cluster, into our computer program, Cluster_Search and later Cluster_Blind. We then systematically change the parameters entered into the redshift-sorting algorithm and write several outputs to a log that give clues to the parameters' success. We ultimately attempt to minimize the scatter associated with the parameters, albeit via a game of small number statistics associated only with our sixteen clusters, which are themselves unfortunately biased towards a redshift range of 0.04-0.07.

Z-CODED COLOR PLOT
Cluster_Search first displays a plot of the filtered 2MASS data centered on a particular cluster. The plot is a right ascension-declination plot coded with respect to both redshift and signal-to-noise ratio for individual sources. The preceding link and a small thumbnail at the bottom of this section will open a window with an example plot of the galaxy cluster Abell 2589. The plot is a standard RA-DEC plot with RA increasing to the left. 'Every' point on the plot is an extended source, and its color and size are indicative of its redshift and H-K color SNR, respectively. A color palette at the top of the plot displays a key for the coded redshift of individual galaxies, the rainbow color range of dark blue to deep red corresponding to a Z of 0.00-0.14. This redshift is, of course, not inherent in the 2MASS data; rather, Cluster_Search applies a photometric-redshift relation to each source above a certain SNR threshold and deduces a Z from the source's 2MASS J-H and H-K colors. As a first attempt at calibrating the redshift-sorting algorithm we used a theoretical photometric-redshift relation based upon the K-correction curve discussed in the intro. However, we ultimately side with an empirical relation inspired by Steve Schneider to train the algorithm. The SNR of a source is indicated by its apparent size: sources with a SNR > 20 have the largest point size, and sources with a SNR of 15-20, 10-15, 8-10, and < 8 have point sizes of 90%, 75%, 50%, and 25% of the largest point size, respectively. The small white points on the plot are extended sources as well, but their SNR is below the current SNR threshold (a user-controlled parameter); we do not calculate a redshift for these sources, and they are left out of the redshift-sorting algorithm. Lastly, a small 'x' on each plot indicates the center coordinates of the cluster as documented in the Abell catalog.
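
The size coding described above amounts to a simple lookup. The following Python sketch is only an illustration: the function name is hypothetical, and the handling of the exact boundary values (a SNR of exactly 20, 15, 10, or 8) is an assumption, since the text does not say into which range a boundary value falls.

```python
def point_size_fraction(snr):
    """Return the plotted point size as a fraction of the largest point size.

    Ranges follow the text: SNR > 20, 15-20, 10-15, 8-10, and < 8.
    Assumption: a boundary value belongs to the range it closes from below
    (for example, SNR = 20 is treated as part of 15-20).
    """
    if snr > 20:
        return 1.00
    if snr >= 15:
        return 0.90
    if snr >= 10:
        return 0.75
    if snr >= 8:
        return 0.50
    return 0.25


if __name__ == "__main__":
    for snr in (25, 17, 12, 9, 5):
        print(snr, point_size_fraction(snr))
```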

CLUSTER RADIUS
Once the plot is made Cluster_Search is largely interactive. The user determines the cluster's radius and thus, to some extent, the initial sources that are entered into the redshift-sorting algorithm. Determining the cluster radius is designed to be user-interactive because of the high variability in the shape and size of galaxy clusters. Throughout the project we experiment with 'large' and 'small' radii, but fundamentally a cluster's radius is chosen to include all obvious cluster members and to encircle the cluster at the most obvious break, while acknowledging that more distant clusters will have smaller apparent radii (given an intrinsic size). In low density areas this approach is quite easy to implement. In higher density areas it is still generally possible to determine a break at the cluster's edge, but one must take care not to include too many errant sources, particularly when dealing with clusters at larger redshifts. We set a lower limit of twenty arcminutes for a cluster's radius when investigating default parameters but occasionally choose a smaller radius for sources with Z > 0.15 when isolating clusters' colors for deriving an empirical photometric-redshift curve. Once the radius of a cluster is chosen, a circle with that radius constitutes the two-dimensional area of the cluster. The center of this circle is placed at the center of the cluster by default, but occasionally, when dealing with more distant or odd-shaped clusters, it is necessary to offset the circle so as to incorporate all cluster members within a minimum radius. Once the user has identified the extent of the cluster, all the sources within this area and above the SNR threshold are entered into the redshift-sorting algorithm.

REDSHIFT-SORTING ALGORITHM
The redshift-sorting algorithm is based upon the fundamental premise that within a cluster's radius there will be more cluster members with ~similar Z than either foreground or background sources with ~similar Z. (Indeed, one apparent problem from the outset is thus the sticky situation of multiple clusters within a single line of sight.) The sorting algorithm first creates a histogram of the redshifts within a cluster's radius with a user-determined bin width and an arbitrary starting point for the binning. Initially, then, this first histogram has as its peak the mode of all the redshifts within the cluster radius, and if our premise holds this mode will be indicative of the cluster redshift. However, at this point the cluster redshift has an uncertainty imposed by the bin width, roughly equal to the bin width itself. To minimize this uncertainty we would have to create a histogram with a bin width approaching zero (or at least with a smaller bin width), but at a certain point a small bin width puts our fundamental premise in jeopardy. With a smaller and smaller bin width the number of sources in the mode decreases, and the chance of incorrectly identifying a cluster redshift due to foreground and background sources skyrockets.

There is fortunately another means to attack the uncertainty inherent with the bin width. After Cluster_Search creates a first histogram it changes the starting point of the histogram and redistributes the redshifts of all the sources. It does so repeatedly, each time changing the starting point by a fraction of the bin width, and it thus systematically shifts the histogram along the Z-axis, redistributing the redshifts each time. Throughout, Cluster_Search keeps track of the histogram with the highest peak, that is, the histogram with the greatest number of sources in its mode, as this histogram will have most accurately determined the mode of the redshifts. This shifting identifies the histogram in which the fewest sources that properly belong to the mode have fallen into adjacent bins. In this manner we can use a bin width large enough to isolate the redshift of the cluster accurately and also lower the uncertainty of the mode to below that imposed by the bin width. Nonetheless, the bin width remains a fundamental parameter whose ideal value must be determined by experiment: the larger the bin width, the better the probability of correctly identifying a cluster's redshift, but within an unfortunately larger uncertainty.
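
The shifting procedure can be sketched in a few lines of Python. This is a minimal illustration and not the Cluster_Search source: the function name, the number of shifts per bin width (`n_shifts`), and the starting point `z0` are all assumptions. Ties between shifted histograms with equally full modes are broken in favor of the bin whose center lies closest to the mean of its members, following the description of the best histogram in the HISTOGRAMS section.

```python
import numpy as np


def shifted_histogram_mode(z, bin_width=0.015, n_shifts=10, z0=0.0):
    """Slide the bin origin in steps of bin_width / n_shifts and keep the
    binning whose fullest bin (the mode) holds the most sources.

    n_shifts and z0 are hypothetical parameters chosen for illustration.
    """
    z = np.asarray(z, dtype=float)
    origin = min(z0, z.min())        # arbitrary starting point at or below the data
    step = bin_width / n_shifts
    best = None
    for k in range(n_shifts):
        start = origin - k * step    # shift the whole histogram along the Z axis
        idx = np.floor((z - start) / bin_width).astype(int)
        counts = np.bincount(idx)
        i = int(np.argmax(counts))   # fullest bin of this shifted histogram
        center = start + (i + 0.5) * bin_width
        bmean = float(z[idx == i].mean())
        # Prefer more sources in the mode; break ties by |center - bmean|.
        key = (int(counts[i]), -abs(center - bmean))
        if best is None or key > best[0]:
            best = (key, {"count": int(counts[i]), "mode": center, "bmean": bmean})
    return best[1]


if __name__ == "__main__":
    # Five 'cluster' redshifts that straddle the fixed bin edge at 0.030,
    # plus two background sources.
    z = [0.028, 0.029, 0.031, 0.032, 0.033, 0.080, 0.100]
    print(shifted_histogram_mode(z, n_shifts=1))    # fixed binning splits the cluster
    print(shifted_histogram_mode(z, n_shifts=10))   # shifting recovers all five
```

With a single fixed binning the five clustered redshifts are split across two bins; after shifting, one bin captures all five, and the reported mode is no longer tied to the arbitrary starting point.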

HISTOGRAMS
Given a cluster center and radius as well as values for all parameters, Cluster_Search thus shifts a cluster's histogram until it has isolated the histogram with the most sharply peaked mode. The program initially outputs this histogram for whichever parameter values are set as defaults, but it can recalculate the histogram after changes are made to any of the parameters. For each target cluster we record a default histogram, a best histogram, and a best Monte-Carlo histogram (defined below). The preceding links as well as thumbnails at the bottom of this section open windows with example histograms of galaxy cluster Abell 2589. Each histogram has at its top the values of the three parameters bin width, SNR limit, and power. The details of each are explained fully in the next section on parameters, but in brief: the bin width is that of the histogram; the SNR limit (expressed as 1/SNR, the delta mag of a source's signal) is a lower SNR threshold for sources entered into the redshift-sorting algorithm; and the power is a value that determines the weighting, if any, of sources based on their respective SNR. Other pertinent information is listed in the upper right corner of each histogram: the cluster's RA and DEC coordinates, the cluster's radius, the mode of the histogram, the mean of the redshifts within the mode, and the fraction of all the sources that comprise the mode. The number of sources within a bin can be read from the vertical axis and a bin's redshift from the horizontal axis. The histogram is traced by a solid white line, and the vertical dashed red line represents the mean of the redshifts within the mode, the bmean. The dashed red line that may or may not trace a second histogram shows the same histogram as it would appear with the power parameter set to zero (thus if a histogram's power is set to zero the red dashed line will lie directly over the solid white line).

Occasionally we did not record a histogram with the original default parameters because we had not yet decided upon them. Rather, we recorded a 'first best' histogram: a histogram achieved with relatively little fiddling of the parameters that derived a fairly accurate Z. These exceptions are marked with an asterisk in the section on Target Clusters as not being true default histograms, and they occur only for a few clusters when using the theoretical K-Correction photometric-redshift relation. In other instances the histogram does not agree exactly with the quoted value for the mode. As Cluster_Search shifts through histograms, many slightly different histograms have the same number of sources in their respective modes and thus the same probability of containing the best mode. The best histogram is taken to be the one whose mode lies closest to the bin mean (of the best mode). Cluster_Search then displays the histogram which corresponds to the calculated mode.

LOG
Control of Cluster_Search is returned to the user once the program has created a histogram for a given set of parameters. The user has options to begin afresh, at which point the user picks a new center and radius for the cluster; change the value of individual parameters; enter or exit Monte-Carlo mode; redo the histogram, at which point changes in parameters and mode take effect but the cluster center and radius do not change; write pertinent information to a log; and exit the program, among other options useful for future experimentation. The log is a means of recording information to be used in determining the best default parameters. The preceding link as well as a larger one below refers to an example log for galaxy cluster Abell 2589. Each entry in the log records the cluster's RA and DEC coordinates; its radius; the number of sources in the mode; the percent of sources within the cluster radius that comprise the mode; the mode; the bmean; the three current values for the bin width, SNR limit (as 1/SNR), and power; the true redshift of the cluster if known, -9.00 otherwise; two values representative of the percent error in the derived redshift values of the mode and bmean, del1 and del2; and a comment if so desired. Thus a string of entries records not only the systematic changes made to a particular parameter but also their effect on the derived redshift values, the percent error, the number of sources in the mode, et cetera. Generally, the presented logs record the systematic change in the three key parameters bin width, SNR limit, and power, as well as the information pertaining to a best set of parameters derived from these entries. Each log furthermore records systematic changes in bin width and SNR limit, as well as a best combination thereof, once in Monte-Carlo mode. Notes to this effect are reported in the comment section of the log. The logs for our sixteen target clusters are thus split roughly into two halves, the bottom half denoted as taken in Monte-Carlo mode. Clearly the columns marked del1 and del2, the percent errors in the mode and bmean redshift values, will be of key interest when determining a set of best default parameters; see results.
Some of the earlier logs (the non-empirical (K-Correction) logs for certain clusters) differ in the original default values of the parameters before each is systematically varied. We quickly made a few decisions as to what we wanted as original default parameters, and ultimately the earlier entries with slightly different default parameters prove insignificant, as it is only the best combination thereof that influences the derivation of the best set of default parameters. On a larger scale, whether or not Cluster_Search is running in Monte-Carlo mode proves to be of fundamental importance.

MONTE-CARLO
The Monte-Carlo mode of Cluster_Search is a theoretically sound approach to dealing with the uncertainty inherent in a source's signal. When Cluster_Search is not in Monte-Carlo mode the user has two means by which to account for the SNR of a source. The full details are available in the section on key parameters, but in brief the user has options to, one, apply a SNR limit for sources entered into the redshift-sorting algorithm and, two, apply a weighting scheme to sources based on their respective SNR, giving greater weight to higher SNR values. Both of these options introduce systematic errors into the redshift derivation, as they favor certain ranges in Z. The Monte-Carlo option is theoretically more sound in that, before a source is entered into the redshift-sorting algorithm, Cluster_Search fits the derived redshift of a galaxy to a Gaussian based on the SNR of the source. Cluster_Search assigns a given number of 'hits' to each source, and these hits are spread out over the Gaussian that follows from the uncertainty in the source's 2MASS magnitudes. Each hit is then entered into the redshift-sorting algorithm. Cluster_Search uses a random number generator, and a seed for the generator, to spread the hits over a given Gaussian curve. In order to make the seed irrelevant and eliminate any unwanted effects of small number statistics, we must experiment with the number of hits assigned to each source. A larger number of hits slows down the program but smooths each Gaussian; experimenting with a range of 100-500 hits, we ultimately conclude that 400 hits per source sufficiently represents each Gaussian without serious threat of random misrepresentation. Furthermore, we ultimately enforce a two sigma cut-off for assigning hits to a given Gaussian. This cut-off minimizes the effect of both high and low SNR sources 'tugging' on a cluster redshift from afar in the histogram. This method clearly increases manyfold the number of quasi-sources entered into the redshift-sorting algorithm. In order to normalize the resulting histogram, and thus derive appropriate values for the number and percent of sources in the mode, we simply divide the number of sources in each bin by the number of hits assigned to each source.
By directly accounting for the SNR of each source, the Monte-Carlo mode creates a histogram that is more representative of the true 2MASS data present within a cluster radius. Nonetheless, the parameters of bin width and SNR threshold remain of profound importance. More important still, however, is the photometric-redshift relation that we choose to enter into Cluster_Search. We want to optimize the parameters of the program that we will eventually use to build our own empirical curve. It was quickly apparent that the K-Correction photometric-redshift relation is subject to a systematic error in which it underestimates the redshift of all but the closest galaxy clusters (those with Z roughly less than 0.07). Thus, before we attempt to analyze the data taken with this relation, we decide to experiment with a new empirical relation offered by Steve Schneider.
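
The Monte-Carlo step can be illustrated with a short Python sketch. It is not the Cluster_Search implementation: the function names are hypothetical, and the per-source redshift uncertainty `sigma_z` is taken as an input because the text does not give the formula that converts a source's SNR into the width of its Gaussian. The 400 hits per source, the two sigma cut-off, and the division by the number of hits follow the description above.

```python
import numpy as np


def truncated_normal(rng, n, clip):
    """Draw n standard-normal deviates with |d| <= clip by rejection sampling."""
    out = np.empty(0)
    while out.size < n:
        d = rng.standard_normal(2 * n)
        out = np.concatenate([out, d[np.abs(d) <= clip]])
    return out[:n]


def monte_carlo_histogram(z, sigma_z, edges, n_hits=400, clip_sigma=2.0, seed=0):
    """Spread each source over a clipped Gaussian and histogram the hits.

    sigma_z is an assumed input: the redshift uncertainty of each source.
    Returns the histogram normalized back to 'numbers of sources', plus the
    array of hits with shape (number of sources, n_hits).
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    sigma_z = np.asarray(sigma_z, dtype=float)
    dev = truncated_normal(rng, z.size * n_hits, clip_sigma).reshape(z.size, n_hits)
    hits = z[:, None] + sigma_z[:, None] * dev
    counts, _ = np.histogram(hits.ravel(), bins=edges)
    return counts / n_hits, hits      # dividing by n_hits restores source counts


if __name__ == "__main__":
    edges = np.arange(-0.1, 0.3, 0.015)
    norm, hits = monte_carlo_histogram([0.04, 0.05], [0.01, 0.02], edges)
    print(norm.sum())                 # total equals the number of input sources
```

Because every hit lies within two sigma of its source, a noisy source can contribute to bins near its own redshift but cannot scatter hits far across the histogram.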

EMPIRICAL PHOTOMETRIC-REDSHIFT RELATION
The Abell 262 Field 2MASS Galaxy Sample is Schneider's detailed study of the Abell 262 cluster field. The project investigates the redshift distribution, luminosity functions, and absolute magnitudes of hundreds of galaxies in the area and ultimately offers both a color-redshift and a color-luminosity relationship. Refer to the above link for a full presentation of Steve's color-redshift results for 323 galaxies in the area. Schneider presents the color-redshift relationship

Z = 0.1682 (H-K) + 0.017

obtained from a least squares fit to the H-K colors of the galaxies. Furthermore, he finds that some of the intrinsic scatter associated with galaxy colors can be removed if the J-H colors are also incorporated in the relationship. Steve finds that a color term of

Q = (H-K) + f (J-H),

where f=0.644, minimizes the scatter in the predicted redshift. The first empirical photometric-redshift relationship that we then enter into Cluster_Search becomes

Z = 0.2059 Q - 0.1073.
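
As a worked illustration, the relations above can be written as three small Python functions. The coefficients are those quoted in the text; the function names and the example colors (J-H = 0.70, H-K = 0.30) are hypothetical values chosen only to show the arithmetic.

```python
F_JH = 0.644  # weight of the J-H color in the combined color term Q


def q_color(j_h, h_k, f=F_JH):
    """Combined color term Q = (H-K) + f (J-H)."""
    return h_k + f * j_h


def z_from_hk(h_k):
    """Schneider's H-K only relation: Z = 0.1682 (H-K) + 0.017."""
    return 0.1682 * h_k + 0.017


def z_from_q(q):
    """First empirical relation entered into Cluster_Search: Z = 0.2059 Q - 0.1073."""
    return 0.2059 * q - 0.1073


if __name__ == "__main__":
    j_h, h_k = 0.70, 0.30            # hypothetical colors, for illustration only
    q = q_color(j_h, h_k)            # 0.30 + 0.644 * 0.70 = 0.7508
    print(q, z_from_q(q), z_from_hk(h_k))
```

For these example colors Q = 0.7508, so the Q relation gives Z of about 0.047, while the H-K only relation gives Z of about 0.067.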

Again attempting to optimize all parameters using this empirical relationship, we discover that this photometric-redshift relation is also subject to a systematic error. This time the relationship overestimates the redshift of most of our target clusters with a Z roughly less than 0.08. The derived relationship may be subject to the shortcomings of statistics, for plotting the relationship over figures of color versus redshift for the Abell 262 cluster field reveals that the curve is indeed rather flat. As discussed in the intro, the H-K colors of galaxies redden with increasing Z. If the slope of the photometric-redshift curve is too great, as is the case with the K-Correction curve, then it will underestimate the redshift, particularly of more distant galaxies; whereas if the curve is too flat, as is the case here, it will overestimate the redshift, drastically so for closer galaxies. We need only a rough but decent photometric-redshift relation before we can optimize our parameters and ultimately derive our own empirical curve. We thus compute a quick relationship by hand from the data gathered by Steve. Using plots of H-K, J-H, and Q, all versus Z, we compute a color-redshift relationship; see Calibration. We use this empirical relation and proceed to take data. Data in Target Clusters (as well as elsewhere throughout the document) denoted as taken with an empirical photometric-redshift relation refer to this latter relationship.



EXAMPLE LOG, PLOT, AND HISTOGRAMS
FOR GALAXY CLUSTER ABELL 2589 (Z = 0.042)


(Z and SNR coded RA-DEC plot)


(default histogram)


(best histogram)


(best monte-carlo histogram)



III. Key Parameters

Aside from the fundamentals of which photometric-redshift relation to use and whether or not to run Cluster_Search in Monte-Carlo mode, we attempt to optimize the three key parameters of bin width, dmag limit, and power. The bin width used by the redshift-sorting algorithm, the dmag limit (1/SNR) for sources entered into the algorithm, and a power value that determines how strongly, if at all, sources are weighted based on their respective signal-to-noise ratios all influence the success of the redshift-sorting algorithm. We suspect that extremes of any parameter will result in a systematic error in redshift derivation, and thus all must be determined by experiment. In theory one would expect nearby clusters to be favored (that is, to have better z estimates) by a small bin width, a high SNR limit, and a high power. Conversely, more distant clusters should, if not be favored, at least have better z estimates with a larger bin width, a lower SNR limit, and no power. Indeed, these trends are somewhat evident throughout our experimentation, but the infinite oddities of individual clusters with their respective foreground and background sources provide many an exception. All told, it is particularly difficult to adjust the parameters so as to favor more distant clusters. Because of their small size, small number of 2MASS members, and low SNRs, their derived redshifts are easily displaced by foreground sources. Nonetheless we vary the parameters as best we can to favor each target cluster and carefully record the results. The following tables show the widest range of parameters we investigate throughout the project using Cluster_Search in and out of Monte-Carlo mode.


NOT IN MONTE-CARLO MODE
PARAMETER     RANGE
bin_width     0.01-0.03
dmag_limit    0.05-0.30
power         0.00-3.00


MONTE-CARLO MODE
PARAMETER     RANGE
bin_width     0.005-0.020
dmag_limit    0.10-0.25
power         0.00-0.00
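
To make the roles of dmag_limit and power concrete, here is a minimal Python sketch of a SNR-limited, SNR-weighted redshift histogram. The exact weighting formula used by Cluster_Search is not given in this document; the form weight = SNR**power is purely an assumption, chosen because it reduces to no weighting when power is 0.00. The function name is likewise hypothetical.

```python
import numpy as np


def weighted_redshift_histogram(z, snr, edges, dmag_limit=0.15, power=0.0):
    """Histogram redshifts of sources with 1/SNR <= dmag_limit.

    Assumption: each source is weighted by SNR**power, so power = 0 gives
    every surviving source a weight of one.
    """
    z = np.asarray(z, dtype=float)
    snr = np.asarray(snr, dtype=float)
    keep = (1.0 / snr) <= dmag_limit          # dmag limit expressed as 1/SNR
    weights = snr[keep] ** power
    counts, _ = np.histogram(z[keep], bins=edges, weights=weights)
    return counts


if __name__ == "__main__":
    z = [0.040, 0.041, 0.080]
    snr = [20.0, 5.0, 10.0]                   # the SNR = 5 source has dmag = 0.20
    edges = [0.0, 0.05, 0.10]
    print(weighted_redshift_histogram(z, snr, edges, power=0.0))
    print(weighted_redshift_histogram(z, snr, edges, power=1.0))
```

Under this assumed form, raising the power lets bright (typically nearby) sources dominate the mode, which is consistent with the expectation stated above that a high power favors nearby clusters.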




IV. Target Clusters

All relevant data pertaining to our sixteen target clusters are available here. Abell 262 serves as an example below, and the following links lead to each cluster's respective information. Each cluster has a Digital Sky Survey (DSS) image, a reversed optical image 30 arcminutes in width and height. The image is centered on the cluster, with a small gray cross denoting the cluster's center. For both a K-Correction photometric-redshift relation (see: Intro) and an empirical photometric-redshift relation there is a log, an RA-DEC plot, a default histogram, a best histogram, and a best Monte-Carlo histogram for each cluster. Furthermore, there are either one or two J-H and H-K color histograms for each cluster under the empirical photometric-redshift relation.



*Some clusters do not have default histograms using the K-Correction method; rather, they have a 'first best' histogram. These were made before we had settled upon default parameters.
**Some histograms do not agree exactly with their respective 'mode'; the calculated number for the mode is correct, and the displayed histogram is simply one of many histograms that initially had the same probability of having the best mode. (see Data & Methodology: HISTOGRAMS)

V. Results

The second empirical color-redshift relation, with Cluster_Search in Monte-Carlo mode with a bin width of 0.015, a SNR limit (as 1/SNR) of 0.10, and a power of 0.00, yields the least scatter in the derived Z for the target clusters. Having used a theoretical K-Correction curve and two empirical curves, as well as varying parameters in and out of Monte-Carlo mode, we find this to be the best set of parameters, and indeed this is the set we use when deriving the J-H and H-K colors for the sixteen target clusters. Ultimately, however, a SNR limit of 0.10 is too unsettling. Such a high SNR limit will, it would seem, inevitably favor closer galaxies and subject more distant galaxies to serious 'Z-threats' from foreground sources, as perhaps many of the distant cluster members will be excluded from the plot. Furthermore, as mentioned, these results are the product of very low number statistics from our target clusters, which are themselves biased towards a middle-range redshift, 0.04 < Z < 0.07. Thus, for the purposes of gathering data for a better empirical relationship, we use the aforementioned parameters with the exception that we choose a dmag (1/SNR) limit of 0.15. This set of parameters with a dmag limit of 0.15 provides the third least scatter for the target clusters, the second least being provided by another set with a dmag limit of 0.10.

Looking predominantly at the percent errors of the mode and bmean redshift estimates, we use the target cluster logs to determine the best sets of parameters. The following tables document the exercise.




VI. Color - Redshift Relation

Having at last settled upon default parameters for Cluster_Search with its temporary photometric-redshift relation, we set out to derive our own color-redshift curve. For this leg of the project we need a much larger set of galaxy clusters with known redshifts, but we nonetheless begin with our sixteen target clusters. We must extend the capabilities of Cluster_Search in order to derive a photometric-redshift curve, for in addition to isolating the redshift of a given cluster we must now also isolate and document the 2MASS colors of the cluster. TJ adds a color-sorting algorithm, as well as a few other perks, to Cluster_Search, and the program is renamed Cluster_Blind, as it will eventually be used blind to search for unknown clusters.

COLOR-SORTING ALGORITHM AND J-H, H-K COLOR HISTOGRAMS
The color-sorting algorithm works in much the same manner as does the redshift-sorting algorithm. The J-H and H-K colors of each source within a given cluster radius and above the now established SNR threshold are entered into the color-sorting algorithm. The algorithm creates two histograms, one each for the J-H and H-K colors, with now established bin widths and, once again, calculates both a mode and a bmean. However, the bmean calculated by the color-sorting algorithm was not at first simply the average of the values within the mode. Initially we did not think it necessary to apply the 'shifting' method to the color-sorting algorithm (see: redshift-sorting algorithm), although we eventually did so in the name of precision. Instead, we first applied a weighting scheme to the mode and its two adjacent bins and used all three bins to calculate a bmean. Each of the adjacent bins was given a weight of 25% and the primary bin (mode) 50%, and thus if a significant number of sources that belonged to the true mode fell into either adjacent bin, they would be averaged into the bmean. The color histograms output by the color-sorting algorithm are somewhat simpler than the redshift histograms. The preceding link as well as the thumbnail at the bottom of this section opens a window with example color histograms of galaxy cluster Abell 2589. The histograms are traced in white and the dashed line marks the bmean. The values of the mode and the bmean are displayed at the top of each histogram (in that order); the number of sources in a bin may be read from the vertical axis and the color of a bin from the horizontal axis. Here is a table of the 2MASS colors for the target clusters. There are two entries for many of the clusters because we had been experimenting with a 'larger' cluster radius but felt that, especially for the more distant clusters, it would be wise to note the influence of a smaller cluster radius. Indeed, for redshifts greater than 0.07 a smaller radius provides a little improvement in the z estimate.
When a better z estimate is derived from a smaller radius we use the corresponding colors to plot the clusters on our photometric-redshift curve.
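
The three-bin weighting of the color bmean can be sketched as follows. This Python example is only one possible reading of the scheme: it assumes that every source in the mode bin carries a weight of 0.50 and every source in either adjacent bin a weight of 0.25, and that the bmean is the weighted mean of those sources. The text does not spell out how the 25/50/25 weights were combined, and the function name and `origin` parameter are hypothetical.

```python
import numpy as np


def weighted_color_bmean(colors, bin_width, origin=0.0):
    """Weighted mean color over the mode bin (0.50) and its neighbors (0.25 each).

    Assumption: the weights are applied per source; sources outside the three
    bins are ignored.
    """
    colors = np.asarray(colors, dtype=float)
    idx = np.floor((colors - origin) / bin_width).astype(int)
    idx -= idx.min()                          # make bin indices non-negative
    mode_bin = int(np.argmax(np.bincount(idx)))
    weights = np.zeros_like(colors)
    weights[idx == mode_bin] = 0.50
    weights[np.abs(idx - mode_bin) == 1] = 0.25
    return float(np.average(colors, weights=weights))


if __name__ == "__main__":
    # Hypothetical H-K colors: three in the mode bin, one in each neighbor,
    # and one outlier that is ignored.
    hk = [0.31, 0.32, 0.33, 0.27, 0.36, 0.61]
    print(weighted_color_bmean(hk, bin_width=0.05))
```

In this reading, sources that spill into a neighboring bin still pull the bmean toward themselves, but only half as strongly as sources in the mode.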


EXAMPLE COLOR HISTOGRAMS FOR GALAXY CLUSTER ABELL 2589 (Z = 0.042)


(J-H, H-K color histograms)


CLUSTER_LOCALMAX
After procurring the colors of our target clusters, we proceed to known galaxy clusters in a small section of southern sky and a patchy 20 degree band of northern sky. The section of southern sky is an area between the declination of -20 and -40 degrees and right ascesion of 0 and 80 degrees. Within this large an area of sky (not along the plane) there is inevitably many galaxy clusters with known redshifts, yet it remains a challenge to isolate the clusters and apply them to the color and redshift-sorting algorithms. TJ designed Cluster_Blind optimally so that it operates fundamentally from a 400 square degree plot of 2MASS sky (thus, for example, Cluster_Blind must be run four seperate times in order to search through our small area of southern sky). From this view the user isolates a cluster and Cluster_Blind Zooms into a nine square degree Z and SNR coded plot of the 2MASS cluster. However, before the 2MASS data corresponding to the 400 square degree plot is entered into Cluster_Blind it is filtered through yet another program written by TJ, Cluster_Localmax; and even before the data is entered into Cluster_Localmax a minimum SNR limit (expressed as 1/SNR as the delta in a source's signal) of 0.15 is imposed on the data Cluster_Localmax is a searching algorithm that identifies possible clusters on the basis of spatial clustering alone; it has no clue as to the redshift or SNR of an individual source. The program operates on the fundamental parameters of a minimum number of sources to be found within a circle area with a given radius. When Cluster_Blind uses data that has been run through Cluster_localmax, a blue circle is overlain on the plot identifying where Cluster_Localmax has found a possible cluster. These identifying circes are carried over to the smaller Z-coded plot, if the user decides to go there, so as to more precisely mark the spatial cluster (in this view the circles are made much larger and colored yellow so as not to confuse the color plot). 
For the purpose of solely identifying clusters with known Z, Cluster_Localmax is somewhat superfluous, but we use it here to get a feel for the program and to experiment with its parameters, since it will ultimately lead us in our blind search of the sky. We can gauge the success of the Cluster_Localmax parameters because, in addition to the blue circles overlain on the larger Cluster_Blind plot, there are orange and yellow circles which correspond to known galaxy clusters. The yellow circles represent clusters with known Z from the Abell catalog. The orange circles indicate clusters from the NED and HUCHRA catalogs, which may or may not have known redshifts, as well as clusters from the Abell catalog that do not have a documented Z. Thus, as we sort through the known clusters we can check whether or not Cluster_Localmax has correctly identified them. Furthermore, spatial clustering of galaxies is fairly apparent to the human eye, so one can hunt for unknown clusters, check whether Cluster_Localmax did or 'should' have identified them, and then tune the parameters accordingly. For the majority of its use, Cluster_Localmax was set to identify areas that had a minimum of ten sources within a circular area with a radius of 15 arcminutes. In areas of low density these parameters appear to work fairly well, whereas in areas of higher density the number of possible clusters identified may be a little too great.
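The visual check described above, comparing the blue circles against the yellow and orange ones, could also be made numerically. The sketch below is hypothetical and not part of Cluster_Blind: the function names, the input formats and the matching radius (set equal to the 15 arcminute search radius) are assumptions chosen for illustration.

```python
import math


def sep_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcminutes (haversine); inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 60.0


def check_recovery(known_clusters, candidates, match_radius_arcmin=15.0):
    """Split known clusters into those with at least one candidate position
    within match_radius_arcmin (recovered) and those with none (missed).

    known_clusters: list of dicts with 'name', 'ra', 'dec' (degrees)
    candidates:     list of (ra, dec) tuples from the spatial search
    """
    recovered, missed = [], []
    for k in known_clusters:
        hit = any(
            sep_arcmin(k["ra"], k["dec"], ra, dec) <= match_radius_arcmin
            for ra, dec in candidates
        )
        (recovered if hit else missed).append(k["name"])
    return recovered, missed
```

Running such a check for several choices of minimum source count and radius would give a rough measure of how each setting trades missed known clusters against spurious candidates in dense fields.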

LOG
Thus when Cluster_Blind fires up, it presents a 400 square degree plot of sky with yellow and orange circles identifying known clusters and blue circles marking spatial clusters. Investigating the success of Cluster_Localmax is a secondary aim; our primary task is to search systematically through the known clusters for those with documented redshifts. Often the user may zoom in on a nine square degree plot that contains many clusters, so Cluster_Blind marks the center coordinates of each known cluster with a small 'x'. Among the improvements of Cluster_Blind is the option to display a list of known clusters close to any chosen point on the plot. The list displays all clusters in the Abell, NED, and HUCHRA catalogs and presents each cluster's name, coordinates, size, and redshift if known, in addition to the distance in arcminutes from the cluster's center to the user-marked point on the plot. From the larger view we can then choose to create a Z-coded plot for an area, mark any 'x' on the plot, determine the cluster's name and redshift, and ultimately write pertinent information to a log. The Cluster_Blind log is similar to that for Cluster_Search, save that there are additional columns of information. Cluster_Blind supersedes the former log in that it records not only the RA-DEC coordinates of a cluster but also its galactic longitude and latitude, the cluster's average extinction, and the mode and bmean of its J-H and H-K colors. Using the established set of default parameters, a given galaxy cluster warrants only a single entry in the log, and we record the cluster's name in the comment section.
Furthermore, there are occasionally instances when a cluster has differing documented redshifts, and others when multiple clusters lie along a single line of sight, each with its own spectroscopically derived redshift. Under these circumstances we note the details in the comment section and side with the documented Z that most closely matches our estimated Z. When there are multiple clusters in a line of sight, this amounts to choosing the Z of the cluster that is predominant in our histogram; when a single cluster has differing documented redshifts, we assume that our estimated Z lends credence to the most similar spectroscopic Z.
The user continues to determine the cluster radius as Cluster_Blind treks across the sky, and in this manner we can search through large areas of sky, gathering data for a color-redshift relation. Using the established set of default parameters, the only control the user has over the derived Z is via the cluster's radius. We simply choose a sensible radius (as described in the preceding link) and vary it slightly in order to derive the best redshift estimates. Doing so ensures that we enter appropriate sources into the color-sorting algorithm. Occasionally, for clusters with Z > 0.15, we pick a cluster radius below our formerly established twenty arcminute minimum. This minimum radius is designed to be a lower limit for searching for and identifying clusters, and although we are able to deduce redshifts out to Z > 0.15, it is unlikely that we will be able to detect clusters at that distance. All told, we calculate the colors of 66 galaxy clusters from our small section of southern sky. We also scour a band of northern sky between declinations of 23 and 43 degrees, less the areas between right ascensions of 0 and 20, 60 and 80, and 260 and 280 degrees. Including the sixteen target clusters, we derive the colors of a total of 208 clusters from our northern 2MASS data. We now have 274 data points (66 southern plus 208 northern) from which to construct an empirical photometric-redshift relation.
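Two of the logging steps in this section lend themselves to a short sketch: listing the catalogued clusters near a user-marked point with their distance in arcminutes, and siding with the documented Z closest to the estimated Z. The code below is an assumption-laden illustration rather than the Cluster_Blind implementation; the function names, the catalog record layout and the `max_arcmin` search limit are all hypothetical.

```python
import math


def sep_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcminutes (haversine); inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 60.0


def nearby_clusters(catalog, ra, dec, max_arcmin=60.0):
    """Return (distance_arcmin, entry) pairs for catalog entries within
    max_arcmin of the marked point, nearest first. Each entry is a dict
    with 'name', 'ra', 'dec' and 'z' (None when no redshift is documented)."""
    hits = []
    for entry in catalog:
        d = sep_arcmin(ra, dec, entry["ra"], entry["dec"])
        if d <= max_arcmin:
            hits.append((d, entry))
    hits.sort(key=lambda h: h[0])
    return hits


def choose_documented_z(z_estimated, documented_zs):
    """Pick the documented redshift closest to the photometric estimate.
    Undocumented values (None) are ignored; returns None if none remain."""
    known = [z for z in documented_zs if z is not None]
    if not known:
        return None
    return min(known, key=lambda z: abs(z - z_estimated))
```

For example, with an estimated Z of 0.08 and documented values of 0.05 and 0.09 along the same line of sight, `choose_documented_z` returns 0.09, which is the value that would be logged alongside a comment noting the alternative.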

Summary of Results: see 2MASS Photometric Redshift Project: Status