Repeatability tests (see, for example, Cutri (2001)) have shown that the RMS flux dispersion of PROPHOT point source photometry is systematically less than the quoted flux uncertainty for S/N > 10. The discrepancy is particularly large for bright stars, for which the flux uncertainty is typically overestimated by a factor of 2. We have been aware of this effect for some time, and have found that it can be attributed entirely to an overestimate of the PSF variance map caused by undersampling-related sinc interpolation errors in PSF generation. A more detailed discussion of this effect will be presented in a forthcoming document. The goal of the present document is to present some empirical results showing how the discrepancy varies as a function of magnitude and seeing shape factor, and to propose a rectification procedure.
Because of the likelihood that the correction factor for the flux error (which we will refer to as the "sigma correction") will have some dependence on seeing, it is necessary to examine the flux repeatability as a function of the seeing shape factor, in addition to source magnitude. With this goal in mind, the flux repeatability of calibration sources has been determined for each individual PSFID, for a representative sample of southern hemisphere data. The analysis was restricted to galactic latitudes outside the range -20 to +20 degrees, with the following additional selection criteria:
(1) 2nd moment ratio > 0.9
(2) S/N > 10 at K.
(3) Read 2 only (rd_flg='222')
(4) No confusion artifacts (cc_flg='000')
(5) Reduced chi squared < 2.0 in all 3 bands.
(6) Single sources only (bl_flg='111')
The time ranges were selected so as to include a representative set of
seeing shape factors, and included a total of more than 9 months' worth of
cal scan data. The specific intervals chosen were:
1998 Jul 14 - 1998 Jul 31
1998 Sep 27 - 1998 Oct 24
1998 Dec 1 - 1999 Feb 28
1999 Jun 1 - 1999 Aug 31
1999 Sep 14 - 1999 Oct 29
1999 Nov 30
2000 Apr 7
2000 Jul 2
The flux differences used to assess repeatability were obtained from the multiple observations of a given source on the same night. For each pair of observations of a given source, the flux difference was expressed as a ratio with respect to the expected error derived from the quoted sigmas from PROPHOT, and the RMS values were evaluated in bins corresponding to the PSFID and source magnitude (in bins of width 0.5 mag).
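The pairwise statistic described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function names and the tuple layout are hypothetical, and the expected error of a flux difference is assumed to be the quadrature sum of the two quoted sigmas (i.e., independent errors).

```python
import math
from collections import defaultdict

def normalized_difference(f1, s1, f2, s2):
    """Flux difference of a pair divided by its expected error.

    Assumption: the two measurements have independent errors, so the
    expected error of the difference is sqrt(s1^2 + s2^2).
    """
    return (f1 - f2) / math.sqrt(s1 * s1 + s2 * s2)

def binned_rms(pairs, bin_width=0.5):
    """RMS of normalized differences per (PSFID, magnitude bin).

    `pairs` is an iterable of (psfid, mag, f1, s1, f2, s2) tuples;
    this layout is hypothetical, chosen only for the example.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [sum of squares, count]
    for psfid, mag, f1, s1, f2, s2 in pairs:
        key = (psfid, math.floor(mag / bin_width))
        r = normalized_difference(f1, s1, f2, s2)
        sums[key][0] += r * r
        sums[key][1] += 1
    return {key: math.sqrt(ss / n) for key, (ss, n) in sums.items()}
```

A value below 1 in a given bin indicates that the quoted sigmas overestimate the observed scatter for that PSFID and magnitude range.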
The complete set of results is summarized in a set of plots of RMS/sigma as a function of magnitude for all PSFIDs encountered, in Figure 1a, Figure 1b, and Figure 1c, representing J, H, and K bands, respectively.
In these plots, various ranges of seeing shape factors, sh, are indicated by solid lines (sh < 1.0), dashed lines (1.0 < sh < 1.1), dash-dot (1.1 < sh < 1.2), and dotted (sh > 1.2).
In order to show better the dependence on seeing shape factor, the bright-star values of RMS/sigma (representing the average of RMS/sigma for J<12, H<11.5, and K<11) are plotted as a function of shape in Figure 2.
Since the sigma discrepancy is due to the overestimation of the PSF variance maps, one approach to correcting the sigmas would be to scale the variance maps in accordance with the above results, and then to re-derive the sigmas from scratch, using the expressions corresponding to the PROPHOT error model. This would be perfectly feasible and would not involve re-doing the photometry. However, I am not recommending that this be done during catalog generation, partly because it would add a significant amount of computation to the generation process, but more importantly because there exists a much simpler alternative, in the form of an empirical correction.
Based on the slowly varying nature of the sigma correction with magnitude and seeing shape factor, it should be possible to come up with a set of simple empirical expressions which would enable the flux sigmas to be corrected with a minimum of computation and risk. The form of such an empirical expression should be motivated by the form of the PROPHOT error model itself, which is described in the 2nd release documentation. Based on that model, one can express the variance of the i-th pixel value as:
sigma_i^2 = f H_i / g + sigma_b^2 + f^2 V_i
where f is the source flux [du], g is the pixel gain [counts/du],
sigma_b is the standard deviation of local sky background [du], and
H_i and V_i are the local values of the PSF and variance map.
Based on this expression, we can derive a suitable form for the
empirical sigma correction by means of the following (admittedly rough)
plausibility argument:
If, for the purposes of simplicity, we ignore the Poisson noise of the source itself (since it is dominated by PSF error for bright stars and the background noise for faint stars), and make the further simplification that V_i = V = constant, then it can be easily shown that a direct proportionality exists between the a-posteriori flux uncertainty, sigma_f, and the standard deviation of a pixel value, i.e.
sigma_f ~ sqrt(sigma_b^2 + f^2 V)
where ~ denotes proportionality.
If we denote the PSF variance as used in PROPHOT by V_0 and the true variance by V, then we can express the correction factor for the flux error as:
alpha(f) = sqrt[(sigma_b^2 + f^2 V)/(sigma_b^2 + f^2 V_0)]
which, after some minor manipulation can be expressed as:
alpha(f) = sqrt[(c + alpha_b^2 f^2)/(c + f^2)]
.................................................... (1)
where c and alpha_b are constants whose values can be estimated from the
repeatability results in the previous section. The quantity alpha_b represents
the bright-source limit, i.e.,
the correction factor in the limit of large flux.
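For reference, Equation (1) can be evaluated as in the sketch below. Dividing the numerator and denominator of the preceding expression by V_0 gives c = sigma_b^2 / V_0 and alpha_b^2 = V / V_0. The function name is ours, and any parameter values passed to it would be illustrative unless taken from the fits.

```python
import math

def alpha(f, c, alpha_b):
    """Sigma correction factor of Equation (1).

    f       : source flux [du]
    c       : sigma_b^2 / V_0 (obtained by dividing through by V_0)
    alpha_b : sqrt(V / V_0), the bright-source limit of the correction
    """
    f2 = f * f
    return math.sqrt((c + alpha_b * alpha_b * f2) / (c + f2))
```

By construction the factor tends to 1 for faint sources (f^2 much less than c), where the background dominates, and to alpha_b for bright sources (f^2 much greater than c), where the PSF variance dominates.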
To verify that this expression gives a good representation of the
repeatability data, a sample fit is shown in
Figure 3, corresponding to a seeing
shape factor of 1.007 at J-band. On this plot, the diamonds represent
the measured values of RMS/sigma, and the solid line represents the
best fit to Equation (1).
Since Figure 2 shows some shape dependence, the parameters c and alpha_b should ideally reflect this. Initial attempts at fitting a linear variation of these parameters with respect to shape, however, were unsuccessful; the true dependence is more complicated, and may, in some cases, reflect the quirks of individual PSFs.
The next strategy tried was to fit the parameters c and alpha_b in separate shape ranges. Figure 4a, Figure 4b, and Figure 4c show the results (at J, H, and K bands, respectively) of splitting the data into two segments with respect to shape. The cutoff value used to divide the shape ranges was determined on the basis of minimizing the overall residuals of the fit, and is indicated on the figure in each case. Data values in the lower and upper shape ranges are indicated by crosses and filled circles, respectively. The fitted curves are represented by dashed lines (lower shape ranges) and solid lines (upper shape ranges). We could, in principle, use these curves to correct the flux sigmas of the survey data. The accuracy that we would thereby obtain can be assessed from the scatter of the points about the fitted curves, which indicates that in the bright-star regime, the standard deviation of the corrected sigmas would be approximately 20% of the sigma values themselves.
Ideally, however, we would obtain greater accuracy by doing the fit separately for each PSFID rather than binning into two coarse shape ranges, as illustrated by the superiority of the fit in Figure 3. The only danger associated with separate fits is that some PSFs have rather sparse statistics. However, the "good seeing" PSFIDs are well represented, and so the best option is probably to fit these individually, and use average values for the more sparsely-populated "poor seeing" PSFs. Based on these considerations, a set of c and alpha_b values has been obtained for each individual PSFID for the cases with good statistics (N>100 in at least 10 magnitude bins), defaulting to the average values for the appropriate seeing bin for the remainder of the PSFIDs. For the latter purpose, the seeing shape factors were divided into two coarse bins as in Figures 4a, 4b, and 4c.
Using this set of c and alpha_b values, the sigma corrections to be applied to the survey data can be calculated using Equation (1). This has been done for a pair of 9 deg x 5 deg test regions: a fairly dense region in CMa (l ~ 240, b ~ -10), and a less dense region in Eri (l ~ 230, b ~ -45). The results are shown in Figure 5a and Figure 5b, respectively. In these figures, the plotted points represent RMS/sigma, i.e., the magnitude repeatability (from overlaps of adjacent scans) divided by the theoretical value of magnitude uncertainty. The symbols on the plots have the following meanings:
Open circles: Values based on uncorrected sigmas, i.e. the values from PROPHOT.
Filled circles: Corrected as described above, using individual PSF fits where possible.
Crosses: Corrected using NO shape information (for comparison).
These results suggest that the magnitude sigmas for all of the PROPHOT-reduced sources in the catalog could be corrected in a relatively simple way. If necessary, this could be done without even using shape information, although the H-band plot in Figure 5b illustrates the systematic errors that can occur when shape is neglected.
Since photometric accuracy will inevitably be a function of seeing shape, one would expect that plots of sigma as a function of magnitude should show different loci for different PSFIDs, and this is, in fact, the case, as shown in Figures 6a (J band), 6b (H band), and 6c (K band), made using data from the first of the two test regions above.
One thing which is a little surprising is that the J-band plot, for the case in which sigma has been corrected using individual PSFIDs, shows a greater spread in sigma values than does the uncorrected plot. The spread is not due to random error, as demonstrated by the plot of sigma as a function of PSF shape factor, for J < 11, in Figure 7, which shows an orderly progression of sigma with shape. Evidently, there is a genuine variation of photometric error with seeing shape which did not get properly represented in the PROPHOT uncertainties.
By contrast, the H-band plot (and, to a slight extent, the K-band plot) shows decreased dispersion of sigma values in the individual-PSF-corrected plot. It appears that for these two bands (particularly H), there were random errors in the variance maps which were alleviated by the sigma correction.
A drawback with the multiplicative form of the correction is that in those cases involving an anomalously high sigma (due, for example, to confusion), the contribution of the anomalous effect would be scaled along with the standard term, leading to an erroneous value of the uncertainty. This could be overcome by the use of an additive, rather than multiplicative, correction. A suitable form for such a correction can be derived as follows:
If, as before, we denote the normalized PSF variance as used in PROPHOT by V_0 and the true variance by V, then we can express the corresponding difference of a-posteriori variances of the flux estimates by:
[(sigma_f)_0]^2 - sigma_f^2 = a (V_0 - V) f^2
.................................................... (2)
where (sigma_f)_0 and sigma_f represent the PROPHOT-quoted and true flux
uncertainties, respectively, f is the source flux, and "a" is a constant.
Since the flux sigma is related to the magnitude sigma, sigma_m, by:
sigma_f / f = 0.4 ln 10 sigma_m
then Equation (2) reduces to:
[(sigma_m)_0]^2 - sigma_m^2 = a (V_0 - V) / (0.4 ln 10)^2
.................................................... (3)
i.e. the correction is magnitude-independent and is constant for a given
PSFID.
The correction may be derived from the cal scan repeatability data as before. Plots of sqrt(sigma^2 - RMS^2) as a function of magnitude for the southern hemisphere are shown in Figures 7a, 7b, and 7c, for J, H, and K, respectively.
In each of these plots, the vertical axis represents the correction which must be subtracted in quadrature from the quoted magnitude sigma to give the true magnitude sigma; various ranges of seeing shape factors, sh, are indicated by solid lines (sh < 1.0), dashed lines (1.0 < sh < 1.1), dash-dot (1.1 < sh < 1.2), and dotted (sh > 1.2). As predicted by Equation (3), the correction is more or less constant as a function of magnitude.
For a given PSFID, the maximum likelihood estimate of the correction is obtained by taking an inverse-variance-weighted average of the quantity (sigma^2 - RMS^2). Since the RMS^2 values are described by a chi square distribution with characteristic width sigma^2/sqrt(N) (where N is the number of samples), the appropriate weighting to use is N/sigma^4.
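The weighted average just described can be sketched as follows. The function name and the per-bin tuple layout are hypothetical; the weights N/sigma^4 are those stated above.

```python
def estimate_correction_sq(bins):
    """Inverse-variance-weighted mean of (sigma^2 - RMS^2) for one PSFID.

    `bins` is an iterable of (sigma, rms, n) tuples, one per magnitude
    bin (hypothetical layout): quoted magnitude sigma, measured RMS
    repeatability, and number of samples. Each bin has weight n / sigma^4.
    """
    num = 0.0
    den = 0.0
    for sigma, rms, n in bins:
        w = n / sigma ** 4
        num += w * (sigma * sigma - rms * rms)
        den += w
    return num / den
```

The returned value is the square of the correction; its square root (when the value is positive) is the quantity to be subtracted in quadrature from the quoted magnitude sigma.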
Plots of the estimated correction as a function of seeing shape factor are shown in Figure 8.
Plots of RMS/sigma for a subset of survey data (same region as Figure 5b) are shown in Figure 9. As before, open circles represent uncorrected values, while the filled circles and crosses represent values corrected on the basis of individual PSFs and the mean correction, respectively.
Plots of the sigma v. magnitude loci are shown in Figures 10a (J band), 10b (H band), and 10c (K band).
Plots of the estimated correction as a function of seeing shape factor for all 5 hardware periods are as follows:
North period 1
North period 2
North period 3
North period 4
South
The values of the corrections, to be subtracted in quadrature from the quoted sigmas, are contained in a set of files which may be accessed from the following table:
| Hardware period | Dates | J | H | K |
| North 1 | 970521 - 980604 | n1/jpsf.cortable | n1/hpsf.cortable | n1/kpsf.cortable |
| North 2 | 980605 - 980916 | n2/jpsf.cortable | n2/hpsf.cortable | n2/kpsf.cortable |
| North 3 | 980919 - 990723 | n3/jpsf.cortable | n3/hpsf.cortable | n3/kpsf.cortable |
| North 4 | 990913 - end | n4/jpsf.cortable | n4/hpsf.cortable | n4/kpsf.cortable |
| South | 980318 - end | s/jpsf.cortable | s/hpsf.cortable | s/kpsf.cortable |
The format of each file is as follows:
(1) The first set of lines represent the PSFIDs for which an individual value of the correction is available; each such line consists of the PSFID followed by the correction value and the bright-star RMS repeatability.
(2) The line following the last PSFID contains the default correction and RMS, designated "All others", applicable to all PSFIDs which are not explicitly listed in the previous lines.
(3) The last line of the file contains the mean correction and RMS (average for all PSFIDs), designated "Mean value". Normally this value would not be used, but it is included in case computational limitations prevent the application of individual PSFID-based corrections.
Not all PSFIDs are represented in the files in Table 1 since not all PSFIDs had adequate repeatability statistics to calculate a reliable correction value. The criterion for inclusion in the table is that the estimation error of the correction factor be less than or equal to 15% of its value. The basis for this is that a statistical analysis of the correction values has shown that if the error associated with a particular PSFID is greater than about 15%, then a more accurate correction can be obtained by defaulting to the average value for the corresponding shape range. Since the PSFs in this category are all at the poor-seeing end of the shape scale, the default correction value (denoted "All others" in the above tables) was obtained from a weighted average of the correction values for shape > 1.05.
The formula for applying the corrections is:
sigma_corrected = sqrt( max[(sigma_prophot^2 - correction^2), (sigma_prophot/3)^2] )
............................................... (4)
The "floor" value to the argument, (sigma_prophot/3)^2, is included to prevent the argument going negative in the case of occasional anomalously low values of sigma_prophot. Its chosen value was based on the lower envelope of the distribution of correction values.
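As an illustration, Equation (4) can be written as the short function below. The function name is ours; the inputs are the quoted PROPHOT magnitude sigma and the correction value for the relevant PSFID.

```python
import math

def correct_sigma_floor3(sigma_prophot, correction):
    """Equation (4): subtract the correction in quadrature, with a
    floor of (sigma_prophot / 3)^2 on the argument of the square root."""
    arg = sigma_prophot ** 2 - correction ** 2
    floor = (sigma_prophot / 3.0) ** 2
    return math.sqrt(max(arg, floor))
```

When the correction exceeds the quoted sigma the argument would be negative, and the function returns sigma_prophot / 3 instead.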
The distribution of corrected sigma values was determined over a limited
magnitude range for the whole sky. For this purpose, a database query
was made about J values of 10 and 12, using the following selection criteria:
rd_flg = '222'
cc_flg = '000'
bl_flg = '111'
dirty = 1
10 < j_m_psf < 10.1 or 12 < j_m_psf < 12.025
Histograms of the uncorrected and corrected sigmas were then made for J
in the above two magnitude ranges, and for H and K in 0.5-mag-wide bins
about the corresponding modal values in those two bands, which were
H = 9.5 & 11.7, and K = 9.4 & 11.6, respectively. The results are
presented in the following plots, which show the uncorrected (PROPHOT) sigma
distributions on the left, and the corrected values on the right:
North 1, J=10 and
North 1, J=12
North 2, J=10 and
North 2, J=12
North 3, J=10 and
North 3, J=12
North 4, J=10 and
North 4, J=12
South, J=10 and
South, J=12
It is of interest to determine the frequency at which the sigma/3 floor kicks in. The following table presents some statistics in that regard:
| Hardware period | # sources at J | Floor-limited fraction, J | Floor-limited fraction, H | Floor-limited fraction, K |
| North 1 | 42168 | 0.0014 (0.0014) | 0.0 (0.0) | 0.0 (0.0) |
| North 2 | 7921 | 0.0 (0.0) | 0.16 (0.035) | 0.23 (0.13) |
| North 3 | 52967 | 3.8e-5 (3.8e-5) | 0.033 (0.015) | 0.0029 (0.0028) |
| North 4 | 78592 | 0.0039 (0.0039) | 2.5e-5 (0.0) | 5.1e-4 (0.0) |
| South | 414200 | 8.1e-4 (7.2e-5) | 0.055 (2.2e-5) | 2.9e-4 (1.2e-5) |
In this table, the results for cases in which an individual PSFID solution was not available (which default to the "All others" value in the ?psf.cortable files) are indicated by parentheses.
The substantial number of floor hits at H and K in northern hardware period 2 indicates that the sigma/3 floor was inappropriate for that period; examination of the data confirms that for that hardware period, the quoted PROPHOT sigmas for bright stars frequently exceeded the RMS repeatability by more than a factor of 3.
A floor value is necessary, however, since without it, the finite width of the distribution of PROPHOT sigma values could still result in occasional negative values of the argument in Equation (4). The fundamental issue is that the correction that we are applying is, strictly speaking, exact only for the ensemble mean; there is an error associated with applying it to individual values of the distribution of sigmas, and the fractional value of this error increases as the correction becomes a larger fraction of the PROPHOT sigma value itself.
A robust solution to the "floor" problem would be to adopt the bright-star RMS repeatability as the floor value instead of sigma/3. In the bright-star limit, the procedure would then be equivalent to the one suggested by M. Skrutskie, in which an excess, derived from the RMS repeatability, is subtracted in quadrature from the PROPHOT sigmas before applying the correction. With this modification, the formula for correcting the sigmas is then:
sigma_corrected = sqrt( max[(sigma_prophot^2 - correction^2), RMS_bright^2] )
............................................... (5)
where RMS_bright is the bright-star RMS repeatability for the particular PSFID,
as listed in the third column of each of the ?psf.cortable files in Table 1.
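A sketch of Equation (5), including the fallback to the "All others" entry for PSFIDs without an individual solution, is given below. The dictionary and tuple used to hold the table contents are hypothetical stand-ins; the actual layout of the ?psf.cortable files is not assumed here.

```python
import math

def correct_sigma(sigma_prophot, psfid, table, default):
    """Equation (5): quadrature subtraction with a bright-star RMS floor.

    table   : dict mapping PSFID -> (correction, rms_bright)
    default : (correction, rms_bright) pair for "All others"
    Both containers are hypothetical representations of one
    ?psf.cortable file, introduced only for this example.
    """
    correction, rms_bright = table.get(psfid, default)
    arg = sigma_prophot ** 2 - correction ** 2
    return math.sqrt(max(arg, rms_bright ** 2))
```

Unlike the sigma/3 floor, the floor here does not scale with the quoted sigma, so a corrected value can never fall below the bright-star repeatability for the PSFID in question.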
Applying this procedure to the all-sky data gives the following
set of sigma distributions:
North 1, J=10 and
North 1, J=12
North 2, J=10 and
North 2, J=12
North 3, J=10 and
North 3, J=12
North 4, J=10 and
North 4, J=12
South, J=10 and
South, J=12
The above corrections were applied to the survey data in a set of 6 x 6 deg
test regions, one for each hardware period. The repeatability of sources
in the overlap regions between scans was then compared with the sigma values
before and after correction. The results are presented in a set of plots
which may be accessed via Table 2 below. In the RMS/sigma v. mag plots,
the open and filled circles represent the values before and after correction,
respectively.
| Hardware period | RA [deg] | Dec [deg] | glon [deg] | glat [deg] | RMS/sigma v. mag | sigma_J v. mag | sigma_H v. mag | sigma_K v. mag |
| North 1 | 273 | 16 | 43 | 15 | X | X | X | X |
| North 2 | 295.5 | 45 | 79 | 11 | X | X | X | X |
| North 3 | 272 | 14 | 41 | 15 | X | X | X | X |
| North 4 | 268 | 5 | 31 | 16 | X | X | X | X |
| South | 285 | -32 | 5 | -16 | X | X | X | X |
A final modification has been made, whereby the default correction for missing PSFs (denoted "All others" in the correction tables) has been changed from a constant value to a smooth extrapolation based on a power-law fit to the data for shape > 1.0. The fitted curves are shown superposed on the plots of correction factors as follows:
North period 1
North period 2
North period 3
North period 4
South
The fact that the power-law fits all show a decrease in the correction with increasing shape factor is consistent with theoretical expectation, since the fundamental cause of the inflated sigmas is the interpolation error during PSF generation, which decreases as the PSFs become smoother, i.e., as the seeing becomes poorer.
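One simple way to obtain such a fit is a linear least-squares fit in log-log space, sketched below. The functional form correction = A * shape^p and the unweighted fit are assumptions made for this example; the text above states only that a power law was fitted for shape > 1.0.

```python
import math

def fit_power_law(shapes, corrections):
    """Fit correction = A * shape**p by least squares on the logarithms.

    Assumptions (not from the source): this two-parameter form, equal
    weights for all points, and strictly positive inputs.
    Returns the pair (A, p).
    """
    xs = [math.log(s) for s in shapes]
    ys = [math.log(c) for c in corrections]
    n = len(xs)
    xm = sum(xs) / n
    ym = sum(ys) / n
    sxx = sum((x - xm) ** 2 for x in xs)
    sxy = sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
    p = sxy / sxx
    a = math.exp(ym - p * xm)
    return a, p
```

A negative fitted exponent p corresponds to a correction that decreases with increasing shape factor.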
The files containing the corrections, with the above modification, are presented in Table 3 below. These files are in a slightly different format from those of Table 1, and include all PSFIDs in the database.
Note that it is not desirable to replace all of the PSFIDs with the power-law fit, since some of the scatter in the correction values can be attributed to quirks in individual PSFs. For this reason, the solutions for individual PSFIDs are still used where available. The only PSFIDs which have been replaced by the power-law extrapolation are those for which either no solution could be obtained, or for which the solution was poor (i.e., an estimation error of more than 15% of the correction value). In these cases, the power-law extrapolation has been applied to the floor values as well as the corrections themselves.
| Hardware period | Dates | J | H | K |
| North 1 | 970521 - 980604 | n1/jpsf.newcortable | n1/hpsf.newcortable | n1/kpsf.newcortable |
| North 2 | 980605 - 980916 | n2/jpsf.newcortable | n2/hpsf.newcortable | n2/kpsf.newcortable |
| North 3 | 980919 - 990723 | n3/jpsf.newcortable | n3/hpsf.newcortable | n3/kpsf.newcortable |
| North 4 | 990913 - end | n4/jpsf.newcortable | n4/hpsf.newcortable | n4/kpsf.newcortable |
| South | 980318 - end | s/jpsf.newcortable | s/hpsf.newcortable | s/kpsf.newcortable |
North period 1
North period 2
North period 3
North period 4
South
Plots of the floor (bright-star RMS value) as a function of shape are as
follows:
North period 1
North period 2
North period 3
North period 4
South
The corresponding sigma histograms for the all-sky data in the limited
magnitude ranges are as follows:
North 1, J=10 and
North 1, J=12
North 2, J=10 and
North 2, J=12
North 3, J=10 and
North 3, J=12
North 4, J=10 and
North 4, J=12
South, J=10 and
South, J=12
Plots of sigma v. magnitude for all of the calibration periods used in
obtaining the sigma corrections are as follows:
| Hardware period | J | H | K |
| North 1 | X | X | X |
| North 2 | X | X | X |
| North 3 | X | X | X |
| North 4 | X | X | X |
| South | X | X | X |
Plots of RMS repeatability (in units of sigma) as a function of magnitude
for the cal scan data, with and without the correction, are as follows:
North period 1
North period 2
North period 3
North period 4
South
As before, the filled and open circles represent the values with and without
the sigma correction, respectively.
Although these plots involve calibration scan data from the same nights used to
derive the corrections, they
represent a much greater quantity of data than the restricted subset used to
derive those corrections. The reason is that the correction calculations
were restricted to those data for which repeated observations of a given
source were made using the same PSFID.
The corresponding repeatability plots for the test regions of survey data
are presented in column 6 of the following table:
| Hardware period | RA [deg] | Dec [deg] | glon [deg] | glat [deg] | RMS/sigma v. mag | sigma_J v. mag | sigma_H v. mag | sigma_K v. mag |
| North 1 | 273 | 16 | 43 | 15 | X | X | X | X |
| North 2 | 295.5 | 45 | 79 | 11 | X | X | X | X |
| North 3 | 272 | 14 | 41 | 15 | X | X | X | X |
| North 4 | 268 | 5 | 31 | 16 | X | X | X | X |
| South | 285 | -32 | 5 | -16 | X | X | X | X |
The above repeatability plots for both the cal scan and survey scan data
show that the sigma corrections produce a substantial improvement in the
validity of the flux error bars. It will also be noted, however, that a
systematic effect is still present, whereby the corrected error bars are
slightly overestimated for bright stars and slightly underestimated for
stars of intermediate magnitude (typically 13th-15th). This is a
second-order effect which is due to
an approximation necessary during the derivation of the correction,
whereby it was assumed that the relative spatial weighting of the pixel
values was magnitude independent, leading to a magnitude-independent
sigma correction.
Although the approximation is valid
over a large fraction of the magnitude range (as evidenced by the relative
constancy of the correction in Figures
7a,
7b, and
7c),
it breaks down in the
"transition zone" between the PSF-dominated and background-dominated regimes.
Specifically what happens is that the correction itself is slowly decreasing
with magnitude as a consequence of the increasing spatial dilution of the
weighting function used in the photometry solution, so that the
necessary correction for faint stars is somewhat less than for bright ones.
Since the derived corrections represent
an average over magnitudes (specifically, with an inverse-variance weighting),
they tend to be underestimated for the bright stars and overestimated for
the faint ones. In the faint-star limit, however, the latter is
inconsequential since the correction represents a small fraction of the
sigma value itself. There is, however, an intermediate magnitude range
over which the effect is manifest.
10. Validation of catalog cmsigs
The distribution of ?_cmsig values in the provisional Point Source Catalog
has been examined by making the following two types of plots for each
hardware period:
(1) In a narrow range of galactic latitude, plot cmsig v. magnitude,
where 30.0 < |glat| < 30.1 for south, and 30.0 < |glat| < 30.5 for north.
The results are presented in the following set of plots, using a
color-scale representation for the density of points:
| Hardware period | J | H | K | Number of sources plotted |
| North 1 | X | X | X | 119615 |
| North 2 | X | X | X | 9072 |
| North 3 | X | X | X | 122495 |
| North 4 | X | X | X | 123695 |
| South | X | X | X | 126497 |
(Note that the plotted sources were restricted to 3-band unblended read2
detections with cc_flg='000'.)
(2) In a narrow range of magnitude, plot cmsig v. PSFID for the whole sky.
Magnitude range: 10.0 < J < 10.1 for south, and 10.0 < J < 10.5 for north.
The results were as follows:
| Hardware period | J | H | K | Number of sources plotted |
| North 1 | X | X | X | 84043 |
| North 2 | X | X | X | 15160 |
| North 3 | X | X | X | 110147 |
| North 4 | X | X | X | 168906 |
| South | X | X | X | 138486 |
In this latter set of plots, the tick marks above the x axis represent the
set of available PSFs. Note also that the positions of plotted points have
been randomly perturbed slightly to give a better visual indication of density.
Last Update - 2002 Oct 2
K. A. Marsh - IPAC