Performance of CLEAN as applied to WIRE Data





1. Procedure

Forty frames of dithered, simulated 12 micron data with a strong galaxy-galaxy correlation were produced. These frames were processed by the data pipeline and coadded by the pipeline coadder. Sources were then extracted from the coadded image using the source extractor "wdaophot" and matched to the truth list using "match4" in order to provide a comparison data set.

The CLEAN image was derived from the output of the survey coadder. Using IRAF/DAOPHOT, a PSF was derived for the coadded frame from the brightest isolated stars in the coadd. This PSF was then applied to the coadded image using SCLEAN. The algorithm was allowed to iterate 5000 times using a gain of 0.02. The CLEAN map was convolved with a 2-pixel FWHM (15.47") Gaussian beam and added back to the residual image in order to produce the final CLEAN image. For the source extraction, the standalone version of DAOPHOT II was used to construct a new PSF equivalent to the clean beam in the final CLEANed image (mostly for compatibility with the source extractor). This new PSF and the CLEANed image were then fed into "wiredao" for extraction. The resulting source list was matched to the truth list using "match4". The results are shown below. For anyone who is interested, the outputs of stats7 and stplot2 are also available.
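For readers unfamiliar with the procedure, the following is a minimal sketch of an image-plane (Hogbom-style) CLEAN loop followed by the restoration step described above. It is not the SCLEAN implementation, whose internals are not described here. The function names are hypothetical, the PSF is assumed to be a square, odd-sized array with its peak at the centre, and the restoring beam is assumed to be peak-normalised. The default gain, iteration count, and beam FWHM are the values quoted in the text.

```python
import numpy as np

def gaussian_beam(size, fwhm):
    """Peak-normalised circular Gaussian on a size x size grid (size odd)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    r = np.arange(size) - size // 2
    yy, xx = np.meshgrid(r, r, indexing="ij")
    return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))

def add_shifted(target, kernel, y, x, scale):
    """Add scale * kernel to target in place, kernel centre placed at (y, x)."""
    half = kernel.shape[0] // 2
    ny, nx = target.shape
    # Clip the kernel footprint to the image boundaries.
    y0, y1 = max(y - half, 0), min(y + half + 1, ny)
    x0, x1 = max(x - half, 0), min(x + half + 1, nx)
    ky0, kx0 = y0 - (y - half), x0 - (x - half)
    target[y0:y1, x0:x1] += scale * kernel[ky0:ky0 + (y1 - y0),
                                           kx0:kx0 + (x1 - x0)]

def hogbom_clean(image, psf, gain=0.02, n_iter=5000, beam_fwhm=2.0):
    """Return (restored image, CLEAN component map, residual image)."""
    residual = np.array(image, dtype=float)
    psf = np.asarray(psf, dtype=float) / np.max(psf)  # unit peak (assumption)
    components = np.zeros_like(residual)
    for _ in range(n_iter):
        # Locate the brightest residual pixel.
        y, x = np.unravel_index(np.argmax(residual), residual.shape)
        peak = residual[y, x]
        if peak <= 0.0:
            break  # nothing positive left to remove
        # Move a fraction `gain` of the peak into the component map
        # and subtract the correspondingly scaled PSF from the residual.
        components[y, x] += gain * peak
        add_shifted(residual, psf, y, x, -gain * peak)
    # Restore: place a clean beam at each component, on top of the residual.
    beam = gaussian_beam(psf.shape[0], beam_fwhm)
    restored = residual.copy()
    for y, x in zip(*np.nonzero(components)):
        add_shifted(restored, beam, y, x, components[y, x])
    return restored, components, residual
```

With a small gain such as 0.02 each iteration removes only 2% of the current peak, so a fixed iteration budget effectively sets how deep into the faint sources and noise the algorithm digs.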

2. Source Detection

The source extractor detects 447 real sources in the CLEAN image, as opposed to 257 in the raw image. The following two graphs are instructive:



On the left is the differential completeness per flux, where completeness is defined as (# true sources matched)/(# true sources present). Both images are complete to approximately 1 mJy, but the CLEANed image is much more complete below that level.

On the right is the differential "effectiveness" of each method. The effectiveness is defined as:


                          (# true matches)  -  (# false detections)
        effectiveness =   -----------------------------------------
                                   # true sources present




A perfect extraction (all targets detected with no false detections) will have an effectiveness of 1. If the effectiveness reaches zero or becomes negative, the detections are no longer reliable or even useful, since at least as many false detections are being made as real ones (this is equivalent to the hitting effectiveness in volleyball and other sports). We can see that the CLEANed image is still more effective than the raw image for source detection, although not by as much as the completeness would lead one to believe. This is because CLEAN is producing many more false sources at low flux levels than are detected in the raw coadd. This leads us to the reliability statistic, defined as the percentage of detections that are actually real detections:



Here we see that the reliability of the CLEANed image is lower than that of the raw coadd for flux levels below about 1 mJy, and then falls to unacceptable levels by 0.5 mJy. This is an unfortunate trade-off: while the CLEANed image is far more complete at these flux levels, in the course of finding the additional sources the source extractor finds many artifacts as well. At such low flux levels this is an almost predictable behavior of CLEAN.

The most common failing of the CLEAN algorithm is that it reconstructs diffuse, extended emission in the only way it can: as a set of point sources. Given the low spatial resolution, the large numbers of faint objects, and the noisy images, the very faintest objects will appear similar to a diffuse background, which CLEAN then reconstructs into a field of points, most of which are real but many of which are not. In particular, problems arise if two points are close enough to produce only a single, extended peak. For a noiseless image this corresponds to the Rayleigh limit; the presence of noise widens the separation at which this occurs. CLEAN will necessarily reconstruct such an object as a single point with extended emission; only injection of an image prior (which we don't have) will solve this problem. Initial experiments with altering the CLEAN parameters (gain, # of iterations) have shown that, as expected, one trades completeness for reliability: reliability can be increased by decreasing the number of iterations and thereby decreasing the artifacting, but there is a corresponding drop in completeness. Additional tuning will be necessary to evaluate the true performance of the algorithm.
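To make the three statistics used in this section concrete, the sketch below computes differential completeness, effectiveness, and reliability per flux bin from a matched source list. It is not the code behind stats7 or match4. The function name and argument layout are hypothetical, and binning false detections by their measured flux (while true sources are binned by their true flux) is an assumption.

```python
import numpy as np

def detection_stats(true_flux, true_matched, det_flux, det_is_real, bin_edges):
    """Per-bin completeness, effectiveness and reliability.

    true_flux    : fluxes of all sources in the truth list
    true_matched : True where a truth-list source was matched to a detection
    det_flux     : measured fluxes of all detections
    det_is_real  : True where a detection matched a truth-list source
    bin_edges    : flux bin edges, in the same units as the fluxes
    """
    true_flux = np.asarray(true_flux, dtype=float)
    true_matched = np.asarray(true_matched, dtype=bool)
    det_flux = np.asarray(det_flux, dtype=float)
    det_is_real = np.asarray(det_is_real, dtype=bool)

    n_true, _ = np.histogram(true_flux, bins=bin_edges)
    n_match, _ = np.histogram(true_flux[true_matched], bins=bin_edges)
    n_det, _ = np.histogram(det_flux, bins=bin_edges)
    n_false, _ = np.histogram(det_flux[~det_is_real], bins=bin_edges)

    # Bins with no true sources (or no detections) come out non-finite.
    with np.errstate(divide="ignore", invalid="ignore"):
        completeness = n_match / n_true
        effectiveness = (n_match - n_false) / n_true
        reliability = (n_det - n_false) / n_det
    return completeness, effectiveness, reliability
```

A bin in which false detections outnumber true matches yields a negative effectiveness, which is the regime described above as no longer useful.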

3. Photometric Accuracy

The next figure presents the photometric accuracy of the true extracted sources. The actual zeropoint of the magnitude difference is arbitrary, since in both cases a constant was added to offset the input fluxes to the measured simulated fluxes. The accuracy of the fluxes derived from the CLEAN image is higher than that of the fluxes derived from the raw coadd.



DAOPHOT appears to be consistently overestimating the brightness of the faint sources. This is particularly apparent in the raw coadded frame. If we assume that the bright sources have been extracted accurately (which seems likely, given the low scatter), then DAOPHOT has overestimated the brightness of every detection. Joe's work with the truthlist has indicated that this is probably a result of blending; since the galaxy-galaxy correlation is strong in this simulation, it is likely that at low flux levels DAOPHOT is measuring flux from more than one galaxy inside the apertures. In the CLEANed image the scatter is somewhat more random and considerably tighter (sigma=0.30 vs. 0.35), probably because CLEAN helps clear up some of the confusion. At flux levels of 0.5 mJy the uncertainty seems to be around 0.2 magnitudes.
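As a rough illustration of how blending biases the photometry, the sketch below computes the magnitude offset that results when flux from neighbouring galaxies falls inside the same aperture as the target. The function name is hypothetical, and the simplification that all of the neighbour flux is captured is an assumption; a real aperture or PSF fit would include only part of it.

```python
import math

def blend_magnitude_offset(target_flux, neighbour_fluxes):
    """Magnitude offset (measured minus true) caused by blended neighbours.

    Assumes every neighbour's full flux is added to the target's flux.
    Negative values mean the source is measured as brighter than it is.
    """
    total = target_flux + sum(neighbour_fluxes)
    return -2.5 * math.log10(total / target_flux)
```

For example, with illustrative values of a 0.5 mJy target and a single 0.1 mJy neighbour, `blend_magnitude_offset(0.5, [0.1])` is about -0.2 magnitudes, comparable to the uncertainty quoted above at that flux level.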

4. Positional Accuracy

Both methods produce similar results for positional accuracy. The average uncertainty in position is about 3" at 0.5 mJy for both methods. Given that this is 1/5 of a WIRE detector pixel, it is unlikely that much better can be achieved.