2MASS Data Volume Profile

Roc Cutri - IPAC

Revised 8 August 1997



I. Introduction

For the purposes of projecting storage media and database sizes, it is
useful to estimate the size of the raw and processed 2MASS data sets, and
the rate at which they will grow during the survey.  This memo makes 
that estimate based on the basic implementation of the Survey Strategy, initial 
Northern 2MASS operations, preliminary formats of the survey databases, and 
models and observations of source distributions. These estimates will continue
to be refined as operations continue. 

II.  Basic Assumptions

   i. 2MASS single frame: 
	 Size= 256x256 pixels, scale= 2"/pixel, area= 72.8 sq.arcmin

   ii. Survey Scan:  
	 272 frames, step size = 41.3 pixels (82.6"), 
	 nominal length L = l + d*(N-1) = 6.4 deg.
	    where l = length of one frame (512")
		  d = step size (82.6")
		  N = No. frames in scan

	 6-deep coverage begins 5 steps from ends of scan, 
	 so effective length L' = L - 2*(5 * d) = 6.1 deg. 
	 Note this is 6 degrees plus one full frame.

	 Area = L' * l = 0.87 sq. deg.  (neglects effects of cross-stepping)

   iii. Calibration Scan:
	48 frames, step size = 41.3 pixels (82.6"),

	 L' = 0.99 deg.
	 Area = 0.14 sq. deg.
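
The scan geometry in items ii and iii can be checked with a short sketch. The function name and arguments are illustrative only, and it assumes the same 5-step end trim applies to calibration scans, which is what the quoted L' = 0.99 deg. implies.

```python
ARCSEC_PER_DEG = 3600.0


def scan_geometry(n_frames, frame_arcsec=512.0, step_arcsec=82.6, edge_steps=5):
    """Return (L, L_eff, area) for one scan in deg, deg and sq. deg.

    L     = l + d*(N-1)            nominal scan length
    L_eff = L - 2*(edge_steps*d)   length with full 6-deep coverage
    area  = L_eff * l              neglecting cross-stepping
    """
    nominal = frame_arcsec + step_arcsec * (n_frames - 1)
    effective = nominal - 2 * edge_steps * step_arcsec
    frame_deg = frame_arcsec / ARCSEC_PER_DEG
    return (nominal / ARCSEC_PER_DEG,
            effective / ARCSEC_PER_DEG,
            effective / ARCSEC_PER_DEG * frame_deg)


survey = scan_geometry(272)   # survey scan, 272 frames
calib = scan_geometry(48)     # calibration scan, 48 frames

for name, (length, eff, area) in (("survey", survey), ("calib", calib)):
    print(f"{name}: L = {length:.2f} deg, L' = {eff:.2f} deg, area = {area:.2f} sq.deg")
```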

   iv. Scan rate:

	Frame time ~ 1.3s (R2) + 0.05s (R1) + 0.05s (reset) 
		   ~ 1.455s/frame (with scan overhead)

	Adopting 1.5s/frame:

        Survey scan time      = 272 * 1.5s = 408s
        Calibration scan time =  48 * 1.5s =  72s

	With overheads including slewing and settling times:

	Survey scan time        = 423s (7.05 min)
        6xCalibration scan time = 600s (10 min)

   v. Raw data rate:

	16 bit/pixel raw image frames, 131kB/frame,
	2 frames per position (R1 and R2), and 3 channels/frame.


        Survey scan volume = 213.8 MB 
	Cal. scan volume   =  37.7 MB 

	Scan data rate = 0.51 MB/s = 1820 MB/hr
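
As a rough check of the raw data rate figures, the sketch below rebuilds the scan volumes from the frame size. It uses the memo's rounded 131 kB/frame and decimal megabytes (1 MB = 1000 kB), which is an assumption that reproduces the quoted values. The names are illustrative.

```python
FRAME_KB = 131           # memo's rounded frame size (256*256 pixels * 2 bytes = 131072 B)
READS_PER_POSITION = 2   # R1 and R2
CHANNELS = 3
SURVEY_SCAN_SECONDS = 423.0   # survey scan time including overheads


def scan_volume_mb(n_frames):
    """Raw volume of one scan in MB, assuming 1 MB = 1000 kB."""
    return n_frames * READS_PER_POSITION * CHANNELS * FRAME_KB / 1000.0


survey_mb = scan_volume_mb(272)
calib_mb = scan_volume_mb(48)
rate_mb_per_s = survey_mb / SURVEY_SCAN_SECONDS
rate_mb_per_hr = rate_mb_per_s * 3600.0

print(f"survey scan: {survey_mb:.1f} MB, cal. scan: {calib_mb:.1f} MB")
print(f"scan data rate: {rate_mb_per_s:.2f} MB/s = {rate_mb_per_hr:.0f} MB/hr")
```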

III. Nightly Raw Data Acquisition Profile (assume 10hr night)

   i. 6 calibration sessions. During each session, scan two standard 
      fields 6 times, with ~4-5" cross-stepping between scans.

	 Total calib. time for night = 12 * 10min = 120 min.

	 Calib. Data Volume = 6 * 2 * 6 * 37.7 MB = 2714.4 MB/night

   ii. Survey continuously between calibration sessions; 
       available time ~ 100.4 minutes per inter-calibration period. 

	 Total scans ~ 14 per inter-calibration period.

	 14 * 5 = 70 scans per night.

	 Survey Data Volume = 70 * 213.8 MB = 14966.0 MB/night

	 Survey area = 70 * 0.87 sq.deg = 60.9 sq.deg./night

   iii. Engineering Data:

	    Darks, equiv. of 2 survey scans = 427.6 MB
	    Flats, equiv. of 2 survey scans = 427.6 MB

   iv.  Nightly Summary (10 hr night) 

        Total Raw Data Volume:  18535.6 MB/night
	Survey Areal Coverage:     60.9 sq.deg./night
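
The nightly totals above can be tallied with a sketch like the following. The function and key names are hypothetical, and the engineering data is taken as 4 survey-scan equivalents (2 for darks, 2 for flats).

```python
SURVEY_SCAN_MB = 213.8
CAL_SCAN_MB = 37.7
SURVEY_TILE_SQ_DEG = 0.87


def nightly_raw_budget(cal_sessions=6, fields_per_session=2, scans_per_field=6,
                       survey_scans_per_gap=14, engineering_scan_equiv=4):
    """Tally one 10 hr night of raw data (MB) and survey area (sq. deg)."""
    cal_scans = cal_sessions * fields_per_session * scans_per_field
    # Survey scans fill the gaps between consecutive calibration sessions.
    survey_scans = survey_scans_per_gap * (cal_sessions - 1)
    budget = {
        "cal_mb": cal_scans * CAL_SCAN_MB,
        "survey_mb": survey_scans * SURVEY_SCAN_MB,
        "engineering_mb": engineering_scan_equiv * SURVEY_SCAN_MB,
        "survey_scans": survey_scans,
        "area_sq_deg": survey_scans * SURVEY_TILE_SQ_DEG,
    }
    budget["total_mb"] = (budget["cal_mb"] + budget["survey_mb"]
                          + budget["engineering_mb"])
    return budget


night = nightly_raw_budget()
for key, value in night.items():
    print(f"{key:>15}: {value:.1f}")
```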

IV. Processed Data Volume

   i. Effective Survey Area/Overlap

       Nominal survey area = 59,650 Survey Tiles * 0.87 sq.deg/tile 
			   = 51895.5 sq.deg.

       Hemisphere overlap (1/3 of tiles in overlap DEC band, +18deg):

			   = 2822 tiles / 3 = 941 tiles = 818.7 sq.deg.

       Net overlap of survey = (51895.5 + 818.7)sq.deg. / 41253 sq.deg.
			     = 1.28
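
A minimal sketch of the overlap arithmetic, using only the tile counts quoted above (variable names are illustrative):

```python
FULL_SKY_SQ_DEG = 41253.0
TILE_SQ_DEG = 0.87

survey_tiles = 59650
overlap_band_tiles = 2822

nominal_area = survey_tiles * TILE_SQ_DEG
overlap_tiles = round(overlap_band_tiles / 3)   # one third of the band is duplicated
overlap_area = overlap_tiles * TILE_SQ_DEG
net_overlap = (nominal_area + overlap_area) / FULL_SKY_SQ_DEG

print(f"nominal area  = {nominal_area:.1f} sq.deg")
print(f"overlap tiles = {overlap_tiles}, area = {overlap_area:.1f} sq.deg")
print(f"net overlap   = {net_overlap:.2f}")
```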

   ii. Average Nightly Tile Accumulation (10 hour night)
   
       72 Calibration scans/night (= 12 survey scan volume)
       70 Survey scans/night

       Effective 82 Survey scans/night

   iii. Image Atlas

       22 Atlas images/scan, 512x1024 pixel, 1"/pixel, 32 bit/pixel 

       Volume = [(22 * 2.11 MB/image)  + (1 * 1.68 MB/image)] * 3 bands 
	      = 144.3 MB/scan

       Survey image volume = 70 * 144.3 MB = 10.1 GB/night
       Calib. image volume = 12 * 144.3 MB =  1.7 GB/night

       Total volume = 11.8 GB/night (full resolution)

       20x lossy compressed volume = 0.6 GB/night

       Total Survey Image Atlas Size = 59650 tiles * 143.7 MB/scan    = 8572 GB
       Total Calibration Image Atlas Size ~ 400 nights * 1.7 GB/night =  680 GB

       Lossy Compressed Image Atlas Size ~ 429 GB
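
The Image Atlas figures can be reproduced as sketched below. Note that the survey total above is quoted with 143.7 MB/scan rather than the 144.3 MB/scan derived for the nightly rate; the sketch keeps each value where the memo uses it. Treating the 1.68 MB term as one additional, smaller image per band is an assumed reading of the formula.

```python
BANDS = 3


def atlas_mb_per_scan(full_images=22, full_mb=2.11, extra_images=1, extra_mb=1.68):
    """Atlas volume of one scan in MB, summed over all bands."""
    return (full_images * full_mb + extra_images * extra_mb) * BANDS


per_scan_mb = atlas_mb_per_scan()
survey_gb_night = 70 * per_scan_mb / 1000.0   # 70 survey scans/night
calib_gb_night = 12 * per_scan_mb / 1000.0    # 12 survey-scan equivalents/night
total_gb_night = survey_gb_night + calib_gb_night
compressed_gb_night = total_gb_night / 20.0   # 20x lossy compression

# Survey totals, using the 143.7 MB/scan value quoted on the total line.
survey_atlas_gb = 59650 * 143.7 / 1000.0
calib_atlas_gb = 400 * 1.7
compressed_atlas_gb = survey_atlas_gb / 20.0

print(f"per scan: {per_scan_mb:.1f} MB; nightly: {total_gb_night:.1f} GB "
      f"({compressed_gb_night:.1f} GB compressed)")
print(f"survey atlas: {survey_atlas_gb:.0f} GB, calibration atlas: {calib_atlas_gb:.0f} GB, "
      f"compressed survey atlas: {compressed_atlas_gb:.0f} GB")
```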

   iv. Point Source Database Volume

       Predicted source counts for full sky:
                         down to survey limits   ~ 5e8   (Full Database)
                         down to 10-sigma limits ~ 3.4e8 (Catalog)

			 (Based on Jarrett model, confirmed from ProtoCam
			  source counts.  Database contains data for all sources
                          down to detection limit.  Catalog is assumed
			  to cut off at Survey SNR=10 limits.)

       Total number point sources expected in DB:
			 ~ 5e8 * 1.28 = 6.4e8  (Full Database)

			 (Database contains all good scan data including 
			  tile overlaps.  Assumes that only ONE version of 
			  each tile coverage will be loaded into DB.  If 
			  multiple observations loaded, total source counts
			  will increase.)

       Average src. density over sky:
       
			=   5e8/41253 = 12120 srcs/sq.deg  (Full Database)
			= 3.4e8/41253 =  8242 srcs/sq.deg. (Catalog)

       Average number of sources per Survey tile:
       
	   =   (5e8 srcs/41253 sq.deg.) * 0.87 sq.deg. = 10544   (Full Database)
	   = (3.4e8 srcs/41253 sq.deg.) * 0.87 sq.deg. =  7171   (Catalog)

       Average number of sources per Calibration tile:

           =   (5e8 srcs/41253 sq.deg.) * 0.14 sq.deg. = 1697   (Full Database)
           = (3.4e8 srcs/41253 sq.deg.) * 0.14 sq.deg. = 1154   (Catalog)
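
The density and per-tile numbers follow directly from the predicted full-sky counts. A sketch with illustrative names is given below; the results agree with the values above to within one source, the small differences being rounding.

```python
FULL_SKY_SQ_DEG = 41253.0
PREDICTED_COUNTS = {"database": 5e8, "catalog": 3.4e8}
TILE_SQ_DEG = {"survey": 0.87, "calibration": 0.14}


def mean_density(n_sources):
    """Average sources per sq. deg over the full sky."""
    return n_sources / FULL_SKY_SQ_DEG


def sources_per_tile(n_sources, tile_area):
    """Average sources falling in one tile of the given area (sq. deg)."""
    return mean_density(n_sources) * tile_area


for product, count in PREDICTED_COUNTS.items():
    print(f"{product}: {mean_density(count):.0f} srcs/sq.deg")
    for tile, area in TILE_SQ_DEG.items():
        print(f"  per {tile} tile: {sources_per_tile(count, area):.1f}")
```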

       Point Source record size: 500B/record ASCII
				 473B/record in db (Informix, w/indices)

       Average Point Source DB growth rate      = 4.99 MB/scan
						= 349.3 MB/night
       Average Point Source Cat. growth rate    = 3.39 MB/scan 
						= 237.4 MB/night

       Average Cal. Point Src. DB growth rate   = 0.80 MB/scan
						= 57.8 MB/night
       Average Cal. Point Src. Cat. growth rate = 0.55 MB/scan
						= 39.3 MB/night


       Total Point Source DB Volume      = 6.4e8 * 473B = 303 GB
       Total Point Source Catalog Volume = 3.4e8 * 473B = 161 GB

	   (Again, this assumes only one observation of each tile
	    is included in the Database.  Multiple observations will
	    increase DB size.)
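
The growth rates and totals above are per-tile source counts multiplied by the 473 B record size. The sketch below takes the per-tile counts from the memo as inputs and assumes decimal units (1 MB = 1e6 B). Nightly values can differ from the memo in the last digit, because the memo multiplies the rounded per-scan figure.

```python
RECORD_BYTES = 473   # point source record in the DB, with indices


def growth_mb(srcs_per_scan, scans_per_night):
    """Return (MB per scan, MB per night) for a given source count per scan."""
    per_scan = srcs_per_scan * RECORD_BYTES / 1e6
    return per_scan, per_scan * scans_per_night


def total_gb(n_sources):
    """Total volume in GB for n_sources records."""
    return n_sources * RECORD_BYTES / 1e9


db_scan, db_night = growth_mb(10544, 70)        # survey, full database
cat_scan, cat_night = growth_mb(7171, 70)       # survey, catalog
cal_db_scan, cal_db_night = growth_mb(1697, 72)     # calibration, full database
cal_cat_scan, cal_cat_night = growth_mb(1154, 72)   # calibration, catalog

db_total_gb = total_gb(6.4e8)    # includes the 1.28 overlap factor
cat_total_gb = total_gb(3.4e8)

print(f"survey DB:  {db_scan:.2f} MB/scan, {db_night:.1f} MB/night")
print(f"survey cat: {cat_scan:.2f} MB/scan, {cat_night:.1f} MB/night")
print(f"totals: DB {db_total_gb:.0f} GB, catalog {cat_total_gb:.0f} GB")
```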

   v. Extended Source Database Volume

      Average number of extended source candidates (including LCSB's):
      
	                            = 160 / scan    (Full Database)
				    = 184 / sq.deg.
	                            =  85 / scan    (Catalog)
				    =  98 / sq.deg.

			(Database assumed to contain all candidates down
			 to detection limit.  Catalog assumed to contain
			 only extended sources brighter than SNR 10 limit.
			 Numbers based on early 3-channel camera observations.
			 Note that clusters will have higher source counts,
			 but high source density regions will have lower.)


      Extended Src. record size: 2329B/record ASCII
	  		         1469B/record in DB (Informix, w/indices)

      Extended Src. Survey DB growth rate   =  0.24 MB/scan
					    = 16.45 MB/night
      Extended Src. Calibration growth rate =  0.04 MB/scan
					    =  2.65 MB/night

     
      Total Extended Source DB Volume       = 60,591 scans * 0.24MB/scan   
					    = 14.54 GB
      Total Extended Source Catalog Volume  = 41253 sq.deg. * 98/sq.deg * 1469B
					    =  5.94 GB
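
A sketch of the extended source survey arithmetic follows. As in the memo, the total DB volume is computed from the per-scan rate rounded to 0.24 MB; using the unrounded rate would give a slightly smaller total. Variable names are illustrative.

```python
RECORD_BYTES = 1469        # extended source record in the DB, with indices
TILE_SQ_DEG = 0.87
FULL_SKY_SQ_DEG = 41253

db_srcs_per_scan = 160
cat_srcs_per_scan = 85

db_density = db_srcs_per_scan / TILE_SQ_DEG      # srcs per sq. deg, full database
cat_density = cat_srcs_per_scan / TILE_SQ_DEG    # srcs per sq. deg, catalog

db_per_scan_mb = db_srcs_per_scan * RECORD_BYTES / 1e6
db_per_night_mb = db_per_scan_mb * 70            # 70 survey scans/night

# 60,591 scans = 59,650 survey tiles + 941 overlap tiles
total_db_gb = 60591 * round(db_per_scan_mb, 2) / 1000.0
total_cat_gb = FULL_SKY_SQ_DEG * round(cat_density) * RECORD_BYTES / 1e9

print(f"densities: {db_density:.0f} and {cat_density:.0f} per sq.deg")
print(f"growth: {db_per_scan_mb:.2f} MB/scan, {db_per_night_mb:.2f} MB/night")
print(f"totals: DB {total_db_gb:.2f} GB, catalog {total_cat_gb:.2f} GB")
```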

   vi.  Extended Source Postage Stamp Volume

       "Standard" postage stamp size 101x101 pixels.  Assume postage stamps
       saved only for SNR>10 candidates.

       Postage stamp volume = 101x101 pix * 32 bit/pix * 3 ch = 0.122 MB

       Postage stamp growth rate = 85 srcs/scan * 0.122 MB = 10.37 MB/scan
				 = 10.37 MB/scan * 70 scans/night = 725.9 MB/night

       Total postage stamp volume = 41253 sq.deg. * 98 srcs/sq.deg. * 0.122MB
				  = 493 GB
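
The postage stamp volumes can be sketched as follows, assuming 4 bytes per 32-bit pixel and the stamp size rounded to 0.122 MB as in the memo (names are illustrative):

```python
STAMP_PIXELS = 101 * 101
BYTES_PER_PIXEL = 4      # 32 bit/pixel
CHANNELS = 3

stamp_bytes = STAMP_PIXELS * BYTES_PER_PIXEL * CHANNELS
stamp_mb = round(stamp_bytes / 1e6, 3)   # memo works with the rounded value

per_scan_mb = 85 * stamp_mb              # 85 SNR>10 candidates per scan
per_night_mb = per_scan_mb * 70          # 70 survey scans per night
total_gb = 41253 * 98 * stamp_mb / 1000.0   # full sky at 98 srcs/sq.deg

print(f"stamp: {stamp_bytes} B = {stamp_mb} MB")
print(f"growth: {per_scan_mb:.1f} MB/scan, {per_night_mb:.1f} MB/night")
print(f"total: {total_gb:.0f} GB")
```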
				  

   vii. Total DB Growth Rate

      a. Nightly per telescope
                                        MB/night
                               Survey    Calib.     Total
			   ---------------------------------------
	 Atlas Images:       10100.0    1700.0    11800.0

	 Src Lists:  Pt.:      349.3      57.8      407.1
                    Ext.:       16.5       2.7       19.2 
                   Total:      365.8      60.5      426.3

         Ext. Src. Images:     725.9       ...      725.9

      b. Semi-Annually

	 Assume 30% photometric weather for north, 50% photometric for south.

	    Integrated period = 178 nights * 0.3 = 53.4 nights north
				178 nights * 0.5 = 89.0 nights south

	 Assume operations begin 4/97 in north, 1/98 in south.


                          Total DB Volume (GB) *
         Period          Atlas    Atlas   Source  Post.Stamp
                                20x comp.
         ------------------------------------------------------

         4/97-9/97         630      32       23         39
        10/97-3/98        1660      83       60        102
         4/98-9/98        3340     167      121        206
        10/98-3/99        5020     251      182        309
         4/99-9/99        6700     335      242        412
        10/99-3/00        8380     419      303        516
         4/00-9/00       10060     503      363        619

	 * These growth estimates are based solely on nightly data rates
	 and expected photometric fractions.  They provide a guideline
	 on how the data holdings should grow with time, independent of
	 sky coverage, repeat observations, etc.
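
A sketch of how the table entries scale with photometric nights is given below. It uses the nightly totals from this section (11800 MB atlas, 426.3 MB source lists) and the 725.9 MB/night postage stamp rate from section vi. It reproduces the first row (north only) and the increment between later rows (both sites for a full period). The 10/97-3/98 row depends on how the partial southern period is counted and is not modeled here. All names are illustrative.

```python
NIGHTS_PER_PERIOD = 178
PHOTOMETRIC_FRACTION = {"north": 0.3, "south": 0.5}
NIGHTLY_MB = {"atlas": 11800.0, "sources": 426.3, "stamps": 725.9}


def usable_nights(site):
    """Photometric nights per semi-annual period at one site."""
    return NIGHTS_PER_PERIOD * PHOTOMETRIC_FRACTION[site]


def volume_gb(nights, product):
    """Data volume in GB accumulated over the given number of nights."""
    return nights * NIGHTLY_MB[product] / 1000.0


# First period: northern telescope only.
first_period = {p: volume_gb(usable_nights("north"), p) for p in NIGHTLY_MB}
first_period["atlas_20x"] = first_period["atlas"] / 20.0

# Growth per full period once both telescopes are operating.
both_sites_nights = usable_nights("north") + usable_nights("south")
atlas_increment_gb = volume_gb(both_sites_nights, "atlas")

print({k: round(v) for k, v in first_period.items()})
print(f"atlas growth per full period: {atlas_increment_gb:.0f} GB")
```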


   viii. Survey Data Products Size Estimates *

	 Image Atlas                     =  8572 GB  (Assumes 1x per tile)
	 Compressed Image Atlas          =   429 GB

	 Point Source DB                 >=  303 GB  (x repeat factor)
	 Point Source Catalog            =   161 GB
	 Point Source "Shortform"        ~    26 GB  (minimum information)

	 Extended Source DB              >=   15 GB
	 Extended Source Catalog          =    6 GB
	 Extended Source "Shortform"      ~  0.3 GB
	 Extended Source Snapshots        =  493 GB

	 * These volume estimates are based on observed source densities.
	 The DB size estimates take into account tile overlaps in the survey,
	 but they do not take into account repeat observations and archiving
	 of tiles.  They are, therefore, lower limits to the size.

V. Tape Volume Requirements


   Note that this section has been superseded by the 
    2MASS Tape Operations Plan of 14 August 1996.