Database Sizes

Size (TB) | Contents | Comments | Source
40 | Sloan Digital Sky Survey Raw Data Volume | Survey to start in 1999, will finish collecting data in 2004. | 7
24 | Walmart's item information | 2 NCR WorldMark 5100M "massively parallel processing server" (see further comment 1) | 1
20 | Contents of Library of Congress | Not digitized. Source 2 says size is 10. | 11
12 | Two Micron All-Sky Survey (2MASS) Raw Data Volume | Survey in progress, will finish collecting data in 2000. | 3
6 | Entire indexable WWW in February 1999 | 800 million pages containing "more than" 6 trillion characters. Size was ~3 in December 1997 and ~2 in January-February 1997. | 8, 10, 11
3.5 | Uncompressed images of Earth distributed over the Web by Microsoft's TerraServer | Database size: 1.01 TB; size of uncompressed images: 3.5 TB; database rows: 173.6 million | 4
~1 | Near-Earth Asteroid Tracking (NEAT) Project | Growing at ~1 TB/year after 6/1998 | 9
0.25 | Hipparcos Satellite Raw Data | | 5
0.1? | Microlensing Raw Data | 6 billion brightness measurements | 6

Further comments:

  1. Walmart just installed an NCR WorldMark 5100M "massively parallel processing server" and upgraded a second NCR 5100M. The combined package will take Walmart's data warehouse from 7.5 TB to more than 24 TB. The system collects and analyzes item information from some 2,900 stores to track buying trends department by department, shelf by shelf and item by item. It handles more than 30 applications and some 50,000 queries per week. "It's the largest commercial computer in the world," reports NCR's John Bloom of Encinitas. Bloom said the upgrade is the second largest order in the local NCR division's history, second only to a US Postal Service point-of-sale contract.
  2. San Diego Supercomputer Center is about to receive the Tera MTA, with 64-128 MB of memory, and 20 TB of disk. (Computerlink, SDUT, 9/30/97, 3.)

Sources:

  1. NCT 2/23/97, D1
  2. PC Mag, 5/6/97, 9.
  3. 2MASS Tape Operations
  4. PC Mag, 7/97, 10; Microsoft's TerraServer
  5. ADC/CDS Standard Document for Catalog:/catalogs/1/1239/
  6. Sky & Telescope, 9/97, 41.
  7. Sky & Telescope, 8/97, 43, 44; A. Szalay talk at AAS 10 June 1998.
  8. Steve Lawrence, C. Lee Giles, Searching the World Wide Web, Science, Volume 280, Number 5360, p. 98 for number of webpages.
  9. Steve Pravdo, email communication 6, 15 June 1998.
  10. Edupage, 15 October 1998, from AP 13 Oct 98. See http://www.alexa.com.
  11. Nature, July 8, 1999, Steve Lawrence and C. Lee Giles.

The claim is that medical imaging archives, and banks that digitize and store images of millions of checks, hold data in the petabyte range (PC Mag, 5/6/97, 9). Data from large physics experiments at particle accelerators may also be this large.
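
As a rough check on the WWW row in the table above, the following Python sketch converts the quoted character count into terabytes. It assumes one byte per character (plain ASCII text) and the decimal terabyte of 10^12 bytes; the variable names are illustrative only, and "more than" 6 trillion characters makes the result a lower bound.

```python
# Figures quoted in the table row for the indexable WWW, February 1999.
pages = 800_000_000
characters = 6_000_000_000_000  # "more than" 6 trillion, so a lower bound

BYTES_PER_CHAR = 1      # assumption: one character stored as ASCII
BYTES_PER_TB = 10**12   # decimal terabyte

size_tb = characters * BYTES_PER_CHAR / BYTES_PER_TB
chars_per_page = characters / pages

print(f"{size_tb:.1f} TB, about {chars_per_page:,.0f} characters per page")
# prints: 6.0 TB, about 7,500 characters per page
```

Under these assumptions the 6 TB figure follows directly from the character count, and the average page works out to about 7,500 characters.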

Sizes of Large Scientific Products

Size (TB) | Contents | Comments | Source
8.6 | 2MASS Image Atlas | Survey in progress; will release all data in 2001 |
1.0 | Sloan Digital Sky Survey Imaging Data | Survey to start in 1999, will finish collecting data in 2004. | 7
0.3 | 2MASS Point Source Catalog | Survey in progress; will release all data in 2001 |
0.25 | Sloan Digital Sky Survey Catalogs | Survey to start in 1999, will finish collecting data in 2004. | 7
0.2 | Hipparcos/Tycho Star Catalog | | 5
0.003 | Human Genome Project | | 4

Need to add Monet's catalog, Guide Star Catalog, Hubble raw data volume, IRAS, MSX.

Names of Some Data Volumes

Unit | Abbreviation | Size (Bytes) | Equivalent
Yottabyte | YB | 10^24 | 10^3 ZB
Zettabyte | ZB | 10^21 | 10^3 EB
Exabyte | EB | 10^18 | 10^3 PB
Petabyte | PB | 10^15 | 10^3 TB
Terabyte | TB | 10^12 | 10^3 GB; 20,000 four-drawer filing cabinets filled with typewritten pages; twice the number of all transactions ever made on the New York Stock Exchange
Gigabyte | GB | 10^9 | 10^3 MB
Megabyte | MB | 10^6 | 10^3 kB; 500-page novel
Kilobyte | kB | 10^3 | 10^3 B; half of one page in a novel
Byte | B | 1 | 8 bits; one character stored as ASCII

Source for units: "Guide for Metric Practice", Physics Today, 8/96, BG16.
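
The table of units can be turned into a small conversion helper. The Python sketch below is only an illustration: the function names to_bytes and humanize are hypothetical, and it uses the decimal (powers of 1000) definitions from the table, not the binary (powers of 1024) convention.

```python
# Unit abbreviations from the table, smallest to largest.
# Each step up is a factor of 10^3.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a value in the given unit to bytes (decimal prefixes)."""
    return value * 1000 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at the largest unit available.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(to_bytes(24, "TB"))    # 24000000000000
print(humanize(3 * 10**9))   # 3 GB
```

For example, the 24 TB Walmart data warehouse comes to 24 x 10^12 bytes, and 3 x 10^9 bytes is 3 GB, the same quantity as the 0.003 TB listed for the Human Genome Project.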


Copyright © 1996-1999 by Tom Chester.
Permission is freely granted to reproduce any or all of this page as long as credit is given to me at this source:
http://sd.znet.com/~schester/facts/database_sizes.html
Comments and feedback: Tom Chester
Last update: 8 July 1999.