Johns Hopkins University: SDSS

From Devwiki

In recent years Johns Hopkins University has won more Federal research and development funding than any other university. Our customer is a collaborator in the Sloan Digital Sky Survey (SDSS).

Note that another customer, the Space Telescope Science Institute, is located on the same campus.


Objectivity Case History

This page needs expansion. Please update it and remove this banner when all fields are complete.

Important

This information is an archive, so any use of the present tense in the text should be read in its historical context, generally determinable from the Status section below.

Customer Information

Status

  • First Contact: Circa 1998
  • Lead came from:
  • Evaluation Start Date:
  • Evaluation Finish Date:
  • First Purchase Date:
  • Deployment Date: Circa 1999
  • Current Status: Deployed.
  • Can we talk about this customer and the product/project?
  • Referenceable?: Yes, but with care, as Microsoft is gradually taking over the whole campus. They donated extensive resources to build the Virtual Observatory web site using the files from SDSS.

Environment

  • Hardware:
  • Operating System:
  • Precision:
  • Development language:
  • Compiler:
  • Third Party vendor tools:
  • Open Source tools:

The Project/Product

Project Background

The Sloan Digital Sky Survey (SDSS) is a $32M project run by a consortium, led by Johns Hopkins University and Chicago's Fermilab, to produce a new, digital survey of all the objects (stars, galaxies, quasars, etc.) in the sky. The previous survey, now over 50 years old, consists of analog glass photographic plates, which dramatically limit distribution, the amount of information, and its accuracy. SDSS has constructed a new 2.5m telescope based directly on digital imaging (CCD, 4K x 2K chips), which gathers information including luminosity, spectra and spectral intensities in various bands, and variability, for all objects in space. Mapping the northern celestial sphere is expected to take 5 years.

Project/Product Description

Raw observations go into the SDSS database, together with analysis results that collect those observations into identified objects, resulting in a catalog of 100 million (astronomical) objects. The database will be accessible to scholars, and parts of it even to schools and the public, via the Internet. Production software entered test usage in 1996, and the final database is expected to be approximately 40 TB, stored in Objectivity/DB.

The key to using this enormous archive is support for multidimensional indexing, for queries not only in 3D space but also over luminosity in several spectral ranges. To make this efficient, SDSS developed a new indexing algorithm, based on modified quad-trees, for n dimensions. The index partitions all 100M objects into some 40K containers (a clustering device in Objectivity/DB), and coarse-grained maps of the multidimensional space model the contents of those containers. A query first indexes over this coarse-grained map to determine which containers are needed, dramatically reducing the search space and hence query execution time. Measured results show it to be 100x faster than traditional databases.
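The container-pruning idea can be sketched as follows. This is a minimal illustration of the general technique, not the actual SDSS or Objectivity/DB API; the class and field names are invented for the example, and each container is summarized only by its n-dimensional bounding box.

```python
# Sketch of coarse-grained container pruning: summarize each container
# by a bounding box, then scan only the small coarse map per query.
# All names here are illustrative, not the SDSS implementation.

from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[float, float]  # (low, high) along one dimension

@dataclass
class Container:
    """One storage container, summarized by its bounding box in the
    n-dimensional space (e.g. position plus per-band luminosity)."""
    cid: int
    bbox: List[Range]  # one (low, high) pair per dimension

def overlaps(bbox: List[Range], query: List[Range]) -> bool:
    """True if the container's bounding box intersects the query box
    in every dimension."""
    return all(lo <= qhi and qlo <= hi
               for (lo, hi), (qlo, qhi) in zip(bbox, query))

def candidate_containers(coarse_map: List[Container],
                         query: List[Range]) -> List[int]:
    """Scan the coarse map (cheap: ~40K entries) and return only the
    containers that can hold matching objects; all other containers
    are skipped without touching the millions of stored objects."""
    return [c.cid for c in coarse_map if overlaps(c.bbox, query)]

# Example: two containers; the query box intersects only the first.
coarse_map = [
    Container(1, [(0.0, 1.0), (0.0, 1.0)]),
    Container(2, [(5.0, 6.0), (5.0, 6.0)]),
]
print(candidate_containers(coarse_map, [(0.5, 0.8), (0.2, 0.4)]))  # [1]
```

The payoff is that the per-query cost scales with the number of containers (tens of thousands) rather than the number of objects (a hundred million); only the surviving containers are then searched in detail.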

There is extensive information on the usage of the database on the SDSS database details page.

Buying Criteria

Business Priorities

  • Price
  • Company stability

Technical Priorities

  • Scalability
  • Performance with complex data and relationships.
  • DEC Alpha support

Competitors/Alternatives

  • Oracle
  • Versant
  • ObjectStore
  • Later -- Microsoft SQL Server
    • The person responsible for building the SDSS database was a strong supporter of Objectivity/DB. However, after a few years, Jim Gray of Microsoft became interested in the project, and Microsoft donated extensive resources to build a web site (the Virtual Observatory) to showcase the data from SDSS. The project team had built its own parallel search capability on Objectivity/DB and NT. However, the Microsoft team took the single "Event Tags" column, bitmap-indexed it, and performed parallel searches faster than Objectivity/DB could run sequential scans over the same objects (which were, in fact, scattered all over the databases). As a result, the team decided to use SQL Server to perform initial searches for the Virtual Observatory. The data, however, is still collected, analyzed, and maintained in Objectivity/DB. Two positive events were triggered by this episode:
  1. We eventually built a distributed, Parallel Query Engine (PQE).
  2. The (under 1 TB) SQL Server database requires about six Microsoft consultants to maintain. The much larger (over 10 TB) Objectivity federation is a part-time job for a single scientist.
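The bitmap-indexing technique described above can be sketched in miniature. This is a generic illustration of how a single-column bitmap index answers membership queries without scanning the stored objects; the column name and tag values are invented for the example and do not reflect the actual SDSS schema or Microsoft's implementation.

```python
# Sketch of a single-column bitmap index: one bit array per distinct
# tag value, with bit i set iff row i carries that tag. A query then
# ORs/ANDs small bit arrays instead of scanning every object.
# Tag values and names here are illustrative, not the SDSS schema.

from collections import defaultdict

def build_bitmap_index(tags):
    """Map each distinct tag value to an integer used as a bit array."""
    index = defaultdict(int)
    for row, tag in enumerate(tags):
        index[tag] |= 1 << row
    return index

def rows_matching_any(index, wanted):
    """OR together the bitmaps of the wanted tag values, then decode
    the set bits back into row numbers."""
    bits = 0
    for tag in wanted:
        bits |= index.get(tag, 0)
    return [i for i in range(bits.bit_length()) if bits >> i & 1]

tags = ["galaxy", "star", "galaxy", "quasar", "star"]
index = build_bitmap_index(tags)
print(rows_matching_any(index, ["galaxy", "quasar"]))  # [0, 2, 3]
```

Because each bitmap is a dense, compact bit string, the OR/AND operations are cheap and trivially parallelizable across chunks of rows, which is what let bitmap-indexed parallel searches outpace sequential scans over objects scattered across many databases.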

Why They Chose Objectivity

Fermilab and Johns Hopkins evaluated other DBMSs, including RDBMSs and ODBMSs, and chose Objectivity/DB for these reasons:

  • Objectivity's performance in large-scale applications
  • Objectivity's support for platform heterogeneity (they can spread the very large database across multiple hardware from multiple vendors)
  • Objectivity provides features (e.g. caching) that Fermilab would otherwise have had to code itself in the application

Partners

Collateral

  1. Press Releases:
  2. Fliers:
  3. White Papers:
  4. Case Study: Case study page
  5. Other:

Contact Information

  • Objectivity Rep:
  • Customer Contact:
  • Customer Phone:
  • Customer Email:
  • URL:

External Links

Related Pages


Categories