DR5 Help
 Archive Intro
 Table Descriptions
 Schema Browser
 Glossary
 Algorithms
 Introduction to SQL
 Sample SQL Queries
 Query Limits
 How To
 FAQ
 Details of the Data
 API
 Download
 SkyServer Sites
 SkyServer Traffic Page
 Web Browsers
 Site News
 Known Problems
 Contact Help Desk
Archive Intro

Click here for any release note(s) that may apply to this release.

The Catalog Archive Server (CAS) provides access to the object catalogs and related data from the Sloan Digital Sky Survey (SDSS). It is a companion to the Data Archive Server (DAS), which allows users to download raw (FITS) images and spectra from the survey. The SkyServer provides the Web interface to the CAS.

Data Organization

The SDSS Catalog Data Model

The SDSS catalog data is stored in a commercial relational database management system (DBMS) - Microsoft's SQL Server. The data is therefore organized in several 2-dimensional tables. The tables and their relationships to each other are referred to as the schema in database jargon.

Diagrammatic view of the DR5 schema

There are 3 different types of data in the tables - imaging data is in the photo group of tables, spectroscopic and tiling data is in the spectro tables, and other data such as documentation or other information about the photo and spectro data, i.e. metadata, is in the meta tables. Some tables are also created specifically for speef or convenience, for example the SpecPhotoAll table, which contains a pre-computed join of relevant fields in the PhotoObjAll and SpecObjAll tables.

The important tables are described below, along with the views that are currently defined on each table. A view is a subset of the corresponding table that can be used instead of the table - in other words it is a virtual table. A view is usually faster than using the base table, since it only loads a subset of the objects, but more importantly, the views we have defined on the tables select only the objects that are important for science, and they filter out non-science objects such as sky, QA or defective observations. As such, even though we list the base tables for completeness below, in the vast majority of the cases, you should use the views defined on the tables instead of the tables themselves, e.g. use the PhotoObj and SpecObj views for science instead of the PhotoObjAll and SpecObjAll tables.

BESTDR5, TARGDR5 and other databases

There are two main datasets in the SDSS catalog archive server - the BEST and TARGET datasets that are contained in separate databases. Each contains different reprocessings of the same raw data:

  • the BEST dataset is contained in the BESTDR5 database, and represents the latest, greatest calibration of the raw data. BESTDR5 is the default database for queries. Other databases must be explicitly specified (see below for sample syntax).
  • The TARGET dataset is contained in the TARGDR5 database, and represents a snapshot of the data as it was when the target selection algorithm was run on it, i.e. it is the calibration upon which the spectroscopic targets were chosen.  Preserving this data version is important for doing science with the data.

Both databases have the identical schema (tables), but different data.  The BESTDR5 database also contains the spectroscopic and tiling data, whereas the TARGDR5 database only contains imaging data.  The vast majority of queries are run on the BESTDR5 database.

To choose a database other than the default BESTDR5 in your query, you must specify it as <dbname>..<tablename>, e.g., TARGDR5..PhotoObj:

	SELECT TOP 10 objid,ra,dec FROM TARGDR5..PhotoObj
Please see the SQL Intro page for further help with SQL queries.

The Hierarchical Triangular Mesh (HTM)

We have build a spatial indexing scheme called the Hierarchical Triangular Mesh (HTM) that spatially decomposes the region of the sky that is covered by the SDSS data and enables much faster spatial searches of the data by coordinate cuts.

Database Indices

In addition to the HTM, which is an overall indexing scheme for multidimensional spatial data, the DBMS itself has the capability to define indices for fast searches on each table. We have defined indices on all the major tables.

An index is a tree representation of a subset of the columns in a table that enables much faster searches of the table (compared to sequential scans of the table data) when constraints involving those columns are included in the query. All tables have an index on their primary key (unique row identifier), but the larger tables have indexes in addition to the primary key index. In all there are 3 kinds of indices:

  • Primary Key Index - index on the unique primary key of a table.
  • Foreign Key Index - index on a foreignkey of a table, i.e. a column that is a primary key of another table.
  • Covering Index - an index that covers one or more columns of a table. This is a combined index on those fields, so it speeds up searches if any of those fields are included in the WHERE clause.
Click here to view a table of all the current indices defined on the data.

Imaging (Photo) Data Tables

The important tables and views are described below. For each table, the views and indices are defined on that table are described in brief. For more information, please see the Table Descriptions help page as well as the links to the Glossary entries that are incorporated in the table listing below.

  1. PhotoObjAll - By far the largest table in the database, PhotoObjAll contains the 100+ parameters for each imaging (photo) object. For most of these parameters, there are actually 5 rows each, one for each wavelength band. This table includes data on all photo objects, not just science objects, hence the name PhotoObjAll. The view of this table that includes only science objects and excludes sky and other unknown objects is the PhotoObj view. The PhotoObjAll table is there for completeness, but science queries are usually done on the PhotoObj view.
    PhotoObjAll Views:

    View NameContentsDescription
    PhotoFamily These are in PhotoObj, but neither PhotoPrimary or Photosecondary. These objects are generated if they are neither primary nor secondary survey objects but a composite object that has been deblended or the part of an object that has been deblended wrongfully (like the spiral arms of a galaxy). These objects are kept to track how the deblender is working. It inherits all members of the PhotoObj class.
    PhotoObj All primary and secondary objects in the PhotoObjAll table, which contains all the attributes of each photometric (image) object. It selects PhotoObj with mode=1 or 2.
    PhotoPrimary These objects are the primary survey objects. Each physical object on the sky has only one primary object associated with it. Upon subsequent observations secondary objects are generated. Since the survey stripes overlap, there will be secondary objects for over 10% of all primary objects, and in the southern stripes there will be a multitude of secondary objects for each primary (i.e. reobservations).

    They are defined by the status flag: (PRIMARY & OK_RUN) = 0x2010.

    PhotoSecondary Secondary objects are reobservations of the same primary object.

    PhotoObjAll Indices:

    Index TypeKey or Field List
    ">