|
Click here for any release note(s) that may apply to this release.
The Catalog Archive Server (CAS) provides access to the object catalogs and
related data from the Sloan Digital Sky Survey (SDSS). It is a companion to the
Data Archive Server (DAS),
which allows users to download raw (FITS) images and spectra from the survey.
The SkyServer provides the Web interface to the CAS.
Data Organization
The SDSS Catalog Data Model
The SDSS catalog data is stored in a commercial relational database management
system (DBMS) - Microsoft's SQL Server. The data is therefore organized
in several 2-dimensional tables. The tables and their relationships to each
other are referred to as the schema in database jargon.
There are 3 different types of data in the tables - imaging data
is in the photo group of tables, spectroscopic and tiling data is
in the spectro tables, and other data such as documentation or other
information about the photo and spectro data, i.e. metadata, is in the
meta tables. Some tables are also created specifically for speef or
convenience, for example the SpecPhotoAll table, which contains a pre-computed
join of relevant fields in the PhotoObjAll and SpecObjAll tables.
The important tables are described below, along with the views that are
currently defined on each table. A view is a subset of the corresponding
table that can be used instead of the table - in other words it is a
virtual table. A view is usually faster than using the base table,
since it only loads a subset of the objects, but more importantly, the views
we have defined on the tables select only the objects that are important for
science, and they filter out non-science objects such as sky, QA or defective
observations. As such, even though we list the base tables for
completeness below, in the vast majority of the cases, you should use the
views defined on the tables instead of the tables themselves, e.g. use
the PhotoObj and SpecObj views for science instead of the PhotoObjAll and
SpecObjAll tables.
BESTDR5, TARGDR5 and other databases
There are two main datasets in the SDSS catalog archive server -
the BEST and TARGET datasets that are contained in separate databases. Each
contains different reprocessings of the same raw data:
- the BEST dataset is contained in the BESTDR5
database, and represents the latest, greatest calibration of the raw data.
BESTDR5 is the default database for queries. Other databases
must be explicitly specified (see below for sample syntax).
- The TARGET dataset is contained in the TARGDR5
database, and represents a snapshot of the data as it was when the target
selection algorithm was run on it, i.e. it is the calibration upon which
the spectroscopic targets were chosen. Preserving this data version
is important for doing science with the data.
Both databases have the identical schema (tables), but different
data. The BESTDR5 database also contains the spectroscopic and
tiling data, whereas the TARGDR5 database only contains imaging
data. The vast majority of queries are run on the BESTDR5
database.
To choose a database other than the default BESTDR5 in your
query, you must specify it as <dbname>..<tablename>, e.g.,
TARGDR5..PhotoObj:
SELECT TOP 10 objid,ra,dec FROM TARGDR5..PhotoObj
Please see the SQL Intro page for further help with
SQL queries.
The Hierarchical Triangular Mesh (HTM)
We have build a spatial indexing scheme called the
Hierarchical Triangular Mesh (HTM) that spatially decomposes the region of the sky
that is covered by the SDSS data and enables much faster spatial searches of the data
by coordinate cuts.
Database Indices
In addition to the HTM, which is an overall indexing scheme for multidimensional spatial
data, the DBMS itself has the capability to define indices for fast searches on each table.
We have defined indices on all the major tables.
An index is a tree representation of a subset of the columns in a table that enables much
faster searches of the table (compared to sequential scans of the table data) when
constraints involving those columns are included in the query. All tables have an index
on their primary key (unique row identifier), but the larger tables have indexes
in addition to the primary key index. In all there are 3 kinds of indices:
- Primary Key Index - index on the unique primary key of a table.
- Foreign Key Index - index on a foreignkey of a table, i.e. a
column that is a primary key of another table.
- Covering Index - an index that covers one or more columns of a
table. This is a combined index on those fields, so it speeds up
searches if any of those fields are included in the WHERE clause.
Click here to view a table of all the current indices
defined on the data.
Imaging (Photo) Data Tables
The important tables and views are described below. For each table, the views
and indices are defined on that table are described in brief. For more
information, please see the Table Descriptions
help page as well as the links to the Glossary
entries that are incorporated in the table listing below.
- PhotoObjAll - By far the largest table in the database, PhotoObjAll contains
the 100+ parameters for each imaging (photo) object. For most of these parameters, there are
actually 5 rows each, one for each wavelength band. This table includes data on all photo
objects, not just science objects, hence the name PhotoObjAll. The view of this
table that includes only science objects and excludes sky and other unknown objects is
the PhotoObj view. The PhotoObjAll table is there for completeness, but science queries
are usually done on the PhotoObj view.
PhotoObjAll Views:
| View Name | Contents | Description |
| PhotoFamily |
These are in PhotoObj, but neither PhotoPrimary or Photosecondary. | These objects are generated if they are neither primary nor secondary survey objects but a composite object that has been deblended or the part of an object that has been deblended wrongfully (like the spiral arms of a galaxy). These objects are kept to track how the deblender is working. It inherits all members of the PhotoObj class. |
| PhotoObj |
All primary and secondary objects in the PhotoObjAll table, which contains all the attributes of each photometric (image) object. | It selects PhotoObj with mode=1 or 2. |
| PhotoPrimary |
These objects are the primary survey objects. | Each physical object on the sky has only one primary object associated with it. Upon subsequent observations secondary objects are generated. Since the survey stripes overlap, there will be secondary objects for over 10% of all primary objects, and in the southern stripes there will be a multitude of secondary objects for each primary (i.e. reobservations). They are defined by the status flag: (PRIMARY & OK_RUN) = 0x2010. |
| PhotoSecondary |
Secondary objects are reobservations of the same primary object. | |
PhotoObjAll Indices:
| Index Type | Key or Field List |
| "> | |