Clastro

An interstellar objects sorter

clastro_collage.png

Clastro is an automatic sorter of interstellar objects from the Sloan Digital Sky Survey DB using machine learning algorithms.

Part command line, part GUI program, it aids astronomers on the inspection and catalogation of huge amounts of data from the telescopes, in this case from the SDSS catalog (30TB of spectroscopic and photometric data). The program reads raw spectra data, spectra information, photometric information, redshift values, etc, and builds trained classifiers. Those classifiers can use a list of features to classify unknown interstellar objects into quasars or galaxies.

The program uses several algorithms. For feature extraction and transformations:

  • Douglas-Peucker algorithm or Savitzky Golay filter for smoothing the curves
  • Cubic Spline interpolation or Support Vector Regression for the continuum extraction

For classification:

  • Support Vector Machines
  • K-Nearest Neighbour

Role

SW designer and developer in a team of 4, as of 2012.

Features

  • Fast connection to the SDSS database and complex search queries
  • Private object cache
  • Lazy calculation method of all the needed data for the feature extraction
  • Key algorithms for feature extraction and data preparation
  • Machine learning algorithms to classify objects by their features
  • Cross validation algorithm has been used to evaluate our classifiers and estimate how accurately they work
  • Import & export features for set of data and single interstellar objects
  • Modular design
  • Test Driven Development

Repository

Clastro is in process of being released as free software, and it continues to be developed at WWU Muenster. Meanwhile you can read a report of its state as of Sept 2012 here.

github.com/ClastroLabs

Technologies used

Java, SQL, JTest, Scrumm