terascale uv astronomy with galex

Our work on the massive data set produced by the Galaxy Evolution Explorer (GALEX) telescope brings many of our interests together. It requires us to use interesting statistical methods, perform archaeological work on legacy scientific software, and find tricky solutions to programming problems. Most of all, it gives us opportunities to make efficient, beautiful tools people can use to increase the scientific return on existing data.


GLCat (the GALEX Legacy Catalog) is an upcoming catalog of every unique source observed by GALEX. It will be the largest UV sky catalog ever developed. GLCat is funded by the NASA Astrophysics Data Analysis Program (ADAP) and builds on our previous work with GFCat and gPhoton (see below). Full-scale work on GLCat will begin in Q3 2021.

proposal document


GFCat (the GALEX Flare Catalog) is a catalog of sources observed by GALEX that vary over short timescales (less than about 30 minutes, the length of a full GALEX 'visit'). The very high temporal resolution of GFCat gives us, and other astronomers, the opportunity to search for phenomena that have rarely been observed in stars other than our own Sun. In particular, the contrast between stellar flares and their M dwarf host stars is nearly maximized in the GALEX bandpass, so the data are particularly sensitive to lower-energy flares, which are not obscured by the host star. M dwarfs are prime targets in the search for habitable worlds. Stellar flares are expected to play some role in planet habitability, and the rate of the relatively common low-energy flares might be key. Other fast variable phenomena observable with GALEX include binary star eclipses, quasi-periodic pulsations, and white dwarf pulsations. These variables were serendipitously observed by GALEX and could not be systematically discovered without gPhoton (see below). The catalog, at 2-minute time resolution, and example software workflows are currently available on GitHub.
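To give a rough sense of what searching for short-timescale variability at 2-minute resolution can look like, here is a minimal sketch. It is not the GFCat pipeline: the function names, the synthetic photon times, and the simple median-plus-Poisson threshold are all assumptions chosen for illustration.

```python
from statistics import median


def bin_photons(times, t0, t1, bin_s=120.0):
    """Count photon events in fixed-width bins between t0 and t1 (seconds)."""
    n_bins = int((t1 - t0) // bin_s)
    counts = [0] * n_bins
    for t in times:
        i = int((t - t0) // bin_s)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def flag_bright_bins(counts, n_sigma=3.0):
    """Return indices of bins whose counts exceed the median by n_sigma Poisson sigmas."""
    base = median(counts)
    # Hypothetical threshold: Poisson noise on the median count rate.
    threshold = base + n_sigma * max(base, 1.0) ** 0.5
    return [i for i, c in enumerate(counts) if c > threshold]


# Synthetic 30-minute "visit": 100 photons per 2-minute bin, plus a burst in bin 7.
times = [k * 120 + j * 1.2 for k in range(15) for j in range(100)]
times += [7 * 120 + 0.5 + j for j in range(80)]

counts = bin_photons(times, 0.0, 1800.0)
flagged = flag_bright_bins(counts)
```

A real search would also need to account for background, detector effects, and exposure time within each bin, none of which this sketch attempts.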

AAS poster on variable source detection || GitHub repo


gPhoton is a Python software toolkit that opens up GALEX's short time domain capabilities for the first time. An implementation of the gPhoton database has been publicly deployed at the Mikulski Archive for Space Telescopes (MAST) at the Space Telescope Science Institute (STScI). It makes the ~1.7 trillion individual photons observed in “direct” imaging mode by that spacecraft during the NASA-funded phases of the mission available for fast, on-demand analysis of brief, time-variable astrophysical phenomena with simple command-line tools. The source code is available on GitHub.

GitHub repo || ApJ article || Sky & Telescope

data usability and preservation

Scientific data are precious but difficult to distribute and preserve. Even in domains like planetary science that boast excellent central archives like the Planetary Data System (PDS), older data can suffer from bitrot and obsolescence, creating impassable barriers for new researchers who want to draw on our collective scientific legacy. Million Concepts is committed to finding ways to make data, both new and old, easily and durably accessible to as many people as possible.

pdr: the planetary data reader

Planetary data are complex, scattered, and varied. We like that sort of thing, but it is not to everyone's taste. Reusable tools that make data analysis-ready in standard workflows minimize wasted effort and maximize the return on data in any domain. This is particularly important in fields like planetary science, in which most observational data are produced by one-off, bespoke instruments on missions that were mounted at tremendous expense and are unlikely to be repeated. pdr is a project intended to solve the user-facing end of some data access problems in a simple but expansive way. It is a tool designed to ingest every planetary data type with a single command: pdr.read(). It is currently in open beta and available on GitHub. We would very much appreciate feedback, requests, and bug reports.
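As a minimal sketch of the single-command workflow, the helper below wraps pdr.read(). The helper name summarize_product and the file path in the usage note are hypothetical, and the sketch assumes the object returned by pdr.read() offers a dict-like keys() method; check the pdr documentation on GitHub for the interface of your installed version.

```python
def summarize_product(path):
    """Open a planetary data product with pdr and list the objects it holds."""
    # Imported lazily so this sketch can be loaded even where pdr is not installed.
    import pdr

    # pdr.read() is the single entry point named in the text above.
    data = pdr.read(path)
    # Assumption: the returned object exposes keys() naming the objects it found.
    return list(data.keys())
```

With pdr installed, a call such as summarize_product("path/to/product.lbl") (a placeholder path) would return the names of whatever objects pdr found in that product.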

GitHub repo

chang'e 2 mrm analysis

We are working on a project to recalibrate data from the Microwave Radiometer (MRM) on the Chinese Lunar Exploration Program's Chang'e 2 orbiter. The outputs of this project include not only these recalibrated data but also new maps of the dielectric and thermal characteristics of the Moon. Work products will be archived in the PDS. The project formally began in January 2021, and we hope to release preliminary data within a year.

proposal document

ch-1 m3 and clementine data conversion

In late 2020 and early 2021, we produced two archives containing new versions of CH-1 M3 and Clementine data.

We are pleased to help make these crucial data more usable and maintainable. These projects also gave us the opportunity to test several new pieces of software, including alpha versions of pdr and its submodule pdr.converter. The data bundles are currently in peer review at the PDS, but more information, preliminary documentation, and conversion software are available on the project page and in the GitHub repos linked below.

reflections and videos || M3 GitHub repo || Clementine GitHub repo || LPSC abstract

ahfe concatenated bundle

In 2019 and 2020, we repaired all available data from the Apollo 15 and 17 Heat Flow Experiments and repackaged them in a user-friendly, PDS4-compliant format, complete with highly transparent code. The resulting bundle has been released in the Planetary Data System, and a mirror of our latest version is always available on GitHub.

Summary page || PDS bundle || Bundle Mirror on GitHub || P&SS Article

scientific software development and archiving

What people usually call "data" and "metadata" aren't the only things we need in order to understand science. Observations are meaningless out of context, which includes the software necessary to read observational data as anything but digital nonsense. Million Concepts continues to develop technologies, methodologies, and best practices for scientific software development across the entire lifecycle.

Chase Million has been selected as a 2021 Better Scientific Software (BSSw) Fellow with a project to develop tools and techniques for generating realistic and useful software project estimates that account for the unique needs of scientific research.

In 2015, we proposed to create a new "Software" archiving node of the Planetary Data System. We have made the unsuccessful proposal available as a reference.



The Extraterrestrial Virtual Field Experience (EVFE) is a browser- and tablet-based educational activity with game-like elements that puts player-students in the role of the Mars Exploration Rover science team tasked with allocating limited resources to make targeted scientific observations to address specific scientific questions. The EVFE uses exclusively real (not simulated) scientific data from the MER Spirit rover, and the interface and resource tradeoffs closely mimic those made in actual mission planning workflows. EVFE is in active use as an outreach activity by the Spacecraft Planetary Image Facility (SPIF) at Cornell University.

EVFE website @ SPIF || Retrospective Essay on Design

technology commercialization

peristaltic endoscopy

Coverage of 2016 TechCelerator graduation

the arisian lens

2018 NASA iTech Cycle 1 Forum presentation