Since 2020, we have been working on a variety of tools for analyzing multispectral data from the big cameras that stick up out of the heads of the Perseverance and Curiosity Mars rovers. (These cameras are also called MASTCAM-Z and MASTCAM.) These tools reduce the work scientists must do to process these data by a factor of 10. They also improve the quality and consistency of the resulting data.
These tools include: asdf, an application and workflow for rapid, consistent, beautiful
last-mile multispectral data reduction.
These applications will eventually become public. Unfortunately, because of elements
that remain confidential, we are not yet able to share them. However, you can look at many
of their components -- and perhaps find features you'll be able to use! -- in our
broader marslab ecosystem, including marslab itself and the utility library dustgoggles.
Our work on the massive data set produced by the Galaxy Evolution Explorer (GALEX) telescope brings many of our interests together. It requires us to use interesting statistical methods, perform archaeological work on legacy scientific software, and find tricky solutions to programming problems. Most of all, it gives us opportunities to make efficient, beautiful tools people can use to increase the scientific return on existing data.
GLCat (the GALEX Legacy Catalog) is an upcoming catalog of every unique source observed by GALEX. It will be the largest UV sky catalog ever developed. GLCat is being funded by the NASA Astrophysics Data Analysis Program (ADAP) and builds on our previous work with GFCat and gPhoton (see below). Work on GLCat is beginning with an extensive rewrite of the gPhoton pipeline, which has increased its speed by two orders of magnitude and will enable rapid iterative recalibration of the GALEX data archive.
proposal document
GFCat (the GALEX Flare Catalog) is a catalog of variable sources observed by GALEX over short timescales (less than about 30 minutes, the length of a full GALEX 'visit'). The very high temporal resolution of GFCat gives us -- and other astronomers -- the opportunity to search for phenomena that have rarely been observed in stars other than the Sun. In particular, the contrast between M dwarf stars and stellar flares is nearly maximized in the GALEX bandpass, so the data are particularly sensitive to lower-energy flares, which are not obscured by the host star. M dwarfs are prime targets in the search for habitable worlds. Stellar flares are expected to play some role in planet habitability, and the rate of the relatively more common low-energy flares might be key. Other fast variable objects observable in GALEX include binary star eclipses, quasi-periodic pulsations, and white dwarf pulsations. These variables were serendipitously observed by GALEX and could not be systematically discovered without gPhoton (see below). The catalog, at 2-minute time resolution, and example software workflows are currently available on GitHub.
AAS poster on variable source detection || GitHub repo
gPhoton is a Python software toolkit that opens up GALEX's short time domain capabilities for the first time. An implementation of the gPhoton database has been publicly deployed at the Mikulski Archive for Space Telescopes (MAST) at the Space Telescope Science Institute (STScI). It makes the ~1.7 trillion individual photons observed by GALEX in “direct” imaging mode during the NASA-funded phases of the mission available for fast, on-demand analysis of brief, time-variable astrophysical phenomena with simple command-line tools. The source code is available on GitHub.
GitHub repo || ApJ article || Sky & Telescope
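As a rough illustration of the kind of short-timescale analysis that photon-level data make possible, the sketch below bins photon arrival times into fixed-width intervals to form a simple light curve. This is not gPhoton's API or pipeline: the function name bin_photon_times, its parameters, and the sample timestamps are all hypothetical, and a real analysis would also need aperture selection, exposure-time correction, and calibration.

```python
from collections import Counter


def bin_photon_times(event_times, bin_width, t_start=None):
    """Count photon events in consecutive fixed-width time bins.

    Hypothetical helper for illustration only (not part of gPhoton).
    Returns a list of (bin_start, count, count_rate) tuples, where
    count_rate is simply count / bin_width.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not event_times:
        return []
    t0 = min(event_times) if t_start is None else t_start
    # Index of the bin each event falls in; events before t0 are ignored.
    counts = Counter(
        int((t - t0) // bin_width) for t in event_times if t >= t0
    )
    if not counts:
        return []
    n_bins = max(counts) + 1
    return [
        (t0 + i * bin_width, counts.get(i, 0), counts.get(i, 0) / bin_width)
        for i in range(n_bins)
    ]


# Made-up arrival times in seconds, binned at 120 s (2 minutes).
sample_times = [0.5, 1.0, 3.2, 130.0, 131.5, 250.0]
light_curve = bin_photon_times(sample_times, bin_width=120.0, t_start=0.0)
for bin_start, count, rate in light_curve:
    print(f"t={bin_start:6.1f} s  counts={count}  rate={rate:.4f} /s")
```

Because the binning happens at analysis time, the same photon list can be re-binned at any time resolution without reprocessing the underlying observations.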
Scientific data are precious but difficult to distribute and preserve. Even in domains like planetary science that boast excellent central archives like the Planetary Data System (PDS), older data can suffer from bitrot and obsolescence, creating impassable barriers for new researchers who want to draw on our collective scientific legacy. Million Concepts is committed to finding ways to make data, both new and old, easily and durably accessible to as many people as possible.
pdr: the planetary data reader
Planetary data are complex, scattered, and varied. We like that sort of
thing, but it is not to everyone's taste. Reusable tools that make data
analysis-ready in standard workflows minimize wasted effort and maximize
the return on data in any domain. This is particularly important in fields
like planetary science, in which most observational data come from one-off,
bespoke instruments flown on missions that were carried out at tremendous
expense and are unlikely to be repeated.
pdr is a project intended to solve the user-facing end of some
data access problems in a simple but expansive way. It is a tool designed to
ingest every planetary data type with a single command: pdr.read().
It is currently in open beta and available on GitHub. We would very much
appreciate feedback, requests, and bug reports.
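To make the idea of a single read command concrete, here is a minimal sketch of one entry point that picks a format-specific loader based on the file it is given. It does not show how pdr is implemented: the registry, the register decorator, the loader functions, and the sample table are hypothetical, and real planetary data usually need label parsing rather than a simple file-suffix lookup.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical registry mapping a file suffix to a loader function.
_READERS = {}


def register(*suffixes):
    """Decorator that registers a loader for one or more file suffixes."""
    def decorator(func):
        for suffix in suffixes:
            _READERS[suffix.lower()] = func
        return func
    return decorator


@register(".csv")
def _read_csv(path):
    # Return each table row as a dict keyed by column name.
    with open(path, newline="") as stream:
        return list(csv.DictReader(stream))


@register(".json")
def _read_json(path):
    with open(path) as stream:
        return json.load(stream)


@register(".txt")
def _read_text(path):
    return Path(path).read_text()


def read(path):
    """Single entry point: dispatch to whichever loader fits the file."""
    path = Path(path)
    try:
        reader = _READERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"no reader registered for {path.suffix!r}") from None
    return reader(path)


# Usage: write a tiny made-up table to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    table_path = Path(tmp) / "table.csv"
    table_path.write_text("sol,temp_k\n1,210.5\n2,212.0\n")
    rows = read(table_path)
print(rows)
```

The appeal of this pattern is that callers never need to know which loader ran; supporting a new format means registering one more function rather than changing user code.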
We are working on a project to recalibrate data from the Microwave Radiometer (MRM) on the Chinese Lunar Exploration Program's Chang'e 2 orbiter. The outputs of this project will include not only the recalibrated data but also new maps of the dielectric and thermal characteristics of the Moon. Work products will be archived in the PDS. The project formally began in January 2021, and we hope to release preliminary data in Q1 2022.
proposal document
In late 2020 and early 2021, we produced two archives containing new versions of:
We are pleased to help make these crucial data more usable and
maintainable. These projects also gave us the opportunity to test
several new pieces of software, including alpha versions of
pdr and its submodule pdr.converter. The data
bundles are currently in peer review at the PDS, but more information,
preliminary documentation, and conversion software are available on
a project page here and in GitHub repos.
In 2019 and 2020, we repaired all available data from the Apollo 15 and 17 Heat Flow Experiments and repackaged them in a user-friendly, PDS4-compliant format, complete with highly transparent code. The resulting bundle has been released in the Planetary Data System, and a mirror of our latest version is always available on GitHub.
Summary page || PDS bundle || Bundle Mirror on GitHub || P&SS Article
What people usually call "data" and "metadata" aren't the only things we need in order to understand science. Observations are meaningless out of context, which includes the software necessary to read observational data as anything but digital nonsense. Million Concepts continues to develop technologies, methodologies, and best practices for scientific software development across the entire lifecycle.
Chase Million has been selected as a 2021 Better Scientific Software (BSSw) Fellow with a project to develop tools and techniques for generating realistic and useful software project estimates that account for the unique needs of scientific research.
In 2015, we proposed to create a new "Software" archiving node of the Planetary Data System. We have made the unsuccessful proposal available as a reference.
The Extraterrestrial Virtual Field Experience (EVFE) is a browser- and tablet-based educational activity with game-like elements. It puts player-students in the role of the Mars Exploration Rover (MER) science team, tasked with allocating limited resources to make targeted observations that address specific scientific questions. The EVFE uses exclusively real (not simulated) scientific data from the MER Spirit rover, and the interface and resource tradeoffs closely mimic those made in actual mission planning workflows. EVFE is in active use as an outreach activity by the Spacecraft Planetary Image Facility (SPIF) at Cornell University.
EVFE website @ SPIF || Retrospective Essay on Design