Factlen Research · Citizen Science · Research Milestone · Jun 17, 2026, 9:23 AM · 4 min read

High School Student's AI Model Uncovers 1.5 Million Hidden Cosmic Objects in NASA Archive

Using open-source machine learning tools, a Pasadena teenager processed a decade of NASA telescope data to find over a million previously unnoticed celestial phenomena. The peer-reviewed breakthrough highlights how accessible AI is democratizing advanced astrophysics.

By Factlen Editorial Team

Astrophysics Researchers 40% · Open-Source AI Advocates 35% · STEM Education Proponents 25%
Astrophysics Researchers
Scientists view the AI pipeline as a critical tool for managing the overwhelming volume of data produced by modern telescopes.
Open-Source AI Advocates
Technologists celebrate the breakthrough as proof that frontier science is no longer gatekept by institutional supercomputers.
STEM Education Proponents
Educators highlight the necessity of early, hands-on mentorship in bridging the gap between theoretical math and real-world application.

What's not represented

  • Traditional supercomputing facilities
  • Commercial space data providers

Why this matters

This achievement proves that frontier scientific discovery no longer requires massive institutional supercomputers. As AI tools become democratized, anyone with a laptop and curiosity—even a high school student—can mine public datasets to fundamentally expand our understanding of the universe.

Key points

  • A Pasadena high school student used an automated machine-learning pipeline to process nearly 200 billion rows of archived NASA telescope data.
  • The AI model successfully identified 1.5 million previously unnoticed cosmic phenomena, including faint variable light sources.
  • The breakthrough was achieved using open-source tools, highlighting the democratization of advanced data analysis in modern astrophysics.
  • The findings have been formally recognized by NASA and published as a peer-reviewed paper in The Astronomical Journal.
1.5 million: new cosmic objects discovered
200 billion: rows of NEOWISE data processed
6 weeks: time to build the ML pipeline

A California teenager has transformed a summer astronomy project into a peer-reviewed scientific breakthrough, utilizing artificial intelligence to uncover 1.5 million previously undocumented cosmic phenomena. Matteo Paz, a student at Pasadena High School, achieved the milestone by applying custom machine-learning algorithms to a massive, publicly available NASA dataset.[1][2]

The discovery centers on archival data from the Near-Earth Object Wide-field Infrared Survey Explorer (NEOWISE). Launched in 2009 as the Wide-field Infrared Survey Explorer (WISE) and later repurposed as NEOWISE, the space telescope spent more than a decade scanning the sky in infrared light, cataloging everything from near-Earth asteroids to distant galaxies. Over its operational lifespan, it accumulated a dataset of nearly 200 billion rows of measurements.[1][4]

While the NEOWISE archive is a goldmine for astrophysicists, its sheer volume presents a formidable challenge. Traditional software and manual human review are sufficient for tracking bright, obvious objects, but they frequently miss faint, transient, or slowly varying light sources. These subtle variations often represent some of the most intriguing phenomena in the universe, such as brown dwarfs or distant active galactic nuclei.[3][4]

Paz joined the Planet Finder Academy, a program designed to immerse students in real-world astronomical research, in the summer of 2022. There he worked under the mentorship of Caltech scientist Davy Kirkpatrick at the Infrared Processing and Analysis Center (IPAC), and the original plan was for him to manually study a small subset of the NEOWISE data.[1][2]

The scale of the AI pipeline's data processing.

However, recognizing the limitations of manual analysis, Paz pivoted to a much more ambitious, computationally driven approach. Drawing on a strong background in theoretical mathematics and coding, he spent six weeks designing and training an automated machine-learning pipeline capable of processing the entire archive.[1][5]

The resulting AI model relied on two mathematical techniques, Fourier transforms and wavelet analysis. Both are effective at isolating time-based signals: Fourier transforms expose periodic changes, while wavelets localize brief or irregular ones. This allowed the algorithm to detect faint fluctuations in infrared brightness that standard sampling methods would overlook.[1][3]
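To make the two techniques concrete, here is a minimal sketch in plain Python. It is not the published pipeline, whose code the sources do not describe. The light curve is synthetic, the function names `dft_power` and `haar_details` are illustrative, and the samples are assumed to be evenly spaced, which real survey data generally is not.

```python
import cmath
import math


def dft_power(values):
    """Power at each non-zero frequency bin up to Nyquist (naive DFT)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the constant brightness level
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power.append(abs(coeff) ** 2 / n)
    return power  # power[i] corresponds to k = i + 1 cycles over the series


def haar_details(values):
    """One level of the Haar wavelet transform: scaled differences of adjacent pairs."""
    return [
        (values[i] - values[i + 1]) / math.sqrt(2)
        for i in range(0, len(values) - 1, 2)
    ]


# Hypothetical light curve: 64 evenly spaced brightness samples containing a
# weak sinusoid that repeats 4 times, plus one brief flare at sample 41.
n_samples = 64
light_curve = [
    10.0 + 0.05 * math.sin(2 * math.pi * 4 * t / n_samples)
    for t in range(n_samples)
]
light_curve[41] += 0.5

# Fourier view: which repetition rate carries the most power?
power = dft_power(light_curve)
dominant_cycles = power.index(max(power)) + 1

# Wavelet view: which adjacent pair of samples shows the sharpest local change?
details = haar_details(light_curve)
flare_pair = max(range(len(details)), key=lambda i: abs(details[i]))

print("dominant cycles over the series:", dominant_cycles)
print("flare located in samples:", 2 * flare_pair, "and", 2 * flare_pair + 1)
```

The Fourier step recovers the repeating signal but says nothing about when the flare happened. The wavelet step pinpoints the flare but ignores the slow oscillation. That complementarity is one plausible reason to combine the two.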

The model began showing promise almost immediately, flagging objects whose brightness changed too subtly or unpredictably for conventional detection. By automating the search, the AI could sift through billions of data points in a fraction of the time a human team would need, and without the lapses that fatigue introduces into manual review.[1][3]
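As an illustration of what automated flagging can look like, the sketch below applies a standard variability statistic, the reduced chi-squared against a constant-brightness model. The sources do not say the pipeline used this test. The function names, the sample values, and the threshold of 3.0 are all hypothetical.

```python
def reduced_chi_squared(fluxes, errors):
    """Chi-squared per degree of freedom against a constant-brightness model."""
    weights = [1.0 / (e * e) for e in errors]
    weighted_mean = sum(w * f for w, f in zip(weights, fluxes)) / sum(weights)
    chi2 = sum(((f - weighted_mean) / e) ** 2 for f, e in zip(fluxes, errors))
    return chi2 / (len(fluxes) - 1)


def is_variable(fluxes, errors, threshold=3.0):
    """Flag a source whose scatter exceeds what its error bars can explain."""
    return reduced_chi_squared(fluxes, errors) > threshold


# Hypothetical measurements: one steady source and one that brightens slowly.
errors = [0.01] * 6
steady = [10.00, 10.01, 9.99, 10.00, 10.01, 9.99]
drifting = [10.00, 10.03, 10.06, 10.09, 10.12, 10.15]

print("steady flagged:", is_variable(steady, errors))
print("drifting flagged:", is_variable(drifting, errors))
```

A test like this is cheap enough to run on every source in a large archive, leaving costlier analysis for the small fraction that gets flagged.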

The final output of the pipeline was staggering: a map of 1.5 million previously invisible or unnoticed cosmic objects. The findings were robust enough to form the basis of a peer-reviewed paper, recently published in The Astronomical Journal.[1][2]

AI-driven discovery vastly outpaces traditional manual review methods.

Beyond the astronomical value of the newly discovered objects, the achievement highlights a profound shift in how modern science is conducted. For decades, processing datasets of this magnitude required access to massive institutional supercomputers and teams of specialized researchers.[4][5]

Today, the democratization of artificial intelligence and open-source machine learning libraries has fundamentally lowered the barrier to entry. Tools that were once the exclusive domain of elite laboratories can now be run on consumer hardware, empowering citizen scientists and students to make genuine contributions to frontier research.[5]

This paradigm shift is particularly relevant as the scientific community grapples with a growing data deluge. Next-generation observatories, such as the Vera C. Rubin Observatory, are expected to generate petabytes of data annually. AI-driven pipelines like the one developed by Paz will be essential for triaging this information and identifying targets for follow-up observation.[2][5]

Modern observatories produce petabytes of data, requiring AI pipelines for effective analysis.

NASA leadership formally recognized Paz's contributions earlier this year, underscoring the agency's commitment to open data initiatives. By making archives like NEOWISE publicly accessible, space agencies provide the raw material necessary for AI-enabled discoveries by the broader public.[1][4]

The 1.5 million newly identified objects now present a fresh challenge for the astronomical community: classifying them. Researchers anticipate that the catalog will yield a wealth of new variable stars, eclipsing binaries, and potentially entirely new classes of celestial bodies that have never been documented.[3][5]

Ultimately, this breakthrough serves as a powerful proof of concept for the future of scientific inquiry. It demonstrates that when open data is paired with accessible artificial intelligence, the next major discovery can come from anywhere—even a high school classroom.[1][5]

How we got here

  1. 2009

    NASA launches the WISE telescope, which is later repurposed as NEOWISE to scan the sky in infrared.

  2. Summer 2022

    Matteo Paz joins the Planet Finder Academy and begins working with Caltech scientists on NEOWISE data.

  3. Late 2022

    Paz pivots from manual review to building an automated machine-learning pipeline over a six-week period.

  4. Early 2026

    NASA leadership formally recognizes Paz's achievement in mapping 1.5 million new cosmic objects.

  5. June 2026

    The methodology and findings are published as a peer-reviewed paper in The Astronomical Journal.

Viewpoints in depth

Astrophysics Researchers

Scientists view the AI pipeline as a critical tool for managing the overwhelming volume of data produced by modern telescopes.

For professional astronomers, the sheer scale of the NEOWISE archive—nearly 200 billion rows of data—represents both a treasure trove and a logistical nightmare. Researchers emphasize that traditional manual review is no longer viable for next-generation observatories. They view Paz's application of wavelet analysis as a highly efficient method for isolating transient phenomena, arguing that such automated pipelines will be mandatory for triaging the petabytes of data expected from upcoming facilities like the Vera C. Rubin Observatory.

Open-Source AI Advocates

Technologists celebrate the breakthrough as proof that frontier science is no longer gatekept by institutional supercomputers.

Advocates for open-source artificial intelligence point to this achievement as the ultimate validation of democratized tech. By utilizing accessible machine learning libraries and consumer-grade hardware, a high school student was able to replicate and exceed the data-processing capabilities of elite laboratories from a decade ago. This camp argues that the future of scientific discovery lies in open data paired with open-source AI, allowing a global decentralized network of citizen scientists to tackle problems that institutions lack the bandwidth to solve.

STEM Education Proponents

Educators highlight the necessity of early, hands-on mentorship in bridging the gap between theoretical math and real-world application.

For education advocates, the story is less about the software and more about the environment that fostered it. They point to the Planet Finder Academy and Caltech's mentorship as the crucial catalysts that allowed a student to apply theoretical mathematics to a real-world problem. This perspective argues that modern STEM education must move beyond textbook learning and provide students with direct access to massive datasets and professional AI tools, treating them as junior researchers rather than passive learners.

What we don't know

  • The exact classification of the 1.5 million newly discovered objects, which will require years of follow-up observation to categorize into specific celestial classes.
  • How quickly other scientific disciplines will adopt similar open-source AI pipelines to mine their own massive historical datasets.

Key terms

NEOWISE
A NASA space telescope that surveyed the entire sky in infrared light, primarily to detect near-Earth asteroids and comets.
Wavelet Analysis
A mathematical technique that breaks a signal into short, localized oscillations (wavelets) at different scales, which makes it particularly useful for analyzing signals whose behavior changes over time.
Machine Learning Pipeline
An automated sequence of software processes that extracts data, trains an AI model, and generates predictions or classifications.
Brown Dwarf
A celestial object intermediate in mass between a giant planet and a small star, too light to sustain hydrogen fusion and often emitting faint infrared light.
Variable Star
A star whose brightness as seen from Earth fluctuates over time.

Frequently asked

What did the high school student discover?

Matteo Paz used an AI model to uncover 1.5 million previously unnoticed cosmic objects, such as variable stars and faint light sources, in archived NASA data.

What telescope provided the data?

The data came from NASA's NEOWISE telescope, which scanned the sky in infrared light for over a decade.

How did the AI model work?

The machine-learning pipeline used Fourier transforms and wavelet analysis to detect subtle, time-based fluctuations in infrared light that human reviewers missed.

Why is this breakthrough significant for AI?

It demonstrates the democratization of artificial intelligence, proving that open-source ML tools allow students to process massive datasets that once required supercomputers.

Sources

Source coverage

5 outlets

3 viewpoints surfaced

Astrophysics Researchers 40% · Open-Source AI Advocates 35% · STEM Education Proponents 25%
  1. [1] Futura-Sciences (STEM Education Proponents)

    An unexpected breakthrough: a high school student's AI uncovers 1.5 million previously invisible cosmic phenomena

  2. [2] Fox 11 Los Angeles (STEM Education Proponents)

    Pasadena high schooler stuns scientists by mapping 1.5 million unknown space objects

  3. [3] The Astronomical Journal (Astrophysics Researchers)

    Automated Detection of Faint Variable Sources in the NEOWISE Archive via Wavelet Analysis

  4. [4] NASA Jet Propulsion Laboratory (Astrophysics Researchers)

    NEOWISE Data Archive Yields New Discoveries Through Machine Learning

  5. [5] Factlen Editorial Team (Open-Source AI Advocates)

    Synthesis by Factlen editorial team