High School Student's AI Model Uncovers 1.5 Million Hidden Cosmic Objects in NASA Archive
Using open-source machine learning tools, a Pasadena teenager processed a decade of NASA telescope data to find over a million previously unnoticed celestial phenomena. The peer-reviewed work highlights how accessible AI is democratizing advanced astrophysics.
By Factlen Editorial Team
- Astrophysics Researchers
- Scientists view the AI pipeline as a critical tool for managing the overwhelming volume of data produced by modern telescopes.
- Open-Source AI Advocates
- Technologists celebrate the breakthrough as proof that frontier science is no longer gatekept by institutional supercomputers.
- STEM Education Proponents
- Educators highlight the necessity of early, hands-on mentorship in bridging the gap between theoretical math and real-world application.
What's not represented
- Traditional supercomputing facilities
- Commercial space data providers
Why this matters
This achievement suggests that frontier scientific discovery no longer requires massive institutional supercomputers. As AI tools become widely accessible, anyone with a laptop and curiosity, even a high school student, can mine public datasets to expand our understanding of the universe.
Key points
- A Pasadena high school student used an automated machine-learning pipeline to process nearly 200 billion rows of archived NASA telescope data.
- The AI model successfully identified 1.5 million previously unnoticed cosmic phenomena, including faint variable light sources.
- The breakthrough was achieved using open-source tools, highlighting the democratization of advanced data analysis in modern astrophysics.
- The findings have been formally recognized by NASA and published as a peer-reviewed paper in The Astronomical Journal.
A California teenager has transformed a summer astronomy project into a peer-reviewed scientific breakthrough, utilizing artificial intelligence to uncover 1.5 million previously undocumented cosmic phenomena. Matteo Paz, a student at Pasadena High School, achieved the milestone by applying custom machine-learning algorithms to a massive, publicly available NASA dataset.[1][2]
The discovery centers on archival data from the Near-Earth Object Wide-field Infrared Survey Explorer (NEOWISE). Launched in 2009, the space telescope spent more than a decade scanning the sky in infrared light, cataloging everything from near-Earth asteroids to distant galaxies. Over its operational lifespan, NEOWISE accumulated a staggering dataset comprising nearly 200 billion rows of measurements.[1][4]
While the NEOWISE archive is a goldmine for astrophysicists, its sheer volume presents a formidable challenge. Traditional software and manual human review are sufficient for tracking bright, obvious objects, but they frequently miss faint, transient, or slowly varying light sources. These subtle variations often represent some of the most intriguing phenomena in the universe, such as brown dwarfs or distant active galactic nuclei.[3][4]
Paz joined the Planet Finder Academy, a program designed to immerse students in real-world astronomical research, in the summer of 2022. He worked under the mentorship of Caltech scientist Davy Kirkpatrick at the Infrared Processing and Analysis Center (IPAC), and the original plan was for him to manually study a small subset of the NEOWISE data.[1][2]

However, recognizing the limitations of manual analysis, Paz pivoted to a much more ambitious, computationally driven approach. Drawing on a strong background in theoretical mathematics and coding, he spent six weeks designing and training an automated machine-learning pipeline capable of processing the entire archive.[1][5]
The resulting AI model drew on two established signal-processing techniques, Fourier transforms and wavelet analysis. Both are effective at isolating time-based signals, which allowed the algorithm to detect faint fluctuations in infrared brightness that conventional methods would overlook.[1][3]
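To make the Fourier idea concrete, here is a minimal sketch that recovers the period of a faint repeating signal from a synthetic light curve. It is an illustration only, not a reconstruction of Paz's pipeline, whose code the sources do not describe. The function names, the synthetic data, and the assumption of evenly spaced measurements are all hypothetical; real survey observations are often unevenly spaced and would call for a different estimator.

```python
import cmath
import math


def dft_power(flux):
    """Return the power at each non-zero frequency bin k = 1 .. n//2.

    Power is |sum_t (flux[t] - mean) * exp(-2*pi*i*k*t/n)|^2 / n.
    Subtracting the mean removes the constant (k = 0) component.
    """
    n = len(flux)
    mean = sum(flux) / n
    centered = [f - mean for f in flux]
    powers = []
    for k in range(1, n // 2 + 1):
        total = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        powers.append(abs(total) ** 2 / n)
    return powers


def dominant_period(flux):
    """Return the period (in samples) of the strongest frequency bin."""
    powers = dft_power(flux)
    best_k = max(range(len(powers)), key=powers.__getitem__) + 1
    return len(flux) / best_k


# Hypothetical light curve: steady brightness of 10.0, a faint sinusoid with
# a period of 16 samples, and a weaker second sinusoid standing in for noise.
n_samples = 128
light_curve = [
    10.0
    + 0.05 * math.sin(2 * math.pi * t / 16)
    + 0.01 * math.sin(2 * math.pi * t / 5.3 + 1.0)
    for t in range(n_samples)
]

period = dominant_period(light_curve)
print(f"Strongest period: {period} samples")
```

The variation here is only half a percent of the source's brightness, yet it stands out clearly once the measurements are viewed by frequency rather than by time.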
The model began showing promise almost immediately, flagging objects whose brightness changed too subtly or unpredictably for conventional detection. By automating the search, the AI could sift through billions of data points in a fraction of the time a human team would need, and without the lapses that fatigue introduces into manual review.[1][3]
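One simple way to picture automated flagging is a statistical test of whether a source's measurements are consistent with constant brightness. The sketch below uses a reduced chi-square statistic, a standard technique swapped in here for illustration; the sources do not say that Paz's model works this way. The source names, the measurement values, and the threshold of 3.0 are assumptions.

```python
def reduced_chi_square(flux, flux_err):
    """Reduced chi-square of the measurements against a constant model.

    The constant is the inverse-variance weighted mean. Values near 1 are
    consistent with a steady source; values well above 1 suggest variability.
    """
    weights = [1.0 / (e * e) for e in flux_err]
    mean = sum(w * f for w, f in zip(weights, flux)) / sum(weights)
    chi2 = sum(((f - mean) / e) ** 2 for f, e in zip(flux, flux_err))
    return chi2 / (len(flux) - 1)


def flag_variables(sources, threshold=3.0):
    """Return the names of sources whose reduced chi-square exceeds threshold."""
    return [
        name
        for name, (flux, flux_err) in sources.items()
        if reduced_chi_square(flux, flux_err) > threshold
    ]


# Hypothetical measurements: (brightness values, per-measurement uncertainty).
sources = {
    "steady": ([10.00, 10.01, 9.99, 10.02, 9.98, 10.00], [0.02] * 6),
    "variable": ([10.00, 10.10, 9.90, 10.12, 9.88, 10.00], [0.02] * 6),
}

flagged = flag_variables(sources)
print(flagged)
```

Because the test is a few arithmetic operations per source, it can be applied to every object in a catalog, which is the kind of scale at which manual review stops being practical.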
The final output of the pipeline was staggering: a map of 1.5 million previously invisible or unnoticed cosmic objects. The findings were robust enough to form the basis of a peer-reviewed paper, recently published in The Astronomical Journal.[1][2]

Beyond the astronomical value of the newly discovered objects, the achievement highlights a profound shift in how modern science is conducted. For decades, processing datasets of this magnitude required access to massive institutional supercomputers and teams of specialized researchers.[4][5]
Today, the democratization of artificial intelligence and open-source machine learning libraries has fundamentally lowered the barrier to entry. Tools that were once the exclusive domain of elite laboratories can now be run on consumer hardware, empowering citizen scientists and students to make genuine contributions to frontier research.[5]
This paradigm shift is particularly relevant as the scientific community grapples with a growing data deluge. Next-generation observatories, such as the Vera C. Rubin Observatory, are expected to generate petabytes of data annually. AI-driven pipelines like the one developed by Paz will be essential for triaging this information and identifying targets for follow-up observation.[2][5]

NASA leadership formally recognized Paz's contributions earlier this year, underscoring the agency's commitment to open data initiatives. By making archives like NEOWISE publicly accessible, space agencies provide the raw material necessary for AI-enabled discoveries by the broader public.[1][4]
The 1.5 million newly identified objects now present a fresh challenge for the astronomical community: classifying them. Researchers anticipate that the catalog will yield a wealth of new variable stars, eclipsing binaries, and potentially entirely new classes of celestial bodies that have never been documented.[3][5]
How we got here
2009
NASA launches the WISE telescope, which is later repurposed as NEOWISE to scan the sky in infrared.
Summer 2022
Matteo Paz joins the Planet Finder Academy and begins working with Caltech scientists on NEOWISE data.
Late 2022
Paz pivots from manual review to building an automated machine-learning pipeline over a six-week period.
Early 2026
NASA leadership formally recognizes Paz's achievement in mapping 1.5 million new cosmic objects.
June 2026
The methodology and findings are published as a peer-reviewed paper in The Astronomical Journal.
Viewpoints in depth
Astrophysics Researchers
Scientists view the AI pipeline as a critical tool for managing the overwhelming volume of data produced by modern telescopes.
For professional astronomers, the sheer scale of the NEOWISE archive—nearly 200 billion rows of data—represents both a treasure trove and a logistical nightmare. Researchers emphasize that traditional manual review is no longer viable for next-generation observatories. They view Paz's application of wavelet analysis as a highly efficient method for isolating transient phenomena, arguing that such automated pipelines will be mandatory for triaging the petabytes of data expected from upcoming facilities like the Vera C. Rubin Observatory.
Open-Source AI Advocates
Technologists celebrate the breakthrough as proof that frontier science is no longer gatekept by institutional supercomputers.
Advocates for open-source artificial intelligence point to this achievement as the ultimate validation of democratized tech. By utilizing accessible machine learning libraries and consumer-grade hardware, a high school student was able to replicate and exceed the data-processing capabilities of elite laboratories from a decade ago. This camp argues that the future of scientific discovery lies in open data paired with open-source AI, allowing a global decentralized network of citizen scientists to tackle problems that institutions lack the bandwidth to solve.
STEM Education Proponents
Educators highlight the necessity of early, hands-on mentorship in bridging the gap between theoretical math and real-world application.
For education advocates, the story is less about the software and more about the environment that fostered it. They point to the Planet Finder Academy and Caltech's mentorship as the crucial catalysts that allowed a student to apply theoretical mathematics to a real-world problem. This perspective argues that modern STEM education must move beyond textbook learning and provide students with direct access to massive datasets and professional AI tools, treating them as junior researchers rather than passive learners.
What we don't know
- The exact classification of the 1.5 million newly discovered objects, which will require years of follow-up observation to categorize into specific celestial classes.
- How quickly other scientific disciplines will adopt similar open-source AI pipelines to mine their own massive historical datasets.
Key terms
- NEOWISE
- A NASA space telescope that surveyed the entire sky in infrared light, primarily to detect near-Earth asteroids and comets.
- Wavelet Analysis
- A mathematical technique that breaks a signal into short, localized oscillations at different scales, making it particularly useful for analyzing signals whose behavior changes over time.
- Machine Learning Pipeline
- An automated sequence of software processes that extracts data, trains an AI model, and generates predictions or classifications.
- Brown Dwarf
- A celestial object intermediate in mass between a giant planet and a small star, too light to sustain hydrogen fusion and often emitting faint infrared light.
- Variable Star
- A star whose brightness as seen from Earth fluctuates over time.
Frequently asked
What did the high school student discover?
Matteo Paz used an AI model to uncover 1.5 million previously unnoticed cosmic objects, such as variable stars and faint light sources, in archived NASA data.
What telescope provided the data?
The data came from NASA's NEOWISE telescope, which scanned the sky in infrared light for over a decade.
How did the AI model work?
The machine-learning pipeline used Fourier transforms and wavelet analysis to detect subtle, time-based fluctuations in infrared light that human reviewers missed.
Why is this breakthrough significant for AI?
It demonstrates the democratization of artificial intelligence, proving that open-source ML tools allow students to process massive datasets that once required supercomputers.
Sources
[1] Futura-Sciences (STEM Education Proponents): An unexpected breakthrough: a high school student's AI uncovers 1.5 million previously invisible cosmic phenomena
[2] Fox 11 Los Angeles (STEM Education Proponents): Pasadena high schooler stuns scientists by mapping 1.5 million unknown space objects
[3] The Astronomical Journal (Astrophysics Researchers): Automated Detection of Faint Variable Sources in the NEOWISE Archive via Wavelet Analysis
[4] NASA Jet Propulsion Laboratory (Astrophysics Researchers): NEOWISE Data Archive Yields New Discoveries Through Machine Learning
[5] Factlen Editorial Team (Open-Source AI Advocates): Synthesis by Factlen editorial team