Revisiting the Tinley Park Lights
175 strangers filed independent reports about the same event. 96% agreed on the color. 100% agreed on the sound. I ran the numbers on the Tinley Park Lights.
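For context, agreement figures like these come from comparing a structured field (color, sound) across independent reports and asking what fraction match the most common value. A minimal sketch of that computation; the field names and report values here are hypothetical, not the actual NUFORC data:

```python
from collections import Counter

def agreement_rate(values):
    """Fraction of non-missing reports matching the most common value."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return 0.0
    top_count = counts.most_common(1)[0][1]
    return top_count / sum(counts.values())

# Hypothetical extracted fields from four independent reports
reports = [
    {"color": "red", "sound": "silent"},
    {"color": "red", "sound": "silent"},
    {"color": "orange", "sound": "silent"},
    {"color": "red", "sound": "silent"},
]
print(agreement_rate([r["color"] for r in reports]))  # 0.75
print(agreement_rate([r["sound"] for r in reports]))  # 1.0
```

Run over 175 reports, this yields the kind of consensus percentages quoted above.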
My first approach to analyzing 152,000 UFO sightings had a 34% accuracy rate. I almost published it anyway. Here's what went wrong with semantic embeddings, why I pivoted to LLM extraction, and the methodology that actually worked.
First, thank you Insiders for supporting my work and making this type of deep-dive technical content possible.
In my first article, I walked through patterns I found across 152,000 UFO sightings: silent triangles, disk-entity correlations, the emotional aspect of encounters. In my second article, I did a deep dive on entity encounters. In the third, I provided the sighting ID index so you could verify my claims against the original NUFORC data.
This article is for those who want the technical nitty-gritty: the taxonomy, the prompts, the code, and what worked and what failed along the way.
PSA: this is a long one AND technical. If that's not your idea of a good time, believe me, I get it. But if you want to reproduce this type of analysis, adapt it for your own research, or just understand the methodology behind the findings, read on.
Here's what you'll find:
What I'm NOT providing: Access to the source data. The NUFORC database is proprietary. If you want the raw sightings, I'd suggest reaching out to the great folks at NUFORC directly.
The full methodology, code, and lessons learned are only available to Insiders. If you're interested in reproducing this analysis or applying these techniques to your own research, consider joining to get access to this article and everything else behind the curtain.