Why a Dental Search Engine Comes Before the AI Model

Introduction

In the previous post on representation learning, we discussed how to map complex dental data into vector spaces that a computer can work with. However, before deploying flashy AI models, we must ask a fundamental architectural question: “Does the trained AI model actually come first, or is there something much simpler that needs to be built before that?” The short answer: before any artificial intelligence, you need a solid search engine.

1. The Temptation to Start with the Model

When starting a new project, many teams fall into the trap of wanting to play with deep learning models and complex embeddings right away. Working with cutting-edge AI is undeniably exciting. But in practical fields like forensic medicine, this is a dangerous pitfall. If you deploy a highly advanced AI model without a properly structured search and query foundation, you are left with an uncontrollable engine. You might have an impressive tech stack to show off, but it will fail to solve the actual bottlenecks in real-world workflows.

2. What a Search Engine Actually Does

In the context of forensic identification, the true role of a search engine isn’t to make the final decision. Instead, it is to ruthlessly and efficiently narrow down the candidates (candidate retrieval). When an unidentified body is found, it’s impossible to manually cross-reference the entire database of millions of dental records. The search engine acts as a funnel, taking in this massive dataset and condensing it into a meaningful shortlist of dozens of candidates that an expert can actually review and evaluate.

Full Database (millions)
        ↓  filtering / retrieval
Candidate Set (dozens)
        ↓  ranking
Shortlist (top-k)
        ↓  expert review
Identification Decision

3. Retrieval Before Representation

So, why is it possible to build a powerful search system before introducing learned embeddings? Because dental records are inherently highly structured data. Findings such as missing teeth, amalgams, resins, and gold crowns are recorded as discrete, per-tooth facts. Explicit filtering, such as searching for “a patient with a gold crown on tooth 46 and an extracted tooth 14” (FDI numbering), is enough to build an excellent rule-based retrieval system. You don’t need a complex vector space to determine whether a tooth is missing. Embeddings and advanced AI models should only be introduced after this definitive structured search is complete, specifically to handle subtle patterns or ambiguous records.
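To make the idea of explicit filtering concrete, here is a minimal sketch in Python. It is not the code of the actual app: the `DentalRecord` class, the status strings, and the `retrieve` function are all hypothetical names chosen for illustration, and exact-equality matching is a deliberate simplification.

```python
from dataclasses import dataclass, field

# Hypothetical status codes; a real charting scheme would be richer.
MISSING = "missing"
GOLD_CROWN = "gold_crown"
AMALGAM = "amalgam"


@dataclass
class DentalRecord:
    record_id: str
    # Maps FDI tooth number -> recorded status. Teeth that are not
    # listed are treated as "nothing recorded" in this sketch.
    teeth: dict[int, str] = field(default_factory=dict)


def matches(record: DentalRecord, query: dict[int, str]) -> bool:
    """True only if every queried tooth has exactly the queried status."""
    return all(record.teeth.get(tooth) == status for tooth, status in query.items())


def retrieve(records: list[DentalRecord], query: dict[int, str]) -> list[DentalRecord]:
    """Deterministic filter: keep the records that satisfy all conditions."""
    return [r for r in records if matches(r, query)]


records = [
    DentalRecord("A", {46: GOLD_CROWN, 14: MISSING, 36: AMALGAM}),
    DentalRecord("B", {46: GOLD_CROWN}),
    DentalRecord("C", {46: AMALGAM, 14: MISSING}),
]

# "Gold crown on tooth 46 and an extracted tooth 14"
query = {46: GOLD_CROWN, 14: MISSING}
candidates = retrieve(records, query)
print([r.record_id for r in candidates])  # ['A']
```

Because the filter is a plain conjunction of per-tooth conditions, any surprising result can be traced to a specific tooth and status. A production system would likely need looser rules than strict equality, for example to tolerate treatment performed after the last recorded visit, but that refinement still fits the rule-based approach.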

4. The Funnel: Retrieval → Ranking → Embedding

Because of this, the pipeline must follow a hierarchical order of increasing complexity. First, rule-based retrieval casts a wide net over structured data to secure candidates. Next, simple heuristics (e.g., the number of matching restorations) are used to rank them. Learned embeddings are the final weapon, applied only to the top candidates remaining after this process to maximize precision.

Structured Query
       ↓
Rule-based Retrieval     ← Possible right now
       ↓
Ranking (heuristics)     ← Next step
       ↓
Learned Embeddings       ← After data is accumulated
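The staged order above can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the app's implementation: `match_score`, `shortlist`, `run_pipeline`, and the optional `rerank` hook are hypothetical names, and the heuristic is simply the count of agreeing teeth mentioned earlier.

```python
from typing import Callable, Optional

Teeth = dict[int, str]  # FDI tooth number -> recorded status (assumed schema)


def match_score(candidate: Teeth, query: Teeth) -> int:
    """Heuristic: number of queried teeth whose recorded status agrees."""
    return sum(1 for tooth, status in query.items() if candidate.get(tooth) == status)


def shortlist(candidates: dict[str, Teeth], query: Teeth, k: int = 3) -> list[tuple[int, str]]:
    """Rank candidates by score (highest first) and keep the top k."""
    scored = [(match_score(teeth, query), rid) for rid, teeth in candidates.items()]
    # Ties are broken by record id so the ordering stays deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


def run_pipeline(
    candidates: dict[str, Teeth],
    query: Teeth,
    k: int = 3,
    rerank: Optional[Callable[[list[str]], list[str]]] = None,
) -> list[str]:
    """Heuristic ranking first; an optional learned reranker sees only the top k."""
    ids = [rid for _, rid in shortlist(candidates, query, k)]
    return rerank(ids) if rerank is not None else ids


candidates = {
    "A": {46: "gold_crown", 14: "missing", 36: "amalgam"},
    "B": {46: "gold_crown", 14: "missing"},
    "C": {46: "gold_crown", 36: "resin"},
    "D": {11: "resin"},
}
query = {46: "gold_crown", 14: "missing", 36: "amalgam"}

print(shortlist(candidates, query, k=2))  # [(3, 'A'), (2, 'B')]
print(run_pipeline(candidates, query, k=2))  # ['A', 'B']
```

The `rerank` parameter is left empty on purpose. It marks the place where a learned embedding model could be plugged in later, and until then the pipeline works without it.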

5. Why This Order Is Practical

This design sequence offers concrete advantages in the field:

  • Immediate Utility: The system provides value immediately through simple search functionalities, without having to wait months to train an AI model.
  • Clear Debugging: Rule-based search is deterministic. If a query fails, you simply open the code or data structure and fix the logic. Debugging a black-box AI model in the early stages is an unnecessary waste of resources.
  • Data Accumulation: The search logs and matching results generated by experts using the system become the ultimate “ground truth” data for training sophisticated AI embeddings later.
  • Earning Expert Trust: In forensics, trust is everything. Telling a medical examiner, “The locations and materials of 5 restorations match perfectly,” builds operational trust far faster than saying, “The vector similarity is 0.92.”
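Two of these points, data accumulation and expert trust, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `explain_match` and `log_entry` functions, the status strings, and the JSON field names are hypothetical and do not describe the real app's log format.

```python
import json

Teeth = dict[int, str]  # FDI tooth number -> recorded status (assumed schema)


def explain_match(query: Teeth, candidate: Teeth) -> list[str]:
    """Build a per-tooth comparison that an examiner can read directly."""
    lines = []
    for tooth in sorted(query):
        expected = query[tooth]
        found = candidate.get(tooth, "not recorded")
        verdict = "match" if found == expected else "mismatch"
        lines.append(f"tooth {tooth}: query={expected}, record={found} -> {verdict}")
    return lines


def log_entry(query: Teeth, record_id: str, expert_decision: str) -> str:
    """Serialize one reviewed comparison as a JSON line for later reuse."""
    return json.dumps(
        {
            "query": {str(tooth): status for tooth, status in sorted(query.items())},
            "record_id": record_id,
            "expert_decision": expert_decision,  # e.g. "confirmed" or "rejected"
        },
        sort_keys=True,
    )


query = {46: "gold_crown", 14: "missing"}
candidate = {46: "gold_crown", 14: "resin"}

report = explain_match(query, candidate)
for line in report:
    print(line)
# tooth 14: query=missing, record=resin -> mismatch
# tooth 46: query=gold_crown, record=gold_crown -> match

entry = log_entry(query, "B", "rejected")
print(entry)
```

The report states which teeth agree and which do not, which is the kind of statement an examiner can verify against the chart. Each logged line pairs a query with an expert's decision, so the accumulated log could later serve as labeled examples for training.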

6. How This Maps to a Real System

This philosophy isn’t just theory; it is the core design principle behind dental_record_app, the forensic dental identification app I am currently building. When a senior official at the National Forensic Service asked for this, I didn’t rush to boot up a deep learning framework. As a dentist and medical examiner working in the field, I knew the most urgent bottleneck was standardizing clear dental records into a robust database and enabling instant queries.

Because the app is designed for comprehensive dental record retrieval rather than just implant tracking, my first priority has been building a reliable system in Flutter with highly structured fields. Ensuring that search based on explicit anatomical facts works flawlessly comes first. Only when this robust, rule-based skeleton operates perfectly can we flesh it out with AI to truly multiply its impact.

Conclusion

Ultimately, a well-designed search engine is the most practical and powerful first step you must take before introducing an AI model. It firmly roots the system in reality, delivers immediate utility, and sets the perfect stage for advanced AI in the future. In the next post, I will discuss the specific technical choices and engineering tradeoffs I faced while putting this philosophy into practice.

This post is licensed under CC BY 4.0 by the author.