Healthcare fraud costs the US healthcare system over $100 billion per year, according to the National Health Care Anti-Fraud Association. That is roughly 3-10% of total healthcare spending — money that funds phantom billing, upcoding schemes, and identity fraud instead of patient care. Traditional fraud detection relies on rules-based systems and manual Special Investigations Unit (SIU) audits that catch only 8-12% of fraudulent claims. The remaining 88-92% passes through undetected.
This case study documents how we built an unsupervised learning system for a regional health plan processing 2.4 million claims per month. The system identifies anomalous billing patterns that rules-based systems miss — without requiring labeled fraud examples to train on. After 6 months in production, it identified $12.3M in suspicious claims that passed through existing rule-based filters, with a false positive rate of 11%.
Why Unsupervised Learning for Fraud Detection
The fundamental challenge with supervised fraud detection: you can only find fraud patterns you have already seen. Supervised models are trained on historically confirmed fraud cases. They excel at catching repeat patterns — the same upcoding scheme, the same phantom billing structure. They fail at detecting novel fraud schemes because, by definition, there are no labeled examples to learn from.
Unsupervised learning inverts the approach. Instead of learning "what does fraud look like," it learns "what does normal look like" and flags everything that deviates significantly. This catches novel schemes, evolving patterns, and subtle anomalies that rules-based systems cannot encode.
The Model Ensemble
We deployed three complementary unsupervised models, each detecting different anomaly types:
Isolation Forest — Identifies individual claims that are statistical outliers across multiple dimensions simultaneously. A single claim billing $47,000 for a Level 3 E&M visit, or a provider submitting 85 claims in a single day when the specialty average is 22, gets flagged by Isolation Forest. We tuned the contamination parameter to 0.03 (3% expected anomaly rate) based on historical SIU findings.
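A minimal sketch of this setup using scikit-learn's `IsolationForest`. The feature matrix and synthetic values below are illustrative stand-ins for the engineered claim features, not the production pipeline; only the `contamination=0.03` setting comes from the text above:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Hypothetical claim-level features: [charge amount, provider claims/day, E&M level]
normal = rng.normal(loc=[250, 22, 3], scale=[80, 5, 1], size=(5000, 3))
# Two outliers mirroring the examples above: a $47,000 charge, and 85 claims/day
outliers = np.array([[47000.0, 22.0, 3.0], [300.0, 85.0, 3.0]])
X = np.vstack([normal, outliers])

# contamination=0.03 matches the 3% expected anomaly rate from historical SIU findings
model = IsolationForest(contamination=0.03, random_state=42)
labels = model.fit_predict(X)      # -1 = anomaly, 1 = normal
scores = model.score_samples(X)    # lower score = more anomalous
```

Because Isolation Forest isolates points with randomly chosen axis-aligned splits, claims that are extreme on any combination of dimensions require very few splits to isolate and receive low scores.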
Autoencoder (Neural Network) — Learns a compressed representation of "normal" claim patterns. Claims that the autoencoder cannot accurately reconstruct have high reconstruction error — meaning they deviate from learned normal patterns in complex, multi-dimensional ways that univariate rules cannot capture. Particularly effective at detecting coordinated fraud rings where individual claims look normal but the combination is suspicious.
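The reconstruction-error idea can be sketched compactly. This is not the production network; as a simplifying assumption, scikit-learn's `MLPRegressor` stands in as a bottleneck autoencoder (trained to reproduce its own input through a narrow hidden layer), and the data is synthetic, with "normal" claims lying on a low-dimensional correlation structure that "fraudulent" claims break:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Normal claims: 4 features, but only 2 underlying degrees of freedom
base = rng.normal(size=(2000, 2))
X_normal = np.hstack([base, base @ np.array([[0.7, -0.2], [0.4, 0.9]])])
# Anomalous claims: full-rank noise that violates the learned correlations
X_fraud = rng.normal(size=(20, 4)) * 3.0

scaler = StandardScaler().fit(X_normal)
Xn, Xf = scaler.transform(X_normal), scaler.transform(X_fraud)

# 2-unit hidden layer forces a compressed representation of "normal"
ae = MLPRegressor(hidden_layer_sizes=(2,), activation="identity",
                  max_iter=2000, random_state=0)
ae.fit(Xn, Xn)  # target == input: learn to reconstruct normal claims

# Per-claim reconstruction error; high error = deviates from normal structure
err_normal = np.mean((ae.predict(Xn) - Xn) ** 2, axis=1)
err_fraud = np.mean((ae.predict(Xf) - Xf) ** 2, axis=1)
```

Claims that respect the learned structure reconstruct almost perfectly; claims that break it cannot be squeezed through the bottleneck, so their reconstruction error is high even when no single feature is individually extreme.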
DBSCAN Clustering — Groups providers with similar billing profiles into clusters. Providers that do not fit any cluster (noise points) or that shift between clusters over time are flagged. This catches slow-evolving fraud where a provider gradually increases billing intensity over months — the change is invisible in weekly snapshots but clear in longitudinal clustering analysis.
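A sketch of the noise-point detection with scikit-learn's `DBSCAN`. The provider profiles, specialties, and `eps`/`min_samples` values below are illustrative assumptions, not the production configuration:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical provider billing profiles: [avg charge per claim, claims per day]
cardiology = rng.normal([400, 18], [30, 2], size=(60, 2))
family_med = rng.normal([150, 25], [20, 3], size=(60, 2))
outlier = np.array([[900.0, 70.0]])  # fits no specialty's billing profile
X = StandardScaler().fit_transform(np.vstack([cardiology, family_med, outlier]))

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# DBSCAN labels points that belong to no dense region as -1 (noise)
flagged_providers = np.where(labels == -1)[0]
```

Running the same clustering on monthly snapshots and tracking each provider's label over time is what surfaces the slow drift described above: a provider migrating away from its specialty's cluster month after month stands out even when no single month looks abnormal.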
Feature Engineering
Model performance depends almost entirely on feature quality. We engineered 127 features across 5 categories from the raw X12 835/837 claims data:
| Feature Category | Examples | Count |
|---|---|---|
| Billing patterns | Claims per day, average charge per CPT, billing time distribution, weekend billing ratio | 34 |
| Code combinations | CPT modifier usage, E&M level distribution, procedure-diagnosis mismatch scores | 28 |
| Provider behavior | Patient panel size, referral network density, specialty deviation score | 24 |
| Temporal patterns | Month-over-month charge growth, seasonal deviation, billing pattern consistency | 22 |
| Geographic signals | Distance between provider and patient, cluster density, cross-state billing | 19 |
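To make the billing-pattern category concrete, here is a sketch of a few of those features computed with pandas from a claim-line extract. The column names and the tiny DataFrame are hypothetical; the real input is parsed X12 837 data:

```python
import pandas as pd

# Hypothetical minimal claims extract (illustrative, not the production schema)
claims = pd.DataFrame({
    "provider_id": ["P1", "P1", "P1", "P2", "P2"],
    "service_date": pd.to_datetime(
        ["2024-03-04", "2024-03-04", "2024-03-09", "2024-03-05", "2024-03-06"]),
    "cpt": ["99213", "99214", "99213", "99213", "99215"],
    "charge": [120.0, 180.0, 125.0, 110.0, 310.0],
})
claims["is_weekend"] = claims["service_date"].dt.dayofweek >= 5

features = pd.DataFrame({
    # Mean number of claims per active billing day
    "claims_per_day": claims.groupby(["provider_id", "service_date"])
                            .size().groupby("provider_id").mean(),
    # Average charge submitted per claim line
    "avg_charge": claims.groupby("provider_id")["charge"].mean(),
    # Share of claim lines dated Saturday/Sunday
    "weekend_billing_ratio": claims.groupby("provider_id")["is_weekend"].mean(),
})
```

Each row of `features` is one provider, so the same frame can feed the provider-level models (DBSCAN) directly, while claim-level features feed Isolation Forest and the autoencoder.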
Results: 6 Months in Production
| Metric | Rules-Based (Before) | Hybrid ML (After) |
|---|---|---|
| Claims flagged per month | 3,200 | 8,400 |
| Confirmed fraud rate (of flagged) | 22% | 41% |
| Dollar value identified (6 months) | $4.1M | $12.3M |
| False positive rate | 45% | 11% |
| Novel scheme detection | 0 (only catches known patterns) | 7 new schemes identified |
| SIU analyst productivity | 12 cases/analyst/month | 28 cases/analyst/month |
The 7 novel schemes included: a DME supplier billing for equipment delivered to vacant addresses, a behavioral health provider billing group therapy codes for individual sessions, and a pharmacy chain submitting claims for brand-name drugs while dispensing generics. None of these would have been caught by existing rules because the rules had never encoded these specific patterns.
At Nirmitee, we build healthcare data infrastructure and AI systems. If you are building fraud detection, clinical data pipelines, or de-identification systems, talk to our team.