In 2025, 50% of healthcare organizations reported that missing or inaccurate claim data was the number one factor driving rising denial rates — up 4% from the prior year, according to Experian Health's State of Claims Report. Each denied claim costs between $25 and $118 to rework, with Medicare Advantage rework averaging $47.77 and commercial claims reaching $63.76 per denial. For a mid-size practice processing 50,000 claims annually with a 12% denial rate, that's $150,000–$708,000 in annual rework costs — money that never should have been spent.
The industry benchmark for clean claim rates is 95% or higher. Organizations pushing toward 98% are the ones that have moved beyond basic claim scrubbing into systematic, rule-based pre-submission validation. This isn't about buying a better clearinghouse. It's about catching errors before your claim ever leaves your system.
Competitors publish "clean claim checklists" with 10 bullet points and no code. This guide gives you 25 production-ready validation rules — organized by category, each with Python validation code, common failure examples, and the exact fix. You can implement these rules in your billing system, EHR, or claim scrubber today.
Why 25 Rules? The Math Behind Clean Claims
The MGMA DataDive shows a single-specialty aggregate denial rate of 8% on first submission — a number that hasn't improved since 2019. Meanwhile, Aptarro's 2026 analysis reports that 41% of survey respondents say at least one in ten claims is denied.
The top denial reasons map directly to specific data fields that can be validated programmatically:
- Missing or inaccurate data (50%) — Patient demographics fields
- Authorization issues (35%) — Insurance/payer verification
- Registration errors (26%) — Patient intake validation
- Coding errors (18-25%) — ICD-10 and CPT validation
- Provider data issues (10%) — NPI and taxonomy verification
Twenty-five rules, spread across six categories, cover the complete surface area of preventable claim denials. Each rule targets a specific field or relationship that, when invalid, triggers an automatic denial from payers.
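The rule functions in this guide all return a dict with a `valid` flag and an `errors` list, and some add `warnings`. One possible way to run several of them against a claim and collect a single report is sketched below. `run_rules` and the two stand-in rule functions are hypothetical names used for illustration; they are not part of any billing system's API.

```python
from typing import Callable, Dict, List

def run_rules(rules: Dict[str, Callable[[], dict]]) -> dict:
    """Run each rule and merge errors/warnings, prefixing each with the rule name."""
    errors: List[str] = []
    warnings: List[str] = []
    for name, rule in rules.items():
        result = rule()
        errors += [f"{name}: {e}" for e in result.get("errors", [])]
        warnings += [f"{name}: {w}" for w in result.get("warnings", [])]
    return {"clean": not errors, "errors": errors, "warnings": warnings}

# Stand-in rule functions so the sketch runs on its own
def always_ok() -> dict:
    return {"valid": True, "errors": []}

def missing_npi() -> dict:
    return {"valid": False, "errors": ["NPI is required"]}

report = run_rules({"rule_1_name": always_ok, "rule_6_npi": missing_npi})
print(report)
```

In practice each entry would wrap a real rule with the claim's fields bound in, for example with `functools.partial` or a lambda.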
Category 1: Patient Demographics (Rules 1–5)
Missing or inaccurate patient data is the most frequently cited denial driver: 50% of organizations in the Experian Health report named it the top factor. These five rules validate the foundational data that every claim depends on.
Rule 1: Patient Name Format Validation
What it checks: Patient first and last name fields are not blank, contain only valid characters, meet a minimum length, and are not obviously test data. Case anomalies such as all-caps input are not checked by the function below and need separate normalization if your payers match case-sensitively.
Why it matters: Payers perform exact-match verification against their enrollment databases. A name submitted as "JOHN SMITH" when enrolled as "John Smith" can trigger a rejection on strict matching systems. Single-character names, numeric characters, or missing name components trigger immediate rejections on 837P/837I submissions.
import re

def validate_patient_name(first_name: str, last_name: str) -> dict:
    """Rule 1: Validate patient name format for claim submission."""
    errors = []
    for field, value in [("first_name", first_name), ("last_name", last_name)]:
        if not value or not value.strip():
            errors.append(f"{field} is required")
            continue
        name = value.strip()
        # Minimum length (single-character names are flagged)
        if len(name) < 2:
            errors.append(f"{field} too short: '{name}' (verify with patient)")
        # No numeric characters
        if re.search(r'\d', name):
            errors.append(f"{field} contains numbers: '{name}'")
        # Valid characters only (letters, hyphens, apostrophes, spaces, periods)
        if not re.match(r"^[a-zA-Z\s\-'.]+$", name):
            errors.append(f"{field} contains invalid characters: '{name}'")
        # Test data detection (exact match against known placeholder values)
        test_patterns = ['test', 'sample', 'demo', 'xxx', 'zzz', 'foo', 'bar']
        if name.lower() in test_patterns:
            errors.append(f"{field} appears to be test data: '{name}'")
    return {"valid": len(errors) == 0, "errors": errors}
Common failure: first_name="J", last_name="Smith". Single-character first names get flagged by payer matching algorithms. Fix: Confirm the full legal first name from the insurance card or enrollment record.
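Because payers match names against enrollment records, it can help to compare the submitted name with the name on the eligibility record before the claim goes out. The sketch below is one minimal way to do that, ignoring case and stray whitespace. `names_match` is a hypothetical helper, and where the enrolled name comes from is an assumption about your system.

```python
def _normalize_name(value: str) -> str:
    """Collapse repeated whitespace, trim, and fold case for comparison."""
    return " ".join(value.split()).casefold()

def names_match(submitted: str, enrolled: str) -> bool:
    """True when two names are equal after case and whitespace normalization."""
    return _normalize_name(submitted) == _normalize_name(enrolled)

print(names_match("JOHN  SMITH ", "John Smith"))
print(names_match("Jon Smith", "John Smith"))
```

A match here only says the two strings agree after normalization. A payer with strict matching may still require the exact casing on the enrollment record.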
Rule 2: Date of Birth Validation
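What it checks: DOB is a valid date, falls within a logical range (not in the future, not over 130 years ago), and is not later than the date of service. Checking age against the services billed (pediatric services for adults, geriatric services for children) requires procedure-level data and is not covered by the function below.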
Why it matters: DOB mismatches are the second most common demographic denial. Transposed digits (03/15/1985 vs 03/15/1958) pass basic date validation but fail payer enrollment matching.
from datetime import datetime, date
from typing import Optional

def validate_dob(dob_str: str, service_date: Optional[str] = None) -> dict:
    """Rule 2: Validate date of birth logic and range."""
    errors = []
    try:
        dob = datetime.strptime(dob_str, "%Y-%m-%d").date()
    except (ValueError, TypeError):
        return {"valid": False, "errors": [f"Invalid date format: '{dob_str}' (expected YYYY-MM-DD)"]}
    today = date.today()
    # Age in completed years (one less if this year's birthday has not happened yet)
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    # Future date check
    if dob > today:
        errors.append(f"DOB is in the future: {dob_str}")
    # Age range check (0-130 years)
    elif age > 130:
        errors.append(f"DOB implies age {age}; likely data entry error")
    # Service date consistency
    if service_date:
        try:
            svc = datetime.strptime(service_date, "%Y-%m-%d").date()
            if svc < dob:
                errors.append(f"Service date {service_date} is before DOB {dob_str}")
        except ValueError:
            errors.append(f"Invalid service date: {service_date}")
    return {"valid": len(errors) == 0, "errors": errors, "age": age}
Common failure: dob="1958-03-15" entered as "1985-03-15" (transposed year digits). The claim passes date validation but gets denied on payer enrollment match. Fix: Cross-reference DOB against the insurance eligibility response before submission.
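The cross-reference step can be sketched as a comparison between the DOB on the claim and the DOB returned by the payer. The function below is a hypothetical helper, and the assumption is that both values are already available as YYYY-MM-DD strings, for instance from a parsed eligibility response. It also singles out the adjacent-digit transposition described above.

```python
def dob_mismatch_kind(submitted: str, on_file: str) -> str:
    """Classify how a submitted DOB differs from the payer's DOB on file."""
    if submitted == on_file:
        return "match"
    if len(submitted) != len(on_file):
        return "mismatch"
    diffs = [i for i, (a, b) in enumerate(zip(submitted, on_file)) if a != b]
    # Exactly two adjacent positions with swapped characters is a transposition
    if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
            and submitted[diffs[0]] == on_file[diffs[1]]
            and submitted[diffs[1]] == on_file[diffs[0]]):
        return "transposed_digits"
    return "mismatch"

print(dob_mismatch_kind("1985-03-15", "1958-03-15"))  # transposed_digits
```

Flagging the transposition case separately lets intake staff see a likely typo rather than a generic mismatch.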
Rule 3: Address Standardization (USPS)
What it checks: Patient address conforms to USPS standardized format — validated street address, proper city/state/ZIP combination, ZIP+4 when available, and no PO Box when physical address is required.
Why it matters: Address mismatches trigger denials, particularly for Medicaid claims where state residency verification is required. Non-standard formatting ("Street" vs "St.", "Apartment" vs "Apt") causes matching failures across payer systems.
import re

def validate_address(street: str, city: str, state: str, zip_code: str) -> dict:
    """Rule 3: Validate patient address fields (format checks only)."""
    errors = []
    warnings = []
    # Required fields
    for field, value in [("street", street), ("city", city), ("state", state), ("zip_code", zip_code)]:
        if not value or not value.strip():
            errors.append(f"{field} is required")
    if errors:
        return {"valid": False, "errors": errors, "warnings": warnings}
    # State validation (2-letter code, including DC and territories)
    valid_states = {
        'AL','AK','AZ','AR','CA','CO','CT','DE','FL','GA','HI','ID','IL','IN',
        'IA','KS','KY','LA','ME','MD','MA','MI','MN','MS','MO','MT','NE','NV',
        'NH','NJ','NM','NY','NC','ND','OH','OK','OR','PA','RI','SC','SD','TN',
        'TX','UT','VT','VA','WA','WV','WI','WY','DC','PR','VI','GU','AS','MP'
    }
    if state.strip().upper() not in valid_states:
        errors.append(f"Invalid state code: '{state}'")
    # ZIP code format (5 digits or ZIP+4)
    zip_clean = zip_code.strip()
    if not re.match(r'^\d{5}(-\d{4})?$', zip_clean):
        errors.append(f"Invalid ZIP format: '{zip_code}' (expected 12345 or 12345-6789)")
    elif len(zip_clean) == 5:
        warnings.append("ZIP+4 not provided; may delay processing for some payers")
    # Street address basic validation
    if len(street.strip()) < 5:
        errors.append(f"Street address too short: '{street}'")
    if re.match(r'^\d+$', street.strip()):
        errors.append(f"Street address is numbers only: '{street}'")
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: state="California" instead of "CA" (full state name instead of the 2-letter code). Fix: Auto-convert state names to USPS standard 2-letter abbreviations in your intake form.
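The auto-conversion fix could look something like the sketch below. The lookup table is deliberately partial and is an illustration only; a real table would cover every state and territory. `normalize_state` is a hypothetical helper name.

```python
# Partial name-to-code map for illustration; extend to all states and territories
STATE_NAME_TO_CODE = {
    "california": "CA",
    "new york": "NY",
    "texas": "TX",
    "florida": "FL",
    "district of columbia": "DC",
}

def normalize_state(value: str) -> str:
    """Return the 2-letter code for a known state name, else the cleaned input uppercased."""
    cleaned = " ".join(value.split())  # trim and collapse internal whitespace
    return STATE_NAME_TO_CODE.get(cleaned.lower(), cleaned.upper())

print(normalize_state("California"))   # CA
print(normalize_state(" new  york "))  # NY
print(normalize_state("ca"))           # CA
```

Unknown names pass through uppercased, so the state check in Rule 3 still rejects anything the table does not cover.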
Rule 4: SSN / MRN Format Verification
What it checks: Social Security Numbers follow valid formatting (not all zeros, not known invalid ranges like 000-xx-xxxx or 666-xx-xxxx), and Medical Record Numbers follow the organization's internal format pattern.
Why it matters: Invalid SSNs cause coordination of benefits failures and duplicate patient record issues. MRN format violations indicate potential patient matching errors that can route claims to the wrong accounts.
import re

def validate_ssn(ssn: str) -> dict:
    """Rule 4a: Validate SSN format and known-invalid patterns."""
    errors = []
    # Remove formatting (treat a missing value as empty so it fails the digit check)
    clean = re.sub(r'[\s\-]', '', ssn or '')
    if not re.match(r'^\d{9}$', clean):
        return {"valid": False, "errors": ["SSN must be exactly 9 digits"]}
    area, group, serial = clean[:3], clean[3:5], clean[5:]
    # Known invalid ranges (SSA rules)
    if area == '000':
        errors.append("SSN area number cannot be 000")
    if area == '666':
        errors.append("SSN area number 666 is never assigned")
    if int(area) >= 900:
        errors.append(f"SSN area {area} is reserved (900-999 range)")
    if group == '00':
        errors.append("SSN group number cannot be 00")
    if serial == '0000':
        errors.append("SSN serial number cannot be 0000")
    # All same digit check
    if len(set(clean)) == 1:
        errors.append(f"SSN is all same digit: {clean[0]*9}")
    # Known test/advertising SSNs
    known_invalid = ['078051120', '219099999', '123456789']
    if clean in known_invalid:
        errors.append("SSN matches known invalid/test number")
    return {"valid": len(errors) == 0, "errors": errors}

def validate_mrn(mrn: str, pattern: str = r'^[A-Z]{2}\d{6,10}$') -> dict:
    """Rule 4b: Validate MRN against organization's format pattern."""
    if not mrn or not mrn.strip():
        return {"valid": False, "errors": ["MRN is required"]}
    if not re.match(pattern, mrn.strip()):
        return {"valid": False, "errors": [f"MRN '{mrn}' doesn't match expected pattern: {pattern}"]}
    return {"valid": True, "errors": []}
Common failure: ssn="000-00-0000" or ssn="123-45-6789". These placeholder SSNs pass basic 9-digit checks but are known invalid. Fix: Validate against SSA invalid ranges before submission.
Rule 5: Insurance ID Format Validation
What it checks: Insurance member ID follows the expected format pattern for the specific payer (Medicare: an 11-character MBI that starts with a digit 1-9 and never uses the letters S, L, O, I, B, or Z; Medicaid: state-specific patterns; Commercial: payer-specific patterns).
Why it matters: Insurance ID mismatches are the fastest denial trigger. A single transposed character means the claim hits a non-existent account and gets rejected immediately — no manual review, no appeal opportunity.
import re

# Letters allowed in an MBI (CMS excludes S, L, O, I, B, and Z)
_MBI_ALPHA = "AC-HJKMNP-RT-Y"
_MBI_ALNUM = _MBI_ALPHA + "0-9"

# Payer-specific ID patterns. The Medicare entry follows the published MBI
# layout; treat the other entries as illustrative and confirm them per payer.
PAYER_ID_PATTERNS = {
    # MBI: 11 characters, positions C A AN N A AN N A A N N
    # (C = 1-9, A = allowed letter, N = digit, AN = allowed letter or digit)
    "medicare": (
        rf'^[1-9][{_MBI_ALPHA}][{_MBI_ALNUM}]\d'
        rf'[{_MBI_ALPHA}][{_MBI_ALNUM}]\d'
        rf'[{_MBI_ALPHA}]{{2}}\d{{2}}$'
    ),
    "medicaid_ny": r'^[A-Z]{2}\d{5}[A-Z]$',
    "medicaid_ca": r'^\d{14}$',
    "bcbs": r'^[A-Z]{3}\d{9,12}$',
    "aetna": r'^W\d{8,12}$',
    "united": r'^\d{9,11}$',
    "cigna": r'^U\d{8}$',
    "default": r'^[A-Z0-9]{6,20}$'
}

def validate_insurance_id(member_id: str, payer_type: str = "default") -> dict:
    """Rule 5: Validate insurance ID format against payer-specific patterns."""
    errors = []
    if not member_id or not member_id.strip():
        return {"valid": False, "errors": ["Insurance member ID is required"]}
    # Drop spaces and hyphens so card-style input like 1EG4-TE5-MK72 is accepted
    clean_id = re.sub(r'[\s\-]', '', member_id).upper()
    pattern = PAYER_ID_PATTERNS.get(payer_type.lower(), PAYER_ID_PATTERNS["default"])
    if not re.match(pattern, clean_id):
        errors.append(
            f"Member ID '{clean_id}' doesn't match {payer_type} format. "
            f"Expected pattern: {pattern}"
        )
    # Check for obvious placeholder/test IDs
    if clean_id in ['000000000', 'TESTID', 'SAMPLE', 'XXXXXXXXX']:
        errors.append(f"Member ID appears to be test/placeholder data: '{clean_id}'")
    return {"valid": len(errors) == 0, "errors": errors}
Common failure: member_id="12345678" for a Medicare patient. Medicare Beneficiary Identifiers (MBIs) follow a specific alphanumeric pattern (e.g., 1EG4-TE5-MK72). A purely numeric ID indicates the old HICN format or wrong payer enrollment. Fix: Verify member ID from the patient's current insurance card and validate against payer-specific patterns.
Category 2: Provider Information (Rules 6–9)
Provider data errors account for approximately 10% of claim denials, but they're among the easiest to catch programmatically because NPI and taxonomy codes follow strict, well-documented formats.
Rule 6: NPI Validation (Luhn Algorithm)
What it checks: The National Provider Identifier is exactly 10 digits and passes the Luhn check digit algorithm mandated by CMS. The NPI is prefixed with "80840" (health application + US country code) before applying the ISO standard Luhn formula.
Why it matters: An invalid NPI causes immediate electronic rejection — the claim never reaches adjudication. This is a 100% preventable denial that costs your organization the full rework cycle for a simple math check.
def validate_npi(npi: str) -> dict:
    """Rule 6: Validate NPI using the Luhn check digit algorithm.
    Per CMS: prefix 80840 to the 10-digit NPI, then apply
    standard Luhn mod-10 validation.
    """
    errors = []
    # Format check
    clean = npi.strip().replace('-', '').replace(' ', '')
    if not clean.isdigit() or len(clean) != 10:
        return {"valid": False, "errors": [f"NPI must be exactly 10 digits: '{npi}'"]}
    # Luhn algorithm with 80840 prefix
    # The prefix 80 = health applications, 840 = United States
    full_number = "80840" + clean
    digits = [int(d) for d in full_number]
    checksum = 0
    # Process from rightmost digit, doubling every second digit
    for i, digit in enumerate(reversed(digits)):
        if i % 2 == 1:  # Double every second digit from right
            doubled = digit * 2
            checksum += doubled - 9 if doubled > 9 else doubled
        else:
            checksum += digit
    if checksum % 10 != 0:
        errors.append(f"NPI '{clean}' fails Luhn check digit validation")
    # The NPI itself carries no embedded information: entity type
    # (Type 1 individual vs Type 2 organization) must be looked up in NPPES,
    # not inferred from the leading digit.
    return {"valid": len(errors) == 0, "errors": errors}
Common failure: npi="1234567890" passes the basic 10-digit check but fails Luhn validation. An example that passes the check: "1234567893". Fix: Always run Luhn validation, not just length checks.
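To make the "simple math check" concrete, the sketch below computes the check digit that the first nine digits of an NPI require, using the same 80840 prefix and Luhn doubling. `npi_check_digit` is a hypothetical helper written for this illustration.

```python
def npi_check_digit(first_nine: str) -> int:
    """Compute the Luhn check digit for the first 9 digits of an NPI."""
    if len(first_nine) != 9 or not first_nine.isdigit():
        raise ValueError("expected exactly 9 digits")
    payload = "80840" + first_nine
    total = 0
    # The check digit will sit to the right of the payload, so the payload's
    # last digit is the first one to be doubled.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

print(npi_check_digit("123456789"))  # 3
```

The result for "123456789" is 3, which is why "1234567890" fails validation while "1234567893" passes.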
Rule 7: Taxonomy Code Validation
What it checks: The provider's taxonomy code exists in the NUCC (National Uniform Claim Committee) Health Care Provider Taxonomy code set, and the code is appropriate for the services being billed.
Why it matters: Taxonomy codes on Box 24Ja of the CMS-1500 form identify the provider's specialty. An incorrect or non-existent taxonomy code can cause claim routing errors and denials, particularly for specialty-specific procedure codes.
import re

# Taxonomy code format: 10-character alphanumeric (XXXXXXXXXx)
# Full list from NUCC: https://taxonomy.nucc.org/
COMMON_TAXONOMY_CODES = {
    "207R00000X": "Internal Medicine",
    "207Q00000X": "Family Medicine",
    "208D00000X": "General Practice",
    "207V00000X": "Obstetrics & Gynecology",
    "2084N0400X": "Neurology",
    "207RC0000X": "Cardiovascular Disease",
    "261QM0801X": "Mental Health Clinic",
    "208600000X": "Surgery",
    "367A00000X": "Advanced Practice Midwife",
    "363L00000X": "Nurse Practitioner",
    "363A00000X": "Physician Assistant",
    "332B00000X": "Durable Medical Equipment",
}

def validate_taxonomy_code(taxonomy: str) -> dict:
    """Rule 7: Validate taxonomy code format and existence."""
    errors = []
    warnings = []
    if not taxonomy or not taxonomy.strip():
        return {"valid": False, "errors": ["Taxonomy code is required"]}
    clean = taxonomy.strip().upper()
    # Format: exactly 10 characters (3 digits followed by 7 alphanumerics)
    if not re.match(r'^[0-9]{3}[A-Z0-9]{7}$', clean):
        errors.append(f"Invalid taxonomy format: '{clean}' (expected 10-char alphanumeric)")
    # The table above is a small sample, so a well-formed code that is missing
    # from it is only a warning. Production code should check the full NUCC
    # dataset and treat an unknown code as an error.
    if clean not in COMMON_TAXONOMY_CODES and not errors:
        warnings.append(f"Taxonomy '{clean}' not found in common codes; verify against NUCC registry")
    return {
        "valid": len(errors) == 0,
        "errors": errors,
        "warnings": warnings,
        "description": COMMON_TAXONOMY_CODES.get(clean, "Unknown")
    }
Common failure: taxonomy="207R0000X" (9 characters instead of 10, one zero missing). Taxonomy codes are always exactly 10 characters. Fix: Validate length and look up against the current NUCC code set.
Rule 8: Rendering vs. Billing Provider Validation
What it checks: When the rendering provider (who performed the service) differs from the billing provider (the entity submitting the claim), both NPIs are valid, both are actively enrolled with the payer, and the rendering provider is properly associated with the billing entity.
Why it matters: Payers verify that the rendering provider is credentialed and enrolled under the billing provider's group. A mismatch triggers a denial that often requires re-credentialing documentation to resolve — a process that can take weeks.
# Requires validate_npi() from Rule 6
def validate_rendering_vs_billing(
    billing_npi: str,
    rendering_npi: str,
    rendering_required: bool = True
) -> dict:
    """Rule 8: Validate rendering and billing provider relationship."""
    errors = []
    warnings = []
    # Validate billing NPI
    billing_check = validate_npi(billing_npi)
    if not billing_check["valid"]:
        errors.append(f"Billing NPI invalid: {billing_check['errors']}")
    # Rendering NPI validation
    if rendering_npi and rendering_npi.strip():
        rendering_check = validate_npi(rendering_npi)
        if not rendering_check["valid"]:
            errors.append(f"Rendering NPI invalid: {rendering_check['errors']}")
        # Same NPI check (not inherently wrong, but worth flagging)
        if billing_npi.strip() == rendering_npi.strip():
            warnings.append("Billing and rendering NPI are the same; ensure this is a solo practice")
    elif rendering_required:
        errors.append("Rendering NPI is required when billing under a group")
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: Group practice submits with billing NPI only, omitting the individual rendering provider NPI. Most payers require both when billing under a group TIN. Fix: Always include rendering provider NPI in Box 24J when the billing entity is a group practice.
Rule 9: Group NPI Cross-Reference
What it checks: The group/organization NPI is valid, and the individual rendering provider's NPI is properly affiliated with the group NPI in the NPPES (National Plan and Provider Enumeration System) registry.
Why it matters: Payers cross-reference provider affiliations. If a rendering provider isn't listed under the group NPI's authorized members, the claim gets denied as "provider not authorized to bill under this group."
from typing import Optional

# Requires validate_npi() from Rule 6
def validate_group_npi_affiliation(
    group_npi: str,
    individual_npi: str,
    affiliated_npis: Optional[list] = None
) -> dict:
    """Rule 9: Validate group NPI and provider affiliation.
    In production, query NPPES API: https://npiregistry.cms.hhs.gov/api/
    to verify active status and affiliation.
    """
    errors = []
    # Validate group NPI
    group_check = validate_npi(group_npi)
    if not group_check["valid"]:
        errors.append(f"Group NPI invalid: {group_check['errors']}")
    # Validate individual NPI
    ind_check = validate_npi(individual_npi)
    if not ind_check["valid"]:
        errors.append(f"Individual NPI invalid: {ind_check['errors']}")
    # Affiliation check (against pre-loaded roster or NPPES query)
    if affiliated_npis is not None:
        if individual_npi.strip() not in [n.strip() for n in affiliated_npis]:
            errors.append(
                f"Provider {individual_npi} is not affiliated with group {group_npi}. "
                f"Update NPPES or provider enrollment before submitting."
            )
    return {"valid": len(errors) == 0, "errors": errors}
Common failure: New provider joins the practice but NPPES affiliation hasn't been updated yet. Claims submitted under the group NPI get denied. Fix: Build a credentialing workflow that blocks claim submission for new providers until NPPES affiliation is confirmed.
Category 3: Insurance / Payer (Rules 10–13)
Insurance and payer verification errors represent the second largest denial category, with 35% of organizations citing authorization issues and 22% citing expired or invalid insurance as top denial drivers.
Rule 10: Payer ID Enumeration Lookup
What it checks: The payer ID used in the claim matches a valid, active payer in the national payer ID registry. This prevents claims from being routed to non-existent or deactivated payer endpoints.
Why it matters: Submitting to the wrong payer ID means the clearinghouse can't route the claim. It sits in limbo, burning through timely filing deadlines, until someone notices.
from typing import Optional

# Sample payer IDs for illustration only; validate_payer_id() does not consult
# this table. Load the real list from your clearinghouse.
COMMON_PAYERS = {
    '00001': 'Medicare Part A', '00002': 'Medicare Part B',
    '00003': 'Medicare Railroad', '00004': 'Medicare Part B Railroad',
    '60054': 'Aetna', 'SB580': 'Blue Cross CA', '87726': 'UnitedHealthcare',
}

def validate_payer_id(payer_id: str, payer_database: Optional[dict] = None) -> dict:
    """Rule 10: Validate payer ID against known active payers.
    In production, integrate with your clearinghouse's payer list.
    """
    errors = []
    if not payer_id or not payer_id.strip():
        return {"valid": False, "errors": ["Payer ID is required"]}
    clean = payer_id.strip().upper()
    # Basic length check: accept 3-15 characters (most payer IDs are 5-10)
    if not 3 <= len(clean) <= 15:
        errors.append(f"Payer ID length unusual: '{clean}' ({len(clean)} chars)")
    # Check against payer database (an empty database rejects every ID)
    if payer_database is not None:
        payer_info = payer_database.get(clean)
        if not payer_info:
            errors.append(f"Payer ID '{clean}' not found in active payer registry")
        elif payer_info.get("status") != "active":
            errors.append(f"Payer ID '{clean}' is {payer_info.get('status')}; use updated ID")
    return {"valid": len(errors) == 0, "errors": errors}
Common failure: Practice migrates to a new clearinghouse but carries over old payer IDs that have been remapped. Fix: Re-validate your entire payer ID table after any clearinghouse change.
Rule 11: Policy Active Date Verification
What it checks: The patient's insurance policy is active on the date of service. Verifies effective date, termination date, and that the service date falls within the coverage window.
Why it matters: Submitting claims against terminated policies is the most wasteful denial — zero chance of payment, full cost of rework and patient communication. Real-time eligibility checks via the 270/271 EDI transaction can prevent this entirely.
from datetime import datetime, date
from typing import Optional

def validate_policy_active(
    service_date: str,
    policy_effective: str,
    policy_termination: Optional[str] = None
) -> dict:
    """Rule 11: Verify insurance policy is active on service date."""
    errors = []
    warnings = []
    try:
        svc = datetime.strptime(service_date, "%Y-%m-%d").date()
        eff = datetime.strptime(policy_effective, "%Y-%m-%d").date()
    except ValueError as e:
        return {"valid": False, "errors": [f"Date parsing error: {e}"]}
    # Service before policy effective date
    if svc < eff:
        errors.append(
            f"Service date {service_date} is before policy effective date {policy_effective}"
        )
    # Policy terminated before service
    if policy_termination:
        try:
            term = datetime.strptime(policy_termination, "%Y-%m-%d").date()
            if svc > term:
                errors.append(
                    f"Policy terminated on {policy_termination}, "
                    f"service date {service_date} is after termination"
                )
        except ValueError:
            errors.append(f"Invalid termination date: {policy_termination}")
    # Warn if the policy is very new (30 days or less); enrollment may still be in progress
    days_since_effective = (svc - eff).days
    if 0 <= days_since_effective <= 30:
        warnings.append("Policy is 30 days old or less; verify enrollment is complete with payer")
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: Patient switched insurance on the 1st of the month, but the appointment was on the 3rd and the old policy was used. Fix: Run real-time 270/271 eligibility verification at check-in, not just at scheduling.
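When a patient record holds more than one policy, the failure above can also be caught locally by picking the policy whose coverage window contains the service date. The sketch below assumes each policy is a dict with `effective` and optional `termination` dates; those key names and `policy_on_date` are hypothetical. It complements the eligibility check rather than replacing it.

```python
from datetime import date
from typing import List, Optional

def policy_on_date(policies: List[dict], service_date: date) -> Optional[dict]:
    """Return the first policy whose coverage window contains service_date, or None."""
    for policy in policies:
        term = policy.get("termination")
        if policy["effective"] <= service_date and (term is None or service_date <= term):
            return policy
    return None

policies = [
    {"payer": "Old Plan", "effective": date(2024, 1, 1), "termination": date(2025, 5, 31)},
    {"payer": "New Plan", "effective": date(2025, 6, 1), "termination": None},
]
print(policy_on_date(policies, date(2025, 6, 3))["payer"])  # New Plan
```

A `None` result means no policy on file covers the date, which is a cue to re-verify coverage before billing.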
Rule 12: Coordination of Benefits (COB) Order
What it checks: When a patient has multiple insurance policies, the primary/secondary/tertiary payer order is correct. Validates that the primary payer is billed first and that COB information is properly populated on secondary claims.
Why it matters: Submitting to the secondary payer first results in an automatic denial. The secondary payer requires the primary payer's EOB/ERA before processing. Getting the order wrong doubles your submission cycle time.
from typing import Optional

def validate_cob_order(
    insurances: list,
    patient_dob: Optional[str] = None,
    spouse_dob: Optional[str] = None
) -> dict:
    """Rule 12: Validate coordination of benefits payer ordering.
    Args:
        insurances: List of dicts with keys: payer_name, relationship,
            subscriber_dob, policy_type, cob_order
        patient_dob, spouse_dob: Accepted for caller compatibility; the
            checks below do not use them
    """
    errors = []
    warnings = []
    if not insurances:
        return {"valid": False, "errors": ["At least one insurance is required"]}
    if len(insurances) == 1:
        return {"valid": True, "errors": [], "warnings": ["Single insurance; no COB needed"]}
    # Check that COB order is specified and unique
    orders = [ins.get('cob_order') for ins in insurances]
    if None in orders:
        errors.append("COB order must be specified for all insurances when multiple exist")
    elif len(set(orders)) != len(orders):
        errors.append(f"Duplicate COB order numbers: {orders}")
    # Birthday rule: for dependent children covered by both parents,
    # the parent whose birthday falls earlier in the calendar year is primary.
    # The patient is a 'child' on both policies in that situation.
    child_policies = [ins for ins in insurances if ins.get('relationship') == 'child']
    if len(child_policies) >= 2:
        warnings.append(
            "Patient is a dependent on multiple policies; verify Birthday Rule: parent with "
            "earlier calendar birthday (month/day) is primary for dependent children"
        )
    # Medicare Secondary Payer rules
    has_medicare = any('medicare' in ins.get('payer_name', '').lower() for ins in insurances)
    has_employer = any(ins.get('policy_type') == 'employer_group' for ins in insurances)
    if has_medicare and has_employer:
        warnings.append(
            "Medicare + employer group detected: Medicare is typically SECONDARY "
            "when patient is actively employed with group health plan (MSP rules)"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: Patient has employer group insurance AND Medicare. Office bills Medicare as primary. Under Medicare Secondary Payer (MSP) rules, the employer group plan is generally primary for actively employed individuals, subject to employer-size thresholds. Fix: Always ask patients about employment status and other coverage during intake to determine proper COB order.
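The birthday rule itself reduces to a month-and-day comparison. The sketch below shows that comparison in isolation. `birthday_rule_primary` is a hypothetical helper, and it deliberately leaves ties and special cases (court orders, divorced parents) unresolved because they depend on information not modeled here.

```python
from datetime import date

def birthday_rule_primary(parent_a_dob: date, parent_b_dob: date) -> str:
    """Return 'a' or 'b' for the parent whose plan is primary, or 'tie'.

    Only month and day are compared; birth year is ignored.
    """
    a = (parent_a_dob.month, parent_a_dob.day)
    b = (parent_b_dob.month, parent_b_dob.day)
    if a == b:
        return "tie"
    return "a" if a < b else "b"

# Parent A was born later in the calendar year, so parent B's plan is primary
print(birthday_rule_primary(date(1985, 9, 10), date(1980, 3, 2)))  # b
```

Note that the older parent is not automatically primary: a parent born in 1990 with a January birthday comes before a parent born in 1975 with a June birthday.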
Rule 13: Subscriber Relationship Code Validation
What it checks: The patient-to-subscriber relationship code is valid (self, spouse, child, other) and is consistent with the patient demographics. For example, a 65-year-old patient shouldn't have a "child" relationship to a 40-year-old subscriber.
Why it matters: Relationship code mismatches cause the payer to reject the claim because the patient can't be identified in the subscriber's enrollment record.
from typing import Optional

def validate_subscriber_relationship(
    relationship_code: str,
    patient_age: int,
    subscriber_age: Optional[int] = None
) -> dict:
    """Rule 13: Validate subscriber relationship code and demographic consistency."""
    errors = []
    warnings = []
    VALID_CODES = {
        '18': 'Self',
        '01': 'Spouse',
        '19': 'Child',
        '20': 'Employee',
        '21': 'Unknown',
        '39': 'Organ Donor',
        '40': 'Cadaver Donor',
        '53': 'Life Partner',
        'G8': 'Other Relationship',
    }
    if relationship_code not in VALID_CODES:
        errors.append(f"Invalid relationship code: '{relationship_code}' (valid: {list(VALID_CODES.keys())})")
        return {"valid": False, "errors": errors}
    # Ages are compared with "is not None" so that age 0 (a newborn) is still checked
    have_both_ages = patient_age is not None and subscriber_age is not None
    # Consistency checks
    if relationship_code == '19':  # Child
        if patient_age is not None and patient_age > 26:
            warnings.append(
                f"Patient age {patient_age} with 'child' relationship; "
                f"verify: most plans drop dependents at age 26"
            )
        if have_both_ages and patient_age > subscriber_age:
            errors.append(
                f"Patient age ({patient_age}) older than subscriber ({subscriber_age}) "
                f"with 'child' relationship; likely data entry error"
            )
    if relationship_code == '01':  # Spouse
        if patient_age is not None and patient_age < 16:
            warnings.append(f"Patient age {patient_age} with 'spouse' relationship; verify")
    if relationship_code == '18':  # Self
        if have_both_ages and abs(patient_age - subscriber_age) > 1:
            warnings.append("'Self' relationship but patient and subscriber ages differ; verify")
    return {
        "valid": len(errors) == 0,
        "errors": errors,
        "warnings": warnings,
        "relationship": VALID_CODES[relationship_code]
    }
Common failure: Patient is the subscriber (relationship should be "18-Self") but the form defaults to "19-Child." The payer searches for the patient as a dependent under a nonexistent subscriber. Fix: Auto-set relationship to "Self" when subscriber ID and patient ID match.
Category 4: Clinical / Diagnosis (Rules 14–17)
Clinical and diagnosis code errors contribute to 18-25% of claim denials. These rules validate ICD-10-CM codes against CMS requirements that go beyond simple code existence checks.
Rule 14: ICD-10 Code Validity
What it checks: The diagnosis code exists in the current fiscal year's ICD-10-CM code set, is not a header/category code (requires more specific child code), and hasn't been deleted or replaced in the latest annual update.
Why it matters: CMS updates ICD-10-CM annually (October 1 effective date). Using a deleted code or a code from the wrong fiscal year triggers an automatic rejection. There's no appeal — you must resubmit with the correct code.
import re
from typing import Optional

def validate_icd10_code(
    code: str,
    valid_codes: Optional[set] = None,
    header_codes: Optional[set] = None
) -> dict:
    """Rule 14: Validate ICD-10-CM code format and existence.
    In production, load the current FY ICD-10-CM code set from:
    https://www.cms.gov/medicare/coding-billing/icd-10-codes
    """
    errors = []
    warnings = []
    if not code or not code.strip():
        return {"valid": False, "errors": ["Diagnosis code is required"]}
    clean = code.strip().upper().replace('.', '')
    # Format check: 3-7 alphanumeric characters
    # First character is alpha (A-Z), second is a digit, remaining are alphanumeric
    if not re.match(r'^[A-Z]\d[A-Z0-9]{1,5}$', clean):
        errors.append(
            f"Invalid ICD-10-CM format: '{code}' (expected letter + digit + "
            f"1-5 alphanumeric chars, e.g., E11.9, M54.50)"
        )
    # Category-only check: most 3-character codes are headers, but a few
    # (e.g., I10) are complete billable codes, so this is a warning only
    if len(clean) == 3 and not errors:
        warnings.append(
            f"Code '{code}' is a 3-character category code; confirm it is billable, "
            f"since most payers require codes to the highest level of specificity"
        )
    # Validate against code set (if provided)
    if valid_codes and clean not in valid_codes and not errors:
        errors.append(f"Code '{code}' not found in current FY ICD-10-CM code set")
    if header_codes and clean in header_codes:
        errors.append(f"Code '{code}' is a header code; select a more specific child code")
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: code="J45.909" (Unspecified asthma, uncomplicated) when the documentation supports "J45.40" (Moderate persistent asthma, uncomplicated). The unspecified code may process but risks medical necessity denials for specific treatments. Fix: Always code to the highest documented specificity.
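Since the code set changes each October 1, the date of service determines which fiscal year's codes apply. A minimal sketch of that mapping follows. `icd10_fiscal_year` is a hypothetical helper, and it models only the October 1 annual release described above.

```python
from datetime import date

def icd10_fiscal_year(service_date: date) -> int:
    """Return the fiscal year whose ICD-10-CM code set applies to service_date.

    FY N runs from October 1 of year N-1 through September 30 of year N.
    """
    return service_date.year + 1 if service_date.month >= 10 else service_date.year

print(icd10_fiscal_year(date(2025, 9, 30)))  # 2025
print(icd10_fiscal_year(date(2025, 10, 1)))  # 2026
```

The returned year can then be used to pick which code set to pass as `valid_codes`, so claims for September services are not checked against October's codes.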
Rule 15: ICD-10 Specificity Level Check
What it checks: The diagnosis code is coded to the highest level of specificity available in the ICD-10-CM hierarchy. Per CMS ICD-10-CM Official Guidelines, codes must be reported to the highest number of characters available.
Why it matters: CMS explicitly states: "A code is invalid if it has not been coded to the full number of characters required." Using a 4-character code when 5, 6, or 7 characters are available means the claim is technically non-compliant.
from typing import Optional

def validate_icd10_specificity(
    code: str,
    code_hierarchy: Optional[dict] = None
) -> dict:
    """Rule 15: Check ICD-10 code specificity level.
    Args:
        code: The ICD-10-CM code
        code_hierarchy: Dict mapping parent codes to their children.
            If a code has children, it's not the most specific.
    """
    errors = []
    warnings = []
    if not code or not code.strip():
        return {"valid": False, "errors": ["Diagnosis code is required"], "warnings": warnings}
    clean = code.strip().upper().replace('.', '')
    # Check if code has more specific children available
    if code_hierarchy:
        children = code_hierarchy.get(clean, [])
        if children:
            errors.append(
                f"Code '{code}' has more specific subcodes available: "
                f"{', '.join(children[:5])}{'...' if len(children) > 5 else ''}. "
                f"Code to the highest level of specificity per CMS guidelines."
            )
    # Laterality check (heuristic): many musculoskeletal and injury codes require it
    laterality_prefixes = ['M', 'S', 'T', 'G', 'H']  # Common laterality-required chapters
    if clean[0] in laterality_prefixes and len(clean) >= 5:
        # A trailing 9 often means "unspecified side" when laterality is available
        if clean[-1] == '9':
            warnings.append(
                f"Code '{code}' uses unspecified laterality; "
                f"if documentation indicates left/right, use the specific code"
            )
    # 7th character extension check (injury codes)
    if clean[0] in ['S', 'T'] and len(clean) < 7:
        warnings.append(
            f"Injury code '{code}' may require a 7th character extension "
            f"(A=initial, D=subsequent, S=sequela)"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: code="M25.56" (Pain in knee) without laterality. It should be "M25.561" (right knee) or "M25.562" (left knee). CMS requires laterality when documented. Fix: Build laterality prompts into your coding interface for applicable code categories.
Rule 16: Gender-Age Appropriateness
What it checks: The diagnosis code is clinically appropriate for the patient's gender and age. It flags mismatches such as prostate-specific codes for female patients or pediatric-specific diagnoses for elderly patients.
Why it matters: Gender- and age-inappropriate diagnosis codes trip the sex-conflict and age-conflict edits in CMS code editors such as the Medicare Code Editor (MCE) and the Integrated Outpatient Code Editor (I/OCE). These are hard denials that require code correction and resubmission.
# Gender-specific code ranges (simplified; production needs full CMS edit tables)
MALE_ONLY_PREFIXES = ['N40', 'N41', 'N42', 'N43', 'N44', 'N45', 'N46', 'N47', 'N48', 'N49', 'N50', 'C61', 'D29']
# 'O' matches every Chapter 15 code, so maternity codes beyond O00/O10/O20/O30 are caught too
FEMALE_ONLY_PREFIXES = ['N70', 'N71', 'N72', 'N73', 'N74', 'N75', 'N76', 'N77', 'O', 'C56', 'C57', 'D25', 'D26', 'D27']
# Age-specific ranges
NEWBORN_ONLY_PREFIX = 'P'  # Certain conditions originating in perinatal period
MATERNITY_PREFIX = 'O'  # Pregnancy, childbirth, puerperium (warned below outside ages 10-65)

def validate_gender_age_appropriateness(
    icd10_code: str,
    patient_gender: str,
    patient_age: int
) -> dict:
    """Rule 16: Validate diagnosis code is appropriate for patient gender and age."""
    errors = []
    warnings = []
    clean = icd10_code.strip().upper().replace('.', '')
    # Tolerate padded or lowercase input such as " f" or "female"
    gender = patient_gender.strip().upper()[:1] if patient_gender else ''
    # Gender checks (str.startswith accepts a tuple of prefixes)
    if gender == 'F' and clean.startswith(tuple(MALE_ONLY_PREFIXES)):
        errors.append(
            f"Code '{icd10_code}' is male-specific but patient gender is female"
        )
    if gender == 'M' and clean.startswith(tuple(FEMALE_ONLY_PREFIXES)):
        errors.append(
            f"Code '{icd10_code}' is female-specific but patient gender is male"
        )
    # Age checks
    if clean.startswith(MATERNITY_PREFIX) and gender == 'F':  # Maternity codes
        if patient_age < 10 or patient_age > 65:
            warnings.append(
                f"Maternity code '{icd10_code}' with patient age {patient_age}: verify"
            )
    if clean.startswith(NEWBORN_ONLY_PREFIX):  # Perinatal codes
        if patient_age > 1:
            warnings.append(
                f"Perinatal code '{icd10_code}' with patient age {patient_age}: "
                f"these codes are typically for newborns (age 0-1)"
            )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: code="N40.0" (Enlarged prostate) submitted for a female patient due to a copy-paste error from a prior encounter template. Fix: Run gender/age edits before submission. Every major claim scrubber should include CMS OCE-level edit checks.
Rule 17: Medical Necessity Indicators
What it checks: The diagnosis code(s) support the medical necessity of the procedure code(s) being billed. This validates that there's a clinically logical relationship between what was diagnosed and what was done.
Why it matters: Medical necessity denials represent a growing portion of the $262 billion annual denial crisis. Payers increasingly use AI to flag claims where the diagnosis doesn't justify the procedure, making pre-submission validation essential.
# Simplified medical necessity mapping (production: use LCD/NCD databases)
PROCEDURE_DIAGNOSIS_MAP = {
    '99214': {  # Office visit, moderate complexity
        'supported': ['E11', 'I10', 'J45', 'M54', 'F41', 'G43'],  # Chronic conditions
        'denied': ['Z00'],  # General exam: use preventive visit codes instead
    },
    '71046': {  # Chest X-ray, 2 views
        'supported': ['J18', 'J44', 'R05', 'R06', 'J96', 'J98'],  # Respiratory
        'denied': ['Z00', 'Z01'],  # Routine exam without symptoms
    },
    '20610': {  # Arthrocentesis, major joint
        'supported': ['M25', 'M17', 'M06', 'M10', 'M15'],  # Joint conditions
        'denied': ['Z96'],  # Presence of implants: needs specific dx
    },
}

def validate_medical_necessity(
    cpt_code: str,
    icd10_codes: list,
    lcd_database: dict = None
) -> dict:
    """Rule 17: Check if diagnosis codes support medical necessity for the procedure."""
    errors = []
    warnings = []
    # Use LCD database if available, otherwise simplified map
    procedure_map = (lcd_database or PROCEDURE_DIAGNOSIS_MAP).get(cpt_code)
    if not procedure_map:
        warnings.append(f"No medical necessity mapping found for CPT {cpt_code}; manual review recommended")
        return {"valid": True, "errors": [], "warnings": warnings}
    supported = procedure_map.get('supported', [])
    denied = procedure_map.get('denied', [])
    # Check if any diagnosis supports the procedure
    has_support = False
    for dx in icd10_codes:
        clean_dx = dx.strip().upper().replace('.', '')
        dx_prefix = clean_dx[:3]  # Compare at the 3-character category level
        if dx_prefix in denied:
            errors.append(
                f"Diagnosis '{dx}' is known to NOT support medical necessity for CPT {cpt_code}"
            )
        if dx_prefix in supported:
            has_support = True
    if not has_support and supported:
        warnings.append(
            f"None of the submitted diagnoses {icd10_codes} are in the typical "
            f"medical necessity list for CPT {cpt_code}. Expected categories: {supported}"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: cpt="99214" (moderate E/M visit) paired only with icd10="Z00.00" (general exam). This combination will deny because a general exam doesn't justify a problem-oriented visit; you need a preventive visit code (99395-99397) instead. Fix: Map procedure codes to supported diagnoses and flag mismatches before submission.
Category 5: Procedure / Service (Rules 18–21)
Procedure and service code errors cause 18% of denials and are particularly costly because they often require clinical documentation review to resolve, not just a data correction.
Rule 18: CPT/HCPCS Code Validity
What it checks: The procedure code exists in the current year's CPT-4 or HCPCS Level II code set, hasn't been deleted, and is appropriate for the setting (some codes are facility-only or professional-only).
Why it matters: CPT codes are updated annually by the AMA (January 1 effective date, not October 1 like ICD-10). Using a deleted code means automatic rejection with no appeal path — you must resubmit with the correct code.
import re

def validate_procedure_code(
    code: str,
    valid_cpt_codes: set = None,
    valid_hcpcs_codes: set = None,
    deleted_codes: set = None
) -> dict:
    """Rule 18: Validate CPT/HCPCS code existence and format."""
    errors = []
    warnings = []
    if not code or not code.strip():
        return {"valid": False, "errors": ["Procedure code is required"], "warnings": warnings}
    clean = code.strip().upper()
    # Determine code type
    if re.match(r'^\d{5}$', clean):  # CPT Category I format (5 digits)
        code_type = "CPT"
    elif re.match(r'^\d{4}[FTU]$', clean):  # CPT Category II (F), Category III (T), PLA (U)
        code_type = "CPT"
    elif re.match(r'^[A-V]\d{4}$', clean):  # HCPCS Level II (letter + 4 digits)
        code_type = "HCPCS"
    else:
        return {"valid": False, "errors": [f"Invalid procedure code format: '{code}'"], "warnings": warnings}
    # Check against deleted codes
    if deleted_codes and clean in deleted_codes:
        errors.append(
            f"{code_type} code '{clean}' has been deleted; "
            f"check AMA crosswalk for replacement code"
        )
    # Validate against current code set
    if code_type == "CPT" and valid_cpt_codes:
        if clean not in valid_cpt_codes:
            errors.append(f"CPT code '{clean}' not found in current year code set")
    elif code_type == "HCPCS" and valid_hcpcs_codes:
        if clean not in valid_hcpcs_codes:
            errors.append(f"HCPCS code '{clean}' not found in current code set")
    # Category I CPT range checks (only all-digit codes can be converted to int)
    if code_type == "CPT" and clean.isdigit():
        code_num = int(clean)
        if 99202 <= code_num <= 99499:
            warnings.append("E/M code: ensure documentation level supports code selection")
    return {"valid": len(errors) == 0, "errors": errors, "code_type": code_type, "warnings": warnings}
Common failure: code="99201". This E/M code was deleted effective January 1, 2021, when the AMA revised the new patient office visit codes. The replacement is "99202" or higher. Fix: Update your charge master and encounter templates annually when CPT updates are released.
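The annual update described in the fix can be partly automated by scanning the charge master against that year's deleted-code list. The sketch below is illustrative only: `find_deleted_charge_codes`, the `charge_master` layout, and the `deleted_crosswalk` mapping are hypothetical names and sample data, and a real crosswalk would have to come from the published CPT changes.

```python
def find_deleted_charge_codes(charge_master: dict, deleted_crosswalk: dict) -> list:
    """Return charge master entries whose code appears in the deleted-code list.

    charge_master: maps an internal charge ID to a CPT/HCPCS code (hypothetical layout)
    deleted_crosswalk: maps a deleted code to a suggested replacement, or None
    """
    findings = []
    for charge_id, code in sorted(charge_master.items()):
        clean = code.strip().upper()
        if clean in deleted_crosswalk:
            findings.append({
                "charge_id": charge_id,
                "deleted_code": clean,
                "suggested_replacement": deleted_crosswalk[clean],
            })
    return findings


# Sample data for illustration only
charge_master = {"OV-NEW-1": "99201", "OV-EST-4": "99214"}
deleted_crosswalk = {"99201": "99202"}
stale = find_deleted_charge_codes(charge_master, deleted_crosswalk)
```

Running a scan like this each January, before the first claims of the year go out, surfaces stale template codes in one pass instead of one denial at a time.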
Rule 19: Modifier Appropriateness
What it checks: Modifiers attached to procedure codes are valid, logically appropriate (e.g., modifier 25 on E/M codes, modifier 59 for distinct procedures), and not in conflict with each other or with NCCI edits.
Why it matters: Inappropriate modifiers trigger NCCI edit failures and can flag claims for audit. Overuse of modifier 59 (distinct procedural service) is one of the most common OIG audit targets.
# Partial modifier table for illustration; load the full modifier list in production
VALID_MODIFIERS = {
    '25': {'description': 'Significant, separately identifiable E/M', 'applies_to': 'E/M codes only (99202-99499)'},
    '59': {'description': 'Distinct procedural service', 'applies_to': 'Surgical/procedure codes'},
    'XE': {'description': 'Separate encounter', 'applies_to': 'Replaces 59 for encounter separation'},
    'XS': {'description': 'Separate structure', 'applies_to': 'Replaces 59 for anatomic separation'},
    'XP': {'description': 'Separate practitioner', 'applies_to': 'Replaces 59 for provider separation'},
    'XU': {'description': 'Unusual non-overlapping service', 'applies_to': 'Replaces 59 for service separation'},
    '26': {'description': 'Professional component', 'applies_to': 'Diagnostic tests with technical component'},
    'TC': {'description': 'Technical component', 'applies_to': 'Diagnostic tests'},
    'LT': {'description': 'Left side', 'applies_to': 'Bilateral procedures'},
    'RT': {'description': 'Right side', 'applies_to': 'Bilateral procedures'},
    '50': {'description': 'Bilateral procedure', 'applies_to': 'Procedures performed on both sides'},
    '76': {'description': 'Repeat procedure by same physician', 'applies_to': 'Same day repeat'},
    '77': {'description': 'Repeat procedure by different physician', 'applies_to': 'Same day repeat'},
}

def validate_modifiers(cpt_code: str, modifiers: list) -> dict:
    """Rule 19: Validate modifier appropriateness for the procedure code."""
    errors = []
    warnings = []
    if not modifiers:
        return {"valid": True, "errors": [], "warnings": []}
    # Normalize once so every check below compares cleaned values
    clean_mods = [m.strip().upper() for m in modifiers]
    for clean_mod in clean_mods:
        if clean_mod not in VALID_MODIFIERS:
            errors.append(f"Unknown modifier: '{clean_mod}'")
            continue
        # Modifier 25 only on E/M codes
        if clean_mod == '25':
            try:
                if not (99202 <= int(cpt_code) <= 99499):
                    errors.append(f"Modifier 25 is only valid on E/M codes (99202-99499), not {cpt_code}")
            except ValueError:
                pass  # Alphanumeric (HCPCS) codes are not range-checked here
    # 26 and TC are mutually exclusive
    if '26' in clean_mods and 'TC' in clean_mods:
        errors.append("Modifiers 26 (professional) and TC (technical) are mutually exclusive")
    # LT and RT together = should use 50 instead
    if 'LT' in clean_mods and 'RT' in clean_mods:
        warnings.append("Both LT and RT modifiers present; consider using modifier 50 (bilateral) instead")
    # Check for excessive modifiers (a CMS-1500 service line has four modifier positions)
    if len(modifiers) > 4:
        warnings.append(f"{len(modifiers)} modifiers on one code: unusual, verify correctness")
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: Modifier 25 attached to a surgical code instead of an E/M code. Modifier 25 is exclusively for E/M services that are "significant and separately identifiable" from a procedure performed on the same day. Fix: Build modifier validation logic that checks the code category before allowing specific modifiers.
Rule 20: Units of Service Validation
What it checks: The number of units billed is appropriate for the procedure code, time-based services have corresponding documentation of time, and units don't exceed medically reasonable limits for a single encounter.
Why it matters: Overbilling units (accidentally entering 10 units instead of 1) can trigger fraud detection algorithms. Underbilling means lost revenue. Both undermine claim integrity.
# Unit limits by procedure type (simplified; production should use the published MUE values)
MEDICALLY_UNLIKELY_EDITS = {
    '99214': {'max_units': 1, 'reason': 'E/M code: one per encounter'},
    '99213': {'max_units': 1, 'reason': 'E/M code: one per encounter'},
    '96372': {'max_units': 4, 'reason': 'Therapeutic injection: max 4 per encounter'},
    '90837': {'max_units': 1, 'reason': 'Psychotherapy 53+ min: one per day'},
    '97110': {'max_units': 4, 'reason': 'Therapeutic exercises: 4 units = 60 min'},
    '97140': {'max_units': 4, 'reason': 'Manual therapy: 4 units = 60 min'},
    'J3301': {'max_units': 8, 'reason': 'Triamcinolone injection'},
}

def validate_units(
    cpt_code: str,
    units: int,
    mue_database: dict = None
) -> dict:
    """Rule 20: Validate units of service against MUE limits."""
    errors = []
    warnings = []
    if units < 1:
        return {"valid": False, "errors": [f"Units must be at least 1, got {units}"], "warnings": warnings}
    # Check against MUE (Medically Unlikely Edits)
    mue_table = mue_database or MEDICALLY_UNLIKELY_EDITS
    mue_entry = mue_table.get(cpt_code)
    if mue_entry:
        max_units = mue_entry['max_units']
        if units > max_units:
            errors.append(
                f"CPT {cpt_code}: {units} units exceeds MUE limit of {max_units}. "
                f"Reason: {mue_entry['reason']}"
            )
    else:
        # General reasonableness check
        if units > 20:
            warnings.append(
                f"CPT {cpt_code}: {units} units seems high; no MUE found, manual review recommended"
            )
    # Time-based code check (15-min units)
    time_based_codes = ['97110', '97112', '97116', '97140', '97530', '97535', '97542']
    if cpt_code in time_based_codes and units > 1:
        total_minutes = units * 15
        warnings.append(
            f"Time-based code: {units} units = {total_minutes} minutes; "
            f"ensure documentation supports this treatment time"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: cpt="99214", units=2. E/M codes are limited to 1 unit per encounter. Billing 2 units triggers MUE denial. Fix: Enforce MUE limits in your charge capture system so providers can't enter invalid unit counts.
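One way to tie billed units back to documented time is to derive the expected unit count from the minutes recorded in the note. The sketch below assumes the payer follows the Medicare "8-minute rule" for 15-minute timed codes, under which a unit needs at least 8 minutes and each further unit starts past the midpoint of the next 15-minute block. `units_from_minutes` and `check_time_based_units` are hypothetical helpers, not part of the rule code above, and other payers may count time differently.

```python
def units_from_minutes(total_minutes: int) -> int:
    """Billable 15-minute units under the 8-minute rule (assumed payer policy)."""
    if total_minutes < 8:
        return 0  # under 8 minutes does not earn a timed unit
    # 8-22 min -> 1 unit, 23-37 min -> 2 units, 38-52 min -> 3 units, ...
    return (total_minutes + 7) // 15


def check_time_based_units(billed_units: int, documented_minutes: int) -> dict:
    """Compare billed units with what the documented minutes support."""
    supported = units_from_minutes(documented_minutes)
    errors = []
    warnings = []
    if billed_units > supported:
        errors.append(
            f"Billed {billed_units} units but {documented_minutes} documented "
            f"minutes support only {supported}"
        )
    elif billed_units < supported:
        warnings.append(
            f"Documented time supports {supported} units; only {billed_units} billed"
        )
    return {"valid": not errors, "errors": errors, "warnings": warnings}
```

Under these assumptions, 40 documented minutes support 3 units, so a charge line with 4 units would be flagged before it reaches the payer.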
Rule 21: Place of Service (POS) Consistency
What it checks: The Place of Service code (Box 24B on CMS-1500) is valid, matches the actual service location, and is consistent with the procedure code. Some procedures are only payable in specific settings.
Why it matters: POS code determines the facility vs. non-facility reimbursement rate. Using POS 11 (Office) for a procedure performed in POS 22 (Outpatient Hospital) means incorrect reimbursement or denial if the payer detects the mismatch.
PLACE_OF_SERVICE_CODES = {
    '02': 'Telehealth (not patient home)',
    '10': 'Telehealth (patient home)',
    '11': 'Office',
    '12': 'Home',
    '21': 'Inpatient Hospital',
    '22': 'On Campus Outpatient Hospital',
    '23': 'Emergency Room - Hospital',
    '24': 'Ambulatory Surgical Center',
    '31': 'Skilled Nursing Facility',
    '32': 'Nursing Facility',
    '81': 'Independent Laboratory',
}
# POS restrictions for common procedure types
# (illustrative code lists; confirm each code is active in the current CPT year)
POS_RESTRICTIONS = {
    'telehealth_only': {'pos': ['02', '10'], 'codes': ['99441', '99442', '99443']},
    'facility_only': {'pos': ['21', '22', '23', '24'], 'codes': ['43239', '43249', '27447']},
    'not_telehealth': {'exclude_pos': ['02', '10'], 'codes': ['20610', '96372', '11102']},
}

def validate_place_of_service(cpt_code: str, pos_code: str) -> dict:
    """Rule 21: Validate place of service consistency with procedure."""
    errors = []
    warnings = []
    if pos_code not in PLACE_OF_SERVICE_CODES:
        errors.append(f"Invalid Place of Service code: '{pos_code}'")
        return {"valid": False, "errors": errors, "warnings": warnings}
    pos_name = PLACE_OF_SERVICE_CODES[pos_code]
    # Check POS restrictions
    for rule_name, rule in POS_RESTRICTIONS.items():
        if cpt_code in rule.get('codes', []):
            if 'pos' in rule and pos_code not in rule['pos']:
                errors.append(
                    f"CPT {cpt_code} requires POS {rule['pos']} "
                    f"but submitted with POS {pos_code} ({pos_name})"
                )
            if 'exclude_pos' in rule and pos_code in rule['exclude_pos']:
                errors.append(
                    f"CPT {cpt_code} cannot be billed with POS {pos_code} ({pos_name})"
                )
    # Telehealth code with non-telehealth POS
    # ("not errors" avoids reporting the same mismatch twice for codes already caught above)
    telehealth_cpts = ['99441', '99442', '99443', '99421', '99422', '99423']
    if cpt_code in telehealth_cpts and pos_code not in ['02', '10'] and not errors:
        errors.append(
            f"Telehealth CPT {cpt_code} requires POS 02 or 10, not {pos_code} ({pos_name})"
        )
    return {
        "valid": len(errors) == 0,
        "errors": errors,
        "warnings": warnings,
        "pos_description": pos_name
    }
Common failure: Telehealth visit billed with POS 11 (Office) instead of POS 02 or 10 (Telehealth). Since COVID, payers have tightened telehealth POS enforcement. Fix: Auto-set POS based on the encounter type in your scheduling/EHR system.
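The auto-set approach in the fix can be as small as a lookup from the scheduling system's encounter type to a default POS code. This is a minimal sketch: the encounter-type labels and the `default_pos_for_encounter` helper are hypothetical, and a real scheduling or EHR system will have its own type values.

```python
from typing import Optional

# Hypothetical encounter-type labels mapped to the POS codes listed above
ENCOUNTER_TYPE_TO_POS = {
    "office_visit": "11",
    "telehealth_patient_home": "10",
    "telehealth_other_site": "02",
    "home_visit": "12",
    "emergency": "23",
}


def default_pos_for_encounter(encounter_type: str) -> Optional[str]:
    """Return the default POS code for an encounter type, or None if unmapped.

    Returning None (rather than guessing) forces staff to pick the POS
    manually for encounter types the mapping does not cover.
    """
    key = encounter_type.strip().lower()
    return ENCOUNTER_TYPE_TO_POS.get(key)
```

Treating the result as a default that staff can override keeps edge cases, such as a telehealth visit that converts to an in-person one, from being locked to the wrong code.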
Category 6: Financial (Rules 22–25)
Financial validation catches the errors that survive clinical review — duplicate submissions, unreasonable charges, fee schedule mismatches, and the silent killer: timely filing violations.
Rule 22: Charge Amount Reasonableness
What it checks: The billed charge amount falls within a reasonable range for the procedure code, based on Medicare fee schedule benchmarks and historical charge data. Flags both suspiciously low charges (potential data entry errors) and extremely high charges.
Why it matters: A charge of $5.00 for a 99214 (should be ~$150+) indicates a decimal point error. A charge of $15,000 for a routine office visit triggers fraud detection flags. Both scenarios waste time and delay payment.
def validate_charge_amount(
    cpt_code: str,
    charge_amount: float,
    fee_schedule: dict = None
) -> dict:
    """Rule 22: Validate charge amount reasonableness."""
    errors = []
    warnings = []
    if charge_amount <= 0:
        return {"valid": False, "errors": [f"Charge amount must be positive: ${charge_amount}"], "warnings": warnings}
    # Reference fee schedule (simplified; use actual Medicare PFS in production)
    DEFAULT_FEE_SCHEDULE = {
        '99213': {'min': 50, 'typical': 120, 'max': 500},
        '99214': {'min': 75, 'typical': 180, 'max': 750},
        '99215': {'min': 100, 'typical': 250, 'max': 1000},
        '99203': {'min': 75, 'typical': 170, 'max': 700},
        '99204': {'min': 100, 'typical': 260, 'max': 1000},
        '99385': {'min': 100, 'typical': 250, 'max': 800},
        '90837': {'min': 80, 'typical': 180, 'max': 600},
        '96372': {'min': 15, 'typical': 45, 'max': 200},
    }
    schedule = (fee_schedule or DEFAULT_FEE_SCHEDULE).get(cpt_code)
    if schedule:
        if charge_amount < schedule['min']:
            errors.append(
                f"Charge ${charge_amount:.2f} for CPT {cpt_code} is below minimum "
                f"${schedule['min']}; possible data entry error"
            )
        elif charge_amount > schedule['max']:
            warnings.append(
                f"Charge ${charge_amount:.2f} for CPT {cpt_code} exceeds typical max "
                f"${schedule['max']}; verify amount is correct"
            )
        elif charge_amount > schedule['typical'] * 3:
            warnings.append(
                f"Charge ${charge_amount:.2f} is 3x+ the typical ${schedule['typical']}; confirm"
            )
    # Decimal point check for E/M codes with no fee schedule entry
    # ("not errors" avoids a second error when the floor check already fired)
    if charge_amount < 10 and cpt_code.startswith('99') and not errors:
        errors.append(
            f"Charge ${charge_amount:.2f} for E/M code {cpt_code}: likely a misplaced decimal point"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: charge=18.00 for CPT 99214. Someone entered $18 instead of $180. The claim processes at $18 and gets paid at $18: no denial, just silent revenue loss. Fix: Set floor/ceiling charge validation per CPT code based on your fee schedule.
Rule 23: Fee Schedule Compliance
What it checks: The billed amount aligns with the contracted fee schedule for the specific payer, charges are at or above the contracted rate (to capture the full allowed amount), and any codes missing from the contracted fee schedule are flagged.
Why it matters: Billing below your contracted rate means you leave money on the table — permanently. Most payers pay the lower of billed charges or contracted rate. If your billed charge is $100 but your contracted rate is $150, you just forfeited $50.
def validate_fee_schedule_compliance(
    cpt_code: str,
    charge_amount: float,
    payer_id: str,
    contracted_rates: dict = None
) -> dict:
    """Rule 23: Validate charge amount against contracted fee schedule."""
    errors = []
    warnings = []
    if not contracted_rates:
        warnings.append("No contracted rates loaded; cannot validate fee schedule compliance")
        return {"valid": True, "errors": [], "warnings": warnings}
    payer_schedule = contracted_rates.get(payer_id, {})
    contracted_rate = payer_schedule.get(cpt_code)
    if contracted_rate is None:
        warnings.append(
            f"CPT {cpt_code} not found in {payer_id} fee schedule; "
            f"verify code is covered under this contract"
        )
        return {"valid": True, "errors": [], "warnings": warnings}
    # Critical: billed amount should be >= contracted rate
    if charge_amount < contracted_rate:
        errors.append(
            f"REVENUE LOSS: Billed ${charge_amount:.2f} is below contracted rate "
            f"${contracted_rate:.2f} for CPT {cpt_code} with payer {payer_id}. "
            f"Payer will pay ${charge_amount:.2f} instead of ${contracted_rate:.2f}"
        )
    # Thin margin: at or above the contracted rate, but by less than 10%
    # (elif, so an under-billed charge is not also reported as "within 10%")
    elif charge_amount < contracted_rate * 1.1:
        warnings.append(
            f"Billed amount ${charge_amount:.2f} is within 10% of contracted rate "
            f"${contracted_rate:.2f}; consider raising charges to ensure full capture"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: Practice charges $120 for 99214 but the Blue Cross contracted rate is $145. Every 99214 claim to this payer leaves $25 on the table. Over 5,000 annual visits, that's $125,000 in lost revenue. Fix: Load all payer contracted rates into your validation engine and flag any billed charge below the contracted amount.
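The revenue-loss arithmetic follows directly from the "lower of billed or contracted" payment logic. As a rough sketch, with `expected_allowed` and `annual_shortfall` as hypothetical helper names and with patient cost-sharing and adjustments deliberately ignored:

```python
def expected_allowed(billed: float, contracted: float) -> float:
    """Allowed amount when the payer pays the lower of billed or contracted."""
    return min(billed, contracted)


def annual_shortfall(billed: float, contracted: float, annual_volume: int) -> float:
    """Revenue forfeited per year by billing below the contracted rate."""
    per_claim_loss = max(contracted - billed, 0)
    return per_claim_loss * annual_volume


# The Blue Cross example: $120 billed, $145 contracted, 5,000 visits a year
per_claim = expected_allowed(120.0, 145.0)      # 120.0 is paid, not 145.0
shortfall = annual_shortfall(120.0, 145.0, 5000)  # 25.0 * 5000 = 125000.0
```

Billing above the contracted rate produces no shortfall under this model, which is why the rule treats under-billing as an error and a thin margin only as a warning.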
Rule 24: Duplicate Claim Detection
What it checks: The claim hasn't been previously submitted with the same patient, date of service, procedure code, and provider combination. Also detects near-duplicates where only the charge amount differs (common in rebilling scenarios).
Why it matters: Duplicate claims account for 12% of denials. Beyond denials, systematic duplicate submissions can trigger fraud investigations. The OIG actively monitors for patterns of duplicate billing.
import hashlib

def generate_claim_fingerprint(claim: dict) -> str:
    """Create a unique fingerprint for duplicate detection."""
    # Charge amount is deliberately excluded so rebills at a different price still match
    key_fields = [
        claim.get('patient_id', ''),
        claim.get('service_date', ''),
        claim.get('cpt_code', ''),
        claim.get('rendering_npi', ''),
        claim.get('pos_code', ''),
        str(claim.get('units', 1)),
    ]
    fingerprint_str = '|'.join(str(f).strip().upper() for f in key_fields)
    return hashlib.sha256(fingerprint_str.encode()).hexdigest()

def validate_duplicate_claim(
    claim: dict,
    submitted_claims: dict = None
) -> dict:
    """Rule 24: Detect duplicate claim submissions."""
    errors = []
    warnings = []
    fingerprint = generate_claim_fingerprint(claim)
    if submitted_claims is None:
        warnings.append("No claim history loaded; duplicate detection unavailable")
        return {"valid": True, "errors": [], "warnings": warnings}
    existing = submitted_claims.get(fingerprint)
    if existing:
        errors.append(
            f"DUPLICATE DETECTED: This claim matches previously submitted claim "
            f"#{existing.get('claim_id')} from {existing.get('submit_date')}. "
            f"Status: {existing.get('status', 'unknown')}"
        )
    # Near-duplicate check (same patient + date + provider, different code)
    partial_key = f"{claim.get('patient_id')}|{claim.get('service_date')}|{claim.get('rendering_npi')}"
    same_encounter = [
        c for c in submitted_claims.values()
        if isinstance(c, dict) and
        f"{c.get('patient_id')}|{c.get('service_date')}|{c.get('rendering_npi')}" == partial_key
    ]
    if len(same_encounter) >= 5:
        warnings.append(
            f"Patient has {len(same_encounter)} claims for the same date/provider; "
            f"verify this isn't a multi-line correction that should be a single claim"
        )
    return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
Common failure: Staff resubmits a claim that was actually paid (ERA not yet posted), creating a true duplicate that gets denied and wastes rework time. Fix: Check claim status in your system before resubmission, and integrate ERA posting into your duplicate detection window.
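The status check in the fix can be expressed as a small gate in front of the resubmission queue. The sketch below is an assumption-heavy illustration: `safe_to_resubmit`, the `status` and `submit_date` fields, and the 30-day ERA posting window are all hypothetical and should be replaced with your system's own fields and your payers' observed remittance lag.

```python
from datetime import date


def safe_to_resubmit(original: dict, today: date, era_window_days: int = 30) -> bool:
    """Decide whether a previously submitted claim may be sent again.

    original: prior submission with 'status' and 'submit_date' (hypothetical fields)
    era_window_days: assumed time allowed for the payer's ERA to arrive and post
    """
    status = original.get("status", "unknown")
    if status == "paid":
        return False  # already paid; resubmitting creates a true duplicate
    if status == "denied":
        return True  # a corrected claim is expected after a denial
    # Still pending or unknown: hold until the ERA posting window has passed
    days_since_submit = (today - original["submit_date"]).days
    return days_since_submit > era_window_days
```

Holding pending claims for the length of the window trades a short delay for fewer true duplicates, which is usually the cheaper side of that trade.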
Rule 25: Timely Filing Deadline Check
What it checks: The claim is being submitted within the payer's timely filing deadline. Medicare requires submission within 365 days. Medicaid varies by state (90 days to 1 year). Commercial payers typically require 90-180 days.
Why it matters: Timely filing denials are generally non-appealable. If you miss the window, the revenue is almost always gone for good: payers grant exceptions only in narrow circumstances, such as a documented payer error. No other denial leaves so little room for recovery.
from datetime import datetime, timedelta

# Payer-specific filing deadlines (days from service date).
# Illustrative values only: limits vary by contract and plan, so load your own.
TIMELY_FILING_LIMITS = {
    'medicare': 365,
    'medicaid_ny': 90,
    'medicaid_ca': 180,
    'medicaid_tx': 95,
    'medicaid_fl': 365,
    'bcbs': 180,
    'aetna': 90,
    'united': 90,
    'cigna': 180,
    'humana': 365,
    'default_commercial': 120,
}

def validate_timely_filing(
    service_date: str,
    submission_date: str,
    payer_type: str = 'default_commercial'
) -> dict:
    """Rule 25: Validate claim is within timely filing deadline."""
    errors = []
    warnings = []
    try:
        svc = datetime.strptime(service_date, "%Y-%m-%d").date()
        sub = datetime.strptime(submission_date, "%Y-%m-%d").date()
    except ValueError as e:
        return {"valid": False, "errors": [f"Date parsing error: {e}"], "warnings": warnings}
    days_elapsed = (sub - svc).days
    # A submission dated before the service is a data error, not an early filing
    if days_elapsed < 0:
        errors.append(
            f"Submission date {submission_date} is before service date {service_date}"
        )
    filing_limit = TIMELY_FILING_LIMITS.get(
        payer_type.lower(),
        TIMELY_FILING_LIMITS['default_commercial']
    )
    deadline = svc + timedelta(days=filing_limit)
    days_remaining = (deadline - sub).days
    if days_remaining < 0:
        errors.append(
            f"TIMELY FILING EXPIRED: Service date {service_date}, "
            f"{payer_type} deadline was {deadline} ({filing_limit} days). "
            f"Expired {abs(days_remaining)} days ago. THIS DENIAL IS NON-APPEALABLE."
        )
    elif days_remaining <= 14:
        warnings.append(
            f"URGENT: Only {days_remaining} days remaining to file. "
            f"Deadline: {deadline} ({payer_type}: {filing_limit}-day limit)"
        )
    elif days_remaining <= 30:
        warnings.append(
            f"Approaching deadline: {days_remaining} days remaining. "
            f"Deadline: {deadline}"
        )
    return {
        "valid": len(errors) == 0,
        "errors": errors,
        "warnings": warnings,
        "days_elapsed": days_elapsed,
        "days_remaining": max(days_remaining, 0),
        "deadline": str(deadline)
    }
Common failure: Claim from a date of service 95 days ago, payer is Aetna (90-day filing limit). The claim is already past the non-appealable deadline. Fix: Run timely filing checks daily on your unbilled/pending queue and escalate anything within 30 days of its deadline.
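The daily queue check in the fix might look like the following sketch, which builds a worklist of claims at or near their deadline and sorts the most urgent first. `filing_worklist`, the claim dictionary layout, and the sample limits are hypothetical; the 30-day escalation threshold comes from the fix above.

```python
from datetime import date, timedelta


def filing_worklist(pending_claims: list, filing_limits: dict,
                    today: date, escalate_within_days: int = 30) -> list:
    """Return pending claims at or near their filing deadline, most urgent first.

    pending_claims: dicts with 'claim_id', 'payer', and 'service_date' (a date)
    filing_limits: payer name -> filing limit in days, with a 'default_commercial' key
    """
    worklist = []
    for claim in pending_claims:
        limit = filing_limits.get(claim["payer"], filing_limits["default_commercial"])
        deadline = claim["service_date"] + timedelta(days=limit)
        days_remaining = (deadline - today).days
        if days_remaining <= escalate_within_days:
            worklist.append({**claim, "deadline": deadline,
                             "days_remaining": days_remaining})
    # Negative values (already expired) sort to the top
    return sorted(worklist, key=lambda c: c["days_remaining"])


# Sample data for illustration only
limits = {"aetna": 90, "bcbs": 180, "default_commercial": 120}
queue = [
    {"claim_id": "A", "payer": "aetna", "service_date": date(2025, 3, 15)},
    {"claim_id": "B", "payer": "bcbs", "service_date": date(2025, 5, 1)},
    {"claim_id": "C", "payer": "aetna", "service_date": date(2025, 2, 20)},
]
urgent = filing_worklist(queue, limits, today=date(2025, 6, 1))
```

Scheduling a scan like this once a day and routing the result to a supervisor queue gives the escalation step a concrete owner.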
Building the Pre-Submission Validation Pipeline
Individual rules are valuable, but the real impact comes from running them as a sequential validation pipeline. Here's the orchestration architecture:
from dataclasses import dataclass, field
from typing import List
from datetime import datetime, timezone

@dataclass
class ValidationResult:
    rule_id: int
    rule_name: str
    category: str
    passed: bool
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    auto_corrected: bool = False
    correction_detail: str = ""

class ClaimValidationPipeline:
    """Pre-submission validation pipeline: runs all 25 rules sequentially."""

    def __init__(self, config: dict = None):
        self.config = config or {}
        self.results: List[ValidationResult] = []

    def validate_claim(self, claim: dict) -> dict:
        """Run the full 25-rule validation pipeline."""
        self.results = []
        # Stage 1: Patient Demographics (Rules 1-5)
        self._run_rule(1, "Patient Name Format", "demographics",
            lambda: validate_patient_name(
                claim.get('patient_first_name', ''),
                claim.get('patient_last_name', '')
            ))
        self._run_rule(2, "DOB Validation", "demographics",
            lambda: validate_dob(
                claim.get('patient_dob', ''),
                claim.get('service_date')
            ))
        self._run_rule(3, "Address Standardization", "demographics",
            lambda: validate_address(
                claim.get('street', ''),
                claim.get('city', ''),
                claim.get('state', ''),
                claim.get('zip_code', '')
            ))
        # ... Rules 4-5: SSN/MRN and Insurance ID (shown above)
        # Stage 2: Provider (Rules 6-9)
        self._run_rule(6, "NPI Luhn Validation", "provider",
            lambda: validate_npi(claim.get('billing_npi', '')))
        # ... Rules 7-9
        # Stage 3-6: Insurance, Clinical, Procedure, Financial
        # (same pattern for all remaining rules)
        # Compile results
        total = len(self.results)
        passed = sum(1 for r in self.results if r.passed)
        failed = total - passed
        return {
            "clean_claim": failed == 0,
            "total_rules": total,
            "passed": passed,
            "failed": failed,
            "errors": [r for r in self.results if not r.passed],
            "warnings": [r for r in self.results if r.warnings],
            "auto_corrections": [r for r in self.results if r.auto_corrected],
            # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }

    def _run_rule(self, rule_id, name, category, validation_fn):
        """Execute a single validation rule and record the result."""
        try:
            result = validation_fn()
            self.results.append(ValidationResult(
                rule_id=rule_id,
                rule_name=name,
                category=category,
                passed=result.get('valid', False),
                errors=result.get('errors', []),
                warnings=result.get('warnings', []),
            ))
        except Exception as e:
            # A crashing rule is recorded as a failure so it cannot silently pass a claim
            self.results.append(ValidationResult(
                rule_id=rule_id,
                rule_name=name,
                category=category,
                passed=False,
                errors=[f"Rule execution error: {str(e)}"],
            ))
Integration Points: Where These Rules Fit in Your Stack
These 25 rules should be integrated at three checkpoints in your revenue cycle:
- Point of Service (Real-time): Rules 1-5 (demographics) and 10-13 (insurance) run during patient check-in. Catch eligibility and data quality issues before the encounter happens.
- Charge Capture (Post-encounter): Rules 14-21 (clinical and procedure) run when the provider submits charges. Flag coding issues while the encounter is fresh and documentation is accessible.
- Pre-Submission (Batch): All 25 rules run on the complete claim immediately before clearinghouse submission. This is your last line of defense.
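The three checkpoints above can be captured as a simple lookup from checkpoint name to the rule IDs that run there. The checkpoint keys and the `rules_for_checkpoint` helper are hypothetical names; the rule ranges are taken from the list above.

```python
# Rule IDs per checkpoint, following the three checkpoints described above
CHECKPOINT_RULES = {
    "point_of_service": list(range(1, 6)) + list(range(10, 14)),  # Rules 1-5, 10-13
    "charge_capture": list(range(14, 22)),                         # Rules 14-21
    "pre_submission": list(range(1, 26)),                          # All 25 rules
}


def rules_for_checkpoint(checkpoint: str) -> list:
    """Return the rule IDs that should run at the given checkpoint."""
    try:
        return CHECKPOINT_RULES[checkpoint]
    except KeyError:
        raise ValueError(f"Unknown checkpoint: '{checkpoint}'") from None
```

A pipeline could accept one of these names and skip any rule whose ID is not in the returned list, so the same rule functions serve all three stages.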
For the technical architecture of how these rules integrate with EDI transactions, payer APIs, and clearinghouse workflows, see our complete guide to US payer integration and EDI healthcare transactions.
Measuring Success: KPIs After Implementation
After deploying these 25 rules, track these four KPIs weekly:
| KPI | Before Validation | Target After 90 Days | How to Measure |
|---|---|---|---|
| Clean Claim Rate | <90% | 95-98% | Claims accepted on first submission / total claims |
| First-Pass Denial Rate | 8-15% | 3-5% | Claims denied on first submission / total claims |
| Days in A/R | 45-55 | 28-35 | Average days from service to payment |
| Validation Rule Hit Rate | N/A | Track per rule | Which rules catch the most errors — focus improvement there |
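Three of these KPIs can be computed directly from claim outcomes, as in the sketch below. It assumes each claim record carries a `first_pass_status` of `"accepted"` or `"denied"` and a `failed_rules` list of rule IDs; both field names and the `claim_kpis` helper are hypothetical. Days in A/R is omitted because it needs payment dates.

```python
from collections import Counter


def claim_kpis(claims: list) -> dict:
    """Compute clean claim rate, first-pass denial rate, and per-rule hit counts."""
    total = len(claims)
    if total == 0:
        # Avoid dividing by zero on an empty reporting period
        return {"clean_claim_rate": None, "first_pass_denial_rate": None, "rule_hits": {}}
    accepted = sum(1 for c in claims if c["first_pass_status"] == "accepted")
    denied = sum(1 for c in claims if c["first_pass_status"] == "denied")
    # Count how often each validation rule fired across all claims
    rule_hits = Counter(rule_id for c in claims for rule_id in c.get("failed_rules", []))
    return {
        "clean_claim_rate": accepted / total,
        "first_pass_denial_rate": denied / total,
        "rule_hits": dict(rule_hits),
    }


# Sample data for illustration only
sample = [
    {"first_pass_status": "accepted", "failed_rules": []},
    {"first_pass_status": "accepted", "failed_rules": [14]},
    {"first_pass_status": "denied", "failed_rules": [14, 25]},
    {"first_pass_status": "accepted"},
]
kpis = claim_kpis(sample)
```

Sorting `rule_hits` by count each week shows which rules catch the most errors, which is where upstream process fixes pay off first.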



