Fonteum Research · Dataset Reference · CMS NPPES · HIPAA 45 CFR 162

NPPES Anatomy: Complete Technical Reference for AI Systems

Every field, every type, every limitation. For RAG, MCP, and healthcare data engineering.

Reviewed by Dr. Jennifer Montecillo, MD, non-practicing medical reviewer · Last updated: 2026-05-30 · NPPES snapshot: 2026-05-01 · Methodology: nppes-anatomy/v1

Contents

  1. TL;DR — what you need to know in 60 seconds
  2. What NPPES is — and what it is not
  3. Type 1 vs. Type 2 NPIs
  4. Field-by-field reference
  5. Taxonomy codes (NUCC)
  6. Refresh cadence and snapshot reality
  7. Joining NPPES with other federal sources
  8. Common AI-system mistakes when using NPPES
  9. Provenance and how to cite NPPES properly
  10. Limitations
  11. What Fonteum adds on top
  12. FAQ
  13. Cite this reference

TL;DR

  • What it is: NPPES (National Plan and Provider Enumeration System) is the federal registry of National Provider Identifiers (NPIs) — 10-digit IDs assigned to every healthcare provider under HIPAA. [1]
  • Scale: approximately 8.9 million total records, of which roughly 7.2 million are active as of the May 2026 snapshot. [2]
  • Two types: NPI-1 is for individual providers (physicians, nurses, PTs); NPI-2 is for organizations (hospitals, group practices, labs).
  • What it does NOT cover: license status, board certification, malpractice history, quality scores, current practice location, or active practice status. These are common AI-system errors.
  • Cadence: CMS releases a full replacement file monthly and a weekly delta file. The NPPES API reflects near-real-time state. [3]
  • Why AI systems get it wrong: treating credential text as board-certification evidence, treating license numbers as active-license evidence, and treating practice location as current location are the three most common failure modes.
  • Proper citation: cite the CMS NPPES download page with the specific file release date and access date. Federal public domain (U.S. Government Works); no license agreement required. [4]
  • Downloads: see the field reference, taxonomy exemplar, and refresh cadence JSON files linked throughout this document.

What NPPES is — and what it is not

The federal mandate

The National Plan and Provider Enumeration System (NPPES) is the federal system maintained by the Centers for Medicare and Medicaid Services (CMS) under authority granted by HIPAA Administrative Simplification. [5] Specifically, 45 CFR Part 162, Subpart D establishes the requirement for a standard unique health identifier for healthcare providers. [6] The final rule was published in January 2004; providers had until May 23, 2007 to obtain an NPI and begin using it in all standard HIPAA transactions. [7]

The NPI replaced several prior identifier systems: the Unique Physician Identification Number (UPIN), the Medicare Provider Identification Number (PIN), the Online Survey Certification and Reporting (OSCAR) number, the National Supplier Clearinghouse (NSC) number, and others. The consolidation was explicitly intended to create a single, non-intelligence-bearing identifier — meaning the NPI digit string encodes no information about the provider's location, specialty, or type beyond the Luhn check digit. [8]
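The check digit mentioned above can be verified locally. The following is a minimal sketch, assuming the standard Luhn algorithm applied to the nine base digits with the '80840' prefix described in the field reference; the function names are illustrative and not part of any CMS library.

```python
def npi_check_digit(first_nine: str) -> int:
    """Compute the Luhn check digit for a 9-digit NPI base using the 80840 prefix."""
    if len(first_nine) != 9 or not first_nine.isdigit():
        raise ValueError("expected exactly 9 digits")
    payload = "80840" + first_nine
    total = 0
    # Walk right to left. The rightmost payload digit is doubled first,
    # because the check digit will sit in the position to its right.
    for offset, ch in enumerate(reversed(payload)):
        digit = int(ch)
        if offset % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid_npi(npi: str) -> bool:
    """True when the value is 10 digits and its last digit matches the check digit."""
    if len(npi) != 10 or not npi.isdigit():
        return False
    return npi_check_digit(npi[:9]) == int(npi[9])
```

A passing check only shows that the digit string is well formed. It says nothing about whether the NPI was ever assigned, so a lookup against the NPPES file or API is still required.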

What NPPES asserts

NPPES asserts one thing and one thing only: that a given NPI was assigned to a given legal entity (individual or organization) at a specific point in time. Every other interpretation — quality, license status, board certification, current location, active practice — requires a secondary source.

The data in NPPES is self-reported. When a provider enrolls, they supply their own name, address, taxonomy code, and credential text. CMS does not audit the accuracy of these fields at the time of submission or at subsequent updates. CMS validates only the NPI uniqueness and the Luhn check digit. [9]

What NPPES does not assert — and common misreadings

Critical for AI systems and RAG builders: The following inferences are NOT supported by NPPES data, regardless of what the fields contain:
| Inference | Why it fails | Correct source |
| --- | --- | --- |
| Provider holds an active license | License numbers in NPPES are self-reported at enrollment and not re-validated. A license number may be expired, suspended, or surrendered. | State licensing board (jurisdiction-specific) |
| Provider is board-certified | Credential Text (e.g., 'MD', 'FAAD') is free-form text. It is not validated against any credentialing body. | ABMS, AOA, specialty board (not available in any federal public file) |
| Practice Location is current | Practice Location Address reflects the registered address at enrollment or last update. Providers frequently move without updating NPPES. | CMS PECOS (Medicare enrollment), Care Compare (facilities) |
| Provider is actively practicing | No field in NPPES asserts active clinical practice. Last Update Date reflects the last administrative change, not the last date of clinical service. | CMS PECOS active enrollment status |
| Provider has no malpractice history | NPPES contains no malpractice, disciplinary, or adverse action data. | State medical board orders (jurisdiction-specific) |
| Provider quality or performance | NPPES is a directory, not a performance registry. | CMS QPP MIPS, Care Compare, LEAPFROG |
| Deactivated provider is not practicing | CMS deactivation can lag actual cessation of practice by weeks to months. | CMS PECOS, state board orders |

The identity backbone

Despite these limitations, NPPES is the most important single dataset in U.S. healthcare provider data because the NPI is the universal join key across every major federal source family. [10] CMS PECOS uses NPI. OIG LEIE uses NPI (for post-2013 exclusions). CMS QPP MIPS uses NPI. CMS Care Compare uses CCN as the primary key for facilities but cross-references NPI for individual practitioners. The NPPES file is the starting point for any multi-source provider data join.

Type 1 vs. Type 2 NPIs

NPIs come in two structurally distinct types. [11] The Entity Type Code field (value: 1 or 2) distinguishes them. Mixing the two in aggregations without filtering is one of the most common NPPES analysis errors.

| Dimension | NPI-1 (Individual) | NPI-2 (Organization) |
| --- | --- | --- |
| Entity Type Code | 1 | 2 |
| Who gets it | Individual human providers: physicians, NPs, PAs, nurses, therapists, etc. | Organizations: hospitals, group practices, home health agencies, labs, pharmacies, etc. |
| Name fields | Last Name, First Name, Middle Name, Prefix, Suffix, Credential Text | Provider Organization Name (Legal Business Name) |
| Gender Code | Populated (M/F) | Null (not applicable) |
| Authorized Official | Not applicable | Authorized Official Last/First/Middle Name, Title, Telephone |
| Is Sole Proprietor | May be populated (X=yes) | Not applicable |
| Is Organization Subpart | Not applicable | May be populated (X=yes) |
| Parent Organization fields | Not applicable | Parent Organization LBN, Parent Organization TIN (redacted) |
| Primary use in joins | Individual clinician identity backbone | Facility/group identity backbone; joins to Care Compare CCN via CMS POS |
| Count (approx.) | ~6.5M active as of May 2026 | ~700K active as of May 2026 |

When a provider has both types

A solo practitioner who operates as their own practice may hold both an NPI-1 (as the individual) and an NPI-2 (as the sole proprietor organization). The NPI-1 record will show Is Sole Proprietor = X. These are distinct records with distinct NPIs. Do not deduplicate them: they serve different billing contexts. The NPI-2 in this case typically shares the same address and taxonomy code as the NPI-1.

Larger organizations (hospitals, health systems) hold NPI-2 records and may have subordinate NPI-2 records for departments that bill independently. Subordinate records set Is Organization Subpart = X and name the parent in Parent Organization LBN. Note that Parent Organization TIN is redacted in the public file.

Field-by-field reference

The NPPES full replacement CSV contains approximately 330 columns. The core identity and contact fields are documented in detail below. The 15-slot taxonomy group and the 50-slot Other Provider Identifier group follow a repeating column pattern; they are summarized once with the slot range noted.

↓ Full field-reference.json (JSON, ~28 KB)
| Field | Type | Nullable | Example | Notes & AI pitfalls |
| --- | --- | --- | --- | --- |
| NPI | varchar(10) | No | 1234567893 | 10-digit. Position 10 is a Luhn check digit computed on positions 1–9 with the '80840' prefix. Non-intelligence-bearing: do not parse sub-fields. |
| Entity Type Code | char(1) | No | 1 | 1=Individual, 2=Organization. Always filter explicitly; mixing types skews every specialty-level analysis. |
| Replacement NPI | varchar(10) | Yes | 1098765432 | Hard redirect, not a soft alias. The original NPI is defunct. Rare field, only populated during CMS legacy-system migrations. |
| EIN | varchar(9) | Yes | (redacted) | ALWAYS blank in the public dissemination file. Cannot be used for linkage. Use NPI-2 as the org key. |
| Provider Organization Name | varchar(70) | Yes | NORTHSIDE RADIOLOGY ASSOC PC | NPI-2 only. Self-reported free text, not normalized. Same org may appear with multiple spellings across records. |
| Provider Last Name | varchar(35) | Yes | JOHNSON | NPI-1 only. Use with First Name + NPI for identity. Last name alone is insufficient for disambiguation. |
| Provider First Name | varchar(20) | Yes | EMILY | NPI-1 only. |
| Provider Middle Name | varchar(20) | Yes | GRACE | Highly inconsistent: full name, initial, or blank. Do not use as a join key. |
| Provider Credential Text | varchar(20) | Yes | MD | CRITICAL: Free-form self-reported text. Does NOT attest board certification, active license, or any credentialing outcome. Values range from 'MD' to 'MD PhD' to 'FAAD' to 'Dr.' to multi-designation strings. |
| Provider Other Organization Name | varchar(70) | Yes | CITY RADIOLOGY GROUP | DBA or former name for NPI-2. Other Name Type Code: 3=Doing Business As, 4=Former Legal Business Name, 5=Other Name. |
| Provider First Line Business Mailing Address | varchar(55) | Yes | 123 MAIN ST | Correspondence address, frequently a PO Box or billing service. Do not use as patient-care location proxy. |
| Provider Business Mailing Address City | varchar(40) | Yes | CHICAGO | The paired State field holds a two-character USPS state code. |
| Provider Business Mailing Address Postal Code | varchar(20) | Yes | 606010001 | ZIP+4 without hyphen (9-digit) or ZIP-only (5-digit). Normalize before geospatial joins. |
| Provider Business Mailing Address Telephone | varchar(20) | Yes | 3125551234 | Digits only, no formatting. May be years out of date. Not a current contact channel. |
| Provider First Line Business Practice Location Address | varchar(55) | Yes | 456 HOSPITAL DR | CRITICAL: Registered practice location at enrollment/last update, NOT current location. Providers move frequently without updating. Cross-reference with Last Update Date and CMS PECOS. |
| Provider Business Practice Location Address City | varchar(40) | Yes | EVANSTON | See Practice Location Address caveats above. |
| Provider Business Practice Location Address Postal Code | varchar(20) | Yes | 60201 | ZIP or ZIP+4. Normalize before geospatial joins. |
| Provider Enumeration Date | date MM/DD/YYYY | No | 05/14/2007 | Date NPI was assigned. Stable; never changes. Does NOT reflect when the provider began practicing. |
| Last Update Date | date MM/DD/YYYY | No | 03/11/2024 | CRITICAL: Date of last administrative change in NPPES. Do NOT use as active-practice proxy. Many active providers have 2007–2010 update dates; others updated recently while no longer practicing. |
| NPI Deactivation Reason Code | varchar(2) | Yes | DT | DT=Death, DA=Disbandment, FR=Fraud, OT=Other. Non-null means the NPI is defunct. |
| NPI Deactivation Date | date MM/DD/YYYY | Yes | 09/01/2022 | Deactivation may lag actual cessation of practice by weeks to months. A provider who stopped billing in January may not be deactivated until March. |
| NPI Reactivation Date | date MM/DD/YYYY | Yes | | Rare. Populated when a previously deactivated NPI was reactivated. |
| Provider Gender Code | char(1) | Yes | F | M=Male, F=Female. NPI-1 only. Missing for many records where the provider left this blank at enrollment. |
| Authorized Official fields (5 fields) | varchar | Yes | Last Name, First Name, Middle Name, Title, Telephone | NPI-2 only. Identifies the person authorized to submit enrollment changes on behalf of the organization. |
| Healthcare Provider Taxonomy Code_1 through _15 | varchar(10) × 15 | Yes | 207N00000X | Up to 15 NUCC taxonomy codes. Most providers have 1–2 populated. Unnest all 15 when building specialty counts. The Primary Taxonomy Switch column (below) identifies the declared primary. |
| Provider License Number_1 through _15 | varchar(20) × 15 | Yes | MD098765 | CRITICAL: Presence of a license number does NOT assert active licensure. NPPES accepts the number at enrollment without validating with state boards. Active/inactive status requires the relevant state licensing authority. |
| Provider License Number State Code_1 through _15 | varchar(2) × 15 | Yes | IL | Two-character USPS state code paired with the license number in the same slot. |
| Healthcare Provider Primary Taxonomy Switch_1 through _15 | char(1) × 15 | Yes | Y | Y=primary taxonomy for this slot. Prefer the first slot marked Y when multiple slots have Y (data entry inconsistency). |
| Is Sole Proprietor | char(1) | Yes | X | X=Yes. NPI-1 only. An NPI-1 who is a sole proprietor may also hold an NPI-2 for their practice entity. |
| Is Organization Subpart | char(1) | Yes | X | X=Yes. NPI-2 only. Use Parent Organization LBN to identify the parent. Parent Organization TIN is redacted. |
| Other Provider Identifier_1 through _50 | varchar(20) × 50 | Yes | G12345 | Legacy or alternative identifiers: UPIN, Medicare legacy, Medicaid, NCPDP, state license (type code 08, which duplicates the taxonomy-slot license field). Each slot has paired Type Code, State, and Issuer columns. |
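Two of the normalization steps noted in the table (postal codes and MM/DD/YYYY dates) can be sketched as follows. This is a minimal illustration, assuming US addresses only and that a blank string represents a null field; the helper names are hypothetical.

```python
from datetime import date, datetime
from typing import Optional


def normalize_zip5(postal_code: str) -> Optional[str]:
    """Reduce a US ZIP or ZIP+4 value to its 5-digit ZIP, or None if unusable."""
    digits = "".join(ch for ch in postal_code if ch.isdigit())
    if len(digits) in (5, 9):
        return digits[:5]
    return None  # foreign or malformed value: leave it out of geospatial joins


def parse_nppes_date(value: str) -> Optional[date]:
    """Parse an MM/DD/YYYY field; a blank value is treated as null."""
    value = value.strip()
    if not value:
        return None
    return datetime.strptime(value, "%m/%d/%Y").date()
```

Keeping postal codes as strings end to end avoids losing leading zeros (for example in New England ZIP codes), which is why the sketch never converts them to integers.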

Taxonomy codes (NUCC)

The taxonomy codes in NPPES come from the NUCC Health Care Provider Taxonomy Code Set, maintained by the National Uniform Claim Committee (NUCC). [12] NUCC releases updated code sets twice per year: January 1 and July 1. Each release may add, revise, or retire codes. [13]

Code structure

Each NUCC code is a 10-character alphanumeric string. The structure is hierarchical:

| Level | Example | Description |
| --- | --- | --- |
| Section (top-level grouping) | Allopathic & Osteopathic Physicians | The broadest category. Also includes Behavioral Health, Chiropractic, Dental, Nursing, etc. |
| Grouping | Allopathic & Osteopathic Physicians → Dermatology | Second-level: the specialty family. |
| Classification | Dermatology | The specific specialty within the grouping. |
| Specialization (optional) | MOHS-Micrographic Surgery | Sub-specialty within the classification. Not all codes have a specialization. |
| Code | 207ND0101X | The 10-character alphanumeric identifier. The X suffix is standard across all codes. |

Why one specialty maps to multiple codes

The taxonomy code system is fine-grained. A "dermatologist" in plain language may hold any of these codes:

207N00000X  -- Dermatology (general)
207ND0101X  -- Dermatology; MOHS-Micrographic Surgery
207ND0900X  -- Dermatology; Dermatopathology
207NI0002X  -- Dermatology; Clinical & Laboratory Dermatological Immunology
207NP0225X  -- Dermatology; Pediatric Dermatology
207NS0135X  -- Dermatology; Procedural Dermatology

When building specialty-level aggregations, you must decide whether to aggregate at the Classification level (all 207N* codes), the Specialization level, or an exact-code level. The choice significantly affects reported counts. [14]
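The effect of the aggregation level can be seen on a small sample. The sketch below assumes, as the "all 207N* codes" rule above implies, that the first four characters identify the Classification; the sample list is synthetic.

```python
from collections import Counter

# Synthetic sample of taxonomy codes, one per provider.
codes = ["207N00000X", "207ND0101X", "207ND0900X", "207N00000X", "207NP0225X"]


def classification_prefix(code: str) -> str:
    """First four characters, used here as the Classification-level key."""
    return code[:4]


by_exact_code = Counter(codes)
by_classification = Counter(classification_prefix(c) for c in codes)

print(by_exact_code["207N00000X"])   # providers with the general dermatology code only
print(by_classification["207N"])     # providers with any dermatology code
```

On this sample the exact-code count for general dermatology is 2 while the Classification-level count is 5, which is the kind of gap that should be stated alongside any published specialty count.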

Common join patterns

Fonteum uses taxonomy codes to build specialty-level aggregation pages (e.g., dermatologist supply by state, chiropractor supply by state). The pattern:

-- Specialty count by state (dermatology example)
SELECT
  provider_business_practice_location_address_state_name AS state,
  COUNT(DISTINCT npi) AS provider_count
FROM nppes_providers
WHERE entity_type_code = '1'
  AND npi_deactivation_date IS NULL
  AND healthcare_provider_taxonomy_code_1 LIKE '207N%'
  -- OR: any of the 15 taxonomy slots contains a derm code
GROUP BY state
ORDER BY provider_count DESC;

Note: this example uses the primary taxonomy slot only. For completeness, unnest all 15 slots and filter to rows where any slot contains a derm code.
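For pipelines that process the CSV row by row rather than in SQL, the unnesting step might look like the sketch below. It assumes each record is a dict keyed by lower-case snake_case column names matching the SQL example above; that naming is a convention of this illustration, not of the raw CMS header.

```python
from typing import Dict, Iterator

TAXONOMY_SLOTS = range(1, 16)  # slots _1 through _15


def taxonomy_codes(row: Dict[str, str]) -> Iterator[str]:
    """Yield every non-blank taxonomy code across the 15 slots of one record."""
    for slot in TAXONOMY_SLOTS:
        code = (row.get(f"healthcare_provider_taxonomy_code_{slot}") or "").strip()
        if code:
            yield code


def has_taxonomy_prefix(row: Dict[str, str], prefix: str) -> bool:
    """True when any populated slot starts with the given prefix."""
    return any(code.startswith(prefix) for code in taxonomy_codes(row))


# Synthetic record: family medicine in slot 1, dermatology in slot 2.
example = {
    "healthcare_provider_taxonomy_code_1": "207Q00000X",
    "healthcare_provider_taxonomy_code_2": "207N00000X",
}
print(has_taxonomy_prefix(example, "207N"))
```

A primary-slot-only filter would miss the synthetic record above, while the any-slot filter includes it. Which behavior is wanted depends on whether the count is meant to reflect declared primary specialty or any declared specialty.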

↓ taxonomy-codes.json (JSON, ~22 KB)

Refresh cadence and snapshot reality

Understanding what "current" means for NPPES data is critical for any system that infers provider status. [15]

The three CMS data surfaces

| Surface | Frequency | Lag vs. live state | Use case |
| --- | --- | --- | --- |
| NPPES Full Replacement File | Monthly (2nd Monday) | Up to 30 days | Bulk ingestion, analytics, research snapshots |
| NPPES Weekly Update File | Weekly (Monday) | Up to 7 days | Incremental update pipelines, deactivation monitoring |
| NPPES API (npiregistry.cms.hhs.gov/api/) | Near-real-time | Minutes | Single-record lookups; no bulk export support |
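For the single-record lookup surface, a request URL can be assembled as sketched below. This assumes the `version` and `number` query parameters of the public NPI Registry API and version 2.1; check the current CMS API documentation before relying on either. The sketch only builds the URL and performs no network call.

```python
from urllib.parse import urlencode

API_BASE = "https://npiregistry.cms.hhs.gov/api/"


def build_lookup_url(npi: str, version: str = "2.1") -> str:
    """Build a single-record lookup URL for the NPI Registry API."""
    if len(npi) != 10 or not npi.isdigit():
        raise ValueError("NPI must be exactly 10 digits")
    return API_BASE + "?" + urlencode({"version": version, "number": npi})


print(build_lookup_url("1234567893"))
```

Because the API has no bulk export, a loop over millions of NPIs is the wrong tool; the monthly and weekly files are the intended path for bulk work.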

The deactivation lag problem

The most consequential lag is in deactivation processing. When a provider dies, an organization disbands, or a provider voluntarily surrenders their NPI, the deactivation must be filed with CMS. CMS then processes the deactivation administratively. This processing can take days, weeks, or months after the real-world event. The deactivation date in the NPPES file reflects when CMS processed the deactivation — not when the provider stopped practicing.

Practical implication for AI systems: A provider who stopped clinical practice in January 2025 may not appear with a deactivation date until March or April 2025. Any system that treats "no deactivation date = currently practicing" is reading data that can be 90+ days stale for newly inactive providers.

What "current" means for Fonteum's snapshot

Fonteum ingests the NPPES monthly full replacement file. Each snapshot is dated to the CMS file release date and Ed25519-signed. The signed attestation is published in the Fonteum chain at /chain. The snapshot date is published per-field via the Fonteum provenance API.

Fonteum targets a maximum of 35 days between NPPES snapshot and publication, following the CMS monthly release cycle. Fonteum does not currently ingest the weekly delta files — organizations requiring weekly deactivation tracking should query the NPPES API directly.

↓ refresh-cadence.json (JSON, ~4 KB)

Joining NPPES with other federal sources

The NPI is the universal join key across federal healthcare provider data. Each join below is documented with the join key, expected match rate, common failure mode, and what the joined record asserts and does not assert.

NPPES ↔ OIG LEIE (exclusion records)

The OIG List of Excluded Individuals and Entities (LEIE) is the federal registry of providers barred from participating in Medicare, Medicaid, and other federal programs. [16] For exclusions processed after approximately 2013, the LEIE includes the NPI as an identifier. Earlier records rely on name, date of birth, and state.

-- NPPES ↔ OIG LEIE join (NPI-keyed)
SELECT
  n.npi,
  n.provider_last_name,
  n.provider_first_name,
  l.excl_date,
  l.excl_type,
  l.reinstate_date
FROM nppes_providers n
INNER JOIN oig_leie_exclusions l
  ON n.npi = l.npi
WHERE n.npi_deactivation_date IS NULL;  -- active NPIs only

-- Upper bound on match rate: ~0.8% (68,055 exclusions / 8.9M providers);
--   the realized rate is lower because pre-2013 exclusions carry no NPI.
-- Failure mode: exclusions pre-2013 have no NPI in LEIE;
--   use name+DOB+state fuzzy match for those records.
What this join asserts: the provider was added to the OIG exclusion list on the specified date, for the specified reason. What it does not assert: the provider is currently excluded (check Reinstate Date), or that all billing by this NPI was fraudulent (exclusion reasons vary widely).
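The Reinstate Date check described above can be expressed as a small predicate. This is a sketch that assumes both dates have already been parsed into `date` objects and that a missing reinstatement is represented as `None`; how the raw LEIE file encodes a missing date is not covered here.

```python
from datetime import date
from typing import Optional


def is_currently_excluded(
    excl_date: date, reinstate_date: Optional[date], as_of: date
) -> bool:
    """True when the exclusion has started and no reinstatement has taken effect by as_of."""
    if excl_date > as_of:
        return False
    return reinstate_date is None or reinstate_date > as_of
```

Passing `as_of` explicitly, rather than calling `date.today()` inside the function, keeps the result reproducible against a dated snapshot.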

NPPES ↔ CMS PECOS (Medicare enrollment)

CMS PECOS (Provider Enrollment, Chain, and Ownership System) tracks Medicare enrollment status. The PECOS Provider Enrollment File (PPEF) is the public-facing extract. [17] Not all NPPES providers are enrolled in Medicare — a provider may hold an NPI without billing Medicare (e.g., pediatric providers, concierge practices, out-of-network only).

-- NPPES ↔ PECOS join
SELECT
  n.npi,
  n.provider_business_practice_location_address_state_name AS nppes_state,
  p.provider_state_code AS pecos_state,
  p.provider_type,
  p.pecos_assgn_ind  -- accepts Medicare assignment?
FROM nppes_providers n
LEFT JOIN pecos_ppef p
  ON n.npi = p.npi
WHERE n.entity_type_code = '1'
  AND n.npi_deactivation_date IS NULL;

-- Expected match rate: ~55-60% of active NPI-1s appear in PECOS.
-- Failure mode: address mismatch between NPPES and PECOS is common
--   (NPPES practice location may differ from PECOS enrollment address).
What this join adds: Medicare enrollment status, whether the provider accepts Medicare assignment, the provider type as classified by Medicare. What it does not add: active practice status, current location, or quality information.

NPPES ↔ CMS QPP MIPS (quality scores)

The CMS Quality Payment Program (QPP) Merit-based Incentive Payment System (MIPS) publishes annual performance scores for individual clinicians and group practices. [18] MIPS scores are NPI-keyed for individual clinicians and TIN-keyed for group-level scores.

-- NPPES ↔ QPP MIPS join (individual clinician)
SELECT
  n.npi,
  n.provider_last_name,
  n.provider_first_name,
  m.final_score,
  m.payment_year
FROM nppes_providers n
INNER JOIN cms_qpp_mips_individual m
  ON n.npi = m.npi
WHERE n.entity_type_code = '1'
  AND n.npi_deactivation_date IS NULL
  AND m.payment_year = 2023;

-- Expected matches: ~477K clinicians scored in PY2023 MIPS.
-- Caveats: MIPS only covers Medicare Part B eligible providers;
--   excludes providers below the low-volume threshold (~$90K Medicare
--   revenue OR fewer than 200 Medicare patients). Many active NPI-1s
--   will not have MIPS scores.
What this join adds: a federal quality signal — the MIPS composite score for the payment year, including Quality, Promoting Interoperability, Improvement Activities, and Cost component scores. Important caveat: MIPS score is a Medicare-billing performance metric, not a clinical quality assessment. A high MIPS score reflects compliance with reporting requirements, not clinical outcomes per se. [19]

NPPES ↔ CMS Provider of Services (POS) file

The CMS Provider of Services (POS) file is the CCN (CMS Certification Number) backbone — it enumerates certified facilities (hospitals, nursing homes, dialysis centers, home health agencies, etc.) with their NPI-2 identifiers. [20] The POS ↔ NPPES join resolves NPI-2 to CCN, enabling joins from NPPES to the Care Compare facility datasets.

-- NPPES NPI-2 → CCN via CMS POS
SELECT
  n.npi,
  n.provider_organization_name_legal_business_name,
  p.ccn,
  p.facility_type_desc,
  p.state_cd
FROM nppes_providers n
INNER JOIN cms_pos_facilities p
  ON n.npi = p.npi
WHERE n.entity_type_code = '2'
  AND n.npi_deactivation_date IS NULL;

-- Expected matches: bounded by the ~68,211 CCN-keyed facilities in the POS file.
-- Not all NPI-2s appear in POS; POS covers only CMS-certified facilities.

NPPES ↔ Care Compare (facility quality)

CMS Care Compare datasets (nursing homes, home health, hospice, dialysis, ASCs, hospitals) are keyed on CCN, not NPI. [21] The join path is: NPPES NPI-2 → CMS POS (NPI→CCN) → Care Compare (CCN-keyed quality data). This three-table join is the standard pattern for attaching facility quality signals to NPI-2 records.

-- Three-table join: NPPES → POS → Care Compare Nursing Homes
SELECT
  n.npi,
  n.provider_organization_name_legal_business_name,
  p.ccn,
  nh.overall_rating,
  nh.staffing_rating,
  nh.health_inspection_rating
FROM nppes_providers n
INNER JOIN cms_pos_facilities p ON n.npi = p.npi
INNER JOIN cms_care_compare_nh nh ON p.ccn = nh.federal_provider_number
WHERE n.entity_type_code = '2'
  AND n.npi_deactivation_date IS NULL;
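The three-table join path can be exercised end to end on synthetic rows before pointing it at real extracts. The sketch below uses Python's built-in `sqlite3` with the table and column names from the query above; every NPI, CCN, and rating in it is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nppes_providers (npi TEXT, entity_type_code TEXT, npi_deactivation_date TEXT);
CREATE TABLE cms_pos_facilities (npi TEXT, ccn TEXT);
CREATE TABLE cms_care_compare_nh (federal_provider_number TEXT, overall_rating INTEGER);

-- Synthetic rows: one active org, one deactivated org, one individual.
INSERT INTO nppes_providers VALUES
  ('1111111111', '2', NULL),
  ('2222222222', '2', '09/01/2022'),
  ('3333333333', '1', NULL);
INSERT INTO cms_pos_facilities VALUES ('1111111111', '015000'), ('2222222222', '015001');
INSERT INTO cms_care_compare_nh VALUES ('015000', 4), ('015001', 2);
""")

rows = conn.execute("""
SELECT n.npi, p.ccn, nh.overall_rating
FROM nppes_providers n
INNER JOIN cms_pos_facilities p ON n.npi = p.npi
INNER JOIN cms_care_compare_nh nh ON p.ccn = nh.federal_provider_number
WHERE n.entity_type_code = '2'
  AND n.npi_deactivation_date IS NULL
""").fetchall()

print(rows)
```

Only the active organization survives both joins and the filter. Storing NPI and CCN as TEXT is a deliberate choice in this sketch: an integer column would silently drop leading zeros and break the CCN join.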

Common AI-system mistakes when using NPPES

AI systems — including RAG pipelines, MCP tool implementations, and LLM-powered research agents — consistently make the same errors when working with NPPES data. These errors range from harmless inaccuracies to YMYL-class claims that misrepresent provider credentials or status. The following catalog is drawn from production observations. [22]

Treating Credential Text as board-certification evidence

Critical (YMYL)

Provider Credential Text is free-form text typed by the provider at enrollment. 'FAAD' in this field means the provider typed 'FAAD' — it does not mean Fonteum or CMS has confirmed fellowship in the American Academy of Dermatology. The field is unvalidated. A system that renders 'Board-certified dermatologist — confirmed by NPPES' is making a false claim.

Fix: Render credential text with a clear limitation disclaimer: 'Self-reported credential: FAAD.' Never present it as independently attested. For board certification claims, there is no federal public file that provides this — do not surface it without a primary-source board attestation.

Treating license numbers as active-license evidence

Critical (YMYL)

The Provider License Number fields contain the number the provider supplied at enrollment. CMS does not validate against state licensing boards at any point. A license that was revoked in 2021 may still appear in a 2026 NPPES record if the provider never updated their record. A system that renders 'Licensed in Illinois — IL MD098765' is asserting a status it cannot support.

Fix: Render license numbers as: 'License number on file: IL MD098765 (not independently validated — check the Illinois Department of Financial and Professional Regulation for current status).' Never assert 'licensed' or 'active license' from this field alone.
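A rendering helper along these lines keeps the disclaimer attached to the value. It is a sketch only: the wording is adapted from the fix above, and the function does not look up the name of the state board, which would need a separately maintained mapping.

```python
def render_license(state: str, number: str) -> str:
    """Format a self-reported license number without asserting licensure status."""
    state = state.strip().upper()
    number = number.strip()
    if not state or not number:
        return "No license number on file in NPPES."
    return (
        f"License number on file: {state} {number} "
        "(self-reported, not independently validated; "
        f"check the {state} licensing board for current status)."
    )


print(render_license("IL", "MD098765"))
```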

Treating practice location as current location

High

The Provider Business Practice Location Address reflects the registered location at enrollment or last update. Providers move, retire, or join new practices without updating NPPES. A provider enumerated in 2008 with a Last Update Date of 2010 may have a practice location that is 16 years out of date. Rendering this as 'current location' is misleading.

Fix: Always display the Last Update Date alongside any practice location. Add a recency flag (e.g., 'Location last confirmed 2010 — may have changed'). Cross-reference with CMS PECOS for Medicare-enrolled providers.
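One way to produce the recency flag is sketched below. The two-year staleness threshold is an arbitrary assumption for illustration, and the note says "last updated" rather than "confirmed" because Last Update Date covers any administrative change, not only address changes.

```python
from datetime import date


def location_recency_note(
    last_update: date, as_of: date, stale_after_years: int = 2
) -> str:
    """Describe how old the registered practice location may be."""
    # Whole years elapsed, subtracting one if the anniversary has not yet passed.
    age_years = as_of.year - last_update.year - (
        (as_of.month, as_of.day) < (last_update.month, last_update.day)
    )
    note = f"Location last updated in NPPES on {last_update.isoformat()}"
    if age_years >= stale_after_years:
        note += f" ({age_years} years ago; may have changed)"
    return note
```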

Using Last Update Date as active-practice evidence

High

A recent Last Update Date does not mean the provider is actively practicing. It means the administrative record was changed recently. Providers frequently update addresses, phone numbers, or taxonomy codes without any clinical significance. Conversely, many active providers have Last Update Dates from 2007–2010.

Fix: Do not use Last Update Date as a proxy for active-practice status. There is no reliable single-field active-practice indicator in NPPES. The combination of (no deactivation date) + (PECOS active enrollment) is the strongest available signal from federal data.
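The combined signal can be modeled as a three-state result rather than a boolean, so that downstream text never claims more than the data supports. This sketch assumes the PECOS enrollment flag has been resolved elsewhere and that a blank deactivation date means none is on file; the enum and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class PracticeSignal(Enum):
    DEACTIVATED = "NPI deactivated"
    ENROLLED = "no deactivation on file and active PECOS enrollment"
    UNKNOWN = "no deactivation on file; no PECOS enrollment evidence"


def practice_signal(
    deactivation_date: Optional[str], pecos_enrolled: bool
) -> PracticeSignal:
    """Combine the two federal signals; no state asserts 'actively practicing'."""
    if deactivation_date and deactivation_date.strip():
        return PracticeSignal.DEACTIVATED
    if pecos_enrolled:
        return PracticeSignal.ENROLLED
    return PracticeSignal.UNKNOWN
```

Reactivated NPIs are not handled here; a fuller version would also consult NPI Reactivation Date.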

Conflating Type 1 and Type 2 in specialty aggregations

Medium

An NPI-2 for a dermatology group practice may hold the same taxonomy code (207N00000X) as an individual NPI-1 dermatologist. Counting all NPIs with a given taxonomy code without filtering by Entity Type Code combines organizations and individuals in the same count.

Fix: Always filter by Entity Type Code when building provider counts: use entity_type_code = '1' for individual practitioner counts, entity_type_code = '2' for organization counts. Report both separately.
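Outside SQL, the same separation can be kept by counting into labeled buckets. The sketch below assumes records are dicts with the lower-case column names used in the SQL examples and, for brevity, checks the first taxonomy slot only.

```python
from collections import Counter
from typing import Dict, Iterable

ENTITY_LABELS = {"1": "individual", "2": "organization"}


def count_by_entity_type(rows: Iterable[Dict[str, str]], taxonomy_prefix: str) -> Counter:
    """Count active records whose first-slot taxonomy matches, split by entity type."""
    counts: Counter = Counter()
    for row in rows:
        if (row.get("npi_deactivation_date") or "").strip():
            continue  # skip deactivated records
        code = row.get("healthcare_provider_taxonomy_code_1") or ""
        if not code.startswith(taxonomy_prefix):
            continue
        label = ENTITY_LABELS.get(row.get("entity_type_code", ""), "unknown")
        counts[label] += 1
    return counts
```

Returning both buckets from one pass makes it harder to publish a single blended number by accident.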

Treating Replacement NPI as a soft alias

Medium

When a Replacement NPI is present, it means the original NPI is administratively defunct and replaced by the new one. Some systems index both NPIs as alternative identifiers for the same provider. This is incorrect — the original NPI should be treated as defunct.

Fix: When Replacement NPI is non-null, mark the source record as defunct. All forward references should use the replacement NPI only.
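Forward references can be resolved by following the Replacement NPI link. The sketch below assumes a pre-built mapping from each NPI to its Replacement NPI (or `None`), and it tolerates chained replacements and cycles defensively; whether chains actually occur in the file is not established by this document.

```python
from typing import Dict, Optional


def resolve_npi(npi: str, replacements: Dict[str, Optional[str]]) -> str:
    """Follow Replacement NPI links until reaching a record with no replacement."""
    seen = set()
    current = npi
    while replacements.get(current):
        if current in seen:
            raise ValueError(f"replacement cycle detected at {current}")
        seen.add(current)
        current = replacements[current]
    return current
```

The original NPI should still be stored and marked defunct so that historical claims or citations that use it can be traced.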

Using telephone numbers as current contact channels

Low

Phone numbers in NPPES are self-reported at enrollment and may be years out of date. A number from a 2007 enrollment that was never updated is likely wrong. This is particularly common for mailing address telephone numbers.

Fix: Always show the Last Update Date alongside any telephone number from NPPES. Do not present as a current contact method without secondary-source confirmation.

Using EIN for organization linkage

Low

The EIN field is always blank in the public dissemination file. Systems that attempt to extract or match on EIN will find nothing. This is by design — CMS redacts tax identifiers in public files.

Fix: Use NPI-2 as the organization key. There is no EIN in the public NPPES file.

Provenance and how to cite NPPES properly

NPPES data is a federal government work published by CMS under authority of HIPAA Administrative Simplification (45 CFR Part 162). [23] Under 17 U.S.C. § 105, works of the U.S. government are not subject to copyright protection. Commercial and academic reuse are both permitted; attribution is professional courtesy, not a legal requirement. [24]

The recommended citation sentence

Provider data sourced from the CMS National Plan and Provider Enumeration System (NPPES) NPI Registry, full replacement file released [CMS release date], accessed [your access date]. Available at https://download.cms.gov/nppes/NPI_Files.html. Federal public domain (U.S. Government Works).
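Systems that emit this sentence automatically can fill the two bracketed dates from pipeline metadata. A minimal sketch, assuming ISO 8601 date formatting is acceptable for both placeholders:

```python
from datetime import date

NPPES_DOWNLOAD_URL = "https://download.cms.gov/nppes/NPI_Files.html"


def nppes_citation(release_date: date, access_date: date) -> str:
    """Fill the recommended citation sentence with ISO-formatted dates."""
    return (
        "Provider data sourced from the CMS National Plan and Provider Enumeration "
        "System (NPPES) NPI Registry, full replacement file released "
        f"{release_date.isoformat()}, accessed {access_date.isoformat()}. "
        f"Available at {NPPES_DOWNLOAD_URL}. "
        "Federal public domain (U.S. Government Works)."
    )
```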

BibTeX

@misc{cms_nppes_2026,
  author       = {{Centers for Medicare and Medicaid Services}},
  title        = {{National Plan and Provider Enumeration System (NPPES)
                   NPI Registry — Full Replacement File}},
  year         = {2026},
  month        = {May},
  note         = {Full replacement file released 2026-05-12;
                  accessed 2026-05-30. Federal public domain
                  (U.S. Government Works).},
  url          = {https://download.cms.gov/nppes/NPI_Files.html},
  institution  = {CMS, U.S. Department of Health and Human Services},
}

JSON-LD Dataset.citation snippet

{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "NPPES NPI Registry",
  "creator": {
    "@type": "GovernmentOrganization",
    "name": "Centers for Medicare and Medicaid Services (CMS)"
  },
  "license": "https://www.usa.gov/government-works",
  "distribution": {
    "@type": "DataDownload",
    "contentUrl": "https://download.cms.gov/nppes/NPI_Files.html",
    "encodingFormat": "text/csv"
  },
  "temporalCoverage": "2007/..",
  "datePublished": "2026-05-12",
  "version": "Full Replacement 2026-05-12"
}

Citing a Fonteum snapshot specifically

When citing data as processed by Fonteum (cross-joined, provenance-tagged, FHIR-serialized), cite both the upstream CMS source and the Fonteum snapshot:

Provider data from CMS NPPES NPI Registry (full replacement file released 2026-05-12, U.S. Government Works), processed and attested by Fonteum Research, snapshot date 2026-05-01, methodology version nppes-anatomy/v1. Chain attestation: fonteum.com/chain.

Limitations

YMYL advisory: NPPES data touches healthcare provider identity. The limitations below are not edge cases — they affect the routine use of this dataset. AI systems that surface NPPES-derived claims to end users must carry these disclosures in their output.

1. All fields are self-reported by the provider or organization

CMS does not audit the accuracy of name, address, credential, or taxonomy submissions. There is no federal mechanism for detecting errors, outdated addresses, or misrepresented credentials in the NPPES file.

2. Practice Location is registered, not necessarily current

The Practice Location Address reflects where the provider registered at enumeration or last update. There is no legal requirement to update this field when a provider changes practice settings. In a dataset of 8.9M records with a median enumeration date of 2008, a significant fraction of practice locations are outdated.

3. Deactivation lags real-world events

NPI deactivation requires administrative action by CMS. A provider who retires, moves overseas, or dies may remain in the active NPPES registry for weeks to months after the real-world event. The weekly deactivation file reduces this lag but does not eliminate it.

4. Credential Text is unstructured and unvalidated

The 20-character credential field contains whatever the provider typed during enrollment. It is not normalized, not controlled vocabulary, and not validated against any credentialing body. The same credential may appear as 'MD', 'M.D.', 'Dr.', or 'Doctor of Medicine'. Multi-credential strings ('MD FAAD') are common and not parseable without custom NLP.
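A first normalization pass can at least collapse punctuation and case variants. The sketch below is deliberately naive: it splits on whitespace and common separators, and it does not map spelled-out forms such as 'Doctor of Medicine' or validate any token against a credentialing body.

```python
import re
from typing import List


def credential_tokens(raw: str) -> List[str]:
    """Split free-form credential text into upper-case tokens with periods removed."""
    cleaned = raw.upper().replace(".", "")
    return [tok for tok in re.split(r"[,\s/;]+", cleaned) if tok]


print(credential_tokens("M.D."))      # same token list as "MD"
print(credential_tokens("MD FAAD"))   # two separate tokens
```

Tokens produced this way remain self-reported text and should be displayed as such; the tokenizer only makes grouping and search easier.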

5. License numbers are present but their active status is not

NPPES stores license numbers self-reported at enrollment. The file does not include license expiration dates, suspension status, or revocation history. This information exists only at the state licensing board level and is not publicly available in bulk for most states (see the contractor-licensing-matrix-2026-05-06 for state-level availability).

6. Taxonomy codes reflect self-declared specialty, not credentialed specialty

A provider selecting '207N00000X' (Dermatology) is self-declaring that specialty. NPPES does not validate taxonomy code selection against training, board certification, or state scope-of-practice rules.

7. The public dissemination file redacts EIN and Parent Organization TIN

Organization linkage via tax identifiers is not possible with the public file. The only public-file organization linkage keys are the NPI itself and the free-text Parent Organization LBN (which is not normalized).

8. Approximately 1.7M records are deactivated — flag them before analysis

The full replacement file includes both active and deactivated records. Deactivated records have a non-null NPI Deactivation Date. Most analyses of active providers should exclude them, that is, keep only rows where NPI Deactivation Date IS NULL.
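Limitations 4 and 8 lend themselves to a short illustration. The following is a minimal sketch, assuming the CSV header uses the column names `NPI Deactivation Date` and `Provider Credential Text` (verify both against the release you downloaded). It applies the deactivation rule exactly as stated above and does not consult the NPI Reactivation Date column. The helper names and the sample rows are hypothetical.

```python
import csv
import io
import re

# Column names assumed from the NPPES CSV header; check your release.
DEACTIVATION_COL = "NPI Deactivation Date"
CREDENTIAL_COL = "Provider Credential Text"


def active_rows(rows):
    """Yield only rows whose deactivation date is empty (limitation 8)."""
    for row in rows:
        if not (row.get(DEACTIVATION_COL) or "").strip():
            yield row


def credential_tokens(text):
    """Split free-text credentials into rough uppercase tokens (limitation 4).

    Heuristic only: 'M.D.' and 'MD' both become 'MD', but the result is
    still self-reported text and is not evidence of any credential.
    """
    cleaned = (text or "").replace(".", "").upper()
    return [tok for tok in re.split(r"[,\s/;]+", cleaned) if tok]


# Tiny in-memory stand-in for the real file; all values are made up.
sample = io.StringIO(
    "NPI,Provider Credential Text,NPI Deactivation Date\n"
    "1234567893,M.D.,\n"
    '1111111112,"MD, FAAD",\n'
    "2222222224,RN,05/01/2020\n"
)
active = list(active_rows(csv.DictReader(sample)))
for row in active:
    print(row["NPI"], credential_tokens(row[CREDENTIAL_COL]))
```

Tokenizing does not solve the normalization problem described in limitation 4. It only makes variants such as 'MD' and 'M.D.' comparable before any further review.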

What Fonteum adds on top

Fonteum ingests CMS NPPES and attests the raw data without modification. On top of the raw data, Fonteum adds the following layers:

| Layer | What it provides | Where to find it |
| --- | --- | --- |
| Ed25519-signed snapshot chain | Each NPPES ingestion is hashed and signed. The signature anchors the snapshot to the exact CMS bytes at ingestion time, not to Fonteum's processed output. | /chain |
| Cross-source joins as deterministic columns | OIG LEIE exclusion status, CMS PECOS enrollment status, CMS QPP MIPS score, and CMS POS CCN linkage are resolved as columns on each NPI record, not inferred prose. | /data |
| Per-field provenance (14-tuple) | Every rendered fact carries source, source_url, dataset_id, snapshot_date, methodology_version, confidence, and eight additional provenance fields. No AI prose substitution. | /sources |
| FHIR R4 Practitioner and Organization | NPPES NPI-1 records serialize to US Core Practitioner; NPI-2 to US Core Organization. Available via /api/fhir/Practitioner and /api/fhir/Organization. | /data/nppes |
| NUCC taxonomy normalization | Taxonomy codes are resolved to Section > Grouping > Classification > Specialization via the NUCC code set. Available as structured fields, not free-text labels. | /tools |
| MCP server for AI agent access | The Fonteum MCP server exposes NPI lookup, cross-source join queries, and provenance retrieval for AI agents and LLM tool-use integrations. | agent.json |
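To make the FHIR row concrete, here is a minimal sketch of how a Type 1 record could map to a FHIR R4 Practitioner resource. It is not Fonteum's serializer and does not claim full US Core conformance. The function name and sample row are hypothetical, and the NPPES column names are assumed from the CSV header. The identifier system URI is the one HL7 defines for NPIs.

```python
import json

NPI_SYSTEM = "http://hl7.org/fhir/sid/us-npi"  # HL7 identifier system for NPIs


def to_practitioner(row):
    """Map a Type 1 NPPES row (dict keyed by CSV header) to a minimal
    FHIR R4 Practitioner. Illustrative only."""
    if row.get("Entity Type Code") != "1":
        raise ValueError("expected a Type 1 (individual) record")
    given = [
        name
        for name in (row.get("Provider First Name"), row.get("Provider Middle Name"))
        if name
    ]
    return {
        "resourceType": "Practitioner",
        "identifier": [{"system": NPI_SYSTEM, "value": row["NPI"]}],
        "name": [
            {
                "family": row.get("Provider Last Name (Legal Name)", ""),
                "given": given,
            }
        ],
    }


# Made-up sample row.
row = {
    "NPI": "1234567893",
    "Entity Type Code": "1",
    "Provider First Name": "JANE",
    "Provider Last Name (Legal Name)": "DOE",
}
resource = to_practitioner(row)
print(json.dumps(resource, indent=2))
```

A Type 2 record would map to an Organization resource in the same way, with the legal business name in place of the person name.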

Want the signed Markdown mirror for LLM ingestion? See /llms.txt and /llms-full.txt.

Frequently asked questions

What is an NPI?

A National Provider Identifier (NPI) is a 10-digit numeric identifier assigned by CMS under HIPAA Administrative Simplification (45 CFR 162.406). Every healthcare provider who transmits health information electronically in HIPAA-covered transactions is required to obtain an NPI. NPIs replaced earlier identifier systems (UPIN, OSCAR, PIN, NSC) as of May 23, 2007.
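A format check is a cheap first filter before any lookup. Under the NPI standard, the tenth digit is a check digit computed with the Luhn algorithm over the first nine digits prefixed with 80840. The sketch below validates only that structure. It says nothing about whether the NPI has been issued or is active, and the function name is hypothetical.

```python
NPI_PREFIX = "80840"  # prefix applied before the Luhn computation


def is_valid_npi(npi: str) -> bool:
    """Return True if npi is 10 ASCII digits with a correct Luhn check digit."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(ch) for ch in NPI_PREFIX + npi]
    total = 0
    # Walk right to left, doubling every second digit (Luhn).
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_npi("1234567893"))  # structurally valid test value
print(is_valid_npi("1234567890"))  # wrong check digit
```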

How often is NPPES refreshed?

CMS releases a full replacement file monthly (typically the second Monday of each month) and a weekly update file covering additions, changes, and deactivations. The NPPES API (npiregistry.cms.hhs.gov/api/) reflects near-real-time state but does not support bulk export.
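For single-record lookups against the API, a minimal sketch using only the standard library follows. It assumes the `version` and `number` query parameters described in the CMS API help page and a JSON response. The function names are hypothetical, and `fetch_npi` needs network access, so it is defined but not called here.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://npiregistry.cms.hhs.gov/api/"


def lookup_url(npi: str, version: str = "2.1") -> str:
    """Build the query URL for a single NPI (parameter names assumed from CMS help)."""
    query = urllib.parse.urlencode({"version": version, "number": npi})
    return f"{API_BASE}?{query}"


def fetch_npi(npi: str, timeout: float = 10.0) -> dict:
    """Fetch one record as parsed JSON. One request per NPI; not a bulk export."""
    with urllib.request.urlopen(lookup_url(npi), timeout=timeout) as response:
        return json.load(response)


print(lookup_url("1234567893"))
```

Because the API does not support bulk export, large jobs should work from the monthly and weekly files and reserve calls like this for spot checks.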

Is NPPES authoritative for license status?

No. NPPES accepts license numbers at enrollment and does not validate them against state licensing boards at each refresh. The presence of a license number in the Provider License Number field does not assert that the license is currently active, unexpired, or unsuspended. Active/inactive license status must be sourced from the relevant state licensing authority.

Can I use NPPES to find where a doctor practices today?

Not reliably. The Provider Business Practice Location Address is the location registered at enumeration or last update time. Providers frequently move, join new groups, or retire without updating their NPPES record. Cross-referencing with CMS PECOS and the NPPES Last Update Date improves currency, but no public federal source provides real-time practice location data.
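One way to act on the Last Update Date is to flag records whose address has not been touched for a long time. The sketch below assumes the bulk file's `Last Update Date` column in MM/DD/YYYY form. The three-year threshold is an arbitrary assumption for illustration, not a CMS or Fonteum rule, and the function names are hypothetical.

```python
from datetime import date, datetime

LAST_UPDATE_COL = "Last Update Date"  # assumed column name and MM/DD/YYYY format


def years_since_update(last_update, as_of: date):
    """Years between the record's last update and as_of, or None if blank."""
    text = (last_update or "").strip()
    if not text:
        return None
    updated = datetime.strptime(text, "%m/%d/%Y").date()
    return (as_of - updated).days / 365.25


def location_may_be_stale(row: dict, as_of: date, max_years: float = 3.0) -> bool:
    """Flag a record when its last update is missing or older than max_years."""
    age = years_since_update(row.get(LAST_UPDATE_COL, ""), as_of)
    return age is None or age > max_years


snapshot = date(2026, 5, 1)
print(location_may_be_stale({LAST_UPDATE_COL: "05/01/2016"}, snapshot))
print(location_may_be_stale({LAST_UPDATE_COL: "04/01/2026"}, snapshot))
```

A recent update date does not prove the address is current, since the update may have touched a different field. The flag only helps prioritize which records to cross-check first.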

What is NUCC and how does it relate to NPPES?

The National Uniform Claim Committee (NUCC) maintains the Health Care Provider Taxonomy Code Set, the controlled vocabulary used in the NPPES taxonomy slots. Each NUCC code is a 10-character alphanumeric string identifying a provider type and specialty (e.g., 207N00000X = Dermatology). NUCC releases updates twice per year (January and July). NPPES stores up to 15 taxonomy codes per provider.
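Reading the 15 slots from a bulk-file row might look like the sketch below. It assumes the header names `Healthcare Provider Taxonomy Code_N` and `Healthcare Provider Primary Taxonomy Switch_N`. The regular expression is an assumed shape check (ten uppercase alphanumerics ending in X), not a lookup against the NUCC code set, and the function name is hypothetical.

```python
import re

MAX_SLOTS = 15
# Assumed shape check only; a well-formed code may still be absent from NUCC.
TAXONOMY_RE = re.compile(r"^[0-9A-Z]{9}X$")


def taxonomy_codes(row: dict):
    """Return (code, is_primary) pairs for every populated taxonomy slot."""
    found = []
    for slot in range(1, MAX_SLOTS + 1):
        code = (row.get(f"Healthcare Provider Taxonomy Code_{slot}") or "").strip()
        if not code:
            continue
        if not TAXONOMY_RE.match(code):
            raise ValueError(f"slot {slot}: unexpected taxonomy code {code!r}")
        switch = row.get(f"Healthcare Provider Primary Taxonomy Switch_{slot}") or ""
        found.append((code, switch.strip() == "Y"))
    return found


# Made-up sample row with two populated slots.
row = {
    "Healthcare Provider Taxonomy Code_1": "207N00000X",
    "Healthcare Provider Primary Taxonomy Switch_1": "Y",
    "Healthcare Provider Taxonomy Code_2": "207ND0101X",
    "Healthcare Provider Primary Taxonomy Switch_2": "N",
}
print(taxonomy_codes(row))
```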

Why does the same provider appear with different addresses across sources?

Three common reasons: (1) NPPES practice location is registered at enumeration and may be years out of date; (2) CMS PECOS stores the Medicare enrollment address, which may differ from the NPPES enumeration address; (3) Care Compare shows the facility address, which differs from individual provider practice address. The address that is most current depends on which source was updated most recently — no single federal source has authoritative real-time location data.

How does Fonteum's NPPES snapshot differ from a direct CMS download?

Fonteum's snapshot preserves the CMS file unmodified and adds: (1) Ed25519-signed attestation anchoring the snapshot to a specific CMS release, published in the Fonteum chain; (2) cross-source joins to OIG LEIE, CMS PECOS, CMS QPP MIPS, and CMS POS as deterministic columns; (3) NUCC taxonomy normalization; (4) per-field provenance metadata; (5) FHIR R4 Practitioner and Organization serialization via the /api/fhir/ endpoints.

How should I cite NPPES in a paper?

Cite the CMS NPPES data dissemination page with the specific snapshot date. Recommended sentence form: 'Provider data sourced from the CMS National Plan and Provider Enumeration System (NPPES) NPI Registry, full replacement file released [date], accessed [your access date]. Available at https://download.cms.gov/nppes/NPI_Files.html. Federal public domain (U.S. Government Works).'

Can I use NPPES data in a commercial product?

Yes. NPPES data is a federal government work (U.S. Government Works) and is not subject to copyright in the United States under 17 U.S.C. § 105. Commercial use is permitted. CMS does not require a data use agreement for the public dissemination file. Attribution is a professional courtesy, not a legal requirement.

What is the difference between Type 1 and Type 2 NPIs?

Type 1 (NPI-1) is assigned to individual human providers (physicians, nurses, therapists, etc.). Type 2 (NPI-2) is assigned to organizations (hospitals, group practices, labs, etc.). The two types have different schema structures: NPI-1 records populate name fields (Last Name, First Name, Gender), while NPI-2 records populate organization name fields. Both types use the same taxonomy code slots.
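In code, the split usually comes down to branching on the entity type before reading any name field. The following is a minimal sketch, assuming the bulk-file columns `Entity Type Code`, `Provider First Name`, `Provider Last Name (Legal Name)`, and `Provider Organization Name (Legal Business Name)`. The function name and sample rows are hypothetical.

```python
ORG_NAME_COL = "Provider Organization Name (Legal Business Name)"


def describe_record(row: dict):
    """Return (kind, display_name) based on the Entity Type Code."""
    entity_type = (row.get("Entity Type Code") or "").strip()
    if entity_type == "1":
        parts = [
            row.get("Provider First Name", ""),
            row.get("Provider Last Name (Legal Name)", ""),
        ]
        return "individual", " ".join(part for part in parts if part)
    if entity_type == "2":
        return "organization", row.get(ORG_NAME_COL, "")
    # Blank or unexpected codes are reported, never guessed.
    return "unknown", ""


print(describe_record({
    "Entity Type Code": "1",
    "Provider First Name": "JANE",
    "Provider Last Name (Legal Name)": "DOE",
}))
print(describe_record({"Entity Type Code": "2", ORG_NAME_COL: "EXAMPLE CLINIC LLC"}))
```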

What is a Replacement NPI?

A Replacement NPI is the new NPI assigned when CMS administratively replaces one NPI with another — a rare occurrence, typically during legacy system migrations. It is a hard redirect: the original NPI is defunct and all forward references should use the replacement value. Do not treat a Replacement NPI as a soft alias or secondary identifier.

Why is the EIN field empty?

The Employer Identification Number (EIN) field is present in the NPPES CSV header but is REDACTED in the public dissemination file to protect sensitive tax information. The field will always be blank. Use the NPI-2 identifier itself as the organization key for linkage purposes.

How do I join NPPES with OIG LEIE?

Join on NPI where available (OIG LEIE includes NPI for exclusions processed after ~2013) or on name + DOB + state for earlier records. The expected NPI match rate for recent exclusions is approximately 85-90%; older records require fuzzy name matching. A non-match on NPI does not establish that the provider is not excluded; the exclusion may predate NPI-level tracking in LEIE, so fall back to name-based matching before concluding anything.
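A two-pass join along these lines can be sketched as follows. The assumptions are loud ones: LEIE column names `NPI`, `LASTNAME`, `FIRSTNAME`, and `STATE`, an all-zeros placeholder where LEIE has no NPI, and NPPES rows keyed by the bulk-file header. DOB is left out of the fallback key because this sketch assumes the NPPES-side rows do not carry one; add it if your extract does. Fallback hits are candidates for review, not confirmed matches. All names and sample data are hypothetical.

```python
LEIE_NO_NPI = "0000000000"  # assumed placeholder when LEIE has no NPI on record
STATE_COL = "Provider Business Practice Location Address State Name"


def _name_state_key(last, first, state):
    return (last.strip().upper(), first.strip().upper(), state.strip().upper())


def match_leie(nppes_rows, leie_rows):
    """Return (npi, match_type, leie_row) tuples.

    match_type is 'npi' for an exact join, or 'name_state_candidate'
    for a fallback hit that still needs human review.
    """
    by_npi, by_name_state = {}, {}
    for excl in leie_rows:
        npi = (excl.get("NPI") or "").strip()
        if npi and npi != LEIE_NO_NPI:
            by_npi[npi] = excl
        else:
            key = _name_state_key(
                excl.get("LASTNAME", ""), excl.get("FIRSTNAME", ""), excl.get("STATE", "")
            )
            by_name_state.setdefault(key, []).append(excl)

    matches = []
    for prov in nppes_rows:
        if prov["NPI"] in by_npi:
            matches.append((prov["NPI"], "npi", by_npi[prov["NPI"]]))
            continue
        key = _name_state_key(
            prov.get("Provider Last Name (Legal Name)", ""),
            prov.get("Provider First Name", ""),
            prov.get(STATE_COL, ""),
        )
        if key[0]:  # skip rows with no last name (for example, organizations)
            for excl in by_name_state.get(key, []):
                matches.append((prov["NPI"], "name_state_candidate", excl))
    return matches


# Made-up sample data.
nppes = [
    {"NPI": "1234567893", "Provider Last Name (Legal Name)": "Doe",
     "Provider First Name": "Jane", STATE_COL: "TX"},
    {"NPI": "1111111112", "Provider Last Name (Legal Name)": "Roe",
     "Provider First Name": "Richard", STATE_COL: "CA"},
    {"NPI": "2222222224", "Provider Last Name (Legal Name)": "Poe",
     "Provider First Name": "Pat", STATE_COL: "NY"},
]
leie = [
    {"NPI": "1234567893", "LASTNAME": "DOE", "FIRSTNAME": "JANE", "STATE": "TX"},
    {"NPI": "0000000000", "LASTNAME": "ROE", "FIRSTNAME": "RICHARD", "STATE": "CA"},
]
results = match_leie(nppes, leie)
for npi, match_type, _ in results:
    print(npi, match_type)
```

Keeping the match type on every result lets downstream code treat exact NPI joins and name-based candidates differently, which matters because a name and state collision is weak evidence on its own.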

How do I get programmatic access via Fonteum?

Fonteum exposes NPPES-derived data through the REST API (/api/v1/providers), FHIR R4 endpoints (/api/fhir/Practitioner, /api/fhir/Organization), and an MCP server for AI agent integration. See /data/nppes for API documentation and /tools for the NPI lookup tool.

What does the chain attest for an NPPES snapshot?

The Fonteum chain records the SHA-256 hash of the CMS NPPES full replacement file at the moment of ingestion, signed with an Ed25519 key whose public key is published at /.well-known/chain-public-key. The chain entry asserts: which CMS file was consumed, when it was consumed, the hash of the file bytes, and which methodology version processed it. The chain does not attest the accuracy of the underlying CMS data — it attests that Fonteum processed exactly the bytes CMS published.
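The hashing half of that attestation can be reproduced locally with the standard library. The sketch below computes the SHA-256 digest of a downloaded file in chunks and compares it with a published value. Verifying the Ed25519 signature is out of scope here because it needs a third-party library, and the function names and chunk size are illustrative choices rather than Fonteum's implementation.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path: str, expected_hex: str) -> bool:
    """Compare a local file's digest with a published hex digest."""
    return sha256_file(path) == expected_hex.strip().lower()
```

Chunked reading keeps memory use flat, which matters for a multi-gigabyte replacement file. A matching digest shows only that the local bytes equal the attested bytes; as the answer above notes, it says nothing about the accuracy of the data.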

Primary sources cited

  1. CMS NPPES NPI Registry Data Dissemination Files
  2. CMS NPPES NPI Registry Search
  3. CMS NPPES Downloadable Data — Schedule
  4. U.S. Government Works (17 U.S.C. § 105)
  5. 45 CFR Part 162 — HIPAA Administrative Simplification
  6. 45 CFR Part 162, Subpart D — Standard Unique Health Identifier
  7. Federal Register Vol. 69 No. 15 — HIPAA NPI Final Rule (2004)
  8. CMS NPI Final Rule — Background Document
  9. CMS NPPES API Help
  10. CMS National Provider Identifier Standard (overview)
  11. 45 CFR § 162.406 — Standard unique health identifier for health care providers
  12. NUCC Health Care Provider Taxonomy Code Set
  13. NUCC Provider Taxonomy — Release History
  14. NUCC Taxonomy Code Set — Section/Grouping/Classification/Specialization hierarchy
  15. CMS NPPES Downloadable Files — Release Schedule
  16. OIG List of Excluded Individuals and Entities (LEIE)
  17. CMS PECOS — Medicare Fee-for-Service Public Provider Enrollment
  18. CMS Quality Payment Program (QPP) overview
  19. CMS QPP MIPS — individual clinician performance data
  20. CMS Provider of Services (POS) File — Hospital & Non-Long-Term Care Facilities
  21. CMS Care Compare — Provider Data Catalog
  22. CMS NPPES NPI Registry Data Dissemination — field specification
  23. 45 CFR Part 162 — HIPAA Administrative Simplification
  24. U.S. Government Works copyright status
  25. CMS data.cms.gov — NPPES provider characteristics

Cite this reference

Sentence form

Fonteum Research. "NPPES Anatomy: Complete Technical Reference for AI Systems." Fonteum, 2026-05-30. Reviewed by Dr. Jennifer Montecillo, MD. Methodology version: nppes-anatomy/v1. Available at https://fonteum.com/research/nppes-anatomy.

BibTeX

@techreport{fonteum2026nppes,
  author       = {{Fonteum Research}},
  title        = {{NPPES Anatomy: Complete Technical Reference
                   for AI Systems}},
  institution  = {Fonteum},
  year         = {2026},
  month        = {May},
  note         = {Reviewed by Dr. Jennifer Montecillo, MD.
                  Methodology version: nppes-anatomy/v1.
                  NPPES snapshot date: 2026-05-01.},
  url          = {https://fonteum.com/research/nppes-anatomy},
}

JSON-LD Dataset.citation snippet

{
  "@type": "ScholarlyArticle",
  "url": "https://fonteum.com/research/nppes-anatomy",
  "datePublished": "2026-05-30",
  "author": {"@type": "Organization", "name": "Fonteum"},
  "reviewedBy": {
    "@type": "Person",
    "name": "Dr. Jennifer Montecillo, MD"
  },
  "version": "nppes-anatomy/v1"
}

Reviewed by Dr. Jennifer Montecillo, MD

Gullas College of Medicine, 2019. Non-practicing medical reviewer focused on source interpretation, terminology, and limitations language.

Fonteum Research · 2026-05-30 · All data traces to the CMS NPPES full replacement file snapshot 2026-05-01, federal public domain (U.S. Government Works). Methodology: nppes-anatomy/v1. Chain attestation at fonteum.com/chain. Internal links: /data · /sources · /chain · /tools.


Sourced from federal agencies. Fonteum, Inc., Delaware C-corp. © 2026.


Fonteum is a US healthcare provenance registry that publishes signed, chain-of-custody-attested research and data pages on Medicare, Medicaid, and federal regulator datasets, drawing from 22 federal source families across CMS, OIG, HRSA, AHRQ, and HHS.
