Clinical codes combined with procedure codes increase diagnostic accuracy of Crohn’s disease in a US military health record
  1. Manish Singla1,2,
  2. Susan Hutfless3,
  3. Elie Al Kazzi4,
  4. Benjamin Rodriguez5,
  5. John Betteridge6,
  6. Steven R Brant3,7,8
  1. 1Gastroenterology Service, Department of Internal Medicine, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
  2. 2Uniformed Services University, Bethesda, MD, United States
  3. 3Harvey M. and Lyn P. Meyerhoff Inflammatory Bowel Diseases Center, Gastroenterology Division, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  4. 4Department of Medicine, MedStar Union Memorial Hospital, Baltimore, Maryland, USA
  5. 5Gastroenterology Service, Department of Internal Medicine, US Naval Hospital Jacksonville, Jacksonville, Florida, USA
  6. 6Regional GI, Lancaster, Pennsylvania, USA
  7. 7Division of Gastroenterology and Hepatology, Crohns and Colitis Center of New Jersey, Rutgers Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA
  8. 8Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  1. Correspondence to Dr Manish Singla; manishsingla{at}


Background and aims Previous examinations of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes to predict accuracy of diagnosis in inflammatory bowel disease have had limited chart review to confirm diagnosis. We aimed to evaluate the use of ICD-9-CM codes for identifying Crohn’s disease (CD) in a large electronic health record (EHR) database.

Methods This is a retrospective case-control study with a 3:1 allocation of EHRs of active duty service members diagnosed with CD from 1996 to 2012. Subjects were selected by having two ICD-9-CM codes for CD and none for ulcerative colitis during the study period. Gastroenterologists reviewed each chart and confirmed the diagnosis of CD by analysing medication history and clinical, endoscopic, histological, and radiographic exams.

Results 300 cases of CD were selected; 16 cases were discarded due to lack of data, limiting analysis to 284 subjects. Two diagnostic codes for CD had sensitivity and specificity of 1.0 and 0.53, respectively, for confirmed CD. If two or more encounters listing CD were with a gastroenterologist, the sensitivity and specificity were 0.71 and 0.87, respectively. If a colonoscopy was performed at the same time as a CD code, the sensitivity and specificity were 0.49 and 0.88, respectively.

Conclusions The relatively poor specificity of ICD-9-CM codes in making the diagnosis of CD should be taken into consideration when interpreting results and when conducting research using such codes. Limiting these codes to patients given this diagnosis by a gastroenterologist, or to those who had a colonoscopy at the time of a diagnosis, increases the specificity, although at the cost of sensitivity, especially for colonoscopy.

  • Crohn's disease
  • ulcerative colitis
  • irritable bowel syndrome

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Summary box

What is already known about this subject?

  • Using clinical billing codes can allow big data analysis of healthcare outcomes in patients with Crohn’s disease (CD).

What are the new findings?

  • Using only clinical billing codes had a poor specificity and positive predictive value (PPV) in predicting patients with CD.

  • Requiring a gastroenterology encounter or adding a code for colonoscopy greatly increased specificity and PPV.

How might it impact on clinical practice in the foreseeable future?

  • Future studies identifying patients with CD using billing codes should include gastroenterology encounters or procedure codes to increase specificity and PPV.


Crohn’s disease (CD) is a chronic idiopathic disease characterised by transmural inflammation of the gastrointestinal tract, primarily the ileum or colon. The disease is diagnosed based on biopsies, obtained by endoscopy or surgery, that indicate chronic inflammation in a patient without a history of chronic infectious diseases (eg, tuberculosis) or other factors (eg, ovarian abscesses or diverticulitis) that may cause a similar appearance of chronic gut inflammation.1

Clinically coded data, used primarily for billing or encounter tracking, can be used to identify and study large cohorts of patients with CD in an efficient and cost-effective manner. However, clinically coded data and electronic health records (EHRs) are not designed for research purposes. The codes can reflect ‘working diagnoses’, and are often incomplete descriptions of the severity or complications of disease. Although the EHR provides more details, the notes and uploaded documents do not always capture the longitudinal phenotype and disease activity of patients that may be collected in a recruitment-based prospective study or randomised trial. The volume of patients that can be studied using clinically coded data can add substantially to the knowledge base. Identifying a validated case definition for codes using the EHR associated with a particular cohort can add substantially to the value of the cohort.

Previous studies have examined the accuracy of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) and similar codes based on the reference standard for diagnosis, documentation of inflammatory bowel disease (IBD) in the medical record.2 Previous studies of accuracy of diagnostic codes in the USA found that 67.5% of patients with CD were correctly classified based on at least one ICD-9-CM 555 encounter3 and 88% with two encounters.4 Some cohorts have not performed their own validation studies; rather, they have relied on a case definition of two encounters based on prior evidence.5–10 One study showed a positive predictive value (PPV) of 91% when a CD code was present without any UC codes, although this appears to be an outlier.11 The studies have used various methods to confirm CD, ranging from a mention of CD in medical record notes to review of endoscopic or radiological images or reports, operative notes, and pathology reports.

The goal of our study was to assess the diagnostic accuracy of several ICD-9-CM definitions in the active duty US military population. The US military provides a unique opportunity for research on IBD and other significant chronic conditions because IBD and related conditions (including chronic diarrhoea and chronic abdominal pain) preclude entry into the US military. As a result, first diagnoses entered will overwhelmingly be those from initial disease presentations. It is a diverse population but with homogeneous and universal access to medical evaluation and treatment. At a minimum, we required at least two ICD-9-CM 555 encounters.9 In addition, we aimed to examine other definitions (to include timing of diagnosis, procedure codes, and provider specialty) to maximise sensitivity, specificity, and the PPV of CD. The expansive military EHR including clinical notes, endoscopy reports, operative reports, images, and laboratory and pathology results was used to confirm CD diagnoses.


We conducted a retrospective case-control study with a 3:1 allocation. Eligible patients included those with active military service between 1 January 1996 and 1 December 2012 with at least three serum samples available in the Armed Forces Repository of Specimen Samples required for a related IBD study. Two groups were selected for chart review: individuals with at least two outpatient ICD-9-CM codes of 555.x and no codes of 556.x (ulcerative colitis (UC)) (n=300), and 100 individuals with similar age, sex, race, and service, but no codes of 555 or 556. Electronic versions of clinical notes, pharmacy data, endoscopy reports, radiology reports, and laboratory values were reviewed from the Department of Defense EHR, the Armed Forces Health Longitudinal Technology Application (AHLTA), by medical doctors with subspecialty fellowship training in gastroenterology and clinical practices focused in IBD. All ICD-9-CM and Current Procedural Terminology (CPT) codes and the associated clinically coded information (ie, provider specialty and location of encounter) for all reviewed individuals were available.
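The screening rule described above (at least two 555.x codes and no 556.x codes) can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual extraction code: the function names and the input format (one ICD-9-CM code string per coded outpatient encounter) are assumptions made for the example.

```python
from typing import Iterable


def has_prefix(code: str, prefix: str) -> bool:
    """True for '555' and '555.x' style codes when prefix is '555'."""
    return code == prefix or code.startswith(prefix + ".")


def meets_screening_definition(icd9_codes: Iterable[str]) -> bool:
    """Hypothetical helper: at least two CD codes (555.x), no UC codes (556.x).

    Assumes icd9_codes holds one code string per coded encounter.
    """
    codes = list(icd9_codes)
    cd_count = sum(has_prefix(c, "555") for c in codes)
    uc_count = sum(has_prefix(c, "556") for c in codes)
    return cd_count >= 2 and uc_count == 0


if __name__ == "__main__":
    print(meets_screening_definition(["555.9", "555.1", "564.1"]))  # two CD codes, no UC
    print(meets_screening_definition(["555.9", "555.0", "556.9"]))  # excluded by UC code
```

Matching on the three-digit category plus a dot avoids accidentally matching unrelated codes that merely begin with the same digits.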

Data extracted from the EHR included age, gender, Montreal classification (disease location, disease behaviour, and duration of disease), and histories of smoking, intestinal surgery (to include indication and location), medications, colonoscopies, radiological studies, and diagnoses of CD, UC, irritable bowel syndrome (IBS) and infection. Records were reviewed by four IBD specialists. A chart-review-confirmed case of CD was defined by clinical symptoms consistent with and specific to CD, accompanied either by mucosal ulceration on endoscopy or by a surgical specimen with pathology confirming chronic histological inflammation.

All cases were reviewed by at least two specialists, with the ruling of the second specialist maintained.

The case definitions of interest included different numbers of encounters for 555.x in combination with site of service (gastroenterology (Medical Expense and Performance Reporting System codes AAF for inpatient, BAG for outpatient) or general surgery (ABA)), hospitalisation for CD, and colonoscopy (CPT 45355, 45378, 45379, 45380, 45381, 45382, 45383, 45388, 45384, 45385, 45386, 45387, 45389, 45391, 45392, 45390, 45393, 45398, 45399). A 2×2 table was created for each potential case definition, classifying each individual as a true negative, true positive, false negative or false positive based on the definition and the chart review determination. Using this table, sensitivity, specificity, PPV and diagnostic accuracy (defined as true positives plus true negatives over the total denominator) were calculated. Exact binomial confidence limits were calculated.4
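As an illustration of the calculations described above, the following sketch computes the four measures from a 2×2 table and an exact binomial interval. It assumes that "exact binomial confidence limits" refers to the Clopper-Pearson method, which is the usual meaning but is not stated in the text; the function names are hypothetical, and the interval is found by simple bisection using only the standard library.

```python
from math import comb


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV and accuracy from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(successes: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson interval (assumed method), solved by bisection."""

    def solve(k: int, target: float) -> float:
        # binom_cdf(k, n, p) decreases as p increases, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if successes == 0 else solve(successes - 1, 1 - alpha / 2)
    upper = 1.0 if successes == n else solve(successes, alpha / 2)
    return lower, upper


if __name__ == "__main__":
    print(diagnostic_metrics(tp=8, fp=2, fn=2, tn=8))
    print(exact_ci(5, 10))
```

The bisection approach avoids a dependency on a statistics library; with SciPy available, the same limits could be obtained from beta distribution quantiles.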


Our analysis included 284 patients and 100 controls; no medical encounters were available in our EHR for 16 patients. Of the 284 evaluated patients, 196 had a confirmed diagnosis of CD (69%). Twenty cases had no mention of CD in their medical record nor any gastrointestinal or immunological condition (7%). Nine patients had mention of CD in their records but lacked endoscopy or pathology information to make a definitive diagnosis (3%). Multiple patients (6.0%) had other chronic IBDs including indeterminate colitis (n=4), radiographic ileitis without endoscopic inflammation (n=4), lymphocytic colitis (n=5), UC (n=3), and possible UC (n=1). Other intestinal inflammatory conditions were observed in 2.4% of subjects including eosinophilic gastrointestinal disease (n=3), Behcet’s disease (n=1), acute colitis followed by normal endoscopic findings (n=2), and jejunal enteritis seen on radiographic imaging without endoscopic or pathological confirmation (n=1). In 3.5% of subjects, chart review showed complications or features found in CD but had no evidence to confirm the finding was due to CD (ie, intra-abdominal abscess (n=1), cryptitis (n=1), mucosal thickening on CT (n=5), and recurrent anal fissures or perianal fistula without mucosal disease (n=3)). Other gastrointestinal diagnoses included most commonly IBS (n=16), small bowel obstruction (n=1), haemorrhoids (n=1), gastro-oesophageal reflux disorder (n=1), dyspepsia (n=1), chronic abdominal pain (n=1), carcinoid tumour (n=1), appendicitis (n=1) and traveller’s diarrhoea (n=1). One patient had hidradenitis suppurativa, found more frequently among patients with CD (see online supplementary table). None of the 100 control patients had evidence for a diagnosis of IBD following similar examination of their medical records.

Having two diagnostic codes for CD and no codes for UC had sensitivity, specificity, and PPV (with 95% CIs) of 1.0 (by definition as only those with at least two codes were examined so no CI calculated), 0.53 (95% CI 0.46 to 0.60), and 0.69 (95% CI 0.63 to 0.74), respectively (see table 1). When two or more encounters listing CD were with a gastroenterologist, the sensitivity, specificity, and PPV were 0.71 (95% CI 0.65 to 0.88), 0.87 (95% CI 0.81 to 0.91), and 0.85 (95% CI 0.78 to 0.90), respectively. Sensitivity, specificity and PPV were nearly identical if two encounters were with a gastroenterologist or a general surgeon (table 1). If a colonoscopy was performed at the same time as a CD code, the sensitivity, specificity, and PPV were 0.49 (95% CI 0.42 to 0.56), 0.88 (95% CI 0.83 to 0.93), and 0.81 (95% CI 0.73 to 0.88), respectively.
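The baseline figures can be checked against the counts reported in the text. The sketch below assumes that the 196 confirmed patients are the true positives, that the remaining 88 of the 284 reviewed patients are false positives, and that all 100 controls are true negatives; this mapping is an inference from the reported numbers rather than something stated explicitly.

```python
# Counts taken from the Results text; the mapping to 2x2 cells is an assumption.
reviewed_cases = 284      # patients meeting the two-code definition with reviewable charts
confirmed_cd = 196        # chart-review-confirmed CD
controls = 100            # no 555/556 codes, none with IBD on review

tp = confirmed_cd
fp = reviewed_cases - confirmed_cd   # 88 unconfirmed patients
fn = 0                               # every confirmed case met the definition by design
tn = controls

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} ppv={ppv:.2f}")
```

Under these assumptions the rounded specificity and PPV agree with the reported 0.53 and 0.69, which also shows why sensitivity is 1.0 by construction for this definition.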

Table 1

Diagnostic accuracy characteristics of case definitions based on 284 chart reviewed cases and 100 controls


Retrospective review of charts to identify patients with CD can be difficult due to the varying presentations of CD, and the absence of common, objective clinical tests with high negative predictive values complicates large database studies that aim to identify patients with CD. ICD9 (and now, ICD-10-CM) codes are frequently used as substitutes for chart review, especially in large database studies where chart reviews are impractical. The poor specificity (0.53) and PPV (0.69) we observed for even two isolated ICD9 codes in making the diagnosis of CD should be taken into consideration when interpreting results of large population studies.

After starting with a preselected population, requiring at least two CD ICD9 codes be given by gastroenterologists, or requiring a colonoscopy at the time of a diagnostic code, substantially increased the specificity and PPV although at a cost of sensitivity, especially for a colonoscopy requirement. This has some implications for future ‘big data’ research, and suggests that we should continue to interpret database studies extracted from EHRs with caution, particularly without a validation cohort.
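The two stricter case definitions discussed here could be expressed along the following lines. This is a sketch under stated assumptions: the `Encounter` structure and function names are hypothetical, "at the same time" is interpreted as the colonoscopy CPT code appearing on the same encounter as the CD code, and the site and CPT code lists are copied from the Methods.

```python
from dataclasses import dataclass, field

# Site-of-service and CPT codes as listed in the Methods.
GI_SITES = {"AAF", "BAG"}
COLONOSCOPY_CPT = {
    "45355", "45378", "45379", "45380", "45381", "45382", "45383",
    "45384", "45385", "45386", "45387", "45388", "45389", "45390",
    "45391", "45392", "45393", "45398", "45399",
}


@dataclass
class Encounter:
    """Hypothetical record of one coded encounter."""
    icd9: set
    site: str = ""
    cpt: set = field(default_factory=set)


def is_cd(enc: Encounter) -> bool:
    """True if the encounter carries any 555 or 555.x code."""
    return any(c == "555" or c.startswith("555.") for c in enc.icd9)


def two_gi_cd_encounters(encounters: list) -> bool:
    """At least two CD-coded encounters at a gastroenterology site."""
    return sum(is_cd(e) and e.site in GI_SITES for e in encounters) >= 2


def cd_with_colonoscopy(encounters: list) -> bool:
    """Any CD-coded encounter that also carries a colonoscopy CPT code."""
    return any(is_cd(e) and bool(e.cpt & COLONOSCOPY_CPT) for e in encounters)


if __name__ == "__main__":
    history = [
        Encounter({"555.9"}, site="BAG"),
        Encounter({"555.1"}, site="BAG", cpt={"45380"}),
    ]
    print(two_gi_cd_encounters(history), cd_with_colonoscopy(history))
```

Each definition is a predicate over a patient's encounter history, so the same chart-review reference standard can be used to fill a 2×2 table for every candidate definition.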

Our results can be compared with those of several other studies. A study examining medical charts from Massachusetts General Hospital and Brigham and Women’s Hospital of 600 patients with at least one ICD-9-CM code for CD confirmed CD in 67.5% of patients. They found evidence to support a diagnosis of UC instead of CD in 11.0% of the remaining 32.5% of patients.3 These authors included as positives patients with EHRs that included multiple references to having CD without an endoscopic confirmation. In our study, we often found intestinal conditions or non-specific radiographs suggestive of CD (eg, thickening on CT) but endoscopic or pathology evidence was non-specific or supported a related diagnosis (eg, eosinophilic gastrointestinal disease). Additionally, our study had relatively few patients with UC; this was not surprising given we excluded patients with any ICD-9-CM codes for UC for increased CD specificity. A study of the Manitoba Health database used administrative case definitions and found a 91.3% specificity compared with a self-report questionnaire of patients and a 93.7% specificity compared with a chart review gold standard.12 A study of the General Practice Research Database to validate the diagnosis of CD using OXMIS codes and surveying general practitioners to confirm these diagnoses categorised 86% of 49 patients identified by EHR as having CD.13 A study of the Kaiser Permanente membership randomly selected 2325 patients with at least two outpatient or inpatient ICD-9-CM codes for CD (ie, 555.x), and confirmed CD in 88% of patients with chart review.14 These authors included those with radiological evidence of CD without confirmation with endoscopy. Another study identified patients with IBD using an endoscopy database, and found that an ICD-9-CM diagnostic code for IBD in addition to two medical contacts in Alberta’s Ambulatory Care Classification System yielded 97.4% PPV for IBD.15 This study began with patients who were undergoing endoscopy with an ICD-9-CM code for IBD, so presumably the patients were starting with endoscopic confirmation. The study that correlates best with our findings analysed algorithms to predict diagnosis of CD from discharge and billing data in two large cohorts of Ontario patients, and required five physician contacts in 4 years listing IBD in discharge coding to achieve 81.4% PPV for predicting IBD.16

Our study has many strengths. The military health system is a single payer system, so all pathology specimen data for patients during their active duty time were available for analysis. In addition, all endoscopies and biopsies done while the patient was active were available. Rather than having medical billers analyse charts, all 400 charts were analysed by gastroenterologists with specialty training and interest in IBD, likely increasing the reliability of confirmation. We only confirmed patients who had endoscopic/surgical and pathological evidence of CD; this improved the reliability of our findings, but had a negative effect on our sensitivity. We only included those patients with available data on active duty both before and after diagnosis of CD, which may limit generalisability to other EHR systems. A drawback of previous studies is that many cases had long-standing IBD with the diagnosis occurring years before their entry into an evaluated database or health system. As noted, a history of IBD (or chronic intestinal maladies such as chronic diarrhoea) is disqualifying for enlistment and commissioning in the US Armed Forces. This study represents the first evaluation of CD in subjects who have all had their first CD diagnosis in the same EHR. One may have expected our study to find a higher sensitivity than reported by others, since physicians often bill patients from prior evaluations or from the notes of their previous physicians without supporting documentation.

The study has some limitations. Use of codes and EHR databases for research can be affected by misclassification, given that ICD-9-CM codes (and most EHRs) do not have ‘rule out’ or ‘presumed diagnosis’ codes. This can affect the use of ‘big data’ to assess healthcare outcomes in patients identified with CD based on ICD-9-CM codes. In contrast to other studies, if information was available from radiology reports but no endoscopy and pathology information was available, the case was not considered a confirmed diagnosis. We also had to exclude 16 patients due to a lack of reviewable encounters despite billing codes for CD. This may be due to patients being evaluated at clinics billing TRICARE without AHLTA access or during a period when AHLTA was unavailable.

In summary, our study shows the poor specificity and PPV of two ICD9 billing codes for CD, and the significant increase in both when the case definition additionally requires that the codes be assigned during specialist encounters or that a colonoscopy procedure code accompany a CD code. To some extent, this should not be surprising as medical providers often give a billing code based on the ‘working’ or ‘historical’ diagnosis as opposed to the confirmed diagnosis. We urge our fellow researchers to include validation of billing codes when reporting results from EHR or other database-based research.



  • Contributors MS: chart review, analysis, drafting of the manuscript, and guarantor of the article. SH: study design, analysis, manuscript revision, critical review of the manuscript. EAK: analysis, critical review of the manuscript. BR: chart review, critical review of the manuscript. JB: study design, manuscript revision. SRB: study design, interpretation of results and analysis, manuscript revision, critical review of the manuscript.

  • Funding SH and SRB were supported in part by Congressionally Directed Medical Research Program Grant PR110833.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the Institutional Review Boards of all involved institutions and complied with the highest standards of ethical research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information. The data are collated from the military’s Electronic Health Record.