Objective The prevalence of non-alcoholic fatty liver disease (NAFLD) and non-alcoholic steatohepatitis (NASH) cirrhosis is often underestimated in healthcare and administrative databases that define disease burden using International Classification of Diseases (ICD) codes. This retrospective audit was conducted to explore the accuracy and limitations of the ICD, Tenth Revision, Australian Modification (ICD-10-AM) to detect NAFLD, metabolic risk factors (obesity and diabetes) and other aetiologies of chronic liver disease.
Design/Method ICD-10-AM codes in 308 admitted patient encounters at two major Australian tertiary hospitals were compared with data abstracted from patients’ electronic medical records. Accuracy of individual codes and grouped combinations was determined by calculating sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and Cohen’s kappa coefficient (κ).
Results The presence of an ICD-10-AM code accurately predicted the presence of NAFLD/NASH (PPV 91.2%) and obesity (PPV 91.6%) in most instances. However, codes underestimated the prevalence of NAFLD/NASH and obesity by 42.9% and 45.3%, respectively. Overall concordance between clinical documentation and ‘grouped alcohol’ codes (κ 0.75) and hepatitis C codes (κ 0.88) was high. Hepatitis B codes detected false-positive cases in patients with previous exposure (PPV 55.6%). Accuracy of codes to detect diabetes was excellent (sensitivity 95.8%; specificity 97.6%; PPV 94.8%; NPV 98.1%) with almost perfect concordance between codes and documentation in medical records (κ 0.93).
Conclusion Recognition of the utility and limitations of ICD-10-AM codes to study the burden of NAFLD/NASH cirrhosis is imperative to inform public health strategies and appropriate investment of resources to manage this burgeoning chronic disease.
- nonalcoholic steatohepatitis
- diabetes mellitus
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
What is already known about this subject?
Under-recording of non-alcoholic fatty liver disease (NAFLD), non-alcoholic steatohepatitis (NASH) and metabolic risk factors in population-based and administrative databases is widely recognised, although the reasons for this remain unclear. Understanding the limitations of International Classification of Diseases (ICD) codes is important to improve the reliability of health system databases for epidemiological studies and health services research.
What are the new findings?
While specificity of the ICD, Tenth Revision, Australian Modification (ICD-10-AM) codes for NAFLD/NASH was high (97.7%), these codes underestimated the prevalence of NAFLD/NASH by 42.9%, despite explicit documentation in the medical record.
The presence of an ICD-10-AM code for obesity accurately predicted obesity in most instances (specificity 95.9%); however, codes substantially underestimated obesity prevalence (sensitivity 54.7%). In most false-negative encounters (84.1%) this was due to lack of clear clinical documentation.
In contrast, accuracy of codes to detect diabetes was excellent (sensitivity 95.8%; specificity 97.6%), with almost perfect concordance between codes and documentation in medical records.
How might it impact on clinical practice in the foreseeable future?
These data suggest that changes may be required in the Australian Coding Standards and ICD-10-AM diagnosis codes, to better document the presence of NAFLD/NASH cirrhosis and obesity. There is also a need for clinician education to reinforce the importance of clear clinical documentation. Improving the accuracy of population-based data on the burden of NAFLD will be imperative to guide public health strategies and appropriate investment of resources to manage this burgeoning chronic disease.
Non-alcoholic fatty liver disease (NAFLD) associated with obesity and type 2 diabetes mellitus (T2DM) is now the most common chronic liver disease (CLD) worldwide and is reported to be a rising cause of advanced liver disease, primary liver cancer and liver-related mortality.1 However, it is difficult to determine an accurate prevalence of NAFLD-related cirrhosis and complications because it appears to be under-represented in databases that contain codes for disease aetiology and causes of death.
In a population-based study that examined International Classification of Diseases (ICD), Tenth Revision, Australian Modification (ICD-10-AM) codes among all people treated in hospital for cirrhosis in Queensland, Australia, during 2008–2016, only 4.8% of patients had a coded disease aetiology of NAFLD or non-alcoholic steatohepatitis (NASH).2 While the proportion of patients admitted with NAFLD/NASH-related cirrhosis increased from 3.6% in 2008–2010 to 6.0% in 2014–2016 (p<0.00001),2 this is likely an under-representation of disease prevalence based on observational and cohort studies, as well as modelling of NAFLD in Australia.3 4
Similar findings have been reported elsewhere. NAFLD prevalence is underestimated in US Medicare datasets and administrative databases that define NAFLD using ICD codes.5 6 In European primary healthcare databases, the pooled prevalence of NAFLD was 1.9%,7 far less than the expected community prevalence of 20%–30%. In order to reduce misclassification of NAFLD in population-based data, some authors have used an extended definition of NAFLD that included cryptogenic liver disease or cirrhosis in the presence of metabolic abnormality and the absence of other causes of liver disease5 or inferred data from the prevalence of obesity and T2DM.8 However, these methods also rely on the accuracy of clinical documentation and administrative coding, which may be influenced by country-specific codes and coding rules. In Australia, only 3.7% of admissions with cirrhosis had a coded diagnosis of obesity2 despite a prevalence of obesity of 31% in Australian adults,9 illustrating the failure to capture or document important patient data that would help to determine liver disease aetiology or comorbidity.
Determining the accuracy of ICD coding in hospital admission data is necessary to understand the limitations and improve the reliability of health system databases for epidemiological studies and health services research. The primary aim of this study was to evaluate the performance and limitations of ICD-10-AM codes to identify NAFLD/NASH in admitted patient encounters with cirrhosis at two major tertiary hospitals. The secondary aims were to investigate the accuracy of ICD-10-AM coding for NAFLD risk factors (obesity and T2DM) and other aetiologies of cirrhosis, namely alcohol-related liver disease (ALD), chronic hepatitis C virus (HCV) and hepatitis B virus (HBV) infection.
Sample population and data collection
A retrospective audit was conducted in a sample of prospectively recruited patients with a diagnosis of cirrhosis to ascertain the level of concordance between select ICD-10-AM codes and documentation in patients’ electronic medical records. Patients in the current study were recruited between January 2016 and December 2018 either to the CirCare Study, a multicentre observational study of patients with cirrhosis, or to a randomised controlled trial of an education intervention among outpatients with decompensated cirrhosis. The details of these studies have been previously described.10 11 All patients were adults aged ≥18 years with hepatic cirrhosis diagnosed by a hepatologist, based on liver histology, imaging or a combination of non-invasive markers and clinical assessment. To be included in the current study, patients had to have had at least one admission at the Princess Alexandra Hospital or the Royal Brisbane and Women’s Hospital between 1 January 2016 and 30 June 2019.
In the ICD-10-AM, two codes may be used to record NAFLD/NASH: K75.8 ‘Other specified inflammatory liver diseases (non-alcoholic steatohepatitis)’ and K76.0 ‘Fatty (change of) liver, not elsewhere classified (non-alcoholic fatty liver disease)’. NAFLD/NASH cirrhosis may also attract code K74.6 ‘Other and unspecified cirrhosis of liver’. Obesity may be represented by ICD-10-AM codes under the E66 code block or by the chronic condition supplementary code U78.1. T2DM codes are contained in the E11 code block. Codes for other causes of CLD are outlined in table 1.
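As an illustration of how the code definitions above could be applied to one encounter's code list, the following is a minimal sketch. The function name `flag_conditions`, the output keys and the example code strings are assumptions made for illustration; only the code values K75.8, K76.0, K74.6, U78.1 and the E66 and E11 blocks come from the text.

```python
# Code definitions taken from the paragraph above.
NAFLD_NASH_CODES = {"K75.8", "K76.0"}
UNSPECIFIED_CIRRHOSIS_CODE = "K74.6"
OBESITY_BLOCK = "E66"                 # any code within the E66 block
OBESITY_SUPPLEMENTARY_CODE = "U78.1"  # chronic condition supplementary code
T2DM_BLOCK = "E11"                    # any code within the E11 block


def flag_conditions(codes):
    """Return condition flags for one encounter (hypothetical helper).

    `codes` is an iterable of ICD-10-AM code strings for the encounter.
    """
    normalised = {code.strip().upper() for code in codes}
    return {
        "nafld_nash": bool(normalised & NAFLD_NASH_CODES),
        "unspecified_cirrhosis": UNSPECIFIED_CIRRHOSIS_CODE in normalised,
        "obesity": any(
            code.startswith(OBESITY_BLOCK) or code == OBESITY_SUPPLEMENTARY_CODE
            for code in normalised
        ),
        "t2dm": any(code.startswith(T2DM_BLOCK) for code in normalised),
    }


# Illustrative encounter; the specific code strings are examples only.
example_flags = flag_conditions(["K74.6", "K76.0", "E11.9"])
print(example_flags)
```

Matching on the block prefix rather than on a fixed list of codes keeps the sketch independent of which individual E66 or E11 codes appear in a given dataset.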
All ICD-10-AM codes were obtained from the Queensland Hospital Admitted Patient Data Collection registry (QHAPDC) for every hospital encounter within the study cohort during a minimum 12-month follow-up period. For the current study, one encounter was randomly selected for each patient to be included in the audit sample. In a subset of patients who had ≥1 encounter with and ≥1 encounter without code K74.6, a second encounter was selected (one encounter containing code K74.6 and one encounter that did not contain code K74.6) to ensure the sample contained a representative number of patients with possible NAFLD.
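The selection procedure described above can be sketched as follows. This is not the authors' actual procedure or software; the data layout (a dictionary of encounter tuples), the `paired_patients` argument naming the subset sampled twice, and the fixed seed are all assumptions for illustration.

```python
import random


def select_audit_encounters(encounters_by_patient, paired_patients, seed=0):
    """Pick audit encounters per patient (hypothetical sketch).

    encounters_by_patient: {patient_id: [(encounter_id, has_k746), ...]},
        with at least one encounter per patient.
    paired_patients: patient ids in the subset sampled twice (one encounter
        with code K74.6 and one without), when both kinds exist.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    selected = {}
    for patient_id, encounters in encounters_by_patient.items():
        with_code = [e for e in encounters if e[1]]
        without_code = [e for e in encounters if not e[1]]
        if patient_id in paired_patients and with_code and without_code:
            selected[patient_id] = [rng.choice(with_code), rng.choice(without_code)]
        else:
            selected[patient_id] = [rng.choice(encounters)]
    return selected


# Illustrative data only: patient "A" has encounters with and without K74.6.
demo = {
    "A": [("A1", True), ("A2", False), ("A3", False)],
    "B": [("B1", False), ("B2", False)],
}
audit_sample = select_audit_encounters(demo, paired_patients={"A"})
print(audit_sample)
```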
In Australia, assignment of ICD-10-AM diagnosis codes is impacted by the Australian Coding Standards, which stipulate the condition must be either the chief reason for admission or required commencement, alteration or adjustment of therapeutic treatment, diagnostic procedures or increased clinical care and/or monitoring during the encounter. T2DM and HCV are exceptions to this assignment process, as these conditions are always coded when documented. In addition, a supplementary ‘U’ coding capability was introduced on 1 July 2015, to capture conditions (such as obesity) that contribute to a patient’s health status during the admission, but do not otherwise meet the criteria for coding. The index pathway used by coders for coding of cirrhosis, CLD aetiology, obesity and T2DM remained consistent during the audited time period.
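The assignment rules summarised above can be expressed as a small decision function. This is a deliberately simplified sketch of the paragraph, not a representation of the Australian Coding Standards themselves; the function name, argument names, return labels and the two condition sets are assumptions for illustration.

```python
# Conditions the paragraph describes as always coded when documented.
ALWAYS_CODED = {"T2DM", "HCV"}
# Conditions eligible for a supplementary 'U' code; the text names obesity
# as one example, so this set is illustrative rather than complete.
SUPPLEMENTARY_ELIGIBLE = {"obesity"}


def coding_route(condition, documented, chief_reason=False, changed_care=False):
    """Return how a condition would be coded under the simplified rules.

    changed_care stands for 'required commencement, alteration or adjustment
    of treatment, diagnostic procedures or increased care/monitoring'.
    """
    if not documented:
        return "not coded"  # coders can only code what is documented
    if condition in ALWAYS_CODED or chief_reason or changed_care:
        return "diagnosis code"
    if condition in SUPPLEMENTARY_ELIGIBLE:
        return "supplementary U code"
    return "not coded"


print(coding_route("obesity", documented=True))   # supplementary U code
print(coding_route("NAFLD", documented=True))     # not coded
```

Under these simplified rules, a documented condition that neither prompted the admission nor changed care is left uncoded unless it is always coded or eligible for a supplementary code.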
Three clinicians (EEP, LUH and ALJ) blinded to QHAPDC coding conducted a comprehensive review of patients’ medical records and extracted data for each audited encounter using an agreed template. A subset of medical records was reviewed by all investigators to ensure consistency in the approach to data extraction, and any discordant results were adjudicated by a hepatologist (EEP). For each audited encounter, the reviewers recorded the presence of cirrhosis, obesity (documented obesity or a body mass index (BMI) ≥30.0 kg/m², or ≥27.5 kg/m² for patients of Asian descent), T2DM and liver disease aetiology. These conditions were considered present if they were documented by a treating clinician or, for obesity, if a qualifying BMI was recorded in the patient’s medical record. The accuracy of documented diagnoses and recorded BMI was not audited. Clinical information was retrieved from prior admissions and outpatient encounters to corroborate information where available, although only data from the audited encounter were included in the abstraction.
Data are presented as counts and proportions. The accuracy of ICD-10-AM codes to predict the aetiology of CLD and presence of metabolic risk factors (obesity and T2DM) was determined by calculating sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) (see online supplemental file). Concordance between data abstracted from the patient’s medical records (gold standard) and ICD-10-AM data from QHAPDC was assessed using Cohen’s kappa coefficient (κ) of agreement. κ values ≤0.20 indicated poor agreement, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial and 0.81–1.00 almost perfect agreement between the codes and medical records.12
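The accuracy measures named above follow from a 2×2 table of code (present/absent) against medical record (present/absent). The sketch below shows the standard formulas; it is not the authors' analysis code. The function names are assumptions, and the example cell counts for T2DM are reconstructed from the percentages reported in this article (they are not stated as counts in the text), so they should be read as an assumption.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Standard 2x2 accuracy measures plus Cohen's kappa.

    tp: code present, condition documented   fp: code present, not documented
    fn: code absent, condition documented    tn: code absent, not documented
    """
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


def agreement_band(kappa):
    """Map kappa to the agreement bands listed above (up to 0.20 = poor)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Assumed T2DM cell counts, reconstructed from the reported percentages.
t2dm = accuracy_metrics(tp=92, fp=5, fn=4, tn=207)
print({name: round(value, 3) for name, value in t2dm.items()})
print(agreement_band(t2dm["kappa"]))
```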
A total of 312 encounters among 271 patients were selected; however, four encounters were excluded (n=2 data not available and n=2 post liver transplant admissions). Therefore, the final sample contained 308 encounters among 267 patients (41 patients contributed two encounters each).
Medical record review
Among the total 308 audited admissions, at least one liver disease aetiology was documented in 289 (93.8%) encounters. ALD was documented in 177 encounters (57.5%), NAFLD/NASH in 91 (29.5%), current or treated HCV infection in 92 (29.9%), HBV in 10 (3.2%) and ‘other’ in 28 (9.1%) (including drug-induced liver injury, alpha-1 antitrypsin deficiency, autoimmune hepatitis, familial intrahepatic cholestasis, primary biliary cholangitis, primary sclerosing cholangitis and haemochromatosis). In 104 (33.8%) admissions, more than one liver disease aetiology was documented, most commonly ALD and HCV in 61 (19.8%) and ALD and NAFLD/NASH in 32 (10.4%) encounters. Most encounters (n=289; 93.8%) had cirrhosis documented during the admission including one patient with cryptogenic cirrhosis.
Obesity was documented in 46 (14.9%) admissions. In a further 93 (30.2%) admissions, obesity was present (as determined by BMI) but not clearly documented in the audited encounter. T2DM was documented in 96 (31.2%) admissions. An additional six encounters included documentation of ‘other specified diabetes’ (n=3 steroid-induced diabetes and n=3 diabetes secondary to pancreatic insufficiency following a Whipple procedure) but these were not included in the subgroup with T2DM.
ICD-10-AM code accuracy
An ICD-10-AM code for NAFLD/NASH was present in 57 (18.5%) encounters (table 1). Overall concordance between codes and patients’ medical records was substantial (κ 0.62). However, while specificity (97.7%), PPV (91.2%) and NPV (84.4%) were high, codes for NAFLD/NASH underestimated the prevalence of NAFLD/NASH by 42.9% (39 of 91 encounters did not contain a code despite documentation of NAFLD/NASH in the medical record; sensitivity 57.1%). The five false-positive codes were in patients with ALD (n=4 including two with HCV as a cofactor) and HBV (n=1).
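The figures in the paragraph above are enough to rebuild the underlying 2×2 table, which makes the relationship between the 42.9% underestimate and the 57.1% sensitivity explicit. The following is a minimal sketch; the helper name `reconstruct_two_by_two` is an assumption, while the inputs (308 encounters, 91 documented, 57 coded, 5 false positives) are taken from the text.

```python
def reconstruct_two_by_two(total, documented, coded, false_positives):
    """Rebuild 2x2 cell counts from marginal totals (hypothetical helper)."""
    tp = coded - false_positives               # coded and documented
    fn = documented - tp                       # documented but not coded
    tn = total - documented - false_positives  # neither coded nor documented
    return {"tp": tp, "fp": false_positives, "fn": fn, "tn": tn}


nafld = reconstruct_two_by_two(total=308, documented=91, coded=57, false_positives=5)

sensitivity = nafld["tp"] / (nafld["tp"] + nafld["fn"])
underestimate = nafld["fn"] / (nafld["tp"] + nafld["fn"])  # equals 1 - sensitivity
specificity = nafld["tn"] / (nafld["tn"] + nafld["fp"])
ppv = nafld["tp"] / (nafld["tp"] + nafld["fp"])

print(nafld)
print(f"sensitivity {sensitivity:.1%}, missed {underestimate:.1%}")
print(f"specificity {specificity:.1%}, PPV {ppv:.1%}")
```

The share of documented encounters missed by codes is simply the complement of sensitivity, which is why a code can be highly specific and still undercount prevalence.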
At least one alcohol-related ICD-10-AM code was present in 158 (51.3%) encounters. Concordance between alcohol-related codes and patients’ medical records was substantial (κ 0.75). Codes had good overall accuracy to detect ALD with high sensitivity (83.6%), specificity (92.4%), PPV (93.7%) and NPV (80.7%).
An HCV-related ICD-10-AM code was present in 85 (27.6%) encounters. Accuracy of codes to detect current or treated HCV infection was almost perfect (κ 0.88), with high sensitivity (88.0%), specificity (98.1%), PPV (95.3%) and NPV (95.1%).
Current HBV (hepatitis B surface antigen (HBsAg) positive) was present in a minority of patients (n=10) and was identified with an ICD code in all cases (sensitivity 100%; specificity 97.3%; NPV 100%; κ 0.70). However, an additional eight patients with prior exposure to HBV (HBsAg negative, hepatitis B surface antibody (HBsAb) and/or hepatitis B core antibody (HBcAb) positive) had a false-positive ICD-10-AM code for HBV (PPV 55.6%).
Eighty-three encounters (26.9%) included an ICD-10-AM code for obesity (table 2). While the presence of a code accurately predicted obesity in most instances (PPV 91.6%; specificity 95.9%), codes substantially underestimated obesity prevalence (sensitivity 54.7%; NPV 72.0%). In most false-negative encounters (n=53; 84.1%) this was likely due to lack of clear clinical documentation (ie, obesity only identified by BMI during medical record review). As a consequence, overall agreement between ICD-10-AM obesity codes and medical records was only moderate (κ 0.52). When obesity was clearly documented in the medical record (n=46 encounters), a code was assigned in 78.3% of cases.
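To show how documentation drives the low sensitivity, the obesity encounters can be split into those with clearly documented obesity and those identified only by BMI. The sketch below uses counts given or implied in the text: 46 documented and 93 BMI-only encounters, and 53 uncoded BMI-only encounters. The figure of 36 coded documented encounters is an assumption derived from the reported 78.3% of 46, and the variable names are illustrative.

```python
# Stratum 1: obesity clearly documented by a clinician.
documented_total = 46
documented_coded = 36  # assumption: 78.3% of 46, rounded to a whole encounter

# Stratum 2: obesity identified only from recorded BMI.
bmi_only_total = 93
bmi_only_uncoded = 53
bmi_only_coded = bmi_only_total - bmi_only_uncoded


def coded_share(coded, total):
    """Proportion of encounters in a stratum that carried an obesity code."""
    return coded / total


share_documented = coded_share(documented_coded, documented_total)
share_bmi_only = coded_share(bmi_only_coded, bmi_only_total)
overall_sensitivity = coded_share(
    documented_coded + bmi_only_coded, documented_total + bmi_only_total
)

print(f"coded when documented:      {share_documented:.1%}")
print(f"coded when BMI only:        {share_bmi_only:.1%}")
print(f"overall sensitivity:        {overall_sensitivity:.1%}")
```

Weighting the two strata by their sizes recovers the overall sensitivity reported above, with the larger BMI-only stratum pulling it down.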
One-third of encounters (31.5%) included an ICD-10-AM code for T2DM. Accuracy of codes to detect diabetes was excellent (sensitivity 95.8%; specificity 97.6%; PPV 94.8%; NPV 98.1%) with almost perfect concordance between codes and documentation in medical records (κ 0.93).
K74.6 ‘Other and unspecified cirrhosis’
The ICD-10-AM code K74.6 ‘Other and unspecified cirrhosis of liver’ was present in 126 (40.9%) admissions. Most encounters (95.2%) were in patients who had cirrhosis documented in the audited admission. The most common aetiology among encounters containing code K74.6 was NAFLD/NASH (42.9%); however, over one-third of these encounters were in patients who also had ALD, HCV or an ‘other’ concurrent aetiology (figure 1). Three encounters included code K74.6 in the absence of a documented CLD aetiology. These were all day procedures with minimal documentation in the medical record (n=1 endoscopy and n=2 large volume paracentesis). A further three encounters were assigned this code in the absence of documented cirrhosis (n=2 day admissions for liver biopsy and n=1 admission for ascites of cardiac origin).
ICD codes in patients with NAFLD/NASH
Of 91 encounters in patients with NAFLD/NASH identified by medical chart review, 59.3% included code K74.6, 57.1% had an NAFLD/NASH code, 29.7% had an ALD code and 11.0% had an HCV code (figure 2A). Thirty-one encounters (34.1%) included an obesity code in combination with a T2DM code; however, most of these (n=22; 71.0%) were in encounters that already contained an NAFLD/NASH code.
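The overlap described above determines how many extra documented NAFLD/NASH encounters an 'obesity plus T2DM' code pair would capture beyond the NAFLD/NASH codes themselves. The following is an illustrative calculation only, not an analysis reported by the authors: the helper name is an assumption, the count of 52 encounters with an NAFLD/NASH code is derived from 57.1% of 91, and the sketch covers documented NAFLD/NASH encounters only, so it says nothing about false positives.

```python
def union_capture(primary_hits, extra_hits, overlap, documented):
    """Share of documented encounters captured by either of two code rules."""
    captured = primary_hits + extra_hits - overlap  # inclusion-exclusion
    return captured, captured / documented


documented_nafld = 91
with_nafld_code = 52            # assumption: 57.1% of 91 encounters
with_obesity_and_t2dm = 31      # obesity code together with a T2DM code
already_had_nafld_code = 22     # overlap between the two groups

captured, share = union_capture(
    with_nafld_code, with_obesity_and_t2dm, already_had_nafld_code, documented_nafld
)
additional = with_obesity_and_t2dm - already_had_nafld_code

print(f"additional encounters captured: {additional}")
print(f"captured by either rule: {captured} of {documented_nafld} ({share:.1%})")
```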
There were 47 encounters identified by medical chart review in patients with NAFLD/NASH without comorbid liver disease. Of these 47 encounters, 68.1% included code K74.6, 55.3% included an NAFLD/NASH code, 68.1% had a diabetes code and 51.1% had an obesity code (figure 2B). None of the 47 encounters contained a code for ALD, HCV, HBV or ‘other’ aetiology.
Under-recording of NAFLD in population-based or administrative databases is widely recognised, although the reasons for this remain unclear. In this study of patient encounters with cirrhosis at two major tertiary hospitals, ICD-10-AM codes had a high specificity but low sensitivity for NAFLD/NASH and similarly had high specificity but low sensitivity for identifying obesity, although for different reasons. In contrast, accuracy of codes to detect T2DM was excellent. Determining the accuracy of ICD coding for NAFLD in hospital admission data, and the factors that influence it, is necessary to understand the limitations and improve the reliability of health system databases for epidemiological studies and health services research for this patient group.
Our data show that, although a clinical diagnosis of NAFLD was made and recorded in the medical record, misclassification of patients in the ICD-10-AM coding process was prevalent. This may be related to under-recognition of the condition by clinical coders or lack of optimal codes to differentiate this increasingly common liver disease. The absence of a specific code for NASH cirrhosis may contribute to inadequate capture of this information. At present, clinical coding of this diagnosis requires the combination of two imprecise codes: K74.6 ‘Other and unspecified cirrhosis of liver’ and K75.8 ‘Other specified inflammatory liver diseases (non-alcoholic steatohepatitis)’ or K76.0 ‘Fatty (change of) liver, not elsewhere classified (non-alcoholic fatty liver disease)’. In addition to misclassification, it is also possible that other patients in the cohort may have undiagnosed NAFLD, though assessment of the accuracy of diagnoses recorded in the medical record was outside the scope of the current study. Our data support findings from a prior study that found a combination of ICD diagnosis codes had high PPV but low sensitivity to identify documented cases of NASH cirrhosis.13
Interestingly, in this ‘real-world’ study, a substantial proportion of patients with NAFLD (48.4%) had a concurrent liver disease, particularly alcohol excess or chronic HCV. Although the nomenclature currently accepted by the American and European Associations for the Study of Liver Diseases excludes excessive alcohol consumption or other liver diseases in the definition of NAFLD,14 15 there is increasing acceptance that NAFLD often coexists with other hepatic disorders.16 17 In fact, the Asian Pacific Association for the Study of the Liver has recently endorsed a proposal to redefine NAFLD based on the detection of steatosis together with the presence of overweight/obesity, or T2DM, or clinical evidence of metabolic dysfunction (‘metabolic-associated fatty liver disease’ or ‘MAFLD’), regardless of alcohol consumption or other concomitant liver disease.18 Recognition of dual aetiology is important because coexistent NAFLD has a synergistic role in liver disease progression16 17 and may increase the risk of cardiometabolic problems. However, lack of clarity about the terminology (NAFLD vs MAFLD) or its diagnosis in the presence of coexisting liver diseases may result in under-reporting of the true burden of NAFLD and highlights the need for consensus within the hepatology community and appropriate dissemination of this information.
Our data show that notation of ‘obesity’ in the medical records was poor (based on recorded BMI in clinical documentation) and was largely responsible for the low sensitivity of obesity codes. These findings support a recent study from Australia that examined the agreement between medical records and ICD-10-AM comorbidity codes in trauma patients.19 The authors found that, based on clinician documentation, the prevalence of obesity was only 9.3% (compared with the Australian population prevalence of 31%9), with an even lower prevalence based on administrative data.19 Multiple previous studies have also shown that the completeness of ICD diagnosis coding for overweight/obesity is low compared with its prevalence in medical records based on clinical weight/height or BMI measurements.20–23 Although it is a potential risk factor for many diseases, obesity is not usually the primary cause of admission and may therefore be overlooked by clinicians when registering diagnoses. This systemic under-reporting currently limits the value of using administrative coding for obesity as part of an ‘extended definition’ of NAFLD in patients admitted with cirrhosis. Further review of coding practices and improved clinician documentation of BMI and obesity are urgently required to improve reporting of obesity in healthcare administrative databases.
In our patient cohort, there was almost perfect concordance between administrative codes and documentation of T2DM in medical records. Recent studies examining hospital data from a trauma centre19 and cancer outcomes registry24 in Australia have also reported ‘excellent’ and ‘substantial’ agreement, respectively, between administrative data and medical records for T2DM. This high concordance is likely a consequence of the mandatory requirement to code T2DM when it is documented, as well as the need to monitor and treat this condition during the hospital admission.
Compared with NAFLD/NASH codes, overall accuracy of ICD-10-AM codes for other aetiologies of CLD was high, similar to findings in prior studies.25–27 While most patients with ALD cirrhosis were detected using codes F10.1 ‘Harmful use of alcohol’, K70.3 ‘Alcoholic cirrhosis of liver’ and K70.4 ‘Alcoholic hepatic failure’, ‘grouped alcohol’ codes had the best overall concordance (κ 0.75). Viral hepatitis codes B18.2 ‘Chronic viral hepatitis C’ and B18.1 ‘Chronic viral hepatitis B without delta-agent’ also had high concordance (κ≥0.70), though the code for HBV identified several false-positive patients with previous exposure to HBV. We were not able to assess the accuracy of other viral hepatitis codes because they were not present in our sample.
Strengths of our study include the selection of a well-characterised cohort of patients with a diagnosis of cirrhosis confirmed by a healthcare provider. While it was outside the scope of the current study to assess the accuracy of clinical documentation, data abstraction from medical records was conducted by clinicians experienced in the management of CLD, including a hepatologist. A limitation is that the study was conducted using encounters at two tertiary hospitals that use electronic medical records, so the quality of clinical documentation and coding may differ from that of smaller hospitals in regional areas that may use paper charts. However, internal audits are conducted regularly at hospital, health service and jurisdictional levels for quality assurance purposes and are supported by national data validation activities. Therefore, we are confident that our data represent an accurate sample of patients with cirrhosis in Southeast Queensland, Australia. The relatively small number of patients with HBV also limits conclusions about the accuracy of HBV codes.
Population-based data on the epidemiology and natural history of NAFLD are crucial to assess the true burden of this liver disease. Our data suggest that misclassification of NAFLD and obesity in the ICD-10-AM coding process may be related to a lack of appropriate codes or inadequate clinical documentation due to under-recognition or under-recording of these conditions. There may also be a lack of clarity about the diagnosis of NAFLD in the presence of coexisting liver diseases such as alcohol excess or viral hepatitis. Recognition of the utility and limitations of ICD-10-AM codes to study the burden of NAFLD/NASH cirrhosis in Australia is imperative to inform public health strategies and appropriate investment of resources to manage this burgeoning chronic disease.
PCV and EEP contributed equally.
Contributors PCV and EP conceived and planned the study. PCV and KLH identified the study sample and obtained International Classification of Diseases, Tenth Revision, Australian Modification data from Queensland Hospital Admitted Patient Data Collection. EP, ALJ and LUH collected the clinical data. KLH analysed the data. PCV, EP and CM interpreted study findings. PCV, EP and KLH drafted the manuscript. All authors reviewed and approved the final version for publication and have agreed to be accountable for all aspects of the work.
Funding KLH was supported by a Health Innovation, Investment and Research Office Clinical Research Fellowship. PCV was supported by an Australian National Health and Medical Research Council Career Development Fellowship (no. 1083090).
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval Ethics approval was obtained from the Human Research Ethics Committees of the Metro South Health (HREC/16/QPAH/628 and HREC/15/QPAH/688) and QIMR Berghofer Medical Research Institute (P2207).
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available upon reasonable request. Data have not been made publicly available due to requirements of research ethics and governance approvals. Requests to collaborate and share data may be directed to the corresponding author.