Study to determine the likely accuracy of pH testing to confirm nasogastric tube placement
  1. Anne M Rowat3,
  2. Catriona Graham1,
  3. Martin Dennis2
  1. 1 Edinburgh Clinical Research Facility, University of Edinburgh, Edinburgh, UK
  2. 2 Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
  3. 3 School of Health and Social Care, Edinburgh Napier University, Edinburgh, UK
  1. Correspondence to Dr Anne M Rowat; a.rowat{at}


Objective To establish the likely accuracy of pH testing to identify gastric aspirates at different pH cut-offs to confirm nasogastric tube placement.

Methods This prospective observational study included a convenience sample of adult patients who had two (one fresh and one frozen) gastric and oesophageal samples taken during gastroscopy or two bronchial and saliva samples taken during bronchoscopy. The degree of observer agreement for the pH of fresh and frozen samples was indicated by kappa (k) statistics. The sensitivities and specificities at pH ≤5.5 and the area under the receiver operating characteristics (ROC) curve at different pH cut-offs were calculated to identify gastric and non-gastric aspirates.

Results Ninety-seven patients had a gastroscopy, 106 a bronchoscopy. There was complete agreement between observers in 57/92 (62%) of the paired fresh and frozen gastric samples (k=0.496, 95% CI 0.364 to 0.627). The sensitivity of a pH ≤5.5 to correctly identify gastric samples was 68% (95% CI 57 to 77) and the specificity was 79% (95% CI 74 to 84). The overall accuracy to correctly classify samples was between 76% and 77%, regardless of whether patients were taking antacids or not. The area under the ROC curve at different pH cut-offs was 0.74.

Conclusion The diagnostic accuracy of pH ≤5.5 to differentiate gastric from non-gastric samples was low, regardless of whether patients were taking antacids or not. Due to the limited accuracy of the pH sticks and the operators’ ability to differentiate colorimetric results, there is an urgent need to identify more accurate and safer methods to confirm correct placement of nasogastric tubes.

  • gastroscopy
  • endoscopy
  • pH monitoring

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

Summary box

What is already known about this subject?

  • Testing the pH of gastric aspirate to show pH ≤5.5 is the recommended first-line test to confirm correct placement of nasogastric tubes and reduce the risk of potentially fatal aspiration.

  • False positive readings may occur if the nasogastric tube is misplaced in the oesophagus or false negative readings (pH >5.5) may occur in patients who receive antacid medications, which can delay feeding while waiting for the second-line test, a chest X-ray.

  • There are limited numbers of studies that have examined the different pH cut-offs of gastric, oesophageal, saliva and bronchial aspirates.

What are the new findings?

  • Compared with studies that have taken aspirate directly from the nasogastric tube, patients undergoing scope procedures had a lower sensitivity at the pH cut-off ≤5.5 for identifying gastric aspirates for the whole group and in the presence and absence of antacid medications.

  • Two-thirds of oesophageal aspirates had a pH cut-off ≤5.5, demonstrating considerable overlap with the gastric aspirates in this population.

  • The pH readings between 4.5 and 6.0 provided the greatest overall accuracy; however, there was only moderate agreement between observers at pH readings ≥5.0.

How might it impact on clinical practice in the foreseeable future?

  • Current guidelines and training strategies need to be updated to better support healthcare professionals to accurately insert and check the position of nasogastric tubes.

  • Further research is urgently required to develop novel or a combination of low-cost bedside tests, which would confirm nasogastric tube placement, and would be safer, easy to use, reduce the need for expensive X-rays and improve patient outcomes.


Nasogastric tube (NGT) feeding is recommended to provide nutrition and hydration to patients unable to swallow. Every year, approximately 790 000 adults and children in the UK require NGT feeding to avoid malnutrition and dehydration and to receive essential medications.1 However, if the NGT is not positioned in the stomach or is dislodged into the oesophagus, nasopharynx or bronchial tract, it can result in serious harm, including aspiration pneumonia, pulmonary haemorrhage, pneumothorax and death.1–5 The Department of Health in the UK has identified that deaths caused by NGT misplacement should be ‘never events’.3 However, 3%–4% of NGTs are misplaced every year and the number of serious incidents reported increased by more than 60% between 2014 and 2017, which calls into question current NGT testing methods.4 5

Current healthcare guidelines recommend that the first-line test to confirm correct NGT placement prior to giving food or medications must be that the pH of an NGT aspirate is ≤5.5 (acidic).6 Nevertheless, false positive readings might occur if the tube is misplaced in the oesophagus, and false negative readings (pH >5.5) may occur in patients who secrete less gastric acid because of antacid medications, achlorhydria or buffering by NGT feeds.7 8 Furthermore, aspirate may not be obtained immediately, which can lead to significant delays to feeding while waiting for a second-line test, most often a chest X-ray, to check the NGT position.9 Although the chest X-ray is often regarded as the gold standard test, misinterpretation of X-rays has been reported as causing more cases of serious harm and death due to NGT misplacement than false pH readings (45% vs 8%).6 In addition, X-rays cannot be repeated very often without risking excessive radiation; they also have cost and resource implications and can result in significant delays to feeding.10 It has been suggested that modified NGTs with external magnetic, electromagnetic guidance systems or fibre-optic capabilities could be used to determine real-time NGT location and help prevent misplacement.11–13 However, these devices have yet to be implemented because they require expensive equipment and extensive training to interpret findings.14 Moreover, lung misplacements can still occur using these devices due to associated problems of malfunctioning, calibration and misinterpretation; therefore, these methods are not recommended as stand-alone tests.15

The pH of lung aspirate is more often alkaline (ie, >pH 7), but could at least theoretically be acidic as a result of aspiration of gastric fluid or in situ build-up of lactic acid during infection or hyperventilation.16 17 Measuring end-tidal carbon dioxide (ETCO2) in the aspirated air using capnography or colorimetric test can be used to exclude NGT placement into the lungs rather than confirm placement in the stomach. However, these tests can result in false positive results in 16% of cases when compared with X-rays.18 They are also unable to positively confirm misplacement of the NGT in the mouth, nasopharynx or oesophagus.19

To date, the studies examining the different pH cut-offs of gastric aspirates are limited by the small number of aspirates obtained from the NGT, the lack of a robust gold standard to ensure gastric placement, the differences in the comparative non-gastric samples and the methods of pH measurement (stick vs meter).20 21 There are also very few studies that have examined the pH of oesophageal aspirate and/or saliva.22 Therefore, the aim of this study was to provide more precise estimates of the sensitivity and specificity using the entire range of pH stick cut-offs to distinguish between gastric, oesophageal, saliva and bronchial secretions.


Study design, setting and ethics

A prospective observational study was conducted in two UK teaching hospitals between 1 November 2014 and 20 December 2016 to compare the pH of aspirates obtained during gastroscopy or bronchoscopy from the stomach, oesophagus, mouth and lung. The study was reported as recommended by the STAndards for the Reporting of Diagnostic accuracy studies (STARD) guidelines.23


A convenience sample of adults over the age of 16 who required a first routine gastroscopy or bronchoscopy was recruited. Consent was obtained from all patients who volunteered to have samples taken during the procedure, and data regarding their use of antacid medication and confounding factors that might affect the pH of any aspirate results were recorded. Patients were excluded if they lacked mental capacity or the specimens were considered high risk, including known tuberculosis, blood or airborne viruses.

Data collection

Prior to the procedure, patients were asked to fast for at least 4 hours and data regarding the use of antacid medication or conditions/surgery that might affect the pH of any aspirate results were recorded. During the routine gastroscopy, two of each type of sample (oesophageal and gastric) were suctioned and collected by the operator. Patients undergoing bronchoscopy were asked to spit saliva into two labelled universal containers prior to the procedure and two lung samples were then obtained during the procedure. The reference standard was direct confirmation of the type of aspirate by the operator undertaking the gastroscopy or bronchoscopy, which would not be possible if the aspirate was taken directly from an NGT. At the end of each of the procedures a research nurse tested two samples using a pH stick. As the testing of fresh samples during endoscopy could not be blinded to the source of aspirate, we froze the remaining two samples from each location at −20°C for blinded pH testing. At the end of data collection for the main study, the frozen samples were defrosted thoroughly for 4 hours and were allowed to reach room temperature prior to being tested by a research nurse who was not previously involved in testing the specimens. The samples were then destroyed after the second testing phase.

The pH stick was supplied by Enteral UK, North Duffield, UK. The pH stick scale ranges from 2.0 to 9.0 with three colour blocks in intervals of 0.5 pH units. The research nurses followed a standard operating procedure to ensure that each specimen was pipetted from the container to cover the pH stick, and the colours of the stick were compared after 60 s with those on the container.

Patient involvement

No patients were involved in the design or implementation of this study. However, the findings, recommendations and implications of the project will be disseminated in accessible formats suitable for the relevant patient community.


Descriptive statistics included the median, IQR, percentages, frequencies and 95% CIs. McNemar’s test was used to compare paired categorical data and the kappa (k) statistic was used to indicate the observer agreement between the paired fresh and frozen samples. Missing data were excluded from analysis and a p value of <0.05 was considered statistically significant. Diagnostic data included analyses of the sensitivity, specificity, positive/negative predictive values (PV), and the positive/negative likelihood ratios (LR) of the pH tests for each type of sample. The cut-off values were aligned to the agreed clinical standards for testing pH (ie, ≤5.5 was classified as a gastric sample, whereas >5.5 was classified as a non-gastric sample). The receiver operating characteristics (ROC) were analysed for gastric versus non-gastric samples to determine the relationship between sensitivity and specificity and the area under the curve. All data analyses were performed using SAS V.9.4 (Statistical Analysis System Institute).
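McNemar’s test depends only on the discordant pairs, that is, pairs in which the fresh and frozen readings fall on opposite sides of the cut-off. The following is a minimal Python sketch, assuming the uncorrected chi-square form with one degree of freedom; the source does not state which SAS option was used, and the function name and the counts are hypothetical.

```python
import math


def mcnemar_uncorrected(b, c):
    """Return (chi-square statistic, p value) for McNemar's test.

    b: pairs positive on the first reading only (e.g. fresh <=5.5, frozen >5.5)
    c: pairs positive on the second reading only
    Assumption: no continuity correction, 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical discordant counts, for illustration only.
stat, p_value = mcnemar_uncorrected(4, 3)
print(f"McNemar statistic={stat:.2f}, p={p_value:.2f}")
```

Concordant pairs do not enter the statistic, which is why a non-significant result here says nothing about how often the two readings matched exactly.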

A sample size of 100 for each sample was estimated based on the 95% CIs and the majority of gastric aspirates having a pH ≤5.5. However, the sample size could vary depending on how many patients have gastric secretions with a pH >5.5. As this was unknown, we arbitrarily chose 4% as the proportion of samples that pH might misclassify. We expected no false positive samples when testing saliva or bronchial aspirate, which would give a specificity of 100% (95% CI 97 to 100). Therefore, we estimated that 200 patients each having four samples (fresh and frozen) taken during either gastroscopy (gastric and oesophageal samples) or bronchoscopy (bronchial and saliva samples) procedures would be required, providing a total of 800 pH tests.
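When no false positives are observed among n non-gastric samples, the exact binomial lower bound for specificity has a closed form. The sketch below is an illustration in Python under stated assumptions: the function name is hypothetical, n=100 is taken from the planned sample size per sample type, and the source does not say which method or tail convention produced the quoted interval, so both common conventions are shown.

```python
def specificity_lower_bound(n, tail_alpha):
    """Exact binomial lower bound for specificity when 0 of n non-gastric
    samples test positive.

    The bound is the smallest p satisfying p**n >= tail_alpha.
    """
    if n <= 0 or not 0 < tail_alpha < 1:
        raise ValueError("n must be positive and tail_alpha must lie in (0, 1)")
    return tail_alpha ** (1 / n)


n = 100  # assumed: planned number of samples of one type
one_sided = specificity_lower_bound(n, 0.05)   # one-sided 95% bound
two_sided = specificity_lower_bound(n, 0.025)  # lower limit of a two-sided 95% CI
print(f"one-sided: {one_sided:.3f}, two-sided: {two_sided:.3f}")

# Planned number of pH tests: 200 patients with four samples each.
planned_tests = 200 * 4
print(planned_tests)
```

The two conventions differ by less than one percentage point at this sample size, so either is compatible with a lower limit in the high 90s.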


In total, 211 patients were recruited to the study; however, eight patients were removed as their samples were wrongly labelled and could not be positively identified. Of the 203 remaining patients: 95 (47%) were male; 97 (48%) underwent a gastroscopy; and 106 (52%) a bronchoscopy. Eighty-three (41%) patients were taking antacid medication (2% were taking H2 antagonists and 98% proton pump inhibitors) prior to the gastroscopy (42/97, 43%) or bronchoscopy (41/106, 39%). From the expected 812 samples (ie, two fresh and two frozen samples from the 203 participants), 717 (88%) samples were suitable for testing. Of the 390 fresh and 327 frozen samples, 16 were not collected during the procedure and 63 were not suitable for testing after the sample was defrosted. The numbers of fresh and frozen gastric and non-gastric samples at pH ≤5.5 and >5.5 are shown in figure 1.

Figure 1 STAndards for the Reporting of Diagnostic accuracy studies (STARD) diagram reporting the flow of participants through the study.

Distribution of the pH for each sample

Figure 2 shows the distribution of the pH for each type of sample. Predictably, the fresh gastric samples (n=96) had the lowest median pH of 2 (IQR 2.0–6.5), regardless of whether patients were taking antacids (n=42) or not. The oesophageal samples (n=90) had a median pH of 5.0 (IQR 2.0–6.5) and the median pH was 7.0 for both bronchial and saliva samples. Importantly, 100% of the bronchial samples (n=103, IQR 6.5–7.0) and 98% of the saliva samples (n=101, IQR 6.5–7.0) had a pH >5.5.

Figure 2

Box plot showing the distribution of pH by sample type, including: median (midline); mean (◊); 25th and 75th percentiles (box); and the range, excluding outliers (bars).

Fresh versus frozen samples

There were no significant differences in the distribution of the discordant results between paired fresh and frozen gastric (McNemar’s test=0.14, p=0.7) and non-gastric (McNemar’s test=0.69, p=0.4) samples at the pH ≤5.5 cut-off. In fact, the agreement was good between the paired fresh and frozen samples at the pH cut-off ≤5.5: gastric (n=85/92, 92%); oesophageal (n=74/87, 85%); bronchial (n=63, 100%); and saliva (n=82, 100%) samples. However, when the individual paired fresh and frozen samples were compared between the observers there was only complete agreement in 57/92 (62%) when testing the gastric samples (k=0.496, 95% CI 0.364 to 0.627) and in 97/232 (42%) for the non-gastric samples (k=0.316, 95% CI 0.241 to 0.390). The differences between the observers more frequently occurred at pH readings ≥5.0, which included 28/92 (30%) of the gastric and 121/232 (52%) of the non-gastric fresh and frozen pairings (online supplementary files 1 and 2).
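The contrast between good agreement at the ≤5.5 cut-off and only moderate agreement on the exact reading can be illustrated with Cohen's kappa. The following is a minimal Python sketch, assuming the unweighted form of kappa (the source does not say whether a weighted version was used); the function name and the paired readings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(freq_a[cat] * freq_b[cat] for cat in freq_a) / n**2
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical paired readings of the same eight samples.
fresh = [2.0, 2.0, 2.0, 5.0, 5.5, 6.0, 6.5, 7.0]
frozen = [2.0, 2.0, 2.0, 5.5, 5.5, 6.5, 6.5, 7.0]

kappa_exact = cohen_kappa(fresh, frozen)
# Dichotomise at the clinical cut-off: True means "classified as gastric".
kappa_cutoff = cohen_kappa([p <= 5.5 for p in fresh], [p <= 5.5 for p in frozen])
print(f"exact readings: {kappa_exact:.2f}, at <=5.5 cut-off: {kappa_cutoff:.2f}")
```

In this made-up example the two disagreements (5.0 vs 5.5 and 6.0 vs 6.5) lower kappa for the exact readings but never cross the 5.5 threshold, so agreement at the cut-off is unaffected.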

Accuracy of the pH ≤5.5 cut-off

Table 1 shows the diagnostic accuracy of pH ≤5.5 for all samples and in the presence or absence of antacid medication to distinguish fresh gastric and non-gastric samples. The sensitivity to correctly identify gastric samples at pH ≤5.5 was 68% (95% CI 57 to 77) and the specificity was 79% (95% CI 74 to 84). Surprisingly, the sensitivity of the gastric pH was slightly higher in patients on antacids (71%, 95% CI 55 to 84) compared with the rest (65%, 95% CI 51 to 77). The positive PV to predict the sample was gastric given a positive test (pH ≤5.5) was 52% (95% CI 45 to 58). The negative PV to predict the sample was non-gastric given a negative test (pH >5.5) was 88% (95% CI 85 to 91). The positive and negative LRs were 3.3 and 0.4, respectively. The overall probability that the samples would be correctly classified was 76%–77%, regardless of whether patients were taking antacid medication or had other potentially confounding factors, including those with pernicious anaemia (n=3) and/or had previous gastric surgery (n=9).
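The measures in this paragraph all derive from one 2×2 table of test result against true source. The Python sketch below shows the arithmetic; the function name is hypothetical, and the counts are an assumption back-calculated from the reported percentages and fresh-sample sizes (96 gastric; 90 oesophageal, 103 bronchial and 101 saliva), not values taken from Table 1.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp: gastric samples with pH <=5.5      fn: gastric samples with pH >5.5
    fp: non-gastric samples with pH <=5.5  tn: non-gastric samples with pH >5.5
    """
    if min(tp + fn, fp + tn, tp + fp, fn + tn) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if sensitivity == 0 or specificity in (0, 1):
        raise ValueError("likelihood ratios are undefined for this table")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed counts, reconstructed for illustration (not from Table 1).
metrics = diagnostic_metrics(tp=65, fn=31, fp=61, tn=233)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the rounded values agree with the figures quoted in the text, which suggests that the low positive PV is driven mainly by acidic oesophageal samples among the false positives.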

Table 1

The proportion of fresh samples from different sources with pH ≤5.5 and the diagnostic accuracy of using this cut-off to detect gastric source overall, and in the presence or absence of prior antacid medication and confounding factors

The sensitivity and specificity at different pH cut-offs

As expected, there was an increase in sensitivity at higher pH cut-offs, but this was at the expense of lowering the specificity. For these samples, the pH readings between 4.5 and 6.0 provided the optimal balance between sensitivities and specificities. There were no bronchial samples with a pH ≤6.0, but there were 59/90 (66%) oesophageal and two saliva samples that lowered the specificity of the test to 76% (95% CI 71 to 81). If a pH ≤4.5 was used the sensitivity fell to only 60% (95% CI 50 to 71), whereas the specificity increased to 86% (95% CI 82 to 90). Overall, the area under the ROC curve at different pH cut-offs to differentiate the gastric from the non-gastric samples was 0.74 (figure 3, online supplementary file 3).

Figure 3

Receiver operating characteristic (ROC) curve to determine the diagnostic accuracy of different pH cut-offs for the fresh gastric sample versus the non-gastric samples.
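The trade-off between sensitivity and specificity across cut-offs, and the area under the ROC curve, can be sketched as follows. This is a minimal Python illustration, assuming the area is computed as the probability that a gastric sample reads lower than a non-gastric sample, with ties counted as one half; the function names and pH readings are hypothetical and are not study data.

```python
def sens_spec_at(cutoff, gastric_ph, non_gastric_ph):
    """Sensitivity and specificity when pH <= cutoff is called 'gastric'."""
    sensitivity = sum(p <= cutoff for p in gastric_ph) / len(gastric_ph)
    specificity = sum(p > cutoff for p in non_gastric_ph) / len(non_gastric_ph)
    return sensitivity, specificity


def roc_auc(gastric_ph, non_gastric_ph):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    score = 0.0
    for g in gastric_ph:
        for n in non_gastric_ph:
            if g < n:
                score += 1.0
            elif g == n:
                score += 0.5
    return score / (len(gastric_ph) * len(non_gastric_ph))


# Hypothetical pH stick readings, for illustration only.
gastric = [2.0, 2.0, 2.5, 4.0, 6.5, 7.0]
non_gastric = [2.0, 5.0, 6.5, 7.0, 7.0, 7.0]

for cutoff in (4.5, 5.5, 6.5):
    sens, spec = sens_spec_at(cutoff, gastric, non_gastric)
    print(f"pH <= {cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"AUC={roc_auc(gastric, non_gastric):.2f}")
```

Because pH sticks read in 0.5 unit steps, ties between gastric and non-gastric readings are common, so the handling of ties matters more here than it would for a continuous measurement.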


This prospective study found that a cut-off of pH ≤5.5 resulted in similarly low sensitivities for identifying gastric secretions in the whole group and in the subgroup on antacids or with other confounding factors who might be expected to have less gastric acid. The results of the current study are comparable to a recent study that found a sensitivity of 66% using a similar standard colorimetric pH stick in gastric aspirates taken from an NGT.24 However, earlier studies have estimated that this cut-off would provide slightly higher sensitivities of between 85% and 90% in patients not taking antacids or 73%–77% if antacid medication was present.25–27 Where aspirate is not obtained from the NGT, the overall sensitivity of the pH test has been reported to decrease to 66%.27 It was also reassuring that pH ≤5.5 was able to rule out all of the bronchial samples in the present study.

In this study, 66% of the oesophageal samples had a pH ≤5.5, demonstrating that there was considerable overlap with the gastric aspirates. This may have resulted from regurgitation of gastric secretions through the gastric sphincter during gastroscopy.28 However, if this occurred in patients receiving NGT feeding they would have been at risk of being fed into the oesophagus, which could increase the risk of aspiration pneumonia (ie, a ‘never event’).29 A third of the gastric aspirates also had a pH >5.5, which if taken from an NGT would have resulted in significant delays before essential nutrition, fluid and medication could be delivered while waiting to retest a few hours later or for X-ray confirmation of placement in the stomach.

Comparing the entire range of pH cut-offs, the pH readings between 4.5 and 6.0 provided the greatest overall accuracy. However, we found that there was a lack of complete agreement between observers when reading the fresh and frozen samples, particularly at pH readings ≥5.0. Previous studies have also reported that testers frequently have difficulties in differentiating between the small differences in the pH colours, particularly across the range 5–7, which is the vital range to determine whether to feed or not using an NGT.21 30 A recent study reported that misinterpretation occurred in 30% of pH readings, of which 12% were pH 6.0.9 These errors were equally common among both novice and experienced staff,9 reinforcing that visual inspection of the pH stick is unreliable. However, in this study we cannot discount that the pH of some samples may have been altered by freezing and defrosting.

Strengths and limitations

A strength of the current study was the large number of aspirates that could be obtained from patients undergoing routine gastroscopy and bronchoscopy and in whom we could be certain of the source of the aspirate. This approach was considered the best option to obtain bronchial and oesophageal aspirates, which have previously been under-reported or estimated from the literature.21 22 Furthermore, obtaining tube aspirate from the NGT is not possible in up to 46% of patients.9 Although a lack of aspirates will impact on the usability of the pH test, the aim of the current study was to explore the accuracy of pH cut-offs on the different types of aspirate, which would not have been verifiable from the NGT without additional X-rays. A potential limitation of using endoscopic procedures is that there may have been variability between the methods used to obtain the samples. The population undergoing scope procedures may also differ from those requiring nasogastric feeding. It was also expected that the majority of patients would have a fasting gastric pH <4.0, but in fact there was a high prevalence of hypochlorhydria, which is more common in patients undergoing endoscopic investigations.31 32 The gastric pH results between those patients taking antacid medications or not may also be confounded by the fasting conditions and by whether medications were stopped prior to the procedure.33 Stopping antacids prior to the procedure increases the risk of gastric acid rebound hypersecretion, which might explain our observation that those usually taking an antacid did not have a greater frequency of pH >5.5.34

Implications for practice

Currently, the recommended first-line test using the pH cut-off of ≤5.5 is able to safely exclude all bronchial aspirates, though it does not exclude oesophageal samples. Therefore, we recommend that guidelines include optimal strategies to prevent oesophageal misplacement or displacement.21 Overall, the accuracy of the pH sticks used in isolation was poor, which could be further hampered by the testers’ ability to differentiate colorimetric results and/or ability to obtain aspirate. Consequently, effective training strategies are required to reduce misinterpretation of pH readings and highlight particular issues related to visual inspection, which could potentially be reduced by redesign of the colorimetric pH stick,9 the addition of biochemical markers,20 24 or comparing pH measured with both pH stick and meter.21 Other bedside methods, including measurement of the internal length of the NGT and ultrasound at the neck, may help prevent NGT position in the oesophagus as well as increase the probability of obtaining aspirate.20 35 At present, there are several bedside methods to verify NGT position, but they are not included in current health guidelines as there is limited evidence to support their effectiveness as stand-alone tests.6 21 However, a combination of cost-effective bedside tests, both established and novel, used to check NGT position is likely to be more accurate and safer than one test in isolation.9


Ensuring that patients are well nourished and hydrated, improving outcomes and experiences of care, and reducing avoidable harm are international priorities. Therefore, further research into more accurate methods to differentiate gastric from non-gastric tube placement, to reduce delays in feeding, X-rays, repeated NGT insertions, and associated healthcare costs is urgently required.


Data were collected by the research nurses from the Edinburgh Clinical Research Facility, University of Edinburgh. We are grateful to the gastroenterologists and respiratory physicians who helped us by collecting samples during the procedure they carried out.




  • Contributors All authors approved the final version of the manuscript. Study concept and design: MD, CG and AMR. Statistical analysis: CG. Acquisition, analysis and interpretation of data: AMR, CG and MD. Drafting of the manuscript: AMR, CG and MD.

  • Funding The study was funded by internal funding from the Centre for Clinical Brain Sciences, University of Edinburgh.

  • Competing interests None declared.

  • Patient consent Not required.

  • Ethics approval This study received prior approval from the Lothian NHS Research Ethics Committee (REC: 13/SS0184; R&D: 20130299).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The deidentified data set will be made available on request to the lead author.