Background and aim In 2013, Diaz-Nieto et al published a Cochrane review to summarise the impact of postsurgical chemotherapy versus surgery alone on survival for resectable gastric cancer. The authors concluded that postsurgical chemotherapy showed an improvement in overall survival. The aim of this article was to assess the validity of four studies included in the Cochrane review and to investigate the impact of an exclusion of these four studies on the result of the meta-analysis.
Methods Overall survival was selected as endpoint of interest. Among the 34 included papers which analysed this endpoint, we identified the four publications which have the highest weights to influence the final result. The validity of these papers was analysed using the CONSORT (Consolidated Standards of Reporting Trials) checklist for randomised controlled trials. We performed a new meta-analysis without the four studies in order to assess their impact on the general result of the original meta-analysis.
Results The analysed four studies revealed several inconsistencies: inappropriate answers were found in up to 77% of the items of the CONSORT checklist. Unclear or inadequate randomisation, missing blinded set-up, conflict of interest and lacking intention-to-treat analysis were the most common findings. When performing a meta-analysis excluding the four criticised studies, postsurgical chemotherapy still showed a significant improvement in overall survival. Even when excluding all single studies with a statistically significant outcome by themselves and performing a meta-analysis on the remaining 26 studies, the result remains statistically significant.
Conclusion The four most powerful publications in the Cochrane review show substantial deficits. We suggest a more critical appraisal regarding the validity of single studies. However, after the exclusion of these four studies, the result of the meta-analysis did not change.
- gastric cancer
- assessment of validity
- placebo effect
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
What is already known about this subject?
Following the results of a Cochrane review by Diaz-Nieto et al. 2013, post-surgical chemotherapy should be used for patients with resectable gastric cancer.
What are the new findings?
The validity of the four most powerful studies included is not sufficient for inclusion in a meta-analysis.
However, even when those four studies are excluded, the overall results of the Cochrane review do not change.
How might it impact on clinical practice in the foreseeable future?
A more critical appraisal regarding the validity of single studies is warranted.
The literature needs to be reassessed in detail in order to avoid unnecessary side effects for the patients as well as unnecessary costs for the health care systems.
Despite declines, gastric cancer is still the fifth most common malignancy in the world after lung, breast, colorectum and prostate cancer.1 Although surgery is considered the only curative option,2 the role of adjuvant chemotherapy (CTx) after curative resection in improving patients’ survival remains controversial.3 While some trials support the use of adjuvant 5-fluorouracil (5-FU) combination CTx,4 5 others do not show any positive effect.6 7 Several meta-analyses have attempted to address the validity of adjuvant CTx in this setting with the majority failing to confirm a positive association due in large part to a lack of sufficient evidence.8 9 In 2013, Diaz-Nieto et al published a Cochrane review investigating the impact of postsurgical CTx versus surgery alone for resectable gastric cancer on overall survival (OS).3 The authors identified 34 randomised controlled trials (RCT) reporting OS and 15 reporting disease-free survival (DFS) and concluded that postsurgical CTx showed an improvement in OS (HR 0.85; 95% CI (0.80; 0.90)) as well as in DFS (HR 0.79; 95% CI (0.72; 0.87)). Although all trials had a high risk of bias and the authors assessed risks such as sequence generation, allocation concealment, blinding, incomplete outcome data, selective outcome reporting, baseline imbalance, early stopping, and source of funding bias according to published recommendations,10–13 several studies with questionable validity were included in the Cochrane review.
For quite a while now, a more individualised therapeutic approach instead of a standardised treatment has increasingly been discussed with respect to patients with malignant tumours. The new ‘choosing wisely’ campaign also contributes to deciding carefully among treatment modalities with their potential side effects. Therefore, given the doubts about the validity of the underlying studies, it seemed necessary to us to re-evaluate treatment recommendations for specific tumour entities.
Thus, the aim of this work was to assess the validity of four of the studies included in the meta-analysis of Diaz-Nieto et al 3 which confirmed the benefit of postoperative 5-FU combination CTx in gastric cancer with the intention to invite everyone to critically interpret the results and the methodology by which the results were achieved.
Materials and methods
The meta-analysis of Diaz-Nieto et al 3 included a total of 34 studies. Eight studies (24%) (Neri et al 14, Sakuramoto et al 15, Fujimoto et al 16, Douglass et al 17, Chou et al 18, Cirera et al 19, Grau et al 20, Nakajima et al 21) found a statistically significant advantage in survival in patients undergoing adjuvant CTx after curative surgery compared with patients receiving only surgical resection (HR<1 with significant 95% CI not including 1). All the other 26 studies (76%) were not statistically significant: five of them had an HR>1 and the other 21 studies only found a trend for a better survival after adjuvant CTx (HR<1).
In the first part of the Results section, we assessed the validity of the four most powerful studies included in the Cochrane review of Diaz-Nieto et al 3 which found a statistically significant advantage in survival in patients receiving postsurgical CTx after curative resection for gastric cancer compared with patients undergoing surgery only. These studies are those of Sakuramoto, Neri, Fujimoto, and Douglass. The assigned weights are 4.4%, 3.8%, 2.6%, and 2.9%, respectively.
In the second part of the Results section, we performed a new meta-analysis without these aforementioned four studies, and finally we present the results of the meta-analysis with all eight statistically significant studies confirming the survival advantage for patients treated with postsurgical CTx excluded. In this last case, only statistically non-significant studies were included in the meta-analysis.
Selection of the studies and assessment of their validity
To analyse the validity of the Cochrane review, one has to select a positive statement of this review, because only in the case of a positive statement can specific data be identified for an assessment of validity. In the case of negative results, there are too many possibilities that could lead to negative results. From the several endpoints investigated in the Cochrane review of Diaz-Nieto et al,3 we identified OS as a major endpoint of interest. Among the 34 studies identified by the authors of the Cochrane review investigating OS, we selected the four most powerful studies as weighted by the authors of the review which support the advantage of postsurgical 5-FU-based CTx: Neri et al,14 Sakuramoto et al,15 Fujimoto et al,16 and Douglass et al.17 The weights assigned to these four studies by the authors of the systematic review according to their sample size, precision of the estimates and width of the CIs were 3.8%, 4.4%, 2.6%, and 2.9%, respectively. We then assessed the validity of these studies using the CONSORT (Consolidated Standards of Reporting Trials) checklist,22 which is a validated instrument for the evaluation of RCTs and which has a total of 37 items. The checklist with all items and their precise description is available in the online supplementary appendix 1. We then asked whether the positive result in the Cochrane review is supported by sufficient validity. Figure 1 illustrates our methodology. Two independent review authors (GM and MK) assessed the validity of each of the four publications.
We repeated the meta-analysis using R without the four analysed studies (n=30) and compared the result with the original meta-analysis comprising 34 studies. In a next step, we assumed that all single studies with a statistically significant benefit of postoperative CTx after curative resection of gastric cancer (n=8) were not valid enough and performed a second meta-analysis with the remaining 26 studies. The results were compared with the original meta-analysis (n=34 studies). The meta-analyses were performed with R, V.3.2.0, with the package ‘meta’ (http://www.r-project.org/foundation).
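The exclusion procedure described above can be sketched in a few lines. The analyses in this article were run with the R package ‘meta’; the Python code below is only an illustrative substitute. It assumes a fixed-effect inverse-variance model (the pooling model is not stated in this section), recovers each standard error from the reported 95% CI, and uses four hypothetical studies labelled A to D rather than the trial data.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(HR), recovered from a 95% CI given on the HR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def pool_fixed(studies, z=1.96):
    """Inverse-variance fixed-effect pooling of (hr, lower, upper) tuples.

    Returns the pooled HR with its 95% CI as (hr, lower, upper).
    """
    weights, weighted_logs = [], []
    for hr, lower, upper in studies:
        w = 1.0 / se_from_ci(lower, upper, z) ** 2  # weight = 1 / variance
        weights.append(w)
        weighted_logs.append(w * math.log(hr))
    total = sum(weights)
    log_hr = sum(weighted_logs) / total
    se = math.sqrt(1.0 / total)
    return (math.exp(log_hr),
            math.exp(log_hr - z * se),
            math.exp(log_hr + z * se))

# Hypothetical studies (NOT the trials of the Cochrane review): HR, 95% CI
studies = {
    "A": (0.70, 0.52, 0.94),  # significant on its own
    "B": (0.90, 0.70, 1.16),
    "C": (0.95, 0.75, 1.20),
    "D": (1.05, 0.80, 1.38),
}

full = pool_fixed(studies.values())
reduced = pool_fixed(v for k, v in studies.items() if k != "A")

print("all studies:      HR %.2f (%.2f to %.2f)" % full)
print("without study A:  HR %.2f (%.2f to %.2f)" % reduced)
```

Removing the most favourable study moves the pooled HR towards 1 and widens the CI. The same comparison is made in this article between the original 34 studies and the reduced sets of 30 and 26 studies.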
Assessment of the validity of the studies
Table 1 presents a summary of the four analysed papers. The results are reported for each of the four included studies.
Table 2 summarises all the items present in the CONSORT checklist showing how the studies deal with them (extended table can be found in the online supplementary file 2).
In this section, we describe the problems of each study. Regarding the study by Neri et al 14, 18 of the 32 validity criteria (56%) were not met. Five items were not applicable. The patients included were stratified by centre but not randomly assigned to the control or intervention group, therefore we cannot recommend the use of this design in a confirmatory study. Inclusion of untreated controls limits the interpretation of the study. Specifically, the difference between the intervention and control group may be caused by a non-specific effect such as a placebo effect. The risk profiles of the two groups are different with a high probability of unbalanced risk distribution in favour of the intervention group. It is also unclear whether the allocation to the study group was concealed as information about the randomisation procedure is not included. Moreover, blinding was not possible as the control group did not receive any treatment. Furthermore, it is unclear whether all patients were included in the results because table 1 reports only evaluable patients. No information about the number of randomised patients is given. An intention-to-treat (ITT) analysis is not explicitly described. The definition of the study as ‘randomized’ in the title of the article implies a researcher bias. Taken together, these issues lead to insufficient validity of the report and thus the described effect cannot be considered as clinically relevant.
In the study of Sakuramoto et al 15, we identified poor validity in 7 of the 35 validity criteria (20%). Two items were not applicable. Again, as in the previous study, the use of untreated controls limits the interpretation of the study. Additionally, different follow-up modalities are described for the control and intervention group which could be a source of bias. Patients in the intervention group underwent haematological tests and assessments of clinical symptoms every 2 weeks while patients in the control group underwent similar examinations only every 3 months. Due to the use of the minimisation method, allocation concealment is not maintained. Blinding was not possible in this work either as the control group did not receive any treatment. Results are influenced by conflicting interests because a sponsor was involved in the design of the trial and collection of data. As the validity of the report is not sufficient, the described effect cannot be considered as clinically relevant.
In the study of Fujimoto et al,16 27 of the 35 validity criteria were not met (77%). Two items were not applicable. It is unclear whether or not the study was randomised because no randomisation method is described. Again, the use of untreated controls limits the interpretation of the study. The risk profiles of the two groups were not reported and therefore it is not possible to check whether the risk distribution is balanced. It is unclear whether the allocation to the study group was concealed because information about the randomisation procedure is missing. Blinding was not possible as the control group did not receive any treatment. From the 129 patients included in the intervention group, only 97 were analysed (75%), resulting in a loss of power. An ITT analysis was not performed. As the validity of the report is not sufficient, the described effect cannot be considered as clinically relevant.
In the fourth study (Douglass et al 17), 21 of the 34 validity criteria were not met (62%). Three items were not applicable to this study. The randomisation process is not described in detail. Again, the use of untreated controls limits the interpretation of the study. It is unclear whether the allocation to the study group was concealed because information about the randomisation procedure is missing. Blinding was not possible as the control group did not receive any treatment. After the closure of the recruitment phase, 23 patients were withdrawn from the study by a committee and by the principal investigator resulting in a loss of power. The reasons for the withdrawal are not explained in detail so that a conflict of interest cannot be excluded. Moreover, the authors of this last analysed publication state that an update of the results is necessary in order to confirm these results. We could not find a published update in PubMed. As the validity of the report is not sufficient, the described effect cannot be considered as clinically relevant.
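The percentages reported for the four studies follow from the 37 CONSORT items, minus the items judged not applicable. A minimal sketch of this arithmetic, using only the counts given in the text (the function and variable names are our own):

```python
CONSORT_ITEMS = 37  # total number of checklist items used in this assessment

# (items not met, items not applicable), as reported in the text
reported = {
    "Neri": (18, 5),
    "Sakuramoto": (7, 2),
    "Fujimoto": (27, 2),
    "Douglass": (21, 3),
}

def percent_not_met(not_met, not_applicable, total=CONSORT_ITEMS):
    """Share of applicable checklist items that were not met, in whole percent."""
    applicable = total - not_applicable
    return round(100 * not_met / applicable)

summary = {name: percent_not_met(*counts) for name, counts in reported.items()}

for name, pct in summary.items():
    print(f"{name}: {pct}% of applicable items not met")
```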
Figure 2 shows the result of the meta-analysis when the four analysed studies were excluded. A total of 30 studies were included. Four studies (Chou et al,18 Cirera et al,19 Grau et al,20 and Nakajima et al 21) showed a positive and statistically significant result in favour of the use of postsurgical CTx after curative resection for gastric cancer. Twenty-six of the included studies were not statistically significant by themselves. The new meta-analysis estimate had an HR of 0.88 with a 95% CI (0.83 to 0.94). The estimate of the original meta-analysis was 0.85 with 95% CI (0.80 to 0.90). The exclusion of the four studies did not significantly change the result of the meta-analysis.
We then performed a second meta-analysis (figure 3) excluding the other four studies which found a positive and statistically significant result as well. After the exclusion of all eight studies with positive and statistically significant results, the new meta-analysis consisted only of 26 statistically non-significant studies (5 with an HR>1 and 21 with an HR<1). The new meta-analysis estimate (HR 0.92, 95% CI (0.86 to 0.97)) was slightly higher than the original one, but still statistically significant, indicating a better survival in patients receiving adjuvant CTx after curative resection for gastric cancer compared with patients undergoing surgery only.
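To see how far each pooled estimate lies from the null value (HR=1), the reported HRs and CIs can be converted back into approximate z statistics and two-sided p values. The sketch below assumes normality on the log HR scale and symmetric 95% CIs. Because the published limits are rounded to two decimals, the outputs are approximations and not values taken from the original analyses.

```python
import math

def z_and_p(hr, lower, upper, z_crit=1.96):
    """Approximate z statistic and two-sided p value from an HR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a normal model
    return z, p

# Pooled estimates as reported in the text
reported = {
    "original, 34 studies": (0.85, 0.80, 0.90),
    "four studies excluded, 30 studies": (0.88, 0.83, 0.94),
    "non-significant studies only, 26 studies": (0.92, 0.86, 0.97),
}

results = {label: z_and_p(*values) for label, values in reported.items()}

for label, (z, p) in results.items():
    print(f"{label}: z = {z:.2f}, p = {p:.4f}")
```

With each exclusion step, the estimate moves closer to 1 and the z statistic shrinks in absolute value, yet all three estimates remain below the conventional 5% threshold.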
In the present manuscript, we assessed the validity of four studies included in the meta-analysis of Diaz-Nieto et al 3 which supports the results of improved survival in patients treated with postsurgical CTx after curative resection for gastric cancer. However, it is important to identify possible bias in the four studies which support the result of the meta-analysis, because bias jeopardises validity. We demonstrated that these four studies are not valid enough to be included in a Cochrane review. Nevertheless, even when excluded from the meta-analysis, the overall result of the meta-analysis still confirms improved survival by the administration of adjuvant CTx after curative surgery. Furthermore, by excluding all single studies that show a significant benefit of adjuvant CTx and performing a new meta-analysis on the remaining 26 single studies, which by themselves were not statistically significant, the original finding of a benefit of adjuvant CTx after surgery to our surprise prevails.
We will first illustrate the problems we discovered in the four mathematically most influential studies supporting the conclusions and, in a second step, discuss our findings after performing the new meta-analyses.
Common problems in all studies
We agree with the authors of the meta-analysis3 that the lack of a placebo-controlled and blinded design affects the validity of the four studies and consequently the validity of the review. Without placebo control, it is impossible to differentiate between specific pharmacological and placebo effects. Placebo effect is defined as the ‘response of a subject to a substance or any procedure known to be without specific therapeutic effect for the condition being treated’.23 Several studies demonstrated that perceptual characteristics of drugs,24 the route of administration,25 laboratory tests,26 diagnosis,27 and doctor–patient relationship play an important role in the outcome of illness.28–31 Information regarding treatment or no treatment alone is sufficient to cause a placebo effect.32 Moreover, patients’ and doctors’ preferences could also have influenced the results in an open study.33 Patients assigned to the control group feel disadvantaged because they expect to be treated. Furthermore, when there is no concealment of treatment allocation, the randomisation procedure is compromised because of conscious or subconscious bias.34 It is important to perform an ITT analysis to maintain the balanced distribution of risk factors between groups which is achieved by a randomisation procedure. Only in the study of Sakuramoto et al 15 was a correct ITT analysis conducted. Collectively, these aspects affect the validity of the reports and therefore the described effects cannot be considered as clinically relevant.
Specific problems of the study by Neri et al14
This study is the final report of a previously published study by Neri et al,35 in which patients were stratified by centre to either receive CTx or be in the control group at follow-up. It is not clear whether the centres were stratified and patients in each centre were randomised by a single study centre. Alternatively, each centre could have randomised its own patients. The latter procedure could also explain the unbalanced risk profiles (table 1 of the original article) in favour of the intervention group. Additionally, only evaluable patients were reported in the table and were analysed. Following these findings, a researcher bias is present in the study: in the title of the publication, the authors declared that the study was an RCT, which, however, could not be confirmed by our questionnaire.
Specific problems of the study by Sakuramoto et al15
In the study of Sakuramoto et al,15 a minimisation method is used. Minimisation,36–39 a type of dynamic allocation, is gaining popularity especially in clinical cancer trials. In this design, the new subject’s treatment assignment is determined by evaluating the potential covariate imbalance that would result if he or she were assigned to the treatment or likewise to the control group.40 Minimisation aims at achieving balance over a large number of prespecified prognostic factors simultaneously. In contrast to the opinion of the authors of the Cochrane review,3 we raise concerns over this design as it compromises adequate generation of an allocation sequence and concealment in this study. In fact, investigators using minimisation can determine the group to which a prospective subject would be allocated and then decide whether this is positive or negative in terms of creating an imbalance in some key predictor of outcome not considered in the imbalance function. Despite adding randomisation, so that the treatment that minimises the imbalance function for a given patient is not necessarily allocated to that patient, there is a high probability of this being the case.41 The European Medicines Agency’s committee42 states that ‘dynamic allocation is strongly discouraged.’ Regarding follow-up modalities, the patients in the intervention group underwent more frequent haematological tests than patients in the control group. This could be a source of bias because any treatment and additional attention from the doctor (difference in care) could lead to an improvement in the patients’ outcome.43 Moreover, Sox et al 26 found that laboratory tests that have no diagnostic value were independent factors of recovery. Finally, a sponsor-related conflict of interest was identified by our analysis as also acknowledged in the Cochrane review.3
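The predictability concern can be made concrete with a simplified minimisation routine. The sketch below is a generic marginal-totals scheme with a random element, not the procedure actually used by Sakuramoto et al. The factor names, the imbalance measure (range of counts per factor level) and the parameter `p_best` are assumptions chosen for illustration.

```python
import random

def minimisation_assign(new_patient, enrolled, factors,
                        arms=("treatment", "control"), p_best=0.8, rng=random):
    """Pick an arm for new_patient by minimising total marginal imbalance.

    enrolled: list of (patient_dict, arm) pairs already allocated.
    p_best:   probability of following the minimising arm (random element).
    """
    imbalance = {}
    for candidate in arms:
        total = 0
        for f in factors:
            # Count earlier patients per arm who share the new patient's level of f
            counts = {a: 0 for a in arms}
            for patient, arm in enrolled:
                if patient[f] == new_patient[f]:
                    counts[arm] += 1
            counts[candidate] += 1  # hypothetically add the new patient
            total += max(counts.values()) - min(counts.values())
        imbalance[candidate] = total

    best = min(imbalance.values())
    best_arms = [a for a in arms if imbalance[a] == best]
    if len(best_arms) > 1:
        return rng.choice(best_arms)  # tie: fall back to simple randomisation
    preferred = best_arms[0]
    if rng.random() < p_best:
        return preferred
    return rng.choice([a for a in arms if a != preferred])

# Hypothetical trial state: two similar patients already in the treatment arm
enrolled = [
    ({"stage": "II", "sex": "M"}, "treatment"),
    ({"stage": "II", "sex": "M"}, "treatment"),
]
next_patient = {"stage": "II", "sex": "M"}

# With p_best=1.0 the allocation is fully determined by the known covariates
predicted = minimisation_assign(next_patient, enrolled, ["stage", "sex"], p_best=1.0)
print(predicted)
```

Anyone who knows the covariates of the patients already enrolled can run the same calculation before deciding whether to enrol the next patient. A random element (`p_best` below 1) reduces this predictability but does not remove it.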
Specific problems of the study by Fujimoto et al16
In this study, similar to the study of Neri et al,14 the randomisation process is not described, and a table with the characteristics of the patients in the two groups is not reported. This makes it impossible to assess the balance between the groups, which is a direct sign of a good randomisation process. Interestingly, only 75% of the patients included in the intervention group were analysed. The sample size is not sufficient to reach the power needed for the significance levels provided.
Specific problems of the study by Douglass et al17
In this study, similar to the study of Neri et al,14 the randomisation process is not described. Twenty-three randomised patients were excluded by a committee and by the principal investigator from the final analysis. This results in a loss of power. Additionally, the reasons for the withdrawal of these patients are not explained well enough and conflict of interest cannot be excluded. A power calculation is not reported.
In 2010, the GASTRIC (Global Advanced/Adjuvant Stomach Tumor Research International) Group published a meta-analysis on the same topic as Diaz-Nieto et al 3 and similarly found a benefit of adjuvant CTx for resectable gastric cancer (HR 0.82, 95% CI (0.76 to 0.90)).44 The authors found a total of 31 eligible trials. After asking for individual patients’ data, they obtained data of 17 trials only. Thus, the individual patient-level meta-analysis included only 17 of the 31 studies (55%). This meta-analysis has the advantage of being based on individual patient data but the limitation of covering only part of the existing literature. Consequently, the results should be treated with caution because they are only partial.
Generally, the discussion about quality and quality assessment of medical publications is still ongoing. When authoring a clinical study, it is important to describe the study according to the CONSORT checklist22 if the study is an RCT, or according to the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) checklist45 if it is observational. In the case of a meta-analysis, it is mandatory to check the validity of each publication and this check should be included in the review. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist46 helps authors to improve the quality of a meta-analysis of RCT while the MOOSE (Meta-analysis of Observational Studies in Epidemiology) guidelines47 instruct the process of meta-analysing observational studies. In several medical journals, the checklist that goes with the study type must be submitted together with the main manuscript. This increases the quality and standardisation of publications and it is recommended that this procedure becomes a standard of practice for each journal. Several journals also request the trial registration number to consider a study for publication and this is also recommended to become a standard of practice for all publication avenues.
For a reviewer, it is important to carefully evaluate a publication, in particular when an RCT is reported. Design, conduct, reporting, and statistics should be analysed in detail before acceptance. Only valid studies are reliable studies. For an expert pool aiming to publish guidelines, it is necessary to scrutinise the validity of single studies and of meta-analyses as well. As recently shown by Shnier et al,48 financial conflict of interest and relationships between guideline authors and drug companies are common and represent a source of bias in studies. As authoritative value is assigned to guidelines, it is important to develop formal policies to limit the potential influence of any conflict of interest on guideline recommendations.48 Only in this way is it possible to improve the quality of medical publications.
In the second part of our work, we performed the meta-analysis first without the four analysed studies and showed that the result of the meta-analysis does not differ from that of the original one. Moreover, when the other four studies with positive and statistically significant results were excluded as well and only 26 statistically non-significant studies were included, we still found a statistically significant meta-analysis estimate confirming the better survival in patients receiving postsurgical CTx in comparison to those undergoing only gastric resection. This is due to the fact that meta-analyses increase power as described in the Cochrane Handbook49: ‘many individual studies are too small to detect small effects, but when several are combined there is a higher chance of detecting an effect.’ This point is critical as it means that a statistically significant result in a meta-analysis can be obtained even if none of the included studies found a statistically significant result. This finding again highlights the importance of including only studies of high quality in a meta-analysis, especially in the case of studies which did not find a statistically significant estimate. A meta-analysis can often find a statistically significant result simply because of the increase in sample size, independently of the quality of the included studies. Another example of a meta-analysis with a statistically significant overall estimate, even though none of the included studies is statistically significant, can be found in the literature50 and has already been critically appraised.51
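The gain in power from pooling can be illustrated with a simple calculation. The sketch below assumes k identical, independent studies combined under a fixed-effect inverse-variance model, so that the pooled standard error is the single-study standard error divided by the square root of k. The HR of 0.90 with a 95% CI of 0.78 to 1.04 is a hypothetical single-study result, not one of the trials discussed here.

```python
import math

Z = 1.96  # critical value for a two-sided 95% CI

def pooled_ci(hr, lower, upper, k):
    """95% CI of the fixed-effect pool of k identical, independent studies."""
    log_hr = math.log(hr)
    se_single = (math.log(upper) - math.log(lower)) / (2 * Z)
    se_pooled = se_single / math.sqrt(k)  # variance shrinks by a factor of k
    return math.exp(log_hr - Z * se_pooled), math.exp(log_hr + Z * se_pooled)

# Hypothetical non-significant study: HR 0.90, 95% CI 0.78 to 1.04
single = pooled_ci(0.90, 0.78, 1.04, k=1)
four = pooled_ci(0.90, 0.78, 1.04, k=4)

print("one study:    %.2f to %.2f" % single)
print("four studies: %.2f to %.2f" % four)
```

The point estimate is unchanged, but the CI of the pooled result no longer includes 1. The calculation contains no term for study quality: statistical significance arises from the accumulated sample size alone.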
Implications for practice
Following the results of the Cochrane review, postsurgical CTx should be used for patients with resectable gastric cancer. However, it is important to note that some of the included trials contain limitations so that definitive assessments of this topic should be delayed until future trials are properly developed. The four analysed studies that were chosen because of their attributed weights are not of sufficient validity to be included in a meta-analysis, which holds for most of the other studies included.
Perioperative CTx in patients with gastric cancer is now the standard treatment in most centres, is recommended in several national guidelines and has also been proposed by a Cochrane review.52 However, critical analysis of the leading publications that support perioperative CTx treatment53–55 uncovered serious shortcomings, particularly with regard to patient selection, changes in protocol, homogeneity of subjects, surgical quality and analysis of the results.56 The authors of this critical appraisal conclude that none of these studies justify an unrestrained recommendation of perioperative CTx for advanced gastric cancer. Even more interesting, the same group found different recommendations in international guidelines on this topic, despite the fact that all guidelines claim to be based on the same publications.57
Although perioperative CTx in case of advanced gastric cancer is recommended, the literature needs to be reassessed in detail in order to avoid unnecessary side effects for the patients as well as unnecessary costs for the healthcare systems.
Contributors GM, DHB and MK contributed substantially to conception and design of the study. GM and MK contributed to analysis and interpretation of the data and drafted the article. GM performed the meta-analysis. All authors gave the final approval of the version to be published.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.