Background Screening for colorectal cancer (CRC) with guaiac-based faecal occult-blood test (FOBT) has been reported to reduce CRC mortality in randomised trials in the 1990s, but not in routine screening, so far. In Finland, a large randomised study on biennial FOB screening for CRC was gradually nested as part of the routine health services from 2004. We evaluate the effectiveness of screening as a public health policy in the largest population so far reported.
Methods We randomly allocated (1:1) men and women aged 60–69 years to those invited for screening and those not invited (controls), between 2004 and 2012. This resulted in 180 210 subjects in the screening arm and 180 282 in the control arm. In 2012, the programme covered 43% of the target age population in Finland.
Results The median follow-up time was 4.5 years (maximum 8.3 years), with a total of 1.6 million person-years. The CRC incidence rate ratio between the screening and control arm was 1.11 (95% CI 1.01 to 1.23). The mortality rate ratio from CRC between the screening and control arm was 1.04 (0.84 to 1.28). The CRC mortality rate ratio was 0.88 (0.66 to 1.16) in males and 1.33 (0.94 to 1.87) in females.
Conclusions We did not find any effect in a randomised health services study of FOBT screening on CRC mortality. The substantial effect difference between males and females is inconsistent with the evidence from randomised clinical trials and with the recommendations of several international organisations. Even if our findings are still inconclusive, they highlight the importance of randomised evaluation when new health policies are implemented.
Trial registration 002_2010_august.
- COLORECTAL CANCER SCREENING
- CANCER PREVENTION
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
What is already known about this subject?
▸ Randomised trials with guaiac-based faecal occult-blood test (FOBT) have been reported to reduce colorectal cancer (CRC) mortality in the 1990s.
▸ The EU and the US both recommend screening from 50 until 74 years of age.
▸ Evidence of the effectiveness of guaiac-based FOB in screening as a part of routine health service with unselected study subjects is missing.
What are the new findings?
▸ Our randomised health services study of FOBT screening on CRC mortality found no effect.
▸ We observed a substantial difference in effect between males and females, although this finding is still inconclusive.
How might it impact on clinical practice in the foreseeable future?
▸ Re-evaluation of existing screening practices for CRC with FOBT in current health service programmes might be needed.
▸ Before any new test is applied as a routine health service, a randomised evaluation of its effectiveness is warranted.
Accumulated evidence from large randomised trials1–6 has shown a mortality effect of colorectal cancer (CRC) screening using the faecal occult-blood test (FOBT). The average reduction in CRC mortality is estimated to be 12%, varying from 10% to 21%, based on the most recent meta-analysis.7 These trials suggest an even bigger reduction in CRC mortality with annual screening.1,2 In the first report of the UK study,3 with a median follow-up of 7.8 years, the difference in CRC mortality between screening and control arms was 15%, and the effect began to emerge after 3–4 years from study entry. In the follow-up of the same study,4 with a 15-year screening period and almost 15 years of follow-up after the screening period, a 12% reduction in CRC mortality was observed. In the US trial,1 the biennial screening arm had a higher CRC mortality than the control arm, from 4–5 to 8–9 years after study entry, and, overall, a modest 6% reduction in CRC mortality was detected. In the analysis of the same trial with 30 years of follow-up,2 14 years during the screening period and 16 years after that, a 12% reduction in CRC mortality was observed. In the Danish and Swedish trials, the difference in the cumulative CRC mortality emerged at about 8 years of follow-up or later.5,8 Results on CRC mortality with repeated FOB-based testing have also been reported from non-randomised studies in France9 (a 33% reduction) and Germany10 (a 34% reduction).
The US trial (Minnesota) with 30 years of follow-up found a significantly different reduction in CRC mortality between males and females in the biennial screening arm, RR=0.63 (95% CI 0.48 to 0.82) and RR=0.92 (95% CI 0.72 to 1.18), respectively.2 On the other hand, in the UK trial, the reduction in CRC mortality with approximately 20 years of follow-up was similar in males (RR=0.91, 95% CI 0.82 to 1.02) and females (RR=0.90, 0.80 to 1.01).4
Globally, 1.2 million new CRC cases are diagnosed and 600 000 deaths are due to CRC annually.11 Cancers of the colon and rectum remain the third most common (ranked by site-specific incidence) in Finland. In 2012, there were 2904 new CRC cases (ICD-10 classification: C18–C21) and 1161 CRC deaths in Finland, which has 5.42 million inhabitants.12 In Finland, between 2001 and 2012, CRC mortality decreased on average by 1.5% per year in males and 0.7% per year in females.13 Effective tools for primary prevention are limited, and preventive efforts have focused on the detection of CRC in the early stages of the tumour growth process. CRC is considered to develop via an adenoma to carcinoma sequence.14 Effective screening results in detection of adenomas and preclinical cancers and thereafter in decreased CRC mortality. At the moment, the EU15 and the US16 both recommend screening for CRC starting from 50 until 74 years of age.
A new screening programme should allow unbiased evaluation of the effectiveness in the target population.17 A service programme has more challenges than randomised trials, and the expected effect is usually smaller than that observed in randomised trials.18,19 Service programmes are run within the normal health care system, with more variation in the process, limited resources and often less devoted human resources than in scientific trials. Thus, real-life applications, that is, service programmes, should be evaluated rigorously, including randomisation. This is possible only during the implementation period of the new screening programme, with disease-specific mortality as the end point.17
We report here the first results on mortality of an individually randomised community-based CRC screening programme (a randomised health services (RHS) study) with the FOB based biennial test among 360 000 men and women in Finland.
The Finnish population-based screening programme was individually randomised (1:1) in the implementation phase by region, gender and birth year to those to be invited for screening (screening arm) and those not invited (control arm), based on Central Population Register (CPR) data and consent applied as carried out in the Finnish health services. The target group included men and women from 60 to 69 years of age. Details of the study design have been reported earlier.20,21 The working group of screening at the Finnish Ministry of Social Affairs and Health decided first to recommend a 6-year implementation period with random allocation of the population to be able to evaluate the effects reliably. In 2009, due to low coverage of the programme, the randomisation period was extended up to the year 2014. Thus, an estimate of effectiveness is needed for health policy planning after the end of the extended randomisation period. At that point, a decision must be taken on whether to apply for continuation of the implementation period or for closing the randomisation period.
In short, from 2004 until 2012, altogether 362 165 persons were individually randomised either to screening (181 080 subjects) or control (181 085 subjects) arm. All individuals in the screening arm were invited if they had a valid address available from the CPR. The present study population covered approximately 43.5% (invitees 21.8%) of the whole Finnish target population aged 60–69 years at the end of 2012.22
The CPR has a legislative mandate to collect records on residents of Finland including a personal identification code that can be used to link data from various health registers. The register also includes the name, birth date and a valid home address; and dates of emigration and death in case the person has moved outside Finland or died, respectively. Statistics Finland receives death certificates of all deaths and codes the official cause of death nationally. The Finnish Cancer Registry (FCR) collects national data on cancer cases since 1953 with high coverage, close to 99% for solid tumours.23 The personal identification code is used to link people between these registers.
Assessment of study subjects and end points
Subjects who died after the population sample was retrieved from the CPR but before or at the date of randomisation were excluded from analysis (94 in total; 49 invitees and 45 controls) (figure 1). We similarly excluded those who emigrated (7 in total; 4 invitees and 3 controls). Subjects who were diagnosed with CRC before or at the date of the randomisation were also excluded (1572; 817 invitees and 755 controls). In January 2007, some controls (109 subjects) received a screening invitation due to problems in the software; in the activity itself these people were kept in the screening arm. For the present analysis, these people were included in the original control arm. In the final analysis we had a total of 360 492 subjects (180 210 in the screening arm and 180 282 in the control arm).
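The exclusion counts above can be reconciled with the randomised and analysed totals by simple subtraction. The following is a minimal sketch of that check; the dictionary and function names are illustrative and are not part of the study's own analysis code, and the randomised totals are the ones reported in the study design description (181 080 and 181 085).

```python
# Randomised totals per arm, as reported in the text (2004-2012).
randomised = {"screening": 181_080, "control": 181_085}

# Exclusions per arm, as reported in the text. The labels are illustrative.
exclusions = {
    "died before or at randomisation": {"screening": 49, "control": 45},
    "emigrated": {"screening": 4, "control": 3},
    "CRC before or at randomisation": {"screening": 817, "control": 755},
}


def analysed(arm: str) -> int:
    """Subjects remaining in an arm after all reported exclusions."""
    removed = sum(counts[arm] for counts in exclusions.values())
    return randomised[arm] - removed


analysed_screening = analysed("screening")
analysed_control = analysed("control")
analysed_total = analysed_screening + analysed_control

print(analysed_screening, analysed_control, analysed_total)
```

The 109 controls who received an invitation by mistake do not appear in this arithmetic, because they were analysed in their original (control) arm.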
Incident CRC cases and deaths from CRC were retrieved by linkage with the FCR and defined according to the ICD-O-3 classification topography codes C18.0–C21.2 and C26.0 if malignant behaviour, excluding lymphomas (morphology codes ≥9590) and anal epidermoid cancers (topography C21.0–C21.2 and morphology code 8070). Information about the vital status at the end of 2012, including date of death and emigration, was received from the CPR, and cause of death was obtained from Statistics Finland through record linkage with the unique personal identification number.
Our primary end point in the current study was death from CRC defined by the official classification of death by Statistics Finland. Death due to any cause was the end point in assessing the excess (all-cause) mortality among patients with CRC (diagnosed after the date of randomisation). We define a screen-detected CRC event to be a cancer diagnosed within 6 months from the screening test. An interval cancer is one that is detected after the 6-month period from the screening test but before the next screen.
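The two definitions above amount to a date-based classification rule. The sketch below illustrates one way to express it, under assumptions that the text does not settle: the 6-month window is counted in calendar months, its boundary is treated as inclusive, and the function and label names are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Date `months` calendar months after `d`, clamped to the month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def classify_cancer(diagnosis: date, screening_test: date,
                    next_screen: Optional[date]) -> str:
    """Classify a CRC diagnosis relative to a screening test.

    Assumption: a diagnosis exactly 6 months after the test still counts
    as screen-detected; the source does not specify the boundary.
    """
    if diagnosis < screening_test:
        return "before test"
    if diagnosis <= add_months(screening_test, 6):
        return "screen-detected"
    if next_screen is None or diagnosis < next_screen:
        return "interval"
    return "after next screen"


# Hypothetical dates, for illustration only.
test_day = date(2008, 1, 15)
next_round = date(2010, 1, 15)
print(classify_cancer(date(2008, 5, 1), test_day, next_round))
print(classify_cancer(date(2009, 3, 1), test_day, next_round))
```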
Those invited for screening every second year were administered the guaiac-based non-rehydrated FOBT at home after receiving three test-cards (Hemoccult) and instructions by postal mail. Participants were instructed to collect a faecal sample three times within 1 week of the first sample, taking two smears from different locations of the faeces each time. Dietary restrictions consisted of avoiding raw meat, blood and liver dishes 3 days before sample-taking and during the time of sampling. Also, vitamin C supplements with more than 250 mg of vitamin were not to be used. Test cards were returned by mail to be analysed at the national screening centre at Pirkanmaa Cancer Society in Tampere. One central laboratory covered all of Finland and samples were analysed within 14 days of sample-taking. Test results were sent by postal mail to all attenders. In case of blood in any of the samples (test positive), a regional contact nurse was also informed. The contact nurse interviewed the test-positive participant (mostly by phone) and thereafter organised a full colonoscopy examination. Colonoscopies were performed regionally at health centres, private clinics or hospitals, by experienced physicians with extensive training in colonoscopy (mostly specialised gastroenterologists). The overall compliance with colonoscopy was 84% and the annual compliance varied from 81% to 90% by calendar year. Colonoscopies in controls and in invitees not attending screening, as well as those leading to the diagnosis of interval cancers, were performed by the same providers as the screen-induced ones.
The original sample size calculation was based on 90% power and a 5% type I error if the true effect was a 20% reduction in CRC mortality.20 These assumptions implied that we would need to accumulate 1.6 million person-years in the current study. Incidence and mortality rates were estimated by dividing the respective numbers of death from any cause, from CRC and from causes other than CRC, by the number of person-years. The number of person-years is the sum of each subject’s time at risk. In the estimation of incidence, the time at risk was calculated from randomisation to CRC diagnosis, death, emigration, or to the end of 2012, whichever came first. In estimation of the mortality, the time at risk was calculated from randomisation until death, emigration, or to the end of 2012, whichever came first. The ratios of the mortality rates and of the excess mortality rates were estimated within the intention-to-treat principle to measure the effect of the service screening. Confidence intervals (CIs) were estimated assuming the observed number of deaths to follow a Poisson probability law. The difference in the effect of mortality from CRC between males and females was tested using the Cox proportional hazards model and the classical likelihood ratio test. Restricted cubic spline functions24,25 with a time-dependent effect of invitation were fitted to individual-level follow-up data for describing arm-specific CRC mortality rates (ie, the hazard of death from CRC) and their ratio over follow-up time. Cumulative proportion of deaths from CRC was estimated in a competing risk setting using a weighted empirical cumulative distribution function.26 The excess mortality rate was estimated by dividing the excess number of deaths observed in patients with CRC by the total number of person-years in each arm.27,28 Details of estimation of excess mortality rate and its variance using the delta method are described in the online supplementary appendix.
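As an illustration of a rate ratio with a Poisson-based CI, the sketch below uses a Wald interval on the log scale. This is one common choice and is an assumption here, since the text does not state which interval method was used. The person-years per arm are not given in the text, so an equal split of the 1.6 million total (800 000 each) is assumed purely for illustration; the death counts are the CRC deaths reported in the Results.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def rate_ratio_ci(events_a: int, py_a: float, events_b: int, py_b: float,
                  z: float = Z_95) -> tuple[float, float, float]:
    """Rate ratio (arm A vs arm B) with a log-scale Wald CI.

    Treats both event counts as independent Poisson variables, so that
    Var(log RR) is approximately 1/events_a + 1/events_b.
    """
    rr = (events_a / py_a) / (events_b / py_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


# CRC deaths from the Results; person-years per arm are ASSUMED equal.
ASSUMED_PY_PER_ARM = 800_000
rr, lower, upper = rate_ratio_ci(170, ASSUMED_PY_PER_ARM,
                                 164, ASSUMED_PY_PER_ARM)
print(f"RR {rr:.2f} ({lower:.2f} to {upper:.2f})")
```

Under the equal person-years assumption, the rounded output agrees with the CRC mortality rate ratio reported in the Results; the exact figures in the study come from the actual person-years in table 3.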
When follow-up was started 1 year after randomisation in order to remove the period with no potential effect of screening, our results did not change and, thus, the results are presented with full follow-up starting from the date of randomisation.
The RHS study on implementation of CRC screening was approved by the Ministry of Social Affairs and Health in 2004 (STM/42/07/2004) and updated in 2010 by the official authority, the National Institute of Health and Welfare (THL/619/5.05.00/2010). The study has been registered in the registry for RHS studies maintained by the Cancer Society of Finland (http://www.cancer.fi/rhs/002_2010_august/).
The background characteristics between the screening and control arms were in balance (table 1). Approximately 20 000 subjects were assigned annually either to the screening or the control arm resulting in 180 000 invitees by the end of 2012 and a similar number of controls. The median follow-up time in our study was 4.5 years (range 0.0–8.3), with 25% of subjects (approximately 90 000 subjects) with at least 6.5 years of follow-up. The longest follow-up time was 8.3 years.
In all, close to 440 000 invitations were sent between 2004 and 2012, with an uptake of 68.8% (61.5% among males and 76.0% among females; table 2). The proportion of FOB positive tests was 3.6% of all tests (4.7% among males and 2.7% among females). The proportion of screen-detected CRCs was 42.7% of all CRCs (41.7% among males and 43.9% in females) (table 2). Colonoscopy was performed in 84% of screen positives.
Altogether, 73% out of the total number of 1.6 million person-years were accumulated during the first 4 years after randomisation (table 3). The incidence rate of CRC was higher in the screening arm compared to the control arm: 112.4/100 000 person-years in the screening arm and 100.9/100 000 person-years in the control arm (table 3). The CRC incidence rate ratio between the screening and control arm was 1.11 (95% CI 1.01 to 1.23). In both arms, the incidence rate of CRC was higher in males (133.1/100 000 in the screening arm and 120.8/100 000 in the control arm) than in females (92.4 and 81.7/100 000). The CRC incidence rate ratio for males was 1.10 (0.97 to 1.25) and for females, 1.13 (0.98 to 1.31).
The numbers of new CRC cases showed a substantial difference in the first 2-year interval after randomisation: 84 more people were diagnosed with CRC in the screening than in the control arm (table 4).
A total of 15 963 deaths occurred during the follow-up (table 3), 8000 deaths in the screening arm and 7963 in the control arm. We did not find any difference between study arms in overall mortality rate (rate ratio, RR, 1.00; 95% CI 0.97 to 1.04) or in mortality rate from other causes than CRC (RR 1.00; 0.97 to 1.04). The overall mortality rate in screening and control arms was similar in females (RR 1.00; 0.95 to 1.06) and in males (RR 1.01; 0.97 to 1.05). There was no difference between screening and control arms in other cause mortality in females (RR 0.99; 0.94 to 1.05) or in males (RR 1.01; 0.97 to 1.05). CRC was the cause of death for 170 persons in the screening arm and 164 persons in the control arm (table 3). We did not find any difference in CRC mortality rate between the screening and the control arm (RR 1.04; 0.84 to 1.28). In males, the CRC mortality rate ratio was 0.88 (0.66 to 1.16) and in females, 1.33 (0.94 to 1.87). The interaction between study arm and gender in CRC mortality rate was of borderline significance (p=0.06). The excess mortality rates gave similar estimates: RR 1.05 (0.84 to 1.30) in males and females combined, and 0.94 (0.71 to 1.24) in males and 1.26 (0.88 to 1.80) in females. Figure 2 shows the cumulative CRC mortality proportion per 100 000 persons in screening and control arms for up to 8.3 years from study entry. There was no difference in cumulative CRC mortality between screening and control arm. In females the cumulative CRC mortality in the screening arm was consistently larger than in the control arm, while in males cumulative CRC mortality in the screening arm was smaller than in the control arm.
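The sex difference can be approximately cross-checked from the published figures alone. The sketch below is not the method used in the study (which was a Cox model with a likelihood ratio test); it is a cruder Wald-type test on the ratio of the two rate ratios, with standard errors back-calculated from the rounded 95% CIs. The function names are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower: float, upper: float) -> float:
    """Standard error of log(RR), recovered from a 95% CI on the RR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def interaction_wald(rr_a, ci_a, rr_b, ci_b):
    """Ratio of two independent rate ratios with a two-sided Wald p value."""
    log_ratio = math.log(rr_b) - math.log(rr_a)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = log_ratio / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return math.exp(log_ratio), z, p


# CRC mortality rate ratios by sex, as reported above.
ratio, z, p = interaction_wald(0.88, (0.66, 1.16), 1.33, (0.94, 1.87))
print(f"female/male ratio of RRs {ratio:.2f}, z = {z:.2f}, p = {p:.3f}")
```

This approximation gives a p value of roughly 0.07, in the same borderline range as the reported p=0.06; a small discrepancy is expected because the inputs are rounded and the test differs from the one used in the study.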
Smoothed CRC mortality rates and CRC mortality rate ratios as a function of follow-up time for screening and control arms by gender are shown in figure 3. There was no consistent pattern in the rate ratios by sex and follow-up time. The difference between the arms in the overall numbers of CRC deaths was small: at most, three deaths per 2-year interval (except early deaths in the interval 0–2 years) (table 4).
Our community-based RHS study did not show any difference in CRC mortality between arms (RR=1.04). Despite the estimated 12% (RR=0.88) reduction in CRC mortality rate found in males and 33% (RR=1.33) increase in females, neither of these rate ratios was statistically significant. The difference between males and females in CRC mortality rate ratio was of borderline significance. Our estimates of the excess mortality rate due to CRC were similar to the above results.
Our study is an RHS study in contrast to a scientific randomised controlled trial (RCT). The major difference between the two types of studies is in the origin of the observations: they are either a by-product of a routine activity (RHS) or specifically designed for a research purpose (RCT). Therefore, an RHS study is particularistic and based on routine health services, whereas the objective of an RCT is abstract and general, that is, scientific. Randomisation and timing are similar but ethics, funding, blinding and several other aspects differ between RHS and RCT.17
Based on information from the earlier RCTs,29 the effect of screening in reducing CRC mortality is likely to be observed after a longer follow-up than that in the current study, with a median 4.5 years (maximum of 8.3 years) of follow-up. This is supported by the results of Mandel et al,1 who found only a 6% reduction in CRC mortality after 13 years of follow-up with biennial screening, and the reduction was 21% in a later analysis.2 Originally, our study was planned to find a statistically significant reduction in CRC mortality if the true effect had been 20%. According to the recent meta-analysis, the estimate of the mortality effect was 15%.7 A longer follow-up would reduce uncertainty in our effect estimate and narrow the CI. Therefore, it is possible that the beneficial effect of biennial screening with FOBT on CRC mortality emerges only after 6–10 years of follow-up.1
The excess mortality rate measures both the direct and indirect mortality due to CRC.27 Unlike CRC mortality, the excess mortality does not depend on the reliability of classification of deaths, as it compares all-cause mortality in patients with CRC with that in comparable CRC-free persons. Because the estimate of CRC mortality rate was similar to that of the excess mortality, the absence of a screening effect is unlikely to be due to misclassification of the causes of death.
The need for analysis of the screening effect at hand was spelled out with the Ministry of Social Affairs and Health in Finland, in an agreement on evaluation after the implementation phase of the programme and before making any decision on its future. The implementation phase of the screening programme was based on gradual expansion in the first 6 years (2004–2009) and on follow-up of five more years (2010–2014).
Our study is a part of Finnish public health and therefore many factors differ compared to earlier screening trials.1,3 In routine service screening, there is more variation in available resources (both human and material), the population may be less motivated, and experts are not as devoted and do not follow the guidelines as strictly as in scientific trials. Despite the high compliance (70%), selection among our study subjects cannot be ruled out. In randomised scientific trials, the eligibility of study subjects is guaranteed either by inclusion or exclusion criteria as part of the study protocol. These selective processes also introduce a possibility for incomparability when the test is applied to an unselected target population.
There are also substantial differences in various patient and health services characteristics between the old trials and the current study. Survival of patients with CRC has improved substantially from the time of the early clinical trials to the beginning of 2000, from 40% to close to 60% in all Nordic countries,13 and a steady improvement of 5-year survival from colon and rectal cancer has been observed in developed countries.30,31 Such improvement in survival of the controls leads to a smaller difference in CRC mortality between randomised groups and thus lower statistical power to detect it.
The uptake of screening was relatively high in our programme. Usually, the uptake is lower in routine application (RHS) than in controlled trials (RCT). In Finland, the uptake was 69%. It was less than that reported from the US trial (Minnesota Colon Cancer Control Study)1 (78%), but better than that of the Nottingham Study3 (60%). Therefore, it is not likely that our lack of significant effect can be explained by lower participation.
In the current study, the sensitivity of FOBT improved after the first years of screening.21 The most conclusive data are provided by the longest follow-up. Unfortunately, these data had, at the same time, the poorest sensitivity. This indicates a learning curve. Uptake in the Bowel Cancer Screening Program32 in England was 57% in the first round, 61% in the second and 66% in the third round. A similar increase in uptake by round was also observed in our programme.33 The improved sensitivity and better uptake both indicate a potential improvement in the effect in the future.
The ideal design would have been to compare FOBT, faecal immunochemical test (FIT) and no screening. The reported FIT positivity proportion varies from 4% to 10%,34,35 which is high for routine screening with colonoscopy in screen positives. It is possible that FIT would have shown a mortality reduction even at the same positivity proportion threshold as our FOBT (3%), because in such a case more individuals with bleeding would have been identified. The reason for bleeding is, however, unknown. For the time being, there is no direct evidence of improved effectiveness of FIT compared to FOBT.
Contamination of the control arm by wild screening is likely to be small in Finland. CRC screening for average-risk individuals without symptoms is not common, and regular GP check-ups for healthy individuals are not a common practice. Faecal occult blood tests are used mainly in clinical settings for follow-up of bowel diseases or as part of diagnostics, not for screening purposes. In summary, the effectiveness of routine screening for CRC with FOBT still remains open.
Several international bodies, including the American Cancer Society29 and the EU,15 recommend routine screening for CRC with FOBTs. The recommendations are based on evidence from randomised studies,1,3,5,6,8 but they do not include the evaluation of the process as part of the routine health service system. The Finnish programme is consistent in process results with, and not significantly different in the mortality outcome from, the evidence available so far.
The difference between males and females in the effect on CRC mortality was of borderline significance. Three trials2,4,6 and two case–control studies9,10 reported reduction in CRC mortality for males and females separately. The reduction in CRC mortality in males was between 5% and 37%, and in females between 8% and 26%. In our study, the reduction in CRC mortality in males (12%) is consistent with the previous studies. However, we are concerned about the observed 33% increase in CRC mortality in females, which has not been reported by any of the other studies. Regarding biennial FOBT, there is concern about poor sensitivity of the guaiac FOBT in Finland, especially in women.21,36 Also, the Norwegian screening study on flexible sigmoidoscopy, either combined with FOBT or alone, reported a different effect on mortality between men and women37: screening was effective in men (RR for CRC mortality 0.58, 95% CI 0.40 to 0.85) but not in women (RR 0.91, 0.64 to 1.30) after a median follow-up of 10.9 years. A similar effect in CRC mortality was observed after adenoma removal in males and females: the SMR for CRC was 0.86 (0.74 to 1.00) and 1.06 (0.93 to 1.22), respectively.38
There are different hypotheses as to why women may not benefit as much as men from screening with FOBTs,2 including proportionally more adenomas and CRCs in the proximal colon in women than in men, and that the biology may be different (sessile serrated adenoma pathway) between men and women.2 Also, men have more comorbidity from other serious health problems than women at this age, and it has been speculated that this may lead to differences in cause of death coding between men and women. This was not the case in our material, however.
We found a substantial difference in uptake between males (62%) and females (76%). It is obvious that participation is differently selective in men and in women. We have earlier reported, for example, that marital status was of high importance in uptake, especially in men; married men participate more often than those who are single.39 However, this difference in uptake should increase rather than account for the difference in effect by gender. The difference in the CRC mortality effect between males and females highlights the need for a reanalysis with longer follow-up.
We did not find any effect in an RHS study of FOBT screening on CRC mortality. The substantial effect difference between males and females is inconsistent with the evidence from randomised clinical trials and with the recommendations of several international organisations. Even though our findings are inconclusive, they highlight the importance of randomised evaluation when new health policies are implemented.
Contributors All authors read and approved the final version of the manuscript. Specific author contributions were as follows: NM, MH were involved in the study concept and design. NM, TP were involved in the participant recruitment and characterisation. OM, TP, M-SV were involved in the implementation of screening test. KS, JP were involved in the data and statistical analyses. All authors were involved in the manuscript drafting.
Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.