Faculty of Behavioural and Movement Sciences
Faculty of Social Sciences

Meta-analyses in mental health research
A practical guide

Contents
Step 4. Calculating and pooling effect sizes
  Effect sizes based on continuous outcomes
  How to find the data needed for calculating effect sizes
  Interpreting effect sizes
  More outcomes in one study
  Effect sizes based on dichotomous outcomes
  Pooling of effect sizes
  When can effect sizes be pooled?
  The random and the fixed effects model
  The forest plot: An excellent summary of a meta-analysis
  Sensitivity analyses
  Key points
References
0. Introduction

What are meta-analyses and why are they important?
[Figure: chart with a vertical axis running from 50,000 to 400,000, presumably showing the growth in the number of published studies per year]
In this book we will focus on meta-analyses and how they can be conducted. It can also be used for systematic reviews: follow the same steps, but leave out the procedures for statistically integrating the results of the included studies.
ed across all included studies with the same method. The analyses of the outcomes can also be conducted in one uniform way across all studies (Riley, Lambert, & Abo-Zaid, 2010). Because all datasets are integrated, the statistical power to test predictors of outcome is much better than in individual randomized trials, which are usually designed only with sufficient power to find a significant overall effect. For examining predictors of outcome, at least four times as many participants are needed as for finding an effect of the treatment (Brookes et al., 2004). On average, 64% of the trials in any field contribute their primary data to individual patient data meta-analyses (Riley, Simmonds, & Look, 2007).
In this book we will not describe the methods of network or individual patient data meta-analyses, but will focus on traditional meta-analyses.
Patients can make use of the results of meta-analyses when they have to make decisions about treatments. And finally, researchers can use the results of meta-analyses in several ways: to generate new research questions that are not well examined in the existing trials, to identify methodological limitations of existing trials, or to estimate sample sizes for future trials.
There are, however, also several problems with meta-analyses. A meta-analysis can never be better than the studies it summarizes. If none of the included studies has been done according to established methods for trials and the risk of bias (chapter 3) is high, then a meta-analysis integrating the results of these studies cannot solve the problem of the risk of bias in these studies. Sources of bias in trials cannot be controlled by the method of meta-analysis. This feature of meta-analyses is often referred to as garbage in, garbage out.
Another problem of meta-analyses, especially in mental health care, is that there are always differences between studies, for example in terms of the exact inclusion criteria, setting, recruitment methods, or treatments and the delivery of treatments. Trials examining an intervention are hardly ever exact replications of each other. Some critics say that such studies cannot be compared because of these differences. They say that meta-analyses combine apples and oranges.
The file drawer problem refers to the problem that not all relevant studies are actually published and are therefore often not included in meta-analyses. When these unpublished studies have negative outcomes not supporting the effectiveness of an intervention while published studies do support this effectiveness, then meta-analyses may seriously overestimate the true effects of an intervention. We will come back to this problem in chapter 5.
A final problem of meta-analyses is the agenda-driven bias of researchers who conduct the meta-analyses. Many meta-analyses are written by researchers who are biased towards the intervention they examine in the meta-analysis. This kind of bias is often called researcher
A few years later, in 1980, Gene Glass was involved in the most extensive illustration of this new method, published on the outcomes of psychotherapies. This book, with the title Benefits of Psychotherapy, included the effect sizes from 375 outcome studies of psychotherapies for neurotic disorders, with more than 4,000 patients in the treatment and control conditions. The book was a response to the famous article by Eysenck (Eysenck, 1952), in which he claimed that psychotherapy is not effective: many patients receiving treatment do indeed get better, but they also get better without treatment. Smith and colleagues found an effect size of 0.68 for the psychotherapies, which by today's standards is a moderate to large effect. Modern meta-analyses, however, have shown that this effect size was probably considerably overestimated, because of the high risk of bias in many studies and publication bias. But at that time this book refuted the article from Eysenck with a new and innovative method, and Eysenck seemed to have been wrong about the effects of psychotherapy.
allegiance. We will see later, in chapter 3, that researcher bias may affect the outcomes of individual randomized controlled trials, but that is probably also true for systematic reviews, where researchers with an allegiance towards a particular intervention may be inclined to interpret outcomes of included trials more positively than independent researchers.
Key points
Because of the exponential growth of research there is a need to integrate the results of multiple studies
Traditional reviews are not systematic and transparent; they cannot meet the need for integration of research
Systematic reviews have a reproducible methodology
In a meta-analysis the results of individual studies are statistically integrated into one (more precise) outcome
Systematic reviews have many advantages for professionals, patients, policy makers and researchers
But there are also risks: bias, garbage in garbage out, and combining apples and oranges
Step 1. Defining research questions for meta-analyses: PICO
Introduction
All good scientific research starts with a well-formulated research question that is important and relevant. That is no different in meta-analyses. Before doing a meta-analysis it is important to think about the goal of the meta-analysis, why this meta-analysis is important, and how this new meta-analysis compares with earlier meta-analyses. In step 6 (on reporting and publishing meta-analyses) we will see that a published meta-analysis begins with an Introduction section in which these issues are described, including the background of the problem, the importance of the problem, earlier (meta-analytic) research, and why this new meta-analysis is needed.
In meta-analyses of randomized trials examining an intervention, the Introduction section should always end with the research question, in which the Participants, the Interventions, the Comparisons and the Outcomes are specified. Together these four parts of the research question are summarized as PICO. A good research question for a meta-analysis always contains the four elements of the PICO acronym.
A research question for a meta-analysis could be, for example: What is the efficacy of cognitive behavior therapy (CBT) on sleep diary outcomes, compared with control, for the treatment of adults with chronic insomnia? (Trauer, Qian, Doyle, Rajaratnam, & Cunnington, 2015). All four elements of the PICO are in there: P: adults with chronic insomnia; I: CBT; C: control groups; O: sleep diary outcomes.
Or another example: The effects of eye movement desensitization and reprocessing (EMDR) versus cognitive-behavioral therapy (CBT) for adult posttraumatic stress disorder (Chen, Zhang, Hu, & Liang, 2015). It also has all elements of the PICO acronym: P: adults with posttraumatic stress; I: EMDR; and C: CBT. The outcomes are not specified in this research question, but most researchers in this field know that these treatments are focused on symptoms of posttraumatic stress, so in this case it is not necessary to state that explicitly in the research question.
A well-formulated research question according to the PICO acronym is not only a good research question in itself, but also helps with defining the inclusion and exclusion criteria for a meta-analysis.
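The four PICO elements can also be thought of as a small data structure from which both the research question and the inclusion criteria are derived. The sketch below is a hypothetical illustration (not part of the book's method), using the insomnia example from the text:

```python
from dataclasses import dataclass

@dataclass
class PICO:
    """A meta-analysis research question split into its PICO elements."""
    participants: str
    intervention: str
    comparison: str
    outcome: str

    def question(self) -> str:
        # Assemble the four elements into a one-sentence research question.
        return (f"What is the effect of {self.intervention} versus "
                f"{self.comparison} on {self.outcome} in {self.participants}?")

# The insomnia example from the text, expressed as a PICO record:
insomnia = PICO(
    participants="adults with chronic insomnia",
    intervention="cognitive behavior therapy (CBT)",
    comparison="control groups",
    outcome="sleep diary outcomes",
)
print(insomnia.question())
```

The same record can later serve as a checklist when screening studies: a trial is eligible only if it matches all four fields.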
they receive, because placebo does not have the side effects that many active medications do have (Moncrieff, Wessely, & Hardy, 2004).
For psychological interventions, it is even more complicated to choose the right control group (Mohr et al., 2009). That is true for randomized trials, but also for meta-analyses in which the results of these individual trials are integrated. Waiting list control groups are often used to examine the effects of psychological interventions, because they motivate people to participate in the control group and the trial, and the participants get at least some kind of intervention. However, when people are on a waiting list they probably do nothing to solve their problems, because they are waiting for the therapy (Mohr et al., 2009, 2014). So, a waiting list may stimulate participants not to use their normal ways of coping with problems. If they had been assigned to a care-as-usual control group, a part of them would possibly have taken other actions to solve their problems. And because these participants are willing to be randomized to a waiting list control group, they probably have high expectations of the effects of the therapy. It is known that high expectations result in better outcomes of interventions. So, waiting list control groups may considerably overestimate the effects of an intervention, for example when compared to usual care.
Another type of control group that is often used in randomized trials of psychological interventions is care-as-usual. People participating in such a trial have the chance to get the intervention or nothing, where nothing means that they can use the care they would usually get if they did not participate in the trial. A big problem is that usual care very much depends on the setting and country where the trial is conducted. For example, many European countries have a national health insurance system in which all patients have access to health care; whether psychological treatments are part of this differs per country. In countries without such a national system, patients have to pay for their own insurance and health care, which means that access depends very much on income. All this implies that care-as-usual can vary considerably between settings and countries, and that it may be problematic to pool the data from trials using care-as-usual conditions.
Pill placebos are also sometimes used as control groups for psychological interventions. In these trials there is usually also a condition in which participants receive a medication, because a placebo condition alone in such a trial does not make much sense. The advantage is that this makes it possible to examine the effects of psychological interventions with the same comparator as for medications aimed at the same condition.
Another type of control group that is often used is the placebo therapy. In such a control condition, usually a very basic intervention is given, often based on client-centered therapy, in which the therapist is empathic, friendly and supportive, but does not use specific intervention techniques. The idea is that this control condition provides everything all therapies have in common, and it is compared with a real intervention in which specific techniques are used. This should make it possible to examine what the specific techniques contribute to the effects of the intervention, above the basic support given in all interventions. However, this approach is problematic for several reasons. First, non-directive supportive counseling (the control condition) can have considerable effects in itself. For example, we found that in depression, counseling has effects that are comparable to cognitive behavior therapy (Cuijpers et al., 2012). We also found that the studies that use these control groups are often heavily influenced by researcher allegiance. Researcher allegiance means that a researcher is a proponent of a specific intervention. In counseling for depression, we found that in studies without researcher allegiance the effects of counseling were comparable to those of other therapies. The other problem is that when such a control condition is not delivered as a serious intervention (because it is meant as a control group), participants will probably notice that it is not a serious intervention, and the effects will be influenced by this. So, when a psychological placebo is not convincing enough, the superior effect of the active intervention may be caused by this. And it is often very
problem at which the intervention is aimed. So, the results of these studies (including meta-analyses) should be interpreted with caution.
Meta-analyses can also integrate the results of studies that are not aimed at examining the effects of interventions. In principle, meta-analyses can integrate all outcomes of studies that have a standard error (see Step 4). For example, we recently examined whether people suffering from depression have a higher risk of dying within a given period than people without depression (Cuijpers, Vogelzangs, et al., 2014). We examined that in almost 300 prospective cohort studies in which some people had depression at baseline and others did not. All of these studies had assessed how many of these people had died at follow-up. It was indeed found that people with depression had a higher chance of dying than people without depression.
In the same way it is possible to pool the results of studies that have examined the correlation between two variables, studies that have examined psychometric properties of psychological measurement instruments, studies that examine differences between populations, and all kinds of other studies.
Key points
Any meta-analysis begins with a good research question, just like any other study
Good research questions for meta-analyses of randomized trials follow the PICO acronym and define and specify the Participants, Intervention, Comparison and Outcome
Comparison conditions for psychological interventions (such as waiting lists, care-as-usual, psychological placebos) may be problematic
Apart from randomized trials, the results of all kinds of other studies can be integrated in meta-analyses as well
Step 2. Searching bibliographical databases
After defining your research question using the PICO acronym, the next step is to find trials that have examined this research question and that can be included in your meta-analysis. In this chapter we will describe how you can identify these studies. First you will have to choose the bibliographical databases that you will search. Then you will have to develop a strategy for searching these databases and identify the studies that meet your inclusion criteria.
If you work in an institute, such as a university, with a library where information specialists or librarians work, it is highly recommended to involve one of these information specialists in the process of identifying studies for inclusion in your meta-analysis. Information specialists make it their job to know which databases are available and how to search them. The quality of your searches increases considerably when you involve an information specialist.
In this chapter we will present databases that can be approached on the Internet. We give the names and URLs of these websites in a separate table (Table 2.1). However, the names and URLs of these websites often change, so if a URL is not correct, please search the Internet for the correct address.
Some of the databases that you will use are free, such as PubMed. For others you need a subscription, as for PsycInfo or the Cochrane database of trials. Unfortunately, a meta-analysis in mental health and the social sciences requires access to some of the paid databases and cannot be conducted without that access.
Table 2.1 Bibliographical databases

Core databases
  PubMed (www.ncbi.nlm.nih.gov/pubmed): Database of the US National Library of Medicine

Citation databases
  Thompson Reuters Web of Knowledge (www.webofknowledge.com): Thompson Reuters citation database
  Scopus (www.elsevier.com/solutions/scopus): Elsevier citation database
  Google Scholar (scholar.google.com): The largest citation database, developed by Google

National and regional databases
  LILACS (http://lilacs.bvsalud.org/en/): Latin America
  Chinese Biomedical Literature Database (CBM) (www.imicams.ac.cn/publish/default/eng): Institute of Medical Information & Library
  China National Knowledge Infrastructure (CHKD-CNKI) (http://oversea.cnki.net/kns55/default.aspx): Database of Chinese studies
  IndMED (http://indmed.nic.in/indmed.html): Database covering peer-reviewed Indian biomedical journals

Dissertations and theses
  ProQuest Dissertations (www.proquest.com/products-services/dissertations): Database of dissertations
  ProQuest Dissertations UK & Ireland (www.proquest.com/products-services/pqdt_uk_ireland.html): Database of dissertations from Great Britain and Ireland
  Deutsche Nationalbibliothek (www.dnb.de/DE/Wir/Kooperation/dissonline/dissonline_node.html): The German National Library offers access to German dissertations
  CNKI (http://oversea.cnki.net/kns55/Navi/CDMDNavi.aspx?NaviID=36&XueKE=1): Database of Chinese theses

Other reviews and guidelines
  DARE (www.crd.york.ac.uk/CRDWeb): The Database of Abstracts of Reviews of Effects of the University of York
  National Guideline Clearinghouse (www.guideline.gov): NGC is a public resource for evidence-based clinical practice guidelines

Trial registers
  WHO trial search (http://apps.who.int/trialsearch/)
There are also many national and regional databases, such as LILACS (a database of the scientific and technical literature of Latin America and the Caribbean) and indMED, covering Indian biomedical journals. There are also several Chinese databases you can search (see the overview in Xia, Wright, & Adams, 2008), such as the Chinese Biomedical Literature Database from the Institute of Medical Information & Library and the China National Knowledge Infrastructure.
Some databases offer access to doctoral dissertations, such as ProQuest, which offers access to dissertations from the US but also has a separate collection for dissertations from the United Kingdom and Ireland. Other websites offer access to German (German National Library) and Chinese (CNKI) dissertations.
the MeSH tree. In Table 2.2 you can see the branch of the MeSH tree for Mental disorders (taken from www.nlm.nih.gov/mesh/trees.html). In this branch you can follow each of the sub-branches into new branches. As an example we have given in Table 2.2 the sub-branches for Anxiety disorders, Mood disorders and Personality disorders, but each of the other branches also has sub-branches (see www.nlm.nih.gov/mesh/trees.html). Alternatively, you can search for terms that are included in the MeSH tree in the MeSH browser at www.ncbi.nlm.nih.gov/mesh.
As indicated earlier, each database has its own taxonomy. Embase, for example, has the Emtree thesaurus. The taxonomies of the databases are built in different ways, but they can all be searched by entering the right key words.
course this is only an example illustrating the use of AND and OR, not a real search string.
Most databases allow you to use truncation, wildcards and proximity operators, which can be useful when you do searches. Truncation is a searching technique in which a word ending is replaced by a symbol, usually the asterisk (*). For example, if you use random* as a search term in PubMed, you will get all the records that include terms like random, randomized, randomised and randomly. A wildcard (?) replaces a single letter in a word. For example, the term m?n will identify records with the terms man, men, min, mun, etc. Proximity operators can also be used in some databases (for example, ADJ in Ovid databases): depression adj3 disorder returns records where depression and disorder are within three words of each other, in any order (adj3 means adjacent within three words).
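What truncation and wildcards do server-side can be mimicked locally with regular expressions. The sketch below is only an illustration of the matching behavior; the `truncation_to_regex` helper is hypothetical, not a feature of any database:

```python
import re

def truncation_to_regex(term: str) -> str:
    """Translate database truncation (*) and wildcard (?) syntax into a regex.

    * matches any word ending; ? matches exactly one letter. This mimics,
    locally, what the database does when it expands such a search term.
    """
    pattern = re.escape(term).replace(r"\*", r"\w*").replace(r"\?", r"\w")
    return pattern

# random* matches random, randomized, randomised, randomly, but not rank:
rx = re.compile(truncation_to_regex("random*"), re.IGNORECASE)
hits = [w for w in ["random", "randomized", "randomised", "randomly", "rank"]
        if rx.fullmatch(w)]
print(hits)

# m?n matches man and men (one letter between m and n), but not moon:
rx2 = re.compile(truncation_to_regex("m?n"))
print([w for w in ["man", "men", "moon"] if rx2.fullmatch(w)])
```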
Search filters can also be very useful when conducting searches. On the website of the InterTASC Information Specialists' Sub-Group Search Filter Resource (www.york.ac.uk/inst/crd/intertasc) you can find an overview of search filters for many different types of studies in several biomedical databases. PubMed and other databases also have their own search filters. For example, PubMed has a MeSH term for randomized trials (Randomized Controlled Trial[Publication Type]) and you can limit your searches with this term. But there are several other search filters you can use for limiting searches to randomized trials. A simple but more sensitive search filter for randomized trials in PubMed is: randomized controlled trial [pt] OR controlled clinical trial [pt] OR randomized [tiab] OR randomly [tiab]. And the Cochrane Collaboration has developed a highly sensitive search string for randomized trials, which aims to miss as few studies as possible (while accepting that it results in large numbers of records that have to be screened): ((randomized controlled trial [pt]) OR (controlled clinical trial [pt]) OR (randomized [tiab]) OR (placebo [tiab]) OR (drug therapy [sh]) OR (randomly [tiab]) OR (trial [tiab]) OR (groups [tiab])) NOT ((animals [mh] NOT humans [mh])).
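A methods filter like this is combined with the topic terms of your PICO by an AND. As a sketch, the snippet below joins the Cochrane filter quoted above with some example topic terms; the `build_query` helper is a hypothetical illustration, not a tool provided by PubMed:

```python
# The Cochrane highly sensitive PubMed filter for randomized trials,
# exactly as quoted in the text:
COCHRANE_RCT_FILTER = (
    "((randomized controlled trial [pt]) OR (controlled clinical trial [pt]) "
    "OR (randomized [tiab]) OR (placebo [tiab]) OR (drug therapy [sh]) "
    "OR (randomly [tiab]) OR (trial [tiab]) OR (groups [tiab])) "
    "NOT ((animals [mh] NOT humans [mh]))"
)

def build_query(topic_terms: list[str], methods_filter: str) -> str:
    # OR the synonyms for one PICO element together, then AND the result
    # with the methods filter.
    topic = " OR ".join(f"({t})" for t in topic_terms)
    return f"({topic}) AND ({methods_filter})"

query = build_query(
    ["generalized anxiety", "generalised anxiety", "worry*"],
    COCHRANE_RCT_FILTER,
)
print(query)
```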
A simplified example
So how does all this work when you actually do searches for a meta-analysis? Suppose that you want to do a meta-analysis of cognitive behavior therapy for depression compared with waiting list control groups, and that you start your search in PubMed. A simple search strategy could focus on three of the four elements of your PICO:
You could also add search terms for the waiting list component (the comparator), but it is also possible to leave that out. If you leave it out you will get all studies on cognitive therapy, regardless of the type of control group or comparator (because you don't limit the searches to the waiting list control group). You can do that when the terms for the comparator would be ambiguous or unclear, as long as the number of resulting records does not become too large. When you do the search without waiting list terms you will get about 1,500 records (search conducted in 2015). If you add terms for the waiting list (Waiting AND list) you will end up with about 50. The 1,500 records are not too many to screen, while only 50 seems very few, with a risk of missing too many studies (because the term waiting list may not be included in the abstract or title). So, in this case it seems better to leave out the terms for the waiting list.
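Record counts like these can also be obtained programmatically through NCBI's E-utilities esearch endpoint (with retmax=0 the response contains only the total count, no record IDs). The sketch below only constructs the request URL; the search term is an illustrative approximation of the strategy described, not the exact string used for the counts above:

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch endpoint.
BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term: str) -> str:
    # retmax=0: we only want the total count of matching records.
    return BASE + "?" + urlencode({"db": "pubmed", "term": term, "retmax": 0})

# An illustrative query for CBT-for-depression trials (hypothetical terms):
url = esearch_url('("cognitive behavior therapy" OR "cognitive therapy") '
                  'AND depress* AND randomized')
print(url)
```

Opening the printed URL in a browser, or fetching it with an HTTP client, returns XML whose Count element is the number of matching records.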
This issue can also be illustrated with searches for psychotherapy for generalized anxiety disorder. Search terms that can be used for identifying studies on generalized anxiety disorder are: generalized anxiety or generalised anxiety or worry*. If you combine these terms with the search string for randomized trials, you will find only about 1,400 records (in 2015). So you do not need to extend your search with terms for the other parts of the PICO, because the search does not seem too broad as it is.
Key points
Search at least PubMed, PsycInfo and the Cochrane database
Develop search strings based on your PICO
Find a balance between sensitivity and precision in your search strategy
Ask help from a librarian
Use text words as well as MeSH terms
Use Boolean operators, truncation, wildcards, proximity operators, and search filters
Step 3. Selection of studies, retrieval of data and risk of bias
Selection of studies
When you have finished the searches you will first have to save the results in files. These files should be in a format that can be used by your reference software, such as EndNote or Reference Manager. It is beyond the scope of this book to describe exactly how these packages work, but it is important that you use one of them, because they can help you to remove duplicate abstracts.
Because you have used more than one bibliographical database for your searches, it is highly probable that these databases have identified identical abstracts. You will have to remove these duplicate abstracts. Removal of duplicate abstracts is a requirement for the PRISMA flowchart of the study selection process that you will have to make, according to the PRISMA guidelines for systematic reviews. So, screening the results of each of the databases separately (and thus reading some
Figure 3.1 PRISMA flowchart

[PRISMA 2009 Flow Diagram: identification, screening, eligibility and inclusion phases, ending with the number of studies included in the qualitative synthesis and the number included in the quantitative synthesis (meta-analysis).]

From: Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group (2009). Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med 6(6): e1000097. doi:10.1371/journal.pmed.1000097. For more information, visit www.prisma-statement.org.
abstracts in more than one database) is not possible if you follow the
PRISMA guidelines.
First you should save the results of your searches in files. How that is done depends on the database you have searched. Table 3.2 describes how you can save a file from PubMed; each database is different, and it is not possible to give a comprehensive description for all of them. The next step is to import these files into your reference software package.
When you have imported the results from all databases that you have searched into your reference software, you can start removing duplicate records. Most reference software packages have methods to do that automatically.
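A rough sketch of what such automatic duplicate removal does is given below, assuming records with a title and source field (hypothetical field names; real reference managers also match on authors, year and DOI):

```python
def normalize(title: str) -> str:
    # Lowercase and keep only letters and digits, so that trivial formatting
    # differences between databases do not hide a duplicate.
    return "".join(ch for ch in title.lower() if ch.isalnum())

def deduplicate(records: list[dict]) -> list[dict]:
    # Keep the first record for each normalized title, drop the rest.
    seen, unique = set(), []
    for rec in records:
        key = normalize(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Hypothetical records: the same trial exported from two databases.
records = [
    {"title": "CBT for insomnia: a randomized trial", "source": "PubMed"},
    {"title": "CBT for Insomnia: A Randomized Trial.", "source": "PsycInfo"},
    {"title": "EMDR versus CBT for PTSD", "source": "PubMed"},
]
unique = deduplicate(records)
print(len(records), "records,", len(unique), "after deduplication")
```

Note that the counts before and after deduplication are exactly the numbers the PRISMA flowchart asks you to record.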
It is important that you note how many records you have found in each of the databases and how many records are left after removing the duplicates. These numbers are needed for the PRISMA flowchart. And don't forget to note the date on which you did the searches! You will have to report that in your paper.
After you have imported all results from the bibliographical databases into your reference software and have removed the duplicates, you are ready for the selection of records. You can do that in your reference software or you can export the records to a word processor document.
The selection of records in this phase is only about the decision whether or not to retrieve the full text of a study. So if there is any doubt whether a record may describe a study that meets the inclusion criteria for your meta-analysis, you should retrieve the full text.
The PRISMA flowchart does not require you to specify why a record is excluded at this stage; you can simply say that records were excluded based on title and abstract. But you will have to report exactly how many records were excluded.
If you have saved your file, first open the EndNote library into which you want to import it.
Click on Import in the File menu.
Click on the box next to Import options.
If PubMed (NLM) is not one of the options you can choose from, click on Other filters and select it from the menu.
Click on Import and the records will be added to your EndNote library.
You can add files from other bibliographical databases through the same procedure, selecting the right import filter.
The inclusion criteria are again based on your PICO. You will have to read each of the papers carefully and see whether it meets your inclusion criteria. This should also be done by two independent researchers, with disagreements solved by discussion and a final decision by a third researcher.
If you decide to exclude a full-text study, you have to give a reason for that. The PRISMA flowchart requires that you report how many full-text papers you retrieved, how many were excluded, and what the reasons for exclusion were. This process is often not straightforward, because studies can be excluded for several reasons, and giving one reason why a study was excluded can be confusing when there were other reasons why it would have been excluded. A solution is to make a hierarchy of reasons for exclusion. In most meta-analyses, however, researchers simply indicate the first reason they found in the article why it should be excluded.
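The hierarchy-of-reasons approach can be sketched as follows; the criteria labels and their order below are hypothetical examples, not a prescribed list:

```python
# A fixed hierarchy of exclusion reasons, from highest to lowest priority.
# (Illustrative labels only; define your own hierarchy per meta-analysis.)
HIERARCHY = [
    "not a randomized trial",
    "wrong population",
    "wrong intervention",
    "no usable outcome data",
]

def exclusion_reason(failed_criteria: set[str]) -> str:
    # Walk the hierarchy top-down and report only the first matching reason,
    # so every excluded study gets exactly one, consistently chosen, reason.
    for reason in HIERARCHY:
        if reason in failed_criteria:
            return reason
    return "included"

# A study that fails two criteria is reported under the higher-ranked one:
print(exclusion_reason({"no usable outcome data", "not a randomized trial"}))
```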
In this phase a decision for inclusion is not definite. You may include a study because it meets the inclusion criteria, but find out later, for example, that it is impossible to calculate an effect size because not all the necessary data are given in the paper. Or you may find out that the study is actually a secondary paper about another study that you already included.
In the rest of this chapter we will describe the first two categories of data to be retrieved. The third category, the calculation of effect sizes, will be described in the next chapter.
There are no fixed rules for which characteristics of the studies should
be extracted. When you have read scientific articles about the subject of
your meta-analysis and you have read the full-texts of the studies you
have retrieved, you probably know which characteristics of the studies
you should collect. But it is not uncommon that during the process of ex-
tracting data you come up with other characteristics you should retrieve.
In general you can say that you should at least collect data about the
elements of your PICO. That means that you should collect data about
the participants in the trials, the interventions, and the comparators. The
outcomes (the last part of the PICO) are used for the calculation of the
effect sizes (described in the next chapter).
So, for the participants, you can for example record how they were recruited, the exact definition of their problem (for treatment studies), the exclusion criteria that were used in the trials, and sociodemographic characteristics such as age, gender, socioeconomic status, and the proportion of participants from minority groups.
Sources of bias
There are several sources of bias that should be assessed when conducting a meta-analysis. More information about these sources of bias and how to assess them can be found in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins & Green, 2011), which gives an excellent overview of the different types of risk of bias and is available for free online (http://handbook.cochrane.org). Below we will describe the most important ones.
Selection bias
Selection bias refers to systematic differences between the groups
that were randomized in the trial. One of the strong characteristics of ran-
domized trials is that participants are assigned to conditions in a random
way. When that is done correctly, there are no baseline differences between the two (or more) groups that are randomized. If, however, this assignment is not done well, there may be systematic differences between the two (or more) groups, which violates this basic principle of randomized trials. It also means that if differences between these groups are found after treatment, these may not be caused by the treatment but by the systematic baseline differences between the randomized groups.
Selection bias can result from errors in the randomization process. First, the assignment of participants in trials should be done in a random way, meaning that chance alone determines whether a person is assigned to one condition or the other. The process of generating the order in which participants are assigned to conditions is usually called sequence generation. There are several adequate ways to do this, such as using a random numbers table or a computerized random number generator (like www.random.org); coin tossing or throwing dice are also adequate ways to generate random numbers. Inadequate ways of generating the assignment order include assignment by date of birth, date of admission, or patient record number. Assignment based on the judgment of a clinician or the preference of the participant is also inadequate.
But adequate sequence generation is not the only point where the allocation of participants can go wrong. It is also important that the researchers and participants cannot foresee the assignment, because then they could influence the process of randomization. It is important, therefore, that the allocation sequence is concealed from those involved (allocation concealment).
Detection bias
Detection bias refers to systematic differences between groups
in how outcomes are determined. Detection bias can be prevented by
blinding (or masking) of participants, the personnel involved in the study,
and outcome assessors.
In medication trials it is possible to fully blind patients who partici-
pate. Patients receive the medication that is tested or a placebo pill that
is exactly the same as the medication, but without the active substance.
In such trials neither the patients nor the doctors who treat them know whether they have received the medication or the placebo. In psychological interventions this blinding of participants is typically not possible: participants know whether they are randomized to an intervention or to a waiting list, usual care, or pill placebo. Blinding of participants is simply not possible in most psychological interventions in mental health care. That means that the effects that are found for psychological interventions may very well be caused by factors other than the specific techniques used in the intervention. For example, it is very well possible that the effects are (partly) caused by the expectations a patient has of the intervention. It is well known that expectations are associated with the outcome of therapies (Constantino, Arnkoff, Glass,
Attrition bias
One of the core elements of randomized trials is that participants who
are randomized are also all included in the analyses of outcome. In earlier
studies this element of trials was not considered important and analyses
of the outcomes were usually only applied to the ones who completed
the study. The participants who dropped out of the study were simply
ignored, and in some early trials it was even considered appropriate to
replace drop-outs from the study.
Nowadays it is usually understood that analyzing all randomized par-
ticipants is important for estimating the true effect of an intervention. It
is very well possible that the participants who drop out are also the ones
who do not benefit from the intervention and who get worse because of
the intervention or during the intervention. Focusing only on the ones
who do not drop out probably inflates the effect sizes considerably. There
is also empirical evidence from meta-analyses that studies that only in-
clude study completers find higher effects than studies that include all
randomized participants in the analyses (Cuijpers, van Straten, Bohlmei-
jer, Hollon, & Andersson, 2010).
But how can the data of participants who drop out be used in the analyses? We do not have these data, so how can they be used? There are several ways to estimate, or impute, these missing data, such as carrying the last available observation forward (last observation carried forward), multiple imputation techniques, or mixed models for repeated measurements (Crameri, von Wyl, Koemeda, Schulthess, & Tschuschke, 2015).
Note that there is a difference between participants who drop out of
the intervention and those who drop out of the study. People who drop
out of the intervention can still participate in the study by participating
in the assessments of outcome after the intervention. The problem of im-
puting missing data is mostly relevant for the participants who drop out
of the study.
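Of these imputation strategies, last observation carried forward is the simplest and can be sketched in a few lines of Python (a hypothetical illustration only; LOCF is widely criticized, and alternatives such as multiple imputation are usually preferred):

```python
def locf(scores):
    """Last observation carried forward: replace missing assessments
    (None) with the most recent observed value for that participant."""
    imputed, last = [], None
    for s in scores:
        if s is not None:
            last = s
        imputed.append(last)
    return imputed

# Hypothetical participant: baseline 24, week-4 score 18, then study drop-out
print(locf([24, 18, None, None]))  # [24, 18, 18, 18]
```

The post-test value for this participant would then be imputed as 18, the last score observed before drop-out.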
Reporting bias
In many randomized controlled trials several outcome measures are used. Researchers can be inclined to report only the outcomes for which significant results are found, or the outcomes with the largest effect sizes. This can affect the outcomes of meta-analyses considerably, because the pooled effects will then overestimate the true effects.
Researcher allegiance
In research on psychological interventions, the problem of researcher allegiance is also a potential threat to validity. Researcher allegiance can be defined as "a researcher's belief in the superiority of a treatment [and] the superior validity of the theory of change that is associated with the treatment" (Leykin & DeRubeis, 2009, p. 55). Many meta-analyses have shown that researcher allegiance is associated with considerably better outcomes for the preferred treatment (Dragioti, Dimoliatis, Fountoulakis, & Evangelou, 2015; Munder, Brütsch, Leonhart, Gerger, & Barth, 2013a).
For the assessment of the risk of bias it is again very important that
this is done by two independent researchers and that disagreements are
discussed until agreement is reached (when needed a third, senior, re-
viewer is involved).
Reporting of risk of bias in a meta-analysis is very important. First,
risk of bias should be clearly reported for each of the included studies in
a meta-analysis. This can be done in the table describing the characteris-
tics of the included studies that should be included in any meta-analysis.
Second, it is also important to report the aggregated results for the risk
of bias for all included studies together. For example, the percentage of
all studies that meet each criterion and the total number of studies that
meet all criteria. Third, a graphical representation of the risk of bias can also be useful; Figure 3.3 gives an example of how risk of bias can be represented graphically.
Apart from assessing the risk of bias in the studies that are included in
a meta-analysis it is of course also very important to examine whether the
risk of bias has an effect on the outcomes of the meta-analysis. We will
describe in Step 5 how that can be done.
Unfortunately, the Cochrane risk of bias assessment tool does not as-
sess researcher allegiance. This has to be assessed, therefore, separately
from the Cochrane tool.
Key points
Work through the retrieved records found in the searches and
retrieve the full texts of the papers that may meet your inclusion
criteria.
Work preferably with two researchers
Read the retrieved full texts of the papers carefully
Make a clear overview of in/exclusion criteria to guide this pro-
cess
Extract relevant data from the included studies on the partici-
pants, the intervention and general study characteristics
Assessment of validity of studies is of vital importance
The Cochrane risk of bias tool is a good instrument that assesses
major types of risk of bias: Adequate sequence generation; allo-
cation concealment; blinding; incomplete outcome data; selective
reporting; and other potential threats to validity
For psychological interventions, researcher allegiance is also an
important threat to validity of the included trials.
Step 4. Calculating and pooling effect
sizes
It is also possible to use effect sizes for counts, rates, and ordinal outcomes. Because such effect sizes are not much used in meta-analyses in mental health research, we will not focus on them here.
d = (M_intervention - M_control) / SD_pooled

In this formula, M stands for the mean and SD stands for the standard deviation. The pooled SD is calculated as:

SD_pooled = √(((N1 - 1)·SD1² + (N2 - 1)·SD2²) / (N1 + N2 - 2))
Many calculators are also available on the Internet that can help with the calculation of effect sizes.
So in order to calculate an effect size, you need the mean (M), the
standard deviation (SD) and the sample size (N) from the two groups that
you are comparing (I and C from the PICO acronym). But where can you
find these data in an article? Usually the data you need are in a Table pre-
senting the outcomes of the trial. For example, if you look at the following
open-access paper describing a randomized trial examining the efficacy of
mindfulness-based cognitive therapy as a public mental health interven-
tion for adults with mild to moderate depressive symptomatology (Pots,
Meulenbeek, Veehof, Klungers, & Bohlmeijer, 2014): http://journals.plos.
org/plosone/article?id=10.1371/journal.pone.0109789.
You will find the outcome data in the table on page 7. As you can see, the table presents the M and SD of the outcome measures, as well as the N for each of the two conditions. If a paper does not report these data in a table, it is necessary to read the text of the results section carefully, because sometimes these data are only presented in the text. In this study the main outcome measure is the CES-D (Center for Epidemiological Studies Depression scale) (Roberts, 1980). When we take the M, SD and N of the treatment group at post-test (M=11.79; SD=8.76; N=76) and of the waiting list control group (M=16.43; SD=9.94; N=75), then this results in a Cohen's d of 0.50. This d is also reported in the table itself.
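As an illustration, this calculation can be reproduced in a few lines of Python (a minimal sketch; the pooled SD uses the standard formula sqrt(((N1 - 1)·SD1² + (N2 - 1)·SD2²) / (N1 + N2 - 2))):

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled SD."""
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                          / (n1 + n2 - 2))
    return (m1 - m2) / sd_pooled

# CES-D post-test data from Pots et al. (2014):
# waiting list (M=16.43, SD=9.94, N=75) vs treatment (M=11.79, SD=8.76, N=76)
d = cohens_d(16.43, 9.94, 75, 11.79, 8.76, 76)
print(round(d, 2))  # 0.5
```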
Unfortunately, not all trials report the exact data that are needed for the calculation of effect sizes. Sometimes the M is reported, but not the SD. However, the SD can also be calculated from other statistics, such as the standard error (SE) or the 95% confidence interval (CI) around the mean. If the 95% CI around the mean is given, the SD can be calculated with the following formula: SD = ((M - CIlower) / 1.96) × √N. In this formula CIlower indicates the lower bound of the 95% CI, and (M - CIlower) / 1.96 equals the SE.
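This conversion is easy to check numerically; the key point is that the SD equals the SE multiplied by √N. The M, CI, and N values below are hypothetical:

```python
import math

def sd_from_ci95(mean, ci_lower, n):
    """Recover the SD from a mean, the lower bound of its 95% CI, and N.
    The SE is (mean - ci_lower) / 1.96; the SD is the SE times sqrt(N)."""
    se = (mean - ci_lower) / 1.96
    return se * math.sqrt(n)

# Hypothetical report: M = 11.79 with 95% CI lower bound 9.82, N = 100
print(round(sd_from_ci95(11.79, 9.82, 100), 2))  # 10.05
```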
Cohen's d and Hedges' g can also be calculated from statistics that test the difference between the two groups, for example the t-value, or from the p-value indicating the significance of the difference between the two groups. It is beyond the scope of this book to provide the statistical details of these calculations, and most software packages offer help with the conversion of these statistics into effect sizes.
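As one hedged example of such a conversion: for an independent-samples t-test, a common approximation is d = t·√(1/N1 + 1/N2). The t-value and sample sizes below are hypothetical:

```python
import math

def d_from_t(t, n1, n2):
    """Approximate Cohen's d from an independent-samples t statistic:
    d = t * sqrt(1/n1 + 1/n2)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

# Hypothetical: t = 3.05 reported for groups of 76 and 75 participants
print(round(d_from_t(3.05, 76, 75), 2))  # 0.5
```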
The formulas to calculate effect sizes given above are only valid when the effect size is based on the difference between two groups, such as a treatment and a control group. If an effect size is calculated for the improvement from pre-test (baseline) to post-test, these formulas cannot be used. The reason is that the pre- and post-test scores are not independent of each other, and the correlation between the pre- and post-test scores is needed to calculate the effect size. Unfortunately, this correlation is hardly ever reported in trials, and therefore researchers typically assume a value for it (for example 0.7). An alternative is to calculate Glass's Δ as the effect size, because this uses the standard deviation of only one measurement (here, the baseline measure) and not the pooled standard deviation of both. For Glass's Δ the correlation between pre-test and post-test is therefore not needed.
So when you have collected the data from the studies you want to in-
clude in your meta-analysis you have a table that could look like this (for
the first three studies). This is enough to calculate effect sizes for each
study and pool them across studies.
As described earlier, an effect size indicates the difference between two groups in terms of standard deviations. But whether that difference is clinically relevant cannot be determined from the size of the effect alone. For example, an effect size of 0.1 in terms of years of survival would be considered by most clinicians a very important and strong effect, whereas the same effect size of 0.1 in terms of social skills or knowledge about mental health would likely not be considered clinically meaningful by most clinicians (Cuijpers, Turner, et al., 2014). Thus, there is little correspondence between the effect size and its clinical relevance. It has been suggested that an effect size of 0.5 can be seen as a generic threshold for clinical relevance (Fournier et al., 2010; Kirsch et al., 2008; National Institute for Clinical Excellence, 2009), but this is inaccurate and misleading, because it does not take into account the clinical relevance of the outcome measure.
Another disadvantage of the effect size is that its clinical relevance is difficult to explain to patients and clinicians. If a clinician wants to explain to a patient what an effect size of g=0.5 means, he or she would have to say something like: "If you get this treatment, you will on average score 0.5 standard deviation better than patients who do not get the treatment." And then the clinician still has to explain what a standard deviation is. It can hardly be expected that patients then know what they can expect from this treatment.
a) The NNTs are calculated according to the method provided by: Kraemer HC, Kupfer
DJ. Size of treatment effects and their importance to clinical research and practice.
Biological Psychiatry 2006; 59: 990-996 (Kraemer & Kupfer, 2006)
One way to solve this problem is to convert the effect size to the number-needed-to-treat (NNT). The NNT indicates the number of patients that have to be treated in order to generate one additional positive outcome (Laupacis et al., 1988). It has the advantage that its clinical meaning is easier to understand than that of the effect size. In the next paragraph on dichotomous outcomes we will see that the NNT is the inverse of the risk difference between two conditions. So, for example, if 30% of the patients in the control group improve and 50% improve in the treatment group, the risk difference is 20% (50% - 30%) and the NNT is 5 (= 1/0.20).
There are at least five methods to convert an effect size to the NNT, all of which assume that the scores follow a normal or near-normal distribution (da Costa et al., 2012; Furukawa & Leucht, 2011). Four of these methods require an estimate of the event rate in one or both of the conditions, and each of these four methods is superior to the fifth (Kraemer & Kupfer, 2006). However, because the fifth method does not need an estimate of a variable that is often unavailable, it is still used in many meta-analyses. With this fifth method it is possible to calculate an NNT for each value of the effect size; the table below gives the NNT for each value of the effect size.
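This fifth method (Kraemer & Kupfer, 2006) has a closed form based on the normal distribution: NNT = 1 / (2·Φ(d/√2) - 1), with Φ the standard normal cumulative distribution function. A minimal sketch (the example values are illustrative; published tables may round differently):

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def nnt_from_d(d):
    """Kraemer & Kupfer (2006) conversion of an effect size d to the NNT,
    which needs no event-rate estimate but assumes (near-)normal scores."""
    return 1.0 / (2.0 * phi(d / math.sqrt(2.0)) - 1.0)

for d in (0.2, 0.5, 0.8):
    print(d, round(nnt_from_d(d), 1))  # 0.2 -> 8.9, 0.5 -> 3.6, 0.8 -> 2.3
```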
(for example the Hamilton Depression Rating Scale), without one of them being designated the primary outcome. So what to do in such a situation?
There are several solutions to this problem:
Each of these three solutions has its pros and cons, and none of them is preferred for every meta-analysis; which solution should be chosen depends on the set of studies and their outcome measures.
SPSS and SAS (meta-analysis macros for SAS and SPSS, devel-
oped by David B. Wilson are available from: http://mason.gmu.
edu/~dwilsonb/ma.html)
STATA
Review Manager (developed by the Cochrane Collaboration;
available from: http://tech.cochrane.org/revman/download)
Several packages for meta-analyses are available in R, including
metafor, mvmeta and mada (Schwarzer et al., 2015).
Comprehensive Meta-analysis.
Risk difference (RD) = (a / (a+b)) - (c / (c+d)) = risk in therapy group - risk in control group
The Risk Difference (RD) indicates the difference between the risk
for the event in the treatment and the risk for the event in the comparison
group. So in the example above, the risk in the treatment group was 60%
(30/50) and the risk in the control group was 20% (10/50). This means
that the RD in this case was 40%.
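From the same 2×2 table the common dichotomous effect measures can be computed directly (a minimal sketch without confidence intervals; the relative risk and odds ratio are included for completeness):

```python
def two_by_two(a, b, c, d):
    """Effect measures from a 2x2 table:
       a = events in treatment, b = non-events in treatment,
       c = events in control,   d = non-events in control."""
    risk_t = a / (a + b)
    risk_c = c / (c + d)
    rd = risk_t - risk_c            # risk difference
    rr = risk_t / risk_c            # relative risk
    odds_ratio = (a * d) / (b * c)  # odds ratio
    nnt = 1 / rd                    # number needed to treat
    return rd, rr, odds_ratio, nnt

# Example from the text: 30 of 50 improve with treatment, 10 of 50 in control
rd, rr, odds_ratio, nnt = two_by_two(30, 20, 10, 40)
print(round(rd, 2), round(rr, 2), round(odds_ratio, 2), round(nnt, 2))
# 0.4 3.0 6.0 2.5
```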
between d=0.20 and d=0.60. Important for such power calculations is the between-study variance (tau-square). These calculations were done for low, medium, and high between-study variance and for a power (1 - beta) of 0.80 and 0.90.
The number of studies and participants alone is not enough to decide whether pooling of studies is useful. If the majority of studies have a high risk of bias, and/or if clinical and statistical heterogeneity is high, then pooling may still not be useful. There are no good guidelines for when heterogeneity and risk of bias are acceptable enough for pooling. If pooling is not useful, it is still possible to write a systematic review without pooling the effect sizes.
Table 4.4 The number of studies needed to find effect sizes, for low, medium and high between-study variance and a power of 0.80 and 0.90

             Power 0.80             Power 0.90
d     N    low  medium  high     low  medium  high
0.3   20    12      15    18      16      20    23
0.3   30     8      10    12      11      13    16
0.3   40     6       8     9       8      10    12
0.3   50     5       6     7       7       8    10
0.4   20     7       9    10       9      11    14
0.4   30     5       6     7       6       8     9
0.4   40     4       5     5       5       6     7
0.4   50     3       4     4       4       5     6
0.5   20     5       6     7       5       8     9
0.5   30     3       3     5       4       5     6
0.5   40     3       3     4       3       4     5
0.5   50     2       3     3       3       3     4
0.6   20     3       4     5       4       5     6
0.6   30     2       3     3       3       4     4
0.6   40     2       2     3       2       3     3
0.6   50     2       2     2       2       2     3
Calculating and pooling effect sizes
that, the outcomes of the random and fixed effect models will be compa-
rable, but when there are differences between the studies it is still better
to use the random effects model. The decision to use the fixed effect or
the random effects model should be based on the knowledge about the
studies, and whether they share a common effect size, not on a statistical
test of heterogeneity.
From this perspective it is also important to stress that the confidence intervals around the estimated level of heterogeneity are often broad, meaning that even when heterogeneity appears low, its 95% confidence interval could still include a high level of heterogeneity; it is therefore uncertain whether heterogeneity is indeed low (Ioannidis, Patsopoulos, & Evangelou, 2007).
When using the random effects model it is important to examine pos-
sible sources of heterogeneity. Are these studies indeed heterogeneous,
and can we find explanations for this heterogeneity? In the next chapter,
we will focus on the methods that are available for examining heteroge-
neity.
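Both models can be sketched with a small inverse-variance implementation; the study data below are hypothetical, and the between-study variance (tau-square) is estimated with the common DerSimonian-Laird method. In practice you would use a dedicated package such as metafor or Comprehensive Meta-Analysis:

```python
import math

def pool(effects, variances):
    """Inverse-variance pooling: fixed effect estimate, plus a random
    effects estimate using the DerSimonian-Laird tau-square."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    # Cochran's Q and the DL estimate of the between-study variance
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (v + tau2) for v in variances]
    random_eff = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return fixed, random_eff, (random_eff - 1.96 * se, random_eff + 1.96 * se)

# Hypothetical effect sizes (g) and sampling variances for three studies
fixed, random_eff, ci = pool([0.35, 0.60, 0.10], [0.02, 0.04, 0.03])
print(round(fixed, 3), round(random_eff, 3))  # 0.331 0.337
```

With some between-study variance present, the random effects estimate gets a wider confidence interval than the fixed effect estimate, reflecting the extra uncertainty.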
For example, in the Figure the study by Gulliver et al. (2012) is clearly a small study, while the study by Griffiths et al. (Griffiths, Christensen, Jorm, Evans, & Groves, 2004) is large.
In this forest plot the studies are presented according to the size of their effects, with the first study having the highest effect size and the last study the lowest. Other meta-analyses present the studies in alphabetical order, or according to the year in which they were published (as in cumulative meta-analyses; see next chapter). But the order in which the studies are presented is not important for understanding how the forest plot can be seen as the core of a meta-analysis.
Below this same figure is given again, but with some illustrative points.
The pooled effect size is indicated with a black dot on the last line of the plot, and the 95% CI is the line drawn through this dot. As can be seen
from this Figure, the upper and lower threshold of the 95% CI around the
pooled effect size range from 0.17 to 0.39. The blue vertical lines indicate
these thresholds for the 95% CIs.
The 95% CI of the first study (Kiropoulos, Griffiths, & Blashki, 2011)
does not overlap with the blue lines, and therefore the 95% CI of this study
does not overlap with the pooled 95% CI of all studies. When this is the
case, such a study is often considered to be an outlier. An outlier means
that a study differs considerably from the other studies in a meta-analysis. There is no single best way of identifying outliers, but one simple method is to check whether the 95% CI of the study overlaps with the 95% CI of the pooled effect size. In the next chapter we will discuss how to handle potential outliers when examining heterogeneity in a meta-analysis. Here it is sufficient to illustrate how an outlier can be identified.
[Figure: forest plot of personal stigma and social distance effect sizes, with the 95% CI around the pooled effect size indicated.]
Sensitivity analyses
When a meta-analysis is conducted, many decisions are made about the inclusion of specific studies, participants, outcomes, and designs. For example, in the next chapter we will describe a meta-analysis we conducted on psychological treatments of depression in old age. But it is not clear how old age should be defined: some studies include only people older than 55, while others use 60 or 65 years as the cut-off for inclusion in the trial. Similarly, trials of psychological treatments usually use different outcome measures for the same construct, and studies with different levels of risk of bias may be included.
Sensitivity analyses can be helpful in examining whether such decisions have affected the outcomes. For example, it is in most cases useful to limit the analyses to studies with low risk of bias, to see whether that leads to different outcomes than when all studies (including those with higher risk of bias) are included. And when multiple outcome measures are used to examine the effects of the interventions, sensitivity analyses can examine whether specific measures lead to different results. Any other decision made in the meta-analysis can be examined in sensitivity analyses in the same way.
Key points
Cohen's d indicates the difference between the treatment group and the control group at post-test in terms of standard deviations
Effect sizes (d) of 0.2 are considered small, 0.5 moderate, and 0.8 large
Many other statistics can be pooled in meta-analyses
Pooling means the statistical integration of the results of multiple studies into one overall effect size
Whether or not it is useful to pool effect sizes across studies depends on the number of studies, the number of participants per study, the heterogeneity of the set of studies, and the risk of bias of the set of studies.
The forest plot gives a good summary of a meta-analysis, with the effect size for each study, the 95% confidence interval (which also indicates the size of the study), the differentiation of effect sizes, outliers, and the pooled effect size.
Step 5. Examining heterogeneity and
potential publication bias
Figure 5.1 Forest plot for psychotherapy versus placebo for adult depression
(P. Cuijpers, Turner, et al., 2014)
Now compare this with the forest plot that is presented in Figure 5.2.
This is from a meta-analysis on psychological treatments of depression
in older adults compared with control groups (waiting list, care-as-usu-
al, placebo). We have marked the 95% confidence interval around the
pooled effect size with two vertical red lines, so it is easier to see which
studies are outliers. The outliers (the studies in which the 95% confidence
interval of the effect size does not overlap with the pooled effect size) are
marked with red ovals. Large studies have narrower 95% confidence intervals, and as can be seen from the figure some large studies are also outliers (e.g., Joling et al., 2011; Williams et al., 2000). That is remarkable, because large studies can be expected to give better estimates of the true effect size; so if large studies are outliers, it is very probable that heterogeneity is high. That, combined with the fact that the num-
Quantifying heterogeneity
Another way to examine heterogeneity is to quantify it. The I² statistic is the most used method to quantify heterogeneity (Higgins, Thompson, Deeks, & Altman, 2003). It is the percentage of the total variance across studies that can be explained by heterogeneity rather than chance. The formula for calculating I² is: I² = 100% × (Q - df) / Q, where Q is Cochran's heterogeneity statistic and df is the number of studies minus one.
The other question is which subgroups should be chosen for the sub-
group analyses. In principle any characteristic of the participants, the in-
tervention, comparison group, outcomes, or design of the study can be
used for subgroup analyses. When risk of bias is high in some of the stud-
ies, it is usually a good idea to at least examine whether there is a differ-
ence between the studies with low or high risk of bias (although that can
also be examined with a meta-regression analysis, see below). Or even
better, to examine each of the items of the risk of bias assessment tool
separately in a series of subgroup analyses (because each of these items
can have an independent effect on the outcomes and making a sum of the
risk of bias for each study may obscure this).
Unfortunately, there are no fixed rules for choosing which charac-
teristics should be included in the subgroup analyses. Knowing a field of
interventions and the studies examining the effects of the interventions
usually leads to enough ideas for doing subgroup analyses. The big risk is that meta-analytic researchers simply run all possible subgroup analyses with all characteristics of the studies that have been extracted, but report only the ones that are significant. This is one of the reasons why it is useful to publish the protocol for a meta-analysis and to specify in advance which subgroup analyses are planned.
So when interpreting the findings of subgroup analyses it is also important to check whether the analyses were planned in advance. Apart from that, it is also important to assess whether the findings of the subgroup analyses can be explained by other or external evidence. For example, our finding that Internet therapies with and without support differ from each other is in line with the assumption that human contact is needed for psychological interventions to be effective (Mohr, Cuijpers, & Lehman, 2011). That makes the finding more credible and stronger.
Also the difference between the subgroups is important. Statistical
significance is not the only relevant issue, because it is very much depen-
dent on power and the number of studies and participants in the stud-
ies. As indicated earlier, the size of the difference between subgroups in
terms of differential effect sizes is not the only issue that matters, but also
the clinical meaning of that differential effect size.
Metaregression analyses
Metaregression analyses can also be used to examine sources of
heterogeneity. In a bivariate metaregression analysis the association be-
tween a continuous characteristic of the studies and the effect sizes is
examined. For example, the association between the effect size and the
number of sessions in an intervention could be examined in a metaregres-
sion analysis. In figure 5.3 the association between the effect size and the
number of therapy sessions for therapies in adult depression is graph-
ically represented, based on a meta-analysis in which we examined this
(among others; (Pim Cuijpers, Huibers, Ebert, Koole, & Andersson, 2013)).
The line indicates the regression line, the curved lines are the 95% confi-
dence intervals and the dots are the individual studies. In general metare-
gression analyses should not be conducted when the number of studies is
smaller than 10 (Higgins, J.P.T. & Green, S., 2011).
There are different statistical methods for doing the metaregression
analyses, but we will not go into them because that is too technical. We
will focus on the main outcomes of a metaregression analysis and how to
interpret the results.
What is important in a metaregression analysis is the slope of the re-
gression line. When the regression line is completely horizontal, there is
no association between the effect size and the predictor (in this case the
number of sessions of the therapy that is examined). That horizontal line
indicates that the effect size is the same for any value of the predictor
(number of sessions). If, however, the line is not horizontal, that indicates
that the effect size differs for different values of the predictor. In our ex-
ample, there was a small, positive slope for the association between the
number of treatment sessions and the effect size.
The limitations that were mentioned for subgroup analyses are also true for metaregression analyses. So they too provide only observational evidence: in our example, the (small) association between effect size and number of sessions cannot be considered evidence that such an association really exists.

[Figure 5.3: Hedges' g (vertical axis, -0.50 to 2.00) plotted against the number of sessions (horizontal axis, 2.5 to 25), with the regression line and its 95% confidence intervals.]
Apart from bivariate metaregression analyses, it is also possible to conduct multivariate metaregression analyses in which more than one predictor is included simultaneously. In these multivariate models it is also possible to include categorical variables, just as in normal regression analyses. When categorical variables are included in metaregression analyses, they should be recoded into dummy variables (variables with a value of 1 or 0, indicating the presence or absence of the characteristic). In our example of Internet interventions with or without support, we could simply enter one variable indicating support (1) or no support (0). And just as in normal regression analyses, it is also possible to include variables with more than two categories, where one of the categories serves as the reference category.
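Recoding a categorical moderator into dummy variables can be sketched as follows; the control-group categories and the choice of reference category here are hypothetical, and the resulting 0/1 columns are what would be entered as predictors in a multivariate metaregression.

```python
import numpy as np

# Hypothetical categorical moderator: type of control group per study,
# with "waitlist" chosen as the reference category.
control_groups = ["waitlist", "care_as_usual", "pill_placebo",
                  "waitlist", "pill_placebo"]
non_reference = ["care_as_usual", "pill_placebo"]  # reference is omitted

# One 0/1 column per non-reference category; a study in the reference
# category gets a 0 in every dummy column.
dummies = np.array([[int(group == cat) for cat in non_reference]
                    for group in control_groups])
print(dummies)
```

A variable with k categories thus yields k - 1 dummy columns, exactly as in normal regression analysis.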
Step 5
[Figure 5.4. Two funnel plots of the standard error (vertical axis, 0.0 to 0.8) against Hedges' g (horizontal axis, -3 to 3).]
In figure 5.4 we have given an example of such a funnel plot. On the vertical axis is the standard error, which indicates the size of the study: the higher on the vertical axis, the larger the study in terms of participants. On the horizontal axis the effect size (Hedges' g) is given. Each of the circles represents one of the studies. This figure is based on a meta-analysis of cognitive behavior therapy (CBT) for adult depression (Cuijpers, Berking, et al., 2013).
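For illustration, a funnel plot of this kind can be drawn in a few lines; the effect sizes and standard errors below are simulated (hypothetical), and the vertical axis is inverted so that larger, more precise studies appear at the top, as in figure 5.4.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_studies = 40
se = rng.uniform(0.05, 0.6, n_studies)          # hypothetical standard errors
g = 0.5 + rng.normal(0.0, 1.0, n_studies) * se  # effects scatter in proportion to SE

fig, ax = plt.subplots()
ax.scatter(g, se, facecolors="none", edgecolors="black")
ax.axvline(0.5, linestyle="--", color="grey")   # assumed pooled effect size
ax.invert_yaxis()                               # small SE (large studies) on top
ax.set_xlabel("Hedges' g")
ax.set_ylabel("Standard error")
fig.savefig("funnel.png")
```

Because the simulated effects deviate from the pooled effect by chance alone, this plot comes out roughly symmetrical, which is exactly the benchmark against which real data are judged.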
When a study is smaller (fewer participants), its effect size can be expected to deviate more from the mean effect size because of chance: as it is smaller, it is less precise. When a study is larger, the chance that its effect size differs from the mean effect size by chance is smaller, so it will lie closer to the pooled effect size. But in the absence of bias, all effect sizes plotted like this deviate from the mean effect size by chance alone, the smaller studies more so than the larger ones. And if these effect sizes differ from the mean effect size only by chance, they should deviate in both directions, positive and negative.
Thus, Figure 5.4 should be symmetrical, with as many small studies (lower in the figure) on the right of the mean effect size as on the left. Even without any formal testing, however, it can already be seen that there are more studies on the right of the mean effect size (positive studies) than on the left (negative studies). This visual inspection of the funnel plot already suggests that positive studies are overrepresented.
There are several tests for the asymmetry of the funnel plot. Two widely used ones are Begg and Mazumdar's test (Begg & Mazumdar, 1994) and Egger's test of the intercept (Egger, Davey Smith, Schneider, & Minder, 1997). They test whether the funnel plot is symmetrical; if they are significant, it can be concluded that there is significant publication bias (or another bias, see below). Another approach to missing studies in a funnel plot was developed by Duval and Tweedie (2000). They developed a method to estimate how many studies are missing from the
funnel plot, to impute these missing studies, and to estimate the effect size after imputation. In the second part of figure 5.4 this method has been applied to the studies on CBT. The black dots represent the imputed studies: the ones that should have been there but are in fact missing. In this case the number of imputed studies was 27 (there were 94 studies included in this meta-analysis), and after taking these imputed studies into account, the mean effect size indicating the difference between CBT and control groups after treatment dropped from g=0.71 to g=0.53. Egger's test and Begg and Mazumdar's test were also highly significant in this meta-analysis (both p<0.001).
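Egger's test of the intercept can be sketched as an ordinary regression of the standardized effect (g divided by its standard error) on precision (1 divided by the standard error): an intercept that deviates from zero signals funnel-plot asymmetry. The study data below are hypothetical, and a full analysis would also convert the t statistic into a p-value (e.g. via a t-distribution).

```python
import numpy as np

def egger_intercept(effects, ses):
    """Egger's regression test for funnel-plot asymmetry (sketch).

    Regresses the standardized effect (g / SE) on precision (1 / SE) and
    returns (intercept, standard error of the intercept, t statistic).
    """
    g = np.asarray(effects, dtype=float)
    se = np.asarray(ses, dtype=float)
    y = g / se                     # standardized effects
    x = 1.0 / se                   # precision
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - 2
    sigma2 = (resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se_intercept = float(np.sqrt(cov[0, 0]))
    return float(beta[0]), se_intercept, float(beta[0] / se_intercept)

# Hypothetical studies: the smaller studies (large SE) have inflated effects
g = [1.10, 0.90, 0.75, 0.60, 0.55, 0.50]
se = [0.50, 0.40, 0.30, 0.20, 0.15, 0.10]
intercept, se_int, t = egger_intercept(g, se)
print(f"intercept = {intercept:.2f} (t = {t:.2f})")
```

Because the small studies in this hypothetical example report larger effects, the intercept comes out positive, which is the pattern the test is designed to flag.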
The funnel plot is a very useful tool for detecting possible publication bias, but there are also several important risks associated with its use. First of all, it requires a considerable number of studies, generally at least 30 (Lau, Ioannidis, Terrin, Schmid, & Olkin, 2006), although that also depends on the effect sizes and the sizes of the studies. Furthermore, how the funnel plot looks depends on other factors as well: for example, the type of outcome (dichotomous versus continuous, and RRs versus ORs) leads to differences in funnel plots (Lau et al., 2006), as does the parameter on the vertical axis (sample size, standard error, etc.). When heterogeneity is high (as in our example), funnel plots may also lead to false interpretations (Terrin, Schmid, Lau, & Olkin, 2003).
Another issue is what the funnel plot can actually show: that small studies with negative effects (smaller than the mean effect), which should have been there on the basis of chance, have not been published. It is very well possible that this is caused by publication bias, and that these studies were not published because authors, editors and journals prefer positive effects and are not interested in papers that report no or negative effects of an intervention. But the asymmetry itself cannot be considered evidence for publication bias. Maybe a better term for the phenomenon is small sample bias. It is possible that there are other reasons why studies with small samples have larger effects than larger studies. For example, small studies may focus more on high-risk patients, have a shorter follow-up and differ because treatment effects decrease over time, or may target different populations (Lau et al., 2006).
In the case of psychological interventions it may well be that very good therapists who developed a new therapy delivered the treatment themselves in a small pilot trial, while in later, larger trials the therapy was delivered by other therapists. The delivery of a new therapy by a famous professor may also raise patients' expectations. And small pilot studies may, for example, use waiting list control groups more often than larger studies do.
In sum, the funnel plot is a useful tool to examine publication bias (or better: small sample bias), but its results should be considered with caution.
Key points
Step 6. Writing and publishing meta-
analyses
When you have followed the first five steps in this book, you have carried out the basic steps of a meta-analysis. You have formulated a research question according to the PICO acronym; searched bibliographical databases to identify the studies that can answer your research question; carefully selected the studies that meet the inclusion criteria; extracted the data from each of these studies, including characteristics of the participants, the intervention, the comparator, and risk of bias; calculated effect sizes and pooled them according to the random effects model; examined sources of heterogeneity in subgroup and metaregression analyses; and examined small sample bias. When you have done all that, you are ready to publish the results of your meta-analysis. In this step we describe the publishing of a protocol for your meta-analysis, the PRISMA guidelines for publishing meta-analyses, and a stepwise guide to what each part of the publication of your meta-analysis should contain.
are reported. In the same way, the PRISMA statement gives an overview of the other aspects of a meta-analysis that should be reported by authors, including aspects of the Introduction (the rationale, the PICO), the Methods (for example the inclusion criteria, the data extraction, the methods used for assessing risk of bias), the Results (such as a description of the included studies, risk of bias for all studies, effect sizes), and the Discussion (including a summary of the main findings, the limitations, and implications for future research). All checklists, as well as the full PRISMA Statement, are available on its website (www.prisma-statement.org).
The PRISMA statement is one of many guidelines that exist for the reporting of scientific studies, for example randomized trials (the CONSORT statement), observational studies (STROBE), and qualitative research (SRQR). An overview of these statements can be found on the website of the EQUATOR network (www.equator-network.org), where much additional information is available, such as extensions for specific types of meta-analyses (the PRISMA-P statement we already mentioned, but also the PRISMA extension for individual patient data meta-analyses).
Basic information
- Title
- Authors
- Funding support and conflict of interest
- Reference to published protocol
- Abstract

Introduction
- Explain the background
- The importance of the problem
- Earlier (meta-analytic) research
- Why this new meta-analysis is needed
- End with the research question (PICO)

Methods
- Identification (searches) and selection of studies (in/exclusion criteria)
- Data extraction and quality assessment
- Analyses (which effect size, how it was calculated, pooling, the model used, heterogeneity, publication bias, subgroup and metaregression analyses)

Results
- Selection and inclusion of studies (PRISMA flowchart)
- Characteristics of included studies, including quality/validity; table with selected characteristics
- Outcomes: pooled effect sizes
Discussion
- Summary of main results
- What does this add to existing knowledge
- Implications for research and practice
- Future research
- Limitations
- Conclusion
teristics of each study. When the number of studies is too large, this table can also be added as an appendix to the paper. The description of the quality or risk of bias of the studies (including a description of each study) should also be given here.
Then the outcomes of the meta-analyses should be given, including the pooled effect sizes (with 95% confidence intervals), heterogeneity, results of sensitivity analyses, potential outliers, and asymmetry of the funnel plot (publication bias). The results of subgroup and metaregression analyses should also be reported here, and a forest plot with the main analyses should be presented.
The Discussion section is very much the same as the discussion section of other papers in social science and mental health research. First, a summary of the main findings is given, together with a description of what the study adds to the existing knowledge. Then the implications for future research and clinical practice are given. An important paragraph concerns the limitations of the study. Were there enough studies? Was heterogeneity not too high? Was it possible to explain causes of heterogeneity? Was the risk of bias in the set of studies not too high? Finally, a conclusion should be given about the outcomes and consequences of the study.
It is also important to present tables and figures in the paper. Some must be included: a PRISMA flowchart and, as we indicated, the forest plot, which is in many ways the core of a meta-analysis and should therefore also appear in the paper. A descriptive table with the major characteristics of the studies should also be included. The risk of bias of each study should be reported as well, but that can be integrated into the descriptive table. Another table with the main results of the analyses and subgroup analyses is also very useful and makes the paper easier to understand for the reader. Apart from these figures and tables, others can of course be included, for example an extra figure with another forest plot (for example of a subset of studies) or a figure describing the results of a metaregression analysis.
Key points
- It is advisable to develop a protocol before starting a meta-analysis and to publish that protocol, for example at the PROSPERO website
- The PRISMA Statement contains an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses and should be used when writing a paper on a meta-analysis
- A paper on a meta-analysis follows the general rules for writing any other report on a scientific study in social science and mental health
- The main structure of a paper on a meta-analysis contains the main sections Introduction, Methods, Results and Discussion
- Figures that should be included are the PRISMA flowchart and the forest plot
- Tables that should be included are a descriptive table with the major characteristics of the studies and a table summarizing the main results of the analyses and subgroup analyses
References
Crameri, A., von Wyl, A., Koemeda, M., Schulthess, P., & Tschuschke, V. (2015). Sen-
sitivity analysis in multiple imputation in effectiveness studies of psychother-
apy. Frontiers in Psychology, 6. http://doi.org/10.3389/fpsyg.2015.01042
Cuijpers, P., Berking, M., Andersson, G., Quigley, L., Kleiboer, A., & Dobson, K. S. (2013). A meta-analysis of cognitive-behavioural therapy for adult depression, alone and in comparison with other treatments. Canadian Journal of Psychiatry. Revue Canadienne de Psychiatrie, 58(7), 376–385.
Cuijpers, P., Driessen, E., Hollon, S. D., van Oppen, P., Barth, J., & Andersson, G. (2012). The efficacy of non-directive supportive therapy for adult depression: a meta-analysis. Clinical Psychology Review, 32(4), 280–291. http://doi.org/10.1016/j.cpr.2012.01.003
Cuijpers, P., Huibers, M., Ebert, D. D., Koole, S. L., & Andersson, G. (2013). How much psychotherapy is needed to treat depression? A metaregression analysis. Journal of Affective Disorders, 149(1-3), 1–13. http://doi.org/10.1016/j.jad.2013.02.030
Cuijpers, P., Karyotaki, E., Pot, A. M., Park, M., & Reynolds, C. F. 3rd. (2014). Managing depression in older age: psychological interventions. Maturitas, 79(2), 160–169. http://doi.org/10.1016/j.maturitas.2014.05.027
Cuijpers, P., Li, J., Hofmann, S. G., & Andersson, G. (2010). Self-reported versus clinician-rated symptoms of depression as outcome measures in psychotherapy research on depression: a meta-analysis. Clinical Psychology Review, 30(6), 768–778. http://doi.org/10.1016/j.cpr.2010.06.001
Cuijpers, P., Sijbrandij, M., Koole, S., Huibers, M., Berking, M., & Andersson, G. (2014). Psychological treatment of generalized anxiety disorder: a meta-analysis. Clinical Psychology Review, 34(2), 130–140. http://doi.org/10.1016/j.cpr.2014.01.002
Cuijpers, P., Turner, E. H., Koole, S. L., van Dijke, A., & Smit, F. (2014). What is the threshold for a clinically relevant effect? The case of major depressive disorders. Depression and Anxiety, 31(5), 374–378. http://doi.org/10.1002/da.22249
Cuijpers, P., Turner, E. H., Mohr, D. C., Hofmann, S. G., Andersson, G., Berking, M., & Coyne, J. (2014). Comparison of psychotherapies for adult depression to pill placebo control groups: a meta-analysis. Psychological Medicine, 44(4), 685–695. http://doi.org/10.1017/S0033291713000457
Cuijpers, P., van Straten, A., Bohlmeijer, E., Hollon, S. D., & Andersson, G. (2010). The effects of psychotherapy for adult depression are overestimated: a meta-analysis of study quality and effect size. Psychological Medicine, 40(2), 211–223. http://doi.org/10.1017/S0033291709006114
Cuijpers, P., Vogelzangs, N., Twisk, J., Kleiboer, A., Li, J., & Penninx, B. W. (2014). Comprehensive meta-analysis of excess mortality in depression in the general community versus patients with specific illnesses. The American Journal of Psychiatry, 171(4), 453–462. http://doi.org/10.1176/appi.ajp.2013.13030325
da Costa, B. R., Rutjes, A. W. S., Johnston, B. C., Reichenbach, S., Nüesch, E., Tonia, T., … Jüni, P. (2012). Methods to convert continuous outcomes into odds ratios of treatment response and numbers needed to treat: meta-epidemiological study. International Journal of Epidemiology, 41(5), 1445–1459. http://doi.org/10.1093/ije/dys124
Dragioti, E., Dimoliatis, I., Fountoulakis, K. N., & Evangelou, E. (2015). A systematic
appraisal of allegiance effect in randomized controlled trials of psychothera-
py. Annals of General Psychiatry, 14, 25. http://doi.org/10.1186/s12991-015-
0063-1
Driessen, E., Hollon, S. D., Bockting, C. L. H., Cuijpers, P., & Turner, E. H. (2015).
Does Publication Bias Inflate the Apparent Efficacy of Psychological Treat-
ment for Major Depressive Disorder? A Systematic Review and Meta-Anal-
ysis of US National Institutes of Health-Funded Trials. PloS One, 10(9),
e0137864. http://doi.org/10.1371/journal.pone.0137864
Duval, S., & Tweedie, R. (2000). Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455–463.
Egger, M., Davey Smith, G., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. BMJ (Clinical Research Ed.), 315(7109), 629–634.
Eysenck, H. J. (1952). The effects of psychotherapy: an evaluation. Journal of Consulting Psychology, 16(5), 319–324.
Fanelli, D. (2010). Positive Results Increase Down the Hierarchy of the Sciences.
PLoS ONE, 5(4), e10068. http://doi.org/10.1371/journal.pone.0010068
Fournier, J. C., DeRubeis, R. J., Hollon, S. D., Dimidjian, S., Amsterdam, J. D., Shelton, R. C., & Fawcett, J. (2010). Antidepressant drug effects and depression severity: a patient-level meta-analysis. JAMA, 303(1), 47–53. http://doi.org/10.1001/jama.2009.1943
Furukawa, T. A., & Leucht, S. (2011). How to obtain NNT from Cohen's d: comparison of two methods. PloS One, 6(4), e19070. http://doi.org/10.1371/journal.pone.0019070
Geraedts, A. S., Kleiboer, A. M., Wiezer, N. M., van Mechelen, W., & Cuijpers, P.
(2014). Short-term effects of a web-based guided self-help intervention for
employees with depressive symptoms: randomized controlled trial. Journal of
Medical Internet Research, 16(5), e121. http://doi.org/10.2196/jmir.3185
Griffiths, K. M., Carron-Arthur, B., Parsons, A., & Reid, R. (2014). Effectiveness of programs for reducing the stigma associated with mental disorders. A meta-analysis of randomized controlled trials. World Psychiatry: Official Journal of the World Psychiatric Association (WPA), 13(2), 161–175. http://doi.org/10.1002/wps.20129
Griffiths, K. M., Christensen, H., Jorm, A. F., Evans, K., & Groves, C. (2004). Effect of web-based depression literacy and cognitive-behavioural therapy interventions on stigmatising attitudes to depression: randomised controlled trial. The British Journal of Psychiatry: The Journal of Mental Science, 185, 342–349. http://doi.org/10.1192/bjp.185.4.342
Gulliver, A., Griffiths, K. M., Christensen, H., Mackinnon, A., Calear, A. L., Parsons, A., … Stanimirovic, R. (2012). Internet-based interventions to promote mental health help-seeking in elite athletes: an exploratory randomized controlled trial. Journal of Medical Internet Research, 14(3), e69. http://doi.org/10.2196/jmir.1864
Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56–62.
Harnad, S., Brody, T., Vallières, F., Carr, L., Hitchcock, S., Gingras, Y., … Hilf, E. R. (2008). The Access/Impact Problem and the Green and Gold Roads to Open Access: An Update. Serials Review, 34(1), 36–40. http://doi.org/10.1080/00987913.2008.10765150
Higgins, J. P. T., Altman, D. G., Gøtzsche, P. C., Jüni, P., Moher, D., Oxman, A. D., … Sterne, J. A. C. (2011). The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ, 343, d5928. http://doi.org/10.1136/bmj.d5928
Higgins, J. P. T., & Green, S. (2011). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration.
Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. BMJ (Clinical Research Ed.), 327(7414), 557–560. http://doi.org/10.1136/bmj.327.7414.557
Houghton, J., & Vickery, G. (2005). Digital Broadband Content: Scientific Publishing (Directorate for Science, Technology and Industry: Committee for Information, Computer and Communications Policy).
Ioannidis, J. P. A., Patsopoulos, N. A., & Evangelou, E. (2007). Uncertainty in heterogeneity estimates in meta-analyses. BMJ, 335(7626), 914–916. http://doi.org/10.1136/bmj.39343.408449.80
Jarrett, R. B., Schaffer, M., McIntire, D., Witt-Browder, A., Kraft, D., & Risser, R. C. (1999). Treatment of atypical depression with cognitive therapy or phenelzine: a double-blind, placebo-controlled trial. Archives of General Psychiatry, 56, 431–437.
Joling, K. J., Hout, H. P., van't Veer-Tazelaar, P. J., Horst, H. E., Cuijpers, P., Ven, P. M., & Marwijk, H. W. (2011). How effective is bibliotherapy for very old adults with subthreshold depression? A randomized controlled trial. American Journal of Geriatric Psychiatry, 19, 256–265. http://doi.org/10.1097/JGP.0b013e3181ec8859
Kiropoulos, L. A., Griffiths, K. M., & Blashki, G. (2011). Effects of a multilingual
information website intervention on the levels of depression literacy and de-
pression-related stigma in Greek-born and Italian-born immigrants living in
Australia: a randomized controlled trial. Journal of Medical Internet Research,
13(2), e34. http://doi.org/10.2196/jmir.1527
Kirsch, I., Deacon, B. J., Huedo-Medina, T. B., Scoboria, A., Moore, T. J., & Johnson,
B. T. (2008). Initial severity and antidepressant benefits: a meta-analysis of
data submitted to the Food and Drug Administration. PLoS Medicine, 5(2),
e45. http://doi.org/10.1371/journal.pmed.0050045
Kok, R. N., Donker, T., Batelaan, N. M., Beekman, A. T., Van Straten, A., & Cuijpers, P. (n.d.). Psychological treatment of specific phobias: a meta-analysis. Submitted.
Kraemer, H. C., & Kupfer, D. J. (2006). Size of treatment effects and their importance to clinical research and practice. Biological Psychiatry, 59(11), 990–996. http://doi.org/10.1016/j.biopsych.2005.09.014
Lau, J., Ioannidis, J. P. A., Terrin, N., Schmid, C. H., & Olkin, I. (2006). The case of the misleading funnel plot. BMJ (Clinical Research Ed.), 333(7568), 597–600. http://doi.org/10.1136/bmj.333.7568.597
Lewis, S., & Clarke, M. (2001). Forest plots: trying to see the wood and the trees. BMJ (Clinical Research Ed.), 322(7300), 1479–1480.
Leykin, Y., & DeRubeis, R. J. (2009). Allegiance in Psychotherapy Outcome Research: Separating Association From Bias. Clinical Psychology: Science and Practice, 16(1), 54–65. http://doi.org/10.1111/j.1468-2850.2009.01143.x
Masicampo, E. J., & Lalande, D. R. (2012). A peculiar prevalence of p values just below .05. Quarterly Journal of Experimental Psychology (2006), 65(11), 2271–2279. http://doi.org/10.1080/17470218.2012.711335
Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting
items for systematic reviews and meta-analyses: the PRISMA statement.
BMJ, 339, b2535. http://doi.org/10.1136/bmj.b2535
Moher, D., Shamseer, L., Clarke, M., Ghersi, D., Liberati, A., Petticrew, M., … PRISMA-P Group. (2015). Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Systematic Reviews, 4, 1. http://doi.org/10.1186/2046-4053-4-1
Mohr, D. C., Cuijpers, P., & Lehman, K. (2011). Supportive accountability: a model
for providing human support to enhance adherence to eHealth interven-
tions. Journal of Medical Internet Research, 13(1), e30. http://doi.org/10.2196/
jmir.1602
Mohr, D. C., Ho, J., Hart, T. L., Baron, K. G., Berendsen, M., Beckner, V., … Duffecy, J. (2014). Control condition design and implementation features in controlled trials: a meta-analysis of trials evaluating psychotherapy for depression. Translational Behavioral Medicine, 4(4), 407–423. http://doi.org/10.1007/s13142-014-0262-3
Mohr, D. C., Spring, B., Freedland, K. E., Beckner, V., Arean, P., Hollon, S. D., … Kaplan, R. (2009). The selection and design of control conditions for randomized controlled trials of psychological interventions. Psychotherapy and Psychosomatics, 78(5), 275–284. http://doi.org/10.1159/000228248
Moncrieff, J., Wessely, S., & Hardy, R. (2004). Active placebos versus antide-
pressants for depression. The Cochrane Database of Systematic Reviews, (1),
CD003012. http://doi.org/10.1002/14651858.CD003012.pub2
Munder, T., Brütsch, O., Leonhart, R., Gerger, H., & Barth, J. (2013a). Researcher allegiance in psychotherapy outcome research: an overview of reviews. Clinical Psychology Review, 33(4), 501–511. http://doi.org/10.1016/j.cpr.2013.02.002
Munder, T., Brütsch, O., Leonhart, R., Gerger, H., & Barth, J. (2013b). Researcher allegiance in psychotherapy outcome research: an overview of reviews. Clinical Psychology Review, 33(4), 501–511. http://doi.org/10.1016/j.cpr.2013.02.002
Mynors-Wallis, L., Davies, I., Gray, A., Barbour, F., & Gath, D. (1997). A randomised controlled trial and cost analysis of problem-solving treatment for emotional disorders given by community nurses in primary care. British Journal of Psychiatry, 170, 113–119.
National Institute for Clinical Excellence. (2009). The Treatment and Management
of Depression in Adults. Partial Update of Clinical Practice Guideline No 23. Lon-
don: National Institute for Clinical Excellence.
Pots, W. T. M., Meulenbeek, P. A. M., Veehof, M. M., Klungers, J., & Bohlmeijer, E. T.
(2014). The efficacy of mindfulness-based cognitive therapy as a public men-
tal health intervention for adults with mild to moderate depressive symp-
tomatology: a randomized controlled trial. PloS One, 9(10), e109789. http://
doi.org/10.1371/journal.pone.0109789
Riley, R. D., Lambert, P. C., & Abo-Zaid, G. (2010). Meta-analysis of individual par-
ticipant data: rationale, conduct, and reporting. BMJ (Clinical Research Ed.),
340, c221.
Riley, R. D., Simmonds, M. C., & Look, M. P. (2007). Evidence synthesis combining individual patient data and aggregate data: a systematic review identified current practice and possible methods. Journal of Clinical Epidemiology, 60(5), 431–439. http://doi.org/10.1016/j.jclinepi.2006.09.009
Roberts, R. E. (1980). Reliability of the CES-D Scale in different ethnic contexts. Psychiatry Research, 2(2), 125–134.
Roest, A. M., de Jonge, P., Williams, C. D., de Vries, Y. A., Schoevers, R. A., & Turner, E. H. (2015). Reporting Bias in Clinical Trials Investigating the Efficacy of Second-Generation Antidepressants in the Treatment of Anxiety Disorders: A Report of 2 Meta-analyses. JAMA Psychiatry, 72(5), 500–510. http://doi.org/10.1001/jamapsychiatry.2015.15
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641. http://doi.org/10.1037/0033-2909.86.3.638
Shamseer, L., Moher, D., Clarke, M., Ghersi, D., Liberati, A., Petticrew, M., … PRISMA-P Group. (2015). Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: elaboration and explanation. BMJ (Clinical Research Ed.), 349, g7647.
Sterling, T. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance, or vice versa. Journal of the American Statistical Association, (285), 30–34.
Sterne, J. A. C., Sutton, A. J., Ioannidis, J. P. A., Terrin, N., Jones, D. R., Lau, J., … Higgins, J. P. T. (2011). Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ, 343, d4002. http://doi.org/10.1136/bmj.d4002
Terrin, N., Schmid, C. H., Lau, J., & Olkin, I. (2003). Adjusting for publication bias in the presence of heterogeneity. Statistics in Medicine, 22, 2113–2126.
Trauer, J. M., Qian, M. Y., Doyle, J. S., Rajaratnam, S. M. W., & Cunnington, D. (2015). Cognitive Behavioral Therapy for Chronic Insomnia: A Systematic Review and Meta-analysis. Annals of Internal Medicine, 163(3), 191–204. http://doi.org/10.7326/M14-2841
Turner, E. H., Knoepflmacher, D., & Shapley, L. (2012). Publication bias in antipsy-
chotic trials: an analysis of efficacy comparing the published literature to the
US Food and Drug Administration database. PLoS Medicine, 9(3), e1001189.
http://doi.org/10.1371/journal.pmed.1001189
Turner, E. H., Matthews, A. M., Linardatos, E., Tell, R. A., & Rosenthal, R. (2008). Selective publication of antidepressant trials and its influence on apparent efficacy. The New England Journal of Medicine, 358(3), 252–260. http://doi.org/10.1056/NEJMsa065779
Williams, J. W., Barrett, J., Oxman, T., Frank, E., Katon, W., Sullivan, M., … Sengupta, A. (2000). Treatment of dysthymia and minor depression in primary care: A randomized controlled trial in older adults. JAMA, 284, 1519–1526.
Xia, J., Wright, J., & Adams, C. E. (2008). Five large Chinese biomedical bibliographic databases: accessibility and coverage. Health Information and Libraries Journal, 25(1), 55–61. http://doi.org/10.1111/j.1471-1842.2007.00734.x
Meta-analyses in mental health research.
A practical guide
ISBN 978-90-825305-0-6