
Scientometrics (2015) 105:23–49

DOI 10.1007/s11192-015-1663-x

A bibliometric analysis of the Turkish software engineering research community

Vahid Garousi1

Received: 10 July 2015 / Published online: 12 August 2015


© Akadémiai Kiadó, Budapest, Hungary 2015

Vahid Garousi
vahid.garousi@hacettepe.edu.tr

1 Software Engineering Research Group, Department of Computer Engineering, Hacettepe
University, Ankara, Turkey

Abstract This paper presents a bibliometric analysis of the Turkish software engineering
(SE) community (researchers and institutions). The analysis is based on the number of
papers published in software-engineering-related venues and indexed in the Scopus
academic search engine up to the year 2014. According to the analysis, the top-ranked
institution is Middle East Technical University, and the top-ranked scholar is Ayşe Başar
Bener (formerly with Boğaziçi University and now with Ryerson University in Canada). The
analysis reveals other important findings and presents a set of implications for the
Turkish SE community and stakeholders (e.g., funding agencies and decision makers), such
as the following: (1) Turkey produces only about 0.49 % of the world-wide SE knowledge, as
measured by the number of papers in Scopus, which is a negligible share. To take a more
active role in the global SE community, the Turkish SE community has to increase its
output. (2) We notice a lack of diversity across the general SE spectrum, e.g., very
little focus on requirements engineering, software maintenance and evolution, and
architecture. This denotes the need for further diversification of SE research topics in
Turkey. (3) In total, 89 papers in the pool (30.8 % of the total) are
internationally-authored SE papers. Having a good level of international collaboration is
a good sign for the Turkish SE community. The most frequent international collaborations
have been with researchers from the United States, Canada and the Netherlands. (4) In
general, the involvement of industry in SE research is low. All stakeholders (e.g.,
government, industry and academia) should aim at increasing the level of
industry-academia collaboration in the Turkish SE community. (5) Citations to Turkish SE
papers are, in general, significantly lower than those of a set of three representative
pools of SE papers. This is an area of concern which needs further investigation. (6) In
general, there is a need to increase both the quantity and quality of the Turkish SE
papers on the global stage. The approach we use in this study could be replicated in other
countries to provide insights and trends about their SE research performance.

Keywords Bibliometric analysis · Software engineering · Turkey · Researchers ·
Scholars · Research community · Turkish universities and institutions

Introduction

Software engineering (SE) is an established discipline that has existed on its own for more
than 45 years. The term software engineering first appeared at the 1968 NATO Software
Engineering Conference (Galler 1969), and was meant to provoke thought regarding the
perceived “software crisis” at the time.
With the increasingly ubiquitous application of computing and software technologies, the
maturity of SE has a considerable impact on almost every other discipline (i.e., it
is hardly possible to imagine a modern society without heavy usage of software systems).
Over the last 45 years (1969–2014), the discipline of SE has grown substantially in the
depth and breadth of its research contributions. This paper presents a quantitative
bibliometric analysis of the Turkish SE community, since Turkish researchers
have recently started to become more active in this field.
It is natural to raise various questions that assess distinct facets of SE, for
example: (1) Who are the active researchers in the field of SE? (2) Which institutions (or
research centers) are the most active in the field? (3) Which research methodologies are
used the most in this field? (4) What are the active research topics in the field? (5) Which
countries are the most active in the field? and (6) How does the growth in the number of
publications in SE compare to other disciplines of engineering and science?
There are various types of bibliometric studies in the field of SE addressing some of the
above questions, e.g., Caia and Card (2008), Wohlin (2007, 2008), Glass and Chen (2001,
2002, 2003, 2005), Tse et al. (2006), Wong et al. (2008, 2009, 2011), Ren and Taylor
(2007), Garousi and Varma (2010), Glass et al. (2002), McCain et al. (2005), Parnas
(2007), Meyer et al. (2009), Geist et al. (1996), Farhoodi et al. (2013), Garousi and Ruhe
(2013), Jia and Harman (2010), Harman et al. (2009). For example, the on-going annual
studies by Glass and Chen (2001, 2002, 2003, 2005), Tse et al. (2006), Wong et al. (2008,
2009, 2011) have been ranking the top researchers and institutions world-wide since 1996,
addressing questions (1) and (2) above. There are studies which rank the top SE
researchers and institutions at regional levels as well (e.g., across Canada; Garousi and
Varma 2010). There are also studies such as Glass et al. (2002) and Caia and Card (2008)
that address questions (3) and (4) above, respectively.
However, there exists no bibliometric study in the context of the Turkish SE community.
Filling this gap is the main goal and contribution of the work reported in this article.
This paper is motivated by the observation that understanding, at a national level, the
top researchers and institutions, the active SE research areas, and the level of
industry-academia collaboration will assist young Turkish researchers and Ph.D. students
in choosing their research areas more wisely, and will also help the Turkish SE community
identify strengths, weaknesses, maturity levels, and opportunities in this area. These are
the benefits that we observed from our previous bibliometric study in the Canadian context
(Garousi and Varma 2010). Furthermore, this study and our Canadian study (Garousi and
Varma 2010) could be replicated in other countries to provide insights and trends about
the SE research performance inside each country. To conduct our study, we mine statistical
data from a well-known online database of research articles, Scopus (2014), and provide
results which can be of interest to both SE researchers and research policy and decision
makers inside Turkey and abroad.
The remainder of this article is structured as follows. The “Related work” section
discusses the related work. The research methodology is discussed in the “Research
methodology” section. The “Results” section presents the results of our study. Finally,
the “Conclusions and future work” section concludes the article and discusses directions
of future work.

Related work

Bibliometric rankings are quite common in the field of SE, e.g., Caia and Card (2008),
Wohlin (2007, 2008), Glass and Chen (2001, 2002, 2003, 2005), Tse et al. (2006), Wong
et al. (2008, 2009, 2011), Ren and Taylor (2007), Garousi and Varma (2010), Glass et al.
(2002), McCain et al. (2005), Parnas (2007), Meyer et al. (2009), Geist et al. (1996),
Farhoodi et al. (2013), Garousi and Ruhe (2013), Jia and Harman (2010), Harman et al.
(2009). We discuss these works in the following. The study by Caia and Card (2008)
analyzed the active research topics in SE. The authors considered 7 top journals and 7 top
international conferences in SE and examined all the 691 papers published in these
journals or presented at these conferences in year 2006. The paper (Caia and Card 2008)
reported several interesting findings: (1) 73 % of journal papers focus on 20 % of subject
indexes in SE, including testing and debugging, management, and software/program
verification, (2) 89 % of conference papers focus on 20 % of subject indexes in SE,
including software/program verification, testing and debugging, and design tools and
techniques, and (3) the average number of references cited by a journal paper is about 33,
whereas this number is around 24 for a conference paper.
Furthermore, Wohlin conducted two consecutive studies, covering the years 2000 and 2001
(Wohlin 2007, 2008), to determine the most cited articles in the SE journals. The objective
of the analysis was to identify and list the articles that have influenced others the most
as measured by citation count. An understanding of which research is viewed by the research
community as most valuable to build upon may provide valuable insights into what
research to focus on now and in the future. Based on the analysis, a list of the 20 most
cited articles was presented. Afterwards, the authors of the most cited articles in 2001
were invited to contribute to a special section of the Elsevier journal Information and
Software Technology (IST).
Perhaps the most well-known collection of ranking studies in SE is the on-going annual
survey by Glass and Chen (2001, 2002, 2003, 2005) (Tse et al. 2006; Wong et al. 2008, 2009,
2011), which has so far ranked the top researchers and institutions world-wide from 1996
to 2008. This annual ranking is based on paper counting, without considering the number of
citations. The ranking uses weighted scores of the number of papers published in six
selected journals and one magazine which have the highest impact factors among all SE
journals and are widely perceived as the top journals in the area. They are: (1) IEEE
Transactions on Software Engineering (TSE), (2) ACM Transactions on Software Engineering
and Methodology (TOSEM), (3) Elsevier Journal of Systems and Software (JSS), (4) Wiley
Journal of Software Practice and Experience (SPE), (5) Springer Journal of Empirical
Software Engineering (EMSE), (6) Elsevier Journal of Information and Software Technology
(IST), and (7) IEEE Software Magazine.
Ren and Taylor (2007) followed an impact-factor-based approach and presented a
ranking in 2007 which provided somewhat, although not substantially, different results
compared to the assessments done by Glass and Chen (2001, 2002, 2003, 2005) (Tse et al.
2006; Wong et al. 2008, 2009, 2011). As part of their study, Ren and Taylor (2009) also
developed a Java tool (freely available online) which can be used by other researchers. To
weigh the papers and produce the ranking, their tool supports two types of bibliometric
metrics: impact factors and h-index.
Garousi and Varma (2010) recently used the Java tool developed by Ren and Taylor
(2009) to conduct a Canada-wide ranking of SE scholars and institutions, based on both
impact factor, and h-index values. A break-down by different provinces of Canada was
conducted. Furthermore, they took a further step and conducted an initial inter-provincial
research efficiency analysis, by taking as input the total dollar value of research funds
received by each province, and relating it to the number of generated papers as the
output.
Published in the Journal of Scientometrics, the work of McCain et al. (2005) used
bibliometric and knowledge elicitation techniques to map a knowledge domain for SE in
the 1990s. Mappings of the intellectual and cognitive structure of SE were conducted using
three techniques: (1) author co-citation analysis, (2) pathfinder network analysis, and (3)
card sorting, as a knowledge elicitation method. Co-citation counts for 60 prominent SE
authors over the period of 1990–1997 were gathered from a scholarly database called
SciSearch. The study revealed interesting findings including a co-citation and a pathfinder
network map for the SE domain in the 1990s.
Glass et al. (2002) conducted an analysis of the research and research methods in the
SE literature as of 2002. They examined 369 papers in six leading research journals in
the SE field. From that examination, the authors concluded that SE research is diverse
regarding topic, narrow regarding research approach and method, inwardly-focused
regarding reference discipline, and technically focused (as opposed to behaviorally
focused) regarding level of analysis. The authors however passed no judgment on the SE
field as a result of these findings. Instead, they presented them as groundwork for future
SE research efforts.
Some critics of the evaluation method used by Glass and his colleagues believe that
correctness, importance, novelty, and overall contribution of each paper should be given
greater consideration than the number of publications (Parnas 2007). However, most
researchers agree that an assessment on these grounds will certainly be influenced by
subjective factors such as the competence or bias of the reviewer (Meyer et al. 2009). In
addition, the time investment required to adequately review each paper significantly
limits the number of publications that can be included in a survey. Citation counting has
been proposed as an enhancement to publication counting, although Parnas (2007)
observed that a citation might well imply a negative critique or simply a neutral ref-
erence as part of a general summary of related work. While the development of a more
comprehensive and accurate metric for the assessment of researchers and institutions is a
worthwhile goal, the rankings provided by publication counting can still be useful (Geist
et al. 1996).
Garousi and Ruhe (2013) reported a bibliometric and geographic assessment of
40 years of SE research (from 1969 until 2009) at the global scale. Among the results of
that study are: (1) over the 40-year window, in total about 60 % of the SE literature has
been contributed by only 7 % of all countries, (2) the SE research output of different
countries does not necessarily correlate with their GDPs, (3) the share of contributions to
the SE discipline by American researchers has declined from 71.43 % (in 1980) to
14.90 % (in 2008), and (4) China is the country with the biggest growth in the number of
publications (from 0.82 % of all SE publications in 1991 to about 14 % of all the
papers in 2009).
We have also been able to find several systematic mapping and systematic literature
review (SLR) studies, e.g., Farhoodi et al. (2013), Jia and Harman (2010), Harman et al.
(2009) which, as part of their studies, have conducted a bibliometric assessment of papers
in focused sub-areas of SE, namely: development of scientific software (Farhoodi et al.
2013), software mutation testing (Jia and Harman 2010), and search-based SE (Harman
et al. 2009). These studies have measured and reported the number of publications by each
country in these areas and provided interesting trends and insights.
A number of bibliometric studies have been conducted and reported in the context of the
Turkish research community, which show that the results of such studies are of interest to
the research community, e.g.:
• A bibliometric analysis of tourism- and hospitality-related articles published in Turkey
(Evren and Kozak 2013).
• A bibliometric analysis of international collaboration of Turkey in liver transplantation
research (Bas et al. 2011).
• A bibliometric review of references of nursing research papers during the decade
1994–2003 in Turkey (Ergul et al. 2010).
• A comparative bibliometric analysis of parasitological research in Iran and Turkey
(Rashidi et al. 2013).
Last but not least, there are several Turkish-wide rankings of institutions, the two
major ones being the following:
• The list of top active institutions in Turkey, in all scientific disciplines, by their number
of scientific papers (ULAKBIM 2014), conducted and published regularly by the
Turkish Academic Network and Information Center (shortened as ULAKBİM in
Turkish), which is a division of the Scientific and Technological Research Council of
Turkey (shortened as TÜBİTAK in Turkish). There is no ULAKBİM ranking focused
on the field of SE.
• The University Ranking by Academic Performance (URAP) project (2014), conducted
at the Middle East Technical University (METU), which ranks world-wide and also
Turkish universities in various groups, e.g., world ranking, ranking by country, ranking
by region, and field-based ranking. However, we were unable to find in the URAP
rankings a field ranking focused on SE. The URAP data include a field ranking for the
general area of “Information and Computing Sciences”, but it has been conducted in
the world-wide context and not among Turkish institutions.
By summarizing all the above related works, we see that there exists no bibliometric
study on the Turkish SE community. This is the goal and contribution of the work reported
in this article.
The author of this study has recent experience in conducting systematic bibliometric
studies, e.g., Farhoodi et al. (2013), Garousi and Ruhe (2013), Garousi and Varma (2012).
All that experience has enabled him to conduct this current bibliometric study with the
same approach and rigor.

Research methodology

The goal and the research questions of our study are discussed in the “Goal and research
questions” section. Then, we discuss the data source and the paper search methodology in
the “Data source” and “Search methodology” sections.

Goal and research questions

Similar to our previous bibliometric studies (Farhoodi et al. 2013; Garousi and Ruhe 2013;
Garousi and Varma 2012), the research approach that we use in our study is the
goal-question-metric (GQM) methodology (Basili 1992). Using the GQM’s goal template
(Basili 1992), the goal of this study is to analyze the volume of research contributions
(i.e., number of papers) to the field of SE in Turkey by various Turkish
researchers and institutions, for the purpose of ranking the output of researchers and
institutions, from the point of view (and for the benefit) of young researchers, Ph.D.
students and funding decision makers in Turkey. Based on the above goal, we raise the
following research questions (RQs):
• RQ 1: What is the annual trend of publications in Turkey in the field of SE? The
rationale for this RQ is to understand the trend of growth in SE papers originating from
Turkey to see how Turkey is doing in comparison to the entire world in terms of SE
research.
• RQ 2: What are the popular subject areas in this pool of papers? The rationale for this
RQ is to understand popular subject areas inside the Turkish SE community so that
young Turkish researchers and Ph.D. students can choose their research areas more
wisely.
• RQ 3: Who are the active authors? The rationale for this RQ is to identify the active
researchers in this area so that prospective Ph.D. students know which
researchers to approach to study in this area.
• RQ 4: What are the active institutions? Answering this RQ will also help prospective
Ph.D. students to better choose the institutions at which to conduct SE research.
• RQ 5: How does the ranking of active institutions provided in this study (in RQ 4)
compare with other Turkish-wide scientific rankings?
• RQ 6: What is the extent of collaborations between Turkish SE researchers and the
international SE community? The rationale for this RQ is to quantify the level of
international collaborations in the field of SE between Turkey and the rest of the world.
• RQ 7: What is the extent of industry-academia collaborations in the Turkish SE
community? The rationale for this RQ is to see the extent to which the Turkish SE
industry and academia collaborate.
• RQ 8: The sub-RQs under this RQ deal with various aspects of citation analysis:
• RQ 8.1: What are the top-cited publications?
• RQ 8.2: How do the citations to papers authored solely by Turkish authors compare
with those having Turkish and international authors? The rationale for this RQ is to
see whether having international collaboration on a paper would increase its
citations.
• RQ 8.3: What is the correlation between citations versus publication years? Do
newer publications receive more citations compared to older ones?

• RQ 8.4: How do the paper count and the citations of Turkish SE papers compare
with those of other countries?

Following the GQM methodology (Basili 1992), to quantify and address each of the
above RQs, we have used various metrics as discussed in the “Results” section.

Data source

As the data source for this study, we evaluated several online research databases, e.g.,
IEEE Xplore, ACM Digital Library, ISI Web of Knowledge, Scopus and Google Scholar.
To choose the most suitable data source, we reviewed several studies conducted for the
sole purpose of identifying the most comprehensive data sources for bibliometric studies,
e.g., Falagas et al. (2008), Archambault et al. (2009). Databases such as IEEE Xplore, ACM
Digital Library and Google Scholar seemed to have limited coverage. For example, IEEE
Xplore only provides the papers published by the IEEE. Similarly, the ACM Digital Library
only has the papers published by the ACM. The authors of (Archambault et al. 2009) have
done an interesting analysis by calculating the correlation of the number of papers by
country, based on inputs from the Web of Knowledge and Scopus. Interestingly, the
correlation is very high (R² = 0.994). Thus, we had two nearly equivalent alternatives to
choose from. If we were to conduct our study with either choice, we would expect to get
similar results. Between the two similar alternatives, we selected Scopus.

Fig. 1 Searching and filtering of the data in Scopus
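
The kind of cross-database check reported by Archambault et al. (2009) can be illustrated with a short sketch. The per-country paper counts below are hypothetical placeholders, not data from either database, and the function name `r_squared` is introduced here only for illustration; the sketch simply squares the Pearson correlation between two count series.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


# Hypothetical paper counts for four countries in two databases.
papers_database_a = [100, 200, 300, 400]
papers_database_b = [110, 210, 330, 420]

r2 = r_squared(papers_database_a, papers_database_b)
print(f"R^2 = {r2:.3f}")
```

A value close to 1 indicates that the two sources rank and scale countries almost identically, which is the argument the text uses for treating the two databases as near-equivalent alternatives.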
Scopus (2014) is a bibliographic database containing abstracts and citations for academic
journal articles. It covers nearly 21,000 titles from over 5000 publishers, of which
20,000 are peer-reviewed journals in the scientific, technical, medical, and social sciences.
Evaluating ease of use and coverage of Scopus and the Web of Science, a 2006 study
(Burnham 2006) concluded that “Scopus is easy to navigate, even for the novice user. The
ability to search both forward and backward from a particular citation would be very
helpful to the researcher”. Scopus also offers author profiles which cover affiliations,
number of publications and their bibliographic data, references, and details on the number
of citations each published document has received. Scopus has about 50 million records
(papers).

Search methodology

To search for the SE papers published by Turkish researchers and indexed by Scopus,
we entered the term “software” in the Source Title field of the Scopus search page, and the
name of the country “Turkey” in the Affiliation Country field. Two screenshots from the
searching and filtering process of the data in Scopus are shown in Fig. 1. The exact search
string that Scopus automatically generated using our search approach (as shown in Fig. 1)
is as follows:

(SRCTITLE (software) AND AFFILCOUNTRY (turkey)) AND (LIMIT-TO
(SUBJAREA, "COMP")) AND (EXCLUDE (EXACTSRCTITLE, "Advances in
Engineering Software")) AND (EXCLUDE (EXACTSRCTITLE, "Optimization
Methods and Software")) AND (EXCLUDE (EXACTSRCTITLE, "Environmental
Modelling and Software")) AND (EXCLUDE (SUBJAREA, "ENVI"))

To ensure that our pool contained only relevant SE papers, we had to exclude a few
journals which have the term “software” in their titles but are not really SE venues, e.g.,
Journal of Advances in Engineering Software, Journal of Optimization Methods and Software,
and Journal of Environmental Modelling and Software.
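
For readers who want to replicate the search for another country, the search string can be assembled programmatically. The following is a minimal sketch: the helper `build_scopus_query` and its parameters are hypothetical names introduced here, and the query syntax is copied from the string quoted in the text rather than checked against the Scopus documentation.

```python
def build_scopus_query(source_term, country, subject_area,
                       excluded_titles, excluded_areas):
    """Assemble a search string in the same shape as the one quoted in the text."""
    parts = [
        f"(SRCTITLE ({source_term}) AND AFFILCOUNTRY ({country}))",
        f'(LIMIT-TO (SUBJAREA, "{subject_area}"))',
    ]
    # One EXCLUDE clause per venue that contains "software" but is not an SE venue.
    parts += [f'(EXCLUDE (EXACTSRCTITLE, "{title}"))' for title in excluded_titles]
    parts += [f'(EXCLUDE (SUBJAREA, "{area}"))' for area in excluded_areas]
    return " AND ".join(parts)


query = build_scopus_query(
    source_term="software",
    country="turkey",
    subject_area="COMP",
    excluded_titles=[
        "Advances in Engineering Software",
        "Optimization Methods and Software",
        "Environmental Modelling and Software",
    ],
    excluded_areas=["ENVI"],
)
print(query)
```

Changing only the `country` argument would give the starting point for a replication elsewhere, although the list of excluded venues would probably need to be revisited for each country's result set.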
We checked the inclusion of the following major SE conferences (Xie 2011): Inter-
national Conference on Software Engineering (ICSE), International Symposium on the
Foundations of Software Engineering (FSE), International Conference on Automated
Software Engineering (ASE) and International Conference on Software Maintenance
(ICSM). To further validate the accuracy of the papers in our pool, we used the two well-
known accuracy metrics from the information retrieval and data-mining literature: preci-
sion and recall (Witten et al. 2011). Precision is the fraction of retrieved instances that are
relevant, while recall is the fraction of relevant instances that were retrieved.
To calculate the precision metric, our approach was to verify that each of the papers in
the pool was a relevant SE paper. All papers in the pool appeared to be relevant SE papers.
The precision rate thus was 100 %. To measure recall, we selected five representative
papers from the lists of publications of two well-known Turkish researchers (Ayşe Başar
Bener and Onur Demirörs). We then verified whether those five papers were included in
the pool. All five papers were in the pool, denoting that the recall
rate was 100 %.
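
The two checks above can be sketched in a few lines. This is an illustration only: the paper identifiers are hypothetical placeholders rather than the actual Scopus records, and the function names are assumptions introduced here.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


# Hypothetical identifiers standing in for the 289 papers in the pool.
pool = {f"paper-{i}" for i in range(1, 290)}

# Precision check: every paper in the pool was judged relevant on inspection.
judged_relevant = set(pool)

# Recall check: five known papers (hypothetical ids) expected to be in the pool.
known_sample = {"paper-3", "paper-41", "paper-77", "paper-150", "paper-289"}

pool_precision = precision(pool, judged_relevant)
sample_recall = recall(pool, known_sample)
print(f"precision = {pool_precision:.0%}, sample recall = {sample_recall:.0%}")
```

Note that recall is estimated here from a five-paper sample, so it is an indication rather than an exact measurement; a larger sample of known papers would give a tighter estimate.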
The search process resulted in 289 papers, as shown in Fig. 1, which we treated as the
pool of papers used for analysis in the rest of this paper. To ensure repeatability of this
study, all the raw data and the list of papers extracted from Scopus have been placed in an
online document (Garousi 2014) which the reader can download and analyze.

Results

We present the results in the “RQ 1: Annual trend of publications”–“RQ 7: Industry-academia
collaborations” sections for each of the RQs 1–8.

RQ 1: Annual trend of publications

The annual trend of papers in the pool is shown as both individual annual and
cumulative plots in Fig. 2. To put the Turkish statistics in context, we also show
in Fig. 3 the annual trend of SE papers world-wide, adopted from another recent world-wide
SE bibliometrics study (Garousi and Ruhe 2013). Note that the data in the world-wide chart
run only until 2009, which was the analysis end time of the study reported in Garousi and
Ruhe (2013). We discuss the main observations from these figures next:

Fig. 2 Annual trend of papers in the pool (top annual values, bottom cumulative values);
x-axis: years 1992–2014, y-axis: number of papers

Fig. 3 Annual trend of SE papers world-wide from another bibliometrics study (Garousi and
Ruhe 2013), between years 1992 and 2009; y-axis: cumulative number of papers

• The Turkish SE community was not very active until year 2008. Until 2006, the
community published fewer than ten papers a year in the SE venues indexed in Scopus.
• Although there are ups and downs even after 2007, the general trend shows healthy
growth.
• The data for year 2014 are partial, since this study was conducted before the year-end of
2014.
• By comparing the Turkish and world-wide cumulative charts, we can see that the former
had a surge around year 2007, while the latter has grown at a steadier rate.
We also wondered about the proportion of world-wide SE papers which originate from
Turkey. To measure this numerically, we searched for “software” in the venue (source
title) names without any other criteria. This search resulted in 58,114 papers, denoting
all the SE-related papers indexed in Scopus. We then divided the Turkish SE pool size (289
papers) by that number. In conclusion, we found that, as of this writing (September 2014),
Turkey produces only about 0.49 % (about half a percent) of the world-wide SE knowledge,
as measured by the number of papers in Scopus, which is a negligible share.
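
The share computation is a single division, shown here as a small sketch using the two counts stated in the text (the variable names are introduced only for illustration). The ratio evaluates to roughly 0.497 %, which the text reports as about 0.49 %, or about half a percent.

```python
turkish_pool = 289        # Turkish SE papers found in Scopus (from the text)
worldwide_pool = 58_114   # all papers in "software" venues in Scopus (from the text)

share_percent = 100 * turkish_pool / worldwide_pool
print(f"Turkish share of SE papers in Scopus: {share_percent:.3f} %")
```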

Fig. 4 Word cloud of paper titles (using the online tool at www.wordle.net)

RQ 2: Popular subject areas

To analyze the popular subject areas, one convenient way was to visualize the word cloud
of paper titles. We created the word cloud, shown in Fig. 4, using an online tool
(www.wordle.net). Common and obvious English terms (such as “software”) have been
removed for brevity. Also, the most frequent terms in paper titles are listed in Table 1.
We can see that, in order, defect analysis, testing, model-driven approaches, and software
development are among the most popular subject areas in the Turkish SE community. We
notice a lack of diversity across the general SE spectrum, e.g., very little focus on
requirements engineering, software maintenance and evolution, and software architecture.
This denotes the need to foster and encourage more research on these topics among the
Turkish SE researchers.
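
A term-frequency count of the kind summarized in Table 1 can be sketched as follows. This is not the procedure used by the online tool: the stop-word list and the example titles are hypothetical, and the function names are introduced here for illustration. The sketch counts, for each term, the number of titles containing it, and expresses a count as a rounded percentage of the pool.

```python
import re
from collections import Counter

# Assumption: a small hand-picked stop list; the actual list used is not given in the text.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "software", "the", "using"}


def papers_per_term(titles):
    """Count, for each term, the number of titles that contain it at least once."""
    counts = Counter()
    for title in titles:
        terms = set(re.findall(r"[a-z]+", title.lower()))
        counts.update(terms - STOP_WORDS)
    return counts


def percent_of_pool(paper_count, pool_size):
    """Share of the pool, rounded to a whole percent."""
    return round(100 * paper_count / pool_size)


# Hypothetical titles, used only to exercise the functions.
titles = [
    "Defect prediction using static code metrics",
    "A model for software defect prediction",
    "Testing of embedded software",
]
term_counts = papers_per_term(titles)
print(term_counts["defect"], term_counts["testing"])

# With the figures stated in the text: 28 of the 289 papers mention "defect".
print(percent_of_pool(28, 289))
```

Counting each term at most once per title matches the "number of papers" reading of Table 1; counting raw occurrences instead would inflate terms that repeat within a single title.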

RQ 3: Active authors

Scopus data include the list of authors for each paper. However, we had to conduct some
data cleaning and merging, e.g., “Bener A.B.” and “Bener A.” (the case of the first author
in Fig. 5) were treated as two different persons, whom we had to merge into one. The root
cause was that some authors had used different last names, or their middle names, in
some papers and not in others.
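
One simple way to automate part of this merging is to reduce each name to its last name and first initial before counting. The sketch below is a heuristic under stated assumptions, not the cleaning procedure used in this study: it assumes names in the "Lastname Initials" form seen in the example, the paper records are hypothetical, and the function names are introduced here. It would not catch authors who published under different last names, and it could wrongly merge two distinct people who share a last name and first initial, so a manual check remains necessary.

```python
from collections import Counter


def author_key(name):
    """Reduce a 'Lastname Initials' string to (last name, first initial)."""
    last_name, _, initials = name.strip().rpartition(" ")
    letters = initials.replace(".", "").replace("-", "")
    return (last_name.casefold(), letters[:1].casefold())


def count_papers_per_author(papers):
    """Count papers per merged author key; each paper is a list of author names."""
    counts = Counter()
    for authors in papers:
        # A set guards against one author being listed twice on the same paper.
        counts.update({author_key(author) for author in authors})
    return counts


# Hypothetical author lists for three papers, using name variants from the text.
papers = [
    ["Bener A.B.", "Turhan B."],
    ["Bener A.", "Tosun A."],
    ["Turhan B.", "Bener A.B."],
]
paper_counts = count_papers_per_author(papers)
print(paper_counts[author_key("Bener A.")])
```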
We show in Fig. 5 the ranking of authors who have been involved in at least five articles
in the pool. Ayşe Başar Bener, with 41 papers, is the 1st-ranking author. Onur Demirörs and
Burak Turhan, both with 22 papers, are in the 2nd and 3rd ranks.
To help young researchers and Ph.D. students thinking about graduate studies in SE, it
is important to also know the research areas of the active authors. We describe in Table 2
the research areas of the top-10 most active authors along with their affiliation information.
We can see that three of the top-10 active authors have moved out of Turkey and one has
moved from abroad to Turkey.

RQ 4: Active institutions

Active institutions with at least five papers in the pool of papers are shown in Fig. 6.
Middle East Technical University (METU), Boğaziçi University and Bilkent University
are the top three with 58, 55 and 33 SE papers, respectively. Out of the 12 entities in
Fig. 6, 10 are academic institutions, one is a national research center and funding agency
(TÜBİTAK) and one (ASELSAN Inc.) is a major firm (defense contractor).

Table 1 The most frequent terms in paper titles

Term           Number of papers   Percentage of all papers
Defect         28                 10
Testing        23                 8
Model          22                 8
Development    20                 7
Prediction     19                 7
Process        15                 5
Time           13                 4
Measurement    13                 4
Quality        12                 4

Fig. 5 Active authors: ranking of authors with at least five articles in the pool (bar
chart; values are numbers of papers)

Author             Number of papers
Bener, A.          41
Demirors, O.       22
Turhan, B.         22
Tekinerdogan, B.   11
Sozer, H.          11
Garousi, V.        9
Tarhan, A.         8
Buzluca, F.        8
Tuglular, T.       7
Tosun, A.          6
Catal, C.          6
Caglayan, B.       6
Yilmaz, C.         6
Misirli, A.T.      5
Dogru, A.H.        5
Belli, F.          5
Gencel, C.         5
Ulusoy, O.         5
Calikli, G.        5
Misra, S.          5

A natural question related to active institutions was about the geographic locations
where the Turkish SE research activities are conducted. We present the map of Turkey
showing the locations of the 12 active institutions in Fig. 7. As we can see, the 12 insti-
tutions are located in only three major cities, Ankara, İstanbul and İzmir. We notice in the
map a lack of geographical diversity in terms of the SE research in Turkey, especially in
the north and east of the country.

RQ 5: Comparing rankings of this study with other Turkish-wide rankings of institutions

As discussed in the “Related work” section, there are several Turkish-wide rankings of
institutions, the two major ones being the following: (1) the list of top active
institutions in Turkey, in all scientific disciplines, by their number of scientific papers
(ULAKBIM 2014), published regularly by the Turkish Academic Network and Information Center
(shortened as ULAKBİM in Turkish), and (2) the set of various rankings published by a
project named the University Ranking by Academic Performance (URAP) (2014), which is
conducted at the Middle East Technical University (METU). Neither of the above two sources
has a ranking focused on the field of SE. Still, to put our observations and rankings
in context, we compare the rankings generated in this study with the above two Turkish-
wide rankings of institutions.
Since the ranking of Fig. 6 in the “RQ 4: Active institutions” section contains 12
institutions (those with at least five papers in our pool of papers), we also extracted
the top-12 institutions from both the ULAKBİM ranking (2014)


Table 2 Affiliation and research areas of the top-10 most active authors

Rank | Author | Affiliation | Research areas
1 | Ayşe Başar Bener | Former affiliation: Boğaziçi University; current affiliation: Ryerson University in Canada | Software analytics, software measurement, software economics, software quality, green analytics
2 | Onur Demirörs | Middle East Technical University | Software process improvement models, software engineering standards, software engineering education
3 | Burak Turhan | Former affiliation: Boğaziçi University; current affiliation: University of Oulu in Finland | Test-driven development, software analytics, defect prediction, effort estimation
4 | Bedir Tekinerdoğan | Bilkent University | Software architecture design, aspect-oriented software development, model-driven software development and software product line engineering
5 | Hasan Sözer | Özyeğin University | Software architecture design, software reliability engineering, software fault tolerance, distributed and self-adaptive systems
6 | Vahid Garousi | Former affiliation: University of Calgary in Canada; current affiliation: Atilim University | Software test engineering, search-based software engineering, empirical studies and experimentation in software engineering
7 | Ayça Tarhan | Hacettepe University | Model-based assessment and improvement of software processes, software quality, software measurement, process management, and software engineering standards
8 | Feza Buzluca | Istanbul Technical University | Object-oriented software design
9 | Tuğkan Tuğlular | Izmir Institute of Technology | Software testing, information security testing, dependability
10 | Ayşe Tosun Mısırlı | Former affiliation: Boğaziçi University; current affiliation: University of Oulu in Finland | Software engineering, AI

and the URAP ranking (2014). Both rankings have been extracted from their respective
sources (ULAKBIM 2014; URAP team 2014) and are shown in Tables 3 and 4.
The ranking criteria of the two rankings (by ULAKBİM and URAP) are the number of
publications and the world ranking, respectively. Both tables also show the rank of
each institution in our ranking (Fig. 6). The results reveal low correlations between our
ranking and the rankings by ULAKBİM and URAP. The reason is that the two other rankings
are not focused on the field of SE and that the research output of SE researchers in a given
institution does not seem to be correlated with that of all other fields. In addition, most of
the Turkish SE researchers seem to be located in a few universities (as discussed above for
the geographic locations in Fig. 7).
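
One possible way to quantify the agreement between two rankings is a rank correlation coefficient. The sketch below is only an illustration and not the procedure used in this study: it assumes Spearman's coefficient (a choice made here) and is restricted to the six institutions that have a rank in both the URAP list and our ranking according to Table 4. The helper names `ranks` and `spearman` are hypothetical.

```python
def ranks(values):
    """Rank positions (1 = smallest value); assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman rank correlation for two equally long, tie-free sequences."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# (URAP country rank, rank in this study), copied from Table 4 for the
# institutions that are ranked in both lists. Institutions marked
# "Not in top 50" are left out, which is a simplification of this sketch.
both = {
    "Hacettepe University": (2, 8),
    "Middle East Technical University": (3, 1),
    "Ege University": (4, 12),
    "Istanbul Technical University": (7, 4),
    "Boğaziçi University": (8, 2),
    "Bilkent University": (10, 3),
}
urap = [u for u, _ in both.values()]
ours = [o for _, o in both.values()]
rho = spearman(urap, ours)
print(f"Spearman rho over {len(both)} institutions: {rho:.2f}")
```

A coefficient near +1 would indicate that the two rankings order these institutions similarly, while a value near zero or below indicates little or no agreement. With only six institutions, such a value is indicative at best.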


Institution | # of papers
Middle East Technical University | 58
Bogazici University | 55
Bilkent University | 33
Istanbul Technical University | 18
Atilim University | 15
[label missing] | 12
Ozyegin University | 12
Hacettepe University | 11
[label missing] | 10
[label missing] | 10
[label missing] | 7
Ege University | 5

Fig. 6 Active institutions with at least five papers in the pool of papers

Fig. 7 The map of Turkey showing the locations of the active institutions listed in Fig. 6 (using the tool
www.multiplottr.com)

RQ 6: International collaborations

Figure 8 lists the countries whose researchers have had at least two joint SE
papers with the Turkish SE researchers. Researchers from the United States, Canada and
the Netherlands have had the most collaboration (measured by the number of published joint
papers) with the Turkish SE researchers, being involved in 37, 35 and 11 papers,
respectively. One interesting observation is that, although the United States and Canada are
geographically much farther from Turkey than European countries are, the Turkish SE
researchers still seem to find common interests and collaborate quite often with
researchers from those countries.
We also ranked the foreign universities with which the Turkish SE
researchers had the most collaboration. The top-3 list consists of: Ryerson University (15


Table 3 List of top-12 institutions in Turkey, in all scientific disciplines, by their number of scientific
papers, during period 2002–2006, published by ULAKBIM (2014)

Country ranking | University | Number of publications | Rank number in our ranking
1 | Istanbul University | 4530 | Not in top 50
2 | Hacettepe University | 4147 | 8
3 | Ankara University | 2949 | Not in top 50
4 | Ege University | 2344 | 12
5 | Gazi University | 2220 | Not in top 50
6 | Middle East Technical University | 2208 | 1
7 | Atatürk University | 2028 | Not in top 50
8 | Dokuz Eylul University | 1703 | 13
9 | Istanbul Technical University | 1662 | 4
10 | Firat University | 1577 | Not in top 50
11 | Çukurova University | 1363 | Not in top 50
12 | Erciyes University | 1358 | Not in top 50

Table 4 Top-12 Turkish institutions during 2013–2014 according to the University Ranking by Academic
Performance (URAP) project (2014)

Country ranking | University | World ranking | Rank number in our ranking
1 | Istanbul University | 417 | Not in top 50
2 | Hacettepe University | 458 | 8
3 | Middle East Technical University | 474 | 1
4 | Ege University | 486 | 12
5 | Ankara University | 510 | Not in top 50
6 | Gazi University | 519 | Not in top 50
7 | Istanbul Technical University | 589 | 4
8 | Boğaziçi University | 734 | 2
9 | Çukurova University | 784 | Not in top 50
10 | Bilkent University | 818 | 3
11 | Suleyman Demirel University | 819 | Not in top 50
12 | Marmara University | 837 | Not in top 50

joint papers with Turkish academics), University of Calgary (9 joint papers), and
University of Oulu and University of Twente (each with 8 joint papers).
We also wanted to know whether the volume of internationally-authored SE papers has been
increasing over the years. Figure 9 shows the annual trend of all papers versus
internationally-authored SE papers from Turkey (i.e., papers with a researcher from Turkey
and one from abroad). After 2003, when the number of SE papers started to rise, the
number of internationally-authored SE papers from Turkey has also risen steadily.
This is a good sign for the Turkish SE community, as its researchers continue
collaborating with international researchers. In total, 89 papers in the pool (30.8 % of the
total) are internationally-authored SE papers.


Country | # of papers
United States | 37
Canada | 35
Netherlands | 11
Germany | 9
Finland | 8
[label missing] | 8
Spain | 5
[label missing] | 4
Panama | 3
Brazil | 2
China | 2
[label missing] | 2
Singapore | 2
Sweden | 2

Fig. 8 Countries whose researchers have had at least two joint SE papers with the Turkish SE researchers

[Chart: x-axis, year (1992–2014); y-axis, # of papers (0–50); series: All papers]

Fig. 9 Annual trend of all papers versus internationally-authored SE papers from Turkey

RQ 7: Industry-academia collaborations

For an applied field such as SE, it is very important to have active industry-academia
collaborations. Various challenges and experience reports on this
topic have been reported in the literature, e.g., Kathy (1997), Runeson and Minör
(2014), Wohlin (2013). We wanted to quantitatively measure the extent of such
collaborations. For each paper, we reviewed its authors and their affiliations and then tagged the
paper as fully written by academic authors, fully from industry, or an industry-academia
joint collaboration. The annual trends are shown in Fig. 10.
We can see that, in general, the involvement of industry is low. In total, only four
papers (1.4 % of the total) have only industrial authors, and only 13 papers (4.5 %) have
both industrial and academic authors. The remaining 272 papers (94.1 %) have been
authored by only academic authors. The first papers resulting from industry-academia


[Chart: x-axis, year (1992–2014); y-axis, # of papers (0–50); series: Academic papers, Only industrial authors]

Fig. 10 Annual trends of industry-academia collaborative SE papers, versus papers from only industrial
authors versus all papers in the pool

Table 5 The five most cited papers in the pool

Title of article | Year | Venue | Number of citations received | Authors (Turkish authors underlined in the original table)
On the relative value of cross-company and within-company data for defect prediction | 2009 | Empirical Software Engineering | 56 | Turhan B., Menzies T., Bener A.B., Di Stefano J.
Defect prediction from static code features: Current results, limitations, new approaches | 2010 | Automated Software Engineering | 40 | Menzies T., Milton Z., Turhan B., Cukic B., Jiang Y., Bener A.
Functional size measurement revisited | 2008 | ACM Transactions on Software Engineering and Methodology | 40 | Gencel C., Demirors O.
Medical image security and EPR hiding using Shamir’s secret sharing scheme | 2011 | Journal of Systems and Software | 24 | Ulutas M., Ulutas G., Nabiyev V.V.
A process model for component-oriented software engineering | 2003 | IEEE Software | 21 | Dogru A.H., Tanik M.M.

collaborations appeared quite late, in 2008. In recent years, we see a small increase in
the number of those papers, but a further increase is needed.
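
The percentages reported above follow directly from the three paper counts. As a minimal sketch (the dictionary keys and the helper name `shares` are chosen here for illustration only), the shares can be recomputed as follows:

```python
# Paper counts reported in the text for the pool of 289 papers.
counts = {
    "academic only": 272,
    "industry only": 4,
    "industry-academia": 13,
}


def shares(counts):
    """Return each category's share of the pool, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = shares(counts)
for name, pct in percentages.items():
    print(f"{name}: {pct} %")
```

The three counts add up to the full pool of 289 papers, and the rounded shares match the 94.1 %, 1.4 % and 4.5 % quoted in the text.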

RQ 8: Citation analysis

We present the following results next:


• RQ 8.1: top-cited publications
• RQ 8.2: citations to papers authored solely by Turkish authors versus those having
international authors
• RQ 8.3: citations versus publication years
• RQ 8.4: comparing the papers count and the citations with other countries


RQ 8.1: Top-cited publications

The five most cited papers in the pool are shown in Table 5. Except for the fifth paper, the
papers appeared quite recently (between 2008 and 2011). This could possibly
mean that the more recent papers are of higher quality, and thus attract more citations,
than the older papers. From the list of authors, we can also notice that
all but one of the top five papers are internationally-authored papers.

RQ 8.2: Citations to papers authored solely by Turkish authors versus those having
international authors

Similar to the citations analysis that we conducted in a recent systematic mapping of the
web application testing domain (Garousi et al. 2013a), we used two types of citation
metrics for this purpose:
1. Total number of citations to a paper since its publication.
2. Average number of citations per year since the publication of a paper.
The latter metric, defined as follows, is important to ensure fairness in
ranking and comparisons, since it neutralizes the effect of publication year (i.e., old versus
recent papers):

\[ \mathrm{AverageCitations}(p) = \frac{\mathrm{TotalCitations}(p)}{2014 - \mathrm{PublicationYear}(p) + 1} \]

In the above formula, p is a given paper, and TotalCitations(p) and PublicationYear(p)
are self-explanatory. The denominator counts the years from the publication year through
2014, inclusive. As an example of how the normalized number of citations is
calculated, the study (Turhan et al. 2009) has 56 total citations as of this writing and was
published in 2009. Thus, its normalized citation is calculated as:

\[ \mathrm{AverageCitations}(p) = \frac{56}{2014 - 2009 + 1} = \frac{56}{6} \approx 9.3 \]
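
The normalized metric can be sketched in a few lines of code. The function name `average_citations` and the `inclusive` flag are assumptions of this sketch, not part of the study. The flag makes the year counting explicit: counting both the publication year and 2014 gives a span of six years for a 2009 paper and reproduces the reported value of 9.3, whereas a plain difference of five years would give 11.2.

```python
def average_citations(total_citations, publication_year, last_year=2014, inclusive=True):
    """Average number of citations per year since publication.

    With inclusive=True, both the publication year and last_year are counted.
    This counting convention is an assumption of this sketch.
    """
    years = last_year - publication_year + (1 if inclusive else 0)
    if years <= 0:
        raise ValueError("the year span must be positive")
    return total_citations / years


# Turhan et al. (2009): 56 total citations, published in 2009.
print(round(average_citations(56, 2009), 1))                   # span of 6 years -> 9.3
print(round(average_citations(56, 2009, inclusive=False), 1))  # span of 5 years -> 11.2
```

Inclusive counting also keeps the metric defined for papers published in 2014, for which a plain difference would yield a zero denominator.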

[Individual-value plots, two panels: All citations (0–60); Average annual number of citations (0–9)]

Metric | Group | Average | Median
All citations | International collaborations | 3.84 | 1.0
All citations | Solely by Turkish authors | 1.99 | 0.0
Average annual number of citations | International collaborations | 0.69 | 0.21
Average annual number of citations | Solely by Turkish authors | 0.33 | 0.0

Fig. 11 Individual-value plots comparing citations to papers authored solely by Turkish authors versus
those having international authors


[Individual-value plots, two panels: All citations (0–400); Average annual number of citations (0–50)]

Metric | Pool | Average | Median
All citations | WAT | 31.90 | 12.00
All citations | GUI | 21.15 | 8.00
All citations | UML-SPE | 34.90 | 18.50
Average annual number of citations | WAT | 5.54 | 3.00
Average annual number of citations | GUI | 3.47 | 2.00
Average annual number of citations | UML-SPE | 4.72 | 2.65

Fig. 12 Individual-value plots showing citations to three pools of papers reviewed by three recent systematic
mapping studies (Garousi et al. 2013a, b; Banerjee et al. 2013)

Compared to the total number of citations, the average metric returns the
average number of citations of a paper per year since its publication year. Individual-value
plots showing the citations using both metrics are depicted in Fig. 11, which compares
papers authored solely by Turkish authors with those having international authors.
The average and median values of each data set are also shown.
There is no statistically significant difference between any pair of the data sets in either
of the two charts. However, we can notice a slight difference based on the average and
median values. In the total citations chart, internationally-authored papers
have slightly more citations (3.86 on average) than papers with all Turkish authors
(1.99 on average). The average number of citations chart shows a similar
situation. All distributions are skewed towards low values, denoting that most papers have
received few or no citations.
As a few points of reference, since we have access to citation data from three recent
systematic mapping studies conducted in three areas of SE, we compared the distributions
of the citations to Turkish SE papers with those data sets: (1) web application testing
(Garousi et al. 2013a), (2) GUI testing (Banerjee et al. 2013) and (3) UML-driven software
performance engineering (UML-SPE) (Garousi et al. 2013b). Figure 12 shows the citations
to these three pools of papers. As the charts and the average and median values show,
the citation values in the Turkish data sets are
substantially lower than those of all three pools of papers. This is an area of concern, and
further investigation is needed into why Turkish SE papers have been receiving few citations.

RQ 8.3: Citations versus publication years

Two scatter-plots showing the total and average citations to each paper versus its
publication year are displayed in Figs. 13 and 14. A quadratic regression fit is also shown
in each figure. The Pearson correlation values for the two figures are -0.23 and 0.16,
respectively. The former (weakly negative) value denotes that the later the publication year
of a paper, the fewer citations that paper has, i.e., newer papers
have fewer citations than older papers. This is as expected, since newer papers have had less
chance to be noticed, read and referenced by other papers. The latter Pearson correlation
value (0.16 for the data in Fig. 14) denotes that the average citations rise slightly with the


[Scatter-plot: x-axis, publication year of each paper (1992–2014); y-axis, # of citations to the paper (0–60)]

Fig. 13 Scatter-plot showing the number of citations to each paper versus its publication year

[Scatter-plot: x-axis, publication year of each paper (1992–2014); y-axis, average annual # of citations to the paper (0–9)]

Fig. 14 Scatter-plot showing the average annual number of citations to each paper versus its publication
year

[Scatter-plot on logarithmic axes: x-axis, # of papers (1 to 10000); y-axis, # of citations (1 to 100000); data points: Canada, Germany, Turkey, Iran, Greece]

Fig. 15 Scatter-plot comparing the number of papers from each country and the total number of
citations to those papers


increase in years. This means that newer papers tend to receive more citations per year, possibly
because they discuss more trendy topics, which are cited more by other papers.
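
For readers who wish to reproduce this kind of analysis on their own data, the Pearson correlation between publication year and citation counts can be computed as in the following sketch. The data values are hypothetical and are NOT the data of this study, and the per-year averages assume an inclusive year span up to 2014 (an assumption of this sketch).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return cov / sqrt(var_x * var_y)


# Hypothetical (publication year, total citations) values, for illustration only.
years = [2004, 2006, 2008, 2010, 2012, 2014]
totals = [12, 9, 10, 6, 3, 1]
# Citations per year, counting the publication year and 2014 inclusively.
averages = [c / (2014 - y + 1) for y, c in zip(years, totals)]

r_total = pearson(years, totals)
r_average = pearson(years, averages)
print(f"r(year, total citations)   = {r_total:.2f}")
print(f"r(year, average citations) = {r_average:.2f}")
```

A negative coefficient for the total counts together with a less negative or positive coefficient for the per-year averages would mirror the pattern described above.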

RQ 8.4: Comparing the papers count and the citations with other countries

To assess Turkey’s SE research performance on the global scale, we compared the paper
count and the total number of citations to Turkish SE papers with those of four selected
countries. We chose two of Turkey’s neighboring countries, Greece and Iran, and two developed
countries from North America and Europe, Canada and Germany.
Figure 15 shows a scatter-plot which visualizes the paper counts and the total numbers
of citations to SE papers from each of these countries. The data for the other four countries
have been extracted from Scopus in the same way as for Turkey
(“Search methodology” section). For better visualization, both axes in Fig. 15 are in
logarithmic scale. The same data are also shown numerically in Table 6, where an
additional metric (average citations per paper) is also computed. We discuss below our
observations based on these data:
• Turkey, Greece and Iran have performed quite similarly to one another, as have
the two developed countries (Canada and Germany). However, there is a vast
difference with respect to both metrics between the two “clusters”.
• While Greek researchers have published nearly twice as many SE papers as their Turkish
counterparts (559 vs. 289), the Turkish SE papers have received more total citations
than the Greek papers (744 vs. 349).
• Among the five countries, the Greek SE papers have the lowest average citations per
paper (only 0.6) and the Canadian SE papers have the highest (10.5).
Regarding the low share of the Turkish SE contributions on the global scale, the personal
opinion and observation of the author during his academic career in Turkey has been that
most Turkish authors seem to prefer publishing in national venues, the main one being the
Turkish National Software Engineering Symposium, in Turkish:
Ulusal Yazılım Mühendisliği Sempozyumu (UYMS) (2014). A simple calculation of the
number of papers published in the UYMS series of conferences can shed light on this issue.
The UYMS conference has been held each year between 2011 and 2014 and, before that,
in 2003, 2005, 2007 and 2009, thus eight times as of this writing. Each year, the
UYMS conference publishes about 50 papers, thus yielding a total of about 400 papers over
its eight editions. Comparing these roughly 400 papers in the UYMS symposium
with the 289 papers in the pool of Turkish SE papers in Scopus, we can notice the
tendency of Turkish authors to publish in national venues rather than in international
venues.

Table 6 Citation data for SE papers from five countries

Country | No. of SE papers | No. of citations | Average citations per paper
Turkey | 289 | 744 | 2.6
Iran | 392 | 552 | 1.4
Greece | 559 | 349 | 0.6
Germany | 4246 | 35,125 | 8.3
Canada | 3550 | 37,376 | 10.5
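
The last column of Table 6 is simply the number of citations divided by the number of papers. A minimal sketch of that computation, using the counts from Table 6 (the names `table6` and `citations_per_paper` are chosen here for illustration), is:

```python
# (number of SE papers, number of citations) per country, as listed in Table 6.
table6 = {
    "Turkey": (289, 744),
    "Iran": (392, 552),
    "Greece": (559, 349),
    "Germany": (4246, 35125),
    "Canada": (3550, 37376),
}


def citations_per_paper(papers, citations):
    """Average citations per paper, rounded to one decimal as in Table 6."""
    return round(citations / papers, 1)


averages = {
    country: citations_per_paper(papers, citations)
    for country, (papers, citations) in table6.items()
}

# Print the countries from highest to lowest average.
for country, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{country:8s} {avg:5.1f}")
```

The recomputed values agree with the table: Canada has the highest average (10.5) and Greece the lowest (0.6).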


Discussions

Section “Summary of findings, trends and implications for the Turkish SE community”
presents a summary of findings, trends and implications for the Turkish SE community.
Section “Potential threats to validity” discusses potential
threats to the validity of our study and the steps we have taken to minimize or mitigate them.

Summary of findings, trends and implications for the Turkish SE community

Below we summarize the main results of each RQ and discuss the relevant trends and
findings. We also present implications for the Turkish SE community regarding the
findings.
• RQ 1 (The annual trend of publications in Turkey in the field of SE):
  • The Turkish SE community was not very active until 2008. Until 2006, the
    community published fewer than 10 papers a year in the SE venues indexed in
    Scopus.
  • Although there have been ups and downs even after 2007, the general trend shows a
    healthy growth rate after 2008.
  • Turkey has so far produced only about 0.49 % of the world-wide SE knowledge, as
    measured by the number of papers in Scopus, which is unfortunately a very small
    share.
  • To take a more active role in the global SE community, the Turkish SE community
    has to increase its international output.
• RQ 2 (The popular subject areas):
  • The most popular subject areas in the Turkish SE community are, in order: defect
    analysis, testing, model-driven approaches, and software development.
  • We noticed a lack of diversity in the general SE spectrum, e.g., there has been very
    little focus on requirements engineering, software maintenance and evolution, and
    software architecture. This denotes the need for further diversification in SE
    research topics in Turkey.
• RQ 3 (The active authors):
  • Researchers Ayşe Başar Bener, Onur Demirörs and Burak Turhan are the most
    active authors. Prospective Ph.D. students could thus approach those and other
    top researchers in the list (Fig. 5) to conduct SE research in Turkey.
• RQ 4 (The active institutions):
  • Middle East Technical University (METU), Boğaziçi University and Bilkent
    University are the top three institutions. Prospective Ph.D. students could thus
    approach those and other top institutions in the list (Fig. 6) to conduct SE research
    in Turkey.
  • We notice a lack of geographical diversity in SE research in Turkey, as
    the top 12 institutions are located in three major cities only (Ankara, İstanbul and
    İzmir). This denotes the need for further geographical diversification of SE research
    in Turkey by establishing SE research groups in other cities.


• RQ 5 (Comparing rankings of this study with other Turkish-wide rankings of institutions):
  • There are low correlations between our ranking and the institution rankings provided by
    ULAKBIM (2014) and URAP team (2014). The reason is that the two other
    rankings are not focused on the field of SE and that the research output of SE researchers
    in a given institution does not seem to be correlated with that of all other fields. In
    addition, most of the Turkish SE researchers seem to be located in a few universities only.
• RQ 6 (Collaborations between Turkish SE researchers and the international SE
  community):
  • In total, 89 papers in the pool (30.8 % of the total) are internationally-authored SE
    papers. Having a good level of international collaborations is a good sign for the
    Turkish SE community.
  • The highest numbers of international collaborations have been with researchers from the
    United States, Canada and the Netherlands.
• RQ 7 (Industry-academia collaboration):
  • In total, only four papers (1.4 % of the total) have only industrial authors, and only
    13 papers (4.5 %) have both industrial and academic authors. In general, the
    involvement of industry is low.
  • The remaining 272 papers (94.1 %) have been authored by only academic authors.
  • All stakeholders (e.g., government, industry and academia) should aim at increasing
    the level of industry-academia collaboration in the Turkish SE community.
• RQ 8 (Citation analysis):
  • RQ 8.1 (Top-cited publications):
    • The three top-cited publications appeared between 2008 and 2010
      and are mostly internationally-authored papers.
  • RQ 8.2 (Citations to papers):
    • Internationally-authored papers have slightly more citations (3.86 on average)
      than papers with all Turkish authors (1.99 on average).
    • All citation distributions are skewed towards low values, denoting that most papers
      have received few or no citations. This denotes the need to analyze
      the reasons behind the low citation values and to take actions to increase those
      numbers.
    • We compared the distributions of the citations to Turkish SE papers with three
      other citation data sets: (1) web application testing (Garousi et al. 2013a), (2)
      GUI testing (Banerjee et al. 2013) and (3) UML-driven software performance
      engineering (UML-SPE) (Garousi et al. 2013b). The citation numbers in the
      Turkish data sets are substantially lower than those of all three pools of papers. This
      is an area of concern which needs further investigation.
  • RQ 8.3 (Correlation between citations and publication years):
    • Newer papers, in general, have fewer citations than older papers.
    • However, newer papers tend to receive more average annual citations, possibly
      because they discuss more trendy topics, which are cited more by other papers.


  • RQ 8.4 (Comparison of the count and the citations of Turkish SE papers with those of
    other countries):
    • Turkey, Greece and Iran have performed quite similarly to one another,
      but far below the two developed countries (Canada
      and Germany), which have performed quite similarly to each other.
    • While Greek researchers have published nearly twice as many SE papers as their
      Turkish counterparts (559 vs. 289), the Turkish SE papers have received more
      total citations than the Greek papers (744 vs. 349).
    • In general, there is a need to increase both the quantity and the quality of the
      Turkish SE papers on the global stage.

Potential threats to validity

For this study, the following potential threats to validity are applicable: internal, construct,
and external validity. We discuss below the steps we have taken to minimize or mitigate
those potential threats to validity.
Internal validity refers to the certainty that the different research questions studied are
properly based on the papers selected. We addressed internal validity and reduced
(selection) bias in two ways. Firstly, we chose the Scopus academic paper database, which has
been evaluated to be among the most comprehensive data sources (Falagas et al. 2008;
Archambault et al. 2009; Burnham 2006). Secondly, to validate the accuracy of the paper
selection, we used the two well-known accuracy metrics from the information retrieval and
data-mining literature: precision and recall (Witten et al. 2011) (section “Data source”).
Furthermore, to search for the SE papers published by researchers of each country, we
entered the term “software” in the Source Title field of the Scopus search page.
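
As a reminder of how the two accuracy metrics are defined, precision is the fraction of retrieved papers that are truly relevant, and recall is the fraction of truly relevant papers that were retrieved. The sketch below illustrates both with hypothetical paper identifiers; the function name `precision_recall` and the data are illustrative assumptions, not the values measured in this study.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the set of truly relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical paper identifiers, for illustration only.
retrieved = {"p1", "p2", "p3", "p4", "p5"}        # papers returned by the search
relevant = {"p1", "p2", "p3", "p4", "p6", "p7"}   # papers that are truly SE papers

precision, recall = precision_recall(retrieved, relevant)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this toy example, four of the five retrieved papers are relevant (precision 0.80) and four of the six relevant papers were found (recall about 0.67).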
Construct validity refers to the relationship between theory and observation, and
whether the proposed measurement actually represents the topic of investigation. In our
context, our goal was to conduct a bibliometric assessment of the Turkish SE community.
We used the simplest bibliometric metric, i.e., the number of papers, for this purpose. We
would have liked to use more sophisticated metrics such as the h-index (Hirsch 2005) and
the impact factor (Garfield 2005). However, we did not have the data for these metrics and
thus could not use them. We plan to conduct ranking studies using such
advanced metrics in future work.
External validity refers to the generalizability of results. All the steps performed have
been made transparent in the study. All the papers were taken from Scopus and are publicly
available. The data that we have extracted from Scopus are available in an online document
(Garousi 2014). Since our pool had a high level of precision, we believe that it is possible to
replicate and also generalize our results.

Conclusions and future work

Bibliometric studies are an established means to measure and analyze publications in an
area of research. Because of the pervasive character and continuously growing importance
of software, the study of SE is of substantial interest. While there is a growth of SE papers
in general, this study looks more deeply into publication trends in the Turkish SE


community. While there are other types of bibliometric studies in the SE community (e.g.,
Caia and Card 2008; Wohlin 2007, 2008; Glass and Chen 2001, 2002, 2003, 2005; Tse
et al. 2006; Wong et al. 2008, 2009, 2011; Ren and Taylor 2007; Garousi and Varma 2010;
Glass et al. 2002; McCain et al. 2005), no study to date has conducted the types of analysis
we conducted in this article.
Based on the analysis of 289 papers, eight research questions were studied in this
article. The bibliometric study revealed interesting insights into the contributions of the
Turkish SE community to the field of SE from 1992 to 2014. Among other benefits, these
insights are considered useful for decisions about research collaborations with Turkish SE
researchers and for selecting Ph.D. programs and research supervisors. The results can also be
used to see the impact of, and further need for, national funding initiatives in Turkey. The same
type of bibliometric study can be conducted in the contexts of other countries.
Note that this geographic bibliometric study is considered an initial step in the bibliometric
analysis of the Turkish SE community, and further investigations on the topic using additional
bibliometric techniques, metrics and concepts are necessary. The nature of this analysis is
very much quantitative, in terms of the number of publications. The qualitative impact in
general is harder to measure, and is seen as a topic of future research in this area. We also
plan to conduct additional analyses, e.g., interviews with the industry to increase their
involvement.

Acknowledgments Vahid Garousi was partially supported by several internal grants provided by
Hacettepe University and the Scientific and Technological Research Council of Turkey (TÜBİTAK).

References
Archambault, É., Campbell, D., Gingras, Y., & Larivière, V. (2009). Comparing bibliometric statistics
obtained from the web of science and scopus. Journal of American Society for Information Science, 60,
1320–1326.
Banerjee, I., Nguyen, B., Garousi, V., & Memon, A. (2013). Graphical User Interface (GUI) testing:
Systematic mapping and repository. Information and Software Technology, 55, 1679–1694.
Bas, K., Dayangac, M., Yaprak, O., Yuzer, Y., & Tokat, Y. (2011). International collaboration of Turkey in
liver transplantation research: A bibliometric analysis. Transplantation Proceedings, 43, 3796–3801.
Basili, V. R. (1992). Software modeling and measurement: The Goal/Question/Metric paradigm. In Tech-
nical Report, University of Maryland at College Park.
Burnham, J. (2006). Scopus database: A review. Biomedical Digital Libraries, 3(1), 1–8.
Caia, K.-Y., & Card, D. (2008). An analysis of research topics in software engineering—2006. Journal of
Systems and Software, 81, 1051–1058.
Ergul, S., Ardahan, M., Temel, A. B., & Yıldırım, B. Ö. (2010). Bibliometric review of references of nursing
research papers during the decade 1994–2003 in Turkey. International Nursing Review, 57, 49–55.
Evren, S., & Kozak, N. (2013). Bibliometric analysis of tourism and hospitality related articles published in
Turkey. Anatolia, 25, 61–80.
Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., & Pappas, G. (2008). Comparison of PubMed, Scopus, Web
of Science, and Google Scholar: Strengths and weaknesses. The FASEB Journal, 22, 338–342.
Farhoodi, R., Garousi, V., Pfahl, D., & Sillito, J. P. (2013). Development of scientific software: A systematic
mapping, bibliometrics study and a paper repository. International Journal of Software Engineering
and Knowledge Engineering, 23, 463–506. (in press).
Galler, B. A. (1969). ACM president’s letter: NATO and software engineering? Communications of the
ACM, 12, 301.
Garfield, E. (2005). The Agony and the ecstasy—The history and the meaning of the journal impact factor.
In International Congress on Peer Review in Biomedical Publication.
Garousi, V., Mesbah, A., Betin-Can, A., & Mirshokraie, S. (2013a). A systematic mapping study of web
application testing. Elsevier Journal of Information and Software Technology, 55, 1374–1396.


Garousi, V., Shahnewaz, S., & Krishnamurthy, D. (2013b). UML-driven software performance engineering:
A systematic mapping and trend analysis. In V. G. Díaz, J. M. C. Lovelle, B. C. P. García-Bustelo, &
O. S. Martínez (Eds.), Progressions and innovations in model-driven software engineering. Hershey:
IGI Global.
Garousi, V., & Varma, T. (2012). A bibliometrics analysis of Canadian Electrical and Computer Engi-
neering Institutions (1996–2006) based on IEEE Journal Publications. Canadian Journal on Computer
and Information Science, 5, 1–24.
Garousi, V. (2014). Data for the study: A bibliometric analysis of the Turkish Software Engineering
Community. http://goo.gl/KB0tDt. Accessed July 2015.
Garousi, V., & Ruhe, G. (2013). A bibliometric/geographic assessment of 40 years of software engineering
research (1969–2009). International Journal of Software Engineering and Knowledge Engineering, 23,
1343–1366.
Garousi, V., & Varma, T. (2010). Bibliometric assessment of Canadian software engineering scholars and
institutions (1996–2006). Canadian Journal on Computer and Information Science, 3, 19–29.
Geist, R. M., Chetuparambil, M., Hedetniemi, S., & Turner, A. J. (1996). Computing research programs in
the US. Communications of the ACM, 39, 96–99.
Glass, R. L., & Chen, T. Y. (2001). An assessment of systems and software engineering scholars and
institutions (1996–2000). Journal of Systems and Software, 59, 107–113.
Glass, R. L., & Chen, T. Y. (2002). An assessment of systems and software engineering scholars and
institutions (1997–2001). Journal of Systems and Software, 64, 79–86.
Glass, R. L., & Chen, T. Y. (2003). An assessment of systems and software engineering scholars and
institutions (1998–2002). Journal of Systems and Software, 68, 77–84.
Glass, R. L., & Chen, T. Y. (2005). An assessment of systems and software engineering scholars and
institutions (1999–2003). Journal of Systems and Software, 76, 91–97.
Glass, R. L., Vessey, I., & Ramesh, V. (2002). Research in software engineering: An analysis of the
literature. Information and Software Technology, 44, 491–506.
Harman, M., Mansouri, S., & Zhang, Y. (2009). Search-based software engineering: A comprehensive
analysis and review of trends techniques and applications. In King’s College London, Technical Report
TR-09-03. http://www.dcs.kcl.ac.uk/technical-reports/papers/TR-09-03.pdf
Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the
National Academy of Sciences, 102, 16569–16572.
Jia, Y., & Harman, M. (2010). An analysis and survey of the development of mutation testing. IEEE
Transactions on Software Engineering, 37, 649–678.
Kathy, B. (1997). Collaborations: Closing the industry-academia gap. IEEE Software, 14, 49–57.
McCain, K. W., Verner, J. M., Hislop, G. W., Evanco, W., & Cole, V. (2005). The use of bibliometric and
knowledge elicitation techniques to map a knowledge domain: Software engineering in the 1990s.
Springer Scientometrics, 65, 131–144.
Meyer, B., Choppy, C., Staunstrup, J., & Leeuwen, J. V. (2009). Viewpoint: Research evaluation for
computer science. Communications of the ACM, 52, 31–34.
UYMS Symposium Organizers. (2014). Turkish National Software Engineering Symposium, acronym in
Turkish: UYMS. http://www.uyms.org.tr. Accessed July 2015.
Parnas, D. L. (2007). Stop the numbers game. Communications of the ACM, 50, 19–21.
Rashidi, A., Rahimi, B., & Delirrad, M. (2013). Bibliometric analysis of parasitological research in Iran and
Turkey: A comparative study. Iranian Journal of Parasitology, 8, 313–322.
Ren, J., & Taylor, R. N. (2007). Automatic and versatile publications ranking for research institutions and
scholars. Communications of the ACM, 50, 81–85.
Ren, J., & Taylor, R. N. (2009). A Java tool for ranking institutions and authors by publications.
www.isr.uci.edu/projects/ranking. Accessed July 2015.
Runeson, P., & Minör, S. (2014). The 4+1 view model of industry–academia collaboration. In International
Workshop on Long-term Industrial Collaboration on Software Engineering.
Scopus search engine. (2014). http://www.scopus.com. Accessed July 2015.
Tse, T. H., Chen, T. Y., & Glass, R. L. (2006). An assessment of systems and software engineering scholars
and institutions (2000–2004). Journal of Systems and Software, 79, 816–819.
Turhan, B., Menzies, T., Bener, A. B., & Stefano, J. D. (2009). On the relative value of cross-company and
within-company data for defect prediction. Empirical Software Engineering, 14, 540–578.
Turkish academic network and information center (ULAKBİM), part of the scientific and technological
research council of Turkey (TÜBİTAK). (2014). Top-80 active institutions in Turkey, in all scientific
disciplines, in five-year periods (Turkish: Türkiye’ye ait en çok yayın yapan 80 kuruma ait, tüm bilim
alanlarında 5 yıllık dönemler halinde).
https://www.ulakbim.gov.tr/cabim/yayin/tbyg_1981-2006/4_5.php?u=NULL&y=0&command=G%F6ster. Accessed July 2015.

URAP team. (2014). University Ranking by Academic Performance (URAP) project and laboratory.
http://www.urapcenter.org. Accessed July 2015.
Witten, I. H., Frank, E., & Hall, M. A. (2011). Data mining: Practical machine learning tools and tech-
niques. Amsterdam: Elsevier.
Wohlin, C. (2007). An analysis of the most cited articles in software engineering journals—2000. Infor-
mation and Software Technology, 49, 2–11.
Wohlin, C. (2008). An analysis of the most cited articles in software engineering journals—2001. Infor-
mation and Software Technology, 50, 3–9.
Wohlin, C. (2013). Empirical software engineering research with industry: Top 10 challenges. In
Proceedings of the International Workshop on Conducting Empirical Studies in Industry.
Wong, W. E., Tse, T. H., Glass, R. L., Basili, V. R., & Chen, T. Y. (2008). An assessment of systems and
software engineering scholars and institutions (2001–2005). Journal of Systems and Software, 81,
1059–1062.
Wong, W. E., Tse, T. H., Glass, R. L., Basili, V. R., & Chen, T. Y. (2009). An assessment of systems and
software engineering scholars and institutions (2002–2006). Journal of Systems and Software, 82,
1370–1373.
Wong, W. E., Tse, T. H., Glass, R. L., Basili, V. R., & Chen, T. Y. (2011). An assessment of systems and
software engineering scholars and institutions (2003–2007 and 2004–2008). Journal of Systems and
Software, 84, 162–168.
Xie, T. (2011). Software Engineering Conferences. http://people.engr.ncsu.edu/txie/seconferences.htm.
Accessed July 2015.
