Citation performance of a publication depends heavily on its academic field. Some words in keywords, titles, and abstracts of publications may be indicative of their academic field. Therefore, analysis of differences in citation performance of these words helps us understand inter-field differences in citation performance. In this article, we analyzed citation performance of publications that contain certain words in their keywords, titles, and abstracts in Web of Science from 2010 to 2012. We found that some words do not have a consistent performance. For instance, publications that use a certain word in their keywords have a different average performance compared to publications that use the same word in their titles. Next, we investigated keywords, titles, and abstracts separately. We laid out the words that have the lowest and highest average citations. Words that contain animal names, country names, and mathematical concepts are among the worst performers. Words that contain terminology specific to a scientific field and have relatively lower frequency are among the best performers.
There are wide differences among citation performances of different academic fields. These differences are problematic when publications from distinct academic fields are compared. For example, a dean of a college who evaluates both chemists and economists for promotion should understand the inter-field differences in citation performances.
Consider an administrator who has to evaluate the research output of a faculty member. If the administrator notices that the word “experiment” is frequently used in the titles of the researcher's publications, they may want to learn whether the publications come from a high citation field. The simplest way to answer this question is to look at the average citation performance of publications that contain the word “experiment” in their titles.
It is technically straightforward to list all words in keywords, titles, and abstracts. It is also easy to compute citation performance of publications that contain each of these words. Once the list is prepared, administrators can easily use this guide to judge whether publications are coming from a high citation field and apply the necessary normalizations.
The most important advantage of utilizing this methodology is its simplicity. A typical researcher may not possess the bibliometric expertise to understand more sophisticated methods for research evaluation. In contrast, it does not require much expertise to understand that publications that contain the word “cancer” in their titles have high citation performance.
In this article, we investigated the viability of this method. We computed the performance of frequently used words in keywords, titles, and abstracts in Web of Science from 2010 to 2012. For each word, we computed the average number of citations received by the publications that contain it. Then, the citation performance was normalized by the publication year.
The method was evaluated in two parts. In the first part, citation performance was analyzed in terms of consistency. If publications that contain “theory” in their titles perform differently from publications that contain “theory” in their abstracts, then research evaluation would depend on whether titles or abstracts are being used. This dependence would decrease the appeal of the method.
In the second part, the words that had the highest and lowest citation performance were listed, and we examined whether words with extreme values are generic or indicative of the subject of publications. Consider the case where publications that have the preposition “at” in their titles outperform other publications. It would not be appealing to use this fact in the normalizations. In contrast, the word “empirical” is indicative of the field of publications. Consequently, using the citation performance of “empirical” in a normalization would not draw many objections.
2 Related literature
There are many studies that analyze citation performance of publications by their keywords, titles, and abstracts. Tahamtan et al. (2016) provided a detailed review about this literature. The literature on this subject can be broadly divided into two groups. The first group analyzed the content of words in keywords, titles, and abstracts. Studies that analyzed the citation performance of publications that contained country names in their titles can be considered in this group. The second group focused on the structure of titles and abstracts. Studies that analyzed citation performance of publications by the length of their titles fall into this group. This article is more related to the first group of studies.
Some studies found that publications that have unfamiliar or new words in their keywords and titles attract fewer citations. Thelwall (2017) analyzed the citation performance of articles from a large Scopus dataset. The study defined a title as obscure if it contained unique words that had not been used in the titles of other articles. It was found that articles that have obscure titles have lower citation performance. Uddin and Khan (2016) analyzed research output in the obesity research area. Publications that have newly introduced keywords were found to receive fewer citations. In contrast, keyword diversity and the number of keywords had a positive effect on citation performance. Fox and Burns (2015) had detailed information about the editorial process of the journal Functional Ecology. The study found that publications that have specific species names in their titles are rejected more often at the editorial stage and cited less often once they are accepted.
In this article, best and worst performing words in keywords, titles, and abstracts are found. The findings are not in accordance with the results of the previous literature. Words that contain specific terminology are found to attract highest level of citations, whereas generic words are found to attract lowest level of citations. Moreover, frequently used words are not among best performing words. This study is more general than the previous literature in two aspects. First, a more general dataset is used and all publications in Web of Science for three years are included in this study. Second, all possible words in keywords, titles, and abstracts in terms of citation performance are ranked. Consequently, this study is not restricted to new keywords or subclass of terminology such as species names.
Abramo et al. (2016) analyzed publication data from Italy. They showed that articles whose keywords, titles, and abstracts contain the word “Italy” attract fewer citations. From a more general perspective, the results in this article are consistent with this result. We found that publications that contain some country names in their keywords, titles, and abstracts are among the publications that have the lowest citation performance.
Gazni (2011) considered abstracts of all Web of Science publications of five institutions. He ranked abstracts according to their readability. The readability measure treats texts that contain shorter sentences and shorter words as easy texts. The study found that publications that have harder abstracts attract more citations.
Length of titles is another important factor that affects the citation performance. There is no consensus about whether publications that have short titles are advantageous. For example, Van Wesel et al. (2014) found that publications that have long titles attract more citations. In contrast, Habibzadeh and Yadollahie (2010) used another dataset to reach the opposite conclusion.
Other structural aspects of titles are also explored in previous studies. Titles that contain non-alphanumeric characters (Nair and Gibbert 2016, Gnewuch and Wohlrabe 2017), colons (Jamali and Nikzad 2011), and punctuations (Fumani et al. 2015) have different citation performances than titles without these characteristics.
The main purpose of this study was to propose a simple way to judge differences in citation performance of different fields. If words in keywords, titles, and abstracts are indicative of the field, then differences in citation performance of words help us understand the inter-field differences in citation performance.
Some Web of Science subject categories involve hundreds of journals. Therefore, there are wide inter-field differences within a single subject category. One study analyzed citation performances of award-winning mathematicians (Smolinsky and Lercher 2012). Although these mathematicians were evaluated similarly by their peers, their citation performance heavily depended on their subfields. Another study analyzed three Spanish psychology journals (Buela-Casal et al. 2009). Theoretical psychology studies are found to attract more citations than empirical ones.
Inter-field differences exist even when a single journal is considered. Johnston et al. (2013) analyzed citation performance of articles from American Economic Review. They found that empirical publications attract more citations than theoretical publications from the same journal.
There are two main advantages of the basic methodology that we introduce. First, we used the whole Web of Science including science and social science publications so that our results are general. For example, it is possible to compare a social scientist to a scientist in a distinct academic field. Second, since the methodology uses the publication as its unit, it is possible to explain inter-field differences within a journal. Some specific words in keywords, titles, and abstracts may separate publications that attract more citations within a single journal. As a result, there will be a better understanding of microlevel field differences when the methodology proposed in this article is used.
There are many ways to account for inter-field differences, yet no standard method has been established. Normalizations usually involve two steps. First, publications are grouped by subject categories. Second, citation statistics that are attained from subject categories are used for normalizations. The percentile ranking of a journal’s impact factor within its subject category is an example for such a statistic.
There is no consensus on which subjects are to be used. Web of Science subject categories are widely used for these normalizations (Ruiz-Castillo and Waltman 2015). However, Web of Science subject categories are constructed by using ad hoc methods (Pudovkin and Garfield 2002). One of the methods used in determining Web of Science categories is to choose a group of journals as the core journals of a subject. Then, a visual examination of the citation information of the core journals is used to determine which journals should be included in the subject category.
Categorization of subjects is mainly done by co-citation and/or co-word analysis (Besselaar and Heimeriks 2006). Co-citation analysis groups together publications that are frequently cited together in the reference lists of other publications. Co-word analysis clusters publications that use similar words in keywords, titles, and abstracts. Co-word analysis is used on its own in some studies (e.g., Bhattarchaya and Basu 1997, Blatt 2009). In other studies, both methods are used together (e.g., Colliander 2015, Waltman and van Eck 2012).
The method used in this study is related to the categorization done by co-word analysis. Our method is simple and primitive. We treated each word separately and grouped publications according to these single words. If categorization by single words can explain field differences properly, then it is more plausible that a cluster of words also produces meaningful results.
Milojevic et al. (2011) conducted a co-word analysis from 16 journals in Web of Science. They were able to integrate words that had the same semantic structure. There were tens of thousands of words in the dataset that we used. Unfortunately, it was not technically feasible to integrate similar words in this study. Consequently, we only analyzed the most commonly used structure of the words.
Leydesdorff (1997) conducted a co-word analysis in biochemistry subject to categorize publications into broad categories such as theoretical and empirical. The author noted that the same word may have different meanings in different contexts; therefore, confusions are inevitable. Our study spanned the whole science and social science fields. Therefore, our findings would be more vulnerable to such confusions.
There is also no consensus on which statistics are to be used once a subject categorization method is chosen. For example, Sombatsompop and Markpin (2005) took the average impact factor of the subject category and divided each journal’s impact factor by this statistic to achieve inter-field equality. Alternatively, Ramirez et al. (2000) used the median and maximum impact factor of the subject category for normalization.
We retrieved publication records from Web of Science in July 2017. We only analyzed publications from the 10,848 journals that are in Journal Citation Report 2012. All journals from this report are indexed in either Science or Social Science Citation Index.
There were 3.75 million publications from these journals from 2010 to 2012. About 92% of these publications are articles. The number of received citations as of July 2017 is used for citation performance of these publications.
The large dataset permitted us to generate general results. There were tens of thousands of keywords in the data. The number of words in titles and abstracts was even larger. Thus, it was not technically feasible to integrate semantically similar words.
Keywords, titles, and abstracts from publications were analyzed. Titles and abstracts were parsed into single words. Composite words that contain hyphens were not separated. Keywords were used in their original form. Therefore, there were keywords that contain more than one word. This was problematic in the consistency part of the analysis because the performance of keywords was compared to single words from titles and abstracts.
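The parsing step described above can be illustrated with a minimal sketch. The authors used Perl; Python is swapped in here purely for illustration. The function names, the lowercasing, and the exact punctuation handling are assumptions, since the article only states that titles and abstracts were split into single words and that hyphenated composites were kept whole.

```python
import re

# One token = a run of letters/digits, optionally joined by hyphens or
# apostrophes, so "genome-wide" and "alzheimer's" stay in one piece.
# (Assumed rule; the article does not specify punctuation handling.)
_TOKEN_RE = re.compile(r"[a-z0-9]+(?:[-'’][a-z0-9]+)*")


def tokenize(text: str) -> list[str]:
    """Split a title or abstract into lowercase single words."""
    return _TOKEN_RE.findall(text.lower())


def distinct_words(text: str) -> set[str]:
    """Words of one publication, each counted once (an assumption)."""
    return set(tokenize(text))
```

Keywords would bypass this function entirely, because they are used in their original, possibly multi-word, form.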
Because of this, only keywords that were contained in more than 20,000 publications were considered in the consistency part. There was only one keyword that had more than one word and was used in more than 20,000 publications.
All computations were performed by using Perl programs. At the first stage, we gathered the list of all keywords and all words in titles and abstracts. Then, we computed the average performance of the publications that contained a specific word. Due to computational problems, it was not possible to compute citation performance of all the words. Consequently, we restricted our attention to the words that were contained in more than 3,000 publications.
4 Year-adjusted normalization
Citation performances of words were computed separately for keywords, titles, and abstracts. Therefore, a specific word such as “data” may have a different performance depending on whether it comes from keywords, titles, or abstracts.
2010 publications have more time to attract citations than 2012 publications. Because of this, citation performance of a publication is normalized by its publication year.
The normalized citation performance of publications by their keywords was computed as follows. Let $w$ denote the keyword and $y$ the publication year.
$C_{w,y}$ is the total number of citations of publications that contain keyword $w$ in the year $y$. $N_{w,y}$ is the total number of publications that contain keyword $w$ in the year $y$.
$C_y$ is the total number of citations that are received by all keywords in the year $y$. If there are $S$ separate keywords in a publication, then the citations of that publication are counted $S$ times.
$N_y$ is the total number of all keywords in the year $y$. If there are $S$ separate keywords in a publication, then that publication is counted $S$ times. Therefore, $C_y/N_y$ is the publication performance of an average keyword in the year $y$.
$P_{w,y}$ is the average publication performance of $w$ in the year $y$. It is computed as follows:
$$P_{w,y} = \frac{C_{w,y}/N_{w,y}}{C_y/N_y}$$
If $P_{w,y}$ is greater than 1, then $w$ has a higher average citation than the average keyword of that year.
Year-adjusted keyword performance (YK) is the average performance of $w$ through years. It is simply computed by taking the average over all years in which $w$ is used. If $w$ is used in all 3 years, then the formula of YK is as follows:
$$YK_w = \frac{1}{3}\sum_{y=2010}^{2012} P_{w,y}$$
If we use the keyword “data” as w, then YK measures the average citation performance of all publications that have the keyword “data” relative to the average citation performance of all keywords. If YK is above 1 for a certain keyword, we can say that the citation performance of that keyword is above the average citation performance of all keywords.
A simple variation of the abovementioned formula was used to compute performances of publications by their titles and abstracts. Instead of using a certain keyword for “w” in the abovementioned formula, we used a certain word in the titles or abstracts of the publication. If “w” stands for a certain word in the title, then the resulting average performance is called “year-adjusted title word performance” (YT). Alternatively, if “w” stands for a certain word in the abstract, then the resulting average performance is called “year-adjusted abstract word performance” (YA).
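The computation described in this section can be sketched in a few lines. The authors used Perl; Python is swapped in here for illustration only. The function name, the record layout `(year, citations, terms)`, and the `min_pubs` parameter are hypothetical, and the sketch assumes that a term occurring more than once in the same publication is counted once.

```python
from collections import defaultdict


def year_adjusted_performance(records, min_pubs=3000):
    """Return {term: year-adjusted performance} for frequent terms.

    records: iterable of (year, citations, terms), one per publication,
    where terms are its keywords (for YK) or its title/abstract words
    (for YT/YA).
    """
    cit_wy = defaultdict(float)  # citations of publications with term w in year y
    n_wy = defaultdict(int)      # number of publications with term w in year y
    cit_y = defaultdict(float)   # citations summed over all term occurrences in y
    n_y = defaultdict(int)       # number of term occurrences in y

    for year, citations, terms in records:
        for w in set(terms):
            # A publication with S distinct terms is counted S times.
            cit_wy[(w, year)] += citations
            n_wy[(w, year)] += 1
            cit_y[year] += citations
            n_y[year] += 1

    yearly = defaultdict(list)   # per-year relative performance of each term
    total = defaultdict(int)     # publications per term over all years
    for (w, year), n in n_wy.items():
        baseline = cit_y[year] / n_y[year]  # average term of that year
        if baseline > 0:
            yearly[w].append((cit_wy[(w, year)] / n) / baseline)
        total[w] += n

    # Average only over the years in which the term is used.
    return {w: sum(v) / len(v) for w, v in yearly.items() if total[w] >= min_pubs}
```

Passing keywords as `terms` gives YK; passing title words or abstract words gives YT or YA. A value above 1 means the term outperforms the average term of the same source.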
YK, YT, and YA can have different values even for the same word. If we choose w to be “empirical,” then, YT gives the average performance of publications that contain “empirical” in their titles. This value can be different than YA which gives the average performance of publications that have “empirical” in their abstracts.
We tested whether publications that contain a certain word as a keyword have performance comparable to publications that contain the same word in their titles and abstracts.
Table 1 summarizes 45 words that are in at least 20,000 keywords, 20,000 titles, and 20,000 abstracts. Ten bold words have above average performance as words in keywords, titles, and abstracts. Fourteen underlined words have below average performance as words in keywords, titles, and abstracts. The remaining 21 words have no consistent performance. For example, “trial” is not a high performing word in abstracts, but it has the highest performance as a title word.
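The three-way grouping used for Table 1 can be expressed as a small rule. This is a sketch only: the function name and labels are hypothetical, and it assumes that "above average" means a value greater than 1 in all three sources and "below average" means a value less than 1 in all three.

```python
def classify_consistency(yk: float, yt: float, ya: float) -> str:
    """Label a word by its performance as keyword, title word, and abstract word."""
    scores = (yk, yt, ya)
    if all(s > 1.0 for s in scores):
        return "consistently above average"
    if all(s < 1.0 for s in scores):
        return "consistently below average"
    return "no consistent performance"
```

A word that is above 1 in titles but below 1 in abstracts would fall into the third group regardless of its keyword value.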
Table 1. Consistency of year-adjusted performance of words in keywords, titles, and abstracts.
There are five words from life sciences that have consistently above average performance. These words are as follows: “cells,” “brain,” “cancer,” “protein,” and “gene.” When we look at other high performing words, “nanoparticles” points to a specific academic subject; “mice” indicates that publications contain an experiment; “activation,” “expression,” and “complexes” are words that give little clue about the field that publications belong to.
“Health” and “surgery” are the only two underperforming words that are potentially from publications in life sciences. There are many underperforming words that are not specific to a field such as “women,” “model,” or “system.”
We used the entire Web of Science dataset in this study. Therefore, words summarized in Table 1 are from publications from a wide range of scientific fields. We kept our analysis general because we tried to form a guide to compare distinct academic fields. However, there is a high level of variation of citation performance among publications that contain the same word. Consequently, we cannot claim that citation performance of any of the words is greater than one with statistical significance.¹
6 Best performing words
Table 2 summarizes the 30 best performing keywords that have been included in at least 3,000 publications. The keywords are sorted in descending order. “Embryonic stem-cells” has a YK of 2.52 and “innate immunity” has a YK of 1.57.
Table 2. Best 30 performers in terms of YK.
|embryonic stem-cells; carbon nanotubes; field-effect transistors; graphite; genome-wide association; caenorhabditis-elegans; DNA methylation; living cells; regulatory t-cells; gold nanoparticles; tgf-beta; one-pot synthesis; quantum dots; functionalization; electrodes; acute myeloid-leukemia; long-term potentiation; activated protein-kinase; nf-kappa-b; genome; placebo-controlled trial; arabidopsis-thaliana; growth-factor receptor; synaptic plasticity; mouse model; mesenchymal stem-cells; growth-factor-beta; drug-delivery; human brain; innate immunity|
The structure of top performing keywords is different from the general distribution. There are 956 keywords that have been in at least 3,000 publications. Only 62 (6%) of them have more than one word. However, 19 out of 30 (63%) of the top performing keywords have multiple words.
Keywords that have highest frequency are not prevalent among high performers. There are 189 keywords (20%) that are contained in more than 10,000 publications. There is only 1 out of 30 top performing keywords which is contained in more than 10,000 publications.
Both top performing and frequently used words are very important. If an administrator decides to use keywords in the normalization, then they would normalize the citation performance of publications by dividing it by the YK values of the publications' keywords. Since frequently used keywords would be more prevalent among publications, the YK values of these keywords would be used more often. In addition, keywords that have extreme YK values would affect the normalized values most.
As summarized in Table 2, the top performing keywords contain special terminology that cannot be understood by the layperson. “caenorhabditis-elegans” and “tgf-beta” are examples of such keywords. When we look up definitions of these keywords, we see that most of the high performing keywords from Table 2 are from life sciences.
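A minimal sketch of such a normalization follows. The article does not say how the YK values of several keywords in one publication should be combined, so this sketch assumes a plain mean; the function name and the lookup table are hypothetical, and the values used with it would come from a list such as Table 2.

```python
from statistics import mean


def normalize_citations(citations, keywords, yk_table):
    """Divide a publication's citation count by the mean YK of its keywords.

    yk_table maps keyword -> YK. Keywords missing from the table are
    ignored; None is returned when no keyword has a known YK.
    Averaging the YK values is an assumption, not the authors' rule.
    """
    known = [yk_table[k] for k in keywords if k in yk_table]
    if not known:
        return None
    return citations / mean(known)
```

Under this assumption, a publication whose keywords have high YK values is scaled down, and one whose keywords have low YK values is scaled up, which is why extreme YK values affect the normalized results most.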
The success of publications that contain words from life sciences is not surprising. Life science journals are known to have high impact factors. According to 2016 Journal Citation Report, all top 14 subject categories are related to life sciences. Therefore, our basic categorization is consistent with Web of Science subject categorization in this sense.
Table 3 provides 30 top performing words that are used at least 3,000 times in titles. The list is sorted in a descending order. “graphene” has 4.14 YT, and “topological” has 1.93 YT.
Table 3. Best 30 performers in terms of YT.
|graphene; batteries; sequencing; randomized; meta-analysis; genome-wide; guidelines; society; advances; systematic; 2010; update; immunity; methylation; photocatalytic; histone; genome; inflammation; alzheimer’s; recent; mammalian; high-performance; arabidopsis; recommendations; reveals; solar; regulates; photovoltaic; targeting; topological|
Similar to our findings about keywords, top performing title words are not among the most frequent words. Around a quarter of title words are in more than 10,000 publications, whereas only 3 out of 30 top performing words are that frequent.
Top performing title words are mostly composed of scientific terminology. However, there are exceptions such as “recent,” “2010,” and “guidelines.” Many researchers might object if publication performance of articles that have “recent” in their titles is used for research evaluation. Accordingly, these words should be extracted from the list. However, such a human intervention would decrease the appeal of this method.
Table 4 summarizes best 30 performing words that are contained in at least 3,000 abstracts. The list is in descending order, and the values of YA are descending from 4.30 to 2.34.
Table 4. Best 30 performing words in terms of YA.
|graphene; braf; batteries; reads; microbiota; autophagy; nanosheets; micrornas; reprogramming; mirnas; sirt1; person-years; trastuzumab; kras; next-generation; emt; progression-free; nanomaterials; rnas; plasmonic; genome-wide; pluripotent; biofuels; aacr; nrf2; non-coding; transcriptome; self-renewal; epigenetic; functionalization|
Top performing abstract words are also not among the most frequent abstract words. Around 45% of abstract words appear more than 10,000 times. The ratio for high performing abstract words is just 20%.
“genome-wide,” “graphene,” and “batteries” are the only three words that are also listed as top performing title words. Top performing abstract words also contain scientific terminology that is unfamiliar to the layperson such as “sirt1” and “emt.”
7 Worst performing words
Table 5 presents the list of worst performing keywords that are contained in at least 3,000 publications. The list has animal names such as sheep and cattle, country names such as Brazil and Spain, and basic mathematical concepts such as equations and convergence. The list consists entirely of single-word keywords. It contains few technical terms that are unfamiliar to the layperson. The list is not dominated by words used in life science fields. Therefore, this list has exactly the opposite properties of the list of best performing keywords.
Table 5. Worst 30 performing words in terms of YK.
|spaces; sheep; dogs; Brazil; existence; cattle; law; injuries; cultivars; steel; equations; yield; education; Turkey; constituents; trauma; geometry; convergence; ceramics; waves; students; politics; alloys; fish; leaves; flows; Spain; gender; competition; patient|
Tables 6 and 7 list the 30 title words and abstract words, respectively, that have the worst performance. Like worst performing keywords, worst performing title and abstract words are also largely composed of country names, mathematical concepts, and animal names. The worst performing title and abstract words are nontechnical as well. Worst performing abstract words contain gender words such as “man,” “woman,” “his,” and “she.”
Table 6. Worst 30 performing words in terms of YT.
|note; theorem; graphs; Turkey; operators; genus; university; spaces; case; asymptotic; teaching; law; Brazil; Iran; bilateral; some; Mexico; existence; politics; integral; presenting; report; symmetric; unusual; nursing; equations; education; Brazilian; Korean; Korea|
Table 7. Worst 30 performing words in terms of YA.
|espana; boy; girl; let; angstrom(3); banach; algebras; his; tritium; she; opt; woman; algebra; court; colonial; abelian; replications; eyelid; (p>005); Turkish; projective; irreducible; polish; paulo; Russian; buffalo; essay; courts; man; rio|
We investigated the viability of a simple guide for research evaluation. The publications were grouped according to the words used in keywords, titles, and abstracts. Then, the average performance of publications that contained a certain word was computed.
It is very simple for researchers or administrators to use this guide. They can simply check the performance of words in the publication record and decide whether publications are coming from a high citation field. Normalizations can be easily performed by simply dividing the citation performance of publications by the metric that we proposed.
We tested the viability of the metric in two aspects. First, we checked for consistency. If a word has a different performance when used as a keyword than when used as a word in the title, then the measure is less convincing. We showed that some words have consistent performance but other words fail this test.
Second, we laid out the words with extreme values. These words would matter a lot when the metric is used as a guide, because publications that contain them will be evaluated as coming from a top or bottom citation field. Animal names, country names, and basic mathematical concepts are among the worst performers. Best performers largely consist of technical and unfamiliar terminology. Although most of the high performing words come from scientific terminology, many low performing words are generic and say hardly anything about the subject of the study. For example, it is hard to defend using the word “man” to stratify publications into subjects. As a result, human judgment is needed to decide which words are to be used in an evaluation process. This need decreases the appeal of the metric as an evaluation tool.
Citation performance of publications according to keywords, titles, and abstracts has been analyzed before. In sum, country names and obscure, new, or less frequent words are found to be disadvantageous in publications. The top and bottom performing words that we listed can be evaluated in this regard. The bottom performing list contains country names. This is consistent with the literature that finds that publications that contain country names perform worse. We found that less frequent words attract more citations. This finding is not in accordance with previous literature that found that less frequent, new, and obscure words are not good for citation performance. However, we used a very general dataset and found bottom and top words from all publications from Science and Social Science fields. This broad approach may be responsible for the inconsistency.
Many more sophisticated tools for research evaluation have been proposed in the literature. Hicks et al. (2015) proposed general principles that a research evaluation should satisfy in the work entitled “the Leiden Manifesto.” The first principle states that a good research evaluation by bibliometric methods should be in accordance with peer review. Our list of top and bottom performing words serves as a test for this principle. Administrators may judge the relevancy of the words in the lists for research evaluation. The sixth principle states that a good research evaluation should account for field differences. If words in keywords, titles, and abstracts are judged to successfully stratify publications into subjects, then the metric that we proposed can easily be used to normalize publications according to the subjects.
The fourth principle of the Leiden Manifesto states that a research evaluation method should be transparent. The fifth principle suggests that a research evaluation method should be confirmed by the evaluated researcher. Due to its simplicity, the method that we proposed in this article satisfies both of these principles.
Blatt, E. M. (2009). Differentiating, describing, and visualizing scientific space: A novel approach to the analysis of published scientific abstracts. Scientometrics, 80(2), 387–408. https://doi.org/10.1007/s11192-008-2070-3
Buela-Casal, G., Zych, I., Medina, A., Viedma del Jesus, M., Lozano, S., & Torres, G. (2009). Analysis of the influence of the two types of the journal articles; theoretical and empirical on the impact factor of a journal. Scientometrics, 80(1), 267–284. https://doi.org/10.1007/s11192-008-1715-6
Fumani, F. Q. R. M., Goltaji, M., & Parto, P. (2015). The impact of title length and punctuation marks on article citations. Annals of Library and Information Studies, 62(3), 126–132.
Leydesdorff, L. (1997). Why words and co-words cannot measure the development of the sciences. Journal of the American Society for Information Science, 48(5), 418–427. https://doi.org/10.1002/(SICI)1097-4571(199705)48:53.0.CO;2-Y
Milojevic, S., Sugimoto, C. R., Yan, E., & Ding, Y. (2011). The cognitive structure of library and information science: Analysis of article title words. Journal of the American Society for Information Science and Technology, 62(10), 1933–1953. https://doi.org/10.1002/asi.21602
Waltman, L., & van Eck, N. J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the American Society for Information Science and Technology, 63(12), 2378–2392. https://doi.org/10.1002/asi.22748
¹ However, this problem is not unique to the categorization in this paper. For example, publications from 2012 in the Web of Science “economics” subject category had received an average of 7.25 citations as of 2017. The standard deviation of citations is 13.98. Therefore, one cannot claim with statistical confidence that the number of citations of publications in the economics subject category is greater than zero.