Open Access

Node2vec Representation for Clustering Journals and as A Possible Measure of Diversity


Cite

Figure 1

Map of scientific journals. Colors of dots mean the corresponding ESI categories of journals. Dots in red with black border are journals indexed as multidisciplinary, of which we list only a few on the map.
Map of scientific journals. Colors of dots mean the corresponding ESI categories of journals. Dots in red with black border are journals indexed as multidisciplinary, of which we list only a few on the map.

Figure 2

(a) Our vector-based clustering of journals compared with several existing journal classification systems JCR, VOS, ESI and LCAS. (b) We also compare resulted clusters using various dimensions of the node2vec vectors and we find there is not much differences among d = 32, d = 64 and d = 128: the similarity of 64–32 and 128–32 are much higher than that of 8–32 while 16–32 is somewhere in between. Also when Vec clusters with d = 8, 16, 32, 64, 128 are compared against VOS, we find as long as d > 16, increasing d does not make a big difference. Considering both performance and computational cost, we only report the results of d = 32.
(a) Our vector-based clustering of journals compared with several existing journal classification systems JCR, VOS, ESI and LCAS. (b) We also compare resulted clusters using various dimensions of the node2vec vectors and we find there is not much differences among d = 32, d = 64 and d = 128: the similarity of 64–32 and 128–32 are much higher than that of 8–32 while 16–32 is somewhere in between. Also when Vec clusters with d = 8, 16, 32, 64, 128 are compared against VOS, we find as long as d > 16, increasing d does not make a big difference. Considering both performance and computational cost, we only report the results of d = 32.

Figure 3

Scatter plot of vector norms versus node centrality. Node centrality is measured as the node occurrence frequency in the random walk series generated for Node2Vec. Orange dots represent journals indexed in Multidisciplinary Science.
Scatter plot of vector norms versus node centrality. Node centrality is measured as the node occurrence frequency in the random walk series generated for Node2Vec. Orange dots represent journals indexed in Multidisciplinary Science.

Figure 4

Diversity of journals calculated using similarity measured by (a) vector vn learned from node2vec and (b) vector vc. Journals indexed as Multidisciplinary are colored according to their JIFs with blue implying low JIF and red implying high JIF as shown in the right legend. The citing diversity of journal i is measured based on its referenced journals, and cited diversity is measured based on the journals citing it.
Diversity of journals calculated using similarity measured by (a) vector vn learned from node2vec and (b) vector vc. Journals indexed as Multidisciplinary are colored according to their JIFs with blue implying low JIF and red implying high JIF as shown in the right legend. The citing diversity of journal i is measured based on its referenced journals, and cited diversity is measured based on the journals citing it.

Figure 5

A graphic summary of our work: concepts and connections in red are the ones that have been implemented in the current work while the ones in green can be topics of future investigation. The rest of concepts and connections have been proposed and implemented in earlier studies, see for example, (Mikolov et al., 2013) and (Grover & Leskovec, 2016).
A graphic summary of our work: concepts and connections in red are the ones that have been implemented in the current work while the ones in green can be topics of future investigation. The rest of concepts and connections have been proposed and implemented in earlier studies, see for example, (Mikolov et al., 2013) and (Grover & Leskovec, 2016).

The “King - Man + Woman = Queen” test on the node2vec trained vectors of journals. Top-5 “Queen”-like journals are presented.

ExampleTest 1Test 2Test 3
KingPLoS Comput. Biol
ManNat. Cell Biol
WomanPhys. Rev. LettGenome BiolJ. Neurosci
QueenJ. Stat. Mech. Theory ExpBioinformaticsNeuroImage
Phys. Rev. EBMC BioinformaticsBiol. Cybern
Fluctuation Noise LettJ. Comput. BiolFront. Comput. Neurosci
EPLBioData MinCereb. Cortex
Eur. Phys. J. BJ. Bioinform. Comput. BiolJ. Comput. Neurosci
eISSN:
2543-683X
Language:
English
Publication timeframe:
4 times per year
Journal Subjects:
Computer Sciences, Information Technology, Project Management, Databases and Data Mining