Search Results

1 - 10 of 12 items :

  • Library and Information Science, other
Design of a generic, open platform for machine learning-assisted indexing and clustering of articles in PubMed, a biomedical bibliographic database

important efforts have been made to streamline text mining workflows by providing a library of natural language processing (NLP) tools (e.g., stemmers, parsers, and named entity recognizers) that can be connected together in a pipeline (Manning, Surdeanu, Bauer, Finkel, Bethard, & McClosky, 2014; Savova, Masanz, Ogren, Zheng, Sohn, Kipper-Schuler, & Chute, 2010; Batista-Navarro, Carter, & Ananiadou, 2016; Clarke, Srikumar, Sammons, & Roth, 2012). In addition, there are valuable machine learning packages that provide algorithms in a user-friendly manner

Open access
Citation performance of publications grouped by keywords, titles, and abstracts

that have been included in at least 3,000 publications. The keywords are sorted in descending order. “Embryonic stem-cells” has 2.52 YK and “innate human immunity” has 1.57 YK. Table 2 Best 30 performers in terms of YK. embryonic stem-cells; carbon nanotubes; field-effect transistors; graphite; genome-wide association; caenorhabditis-elegans; DNA methylation; living cells; regulatory t-cells; gold nanoparticles; tgf-beta; one-pot synthesis; quantum dots; functionalization; electrodes; acute myeloid-leukemia; long-term potentiation; activated

Open access
Smart and Connected Health: What Can We Learn from Funded Projects?

, information extraction, and summarization requires understanding of the meaning of the texts, and has been challenging. This study applies three types of text analysis/processing: (1) low-level natural language processing such as stop-word identification and filtering and stemming. The result helps to create a high quality word cloud that reveals the most frequent content words from the abstracts of the projects; (2) descriptive or bibliometric analysis. This is possible because the records of these NSF projects are well-organized datasets, as described in Table 1

Open access
Enhancing Clinical Decision Support Systems with Public Knowledge Bases

://dumps.wikimedia.org/enwiki/20160701/ was used for Wiki-DP. It was downloaded on March 5, 2016, and contains 5.79 million articles. Only the title and the content of each article were kept. Tags, references, external links, and "see also" sections were all removed. The Wikipedia collection was first subjected to stop word removal and stemming with the Porter stemmer, and was then indexed with Indri. Table 1 Summary of experiment topics and collections. Dataset Usage Collection size Indexing Topics 2014 CDS Track Training 733,138 articles Indri 30

Open access
Information Security Compliance in Organizations: An Institutional Perspective

evaluated in a specific situation. There are three types of external pressures that an organization has to consider: coercive pressures, normative pressures, and mimetic pressures (Davidsson et al., 2006; Cavusoglu et al., 2015). Coercive pressures force organizations to adopt certain institutionalized regulations and practices with respect to the security of organizational information in managing the organization (Hu et al., 2007). Such pressures stem from government laws and regulations that force organizations to act in compliance with certain rules and

Open access
Supporting Book Search: A Comprehensive Comparison of Tags vs. Controlled Vocabulary Metadata

determine optimal parameter settings on our training topics. These optimal settings were then used on the 334 test topics to produce the results presented in the remainder of this paper. We optimized three different parameters:
– Degree of smoothing. The λ parameter controls the influence of the collection language model, with higher values giving more influence to the collection language model. We varied λ in steps of 0.1, from 0.0 to 1.0.
– Stopword filtering. Either no filtering or using the SMART stop word list.
– Stemming. Either no
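
As a rough illustration of the smoothing parameter described in this snippet, the sketch below assumes the common linear-interpolation (Jelinek-Mercer) form, in which λ weights the collection language model and 1 − λ weights the document model. The snippet does not state which formula or toolkit the authors used, so the function name `smoothed_prob`, its arguments, and the toy counts are all hypothetical.

```python
from collections import Counter


def smoothed_prob(term, doc_counts, doc_len, coll_counts, coll_len, lam):
    """Interpolate document and collection language models.

    Assumed form (not confirmed by the source):
        P(term | doc) = (1 - lam) * P_doc(term) + lam * P_coll(term)
    A higher lam gives more influence to the collection model.
    """
    p_doc = doc_counts.get(term, 0) / doc_len if doc_len else 0.0
    p_coll = coll_counts.get(term, 0) / coll_len if coll_len else 0.0
    return (1.0 - lam) * p_doc + lam * p_coll


# Sweep lam from 0.0 to 1.0 in steps of 0.1, as the snippet describes.
LAMBDA_GRID = [i / 10 for i in range(11)]

# Toy data, purely illustrative.
doc_counts = Counter(["book", "search", "book"])
doc_len = sum(doc_counts.values())
coll_counts = {"book": 10, "search": 5, "tag": 5}
coll_len = sum(coll_counts.values())

for lam in LAMBDA_GRID:
    p = smoothed_prob("tag", doc_counts, doc_len, coll_counts, coll_len, lam)
    print(f"lambda={lam:.1f}  P(tag|doc)={p:.4f}")
```

With λ = 0.0 a term absent from the document gets probability zero, which is why some smoothing is normally needed; with λ = 1.0 every document is scored by the collection model alone.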

Open access
Burgeoning Data Repository Systems, Characteristics, and Development Strategies: Insights of Natural Resources and Environmental Scientists

, length, weight, and multiple measures of the same tree… whether the branch, or the wood, or just the stem of the tree. Those are some of the attributes that we can query by and use to tie together studies from different sources.” Often, standardized attributes help access and manipulate data and thus make processing, joining, and extracting subsets of data easier. 3.2 Burgeoning Data Repositories and Archival Systems Speaking of the emergence and abundance of different data repositories and archival systems, the scientists, on the one hand, applaud the

Open access
Applications of inferential statistical methods in library and information science

(Gravetter & Wallnau, 2013). Statistics is one of the most useful and powerful tools in data analysis for both academics and practitioners (Vaughan, 2001). The essence of statistics lies in the idea of inference (Johnson, 2009). As researchers collect sample data to answer specific quantitative research questions, statistical methods enable them to draw conclusions about a broader base of people, events, or objects compared with samples actually included in the study (Munro, 2005). Generalization and interpretation of the statistical results stemming from limited

Open access
Document- and Keyword-based Author Co-citation Analysis

https://doi.org/10.1016/j.joi.2008.05.004 Zhao, D., & Strotmann, A. (2011). Counting first, last, or all authors in citation analysis: A comprehensive comparison in the highly collaborative stem cell research field. Journal of the American Society for Information Science and Technology, 62(4), 654–676. https://doi.org/10.1002/asi.21495

Open access
An investigation on the evolution of diabetes data in social Q&A logs

contains prepositions, conjunctions, auxiliaries, articles, numerals, interjections, and other function words. The Porter stemming algorithm was used to remove common morphological and inflectional endings from words and to bring variant forms of a word together (Porter, 1980). The log data in Yahoo! Answers from April 1, 2013 to March 31, 2014 were collected, comprising 8,570 Q&A records, wherein every record contains a question and its corresponding best answer. The total number of words in the log data was 1,486,696, and the average number of words per record was
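
The preprocessing described in this snippet (dropping function words, then conflating variant word forms) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the stop-word set is a tiny hypothetical sample, and `crude_stem` is a deliberately simplified suffix stripper standing in for the Porter algorithm, which applies a much larger set of ordered rules.

```python
import re

# Hypothetical, tiny stop-word sample; real lists are far longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
              "is", "was", "were", "for"}

# Crude suffix list. This is NOT Porter's rule set, only a stand-in
# to show where stemming fits in the pipeline.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix if at least 3 letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("The patients were testing insulin pumps"))
```

In practice a maintained Porter implementation from an NLP library would replace `crude_stem`; the stand-in is used here only to keep the sketch self-contained.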

Open access