Search Results

1 - 10 of 26 items :

Providing Research Data Management (RDM) Services in Libraries: Preparedness, Roles, Challenges, and Training for RDM Practice

4 Distribution of survey participants by state

Table 3 Distribution of Survey Participants by US Regions (n = 186, plus Puerto Rico)

US Regions                 Counts (Percentage)
Northeast (DC included)    70 (37.63)
West                       43 (23.12)
Midwest                    32 (17.20)
Southeast                  26 (13.98)
Southwest                  15 (8.06)

4.1.2 Organization Types
The 241 respondents were from four types of institutions, including universities/colleges, government agencies, non-academic health organizations, and other types of organizations (see Table 4 for

Open access
An Influence Prediction Model for Microblog Entries on Public Health Emergencies

selection, has strong processing capacity with high-dimension attributes, and will not overfit. Its random strategy allows for greater variability between subclassifiers in the random forest, resulting in superior classification performance (Fu & Chen, 2014). Therefore, it can work well with microblog attributes with complex composition and high dimensions. The random forest uses the bootstrap method, a method of sampling with replacement. Some of the samples are not used in the training process, but for testing. Assuming that the sample size is $N$, the probability

Open access
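The bootstrap step in the random forest excerpt above is cut off before it states the probability it refers to. As a hedged illustration only (not necessarily how the paper continues), the standard result is that a given sample is missed by all $N$ draws with probability $(1 - 1/N)^N$, which approaches $1/e \approx 0.368$ for large $N$. The function names below are hypothetical.

```python
import random


def expected_oob(n: int) -> float:
    """Probability that one fixed sample is never drawn in n draws with replacement."""
    return (1 - 1 / n) ** n


def simulated_oob(n: int, seed: int = 0) -> float:
    """Fraction of samples left out of a single bootstrap resample of size n."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n) for _ in range(n)}  # distinct indices that were sampled
    return 1 - len(drawn) / n


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, round(expected_oob(n), 4), round(simulated_oob(n), 4))
```

The left-out ("out-of-bag") samples are the ones a random forest can use for testing without a separate hold-out set.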
gst-store: Querying Large Spatiotemporal RDF Graphs

, ?y in $Q^*$ has a range assertion. Thus, the subtrees rooted at $d_1^2$ and $d_4^3$ can be safely pruned, because the spatial features are unsatisfied. Pruning Rule 2 Consider two variables $v_i$ and $v_j$ bound by a spatial join assertion, and $NodeSet_i$ is the candidate set of $v_i$ and $NodeSet_j$ is the candidate set of $v_j$. Suppose the max distance is set to be $MaxDist$. Let $n_i \in NodeSet_i$; if the distance from MBR of $n_i$ to any node $n_j \in NodeSet_j$ is larger than $MaxDist$

Open access
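Pruning Rule 2 in the spatiotemporal RDF excerpt above is truncated before its conclusion. The sketch below assumes the natural reading: a candidate $n_i$ is dropped when its MBR is farther than $MaxDist$ from every candidate in $NodeSet_j$. The rectangle layout `(xmin, ymin, xmax, ymax)` and both function names are illustrative assumptions, not the paper's implementation.

```python
from math import hypot
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def mbr_min_dist(a: Rect, b: Rect) -> float:
    """Smallest Euclidean distance between two axis-aligned rectangles (0 if they overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)  # horizontal gap, if any
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)  # vertical gap, if any
    return hypot(dx, dy)


def prune_by_distance(node_set_i: List[Rect], node_set_j: List[Rect],
                      max_dist: float) -> List[Rect]:
    """Keep only candidates in node_set_i within max_dist of at least one candidate in node_set_j."""
    return [
        a for a in node_set_i
        if any(mbr_min_dist(a, b) <= max_dist for b in node_set_j)
    ]
```

Because the MBR distance is a lower bound on the distance between the enclosed objects, a candidate removed this way cannot satisfy the join under the assumed reading.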
Structural Analysis of Medical-Terminology Hashtag versus Lay-Language Hashtag Tweet Collections: An Information Theoretical Method with Entropy Matrix

. Boltzmann’s entropy equation deals with a thermodynamic situation called equilibrium, where the microstate of the system has equal probability. In the meantime, Gibbs (1878) defined his entropy as the sum of the entropies of all the individual microstates in the system: $$H=-K\sum_{i=1}^{n} p_i \log p_i$$ Unlike Boltzmann’s entropy as a function of the number of microstates, Gibbs entropy is a function of probabilities of microstates and his equation has direct connection to Shannon’s entropy

Open access
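The entropy formula quoted in the excerpt above, $H = -K \sum_i p_i \log p_i$, can be evaluated directly. This is a minimal sketch: the function name, the default $K = 1$, and the choice of logarithm base are assumptions for illustration; with $K = 1$ and base 2 the value is Shannon entropy in bits.

```python
import math
from typing import Iterable


def entropy(probs: Iterable[float], k: float = 1.0, base: float = math.e) -> float:
    """Compute H = -K * sum(p_i * log(p_i)), skipping zero-probability states."""
    return -k * sum(p * math.log(p, base) for p in probs if p > 0)


if __name__ == "__main__":
    print(entropy([0.5, 0.5], base=2))   # two equally likely states
    print(entropy([0.9, 0.1], base=2))   # a skewed distribution has lower entropy
```

When all $n$ states are equally likely, the sum reduces to $K \log n$, which is the equal-probability case the excerpt attributes to Boltzmann.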
Enhancing Clinical Decision Support Systems with Public Knowledge Bases

terms appearing in the document in an ordered/unordered sequence within a window size. Given a query $Q$, document $D$ can be ranked as: $$P(D|Q)\mathop{\Longleftrightarrow}\limits^{rank} \sum_{c \in T}\lambda_T f_T(D|c)+\sum_{c \in O}\lambda_O f_T(D|c)+\sum_{c \in U}\lambda_U f_T(D|c) \quad (1)$$ where $T$ is the unigram clique set in the query, which has no term dependency; $O$ is the cliques of ordered terms having sequential dependency

Open access
Improving Publication Pipeline with Automated Biological Entity Detection and Validation Service

many entities are reported by DIVE, ABNER_NLP, and ABNER_BIO. We calculated recall and precision as follows: Recall = (Number of entities in Ground Truth that has been reported by each program) / (Number of entities in gold standard); Precision = (Number of entities in Ground Truth that has been reported by each program) / (Number of enti

Open access
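The recall definition in the entity detection excerpt above is complete, but the precision denominator is cut off. The sketch below assumes the conventional denominator (the number of entities each program reported); that choice, the function name, and the sample entity names are assumptions, not taken from the paper.

```python
from typing import Set, Tuple


def recall_precision(reported: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Return (recall, precision) for one program's reported entities against a gold standard."""
    hits = len(reported & gold)                              # gold entities the program found
    recall = hits / len(gold) if gold else 0.0
    precision = hits / len(reported) if reported else 0.0    # assumed conventional denominator
    return recall, precision


if __name__ == "__main__":
    gold = {"BRCA1", "TP53", "EGFR", "KRAS"}       # hypothetical gold standard
    reported = {"BRCA1", "TP53", "MYC"}            # hypothetical program output
    print(recall_precision(reported, gold))
```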
To Phrase or Not to Phrase – Impact of User versus System Term Dependence upon Retrieval

, where a mixture of controlled vocabulary (descriptors), which contained phrases, and free-text searching was applied. Phrase (or proximity) operators have been particularly important in bibliographic IR systems, such as DIALOG or Web of Science. At the time, the users of bibliographic IR systems were mostly professional librarians, trained in using a wide range of operators including phrasing (spanning a range of term nearness options) from adjacent to a distance of n terms in specified search fields, like title or abstract, or in the basic index. Early analyses of

Open access
Data-driven Pattern Analysis of Acknowledgments in the Biomedical Domain

). Additionally, words with similar context will have similar vectors (Goldberg & Levy, 2014). To determine the similarity of the authors and countries in our dataset, we begin by parsing our data, extracting the acknowledgment section, and creating a word2vec model using Deeplearning4j (Deeplearning4j Development Team, n.d.). Word2vec is a two-layer neural net that takes a text corpus as input, and sends as output a set of feature vectors for the words in the input corpus. First, the input data are loaded and processed, converting all words to lowercase. Next, we tokenize

Open access
Citation performance of publications grouped by keywords, titles, and abstracts

performance of publications by their keywords.
a) “w” is the keyword. “y” is the publication year.
b) $C_y^w$ is the total citation of publications that contain keyword w in the year y.
c) $N_y^w$ is the total number of publications that contain keyword w in the year y.
d) $C_y$ is the total number of citations that are received by all keywords in the year y. If there are S separate keywords in a publication, then citation of that publication is counted S

Open access
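The quantities $C_y^w$ and $N_y^w$ defined in the keyword excerpt above can be accumulated in a single pass over the publications. This is a rough sketch: the record layout `(year, keywords, citations)` and the function name are hypothetical. The excerpt's last sentence is truncated, so the per-year total here simply sums over keywords, which counts a publication once for each keyword it carries.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Publication = Tuple[int, List[str], int]  # assumed layout: (year, keywords, citations)


def keyword_totals(pubs: Iterable[Publication]):
    """Return (C, N, C_year): citations and publication counts per (year, keyword), and per-year totals."""
    C: Dict[Tuple[int, str], int] = defaultdict(int)   # C[(y, w)] ~ C_y^w
    N: Dict[Tuple[int, str], int] = defaultdict(int)   # N[(y, w)] ~ N_y^w
    C_year: Dict[int, int] = defaultdict(int)          # C_year[y] ~ C_y
    for year, keywords, citations in pubs:
        for w in set(keywords):                        # ignore duplicate keywords within one record
            C[(year, w)] += citations
            N[(year, w)] += 1
            C_year[year] += citations
    return C, N, C_year
```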
A Clickstream Data Analysis of the Differences between Visiting Behaviors of Desktop and Mobile Users

mobile users.

Figure 5 Power-law distribution of mobile users among products.

4.3 Footprint Depth Analysis
Footprint depth is the viewing duration of one page. The longer the user spends on the page, the more attention he or she pays to the contents. Table 4 displays the average page viewing durations among devices.

Table 4 Average Page Viewing Durations among Devices (in seconds)

Page category    All      Desktop    Mobile
All              20.75    32.14      20.57
N                22.24    48.31      21.83
P                26.36    20.89      26.71
T                18.99    34

Open access