Linked data is becoming a mature technology as a lightweight realization of the Semantic Web, as well as a way of facilitating knowledge reorganization and discovery. As a use case and starting point, the Shanghai Library implemented a genealogy knowledge service platform based on linked data technology to provide knowledge discovery and open data services. This article explains the design and development of the Genealogy Knowledge Service Platform, describes the method and process of the implementation, and introduces four examples of how the platform helps users to discover questions, raise questions, and solve questions for their research, to explain how Linked Data can be used in Digital Humanities.
Many investigators have carried out text mining of the biomedical literature for a variety of purposes, ranging from the assignment of indexing terms to the disambiguation of author names. A common approach is to define positive and negative training examples, extract features from article metadata, and use machine learning algorithms. At present, each research group tackles each problem from scratch, in isolation from other projects, which causes redundancy and a great waste of effort. Here, we propose and describe the design of a generic platform for biomedical text mining, which can serve as a shared resource for machine learning projects and as a public repository for their outputs. We initially focus on a specific goal, namely, classifying articles according to publication type, and emphasize how feature sets can be made more powerful and robust through the use of multiple, heterogeneous similarity measures as input to machine learning models. We then discuss how the generic platform can be extended to include a wide variety of other machine learning-based goals and projects and can be used as a public platform for disseminating the results of natural language processing (NLP) tools to end-users as well.
Christina Lioma, Birger Larsen and Peter Ingwersen
When submitting queries to information retrieval (IR) systems, users often have the option of specifying which, if any, of the query terms are heavily dependent on each other and should be treated as a fixed phrase, for instance by placing them between quotes. In addition to such cases where users specify term dependence, automatic ways also exist for IR systems to detect dependent terms in queries. Most IR systems use both user and algorithmic approaches. It is not clear, however, whether and to what extent user-defined term dependence agrees with algorithmic estimates of term dependence, nor which of the two may fetch higher performance gains. Simply put, is it better to trust users or the system to detect term dependence in queries? To answer this question, we experiment with 101 crowdsourced search engine users and 334 queries (52 train and 282 test TREC queries) and we record 10 assessments per query. We find that (i) user assessments of term dependence differ significantly from algorithmic assessments of term dependence (their overlap is approximately 30%); (ii) there is little agreement among users about term dependence in queries, and this disagreement increases as queries become longer; (iii) the potential retrieval gain that can be fetched by treating term dependence (both user- and system-defined) over a bag of words baseline is limited to a small subset (approximately 8%) of the queries, and is much higher for low-depth than deep precision measures. Points (ii) and (iii) constitute novel insights into term dependence.
Yiming Zhao, Baitong Chen, Jin Zhang, Ying Ding, Jin Mao and Lihong Zhou
This study investigates the evolution of diabetics’ concerns based on the analysis of terms in the Diabetes category logs on the Yahoo! Answers website. Two sets of question-and-answer (Q&A) log data were collected: one from December 2, 2005 to December 1, 2006; the other from April 1, 2013 to March 31, 2014. Network analysis and a t-test were performed to analyze the differences in diabetics’ concerns between these two data sets. Community detection and topic evolution were used to reveal detailed changes in diabetics’ concerns in the examined period. Increases in average node degree and graph density imply that the vocabulary size that diabetics use to post questions has decreased while the scope of questions has become more focused. The networks of key terms in the Q&A log data of 2005–2006 and 2013–2014 are significantly different according to the t-test analysis of the degree centrality and betweenness centrality. Specifically, there is a shift in diabetics’ focus in that they have become more concerned about daily life and other nonmedical issues, including diet, food, and nutrients. The recent changes and the evolution paths of diabetics’ concerns were visualized using an alluvial diagram. The food- and diet-related terms have become prominent, as deduced from the visualization results.
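The two network measures named above can be illustrated with a minimal sketch. It assumes an undirected term co-occurrence network given as a list of term pairs; the function name `network_stats` and the example terms are hypothetical and are not taken from the study's data. Average degree is computed as 2E/N and density as 2E/(N(N-1)), the standard definitions for a simple undirected graph.

```python
from collections import defaultdict


def network_stats(edges):
    """Return (average_degree, density) for an undirected simple graph.

    Only terms that appear in at least one edge are counted as nodes.
    Duplicate pairs and self-loops are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a term co-occurring with itself carries no information
        neighbors[a].add(b)
        neighbors[b].add(a)

    n = len(neighbors)
    # Each edge is stored twice (once per endpoint), so halve the total.
    m = sum(len(s) for s in neighbors.values()) // 2
    if n < 2:
        return 0.0, 0.0

    avg_degree = 2 * m / n
    density = 2 * m / (n * (n - 1))
    return avg_degree, density


# Hypothetical term networks for two periods (illustrative only).
edges_early = [("insulin", "dose"), ("insulin", "sugar"),
               ("sugar", "test"), ("test", "meter")]
edges_late = [("diet", "sugar"), ("diet", "food"),
              ("food", "sugar"), ("sugar", "insulin")]

print(network_stats(edges_early))
print(network_stats(edges_late))
```

In this toy example the later network has fewer distinct terms but the same number of links, so both its average degree and its density are higher, which is the pattern the abstract interprets as a smaller vocabulary with a more focused scope.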
Second-order h-type indicators are suggested as a way to identify top units in scientometrics. Basically, re-ranking an h-type series yields the second-order h-type indicator. These indicators provide an interesting and natural method to identify top units, yielding a fixed h-top. In contrast to the series of artificially defined highly cited percentile classes, the h-top contributes a natural, definite top to the series of highly cited classes. Theoretically, the second-order h-index concerns 3% of the h-top, whereas the first-order h-index refers to 10% of the h-core. The ratio of the second-order to the first-order h-index, hT/h, is 30%. Empirically, the ratio hT/h is <30%. The approach to calculating second-order h-type indicators is exemplified using journals in two fields.
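As background for the indicators discussed above, the following is a minimal sketch of the standard first-order h-index only: the largest h such that h items each have at least h citations. The abstract does not spell out the re-ranking procedure that produces the second-order indicator, so that step is deliberately not implemented here; the function name `h_index` and the sample citation counts are illustrative assumptions.

```python
def h_index(citations):
    """Return the first-order h-index of a list of citation counts.

    The h-index is the largest rank h such that the item at rank h
    (counts sorted in descending order) has at least h citations.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so no later rank qualifies
    return h


# Hypothetical citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))
```

The items ranked 1 to h form the h-core that the abstract refers to.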
For any entity (company), inventory represents an important category of current assets and, implicitly, of total assets. Starting from the importance of this category of assets for the normal conduct of production or sales activity, this paper has the following priority objectives: delimiting the theoretical aspects of the inventory valuation of goods sold; determining the impact that inventory valuation methods may have on the financial position and financial performance of the company; and an applied analysis of inventory valuation options. The results obtained from both theoretical and practical research verify the main assumption that the inventory valuation options have a different impact on the financial position and the financial performance of an entity.
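A small worked example can show why the choice of valuation method matters. The abstract does not name the methods it compares, so this sketch assumes two common ones, FIFO and periodic weighted average cost; the function names and the purchase figures are hypothetical.

```python
def fifo_cogs(purchases, units_sold):
    """Cost of goods sold under FIFO: the oldest purchase lots are consumed first.

    purchases is a list of (quantity, unit_cost) tuples in purchase order.
    """
    remaining = units_sold
    cost = 0.0
    for quantity, unit_cost in purchases:
        take = min(quantity, remaining)
        cost += take * unit_cost
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("units sold exceed units purchased")
    return cost


def weighted_average_cogs(purchases, units_sold):
    """Cost of goods sold under periodic weighted average cost."""
    total_units = sum(q for q, _ in purchases)
    if units_sold > total_units:
        raise ValueError("units sold exceed units purchased")
    total_cost = sum(q * c for q, c in purchases)
    return units_sold * total_cost / total_units


# Hypothetical data: two lots bought at rising prices, 150 units sold.
purchases = [(100, 10.0), (100, 12.0)]
total_cost = sum(q * c for q, c in purchases)

for name, method in [("FIFO", fifo_cogs), ("Weighted average", weighted_average_cogs)]:
    cogs = method(purchases, 150)
    print(name, "COGS:", cogs, "ending inventory:", total_cost - cogs)
```

With rising purchase prices, FIFO reports a lower cost of goods sold (hence higher profit) and a higher ending inventory than the weighted average method, which is the kind of difference in financial performance and financial position the paper sets out to examine.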
Despite the differences between Japanese and American management styles, both will have a huge impact on their national economies. In cultural terms, management styles will continue to present significant differences. Although nothing is certain, both Americans and Japanese must continue to adapt their management styles to maintain global competitiveness. In general, human resources and labor relations within organizations are the main features that differentiate the Japanese management system from those of other countries, especially the US.
Health care is a sensitive issue that concerns not only the individual but also society in general. Health economics is a specialization of economics applied to the health sector that aims at the proper functioning of hospital administration. It deals with issues related to the financing and delivery of health services and the role of such services and other personal decisions in contributing to personal health. Many studies address the problems that each health unit faces, emphasizing resources, programs and health expenditure. Some of these programs, especially the most effective, are mentioned in this research. Their creation was based on the best quality of health services in all OECD countries.
With this research, we aim to develop a methodological framework for evaluating total health expenditure (which consists of all expenditures or outlays for medical care, prevention, promotion, rehabilitation, community health activities, health administration and regulation, and capital formation with the predominant objective of improving health) in 23 OECD countries, by estimating a panel data regression for 2000 to 2014 and analyzing the results. To this end, some of the most important variables (macroeconomic and related to the health sector) were used as tools to assess the performance of each country with respect to health care resources and expenditure. Each explanatory variable used in this sample, as well as combinations of these variables, showed a positive correlation with total expenditures as a percentage of GDP in the majority of the equations. Some variables showed a negative correlation with total health expenditures, which is inconsistent with economic theory. We attribute this to the financial crisis.
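The idea of a panel data regression can be sketched with a single regressor. The abstract does not state which panel estimator was used, so this example assumes a fixed-effects (within) estimator: each country's observations are demeaned before the slope is computed, which removes any constant country-specific level. The function name `within_slope` and the data are hypothetical, not the study's variables or results.

```python
from collections import defaultdict


def within_slope(rows):
    """Fixed-effects (within) slope for one regressor.

    rows is an iterable of (entity, x, y) observations. Each entity's x and y
    are centered on that entity's own means, and the slope is the ratio of the
    pooled cross-products to the pooled squared deviations of x.
    """
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for observations in groups.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2

    if sxx == 0:
        raise ValueError("x does not vary within any entity")
    return sxy / sxx


# Hypothetical panel: two countries with different baseline levels
# (5 and 1) but the same slope of 2 between x and y.
panel = [
    ("A", 1, 7), ("A", 2, 9), ("A", 3, 11),
    ("B", 1, 3), ("B", 2, 5), ("B", 3, 7),
]
print(within_slope(panel))
```

Because the country-specific baselines are removed by demeaning, the estimator recovers the common slope even though the two countries sit at different levels.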
The article presents an analysis of the population's awareness of the kinds of contagious diseases to which it is exposed, as well as of the prevention methods known and applied in everyday life. It presents the results of a survey in Dambovita County, Romania, and tries to explain them by reference to information campaigns on contagious diseases. The empirical study reveals the main contagious diseases that people know and those that are less known, the favourite sources of information, and the main prevention measures known and applied by individuals. Finally, some considerations are made regarding the future organization of information campaigns in this area.
Dan Marius Coman, Mihaela Denisa Coman and Ciprian Costel Munteanu
This research paper is a study of the application of selection techniques in financial audit, with particular attention to the sampling selection technique, as well as an attempt to find out how to improve them, especially in the current informational context.
Computer-assisted audit techniques (CAATs) can analyze large data volumes to detect errors by covering all activities of the economic entity in the audit period.