Summary

In this paper, some problems related to determining experimental plans satisfying the criterion of D-optimality are presented. Moreover, the optimality conditions and relations between the parameters of the chemical balance weighing designs are described, and some construction examples are given.
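As a rough illustration of the D-optimality criterion mentioned above, the sketch below compares two candidate chemical balance weighing designs by the determinant of their information matrix X'X. It assumes the standard setting of uncorrelated, equal-variance errors, which the summary does not state; the function names and both example matrices are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def information_matrix(X):
    """Return X'X for a design matrix X given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for c in range(n):
        # Find a row with a non-zero entry in column c to use as pivot.
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result  # a row swap flips the sign
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result


def d_criterion(X):
    """Assumed criterion: larger det(X'X) means a better design."""
    return det(information_matrix(X))


# Entries: 1 = object on the left pan, -1 = right pan (0 would mean "not weighed").
hadamard_design = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]
alternative_design = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, 1, 1, -1],
]

print(d_criterion(hadamard_design), d_criterion(alternative_design))
```

For n weighings of p objects with entries in {-1, 0, 1}, Hadamard's inequality bounds det(X'X) by n^p, so the first design (with mutually orthogonal columns) attains the bound for n = p = 4, while the second does not.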

Summary

There is a growing need to analyze data sets characterized by several sets of variables observed on the same set of individuals. Such complex data structures are known as multiblock (or multiple-set) data sets. Multiblock data sets are encountered in diverse fields including bioinformatics, chemometrics, and food analysis. Generalized Canonical Correlation Analysis (GCCA) is a very powerful method for studying the relationships between blocks. It can also be viewed as a method for the integration of information from K > 2 distinct sources (Takane and Oshima-Takane 2002). In this paper, GCCA is considered in the context of multivariate functional data. Such data are treated as realizations of multivariate random processes. GCCA is a technique that allows the joint analysis of several sets of data through dimensionality reduction. The central problem of GCCA is to construct a series of components that maximize the association among the multiple variable sets. This method will be presented for multivariate functional data. Finally, a practical example will be discussed.

Summary

High Nature Value farmlands in Europe are of great importance in the conservation of biodiversity. Their environmental importance has been recognized for some time, and has been studied mostly in Western Europe. This article describes the results of multivariate statistical analyses performed on data (13 variables) collected from the latest National Agricultural Census and the CORINE database to provide a typology of farmlands with respect to their nature value at municipality level (LAU 2, Local Administrative Units level 2) across Poland. All municipalities were grouped into eight categories (types). Some of the farmland categories were considered to be High Nature Value farmland (HNVf). The following interrelated variables contributed most to the identification of HNVf: share of protected areas and forest, grassland, arable land and fallow, farmland cover diversity, and rate of nitrogen fertilization. HNVf was identified in 958 out of 2173 municipalities, covering 44% of the territory of Poland. The identified HNVf also overlaps partially (61%) with LFAs (Less Favored Areas). Farmlands with the highest nature value are located mostly across mountain and hilly areas, close to forests, and protected areas on lowlands and river valleys. The identified HNV farmlands are characterized by low-input farming systems and a large share of semi-natural habitats with a high landscape mosaic.

Summary

For square contingency tables with nominal categories, a local symmetry model which indicates the symmetric structure of probabilities for only one pair of symmetric cells is proposed. For ordinal square tables, the present paper proposes (1) another local symmetry model for cumulative probabilities from the upper-right and lower-left corners of the table, and (2) a measure to represent the degree of departure from the proposed model. The measure has the form of a weighted harmonic mean of the diversity index, which includes the Shannon entropy as a special case. Examples are given in which the proposed method is applied to square table data on decayed teeth in Japanese women patients.

Summary

Preliminary studies which may be of significance for research against coronaviruses, including SARS-CoV-2, which has caused an epidemic in China, are presented. An analysis was made of publicly available data that contain information about important metabolites neutralizing coronaviruses. Preliminary studies show that especially Ficus, barley, thistle and sundew should be additionally tested with the aim of producing medicines for coronavirus.

Summary

There are many methods used to determine the reaction to fire of wood and wood-based materials, ranging from full-scale to small-scale testing. One of the small-scale techniques is the use of the Mini Fire Tube (MFT). The aim of this work is to optimize the assessment of fireproofing preparations for wood and wood-based materials using the MFT. In addition, an evaluation was made of the effectiveness of the protection in relation to the maximum temperature of the control sample. An attempt at optimization was made using the example of pine wood and selected flame retardants. Based on the results of control tests, the critical point (the time at which the average sample temperature reached its maximum value) was calculated (Grześkowiak and Moliński 2019). The effectiveness of the protection at the critical point was determined. Efficiencies were calculated for all time parameters to determine the optimal test time. A formula was proposed for the duration of tests for protected samples, taking as a reference the maximum temperature of the control series. The tests showed that it is possible to shorten the time for testing of the effectiveness of flame retardants based on the time taken to obtain maximum temperature for control samples.

Summary

We show that, in practice, the standard unit root tests, cointegration tests, and similar tests are unreliable. This conclusion is more generally applicable to other related regression-based tests. In particular, these tests attempt to solve a problem by creating another problem.

Summary

The literature contains a wide collection of correlation and association coefficients used for different structures of data. Generally, some of the correlation coefficients are conventionally used for continuous data and others for categorical or ordinal observations. The aim of this paper is to verify the performance of various approaches to correlation coefficient estimation for several types of observations. Both simulated and real data were analysed. For continuous variables, Pearson’s r² and MIC were determined, whereas for categorized data three approaches were compared: Cramér’s V, Joe’s estimator, and the regression-based estimator. Two methods of discretization for continuous data were used. The following conclusions were drawn: the regression-based approach yielded the best results for data with the highest assumed r² coefficient, whereas Joe’s estimator was the better approximation of true correlation when the assumed r² was small; and the MIC estimator detected the maximal level of dependency for data having a quadratic relation. Moreover, the discretization method applied to data with a non-linear dependency can cause loss of dependency information. The calculations were supported by the R packages arules and minerva.
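To make two of the coefficients named above concrete, the following sketch computes Pearson's r² on continuous data and Cramér's V on a discretized version of the same data. It is only an illustration in Python, not the authors' R code: the equal-width binning, the number of bins, and the synthetic quadratic data are all assumptions, and Joe's estimator, the regression-based estimator, and MIC are not implemented here.

```python
import math


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def equal_width_bins(values, k):
    """Assign each value to one of k equal-width bins (assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def cramers_v(a, b):
    """Cramér's V from the chi-square statistic of the a-by-b contingency table."""
    n = len(a)
    rows, cols = sorted(set(a)), sorted(set(b))
    counts = {}
    for u, v in zip(a, b):
        counts[(u, v)] = counts.get((u, v), 0) + 1
    row_tot = {r: sum(counts.get((r, c), 0) for c in cols) for r in rows}
    col_tot = {c: sum(counts.get((r, c), 0) for r in rows) for c in cols}
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            chi2 += (counts.get((r, c), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(rows), len(cols)) - 1)))


# Synthetic, noise-free quadratic relation that is symmetric around x = 0.5.
x = [i / 100 for i in range(101)]
y = [(v - 0.5) ** 2 for v in x]

r2 = pearson_r2(x, y)
v = cramers_v(equal_width_bins(x, 4), equal_width_bins(y, 4))
print(f"Pearson r^2 = {r2:.3g}, Cramér's V on 4x4 bins = {v:.3f}")
```

On this symmetric quadratic example the linear coefficient is essentially zero even though y is a deterministic function of x, while the binned Cramér's V remains clearly positive. How much dependency survives discretization depends on the binning scheme chosen.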