Browse

Summary

Preliminary studies that may be of significance for research against coronaviruses, including SARS-CoV-2, which has caused an epidemic in China, are presented. Publicly available data containing information about important metabolites that neutralize coronaviruses were analysed. These preliminary studies indicate that Ficus, barley, thistle and sundew in particular should be tested further with the aim of developing medicines against coronaviruses.

Abstract

Despite increasing efforts during data collection, nonresponse remains sizeable in many household surveys. Statistical adjustment is hence unavoidable. By reweighting, the design weights of the respondents are adjusted to compensate for nonresponse. However, there is no consensus on how this should be carried out in general. Theoretical comparisons in the literature are inconclusive, and the associated simulation studies involve hypothetical situations that are not all equally relevant to reality. In this article we evaluate the three most common reweighting approaches in practice, based on real data in Norway from the two largest household surveys in the European Statistical System. We demonstrate how cross-examination of various reweighting estimators can help assess the effectiveness of the available auxiliary variables and inform the choice of the weight adjustment method.
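
The abstract does not name the three reweighting approaches it compares. As a hedged illustration of the general idea only, the sketch below implements one common reweighting method, a weighting-class adjustment based on response rates within groups defined by a single auxiliary variable; the data and variable names are invented, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sample: design weights, an auxiliary register variable
# (e.g., age group), and a response indicator.
n = 1000
design_w = rng.uniform(50, 150, size=n)            # design weights
age_group = rng.integers(0, 4, size=n)             # auxiliary variable, 4 classes
respond = rng.random(n) < (0.5 + 0.1 * age_group)  # response rate varies by class

# Weighting-class adjustment: within each auxiliary class, divide the design
# weights of respondents by the weighted response rate of that class.
adjusted_w = design_w.copy()
for g in np.unique(age_group):
    in_g = age_group == g
    resp_rate = design_w[in_g & respond].sum() / design_w[in_g].sum()
    adjusted_w[in_g & respond] /= resp_rate

# The adjusted respondent weights reproduce the full-sample weight total.
print(adjusted_w[respond].sum(), design_w.sum())
```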

Abstract

After the interview, interviewers often assess the respondent’s ability and reluctance to participate. Prior research has shown that this evaluation is associated with next-wave response behavior in face-to-face surveys. Our study adds to this research by looking at this association in telephone surveys, where an interviewer typically has less information on which to base an assessment. We looked at next-wave participation, non-contact and refusal, as well as longer-term participation patterns. We found that interviewers were better able to anticipate refusal than non-contact relative to participation, especially in the next wave, but also in the longer term. Our findings confirm that interviewer evaluations, in particular of the respondent’s reluctance to participate, can help predict response at later waves, even after controlling for commonly used predictors of survey nonresponse. In addition to helping predict nonresponse in the short term, interviewer evaluations also provide useful information from a long-term perspective, which may be used to improve nonresponse adjustment and responsive designs in longitudinal surveys.
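
As a rough sketch of how such evaluations can feed into nonresponse prediction, the code below fits a logistic regression of simulated next-wave refusal on an interviewer reluctance rating alongside two commonly used predictors; the data, variable names and effect sizes are invented for illustration and do not come from the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Simulated wave-1 respondents: an interviewer rating of reluctance (1-5)
# plus two commonly used nonresponse predictors (age and an urbanicity flag).
n = 2000
reluctance = rng.integers(1, 6, size=n)
age = rng.normal(50, 15, size=n)
urban = rng.integers(0, 2, size=n)

# Next-wave refusal is made more likely for reluctant respondents.
logit = -2.0 + 0.6 * (reluctance - 3) + 0.01 * (age - 50) + 0.3 * urban
refuse_next_wave = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([reluctance, age, urban])
with_eval = LogisticRegression(max_iter=1000).fit(X, refuse_next_wave)
without_eval = LogisticRegression(max_iter=1000).fit(X[:, 1:], refuse_next_wave)

# Does the evaluation add predictive value beyond the usual predictors?
auc_with = roc_auc_score(refuse_next_wave, with_eval.predict_proba(X)[:, 1])
auc_without = roc_auc_score(refuse_next_wave, without_eval.predict_proba(X[:, 1:])[:, 1])
print("AUC with evaluation:   ", round(auc_with, 3))
print("AUC without evaluation:", round(auc_without, 3))
```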

Abstract

There is considerable demand for official statistics on temporary populations to supplement statistics on resident and working populations. Progress has been slow, and temporary population statistics are not part of the standard suite of measures produced by national statistical offices. This article adopts the framework for official statistics proposed by Raymer and colleagues as a guide to aspects relating to society, concepts, data, processing, outputs and validation. The article proposes a conceptual framework linking temporary population mobility, defined as a move of more than one night in duration that does not entail a change in usual residence, and temporary populations. Using Australia as an example, we discuss various dimensions of temporary mobility that complicate its measurement. We then report the outcomes of a survey of user needs for temporary population statistics and a desktop review of OECD countries, to identify the best formulation of temporary population statistics and current international practice, respectively. The article concludes by proposing two related concepts for temporary populations, population present and person-time, which overcome a number of issues currently impeding progress in this area, and discusses their potential implementation.
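
A minimal numerical illustration of the two proposed concepts, assuming invented stay records (arrival and departure days) for visitors to a single area over a 30-day reference period and ignoring usual residents:

```python
import numpy as np

# Illustrative stay records for one area over a 30-day reference period:
# each row is (arrival_day, departure_day) for a temporary visitor.
stays = np.array([
    [0, 3],    # 3 nights
    [5, 12],   # 7 nights
    [10, 11],  # 1 night
    [20, 30],  # 10 nights
])

period_days = 30

# Person-time: total visitor-nights spent in the area over the period.
person_nights = (stays[:, 1] - stays[:, 0]).sum()

# Population present: count visitors present on each night, then average.
nights = np.arange(period_days)
present = ((stays[:, 0][:, None] <= nights) & (nights < stays[:, 1][:, None])).sum(axis=0)
avg_population_present = present.mean()

print(person_nights)            # 21 visitor-nights
print(avg_population_present)   # 21 / 30 = 0.7 visitors present on an average night
```

Note the identity the example exhibits: total person-time over the period equals the average population present multiplied by the length of the period.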

Abstract

Estimates for small areas defined by social, demographic, and geographic variables are increasingly important for official statistics. To overcome problems of small sample sizes, statisticians usually derive model-based estimates. When aggregated, however, the model-based estimates typically do not agree with aggregate estimates (benchmarks) obtained through more direct methods. This lack of agreement between estimates can be problematic for users of small area estimates. Benchmarking methods have been widely used to enforce agreement. Fully Bayesian benchmarking methods, in the sense of yielding full posterior distributions after benchmarking, can provide coherent measures of uncertainty for all quantities of interest, but research on fully Bayesian benchmarking methods is limited. We present a flexible fully Bayesian approach to benchmarking that allows for a wide range of models and benchmarks. We revise the likelihood by multiplying it by a probability distribution that measures agreement with the benchmarks. We outline Markov chain Monte Carlo methods to generate samples from benchmarked posterior distributions. We present two simulations, and an application to English and Welsh life expectancies.
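
A minimal sketch of the benchmarking idea under strong simplifying assumptions: normal direct estimates for six areas, a single aggregate benchmark, and a random-walk Metropolis sampler. The likelihood is multiplied by a normal agreement density as the abstract describes, but the data, prior and benchmark tolerance are invented and the article's models are far more general.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative setup (not the article's data): direct estimates y for 6 small
# areas with known sampling standard errors s, a N(0, 10^2) prior on each area
# mean, and an aggregate benchmark B that the weighted area means should match.
y = np.array([10.2, 11.5, 9.8, 12.1, 10.9, 11.0])
s = np.array([0.8, 1.0, 0.9, 1.2, 0.7, 1.1])
w = np.full(6, 1 / 6)          # aggregation weights
B, sigma_b = 11.0, 0.1         # benchmark value and its tolerance

def log_post(theta):
    log_lik = -0.5 * np.sum(((y - theta) / s) ** 2)
    log_prior = -0.5 * np.sum((theta / 10.0) ** 2)
    # Benchmarking term: the likelihood is multiplied by a density that
    # measures agreement between the aggregated means and the benchmark.
    log_bench = -0.5 * ((w @ theta - B) / sigma_b) ** 2
    return log_lik + log_prior + log_bench

# Random-walk Metropolis sampler over the area means.
theta, cur = y.copy(), log_post(y)
draws = []
for it in range(20000):
    prop = theta + rng.normal(scale=0.3, size=6)
    cand = log_post(prop)
    if np.log(rng.random()) < cand - cur:
        theta, cur = prop, cand
    if it >= 5000:                      # discard burn-in
        draws.append(theta.copy())

draws = np.array(draws)
print("posterior means:", draws.mean(axis=0).round(2))
print("benchmarked aggregate:", (draws.mean(axis=0) @ w).round(2), "vs benchmark", B)
```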

Abstract

With the possibility of dependence between the sources in a capture-recapture type experiment, identification of the direction of such dependence in a dual system of data collection is vital. This has a wide range of applications, including in the domains of public health, official statistics and social sciences. Owing to the insufficiency of data for analyzing a behavioral dependence model in a dual system, our contribution lies in the construction of several strategies that can identify the direction of the underlying dependence between the two lists in the dual system, that is, whether the two lists are positively or negatively dependent. Our proposed classification strategies would be quite appealing for improving inference, as is evident from recent literature. Simulation studies are carried out to explore the comparative performance of the proposed strategies. Finally, applications to three real data sets from various fields are illustrated.
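
To see why the direction of dependence matters, the sketch below simulates a dual system under independence, positive dependence and negative dependence between the two lists and applies the standard Lincoln-Petersen (dual-system) estimator; this is textbook capture-recapture behaviour shown for context, not the article's proposed classification strategies.

```python
import numpy as np

rng = np.random.default_rng(3)

def lincoln_petersen(N, p1, p2_if_caught1, p2_if_missed1):
    """Dual-system (Lincoln-Petersen) estimate of N under list dependence."""
    on1 = rng.random(N) < p1
    p2 = np.where(on1, p2_if_caught1, p2_if_missed1)
    on2 = rng.random(N) < p2
    n1, n2, m = on1.sum(), on2.sum(), (on1 & on2).sum()
    return n1 * n2 / m

N = 100_000
# Independent lists: the estimator is roughly unbiased.
print("independent :", round(lincoln_petersen(N, 0.6, 0.5, 0.5)))
# Positive dependence (list-1 members more likely on list 2): underestimates N.
print("positive dep:", round(lincoln_petersen(N, 0.6, 0.65, 0.35)))
# Negative dependence (list-1 members less likely on list 2): overestimates N.
print("negative dep:", round(lincoln_petersen(N, 0.6, 0.35, 0.65)))
```

Under positive dependence the estimator understates the population size, and under negative dependence it overstates it, which is why identifying the direction of dependence is important.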

Abstract

Analysis of trends in health data collected over time can be affected by instantaneous changes in coding that cause sudden increases or decreases, or “jumps,” in the data. Despite these sudden changes, the underlying continuous trends can present valuable information related to the changing risk profile of the population, the introduction of screening, new diagnostic technologies, or other causes. The joinpoint model is a well-established methodology for modeling trends over time using connected linear segments, usually on a logarithmic scale. Joinpoint models that ignore data jumps due to coding changes may produce biased estimates of trends. In this article, we introduce methods to incorporate a sudden discontinuous jump into an otherwise continuous joinpoint model. The size of the jump is either estimated directly (the Joinpoint-Jump model) or estimated using supplementary data (the Joinpoint-Comparability Ratio model). Examples using ICD-9/ICD-10 cause-of-death coding changes and coding changes in the staging of cancer illustrate the use of these models.
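
A minimal sketch of the Joinpoint-Jump idea under simplifying assumptions: a single joinpoint located by grid search, a coding-change year that is known in advance, ordinary least squares on the log scale, and simulated rates. The article's models and fitting procedures are more general.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative annual rates on the log scale: the slope changes at year 10
# (the joinpoint) and a coding change at year 15 adds a sudden jump.
years = np.arange(25)
true_log = (3.0 + 0.04 * years
            - 0.07 * np.clip(years - 10, 0, None)
            + 0.15 * (years >= 15))
log_rate = true_log + rng.normal(scale=0.02, size=years.size)

def fit(joinpoint, jump_year):
    # One linear segment before the joinpoint, a slope change after it,
    # and a level shift (the jump) at the known coding-change year.
    X = np.column_stack([
        np.ones_like(years, dtype=float),
        years,
        np.clip(years - joinpoint, 0, None),
        (years >= jump_year).astype(float),
    ])
    beta, res, *_ = np.linalg.lstsq(X, log_rate, rcond=None)
    return beta, res[0]

# Grid search over the joinpoint location; the jump year is fixed (coding change).
best = min((fit(j, 15) + (j,) for j in range(3, 22)), key=lambda t: t[1])
beta, _, joinpoint = best
print("joinpoint at year", joinpoint)
print("slopes:", round(beta[1], 3), "then", round(beta[1] + beta[2], 3))
print("estimated jump:", round(beta[3], 3))
```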

Abstract

This article discusses the use of the composite estimator with the optimal weight to reduce the variance (or mean squared error, MSE) of the ratio estimator. To study the practical usefulness of the proposed composite estimator, a Monte Carlo simulation is performed comparing the bias and MSE of composite estimators (with estimated optimal weight and with known optimal weight) with those of the simple expansion and the ratio estimators. Two examples, one regarding the estimation of dead fir trees via an aerial photo and the other regarding the estimation of the average sugarcane acres per county, are included to illustrate the use of the composite estimator developed here.
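
A small Monte Carlo in the spirit described, using an invented population in which y is roughly proportional to a known auxiliary x under simple random sampling. The composite weight used here is the empirically optimal weight computed from the replication errors (the "known optimal weight" case), not the estimated weight also studied in the article.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative population: y roughly proportional to a known auxiliary x.
N = 2000
x = rng.gamma(shape=4.0, scale=5.0, size=N)
y = 1.5 * x + rng.normal(scale=6.0, size=N)
Y_total, X_total = y.sum(), x.sum()

n, reps = 50, 5000
expansion, ratio = np.empty(reps), np.empty(reps)
for r in range(reps):
    s = rng.choice(N, size=n, replace=False)          # simple random sample
    expansion[r] = N * y[s].mean()                     # simple expansion estimator
    ratio[r] = (y[s].mean() / x[s].mean()) * X_total   # ratio estimator

# Optimal composite weight W minimising the MSE of W*ratio + (1 - W)*expansion,
# computed here from the replication errors (the "known optimal weight" case).
a, b = ratio - Y_total, expansion - Y_total
W = (np.mean(b * b) - np.mean(a * b)) / (np.mean(a * a) + np.mean(b * b) - 2 * np.mean(a * b))
composite = W * ratio + (1 - W) * expansion

for name, est in [("expansion", expansion), ("ratio", ratio), ("composite", composite)]:
    print(f"{name:9s} bias={est.mean() - Y_total:8.1f}  MSE={np.mean((est - Y_total) ** 2):12.1f}")
```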

Abstract

The requirement to anonymise data sets that are to be released for secondary analysis should be balanced against the need for analyses of those data to provide efficient and consistent parameter estimates. The proposal in this article is to integrate the processes of anonymisation and data analysis. The first stage adds random noise with known distributional properties to some or all variables in a released (already pseudonymised) data set, in which the values of some identifying and sensitive variables for data subjects of interest are also available to an external ‘attacker’ who wishes to identify those data subjects in order to interrogate their records in the data set. The second stage consists of specifying the model of interest so that parameter estimation accounts for the added noise. Where the characteristics of the noise are made available to the analyst by the data provider, we propose a new method that allows a valid analysis. This is formally a measurement error model, and we describe a Bayesian MCMC algorithm that recovers consistent estimates of the true model parameters. A new method for handling categorical data is presented. The article shows how an appropriate noise distribution can be determined.
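
As a hedged illustration of why the noise characteristics must be shared with the analyst, the sketch below applies a simple reliability-ratio (method-of-moments) correction for a single noisy covariate rather than the article's Bayesian MCMC measurement error model; the data and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(6)

# Released data: the covariate x is perturbed with Gaussian noise whose
# variance is published to the analyst by the data provider.
n = 10_000
x = rng.normal(0.0, 2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

noise_sd = 1.5                       # known property of the added noise
x_released = x + rng.normal(0.0, noise_sd, size=n)

# Naive regression on the noisy covariate: the slope is attenuated.
naive_slope = np.cov(x_released, y)[0, 1] / np.var(x_released, ddof=1)

# Correct for the known noise variance (classical measurement-error
# correction): divide by the reliability ratio var(x) / var(x_released).
reliability = (np.var(x_released, ddof=1) - noise_sd**2) / np.var(x_released, ddof=1)
corrected_slope = naive_slope / reliability

print("naive slope:    ", round(naive_slope, 3))      # biased toward zero
print("corrected slope:", round(corrected_slope, 3))  # close to the true 0.5
```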