As a result of the growing digitization of society and the development of the electronic economy, current statistical data sources, including administrative registers, do not satisfy society's information needs. Consequently, gaps in the statistical coverage of a number of sectors of the economy are growing. One example is the secondary real estate market, which is only partially covered by official statistical data sources. On the other hand, new data sources such as the Internet or Big Data can help narrow the information gap in official statistics. Web portals that specialise in brokerage on the real estate market should not be neglected as a data source for statistics. The aim of the paper is therefore to use two Web portals devoted to the housing market to estimate supply, measured as the number of flats offered for sale in Poznań, Poland. In addition, the classification and quality of Web portals are discussed.
New data sources, namely big data and the Internet, have become an important issue in statistics and for official statistics in particular. However, before these sources can be used for statistics, it is necessary to conduct a thorough analysis of sources of nonrepresentativeness.
In the article, we focus on detecting correlates of the selection mechanism that underlies Internet data sources for the secondary real estate market in Poland and results in representation errors (frame and selection errors). To identify the characteristics of properties offered online, we linked data collected from the two largest advertisement services in Poland with the Register of Real Estate Prices and Values, which covers all transactions made in Poland. Quarterly data for 2016 were linked at a domain level defined by local administrative units (LAU1), the urban/rural distinction and usable floor area (UFA), categorized into four groups. To identify correlates of representation error, we used a generalized additive mixed model fitted to almost 5,500 domains, counting each quarter separately.
The results indicate that properties not advertised online differ significantly from those advertised on the Internet in terms of UFA and location. A non-linear relationship with the average price per m² can be observed, which diminishes after accounting for LAU1 units.
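The domain-level linkage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields (`quarter`, `lau1`, `urban`, `ufa`), the UFA cut points, the toy records and the advert-to-register ratio are all assumptions made for illustration, and the generalized additive mixed model itself is not reproduced here.

```python
from collections import Counter

# Hypothetical UFA cut points in m2; the source only states "four groups".
UFA_BOUNDS = (40.0, 60.0, 80.0)


def ufa_group(ufa_m2):
    """Return the index (0-3) of the usable-floor-area group."""
    return sum(ufa_m2 >= bound for bound in UFA_BOUNDS)


def domain_key(record):
    """A domain is (quarter, LAU1 unit, urban/rural flag, UFA group)."""
    return (record["quarter"], record["lau1"], record["urban"],
            ufa_group(record["ufa"]))


def link_domains(adverts, register):
    """Count records per domain in both sources and join on the domain key."""
    n_ads = Counter(domain_key(r) for r in adverts)
    n_reg = Counter(domain_key(r) for r in register)
    linked = {}
    for key in sorted(set(n_ads) | set(n_reg)):
        ads, reg = n_ads[key], n_reg[key]
        linked[key] = {
            "adverts": ads,
            "register": reg,
            # Illustrative coverage measure; undefined without transactions.
            "ratio": ads / reg if reg else None,
        }
    return linked


# Toy records, invented purely to show the mechanics.
adverts = [
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 35.0},
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 38.5},
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 72.0},
]
register = [
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 36.0},
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 39.9},
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 30.0},
    {"quarter": "2016Q1", "lau1": "A", "urban": True, "ufa": 30.0},
    {"quarter": "2016Q1", "lau1": "A", "urban": False, "ufa": 95.0},
]

linked = link_domains(adverts, register)
for key, counts in linked.items():
    print(key, counts)
```

Domains that appear in only one source are kept with a zero count on the other side, since domains with transactions but no adverts (or the reverse) are exactly where representation error would show up.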