Statistical analysis of data set on national reporting of emission of air pollutants. Part I: investigation of outliers / Analiza statystyczna zbioru danych pochodzącego z krajowej sprawozdawczości emisji zanieczyszczeń do powietrza, Cz. I: wykrywanie wartości odstających

Open access


The Polish emission reporting system, “Krajowa baza o emisjach gazów cieplarnianych i innych substancji” (National Emission Database, NED), was established at the end of 2010. Initially (data submitted for 2010), the database contained reported emissions of greenhouse gases and air pollutants from plants holding Integrated Pollution Prevention and Control permits (i.e., integrated permits for the release of gases and dusts into the air). The emissions reported to the NED are treated as emissions from local sources and partly as emissions from point sources, and in the case of air pollutants they can be included in a national emission inventory as point source data. In the near future, the database is planned to serve as an integrated system for national air emission management (emission data from all sources will then be required for paying the “tax for the use of the environment”, which will be regulated by Polish national law). This paper is part of the work on analysing the reported emission data. Further research on the data collected in the national database might support the development of a National Emission Inventory as well as of country-specific emission factors (e.g. for combustion and industrial processes). The analysed data (emissions of NOX, CO, SOX and TSP) were taken from the point source data submitted for 2011, primarily with the aim of improving the quality of the data submitted previously for 2010. As the first study in this research, the paper investigates outliers in the reported data using basic statistical methods.
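The abstract does not name the statistical methods applied. As a minimal illustration of one classical outlier screen, the sketch below computes the two-sided Grubbs statistic, G = max|x_i - mean| / s, for a small sample. It is not presented as the paper's procedure: the function name `grubbs_statistic` and the sample values are hypothetical and are not taken from the NED. The comparison of G with a tabulated critical value is left out.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the two-sided Grubbs statistic.

    G is the largest absolute deviation from the sample mean divided by
    the sample standard deviation; index points at the most extreme value.
    """
    if len(values) < 3:
        raise ValueError("Grubbs' test needs at least three observations")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("all values are identical; G is undefined")
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / s, index


# Hypothetical annual NOX emissions (Mg/year) from comparable sources;
# these numbers are invented for illustration only.
reported = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 48.7]
g, idx = grubbs_statistic(reported)
print(f"most extreme value: {reported[idx]} (G = {g:.3f})")
```

To decide whether the flagged value is an outlier, G would be compared with the critical value for the chosen significance level and sample size. Grubbs' test assumes approximately normal data, so skewed emission data may need a transformation first.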



