Preprocessing Raw Data in Clinical Medicine for a Data Mining Purpose

Open access


Working with medical data is both highly topical and difficult. A large amount of medical data is produced worldwide every day. For the purposes of our research, we define medical data as data about patients, such as results of laboratory analyses, results of screening examinations (CT, ECHO), and clinical parameters. Such data is usually in a raw format that is difficult to understand, non-standard, and unsuitable for further processing or analysis. This paper describes a possible method for preparing and preprocessing such raw medical data into a form to which further analysis algorithms can be applied.
