
An efficient and automatic ECG arrhythmia diagnosis system using DWT and HOS features and entropy- based feature selection procedure



Introduction

At present, one of the major causes of death worldwide is cardiovascular disease (CVD), linked to the epidemiological transition toward unhealthy lifestyles such as smoking, diabetes mellitus and obesity [1,2]. Atrial and ventricular arrhythmias are two important factors in cardiovascular disease, and cardiac rhythm disorders and heartbeat abnormalities are major public health challenges in developed countries. Variations in cardiac cellular electrophysiology are the main cause of arrhythmias, and common causes of sudden cardiac death are a consequence of arrhythmogenic cardiac disorders.

An electrocardiogram (ECG) is a simple recording of the electrical activity of the heart obtained with surface electrodes [3]. The electrocardiogram is an efficient non-invasive tool with a variety of applications in the biomedical sciences, such as diagnosing rhythm disturbances, evaluating heart rate, monitoring cardiac rhythm, biometric identification and emotion recognition.

To avoid CVD deaths caused by the long-term effects of cardiac arrhythmias, the arrhythmias must be detected in time for proper diagnosis. Today, ECG analysis is not limited to the diagnosis of cardiovascular diseases; researchers have exploited ECG signals for many other applications such as emotion recognition and biometric identification.

To diagnose heartbeat abnormalities, the ECG signal must be analyzed, but problems such as heartbeat irregularity, the presence of artifacts, the inter-subject variability of ECG data and time-consuming analysis make it challenging for physicians to detect abnormalities. Therefore, analysis of ECG signals with computer-aided tools can help physicians identify abnormalities efficiently [4, 5].

The four major stages of a heartbeat abnormality diagnosis procedure are preprocessing, feature extraction, feature selection and classification [6]. ECG recordings are usually contaminated by various types of artifacts and noise. The goal of the preprocessing stage is to reduce these artifacts and noise and to improve the signal for subsequent processing.

As an important step, feature extraction converts the ECG signal into a collection of features; feature extraction techniques are classified into temporal, frequency and time-frequency methods. Temporal methods do not provide good discrimination because the variations in amplitude and duration of the ECG signal are subtle [7]. Frequency techniques are also not well suited to ECG data, because they cannot capture the temporal information of the signal. Therefore, an appropriate time-frequency method is the best option. The wavelet transform (WT) is the most extensively applied time-frequency technique [8, 9, 10, 11], providing high resolution in both the time and frequency domains.

High dimensionality of the feature space increases computational time and decreases classification accuracy. To build robust learning models, a subset of the relevant features should be determined. The main objectives of feature selection are a faster and more efficient learning procedure, improved classification performance and a better understanding of the underlying process that generates the data [12].

Various types of classifiers have been used for ECG classification and analysis tasks. These diverse classifiers can be typically classified into several classes such as linear discriminant analysis (LDA), artificial neural networks (ANNs), k nearest neighbor (kNN), Bayesian classifiers, decision tree (DT), and support vector machine (SVM) [13].

In recent years, many studies have addressed the analysis and classification of ECG signals. A linear procedure combining DWT feature extraction with principal component analysis (PCA) for feature reduction was applied to ECG signal classification in [14]. Other recently proposed works on ECG signal analysis that use PCA for feature dimension reduction are given in [15, 16, 17]. Linear methods provide good classification accuracy under noise-free conditions, but in the presence of noise they cannot achieve maximum accuracy [18]. Nonlinear methods, which exploit the hidden information in ECG data, achieve better performance under noisy conditions [19]. In [18, 20], the use of higher order spectra (HOS) cumulants improves diagnosis; HOS-based features are less affected by morphological variations of the ECG. In [21], an arrhythmia diagnosis method was presented that used a mixture of features including morphological features, HOS, higher order statistics of the wavelet coefficients and Fourier transform coefficients. Das and Ari [22] combined wavelet transform and S-transform (ST) features and selected the most effective ones. Oster et al. [23] extracted ECG features using switching Kalman filters. Elhaj et al. [24] reduced the dimensionality of ECG features using principal component and independent component analysis, and performed arrhythmia classification with SVM and NN classifiers.

In this paper, an efficient approach to ECG arrhythmia classification is proposed that uses linear and nonlinear feature extraction methods together with an entropy-based feature selection method. Since ECG beat classification depends strongly on the feature extraction stage, the DWT, an efficient tool for analyzing nonstationary signals, is used to extract linear features from the ECG signals. In addition, because HOS features support accurate diagnosis and are less affected by morphological variations, the HOS cumulants are used as nonlinear features. Furthermore, a fast and effective entropy-based feature selection method is used for dimensionality reduction. Entropy measures the average information content; since features that contain more information have higher entropy, this criterion is used for feature selection. Compared with other linear and nonlinear dimensionality reduction techniques, the proposed feature selection method has lower computational complexity. Finally, the selected features are fed to neural network and support vector machine classifiers for automatic classification.

ECG data

The database used in this work is the MIT-BIH Arrhythmia database, which includes normal beats and common arrhythmias and is used as a reference for arrhythmia detectors. A total of 48 records were analyzed, each containing 30 min of continuous, annotated ECG. The recordings include normal clinical recordings as well as complex ventricular, junctional and supraventricular arrhythmias [25]. The records were sampled at 360 Hz and band-pass filtered at 0.1–100 Hz [25]. In this paper, the method was evaluated on five classes of beats: non-ectopic beats (N), supraventricular ectopic beats (S), ventricular ectopic beats (V), fusion beats (F) and unknown beats (U). A summary of the five classes of ECG beats is presented in Table 1.

Table 1. Summary of ECG heartbeat types grouped according to the ANSI/AAMI EC57:1998 standard [25].

ANSI/AAMI class: corresponding MIT-BIH heartbeat types
- Non-ectopic (N): Normal (N); Left bundle branch block (LBBB); Right bundle branch block (RBBB); Nodal (junctional) escape (j); Atrial escape beat (e)
- Supraventricular (S): Aberrated atrial premature (A); Atrial premature (a); Supraventricular premature (S); Nodal (junctional) premature (J)
- Ventricular (V): Ventricular escape (V); Premature ventricular contraction (E)
- Fusion (F): Fusion of ventricular and normal (F)
- Unknown (U): Unclassifiable (U); Paced (p); Fusion of paced and normal (f)
Materials and methods

The proposed system includes three main phases: preprocessing, feature extraction and classification. The procedure of the presented method is illustrated in Fig. 1. Each phase is described in the following sections.

Fig. 1

Block diagram of the proposed technique.

Preprocessing

The ECG signal is contaminated by various kinds of noise and artifacts, such as contact noise, muscle artifacts, baseline drift, power line interference, electromyographic artifacts and electrode motion artifacts.

In this paper, the preprocessing stage is divided into two phases. The first is noise reduction of the ECG signal using the discrete wavelet transform; the second is segmentation of the ECG signal. To segment the ECG data, the R peaks specified in the annotation files of the MIT-BIH Arrhythmia database are used. A window of 200 samples around each R peak (100 samples to the left and 99 to the right) is then selected. Each phase is described in the following subsections.

DWT-based Noise reduction

In this paper, the DWT is used to reduce noise in the ECG signal, since it is an effective tool for analyzing non-stationary signals. The Daubechies 6 (db6) wavelet basis function is applied, and the ECG signal is decomposed into nine levels [24]. The 9th-level approximation sub-band, which covers the frequency range 0–0.351 Hz and mostly contains baseline wander, is not used to reconstruct the denoised ECG signal. Moreover, the 45–90 Hz and 90–180 Hz sub-bands are discarded, because frequencies above 45 Hz are not relevant to ECG recognition. The detail coefficients of sub-bands 3 to 9 are then used with the inverse wavelet transform to reconstruct the clean ECG signal [24].
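A minimal sketch of this denoising step, assuming the PyWavelets package; the db6 wavelet, the nine-level decomposition and the discarded sub-bands follow the description above, while the function name and interface are illustrative:

```python
# Sketch of the DWT-based denoising described above (db6, 9 levels):
# the 9th-level approximation (baseline wander) and the two finest detail
# levels (roughly 45-90 Hz and 90-180 Hz at fs = 360 Hz) are zeroed out
# before reconstruction with the inverse transform.
import numpy as np
import pywt

def denoise_ecg(signal, wavelet="db6", levels=9):
    coeffs = pywt.wavedec(signal, wavelet, level=levels)  # [A9, D9, ..., D2, D1]
    coeffs[0] = np.zeros_like(coeffs[0])    # drop A9 (0-0.351 Hz, baseline wander)
    coeffs[-1] = np.zeros_like(coeffs[-1])  # drop D1 (90-180 Hz)
    coeffs[-2] = np.zeros_like(coeffs[-2])  # drop D2 (45-90 Hz)
    return pywt.waverec(coeffs, wavelet)[: len(signal)]
```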

ECG Segmentation

After the noise elimination stage, the locations of R peaks that are specified in the annotated file of the MIT-BIH database, are extracted.

After the R peak is located, 100 samples to its left, 99 samples to its right and the R peak itself are selected as a beat, i.e. a segment of 200 samples. Fig. 2 displays five categories of heartbeats denoised and segmented by this procedure.

Fig. 2

Example of five different categories of heartbeats that are denoised and segmented.
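A minimal sketch of this segmentation step; the R-peak sample indices are assumed to have been read from the database annotation files beforehand (for example with the wfdb package), which is not shown here:

```python
# Sketch of the beat segmentation: a 200-sample window around each annotated
# R peak (100 samples before the peak, the peak itself, 99 samples after).
import numpy as np

def segment_beats(ecg, r_peaks, left=100, right=99):
    beats = []
    for r in r_peaks:
        if r - left >= 0 and r + right < len(ecg):   # skip windows falling off the record
            beats.append(ecg[r - left : r + right + 1])
    return np.asarray(beats)                          # shape: (n_beats, 200)
```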

Feature extraction and selection

ECG heartbeat classification and recognition depend on different features [26]. A feature set is constructed using linear DWT-based and nonlinear HOS-based feature extraction techniques. To obtain efficient features from the high-dimensional feature space, an entropy-based feature selection technique is applied. The DWT-based and HOS-based feature extraction techniques and the entropy-based feature selection method are described in the following subsections.

Linear DWT-based feature extraction

Electrocardiographic signals are non-stationary in nature. This property makes the DWT an efficient tool for the analysis of, and frequency-based feature extraction from, ECG signals, owing to its strong time-frequency localization [13]. The DWT is a linear transform that decomposes a signal into components appearing at several scales [13]. Temporal localization of the spectral components can be derived from the DWT; accordingly, the DWT provides a time-frequency representation of the signal [13].

In this article, feature extraction is performed using the DWT. In the proposed method, the wavelet coefficients of the third and fourth levels of detail (D3 and D4) are extracted. The mother wavelet used in this paper is the Daubechies wavelet of order 2 (db2), because of its morphological similarity to the QRS complex of the ECG beat [27]. Five linear statistical parameters (minimum, maximum, mean, standard deviation and power of the wavelet coefficients) are calculated from each sub-band.
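A minimal sketch of this linear feature extraction, again assuming PyWavelets; the db2 wavelet, the D3/D4 sub-bands and the five statistics come from the description above, while "power" is taken here to be the mean squared coefficient value, which is an assumption:

```python
# Sketch of the DWT-based linear features: decompose a 200-sample beat with
# db2 and compute min, max, mean, standard deviation and power of the D3 and
# D4 detail coefficients (2 sub-bands x 5 statistics = 10 linear features).
import numpy as np
import pywt

def dwt_features(beat, wavelet="db2", level=4):
    coeffs = pywt.wavedec(beat, wavelet, level=level)  # [A4, D4, D3, D2, D1]
    feats = []
    for d in (coeffs[1], coeffs[2]):                   # D4 and D3
        feats += [d.min(), d.max(), d.mean(), d.std(), np.mean(d ** 2)]
    return np.asarray(feats)
```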

Nonlinear HOS-based feature extraction

In this paper, the HOS method is applied for nonlinear feature extraction. The HOS technique is suited to the analysis of non-stationary, non-linear and non-Gaussian signals [26].

First- and second-order statistics play an important role in bio-signal processing. However, since the first two orders of statistics are not adequate to capture the nonlinear characteristics of a signal, third- and fourth-order cumulants, which are the third- and fourth-order correlations derived from higher order spectra (HOS), were used in this analysis.

Assume x(n) is a discrete-time stationary signal whose moments exist up to order n. The nth-order moment function of x(n) is defined by:

$$m_{n}^{x}\left( \tau_{1},\tau_{2},\ldots,\tau_{n-1} \right)=E\left[ x(n)\,x(n+\tau_{1})\cdots x(n+\tau_{n-1}) \right]$$

where $m_{1}^{x}, m_{2}^{x}, m_{3}^{x}$ and $m_{4}^{x}$ are the first four moments, $\tau_{i}$ are the time lags and $E[\cdot]$ denotes the expectation operator. The cumulants can then be calculated as nonlinear combinations of the moments:

$$c_{1}^{x}=m_{1}^{x}$$
$$c_{2}^{x}=m_{2}^{x}\left( \tau_{1} \right)$$
$$c_{3}^{x}=m_{3}^{x}\left( \tau_{1},\tau_{2} \right)$$
$$c_{4}^{x}=m_{4}^{x}\left( \tau_{1},\tau_{2},\tau_{3} \right)-m_{2}^{x}\left( \tau_{1} \right)m_{2}^{x}\left( \tau_{2}-\tau_{3} \right)-m_{2}^{x}\left( \tau_{2} \right)m_{2}^{x}\left( \tau_{3}-\tau_{1} \right)-m_{2}^{x}\left( \tau_{3} \right)m_{2}^{x}\left( \tau_{1}-\tau_{2} \right)$$

where $c_{1}^{x}, c_{2}^{x}, c_{3}^{x}$ and $c_{4}^{x}$ are the first four cumulants, respectively.
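The cumulant relations above can be estimated directly from a beat. The sketch below assumes a zero-mean segment and non-negative lags; the particular lag values are illustrative, since they are not specified in the text:

```python
# Direct sample estimates of the 2nd-, 3rd- and 4th-order cumulants at given
# lags, following the moment/cumulant relations above.
import numpy as np

def _moment(x, lags):
    # E[x(n) x(n+lag1) x(n+lag2) ...] estimated over the valid sample range
    n = len(x) - max(lags)
    prod = x[:n].copy()
    for t in lags:
        prod = prod * x[t : t + n]
    return prod.mean()

def cumulants(x, t1, t2, t3):
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                          # zero-mean assumption
    c2 = lambda t: _moment(x, (t,))           # 2nd-order cumulant (autocorrelation)
    c3 = _moment(x, (t1, t2))                 # 3rd-order cumulant
    c4 = (_moment(x, (t1, t2, t3))            # 4th-order cumulant
          - c2(t1) * c2(abs(t2 - t3))
          - c2(t2) * c2(abs(t3 - t1))
          - c2(t3) * c2(abs(t1 - t2)))
    return c2(t1), c3, c4
```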

Feature selection

In ECG signal processing, one of the major concerns, with respect to both classification accuracy and computational complexity, is the high dimensionality of the feature space. Feature selection covers a wide variety of methods that identify a subset of relevant features and build robust learning models by discarding the redundant and irrelevant features [13].

In information theory, entropy quantifies the amount of uncertainty in the value of a random variable or the outcome of a random process. Shannon's entropy H(X) measures the average information of a random variable; therefore, the entropy measure can be used to select the features that carry more information. The widely used Shannon entropy is defined as:

$$H=-\sum\nolimits_{i}{p_{i}\,\log_{2}\left( p_{i} \right)}$$

where $p_{i}$ is the probability of occurrence of the i-th possible value of the random variable. This paper adopts a feature selection method based on the Shannon entropy measure, in which the features with the highest entropy values are selected for classification.
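A minimal sketch of this selection rule: the entropy of each feature is estimated from a histogram of its values over the training beats, and the features with the highest entropy are kept. The number of histogram bins is an illustrative choice; k = 22 matches the number of features used later for classification.

```python
# Entropy-based feature selection: score each feature column by its Shannon
# entropy (histogram estimate) and keep the k highest-scoring features.
import numpy as np

def shannon_entropy(values, bins=32):
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return -np.sum(p * np.log2(p))

def select_by_entropy(X, k=22):
    scores = np.array([shannon_entropy(X[:, j]) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]   # indices of the k highest-entropy features
```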

Classification

For each ECG beat, 22 features were selected from the composed feature vector. The resulting feature vectors were evaluated with neural network and support vector machine classifiers.

Neural network classifier

Neural networks have different capabilities depending on the application and are used in various fields, such as aerospace, finance, telecommunications and medicine. The feed-forward NN [24] plays an important role in classifying ECG data, and in this study a feed-forward neural network was used for pattern classification. In general, parameters such as the number of hidden layers, the number of hidden neurons and the learning algorithm make a significant contribution to the performance of a feed-forward NN [24]. In this paper, 22 nodes were used in the input layer, corresponding to the 22 features, with 14 neurons in the hidden layer and 5 neurons in the output layer corresponding to the five classes. After several trial-and-error experiments, 14 hidden neurons were found to give the highest accuracy.

The back-propagation procedure was applied for training. The network weights were updated to minimize the mean square error (MSE), and training terminates when the MSE falls below a specified threshold. The trained neural network was then used to classify the test patterns.
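The paper's network was implemented in Matlab; as a rough scikit-learn equivalent of the 22-14-5 back-propagation network (the solver, activation, learning rate and stopping tolerance below are assumptions, not the authors' exact settings):

```python
# Feed-forward network with one 14-neuron hidden layer trained by
# back-propagation; the 5-node output layer follows from the five class
# labels in y_train, and training stops once the loss improvement drops
# below tol.
from sklearn.neural_network import MLPClassifier

def train_nn(X_train, y_train):
    nn = MLPClassifier(hidden_layer_sizes=(14,), activation="logistic",
                       solver="sgd", learning_rate_init=0.01,
                       max_iter=2000, tol=1e-4)
    return nn.fit(X_train, y_train)   # X_train: n_beats x 22 selected features
```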

SVM classifier

The SVM classifier uses a nonlinear mapping to transform the input patterns into a higher-dimensional feature space, where it separates two classes of samples by an optimal separating hyperplane. Although the SVM is a single-layer classifier, it handles supervised classification problems well because of its generalization ability [24]. Because the SVM classifier uses the maximal-margin principle, it offers a desirable generalization capability: it maximizes the distance between the patterns and the class-separating hyperplane. An objective function is formulated based on the distances to the class-separating hyperplane, and an optimization process is carried out [24]. The SVM can use different kernel functions, such as the radial basis function (RBF), polynomial and quadratic kernels. The C parameter and the kernel parameter play an important role in the performance of the SVM, because the number of support vectors and the margin of the SVM are determined by these parameters.
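A minimal scikit-learn sketch of the RBF-kernel SVM (the paper's implementation is in Matlab); the default values C = 65 and gamma = 0.6 below are the settings reported in the Results section, and scikit-learn handles the five-class problem internally with a one-vs-one scheme:

```python
# RBF-kernel SVM; C and gamma control the margin and the number of support
# vectors, as discussed above.
from sklearn.svm import SVC

def train_svm(X_train, y_train, C=65, gamma=0.6):
    return SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
```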

Ethical approval

The conducted research is not related to either human or animal use.

Result

To evaluate the efficiency of the classification, three common metrics (sensitivity, specificity and accuracy) are used:

$$\text{Accuracy}\left( \text{Acc} \right)=\frac{TP+TN}{TP+TN+FP+FN}$$
$$\text{Sensitivity}\left( \text{Sen} \right)=\frac{TP}{TP+FN}$$
$$\text{Specificity}\left( \text{Spe} \right)=\frac{TN}{TN+FP}$$

In the above equations, TP (true positives) is the number of beats correctly assigned to a class, TN (true negatives) the number of beats correctly rejected, FN (false negatives) the number of beats of that class that were missed, and FP (false positives) the number of beats incorrectly assigned to the class.
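A small sketch of how these per-class metrics can be computed one-vs-rest from a confusion matrix; aggregating over the five classes in this way is an assumption, since the paper reports single summary values.

```python
# Per-class sensitivity, specificity and accuracy derived one-vs-rest from
# the confusion matrix, following the TP/TN/FP/FN definitions above.
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_metrics(y_true, y_pred, labels):
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    metrics = {}
    for i, label in enumerate(labels):
        tp = cm[i, i]
        fn = cm[i, :].sum() - tp
        fp = cm[:, i].sum() - tp
        tn = cm.sum() - tp - fn - fp
        metrics[label] = {"Sen": tp / (tp + fn),
                          "Spe": tn / (tn + fp),
                          "Acc": (tp + tn) / (tp + tn + fp + fn)}
    return metrics
```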

All experiments were performed in Matlab on ECG signals from the MIT-BIH Arrhythmia database, denoised by the wavelet-based algorithm described above. Each selected segment is 200 samples long, comprising 100 samples to the left of the R peak, the R peak itself and 99 samples to its right.

A mixture of DWT-based linear and HOS-based nonlinear features, consisting of 22 features per beat, was selected. The NN and SVM-RBF classifiers were applied to the input feature vectors. The SVM-RBF (nonlinear SVM with a Gaussian kernel) was chosen because it gave the best classification accuracy; the C and gamma parameters of this kernel were set to 65 and 0.6, respectively.

Further, the feed-forward NN classifier was applied for comparison. In accordance with the 22 selected features, an input layer with 22 nodes was employed, along with a hidden layer with 14 nodes and an output layer with 5 nodes (one per class).

To examine the performance of the classifiers, 10-fold cross-validation was used to measure the efficiency of the proposed system. The results of the proposed model for all classes with the NN and SVM-RBF classifiers are presented in Table 2. As can be seen, the SVM-RBF performs better than the NN classifier.
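A brief sketch of this evaluation protocol; whether the folds were stratified and which random seed was used are not stated in the paper, so both are assumptions here.

```python
# 10-fold cross-validation of the SVM-RBF classifier on the selected
# features, scored by overall accuracy.
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

def evaluate(X_selected, y):
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    scores = cross_val_score(SVC(kernel="rbf", C=65, gamma=0.6),
                             X_selected, y, cv=cv, scoring="accuracy")
    return scores.mean()
```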

Table 2. Classification results of the proposed method with the two classifiers.

Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%)
SVM-RBF | 99.57 | 99.89 | 99.83
NN | 97.58 | 99.39 | 99.03
Discussion

This paper presents ECG beat classification using a mixture of feature extraction methods and an entropy-based feature selection technique. The results show that the proposed procedure classifies the arrhythmia classes with 99.83% accuracy using the SVM-RBF classifier and 99.03% accuracy using the NN classifier. The performance of the proposed method was then compared with related work: eight ECG arrhythmia classification methods were selected for comparison, and the results are presented in Table 3.

Table 3. Comparison of the classification performance of the proposed method with studies based on the same database.

Literature | Year | Features | Classifier | Classes | Accuracy (%)
Martis et al. [20] | 2012 | PCA | SVM-RBF | 5 | 98.11
Martis et al. [14] | 2013 | DWT + PCA | SVM-RBF | 5 | 96.92
Martis et al. [14] | 2013 | DWT + PCA | NN | 5 | 98.78
Osowski and Linh [19] | 2001 | HOS | Hybrid fuzzy NN | 7 | 96.06
Martis et al. [27] | 2013 | Cumulant + PCA | NN | 5 | 94.52
Elhaj et al. [24] | 2016 | PCA + DWT + HOS + ICA | SVM-RBF | 5 | 98.91
Elhaj et al. [24] | 2016 | PCA + DWT + HOS + ICA | NN | 5 | 98.90
Acharya et al. [28] | 2017 | Raw data | CNN | 5 | 94.03
Yang et al. [29] | 2018 | PCAnet | Linear SVM | 5 | 97.94
Oh et al. [30] | 2018 | Raw data | CNN-LSTM | 5 | 98.10
Proposed | - | DWT + HOS | SVM-RBF | 5 | 99.83
Proposed | - | DWT + HOS | NN | 5 | 99.03

The results show that the classification accuracy of the proposed method (with both the SVM and NN classifiers) is better than that of the other methods, so the proposed method can serve as an efficient tool for diagnosing heart disease.

Conclusion

One of the first steps in checking the health of the heart is to record the ECG signal, which provides very important information about the condition of the heart. In this research, an efficient heartbeat classification procedure was proposed using a mixture of feature extraction methods and an entropy-based feature selection technique.

The entropy measure was used as a feature selection method to reduce dimensionality. The patterns were classified with two different classifiers (SVM-RBF and NN). The experimental results show that the presented method can classify the five arrhythmia classes with high accuracy: 99.03% with the neural network and 99.83% with the support vector machine. This method can be applied in different types of arrhythmia recognition systems to increase efficiency while reducing the dimensionality of the feature vector.