ECG CLASSIFICATION USING SLANTLET TRANSFORM AND ARTIFICIAL NEURAL NETWORK

Automatic detection and classification of cardiac arrhythmias is important for the diagnosis of cardiac abnormalities. This paper presents a method to accurately classify ECG arrhythmias through a combination of the slantlet transform and an artificial neural network (ANN). The ability of the slantlet transform to decompose a signal at various resolutions allows accurate extraction of features from non-stationary signals such as the ECG. The low-frequency coefficients, which contain the maximum information about the arrhythmia, were selected from the slantlet decomposition. These coefficients are fed to a Multi-Layer Perceptron (MLP) artificial neural network, which classifies the arrhythmias. In the present work the ECG data are taken from the standard MIT-BIH database. The proposed system is capable of distinguishing the normal sinus rhythm and nine different arrhythmias. The overall classification accuracy of the proposed approach is 98.40%. Three other transformation methods were also used, and the classification accuracy of each was compared with that of the slantlet system. These transformation methods are: the Fourier transform, which gives 67.80% accuracy; the discrete cosine transform, which gives 92.72% accuracy; and the wavelet transform (using Haar and Daubechies-4 scaling function coefficients, which give accuracies of 96.02% and 96.25%, respectively).


INTRODUCTION
Electrocardiogram (ECG) measurements are used to monitor the contraction of the cardiac muscles by measuring the propagation of electrical depolarization and repolarization in the atria and ventricles. The ECG waveform is divided into P, Q, R, S, T and U elements. Fig. 1 shows the components of a typical ECG signal. The P wave corresponds to atrial depolarization, which reflects contraction of both the left and right atria. The QRS complex represents the depolarization of the ventricles. The T wave represents ventricular repolarization, which prepares the cardiac muscle for another contraction. It is sometimes followed by a U wave, which represents the repolarization of the Purkinje fibers. Highly sensitive electrocardiograph tools can help cardiologists to detect various heart irregularities, cardiac diseases and damage. ECG interpretation is important for cardiologists in deciding the diagnostic categories of cardiac problems [1].

Fig. 1: Components of a typical ECG signal
Classification of ECG signals is an important area in biomedical signal processing, and several algorithms have been developed for the classification of ECG beats. These techniques extract features that are either temporal or transformed representations of the ECG waveforms. The extracted features are used in the pattern recognition system to classify the ECG beats. The subject of pattern recognition can be divided into two main areas of study, (1) feature extraction and (2) classifier design, as summarized in Fig. 2, where x(t) is the input signal.

Fig. 2 : Pattern recognition system
The paper presents a slantlet-based approach to extract features from the non-stationary ECG signal. The slantlet transform allows improved time-frequency localization of the signal. A supervised artificial neural network (ANN) is developed to recognize and classify the nonlinear morphologies.

Feature extraction
An ANN trained with the back-propagation algorithm classifies each input ECG beat into the appropriate class.
-WAVELET TRANSFORM
Wavelet transforms (WT) are used to decompose the original signal into a set of coefficients that describe the signal's frequency content at given times. The wavelet transform is designed to give good time resolution and poor frequency resolution at high frequencies, and good frequency resolution and poor time resolution at low frequencies. This approach makes sense especially when the signal at hand has high-frequency components for short durations and low-frequency components for long durations, which is the case in most biological signals, mainly the Electroencephalogram (EEG), Electromyogram (EMG), and ECG signals [2]. Wavelets are functions defined over a finite interval and having an average value of zero. The basic idea of the wavelet transform is to represent any arbitrary function of time as a superposition of a set of such wavelets or basis functions. These basis functions, or baby wavelets, are obtained from a single prototype wavelet called the mother wavelet by dilations or contractions (scaling) and translations (shifts) [3].

The Continuous Wavelet Transform
Wavelet functions are generated from a single function ψ, called the mother wavelet, by means of the scaling factor a and the translation factor b:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right)$$

where a is the scaling factor and b is the translation factor, and where ψ must satisfy:

$$\int_{-\infty}^{\infty} \psi(t)\,dt = 0$$

The basic idea of the wavelet transform is to represent any arbitrary function f as a decomposition in the wavelet basis, that is, to write f as an integral over a and b of ψ_{a,b}. The continuous wavelet transform of a signal f(t) is:

$$W_f(a,b) = \int_{-\infty}^{\infty} f(t)\,\psi^{*}_{a,b}(t)\,dt$$

The Discrete Wavelet Transform
The DWT of a signal x is calculated by passing it through a series of filters. First the samples are passed through a low-pass filter with impulse response g, resulting in a convolution of the two, as shown in the equation below [5]:

$$y[n] = (x * g)[n] = \sum_{k=-\infty}^{\infty} x[k]\,g[n-k]$$

The signal is also decomposed simultaneously using a high-pass filter h.
The outputs give the detail coefficients (from the high-pass filter) and the approximation coefficients (from the low-pass filter). It is important that the two filters are related to each other; together they are known as a quadrature mirror filter. Since half the frequencies of the signal have now been removed, half the samples can be discarded according to Nyquist's rule.
For many signals, the low-frequency content is the most important part, which gives the signal its identity. The high-frequency content, on the other hand, imparts details or noise. Wavelet analysis is therefore usually described in terms of approximations and details. The approximations are the high-scale, low-frequency components of the signal. The details are the low-scale, high-frequency components. The filter outputs are then down-sampled by 2, as shown in the following equations:

$$y_{\mathrm{low}}[n] = \sum_{k=-\infty}^{\infty} x[k]\,g[2n-k]$$

$$y_{\mathrm{high}}[n] = \sum_{k=-\infty}^{\infty} x[k]\,h[2n-k]$$

This decomposition has halved the time resolution, since only half of each filter output characterizes the signal. However, each output has half the frequency band of the input, so the frequency resolution has been doubled.
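One level of the filtering and down-sampling described above can be sketched as follows. This is a minimal illustration in Python rather than the implementation used in this work: the function names and the choice of the two-tap Haar filter pair are assumptions made for the example.

```python
import math

def convolve(x, g):
    """Full linear convolution y[n] = sum_k x[k] * g[n - k]."""
    y = [0.0] * (len(x) + len(g) - 1)
    for k, xk in enumerate(x):
        for j, gj in enumerate(g):
            y[k + j] += xk * gj
    return y

def dwt_level(x, g, h):
    """One analysis level: low-pass g and high-pass h, then keep every 2nd sample.

    Odd-indexed outputs are kept so that the two-tap Haar filters act on
    non-overlapping sample pairs; this phase choice is a convention.
    """
    low = convolve(x, g)[1::2]    # approximation coefficients
    high = convolve(x, h)[1::2]   # detail coefficients
    return low, high

# Haar filter pair (assumed for illustration only).
S = 1.0 / math.sqrt(2.0)
HAAR_LOW = [S, S]
HAAR_HIGH = [S, -S]

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = dwt_level(signal, HAAR_LOW, HAAR_HIGH)
```

Each output holds half as many samples as the input, and because the Haar pair is orthonormal the total energy of the two outputs equals that of the input.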

Discrete Wavelet Transform Using Filter Bank Structure
The DWT is calculated as described above; the structure that combines the high-pass filter, the low-pass filter and subsampling is called a filter bank. The decomposition is repeated to further increase the frequency resolution: the approximation coefficients are decomposed again with high-pass and low-pass filters and then down-sampled. This is represented as a binary tree whose nodes represent sub-spaces with different time-frequency localization. The tree is known as a filter bank and is shown in Fig. 3.
At each level in Fig. 3 the signal is decomposed into low and high frequencies. Because of the decomposition process, the length of the input signal must be a multiple of 2^n, where n is the number of levels [5].
And Artificial Neural Network R. Thabit

Derivations of The Slantlet Filters Coefficients [8]
The filters that construct the slantlet filter bank are g_i(n), f_i(n), and h_i(n). The L-scale filter bank has 2L channels. The low-pass filter is called h_L(n). The filter adjacent to the low-pass channel is called f_L(n). Both h_L(n) and f_L(n) are followed by downsampling by 2^L. The remaining 2L-2 channels are filtered by g_i(n) and its shifted time-reverse for i = 1, …, L-1. Each is followed by downsampling by 2^(i+1). The sought filter g_i(n) is described by four parameters and can be written as:

$$g_i(n) = \begin{cases} a_{0,0} + a_{0,1}\,n, & n = 0, \ldots, 2^{i}-1 \\ a_{1,0} + a_{1,1}\,(n - 2^{i}), & n = 2^{i}, \ldots, 2^{i+1}-1 \end{cases}$$

Obtaining g_i(n) such that the sought L-scale filter bank is orthogonal with 2 zero moments requires finding the parameters a_{0,0}, a_{0,1}, a_{1,0} and a_{1,1}. The same approach works for f_i(n) and h_i(n).
A generic artificial neural network can be defined as a computational system consisting of a set of highly interconnected processing elements, called neurons, which process information as a response to external stimuli. The inputs received by a single processing element (see Fig. 5) can be represented as an input vector X = (x_1, x_2, …, x_m), where x_i is the signal from the ith input and i = 1, …, m. The weights connected to the neuron can be represented as a weight vector W = (w_1, w_2, …, w_m), which holds the weights associated with the connections between the input vector X and the processing element. A neuron contains a threshold value that regulates its action potential; this is called the bias (x_0), and the weight of its connection is w_0. While the action potential of a neuron is determined by the weights associated with the neuron's inputs, the threshold modulates the response of the neuron to a particular stimulus, confining that response to a pre-defined range of values [9]. The equation below gives the output y of a neuron as an activation function f of the weighted sum of the m inputs and the bias:

$$y = f(\nu), \qquad \nu = w_0 x_0 + \sum_{i=1}^{m} w_i x_i$$

The activation function, denoted by f(ν), defines the output of a neuron in terms of the induced local field ν.
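The single-neuron model above can be illustrated with a short sketch. The function name, the default bias input x0 = 1 and the default tanh activation are assumptions made for the example, not details given in this section.

```python
import math

def neuron_output(x, w, w0, x0=1.0, activation=math.tanh):
    """Output y = f(v) of one neuron, with v = w0*x0 + sum_i w_i*x_i.

    x0 = 1.0 and the tanh activation are illustrative defaults.
    """
    v = w0 * x0 + sum(wi * xi for wi, xi in zip(w, x))  # induced local field
    return activation(v)

# Two inputs whose weighted contributions cancel, so v = 0 and tanh(0) = 0.
y = neuron_output([1.0, 2.0], [0.5, -0.25], w0=0.0)
```

Passing a different `activation`, such as the identity function, gives the linear neuron used in an output layer.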

Structure of ANN [9]
Neural networks are typically arranged in layers. Each layer in a layered network is an array of processing elements or neurons. Information flows through each element in an input-output manner. In other words, each element receives an input signal, manipulates it and forwards an output signal to the connected elements in the adjacent layer.

Back Propagation Algorithm [10]
Different network topologies with powerful learning strategies for solving nonlinear problems have been reported. For the present application, back propagation with momentum is used to train the feed-forward neural network. The output units (y_k units) have weights w_jk and the hidden units have weights w_ij. During the training phase, each output neuron compares its computed activation y_k with its target value d_k to determine the associated error E for the pattern, i.e.,

$$E = \frac{1}{2}\sum_{k=1}^{m} (d_k - y_k)^2$$

where m is the number of output neurons. The ANN weights and biases are adjusted to minimize this least-square error. The minimization problem is solved by the gradient technique, which is achieved by back propagation of the error. When momentum is used, the net proceeds not in the direction of the gradient alone, but in the direction of a combination of the current gradient and the previous direction of weight correction. Convergence is sometimes faster if a momentum term is added to the weight update formula.
The BPA is a supervised learning algorithm in which a mean square error function is defined, and the learning process aims to reduce the overall system error to a minimum. The connection weights are randomly assigned at the beginning and progressively modified to reduce the overall system error. The weight updating starts with the output layer and progresses backward. The weight update is in the direction of the negative gradient ('negative descent'), to maximize the speed of error reduction.
For effective training, it is desirable that the training data set be uniformly spread throughout the class domains. The available data can be used iteratively until the error function is reduced to a minimum.
The accuracy of the neural network classifier depends on several factors, such as the size and quality of the training set, the training method used, and the parameters chosen to represent the input.
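The momentum rule described in this section combines the current gradient with the previous weight correction. A minimal sketch is given below for a single linear neuron with squared error E = 0.5 (d - y)^2. The function names, the learning rate and the momentum value are hypothetical choices for the example, since the values used in this work are not stated here.

```python
def momentum_step(weights, grads, prev_delta, lr=0.1, momentum=0.9):
    """One update: delta = -lr * gradient + momentum * previous delta."""
    delta = [-lr * g + momentum * p for g, p in zip(grads, prev_delta)]
    new_weights = [w + d for w, d in zip(weights, delta)]
    return new_weights, delta

def train_linear_neuron(x, d, epochs=200, lr=0.1, momentum=0.5):
    """Fit y = sum_i w_i * x_i to one target d (illustrative toy problem)."""
    w = [0.0] * len(x)
    prev = [0.0] * len(x)
    for _ in range(epochs):
        y = sum(wi * xi for wi, xi in zip(w, x))
        grads = [-(d - y) * xi for xi in x]   # dE/dw_i for E = 0.5*(d - y)^2
        w, prev = momentum_step(w, grads, prev, lr, momentum)
    return w

trained = train_linear_neuron([1.0, 2.0], d=3.0)
```

In a multi-layer network the same update would be applied to every weight, with the gradients supplied by back propagation of the output error.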

-SYSTEM DESIGN STAGE
The problem under study is to classify ECG signals into the normal case and nine abnormal cases, depending on the features of the ECG signals. In this work the PhysioNet database of biological signals is used as the source of ECG records, namely the MIT-BIH ECG Database. This database is accessible on the internet and is widely used in experimental work on the classification of ECG signals and of biological signals in general [11].
The data files were recorded with different sampling frequencies, so the records must be resampled to a unified frequency. The unified frequency used in the proposed work is 360 Hz.
The proposed work is carried out on ten classes, with data collected from the following databases in the MIT-BIH ECG database: Arrhythmia, Atrial Fibrillation, Malignant Ventricular Ectopy, Supraventricular Arrhythmia, Normal Sinus Rhythm, and the PTB (Physikalisch-Technische Bundesanstalt, i.e. the National Metrology Institute of Germany) Diagnostic ECG Database.

-PROPOSED FLOW DIAGRAM
The general block diagram of the proposed classification system is shown in Fig. 6. In the present work, the MATLAB software package version 7.4.0.287 (R2007a) is used to implement the software design and algorithms. The first component of the system is the sampling block, in which the sampling rate of the signal is set to 360 Hz; if the sampling rate of the input signal is not equal to 360 Hz, re-sampling is performed. The data files used consist of time stamps and values recorded from two electrodes; some of these files contain records from more than two electrodes. The PTB Diagnostic ECG Database consists of time stamps and values recorded from 15 leads: each record includes 15 simultaneously measured signals, namely the conventional 12 leads (I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, V6) together with the 3 Frank lead ECGs (VX, VY, VZ). Each signal is digitized at 1000 samples per second. For each record the sampling rate is given in its associated header file. The records used are taken from lead II to find the ECG beats that are used in the pattern recognition system. After the sampling block comes the beat detection block, in which one beat of the signal is extracted. The extracted beat is passed to the feature extraction block for dimensionality reduction and for the creation of the feature vector. The feature vector is created using the SLT, DWT, DCT and FFT. The new feature vector is used as an input to the BP-NN, which classifies the beats into normal and abnormal ECG beats. Some of the extracted beats are used for training the neural network and the others are used for testing. The classifier performance was evaluated by calculating the accuracy of the network in the classification process.

The Input ECG Signal
The ECG record to be classified must be taken together with its header: the ECG record contains the ECG signals from a number of leads, and the header file contains the sampling rate used to sample those signals. For the records used in this work, the sampling rate is 360 Hz for the Arrhythmia database, 250 Hz for the Atrial Fibrillation, Malignant Ventricular Ectopy, Supraventricular Arrhythmia and Normal Sinus Rhythm records, and 1000 Hz for the PTB Diagnostic ECG Database [11].

Sampling Block
The sampling rate must be 360 Hz for all records so that each ECG record is suitable for the processing stages that follow. Whether the sampling rate is changed depends on the information contained in the header file of the record.
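The re-sampling step can be sketched as follows. The resampling method used in this work is not specified, so the linear interpolation below is only an assumed stand-in, and the function name is hypothetical. When reducing the rate (for example from 1000 Hz to 360 Hz), a practical implementation would also apply an anti-aliasing low-pass filter first, which this sketch omits.

```python
def resample_linear(samples, fs_in, fs_out=360.0):
    """Resample to fs_out by linear interpolation (no anti-aliasing filter)."""
    if fs_in == fs_out or len(samples) < 2:
        return [float(v) for v in samples]
    duration = (len(samples) - 1) / fs_in        # seconds spanned by the input
    n_out = int(duration * fs_out) + 1
    out = []
    for m in range(n_out):
        pos = m * fs_in / fs_out                 # position in input-sample units
        i = int(pos)
        if i >= len(samples) - 1:
            out.append(float(samples[-1]))
            continue
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out

# One second of a ramp recorded at 250 Hz, converted to 360 Hz.
ramp_250 = [float(n) for n in range(251)]
ramp_360 = resample_linear(ramp_250, fs_in=250.0)
```

Records already sampled at 360 Hz pass through unchanged, which matches the rule that re-sampling is applied only when the header reports a different rate.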

Windowing Process
First, one lead must be chosen from which to extract the ECG beat. The lead chosen in the proposed work is lead II, since most of the rhythms are seen in this lead. As the shape of each beat in the ECG waveform is asymmetric, the P-QRS-T complexes are selected using a window that extends 100 samples before the R-wave maximum and 155 samples after it, so that each extracted beat contains 256 samples (100 + 1 + 155). This extracts a single-beat ECG signal from the multi-beat data. The flow chart for the windowing process is shown in Fig. 7. The extracted beat is then ready for the feature extraction step.
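The windowing step can be sketched as below. The sketch assumes the index of the R-wave maximum is already known; how the R wave is located is not described in this section, and the function name is hypothetical.

```python
def extract_beat(signal, r_index, before=100, after=155):
    """Return the 256-sample window around an R-wave maximum.

    The window covers `before` samples, the R sample itself and `after`
    samples. Returns None when the window would run past either end.
    """
    start = r_index - before
    stop = r_index + after + 1          # slice end is exclusive
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]

record = list(range(1000))              # stand-in for a lead II record
beat = extract_beat(record, r_index=500)
```

Beats whose window would extend beyond the record are skipped here; that boundary policy is an assumption of the sketch.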

A. Feature Selection
It is practically impossible to apply any classification method directly to the ECG samples, because of the large number and high dimension of the samples needed to describe such a wide variety of clinical situations. A set of algorithms, ranging from signal conditioning to measurements of average wave amplitudes, durations and areas, is usually adopted to produce a quantitative description of the signal and to extract parameters. Several techniques for medical diagnostic classification are then applied to this set of extracted ECG parameters, such as probabilistic approaches, heuristic models, and knowledge-based systems. The aim of this work was to determine suitable input feature vectors that discriminate between normal beats and different types of heart disease.

B. DWT Coefficients Extraction
In the present work the Db4 and Haar wavelets have been used as the mother wavelets. To achieve good time-frequency localization, the preprocessed ECG signal is decomposed using the DWT up to the third level. The 3-level wavelet decomposition structure is shown in Fig. 3. The result is 4 different subsets: three subsets for the details (the wavelet function coefficients) and a fourth for the approximation (the scaling function coefficients). Since most of the information is concentrated in the low-frequency components, only the 32 samples that result from the level-3 low-pass filter (256/8 = 32) are used as the features of the input signal.
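The reduction from a 256-sample beat to 32 approximation coefficients can be sketched for the Haar case as follows. Only the low-pass branch is computed, since only the level-3 approximation is kept as features. The function name is hypothetical, and the Db4 case, which needs longer filters and a boundary-handling rule, is not covered.

```python
import math

def haar_approximation(x, levels=3):
    """Level-`levels` Haar approximation coefficients of x.

    Each level replaces pairs of samples by (a + b) / sqrt(2), halving
    the length, so 256 samples give 32 coefficients after 3 levels.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx = [float(v) for v in x]
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("length must be a multiple of 2**levels")
        approx = [scale * (approx[2 * i] + approx[2 * i + 1])
                  for i in range(len(approx) // 2)]
    return approx

flat_beat = [1.0] * 256                 # stand-in for an extracted beat
features = haar_approximation(flat_beat)
```

For a constant input of ones, every coefficient equals 2·sqrt(2), because each level scales the local sum by sqrt(2).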

C. SLT Coefficients Extraction
The slantlet filter bank used to extract the features of the ECG signal is a 3-scale (L = 3) filter bank. The structure of this filter bank is shown in Fig. 9, where 6 different filters are used. These filters are constructed using the derivations explained previously. The output of the low-pass filter h3(n) is the approximation of the signal and the other outputs are the details, so it is efficient to take only the coefficients of the low-pass filter as the features of the signal and to discard the remaining coefficients without losing much information about the signal. Since the slantlet transform gives better time localization, the classification results are expected to be better with this transformation method. The feature extraction using the SLT is shown in Fig. 10.
a1: 32 samples, the outputs of the low-pass filter h3 after down-sampling by 8. Only this vector is used as the features of the ECG beat.
d1: 32 samples, the outputs of the band-pass filter f3 after down-sampling by 8.
d2: 32 samples, the outputs of the band-pass filter g2 after down-sampling by 8.
d3: 32 samples, the outputs of the band-pass filter (shifted time-reverse of g2) after down-sampling by 8.
d4: 64 samples, the outputs of the band-pass filter g1 after down-sampling by 4.
d5: 64 samples, the outputs of the band-pass filter (shifted time-reverse of g1) after down-sampling by 4.
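The sample counts listed above follow directly from the down-sampling factors of the L-scale filter bank (2^L for h_L and f_L, 2^(i+1) for g_i and its shifted time-reverse). The short check below reproduces them; it does not compute the slantlet filter coefficients themselves, and the function and key names are hypothetical.

```python
def slantlet_channel_lengths(n_samples, scales):
    """Output length of each channel of an L-scale slantlet filter bank."""
    lengths = {}
    lowest = 2 ** scales                       # down-sampling for h_L and f_L
    lengths[f"h{scales}"] = n_samples // lowest
    lengths[f"f{scales}"] = n_samples // lowest
    for i in range(1, scales):
        factor = 2 ** (i + 1)                  # down-sampling for g_i channels
        lengths[f"g{i}"] = n_samples // factor
        lengths[f"g{i}_reversed"] = n_samples // factor
    return lengths

channels = slantlet_channel_lengths(256, scales=3)
```

For a 256-sample beat and L = 3 this gives six channels whose lengths sum to 256, so the transform keeps the same number of coefficients as input samples.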

D. DCT Coefficients Extraction
The discrete cosine transform (DCT) has the important property that a small number of its coefficients closely approximates the original signal. The DCT is an orthogonal transform, and most of the energy of the transformed signal is concentrated in the low-frequency components. To extract the features of the ECG signal, only the first 32 (lowest-frequency) coefficients are used.
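Keeping the 32 lowest-frequency DCT coefficients can be sketched as follows. The sketch assumes the orthonormal DCT-II; the exact DCT variant and normalization used in this work are not stated, and the function names are hypothetical. The direct O(N^2) sum is used for clarity rather than speed.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of x, computed directly from its definition."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        coeffs.append(scale * acc)
    return coeffs

def dct_features(beat, n_features=32):
    """First n_features (lowest-frequency) DCT coefficients of a beat."""
    return dct_ii(beat)[:n_features]

dct_feats = dct_features([1.0] * 256)   # constant stand-in beat
```

For a constant beat all of the energy falls in the first coefficient, which illustrates the energy compaction that motivates discarding the higher coefficients.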

E. FFT Coefficients Extraction
The amplitude spectrum obtained with the FFT is symmetrical, so it is enough to consider only one side of the resulting coefficients when extracting the features of the signal. The phase coefficients, after dimensionality reduction, do not give good results when the number of classes is large, so the phase is not used. It is important to note that the features extracted using the FFT contain only the amplitude information. For dimensionality reduction, the first 32 samples are used as the features of the input signal.
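The amplitude-only features can be sketched as below. A direct DFT sum stands in for the FFT (both give the same coefficients), and the function names are hypothetical.

```python
import cmath
import math

def dft_bin(x, k):
    """k-th DFT coefficient X[k] = sum_i x[i] * exp(-2j*pi*k*i/N)."""
    n = len(x)
    return sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))

def fft_features(beat, n_features=32):
    """Magnitudes of the first n_features DFT coefficients; phase is dropped."""
    return [abs(dft_bin(beat, k)) for k in range(n_features)]

# Stand-in beat: a cosine with exactly 5 cycles in 256 samples.
tone = [math.cos(2.0 * math.pi * 5 * i / 256) for i in range(256)]
fft_feats = fft_features(tone)
```

For a real-valued beat the magnitudes satisfy |X[k]| = |X[N-k]|, which is the symmetry that allows one side of the spectrum to be discarded.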

Neural Network Classifier
Each ECG record contains a number of beats; this number differs from record to record according to the recording time, and the sampling rate of some records also differs from that of others. To extract one beat from a multi-beat record, re-sampling is performed if the sampling rate is not 360 Hz, and the windowing process is then applied. After one beat has been extracted, its features are found using the transformation methods and dimensionality reduction, so as to obtain an acceptable number of features for use in the neural network.
The neural network used as the classifier takes the features obtained from the feature extraction process for training and testing. The neural network structure used in the proposed work consists of a 32-neuron input layer, a 20-neuron hidden layer and a 1-neuron output layer, as shown in Fig. 11; its specifications are given in Table 1.
The activation function for the input layer and hidden layer neurons is the tan-sigmoid function, and the activation function for the output layer neuron is the linear function.
Once its parameters (the number of layers and the activation functions) had been fixed, the neural network was trained using the BPA. After training, the network was tested to evaluate its performance in classifying the input patterns. The maximum number of iterations for the neural network used in this work is 1000.
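A forward pass through a network of this shape can be sketched as follows. Reading the 32-neuron input layer as a tan-sigmoid layer fed by the 32 features is an interpretation of the description above, and the function names, the weight range and the random seed are hypothetical. Training and the mapping from the single output value to a class label are not shown, since they are not specified here.

```python
import math
import random

def make_network(n_inputs, layer_sizes, rng):
    """Random weights per layer; each neuron row ends with its bias weight."""
    network = []
    n_in = n_inputs
    for n_out in layer_sizes:
        layer = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                 for _ in range(n_out)]
        network.append(layer)
        n_in = n_out
    return network

def forward(network, features):
    """tanh on every layer except the last, which is linear."""
    activations = [float(v) for v in features]
    for index, layer in enumerate(network):
        # zip stops at the shorter list, so row[-1] (bias) is added separately.
        fields = [row[-1] + sum(w * a for w, a in zip(row, activations))
                  for row in layer]
        is_output = index == len(network) - 1
        activations = fields if is_output else [math.tanh(v) for v in fields]
    return activations

rng = random.Random(0)                      # fixed seed for repeatability
net = make_network(32, [32, 20, 1], rng)
output = forward(net, [0.0] * 32)
```

The same `forward` function serves both training and testing; only the weights change between the two phases.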

Data Manipulation
The system has been applied to ten classes of ECG beats: one is the normal beat class and the others are heart arrhythmias. These ten classes and their corresponding numbers of beats are shown in Table 2. The beats of each class have been divided into two parts:
a- One part of the beats is used for training the neural network (the training beats). The training of the ANN proceeds by computing the local errors from the output of the network back towards the input. First, a data sample is sent through the network to produce an output; the error at the output layer is then calculated and propagated backward to find the overall error, and all weights are updated until the required performance is obtained.
b- The other part is used for testing the network (the testing beats). On completion of training, the testing beats are applied to the feed-forward network and the resulting error is used as a measure of the generalization ability of the network.
The complete classification results are shown in Table 3, where the number of classification errors differs according to the transformation method used to extract the features of the signal.

Performance Evaluation of Neural Network
The neural network used for classification in this work has the same structure for all of the transformation methods applied to find the features of the signal, so the complexity is the same in every case. When the results obtained after testing each network are compared, the differences are therefore due to the transform used to find the features.
To compare the performance of the network combined with each transformation method, the accuracy of each combination has been computed.
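The accuracy formula is not written out in this section. The sketch below assumes the usual definition, the percentage of test beats assigned to the correct class; the function name and the example labels are hypothetical.

```python
def accuracy_percent(predicted, actual):
    """Percentage of beats whose predicted class equals the true class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)

# Illustrative labels only, not results from this work.
example_accuracy = accuracy_percent([0, 1, 2, 2], [0, 1, 2, 3])
```

Applying the same function to the test beats of each system gives figures that can be compared directly, because all systems share the same network structure.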

CONCLUSIONS
In this work different transformation methods are used to extract the features of the ECG beat. The pattern recognition system used to classify the ECG signals takes these features as the input to the neural network classifier. The slantlet transform is one of the transformation methods used to extract the features of the ECG beat, and it makes the recognition of the ECG beats more accurate.
It can be concluded from this work that transforming the ECG signal from the time domain to the time-frequency domain gives better results than transforming it to the frequency domain. This is clear from the results of the WT and SLT. The SLT is an orthogonal transform and provides better time localization than the WT, and therefore improves the classification results. To improve the accuracy of the heart beat recognition system, five different transformation methods were applied to find the features of the signal, and a suitable neural network was used for the classification. Comparing the accuracy of the SLT system with that of the four other systems used in this work leads to the conclusion that the SLT system gives improved accuracy for heart beat recognition.

Fig. 5: Basic Model of a Single Neuron

Fig. 8 (a) shows record 103 from the MIT-BIH Arrhythmia Database, which contains normal beats; the record contains the modified lead II (MLII) and lead V2. The lead II signal is used to extract a single beat. Fig. 8 (b) shows the extracted beat (normal beat).

Fig. 10: (a) ECG beat (Normal) (b) SLT of the ECG beat (c) 32 samples of the SLT

Table 1: Neural Network specifications

Table 2: The ECG classes and their corresponding number of beats

Table 3: Results of classification for all systems

Table 4: Accuracy of each classification system
Table 4 is constructed to compare the accuracy of the different classification systems used in this work.