Skull Stripping Based on the Segmentation Models

Skull stripping is one of the initial procedures used to detect brain abnormalities. In an MRI image of the brain, this process involves distinguishing brain tissue from non-brain tissue. Even for experienced radiologists, separating the brain from the skull is a difficult task, and the accuracy of the results can vary considerably from one individual to the next. Therefore, skull stripping in brain magnetic resonance volumes has become increasingly popular due to the requirement for a dependable, accurate, and thorough method for processing brain datasets. Furthermore, skull stripping must be performed accurately for neuroimaging diagnostic systems, since neither residual non-brain tissue nor the erroneous removal of brain sections can be corrected in subsequent steps, resulting in an unrecoverable error in further analysis. This paper proposes a system based on deep learning and image processing: an innovative method for converting one pre-trained model into another type of pre-trained model, using pre-processing operations and the CLAHE filter as a critical phase. The global IBSR dataset was used for training and testing. To assess the system's efficacy, work was performed on three-dimensional MR volumes, on three sections of MR images, and on two-dimensional images, and the results were 99.9% accurate.


INTRODUCTION
The brain, a crucial human body part, controls numerous intricate biological processes (Chen et al., 2018) (Nayyef and Al-Tamimi, 2022). It is also susceptible to several conditions that can be diagnosed using magnetic resonance imaging (MRI), a common imaging technique (Fatima et al., 2020). Notwithstanding, the complex structure of the brain and the presence of non-brain tissues, such as the cranium, scalp, capillaries, meninges, etc., can pose a formidable obstacle to the success of MRI examinations and disease diagnosis (Al-Tamimi et al., 2018). Skull stripping, also known as brain extraction or brain segmentation, is an essential pre-processing phase for neuroimaging analysis that must be performed to overcome this challenge (Hazra and Byun, 2020). It entails removing non-brain tissues, such as the cranium, scalp, and meninges, from magnetic resonance (MR) images of the brain while preserving the brain's anatomical structure (Al-Majeed and Al-Tamimi, 2020). Magnetic resonance imaging (MRI) is an indispensable noninvasive radiological technique that provides diagnostic images of the human body's interior rich in detail and contrast (Mahdi and Mahmood, 2014). Unlike other imaging techniques, such as computed tomography (CT), MRI does not utilize ionizing radiation to produce images. Instead, magnetic resonance is used to acquire internal information from the human body without causing damage (Hwang et al., 2019). Consequently, MRI is a safe and efficient diagnostic instrument for a wide range of medical conditions (Gao et al., 2019). Typically, MR imaging instruments acquire images from three
There are many studies on the topic of separating the skull. For instance, (Wang et al., 2019) focus on eliminating non-brain tissue from brain scans obtained via magnetic resonance imaging (MRI), working with two publicly available datasets (NFBS and IBSR) and one clinical dataset, by implementing a rapid framework for extracting and visualizing the brain, which is founded on current
OpenGL pipelines. This framework is adept at utilizing the parallel computing capabilities of a GPU to the fullest extent. The approach uses a 3D surface evolution algorithm based on the BET algorithm and achieved mean Dice coefficients of 96.8%, 97.1%, and 98.5% on the three datasets. (Moazami et al., 2021) developed a method for quantifying lesions in MRIs of multiple sclerosis patients, employing the NFBS dataset for a novel probabilistic deep learning algorithm for brain extraction, which frames the task as a Bayesian inference problem and uses a cGAN to solve it, where the generator is modeled as a U-Net. Uncertainty is introduced through latent variables at multiple scales and features via conditional instance normalization. The proposed method achieved accurate segmentation results with reproducibility comparable to advanced software tools and manual segmentation (manual skull stripping of the brain and lesions was performed for NFBS data), as shown for the different methods: method 1: Dice = 77.48, Sensitivity = 73.55; method 2: Dice = 91.81, Sensitivity = 96.60; method 3: Dice = 97.90, Sensitivity = 98.11; method 4: Dice = 98.06, Sensitivity = 98.34. Moreover, (Li et al., 2022) introduced a source-free adaptation method that was successfully applied to skull stripping on multi-site age datasets without requiring any target domain annotations or simultaneous access to image inputs for both domains. This method includes a segmentation grid, a shape dictionary, and an automatic shape code. By utilizing a shape dictionary, the Shape Auto-Encoder leverages prior anatomical knowledge to improve segmentation results on the target domain and address unreliable results. The model was evaluated on NFBS, ADNI, and dHCP datasets and achieved the following results: ) use an approach that takes advantage of orthogonal moment preprocessing presented here to improve CNN results for whole brain MRI segmentation. This method involves using kernel windows to convert the
image into a modified version that features an optimal representation of coefficients. Evaluations performed on the NFBS, OASIS, and TCIA datasets revealed improvements of 4.12%, 1.91%, and 1.05%, respectively. When trained on TCIA, transfer learning using a variety of orthogonal moments achieved further improvements of 9.86% and 7.76% on NFBS and OASIS. Across CNN designs such as U-Net, U-Net++, and U-Net3+, a minor gain in overall accuracy was observed, with U-Net3+ showing a 0.64 percentage point increase for original images and a 0.33 percentage point increase for changed images.
Although manual skull stripping is still the most precise and reliable procedure, it is time-consuming and requires radiologists' expertise (Rehman et al., 2020). Therefore, an automated system is developed by utilizing deep learning techniques such as the convolutional neural network (CNN) and U-Net to perform skull stripping on MR images with low contrast (Ghadi and Salman, 2022). The primary objective is to aid doctors in accurately diagnosing brain diseases by eliminating non-brain tissues and preserving the brain (Tariq et al., 2017). This system aims to enhance the ease, speed, and accuracy of diagnosis and treatment. It offers a potential solution to the challenges posed by manual skull stripping by using a segmentation model built on pre-trained CNN models, which exist in different versions and are used in various fields (Joodi et al., 2023).

THE PROPOSED SYSTEM
Many computer vision applications, such as the identification of handwritten documents, may use segmentation techniques (Oudah et al., 2018). Here, segmentation was utilized in the medical field. The best imaging technology for investigating brain tissue in depth is magnetic resonance imaging (MRI) because it can capture interior and exterior features with excellent spatial resolution (Qi et al., 2020). This allows for the detection of subtle alterations to these structures. MRI has many uses in the medical sciences, and it is particularly beneficial for studying the brain. As a result, a method was proposed to strip the skull using deep learning, image processing techniques, and the ability to switch between a three-dimensional and a one-dimensional workspace. A schematic depicting the layout of the proposed system is provided in Fig. 1.

The Topological Structure of the U-Net
The model uses U-Net, a CNN architecture for image segmentation, built on an EfficientNet-B0 backbone provided by the segmentation models library. It has a "U"-shaped structure with an encoder path and a decoder path.

Encoder Path:
Captures low-level features with initial convolutions. Downsamples the image using max pooling. Establishes skip connections between encoder and decoder.

Decoder Path:
Upsamples the features. Refines features and integrates skip connections. Maps features to the segmentation output.
U-Net effectively combines low-level and high-level features for accurate segmentation, making it popular in biomedical imaging (Ronneberger et al., 2015).
Figure 1. The Proposed System.

EfficientNet-B0
EfficientNet-B0 is a convolutional neural network (CNN) architecture designed to achieve high accuracy while being computationally efficient. It was proposed by researchers at Google in 2019 (Tan et al., 2019). The architecture of EfficientNet-B0 consists of multiple blocks of convolutional layers and is the baseline network of the "compound scaling" approach, which simultaneously scales the width, depth, and resolution of the network to achieve a good balance between accuracy and efficiency. The base building block of EfficientNet-B0 is the mobile inverted bottleneck convolution (MBConv). This block combines depth-wise convolutions and pointwise convolutions to reduce the number of parameters and the computational cost. It also introduces a shortcut connection to improve gradient flow and facilitate training. EfficientNet-B0 uses the "swish" activation instead of the traditional ReLU activation function. Swish is a smooth, non-monotonic activation function that has been shown to improve performance in deep neural networks. EfficientNet-B0 also combines various regularization techniques, such as dropout and stochastic depth, to improve performance further. These techniques help prevent overfitting and improve generalization. Overall, EfficientNet-B0 achieves high accuracy on image classification tasks while maintaining a relatively small model size and computational cost, making it well-suited for resource-constrained environments.
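To make the "smooth, non-monotonic" description of swish concrete, the following minimal sketch evaluates swish, defined as x multiplied by the logistic sigmoid of x, next to ReLU. The function names are illustrative and are not taken from any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def swish(x: float) -> float:
    """Swish activation: x * sigmoid(x)."""
    return x * sigmoid(x)


def relu(x: float) -> float:
    """ReLU activation, shown only for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  swish={swish(x):+.4f}  relu={relu(x):+.4f}")
```

Unlike ReLU, swish takes small negative values for negative inputs and dips before returning toward zero, which is what makes it non-monotonic.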

Dataset
The

CLAHE and Resizing Steps
Contrast-limited adaptive histogram equalization (CLAHE) is a modification of adaptive histogram equalization (AHE) that limits contrast amplification in order to reduce noise amplification (Unal et al., 2016) (Musa et al., 2018). In short, CLAHE applies histogram equalization to small patches, or tiles, of the image with high accuracy and a contrast limit. It was adopted in the proposed

Training and Evaluation
To convert EfficientNet-B0 into a U-Net, the input and output layers must fit the U-Net structure. U-Net is a convolutional neural network architecture for semantic segmentation in computer vision. It consists of an encoding path that captures context and extracts features from the input image, a bridge that connects the encoding and decoding paths, and a decoding path that performs up-sampling and combines features to produce a high-resolution segmentation map. The U-shaped design effectively combines multi-scale features through skip connections, allowing it to capture fine-grained details while maintaining global context. It has become a widely used model in applications like medical imaging. In addition, the segmentation models library is used to modify network parameters, add layers, and introduce skip connections for reusing extracted features, which helps preserve information during upsampling. Therefore, it is necessary to define the input and output layers as given in Table 1:
a- Define the input layer as: (shape = (None, None, 1))
b- Add a convolution layer with the parameters: Conv2D (3, (1, 1)) (input)
c- The output is the base model (EfficientNet-B0) combined with the above steps (a, b).
The F1-score training metric provides insights into the model's ability to balance precision and recall during training. Loss refers to the difference between the predicted and the actual output for a given input. For a single training example, the formula used is (Powers, 2020):
$$L(y, p) = -\left[\, y \log p + (1 - y) \log (1 - p) \,\right]$$
where:
y is the true binary label (0 or 1).
p is the predicted probability of the positive class (between 0 and 1).
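The definitions above (a binary label and a predicted probability of the positive class) match the binary cross-entropy loss. Assuming that is the loss meant, a minimal per-example sketch is given below. The clipping constant `eps` is an added safeguard against taking the logarithm of zero and is not part of the original description.

```python
import math


def binary_cross_entropy(y_true: int, p_pred: float, eps: float = 1e-7) -> float:
    """Per-example binary cross-entropy.

    y_true: true binary label (0 or 1).
    p_pred: predicted probability of the positive class, in [0, 1].
    eps:    clipping constant (assumption) that keeps log() finite.
    """
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def mean_binary_cross_entropy(labels, probs) -> float:
    """Average the per-example loss over a batch of pixels or samples."""
    losses = [binary_cross_entropy(y, p) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    print(binary_cross_entropy(1, 0.9))   # confident and correct: small loss
    print(binary_cross_entropy(1, 0.1))   # confident and wrong: large loss
    print(mean_binary_cross_entropy([1, 0, 1], [0.8, 0.2, 0.6]))
```

The loss grows without bound as the predicted probability moves away from the true label, which is why confident mistakes are penalized most heavily.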
The single evaluation metric used was the Dice score, computed as the mean Dice coefficient. The Dice coefficient compares the pixel-by-pixel agreement between a predicted segmentation and the ground truth. It is twice the area of overlap divided by the total number of pixels in both images, which makes it a good score for determining whether the proposed skull stripping system is effective. The formula is given by:
$$\mathrm{Dice}(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}$$
where:
X is the predicted set of pixels.
Y is the ground truth.
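As an illustration of the Dice coefficient, the sketch below computes it for two small binary masks stored as nested lists. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def dice_coefficient(pred_mask, true_mask) -> float:
    """Dice coefficient between two binary masks of equal shape.

    Each mask is a list of rows containing 0 (background) or 1 (brain).
    """
    pred = [v for row in pred_mask for v in row]
    true = [v for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")

    overlap = sum(1 for p, t in zip(pred, true) if p == 1 and t == 1)
    total = sum(pred) + sum(true)
    if total == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    predicted = [[1, 1],
                 [0, 0]]
    ground_truth = [[1, 0],
                    [0, 0]]
    print(dice_coefficient(predicted, ground_truth))
```

In this toy case one pixel overlaps, the prediction has two foreground pixels and the ground truth has one, so the score is 2 * 1 / (2 + 1).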
The results of the proposed system on the training and testing data are given in Table 2 and presented in Fig. 7. The blue color indicates the training data and its results, and the orange color indicates the test data and its results.

RESULTS WITH IMPLEMENTATION STEPS
The system is implemented in the Python programming language (Anaconda distribution), a widely used high-level, interpreted, general-purpose, dynamic programming language. The software runs under Windows 10 Pro on a laptop with an Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz, a 64-bit operating system, 16.00 GB of RAM, and an NVIDIA GeForce RTX 1060 4GB Graphics Processing Unit (GPU).
The general algorithm of the proposed system was as follows:
1- To load the IBSR dataset: first, obtain the dataset from the official IBSR website. Extract the files with folders for original MRI scans and corresponding manual segmentations. Use Python and libraries like NumPy and nibabel to load the original MRI scans.
2- Pre-processing techniques in image processing and computer vision improve digital image quality and usability. CLAHE filtering enhances visual details by equalizing histograms of smaller image tiles. Resizing changes image dimensions for standardization and faster processing, benefiting machine learning methods.
3- EfficientNet-B0 is transformed into a U-Net architecture for semantic segmentation in the training stage. U-Net consists of encoding, bridge, and decoding paths. The encoding path captures features, the bridge connects the paths, and the decoding path produces a high-resolution segmentation map. To convert EfficientNet-B0 into a U-Net, the input and output layers are adjusted, and a convolutional layer is added to preserve learned features. The resulting model combines global context with fine-grained details through skip connections, enabling effective semantic segmentation.
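Step 3 can be sketched with the segmentation models library and Keras. This is a minimal sketch under stated assumptions, not the authors' exact code: it assumes the `segmentation_models` package with the `tf.keras` backend, ImageNet encoder weights, and a single sigmoid output channel. The function name `build_skull_strip_model` is hypothetical. The third-party imports are placed inside the function so the module can be imported even where TensorFlow is not installed.

```python
import os


def build_skull_strip_model(encoder_weights="imagenet"):
    """Wrap an EfficientNet-B0 U-Net so it accepts single-channel MR slices.

    Assumptions: segmentation_models and TensorFlow are installed, and input
    height and width are multiples of 32, as the encoder downsamples 5 times.
    """
    # Assumption: segmentation_models is told to use tf.keras.
    os.environ.setdefault("SM_FRAMEWORK", "tf.keras")
    import segmentation_models as sm
    from tensorflow.keras.layers import Conv2D, Input
    from tensorflow.keras.models import Model

    # U-Net decoder on top of an EfficientNet-B0 encoder (expects 3 channels).
    base_model = sm.Unet(
        "efficientnetb0",
        encoder_weights=encoder_weights,
        classes=1,
        activation="sigmoid",
    )

    inputs = Input(shape=(None, None, 1))   # grayscale input layer
    x = Conv2D(3, (1, 1))(inputs)           # 1x1 convolution: 1 -> 3 channels
    outputs = base_model(x)                 # base model applied to the result
    return Model(inputs, outputs, name="efficientnetb0_unet")


if __name__ == "__main__":
    model = build_skull_strip_model(encoder_weights=None)
    model.summary()
```

The 1x1 convolution maps the single grayscale channel to the three channels the pre-trained encoder expects, so the pre-trained weights can be reused without modifying the encoder itself.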
The steps of the proposed system, with their results, are discussed in the following points. The IBSR (Internet Brain Segmentation Repository) dataset is a valuable resource for medical image analysis. Loading the IBSR dataset, including the original brain MRI scans and their corresponding masks, is crucial for researchers and practitioners. This section provides a brief guide to loading the IBSR dataset quickly and efficiently.
Step 1: Obtain the IBSR Dataset: Download the IBSR dataset from the official IBSR website by selecting the appropriate version.
Step 2: Extract the Dataset Files: Extract the downloaded files to a local directory containing folders like "images" for the original MRI scans and "masks" for the corresponding manual segmentations.
Step 3: Load the Original MRI Scans: Use programming languages like Python and libraries like NumPy, SimpleITK, or nibabel to load the original MRI scans.
Step 4: Load the Mask Data: Load the mask data with the same libraries; the procedure is similar to loading the original images.
Loading the IBSR dataset is essential for brain image analysis. Following the steps mentioned, you can efficiently load the original MRI scans and their corresponding masks.
The images for the original MRI scans and masks are shown in Fig. 8 below. Remember to comply with the dataset's licensing terms and usage guidelines. The loaded dataset can be used for segmentation algorithms, deep learning models, and statistical analyses, among other applications.
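The loading steps above can be sketched as follows. This is a minimal sketch that assumes the extracted dataset has an "images" folder and a "masks" folder, and that a scan and its mask share the same file name. The file-name convention, the `*.nii*` pattern, and the function names are assumptions for illustration. The nibabel import sits inside `load_volume` so the pairing logic can be used without nibabel installed.

```python
from pathlib import Path


def pair_scans_with_masks(root, pattern="*.nii*"):
    """Return (scan_path, mask_path) pairs found under root/images and root/masks.

    Assumption: a scan and its mask have identical file names.
    Scans without a matching mask are skipped.
    """
    root = Path(root)
    masks = {p.name: p for p in (root / "masks").glob(pattern)}
    pairs = []
    for scan in sorted((root / "images").glob(pattern)):
        mask = masks.get(scan.name)
        if mask is not None:
            pairs.append((scan, mask))
    return pairs


def load_volume(path):
    """Load a NIfTI volume as a floating-point NumPy array using nibabel."""
    import nibabel as nib  # third-party dependency, imported lazily
    return nib.load(str(path)).get_fdata()


if __name__ == "__main__":
    # "IBSR" is a placeholder for the local extraction directory.
    for scan_path, mask_path in pair_scans_with_masks("IBSR"):
        scan = load_volume(scan_path)
        mask = load_volume(mask_path)
        print(scan_path.name, scan.shape, mask.shape)
```

Pairing by file name before loading makes it easy to detect a scan whose mask is missing, rather than discovering the mismatch during training.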

Pre-processing (CLAHE filter +Resizing)
In image processing and computer vision, pre-processing techniques play a crucial role in optimizing the quality and usability of digital images. The CLAHE filter is particularly useful when images suffer from poor lighting conditions, uneven illumination, or limited dynamic range. By redistributing the intensity values across the image, CLAHE improves the overall contrast, making it easier to discern objects and features previously obscured or difficult to perceive.
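The core idea of CLAHE, equalizing a tile's histogram while capping how much any intensity bin may contribute, can be sketched for a single tile in plain Python. This is a simplified illustration, not the full algorithm: real CLAHE implementations also interpolate between neighboring tiles to avoid visible seams, and the function name and clip values here are assumptions.

```python
def clipped_equalize_tile(tile, clip_limit, levels=256):
    """Histogram-equalize one tile with a clipped histogram.

    tile:       list of rows of integer intensities in [0, levels).
    clip_limit: maximum count kept in any histogram bin.
    """
    pixels = [v for row in tile for v in row]

    # Histogram of the tile.
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Clip each bin and collect the excess counts.
    excess = 0
    for i in range(levels):
        if hist[i] > clip_limit:
            excess += hist[i] - clip_limit
            hist[i] = clip_limit

    # Redistribute the excess uniformly over all bins.
    bonus = excess / levels
    hist = [h + bonus for h in hist]

    # The cumulative distribution gives the intensity mapping.
    total = float(len(pixels))
    cdf, running = [], 0.0
    for h in hist:
        running += h
        cdf.append(running / total)

    return [[round(cdf[v] * (levels - 1)) for v in row] for row in tile]


if __name__ == "__main__":
    tile = [[10, 10, 10, 12],
            [10, 10, 10, 12],
            [10, 10, 11, 12],
            [10, 10, 11, 200]]
    print(clipped_equalize_tile(tile, clip_limit=16))  # no clipping applied
    print(clipped_equalize_tile(tile, clip_limit=2))   # contrast-limited
```

With a low clip limit, the dominant intensity in the tile is stretched far less aggressively, which is how the method limits noise amplification in nearly uniform regions.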
Resizing: Resizing, as the name suggests, involves altering the dimensions of an image. It can be performed to reduce or increase an image's size. In pre-processing, resizing is often employed to standardize the image dimensions or reduce the computational complexity for subsequent analysis tasks.
Resizing an image can have multiple benefits. Firstly, it helps eliminate any irregularities in image sizes, facilitating uniformity across a dataset. This standardization is crucial when images are fed into machine learning algorithms, as they often expect consistent input dimensions.
Secondly, resizing allows for computational efficiency. Large images may be computationally expensive, especially when dealing with resource-intensive tasks such as object detection or image classification. Resizing reduces the image size while maintaining the essential details, thus enabling faster processing without significant loss of information. In

CONCLUSIONS
Numerous studies have been conducted on the brain, primarily due to its significance in predicting and analyzing specific diseases or conditions. Brain images of patients may be stored for medical analysis and prognostic information. Digital MR image processing algorithms have been implemented in the medical field, focusing on identifying the brain. This work proposes a framework for brain extraction based on three primary steps: 1) image processing, 2) deep learning-based training, and 3) cranium separation. This study utilized the global IBSR dataset with an 80%-20% training/testing split. The results were promising because a model with a sigmoid output was used, based on a pre-trained model (EfficientNet-B0) and a library (segmentation models) that enables training in a U-Net manner. Furthermore, given the number of layers in the model (EfficientNet-B0), future work could evaluate additional datasets with the suffix (nii).
perspectives: axial, coronal, and sagittal (Ulah et al., 2020). These images are acquired using T1-weighted, T2-weighted, PD (proton density-weighted), and FLAIR (fluid-attenuated inversion recovery) sequences (Dey and Hong, 2018). Conventional or deep learning techniques can be applied to extract the cranium. Conventional methods of cranium removal (Kalavathi and Prasath, 2016) include mathematical morphology-based techniques, template-based techniques, deformable surface-based techniques, intensity-based techniques, and hybrid techniques. Each method has advantages and disadvantages, depending on the application's specific requirements (Al-Tamimi and Sulong, 2014) (Azam et al., 2023). Deep learning methods typically fall into two categories: methods for 2D and for 3D removal of the cranium (Al-Tamimi et al., 2018).
ASD = 4.63±0.98, DICE = 91.57±1.38, SPE = 99.55±0.15, and SEN = 88.09±2.67. (De Oliveira et al., 2022) utilized two convolutional neural networks (CNNs) for automatic brain and lesion segmentation, implementing a method for quantifying lesions in MRI scans of patients diagnosed with multiple sclerosis. Four different datasets were used: the first (NFBS) was used for training the brain segmentation model, two further datasets (ISBI 2015 and MICCAI 2016) were used for training the lesion segmentation model, and the fourth and final dataset (HC-FMB) was used for testing. The first proposed framework, which concerned brain extraction, obtained a Dice coefficient of 0.9786, an accuracy of 0.9969, a precision of 0.9851, a sensitivity of 0.9851, and a specificity of 0.9985. The second proposed framework, used for brain lesion segmentation, obtained a Dice coefficient of 0.8893, an accuracy of 0.9996, a precision of 0.9376, a sensitivity of 0.8609, and a specificity of 0.9999. Finally, (Da Silva et al., 2022
Internet Brain Segmentation Repository (IBSR) (Atkins et al., 2002): The IBSR v2.0 dataset is an MRI brain dataset accessible to the public. The dataset consists of high-quality T1-weighted images of gray matter, white matter, and cerebrospinal fluid from 18 subjects. This dataset was developed in collaboration with Brigham and Women's Hospital by the Center for Morphometric Analysis at Massachusetts General Hospital. The initial release of the dataset occurred in 1999. Since then, it has been utilized extensively in advancing brain imaging and analysis techniques, including image registration, segmentation, and shape analysis. IBSR v2.0 dataset images have a voxel size of 1×1 mm and a resolution of 256×256×124 voxels. Expert neuroanatomists conducted the manual segmentations, which exhibited a high degree of inter-rater consistency. Fig. 2 is an illustration of the IBSR dataset, Fig. 3 shows the use of the dataset by the researcher, and Fig. 4 shows how the dataset is divided into training and testing.
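A reproducible split of the dataset can be sketched as below. The source does not state whether the split is made per subject or per slice, so this sketch assumes a per-subject split at an 80%/20% ratio. The subject identifiers, the seed, and the function name are hypothetical.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle items deterministically and split them into two lists.

    Assumption: the split is made at the level of whole subjects, so that
    slices from one subject never appear on both sides of the split.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical identifiers for the 18 IBSR subjects.
    subjects = [f"IBSR_{i:02d}" for i in range(1, 19)]
    train, test = split_train_test(subjects)
    print(len(train), "training subjects:", train)
    print(len(test), "testing subjects:", test)
```

Fixing the seed keeps the split identical between runs, which makes reported scores comparable across experiments.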

Figure 3. The downloading of IBSR by another researcher.

Figure 4. Dataset splitting into training and testing.
After a suitable working environment was established, the dataset was first divided into training and validation sets using a ratio of 80% to 20%. The Adam optimizer was then employed for network training to minimize the cost (loss) function. The model was trained on the dataset images with a batch size of 128 for 60 epochs, using early stopping, which prevents overfitting during training, improves the model's learning, and saves computational resources and time by stopping training once the model has reached a plateau in performance. The patience parameter was set to 10, which means the model's performance on the validation set is monitored during training, and if there is no improvement for ten consecutive epochs, training is stopped early. The weights obtained in each epoch are checked, and only the best are saved. Finally, the model attained a remarkable F1 score, as shown in Fig. 6.
An epoch is one full pass of the neural network over the entire training dataset. The F1-score is a numerical value representing the performance or quality of a model's predictions. To compute the F1-score as a training metric, keep track of the number of true positives (TP), false positives (FP), and false negatives (FN) for each batch of training examples. Then, at the end of each epoch or after a specific number of batches, the F1-score can be calculated using the following equation (Goodfellow et al., 2016):
$$F_1 = \frac{2 \cdot \dfrac{TP}{TP + FP} \cdot \dfrac{TP}{TP + FN}}{\dfrac{TP}{TP + FP} + \dfrac{TP}{TP + FN}}$$
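The F1-score computation from accumulated counts can be sketched as follows. The function name and the convention of returning 0.0 when precision or recall is undefined are assumptions made for this example.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1-score from true positives, false positives, and false negatives."""
    if tp == 0:
        # Assumption: with no true positives the score is reported as 0.0,
        # which also avoids division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Counts accumulated over batches during one epoch (illustrative values).
    batches = [(40, 5, 3), (38, 4, 6), (42, 1, 1)]
    tp = sum(b[0] for b in batches)
    fp = sum(b[1] for b in batches)
    fn = sum(b[2] for b in batches)
    print(f"epoch F1 = {f1_from_counts(tp, fp, fn):.4f}")
```

For binary masks, this expression simplifies to 2TP / (2TP + FP + FN), which is the same quantity as the Dice coefficient between the predicted and ground-truth pixel sets.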

Figure 6.The proposed model (a) model score and (b) model loss

Figure 7. Summary of the results of the proposed system, in percent.
to the entire image, CLAHE divides the image into smaller regions called tiles.Each tile's histogram is then equalized independently, limiting the contrast amplification to avoid noise over-amplification.This adaptive approach helps preserve local and global contrast, enhancing details and improving visual quality (Al-Juboori, 2017).

In Fig. 9 below, the original image and the image after CLAHE and resizing are shown.

Table 1. The summary of the training model.

Table 2. The summarized result.