Brain tumor detection empowered with ensemble deep learning approaches from MRI scan images

Dataset and preprocessing

This study uses a publicly accessible brain MRI dataset from Kaggle consisting of 2,870 images, including both normal brain scans and scans exhibiting tumors. To improve classification accuracy and model generalization, preprocessing techniques such as normalization and augmentation were applied. Deep learning-based preprocessing was performed in MATLAB and ensured uniform image size, shape, resolution, and quality. Data augmentation strengthens the robustness and generalization of deep learning models for brain tumor detection: applying various transformations to the original images creates a larger and more diverse training set. The most frequently used augmentation techniques are rotation, translation, scaling, flipping, and elastic deformation. Rotation and flipping simulate different anatomical orientations and head positions, while translation and scaling accommodate spatial shifts and resolution variations in MRI scans. Elastic deformations introduce structural distortions that mimic imaging artifacts or pathological changes in the brain45,47.

To improve noise robustness and model resilience, two key preprocessing techniques were applied: Gaussian smoothing and intensity adjustment. These steps help the model accommodate the variations in brain imaging that arise across institutions and imaging protocols. The 3D image dataset was enlarged to 10,000 samples using an extensive set of transformations: rotating each image from 0 to 25 degrees in 5-degree increments, flipping it horizontally and vertically, scaling it up and down, cropping it around a variety of center points, and adding several forms of Gaussian noise to simulate real-world MRI artifacts48,51.
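A few of the transformations listed above can be sketched in plain Python. This is only an illustration on a toy nested-list image, not the MATLAB pipeline used in the study: the function names, the noise level `sigma=0.05`, and the toy image are assumptions, and the rotation itself is left to an image library (only the angle schedule from the text is shown).

```python
import random

# Rotation angles from the text: 0 to 25 degrees in 5-degree increments.
# Applying them would require an image library; only the schedule is shown.
ROTATION_ANGLES = list(range(0, 30, 5))


def flip_horizontal(image):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the row order (top-bottom flip)."""
    return [row[:] for row in image[::-1]]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the [0, 1] intensity range."""
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
image = [[0.0, 0.5], [1.0, 0.25]]  # toy 2x2 "scan", intensities in [0, 1]
noisy = add_gaussian_noise(image, sigma=0.05, rng=rng)  # sigma is an assumed value
```

Each function returns a new image, so the original sample is left untouched and several augmented variants can be derived from it.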

Given the dataset’s relatively small size, even after augmentation, several measures were implemented to mitigate overfitting. A k-fold cross-validation strategy was used to ensure consistent model performance across different subsets of data. Additionally, dropout layers (with rates between 0.2 and 0.5) and L2 regularization were incorporated to prevent excessive model complexity. An early stopping mechanism was also applied to terminate training once validation performance plateaued, thereby avoiding overfitting.
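The two training safeguards described here, k-fold cross-validation and early stopping, can be illustrated with a minimal sketch. The number of folds and the patience value are not reported in the text, so `k` and `patience` below are hypothetical parameters, and shuffling before the split is omitted for brevity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k roughly equal contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; otherwise the last epoch is returned.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1


folds = list(k_fold_indices(n_samples=10, k=5))  # assumed k = 5
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.71, 0.72, 0.73], patience=2)
```

In this toy run the validation loss stops improving after the third epoch, so training halts two epochs later.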

Deep learning in brain imaging analysis

DNNs have become fundamental tools in brain imaging analysis, excelling in segmentation, classification, functional analysis, and biomarker identification. These models autonomously learn intricate patterns in brain MRI scans, enabling them to differentiate between healthy and diseased brains with high precision. Additionally, DNNs can reveal dynamic patterns of brain activity and identify quantitative biomarkers linked to neurological and psychiatric disorders6. By leveraging large labeled datasets, deep learning models advance our understanding of brain structure and function, contributing to improved diagnosis, treatment planning, and disease management. Their capability to extract detailed features from medical images makes them particularly effective for brain tumor detection. These networks facilitate precise feature extraction, which is critical for accurate medical diagnoses8.

Optimizing deep learning model performance requires careful hyperparameter tuning. Factors such as learning rate, batch size, and regularization techniques significantly influence the model’s ability to generalize to new data2. Proper tuning strategies ensure optimal predictive performance while maintaining stability5. For this study, InceptionV3 and Xception were selected as the primary deep learning architectures due to their proven effectiveness in image classification tasks. These convolutional neural networks (CNNs) employ advanced feature extraction mechanisms, allowing them to capture intricate image structures, making them particularly suitable for medical image analysis6. Their efficiency and relatively lightweight nature also facilitate deployment in computationally constrained environments7.

By combining InceptionV3 and Xception, this study harnesses the strengths of both models to improve tumor classification accuracy. Each network learns distinct representations of brain tumor characteristics, and combining them in an ensemble improves overall classification performance. This integration enables a more comprehensive analysis and achieves higher accuracy and robustness than either model alone. The architectures were selected for their efficiency, effectiveness, and complementary feature extraction capabilities, which together contribute to high-precision brain tumor detection7. Table 1 shows the actual and augmented brain tumor dataset.

Table 1 Actual and augmented dataset of brain tumor.

Training a deep neural network’s parameters requires a large volume of data. To enlarge the small dataset, data augmentation was applied to the training images by flipping, rotating, and modifying their brightness, which helps the model respond to new data. The original dataset contains 2,870 images in total. The DNN treats each slightly altered image as a new sample, which increases the size of the training set. After augmentation, each class contains 2,500 images, for a total of 10,000 images.

In the ensemble model for brain tumor detection, two main strategies control train/test leakage during data augmentation. First, the data are split into training and testing sets before any augmentation takes place. The two sets are then kept separate, and the pipeline applies transformations to the training set only7,44. Second, randomness is introduced into the augmentation process through random rotations, flips, and brightness changes, which diversifies the augmented samples and prevents the model from overfitting to specific transformations. Image data generators with leakage-prevention capabilities were considered. In addition, a validation set is used to assess model performance periodically during training, which helps to detect leakage. These measures ensure that no test data information leaks during augmentation, so that the results reflect how well the model generalizes to unseen brain scans5,6. Table 2 shows the training options and the parameters used.
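The split-before-augment rule can be sketched as follows. This is an illustrative outline rather than the study's pipeline: `split_then_augment`, `jitter_brightness`, the 20% test fraction, and the `(image id, brightness)` stand-in samples are all hypothetical.

```python
import random


def split_then_augment(samples, test_fraction, augment, copies, seed=0):
    """Split first, then augment only the training portion."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test_set = shuffled[:n_test]  # held out and never augmented
    train_set = shuffled[n_test:]
    augmented = [augment(s, rng) for s in train_set for _ in range(copies)]
    return train_set + augmented, test_set


def jitter_brightness(sample, rng):
    """Hypothetical augmentation: shift the brightness of a sample at random."""
    source_id, brightness = sample
    return (source_id, brightness + rng.uniform(-0.1, 0.1))


samples = [(i, 0.5) for i in range(10)]  # (image id, mean brightness) stand-ins
train, test = split_then_augment(
    samples, test_fraction=0.2, augment=jitter_brightness, copies=2
)
```

Because every augmented sample is derived from a training image, no image id can appear on both sides of the split, which is the property the leakage check relies on.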

Table 2 Different training options and its related parameters.

To get the best performance from the models, their hyperparameters were fine-tuned systematically. First, grid search and random search were used to find suitable ranges for four hyperparameters: learning rate, batch size, dropout rate, and number of epochs. Learning rates of \(10^{-1}\), \(10^{-2}\), \(10^{-3}\), and \(10^{-4}\) were tried to identify a rate at which convergence could be achieved without overshooting the minimum49,50. Batch sizes of 16, 32, 64, and 128 were tested to examine the trade-off between training speed and model stability. Dropout rates were varied from 0.2 to 0.5 to reduce overfitting while preserving generalization ability. The number of epochs was also treated as a hyperparameter and was adjusted dynamically by the early stopping rule to avoid overfitting and unnecessary computation. Adaptive optimizers, including Adam and RMSProp, were considered, and Adam was selected for its flexibility and convergence speed. For ensemble learning, the combination weights of the Inception-V3 and Xception models were optimized so that the two architectures complement each other as effectively as possible. Cross-validation was used to validate the hyperparameter choices and to reduce the chance of overfitting the tuned parameters. This fine-tuning procedure improved the performance of the proposed XL-TL model44,47.
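The grid-search part of this procedure can be sketched in a few lines. The learning rates and batch sizes are the values reported above; the 0.1 step for dropout and the `toy_validation_score` objective are assumptions made only so the example runs, standing in for an actual train-and-validate cycle.

```python
from itertools import product

# Candidate values reported in the text.
LEARNING_RATES = [1e-1, 1e-2, 1e-3, 1e-4]
BATCH_SIZES = [16, 32, 64, 128]
DROPOUT_RATES = [0.2, 0.3, 0.4, 0.5]  # assumed 0.1 steps within the stated range


def grid_search(evaluate):
    """Return (best_config, best_score) over the full grid."""
    best_config, best_score = None, float("-inf")
    for lr, batch_size, dropout in product(LEARNING_RATES, BATCH_SIZES, DROPOUT_RATES):
        config = {"learning_rate": lr, "batch_size": batch_size, "dropout": dropout}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_score(config):
    """Placeholder objective that peaks at lr=1e-4, batch_size=64, dropout=0.2."""
    return -(
        abs(config["learning_rate"] - 1e-4)
        + abs(config["batch_size"] - 64) / 64
        + abs(config["dropout"] - 0.2)
    )


best_config, best_score = grid_search(toy_validation_score)
```

In practice `evaluate` would train the model with the given configuration and return a cross-validated score; random search would sample from the same ranges instead of enumerating all 64 combinations.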

Deep learning algorithm

Deep learning techniques are currently used to identify and classify brain tumors. The effectiveness of the Xception and Inception-V3 deep learning algorithms is compared on the chosen dataset. The Inception and Xception architectures were selected for their established performance in processing highly complex image information, especially in the medical domain. InceptionV3 differs from related approaches in its use of inception modules, which allow the model to extract detailed multi-scale content from a scan45. This ability to distinguish between tumor classes and to detect even subtle abnormalities is fundamental. Xception’s depthwise separable convolutions offer computational efficiency without compromising quality, which makes the model useful for medical imaging tasks where data and computing resources are limited. With the combined capacity of Inception and Xception, researchers can look beyond technical specifications to the models’ predictive reliability and their application in neurological diagnosis and treatment47.

Inception-V3

Google published the Inception network, a pre-trained system model, under the name GoogLeNet in 2014. The original network had 22 layers and about 5 M parameters, and used 1 × 1, 3 × 3, and 5 × 5 filters together with max pooling to extract features at various scales.

Fig. 2 The architecture of Inception-V3.

Using 1 × 1 filters improves computational efficiency. In 2015, Google refined the Inception model into InceptionV3, optimizing the architecture by reducing the size of its convolutional filters. Specifically, instead of a single 5 × 5 convolutional filter, the model uses two stacked 3 × 3 filters, which reduces computational complexity without compromising performance. The InceptionV3 model consists of 48 layers. To mitigate overfitting, we adopted the InceptionV3 model for our experiment and fine-tuned it on the target dataset. Figure 2 illustrates the architecture of InceptionV352.
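The saving from this factorization can be checked with simple arithmetic, comparing a single 5 × 5 convolution with two stacked 3 × 3 convolutions that cover the same 5 × 5 receptive field. The sketch below counts weights only (bias terms ignored); the channel width of 64 is an assumption chosen for illustration.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Weight count of a standard convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


channels = 64  # assumed channel width, chosen only for illustration

single_5x5 = conv_weights(5, channels, channels)
stacked_3x3 = 2 * conv_weights(3, channels, channels)
saving = 1 - stacked_3x3 / single_5x5  # fraction of weights removed
```

With equal input and output widths the ratio is 18/25 regardless of the channel count, so the stacked version needs 28% fewer weights.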

Xception

Xception is a convolutional neural network with 71 layers. A version pre-trained on the ImageNet database, which includes over one million images, is available and operates on input images of 299 × 299 resolution. The architecture is built on depthwise separable convolution layers, with 36 convolutional layers dedicated to feature extraction. For a 3 × 3 convolution with 16 input channels and 32 output channels, a standard convolution requires 16 × 32 × 3 × 3 = 4,608 parameters, whereas a depthwise separable convolution requires only 16 × 3 × 3 + 16 × 32 × 1 × 1 = 656 parameters. This reduction in parameters improves the model’s efficiency and computational performance. Figure 3 illustrates the architectural design of Xception52.
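The parameter counts quoted above can be reproduced with two small helper functions. The function names are illustrative and bias terms are ignored, matching the figures in the text.

```python
def standard_conv_params(in_channels, out_channels, kernel_size):
    """Weights in a standard convolution (bias terms ignored)."""
    return in_channels * out_channels * kernel_size * kernel_size


def depthwise_separable_params(in_channels, out_channels, kernel_size):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = in_channels * kernel_size * kernel_size  # one k x k filter per input channel
    pointwise = in_channels * out_channels  # 1x1 convolution that mixes channels
    return depthwise + pointwise


# 16 input channels, 32 output channels, 3x3 kernel, as in the text.
standard = standard_conv_params(16, 32, 3)
separable = depthwise_separable_params(16, 32, 3)
```

The depthwise step filters each channel spatially and the pointwise step combines channels, which is why the two costs add rather than multiply.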

Fig. 3 The architecture of Xception.

Ensemble deep learning applied to brain tumor detection

This study presents a model that integrates the outputs of two independently trained neural networks, as illustrated in Fig. 3, to minimize false positives in detection. Pre-trained models, often derived from transfer learning networks, are utilized to reduce errors. By leveraging interconnected networks, the system can achieve optimal results with minimal inaccuracies. Once the data has been preprocessed and structured, a CNN architecture is developed from the learned models47,48.

Moreover, ensemble learning and hyperparameter optimization play a crucial role in enhancing the accuracy and robustness of machine learning models. Their importance is particularly evident in brain imaging analysis, where dataset limitations and subject variability pose significant challenges. Ensemble learning improves predictive accuracy and generalization by combining multiple models, effectively utilizing their diversity. In brain imaging, where data availability is limited and inter-subject variations are high, ensemble methods help mitigate overfitting and enhance model stability52.

In the context of brain image classification, particularly for disease detection, integrating classifiers trained on different subsets of data or using varied algorithms can lead to more consistent and reliable results49.

Weighted averaging ensemble technique

This study employs a weighted averaging ensemble method to combine the predictions of InceptionV3 and Xception for improved classification accuracy. The final probability for class \(c_k\) is computed as follows49:

$$p\left( {c_k} \right) = {\omega _1}{p_1}\left( {c_k} \right) + {\omega _2}{p_2}\left( {c_k} \right)$$

where:

  • \(p(c_k)\) represents the final predicted probability for class \(c_k\)

  • \(p_1(c_k)\) and \(p_2(c_k)\) denote the class probabilities predicted by InceptionV3 and Xception, respectively

  • \(\omega_1\) and \(\omega_2\) are the weights assigned to each model’s prediction, ensuring that \(\omega_1 + \omega_2 = 1\)

The ensemble framework leverages the strengths of both models by adjusting their respective weights based on performance metrics obtained from validation data. This method enhances classification reliability, particularly when dealing with complex brain tumor images.

The ensemble structure can be implemented in different ways, with weighted averaging and stacking ensemble techniques being two commonly used methods. In the weighted averaging approach, the probability outputs from InceptionV3 and Xception are combined based on assigned weights, ensuring that the final prediction reflects a balanced contribution from both models. These weights are derived from each model’s validation accuracy, optimizing performance through proper calibration.
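A minimal sketch of the weighted averaging step is given below. The text states that the weights are derived from validation accuracy but not how, so normalizing the two accuracies to sum to one is an assumption, as are the accuracy values and the per-class probabilities.

```python
def accuracy_weights(val_accuracies):
    """Normalize validation accuracies so the weights sum to 1 (assumed scheme)."""
    total = sum(val_accuracies)
    return [acc / total for acc in val_accuracies]


def weighted_average(prob_lists, weights):
    """Combine per-model class probabilities into one ensemble distribution."""
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[k] for w, probs in zip(weights, prob_lists))
        for k in range(n_classes)
    ]


# Hypothetical four-class outputs for one scan.
p_inception = [0.70, 0.10, 0.10, 0.10]
p_xception = [0.40, 0.40, 0.10, 0.10]

weights = accuracy_weights([0.96, 0.94])  # hypothetical validation accuracies
p_ensemble = weighted_average([p_inception, p_xception], weights)
predicted_class = max(range(len(p_ensemble)), key=p_ensemble.__getitem__)
```

Because the weights sum to one and each input is a probability distribution, the ensemble output is itself a valid distribution over the four classes.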

Alternatively, a stacking ensemble approach trains a meta-learner, such as logistic regression or a neural network, to combine the outputs of InceptionV3 and Xception. This meta-learner is trained using a validation dataset to learn how to effectively merge the predictions from the base models. The final decision is then determined by the output of this meta-learner, offering an alternative strategy for ensemble learning53.
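The stacking idea can be illustrated with a small logistic-regression meta-learner trained on the base models' outputs. This sketch simplifies the task to a binary tumor/no-tumor decision, whereas the study addresses four classes; the validation probabilities, labels, learning rate, and epoch count are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_learner(features, labels, lr=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n_features = len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def meta_predict(weights, bias, x):
    """Meta-learner's tumor probability for one pair of base-model outputs."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Each row: (InceptionV3 tumor probability, Xception tumor probability).
val_features = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9), (0.2, 0.3), (0.1, 0.4), (0.3, 0.1)]
val_labels = [1, 1, 1, 0, 0, 0]

meta_weights, meta_bias = train_meta_learner(val_features, val_labels)
tumor_probability = meta_predict(meta_weights, meta_bias, (0.85, 0.75))
```

Unlike fixed weighted averaging, the meta-learner also learns a bias term and can weight the two base models unequally according to the validation data.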

Despite the integration of multiple deep learning models, the ensemble method does not require modifications to the individual architectures of InceptionV3 and Xception. Both models maintain their original configurations, consisting of multiple convolutional, pooling, and fully connected layers. The ensemble process simply aggregates their predictions to leverage their complementary strengths, ultimately improving classification performance without altering the network structures54.

Rather than relying solely on a single model, the ensemble approach is specifically designed to enhance feature extraction and classification accuracy. Combining the outputs of different models, whether through weighted averaging or stacking, results in a more robust and accurate brain tumor detection framework. By merging their complementary capabilities, the ensemble model mitigates individual model weaknesses and achieves superior diagnostic performance53,55.

To further refine classification accuracy, a layered ensemble classifier was developed in response to the previously discussed transfer learning classifiers. The deep CNN ensemble model enhances classification effectiveness by integrating multiple network architectures, enabling the model to extract diverse, architecture-specific patterns. Since deep CNNs inherently introduce randomness, this approach enables the model to capture and learn different neural network design-specific features, ultimately boosting classification performance. The ensemble method simplifies feature extraction, making the boosting approach both more accurate and computationally efficient55.

For this study’s four-class classification task, the ensemble model was built by combining the InceptionV3 and Xception architectures. After hyperparameter tuning, the model used 64 neurons, dropout rates of 0.2 and 0.1, and a softmax activation function to classify the MRI scans into four distinct tumor types. The neural network ensemble was trained for 50 epochs with a batch size of 64 and a tuned learning rate of 0.0001.
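The softmax output step for the four-class task can be sketched as follows. The logits are hypothetical values standing in for the output of the final layer; subtracting the maximum before exponentiating is a standard numerical-stability measure and not something the text specifies.

```python
import math

CLASS_COUNT = 4  # four-class task, per the text


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 0.5, 0.1, -1.0]  # hypothetical final-layer scores for one scan
probs = softmax(logits)
predicted = probs.index(max(probs))
```

The predicted class is simply the index of the largest probability, and the probabilities from each base model can then be fed into the ensemble combination.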

Figure 4 illustrates the proposed XL-TL model architecture, which integrates physical, training, and validation layers within the ensemble framework. This ensemble neural network approach enhances model reliability, ensuring greater accuracy and generalizability in brain tumor detection.

Selection and optimization of base models

The selection and grouping of base models in this study followed a systematic approach to ensure optimal performance in brain tumor detection using MRI scans. Various deep learning architectures, including Convolutional Neural Networks (CNNs), ResNet, and DenseNet, were evaluated to leverage their individual strengths while minimizing the risk of overfitting. Each base model underwent performance evaluation using the same dataset, focusing on key metrics such as classification accuracy, sensitivity, and specificity. The models that demonstrated the highest performance were shortlisted for further integration.

Some models were specifically chosen due to their success in similar medical imaging tasks through transfer learning, which enhanced their performance even with smaller datasets, a common constraint in medical imaging. The selected models were then combined using a weighted voting mechanism, where final predictions were determined based on the reliability of each model. Greater influence was assigned to models with higher accuracy, ensuring a balanced and data-driven decision-making process.

In certain configurations, an ensemble stacking approach was employed, where the predictions from base models served as input features for a higher-level meta-model. This meta-model, trained separately, optimized the combination of predictions from the base models, further refining classification accuracy. The optimal ensemble configuration was determined by maximizing overall accuracy, precision, recall, F1-score, and AUC-ROC, while minimizing false positive and false negative rates. To ensure stability, k-fold cross-validation was applied, validating the robustness of the ensemble model across different dataset partitions.
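The selection metrics named above follow directly from confusion-matrix counts. The sketch below computes them for a single class treated as positive (one-vs-rest); the counts are hypothetical, and zero denominators are not guarded against.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall (sensitivity), specificity, and F1 from counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for one class treated as "positive".
metrics = binary_metrics(tp=90, fp=10, tn=85, fn=15)
```

For a four-class problem these values would be computed per class and then averaged; AUC-ROC additionally requires the predicted probabilities and is not covered by this sketch.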

Another key consideration was computational efficiency, as practical clinical applications require fast and resource-efficient models. The performance of different ensemble configurations was assessed to ensure reliable and real-time MRI processing. Additionally, an error analysis of individual model predictions helped identify specific weaknesses, ultimately leading to the creation of a highly accurate and reliable deep learning ensemble for brain tumor detection.

Challenges and solutions in model implementation

Several challenges were encountered during the experimental implementation of the proposed XL-TL model. One major issue was the limited availability of high-quality datasets, as many prior studies relied on outdated, non-uniform, and handcrafted feature-based datasets, which were naturally less precise than those processed using deep learning techniques56. To mitigate this, the study incorporated two levels of data augmentation, significantly enhancing the dataset’s diversity and improving the training and testing stages of the model.

Another limitation in previous research was the restricted number of tumor classes, as many studies failed to include new real-time datasets, reducing the generalizability of their findings29,50. To address this, the proposed model integrated an updated and expanded brain tumor dataset, ensuring broader coverage of different tumor types and improving classification robustness.

A further challenge involved finding the optimal combination of InceptionV3 and Xception, balancing their performance and computational complexity39. This was achieved through hyperparameter tuning, where different configurations were tested to maximize accuracy while keeping computational demands manageable. Despite these optimizations, certain constraints remain, such as the reliance on historical MRI data instead of real-time medical imaging and the computational expense of deep learning ensembles.

Nevertheless, the key contributions of this study include the development of an ensemble deep learning model integrating InceptionV3 and Xception, which demonstrated higher accuracy than previous machine learning and deep learning approaches56. This ensemble model provides a scalable and efficient solution for brain tumor detection, advancing the field of medical image classification while addressing critical limitations in earlier research.

Fig. 4 The architecture of the proposed XL-TL ensemble model.

To clarify how the system works, Table 3 presents the step-by-step procedure (pseudocode) of the proposed model. In its final step, the model uses the evaluation matrix to calculate its precision and accuracy.

Table 3 Pseudo code of proposed model XL-TL.
