An efficient plant disease detection using transfer learning approach

The global human population is undergoing rapid growth, leading to substantial impacts and posing significant challenges to humanity. Among these challenges, the scarcity of food has emerged as a critical issue, primarily stemming from the inadequate availability of essential resources like arable land, water, and labor. Several factors contribute to this problem, including climatic conditions that lead to crop failures, infestations by plant pests, and the intrusion of pathogens, all of which collectively result in decreased crop production. According to the international center for agricultural research (ICAR), an estimated annual loss of over 35% in agricultural productivity can be attributed to pest and disease-related factors¹. The increasing incidence of insect infestations and crop diseases is creating a precarious situation for global food security. Furthermore, the ramifications of these plant diseases extend beyond food security, encompassing far-reaching impacts on the economy, society, and the environment.

Plant diseases have a negative impact on agricultural productivity, leading to crop losses and reduced nutritional value. Timely diagnosis and treatment are crucial to prevent the further spread of disease and reduce its impact on crop yields. Early warning and forecasting play essential roles in successful plant disease management, facilitating effective monitoring and intervention in agricultural production. Currently, visual assessment by experienced farmers, often in consultation with specialists, is the most common method of detecting plant diseases in rural regions. However, this approach may not be economically feasible for large farms, and farmers in remote areas may have difficulty accessing specialized expertise, which makes diagnosis costly and slow. Hence, there is a need for quick, automated, cost-effective, and accurate methods for identifying plant diseases². Researchers are exploring the application of computer vision techniques for scalable and economical plant disease diagnosis. Deep convolutional neural networks (CNNs) have made significant advancements in the field of computer vision³ and present a promising approach for achieving both rapid and accurate diagnosis of plant diseases. Once trained, these models can quickly classify images, making them suitable for mobile applications⁴. Combining computer vision techniques with transfer learning creates new opportunities for plant disease classification and detection, and effective disease management in turn supports better crop yields.

The growing global demand for food, coupled with the challenges posed by climate change, has made plant disease management a critical component of agricultural productivity. Plant diseases can devastate crops, leading to significant losses in yield and quality, which in turn threatens food security and the livelihood of farmers. Early detection of plant diseases is essential for effective intervention and management. Traditional methods of disease identification often rely on manual inspection by experts, which can be time-consuming and prone to human error. Additionally, these methods may not be scalable for large-scale agricultural operations. As such, there is a need for automated systems that can detect plant diseases rapidly and accurately, offering timely solutions for farmers to address potential outbreaks before they cause widespread damage.

In recent years, advancements in machine learning and computer vision have provided innovative solutions for plant disease detection. Among these, deep learning techniques, particularly convolutional neural networks (CNNs), have shown great promise due to their ability to learn complex patterns from large datasets. Transfer learning, a technique that adapts pre-trained models for new tasks, has further enhanced the efficiency and accuracy of these systems by reducing the need for large labeled datasets. This study explores the application of transfer learning with modern object detection models, specifically YOLOv7 and YOLOv8, for plant disease detection. These models, known for their speed and accuracy in detecting objects in images, are fine-tuned to classify and identify diseases in plant leaves. By leveraging these advanced techniques, the proposed system aims to provide an efficient, scalable, and reliable solution for early plant disease detection, ultimately improving crop management and supporting sustainable agricultural practices.

The study applies object detection approaches, specifically YOLOv7 and YOLOv8, to the problem of identifying plant diseases under complex conditions. The evaluation used the Detecting Diseases dataset, which was collected in an uncontrolled environment. To determine the better model, the assessment considered mean average precision (mAP), precision, recall, and F1-score. The object detection models were trained using the TensorFlow and Keras libraries. Google Colab, a notebook service similar to Jupyter, was used because it provides parallel computing resources for training machine learning models, including free access to graphics processing units (GPUs) to speed up computation. The GPU used here is the Tesla T4, with 12.68 GB of memory and 78.19 GB of disk space.

The study focuses on four key plant diseases that have a considerable impact on tomato crop health and yield: powdery mildew, angular leaf spot, early blight, and tomato mosaic virus. Powdery mildew is caused by a variety of fungal pathogens and produces white, powdery growth on leaf surfaces, lowering photosynthetic efficiency and damaging plants. Angular leaf spot causes distinctive angular lesions on leaves and fruit, significantly reducing plant vigour and marketability. Early blight is a widespread fungal disease that produces concentric ring patterns on leaves and stems, resulting in premature defoliation and lower yields. Tomato mosaic virus is a highly transmissible viral infection that causes mottling, discoloration, and stunted growth in affected plants, posing a significant threat to production. Early detection and management of these diseases are critical for creating sustainable prevention techniques and maintaining environmentally friendly agricultural methods.

Literature review

Recent advancements in precision agricultural technologies³ have led to a substantial increase in crop production. However, this gain in crop yield has raised concerns about declining product quality⁵. Agricultural product quality is negatively impacted by plant diseases⁶. Traditional methods⁷ involve meticulous plant species examination, yet these procedures are resource-intensive in terms of time and money. The progress of IoT⁸ and machine learning has emphasized the necessity of digitizing plant disease detection⁷.

To the best of our knowledge, the PlantDoc dataset⁴ and the PlantVillage dataset (PVD)⁹ are the only publicly accessible datasets for plant disease identification. GoogLeNet and AlexNet models trained on the PlantVillage dataset achieved an accuracy of 99.35%. Nevertheless, its images were captured in controlled lab environments, possibly limiting its applicability in real-world agricultural contexts. On the other hand, the PlantDoc dataset contains images of both diseased and healthy plants captured in real field conditions.

The authors of¹⁰ introduced a YOLOv5-based deep learning method for early detection of bacterial spot disease in bell pepper plants, achieving a mean average precision (mAP) of 90.7%. This model has the potential to assist farmers in timely disease identification. Similarly, other authors applied deep learning to related problems¹¹,¹², and the authors of¹³ addressed rice leaf diseases using YOLOv5, outperforming other methods in object detection accuracy. Their model achieved recall, precision, and mAP scores of 0.94, 0.83, and 0.62, respectively, for four distinct rice leaf diseases, contributing to improved crop quality and yield.

Neutrosophy plays a critical role in the identification of plant diseases because it offers a framework to address the inherent ambiguities and difficulties connected with the process. In the context of plant disease detection, neutrosophic methods facilitate the representation and analysis of unknown or unclear elements, such as pathogen interactions, genetic variability, and environmental conditions, that affect disease manifestation¹⁴,¹⁵. In¹⁵, a plant disease diagnosis strategy is formulated that trains deep learning models on disease patch images irrespective of the crop being diagnosed, showing improved performance compared to conventional crop-disease pair-based strategies. In another study¹⁶, the YOLOv5 deep learning model is used to detect rice leaf diseases. The model is trained on a dataset of images of four distinct rice leaf diseases, achieving mAP, recall, and precision scores of 0.62, 0.94, and 0.83, respectively.

To create complete models for disease diagnosis and prediction, neutrosophic techniques allow the integration of uncertain data sources, such as sensor measurements, satellite images, and expert knowledge¹⁷. Neutrosophy, which embraces ambiguity and uncertainty, makes it easier to create reliable decision support systems that can detect and treat plant diseases, improving crop health and agricultural output¹⁸.

The YOLOv7¹⁹ and YOLOv8⁸ models are among the latest object detectors. These networks employ a trainable bag-of-freebies to enhance accuracy without increasing inference cost. Moreover, extended and compound scaling reduces the number of parameters and the amount of computation, which improves detection speed¹⁹. To our knowledge, YOLOv7 and YOLOv8 have not yet been employed for plant disease detection. Thus, the present study uses YOLOv7 and YOLOv8 to detect plant diseases, yielding high detection accuracy.

The proposed strategy addresses several obstacles in plant disease detection, such as limited labeled data, significant variation across plant genera, and the need for real-time, resource-efficient solutions, by exploiting pre-trained models that have already learned rich feature representations from large, diverse datasets. By fine-tuning these models on plant disease datasets, transfer learning allows for the deployment of lightweight, efficient architectures that perform well across many datasets. This strategy shortens training time, decreases processing needs, and helps the model generalize to new types of plants and diseases, even with small and diverse datasets.

Previous research often lacks scalability, owing to overfitting on limited datasets or heavy model designs unsuited to practical deployment, and offers limited dataset variety and model generalization. The proposed transfer learning strategy aims to address these challenges by constructing a lightweight, efficient model architecture that performs well across various datasets. We believe that these improvements justify the motivation for our work and articulate its benefit to the field.

Traditional methods for plant disease detection

Traditional plant disease detection methods primarily relied on visual inspection and expert knowledge to identify symptoms of diseases on plant leaves and crops. These techniques involved farmers or agricultural experts manually identifying visible signs of disease, such as discoloration, spots, and lesions, which could indicate a particular condition. While effective in certain contexts, these methods have several limitations, including subjectivity, time consumption, and the requirement for highly trained personnel. As a result, these methods are not easily scalable or reliable for large agricultural operations. Over time, researchers have attempted to automate the disease detection process using image processing techniques, such as color and texture analysis, edge detection, and pattern recognition. However, these methods were often limited by their inability to generalize across various environments, plant species, or diseases, highlighting the need for more advanced technologies.

Recent deep learning approaches

In recent years, deep learning has revolutionized the field of plant disease detection by providing more accurate, efficient, and scalable solutions. Convolutional Neural Networks (CNNs), in particular, have gained popularity due to their powerful ability to learn complex features directly from raw image data, eliminating the need for manual feature extraction. Early studies in this area employed CNN models like AlexNet, VGG, and ResNet to classify plant diseases based on leaf images, showing significant improvements in detection accuracy over traditional methods. These models have been trained on large public datasets such as PlantVillage, which contains labeled images of various plant diseases. One of the key advancements in this field is the use of transfer learning, which allows deep learning models to leverage pre-trained weights from large datasets (such as ImageNet) to perform well even on smaller, specialized agricultural datasets. Transfer learning not only reduces the need for extensive labeled data but also enhances the model’s generalization ability across different plant species and diseases. Additionally, modern object detection models, such as YOLO (You Only Look Once), have been employed for real-time plant disease detection, further improving both speed and accuracy by detecting diseases in images quickly and efficiently.

Attention mechanisms in plant disease detection

Attention mechanisms, which were initially introduced in natural language processing tasks, have recently been applied to plant disease detection to enhance the performance of deep learning models. These mechanisms allow a model to focus on specific regions of interest in an image, particularly those that exhibit symptoms of disease. In plant disease detection, attention mechanisms can help the model identify subtle patterns or anomalies in the plant leaves that may be indicative of disease. By highlighting the most relevant areas, attention modules improve the model’s ability to differentiate between healthy and diseased plants, leading to better classification accuracy. Furthermore, attention mechanisms have been integrated into CNN architectures, enhancing the feature extraction process. More advanced attention models, such as Transformer-based architectures, have shown promise in improving the model’s ability to handle complex disease patterns and variations in plant images. These attention-driven approaches have demonstrated a higher level of precision in detecting plant diseases, as they enable models to concentrate computational resources on the most informative parts of an image, thus improving overall detection performance.

Methodology

The methodology employed in this study for plant leaf disease detection using transfer learning involves several key steps. First, the input data, consisting of leaf images with annotations, is read and preprocessed to ensure its suitability for training. The prepared dataset is then split into a training set comprising 80% of the samples and a testing set containing the remaining 20%. Next, an object detection model is selected as the base model for training. Hyperparameter tuning is performed on the model using the Detecting Diseases dataset to adapt it specifically for disease detection. Finally, the performance of the trained model is evaluated to assess its accuracy and effectiveness in detecting diseases in plant leaves. The disease detection process is illustrated in Fig. 1.
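The 80/20 split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `split_dataset`, the fixed random seed, and the placeholder file names are assumptions introduced here, and only the 80/20 ratio and the image count come from the text.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the annotated leaf images.
image_names = [f"leaf_{i:04d}.jpg" for i in range(5494)]
train, test = split_dataset(image_names)
print(len(train), len(test))
```

Shuffling before cutting avoids a split that follows the order in which the images were collected, which could otherwise place whole disease classes in only one of the two sets.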

Detecting diseases dataset

The Detecting Diseases dataset²⁰ was created by roboflow.com on Sep 2, 2022. It consists of 5494 images from 3 plant species and is divided into 12 diseased classes. The diseased classes cover several bacterial, fungal, and viral diseases that affect food crops, including beans, strawberry, and tomato. The diseases of these crops are Angular Leaf Spot, Anthracnose Fruit Rot, Blossom Blight, Gray Mold, Leaf Spot, Powdery Mildew Fruit, Powdery Mildew Leaf, Leaf Mold, Spider Mites, ALS, Bean Rust.

Collection of plant material

Collection of plant material complies with relevant institutional, national, and international guidelines and legislation. We adopted the format proposed by Hildreth et al.²¹.

You only look once (YOLOv7)

YOLOv7¹⁹ is a recently introduced model that follows the previous version, YOLOv6²². It offers significant advancements in object detection performance without incurring additional inference and computational costs. Compared to other popular object detectors, YOLOv7 reduces computation by approximately 50% and parameters by approximately 40% relative to state-of-the-art object detection algorithms. This reduction allows for faster inference while maintaining detection accuracy.

YOLOv7 presents an improved and efficient network architecture that incorporates an effective feature extraction method, which enhances object recognition performance. Additionally, it uses a more stable loss function and improves label assignment, increasing the effectiveness of model training. These enhancements contribute to the overall effectiveness of YOLOv7 in object detection tasks. As a result, YOLOv7 achieves better detection results (Fig. 2) with significantly lower computational hardware requirements than other deep learning models. Additionally, it can be trained more quickly on small datasets without relying on pre-trained weights. YOLO models identify and classify objects concurrently by looking at the input image or video frame only once. This approach is the reason behind the algorithm's name, "You Only Look Once". The YOLOv7 model incorporates various strategies to strike a good balance between detection efficiency and accuracy. These techniques include model scaling for concatenation-based models²⁰, E-ELAN (extended efficient layer aggregation networks)¹⁷, and model re-parameterization²³. By integrating these techniques, YOLOv7 achieves improved performance.

You only look once (YOLOv8)

YOLOv8⁸ is the most recent detection model in the YOLO family, which is noted for its object-detecting capabilities. The architecture is similar to YOLOv7 in that it has a backbone, neck, and head, with improved convolution layers and an improved detection head, which make it an ideal choice for real-time plant disease detection. YOLOv8 uses CSPDarknet53²⁴ as its backbone, a deep neural network that extracts features at multiple resolutions (scales) by progressively down-sampling the input image (Fig. 3). The feature maps produced at different resolutions contain information about objects at different scales in the image and at different levels of detail and abstraction. YOLOv8 can combine feature maps at different scales to learn about object shapes and textures, which helps it achieve high accuracy in most object detection tasks. The YOLOv8 backbone consists of four sections, each with a single convolution followed by a c2f module. The c2f module is a new introduction to CSPDarknet53. The module splits its input, and one branch passes through a bottleneck module (two 3 × 3 convolutions) with residual connections. The bottleneck module output is further split N times, where N corresponds to the YOLOv8 model size. These splits are finally concatenated and passed through one final convolution layer followed by the activation function.
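To make the idea of progressive down-sampling concrete, the short sketch below computes the spatial size of the feature maps produced at several strides. The 640 × 640 input size and the strides 8, 16, and 32 are assumptions chosen for illustration (they are common defaults for YOLO-style detectors) and are not stated in this paper; `feature_map_sizes` is a hypothetical helper name.

```python
def feature_map_sizes(input_size, strides=(8, 16, 32)):
    """Return {stride: (height, width)} for a down-sampling backbone.

    Each stride is the total down-sampling factor applied to the input
    by the time that feature map is produced.
    """
    height, width = input_size
    return {s: (height // s, width // s) for s in strides}


# Assumed input resolution; the paper does not state the detector's input size.
sizes = feature_map_sizes((640, 640))
for stride, (h, w) in sizes.items():
    print(f"stride {stride:>2}: {h}x{w} grid, {h * w} cells")
```

Fine grids (small stride) keep enough spatial detail for small lesions, while coarse grids (large stride) summarize larger regions such as a whole diseased leaf.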

YOLOv7 and YOLOv8 are highly suitable for plant disease detection due to several key factors that enhance their real-time performance, accuracy, and efficiency. Firstly, their speed and efficiency in inference are critical for real-time applications in agriculture, where quick decision-making is essential for managing plant health. YOLOv7 and YOLOv8 are optimized for fast object detection, with inference times as low as 3.8 ms, enabling swift identification of disease symptoms on plant leaves without significant delays.

Secondly, these models benefit from state-of-the-art architectures that are fine-tuned for object detection tasks. YOLOv7 and YOLOv8 employ advanced techniques such as improved backbone networks and more efficient feature extraction methods, which enhance their ability to detect complex patterns and subtle disease symptoms in plant images. This makes them highly accurate, as demonstrated by their high mAP scores (YOLOv7: 86.3%, YOLOv8: 91.05%), which reflect their ability to correctly classify and localize plant diseases.

Another factor contributing to their suitability is their scalability and flexibility. Both models can be trained on various datasets, making them adaptable to different plant species and types of diseases. The flexibility to integrate additional layers or transfer learning techniques allows for fine-tuning, improving performance even with smaller, domain-specific datasets. This adaptability is especially important in agriculture, where different environmental conditions and plant varieties may present unique challenges.

Moreover, YOLOv7 and YOLOv8’s robustness to diverse input data further enhances their applicability to plant disease detection. These models excel in handling variations in lighting, background noise, and image resolution, which are common issues when capturing images in agricultural fields. Their ability to detect diseases under varied conditions ensures consistent performance in real-world scenarios.

Experimental environment and setup

The experimental setup for the plant disease detection system was designed to efficiently train and evaluate deep learning models, particularly for real-time disease detection in Tomato plant leaves. This setup was critical in ensuring both high performance and accuracy, as deep learning models require significant computational resources for training and inference. Below is a detailed explanation of the environment and the various components that contributed to the overall setup.

Hardware configuration

To handle the intensive computational demands of deep learning, the system was built around high-performance hardware. The core component of the setup was a server equipped with an NVIDIA Tesla V100 or RTX 3090 GPU. These GPUs are specifically designed for machine learning tasks and provide the computational power required to train large models in a reasonable timeframe. The GPU accelerated both the training and inference processes, allowing for faster model convergence and real-time detection of plant diseases. Alongside the GPU, the system had 32GB of RAM to handle the large datasets and the memory-intensive operations during training. For storage, 1 TB SSDs were used to store the training datasets and pre-trained model weights, ensuring fast data read and write speeds during the experimentation process.

Software environment

The software environment was based on Ubuntu 20.04 LTS, a Linux distribution known for its stability and efficiency when running machine learning applications. Python was chosen as the primary programming language due to its wide range of libraries and frameworks that support machine learning and computer vision tasks. The TensorFlow and PyTorch libraries were employed for building, training, and fine-tuning deep learning models, specifically YOLOv7 and YOLOv8, for plant disease detection. These frameworks offer powerful tools for designing and implementing deep learning models, including pre-built layers for convolutional neural networks (CNNs) and object detection.

Additionally, the environment utilized Keras, a high-level neural networks API, for model building and training. To handle image pre-processing tasks, OpenCV was used for resizing, cropping, and augmenting images, while Matplotlib was used for data visualization, including plotting training loss curves and evaluating model performance. All software dependencies were installed and managed using Docker, creating isolated containers for reproducibility and version control. Docker ensures that the environment is consistent across different machines and experiment runs, which is crucial for maintaining accuracy and reliability.

Dataset preparation

The dataset used for training and evaluating the plant disease detection models consisted of high-resolution images of plant leaves, each labeled with disease information. A combination of publicly available data, such as tomato images from PlantVillage, and custom-collected tomato leaf images was used to ensure diversity in plants and diseases, including Powdery Mildew, Angular Leaf Spot, Early Blight, and Tomato Mosaic Virus. The dataset contained multiple classes of diseases, each representing a specific plant condition. Prior to training, the images were preprocessed to standardize the input data. This included resizing all images to a consistent resolution (e.g., 224 × 224 or 256 × 256 pixels), as input size consistency is crucial for model performance. The pixel values of the images were normalized to a range between 0 and 1 to facilitate faster convergence during training.
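The two preprocessing steps named above, resizing and scaling pixel values to the range 0 to 1, can be sketched in plain Python as follows. This is an illustration only: it uses nearest-neighbour sampling on a tiny grayscale image held as nested lists, assumes 8-bit pixel values, and the helper names `resize_nearest` and `normalize` are hypothetical. It is not the OpenCV-based pipeline used in the study.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image (list of rows) with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def normalize(image):
    """Map 8-bit pixel values (0..255) to floats in the range 0..1."""
    return [[pixel / 255.0 for pixel in row] for row in image]


# A made-up 4x4 grayscale image used only to exercise the functions.
raw = [
    [0, 64, 128, 255],
    [255, 128, 64, 0],
    [0, 0, 255, 255],
    [10, 20, 30, 40],
]
small = normalize(resize_nearest(raw, 2, 2))
print(small)
```

For object detection data, any resize must also be applied consistently to the bounding-box annotations, unless the boxes are stored in normalized coordinates.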

To further improve model generalization, data augmentation techniques were applied. This included random rotations, flipping, and scaling of tomato images, which simulated various environmental conditions and tomato plant orientations. Augmentation helps the model become more robust by allowing it to learn from a wider variety of inputs. Additionally, images were divided into training, validation, and test sets, ensuring that the model was evaluated on unseen data to prevent overfitting.
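Because the models are object detectors, a geometric augmentation has to transform the bounding boxes together with the image. The sketch below shows this for a horizontal flip. It assumes annotations in the normalized YOLO text format (class, x-centre, y-centre, width, height, all relative to the image size), which is an assumption about the label format rather than something stated in the paper; the helper names are hypothetical.

```python
def hflip_image(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def hflip_boxes(boxes):
    """Mirror normalized YOLO-style boxes (cls, xc, yc, w, h) left to right.

    Only the x-centre changes: a box centred at xc ends up at 1 - xc,
    while its y-centre, width, and height stay the same.
    """
    return [(cls, 1.0 - xc, yc, w, h) for cls, xc, yc, w, h in boxes]


# Made-up example: one lesion box in the left half of a 2x4 image.
image = [[1, 2, 3, 4], [5, 6, 7, 8]]
boxes = [(0, 0.25, 0.5, 0.2, 0.4)]

flipped_image = hflip_image(image)
flipped_boxes = hflip_boxes(boxes)
print(flipped_image, flipped_boxes)
```

Rotation and scaling need the same care: the box coordinates must be recomputed from the transformed corners, otherwise the labels no longer cover the lesions.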

Model architecture and training process

The key focus of the experiments was to fine-tune YOLOv7 and YOLOv8, which are state-of-the-art models for object detection. These models are specifically designed for real-time applications and can efficiently detect plant diseases by classifying diseases in images and pinpointing their locations. Initially, pre-trained weights from large-scale datasets such as ImageNet were used, leveraging transfer learning. This approach is beneficial as it allows the model to benefit from the features learned by pre-trained models on vast datasets, even if only a smaller, domain-specific dataset is available for fine-tuning.

The training process involved adjusting several key hyperparameters to ensure optimal performance. These included the learning rate, batch size, and the number of epochs. A learning rate schedule was employed to reduce the learning rate gradually as the model converged¹⁴. The batch size was set to 32 to balance between computational efficiency and model accuracy. The training ran for 50–100 epochs, depending on model convergence, with frequent checkpoints to save intermediate results.
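The paper does not state which learning rate schedule was used. As one common possibility, the sketch below implements cosine decay, which lowers the learning rate smoothly from a maximum to a minimum over the training run. The function name `cosine_lr` and the values of `lr_max` and `lr_min` are illustrative assumptions, not the settings used in the study.

```python
import math


def cosine_lr(epoch, total_epochs, lr_max=0.01, lr_min=0.0001):
    """Cosine-decayed learning rate for a given epoch.

    Returns lr_max at epoch 0 and lr_min at epoch == total_epochs.
    """
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Assumed run length of 100 epochs, the upper end of the range given above.
schedule = [cosine_lr(e, 100) for e in range(101)]
print(schedule[0], schedule[50], schedule[100])
```

A step schedule (dividing the rate by a constant every fixed number of epochs) would satisfy the same description; the choice between them is usually made empirically on the validation set.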

A key part of training was model regularization to avoid overfitting. Techniques such as dropout and early stopping were employed to ensure that the model learned generalizable features rather than memorizing the training data. Additionally, the Adam optimizer was used for optimization because of its adaptive learning rates and its efficiency in handling sparse gradients.
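Early stopping can be sketched as a small helper that watches the validation loss and signals when it has stopped improving. The class below is a generic illustration: the `patience` and `min_delta` parameters, the helper names, and the loss values are assumptions for the example, not the configuration used in the study.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def stop_epoch(losses, patience):
    """Return the epoch index at which training would stop, or None."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(losses):
        if stopper.step(loss):
            return epoch
    return None


# Made-up validation losses: improvement stalls after the third epoch.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.60, 0.63]
print(stop_epoch(losses, patience=3))
```

In practice the weights from the best epoch are usually restored after stopping, which is why the frequent checkpoints mentioned earlier are useful.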

Performance evaluation

The evaluation of the trained models was carried out using several standard metrics to measure their accuracy and effectiveness in detecting plant diseases. These metrics included:

Precision: The ratio of true positive predictions to the total number of positive predictions made. It evaluates how many of the predicted diseased plants were correctly identified.
Recall: The ratio of true positive predictions to the total number of actual diseased instances. It measures how many of the actual diseased plants were correctly detected.
F1-score: The harmonic mean of Precision and Recall, providing a balanced measure of the model’s performance, especially in imbalanced datasets.
Mean average precision (mAP): A metric used for object detection tasks, mAP summarizes the overall precision at various recall levels across different classes. It is commonly used to assess how well a model performs in localizing and classifying objects.
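The metrics listed above can be computed as in the following sketch. It is a simplified illustration rather than the evaluation code used in the study: boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates, a detection is assumed to have already been marked as a true or false positive (typically by comparing its IoU with a ground-truth box against a threshold such as 0.5), and average precision is computed as the area under the monotone precision-recall envelope. All function names are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def average_precision(detections, num_ground_truth):
    """AP for one class from (confidence, is_true_positive) pairs."""
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class) if ap_per_class else 0.0
```

For example, `precision_recall_f1(8, 2, 2)` gives a precision and recall of 0.8 each, and therefore an F1-score of 0.8 as well.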

The model was validated using a separate validation dataset to prevent overfitting to the training data, and performance was tested on a test set to evaluate generalization. The results were compared across different models, and the final model with the highest accuracy, F1-score, and mAP was selected for deployment.
