Machine learning-based prediction of preeclampsia using first-trimester inflammatory markers and red blood cell indices | BMC Pregnancy and Childbirth

Baseline characteristics of pregnant women

This study included a total of 17,955 pregnant women, of whom 17,381 were non-preeclamptic and 574 developed PE (Table 1). The inflammatory markers were measured at an average gestational age of 12.43 weeks. Compared to non-preeclamptic women, those with PE were older, had a higher BMI, had singleton and IVF pregnancies, and had a lower parity. In terms of inflammatory markers and RBC indices, preeclamptic women presented increased WBC, NEUT, LYMPH, MONO, PLT, SIRI, SII, RBC, HGB and hematocrit (HCT) values during early pregnancy prior to the onset of disease (Fig. 2). However, no significant differences were observed in the NLR, dNLR, MLR, NMLR, PLR, or LMR. Additionally, WBC, NEUT, LYMPH, MONO, PLT, the SIRI, and the SII were stratified into quartiles. The prevalence of PE increased with higher quartile levels (Table 2).
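The composite indices compared above are ratios of absolute cell counts. The formulas in the sketch below are the definitions commonly used in the literature; the Methods are not part of this section, so treat them as an assumption rather than the authors' exact definitions. The function name and the input values are hypothetical.

```python
def inflammatory_indices(wbc, neut, lymph, mono, plt):
    """Derive composite inflammatory indices from absolute counts.

    All inputs are assumed to share the same unit (10^9 cells/L).
    The formulas are the commonly cited definitions, not taken from the paper.
    """
    return {
        "NLR": neut / lymph,            # neutrophil-to-lymphocyte ratio
        "dNLR": neut / (wbc - neut),    # derived NLR
        "MLR": mono / lymph,            # monocyte-to-lymphocyte ratio
        "NMLR": (neut + mono) / lymph,  # (neutrophil + monocyte)-to-lymphocyte ratio
        "PLR": plt / lymph,             # platelet-to-lymphocyte ratio
        "LMR": lymph / mono,            # lymphocyte-to-monocyte ratio
        "SIRI": neut * mono / lymph,    # systemic inflammatory response index
        "SII": plt * neut / lymph,      # systemic immune-inflammation index
    }


# Hypothetical first-trimester blood count, for illustration only.
example = inflammatory_indices(wbc=8.0, neut=6.0, lymph=1.5, mono=0.5, plt=240.0)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because LYMPH sits in the denominator of most of these ratios, a simultaneous rise in NEUT and LYMPH can leave a ratio such as the NLR unchanged even when both raw counts are elevated.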

Table 1 Baseline characteristics of pregnant women
Fig. 2

Changes in inflammatory markers and RBC indices in women with PE during early pregnancy, before disease onset. Differences in (A) WBC, (B) NEUT, (C) LYMPH, (D) MONO, (E) PLT, (F) SIRI, (G) SII, (H) RBC, (I) HGB and (J) HCT values between PE cases and unaffected individuals. WBC, white blood cell; NEUT, neutrophil; LYMPH, lymphocyte; MONO, monocyte; PLT, platelet; SIRI, systemic inflammatory response index; SII, systemic immune-inflammation index; RBC, red blood cell; HGB, hemoglobin; HCT, hematocrit. ***P < 0.001

Table 2 The proportion of patients with PE sorted by quartile

Associations between inflammatory markers, RBC indices, and the risk of PE

Three models were constructed by adjusting for different confounding variables to evaluate the associations between WBC, NEUT, LYMPH, MONO, PLT, SIRI, SII, RBC, HGB, and HCT and the risk of PE. After adjustment for all confounding variables, significant positive associations with PE incidence were detected for WBC (OR = 1.09, 95% CI: 1.04–1.13, P < 0.001), NEUT (OR = 1.09, 95% CI: 1.04–1.14, P < 0.001), LYMPH (OR = 1.27, 95% CI: 1.05–1.53, P = 0.013), MONO (OR = 2.57, 95% CI: 1.31–5.03, P = 0.006), PLT (OR = 1.01, 95% CI: 1.01–1.01, P < 0.001), SIRI (OR = 1.11, 95% CI: 1.01–1.21, P = 0.032), SII (OR = 1.01, 95% CI: 1.01–1.01, P = 0.002), RBC (OR = 2.12, 95% CI: 1.64–2.75, P < 0.001), HGB (OR = 1.02, 95% CI: 1.01–1.03, P < 0.001), and HCT (OR = 1.07, 95% CI: 1.03–1.11, P < 0.001) (Table 3).

Table 3 Relationships among inflammatory markers, RBC indices, and the risk of PE

Subsequently, WBC, NEUT, LYMPH, MONO, PLT, SIRI, and SII values were converted into quartile-based categorical variables. In the fully adjusted model, compared with Q1, the risk of PE progressively increased across quartiles for WBC, NEUT, LYMPH, MONO, and PLT, with a significant trend (P < 0.05). However, no such trend was observed for the SIRI or the SII (Table 3).
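The quartile conversion described above can be sketched as follows. This is a minimal illustration, assuming sample quartile cut points with upper-inclusive bins; the paper does not state how boundaries or ties were handled, and the function names and data are hypothetical.

```python
import statistics


def quartile_labels(values):
    """Label each value Q1..Q4 using the sample quartile cut points."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

    def label(v):
        if v <= q1:
            return "Q1"
        if v <= q2:
            return "Q2"
        if v <= q3:
            return "Q3"
        return "Q4"

    return [label(v) for v in values]


def prevalence_by_quartile(values, outcomes):
    """Proportion of cases (outcome == 1) within each quartile."""
    labels = quartile_labels(values)
    result = {}
    for q in ("Q1", "Q2", "Q3", "Q4"):
        members = [o for o, lab in zip(outcomes, labels) if lab == q]
        # A quartile can be empty when many values are tied.
        result[q] = sum(members) / len(members) if members else None
    return result


# Hypothetical WBC values and PE outcomes, for illustration only.
wbc = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
pe = [0, 0, 0, 0, 0, 1, 0, 1]
print(prevalence_by_quartile(wbc, pe))
```

In a regression setting, Q1 would then serve as the reference category, and a trend test would treat the quartile rank as an ordinal predictor.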

A nonlinear relationship and threshold effect were observed between inflammatory markers, RBC indices, and the risk of PE

To further ensure the robustness of the results, the potential nonlinear relationship between inflammatory biomarkers and PE risk was examined. In the RCS regression model, after adjusting for all confounding factors, no significant nonlinear relationships were found between LYMPH, MONO, the SIRI, the SII, and PE (P for nonlinearity > 0.05). However, significant nonlinear associations were observed between WBC, NEUT, PLT, RBC, HGB, and PE (nonlinear P < 0.05) (Fig. 3A-E). A tendency toward nonlinear associations was also observed between HCT and PE (nonlinear P = 0.07) (Fig. 3F).

Fig. 3

The nonlinear relationship between inflammatory biomarkers, RBC indices and PE risk. Restricted cubic spline analyses of the associations of inflammatory markers and RBC indices with PE: (A) WBC; (B) NEUT; (C) PLT; (D) RBC; (E) HGB; (F) HCT. WBC, white blood cell; NEUT, neutrophil; PLT, platelet; RBC, red blood cell; HGB, hemoglobin; HCT, hematocrit

Further analysis revealed a threshold effect on the association between WBC count and PE risk (P for likelihood ratio test = 0.034), with an inflection point at 8.44 (Table 4). When the WBC count was less than 8.44, no significant association with PE risk was observed (OR = 0.92, 95% CI: 0.79–1.08, P = 0.307). However, when the WBC count exceeded 8.44, each unit increase was associated with a 14% increase in the odds of PE (OR = 1.14, 95% CI: 1.07–1.22, P < 0.001). A threshold effect was also observed for PLT (P for likelihood ratio test = 0.004), with an inflection point at 204. For RBC, a significant threshold effect was detected (P for likelihood ratio test < 0.001), with an inflection point at 3.84. When the RBC count was less than 3.84, each unit increase was associated with an 85% decrease in the odds of PE (OR = 0.15, 95% CI: 0.03–0.78, P = 0.024). However, when the RBC count exceeded 3.84, each unit increase was associated with a 161% increase in the odds of PE (OR = 2.61, 95% CI: 1.95–3.50, P < 0.001). Additionally, a threshold effect was identified for HGB (P for likelihood ratio test < 0.001), with an inflection point at 119.43, and for HCT (P for likelihood ratio test = 0.021), with an inflection point at 34.63.
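One way to read these threshold results is as a two-segment model that is linear on the log-odds scale, with a different per-unit OR on each side of the inflection point. The sketch below assumes that parameterization (the section does not give the exact model specification) and uses the reported RBC values; the function names are hypothetical.

```python
import math


def log_odds_offset(x, knot, or_below, or_above):
    """Log-odds at x relative to the knot under a two-segment linear model."""
    per_unit_or = or_below if x < knot else or_above
    return math.log(per_unit_or) * (x - knot)


def odds_ratio_between(x_from, x_to, knot, or_below, or_above):
    """Odds ratio comparing x_to with x_from; the segments join at the knot."""
    delta = (log_odds_offset(x_to, knot, or_below, or_above)
             - log_odds_offset(x_from, knot, or_below, or_above))
    return math.exp(delta)


def percent_change_in_odds(or_per_unit):
    """Convert a per-unit odds ratio into a percent change in the odds."""
    return (or_per_unit - 1.0) * 100.0


# Reported RBC results: inflection point 3.84, OR 0.15 below it, OR 2.61 above it.
RBC_KNOT, RBC_OR_BELOW, RBC_OR_ABOVE = 3.84, 0.15, 2.61

print(percent_change_in_odds(RBC_OR_BELOW))  # roughly -85 (% per unit, below the knot)
print(percent_change_in_odds(RBC_OR_ABOVE))  # roughly +161 (% per unit, above the knot)
print(odds_ratio_between(3.84, 4.34, RBC_KNOT, RBC_OR_BELOW, RBC_OR_ABOVE))
```

Under this reading the odds of PE are lowest near the inflection point and rise on either side of it, which gives a U-shaped association for RBC.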

Table 4 Threshold effects of inflammatory markers and RBC indices on PE

Subgroup analyses

A subgroup analysis was conducted to examine whether the relationships between inflammatory markers, RBC indices, and the risk of PE varied according to age, BMI, parity, gestational diabetes, hypothyroidism, singleton pregnancy, and IVF pregnancy. Significant interactions were identified between WBC, NEUT, LYMPH, PLT, and singleton pregnancies (P value for interaction < 0.05), as well as between RBC and parity (P value for interaction < 0.05) (Table 5; Fig. 4).

Table 5 Relationships between inflammatory markers, RBC indices, and the risk of PE in different subgroups
Fig. 4

Relationships between inflammatory markers, the RBC index and PE risk in singleton and parity subgroups. WBC, white blood cell; NEUT, neutrophil; LYMPH, lymphocyte; PLT, platelet; RBC, red blood cell

Genetic association of immune cells with PE

To assess the genetic association of immune cells with PE, the causal relationship between 731 immune cell traits and PE was evaluated via a two-sample MR method. The analysis identified 15 immune cell traits as risk factors for PE, including CD20-% lymphocyte, CD11c + CD62L- monocyte %monocyte, lymphocyte %leukocyte, HLA DR + NK % CD3- lymphocyte, PDL-1 on CD14- CD16 + monocyte, and CD16 on CD14 + CD16 + monocyte (Fig. 5).

Fig. 5

Forest plots showing the causal associations between immune cells and PE

Owing to the high OR of CD11c + CD62L- monocyte %monocyte, further genetic association analyses were conducted for this immune cell type. No heterogeneity or horizontal pleiotropy was detected in the sensitivity analysis, as confirmed by the Cochran Q test, MR‒Egger intercept test, and MR Pleiotropy Residual Sum and Outlier (MR-PRESSO) (P > 0.05, Table 6). The stability of the results was further validated via funnel plots (Supplementary Figure S1C). Additionally, leave-one-out analysis demonstrated that no single IV significantly influenced causal inference (Supplementary Figure S1D), supporting the reliability of the causal relationship between CD11c + CD62L- monocyte %monocyte and PE risk.
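The sensitivity checks named above build on per-variant summary statistics. As a rough sketch, assuming a fixed-effect inverse-variance weighted (IVW) estimator (this section does not say which MR estimator was primary), the pooled estimate, Cochran's Q, and a leave-one-out loop can be written as below. All function names and numbers are hypothetical.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    Returns (pooled estimate, standard error, Cochran's Q).
    Q is compared with a chi-square distribution on (variants - 1) df.
    """
    # Wald ratio per variant: effect on outcome divided by effect on exposure.
    ratios = [bo / be for be, bo in zip(beta_exposure, beta_outcome)]
    # First-order weight: inverse variance of each Wald ratio.
    weights = [(be / so) ** 2 for be, so in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = (1.0 / total_weight) ** 0.5
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q


def leave_one_out(beta_exposure, beta_outcome, se_outcome):
    """Re-estimate the pooled effect with each variant removed in turn."""
    n = len(beta_exposure)
    estimates = []
    for drop in range(n):
        keep = [i for i in range(n) if i != drop]
        beta, _, _ = ivw_estimate(
            [beta_exposure[i] for i in keep],
            [beta_outcome[i] for i in keep],
            [se_outcome[i] for i in keep],
        )
        estimates.append(beta)
    return estimates


# Hypothetical summary statistics for three instrumental variants.
bx = [0.10, 0.20, 0.15]
by = [0.02, 0.05, 0.03]
se_y = [0.01, 0.01, 0.01]
print(ivw_estimate(bx, by, se_y))
print(leave_one_out(bx, by, se_y))
```

A small Q relative to its degrees of freedom indicates little heterogeneity between variants, and leave-one-out estimates that stay close to the pooled value indicate that no single variant drives the result.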

Table 6 Results of the sensitivity analysis

To investigate how CD11c + CD62L- monocyte %monocyte influences the progression of PE, relevant metabolites were incorporated into the analysis. The results indicated that elevated levels of CD11c + CD62L- monocyte %monocyte might increase the phosphate-to-5-oxoproline ratio, thereby contributing to an increased risk of PE (Fig. 6A). The estimated mediated effect was 0.009 (95% CI: −0.000174 to 0.0182), corresponding to a mediated proportion of 6.6% (95% CI: −0.128% to 13.3%) (Table 7; Fig. 6B).
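A mediated effect of this kind is usually obtained with the product-of-coefficients approach: the exposure-to-mediator effect is multiplied by the mediator-to-outcome effect, and the result is divided by the total effect to give the mediated proportion. The sketch below assumes that approach with a delta-method standard error; the section does not state the authors' exact procedure, and the function name and input values are hypothetical.

```python
def mediation_summary(beta_exp_med, se_exp_med, beta_med_out, se_med_out, beta_total):
    """Product-of-coefficients mediation with a delta-method 95% CI.

    Assumes beta_total > 0 so that the proportion CI keeps its ordering.
    """
    mediated = beta_exp_med * beta_med_out
    # First-order delta-method standard error of the product of two estimates.
    se = (beta_exp_med ** 2 * se_med_out ** 2
          + beta_med_out ** 2 * se_exp_med ** 2) ** 0.5
    low, high = mediated - 1.96 * se, mediated + 1.96 * se
    return {
        "mediated": mediated,
        "ci": (low, high),
        "proportion": mediated / beta_total,
        "proportion_ci": (low / beta_total, high / beta_total),
    }


# Hypothetical effect estimates, for illustration only.
summary = mediation_summary(
    beta_exp_med=0.10, se_exp_med=0.02,
    beta_med_out=0.30, se_med_out=0.10,
    beta_total=0.20,
)
print(summary)
```

With the figures reported above, dividing the mediated effect (0.009) by the mediated proportion (6.6%) implies a total effect of roughly 0.14 on the log-odds scale. The reported interval for the mediated effect includes zero.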

Fig. 6

Mediation analyses for the genetic association between CD11c + CD62L- monocyte %monocyte and PE. (A) Forest plot depicting the causal relationship between CD11c + CD62L- monocyte %monocyte, phosphate to 5-oxoproline and PE. (B) The mediation effect of phosphate to 5-oxoproline on the causal effect of CD11c + CD62L- monocyte %monocyte on PE

Table 7 The mediating effect of the phosphate to 5-oxoproline ratio

Prediction of PE by the inflammation score and RBC index changes prior to symptom onset

A total of 17,955 samples were randomly divided into training and test sets at a 1:1 (5:5) ratio. No statistically significant differences were observed between the two sets, except for gestational diabetes, which was more prevalent in the test set.
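A random split into two equal halves can be sketched as below. The seed, the function name, and the use of plain (unstratified) shuffling are assumptions for illustration; the section does not say whether the split was stratified by outcome.

```python
import random


def split_half(sample_ids, seed=0):
    """Randomly split sample identifiers into two halves (training, test)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# 17,955 participants, identified here by hypothetical integer IDs.
train_ids, test_ids = split_half(range(17955))
print(len(train_ids), len(test_ids))
```

With only 574 PE cases in the cohort, an unstratified split can leave the two halves with slightly different case counts, which is one reason baseline characteristics are compared between the sets afterwards.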

Scoring models based on routine blood indicators were constructed via twelve machine learning methods. Among these models, the model developed with GBM demonstrated the best performance in predicting PE, achieving an AUC of 0.72 in the training set and 0.65 in the test set (Fig. 7A, C, D). The GBM model significantly outperformed the individual indicators, suggesting that the scoring model can help identify the risk of developing PE during early pregnancy.
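The AUC values quoted above can be read as the probability that a randomly chosen PE case receives a higher score than a randomly chosen unaffected woman. A minimal sketch of that pairwise definition follows; the data are hypothetical, and a production analysis would normally use a library routine instead.

```python
def auc(scores, labels):
    """AUC as the share of (case, control) pairs ranked correctly.

    Ties count as half a correct pair. Labels: 1 = case, 0 = control.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical risk scores for two controls and two cases.
print(auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.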

Fig. 7

Machine learning algorithms used to construct PE prediction models. (A) AUC values of the routine blood score models constructed by twelve machine learning algorithms. (B) AUC values of the PE prediction models constructed by twelve machine learning algorithms. (C) ROC curves of the optimal routine blood score model (GBM) and the PE prediction model (RF + GBM) in the training cohort. (D) ROC curves of the optimal routine blood score model (GBM) and the PE prediction model (RF + GBM) in the test cohort

Subsequently, maternal risk factors such as age, BMI, and parity were incorporated into the model to construct a comprehensive PE prediction model. In both the training and test sets, this joint prediction model achieved high prediction performance before 14 weeks of gestation, with AUC values of 0.82 and 0.73, respectively (Fig. 7B-D). The AUC improvement of the joint prediction model over the GBM model based on routine blood indicators alone was statistically significant in both the training and test sets (DeLong test, P < 0.001) (Fig. 7C, D). Additionally, supplementary analysis of risk scores revealed significantly higher values in PE subgroups than in the unaffected group, with preterm PE exhibiting the highest scores (Supplementary Figure S2).
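The DeLong test compares two AUCs computed on the same individuals while accounting for the correlation between the two sets of scores. The sketch below follows the standard nonparametric formulation using placement values; it is an illustration on hypothetical data, not the software the authors used, and all names are assumptions.

```python
import math


def _kernel(case_score, control_score):
    """1 if the case outranks the control, 0.5 on a tie, 0 otherwise."""
    if case_score > control_score:
        return 1.0
    if case_score == control_score:
        return 0.5
    return 0.0


def _placements(scores, labels):
    """Per-case and per-control placement values for one model."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(_kernel(c, d) for d in controls) / len(controls) for c in cases]
    v01 = [sum(_kernel(c, d) for c in cases) / len(cases) for d in controls]
    return v10, v01


def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(scores_a, scores_b, labels):
    """Two-sided DeLong test for two correlated AUCs on the same samples.

    Returns (auc_a, auc_b, z, p_value). Requires non-zero variance of the
    AUC difference, so the two score vectors must not rank identically.
    """
    a10, a01 = _placements(scores_a, labels)
    b10, b01 = _placements(scores_b, labels)
    auc_a = sum(a10) / len(a10)
    auc_b = sum(b10) / len(b10)
    variance = (
        (_cov(a10, a10) + _cov(b10, b10) - 2.0 * _cov(a10, b10)) / len(a10)
        + (_cov(a01, a01) + _cov(b01, b01) - 2.0 * _cov(a01, b01)) / len(a01)
    )
    z = (auc_a - auc_b) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value


# Hypothetical scores from two models for three cases and three controls.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
model_b = [0.6, 0.3, 0.5, 0.7, 0.4, 0.2]
print(delong_test(model_a, model_b, labels))
```

Because both models are scored on the same women, the covariance terms reduce the variance of the AUC difference when the two sets of scores are positively correlated, which makes the paired test more sensitive than comparing two independent AUCs.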
