Virtual reality-empowered deep-learning analysis of brain cells
Whole-brain immunolabeling and clearing
Immunostaining for c-Fos was performed using a modified version of SHANEL10. All incubation steps were carried out under moderate shaking (300 rpm). For the pretreatment, samples were dehydrated with an ethanol/water series (50%, 70% and 100% ethanol) at room temperature for 3 h per step. Next, samples were incubated in dichloromethane (DCM)/methanol (2:1 v/v) at room temperature for 1 day. Brains were rehydrated with an ethanol/water series (100%, 70% and 50% ethanol and diH2O) at room temperature for 3 h per step. Samples were incubated in 0.5 M acetic acid at room temperature for 5 h followed by washing with diH2O. Next, brains were incubated in 4 M guanidine HCl, 0.05 M sodium acetate, 2% v/v Triton X-100, pH 6.0, at room temperature for 5 h followed by washing with diH2O. Brains were incubated in a mix of 10% CHAPS and 25% N-methyldiethanolamine at 37 °C for 12 h before washing with diH2O. Blocking was performed by incubating the brains in 0.2% Triton X-100, 10% dimethylsulfoxide and 10% goat serum in PBS shaking at 37 °C for 2 days. Samples were incubated with c-Fos primary antibody (Cell Signaling Technology, 2250, 1:1,000 dilution) in primary antibody buffer (0.2% Tween-20, 5% dimethylsulfoxide, 3% goat serum and 100 µl heparin per 100 ml PBS) shaking at 37 °C for 7 days. The antibody solution was filtered (22-µm pore size) before use. Samples were washed in washing solution (0.2% Tween-20 and 100 µl heparin per 100 ml PBS) shaking at 37 °C for 1 day, during which the washing solution was refreshed five times. Brains were incubated with the secondary antibody (Alexa Fluor 647 goat anti-rabbit IgG (H + L) from Invitrogen, A-21245, 1:500 dilution) in secondary antibody buffer (0.2% Tween-20, 3% goat serum and 100 µl heparin per 100 ml PBS) shaking at 37 °C for 7 days, followed by incubation in washing solution shaking at 37 °C for 1 day, during which the washing solution was refreshed five times.
Brains were dehydrated using 3DISCO2 with a THF/H2O series (50%, 70%, 90% and 100% THF) for 12 h per step followed by an incubation in DCM for 1 h. Tissues were incubated in benzyl alcohol/benzyl benzoate (1:2 v/v) until tissue transparency was reached (>4 h).
For microglia labeling, brains of CX3CR1GFP/+ mice were pretreated via the modified SHANEL protocol as described above and incubated with Atto647N-conjugated anti-GFP nanobooster (Chromotek, gba647n-100, 1:1,000 dilution) with 5% 2-hydroxypropyl-β-cyclodextrin, 0.2% Tween-20 and 6% goat serum in PBS for 5 days at 37 °C. Brains were washed as described above in washing solution shaking at 37 °C for 1 day, during which the washing solution was refreshed five times. Brains were dehydrated with an ethanol/diH2O series (50%, 70%, 90% and 100% ethanol) at room temperature for 2 h per step and incubated in 100% ethanol overnight. Subsequently, brains were incubated in DCM for 1 h before incubation in benzyl alcohol/benzyl benzoate until tissue transparency was reached.
Light-sheet imaging
Light-sheet imaging of c-Fos-labeled brains was conducted through a ×4 objective lens (Olympus XLFLUOR 340) equipped with an immersion-corrected dipping cap mounted on an UltraMicroscope II (LaVision BioTec) coupled to a white light laser module (NKT SuperK Extreme EXW-12). The antibody signal was visualized using a 640/40 nm excitation and 690/50 nm emission filter. Tiling scans (3 × 3 tiles) were acquired with a 15–20% overlap, 60% sheet width and 0.027 NA. The images were taken in 16-bit depth and at a nominal resolution of 1.625 μm per voxel on the xy axes. In the z dimension we took images in 6-μm steps using left- and right-sided illumination. Whole-brain scans of microglia-labeled CX3CR1GFP/+ brains were generated with the LaVision BioTec Ultramicroscope Blaze coupled with a LaVision BioTec MI PLAN ×12 objective (0.53 NA, WD = 10 mm, nominal pixel size of 0.54 µm in xy). Stitching of tile scans was carried out using the ‘Stitch Sequence of Grids of Images’ function of Fiji’s stitching plugin45 and custom Python scripts.
ClearMap
ClearMap7 and the CellMap portion18 of ClearMap2 were used with thresholds and cell sizes adapted to the higher resolution and different signal-to-noise ratios of our dataset. Segmentation masks were saved as tiff stacks by toggling the ‘save’ option in the last segmentation step. ClearMap was ported to Python (v.3.5) before use, but functioned identically46. We used only the cell segmentation portions; no pre-processing (for example, ClearMap2’s flat-field correction) or post-processing (such as atlas alignment) was performed. Both pipelines were run on an entire brain, and the output was subsequently subdivided into the test patches that we used for the comparisons with DELiVR. For ‘optimized ClearMap’3, we performed the following pre-processing steps on our image stack: (1) Background equalization to homogenize the intensity distribution and appearance of the c-Fos+ cells over different regions of the brain, using the pseudo-flat-field correction function from the Bio-Voxxel toolbox. (2) Convoluted background removal, to remove all particles bigger than relevant cells. This was performed with the median option in the Bio-Voxxel toolbox. (3) A 2D median filter to remove remaining noise after background removal. (4) Unsharp masking to amplify the high-frequency components of the signal and increase the overall accuracy of ClearMap’s cell detection algorithm. (5) A z-wise removal of artifacts by manually selecting ROIs in Fiji. After pre-processing, ClearMap7 was applied following the original publication, using the threshold levels that we obtained from the pre-processing steps.
Ventricle masking
We wrote an automated pre-processing script that downsamples the image stack to an isotropic 25 × 25 × 25 µm per voxel and then applies a custom-trained random forest to identify ventricles. Specifically, we integrated Ilastik19 (v.1.4.0b8) with a 3D pixel classifier, which we trained on several downsampled brain image stacks to differentiate between ventricles and brain parenchyma. The pre-processing script then generates a 3D mask stack that our script upsamples to the original image stack dimensions, using bicubic interpolation to avoid aliasing artifacts at ventricle edges. It then masks each original z-plane image with the respective mask, pads it and returns a 16-bit image stack (saved as one big .npy file that can be read via np.memmap).
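The mask-and-save step described above can be illustrated with a minimal NumPy sketch. It is not the DELiVR implementation: the function names are hypothetical, nearest-neighbor index mapping stands in for the pipeline's interpolation step so that the mask stays binary, and the padding step is omitted. The output is written with `np.lib.format.open_memmap`, which produces a standard .npy file that can later be opened lazily.

```python
import os
import tempfile

import numpy as np


def upsample_mask_nearest(mask, target_shape):
    """Upsample a binary 3D mask to target_shape by nearest-neighbor index mapping.

    Assumption: a simple stand-in for the interpolation used in the pipeline.
    """
    idx = [
        np.minimum((np.arange(t) * s) // t, s - 1)
        for t, s in zip(target_shape, mask.shape)
    ]
    return mask[np.ix_(*idx)]


def mask_stack_to_npy(stack, ventricle_mask, out_path):
    """Zero out ventricle voxels plane by plane and write a 16-bit .npy file."""
    full_mask = upsample_mask_nearest(ventricle_mask, stack.shape)
    out = np.lib.format.open_memmap(
        out_path, mode="w+", dtype=np.uint16, shape=stack.shape
    )
    for z in range(stack.shape[0]):
        plane = stack[z].astype(np.uint16)  # copy, so the input stays untouched
        plane[full_mask[z]] = 0
        out[z] = plane
    out.flush()
    del out  # close the memory map
    return out_path


# Toy data: a 4 x 4 x 4 stack and a 2 x 2 x 2 low-resolution ventricle mask.
stack = np.full((4, 4, 4), 1000, dtype=np.uint16)
small_mask = np.zeros((2, 2, 2), dtype=bool)
small_mask[0, 0, 0] = True

path = os.path.join(tempfile.mkdtemp(), "masked.npy")
mask_stack_to_npy(stack, small_mask, path)

# Open lazily; only the planes that are accessed are read from disk.
masked = np.load(path, mmap_mode="r")
```

Processing one z-plane at a time keeps memory use bounded by a single plane plus the mask, which matters for whole-brain stacks that do not fit in RAM.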
Annotation
VR annotation of c-Fos+ cells was carried out using Arivis VisionVR (v.3.4.0, Carl Zeiss Microscopy Software Center Rostock) or syGlass (v.1.7.2, ref. 12). For this purpose, the annotator wore a VR headset (Oculus Rift S) and carried out annotations in VR using hand controllers (Oculus Touch). Slice-by-slice annotation was carried out using ITK-SNAP (v.3.8, ref. 11). To compare VR and 2D slice-based annotation, the participants annotated a 100³-voxel volume of c-Fos-labeled brain, and the time until the annotation task was finished was recorded. For training and testing our deep-learning network, we annotated a total of 48 × 100³ voxel patches in VR. All of our training and test patches were furthermore vetted by an expert biologist in ITK-SNAP to ensure that only cells were annotated. We evaluated the annotation quality using the Dice score as described below. For more details about the annotation process in VR, please see our ‘DELiVR handbook’ provided as a Supplementary Note. Microglia cell bodies were annotated in VR in the same way as c-Fos+ cells, using Arivis VisionVR. Only the somata were annotated, while the microglia processes were excluded.
Deep learning
To automatically segment the cells in all brains, we trained a 3D BasicUNet47 from the MONAI library48 for DELiVR. The annotated dataset of 48 × 100³ patches was split into nine patches for testing and 39 patches for training, stratified by signal after manual ventricle masking. As an activation function, we chose Mish49 and as optimizer Ranger21 (ref. 50). As a loss function, we used binary cross-entropy loss17. We trained for 500 epochs with an initial learning rate of 1 × 10⁻³ and a batch size of four. The network was trained on a single GPU (NVIDIA RTX8000). Instead of conducting model selection, we selected the last checkpoint after 500 epochs of training. To compare the DELiVR 3D BasicUNet with other segmentation models, we trained UNETR15, SegResNet16 and MONAI DynUNET17 with similar specifications.
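The text does not specify how the split was stratified by signal. One possible reading, sketched below under that assumption, is to rank the patches by a per-patch signal measure (here a hypothetical foreground fraction) and draw the test patches at evenly spaced ranks, so that both sets span the full range from sparse to dense patches. The function name and the toy values are illustrative only.

```python
def stratified_split(signal_by_patch, n_test):
    """Pick n_test patches at evenly spaced signal ranks; the rest are training.

    Assumption: 'signal' is any scalar per patch, e.g. its foreground fraction.
    """
    ranked = sorted(signal_by_patch, key=signal_by_patch.get)
    n = len(ranked)
    # Integer midpoints of n_test equal-width rank bins.
    test_idx = {((2 * i + 1) * n) // (2 * n_test) for i in range(n_test)}
    test = [ranked[i] for i in sorted(test_idx)]
    train = [p for i, p in enumerate(ranked) if i not in test_idx]
    return train, test


# Toy example: 48 hypothetical patches with made-up foreground fractions.
signal = {f"patch_{i:02d}": (i * 37 % 48) / 48 for i in range(48)}
train, test = stratified_split(signal, n_test=9)
print(len(train), len(test))  # 39 9
```

A random draw within each signal bin would serve the same purpose; the deterministic variant is used here only so that the example is reproducible.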
The microglia 3D BasicUNet model was trained in a similar fashion for 500 epochs using 161 patches containing 3,798 cells. These were split into 129 patches for training and 32 patches for testing. Training was performed on an NVIDIA A100 GPU.
Evaluation of the segmentation model
Evaluation of the deep-learning model was conducted in a twofold manner. First, we evaluated the volumetric segmentation quality by assessing, for each voxel, whether it was correctly classified as foreground or background using pymia51. This volumetric assessment yielded true positives (TPs), false positives (FPs), false negatives (FNs) and true negatives by comparing every prediction voxel with the corresponding reference annotation voxel. Second, we conducted an instance-wise assessment of the segmentation quality, in which we assessed detection rates at the level of single cells (instances)52. To fairly evaluate every cell irrespective of the patch, we aggregated the counts across all patches and computed the instance metrics globally53.
Volumetric and instance scores were calculated according to the following equations:
$$\mathrm{Dice}=\frac{2\,\mathrm{TP}}{2\,\mathrm{TP}+\mathrm{FP}+\mathrm{FN}}\qquad\mathrm{Sensitivity}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}\qquad\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
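As a worked illustration of these scores and of the global aggregation across patches, the following sketch pools TP, FP and FN counts before computing the metrics. The class and function names are hypothetical and the counts are made up; the sketch does not reproduce the pymia-based evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """True-positive, false-positive and false-negative counts of one patch."""
    tp: int = 0
    fp: int = 0
    fn: int = 0

    def __add__(self, other):
        return Counts(self.tp + other.tp, self.fp + other.fp, self.fn + other.fn)


def _ratio(num, den):
    # Undefined scores (empty denominator) are reported as NaN.
    return num / den if den else float("nan")


def scores(c):
    """Dice, sensitivity and precision from pooled counts."""
    return {
        "dice": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "precision": _ratio(c.tp, c.tp + c.fp),
    }


def global_scores(per_patch):
    """Sum the counts over all patches first, then compute the scores once."""
    return scores(sum(per_patch, Counts()))


# Two made-up patches: one dense, one sparse.
result = global_scores([Counts(tp=8, fp=2, fn=0), Counts(tp=2, fp=0, fn=2)])
```

Pooling the counts weights every cell equally, whereas averaging per-patch scores would give a sparse patch the same weight as a dense one.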
For the comparison with ClearMap, ClearMap2, ‘optimized ClearMap’ and Ilastik, we ran each method on a whole test brain and cropped 100³-voxel patches from the resulting segmentations. This avoids artifacts that occur when the methods are applied at the patch level. These patches were then compared to our reference annotation using the same metrics as described above.
Atlas registration and statistical analysis
For atlas registration, we used mBrainAligner14, which worked well with our datasets (Supplementary Fig. 1). We manually saved the downsampled isotropic 25 × 25 × 25 µm per voxel stacks as .v3draw using Vaa3D54. Subsequently, we wrote an automated script that aligned the image stacks to mBrainAligner’s 50 × 50 × 50 µm per voxel version of the Allen Brain Atlas CCF3 reference atlas, using the LSFM example settings with minor adaptations. We then used mBrainAligner’s swc transformation tool to map the center-point coordinates of our c-Fos+ cells into atlas space.
Furthermore, we wrote a custom cell-to-atlas script (reusing parser code from VeSSAP55 and the Allen Brain Atlas CCF3 atlas file as provided by the Scalable Brain Atlas56) that filters the cells by size, with user-defined upper and lower limits, and returns two tables: a cell table with one cell per row, including, among other fields, its region and Allen Brain Atlas color code, and a region table with one region per row, which summarizes the number of c-Fos+ cells per region. For all datasets, the post-processing script generates overview tables that contain cell counts for all regions. We used the latter for uncorrected Student’s t-tests. Finally, we implemented a level-aware multiple-testing script that compares groups at the Allen Brain Atlas’s 11 structure levels. We excluded the fiber tracts from our statistical comparisons.
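The two-table output can be sketched as follows. This is an illustration only: the field names (`size`, `region`), the function name and the toy cells are assumptions and do not reflect the actual column layout of the DELiVR script.

```python
from collections import Counter


def filter_and_count(cells, min_size, max_size):
    """Return (cell table, region table) for cells within the size limits.

    Assumption: each cell is a dict with hypothetical 'size' (voxels)
    and 'region' (atlas region name) fields.
    """
    cell_table = [c for c in cells if min_size <= c["size"] <= max_size]
    region_table = dict(Counter(c["region"] for c in cell_table))
    return cell_table, region_table


# Made-up cells mapped into atlas space.
cells = [
    {"id": 1, "size": 40, "region": "Region A"},
    {"id": 2, "size": 3, "region": "Region A"},     # below the lower limit
    {"id": 3, "size": 55, "region": "Region B"},
    {"id": 4, "size": 5000, "region": "Region B"},  # above the upper limit
    {"id": 5, "size": 60, "region": "Region A"},
]
cell_table, region_table = filter_and_count(cells, min_size=10, max_size=1000)
```

Filtering before counting means that size outliers, such as debris or merged objects, never contribute to the per-region totals.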
Visualization
For visualizing the cells and regions in atlas space, we used BrainRender20 (v.2) with a modified density plot function46. To visualize the segmented cells in the original image space, we combined the area-wise color code from the Allen Brain Atlas with the 3D segment mask output by the connected component analysis. The result is a cell mask file in which each cell is color coded according to the brain area that it belongs to. This makes it easy to overlay the mask on the original image data in, for example, Fiji and allows for direct visual inspection of the segmentation results. Finally, we used the Allen Institute for Brain Science’s cortical flat-map code with adaptations46 to include our heat maps.
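The color-coding step can be illustrated with a lookup table indexed by the connected-component label of each voxel. The sketch below assumes that every cell label has already been assigned the RGB color of its atlas region; the function name, the labels and the colors are made up for illustration.

```python
import numpy as np


def color_code_labels(labels, label_to_rgb):
    """Turn an integer label volume into an RGB volume.

    labels: integer array, 0 = background, k > 0 = cell k.
    label_to_rgb: hypothetical mapping from cell label to the (R, G, B)
    color of the atlas region that the cell belongs to.
    """
    lut = np.zeros((int(labels.max()) + 1, 3), dtype=np.uint8)
    for label, rgb in label_to_rgb.items():
        lut[label] = rgb
    # Fancy indexing replaces every voxel label by its RGB triplet.
    return lut[labels]


# Toy volume with two cells; the colors are arbitrary placeholders.
labels = np.zeros((2, 3, 3), dtype=np.int64)
labels[0, 0, :2] = 1
labels[1, 2, 2] = 2
rgb = color_code_labels(labels, {1: (255, 0, 0), 2: (0, 128, 255)})
```

Because the lookup is a single indexing operation, the cost scales with the number of voxels rather than with the number of cells.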
DELiVR Docker and Fiji plugin
We packaged the DELiVR pipeline as provided on GitHub into a Docker container (base, nvidia/cuda:11.7.2-runtime-ubuntu22.04) including mBrainAligner14, Ilastik (v.1.4.0b8) and TeraStitcher portable57 (v.1.11.10). The code included Python (v.3.8), PyTorch (v.1.11), PyTorch Lightning (v.2.0.5), Nibabel (v.5.1.0), MONAI (v.1.2.0), SciPy (v.1.8.1), NumPy (v.1.24.4), Pandas (v.1.4.3), imglib2 and cc3d. For details, please see the Docker file on GitHub (https://github.com/erturklab/delivr_cfos/blob/main/Dockerfile).
We wrote the Fiji58 (v.1.52p) plugin in Java (v.1.8), using Maven (v.3.9.5) and Jackson, as a front end. This provides a graphical user interface that compiles a config.json with path names and analysis parameters. Subsequently, the plugin calls the Docker container via a shell command and displays the progress of the pipeline. For a more detailed description, please see our ‘DELiVR handbook’ provided as a Supplementary Note.
Docker for training and Fiji plugin
We packaged the training code as a separate Docker container, which is also accessible via the Fiji plugin. The training plugin accepts annotated patches and trains a model specifically for this dataset. This model can then be imported into the inference pipeline for dataset-specific inference for any cell type. The Fiji training plugin compiles a config_train.json and arranges the file layout for the training Docker. It displays the training progress and shows the final test scores at the end.
Cell culture
C26 and NC26 colon cancer cells were cultured in high-glucose DMEM with pyruvate (Life Technologies, 41966052), supplemented with 10% fetal bovine serum (Sigma-Aldrich, F7524) and 1% penicillin-streptomycin (Thermo Fisher, 15140122) as described previously28,59. Cells were used for transplantation at a confluence of 80%. Cells were trypsinized and counted, and the required cell numbers were suspended in Dulbecco’s PBS (Thermo Fisher, 14190250).
Animal experimentation
Experiments were carried out with male BALB/c mice aged 10–12 weeks. They were purchased from Charles River Laboratories, maintained on a 12-h light–dark cycle and fed a regular unrestricted chow diet. The animal room was set to 20–24 °C temperature and 45–65% humidity. The mice were injected with 1 × 10⁶ C26 or 1.5 × 10⁶ NC26 colon cancer cells28,59 in 50 µl PBS subcutaneously into the right flank. Control mice were injected with 50 µl PBS. From 5 days after cell implantation, mice were monitored daily for tumor growth and body weight. C26 tumor-bearing mice were considered cachectic when they had lost 10–15% of body weight. Mice were killed following deep anesthesia with a mix of ketamine/xylazine, followed by intracardiac perfusion with heparinized PBS (10 U ml⁻¹ heparin) and then with 4% paraformaldehyde (PFA). Tissues and organs were dissected, weighed and post-fixed at 4 °C overnight. Animal experimentation was performed in accordance with European Union directives and the German Animal Welfare Act (Tierschutzgesetz) and approved by the state ethics committee and the Government of Upper Bavaria (ROB-55.2-2532.Vet_02-18-93).
The 6–8-week-old CX3CR1GFP/+ (B6.129P-Cx3cr1tm1Litt/J) mice were purchased from The Jackson Laboratory (strain code 005582). They were deeply anesthetized using a combination of midazolam, medetomidine and fentanyl and intracardially perfused with 15 ml 0.01 M PBS solution (10 U ml⁻¹ heparin) followed by 15 ml 4% PFA solution. The brain was dissected, post-fixed in 4% PFA for 6 h and then processed for staining and clearing following the SHANEL protocol. CX3CR1GFP/+ mice were killed for organ withdrawal (Tötung zu Wissenschaftlichen Zwecken/Organentnahme) in accordance with the German law for animal experiments (Tierschutzgesetz), paragraph 4, section 3.
Statistical analysis
Results from biological replicates were expressed as mean ± s.e.m. Statistical analysis was performed using GraphPad Prism (v.9). Normality was tested using Shapiro–Wilk normality tests. To compare two conditions, unpaired Student’s t-tests or Mann–Whitney U-tests were performed. One-way ANOVA with Sidak’s post hoc test or the Kruskal–Wallis test with Dunn’s multiple comparison test was used to compare three groups. For the c-Fos+ density comparison between areas, we used two-sided t-tests followed by Benjamini–Hochberg multiple-testing correction with a false discovery rate of 0.1, as implemented in the multipletests function of the statsmodels package (statsmodels.stats.multitest.multipletests; https://www.statsmodels.org/dev/generated/statsmodels.stats.multitest.multipletests.html).
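For readers unfamiliar with the Benjamini–Hochberg step-up procedure, the following dependency-free sketch shows the decision rule at a false discovery rate of 0.1. It is a didactic re-implementation with a hypothetical function name and made-up P values, not the statsmodels code used for the analysis.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return one reject/keep decision per P value (step-up procedure).

    With m tests and P values sorted in ascending order, find the largest
    rank k with p_(k) <= k / m * fdr and reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


# Made-up P values for four brain regions.
decisions = benjamini_hochberg([0.01, 0.04, 0.03, 0.5], fdr=0.1)
print(decisions)  # [True, True, True, False]
```

The step-up character matters: a P value that misses its own rank threshold is still rejected if a larger P value meets the threshold at a higher rank.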
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
