The Biomedical Image Analysis Group at the University of North Carolina at Chapel Hill (UNC-biag) focuses on the design of computational algorithms to extract quantitative measures from biomedical data. While our emphasis is on image data (e.g., obtained via magnetic resonance imaging, computed tomography, or microscopy), our analyses also include clinical measures and genomics. The group is led by Marc Niethammer.
Our work is highly interdisciplinary and includes collaborators from a wide range of disciplines such as statistics, applied mathematics, radiology, surgery, and epidemiology. Consequently, we also publish in venues ranging from clinical journals to medical imaging conferences (such as MICCAI and IPMI), computer vision conferences (such as CVPR and ECCV), and machine learning conferences (such as NeurIPS, ICML, and AAAI).
Purpose: Accurate deformable registration between computed tomography (CT) and cone-beam CT (CBCT) images of pancreatic cancer patients treated with high biologically effective radiation doses is essential to assess changes in organ-at-risk (OAR) locations and shapes and to compute delivered dose. This study describes the development and evaluation of a deep-learning (DL) registration model to predict OAR segmentations on the CBCT derived from segmentations on the planning CT. Methods: The DL model is trained with CT-CBCT image pairs of the same patient, on which OAR segmentations of the small bowel, stomach, and duodenum have been manually drawn. A transformation map is obtained, which serves to warp the CT image and segmentations. In addition to a regularity loss and an image similarity loss, an OAR segmentation similarity loss is also used during training, which penalizes the mismatch between warped CT segmentations and manually drawn CBCT segmentations. At test time, CBCT segmentations are not required as they are instead obtained from the warped CT segmentations. In an IRB-approved retrospective study, a dataset consisting of 40 patients, each with one planning CT and two CBCT scans, was used in a fivefold cross-validation to train and evaluate the model, using physician-drawn segmentations as reference. Images were preprocessed to remove gas pockets. Network performance was compared to two intensity-based deformable registration algorithms (large deformation diffeomorphic metric mapping [LDDMM] and multimodality free-form [MMFF]) as baselines. Evaluated metrics were Dice similarity coefficient (DSC), change in OAR volume within a volume of interest (enclosing the low-dose PTV plus 1 cm margin) from planning CT to CBCT, and maximum dose to 5 cm^3 of the OAR [D(5cc)]. Results: Processing time for one CT-CBCT registration with the DL model at test time was less than 5 seconds on a GPU-based system, compared to an average of 30 minutes for LDDMM optimization. For both small bowel and stomach/duodenum, the DL model yielded larger median DSC and smaller interquartile variation than either MMFF (paired t-test P < 10^-4 for both types of OARs) or LDDMM (P < 10^-3 and P = 0.03, respectively). Root-mean-square deviation (RMSD) of DL-predicted change in small bowel volume relative to reference was 22% less than for MMFF (P = 0.007). RMSD of DL-predicted stomach/duodenum volume change was 28% less than for LDDMM (P = 0.0001). RMSD of DL-predicted D(5cc) in small bowel was 39% less than for MMFF (P = 0.001); in stomach/duodenum, RMSD of DL-predicted D(5cc) was 18% less than for LDDMM (P < 10^-3). Conclusions: The proposed deep network CT-to-CBCT deformable registration model shows improved segmentation accuracy compared to intensity-based algorithms and achieves an order-of-magnitude reduction in processing time.
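To make the three-term training objective concrete, here is a minimal sketch of a loss of the kind described above: an image similarity term, an OAR segmentation similarity term (used only during training), and a regularity term on the transformation. The function names, the MSE similarity, the first-order smoothness penalty, and the weights are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def soft_dice_loss(warped_seg, target_seg, eps=1e-6):
    # 1 - soft Dice overlap between warped CT segmentations and CBCT segmentations
    inter = (warped_seg * target_seg).sum()
    denom = warped_seg.sum() + target_seg.sum()
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def smoothness_loss(displacement):
    # first-order finite-difference penalty on a (B, 3, D, H, W) displacement field
    dz = displacement[:, :, 1:, :, :] - displacement[:, :, :-1, :, :]
    dy = displacement[:, :, :, 1:, :] - displacement[:, :, :, :-1, :]
    dx = displacement[:, :, :, :, 1:] - displacement[:, :, :, :, :-1]
    return (dz ** 2).mean() + (dy ** 2).mean() + (dx ** 2).mean()

def registration_loss(warped_ct, cbct, warped_seg, cbct_seg, displacement,
                      w_sim=1.0, w_seg=1.0, w_reg=0.1):
    sim = F.mse_loss(warped_ct, cbct)            # image similarity (MSE as a stand-in)
    seg = soft_dice_loss(warped_seg, cbct_seg)   # OAR segmentation similarity (training only)
    reg = smoothness_loss(displacement)          # transform regularity
    return w_sim * sim + w_seg * seg + w_reg * reg
```

At test time only the predicted transformation is needed: the CT segmentations are warped to CBCT space, so the segmentation term (and hence manual CBCT contours) is not required.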
Transport processes are ubiquitous. They are, for example, at the heart of optical flow approaches or of perfusion imaging, where blood transport is assessed, most commonly by injecting a tracer. An advection-diffusion equation is widely used to describe these transport phenomena. Our goal is to estimate the underlying physics of advection-diffusion equations, expressed as velocity and diffusion tensor fields. We propose a learning framework (YETI) building on an auto-encoder structure between 2D and 3D image time-series, which incorporates the advection-diffusion model. To help with identifiability, we develop an advection-diffusion simulator which allows pre-training of our model by supervised learning using the velocity and diffusion tensor fields. Instead of directly learning these velocity and diffusion tensor fields, we introduce representations that ensure incompressible flow and symmetric positive semi-definite diffusion fields and demonstrate the additional benefits of these representations in improving estimation accuracy. We further use transfer learning to apply YETI to a public brain magnetic resonance (MR) perfusion dataset of stroke patients and show its ability to successfully distinguish stroke lesions from normal brain regions via the estimated velocity and diffusion tensor fields.
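For reference, the advection-diffusion model underlying this estimation problem can be written in a standard form as follows (generic notation, not necessarily the paper's; $c$ denotes the transported quantity, e.g., tracer concentration):

```latex
\frac{\partial c(\mathbf{x},t)}{\partial t}
  = \nabla \cdot \bigl( D(\mathbf{x}) \, \nabla c(\mathbf{x},t) \bigr)
  - \mathbf{v}(\mathbf{x}) \cdot \nabla c(\mathbf{x},t),
\qquad \nabla \cdot \mathbf{v} = 0, \qquad D(\mathbf{x}) \succeq 0,
```

where $\mathbf{v}$ is the (incompressible) velocity field and $D$ the symmetric positive semi-definite diffusion tensor field to be estimated.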
Minimizing cross-entropy over the softmax scores of a linear map composed with a high-capacity encoder is arguably the most popular choice for training neural networks on supervised learning tasks. However, recent works show that one can directly optimize the encoder instead, to obtain equally (or even more) discriminative representations via a supervised variant of a contrastive objective. In this work, we address the question of whether there are fundamental differences in the sought-for representation geometry in the output space of the encoder at minimal loss. Specifically, we prove, under mild assumptions, that both losses attain their minimum once the representations of each class collapse to the vertices of a regular simplex inscribed in a hypersphere. We provide empirical evidence that this configuration is attained in practice and that reaching a close-to-optimal state typically indicates good generalization performance. Yet, the two losses show remarkably different optimization behavior. The number of iterations required to perfectly fit the data scales superlinearly with the amount of randomly flipped labels for the supervised contrastive loss. This is in contrast to the approximately linear scaling previously reported for networks trained with cross-entropy.
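Concretely, the minimizing geometry referred to above places the $K$ per-class representations at the vertices of a regular simplex inscribed in the unit hypersphere; a standard characterization of that configuration (stated here in generic notation) is:

```latex
\|\mu_k\| = 1 \ \ \text{for all } k,
\qquad
\langle \mu_j, \mu_k \rangle = -\frac{1}{K-1} \ \ \text{for all } j \neq k,
```

where $\mu_k$ denotes the (collapsed) representation of class $k$; all vertices are equidistant and maximally spread on the sphere.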
Background: Non-human primates are commonly used in neuroimaging research, for which general anaesthesia or sedation is typically required for data acquisition. In this analysis, the cumulative effects of exposure to ketamine, Telazol® (tiletamine and zolazepam), and the inhaled anaesthetic isoflurane on early brain development were evaluated in two independent cohorts of typically developing rhesus macaques. Methods: Diffusion MRI scans were analysed from 43 rhesus macaques (20 females and 23 males) at either 12 or 18 months of age from two separate primate colonies. Results: Significant, widespread reductions in fractional anisotropy with corresponding increases in axial, mean, and radial diffusivity were observed across the brain as a result of repeated anaesthesia exposures. These effects were dose dependent and remained after accounting for age and sex at the time of exposure in a generalised linear model. Decreases of up to 40% in fractional anisotropy were detected in some brain regions. Conclusions: Multiple exposures to commonly used anaesthetics were associated with marked changes in white matter microstructure. This study is amongst the first to examine the effects of clinically relevant anaesthesia exposures on the developing primate brain. It will be important to examine if, or to what degree, the maturing brain can recover from these white matter changes.
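As a purely illustrative sketch of the kind of adjusted analysis described above (not the study's analysis code; the column names, input file, and default Gaussian GLM family are assumptions), such a model could be fit as:

```python
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical table: one row per animal/region with a diffusion metric (e.g., FA),
# number of anaesthesia exposures, age at scan, and sex
df = pd.read_csv("dti_regional_metrics.csv")

# generalised linear model of FA on exposure count, adjusting for age and sex
model = smf.glm("fa ~ n_exposures + age_months + sex", data=df).fit()
print(model.summary())
```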
Learning maps between data samples is fundamental. Applications range from representation learning, image translation, and generative modeling to the estimation of spatial deformations. Such maps relate feature vectors, or map between feature spaces. Well-behaved maps should be regular, which can be imposed explicitly or may emanate from the data itself. We explore what induces regularity for spatial transformations, e.g., when computing image registrations. Classical optimization-based models compute maps between pairs of samples and rely on an appropriate regularizer for well-posedness. Recent deep learning approaches have attempted to avoid using such regularizers altogether by relying on the sample population instead. We explore whether it is possible to obtain spatial regularity using an inverse consistency loss only and elucidate what explains map regularity in such a context. We find that deep networks combined with an inverse consistency loss and randomized off-grid interpolation yield well-behaved, approximately diffeomorphic spatial transformations. Despite the simplicity of this approach, our experiments present compelling evidence, on both synthetic and real data, that regular maps can be obtained without carefully tuned explicit regularizers, while achieving competitive registration performance.
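A minimal sketch of the inverse consistency idea follows; the names are illustrative, the maps here are toy callables standing in for the two predicted transformations, and the full training objective would additionally include an image similarity term, which is omitted.

```python
import torch

def inverse_consistency_loss(phi_AB, phi_BA, coords):
    # penalize deviation of the round-trip map A -> B -> A from the identity,
    # evaluated at randomly sampled (off-grid) coordinates
    roundtrip = phi_BA(phi_AB(coords))
    return ((roundtrip - coords) ** 2).mean()

# toy maps standing in for the two predicted transformations of a registration network
phi_AB = lambda x: x + 0.01 * torch.sin(2 * torch.pi * x)
phi_BA = lambda x: x - 0.01 * torch.sin(2 * torch.pi * x)

coords = torch.rand(1024, 3)  # random off-grid sample points in [0, 1]^3
print(inverse_consistency_loss(phi_AB, phi_BA, coords))
```

Evaluating the penalty at randomly jittered, off-grid points (rather than only at voxel centers) is what discourages degenerate, folding maps in this setup.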