| # | First Name | Last Name | Dissertation | Year | Current Position | Links |
|---|------------|-----------|--------------|------|------------------|-------|
| 12 | Peirong | Liu | Perfusion Imaging via Mass Transport | 2023 | Postdoc at Massachusetts General Hospital | (webpage) (Google Scholar) (LinkedIn) |
| 11 | Yifeng | Shi | Representation Learning with Additional Structures | 2023 | Waymo | (Google Scholar) (LinkedIn) |
| 10 | Zhenlin | Xu | Towards Deep Visual Learning in the Wild: Data-efficiency, Robustness and Generalization | 2022 | Amazon, AWS AI Lab | (webpage) (Google Scholar) (LinkedIn) |
| 9 | Zhengyang | Shen | Accurate, Fast and Controllable Image and Point Cloud Registration | 2022 | | (webpage) (Google Scholar) (LinkedIn) |
| 8 | Zhipeng | Ding | Toward Solving Groupwise Medical Image Analysis Problems with Deep Learning | 2021 | Amazon | (Google Scholar) (LinkedIn) |
| 7 | Xu | Han | Registration of Images with Pathologies | 2020 | | (Google Scholar) (LinkedIn) |
| 6 | Heather | Couture | Discriminative Representations for Heterogeneous Images and Multimodal Data | 2019 | Pixel Scientia | (webpage) (Google Scholar) (LinkedIn) |
| 5 | Istvan | Csapo | Registration and Analysis of Developmental Image Sequences | 2018 | | (Google Scholar) (LinkedIn) |
| 4 | Xiao | Yang | Uncertainty Quantification, Image Synthesis and Deformation Prediction for Image Registration | 2017 | Bytedance | (Google Scholar) (LinkedIn) |
| 3 | Yi | Hong | Image and Shape Analysis for Spatiotemporal Data | 2016 | Tenure Track Associate Professor at Shanghai Jiao Tong University | (webpage) (Google Scholar) |
| 2 | Tian | Cao | Coupled Dictionary Learning for Image Analysis | 2016 | Apple | (Google Scholar) (LinkedIn) |
| 1 | Liang | Shan | Automatic Localized Analysis of Longitudinal Cartilage Changes | 2014 | | (Google Scholar) (LinkedIn) |
Deep learning approaches have achieved revolutionary performance improvements on many computer vision tasks, from understanding natural images and videos to analyzing medical images. Besides building more complex deep neural networks (DNNs) and collecting giant annotated datasets to obtain performance gains, more attention is now being focused on the shortcomings of DNNs. As recent research has shown, even when trained on millions of labeled samples, deep neural networks may still lack robustness to domain shift, small perturbations, and adversarial examples. On the other hand, in many real-world scenarios, e.g., in clinical applications, the number of labeled training samples is significantly smaller than for large existing deep learning benchmarks. Moreover, current deep learning models cannot generalize to samples with novel combinations of seen elementary concepts. Therefore, in this thesis, I address these critical needs to make modern deep learning approaches applicable in the real world, with a focus on computer vision tasks. Specifically, I focus on data efficiency, robustness, and generalization. I propose (1) DeepAtlas, a joint learning framework for image registration and segmentation that can learn DNNs for both tasks from unlabeled images and a few labeled images; (2) RandConv, a data augmentation technique that applies a random convolution layer to images during training to improve a DNN's generalization under domain shift and its robustness to image corruptions; and (3) CompGen, a comprehensive study of compositional generalization in unsupervised representation learning, covering disentanglement and emergent-language models.
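The RandConv idea can be illustrated with a minimal NumPy sketch: a fresh random convolution filter is sampled at every call, so the network never sees the same low-level texture statistics twice, while the blended output keeps the global shape of the image. The kernel size, kernel scaling, and blend weight `mix` below are illustrative choices, not the settings used in the thesis.

```python
import numpy as np

def rand_conv(image, kernel_size=3, mix=0.5, rng=None):
    """Randomly convolve a 2-D image to perturb local texture while
    keeping global shape (the RandConv augmentation idea).
    `mix` blends the convolved image with the original."""
    rng = np.random.default_rng(rng)
    k = kernel_size
    # Sample a fresh zero-mean random kernel for every call.
    kernel = rng.normal(0.0, 1.0 / k, size=(k, k))
    pad = k // 2
    padded = np.pad(image, pad, mode="reflect")
    h, w = image.shape
    out = np.zeros_like(image, dtype=float)
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return mix * image + (1.0 - mix) * out

# During training, each minibatch would see a different random filter:
img = np.random.default_rng(0).random((8, 8))
aug = rand_conv(img, kernel_size=3, mix=0.5, rng=1)
```

With `mix=1.0` the augmentation degenerates to the identity, which is a handy sanity check.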
Registration is the process of establishing spatial correspondences between two objects. Many downstream tasks, e.g., image analysis and shape animation, can make use of these spatial correspondences. A variety of registration approaches have been developed over the last decades, but only recently have registration approaches been developed that can make use of, and easily process, the large data samples of the big-data era. On the one hand, traditional optimization-based approaches are too slow and cannot take advantage of very large datasets. On the other hand, registration users expect more controllable and accurate solutions, since most downstream tasks, e.g., facial animation and 3D reconstruction, increasingly rely on highly precise spatial correspondences. In recent years, deep-network registration approaches have become popular, as learning-based approaches are fast and can benefit from large-scale data during network training. However, making such deep-learning-based approaches accurate and controllable is still a challenging problem that is far from being completely solved. This thesis explores fast, accurate, and controllable solutions for image and point cloud registration. Specifically, for image registration, we first improve the accuracy of deep-learning-based approaches by introducing a general framework that combines affine and non-parametric registration to capture both global and local deformation. We then design a more controllable image registration approach in which image regions can be regularized differently according to their local attributes. For point cloud registration, existing works are limited to small-scale problems, can hardly handle complicated transformations, or are slow. We therefore develop fast, accurate, and controllable solutions for large-scale real-world registration problems by integrating optimal transport with deep geometric learning.
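The two-stage design, affine pre-alignment followed by non-parametric refinement, can be sketched as a composition of transforms on point coordinates. The function below is a toy simplification: a real method predicts a dense displacement field and resamples images, whereas here the displacement is simply given per point.

```python
import numpy as np

def compose_affine_nonparametric(points, A, b, displacement):
    """Map source points through an affine transform (global alignment)
    followed by a per-point displacement (local deformation).
    `displacement` holds one 2-D offset per point, a hypothetical
    stand-in for an interpolated dense displacement field."""
    affine_warped = points @ A.T + b      # global: rotation/scale/shear + shift
    return affine_warped + displacement   # local: non-parametric refinement

# Identity affine plus zero displacement leaves points unchanged.
pts = np.array([[0.0, 0.0], [1.0, 2.0]])
out = compose_affine_nonparametric(pts, np.eye(2), np.zeros(2), np.zeros_like(pts))
```

A pure translation, for instance, is recovered by setting `A` to the identity and `b` to the shift.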
Image regression, atlas building, and multi-atlas segmentation are three groupwise medical image analysis problems that extend image registration. These three problems are challenging because of the difficulty of establishing spatial correspondences and the associated high computational cost. Specifically, most previous methods are computationally costly because they are optimization-based. Hence, fast and accurate approaches are highly desirable. This dissertation addresses the following problems concerning the three groupwise medical image analysis problems: (1) fast and reliable geodesic regression for image time series; (2) joint atlas building and diffeomorphic registration learning; (3) efficient and accurate label fusion for multi-atlas segmentation; and (4) spatially localized probability calibration for semantic segmentation networks. Specifically, the contributions of this thesis are as follows: (1) A fast predictive simple geodesic regression approach is proposed to capture the frequently subtle deformation trends of longitudinal image data. (2) A new deep learning model is developed that jointly builds an atlas and learns the diffeomorphic registrations in both the atlas-to-image and the image-to-atlas directions. (3) A novel deep learning label fusion method (VoteNet) that locally identifies sets of trustworthy atlases is presented, and several ways to improve performance within the VoteNet-based multi-atlas segmentation framework are explored. (4) A learning-based local temperature scaling method is designed that predicts a separate temperature scale for each pixel/voxel. The resulting post-processing approach is accuracy-preserving and is theoretically guaranteed to be effective.
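Contribution (4), local temperature scaling, is easy to sketch: each pixel's logits are divided by a pixel-specific positive temperature before the softmax. Since scaling by a positive scalar does not change the per-pixel argmax, the hard segmentation, and hence its accuracy, is preserved. The sketch below assumes the temperature map is already given; in the thesis it is predicted by a network.

```python
import numpy as np

def softmax(z, axis=0):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def local_temperature_scaling(logits, temperature):
    """Calibrate segmentation probabilities by dividing the logits at
    each pixel by a pixel-specific positive temperature before the
    softmax.  logits: (C, H, W); temperature: (H, W), positive.
    Positive scaling preserves the per-pixel argmax, so the hard
    segmentation is unchanged."""
    return softmax(logits / temperature[None], axis=0)

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 4))
temps = rng.uniform(0.5, 2.0, size=(4, 4))
probs = local_temperature_scaling(logits, temps)
```

Temperatures above 1 soften the per-pixel distribution; temperatures below 1 sharpen it.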
Registration is one of the fundamental tasks in medical image analysis. It is an essential step for many applications to establish spatial correspondences between two images. However, image registration in the presence of pathologies is challenging due to tissue appearance changes and missing correspondences caused by the pathologies. For example, for patients with brain tumors, the tissue is often displaced by the tumors, creating larger deformations than those observed in a healthy brain. Moreover, fast and accurate image registration in the presence of pathologies is especially desirable for immediate assessment of registration results. This dissertation addresses the following problems concerning the registration of images with pathologies: (1) efficient registration between an image with pathologies and a common control atlas; (2) patient-specific longitudinal registration between pre-operative and post-recurrence images for patients with glioblastoma; (3) automatic brain extraction for images with pathologies; and (4) fast predictive registration of images with pathologies to an atlas. The contributions presented in this dissertation are as follows: (1) I develop a joint PCA/image-reconstruction approach for images with pathologies. The model estimates quasi-normal image appearance from the image with pathologies and uses the reconstructed quasi-normal image for registration. It improves registration accuracy compared to directly using the images with pathologies, while not requiring a segmentation of the pathological region. (2) I propose a patient-specific registration framework for the longitudinal study of tumor recurrence in patients diagnosed with glioblastoma. It models the healthy tissue appearance for each patient in the individual space, thereby improving registration accuracy. (3) I develop a brain extraction method for images with pathologies by jointly modeling healthy brain tissue, pathologies, and non-brain volume.
(4) I design a joint registration and reconstruction deep learning model which learns an appearance mapping from the image with pathologies to atlas appearance while simultaneously predicting the transformation to atlas space. The network disentangles the spatial variation from the appearance changes caused by the pathology.
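The quasi-normal idea behind contribution (1) can be illustrated with plain PCA: project the (vectorized) pathological image onto a low-rank subspace learned from normal-appearance images and register the projection instead. This is only a sketch of the principle; the dissertation's joint PCA/image-reconstruction model is more involved.

```python
import numpy as np

def quasi_normal_reconstruction(image_vec, normal_images, rank=2):
    """Project a vectorized pathological image onto a low-rank PCA
    basis built from normal-appearance images, yielding a
    'quasi-normal' estimate to register instead of the pathological
    image itself.  A plain PCA projection, not the joint
    PCA/reconstruction model of the dissertation."""
    mean = normal_images.mean(axis=0)
    X = normal_images - mean
    # Principal directions of the normal population.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:rank]                      # (rank, n_voxels)
    coeffs = basis @ (image_vec - mean)
    return mean + basis.T @ coeffs

# Toy data: four 'normal' images spanning a one-dimensional subspace.
mean0 = np.ones(5)
v = np.arange(5.0)
normals = np.stack([mean0 + t * v for t in (-1.0, 0.0, 1.0, 2.0)])
in_span = mean0 + 3.0 * v                  # already normal-looking
recon = quasi_normal_reconstruction(in_span, normals, rank=1)
```

An image that already lies in the normal subspace is reproduced exactly, while pathological appearance outside the subspace is suppressed.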
Histology images of tumor tissue are an important diagnostic and prognostic tool for pathologists. Recently developed molecular methods group tumors into subtypes to further guide treatment decisions, but they are not routinely performed on all patients. A lower-cost and repeatable method to predict tumor subtypes from histology could bring benefits to more cancer patients. Further, combining imaging and genomic data types provides a more complete view of the tumor and may improve prognostication and treatment decisions. While molecular and genomic methods capture the state of a small sample of the tumor, histological image analysis provides a spatial view and can identify multiple subtypes in a single tumor. This intra-tumor heterogeneity has yet to be fully understood, and its quantification may lead to future insights into tumor progression. In this work, I develop methods to learn appropriate features directly from images using dictionary learning or deep learning. I use multiple instance learning to account for intra-tumor variations in subtype during training, improving subtype predictions and providing insights into tumor heterogeneity. I also integrate image and genomic features to learn a projection to a shared space that is also discriminative. This method can be used for cross-modal classification or to improve predictions from images by also learning from genomic data during training, even if only image data is available at test time.
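The multiple instance learning step can be sketched with max pooling over patch-level subtype probabilities: the slide (bag) prediction comes from its most confident patches, and the mix of patch-level labels gives a crude heterogeneity estimate. The aggregation rule here is a generic MIL baseline, not necessarily the one used in this work.

```python
import numpy as np

def bag_prediction(instance_probs):
    """Aggregate per-patch subtype probabilities into a slide-level
    prediction via max pooling, and quantify intra-tumor heterogeneity
    as the proportion of patches assigned to each subtype.
    instance_probs: (n_patches, n_classes)."""
    slide_probs = instance_probs.max(axis=0)          # most confident patch per class
    patch_labels = instance_probs.argmax(axis=1)      # hard label per patch
    counts = np.bincount(patch_labels, minlength=instance_probs.shape[1])
    heterogeneity = counts / counts.sum()             # subtype mixture in the slide
    return slide_probs, heterogeneity

probs = np.array([[0.9, 0.1],   # patch favors subtype 0
                  [0.2, 0.8],   # patch favors subtype 1
                  [0.7, 0.3]])  # patch favors subtype 0
slide_probs, heterogeneity = bag_prediction(probs)
```

A heterogeneity vector far from one-hot flags a slide containing multiple subtypes.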
Mapping images into the same anatomical coordinate system via image registration is a fundamental step when studying physiological processes, such as brain development. Standard registration methods are applicable when biological structures are mapped to the same anatomy and their appearance remains constant across the images or changes spatially uniformly. However, image sequences of animal or human development often do not follow these assumptions, and thus standard registration methods are unsuited for their analysis. In response, this dissertation tackles the problems of i) registering developmental image sequences with spatially non-uniform appearance change and ii) reconstructing a coherent 3D volume from serially sectioned images with non-matching anatomies between the sections. There are three major contributions presented in this dissertation. First, I develop a similarity metric that incorporates a time-dependent appearance model into the registration framework. The proposed metric allows for longitudinal image registration in the presence of spatially non-uniform appearance change over time—a common medical imaging problem for longitudinal magnetic resonance images of the neonatal brain. Next, a method is introduced for registering longitudinal developmental datasets with missing time points using an appearance atlas built from a population. The proposed method is applied to a longitudinal study of young macaque monkeys with incomplete image sequences. The final contribution is a template-free registration method to reconstruct images of serially sectioned biological samples into a coherent 3D volume. The method is applied to confocal fluorescence microscopy images of serially sectioned embryonic mouse brains.
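The first contribution, a similarity metric with a time-dependent appearance model, can be sketched as sum-of-squared-differences computed after mapping the moving image's intensities through a model m(t)·I + b(t). The linear, spatially constant drift below is a hypothetical stand-in; the dissertation's model handles spatially non-uniform change.

```python
import numpy as np

def appearance_adjusted_ssd(moving, target, t, gain, offset):
    """Sum-of-squared-differences after passing the moving image
    through a time-dependent linear appearance model m(t)*I + b(t).
    Scalar gain/offset are a sketch; a spatially varying model would
    make them images instead."""
    m = 1.0 + gain * t          # hypothetical linear contrast drift
    b = offset * t              # hypothetical linear brightness drift
    adjusted = m * moving + b
    return float(((adjusted - target) ** 2).sum())

# Simulate a later time point whose appearance drifted per the model.
rng = np.random.default_rng(0)
I0 = rng.random((4, 4))
gain, offset, t = 0.5, 0.2, 2.0
I1 = (1.0 + gain * t) * I0 + offset * t
```

Plain SSD between `I0` and `I1` is large even though no deformation occurred; the appearance-adjusted metric correctly reports a perfect match.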
Image registration is essential for medical image analysis to provide spatial correspondences. It is a difficult problem due to the modeling complexity of image appearance and the computational complexity of deformable registration models. Thus, several techniques are needed: uncertainty measurements over the high-dimensional parameter space of the registration methods for the evaluation of registration results; registration methods for registering healthy medical images to pathological images with large appearance changes; and fast registration prediction techniques for uni-modal and multi-modal images. This dissertation addresses these problems and makes the following contributions: 1) A framework for uncertainty quantification of image registration results is proposed. The proposed method for uncertainty quantification utilizes a low-rank Hessian approximation to evaluate the variance/covariance of the variational Gaussian distribution of the registration parameters. The method requires significantly less storage and computation time than computing the Hessian via finite differences while achieving excellent approximation accuracy, facilitating the computation of the variational approximation; 2) An image synthesis deep network for pathological image registration is developed. The network transforms a pathological image into a ‘quasi-normal’ image, making registrations more accurate; 3) A patch-based deep learning framework for registration parameter prediction using image appearance only is created. The network is capable of accurately predicting the initial momentum of the Large Deformation Diffeomorphic Metric Mapping (LDDMM) model for both uni-modal and multi-modal registration problems, while increasing the registration speed by at least an order of magnitude compared with optimization-based approaches and maintaining the theoretical properties of LDDMM.
Applications of the methods include 1) Uncertainty quantification of LDDMM for 2D and 3D medical image registrations, which could be used for uncertainty-based image smoothing and subsequent analysis; 2) Quasi-normal image synthesis for the registration of brain images with tumors with potential extensions to other image registration problems with pathologies and 3) deformation prediction for various brain datasets and T1w/T2w magnetic resonance images (MRI), which could be incorporated into other medical image analysis tasks such as fast multi-atlas image segmentation, fast geodesic image regression, fast atlas construction and fast user-interactive registration refinement.
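The low-rank uncertainty idea in contribution 1) can be sketched in a few lines: approximate the covariance (the inverse Hessian) of the Gaussian posterior using only the top eigenpairs of the Hessian, treating the remaining directions as having a constant curvature `tau`. This generic eigendecomposition sketch stands in for the thesis's actual construction, which avoids ever forming the full Hessian.

```python
import numpy as np

def lowrank_covariance_diag(H, rank, tau=1.0):
    """Approximate the diagonal of the covariance (inverse Hessian) of
    a Gaussian posterior from the top-`rank` eigenpairs of H:
        Cov ~= U diag(1/lam) U^T + (1/tau) (I - U U^T),
    where `tau` is an assumed curvature for the discarded directions."""
    lam, U = np.linalg.eigh(H)
    idx = np.argsort(lam)[::-1][:rank]        # largest-curvature directions
    lam, U = lam[idx], U[:, idx]
    low = (U ** 2) @ (1.0 / lam)              # diag of U diag(1/lam) U^T
    residual = (1.0 / tau) * (1.0 - (U ** 2).sum(axis=1))
    return low + residual

# With full rank, the exact inverse diagonal of a diagonal Hessian is recovered.
H = np.diag([4.0, 2.0, 1.0])
var_full = lowrank_covariance_diag(H, rank=3)
```

High curvature (large eigenvalue) means low variance, i.e., a well-constrained registration parameter; truncated directions fall back to the `1/tau` prior variance.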
In analyzing brain development or identifying disease, it is important to understand anatomical age-related changes and shape differences. Data for these studies is frequently spatiotemporal and collected from normal and/or abnormal subjects. However, images and shapes over time often have complex structures and are best treated as elements of non-Euclidean spaces. This dissertation tackles the problems of uncovering time-varying changes and statistical group differences in image or shape time-series. There are three major contributions: 1) a framework of parametric regression models on manifolds to capture time-varying changes. These include a metamorphic geodesic regression approach for image time-series and standard geodesic regression, time-warped geodesic regression, and cubic spline regression on the Grassmann manifold; 2) a spatiotemporal statistical atlas approach, which augments a commonly used atlas such as the median with measures of data variance via a weighted functional boxplot; 3) hypothesis testing for shape analysis to detect group differences between populations. The proposed method for cross-sectional data uses shape ordering and hence does not require dense shape correspondences or strong distributional assumptions on the data. For longitudinal data, hypothesis testing is performed on shape trajectories which are estimated from individual subjects. Applications of these methods include 1) capturing brain development and degeneration; 2) revealing growth patterns in pediatric upper airways and the scoring of airway abnormalities; 3) detecting group differences in longitudinal corpus callosum shapes of subjects with dementia versus normal controls.
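As a small concrete example of working on a manifold rather than in Euclidean space, a geodesic on the unit sphere can be evaluated in closed form via the exponential map. Parametric regression models such as geodesic regression fit the starting point p0 and initial velocity v of such a curve to time-stamped data; here we only evaluate a given geodesic.

```python
import numpy as np

def sphere_geodesic(p0, v, t):
    """Evaluate the geodesic on the unit sphere starting at p0 with
    initial velocity v (a tangent vector at p0) at time t, via the
    exponential map:
        Exp_{p0}(t v) = cos(t|v|) p0 + sin(t|v|) v/|v|."""
    speed = np.linalg.norm(v)
    if speed == 0:
        return p0.copy()
    return np.cos(t * speed) * p0 + np.sin(t * speed) * (v / speed)

p0 = np.array([0.0, 0.0, 1.0])   # north pole
v = np.array([1.0, 0.0, 0.0])    # unit tangent at p0
quarter = sphere_geodesic(p0, v, np.pi / 2)
```

Unlike a straight line through the data points, the curve never leaves the sphere, which is exactly why manifold-valued regression is needed for shape data.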
Modern imaging technologies provide different ways to visualize various objects, ranging from molecules in a cell to the tissue of a human body. Images from different imaging modalities reveal distinct information about these objects. Thus, a common problem in image analysis is how to relate different information about the objects, for instance, relating protein locations from fluorescence microscopy to protein structures from electron microscopy. These problems are challenging due to the difficulty of modeling the relationship between the information from different modalities. In this dissertation, a coupled dictionary learning based image analogy method is first introduced to synthesize images in one modality from images in another. As a result, multi-modal registration (for example, registration between correlative microscopy images) is simplified to a mono-modal problem using my method. Furthermore, a semi-coupled dictionary learning based framework is proposed to estimate deformations from image appearance. Moreover, a coupled dictionary learning method is explored to capture the relationship between GTPase activations and cell protrusions and retractions. Finally, a probabilistic model is proposed for robust coupled dictionary learning that addresses learning a coupled dictionary from non-corresponding data. This method discriminates between corresponding and non-corresponding data, resulting in a “clean” coupled dictionary by removing non-corresponding data during the learning process.
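The coupled-dictionary image analogy can be sketched as follows: two dictionaries, one per modality, share the same sparse code, so a patch observed in one modality can be synthesized in the other through that shared code. For brevity the sketch solves for the code by least squares; an actual coupled dictionary learning method learns both dictionaries and enforces sparsity on the code.

```python
import numpy as np

def cross_modal_synthesis(patch1, D1, D2):
    """Coupled-dictionary image analogy in miniature: estimate the
    shared code of a modality-1 patch against dictionary D1, then
    synthesize the corresponding modality-2 patch with D2.  Plain
    least squares replaces sparse coding to keep the sketch short."""
    code, *_ = np.linalg.lstsq(D1, patch1, rcond=None)
    return D2 @ code

# Toy coupled dictionaries (atoms as columns) and a shared code.
D1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
D2 = np.array([[1.0, 1.0], [0.0, 2.0], [3.0, 0.0], [1.0, 0.0]])
code_true = np.array([2.0, 3.0])
patch2 = cross_modal_synthesis(D1 @ code_true, D1, D2)
```

Once images can be synthesized across modalities this way, a multi-modal registration reduces to a mono-modal one, as the abstract describes.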
Osteoarthritis (OA) is the most common form of arthritis; it is characterized by the loss of cartilage. Automatic quantitative methods are needed to screen large image databases to assess changes in cartilage morphology. This dissertation presents an automatic method to quantitatively analyze longitudinal cartilage changes from knee magnetic resonance (MR) images. A novel, robust automatic cartilage segmentation method is proposed to overcome the limitations of existing cartilage segmentation methods. The dissertation presents a new and general convex three-label segmentation approach to ensure the separation of touching objects, i.e., femoral and tibial cartilage. Anisotropic spatial regularization is introduced to avoid the over-regularization of thin objects caused by isotropic regularization. Temporal regularization is further incorporated to encourage temporally consistent segmentations across time points for longitudinal data. The state-of-the-art analysis of cartilage changes relies on a subdivision of the cartilage, which is coarse and purely geometric, whereas cartilage loss is a local thinning process that exhibits spatial non-uniformity. A novel statistical analysis method is proposed to study localized longitudinal cartilage thickness changes by establishing spatial correspondences across time and between subjects. The method is general and can be applied to other non-uniform morphological changes in other diseases.
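The data term of a three-label segmentation (background, femoral cartilage, tibial cartilage) can be sketched as a per-voxel cost minimization over candidate intensity models. The convex relaxation with anisotropic spatial and temporal regularization, which is the actual contribution, is omitted here, and the mean intensities are illustrative.

```python
import numpy as np

def three_label_data_term(image, means):
    """Per-voxel three-label assignment by the data term alone: pick
    the label (0 = background, 1 = femoral, 2 = tibial) whose mean
    intensity best matches each voxel.  The dissertation's convex
    formulation adds anisotropic spatial and temporal regularization
    on top of such a data term."""
    costs = (image[..., None] - np.asarray(means)) ** 2   # (..., 3)
    return costs.argmin(axis=-1)

# Three voxels near the three hypothetical class means.
image = np.array([0.05, 0.48, 0.95])
labels = three_label_data_term(image, means=(0.0, 0.5, 1.0))
```

The regularization is what prevents the femoral and tibial labels from bleeding into each other where the two cartilage sheets touch.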