- Title： MAIB-Talk-019: Computational learning genotype–phenotype associations with single-cell data
- Date：10:00pm US East time, 06/24/2023
- Date：10:00am Beijing time, 06/25/2023
- Zoom ID：933 1613 9423
- Zoom PWD：416262
- Zoom: https://uwmadison.zoom.us/meeting/register/tJcudu-prTIuGNda1MsF8PKyRQlnGn06TP2E
MAIB: Manifold learning, Artificial Intelligence, Biology Forum (MAIB)
Presentation Record(Previous Presentation will be showed here if the video is not released for this talk)
Dr. Yongjian Yang
Yongjian is a fourth-year doctoral student in the Department of Electronics and Computer Science, Texas A&M University. The supervisor is Dr James Cai. His research direction is computational biology and bioinformatics. Phenotypic associations provide biological explanations, including but not limited to cell communication, gene knockouts, and gene-protein associations. His related work has been published in journals such as Cell Systems, Nucleic Acids Research and academic conferences such as RECOMB and ICIBM.
We are developing and applying machine learning models on biomedical data, especially single-cell data, to provide biological interpretation for genotype-phenotype associations. In the first half of my presentation, I would introduce scTenifoldXct, a semi-supervised computational tool for detecting ligand-receptor (LR)-mediated cell-cell interactions and mapping cellular communication graphs. Our method is based on manifold alignment, using LR pairs as inter-data correspondences to embed ligand and receptor genes expressed in interacting cells into a unified latent space. Neural networks are employed to solve the optimization problem. In the second half, I would like to introduce GenKI, a virtual knockout (KO) tool for gene function prediction using scRNA-seq data in the absence of KO samples when only wild-type (WT) samples are available. Without using any information from real KO samples, GenKI, which adapts a variational graph autoencoder (VGAE) model, is designed to capture shifting patterns in gene regulation caused by the KO perturbation in an unsupervised manner and provide a robust and scalable framework for gene function studies.
Yang, Yongjian, et al. “scTenifoldXct: A semi-supervised method for predicting cell-cell interactions and mapping cellular communication graphs.” Cell Systems 14.4 (2023): 302-311.
Yang, Yongjian, et al. “Gene knockout inference with variational graph autoencoder learning single-cell gene regulatory networks.” Nucleic Acids Research (2023): gkad450.
永健是德州农工大学电子计算机系四年级博士生，指导教授是Dr James Cai，研究方向是计算生物学和生物信息学，主要开发和利用机器学习的方法深度挖掘单细胞数据，为基因型-表型关联提供生物学解释，包括但不限于细胞通讯，基因敲除以及基因-蛋白质关联。他的相关工作发表在Cell Systems，Nucleic Acids Research等期刊以及RECOMB，ICIBM等学术会议。
Single-Cell Data Analysis: Single-cell data analysis has gained significant attention in recent years due to its ability to provide insights into cellular heterogeneity and dynamics. Traditional bulk RNA sequencing techniques provide an average gene expression profile for a population of cells, but they fail to capture the inherent variability between individual cells. Single-cell RNA sequencing (scRNA-seq) allows researchers to obtain gene expression profiles at the resolution of individual cells, enabling the identification of rare cell types, characterization of cellular states, and exploration of cell-cell interactions.
Genotype-Phenotype Associations: Understanding the relationship between genotypes (genetic variations) and phenotypes (observable traits) is a fundamental goal in biomedical research. However, uncovering the complex associations between genetic factors and phenotypic outcomes is challenging, especially at the single-cell level. By integrating genomics, transcriptomics, and phenotypic data, researchers can gain insights into the functional implications of genetic variations and their impact on cellular behavior and disease processes.
Ligand-Receptor Mediated Cell-Cell Interactions: Cell-cell interactions play a crucial role in various biological processes, such as development, immune response, and disease progression. These interactions often involve ligand-receptor (LR) pairs, where ligands on one cell bind to receptors on another cell, initiating signaling cascades and mediating communication between cells. Identifying LR-mediated interactions and understanding their underlying mechanisms are essential for unraveling cellular communication networks and deciphering the complexity of multicellular systems.
Challenges in Gene Function Prediction: Gene function prediction aims to uncover the biological roles and molecular functions of genes. Traditional approaches rely on experimental techniques like gene knockout (KO) studies, where specific genes are deliberately disrupted to observe the resulting phenotypic changes. However, KO experiments are time-consuming, expensive, and not feasible for all genes. Leveraging scRNA-seq data to predict gene functions, especially in the absence of real KO samples, presents an opportunity to overcome these limitations and accelerate gene function studies.
Based on these challenges and objectives, your research has developed two computational tools: scTenifoldXct and GenKI. scTenifoldXct focuses on detecting LR-mediated cell-cell interactions and mapping cellular communication graphs, while GenKI addresses the prediction of gene functions in the absence of real KO samples using scRNA-seq data. These tools utilize machine learning techniques, such as manifold alignment and variational graph autoencoders, to extract meaningful information from single-cell data and provide biological interpretation for genotype-phenotype associations.
By introducing these tools in your presentation, you will demonstrate how they contribute to addressing key challenges in single-cell data analysis and advancing our understanding of cellular interactions and gene function prediction.