PhD Proposal Defense: Multi-Task Learning and Its Applications to Biomedical Informatics

PhD Proposal Defense in Computer Science

School of Computing, Informatics, and Decision Systems Engineering


Multi-Task Learning and Its Applications to Biomedical Informatics 

Jiayu Zhou

Date and Time: Monday, Oct 14, 2013, 2:00 PM

Location: BYENG 420

Committee Members:

Dr. Jieping Ye (Chair), Dr. Baoxin Li, Dr. Hans Mittelmann, Dr. Yalin Wang

Abstract: In many fields, such as information retrieval, computer vision, and biomedical informatics, we need to build predictive models for a set of related machine learning tasks. Traditionally, these tasks are treated independently and a model is learned separately for each task, which ignores important connections among the tasks. Multi-task learning aims to build models for all tasks simultaneously, leveraging the inherent relatedness of the tasks to improve generalization performance. In this proposal, I first present a clustered multi-task learning (CMTL) formulation, which learns models for all tasks simultaneously while identifying groups of similar tasks by clustering these models. I establish the theoretical equivalence between the proposed CMTL and a widely used multi-task learning algorithm called alternating structure optimization (ASO). This equivalence provides an important guideline: when high-dimensional data is involved, CMTL can be used as an efficient alternative to ASO. I then present a real-world biomedical informatics application that can benefit from multi-task learning. In this application, I study the progression of Alzheimer’s disease, where we need to build a set of regression models given the baseline data of the patients. Each regression model predicts the cognitive status of the patients at a future time point, and therefore these models are temporally related. I propose two multi-task formulations for disease progression modeling, which greatly improve predictive performance compared to existing methods.
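
To illustrate what "temporally related" regression models can mean in practice, the following is a minimal sketch, not the formulations proposed in the talk. It assumes a generic setup: one linear model per future time point, a ridge penalty, and a temporal smoothness penalty that pulls the weight vectors of adjacent time points toward each other. The function name, the penalty, the parameters lam_ridge and lam_smooth, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def temporal_mtl_ridge(X, Y, lam_ridge=1.0, lam_smooth=1.0):
    """Jointly fit one linear model per time point (hypothetical helper).

    Minimizes
        ||X W - Y||_F^2 + lam_ridge * ||W||_F^2
        + lam_smooth * sum_t ||w_t - w_{t+1}||^2
    where X is (n, d) baseline features, Y is (n, T) targets at T future
    time points, and column w_t of W (d, T) is the model for time point t.
    """
    d = X.shape[1]
    T = Y.shape[1]

    # H takes differences of adjacent columns: (W @ H)[:, t] = w_t - w_{t+1}.
    H = np.zeros((T, T - 1))
    for t in range(T - 1):
        H[t, t] = 1.0
        H[t + 1, t] = -1.0

    A = X.T @ X + lam_ridge * np.eye(d)  # (d, d), positive definite
    B = lam_smooth * (H @ H.T)           # (T, T), couples adjacent tasks

    # Setting the gradient to zero gives the Sylvester equation
    #     A W + W B = X^T Y.
    # With column-major vec(.), this is the linear system
    #     (I_T kron A + B^T kron I_d) vec(W) = vec(X^T Y).
    K = np.kron(np.eye(T), A) + np.kron(B.T, np.eye(d))
    rhs = (X.T @ Y).flatten(order="F")
    w = np.linalg.solve(K, rhs)
    return w.reshape((d, T), order="F")


def adjacent_gap(W):
    """Sum of squared differences between models of adjacent time points."""
    return float(np.sum((W[:, 1:] - W[:, :-1]) ** 2))


# Synthetic data (purely illustrative): 50 "patients", 5 baseline features,
# 4 future time points whose true models drift slowly over time.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
W_true = rng.normal(size=(5, 1)) + np.cumsum(
    0.1 * rng.normal(size=(5, 4)), axis=1
)
Y = X @ W_true + 0.5 * rng.normal(size=(50, 4))

W_indep = temporal_mtl_ridge(X, Y, lam_smooth=0.0)     # tasks decoupled
W_smooth = temporal_mtl_ridge(X, Y, lam_smooth=100.0)  # tasks coupled
```

With lam_smooth set to 0 the problem separates into independent ridge regressions, one per time point, which corresponds to the traditional treatment described in the abstract. Increasing lam_smooth shares information across time points by shrinking the gap between adjacent models. The Kronecker-product solve is chosen here only for brevity; it scales poorly with the number of features and would need to be replaced by a dedicated solver for high-dimensional data.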