August 4, 2004
The Carolina Center for Genome Sciences is pleased to announce the arrival of Dr. Yufeng Liu to Chapel Hill, fresh from his PhD studies in Xiaotong Shen’s group at Ohio State University. He is now an Assistant Professor at UNC in the Department of Statistics and Operations Research with a joint appointment in CCGS.

Yufeng’s primary interest is in statistical learning theory, in which methods or “machines” are developed to address general classification problems. In particular, he is interested in so-called supervised learning techniques, which use a training set as a “teacher” to train machines to classify data based on experimentally established relationships. The trained machines can then classify new data for which the relationships are not known. In the last 10 years or so, a machine-learning technique called support vector machines (SVMs) has led to a growing number of successful, real-world applications such as handwriting recognition, image classification, and genomic analysis. More recently, a new-generation machine-learning technique called psi-learning has been developed, which often yields more accurate classifications than SVMs. Although SVMs and psi-learning can both yield accurate classification for binary problems, they are not probability-based approaches and consequently are not directly applicable to multicategory classification. For his dissertation, Yufeng extended psi-learning and SVMs to solve multicategory classification problems.

These techniques are especially relevant to the high-dimensional datasets derived from genomic experiments. For instance, DNA microarray analysis often tests a relatively small number of samples (e.g., different environmental conditions, tumor samples, mutant backgrounds) for the expression of thousands or even tens of thousands of genes. From this type of experimental data, it is often difficult to discriminate accurately between samples from different classes.
SVMs and psi-learning are well suited to such applications because the computation depends mainly on the sample size rather than the number of genes. Compared to multicategory SVMs, multicategory psi-learning makes the resulting “classifiers” or models less sensitive to unusual samples or “outliers.” This makes multicategory psi-learning more robust, leading to more accurate classification of future samples than SVMs in certain applications. This type of analysis has far-reaching biomedical applications such as tumor classification, cancer diagnosis/prognosis, gene annotation, pharmacogenomics, and disease-susceptibility gene identification. Yufeng’s efforts toward adapting psi-learning and SVMs to multicategory classification problems have particular relevance to CCGS biologists and geneticists, who are often faced with the problem of gleaning meaningful nuggets of information from vast, genome-wide datasets.

Yufeng is looking forward to meeting and collaborating with CCGS faculty whose work may benefit from his novel statistical tools. “In the exciting research environment at CCGS, I am eager to see that my statistical-learning techniques will be effective in solving various genomic problems and, at the same time, these problems can motivate me to develop new statistical methodologies and theories.”

For additional information, see:
Yufeng Liu [phone] 919-843-1899 [email] yfliu@email.unc.edu