Combining Boosting and Active Learning for Mining Multi-Class Genomic Data

Title: Combining Boosting and Active Learning for Mining Multi-Class Genomic Data
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Farid, DMd.
Editor: Nowé, A
Tertiary Authors: Manderick, B
Conference Name: 25th Belgian-Dutch Conference on Machine Learning (Benelearn)
Date Published: 09/2016
Conference Location: Kortrijk, Belgium
Abstract

Boosting and active learning achieve high classification rates in many real-world applications of machine learning for data mining. This paper presents an optimal ensemble learning approach that combines boosting and active learning to improve the prediction accuracy of multi-class DNA variant classification. As part of active learning, we use a naive Bayes classifier and clustering to find the most informative unlabeled DNA variants, and we use boosting as the base classifier. The combined strategy is evaluated on genomic data (148 exome data sets) of Brugada syndrome from the Centre of Medical Genetics, VUB UZ Brussel.
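
The abstract does not say how informativeness is measured or how the clustering and the naive Bayes output are combined. As a minimal sketch of one plausible selection step, the Python code below assumes that informativeness is the entropy of the naive Bayes class posteriors and that the most uncertain variant is queried from each cluster. The function names, the `per_cluster` parameter, and the example posteriors are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def posterior_entropy(probs):
    """Shannon entropy (in nats) of one class-posterior vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_informative(posteriors, cluster_ids, per_cluster=1):
    """Return indices of the most uncertain unlabeled instances in each cluster.

    posteriors  : list of class-probability vectors (assumed here to come
                  from a naive Bayes classifier trained on the labeled data)
    cluster_ids : cluster label of each unlabeled instance, same length
    per_cluster : how many instances to query from each cluster (assumption)
    """
    by_cluster = defaultdict(list)
    for idx, (probs, cid) in enumerate(zip(posteriors, cluster_ids)):
        by_cluster[cid].append((posterior_entropy(probs), idx))

    selected = []
    for cid in sorted(by_cluster):
        # Highest entropy first; ties are broken by the lower index.
        ranked = sorted(by_cluster[cid], key=lambda t: (-t[0], t[1]))
        selected.extend(idx for _, idx in ranked[:per_cluster])
    return selected


# Made-up posteriors for five unlabeled variants over three classes.
posteriors = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],
    [0.90, 0.05, 0.05],
]
cluster_ids = [0, 0, 0, 1, 1]

queries = select_informative(posteriors, cluster_ids)
print(queries)
```

In a full active learning loop, the selected variants would be labeled by an expert, added to the training set, and the boosted classifier would be retrained. That loop is outside this sketch.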