Genome and exome screening is becoming the norm in clinical settings, with the aim of providing a better and more effective diagnostic process for patients. To meet this need, hospitals require an effective, certified and reliable (bio)informatics solution that can merge, store and analyze multidimensional data. A further challenge is handling the avalanche of data generated by sequencing. This project was set up to design and develop a solution to the aforementioned needs of the Centre for Medical Genetics of VUB UZ Brussel, ULB Erasme and UCL, De Duve. Besides the three medical centres, three research groups specialized in database infrastructures and computational intelligence are involved: ULB MLG, ULB IRIDIA and VUB AI Lab (COMO). The project focuses on three oligogenic diseases: Brugada Syndrome, Epileptic Encephalopathies and Cleft Lip and/or Palate. The project is hosted by IB square and funded by INNOVIRIS.
The project aims to address these research challenges by:
1. Designing and creating a multi-site clinical/phenomic and genomic data warehouse that meets requirements for interoperability, privacy, security, scalability and reliability.
2. Developing automated tools (including quality checking and mapping pipelines, pre-processing, dimensionality reduction and multivariate classification) for extracting relevant information from genetic data.
3. Using the designed tools to extract new knowledge and transfer it to the medical setting.
VUB AI Lab (COMO) is working with ULB MLG on designing and developing a strategy for information discovery on the genomic and clinical big data platform. The lab focuses in particular on finding an optimal ensemble method that will help in gaining new biological insights. The goal is to screen and evaluate ensemble predictive modelling techniques for:
1. improving the prediction accuracy of variant identification.
2. pathology classification tasks.
Strategies have been designed to evaluate data sets simultaneously with a top-down approach (start with all the features and funnel down) and a bottom-up approach (start with a set of known causal features), using a combination of black-box and white-box data mining algorithms. For screening, the data sets will be tested with machine learning approaches such as classification based on association; association mining with SVM, neural networks or LogitBoost; random subspace with bagging; and AdaBoost.M1.
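As an illustration of one of the listed ensemble methods, random subspace with bagging can be sketched in a few lines of plain Python. This is only a minimal sketch and not the project's pipeline: the synthetic data set (a single causal feature among noise features), the decision-stump base learner and all function names are assumptions made for the example.

```python
import random
from collections import Counter


def make_toy_data(n_rows, n_features, seed):
    """Synthetic data (an assumption): the label is 1 when feature 0 is positive."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_rows)]
    y = [1 if row[0] > 0 else 0 for row in X]
    return X, y


def fit_stump(X, y, features):
    """Return the (feature, threshold, left_label, right_label) rule
    with the fewest training errors among the allowed features."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]


def predict_stump(stump, row):
    f, t, left, right = stump
    return left if row[f] <= t else right


def fit_random_subspace_bagging(X, y, n_estimators, n_sub_features, seed):
    """Train each stump on a bootstrap sample of rows (bagging)
    restricted to a random subset of features (random subspace)."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    ensemble = []
    for _ in range(n_estimators):
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]  # sample with replacement
        feats = rng.sample(range(n_features), n_sub_features)  # sample without replacement
        ensemble.append(fit_stump([X[i] for i in rows], [y[i] for i in rows], feats))
    return ensemble


def predict(ensemble, row):
    """Majority vote over all base learners."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]


def accuracy(ensemble, X, y):
    return sum(predict(ensemble, row) == label for row, label in zip(X, y)) / len(y)


# Hypothetical usage on the toy data: train on one sample, score on another.
X_train, y_train = make_toy_data(100, 5, seed=1)
X_test, y_test = make_toy_data(200, 5, seed=2)
model = fit_random_subspace_bagging(
    X_train, y_train, n_estimators=15, n_sub_features=4, seed=0
)
print("held-out accuracy:", accuracy(model, X_test, y_test))
```

Restricting each base learner to a feature subset decorrelates the ensemble members, which is the usual motivation for the method when there are many more features than samples, as is common in genomic data. In the bottom-up setting, the feature pool could instead be limited to the known causal features.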
Another goal is to develop an interactive information discovery platform that combines genomic and clinical data.
VUB AI Lab is also working with the Centre for Medical Genetics, VUB UZ Brussel, on the development of a centre-specific and centralized clinical/phenomic database.
Machine Learning for Data mining