Dr. Michele Leone

Research Interests

I started as a biologist and moved to the dark side of bioinformatics over time. A landscape and wildlife photographer, I am passionate about animals and the natural sciences. My research interests include machine learning, natural language processing, and computational and evolutionary biology.

During my PhD, I designed and implemented a computational method that identifies combinations of static and dynamic genomic functional elements in the most common genome assemblies and tracks how they change across semantically annotated biological conditions. The method uses Hidden Markov Models to compare a large number of genomic chromatin-state profiles across different conditions and to extract their specific variations.

I also addressed the problem of extracting useful metadata from free-text descriptions of genomic data samples in the Gene Expression Omnibus (GEO) database. Rather than treating the task as classification or named entity recognition, I modelled it as machine translation, leveraging state-of-the-art sequence-to-sequence models to map unstructured input directly into a structured text format. Such models can impute output fields that are implied, but never explicitly mentioned, in the input text. I then designed an active learning framework that collects feedback from users to improve the metadata predictions. Finally, I developed a technique to interpret the model's predictions and applied it in a web interface to help users give correct feedback.

As a post-doc, my current research focuses on understanding how the moulting process and the variety of moulting patterns have changed during the evolution of arthropod life cycles. To uncover general trends in the evolution of arthropod moulting, as well as open questions, I am working on the creation of the MoultDB repository, which integrates data and information about moulting from many sources.