Kaustubh Kishor Jaghav

Year of selection 2015
Cohort 2

Phone number

+91 8999167180

Home Institute

Dr. Babasaheb Ambedkar Marathwada University, Maharashtra, India

Host Institute

University of Milan, Italy

Lab, Institute, Country

Comparative Genomics and Bioinformatics lab, Institute of Biosciences, University of Milan, Italy

Name of Researcher/Supervisor

Prof. David S. Horner

Duration of working period

Oct 2016 - Dec 2017

Title and Brief report of the work (max 300 words)

Title: Development and evaluation of machine learning approaches for the functional evaluation of single nucleotide polymorphisms in microRNA molecules

Summary: Recent developments in sequencing technologies, which allow hundreds of millions of short sequences to be read at an affordable cost in a matter of days, have greatly facilitated the discovery and annotation of novel microRNAs (miRNAs). Indeed, miRNAs are now more often inferred from expression evidence in large-scale sequencing studies than predicted by de novo methods based on strict rules that accurately describe their secondary structure.

While the advent of NGS technologies has provided the means to study patterns of miRNA expression in depth across different organisms, tissues and experimental conditions, recent studies suggest that large collections of miRNA annotations, such as miRBase, are likely to contain a substantial proportion of false positive predictions.

Under the standard biological/genetic assumption that functionally important sequences (in this case, true miRNA precursors) are likely to exhibit higher levels of conservation between individuals and species than non-functional sequences (false positive miRNA precursors), a key testable hypothesis can be generated.

False positive predicted miRNA precursors would be expected to show both lower scores from de-novo miRNA prediction algorithms and higher levels of sequence/structure variability between individuals than “real” miRNA precursors.

My thesis project is centered on exploring whether machine learning approaches, together with available data on human genomic variation, can be employed both to identify putative false positive miRNAs in miRBase and to assess the potential impact of Single Nucleotide Polymorphisms (SNPs) on the processing of pre-miRNA molecules and the consequent production of mature miRNAs across individuals and populations. To test this hypothesis, my work has focused mainly on two tasks:

  1. Development of a machine learning method for the classification of miRNA precursors
  2. Application of this method to evaluate the possible effects on miRNA precursor structure of single nucleotide polymorphisms reported in publicly available collections of human genetic variation.
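
The second task can be illustrated with a minimal sketch: score a precursor sequence, apply a SNP, and compare the two scores. Everything in this example is an assumption made for illustration. The function names (hairpin_score, apply_snp, snp_effect), the toy sequence, and the scoring rule are hypothetical, and the crude "fold the sequence in half and count complementary bases" score merely stands in for a trained classifier working on predicted secondary structures. It is not the method developed in the thesis.

```python
# Toy illustration only: a crude hairpin-complementarity score stands in for
# a trained pre-miRNA classifier. It is NOT the method used in the thesis.

# Watson-Crick pairs plus the G-U wobble pair that RNA stems tolerate.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def hairpin_score(seq: str, loop: int = 4) -> float:
    """Fraction of stem positions that can pair when seq is folded in half.

    Position i is paired with position len(seq) - 1 - i; the central
    `loop` bases are assumed to be unpaired.
    """
    seq = seq.upper().replace("T", "U")
    stem = (len(seq) - loop) // 2
    if stem <= 0:
        return 0.0
    paired = sum((seq[i], seq[-1 - i]) in PAIRS for i in range(stem))
    return paired / stem


def apply_snp(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def snp_effect(seq: str, pos: int, alt: str, loop: int = 4) -> float:
    """Score of the variant minus score of the reference.

    A negative value suggests the variant weakens the hairpin.
    """
    return hairpin_score(apply_snp(seq, pos, alt), loop) - hairpin_score(seq, loop)


# Hypothetical 18-nt hairpin: 7-nt stem, 4-nt loop, 7-nt complementary stem.
reference = "GGGAUCCAAAAGGAUCCC"

disruptive = snp_effect(reference, 3, "C")  # A-U pair becomes C-U (mismatch)
tolerated = snp_effect(reference, 5, "U")   # C-G pair becomes U-G (wobble)
```

In this toy setting the first variant lowers the score because it breaks a base pair, while the second leaves it unchanged because a G-U wobble pair is still allowed. A realistic analysis would replace hairpin_score with the output of a classifier trained on known precursors and on structures predicted by an RNA folding tool.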

I show that human genomic variation at annotated miRNA loci is concentrated in miRNAs that are not conserved between humans and mice; these miRNAs exhibit higher levels of variability between human individuals and attain lower scores from a machine learning approach designed to recognize pre-miRNA-like secondary structures.

Taken together, my results are consistent with the hypothesis that putatively incorrectly annotated miRNAs can be recognized through machine learning approaches and that similar approaches might be used to identify genomic variants that are likely to affect miRNA maturation.

List of publications with impact factor, presentation of the research work in conferences/seminars/workshops

Present position

Graduated and Unemployed

BRAVE © 2013 | Coordinated by Agricultural University of Athens