About the Lab

Much biological data is semantic, making it a difficult substrate for computation. My research centers on organizing biological data in ways that make it more amenable to computation. I am particularly interested in developing ontologies that describe biological knowledge and provide a means for detailed analysis of the associated data.

References are listed on Google Scholar: https://scholar.google.com/citations?user=ccf6djEAAAAJ&hl=en


Current projects

  • Newborn Screening collaboration with the Utah Department of Health
  • Sequence Ontology. The Sequence Ontology (SO) aims to unify the terminology used to describe biological sequence. It has been developed in conjunction with the model organism database groups to simplify data exchange and promote the development of computable genomic annotations. The SO is curated and maintained by this lab. We are also developing software that makes the ontology easier to use, and we are having fun exploring genomic annotations. This ongoing work has generated multiple productive collaborations.
  • Using data science methods to investigate various disease outcomes.

Previous projects

  • ClinVar Miner. The lab contributed to two of the three NHGRI ClinGen grants for variant databases, developed the ClinVar Miner tool, and contributed to various working groups.
  • Metagenomics. The lab was involved in the Taxonomer project to build tools for the analysis of clinical metagenomic RNA-seq data.
  • Disease Annotations for Variants in Personal Genomes. This project was in collaboration with Fabric Genomics, a personal genome software company.
  • Gene Ontology. The Gene Ontology (GO) gives the biological community a unified vocabulary that lets researchers both communicate effectively with each other and analyze large quantities of data. The GO is an ontology that describes the classes of molecular function, biological process and cellular location, and the relationships that hold between them. Many of the model organism databases use it to label what gene products do, what processes they are involved in, and where they are located. These functional annotations are then used to search across genomes based on semantics rather than sequence similarity. We are part of the Gene Ontology Consortium.
  • Ontologies for Public Health Informatics. This project was in collaboration with Catherine Staes at the University of Utah.