A recent paper, first-authored by BioISI researcher José Lourenço, reports that a small number of genetic signatures is enough to identify the genotype to which a particular hepatitis B virus belongs. To learn more about this paper, published in the journal Virus Evolution, read the BioISI Digest below.

What was the starting point that led to the current research?

The other researchers on this study and I, based at the University of Oxford (Philippa Mathews, Sunetra Gupta), University of Bristol (Anna McNaughton), University of Cambridge (Caitlin Pley) and University of Tel Aviv (Uri Obolski), had previously worked together on various topics in the population biology of the (human) Hepatitis B virus (HBV). The work presented in this study was a natural evolution of that earlier work: we took the opportunity to run some computational experiments once we realized that thousands of full genomes of the virus were now available for analysis.

What is the main finding reported in this paper?

The main finding of the paper is that a small number of genetic signatures spread across the HBV genome makes it possible to identify, with little uncertainty, which of the ten known genotypes (A to J) a particular viral sample belongs to. These genetic signatures are simple combinations of specific amino acids at particular genomic positions.
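As a rough illustration of what "specific amino acids at particular positions" could look like in practice, the short Python sketch below stores each signature as a mapping from position to amino acid and checks a sample against it. The positions, residues and genotype assignments are invented placeholders for illustration only; they are not the signatures reported in the paper, and the function name is hypothetical.

```python
# Hypothetical signatures: 1-based position -> amino acid expected there.
# These values are made up for illustration; they are NOT from the paper.
SIGNATURES = {
    "A": {3: "K", 7: "L"},
    "B": {3: "R", 7: "L"},
    "C": {3: "R", 7: "M"},
}


def matching_genotypes(protein, signatures=SIGNATURES):
    """Return the genotypes whose signature residues all appear in `protein`.

    Positions are 1-based, as is usual when numbering amino acids.
    A position beyond the end of the sequence counts as a mismatch.
    """
    matches = []
    for genotype, signature in signatures.items():
        if all(
            pos <= len(protein) and protein[pos - 1] == residue
            for pos, residue in signature.items()
        ):
            matches.append(genotype)
    return matches


sample = "MERLQTMAA"  # toy amino acid sequence, not real HBV data
print(matching_genotypes(sample))
```

In this toy setup, only two positions need to be read to separate the three placeholder genotypes, which is the intuition behind using a small set of signatures instead of the full genome.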

If you had to explain the main finding to a 5-year-old child, how would you do it?

In supermarkets, different brands of cookies have their own barcodes. Viruses also have barcodes. When people are infected with the Hepatitis B virus, we can check which barcode the virus has. We have found that these barcodes are actually very simple and can be used to tell whether a person is infected with a really bad HBV. We hope this can help doctors check for bad viruses more easily and, if they find them, treat people better.

Why is it important for the scientific community and for society at large?

The circulating HBV population has long been known to be structured into ten large groups (lineages) of reasonable genetic similarity, referred to in the literature as genotypes (named A to J). Some of these genotypes present identifiable phenotypic differences of clinical and epidemiological relevance. For example, particular genotypes are more commonly associated with vaccine escape, antiviral treatment failure or faster progression to severe disease in the host. To date, two sequences have generally been accepted as belonging to different genotypes when they show more than 7.5 per cent nucleotide divergence. At the same time, HBV genotyping is not routinely undertaken to inform patient care, although several lines of research point to its usefulness, for example in treatment decisions. One of the largest contributors to this status quo is the lack of knowledge about how to identify specific genotypes from sequence data alone. As a result, identifying the infecting genotype requires sequencing the full viral genome from the patient, followed by a phylogenetic comparison with a large dataset of previously sequenced genomes to determine which known genotype the infecting virus most resembles. This process is costly and requires a pipeline of expertise that is often not immediately available. Our study reveals that HBV genotypes are in fact characterized by a small subset of unique genetic signatures, which opens technical opportunities to develop quick-testing kits that, by focusing on those signatures, could identify the infecting genotype without the need for full genome sequencing and phylogenetic inference.
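The 7.5 per cent divergence criterion mentioned above can be sketched as a simple pairwise comparison. The Python snippet below is a minimal illustration, assuming the two sequences are already aligned to the same length; the function names and the choice to skip gaps and ambiguous bases are assumptions made here, and real genotyping relies on proper alignment and phylogenetic methods rather than this raw proportion of differences.

```python
GENOTYPE_THRESHOLD = 0.075  # 7.5 per cent nucleotide divergence


def nucleotide_divergence(seq_a, seq_b):
    """Fraction of comparable aligned positions at which two sequences differ.

    Assumes both sequences are pre-aligned and of equal length. Positions
    where either sequence has a gap ('-') or an unknown base ('N') are
    skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def likely_different_genotypes(seq_a, seq_b, threshold=GENOTYPE_THRESHOLD):
    """True if the divergence exceeds the genotype threshold."""
    return nucleotide_divergence(seq_a, seq_b) > threshold


# Toy 10-base sequences differing at one position (10 per cent divergence).
print(likely_different_genotypes("ACGTACGTAC", "ACGTACGTAA"))
```

Applying this criterion to a new sample still requires its full genome and a reference set to compare against, which is the cost that a signature-based test would aim to avoid.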

What are the next steps?

This was a proof-of-concept study, exploring the potential of standard machine learning techniques to mine vast amounts of genetic data, with the aim of identifying genetic signatures of interest. Together with private sector partners, our next steps are to generate more knowledge to inform the development of test kits for HBV genotyping. Beyond its potential to inform technological advances in testing, the study identifies the positions of the genome at which genotypes have evolved and remained different from each other. This provides insights into avenues for future functional studies to explore whether such positions are also the main determinants of observable phenotypic differences between the genotypes.

Find out more about José Lourenço’s research here 

Read the full paper here.

José Lourenço [image provided by the researcher]