Santamaria G, Rodriguez-Ruiz P, Renau-Mínguez C, Pinto FR, Coscollá M.
Mycobacterium tuberculosis, the causative agent of tuberculosis, comprises several lineages that share more than 99% genome identity. Although most lineages are associated with humans, at least four lineages, which include different M. tuberculosis ecotypes, are adapted to other mammals. In ecotypes such as M. bovis, host specificity is associated with higher virulence in the preferred host. Deciphering what determines host preference could therefore reveal host-specific virulence patterns. However, it is not clear which genomic determinants influence host specificity. In this study, we apply a combination of unsupervised and supervised classification methods to genomic data from ~27,000 M. tuberculosis clinical isolates to decipher host-specific genomic determinants. Host-specific genomic signatures are scarce beyond known lineage-specific mutations. Therefore, we integrated lineage-specific mutations into the iEK1011 2.0 genome-scale metabolic model to obtain lineage-specific versions of it. Flux distributions sampled from the solution spaces of these models can be accurately separated according to host association. This separation correlated with differences in cell wall processes and in lipid, amino acid and carbon metabolic subsystems. These differences were observable when more than 95% of the samples had a specific growth rate significantly lower than the maximum achievable by the models. This suggests that the differences might manifest at low growth rates, such as those imposed by the restrictive conditions M. tuberculosis encounters during macrophage infection.
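
The modelling idea in the abstract (constrain a metabolic model with lineage-specific mutations, sample its feasible flux space, then compare the samples between lineages) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the two-route toy network, the bounds, the `human_like` and `animal_like` labels, and the helper names are hypothetical and are not taken from iEK1011 2.0 or from the authors' pipeline. Simple rejection sampling stands in for the samplers normally used with genome-scale models.

```python
import random

# Hypothetical toy network (NOT iEK1011 2.0): one uptake flux feeds two
# parallel routes, A and B, and growth equals the sum of both route fluxes.
UPTAKE_MAX = 10.0
BASE_BOUNDS = {"route_a": 6.0, "route_b": 6.0}


def lineage_bounds(knocked_out):
    """Upper bounds per route; routes hit by a loss-of-function mutation get 0."""
    return {r: (0.0 if r in knocked_out else ub) for r, ub in BASE_BOUNDS.items()}


def max_growth(bounds):
    """Maximum growth in the toy network: limited by route capacity or uptake."""
    return min(UPTAKE_MAX, sum(bounds.values()))


def sample_fluxes(bounds, n, rng):
    """Rejection-sample n feasible flux vectors uniformly within the bounds."""
    samples = []
    while len(samples) < n:
        v = {r: rng.uniform(0.0, ub) for r, ub in bounds.items()}
        if sum(v.values()) <= UPTAKE_MAX:  # steady state: uptake = v_a + v_b
            samples.append(v)
    return samples


def mean_flux(samples, route):
    """Average sampled flux through one route."""
    return sum(s[route] for s in samples) / len(samples)


def slow_fraction(samples, bounds, frac=0.9):
    """Share of samples whose growth is below frac * maximum growth."""
    limit = frac * max_growth(bounds)
    return sum(1 for s in samples if sum(s.values()) < limit) / len(samples)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed lineage variants: the second carries a hypothetical mutation
# that closes route B.
human_like = lineage_bounds(knocked_out=set())
animal_like = lineage_bounds(knocked_out={"route_b"})

human_samples = sample_fluxes(human_like, 2000, rng)
animal_samples = sample_fluxes(animal_like, 2000, rng)

for name, samples, bounds in [
    ("human_like", human_samples, human_like),
    ("animal_like", animal_samples, animal_like),
]:
    print(
        name,
        "mean route_b flux:", round(mean_flux(samples, "route_b"), 2),
        "fraction below 90% of max growth:", round(slow_fraction(samples, bounds), 2),
    )
```

In this toy setting the two variants differ in their sampled route B flux even though most samples grow well below the maximum rate, which loosely mirrors the kind of flux-level separation the abstract describes. A real analysis would apply a dedicated constraint-based modelling toolkit to the full model and a classifier to the sampled flux distributions.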