VOG are created from scratch for all viral genomes in NCBI RefSeq. Construction is fully automated, but it relies on manually curated reference data (annotation quality criteria) and associations (functional categories). Our VOG workflow is controlled by Python scripts and is based mainly on NCBI COGsoft and HH-suite (see below).

Methods central to VOG construction

  • NCBI COGsoft

    Identifying orthologous genes in multiple genomes is a fundamental task in comparative genomics. Construction of intergenomic symmetrical best matches (SymBets) and joining them into clusters is a popular method of ortholog definition, embodied in several software programs.

    COGsoft utilizes the EdgeSearch [...]

  • HH-suite

    The HH-suite is an open-source software package for sensitive protein sequence searching. It contains programs that can search for similar protein sequences in protein sequence databases. These sequence searches are a standard tool in modern biology with which the function [...]
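To make the symmetrical best match (SymBet) idea mentioned under NCBI COGsoft more concrete, here is a minimal Python sketch. It is not COGsoft's EdgeSearch algorithm and not the VOG pipeline itself: it only finds pairs of proteins from different genomes that are each other's best-scoring hit and then joins those pairs into clusters by simple linkage. The function names, the tuple layout of a hit, and the example scores are all assumptions made for illustration.

```python
from collections import defaultdict


def symmetrical_best_matches(hits):
    """Return the set of SymBets found in a list of pairwise hits.

    Each hit is a hypothetical tuple:
    (query, query_genome, subject, subject_genome, score).
    Protein identifiers are assumed to be unique across genomes.
    """
    best = {}        # (query, subject_genome) -> (score, subject)
    genome_of = {}   # protein -> genome
    for query, q_genome, subject, s_genome, score in hits:
        genome_of[query] = q_genome
        genome_of[subject] = s_genome
        if q_genome == s_genome:
            continue  # only intergenomic matches are considered
        key = (query, s_genome)
        if key not in best or score > best[key][0]:
            best[key] = (score, subject)

    symbets = set()
    for (query, _), (_, subject) in best.items():
        # The match is symmetrical if the subject's best hit in the
        # query's genome points back to the query.
        back = best.get((subject, genome_of[query]))
        if back is not None and back[1] == query:
            symbets.add(frozenset((query, subject)))
    return symbets


def cluster_symbets(symbets):
    """Join SymBets that share a protein into clusters (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair in symbets:
        a, b = tuple(pair)
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)
    return sorted(sorted(group) for group in groups.values())


# Made-up scores for three genomes A, B and C.
example_hits = [
    ("a1", "A", "b1", "B", 200.0),
    ("a1", "A", "b2", "B", 50.0),
    ("b1", "B", "a1", "A", 190.0),
    ("b2", "B", "a1", "A", 60.0),   # not symmetrical: a1 prefers b1
    ("b1", "B", "c1", "C", 150.0),
    ("c1", "C", "b1", "B", 140.0),
    ("a2", "A", "c2", "C", 80.0),
    ("c2", "C", "a2", "A", 85.0),
]

clusters = cluster_symbets(symmetrical_best_matches(example_hits))
print(clusters)  # [['a1', 'b1', 'c1'], ['a2', 'c2']]
```

In this toy input, b2 is left out of every cluster because its best match a1 has a better match (b1) in genome B. Simple linkage is used here only because it is short; it can merge groups more aggressively than a method that requires additional consistency between matches.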