Based on the study "The association of virulence factors with genomic islands" (Ho Sui, Fedynak, Hsiao, Langille, and Brinkman, 2009, PLoS ONE).
BLAST similarity searches were performed against the deduced proteomes of 298 pathogenic and 333 non-pathogenic sequenced prokaryotic genomes downloaded from the National Center for Biotechnology Information FTP site in March 2008. An E-value cut-off of 10^-7 was used to exclude distant homologs. Pathogen or non-pathogen status for each genome was obtained through the NCBI Complete Microbial Genomes webpage (Haft et al. 2005) and then manually curated to ensure data quality and completeness.
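The E-value filtering step can be sketched as follows. This is a minimal illustration, not the study's actual pipeline. It assumes the BLAST hits were saved in the standard 12-column tabular format, where the E-value is the eleventh column. The function name filter_hits, the sample rows, and the decision to keep hits exactly at the cut-off are all assumptions made for illustration.

```python
import csv
import io

# Cut-off stated in the text above.
EVALUE_CUTOFF = 1e-7

# Hypothetical hits in 12-column BLAST tabular layout (made-up identifiers):
# query, subject, %identity, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, E-value, bit score
SAMPLE = (
    "# illustrative rows only\n"
    "geneA\tprot1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-50\t400\n"
    "geneA\tprot2\t35.0\t80\t52\t3\t5\t84\t10\t89\t2e-05\t45.1\n"
    "geneB\tprot3\t60.0\t150\t60\t1\t1\t150\t1\t150\t1e-07\t60.2\n"
)


def filter_hits(handle, cutoff=EVALUE_CUTOFF):
    """Yield (query, subject, evalue) for hits at or below the cut-off.

    Treating the cut-off as inclusive is an assumption; the text does not
    say whether hits exactly at the threshold were kept.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        evalue = float(row[10])  # eleventh column holds the E-value
        if evalue <= cutoff:
            yield row[0], row[1], evalue


if __name__ == "__main__":
    for query, subject, evalue in filter_hits(io.StringIO(SAMPLE)):
        print(query, subject, evalue)
```

In this sketch the second sample row is discarded because its E-value exceeds the cut-off, which is how a threshold of this kind removes distant homologs.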
Important note on this analysis: The list of genes defined as pathogen-associated will change as more genomes are sequenced. This analysis will be updated at a later time to reflect those changes.