TY - JOUR
T1 - Identification of core and variable components of the Salmonella enterica subspecies I genome by microarray
AU - Anjum, Muna F.
AU - Marooney, Chris
AU - Fookes, Maria
AU - Baker, Stephen
AU - Dougan, Gordon
AU - Ivens, Al
AU - Woodward, Martin J.
PY - 2005/12/1
Y1 - 2005/12/1
N2 - We have performed microarray hybridization studies on 40 clinical isolates from 12 common serovars within Salmonella enterica subspecies I to identify the conserved chromosomal gene pool. We were able to separate the core invariant portion of the genome by a novel mathematical approach using a decision tree based on genes ranked by increasing variance. All genes within the core component were confirmed using available sequence and microarray information for 5. enterica subspecies I strains. The majority of genes within the core component had conserved homologues in Escherichia coli K-12 strain MG1655. However, many genes present in the conserved set which were absent or highly divergent in K-12 had close homologues in pathogenic bacteria such as Shigella flexneri and Pseudomonas aeruginosa. Genes within previously established virulence determinants such as SPI1 to SPI5 were conserved. In addition several genes within SPI6, all of SPI9, and three fimbrial operons (fim, bcf, and stb) were conserved within all 5. enterica strains included in this study. Although many phage and insertion sequence elements were missing from the core component, approximately half the pseudogenes present in S. enterica serovar Typhi were conserved. Furthermore, approximately half the genes conserved in the core set encoded hypothetical proteins. Separation of the core and variant gene sets within S. enterica subspecies I has offered fundamental biological insight into the genetic basis of phenotypic similarity and diversity across S. enterica subspecies I and shown how the core genome of these pathogens differs from the closely related E. coli K-12 laboratory strain.
AB - We have performed microarray hybridization studies on 40 clinical isolates from 12 common serovars within Salmonella enterica subspecies I to identify the conserved chromosomal gene pool. We were able to separate the core invariant portion of the genome by a novel mathematical approach using a decision tree based on genes ranked by increasing variance. All genes within the core component were confirmed using available sequence and microarray information for 5. enterica subspecies I strains. The majority of genes within the core component had conserved homologues in Escherichia coli K-12 strain MG1655. However, many genes present in the conserved set which were absent or highly divergent in K-12 had close homologues in pathogenic bacteria such as Shigella flexneri and Pseudomonas aeruginosa. Genes within previously established virulence determinants such as SPI1 to SPI5 were conserved. In addition several genes within SPI6, all of SPI9, and three fimbrial operons (fim, bcf, and stb) were conserved within all 5. enterica strains included in this study. Although many phage and insertion sequence elements were missing from the core component, approximately half the pseudogenes present in S. enterica serovar Typhi were conserved. Furthermore, approximately half the genes conserved in the core set encoded hypothetical proteins. Separation of the core and variant gene sets within S. enterica subspecies I has offered fundamental biological insight into the genetic basis of phenotypic similarity and diversity across S. enterica subspecies I and shown how the core genome of these pathogens differs from the closely related E. coli K-12 laboratory strain.
UR - http://www.scopus.com/inward/record.url?scp=28444474563&partnerID=8YFLogxK
U2 - 10.1128/IAI.73.12.7894-7905.2005
DO - 10.1128/IAI.73.12.7894-7905.2005
M3 - Article
C2 - 16299280
AN - SCOPUS:28444474563
VL - 73
SP - 7894
EP - 7905
JO - Infection and Immunity
JF - Infection and Immunity
SN - 1098-5522
IS - 12
ER -