IMG: Integrated Microbial Genomes

Plasmids

IMG 2.4 contains additional plasmid sequences that did not come from specific microbial genome sequencing projects. The procedure for including these plasmids into IMG 2.4 is outlined below:

  1. All plasmids available from RefSeq release 25 (1118 plasmids) were downloaded.
  2. 789 plasmids that were already in IMG 2.4 (either as part of complete genome projects or as isolate plasmids) were subtracted from the list of plasmids downloaded from RefSeq.
  3. 29 plasmids had no genes called and were therefore excluded from the list of plasmids downloaded from RefSeq.
  4. The remaining 300 new 'isolate plasmids' from the list of plasmids downloaded from RefSeq were loaded into IMG 2.4.
  5. After inclusion into IMG, 15 'isolate plasmids' were found to be redundant (both as 'isolate plasmids' and as part of genome projects), and therefore were removed from IMG 2.4.

A total of 295 "new isolate plasmids" are part of IMG 2.4’s content, bringing the total to 687 isolate plasmids in IMG.
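The selection steps above amount to a series of set subtractions. The following is a minimal sketch of that logic, not the actual IMG loading pipeline; the function names and the toy accession identifiers are hypothetical and chosen only for illustration.

```python
def select_new_isolate_plasmids(refseq, already_in_img, no_genes_called):
    """Return the RefSeq plasmids that remain after steps 2 and 3."""
    candidates = set(refseq) - set(already_in_img)  # step 2: already in IMG
    candidates -= set(no_genes_called)              # step 3: no genes called
    return candidates                               # step 4: load these


def drop_redundant(loaded, in_genome_projects):
    """Step 5: remove plasmids that also appear in genome projects."""
    return set(loaded) - set(in_genome_projects)


# Toy accessions (made up, not real RefSeq identifiers).
refseq = {"PL_0001", "PL_0002", "PL_0003", "PL_0004", "PL_0005", "PL_0006"}
already_in_img = {"PL_0001", "PL_0002"}
no_genes_called = {"PL_0003"}

loaded = select_new_isolate_plasmids(refseq, already_in_img, no_genes_called)
kept = drop_redundant(loaded, {"PL_0004"})
print(sorted(kept))

# The counts reported for steps 1 to 4 follow the same arithmetic.
assert 1118 - 789 - 29 == 300
```

Using sets keeps each step independent of the order in which plasmids were downloaded and makes duplicate accessions harmless.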

Strain names were added to the organism name when available from publications or other sources. For example, see plasmid NC_001520, available at NCBI. The original name from NCBI/RefSeq is "Acidithiobacillus ferrooxidans plasmid pTF4.1". The name was changed in IMG to "Acidithiobacillus ferrooxidans MAL4-1 plasmid pTF4.1".
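The renaming in this example can be sketched as a small string operation. This is only an illustration that assumes the RefSeq name has the form "<host> plasmid <plasmid name>"; the function name is hypothetical, and in IMG the strain itself was determined by manual curation, not by code.

```python
def add_strain_to_name(refseq_name, strain):
    """Insert a curated strain name before the ' plasmid ' token.

    The name is returned unchanged when no strain is known, when the
    name does not follow the '<host> plasmid <name>' pattern, or when
    the host part already ends with the strain.
    """
    marker = " plasmid "
    if not strain or marker not in refseq_name:
        return refseq_name
    host, plasmid = refseq_name.split(marker, 1)
    if host.endswith(" " + strain):
        return refseq_name
    return f"{host} {strain}{marker}{plasmid}"


print(add_strain_to_name(
    "Acidithiobacillus ferrooxidans plasmid pTF4.1", "MAL4-1"))
```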

Plasmids in IMG 2.1

IMG 2.1 contains additional plasmid sequences that did not come from specific microbial genome sequencing projects. The procedure for including these plasmids into IMG 2.1 is outlined below.

Procedure

RefSeq serves as the primary data source for plasmids. IMG 2.0 already contained 245 plasmids as part of complete genomes. The procedure for including more plasmids into IMG 2.1 involved the following steps:

  1. A list of 408 plasmids provided by ACLAME [1] was used as the baseline list of plasmids to be added to IMG.
  2. 6 plasmids that were already part of complete genomes in IMG 2.0 were subtracted from the baseline list.
  3. The remaining 402 plasmids were included in IMG 2.1.

Notes

1. Data sources

ACLAME lags behind RefSeq in terms of plasmid content. Thus, about 343 plasmids that were available in RefSeq (as of January 22, 2007) but had no ACLAME counterpart were not loaded into IMG 2.1. After review by the Plasmid Working Group, these will be included in future IMG releases.

2. Plasmid source

There is currently no consensus on the 'source' or 'origin' for a plasmid sequence; several options are under discussion by the Plasmid Working Group.

3. Plasmid names

A plasmid name in IMG often includes the host name, following the RefSeq convention (e.g., Methanosarcina acetivorans plasmid pC2A). However, when the host is not known, or should not be linked to the plasmid in the name because of the plasmid's broad host range, only the name of the plasmid is given (examples: pB3, pB8, pB10, QKH54, pIPO2).

4. Host information

The complete organism/strain information for the bacterial host in which the plasmid was first found (if known) is often not available in RefSeq or ACLAME, or is presented in a non-standard form. For example, GenBank sometimes, but not always, provides the strain name of the host either (a) in the plasmid name itself or (b) in a "strain" or "note" field in the data file. Host information was manually curated in IMG based on the original publications.
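As a rough illustration of where such hints can be found, the sketch below scans the text of a GenBank flat file for "strain" and "note" qualifiers. It is an assumption-laden helper, not the curation procedure used for IMG: the function name and the sample record are made up, and the first match is taken without checking which feature it belongs to, so any result would still need to be verified against the original publication.

```python
import re

# Matches qualifiers such as /strain="..." or /note="..."; the value may
# wrap across several lines in a GenBank flat file.
QUALIFIER = re.compile(r'/(strain|note)="([^"]*)"')


def host_hints(genbank_text):
    """Return the first 'strain' and 'note' qualifier values, if present."""
    hints = {}
    for key, value in QUALIFIER.findall(genbank_text):
        # Collapse line wraps and indentation into single spaces.
        hints.setdefault(key, " ".join(value.split()))
    return hints


# Made-up record fragment, for illustration only.
SAMPLE = '''\
FEATURES             Location/Qualifiers
     source          1..4100
                     /organism="Examplebacter demo"
                     /strain="X-1"
                     /plasmid="pDemo"
                     /note="isolated from host strain
                     X-1"
'''

print(host_hints(SAMPLE))
```

An empty result means neither field is present, which is the case the manual curation step was meant to cover.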

References

[1] Leplae R, Hebrant A, Wodak SJ, and Toussaint A. 2004. ACLAME: A CLAssification of Mobile genetic Elements. Nucleic Acids Research, Vol. 32, D45-D49.

Acknowledgements

The following domain experts (Plasmid Working Group) have provided advice on plasmid nomenclature and classification: