Escherichia coli 13k assemblies PopPUNK database
Restricted Availability
Date
2022-03-01
Publication Type
dataset
Description
# _Escherichia coli_ 13k reference
## PopPUNK database files
## v1.0.0 (1 March 2022)
### Description
This tarball contains the PopPUNK v2.4.0 [1] database files for a
clustering of the 13435 _E. coli_ assemblies from three studies
[2-4]. A file that matches the PopPUNK clusters to the multilocus
sequence types [5] (identified using mlst v2.19.0 [6]) is provided in
`ecoli_sequence_information.tsv`.
The corresponding assemblies and a Themisto v2.1.0 [7] pseudoalignment
index are also available as separate uploads on Zenodo.
__Note:__ the `esc_ra9772aa_as` entry in the PopPUNK files has no
corresponding assembly and is not included in the Themisto
pseudoalignment index or the `ecoli_sequence_information.tsv`
file, because the entry for this sequence was corrupted in the
original run of PopPUNK.
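
When sample names taken from the PopPUNK files are cross-referenced against the assemblies or the Themisto index, the corrupted entry has to be excluded first. A minimal sketch of that filtering step follows; the function name and the example sample names are hypothetical, and only the `esc_ra9772aa_as` identifier comes from this dataset.

```python
# Identifier of the corrupted entry named in the note above.
CORRUPTED_ENTRY = "esc_ra9772aa_as"


def drop_corrupted(sample_names, corrupted=CORRUPTED_ENTRY):
    """Return the sample names with the corrupted entry removed, order preserved."""
    return [name for name in sample_names if name != corrupted]


# Hypothetical sample names, for illustration only.
names = ["sample_a", "esc_ra9772aa_as", "sample_b"]
kept = drop_corrupted(names)
```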
### Files
- `pop_db`: the PopPUNK sketch files.
- `pop_fit_dbscan`: the initial DBSCAN fit for the sketch.
- `pop_fit_refined`: the final refined version of the DBSCAN fit.
- `pop_fit_refined_viz`: Microreact visualisation files from the refined fit.
- `ecoli_sequence_information.tsv`: a tab-separated text file containing the multilocus sequence types and the PopPUNK clusters.
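
As an illustration of working with `ecoli_sequence_information.tsv`, the sketch below groups sequence types by PopPUNK cluster using only the Python standard library. The column names are not documented here, so they are passed in as parameters; the header and rows in the example are hypothetical placeholders and should be replaced after inspecting the real file.

```python
import csv
import io
from collections import defaultdict


def group_by_cluster(tsv_text, cluster_col, st_col):
    """Map each cluster label to the set of sequence types that occur in it."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    clusters = defaultdict(set)
    for row in reader:
        clusters[row[cluster_col]].add(row[st_col])
    return dict(clusters)


# Hypothetical header and values, NOT taken from the real file.
example = "id\tcluster\tst\na\t1\tst_x\nb\t1\tst_x\nc\t2\tst_y\n"
grouped = group_by_cluster(example, cluster_col="cluster", st_col="st")
```

For the real file, the text could be read with `open(path, newline="").read()` and the two column names taken from its header line.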
### References
- [1] Lees J et al., _Fast and flexible bacterial genomic epidemiology with PopPUNK._ https://doi.org/10.1101/gr.241455.118
- [2] Horesh G et al., _A comprehensive and high-quality collection of Escherichia coli genomes and their genes._ https://doi.org/10.1099/mgen.0.000499
- [3] Gladstone R et al., _Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections in Norway in 2002–17: a nationwide, longitudinal, microbial population genomic study._ https://doi.org/10.1016/S2666-5247(21)00031-8
- [4] Shao Y et al., _Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth._ https://doi.org/10.1038/s41586-019-1560-1
- [5] Jolley K et al., _Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications._ https://doi.org/10.12688/wellcomeopenres.14826.1
- [6] Seemann T, _mlst._ GitHub. https://github.com/tseemann/mlst
- [7] Mäklin T et al., _Bacterial genomic epidemiology with mixed samples._ https://doi.org/10.1099/mgen.0.000691