Escherichia coli 13k assemblies PopPUNK database

Restricted Availability

Date

2022-03-01

Publication Type

dataset

Description

# _Escherichia coli_ 13k reference
## PopPUNK database files
## v1.0.0 (1 March 2022)

### Description
This tarball contains the PopPUNK v2.4.0 [1] database files of a clustering for the 13435 _E. coli_ assemblies from three studies [2-4]. A file matching the clustering with the multilocus sequence types [5] (identified using mlst v2.19.0 [6]) is provided in `ecoli_sequence_information.tsv`.

The corresponding assemblies and a Themisto v2.1.0 [7] pseudoalignment index are also available as separate uploads in Zenodo.

__Note:__ the `esc_ra9772aa_as` entry in the PopPUNK files does not have a corresponding assembly nor is it included in the Themisto pseudoalignment index or the `ecoli_sequence_information.tsv` file. This is because the entry for this sequence was corrupted in the original run of PopPUNK.

### Files
- `pop_db`: the PopPUNK sketch files.
- `pop_fit_dbscan`: the initial DBSCAN fit for the sketch.
- `pop_fit_refined`: the final refined version of the DBSCAN fit.
- `pop_fit_refined_viz`: microreact visualisation files from the refined fit.
- `ecoli_sequence_information.tsv`: a tab-separated text file containing the MLST types and the PopPUNK clusters.
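
As a rough illustration of how `ecoli_sequence_information.tsv` might be used, the sketch below groups sequence types by PopPUNK cluster. The header of the real file is not documented here, so the column names (`id`, `cluster`, `ST`) and the example rows are hypothetical; the function takes the column names as arguments so they can be swapped for the real ones. The check for `esc_ra9772aa_as` is purely defensive, since the note above says that entry is absent from the file.

```python
import csv
import io
from collections import defaultdict

# Entry named in the note above: present in the PopPUNK files only.
CORRUPTED_ENTRY = "esc_ra9772aa_as"


def sequence_types_per_cluster(handle, id_col, cluster_col, st_col):
    """Map each PopPUNK cluster to the set of sequence types seen in it.

    Column names are supplied by the caller because the real header of
    ecoli_sequence_information.tsv is an unknown here.
    """
    clusters = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        if row[id_col] == CORRUPTED_ENTRY:
            continue  # defensive: this entry should not appear in the file
        clusters[row[cluster_col]].add(row[st_col])
    return dict(clusters)


# Made-up rows with hypothetical column names, for illustration only.
example = io.StringIO(
    "id\tcluster\tST\n"
    "sample_a\t1\t131\n"
    "sample_b\t1\t131\n"
    "sample_c\t2\t73\n"
    "sample_d\t2\t95\n"
)
result = sequence_types_per_cluster(example, "id", "cluster", "ST")
print(result)
```

For the real file, the `io.StringIO` object would be replaced by `open("ecoli_sequence_information.tsv", newline="")` and the three column names by those in the file's header line.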
### References
- [1] Lees J et al., _Fast and flexible bacterial genomic epidemiology with PopPUNK._ https://doi.org/10.1101/gr.241455.118
- [2] Horesh G et al., _A comprehensive and high-quality collection of Escherichia coli genomes and their genes._ https://doi.org/10.1099/mgen.0.000499
- [3] Gladstone R et al., _Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections in Norway in 2002–17: a nationwide, longitudinal, microbial population genomic study._ https://doi.org/10.1016/S2666-5247(21)00031-8
- [4] Shao Y et al., _Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth._ https://doi.org/10.1038/s41586-019-1560-1
- [5] Jolley K et al., _Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications._ https://doi.org/10.12688/wellcomeopenres.14826.1
- [6] Seemann T, _mlst_ _GitHub._ https://github.com/tseemann/mlst
- [7] Mäklin T et al., _Bacterial genomic epidemiology with mixed samples._ https://doi.org/10.1099/mgen.0.000691
