cc-by-4.0
Khawaja, Tamim
2025-04-29
2024-06-25
2024-06-25
https://datakatalogi.helsinki.fi/handle/123456789/5267

Bin-assembled Escherichia coli genomes from Punjab, Pakistan

These assemblies are part of a cross-sectional study conducted in Punjab, Pakistan, which investigated E. coli colonisation diversity in healthy carriage using CLED enrichment plates.

About

Version history

v0.1.1 (current version): Added reference to the study.
v0.1.0: Added brief description with a few missing parts.

Distribution

If you use these assemblies in your study, please cite the source as appropriate. These assemblies are made available under a CC-BY 4.0 license.

Citation

Khawaja, T., Mäklin, T., Kallonen, T. et al. Deep sequencing of Escherichia coli exposes colonisation diversity and impact of antibiotics in Punjab, Pakistan. Nature Communications 15, 5196 (2024). https://doi.org/10.1038/s41467-024-49591-5

Methods briefly

Species identification

Sequencing data from the ENA project PRJEB36642 were error-corrected with fastp and pseudoaligned with Themisto against a species-level index (available from https://doi.org/10.5281/zenodo.6656881). Reads were assigned to species using the mSWEEP/mGEMS pipeline as described in https://www.nature.com/articles/s41467-022-35178-5.

Lineage identification

Reads from the species-level bins were again pseudoaligned with Themisto, this time against an E. coli index (will be made available in a later version). Lineage-level assignment was performed using mSWEEP and mGEMS at the level of PopPUNK sequence clusters. The resulting bins were screened with demix_check, and bins that received a score of 1 or 2 were kept. Data in the kept bins were assembled with shovill, and the bin-assembled genomes (BAGs) were quality controlled with checkm, requiring >= 90% completeness and <= 10% contamination. Finally, BAGs shorter than 4 Mb or longer than 6 Mb were removed.
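
The filtering criteria described under Lineage identification can be summarised as a single predicate. The following Python snippet is only an illustrative sketch and not the code used in the study: the `BagRecord` class, its field names, and the helper functions are hypothetical, 1 Mb is assumed to mean 1,000,000 bp, and the thresholds are assumed to be inclusive at the boundaries as the wording above suggests. In the actual workflow the demix_check screen is applied to the bins before assembly, whereas this sketch checks all criteria at once on a per-bin summary.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds taken from the Methods text above.
# Assumption: 1 Mb = 1,000,000 bp.
ACCEPTED_DEMIX_SCORES = frozenset({1, 2})
MIN_COMPLETENESS = 90.0   # percent, as reported by checkm
MAX_CONTAMINATION = 10.0  # percent, as reported by checkm
MIN_LENGTH_BP = 4_000_000
MAX_LENGTH_BP = 6_000_000


@dataclass(frozen=True)
class BagRecord:
    """Hypothetical per-bin summary; the field names are illustrative only."""
    name: str
    demix_score: int
    completeness: float
    contamination: float
    length_bp: int


def passes_filters(bag: BagRecord) -> bool:
    """Return True if the bin meets every criterion listed in the Methods."""
    return (
        bag.demix_score in ACCEPTED_DEMIX_SCORES
        and bag.completeness >= MIN_COMPLETENESS
        and bag.contamination <= MAX_CONTAMINATION
        and MIN_LENGTH_BP <= bag.length_bp <= MAX_LENGTH_BP
    )


def filter_bags(bags: Iterable[BagRecord]) -> List[BagRecord]:
    """Keep only the bins that pass all filters, preserving input order."""
    return [bag for bag in bags if passes_filters(bag)]


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    examples = [
        BagRecord("bin_a", 1, 98.5, 0.7, 5_100_000),
        BagRecord("bin_b", 3, 99.0, 0.5, 5_000_000),  # demix_check score too high
        BagRecord("bin_c", 2, 92.0, 1.2, 3_800_000),  # assembly too short
    ]
    for bag in filter_bags(examples):
        print(bag.name)
```

Keeping the thresholds as named constants makes it straightforward to apply stricter or looser cut-offs when reusing the assemblies.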

Contact

Tommi Mäklin <tommi'at'maklin.fi>.

Keywords: genome informatics, escherichia coli, antimicrobial resistance, metagenomics

Bin-assembled Escherichia coli genomes from a study in Punjab, Pakistan

dataset