cc-by-4.0
Mäklin, Tommi
2025-04-29
2021-09-29
2021-09-29
https://datakatalogi.helsinki.fi/handle/123456789/5533
This dataset contains the synthetic mixture samples and reference sequences - as well as the appropriate metadata - that were originally used in the 2021 revision of the mSWEEP manuscript. There are 87 samples in total, each containing 100bp paired-end Illumina sequencing reads from 10 different Escherichia coli strains from 10 different lineages. The number of reads is set so that the sequencing coverage of the individual strains varies between 50x and 0.10x and sums up to 100x.
mSWEEP, synthetic mixture, escherichia coli, variable coverage
Synthetic Escherichia coli mixture samples with variable coverage
dataset