STRT-N mouse library output files

2023-01-13, 2023-01-13
dataset
Output files from STRT-N data analysis

These files were obtained from a successful STRT-N mouse library using STRTN.sh, STRTN-UCSC-Allas.sh and STRTN-Seurat.sh. Details are below.

OUTPUT-QC.txt
    Quality check report for all samples. It is provided in the out directory by STRTN.sh.

    Column              Value
    Barcode             Sample name (OUTPUT with numbers)
    Qualified_reads     Primary aligned read count
    Total_reads         Read count without redundant (duplicate) reads
    Redundancy          Qualified reads / Total reads
    Mapped_reads        Mapped read count (Total reads without unmapped reads)
    Mapped_rate         Mapped reads / Total reads
    Spikein_reads       Read count mapped to ERCC spike-ins
    Spikein-5end_reads  Read count mapped to the 5'-end 50 nt region of ERCC spike-ins
    Spikein-5end_rate   Spikein-5end reads / Spikein reads
    Coding_reads        Read count aligned within any exon or the 500 bp upstream of coding genes
    Coding-5end_reads   Read count aligned to the 5'-UTR or the 500 bp upstream of coding genes
    Coding-5end_rate    Coding-5end reads / Coding reads

OUTPUT-QC-plots.pdf
    Quality check report as boxplots. Mapped_reads, Mapped_rate, Spikein_reads, Mapped / Spikein, Spikein-5end_rate, and Coding-5end_rate are shown for all samples. Barcode numbers of outlier samples are marked in red. It is provided in the out directory by STRTN.sh. Please take these outlier samples into consideration in downstream analysis.

OUTPUT_byGene-counts.txt
    Read count table output from featureCounts. It is provided in the out directory by STRTN.sh.
    https://bioconductor.org/packages/release/bioc/vignettes/Rsubread/inst/doc/SubreadUsersGuide.pdf

OUTPUT_byGene-counts.txt.summary
    Filtering summary from featureCounts. It is provided in the out directory by STRTN.sh.
    https://bioconductor.org/packages/release/bioc/vignettes/Rsubread/inst/doc/SubreadUsersGuide.pdf

Output_bam
    Resulting BAM files, including unmapped, non-primary aligned, and duplicated (marked) reads. Files are provided in the out directory by STRTN.sh.
Output_bai
    Index files (.bai) of the resulting BAM files in the Output_bam directory. Files are provided in the out directory by STRTN.sh.

OUTPUT.output.bam
    BAM files containing all reads except duplicate and non-primary reads. Files are provided in the working directory by STRTN.sh.

OUTPUT.minus.bw and OUTPUT.plus.bw
    BigWig files for each strand of each sample. Files are provided in the working directory by STRTN.sh.

coding_5end.bb
    BigBed file of the coding-5'end annotation. It is provided in the working directory by STRTN.sh.

hub.txt
    Parameters for each track. It is provided in the working directory by STRTN-UCSC-Allas.sh.

Link to the hub.txt file
    Provided by STRTN-UCSC-Allas.sh.

OUTPUT-QC-BeeswarmPlots.pdf
    Beeswarm plots visualizing the quality check values for each developmental stage. It is provided in the out directory by STRTN-Seurat.sh.

Rplots.pdf
    Elbow, JackStraw, PCA, UMAP and violin plots. It is provided in the out directory by STRTN-Seurat.sh.

ExtractIlluminaBarcodes_Metrics
    Metrics file produced by the Picard ExtractIlluminaBarcodes program. The number of matches/mismatches between the barcode reads and the actual barcodes is shown per lane.
    https://gatk.broadinstitute.org/hc/en-us/articles/360037426491-ExtractIlluminaBarcodes-Picard-

HISAT2_Metrics
    Alignment summary of samples from each lane, produced by the HISAT2 program.
    https://daehwankimlab.github.io/hisat2/manual/

MarkDuplicates_Metrics
    Metrics file indicating the numbers of duplicates, produced by the Picard MarkDuplicates program.
    https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-

OUTPUT_MultiQC_report.html
    For each sample, fastq files are generated from the output BAM files by fastq-fastQC.sh in the fastq directory. These fastq files (without duplicated reads) can be submitted to public sequence databases. FastQC files are also generated for each fastq file in the fastqc directory. Based on the FastQC results, a MultiQC report (MultiQC_report.html) is generated.
OUTPUT_byTFE-counts_annotation.txt
    Read count table output from featureCounts with genomic annotations, produced by STRTN-TFE.sh.

OUTPUT_peaks.bed
    Peak position information of TFEs, produced by STRTN-TFE.sh.

OUTPUT_annotation
    Annotation of TFEs, produced by STRTN-TFE.sh.
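
The ratio columns of OUTPUT-QC.txt (Redundancy, Mapped_rate, Spikein-5end_rate and Coding-5end_rate) are defined above as simple quotients of the count columns, so they can be recomputed as a sanity check. The following is a minimal Python sketch, not part of the STRT-N scripts. It assumes that OUTPUT-QC.txt is tab-delimited with a header row that uses the column names listed above; the function names and the example values are hypothetical.

```python
import csv
import io


def derived_rates(row):
    """Recompute the ratio columns from the count columns of one QC row."""

    def ratio(numerator, denominator):
        den = float(row[denominator])
        # Avoid a ZeroDivisionError for samples with no reads in the denominator.
        return float(row[numerator]) / den if den else float("nan")

    return {
        "Redundancy": ratio("Qualified_reads", "Total_reads"),
        "Mapped_rate": ratio("Mapped_reads", "Total_reads"),
        "Spikein-5end_rate": ratio("Spikein-5end_reads", "Spikein_reads"),
        "Coding-5end_rate": ratio("Coding-5end_reads", "Coding_reads"),
    }


def read_qc(handle):
    """Map each Barcode to its recomputed ratios.

    Assumption: the table is tab-delimited and has a header row.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Barcode"]: derived_rates(row) for row in reader}


# Made-up values for illustration only; they are not taken from the dataset.
example = (
    "Barcode\tQualified_reads\tTotal_reads\tMapped_reads\tSpikein_reads\t"
    "Spikein-5end_reads\tCoding_reads\tCoding-5end_reads\n"
    "OUTPUT1\t1200\t1000\t900\t100\t80\t500\t400\n"
)

rates = read_qc(io.StringIO(example))
for barcode, values in rates.items():
    print(barcode, values)
```

To run the check on a real report, pass an open file handle instead of the in-memory example, for instance `read_qc(open("out/OUTPUT-QC.txt"))`, where the path is an assumption about the local directory layout.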