# reads with at least one reported alignment: 555 (55.50%)
# reads that failed to align: 445 (44.50%)
Reported 555 alignments to 1 output stream(s)
Time searching: 00:00:00
Overall time: 00:00:00
%% Cell type:markdown id: tags:
## Remove non-coding RNA
The original script skips 416 lines (0-415) of the SAM file. Doing so also strips the first non-header entry. Is that right? Here I strip exactly 415 lines.
This block creates a generator that filters the FASTQ records according to the corresponding SAM file, keeping those with '4' in the field used in the original script. The method bets that the first non-header line of the SAM file matches the first record of the FASTQ file. This is ugly. Don't do this. It needs refactoring so that the SAM file is parsed properly.
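A minimal sketch of what such a refactoring could look like, in plain Python. It is not the original script: the function names (`unaligned_read_names`, `read_fastq`, `keep_unaligned`) and the example data are hypothetical. It assumes a four-line-per-record FASTQ file, and that the SAM QNAME equals the first whitespace-delimited token of the FASTQ header. Header lines are skipped by their leading `@` rather than by a fixed line count, and reads are matched by name rather than by position.

```python
from typing import Iterable, Iterator, Set, Tuple

# (header, sequence, separator, quality), one FASTQ record
FastqRecord = Tuple[str, str, str, str]


def unaligned_read_names(sam_lines: Iterable[str]) -> Set[str]:
    """Collect the names of reads whose FLAG has bit 4 set (no reported alignment)."""
    names = set()
    for line in sam_lines:
        # SAM header lines start with '@'; skip them instead of counting lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 4:
            names.add(qname)
    return names


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield FASTQ records, assuming exactly four non-empty lines per record."""
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    # zip over the same iterator four times groups the lines by record.
    yield from zip(stripped, stripped, stripped, stripped)


def keep_unaligned(fastq_lines: Iterable[str], sam_lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield only the FASTQ records whose read name is unaligned in the SAM file."""
    keep = unaligned_read_names(sam_lines)
    for record in read_fastq(fastq_lines):
        # Assumption: the read name is the header without '@', up to the first space.
        name = record[0][1:].split()[0]
        if name in keep:
            yield record


# Hypothetical example data: read1 and read3 align to the index, read2 does not.
SAM_EXAMPLE = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:tRNA1\tLN:75",
    "read1\t0\ttRNA1\t1\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "read3\t16\ttRNA1\t5\t255\t4M\t*\t0\t0\tGGGG\tIIII",
]
FASTQ_EXAMPLE = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "TTTT", "+", "IIII",
    "@read3", "CCCC", "+", "IIII",
]

for header, sequence, _, quality in keep_unaligned(FASTQ_EXAMPLE, SAM_EXAMPLE):
    print(header, sequence, quality)
```

With real files, the two arguments would be open file handles. Matching by name costs one set of read names in memory, but it no longer depends on the SAM and FASTQ files being in the same order.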
The '4' in the flag field of the SAM file means that the read has no reported alignment. In this case, every aligned read is a match against the non-coding tRNA index. So we keep only the "mismatches" (the unaligned reads), as they are the reads that do not match non-coding RNA, in other words the reads we are interested in.
> Sum of all applicable flags. Flags relevant to Bowtie are:
> * 1 The read is one of a pair
> * 2 The alignment is one end of a proper paired-end alignment
> * 4 The read has no reported alignments
> * 8 The read is one of a pair and has no reported alignments
> * 16 The alignment is to the reverse reference strand
> * 32 The other mate in the paired-end alignment is aligned to the reverse reference strand
> * 64 The read is the first (#1) mate in a pair
> * 128 The read is the second (#2) mate in a pair
>
> Thus, an unpaired read that aligns to the reverse reference strand will have flag 16.
> A paired-end read that aligns and is the first mate in the pair will have flag 83 (= 64 + 16 + 2 + 1).
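Since the flag is a sum of the applicable values, each value can be tested with a bitwise AND. The sketch below decodes a flag into the values listed in the quote; the names `BOWTIE_FLAGS`, `set_bits` and `describe_flag` are hypothetical and only serve the illustration.

```python
from typing import List

# Flag values and meanings as listed in the quoted Bowtie documentation.
BOWTIE_FLAGS = {
    1: "The read is one of a pair",
    2: "The alignment is one end of a proper paired-end alignment",
    4: "The read has no reported alignments",
    8: "The read is one of a pair and has no reported alignments",
    16: "The alignment is to the reverse reference strand",
    32: "The other mate in the paired-end alignment is aligned to the reverse reference strand",
    64: "The read is the first (#1) mate in a pair",
    128: "The read is the second (#2) mate in a pair",
}


def set_bits(flag: int) -> List[int]:
    """Return the individual flag values that sum to `flag`, in ascending order."""
    return [bit for bit in BOWTIE_FLAGS if flag & bit]


def describe_flag(flag: int) -> List[str]:
    """Return the description of every flag value contained in `flag`."""
    return [BOWTIE_FLAGS[bit] for bit in set_bits(flag)]


print(set_bits(83))       # [1, 2, 16, 64]
print(describe_flag(16))  # ['The alignment is to the reverse reference strand']
print(bool(4 & 4))        # True: the read has no reported alignment
```

Testing `flag & 4` rather than `flag == 4` also keeps working when other values are summed into the flag of an unaligned read.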