I've been trying to analyze reads from a short transcript. The data I have was obtained on a MiSeq machine, and it is paired end (2 separate files). I am new to RNAseq analysis, so I was advised to do the following:

trim off the primers and any adaptor sequence
assemble the two overlapping reads to get a consensus sequence for each fragment
discard any low quality data that remains
align the consensus sequences to your reference sequence

I've performed the following steps by using public Galaxy:

1. Removed the adapters with primers by using the Clip tool.
2. Ran the FASTQ joiner tool to combine both files into one.
3. This was followed by filtering by quality (FASTQ filter by quality tool).
4. Converted fastq to fasta by using the FASTQ to FASTA tool.
5. Attempted to run Clustal 2.1 to perform multiple sequence alignment.

Here (after step 5), the output was empty and I got the following error message at the end of the log file:

  Could not allocate a distance matrix for 126817 seqs.
  Terminate called after throwing an instance of 'std::bad_alloc'
  Need to terminate program.

Could anybody, please, explain to me what is the problem with my workflow?

If the data is from multiple samples, you might not want to assemble in batch, or at all, since this will mix the samples together, making a trace-back to a particular sample/condition difficult if not impossible. If this is the case, using a "QC > map > variant calling > (optional) annotation" type of workflow would probably be a better choice.

Is the sequenced data really RNA and not DNA? I understand the target is a particular transcript (if I understood you correctly), but both are possible sequencing options - it depends on what the samples were based on. Meaning, was the library prep done by targeting a genomic region, or was it based on an RNA library specific to the transcript? One sample/condition or many? These details make a difference in the type of tools you can use for mapping/variant calls.

Search the tool panel with the keywords RNA, DNA, and then VCF to get an idea of the type of tools available for each. Reviewing publications that performed the same analysis will also give you an idea of what others are doing, so you can make the best choices.
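As a side note on the Clustal error quoted in the question: a multiple sequence alignment of this kind needs a pairwise distance value for every pair of input sequences, so memory grows with the square of the sequence count. The sketch below is only a back-of-the-envelope estimate. It assumes 8-byte floating point values and either a full or a triangular matrix; the actual storage layout inside Clustal 2.1 is not confirmed here, and `distance_matrix_bytes` is a hypothetical helper written for illustration.

```python
def distance_matrix_bytes(n_seqs: int, bytes_per_value: int = 8, full: bool = True) -> int:
    """Estimate the bytes needed for a pairwise distance matrix.

    Assumption: one value of `bytes_per_value` bytes per cell. With full=True
    every (i, j) cell is stored; with full=False only the n*(n-1)/2 distinct
    pairs are stored.
    """
    if full:
        cells = n_seqs * n_seqs
    else:
        cells = n_seqs * (n_seqs - 1) // 2
    return cells * bytes_per_value


n = 126817  # sequence count reported in the error message

full_gib = distance_matrix_bytes(n) / 2**30
tri_gib = distance_matrix_bytes(n, full=False) / 2**30

print(f"full matrix:       {full_gib:.1f} GiB")
print(f"triangular matrix: {tri_gib:.1f} GiB")
```

Under these assumptions the estimate comes out at roughly 120 GiB for a full matrix and roughly 60 GiB for a triangular one, which would be consistent with a failed allocation ('std::bad_alloc') on a shared server. Mapping each read to the reference avoids the all-against-all comparison, so its cost grows with the number of reads rather than with its square.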