De novo Transcriptome Assembly for Pstr
I don’t have a reference genome or transcriptome for P. strigosa to align the Chapter 3 reads from the reciprocal transplant study against, so I need to generate a de novo transcriptome from the study’s own reads.
However, I can’t find a consensus on how many samples from the study to use. I have 192 sequence files: a forward and a reverse read file for each of the 96 samples.
Should I be using all of them? Won’t that be time- and resource-intensive? Is that even necessary?
GitHub Workflows
There are a few GitHub workflows I have been looking at for using Trinity:
1) Dr. Jill Ashey
2) Roberts lab
3) Trinity GitHub Repository
4) Trinity de novo Transcriptome assembly workshop
5) Shrimp project
None of the above workflows specify a minimum or maximum number of sequence files to use. Dr. Jill Ashey used 12 sequence files (6 forward, 6 reverse), i.e. 6 samples.
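To make the "subset of samples" option concrete, here is a minimal sketch of how a chosen set of paired files could be handed to Trinity as comma-separated `--left`/`--right` lists. The file naming scheme (`<sample>_R1_001.fastq.gz` / `<sample>_R2_001.fastq.gz`), the sample names, and the CPU and memory values are all assumptions for illustration, not taken from any of the workflows above.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: <sample>_R1_001.fastq.gz / <sample>_R2_001.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_001\.fastq\.gz$")


def pair_reads(filenames):
    """Group FASTQ file names into {sample: (forward, reverse)}."""
    found = defaultdict(dict)
    for name in filenames:
        match = READ_PATTERN.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        found[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, reads in sorted(found.items()):
        if set(reads) != {"1", "2"}:
            raise ValueError(f"sample {sample} is missing a mate file")
        pairs[sample] = (reads["1"], reads["2"])
    return pairs


def trinity_command(pairs, cpu=8, max_memory="50G", output="trinity_out"):
    """Build a Trinity argument list; cpu/max_memory/output are placeholders."""
    left = ",".join(fwd for fwd, _ in pairs.values())
    right = ",".join(rev for _, rev in pairs.values())
    return [
        "Trinity", "--seqType", "fq",
        "--left", left, "--right", right,
        "--CPU", str(cpu), "--max_memory", max_memory,
        "--output", output,
    ]


# Made-up file names for two samples (four files).
files = [
    "A1_R1_001.fastq.gz", "A1_R2_001.fastq.gz",
    "B2_R1_001.fastq.gz", "B2_R2_001.fastq.gz",
]
pairs = pair_reads(files)
print(" ".join(trinity_command(pairs)))
```

The same pairing step works for any subset size, so it does not settle how many samples to include; it only shows that going from 6 samples to all 96 is a change in the input lists, not in the command itself.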
Grabherr et al. 2011 (the first Trinity publication) state that 50 M read pairs are enough for Trinity to fully reconstruct 86% of annotated transcripts. According to the authors, the assembly is “saturated” beyond 50 M read pairs, meaning that adding more data does not improve it further.
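If the 50 M read-pair figure is taken as a rough target, the number of samples needed follows from the sequencing depth per sample. The sketch below is only that arithmetic: the 20 M pairs-per-sample depth is a made-up placeholder (I would need to count my own files), and `count_reads` assumes standard four-line, gzipped FASTQ records.

```python
import gzip
import math

# Saturation point as I read it from Grabherr et al. 2011.
TARGET_PAIRS = 50_000_000


def count_reads(fastq_gz_path):
    """Count records in a gzipped FASTQ file (assumes 4 lines per record)."""
    with gzip.open(fastq_gz_path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def samples_needed(pairs_per_sample, target=TARGET_PAIRS):
    """Smallest number of samples whose combined read pairs reach the target."""
    return math.ceil(target / pairs_per_sample)


# Hypothetical depth: 20 M read pairs per sample.
depth = 20_000_000
print(samples_needed(depth))  # 3
print(96 * depth)             # 1920000000 pairs if all 96 samples were used
```

Under that assumed depth, three samples would already pass 50 M pairs, while all 96 would be roughly 38 times the target. This only speaks to read depth; it says nothing about whether the chosen samples represent every treatment and site in the study.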
Haas et al. 2013 (the newer Trinity publication) does not address this.
Coral papers that construct de novo transcriptomes
- Alderdice et al. 2022
  - Paired-end reads (150 bp) from Acropora
  - SOAPdenovo-Trans with a k-mer length of 23 was used for de novo transcriptome assembly from all samples.
  - Number of samples: