De novo Transcriptome Assembly for Pstr

I don’t have a reference genome or transcriptome to align the Chapter 3 P. strigosa reads from the reciprocal transplant study to, so I need to generate a de novo transcriptome from the study’s own reads.

However, I can’t find a consensus on how many samples from the study to use. I have 192 sequence files: a forward and a reverse read file for each of the 96 samples.

Should I be using all of them? Won’t that be time/resource intensive? Is that even necessary?
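Whichever subset I end up using, the first practical step is matching each forward file to its reverse file. Below is a minimal sketch of that pairing in Python. The file naming pattern (`<sample>_R1...` / `<sample>_R2...`) is an assumption for illustration, not the actual naming of my files, and `pair_reads` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming: <sample>_R1_001.fastq.gz and <sample>_R2_001.fastq.gz.
# Adjust the pattern to match the real file names.
READ_PATTERN = re.compile(
    r"^(?P<sample>.+)_R(?P<read>[12])(?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$"
)


def pair_reads(filenames):
    """Group FASTQ file names into {sample: (forward_file, reverse_file)}."""
    by_sample = defaultdict(dict)
    for name in filenames:
        match = READ_PATTERN.match(name)
        if match is None:
            raise ValueError(f"Unrecognized file name: {name}")
        by_sample[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, reads in sorted(by_sample.items()):
        # Every sample must have exactly one forward (1) and one reverse (2) file.
        if set(reads) != {"1", "2"}:
            raise ValueError(f"Sample {sample} is missing a forward or reverse file")
        pairs[sample] = (reads["1"], reads["2"])
    return pairs
```

With 192 correctly named files, `len(pair_reads(files))` should come out to 96, which is a quick sanity check that no file is missing or misnamed.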

GitHub Workflows

There are a few GitHub workflows I have been looking at for using Trinity: 1) Dr. Jill Ashey 2) Roberts lab 3) Trinity GitHub Repository 4) Trinity de novo Transcriptome assembly workshop 5) Shrimp project

None of the above workflows specify a minimum or maximum number of sequence files to use. Dr. Jill Ashey used 12 samples (6 forward, 6 reverse).
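For reference, Trinity accepts comma-separated lists of files for `--left` and `--right`, so running it on a subset is mostly a matter of which files go into those lists. Here is a rough sketch of building that command in Python. The function name `trinity_command`, the output directory, and the CPU and memory values are placeholders I made up, not settings from any of the workflows above.

```python
def trinity_command(pairs, output_dir="trinity_out", cpus=8, max_memory="50G"):
    """Build a Trinity command line (as an argument list) for paired-end reads.

    pairs: list of (forward_file, reverse_file) tuples for the chosen samples.
    The resource settings are placeholders and depend on the machine.
    """
    if not pairs:
        raise ValueError("At least one forward/reverse pair is required")
    # Forward and reverse lists must be in the same sample order.
    left = ",".join(forward for forward, _ in pairs)
    right = ",".join(reverse for _, reverse in pairs)
    return [
        "Trinity",
        "--seqType", "fq",
        "--left", left,
        "--right", right,
        "--CPU", str(cpus),
        "--max_memory", max_memory,
        "--output", output_dir,
    ]


# Example with two hypothetical samples; prints the command instead of running it.
subset = [("S1_R1.fastq.gz", "S1_R2.fastq.gz"), ("S2_R1.fastq.gz", "S2_R2.fastq.gz")]
print(" ".join(trinity_command(subset)))
```

The list form can be passed to `subprocess.run` once the subset is decided, but printing it first makes it easy to check that the forward and reverse lists line up.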

Grabherr et al. 2011 (the first Trinity publication) state that 50 M read pairs is enough for Trinity to fully reconstruct 86% of annotated transcripts. The authors describe the assembly as “saturated” beyond 50 M read pairs, meaning that additional data is not needed.
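If I take the 50 M figure as a working target, I can estimate how many samples it would take to reach it by counting read pairs per sample. The sketch below is one way to do that, assuming standard unwrapped FASTQ (4 lines per record) and that the forward file alone gives the pair count. `count_reads` and `samples_to_reach` are hypothetical helper names, and whether 50 M pairs is the right target for my data is itself an assumption carried over from the Grabherr et al. result.

```python
import gzip


def count_reads(fastq_path):
    """Count records in a FASTQ file (gzipped or plain), assuming 4 lines per record."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def samples_to_reach(pairs_per_sample, target_pairs=50_000_000):
    """Add samples in the given order until the cumulative read pairs reach the target.

    pairs_per_sample: dict of {sample_name: read_pair_count}.
    Returns (chosen_sample_names, total_read_pairs). If the target is never
    reached, every sample is returned along with the total that was available.
    """
    chosen, total = [], 0
    for sample, n_pairs in pairs_per_sample.items():
        if total >= target_pairs:
            break
        chosen.append(sample)
        total += n_pairs
    return chosen, total
```

The per-sample counts would come from running `count_reads` on each forward file. The order of the dictionary decides which samples are picked, so it should be arranged deliberately (for example, alternating treatments) instead of left alphabetical.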

Haas et al. 2014 (the newer Trinity publication) doesn’t talk about this.

Coral papers that construct de novo transcriptomes

  1. Alderdice et al. 2022 -
    • Paired-end reads (150 bp) from Acropora
    • SOAPdenovo-Trans with a k-mer length of 23 was used for de novo transcriptome assembly from all samples.
    • Number of samples:
Written on July 25, 2025