Abstract:
The identification of transcription and translation from noncoding regions has added to
the known complexity of the genome, a problem further compounded by novel open reading
frames (ORFs) that are present within noncoding as well as genic regions and whose
functions are as yet unclear. In this study, we investigated two classes of novel ORFs,
short ORFs (sORFs) and alternative ORFs (altORFs), in mouse naive B and T cells. We
established evidence of transcription for 2721 sORFs and 4251 altORFs and found
3604 sORFs and 2104 altORFs to be translated. We also identified 289 sORFs and
980 altORFs as differentially expressed (DE) between B and T cells. Furthermore,
principal component analysis (PCA) indicated that transcript expression levels of these novel ORFs are
significant and sufficient to distinguish between the two cell types. Additionally,
differential methylation (DM) analysis of these differentially expressed novel ORFs
and protein-coding transcripts identified 117, 139 and 1398 differentially methylated
regions (DMRs) upstream of, downstream of and within the body of DE sORFs, respectively;
the corresponding counts were 199, 257 and 28497 for DE altORFs and 1712, 1679 and
24910 for protein-coding transcripts. Moreover, 46 sORFs containing DMRs were identified
in the upstream and downstream regions of protein-coding transcripts, suggesting that
expression of DE protein-coding transcripts might be affected by these sORFs. Also, we
found no evidence of LINE/SINE repeat
elements regulating expression of DE sORFs. Here, we present a framework for a
systematic investigation of transcription and translation from novel ORFs that could
be utilized to ascertain their functions or identify potential disease variants present
within them.