Description
RNA sequencing, or RNA-seq, is a method for mapping and quantifying the
transcriptome of any organism that has a genomic DNA sequence
assembly. Compared with microarrays, RNA-seq is especially
well suited to de novo
discovery of RNA splicing patterns and to determining unequivocally
the presence or absence of lower-abundance classes of RNA.
RNA-seq is performed by reverse-transcribing an RNA sample into
cDNA, followed by high-throughput DNA sequencing. Most data are produced
in one of two formats: single reads, each of which comes from one end
of a randomly primed cDNA molecule (and represents one end of one cDNA
segment), and paired-end reads, which are obtained as pairs
from both ends of a randomly primed cDNA (and represent two opposite
ends of one cDNA segment). The resulting sequence reads are then
computationally mapped onto the genome sequence (Alignments). Those that map,
the mapped reads, are counted to determine their frequency of occurrence at
known gene models. Those that do not map to the genome are mapped to
known RNA splice junctions (Splice Sites).
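
The counting step described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce these tracks. The gene models, read coordinates, and the function name count_reads_per_gene are all hypothetical, and the sketch assumes 0-based, half-open intervals and counts a read toward every gene model it overlaps.

```python
from collections import Counter

# Hypothetical gene models: (name, chromosome, start, end), 0-based half-open.
GENE_MODELS = [
    ("geneA", "chr1", 100, 500),
    ("geneB", "chr1", 450, 900),
    ("geneC", "chr2", 0, 300),
]

# Hypothetical mapped reads: (chromosome, start, end) of each alignment.
MAPPED_READS = [
    ("chr1", 120, 156),
    ("chr1", 460, 496),
    ("chr1", 880, 916),
    ("chr2", 290, 326),
    ("chr3", 10, 46),   # maps to the genome but overlaps no gene model
]


def count_reads_per_gene(reads, genes):
    """Count how many mapped reads overlap each gene model."""
    # Start every gene at zero so genes with no reads still appear.
    counts = Counter({name: 0 for name, *_ in genes})
    for chrom, start, end in reads:
        for name, g_chrom, g_start, g_end in genes:
            # Two half-open intervals overlap when each starts before the other ends.
            if chrom == g_chrom and start < g_end and end > g_start:
                counts[name] += 1
    return counts


counts = count_reads_per_gene(MAPPED_READS, GENE_MODELS)
for gene, n in sorted(counts.items()):
    print(gene, n)
```

In this toy input the read at chr1:460-496 falls where geneA and geneB overlap, so it is counted for both. A real analysis would need an explicit rule for such reads, and would normally use an interval index rather than the nested loop shown here.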
Some RNA-seq protocols do not specify the coding strand. As a result,
there can be ambiguity at loci where both strands are transcribed.
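
A small Python sketch may help make this ambiguity concrete. The two gene models and the function assign_read are hypothetical, and the strand argument stands for the transcript strand inferred for a read, which is an assumption about how a strand-specific protocol would be interpreted. Passing strand=None models a protocol that does not record the coding strand.

```python
# Hypothetical overlapping gene models on opposite strands:
# (name, chromosome, start, end, strand), 0-based half-open coordinates.
GENES = [
    ("senseGene", "chr1", 100, 500, "+"),
    ("antisenseGene", "chr1", 400, 800, "-"),
]


def assign_read(chrom, start, end, strand=None, genes=GENES):
    """Return the names of all gene models a read could derive from.

    strand=None means the protocol did not specify the coding strand,
    so gene models on either strand are accepted.
    """
    hits = []
    for name, g_chrom, g_start, g_end, g_strand in genes:
        overlaps = chrom == g_chrom and start < g_end and end > g_start
        strand_ok = strand is None or strand == g_strand
        if overlaps and strand_ok:
            hits.append(name)
    return hits


# The same read, with and without strand information.
unstranded = assign_read("chr1", 420, 456)
stranded = assign_read("chr1", 420, 456, strand="-")
print(unstranded)  # both genes are candidates
print(stranded)    # only the minus-strand gene remains
```

Without strand information the read is compatible with both gene models. With it, only one candidate remains.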
Display Conventions
These tracks are multi-view composite tracks that contain multiple
data types (views). Each view within a track
has separate display controls, as described here.
Most ENCODE tracks contain multiple subtracks, corresponding to
multiple experimental conditions. If a track contains a large
number of subtracks, only some subtracks will be displayed by default.
The user can select which subtracks are displayed via the display controls
on the track details pages.
Credits
These data were generated and analyzed as part of the ENCODE project, a
genome-wide consortium project with the aim of cataloging all
functional elements in the human genome. This effort includes
collecting a variety of data across related experimental conditions, to
facilitate integrative analysis. Consequently, additional ENCODE tracks may
contain data that is relevant to the data in these tracks.
References
Morozova O, Hirst M, Marra MA. Applications of new sequencing
technologies for transcriptome analysis. Annual Review of
Genomics and Human Genetics. 2009;10:135-51.
Metzker ML. Sequencing technologies - the next generation.
Nature Reviews: Genetics. 2010 Jan;11(1):31-46.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior
consent, submit publications that use an unpublished ENCODE dataset
until nine months following the release of the dataset. This date is
listed in the Restricted Until column on the track configuration page
and the download page. The full data release policy for ENCODE is
available here.
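
As a rough illustration of the nine-month rule, the following Python sketch computes the end of the restriction period from a release date. The release date shown is hypothetical, add_months is a helper written for this example, and clamping to the last day of a shorter month is an assumption. The Restricted Until column remains the authoritative source for the actual date.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after d.

    If the target month is shorter, the day is clamped to its last day
    (an assumption made for this sketch, not a stated ENCODE rule).
    """
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


release = date(2010, 5, 31)  # hypothetical dataset release date
restricted_until = add_months(release, 9)
print(restricted_until.isoformat())
```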