Description

RNA sequencing, or RNA-seq, is a method for mapping and quantifying the transcriptome of any organism that has a genomic DNA sequence assembly. Compared to microarrays, RNA-seq is especially well-suited for de novo discovery of RNA splicing patterns and for determining unequivocally the presence or absence of lower-abundance classes of RNA.

RNA-seq is performed by reverse-transcribing an RNA sample into cDNA, followed by high-throughput DNA sequencing. Most data is produced in one of two formats: single reads, each of which comes from one end of a randomly primed cDNA molecule (and represents one end of one cDNA segment), and paired-end reads, which are obtained as pairs from both ends of a randomly primed cDNA (and represent two opposite ends of one cDNA segment). The resulting sequence reads are then informatically mapped onto the genome sequence (Alignments). Reads that map to the genome (mapped reads) are counted to determine their frequency of occurrence at known gene models. Reads that do not map to the genome are mapped to known RNA splice junctions (Splice Sites).
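The counting step described above can be illustrated with a minimal Python sketch. It is not the ENCODE pipeline: the Gene and Read types, the count_reads_per_gene function, and all coordinates are hypothetical, and the sketch assumes reads have already been aligned and that a read is counted for every gene model whose interval it overlaps.

```python
from collections import Counter
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene model; coordinates are 0-based, end-exclusive."""
    name: str
    chrom: str
    start: int
    end: int


class Read(NamedTuple):
    """Hypothetical mapped read with the same coordinate convention."""
    chrom: str
    start: int
    end: int


def count_reads_per_gene(reads, genes):
    """Count mapped reads that overlap each gene model.

    Assumption: any overlap of at least one base counts, and a read that
    overlaps several gene models is counted once for each of them.
    """
    counts = Counter({g.name: 0 for g in genes})
    for r in reads:
        for g in genes:
            same_chrom = r.chrom == g.chrom
            overlaps = r.start < g.end and g.start < r.end
            if same_chrom and overlaps:
                counts[g.name] += 1
    return counts


# Made-up example data, for illustration only.
genes = [
    Gene("geneA", "chr1", 100, 500),
    Gene("geneB", "chr1", 800, 1200),
]
reads = [
    Read("chr1", 120, 156),    # inside geneA
    Read("chr1", 480, 516),    # overlaps the 3' edge of geneA
    Read("chr1", 600, 636),    # intergenic, not counted
    Read("chr1", 1190, 1226),  # overlaps the edge of geneB
    Read("chr2", 120, 156),    # different chromosome, not counted
]

counts = count_reads_per_gene(reads, genes)
for name, n in sorted(counts.items()):
    print(name, n)
```

The nested loop keeps the sketch short. With realistic data volumes an interval index or a sorted sweep would be a more practical choice.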

Some RNA-seq protocols do not specify the coding strand. As a result, there can be ambiguity at loci where both strands are transcribed.
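To make the ambiguity concrete, the following Python sketch lists pairs of gene models that overlap on opposite strands. At such loci, an unstranded read cannot be assigned to one transcript with confidence. The StrandedGene type, the opposite_strand_overlaps function, and the coordinates are hypothetical and serve only as an illustration.

```python
from typing import NamedTuple


class StrandedGene(NamedTuple):
    """Hypothetical gene model; coordinates are 0-based, end-exclusive."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def opposite_strand_overlaps(gene_models):
    """Return name pairs of gene models that overlap on opposite strands."""
    pairs = []
    for i, a in enumerate(gene_models):
        for b in gene_models[i + 1:]:
            if (
                a.chrom == b.chrom
                and a.strand != b.strand
                and a.start < b.end
                and b.start < a.end
            ):
                pairs.append((a.name, b.name))
    return pairs


# Made-up example data, for illustration only.
gene_models = [
    StrandedGene("fwd1", "chr1", 100, 500, "+"),
    StrandedGene("rev1", "chr1", 400, 900, "-"),   # overlaps fwd1, other strand
    StrandedGene("fwd2", "chr1", 1000, 1500, "+"),
]

ambiguous = opposite_strand_overlaps(gene_models)
print(ambiguous)
```

In this example, an unstranded read falling between positions 400 and 500 on chr1 could have come from either fwd1 or rev1.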

Display Conventions

These tracks are multi-view composite tracks that contain multiple data types (views). Each view within a track has separate display controls, as described here. Most ENCODE tracks contain multiple subtracks, corresponding to multiple experimental conditions. If a track contains a large number of subtracks, only some subtracks will be displayed by default. The user can select which subtracks are displayed via the display controls on the track details pages.

Credits

These data were generated and analyzed as part of the ENCODE project, a genome-wide consortium project with the aim of cataloging all functional elements in the human genome. This effort includes collecting a variety of data across related experimental conditions, to facilitate integrative analysis. Consequently, additional ENCODE tracks may contain data that is relevant to the data in these tracks.

References

Morozova O, Hirst M, Marra MA. Applications of new sequencing technologies for transcriptome analysis. Annual Review of Genomics and Human Genetics. 2009;10:135-51.

Metzker ML. Sequencing technologies - the next generation. Nature Reviews Genetics. 2010 Jan;11(1):31-46.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.