Database: The Rice lncRNA database consists of 6,064 rice genome lncRNA sequences compiled mainly from three databases:

1. GreeNC Database (5,237 sequences): http://GreeNC.sciencedesigners.com/wiki/Main_Page
GreeNC database: The Green Non-Coding Database (GreeNC) is a repository of lncRNAs annotated in plants and algae. It was designed to give the scientific community a tool that can boost research on this class of transcripts. The GreeNC database provides information about sequence, genomic coordinates, coding potential and folding energy for all the identified lncRNAs.

2. Plant Non-Coding RNA Database (790 sequences): http://structuralbiology.cau.edu.cn/PNRD/
Plant Non-coding RNA Database: PNRD is the updated version of PMRD and focuses mainly on plant species. A total of 25,739 entries of 11 different types of ncRNAs from 150 plant species were collected. Targets of miRNAs were extended to 178,138 pairs in 46 species, while the number of miRNA expression profiles reached 35.

3. PLNlncRBase Database (37 experimentally identified sequences): http://bioinformatics.ahau.edu.cn/PLNlncRBase
PLNlncRbase: PLNlncRbase has been designed as an easy-to-use resource that provides detailed information on experimentally identified plant lncRNAs. In the current version, PLNlncRbase contains data manually collected from nearly 200 published articles, covering a total of 1,187 plant lncRNAs in 43 plant species. The user can retrieve plant lncRNA entries from a well-organized interface through a keyword search, using the name of a plant species or an lncRNA identifier. Each query returns detailed information for a specific plant lncRNA, including the species name, the lncRNA identifier, a brief description of the potential biological role, the lncRNA sequence, the lncRNA classification, the expression pattern of the lncRNA, the tissue/developmental stage/condition of lncRNA expression, the detection method for lncRNA expression, a literature reference, and the potential target gene(s) of the lncRNA extracted from the original reference.

Long non-coding RNAs were downloaded from these databases, giving 6,064 sequences in total. Since these were all rice lncRNA sequences, the BLASTClust tool was used to check for redundancy among them. BLASTClust is a standalone tool that comes with the BLAST package and is used to cluster nucleotide as well as protein sequences. Using BLASTClust, no redundant sequences were found at an identity of 75% and a query coverage of 80%. Experimentally identified sequences were retained from PLNlncRbase, and the same sequences in the other databases were removed. The compiled dataset was then analyzed for the presence of motifs and repeats in rice genome lncRNA sequences.
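The redundancy check described above can be sketched in Python as follows. This is only an illustration, not the pipeline actually used: the file names and sequence identifiers are hypothetical, and mapping "identity of 75%" to -S 75 and "query coverage of 80%" to -L 0.8 in the legacy BLASTClust command line is an assumption. The sketch builds the command without running it and then inspects BLASTClust-style output, in which each line lists the identifiers of one cluster.

```python
def blastclust_command(fasta_path, out_path, identity=75, coverage=0.8):
    """Build (but do not run) a legacy BLASTClust command for nucleotide input.

    Assumption: the thresholds in the text correspond to -S and -L.
    """
    return [
        "blastclust",
        "-i", fasta_path,      # input FASTA file (hypothetical name)
        "-o", out_path,        # output file, one cluster per line
        "-p", "F",             # F = sequences are nucleotide, not protein
        "-S", str(identity),   # percent identity threshold
        "-L", str(coverage),   # length coverage threshold
    ]


def parse_clusters(text):
    """Split BLASTClust output into clusters (lists of sequence IDs)."""
    return [line.split() for line in text.splitlines() if line.strip()]


def redundant_clusters(clusters):
    """Clusters with more than one member indicate redundant sequences."""
    return [cluster for cluster in clusters if len(cluster) > 1]


# Hypothetical output used only to show the parsing step.
example_output = "lnc_001 lnc_017\nlnc_002\nlnc_003\n"
clusters = parse_clusters(example_output)
redundant = redundant_clusters(clusters)

command = blastclust_command("rice_lncRNA.fasta", "clusters.txt")
print(" ".join(command))
print(len(clusters), "clusters,", len(redundant), "with more than one sequence")
```

If every cluster contains a single sequence, the dataset has no redundancy at the chosen thresholds, which is the outcome reported for the 6,064 sequences.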
The RegRNA 2.0 server was used to predict 14 types of motifs:

1. AU-rich elements
2. Cis-regulatory elements
3. Functional RNA sequences
4. Long stems
5. MicroRNA target sites
6. ncRNA hybridization regions
7. Open reading frames (ORFs)
8. Polyadenylation sites
9. Rho-independent terminators
10. Ribosome binding sites
11. Rice splicing sites
12. RNA C-to-U editing sites
13. Transcriptional regulatory motifs
14. Untranslated region (UTR) motifs

RepeatMasker is a computational tool for identifying, classifying and masking repeat elements. Using RepeatMasker, the repetitive sequences were identified along with their type, class/family, position and length, and were analyzed using an in-house Perl program.
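The in-house Perl program is not shown here. As a rough idea of what such an analysis involves, the following Python sketch summarizes a RepeatMasker .out table by repeat class/family. It assumes the standard whitespace-separated .out layout (score, divergence, deletions, insertions, query name, query begin, query end, bases left, strand, repeat name, class/family, ...). The sample rows, sequence identifiers and function names are hypothetical.

```python
from collections import defaultdict


def parse_repeatmasker_out(text):
    """Extract repeat hits from RepeatMasker .out text.

    Header and blank lines are skipped because their first field
    is not an integer score.
    """
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])
        hits.append({
            "query": fields[4],
            "begin": begin,
            "end": end,
            "length": end - begin + 1,   # positions are 1-based, inclusive
            "repeat": fields[9],
            "class_family": fields[10],
        })
    return hits


def summarize_by_class(hits):
    """Count hits and total masked bases per repeat class/family."""
    summary = defaultdict(lambda: {"count": 0, "bases": 0})
    for hit in hits:
        entry = summary[hit["class_family"]]
        entry["count"] += 1
        entry["bases"] += hit["length"]
    return dict(summary)


# Hypothetical rows in the .out layout, for illustration only.
sample = """\
   SW   perc perc perc  query    position in query   matching repeat        position in repeat
score   div. del. ins.  sequence begin end  (left)   repeat   class/family  begin end (left) ID

  239   29.4  1.9  1.0  lnc_001    2   106  (394) +  (CA)n    Simple_repeat     1  105   (0)  1
  512   18.2  0.0  0.0  lnc_001  200   349  (151) C  MITE_x   DNA/MULE-MuDR  (10)  150     1  2
  301   22.0  2.1  0.5  lnc_002   10    59  (940) +  (AT)n    Simple_repeat     1   50   (0)  3
"""

hits = parse_repeatmasker_out(sample)
summary = summarize_by_class(hits)
for class_family, stats in sorted(summary.items()):
    print(class_family, stats["count"], stats["bases"])
```

Grouping by the class/family column gives the per-class counts and lengths mentioned above; grouping by the query column instead would give a per-lncRNA view.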


Copyright © Tamil Nadu Agricultural University