To help researchers carry out structure and sequence analysis, annotation, gene target prediction, and functional queries for coding and non-coding RNAs, we have listed useful online tools below. RiboBio’s informatics analysis platform integrates multiple bioinformatics databases, allowing researchers to customize their data on the web interface and build a personalized data analysis workflow. To request the service, please call +86 400-686-0075 or email firstname.lastname@example.org.
NCBI (National Center for Biotechnology Information) conducts basic and applied research in computational molecular biology, with a team of computer scientists, molecular biologists, mathematicians, biochemists, experimental physicists, and structural biologists. It studies gene organization, sequence analysis, and structure prediction. Its current research programs include the detection and analysis of gene organization, repeat sequence patterns, protein domains and structural units; the construction of genetic maps of the human genome; mathematical models of the kinetics of HIV infection; analysis of the effects of sequence errors on database searches; the development of new algorithms for database searching and multiple sequence alignment; the construction of a non-redundant sequence database; mathematical models for evaluating the statistical significance of sequence similarity; and vector models for text retrieval. In addition, NCBI researchers collaborate with other institutes within NIH and with many academic and government research laboratories.
The miRBase sequence database, one of the authoritative data sources for miRNA research, is a comprehensive database that provides miRNA sequence data, annotations, and predicted gene targets. The version described here, miRBase 21.0, offers higher reliability, with a total of 4,196 hairpin precursor sequences and 5,441 mature miRNAs; some ambiguous and incorrectly annotated sequences have been removed.
exoRBase is a repository of circular RNA (circRNA), long non-coding RNA (lncRNA) and messenger RNA (mRNA) derived from RNA-seq data analyses of human blood exosomes. Experimental validations from published articles are also included. exoRBase integrates and visualizes RNA expression profiles based on standardized RNA-seq data spanning both normal individuals and patients with different diseases. It aims to collect and characterize all long RNA species in human blood exosomes by providing annotation, expression levels and possible tissues of origin. exoRBase helps researchers identify molecular signatures in blood exosomes, which may lead to the discovery of new circulating biomarkers and to insight into their functional roles in human disease.
The piRNA database was created by Indian scientist Sai Lakshmi S. and colleagues to archive newly discovered piRNA sequences. It contains the most comprehensive piRNA sequence information, including nearly 20 million piRNA-related sequences from human, mouse and rat. After removal of homologous sequences and genomic mapping, more than 100,000 piRNA sequences with unique genomic target sites were identified. The database supports a variety of search methods by species and chromosome, and it can also display each piRNA or piRNA cluster on a large genome map.
DIANA-LncBase provides comprehensive annotation of miRNA targets on lncRNAs: it is a database of experimentally supported and in silico predicted miRNA Recognition Elements (MREs) on lncRNAs in human and mouse. The experimentally supported entries in DIANA-LncBase correspond to more than 5,000 interactions, while the computationally predicted interactions exceed 10 million. It also maintains detailed information on each miRNA-lncRNA pair, such as external links, graphical mapping of transcript genomic locations, binding site characterization, lncRNA tissue expression, and MRE conservation and prediction scores.
starBase is a powerful database commonly used in lncRNA/circRNA/microRNA research. Drawing on multiple sequencing data sets, starBase integrates prediction software to help users: 1. find non-coding RNAs (lncRNA, circRNA, etc.) based on a microRNA; 2. find mRNA targets based on a microRNA; 3. find ceRNA regulatory molecules; 4. find RNA-binding proteins.
circRNADisease is a manually curated database of experimentally supported circRNA-disease associations and an important source of knowledge for understanding the role of circRNAs in human disease. Each entry contains the circRNA and disease names, circRNA expression pattern, experimental techniques, circRNA interaction partners, a description of circRNA function, literature references and other circRNA annotations.
Cancer-specific CircRNA Database
CSCD (Cancer-Specific CircRNA Database) is a database developed for cancer-specific circRNAs. CSCD collected available RNA sequencing data sets (total RNA with rRNA depletion or poly(A) enrichment) from 87 cancer cell samples and analyzed them comprehensively with four popular algorithms: CIRI, find_circ, circRNA_finder and CIRCexplorer. It also collects circRNAs identified only in cancer samples. There are currently 272,152 cancer-specific circRNAs in CSCD.
CIRCpedia v2 is an updated comprehensive database containing circRNA annotations from more than 150 RNA-seq data sets across six different species, allowing users to search, browse and download circRNAs together with their expression features in various cells and tissues (including disease samples). In addition, the updated database incorporates a conservation analysis of circRNAs between human and mouse. The web interface also provides a computational tool for comparing circRNA expression between samples.
GO Enrichment Analysis
Once a gene set of interest has been obtained, GO enrichment analysis identifies the major biological functions of the differentially expressed genes by testing which GO terms are significantly over-represented among them. The Gene Ontology covers three domains: Biological Process, Cellular Component and Molecular Function. The basic unit of GO is the term, and each term corresponds to an attribute.
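As an illustration of how the significance of a single GO term is commonly assessed, the sketch below computes a one-sided hypergeometric p-value. This is a minimal example under stated assumptions, not the method of any particular tool listed on this page; the function name `go_term_enrichment_p` and all the counts in the usage lines are hypothetical.

```python
from math import comb


def go_term_enrichment_p(background_size, term_size, gene_set_size, overlap):
    """One-sided hypergeometric p-value for a single GO term.

    background_size: number of annotated genes in the background
    term_size:       background genes annotated to the GO term
    gene_set_size:   genes in the differentially expressed set
    overlap:         genes in the set that carry the GO term

    Returns P(X >= overlap), the chance of drawing at least `overlap`
    term-annotated genes when picking `gene_set_size` genes at random.
    """
    upper = min(term_size, gene_set_size)
    total = comb(background_size, gene_set_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, gene_set_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 200 annotated to the term,
# 500 differentially expressed genes, 15 of which carry the term.
p_value = go_term_enrichment_p(20000, 200, 500, 15)
print(f"p = {p_value:.3g}")
```

In practice one p-value is computed per GO term, so the results are usually adjusted for multiple testing before terms are reported as enriched.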
KEGG Enrichment Analysis
KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis identifies the most important metabolic and signal transduction pathways in which the differentially expressed genes are involved. KEGG is a major public pathway database. Within an organism, genes act together to carry out biological functions, and pathway analysis helps to further clarify those functions.
The TANRIC database, an atlas of non-coding RNA in cancer, is a valuable tool for lncRNA research. It characterizes the expression profiles of lncRNAs in large patient cohorts covering 20 cancer types and more than 8,000 samples overall. TANRIC is an interactive data analysis and visualization platform that contains three categories of data: lncRNA annotation information, RNA-Seq data, and profiling data.
fRNAdb is a database service that hosts a large number of non-coding transcripts, including annotated and non-annotated sequences from the H-inv database, NONCODE and RNAdb. A set of computational analyses was performed on these sequences, including RNA secondary structure motif discovery, EST support evaluation, cis-regulatory element search, and protein homology search.
Predict circRNA binding target – CircInteractome
CircInteractome predicts binding sites on the circRNAs in circBase using 109 known RNA-binding protein data sets, and predicts potential miRNA binding sites on circRNAs using TargetScan. It supports operations such as circRNA search, circRNA-binding protein prediction, PCR primer design, and siRNA sequence design.
A Venn diagram is a set of overlapping circles that shows all possible logical relations among data sets. It can represent the intersections and differences among up to 30 data sets.
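The regions of a Venn diagram correspond to items that belong to exactly one combination of sets. The sketch below shows one way to compute them with plain Python sets; it is an illustrative example rather than the implementation behind the tool above, and the function name `venn_regions` and the miRNA lists are hypothetical.

```python
from itertools import combinations


def venn_regions(named_sets):
    """Map each combination of set names to the items found in exactly
    those sets and in no others (one entry per non-empty Venn region)."""
    names = sorted(named_sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Items shared by every set in this combination ...
            inside = set.intersection(*(named_sets[n] for n in combo))
            # ... minus anything that also appears in a set outside it.
            outside = set().union(*(named_sets[n] for n in names if n not in combo))
            exclusive = inside - outside
            if exclusive:
                regions[combo] = exclusive
    return regions


# Hypothetical miRNA lists from three experiments.
experiments = {
    "A": {"miR-21", "miR-155", "let-7a"},
    "B": {"miR-21", "let-7a", "miR-10b"},
    "C": {"let-7a", "miR-34a"},
}
for combo, items in venn_regions(experiments).items():
    print(" & ".join(combo), "->", sorted(items))
```

Enumerating every combination grows as 2^n with the number of sets, so this direct approach is only practical for a handful of sets.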
RNAfold web server
NRED, built by John Mattick Lab, provides gene expression information for thousands of long ncRNAs in human and mouse.
The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.
LNCipedia offers comprehensive annotation of the sequence and structure of human long non-coding RNAs.
Exosomes are 30-150 nm membrane vesicles of endocytic origin secreted by most cell types in vitro. ExoCarta is an exosome database that catalogs the contents identified in exosomes from multiple organisms.
Vesiclepedia is a database dedicated to the study of extracellular vesicles. It covers 33 species and 538 source studies, and includes the miRNAs, mRNAs, proteins and lipids found in apoptotic bodies, microvesicles, exosomes and other vesicles.
The Functional lncRNA Database
lncRNAdb, built by the John Mattick Lab, provides comprehensive annotation of biologically functional long non-coding RNAs.
NONCODE provides comprehensive annotation for long non-coding RNA, including expression and lncRNA function predicted by the team’s ncFANs computer software.
GeneCards is a searchable integrative database that provides concise genome-related information for all known and predicted human genes.
NCBI BLAST is a common tool for sequence analysis. BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.
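The statistical significance that BLAST reports is usually expressed as an E-value, which under Karlin-Altschul statistics is E = K·m·n·e^(−λS) for a raw alignment score S, query length m and database length n. The sketch below evaluates that formula and the equivalent bit-score form. The function names are hypothetical, and the K and λ values in the usage lines are illustrative placeholders only; real values depend on the scoring system and are reported by BLAST itself.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalized bit score S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least raw_score:
    E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Illustrative placeholder parameters (assumptions, not taken from any BLAST run).
LAMBDA, K = 1.37, 0.71
e_value = expect_value(raw_score=40, query_len=100, db_len=1_000_000, lam=LAMBDA, k=K)
print(f"bit score = {bit_score(40, LAMBDA, K):.1f}, E-value = {e_value:.3g}")
```

A smaller E-value means a match of that score is less likely to arise by chance in a database of that size.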
Linc2GO is a web resource that aims to provide comprehensive functional annotations for human lincRNA. MicroRNA-mRNA and microRNA-lincRNA interaction data were integrated to generate lincRNA functional annotations based on the ‘competing endogenous RNA hypothesis’.
lncLocator: long non-coding RNA
lncLocator was developed by the Pattern Recognition and Bioinformatics Group of Shanghai Jiao Tong University to predict the subcellular localization of lncRNAs.
UCSC Genome Browser
circNet generates an integrated interaction network between circRNAs, miRNAs and mRNAs.
circBase is used to explore circRNA information.
CircPro is an automated high-throughput data analysis pipeline capable of detecting circRNAs, predicting their protein-coding potential and discovering junction reads from Ribo-Seq data.
Cytoscape is an open source software platform for visualizing and analyzing complex networks. It creates molecular interaction networks.
catRAPID is an algorithm for large-scale prediction of RNA-protein interactions.