GLASSgo (GLobal Automated sRNA Search go) combines iterative BLAST
searches, pairwise identity filtering, and structure-based
clustering in an automated prediction pipeline to find sRNA
homologs from scratch. The web server provides
predefined parameter sets for non-expert users
and also allows a manual setup of the query parameters.
The returned GLASSgo result is in FASTA format; the first
entry represents the input sequence.
Published precomputed results for
IsaR1 from PCC6803,
IsaR1 from PCC6803 (adapted parameters), and
IsaR1 from PCC7424
using GLASSgo v1.5.0.
Introduction
When using GLASSgo please cite:
- Steffen C. Lott, Richard A Schäfer, Martin Mann, Rolf Backofen, Wolfgang R Hess, Bjoern Voss, Jens Georg
GLASSgo - Automated and reliable detection of sRNA homologs from a single input sequence
Frontiers in Genetics, 9, 124, 2018.
- Martin Raden, Syed M Ali, Omer S Alkhnbashi, Anke Busch, Fabrizio Costa, Jason A Davis, Florian Eggenhofer, Rick Gelhausen, Jens Georg, Steffen Heyne, Michael Hiller, Kousik Kundu, Robert Kleinkauf, Steffen C Lott, Mostafa M Mohamed, Alexander Mattheis, Milad Miladi, Andreas S Richter, Sebastian Will, Joachim Wolff, Patrick R Wright, and Rolf Backofen
Freiburg RNA tools: a central online resource for RNA-focused research and teaching
Nucleic Acids Research, 46(W1), W25-W29, 2018.
Results are computed with GLASSgo version 1.5.2
Overview
The following parameters are used to control the execution of GLASSgo
Furthermore, additional information is available
Sequence Parameters
Query sRNA in FASTA
The (single) sRNA sequence has to be provided in FASTA format.
Input can be given either as direct text input or by uploading a file.
A sequence in FASTA format begins with a single-line sequence identifier that starts with a greater-than (">") symbol, followed by lines of sequence data.
For readability, it is recommended that each line is at most 80 characters in length.
The parameter constraints are: The input has to be in valid FASTA format. The number of sequences has to be at least 1 and at most 1. Sequence lengths have to be in the range 20-800. The allowed sequence alphabet is 'ACGUTacgut'.
Defaults to ()
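The constraints listed above for the query sRNA can be checked locally before submitting a job. The following is a minimal sketch, not the server's own validation code; the function names `parse_fasta` and `validate_query` are hypothetical and only illustrate the stated rules (exactly one sequence, length 20-800, alphabet 'ACGUTacgut').

```python
ALLOWED_ALPHABET = set("ACGUTacgut")  # alphabet stated in the parameter constraints
MIN_LEN, MAX_LEN = 20, 800            # sequence length range stated above


def parse_fasta(text):
    """Return a list of (header, sequence) tuples parsed from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before a '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def validate_query(text):
    """Check the query against the documented constraints; return (header, sequence)."""
    records = parse_fasta(text)
    if len(records) != 1:
        raise ValueError(f"expected exactly 1 sequence, found {len(records)}")
    header, seq = records[0]
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        raise ValueError(f"sequence length {len(seq)} outside {MIN_LEN}-{MAX_LEN}")
    invalid = set(seq) - ALLOWED_ALPHABET
    if invalid:
        raise ValueError(f"invalid characters: {''.join(sorted(invalid))}")
    return header, seq
```

A call such as `validate_query(open("query.fa").read())` would raise a `ValueError` describing the first violated constraint; the file name here is only a placeholder.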
Search Parameters
Taxon selection
By default, the GLASSgo search is based on the complete NCBI
‘nt’ database. In general, sRNAs show a limited distribution
across the phylogenetic tree, so a targeted search in a
specific taxonomic group is likely to perform better. Thus, select the
taxon that your search should be limited to.
Parameter setup
You can run GLASSgo either in automated mode or you
can manually set the advanced parameters.
Maximum allowed E-value
The E-value mainly influences the sensitivity of GLASSgo.
A relaxed E-value (>1.0) increases the chance to get more
sequences, but also increases computation time.
The parameter constraints are: Input value has to be parsable as Double. The value must be smaller than or equal to 50.
Defaults to (1)
Minimum allowed identity [%]
Each sRNA candidate is compared to the query sRNA at the
sequence level and must have a percent identity larger than
the value of this parameter to be kept for further analysis.
Please note that values lower than 65% increase the total
number of hits, but also slightly increase the probability
of false positives.
The parameter constraints are: Input value has to be parsable as Double. The value must be greater than or equal to 10 and must be smaller than or equal to 75.
Defaults to (52)
Structure-based clustering
Defines whether or not structural clustering (via Londen) is to be applied.
The parameter constraints are: Input value has to be parsable as Integer.
Defaults to (on)
Structure-based clustering
Structure-based filtering
Structure-based filtering can either be done automatically,
or you can manually set a
structure-based filtering value (see the corresponding
parameter).
Manual value for filtering
The structure-based filtering represents the
third filtering step of GLASSgo and is applied to the
candidate hits with medium percent identity (min_identity
< %ID < 80%). Lowering the parameter value results
in a stricter analysis (fewer false positives), while raising
it relaxes the filter.
The parameter constraints are: Input value has to be parsable as Double. The value must be greater than or equal to 0 and must be smaller than or equal to 3.
Defaults to (2)
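The identity thresholds described above can be pictured as a simple routing rule. The sketch below is an illustration only: the function name `route_hit` and the returned labels are hypothetical, and the handling of hits at or above 80% identity (kept without structure-based filtering) is an assumption, since the text only states that the filter applies to hits with min_identity < %ID < 80%.

```python
def route_hit(percent_identity, min_identity=52.0, high_identity=80.0):
    """Decide how a candidate hit is treated based on its percent identity.

    min_identity defaults to the documented default (52); high_identity is
    the 80% upper bound of the medium-identity range described above.
    """
    if percent_identity >= high_identity:
        # Assumption: high-identity hits bypass the structure-based filter.
        return "keep"
    if percent_identity > min_identity:
        # Medium identity: min_identity < %ID < 80%.
        return "structure_filter"
    # Not larger than the minimum allowed identity.
    return "discard"
```

For example, a hit with 60% identity would be routed to the structure-based filter under the default settings, while a hit with exactly 52% would be discarded because the identity must be larger than the minimum.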
Additional Settings
Include upstream region
Setting the parameter 'Upstream Region' to 100 extracts the 100 nucleotides upstream of each predicted GLASSgo hit. This additional sequence
information is directly concatenated with the corresponding GLASSgo hit
and is therefore an integral part of the returned GLASSgo
results. Note: The upstream region is not considered when the similarity
value [%] is computed! In addition, if the upstream region is activated, the FASTA header (e.g. the start
position) of each GLASSgo hit
is updated and the upstream region is reported in the form
-UTR-REGION-100nt:1002422-1002521-. You can find further information
about the GLASSgo results in the output help section.
(0 == no additional upstream region included).
The parameter constraints are: Input value has to be parsable as Integer. The value must be greater than or equal to 0 and must be smaller than or equal to 500.
Defaults to (0)
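The numeric parameter ranges and defaults documented above can be collected in one place and checked before a job is submitted. This is a minimal sketch under stated assumptions: the class `GlassgoParams` and its field names are hypothetical and do not correspond to any official GLASSgo API; only the ranges and defaults are taken from this page.

```python
from dataclasses import dataclass


@dataclass
class GlassgoParams:
    """Hypothetical container for the documented parameters and their defaults."""
    e_value: float = 1.0           # maximum allowed E-value, must be <= 50
    min_identity: float = 52.0     # minimum allowed identity [%], 10 to 75
    structure_filter: float = 2.0  # manual value for filtering, 0 to 3
    upstream_region: int = 0       # upstream nucleotides, 0 to 500 (0 = none)

    def validate(self):
        """Return a list of human-readable constraint violations (empty if valid)."""
        errors = []
        if self.e_value > 50:
            errors.append("E-value must be smaller than or equal to 50")
        if not 10 <= self.min_identity <= 75:
            errors.append("minimum identity must be between 10 and 75")
        if not 0 <= self.structure_filter <= 3:
            errors.append("structure-based filtering value must be between 0 and 3")
        if not 0 <= self.upstream_region <= 500:
            errors.append("upstream region must be between 0 and 500")
        return errors
```

With the defaults, `GlassgoParams().validate()` returns an empty list; a setting such as `GlassgoParams(min_identity=80)` yields one message because 80 exceeds the documented upper bound of 75.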
Output Description
The output of GLASSgo is a file in multi-FASTA format in which the input
sequence (query) is followed by the identified homologs. If no homologs
are found, only the input sequence is shown. The
output format is explained below using two examples.
Both examples show a partial GLASSgo result for EcpR1. In the first example, the
upstream region was turned off, while in the second example the
upstream region was set to 100 nt.
For this reason, the
headers and the sequence lengths differ between the two examples.
The following header shows the accession number of the respective genome followed by the genomic coordinates of the proposed sRNA homolog (no upstream region included).
>CP013051.1:1422247-1422417 Sinorhizobium americanum CCGM7, complete genome-p.c.VAL:80.75%-taxID:1408224
AAAGGAAGTGAGACTTCCACGATCGATCGGTTACCCCATGATGCTCAGGTCCGCCGCATCTCCTGGGTCGTGGGGTCGGTCGGCTGGCTTCCGACATCCGCGGATTCCTCGTGCCGCAGTCGGAGCCAGCCGACCCCCTTTCAAAACGCCGCTTCAAAAGAGGCGGCGTTT
In contrast, the next header shows the genomic coordinates of the combined upstream region (100nt) and the proposed sRNA. The exact coordinates of the upstream region are given later in the header (UPSTREAM-REGION-100nt:1422147-1422246).
>CP013051.1:1422147-1422417 Sinorhizobium americanum CCGM7, complete genome-UPSTREAM-REGION-100nt:1422147-1422246-p.c.VAL:80.75%-taxID:1408224
ATTTGTCCGAATACGAGACAGAATTAACCAAACGCCGAGCAACCCGCTTCGGCGATTAAGAATTCGTTGATTTTTTTTTATTTTCAAGCAATGCTGATATAAAGGAAGTGAGACTTCCACGATCGATCGGTTACCCCATGATGCTCAGGTCCGCCGCATCTCCTGGGTCGTGGGGTCGGTCGGCTGGCTTCCGACATCCGCGGATTCCTCGTGCCGCAGTCGGAGCCAGCCGACCCCCTTTCAAAACGCCGCTTCAAAAGAGGCGGCGTTT
Both examples contain the name of the genome entry, the pairwise similarity value p.c.VAL:80.75% (query vs. GLASSgo hit), and the corresponding taxonomic number taxID:1408224.
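For downstream processing, the header fields described above can be extracted with a regular expression. The following sketch is inferred solely from the two example headers on this page and is not an official format specification; the names `HEADER_RE` and `parse_header` are hypothetical, and headers that deviate from the examples may not match. Both the UPSTREAM-REGION and UTR-REGION spellings mentioned on this page are accepted.

```python
import re

# Pattern inferred from the two example headers; an assumption, not a spec.
HEADER_RE = re.compile(
    r"^>?(?P<accession>\S+?):(?P<start>\d+)-(?P<end>\d+)\s+"
    r"(?P<description>.*?)"
    r"(?:-(?:UPSTREAM|UTR)-REGION-(?P<up_len>\d+)nt:(?P<up_start>\d+)-(?P<up_end>\d+))?"
    r"-p\.c\.VAL:(?P<identity>[\d.]+)%"
    r"-taxID:(?P<taxid>\d+)\s*$"
)


def parse_header(header):
    """Split a GLASSgo hit header into its fields; raise ValueError if it does not match."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"unrecognized header: {header!r}")
    fields = match.groupdict()
    upstream = None
    if fields["up_len"] is not None:
        # (length in nt, start coordinate, end coordinate) of the upstream region
        upstream = (int(fields["up_len"]), int(fields["up_start"]), int(fields["up_end"]))
    return {
        "accession": fields["accession"],
        "start": int(fields["start"]),
        "end": int(fields["end"]),
        "description": fields["description"],
        "upstream": upstream,
        "identity": float(fields["identity"]),
        "taxid": int(fields["taxid"]),
    }
```

Applied to the second example header, the `upstream` entry holds the length and coordinates of the upstream region, while `start` and `end` cover the combined upstream region and sRNA; for the first example header, `upstream` is `None`.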
Input Examples
EcpR1 in Proteobacteria
The example's result can be directly accessed here
NsiR4 in Cyanobacteria
The example's result can be directly accessed here
List of Changes
- 4.5.9 - 181220 : GLASSgo v1.5.2 online; new "upstream region" output parameter
- 4.5.7 - 180904 : GLASSgo v1.5.1 online; publication updated
- 4.4.9 : GLASSgo v1.5.0 online
- 4.4.7 : BLAST-nt database downgrade due to inconsistencies
- 4.4.4 : GLASSgo v1.4.3 online
- 4.4.3 : BLAST-nt database update
- 4.4.0 : GLASSgo v1.4.2 online
- 4.3.1 : GLASSgo v1.3.2 online
- 4.3.0 : GLASSgo v1.2.2 online
- 4.1.0 : The GLASSgo webserver goes online