RNAlib-2.4.3
RNA Structure Notations

Common Notations for RNA secondary structures

Dot-Bracket Notation (a.k.a. Dot-Parenthesis Notation)

The Dot-Bracket notation, introduced in the early days of the ViennaRNA Package, denotes base pairs by matching pairs of parentheses () and unpaired nucleotides by dots (one . per nucleotide).

Example: A simple helix of 4 base pairs enclosing a hairpin loop of 4 unpaired nucleotides is annotated as

((((....))))
See also
vrna_ptable_from_string(), vrna_db_flatten(), vrna_db_flatten_to()
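
The matching rule described above can be illustrated with a small stack-based parser. This is a minimal sketch in plain Python, not the library's implementation: the function name pair_table and the layout of the returned list (entry 0 holds the length, positions are 1-based, 0 marks an unpaired nucleotide) are assumptions made for this example.

```python
def pair_table(structure):
    """Parse a plain dot-bracket string into a 1-based pair table.

    Assumed layout (for this sketch only): table[0] is the length,
    table[i] is the partner of position i, or 0 if i is unpaired.
    """
    n = len(structure)
    table = [0] * (n + 1)
    table[0] = n
    stack = []  # positions of '(' that still wait for their ')'
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            partner = stack.pop()
            table[pos] = partner
            table[partner] = pos
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return table
```

For the helix above, pair_table("((((....))))") pairs position 1 with 12, 2 with 11, 3 with 10, and 4 with 9, and leaves positions 5 to 8 unpaired.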

Extended Dot-Bracket Notation

A generalized version of the original Dot-Bracket notation may use additional pairs of brackets, such as <>, {}, and [], as well as matching pairs of uppercase/lowercase letters. This allows for annotating pseudo-knots, since different types of brackets are not required to be nested within each other.

Example: The following annotations of a simple structure with two crossing helices of size 4 are equivalent:

<<<<[[[[....>>>>]]]]
((((AAAA....))))aaaa
AAAA{{{{....aaaa}}}}
See also
vrna_ptable_from_string(), vrna_db_flatten(), vrna_db_flatten_to()
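
One way to see why the three annotations above are equivalent is to parse each bracket type with its own stack, so that different bracket types may cross freely. The following is a minimal sketch under that assumption; the function name extended_pairs and its 0-based (i, j) output are hypothetical and not part of the library.

```python
import string


def extended_pairs(structure):
    """Return the set of 0-based base pairs (i, j) of an extended
    dot-bracket string. Each bracket type gets its own stack, so pairs
    of different types may cross (pseudo-knots)."""
    # Map every closing symbol to its opening symbol.
    opener_of = {")": "(", ">": "<", "}": "{", "]": "["}
    # Uppercase letters open a pair, the same lowercase letter closes it.
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase):
        opener_of[lower] = upper
    stacks = {opener: [] for opener in opener_of.values()}

    pairs = set()
    for pos, symbol in enumerate(structure):
        if symbol in stacks:
            stacks[symbol].append(pos)
        elif symbol in opener_of:
            stack = stacks[opener_of[symbol]]
            if not stack:
                raise ValueError(f"unmatched {symbol!r} at position {pos}")
            pairs.add((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening symbol")
    return pairs
```

All three example strings yield the same set of eight base pairs, including the crossing pairs (0, 15) and (4, 19).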

Washington University Secondary Structure (WUSS) notation

The WUSS notation, as frequently used for consensus secondary structures in Stockholm 1.0 format, allows for a fine-grained annotation of base pairs and unpaired nucleotides, including pseudo-knots.

Below is a list of secondary structure elements and their corresponding WUSS annotation. See also the Infernal user guide at http://eddylab.org/infernal/Userguide.pdf

  • Base pairs

    Nested base pairs are annotated by matching pairs of the symbols <>, (), {}, and []. Each type of bracket has its own special meaning. However, when used as input to our programs, e.g. as a structure constraint, these details are usually ignored. Furthermore, base pairs that constitute a pseudo-knot are denoted by letters from the Latin alphabet and are, unless stated otherwise, ignored entirely by our programs.

  • Hairpin loops

    Unpaired nucleotides that constitute the hairpin loop are indicated by underscores, _.

    Example: <<<<<_____>>>>>

  • Bulges and interior loops

    Residues that constitute a bulge or interior loop are denoted by dashes, -.

    Example: (((--<<_____>>-)))

  • Multibranch loops

    Unpaired nucleotides in multibranch loops are indicated by commas ,.

    Example: (((,,<<_____>>,<<____>>)))

  • External residues

    Single-stranded nucleotides in the exterior loop, i.e. not enclosed by any base pair, are denoted by colons, :.

    Example: <<<____>>>:::

  • Insertions

    In cases where an alignment represents the consensus with a known structure, insertions relative to the known structure are denoted by periods, .. Regions where local structural alignment was invoked, leaving regions of both target and query sequence unaligned, are indicated by tildes, ~.

    Note
    These symbols only appear in alignments of a known (query) structure annotation to a target sequence of unknown structure.
  • Pseudo-knots

    The WUSS notation allows for annotation of pseudo-knots using pairs of upper-case/lower-case letters.

    Note
    Our programs and library functions usually ignore pseudo-knots entirely, treating them as unpaired nucleotides, unless stated otherwise.

    Example: <<<_AAA___>>>aaa
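
The symbols for unpaired nucleotides listed above depend only on the loop a nucleotide sits in. As an illustration, the sketch below takes a plain dot-bracket string and relabels each dot with the corresponding WUSS symbol. It is a simplified example, not a library function: the name annotate_unpaired is hypothetical, all brackets are left as (), and insertions and pseudo-knots are not handled.

```python
def annotate_unpaired(structure):
    """Relabel the dots of a plain dot-bracket string with WUSS
    loop symbols: '_' hairpin, '-' bulge/interior, ',' multibranch,
    ':' exterior. Brackets are copied unchanged (simplification)."""
    n = len(structure)
    enclosing = [None] * n  # opening position of the innermost enclosing pair
    branches = {}           # opening position -> number of directly enclosed pairs
    stack = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            if stack:
                branches[stack[-1]] += 1  # new branch inside the current loop
            branches[pos] = 0
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            stack.pop()
        elif symbol == ".":
            enclosing[pos] = stack[-1] if stack else None
        else:
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")

    out = list(structure)
    for pos, symbol in enumerate(structure):
        if symbol != ".":
            continue
        opener = enclosing[pos]
        if opener is None:
            out[pos] = ":"   # exterior loop
        elif branches[opener] == 0:
            out[pos] = "_"   # hairpin loop
        elif branches[opener] == 1:
            out[pos] = "-"   # bulge or interior loop
        else:
            out[pos] = ","   # multibranch loop
    return "".join(out)
```

With this sketch, the input (((....)))... becomes (((____))):::, matching the exterior-loop example above apart from the bracket type.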

See also
vrna_db_from_WUSS()
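
Since the bracket types and loop symbols are usually ignored on input, a WUSS string can be reduced to plain dot-bracket notation by flattening it. The following is a rough sketch of that idea and not the implementation of vrna_db_from_WUSS(): the function name wuss_to_dot_bracket is hypothetical, and mapping the insertion symbols . and ~ to dots is an assumption of this example.

```python
def wuss_to_dot_bracket(wuss):
    """Flatten a WUSS string: every opening bracket becomes '(',
    every closing bracket becomes ')', and everything else
    (loop symbols, pseudo-knot letters, insertions) becomes '.'."""
    opening = set("<({[")
    closing = set(">)}]")
    out = []
    for symbol in wuss:
        if symbol in opening:
            out.append("(")
        elif symbol in closing:
            out.append(")")
        else:
            # Unpaired symbols and pseudo-knot letters are treated as unpaired.
            out.append(".")
    return "".join(out)
```

For the pseudo-knot example above, wuss_to_dot_bracket("<<<_AAA___>>>aaa") gives (((.......)))..., i.e. the pseudo-knot letters end up as unpaired nucleotides.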