RNAlib-2.4.3
Utilities for sequence alignments

Data Structures

struct  vrna_pinfo_s
 A base pair info structure. More...
 

Macros

#define VRNA_MEASURE_SHANNON_ENTROPY   1U
 Flag indicating Shannon Entropy measure. More...
 

Typedefs

typedef struct vrna_pinfo_s vrna_pinfo_t
 Typename for the data structure vrna_pinfo_s representing base pair info.
 
typedef struct vrna_pinfo_s pair_info
 Old typename of vrna_pinfo_s. More...
 

Functions

int vrna_aln_mpi (const char **alignment)
 Get the mean pairwise identity of a sequence alignment. More...
 
vrna_pinfo_t * vrna_aln_pinfo (vrna_fold_compound_t *vc, const char *structure, double threshold)
 Retrieve an array of vrna_pinfo_t structures from precomputed pair probabilities. More...
 
char ** vrna_aln_slice (const char **alignment, unsigned int i, unsigned int j)
 Slice out a subalignment from a larger alignment. More...
 
void vrna_aln_free (char **alignment)
 Free memory occupied by a set of aligned sequences. More...
 
char ** vrna_aln_uppercase (const char **alignment)
 Create a copy of an alignment with only uppercase letters in the sequences. More...
 
char ** vrna_aln_toRNA (const char **alignment)
 Create a copy of an alignment where DNA alphabet is replaced by RNA alphabet. More...
 
char ** vrna_aln_copy (const char **alignment, unsigned int options)
 Make a copy of a multiple sequence alignment. More...
 
float * vrna_aln_conservation_struct (const char **alignment, const char *structure, const vrna_md_t *md)
 Compute base pair conservation of a consensus structure. More...
 
float * vrna_aln_conservation_col (const char **alignment, const vrna_md_t *md_p, unsigned int options)
 Compute nucleotide conservation in an alignment. More...
 
int get_mpi (char *Alseq[], int n_seq, int length, int *mini)
 Get the mean pairwise identity of a sequence alignment. More...
 
void encode_ali_sequence (const char *sequence, short *S, short *s5, short *s3, char *ss, unsigned short *as, int circ)
 Get arrays with encoded sequence of the alignment. More...
 
void alloc_sequence_arrays (const char **sequences, short ***S, short ***S5, short ***S3, unsigned short ***a2s, char ***Ss, int circ)
 Allocate memory for sequence array used to deal with aligned sequences. More...
 
void free_sequence_arrays (unsigned int n_seq, short ***S, short ***S5, short ***S3, unsigned short ***a2s, char ***Ss)
 Free the memory of the sequence arrays used to deal with aligned sequences. More...
 

Detailed Description


Data Structure Documentation

◆ vrna_pinfo_s

struct vrna_pinfo_s

A base pair info structure.

For each base pair (i,j) with i,j in [0, n-1] the structure lists:

  • its probability 'p'
  • an entropy-like measure for its well-definedness 'ent'
  • the frequency of each type of pair in 'bp[]'
    • 'bp[0]' contains the number of non-compatible sequences
    • 'bp[1]' the number of CG pairs, etc.

Data Fields

unsigned i
 nucleotide position i
 
unsigned j
 nucleotide position j
 
float p
 Probability.
 
float ent
 Pseudo entropy for $ p(i,j) = S_i + S_j - p_{ij} \ln(p_{ij}) $.
 
short bp [8]
 Frequencies of pair_types.
 
char comp
 1 iff pair is in mfe structure
 

Macro Definition Documentation

◆ VRNA_MEASURE_SHANNON_ENTROPY

#define VRNA_MEASURE_SHANNON_ENTROPY   1U

#include <ViennaRNA/aln_util.h>

Flag indicating Shannon Entropy measure.

Shannon Entropy is defined as $ H = - \sum_c p_c \cdot \log_2 p_c $
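
As a quick worked instance of this definition, assume a hypothetical alignment column of four sequences containing A, A, C, C, so that $ p_A = p_C = 1/2 $:

```latex
H = -\left(\tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = 1 \text{ bit}
```

A fully conserved column (a single character with $ p_c = 1 $) gives $ H = 0 $.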

Typedef Documentation

◆ pair_info

typedef struct vrna_pinfo_s pair_info

#include <ViennaRNA/aln_util.h>

Old typename of vrna_pinfo_s.

Deprecated:
Use vrna_pinfo_t instead!

Function Documentation

◆ vrna_aln_mpi()

int vrna_aln_mpi ( const char **  alignment)

#include <ViennaRNA/aln_util.h>

Get the mean pairwise identity of a sequence alignment.

Parameters
alignment  Aligned sequences
Returns
The mean pairwise identity
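
To illustrate what a mean pairwise identity computation might look like, here is a minimal sketch. It is not RNAlib's implementation: sketch_aln_mpi is a hypothetical name, and the NULL-terminated input, the integer percentage scaling, and counting matching gap characters as identical are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of RNAlib): mean pairwise identity of a
 * NULL-terminated alignment, returned as an integer percentage.
 * Assumes all rows have the same length as the first row. */
int sketch_aln_mpi(const char **alignment)
{
  size_t n_seq = 0, len, a, b, col;
  long   identical = 0, compared = 0;

  if (alignment == NULL || alignment[0] == NULL)
    return 0;

  while (alignment[n_seq] != NULL)
    n_seq++;

  len = strlen(alignment[0]);

  /* compare every unordered pair of rows, column by column */
  for (a = 0; a + 1 < n_seq; a++)
    for (b = a + 1; b < n_seq; b++)
      for (col = 0; col < len; col++) {
        if (alignment[a][col] == alignment[b][col])
          identical++;
        compared++;
      }

  return compared > 0 ? (int)(identical * 100 / compared) : 0;
}
```

With the two rows "ACGU" and "ACGA", three of the four columns match, so this sketch returns 75.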

◆ vrna_aln_pinfo()

vrna_pinfo_t* vrna_aln_pinfo ( vrna_fold_compound_t *  vc,
const char *  structure,
double  threshold 
)

#include <ViennaRNA/aln_util.h>

Retrieve an array of vrna_pinfo_t structures from precomputed pair probabilities.

This array of structures contains information about positionwise pair probabilities, base pair entropy and more.

See also
vrna_pinfo_t, and vrna_pf()
Parameters
vc  The vrna_fold_compound_t of type VRNA_FC_TYPE_COMPARATIVE with precomputed partition function matrices
structure  An optional structure in dot-bracket notation (may be NULL)
threshold  Do not include results with pair probabilities below threshold
Returns
The vrna_pinfo_t array

◆ vrna_aln_slice()

char** vrna_aln_slice ( const char **  alignment,
unsigned int  i,
unsigned int  j 
)

#include <ViennaRNA/aln_util.h>

Slice out a subalignment from a larger alignment.

Note
The user is responsible for freeing the memory occupied by the returned subalignment
See also
vrna_aln_free()
Parameters
alignment  The input alignment
i  The first column of the subalignment (1-based)
j  The last column of the subalignment (1-based)
Returns
The subalignment between columns $i$ and $j$
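
The 1-based, inclusive column convention and the ownership of the result can be sketched as follows. This is an illustration only, not the library's code: sketch_aln_slice and sketch_aln_free are hypothetical names, and the sketch assumes a NULL-terminated input whose rows are all at least as long as the first one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free a NULL-terminated array of strings. */
void sketch_aln_free(char **alignment)
{
  size_t s;

  if (alignment == NULL)
    return;

  for (s = 0; alignment[s] != NULL; s++)
    free(alignment[s]);

  free(alignment);
}

/* Hypothetical helper: copy columns i..j (1-based, inclusive) of a
 * NULL-terminated alignment. Returns NULL on invalid input or when an
 * allocation fails. The caller owns the result. */
char **sketch_aln_slice(const char **alignment, unsigned int i, unsigned int j)
{
  size_t n_seq = 0, s, width;
  char   **sub;

  if (alignment == NULL || alignment[0] == NULL ||
      i < 1 || j < i || j > strlen(alignment[0]))
    return NULL;

  while (alignment[n_seq] != NULL)
    n_seq++;

  width = (size_t)(j - i + 1);
  sub   = calloc(n_seq + 1, sizeof *sub); /* last entry stays NULL */
  if (sub == NULL)
    return NULL;

  for (s = 0; s < n_seq; s++) {
    sub[s] = malloc(width + 1);
    if (sub[s] == NULL) {
      sketch_aln_free(sub);
      return NULL;
    }
    memcpy(sub[s], alignment[s] + (i - 1), width);
    sub[s][width] = '\0';
  }

  return sub;
}
```

Slicing columns 3 to 5 out of "ACGUACGU" yields "GUA" under this convention.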

◆ vrna_aln_free()

void vrna_aln_free ( char **  alignment)

#include <ViennaRNA/aln_util.h>

Free memory occupied by a set of aligned sequences.

Parameters
alignment  The input alignment

◆ vrna_aln_uppercase()

char** vrna_aln_uppercase ( const char **  alignment)

#include <ViennaRNA/aln_util.h>

Create a copy of an alignment with only uppercase letters in the sequences.

See also
vrna_aln_copy
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
Returns
A copy of the input alignment where lowercase sequence letters are replaced by uppercase letters

◆ vrna_aln_toRNA()

char** vrna_aln_toRNA ( const char **  alignment)

#include <ViennaRNA/aln_util.h>

Create a copy of an alignment where DNA alphabet is replaced by RNA alphabet.

See also
vrna_aln_copy
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
Returns
A copy of the input alignment where DNA alphabet is replaced by RNA alphabet (T -> U)

◆ vrna_aln_copy()

char** vrna_aln_copy ( const char **  alignment,
unsigned int  options 
)

#include <ViennaRNA/aln_util.h>

Make a copy of a multiple sequence alignment.

This function allows one to create a copy of a multiple sequence alignment. The options parameter additionally allows for sequence manipulation, such as converting DNA to RNA alphabet, and conversion to uppercase letters.

See also
vrna_aln_copy(), VRNA_ALN_RNA, VRNA_ALN_UPPERCASE, VRNA_ALN_DEFAULT
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
options  Option flags indicating whether the aligned sequences should be converted
Returns
A (manipulated) copy of the input alignment
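
The option-driven copy described above can be sketched in a few lines. Everything named sketch_* or SKETCH_* here is hypothetical, and the numeric flag values are assumptions chosen for the example; they are not taken from the library's VRNA_ALN_* definitions.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values, chosen only for this sketch */
#define SKETCH_ALN_DEFAULT    0U
#define SKETCH_ALN_RNA        1U
#define SKETCH_ALN_UPPERCASE  2U

/* Hypothetical helper: deep-copy a NULL-terminated alignment, optionally
 * converting to uppercase and replacing T/t by U/u. */
char **sketch_aln_copy(const char **alignment, unsigned int options)
{
  size_t n_seq = 0, s, k, len;
  char   **copy;

  if (alignment == NULL)
    return NULL;

  while (alignment[n_seq] != NULL)
    n_seq++;

  copy = calloc(n_seq + 1, sizeof *copy); /* last entry stays NULL */
  if (copy == NULL)
    return NULL;

  for (s = 0; s < n_seq; s++) {
    len     = strlen(alignment[s]);
    copy[s] = malloc(len + 1);
    if (copy[s] == NULL) {
      while (s > 0)
        free(copy[--s]);
      free(copy);
      return NULL;
    }

    /* k <= len also copies the terminating '\0' */
    for (k = 0; k <= len; k++) {
      char c = alignment[s][k];

      if (options & SKETCH_ALN_UPPERCASE)
        c = (char)toupper((unsigned char)c);

      if (options & SKETCH_ALN_RNA) {
        if (c == 'T')
          c = 'U';
        else if (c == 't')
          c = 'u';
      }

      copy[s][k] = c;
    }
  }

  return copy;
}
```

Combining both flags turns "acgt" into "ACGU", while the RNA flag alone gives "acgu".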

◆ vrna_aln_conservation_struct()

float * vrna_aln_conservation_struct ( const char **  alignment,
const char *  structure,
const vrna_md_t *  md 
)

#include <ViennaRNA/aln_util.h>

Compute base pair conservation of a consensus structure.

This function computes the base pair conservation (fraction of canonical base pairs) of a consensus structure given a multiple sequence alignment. The base pair types that are considered canonical may be specified using the vrna_md_t.pairs array. Passing NULL as parameter md results in default pairing rules, i.e. canonical Watson-Crick and GU Wobble pairs.

Parameters
alignment  The input sequence alignment (the last entry must be NULL)
structure  The consensus structure in dot-bracket notation
md  Model details that specify compatible base pairs (may be NULL)
Returns
A 1-based vector of base pair conservations
SWIG Wrapper Notes:
This function is available in an overloaded form where the last parameter may be omitted, indicating md = NULL
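
The idea of a per-pair "fraction of canonical base pairs" can be illustrated with a minimal sketch that hard-codes the default pairing rules (Watson-Crick and GU wobble). The names sketch_is_canonical and sketch_conservation_struct are hypothetical, and storing the value at both pairing positions of the 1-based result is an assumption of this example, not a statement about the library's behavior.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: 1 if (a,b) is a Watson-Crick or GU wobble pair
 * (case-insensitive, T treated as U), 0 otherwise. */
static int sketch_is_canonical(char a, char b)
{
  a = (char)toupper((unsigned char)a);
  b = (char)toupper((unsigned char)b);
  if (a == 'T') a = 'U';
  if (b == 'T') b = 'U';

  return (a == 'A' && b == 'U') || (a == 'U' && b == 'A') ||
         (a == 'C' && b == 'G') || (a == 'G' && b == 'C') ||
         (a == 'G' && b == 'U') || (a == 'U' && b == 'G');
}

/* Hypothetical helper: for each pair (j,i) of the dot-bracket structure,
 * store the fraction of rows forming a canonical pair at indices j and i
 * of a 1-based vector; unpaired positions stay 0. Assumes every row is
 * at least as long as the structure string. */
float *sketch_conservation_struct(const char **alignment, const char *structure)
{
  size_t n, n_seq = 0, i, s, top = 0;
  size_t *stack;
  float  *cons;

  if (alignment == NULL || alignment[0] == NULL || structure == NULL)
    return NULL;

  n = strlen(structure);
  while (alignment[n_seq] != NULL)
    n_seq++;

  stack = malloc((n + 1) * sizeof *stack);
  cons  = calloc(n + 1, sizeof *cons);
  if (stack == NULL || cons == NULL) {
    free(stack);
    free(cons);
    return NULL;
  }

  for (i = 1; i <= n; i++) {
    if (structure[i - 1] == '(') {
      stack[top++] = i;               /* remember opening position */
    } else if (structure[i - 1] == ')' && top > 0) {
      size_t j = stack[--top], ok = 0;

      for (s = 0; s < n_seq; s++)
        if (sketch_is_canonical(alignment[s][j - 1], alignment[s][i - 1]))
          ok++;

      cons[j] = cons[i] = (float)ok / (float)n_seq;
    }
  }

  free(stack);
  return cons;
}
```

For the rows "GCAAAGC" and "GAAAAGU" with structure "((...))", the outer pair is canonical in both rows (GC and GU), while the inner pair is canonical only in the first row, giving 1.0 and 0.5 respectively.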

◆ vrna_aln_conservation_col()

float * vrna_aln_conservation_col ( const char **  alignment,
const vrna_md_t *  md,
unsigned int  options 
)

#include <ViennaRNA/aln_util.h>

Compute nucleotide conservation in an alignment.

This function computes the conservation of nucleotides in alignment columns. The simplest measure is Shannon Entropy and can be selected by passing the VRNA_MEASURE_SHANNON_ENTROPY flag in the options parameter.

Note
Currently, only VRNA_MEASURE_SHANNON_ENTROPY is supported as conservation measure.
See also
VRNA_MEASURE_SHANNON_ENTROPY
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
md  Model details that specify known nucleotides (may be NULL)
options  A flag indicating which measure of conservation should be applied
Returns
A 1-based vector of column conservations
SWIG Wrapper Notes:
This function is available in an overloaded form where the last two parameters may be omitted, indicating md = NULL, and options = VRNA_MEASURE_SHANNON_ENTROPY, respectively.

◆ get_mpi()

int get_mpi ( char *  Alseq[],
int  n_seq,
int  length,
int *  mini 
)

#include <ViennaRNA/aln_util.h>

Get the mean pairwise identity of a sequence alignment.

Deprecated:
Use vrna_aln_mpi() as a replacement
Parameters
Alseq
n_seq  The number of sequences in the alignment
length  The length of the alignment
mini
Returns
The mean pairwise identity

◆ encode_ali_sequence()

void encode_ali_sequence ( const char *  sequence,
short *  S,
short *  s5,
short *  s3,
char *  ss,
unsigned short *  as,
int  circ 
)

#include <ViennaRNA/aln_util.h>

Get arrays with encoded sequence of the alignment.

This function assumes that enough space is already allocated in S, s5, s3, ss and as (each must hold at least sequence length + 2 entries).

Parameters
sequence  The gapped sequence from the alignment
S  Pointer to an array that holds the encoded sequence
s5  Pointer to an array that holds the next base 5' of alignment position i
s3  Pointer to an array that holds the next base 3' of alignment position i
ss
as
circ  Assume the molecules to be circular instead of linear (linear: circ=0)

◆ alloc_sequence_arrays()

void alloc_sequence_arrays ( const char **  sequences,
short ***  S,
short ***  S5,
short ***  S3,
unsigned short ***  a2s,
char ***  Ss,
int  circ 
)

#include <ViennaRNA/aln_util.h>

Allocate memory for sequence array used to deal with aligned sequences.

Note that these arrays will also be initialized according to the given sequence alignment.

See also
free_sequence_arrays()
Parameters
sequences  The aligned sequences
S  A pointer to the array of encoded sequences
S5  A pointer to the array that contains the next 5' nucleotide of a sequence position
S3  A pointer to the array that contains the next 3' nucleotide of a sequence position
a2s  A pointer to the array that contains the alignment to sequence position mapping
Ss  A pointer to the array that contains the ungapped sequence
circ  Assume the molecules to be circular instead of linear (linear: circ=0)
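
To make the "alignment to sequence position mapping" and the "ungapped sequence" more concrete, the following sketch derives both for a single gapped row. It is one plausible reading of those descriptions, not the library's code: sketch_ungap is a hypothetical name, treating '-' and '.' as gap characters is an assumption, and the mapping is assumed to count the non-gap characters seen up to and including each 1-based alignment column.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: for one gapped row of length n, fill a2s[0..n] so
 * that a2s[i] is the number of non-gap characters in columns 1..i
 * (a2s[0] = 0), and return a newly allocated ungapped copy of the row.
 * The caller must provide room for n + 1 entries in a2s and must free
 * the returned string. */
char *sketch_ungap(const char *row, unsigned short *a2s)
{
  size_t         n = strlen(row), i;
  unsigned short pos = 0;
  char           *ungapped = malloc(n + 1);

  if (ungapped == NULL)
    return NULL;

  a2s[0] = 0;
  for (i = 1; i <= n; i++) {
    char c = row[i - 1];

    if (c != '-' && c != '.')   /* assumed gap characters */
      ungapped[pos++] = c;

    a2s[i] = pos;
  }
  ungapped[pos] = '\0';

  return ungapped;
}
```

For the row "AC--GU" this produces the ungapped sequence "ACGU" and the mapping 0, 1, 2, 2, 2, 3, 4 for indices 0 through 6.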

◆ free_sequence_arrays()

void free_sequence_arrays ( unsigned int  n_seq,
short ***  S,
short ***  S5,
short ***  S3,
unsigned short ***  a2s,
char ***  Ss 
)

#include <ViennaRNA/aln_util.h>

Free the memory of the sequence arrays used to deal with aligned sequences.

This function frees the memory previously allocated with alloc_sequence_arrays()

See also
alloc_sequence_arrays()
Parameters
n_seq  The number of aligned sequences
S  A pointer to the array of encoded sequences
S5  A pointer to the array that contains the next 5' nucleotide of a sequence position
S3  A pointer to the array that contains the next 3' nucleotide of a sequence position
a2s  A pointer to the array that contains the alignment to sequence position mapping
Ss  A pointer to the array that contains the ungapped sequence