ini.trakem2.vector
Class Editions

java.lang.Object
  extended by ini.trakem2.vector.Editions

public class Editions
extends java.lang.Object

Extracts and represents the sequence of editions (insertions, deletions, and mutations) that converts any N-dimensional vector string into any other with the same number of dimensions.


Nested Class Summary
static class Editions.Chunk
           
 
Field Summary
protected  boolean closed
           
static int DELETION
           
protected  double delta
           
 double distance
          Levenshtein's distance between vs1 and vs2.
protected  int[][] editions
          In the form [length][3]
static int INSERTION
           
static int MUTATION
           
protected  VectorString vs1
           
protected  VectorString vs2
           
protected  double WD
          Weight for deletion cost.
protected  double WI
          Weight for insertion cost.
protected  double WM
          Weight for correspondence cost.
 
Constructor Summary
Editions(VectorString vs1, VectorString vs2, double delta, boolean closed)
           
Editions(VectorString vs1, VectorString vs2, double delta, boolean closed, double wi, double wd, double wm)
           
 
Method Summary
 Editions.Chunk findLargestMutationChunk(int max_non_mut)
           
 double getDistance()
           
 int[][] getEditions()
           
 double getPhysicalDistance(boolean skip_ends, int max_mut, float min_chunk, boolean average)
          Returns the average distance between all points involved in a mutation; if average is false, returns the cumulative distance instead.
 double getSimilarity()
           
 double getSimilarity(boolean skip_ends, int max_mut, float min_chunk)
          A mutation is considered an equal or near-equal match, and thus does not count.
 double getSimilarity2()
           
 double getSimilarity2(boolean skip_ends, int max_mut, float min_chunk)
          Returns the number of mutations / max(len(vs1), len(vs2)) : 1.0 means all are mutations and the sequences have the same lengths.
 double[] getStatistics(boolean skip_ends, int max_mut, float min_chunk, boolean score_mut_only)
          Returns {average distance, cumulative distance, stdDev, median, prop_mut} which are:
          [0] - average distance: the average physical distance between mutation pairs
          [1] - cumulative distance: the sum of the distances between mutation pairs
          [2] - stdDev: the standard deviation of the physical distances between mutation pairs, relative to the average
          [3] - median: the average medial physical distance between mutation pairs, more robust than the average to extreme values
          [4] - prop_mut: the proportion of mutation pairs relative to the length of the queried sequence vs1.
 double getStdDev(boolean skip_ends, int max_mut, float min_chunk)
           
 VectorString getVS1()
           
 VectorString getVS2()
           
 int length()
           
 java.lang.String prettyPrint(java.lang.String separator)
          Get the sequence of editions and matches in three lines, like:
            vs1: 1 2 3 4 5 6 7 8 9
                 M M D M M M I I M M M
            vs2: 1 2 3 4 5 6 7 8 9
          With the given separator (defaults to tab if null).
 Editions recreateFromCenter(int max_non_mut)
          Find the longest chunk of mutations (which can include chunks of up to max_non_mut of non-mutations), then take the center point and split both vector strings there, perform matching towards the ends, and assemble a new Editions object.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DELETION

public static final int DELETION
See Also:
Constant Field Values

INSERTION

public static final int INSERTION
See Also:
Constant Field Values

MUTATION

public static final int MUTATION
See Also:
Constant Field Values

WI

protected final double WI
Weight for insertion cost.


WD

protected final double WD
Weight for deletion cost.


WM

protected final double WM
Weight for correspondence cost.


vs1

protected VectorString vs1

vs2

protected VectorString vs2

delta

protected double delta

closed

protected boolean closed

editions

protected int[][] editions
In the form [length][3]


distance

public double distance
Levenshtein's distance between vs1 and vs2.

Constructor Detail

Editions

public Editions(VectorString vs1,
                VectorString vs2,
                double delta,
                boolean closed)

Editions

public Editions(VectorString vs1,
                VectorString vs2,
                double delta,
                boolean closed,
                double wi,
                double wd,
                double wm)
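
The second constructor accepts explicit weights for insertion, deletion, and correspondence (mutation) costs. As a rough illustration of how such weights could drive a Levenshtein-style alignment, the sketch below aligns two plain `double[]` sequences and records each edit as a row of three ints. It is not the TrakEM2 implementation: the class `WeightedEditSketch`, the numeric operation codes, the `{operation, i, j}` row layout, the use of `-1` for "no counterpart", and the scalar mutation cost are all assumptions made for the example, and `delta` and `closed` are not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WeightedEditSketch {
    // Hypothetical codes; the real values are listed under "Constant Field Values".
    static final int DELETION = 1, INSERTION = 2, MUTATION = 3;

    /**
     * Aligns a and b by dynamic programming.
     * Returns rows of {operation, index in a, index in b}, with -1 where an
     * operation has no counterpart. The total cost is written to cost[0].
     */
    static int[][] align(double[] a, double[] b,
                         double wi, double wd, double wm, double[] cost) {
        int n = a.length, m = b.length;
        double[][] d = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) d[i][0] = d[i - 1][0] + wd; // delete everything
        for (int j = 1; j <= m; j++) d[0][j] = d[0][j - 1] + wi; // insert everything
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double mut = d[i - 1][j - 1] + wm * Math.abs(a[i - 1] - b[j - 1]);
                double del = d[i - 1][j] + wd;
                double ins = d[i][j - 1] + wi;
                d[i][j] = Math.min(mut, Math.min(del, ins));
            }
        }
        cost[0] = d[n][m];

        // Trace back from the end, preferring mutations when costs tie.
        List<int[]> ops = new ArrayList<>();
        int i = n, j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0
                    && d[i][j] == d[i - 1][j - 1] + wm * Math.abs(a[i - 1] - b[j - 1])) {
                ops.add(new int[] {MUTATION, i - 1, j - 1});
                i--;
                j--;
            } else if (i > 0 && d[i][j] == d[i - 1][j] + wd) {
                ops.add(new int[] {DELETION, i - 1, -1});
                i--;
            } else {
                ops.add(new int[] {INSERTION, -1, j - 1});
                j--;
            }
        }
        Collections.reverse(ops);
        return ops.toArray(new int[0][]);
    }
}
```

With unit weights, aligning `{1, 2, 3}` against `{1, 3}` in this sketch yields a cost of 1.0 and three rows: a mutation, a deletion, and a mutation.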
Method Detail

getDistance

public double getDistance()

length

public int length()

getEditions

public int[][] getEditions()

getVS1

public VectorString getVS1()

getVS2

public VectorString getVS2()

getSimilarity

public double getSimilarity(boolean skip_ends,
                            int max_mut,
                            float min_chunk)
A mutation is considered an equal or near-equal match, and thus does not count. Only deletions and insertions count towards scoring the similarity.

Parameters:
skip_ends - enables ignoring sequences at the beginning and end if they are insertions or deletions.
max_mut - the maximum length of a contiguous sequence of mutations to be ignored when skipping insertions and deletions at the beginning and end.
min_chunk - the minimal proportion of vs1 that should remain between the found start and end. The function returns the regular similarity if the chunk is too small.
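
To make the scoring rule concrete, here is a minimal sketch of the similarity formula given under getStatistics, 1 - ((N_insertions + N_deletions) / max(len(seq1), len(seq2))), together with a simplified version of the end-skipping idea. The class `SimilaritySketch`, the operation codes, and the flat `int[]` of operations are hypothetical. The trimming shown here only strips leading and trailing insertions and deletions; it does not model `max_mut` or `min_chunk`, and how the real method adjusts the sequence lengths after trimming is not stated on this page.

```java
import java.util.Arrays;

class SimilaritySketch {
    // Hypothetical codes; the real values are listed under "Constant Field Values".
    static final int DELETION = 1, INSERTION = 2, MUTATION = 3;

    /** 1 - (insertions + deletions) / max(len1, len2). */
    static double similarity(int[] ops, int len1, int len2) {
        int indels = 0;
        for (int op : ops) {
            if (op != MUTATION) indels++;
        }
        return 1.0 - (double) indels / Math.max(len1, len2);
    }

    /** Simplified skip_ends: drop leading and trailing non-mutations. */
    static int[] trimEnds(int[] ops) {
        int start = 0, end = ops.length;
        while (start < end && ops[start] != MUTATION) start++;
        while (end > start && ops[end - 1] != MUTATION) end--;
        return Arrays.copyOfRange(ops, start, end);
    }
}
```

For the operations M, M, D, M (so vs1 has 4 elements and vs2 has 3), `similarity` returns 0.75 in this sketch.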

getSimilarity

public double getSimilarity()

getSimilarity2

public double getSimilarity2()

getSimilarity2

public double getSimilarity2(boolean skip_ends,
                             int max_mut,
                             float min_chunk)
Returns the number of mutations / max(len(vs1), len(vs2)) : 1.0 means all are mutations and the sequences have the same lengths.
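
The ratio above can be sketched in a few lines. The example also shows one consequence of the formula as written: when an alignment contains both insertions and deletions, this ratio can differ from the insertion/deletion-based similarity. The class `Similarity2Sketch` and the operation codes are hypothetical, and the `skip_ends`, `max_mut`, and `min_chunk` parameters are not modeled.

```java
class Similarity2Sketch {
    // Hypothetical codes; the real values are listed under "Constant Field Values".
    static final int DELETION = 1, INSERTION = 2, MUTATION = 3;

    /** Number of mutations divided by max(len1, len2). */
    static double similarity2(int[] ops, int len1, int len2) {
        int mutations = 0;
        for (int op : ops) {
            if (op == MUTATION) mutations++;
        }
        return (double) mutations / Math.max(len1, len2);
    }
}
```

For the operations M, D, I, M, both sequences have 3 elements. The ratio here is 2/3, whereas 1 - (insertions + deletions) / max(len1, len2) would give 1/3 for the same alignment.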


getPhysicalDistance

public double getPhysicalDistance(boolean skip_ends,
                                  int max_mut,
                                  float min_chunk,
                                  boolean average)
Returns the average distance between all points involved in a mutation; if average is false, returns the cumulative distance instead. Returns Double.MAX_VALUE if no mutations are found.
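
A minimal sketch of this computation, assuming each edition row has the form `{operation, index in vs1, index in vs2}` and that the points are available as `double[][]` arrays. Both are assumptions for the example, as are the class `PhysicalDistanceSketch` and the operation code; `skip_ends`, `max_mut`, and `min_chunk` are left out.

```java
class PhysicalDistanceSketch {
    static final int MUTATION = 3; // hypothetical code

    /**
     * Sums (or averages) the Euclidean distance between the two points
     * paired by each mutation. Returns Double.MAX_VALUE if there are none.
     */
    static double physicalDistance(int[][] editions, double[][] p1,
                                   double[][] p2, boolean average) {
        double sum = 0;
        int count = 0;
        for (int[] e : editions) {
            if (e[0] != MUTATION) continue; // only mutation pairs contribute
            double[] a = p1[e[1]];
            double[] b = p2[e[2]];
            double sq = 0;
            for (int k = 0; k < a.length; k++) {
                sq += (a[k] - b[k]) * (a[k] - b[k]);
            }
            sum += Math.sqrt(sq);
            count++;
        }
        if (count == 0) return Double.MAX_VALUE;
        return average ? sum / count : sum;
    }
}
```

Returning `Double.MAX_VALUE` for the empty case means callers that rank candidates by smallest distance will naturally place non-matching pairs last.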


getStdDev

public double getStdDev(boolean skip_ends,
                        int max_mut,
                        float min_chunk)

getStatistics

public double[] getStatistics(boolean skip_ends,
                              int max_mut,
                              float min_chunk,
                              boolean score_mut_only)
Returns an array of statistics, beginning with {average distance, cumulative distance, stdDev, median, prop_mut}. The elements are:
[0] - average distance: the average physical distance between mutation pairs
[1] - cumulative distance: the sum of the distances between mutation pairs
[2] - stdDev: the standard deviation of the physical distances between mutation pairs, relative to the average
[3] - median: the average medial physical distance between mutation pairs, more robust than the average to extreme values
[4] - prop_mut: the proportion of mutation pairs relative to the length of the queried sequence vs1.
[5] - Levenshtein's distance
[6] - Similarity: 1 - (( N_insertions + N_deletions ) / max(len(seq1), len(seq2)))
[7] - Proximity: cumulative distance between pairs divided by physical sequence length
[8] - Proximity of mutation pairs
[9] - Ratio of sequence lengths: vs1.length / vs2.length
[10] - Tortuosity: squared ratio of the difference of the Euclidean distances from first to last point divided by the Euclidean length of the sequence.
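
The first four elements are ordinary descriptive statistics over the per-pair distances. The sketch below computes them from an array of distances to show why the median is described as more robust than the average. The class `MutationStatsSketch` is hypothetical, the standard deviation is taken as the population form (dividing by N), and the median averages the two middle values for an even count; whether the real method makes the same choices is not stated on this page.

```java
import java.util.Arrays;

class MutationStatsSketch {
    /** Returns {average, cumulative, stdDev, median} for the given distances. */
    static double[] stats(double[] distances) {
        if (distances.length == 0) {
            throw new IllegalArgumentException("no mutation pairs");
        }
        double sum = 0;
        for (double d : distances) sum += d;
        double avg = sum / distances.length;

        // Population standard deviation around the average (assumption).
        double sq = 0;
        for (double d : distances) sq += (d - avg) * (d - avg);
        double stdDev = Math.sqrt(sq / distances.length);

        // Median of a sorted copy; the input array is left untouched.
        double[] sorted = distances.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double median = sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;

        return new double[] {avg, sum, stdDev, median};
    }
}
```

For the distances 1, 2, 3, 10, the average is 4.0 while the median is 2.5: the single large value pulls the average up but barely moves the median.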


prettyPrint

public java.lang.String prettyPrint(java.lang.String separator)
Get the sequence of editions and matches in three lines, like:
  vs1: 1 2 3 4 5 6 7 8 9
       M M D M M M I I M M M
  vs2: 1 2 3 4 5 6 7 8 9
With the given separator (defaults to tab if null).
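
One way such a three-line layout could be produced is sketched below. The class `PrettyPrintSketch`, the operation codes, and the choice to leave a blank cell where a sequence has no element for an edition are assumptions; the exact spacing of the real output is not recoverable from this page.

```java
class PrettyPrintSketch {
    // Hypothetical codes; the real values are listed under "Constant Field Values".
    static final int DELETION = 1, INSERTION = 2, MUTATION = 3;

    static String prettyPrint(int[] ops, String separator) {
        String sep = separator == null ? "\t" : separator; // tab by default
        StringBuilder l1 = new StringBuilder("vs1:");
        StringBuilder l2 = new StringBuilder("    ");
        StringBuilder l3 = new StringBuilder("vs2:");
        int i = 0, j = 0; // 1-based counters into vs1 and vs2
        for (int op : ops) {
            // An insertion consumes nothing from vs1; a deletion nothing from vs2.
            l1.append(sep).append(op == INSERTION ? " " : String.valueOf(++i));
            l2.append(sep).append(op == MUTATION ? "M" : op == DELETION ? "D" : "I");
            l3.append(sep).append(op == DELETION ? " " : String.valueOf(++j));
        }
        return l1 + "\n" + l2 + "\n" + l3;
    }
}
```

Keeping one cell per edition on all three lines means the columns stay aligned whenever the separator is a tab.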


findLargestMutationChunk

public Editions.Chunk findLargestMutationChunk(int max_non_mut)

recreateFromCenter

public Editions recreateFromCenter(int max_non_mut)
                            throws java.lang.Exception
Find the longest chunk of mutations (which can include chunks of up to max_non_mut of non-mutations), then take the center point and split both vector strings there, perform matching towards the ends, and assemble a new Editions object.

Throws:
java.lang.Exception
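
The chunk search described above can be pictured as a single scan over the operations. The sketch below finds the longest span that starts and ends on a mutation and tolerates runs of at most `maxNonMut` non-mutations inside it. The class `MutationChunkSketch`, the operation code, the `{first, last}` return value, and the choice to measure "longest" by span length are assumptions; the real method returns an `Editions.Chunk`, whose fields are not documented on this page.

```java
class MutationChunkSketch {
    static final int MUTATION = 3; // hypothetical code

    /**
     * Returns {first, last} (inclusive indices into ops) of the longest
     * mutation chunk, or null if ops contains no mutation.
     */
    static int[] largestChunk(int[] ops, int maxNonMut) {
        int bestFirst = -1, bestLast = -1;
        int first = -1, last = -1, gap = 0;
        for (int k = 0; k < ops.length; k++) {
            if (ops[k] != MUTATION) {
                gap++; // extend the current run of non-mutations
                continue;
            }
            if (first == -1 || gap > maxNonMut) {
                first = k; // gap too long: start a new chunk here
            }
            last = k;
            gap = 0;
            if (bestFirst == -1 || last - first > bestLast - bestFirst) {
                bestFirst = first;
                bestLast = last;
            }
        }
        return bestFirst == -1 ? null : new int[] {bestFirst, bestLast};
    }
}
```

A center point for splitting could then be taken as `(first + last) / 2`, with matching redone outward from there toward both ends.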