agglovar.seqmatch
=================

.. py:module:: agglovar.seqmatch

.. autoapi-nested-parse::

   Fast Smith-Waterman implementation.


Classes
-------

.. autoapisummary::

   agglovar.seqmatch.MatchScoreModel


Module Contents
---------------

.. py:class:: MatchScoreModel

   An engine for scoring alignments.

   Intended for determining sequence similarity through an alignment where the alignment itself is not
   important.

   :param match: Base match score (> 0.0).
   :param mismatch: Base mismatch score (< 0.0).
   :param gap_open: Gap open cost (<= 0.0).
   :param gap_extend: Gap extend cost (<= 0.0).
   :param map_limit: Maximum sequence size before falling back to Jaccard index in :func:`match_prop()`.
   :param jaccard_kmer: Jaccard k-mer size for comparisons falling back to Jaccard index in
       :func:`match_prop()`.


   .. py:method:: __post_init__() -> None

      Post-initialization.


   .. py:method:: match_prop(seq_a: str, seq_b: str) -> float

      Get the alignment score as a proportion of the maximum possible score between two sequences.

      To allow tandem duplications to map correctly, seq_b is duplicated head-to-tail (seq_b + seq_b) and
      seq_a is aligned to it. The maximum possible score is achieved if seq_a and seq_b are the same size and
      seq_a aligns to seq_b + seq_b with all seq_a bases matching (the function returns 1.0).

      The numerator is the alignment score, capped at the size of the smaller sequence times the match
      score, and the denominator is the size of the larger sequence times the match score::

          min(
              score_align(seq_a, seq_b + seq_b),
              min_len(seq_a, seq_b) * match
          ) / (
              max_len(seq_a, seq_b) * match
          )

      :param seq_a: Subject sequence.
      :param seq_b: Query sequence.

      :returns: Alignment proportion with seq_b duplicated head to tail. If either sequence is None or
          empty, returns 0.0.


   .. py:method:: score_align(seq_a: str, seq_b: str) -> float

      Get max score aligning two sequences. Only returns the score, not the alignment.

      This is a legacy native Python implementation used by SV-Pop. It is accurate, but extremely slow.

      :param seq_a: Subject sequence.
      :param seq_b: Query sequence.

      :returns: Maximum alignment score.


   .. py:method:: score_align_native(seq_a: str, seq_b: str) -> float

      Get max score aligning two sequences. Only returns the score, not the alignment.

      This is a legacy native Python implementation used by SV-Pop. It is validated, but extremely slow.

      :param seq_a: Subject sequence.
      :param seq_b: Query sequence.

      :returns: Maximum alignment score.


   .. py:attribute:: affine_gap
      :type:  float


   .. py:attribute:: jaccard_kmer
      :type:  int
      :value: 9


   .. py:attribute:: map_limit
      :type:  Optional[int]
      :value: 5000


   .. py:attribute:: match
      :type:  float


   .. py:attribute:: mismatch
      :type:  float


   .. py:attribute:: rotate_min
      :type:  int
      :value: 3


   .. py:attribute:: score_model
      :type:  agglovar.align.score.AffineScoreModel
      :value: None
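
The proportion described for ``match_prop()`` can be illustrated with a small standalone sketch. This is not the agglovar implementation. It assumes a plain Smith-Waterman score with a single linear gap penalty (the documented model uses separate gap-open and gap-extend costs), it omits the Jaccard fallback controlled by ``map_limit`` and ``jaccard_kmer``, and the function names and default scores (``match=2.0``, ``mismatch=-1.0``, ``gap=-1.0``) are hypothetical values chosen for illustration only.

.. code-block:: python

   def smith_waterman_score(seq_a, seq_b, match=2.0, mismatch=-1.0, gap=-1.0):
       """Best local alignment score (hypothetical helper, linear gap penalty)."""
       prev = [0.0] * (len(seq_b) + 1)  # Previous row of the scoring matrix
       best = 0.0

       for base_a in seq_a:
           curr = [0.0]  # Column 0 is always 0 for a local alignment

           for j, base_b in enumerate(seq_b, start=1):
               diag = prev[j - 1] + (match if base_a == base_b else mismatch)
               score = max(0.0, diag, prev[j] + gap, curr[j - 1] + gap)
               curr.append(score)
               best = max(best, score)

           prev = curr

       return best


   def match_prop_sketch(seq_a, seq_b, match=2.0, mismatch=-1.0, gap=-1.0):
       """Score proportion following the formula documented for match_prop()."""
       if not seq_a or not seq_b:
           return 0.0  # None or empty input

       # Align seq_a against seq_b duplicated head-to-tail
       score = smith_waterman_score(seq_a, seq_b + seq_b, match, mismatch, gap)

       # Cap the score at the best achievable for the smaller sequence
       capped = min(score, min(len(seq_a), len(seq_b)) * match)

       # Normalize by the best achievable for the larger sequence
       return capped / (max(len(seq_a), len(seq_b)) * match)


   print(match_prop_sketch("ACGT", "ACGT"))      # Identical sequences
   print(match_prop_sketch("GTAC", "ACGT"))      # Rotation of the same unit
   print(match_prop_sketch("ACGT", "ACGTACGT"))  # Size difference lowers the proportion

Under these assumptions, duplicating ``seq_b`` lets a rotated copy of the same sequence score as a full match, while the denominator penalizes a difference in sequence size.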