agglovar.align.op

Constants and functions for working with alignment operations.

Definitions in this subpackage are borrowed from PAV 3+.

Attributes

ADV_QRY_ARR

Array of operation codes that advance the query position.

ADV_REF_ARR

Array of operation codes that advance the reference position.

ALIGN_SET

Set of alignment operation codes (match, sequence match, mismatch).

CIGAR_OP_SET

Set of valid operation characters.

CLIP_SET

Set of clipping operation codes (soft and hard clipping).

CONSUMES_QRY_ARR

Array of operation codes that consume query bases.

CONSUMES_REF_ARR

Array of operation codes that consume reference bases.

D

Deletion operation code.

EQ

Sequence match operation code.

EQX_SET

Set of exact match/mismatch operation codes.

H

Hard clipping operation code.

I

Insertion operation code.

INT_STR_SET

Set of valid integer character strings representing operation codes.

M

Match or mismatch operation code.

N

Skipped region operation code.

OP_CHAR

Mapping from operation codes to their character representations.

OP_CHAR_FUNC

Vectorized function to convert operation codes to characters.

OP_CODE

Mapping from CIGAR characters to operation codes.

P

Padding operation code.

S

Soft clipping operation code.

VAR_ARR

Array of operation codes that introduce variation.

X

Sequence mismatch operation code.

Functions

cigar_to_arr(→ numpy.ndarray)

Convert a CIGAR string to a two-dimensional numpy array of operation codes and lengths (dtype int).

Module Contents

agglovar.align.op.cigar_to_arr(cigar_str: str) → numpy.ndarray

Convert a CIGAR string to a two-dimensional numpy array (dtype int).

The first column holds the operation codes and the second column holds the operation lengths.

Parameters:

cigar_str – CIGAR string.

Returns:

Array of operation codes and lengths (dtype int).

Raises:

ValueError – If the CIGAR string is not properly formatted.
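
To make the documented array layout concrete, here is a minimal, self-contained sketch of a parser that produces it. It is not the library's implementation: `cigar_to_arr_sketch`, `OP_CODE_SKETCH`, and the regular expression are hypothetical names introduced for illustration, and only the operation codes (M=0 through X=8) come from this reference.

```python
import re

import numpy as np

# Operation codes as documented for this module (M=0 ... X=8).
OP_CODE_SKETCH = {"M": 0, "I": 1, "D": 2, "N": 3, "S": 4, "H": 5, "P": 6, "=": 7, "X": 8}

# One CIGAR element: a run of digits followed by one operation character.
_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_to_arr_sketch(cigar_str: str) -> np.ndarray:
    """Hypothetical stand-in: column 0 is the op code, column 1 is the length."""
    rows = []
    pos = 0
    for match in _ELEMENT.finditer(cigar_str):
        # A gap between consecutive elements means unparseable characters.
        if match.start() != pos:
            raise ValueError(f"Malformed CIGAR string at offset {pos}: {cigar_str!r}")
        rows.append((OP_CODE_SKETCH[match.group(2)], int(match.group(1))))
        pos = match.end()
    # Trailing characters that did not form a complete element are also an error.
    if pos != len(cigar_str):
        raise ValueError(f"Malformed CIGAR string at offset {pos}: {cigar_str!r}")
    # reshape keeps the two-column shape even when no elements were parsed.
    return np.array(rows, dtype=int).reshape(-1, 2)


arr = cigar_to_arr_sketch("5S10=1X3I20=2D4H")
print(arr)
```

With the real module, the equivalent call would presumably be `agglovar.align.op.cigar_to_arr("5S10=1X3I20=2D4H")`. How the library treats edge cases such as an empty string is not stated here, so the sketch's behavior for those inputs is an assumption.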

agglovar.align.op.ADV_QRY_ARR: numpy.typing.NDArray[numpy.integer]

Array of operation codes that advance the query position.

agglovar.align.op.ADV_REF_ARR: numpy.typing.NDArray[numpy.integer]

Array of operation codes that advance the reference position.

agglovar.align.op.ALIGN_SET: frozenset[int]

Set of alignment operation codes (match, sequence match, mismatch).

agglovar.align.op.CIGAR_OP_SET: frozenset[str]

Set of valid operation characters.

agglovar.align.op.CLIP_SET: frozenset[int]

Set of clipping operation codes (soft and hard clipping).
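
A clipping set like this is typically used to separate clipped ends from the aligned core of a record. The sketch below assumes the set contains exactly the documented `S` (4) and `H` (5) codes. The variable names and the list of `(code, length)` pairs are illustrative and not part of the module.

```python
# Documented codes for soft and hard clipping.
S, H = 4, 5
clip_set = frozenset({S, H})

# (op code, length) pairs equivalent to the CIGAR string 4H5S10=2D20=3S.
ops = [(H, 4), (S, 5), (7, 10), (2, 2), (7, 20), (S, 3)]

# Walk inward from each end until a non-clipping operation is found.
start = 0
while start < len(ops) and ops[start][0] in clip_set:
    start += 1

end = len(ops)
while end > start and ops[end - 1][0] in clip_set:
    end -= 1

core_ops = ops[start:end]
clipped_left = sum(length for _, length in ops[:start])
clipped_right = sum(length for _, length in ops[end:])

print(core_ops, clipped_left, clipped_right)
```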

agglovar.align.op.CONSUMES_QRY_ARR: numpy.typing.NDArray[numpy.integer]

Array of operation codes that consume query bases.

agglovar.align.op.CONSUMES_REF_ARR: numpy.typing.NDArray[numpy.integer]

Array of operation codes that consume reference bases.
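
Arrays of consuming operations are useful for computing how many query and reference bases an alignment spans. The sketch below illustrates the idea with `numpy.isin`. The contents of `CONSUMES_QRY_ARR` and `CONSUMES_REF_ARR` are not listed in this reference, so the arrays here are an assumption based on the SAM specification (query: M, I, S, =, X; reference: M, D, N, =, X), and the lowercase names are hypothetical.

```python
import numpy as np

# Documented operation codes.
M, I, D, N, S, H, P, EQ, X = range(9)

# ASSUMPTION: membership follows the SAM specification; the actual contents
# of CONSUMES_QRY_ARR and CONSUMES_REF_ARR may differ.
consumes_qry = np.array([M, I, S, EQ, X])
consumes_ref = np.array([M, D, N, EQ, X])

# Two-column array in the layout cigar_to_arr is documented to return,
# equivalent to the CIGAR string 5S10=1X3I20=2D.
op_arr = np.array([[S, 5], [EQ, 10], [X, 1], [I, 3], [EQ, 20], [D, 2]])

# Select rows whose op code is in each array, then sum their lengths.
qry_len = int(op_arr[np.isin(op_arr[:, 0], consumes_qry), 1].sum())
ref_len = int(op_arr[np.isin(op_arr[:, 0], consumes_ref), 1].sum())

print(qry_len, ref_len)
```

Whether `ADV_QRY_ARR` and `ADV_REF_ARR` differ from the `CONSUMES_*` arrays (for example, in how clipping is counted) is not stated here, so check the arrays themselves before relying on either pair for coordinate arithmetic.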

agglovar.align.op.D: int = 2

Deletion operation code.

agglovar.align.op.EQ: int = 7

Sequence match operation code.

agglovar.align.op.EQX_SET: frozenset[int]

Set of exact match/mismatch operation codes.
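
As an illustration of how the operation-code sets can be used, the sketch below counts aligned bases and checks whether a record distinguishes matches from mismatches. `ALIGN_SET` is documented as match, sequence match, and mismatch. The contents of `EQX_SET` are inferred from its description as `{EQ, X}`, which is an assumption, and the lowercase names are illustrative.

```python
# Documented operation codes.
M, EQ, X = 0, 7, 8

align_set = frozenset({M, EQ, X})
# ASSUMPTION: inferred from the "exact match/mismatch" description.
eqx_set = frozenset({EQ, X})

# (op code, length) pairs equivalent to the CIGAR string 5S10=1X3I20=2D.
ops = [(4, 5), (7, 10), (8, 1), (1, 3), (7, 20), (2, 2)]

# Bases placed against the reference, whether matching or not.
aligned_bases = sum(length for code, length in ops if code in align_set)

# True when the record uses =/X; an M operation hides match vs. mismatch.
uses_eqx = any(code in eqx_set for code, _ in ops)
has_ambiguous_m = any(code == M for code, _ in ops)

print(aligned_bases, uses_eqx, has_ambiguous_m)
```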

agglovar.align.op.H: int = 5

Hard clipping operation code.

agglovar.align.op.I: int = 1

Insertion operation code.

agglovar.align.op.INT_STR_SET: frozenset[str]

Set of valid integer character strings representing operation codes.

agglovar.align.op.M: int = 0

Match or mismatch operation code.

agglovar.align.op.N: int = 3

Skipped region operation code.

agglovar.align.op.OP_CHAR: Mapping[int, str]

Mapping from operation codes to their character representations.

agglovar.align.op.OP_CHAR_FUNC

Vectorized function to convert operation codes to characters.

agglovar.align.op.OP_CODE: Mapping[str, int]

Mapping from CIGAR characters to operation codes.
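
The two mappings are inverses of each other, and a vectorized code-to-character function makes it possible to turn an operation array back into a CIGAR string. The sketch below rebuilds that round trip from the documented codes. How `OP_CHAR_FUNC` is actually constructed is not stated, so the use of `numpy.vectorize` is an assumption, and all lowercase names are hypothetical.

```python
import numpy as np

# Character-to-code mapping built from the documented codes.
op_code = {"M": 0, "I": 1, "D": 2, "N": 3, "S": 4, "H": 5, "P": 6, "=": 7, "X": 8}

# Code-to-character mapping is the inverse.
op_char = {code: char for char, code in op_code.items()}


def _code_to_char(code) -> str:
    """Look up the character for a single operation code."""
    return op_char[int(code)]


# ASSUMPTION: numpy.vectorize is one way to apply the lookup element-wise.
op_char_func = np.vectorize(_code_to_char, otypes=[str])

# Two-column array in the layout cigar_to_arr is documented to return.
op_arr = np.array([[4, 5], [7, 10], [8, 1], [1, 3], [7, 20], [2, 2]])

# Convert codes to characters and join each length with its character.
chars = op_char_func(op_arr[:, 0])
cigar_str = "".join(f"{length}{char}" for char, length in zip(chars, op_arr[:, 1]))

print(cigar_str)
```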

agglovar.align.op.P: int = 6

Padding operation code.

agglovar.align.op.S: int = 4

Soft clipping operation code.

agglovar.align.op.VAR_ARR: numpy.typing.NDArray[numpy.integer]

Array of operation codes that introduce variation.

agglovar.align.op.X: int = 8

Sequence mismatch operation code.