agglovar.kmer.util
==================
.. py:module:: agglovar.kmer.util
.. autoapi-nested-parse::
Basic k-mer manipulation utilities.
Attributes
----------
.. autoapisummary::
agglovar.kmer.util.BASE_TO_INT
agglovar.kmer.util.BYTE_SIZE_TO_NUMPY_UINT
agglovar.kmer.util.INT_TO_BASE
agglovar.kmer.util.NP_MAX_BIT_SIZE
agglovar.kmer.util.NP_MAX_BYTE_SIZE
agglovar.kmer.util.NP_MAX_KMER_SIZE
Classes
-------
.. autoapisummary::
agglovar.kmer.util.KmerUtil
Functions
---------
.. autoapisummary::
agglovar.kmer.util.stream
agglovar.kmer.util.stream_index
Module Contents
---------------
.. py:class:: KmerUtil(k_size: int, k_min_size: int = 0, k_min_mask: int = 0)
Manages basic k-mer functions, such as converting formats, appending bases, and reverse-complementing.
Contains a set of constants specific to the k-mer size and minimizer (if defined).
:ivar k_size: K-mer size.
:ivar k_bit_size: Number of bits in a k-mer. Minimum size of unsigned integers storing k-mers.
:ivar k_byte_size: Size of k-mers in bytes.
:ivar k_min_size: Minimizer size or 0 if a minimizer is not used.
:ivar k_min_mask: Minimizer mask if set and `k_min_size` is not 0.
:ivar k_mask: Mask for k-mer part of integer. Masks out unused bits if an integer has more bits
than `k_bit_size`.
:ivar min_kmer_util: K-mer util for minimizer. `None` if a minimizer is not defined.
:ivar minimizer_mask: Mask for extracting minimizers from k-mers. `0` if a
minimizer is not defined.
:ivar sub_per_kmer: Number of sub-k-mers per k-mer. `None` if a minimizer is not
defined.
:ivar np_int_type: Smallest numpy unsigned integer type that can store k-mers of `k_size`. `None`
if a k-mer of `k_size` does not fit in the largest numpy unsigned integer type (8 bytes: np.uint64).
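The relationship between the size attributes above can be sketched as follows. This is a minimal illustration, not the library's code: it assumes two bits per base (suggested by the four-entry `INT_TO_BASE` table) and that the byte size is the bit size rounded up to whole bytes. The helper name `kmer_constants` is hypothetical.

```python
import math


def kmer_constants(k_size: int) -> dict:
    """Derive size constants for a k-mer, assuming a 2-bit-per-base encoding."""
    k_bit_size = 2 * k_size                   # assumption: two bits per base
    k_byte_size = math.ceil(k_bit_size / 8)   # whole bytes needed to hold the bits
    k_mask = (1 << k_bit_size) - 1            # the low k_bit_size bits set to 1
    return {
        "k_bit_size": k_bit_size,
        "k_byte_size": k_byte_size,
        "k_mask": k_mask,
    }


if __name__ == "__main__":
    print(kmer_constants(5))
```

Under these assumptions a 5-mer needs 10 bits, which spills into a second byte, and the mask keeps only those 10 bits when the k-mer is stored in a wider integer.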
.. py:method:: append(kmer: int, base: str) -> int
Shift k-mer one base and append a new base.
:param kmer: Old k-mer.
:param base: Base to be appended.
:returns: New k-mer with appended base.
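The shift-and-append step can be illustrated with plain integers. The sketch below is standalone and does not call `KmerUtil`. It assumes the encoding A=0, C=1, G=2, T=3 and that the newest base occupies the lowest two bits; `append_base` is a hypothetical name.

```python
# Assumed 2-bit encoding, consistent with INT_TO_BASE = ['A', 'C', 'G', 'T'].
BASE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}


def append_base(kmer: int, base: str, k_size: int) -> int:
    """Shift the k-mer left by one base, add the new base, and drop the oldest."""
    k_mask = (1 << (2 * k_size)) - 1          # keeps only the low 2*k_size bits
    return ((kmer << 2) | BASE_TO_INT[base]) & k_mask


if __name__ == "__main__":
    acg = 0b000110                            # "ACG" as a 3-mer
    print(bin(append_base(acg, "T", 3)))      # "CGT" under the assumed encoding
```

Masking after the shift is what discards the oldest base, so the result always stays within `2 * k_size` bits.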
.. py:method:: canonical_complement(kmer: int) -> int
Get the canonical k-mer of a k-mer.
The canonical k-mer is the lesser of the k-mer and its reverse-complement.
:param kmer: K-mer.
:returns: `kmer` if it is less than the reverse-complement, and the reverse-complement otherwise.
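As a rough illustration of the rule above, the following standalone sketch computes a reverse-complement on 2-bit encoded integers and takes the smaller value. It assumes A=0, C=1, G=2, T=3, so that the complement of a base code `b` is `3 - b`. The function names and the explicit `k_size` argument are illustrative and are not the `KmerUtil` signatures.

```python
def rev_complement(kmer: int, k_size: int) -> int:
    """Reverse-complement a 2-bit encoded k-mer (assumes A=0, C=1, G=2, T=3)."""
    rc = 0
    for _ in range(k_size):
        rc = (rc << 2) | (3 - (kmer & 3))     # complement the lowest base, push it on
        kmer >>= 2                            # move to the next base
    return rc


def canonical(kmer: int, k_size: int) -> int:
    """Return the lesser of a k-mer and its reverse-complement."""
    return min(kmer, rev_complement(kmer, k_size))


if __name__ == "__main__":
    cgt = 0b011011                            # "CGT"
    print(canonical(cgt, 3))                  # value of "ACG", the smaller of the pair
```

Because both strands map to the same canonical value, k-mers can be compared without knowing which strand a sequence came from.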
.. py:method:: minimizer(kmer: int) -> int
Get the minimizer of a k-mer.
The minimizer of a k-mer is the smallest of all its sub-k-mers and their reverse-complements.
This function can only be used if a minimizer size is defined.
:param kmer: K-mer.
:returns: Minimizer.
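A minimal sketch of this definition follows. It slides a window of `k_min_size` bases across the k-mer and keeps the smallest value seen on either strand. It assumes the 2-bit encoding A=0, C=1, G=2, T=3 and deliberately ignores `k_min_mask`, whose exact role is not described here. All names are hypothetical, and the library may compute the result differently.

```python
def rev_complement(kmer: int, k_size: int) -> int:
    """Reverse-complement a 2-bit encoded k-mer (assumes A=0, C=1, G=2, T=3)."""
    rc = 0
    for _ in range(k_size):
        rc = (rc << 2) | (3 - (kmer & 3))
        kmer >>= 2
    return rc


def minimizer(kmer: int, k_size: int, k_min_size: int) -> int:
    """Smallest sub-k-mer of length k_min_size over both strands (no mask applied)."""
    sub_mask = (1 << (2 * k_min_size)) - 1
    sub_per_kmer = k_size - k_min_size + 1    # number of windows in one k-mer
    best = None
    for i in range(sub_per_kmer):
        sub = (kmer >> (2 * i)) & sub_mask    # window starting i bases from the end
        candidate = min(sub, rev_complement(sub, k_min_size))
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    tgca = 0b11100100                         # "TGCA"
    print(minimizer(tgca, 4, 2))              # smallest 2-mer on either strand
```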
.. py:method:: rev_complement(kmer: int) -> int
Reverse-complement k-mer.
:param kmer: K-mer.
:returns: Reverse-complement of `kmer`.
.. py:method:: rev_complement_array(kmer_arr: numpy.ndarray) -> numpy.ndarray
Reverse-complement k-mers in an array.
:param kmer_arr: K-mer array.
:returns: Reverse-complemented k-mers.
.. py:method:: to_kmer(k_str: str) -> int
Convert a string to a k-mer.
:param k_str: K-mer string.
:returns: K-mer integer.
.. py:method:: to_string(kmer: int) -> str
Translate integer k-mer to a string.
:param kmer: Integer k-mer.
:returns: String representation of `kmer`.
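The two conversions are inverses of each other. The standalone sketch below shows one way to implement them, assuming the first base of the string lands in the most significant bits and using the `INT_TO_BASE` order documented in this module. It does not call `KmerUtil`, and the explicit `k_size` argument is an artifact of the sketch.

```python
INT_TO_BASE = ["A", "C", "G", "T"]
BASE_TO_INT = {base: code for code, base in enumerate(INT_TO_BASE)}


def to_kmer(k_str: str) -> int:
    """Pack a string of bases into an integer, two bits per base."""
    kmer = 0
    for base in k_str:
        kmer = (kmer << 2) | BASE_TO_INT[base]
    return kmer


def to_string(kmer: int, k_size: int) -> str:
    """Unpack an integer k-mer into its base string."""
    bases = []
    for _ in range(k_size):
        bases.append(INT_TO_BASE[kmer & 3])   # read the lowest base
        kmer >>= 2
    return "".join(reversed(bases))           # lowest base is the last character


if __name__ == "__main__":
    kmer = to_kmer("ACG")
    print(kmer, to_string(kmer, 3))
```

The k-mer size has to be supplied when decoding because leading `A` bases are encoded as zero bits and cannot be recovered from the integer alone.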
.. py:attribute:: k_bit_size
:type: int
.. py:attribute:: k_byte_size
:type: int
.. py:attribute:: k_mask
:type: int
.. py:attribute:: k_min_mask
:type: int
.. py:attribute:: k_min_size
:type: int
.. py:attribute:: k_size
:type: int
.. py:attribute:: min_kmer_util
:type: Optional[Self]
.. py:attribute:: minimizer_mask
:type: int
.. py:attribute:: np_int_type
:type: Optional[numpy.integer]
.. py:attribute:: sub_per_kmer
:type: Optional[int]
.. py:function:: stream(seq: str, kutil: KmerUtil) -> Iterator[int]
Get an iterator for k-mers in a sequence.
:param seq: String sequence of bases.
:param kutil: K-mer util describing parameters (e.g. k-mer size).
:yields: K-mers.
.. py:function:: stream_index(seq: str, kutil: KmerUtil) -> Iterator[tuple[int, int]]
Get an iterator for k-mers in a sequence with their index.
:param seq: String sequence of bases.
:param kutil: K-mer util describing parameters (e.g. k-mer size).
:yields: Tuples of (kmer, index). The index starts at 0 and increments by one for each k-mer.
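Both streaming functions can be pictured as a rolling window over the sequence. The generator below is a self-contained sketch rather than the library implementation. It assumes the sequence contains only `A`, `C`, `G`, and `T` (how other characters are handled is not described here) and takes `k_size` directly instead of a `KmerUtil` instance. The name `stream_index_sketch` is hypothetical.

```python
from typing import Iterator

BASE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}   # assumed 2-bit encoding


def stream_index_sketch(seq: str, k_size: int) -> Iterator[tuple[int, int]]:
    """Yield (kmer, index) for every k-mer in seq using a rolling integer."""
    k_mask = (1 << (2 * k_size)) - 1
    kmer = 0
    for pos, base in enumerate(seq):
        # Roll the window forward by one base.
        kmer = ((kmer << 2) | BASE_TO_INT[base]) & k_mask
        if pos >= k_size - 1:                    # window is full from here on
            yield kmer, pos - k_size + 1


if __name__ == "__main__":
    for kmer, index in stream_index_sketch("ACGT", 3):
        print(index, kmer)
```

Dropping the index from each tuple gives the behaviour described for `stream`. Each step reuses the previous k-mer, so the work per base is constant regardless of `k_size`.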
.. py:data:: BASE_TO_INT
:type: Mapping[str, int]
Maps base string to 2-bit integer.
.. py:data:: BYTE_SIZE_TO_NUMPY_UINT
:type: Mapping[int, numpy.integer]
Maps k-mer byte size to the smallest numpy unsigned integer type that can store it.
Only defined for sizes up to the maximum numpy unsigned integer size (8 bytes: np.uint64).
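The lookup this mapping provides can be approximated without numpy. The sketch below returns the name of the narrowest unsigned type for a given byte size, assuming the available widths are 1, 2, 4, and 8 bytes. The function `smallest_uint_name` is hypothetical, and the actual mapping stores numpy types rather than strings.

```python
from typing import Optional


def smallest_uint_name(byte_size: int) -> Optional[str]:
    """Name of the narrowest unsigned integer type holding byte_size bytes."""
    for width in (1, 2, 4, 8):                # assumed numpy unsigned widths in bytes
        if byte_size <= width:
            return f"uint{width * 8}"
    return None                               # too large for any numpy unsigned type


if __name__ == "__main__":
    for size in (1, 3, 8, 9):
        print(size, smallest_uint_name(size))
```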
.. py:data:: INT_TO_BASE
:type: list[str]
:value: ['A', 'C', 'G', 'T']
Maps 2-bit integer to base string.
.. py:data:: NP_MAX_BIT_SIZE
Maximum k-mer bit size for numpy arrays.
.. py:data:: NP_MAX_BYTE_SIZE
Maximum k-mer byte size for numpy arrays.
.. py:data:: NP_MAX_KMER_SIZE
Maximum k-mer size for numpy arrays.
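The three limits are related by simple arithmetic. The values below are not listed in this reference; they are derived from the 8-byte `np.uint64` limit mentioned above and from an assumed encoding of two bits per base, so treat them as an illustration rather than the module's definitions.

```python
# Assumed values, derived from the 8-byte np.uint64 limit.
NP_MAX_BYTE_SIZE = 8                          # bytes in the widest numpy unsigned integer
NP_MAX_BIT_SIZE = NP_MAX_BYTE_SIZE * 8        # bits available for one k-mer
NP_MAX_KMER_SIZE = NP_MAX_BIT_SIZE // 2       # bases, at an assumed 2 bits per base

if __name__ == "__main__":
    print(NP_MAX_BYTE_SIZE, NP_MAX_BIT_SIZE, NP_MAX_KMER_SIZE)
```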