agglovar.kmer.util
==================

.. py:module:: agglovar.kmer.util

.. autoapi-nested-parse::

   Basic k-mer manipulation utilities.


Attributes
----------

.. autoapisummary::

   agglovar.kmer.util.BASE_TO_INT
   agglovar.kmer.util.BYTE_SIZE_TO_NUMPY_UINT
   agglovar.kmer.util.INT_TO_BASE
   agglovar.kmer.util.NP_MAX_BIT_SIZE
   agglovar.kmer.util.NP_MAX_BYTE_SIZE
   agglovar.kmer.util.NP_MAX_KMER_SIZE


Classes
-------

.. autoapisummary::

   agglovar.kmer.util.KmerUtil


Functions
---------

.. autoapisummary::

   agglovar.kmer.util.stream
   agglovar.kmer.util.stream_index


Module Contents
---------------

.. py:class:: KmerUtil(k_size: int, k_min_size: int = 0, k_min_mask: int = 0)

   Manages basic k-mer functions, such as converting formats, appending bases, and
   reverse-complementing. Contains a set of constants specific to the k-mer size and
   minimizer (if defined).

   :ivar k_size: K-mer size.
   :ivar k_bit_size: Number of bits in a k-mer. Minimum size of unsigned integers storing k-mers.
   :ivar k_byte_size: Size of k-mers in bytes.
   :ivar k_min_size: Minimizer size, or 0 if a minimizer is not used.
   :ivar k_min_mask: Minimizer mask, if set and `k_min_size` is not 0.
   :ivar k_mask: Mask for the k-mer part of an integer. Masks out unused bits if an integer has
       more bits than `k_bit_size`.
   :ivar min_kmer_util: K-mer util for the minimizer. `None` if a minimizer is not defined.
   :ivar minimizer_mask: Mask for extracting minimizers from k-mers. `0` if a minimizer is not
       defined.
   :ivar sub_per_kmer: Number of sub-k-mers per k-mer. `None` if a minimizer is not defined.
   :ivar np_int_type: Smallest numpy unsigned integer type that can store k-mers of `k_size`.
       `None` if k-mers of `k_size` do not fit in the largest numpy unsigned integer type
       (8 bytes, np.uint64).


   .. py:method:: append(kmer: int, base: str) -> int

      Shift k-mer one base and append a new base.

      :param kmer: Old k-mer.
      :param base: Base to be appended.

      :returns: New k-mer with appended base.


   .. py:method:: canonical_complement(kmer: int) -> int

      Get the canonical k-mer of a k-mer. The canonical k-mer is the lesser of the k-mer and
      its reverse-complement.

      :param kmer: K-mer.

      :returns: `kmer` if it is less than the reverse-complement, and the reverse-complement
          otherwise.


   .. py:method:: minimizer(kmer: int) -> int

      Get the minimizer of a k-mer. The minimizer of a k-mer is the least of all sub-k-mers and
      their reverse-complements.

      This function can only be used if a minimizer size is defined.

      :param kmer: K-mer.

      :returns: Minimizer.


   .. py:method:: rev_complement(kmer: int) -> int

      Reverse-complement k-mer.

      :param kmer: K-mer.

      :returns: Reverse-complement of `kmer`.


   .. py:method:: rev_complement_array(kmer_arr: numpy.ndarray) -> numpy.ndarray

      Reverse-complement k-mers in an array.

      :param kmer_arr: K-mer array.

      :returns: Reverse-complemented k-mers.


   .. py:method:: to_kmer(k_str: str) -> int

      Convert a string to a k-mer.

      :param k_str: K-mer string.

      :returns: K-mer integer.


   .. py:method:: to_string(kmer: int) -> str

      Translate integer k-mer to a string.

      :param kmer: Integer k-mer.

      :returns: String representation of `kmer`.


   .. py:attribute:: k_bit_size
      :type:  int


   .. py:attribute:: k_byte_size
      :type:  int


   .. py:attribute:: k_mask
      :type:  int


   .. py:attribute:: k_min_mask
      :type:  int


   .. py:attribute:: k_min_size
      :type:  int


   .. py:attribute:: k_size
      :type:  int


   .. py:attribute:: min_kmer_util
      :type:  Optional[Self]


   .. py:attribute:: minimizer_mask
      :type:  int


   .. py:attribute:: np_int_type
      :type:  Optional[numpy.integer]


   .. py:attribute:: sub_per_kmer
      :type:  Optional[int]


.. py:function:: stream(seq: str, kutil: KmerUtil) -> Iterator[int]

   Get an iterator for k-mers in a sequence.

   :param seq: String sequence of bases.
   :param kutil: K-mer util describing parameters (e.g. k-mer size).

   :yields: K-mers.


.. py:function:: stream_index(seq: str, kutil: KmerUtil) -> Iterator[tuple[int, int]]

   Get an iterator for k-mers in a sequence with their index.

   :param seq: String sequence of bases.
   :param kutil: K-mer util describing parameters (e.g. k-mer size).

   :returns: Iterator of tuples (kmer, index). Index starts at 0 and increments for each k-mer.


.. py:data:: BASE_TO_INT
   :type:  Mapping[str, int]

   Maps base string to 2-bit integer.


.. py:data:: BYTE_SIZE_TO_NUMPY_UINT
   :type:  Mapping[int, numpy.integer]

   Maps k-mer byte size to the smallest numpy unsigned integer type that can store it. Only
   defined for sizes up to the maximum numpy unsigned integer size (8 bytes, np.uint64).


.. py:data:: INT_TO_BASE
   :type:  list[str]
   :value: ['A', 'C', 'G', 'T']

   Maps 2-bit integer to base string.


.. py:data:: NP_MAX_BIT_SIZE

   Maximum k-mer bit size for numpy arrays.


.. py:data:: NP_MAX_BYTE_SIZE

   Maximum k-mer byte size for numpy arrays.


.. py:data:: NP_MAX_KMER_SIZE

   Maximum k-mer size for numpy arrays.
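
The 2-bit encoding described on this page can be illustrated with a minimal, standalone
sketch. It does not import agglovar and is not the library's implementation: the functions
below borrow the documented names but take ``k_size`` as an explicit argument, ``BASE_TO_INT``
is assumed to be the plain inverse of the documented ``INT_TO_BASE`` value, and placing the
first base in the highest bits is an assumption made for illustration.

.. code-block:: python

   # Illustrative sketch only; NOT agglovar's implementation.
   INT_TO_BASE = ['A', 'C', 'G', 'T']  # value documented on this page
   # Assumption: BASE_TO_INT is the plain inverse of INT_TO_BASE.
   BASE_TO_INT = {base: i for i, base in enumerate(INT_TO_BASE)}


   def to_kmer(k_str: str) -> int:
       """Pack a base string into an integer, 2 bits per base (first base in the high bits)."""
       kmer = 0
       for base in k_str:
           kmer = (kmer << 2) | BASE_TO_INT[base]
       return kmer


   def to_string(kmer: int, k_size: int) -> str:
       """Unpack a 2-bit-encoded integer into a base string of length k_size."""
       return ''.join(
           INT_TO_BASE[(kmer >> (2 * shift)) & 0x3]
           for shift in range(k_size - 1, -1, -1)
       )


   def append(kmer: int, base: str, k_size: int) -> int:
       """Shift left by one base, add the new base, and mask off the base that fell out."""
       k_mask = (1 << (2 * k_size)) - 1  # keeps only the low 2 * k_size bits
       return ((kmer << 2) | BASE_TO_INT[base]) & k_mask


   def rev_complement(kmer: int, k_size: int) -> int:
       """Reverse-complement; with A=0, C=1, G=2, T=3 the complement of b is 3 - b."""
       rc = 0
       for _ in range(k_size):
           rc = (rc << 2) | (3 - (kmer & 0x3))
           kmer >>= 2
       return rc


   def canonical(kmer: int, k_size: int) -> int:
       """Lesser of a k-mer and its reverse-complement."""
       return min(kmer, rev_complement(kmer, k_size))


   def stream(seq: str, k_size: int):
       """Yield each k-mer of seq by appending one base at a time."""
       kmer = 0
       for i, base in enumerate(seq):
           kmer = append(kmer, base, k_size)
           if i >= k_size - 1:  # the first k_size - 1 steps only fill the window
               yield kmer


   print(to_string(append(to_kmer("ACGT"), "A", 4), 4))   # CGTA
   print(to_string(canonical(to_kmer("GTTT"), 4), 4))     # AAAC
   print([to_string(kmer, 4) for kmer in stream("ACGTA", 4)])  # ['ACGT', 'CGTA']

In this sketch a k-mer of size ``k`` occupies ``2 * k`` bits, so appending a base is a shift,
an OR, and a mask, and streaming a sequence costs one such update per base rather than
re-encoding every window from scratch.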