agglovar.fa
Reference and FASTA processing utilities.
Functions
fa_info | Get a table of reference information from a FASTA file.
read_fai | Read an FAI file.
Module Contents
- agglovar.fa.fa_info(ref_fa: str | pathlib.Path) → polars.DataFrame
Get a table of reference information from a FASTA file.
The FASTA file must have an accompanying “.fai” index file.
- Table columns:
chrom: Chromosome name.
md5: MD5 of the sequence.
order: Order of the sequence in the FASTA file.
pos: Start position of the sequence in the FASTA file.
line_bp: Number of base pairs per line.
line_bytes: Number of bytes per line.
- Parameters:
ref_fa – Reference FASTA.
- Returns:
A table with sequence information.
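To make the documented columns concrete, here is a minimal standard-library sketch of how such a table could be derived from a FASTA file and its “.fai” index. It is not agglovar's implementation: the function name `fa_info_sketch` is hypothetical, it returns a list of dicts rather than a polars.DataFrame, and three details are assumptions (that `order` is 0-based, that `pos` is the byte offset recorded in the index, and that the MD5 is taken over the uppercased sequence with line terminators removed).

```python
import hashlib
from pathlib import Path


def fa_info_sketch(ref_fa):
    """Build one record per sequence from a FASTA file and its ".fai" index.

    Hypothetical helper for illustration only; not part of agglovar.
    """
    ref_fa = Path(ref_fa)
    fai_path = Path(str(ref_fa) + ".fai")  # the index sits next to the FASTA
    rows = []

    with open(fai_path) as fai_in, open(ref_fa, "rb") as fa_in:
        for order, line in enumerate(fai_in):  # assumption: 0-based order
            fields = line.rstrip("\n").split("\t")
            chrom = fields[0]
            length, pos, line_bp, line_bytes = (int(x) for x in fields[1:5])

            # Bytes from the first to the last base, including the line
            # terminators that fall between them.
            if length > 0:
                n_breaks = (length - 1) // line_bp
                span = length + n_breaks * (line_bytes - line_bp)
            else:
                span = 0

            fa_in.seek(pos)
            raw = fa_in.read(span)
            seq = raw.replace(b"\n", b"").replace(b"\r", b"")

            rows.append({
                "chrom": chrom,
                # assumption: MD5 over the uppercased sequence
                "md5": hashlib.md5(seq.upper()).hexdigest(),
                "order": order,
                "pos": pos,
                "line_bp": line_bp,
                "line_bytes": line_bytes,
            })

    return rows
```

Seeking with `pos`, `line_bp`, and `line_bytes` avoids scanning the whole FASTA for record headers, which is the reason an index is required.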
- agglovar.fa.read_fai(fai_file_name: str | pathlib.Path, cols: Iterable[str] | None = ('chrom', 'len'), name: str = None) → polars.DataFrame
Read an FAI file.
By default, return a table of chromosome (or contig) names and their lengths (columns chrom and len).
Available columns are: chrom, len, pos, line_bp, line_bytes
- Parameters:
fai_file_name – File to read.
cols – Select these columns. None to retain all columns.
name – Name of the chromosome column (typically “qry_id” to match query alignment tables).
- Returns:
A table of the FAI file.
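The column selection and renaming described above can be sketched with the standard library alone. This is an illustration of the documented parameters, not agglovar's code: `read_fai_sketch` is a hypothetical name, it returns a list of dicts instead of a polars.DataFrame, and it assumes that `name=None` leaves the chromosome column named “chrom”.

```python
from pathlib import Path

# The five standard FAI fields, named as in the documentation above.
FAI_COLS = ("chrom", "len", "pos", "line_bp", "line_bytes")


def read_fai_sketch(fai_file_name, cols=("chrom", "len"), name=None):
    """Read an FAI file into a list of dicts (hypothetical helper)."""
    keep = FAI_COLS if cols is None else tuple(cols)

    unknown = [col for col in keep if col not in FAI_COLS]
    if unknown:
        raise ValueError(f"Unknown FAI columns: {unknown}")

    rows = []
    with open(Path(fai_file_name)) as fai_in:
        for line in fai_in:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines

            fields = line.split("\t")[:5]
            record = dict(zip(FAI_COLS, fields))

            # Every field except the sequence name is an integer.
            for key in FAI_COLS[1:]:
                record[key] = int(record[key])

            # Keep the requested columns; rename "chrom" if a name is given
            # (assumption: None keeps the default column name).
            rows.append({
                (name if col == "chrom" and name is not None else col): record[col]
                for col in keep
            })

    return rows
```

For example, `read_fai_sketch("ref.fa.fai", name="qry_id")` would yield records keyed by `qry_id` and `len`, which lines up with tables that identify sequences by query ID.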