bio.std.hts.bam.pileup

This module provides iteration over the columns of an alignment.

The function makePileup is called on a range of coordinate-sorted reads mapped to the same reference. It returns an input range of columns.

The returned range can then be iterated with foreach. The first column is located at the same reference position as the first base of the first read.
Each popFront operation advances the current position on the reference. By default, sites with zero coverage are excluded from the iteration.
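
The iteration semantics described above can be illustrated with a toy model. This is a minimal sketch and not the module's implementation: ToyRead, ToyColumn and toyPileup are hypothetical names, reads are reduced to half-open intervals on the reference, and the columns are built eagerly, whereas makePileup returns an input range of columns.

```d
import std.algorithm : count, map, maxElement;

/// Hypothetical simplified read: covers the half-open interval [start, end) on the reference.
struct ToyRead {
    ulong start;
    ulong end;
}

/// Hypothetical simplified column: a reference position and the number of reads covering it.
struct ToyColumn {
    ulong position;
    size_t coverage;
}

/// Builds columns starting at the first base of the first read.
/// Assumes the reads are coordinate-sorted and mapped to the same reference.
ToyColumn[] toyPileup(const(ToyRead)[] reads, bool skipZeroCoverage = true) {
    if (reads.length == 0)
        return null;
    ulong first = reads[0].start;
    ulong last = reads.map!(r => r.end).maxElement;
    ToyColumn[] columns;
    foreach (ulong pos; first .. last) {
        // Count the reads overlapping the current reference position.
        immutable cov = reads.count!(r => r.start <= pos && pos < r.end);
        if (cov == 0 && skipZeroCoverage)
            continue; // default behaviour: zero-coverage sites are not visited
        columns ~= ToyColumn(pos, cov);
    }
    return columns;
}

unittest {
    auto reads = [ToyRead(10, 13), ToyRead(11, 14), ToyRead(20, 22)];
    auto columns = toyPileup(reads);
    assert(columns[0].position == 10);          // first base of the first read
    assert(columns.length == 6);                // positions 14 .. 19 are skipped
    assert(toyPileup(reads, false).length == 12);
}
```

The quadratic scan over all reads keeps the sketch short; it is only meant to show which positions are visited.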

Each column keeps the set of reads that overlap the corresponding position on the reference. If the reads contain MD tags and makePileup was asked to use them, the reference base at the column is also available.


Each read preserves all standard read properties and additionally keeps column-related information, namely:

  • number of bases consumed from the read sequence so far
  • current CIGAR operation and the offset within it
  • all CIGAR operations before and after the current one

Because every column corresponds to a position on the reference and insertions consume no reference bases, the current CIGAR operation can never be an insertion. The following are suggested ways to check for insertions adjacent to the current position:

  • cigar_after.length > 0 && cigar_operation_offset == cigar_operation.length - 1 && cigar_after[0].type == 'I'
  • cigar_before.length > 0 && cigar_operation_offset == 0 && cigar_before[$ - 1].type == 'I'
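
The two checks above can be tried out in isolation. The following is a minimal sketch: CigarOp, insertionFollows and insertionPrecedes are hypothetical stand-ins written for this illustration and are not part of the module; with a real pileup read, the same expressions would be applied to its cigar_before, cigar_operation, cigar_operation_offset and cigar_after properties.

```d
/// Hypothetical stand-in for a CIGAR operation: a type character and a length.
struct CigarOp {
    char type;
    uint length;
}

/// True if the column sits on the last base of the current operation
/// and the next operation in the read is an insertion.
bool insertionFollows(const CigarOp current, size_t offset, const(CigarOp)[] after) {
    return after.length > 0
        && offset == current.length - 1
        && after[0].type == 'I';
}

/// True if the column sits on the first base of the current operation
/// and the previous operation in the read is an insertion.
bool insertionPrecedes(size_t offset, const(CigarOp)[] before) {
    return before.length > 0
        && offset == 0
        && before[$ - 1].type == 'I';
}

unittest {
    // A read with CIGAR 3M2I4M.
    auto cigar = [CigarOp('M', 3), CigarOp('I', 2), CigarOp('M', 4)];

    // Column on the last base of 3M: the insertion comes right after it.
    assert(insertionFollows(cigar[0], 2, cigar[1 .. $]));
    // Column in the middle of 3M: no adjacent insertion yet.
    assert(!insertionFollows(cigar[0], 1, cigar[1 .. $]));
    // Column on the first base of 4M: the insertion comes right before it.
    assert(insertionPrecedes(0, cigar[0 .. 2]));
    assert(!insertionPrecedes(1, cigar[0 .. 2]));
}
```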

    Members

    Classes

    PileupRange
    class PileupRange(R, alias TColumn = PileupColumn)

    Class for iterating over reference bases together with the reads overlapping them.

    PileupRangeUsingMdTag
    class PileupRangeUsingMdTag(R)

    Tracks the current reference base.

    Functions

    makePileup
    auto makePileup(R reads, bool use_md_tag, ulong start_from, ulong end_at, bool skip_zero_coverage)

    Creates a pileup range from a range of reads. Note that all reads must be aligned to the same reference.

    pileupChunks
    auto pileupChunks(R reads, bool use_md_tag, size_t block_size, ulong start_from, ulong end_at)

    This function constructs a range of non-overlapping consecutive pileups from a range of reads, so that these pileups can be processed in parallel.

    pileupColumns
    auto pileupColumns(R reads, bool use_md_tag, bool skip_zero_coverage)
    Undocumented in source. Be warned that the author may not have intended to support it.
    pileupInstance
    auto pileupInstance(R reads, ulong start_from, ulong end_at, bool skip_zero_coverage)
    Undocumented in source. Be warned that the author may not have intended to support it.
    takeUntil
    auto takeUntil(Range range, Sentinel sentinel)
    Undocumented in source. Be warned that the author may not have intended to support it.

    Manifest constants

    useMD
    enum useMD;

    Allows the intention to be expressed more clearly.

    Structs

    AbstractPileup
    struct AbstractPileup(R, S)

    Abstract pileup structure. S is the type of the column range.

    PileupChunkRange
    struct PileupChunkRange(C)
    Undocumented in source.
    PileupColumn
    struct PileupColumn(R)

    Represents a single pileup column

    PileupRead
    struct PileupRead(Read = bio.std.hts.bam.read.EagerBamRead)

    Represents a read aligned to a column

    TakeUntil
    struct TakeUntil(alias pred, Range, Sentinel)
    Undocumented in source.

    Examples

    import bio.std.hts.bam.reader, bio.std.hts.bam.pileup, std.stdio, std.algorithm : count;
    void main() {
        auto bam = new BamReader("file.bam");       // assume single reference and MD tags
        auto pileup = bam.reads().makePileup(useMD);
        foreach (column; pileup) {
            auto matches = column.bases.count(column.reference_base);
            if (matches < column.coverage * 2 / 3)
                writeln(column.position);           // print positions of possible mismatches
        }
    }

    Meta