biocantor.location.location_impl

Module Contents

Classes

SingleInterval

A single contiguous interval within a sequence

CompoundInterval

A location consisting of multiple intervals

_EmptyLocation

Singleton object representing an empty location

Functions

EmptyLocation()

Returns the single EmptyLocation instance

_union_preserve_overlaps(...)

Attributes

HAS_CGRANGES

biocantor.location.location_impl.HAS_CGRANGES = False
class biocantor.location.location_impl.SingleInterval(start: int, end: int, strand: inscripta.biocantor.location.strand.Strand, parent: Optional[inscripta.biocantor.util.types.ParentInputType] = None)

Bases: inscripta.biocantor.location.location.Location

A single contiguous interval within a sequence

property is_contiguous: bool

Returns True iff this Location is fully contiguous within its parent

property is_empty: bool

Returns True iff this Location is empty

property blocks: List[inscripta.biocantor.location.location.Location]

Returns list of contiguous blocks comprising this Location

property num_blocks: int

Returns number of contiguous blocks comprising this Location

property is_overlapping: bool

SingleInterval is by definition always non-overlapping

property _full_span_interval: inscripta.biocantor.location.location.Location

Returns the full span of this interval; is trivial for a SingleInterval and EmptyLocation

__str__()

Returns a human readable string representation of this Location

__repr__()

Returns the ‘official’ string representation of this Location

scan_blocks() Iterator[inscripta.biocantor.location.location.Location]

Returns an iterator over blocks in order relative to strand of this Location

__eq__(other)

Returns True iff this Location is equal to other object

optimize_blocks() inscripta.biocantor.location.location.Location

Returns a new Location covering the same positions but with blocks optimized. For example, empty blocks may be removed or adjacent blocks may be combined if applicable.

gap_list() List[inscripta.biocantor.location.location.Location]

Returns list of contiguous regions comprising the space between blocks of this Location. List is ordered relative to strand of this Location.

gaps_location() inscripta.biocantor.location.location.Location

Returns a Location representing the space between blocks of this Location.

__hash__()

Returns a hash code satisfying location1 == location2 => hash(location1) == hash(location2)

__lt__(other: inscripta.biocantor.location.location.Location)

Returns True iff this Location is less than other (self < other).

compare(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.util.ordering.RelativeOrder

Returns a negative integer if this Location is less than the other Location, positive integer if it is greater, and zero otherwise.

extract_sequence() inscripta.biocantor.sequence.Sequence

Extracts the sequence of this Location from the parent. Concrete implementations should raise ValueError if no parent exists.

parent_to_relative_pos(parent_pos: int) int

Converts a position on the parent to a position relative to this Location. Concrete implementations should raise ValueError if the given position does not overlap this Location.

relative_to_parent_pos(relative_pos: int) int

Converts a position relative to this Location to a position on the parent
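The two position conversions above can be illustrated in plain Python. This is a minimal sketch, not BioCantor's implementation: it assumes 0-based, end-exclusive coordinates, represents the strand with the strings "+" and "-" in place of Strand.PLUS and Strand.MINUS, and uses hypothetical free functions rather than methods.

```python
def parent_to_relative_pos(start: int, end: int, strand: str, parent_pos: int) -> int:
    """Map a parent coordinate to a 0-based offset within [start, end)."""
    if not start <= parent_pos < end:
        # Mirrors the documented contract: positions outside the interval are an error.
        raise ValueError(f"Position {parent_pos} does not overlap [{start}, {end})")
    # Assumption: on the minus strand, relative position 0 is the last parent base.
    return parent_pos - start if strand == "+" else end - 1 - parent_pos


def relative_to_parent_pos(start: int, end: int, strand: str, relative_pos: int) -> int:
    """Inverse mapping: offset within the interval back to a parent coordinate."""
    if not 0 <= relative_pos < end - start:
        raise ValueError(f"Relative position {relative_pos} is outside the interval")
    return start + relative_pos if strand == "+" else end - 1 - relative_pos


# Round trip on the minus strand of the interval [5, 10).
rel = parent_to_relative_pos(5, 10, "-", 6)
back = relative_to_parent_pos(5, 10, "-", rel)
```

Under these assumptions the two functions are inverses of each other for every position inside the interval.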

relative_interval_to_parent_location(relative_start: int, relative_end: int, relative_strand: inscripta.biocantor.location.strand.Strand) inscripta.biocantor.location.location.Location

Converts an interval relative to this Location to a Location on the parent

Parameters
  • relative_start – 0-based start position of interval relative to this Location

  • relative_end – 0-based exclusive end position of interval relative to this Location

  • relative_strand – Strand of the interval relative to the strand of this Location: PLUS if the interval is on the same strand as this Location, MINUS if it is on the opposite strand.

Returns

New Location on the parent, with the parent as its parent

has_overlap(other: inscripta.biocantor.location.location.Location, match_strand: bool = False, full_span: bool = False, strict_parent_compare: bool = False) bool

Checks whether this interval overlaps another Location. If full_span is True, this interval is compared to the full span of the other Location, regardless of the other Location's type.
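The effect of full_span can be shown with a small plain-Python sketch. It is illustrative only and does not reproduce BioCantor's code: blocks are hypothetical (start, end) tuples with end-exclusive coordinates, and strand and parent checks are left out.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # end-exclusive (start, end)


def blocks_overlap(a: Block, b: Block) -> bool:
    """Two half-open intervals overlap iff each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def has_overlap(interval: Block, other_blocks: List[Block], full_span: bool = False) -> bool:
    """Check a single interval against the blocks (or the full span) of another location."""
    if not other_blocks:
        return False
    if full_span:
        # Collapse the other location to one interval from its first start to its last end.
        span = (min(s for s, _ in other_blocks), max(e for _, e in other_blocks))
        return blocks_overlap(interval, span)
    return any(blocks_overlap(interval, block) for block in other_blocks)


# The interval [12, 15) sits in the gap between the two blocks of the other location.
in_gap = has_overlap((12, 15), [(5, 10), (20, 25)])
in_gap_full_span = has_overlap((12, 15), [(5, 10), (20, 25)], full_span=True)
```

In this sketch an interval lying entirely in a gap overlaps only when full_span is True, and intervals that merely touch, such as [10, 12) and [5, 10), do not overlap.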

_has_overlap_single_interval(other: inscripta.biocantor.location.location.Location) bool

reverse() SingleInterval

Returns a new Location corresponding to this Location with the same start and stop, with strand and structure reversed

reverse_strand() SingleInterval

Returns a new Location corresponding to this Location with the strand reversed

reset_strand(new_strand: inscripta.biocantor.location.strand.Strand) SingleInterval

Returns a new Location corresponding to this Location with the given strand

reset_parent(new_parent: inscripta.biocantor.parent.Parent) SingleInterval

Returns a new Location corresponding to this Location with positions unchanged and pointing to a new parent

shift_position(shift: int) SingleInterval

Returns a new Location corresponding to this location shifted by the given distance

distance_to(other: inscripta.biocantor.location.location.Location, distance_type: inscripta.biocantor.DistanceType = DistanceType.INNER) int

Returns the distance from this location to another location with the same parent. Return value is a non-negative integer and implementations must be commutative.

Parameters
  • other – Other location with same parent as this location

  • distance_type – Distance type
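The documented properties of the distance (non-negative and commutative) can be checked on a toy version. The sketch below computes the gap between the nearest edges of two end-exclusive intervals; treating this as the meaning of DistanceType.INNER is an assumption, and the function name is hypothetical.

```python
def inner_distance(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Gap between two half-open intervals, or 0 if they overlap or touch."""
    # The later start minus the earlier end is negative when the intervals overlap,
    # so clamp at zero to keep the result non-negative.
    return max(0, max(a_start, b_start) - min(a_end, b_end))


forward = inner_distance(5, 10, 15, 20)
backward = inner_distance(15, 20, 5, 10)
```

Swapping the arguments leaves the result unchanged, which is the commutativity the description requires.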

_distance_to_single_interval(other: inscripta.biocantor.location.location.Location, distance_type: inscripta.biocantor.DistanceType) int

intersection(other: inscripta.biocantor.location.location.Location, match_strand: bool = True, full_span: bool = False, strict_parent_compare: bool = False) inscripta.biocantor.location.location.Location

Intersects this SingleInterval with another Location.

Parameters
  • other – The other Location.

  • match_strand – If True, the strands must match; if False, strand is ignored.

  • full_span – If True, perform the comparison on the full span of the other Location. This is trivial for a SingleInterval, but relevant if other is a CompoundInterval.
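A minimal sketch of intersecting two single intervals, assuming end-exclusive coordinates. The tuple representation, the "+" and "-" strand strings, and the use of None in place of EmptyLocation() are all simplifications for illustration, not the library's API.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int, str]  # (start, end, strand), end-exclusive


def intersection(a: Interval, b: Interval, match_strand: bool = True) -> Optional[Interval]:
    """Return the shared region of a and b on a's strand, or None if there is none."""
    a_start, a_end, a_strand = a
    b_start, b_end, b_strand = b
    if match_strand and a_strand != b_strand:
        return None  # stands in for EmptyLocation()
    start, end = max(a_start, b_start), min(a_end, b_end)
    if start >= end:
        return None  # disjoint or merely adjacent
    return (start, end, a_strand)


same_strand = intersection((5, 15, "+"), (10, 20, "+"))
opposite_strand = intersection((5, 15, "+"), (10, 20, "-"))
ignoring_strand = intersection((5, 15, "+"), (10, 20, "-"), match_strand=False)
```

With match_strand=False the result keeps the strand of the first interval, so in this sketch the operation is only symmetric when strands are required to match.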

_intersection_single_interval(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

union(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

Returns a new Location representing the union of this Location with the other Location. This operation is commutative. Raises exception if locations cannot be combined.

_union_single_interval(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

union_preserve_overlaps(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

Returns a new Location representing the union of this Location with the other Location, retaining overlapping blocks where applicable. This operation is commutative. Raises exception if locations cannot be combined.

minus(other: inscripta.biocantor.location.location.Location, match_strand: bool = True, strict_parent_compare: bool = False) inscripta.biocantor.location.location.Location

Returns a new Location representing this Location minus its intersection with the other Location. Returned Location has the same Strand as this Location. If there is no intersection, returns this Location. This operation is not commutative.

Parameters
  • other – Other location

  • match_strand – If set to True, automatically return this Location if other Location has a different Strand than this Location

  • strict_parent_compare – Raise MismatchedParentException if parents do not match
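Why minus is not commutative is easiest to see on bare coordinates. The following is a hedged sketch with hypothetical names: intervals are end-exclusive (start, end) tuples, the result is a list of remaining pieces rather than a Location, and the match_strand and strict_parent_compare behaviour is not modelled.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # end-exclusive (start, end)


def minus(a: Block, b: Block) -> List[Block]:
    """Return the parts of a that are not covered by b (zero, one, or two pieces)."""
    if b[0] >= a[1] or b[1] <= a[0]:
        return [a]  # no intersection: a is returned unchanged
    pieces = []
    if a[0] < b[0]:
        pieces.append((a[0], b[0]))  # part of a to the left of b
    if b[1] < a[1]:
        pieces.append((b[1], a[1]))  # part of a to the right of b
    return pieces


outer_minus_inner = minus((0, 10), (3, 6))
inner_minus_outer = minus((3, 6), (0, 10))
```

Removing an inner interval splits the outer one into two pieces, while removing the outer interval from the inner one leaves nothing.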

extend_absolute(extend_start: int, extend_end: int) inscripta.biocantor.location.location.Location

Returns a new Location representing this Location with start and end positions extended by the given values, ignoring Strand. Returned Location has same Strand as this Location.

Parameters
  • extend_start – Non-negative integer: amount to extend start

  • extend_end – Non-negative integer: amount to extend end

extend_relative(extend_upstream: int, extend_downstream: int) inscripta.biocantor.location.location.Location

Returns a new Location extended upstream and downstream relative to this Location’s Strand.

Parameters
  • extend_upstream – Non-negative integer: amount to extend upstream relative to Strand

  • extend_downstream – Non-negative integer: amount to extend downstream relative to Strand
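The difference between extend_absolute and extend_relative can be sketched as follows. This is an illustration under assumptions, not the library's code: coordinates are end-exclusive, the strand is the string "+" or "-", and no check is made against the bounds of a parent sequence.

```python
from typing import Tuple


def extend_absolute(start: int, end: int, extend_start: int, extend_end: int) -> Tuple[int, int]:
    """Grow the interval in parent coordinates, ignoring strand."""
    if extend_start < 0 or extend_end < 0:
        raise ValueError("Extension amounts must be non-negative")
    return start - extend_start, end + extend_end


def extend_relative(start: int, end: int, strand: str,
                    extend_upstream: int, extend_downstream: int) -> Tuple[int, int]:
    """Grow the interval upstream and downstream relative to its strand."""
    if strand == "+":
        return extend_absolute(start, end, extend_upstream, extend_downstream)
    # On the minus strand, upstream lies at higher parent coordinates,
    # so the two amounts swap sides.
    return extend_absolute(start, end, extend_downstream, extend_upstream)


plus_result = extend_relative(10, 20, "+", 2, 5)
minus_result = extend_relative(10, 20, "-", 2, 5)
```

The same upstream and downstream amounts produce different parent coordinates depending on the strand, which is the reason the two methods exist side by side.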

_location_relative_to(other: inscripta.biocantor.location.location.Location, strict_parent_compare: bool = False, optimize_blocks: bool = True) inscripta.biocantor.location.location.Location

optimize_blocks is not used here, but is still a keyword argument to ensure a unified API between SingleInterval and CompoundInterval.

merge_overlapping() inscripta.biocantor.location.location.Location

Merges overlapping windows

to_feature_location() Bio.SeqFeature.FeatureLocation

Convert to a BioPython FeatureLocation.

to_biopython() Bio.SeqFeature.FeatureLocation

Provide a shared function signature with other Locations

class biocantor.location.location_impl.CompoundInterval(starts: Union[Tuple[int, Ellipsis], List[int]], ends: Union[Tuple[int, Ellipsis], List[int]], strand: inscripta.biocantor.location.strand.Strand, parent: Optional[inscripta.biocantor.util.types.ParentInputType] = None)

Bases: inscripta.biocantor.location.location.Location

A location consisting of multiple intervals

property _single_intervals

Lazy evaluation; cached result

property num_blocks

Returns number of contiguous blocks comprising this Location

property is_contiguous: bool

Returns True iff this Location is fully contiguous within its parent

property is_overlapping: bool

Returns True iff any blocks of this interval overlap one another

property is_empty: bool

Returns True iff this Location is empty

property blocks: List[inscripta.biocantor.location.location.Location]

Returns list of contiguous blocks comprising this Location

property _full_span_interval: SingleInterval

Returns the full span of this interval; is trivial for a SingleInterval and EmptyLocation

__slots__ = ['_single_interval_store', '_is_overlapping', '_starts', '_ends']
static _sort_starts_ends(starts: List[int], ends: List[int], strand: inscripta.biocantor.location.strand.Strand) Tuple[Tuple[int, Ellipsis], Tuple[int, Ellipsis]]

Given an array of positions and an orientation, sort the positions such that they are in incrementing order relative to the orientation.

classmethod from_single_intervals(intervals: List[SingleInterval]) CompoundInterval
classmethod _from_single_intervals_no_validation(intervals: List[SingleInterval]) CompoundInterval
__str__()

Returns a human readable string representation of this Location

__eq__(other)

Returns True iff this Location is equal to other object

__hash__()

Returns a hash code satisfying location1 == location2 => hash(location1) == hash(location2)

__repr__()

Returns the ‘official’ string representation of this Location

scan_blocks() Iterator[SingleInterval]

Returns an iterator over blocks in order relative to strand of this Location

extract_sequence() inscripta.biocantor.sequence.Sequence

Extracts the sequence of this Location from the parent. Concrete implementations should raise ValueError if no parent exists.

parent_to_relative_pos(parent_pos: int) int

Converts a position on the parent to a position relative to this Location. Concrete implementations should raise ValueError if the given position does not overlap this Location.
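For a multi-block location, the relative position has to count the bases in every block that comes earlier in strand order. A minimal sketch, assuming non-overlapping end-exclusive blocks and "+" or "-" strand strings; the function is hypothetical and is not BioCantor's implementation.

```python
from typing import List, Tuple


def parent_to_relative_pos(blocks: List[Tuple[int, int]], strand: str, parent_pos: int) -> int:
    """Map a parent coordinate to an offset along the concatenated blocks."""
    # Walk the blocks in strand order: ascending for "+", descending for "-".
    ordered = sorted(blocks, reverse=(strand == "-"))
    offset = 0
    for start, end in ordered:
        if start <= parent_pos < end:
            within = parent_pos - start if strand == "+" else end - 1 - parent_pos
            return offset + within
        offset += end - start  # skip over this whole block
    raise ValueError(f"Position {parent_pos} does not overlap this location")


blocks = [(0, 5), (10, 15)]
plus_rel = parent_to_relative_pos(blocks, "+", 12)
minus_rel = parent_to_relative_pos(blocks, "-", 3)
```

A position in the gap between blocks, such as 7 here, raises ValueError, matching the documented contract for positions that do not overlap the Location.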

relative_to_parent_pos(relative_pos: int) int

Converts a position relative to this Location to a position on the parent

relative_interval_to_parent_location(relative_start: int, relative_end: int, relative_strand: inscripta.biocantor.location.strand.Strand) inscripta.biocantor.location.location.Location

Converts an interval relative to this Location to a Location on the parent

Parameters
  • relative_start – 0-based start position of interval relative to this Location

  • relative_end – 0-based exclusive end position of interval relative to this Location

  • relative_strand – Strand of the interval relative to the strand of this Location: PLUS if the interval is on the same strand as this Location, MINUS if it is on the opposite strand.

Returns

New Location on the parent, with the parent as its parent

has_overlap(other: inscripta.biocantor.location.location.Location, match_strand: bool = False, full_span: bool = False, strict_parent_compare: bool = False) bool

If full_span is True, then the full span of both this location and the other location are used for the comparison.

optimize_blocks() inscripta.biocantor.location.location.Location
  • Removes empty blocks

  • Combines adjacent blocks, preserving strictly overlapping blocks

  • Converts to SingleInterval if has only one block

optimize_and_combine_blocks() inscripta.biocantor.location.location.Location
  • Removes empty blocks

  • Combines adjacent and overlapping blocks

  • Converts to SingleInterval if has only one block
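The two block-optimization behaviours listed above differ only in how strictly overlapping blocks are treated. The sketch below illustrates that difference on bare end-exclusive (start, end) tuples. It is an assumption-laden illustration: the function name and the flag simply mirror the documented preserve_overlappers idea, strand is ignored, and the final conversion to a SingleInterval is not modelled.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # end-exclusive (start, end)


def combine_blocks(blocks: List[Block], preserve_overlappers: bool) -> List[Block]:
    """Drop empty blocks, merge adjacent ones, and optionally merge overlapping ones."""
    merged: List[Block] = []
    for start, end in sorted(b for b in blocks if b[0] < b[1]):  # skip empty blocks
        if merged:
            prev_start, prev_end = merged[-1]
            adjacent = start == prev_end
            overlapping = start < prev_end
            if adjacent or (overlapping and not preserve_overlappers):
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


blocks = [(0, 5), (5, 10), (8, 12), (20, 20)]
preserved = combine_blocks(blocks, preserve_overlappers=True)
combined = combine_blocks(blocks, preserve_overlappers=False)
```

With overlaps preserved, the adjacent pair is joined but the overlapping block stays separate; with overlaps combined, everything collapses into one block.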

gap_list() List[SingleInterval]

Returns list of contiguous regions comprising the space between blocks of this Location. List is ordered relative to strand of this Location.
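A small sketch of how such a gap list could be derived from the blocks, with ordering relative to strand. It assumes end-exclusive (start, end) tuples and "+" or "-" strand strings, and the function is hypothetical rather than taken from the library.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # end-exclusive (start, end)


def gap_list(blocks: List[Block], strand: str) -> List[Block]:
    """Return the uncovered regions between blocks, ordered relative to strand."""
    ordered = sorted(blocks)
    if not ordered:
        return []
    gaps: List[Block] = []
    covered_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_end:
            gaps.append((covered_end, start))
        # Track the furthest end seen so overlapping blocks do not create false gaps.
        covered_end = max(covered_end, end)
    return gaps if strand == "+" else gaps[::-1]


plus_gaps = gap_list([(0, 5), (10, 15), (20, 25)], "+")
minus_gaps = gap_list([(0, 5), (10, 15), (20, 25)], "-")
```

The same gaps are reported on both strands; only their order changes.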

gaps_location() inscripta.biocantor.location.location.Location

Returns a Location representing the space between blocks of this Location.

_to_single_interval_if_one_block() inscripta.biocantor.location.location.Location
_combine_blocks(preserve_overlappers: bool) CompoundInterval

Combine adjacent and (optionally) overlapping blocks. Also strips out empty blocks.

Parameters

preserve_overlappers – Do not combine strictly overlapping blocks

reverse() CompoundInterval

Returns a new Location corresponding to this Location with the same start and stop, with strand and structure reversed

reverse_strand() CompoundInterval

Returns a new Location corresponding to this Location with the strand reversed

reset_strand(new_strand: inscripta.biocantor.location.strand.Strand) inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this Location with the given strand

reset_parent(new_parent: inscripta.biocantor.parent.Parent) CompoundInterval

Returns a new Location corresponding to this Location with positions unchanged and pointing to a new parent

shift_position(shift: int) inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this location shifted by the given distance

distance_to(other: inscripta.biocantor.location.location.Location, distance_type: inscripta.biocantor.DistanceType = DistanceType.INNER) int

Returns the distance from this location to another location with the same parent. Return value is a non-negative integer and implementations must be commutative.

Parameters
  • other – Other location with same parent as this location

  • distance_type – Distance type

intersection(other: inscripta.biocantor.location.location.Location, match_strand: bool = True, full_span: bool = False, strict_parent_compare: bool = False) inscripta.biocantor.location.location.Location

Intersects this CompoundInterval with another Location.

Parameters
  • other – The other Location.

  • match_strand – If True, the strands must match; if False, strand is ignored.

  • full_span – Perform comparison on the full span of the other interval? In all cases, the comparison is performed on the full span.

  • strict_parent_compare – If True, parents will be compared and an exception raised if they are not equal. If False, mismatched parents will result in an EmptyLocation return.

_intersection_single_interval(other: inscripta.biocantor.location.location.Location, match_strand: bool, full_span: bool = False) inscripta.biocantor.location.location.Location

Intersections with full span are always symmetric full span (both are considered as full span)

_intersection_compound_interval(other: inscripta.biocantor.location.location.Location, match_strand: bool, full_span: bool = False) inscripta.biocantor.location.location.Location
union(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

Returns a new Location representing the union of this Location with the other Location. This operation is commutative. Raises exception if locations cannot be combined.

_union_single_interval(other: inscripta.biocantor.location.location.Location)
static _merge_compound_blocks(blocks: List[SingleInterval]) inscripta.biocantor.location.location.Location
_union_compound_interval(other: inscripta.biocantor.location.location.Location)
union_preserve_overlaps(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

Returns a new Location representing the union of this Location with the other Location, retaining overlapping blocks where applicable. This operation is commutative. Raises exception if locations cannot be combined.

minus(other: inscripta.biocantor.location.location.Location, match_strand: bool = True, strict_parent_compare: bool = False) inscripta.biocantor.location.location.Location

Returns a new Location representing this Location minus its intersection with the other Location. Returned Location has the same Strand as this Location. If there is no intersection, returns this Location. This operation is not commutative.

Parameters
  • other – Other location

  • match_strand – If set to True, automatically return this Location if other Location has a different Strand than this Location

  • strict_parent_compare – Raise MismatchedParentException if parents do not match

extend_absolute(extend_start: int, extend_end: int) inscripta.biocantor.location.location.Location

Returns a new Location representing this Location with start and end positions extended by the given values, ignoring Strand. Returned Location has same Strand as this Location.

Parameters
  • extend_start – Non-negative integer: amount to extend start

  • extend_end – Non-negative integer: amount to extend end

extend_relative(extend_upstream: int, extend_downstream: int) inscripta.biocantor.location.location.Location

Returns a new Location extended upstream and downstream relative to this Location’s Strand.

Parameters
  • extend_upstream – Non-negative integer: amount to extend upstream relative to Strand

  • extend_downstream – Non-negative integer: amount to extend downstream relative to Strand

_location_relative_to(other: inscripta.biocantor.location.location.Location, optimize_blocks: bool = True) inscripta.biocantor.location.location.Location
merge_overlapping() inscripta.biocantor.location.location.Location

If this compound interval is overlapping, merge the overlaps

to_compound_location() Bio.SeqFeature.CompoundLocation

Convert to a BioPython CompoundLocation, or FeatureLocation if this is a 1-block interval

to_biopython() Bio.SeqFeature.CompoundLocation

Provide a shared function signature with other Locations

class biocantor.location.location_impl._EmptyLocation

Bases: inscripta.biocantor.location.location.Location

Singleton object representing an empty location

property length: int
property parent: inscripta.biocantor.parent.Parent
property strand: inscripta.biocantor.location.strand.Strand
property start: int
property end: int
property is_contiguous: bool

Returns True iff this Location is fully contiguous within its parent

property is_empty: bool

Returns True iff this Location is empty

property blocks: List[inscripta.biocantor.location.location.Location]

Returns list of contiguous blocks comprising this Location

property num_blocks: int

Returns number of contiguous blocks comprising this Location

property is_overlapping: bool

EmptyLocation is by definition always non-overlapping

property _full_span_interval: inscripta.biocantor.location.location.Location

Returns the full span of this interval; is trivial for a SingleInterval and EmptyLocation

_instance

__str__()

Returns a human readable string representation of this Location

__eq__(other)

Returns True iff this Location is equal to other object

__repr__()

Returns the ‘official’ string representation of this Location

__hash__()

Returns a hash code satisfying location1 == location2 => hash(location1) == hash(location2)

scan_blocks() None

Returns an iterator over blocks in order relative to strand of this Location

optimize_blocks() inscripta.biocantor.location.location.Location

Returns a new Location covering the same positions but with blocks optimized. For example, empty blocks may be removed or adjacent blocks may be combined if applicable.

gap_list() List[inscripta.biocantor.location.location.Location]

Returns list of contiguous regions comprising the space between blocks of this Location. List is ordered relative to strand of this Location.

gaps_location() inscripta.biocantor.location.location.Location

Returns a Location representing the space between blocks of this Location.

extract_sequence() inscripta.biocantor.sequence.Sequence

Extracts the sequence of this Location from the parent. Concrete implementations should raise ValueError if no parent exists.

parent_to_relative_pos(parent_pos: int) int

Converts a position on the parent to a position relative to this Location. Concrete implementations should raise ValueError if the given position does not overlap this Location.

relative_to_parent_pos(relative_pos: int) int

Converts a position relative to this Location to a position on the parent

parent_to_relative_location(parent_location) inscripta.biocantor.location.location.Location

Converts a Location on the parent to a Location relative to this Location.

Parameters
  • parent_location – Location with the same parent as this Location. Both parents can be None.

  • optimize_blocks – Run optimize_blocks on the resulting location?

Returns

New Location relative to this Location.

relative_interval_to_parent_location(relative_start: int, relative_end: int, relative_strand: inscripta.biocantor.location.strand.Strand) inscripta.biocantor.location.location.Location

Converts an interval relative to this Location to a Location on the parent

Parameters
  • relative_start – 0-based start position of interval relative to this Location

  • relative_end – 0-based exclusive end position of interval relative to this Location

  • relative_strand – Strand of the interval relative to the strand of this Location: PLUS if the interval is on the same strand as this Location, MINUS if it is on the opposite strand.

Returns

New Location on the parent, with the parent as its parent

has_overlap(other: inscripta.biocantor.location.location.Location, match_strand: bool = False, full_span: bool = False, strict_parent_compare: bool = False) bool

Returns True iff this Location shares at least one position with the given Location. For subclasses representing discontiguous locations, regions between blocks are not considered.

Parameters
  • other – Other Location

  • match_strand – If set to True, automatically return False if given interval Strand does not match this Location’s Strand

  • full_span – If set to True, compare the full span of this Location to the full span of the other Location.

  • strict_parent_compare – Raise MismatchedParentException if parents do not match

Returns

True if there is any overlap, False otherwise

reverse() inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this Location with the same start and stop, with strand and structure reversed

reverse_strand() inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this Location with the strand reversed

reset_strand(new_strand: inscripta.biocantor.location.strand.Strand) inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this Location with the given strand

reset_parent(new_parent: inscripta.biocantor.parent.Parent) inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this Location with positions unchanged and pointing to a new parent

shift_position(shift: int) inscripta.biocantor.location.location.Location

Returns a new Location corresponding to this location shifted by the given distance

location_relative_to(other: inscripta.biocantor.location.location.Location, optimize_blocks: bool = True) inscripta.biocantor.location.location.Location

Converts this Location to a Location relative to another Location. The Locations must overlap. The returned value represents the relative location of the overlap within the other Location.

If optimize_blocks is True, the resulting Location will not have any adjacent or overlapping intervals. This is often desirable, because the output of this function can have unexpected coordinates when the locations are overlapping or adjacent. However, in some cases it is desirable to retain the original block structure. One example is a CDS, where adjacent or overlapping blocks are used to model frameshifts or indels.

_location_relative_to(other: inscripta.biocantor.location.location.Location, optimize_blocks: bool = True) inscripta.biocantor.location.location.Location

distance_to(other: inscripta.biocantor.location.location.Location, distance_type: inscripta.biocantor.DistanceType = DistanceType.INNER) int

Returns the distance from this location to another location with the same parent. Return value is a non-negative integer and implementations must be commutative.

Parameters
  • other – Other location with same parent as this location

  • distance_type – Distance type

intersection(other: inscripta.biocantor.location.location.Location, match_strand: bool = True, full_span: bool = False, strict_parent_compare: bool = False) inscripta.biocantor.location.location.Location

Returns a new Location representing the intersection of this Location with the other Location. Returned Location, if nonempty, has the same Strand as this Location. This operation is commutative if match_strand is True.

Parameters
  • other – Other location

  • match_strand – If set to True, automatically return EmptyLocation() if other Location has a different Strand than this Location

  • full_span – If set to True, compare the full span of this Location to the full span of the other Location.

  • strict_parent_compare – Raise MismatchedParentException if parents do not match

union(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

Returns a new Location representing the union of this Location with the other Location. This operation is commutative. Raises exception if locations cannot be combined.

union_preserve_overlaps(other: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location

Returns a new Location representing the union of this Location with the other Location, retaining overlapping blocks where applicable. This operation is commutative. Raises exception if locations cannot be combined.

minus(other: inscripta.biocantor.location.location.Location, match_strand: bool = True, strict_parent_compare: bool = False) inscripta.biocantor.location.location.Location

Returns a new Location representing this Location minus its intersection with the other Location. Returned Location has the same Strand as this Location. If there is no intersection, returns this Location. This operation is not commutative.

Parameters
  • other – Other location

  • match_strand – If set to True, automatically return this Location if other Location has a different Strand than this Location

  • strict_parent_compare – Raise MismatchedParentException if parents do not match

extend_absolute(extend_start: int, extend_end: int) inscripta.biocantor.location.location.Location

Returns a new Location representing this Location with start and end positions extended by the given values, ignoring Strand. Returned Location has same Strand as this Location.

Parameters
  • extend_start – Non-negative integer: amount to extend start

  • extend_end – Non-negative integer: amount to extend end

extend_relative(extend_upstream: int, extend_downstream: int) inscripta.biocantor.location.location.Location

Returns a new Location extended upstream and downstream relative to this Location’s Strand.

Parameters
  • extend_upstream – Non-negative integer: amount to extend upstream relative to Strand

  • extend_downstream – Non-negative integer: amount to extend downstream relative to Strand

merge_overlapping() inscripta.biocantor.location.location.Location

Merges overlapping windows

to_biopython()

Returns a BioPython interval type; since they do not have a shared base class, we need a union

first_ancestor_of_type(sequence_type: Union[str, inscripta.biocantor.parent.SequenceType]) inscripta.biocantor.parent.Parent

Returns the Parent object representing the closest ancestor (parent, parent of parent, etc.) of this location which has the given sequence type. Raises NoSuchAncestorException if no ancestor with the given type exists.

biocantor.location.location_impl.EmptyLocation()

Returns the single EmptyLocation instance

biocantor.location.location_impl._union_preserve_overlaps(loc1: inscripta.biocantor.location.location.Location, loc2: inscripta.biocantor.location.location.Location) inscripta.biocantor.location.location.Location