biocantor
Package Contents
Classes
- biocantor.SequenceType: str(object='') -> str
- biocantor.DistanceType: Generic enumeration.
- biocantor.AbstractLocation: Shared AbstractLocation base class simplifies imports for type checking
- biocantor.AbstractSequence: Shared AbstractSequence base class simplifies imports for type checking
- biocantor.AbstractParent: Shared AbstractParent base class simplifies imports for type checking
Attributes
- biocantor.__version__ = 0.19.0
- biocantor.Strand
- biocantor.Alphabet
- class biocantor.SequenceType
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str
Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object). encoding defaults to sys.getdefaultencoding(). errors defaults to 'strict'.
- CHROMOSOME = chromosome
- SEQUENCE_CHUNK = sequence_chunk
- static sequence_type_str_to_type(sequence_type: Optional[str]) Optional[Union[SequenceType, str]]
Convenience function to convert a str to a SequenceType, if possible
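The conversion described above can be illustrated with a small standalone sketch. It does not import biocantor: `SeqType` and `str_to_seq_type` are hypothetical stand-ins, and the pass-through of unrecognized strings is an assumption inferred from the `Optional[Union[SequenceType, str]]` return annotation.

```python
from enum import Enum
from typing import Optional, Union


class SeqType(str, Enum):
    """Hypothetical stand-in for biocantor.SequenceType."""

    CHROMOSOME = "chromosome"
    SEQUENCE_CHUNK = "sequence_chunk"


def str_to_seq_type(value: Optional[str]) -> Optional[Union[SeqType, str]]:
    """Return the matching enum member, or the input unchanged if none matches."""
    if value is None:
        return None
    try:
        return SeqType(value)
    except ValueError:
        # Assumption: unknown type names are passed through as plain strings.
        return value
```

Under these assumptions, `str_to_seq_type("chromosome")` yields the enum member, while a name such as `"plasmid"` comes back as the original string.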
- class biocantor.DistanceType
Bases: enum.Enum
Generic enumeration.
Derive from this class to define new enumerations.
- INNER = inner
- OUTER = outer
- STARTS = starts
- ENDS = ends
- class biocantor.AbstractLocation
Bases: abc.ABC
Shared AbstractLocation base class simplifies imports for type checking
- abstract property is_contiguous: bool
Returns True iff this Location is fully contiguous within its parent
- abstract property blocks: List[AbstractLocation]
Returns list of contiguous blocks comprising this Location
- abstract property is_overlapping: bool
Returns True if this interval contains overlaps; always False for SingleInterval
- abstract property _full_span_interval: AbstractLocation
Returns the full span of this interval; is trivial for a SingleInterval and EmptyLocation
- __slots__ = ['start', 'end', 'strand', 'parent', 'length', '_sequence']
- start :int
- end :int
- strand :Strand
- parent :Optional[AbstractParent]
- length :int
- __len__()
Returns the length (number of positions) of this Location. For subclasses representing discontiguous locations, regions between blocks are not considered.
- abstract __str__()
Returns a human readable string representation of this Location
- abstract __eq__(other)
Returns True iff this Location is equal to other object
- abstract __hash__()
Returns a hash code satisfying location1 == location2 => hash(location1) == hash(location2)
- abstract __repr__()
Returns the ‘official’ string representation of this Location
- abstract scan_blocks() Iterator[AbstractLocation]
Returns an iterator over blocks in order relative to strand of this Location
- abstract optimize_blocks() AbstractLocation
Returns a new Location covering the same positions but with blocks optimized. For example, empty blocks may be removed or adjacent blocks may be combined if applicable.
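As a rough illustration of block optimization, the sketch below works on plain `(start, end)` tuples rather than biocantor objects. The tuple representation, the helper name `optimize`, and the decision to merge overlapping blocks as well as adjacent ones are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: 0-based, half-open (start, end) blocks.
Block = Tuple[int, int]


def optimize(blocks: List[Block]) -> List[Block]:
    """Drop empty blocks and combine blocks that touch or overlap."""
    merged: List[Block] = []
    non_empty = sorted(b for b in blocks if b[1] > b[0])
    for start, end in non_empty:
        if merged and start <= merged[-1][1]:
            # Adjacent or overlapping: extend the previous block.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, `optimize([(0, 5), (5, 8), (10, 10), (12, 15)])` covers the same positions with two blocks instead of four.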
- abstract gap_list() List[AbstractLocation]
Returns list of contiguous regions comprising the space between blocks of this Location. List is ordered relative to strand of this Location.
- abstract gaps_location() AbstractLocation
Returns a Location representing the space between blocks of this Location.
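The gap computation can be sketched on plain tuples. This is an illustration only: `gap_list` here is a hypothetical free function, blocks are assumed to be 0-based half-open `(start, end)` pairs, and strand is assumed to be `"+"` or `"-"`.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # hypothetical 0-based, half-open (start, end)


def gap_list(blocks: List[Block], strand: str = "+") -> List[Block]:
    """Return the regions between consecutive blocks, ordered relative to strand."""
    ordered = sorted(blocks)
    gaps = [
        (left_end, right_start)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
        if right_start > left_end  # skip touching or overlapping neighbours
    ]
    # On the minus strand the 5'-to-3' order is the reverse of coordinate order.
    return gaps if strand == "+" else gaps[::-1]
```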
- abstract extract_sequence() AbstractSequence
Extracts the sequence of this Location from the parent. Concrete implementations should raise ValueError if no parent exists.
- abstract parent_to_relative_pos(parent_pos: int) int
Converts a position on the parent to a position relative to this Location. Concrete implementations should raise ValueError if the given position does not overlap this Location.
- abstract relative_to_parent_pos(relative_pos: int) int
Converts a position relative to this Location to a position on the parent
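The two position conversions are inverses of each other. The sketch below shows the arithmetic for a single contiguous interval, using 0-based half-open coordinates and `"+"`/`"-"` strand strings; these conventions and the function signatures are assumptions for illustration, not the library's implementation. Raising ValueError for an out-of-range relative position is likewise an assumption.

```python
def parent_to_relative_pos(start: int, end: int, strand: str, parent_pos: int) -> int:
    """Map a parent coordinate to a coordinate relative to [start, end)."""
    if not start <= parent_pos < end:
        raise ValueError(f"Position {parent_pos} does not overlap [{start}, {end})")
    # On the minus strand, relative position 0 is the last parent position.
    return parent_pos - start if strand == "+" else end - 1 - parent_pos


def relative_to_parent_pos(start: int, end: int, strand: str, relative_pos: int) -> int:
    """Map a coordinate relative to [start, end) back to the parent."""
    if not 0 <= relative_pos < end - start:
        raise ValueError(f"Relative position {relative_pos} is out of range")
    return start + relative_pos if strand == "+" else end - 1 - relative_pos
```

A round trip through both functions should return the original position on either strand.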
- abstract parent_to_relative_location(parent_location: AbstractLocation, optimize_blocks: bool = True) AbstractLocation
Converts a Location on the parent to a Location relative to this Location.
- Parameters
parent_location – Location with the same parent as this Location. Both parents can be None.
optimize_blocks – Run optimize_blocks on the resulting location?
- Returns
New Location relative to this Location.
- location_relative_to(other: AbstractLocation, optimize_blocks: bool = True) AbstractLocation
Converts this Location to a Location relative to another Location. The Locations must overlap. The returned value represents the relative location of the overlap within the other Location.
If optimize_blocks is True, the resulting Location will not have any adjacent or overlapping intervals. This is often desirable, because the output of this function can have unexpected coordinates when the locations are overlapping or adjacent. However, there are some cases where it is desirable to retain the original block structure. One such example is a CDS, where adjacent or overlapping blocks are used to model frameshifts or indels.
- abstract _location_relative_to(other: AbstractLocation, optimize_blocks: bool = True) AbstractLocation
- abstract relative_interval_to_parent_location(relative_start: int, relative_end: int, relative_strand: Strand) AbstractLocation
Converts an interval relative to this Location to a Location on the parent
- Parameters
relative_start – 0-based start position of interval relative to this Location
relative_end – 0-based exclusive end position of interval relative to this Location
relative_strand – Strand of interval relative to the strand of this Location. If the strand of interval is on the SAME strand as the strand of this location, relative_strand is PLUS. If the strand interval is on the OPPOSITE strand, relative_strand is MINUS.
- Returns
New Location on the parent with the parent as parent
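The strand rule in the parameter list (same strand gives PLUS, opposite strand gives MINUS) can be made concrete for a single contiguous interval. The sketch below uses plain tuples and `"+"`/`"-"` strings; the helper name and its signature are hypothetical, and it only handles the contiguous case.

```python
from typing import Tuple


def relative_interval_to_parent(
    start: int, end: int, strand: str,
    relative_start: int, relative_end: int, relative_strand: str,
) -> Tuple[int, int, str]:
    """Map a relative [relative_start, relative_end) onto the parent of [start, end)."""
    if not 0 <= relative_start <= relative_end <= end - start:
        raise ValueError("Relative interval is outside this location")
    if strand == "+":
        # Same orientation as the parent: offsets add, strand is kept.
        return (start + relative_start, start + relative_end, relative_strand)
    # Minus-strand location: coordinates count back from the end, strand flips.
    flipped = "-" if relative_strand == "+" else "+"
    return (end - relative_end, end - relative_start, flipped)
```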
- abstract scan_windows(window_size: int, step_size: int, start_pos: int = 0) Iterator[AbstractLocation]
Returns an iterator over fixed size windows within this Location. Windows represent sub-regions of this Location and are with respect to the same parent as this Location. The final window returned is the last one that fits completely within this Location. Returned windows are in order according to relative position within this Location; i.e., corresponding to the strand of this Location.
- Parameters
window_size –
step_size –
start_pos – 0-based relative start position of first window relative to this Location
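A minimal sketch of the windowing behaviour for one contiguous interval follows. It yields `(start, end)` tuples in parent coordinates, in strand order, and stops at the last window that fits completely. The function signature, the tuple output, and the validation of `window_size` and `step_size` are assumptions for this example.

```python
from typing import Iterator, Tuple


def scan_windows(
    start: int, end: int, strand: str,
    window_size: int, step_size: int, start_pos: int = 0,
) -> Iterator[Tuple[int, int]]:
    """Yield fixed-size windows inside [start, end), ordered relative to strand."""
    if window_size < 1 or step_size < 1:
        raise ValueError("window_size and step_size must be positive")
    length = end - start
    rel = start_pos
    while rel + window_size <= length:
        if strand == "+":
            yield (start + rel, start + rel + window_size)
        else:
            # Relative position 0 sits at the high-coordinate end on the minus strand.
            yield (end - rel - window_size, end - rel)
        rel += step_size
```

With a length-10 interval, a window of 4 and a step of 3, three windows fit; on the minus strand the same windows are produced in the opposite coordinate order.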
- abstract has_overlap(other: AbstractLocation, match_strand: bool = False, full_span: bool = False, strict_parent_compare: bool = False) bool
Returns True iff this Location shares at least one position with the given Location. For subclasses representing discontiguous locations, regions between blocks are not considered.
- Parameters
other – Other Location
match_strand – If set to True, automatically return False if given interval Strand does not match this Location’s Strand
full_span – If set to True, compare the full span of this Location to the full span of the other Location.
strict_parent_compare – Raise MismatchedParentException if parents do not match
- Returns
True if there is any overlap, False otherwise
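The effect of `match_strand` and `full_span` can be shown with a sketch over block lists. Locations are modelled here as non-empty lists of 0-based half-open `(start, end)` tuples plus a strand string; that model and the helper names are assumptions, and parent comparison is left out entirely.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # hypothetical 0-based, half-open (start, end)


def full_span(blocks: List[Block]) -> List[Block]:
    """Collapse a non-empty block list to its single enclosing interval."""
    return [(min(s for s, _ in blocks), max(e for _, e in blocks))]


def has_overlap(
    a_blocks: List[Block], a_strand: str,
    b_blocks: List[Block], b_strand: str,
    match_strand: bool = False, use_full_span: bool = False,
) -> bool:
    """Return True if the two block lists share at least one position."""
    if match_strand and a_strand != b_strand:
        return False
    if use_full_span:
        a_blocks, b_blocks = full_span(a_blocks), full_span(b_blocks)
    # Gaps between blocks are ignored unless use_full_span is set.
    return any(
        a_s < b_e and b_s < a_e
        for a_s, a_e in a_blocks
        for b_s, b_e in b_blocks
    )
```

An interval that falls entirely inside a gap of the other location overlaps only when the full spans are compared.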
- abstract reverse() AbstractLocation
Returns a new Location corresponding to this Location with the same start and stop, with strand and structure reversed
- abstract reverse_strand() AbstractLocation
Returns a new Location corresponding to this Location with the strand reversed
- abstract reset_strand(new_strand: Strand) AbstractLocation
Returns a new Location corresponding to this Location with the given strand
- abstract reset_parent(new_parent: Optional[AbstractParent]) AbstractLocation
Returns a new Location corresponding to this Location with positions unchanged and pointing to a new parent
- abstract shift_position(shift: int) AbstractLocation
Returns a new Location corresponding to this location shifted by the given distance
- abstract distance_to(other: AbstractLocation, distance_type: DistanceType = DistanceType.INNER) int
Returns the distance from this location to another location with the same parent. Return value is a non-negative integer and implementations must be commutative.
- Parameters
other – Other location with same parent as this location
distance_type – Distance type
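The reference above does not spell out what each DistanceType means. The sketch below shows one plausible reading for two contiguous intervals: INNER as the gap between the closest ends (0 when they overlap), OUTER as the extent of the combined span, and STARTS and ENDS as the absolute difference of the respective coordinates. These semantics are assumptions, and `DistType` and `distance` are hypothetical names.

```python
from enum import Enum
from typing import Tuple


class DistType(Enum):
    """Hypothetical stand-in for biocantor.DistanceType."""

    INNER = "inner"
    OUTER = "outer"
    STARTS = "starts"
    ENDS = "ends"


def distance(
    a: Tuple[int, int], b: Tuple[int, int], distance_type: DistType = DistType.INNER
) -> int:
    """Non-negative, symmetric distance between two half-open intervals."""
    (a_start, a_end), (b_start, b_end) = a, b
    if distance_type is DistType.INNER:
        # Assumed: overlapping or touching intervals are at distance 0.
        return max(0, max(a_start, b_start) - min(a_end, b_end))
    if distance_type is DistType.OUTER:
        return max(a_end, b_end) - min(a_start, b_start)
    if distance_type is DistType.STARTS:
        return abs(a_start - b_start)
    return abs(a_end - b_end)
```

Each branch depends only on order-independent quantities, so swapping the arguments gives the same result, which matches the commutativity requirement stated above.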
- abstract merge_overlapping() AbstractLocation
Merges overlapping windows
- abstract to_biopython() Union[Bio.SeqFeature.FeatureLocation, Bio.SeqFeature.CompoundLocation]
Returns a BioPython interval type; since they do not have a shared base class, we need a union
- abstract first_ancestor_of_type(sequence_type: Union[str, SequenceType]) AbstractParent
Returns the Parent object representing the closest ancestor (parent, parent of parent, etc.) of this location which has the given sequence type. Raises NoSuchAncestorException if no ancestor with the given type exists.
- abstract has_ancestor_of_type(sequence_type: Union[str, SequenceType]) bool
Returns True if some ancestor (parent, parent of parent, etc.) of this location has the given sequence type, or False otherwise.
- abstract lift_over_to_first_ancestor_of_type(sequence_type: Union[str, SequenceType]) AbstractLocation
Returns a new Location representing the liftover of this Location to its closest ancestor sequence (parent, parent of parent, etc.) which has the given sequence type. If the immediate parent has the given type, returns this Location. Raises NoSuchAncestorException if no ancestor with the given type exists.
- has_ancestor_sequence(sequence: AbstractSequence) bool
Returns True iff this Location has some ancestor (parent, parent of parent, etc.) whose sequence attribute is equal to the given sequence
- lift_over_to_sequence(sequence: AbstractSequence) AbstractLocation
Returns a new Location representing the liftover of this Location to the given sequence. The given sequence must be equal to the sequence attribute of some Parent in the ancestor hierarchy of this Location; otherwise, raises NoSuchAncestorException.
- abstract intersection(other: AbstractLocation, match_strand: bool = True, full_span: bool = False, strict_parent_compare: bool = False) AbstractLocation
Returns a new Location representing the intersection of this Location with the other Location. Returned Location, if nonempty, has the same Strand as this Location. This operation is commutative if match_strand is True.
- Parameters
other – Other location
match_strand – If set to True, automatically return EmptyLocation() if other Location has a different Strand than this Location
full_span – If set to True, compare the full span of this Location to the full span of the other Location.
strict_parent_compare – Raise MismatchedParentException if parents do not match
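A sketch of block-wise intersection follows, with locations modelled as lists of 0-based half-open `(start, end)` tuples plus a strand string. The model is an assumption, an empty list stands in for EmptyLocation, and `full_span` and parent comparison are omitted.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # hypothetical 0-based, half-open (start, end)


def intersection(
    a_blocks: List[Block], a_strand: str,
    b_blocks: List[Block], b_strand: str,
    match_strand: bool = True,
) -> List[Block]:
    """Return the positions shared by both block lists (empty list if none)."""
    if match_strand and a_strand != b_strand:
        return []  # stands in for EmptyLocation
    shared = []
    for a_start, a_end in a_blocks:
        for b_start, b_end in b_blocks:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:
                shared.append((start, end))
    return sorted(shared)
```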
- abstract union(other: AbstractLocation) AbstractLocation
Returns a new Location representing the union of this Location with the other Location. This operation is commutative. Raises exception if locations cannot be combined.
- abstract union_preserve_overlaps(other: AbstractLocation) AbstractLocation
Returns a new Location representing the union of this Location with the other Location, retaining overlapping blocks where applicable. This operation is commutative. Raises exception if locations cannot be combined.
- abstract minus(other: AbstractLocation, match_strand: bool = True, strict_parent_compare: bool = False) AbstractLocation
Returns a new Location representing this Location minus its intersection with the other Location. Returned Location has the same Strand as this Location. If there is no intersection, returns this Location. This operation is not commutative.
- Parameters
other – Other location
match_strand – If set to True, automatically return this Location if other Location has a different Strand than this Location
strict_parent_compare – Raise MismatchedParentException if parents do not match
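The subtraction can be sketched as a sweep over sorted blocks. As in the other examples here, blocks are plain 0-based half-open `(start, end)` tuples; this representation and the function name are assumptions, and strand and parent handling are left out.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # hypothetical 0-based, half-open (start, end)


def minus(a_blocks: List[Block], b_blocks: List[Block]) -> List[Block]:
    """Return the parts of a_blocks that are not covered by b_blocks."""
    result: List[Block] = []
    cutters = sorted(b_blocks)
    for start, end in sorted(a_blocks):
        cursor = start  # first position not yet emitted or removed
        for b_start, b_end in cutters:
            if b_end <= cursor or b_start >= end:
                continue  # this cutter does not touch the remaining part
            if b_start > cursor:
                result.append((cursor, b_start))
            cursor = max(cursor, b_end)
        if cursor < end:
            result.append((cursor, end))
    return result
```

Swapping the arguments generally changes the result, which is why the operation is not commutative.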
- abstract extend_absolute(extend_start: int, extend_end: int) AbstractLocation
Returns a new Location representing this Location with start and end positions extended by the given values, ignoring Strand. Returned Location has same Strand as this Location.
- Parameters
extend_start – Non-negative integer: amount to extend start
extend_end – Non-negative integer: amount to extend end
- abstract extend_relative(extend_upstream: int, extend_downstream: int) AbstractLocation
Returns a new Location extended upstream and downstream relative to this Location’s Strand.
- Parameters
extend_upstream – Non-negative integer: amount to extend upstream relative to Strand
extend_downstream – Non-negative integer: amount to extend downstream relative to Strand
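The difference between the two extension methods is only in how strand is treated. The sketch below shows it for a single interval given as `(start, end, strand)`; the signatures are hypothetical, and checks against the parent's bounds (for example a start below 0) are deliberately omitted.

```python
from typing import Tuple


def extend_absolute(
    start: int, end: int, strand: str, extend_start: int, extend_end: int
) -> Tuple[int, int, str]:
    """Extend coordinates directly, ignoring strand."""
    if extend_start < 0 or extend_end < 0:
        raise ValueError("Extension amounts must be non-negative")
    return (start - extend_start, end + extend_end, strand)


def extend_relative(
    start: int, end: int, strand: str, extend_upstream: int, extend_downstream: int
) -> Tuple[int, int, str]:
    """Extend upstream/downstream, which swaps sides on the minus strand."""
    if strand == "+":
        return extend_absolute(start, end, strand, extend_upstream, extend_downstream)
    # Upstream of a minus-strand interval lies at higher coordinates.
    return extend_absolute(start, end, strand, extend_downstream, extend_upstream)
```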
- abstract contains(other: AbstractLocation, match_strand: bool = False, full_span: bool = False, strict_parent_compare: bool = False) bool
Returns True iff this location contains the other. If full_span is True, the full spans of both locations are compared.
- Parameters
other – Other location
match_strand – If set to True, automatically return False if other Location has a different Strand than this Location
full_span – If set to True, compare the full span of this Location to the full span of the other Location.
strict_parent_compare – Raise MismatchedParentException if parents do not match
- class biocantor.AbstractSequence
Bases: abc.ABC
Shared AbstractSequence base class simplifies imports for type checking
- __slots__ = ['sequence', 'alphabet', 'id', 'sequence_type', 'parent', '_len']
- sequence_type :SequenceType
- _len :int
- sequence :str
- alphabet :Alphabet
- parent :Optional[AbstractParent]
- id :Optional[str]
- __len__()
- class biocantor.AbstractParent
Bases: abc.ABC
Shared AbstractParent base class simplifies imports for type checking
- abstract property strand: Optional[Strand]
Returns the Strand of this Parent. If this Parent has no explicit Strand, but has a Location, that Location’s Strand is returned.
- __slots__ = ['parent', 'id', 'sequence_type', '_strand', 'location', 'sequence', '_strand_property']
- id :Optional[str]
- sequence_type :Optional[SequenceType]
- sequence :Optional[AbstractSequence]
- location :Optional[AbstractLocation]
- parent :Optional[AbstractParent]
- _strand :Optional[Strand]
- abstract equals_except_location(other, require_same_sequence: bool = True)
Checks that this Parent is equal to another Parent, ignoring the associated Location members.
By default, also checks that any associated Sequence objects match, but this can be toggled off.
- abstract strip_location_info() AbstractParent
Returns a new Parent object representing this Parent with information about child location removed
- abstract first_ancestor_of_type(sequence_type: Union[str, SequenceType], include_self: bool = True) AbstractParent
Returns the Parent object representing the closest ancestor (parent, parent of parent, etc.) of this Parent which has the given sequence type. If include_self is True and this Parent has the given type, returns this object. Raises NoSuchAncestorException if no ancestor with the given type exists.
- Parameters
sequence_type (str) – Sequence type
include_self – Include this sequence as a candidate
- abstract has_ancestor_of_type(sequence_type: Union[str, SequenceType], include_self: bool = True) bool
Returns True if some ancestor (parent, parent of parent, etc.) of this Parent has the given sequence type, or False otherwise. If include_self is True and this Parent has the given type, returns True.
- Parameters
sequence_type (str) – Sequence type
include_self – Include this sequence as a candidate
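The ancestor lookups amount to walking the parent chain. The sketch below uses a minimal hypothetical `Node` class and a stand-in exception in place of the real Parent type and NoSuchAncestorException, to show how `include_self` changes the starting point of the walk.

```python
from dataclasses import dataclass
from typing import Optional


class NoSuchAncestorError(Exception):
    """Hypothetical stand-in for NoSuchAncestorException."""


@dataclass
class Node:
    """Hypothetical stand-in for a Parent: a type label and an optional parent."""

    sequence_type: Optional[str]
    parent: Optional["Node"] = None


def first_ancestor_of_type(node: Node, sequence_type: str, include_self: bool = True) -> Node:
    """Return the closest node in the parent chain with the given type."""
    current = node if include_self else node.parent
    while current is not None:
        if current.sequence_type == sequence_type:
            return current
        current = current.parent
    raise NoSuchAncestorError(f"No ancestor of type {sequence_type!r}")


def has_ancestor_of_type(node: Node, sequence_type: str, include_self: bool = True) -> bool:
    """Return True if first_ancestor_of_type would succeed."""
    try:
        first_ancestor_of_type(node, sequence_type, include_self)
        return True
    except NoSuchAncestorError:
        return False
```

With a chunk whose parent is a chromosome, asking the chunk for a `"sequence_chunk"` ancestor succeeds only when `include_self` is True.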
- abstract lift_child_location_to_parent()
Lifts location of child object on this parent to the parent of this parent. Raises ValueError if any required data is missing (child location or location of this parent on its parent).
- Returns
Child object location lifted to the parent of this parent
- abstract reset_location(location) AbstractParent
Returns a new Parent object with child location set to the given location
- abstract has_ancestor_sequence(sequence, include_self: bool = True) bool
Returns True iff this Parent has some ancestor (parent, parent of parent, etc.) whose sequence attribute is equal to the given sequence. If include_self is True and this Parent has sequence attribute equal to the given sequence, returns True.