Location operations
Instantiate a Location
[ ]:
from inscripta.biocantor.location.location_impl import SingleInterval, CompoundInterval, EmptyLocation, Strand
from inscripta.biocantor.sequence import Sequence, Alphabet
from inscripta.biocantor.parent.parent import Parent
# No parent
single_interval = SingleInterval(5, 10, Strand.PLUS)
compound_interval = CompoundInterval([2, 8], [5, 13], Strand.PLUS)
# With parent sequence
compound_interval_with_sequence = CompoundInterval(
    [2, 8],
    [5, 13],
    Strand.PLUS,
    parent=Sequence(
        "CTACGACTTCCGAGTCCAAAGTGTCCGTGT",
        Alphabet.NT_STRICT,
        type="chromosome",
    ),
)
# Empty location (implemented as a singleton)
# Rarely needs to be directly instantiated, but is returned from method calls where appropriate
empty = EmptyLocation()
Data access
Start, end, strand
[ ]:
compound_interval.start, compound_interval.end, compound_interval.strand
(2, 13, <Strand.PLUS: 1>)
Other basic properties
[ ]:
compound_interval.num_blocks
2
[ ]:
compound_interval.is_contiguous
False
[ ]:
single_interval.is_empty
False
[ ]:
EmptyLocation().is_empty
True
[ ]:
single_interval.parent is None
True
[ ]:
compound_interval_with_sequence.parent
<Parent: id=None, type=chromosome, strand=+, location=CompoundInterval <2-5:+, 8-13:+>, sequence=<Sequence;
Alphabet=NT_STRICT;
Length=30;
Parent=None;
Type=chromosome>, parent=None>
List of contiguous blocks
[ ]:
compound_interval.blocks
[<SingleInterval 2-5:+>, <SingleInterval 8-13:+>]
Iterator over contiguous blocks in strand-relative order
[ ]:
block_iter = CompoundInterval([1, 8], [3, 10], Strand.MINUS).scan_blocks()
list(block_iter)
[<SingleInterval 8-10:->, <SingleInterval 1-3:->]
Extract underlying spliced sequence
[ ]:
compound_interval_with_sequence.extract_sequence()
<Sequence=ACGTCCGA;
Alphabet=NT_STRICT;
Length=8;
Parent=None;
Type=None>
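The spliced extraction above amounts to concatenating the parent sequence slice for each block. A minimal plain-Python sketch of that idea, assuming 0-based half-open coordinates and a plus-strand location; `extract_spliced` is a hypothetical helper for illustration, not part of BioCantor:

```python
from typing import List, Tuple


def extract_spliced(sequence: str, blocks: List[Tuple[int, int]]) -> str:
    """Concatenate half-open [start, end) slices in ascending block order."""
    return "".join(sequence[start:end] for start, end in sorted(blocks))


chromosome = "CTACGACTTCCGAGTCCAAAGTGTCCGTGT"
spliced = extract_spliced(chromosome, [(2, 5), (8, 13)])
print(spliced)  # ACGTCCGA
```

A minus-strand location would additionally need the result reverse-complemented, which this sketch leaves out.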
Set theoretic operations
Overlap
[ ]:
SingleInterval(5, 10, Strand.PLUS).has_overlap(SingleInterval(9, 20, Strand.PLUS))
True
[ ]:
SingleInterval(5, 10, Strand.PLUS).has_overlap(SingleInterval(9, 20, Strand.MINUS))
True
[ ]:
SingleInterval(5, 10, Strand.PLUS).has_overlap(SingleInterval(9, 20, Strand.MINUS), match_strand=True)
False
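The three overlap results above can be reproduced with a small plain-Python sketch. It assumes half-open intervals and represents strands as the strings "+" and "-"; the function name mirrors the method but is a hypothetical stand-in, not the library's implementation:

```python
def has_overlap(a_start, a_end, a_strand, b_start, b_end, b_strand,
                match_strand=False):
    """Return True if two half-open intervals share at least one position."""
    # With match_strand=True, intervals on different strands never overlap.
    if match_strand and a_strand != b_strand:
        return False
    return a_start < b_end and b_start < a_end


same_strand = has_overlap(5, 10, "+", 9, 20, "+")
opposite_strand = has_overlap(5, 10, "+", 9, 20, "-")
opposite_strict = has_overlap(5, 10, "+", 9, 20, "-", match_strand=True)
print(same_strand, opposite_strand, opposite_strict)  # True True False
```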
Intersection
[ ]:
CompoundInterval([2, 8], [5, 13], Strand.PLUS).intersection(SingleInterval(4, 10, Strand.PLUS))
CompoundInterval <4-5:+, 8-10:+>
[ ]:
SingleInterval(0, 3, Strand.PLUS).intersection(SingleInterval(5, 8, Strand.PLUS))
EmptyLocation
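Block-wise, the intersection can be sketched as a pairwise comparison of blocks. This is a plain-Python illustration under the assumption of half-open coordinates and matching strands; `intersect_blocks` is a hypothetical helper, and an empty list plays the role of `EmptyLocation`:

```python
from typing import List, Tuple

Block = Tuple[int, int]


def intersect_blocks(a: List[Block], b: List[Block]) -> List[Block]:
    """Intersect two lists of half-open blocks, returning sorted blocks."""
    result = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only non-empty overlaps
                result.append((start, end))
    return sorted(result)


print(intersect_blocks([(2, 5), (8, 13)], [(4, 10)]))  # [(4, 5), (8, 10)]
print(intersect_blocks([(0, 3)], [(5, 8)]))            # []
```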
Union
[ ]:
CompoundInterval([0, 10], [5, 15], Strand.PLUS).union(CompoundInterval([0, 8], [7, 9], Strand.PLUS))
CompoundInterval <0-7:+, 8-9:+, 10-15:+>
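The union result above is what a sort-and-merge over all blocks produces. A minimal plain-Python sketch, assuming half-open coordinates and matching strands; `union_blocks` is a hypothetical helper, and its choice to merge only strictly overlapping blocks (leaving touching blocks separate) is an assumption of this sketch rather than documented library behavior:

```python
from typing import List, Tuple

Block = Tuple[int, int]


def union_blocks(a: List[Block], b: List[Block]) -> List[Block]:
    """Merge two lists of half-open blocks into sorted, non-overlapping blocks."""
    merged: List[Block] = []
    for start, end in sorted(a + b):
        if merged and start < merged[-1][1]:
            # Overlaps the previous block: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


print(union_blocks([(0, 5), (10, 15)], [(0, 7), (8, 9)]))
# [(0, 7), (8, 9), (10, 15)]
```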
Contains
Check whether each block of the other location is contained in a block of this location
[ ]:
compound_interval.contains(CompoundInterval([2, 10], [4, 11], Strand.PLUS))
True
Minus
[ ]:
SingleInterval(10, 20, Strand.PLUS).minus(SingleInterval(13, 15, Strand.PLUS))
CompoundInterval <10-13:+, 15-20:+>
Gaps (introns)
[ ]:
# List of gaps as SingleInterval objects, ordered relative to location strand
CompoundInterval([10, 20, 30], [15, 25, 35], Strand.MINUS).gap_list()
[<SingleInterval 25-30:->, <SingleInterval 15-20:->]
[ ]:
# All gaps as one Location object
CompoundInterval([10, 20, 30], [15, 25, 35], Strand.MINUS).gaps_location()
CompoundInterval <15-20:-, 25-30:->
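The gaps are the stretches between consecutive blocks. A plain-Python sketch of the strand-ordered list, assuming half-open coordinates; `gap_list` here is a hypothetical standalone function that only mirrors the method name:

```python
from typing import List, Tuple

Block = Tuple[int, int]


def gap_list(blocks: List[Block], strand: str) -> List[Block]:
    """Return gaps between consecutive blocks, ordered relative to the strand."""
    ordered = sorted(blocks)
    gaps = [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if prev_end < next_start
    ]
    # On the minus strand the rightmost gap comes first.
    return gaps[::-1] if strand == "-" else gaps


print(gap_list([(10, 15), (20, 25), (30, 35)], "-"))  # [(25, 30), (15, 20)]
```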
Other feature arithmetic operations
Distance to another location
[ ]:
compound_interval.distance_to(SingleInterval(20, 30, Strand.MINUS))
7
Extend endpoints, returning a new Location
[ ]:
SingleInterval(5, 10, Strand.MINUS).extend_relative(3, 4)
<SingleInterval 1-13:->
[ ]:
SingleInterval(5, 10, Strand.MINUS).extend_absolute(3, 4)
<SingleInterval 2-14:->
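The two outputs differ because relative extension follows the strand while absolute extension ignores it. A plain-Python sketch of the arithmetic for a single interval; the parameter names (`upstream`, `downstream`, `left`, `right`) are illustrative assumptions, not the library's signature:

```python
def extend_relative(start, end, strand, upstream, downstream):
    """Extend the strand-relative start by `upstream` and end by `downstream`."""
    if strand == "-":
        # On the minus strand the relative start is the right-hand coordinate.
        return start - downstream, end + upstream
    return start - upstream, end + downstream


def extend_absolute(start, end, left, right):
    """Extend the left coordinate by `left` and the right by `right`."""
    return start - left, end + right


print(extend_relative(5, 10, "-", 3, 4))  # (1, 13)
print(extend_absolute(5, 10, 3, 4))       # (2, 14)
```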
Reverse or reset strand
[ ]:
compound_interval.reverse_strand()
CompoundInterval <2-5:-, 8-13:->
[ ]:
compound_interval.reset_strand(Strand.MINUS)
CompoundInterval <2-5:-, 8-13:->
Reverse feature, flipping strand and block structure
[ ]:
compound_interval.reverse()
CompoundInterval <2-7:-, 10-13:->
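Unlike `reverse_strand()`, `reverse()` also mirrors the block structure within the location's outer bounds. A plain-Python sketch of that reflection, assuming half-open coordinates; `reverse_blocks` is a hypothetical helper:

```python
from typing import List, Tuple

Block = Tuple[int, int]


def reverse_blocks(blocks: List[Block], strand: str) -> Tuple[List[Block], str]:
    """Mirror blocks within [min start, max end) and flip the strand."""
    lo = min(start for start, _ in blocks)
    hi = max(end for _, end in blocks)
    # Reflecting x -> lo + hi - x swaps the roles of start and end.
    mirrored = sorted((lo + hi - end, lo + hi - start) for start, end in blocks)
    return mirrored, "-" if strand == "+" else "+"


print(reverse_blocks([(2, 5), (8, 13)], "+"))  # ([(2, 7), (10, 13)], '-')
```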
Shift entire location left or right
[ ]:
SingleInterval(3, 5, Strand.MINUS).shift_position(-2)
<SingleInterval 1-3:->
Iterator over (spliced) windows
[ ]:
window_iter = CompoundInterval([1, 8], [6, 15], Strand.MINUS).scan_windows(window_size=3, step_size=2, start_pos=0)
[ ]:
list(window_iter)
[<SingleInterval 12-15:->,
<SingleInterval 10-13:->,
<SingleInterval 8-11:->,
CompoundInterval <4-6:-, 8-9:->,
<SingleInterval 2-5:->]
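The windows above slide over spliced, strand-relative positions, which is why one window spans two blocks. A plain-Python sketch that reproduces the same windows as lists of half-open blocks; `scan_windows` here is a hypothetical standalone function, and enumerating every position is chosen for clarity rather than efficiency:

```python
from typing import Iterator, List, Tuple

Block = Tuple[int, int]


def scan_windows(blocks: List[Block], strand: str, window_size: int,
                 step_size: int, start_pos: int = 0) -> Iterator[List[Block]]:
    """Yield each spliced window as a sorted list of half-open blocks."""
    # All covered positions, in strand-relative order.
    positions = [p for start, end in sorted(blocks) for p in range(start, end)]
    if strand == "-":
        positions.reverse()
    last_offset = len(positions) - window_size
    for offset in range(start_pos, last_offset + 1, step_size):
        window: List[Block] = []
        for p in sorted(positions[offset:offset + window_size]):
            if window and window[-1][1] == p:
                window[-1] = (window[-1][0], p + 1)  # extend the current block
            else:
                window.append((p, p + 1))  # start a new block after a gap
        yield window


windows = list(scan_windows([(1, 6), (8, 15)], "-", window_size=3, step_size=2))
print(windows)
# [[(12, 15)], [(10, 13)], [(8, 11)], [(4, 6), (8, 9)], [(2, 5)]]
```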
Operations on Parent hierarchy
Identify ancestors in Parent hierarchy
[ ]:
compound_interval_with_sequence.first_ancestor_of_type("chromosome")
<Parent: id=None, type=chromosome, strand=+, location=CompoundInterval <2-5:+, 8-13:+>, sequence=<Sequence;
Alphabet=NT_STRICT;
Length=30;
Parent=None;
Type=chromosome>, parent=None>
[ ]:
compound_interval_with_sequence.has_ancestor_of_type("other_seq_type")
False
[ ]:
compound_interval_with_sequence.has_ancestor_sequence(
    Sequence("CTACGACTTCCGAGTCCAAAGTGTCCGTGT", Alphabet.NT_STRICT, type="chromosome"))
True
Coordinate conversion
Establish a 3-level hierarchy
Highest level: all of chr1
Middle level: 30nt slice of chr1
Lowest level: a 10nt feature initially defined relative to the 30nt slice
[ ]:
# A Parent object representing a full chromosome
chr1 = Parent(id="chr1", sequence_type="chromosome")
# A slice of chr1 lying at positions 1000-1030
chromosome_slice_location = SingleInterval(1000, 1030, Strand.PLUS, parent=chr1)
chromosome_slice = Sequence("CTGATAGGGGATGCAGTATATCCCTGGATA", Alphabet.NT_STRICT,
                            parent=chr1.reset_location(location=chromosome_slice_location))
# A feature defined relative to the slice
feature = SingleInterval(5, 15, Strand.MINUS, parent=chromosome_slice)
Convert the feature to chromosome coordinates
[ ]:
feature.lift_over_to_first_ancestor_of_type("chromosome")
<SingleInterval <Parent: id=chr1, type=chromosome, strand=-, location=<SingleInterval 1005-1015:->, sequence=None, parent=None>:1005-1015:->
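The lift-over above is an offset by the slice's position on the chromosome, with a strand flip when the slice itself lies on the minus strand. A plain-Python sketch for one level of the hierarchy, assuming half-open coordinates; `lift_to_parent` and its tuple representation are hypothetical:

```python
from typing import Tuple

Interval = Tuple[int, int, str]  # (start, end, strand)


def lift_to_parent(child: Interval, slice_on_parent: Interval) -> Interval:
    """Convert slice-relative coordinates to parent coordinates."""
    c_start, c_end, c_strand = child
    p_start, p_end, p_strand = slice_on_parent
    if p_strand == "+":
        return p_start + c_start, p_start + c_end, c_strand
    # Minus-strand slice: coordinates are mirrored and the strand flips.
    flipped = "-" if c_strand == "+" else "+"
    return p_end - c_end, p_end - c_start, flipped


print(lift_to_parent((5, 15, "-"), (1000, 1030, "+")))  # (1005, 1015, '-')
```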
Convert a feature-relative position to slice-relative
[ ]:
feature.relative_to_parent_pos(6)
8
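For a minus-strand feature, relative position 0 is the feature's last parent coordinate, so positions count downward. A plain-Python sketch of the conversion for a single interval, assuming half-open coordinates; the function mirrors the method name but is a hypothetical standalone version:

```python
def relative_to_parent_pos(start: int, end: int, strand: str,
                           relative_pos: int) -> int:
    """Convert a position relative to the interval into a parent position."""
    if not 0 <= relative_pos < end - start:
        raise ValueError("relative position lies outside the interval")
    if strand == "-":
        return end - 1 - relative_pos  # count backward from the last base
    return start + relative_pos


print(relative_to_parent_pos(5, 15, "-", 6))  # 8
```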
Convert a feature-relative interval to slice-relative
[ ]:
feature.relative_interval_to_parent_location(7, 9, Strand.MINUS)
<SingleInterval <Parent: id=None, type=None, strand=+, location=<SingleInterval 6-8:+>, sequence=<Sequence;
Alphabet=NT_STRICT;
Length=30;
Parent=<Parent: id=chr1, type=chromosome, strand=+, location=<SingleInterval <Parent: id=chr1, type=chromosome, strand=+, location=<SingleInterval 1000-1030:+>, sequence=None, parent=None>:1000-1030:+>, sequence=None, parent=None>;
Type=None>, parent=<Parent: id=chr1, type=chromosome, strand=+, location=<SingleInterval <Parent: id=chr1, type=chromosome, strand=+, location=<SingleInterval 1000-1030:+>, sequence=None, parent=None>:1000-1030:+>, sequence=None, parent=None>>:6-8:+>
Convert a chromosome-relative position to slice-relative
[ ]:
chromosome_slice.location_on_parent.parent_to_relative_pos(1007)
7
Convert a chromosome-relative feature to slice-relative
[ ]:
chromosome_slice.location_on_parent.parent_to_relative_location(
    SingleInterval(990, 1010, Strand.MINUS, parent=chr1))
<SingleInterval 0-10:->
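The result is 0-10 rather than a negative range because only the part of the feature that falls inside the slice can be expressed in slice coordinates. A plain-Python sketch of that clip-then-shift step, assuming half-open coordinates and a plus-strand slice; `parent_to_relative_location` here is a hypothetical standalone function that returns None when there is no overlap:

```python
from typing import Optional, Tuple

Interval = Tuple[int, int, str]  # (start, end, strand)


def parent_to_relative_location(slice_start: int, slice_end: int,
                                feature: Interval) -> Optional[Interval]:
    """Clip a parent-level feature to the slice and shift to slice coordinates."""
    f_start, f_end, f_strand = feature
    start, end = max(f_start, slice_start), min(f_end, slice_end)
    if start >= end:
        return None  # the feature does not overlap the slice
    # A plus-strand slice leaves the feature's strand unchanged.
    return start - slice_start, end - slice_start, f_strand


print(parent_to_relative_location(1000, 1030, (990, 1010, "-")))  # (0, 10, '-')
```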
Location of one feature relative to another feature
Express the intersection of two locations in coordinates relative to one of the locations
[ ]:
feature.location_relative_to(SingleInterval(11, 20, Strand.PLUS, parent=chromosome_slice))
<SingleInterval 0-4:->