Sequence operations

Instantiate a Sequence

[ ]:
from inscripta.biocantor.location.location_impl import SingleInterval, Strand
from inscripta.biocantor.parent import Parent
from inscripta.biocantor.sequence import Sequence, Alphabet


sequence = Sequence(data="AAAAAAA",
                    alphabet=Alphabet.NT_STRICT,
                    id="my_sequence",
                    type="chromosome_slice",
                    parent=Parent(id="chr1",
                                  sequence_type="chromosome",
                                  location=SingleInterval(33, 40, Strand.MINUS)))

Simple attributes

ID

[ ]:
sequence.id
'my_sequence'

Alphabet

[ ]:
sequence.alphabet
<Alphabet.NT_STRICT: 'ACGT'>

Sequence type

[ ]:
sequence.sequence_type
'chromosome_slice'

Sequence data

[ ]:
str(sequence)
'AAAAAAA'

Operations on sequence data

Reverse complement, updating parent data as appropriate

[ ]:
sequence_rc = sequence.reverse_complement(new_id="sequence_rc")
sequence_rc
<sequence_rc;
  Alphabet=NT_STRICT;
  Length=7;
  Parent=<Parent: id=None, type=None, strand=+, location=<SingleInterval 33-40:+>, sequence=None, parent=None>;
  Type=None>
[ ]:
str(sequence_rc)
'TTTTTTT'
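
For readers who want to see the string operation on its own, here is a minimal pure-Python sketch of a reverse complement over a strict nucleotide alphabet. It does not use BioCantor and is not the library's implementation; the function name and the error handling are assumptions made for illustration.

```python
# Complement table for a strict nucleotide alphabet (A, C, G, T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string (hypothetical helper)."""
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"characters outside ACGT: {sorted(unexpected)}")
    # Complement each base, then reverse the result.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AAAAAAA"))  # TTTTTTT
print(reverse_complement("AACG"))     # CGTT
```

The second call uses a non-palindromic input so that both the complement step and the reversal step are visible.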

Append another sequence, updating parent data as appropriate

[ ]:
sequence_2 = Sequence(data="CCCC",
                      alphabet=Alphabet.NT_STRICT,
                      id="my_sequence",
                      type="chromosome_slice",
                      parent=Parent(id="chr1",
                                    sequence_type="chromosome",
                                    location=SingleInterval(20, 24, Strand.MINUS)))
sequence.append(sequence_2)
<Sequence=AAAAAAACCCC;
  Alphabet=NT_STRICT;
  Length=11;
  Parent=<Parent: id=chr1, type=chromosome, strand=-, location=CompoundInterval <20-24:-, 33-40:->, sequence=None, parent=None>;
  Type=chromosome_slice>
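
The result above reports Length=11 and a parent location made of two blocks, 20-24 and 33-40. Assuming the coordinates are half-open (start inclusive, end exclusive), the block lengths are 4 and 7, which add up to the length of the combined data. The sketch below checks that arithmetic in plain Python; `total_block_length` is a hypothetical helper, not part of BioCantor.

```python
from typing import List, Tuple


def total_block_length(blocks: List[Tuple[int, int]]) -> int:
    """Sum the lengths of half-open [start, end) blocks (hypothetical helper)."""
    total = 0
    for start, end in blocks:
        if end < start:
            raise ValueError(f"invalid block: {start}-{end}")
        total += end - start
    return total


# Blocks taken from the compound location shown in the output above.
blocks = [(20, 24), (33, 40)]
combined = "AAAAAAA" + "CCCC"
print(total_block_length(blocks), len(combined))  # 11 11
```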

Get FASTA-formatted string

[ ]:
print(sequence.to_fasta())
>my_sequence
AAAAAAA
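
The output above is a header line starting with `>` followed by the sequence data. A minimal sketch of producing that layout in plain Python follows. The `format_fasta` name and the `width` parameter are assumptions for illustration; whether and how the library wraps long sequences is not shown in this notebook.

```python
def format_fasta(seq_id: str, data: str, width: int = 60) -> str:
    """Format an ID and sequence string as a FASTA record (hypothetical helper)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Split the sequence into fixed-width lines; short sequences stay on one line.
    lines = [data[i:i + width] for i in range(0, len(data), width)]
    return "\n".join([f">{seq_id}", *lines])


print(format_fasta("my_sequence", "AAAAAAA"))
```

With the default width, a seven-base sequence fits on a single line, so the printed record has the same two-line shape as the output above.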

Operations on Parent hierarchy

Get Parent object and attributes

[ ]:
sequence.parent
<Parent: id=chr1, type=chromosome, strand=-, location=<SingleInterval 33-40:->, sequence=None, parent=None>
[ ]:
sequence.parent_id
'chr1'
[ ]:
sequence.parent_type
'chromosome'

Sequence location relative to parent

[ ]:
sequence.location_on_parent
<SingleInterval 33-40:->

Retrieve nearest ancestor in Parent hierarchy of a given sequence type

[ ]:
sequence.first_ancestor_of_type("chromosome")
<Parent: id=chr1, type=chromosome, strand=-, location=<SingleInterval 33-40:->, sequence=None, parent=None>
[ ]:
sequence.has_ancestor_of_type("other_seq_type")
False
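
The ancestor lookups above can be pictured as a walk up a chain of parents until a matching sequence type is found. The sketch below illustrates that idea with a stand-in `Node` class. Every name in it is hypothetical, and it is not BioCantor's implementation. In particular, returning `None` when nothing matches is a choice made for this sketch only.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """Stand-in for an object with a typed parent chain (hypothetical)."""
    id: str
    sequence_type: str
    parent: Optional["Node"] = None


def ancestors(node: Node) -> Iterator[Node]:
    """Yield ancestors from nearest to farthest."""
    current = node.parent
    while current is not None:
        yield current
        current = current.parent


def first_ancestor_of_type(node: Node, sequence_type: str) -> Optional[Node]:
    """Return the nearest ancestor with the given type, or None if absent."""
    for ancestor in ancestors(node):
        if ancestor.sequence_type == sequence_type:
            return ancestor
    return None


def has_ancestor_of_type(node: Node, sequence_type: str) -> bool:
    return first_ancestor_of_type(node, sequence_type) is not None


chrom = Node(id="chr1", sequence_type="chromosome")
piece = Node(id="my_sequence", sequence_type="chromosome_slice", parent=chrom)

print(first_ancestor_of_type(piece, "chromosome").id)  # chr1
print(has_ancestor_of_type(piece, "other_seq_type"))   # False
```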