biocantor.io.gff3.constants

Module Contents

Classes

GFF3Headers

GFF3 directive lines (version, FASTA, and sequence-region headers).

_GFF3ReservedQualifiers

These are special reserved qualifiers that BioCantor does not currently support.

BioCantorGFF3ReservedQualifiers

This is the subset of GFF3 reserved qualifiers that BioCantor currently reserves.

BioCantorQualifiers

These are qualifiers that are added when exporting from BioCantor to GFF3, if they exist on the object.

GFF3GeneFeatureTypes

These are the feature types that BioCantor currently understands when parsing GFF3 files.

BioCantorFeatureTypes

These are the feature types currently supported by BioCantor when writing genes, transcripts, and feature collections to GFF3.

Attributes

ENCODING_MAP

ENCODING_MAP_WITH_COMMA

ENCODING_PATTERN

ENCODING_PATTERN_WITH_COMMA

GFF_SOURCE

NULL_COLUMN

ATTRIBUTE_SEPARATOR

GFF3ReservedQualifiers

BIOCANTOR_QUALIFIERS_REGEX

biocantor.io.gff3.constants.ENCODING_MAP
biocantor.io.gff3.constants.ENCODING_MAP_WITH_COMMA
biocantor.io.gff3.constants.ENCODING_PATTERN
biocantor.io.gff3.constants.ENCODING_PATTERN_WITH_COMMA
biocantor.io.gff3.constants.GFF_SOURCE = 'BioCantor'
biocantor.io.gff3.constants.NULL_COLUMN = '.'
biocantor.io.gff3.constants.ATTRIBUTE_SEPARATOR = ','
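
As a rough illustration of how GFF_SOURCE, NULL_COLUMN, and ATTRIBUTE_SEPARATOR might fit together when a GFF3 row is written, the sketch below re-declares the three constants locally (so it runs without BioCantor installed) and assembles one tab-separated line. The helper `format_gff3_row` and the sample coordinates are hypothetical and are not part of BioCantor; the sketch also does not escape reserved characters in attribute values.

```python
# Local copies of the documented constants; in real code they would be
# imported from biocantor.io.gff3.constants.
GFF_SOURCE = "BioCantor"
NULL_COLUMN = "."
ATTRIBUTE_SEPARATOR = ","


def format_gff3_row(seqid, feature_type, start, end, strand, attributes):
    """Hypothetical helper: build one nine-column GFF3 line.

    Score and phase are left as the null column, and multi-valued
    attributes are joined with the attribute separator.
    """
    attribute_column = ";".join(
        f"{key}={ATTRIBUTE_SEPARATOR.join(values)}"
        for key, values in attributes.items()
    )
    columns = [
        seqid,
        GFF_SOURCE,
        feature_type,
        str(start),
        str(end),
        NULL_COLUMN,  # score
        strand,
        NULL_COLUMN,  # phase
        attribute_column,
    ]
    return "\t".join(columns)


row = format_gff3_row(
    "chr1", "gene", 1, 100, "+", {"ID": ["gene1"], "Alias": ["a", "b"]}
)
print(row)
```

The GFF3 format separates columns with tabs, attribute pairs with semicolons, and multiple values of one attribute with commas, which is where ATTRIBUTE_SEPARATOR applies.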
class biocantor.io.gff3.constants.GFF3Headers

Bases: enum.Enum

GFF3 directive lines: the version header, the FASTA section marker, and a sequence-region template with {symbol} and {length} placeholders.

HEADER = '##gff-version 3'
FASTA_HEADER = '##FASTA'
SEQUENCE_HEADER = '##sequence-region {symbol} 1 {length}'
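
SEQUENCE_HEADER contains `{symbol}` and `{length}` placeholders, so it can be filled in with `str.format`. The sketch below uses a local copy of the enum (an assumption made so the example runs on its own) and a hypothetical sequence name and length.

```python
from enum import Enum


class GFF3Headers(Enum):
    # Local copy of the documented members, for illustration only.
    HEADER = "##gff-version 3"
    FASTA_HEADER = "##FASTA"
    SEQUENCE_HEADER = "##sequence-region {symbol} 1 {length}"


# Hypothetical sequence name and length.
region_line = GFF3Headers.SEQUENCE_HEADER.value.format(symbol="chr1", length=5000)

# The version line comes first, followed by one region line per sequence.
header_lines = [GFF3Headers.HEADER.value, region_line]
print("\n".join(header_lines))
```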
class biocantor.io.gff3.constants._GFF3ReservedQualifiers

Bases: inscripta.biocantor.util.enum.HasMemberMixin

These are special reserved qualifiers that BioCantor does not currently support.

If any of these are present in a qualifiers dictionary, they are included as-is.

See GFFAttributes for more information.

ALIAS = 'Alias'
TARGET = 'Target'
DBXREF = 'Dbxref'
GAP = 'Gap'
DERIVES_FROM = 'Derives_from'
NOTE = 'Note'
ONTOLOGY_TERM = 'Ontology_term'
class biocantor.io.gff3.constants.BioCantorGFF3ReservedQualifiers

Bases: inscripta.biocantor.util.enum.HasMemberMixin

This is the subset of GFF3 reserved qualifiers that BioCantor currently reserves.

NAME = 'Name'
PARENT = 'Parent'
ID = 'ID'
biocantor.io.gff3.constants.GFF3ReservedQualifiers
class biocantor.io.gff3.constants.BioCantorQualifiers

Bases: enum.Enum

These are qualifiers that are added when exporting from BioCantor to GFF3, if they exist on the object.

Note that this enum does not filter arbitrary qualifier types; it exists to map attributes of an interval object onto the keys in the GFF3 attributes map.

TRANSCRIPT_ID = 'transcript_id'
TRANSCRIPT_NAME = 'transcript_name'
TRANSCRIPT_TYPE = 'transcript_biotype'
PROTEIN_ID = 'protein_id'
PRODUCT = 'product'
GENE_ID = 'gene_id'
GENE_SYMBOL = 'gene_name'
GENE_NAME = 'gene_name'
GENE_TYPE = 'gene_biotype'
FEATURE_ID = 'feature_id'
FEATURE_NAME = 'feature_name'
FEATURE_SYMBOL = 'feature_name'
FEATURE_COLLECTION_NAME = 'feature_collection_name'
FEATURE_COLLECTION_ID = 'feature_collection_id'
FEATURE_COLLETION_TYPE = 'feature_collection_type'
FEATURE_TYPE = 'feature_type'
LOCUS_TAG = 'locus_tag'
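
Two pairs of members above share a value (GENE_SYMBOL and GENE_NAME, FEATURE_NAME and FEATURE_SYMBOL). In a standard `enum.Enum`, a later member with a repeated value becomes an alias of the first one. The sketch below demonstrates that behavior on a locally re-declared subset of the members; it assumes the real class is a plain `enum.Enum`, as the documented base suggests.

```python
from enum import Enum


class BioCantorQualifiers(Enum):
    # Subset of the documented members, re-declared locally.
    GENE_ID = "gene_id"
    GENE_SYMBOL = "gene_name"
    GENE_NAME = "gene_name"  # same value as GENE_SYMBOL, so this is an alias
    GENE_TYPE = "gene_biotype"


# An alias resolves to the first member defined with that value.
assert BioCantorQualifiers.GENE_NAME is BioCantorQualifiers.GENE_SYMBOL

# Iteration yields canonical members only; aliases are skipped.
member_names = [member.name for member in BioCantorQualifiers]

# __members__ still lists every name, including aliases.
all_names = list(BioCantorQualifiers.__members__)

print(member_names)
print(all_names)
```

One practical consequence is that code looping over the enum sees each GFF3 attribute key once, while lookups by either name still succeed.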
biocantor.io.gff3.constants.BIOCANTOR_QUALIFIERS_REGEX
class biocantor.io.gff3.constants.GFF3GeneFeatureTypes

Bases: inscripta.biocantor.util.enum.HasMemberMixin

These are the feature types that BioCantor currently understands when parsing GFF3 files.

GENE = 'gene'
TRANSCRIPT = 'transcript'
CDS = 'CDS'
EXON = 'exon'
PSEUDOGENE = 'pseudogene'
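
A minimal sketch of checking a GFF3 type column against these values follows. It uses a plain `enum.Enum` as a local stand-in, because the interface of HasMemberMixin is not shown on this page; the helper `is_understood` and the sample type list are hypothetical. How the real parser treats other feature types is not covered here.

```python
from enum import Enum


class GFF3GeneFeatureTypes(Enum):
    # Local stand-in for the documented class; the real one derives from
    # HasMemberMixin, which is not reproduced here.
    GENE = "gene"
    TRANSCRIPT = "transcript"
    CDS = "CDS"
    EXON = "exon"
    PSEUDOGENE = "pseudogene"


KNOWN_TYPES = {member.value for member in GFF3GeneFeatureTypes}


def is_understood(feature_type: str) -> bool:
    """Hypothetical helper: exact, case-sensitive match on member values."""
    return feature_type in KNOWN_TYPES


types_in_file = ["gene", "mRNA", "exon", "CDS", "tRNA"]
understood = [t for t in types_in_file if is_understood(t)]
print(understood)
```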
class biocantor.io.gff3.constants.BioCantorFeatureTypes

Bases: inscripta.biocantor.util.enum.HasMemberMixin

These are the feature types currently supported by BioCantor when writing genes, transcripts, and feature collections to GFF3. When exporting features, the type of the feature is used directly.

TODO: Feature types should be explicitly linked to Sequence Ontology types. Biological region is a catch-all term that matches both the INSDC specification and SO:0001411.

GENE = 'gene'
TRANSCRIPT = 'transcript'
CDS = 'CDS'
EXON = 'exon'
FEATURE_COLLECTION = 'biological_region'
FEATURE_INTERVAL = 'feature_interval'
FEATURE_INTERVAL_REGION = 'subregion'