Bio::DB::SeqFeature::Store(3)              User Contributed Perl Documentation




NAME
       Bio::DB::SeqFeature::Store -- Storage and retrieval of sequence
       annotation data

SYNOPSIS
         use Bio::DB::SeqFeature::Store;

         # Open the feature database
         my $db      = Bio::DB::SeqFeature::Store->new( -adaptor => 'DBI::mysql',
                                                        -dsn     => 'dbi:mysql:test',
                                                        -create  => 1 );

         # get a feature from somewhere
         my $feature = Bio::SeqFeature::Generic->new(...);

         # store it
         $db->store($feature) or die "Couldn't store!";

         # the feature's primary_id is updated to match its primary ID
         # in the database...
         my $id = $feature->primary_id;

         # get the feature back out
         my $f  = $db->fetch($id);

         # change the feature and update it
         $f->start(100);
         $db->update($f) or die "Couldn't update!";

         # searching...
         # ...by id
         my @features = $db->fetch_many(@list_of_ids);

         # ...by name
         @features = $db->get_features_by_name('ZK909');

         # ...by alias
         @features = $db->get_features_by_alias('sma-3');

         # ...by type
         @features = $db->get_features_by_type('gene');

         # ...by location
         @features = $db->get_features_by_location(-seq_id=>'Chr1',-start=>4000,-end=>600000);

         # ...by attribute
         @features = $db->get_features_by_attribute({description => 'protein kinase'});

         # ...by primary id
         @features = $db->get_feature_by_primary_id(42); # note no plural!!!

         # ...by the GFF "Note" field
         my @result_list = $db->search_notes('kinase');

         # ...by arbitrary combinations of selectors
         @features = $db->features(-name => $name,
                                   -type => $types,
                                   -seq_id => $seqid,
                                   -start  => $start,
                                   -end    => $end,
                                   -attributes => $attributes);

         # ...using an iterator
         my $iterator = $db->get_seq_stream(-name => $name,
                                            -type => $types,
                                            -seq_id => $seqid,
                                            -start  => $start,
                                            -end    => $end,
                                            -attributes => $attributes);

         while (my $feature = $iterator->next_seq) {
           # do something with the feature
         }

         # ...limiting the search to a particular region
         my $segment  = $db->segment('Chr1',5000=>6000);
         @features    = $segment->features(-type=>['mRNA','match']);

         # getting & storing sequence information
         $db->insert_sequence('Chr1','GATCCCCCGGGATTCCAAAA...');
         # Warning: fetch_sequence() returns a string, not a PrimarySeq object
         my $sequence = $db->fetch_sequence('Chr1',5000=>6000);

         # what feature types are defined in the database?
         my @types    = $db->types;

         # create a new feature in the database
         $feature = $db->new_feature(-primary_tag => 'mRNA',
                                     -seq_id      => 'chr3',
                                     -start       => 10000,
                                     -end         => 11000);

         # load an entire GFF3 file, using the GFF3 loader...
         my $loader = Bio::DB::SeqFeature::Store::GFF3Loader->new(-store    => $db,
                                                                  -verbose  => 1,
                                                                  -fast     => 1);

         $loader->load('./my_genome.gff3');

DESCRIPTION
       Bio::DB::SeqFeature::Store implements the Bio::SeqFeature::CollectionI
       interface to allow you to persistently store Bio::SeqFeatureI objects
       in a database and later retrieve them by a variety of searches.
       This module is similar to the older Bio::DB::GFF module, with the
       following differences:

       1.  No limitation on Bio::SeqFeatureI implementations

           Unlike Bio::DB::GFF, Bio::DB::SeqFeature::Store works with any
           Bio::SeqFeatureI object.

       2.  No limitation on nesting of features & subfeatures

           Bio::DB::GFF is limited to features that have at most one level of
           subfeature. Bio::DB::SeqFeature::Store can work with features that
           have unlimited levels of nesting.

       3.  No aggregators

           The aggregator architecture, which was necessary to impose order on
           the GFF2 files that Bio::DB::GFF works with, does not apply to
           Bio::DB::SeqFeature::Store. It is intended to store features that
           obey well-defined ontologies, such as the Sequence Ontology
           (http://song.sourceforge.net).

       4.  No relative locations

           All locations defined by this module are relative to an absolute
           sequence ID, unlike Bio::DB::GFF which allows you to define the
           location of one feature relative to another.

       We'll discuss major concepts in Bio::DB::SeqFeature::Store and then
       describe how to use the module.

   Adaptors
       Bio::DB::SeqFeature::Store is designed to work with a variety of
       storage back ends called "adaptors." Adaptors are subclasses of
       Bio::DB::SeqFeature::Store and provide the interface between the
       store() and fetch() methods and the physical database. Currently the
       number of adaptors is quite limited, but the number will grow soon.

       memory
           An implementation that stores all data in memory. This is useful
           for small data sets of no more than 10,000 features (more or less,
           depending on system memory).

       DBI::mysql
           A full-featured implementation on top of the MySQL relational
           database system.

       berkeleydb
           A full-featured implementation that runs on top of the BerkeleyDB
           database. See Bio::DB::SeqFeature::Store::berkeleydb.

       If you do not explicitly specify the adaptor, then DBI::mysql will be
       used by default.

   Serializers
       When Bio::DB::SeqFeature::Store stores a Bio::SeqFeatureI object into
       the database, it serializes it into binary or text form. When it later
       fetches the feature from the database, it unserializes it. Two
       serializers are available:

       Storable
           This is a fast binary serializer. It is available in Perl versions
           5.8.7 and higher and is used when available.

       Data::Dumper
           This is a slow text serializer that is available in Perl 5.8.0 and
           higher. It is used when Storable is unavailable.

       If you do not specify the serializer, then Storable will be used if
       available; otherwise Data::Dumper.
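
       The round trip that a serializer performs can be sketched with the
       two modules named above. This is only an illustration: the plain
       hash below is a hypothetical stand-in for a feature, whereas the
       store itself serializes Bio::SeqFeatureI objects.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Data::Dumper;

# Hypothetical stand-in for a feature object (not a real Bio::SeqFeatureI).
my $feature = { seq_id => 'Chr1', start => 4000, end => 6000, type => 'gene' };

# Binary round trip with Storable: freeze to a byte string, thaw to a copy.
my $frozen   = freeze($feature);
my $restored = thaw($frozen);

# Text round trip with Data::Dumper: dump to Perl source, then eval it back.
local $Data::Dumper::Terse = 1;    # emit a bare expression, no "$VAR1 ="
my $dumped    = Dumper($feature);
my $restored2 = eval $dumped;
die "could not restore dumped feature: $@" if $@;

print "Storable:     $restored->{seq_id}:$restored->{start}..$restored->{end}\n";
print "Data::Dumper: $restored2->{seq_id}:$restored2->{start}..$restored2->{end}\n";
```

       Both paths return an independent copy of the original structure; the
       text form is human readable but has to be re-evaluated as Perl code
       on the way back in.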

   Loaders and Normalized Features
       The Bio::DB::SeqFeature::Store::GFF3Loader parses a GFF3-format file
       and loads the annotations and sequence data into the database of your
       choice. The script bp_seqfeature_load.pl (found in the
       scripts/Bio-SeqFeature-Store/ subdirectory) is a thin front end to the
       GFF3Loader. Other loaders may be written later.

       Although Bio::DB::SeqFeature::Store should work with any
       Bio::SeqFeatureI object, there are some disadvantages to using
       Bio::SeqFeature::Generic and other vanilla implementations. The major
       issue is that if two vanilla features share the same subfeature (e.g.
       two transcripts sharing an exon), the shared subfeature will be cloned
       when stored into the database.

       The special-purpose Bio::DB::SeqFeature class is able to normalize its
       subfeatures in the database, so that shared subfeatures are stored only
       once. This minimizes wasted storage space. In addition, when in-memory
       caching is turned on, each shared subfeature will usually occupy only a
       single memory location upon restoration.
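
       The difference between cloned and normalized subfeatures can be
       illustrated without a database. The sketch below uses plain hashes,
       Storable, and a hypothetical fetch_exon() helper; it is not the
       storage scheme used by Bio::DB::SeqFeature itself.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical plain-hash features: two transcripts sharing one exon.
my $exon  = { type => 'exon', start => 100, end => 200 };
my $mrna1 = { type => 'mRNA', name => 'T1', subfeatures => [$exon] };
my $mrna2 = { type => 'mRNA', name => 'T2', subfeatures => [$exon] };

# Vanilla storage: each parent is serialized with its subfeatures embedded,
# so restoring both parents yields two separate copies of the exon.
my @restored = map { thaw(freeze($_)) } ($mrna1, $mrna2);
my $cloned   = $restored[0]{subfeatures}[0] != $restored[1]{subfeatures}[0];

# Normalized storage: the exon is serialized once under an ID and the
# parents would hold only that ID. A small cache hands out one object.
my %db = (1 => freeze($exon));
my %cache;
sub fetch_exon {
    my ($id) = @_;
    return $cache{$id} ||= thaw($db{$id});
}
my $shared = fetch_exon(1) == fetch_exon(1);

print "vanilla copies are distinct objects: ", ($cloned ? 'yes' : 'no'), "\n";
print "normalized exon is a single object:  ", ($shared ? 'yes' : 'no'), "\n";
```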

Methods for Connecting and Initializing a Database

   new
        Title   : new
        Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
        Function: connect to a database
        Returns : A descendant of Bio::DB::SeqFeature::Store
        Args    : several - see below
        Status  : public

       This class method creates a new database connection. The following
       -name=>$value arguments are accepted:

        Name               Value
        ----               -----

        -adaptor           The name of the Adaptor class (default DBI::mysql)

        -serializer        The name of the serializer class (default Storable)

        -index_subfeatures Whether or not to make subfeatures searchable
                           (default false)

        -cache             Activate LRU caching feature -- size of cache

        -compress          Compresses features before storing them in database
                           using Compress::Zlib

        -create            (Re)initialize the database.

       The -index_subfeatures argument, if true, tells the module to create
       indexes for a feature and all its subfeatures (and its subfeatures'
       subfeatures). For a gene, for example, indexing subfeatures means that
       you will be able to search for the gene, its mRNA subfeatures and the
       exons inside each mRNA. It also means that when you search the database
       for all features contained within a particular location, you will get
       the gene, the mRNAs and all the exons as individual objects as well as
       subfeatures of each other. NOTE: this option is only honored when
       working with a normalized feature class such as Bio::DB::SeqFeature.

       The -cache argument, if true, tells the module to try to create an LRU
       (least-recently-used) object cache using the Tie::Cacher module.
       Caching will cause two objects that share the same primary_id to
       (often, but not always) share the same memory location, and may improve
       performance modestly. The argument is taken as the desired size for the
       cache. If you pass "1" as the cache value, a reasonable default cache
       size will be chosen. Caching requires the Tie::Cacher module to be
       installed. If the module is not installed, then caching will silently
       be disabled.
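
       To make the caching behaviour concrete, here is a minimal sketch of
       a least-recently-used cache keyed by primary ID. The SimpleLRU class
       is hypothetical and written only for this illustration; it does not
       reproduce the Tie::Cacher interface.

```perl
use strict;
use warnings;

# Hypothetical minimal LRU cache (not the Tie::Cacher API).
package SimpleLRU;

sub new {
    my ($class, $max) = @_;
    return bless { max => $max, data => {}, order => [] }, $class;
}

# Return the cached object for $id, or undef on a miss.
sub get {
    my ($self, $id) = @_;
    return undef unless exists $self->{data}{$id};
    $self->_touch($id);
    return $self->{data}{$id};
}

# Cache $obj under $id, evicting the least recently used entries if full.
sub set {
    my ($self, $id, $obj) = @_;
    $self->{data}{$id} = $obj;
    $self->_touch($id);
    while (@{ $self->{order} } > $self->{max}) {
        my $oldest = shift @{ $self->{order} };
        delete $self->{data}{$oldest};
    }
    return $obj;
}

# Move $id to the most-recently-used end of the order list.
sub _touch {
    my ($self, $id) = @_;
    @{ $self->{order} } = grep { $_ ne $id } @{ $self->{order} };
    push @{ $self->{order} }, $id;
}

package main;

my $cache = SimpleLRU->new(2);
$cache->set(1 => { name => 'f1' });
$cache->set(2 => { name => 'f2' });
my $first  = $cache->get(1);
my $second = $cache->get(1);            # same object as $first
$cache->set(3 => { name => 'f3' });     # cache is full: evicts ID 2
```

       Two lookups of the same ID return the same reference, which is the
       "same memory location" effect described above. Once an entry has
       been evicted, a later fetch would have to build a fresh object.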

       The -compress argument, if true, will cause the feature data to be
       compressed before storing it. This will make the database somewhat
       smaller at the cost of decreasing performance.

       The -create argument, if true, will either initialize or reinitialize
       the database. It is needed the first time a database is used.

       The new() method of individual adaptors recognizes additional arguments.
       The default DBI::mysql adaptor recognizes the following ones:

        Name               Value
        ----               -----

        -dsn               DBI data source (default dbi:mysql:test)

        -autoindex         A flag that controls whether or not to update
                           all search indexes whenever a feature is stored
                           or updated (default true).

        -namespace         A string that will be used to qualify each table,
                           thereby allowing you to store several independent
                           sequence feature databases in a single MySQL
                           database.

        -dumpdir           The path to a temporary directory that will be
                           used during "fast" loading. See
                           Bio::DB::SeqFeature::Store::GFF3Loader for a
                           description of this. Default is the current
                           directory.

        -write             Make the database writeable (implied by -create)

   init_database
        Title   : init_database
        Usage   : $db->init_database([$erase_flag])
        Function: initialize a database
        Returns : true
        Args    : (optional) flag to erase current data
        Status  : public

       Call this after Bio::DB::SeqFeature::Store->new() to initialize a new
       database. In the case of a DBI database, this method installs the
       schema but does not create the database. You have to do this offline
       using the appropriate command-line tool. In the case of the
       "berkeleydb" adaptor, this creates an empty BTREE database.

       If there is any data already in the database, init_database() called
       with no arguments will have no effect. To permanently erase the data
       already there and prepare to receive a fresh set of data, pass a true
       argument.

   post_init
       This method is invoked after init_database for use by certain adaptors
       (currently only the memory adaptor) to do automatic data loading after
       initialization. It is passed a copy of the init_database() args.

   store
        Title   : store
        Usage   : $success = $db->store(@features)
        Function: store one or more features into the database
        Returns : true if successful
        Args    : list of Bio::SeqFeatureI objects
        Status  : public

       This method stores a list of features into the database. Each feature
       is updated so that its primary_id becomes the primary ID of the
       serialized feature stored in the database. If all features were
       successfully stored, the method returns true. In the DBI
       implementation, the store is performed as a single transaction and the
       transaction is rolled back if one or more store operations failed.

       You can find out what the primary ID of the feature has become by
       calling the feature's primary_id() method:

         $db->store($my_feature) or die "Oh darn";
         my $id = $my_feature->primary_id;

       If the feature contains subfeatures, they will all be stored
       recursively. In the case of Bio::DB::SeqFeature and
       Bio::DB::SeqFeature::Store::NormalizedFeature, the subfeatures will be
       stored in a normalized way so that each subfeature appears just once in
       the database.

       Subfeatures will be indexed for separate retrieval based on the current
       value of index_subfeatures().

       If you call store() with one or more features that already have valid
       primary_ids, then the existing object(s) will be replaced. Note that
       when using normalized features such as Bio::DB::SeqFeature, the
       subfeatures are not recursively updated when you update the parent
       feature. You must manually update each subfeature that has changed.

   store_noindex
        Title   : store_noindex
        Usage   : $success = $db->store_noindex(@features)
        Function: store one or more features into the database without indexing
        Returns : true if successful
        Args    : list of Bio::SeqFeatureI objects
        Status  : public

       This method stores a list of features into the database but does not
       make them searchable. The only way to access the features is via their
       primary IDs. This method is ordinarily only used internally to store
       subfeatures that are not indexed.

   no_blobs
        Title   : no_blobs
        Usage   : $db->no_blobs(1);
        Function: decide if objects should be stored in the database as blobs.
        Returns : boolean (default false)
        Args    : boolean (true to no longer store objects; when the corresponding
                  feature is retrieved it will instead be a minimal representation of
                  the object that was stored, as some simple Bio::SeqFeatureI object)
        Status  : dubious (new)

       This method saves lots of space in the database, which may in turn lead
       to large performance increases in extreme cases (over 7 million
       features in the db).

       Currently only applies to the mysql implementation.

   new_feature
        Title   : new_feature
        Usage   : $feature = $db->new_feature(@args)
        Function: create a new Bio::DB::SeqFeature object in the database
        Returns : the new seqfeature
        Args    : see below
        Status  : public

       This method creates and stores a new Bio::SeqFeatureI object using the
       specialized Bio::DB::SeqFeature class. This class is able to store its
       subfeatures in a normalized fashion, allowing subfeatures to be shared
       among multiple parents (e.g. multiple exons shared among several
       mRNAs).

       The arguments are the same as for Bio::DB::SeqFeature->new(), which in
       turn are similar to Bio::SeqFeature::Generic->new() and
       Bio::Graphics::Feature->new(). The most important difference is the
       -index option, which controls whether the feature will be indexed for
       retrieval (default is true). Ordinarily, you would only want to turn
       indexing off when creating subfeatures, because features stored without
       indexes will only be reachable via their primary IDs or their parents.

       Arguments are as follows:

         -seq_id       the reference sequence
         -start        the start position of the feature
         -end          the stop position of the feature
         -display_name the feature name (returned by seqname)
         -primary_tag  the feature type (returned by primary_tag)
         -source       the source tag
         -score        the feature score (for GFF compatibility)
         -desc         a description of the feature
         -segments     a list of subfeatures (see Bio::Graphics::Feature)
         -subtype      the type to use when creating subfeatures
         -strand       the strand of the feature (one of -1, 0 or +1)
         -phase        the phase of the feature (0..2)
         -url          a URL to link to when rendered with Bio::Graphics
         -attributes   a hashref of tag value attributes, in which the key is the tag
                         and the value is an array reference of values
         -index        index this feature if true

       Aliases:

         -id           an alias for -display_name
         -seqname      an alias for -display_name
         -display_id   an alias for -display_name
         -name         an alias for -display_name
         -stop         an alias for -end
         -type         an alias for -primary_tag

       You can change the seqfeature implementation generated by
       new_feature() by passing the name of the desired seqfeature class to
       $db->seqfeature_class().

   delete
        Title   : delete
        Usage   : $success = $db->delete(@features)
        Function: delete a list of features from the database
        Returns : true if successful
        Args    : list of features
        Status  : public

       This method looks up the primary IDs from a list of features and
       deletes them from the database, returning true if all deletions are
       successful.

       WARNING: The current DBI::mysql implementation has some issues that
       need to be resolved, namely (1) normalized subfeatures are NOT
       recursively deleted; and (2) the deletions are not performed in a
       transaction.

   get_feature_by_id
        Title   : get_feature_by_id
        Usage   : $feature = $db->get_feature_by_id($primary_id)
        Function: fetch a feature from the database using its primary ID
        Returns : a feature
        Args    : primary ID of desired feature
        Status  : public

       This method returns a previously-stored feature from the database using
       its primary ID. If the primary ID is invalid, it returns undef.

   fetch
        Title   : fetch
        Usage   : $feature = $db->fetch($primary_id)
        Function: fetch a feature from the database using its primary ID
        Returns : a feature
        Args    : primary ID of desired feature
        Status  : public

       This is an alias for get_feature_by_id().

   get_feature_by_primary_id
        Title   : get_feature_by_primary_id
        Usage   : $feature = $db->get_feature_by_primary_id($primary_id)
        Function: fetch a feature from the database using its primary ID
        Returns : a feature
        Args    : primary ID of desired feature
        Status  : public

       This method returns a previously-stored feature from the database using
       its primary ID. If the primary ID is invalid, it returns undef. This
       method is identical to fetch().

   fetch_many
        Title   : fetch_many
        Usage   : @features = $db->fetch_many($primary_id,$primary_id,$primary_id...)
        Function: fetch many features from the database using their primary ID
        Returns : list of features
        Args    : a list of primary IDs or an array ref of primary IDs
        Status  : public

       Same as fetch() except that you can pass a list of primary IDs or a ref
       to an array of IDs.

   get_seq_stream
        Title   : get_seq_stream
        Usage   : $iterator = $db->get_seq_stream(@args)
        Function: return an iterator across all features in the database
        Returns : a Bio::DB::SeqFeature::Store::Iterator object
        Args    : feature filters (optional)
        Status  : public

       When called without any arguments this method will return an iterator
       object that will traverse all indexed features in the database. Call
       the iterator's next_seq() method to step through them (in no particular
       order):

         my $iterator = $db->get_seq_stream;
         while (my $feature = $iterator->next_seq) {
           print $feature->primary_tag,' ',$feature->display_name,"\n";
         }

       You can select a subset of features by passing a series of filter
       arguments. The arguments are identical to those accepted by
       $db->features().

   get_features_by_name
        Title   : get_features_by_name
        Usage   : @features = $db->get_features_by_name($name)
        Function: looks up features by their display_name
        Returns : a list of matching features
        Args    : the desired name
        Status  : public

       This method searches the display_name of all features for matches
       against the provided name. GLOB-style wildcards ("*", "?") are
       accepted, but may be slow.

       The method returns the list of matches, which may contain zero, one,
       or more features. Be prepared to receive more than one result, as
       display names are not guaranteed to be unique.

       For backward compatibility with gbrowse, this method is also known as
       get_feature_by_name().
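
       For readers unfamiliar with GLOB patterns, the following sketch
       shows what "*" and "?" mean by translating a pattern into a regular
       expression. The glob_to_regex() helper is hypothetical, and the
       match here is case sensitive by assumption; how a given adaptor
       evaluates wildcards internally may differ.

```perl
use strict;
use warnings;

# Translate a GLOB pattern ("*" = any run of characters, "?" = exactly one
# character) into an anchored regular expression.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = '';
    for my $char (split //, $glob) {
        $re .= $char eq '*' ? '.*'
             : $char eq '?' ? '.'
             :                quotemeta($char);
    }
    return qr/\A$re\z/;
}

my @names   = ('ZK909', 'ZK909.1', 'ZK154', 'sma-3');
my $pattern = glob_to_regex('ZK909*');
my @matches = grep { $_ =~ $pattern } @names;

print "@matches\n";    # prints "ZK909 ZK909.1"
```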

   get_feature_by_name
        Title   : get_feature_by_name
        Usage   : @features = $db->get_feature_by_name($name)
        Function: looks up features by their display_name
        Returns : a list of matching features
        Args    : the desired name
        Status  : Use get_features_by_name instead.

       This method is provided for backward compatibility with gbrowse.

   get_features_by_alias
        Title   : get_features_by_alias
        Usage   : @features = $db->get_features_by_alias($name)
        Function: looks up features by their display_name or alias
        Returns : a list of matching features
        Args    : the desired name
        Status  : public

       This method is similar to get_features_by_name() except that it will
       also search through the feature aliases.  Aliases can be created by
       storing features that contain one or more Alias tags. Wildcards are
       accepted.

   get_features_by_type
        Title   : get_features_by_type
        Usage   : @features = $db->get_features_by_type(@types)
        Function: looks up features by their primary_tag
        Returns : a list of matching features
        Args    : list of primary tags
        Status  : public

       This method will return a list of features that have any of the primary
       tags given in the argument list. For compatibility with gbrowse and
       Bio::DB::GFF, types can be qualified using a colon:

         primary_tag:source_tag

       in which case only features that match both the primary_tag and the
       indicated source_tag will be returned. If the database was loaded from
       a GFF3 file, this corresponds to the third and second columns of the
       row, in that order.

       For example, given the GFF3 lines:

         ctg123 geneFinder exon 1300 1500 . + . ID=exon001
         ctg123 fgenesH    exon 1300 1520 . + . ID=exon002

       exon001 and exon002 will be returned by searching for type "exon", but
       only exon002 will be returned by searching for type "exon:fgenesH".
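
       The qualified-type rule can be checked by hand with a small filter
       over the two GFF3 rows. The ids_by_type() helper below is
       hypothetical and only mimics the matching rule; it is not part of
       the module.

```perl
use strict;
use warnings;

# The two GFF3 rows, reduced to the columns that matter for type matching.
my @rows = (
    { id => 'exon001', source => 'geneFinder', primary_tag => 'exon' },
    { id => 'exon002', source => 'fgenesH',    primary_tag => 'exon' },
);

# Return the IDs of rows matching "primary_tag" or "primary_tag:source_tag".
sub ids_by_type {
    my ($type, @candidates) = @_;
    my ($tag, $source) = split /:/, $type, 2;
    my @ids;
    for my $row (@candidates) {
        next unless $row->{primary_tag} eq $tag;
        next if defined $source && $row->{source} ne $source;
        push @ids, $row->{id};
    }
    return @ids;
}

print join(',', ids_by_type('exon',         @rows)), "\n";  # exon001,exon002
print join(',', ids_by_type('exon:fgenesH', @rows)), "\n";  # exon002
```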

   get_features_by_location
        Title   : get_features_by_location
        Usage   : @features = $db->get_features_by_location(@args)
        Function: looks up features by their location
        Returns : a list of matching features
        Args    : see below
        Status  : public

       This method fetches features based on a location range lookup. You call
       it using a positional list of arguments, or a list of
       (-argument=>$value) pairs.

       The positional form is as follows:

        $db->get_features_by_location($seqid [,$start [,$end]])

       The $seqid is the name of the sequence on which the feature resides,
       and start and end are optional endpoints for the match. If the
       endpoints are missing then any feature on the indicated seqid is
       returned.

       Examples:

        $db->get_features_by_location('chr1');           # all features on chromosome 1
        $db->get_features_by_location('chr1',5000);      # features between 5000 and the end
        $db->get_features_by_location('chr1',5000,8000); # features between 5000 and 8000

       Location lookups are overlapping. A feature will be returned if it
       partially or completely overlaps the indicated range.

       The named argument form gives you more control:

         Argument       Value
         --------       -----

         -seq_id        The name of the sequence on which the feature resides
         -start         Start of the range
         -end           End of the range
         -strand        Strand of the feature
         -range_type    Type of range to search over

       The -strand argument, if present, can be one of "0" to find features
       that are on both strands, "+1" to find only plus strand features, and
       "-1" to find only minus strand features. Specifying a strand of undef
       is the same as not specifying this argument at all, and retrieves all
       features regardless of their strandedness.

       The -range_type argument, if present, can be one of "overlaps" (the
       default), to find features whose positions overlap the indicated range,
       "contains," to find features whose endpoints are completely contained
       within the indicated range, and "contained_in" to find features whose
       endpoints are both outside the indicated range.
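
       The three range types can be written out as coordinate comparisons.
       This is a sketch under the assumption that endpoints are compared
       inclusively; range_matches() is a hypothetical helper, not a method
       of the module.

```perl
use strict;
use warnings;

# Does a feature spanning $fs..$fe match the query range $qs..$qe?
sub range_matches {
    my ($range_type, $fs, $fe, $qs, $qe) = @_;
    if ($range_type eq 'overlaps') {        # any shared position
        return ($fe >= $qs && $fs <= $qe) ? 1 : 0;
    }
    if ($range_type eq 'contains') {        # feature lies inside the range
        return ($fs >= $qs && $fe <= $qe) ? 1 : 0;
    }
    if ($range_type eq 'contained_in') {    # feature spans the whole range
        return ($fs <= $qs && $fe >= $qe) ? 1 : 0;
    }
    die "unknown range type '$range_type'";
}

# Query range 5000..6000 against three hypothetical features.
for my $feature ([5200, 5800], [4000, 7000], [5500, 6500]) {
    my ($fs, $fe) = @$feature;
    my @hits = grep { range_matches($_, $fs, $fe, 5000, 6000) }
               qw(overlaps contains contained_in);
    print "$fs..$fe: @hits\n";
}
```

       A feature at 5200..5800 satisfies "overlaps" and "contains", one at
       4000..7000 satisfies "overlaps" and "contained_in", and one at
       5500..6500 satisfies only "overlaps".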

   get_features_by_attribute
        Title   : get_features_by_attribute
        Usage   : @features = $db->get_features_by_attribute(@args)
        Function: looks up features by their attributes/tags
        Returns : a list of matching features
        Args    : see below
        Status  : public

       This implements a simple tag filter. Pass a list of tag names and their
       values. The module will return a list of features whose tag names and
       values match. Tag names are case insensitive. If multiple tag
       name/value pairs are present, they will be ANDed together. To match any
       of a list of values, use an array reference for the value.

       Examples:

        # return all features whose "function" tag is "GO:0000123"
        @features = $db->get_features_by_attribute(function => 'GO:0000123');

        # return all features whose "function" tag is "GO:0000123" or "GO:0000555"
        @features = $db->get_features_by_attribute(function => ['GO:0000123','GO:0000555']);

        # return all features whose "function" tag is "GO:0000123" or "GO:0000555"
        # and whose "confirmed" tag is 1
        @features = $db->get_features_by_attribute(function  => ['GO:0000123','GO:0000555'],
                                                   confirmed => 1);
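
       The AND/OR behaviour of the filter can be spelled out as follows.
       The attributes_match() helper is hypothetical and assumes exact
       string comparison of values; it only illustrates the rule that
       tags are ANDed while the alternatives for a single tag are ORed.

```perl
use strict;
use warnings;

# True if the attributes (tag => arrayref of values) satisfy every filter.
sub attributes_match {
    my ($attrs, %filter) = @_;

    # Tag names are case insensitive, so index the attributes by lowercase tag.
    my %have;
    for my $tag (keys %$attrs) {
        $have{ lc($tag) } = $attrs->{$tag};
    }

    for my $tag (keys %filter) {
        my $wanted = $filter{$tag};
        my @wanted = ref($wanted) eq 'ARRAY' ? @$wanted : ($wanted);
        my @values = @{ $have{ lc($tag) } || [] };
        # OR within a tag: any stored value equal to any wanted value.
        my $hit = grep { my $v = $_; grep { $v eq $_ } @wanted } @values;
        return 0 unless $hit;    # AND across tags: every tag must hit
    }
    return 1;
}

my $attrs = { function => ['GO:0000123'], confirmed => [1] };

print attributes_match($attrs, function => 'GO:0000123'), "\n";
print attributes_match($attrs, Function  => ['GO:0000123', 'GO:0000555'],
                               confirmed => 1), "\n";
print attributes_match($attrs, function => 'GO:0000555'), "\n";
```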

   features
        Title   : features
        Usage   : @features = $db->features(@args)
        Function: generalized query & retrieval interface
        Returns : list of features
        Args    : see below
        Status  : Public

       This is the workhorse for feature query and retrieval. It takes a
       series of -name=>$value filter arguments. Features that match all the
       filters are returned.

         Argument       Value
         --------       -----

        Location filters:
         -seq_id        Chromosome, contig or other DNA segment
         -seqid         Synonym for -seq_id
         -ref           Synonym for -seq_id
         -start         Start of range
         -end           End of range
         -stop          Synonym for -end
         -strand        Strand
         -range_type    Type of range match ('overlaps','contains','contained_in')

        Name filters:
         -name          Name of feature (may be a glob expression)
         -aliases       If true, match aliases as well as display names
         -class         Archaic argument for backward compatibility.
                         (-class=>'Clone',-name=>'ABC123') is equivalent
                         to (-name=>'Clone:ABC123')

        Type filters:
         -types         List of feature types (array reference) or one type (scalar)
         -type          Synonym for the above
         -primary_tag   Synonym for the above

        Attribute filters:
         -attributes    Hashref of attribute=>value pairs as per
                         get_features_by_attribute(). Multiple alternative values
                         can be matched by providing an array reference.
         -attribute     Synonym for -attributes

       You may also provide features() with a list of scalar values (the first
       element of which must not begin with a dash), in which case it will
       treat the list as a feature type filter.

       Examples:

       All features on chromosome 1:

        @features = $db->features(-seqid=>'Chr1');

       All features on chromosome 1 between 5000 and 6000:

        @features = $db->features(-seqid=>'Chr1',-start=>5000,-end=>6000);

       All mRNAs on chromosome 1 between 5000 and 6000:

        @features = $db->features(-seqid=>'Chr1',-start=>5000,-end=>6000,-types=>'mRNA');

       All confirmed mRNAs and repeats on chromosome 1 that overlap the range
       5000..6000:

        @features = $db->features(-seqid     => 'Chr1',-start=>5000,-end=>6000,
                                  -types     => ['mRNA','repeat'],
                                  -attributes=> {confirmed=>1}
                                 );

       All confirmed mRNAs and repeats on chromosome 1 strictly contained
       within the range 5000..6000:

        @features = $db->features(-seqid      => 'Chr1',-start=>5000,-end=>6000,
                                  -types      => ['mRNA','repeat'],
                                  -attributes => {confirmed=>1},
                                  -range_type => 'contains',
                                 );

       All genes and repeats:

        @features = $db->features('gene','repeat_region');

   seq_ids
        Title   : seq_ids
        Usage   : @ids = $db->seq_ids()
        Function: Return all sequence IDs contained in database
        Returns : list of sequence Ids
        Args    : none
        Status  : public

   search_attributes
        Title   : search_attributes
        Usage   : @result_list = $db->search_attributes("text search string",[$tag1,$tag2...],$limit)
        Function: Search attributes for keywords occurring in a text string
        Returns : array of results
        Args    : full text search string, array ref of attribute names, and an optional feature limit
        Status  : public

       Given a search string, this method performs a full-text search of the
       specified attributes and returns an array of results.  You may pass a
       scalar attribute name to search the values of one attribute (e.g.
       "Note") or you may pass an array reference to search inside multiple
       attributes (['Note','Alias','Parent']). Each row of the returned array
       is an arrayref containing the following fields:

         column 1     The display name of the feature
         column 2     The text of the note
         column 3     A relevance score.
         column 4     The feature type
         column 5     The unique ID of the feature

       NOTE: This search will fail to find features that do not have a display
       name!

       You can use fetch() or fetch_many() with the returned IDs to get to the
       features themselves.
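
       As a rough illustration of working with the result rows, the sketch
       below unpacks the five documented columns, ranks the rows by relevance,
       and collects the IDs. The rows are made-up sample data standing in for
       the return value of search_attributes(); no database is involved.

```perl
use strict;
use warnings;

# Made-up rows in the documented layout:
# [display name, note text, relevance score, feature type, unique ID]
my @results = (
    [ 'abc-1', 'protein kinase',         0.42, 'gene', 17 ],
    [ 'xyz-7', 'putative kinase domain', 0.87, 'mRNA', 42 ],
);

# Highest relevance first (column 3).
my @ranked = sort { $b->[2] <=> $a->[2] } @results;

for my $row (@ranked) {
    my ($name, $text, $score, $type, $id) = @$row;
    printf "%-6s %-5s %.2f  %s (id %d)\n", $name, $type, $score, $text, $id;
}

# Column 5 holds the IDs to hand to fetch() or fetch_many().
my @ids = map { $_->[4] } @ranked;
print "@ids\n";   # 42 17
```

       With a real database handle, the collected IDs would then be passed on
       as $db->fetch_many(@ids).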

   search_notes
        Title   : search_notes
        Usage   : @result_list = $db->search_notes("full text search string",$limit)
        Function: Search the notes for a text string
        Returns : array of results
        Args    : full text search string, and an optional feature limit
        Status  : public

       Given a search string, this method performs a full-text search of the
       "Note" attribute and returns an array of results.  Each row of the
       returned array is an arrayref containing the following fields:

         column 1     The display_name of the feature, suitable for passing to get_feature_by_name()
         column 2     The text of the note
         column 3     A relevance score.
         column 4     The type

       NOTE: This is equivalent to $db->search_attributes('full text search
       string','Note',$limit). This search will fail to find features that do
       not have a display name!

   types
        Title   : types
        Usage   : @type_list = $db->types
        Function: Get all the types in the database
        Returns : array of Bio::DB::GFF::Typename objects
        Args    : none
        Status  : public

   insert_sequence
        Title   : insert_sequence
        Usage   : $success = $db->insert_sequence($seqid,$sequence_string,$offset)
        Function: Inserts sequence data into the database at the indicated offset
        Returns : true if successful
        Args    : see below
        Status  : public

       This method inserts the DNA or protein sequence fragment
       $sequence_string, identified by the ID $seqid, into the database at
       the indicated offset $offset. It is used internally by the GFF3Loader
       to load sequence data from the files.

   fetch_sequence
        Title   : fetch_sequence
        Usage   : $sequence = $db->fetch_sequence(-seq_id=>$seqid,-start=>$start,-end=>$end)
        Function: Fetch the indicated subsequence from the database
        Returns : The sequence string (a Bio::PrimarySeq object only if -bioseq is true)
        Args    : see below
        Status  : public

       This method retrieves a portion of the indicated sequence. The
       arguments are:

         Argument       Value
         --------       -----
         -seq_id        Chromosome, contig or other DNA segment
         -seqid         Synonym for -seq_id
         -name          Synonym for -seq_id
         -start         Start of range
         -end           End of range
         -class         Obsolete argument used for Bio::DB::GFF compatibility. If
                         specified will qualify the seq_id as "$class:$seq_id".
         -bioseq        Boolean flag; if true, returns a Bio::PrimarySeq object instead
                         of a sequence string.

       You can call fetch_sequence using the following shortcuts:

        $seq = $db->fetch_sequence('chr3');  # entire chromosome
        $seq = $db->fetch_sequence('chr3',1000);        # position 1000 to end of chromosome
        $seq = $db->fetch_sequence('chr3',undef,5000);  # position 1 to 5000
        $seq = $db->fetch_sequence('chr3',1000,5000);   # positions 1000 to 5000
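
       The defaulting behaviour of these shortcuts can be sketched with a toy
       in-memory store. This is an illustration only: toy_fetch_sequence() and
       the ten-base 'chr3' are invented for the example, and the sketch
       assumes 1-based, inclusive coordinates.

```perl
use strict;
use warnings;

# Toy sequence store; 'chr3' and its contents are invented for the example.
my %sequences = ( chr3 => 'ACGTACGTAC' );

# Hypothetical stand-in for the positional form of fetch_sequence().
# Assumes 1-based, inclusive coordinates.
sub toy_fetch_sequence {
    my ($seqid, $start, $end) = @_;
    my $seq = $sequences{$seqid};
    return unless defined $seq;

    $start = 1            unless defined $start;   # default: first base
    $end   = length($seq) unless defined $end;     # default: last base
    return substr($seq, $start - 1, $end - $start + 1);
}

print toy_fetch_sequence('chr3'), "\n";            # ACGTACGTAC
print toy_fetch_sequence('chr3', 7), "\n";         # GTAC
print toy_fetch_sequence('chr3', undef, 4), "\n";  # ACGT
print toy_fetch_sequence('chr3', 3, 6), "\n";      # GTAC
```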

   segment
        Title   : segment
        Usage   : $segment = $db->segment($seq_id [,$start] [,$end] [,$absolute])
        Function: restrict the database to a sequence range
        Returns : a Bio::DB::SeqFeature::Segment object
        Args    : sequence id, start and end ranges (optional)
        Status  : public

       This is a convenience method that can be used when you are interested
       in the contents of a particular sequence landmark, such as a contig.
       Specify the ID of a sequence or other landmark in the database and
       optionally a start and endpoint relative to that landmark. The method
       will look up the region and return a Bio::DB::SeqFeature::Segment
       object that spans it. You can then use this segment object to make
       location-restricted queries on the database.

       Example:

        $segment  = $db->segment('contig23',1,1000);  # first 1000 bp of contig23
        my @mRNAs = $segment->features('mRNA');       # all mRNAs that overlap segment

       Although you will usually want to fetch segments that correspond to
       physical sequences in the database, you can actually use any feature in
       the database as the sequence ID. The segment() method will perform a
       get_features_by_name() internally and then transform the feature into
       the appropriate coordinates.

       The named feature should exist once and only once in the database. If
       it exists multiple times in the database and you attempt to call
       segment() in a scalar context, you will get an exception. A workaround
       is to call the method in a list context, as in:

         my ($segment) = $db->segment('contig23',1,1000);

       or

         my @segments  = $db->segment('contig23',1,1000);

       However, having multiple same-named features in the database is often
       an indication of underlying data problems.

       If the optional $absolute argument is a true value, then the specified
       coordinates are relative to the reference (absolute) coordinates.

   seqfeature_class
        Title   : seqfeature_class
        Usage   : $classname = $db->seqfeature_class([$new_classname])
        Function: get or set the name of the Bio::SeqFeatureI class generated by new_feature()
        Returns : name of class
        Args    : new classname (optional)
        Status  : public

   reindex
        Title   : reindex
        Usage   : $db->reindex
        Function: reindex the database
        Returns : nothing
        Args    : nothing
        Status  : public

       This method will force the secondary indexes (name, location,
       attributes, feature types) to be recalculated. It may be useful to
       rebuild a corrupted database.

   attributes
        Title   : attributes
        Usage   : @a = $db->attributes
        Function: Returns list of all known attributes
        Returns : Returns list of all known attributes
        Args    : nothing
        Status  : public

   start_bulk_update,finish_bulk_update
        Title   : start_bulk_update,finish_bulk_update
        Usage   : $db->start_bulk_update
                  $db->finish_bulk_update
        Function: Activate optimizations for large number of insertions/updates
        Returns : nothing
        Args    : nothing
        Status  : public

       With some adaptors (currently only the DBI::mysql adaptor), these
       methods signal the adaptor that a large number of insertions or updates
       are to be performed, and activate certain optimizations. These methods
       are called automatically by the Bio::DB::SeqFeature::Store::GFF3Loader
       module.

       Example:

         $db->start_bulk_update;
         for my $f (@features) {
           $db->store($f);
         }
         $db->finish_bulk_update;

   add_SeqFeature
        Title   : add_SeqFeature
        Usage   : $count = $db->add_SeqFeature($parent,@children)
        Function: store a parent/child relationship between $parent and @children
        Returns : number of children successfully stored
        Args    : parent feature and one or more children
        Status  : OPTIONAL; MAY BE IMPLEMENTED BY ADAPTORS

       If can_store_parentage() returns true, then some store-aware features
       (e.g. Bio::DB::SeqFeature) will invoke this method to store
       feature/subfeature relationships in a normalized table.

   fetch_SeqFeatures
        Title   : fetch_SeqFeatures
        Usage   : @children = $db->fetch_SeqFeatures($parent_feature)
        Function: return the immediate subfeatures of the indicated feature
        Returns : list of subfeatures
        Args    : the parent feature
        Status  : OPTIONAL; MAY BE IMPLEMENTED BY ADAPTORS

       If can_store_parentage() returns true, then some store-aware features
       (e.g. Bio::DB::SeqFeature) will invoke this method to retrieve
       feature/subfeature relationships from the database.

Changing the Behavior of the Database
       These methods allow you to modify the behavior of the database.

   debug
        Title   : debug
        Usage   : $debug_flag = $db->debug([$new_flag])
        Function: set the debug flag
        Returns : current debug flag
        Args    : new debug flag
        Status  : public

       This method gets/sets a flag that turns on verbose progress messages.
       Currently this will not do very much.

   serializer
        Title   : serializer
        Usage   : $serializer = $db->serializer([$new_serializer])
        Function: get/set the name of the serializer
        Returns : the name of the current serializer class
        Args    : (optional) the name of a new serializer
        Status  : public

       You can use this method to set the serializer, but do not attempt to
       change the serializer once the database is initialized and populated.

   index_subfeatures
        Title   : index_subfeatures
        Usage   : $flag = $db->index_subfeatures([$new_value])
        Function: flag whether to index subfeatures
        Returns : current value of the flag
        Args    : (optional) new value of the flag
        Status  : public

       If true, the store() method will add a searchable index to both the
       top-level feature and all its subfeatures, allowing the search
       functions to return features at any level of the containment hierarchy.
       If false, only the top level feature will be indexed, meaning that you
       will only be able to get at subfeatures by fetching the top-level
       feature and then traversing downward using get_SeqFeatures().

       You are free to change this setting at any point during the creation
       and population of a database. One database can contain both indexed and
       unindexed subfeatures.

   clone
       The clone() method should be used when you want to pass the
       Bio::DB::SeqFeature::Store object to a child process across a fork().
       The child must call clone() before making any queries.

       The default behavior is to do nothing, but adaptors that use the DBI
       interface may need to implement this in order to avoid database handle
       errors. See the dbi adaptor for an example.

TIE Interface
       This module implements a full TIEHASH interface. The keys are the
       primary IDs of the features in the database. Example:

        tie %h,'Bio::DB::SeqFeature::Store',-adaptor=>'DBI::mysql',-dsn=>'dbi:mysql:elegans';
        $h{123} = $feature1;
        $h{124} = $feature2;
        print $h{123}->display_name;

   _init_database
        Title   : _init_database
        Usage   : $success = $db->_init_database([$erase])
        Function: initialize an empty database
        Returns : true on success
        Args    : optional boolean flag to erase contents of an existing database
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY AN ADAPTOR

       This method is the back end for init_database(). It must be implemented
       by an adaptor that inherits from Bio::DB::SeqFeature::Store. It returns
       true on success.

   _store
        Title   : _store
        Usage   : $success = $db->_store($indexed,@objects)
        Function: store seqfeature objects into database
        Returns : true on success
        Args    : a boolean flag indicating whether objects are to be indexed,
                  and one or more objects
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY AN ADAPTOR

       This method is the back end for store() and store_noindex(). It should
       write the seqfeature objects into the database. If indexing is
       requested, the features should be indexed for query and retrieval.
       Otherwise the features should be stored without indexing (it is not
       required that adaptors respect this).

       If the object has no primary_id (undef), then the object is written
       into the database and assigned a new primary_id. If the object already
       has a primary_id, then the system will perform an update, replacing
       whatever was there before.

       In practice, the implementation will serialize each object using the
       freeze() method and then store it in the database under the
       corresponding primary_id. The object is then updated with the
       primary_id.
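
       The insert-or-update rule and the serialize-then-store step can be
       sketched as follows. This is a toy model, not an adaptor: features are
       plain hashes rather than Bio::SeqFeatureI objects, the "database" is a
       Perl hash, Storable is assumed as the serializer, and toy_store() and
       toy_fetch() are hypothetical names with simplified signatures.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Toy back end: a hash of serialized records keyed by primary ID.
my %table;
my $next_id = 1;

sub toy_store {
    my @features = @_;
    for my $f (@features) {
        # No primary_id means insert under a new ID; otherwise overwrite.
        my $id = defined $f->{primary_id} ? $f->{primary_id} : $next_id++;

        # Serialize without the ID so the table key is the only copy of it.
        my %copy = %$f;
        delete $copy{primary_id};
        $table{$id} = nfreeze(\%copy);

        # Update the caller's object with the ID it was stored under.
        $f->{primary_id} = $id;
    }
    return scalar @features;
}

sub toy_fetch {
    my ($id) = @_;
    return unless exists $table{$id};
    my $f = thaw($table{$id});
    $f->{primary_id} = $id;    # resynchronize the ID after thawing
    return $f;
}

my $gene = { type => 'gene', start => 100, end => 900 };
toy_store($gene);
print "assigned id: $gene->{primary_id}\n";    # assigned id: 1

$gene->{start} = 150;                          # change and store again
toy_store($gene);
print toy_fetch(1)->{start}, "\n";             # 150
print scalar(keys %table), "\n";               # 1
```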

   _fetch
        Title   : _fetch
        Usage   : $feature = $db->_fetch($primary_id)
        Function: fetch feature from database
        Returns : feature
        Args    : primary id
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY AN ADAPTOR

       This method is the back end for fetch(). It accepts a primary_id and
       returns a feature object. It must be implemented by the adaptor.

       In practice, the implementation will retrieve the serialized
       Bio::SeqFeatureI object from the database and pass it to the thaw()
       method to unserialize it and synchronize the primary_id.

   _fetch_many
        Title   : _fetch_many
        Usage   : @features = $db->_fetch_many(@primary_ids)
        Function: fetch many features from database
        Returns : list of features
        Args    : list of primary ids
        Status  : private -- does not need to be implemented

       This method fetches many features specified by a list of IDs. The
       default implementation simply calls _fetch() once for each primary_id.
       Implementors can override it if needed for efficiency.

   _update_indexes
        Title   : _update_indexes
        Usage   : $success = $db->_update_indexes($feature)
        Function: update the indexes for a feature
        Returns : true on success
        Args    : A seqfeature object
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY AN ADAPTOR

       This method is called by reindex() to update the searchable indexes for
       a feature object that has changed.

   _start_reindexing, _end_reindexing
        Title   : _start_reindexing, _end_reindexing
        Usage   : $db->_start_reindexing()
                  $db->_end_reindexing
        Function: flag that a series of reindexing operations is beginning/ending
        Returns : true on success
        Args    : none
        Status  : MAY BE IMPLEMENTED BY AN ADAPTOR (optional)

       These methods are called by reindex() before and immediately after a
       series of reindexing operations. The default behavior is to do nothing,
       but these methods can be overridden by an adaptor in order to perform
       optimizations, turn off autocommits, etc.

   _features
        Title   : _features
        Usage   : @features = $db->_features(@args)
        Function: back end for all get_features_by_*() queries
        Returns : list of features
        Args    : see below
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY ADAPTOR

       This is the backend for features(), get_features_by_name(),
       get_features_by_location(), etc. Arguments are as described for the
       features() method, except that only the named-argument form is
       recognized.

   _search_attributes
        Title   : _search_attributes
        Usage   : @result_list = $db->_search_attributes("text search string",[$tag1,$tag2...],$limit)
        Function: back end for the search_attributes() method
        Returns : results list
        Args    : as per search_attributes()
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY ADAPTOR

       See search_attributes() for the format of the results list. The only
       difference between this and the public method is that the tag list is
       guaranteed to be an array reference.

   can_store_parentage
        Title   : can_store_parentage
        Usage   : $flag = $db->can_store_parentage
        Function: return true if this adaptor can store parent/child relationships
        Returns : boolean
        Args    : none
        Status  : OPTIONAL; MAY BE IMPLEMENTED BY ADAPTORS

       Override this method and return true if this adaptor supports the
       _add_SeqFeature() and _fetch_SeqFeatures() methods, which are used for
       storing feature parent/child relationships in a normalized fashion.
       Default is false (parent/child relationships are stored in denormalized
       form in each feature).

   _add_SeqFeature
        Title   : _add_SeqFeature
        Usage   : $count = $db->_add_SeqFeature($parent,@children)
        Function: store a parent/child relationship between $parent and @children
        Returns : number of children successfully stored
        Args    : parent feature and one or more children
        Status  : OPTIONAL; MAY BE IMPLEMENTED BY ADAPTORS

       If can_store_parentage() returns true, then some store-aware features
       (e.g. Bio::DB::SeqFeature) will invoke this method to store
       feature/subfeature relationships in a normalized table.

   _fetch_SeqFeatures
        Title   : _fetch_SeqFeatures
        Usage   : @children = $db->_fetch_SeqFeatures($parent_feature)
        Function: return the immediate subfeatures of the indicated feature
        Returns : list of subfeatures
        Args    : the parent feature
        Status  : OPTIONAL; MAY BE IMPLEMENTED BY ADAPTORS

       If can_store_parentage() returns true, then some store-aware features
       (e.g. Bio::DB::SeqFeature) will invoke this method to retrieve
       feature/subfeature relationships from the database.

   _insert_sequence
        Title   : _insert_sequence
        Usage   : $success = $db->_insert_sequence($seqid,$sequence_string,$offset)
        Function: Inserts sequence data into the database at the indicated offset
        Returns : true if successful
        Args    : see below
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY ADAPTOR

       This is the back end for insert_sequence(). Adaptors must implement
       this method in order to store and retrieve nucleotide or protein
       sequence.

   _fetch_sequence
        Title   : _fetch_sequence
        Usage   : $sequence = $db->_fetch_sequence(-seq_id=>$seqid,-start=>$start,-end=>$end)
        Function: Fetch the indicated subsequence from the database
        Returns : The sequence string (not a Bio::PrimarySeq object!)
        Args    : see below
        Status  : ABSTRACT METHOD; MUST BE IMPLEMENTED BY ADAPTOR

       This is the back end for fetch_sequence(). Adaptors must implement this
       method in order to store and retrieve nucleotide or protein sequence.

   _seq_ids
        Title   : _seq_ids
        Usage   : @ids = $db->_seq_ids()
        Function: Return all sequence IDs contained in database
        Returns : list of sequence Ids
        Args    : none
        Status  : TO BE IMPLEMENTED BY ADAPTOR

       This method is invoked by seq_ids() to return all sequence IDs
       (coordinate systems) known to the database.

   _start_bulk_update,_finish_bulk_update
        Title   : _start_bulk_update, _finish_bulk_update
        Usage   : $db->_start_bulk_update
                  $db->_finish_bulk_update
        Function: Activate optimizations for large number of insertions/updates
        Returns : nothing
        Args    : nothing
        Status  : OPTIONAL; MAY BE IMPLEMENTED BY ADAPTOR

       These are the backends for start_bulk_update() and
       finish_bulk_update(). The default behavior of both methods is to do
       nothing.

   Optional methods needed to implement full TIEHASH interface
       The core TIEHASH interface will work if just the _store() and _fetch()
       methods are implemented. To support the full TIEHASH interface,
       including support for keys(), each(), and exists(), the following
       methods should be implemented:

       $id = $db->_firstid()
           Return the first primary ID in the database. Needed for the each()
           function.

       $next_id = $db->_nextid($id)
           Given a primary ID, return the next primary ID in the series.
           Needed for the each() function.

       $boolean = $db->_existsid($id)
           Returns true if the indicated primary ID is in the database. Needed
           for the exists() function.

       $db->_deleteid($id)
           Delete the feature corresponding to the given primary ID. Needed
           for delete().

       $db->_clearall()
           Empty the database. Needed for %tied_hash = ().

       $count = $db->_featurecount()
           Return the number of features in the database. Needed for scalar
           %tied_hash.
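
       To show how these hooks relate to Perl's tie mechanism, here is a
       self-contained sketch. Toy::Store is an invented in-memory class, not a
       BioPerl adaptor, and the exact mapping from tie methods to the
       underscore methods is an assumption based on the descriptions above.
       STORE and FETCH touch the hash directly to keep the example short.

```perl
use strict;
use warnings;

package Toy::Store;

# In-memory stand-in for an adaptor; not a real BioPerl class.
sub new {
    my $class = shift;
    return bless { rows => {} }, $class;
}

# The optional methods named in the list above.
sub _existsid     { my ($s, $id) = @_; return exists $s->{rows}{$id} }
sub _deleteid     { my ($s, $id) = @_; return delete $s->{rows}{$id} }
sub _clearall     { my ($s) = @_; %{ $s->{rows} } = (); return }
sub _featurecount { my ($s) = @_; return scalar keys %{ $s->{rows} } }

sub _firstid {
    my ($s) = @_;
    my ($first) = sort { $a <=> $b } keys %{ $s->{rows} };
    return $first;
}

sub _nextid {
    my ($s, $id) = @_;
    my ($next) = grep { $_ > $id } sort { $a <=> $b } keys %{ $s->{rows} };
    return $next;    # undef ends the iteration
}

# Perl's tie hooks, each delegating to one of the methods above.
sub TIEHASH  { my $class = shift; return $class->new }
sub STORE    { $_[0]{rows}{ $_[1] } = $_[2] }
sub FETCH    { $_[0]{rows}{ $_[1] } }
sub EXISTS   { $_[0]->_existsid($_[1]) }
sub DELETE   { $_[0]->_deleteid($_[1]) }
sub CLEAR    { $_[0]->_clearall }
sub FIRSTKEY { $_[0]->_firstid }
sub NEXTKEY  { $_[0]->_nextid($_[1]) }
sub SCALAR   { $_[0]->_featurecount }

package main;

tie my %h, 'Toy::Store';
$h{123} = 'feature one';
$h{124} = 'feature two';

print join(',', keys %h), "\n";                      # 123,124
print((exists $h{999} ? 'yes' : 'no'), "\n");        # no
delete $h{123};
print scalar(%h), "\n";                              # 1
%h = ();
print scalar(%h), "\n";                              # 0
```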

Internal Methods
       These methods are internal to Bio::DB::SeqFeature::Store and adaptors.

   new_instance
        Title   : new_instance
        Usage   : $db = $db->new_instance()
        Function: class constructor
        Returns : A descendent of Bio::DB::SeqFeature::Store
        Args    : none
        Status  : internal

       This method is called internally by new() to create a new uninitialized
       instance of Bio::DB::SeqFeature::Store. It is used internally and
       should not be called by application software.

   init
        Title   : init
        Usage   : $db->init(@args)
        Function: initialize object
        Returns : none
        Args    : Arguments passed to new()
        Status  : private

       This method is called internally by new() to initialize a newly-created
       object using the arguments passed to new(). It is to be overridden by
       Bio::DB::SeqFeature::Store adaptors.

   default_settings
        Title   : default_settings
        Usage   : $db->default_settings()
        Function: set up default settings for the adaptor
        Returns : none
        Args    : none
        Status  : private

       This method may be overridden by adaptors. It is responsible for
       setting up object default settings.

   default_serializer
        Title   : default_serializer
        Usage   : $serializer = $db->default_serializer
        Function: finds an available serializer
        Returns : the name of an available serializer
        Args    : none
        Status  : private

       This method returns the name of an available serializer module.

   setting
        Title   : setting
        Usage   : $value = $db->setting('setting_name' [=> $new_value])
        Function: get/set the value of a setting
        Returns : the value of the current setting
        Args    : the name of the setting and optionally a new value for the setting
        Status  : private

       This is a low-level procedure for persistently storing database
       settings. It can be overridden by adaptors.

   subfeatures_are_indexed
        Title   : subfeatures_are_indexed
        Usage   : $flag = $db->subfeatures_are_indexed([$new_value])
        Function: flag whether subfeatures are indexed
        Returns : a flag indicating that all subfeatures are indexed
        Args    : (optional) new value of the flag
        Status  : private

       This method is used internally by the Bio::DB::SeqFeature class to
       optimize some of its operations. It returns true if all of the
       subfeatures in the database are indexed; it returns false if at least
       one of the subfeatures is not indexed. Do not attempt to change the
       value of this setting unless you are writing an adaptor.

   subfeature_types_are_indexed
        Title   : subfeature_types_are_indexed
        Usage   : $flag = $db->subfeature_types_are_indexed
        Function: whether subfeatures are indexed by type
        Returns : a flag indicating that all subfeatures are indexed
        Args    : none
        Status  : private

       This method returns true if subfeature types are indexed. Default is to
       return the value of subfeatures_are_indexed().

   subfeature_locations_are_indexed
        Title   : subfeature_locations_are_indexed
        Usage   : $flag = $db->subfeature_locations_are_indexed
        Function: whether subfeatures are indexed by location
        Returns : a flag indicating that all subfeatures are indexed
        Args    : none
        Status  : private

       This method returns true if subfeature locations are indexed. Default
       is to return the value of subfeatures_are_indexed().

   setup_segment_args
        Title   : setup_segment_args
        Usage   : @args = $db->setup_segment_args(@args)
        Function: munge the arguments to the segment() call
        Returns : munged arguments
        Args    : see below
        Status  : private

       This method is used internally by segment() to translate positional
       arguments into named argument=>value pairs.

   store_and_cache
        Title   : store_and_cache
        Usage   : $success = $db->store_and_cache(@features)
        Function: store features into database and update cache
        Returns : number of features stored
        Args    : list of features
        Status  : private

       This private method stores the list of Bio::SeqFeatureI objects into
       the database and caches them in memory for retrieval.

   init_cache
        Title   : init_cache
        Usage   : $db->init_cache($size)
        Function: initialize the in-memory feature cache
        Returns : the Tie::Cacher object
        Args    : desired size of the cache
        Status  : private

       This method is used internally by new() to create the Tie::Cacher
       instance used for the in-memory feature cache.

   cache
        Title   : cache
        Usage   : $cache = $db->cache
        Function: return the cache object
        Returns : the Tie::Cacher object
        Args    : none
        Status  : private

       This method returns the Tie::Cacher object used for the in-memory
       feature cache.

   load_class
        Title   : load_class
        Usage   : $db->load_class($blessed_object)
        Function: loads the module corresponding to a blessed object
        Returns : empty
        Args    : a blessed object
        Status  : private

       This method is used by thaw() to load the code for a blessed object.
       This ensures that all the object's methods are available.

   freeze
        Title   : freeze
        Usage   : $serialized_object = $db->freeze($feature)
        Function: serialize a feature object into a string
        Returns : serialized feature object
        Args    : a seqfeature object
        Status  : private

       This method converts a Bio::SeqFeatureI object into a serialized form
       suitable for storage into a database. The feature's primary ID is set
       to undef before it is serialized. This avoids any potential mismatch
       between the primary ID used as the database key and the primary ID
       stored in the serialized object.

   thaw
        Title   : thaw
        Usage   : $feature = $db->thaw($serialized_object,$primary_id)
        Function: unserialize a string into a feature object
        Returns : Bio::SeqFeatureI object
        Args    : serialized form of object from freeze() and primary_id of object
        Status  : private

       This method is the reverse of freeze(). The supplied primary_id
       becomes the primary_id() of the returned Bio::SeqFeatureI object. This
       implementation checks for a deserialized object in the cache before it
       calls thaw_object() to do the actual deserialization.

   thaw_object
        Title   : thaw_object
        Usage   : $feature = $db->thaw_object($serialized_object,$primary_id)
        Function: unserialize a string into a feature object
        Returns : Bio::SeqFeatureI object
        Args    : serialized form of object from freeze() and primary_id of object
        Status  : private

       After thaw() checks the cache and comes up empty, this method is
       invoked to thaw the object.

   feature_names
        Title   : feature_names
        Usage   : ($names,$aliases) = $db->feature_names($feature)
        Function: get names and aliases for a feature
        Returns : an arrayref of names and an arrayref of aliases
        Args    : a Bio::SeqFeatureI object
        Status  : private

       This is an internal utility function which, given a Bio::SeqFeatureI
       object, returns two array refs. The first is a list of official names
       for the feature, and the second is a list of aliases. This is slightly
       skewed towards GFF3 usage, so the official names are the
       display_name(), plus all tag values named 'Name', plus all tag values
       named 'ID'. The aliases are all tag values named 'Alias'.
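
       The naming rule can be sketched on a toy data structure. The feature
       below is a plain hash with invented values, and toy_feature_names() is
       a hypothetical function written for this illustration; the real method
       operates on Bio::SeqFeatureI objects and may apply further filtering.

```perl
use strict;
use warnings;

# Toy feature: a display name plus GFF3-style tags (all values invented).
my $feature = {
    display_name => 'abc-1',
    tags         => {
        Name  => ['Abc1'],
        ID    => ['gene0001'],
        Alias => ['old-abc', 'xyz-7'],
    },
};

sub toy_feature_names {
    my ($f) = @_;

    # Official names: display name, then 'Name' tags, then 'ID' tags.
    my @names = grep { defined($_) } (
        $f->{display_name},
        @{ $f->{tags}{Name} || [] },
        @{ $f->{tags}{ID}   || [] },
    );

    # Aliases: every value of the 'Alias' tag.
    my @aliases = @{ $f->{tags}{Alias} || [] };

    return (\@names, \@aliases);
}

my ($names, $aliases) = toy_feature_names($feature);
print "names:   @$names\n";     # names:   abc-1 Abc1 gene0001
print "aliases: @$aliases\n";   # aliases: old-abc xyz-7
```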

BUGS
       This is an early version, so there are certainly some bugs. Please use
       the BioPerl bug tracking system to report bugs.

SEE ALSO
       Bio::DB::SeqFeature, Bio::DB::SeqFeature::Store::GFF3Loader,
       Bio::DB::SeqFeature::Segment, Bio::DB::SeqFeature::Store::DBI::mysql,
       Bio::DB::SeqFeature::Store::berkeleydb,
       Bio::DB::SeqFeature::Store::memory

AUTHOR
       Lincoln Stein <lsteinATcshl.org>.

       Copyright (c) 2006 Cold Spring Harbor Laboratory.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.



perl v5.12.2                                                 February 24, 2011

