The BeaconPlus environment uses the Beacon protocol for federated genomic variant queries, extended by methods discussed in the Beacon API development and by custom extensions which may, or may not, make it into the Beacon specification but help to increase the usability of the Progenetix resource.
While BeaconPlus in principle follows the Beacon specification and offers a minimal method to access it, this option isn’t used in practice due to the “forward looking” nature of some of the BeaconPlus methods.
The query string is deparsed into a hash reference, the “$query” object, following this convention:
key=val1&key=val2,val3
would be deparsed to
key = [val1, val2, val3]
The treatment of each attribute as pointing to an array leads to a consistent, though sometimes awkward, access to the values; even consistently unary values have to be addressed as (first) array element.
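The convention above can be illustrated with a minimal Python sketch. It is not the BeaconPlus code itself (which works on a Perl hash reference); the function name parse_query_string and the assemblyId example parameter are assumptions made for illustration only.

```python
from urllib.parse import parse_qsl


def parse_query_string(query_string):
    """Collect every value of a key into one list, splitting on commas."""
    query = {}
    for key, raw_value in parse_qsl(query_string):
        # "val2,val3" contributes two separate values to the same key
        values = [v for v in raw_value.split(",") if v]
        query.setdefault(key, []).extend(values)
    return query


query = parse_query_string("key=val1&key=val2,val3&assemblyId=GRCh38")
print(query["key"])            # ['val1', 'val2', 'val3']
print(query["assemblyId"][0])  # a unary value is still read as the first element
```

Because every key maps to a list, downstream code never has to test whether a value is a scalar or an array, at the price of the [0] access for single values.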
An API request is converted in two stages:
API shortcuts are resolved; i.e. requests requiring a specific database and/or collection may have pre-defined api_shortcuts values to allow the use of simple canonical URIs, which at this stage are being expanded to the full
The API request is split & mapped to standard query parameters.
In the configuration file, the root attribute scopes contains the definitions of the different “scopes” (essentially the different data collections) and which query parameters can be applied to them. These definitions also provide
For each of the scopes, the pre-defined possible parameters are evaluated for corresponding values in the object generated from parsing the query string. If matching values are found, they are added to the pre-formatted query parameter object for the corresponding scope. Those scoped parameter objects are then processed depending on the type of query (e.g. “variants” queries are processed differently from “biosamples” queries; see below).
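The mapping of parsed values onto scopes could be sketched as follows. This is a hedged illustration in Python, not the actual implementation: the SCOPES structure, its "parameters" key, the listed parameter names and the function name scope_parameters are all hypothetical stand-ins for the real configuration file.

```python
# Hypothetical stand-in for the "scopes" root attribute of the configuration file.
SCOPES = {
    "variants": {"parameters": ["referenceName", "start", "end", "variantType"]},
    "biosamples": {"parameters": ["biocharacteristics"]},
}


def scope_parameters(query, scopes=SCOPES):
    """Copy matching values from the parsed query into one object per scope."""
    scoped = {}
    for scope, definition in scopes.items():
        scoped[scope] = {}
        for name in definition["parameters"]:
            values = query.get(name)
            if values:  # only parameters that actually carry values are kept
                scoped[scope][name] = list(values)
    return scoped


query = {"start": ["1000"], "variantType": ["DUP"], "unknown": ["x"]}
scoped = scope_parameters(query)
print(scoped["variants"])
print(scoped["biosamples"])
```

Parameters that are not declared for any scope (here "unknown") are silently ignored in this sketch; whether the real code does the same is not stated in the text.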
The norm_variant_params function creates intervals for variant (“BeaconAlleleRequest”) queries by interpolating all “start” and “end” parameters. This is done greedily, i.e. allowing for incorrect submission order and a mix of e.g. “startMax” and “startMin” parameter types. The decision whether a query with such a mix should be rejected is handled elsewhere.
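One way to read the "greedy" interpolation is that all start-type values are pooled and reduced to their minimum and maximum, and likewise for all end-type values. The Python sketch below follows that reading. It is an assumption about the behaviour rather than a port of norm_variant_params; the function name normalize_positions and the output keys start_range and end_range are hypothetical.

```python
START_KEYS = ("start", "startMin", "startMax")
END_KEYS = ("end", "endMin", "endMax")


def _collect(query, keys):
    """Pool the values of all given keys as sorted integers."""
    return sorted(int(v) for k in keys for v in query.get(k, []))


def normalize_positions(query):
    """Build [min, max] intervals regardless of parameter order or type mix."""
    starts = _collect(query, START_KEYS)
    ends = _collect(query, END_KEYS)
    intervals = {}
    if starts:
        intervals["start_range"] = [starts[0], starts[-1]]
    if ends:
        intervals["end_range"] = [ends[0], ends[-1]]
    return intervals


# "startMax" submitted before "startMin", plus a single "end" value
print(normalize_positions({"startMax": ["2000"], "startMin": ["1000"], "end": ["5000"]}))
```

Sorting the pooled values is what makes the result independent of submission order; a single value simply yields an interval whose two bounds are equal.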
The outputs of the routine are:
If variantType: "SNV" is specified without an alternateBases value, the wildcard “N” value is inserted.
This query type is based on the assumption that a query consisting of
end parameter (also no
… aims to detect any CNV of the given type overlapping the
---------s-----------------------
------+++++++++++++++++++++++++++
++++++++++-----------------------
-------+++++++-------------------
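The overlap idea can be expressed as a MongoDB-style filter: a stored CNV overlaps the queried region when it starts before the region ends and ends after the region starts. The following Python sketch assumes half-open intervals and uses hypothetical field names (variant_type, start, end) as well as hypothetical function names; it is not taken from the BeaconPlus code.

```python
def overlap_query(variant_type, query_start, query_end):
    """MongoDB-style filter for variants overlapping [query_start, query_end)."""
    return {
        "variant_type": variant_type,
        "start": {"$lt": query_end},
        "end": {"$gt": query_start},
    }


def overlaps(variant, query_start, query_end):
    """The same test in plain Python, for checking the logic without a database."""
    return variant["start"] < query_end and variant["end"] > query_start


# A single position s = 9 is treated as the interval [9, 10).
stored = [
    {"start": 6, "end": 33},
    {"start": 0, "end": 10},
    {"start": 7, "end": 14},
    {"start": 12, "end": 20},
]
print([overlaps(v, 9, 10) for v in stored])  # [True, True, True, False]
print(overlap_query("DUP", 9, 10))
```

The first three stored intervals cover position 9 and the last one does not, so only the last is excluded.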
This function handles the generation of the variant query for “precise” variants (i.e. those annotated with alternateBases, but including
TODO: Split off the truly precise queries with a single “start” positional parameter
Queries with multiple options for the same attribute are treated as logical “OR”.
The automatic Boolean query logic follows:
For biocharacteristics there is one exception: Query values for icdot are connected with AND, even though they target the same scope. This is due to the assumption that one may want to subset samples of a given morphology by topography (and vice versa), and that ICD-O M + T can also be mapped to single ontologies like NCIt.
The current code just looks for the co-existence of
values & then constructs some fancy “$or” and “$and” request to MongoDB.
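A rough Python sketch of this Boolean logic is given below. It rests on several assumptions that are not confirmed by the text: that topography codes can be recognised by an "icdot" prefix, that they form their own OR group which is then AND-ed with the OR group of all other values, and that all values target one field (here the hypothetical biocharacteristics.type.id). The function names and example codes are illustrative only.

```python
def or_group(field, values):
    """Several values for one attribute become a "$or"; a single value stays plain."""
    clauses = [{field: v} for v in values]
    return clauses[0] if len(clauses) == 1 else {"$or": clauses}


def and_join(groups):
    """Independent groups are combined with "$and"."""
    groups = [g for g in groups if g]
    if not groups:
        return {}
    return groups[0] if len(groups) == 1 else {"$and": groups}


def biocharacteristics_query(values, field="biocharacteristics.type.id"):
    """OR within a group, but AND between icdot codes and everything else."""
    topography = [v for v in values if v.startswith("icdot")]
    others = [v for v in values if not v.startswith("icdot")]
    groups = []
    if others:
        groups.append(or_group(field, others))
    if topography:
        groups.append(or_group(field, topography))
    return and_join(groups)


print(biocharacteristics_query(["icdom-85003", "icdot-C50.9"]))
print(biocharacteristics_query(["ncit:C4017", "ncit:C3262"]))
```

With a morphology and a topography code the sketch yields a "$and" of two clauses, which subsets one by the other; with two codes of the same kind it yields a plain "$or".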
The construction of the query object depends on the detected parameters: