How to cite Progenetix?

@mbaudis 2021-10-12: more ...

Tools: pgxRpi, an R Library to Access Progenetix Data

pgxRpi is an API wrapper package to access data from the Progenetix database. You can use it to

2021-09-16: more ...

Publication DB Criteria

Progenetix publication collection: Which scientific publications are included?

The Progenetix Publication DB contains articles describing whole genome screening (WGS, WES, aCGH, cCGH) experiments in cancer. Genomic information about the analyzed cancer samples is extracted from these publications to generate cancer mutation data, with a focus on copy number abnormalities (CNV / CNA).

  1. Whole-genome screening techniques: 

2021-06-17: more ...

Progenetix File Formats


Progenetix Segment Files .pgxseg

Progenetix uses a variation of a standard tab-separated columnar text file such as produced by array or sequencing CNV software, with an optional metadata header for e.g. plot or grouping instructions.

While the first edition was geared only towards sample-linked segment annotations, a variation is now provided for CNV frequencies.
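
A minimal reading sketch for such a file, assuming that metadata header lines start with `#` and that the first remaining line holds the column names. Neither detail is stated above, and the column names and values in the sample are purely illustrative, not taken from a real Progenetix file.

```python
import csv
import io

def read_pgxseg(text):
    """Split a .pgxseg-style text into metadata lines and data rows.

    Assumptions (not confirmed by the text above): metadata lines start
    with '#', and the first non-metadata line is a tab-separated header.
    """
    meta, body = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        if line.startswith("#"):
            meta.append(line.lstrip("#").strip())
        else:
            body.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(body)), delimiter="\t"))
    return meta, rows

# Hypothetical content for illustration only.
example = (
    "# plot_title=demo\n"
    "sample_id\tchro\tstart\tend\tvalue\n"
    "sample-1\t8\t100\t200\t0.5\n"
    "sample-1\t9\t300\t400\t-0.4\n"
)

meta, rows = read_pgxseg(example)
print(meta)
print(rows[0]["chro"], rows[0]["start"], rows[0]["end"])
```

Values are returned as strings; converting positions to integers is left to the caller because the actual column set may differ.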

@mbaudis 2021-04-16: more ...

Bycon - a Python-based environment for the Beacon v2 genomics API

The bycon project provides a combination of a Beacon-protocol based API with additional API services, used as backend and middleware for the Progenetix resource.

bycon has been developed to support Beacon protocol development following earlier implementations of Beacon+ (“beaconPlus”) with now-deprecated Perl libraries. The work tightly integrates with the ELIXIR Beacon project.

2021-04-16: more ...

Beacon: Variants

This endpoint is mostly aimed at providing variants Beacon functionality. The app uses the same query processing mechanism as the other bycon applications.

2021-04-16: more ...

Beacon: Biosamples

This endpoint is mostly aimed at providing biosamples handover functionality. However, the app uses the same query processing mechanism as the main byconplus application.

2021-04-16: more ...

Progenetix Image Formats

The standard format for (plot-)images generated on Progenetix is Scalable Vector Graphics (SVG). As the name implies, SVG is scalable, i.e. images can be scaled up without losing quality or expanding in storage size. However, some of the generated images also use embedded rastered components which will deteriorate during scaling; this is e.g. the case for array probe plots.

According to Wikipedia

All major modern web browsers—including Mozilla Firefox, Internet Explorer, Google Chrome, Opera, Safari, and Microsoft Edge—have SVG rendering support

On most pages where plots are displayed there is a download option for the images (please alert us where those are missing). Browsers can also export SVGs themselves, e.g. as PDF.

@mbaudis 2021-02-12: more ...

Progenetix Source Code

With the exception of some utility scripts and external dependencies (e.g. MongoDB), the following projects provide the vast majority of the software (from database interaction to website) behind Progenetix and Beacon+.

  • bycon
    • Python-based service built on the GA4GH Beacon protocol
    • software powering the Progenetix resource
    • Beacon+ implementation(s) use the same code base
  • progenetix-web
    • website for Progenetix and its Beacon+ implementations
    • provides Beacon interfaces for the bycon server, as well as other Progenetix services (e.g. the publications repository)
    • implemented as a React / Next.js project
  • PGX
    • a Perl library providing utility functions for Progenetix CNV data
    • used for data transformation, e.g. binning of segmental CNV data
    • main purpose now is providing the various plots (CNV histograms, clustered CNV profiles, array plots)

Additional Projects

  • icdot2uberon
    • mappings between ICD-O 3 topographies and UBERON anatomical sites
  • ICDOntologies
    • mappings between ICD-O 3 morphology / topography pairs and NCIt neoplasm core cancer ontology

2021-02-06: more ...


Bycon Services

The bycon environment provides a number of data services which make use of resources in the Progenetix environment. Please refer to their specific documentation.

Note: As of 2021-04-07 there are some changes - typical Beacon endpoints such as biosamples have been moved to the /beacon/__service-name__ path:

services.py and URL Mapping

The service URL format progenetix.org/services/__service-name__?parameter=value is a shorthand for progenetix.org/cgi-bin/bycon/services/__service-name__.py?parameter=value.
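
The mapping can be sketched as plain string construction. This only illustrates the two URL forms described above; the function name `expand_service_url` is hypothetical, and the actual rewriting happens on the server.

```python
from urllib.parse import urlencode

BASE = "progenetix.org"

def expand_service_url(service_name, **params):
    """Return the (shorthand, expanded) URL forms for a service call."""
    query = urlencode(params)
    suffix = f"?{query}" if query else ""
    shorthand = f"{BASE}/services/{service_name}{suffix}"
    expanded = f"{BASE}/cgi-bin/bycon/services/{service_name}.py{suffix}"
    return shorthand, expanded

short, full = expand_service_url("cytomapper", cytoBands="8q24.1")
print(short)
print(full)
```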

2020-10-20: more ...

Services: Schemas


2020-10-20: more ...

Services: Geolocations


This service provides geographic location mapping for cities above 25’000 inhabitants (~22750 cities), through either:

  • matching of the (start-anchored) name
  • providing GeoJSON compatible parameters:
    • geolongitude
    • geolatitude
    • geodistance
      • optional, in meters; a default of 10’000m (10km) is provided
      • can be used for e.g. retrieving all places (or data from places if used with publication or sample searches) in an approximate region (e.g. for Europe using 2500000 around Heidelberg…)
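
A minimal sketch of building such a query from the three parameters listed above. The service path `services/geolocations` is assumed from the section title and the general service URL format, and the Heidelberg coordinates are approximate values chosen for illustration.

```python
from urllib.parse import urlencode

def geolocations_query(geolongitude, geolatitude, geodistance=None):
    """Build a geolocations query URL.

    geodistance is in meters; when omitted, the service default applies.
    The URL path is an assumption, not confirmed by the text above.
    """
    params = {"geolongitude": geolongitude, "geolatitude": geolatitude}
    if geodistance is not None:
        params["geodistance"] = int(geodistance)
    return "progenetix.org/services/geolocations?" + urlencode(params)

# Approximate coordinates of Heidelberg with a 2500 km radius.
url = geolocations_query(8.69, 49.41, 2_500_000)
print(url)
```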

2020-10-20: more ...

Services: Genespans

  • genomic mappings of gene coordinates
  • initially limited to GRCh38 and overall CDS extension
  • responds to (start-anchored) text input of HUGO gene symbols using the geneId parameter
  • returns a list of matching gene objects (see below under Response Formats)
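
Start-anchored matching means that the input has to match the beginning of a gene symbol. The sketch below illustrates the idea on a small local list of symbols; the function name is hypothetical, and the case-insensitive comparison is an assumption rather than documented service behaviour.

```python
def match_gene_symbols(query, symbols):
    """Return all symbols that start with the query text.

    Case-insensitive comparison is an assumption made for this sketch.
    """
    q = query.upper()
    return [s for s in symbols if s.upper().startswith(q)]

symbols = ["MYC", "MYCN", "TP53", "TP53BP1", "TP63"]

hits = match_gene_symbols("TP53", symbols)
print(hits)
```

A query such as "P53" returns nothing here, because the match is anchored at the start of the symbol.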

2020-10-20: more ...

Services: Cytomapper

cytomapper Service

This service parses either:

  • a properly formatted cytoband annotation (assemblyId, cytoBands)
    • “8”, “9p11q21”, “8q”, “1p12qter”
  • a concatenated chroBases parameter
    • 7:23028447-45000000
    • X:99202660
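
The two input styles listed above can be taken apart with regular expressions. This is only a sketch of the input shapes, assuming chromosomes 1-22, X and Y and bands written as in the examples; it is not the parser used by the service, and all names are hypothetical.

```python
import re

# chromosome:start[-end], e.g. "7:23028447-45000000" or "X:99202660"
CHROBASES_RE = re.compile(r"^(?:chr)?(\d{1,2}|X|Y):(\d+)(?:-(\d+))?$")

# chromosome plus up to two band markers, e.g. "8", "8q", "9p11q21", "1p12qter"
BAND = r"[pq](?:ter|cen|\d+(?:\.\d+)?)?"
CYTOBANDS_RE = re.compile(rf"^(?:chr)?(\d{{1,2}}|X|Y)({BAND})?({BAND})?$")

def parse_chrobases(value):
    """Split a chroBases string into chromosome, start and optional end."""
    m = CHROBASES_RE.match(value)
    if m is None:
        raise ValueError(f"not a chroBases value: {value!r}")
    chro, start, end = m.groups()
    return {"chro": chro, "start": int(start), "end": int(end) if end else None}

def parse_cytobands(value):
    """Split a cytoBands string into chromosome and up to two band markers."""
    m = CYTOBANDS_RE.match(value)
    if m is None:
        raise ValueError(f"not a cytoBands value: {value!r}")
    chro, first, second = m.groups()
    return {"chro": chro, "first_band": first, "second_band": second}

print(parse_chrobases("7:23028447-45000000"))
print(parse_cytobands("1p12qter"))
```

Missing parts are returned as `None`; how the service interprets a single position or a whole chromosome arm is not covered by this sketch.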

While the return object is JSON by default, specifying text=1 together with the cytoBands or chroBases parameter returns the text version of the opposite annotation, i.e. base coordinates for a cytoBands query and cytobands for a chroBases query.

There is a fallback to GRCh38 if no assembly is being provided.

The cytobands and chrobases parameters can be used for running the script on the command line (see examples below). Please be aware of the “chrobases” (command line) versus “chroBases” (cgi) use.



As in other bycon services, API responses are in JSON format with the main content being contained in the data field.

As of v2020-09-29, the ChromosomeLocation response is compatible with the GA4GH VRS standard. The GenomicLocation object is a wrapper around a VRS SimpleInterval.

  {
    "data": {
      "ChromosomeLocation": {
        "chr": "8",
        "interval": {
          "end": "q24.13",
          "start": "q24.11",
          "type": "CytobandInterval"
        },
        "species_id": "taxonomy:9606",
        "type": "ChromosomeLocation"
      },
      "GenomicLocation": {
        "chr": "8",
        "interval": {
          "end": 127300000,
          "start": 117700000,
          "type": "SimpleInterval"
        },
        "species_id": "taxonomy:9606",
        "type": "GenomicLocation"
      },
      "info": {
        "bandList": [...],
        "chroBases": "8:117700000-127300000",
        "cytoBands": "8q24.11q24.13",
        "referenceName": "8",
        "size": 9600000
      }
    },
    "errors": [],
    "parameters": {
      "assemblyId": "NCBI36.1",
      "cytoBands": "8q24.1"
    },
    "response_type": "cytomapper",
    "warnings": []
  }

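A short sketch of reading the data field of such a response. The dictionary below copies only the GenomicLocation and info values from the example; the function name is hypothetical. It shows that the reported size equals end minus start of the interval.

```python
# Subset of the example response, as a Python dict.
response = {
    "data": {
        "GenomicLocation": {
            "chr": "8",
            "interval": {
                "end": 127300000,
                "start": 117700000,
                "type": "SimpleInterval",
            },
        },
        "info": {"chroBases": "8:117700000-127300000", "size": 9600000},
    },
    "errors": [],
    "warnings": [],
}

def summarize_location(resp):
    """Return the chroBases string and interval size from the data field."""
    loc = resp["data"]["GenomicLocation"]
    interval = loc["interval"]
    size = interval["end"] - interval["start"]
    chro_bases = f'{loc["chr"]}:{interval["start"]}-{interval["end"]}'
    return chro_bases, size

chro_bases, size = summarize_location(response)
print(chro_bases, size)
```
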
2020-10-20: more ...

Services: collations

  • provides access to information about data “subsets” in the Progenetix project databases
    • typically aggregations of samples sharing an ontology code (e.g. NCIT) or external identifier (e.g. PMID)
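
The idea of such subsets can be sketched as grouping sample records by the codes they carry. The record layout, field names and identifiers below are placeholders invented for illustration; they do not reflect the Progenetix database schema.

```python
from collections import defaultdict

def collate(samples):
    """Group sample ids by every code or identifier attached to a sample."""
    groups = defaultdict(list)
    for sample in samples:
        for code in sample["codes"]:
            groups[code].append(sample["id"])
    return dict(groups)

# Placeholder records; the codes are not real ontology terms or PMIDs.
samples = [
    {"id": "sample-1", "codes": ["NCIT:example-A", "PMID:example-1"]},
    {"id": "sample-2", "codes": ["NCIT:example-A", "PMID:example-2"]},
    {"id": "sample-3", "codes": ["NCIT:example-B", "PMID:example-1"]},
]

subsets = collate(samples)
print(subsets["NCIT:example-A"])
```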

2020-10-20: more ...

Legacy API Documentation

The current implementation of the Progenetix APIs uses 2 URL endpoints:

Legacy API

  • progenetix.org/api/ has been deprecated
    • based on the original Perl libraries
    • stepwise deprecation in favour of services
    • documentation removed

2020-06-20: more ...

Beaconplus Data / Query Model

The Progenetix / Beaconplus query model utilises the GA4GH core data model for genomic and (biomedical, procedural) queries and data delivery.

2020-05-26: more ...


Progenetix project Namespace

Since July 2017, the Progenetix project has a registered namespace prefix at EBI’s identifiers.org.

This prefix will gradually be used to implement a REST syntax with stable endpoints for all Progenetix resources (e.g. arraymap.org, cnvar.org …), as well as to guide to our code mapping resources.

Michael Baudis 2018-05-10: more ...