Clustering

The Progenetix CNV visualization applies default clustering to sample CNV profiles and to collation CNV frequencies.

The resource uses a standard CNV binning model with (in GRCh38) 3102 intervals of 1 Mb default size. For the clustering of both status and frequency values, gain and loss intervals are treated independently. This results in 2 × 3102 = 6204 values per sample (or collation), which form one row of the clustering matrix.
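The construction of such a matrix can be illustrated with a minimal Python sketch. It is not the PGX implementation: the function names, the input dictionary layout, and the column order (all gain bins first, then all loss bins) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

N_BINS = 3102  # number of genome intervals stated in the text (GRCh38, 1 Mb bins)


def clustering_row(gains: List[float], losses: List[float]) -> List[float]:
    """Concatenate per-bin gain and loss values into one clustering vector.

    Assumption: gains come first, losses second; the real column order
    used by the PGX library is not given in this document.
    """
    if len(gains) != N_BINS or len(losses) != N_BINS:
        raise ValueError(f"expected {N_BINS} values per track")
    return list(gains) + list(losses)


def clustering_matrix(
    profiles: Dict[str, Dict[str, List[float]]]
) -> Tuple[List[str], List[List[float]]]:
    """Return (labels, matrix) with one row of 2 * N_BINS values per profile."""
    labels = sorted(profiles)
    matrix = [
        clustering_row(profiles[key]["gain"], profiles[key]["loss"])
        for key in labels
    ]
    return labels, matrix


# Hypothetical toy input: one sample with a gain in the first bin,
# one sample with a loss in the first bin.
toy_profiles = {
    "sample_a": {"gain": [1.0] + [0.0] * (N_BINS - 1), "loss": [0.0] * N_BINS},
    "sample_b": {"gain": [0.0] * N_BINS, "loss": [1.0] + [0.0] * (N_BINS - 1)},
}
toy_labels, toy_matrix = clustering_matrix(toy_profiles)
```

Because gains and losses occupy separate columns, a gain and a loss in the same genomic bin count as different features rather than cancelling each other out.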

On the website, sample and/or collation clustering is invoked automatically on the “Data Visualization” page, which can be accessed from the search results. Both clustering and CNV visualization are performed through the PGX Perl library, which uses the Algorithm::Cluster interface to the C cluster library with default parameters.

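For visualization, clustering is performed on the matrix with separate values for gains and losses, while the plotting itself is performed on the original data. The clustering result therefore determines only the order of the rows and the attached tree, not the plotted values.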
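The separation between clustering input and plotted data can be sketched as follows. This is a small pure-Python stand-in, not the Algorithm::Cluster call used by PGX. Euclidean distance and complete linkage are assumptions chosen for the illustration, since the actual default parameters are not listed here. The four-bin toy records and the `leaf_order` helper are likewise hypothetical.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def leaf_order(matrix: Sequence[Sequence[float]]) -> List[int]:
    """Agglomerative clustering; returns row indices in tree (leaf) order.

    Assumption: complete linkage on Euclidean distances. Naive
    implementation, suitable only for small illustrative inputs.
    """
    # Each cluster is the ordered list of the row indices it contains.
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance of the two farthest members.
                dist = max(
                    euclidean(matrix[p], matrix[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0] if clusters else []


# Hypothetical original records with only 4 bins each, for readability.
original = [
    {"id": "s1", "gain": [1, 1, 0, 0], "loss": [0, 0, 0, 0]},
    {"id": "s2", "gain": [0, 0, 0, 0], "loss": [0, 0, 1, 1]},
    {"id": "s3", "gain": [1, 1, 0, 0], "loss": [0, 0, 0, 1]},
]

# Clustering uses the concatenated gain + loss vectors ...
matrix = [rec["gain"] + rec["loss"] for rec in original]
order = leaf_order(matrix)

# ... while plotting would iterate over the untouched original records.
plot_rows = [original[i] for i in order]
```

In this toy case `s1` and `s3` share the same gains and end up next to each other, while the records passed on for plotting are the unchanged originals in their new order.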

Clustering can be omitted by setting Cluster Tree Width to 0.

More information is available in the Use Cases category.

Michael Baudis  2021-06-22