Ramachandran Plots

PGD provides a pseudo-Ramachandran plot. The plot uses square bins rather than a more free-flowing representation, which allows graphs to be generated more quickly.

The Plot page uses Django querysets that make use of [SQL Aggregates]. To make the data processing easier to explain, this page refers to the SQL generated by the querysets.

Data Selection

The Ramachandran plot is generated using a specialized SQL query to group data points into bins.

1. The minimum value is subtracted from each coordinate, and the result is divided by BIN_SIZE (10 in the example query) and rounded down (FLOOR). This calculates the bin coordinate for each residue.
2. GROUP BY is applied to both the X and Y coordinates. This sorts the residues into a grid.

select FLOOR((phi-PHI_MIN)/10) as X, FLOOR((psi-PSI_MIN)/10) as Y from pgd_core_residue GROUP BY X, Y

Subtracting the minimum value from the coordinate shifts the start of the bins to the minimum value, so the first bin is always the same size as the other bins. The last bin may be a different size if BIN_SIZE does not divide MAX - MIN evenly.
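The bin calculation can be sketched outside of SQL as follows. This is a minimal illustration, not PGD code: the function name and the limits of -180 and 180 are assumptions chosen for the example.

```python
import math

def bin_index(value, minimum, bin_size):
    """Return the zero-based bin index, mirroring FLOOR((value - MIN) / BIN_SIZE)."""
    return math.floor((value - minimum) / bin_size)

# Assumed illustrative limits; not necessarily PGD defaults.
PHI_MIN, PHI_MAX, BIN_SIZE = -180, 180, 10

first = bin_index(-180, PHI_MIN, BIN_SIZE)    # the first bin starts at PHI_MIN
middle = bin_index(-63.5, PHI_MIN, BIN_SIZE)
last = bin_index(179.9, PHI_MIN, BIN_SIZE)

# With a bin size that does not divide the range evenly, the last bin is narrower.
odd_size = 25
last_bin = bin_index(PHI_MAX - 0.001, PHI_MIN, odd_size)
last_bin_width = (PHI_MAX - PHI_MIN) - last_bin * odd_size
```

With a bin size of 25 over a range of 360, the last bin covers only the remaining 10 degrees.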

Statistics Calculation

Once records are selected, statistical calculations are performed depending on the input. By default, the z-axis of the Ramachandran plot displays observations, that is, the count of residues in each bin.

select Count(*) as count, FLOOR((phi-PHI_MIN)/10) as X, FLOOR((psi-PSI_MIN)/10) as Y from pgd_core_residue GROUP BY X, Y

Optionally, the z-axis can instead display the average of a user-selected attribute (a1 in the query below). The query also returns the standard deviation for each bin.

select AVG(a1) as avg, STDDEV(a1) as stddev, FLOOR((phi-PHI_MIN)/10) as X, FLOOR((psi-PSI_MIN)/10) as Y from pgd_core_residue GROUP BY X, Y
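The grouping and aggregation performed by the queries above can be approximated in plain Python. This is a sketch under stated assumptions: the function name and the sample rows are made up, and the population standard deviation is used here, whereas STDDEV returns the sample standard deviation in some databases.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def bin_statistics(residues, phi_min, psi_min, bin_size):
    """Group (phi, psi, value) rows by bin and return count, avg and stddev per bin."""
    groups = defaultdict(list)
    for phi, psi, value in residues:
        key = (math.floor((phi - phi_min) / bin_size),
               math.floor((psi - psi_min) / bin_size))
        groups[key].append(value)
    return {
        key: {"count": len(vals), "avg": mean(vals), "stddev": pstdev(vals)}
        for key, vals in groups.items()
    }

# Made-up rows: (phi, psi, attribute value). The first two fall into the same bin.
rows = [(-63.0, -43.0, 1.0), (-61.0, -47.0, 3.0), (-120.0, 130.0, 2.0)]
stats = bin_statistics(rows, -180, -180, 10)
```

Each key of the resulting dictionary is an (X, Y) bin coordinate, matching one row of the grouped SQL result.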

Coloring

Graphs are colored to represent values on a z-axis.

  • Reference - The value from which distance is measured when determining colors. It is used to shift the focus to a specific value. By default, the reference is the mean.
  • Outlier Sigma - The number of sigma (standard deviations) beyond which values are considered outliers. The standard deviation is not recalculated with outliers excluded, but outliers are excluded when calculating the range of colors. This keeps the range of colors from spreading too widely because of outliers far from the mean.
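The effect of Outlier Sigma on the color range can be illustrated with a short sketch. The function name and the sample values are hypothetical, and the exact comparison PGD uses is an assumption; the sketch only shows that a far outlier no longer stretches the range.

```python
from statistics import mean, pstdev

def color_range(values, reference, stddev, outlier_sigma):
    """Return (low, high) of the values used for coloring, ignoring outliers."""
    limit = outlier_sigma * stddev
    # Keep only values within outlier_sigma standard deviations of the reference.
    kept = [v for v in values if abs(v - reference) <= limit]
    return min(kept), max(kept)

# Made-up bin values with one far outlier (100).
values = [10, 12, 11, 13, 9, 100]
reference = mean(values)   # the default reference is the mean
stddev = pstdev(values)    # computed once, outliers included

low, high = color_range(values, reference, stddev, outlier_sigma=2)
```

Here the value 100 lies more than two standard deviations from the mean, so the color range covers 9 to 13 only.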

Logarithmic Scale

PGD applies a logarithmic scale to all bins. Logarithmically scaled colors allow a greater range of color choices to be applied close to the Reference. As values approach the extent of OUTLIER_SIGMA * SIGMA, the number of colors to choose from decreases. This makes differences between values close to the mean more apparent.
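The exact formula is not given on this page. One possible form, shown purely as an assumption, maps the distance from the reference onto 0 to 1 with a logarithm, so that steps near the reference are larger than steps near the outlier limit. The function name and the use of log1p are illustrative choices, not PGD's implementation.

```python
import math

def log_scale(value, reference, max_distance):
    """Map distance from the reference onto 0..1, with finer resolution near the reference.

    max_distance stands for OUTLIER_SIGMA * SIGMA and must be greater than 0.
    """
    # Values beyond the outlier limit are clamped to the limit.
    distance = min(abs(value - reference), max_distance)
    return math.log1p(distance) / math.log1p(max_distance)

near_step = log_scale(6, 5, 10) - log_scale(5, 5, 10)     # one unit near the reference
far_step = log_scale(15, 5, 10) - log_scale(14, 5, 10)    # one unit near the limit
```

With this form, a one-unit change next to the reference moves the scale much further than the same change near the limit, which is the behaviour described above.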

Color Ranges

Plots are colored using a predefined tuple of RGB values:

  • Max value
  • Adjustment value - Adds a minimum value to the color. For example, the Blue plot adds 75 to all blue values, shifting all colors toward a blue hue.

Algorithm

1. The logarithmic scale is calculated first, producing a number from 0 to 1.
2. This number is multiplied by each value in the RGB max tuple.
3. Each adjustment value in the RGB adjustment tuple is added to the corresponding RGB value.
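Steps 2 and 3 can be sketched as follows, assuming the scale from step 1 is already available. The tuple values are hypothetical (only the adjustment of 75 for blue comes from the text above), and clamping to 255 is an added safeguard rather than a documented PGD behaviour.

```python
def bin_color(scale, rgb_max, rgb_adjust):
    """Scale each channel maximum by a 0..1 value, then add the channel adjustment."""
    return tuple(
        min(255, int(scale * maximum) + adjust)  # clamp is an assumption
        for maximum, adjust in zip(rgb_max, rgb_adjust)
    )

# Hypothetical tuples for a blue plot.
BLUE_MAX = (180, 180, 180)
BLUE_ADJUST = (0, 0, 75)

lowest = bin_color(0.0, BLUE_MAX, BLUE_ADJUST)
middle_color = bin_color(0.5, BLUE_MAX, BLUE_ADJUST)
highest = bin_color(1.0, BLUE_MAX, BLUE_ADJUST)
```

Because the adjustment is added after scaling, even a scale of 0 yields a color with a blue component of 75.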