It is not always easy to identify compounds that may show insightful SAR, especially when you are dealing with a dataset you are not familiar with.
When creating a new SAR Slide based on MMPs, you have access to a list of compounds (limited to 2000 molecules) ranked by their potential to provide insightful SAR information. Every compound in your dataset is assigned a score based on various MMP-derived properties. We describe the methodology we use in more detail here.

MMPs in scope
The properties used in the scoring function are based on transformations (MMPs) identified in the dataset. To exclude pairs that would add noise and to focus on what is relevant for SAR analysis, we apply the following rules to derive the set of MMPs used to compute properties and score compounds:
- We skip transformations in which the constant (core) part is smaller, in number of atoms, than the changing (fragment) part.
- We only consider fragments with up to 15 atoms.
- We only consider transformations in which the two fragments differ by at most 8 heavy atoms.
 
All descriptors described below and used in the scoring scheme are based on this initial set of MMPs. This means that, for example, the number of single cut MMPs shown in the table only counts MMPs that satisfy the rules defined above. The actual number of MMPs that we store in our system, regardless of the nature of the transformation, is typically much higher.
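
The three rules above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the implementation used in the product: the `Transformation` class, its field names and the `in_scope` function are hypothetical, the core is assumed to be compared against the larger of the two fragments, and all atom counts are assumed to be heavy-atom counts.

```python
from dataclasses import dataclass

MAX_FRAGMENT_ATOMS = 15    # rule 2: fragments with up to 15 atoms
MAX_HEAVY_ATOM_DELTA = 8   # rule 3: fragments differ by at most 8 heavy atoms


@dataclass(frozen=True)
class Transformation:
    """Hypothetical record for one MMP transformation."""
    core_atoms: int        # atoms in the constant (core) part
    from_frag_atoms: int   # atoms in the fragment of the first molecule
    to_frag_atoms: int     # atoms in the fragment of the second molecule


def in_scope(t: Transformation) -> bool:
    """Return True if the transformation passes the three scope rules."""
    largest = max(t.from_frag_atoms, t.to_frag_atoms)
    # Rule 1 (assumption: compare the core with the larger fragment).
    if t.core_atoms < largest:
        return False
    # Rule 2: neither fragment may exceed the size limit.
    if largest > MAX_FRAGMENT_ATOMS:
        return False
    # Rule 3: the size difference between the fragments is bounded.
    return abs(t.from_frag_atoms - t.to_frag_atoms) <= MAX_HEAVY_ATOM_DELTA


candidates = [
    Transformation(core_atoms=20, from_frag_atoms=3, to_frag_atoms=5),
    Transformation(core_atoms=4, from_frag_atoms=6, to_frag_atoms=1),
]
mmps_in_scope = [t for t in candidates if in_scope(t)]
```

Only the first candidate is kept here: in the second one the core (4 atoms) is smaller than the changing fragment (6 atoms).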

Descriptors
We provide a set of descriptors that we think are useful to navigate and filter the list of structures. Some of these are also used in the scoring scheme. Remember that these are computed over the subset of MMPs described above.
- Minimal Pairs: The number of minimal MMPs identified for the molecule. Because of the algorithm we use to derive MMPs, several pairs can be identified between the same two compounds, and some of these are redundant. This number ignores the redundant pairs, and can also be interpreted as the number of unique compounds forming an MMP with a given structure.
- % of Potential Cliffs: Among the pairs that exist, the percentage that lead to an activity change greater than the threshold defined for the main property of the dataset. When the threshold is not defined, or when there is no pair of molecules that both have a valid value, this descriptor is set to -1 (and zeroed in our scoring scheme).
- Coverage: Among the molecules involved in at least one pair, the percentage that form a pair with the molecule. This number is another way to express the Minimal Pairs descriptor, and reflects the percentage of distinct molecules forming pairs with a given molecule.
- Single / Double / Triple Cut Pairs: The number of single, double or triple cut MMPs the molecule is involved in (reminder: these numbers are computed over the subset of pairs described above).
- Substitution points: The number of unique substitution points (equivalent to cores) identified for the molecule. Since we rely on a subset of all pairs available in the system, this number is typically lower than the number of arrows you may see on the query within a SAR Slide based on MMPs (arrows also represent locations on the molecule where MMPs can be found).
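
As a rough illustration of how the first three descriptors relate to each other, here is a minimal sketch. It is not the product's implementation, and several points are assumptions: pairs are given as tuples of hypothetical compound identifiers, the cliff percentage is computed over partners for which both activity values are available, and the Coverage denominator excludes the molecule itself. The names `partners_of` and `describe` are made up for this example.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Pair = Tuple[str, str]  # (compound id, compound id), one entry per stored MMP


def partners_of(pairs: Iterable[Pair], mol_id: str) -> Set[str]:
    """Unique compounds forming at least one in-scope MMP with mol_id."""
    partners = set()
    for a, b in pairs:
        if a == mol_id:
            partners.add(b)
        elif b == mol_id:
            partners.add(a)
    partners.discard(mol_id)
    return partners


def describe(pairs: Iterable[Pair],
             activity: Dict[str, Optional[float]],
             mol_id: str,
             threshold: Optional[float] = None) -> Dict[str, float]:
    pairs = list(pairs)
    partners = partners_of(pairs, mol_id)  # redundant pairs collapse here

    # % of Potential Cliffs: -1 when there is no threshold or no valid pair.
    pct_cliffs = -1.0
    own = activity.get(mol_id)
    if threshold is not None and own is not None:
        deltas = [abs(own - activity[p]) for p in partners
                  if activity.get(p) is not None]
        if deltas:
            pct_cliffs = 100.0 * sum(d > threshold for d in deltas) / len(deltas)

    # Coverage: share of the other paired molecules that pair with mol_id.
    others = {m for pair in pairs for m in pair} - {mol_id}
    coverage = 100.0 * len(partners) / len(others) if others else 0.0

    return {"minimal_pairs": len(partners),
            "pct_cliffs": pct_cliffs,
            "coverage": coverage}
```

With this formulation, Minimal Pairs and Coverage differ only by the normalization, which is why Coverage is described above as another way to express Minimal Pairs.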

Score
The score uses an internal weighted multi-parameter scoring function. It is relative: each individual component of the score is normalized between the minimum and maximum values observed within the dataset itself. This means scores can't be compared across datasets. It also means that a very low score does not necessarily indicate a structure that is not worth considering. Rather, compounds with a higher score are probably more interesting.
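
To make the idea of a relative score concrete, here is a minimal sketch of a min-max normalized weighted sum. The actual scoring function and its weights are internal, so everything specific here is an assumption: the function names `min_max` and `relative_scores`, the descriptor keys, the equal weights in the example, and the choice to divide by the sum of the weights.

```python
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Rescale values to [0, 1] using the min and max of this dataset."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant component cannot discriminate between compounds.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def relative_scores(rows: List[Dict[str, float]],
                    weights: Dict[str, float]) -> List[float]:
    """Weighted sum of per-dataset normalized descriptors (hypothetical)."""
    normalized = {}
    for name in weights:
        # The -1 "undefined" sentinel is zeroed before normalization.
        raw = [max(row[name], 0.0) for row in rows]
        normalized[name] = min_max(raw)
    total_weight = sum(weights.values())
    return [
        sum(weights[name] * normalized[name][i] for name in weights) / total_weight
        for i in range(len(rows))
    ]


rows = [
    {"minimal_pairs": 10, "pct_cliffs": 50.0},
    {"minimal_pairs": 0, "pct_cliffs": -1.0},
    {"minimal_pairs": 5, "pct_cliffs": 100.0},
]
scores = relative_scores(rows, {"minimal_pairs": 1.0, "pct_cliffs": 1.0})
```

Because the minimum and maximum are taken from the dataset being scored, adding or removing compounds changes every score, which is why scores from different datasets are not comparable.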