Is Dark Chemical Matter Really Dark?

High-throughput screening (HTS) and combinatorial chemistry were introduced in the early 1990s to improve R&D productivity, which remained a daunting challenge for the pharmaceutical industry. Large libraries of chemical compounds could be tested against a panel of biological targets to identify primary hits, which could then be validated for confirmed biological activity and advanced to the next steps of drug discovery.

A large proportion of HTS hits are discarded in the early steps for a variety of reasons, one being the non-specific, frequent-hitter behavior of some compounds. While measures have been proposed and practiced to deal with such issues, little light has been shed on the compounds that remained inactive despite being tested in multiple assays.

In their milestone contribution, Wassermann and colleagues at Novartis addressed this topic and coined the term “Dark Chemical Matter” (DCM), referring to compounds that have been tested in more than 100 HTS campaigns without hitting a single target. Analyzing selected DCM compounds in additional assays, they identified attractive hits with confirmed antifungal activity, which suggests that DCM may not be entirely biologically inert but may instead have the potential to demonstrate selective activity. Thus, DCM was regarded as a valuable starting point for drug discovery owing to its “unique activity” and “clean safety” profiles. A similar analysis by Muegge & Mukherjee at Boehringer concluded that DCM compounds occasionally provide valuable hits and therefore should not be excluded from screening libraries in favor of less-tested compounds.

In the original study, the DCM compounds were compared with active compounds in terms of molecular properties and scaffolds (the core of a chemical structure). The only criterion for including a compound in the active subset was activity (based on a specified threshold) in at least one assay. Most such compounds could therefore be selective hits that were active against only one or two targets. Choosing instead a set of compounds that were active in at least five assays (related to different biological targets) might have shown how the consistently inactive DCM compounds differ from highly promiscuous ones.
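The grouping discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in any of the cited studies: the function name, the labels, the toy records, and the choice to treat the 100-assay cut-off as "at least 100" are all hypothetical, and compounds with one to four hits are lumped together as "selective" purely for convenience.

```python
from collections import Counter

MIN_ASSAYS = 100       # assay-count cut-off quoted for DCM (assumed inclusive here)
PROMISCUOUS_HITS = 5   # "active in at least five assays" comparison set

def classify(n_tested, n_active):
    """Return a coarse label for one compound from its screening history."""
    if n_tested < MIN_ASSAYS:
        return "undertested"   # too few assays to call the compound dark
    if n_active == 0:
        return "dark"          # extensively tested, never active
    if n_active >= PROMISCUOUS_HITS:
        return "promiscuous"
    return "selective"         # 1-4 hits; this band is an assumption

# Toy records: (compound id, assays tested, assays active). Invented values.
records = [
    ("cpd-1", 240, 0),
    ("cpd-2", 150, 1),
    ("cpd-3", 310, 12),
    ("cpd-4", 40, 0),
    ("cpd-5", 100, 0),
]

labels = {cid: classify(tested, active) for cid, tested, active in records}
counts = Counter(labels.values())

for cid, label in labels.items():
    print(cid, label)
```

With labels like these, the "dark" group can be compared against the "promiscuous" group rather than against every compound with at least one hit.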

According to Siramshetty and Preissner, a comparison of the chemical spaces of DCM compounds and drugs would provide an alternative picture of the potential of DCM compounds to be promising candidates. As many as 16% of the approved drugs and 3.5% of the natural compounds were found to be structurally identical to DCM compounds. Furthermore, a comparison of the molecular scaffolds representing the DCM compounds and drugs revealed that over 25% of the scaffolds present in drugs were also present in the DCM compounds, indicating that slight changes to the structures of DCM compounds might result in promising lead compounds.

This was confirmed by the finding that more than 11,000 compounds in the DCM set form more than 112,000 activity cliffs with drugs (an activity cliff is a small change in chemical structure that results in a large difference in activity towards a biological target). In other words, the DCM compounds have several structural analogues among drugs with established biological activities. Along this line, a recent study by Jasial and Bajorath took the DCM compounds they identified among extensively assayed compounds, systematically searched for analogues with known bioactivities, and subsequently derived target hypotheses for the DCM compounds. However, they identified the DCM compounds using a modified criterion: a compound must have been tested in at least 100 primary assays and must not have shown activity in either primary or confirmatory assays.
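To make the idea of an activity cliff between a DCM compound and a drug concrete, here is a minimal sketch. It assumes a simple similarity-threshold definition: a dark compound and a drug form a candidate cliff when their fingerprints are highly similar, since the drug has known activity and the dark compound has none. The article does not state how the study defined cliffs, so the Tanimoto measure, the 0.8 threshold, the function names, and the toy fingerprints (sets of "on" bit indices) are all assumptions for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cliff_pairs(dark, drugs, threshold=0.8):
    """Pair each dark compound with every drug at or above the similarity threshold."""
    pairs = []
    for dark_id, dark_fp in dark.items():
        for drug_id, drug_fp in drugs.items():
            sim = tanimoto(dark_fp, drug_fp)
            if sim >= threshold:
                pairs.append((dark_id, drug_id, round(sim, 2)))
    return pairs

# Toy fingerprints, invented for illustration only.
dark = {
    "dcm-1": frozenset({1, 2, 3, 4, 5}),
    "dcm-2": frozenset({10, 11, 12}),
}
drugs = {
    "drug-A": frozenset({1, 2, 3, 4, 5, 6}),
    "drug-B": frozenset({20, 21}),
}

pairs = cliff_pairs(dark, drugs)
print(pairs)
```

In practice the fingerprints would come from a cheminformatics toolkit, and the known targets of each paired drug would serve as target hypotheses for the matching dark compound.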

In addition, analyzing the promiscuity of drugs that share chemical space with DCM might provide an indirect view of the potential of DCM compounds to act against multiple targets. Nearly 22% of the DCM-like drugs were found to be active against more than five human targets at nanomolar concentrations. With less stringent criteria, this proportion increased to 41%, consistent with the suggestion that testing DCM compounds at concentrations higher than the typical screening concentrations (1 to 10 micromolar) might bring some value out of them.

From another perspective, the DCM compounds were not expected to contain PAINS substructures (a set of substructure alerts proposed to identify frequent hitters in HTS output), but a recent study identified a total of 109 PAINS types in 3,570 DCM compounds. Moreover, in his opinion piece, Derek Lowe stated, “None of these structures, I have to say, look odd at all; I don’t think any medicinal chemist would look at them and say, ‘You know what, you could screen that stuff through a hundred assays and never see a damn thing.’ Quite the opposite – they look fine.” In a follow-up study, Wassermann and colleagues (this time at Merck) developed deorphanization strategies to shed light on the DCM compounds.

These findings reaffirm that DCM compounds can be attractive starting points for drug discovery. Therefore, judging the true potential of screening compounds on the basis of 100 assays and omitting those that were consistently inactive in favor of new or untested compounds may not be the best way to go.

These findings are described in the article entitled “Drugs as habitable planets in the space of dark chemical matter,” recently published in the journal Drug Discovery Today. This study was conducted by Vishal B. Siramshetty and Robert Preissner from Charité – University Medicine Berlin, the Free University of Berlin, and the German Cancer Consortium.