AXXVirtual: a chemistry-driven virtual library for drug discovery
As in high-throughput screening (HTS), the quality of the screened library plays a crucial role in the success of virtual screening. While virtual screening enables the exploration of much broader and more diverse chemical spaces, many virtual libraries are populated with molecules that, while computationally attractive, are difficult or even impossible to synthesize. For many drug discovery programs, synthesis is a major bottleneck: it can be slow, unpredictable, and resource-intensive, delaying the confirmation of biological activity and the chemical exploration of promising compounds.

This is precisely the gap that the AXXVirtual library was designed to close. Beyond ensuring high-quality, drug-like chemical space, this library of 185 million non-commercial small molecules was built with synthesis feasibility at its core. Every AXXVirtual compound can be produced in just 2–3 steps from readily available building blocks. This unique design ensures that virtual hits are not just theoretical possibilities but tangible molecules, accessible within controlled and predictable timelines. As a result, AXXVirtual enables researchers to reach in vitro confirmation faster, accelerating the path from virtual screening to validated hits.

The library was developed through a structured four-stage process in which compounds were rigorously selected by applying strict rules and filters to guarantee drug-like properties and structural diversity, thereby enabling efficient downstream development. This article walks you through the principles behind building a high-quality virtual library for drug discovery and shows how these concepts were applied in the design of the 185-million-compound AXXVirtual library.

Designing for the real lab: synthetic accessibility

Synthetic feasibility has become a key parameter in the design of virtual libraries, ensuring that computational efforts translate into compounds that can be readily produced for downstream testing [1, 2]. The core of this approach lies in relying on synthetic routes based on established reaction classes that have demonstrated their value over decades, such as amide coupling and the Suzuki–Miyaura reaction. Despite the emergence of new methodologies, these reactions remain the backbone of medicinal chemistry due to their efficiency, reproducibility, scalability, and high yields [3].

Building on these concepts, the AXXVirtual compounds are designed to be synthesized through twelve synthetic routes, each consisting of two to three steps and drawing on nine reliable reactions in total. The building blocks, more than 12,000 in total, were selected from the inventory of a trusted partner and are immediately available, eliminating delays from external orders. In addition, the reagents have been carefully selected to ensure clean reactions, minimizing side products and regioisomer formation. This combination of proven chemistry and readily available reagents enables fast and efficient synthesis, allowing the preparation of 100–120 compounds within just two to three weeks. To maintain these standards, the library is regularly updated in line with the partner’s inventory, making AXXVirtual a dynamic and continuously evolving library.

Designing smarter: AI-powered property and synthetic feasibility prediction

Artificial intelligence (AI) is playing an increasingly important role in the landscape of virtual libraries for drug discovery by enabling more accurate and efficient predictions of molecular properties and synthetic accessibility. Today, a variety of machine learning models are employed to predict molecular properties with increasing accuracy and speed.
These models rely heavily on large, curated training sets (databases of molecules with known experimental properties) to identify patterns and relationships between molecular features, such as size, chemical groups, and shape, and observed behaviors, such as solubility and toxicity. Unlike simple rule-based methods, machine learning adapts to the complexity and variability inherent in chemical data, which allows it to capture subtle influences and nonlinear effects that traditional rules often miss.

For synthetic accessibility, tools like RAscore (Retrosynthetic Accessibility Score) [8] are widely used. RAscore is a machine learning classifier trained on the outcomes of the retrosynthetic planning software AiZynthFinder. Instead of running a full retrosynthetic analysis for each molecule, which is impractical when dealing with millions of compounds, RAscore provides a rapid estimate of whether a compound is likely to be synthesizable using known building blocks and reaction rules. Applying RAscore to AXXVirtual compounds, we found that the vast majority (96%) scored above 0.8 on the 0-to-1 scale, supporting their high synthetic accessibility. This result further highlights the robustness of the chemistry underpinning our library.

Designing for success: from synthesizable to developable molecules

While synthetic accessibility defines what can be built, drug-likeness defines what is worth pursuing. Virtual libraries should contain compounds that are not only synthetically feasible but also exhibit molecular properties that make them suitable candidates for future development. This includes properties that impact solubility, permeability, metabolic stability, and safety. The concept of drug-likeness is grounded in the empirical observation of properties shared by orally bioavailable drugs.
Large-scale analyses of marketed drugs and clinical candidates have revealed that certain molecular properties, such as moderate size and balanced lipophilicity, are associated with favorable pharmacokinetic behavior. These findings led to the formulation of guidelines, with Lipinski’s Rule of Five (Ro5) [4] and Veber’s rules [5] being among the most well-known and widely adopted.

In parallel with physicochemical profiling, the quality of chemical libraries, including virtual ones, must be ensured by excluding compounds known to cause assay interference or unreliable readouts. A major class of such problematic molecules is PAINS (Pan-Assay Interference compoundS), chemical structures prone to react nonspecifically with numerous biological targets rather than specifically affecting one desired target [6]. Rhodanines exemplify the extent of the problem. More than 2,000 rhodanines have been reported to have biological activity in over 400 papers. However, a publication by Bristol-Myers Squibb points out that these compounds undergo light-induced reactions that irreversibly modify proteins. It is hard to imagine how such a mechanism could be optimized to produce a drug or a useful tool [7].

At Axxam, more than 20 years of experience with physical libraries have given us deep insight into selecting the right compounds
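
Guidelines such as the Rule of Five and Veber’s rules mentioned above are usually applied as simple threshold checks on computed descriptors. The sketch below illustrates that idea in Python using the commonly cited thresholds (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors and 10 acceptors for Ro5; at most 10 rotatable bonds and a polar surface area ≤ 140 Å² for Veber). It is an illustration only and not the filter set used to build AXXVirtual: the `Descriptors` container, the function names, the one-violation tolerance, and the example descriptor values are all assumptions made here, and the descriptors are assumed to have been computed beforehand with a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float        # molecular weight in Da
    logp: float              # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int
    tpsa: float              # topological polar surface area in square angstroms


def ro5_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of Five thresholds are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_veber(d: Descriptors) -> bool:
    """Check the Veber criteria: rotatable bonds <= 10 and TPSA <= 140."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def is_drug_like(d: Descriptors, max_ro5_violations: int = 1) -> bool:
    """Combine both checks; tolerating one Ro5 violation is an assumption."""
    return ro5_violations(d) <= max_ro5_violations and passes_veber(d)


# Made-up descriptor values for illustration, not real compounds.
small = Descriptors(350.4, 2.8, 2, 5, 4, 85.0)
borderline = Descriptors(510.0, 3.0, 1, 6, 5, 90.0)
large = Descriptors(720.9, 6.3, 6, 12, 14, 190.0)

for name, d in [("small", small), ("borderline", borderline), ("large", large)]:
    print(name, ro5_violations(d), passes_veber(d), is_drug_like(d))
```

Keeping the violation count separate from the pass/fail decision makes it easy to tighten or relax the tolerance without touching the thresholds themselves.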


