From people to data to patients

Artificial intelligence, analytics and data science applications are rapidly influencing all parts of the medical research and development process. To incorporate these capabilities into its world-class research operations, Boehringer Ingelheim has created a new department in computational biology and digital sciences. The aim: accelerate the development of new types of medicines.

The world is awash in data. Around 50 zettabytes — 50 billion terabytes — were generated worldwide in 2020. That annual number is expected to grow to 175 zettabytes by 2025. As data scientists use algorithms to analyze this data universe to gain new insights in all areas of human endeavor, no innovative company can afford to be left behind.

That includes Boehringer Ingelheim, whose new Global Computational Biology and Digital Sciences department (gCBDS) is harnessing data for the benefit of drug discovery.

“We want to bring the opportunities from these developments to complement the significant existing expertise of the research organization,” explains Dr. Jan Nygaard Jensen, the Global Head of the gCBDS team. Together with Blaze Stancampiano, Head of Scientific Strategy, Dr. Jensen and his leadership team have developed a roadmap to establish Boehringer Ingelheim as a leading organization in this field.

Data, patterns and new perspectives

Mr. Stancampiano describes it as a new frontier in the understanding of human health. “Thanks to genome research, biobanks and other developments in recent decades, we now know much more about human disease,” he says. “We have much more detailed knowledge about genes and proteins which we can use to develop new medicines. This allows us to create more effective medicines with fewer side effects.”

Dr. Jensen notes the vast trove of medically relevant human data that is now available to researchers — including the UK Biobank, which contains in-depth genetic and health information from a half-million participants.

“Unlike a few years ago, we now have access to an enormous amount of human data, internally as well as externally via biobank collaborations,” Dr. Jensen says. “We have the computing power and are building the infrastructure we need to analyze this data.”

Data scientists use algorithms to search this data for patterns and consistent signals to derive scientific insight. “For instance,” Dr. Jensen says, “we investigate whether all people with diabetes in the database share a consistent genome and protein pattern.”
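As a rough illustration of what such a pattern search can look like in its simplest form, the sketch below compares how often a single genetic variant appears in people with and without diabetes. It is not the gCBDS team's method. The records, the variant and the function name `odds_ratio` are all invented for this example, and a real analysis would involve far more data, many variants and rigorous statistical controls.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only:
# each tuple is (has_diabetes, carries_variant).
records = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]


def odds_ratio(records):
    """Odds of carrying the variant among cases divided by the odds among controls."""
    counts = Counter(records)
    cases_with = counts[(True, True)]
    cases_without = counts[(True, False)]
    controls_with = counts[(False, True)]
    controls_without = counts[(False, False)]
    # Add 0.5 to every cell (a common continuity correction) so that an
    # empty cell never causes a division by zero.
    return ((cases_with + 0.5) * (controls_without + 0.5)) / (
        (cases_without + 0.5) * (controls_with + 0.5)
    )


ratio = odds_ratio(records)
print(f"Odds ratio for the hypothetical variant: {ratio:.2f}")
```

An odds ratio well above 1 would suggest the variant is more common among people with the disease. That is a signal worth investigating, not proof of a causal link.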

Genes are blueprints for the production of proteins, the building blocks of the body. Proteins perform countless roles: as antibodies to protect the organism; as myosins for muscle movement; as collagen to provide skin and bones with structure; as hormones, enzymes and much more. By influencing the right genes and proteins, medical science can influence specific functions — or malfunctions — in the human body.

Data and laboratory science — a synergistic combination

The gCBDS team uses computer-aided methods to find a series of genes and proteins (also referred to as “targets”) that have associations or even causal relationships with certain diseases. These associations could be responsible for putting people at increased risk of developing a specific disease — and making them potential candidates for a specific therapeutic approach to new medicines.

And yet, computer analysis alone is not enough to determine whether these hypotheses are pointing to the right targets. The entire array of proven processes for drug development and laboratory testing remains necessary. “We have an outstanding team of biologists, chemists and technology experts in the Innovation Unit who we partner with to further evaluate if specific patterns or signals are involved in the development of the disease, and ideally qualify as new drug targets in our portfolio,” Dr. Jensen says.

“Collaboration is absolutely key for success,” he says. “We’re not talking about a contradiction between traditional lab work and AI. We combine them to gain new insights which would never be discovered otherwise.”

In other words, the interdisciplinary gCBDS team weaves together artificial intelligence and human acumen. “We train AI to recognize patterns, which it can do much better in large-scale data sets than a human ever could,” Dr. Jensen says. “But first, we have to use our biological understanding to decide which patterns it even makes sense to look for. AI can’t do that for us, yet.”

“We translate large-scale complex disease data into NTC [novel therapeutic concept] discovery, by computational and analytic insight to accelerate our drug portfolio.”
Dr. Jan Nygaard Jensen, Head of gCBDS

In-house data provides an advantage

Boehringer Ingelheim’s goal is to be at the cutting edge of research and to deliver a portfolio of 75% first-in-class molecules, with 50% of them having breakthrough potential for patients, according to Mr. Stancampiano. The U.S. Food and Drug Administration, for example, grants breakthrough therapy designation to medicines that have the potential to be substantial improvements over available therapies.

By focusing on adding patient-centric data from biobanks to the extensive laboratory data generated within Boehringer Ingelheim, the gCBDS department has a clear strategy for achieving its goals. “This allows us to uncover opportunities more rapidly, which in turn enables us to be the first to develop precisely adapted medicines,” Mr. Stancampiano says.

This is how the gCBDS team’s work fits into the broad Boehringer Ingelheim mission: to create the basis for new medicines and therapies that improve patients’ health and quality of life.