Breathing life into data for faster drug discovery and improved patient outcomes
In life sciences, data is more than a byproduct of research. Harnessing vast amounts of information enables accelerated drug discovery, precision medicine and operational efficiency.
That moves data management from technical capability to foundational necessity. While advances in AI attract attention, having control over the quality, usability, lineage, security and accessibility of data determines the success or failure of every digital initiative.
Lack of control brings risks: the EU’s AI Act, for example, will hit life sciences companies with painful penalties if they bring non-compliant systems to market.
Against that backdrop, data looks less like a resource and more like a core strategic asset – one that needs active protection and management.
Modernizing data management is essential to fostering trust and creating an environment for innovation.
The data opportunity in life sciences
On any given day, pharmaceutical firms partner with thousands of study sites and tens of thousands of trial participants around the world. A study by Tufts University found that Phase III clinical trials now generate an average of 3.6 million data points, three times the amount collected a decade earlier.
Amid that deluge, having access to the right data when it’s needed — while spending less time managing and preparing it — can optimize R&D.
A 2024 survey by Informatica shows that 41% of organizations have 1,000 or more data sources and nearly 60% use an average of five tools to manage them. As the volume and fragmentation of data grow, the need for a single consolidated solution to manage it becomes clear:
- With a complete view of customer information, sales teams can better understand trends and influencing factors for their target accounts
- Procurement can leverage improved data visibility to collaborate more effectively with vendors and partners
- Corporate M&A teams can achieve faster time to value by integrating an acquired company’s data, systems and applications more efficiently
Drug discovery becomes more efficient, outcomes become more predictable and the insights from analytics become more reliable. The key is control, with modern data management practices that ensure quality, access, governance and protection.
Freedom – from control
While it may seem a paradox, better control over data can be the basis for liberating its business value.
Through the creation of targeted data products for clinical trials and internal data marketplaces to share them, non-technical users can generate their own insights independently. They also gain the independence to work directly with software development teams on new digital and AI applications.
Control equates to trust and confidence that the firm’s store of information is complete, accurate, up to date, compliant and secure, with a known lineage.
But there are practical obstacles to overcome.
Data management challenges
Life sciences firms are under intense ethical scrutiny and operate in accordance with multiple international standards, including:
- The Council for International Organizations of Medical Sciences (CIOMS) International Ethical Guidelines
- The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) E6 guideline for Good Clinical Practice
- PhRMA’s Principles on Conduct of Clinical Trials and Communication of Clinical Trial Results
To satisfy these requirements, reliable management processes for life sciences data need to be in place. Data sharing and monetization arrangements between companies are also on the rise. However, the data must first be clean, trusted, governed and, most importantly, secure and privacy-compliant.
De-risking compliance
The issue of data trust extends to pharma's growing regulatory burden. From the AI Act to GDPR, EU MDR, HIPAA and various documentation mandates from the EMA and FDA, there are multiple national and regional regimes that interact, overlap, or add new measures that impact how data is handled.
Life sciences firms need to understand and simplify how complex, diverse and unique rules, interpretations and expectations should be addressed in different markets. Built-in intelligence that automates regulatory requirements across key markets can de-risk drug development processes and lead to faster regulatory submission.
Under the EU’s AI Act, data management takes center stage
The European Union’s AI Act came into force on 1 August 2024. A recent paper published in Nature says it will have a profound impact on digital innovation in life sciences.
"The act addresses the unique risks posed by AI technologies, including those related to data processing. Because regulated digital medical products often use personal health data, they also need to comply with GDPR."
Managing requirements between overlapping regimes makes establishing a single source of truth for patient data not just best practice but a business imperative.
Effective management is crucial for ensuring compliance with the AI Act's stringent governance and traceability requirements. By consolidating data management under one platform, life sciences firms can more readily define clear responsibilities, policies and processes.
Workflows can be streamlined for a consistent approach to metadata management, making it easier to apply and automate the act’s data protection and privacy measures.
New reporting requirements
Understanding data lineage becomes more important in the context of sustainability disclosures. Life sciences firms need clarity and control over the data used to support ESG reporting, particularly under rulebooks like GDPR, which places restrictions on data sharing, or the ISO Identification of Medicinal Products (IDMP) standard for identifying and exchanging information about medicinal products for human use.
Like most large enterprises, life sciences organizations are expected to report sustainability metrics at increasingly fine levels of granularity, but many are not yet ready to meet the evolving technical requirements and non-financial data types used in ESG metrics.
As companies like Pfizer set out to ‘change a billion lives a year by 2027,’ having trustworthy data to back up the claims reported to shareholders and regulators will be business critical.
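To make the lineage question concrete, here is a minimal sketch of how a reported figure might be traced back to its original sources. The dataset names and the shape of the lineage map are hypothetical, invented purely for illustration; a real data catalog would hold this information in its own format.

```python
from collections import deque

# Hypothetical lineage map: each dataset lists the datasets it is derived from.
LINEAGE = {
    "esg_report_metric": ["emissions_summary", "site_energy_totals"],
    "emissions_summary": ["plant_sensor_feed"],
    "site_energy_totals": ["utility_invoices", "plant_sensor_feed"],
    "plant_sensor_feed": [],
    "utility_invoices": [],
}

def upstream_sources(dataset, lineage):
    """Return every dataset that feeds into `dataset`, directly or indirectly."""
    seen = set()
    queue = deque(lineage.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        queue.extend(lineage.get(current, []))
    return seen

def root_sources(dataset, lineage):
    """Return the upstream datasets that have no parents of their own."""
    return {d for d in upstream_sources(dataset, lineage) if not lineage.get(d)}

print(sorted(root_sources("esg_report_metric", LINEAGE)))
# prints ['plant_sensor_feed', 'utility_invoices']
```

With a map like this, anyone preparing a disclosure can see which raw sources a published number ultimately depends on, and therefore which ones need to be trusted.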
Achieving federated data
A mix of legacy systems and siloed data repositories is a fact of life for many life sciences firms, making it difficult to accommodate the multitude of project team or departmental data perspectives that exist in a large multinational organization.
Harnessing data means determining domain type, identifying personal and sensitive data, mapping lineage and cataloging master data assets. But the underlying technical tasks are laborious and complex: master data discovery, third-party data enrichment, ensuring consistent data replication across applications, syndication to data pools and sharing across cloud, on-premises, mobile and social processes.
Stopping these issues from turning into barriers requires significant time, people, skill sets and budget — or solutions that can simplify their execution.
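One of the tasks named above, identifying personal and sensitive data, can be illustrated with a simple first-pass sketch. It flags columns whose names suggest sensitive content. The patterns, labels and column names are assumptions made for this example; a production tool would also inspect the values themselves and would not rely on names alone.

```python
import re

# Hypothetical name patterns that hint at personal or sensitive data.
SENSITIVE_PATTERNS = {
    "personal": re.compile(r"(name|email|phone|address|birth|dob)", re.IGNORECASE),
    "health": re.compile(r"(diagnosis|medication|lab_result|genotype)", re.IGNORECASE),
}

def classify_columns(columns):
    """Map each column name to the sensitivity labels its name suggests."""
    flagged = {}
    for column in columns:
        labels = [
            label
            for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(column)
        ]
        if labels:
            flagged[column] = labels
    return flagged

# Illustrative column names from an imagined clinical trial table.
columns = ["participant_id", "patient_name", "date_of_birth", "site_code", "primary_diagnosis"]
print(classify_columns(columns))
```

Note that `participant_id` is not flagged here even though an identifier can itself be personal data, which is exactly why name matching is only a starting point for human review.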
Balancing AI risks and opportunities
Research from Cognizant and Oxford Economics forecasts that enterprise AI projects will leap from today's experimentation to a period of Confident Adoption by 2026. Are life sciences firms ready?
Proper stewardship of patient and intellectual property data is a foundational element of any digital healthcare initiative, but it’s particularly important for AI given recurring questions about the reliability of AI-generated outputs. For an AI algorithm to produce transparent, unbiased results, users and data scientists must fully understand the data that models are being trained with.
Conversely, AI can be leveraged to automate complex data management processes like data extraction, classification and validation, enhancing data accuracy and accelerating compliance.
AI needs data, but data also needs AI.
The promise of cloud-based, AI-driven data management
Part of the solution to life sciences’ data modernization challenge is to let sophisticated and scalable software do the heavy lifting.
Using integrated AI engines, the best of these tools can handle the laborious tasks around data ingestion, integration, replication, governance, quality, catalog, lineage and master data management — all from a single platform.
Data products created specifically for drug discovery and other business use cases can be accessed through a portal, allowing non-technical users to browse, search, find and understand the data they need.
Focusing on data democratization is especially apt if a pharma or biotech company wants to leverage architectures like data mesh or data fabric. Because both models connect data through loosely coupled integrations, they require a level of flexibility typically found in SaaS solutions.
Success story: The benefits of data excellence
A global pharmaceutical brand needed to advance its multi-year strategy to migrate its on-premises solutions to the cloud. Moving to the next phase required a single, trusted source of data for analytics that could be leveraged across the enterprise.
The firm implemented an AI-powered SaaS data management and governance solution to consolidate data management and shift its legacy supply chain analytics to the cloud.
As a result, development processes were simplified, delivering notable reductions in cost and time. Five million records could be loaded in four hours, versus the 19 hours required by the previous system. Since deployment, the new cloud-based data management solution has helped cut the time to manage new orders by 50%.
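For readers who want the load-time figures in comparable terms, the short calculation below derives throughput and speedup from the numbers quoted above. Only the record count and the two durations come from the case study; the variable names are illustrative.

```python
RECORDS = 5_000_000   # records loaded, from the case study
OLD_HOURS = 19        # load time on the previous system
NEW_HOURS = 4         # load time on the new cloud-based solution

old_rate = RECORDS / OLD_HOURS   # records per hour, previous system
new_rate = RECORDS / NEW_HOURS   # records per hour, new solution
speedup = OLD_HOURS / NEW_HOURS  # how many times faster the load completes
time_saved_pct = (OLD_HOURS - NEW_HOURS) / OLD_HOURS * 100

print(f"Previous system: {old_rate:,.0f} records/hour")  # 263,158
print(f"New solution:    {new_rate:,.0f} records/hour")  # 1,250,000
print(f"Speedup:         {speedup:.2f}x")                # 4.75x
print(f"Load time saved: {time_saved_pct:.1f}%")         # 78.9%
```

In other words, the same load finishes almost five times faster, cutting roughly 79% off the elapsed time.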
Establishing trust and taking control
Life sciences is no stranger to innovation, but today its eureka moments depend on data — high-quality, compliant, secure and governed.
Data management in a highly regulated global industry remains complex and costly. Companies must overcome silos, integrate external data and comply with evolving regulatory landscapes. They also need to handle growing complexity in data types, formats and storage, while ensuring integrity and security. This is essential to foster trust and an environment of innovation.
Getting there means modernizing data management and empowering R&D teams to unlock new possibilities, from streamlining drug discovery processes to enabling more precise targeting of discrete patient populations.
As advanced analytics and generative AI drive a new wave of innovation, it's essential that data management systems meet the industry’s rising demands for speed, quality, compliance and innovation.
[1] https://www.informatica.com/resources/articles/eu-ai-act-data-governance-strategy.html
[3] https://www.informatica.com/blogs/600-chief-data-officers-share-insights-on-2024-data-strategy.html
[4] https://www.nature.com/articles/s41746-024-01232-3
[5] https://www.pfizer.com/news/announcements/pfizer-sets-new-ambition-changing-billion-lives-year-2027
[6] https://www.cognizant.com/us/en/aem-i/how-to-think-and-act-like-an-ai-native-business