Value Proposition of Diagnostic Test Data

Kishan Kumar, Associate Director, Axtria and Juhi Parikh, Project Lead, Axtria
With the current focus on precision medicine, targeted therapies are already part of the standard of care for several cancers, with many more in clinical trials and on the path to commercialization. With the emergence of new data sources, it is imperative for pharmaceutical manufacturers to understand the value proposition of diagnostic test data. Building a data capability around these datasets, although not a trivial undertaking, can result in a powerful asset to be used along with other traditional or patient-level data sources. With the evolution of more personalized medicine, targeted therapies focused on specific biomarkers, and a robust oncology pipeline in the near future, leveraging ‘data driven’ insights in commercial and clinical planning takes higher precedence. Over the past two years, we have explored the value that this data adds to decision-making processes in commercial, clinical and market access teams. Our exploration has led to pilot implementations of several initiatives, most of which have yielded valuable insights. This paper walks readers through that exploration.
Keywords: Diagnostic Test Data, Lab Test Data, Oncology, Biomarkers, Applications, Targeted Therapies
In today’s world of unparalleled technological breakthroughs and scientific advancements, personalized health care has the capacity to detect the onset of disease at its earliest stages, pre-empt the progression of disease, and, at the same time, increase the efficiency of the health care system by improving quality, accessibility, and affordability. In the 12 years since the completion of the Human Genome Project (HGP), advances in genome technology have led to an exponential decrease in sequencing costs (more than 16,000-fold). Patients have benefited from major biological insights and medical advances, including the development of more than 100 drugs whose labels now include pharmacogenomics information.1
The field of personalized medicine offers the potential for advancements in science and medicine. Pharmaceutical manufacturers can now segment patients into groups that are more likely to respond to a particular treatment. Through personalized medicine, patients will benefit not just from better treatment but also from early detection and prevention of diseases and disorders. As personalized medicine has advanced, the availability of data to benefit commercial and clinical teams has also increased exponentially. In this paper, we discuss the value proposition and potential that data from diagnostic testing laboratories offers to commercial and clinical teams, specifically within oncology manufacturers. Throughout this paper, the terms lab tests and diagnostic tests refer to biomarker testing.
Traditional and Targeted Therapies in Oncology
In order to fully realize the potential value of using biomarker data or lab data, it is important to understand targeted therapies. Traditional or standard chemo drugs work by killing cells in the body that grow and divide quickly. Cancer cells divide quickly, which is why these drugs often work against them. But chemo drugs can also affect other (normal) cells in the body that divide quickly, which can sometimes lead to serious side effects. The reason is that these chemo drugs are unable to differentiate between healthy cells and cancer cells. Each time chemo is given, the aim is to find a balance between killing the cancer cells (in order to cure or control the disease) and sparing the normal cells (to lessen side effects).
Targeted therapy, on the other hand, is a newer type of cancer treatment that uses drugs or other substances to more precisely identify and attack cancer cells, usually while doing little damage to normal cells, resulting in prolonged tumor stability. Targeted cancer therapies are drugs designed to interfere with specific molecules necessary for tumor growth and progression. Traditional cytotoxic chemotherapies usually kill rapidly dividing cells in the body by interfering with cell division. A primary goal of targeted therapies is to fight cancer cells with more precision and potentially fewer side effects.2 Oncology has been a leader in leveraging and developing products based on these targeted therapies. To date, there are over 200 FDA-approved drugs with pharmacogenomic information in their labeling.3
Candidates for Targeted Therapy – Introduction to Diagnostic Testing
For certain types of cancer, most patients will have an appropriate target (a marker or mutation) for a particular targeted therapy and, thus, will be candidates to be treated with that therapy. For example, in the case of CML, most patients have the BCR-ABL fusion gene. For some other cancer types, however, a patient’s tumor tissue must be tested to determine whether or not an appropriate target is present. The use of targeted therapies may be restricted to patients whose tumor has a specific gene mutation that codes for the target; patients who do not have the mutation would not be candidates because the therapy would have nothing to target. Sometimes, a patient is a candidate for a targeted therapy only if he or she meets specific criteria (for example, their cancer did not respond to other therapies, has spread, or is inoperable). These criteria are set by the FDA when it approves a specific targeted therapy.4
Treatment of patients on targeted therapies starts with diagnosis and testing for cancer. Diagnostic tests are done with samples collected at the time of a first biopsy. This testing may be referred to as molecular profiling, biomarker testing or tumor testing, all of which mean the same thing. The objective of these tests is to determine the mutations that may have occurred in the gene and to identify the occurrence and severity of cancer. The testing practices are intensely debated, impacting diagnostic quality and affecting pathologists, oncologists and patients. There are some slide based testing techniques such as immunohistochemistry (IHC) or in-situ hybridization (ISH). More recently, the most commonly used technique is Next Generation Sequencing (NGS). NGS is increasingly used in the clinic, most commonly in the form of targeted gene panels that are custom tailored for specific diseases. The FDA has approved multiple targeted cancer therapies, and many more are being studied in clinical trials either alone or in combination with other treatments. Some of the commonly known, currently approved targeted therapies for solid malignancies and their molecular targets are provided in Table 1.5
Table 1: Approved Targeted Therapies for Solid Malignancies and Their Molecular Targets
Agent | Target(s) | FDA-approved indication(s)
Erlotinib (Tarceva) | EGFR (HER1/ERBB1) | Non-small cell lung cancer (with EGFR exon 19 deletions or exon 21 substitution (L858R) mutations), Pancreatic cancer
Everolimus (Afinitor) | mTOR | Pancreatic, gastrointestinal, or lung origin neuroendocrine tumor, Renal cell carcinoma, Nonresectable subependymal giant cell astrocytoma associated with tuberous sclerosis, Breast cancer (HR+, HER2-)
Ipilimumab (Yervoy) | CTLA-4 | Melanoma
Imatinib (Gleevec) | KIT, PDGFR, ABL | GI stromal tumor (KIT+), Dermatofibrosarcoma protuberans, Multiple hematologic malignancies including Philadelphia chromosome-positive ALL and CML
Sorafenib (Nexavar) | VEGFR, PDGFR, KIT, RAF | Hepatocellular carcinoma, Renal cell carcinoma, Thyroid carcinoma
Trastuzumab (Herceptin) | HER2 (ERBB2/neu) | Breast cancer (HER2+), Gastric cancer (HER2+)
Bevacizumab (Avastin) | VEGF ligand | Cervical cancer, Colorectal cancer, Fallopian tube cancer, Glioblastoma, Non-small cell lung cancer, Ovarian cancer, Peritoneal cancer, Renal cell carcinoma
Diagnostic Testing Landscape
Effective use of diagnostic test data starts with a thorough understanding of the testing landscape. “Not all labs are created equal”: different genetic testing laboratories focus on different disease states and different segments of the market, and their focus may be exclusively academic or commercial. Despite some consolidation, testing is still very fragmented, with many regional laboratories serving specific geographies. Testing is done in two broad settings: academic labs (which are more research focused) and commercial labs. For pharmaceutical manufacturers looking to benefit from these data sources, commercial laboratories are the easier and more accessible place to start in the near term. Even within the commercial setting, some labs are still largely unaware of the potential of capturing and recording these datasets. Among commercial laboratories, there are the major labs, such as Quest Diagnostics™, Labcorp™, etc. Then there are the regional data providers, such as Foundation Medicine™, Genoptix™, Caris Life Sciences™, Clarient Health™, etc. Lastly, there are the private labs that limit their testing services to a select list of ZIP codes within a particular geography.
Understanding the Datasets
Given the high degree of specialization, many labs address specific, selected disease states. Even within a particular disease state, the testing process varies markedly across labs. This translates to significant variation in how test results are reported. Based on the information gathered in our pilot experimentation with a handful of datasets, we made a few observations:
  • Presence of a unique patient identifier. However, the unique identifier was unique only to their databases (i.e. not universally unique).
  • Presence of ZIP code information corresponding to each patient. This represented the ZIP code of the test site or facility.
  • Presence of molecular level information. Figure 1 illustrates some examples of molecular information represented in these data sets.
  • Besides this information, there was other relevant information pertaining to physician ID (such as NPI, DEA, etc.) or payer information for the patient present in these datasets.
Figure 1: Illustration of Positive Outcome Results From a Given Data Source

A quick illustration of the contents in the datasets is represented in Figure 2.
Figure 2: Illustration of the Availability of Information Represented in Three Lab Datasets for a Two Year Period (2014 to 2016)

The test results represented by the data sources are very granular, down to outcomes at the genomic level. As represented in Figure 3, the datasets represent mutation level data for various biomarkers. The information present shows the number of patients screened for each panel of testing and the number of patients that had positive outcomes.

Figure 3: Illustration of Number of Screened Patients in a Test Panel for Various Markers

In addition to the presence of outcomes (positive or negative) for each patient, some datasets even represent point mutations. For example, an EGFR positive outcome may be “T790M” or “L858R”. This is very useful when working with targeted therapies that are appropriate for patients with very specific outcomes. Illustrated below are some examples from various sources.

Figure 4: Illustration of Information at a Granular Level From Various Data Sources

Applications of Diagnostic Data
With the landscape of the pharmaceutical industry changing from physician-driven to patient-driven, it is imperative to tap into the potential of new data sources such as diagnostic test data. Powerful and informed insights can be derived by combining this data with existing data sources such as patient level data, claims data, clinical trials data, etc. Advancements in medical infrastructure, not just through scientific breakthroughs but also through the investments made to track patient records, have enabled the sales, commercial, marketing and medical teams of pharmaceutical companies to provide customized intelligence about physicians and their patients, subject to compliance and HIPAA regulations. More than 93% of U.S. physicians today use Electronic Health Records (EHRs)6,7 and that share continues to grow. There are frameworks in place to leverage IT investments and address critical concerns such as interoperability, data sharing and complex consent. The widespread use of EHRs creates the potential for the millions of files of data they hold to be analyzed by researchers, test developers, and regulators, to better develop, refine, and understand the underpinnings and real-world applications of personalized medicine.8
The availability of diagnostic test data extends the EHR and supports a more robust understanding of the longitudinal history of patients. Through our exploration of several datasets, we identified a few potential applications.
  1. Focused Physician Targeting
    The availability of patient level data enables commercial operations teams to create sophisticated targeting models. As the landscape of healthcare changed from volume driven to value driven, commercial teams leverage as many levers as possible to drive focused physician targeting. As observed in a few pharmaceutical companies, physician targeting for oncology and rare disease products differs from primary care in several ways. Because of the increased revenue per patient for an oncology product, the necessity to target the ‘right’ physician is exponentially increased for an oncology product compared to products in the primary care space. Information from patient level data sources and local field intelligence plays an even more crucial role here. In summary, targeting in oncology is more ‘value’ driven than ‘volume’ driven.

    For oncology products, patient level data often dictates the market size depending on the type of tumor, and sales teams are structured based on the prevalence of these disease states. Using diagnostic data, teams are now able to include an additional dimension in their targeting strategy. It should be noted that diagnostic data is not likely to be used in isolation or as the primary dataset that drives targeting; it would be more useful in conjunction with other patient level and/or specialty pharmacy datasets to provide an additional layer of detail. The availability of datasets that show the volume of patients tested for a specific type of marker (ALK, BRAF, etc.), in combination with disease state mapping (i.e. integration with patient level data for tumor types), allows sales teams to prioritize physicians and institutions. Incorporating the diagnostic data into any existing patient level or physician level data (through identifiers such as physician ID or ZIP codes) adds a dimension to physician targeting.

    As illustrated in Figure 5, conventional targeting approaches use information that pertains only to the physician’s prescribing behavior. With the introduction of patient level data, commercial teams received some intelligence into the types of patients these physicians treat. However, this limits the group of patients to a particular broad disease state only, for example, all patients who are diagnosed with metastatic lung or breast cancer. Incorporating data from diagnostic testing allows us to identify which of these groups of patients test positive for the particular marker for which their product is being prescribed (e.g. ALK+, BRAF+, etc.). By extension, we would be able to group and differentiate physicians who treat patients for a particular type of cancer vs. a particular type of marker.
Figure 5: Illustration of a Segmentation Matrix That Incorporates Lab Data and Existing Patient Level Data to Group HCPs

  2. Uncovering Under-Tested Populations
    Through exploration of the select data sources that were available, we noticed a wide range of testing variations for most markers. This pattern is similar to other existing data sources as well. This could be attributed to a number of reasons: gaps in testing (i.e. different regions have different testing rates), differences in the prevalence of a mutation, or differences in how test data is captured. With a national level view of a particular type of marker data, we can identify sub-national areas of opportunity through localized variations in testing. Furthermore, once a robust lab data capability is built, this data could be used to identify geographical differences in testing, including trends over time.
  3. Enabling Informed Forecasting
    Forecasting is a critical exercise for brand teams as it feeds into and influences many other functional areas within an organization. These linkages may be unidirectional (where forecasts feed into decisions made by the other functional areas) or bidirectional (where the forecast is used to quantify the effects of market changes envisioned by other functional areas).9 Challenges in determining the correct number of patients, the lines of therapy and the appropriate biomarkers to include make it increasingly difficult for forecasters to get accurate results. The traditional approach in forecasting for oncology manufacturers bakes in a variety of assumptions and metrics from literature as well as primary and secondary data sources to define critical inputs like testing rates and outcomes. Having insights into the real world results from diagnostic data gives brand teams better accuracy in estimating patient population sizes and positivity rates.
  4. Applications to Clinical and Medical Teams
    In addition to its applications in commercial decisions, medical/clinical teams can benefit from using this diagnostic data. One of the most efficient uses of these datasets is in the identification or evaluation of clinical trial test sites. The process to select and evaluate sites can be more streamlined and efficient, as localized testing information can provide an initial pool of candidates to choose from. The datasets can also help in designing a trial based on the geographic spread of the patient population. As pharmaceutical medical/clinical teams continue to educate physicians, they can influence physicians’ testing behavior to fit their clinical trial needs.
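
As a minimal sketch of the segmentation idea described under Focused Physician Targeting, the following Python function groups physicians along two dimensions: the number of patients they treat for a disease state (from patient level data) and the number of their patients who tested positive for a marker (from lab data). The function name, the input layout of (NPI, patient ID) pairs and the cutoff values are all assumptions made for illustration; they are not taken from any of the datasets discussed in this paper.

```python
from collections import Counter


def segment_hcps(claims_patients, lab_positives, volume_cutoff, positive_cutoff):
    """Assign each physician to a cell of a 2x2 segmentation matrix.

    claims_patients: iterable of (npi, patient_id) pairs for patients with the
        disease state of interest, taken from patient level data.
    lab_positives: iterable of (npi, patient_id) pairs for patients who tested
        positive for the marker of interest, taken from lab data.
    volume_cutoff, positive_cutoff: hypothetical business thresholds.

    Returns {npi: (disease_volume_level, marker_positive_level)}, where each
    level is "high" or "low".
    """
    # De-duplicate (npi, patient_id) pairs so each patient counts once per physician.
    disease_counts = Counter(npi for npi, _ in set(claims_patients))
    marker_counts = Counter(npi for npi, _ in set(lab_positives))

    segments = {}
    # Include physicians that appear in either source; a missing count is zero.
    for npi in disease_counts.keys() | marker_counts.keys():
        high_volume = disease_counts[npi] >= volume_cutoff
        high_positive = marker_counts[npi] >= positive_cutoff
        segments[npi] = (
            "high" if high_volume else "low",
            "high" if high_positive else "low",
        )
    return segments
```

In practice the join key may be a ZIP code or another identifier rather than an NPI, and the cutoffs would be set by the brand team; the sketch only shows how lab data adds a second axis to a volume-based grouping.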

In oncology, commercial activities like sizing the market, benchmarking versus competition, identifying the right targets, and developing the appropriate customer messaging are already complicated by factors like disease staging, line of therapy, metastatic versus adjuvant therapy, combination therapy, and off-label prescribing. In the case of targeted therapies, which are developed for very specific patient sub-populations, this becomes even harder. In order to appropriately leverage this diagnostic test data source, a deeper understanding of its limitations and gaps becomes critical. In the earlier sections of this paper, we discussed the differences and ‘non-standardization’ amongst labs. Additionally, the fragmentation of testing leads to gaps in data coverage, and it seems almost impossible to get 100% coverage, at least in the near term. As data providers and data aggregators emerge and mature, we may get much closer to a substantial sample size. There is currently no standardized panel across labs that goes through a specific set of mutations sequentially for a particular type of tumor. The protocols for completing a panel may differ: one lab may complete a full panel for every patient, while another may prioritize tests and provide a varying set of results by patient. The physicians who order or request the test may or may not be able to specify or prioritize mutations to be tested. Hence, if a manufacturer has products that are indicated for biomarkers that are not high on the priority list in a standard test panel (e.g. KRAS or ROS1), patients may not even be tested for that biomarker. Lastly, the interpretation of results may differ by lab, depending upon criteria set by the lab and the leeway given to the technician. As a result, the output data differs vastly; some labs may provide results down to the mutation level while others may report only whether the outcome is positive or negative. This creates reports that are non-standardized across labs.
Given the vast possibilities of exploring these data sources and a realization of the current limitations, pharmaceutical and diagnostic testing companies should start developing a data asset strategy around these diagnostic data sources. The three critical areas that companies should start working towards in the near term are (1) investing in a repository of these data sources, (2) building a team over time that is dedicated to investigating and refining these data sources, and (3) developing subject matter expertise within organizations and identifying means to link these data sources to day-to-day operations for use by commercial, sales, medical and marketing teams. Soon enough, this is likely to become the norm as the pharmaceutical industry catches up, enabling better patient treatment options and more informed commercial decisions.
We would like to thank Randy Risser, Principal, Axtria, Inc., for his continued guidance and for playing a SME role through the course of this engagement. We would also like to thank Rishi Shrivastava, Chuyi Jiang, Kavya Nagaraj and Aakash Gupta from Axtria, Inc., who have spent countless hours developing several materials and hypotheses throughout this exploration of working with diagnostic test data.
About the Authors
Kishan Kumar is an Associate Director at Axtria, Inc. He has over 10 years of experience in management consulting and working across several life sciences and pharmaceutical clients with expertise in sales force effectiveness, commercial model designs and working with patient level data. Over the past few years, Kishan has worked with several oncology manufacturers to gather an understanding of their commercial models and various data assets used by their brand teams. Prior to his tenure at Axtria, Kishan worked for a few global consulting firms including Cognizant, Alexander Group and MDRx financial. He has a Master’s degree in Biotechnology from University of Pennsylvania, Philadelphia. 
Juhi Parikh is a Project Lead at Axtria, Inc., with 7 years of experience in analytics, consulting and market research. At Axtria, she has worked with several leading pharmaceutical clients on analytics related to sales, marketing and commercial operations. Juhi is experienced in analytics related to real world evidence, oncology, commercial model design, and commercial operations, with deep expertise in patient level sources such as claims data and diagnostic test data.  Prior to Axtria, Juhi worked with Nielsen’s Innovation Practice, conducting new product market research and forecast analytics for leading CPG and OTC pharma brands.

1    Personalized Medicine Coalition (PMC) – The case for personalized medicine – 4th Edition 2014.

2    Felix W. Frueh, Ph.D., Shashi Amur, Ph.D., Padmaja Mummaneni, Ph.D., Robert S. Epstein, M.D., Ronald E. Aubert, Ph.D., Teresa M. DeLuca, M.D., Robert R. Verbrugge, Ph.D., Gilbert J. Burckart, Pharm.D., and Lawrence J. Lesko, Ph.D., Pharmacogenomic Biomarker Information in Drug Labels Approved by the United States Food and Drug Administration: Prevalence of Related Drug Use (Pharmacotherapy 2008;28(8):992–998).

3    U.S. FDA - Table of Pharmacogenomic Biomarkers in Drug Labeling Source:

4    National Cancer Institute - Targeted Cancer Therapies – Source:

5    My Cancer Genome – Source:

6    Building Digital Trust: The role of data ethics in the digital age - Source:

7    Accenture newsroom - Source:

8    Degatano M, Sorokina T, Smyth C. National lab test database provides valuable marketing insights from all stages of the patient journey. Journal of the Pharmaceutical Management Science Association. 2015; Spring:13-21.

9    Forecasting for the pharmaceutical industry – Arthur G. Cook. 2006.