Step 1 of 4
The Problem
The Landscape
Cancer treatment is one of the most consequential sequential decision-making problems in medicine. A patient undergoing chemotherapy receives a series of treatment cycles -- typically 4 to 8 rounds spaced 2 to 3 weeks apart. At each cycle, the oncologist must decide: what dose should the patient receive?
Too high a dose causes severe side effects -- neutropenia, organ damage, treatment discontinuation. Too low a dose allows the tumor to grow back between cycles. The optimal dosing strategy depends on the patient's current health state: tumor size, white blood cell count, kidney function, and overall toxicity level.
Traditional clinical practice uses fixed dosing protocols. A patient weighing 70 kg with a certain body surface area receives a standard dose, regardless of how their body is responding. Some oncologists adjust doses reactively -- reducing the dose after severe side effects appear -- but this is ad hoc and inconsistent.
NovaCure Therapeutics
NovaCure Therapeutics is a mid-stage pharmaceutical company with $420M in annual revenue, focused on solid tumor oncology. They have a portfolio of three chemotherapy drugs in clinical trials and two FDA-approved treatments.
Their clinical data spans 12,000 patients across 8 clinical trials, with detailed longitudinal records: tumor measurements via imaging (every 3 weeks), complete blood counts (weekly), liver and kidney panels (biweekly), and patient-reported quality-of-life scores.
The Problem
NovaCure's lead drug, NC-4817, shows promising efficacy in Phase II trials for advanced non-small-cell lung cancer. However, the trial data reveals a critical pattern:
- 34% of patients require dose reductions due to toxicity
- 18% discontinue treatment entirely due to adverse effects
- Among patients who complete all cycles, tumor response varies enormously -- some achieve complete remission, others show minimal response
The oncology team suspects that a personalized, adaptive dosing strategy could simultaneously improve efficacy (better tumor control) and reduce toxicity (fewer dose reductions and discontinuations). But the challenge is that the optimal dose at cycle depends on everything that has happened in cycles 1 through .
This is a sequential decision-making problem under uncertainty. It is exactly the kind of problem that reinforcement learning was designed to solve.
Business Impact
If NovaCure can demonstrate that an RL-guided dosing protocol improves outcomes, the potential impact is:
- Regulatory: A companion dosing algorithm submitted alongside the drug could accelerate FDA approval and differentiate NC-4817 from competitors
- Clinical: Improved patient outcomes (higher response rates, fewer adverse events) directly impact the drug's commercial value
- Financial: Reducing treatment discontinuation from 18% to under 10% could increase revenue by $45M annually for NC-4817 alone
- Platform: A validated RL dosing framework could be applied across NovaCure's entire drug portfolio