How to Derive a Survival Curve From a Base Curve and a Hazard Ratio
Introduction
In rare disease modeling, analysts often face a common challenge: deriving survival curves when only limited evidence is available.
This guide shows how to derive survival curves from a base-case curve using hazard ratios, enabling analysts to generate comparative effectiveness data when direct survival information is lacking. The method increases transparency and reproducibility in rare disease modeling.
The method was validated in a health economic modeling project where evidence gaps required careful assumptions.
The post will walk through several foundational statistical concepts, including:
- Cumulative survival proportion
- Interval hazard and hazard function
- Hazard ratio
- Conversion between cumulative survival proportion and hazard
To make the method practical and reusable, I will also share:
- A reusable Excel-based template
- A Python notebook implementation
Both tools help analysts quickly apply the method, enabling rapid adaptation to new therapies and supporting timely, informed decisions in HEOR and evidence generation.
This guide is intended for health economists, epidemiologists, and modelers working in rare diseases, where creative yet rigorous approaches are often needed to bridge evidence gaps.
Necessary Statistical Concepts
This article assumes basic statistical knowledge, but not expertise in time-to-event analysis, which involves more specialized methods.
Because this post links to the Excel and Python tools, understanding the core concepts will help users apply the workflow confidently in their work.
The goal here is not to provide a full survival analysis tutorial, but to introduce the key concepts needed to understand the analytical framework in the template and notebook.
I also note key mathematical assumptions so users understand the approach’s conditions and limitations.
Hazard
In time-to-event analysis, the hazard is the instantaneous rate at which the event occurs at a given time, among individuals who have not yet experienced it.
The method in this article uses a form of hazard called interval hazard. In some contexts, it is also called piecewise constant hazard, interval-specific hazard, or age-specific hazard.
For clarity, I will consistently use the term interval hazard.
Interval hazard reflects event risk within a defined interval, assuming individuals have survived to that interval’s start.
Hazard Ratio (HR)
A hazard ratio (HR) compares the hazard between two groups, typically a treatment group and a reference group.
Mathematically:
\[HR_{\text{B vs A}}=\frac{h_B}{h_A}\]where:
- \(h_A\) represents the hazard in Group A
- \(h_B\) represents the hazard in Group B
Hazard ratios are directional; the numerator and denominator affect interpretation.
Interpretation is simple: HR = 1 means identical hazards, HR > 1 means Group B’s hazard is higher, HR < 1 means it’s lower.
An HR of 0.70 means Group B’s hazard is estimated 30% lower than Group A’s during the interval.
Cumulative Survival Proportion
Cumulative survival proportion represents the proportion of individuals who remain event-free at a given time relative to the starting population at baseline.
Because events can only accumulate, the cumulative survival proportion never increases over time; it falls whenever events occur.
This relationship is often visualized using Kaplan–Meier curves that show survival probability over time or age.
The survival curve shows how the probability of remaining event-free changes over time.
Relationship Between Interval Hazard and Cumulative Survival Proportion
A core component of this analytical framework is the quantitative relationship between interval hazard and cumulative survival proportion.
Assuming an exponential distribution within an interval, survival proportion and interval hazard are linked by: \(S(t+\Delta t) = S(t) \exp(-h\Delta t)\) [1]
Rearranging gives interval hazard from survival proportions: \(h = -\frac{\ln[S(t+\Delta t)/S(t)]}{\Delta t}\) [2]
where:
- \(S(t)\) represents the cumulative survival proportion at time t
- \(S(t+\Delta t)\) represents the cumulative survival proportion at the end of the interval
- \(h\) represents the interval hazard within the interval
- \(\Delta t\) represents interval duration
This quantitative relationship between survival curves and interval hazards is the key mechanism underlying the reconstruction workflow provided in the Excel template and Python notebook.
This approach assumes hazards are roughly constant within each interval. Shorter intervals usually better mirror real-world changes.
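The two conversions can be sketched in a few lines of Python. This is a minimal illustration rather than the notebook's actual code: the function names and the sample curve are made up for this example, and the hazard is divided by the interval length so that it stays consistent with Equation [1]. Every survival proportion must be greater than zero for the logarithm to be defined.

```python
import math

def survival_to_interval_hazards(survival, dt=1.0):
    """Equation [2]: one interval hazard per pair of consecutive survival values."""
    # ln(S_now / S_next) equals -ln(S_next / S_now) and avoids printing "-0.0".
    return [math.log(s_now / s_next) / dt
            for s_now, s_next in zip(survival, survival[1:])]

def interval_hazards_to_survival(hazards, s0=1.0, dt=1.0):
    """Equation [1]: rebuild the curve forward from the starting proportion s0."""
    survival = [s0]
    for h in hazards:
        survival.append(survival[-1] * math.exp(-h * dt))
    return survival

# Illustrative values only (not taken from the template).
curve = [1.00, 0.95, 0.85, 0.70]
hazards = survival_to_interval_hazards(curve)
rebuilt = interval_hazards_to_survival(hazards)

print([round(h, 4) for h in hazards])
print([round(s, 4) for s in rebuilt])
```

Converting a curve to hazards and back should return the original curve, which makes a convenient sanity check before any hazard ratio is applied.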
Deriving Hazard of Group B From Group A Using an External Hazard Ratio
Once interval hazards for a reference group are known, hazards for another group can be estimated using an external hazard ratio.
By definition:
\(h_B=h_A\times HR_{\text{B vs A}}\) [3]
where:
- \(h_A\) represents the interval hazard for Group A
- \(h_B\) represents the interval hazard for Group B
- \(HR_{\text{B vs A}}\) represents the hazard ratio comparing Group B with Group A
This framework reconstructs Group B hazards by applying the external hazard ratio to each of Group A’s interval hazards.
This approach assumes proportional hazards across intervals, so the relative hazard difference stays constant.
In practice, proportional hazards may not always hold. Interpret reconstructed survival curves in light of this assumption and the external HR’s quality.
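As a cross-check, the three equations can be combined into a single update. Substituting Equation [3] into Equation [1] for Group B, and then using Equation [1] for Group A to replace \(\exp(-h_A\Delta t)\), gives:

\[S_B(t+\Delta t)=S_B(t)\exp\left(-HR_{\text{B vs A}}\,h_A\,\Delta t\right)=S_B(t)\left[\frac{S_A(t+\Delta t)}{S_A(t)}\right]^{HR_{\text{B vs A}}}\]

If the same hazard ratio is applied in every interval and both curves start at 100%, chaining these updates gives \(S_B(t)=S_A(t)^{HR_{\text{B vs A}}}\). Any interval-by-interval implementation with a constant hazard ratio should reproduce this result up to rounding.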
Step-by-Step Numeric Example
This section provides a step-by-step numerical example showing how to apply the equations and concepts in a clinical modeling scenario.
Example Background
Assume the standard-of-care survival curve is established from age 0 to 50 years.
A novel drug was evaluated in an adult clinical trial with approximately 1 year of follow-up. A hazard ratio comparing the novel drug to standard care was estimated.
The objective is to simulate a survival curve from age 0 to 50 years for the novel drug.
This example uses these terms: base curve (standard-of-care survival), target curve (novel-drug survival), base hazard (from the base curve), and target hazard (base hazard with an HR applied).
This setup poses a typical challenge: using a reference survival curve and an external hazard ratio to simulate survival for a new treatment.
Analysis
Step 1: Reconstruct interval hazards from base survival curve
The table shows age, the base survival curve, and the reconstructed base interval hazard.
For demonstration, ages 0 to 12 are shown. The Excel template includes ages 0 to 50.
The base hazard is calculated from the base survival curve using Equation [2].
| Age (year) | Base curve | Base hazard |
|---|---|---|
| 0 | 100.0% | 0.00 |
| 1 | 100.0% | 0.00 |
| 2 | 100.0% | 0.00 |
| 3 | 100.0% | 0.00 |
| 4 | 100.0% | 0.00 |
| 5 | 100.0% | 0.00 |
| 6 | 100.0% | 0.00 |
| 7 | 100.0% | 0.00 |
| 8 | 100.0% | 0.00 |
| 9 | 99.9% | 0.00 |
| 10 | 99.6% | 0.01 |
| 11 | 99.0% | 0.01 |
| 12 | 97.8% | 0.02 |
This step converts cumulative survival to interval event rates, enabling the curve to be adjusted with a hazard ratio.
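A minimal Python sketch of this step, using the rounded survival values shown in the table. Two things here are assumptions rather than details confirmed by the template: the hazard listed for a given age is taken to cover the one-year interval starting at that age, and the variable names are illustrative. Because the table is rounded to one decimal place, the computed hazards are approximate.

```python
import math

# Base curve from the table (ages 0 to 12), expressed as proportions.
base_curve = [1.000] * 9 + [0.999, 0.996, 0.990, 0.978]
ages = list(range(len(base_curve)))

# Equation [2] with one-year intervals (delta t = 1).
# Assumption: the hazard for age a covers the interval from a to a + 1, so the
# last listed age has no hazard until the next survival value is available.
base_hazard = [math.log(s_now / s_next)
               for s_now, s_next in zip(base_curve, base_curve[1:])]

for age, h in zip(ages, base_hazard):
    print(f"age {age:>2} to {age + 1:>2}: base hazard = {h:.4f}")
```

Keeping more decimal places than the table displays makes it easier to see the small hazards at younger ages, which round to 0.00 in the table.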
Step 2: Derive target hazards by applying the hazard ratio
Next, estimate the target hazard by multiplying the base hazard by the expected hazard ratio, per Equation [3].
| Age (year) | Base curve | Base hazard | Target hazard |
|---|---|---|---|
| 0 | 100.0% | 0.00 | 0.00 |
| 1 | 100.0% | 0.00 | 0.00 |
| 2 | 100.0% | 0.00 | 0.00 |
| 3 | 100.0% | 0.00 | 0.00 |
| 4 | 100.0% | 0.00 | 0.00 |
| 5 | 100.0% | 0.00 | 0.00 |
| 6 | 100.0% | 0.00 | 0.00 |
| 7 | 100.0% | 0.00 | 0.00 |
| 8 | 100.0% | 0.00 | 0.00 |
| 9 | 99.9% | 0.00 | 0.00 |
| 10 | 99.6% | 0.01 | 0.00 |
| 11 | 99.0% | 0.01 | 0.01 |
| 12 | 97.8% | 0.02 | 0.01 |
In practical terms, this step translates relative treatment-effect evidence into age-specific hazard rates for the target treatment.
For example, if the hazard ratio is less than 1, the target hazard will be lower than the base hazard, resulting in a more favorable simulated survival curve.
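In code, this step is a single multiplication per interval. The sketch below uses a hypothetical hazard ratio of 0.70 and illustrative base hazards; neither is the value used in the table above, where the hazard ratio is not stated.

```python
# Hypothetical inputs for illustration only.
hazard_ratio = 0.70
base_hazard = [0.000, 0.003, 0.006, 0.012]

# Equation [3]: scale every base interval hazard by the same hazard ratio.
target_hazard = [h * hazard_ratio for h in base_hazard]

for h_base, h_target in zip(base_hazard, target_hazard):
    print(f"base hazard {h_base:.4f} -> target hazard {h_target:.4f}")
```

An interval with a base hazard of zero keeps a target hazard of zero whatever the hazard ratio, so the two curves only start to separate once events occur in the base curve.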
Step 3: Reconstruct the target survival curve
Finally, the target survival curve is reconstructed from the derived target hazards using Equation [1].
| Age (year) | Base curve | Base hazard | Target hazard | Target curve |
|---|---|---|---|---|
| 0 | 100.0% | 0.00 | 0.00 | 100.0% |
| 1 | 100.0% | 0.00 | 0.00 | 100.0% |
| 2 | 100.0% | 0.00 | 0.00 | 100.0% |
| 3 | 100.0% | 0.00 | 0.00 | 100.0% |
| 4 | 100.0% | 0.00 | 0.00 | 100.0% |
| 5 | 100.0% | 0.00 | 0.00 | 100.0% |
| 6 | 100.0% | 0.00 | 0.00 | 100.0% |
| 7 | 100.0% | 0.00 | 0.00 | 100.0% |
| 8 | 100.0% | 0.00 | 0.00 | 100.0% |
| 9 | 99.9% | 0.00 | 0.00 | 99.8% |
| 10 | 99.6% | 0.01 | 0.00 | 99.5% |
| 11 | 99.0% | 0.01 | 0.01 | 98.9% |
| 12 | 97.8% | 0.02 | 0.01 | 98.0% |
The target curve is the final simulated survival trajectory for the novel therapy.
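The three steps can be chained into one small function. This is a sketch under stated assumptions, not the code in the notebook: the function name is illustrative and the hazard ratio of 0.70 is hypothetical, so the output will not match the target curve in the table. With a single constant hazard ratio and a curve that starts at 100%, the result should equal the base curve raised to the power of the hazard ratio, which the last lines check.

```python
import math

def derive_target_curve(base_curve, hazard_ratio, dt=1.0):
    """Derive a target curve from a base curve and one constant hazard ratio."""
    target_curve = [base_curve[0]]
    for s_now, s_next in zip(base_curve, base_curve[1:]):
        base_hazard = math.log(s_now / s_next) / dt    # Equation [2]
        target_hazard = base_hazard * hazard_ratio     # Equation [3]
        # Equation [1], applied to the target group.
        target_curve.append(target_curve[-1] * math.exp(-target_hazard * dt))
    return target_curve

# Base curve from the table (ages 0 to 12); the hazard ratio is hypothetical.
base_curve = [1.000] * 9 + [0.999, 0.996, 0.990, 0.978]
target_curve = derive_target_curve(base_curve, hazard_ratio=0.70)

for age, (s_base, s_target) in enumerate(zip(base_curve, target_curve)):
    print(f"age {age:>2}: base {s_base:.1%}  target {s_target:.1%}")

# Cross-check against the closed form S_target = S_base ** HR.
max_gap = max(abs(t - b ** 0.70) for t, b in zip(target_curve, base_curve))
print(f"largest gap vs closed form: {max_gap:.2e}")
```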
This workflow provides a transparent way to move from three key inputs:
- an established base survival curve,
- an externally estimated hazard ratio, and
- an interval-based survival calculation framework,
to a simulated survival curve for a target treatment.
While this approach is practical and transparent, its validity depends on several important assumptions. In the next section, I discuss the key assumptions and limitations analysts should consider before applying hazard-ratio-based survival curve derivation in real-world modeling projects.
Key assumptions and limitations
The primary assumption underlying this approach is that the hazard ratio between the base group and target group remains constant within each age interval.
In addition, the framework commonly assumes that a single hazard ratio can be applied consistently across the entire age range, from younger to older populations.
In real-world clinical settings, however, these assumptions may not always hold. Treatment effects can vary across age groups, disease stages, follow-up durations, or patient subpopulations. As a result, the true hazard ratio may change over time rather than remain constant.
When these assumptions are potentially violated, the simulated survival curve should be interpreted with appropriate caution.
For this reason, sensitivity analyses are often valuable to explore the uncertainty associated with the assumptions. Examples may include:
- applying alternative hazard ratio values,
- varying treatment-effect durability over time,
- using age-specific hazard ratios when evidence is available, or
- testing scenario-based assumptions for long-term extrapolation.
In applied health economic and epidemiological modeling, these sensitivity analyses can help assess the robustness of conclusions and improve transparency around structural uncertainty.
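Several of these scenarios can be explored by letting the hazard ratio vary by interval. The sketch below is one possible setup, not part of the templates: the base curve, the hazard ratio values, and the waning pattern (the hazard ratio returning to 1.00 after the third interval) are all hypothetical.

```python
import math

def derive_target_curve(base_curve, hazard_ratios, dt=1.0):
    """Apply one hazard ratio per interval (len(base_curve) - 1 values)."""
    if len(hazard_ratios) != len(base_curve) - 1:
        raise ValueError("need exactly one hazard ratio per interval")
    target = [base_curve[0]]
    for s_now, s_next, hr in zip(base_curve, base_curve[1:], hazard_ratios):
        h_base = math.log(s_now / s_next) / dt              # Equation [2]
        target.append(target[-1] * math.exp(-h_base * hr * dt))  # Equations [3] and [1]
    return target

# Illustrative base curve and hypothetical scenarios.
base_curve = [1.00, 0.98, 0.94, 0.88, 0.80, 0.70]
n = len(base_curve) - 1

scenarios = {
    "HR 0.50": [0.50] * n,
    "HR 0.70": [0.70] * n,
    "HR 0.90": [0.90] * n,
    "HR 0.70, waning to 1.00": [0.70] * 3 + [1.00] * (n - 3),
}

results = {name: derive_target_curve(base_curve, hrs)
           for name, hrs in scenarios.items()}

for name, curve in results.items():
    print(f"{name}: survival at end of horizon = {curve[-1]:.3f}")
```

Comparing the end-of-horizon survival across scenarios gives a quick view of how sensitive the derived curve is to the assumed size and durability of the treatment effect.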
Reusable templates in Excel and Python
To support your work, I developed a reusable template in both an Excel spreadsheet and a Python notebook. Feel free to explore and adapt them to enhance your projects.
This blog article can serve as an introductory guide to using and interpreting the templates, with the necessary documentation embedded in them. Whether you’re a beginner or an experienced analyst, the lightweight templates are designed to be accessible and adaptable to your skill level, making them worth checking out.
I welcome your feedback and comments on the post and templates. Your input will help improve these tools for model analysts in health economics and outcomes research (HEOR), epidemiology, and evidence generation.
Reference
numiqo. Survival analysis [Simply Explained] [Video]
Shaneyfelt, T. Interpreting hazard ratios [Video]