## Detailed Report ### Introduction to Interest Rate Modeling The modeling of yield curves plays a central and increasingly vital role in the financial world. Traditional static valuation methods, such as the widely used Discounted-Cash-Flow (DCF) method, are increasingly recognized by quantitative analysts as inadequate for evaluating real options in a dynamic economic environment. Historically, from roughly 2002 to 2004, interest rate structures were relatively stable. However, the consistent rise leading up to the 2008 financial crisis—and the unprecedented structural shifts that followed—demonstrated that volatility and flexibility are critical parameters. Because the present value of any future cash flow relies heavily on the denominator (the interest rate), project valuation, portfolio management, and derivative pricing all fundamentally rely on accurately predicting future spot rates using models that can adapt to stochastic (random) market behaviors. ### Stochastic Differential Equations (SDEs) & Itō Calculus To accurately model these random market behaviors, we must transition from classical deterministic calculus into the realm of stochastic processes. Financial markets are driven by unpredictable shocks, meaning a standard ordinary differential equation (ODE) is insufficient. Instead, we use Stochastic Differential Equations (SDEs), which take the general form: $$dX_t = a(X_t, t)dt + b(X_t, t)dB_t$$ Here, $a(X_t, t)dt$ represents the deterministic "drift" (the expected direction), and $b(X_t, t)dB_t$ represents the random "diffusion" (the volatility). The variable $dB_t$ is the standard Brownian motion (or Wiener process). Brownian motion is a continuous-time stochastic process characterized by independent, normally distributed increments. Because a Brownian path is infinitely jagged and nowhere differentiable, classical Newton-Leibniz calculus breaks down when trying to integrate or differentiate it. To overcome this, financial mathematics relies on **Itō Calculus**. The cornerstone of Itō calculus is the quadratic variation rule, which proves that over time, the squared increments of Brownian motion scale linearly with time: $(dB_t)^2 \equiv dt$. Using a second-order Taylor expansion, Kiyoshi Itō derived the famous Itō's Lemma, allowing us to differentiate functions of stochastic variables. For a smooth function $f(t, X_t)$, Itō's Lemma is defined as: $$df(t,X_t) = \left( \frac{\partial f}{\partial t} + \mu\frac{\partial f}{\partial X_t} + \frac{1}{2}\sigma^2\frac{\partial^2 f}{\partial X_t^2} \right)dt + \sigma\frac{\partial f}{\partial X_t}dB_t$$ This lemma is the mathematical engine that allows us to derive theoretical bond prices from underlying random interest rate models. ### The Cox-Ingersoll-Ross (CIR) Model To forecast rates reliably, we utilize the Cox-Ingersoll-Ross (CIR) model. The CIR model is an affine stochastic diffusion process. It addresses key empirical criteria for sound interest rate modeling, such as enforcing mean reversion and naturally preventing the systematic prediction of negative interest rates. The instantaneous interest rate $r(t)$ in the CIR model is defined by the following stochastic differential equation: $$dr(t) = \kappa(\theta - r(t))dt + \sigma\sqrt{r(t)}dB_t$$ - $\kappa$: The speed of the mean-reversion process. - $\theta$: The long-term equilibrium value (mean-reversion level). - $\sigma$: The volatility parameter. - $dB_t$: The standard Brownian motion (Wiener process). A distinguishing feature of the CIR model is heteroskedasticity: the absolute volatility scales with the square root of the interest rate ($\sqrt{r(t)}$), perfectly capturing the dynamic properties of spot rate time series. Beyond this, the model inherently assumes **Mean Reversion**, meaning interest rates will eventually revert to a stable, long-term equilibrium ($\theta$) at a speed dictated by $\kappa$. Furthermore, it relies on a non-central chi-square distribution, which mathematically prevents the systematic forecasting of negative rates. ### From Spot Rates to Bond Prices: The Single-Factor Model While the CIR equation elegantly describes the movement of the instantaneous short rate, financial institutions need to price zero-coupon bonds $P(t, r, T)$ across various maturities. In this notation, $t$ is the current time, $r$ is the current spot rate, and $T$ is the maturity date. To bridge this gap, quantitative analysts construct a theoretical, self-financing portfolio consisting of long and short positions in bonds with different maturities. We start by defining a general stochastic process for the spot rate: $$dr(t) = f(r(t),t)dt + \rho(r(t),t)dB(t)$$ Here, $f(r(t),t)$ acts as the deterministic drift term, $\rho(r(t),t)$ is the volatility term, and $dB(t)$ remains the standard Brownian motion. By applying Itō's Lemma to the bond price function $P$, we can model its dynamics. According to the fundamental "No-Arbitrage" principle, a completely risk-free, hedged portfolio must yield exactly the risk-free interest rate. When this strict mathematical constraint is applied to our stochastic differential equations, a crucial new parameter emerges from the derivation: the **Market Price of Risk ($\lambda$)**. This leads us to the central partial differential equation (PDE) for the bond price: $$P_t + (f - \rho\lambda)P_r + \frac{\rho^2}{2}P_{rr} - rP = 0$$ In this central equation: - **$P_t$**: The partial derivative of the bond price with respect to time ($t$). - **$P_r$**: The first partial derivative of the bond price with respect to the interest rate ($r$). - **$P_{rr}$**: The second partial derivative of the bond price with respect to the interest rate ($r$). - **$\lambda$**: The market price of risk. In practice, $\lambda$ is typically negative. This mathematical negativity reflects a core economic reality: investors demand a risk premium (higher yields) for locking their capital into long-term bonds compared to short-term investments. The Market Price of Risk is the mathematical bridge that enforces this premium across the entire yield curve. To solve this complex PDE analytically, financial mathematics uses an affine structural approach (Ansatz): $$P(\tau) = \exp(A(\tau) - B(\tau)r)$$ Here, $\tau = T - t$ represents the time to maturity. $A(\tau)$ and $B(\tau)$ are time-dependent functions whose solutions are found by solving a system of Riccati equations. ### Capturing Reality: The Multi-Factor Expansion The Single-Factor CIR model assumes that the entire yield curve is driven by a single underlying source of randomness. This implies that all bond yields across all maturities move in perfect lockstep. This contradicts empirical market observations, where the yield curve frequently twists, steepens, or flattens independently. To capture these complex dynamics, the framework is expanded into a **Multi-Factor Model**. Instead of a single state variable, the instantaneous interest rate is modeled as the sum of several independent, unobservable CIR processes (factors): $$r = \sum_{i=1}^n y_i$$ In this expansion: - **$n$**: The total number of independent factors being modeled. - **$y_i$**: The individual, unobservable state variable (factor). Each of these state variables follows its own CIR-based stochastic differential equation: $$dy_i = f_i(t,y_i)dt + \sum_{j=1}^n \rho(y_i,y_j,t)dB_i(t)$$ As a result, the price of the zero-coupon bond expands into a product-like exponential form that accounts for all independent factors simultaneously: $$P(\tau,y_1,y_2,...,y_n) = \exp\left(\sum_{i=1}^n A_i(\tau) - B_i(\tau)y_i\right)$$ Where $A_i(\tau)$ and $B_i(\tau)$ are the specific time-dependent Riccati solutions calculated for the $i$-th factor, and $\tau$ remains the time to maturity. In a 3-factor model, each factor independently follows its own CIR process with its own unique parameters ($\kappa_i, \theta_i, \sigma_i, \lambda_i$). This allows the model to distribute roles: one factor naturally optimizes to capture short-term instant behavior, another captures long-term macroeconomic equilibrium, and the third handles medium-term flexions. ### Parameter Estimation with the Kalman Filter Because these individual factors ($y_i$) are virtual, unobservable quantities, extracting them from raw market prices requires an advanced state-space filtering mechanism. We implemented a Kalman Filter. Originally famous for its real-time applications in the NASA Apollo program, the Kalman Filter acts as a Bayesian minimum-variance estimator. It solves a fundamental problem: how do you find the "true" value of a state when both your theoretical model and your market measurements are flawed? At its core, the linear filter continuously calculates a **Kalman Gain** ($K$)—a dynamic weighting factor that decides who to trust. If the market data is extremely noisy, the gain drops, and the filter trusts its internal mathematical model. If the internal model is highly uncertain, the gain rises, and the filter snaps to the noisy market measurements. Use the interactive simulator below to observe this tug-of-war. Adjust the Sensor Noise ($R$) and System Noise ($Q$) to see how the Kalman Gain shifts its trust in real-time. To execute this mathematically, the filter translates our problem into a **State-Space Representation**: 1. **The System Equation:** Describes the hidden evolution of the unobservable CIR factors ($y_t$). $$y_{t+1} = C + F y_t + q_t$$ 2. **The Measurement Equation:** Maps those hidden factors to the actual market data we can observe (e.g., empirical zero-coupon bond prices, $z_t$). $$z_t = A + H y_t + r_t$$ Here, $q_t$ represents the _system noise_ (inaccuracies inherent in the theoretical model, tied to Brownian motion) with covariance $Q$, and $r_t$ represents the _measurement noise_ (market microstructure anomalies) with covariance $R$. ### Evolving to the Adaptive Kalman Filter The standard linear filter is incredibly powerful, but it has a critical flaw when applied to the CIR framework: it uses static noise parameters and assumes a normal distribution, meaning it mathematically allows the "True State" to dip below zero (as seen in the chart above). Because the CIR process involves a strict non-negativity constraint and state-dependent system noise (heteroskedasticity), a standard linear filter is insufficient. We deployed a modified **Adaptive (Quasi-Linear) Kalman Filter** operating via a continuous predictor-corrector loop that introduces two major CIR-specific evolutions: 1. **State-Dependent System Noise (Dynamic $Q$):** Because volatility scales with the square root of the interest rate ($\sqrt{r(t)}$), the system noise is not constant. The adaptive filter actively recalculates the system noise matrix at every step: $Q_t = \sigma^2 \cdot r_t \cdot dt$. Higher interest rates automatically trigger higher mathematical uncertainty. 2. **The Non-Negativity Constraint:** During the loop, the algorithm predicts the state vector forward in time ($\hat{y}_{t+1}$). If this predictor calculates a negative interest rate, our adaptive mechanism actively intervenes. It forces the value to zero ($\tilde{y}_t \ge 0$) to respect the CIR model and immediately updates the error covariance matrix to account for this manual intervention. In the adaptive simulator below, notice how the "Dynamic System Noise (Q)" actively fluctuates alongside the height of the curve, proving the filter's awareness of heteroskedasticity. Furthermore, watch the "Floor Hits" counter track the algorithm's active interventions to prevent negative spot rates during periods of high volatility. ### Project Mission & Implementation The core objective of the project was to calibrate 1-factor, 2-factor, and 3-factor CIR models against highly empirical data. We processed historical data for US zero-coupon bonds spanning from February 1, 1982, to June 1, 2014, across maturities ranging from 1 to 10 years. The algorithmic pipeline included: 1. **Initialization:** Establishing baseline parameters based on long-term expectation values. 2. **Filtration:** Running the adaptive Kalman Filter across the 32-year dataset to extract latent state variables. 3. **Optimization:** Maximizing the Log-Likelihood function to define the ideal combination of drift, volatility, and mean-reversion speed for each independent factor. ### Outcome & Results The application of the Kalman Filter proved highly successful in replicating the true spot rate path. However, distinct performance differences emerged between the multi-factor models: - **The 1-Factor Limitation:** The 1-factor model struggled to accurately represent the entire curve. Because it relies on a single state variable, it cannot specialize in both short-term volatility and long-term mean reversion. As a result, the optimizer produced compromised "average" parameters that typically over-weighted the short end of the curve, leading to inaccurate upward trends for long-term maturities. - **The 3-Factor Superiority:** The 3-factor (and 2-factor) models exhibited a clear role distribution among their state variables. One factor naturally optimized for short-term instant behavior (characterized by a high mean-reversion speed $\kappa$ and high volatility $\sigma$), while another captured the long-term macroeconomic equilibrium (characterized by a dominant $\theta$ and lower volatility). The third factor served to capture medium-term fluctuations. Using the optimized parameters from the multi-factor models, we successfully executed a Monte Carlo simulation utilizing 10,000 independent paths to forecast the yield curve up to the year 2047. By averaging these paths to reduce noise, this robust simulation engine allows for smoothed, highly reliable visualizations of future spot rates across all maturities, providing a powerful quantitative tool for long-term investment analysis. ### P.S. Algorithm Details To fully grasp how the adaptive Kalman Filter processes market data recursively, we can break down the predictor-corrector loop illustrated in the algorithm flowchart. **State and Measurement Variables:** - $y_t$: The true, unobservable state vector representing the underlying CIR factors. - $\hat{y}_t$: The prediction of the state vector prior to measurement correction. - $\tilde{y}_t$: The corrected prediction of the state vector after integrating the actual market data. - $z_t$: The actual, measured interest rates from the market. - $\hat{z}_t$: The predicted interest rates derived from the state space. - $\Delta z_t$: The difference (residual) between the actual measured values $z_t$ and their predictions $\hat{z}_t$. **Matrices and Covariances:** - $F$ and $C$: The system matrix and vector used to calculate the time evolution of the state vector into the next time step. - $H$ and $A$: The output matrix and vector used to calculate the observable quantities (market rates) from the state vector. - $K_t$: The Kalman Gain, which dynamically weights how much correction should be applied based on the measurement residual. - $\hat{P}_t$ and $\tilde{P}_t$: The predicted and corrected covariance matrices of the estimation error. - $Var(z_t)$: The covariance matrix of the predicted interest rates. - $R$: The covariance matrix of the measurement noise, representing the inaccuracy of the measured signals. - $Q_t$: The covariance matrix of the system noise, capturing the inaccuracies of the theoretical model. - $\mathbb{I}$: The mathematical identity matrix. **The Step-by-Step Loop:** 1. **Prediction:** Utilizing the previously established state, the algorithm forecasts the current state ($\hat{y}_t$) and its associated error covariance ($\hat{P}_t$). From this, it predicts what the market rates should be ($\hat{z}_t$) and calculates their variance ($Var(z_t)$). 2. **Kalman Gain & Residual:** True market data ($z_t$) is introduced. The algorithm finds the discrepancy ($\Delta z_t$) and computes the optimal Kalman Gain ($K_t$). 3. **State Correction & Constraint:** The state is corrected to $\tilde{y}_t$. Crucially, the non-negativity constraint is applied: if any value falls below zero, it is manually set to zero ($\tilde{y}_{i,t} \ge 0$). 4. **Iteration & Likelihood Optimization:** The error covariance is updated to $\tilde{P}_t$, and the new, state-dependent system noise ($Q_{t+1}$) is generated for the next cycle. Simultaneously, the Log-Likelihood is calculated at each step to fuel the overarching parameter optimization.