# Model Performance Indicators
## 📊 Overview
This guide covers the essential metrics for evaluating hydrological prediction models. Understanding these metrics is crucial for assessing model reliability and comparing different approaches.
## 🎯 Why Performance Metrics Matter
- Model Selection: Choose the best model for your specific application
- Validation: Ensure your model generalizes well to unseen data
- Improvement: Identify areas where models need refinement
- Communication: Report model performance to stakeholders
## 📈 Core Metrics

### 1. Coefficient of Determination (R²)

#### Mathematical Formula

$$
R^{2} = \left[ \frac{\sum_{i=1}^{n} \left(Q_{i}^{\mathrm{obs}} - \overline{Q^{\mathrm{obs}}}\right)\left(Q_{i}^{\mathrm{sim}} - \overline{Q^{\mathrm{sim}}}\right)}{\sqrt{\sum_{i=1}^{n} \left(Q_{i}^{\mathrm{obs}} - \overline{Q^{\mathrm{obs}}}\right)^{2}} \, \sqrt{\sum_{i=1}^{n} \left(Q_{i}^{\mathrm{sim}} - \overline{Q^{\mathrm{sim}}}\right)^{2}}} \right]^{2}
$$

Where:

- \(Q_{i}^{\mathrm{obs}}\) = observed value at time step \(i\)
- \(Q_{i}^{\mathrm{sim}}\) = simulated value at time step \(i\)
- \(\overline{Q^{\mathrm{obs}}}\), \(\overline{Q^{\mathrm{sim}}}\) = means of observed and simulated values
- \(n\) = number of observations

#### Interpretation
| R² Value | Interpretation |
|---|---|
| 1.0 | Perfect correlation |
| 0.8-1.0 | Very strong relationship |
| 0.6-0.8 | Strong relationship |
| 0.4-0.6 | Moderate relationship |
| 0.2-0.4 | Weak relationship |
| < 0.2 | Very weak or no relationship |
**Key Points**
- R² measures the proportion of variance explained by the model
- Values range from 0 to 1
- Higher values indicate better linear correlation
- Does not indicate whether predictions are biased
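The last point is worth seeing directly: a constant offset leaves R² at a perfect 1.0 while every prediction is wrong by the same amount. A minimal sketch with made-up numbers:

```python
import numpy as np

obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = obs + 2.0  # every prediction is exactly 2 units too high

# Perfect linear correlation despite the systematic offset
r = np.corrcoef(obs, sim)[0, 1]
r2 = r ** 2           # 1.0

# The bias is invisible to R², but obvious from the mean error
bias = np.mean(sim - obs)   # +2.0
```

This is why R² should always be paired with a bias-sensitive metric such as PBIAS.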
#### Python Implementation

```python
import numpy as np

def calculate_r2(observed, simulated):
    """Calculate R² between observed and simulated values."""
    # Pearson correlation coefficient between the two series
    r = np.corrcoef(observed, simulated)[0, 1]
    # Square it to get R²
    return r ** 2
```
### 2. Nash-Sutcliffe Efficiency (NSE)

#### Mathematical Formula

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{i}^{\mathrm{obs}} - Q_{i}^{\mathrm{sim}}\right)^{2}}{\sum_{i=1}^{n} \left(Q_{i}^{\mathrm{obs}} - \overline{Q^{\mathrm{obs}}}\right)^{2}}
$$

#### Interpretation
| NSE Value | Model Performance |
|---|---|
| 1.0 | Perfect model |
| 0.75-1.0 | Very good performance |
| 0.65-0.75 | Good performance |
| 0.50-0.65 | Satisfactory performance |
| 0.0-0.50 | Unsatisfactory (but better than mean) |
| < 0 | Unacceptable (worse than using mean) |
**Important:** NSE < 0 means the observed mean is a better predictor than the model!
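To see a negative NSE in practice, consider predictions that scatter more widely around the observations than the observed mean does (illustrative numbers, not real data):

```python
import numpy as np

obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = np.array([5.0, 1.0, 4.0, 2.0, 3.0])  # badly scrambled predictions

sse = np.sum((obs - sim) ** 2)            # 26.0: model's squared error
var = np.sum((obs - np.mean(obs)) ** 2)   # 10.0: squared error of the mean
nse = 1 - sse / var                       # -1.6: the mean (3.0) predicts better
```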
#### Python Implementation

```python
import numpy as np

def calculate_nse(observed, simulated):
    """Calculate Nash-Sutcliffe Efficiency."""
    # Numerator: sum of squared model errors
    numerator = np.sum((observed - simulated) ** 2)
    # Denominator: variance of the observations around their mean
    denominator = np.sum((observed - np.mean(observed)) ** 2)
    return 1 - numerator / denominator
```
### 3. Percent Bias (PBIAS)

#### Mathematical Formula

$$
\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{n} \left(Q_{i}^{\mathrm{obs}} - Q_{i}^{\mathrm{sim}}\right)}{\sum_{i=1}^{n} Q_{i}^{\mathrm{obs}}}
$$

#### Interpretation
| PBIAS Value | Interpretation |
|---|---|
| 0% | No bias (perfect) |
| > 0% | Model underestimates |
| < 0% | Model overestimates |
| within ±5% | Very good |
| within ±10% | Good |
| within ±15% | Satisfactory |
| beyond ±25% | Unsatisfactory |
**Rule of Thumb**
- For streamflow: PBIAS < ±10% is very good
- For sediment: PBIAS < ±15% is very good
- For nutrients: PBIAS < ±25% is acceptable
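The sign convention is easy to verify: predictions that are uniformly 10% low give a PBIAS of +10% (illustrative values):

```python
import numpy as np

obs = np.array([10.0, 20.0, 30.0])
sim = 0.9 * obs   # model underestimates every value by 10%

# Positive result confirms underestimation under this convention
pbias = 100 * np.sum(obs - sim) / np.sum(obs)   # +10.0
```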
#### Python Implementation

```python
import numpy as np

def calculate_pbias(observed, simulated):
    """Calculate Percent Bias (positive = model underestimates)."""
    return 100 * np.sum(observed - simulated) / np.sum(observed)
```
## 🔧 Complete Evaluation Function

Here's the comprehensive function used throughout this guide:

```python
import numpy as np

def evaluate_model(obs, sim):
    """
    Calculate R², NSE, and PBIAS between observed and simulated values.

    Parameters
    ----------
    obs : array-like
        Observed values
    sim : array-like
        Simulated/predicted values

    Returns
    -------
    dict
        Dictionary containing R², NSE, and PBIAS

    Example
    -------
    >>> obs = np.array([1.2, 2.3, 3.1, 4.5, 5.2])
    >>> sim = np.array([1.3, 2.1, 3.3, 4.2, 5.5])
    >>> results = evaluate_model(obs, sim)
    >>> print(f"R² = {results['R²']:.3f}")
    >>> print(f"NSE = {results['NSE']:.3f}")
    >>> print(f"PBIAS = {results['PBIAS']:.2f}%")
    """
    # Convert to numpy arrays
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)

    # R² (coefficient of determination)
    r = np.corrcoef(obs, sim)[0, 1]
    r2 = r ** 2

    # NSE (Nash-Sutcliffe Efficiency)
    nse = 1 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

    # PBIAS (Percent Bias)
    pbias = 100 * np.sum(obs - sim) / np.sum(obs)

    return {'R²': r2, 'NSE': nse, 'PBIAS': pbias}
```
## 📊 Additional Metrics

### Mean Absolute Error (MAE)

$$
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Q_{i}^{\mathrm{obs}} - Q_{i}^{\mathrm{sim}} \right|
$$

### Root Mean Square Error (RMSE)

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(Q_{i}^{\mathrm{obs}} - Q_{i}^{\mathrm{sim}}\right)^{2}}
$$

```python
import numpy as np
from sklearn.metrics import mean_squared_error

rmse = np.sqrt(mean_squared_error(observed, simulated))
```
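Both metrics can also be computed with NumPy alone; `mean_squared_error` is only a convenience. A short sketch with hypothetical arrays:

```python
import numpy as np

obs = np.array([1.0, 2.0, 3.0])
sim = np.array([2.0, 2.0, 2.0])

mae = np.mean(np.abs(obs - sim))           # average absolute error
rmse = np.sqrt(np.mean((obs - sim) ** 2))  # penalizes large errors more than MAE
```

Because RMSE squares the errors before averaging, RMSE ≥ MAE always holds, and the gap widens when a few errors are much larger than the rest.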
## 📈 Visualization of Performance

### Scatter Plot with Metrics

```python
import matplotlib.pyplot as plt

def plot_performance(obs, sim, metrics):
    """Create a scatter plot of simulated vs. observed values with metrics."""
    fig, ax = plt.subplots(figsize=(8, 8))

    # Scatter plot of paired values
    ax.scatter(obs, sim, alpha=0.5, s=20)

    # 1:1 line (perfect agreement)
    min_val = min(obs.min(), sim.min())
    max_val = max(obs.max(), sim.max())
    ax.plot([min_val, max_val], [min_val, max_val], 'r--', label='1:1 Line')

    # Add metrics as a text box
    textstr = f"R² = {metrics['R²']:.3f}\n"
    textstr += f"NSE = {metrics['NSE']:.3f}\n"
    textstr += f"PBIAS = {metrics['PBIAS']:.2f}%"
    ax.text(0.05, 0.95, textstr, transform=ax.transAxes,
            fontsize=12, verticalalignment='top',
            bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

    ax.set_xlabel('Observed')
    ax.set_ylabel('Simulated')
    ax.set_title('Model Performance')
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
```
## 🎯 Performance Standards for Hydrological Models

### Monthly Time Step
| Performance Rating | NSE | R² | PBIAS |
|---|---|---|---|
| Very Good | > 0.75 | > 0.75 | < ±10% |
| Good | 0.65-0.75 | 0.65-0.75 | ±10-15% |
| Satisfactory | 0.50-0.65 | 0.50-0.65 | ±15-25% |
| Unsatisfactory | < 0.50 | < 0.50 | > ±25% |
### Daily Time Step
| Performance Rating | NSE | R² | PBIAS |
|---|---|---|---|
| Very Good | > 0.65 | > 0.70 | < ±15% |
| Good | 0.50-0.65 | 0.60-0.70 | ±15-20% |
| Satisfactory | 0.40-0.50 | 0.50-0.60 | ±20-30% |
| Unsatisfactory | < 0.40 | < 0.50 | > ±30% |
**Time Scale Matters:** Daily models typically have lower performance metrics than monthly models due to higher temporal variability.
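The monthly NSE column above can be turned into a small rating helper. The function name and the exact handling of boundary values are assumptions, not part of any published standard:

```python
def rate_nse_monthly(nse):
    """Map a monthly-time-step NSE to the rating table above (boundaries assumed)."""
    if nse > 0.75:
        return "Very Good"
    if nse > 0.65:
        return "Good"
    if nse >= 0.50:
        return "Satisfactory"
    return "Unsatisfactory"
```

For daily-time-step models, the same structure applies with the looser thresholds from the daily table.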
## ✅ Best Practices
- Use Multiple Metrics: No single metric tells the complete story
- Consider Time Scale: Adjust expectations based on temporal resolution
- Check Residuals: Look for patterns in prediction errors
- Validate on Independent Data: Always test on unseen data
- Report Uncertainty: Include confidence intervals when possible
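For the residual check, one quick diagnostic is to correlate the errors with the observations; a strong correlation indicates a systematic pattern rather than random noise. A sketch with synthetic data chosen to show the effect:

```python
import numpy as np

obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = 0.5 * obs + 1.0   # model systematically compresses the range

residuals = obs - sim   # [-0.5, 0.0, 0.5, 1.0, 1.5]
# Correlation of 1.0 here: errors grow with magnitude, a clear pattern
r = np.corrcoef(residuals, obs)[0, 1]
```

A residual correlation near zero, by contrast, is consistent with purely random errors.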
## 🚀 Next Steps

Now that you understand performance metrics:

- Apply them to evaluate your Simple Linear Regression model
- Use them to compare different models
- Learn about Feature Engineering to improve performance