# Import modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gamma
from scipy.optimize import curve_fit
77 Rainfall generator
Rainfall weather generators are tools designed to simulate realistic precipitation events based on statistical properties of historical actual rainfall data. These generators often rely on probability distributions to model the frequency, intensity, and duration of rain events within a specific region. By fitting historical rainfall patterns to known statistical distributions, such as the Poisson distribution for occurrence and the Gamma or Log-Normal distributions for rainfall amounts, weather generators can create synthetic rainfall series that mimic the natural variability observed in real-world precipitation records.
Rainfall generators are useful in agricultural and hydrological modeling by providing a way to generate plausible rainfall scenarios to assist researchers and practitioners in planning and decision-making processes, particularly in contexts where understanding the variability and extremes of rainfall is essential.
Read and inspect dataset
# Load sample data
= pd.read_csv("../datasets/KS_Manhattan_6_SSW.csv",
df =[-9999], parse_dates=['LST_DATE'], date_format='%Y%m%d')
na_values
# Inspect a few rows
3) df.head(
WBANNO | LST_DATE | CRX_VN | LONGITUDE | LATITUDE | T_DAILY_MAX | T_DAILY_MIN | T_DAILY_MEAN | T_DAILY_AVG | P_DAILY_CALC | ... | SOIL_MOISTURE_5_DAILY | SOIL_MOISTURE_10_DAILY | SOIL_MOISTURE_20_DAILY | SOIL_MOISTURE_50_DAILY | SOIL_MOISTURE_100_DAILY | SOIL_TEMP_5_DAILY | SOIL_TEMP_10_DAILY | SOIL_TEMP_20_DAILY | SOIL_TEMP_50_DAILY | SOIL_TEMP_100_DAILY | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 53974 | 2003-10-01 | 1.201 | -96.61 | 39.1 | NaN | NaN | NaN | NaN | NaN | ... | -99.0 | -99.0 | -99.0 | -99 | -99 | NaN | NaN | NaN | NaN | NaN |
1 | 53974 | 2003-10-02 | 1.201 | -96.61 | 39.1 | 18.9 | 2.5 | 10.7 | 11.7 | 0.0 | ... | -99.0 | -99.0 | -99.0 | -99 | -99 | NaN | NaN | NaN | NaN | NaN |
2 | 53974 | 2003-10-03 | 1.201 | -96.61 | 39.1 | 22.6 | 8.1 | 15.4 | 14.8 | 0.0 | ... | -99.0 | -99.0 | -99.0 | -99 | -99 | NaN | NaN | NaN | NaN | NaN |
3 rows × 28 columns
# Add year, month, and day of the year to summarize data in future steps.
"YEAR"] = df["LST_DATE"].dt.year
df["MONTH"] = df["LST_DATE"].dt.month
df["DOY"] = df["LST_DATE"].dt.dayofyear
df[3) df.head(
WBANNO | LST_DATE | CRX_VN | LONGITUDE | LATITUDE | T_DAILY_MAX | T_DAILY_MIN | T_DAILY_MEAN | T_DAILY_AVG | P_DAILY_CALC | ... | SOIL_MOISTURE_50_DAILY | SOIL_MOISTURE_100_DAILY | SOIL_TEMP_5_DAILY | SOIL_TEMP_10_DAILY | SOIL_TEMP_20_DAILY | SOIL_TEMP_50_DAILY | SOIL_TEMP_100_DAILY | YEAR | MONTH | DOY | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 53974 | 2003-10-01 | 1.201 | -96.61 | 39.1 | NaN | NaN | NaN | NaN | NaN | ... | -99 | -99 | NaN | NaN | NaN | NaN | NaN | 2003 | 10 | 274 |
1 | 53974 | 2003-10-02 | 1.201 | -96.61 | 39.1 | 18.9 | 2.5 | 10.7 | 11.7 | 0.0 | ... | -99 | -99 | NaN | NaN | NaN | NaN | NaN | 2003 | 10 | 275 |
2 | 53974 | 2003-10-03 | 1.201 | -96.61 | 39.1 | 22.6 | 8.1 | 15.4 | 14.8 | 0.0 | ... | -99 | -99 | NaN | NaN | NaN | NaN | NaN | 2003 | 10 | 276 |
3 rows × 31 columns
# Observe trends in precipitation data (2004 to 2016)
=(5,4))
plt.figure(figsize0,400)
plt.xlim('Day of the year')
plt.xlabel('Precipitation (mm)')
plt.ylabel(for year in range(2004,2017):
= df["YEAR"] == year
idx_year = df["P_DAILY_CALC"][idx_year].cumsum()
P_cum range(1,len(P_cum)+1), P_cum)
plt.plot(
# Observe histograms of precipitation
=(5,4))
plt.figure(figsize'Annual precipitation (mm)')
plt.xlabel('Density')
plt.ylabel(for year in range(2004,2017):
= df["YEAR"] == year
idx_year = df["P_DAILY_CALC"][idx_year]
rainfall_year = rainfall_year[rainfall_year > 1]
rainfall_year =25, density=True) plt.hist(rainfall_year, bins
Modeling rainfall as a Markov chain.
This type of rainfall generators is kwnon as the Richardson-type models (Richardson and Wright, 1984). The model computes the rainfall at time t
as a function of time t-1
in three steps. To be consistent with the original manuscript by Richardson and Wright (1984), we will use the term “wet” for a day with measurable rainfall and “dry” for a day without measurable rainfall.
Step 1: Find if the previous day was dry or wet
Step 2: Compute the probability of having a rainfall event on day t
given the condition on the previous day. So, here we need to have two different probability distributions, one that we will use if yesterday was dry and another if yesterday was wet.
What is the probability of having a wet day if yesterday was dry? (i.e. P(W/D) )
What is the probability of having a wet day if yesterday was wet? (i.e. P(W/W) )
To accurately simulate seasonal rainfall trends, the model relies on Wet-Dry and Wet-Wet distributions on a monthly basis.
Steps 1 and 2 deal with the probability of rainfall occurrence, not the amount of rainfall. The next step will help us determine the amout of rainfall on wet days.
Step 3: If as a consequence of the random process we obtain that on day t
there is a rainfall event (i.e., day t
is a wet
day), then we have to compute the amount. To do this we will use a gamma distribution, which is suited to heavily skewed distributions, such as those resulting from histograms of daily rainfall (see previous histogram figure). To complete this step were need to have some parameters that describe this rainfall amount distribution on a monthly basis. These parameters are often determined by fitting the gamma distribution to historical rainfall data.
We will first run an example using the loaded dataset to learn how to compute the Wet-Dry and Wet-Wet probabilities for the entire year. It’s important that we first learn how to compute the simplest step before moving into a more detailed characterization of the rainfall process on a monthly basis.
Determine probability of rainfall occurrence by month
The first step of this process is to characterize the probability functions using observed data.
Probability of Wet-Dry and Wet-Wet days
# Initialize empty arrays
= []
WW = []
WD = []
month_WW = []
month_WD
# Create shorter variable names
= df["MONTH"].values
months = df["P_DAILY_CALC"].values
rainfall
for i in range(1,df.shape[0]):
# Determine if yesterday was Wet or Dry
# Append "1" if today was Wet, else append "0"
if rainfall[i-1] > 0:
month_WW.append(months[i])if rainfall[i] > 0:
1)
WW.append(else:
0)
WW.append(
elif rainfall[i-1] == 0:
month_WD.append(months[i])if rainfall[i] > 0:
1)
WD.append(else:
0)
WD.append(
print(sum(WW)/len(WW))
print(sum(WD)/len(WD))
0.4166666666666667
0.20590604026845638
Create table with probabilities by month
# Wet/Wet table
= pd.DataFrame({'month':month_WW, 'WW':WW})
df_WW
# Wet/Dry
= pd.DataFrame({'month':month_WD, 'WD':WD})
df_WD
# Compute monthly probabilities per month
= df_WW.groupby("month").sum() / df_WW.groupby("month").count()
monthly_WW = df_WD.groupby("month").sum() / df_WD.groupby("month").count()
monthly_WD
# Create single table of parameters
= pd.concat([monthly_WW, monthly_WD], axis=1)
coeff coeff
WW | WD | |
---|---|---|
month | ||
1 | 0.338028 | 0.141176 |
2 | 0.320513 | 0.161184 |
3 | 0.401709 | 0.219048 |
4 | 0.503448 | 0.273063 |
5 | 0.480769 | 0.290909 |
6 | 0.442953 | 0.293680 |
7 | 0.415254 | 0.228571 |
8 | 0.412214 | 0.245847 |
9 | 0.362745 | 0.201258 |
10 | 0.409524 | 0.193353 |
11 | 0.388889 | 0.129794 |
12 | 0.381579 | 0.144092 |
Determine probability of daily rainfall amount
# Fit gamma distribution to each month
= []
monthly_shape = []
monthly_scale
for m in range(1,13):
= df["MONTH"] == m
idx_month = df["P_DAILY_CALC"].loc[idx_month]
x = x[~np.isnan(x)] # Save only non NAN values
x = x[x>0] # Save only positive rainfall values
x = gamma.fit(x, floc=0)
shape, loc, scale
monthly_shape.append(shape)
monthly_scale.append(scale)
# Append amount parameters to monthly lookup table
"shape"] = monthly_shape
coeff["scale"] = monthly_scale
coeff[12) coeff.head(
WW | WD | shape | scale | |
---|---|---|---|---|
month | ||||
1 | 0.338028 | 0.141176 | 0.748257 | 4.166760 |
2 | 0.320513 | 0.161184 | 0.678656 | 8.142081 |
3 | 0.401709 | 0.219048 | 0.632519 | 11.092325 |
4 | 0.503448 | 0.273063 | 0.589697 | 15.483557 |
5 | 0.480769 | 0.290909 | 0.630033 | 13.661336 |
6 | 0.442953 | 0.293680 | 0.599585 | 22.291664 |
7 | 0.415254 | 0.228571 | 0.677182 | 16.648971 |
8 | 0.412214 | 0.245847 | 0.625811 | 21.885339 |
9 | 0.362745 | 0.201258 | 0.585308 | 15.593047 |
10 | 0.409524 | 0.193353 | 0.659315 | 11.230858 |
11 | 0.388889 | 0.129794 | 0.628394 | 10.430029 |
12 | 0.381579 | 0.144092 | 0.668791 | 9.900732 |
Check that our gamma function can generate sound rainfall distributions. Note that the figure below is the probability density function (pdf), not the actual rainfall amount. The y-axis contains density information while the x-axis contains rainfall data.
# Show frequency of rainfall amount for last month of the fitting
= gamma.fit(x, floc=0)
a,b,c
=(5,4))
plt.figure(figsize='scott', density=True)
plt.hist(x, bins= np.linspace(1, 50, 1000)
rainfall_range
plt.plot(rainfall_range, gamma.pdf(rainfall_range,a,b,c))'Daily precipitation (mm)')
plt.xlabel('Density')
plt.ylabel( plt.show()
Build rainfall generator
Now that we have all the distribution properties we can implement the rainfall generator.
# Define rainfall simulator
def rainfall_gen(dates,coeff):
= np.ones(dates.shape[0])*np.nan
P 0] = 0
P[
for t in range(1,dates.shape[0]):
= dates.month[t]
month if P[t-1] == 0:
if np.random.rand() > coeff["WD"][month]:
= 0
P[t] else:
= gamma.rvs(coeff["shape"][month],0,coeff["scale"][month])
P[t] = np.round(P[t]*10)/10
P[t]
elif P[t-1] > 0:
if np.random.rand() > coeff["WW"][month]:
= 0
P[t] else:
= gamma.rvs(coeff["shape"][month],0,coeff["scale"][month])
P[t] = np.round(P[t]*10)/10
P[t]
= P.sum()
P_total return P
Create rainfall scenarios with the generator
# Create an example set of dates
= pd.date_range("2018-01-01","2018-12-31",freq="D")
dates
# Call rainfall generator multiple times and plot cumulative rainfall
=(5,4))
plt.figure(figsizefor i in range(20):
= rainfall_gen(dates,coeff)
P
plt.plot(P.cumsum())
"Rainfall generator")
plt.title("DOY")
plt.xlabel("Rainfall amount (mm)")
plt.ylabel( plt.show()
# Plot the last iteration
=(10,3))
plt.figure(figsize
plt.bar(dates,P)"Daily rainfall") # for last year of the previous simulation
plt.title("Rainfall amount (mm)")
plt.ylabel( plt.show()
References
Intergernmental Panel on Climate Change Data Distribution Centre: https://www.ipcc-data.org/guidelines/pages/weather_generators.html
Jones, P., Harpham, C., Kilsby, C., Glenis, V. and Burton, A., 2010. UK Climate Projections science report: Projections of future daily climate for the UK from the Weather Generator.
Kilsby, C.G., Jones, P.D., Burton, A., Ford, A.C., Fowler, H.J., Harpham, C., James, P., Smith, A. and Wilby, R.L., 2007. A daily weather generator for use in climate change studies. Environmental Modelling & Software, 22(12), pp.1705-1719.
Richardson, C.W., 1982. Dependence structure of daily temperature and solar radiation. Transactions of the ASAE, 25(3), pp.735-0739.
Richardson, C.W. and Wright, D.A., 1984. WGEN: A model for generating daily weather variables.