# Import modules
import numpy as np
from numpy.polynomial import polynomial as P
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score
59 Calibration soil moisture sensor
In this example we will use linear regression to develop a calibration equation for a soil moisture sensor. The raw sensor output consists of a voltage differential that needs to be correlated with volumetric water content in order to make soil moisture estimations.
A laboratory calibration was conducted using containers of packed soil with known soil moisture, which we will use to develop the calibration curve. For each container and soil type we obtained voltage readings with the sensor and then oven-dried the soil to find the true volumetric water content.
Independent variable: Sensor raw voltage readings (millivolts, mV)
Dependent variable: Volumetric water content (cm^3/cm^3)
# Read dataset
df = pd.read_csv('../datasets/teros_12_calibration.csv', skiprows=[0,2])
df.head(3)
| | soil | vwc_obs | raw_voltage |
|---|---|---|---|
| 0 | loam | 0.0047 | 1888 |
| 1 | loam | 0.1021 | 2019 |
| 2 | loam | 0.2538 | 2324 |
# Fit linear model (degree=1)
par = P.polyfit(df['raw_voltage'], df['vwc_obs'], deg=1)
print(par)
# Polynomial coefficients ordered from low to high.
[-7.94969978e-01 4.32559249e-04]
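Because the coefficients are ordered from low to high, the first value is the intercept and the second is the slope, so the fitted calibration equation is approximately VWC = -0.795 + 0.000433 × voltage (mV). The following is a minimal sketch of applying that equation to a new reading. It assumes the coefficients printed above; the helper name `voltage_to_vwc` and the 2000 mV reading are hypothetical and only for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Coefficients copied from the printed output above (intercept, slope)
par = np.array([-7.94969978e-01, 4.32559249e-04])

def voltage_to_vwc(raw_voltage, coef=par):
    """Convert raw sensor voltage (mV) to volumetric water content (cm^3/cm^3).

    Hypothetical helper: evaluates the fitted polynomial at the given voltage(s).
    """
    return P.polyval(raw_voltage, coef)

# Hypothetical new reading of 2000 mV (not part of the calibration dataset)
vwc_new = voltage_to_vwc(2000)
print(round(float(vwc_new), 3))
```

The same helper accepts a NumPy array or a DataFrame column, so a whole series of field readings can be converted in one call.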
# Evaluate fitted linear model at measurement points
df['vwc_pred'] = P.polyval(df['raw_voltage'], par)
# Determine mean absolute error (MAE)
# Define auxiliary function for MAE
mae_fn = lambda x,y: np.round(np.mean(np.abs(x-y)),3)
# Compute MAE for our observations
mae = mae_fn(df['vwc_obs'], df['vwc_pred'])
print('MAE:',mae)
MAE: 0.027
# Compute coefficient of determination (R^2)
r2 = r2_score(df['vwc_obs'], df['vwc_pred'])
print("R-squared:", np.round(r2, 2))
R-squared: 0.96
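To make both error metrics concrete, here is a small sketch that computes them by hand using the standard definitions: MAE is the mean of the absolute residuals, and R² is 1 minus the ratio of the residual sum of squares to the total sum of squares. It uses only the three rows shown in the dataset preview and the coefficients printed earlier, so its results will differ from the full-dataset values reported above. The variable names `r2_manual` and `mae_manual` are illustrative.

```python
import numpy as np

# First three observations from the dataset preview (not the full dataset)
vwc_obs = np.array([0.0047, 0.1021, 0.2538])
raw_voltage = np.array([1888, 2019, 2324])

# Predictions from the fitted coefficients printed earlier (intercept + slope * voltage)
vwc_pred = -7.94969978e-01 + 4.32559249e-04 * raw_voltage

# Mean absolute error: average magnitude of the residuals
mae_manual = np.mean(np.abs(vwc_obs - vwc_pred))

# Coefficient of determination: 1 - SS_res / SS_tot
ss_res = np.sum((vwc_obs - vwc_pred) ** 2)        # residual sum of squares
ss_tot = np.sum((vwc_obs - vwc_obs.mean()) ** 2)  # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print('MAE (3 rows):', mae_manual)
print('R-squared (3 rows):', r2_manual)
```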
# Create range of voltages (independent variable) to create a line
n_points = 100
x_pred = np.linspace(df['raw_voltage'].min(), df['raw_voltage'].max(), n_points)
# Predict values of volumetric water content (dependent variable) for line
y_pred = P.polyval(x_pred, par)
Drawing a straight line requires only two points, while a non-linear model needs more to render a smooth curve. Generating a few hundred or even a few thousand values costs little in memory and processing time, so we used 100 points above. That way you can adapt this example to non-linear models if necessary.
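As a rough sketch of that adaptation, the same `polyfit` and `polyval` calls work for a second-degree polynomial by changing only `deg`. The data below are synthetic values generated from a made-up quadratic, not real sensor readings, and the names `true_coef`, `par2`, `x_line`, and `y_line` are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Synthetic calibration data from a made-up quadratic (NOT real sensor data)
x = np.linspace(1900, 2900, 12)      # hypothetical voltages in mV
true_coef = [0.5, -8.0e-4, 3.0e-7]   # low-to-high order, chosen arbitrarily
y = P.polyval(x, true_coef)

# Fit a second-degree polynomial; only `deg` changes relative to the linear case
par2 = P.polyfit(x, y, deg=2)

# A dense grid of 100 points renders the curve smoothly
x_line = np.linspace(x.min(), x.max(), 100)
y_line = P.polyval(x_line, par2)
```

With only two points, the plotted curve would collapse to a straight segment between the endpoints, which is why the denser grid matters once the model is no longer linear.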
# Create figure
fontsize=12
plt.figure(figsize=(5,4))
plt.title('TEROS 12 - Calibration', size=14)
plt.scatter(df['raw_voltage'], df['vwc_obs'], facecolor='w', edgecolor='k', s=50, label='Obs')
plt.plot(x_pred, y_pred, color='black', label='Fitted')
plt.xlabel('Voltage (mV)', size=fontsize)
plt.ylabel('VWC Observed (cm³ cm⁻³)', size=fontsize)
plt.yticks(size=fontsize)
plt.xticks(size=fontsize)
plt.text(1870, 0.44, 'B',size=18)
plt.text(1870, 0.40,f'N = {df.shape[0]}', fontsize=14)
plt.text(1870, 0.36,f'MAE = {mae} (cm³ cm⁻³)', fontsize=14)
plt.legend(fontsize=fontsize, loc = 'lower right')
plt.show()