# Import necessary modules
import numpy as np
19 Functions
In this tutorial, we delve into the basics of functions, covering key concepts like function declaration, inputs, outputs, and documentation through docstrings. Functions are powerful tools, akin to a wrench or screwdriver in a toolbox, that are used for executing specific tasks within your code. Functions are essentially named, encapsulated snippets of code that can be reused multiple times. Functions help organize your code, avoid code repetition, and reduce errors. This reusability not only enhances the modularity of your current project, but also extends to future projects, allowing you to build a personal library of useful functions.
Utilizing functions involves two key steps:
- define or declare the function, setting up what it does and how,
- call or invoke the function whenever you need its functionality in your code.
This two-step process (defining and calling functions) is fundamental to programming and is crucial for creating well-structured, maintainable, and scalable code.
Syntax
This is the main syntax to define your own functions:
def function_name(parameters, par_opt=par_value):
    # Code block
    return result
Let’s look at a few examples to see this two-step process in action.
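As a minimal first illustration of the two steps, the sketch below defines a small function and then calls it twice. The function name celsius_to_fahrenheit and its optional decimals parameter are made up for this illustration.

```python
# Step 1: define the function (nothing is computed yet)
def celsius_to_fahrenheit(temp_c, decimals=1):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    temp_f = temp_c * 9/5 + 32
    return round(temp_f, decimals)

# Step 2: call the function as many times as needed
boiling_f = celsius_to_fahrenheit(100)          # uses the default for decimals
body_f = celsius_to_fahrenheit(37, decimals=2)  # overrides the default
print(boiling_f, body_f)
```

The definition runs once; each call reuses the same code with different inputs.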
Example function: Compute vapor pressure deficit
The vapor pressure deficit (VPD) represents the “thirst” of the atmosphere and is computed as the difference between the saturation vapor pressure and the actual vapor pressure. The saturation vapor pressure can be accurately approximated as a function of air temperature using the empirical Tetens equation. Here is the set of equations to compute VPD:
Saturation vapor pressure: e_{sat} = 0.611 \, \exp\left(\frac{17.502 \, T}{T + 240.97}\right)
Actual vapor pressure: e_{act} = e_{sat} \frac{RH}{100}
Vapor pressure deficit: VPD = e_{sat} - e_{act}
Variables
e_{sat} is the saturation vapor pressure (kPa)
e_{act} is the actual vapor pressure (kPa)
VPD is the vapor pressure deficit (kPa)
T is air temperature (^\circ C)
RH is relative humidity (%)
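As a quick check of the equations, here is the arithmetic worked out for an air temperature of 25 degrees Celsius and a relative humidity of 75%. Intermediate values are rounded to three decimals, so the last digit may differ slightly from a full-precision calculation.

```latex
\begin{aligned}
e_{sat} &= 0.611 \, \exp\left(\frac{17.502 \times 25}{25 + 240.97}\right) = 0.611 \, \exp(1.645) \approx 3.166 \ \text{kPa} \\
e_{act} &= 3.166 \times \frac{75}{100} \approx 2.374 \ \text{kPa} \\
VPD &= 3.166 - 2.374 \approx 0.79 \ \text{kPa}
\end{aligned}
```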
Define function
In the following example we will focus on the main building blocks of a function, but we will ignore error handling and checks to ensure that inputs have the proper data type. For more details on how to properly handle errors and ensure inputs have the correct data type see the error handling tutorial.
# Define function
def compute_vpd(T, RH, unit='kPa'):
    """
    Function that computes the air vapor pressure deficit (vpd).

    Parameters:
    T (integer, float): Air temperature in degrees Celsius.
    RH (integer, float): Air relative humidity in percentage.
    unit (string): Unit of the output vpd value.
        One of the following: kPa (default), bars, psi

    Returns:
    float: Vapor pressure deficit in kiloPascals (kPa).

    Authors:
    Andres Patrignani

    Date created:
    6 January 2024

    Reference:
    Campbell, G. S., & Norman, J. M. (2000).
    An introduction to environmental biophysics. Springer Science & Business Media.
    """

    # Compute saturation vapor pressure
    e_sat = 0.611 * np.exp((17.502*T) / (T + 240.97)) # kPa

    # Compute actual vapor pressure
    e_act = e_sat * RH/100 # kPa

    # Compute vapor pressure deficit
    vpd = e_sat - e_act # kPa

    # Change units if necessary
    if unit == 'bars':
        vpd *= 0.01 # Same as vpd = vpd * 0.01

    elif unit == 'psi':
        vpd *= 0.1450377377 # Convert to pounds per square inch (psi)

    return vpd
Did you notice the expression vpd *= 0.01? This is a compact way in Python to write vpd = vpd * 0.01. You can also use it with other operators, like += for adding or -= for subtracting values from a variable.
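The short sketch below shows a few of these compact (augmented) assignment operators side by side. The variable names and values are made up for illustration.

```python
vpd = 0.79     # kPa (illustrative value)
vpd *= 0.01    # same as vpd = vpd * 0.01

count = 10
count += 5     # same as count = count + 5
count -= 3     # same as count = count - 3
count /= 4     # same as count = count / 4 (the result is a float)

print(vpd, count)
```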
Description of function components
Function Definition:
def compute_vpd(T, RH, unit='kPa'):
This line defines the function with the name compute_vpd, which takes two parameters, T for temperature and RH for relative humidity. The function also includes an optional argument unit= that has a default value of kPa.
Docstring:
"""
Function that computes the air vapor pressure deficit (vpd).

Parameters:
T (integer, float): Air temperature in degrees Celsius.
RH (integer, float): Air relative humidity in percentage.
unit (string): Unit of the output vpd value.
    One of the following: kPa (default), bars, psi

Returns:
float: Vapor pressure deficit in kiloPascals (kPa).

Authors:
Andres Patrignani

Date created:
6 January 2024

Reference:
Campbell, G. S., & Norman, J. M. (2000).
An introduction to environmental biophysics. Springer Science & Business Media.
"""
The triple-quoted string right after the function definition is the docstring. It provides a brief description of the function, its parameters, their data types, and what the function returns.
Saturation Vapor Pressure Calculation:
e_sat = 0.611 * np.exp((17.502*T) / (T + 240.97)) # kPa
This line of code calculates the saturation vapor pressure (e_sat) using air temperature T. It’s a mathematical expression that uses the exp function from the NumPy library (np), which should be imported at the beginning of the script.
Actual Vapor Pressure Calculation:
e_act = e_sat * RH/100 # kPa
This line calculates the actual vapor pressure (e_act) based on the saturation vapor pressure and the relative humidity RH of air.
Vapor Pressure Deficit Calculation:
vpd = e_sat - e_act # kPa
Here, the vapor pressure deficit (vpd) is computed by subtracting the actual vapor pressure from the saturation vapor pressure.
Unit conversion:
# Change units if necessary
if unit == 'bars':
    vpd = vpd * 0.01
elif unit == 'psi':
    vpd = vpd * 0.1450377377 # Convert to pounds per square inch (psi)
In this step we change the units of the resulting vpd before returning the output. Since the value of vpd computed by the equations in the function is already in kPa, there is no need to handle this case in the if statement.
Return Statement:
return vpd
The return statement sends back the result of the function (vpd) to wherever the function was called.
In Python functions, you can use optional parameters with default values for flexibility, placing them after mandatory parameters in the function’s definition.
Call function
Having named our function and defined its inputs, we can now invoke the function without duplicating the code.
# Define input variables
T = 25 # degrees Celsius
RH = 75 # percentage

# Call the function (without using the optional argument)
vpd = compute_vpd(T, RH)

# Display variable value
print(f'The vapor pressure deficit is {vpd:.2f} kPa')
The vapor pressure deficit is 0.79 kPa
# Call the function using the optional argument to specify the unit in `bars`
vpd = compute_vpd(T, RH, unit='bars')

# Display variable value
print(f'The vapor pressure deficit is {vpd:.3f} bars')
The vapor pressure deficit is 0.008 bars
# Call the function using the optional argument to specify the unit in `psi`
vpd = compute_vpd(T, RH, unit='psi')

# Display variable value
print(f'The vapor pressure deficit is {vpd:.3f} psi')
The vapor pressure deficit is 0.115 psi
In Python, the sequence in which you pass input arguments into a function is critical because the function expects them in the order they were defined. If you call compute_vpd with the inputs in the wrong order, like compute_vpd(RH, T), the function will still execute, but it will use relative humidity (RH) as temperature and temperature (T) as humidity, leading to incorrect results. To ensure accuracy, you must match the order to the function’s definition: compute_vpd(T, RH).
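One way to guard against argument-order mistakes is to pass the inputs by name (keyword arguments), in which case the order no longer matters. The sketch below repeats a condensed version of compute_vpd (kPa only, using math.exp instead of np.exp) so that it runs on its own; the condensed form is an assumption made for this illustration.

```python
import math

def compute_vpd(T, RH):
    """Condensed copy of the tutorial function (kPa only), for illustration."""
    e_sat = 0.611 * math.exp((17.502*T) / (T + 240.97))  # kPa
    e_act = e_sat * RH/100  # kPa
    return e_sat - e_act

vpd_positional = compute_vpd(25, 75)     # correct positional order: T, then RH
vpd_keywords = compute_vpd(RH=75, T=25)  # order does not matter when inputs are named
vpd_swapped = compute_vpd(75, 25)        # runs, but treats 75 as T and 25 as RH

print(f'{vpd_positional:.2f} {vpd_keywords:.2f} {vpd_swapped:.2f}')
```

The first two calls return the same value, while the swapped call silently returns a different and incorrect one.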
Evaluate function performance
Code performance in terms of execution time directly impacts the data analysis and visualization experience. The perf_counter() function in the time module provides a high-resolution timer that can be used to track the execution time of your code, offering a precise measure of performance. By recording the time immediately before and after a block of code runs, and then calculating the difference, perf_counter() helps you understand how long your code takes to execute. This is particularly useful for optimizing code and identifying bottlenecks in your Python programs. However, it is important to balance performance with the principle that premature optimization in the early stages of a project is often counterproductive. Optimization should come at a later stage when the code is correct and its performance bottlenecks are clearly identified.
# Import time module
import time

# Get initial time
tic = time.perf_counter()

vpd = compute_vpd(T, RH, unit='bars')

# Get final time
toc = time.perf_counter()

# Compute elapsed time
elapsed_time = toc - tic
print("Elapsed time:", elapsed_time, "seconds")
Elapsed time: 7.727195043116808e-05 seconds
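A single perf_counter() measurement of a very fast function can be noisy. As a complementary sketch, the standard-library timeit module runs a statement many times and reports the total time. The condensed compute_vpd below and the choice of 10,000 calls repeated 5 times are assumptions made for this illustration.

```python
import math
import timeit

def compute_vpd(T, RH):
    """Condensed copy of the tutorial function (kPa only), for illustration."""
    e_sat = 0.611 * math.exp((17.502*T) / (T + 240.97))  # kPa
    return e_sat - e_sat * RH/100

n_calls = 10_000

# Each entry is the total time (in seconds) taken by n_calls function calls
totals = timeit.repeat(lambda: compute_vpd(25, 75), number=n_calls, repeat=5)

# The minimum is usually the value least affected by other running processes
best_per_call = min(totals) / n_calls
print(f'Best time per call: {best_per_call:.2e} seconds')
```

The reported time also includes the small overhead of the lambda call, so treat it as an estimate.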
Access function help (the docstring)
compute_vpd?
Signature: compute_vpd(T, RH, unit='kPa')
Docstring:
Function that computes the air vapor pressure deficit (vpd).

Parameters:
T (integer, float): Air temperature in degrees Celsius.
RH (integer, float): Air relative humidity in percentage.
unit (string): Unit of the output vpd value.
    One of the following: kPa (default), bars, psi

Returns:
float: Vapor pressure deficit in kiloPascals (kPa).

Authors:
Andres Patrignani

Date created:
6 January 2024

Reference:
Campbell, G. S., & Norman, J. M. (2000).
An introduction to environmental biophysics. Springer Science & Business Media.
File: /var/folders/w1/cgh8d8y962g9c6p4_dxgbn2jh5jy11/T/ipykernel_40431/2839097177.py
Type: function
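The ? syntax works in IPython and Jupyter notebooks. In a plain Python script or interpreter, the built-in help() function and the __doc__ attribute give access to the same docstring. The short function below is a hypothetical stand-in so that the example runs on its own.

```python
import math

def saturation_vapor_pressure(T):
    """Compute the saturation vapor pressure (kPa) from air temperature (degrees Celsius)."""
    return 0.611 * math.exp((17.502*T) / (T + 240.97))

# Print the help page (signature and docstring)
help(saturation_vapor_pressure)

# Access the raw docstring as a regular string
doc = saturation_vapor_pressure.__doc__
print(doc)
```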
Function variable scope
One aspect of Python functions that we did not cover is variable scope. In Python, variables defined inside a function are local to that function and can’t be accessed from outside of it, while variables defined outside of functions are global and can be accessed from anywhere in the script. It’s like having a conversation in a private room (function) versus a public area (global scope).
To prevent confusion, it’s best to follow good naming conventions for variables in your scripts. However, in extensive scripts with numerous functions (both written by you and imported from other modules), tracking every variable name can be challenging. This is where the local variable scope of Python functions comes to the rescue, ensuring that variables within a function don’t interfere with those outside.
Below are a few examples to practice and consider.
Example: Access a global variable from inside of a function
variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    variable_inside = 2 # Only accessible within this function (it's not being used for anything in this case)
    print(variable_outside)

# Invoke the function
my_function()
1
Example: Modify a global variable from inside of a function (will not work)
This example will not work, but I think it’s worth trying in order to learn. Because the function assigns a value to variable_outside (the += operation), Python treats variable_outside as a local variable of the function. When the interpreter then tries to read its current value to add 5, it finds that no local variable with that name has been assigned WITHIN the function yet, so it throws an error. See the next example for a working solution.
variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    variable_outside += 5
    print(variable_outside)

# Invoke the function
my_function()
UnboundLocalError: local variable 'variable_outside' referenced before assignment
Example: Modify global variables inside of a function
The solution to the previous example is to explicitly tell Python to search for a global variable. The use of global variables is an advanced feature and is often not recommended, since this practice tends to increase the complexity of the code.
variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    # We tell Python to use the variable defined outside in the next line
    global variable_outside
    variable_outside += 5
    print(variable_outside)

# Invoke the function
# The function changes the value of the variable_outside
my_function()

# Print the value of the variable
print(variable_outside) # Same value printed by the function, since the function changed the global variable
6
6
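A common alternative that avoids the global keyword altogether is to pass the value in as an argument and return the updated value. This is only a sketch of that pattern; the function name add_five is made up for the illustration.

```python
def add_five(value):
    """Return the input value increased by 5 without touching outside variables."""
    return value + 5

variable_outside = 1

# Reassign the variable with the value returned by the function
variable_outside = add_five(variable_outside)
print(variable_outside)
```

With this pattern, the change to variable_outside is visible at the line where the function is called, which tends to make scripts easier to follow.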
Example: A global and a local variable with the same name
variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    # A different variable with the same name.
    # Only available inside the function
    variable_outside = 1

    # We are changing the value in the previous line,
    # not the variable defined outside of the function
    variable_outside += 5
    print(variable_outside)

# Invoke the function
# This prints the variable inside the function
my_function()

# This prints the variable we defined at the top, which remains unchanged
print(variable_outside)
6
1
Python utility functions: zip, map, filter, and reduce
zip
Description: Aggregates elements from two or more iterables (like lists or tuples) and returns an iterator of tuples. Each tuple contains elements from the iterables, paired based on their order. For example, zip([1, 2], ['a', 'b']) would produce an iterator yielding (1, 'a') and (2, 'b'). If the iterables don’t have the same length, zip stops creating tuples when the shortest input iterable is exhausted.
Use-case: This function is especially useful when you need to pair data elements from different sequences in a parallel manner.
# Match content in two lists
sampling_location = ['Manhattan','Colby','Tribune','Wichita','Lawrence']
soil_ph = [6.5, 6.1, 5.9, 7.0, 7.2]

list(zip(sampling_location, soil_ph))
[('Manhattan', 6.5),
('Colby', 6.1),
('Tribune', 5.9),
('Wichita', 7.0),
('Lawrence', 7.2)]
# The zip() function is useful to combine geographic information
# Here are the geographic coordinates of five stations of the Kansas Mesonet
latitude = [39.12577, 39.81409, 39.41796, 37.99733, 38.84945]
longitude = [-96.63653, -97.67509, -97.13977, -100.81514, -99.34461]
altitude = [324, 471, 388, 882, 618]

coords = list(zip(latitude, longitude, altitude))
print(coords)
[(39.12577, -96.63653, 324), (39.81409, -97.67509, 471), (39.41796, -97.13977, 388), (37.99733, -100.81514, 882), (38.84945, -99.34461, 618)]
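To illustrate what happens with iterables of unequal length, the sketch below pairs a list of three locations with only two pH values. The standard-library itertools.zip_longest is shown as one option when the unmatched item should be kept; the shortened lists are made up for this example.

```python
from itertools import zip_longest

sampling_location = ['Manhattan', 'Colby', 'Tribune']
soil_ph = [6.5, 6.1]  # one value is missing

# zip() stops at the shortest iterable, so 'Tribune' is silently dropped
pairs = list(zip(sampling_location, soil_ph))
print(pairs)

# zip_longest() keeps every item and fills the gaps with a chosen value
pairs_filled = list(zip_longest(sampling_location, soil_ph, fillvalue=None))
print(pairs_filled)
```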
map
Description: Applies a given function to each item of an iterable (like a list) and returns a map object.
Use-case: Transforming every element of a collection.
celsius = [0,10,20,100]
fahrenheit = list(map(lambda x: (x*9/5)+32, celsius))
print(fahrenheit)
[32.0, 50.0, 68.0, 212.0]
# Convert a DNA sequence into RNA
# Remember that RNA contains uracil instead of thymine
dna = 'ATTCGGGCAAATATGC'
lookup = dict({"A":"U", "T":"A", "C":"G", "G":"C"})
rna = list(map(lambda x: lookup[x], dna))
print(''.join(rna))
UAAGCCCGUUUAUACG
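map can also take a named function and more than one iterable, in which case the function receives one item from each iterable per call. The sketch below applies a condensed compute_vpd (kPa only, using math.exp so that it runs on its own) to paired temperature and humidity readings; the readings are made-up values.

```python
import math

def compute_vpd(T, RH):
    """Condensed copy of the tutorial function (kPa only), for illustration."""
    e_sat = 0.611 * math.exp((17.502*T) / (T + 240.97))  # kPa
    return e_sat - e_sat * RH/100

air_temps = [15, 25, 35]      # degrees Celsius (illustrative)
rel_humidity = [80, 75, 40]   # percentage (illustrative)

# Each call receives one temperature and the matching relative humidity
vpd_values = list(map(compute_vpd, air_temps, rel_humidity))
print([round(v, 2) for v in vpd_values])
```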
filter
Description: Filters elements of an iterable based on a function that tests each element.
Use-case: Selecting elements that meet specific criteria.
# Get the occurrence of all adenine nucleotides
dna = 'ATTCGGGCAAATATGC'
list(filter(lambda x: x == "A", dna))
['A', 'A', 'A', 'A', 'A']
# Find compacted soils
bulk_densities = [1.01, 1.52, 1.84, 1.45, 1.32]
compacted_soils = list(filter(lambda x: x > 1.6, bulk_densities))
print(compacted_soils)
[1.84]
# Find hydrophobic soils based on regular function
def is_hydrophobic(contact_angle):
    """
    Function that determines whether a soil is hydrophobic
    based on its contact angle.
    """
    if contact_angle < 90:
        repel = False
    elif contact_angle >= 90 and contact_angle <= 180:
        repel = True

    return repel

contact_angles = [5,10,20,50,90,150]
list(filter(is_hydrophobic, contact_angles))
[90, 150]
reduce
Description: Applies a function cumulatively to the items of an iterable, reducing the iterable to a single value.
Use-case: Aggregating data elements.
from functools import reduce
# Compute total yield
crop_yields = [1200, 1500, 1800, 2000]
total_yield = reduce(lambda x, y: x + y, crop_yields)
print(total_yield)
6500
While the map, filter, and reduce functions are useful in standard Python, they are less critical when working with Pandas or NumPy, as these libraries already provide built-in, optimized methods for element-wise operations and data manipulation. NumPy's vectorized operations typically remove the need for map, filter, or reduce in most scenarios.
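As a rough comparison, the sketch below redoes three of the examples from this section with NumPy arrays. It assumes NumPy is installed and is meant as one possible equivalent, not the only way to write these operations.

```python
import numpy as np

# Element-wise arithmetic on the whole array (instead of map)
celsius = np.array([0, 10, 20, 100])
fahrenheit = celsius * 9/5 + 32

# Boolean mask selects the matching elements (instead of filter)
bulk_densities = np.array([1.01, 1.52, 1.84, 1.45, 1.32])
compacted_soils = bulk_densities[bulk_densities > 1.6]

# Built-in aggregation method (instead of reduce)
crop_yields = np.array([1200, 1500, 1800, 2000])
total_yield = crop_yields.sum()

print(fahrenheit, compacted_soils, total_yield)
```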
Comparative anatomy of functions
Python
def hypotenuse(C1, C2):
    H = (C1**2 + C2**2)**0.5
    return H

hypotenuse(3, 4)
Matlab
function H = hypotenuse(C1, C2)
H = sqrt(C1^2 + C2^2);
end
hypotenuse(3, 4)
Julia
function hypotenuse(C1, C2)
    H = sqrt(C1^2 + C2^2)
    return H
end
hypotenuse(3, 4)
R
hypotenuse <- function(C1, C2) {
    H = sqrt(C1^2 + C2^2)
    return(H)
}
hypotenuse(3, 4)
JavaScript
function hypotenuse(C1, C2) {
let H = Math.sqrt(C1**2 + C2**2);
return H;
}
hypotenuse(3, 4);
Commonalities among programming languages:
- All languages use a keyword (like function or def) to define a function.
- They specify function names and accept parameters within parentheses.
- The body of the function is enclosed in a block (using braces {} like in JavaScript and R, indentation in the case of Python, or end in the case of Julia and Matlab).
- Return statements are used to output the result of the function (Matlab instead returns the output variable declared in the function signature).
- Functions are invoked by calling their name followed by arguments in parentheses.
Practice
Create a function that computes the amount of lime required to increase an acidic soil pH. You can find examples in most soil fertility textbooks or extension fact sheets from multiple land-grant universities in the U.S.
Create a function that determines the amount of nitrogen required by a crop based on the amount of nitrates available at pre-planting, a yield goal for your region, and the amount of nitrogen required to produce the yield goal.
Create a function to compute the amount of water storage in the soil profile from inputs of volumetric water content and soil depth.
Create a function that accepts latitude and longitude coordinates in decimal degrees and returns the latitude and longitude values in sexagesimal degrees. The function should accept Lat and Lon values as separate inputs, e.g. fun(lat,lon), and must return a list of tuples with four components for each coordinate: degrees, minutes, seconds, and quadrant. The quadrant would be North/South for latitude and East/West for longitude. For instance: fun(19.536111, -155.576111) should result in [(19,32,10,'N'),(155,34,34,'W')]