19 Functions

Author

Andres Patrignani

Published

January 6, 2024

Keywords

functions, modularity

In this tutorial, we delve into the basics of functions, covering key concepts like function declaration, inputs, outputs, and documentation through docstrings. Functions are powerful tools, akin to a wrench or screwdriver in a toolbox, that are used for executing specific tasks within your code. Functions are essentially named, encapsulated snippets of code that can be reused multiple times. Functions help you organize your code, avoid code repetition, and reduce errors. This reusability not only enhances the modularity of your current project, but also extends to future projects, allowing you to build a personal library of useful functions.

Utilizing functions involves two key steps:

define or declare the function, setting up what it does and how,
call or invoke the function whenever you need its functionality in your code.

This two-step process (defining and calling functions) is fundamental to programming and is crucial for creating well-structured, maintainable, and scalable code.

Syntax

This is the main syntax to define your own functions:

def function_name(parameters, par_opt=par_value):
    # Code block
    return result
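As a minimal sketch of this syntax, the block below defines and calls a small function. The name total_rainfall and its parameters are hypothetical and chosen only for illustration: daily_mm is a required parameter and days is an optional parameter with a default value.

```python
# Step 1: define the function (hypothetical example)
def total_rainfall(daily_mm, days=7):
    """Return the total rainfall (mm) for a constant daily amount."""
    total = daily_mm * days
    return total

# Step 2: call the function
weekly_mm = total_rainfall(4)            # uses the default of 7 days -> 28
monthly_mm = total_rainfall(4, days=30)  # overrides the default -> 120
print(weekly_mm, monthly_mm)
```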

Let’s look at a few examples to see this two-step process in action.

Example function: Compute vapor pressure deficit

The vapor pressure deficit (VPD) represents the “thirst” of the atmosphere and is computed as the difference between the saturation vapor pressure and the actual vapor pressure. The saturation vapor pressure can be accurately approximated as a function of air temperature using the empirical Tetens equation. Here is the set of equations to compute VPD:

Saturation vapor pressure: e_{sat} = 0.611 \; exp\Bigg(\frac{17.502 \ T} {T + 240.97}\Bigg)

Actual vapor pressure: e_{act} = e_{sat} \frac{RH}{100}

Vapor pressure deficit: VPD = e_{sat} - e_{act}

Variables

e_{sat} is the saturation vapor pressure (kPa)

e_{act} is the actual vapor pressure (kPa)

VPD is the vapor pressure deficit (kPa)

T is air temperature (^\circ C)

RH is relative humidity (%)

Define function

In the following example we will focus on the main building blocks of a function, but we will ignore error handling and checks to ensure that inputs have the proper data type. For more details on how to properly handle errors and ensure inputs have the correct data type see the error handling tutorial.

# Import necessary modules
import numpy as np

# Define function

def compute_vpd(T, RH, unit='kPa'):
    """
    Function that computes the air vapor pressure deficit (vpd).

    Parameters:
    T (integer, float): Air temperature in degrees Celsius.
    RH (integer, float): Air relative humidity in percentage.
    unit (string): Unit of the output vpd value. 
                    One of the following: kPa (default), bars, psi

    Returns:
    float: Vapor pressure deficit in kiloPascals (kPa).
    
    Authors:
    Andres Patrignani

    Date created:
    6 January 2024
    
    Reference:
    Campbell, G. S., & Norman, J. M. (2000).
    An introduction to environmental biophysics. Springer Science & Business Media.
    """

    # Compute saturation vapor pressure
    e_sat = 0.611 * np.exp((17.502*T) / (T + 240.97)) # kPa

    # Compute actual vapor pressure
    e_act = e_sat * RH/100 # kPa

    # Compute vapor pressure deficit
    vpd = e_sat - e_act # kPa
    
    # Change units if necessary
    if unit == 'bars':
        vpd *= 0.01 # Same as vpd = vpd * 0.01
    
    elif unit == 'psi':
        vpd *= 0.1450377377 # Convert to pounds per square inch (psi)

    return vpd

Syntax note

Did you notice the expression vpd *= 0.01? This is a compact way in Python to do vpd = vpd * 0.01. You can also use it with other operators, like += for adding or -= for subtracting values from a variable.
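The short sketch below illustrates these compact (augmented) assignment operators on a throwaway variable named count, which is used here only for illustration.

```python
count = 10
count += 5   # same as count = count + 5, now 15
count -= 3   # same as count = count - 3, now 12
count *= 2   # same as count = count * 2, now 24
count /= 4   # same as count = count / 4, now 6.0 (division returns a float)
print(count)
```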

Description of function components

Function Definition: def compute_vpd(T, RH, unit='kPa'): This line defines the function with the name compute_vpd, which takes two parameters, T for temperature and RH for relative humidity. The function also includes an optional argument unit= that has a default value of kPa.

Docstring:

"""
Function that computes the air vapor pressure deficit (vpd).

Parameters:
T (integer, float): Air temperature in degrees Celsius.
RH (integer, float): Air relative humidity in percentage.
unit (string): Unit of the output vpd value. 
                One of the following: kPa (default), bars, psi

Returns:
float: Vapor pressure deficit in kiloPascals (kPa).

Authors:
Andres Patrignani

Date created:
6 January 2024

Reference:
Campbell, G. S., & Norman, J. M. (2000).
An introduction to environmental biophysics. Springer Science & Business Media.
"""

The triple-quoted string right after the function definition is the docstring. It provides a brief description of the function, its parameters, their data types, and what the function returns.

Saturation Vapor Pressure Calculation:

```
e_sat = 0.611 * np.exp((17.502*T) / (T + 240.97))  # kPa
```

This line of code calculates the saturation vapor pressure (e_sat) using air temperature T. It’s a mathematical expression that uses the exp function from the NumPy library (np), which should be imported at the beginning of the script.

Actual Vapor Pressure Calculation:

```
e_act = e_sat * RH/100  # kPa
```

This line calculates the actual vapor pressure (e_act) based on the saturation vapor pressure and the relative humidity RH of air.

Vapor Pressure Deficit Calculation:

```
vpd = e_sat - e_act  # kPa
```

Here, the vapor pressure deficit (vpd) is computed by subtracting the actual vapor pressure from the saturation vapor pressure.

Unit conversion:

```
# Change units if necessary
if unit == 'bars':
    vpd *= 0.01 # Same as vpd = vpd * 0.01

elif unit == 'psi':
    vpd *= 0.1450377377 # Convert to pounds per square inch (psi)
```

In this step we change the units of the resulting vpd before returning the output. Since the value of vpd computed with the equations in the function is already in kPa, there is no need to handle the default 'kPa' case in the if statement.

Return Statement:

```
return vpd
```

The return statement sends back the result of the function (vpd) to wherever the function was called.

Note

In Python functions, you can use optional parameters with default values for flexibility, placing them after mandatory parameters in the function’s definition.
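To see why the order matters, the sketch below checks two hypothetical function definitions stored as text. Python rejects a definition in which a parameter without a default follows a parameter with a default. The helper compiles() is an assumption made for this example; it relies only on the built-in compile() function, so the invalid definition can be tested without stopping the script.

```python
# Hypothetical definitions stored as text
valid_def = "def f(T, unit='kPa'):\n    return T\n"
invalid_def = "def f(unit='kPa', T):\n    return T\n"

def compiles(source):
    """Return True if the source text is valid Python syntax."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles(valid_def))    # True: optional parameter comes last
print(compiles(invalid_def))  # False: required parameter after an optional one
```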

Call function

Having named our function and defined its inputs, we can now invoke the function without duplicating the code.

# Define input variables
T = 25 # degrees Celsius
RH = 75 # percentage

# Call the function (without using the optional argument)
vpd = compute_vpd(T, RH)

# Display variable value
print(f'The vapor pressure deficit is {vpd:.2f} kPa')

The vapor pressure deficit is 0.79 kPa

# Call the function using the optional argument to specify the unit in `bars`
vpd = compute_vpd(T, RH, unit='bars')

# Display variable value
print(f'The vapor pressure deficit is {vpd:.3f} bars')

The vapor pressure deficit is 0.008 bars

# Call the function using the optional argument to specify the unit in `psi`
vpd = compute_vpd(T, RH, unit='psi')

# Display variable value
print(f'The vapor pressure deficit is {vpd:.3f} psi')

The vapor pressure deficit is 0.115 psi

Important

In Python, the sequence in which you pass input arguments into a function is critical because the function expects them in the order they were defined. If you call compute_vpd with the inputs in the wrong order, like compute_vpd(RH, T), the function will still execute, but it will use relative humidity (RH) as temperature and temperature (T) as humidity, leading to incorrect results. To ensure accuracy, you must match the order to the function’s definition: compute_vpd(T, RH).
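One way to guard against this kind of mistake is to pass the inputs as keyword arguments, which Python matches by name rather than by position. The sketch below uses a compact, kPa-only stand-in named vpd_kpa (a hypothetical name, not part of the code above) so that it runs on its own.

```python
import math

def vpd_kpa(T, RH):
    """Compact stand-in for compute_vpd (kPa only), for illustration."""
    e_sat = 0.611 * math.exp((17.502 * T) / (T + 240.97))  # kPa
    e_act = e_sat * RH / 100                                # kPa
    return e_sat - e_act

vpd_positional = vpd_kpa(25, 75)       # order matters: T first, then RH
vpd_keywords = vpd_kpa(RH=75, T=25)    # order does not matter with keywords
vpd_swapped = vpd_kpa(75, 25)          # runs without error, but the result is wrong

print(vpd_positional == vpd_keywords)  # True
print(vpd_positional == vpd_swapped)   # False
```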

Evaluate function performance

Code performance in terms of execution time directly impacts the data analysis and visualization experience. The perf_counter() function in the time module provides a high-resolution timer that can be used to track the execution time of your code, offering a precise measure of performance. By recording the time immediately before and after a block of code runs, and then calculating the difference, perf_counter() helps you understand how long your code takes to execute. This is particularly useful for optimizing code and identifying bottlenecks in your Python programs. However, it is important to balance performance with the principle that premature optimization in the early stages of a project is often counterproductive. Optimization should come at a later stage, when the code is correct and its performance bottlenecks are clearly identified.

# Import time module
import time

# Get initial time
tic = time.perf_counter() 

vpd = compute_vpd(T, RH, unit='bars')

# Get final time
toc = time.perf_counter() 

# Compute elapsed time
elapsed_time = toc - tic
print("Elapsed time:", elapsed_time, "seconds")

Elapsed time: 7.727195043116808e-05 seconds
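The time of a single call to a function this fast can vary noticeably from run to run. A possible refinement, sketched below, is to repeat the call many times with the standard timeit module and report the mean time per call. The stand-in function vpd_kpa and the number of repetitions are assumptions made so the example runs on its own.

```python
import math
import timeit

def vpd_kpa(T, RH):
    """Compact stand-in for compute_vpd (kPa only), for illustration."""
    e_sat = 0.611 * math.exp((17.502 * T) / (T + 240.97))
    return e_sat - e_sat * RH / 100

n_calls = 10_000  # arbitrary number of repetitions

# timeit.timeit returns the total time (seconds) for all repetitions
total_time = timeit.timeit(lambda: vpd_kpa(25, 75), number=n_calls)
mean_time = total_time / n_calls

print(f"Mean time per call: {mean_time:.2e} seconds")
```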

Access function help (the docstring)

compute_vpd?

Signature: compute_vpd(T, RH, unit='kPa')
Docstring:
Function that computes the air vapor pressure deficit (vpd).
Parameters:
T (integer, float): Air temperature in degrees Celsius.
RH (integer, float): Air relative humidity in percentage.
unit (string): Unit of the output vpd value. 
                One of the following: kPa (default), bars, psi
Returns:
float: Vapor pressure deficit in kiloPascals (kPa).
Authors:
Andres Patrignani
Date created:
6 January 2024
Reference:
Campbell, G. S., & Norman, J. M. (2000).
An introduction to environmental biophysics. Springer Science & Business Media.
File:      /var/folders/w1/cgh8d8y962g9c6p4_dxgbn2jh5jy11/T/ipykernel_40431/2839097177.py
Type:      function
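The ? syntax shown above is specific to IPython and Jupyter. In a plain Python script or interpreter, the same docstring can be reached with the built-in help() function, the __doc__ attribute, or inspect.getdoc(). The sketch below uses a small stand-in function, hypotenuse, so that it runs on its own.

```python
import inspect

def hypotenuse(C1, C2):
    """Return the hypotenuse of a right triangle with legs C1 and C2."""
    return (C1**2 + C2**2)**0.5

# Raw docstring stored on the function object
raw_doc = hypotenuse.__doc__

# Docstring with indentation cleaned up (useful for multi-line docstrings)
clean_doc = inspect.getdoc(hypotenuse)

print(clean_doc)
# help(hypotenuse) displays the same text together with the function signature
```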

Function variable scope

One aspect of Python functions that we did not cover is variable scope. In Python, variables defined inside a function are local to that function and can’t be accessed from outside of it, while variables defined outside of functions are global and can be accessed from anywhere in the script. It’s like having a conversation in a private room (function) versus a public area (global scope).

To prevent confusion, it’s best to follow good naming conventions for variables in your scripts. However, in extensive scripts with numerous functions (both written by you and imported from other modules), tracking every variable name can be challenging. This is where the local variable scope of Python functions comes to the rescue, ensuring that variables within a function don’t interfere with those outside.

Below are a few examples to practice and consider.

Example: Access a global variable from inside of a function

variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    variable_inside = 2 # Only accessible within this function (it's not being used for anything in this case)
    print(variable_outside)
    
# Invoke the function
my_function()

Example: Modify a global variable from inside of a function (will not work)

This example will not work, but I think it’s worth trying as a learning exercise. Because the function assigns a value to variable_outside (the += operator both reads and assigns), Python treats variable_outside as a local variable of the function. When the interpreter tries to read its current value, no value has been assigned to it WITHIN the function yet, so it throws an UnboundLocalError. See the next example for a working solution.

variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    variable_outside += 5
    print(variable_outside)
    
# Invoke the function
my_function()

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[1], line 8
      5     print(variable_outside)
      7 # Invoke the function
----> 8 my_function()

Cell In[1], line 4, in my_function()
      3 def my_function():
----> 4     variable_outside += 5
      5     print(variable_outside)

UnboundLocalError: local variable 'variable_outside' referenced before assignment

Example: Modify global variables inside of a function

The solution to the previous example is to explicitly tell Python to search for a global variable. The use of global variables is an advanced feature and is often not recommended, since this practice tends to increase the complexity of the code.

variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    
    # We tell Python to use the variable defined outside in the next line
    global variable_outside 
    
    variable_outside += 5
    print(variable_outside)
    
# Invoke the function
my_function() # The function changes the value of the variable_outside

# Print the value of the variable
print(variable_outside) # Same value printed by the function, since the function changed the global variable

6
6
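A common alternative that avoids the global keyword is to pass the value into the function as an argument and assign the returned value back to the variable. The sketch below uses the hypothetical names counter and add_five for illustration.

```python
counter = 1  # Defined outside of the function

def add_five(value):
    """Return value plus five without touching any global variable."""
    return value + 5

# The change is explicit: the result is assigned back by the caller
counter = add_five(counter)
print(counter)  # 6
```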

Example: A global and a local variable with the same name

variable_outside = 1 # Accessible from anywhere in the script

def my_function():
    
    # A different variable with the same name. 
    # Only available inside the function
    variable_outside = 1
    
    # We are changing the value in the previous line,
    # not the variable defined outside of the function
    variable_outside += 5
    
    print(variable_outside)
    
# Invoke the function
my_function() # This prints the variable inside the function

# This prints the variable we defined at the top, which remains unchanged
print(variable_outside)

6
1

Python utility functions: zip, map, filter, and reduce

zip

Description: Aggregates elements from two or more iterables (like lists or tuples) and returns an iterator of tuples. Each tuple contains elements from the iterables, paired based on their order. For example, zip([1, 2], ['a', 'b']) would produce an iterator yielding (1, 'a') and (2, 'b'). If the iterables don’t have the same length, zip stops creating tuples when the shortest input iterable is exhausted.
Use-case: This function is especially useful when you need to pair data elements from different sequences in a parallel manner.

# Match content in two lists
sampling_location = ['Manhattan','Colby','Tribune','Wichita','Lawrence']
soil_ph = [6.5, 6.1, 5.9, 7.0, 7.2]

list(zip(sampling_location, soil_ph))

[('Manhattan', 6.5),
 ('Colby', 6.1),
 ('Tribune', 5.9),
 ('Wichita', 7.0),
 ('Lawrence', 7.2)]

# The zip() function is useful to combine geographic information
# Here are the geographic coordinates of five stations of the Kansas Mesonet
latitude = [39.12577,39.81409,39.41796, 37.99733, 38.84945]
longitude = [-96.63653, -97.67509, -97.13977, -100.81514, -99.34461]
altitude = [324, 471, 388, 882, 618]

coords = list(zip(latitude, longitude, altitude))
print(coords)

[(39.12577, -96.63653, 324), (39.81409, -97.67509, 471), (39.41796, -97.13977, 388), (37.99733, -100.81514, 882), (38.84945, -99.34461, 618)]
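Two other common patterns with zip are looping over paired values and building a dictionary. The sketch below also illustrates the truncation behavior described above by using lists of unequal length; the shortened lists are illustrative values only.

```python
stations = ['Manhattan', 'Colby', 'Tribune']
soil_ph = [6.5, 6.1, 5.9, 7.0]  # one extra value on purpose

# Loop over both lists in parallel (stops after the third pair)
for station, ph in zip(stations, soil_ph):
    print(f'{station}: pH {ph}')

# Build a lookup dictionary from the paired values
ph_by_station = dict(zip(stations, soil_ph))
print(len(ph_by_station))  # 3, the unpaired value 7.0 is dropped
```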

map

Description: Applies a given function to each item of an iterable (like a list) and returns a map object.
Use-case: Transforming every element of a collection.

celsius = [0,10,20,100]
fahrenheit = list(map(lambda x: (x*9/5)+32, celsius))
print(fahrenheit)

[32.0, 50.0, 68.0, 212.0]

# Convert a DNA sequence into RNA
# Remember that RNA contains uracil instead of thymine 
dna = 'ATTCGGGCAAATATGC'
lookup = {"A":"U", "T":"A", "C":"G", "G":"C"}
rna = list(map(lambda x: lookup[x], dna))
print(''.join(rna))

UAAGCCCGUUUAUACG
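The same transformations can also be written without map, using a list comprehension or a generator expression. The sketch below repeats both examples above in that style as one possible alternative, not a replacement.

```python
# Temperature conversion with map and with a list comprehension
celsius = [0, 10, 20, 100]
fahrenheit_map = list(map(lambda x: (x*9/5)+32, celsius))
fahrenheit_comp = [(x*9/5)+32 for x in celsius]
print(fahrenheit_map == fahrenheit_comp)  # True

# DNA to RNA with a generator expression inside join
dna = 'ATTCGGGCAAATATGC'
lookup = {"A":"U", "T":"A", "C":"G", "G":"C"}
rna = ''.join(lookup[base] for base in dna)
print(rna)  # UAAGCCCGUUUAUACG
```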

filter

Description: Filters elements of an iterable based on a function that tests each element.
Use-case: Selecting elements that meet specific criteria.

# Get all occurrences of adenine nucleotides
dna = 'ATTCGGGCAAATATGC'
list(filter(lambda x: x == "A", dna))

['A', 'A', 'A', 'A', 'A']

# Find compacted soils
bulk_densities = [1.01, 1.52, 1.84, 1.45, 1.32]
compacted_soils = list(filter(lambda x: x > 1.6, bulk_densities))
print(compacted_soils)

[1.84]

# Find hydrophobic soils based on regular function
def is_hydrophobic(contact_angle):
    """
    Function that determines whether a soil is hydrophobic
    based on its contact angle.
    """
    # True for contact angles from 90 to 180 (inclusive), False otherwise.
    # A single return also avoids an undefined result for angles above 180.
    return 90 <= contact_angle <= 180

contact_angles = [5,10,20,50,90,150]
list(filter(is_hydrophobic, contact_angles))

[90, 150]
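The examples above wrap filter in list() because filter returns a lazy iterator rather than a list. The sketch below shows one consequence: the iterator can be consumed only once. The same applies to the objects returned by map and zip.

```python
bulk_densities = [1.01, 1.52, 1.84, 1.45, 1.32]
compacted = filter(lambda x: x > 1.6, bulk_densities)

first_pass = list(compacted)   # [1.84]
second_pass = list(compacted)  # [] because the iterator is already exhausted
print(first_pass, second_pass)
```

Storing the result of list() in a variable, as in the compacted_soils example, avoids this surprise.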

reduce

Description: Applies a function cumulatively to the items of an iterable, reducing the iterable to a single value.
Use-case: Aggregating data elements.

from functools import reduce

# Compute total yield
crop_yields = [1200, 1500, 1800, 2000]
total_yield = reduce(lambda x, y: x + y, crop_yields)
print(total_yield)
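For a plain sum, the built-in sum() gives the same result as reduce. The sketch below compares the two and shows the optional third argument of reduce, an initial value that makes the call safe for an empty list.

```python
from functools import reduce

crop_yields = [1200, 1500, 1800, 2000]

total_reduce = reduce(lambda x, y: x + y, crop_yields)  # 6500
total_sum = sum(crop_yields)                            # 6500

# Without an initial value, reduce raises a TypeError on an empty list
empty_total = reduce(lambda x, y: x + y, [], 0)         # 0

print(total_reduce, total_sum, empty_total)
```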

Note

While the map, filter, and reduce functions are useful in standard Python, they are less critical when working with Pandas or NumPy, as these libraries already provide built-in, optimized methods for element-wise operations and data manipulation. In most scenarios, NumPy removes the need for map, filter, or reduce.

Comparative anatomy of functions

Python

def hypotenuse(C1, C2):
    H = (C1**2 + C2**2)**0.5
    return H

hypotenuse(3, 4)

Matlab

function H = hypotenuse(C1, C2)
    H = sqrt(C1^2 + C2^2);
end

hypotenuse(3, 4)

Julia

function hypotenuse(C1, C2)
    H = sqrt(C1^2 + C2^2)
    return H
end

hypotenuse(3, 4)

R

hypotenuse <- function(C1, C2) {
    H = sqrt(C1^2 + C2^2)
    return(H)
}

hypotenuse(3, 4)

JavaScript

function hypotenuse(C1, C2) {
    let H = Math.sqrt(C1**2 + C2**2);
    return H;
}

hypotenuse(3, 4);

Commonalities among programming languages:

All languages use a keyword (like function or def) to define a function.
They specify function names and accept parameters within parentheses.
The body of the function is enclosed in a block (using braces {} like in JavaScript and R, indentation in the case of Python, or end in the case of Julia and Matlab).
Most of these languages use a return statement to output the result of the function; Matlab instead names the output variable (H) in the function definition line.
Functions are invoked by calling their name followed by arguments in parentheses.

Practice

Create a function that computes the amount of lime required to increase an acidic soil pH. You can find examples in most soil fertility textbooks or extension fact sheets from multiple land-grant universities in the U.S.
Create a function that determines the amount of nitrogen required by a crop based on the amount of nitrates available at pre-planting, a yield goal for your region, and the amount of nitrogen required to produce the yield goal.
Create a function to compute the amount of water storage in the soil profile from inputs of volumetric water content and soil depth.
Create a function that accepts latitude and longitude coordinates in decimal degrees and returns the latitude and longitude values in sexagesimal degrees. The function should accept Lat and Lon values as separate inputs e.g. fun(lat,lon) and must return a list of tuples with four components for each coordinate: degrees, minutes, seconds, and quadrant. The quadrant would be North/South for latitude and East/West for longitude. For instance: fun(19.536111, -155.576111) should result in [(19,32,10,'N'),(155,34,34,'W')]