22 For loop

Author

Andres Patrignani

Published

January 6, 2024

Keywords

for loops, iteration, iterable

For loops are essential in programming for executing a block of code multiple times, which automates repetitive tasks. They are particularly useful in data science for iterating through data structures like lists and dictionaries. A Python for loop visits the items of an iterable one at a time, starting with the first element. Keep in mind that Python sequences are zero-indexed, so the first element sits at index 0 rather than 1.

Syntax

for item in iterable:
    # Code block to execute for each item
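Zero-based indexing only becomes visible when the loop runs over positions instead of items. The sketch below compares the two styles using a hypothetical list named `sites`, which is an illustrative assumption and not part of the examples that follow:

```python
# Hypothetical list of site names, used only for illustration
sites = ["North", "South", "East"]

# Direct iteration: the loop variable receives each item in turn
visited = []
for site in sites:
    visited.append(site)

# Index-based iteration: range(len(sites)) yields the positions 0, 1, 2
positions = []
for i in range(len(sites)):
    positions.append(i)
    print(i, sites[i])
```

Direct iteration is usually preferred when only the values are needed, since no index bookkeeping is involved.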

Example 1: Basic For loop

Suppose we have a list of soil nitrogen levels from different test sites and we want to print each value. Here’s how you can do it:

# Example of a for loop

# List of soil nitrogen levels in mg/kg
nitrogen_levels = [15, 20, 10, 25, 18]

# Iterating through the list
for level in nitrogen_levels:
    print(f"Soil Nitrogen Level: {level} mg/kg")

Soil Nitrogen Level: 15 mg/kg
Soil Nitrogen Level: 20 mg/kg
Soil Nitrogen Level: 10 mg/kg
Soil Nitrogen Level: 25 mg/kg
Soil Nitrogen Level: 18 mg/kg

Example 2: For loop using the `enumerate` function

The enumerate function adds a counter to the loop, providing the index position along with the value. This is helpful when you need to access the position of the elements as you iterate.

Let’s modify the previous example to include the sample number using enumerate:

# Iterating through the list with enumerate
for index, level in enumerate(nitrogen_levels):
    print(f"Sample {index + 1}: Soil Nitrogen Level = {level} mg/kg")

Sample 1: Soil Nitrogen Level = 15 mg/kg
Sample 2: Soil Nitrogen Level = 20 mg/kg
Sample 3: Soil Nitrogen Level = 10 mg/kg
Sample 4: Soil Nitrogen Level = 25 mg/kg
Sample 5: Soil Nitrogen Level = 18 mg/kg

In this example, index represents the position of each element in the list (starting from 0), and level is the nitrogen level. We use index + 1 in the print statement to start the sample numbering from 1 instead of 0.
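As an alternative to adding 1 inside the print statement, `enumerate` accepts an optional `start` argument that sets the first value of the counter. A minimal sketch, where the names `sample_number` and `labels` are chosen here only for illustration:

```python
# Same nitrogen values as in the example above (mg/kg)
nitrogen_levels = [15, 20, 10, 25, 18]

# start=1 makes the counter begin at 1 instead of 0
labels = []
for sample_number, level in enumerate(nitrogen_levels, start=1):
    labels.append(f"Sample {sample_number}: Soil Nitrogen Level = {level} mg/kg")

print("\n".join(labels))
```

The counter no longer matches the list index in this version, so use it for labeling rather than for indexing back into the list.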

Example 3: Combine `for` loop with `if` statement

Combining a for loop with if statements gives precise control over data processing and decision-making inside a loop. The for loop iterates over the elements of a collection, such as a list, tuple, or string. An if statement nested within the loop adds conditional logic, so specific blocks of code run only when certain criteria are met. This combination is versatile: it can be used for filtering data, conditional aggregation of data, and applying different operations to elements based on specific conditions.
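Filtering and conditional aggregation can be sketched in a few lines. The cutoff value `threshold` below is a hypothetical number chosen for illustration, not an agronomic recommendation:

```python
# Soil nitrogen levels in mg/kg
nitrogen_levels = [15, 20, 10, 25, 18]

# Hypothetical cutoff, chosen only to split the list in two groups
threshold = 16

low_levels = []   # filtering: collect the values below the cutoff
high_total = 0    # conditional aggregation: sum the remaining values

for level in nitrogen_levels:
    if level < threshold:
        low_levels.append(level)
    else:
        high_total += level

print(low_levels)
print(high_total)
```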

Example 3a

In this short example we will combine a for loop with if statements to generate the complementary DNA strand by iterating over each nucleotide. The code also reports any incorrect base and the position at which it occurs.

# Example of DNA strand
strand = 'ACCTTATCGGC'

# Create an empty complementary strand
strand_c = ''

# Iterate over each base in the DNA strand (a string)
for k, base in enumerate(strand):
    
    if base == 'A':
        strand_c += 'T'
        
    elif base == 'T':
        strand_c += 'A'
        
    elif base == 'C':
        strand_c += 'G'
        
    elif base == 'G':
        strand_c += 'C'
        
    else:
        print('Incorrect base', base, 'in position', k+1)
    
print(strand_c)

TGGAATAGCCG

Try inserting a character that does not represent a DNA nucleotide, or replacing one of the bases with such a character, and run the code again.
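One way to run that experiment is sketched below. It swaps the chain of if/elif statements for a dictionary lookup, which is an alternative technique and not the approach used in the example above. The strand containing 'X' and the names `complement` and `bad_positions` are assumptions made for illustration:

```python
# Strand with a deliberately invalid base ('X') in position 5
strand = 'ACCTXATCGGC'

# Lookup table mapping each base to its complement
complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}

strand_c = ''
bad_positions = []

for k, base in enumerate(strand):
    if base in complement:
        strand_c += complement[base]
    else:
        # Record and report the position using 1-based numbering
        bad_positions.append(k + 1)
        print('Incorrect base', base, 'in position', k + 1)

print(strand_c)
```

The invalid base is skipped, so the complementary strand is one character shorter than the input.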

Example 3b

In this example we will compute the total number of growing degree days for corn over the period of one week based on daily average air temperatures.

# Define daily temperatures for a week 
T_daily = [6, 12, 18, 8, 22, 19, 16] # degrees Celsius

# Define base temperature for corn
T_base = 8 # degrees Celsius

# Initialize growing degree days accumulator
gdd = 0

# Loop through each day of the week
for T in T_daily:

    if T > T_base:
        gdd_daily = T - T_base
    else:
        gdd_daily = 0

    # Accumulate daily growing degree days
    gdd += gdd_daily

# Output total growing degree days for the week
print(f"Total Growing Degree Days for the Week: {gdd} Celsius-Days")

Total Growing Degree Days for the Week: 47 Celsius-Days
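The if/else branch in the loop can also be written with the built-in `max` function, which returns the larger of its arguments and therefore clips negative differences to zero. A minimal sketch using the same temperatures:

```python
# Daily average temperatures and base temperature (degrees Celsius)
T_daily = [6, 12, 18, 8, 22, 19, 16]
T_base = 8

gdd = 0
for T in T_daily:
    # max() keeps the difference when it is positive and returns 0 otherwise
    gdd += max(T - T_base, 0)

print(f"Total Growing Degree Days for the Week: {gdd} Celsius-Days")
```

Both versions give the same total. The explicit if/else makes the logic easier to follow, while `max` keeps the loop body to a single line.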

Example 4: For loop using a dictionary

# Record air temperatures for a few cities in Kansas
kansas_weather = {
    "Topeka": {"Record High Temperature": 40, "Date": "July 20, 2003"},
    "Wichita": {"Record High Temperature": 42, "Date": "August 8, 2010"},
    "Lawrence": {"Record High Temperature": 39, "Date": "June 15, 2006"},
    "Manhattan": {"Record High Temperature": 41, "Date": "July 18, 2003"}
}

# Iterating through the dictionary
for city, weather_details in kansas_weather.items():
    print(f"Record Weather in {city}:")
    print(f"  High Temperature: {weather_details['Record High Temperature']}°C")
    print(f"  Date of Occurrence: {weather_details['Date']}")

Record Weather in Topeka:
  High Temperature: 40°C
  Date of Occurrence: July 20, 2003
Record Weather in Wichita:
  High Temperature: 42°C
  Date of Occurrence: August 8, 2010
Record Weather in Lawrence:
  High Temperature: 39°C
  Date of Occurrence: June 15, 2006
Record Weather in Manhattan:
  High Temperature: 41°C
  Date of Occurrence: July 18, 2003

The .items() method of a dictionary returns a view object that yields one (key, value) tuple for each entry in the dictionary. Because each tuple has two elements, we can unpack the key into one variable and the value into another when defining the for loop.

Think of a view object as a window into the original data structure. It doesn’t create a new copy of the data. View objects are useful because they allow you to work with the data in a flexible and memory-efficient way, and they are especially handy for working with large datasets.

View objects do not support indexing directly like lists or tuples. If you need to access specific elements by index frequently, you should consider converting the view object to a list or tuple first.

# Show the content returned by .items()
print(kansas_weather.items())

dict_items([('Topeka', {'Record High Temperature': 40, 'Date': 'July 20, 2003'}), ('Wichita', {'Record High Temperature': 42, 'Date': 'August 8, 2010'}), ('Lawrence', {'Record High Temperature': 39, 'Date': 'June 15, 2006'}), ('Manhattan', {'Record High Temperature': 41, 'Date': 'July 18, 2003'})])

First item: ('Topeka', {'Record High Temperature': 40, 'Date': 'July 20, 2003'})

key: 'Topeka'

value: {'Record High Temperature': 40, 'Date': 'July 20, 2003'}
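The behavior of view objects described above can be checked with a short sketch. It uses a reduced copy of the dictionary, and the names `items_view` and `items_list` are chosen here for illustration:

```python
# Reduced copy of the dictionary used above
kansas_weather = {
    "Topeka": {"Record High Temperature": 40, "Date": "July 20, 2003"},
    "Wichita": {"Record High Temperature": 42, "Date": "August 8, 2010"},
}

items_view = kansas_weather.items()   # view: a window into the dictionary
items_list = list(items_view)         # list: an independent snapshot

# Indexing the view directly raises a TypeError
try:
    items_view[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True

# The list supports indexing, and each item unpacks into key and value
first_city, first_details = items_list[0]

# Adding an entry changes what the view shows, but not the snapshot
kansas_weather["Lawrence"] = {"Record High Temperature": 39, "Date": "June 15, 2006"}
print(len(items_view), len(items_list))
```

After the new entry is added, the view reports three items while the list still holds two, which is the practical difference between a window and a copy.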

Example 5: Nested for loops

Imagine we are analyzing soil samples from different fields. Each field has multiple samples, and each sample has various measurements. We’ll use nested for loops to iterate through the fields and then through the samples within each field.

# Soil data from multiple fields
soil_data = {
    "Field 1": [
        {"pH": 6.5, "Moisture": 20, "Nitrogen": 3},
        {"pH": 6.8, "Moisture": 22, "Nitrogen": 3.2}
    ],
    "Field 2": [
        {"pH": 7.0, "Moisture": 18, "Nitrogen": 2.8},
        {"pH": 7.1, "Moisture": 19, "Nitrogen": 2.9}
    ]
}

# Iterating through each field
for field, samples in soil_data.items():
    print(f"Data for {field}:")
    
    # Nested loop to iterate through each sample in the field
    for sample in samples:
        print(f"  Sample - pH: {sample['pH']}, Moisture: {sample['Moisture']}%, Nitrogen: {sample['Nitrogen']}%")

Data for Field 1:
  Sample - pH: 6.5, Moisture: 20%, Nitrogen: 3%
  Sample - pH: 6.8, Moisture: 22%, Nitrogen: 3.2%
Data for Field 2:
  Sample - pH: 7.0, Moisture: 18%, Nitrogen: 2.8%
  Sample - pH: 7.1, Moisture: 19%, Nitrogen: 2.9%

In this example, soil_data is a dictionary where each key is a field, and the value is a list of soil samples (each sample is a dictionary of measurements). The first for loop iterates over the fields, and the nested loop iterates over the samples within each field, printing out the pH, Moisture, and Nitrogen content for each sample.
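The nesting can go one level deeper when each measurement needs to be handled separately. The sketch below adds a third loop over the key-value pairs of each sample. The names `number`, `variable`, and `lines` are illustrative choices:

```python
# Soil data from multiple fields (same structure as above)
soil_data = {
    "Field 1": [
        {"pH": 6.5, "Moisture": 20, "Nitrogen": 3},
        {"pH": 6.8, "Moisture": 22, "Nitrogen": 3.2}
    ],
    "Field 2": [
        {"pH": 7.0, "Moisture": 18, "Nitrogen": 2.8},
        {"pH": 7.1, "Moisture": 19, "Nitrogen": 2.9}
    ]
}

lines = []
for field, samples in soil_data.items():                  # level 1: fields
    for number, sample in enumerate(samples, start=1):    # level 2: samples
        for variable, value in sample.items():            # level 3: measurements
            lines.append(f"{field}, sample {number}, {variable}: {value}")

print("\n".join(lines))
```

Each extra level multiplies the number of iterations: 2 fields, 2 samples per field, and 3 measurements per sample give 12 lines in total.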

Example 6: For loop using `break` and `continue`

Imagine we are evaluating crop yields from different fields. We want to skip over fields with average yields and stop processing as soon as we encounter a field with an exceptionally low or high yield, which could signal a data error or a major issue with the field.

# Crop yield data (in tons per hectare) for different fields
crop_yields = {"Field 1": 2.5, "Field 2": 3.2, "Field 3": 1.0, "Field 4": 3.8, "Field 5": 0.8}

# Thresholds for yield consideration
low_yield_threshold = 1.5
high_yield_threshold = 3.0

for field, yield_data in crop_yields.items():
    if (yield_data < low_yield_threshold) or (yield_data > high_yield_threshold):
        print(f"{field} is a potential outlier: {yield_data} tons/ha")
        break  # Stop processing further as this could indicate a major issue
    else:
        continue

Field 2 is a potential outlier: 3.2 tons/ha

We use break to stop the iteration when we encounter a yield below the low_yield_threshold or above high_yield_threshold, which could indicate an outlier that requires immediate attention.

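We use continue to skip to the next iteration without executing any additional code in the loop. In this short example no code follows the else branch, so continue has nothing to skip and appears only to illustrate its syntax.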
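A variation in which the two statements play distinct roles is sketched below: continue skips fields with average yields, high yields are reported without stopping, and break ends the loop only at the first exceptionally low yield. The list `flagged` is an illustrative addition:

```python
# Crop yield data (in tons per hectare) for different fields
crop_yields = {"Field 1": 2.5, "Field 2": 3.2, "Field 3": 1.0, "Field 4": 3.8, "Field 5": 0.8}

# Thresholds for yield consideration
low_yield_threshold = 1.5
high_yield_threshold = 3.0

flagged = []
for field, yield_data in crop_yields.items():
    if low_yield_threshold <= yield_data <= high_yield_threshold:
        continue  # Average yield: skip the rest of this iteration

    # Only fields outside the thresholds reach this point
    flagged.append(field)
    print(f"{field} is a potential outlier: {yield_data} tons/ha")

    if yield_data < low_yield_threshold:
        break  # Exceptionally low yield: stop processing the remaining fields
```

With these values, Field 1 is skipped, Field 2 is reported, and the loop stops at Field 3, so Field 4 and Field 5 are never examined.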

Comparative anatomy of `for` loops

Python

nitrogen_levels = [15, 20, 10, 25, 18]
for level in nitrogen_levels:
    print(f"Soil Nitrogen Level: {level} mg/kg")

Matlab

nitrogen_levels = [15, 20, 10, 25, 18];
for level = nitrogen_levels
    fprintf('Soil Nitrogen Level: %d mg/kg\n', level);
end

Julia

nitrogen_levels = [15, 20, 10, 25, 18]
for level in nitrogen_levels
    println("Soil Nitrogen Level: $level mg/kg")
end

R

nitrogen_levels <- c(15, 20, 10, 25, 18)
for (level in nitrogen_levels) {
  cat("Soil Nitrogen Level:", level, "mg/kg\n")
}

JavaScript

const nitrogen_levels = [15, 20, 10, 25, 18];
for (let level of nitrogen_levels) {
    console.log(`Soil Nitrogen Level: ${level} mg/kg`);
}

Commonalities among programming languages:

All languages use a for keyword to start the loop.
They iterate over a collection of items, like an array or list.
Each language uses a variable to represent the current item in each iteration.
The body of the loop (code to be executed) is enclosed within a block defined by indentation (Python), an end keyword (Matlab, Julia), or curly braces (R, JavaScript).