# Create a sample list of dictionaries with metadata for some stations of the Kansas Mesonet
data = [
    {'name': 'Ashland Bottoms', 'latitude': 39.125773, 'longitude': -96.63653},
    {'name': 'Colby', 'latitude': 39.39247, 'longitude': -101.06864},
    {'name': 'Garden City', 'latitude': 37.99733, 'longitude': -100.81514},
    {'name': 'Manhattan', 'latitude': 39.20857, 'longitude': -96.59169},
    {'name': 'Parsons', 'latitude': 37.36875, 'longitude': -95.28771},
    {'name': 'Tribune 6NE', 'latitude': 38.53041, 'longitude': -101.66434},
]
33 Save data and objects
Imagine you have Python code that downloads large datasets from the web or performs computationally expensive tasks that you don’t want to repeat. While saving data in .CSV files is common, Python offers more flexible options for saving objects and data structures. In this tutorial, we will explore two powerful modules from the Python standard library: pickle and json. These modules allow you to serialize and deserialize Python objects, making it easy to save your work and load it later without rerunning time-consuming code.
In this tutorial we use a list of dictionaries as the example, so that the same dataset can be used with both the pickle and the json modules. You can also pickle other objects and data structures, such as pandas DataFrames.
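The motivation above (avoid repeating a slow download or computation) can be sketched as a small caching pattern. This is a minimal illustration, not part of the original tutorial: `expensive_task`, `load_or_compute`, and the file name `stations_cache.pkl` are hypothetical names, and a temporary directory is used so the example leaves no files behind.

```python
import pickle
import tempfile
from pathlib import Path


def expensive_task():
    """Hypothetical stand-in for a slow download or computation."""
    return [{'name': 'Colby', 'latitude': 39.39247, 'longitude': -101.06864}]


def load_or_compute(cache_path):
    """Return the cached result if the file exists; otherwise compute and save it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    result = expensive_task()
    with open(cache_path, 'wb') as f:
        pickle.dump(result, f)
    return result


# Use a temporary directory so the sketch does not touch your project files
with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / 'stations_cache.pkl'
    first = load_or_compute(cache_file)   # computes the result and writes the cache
    second = load_or_compute(cache_file)  # reads the result back from disk
    print(first == second)  # True
```

In a real project the cache file would live in a permanent location, and you would delete it whenever the underlying data needs to be refreshed.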
Pickle module
The pickle module is used for serializing and deserializing Python object structures, also called “pickling” and “unpickling”. Serialization is the process of converting a Python object into a byte stream, and deserialization is the inverse process, converting a byte stream back into an object.
The pickle module lets you save Python objects in a binary format, which is efficient and suitable for complex data types, but the resulting file is not human-readable.
# Import module
import pickle
# Save the dataset using pickle
# Open file in write binary mode (data will not be written as text)
with open('../datasets/data.pkl', 'wb') as f:
    pickle.dump(data, f)
# Load the dataset using pickle
# Read file in binary mode
with open('../datasets/data.pkl', 'rb') as f:
    data_pickle = pickle.load(f)
# Print first entry of the list
print(data_pickle[0])
{'name': 'Ashland Bottoms', 'latitude': 39.125773, 'longitude': -96.63653}
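To illustrate what "suitable for complex data types" can mean in practice, the sketch below round-trips a record containing a tuple, a set, and a date through pickle entirely in memory, using `pickle.dumps` and `pickle.loads` instead of a file. The `installed` date and the `sensors` values are made up for illustration and are not real station metadata.

```python
import pickle
from datetime import date

record = {
    'name': 'Manhattan',
    'coords': (39.20857, -96.59169),  # tuple
    'installed': date(2020, 1, 1),    # hypothetical date, for illustration only
    'sensors': {'rain', 'wind'},      # hypothetical set of sensor names
}

blob = pickle.dumps(record)    # serialize to a bytes object
restored = pickle.loads(blob)  # deserialize back into Python objects

print(type(blob).__name__)                # bytes
print(restored == record)                 # True
print(type(restored['coords']).__name__)  # tuple
```

The restored object keeps its original Python types, which is the main reason to prefer pickle when the data will only ever be read back by Python.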
JSON module
The json module provides a way to encode and decode data in JavaScript Object Notation (JSON) format. JSON is a lightweight format that is easy for humans to read and write, and easy for machines to parse and generate.
This format is very similar to Python dictionaries (both use a key-value pair structure), is interoperable with other programming languages, and is ideal for web applications. The JSON format is limited to a small set of data types: strings, numbers, booleans, null, arrays (Python lists), and objects (Python dictionaries with string keys).
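The type limitation can be seen with a short in-memory round trip. In this sketch (the `installed` date is a made-up value for illustration), a tuple comes back as a list, and a `date` object cannot be encoded by the default encoder at all.

```python
import json
from datetime import date

station = {'name': 'Parsons', 'coords': (37.36875, -95.28771)}

# Tuples are written as JSON arrays, so they come back as lists
restored = json.loads(json.dumps(station))
print(type(restored['coords']).__name__)  # list

# Objects outside the supported types raise TypeError with the default encoder
try:
    json.dumps({'installed': date(2020, 1, 1)})  # hypothetical date
    date_supported = True
except TypeError:
    date_supported = False
print(date_supported)  # False
```

A common workaround is to convert such values to strings or numbers before saving, at the cost of converting them back after loading.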
# Import module
import json
# Save the dataset using JSON
# Open file in write mode (data will be written as text)
with open('../datasets/data.json', 'w') as f:
    json.dump(data, f)
# Load the dataset using JSON
with open('../datasets/data.json', 'r') as f:
    data_json = json.load(f)
# Print first entry of the list
print(data_json[0])
{'name': 'Ashland Bottoms', 'latitude': 39.125773, 'longitude': -96.63653}
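Because JSON is plain text, the saved output can be made easier to read by passing the `indent` argument to `json.dump` or `json.dumps`. The sketch below uses a one-station list named `stations` (a hypothetical variable, separate from the `data` list above) and builds the text in memory so the layout can be inspected directly.

```python
import json

stations = [{'name': 'Colby', 'latitude': 39.39247, 'longitude': -101.06864}]

# indent=2 puts each item and each key on its own line
text = json.dumps(stations, indent=2)
print(text)

# The indented text still decodes to the same data
print(json.loads(text) == stations)  # True
```

Indentation makes the file larger, so it is mostly useful for small files that people will open and inspect by hand.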