1  Basic concepts

In geospatial data analysis, data is represented in two primary forms: vector and raster. Both formats are used to capture, store, and visualize geographical information, but they do so in fundamentally different ways.

Basic concepts of geographic information systems

Vector Data: represents geographic features as discrete points, lines, and polygons. This format is used to capture details with precise boundaries and locations. For example:

  • Points could represent locations such as weather stations, wells, or trees.
  • Lines might delineate features like rivers, roads, or railway tracks.
  • Polygons are used to define areas such as lakes, watersheds, or agricultural fields.

Vector data is advantageous for mapping features that have clear boundaries and for tasks that require precise measurements (like distance, perimeter, or area).

Raster Data: Raster data, on the other hand, is a grid of cells (or pixels), where each cell contains a value representing information, such as temperature, elevation, or land cover. The raster format is effective for representing continuous phenomena. Raster data is often used in: - Satellite imagery, where each pixel has spectral data values. - Digital elevation models (DEMs), where each pixel represents ground elevation. - Thematic maps, such as precipitation or land use maps, where each pixel’s value corresponds to a specific category or quantity.

Raster data is especially useful for analyzing spatial variations across a region and for modeling environmental and earth surface processes.

GeoJSON

GeoJSON is a light-weight and widely used open standard format designed for representing geographical features. It is based on JSON (JavaScript Object Notation), which is very much like a Python dictionary, making it easy to read and parse by both humans and machines.

A GeoJSON supports various geometric types such as Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection. Each geometry can also contain properties in the form of key-value pairs, allowing for the addition non-spatial attributes associated with the geographic features.

The following example is a Feature Collection showing the GeoJSON data obtained from GEE for two counties. The keys on the very left (columns, features, id, properties, type, and version) are keys of the Feature Collection. Note that the features key has a list of dictionaries as its value. These two dictinaries contain the information for each county. Since each county is a dictionary, then they also have key-value pairs, like geometry, id, properties, and type. This information is repeated in each item. If you look further, you will notice that even the geometry key has another dictionary inside, which contains the coordinates and the type of the geometry.

Certainly there are multiple levels and quite a bit of nesting, but the information has a clear, organized, and human-readable structure.

Tip

The secret to understand GeoJSON objects is to pay attention to:

  1. indentation,
  2. the opening { and closing } curly braces,
  3. opening [ and closing ] square brackets, and
  4. the commas , that separate individual structures.

The .getInfo() method in GEE allows you to print this information for any object in the notebook. You can also use the Pretty Print (pprint) module in Python to ensure the data is displayed clearly, like the example below. This tutorial will help you get started.

{'columns': {'ALAND': 'Long',
             'AWATER': 'Long',
             'COUNTYFP': 'String',
             'NAME': 'String',
             'STATEFP': 'String',
             'system:index': 'String'},
 'features': [{'geometry': {'coordinates': [[[-100.9542311520017,36.57269908219634],
                                             [-100.95409647655981,36.557977388454546],
                                             [-100.95409647655981,36.557842673053706],
                                             ...]],
                            'type': 'Polygon'},
               'id': '00000000000000000515',
               'properties': {'ALAND': 4700042738,
                              'AWATER': 7347161,
                              'COUNTYFP': '007',
                              'NAME': 'Beaver',
                              'NAMELSAD': 'Beaver County',
                              'STATEFP': '40'},
               'type': 'Feature'},
              {'geometry': {'coordinates': [[[-100.0038723945186,36.75510474601957],
                                             [-100.00382751270251,36.75353381242449],
                                             [-100.00382751270251,36.75223223032575],
                                             ...]],
                            'type': 'Polygon'},
               'id': '00000000000000000957',
               'properties': {'ALAND': 2691041230,
                              'AWATER': 5259447,
                              'COUNTYFP': '059',
                              'NAME': 'Harper',
                              'NAMELSAD': 'Harper County',
                              'STATEFP': '40'},
               'type': 'Feature'}],
 'id': 'TIGER/2016/Counties',
 'properties': {'date_range': [1451606400000, 1483315200000],
                'description': 'The United States Census Bureau TIGER dataset '
                               'contains the 2016 boundaries\n'
                               'for primary legal divisions of US states...',
                'period': 0,
                'title': 'TIGER: US Census Counties 2016'},
 'type': 'FeatureCollection',
 'version': 1566851207937615}

Google Earth building blocks

Google Earth Engine (GEE) is a powerful platform for analyzing geospatial data at scale, providing a vast library of satellite imagery and geospatial datasets. The fundamental data structures of GEE are: geometries, features, feature collections, images, and image collections. Understanding these structures is key to effectively utilizing GEE for any geospatial data analysis task.

Geometries

Geometries represent the simplest form of spatial data in GEE, describing points or shapes in geographic space. They can be points, lines, polygons, or even more complex shapes defining areas (e.g., a rectangle, a watershed, a county), routes, or specific locations on Earth’s surface.

Features

A Feature in GEE is a geometry with associated properties. These properties can be metadata or attributes related to the geometry, such as the name of a location, its population, a label, or any other characteristic. Features combine spatial and descriptive information, making them useful for detailed geospatial analyses.

Geometry and Feature

Feature Collections

A Feature Collection is a group of features aggregated into a single data structure. This collection can represent a series of points, such as weather stations, or complex combinations of polygons, such as administrative boundaries, each with their own set of attributes. Feature Collections are instrumental in managing and analyzing related sets of features.

FeatureCollection

Images

An Image in GEE is a raster data structure, representing Earth data in grid format where each cell has a value. These images can depict various types of data, such as satellite imagery, temperature maps, elevation data, or derived indices (e.g., NDVI for vegetation analysis). An image may contain multiple bands, with each band representing a different dataset or time snapshot.

Image

Image Collections

An Image Collection is a series of images grouped together, typically representing the same area over time. This could be a sequence of satellite images capturing changes throughout the seasons or years, or a collection of derived datasets like monthly precipitation maps. Image Collections enable temporal analyses and change detection studies over large geographic areas.

ImageCollection

Commonalities and Differences

  • Spatial Representation: Geometries and Features represent vector data, while Images represent raster data. Feature Collections aggregate vector data, and Image Collections aggregate raster data.
  • Data Complexity: Geometries are the simplest, defining only shapes. Features add descriptive data to geometries. Images and Image Collections introduce the complexity of raster data, allowing for detailed analysis over discrete units of space and time
  • Analysis Capabilities: Features and Feature Collections are ideal for analyses that require descriptive data alongside spatial data. Images and Image Collections are suited for pixel-based analyses and time-series studies.

Summary

Let’s summarize these concepts:

  • Geometries: Basic shapes (point, line, polygon).
  • Features: Shapes with associated properties (metadata) like names and unique identifiers.
  • Feature Collections: Aggregated features.
  • Images: Grid of values (raster), single band or multiple bands.
  • Image Collections: Series of related images over time.