Pandas/GeoPandas

Functions

API to use h3ronpy with the pandas dataframe library including geopandas.

Warning

To avoid pulling in unused dependencies, h3ronpy does not declare a dependency on pandas and geopandas. These packages need to be installed separately.
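Because these packages are optional, it can be useful to check for them before importing h3ronpy.pandas. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of h3ronpy.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report which of the optional dependencies still need to be installed.
missing = missing_packages(["pandas", "geopandas"])
if missing:
    print("Install before using h3ronpy.pandas:", ", ".join(missing))
```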

Raster module

Conversion of 2D numpy arrays to H3 cells.

The geo-context is passed to this library as a coordinate transformation matrix. This can be either a GDAL-like array of six float values or an Affine object as used by rasterio.
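The two representations hold the same six coefficients in a different order. The sketch below shows the relationship in plain Python; the helper names gdal_to_affine_order and pixel_to_coord are hypothetical and exist only for illustration. The affine package also offers Affine.from_gdal for the same reordering.

```python
def gdal_to_affine_order(gt):
    """Reorder a GDAL geotransform (c, a, b, f, d, e) into Affine order (a, b, c, d, e, f)."""
    c, a, b, f, d, e = gt
    return (a, b, c, d, e, f)


def pixel_to_coord(gt, col, row):
    """Map the upper-left corner of pixel (col, row) to (x, y) using a GDAL geotransform."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return (x, y)


# North-up WGS84 raster: upper-left corner at lon=8.0, lat=50.0, pixels of 0.5 degrees.
# The pixel height is negative because row numbers grow southwards.
geotransform = (8.0, 0.5, 0.0, 50.0, 0.0, -0.5)

print(gdal_to_affine_order(geotransform))
print(pixel_to_coord(geotransform, 2, 4))
```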

Note

As H3 itself uses WGS84 (EPSG:4326) Lat/Lon coordinates, the coordinate transformation matrix used in this module must be based on WGS84 as well. Raster data using other coordinate systems needs to be reprojected accordingly.

While H3 cells are hexagons and pentagons, this raster conversion process only takes the raster value under the centroid of the cell into account. When the data needs to be aggregated, use one of these methods:

  1. Use the nearest_h3_resolution function to convert to the H3 resolution nearest to the pixel size of the raster. After that, the cell resolution can be changed using the change_resolution function, and dataframe libraries can be used to perform the desired aggregations. This can be a rather memory-intensive process.

  2. Scale the raster down using an interpolation algorithm. After that, use method 1. This can save a lot of memory, but may not be applicable to all datasets, for example datasets with absolute values per pixel such as population counts.
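The caveat in method 2 can be illustrated without any raster library. In this sketch, assuming a small hypothetical population grid, downscaling by averaging loses the total count, while summing each block preserves it. The downscale helper is illustrative only and expects grid dimensions divisible by the factor.

```python
def downscale(grid, factor, reduce):
    """Downscale a 2D list by `factor`, combining each block of pixels with `reduce`."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                grid[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            ]
            out_row.append(reduce(block))
        out.append(out_row)
    return out


def mean(values):
    return sum(values) / len(values)


def total(grid):
    return sum(sum(row) for row in grid)


# Hypothetical population counts per pixel (absolute values).
population = [
    [10, 0, 4, 4],
    [0, 2, 4, 4],
    [1, 1, 0, 0],
    [1, 1, 0, 8],
]

by_mean = downscale(population, 2, mean)  # interpolation-like averaging
by_sum = downscale(population, 2, sum)    # aggregation that keeps absolute counts

print(total(population), total(by_mean), total(by_sum))
```

Averaging is appropriate for intensive quantities such as temperature, while counts need a sum so that the total stays the same.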

Resolution search modes of nearest_h3_resolution:

  • “min_diff”: choose the H3 resolution where the difference between the area of a pixel and the area of an H3 cell is as small as possible.

  • “smaller_than_pixel”: choose the H3 resolution where the area of an H3 cell is smaller than the area of a pixel.
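The difference between the two modes can be sketched with a simplified area model. This is not the library's implementation: mean_cell_area_km2 and pick_resolution are hypothetical helpers, the cell area is approximated as the Earth's surface divided by the number of H3 cells at a resolution (2 + 120 * 7^r), and “smaller_than_pixel” is assumed to mean the coarsest resolution whose cells are smaller than a pixel.

```python
EARTH_AREA_KM2 = 510_065_622  # approximate surface area of the Earth


def mean_cell_area_km2(resolution):
    """Approximate mean H3 cell area; H3 has 2 + 120 * 7**r cells at resolution r."""
    return EARTH_AREA_KM2 / (2 + 120 * 7 ** resolution)


def pick_resolution(pixel_area_km2, search_mode="min_diff"):
    """Pick an H3 resolution (0..15) for a pixel area under the simplified model."""
    resolutions = range(16)
    if search_mode == "min_diff":
        # Smallest absolute difference between cell area and pixel area.
        return min(
            resolutions,
            key=lambda r: abs(mean_cell_area_km2(r) - pixel_area_km2),
        )
    if search_mode == "smaller_than_pixel":
        # Assumption: the coarsest resolution whose cells are smaller than a pixel.
        for r in resolutions:
            if mean_cell_area_km2(r) < pixel_area_km2:
                return r
        return 15
    raise ValueError(f"unknown search mode: {search_mode}")


# A pixel of 4 square kilometres lies between resolutions 7 and 8 in this model.
print(pick_resolution(4.0, "min_diff"))
print(pick_resolution(4.0, "smaller_than_pixel"))
```

Under this model, “smaller_than_pixel” never picks a coarser resolution than “min_diff”, so it tends to produce more cells and use more memory.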

h3ronpy.pandas.raster.raster_to_dataframe(in_raster: ndarray, transform, h3_resolution: int, nodata_value=None, axis_order: str = 'yx', compact: bool = True, geo: bool = False) GeoDataFrame | DataFrame

Convert a raster/array to a pandas DataFrame containing H3 indexes

This function is parallelized and uses the available CPUs by distributing tiles to a thread pool.

The input geometry must be in WGS84.

Parameters:
  • in_raster (ndarray) – input 2-d array

  • transform – the affine transformation

  • nodata_value – the nodata value. No H3 indexes are generated for array cells holding this value

  • axis_order (str) – axis order of the 2d array. Either “xy” or “yx”

  • h3_resolution (int) – target h3 resolution

  • compact (bool) – Return compacted h3 indexes (see H3 docs). This results in mixed H3 resolutions, but also can reduce the amount of required memory.

  • geo (bool) – Return a geopandas GeoDataFrame with geometries. Increases memory usage.

Returns:

pandas DataFrame or GeoDataFrame

Return type:

GeoDataFrame | DataFrame
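A usage sketch for this function, assuming h3ronpy, numpy and pandas are installed. The raster values, the transform and the helper name demo_raster_to_cells are made up for illustration, and the import is guarded so the snippet degrades gracefully when the optional dependencies are missing.

```python
try:
    import numpy as np
    from h3ronpy.pandas.raster import raster_to_dataframe
except ImportError:  # h3ronpy or one of its optional dependencies is missing
    raster_to_dataframe = None


def demo_raster_to_cells():
    """Convert a tiny WGS84 raster to H3 cells, or return None without h3ronpy."""
    if raster_to_dataframe is None:
        return None
    # 4 x 4 raster in "yx" axis order; 0 marks pixels without data.
    raster = np.array(
        [
            [0, 1, 1, 0],
            [1, 2, 2, 1],
            [1, 2, 2, 1],
            [0, 1, 1, 0],
        ],
        dtype=np.uint8,
    )
    # GDAL-like transform: upper-left corner at lon=8.0, lat=50.0, 0.1 degree pixels.
    transform = (8.0, 0.1, 0.0, 50.0, 0.0, -0.1)
    # compact=False keeps every cell at the requested resolution.
    return raster_to_dataframe(
        raster, transform, 6, nodata_value=0, compact=False, geo=False
    )


df = demo_raster_to_cells()
if df is not None:
    print(df.head())
```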

h3ronpy.pandas.raster.raster_to_geodataframe(*a, **kw) GeoDataFrame

Convert a raster/array to a geopandas GeoDataFrame

Uses the same parameters as raster_to_dataframe

Return type:

GeoDataFrame

Vector module

h3ronpy.pandas.vector.cells_dataframe_to_geodataframe(df: DataFrame, cell_column_name: str = 'cell') GeoDataFrame

Convert a dataframe with a column containing cells to a geodataframe

Parameters:
  • df (DataFrame) – input dataframe

  • cell_column_name (str) – name of the column containing the h3 indexes

Returns:

GeoDataFrame

Return type:

GeoDataFrame

h3ronpy.pandas.vector.cells_to_points(arr, radians: bool = False) GeoSeries

Create a geoseries containing the centroid point geometries of a cell array

Parameters:

radians (bool)

Return type:

GeoSeries

h3ronpy.pandas.vector.cells_to_polygons(arr, radians: bool = False, link_cells: bool = False) GeoSeries

Create a geoseries containing the polygon geometries of a cell array

Parameters:
  • radians (bool)

  • link_cells (bool)

Return type:

GeoSeries

h3ronpy.pandas.vector.directededges_to_linestrings(arr, radians: bool = False) GeoSeries

Create a geoseries containing the linestring geometries of a directededge array

Parameters:

radians (bool)

Return type:

GeoSeries

h3ronpy.pandas.vector.geodataframe_to_cells(gdf: GeoDataFrame, resolution: int, containment_mode: ContainmentMode = ContainmentMode.ContainsCentroid, compact: bool = False, cell_column_name: str = 'cell') DataFrame

Convert a GeoDataFrame to H3 cells while exploding all other columns according to the number of cells derived from the row's geometry.

The conversion of GeoDataFrames is parallelized using the available CPUs.

The duplication of all non-cell columns leads to increased memory requirements. Depending on the use-case some of the more low-level conversion functions should be preferred.

Parameters:
  • gdf (GeoDataFrame)

  • resolution (int) – H3 resolution

  • containment_mode (ContainmentMode) – Containment mode used to decide if a cell is contained in a polygon or not. See the ContainmentMode class.

  • compact (bool) – Compact the returned cells by replacing cells with their parent cells when all children of that cell are part of the set.

  • cell_column_name (str)

Return type:

DataFrame
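The effect of the compact parameter can be sketched on a toy hierarchy. This is not H3 and not the library's implementation: cell identifiers are plain strings, a parent is an identifier with its last character removed, and every cell is assumed to have seven children as H3 hexagons do. Pentagons and real H3 indexes are not modelled, and the helper names are hypothetical.

```python
CHILDREN_PER_CELL = 7  # assumption: every toy cell has seven children


def children(cell):
    """Return the identifiers of all children of a toy cell."""
    return {cell + str(i) for i in range(CHILDREN_PER_CELL)}


def compact(cells):
    """Replace complete sets of children with their parent until nothing changes."""
    cells = set(cells)
    changed = True
    while changed:
        changed = False
        # Candidate parents of everything that is not a root cell.
        parents = {c[:-1] for c in cells if len(c) > 1}
        for parent in parents:
            kids = children(parent)
            if kids <= cells:  # all children are part of the set
                cells -= kids
                cells.add(parent)
                changed = True
    return cells


# All children of "A" collapse into "A"; the lone cell "B3" is kept as-is.
print(sorted(compact(children("A") | {"B3"})))
```

The result mixes resolutions, which is why compacted output needs fewer rows but can no longer be treated as a single-resolution cell set.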

h3ronpy.pandas.vector.vertexes_to_points(arr, radians: bool = False) GeoSeries

Create a geoseries containing the point geometries of a vertex array

Parameters:

radians (bool)

Return type:

GeoSeries