AA Toolbox API

Configuration

class ochanticipy.CountryConfig(*, iso3: str, codab: CodABConfig | None = None, fewsnet: FewsNetConfig | None = None, glofas: GlofasConfig | None = None, usgs_ndvi: UsgsNdviConfig | None = None)

Country configuration.

Parameters:
  • iso3 (str) – Country ISO3, must be exactly 3 letters long

  • codab (CodABConfig, optional) – Configuration object for COD AB

  • fewsnet (FewsNetConfig, optional) – Configuration object for FEWS NET

  • glofas (GlofasConfig, optional) – Configuration object for GloFAS

  • usgs_ndvi (UsgsNdviConfig, optional) – Configuration object for USGS NDVI
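
The iso3 constraint can be checked before a configuration is built. The following is a minimal sketch only: is_valid_iso3 is a hypothetical helper and not part of ochanticipy, and it does not check that the code belongs to a real country.

```python
def is_valid_iso3(iso3: str) -> bool:
    """Hypothetical helper: True if iso3 is exactly three ASCII letters."""
    return len(iso3) == 3 and iso3.isascii() and iso3.isalpha()


print(is_valid_iso3("bfa"))  # three letters
print(is_valid_iso3("bf"))   # too short
```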

ochanticipy.create_country_config(iso3: str) -> CountryConfig

Return a country configuration object from AA Toolbox.

Parameters:

iso3 (str) – Country ISO3, must be exactly 3 letters long

Return type:

CountryConfig instance

ochanticipy.create_custom_country_config(filepath: str | Path) -> CountryConfig

Return a custom country configuration object.

Parameters:

filepath (str, pathlib.Path) – Path to the configuration file

Return type:

CountryConfig instance

Data sources

CHIRPS

Class to download and load CHIRPS observational precipitation data.

Daily

class ochanticipy.ChirpsDaily(country_config: CountryConfig, geo_bounding_box: GeoBoundingBox, resolution: float = 0.05, start_date: date | str | None = None, end_date: date | str | None = None)

Class object to retrieve CHIRPS observational daily precipitation data.

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – the bounding coordinates of the area that should be included in the data.

  • resolution (float, default = 0.05) – resolution of data to be downloaded. Can be 0.05 or 0.25.

  • start_date (Optional[Union[datetime.date, str]], default = None) – Data will be considered starting from date start_date. Input can be an ISO8601 string or datetime.date object. If None, it is set to 1981-1-1.

  • end_date (Optional[Union[datetime.date, str]], default = None) – Data will be considered up to date end_date. Input can be an ISO8601 string or datetime.date object. If None, it is set to the date for which most recent data is available.
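
The accepted date inputs (a datetime.date, an ISO8601 string, or None) can be normalised to a single type. This is a sketch under stated assumptions, not the library's implementation: to_date and CHIRPS_FIRST_DATE are hypothetical names, and only the 1981-1-1 default is taken from the parameter description above.

```python
from datetime import date
from typing import Optional, Union

# Default start date documented above for CHIRPS
CHIRPS_FIRST_DATE = date(1981, 1, 1)


def to_date(value: Optional[Union[date, str]], default: date) -> date:
    """Hypothetical helper: accept a date, an ISO8601 string, or None."""
    if value is None:
        return default
    if isinstance(value, str):
        return date.fromisoformat(value)
    return value


start = to_date("2007-10-23", default=CHIRPS_FIRST_DATE)
fallback = to_date(None, default=CHIRPS_FIRST_DATE)
print(start, fallback)
```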

Examples

>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox
>>> from ochanticipy import ChirpsDaily
>>> import datetime
>>>
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
>>> admin0 = codab.load(admin_level=0)
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin0)
>>> start_date = datetime.date(year=2007, month=10, day=23)
>>> end_date = datetime.date(year=2020, month=3, day=2)
>>> chirps_daily = ChirpsDaily(
...   country_config=country_config,
...   geo_bounding_box=geo_bounding_box,
...   start_date=start_date,
...   end_date=end_date
... )
>>> chirps_daily.download()
>>> chirps_daily.process()
>>> chirps_daily_data = chirps_daily.load()
download(clobber: bool = False)

Download the CHIRPS observed precipitation as NetCDF file.

Parameters:

clobber (bool, default = False) – If True, overwrites existing raw files

Return type:

The folder where the data is downloaded.

load() -> Dataset

Load the CHIRPS data.

Should only be called after the data has been downloaded and processed.

Return type:

The processed CHIRPS dataset.

process(clobber: bool = False)

Process the CHIRPS data.

Should only be called after the data has been downloaded.

Parameters:

clobber (bool, default = False) – If True, overwrites existing processed files.

Return type:

The folder where the data is processed.
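
The clobber flag follows a common pattern: existing files are left alone unless clobber is True. The sketch below illustrates that pattern with the standard library only; write_output and the file name are hypothetical and do not reflect how ochanticipy stores its files.

```python
import tempfile
from pathlib import Path


def write_output(filepath: Path, content: str, clobber: bool = False) -> bool:
    """Hypothetical helper: write filepath unless it exists and clobber is False.

    Returns True if the file was written, False if it was skipped.
    """
    if filepath.exists() and not clobber:
        return False
    filepath.write_text(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "chirps_example.nc"  # placeholder file name
    first = write_output(target, "v1")                # new file: written
    second = write_output(target, "v2")               # exists: skipped
    third = write_output(target, "v3", clobber=True)  # exists: overwritten
    final_content = target.read_text()
```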

Monthly

class ochanticipy.ChirpsMonthly(country_config: CountryConfig, geo_bounding_box: GeoBoundingBox, start_date: date | str | None = None, end_date: date | str | None = None)

Class object to retrieve CHIRPS observational monthly precipitation data.

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – the bounding coordinates of the area that should be included in the data.

  • start_date (Optional[Union[datetime.date, str]], default = None) – Data will be considered starting from date start_date. Input can be an ISO8601 string or datetime.date object. If None, it is set to 1981-1-1.

  • end_date (Optional[Union[datetime.date, str]], default = None) – Data will be considered up to date end_date. Input can be an ISO8601 string or datetime.date object. If None, it is set to the date for which most recent data is available.

Examples

>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox
>>> from ochanticipy import ChirpsMonthly
>>> import datetime
>>>
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
>>> admin0 = codab.load(admin_level=0)
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin0)
>>>
>>> start_date = datetime.date(year=2007, month=10, day=23)
>>> end_date = datetime.date(year=2020, month=3, day=2)
>>> chirps_monthly = ChirpsMonthly(
...   country_config=country_config,
...   geo_bounding_box=geo_bounding_box,
...   start_date=start_date,
...   end_date=end_date
... )
>>> chirps_monthly.download()
>>> chirps_monthly.process()
>>> chirps_monthly_data = chirps_monthly.load()
download(clobber: bool = False)

Download the CHIRPS observed precipitation as NetCDF file.

Parameters:

clobber (bool, default = False) – If True, overwrites existing raw files

Return type:

The folder where the data is downloaded.

load() -> Dataset

Load the CHIRPS data.

Should only be called after the data has been downloaded and processed.

Return type:

The processed CHIRPS dataset.

process(clobber: bool = False)

Process the CHIRPS data.

Should only be called after the data has been downloaded.

Parameters:

clobber (bool, default = False) – If True, overwrites existing processed files.

Return type:

The folder where the data is processed.

Common Operational Datasets

Download and manipulate COD administrative boundaries.

class ochanticipy.CodAB(country_config: CountryConfig)

Work with COD AB (administrative boundaries).

Parameters:

country_config (CountryConfig) – Country configuration

download(clobber: bool = False) -> Path | List[Path]

Download COD AB file from HDX.

Parameters:

clobber (bool, default = False) – If True, overwrites existing COD AB files

Return type:

The downloaded filepath(s)

Examples

>>> from ochanticipy import create_country_config, CodAB
>>> # Download COD administrative boundaries for Nepal
>>> country_config = create_country_config(iso3="npl")
>>> codab = CodAB(country_config=country_config)
>>> npl_cod_shapefile = codab.download()
load(admin_level: int = 0) -> GeoDataFrame

Get the COD AB data by admin level.

Parameters:

admin_level (int, default = 0) – The administrative level

Return type:

COD AB geodataframe with specified admin level

Raises:
  • AttributeError – If the requested admin level is higher than what is available

  • FileNotFoundError – If the requested filename or layer name are not found

Examples

>>> from ochanticipy import create_country_config, CodAB
>>>
>>> # Retrieve admin 2 boundaries for Nepal
>>> country_config = create_country_config(iso3="npl")
>>> codab = CodAB(country_config=country_config)
>>> npl_admin2 = codab.load(admin_level=2)
load_custom(custom_layer_number: int = 0) -> GeoDataFrame

Get the COD AB data from a custom (non-level) layer.

Parameters:

custom_layer_number (int) – The 0-indexed number of the layer listed in the custom_layer_names parameter of the country’s config file

Return type:

COD AB geodataframe with custom admin level

Raises:
  • AttributeError – If the requested custom layer number is not available

  • FileNotFoundError – If the requested filename or layer name are not found

Examples

>>> from ochanticipy import create_country_config, CodAB
>>>
>>> # Retrieve district boundaries for Nepal
>>> country_config = create_country_config(iso3="npl")
>>> codab = CodAB(country_config=country_config)
>>> npl_district = codab.load_custom(custom_layer_number=0)
process(*args, **kwargs)

Process COD AB data.

Method not implemented.

FEWS NET

FEWS NET processing.

Download and save the data published by FEWS NET at <https://fews.net/>.

FEWS NET data is only available for a limited set of countries. Check their website to see which countries are included.

class ochanticipy.FewsNet(country_config)

Base class to retrieve FEWS NET data.

Parameters:

country_config (CountryConfig) – Country configuration

download(pub_year: int, pub_month: int, clobber: bool = False) -> Path

Retrieve the raw FEWS NET data.

Depending on the region and date, this data is published per region or per country. This function retrieves the country data if it exists, and otherwise the regional data, for pub_year-pub_month.

Parameters:
  • pub_year (int) – publication year of the data that should be downloaded

  • pub_month (int) – publication month of the data that should be downloaded. This commonly refers to the month of the Current Situation period

  • clobber (bool, default = False) – If True, overwrites existing raw files

Return type:

Path to the downloaded file.

Examples

>>> from ochanticipy import create_country_config, FewsNet
>>> # Download FEWS NET data for ETH published in 2021-06
>>> country_config = create_country_config(iso3="eth")
>>> fewsnet = FewsNet(country_config=country_config)
>>> eth_fn_202106_path = fewsnet.download(pub_year=2021, pub_month=6)
load(pub_year: int, pub_month: int, projection_period: ValidProjectionPeriods) -> GeoDataFrame

Load FEWS NET data for the given pub_year, pub_month and projection_period.

Parameters:
  • pub_year (int) – publication year of the data that should be loaded

  • pub_month (int) – publication month of the data that should be loaded. This refers to the first month of the Current Situation period

  • projection_period (str) – The projection period to be loaded. This should be CS, ML1, or ML2, referring to Current Situation, near-term projection, and medium-term projection respectively.

Return type:

Geopandas DataFrame with the specified data.
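
The three accepted projection period codes can be written down as a small lookup. This is an illustrative sketch: ProjectionPeriod and describe are hypothetical names and are unrelated to the ValidProjectionPeriods type in the signature above, whose definition is not shown here.

```python
from enum import Enum


class ProjectionPeriod(str, Enum):
    """Hypothetical enum of the three documented projection periods."""

    CS = "Current Situation"
    ML1 = "near-term projection"
    ML2 = "medium-term projection"


def describe(period_code: str) -> str:
    """Look up a period code, raising ValueError for anything else."""
    try:
        return ProjectionPeriod[period_code].value
    except KeyError:
        valid = [period.name for period in ProjectionPeriod]
        raise ValueError(f"projection_period must be one of {valid}") from None


print(describe("ML1"))
```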

Examples

>>> from ochanticipy import create_country_config, FewsNet
>>> # Load FEWS NET data for ETH published in 2021-06 for the near-term
>>> # projection period (ML1)
>>> country_config = create_country_config(iso3="eth")
>>> fewsnet = FewsNet(country_config=country_config)
>>> gdf_eth_fn_202106 = fewsnet.load(pub_year=2021, pub_month=6,
...                                  projection_period="ML1")
process(*args, **kwargs)

Process FEWS NET data.

Method not implemented.

GloFAS

Attention

This module has the following additional requirements:

cdsapi
cfgrib

These can be installed as follows:

python -m pip install ocha-anticipy[glofas]

Base class for downloading and processing GloFAS river discharge data.

Reanalysis

class ochanticipy.GlofasReanalysis(country_config: CountryConfig, geo_bounding_box: GeoBoundingBox, start_date: date | str = None, end_date: date | str = None, model_version: int = 4)

Class for downloading and processing GloFAS reanalysis data.

The GloFAS reanalysis dataset is a global raster presenting river discharge from 1979 until present day (updated daily), see this paper for more details.

This class downloads the raw raster data from CDS, and processes it from a raster to a dataset of reporting points from the GloFAS interface. Due to the CDS request size limits, separate files are downloaded per year.

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – The bounding coordinates of the area that should be included

  • start_date (Union[date, str], default: date(1979, 1, 1)) – The starting date for the dataset. If left blank, defaults to the earliest available date

  • end_date (Union[date, str], default: date.today()) – The ending date for the dataset. If left blank, defaults to the current date

  • model_version (int, default: 4) – The version of the GloFAS model to use, can only be 3 or 4. If in doubt, always use the latest (default).
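
Because reanalysis files are downloaded per year, a date range maps to one request per calendar year it touches. The sketch below shows that mapping only; years_to_download is a hypothetical helper and the actual request logic in ochanticipy is not shown in this reference.

```python
from datetime import date
from typing import List


def years_to_download(start_date: date, end_date: date) -> List[int]:
    """Hypothetical helper: one entry per calendar year touched by the range."""
    return list(range(start_date.year, end_date.year + 1))


# Default start date from the parameter list, end date from the example
years = years_to_download(date(1979, 1, 1), date(2022, 10, 22))
print(len(years), years[0], years[-1])
```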

Examples

Download, process and load all historical GloFAS reanalysis data until the current date, set to Oct 22, 2022 for this example.

>>> from datetime import date
>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox
>>> from ochanticipy import GlofasReanalysis
>>>
>>> country_config = create_country_config(iso3="npl")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
>>> admin_npl = codab.load()
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin_npl)
>>>
>>> glofas_reanalysis = GlofasReanalysis(
...     country_config=country_config,
...     geo_bounding_box=geo_bounding_box,
...     end_date=date(year=2022, month=10, day=22)
... )
>>> glofas_reanalysis.download()
>>> glofas_reanalysis.process()
>>>
>>> npl_glofas_reanalysis_reporting_points = glofas_reanalysis.load()
download(clobber: bool = False) -> List[Path]

Download the GloFAS data by querying CDS.

The raw GloFAS data is available as a global raster in CDS. This method downloads the raster files for the specified region of interest and date range. The files are in GRIB format and are split up either by day, month, or year depending on the GloFAS product.

Parameters:

clobber (bool, default = False) – Overwrite files that were already downloaded

Return type:

A list of paths to the downloaded files

load() -> Dataset

Load the processed GloFAS data as an xarray.Dataset.

Returns:

A single xarray dataset containing all GloFAS reporting points and their associated river discharge

process(clobber: bool = False) -> List[Path]

Process the downloaded GloFAS files.

For each raw GRIB file, read it in and extract the river discharge from the reporting point coordinates specified in the configuration file. Saves the output as a NetCDF file, where files are split by day, month or year depending on the GloFAS product.

Parameters:

clobber (bool, default = False) – Overwrite files that were already processed

Return type:

A list of paths to the processed files

Forecast

class ochanticipy.GlofasForecast(country_config: CountryConfig, geo_bounding_box: GeoBoundingBox, leadtime_max: int, start_date: date | str = None, end_date: date | str = None, model_version: int = 4)

Class for downloading and processing GloFAS forecast data.

The GloFAS forecast dataset is a global raster presenting river discharge forecast from 26 May 2021 until present day (updated daily), see this paper for more details. While CDS does have version 3 pre-release data from 2020-2021, we understand that there were some small issues that were fixed in the final version, so at this point in time this module does not support downloading the pre-release data.

This class downloads the raw raster data from CDS, and processes it from a raster to a dataset of reporting points from the GloFAS interface. Due to the CDS request size limits, separate files are downloaded per day (that contain all requested lead times).

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – The bounding coordinates of the area that should be included

  • leadtime_max (int) – The maximum desired lead time D in days. All forecast data for lead times 1 to D days are downloaded

  • start_date (Union[date, str], default: date(year=2021, month=5, day=26)) – The starting date for the dataset. If left blank, defaults to the earliest available date

  • end_date (Union[date, str], default: date.today()) – The ending date for the dataset. If left blank, defaults to the current date

  • model_version (int, default: 4) – The version of the GloFAS model to use, can only be 3 or 4. If in doubt, always use the latest (default).
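
For forecasts, leadtime_max expands to lead times 1 to D, and the date range expands to one file per day. The sketch below illustrates both expansions; lead_times and forecast_dates are hypothetical helpers, not ochanticipy functions.

```python
from datetime import date, timedelta
from typing import List


def lead_times(leadtime_max: int) -> List[int]:
    """Hypothetical helper: lead times 1 to leadtime_max, in days."""
    return list(range(1, leadtime_max + 1))


def forecast_dates(start_date: date, end_date: date) -> List[date]:
    """Hypothetical helper: every date from start_date to end_date inclusive."""
    n_days = (end_date - start_date).days
    return [start_date + timedelta(days=i) for i in range(n_days + 1)]


leads = lead_times(15)
dates = forecast_dates(date(2022, 9, 22), date(2022, 10, 22))
print(len(leads), len(dates))
```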

Examples

Download, process and load GloFAS forecast data for the past month, for a lead time of 15 days.

>>> from datetime import date
>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox
>>> from ochanticipy import GlofasForecast
>>>
>>> country_config = create_country_config(iso3="npl")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
>>> admin_npl = codab.load()
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin_npl)
>>>
>>> glofas_forecast = GlofasForecast(
...     country_config=country_config,
...     geo_bounding_box=geo_bounding_box,
...     leadtime_max=15,
...     end_date=date(year=2022, month=10, day=22),
...     start_date=date(year=2022, month=9, day=22)
... )
>>> glofas_forecast.download()
>>> glofas_forecast.process()
>>>
>>> npl_glofas_forecast_reporting_points = glofas_forecast.load()
download(clobber: bool = False) -> List[Path]

Download the GloFAS data by querying CDS.

The raw GloFAS data is available as a global raster in CDS. This method downloads the raster files for the specified region of interest and date range. The files are in GRIB format and are split up either by day, month, or year depending on the GloFAS product.

Parameters:

clobber (bool, default = False) – Overwrite files that were already downloaded

Return type:

A list of paths to the downloaded files

load() -> Dataset

Load the processed GloFAS data as an xarray.Dataset.

Returns:

A single xarray dataset containing all GloFAS reporting points and their associated river discharge

process(clobber: bool = False) -> List[Path]

Process the downloaded GloFAS files.

For each raw GRIB file, read it in and extract the river discharge from the reporting point coordinates specified in the configuration file. Saves the output as a NetCDF file, where files are split by day, month or year depending on the GloFAS product.

Parameters:

clobber (bool, default = False) – Overwrite files that were already processed

Return type:

A list of paths to the processed files

Reforecast

class ochanticipy.GlofasReforecast(country_config: CountryConfig, geo_bounding_box: GeoBoundingBox, leadtime_max: int, start_date: date | str = None, end_date: date | str = None, model_version: int = 4)

Class for downloading and processing GloFAS reforecast data.

The GloFAS reforecast dataset is a global raster presenting river discharge forecasted from 1999 until 2018, see this paper for more details.

This class downloads the raw raster data from CDS, and processes it from a raster to a dataset of reporting points from the GloFAS interface. Due to the CDS request size limits, separate files are downloaded per month (that contain all requested lead times).

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – The bounding coordinates of the area that should be included

  • leadtime_max (int) – The maximum desired lead time D in days. All forecast data for lead times 1 to D days are downloaded

  • start_date (Union[date, str], default: date(year=1999, month=1, day=1)) – The starting date for the dataset. If left blank, defaults to the earliest available date

  • end_date (Union[date, str], default: date(year=2018, month=12, day=31)) – The ending date for the dataset. If left blank, defaults to the last available date

  • model_version (int, default: 4) – The version of the GloFAS model to use, can only be 3 or 4. If in doubt, always use the latest (default).

Examples

Download, process and load all available GloFAS reforecast data for a lead time of 15 days.

>>> from ochanticipy import create_country_config, CodAB, GeoBoundingBox
>>> from ochanticipy import GlofasReforecast
>>>
>>> country_config = create_country_config(iso3="npl")
>>> codab = CodAB(country_config=country_config)
>>> codab.download()
>>> admin_npl = codab.load()
>>> geo_bounding_box = GeoBoundingBox.from_shape(admin_npl)
>>>
>>> glofas_reforecast = GlofasReforecast(
...     country_config=country_config,
...     geo_bounding_box=geo_bounding_box,
...     leadtime_max=15
... )
>>> glofas_reforecast.download()
>>> glofas_reforecast.process()
>>>
>>> npl_glofas_reforecast_reporting_points = glofas_reforecast.load()
download(clobber: bool = False) -> List[Path]

Download the GloFAS data by querying CDS.

The raw GloFAS data is available as a global raster in CDS. This method downloads the raster files for the specified region of interest and date range. The files are in GRIB format and are split up either by day, month, or year depending on the GloFAS product.

Parameters:

clobber (bool, default = False) – Overwrite files that were already downloaded

Return type:

A list of paths to the downloaded files

load() -> Dataset

Load the processed GloFAS data as an xarray.Dataset.

Returns:

A single xarray dataset containing all GloFAS reporting points and their associated river discharge

process(clobber: bool = False) -> List[Path]

Process the downloaded GloFAS files.

For each raw GRIB file, read it in and extract the river discharge from the reporting point coordinates specified in the configuration file. Saves the output as a NetCDF file, where files are split by day, month or year depending on the GloFAS product.

Parameters:

clobber (bool, default = False) – Overwrite files that were already processed

Return type:

A list of paths to the processed files

IRI

Class to download and load IRI’s seasonal forecast.

Probability forecast

class ochanticipy.IriForecastProb(country_config, geo_bounding_box: GeoBoundingBox)

Class to retrieve IRI’s forecast data per tercile.

The retrieved data contains the probability per tercile for the given bounding box. All seasons and lead times are downloaded automatically.

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – the bounding coordinates of the area that should be included in the data.
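
When a bounding box is built by hand, it helps to confirm that each maximum exceeds its minimum, which is easy to get wrong for negative longitudes. The sketch below uses a hypothetical BoxSketch dataclass as a stand-in; it is not the ochanticipy GeoBoundingBox class, and whether that class performs a comparable check is not stated in this reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxSketch:
    """Hypothetical stand-in for a bounding box; not the ochanticipy class."""

    lat_max: float
    lat_min: float
    lon_max: float
    lon_min: float

    def is_consistent(self) -> bool:
        """True if each maximum exceeds its minimum and values are in range."""
        return (
            -90 <= self.lat_min < self.lat_max <= 90
            and -180 <= self.lon_min < self.lon_max <= 180
        )


# -2.0 lies east of -3.0, so it is the larger (maximum) longitude
box = BoxSketch(lat_max=13.0, lat_min=12.0, lon_max=-2.0, lon_min=-3.0)
print(box.is_consistent())
```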

Examples

>>> from ochanticipy import create_country_config, GeoBoundingBox
>>> from ochanticipy import IriForecastProb
>>> country_config = create_country_config(iso3="bfa")
>>> geo_bounding_box = GeoBoundingBox(lat_max=13.0,
...                                   lat_min=12.0,
...                                   lon_max=-2.0,
...                                   lon_min=-3.0)
>>>
>>> # Initialize class and retrieve data
>>> iri = IriForecastProb(country_config,geo_bounding_box)
>>> iri.download() # Must have IRI_AUTH environment variable set
>>> iri.process()
>>>
>>> iri_data = iri.load()
download(clobber: bool = False) -> Path

Download the IRI seasonal tercile forecast as NetCDF file.

To download data from the IRI API, a key is required for authentication, and it must be set in the IRI_AUTH environment variable. To obtain this key, you need to create an account here. Note that this key might change over time and may need to be updated regularly.
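
Since the key is read from the environment, it can be useful to fail early with a clear message when IRI_AUTH is missing. This is a minimal sketch: get_iri_auth is a hypothetical helper, and the error type and message are assumptions rather than what ochanticipy raises.

```python
import os
from typing import Mapping


def get_iri_auth(environ: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: read the key from IRI_AUTH or fail early."""
    key = environ.get("IRI_AUTH")
    if not key:
        raise RuntimeError(
            "Set the IRI_AUTH environment variable before calling download()"
        )
    return key


# An explicit mapping is passed so the example does not depend on the shell
print(get_iri_auth({"IRI_AUTH": "placeholder-key"}))
```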

Parameters:

clobber (bool, default = False) – If True, overwrites existing raw files

Return type:

The downloaded filepath

load() -> Dataset

Load the IRI forecast data.

Should only be called after the download and process methods have been executed.

Return type:

The processed IRI dataset

process(clobber: bool = False) -> Path

Process the IRI forecast.

Should only be called after the download method has been executed.

Parameters:

clobber (bool, default = False) – If True, overwrites existing processed files

Return type:

The processed filepath

Dominant forecast

class ochanticipy.IriForecastDominant(country_config: CountryConfig, geo_bounding_box: GeoBoundingBox)

Class to retrieve IRI’s forecast dominant tercile data.

The retrieved data contains the dominant probability. All seasons and lead times are downloaded automatically.

Parameters:
  • country_config (CountryConfig) – Country configuration

  • geo_bounding_box (GeoBoundingBox) – the bounding coordinates of the area that should be included in the data.

Examples

>>> from ochanticipy import create_country_config, GeoBoundingBox
>>> from ochanticipy import IriForecastDominant
>>> country_config = create_country_config(iso3="bfa")
>>> geo_bounding_box = GeoBoundingBox(lat_max=13.0,
...                                   lat_min=12.0,
...                                   lon_max=-2.0,
...                                   lon_min=-3.0)
>>>
>>> # Initialize class and retrieve data
>>> iri = IriForecastDominant(country_config,geo_bounding_box)
>>> iri.download() # Must have IRI_AUTH environment variable set
>>> iri.process()
>>>
>>> iri_data = iri.load()
download(clobber: bool = False) -> Path

Download the IRI seasonal tercile forecast as NetCDF file.

To download data from the IRI API, a key is required for authentication, and it must be set in the IRI_AUTH environment variable. To obtain this key, you need to create an account here. Note that this key might change over time and may need to be updated regularly.

Parameters:

clobber (bool, default = False) – If True, overwrites existing raw files

Return type:

The downloaded filepath

load() -> Dataset

Load the IRI forecast data.

Should only be called after the download and process methods have been executed.

Return type:

The processed IRI dataset

process(clobber: bool = False) -> Path

Process the IRI forecast.

Should only be called after the download method has been executed.

Parameters:

clobber (bool, default = False) – If True, overwrites existing processed files

Return type:

The processed filepath

NDVI (USGS eMODIS)

Classes to download and process USGS FEWS NET NDVI data.

Download, process, and load NDVI data published in the USGS FEWS NET data portal. Classes available to process temporally smoothed NDVI values, percent of median values, difference to median values, and difference from current value to the previous year’s value.

Smoothed

class ochanticipy.UsgsNdviSmoothed(country_config: CountryConfig, start_date: date | str | Tuple[int, int] | None = None, end_date: date | str | Tuple[int, int] | None = None)

Base class to retrieve smoothed NDVI data.

The retrieved data are the smoothed NDVI values processed by the USGS. Temporal smoothing is done to adjust for cloud cover and other errors. Data for the 3 most recent dekads are not fully smoothed and are re-smoothed at the end of the 3-dekad period.

Parameters:
  • country_config (CountryConfig) – Country configuration

  • start_date (_DATE_TYPE, default = None) – Start date. Can be passed as a datetime.date object or a date string in ISO8601 format, in which case the relevant dekad will be determined, or directly as a year-dekad tuple, e.g. (2020, 1). If None, start_date is set to the earliest date with data: 2002, dekad 19.

  • end_date (_DATE_TYPE, default = None) – End date. Can be passed as a datetime.date object or a date string in ISO8601 format, in which case the relevant dekad will be determined, or as a year-dekad tuple, e.g. (2020, 1). If None, end_date is set to date.today().
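
The mapping from a calendar date to a year-dekad tuple can be sketched as follows. This assumes the conventional dekad definition (36 per year, three per month, covering days 1-10, 11-20, and 21 to month end); date_to_dekad is a hypothetical helper and the library's own conversion is not shown in this reference.

```python
from datetime import date
from typing import Tuple


def date_to_dekad(d: date) -> Tuple[int, int]:
    """Hypothetical helper: map a date to a (year, dekad) tuple.

    Assumes 36 dekads per year, three per month: days 1-10, 11-20,
    and 21 to the end of the month.
    """
    dekad_in_month = min((d.day - 1) // 10, 2)  # 0, 1 or 2
    return d.year, (d.month - 1) * 3 + dekad_in_month + 1


print(date_to_dekad(date(2020, 1, 5)))
print(date_to_dekad(date(2002, 7, 1)))
```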

Examples

>>> from ochanticipy import create_country_config, CodAB, UsgsNdviSmoothed
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=[2020, 1],
...     end_date=[2020, 3]
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # load in processed data
>>> df = bfa_ndvi.load(feature_col="ADM2_FR")
download(clobber: bool = False) -> Path

Download raw NDVI data as .tif files.

NDVI data is downloaded from the USGS API, with data for individual regions, years, and dekads stored as separate .tif files. No authentication is required. Data is downloaded for all available dekads from self._start_date to self._end_date.

Parameters:

clobber (bool, default = False) – If True, overwrites existing files

Returns:

The downloaded filepath

Return type:

Path

Examples

>>> from ochanticipy import create_country_config, UsgsNdviSmoothed
>>>
>>> # Create the country configuration for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=[2020, 1],
...     end_date=[2020, 3]
... )
>>> bfa_ndvi.download()
load(feature_col: str) -> DataFrame

Load the processed USGS NDVI data.

Parameters:

feature_col (str) – String used as a suffix to the processed file path for unique identification of analyses done on different files and columns. The same value must be passed to process().

Returns:

The processed NDVI dataset.

Return type:

pd.DataFrame

Raises:

FileNotFoundError – If the requested file cannot be found.

Examples

>>> from ochanticipy import create_country_config, CodAB, UsgsNdviSmoothed
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=[2020, 1],
...     end_date=[2020, 3]
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...    gdf=bfa_admin2,
...    feature_col="ADM2_FR"
... )
>>> bfa_ndvi.load(feature_col="ADM2_FR")
load_raster(load_date: date | str | Tuple[int, int]) -> DataArray

Load raster for specific year and dekad.

Parameters:

load_date (Union[date, str, Tuple[int, int]]) – Date. Can be passed as a datetime.date object or a date string in ISO8601 format, in which case the relevant dekad will be determined, or as a year-dekad tuple, e.g. (2020, 1).

Returns:

Data array of NDVI data.

Return type:

xr.DataArray

Raises:

FileNotFoundError – If the requested file cannot be found.

process(gdf: GeoDataFrame, feature_col: str, clobber: bool = False, **kwargs) -> Path

Process NDVI data for specific area.

NDVI data is clipped to the provided geometries, usually the geometry column of a geopandas GeoDataFrame. kwargs are passed on to oap.compute_raster_stats(). The feature_col is used to define the unique processed file.

The processing keeps track of the latest timestamp of when the raw raster files were modified. If the latest timestamp of the raw data is greater than when it was last processed, the file will automatically be re-processed.
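
The staleness rule described above (re-process when the raw data is newer than the last processing) can be illustrated with file modification times. This is a sketch under stated assumptions: needs_reprocessing and the file names are hypothetical, and ochanticipy may record its timestamps differently, for example inside the processed file.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def needs_reprocessing(raw_files: Iterable[Path], processed_file: Path) -> bool:
    """Hypothetical helper: True if any raw file is newer than the output."""
    if not processed_file.exists():
        return True
    latest_raw = max(f.stat().st_mtime for f in raw_files)
    return latest_raw > processed_file.stat().st_mtime


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw_example.tif"          # placeholder file names
    processed = Path(tmp) / "processed_example.csv"
    raw.write_text("raw")
    processed.write_text("processed")
    # Set explicit modification times so the comparison is deterministic
    os.utime(raw, (1000, 1000))
    os.utime(processed, (2000, 2000))
    before_update = needs_reprocessing([raw], processed)
    os.utime(raw, (3000, 3000))  # raw data modified after processing
    after_update = needs_reprocessing([raw], processed)
```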

Parameters:
  • gdf (geopandas.GeoDataFrame) – GeoDataFrame with row per area for stats computation. If pd.DataFrame is passed, geometry column must have the name geometry. Passed to oap.compute_raster_stats().

  • feature_col (str) – Column in gdf to use as the row/feature identifier. Passed to oap.compute_raster_stats(). The string is also used as a suffix to the processed file path for unique identification of analyses done on different files and columns.

  • clobber (bool, default = False) – If True, overwrites existing processed dates. If the new file matches the old file, dates will be reprocessed and appended to the data frame. If the files do not match, the old file will be replaced. If False, stats are only calculated for year-dekads that have not already been calculated within the file. However, if False and the files do not match, a value error will be raised.

  • **kwargs – Additional keyword arguments passed to oap.compute_raster_stats().

Returns:

The processed path

Return type:

Path

Examples

>>> from ochanticipy import create_country_config, CodAB, UsgsNdviSmoothed
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>> bfa_admin1 = codab.load(admin_level=1)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=[2020, 1],
...     end_date=[2020, 3]
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # process for admin1
>>> bfa_ndvi.process(
...     gdf=bfa_admin1,
...     feature_col="ADM1_FR"
... )

Percent of median

class ochanticipy.UsgsNdviPctMedian(country_config: CountryConfig, start_date: date | str | Tuple[int, int] | None = None, end_date: date | str | Tuple[int, int] | None = None)

Base class to retrieve % of median NDVI.

The retrieved data is NDVI expressed as a percentage of the median NDVI for 2003 - 2017, as processed by the USGS.
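As a rough illustration of what "percent of median" means, the sketch below computes it for one pixel. The function name and the sample values are made up, and the USGS product may apply scaling or masking that is not shown here.

```python
from statistics import median
from typing import Sequence


def percent_of_median(current: float, baseline: Sequence[float]) -> float:
    """Express ``current`` as a percentage of the baseline median."""
    return 100.0 * current / median(baseline)


# Made-up NDVI values for one pixel and dekad across the baseline years.
baseline = [0.40, 0.50, 0.60]
print(percent_of_median(0.25, baseline))  # 50.0
```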

Parameters:
  • country_config (CountryConfig) – Country configuration

  • start_date (_DATE_TYPE, default = None) – Start date. Can be passed as a datetime.date object or a date string in ISO8601 format, and the relevant dekad will be determined. Or pass directly as a year-dekad tuple, e.g. (2020, 1). If None, start_date is set to the earliest date with data: 2002, dekad 19.

  • end_date (_DATE_TYPE, default = None) – End date. Can be passed as a datetime.date object and the relevant dekad will be determined, as a date string in ISO8601 format, or as a year-dekad tuple, e.g. (2020, 1). If None, end_date is set to date.today().
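For readers unfamiliar with dekads, here is a minimal sketch of how a calendar date maps to a year-dekad pair under the usual convention of 36 dekads per year (days 1-10, 11-20, and 21 to month end). `date_to_dekad` is a hypothetical helper, and the library's own conversion is assumed rather than confirmed to follow the same rule.

```python
from datetime import date
from typing import Tuple


def date_to_dekad(d: date) -> Tuple[int, int]:
    """Map a calendar date to a (year, dekad) pair, dekad in 1..36."""
    # Days 1-10 -> 0, 11-20 -> 1, 21 to month end -> 2.
    part = min((d.day - 1) // 10, 2)
    return d.year, (d.month - 1) * 3 + part + 1


print(date_to_dekad(date.fromisoformat("2020-01-15")))  # (2020, 2)
print(date_to_dekad(date(2002, 7, 1)))  # (2002, 19)
```

Under this convention, dekad 19 of 2002 starts on 1 July 2002, which matches the earliest date quoted for start_date.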

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviPctMedian
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviPctMedian(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # load in processed data
>>> df = bfa_ndvi.load(feature_col="ADM2_FR")

download(clobber: bool = False) Path

Download raw NDVI data as .tif files.

NDVI data is downloaded from the USGS API, with data for individual regions, years, and dekads stored as separate .tif files. No authentication is required. Data is downloaded for all available dekads from self._start_date to self._end_date.
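Since one file is stored per dekad, a download covers every year-dekad pair between the start and end dates. The sketch below enumerates such pairs; `iter_dekads` is a hypothetical helper assuming 36 dekads per year, not a function exposed by the library.

```python
from typing import Iterator, Tuple

Dekad = Tuple[int, int]  # (year, dekad)


def iter_dekads(start: Dekad, end: Dekad) -> Iterator[Dekad]:
    """Yield every (year, dekad) from start to end, inclusive."""
    year, dekad = start
    while (year, dekad) <= end:
        yield year, dekad
        dekad += 1
        if dekad > 36:
            # Roll over into the first dekad of the next year.
            year, dekad = year + 1, 1


print(list(iter_dekads((2019, 35), (2020, 2))))
# [(2019, 35), (2019, 36), (2020, 1), (2020, 2)]
```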

Parameters:

clobber (bool, default = False) – If True, overwrites existing files

Returns:

The downloaded filepath

Return type:

Path

Examples

>>> from ochanticipy import create_country_config, UsgsNdviSmoothed
>>>
>>> # Create the country configuration for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()

load(feature_col: str) DataFrame

Load the processed USGS NDVI data.

Parameters:

feature_col (str) – String used as a suffix to the processed file path, to uniquely identify analyses done on different files and columns. The same value must be passed to process().

Returns:

The processed NDVI dataset.

Return type:

pd.DataFrame

Raises:

FileNotFoundError – If the requested file cannot be found.

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviSmoothed
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>> bfa_ndvi.load(feature_col="ADM2_FR")

load_raster(load_date: date | str | Tuple[int, int]) DataArray

Load raster for specific year and dekad.

Parameters:

load_date (Union[date, str, Tuple[int, int]]) – Date. Can be passed as a datetime.date object and the relevant dekad will be determined, as a date string in ISO8601 format, or as a year-dekad tuple, e.g. (2020, 1).

Returns:

Data array of NDVI data.

Return type:

xr.DataArray

Raises:

FileNotFoundError – If the requested file cannot be found.

process(gdf: GeoDataFrame, feature_col: str, clobber: bool = False, **kwargs) Path

Process NDVI data for specific area.

NDVI data is clipped to the provided geometries, usually the geometry column of a geopandas GeoDataFrame. kwargs are passed on to oap.compute_raster_stats(). The feature_col is used to define the unique processed file.

The processing keeps track of the latest timestamp of when the raw raster files were modified. If the latest timestamp of the raw data is greater than when it was last processed, the file will automatically be re-processed.
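The staleness check described here amounts to comparing file modification times. The sketch below shows one way to express it with the standard library; `needs_reprocessing` and the way the last-processed time is stored are assumptions for illustration, not the library's internals.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def needs_reprocessing(raw_files: Iterable[Path], last_processed: float) -> bool:
    """Return True if any raw file was modified after ``last_processed``."""
    latest_raw = max(p.stat().st_mtime for p in raw_files)
    return latest_raw > last_processed


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw.tif"
    raw.write_bytes(b"")
    os.utime(raw, (1_000, 1_000))  # pretend the raw file dates from t=1000
    stale = needs_reprocessing([raw], last_processed=500)
    fresh = needs_reprocessing([raw], last_processed=2_000)

print(stale, fresh)  # True False
```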

Parameters:
  • gdf (geopandas.GeoDataFrame) – GeoDataFrame with row per area for stats computation. If pd.DataFrame is passed, geometry column must have the name geometry. Passed to oap.compute_raster_stats().

  • feature_col (str) – Column in gdf to use as row/feature identifier. Passed to oap.compute_raster_stats(). The string is also used as a suffix to the processed file path, to uniquely identify analyses done on different files and columns.

  • clobber (bool, default = False) – If True, overwrites existing processed dates. If the new file matches the old file, dates will be reprocessed and appended to the data frame. If the files do not match, the old file will be replaced. If False, stats are only calculated for year-dekads that have not already been calculated within the file. However, if False and the files do not match, a ValueError is raised.

  • **kwargs – Additional keyword arguments passed to oap.compute_raster_stats().

Returns:

The processed path

Return type:

Path

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviSmoothed
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>> bfa_admin1 = codab.load(admin_level=1)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # process for admin1
>>> bfa_ndvi.process(
...     gdf=bfa_admin1,
...     feature_col="ADM1_FR"
... )

Median anomaly

class ochanticipy.UsgsNdviMedianAnomaly(country_config: CountryConfig, start_date: date | str | Tuple[int, int] | None = None, end_date: date | str | Tuple[int, int] | None = None)[source]

Base class to retrieve NDVI anomaly data.

The retrieved data is NDVI anomaly data calculated as a subtraction of the median value, based on data from 2003 - 2017, from the current value. Negative values indicate less vegetation than the median, positive values indicate more vegetation.
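The subtraction described above can be sketched for a single pixel as follows. The function name and sample values are invented for illustration, and any scaling applied by the USGS product is not represented.

```python
from statistics import median
from typing import Sequence


def median_anomaly(current: float, baseline: Sequence[float]) -> float:
    """Return the current value minus the baseline median."""
    return current - median(baseline)


# Made-up NDVI values for one pixel and dekad across the baseline years.
baseline = [0.25, 0.5, 0.75]
print(median_anomaly(0.25, baseline))  # -0.25, i.e. less vegetation than the median
```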

Parameters:
  • country_config (CountryConfig) – Country configuration

  • start_date (_DATE_TYPE, default = None) – Start date. Can be passed as a datetime.date object or a date string in ISO8601 format, and the relevant dekad will be determined. Or pass directly as a year-dekad tuple, e.g. (2020, 1). If None, start_date is set to the earliest date with data: 2002, dekad 19.

  • end_date (_DATE_TYPE, default = None) – End date. Can be passed as a datetime.date object and the relevant dekad will be determined, as a date string in ISO8601 format, or as a year-dekad tuple, e.g. (2020, 1). If None, end_date is set to date.today().

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviMedianAnomaly
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviMedianAnomaly(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # load in processed data
>>> df = bfa_ndvi.load(feature_col="ADM2_FR")

download(clobber: bool = False) Path

Download raw NDVI data as .tif files.

NDVI data is downloaded from the USGS API, with data for individual regions, years, and dekads stored as separate .tif files. No authentication is required. Data is downloaded for all available dekads from self._start_date to self._end_date.

Parameters:

clobber (bool, default = False) – If True, overwrites existing files

Returns:

The downloaded filepath

Return type:

Path

Examples

>>> from ochanticipy import create_country_config, UsgsNdviSmoothed
>>>
>>> # Create the country configuration for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()

load(feature_col: str) DataFrame

Load the processed USGS NDVI data.

Parameters:

feature_col (str) – String used as a suffix to the processed file path, to uniquely identify analyses done on different files and columns. The same value must be passed to process().

Returns:

The processed NDVI dataset.

Return type:

pd.DataFrame

Raises:

FileNotFoundError – If the requested file cannot be found.

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviSmoothed
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>> bfa_ndvi.load(feature_col="ADM2_FR")

load_raster(load_date: date | str | Tuple[int, int]) DataArray

Load raster for specific year and dekad.

Parameters:

load_date (Union[date, str, Tuple[int, int]]) – Date. Can be passed as a datetime.date object and the relevant dekad will be determined, as a date string in ISO8601 format, or as a year-dekad tuple, e.g. (2020, 1).

Returns:

Data array of NDVI data.

Return type:

xr.DataArray

Raises:

FileNotFoundError – If the requested file cannot be found.

process(gdf: GeoDataFrame, feature_col: str, clobber: bool = False, **kwargs) Path

Process NDVI data for specific area.

NDVI data is clipped to the provided geometries, usually the geometry column of a geopandas GeoDataFrame. kwargs are passed on to oap.compute_raster_stats(). The feature_col is used to define the unique processed file.

The processing keeps track of the latest timestamp of when the raw raster files were modified. If the latest timestamp of the raw data is greater than when it was last processed, the file will automatically be re-processed.

Parameters:
  • gdf (geopandas.GeoDataFrame) – GeoDataFrame with row per area for stats computation. If pd.DataFrame is passed, geometry column must have the name geometry. Passed to oap.compute_raster_stats().

  • feature_col (str) – Column in gdf to use as row/feature identifier. Passed to oap.compute_raster_stats(). The string is also used as a suffix to the processed file path, to uniquely identify analyses done on different files and columns.

  • clobber (bool, default = False) – If True, overwrites existing processed dates. If the new file matches the old file, dates will be reprocessed and appended to the data frame. If the files do not match, the old file will be replaced. If False, stats are only calculated for year-dekads that have not already been calculated within the file. However, if False and the files do not match, a ValueError is raised.

  • **kwargs – Additional keyword arguments passed to oap.compute_raster_stats().

Returns:

The processed path

Return type:

Path

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviSmoothed
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>> bfa_admin1 = codab.load(admin_level=1)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # process for admin1
>>> bfa_ndvi.process(
...     gdf=bfa_admin1,
...     feature_col="ADM1_FR"
... )

Difference from previous year

class ochanticipy.UsgsNdviYearDifference(country_config: CountryConfig, start_date: date | str | Tuple[int, int] | None = None, end_date: date | str | Tuple[int, int] | None = None)[source]

Base class to retrieve NDVI year difference data.

The retrieved data is NDVI yearly difference data, calculated as the subtraction of the previous year’s NDVI value from the current year’s. Negative values indicate the current vegetation is less than the previous year’s, positive that there is more vegetation in the current year.
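The year-on-year difference can be sketched as a lookup of the same dekad one year earlier. The dictionary layout and the `year_difference` helper are hypothetical and serve only to make the sign convention concrete.

```python
from typing import Dict, Tuple

Dekad = Tuple[int, int]  # (year, dekad)


def year_difference(ndvi: Dict[Dekad, float], dekad: Dekad) -> float:
    """Return NDVI for ``dekad`` minus NDVI for the same dekad a year before."""
    year, dk = dekad
    return ndvi[(year, dk)] - ndvi[(year - 1, dk)]


# Made-up NDVI values for a single pixel.
ndvi = {(2019, 1): 0.5, (2020, 1): 0.25}
print(year_difference(ndvi, (2020, 1)))  # -0.25, i.e. less vegetation than in 2019
```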

Parameters:
  • country_config (CountryConfig) – Country configuration

  • start_date (_DATE_TYPE, default = None) – Start date. Can be passed as a datetime.date object or a date string in ISO8601 format, and the relevant dekad will be determined. Or pass directly as a year-dekad tuple, e.g. (2020, 1). If None, start_date is set to the earliest date with data: 2002, dekad 19.

  • end_date (_DATE_TYPE, default = None) – End date. Can be passed as a datetime.date object and the relevant dekad will be determined, as a date string in ISO8601 format, or as a year-dekad tuple, e.g. (2020, 1). If None, end_date is set to date.today().

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviYearDifference
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviYearDifference(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # load in processed data
>>> df = bfa_ndvi.load(feature_col="ADM2_FR")

download(clobber: bool = False) Path

Download raw NDVI data as .tif files.

NDVI data is downloaded from the USGS API, with data for individual regions, years, and dekads stored as separate .tif files. No authentication is required. Data is downloaded for all available dekads from self._start_date to self._end_date.

Parameters:

clobber (bool, default = False) – If True, overwrites existing files

Returns:

The downloaded filepath

Return type:

Path

Examples

>>> from ochanticipy import create_country_config, UsgsNdviSmoothed
>>>
>>> # Create the country configuration for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()

load(feature_col: str) DataFrame

Load the processed USGS NDVI data.

Parameters:

feature_col (str) – String used as a suffix to the processed file path, to uniquely identify analyses done on different files and columns. The same value must be passed to process().

Returns:

The processed NDVI dataset.

Return type:

pd.DataFrame

Raises:

FileNotFoundError – If the requested file cannot be found.

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviSmoothed
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>> bfa_ndvi.load(feature_col="ADM2_FR")

load_raster(load_date: date | str | Tuple[int, int]) DataArray

Load raster for specific year and dekad.

Parameters:

load_date (Union[date, str, Tuple[int, int]]) – Date. Can be passed as a datetime.date object and the relevant dekad will be determined, as a date string in ISO8601 format, or as a year-dekad tuple, e.g. (2020, 1).

Returns:

Data array of NDVI data.

Return type:

xr.DataArray

Raises:

FileNotFoundError – If the requested file cannot be found.

process(gdf: GeoDataFrame, feature_col: str, clobber: bool = False, **kwargs) Path

Process NDVI data for specific area.

NDVI data is clipped to the provided geometries, usually the geometry column of a geopandas GeoDataFrame. kwargs are passed on to oap.compute_raster_stats(). The feature_col is used to define the unique processed file.

The processing keeps track of the latest timestamp of when the raw raster files were modified. If the latest timestamp of the raw data is greater than when it was last processed, the file will automatically be re-processed.

Parameters:
  • gdf (geopandas.GeoDataFrame) – GeoDataFrame with row per area for stats computation. If pd.DataFrame is passed, geometry column must have the name geometry. Passed to oap.compute_raster_stats().

  • feature_col (str) – Column in gdf to use as row/feature identifier. Passed to oap.compute_raster_stats(). The string is also used as a suffix to the processed file path, to uniquely identify analyses done on different files and columns.

  • clobber (bool, default = False) – If True, overwrites existing processed dates. If the new file matches the old file, dates will be reprocessed and appended to the data frame. If the files do not match, the old file will be replaced. If False, stats are only calculated for year-dekads that have not already been calculated within the file. However, if False and the files do not match, a ValueError is raised.

  • **kwargs – Additional keyword arguments passed to oap.compute_raster_stats().

Returns:

The processed path

Return type:

Path

Examples

>>> from ochanticipy import (
...     create_country_config, CodAB, UsgsNdviSmoothed
... )
>>>
>>> # Retrieve admin 2 boundaries for Burkina Faso
>>> country_config = create_country_config(iso3="bfa")
>>> codab = CodAB(country_config=country_config)
>>> bfa_admin2 = codab.load(admin_level=2)
>>> bfa_admin1 = codab.load(admin_level=1)
>>>
>>> # setup NDVI
>>> bfa_ndvi = UsgsNdviSmoothed(
...     country_config=country_config,
...     start_date=(2020, 1),
...     end_date=(2020, 3)
... )
>>> bfa_ndvi.download()
>>> bfa_ndvi.process(
...     gdf=bfa_admin2,
...     feature_col="ADM2_FR"
... )
>>>
>>> # process for admin1
>>> bfa_ndvi.process(
...     gdf=bfa_admin1,
...     feature_col="ADM1_FR"
... )

Utilities

GeoBoundingBox

Functionality to retrieve and modify boundary coordinates.

It is possible to create a GeoBoundingBox object either from lat_max, lat_min, lon_max, lon_min coordinates, or from a shapefile that has been read in with geopandas.

class ochanticipy.GeoBoundingBox(lat_max: float, lat_min: float, lon_max: float, lon_min: float)[source]

Create an object containing the bounds of an area.

The standard geographic coordinate system is used, where latitude runs from -90 to 90 degrees and longitude from -180 to 180. North must always be greater than south, and east greater than west.

Parameters:
  • lat_max (float) – The northern latitude boundary of the area (degrees). The value must be between -90 and 90, and greater than or equal to the southern boundary.

  • lat_min (float) – The southern latitude boundary of the area (degrees). The value must be between -90 and 90, and less than or equal to the northern boundary.

  • lon_max (float) – The easternmost longitude boundary of the area (degrees). The value must be between -180 and 180, and greater than or equal to the western boundary.

  • lon_min (float) – The westernmost longitude boundary of the area (degrees). The value must be between -180 and 180, and less than or equal to the eastern boundary.
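The constraints listed for the four parameters can be summarised in a short validation sketch. `BoundsSketch` is a hypothetical stand-in written for this illustration; it is not the GeoBoundingBox class and does not reproduce its error messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundsSketch:
    """Hypothetical stand-in illustrating the documented constraints."""

    lat_max: float
    lat_min: float
    lon_max: float
    lon_min: float

    def __post_init__(self) -> None:
        # South must not exceed north, and both must be valid latitudes.
        if not -90 <= self.lat_min <= self.lat_max <= 90:
            raise ValueError("Need -90 <= lat_min <= lat_max <= 90")
        # West must not exceed east, and both must be valid longitudes.
        if not -180 <= self.lon_min <= self.lon_max <= 180:
            raise ValueError("Need -180 <= lon_min <= lon_max <= 180")


box = BoundsSketch(lat_max=15.1, lat_min=9.4, lon_max=2.4, lon_min=-5.5)
print(box.lat_max, box.lon_min)  # 15.1 -5.5
```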

classmethod from_shape(shape: GeoSeries | GeoDataFrame) GeoBoundingBox[source]

Create GeoBoundingBox from a geopandas object.

Parameters:

shape (geopandas.GeoSeries, geopandas.GeoDataFrame) – A shape whose bounds will be retrieved

Return type:

GeoBoundingBox from the total bounds of the GeoDataFrame

Examples

>>> import geopandas as gpd
>>> from ochanticipy import GeoBoundingBox
>>> df_admin_boundaries = gpd.read_file("admin0_boundaries.gpkg")
>>> geobb = GeoBoundingBox.from_shape(df_admin_boundaries)

get_filename_repr(precision: int = 0) str[source]

Get succinct boundary representation for usage in filenames.

Parameters:

precision (int, default = 0) – Precision, i.e. number of decimal places to round to. Default is 0 for ints.

Return type:

String containing N, S, E and W coordinates.

property lat_max: float

Get the northern latitude boundary of the area (degrees).

property lat_min: float

Get the southern latitude boundary of the area (degrees).

property lon_max: float

Get the eastern longitude boundary of the area (degrees).

property lon_min: float

Get the western longitude boundary of the area (degrees).

round_coords(offset_val: float = 0.0, round_val: int | float = 1) GeoBoundingBox[source]

Round the bounding box coordinates.

Rounding is always done outside the original bounding box, i.e. the resulting bounding box is always equal to or larger than the original bounding box. Rounding can only be done once per instance.

Parameters:
  • offset_val (float, default = 0.0) – Offset the coordinates by this factor.

  • round_val (int or float, default = 1) – Rounds to the nearest round_val. Can be an int for integer rounding or float for decimal rounding. If 1, round to integers.
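Outward rounding means that maxima are rounded up and minima are rounded down to the nearest multiple of round_val. The sketch below shows that arithmetic only: `round_outward` is a hypothetical function, and offset_val is left out because how the library combines it with rounding is not specified here.

```python
import math
from typing import Tuple


def round_outward(
    lat_max: float,
    lat_min: float,
    lon_max: float,
    lon_min: float,
    round_val: float = 1,
) -> Tuple[float, float, float, float]:
    """Round bounds away from the box interior to multiples of round_val."""

    def up(value: float) -> float:
        return math.ceil(value / round_val) * round_val

    def down(value: float) -> float:
        return math.floor(value / round_val) * round_val

    return up(lat_max), down(lat_min), up(lon_max), down(lon_min)


print(round_outward(15.1, 9.4, 2.4, -5.5))  # (16, 9, 3, -6)
print(round_outward(15.1, 9.4, 2.4, -5.5, round_val=0.5))  # (15.5, 9.0, 2.5, -5.5)
```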

Return type:

GeoBoundingBox instance with rounded and offset coordinates

Raster module

Utilities to manipulate and analyze raster data.

The raster module provides accessor utilities for xarray data arrays and datasets, accessible through the oap accessor. These functions become available as soon as the library is imported with import ochanticipy.

Since rioxarray already extends xarray, this module’s extensions inherit from the RasterArray and RasterDataset extensions respectively. This ensures cleaner code in the module as rio methods are available immediately, but also means a couple of design decisions are followed.

The xarray.DataArray and xarray.Dataset extensions here inherit from rioxarray base classes. Thus, methods that are identical for both objects are defined in a mixin class OapRasterMixin which can be inherited by the two respective extensions.

Data arrays

class ochanticipy.utils.raster.OapRasterArray(xarray_object)[source]

OCHA AnticiPy extension for xarray.DataArray.

change_longitude_range(to_180_range: bool = True, inplace: bool = False) DataArray | Dataset | None

Convert longitude range between -180 to 180 and 0 to 360.

The standard longitude range is from -180 to 180, while some applications use 0 to 360. This includes rasterstats.zonal_stats (https://pypi.org/project/rasterstats/), which assumes ranges from 0 to 360.

change_longitude_range() converts between the two coordinate ranges based on the current state of the data. By default it converts to the -180 to 180 range; if to_180_range is False, it converts to 0 to 360. If the coordinates lie solely between 0 and 180, no conversion is needed and the input is returned.
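The underlying arithmetic is a modulo shift. The sketch below applies it to plain lists of longitudes and sorts the result; the two function names are hypothetical, and the accessor itself also has to reorder the data along the longitude dimension, which is not shown.

```python
from typing import List, Sequence


def to_180_range(lons: Sequence[float]) -> List[float]:
    """Map longitudes into [-180, 180) and sort them."""
    return sorted(((lon + 180) % 360) - 180 for lon in lons)


def to_360_range(lons: Sequence[float]) -> List[float]:
    """Map longitudes into [0, 360) and sort them."""
    return sorted(lon % 360 for lon in lons)


print(to_180_range([5, 120, 199, 360]))  # [-161, 0, 5, 120]
print(to_360_range([-161, 0, 5, 120]))  # [0, 5, 120, 199]
```

Note that 360 and 0 denote the same meridian, so the round trip returns 0 rather than 360.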

Parameters:
  • to_180_range (bool, default = True) – If True, the returned range is -180 to 180. Otherwise, the returned range is 0 to 360.

  • inplace (bool, optional) – If True, will overwrite existing data array. Default is False.

Returns:

Dataset with transformed longitude coordinates.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> import pandas
>>> temp = 15 + 8 * numpy.random.randn(4, 4, 3)
>>> precip = 10 * numpy.random.rand(4, 4, 3)
>>> ds = xarray.Dataset(
...   {
...     "temperature": (["lat", "lon", "time"], temp),
...     "precipitation": (["lat", "lon", "time"], precip)
...   },
...   coords={
...     "lat":numpy.array([87, 88, 89, 90]),
...     "lon":numpy.array([5, 120, 199, 360]),
...     "time": pandas.date_range("2014-09-06", periods=3)
...   }
... )
>>> ds_inv = ds.oap.change_longitude_range()
>>> ds_inv.get_index("lon")
Index([-161, 0, 5, 120], dtype='int64', name='lon')
>>> # invert coordinates back to original, in place
>>> ds_inv.oap.change_longitude_range(to_180_range=False, inplace=True)
>>> ds_inv.get_index("lon")
Index([0, 5, 120, 199], dtype='int64', name='lon')

compute_raster_stats(gdf: GeoDataFrame, feature_col: str, stats_list: List[str] | None = None, percentile_list: List[int] | None = None, all_touched: bool = False) DataFrame[source]

Compute raster statistics for polygon geometry.

compute_raster_stats() is designed to quickly compute raster statistics across a polygon and its features.

Parameters:
  • gdf (geopandas.GeoDataFrame) – GeoDataFrame with row per area for stats computation. If pd.DataFrame is passed, geometry column must have the name geometry.

  • feature_col (str) – Column in gdf to use as row/feature identifier.

  • stats_list (Optional[List[str]], optional) – List of statistics to calculate, by default None. Passed to get_attr().

  • percentile_list (Optional[List[int]], optional) – List of percentiles to compute, by default None.

  • all_touched (bool, optional) – If True all cells touching the region will be included, by default False. If False, only cells with their centre in the region will be included.
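To make the statistics concrete, here is a dependency-free sketch that works on the same toy grid and rectangular areas as this method's Examples, keeping a cell only when its centre falls inside the area (the all_touched=False behaviour). The `zonal_stats` helper and the use of the population standard deviation are assumptions chosen to reproduce the documented numbers, not the library's code.

```python
from statistics import mean, pstdev

# 2 x 3 raster: cell centre (x, y) -> value, for a grid with
# x = [0.5, 1.5, 2.5] and y = [1.5, 0.5].
cells = {
    (0.5, 1.5): 1, (1.5, 1.5): 2, (2.5, 1.5): 3,
    (0.5, 0.5): 4, (1.5, 0.5): 5, (2.5, 0.5): 6,
}

# Axis-aligned areas as (x_min, y_min, x_max, y_max).
areas = {"area_a": (0, 0, 2, 2), "area_b": (2, 0, 3, 2)}


def zonal_stats(box):
    """Summarise the cells whose centre lies inside ``box``."""
    x0, y0, x1, y1 = box
    vals = [
        v for (x, y), v in cells.items()
        if x0 <= x <= x1 and y0 <= y <= y1
    ]
    return {
        "mean": mean(vals), "std": pstdev(vals), "min": min(vals),
        "max": max(vals), "sum": sum(vals), "count": len(vals),
    }


stats = {name: zonal_stats(box) for name, box in areas.items()}
print(stats["area_a"]["sum"], stats["area_a"]["count"])  # 12 4
print(stats["area_b"]["mean"], stats["area_b"]["std"])  # 4.5 1.5
```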

Returns:

Dataframe with computed statistics.

Return type:

pandas.DataFrame

Examples

>>> import geopandas as gpd
>>> import xarray as xr
>>> import rioxarray
>>> from shapely.geometry import Polygon
>>>
>>> # compute raster stats on simple data
>>> d = {
...     "name": ["area_a", "area_b"],
...     "geometry": [
...         Polygon([(0, 0), (0, 2), (2, 2), (2, 0)]),
...         Polygon([(2, 0), (2, 2), (3, 2), (3, 0)]),
...     ],
... }
>>> gdf = gpd.GeoDataFrame(d)
>>>
>>> da = xr.DataArray(
...     [[1, 2, 3], [4, 5, 6]],
...     dims=("y", "x"),
...     coords={"y": [1.5, 0.5], "x": [0.5, 1.5, 2.5]},
... ).rio.write_crs("EPSG:4326")
>>>
>>> da.oap.compute_raster_stats(
...     gdf=gdf,
...     feature_col="name"
... )
   mean_name            std_name min_name max_name sum_name count_name    name
0       3.0  1.5811388300841898        1        5     12.0          4  area_a
1       4.5                 1.5        3        6      9.0          2  area_b

correct_calendar(inplace: bool = False) DataArray | Dataset | None

Correct calendar attribute for recognition by xarray.

Some datasets come with a wrong calendar attribute that isn’t recognized by xarray. This function corrects the coordinate attribute to ensure that a calendar attribute exists and specifies a calendar alias that is supportable by xarray.cftime_range and NetCDF in general.

Currently ensures that calendar attributes that are either specified with units="months since" or calendar="360" explicitly have calendar="360_day". This is based on discussions in this GitHub issue. If and when further issues are found with calendar attributes, support for conversion will be added here.
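The rule in the paragraph above can be sketched on a plain attribute dictionary. `correct_calendar_attrs` is a hypothetical helper that mirrors the stated condition; the accessor applies the change to the time coordinate of the xarray object instead.

```python
from typing import Dict


def correct_calendar_attrs(attrs: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of time-coordinate attrs with a usable calendar."""
    fixed = dict(attrs)
    months_units = "months since" in fixed.get("units", "")
    if months_units or fixed.get("calendar") == "360":
        fixed["calendar"] = "360_day"
    return fixed


print(correct_calendar_attrs({"units": "months since 1960-01-01"}))
# {'units': 'months since 1960-01-01', 'calendar': '360_day'}
```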

Parameters:

inplace (bool, optional) – If True, modifies the data array in place. Otherwise, returns a modified copy.

Returns:

Data array or dataset with transformed calendar coordinate.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> da = xarray.DataArray(
...  numpy.arange(64).reshape(4,4,4),
...  coords={"lat":numpy.array([87, 88, 89, 90]),
...          "lon":numpy.array([5, 120, 199, 360]),
...          "t":numpy.array([10,11,12,13])}
... )
>>> da["t"].attrs["units"] = "months since 1960-01-01"
>>> da_crct = da.oap.correct_calendar()
>>> da_crct["t"].attrs["calendar"]
'360_day'

invert_coordinates(inplace: bool = False) DataArray | Dataset | None

Invert latitude and longitude in data array.

This function checks for inversion of latitude and longitude and inverts them if needed. Datasets with inverted coordinates can produce incorrect results in certain functions like rasterstats.zonal_stats(). Correctly ordered coordinates should be:

  • latitude: Largest to smallest.

  • longitude: Smallest to largest.

If data array already has correct coordinate ordering, it is directly returned. Function largely copied from https://github.com/perrygeo/python-rasterstats/issues/218.
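The ordering rule can be checked on plain coordinate lists, as in the sketch below. Both helpers are hypothetical; a real inversion also reorders the data values along each axis, which is omitted here.

```python
from typing import List, Sequence, Tuple


def needs_inversion(lat: Sequence[float], lon: Sequence[float]) -> bool:
    """True unless latitude is descending and longitude is ascending."""
    lat_ok = list(lat) == sorted(lat, reverse=True)
    lon_ok = list(lon) == sorted(lon)
    return not (lat_ok and lon_ok)


def order_coords(
    lat: Sequence[float], lon: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return latitude largest to smallest, longitude smallest to largest."""
    return sorted(lat, reverse=True), sorted(lon)


lat, lon = [87, 88, 89, 90], [70, 69, 68, 67]
print(needs_inversion(lat, lon))  # True
print(order_coords(lat, lon))  # ([90, 89, 88, 87], [67, 68, 69, 70])
```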

Parameters:

inplace (bool, optional) – If True, will overwrite existing data array. Default is False.

Returns:

Data array or dataset with correct coordinate ordering.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> da = xarray.DataArray(
...  numpy.arange(16).reshape(4,4),
...  coords={"lat":numpy.array([87, 88, 89, 90]),
...          "lon":numpy.array([70, 69, 68, 67])}
... )
>>> da.oap.invert_coordinates(inplace=True)
>>> da.get_index("lon")
Index([67, 68, 69, 70], dtype='int64', name='lon')
>>> da.get_index("lat")
Index([90, 89, 88, 87], dtype='int64', name='lat')

property longitude_range

The longitude range.

The longitude range indicates if coordinates are between -180 and 180 (indicated by ‘180’) or 0 and 360 (indicated by ‘360’).

Type:

str

set_time_dim(t_dim: str, inplace: bool = False) DataArray | Dataset | None

Set the time dimension of the dataset.

Parameters:
  • t_dim (str) – The name of the time dimension.

  • inplace (bool, optional) – If True, modifies the data array in place. Otherwise, returns a modified copy.

Returns:

Data array or dataset with time dimension.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> da = xarray.DataArray(
...  numpy.arange(64).reshape(4,4,4),
...  coords={"lat":numpy.array([87, 88, 89, 90]),
...          "lon":numpy.array([5, 120, 199, 360]),
...          "F":numpy.array([10,11,12,13])}
... )
>>> da.oap.set_time_dim(t_dim="F", inplace=True)
>>> da.oap.t_dim
'F'

property t_dim

The dimension for time.

Type:

str

property x_dim: Hashable

The dimension for the X-axis.

Type:

Hashable

property y_dim: Hashable

The dimension for the Y-axis.

Type:

Hashable

Datasets

class ochanticipy.utils.raster.OapRasterDataset(xarray_object)[source]

OCHA AnticiPy extension for xarray.Dataset.

change_longitude_range(to_180_range: bool = True, inplace: bool = False) DataArray | Dataset | None

Convert longitude range between -180 to 180 and 0 to 360.

The standard longitude range is from -180 to 180, while some applications use 0 to 360. This includes rasterstats.zonal_stats (https://pypi.org/project/rasterstats/), which assumes ranges from 0 to 360.

change_longitude_range() converts between the two coordinate ranges based on the current state of the data. By default it converts to the -180 to 180 range; if to_180_range is False, it converts to 0 to 360. If the coordinates lie solely between 0 and 180, no conversion is needed and the input is returned.

Parameters:
  • to_180_range (bool, default = True) – If True, the returned range is -180 to 180. Otherwise, the returned range is 0 to 360.

  • inplace (bool, optional) – If True, will overwrite existing data array. Default is False.

Returns:

Dataset with transformed longitude coordinates.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> import pandas
>>> temp = 15 + 8 * numpy.random.randn(4, 4, 3)
>>> precip = 10 * numpy.random.rand(4, 4, 3)
>>> ds = xarray.Dataset(
...   {
...     "temperature": (["lat", "lon", "time"], temp),
...     "precipitation": (["lat", "lon", "time"], precip)
...   },
...   coords={
...     "lat":numpy.array([87, 88, 89, 90]),
...     "lon":numpy.array([5, 120, 199, 360]),
...     "time": pandas.date_range("2014-09-06", periods=3)
...   }
... )
>>> ds_inv = ds.oap.change_longitude_range()
>>> ds_inv.get_index("lon")
Index([-161, 0, 5, 120], dtype='int64', name='lon')
>>> # invert coordinates back to original, in place
>>> ds_inv.oap.change_longitude_range(to_180_range=False, inplace=True)
>>> ds_inv.get_index("lon")
Index([0, 5, 120, 199], dtype='int64', name='lon')
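
The wrap-around arithmetic behind the two ranges can be sketched in plain Python. This is a minimal illustration that reproduces the index values shown above; it is not necessarily how change_longitude_range() is implemented, and the helper names to_180 and to_360 are hypothetical.

```python
def to_180(lon):
    """Map a longitude in degrees onto the half-open range [-180, 180)."""
    return ((lon + 180) % 360) - 180


def to_360(lon):
    """Map a longitude in degrees onto the half-open range [0, 360)."""
    return lon % 360


lons = [5, 120, 199, 360]

# Wrap to -180..180 and sort, as the coordinate index is sorted in the example.
wrapped = sorted(to_180(lon) for lon in lons)
print(wrapped)   # [-161, 0, 5, 120]

# Wrap back to 0..360; 360 and 0 are the same meridian, so 360 returns as 0.
restored = sorted(to_360(lon) for lon in wrapped)
print(restored)  # [0, 5, 120, 199]
```

Because 0 and 360 describe the same meridian, a round trip does not restore a coordinate of exactly 360, which matches the example output.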
compute_raster_stats(var_names: List[str] | str | None = None, **kwargs: Any)[source]

Compute raster statistics across dataset arrays.

compute_raster_stats() calculates raster statistics on component data arrays of a dataset. By default, calculates on all non-coordinate variables, unless a list of variable names is passed in, which then have statistics calculated for them.

Parameters:
  • var_names (Union[List[str], str, None], optional) – Dataset data array variables to calculate raster statistics on.

  • kwargs (Any) – Keyword arguments passed to the array method compute_raster_stats()

Returns:

List of raster statistics data frames.

Return type:

List[pandas.DataFrame]

Examples

>>> import geopandas as gpd
>>> import xarray as xr
>>> import rioxarray
>>> from shapely.geometry import Polygon
>>>
>>> # compute raster stats on simple data
>>> d = {
...     "name": ["area_a", "area_b"],
...     "geometry": [
...         Polygon([(0, 0), (0, 2), (2, 2), (2, 0)]),
...         Polygon([(2, 0), (2, 2), (3, 2), (3, 0)]),
...     ],
... }
>>> gdf = gpd.GeoDataFrame(d)
>>>
>>> ds = xr.DataArray(
...     [[1, 2, 3], [4, 5, 6]],
...     dims=("y", "x"),
...     coords={"y": [1.5, 0.5], "x": [0.5, 1.5, 2.5]},
... ).rio.write_crs("EPSG:4326").to_dataset(name="data")
>>>
>>> ds.oap.compute_raster_stats(
...    var_names=["data"],
...    gdf=gdf,
...    feature_col="name"
... ) 
[       mean                 std      min      max      sum      count    name
0       3.0  1.5811388300841898        1        5     12.0          4  area_a
1       4.5                 1.5        3        6      9.0          2  area_b]
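
The figures in the output above can be checked by hand. The sketch below uses only the standard library and does not call ochanticipy. It assumes that a cell contributes to a polygon when its centre lies inside it, and that std is the population standard deviation; both assumptions are consistent with the numbers shown but are not confirmed here as the library's behaviour. The names cells, areas and stats_for are made up for the illustration.

```python
from statistics import fmean, pstdev

# Cell centres (x, y) and values of the 2 x 3 raster used in the example.
cells = {
    (0.5, 1.5): 1, (1.5, 1.5): 2, (2.5, 1.5): 3,
    (0.5, 0.5): 4, (1.5, 0.5): 5, (2.5, 0.5): 6,
}

# Both polygons are axis-aligned rectangles: (xmin, ymin, xmax, ymax).
areas = {"area_a": (0, 0, 2, 2), "area_b": (2, 0, 3, 2)}


def stats_for(bounds):
    """Summarise the cells whose centre lies strictly inside the rectangle."""
    xmin, ymin, xmax, ymax = bounds
    values = [
        value
        for (x, y), value in cells.items()
        if xmin < x < xmax and ymin < y < ymax
    ]
    return {
        "mean": fmean(values),
        "std": pstdev(values),  # population standard deviation (assumption)
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": len(values),
    }


stats = {name: stats_for(bounds) for name, bounds in areas.items()}
for name, row in stats.items():
    print(name, row)
```

Here area_a covers the four cells with values 1, 2, 4 and 5, and area_b covers the two cells with values 3 and 6, which gives the means of 3.0 and 4.5 reported above.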
correct_calendar(inplace: bool = False) DataArray | Dataset | None

Correct calendar attribute for recognition by xarray.

Some datasets come with a calendar attribute that xarray does not recognize. This function corrects the coordinate attribute so that a calendar attribute exists and specifies a calendar alias that is supported by xarray.cftime_range and NetCDF in general.

Currently, it ensures that time coordinates specified with either units="months since" or calendar="360" explicitly have calendar="360_day". This is based on discussions in this GitHub issue. If and when further issues are found with calendar attributes, support for conversion will be added here.

Parameters:

inplace (bool, optional) – If True, it will modify the data array or dataset in place. Otherwise it will return a modified copy.

Returns:

Data array or dataset with transformed calendar coordinate.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> da = xarray.DataArray(
...  numpy.arange(64).reshape(4,4,4),
...  coords={"lat":numpy.array([87, 88, 89, 90]),
...          "lon":numpy.array([5, 120, 199, 360]),
...          "t":numpy.array([10,11,12,13])}
... )
>>> da["t"].attrs["units"] = "months since 1960-01-01"
>>> da_crct = da.oap.correct_calendar()
>>> da_crct["t"].attrs["calendar"]
'360_day'
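
The rule described above can be sketched on a plain dictionary of coordinate attributes. This is a hedged illustration only: the function name corrected_time_attrs is hypothetical, and matching the units with startswith("months since") is an assumption about how the check might be written, not the library's confirmed logic.

```python
def corrected_time_attrs(attrs):
    """Return a copy of time-coordinate attributes with calendar="360_day"
    set when the units or calendar match the cases described above."""
    fixed = dict(attrs)  # work on a copy, like inplace=False
    units = str(fixed.get("units", ""))
    calendar = str(fixed.get("calendar", ""))
    if units.startswith("months since") or calendar == "360":
        fixed["calendar"] = "360_day"
    return fixed


original = {"units": "months since 1960-01-01"}
print(corrected_time_attrs(original))
# {'units': 'months since 1960-01-01', 'calendar': '360_day'}
print(original)
# {'units': 'months since 1960-01-01'}
```

Attributes that match neither case, for example units of "days since 2000-01-01" with calendar="standard", pass through unchanged in this sketch.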
get_raster_array(var_name: str) DataArray[source]

Get xarray.DataArray from variable and keep dimensions.

Accessing a component xarray.DataArray by its non-coordinate variable name loses any dimensions set through rio or oap. This includes x_dim and y_dim, which have to be set explicitly using rio.set_spatial_dims(), and t_dim, which has to be set using oap.set_time_dim(). For any dataset ds, ds.oap.get_raster_array("var") retrieves the data array without losing the dimensions, whereas ds["var"] loses them.

Parameters:

var_name (str) – Name of variable.

Returns:

A data array.

Return type:

xarray.DataArray

Examples

>>> import xarray
>>> import numpy
>>> import pandas
>>> temp = 15 + 8 * numpy.random.randn(4, 4, 3)
>>> precip = 10 * numpy.random.rand(4, 4, 3)
>>> ds = xarray.Dataset(
...   {
...     "temperature": (["lat", "lon", "F"], temp),
...     "precipitation": (["lat", "lon", "F"], precip)
...   },
...   coords={
...     "lat":numpy.array([87, 88, 89, 90]),
...     "lon":numpy.array([5, 120, 199, 360]),
...     "F": pandas.date_range("2014-09-06", periods=3)
...   }
... )
>>> ds.oap.set_time_dim("F", inplace=True)
>>> da = ds.oap.get_raster_array("temperature")
>>> da.oap.t_dim
'F'
>>> # directly accessing array loses set dimensions
>>> ds['temperature'].oap.t_dim 
Traceback (most recent call last):
    ...
rioxarray.exceptions.DimensionError: Time dimension not found.
    'oap.set_time_dim()' or using 'rename()' to change the
    dimension name to 't' can address this.
Data variable: temperature
invert_coordinates(inplace: bool = False) DataArray | Dataset | None

Invert latitude and longitude in data array.

This function checks for inversion of latitude and longitude and inverts them if needed. Datasets with inverted coordinates can produce incorrect results in certain functions like rasterstats.zonal_stats(). Correctly ordered coordinates should be:

  • latitude: Largest to smallest.

  • longitude: Smallest to largest.

If the data array already has the correct coordinate ordering, it is returned unchanged. Function largely copied from https://github.com/perrygeo/python-rasterstats/issues/218.

Parameters:

inplace (bool, optional) – If True, will overwrite existing data array. Default is False.

Returns:

Data array or dataset with correct coordinate ordering.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> da = xarray.DataArray(
...  numpy.arange(16).reshape(4,4),
...  coords={"lat":numpy.array([87, 88, 89, 90]),
...          "lon":numpy.array([70, 69, 68, 67])}
... )
>>> da.oap.invert_coordinates(inplace=True)
>>> da.get_index("lon")
Index([67, 68, 69, 70], dtype='int64', name='lon')
>>> da.get_index("lat")
Index([90, 89, 88, 87], dtype='int64', name='lat')
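
The ordering check can be sketched with plain lists. This is a minimal illustration that assumes monotonic coordinates and a grid indexed as grid[lat_index][lon_index]; the function name invert_if_needed is hypothetical and the code is not taken from the library.

```python
def invert_if_needed(lat, lon, grid):
    """Return (lat, lon, grid) with latitude descending and longitude ascending.

    Assumes lat and lon are monotonic and grid is a list of rows,
    one row per latitude and one column per longitude.
    """
    if lat[0] < lat[-1]:  # latitude runs smallest to largest: flip rows
        lat = lat[::-1]
        grid = grid[::-1]
    if lon[0] > lon[-1]:  # longitude runs largest to smallest: flip columns
        lon = lon[::-1]
        grid = [row[::-1] for row in grid]
    return lat, lon, grid


lat = [87, 88, 89, 90]
lon = [70, 69, 68, 67]
grid = [[r * 4 + c for c in range(4)] for r in range(4)]  # values 0..15

new_lat, new_lon, new_grid = invert_if_needed(lat, lon, grid)
print(new_lat)      # [90, 89, 88, 87]
print(new_lon)      # [67, 68, 69, 70]
print(new_grid[0])  # [15, 14, 13, 12]
```

The data values move together with their coordinates, so the value that sat at latitude 90 and longitude 67 (15 in this grid) ends up in the first row and first column.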
property longitude_range

The longitude range.

The longitude range indicates if coordinates are between -180 and 180 (indicated by ‘180’) or 0 and 360 (indicated by ‘360’).

Type:

str

set_time_dim(t_dim: str, inplace: bool = False) DataArray | Dataset | None

Set the time dimension of the dataset.

Parameters:
  • t_dim (str) – The name of the time dimension.

  • inplace (bool, optional) – If True, it will modify the data array or dataset in place. Otherwise it will return a modified copy.

Returns:

Data array or dataset with time dimension.

Return type:

Union[xarray.DataArray, xarray.Dataset]

Examples

>>> import xarray
>>> import numpy
>>> da = xarray.DataArray(
...  numpy.arange(64).reshape(4,4,4),
...  coords={"lat":numpy.array([87, 88, 89, 90]),
...          "lon":numpy.array([5, 120, 199, 360]),
...          "F":numpy.array([10,11,12,13])}
... )
>>> da.oap.set_time_dim(t_dim="F", inplace=True)
>>> da.oap.t_dim
'F'
property t_dim

The dimension for time.

Type:

str

property x_dim: Hashable

The dimension for the X-axis.

Type:

Hashable

property y_dim: Hashable

The dimension for the Y-axis.

Type:

Hashable