Repository of all forecasts submitted to the Ecological Forecasting Initiative

Partitioned parquet database of forecasts submitted to the Ecological Forecasting Initiative. See dashboard at https://projects.ecoforecast.org/neon4cast-dashboard/

Product Details
Visibility: Public
Created: 20 Aug 2023
Last Updated: 3 Apr 2025

Forecasts submitted to the Ecological Forecasting Initiative's NEON Challenge

In development, experimental

At this time, canonical sources are still hosted on data.ecoforecast.org.

Resources

Quickstart

Arrow provides an easy way to access remote parquet files from most languages widely used in data science. Here we access all forecasts submitted to a particular theme. Users who need only a single model should specify it in the path for faster access; the STAC catalog can be used to explore the available models.

The examples below show 'cloud-native' connections to the data: 'lazy' connections that do not download the entire asset but allow us to filter, subset, and operate directly on the remote data product.

R Access

library(arrow)

# Anonymous S3 access to the Source Cooperative bucket
base = "s3://anonymous@us-west-2.opendata.source.coop"
repo = "eco4cast/neon4cast-forecasts"
theme = "aquatics"
uri = glue::glue("{base}/{repo}/parquet/{theme}?region=us-west-2")

# Lazy connection to every forecast submitted to the theme
open_dataset(uri)

Python Access

import pyarrow.dataset as ds

# Anonymous S3 access to the Source Cooperative bucket
base = "s3://anonymous@us-west-2.opendata.source.coop"
repo = "eco4cast/neon4cast-forecasts"
theme = "aquatics"
uri = f"{base}/{repo}/parquet/{theme}?region=us-west-2"

# Lazy connection to every forecast submitted to the theme
ds.dataset(uri, format="parquet")
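
Because the connection is lazy, it can be inspected before any substantial download takes place. The following is a minimal sketch rather than a reference implementation: `forecast_uri` and `preview` are hypothetical helper names introduced here, and the sketch assumes that `pyarrow` is installed with S3 support and that the bucket accepts anonymous access as in the example above. It looks only at the dataset schema and the first few rows, so no column names have to be assumed.

```python
BASE = "s3://anonymous@us-west-2.opendata.source.coop"
REPO = "eco4cast/neon4cast-forecasts"


def forecast_uri(theme, base=BASE, repo=REPO, region="us-west-2"):
    """Build the S3 URI for one theme (hypothetical helper, same layout as above)."""
    return f"{base}/{repo}/parquet/{theme}?region={region}"


def preview(theme, n=5):
    """Open a theme lazily and return its schema and first n rows (hypothetical helper)."""
    # Imported here so that forecast_uri can be used without pyarrow installed.
    import pyarrow.dataset as ds

    dataset = ds.dataset(forecast_uri(theme), format="parquet")
    # head(n) collects n rows as a pyarrow.Table instead of materializing the whole dataset.
    return dataset.schema, dataset.head(n)
```

Calling `schema, rows = preview("aquatics")` requires network access. Printing `schema` first is a reasonable way to find out which columns are available before writing a filter.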

duckdb

At this time, duckdb access is substantially faster than arrow.

R + duckdb

R users can get a dplyr-compatible lazy remote tibble as follows:

# remotes::install_github("cboettig/duckdbfs")
library(duckdbfs)

# Anonymous S3 access to the Source Cooperative bucket
base = "s3://anonymous@us-west-2.opendata.source.coop"
repo = "eco4cast/neon4cast-forecasts"
theme = "aquatics"
uri = glue::glue("{base}/{repo}/parquet/{theme}?region=us-west-2")

# Lazy remote table that works with dplyr verbs
df = open_dataset(uri)

Python + duckdb

ibis provides a more Pythonic interface to SQL:

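import ibis
con = ibis.duckdb.connect()

base = "s3://us-west-2.opendata.source.coop"
repo = "eco4cast/neon4cast-forecasts"
theme = "aquatics"
# The trailing /** matches every parquet file under the theme
uri = f"{base}/{repo}/parquet/{theme}/**"

# Load DuckDB's httpfs extension for S3 access and set the bucket region
con.raw_sql(f"""
INSTALL httpfs;
LOAD httpfs;
SET s3_region='us-west-2';
""")

# Lazy ibis table backed by the remote parquet files
db = con.read_parquet(uri)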
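
The table returned by `read_parquet` is itself a lazy ibis expression, so nothing is materialized until a result is requested. The sketch below illustrates one way to preview it under stated assumptions: `theme_glob` and `preview_theme` are hypothetical helper names, and it assumes that `ibis` with the DuckDB backend and `pandas` are installed. Only `schema()`, `limit()`, and `execute()` are used, so no column names are assumed.

```python
BASE = "s3://us-west-2.opendata.source.coop"
REPO = "eco4cast/neon4cast-forecasts"


def theme_glob(theme, base=BASE, repo=REPO):
    """Build the glob URI for every parquet file of one theme (hypothetical helper)."""
    return f"{base}/{repo}/parquet/{theme}/**"


def preview_theme(theme, n=5, region="us-west-2"):
    """Return the schema and the first n rows of a theme (hypothetical helper)."""
    # Imported here so that theme_glob can be used without ibis installed.
    import ibis

    con = ibis.duckdb.connect()
    # Same setup as above, issued one statement at a time.
    con.raw_sql("INSTALL httpfs;")
    con.raw_sql("LOAD httpfs;")
    con.raw_sql(f"SET s3_region='{region}';")

    table = con.read_parquet(theme_glob(theme))
    # limit(n) stays lazy; execute() runs the query and returns a pandas DataFrame.
    return table.schema(), table.limit(n).execute()
```

Calling `schema, rows = preview_theme("aquatics")` requires network access; further filters can be chained onto the table before `execute()` once the column names are known from `schema`.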