Introduction
OceanDB is a Python package for managing oceanic satellite data. It interfaces with a PostgreSQL database to enable efficient geospatial and temporal queries. OceanDB also comes with a simple CLI for initializing the database and ingesting data.
Installation Instructions
Create a Copernicus Marine account (if needed)
If you don’t already have a Copernicus Marine account, create one.
Clone the OceanDB repository
Configure the .env file
Open the OceanDB directory and copy the example file .env.example to .env. Then edit the new .env file to set the PostgreSQL connection details, the directories where data will be downloaded, and your Copernicus Marine credentials.
POSTGRES_HOST=postgres
POSTGRES_USERNAME=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_PORT=5432
POSTGRES_DATABASE=ocean
ALONG_TRACK_DATA_DIRECTORY=/app/data/copernicus
EDDY_DATA_DIRECTORY=/app/data/eddies
COPERNICUS_PASSWORD=copernicus_marine_service_password_placeholder
COPERNICUS_USERNAME=copernicus_marine_service_username
Set up the Python environment
The details depend on how you use Python, e.g. from the command line or from an IDE such as PyCharm. These instructions are specific to PyCharm.
Install OceanDB
With your Python environment activated:
pip install OceanDB
pip install -e OceanDB  # editable install for development
OceanDB Initialization Instructions
The OceanDB package provides a CLI for initializing the database and ingesting data.
Initializing the Database
oceandb init  # Creates the database, tables, partitions, and reference data
oceandb create-indices  # Builds query indices; run after bulk ingest
Downloading Along-Track Data
Use oceandb download to fetch Copernicus Marine Service along-track data.
The command downloads files into ALONG_TRACK_DATA_DIRECTORY, which must be
set in .env.
ALONG_TRACK_DATA_DIRECTORY=/path/to/copernicus
COPERNICUS_USERNAME=copernicus_marine_service_username
COPERNICUS_PASSWORD=copernicus_marine_service_password
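Before running a download, it can help to confirm that these keys are actually present in .env. The following is a minimal sketch and not part of OceanDB: `parse_env` and `missing_keys` are hypothetical helpers, and the parser assumes simple KEY=VALUE lines with no quoting or variable expansion.

```python
# Keys that the download command is documented to read from .env.
REQUIRED_KEYS = (
    "ALONG_TRACK_DATA_DIRECTORY",
    "COPERNICUS_USERNAME",
    "COPERNICUS_PASSWORD",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


# Example with an in-memory .env body; the values are placeholders.
example = "ALONG_TRACK_DATA_DIRECTORY=/path/to/copernicus\nCOPERNICUS_USERNAME=someone\n"
print(missing_keys(parse_env(example)))
```

To check a real file, pass the contents of .env (for example `pathlib.Path(".env").read_text()`) to `parse_env` instead of the in-memory string.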
The downloader first runs a Copernicus dry-run preview. For normal downloads, it prints the matching file count and total size, then asks whether to continue before downloading anything.
Preview a download without fetching files:
oceandb download --dry-run j3 --start-date 2024-01-01 --end-date 2024-02-01
Download one mission. The command previews the matching files, then prompts for confirmation:
oceandb download j3 --start-date 2024-01-01 --end-date 2024-02-01
Download multiple missions:
oceandb download s3a s3b --start-date 2024-01-01 --end-date 2024-02-01
Download all supported along-track missions:
oceandb download all --dataset-version 202411
Skip the confirmation prompt for scripted runs:
oceandb download j3 --start-date 2024-01-01 --end-date 2024-02-01 --yes
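For scripted runs, the same commands can be assembled from Python. This is a sketch under assumptions: `build_download_command` and `run_download` are hypothetical helpers, they use only the flags shown above, and they assume the CLI accepts options in any order after the mission names.

```python
import subprocess


def build_download_command(missions, start_date=None, end_date=None,
                           dataset_version=None, dry_run=False, yes=False):
    """Assemble the argument list for `oceandb download`."""
    command = ["oceandb", "download"]
    if dry_run:
        command.append("--dry-run")  # preview only, no files fetched
    command.extend(missions)  # e.g. ["j3"], ["s3a", "s3b"], or ["all"]
    if start_date:
        command += ["--start-date", start_date]
    if end_date:
        command += ["--end-date", end_date]
    if dataset_version:
        command += ["--dataset-version", dataset_version]
    if yes:
        command.append("--yes")  # skip the confirmation prompt
    return command


def run_download(**kwargs):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_download_command(**kwargs), check=True)


# Print the command without executing it.
print(" ".join(build_download_command(
    ["j3"], start_date="2024-01-01", end_date="2024-02-01", yes=True)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths or dates.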
After data has been downloaded, ingest it into OceanDB.
Quickly visualize the packaged basin polygons and their basin IDs:
oceandb visualize-basins --output artifacts/basin_map.html
Ingesting Along-Track Data
oceandb ingest-along-track reads from the same
ALONG_TRACK_DATA_DIRECTORY used by oceandb download.
By default, when no arguments are provided, this command iterates over all of the data in that directory. Missions and date bounds can be passed to restrict the ingest:
oceandb ingest-along-track
oceandb ingest-along-track s3a
oceandb ingest-along-track s3a j3 c2
oceandb ingest-along-track j3 --start-date 2019-01-01 --end-date 2020-12-03
oceandb ingest-along-track s6a --end-date 2024-01-01
oceandb ingest-along-track s6a --start-date 2024-01-01
oceandb summary alongtrack
Ingesting Eddy Data
oceandb ingest-eddy
Querying SLA Data
To query the sea level anomaly for a given satellite mission, time range, and radius around a given point:
from datetime import datetime

import numpy as np

# `along_track` is assumed to be an OceanDB along-track query object created earlier.
latitude = -69
longitude = 28
date = datetime(year=2013, month=3, day=14, hour=5)
data = along_track.geographic_nearest_neighbors_dt(
    latitudes=np.array([latitude]),
    longitudes=np.array([longitude]),
    dates=[date],
    missions=['al'],
)
for d in data:
    print(d)
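To illustrate what a geographic nearest-neighbor lookup computes, here is a small standalone sketch using the haversine great-circle distance. It is not how OceanDB implements the query, which runs against the database; `haversine_km`, `nearest_point`, and the candidate positions are all made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_point(lat, lon, candidates):
    """Return the (lat, lon) candidate closest to the query point."""
    return min(candidates, key=lambda c: haversine_km(lat, lon, c[0], c[1]))


# Made-up measurement positions around the query point (-69, 28).
candidates = [(-68.5, 27.0), (-69.1, 28.2), (-70.0, 30.0)]
print(nearest_point(-69, 28, candidates))  # prints (-69.1, 28.2)
```

At high latitudes a degree of longitude covers far less ground than a degree of latitude, which is why a great-circle distance is used here instead of a plain difference in degrees.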
Running OceanDB scripts in PyCharm
Activate the environment & Install OceanDB
source .venv/bin/activate
pip install -e .
Set the PyCharm Run Configuration Parameters
In the top right of the PyCharm window, click the ‘edit’ button to configure the PyCharm run parameters.
Script path
Select the script you want to run, for example:
src/OceanDB/tests/test_geographic_nearest_neighbor.py
Python interpreter
Ensure the correct virtual environment is selected (the same one where OceanDB was installed).
Working directory
Set this to the repository root (the directory containing pyproject.toml).
Environment file (.env)
Set ‘Paths to .env files’ to point to your .env file containing PostgreSQL credentials and any other required environment variables.
Running OceanDB in Docker Instructions
Running Postgres
If you want to spin up a PostgreSQL development container with docker-compose:
make run_postgres  # runs PostgreSQL with PostGIS in Docker Compose
Build OceanDB Python Image
If building a development image:
make build_image
Query Notes
Had to Modify the Query Slightly
Every projected column must be aliased to its schema name. No schema object should ever reference a query-specific table alias.
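One way to read this rule: the schema holds bare column names, and the query alone is responsible for mapping its table alias onto them. The sketch below illustrates that with hypothetical helpers (`build_projection`, `build_select`) and made-up table and column names; it is not taken from the OceanDB source.

```python
def build_projection(table_alias, columns):
    """Alias every projected column to its bare schema name."""
    return ", ".join(f"{table_alias}.{name} AS {name}" for name in columns)


def build_select(table, table_alias, columns):
    """Build a SELECT whose output columns match the schema names exactly."""
    projection = build_projection(table_alias, columns)
    return f"SELECT {projection} FROM {table} AS {table_alias}"


# Hypothetical schema column names; note that they carry no table alias.
# Identifiers are assumed to be trusted constants, never user input.
SCHEMA_COLUMNS = ["sla", "latitude", "longitude"]
print(build_select("along_track", "at", SCHEMA_COLUMNS))
```

Because the result columns come back named `sla`, `latitude`, and `longitude`, the schema objects that consume them never need to know that this particular query called the table `at`.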