Seatizen Monitoring

How to extract a sub-dataset

Overview

This tutorial explains how to extract a subset of data from the SeatizenAtlas dataset. The process involves setting spatial and temporal filters, selecting metadata fields, and using the provided download tools to retrieve the corresponding images from Zenodo.

Dataset Extraction Process

Exporter explanation

The Exporter interface is organized into three primary components: the interactive map, the data configuration section, and the model configuration section. Each component serves a specific function in the data extraction workflow.

1. Spatial Boundary Definition

This tool enables users to delineate specific geographic zones on the map. The export operation will exclusively include images located within these defined boundaries. Multiple zones can be created to accommodate complex spatial requirements.

Important: If no spatial boundaries are defined, the export will process all images displayed on the map, which may result in extended processing times (potentially several minutes).
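To make the spatial rule concrete, here is a minimal Python sketch of a point-in-polygon test (ray casting). It only illustrates the idea that an image is kept when its GPS position falls inside a drawn zone. The helper name, the rectangular zone, and the file names are hypothetical, and nothing here is taken from the application's actual implementation.

```python
# Hypothetical helper: not part of Seatizen Monitoring or its tooling.
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up rectangular zone and image positions (lon, lat), for illustration only.
zone = [(55.28, -21.17), (55.29, -21.17), (55.29, -21.16), (55.28, -21.16)]
images = [
    ("inside.jpeg", 55.2865898413734, -21.16281806626762),
    ("outside.jpeg", 55.30, -21.16281806626762),
]
selected = [name for name, lon, lat in images if point_in_polygon(lon, lat, zone)]
print(selected)
```

With several zones, an image would be kept if the test succeeds for at least one of them.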

2. Drawn Zone Visualization

This example demonstrates a defined export zone encompassing a cluster of four plankton board sessions located in the northern region of Saint-Leu. The visualization provides immediate feedback on the spatial extent of your data selection.

3. Platform Selection

This selector allows users to filter data by acquisition platform. Multiple platforms can be selected simultaneously, each distinguished by a unique color code on the map. The map dynamically updates to display only data from the selected platforms.

4. Temporal Filtering

Users can refine their dataset by specifying a temporal range. The map will exclusively display data collected within the selected time period.
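The same kind of filter can be reproduced offline on an exported file list. The Python sketch below assumes that file names start with the acquisition date as YYYYMMDD, as the sample names in this tutorial suggest. That convention and both helper names are assumptions, not documented behavior.

```python
from datetime import date, datetime


def session_date(file_name):
    """Parse the leading YYYYMMDD of a file name (assumed naming convention)."""
    return datetime.strptime(file_name[:8], "%Y%m%d").date()


def in_range(file_name, start, end):
    """True if the parsed date falls within [start, end], bounds included."""
    return start <= session_date(file_name) <= end


names = [
    "20231110_REU-ST-LEU_ASV-1_04_5_567.jpeg",
    "20231110_REU-ST-LEU_ASV-1_04_5_990.jpeg",
]
# Keep only the files acquired in November 2023.
kept = [n for n in names if in_range(n, date(2023, 11, 1), date(2023, 11, 30))]
print(kept)
```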

5. Metadata Field Selection

This section enables selection of metadata fields to include in the export. For successful dataset retrieval, two fields must be selected:

  • version_doi: DOI pointing to the Zenodo deposit.
  • relative_file_path: Path of the image within its parent session directory.
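Before starting a download, it can be worth checking that an exported CSV really contains both fields. The following Python sketch does this with the standard library only. The helper name is hypothetical, and the header strings are sample data based on the export structure described in this tutorial.

```python
import csv
import io

# The two columns this tutorial lists as mandatory for image retrieval.
REQUIRED_FIELDS = {"version_doi", "relative_file_path"}


def missing_fields(csv_text):
    """Return the required column names absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is None when the text is empty, hence the fallback.
    return REQUIRED_FIELDS - set(reader.fieldnames or [])


complete = "FileName,GPSLatitude,GPSLongitude,version_doi,relative_file_path\n"
incomplete = "FileName,GPSLatitude,GPSLongitude\n"
print(missing_fields(complete))    # empty set: both fields are present
print(sorted(missing_fields(incomplete)))
```

For a file on disk, the same check applies to the text returned by reading the file.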

Model Configuration Section

While not required for basic dataset extraction, the model configuration section provides advanced functionality for users working with classification predictions:

6. Selection of multilabel classification models (note: drone platforms do not provide multilabel predictions)

7. Specification of classes to retain from the multilabel model output

8. Option to export raw prediction scores or binary presence/absence values after threshold application

9. Export button to generate and download the CSV file containing the configured dataset

Disclaimer: Predictions presented in this interface are generated by artificial intelligence algorithms and are provided for informational purposes only. They may contain inaccuracies. The application publisher disclaims all responsibility for their interpretation or use.
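To illustrate the difference between raw scores and presence/absence values, here is a minimal Python sketch of threshold application. The class names, scores, thresholds, default value, and the use of a greater-than-or-equal comparison are all hypothetical. This tutorial does not specify how the application computes its binary output.

```python
def to_presence(scores, thresholds, default=0.5):
    """Map each class score to 1 (present) or 0 (absent).

    A class is marked present when its score reaches its threshold;
    classes without a specific threshold use the default value.
    """
    return {
        label: int(score >= thresholds.get(label, default))
        for label, score in scores.items()
    }


# Made-up scores and thresholds for two placeholder classes.
raw_scores = {"class_a": 0.91, "class_b": 0.12}
class_thresholds = {"class_a": 0.5}
binary = to_presence(raw_scores, class_thresholds)
print(binary)
```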

Image Retrieval Process

1. CSV Export Structure

The export operation generates a CSV file with the following structure:

FileName,GPSLatitude,GPSLongitude,version_doi,relative_file_path
20231110_REU-ST-LEU_ASV-1_04_5_567.jpeg,-21.16281806626762,55.2865898413734,https://doi.org/10.5281/zenodo.12760339,20231110_REU-ST-LEU_ASV-1_04/PROCESSED_DATA/FRAMES/20231110_REU-ST-LEU_ASV-1_04_5_567.jpeg
20231110_REU-ST-LEU_ASV-1_04_5_990.jpeg,-21.16282476635246,55.28659116164465,https://doi.org/10.5281/zenodo.12760339,20231110_REU-ST-LEU_ASV-1_04/PROCESSED_DATA/FRAMES/20231110_REU-ST-LEU_ASV-1_04_5_990.jpeg
20231110_REU-ST-LEU_ASV-1_04_5_1409.jpeg,-21.162930832467936,55.28655471331759,https://doi.org/10.5281/zenodo.12760339,20231110_REU-ST-LEU_ASV-1_04/PROCESSED_DATA/FRAMES/20231110_REU-ST-LEU_ASV-1_04_5_1409.jpeg
...
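Since each row carries both a deposit DOI and a relative path, a quick way to inspect an export is to group image paths by deposit. The Python sketch below does this with the standard library. The helper name is hypothetical, and the sample text reuses two rows from the structure above.

```python
import csv
import io
from collections import defaultdict


def paths_by_deposit(csv_text):
    """Group relative_file_path values by their version_doi."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["version_doi"]].append(row["relative_file_path"])
    return dict(groups)


sample = (
    "FileName,GPSLatitude,GPSLongitude,version_doi,relative_file_path\n"
    "20231110_REU-ST-LEU_ASV-1_04_5_567.jpeg,-21.16281806626762,55.2865898413734,"
    "https://doi.org/10.5281/zenodo.12760339,"
    "20231110_REU-ST-LEU_ASV-1_04/PROCESSED_DATA/FRAMES/20231110_REU-ST-LEU_ASV-1_04_5_567.jpeg\n"
    "20231110_REU-ST-LEU_ASV-1_04_5_990.jpeg,-21.16282476635246,55.28659116164465,"
    "https://doi.org/10.5281/zenodo.12760339,"
    "20231110_REU-ST-LEU_ASV-1_04/PROCESSED_DATA/FRAMES/20231110_REU-ST-LEU_ASV-1_04_5_990.jpeg\n"
)
groups = paths_by_deposit(sample)
for doi, paths in groups.items():
    print(doi, len(paths))
```

The number of distinct DOIs gives a rough idea of how many Zenodo deposits a download would involve.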

2. Retrieving the Images

There are two ways to set up the environment required to download the images from Zenodo:

Method 1: Using a Python Environment

You can follow the detailed installation guide to set up a local Python environment:

GitHub Repository README

Run the command:

python zenodo-download.py -ecf -pcf /path/to/csv/file/provided/by/seatizen/monitoring -po /path/where/you/want/your/frames/folder

Method 2: Using Docker (Recommended for Simplicity)

If you prefer to avoid installing dependencies manually, you can use Docker Desktop.

Once installed, simply run the following command:

docker run -it --user 1000 --rm \
  -v ./path/where/you/want/your/frames/folder:/home/seatizen/plancha \
  -v ./path/folder/where/csv/file/provided/by/seatizen/monitoring:/home/seatizen/app/csv_inputs \
  --name zenodo-manager seatizendoi/zenodo-manager:latest bash

This command:

  • Automatically downloads the Docker image seatizendoi/zenodo-manager:latest
  • Mounts your local folders (plancha and csv_inputs) inside the container
  • Opens an interactive session ready to run the Zenodo Tools scripts

Once your environment is ready, run the following command inside the container:

python zenodo-download.py -ecf -pcf /home/seatizen/app/csv_inputs/seatizen_monitoring.csv -po /home/seatizen/plancha

Command Parameters:

  • -ecf: Enable CSV file processing mode
  • -pcf: Path to the CSV file exported from SeatizenAtlas
  • -po: Path to the output directory where downloaded images will be stored

Final Output

Upon successful completion of the download process, you will have access to two key components:

CSV Metadata File

The exported CSV file, containing the selected metadata for each image, including GPS coordinates. This file can be imported into QGIS or other GIS software for spatial visualization and analysis of image locations.
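QGIS can load the CSV directly as a delimited text layer. For tools that expect GeoJSON instead, the conversion can be sketched in Python as follows. The helper name is hypothetical, and the sample row is shortened to the three columns the sketch uses. Note that GeoJSON stores coordinates in longitude, latitude order.

```python
import csv
import io
import json


def csv_to_geojson(csv_text):
    """Build a GeoJSON FeatureCollection with one Point per CSV row."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON order is [longitude, latitude].
                "coordinates": [float(row["GPSLongitude"]), float(row["GPSLatitude"])],
            },
            "properties": {"FileName": row["FileName"]},
        })
    return {"type": "FeatureCollection", "features": features}


sample = (
    "FileName,GPSLatitude,GPSLongitude\n"
    "20231110_REU-ST-LEU_ASV-1_04_5_567.jpeg,-21.16281806626762,55.2865898413734\n"
)
collection = csv_to_geojson(sample)
print(json.dumps(collection, indent=2))
```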

Images Directory

A FRAMES subdirectory located at /path/where/you/want/your/frames/folder/Frames containing all downloaded images organized according to their original session structure.
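After a download, one way to confirm that nothing is missing is to compare the FileName column of the CSV with the files actually present in the output directory. The Python sketch below searches the directory recursively, so it makes no assumption about the exact folder layout. The helper name and the demo file names are hypothetical, and the demo runs on a temporary directory that stands in for the real output folder.

```python
import tempfile
from pathlib import Path


def missing_images(expected_names, output_dir):
    """Return expected file names not found anywhere under output_dir."""
    found = {p.name for p in Path(output_dir).rglob("*") if p.is_file()}
    return sorted(set(expected_names) - found)


# Demo on a throwaway directory with one of two expected files present.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "some_session" / "frames"
    nested.mkdir(parents=True)
    (nested / "a.jpeg").touch()
    result = missing_images(["a.jpeg", "b.jpeg"], tmp)
print(result)
```

In practice, the expected names would come from the FileName column of the exported CSV, and output_dir would be the path passed to -po.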

SeatizenAtlas Data Exporter Tutorial · For technical support, please refer to the GitHub repository