Advancing Research Through DataCite's Global Access Fund: DataMap/Amazon – The First Integrated Repository of Scientific Atmospheric Data From the Amazon
The Amazon region is crucial for climate balance and biodiversity preservation. DataMap/Amazon, an initiative of the Center for Sustainable Amazon Studies at the University of São Paulo (USP), promotes the production and dissemination of science for the sustainable development of the Amazon. It is a comprehensive data platform designed to centralize and unify data discovery, visualization, and processing for research datasets, particularly atmospheric data. DataMap empowers researchers to find, analyze, and share datasets seamlessly by simplifying data access and reducing infrastructure complexity.
Aggregating observations that quantify the greenhouse gas balance and climate uncertainties in the Amazon into a single scientific repository presents a unique opportunity to better understand how environmental phenomena work in the region, while also enabling the synthesis of the enormous amount of data generated there.
DataMap/Amazon was conceived with this vision. The initiative is relevant not only to Brazilian science but also to Latin American science more generally, which typically lacks the computational tools needed to integrate and synthesize knowledge from the vast body of information collected by researchers at institutions such as the National Institute for Amazonian Research, the National Institute for Space Research (INPE), USP, and the State University of Campinas.
Thanks to the DataCite Global Access Fund (GAF) 2024, the DataMap/Amazon initiative led by USP obtained funding to address the gap in computational tools throughout Latin America for synthesizing data related to greenhouse gases and climate change in the Amazon. DataCite's GAF supported the creation of DataMap/Amazon as the first integrated repository of scientific atmospheric data from the Amazon.
Scientific Atmospheric Data from the Amazon – DataMap/Amazon
The platform aims to improve the long-term management of in situ data collected by two Amazon projects, in order to better understand how Amazonian ecosystems respond to climate change and what role they play in the global carbon cycle:
- LBA (Large-Scale Biosphere-Atmosphere Experiment in Amazonia): A Brazilian initiative aimed at understanding the functioning of the Amazon rainforest by studying large-scale cycles of carbon, nutrients, water, and energy. The project provides critical insights into biosphere-atmosphere interactions.
- AmazonFACE (Amazon Free-Air CO2 Enrichment Experiment): Focused on observing the long-term effects of increased atmospheric CO2 on tropical forests.
Specifically, GAF supported the following features of DataMap/Amazon:
- Development of a software tool to manage and publish datasets collected from LBA and AmazonFACE.
- Curation of legacy datasets from these projects that have not yet been published in repositories.
- Training of researchers to utilize tools for managing and publishing datasets in DataMap/Amazon.
Details on each of these features are outlined below.
DataMap/Amazon Infrastructure Development
Through the GAF award, DataMap/Amazon has achieved several critical milestones, laying a solid foundation for comprehensive data management and accessibility. Key functionalities that have been implemented are:
- Digital Object Identifier (DOI) Integration: To facilitate accurate dataset citation and promote data discoverability, DataMap/Amazon integrates with the DataCite REST Application Programming Interface (API) to create and manage DOIs.
- User Portal and Dataset Management: An interface has been created for searching, editing, and organizing datasets, equipped with tools to modify metadata, manage data access, and publish DOIs, along with user profiles and access control based on ORCID authentication.
- API Integration: DataMap/Amazon has established a series of HTTP REST APIs for entities and operations such as users, clients, datasets, and metadata.
- Open-source code: DataMap/Amazon provides open-source code for atmospheric data management. The software is available on GitHub at https://github.com/ardc-brazil.
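To make the DOI integration more concrete, the following is a minimal sketch of how a client might prepare a draft DOI request for the DataCite REST API. It is not taken from the DataMap/Amazon code base: the helper names `build_doi_payload` and `build_request`, the prefix `10.12345`, the repository credentials, and the dataset details are all hypothetical, and the request is only constructed, never sent.

```python
import base64
import json
import urllib.request

# DataCite's public test endpoint; production uses https://api.datacite.org/dois.
DATACITE_TEST_URL = "https://api.test.datacite.org/dois"


def build_doi_payload(prefix, title, creators, publisher, year, landing_url, event=None):
    """Build a JSON:API document for the DataCite REST API (hypothetical helper).

    Without `event` the DOI is created in the Draft state;
    event="publish" asks DataCite to make it Findable.
    """
    attributes = {
        "prefix": prefix,  # DataCite generates a suffix when only a prefix is given
        "titles": [{"title": title}],
        "creators": [{"name": name} for name in creators],
        "publisher": publisher,
        "publicationYear": year,
        "types": {"resourceTypeGeneral": "Dataset"},
        "url": landing_url,
    }
    if event is not None:
        attributes["event"] = event
    return {"data": {"type": "dois", "attributes": attributes}}


def build_request(payload, repository_id, password, url=DATACITE_TEST_URL):
    """Prepare, but do not send, an authenticated POST request (hypothetical helper)."""
    token = base64.b64encode(f"{repository_id}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/vnd.api+json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# All values below are placeholders for illustration only.
payload = build_doi_payload(
    prefix="10.12345",
    title="Example LBA tower CO2 flux dataset",
    creators=["Example, Researcher"],
    publisher="DataMap/Amazon",
    year=2024,
    landing_url="https://example.org/datasets/1",
)
request = build_request(payload, "EXAMPLE.REPO", "secret")
# urllib.request.urlopen(request) would submit the draft; omitted in this sketch.
```

Creating the record as a draft first and publishing it in a later step leaves room for metadata review before the DOI becomes publicly findable.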
Atmospheric Dataset Curation
The dataset curation process in the DataMap/Amazon platform includes collecting, validating, and publishing metadata.
The metadata for each legacy dataset from LBA and AmazonFACE was collected based on information extracted from the datasets, input from data mentors, and standardized guidelines provided by the Atmospheric Radiation Measurement (ARM) framework. The metadata and datasets were then aggregated in standard NetCDF 3 files and validated against ARM's Online Metadata Editor, a comprehensive system designed to verify metadata consistency. Finally, each dataset was assigned a DOI, ensuring traceability and facilitating proper citation in future research. Each dataset version received its own DOI, so users can track updates, and each DOI record carries one of three statuses: Draft, Registered, or Findable.
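The three DOI statuses mentioned above follow the state model that DataCite documents for its DOIs. The sketch below models those transitions as a small lookup table. The names `DoiState`, `TRANSITIONS`, `apply_event`, and `can_delete` are illustrative assumptions and do not describe how DataMap/Amazon implements this internally.

```python
from enum import Enum


class DoiState(Enum):
    DRAFT = "draft"            # editable, not resolvable, may still be deleted
    REGISTERED = "registered"  # registered with the resolver, not indexed for discovery
    FINDABLE = "findable"      # registered and publicly discoverable


# Allowed (state, event) pairs, following DataCite's documented events.
TRANSITIONS = {
    (DoiState.DRAFT, "register"): DoiState.REGISTERED,
    (DoiState.DRAFT, "publish"): DoiState.FINDABLE,
    (DoiState.REGISTERED, "publish"): DoiState.FINDABLE,
    (DoiState.FINDABLE, "hide"): DoiState.REGISTERED,
}


def apply_event(state, event):
    """Return the state reached by applying `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state.value!r}")


def can_delete(state):
    """Only draft DOIs can be deleted; registered and findable DOIs persist."""
    return state is DoiState.DRAFT


state = apply_event(DoiState.DRAFT, "publish")
print(state.value)
```

Because a DOI can never return to the Draft state once registered or published, it is reasonable to keep a new dataset version in Draft until its metadata has been validated.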
Researcher Training – VII Workshop on Data Science
The VII Workshop on Data Science: Best Practices on Data Sharing and Data Synthesis took place on 14–15 October 2024 at the Escola Politécnica of USP in São Paulo, Brazil. This workshop series aimed to enhance knowledge dissemination and promote the sharing of experiences related to open data science among researchers and professionals both in Brazil and worldwide. In its 7th edition, the event sought to foster close collaboration among the US Department of Energy's Oak Ridge National Laboratory (ORNL), DataCite's global community, and the Big Data Research Group from USP/INPE. In particular, it highlighted DataMap/Amazon.
The open sessions of the workshop were attended by 111 participants, 67 in person and 44 remotely. All sessions were recorded, with the recordings from the first day available on YouTube. Participants included graduate and undergraduate students, researchers working with atmospheric data in the Amazon, and professionals interested in the event’s theme.
Conclusions and Next Steps
DataMap/Amazon’s roadmap outlines several upcoming projects, including:
- Metadata Workflow with Reviewers: Establishment of workflows with reviewer roles to ensure datasets meet quality standards before DOI publication, thereby improving data reliability.
- Improved Data Exploration and Visualization: Implementation of tools for basic data visualization within the platform, enabling users to explore dataset samples.
- Jupyter Server: Integration of Jupyter Notebook functionality directly into the platform, enabling users to explore, analyze, and process data interactively.
- Advanced Data Search and Indexing: Expansion of the range of accessible data and DOI datasets from external sources, such as ARM's data portal.
- Membership in the Brazilian Conciencia Consortium: The Big Data Research Group will join the Brazilian Conciencia Consortium as a leading institution in 2025.
- CoreTrustSeal Certification: DataMap/Amazon will apply for CoreTrustSeal Certification.
- AmazonFACE Module for Automation of Data Sensor Curation: Deployment of a data curation flow for sensor dataset acquisition from the AmazonFACE initiative.
DataMap/Amazon represents a significant step forward in integrating and managing atmospheric data from the Amazon, enhancing scientific collaboration and knowledge synthesis. With support from DataCite's GAF, the initiative has developed essential infrastructure, curated legacy datasets, and trained researchers in data-sharing best practices. These advancements strengthen the foundation for long-term climate research in the Amazon, ensuring that valuable data are accessible, discoverable, and properly cited for future studies.
Additional details
Identifiers
- UUID: 6d49de9c-04f4-4bd9-b2f5-99231d5a909b
- GUID: https://datacite.org/?p=12423
- URL: https://datacite.org/blog/advancing-research-through-datacites-global-access-fund-datamap-amazon-the-first-integrated-repository-of-scientific-atmospheric-data-from-the-amazon/
Dates
- Issued: 2025-03-13T08:17:30
- Updated: 2025-03-13T08:26:40