Quickstart
Welcome to madmpy
, a Python library designed to help manage and validate Data Management Plans (DMPs). This guide will walk you through installing madmpy
, setting up your first project, and running basic operations.
Installation
Ensure you have Python 3.11+ installed, then install madmpy
(preferably in a virtualenv
) using pip
as follows:
$ pip install madmpy
Alternatively, install from the source:
git clone https://github.com/msicilia/madmpy.git
cd madmpy
pip install -e .
Warning
If you encounter issues during installation, ensure that you have the required dependencies installed. You may need: pip install 'pydantic>=2.10.4'
Using madmpy
Once installed, you can start using madmpy
. The following sections demonstrate its basic functionality.
Import the library
First, import madmpy
into your Python script or interactive session:
import madmpy
To work with DMPs, load the module after importing madmpy
:
>>> import madmpy
>>> dmp_module = madmpy.load()
Loaded madmpy with RDA-DMP specification v1.1
Note
madmpy
by default uses the latest version of the RDA-DMP Common Standard, currently 1.1
. To use an older version, specify it explicitly using set_version(VERSION)
.
``` python
import madmpy VERSION = "1.0" madmpy.set_version(VERSION) dmp_module = madmpy.load() Loaded madmpy with RDA-DMP specification v1.0 ```
Validate a DMP file
To validate a DMP file in JSON
format, provide the file path to validate_DMP(path/to/file)
.
>>> madmpy.validate_DMP("data/ex9-dmp-long.json")
DMP validated!
You can use example files provided by the Research Data Alliance, located in the data
folder of the project.
Create a DMP
Besides validating DMPs under the RDA-DMP Common Standard, madmpy
allows creating new DMPs that conform to this specification.
Following the API Reference, you can create objects corresponding to the DMP. Below is an example of a .py
snippet to generate a DMP including only the required components.
import madmpy
dmp_module = madmpy.load()
title = "DMP Title"
language = dmp_module.LanguageEnum.eng
dataset = dmp_module.Dataset(
dataset_id=dmp_module.DatasetIdentifier(
identifier="https://doi.org/10.25504/FAIRsharing.r3vtvx",
type=dmp_module.dmp_dataset_id_type.DOI,
),
description="Dataset description example",
personal_data="no",
sensitive_data="no",
technical_resource=[dmp_module.TechnicalResource(name="Technical resource")],
title="Dataset title",
)
contact = dmp_module.Contact(
name="name",
contact_id=dmp_module.ContactIdentifier(
identifier="https://orcid.org/0000-0001-2345-6789",
type=dmp_module.contact_id_type.ORCID,
),
mbox="name@email.com",
)
dmp_id = dmp_module.DMPIdentifier(
identifier="https://doi.org/10.15497/rda00039",
type=dmp_module.dmp_dataset_id_type.DOI)
DMP = dmp_module.DMP(
dataset=[dataset],
language=language,
title=title,
contact=contact,
dmp_id=dmp_id,
ethical_issues_exist=dmp_module.YesNoUnknown.NO,
created= datetime.datetime.now().replace(microsecond=0),
modified= datetime.datetime.now().replace(microsecond=0),
)
This will generate a DMP
object based on Pydantic, which internally handles validations and constraints of the standard. To convert this object to JSON, use Pydantic's model_dump_json()
method:
print(DMP.model_dump_json(indent=4))
This will generate a JSON-formatted representation that can be stored or used for validation.
{
"title": "DMP Title",
"contact": {
"name": "name",
"contact_id": {
"identifier": "https://orcid.org/0000-0001-2345-6789",
"type": "orcid"
},
"mbox": "name@email.com"
},
"contributor": null,
"cost": null,
"created": "2025-02-10T13:49:29",
"dataset": [
{
"data_quality_assurance": null,
"dataset_id": {
"identifier": "https://doi.org/10.25504/FAIRsharing.r3vtvx",
"type": "doi"
},
"description": "Dataset description example",
"distribution": null,
"issued": null,
"keyword": null,
"language": null,
"metadata": null,
"personal_data": "no",
"preservation_statement": null,
"security_and_privacy": null,
"sensitive_data": "no",
"technical_resource": [
{
"description": null,
"name": "Technical resource"
}
],
"title": "Dataset title",
"type": null
}
],
"description": null,
"dmp_id": null,
"ethical_issues_description": null,
"ethical_issues_exist": "no",
"ethical_issues_report": null,
"language": "eng",
"modified": "2025-02-10T13:49:29",
"project": null
}
Note
Autocompletion in Visual Studio Code (VS Code) may not work correctly because the DMP version is loaded dynamically when initializing the library. To learn about the parameters of each standard component, refer to the API Reference.
Now you're ready to start working with madmpy
! 🚀