madmpy API Reference v1.1

Welcome to the madmpy API Reference. It documents the modules, classes, and methods available in the library. madmpy supports creating, validating, and managing Data Management Plans (DMPs) based on the RDA-DMP Common Standard. Whether you are integrating DMP functionality into your own system or exploring the components of a DMP, use this reference to look up the available structures and their parameters.

Certification

Bases: str, Enum

Enum representing the certification types for dataset distribution hosts.

Parameters:
  • DIN31644

    Certification according to DIN 31644 standard.

  • DINIZERTIFIKAT

    Certification by the German Initiative for Network Information (DINI).

  • DSA

    Data Seal of Approval certification.

  • ISO16363

    Certification based on the ISO 16363 standard for trustworthy digital repositories.

  • ISO16919

    Certification according to the ISO 16919 standard.

  • TRAC

    Certification based on the Trusted Repositories Audit & Certification (TRAC) standard.

  • WDS

    Certification from the World Data System (WDS).

  • CORETRUSTSEAL

    Certification by the CoreTrustSeal organization.

Source code in src/madmpy/v1_1/dmp.py
class Certification(str, Enum):
    """
    Enum representing the certification types for dataset distribution hosts.

    Args:
        DIN31644: Certification according to DIN 31644 standard.
        DINIZERTIFIKAT: Certification by the German Initiative for Network Information (DINI).
        DSA: Data Seal of Approval certification.
        ISO16363: Certification based on the ISO 16363 standard for trustworthy digital repositories.
        ISO16919: Certification according to the ISO 16919 standard.
        TRAC: Certification based on the Trusted Repositories Audit & Certification (TRAC) standard.
        WDS: Certification from the World Data System (WDS).
        CORETRUSTSEAL: Certification by the CoreTrustSeal organization.
    """
    DIN31644 = "din31644"
    DINIZERTIFIKAT = "dini-zertifikat"
    DSA = "dsa"
    ISO16363 = "iso16363"
    ISO16919 = "iso16919"
    TRAC = "trac"
    WDS = "wds"
    CORETRUSTSEAL = "coretrustseal"
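
Because the enum mixes in `str`, members can be looked up from their serialized value and compared directly with plain strings. The following is a minimal sketch using a local copy of the enum so that it runs without madmpy installed; `parse_certification` is a hypothetical helper, not part of the library.

```python
from enum import Enum
from typing import Optional


# Local copy of the enum shown above, so the snippet runs without madmpy installed.
class Certification(str, Enum):
    DIN31644 = "din31644"
    DINIZERTIFIKAT = "dini-zertifikat"
    DSA = "dsa"
    ISO16363 = "iso16363"
    ISO16919 = "iso16919"
    TRAC = "trac"
    WDS = "wds"
    CORETRUSTSEAL = "coretrustseal"


# Look a member up by its serialized value, as it would appear in a DMP JSON file.
cert = Certification("dini-zertifikat")

# The attribute name has no underscore, while the value contains a hyphen.
same_member = cert is Certification.DINIZERTIFIKAT

# Thanks to the str mixin, a member compares equal to its plain string value.
equals_plain_string = cert == "dini-zertifikat"


def parse_certification(raw: str) -> Optional[Certification]:
    """Hypothetical helper: return the matching member, or None for unknown values."""
    try:
        return Certification(raw)
    except ValueError:
        return None
```

Calling `Certification(...)` with a value that is not listed raises `ValueError`, which is why the helper wraps the lookup.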
Contact

Bases: BaseModel

Represents the main contact person for a DMP.

Parameters:
  • name (str) –

    The name of the contact person. Example: "Charlie Chaplin".

  • contact_id (ContactIdentifier) –

    The unique identifier for the contact, including an identifier value and type.

  • mbox (str) –

    The contact person's email address. Example: "cc@example.com".

Source code in src/madmpy/v1_1/dmp.py
class Contact(BaseModel):
    """
    Represents the main contact person for a DMP.

    Args:
        name (str): The name of the contact person. Example: "Charlie Chaplin".
        contact_id (ContactIdentifier): The unique identifier for the contact, including an identifier value and type.
        mbox (str): The contact person's email address. Example: "cc@example.com".
    """
    name: str
    contact_id: ContactIdentifier
    mbox: str
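
All three fields are required, since none of them has a default in the source above. Here is a minimal sketch of input data in that shape, using plain dictionaries so that it runs without madmpy or pydantic installed. The key names are taken from the field names above; the presence check is a hypothetical stand-in, not the library's validation.

```python
import json

# Field names as declared on Contact and ContactIdentifier above.
contact_payload = {
    "name": "Charlie Chaplin",
    "mbox": "cc@example.com",
    "contact_id": {
        "identifier": "https://orcid.org/0000-0003-0644-4174",
        "type": "orcid",
    },
}

# Hypothetical presence check; the real models validate types as well.
REQUIRED_CONTACT_FIELDS = {"name", "contact_id", "mbox"}
missing = REQUIRED_CONTACT_FIELDS - contact_payload.keys()

# Serialize the payload the way it might be stored in a DMP JSON document.
serialized = json.dumps(contact_payload, indent=2)
```

With madmpy installed, a dictionary like this could presumably be passed as `Contact(**contact_payload)`, since pydantic models accept nested dictionaries for nested models; treat that as an assumption to verify against the library.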
ContactIdentifier

Bases: BaseModel

Represents a unique identifier for the contact person in a DMP.

Parameters:
  • identifier (str) –

    A unique identifier for the contact, such as an ORCID URL. Example: "https://orcid.org/0000-0003-0644-4174".

  • type (contact_id_type) –

    The type of identifier, restricted to specific values (orcid, isni, openid, other) as defined in the schema.

Source code in src/madmpy/v1_1/dmp.py
class ContactIdentifier(BaseModel):
    """
    Represents a unique identifier for the contact person in a DMP.

    Args:
        identifier (str): A unique identifier for the contact, such as an ORCID URL. Example: "https://orcid.org/0000-0003-0644-4174".
        type (contact_id_type): The type of identifier, restricted to specific values (orcid, isni, openid, other) as defined in the schema.
    """
    identifier: str
    type: contact_id_type
Contributor

Bases: BaseModel

Represents a contributor in a DMP.

Parameters:
  • contributor_id (ContributorIdentifier) –

    The unique identifier for the contributor.

  • mbox (str) –

    The email address of the contributor (optional).

  • name (str) –

    The name of the contributor. Example: "John Smith".

  • role (List[str]) –

    The roles of the contributor. Example: ["Data Steward"].

Source code in src/madmpy/v1_1/dmp.py
class Contributor(BaseModel):
    """
    Represents a contributor in a DMP.

    Args:
        contributor_id (ContributorIdentifier): The unique identifier for the contributor.
        mbox (str): The email address of the contributor (optional).
        name (str): The name of the contributor. Example: "John Smith".
        role (List[str]): The roles of the contributor. Example: ["Data Steward"].
    """
    contributor_id: ContributorIdentifier
    mbox: Optional[str] = None
    name: str
    role: list[str]
ContributorIdentifier

Bases: BaseModel

Represents a unique identifier for a contributor.

Parameters:
  • identifier (str) –

    A unique identifier for the contributor, such as an ORCID URL. Example: "https://orcid.org/0000-0000-0000-0000".

  • type (contributor_id_type) –

    The type of identifier, restricted to specific values (orcid, isni, openid, other) as defined in the schema.

Source code in src/madmpy/v1_1/dmp.py
class ContributorIdentifier(BaseModel):
    """
    Represents a unique identifier for a contributor.

    Args:
        identifier (str): A unique identifier for the contributor, such as an ORCID URL. Example: "https://orcid.org/0000-0000-0000-0000".
        type (contributor_id_type): The type of identifier, restricted to specific values (orcid, isni, openid, other) as defined in the schema.
    """
    identifier: str
    type: contributor_id_type
Cost

Bases: BaseModel

Represents a cost entry in a DMP.

Parameters:
  • currency_code (CurrencyCode) –

    The currency code in ISO 4217 format. Example: "EUR".

  • description (str) –

    A brief description of the cost. Example: "Costs for maintaining...".

  • title (str) –

    The title of the cost entry. Example: "Storage and Backup".

  • value (float) –

    The numerical value of the cost. Example: 123.40.

Source code in src/madmpy/v1_1/dmp.py
class Cost(BaseModel):
    """
    Represents a cost entry in a DMP.

    Args:
        currency_code (CurrencyCode): The currency code in ISO 4217 format. Example: "EUR".
        description (str): A brief description of the cost. Example: "Costs for maintaining...".
        title (str): The title of the cost entry. Example: "Storage and Backup".
        value (float): The numerical value of the cost. Example: 123.40.
    """
    currency_code: Optional[CurrencyCode] = None
    description: Optional[str] = None
    title: str
    value: Optional[float] = None
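
Only `title` is required; `currency_code`, `description`, and `value` may be absent. Code that aggregates costs therefore has to tolerate missing values. Below is a minimal sketch working on plain dictionaries with the field names above; `total_by_currency` is a hypothetical helper, and the second and third entries are made-up illustrations.

```python
from collections import defaultdict

costs = [
    {"title": "Storage and Backup", "currency_code": "EUR", "value": 123.40},
    {"title": "Data curation", "currency_code": "EUR", "value": 76.60},  # illustrative entry
    {"title": "Long-term archiving"},  # only the required field is set
]


def total_by_currency(entries):
    """Hypothetical helper: sum cost values per currency, skipping incomplete entries."""
    totals = defaultdict(float)
    for entry in entries:
        value = entry.get("value")
        currency = entry.get("currency_code")
        # Entries without a value or a currency cannot be added to any total.
        if value is None or currency is None:
            continue
        totals[currency] += value
    return dict(totals)


totals = total_by_currency(costs)
```

Skipping incomplete entries is one possible policy; an application might instead prefer to report them.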
DMP

Bases: BaseModel

Represents a DMP.

Parameters:
  • title (str) –

    The title of the DMP.

  • project (Optional[List[Project]]) –

    The projects associated with this DMP.

  • created (datetime) –

    The timestamp when the DMP was created.

  • modified (datetime) –

    The timestamp when the DMP was last modified.

  • language (LanguageEnum) –

    The primary language of the DMP.

  • description (Optional[str]) –

    A description of the DMP.

  • dmp_id (DMPIdentifier) –

    The unique identifier for the DMP.

Source code in src/madmpy/v1_1/dmp.py
class DMP(BaseModel):
    """
    Represents a DMP.

    Args:
        title (str): The title of the DMP.
        project (Optional[List[Project]]): The projects associated with this DMP.
        created (datetime): The timestamp when the DMP was created.
        modified (datetime): The timestamp when the DMP was last modified.
        language (LanguageEnum): The primary language of the DMP.
        description (Optional[str]): A description of the DMP.
        dmp_id (DMPIdentifier): The unique identifier for the DMP.
    """

    title: str
    contact: Contact 
    contributor: Optional[list[Contributor]] = None
    cost: Optional[list[Cost]] = None
    created: datetime
    dataset: list[Dataset]    
    description: Optional[str] = None
    dmp_id: Annotated[DMPIdentifier, AfterValidator(validate_id)] # \
    #    = Field(default = DMPIdentifier(identifier="change-me", type="other"))
    ethical_issues_description: Optional[str] = None
    ethical_issues_exist: Optional[YesNoUnknown] = None
    ethical_issues_report: Optional[AnyUrl] = None
    language: LanguageEnum
    modified: datetime
    project: Optional[list[Project]] = None
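
Judging from the field declarations above, `title`, `contact`, `created`, `dataset`, `dmp_id`, `language`, and `modified` have no default and are therefore required, while the remaining fields default to `None`. Here is a small sketch, assuming a DMP draft held as a plain dictionary, that reports which required keys are still missing. `missing_dmp_fields` is a hypothetical helper, not part of madmpy, and the language value is an assumed ISO 639-3 code.

```python
from datetime import datetime, timezone

# Required top-level fields, read off the DMP class definition above.
REQUIRED_DMP_FIELDS = (
    "title", "contact", "created", "dataset", "dmp_id", "language", "modified",
)


def missing_dmp_fields(payload: dict) -> list:
    """Hypothetical helper: names of required DMP fields absent from payload."""
    return [name for name in REQUIRED_DMP_FIELDS if name not in payload]


timestamp = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc).isoformat()

draft = {
    "title": "DMP for Our New Project",
    "created": timestamp,
    "modified": timestamp,
    "language": "eng",  # assumed ISO 639-3 code
    "dmp_id": {
        "identifier": "https://doi.org/10.1371/journal.pcbi.1006750",
        "type": "doi",
    },
}

missing = missing_dmp_fields(draft)
print(missing)  # ['contact', 'dataset']
```

The check only looks at key presence; type and identifier validation remain the job of the model itself.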
DMPIdentifier

Bases: BaseModel

Represents an identifier for the DMP itself.

Parameters:
  • identifier (str) –

    A unique identifier for the DMP. Example: "https://doi.org/10.1371/journal.pcbi.1006750".

  • type (dmp_dataset_id_type) –

    The type of identifier, must be one of the allowed values. Example: "doi".

Source code in src/madmpy/v1_1/dmp.py
class DMPIdentifier(BaseModel):
    """
    Represents an identifier for the DMP itself.

    Args:
        identifier (str): A unique identifier for the DMP. Example: "https://doi.org/10.1371/journal.pcbi.1006750".
        type (dmp_dataset_id_type): The type of identifier, must be one of the allowed values. Example: "doi".
    """
    identifier: str
    type: dmp_dataset_id_type
DataAccess

Bases: str, Enum

Enum representing the access mode for datasets.

Parameters:
  • OPEN (str) –

    Data is openly accessible to the public.

  • SHARED (str) –

    Data is shared with specific groups or individuals under certain conditions.

  • CLOSED (str) –

    Data access is restricted and not publicly available.

Source code in src/madmpy/v1_1/dmp.py
class DataAccess(str, Enum):
    """
    Enum representing the access mode for datasets.

    Args:
        OPEN (str): Data is openly accessible to the public.
        SHARED (str): Data is shared with specific groups or individuals under certain conditions.
        CLOSED (str): Data access is restricted and not publicly available.
    """
    OPEN = "open"
    SHARED = "shared"
    CLOSED = "closed"
Dataset

Bases: BaseModel

Represents a dataset within a DMP.

Parameters:
  • data_quality_assurance (List[str]) –

    List of quality assurance measures.

  • dataset_id (DatasetIdentifier) –

    Identifier for the dataset.

  • description (str) –

    Description of the dataset.

  • distribution (List[Distribution]) –

    Technical distribution details.

  • issued (datetime) –

    Date of issue of the dataset.

  • keyword (List[str]) –

    Keywords describing the dataset.

  • language (LanguageEnum) –

    Language of the dataset.

  • metadata (List[Metadata]) –

    Metadata standards used.

  • personal_data (YesNoUnknown) –

    Indicates if the dataset contains personal data.

  • preservation_statement (str) –

    Description of dataset preservation measures.

  • security_and_privacy (List[SecurityPrivacy]) –

    Security and privacy measures applied.

  • sensitive_data (YesNoUnknown) –

    Indicates if the dataset contains sensitive data.

  • technical_resource (List[TechnicalResource]) –

    Technical resources required.

  • title (str) –

    Title of the dataset.

  • type (str) –

    Type of dataset according to DataCite or COAR. Otherwise use the common name for the type, e.g. raw data, software, survey, etc.

Source code in src/madmpy/v1_1/dmp.py
class Dataset(BaseModel):
    """
    Represents a dataset within a DMP.

    Args:
        data_quality_assurance (List[str]): List of quality assurance measures.
        dataset_id (DatasetIdentifier): Identifier for the dataset.
        description (str): Description of the dataset.
        distribution (List[Distribution]): Technical distribution details.
        issued (datetime): Date of issue of the dataset.
        keyword (List[str]): Keywords describing the dataset.
        language (LanguageEnum): Language of the dataset.
        metadata (List[Metadata]): Metadata standards used.
        personal_data (YesNoUnknown): Indicates if the dataset contains personal data.
        preservation_statement (str): Description of dataset preservation measures.
        security_and_privacy (List[SecurityPrivacy]): Security and privacy measures applied.
        sensitive_data (YesNoUnknown): Indicates if the dataset contains sensitive data.
        technical_resource (List[TechnicalResource]): Technical resources required.
        title (str): Title of the dataset.
        type (str): Type of dataset according to DataCite or COAR. Otherwise use the common name for the type, e.g. raw data, software, survey, etc.
    """
    data_quality_assurance: Optional[list[str]] = None
    dataset_id: Annotated[DatasetIdentifier, AfterValidator(validate_id)]
    description: Optional[str] = None
    distribution: Optional[list[Distribution]] = None
    issued: Optional[datetime] = None
    keyword: Optional[list[str]] = None
    language: Optional[LanguageEnum] = None
    metadata: Optional[list[Metadata]] = None
    personal_data: YesNoUnknown
    preservation_statement: Optional[str] = None
    security_and_privacy: Optional[list[SecurityPrivacy]] = None
    sensitive_data: YesNoUnknown
    technical_resource: Optional[list[TechnicalResource]] = None
    title: str
    type: Optional[str] = None
DatasetIdentifier

Bases: BaseModel

Represents an identifier for a dataset.

Parameters:
  • identifier (str) –

    A unique identifier for the dataset. Example: "https://hdl.handle.net/11353/10.923628".

  • type (dmp_dataset_id_type) –

    The type of identifier, must be one of the allowed values (handle, doi, ark, url, other).

Source code in src/madmpy/v1_1/dmp.py
class DatasetIdentifier(BaseModel):
    """
    Represents an identifier for a dataset.

    Args:
        identifier (str): A unique identifier for the dataset. Example: "https://hdl.handle.net/11353/10.923628".
        type (dmp_dataset_id_type): The type of identifier, must be one of the allowed values (handle, doi, ark, url, other).
    """
    identifier: str
    type: dmp_dataset_id_type
Distribution

Bases: BaseModel

Represents a dataset distribution, providing technical information on a specific instance of data.

Parameters:
  • access_url (AnyUrl) –

    URL of the resource that gives access to a distribution of the dataset. Example: "http://some.repo".

  • available_until (datetime) –

    Date until which the distribution is available.

  • byte_size (int) –

    Size of the dataset distribution in bytes.

  • data_access (DataAccess) –

    Access mode for the dataset (open, shared or closed).

  • description (str) –

    Description of the dataset distribution. Example: "Best quality data before resizing".

  • download_url (AnyUrl) –

    URL to directly download the dataset.

  • format (List[str]) –

    Format of the dataset distribution. Example: ["image/tiff"].

  • host (Host) –

    Host information where the dataset is stored.

  • license (List[License]) –

    Licenses applied to the dataset distribution.

  • title (str) –

    Title of the dataset distribution.

Source code in src/madmpy/v1_1/dmp.py
class Distribution(BaseModel):
    """
    Represents a dataset distribution, providing technical information on a specific instance of data.

    Args:
        access_url (AnyUrl): URL of the resource that gives access to a distribution of the dataset. Example: "http://some.repo".
        available_until (datetime): Date until which the distribution is available.
        byte_size (int): Size of the dataset distribution in bytes.
        data_access (DataAccess): Access mode for the dataset (open, shared or closed).
        description (str): Description of the dataset distribution. Example: "Best quality data before resizing".
        download_url (AnyUrl): URL to directly download the dataset.
        format (List[str]): Format of the dataset distribution. Example: ["image/tiff"].
        host (Host): Host information where the dataset is stored.
        license (List[License]): Licenses applied to the dataset distribution.
        title (str): Title of the dataset distribution.
    """
    access_url: Optional[AnyUrl] = None
    available_until: Optional[datetime] = None
    byte_size: Optional[int] = None
    data_access: Optional[DataAccess] = None
    description: Optional[str] = None
    download_url: Optional[AnyUrl] = None
    format: Optional[list[str]] = None
    host: Optional[Host] = None
    license: Optional[list[License]]
    title: str
Funding

Bases: BaseModel

Represents the funding details associated with a project.

Parameters:
  • funder_id (FundingIdentifier) –

    The identifier of the funding organization.

  • funding_status (FundingStatus) –

    The status of the funding application. Example: "granted".

  • grant_id (GrantIdentifier) –

    The identifier of the grant associated with the project.

Source code in src/madmpy/v1_1/dmp.py
class Funding(BaseModel):
    """
    Represents the funding details associated with a project.

    Args:
        funder_id (FundingIdentifier): The identifier of the funding organization.
        funding_status (FundingStatus): The status of the funding application. Example: "granted".
        grant_id (GrantIdentifier): The identifier of the grant associated with the project.
    """
    funder_id: FundingIdentifier
    funding_status: Optional[FundingStatus] = None
    grant_id: Optional[GrantIdentifier] = None
FundingIdentifier

Bases: BaseModel

Represents the identifier of a funder.

Parameters:
  • identifier (str) –

    The unique identifier for the funder. Example: "501100002428" (CrossRef Funder Registry ID).

  • type (funding_id_type) –

    The type of funder identifier, must be one of the allowed values: fundref, url, other.

Source code in src/madmpy/v1_1/dmp.py
class FundingIdentifier(BaseModel):
    """
    Represents the identifier of a funder.

    Args:
        identifier (str): The unique identifier for the funder. Example: "501100002428" (CrossRef Funder Registry ID).
        type (funding_id_type): The type of funder identifier, must be one of the allowed values: fundref, url, other.
    """
    identifier: str
    type: funding_id_type
FundingStatus

Bases: str, Enum

Enum representing the possible funding statuses.

Parameters:
  • PLANNED

    Funding has been planned but not yet applied for.

  • APPLIED

    Funding has been applied for but not yet granted.

  • GRANTED

    Funding has been awarded to the project.

  • REJECTED

    Funding application has been rejected.

Source code in src/madmpy/v1_1/dmp.py
class FundingStatus(str, Enum):
    """
    Enum representing the possible funding statuses.

    Args:
        PLANNED: Funding has been planned but not yet applied for.
        APPLIED: Funding has been applied for but not yet granted.
        GRANTED: Funding has been awarded to the project.
        REJECTED: Funding application has been rejected.
    """
    PLANNED = "planned"
    APPLIED = "applied"
    GRANTED = "granted"
    REJECTED = "rejected"
GrantIdentifier

Bases: BaseModel

Represents the identifier of a funding grant.

Parameters:
  • identifier (str) –

    The unique identifier for the grant. Example: "776242" (Grant ID).

  • type (grant_id_type) –

    The type of grant identifier, must be one of the allowed values: url, other.

Source code in src/madmpy/v1_1/dmp.py
class GrantIdentifier(BaseModel):
    """
    Represents the identifier of a funding grant.

    Args:
        identifier (str): The unique identifier for the grant. Example: "776242" (Grant ID).
        type (grant_id_type): The type of grant identifier, must be one of the allowed values: url, other.
    """
    identifier: str
    type: grant_id_type
Host

Bases: BaseModel

Represents a dataset distribution host in a DMP. Information about the QoS provided by the infrastructure (e.g., repository) where data is stored.

Parameters:
  • availability (str) –

    Availability percentage of the host. Example: "99.5".

  • backup_frequency (str) –

    Frequency at which backups are performed. Example: "weekly".

  • backup_type (str) –

    Type of backup storage used. Example: "tapes".

  • certified_with (Certification) –

    Certification type of the repository. Example: "coretrustseal".

  • description (str) –

    A description of the repository or host. Example: "Repository hosted by...".

  • geo_location (CountryCode) –

    Physical location of the repository, expressed using an ISO 3166-1 country code. Example: "AT".

  • pid_system (List[PidSystem]) –

    Persistent Identifier (PID) systems supported by the host. Example: ["doi"].

  • storage_type (str) –

    The type of storage used. Example: "External Hard Drive".

  • support_versioning (YesNoUnknown) –

    Whether the host supports versioning.

  • title (str) –

    The title of the repository or host. Example: "Super Repository".

  • url (AnyUrl) –

    The URL of the system hosting a distribution of a dataset. Example: "https://zenodo.org".

Source code in src/madmpy/v1_1/dmp.py
class Host(BaseModel):
    """
    Represents a dataset distribution host in a DMP. Information about the QoS provided by the infrastructure (e.g., repository) where data is stored.

    Args:
        availability (str): Availability percentage of the host. Example: "99.5".
        backup_frequency (str): Frequency at which backups are performed. Example: "weekly".
        backup_type (str): Type of backup storage used. Example: "tapes".
        certified_with (Certification): Certification type of the repository. Example: "coretrustseal".
        description (str): A description of the repository or host. Example: "Repository hosted by...".
        geo_location (CountryCode): Physical location of the repository, expressed using an ISO 3166-1 country code. Example: "AT".
        pid_system (List[PidSystem]): Persistent Identifier (PID) systems supported by the host. Example: ["doi"].
        storage_type (str): The type of storage used. Example: "External Hard Drive".
        support_versioning (YesNoUnknown): Whether the host supports versioning.
        title (str): The title of the repository or host. Example: "Super Repository".
        url (AnyUrl): The URL of the system hosting a distribution of a dataset. Example: "https://zenodo.org".
    """
    availability: Optional[str] = None
    backup_frequency: Optional[str] = None
    backup_type: Optional[str] = None
    certified_with: Optional[Certification] = None
    description: Optional[str] = None
    geo_location: Optional[CountryCode] = None
    pid_system: Optional[list[PidSystem]] = None
    storage_type: Optional[str] = None
    support_versioning: Optional[YesNoUnknown] = None
    title: str
    url: AnyUrl
License

Bases: BaseModel

Represents a license applied to a dataset distribution.

Parameters:
  • license_ref (AnyUrl) –

    URL link to the license document. Example: "https://creativecommons.org/licenses/by/4.0/".

  • start_date (datetime) –

    Date when the license starts being applicable. If set in the future, it indicates an embargo period.

Source code in src/madmpy/v1_1/dmp.py
class License(BaseModel):
    """
    Represents a license applied to a dataset distribution.

    Args:
        license_ref (AnyUrl): URL link to the license document. Example: "https://creativecommons.org/licenses/by/4.0/".
        start_date (datetime): Date when the license starts being applicable. If set in the future, it indicates an embargo period.
    """
    license_ref: AnyUrl
    start_date: datetime
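
The embargo rule described for `start_date` can be sketched as a comparison against a reference time. This is an illustration only: `is_under_embargo` is a hypothetical helper, not a madmpy function, and the dates are made up.

```python
from datetime import datetime, timezone


def is_under_embargo(start_date: datetime, now: datetime) -> bool:
    """Hypothetical helper: a license that starts in the future implies an embargo.

    Both datetimes should be timezone-aware (or both naive); Python raises
    TypeError when comparing an aware datetime with a naive one.
    """
    return start_date > now


reference_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
future_start = datetime(2025, 1, 1, tzinfo=timezone.utc)
past_start = datetime(2020, 1, 1, tzinfo=timezone.utc)

embargoed = is_under_embargo(future_start, reference_time)
already_applicable = not is_under_embargo(past_start, reference_time)
```

Passing the reference time explicitly, rather than calling `datetime.now()` inside the helper, keeps the check deterministic and easy to test.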
Metadata

Bases: BaseModel

Represents metadata standards used in a dataset.

Parameters:
  • description (str) –

    A description of the metadata standard. Example: "Provides taxonomy for...".

  • language (LanguageEnum) –

    The language in which the metadata is written, using ISO 639-3. Example: "eng".

  • metadata_standard_id (MetadataIdentifier) –

    The identifier of the metadata standard used.

Source code in src/madmpy/v1_1/dmp.py
class Metadata(BaseModel):
    """
    Represents metadata standards used in a dataset.

    Args:
        description (str): A description of the metadata standard. Example: "Provides taxonomy for...".
        language (LanguageEnum): The language in which the metadata is written, using ISO 639-3. Example: "eng".
        metadata_standard_id (MetadataIdentifier): The identifier of the metadata standard used.
    """
    description: Optional[str] = None
    language: LanguageEnum
    metadata_standard_id: MetadataIdentifier
MetadataIdentifier

Bases: BaseModel

Represents an identifier for a metadata standard used in a dataset.

Parameters:
  • identifier (str) –

    The identifier for the metadata standard. Example: "http://www.dublincore.org/specifications/dublin-core/dcmi-terms/".

  • type (metadata_id_type) –

    The type of identifier, restricted to "url" or "other".

Source code in src/madmpy/v1_1/dmp.py
class MetadataIdentifier(BaseModel):
    """
    Represents an identifier for a metadata standard used in a dataset.

    Args:
        identifier (str): The identifier for the metadata standard. Example: "http://www.dublincore.org/specifications/dublin-core/dcmi-terms/".
        type (metadata_id_type): The type of identifier, restricted to "url" or "other".
    """
    identifier: str
    type: metadata_id_type
PidSystem

Bases: str, Enum

Enum representing the Persistent Identifier (PID) systems used for dataset distribution hosts.

Parameters:
  • ARK

    Archival Resource Key (ARK) identifier system.

  • ARXIV

    arXiv identifier for preprints.

  • BIBCODE

    Bibliographic codes used in astronomy and astrophysics.

  • DOI

    Digital Object Identifier (DOI) system.

  • EAN13

    International Article Number (EAN-13) barcode standard.

  • EISSN

    Electronic International Standard Serial Number.

  • HANDLE

    Handle System for persistent digital identifiers.

  • IGSN

    International Geo Sample Number.

  • ISBN

    International Standard Book Number.

  • ISSN

    International Standard Serial Number.

  • ISTC

    International Standard Text Code.

  • LISSN

    Linking ISSN for serial publications.

  • LSID

    Life Science Identifier.

  • PMID

    PubMed Identifier for biomedical literature.

  • PURL

    Persistent Uniform Resource Locator.

  • UPC

    Universal Product Code.

  • URL

    Uniform Resource Locator.

  • URN

    Uniform Resource Name.

  • OTHER

    Other unspecified PID system.

Source code in src/madmpy/v1_1/dmp.py
class PidSystem(str, Enum):
    """
    Enum representing the Persistent Identifier (PID) systems used for dataset distribution hosts.

    Args:
        ARK: Archival Resource Key (ARK) identifier system.
        ARXIV: arXiv identifier for preprints.
        BIBCODE: Bibliographic codes used in astronomy and astrophysics.
        DOI: Digital Object Identifier (DOI) system.
        EAN13: International Article Number (EAN-13) barcode standard.
        EISSN: Electronic International Standard Serial Number.
        HANDLE: Handle System for persistent digital identifiers.
        IGSN: International Geo Sample Number.
        ISBN: International Standard Book Number.
        ISSN: International Standard Serial Number.
        ISTC: International Standard Text Code.
        LISSN: Linking ISSN for serial publications.
        LSID: Life Science Identifier.
        PMID: PubMed Identifier for biomedical literature.
        PURL: Persistent Uniform Resource Locator.
        UPC: Universal Product Code.
        URL: Uniform Resource Locator.
        URN: Uniform Resource Name.
        OTHER: Other unspecified PID system.
    """
    ARK = "ark"
    ARXIV = "arxiv"
    BIBCODE = "bibcode"
    DOI = "doi"
    EAN13 = "ean13"
    EISSN = "eissn"
    HANDLE = "handle"
    IGSN = "igsn"
    ISBN = "isbn"
    ISSN = "issn"
    ISTC = "istc"
    LISSN = "lissn"
    LSID = "lsid"
    PMID = "pmid"
    PURL = "purl"
    UPC = "upc"
    URL = "url"
    URN = "urn"
    OTHER = "other"
Project

Bases: BaseModel

Represents a project related to a DMP.

Parameters:
  • title (str) –

    The title of the project. Example: "Our New Project".

  • description (str) –

    A description of the project.

  • start (datetime) –

    The start date of the project.

  • end (datetime) –

    The end date of the project.

  • funding (List[Funding]) –

    A list of funding sources related to the project.

Source code in src/madmpy/v1_1/dmp.py
class Project(BaseModel):
    """
    Represents a project related to a DMP.

    Args:
        title (str): The title of the project. Example: "Our New Project".
        description (str): A description of the project.
        start (datetime): The start date of the project.
        end (datetime): The end date of the project.
        funding (List[Funding]): A list of funding sources related to the project.
    """
    title: str
    description: Optional[str] = None
    start: Optional[datetime] = None
    end: Optional[datetime] = None
    funding: Optional[list[Funding]] = None
SecurityPrivacy

Bases: BaseModel

Represents security and privacy measures applied to the dataset.

Parameters:
  • description (str) –

    A description of security and privacy measures.

  • title (str) –

    The title of the security/privacy measure.

Source code in src/madmpy/v1_1/dmp.py
class SecurityPrivacy(BaseModel):
    """
    Represents security and privacy measures applied to the dataset.

    Args:
        description (str): A description of security and privacy measures.
        title (str): The title of the security/privacy measure.
    """
    description: Optional[str] = None
    title: str
TechnicalResource

Bases: BaseModel

Represents technical resources needed to implement a DMP.

Parameters:
  • description (str) –

    A description of the technical resource.

  • name (str) –

    The name of the technical resource.

Source code in src/madmpy/v1_1/dmp.py
class TechnicalResource(BaseModel):
    """
    Represents technical resources needed to implement a DMP.

    Args:
        description (str): A description of the technical resource.
        name (str): The name of the technical resource.
    """
    description: Optional[str] = None
    name: str
YesNoUnknown

Bases: str, Enum

Enum representing a three-state option to indicate if a feature or option is supported.

Parameters:
  • YES

    The option is supported.

  • NO

    The option is not supported.

  • UNKNOWN

    It is unknown whether the option is supported.

Source code in src/madmpy/v1_1/dmp.py
class YesNoUnknown(str, Enum):
    """
    Enum representing a three-state option to indicate if a feature or option is supported.

    Args:
        YES: The option is supported.
        NO: The option is not supported.
        UNKNOWN: It is unknown if the option is supported.
    """
    YES = "yes"
    NO = "no"
    UNKNOWN = "unknown"
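Because YesNoUnknown mixes in str, its members can be looked up by value, compared with plain strings, and serialized to JSON without a custom encoder. The sketch below redefines the enum locally so that it runs without madmpy installed; the "supports_versioning" key is only an illustrative field name, not something taken from this page.

```python
import json
from enum import Enum


# Local copy of the enum from the listing above, so the sketch is self-contained.
class YesNoUnknown(str, Enum):
    YES = "yes"
    NO = "no"
    UNKNOWN = "unknown"


# Look a member up by its string value, e.g. when reading a DMP from JSON.
parsed = YesNoUnknown("yes")
is_same_member = parsed is YesNoUnknown.YES

# The str mixin makes members compare equal to their plain string values.
equals_plain_string = YesNoUnknown.NO == "no"

# Members serialize as their values; "supports_versioning" is an illustrative key.
encoded = json.dumps({"supports_versioning": YesNoUnknown.UNKNOWN})

# Values outside the three allowed states are rejected with a ValueError.
try:
    YesNoUnknown("maybe")
    rejected = False
except ValueError:
    rejected = True
```

The same lookup and comparison behavior applies to the other str-based enums on this page.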
contact_id_type

Bases: str, Enum

Enum for allowed contact identifier types.

Parameters:
  • ORCID (str) –

    Open Researcher and Contributor ID.

  • ISNI (str) –

    International Standard Name Identifier.

  • OPENID (str) –

    OpenID for user authentication.

  • OTHER (str) –

    Other unspecified identifier type.

Source code in src/madmpy/v1_1/dmp.py
class contact_id_type(str, Enum):
    """
    Enum for allowed contact identifier types.

    Args:
        ORCID (str): Open Researcher and Contributor ID.
        ISNI (str): International Standard Name Identifier.
        OPENID (str): OpenID for user authentication.
        OTHER (str): Other unspecified identifier type.
    """
    ORCID = "orcid"
    ISNI = "isni"
    OPENID = "openid"
    OTHER = "other"
contributor_id_type

Bases: str, Enum

Enum for allowed contributor identifier types.

Parameters:
  • ORCID (str) –

    Open Researcher and Contributor ID.

  • ISNI (str) –

    International Standard Name Identifier.

  • OPENID (str) –

    OpenID for user authentication.

  • OTHER (str) –

    Other unspecified identifier type.

Source code in src/madmpy/v1_1/dmp.py
class contributor_id_type(str, Enum):
    """
    Enum for allowed contributor identifier types.

    Args:
        ORCID (str): Open Researcher and Contributor ID.
        ISNI (str): International Standard Name Identifier.
        OPENID (str): OpenID for user authentication.
        OTHER (str): Other unspecified identifier type.
    """
    ORCID = "orcid"
    ISNI = "isni"
    OPENID = "openid"
    OTHER = "other"
dmp_dataset_id_type

Bases: str, Enum

Enum for allowed DMP dataset identifier types.

Parameters:
  • HANDLE

    Handle System identifier.

  • DOI

    Digital Object Identifier.

  • ARK

    Archival Resource Key.

  • URL

    Identifier is a standard URL.

  • OTHER

    Other unspecified identifier type.

Source code in src/madmpy/v1_1/dmp.py
class dmp_dataset_id_type(str, Enum):
    """
    Enum for allowed DMP dataset identifier types.

    Args:
        HANDLE: Handle.
        DOI: Digital Object Identifier.
        ARK: Archival Resource Key.
        URL: Identifier is a standard URL.
        OTHER: Other unspecified identifier type.
    """
    HANDLE = "handle"
    DOI = "doi"
    ARK = "ark"
    URL = "url"
    OTHER = "other"
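When identifier types arrive as free text (for example from a form or a spreadsheet), one possible approach is to normalize the text and fall back to OTHER for anything unrecognized. The helper coerce_dataset_id_type below is hypothetical and not part of madmpy; the enum is redefined locally so the sketch runs on its own.

```python
from enum import Enum


# Local copy of the enum from the listing above, so the sketch is self-contained.
class dmp_dataset_id_type(str, Enum):
    HANDLE = "handle"
    DOI = "doi"
    ARK = "ark"
    URL = "url"
    OTHER = "other"


def coerce_dataset_id_type(raw: str) -> dmp_dataset_id_type:
    """Hypothetical helper (not part of madmpy): map free text to an enum member."""
    try:
        return dmp_dataset_id_type(raw.strip().lower())
    except ValueError:
        # Unrecognized schemes fall back to the catch-all member.
        return dmp_dataset_id_type.OTHER


doi_type = coerce_dataset_id_type(" DOI ")
fallback_type = coerce_dataset_id_type("pmid")
```

Falling back silently is a design choice; callers that need strict input could let the ValueError propagate instead.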
funding_id_type

Bases: str, Enum

Enum representing the allowed identifier types for funders.

Parameters:
  • FUNDREF

    Identifier from the CrossRef Funder Registry.

  • URL

    A direct URL to the funder.

  • OTHER

    Other unspecified identifier type.

Source code in src/madmpy/v1_1/dmp.py
class funding_id_type(str, Enum):
    """
    Enum representing the allowed identifier types for funders.

    Args:
        FUNDREF: Identifier from the CrossRef Funder Registry.
        URL: A direct URL to the funder.
        OTHER: Other unspecified identifier type.
    """
    FUNDREF = "fundref"
    URL = "url"
    OTHER = "other"
grant_id_type

Bases: str, Enum

Enum representing the allowed identifier types for grants.

Parameters:
  • URL

    A direct URL to the grant.

  • OTHER

    Other unspecified identifier type.

Source code in src/madmpy/v1_1/dmp.py
class grant_id_type(str, Enum):
    """
    Enum representing the allowed identifier types for grants.

    Args:
        URL: A direct URL to the grant.
        OTHER: Other unspecified identifier type.
    """
    URL = "url"
    OTHER = "other"
metadata_id_type

Bases: str, Enum

Enum for allowed metadata identifier types.

Parameters:
  • URL

    Identifier type is a URL.

  • OTHER

    Other unspecified identifier type.

Source code in src/madmpy/v1_1/dmp.py
class metadata_id_type(str, Enum):
    """
    Enum for allowed metadata identifier types.

    Args:
        URL: Identifier type is a URL.
        OTHER: Other unspecified identifier type.
    """
    URL = "url"
    OTHER = "other"
extract_identifier(url, id_type)

Extracts the identifier from a URL based on the specified type.

Parameters:
  • url (str) –

    The URL containing the identifier.

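  • id_type (str) –

    The type of identifier to extract. Supported types are "doi", "orcid", "ark", and "handle". Matching is case-insensitive and anchored to the end of the URL.

Returns:
  • str or None: The extracted identifier if one is found, otherwise None. If id_type is not one of the supported types, url is returned unchanged.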

Source code in src/madmpy/v1_1/dmp.py
def extract_identifier(url, id_type):
    """
    Extracts the identifier from a URL based on the specified type.

    Args:
        url (str): The URL containing the identifier.
        id_type (str): The type of identifier to extract. Supported types are "doi", "orcid", "ark", and "handle".

    Returns:
        str or None: The extracted identifier if found, otherwise None.
    """
    patterns = {
        "doi": r"10\.\d{4,9}/[-._;()/:A-Z0-9]+$",
        "orcid": r"\d{4}-\d{4}-\d{4}-\d{3}[0-9X]{1}$",
        "ark": r"ark:/[-a-zA-Z0-9@:%_\\+.~#?&//=]+$",
        "handle": r"\d+\.\d+/[a-zA-Z0-9._;()/:@&=+$,-]+$"
    }

    if id_type not in patterns:
        return url

    match = re.search(patterns[id_type], url, re.IGNORECASE)
    return match.group(0) if match else None
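The following sketch illustrates the three possible outcomes of extract_identifier: a match, no match, and an unsupported type. The function body is copied from the listing above so that the example runs without madmpy installed; the sample URLs are illustrative.

```python
import re


# Copied from the listing above so the example runs without madmpy installed.
def extract_identifier(url, id_type):
    patterns = {
        "doi": r"10\.\d{4,9}/[-._;()/:A-Z0-9]+$",
        "orcid": r"\d{4}-\d{4}-\d{4}-\d{3}[0-9X]{1}$",
        "ark": r"ark:/[-a-zA-Z0-9@:%_\\+.~#?&//=]+$",
        "handle": r"\d+\.\d+/[a-zA-Z0-9._;()/:@&=+$,-]+$",
    }
    if id_type not in patterns:
        return url
    match = re.search(patterns[id_type], url, re.IGNORECASE)
    return match.group(0) if match else None


# The resolver prefix is dropped; only the identifier itself is returned.
doi = extract_identifier("https://doi.org/10.5281/zenodo.1234567", "doi")
orcid = extract_identifier("https://orcid.org/0000-0002-1825-0097", "orcid")

# No DOI in the string: the result is None.
missing = extract_identifier("https://example.org/no-identifier", "doi")

# Patterns are anchored to the end of the string, so a trailing query
# string prevents a match.
with_query = extract_identifier(
    "https://doi.org/10.5281/zenodo.1234567?download=1", "doi"
)

# Unsupported types ("url" here) return the input unchanged.
passthrough = extract_identifier("https://example.org/grant/42", "url")
```

Callers therefore need to distinguish None (supported type, nothing found) from an unchanged URL (unsupported type).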
validate_id(value)

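Validates an identifier against the expected format for its type. Supported types are "doi", "orcid", "ark", "handle", and "other".

Parameters:
  • value (object) –

    An object with type and identifier attributes.

Raises:
  • ValueError

    If the object does not have type and identifier attributes.

  • ValueError

    If no identifier can be extracted, or if the identifier does not match the expected format for its type.

  • ValueError

    If the type is not one of the supported types.

Returns:
  • object

    The value object, returned unchanged once validation succeeds.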

Source code in src/madmpy/v1_1/dmp.py
def validate_id(value):
    """
    Validates an identifier.

    Args:
        value (object): An object containing `type` and `identifier` attributes.

    Raises:
        ValueError: If the object does not have `type` and `identifier` attributes.
        ValueError: If the identifier does not match the expected format for its type.

    Returns:
        object: The validated `value` object.
    """
    if not hasattr(value, "type") or not hasattr(value, "identifier"):
        raise ValueError("The object must have 'type' and 'identifier' attributes.")

    identifier = extract_identifier(str(value.identifier).strip(), value.type)
    if not identifier:
        raise ValueError(f"No valid {value.type} identifier found in URL.")

    match value.type:
        case "doi":
            doi_pattern = r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$"
            if not re.match(doi_pattern, identifier, re.IGNORECASE):
                raise ValueError("Invalid DOI format")
        case "orcid":
            orcid_pattern = r"^\d{4}-\d{4}-\d{4}-\d{3}[0-9X]{1}$"
            if not re.match(orcid_pattern, identifier):
                raise ValueError("Invalid ORCID format")
        case "ark":
            ark_pattern = r"^ark:\/\d{5,10}\/[\w\-.]+(\?[^\s#]+|#[^\s]+)?$"
            if not re.match(ark_pattern, identifier):
                raise ValueError("Invalid ARK format")
        case "handle":
            handle_pattern = r"\d{1,5}(\.\d+)?\/[\w\-.]+$"
            if not re.match(handle_pattern, identifier):
                raise ValueError("Invalid Handle format")
        case "other":
            if not identifier:
                raise ValueError("Identifier cannot be empty for 'other'")
        case _:
            raise ValueError("Unsupported identifier type")

    return value
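To make the different failure modes concrete, the sketch below runs a few sample objects through a condensed stand-in for validate_id. The stand-in mirrors the checks in the listing above but uses a lookup table instead of a match statement, so it is not the library's code verbatim. SimpleNamespace objects stand in for the library's identifier models, and the outcome helper and sample identifiers are illustrative assumptions.

```python
import re
from types import SimpleNamespace

# End-anchored extraction patterns, as in extract_identifier above.
EXTRACT = {
    "doi": r"10\.\d{4,9}/[-._;()/:A-Z0-9]+$",
    "orcid": r"\d{4}-\d{4}-\d{4}-\d{3}[0-9X]{1}$",
    "ark": r"ark:/[-a-zA-Z0-9@:%_\\+.~#?&//=]+$",
    "handle": r"\d+\.\d+/[a-zA-Z0-9._;()/:@&=+$,-]+$",
}

# Format checks from validate_id above: (pattern, regex flags, label).
CHECKS = {
    "doi": (r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE, "DOI"),
    "orcid": (r"^\d{4}-\d{4}-\d{4}-\d{3}[0-9X]{1}$", 0, "ORCID"),
    "ark": (r"^ark:\/\d{5,10}\/[\w\-.]+(\?[^\s#]+|#[^\s]+)?$", 0, "ARK"),
    "handle": (r"\d{1,5}(\.\d+)?\/[\w\-.]+$", 0, "Handle"),
}


def extract_identifier(url, id_type):
    if id_type not in EXTRACT:
        return url
    match = re.search(EXTRACT[id_type], url, re.IGNORECASE)
    return match.group(0) if match else None


def validate_id(value):
    """Condensed stand-in for the listing above (lookup table, no match)."""
    if not hasattr(value, "type") or not hasattr(value, "identifier"):
        raise ValueError("The object must have 'type' and 'identifier' attributes.")
    identifier = extract_identifier(str(value.identifier).strip(), value.type)
    if not identifier:
        raise ValueError(f"No valid {value.type} identifier found in URL.")
    if value.type == "other":
        return value
    if value.type not in CHECKS:
        raise ValueError("Unsupported identifier type")
    pattern, flags, label = CHECKS[value.type]
    if not re.match(pattern, identifier, flags):
        raise ValueError(f"Invalid {label} format")
    return value


def outcome(value):
    """Illustrative helper: return "ok" or the ValueError message."""
    try:
        validate_id(value)
        return "ok"
    except ValueError as exc:
        return str(exc)


ns = SimpleNamespace
results = {
    "doi": outcome(ns(type="doi", identifier="https://doi.org/10.5281/zenodo.1234567")),
    "orcid": outcome(ns(type="orcid", identifier=" https://orcid.org/0000-0002-1825-0097 ")),
    "other": outcome(ns(type="other", identifier="internal-42")),
    # Nothing that looks like a DOI can be extracted.
    "bad_doi": outcome(ns(type="doi", identifier="not-a-doi")),
    # Extraction succeeds, but the stricter ARK check wants a 5-10 digit NAAN.
    "short_ark": outcome(ns(type="ark", identifier="https://example.org/ark:/123/x")),
    # "isni" has no format check, so it is reported as unsupported.
    "isni": outcome(ns(type="isni", identifier="0000000121032683")),
    # The object lacks a type attribute.
    "no_type": outcome(ns(identifier="10.5281/zenodo.1234567")),
}
```

Note that types such as "isni" are rejected as unsupported here even though they appear in the identifier-type enums on this page.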