harvest

Harvest related operations

List all available harvest backends

GET https://www.data.gouv.fr/api/1/harvest/backends
Response

Success

Body
features: array of HarvestFeature (object)

The backend optional features

filters: array of HarvestFilter (object)

The backend supported filters

id: string

The backend identifier

label: string

The backend display name

Request
// Fetch the list of harvest backends (top-level await needs an ES module context).
const response = await fetch('https://www.data.gouv.fr/api/1/harvest/backends', {
    method: 'GET',
    headers: {},
});
const data = await response.json();
Response
{
  "features": [
    {
      "default": "text",
      "description": "text",
      "key": "text",
      "label": "text"
    }
  ],
  "filters": [
    {
      "description": "text",
      "key": "text",
      "label": "text",
      "type": "text"
    }
  ],
  "id": "text",
  "label": "text"
}
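
As a rough sketch of how a client might consume this listing, the snippet below indexes backends by `id` and collects the filter and feature keys each one declares. It assumes the endpoint returns an array of objects shaped like the sample above (the sample shows a single element), and it tolerates a single object as well. `indexBackends` and `listBackends` are hypothetical helper names, not part of the API.

```javascript
// Hypothetical helper: build a lookup table from backend id to label and declared keys.
// Accepts an array of backends or a single backend object (assumption about the payload).
function indexBackends(payload) {
  const backends = Array.isArray(payload) ? payload : [payload];
  const byId = {};
  for (const backend of backends) {
    byId[backend.id] = {
      label: backend.label,
      // `filters` and `features` are not marked required, so default to empty arrays.
      filterKeys: (backend.filters ?? []).map((f) => f.key),
      featureKeys: (backend.features ?? []).map((f) => f.key),
    };
  }
  return byId;
}

// Hypothetical wrapper around the documented endpoint (performs a network call).
async function listBackends() {
  const response = await fetch('https://www.data.gouv.fr/api/1/harvest/backends');
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return indexBackends(await response.json());
}
```

Calling `listBackends()` would then give an object keyed by backend identifier, which is convenient for checking whether a given filter key is supported before configuring a source.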

List all jobs for a given source

GET https://www.data.gouv.fr/api/1/harvest/job/{ident}/
Path parameters
ident: string, required
Response

Success

Body
created: string (date-time), required

The job creation date

ended: string (date-time)

The job end date

errors: array of HarvestError (object)

The job initialization errors

id: string, required

The job execution ID

items: array of HarvestItem (object)

The job collected items

source: string, required

The source owning the job

started: string (date-time)

The job start date

status: enum, required

The job status

Example: "pending"
Allowed values: pending, initializing, initialized, processing, done, done-errors, failed
Request
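// Placeholder: substitute the real path parameter before running.
const ident = 'IDENT';
const response = await fetch(`https://www.data.gouv.fr/api/1/harvest/job/${encodeURIComponent(ident)}/`, {
    method: 'GET',
    headers: {},
});
const data = await response.json();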
Response
{
  "created": "2024-09-15T17:51:14.243Z",
  "ended": "2024-09-15T17:51:14.243Z",
  "errors": [
    {
      "level": "text",
      "message": "text"
    }
  ],
  "id": "text",
  "items": [
    {
      "args": [
        "text"
      ],
      "created": "2024-09-15T17:51:14.243Z",
      "dataservice": {
        "acronym": "text",
        "archived_at": "2024-09-15T17:51:14.243Z",
        "authorization_request_url": "text",
        "availability": "99.99",
        "base_api_url": "text",
        "contact_point": {
          "email": "text",
          "id": "text",
          "name": "text",
          "organization": {
            "class": "text",
            "id": "text",
            "acronym": "text",
            "badges": [
              {
                "kind": "text"
              }
            ],
            "logo": "text",
            "logo_thumbnail": "text",
            "name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          },
          "owner": {
            "class": "text",
            "id": "text",
            "avatar": "text",
            "avatar_thumbnail": "text",
            "first_name": "text",
            "last_name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          }
        },
        "created_at": "2024-09-15T17:51:14.243Z",
        "datasets": [
          {
            "class": "text",
            "id": "text",
            "acronym": "text",
            "page": "text",
            "title": "text",
            "uri": "text"
          }
        ],
        "deleted_at": "2024-09-15T17:51:14.243Z",
        "description": "text",
        "endpoint_description_url": "text",
        "format": "REST",
        "harvest": {
          "archived_at": "2024-09-15T17:51:14.243Z",
          "backend": "text",
          "created_at": "2024-09-15T17:51:14.243Z",
          "domain": "text",
          "id": "text",
          "last_update": "2024-09-15T17:51:14.243Z",
          "remote_id": "text",
          "remote_url": "text",
          "source_id": "text",
          "source_url": "text",
          "uri": "text"
        },
        "has_token": false,
        "id": "text",
        "is_restricted": false,
        "license": "text",
        "metadata_modified_at": "2024-09-15T17:51:14.243Z",
        "organization": {
          "class": "text",
          "id": "text",
          "acronym": "text",
          "badges": [
            {
              "kind": "text"
            }
          ],
          "logo": "text",
          "logo_thumbnail": "text",
          "name": "text",
          "page": "text",
          "slug": "text",
          "uri": "text"
        },
        "owner": {
          "class": "text",
          "id": "text",
          "avatar": "text",
          "avatar_thumbnail": "text",
          "first_name": "text",
          "last_name": "text",
          "page": "text",
          "slug": "text",
          "uri": "text"
        },
        "private": false,
        "rate_limiting": "text",
        "self_api_url": "text",
        "self_web_url": "text",
        "slug": "text",
        "tags": [
          "text"
        ],
        "title": "My awesome API"
      },
      "dataset": {
        "class": "text",
        "id": "text",
        "acronym": "text",
        "page": "text",
        "title": "text",
        "uri": "text"
      },
      "ended": "2024-09-15T17:51:14.243Z",
      "errors": [
        {
          "level": "text",
          "message": "text"
        }
      ],
      "logs": [
        {
          "level": "text",
          "message": "text"
        }
      ],
      "remote_id": "text",
      "started": "2024-09-15T17:51:14.243Z",
      "status": "pending"
    }
  ],
  "source": "text",
  "started": "2024-09-15T17:51:14.243Z",
  "status": "pending"
}
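
Because a job payload can be large, a client will often want a condensed view of it. The sketch below counts items per status, counts initialization errors, and computes the run duration from `started` and `ended`. Treating `done`, `done-errors` and `failed` as terminal is an assumption based on the status names, not something this reference states, and `summarizeJob` is a hypothetical helper.

```javascript
// Job statuses as listed in the enum above.
const JOB_STATUSES = ['pending', 'initializing', 'initialized', 'processing', 'done', 'done-errors', 'failed'];
// Assumption: these three statuses mean the job will not progress any further.
const ASSUMED_TERMINAL = new Set(['done', 'done-errors', 'failed']);

// Hypothetical helper: condense a job object into a few numbers.
function summarizeJob(job) {
  const itemsByStatus = {};
  for (const item of job.items ?? []) {
    itemsByStatus[item.status] = (itemsByStatus[item.status] ?? 0) + 1;
  }
  // `started` and `ended` are optional, so the duration may be unknown.
  let durationMs = null;
  if (job.started && job.ended) {
    durationMs = Date.parse(job.ended) - Date.parse(job.started);
  }
  return {
    id: job.id,
    status: job.status,
    knownStatus: JOB_STATUSES.includes(job.status),
    finished: ASSUMED_TERMINAL.has(job.status),
    errorCount: (job.errors ?? []).length,
    itemsByStatus,
    durationMs,
  };
}
```

Passing the parsed response from the request above to `summarizeJob` yields a small object suitable for logging or for deciding whether to poll again.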

List all available harvesters

GET https://www.data.gouv.fr/api/1/harvest/job_status
Response

Success

Body
Array of string
Request
const response = await fetch('https://www.data.gouv.fr/api/1/harvest/job_status', {
    method: 'GET',
    headers: {},
});
const data = await response.json();
Response
[
  "text"
]

Preview a harvest from a source created with the given payload

POST https://www.data.gouv.fr/api/1/harvest/source/preview
Body
active: boolean, required

Is this source active

autoarchive: boolean, required

If enabled, datasets not present on the remote source will be automatically archived

backend: enum, required

The source backend

Example: "csw-dcat"
Allowed values: csw-dcat, csw-iso-19139, dcat, ckan, dkan, maaf
config: object

The configuration as key-value pairs

created_at: string (date-time), required

The source creation date

deleted: string (date-time)

The source deletion date

description: string (markdown)

The source description

id: string

The source unique identifier

last_job: all of

The last job for this source

name: string, required

The source display name

organization: all of

The producer organization

owner: all of

The owner information

schedule: string

The source schedule (interval or cron expression)

url: string, required

The source base URL

validation: all of

Has the source been validated

Response

Success

Body
created: string (date-time), required

The job creation date

ended: string (date-time)

The job end date

errors: array of HarvestError (object)

The job initialization errors

id: string, required

The job execution ID

items: array of HarvestItemPreview (object)

The job collected items

source: string, required

The source owning the job

started: string (date-time)

The job start date

status: enum, required

The job status

Example: "pending"
Allowed values: pending, initializing, initialized, processing, done, done-errors, failed
Request
// Send a source definition and receive a preview job in return.
const response = await fetch('https://www.data.gouv.fr/api/1/harvest/source/preview', {
    method: 'POST',
    headers: {
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      "active": false,
      "autoarchive": true,
      "backend": "csw-dcat",
      "created_at": "2024-09-15T17:51:14.243Z",
      "name": "text",
      "url": "text"
    }),
});
const data = await response.json();
Response
{
  "created": "2024-09-15T17:51:14.243Z",
  "ended": "2024-09-15T17:51:14.243Z",
  "errors": [
    {
      "level": "text",
      "message": "text"
    }
  ],
  "id": "text",
  "items": [
    {
      "args": [
        "text"
      ],
      "created": "2024-09-15T17:51:14.243Z",
      "dataservice": {
        "acronym": "text",
        "archived_at": "2024-09-15T17:51:14.243Z",
        "authorization_request_url": "text",
        "availability": "99.99",
        "base_api_url": "text",
        "contact_point": {
          "email": "text",
          "id": "text",
          "name": "text",
          "organization": {
            "class": "text",
            "id": "text",
            "acronym": "text",
            "badges": [
              {
                "kind": "text"
              }
            ],
            "logo": "text",
            "logo_thumbnail": "text",
            "name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          },
          "owner": {
            "class": "text",
            "id": "text",
            "avatar": "text",
            "avatar_thumbnail": "text",
            "first_name": "text",
            "last_name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          }
        },
        "created_at": "2024-09-15T17:51:14.243Z",
        "datasets": [
          {
            "class": "text",
            "id": "text",
            "acronym": "text",
            "page": "text",
            "title": "text",
            "uri": "text"
          }
        ],
        "deleted_at": "2024-09-15T17:51:14.243Z",
        "description": "text",
        "endpoint_description_url": "text",
        "format": "REST",
        "harvest": {
          "archived_at": "2024-09-15T17:51:14.243Z",
          "backend": "text",
          "created_at": "2024-09-15T17:51:14.243Z",
          "domain": "text",
          "id": "text",
          "last_update": "2024-09-15T17:51:14.243Z",
          "remote_id": "text",
          "remote_url": "text",
          "source_id": "text",
          "source_url": "text",
          "uri": "text"
        },
        "has_token": false,
        "id": "text",
        "is_restricted": false,
        "license": "text",
        "metadata_modified_at": "2024-09-15T17:51:14.243Z",
        "organization": {
          "class": "text",
          "id": "text",
          "acronym": "text",
          "badges": [
            {
              "kind": "text"
            }
          ],
          "logo": "text",
          "logo_thumbnail": "text",
          "name": "text",
          "page": "text",
          "slug": "text",
          "uri": "text"
        },
        "owner": {
          "class": "text",
          "id": "text",
          "avatar": "text",
          "avatar_thumbnail": "text",
          "first_name": "text",
          "last_name": "text",
          "page": "text",
          "slug": "text",
          "uri": "text"
        },
        "private": false,
        "rate_limiting": "text",
        "self_api_url": "text",
        "self_web_url": "text",
        "slug": "text",
        "tags": [
          "text"
        ],
        "title": "My awesome API"
      },
      "dataset": {
        "acronym": "text",
        "archived": "2024-09-15T17:51:14.243Z",
        "badges": [
          {
            "kind": "text"
          }
        ],
        "community_resources": [
          {
            "checksum": {
              "type": "sha1",
              "value": "text"
            },
            "created_at": "2024-09-15T17:51:14.243Z",
            "description": "text",
            "filetype": "file",
            "format": "text",
            "harvest": {
              "created_at": "2024-09-15T17:51:14.243Z",
              "modified_at": "2024-09-15T17:51:14.243Z",
              "uri": "text"
            },
            "id": "text",
            "internal": {
              "created_at_internal": "2024-09-15T17:51:14.243Z",
              "last_modified_internal": "2024-09-15T17:51:14.243Z"
            },
            "last_modified": "2024-09-15T17:51:14.243Z",
            "latest": "text",
            "mime": "text",
            "preview_url": "text",
            "schema": {
              "name": "text",
              "url": "text",
              "version": "text"
            },
            "title": "text",
            "type": "main",
            "url": "text",
            "dataset": {
              "class": "text",
              "id": "text",
              "acronym": "text",
              "page": "text",
              "title": "text",
              "uri": "text"
            },
            "organization": {
              "class": "text",
              "id": "text",
              "acronym": "text",
              "badges": [
                {
                  "kind": "text"
                }
              ],
              "logo": "text",
              "logo_thumbnail": "text",
              "name": "text",
              "page": "text",
              "slug": "text",
              "uri": "text"
            },
            "owner": {
              "class": "text",
              "id": "text",
              "avatar": "text",
              "avatar_thumbnail": "text",
              "first_name": "text",
              "last_name": "text",
              "page": "text",
              "slug": "text",
              "uri": "text"
            }
          }
        ],
        "contact_point": {
          "email": "text",
          "id": "text",
          "name": "text",
          "organization": {
            "class": "text",
            "id": "text",
            "acronym": "text",
            "badges": [
              {
                "kind": "text"
              }
            ],
            "logo": "text",
            "logo_thumbnail": "text",
            "name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          },
          "owner": {
            "class": "text",
            "id": "text",
            "avatar": "text",
            "avatar_thumbnail": "text",
            "first_name": "text",
            "last_name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          }
        },
        "created_at": "2024-09-15T17:51:14.243Z",
        "deleted": "2024-09-15T17:51:14.243Z",
        "description": "text",
        "featured": false,
        "frequency": "unknown",
        "frequency_date": "2024-09-15T17:51:14.243Z",
        "harvest": {
          "archived": "text",
          "archived_at": "2024-09-15T17:51:14.243Z",
          "backend": "text",
          "ckan_name": "text",
          "ckan_source": "text",
          "created_at": "2024-09-15T17:51:14.243Z",
          "dct_identifier": "text",
          "domain": "text",
          "last_update": "2024-09-15T17:51:14.243Z",
          "modified_at": "2024-09-15T17:51:14.243Z",
          "remote_id": "text",
          "remote_url": "text",
          "source_id": "text",
          "uri": "text"
        },
        "id": "text",
        "internal": {
          "created_at_internal": "2024-09-15T17:51:14.243Z",
          "last_modified_internal": "2024-09-15T17:51:14.243Z"
        },
        "last_modified": "2024-09-15T17:51:14.243Z",
        "last_update": "2024-09-15T17:51:14.243Z",
        "license": "notspecified",
        "organization": {
          "class": "text",
          "id": "text",
          "acronym": "text",
          "badges": [
            {
              "kind": "text"
            }
          ],
          "logo": "text",
          "logo_thumbnail": "text",
          "name": "text",
          "page": "text",
          "slug": "text",
          "uri": "text"
        },
        "owner": {
          "class": "text",
          "id": "text",
          "avatar": "text",
          "avatar_thumbnail": "text",
          "first_name": "text",
          "last_name": "text",
          "page": "text",
          "slug": "text",
          "uri": "text"
        },
        "page": "text",
        "private": false,
        "resources": [
          {
            "checksum": {
              "type": "sha1",
              "value": "text"
            },
            "created_at": "2024-09-15T17:51:14.243Z",
            "description": "text",
            "filetype": "file",
            "format": "text",
            "harvest": {
              "created_at": "2024-09-15T17:51:14.243Z",
              "modified_at": "2024-09-15T17:51:14.243Z",
              "uri": "text"
            },
            "id": "text",
            "internal": {
              "created_at_internal": "2024-09-15T17:51:14.243Z",
              "last_modified_internal": "2024-09-15T17:51:14.243Z"
            },
            "last_modified": "2024-09-15T17:51:14.243Z",
            "latest": "text",
            "mime": "text",
            "preview_url": "text",
            "schema": {
              "name": "text",
              "url": "text",
              "version": "text"
            },
            "title": "text",
            "type": "main",
            "url": "text"
          }
        ],
        "schema": {
          "name": "text",
          "url": "text",
          "version": "text"
        },
        "slug": "text",
        "spatial": {
          "geom": {
            "coordinates": [],
            "type": "Point"
          },
          "granularity": "other",
          "zones": [
            "text"
          ]
        },
        "tags": [
          "text"
        ],
        "temporal_coverage": {
          "end": "2024-09-15T17:51:14.243Z",
          "start": "2024-09-15T17:51:14.243Z"
        },
        "title": "text",
        "uri": "text"
      },
      "ended": "2024-09-15T17:51:14.243Z",
      "errors": [
        {
          "level": "text",
          "message": "text"
        }
      ],
      "logs": [
        {
          "level": "text",
          "message": "text"
        }
      ],
      "remote_id": "text",
      "started": "2024-09-15T17:51:14.243Z",
      "status": "pending"
    }
  ],
  "source": "text",
  "started": "2024-09-15T17:51:14.243Z",
  "status": "pending"
}
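
Since the preview body has several required fields and a closed set of backends, it can help to check a payload locally before sending it. The following is a minimal sketch under that assumption: the required fields and backend values are copied from the Body section above, `checkSourcePayload` and `previewSource` are hypothetical helpers, and any authentication the endpoint may need is not shown.

```javascript
// Required fields and backend values as listed in the Body section above.
const REQUIRED_FIELDS = ['active', 'autoarchive', 'backend', 'created_at', 'name', 'url'];
const BACKENDS = ['csw-dcat', 'csw-iso-19139', 'dcat', 'ckan', 'dkan', 'maaf'];

// Hypothetical helper: return a list of problems; an empty list means the payload looks sendable.
function checkSourcePayload(payload) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    // `false` is a legitimate value for the boolean fields, so only reject null/undefined.
    if (payload[field] === undefined || payload[field] === null) {
      problems.push(`missing required field: ${field}`);
    }
  }
  if (payload.backend !== undefined && !BACKENDS.includes(payload.backend)) {
    problems.push(`unknown backend: ${payload.backend}`);
  }
  return problems;
}

// Hypothetical wrapper around the documented preview endpoint (performs a network call).
async function previewSource(payload) {
  const problems = checkSourcePayload(payload);
  if (problems.length > 0) throw new Error(problems.join('; '));
  const response = await fetch('https://www.data.gouv.fr/api/1/harvest/source/preview', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}
```

Checking locally first keeps obviously incomplete payloads from reaching the server; the server remains the authority on what is actually valid.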

Get a single source given an ID or a slug

GET https://www.data.gouv.fr/api/1/harvest/source/{ident}
Path parameters
ident: string, required

A source ID or slug

Response

Success

Body
active: boolean, required

Is this source active

autoarchive: boolean, required

If enabled, datasets not present on the remote source will be automatically archived

backend: enum, required

The source backend

Example: "csw-dcat"
Allowed values: csw-dcat, csw-iso-19139, dcat, ckan, dkan, maaf
config: object

The configuration as key-value pairs

created_at: string (date-time), required

The source creation date

deleted: string (date-time)

The source deletion date

description: string (markdown)

The source description

id: string

The source unique identifier

last_job: all of

The last job for this source

name: string, required

The source display name

organization: all of

The producer organization

owner: all of

The owner information

schedule: string

The source schedule (interval or cron expression)

url: string, required

The source base URL

validation: all of

Has the source been validated

Request
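// Placeholder: substitute a real source ID or slug before running.
const ident = 'SOURCE_ID_OR_SLUG';
const response = await fetch(`https://www.data.gouv.fr/api/1/harvest/source/${encodeURIComponent(ident)}`, {
    method: 'GET',
    headers: {},
});
const data = await response.json();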
Response
{
  "active": false,
  "autoarchive": true,
  "backend": "csw-dcat",
  "created_at": "2024-09-15T17:51:14.243Z",
  "deleted": "2024-09-15T17:51:14.243Z",
  "description": "text",
  "id": "text",
  "last_job": {
    "created": "2024-09-15T17:51:14.243Z",
    "ended": "2024-09-15T17:51:14.243Z",
    "errors": [
      {
        "level": "text",
        "message": "text"
      }
    ],
    "id": "text",
    "items": [
      {
        "args": [
          "text"
        ],
        "created": "2024-09-15T17:51:14.243Z",
        "dataservice": {
          "acronym": "text",
          "archived_at": "2024-09-15T17:51:14.243Z",
          "authorization_request_url": "text",
          "availability": "99.99",
          "base_api_url": "text",
          "contact_point": {
            "email": "text",
            "id": "text",
            "name": "text",
            "organization": {
              "class": "text",
              "id": "text",
              "acronym": "text",
              "badges": [
                {
                  "kind": "text"
                }
              ],
              "logo": "text",
              "logo_thumbnail": "text",
              "name": "text",
              "page": "text",
              "slug": "text",
              "uri": "text"
            },
            "owner": {
              "class": "text",
              "id": "text",
              "avatar": "text",
              "avatar_thumbnail": "text",
              "first_name": "text",
              "last_name": "text",
              "page": "text",
              "slug": "text",
              "uri": "text"
            }
          },
          "created_at": "2024-09-15T17:51:14.243Z",
          "datasets": [
            {
              "class": "text",
              "id": "text",
              "acronym": "text",
              "page": "text",
              "title": "text",
              "uri": "text"
            }
          ],
          "deleted_at": "2024-09-15T17:51:14.243Z",
          "description": "text",
          "endpoint_description_url": "text",
          "format": "REST",
          "harvest": {
            "archived_at": "2024-09-15T17:51:14.243Z",
            "backend": "text",
            "created_at": "2024-09-15T17:51:14.243Z",
            "domain": "text",
            "id": "text",
            "last_update": "2024-09-15T17:51:14.243Z",
            "remote_id": "text",
            "remote_url": "text",
            "source_id": "text",
            "source_url": "text",
            "uri": "text"
          },
          "has_token": false,
          "id": "text",
          "is_restricted": false,
          "license": "text",
          "metadata_modified_at": "2024-09-15T17:51:14.243Z",
          "organization": {
            "class": "text",
            "id": "text",
            "acronym": "text",
            "badges": [
              {
                "kind": "text"
              }
            ],
            "logo": "text",
            "logo_thumbnail": "text",
            "name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          },
          "owner": {
            "class": "text",
            "id": "text",
            "avatar": "text",
            "avatar_thumbnail": "text",
            "first_name": "text",
            "last_name": "text",
            "page": "text",
            "slug": "text",
            "uri": "text"
          },
          "private": false,
          "rate_limiting": "text",
          "self_api_url": "text",
          "self_web_url": "text",
          "slug": "text",
          "tags": [
            "text"
          ],
          "title": "My awesome API"
        },
        "dataset": {
          "class": "text",
          "id": "text",
          "acronym": "text",
          "page": "text",
          "title": "text",
          "uri": "text"
        },
        "ended": "2024-09-15T17:51:14.243Z",
        "errors": [
          {
            "level": "text",
            "message": "text"
          }
        ],
        "logs": [
          {
            "level": "text",
            "message": "text"
          }
        ],
        "remote_id": "text",
        "started": "2024-09-15T17:51:14.243Z",
        "status": "pending"
      }
    ],
    "source": "text",
    "started": "2024-09-15T17:51:14.243Z",
    "status": "pending"
  },
  "name": "text",
  "organization": {
    "class": "text",
    "id": "text",
    "acronym": "text",
    "badges": [
      {
        "kind": "text"
      }
    ],
    "logo": "text",
    "logo_thumbnail": "text",
    "name": "text",
    "page": "text",
    "slug": "text",
    "uri": "text"
  },
  "owner": {
    "class": "text",
    "id": "text",
    "avatar": "text",
    "avatar_thumbnail": "text",
    "first_name": "text",
    "last_name": "text",
    "page": "text",
    "slug": "text",
    "uri": "text"
  },
  "schedule": "text",
  "url": "text",
  "validation": {
    "by": {
      "class": "text",
      "id": "text",
      "avatar": "text",
      "avatar_thumbnail": "text",
      "first_name": "text",
      "last_name": "text",
      "page": "text",
      "slug": "text",
      "uri": "text"
    },
    "comment": "text",
    "on": "2024-09-15T17:51:14.243Z",
    "state": "pending"
  }
}
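
To illustrate how the `ident` path parameter and the optional parts of the response might be handled, here is a small sketch. It escapes the identifier when building the URL and reads `validation.state` and `last_job.status` defensively, since neither `validation` nor `last_job` is marked required. `sourceUrl`, `describeSource` and `getSource` are hypothetical names introduced for the example.

```javascript
const API_BASE = 'https://www.data.gouv.fr/api/1';

// Hypothetical helper: build the source URL, escaping the ident so a slug cannot alter the path.
function sourceUrl(ident) {
  return `${API_BASE}/harvest/source/${encodeURIComponent(ident)}`;
}

// Hypothetical helper: pick out the fields a status page might display.
function describeSource(source) {
  return {
    name: source.name,
    backend: source.backend,
    active: source.active,
    // `validation` and `last_job` are optional, hence the optional chaining.
    validationState: source.validation?.state ?? null,
    lastJobStatus: source.last_job?.status ?? null,
    deleted: Boolean(source.deleted),
  };
}

// Hypothetical wrapper around the documented endpoint (performs a network call).
async function getSource(ident) {
  const response = await fetch(sourceUrl(ident));
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return describeSource(await response.json());
}
```

The same `sourceUrl` helper works for either form of identifier, because the endpoint accepts a source ID or a slug.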

Update a harvest source

PUT https://www.data.gouv.fr/api/1/harvest/source/{ident}
Path parameters
ident: string, required

A source ID or slug

Body
active: boolean, required

Is this source active

autoarchive: boolean, required

If enabled, datasets not present on the remote source will be automatically archived

backend: enum, required

The source backend

Example: "csw-dcat"
Allowed values: csw-dcat, csw-iso-19139, dcat, ckan, dkan, maaf
config: object

The configuration as key-value pairs

created_at: string (date-time), required

The source creation date

deleted: string (date-time)

The source deletion date

description: string (markdown)

The source description

id: string

The source unique identifier

last_job: all of

The last job for this source

name: string, required

The source display name

organization: all of

The producer organization

owner: all of

The owner information

schedule: string

The source schedule (interval or cron expression)

url: string, required

The source base URL

validation: all of

Has the source been validated

Response

Success

Body
active: boolean, required

Is this source active

autoarchive: boolean, required

If enabled, datasets not present on the remote source will be automatically archived

backend: enum, required

The source backend

Example: "csw-dcat"
Allowed values: csw-dcat, csw-iso-19139, dcat, ckan, dkan, maaf
config: object

The configuration as key-value pairs

created_at: string (date-time), required

The source creation date

deleted: string (date-time)

The source deletion date

description: string (markdown)

The source description

id: string

The source unique identifier

last_job: all of

The last job for this source

name: string, required

The source display name

organization: all of

The producer organization

owner: all of

The owner information

schedule: string

The source schedule (interval or cron expression)

url: string, required

The source base URL

validation: all of

Has the source been validated
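
Because the update body marks several fields as required, changing one field in practice means sending the others too. One possible approach, sketched below, is to read the current source, merge the changes, and send back the required fields plus a few optional ones. Which fields the server accepts or ignores on update is not stated here, so the selection in `buildUpdateBody` is an assumption; both helper names are hypothetical, and authentication is not shown.

```javascript
// Fields marked required in the Body section above.
const REQUIRED = ['active', 'autoarchive', 'backend', 'created_at', 'name', 'url'];
// Optional fields this sketch chooses to send back (an assumption, not an API rule).
const OPTIONAL_SENT = ['config', 'description', 'schedule'];

// Hypothetical helper: merge changes into the current source and keep only selected fields.
function buildUpdateBody(current, changes) {
  const merged = { ...current, ...changes };
  const body = {};
  for (const key of [...REQUIRED, ...OPTIONAL_SENT]) {
    if (merged[key] !== undefined) body[key] = merged[key];
  }
  const missing = REQUIRED.filter((key) => body[key] === undefined);
  if (missing.length > 0) throw new Error(`missing required fields: ${missing.join(', ')}`);
  return body;
}

// Hypothetical read-modify-write wrapper (performs two network calls).
async function updateSource(ident, changes) {
  const url = `https://www.data.gouv.fr/api/1/harvest/source/${encodeURIComponent(ident)}`;
  const currentResponse = await fetch(url);
  if (!currentResponse.ok) throw new Error(`HTTP ${currentResponse.status}`);
  const response = await fetch(url, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildUpdateBody(await currentResponse.json(), changes)),
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}
```

Dropping read-only looking fields such as `id` and `last_job` from the body is a design choice of this sketch; if the server expects the full object, `buildUpdateBody` would need to pass them through.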

Request
const response