Crawler

A crawler allows you to retrieve a full list of the assets contained in a connection in a single operation. Manage crawlers to retrieve data at large scale and enrich your inventory more efficiently.

Endpoints

Global security

These security schemes apply to the entire API

Security scheme

This scheme can be referenced across the API

BearerAuthentication
Bearer authentication
Name Description
Format Bearer <TOKEN>
Headers
Name Description Type Attributes and examples
Authorization The authorization token (PAT, SAT or JWT) string Required
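
As an illustration only, a minimal Python sketch of setting this header with the requests library; the token value is a placeholder, not a value defined by this reference:

import requests

# Every call to the API carries a bearer token (PAT, SAT or JWT)
# in the Authorization header, formatted as "Bearer <TOKEN>".
session = requests.Session()
session.headers["Authorization"] = "Bearer <TOKEN>"  # replace <TOKEN> with your token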

Retrieve all the crawlers of a tenant

GET /connections/crawlers
This endpoint lets you retrieve all the crawlers of a tenant. You can also include the crawlers that have been deleted by activating the dedicated option, or retrieve only the crawler linked to a given connection by indicating the connection ID.
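
For example, a minimal sketch of calling this endpoint with Python and the requests library; the base URL and token are placeholders (assumptions), not values defined by this reference:

import requests

BASE_URL = "https://api.example.com"  # assumed host; replace with your tenant's API endpoint
HEADERS = {"Authorization": "Bearer <TOKEN>"}

resp = requests.get(
    f"{BASE_URL}/connections/crawlers",
    headers=HEADERS,
    params={"includeDeleted": "true"},  # also list crawlers that have been deleted
)
resp.raise_for_status()
for crawler in resp.json()["data"]:
    print(crawler["id"], crawler["name"], crawler["status"]["runStatus"])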

Request

Query parameters
Name Description Type Attributes and examples
limit not used integer Optional
INT32
offset not used integer Optional
INT32
talendVersion default version of the API string Optional
includeDeleted if true, also returns the crawlers that have been deleted. boolean Optional
connectionId Use this option if you want to retrieve the crawler linked to a connection string Optional
Headers
Name Description Type Attributes and examples
talend-version default version of the API string Optional

Response

Status 200
Body

The response payload contains the list of returned crawlers. A crawler contains three parts:

  • the sharing set, which tells with whom the generated datasets are shared
  • the status of the crawler
  • the datasets created by the crawler
PaginatedResources_CrawlerModel
Status 200 application/json
{
  "data": [
    {
      "id": "59451bf0-a81a-11eb-bcbc-0242ac130002",
      "connectionId": "d54a8f03-7906-4930-a7cc-4eb90e968f89",
      "name": "Crawler1",
      "description": "Description du crawler 1",
      "sharings": [
        {
          "scimType": "user",
          "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
          "level": "OWNER"
        },
        {
          "scimType": "group",
          "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
          "level": "READER"
        },
        {
          "scimType": "user",
          "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
          "level": "WRITER"
        }
      ],
      "status": {
        "runStatus": "NotStarted"
      },
      "createdAt": "2021-01-08T15:41:29.263Z",
      "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
      "crawledDatasets": [
        "Table1",
        "Table2",
        "Table3",
        "View1"
      ]
    },
    {
      "id": "3a45cb46-a81a-11eb-bcbc-0242ac130002",
      "connectionId": "165ea830-e003-11eb-ba80-0242ac130004",
      "name": "Crawler2",
      "description": "Description du crawler 2",
      "sharings": [
        {
          "scimType": "user",
          "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
          "level": "OWNER"
        },
        {
          "scimType": "group",
          "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
          "level": "READER"
        },
        {
          "scimType": "user",
          "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
          "level": "WRITER"
        }
      ],
      "status": {
        "runStatus": "RetrievingProperties",
        "runStartedAt": "2021-01-08T15:41:29.263Z",
        "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516"        
      },
      "createdAt": "2021-01-08T15:41:29.263Z",
      "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
      "crawledDatasets": [
        "Dataset 1",
        "Dataset 2",
        "Dataset 3",
        "Dataset 4"
      ]
    },
    {
      "id": "108fb1c2-a81a-11eb-bcbc-0242ac130002",
      "connectionId": "1db0db6c-e003-11eb-ba80-0242ac130004",
      "name": "Crawler3",
      "description": "Description du crawler 3",
      "sharings": [
        {
          "scimType": "user",
          "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
          "level": "OWNER"
        },
        {
          "scimType": "group",
          "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
          "level": "READER"
        },
        {
          "scimType": "user",
          "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
          "level": "WRITER"
        }
      ],
      "status": {
        "runStatus": "PropertiesRetrievalFailed",
        "runStartedAt": "2021-01-08T15:41:29.263Z",
        "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
        "failure": "cannot generate dataset properties"
      },
      "createdAt": "2021-01-08T15:41:29.263Z",
      "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
      "crawledDatasets": [
        "Dataset 1",
        "Dataset 2",
        "Dataset 3",
        "Dataset 4"
      ]
    },
    {
      "id": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
      "connectionId": "2695204e-e003-11eb-ba80-0242ac130004",
      "name": "Crawler4",
      "description": "Description du crawler 4",
      "sharings": [
        {
          "scimType": "user",
          "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
          "level": "OWNER"
        },
        {
          "scimType": "group",
          "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
          "level": "READER"
        },
        {
          "scimType": "user",
          "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
          "level": "WRITER"
        }
      ],
      "status": {
        "runStatus": "CreatingDatasets",
        "runStartedAt": "2021-01-08T15:41:29.263Z",
        "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516"
      },
      "createdAt": "2021-01-08T15:41:29.263Z",
      "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
      "crawledDatasets": [
        "Dataset 1",
        "Dataset 2",
        "Dataset 3",
        "Dataset 4"
      ]
    },
    {
      "id": "7c7f7872-a81a-11eb-bcbc-0242ac130002",
      "connectionId": "2e46ed68-e003-11eb-ba80-0242ac130004",
      "name": "Crawler5",
      "description": "Description du crawler 5",
      "sharings": [
        {
          "scimType": "user",
          "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
          "level": "OWNER"
        },
        {
          "scimType": "group",
          "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
          "level": "READER"
        },
        {
          "scimType": "user",
          "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
          "level": "WRITER"
        }
      ],
      "status": {
        "runStatus": "Finished",
        "runStartedAt": "2021-01-08T15:41:29.263Z",
        "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
        "runFinishedAt": "2021-01-08T15:41:29.263Z"
      },
      "createdAt": "2021-01-08T15:41:29.263Z",
      "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
      "crawledDatasets": [
        "Dataset 1",
        "Dataset 2",
        "Dataset 3",
        "Dataset 4"
      ]
    }
  ],
  "offset": 0,
  "limit": 0,
  "total": 5
}
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 500
Internal Server Error
Body
ServerError

Create a new crawler

POST /connections/crawlers

At this time, a crawler can only be created on a JDBC connection. You can only have one active crawler per connection; an active crawler is a crawler that has not been deleted. When the user runs the crawler, datasets are created from the selected tables and views of the JDBC connection.

Known limitations:

Max objects limit: we recommend selecting fewer than 1000 tables/views. Beyond this limit, you may encounter issues when launching the run endpoint.

Max datasets limit: the maximum number of datasets a user can have is 1500. Beyond this limit, you may encounter timeouts when calling the dataset endpoint that lists them all for a user. Consequently, when configuring a crawler, it is important to ensure that you will not exceed this limit after the crawler has run.
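
A minimal creation sketch in Python's requests library, using the same shape as the CreateCrawlerRequest example below (sharings omitted for brevity); the base URL and token are placeholder assumptions:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}

payload = {
    "connectionId": "d54a8f03-7906-4930-a7cc-4eb90e968f89",
    "name": "Crawler - JDBC",
    "selectedDatasets": ["accounts", "orders", "items"],
}
resp = requests.post(f"{BASE_URL}/connections/crawlers", headers=HEADERS, json=payload)
if resp.status_code == 409:
    print("A crawler already exists for this connection")  # AlreadyExist
else:
    resp.raise_for_status()
    crawler_id = resp.json()["id"]  # CreateCrawlerResponse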

Request

Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional
Body
CreateCrawlerRequest
application/json
{
  "connectionId": "d54a8f03-7906-4930-a7cc-4eb90e968f89",
  "name": "Crawler - JDBC",
  "selectedDatasets": [
    "accounts",
    "orders",
    "items"
  ],
  "sharings": [
    {
      "scimType": "user",
      "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
      "level": "OWNER"
    },
    {
      "scimType": "group",
      "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
      "level": "READER"
    },
    {
      "scimType": "user",
      "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
      "level": "WRITER"
    }
  ]
}

Response

Status 201
Body
CreateCrawlerResponse
Status 201 application/json
{
  "id": "ac6e2117-fbb5-442a-bb02-cefabbf04516"
}
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 409
Already exists
Body
AlreadyExist
Status 500
Internal Server Error
Body
ServerError

Update the tables and views selection of an existing crawler

PUT /connections/crawlers/{crawlerId}
This endpoint allows you to add and remove some tables or views from the crawler configuration.
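
For illustration, a hedged sketch of updating the selection with Python's requests library; the base URL, token and crawler ID reuse placeholder or example values, not values defined here:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "ac6e2117-fbb5-442a-bb02-cefabbf04516"  # example ID from this reference

# UpdateCrawlerRequest: name is required, selectedDatasets is the new table/view selection
resp = requests.put(
    f"{BASE_URL}/connections/crawlers/{crawler_id}",
    headers=HEADERS,
    json={"name": "Crawler - JDBC", "selectedDatasets": ["accounts", "orders"]},
)
resp.raise_for_status()  # a 409 AlreadyRunning is returned while the crawler is running
print(resp.text)  # the body is the technical ID of the updated crawler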

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional
Body
UpdateCrawlerRequest

Response

Status 200
Body
The technical Talend ID of the crawler that has been updated.
string
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 409
The crawler is running, action not available
Body
AlreadyRunning
Status 500
Internal Server Error
Body
ServerError

Update the name and description of a crawler

PATCH /connections/crawlers/{crawlerId}
This endpoint allows you to update the name and the description of a crawler.
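
A minimal rename sketch with Python's requests library, under the same placeholder assumptions as the previous examples:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "ac6e2117-fbb5-442a-bb02-cefabbf04516"  # example ID

# PatchCrawlerRequest: both fields are optional and replace the current values
resp = requests.patch(
    f"{BASE_URL}/connections/crawlers/{crawler_id}",
    headers=HEADERS,
    json={"name": "Crawler - JDBC v2", "description": "Renamed crawler"},
)
resp.raise_for_status()
print(resp.text)  # the body is the technical ID of the updated crawler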

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional
Body
PatchCrawlerRequest

Response

Status 200
Body
The technical Talend ID of the crawler that has been updated.
string
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 409
The crawler is running, action not available
Body
AlreadyRunning
Status 500
Internal Server Error
Body
ServerError

Delete a crawler

DELETE /connections/crawlers/{crawlerId}
Use this method to delete a crawler. Because you can only have one crawler at a time on a JDBC connection, you may need to delete a crawler in order to create a new one; alternatively, you can edit the existing crawler and make modifications. Deleting a crawler does not physically remove it unless the crawler no longer has any related datasets. You can still find it with the endpoint that lists all the crawlers by including the deleted crawlers in the search.
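
A minimal deletion sketch with Python's requests library, under the same placeholder assumptions:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "ac6e2117-fbb5-442a-bb02-cefabbf04516"  # example ID

resp = requests.delete(f"{BASE_URL}/connections/crawlers/{crawler_id}", headers=HEADERS)
# 204: deleted; the crawler stays listable with includeDeleted=true
# while it still has related datasets
resp.raise_for_status()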

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 204
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 409
The crawler is running, action not available
Body
AlreadyRunning
Status 500
Internal Server Error
Body
ServerError

Get a crawler by its ID

GET /connections/crawlers/{crawlerId}
Retrieve a crawler using its ID. The response payload contains the crawler itself.
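
For example, a minimal Python sketch of fetching one crawler and reading its status counters; the base URL and token are placeholder assumptions:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "59451bf0-a81a-11eb-bcbc-0242ac130002"  # example ID from this reference

resp = requests.get(
    f"{BASE_URL}/connections/crawlers/{crawler_id}",
    headers=HEADERS,
    params={"includeDeleted": "true"},
)
resp.raise_for_status()
status = resp.json()["status"]
print(status["runStatus"], status["nbDatasetsCreated"], status["nbDatasetsFailed"])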

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
includeDeleted if true, also returns the crawler even if it has been deleted. boolean Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 200
Body

The payload contains all the crawler data:

  • name, description
  • selected tables/views
  • sharing set
CrawlerModel
Status 200 application/json
{
  "id": "59451bf0-a81a-11eb-bcbc-0242ac130002",
  "connectionId": "d54a8f03-7906-4930-a7cc-4eb90e968f89",
  "name": "Crawler1",
  "description": "Description du crawler 1",
  "sharings": [
    {
      "scimType": "user",
      "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
      "level": "OWNER"
    },
    {
      "scimType": "group",
      "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
      "level": "READER"
    },
    {
      "scimType": "user",
      "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
      "level": "WRITER"
    }
  ],
  "status": {
    "runStatus": "NotStarted",
    "nbDatasetsToCrawl": 0,
    "nbDatasetsFinished": 0,
    "nbDatasetsCreated": 0,
    "nbDatasetsFailed": 0,
    "nbSamplesFailed": 0
  },
  "createdAt": "2021-01-08T15:41:29.263Z",
  "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "crawledDatasets": [
    "Dataset 1",
    "Dataset 2",
    "Dataset 3",
    "Dataset 4"
  ]
}
{
  "id": "3a45cb46-a81a-11eb-bcbc-0242ac130002",
  "connectionId": "165ea830-e003-11eb-ba80-0242ac130004",
  "name": "Crawler2",
  "description": "Description du crawler 2",
  "sharings": [
    {
      "scimType": "user",
      "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
      "level": "OWNER"
    },
    {
      "scimType": "group",
      "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
      "level": "READER"
    },
    {
      "scimType": "user",
      "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
      "level": "WRITER"
    }
  ],
  "status": {
    "runStatus": "RetrievingProperties",
    "runStartedAt": "2021-01-08T15:41:29.263Z",
    "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
    "nbDatasetsToCrawl": 0,
    "nbDatasetsFinished": 0,
    "nbDatasetsCreated": 0,
    "nbDatasetsFailed": 0,
    "nbSamplesFailed": 0
  },
  "createdAt": "2021-01-08T15:41:29.263Z",
  "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "crawledDatasets": [
    "Dataset 1",
    "Dataset 2",
    "Dataset 3",
    "Dataset 4"
  ]
}
{
  "id": "108fb1c2-a81a-11eb-bcbc-0242ac130002",
  "connectionId": "1db0db6c-e003-11eb-ba80-0242ac130004",
  "name": "Crawler3",
  "description": "Description du crawler 3",
  "sharings": [
    {
      "scimType": "user",
      "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
      "level": "OWNER"
    },
    {
      "scimType": "group",
      "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
      "level": "READER"
    },
    {
      "scimType": "user",
      "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
      "level": "WRITER"
    }
  ],
  "status": {
    "runStatus": "PropertiesRetrievalFailed",
    "runStartedAt": "2021-01-08T15:41:29.263Z",
    "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
    "failure": "cannot generate dataset properties",
    "nbDatasetsToCrawl": 0,
    "nbDatasetsFinished": 0,
    "nbDatasetsCreated": 0,
    "nbDatasetsFailed": 0,
    "nbSamplesFailed": 0
  },
  "createdAt": "2021-01-08T15:41:29.263Z",
  "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "crawledDatasets": [
    "Dataset 1",
    "Dataset 2",
    "Dataset 3",
    "Dataset 4"
  ]
}
{
  "id": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "connectionId": "2695204e-e003-11eb-ba80-0242ac130004",
  "name": "Crawler4",
  "description": "Description du crawler 4",
  "sharings": [
    {
      "scimType": "user",
      "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
      "level": "OWNER"
    },
    {
      "scimType": "group",
      "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
      "level": "READER"
    },
    {
      "scimType": "user",
      "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
      "level": "WRITER"
    }
  ],
  "status": {
    "runStatus": "CreatingDatasets",
    "runStartedAt": "2021-01-08T15:41:29.263Z",
    "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
    "nbDatasetsToCrawl": 0,
    "nbDatasetsFinished": 0,
    "nbDatasetsCreated": 0,
    "nbDatasetsFailed": 0,
    "nbSamplesFailed": 0
  },
  "createdAt": "2021-01-08T15:41:29.263Z",
  "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "crawledDatasets": [
    "Dataset 1",
    "Dataset 2",
    "Dataset 3",
    "Dataset 4"
  ]
}
{
  "id": "7c7f7872-a81a-11eb-bcbc-0242ac130002",
  "connectionId": "2e46ed68-e003-11eb-ba80-0242ac130004",
  "name": "Crawler5",
  "description": "Description du crawler 5",
  "sharings": [
    {
      "scimType": "user",
      "scimId": "b8a78dcb-65b4-4823-ad76-88720fc6309e",
      "level": "OWNER"
    },
    {
      "scimType": "group",
      "scimId": "877f89dc-709b-4ef1-8d0e-a851f67a065a",
      "level": "READER"
    },
    {
      "scimType": "user",
      "scimId": "bd4c7ae4-a1df-4702-845e-11946fa07d85",
      "level": "WRITER"
    }
  ],
  "status": {
    "runStatus": "Finished",
    "runStartedAt": "2021-01-08T15:41:29.263Z",
    "runBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
    "runFinishedAt": "2021-01-08T15:41:29.263Z",
    "nbDatasetsToCrawl": 0,
    "nbDatasetsFinished": 0,
    "nbDatasetsCreated": 0,
    "nbDatasetsFailed": 0,
    "nbSamplesFailed": 0
  },
  "createdAt": "2021-01-08T15:41:29.263Z",
  "createdBy": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "crawledDatasets": [
    "Dataset 1",
    "Dataset 2",
    "Dataset 3",
    "Dataset 4"
  ]
}
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 500
Internal Server Error
Body
ServerError

Run a crawler

POST /connections/crawlers/{crawlerId}/run

This endpoint allows you to start the crawler. When calling this endpoint, the crawler relies on its configuration to retrieve all the selected tables and views and turn them into datasets. Once the datasets are created, the crawler also retrieves their samples.

You can launch the crawler as many times as you want. Running a crawler once creates the datasets. Running it again only refreshes the samples of the existing datasets.

Known limitations:

Max objects limit: we recommend selecting fewer than 1000 tables/views. Beyond this limit, you may encounter issues when launching the run endpoint.

Max datasets limit: the maximum number of datasets a user can have is 1500. Beyond this limit, you may encounter timeouts when calling the dataset endpoint that lists them all for a user. Consequently, when configuring a crawler, it is important to ensure that you will not exceed this limit after the crawler has run.
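
A hedged Python sketch of starting a run and polling the crawler status; the base URL and token are placeholders, and treating Finished, Ended and PropertiesRetrievalFailed as terminal statuses is an assumption based on the RunStatus values listed below:

import time
import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "ac6e2117-fbb5-442a-bb02-cefabbf04516"  # example ID

resp = requests.post(f"{BASE_URL}/connections/crawlers/{crawler_id}/run", headers=HEADERS)
resp.raise_for_status()  # 202 Accepted: the run executes asynchronously

# Poll until the crawler reaches a status we assume to be terminal
while True:
    status = requests.get(
        f"{BASE_URL}/connections/crawlers/{crawler_id}", headers=HEADERS
    ).json()["status"]
    if status["runStatus"] in ("Finished", "Ended", "PropertiesRetrievalFailed"):
        break
    time.sleep(30)  # runs can take up to a few hours
print(status["runStatus"])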

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 202
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 500
Internal Server Error
Body
ServerError

End a crawler while it is running

POST /connections/crawlers/{crawlerId}/end

This endpoint allows you to stop a crawler while it is running. After launching a crawler, the run can take up to a few hours to complete, depending on the number of objects you selected. You may want to stop the run for many reasons, for instance if you notice that the crawler was created on the wrong connection.

Stopping a crawler does not mean cancelling it: the datasets that have already been created will not be deleted. If you want to clean them up, you can use the faceted search to retrieve the datasets created by a crawler and delete them.
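
A minimal stop sketch with Python's requests library, under the same placeholder assumptions as the previous examples:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "ac6e2117-fbb5-442a-bb02-cefabbf04516"  # example ID

resp = requests.post(f"{BASE_URL}/connections/crawlers/{crawler_id}/end", headers=HEADERS)
resp.raise_for_status()  # 202 Accepted: datasets already created are kept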

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 202
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 500
Internal Server Error
Body
ServerError

Get the error log file

GET /connections/crawlers/{crawlerId}/errors.log
This endpoint allows you to retrieve the error log file of a crawler run. Even if your crawler ends successfully, some samples may not have been fetched, so you might want to know the technical reasons by downloading the error logs.
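
For example, a minimal Python sketch of downloading the log to a local file (the file name chosen here is arbitrary; base URL and token are placeholders):

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
crawler_id = "ac6e2117-fbb5-442a-bb02-cefabbf04516"  # example ID

resp = requests.get(f"{BASE_URL}/connections/crawlers/{crawler_id}/errors.log", headers=HEADERS)
resp.raise_for_status()
print(resp.headers.get("Content-Disposition"))  # carries the suggested file name
with open("crawler-errors.log", "w", encoding="utf-8") as f:
    f.write(resp.text)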

Request

Path variables
Name Description Type Attributes and examples
crawlerId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 200
Headers
Name Description Type Attributes and examples
Content-Disposition string Required
Body
string
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized

Retrieve the result of the last scan

GET /connections/scan/{connectionId}
This endpoint allows you to retrieve the tables and views held by a connection, based on the last connection scan. It can only be used if you have previously scanned the JDBC connection by calling this endpoint with the POST method. Once the scan is done, you can call the GET endpoint to get the result. If the connection has changed (for example, new tables/views have been added), you must call a POST scan to refresh the state.
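
A minimal Python sketch of reading the last scan result; the base URL and token are placeholder assumptions:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
connection_id = "d54a8f03-7906-4930-a7cc-4eb90e968f89"  # example ID

resp = requests.get(f"{BASE_URL}/connections/scan/{connection_id}", headers=HEADERS)
resp.raise_for_status()
scan = resp.json()  # ConnectionScan
print("last scanned:", scan["lastScan"])
for obj in scan["results"]:
    print(obj["technicalName"], obj["metadata"]["type"])  # TABLE or VIEW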

Request

Path variables
Name Description Type Attributes and examples
connectionId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 200
Body
ConnectionScan
Status 200 application/json
{
  "id": "ac6e2117-fbb5-442a-bb02-cefabbf04516",
  "lastScan": "2021-01-08T15:41:29.263Z",
  "results": [
    {
      "displayName": "view1",
      "technicalName": "view1",
      "metadata": {
        "type": "VIEW"
      }
    },
    {
      "displayName": "table1",
      "technicalName": "table1",
      "metadata": {
        "type": "TABLE"
      }
    }
  ]
}
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 500
Internal Server Error
Body
ServerError

Scan a JDBC connection

POST /connections/scan/{connectionId}
This endpoint allows you to scan the content of a JDBC connection. The scan searches for all the tables and views held by this connection. However, the tables and views found by the scan are not returned in the response payload; to get them, call this endpoint with the GET method.
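
A minimal trigger sketch with Python's requests library, under the same placeholder assumptions:

import requests

BASE_URL = "https://api.example.com"  # assumed host
HEADERS = {"Authorization": "Bearer <TOKEN>"}
connection_id = "d54a8f03-7906-4930-a7cc-4eb90e968f89"  # example ID

resp = requests.post(f"{BASE_URL}/connections/scan/{connection_id}", headers=HEADERS)
resp.raise_for_status()  # 204 No Content; fetch the results with the GET method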

Request

Path variables
Name Description Type Attributes and examples
connectionId string Required
Query parameters
Name Description Type Attributes and examples
talendVersion string Optional
Headers
Name Description Type Attributes and examples
talend-version string Optional

Response

Status 204
Status 401
Not authenticated
Body
NotAuthenticated
Status 403
Not authorized
Body
NotAuthorized
Status 404
Not found
Body
NotFound
Status 500
Internal Server Error
Body
ServerError

CreateCrawlerRequest

Describes the crawler about to be created: its name, description, table and view selection, and the users and groups for the sharing policies.
Object
Name Description Type Attributes and examples
connectionId The technical Talend ID of the connection string Required
name Name of the crawler string Required
description Description of the crawler string Optional
selectedDatasets array of string Optional
Datatype details
Type Description Attributes and examples
array Names of the tables and views that we want to retrieve with the crawler
string
sharings array of Sharing Optional
Datatype details
Type Description Attributes and examples
array Sharing policies
Sharing

Sharing

This object indicates the sharing information.
Object
Name Description Type Attributes and examples
scimType Indicates USER or GROUP string Required
scimId The SCIM ID of the USER or GROUP string Required
level The level of the sharing string Required

NotAuthenticated

The user has not been authenticated.
Object
Name Description Type Attributes and examples
message string Required
cause string Optional

NotAuthorized

The user has been authenticated but doesn’t have the entitlement for this service.
Object
Name Description Type Attributes and examples
message string Required
cause string Optional

AlreadyExist

A crawler already exists for this connection. You can only have one crawler per connection.
Object
Name Description Type Attributes and examples
connectionId The technical Talend ID of the connection string Required
i18nMsg string Optional

ServerError

The server encountered an error during the processing of the request.
Object
Name Description Type Attributes and examples
message string Required
cause string Optional

CreateCrawlerResponse

Contains the response following a crawler creation request. The object contains the ID of the crawler created.
Object
Name Description Type Attributes and examples
id The technical Talend ID of the crawler that has been created string Required

NotFound

Generic response payload when an object hasn’t been found. The entityType gives more information about the object.
Object
Name Description Type Attributes and examples
entityId The technical Talend ID of the entity string Required
entityType The Talend type of the entity EntityType Required
i18nMsg string Optional

EntityType

Entity type used for the NotFound data type in order to indicate the type of the object that hasn’t been found.
string

CrawlerModel

This object represents a whole crawler. It has two formats: either complete or light.
one of CrawlerComplete , CrawlerLight

CrawlerComplete

This object represents a crawler in the complete version.
Object
Name Description Type Attributes and examples
id The technical Talend ID of the crawler string Required
connectionId The technical Talend ID of the connection string Required
name Name of the crawler string Required
description Description of the crawler string Optional
sharings array of Sharing Optional
Datatype details
Type Description Attributes and examples
array Sharing policies
Sharing
status The status of the crawler Status Required
createdAt The date when the crawler has been created datetime Required
RFC3339
createdBy Technical ID of the Talend user. string Required
crawledDatasets array of string Optional
Datatype details
Type Description Attributes and examples
array Names of the tables and views that we want to retrieve with the crawler.
string
updateAt The date when the crawler has been updated datetime Optional
RFC3339
updatedBy Technical ID of the Talend user. string Optional
deletedAt The date when the crawler has been deleted datetime Optional
RFC3339
deletedBy Technical ID of the Talend user. string Optional

Status

This object contains information about the crawler status.
Object
Name Description Type Attributes and examples
runStatus The running status of the crawler RunStatus Required
runStartedAt The date when the run has started datetime Optional
RFC3339
runBy Technical ID of the Talend user. string Optional
runFinishedAt The date when the run has finished datetime Optional
RFC3339
failure string Optional
nbDatasetsToCrawl Number of datasets to retrieve integer Required
INT32
nbDatasetsFinished Number of datasets already retrieved integer Required
INT32
nbDatasetsCreated Number of datasets that have been created integer Required
INT32
nbDatasetsFailed Number of datasets that have not been created integer Required
INT32
nbSamplesFailed Number of samples that have not been created integer Required
INT32

RunStatus

This object contains the different run statuses of the crawler.
string
RunStatus
CreatingDatasets
Ended
Finished
NotStarted
PropertiesRetrievalFailed
RetrievingProperties

CrawlerLight

This object represents a crawler in the light version (it only contains the status).
Object
Name Description Type Attributes and examples
id The technical Talend ID of the crawler string Required
name The name of the crawler string Required
connectionId The technical Talend ID of the connection string Required
runStatus The status of the crawler RunStatus Required
deletedAt The date when the crawler has been deleted datetime Optional
RFC3339

AlreadyRunning

This entity indicates that the crawler is already running and provides the crawler ID.
Object
Name Description Type Attributes and examples
crawlerId The technical Talend ID of the crawler string Required
i18nMsg string Optional

PaginatedResources_CrawlerModel

This object represents a list of crawlers in a paginated context.
Object
Name Description Type Attributes and examples
data array of CrawlerModel Optional
Datatype details
Type Description Attributes and examples
array Contains the list of the crawlers
CrawlerModel
offset Pagination offset integer Required
INT32
limit Pagination limit integer Required
INT32
total Total number of crawlers integer Required
INT32

UpdateCrawlerRequest

Describes the elements to update on the crawler, such as the name, description, and tables/views selection.
Object
Name Description Type Attributes and examples
name New name of the crawler string Required
description New description of the crawler string Optional
selectedDatasets array of string Optional
Datatype details
Type Description Attributes and examples
array Names of the tables and views that we want to retrieve with the crawler.
string

PatchCrawlerRequest

This object is used when the user wants to update the name and the description of the crawler. Those fields will replace the current ones once the update is applied.
Object
Name Description Type Attributes and examples
name New name of the crawler string Optional
description New description of the crawler string Optional

PaginatedResources_CrawledDataset

This object contains the datasets related to a crawler in a paginated context.
Object
Name Description Type Attributes and examples
data array of CrawledDataset Optional
Datatype details
Type Description Attributes and examples
array Contains the list of the objects selected in the crawler configuration
CrawledDataset
offset Pagination offset integer Required
INT32
limit Pagination limit integer Required
INT32
total Total number of crawled datasets integer Required
INT32

CrawledDataset

This object represents a dataset related to a crawler.
Object
Name Description Type Attributes and examples
id The technical ID of the object selected by the crawler string Required
crawlerId The technical Talend ID of the crawler string Required
datasetId The technical ID of the dataset generated by the crawler string Optional
displayName Name of the table or view to retrieve with the crawler string Required
technicalName Technical name of the table or view to retrieve with the crawler string Required
metadata Indicates whether it is a TABLE or a VIEW Metadata_infos Required
exportStatus Contains the status of the dataset in the crawler context ExportStatus Required
failure Indicates whether a failure was encountered while crawling this dataset string Optional
lastUpdate Date of the last time this dataset was refreshed datetime Required
RFC3339

Metadata_infos

This object contains the metadata of an entity according to the context of the call. For example, it indicates whether an object is a table or a view.
Object

ExportStatus

This object contains the different statuses of a dataset in a crawler context.
string

ConnectionScan

This object contains the information about the last scan of the connection. It allows the user to retrieve the objects found by the last scan.
Object
Name Description Type Attributes and examples
id The technical Talend ID of the connection string Required
lastScan Date of the last time this connection has been scanned datetime Required
RFC3339
results array of ScannedDataset Optional
Datatype details
Type Description Attributes and examples
array Objects of the connection retrieved by the last scan
ScannedDataset

ScannedDataset

This object contains the objects found by the scan.
Object
Name Description Type Attributes and examples
displayName Name of the table or view found by the scan string Required
technicalName Technical name of the table or view found by the scan string Required
metadata Indicates whether this is a TABLE or a VIEW Metadata_infos Required