This series describes the technical implementation steps to generate and retrieve a dataset. The dataset generation process is a series of steps using Data API endpoints. Once generated, the dataset can be downloaded, or used with Reports API to render a completed report, depending on the dataset type.
A dataset is a set of data files (sometimes many thousands of files) containing aggregated results and analysis in JSON and CSV formats. The exact contents and format of a dataset will vary depending on the dataset type. The dataset is generated asynchronously using Data API. Once completed, the dataset can be retrieved via Data API or rendered in-browser using Reports API, if the dataset type supports it.
Dataset generation is performed asynchronously as follows:
1. Initialize a new dataset using Data API: Create and configure a new dataset. The endpoint returns a dataset_id and one or more URLs to which input files need to be uploaded. If input files are not required to be uploaded, the dataset_type and a job_reference are returned, and you can proceed directly to step 3 below.
2. Upload input files and start the job:
a. Upload input files: Upload the input files required for the dataset. Each file is uploaded to a URL returned in step 1.
b. Commence dataset generation job using Data API: Notify Data API that the input files have been uploaded, which starts the dataset generation job.
3. Poll for job completion using Data API: The status of the dataset generation job can be obtained by polling the
4. Retrieve results via Data API or Reports API (if applicable): On completion, raw report data can be retrieved by your server application using Data API. If the dataset type has a corresponding report type in Reports API, the report can render the new dataset, and raw data can also be accessed from Reports API's public methods.
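The control flow of steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the actual Data API client: the `DataApiClient` method names, the `upload_urls` response key, and the `"completed"` and `"failed"` status values are all assumptions made for the example, and the real endpoint names and response fields should be taken from the Data API reference. Step 4 (retrieval) is left to the caller.

```python
import time
from typing import Callable, Mapping, Protocol, Sequence


class DataApiClient(Protocol):
    """Hypothetical wrapper around Data API; all method names are illustrative."""

    def init_dataset(self, config: Mapping) -> dict: ...
    def upload(self, url: str, payload: bytes) -> None: ...
    def start_job(self, dataset_id: str) -> str: ...
    def job_status(self, job_reference: str) -> str: ...


def generate_dataset(
    client: DataApiClient,
    config: Mapping,
    input_files: Sequence[bytes] = (),
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run steps 1 to 3 and return the job reference once the job completes."""
    # Step 1: initialize the dataset.
    init = client.init_dataset(config)
    upload_urls = init.get("upload_urls", [])  # assumed response key

    if upload_urls:
        if len(upload_urls) != len(input_files):
            raise ValueError("expected one input file per upload URL")
        # Step 2a: upload each input file to its URL.
        for url, payload in zip(upload_urls, input_files):
            client.upload(url, payload)
        # Step 2b: tell Data API the uploads are done, which starts the job.
        job_reference = client.start_job(init["dataset_id"])
    else:
        # No uploads required: step 1 already returned a job reference.
        job_reference = init["job_reference"]

    # Step 3: poll until the job finishes, fails, or the timeout expires.
    deadline = time.monotonic() + timeout
    while True:
        status = client.job_status(job_reference)
        if status == "completed":  # assumed status value
            return job_reference
        if status == "failed":  # assumed status value
            raise RuntimeError(f"dataset job {job_reference} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"dataset job {job_reference} did not finish in time")
        sleep(poll_interval)
```

Passing the client and the `sleep` function in as parameters keeps the sketch independent of any particular HTTP library and makes the polling loop easy to exercise with a stub.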