Historical Data

The Historical Data API gives you access to point-in-time snapshots of domain data. You can list available datasets and stream the full contents of any snapshot as NDJSON.

The dataset model

Each dataset represents a point-in-time snapshot of all indexed domains.

Properties

  • label (string)
    Human-readable date for the snapshot (e.g., "2026-03-28").

  • value (string)
    Internal snapshot identifier used when requesting data (e.g., "sitedata-domains-20260328055749").


GET /json/api/v1/historical/domain

List datasets

Returns the list of available historical datasets. Each entry includes a human-readable date label and a snapshot identifier you can pass to the streaming endpoint.

Requires the historical data feature on your plan.

Request

GET /json/api/v1/historical/domain
curl -s https://websitedata.app/json/api/v1/historical/domain \
  -H "Authorization: Bearer {your-api-key}"

Response

{
  "datasets": [
    {
      "label": "2026-03-28",
      "value": "sitedata-domains-20260328055749"
    },
    {
      "label": "2026-03-21",
      "value": "sitedata-domains-20260321060132"
    },
    {
      "label": "2026-03-14",
      "value": "sitedata-domains-20260314054500"
    }
  ]
}
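
As a rough illustration of consuming this response, the Python sketch below fetches the list and picks the newest label. It uses only the standard library. The helper names `list_datasets` and `newest_label` are made up for this example, and the code assumes labels always follow the YYYY-MM-DD form shown above, so that the lexicographically largest label is also the most recent one.

```python
import json
import urllib.request

BASE_URL = "https://websitedata.app/json/api/v1/historical/domain"


def list_datasets(api_key: str) -> dict:
    """Call the list endpoint and return the decoded JSON body."""
    request = urllib.request.Request(
        BASE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def newest_label(payload: dict) -> str:
    """Return the most recent date label in a list-datasets response.

    Assumption: labels are YYYY-MM-DD strings, so string comparison
    orders them chronologically.
    """
    return max(entry["label"] for entry in payload["datasets"])
```

With a valid key, `newest_label(list_datasets(api_key))` would give a label that can be used in the streaming endpoint's path.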

GET /json/api/v1/historical/domain/:label

Stream domains

Streams the full contents of a historical dataset as newline-delimited JSON (NDJSON). Each line is a domain object serialized with the standard domain fields.

Use "latest" as the label to get the most recent snapshot, or pass a date label (e.g., "2026-03-28") from the list endpoint.

Requires the historical data feature on your plan.

Path parameters

  • label (string)
    The dataset to stream. Use "latest" for the most recent, or a date label from the list endpoint.

Query parameters

  • fields (string)
    Comma-separated list of fields to include. Defaults to name, rank, title, meta_description, country_code, language_code, categories, technologies, tld1. See available fields below.

  • limit (integer)
    Maximum number of domains to return. Omit for the full dataset.
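
To make the path and query parameters concrete, here is a small Python sketch that assembles a streaming URL. The function name `stream_url` is hypothetical. Leaving the commas in `fields` unescaped is a choice made here for readability, not something the API is documented to require.

```python
from typing import Optional, Sequence
from urllib.parse import quote, urlencode

BASE_URL = "https://websitedata.app/json/api/v1/historical/domain"


def stream_url(
    label: str = "latest",
    fields: Optional[Sequence[str]] = None,
    limit: Optional[int] = None,
) -> str:
    """Build the streaming endpoint URL for a dataset label."""
    params = {}
    if fields:
        # The API expects one comma-separated string, not repeated keys.
        params["fields"] = ",".join(fields)
    if limit is not None:
        params["limit"] = str(limit)

    url = f"{BASE_URL}/{quote(label, safe='')}"
    if params:
        # safe="," keeps the commas literal instead of encoding them as %2C.
        url += "?" + urlencode(params, safe=",")
    return url
```

For example, `stream_url("2026-03-28", ["name", "rank"], 100)` yields a URL ending in `/2026-03-28?fields=name,rank&limit=100`, while `stream_url()` falls back to the `latest` snapshot with the default fields.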

Available fields

name, rank, title, meta_description, country_code, language_code, categories, technologies, tld1, meta_keywords, canonical, icon, status, dns, first_published_at, redirects_to, contact_information

The contact_information field returns an array of objects, each with type, value, and full_value. Types include email, phone, facebook, instagram, twitter, linkedin, tiktok, youtube, pinterest, snapchat, yelp, whatsapp, and discord.

The aliases domain, tld, country, and language are also accepted and map to their canonical equivalents.
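
The shape of contact_information described above can be handled with a few lines of Python. The sketch below groups the `value` of each entry by its `type`. The helper name `contacts_by_type` and the sample record are invented for illustration and are not real API output.

```python
from collections import defaultdict


def contacts_by_type(domain: dict) -> dict[str, list[str]]:
    """Group contact_information values by their type.

    Records streamed without the contact_information field (or with an
    empty or null one) produce an empty dict.
    """
    grouped = defaultdict(list)
    for entry in domain.get("contact_information") or []:
        grouped[entry["type"]].append(entry["value"])
    return dict(grouped)


# Hypothetical record; every value here is made up for the example.
sample = {
    "name": "example.com",
    "contact_information": [
        {"type": "email", "value": "info@example.com", "full_value": "mailto:info@example.com"},
        {"type": "twitter", "value": "example", "full_value": "https://twitter.com/example"},
        {"type": "email", "value": "sales@example.com", "full_value": "mailto:sales@example.com"},
    ],
}
emails = contacts_by_type(sample).get("email", [])
```

Remember that contact_information is not among the default fields, so it has to be requested through the fields query parameter.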

Request

GET /json/api/v1/historical/domain/:label
curl -s https://websitedata.app/json/api/v1/historical/domain/latest \
  -H "Authorization: Bearer {your-api-key}"

Response (NDJSON, one object per line)

{"name":"www.youtube.com","rank":1,"title":"YouTube","meta_description":"Enjoy the videos and music...","country_code":"US","language_code":"en","categories":["/Arts & Entertainment"],"technologies":["Google Tag Manager"],"tld1":"youtube.com"}
{"name":"www.wikipedia.org","rank":2,"title":"Wikipedia","meta_description":"Wikipedia is a free...","country_code":"US","language_code":"en","categories":["/Reference"],"technologies":[],"tld1":"wikipedia.org"}
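
Because a full snapshot can be large, it is usually better to decode the stream one line at a time than to load the whole body into memory. The Python sketch below shows one way to do that with the standard library. The names `iter_ndjson` and `stream_domains` are hypothetical, and skipping blank lines is a defensive assumption rather than documented behavior.

```python
import json
import urllib.request
from typing import Iterable, Iterator

BASE_URL = "https://websitedata.app/json/api/v1/historical/domain"


def iter_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Decode an iterable of NDJSON lines into dicts, one per line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        yield json.loads(line)


def stream_domains(api_key: str, label: str = "latest") -> Iterator[dict]:
    """Yield domain objects from a snapshot without buffering the full body."""
    request = urllib.request.Request(
        f"{BASE_URL}/{label}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        # The HTTP response object iterates line by line as bytes.
        yield from iter_ndjson(response)
```

A caller could then loop with `for domain in stream_domains(api_key): ...` and stop early if only the first few records are needed.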
