
Parse HTML API

Extract structured data from raw HTML or a URL by specifying the fields you want to retrieve.

Flow

  1. Submit HTML or a URL: POST /api/v1/parse_html/
  2. Poll for the result: GET /api/v1/parse_html/{id}/

Submit HTML for Parsing

POST /api/v1/parse_html/

Submit raw HTML content or a URL along with the list of fields to extract.

Authentication

Requires an API key, sent in the X-Api-Key header.

Request Body

Field   Type      Required  Description
html    string    No        Raw HTML content to parse
url     string    No        URL of the page to parse
fields  string[]  No        List of field names to extract

Provide either html or url.

{
  "url": "https://example.com/page",
  "fields": ["title", "price", "description"]
}
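
The request body rules above can be checked on the client before sending. The following is a minimal sketch, not part of the API: `build_parse_payload` is a hypothetical helper name, and it assumes "provide either html or url" means exactly one of the two should be set.

```python
from typing import List, Optional


def build_parse_payload(
    fields: List[str],
    html: Optional[str] = None,
    url: Optional[str] = None,
) -> dict:
    """Build a JSON body for POST /api/v1/parse_html/.

    Hypothetical helper. Assumes exactly one of html/url must be given;
    the API itself may be more lenient.
    """
    # Reject the case where both or neither source is supplied.
    if (html is None) == (url is None):
        raise ValueError("Provide exactly one of 'html' or 'url'.")

    payload = {"fields": list(fields)}
    if html is not None:
        payload["html"] = html
    else:
        payload["url"] = url
    return payload
```

Calling `build_parse_payload(["title", "price"], url="https://example.com/page")` produces a body with the same shape as the JSON example above.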

Response

200 OK

Returns the task identifier to poll for results.


Get Parse Result

GET /api/v1/parse_html/{id}/

Retrieve the parsing result for a previously submitted task.

Authentication

Public endpoint — no authentication required.

Path Parameters

Parameter  Type    Description
id         string  Task identifier returned by the POST request

Response

200 OK

Example

import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://detect.expert/api/v1"
headers = {"X-Api-Key": API_KEY}

# Submit a URL (or raw HTML via the "html" key) for parsing
payload = {
    "url": "https://example.com/page",
    "fields": ["title", "price", "description"],
}
response = requests.post(f"{BASE_URL}/parse_html/", json=payload, headers=headers, timeout=30)
response.raise_for_status()  # stop early on HTTP errors
task = response.json()
task_id = task["id"]

# Poll for the result; the GET endpoint is public, so no API key header is sent
while True:
    result = requests.get(f"{BASE_URL}/parse_html/{task_id}/", timeout=30)
    result.raise_for_status()
    data = result.json()
    if data.get("status") == "completed":
        print(data)
        break
    time.sleep(3)  # wait before the next poll
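
The example loops until the task reports "completed", so a task that never completes would keep it polling. A bounded variant could look like the sketch below. `wait_for_result` and its parameters are hypothetical names, and the only status value taken from this page is "completed"; any other status is treated as "not ready yet".

```python
import time
from typing import Callable


def wait_for_result(
    fetch: Callable[[], dict],
    timeout: float = 60.0,
    interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it reports status "completed" or the timeout elapses.

    fetch is any zero-argument callable returning the decoded JSON of
    GET /api/v1/parse_html/{id}/ (hypothetical wiring, supplied by the caller).
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch()
        if data.get("status") == "completed":
            return data
        # Give up if another wait would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("parse task did not complete in time")
        sleep(interval)
```

With the variables from the example, this might be called as `wait_for_result(lambda: requests.get(f"{BASE_URL}/parse_html/{task_id}/", timeout=30).json())`. Passing the fetch step in as a callable keeps the helper free of network code, which makes it easy to exercise with canned responses.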