Parse HTML API
Extract structured data from raw HTML or a URL by specifying the fields you want to retrieve.
Flow
- Submit HTML or URL: POST /parse_html/
- Poll for result: GET /parse_html/{id}/
Submit HTML for Parsing
POST /api/v1/parse_html/
Submit raw HTML content or a URL along with the list of fields to extract.
Authentication
Requires an API key, sent in the X-Api-Key header.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| html | string | No | Raw HTML content to parse |
| url | string | No | URL of the page to parse |
| fields | string[] | No | List of field names to extract |
Provide either html or url.
{
  "url": "https://example.com/page",
  "fields": ["title", "price", "description"]
}
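The either/or rule for html and url can be checked on the client before a request is sent. The helper below is a minimal sketch: build_parse_payload is a hypothetical name, not part of the API, and it assumes that "either" means exactly one of the two fields should be present.

```python
from typing import Optional, Sequence


def build_parse_payload(
    fields: Sequence[str],
    html: Optional[str] = None,
    url: Optional[str] = None,
) -> dict:
    """Build a request body for POST /api/v1/parse_html/.

    Hypothetical helper: assumes exactly one of html or url must be set.
    """
    # Reject both "neither given" and "both given".
    if (html is None) == (url is None):
        raise ValueError("provide exactly one of html or url")
    payload = {"fields": list(fields)}
    if html is not None:
        payload["html"] = html
    else:
        payload["url"] = url
    return payload
```

Calling build_parse_payload(["title", "price", "description"], url="https://example.com/page") produces the same body as the JSON example above.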
Response
200 OK
Returns the task identifier to poll for results.
Get Parse Result
GET /api/v1/parse_html/{id}/
Retrieve the parsing result for a previously submitted task.
Authentication
Public endpoint — no authentication required.
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| id | string | Task identifier returned by the POST request |
Response
200 OK
Example
import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://detect.expert/api/v1"
headers = {"X-Api-Key": API_KEY}

# Submit a URL for parsing
payload = {
    "url": "https://example.com/page",
    "fields": ["title", "price", "description"],
}
response = requests.post(f"{BASE_URL}/parse_html/", json=payload, headers=headers, timeout=30)
response.raise_for_status()  # stop early on an HTTP error
task_id = response.json()["id"]

# Poll for the result (the GET endpoint does not need the API key)
while True:
    result = requests.get(f"{BASE_URL}/parse_html/{task_id}/", timeout=30)
    result.raise_for_status()
    data = result.json()
    if data.get("status") == "completed":
        print(data)
        break
    time.sleep(3)
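
The loop above polls indefinitely. A bounded variant could look like the sketch below. poll_until_complete and its parameters are hypothetical names; the "status" field and the "completed" value are taken from the example above, and since no other status values are documented here, the helper simply keeps waiting until its deadline. The fetch callable is injected so the logic can be exercised without network access.

```python
import time
from typing import Callable


def poll_until_complete(
    fetch: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it reports status "completed" or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch()
        if data.get("status") == "completed":
            return data
        # Give up if another wait would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("parse task did not complete in time")
        sleep(interval)
```

With requests, fetch could be something like lambda: requests.get(f"{BASE_URL}/parse_html/{task_id}/", timeout=30).json().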