API reference
Extract clean agent-ready context from public URLs.
Sinyx returns agent-ready context, Markdown, text, HTML, full metadata, or semantic chunks from a single extraction pipeline so AI workflows can use public web pages with source and quality signals.
Authentication
The API is distributed through RapidAPI. Send your RapidAPI key and host headers with each request.
x-rapidapi-host: sinyx.p.rapidapi.com
x-rapidapi-key: YOUR_KEY
Single extraction
https://sinyx.p.rapidapi.com/v1/extract
| Field | Type | Required | Description |
|---|---|---|---|
url |
String | Yes | Public HTTP or HTTPS URL to extract. |
format |
String | No | full, markdown, text, html, chunks, or context. |
selector |
String | No | Optional CSS selector for targeted extraction. |
curl --request POST \
--url https://sinyx.p.rapidapi.com/v1/extract \
--header 'Content-Type: application/json' \
--header 'x-rapidapi-host: sinyx.p.rapidapi.com' \
--header 'x-rapidapi-key: YOUR_KEY' \
--data '{
"url": "https://example.com/article",
"format": "context"
}'
Agent quickstart
For an AI agent, request format: "context" and pass the source, freshness, quality, top chunks, and Markdown into your prompt or retrieval step.
const response = await fetch("https://sinyx.p.rapidapi.com/v1/extract", {
method: "POST",
headers: {
"Content-Type": "application/json",
"x-rapidapi-host": "sinyx.p.rapidapi.com",
"x-rapidapi-key": process.env.RAPIDAPI_KEY
},
body: JSON.stringify({
url: "https://example.com/article",
format: "context"
})
});
const page = await response.json();
const agentContext = {
source: page.citation,
freshness: page.freshness,
quality: page.quality,
topChunks: page.chunks.slice(0, 3),
markdown: page.markdown
};
Batch extraction
https://sinyx.p.rapidapi.com/v1/extract/batch
Process up to five URLs concurrently. Individual URL failures are returned alongside successful results.
curl --request POST \
--url https://sinyx.p.rapidapi.com/v1/extract/batch \
--header 'Content-Type: application/json' \
--header 'x-rapidapi-host: sinyx.p.rapidapi.com' \
--header 'x-rapidapi-key: YOUR_KEY' \
--data '{
"urls": ["https://apple.com", "https://github.com"],
"format": "context"
}'
Chunks schema
Use format: "chunks" when you want semantic blocks ready for retrieval or vector storage.
| Field | Type | Description |
|---|---|---|
chunkIndex |
Number | Sequential position in the extracted document. |
headingPath |
String | Breadcrumb path for the content block. |
content |
String | Plain text content for the chunk. |
priority |
Number | Relevance score from 0.0 to 1.0. |
Context format
Use format: "context" when an agent needs source-aware context instead of a single raw representation.
{
"url": "https://example.com/article",
"format": "context"
}
| Signal | Why it matters |
|---|---|
citation |
Lets agents cite source title, domain, author, fetched time, and canonical URL. |
freshness |
Helps agents decide whether a page is current enough for the task. |
quality |
Scores whether the extracted content is dense, useful, and low-noise. |
context |
Returns Markdown, chunks, citation, freshness, and quality in one agent-ready response. |
{
"success": true,
"url": "https://example.com/article",
"title": "Example Article",
"markdown": "# Example Article\n\nClean extracted content...",
"chunks": [
{
"chunkIndex": 0,
"headingPath": "Root > Overview",
"priority": 0.7,
"content": "Clean extracted content for agent workflows."
}
],
"citation": {
"title": "Example Article",
"sourceDomain": "example.com",
"canonicalUrl": "https://example.com/article",
"fetchedAt": "2026-05-01T12:00:00.000Z"
},
"freshness": {
"fetchedAt": "2026-05-01T12:00:00.000Z",
"ageDays": 11,
"isRecent": true
},
"quality": {
"score": 0.88,
"pageType": "article",
"hasTables": true
}
}
MCP setup
Use Sinyx from Claude Code, Codex, Cursor, Windsurf, and other MCP clients with the published sinyx-mcp package.