API reference

Extract clean agent-ready context from public URLs.

Sinyx returns agent-ready context, Markdown, text, HTML, full metadata, or semantic chunks from a single extraction pipeline so AI workflows can use public web pages with source and quality signals.

Authentication

The API is distributed through RapidAPI. Send your RapidAPI key and host headers with each request.

x-rapidapi-host: sinyx.p.rapidapi.com
x-rapidapi-key: YOUR_KEY

Single extraction

POST https://sinyx.p.rapidapi.com/v1/extract

Field	Type	Required	Description
`url`	String	Yes	Public HTTP or HTTPS URL to extract.
`format`	String	No	`full`, `markdown`, `text`, `html`, `chunks`, or `context`.
`selector`	String	No	Optional CSS selector for targeted extraction.

curl --request POST \
  --url https://sinyx.p.rapidapi.com/v1/extract \
  --header 'Content-Type: application/json' \
  --header 'x-rapidapi-host: sinyx.p.rapidapi.com' \
  --header 'x-rapidapi-key: YOUR_KEY' \
  --data '{
    "url": "https://example.com/article",
    "format": "context"
  }'

Agent quickstart

For an AI agent, request format: "context" and pass the source, freshness, quality, top chunks, and Markdown into your prompt or retrieval step.

const response = await fetch("https://sinyx.p.rapidapi.com/v1/extract", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-rapidapi-host": "sinyx.p.rapidapi.com",
    "x-rapidapi-key": process.env.RAPIDAPI_KEY
  },
  body: JSON.stringify({
    url: "https://example.com/article",
    format: "context"
  })
});

const page = await response.json();
const agentContext = {
  source: page.citation,
  freshness: page.freshness,
  quality: page.quality,
  topChunks: page.chunks.slice(0, 3),
  markdown: page.markdown
};

Batch extraction

POST https://sinyx.p.rapidapi.com/v1/extract/batch

Process up to five URLs concurrently. Individual URL failures are returned alongside successful results.

curl --request POST \
  --url https://sinyx.p.rapidapi.com/v1/extract/batch \
  --header 'Content-Type: application/json' \
  --header 'x-rapidapi-host: sinyx.p.rapidapi.com' \
  --header 'x-rapidapi-key: YOUR_KEY' \
  --data '{
    "urls": ["https://apple.com", "https://github.com"],
    "format": "context"
  }'

Chunks schema

Use format: "chunks" when you want semantic blocks ready for retrieval or vector storage.

Field	Type	Description
`chunkIndex`	Number	Sequential position in the extracted document.
`headingPath`	String	Breadcrumb path for the content block.
`content`	String	Plain text content for the chunk.
`priority`	Number	Relevance score from `0.0` to `1.0`.

Context format

Use format: "context" when an agent needs source-aware context instead of a single raw representation.

{
  "url": "https://example.com/article",
  "format": "context"
}

Signal	Why it matters
`citation`	Lets agents cite source title, domain, author, fetched time, and canonical URL.
`freshness`	Helps agents decide whether a page is current enough for the task.
`quality`	Scores whether the extracted content is dense, useful, and low-noise.
`context`	Returns Markdown, chunks, citation, freshness, and quality in one agent-ready response.

{
  "success": true,
  "url": "https://example.com/article",
  "title": "Example Article",
  "markdown": "# Example Article\n\nClean extracted content...",
  "chunks": [
    {
      "chunkIndex": 0,
      "headingPath": "Root > Overview",
      "priority": 0.7,
      "content": "Clean extracted content for agent workflows."
    }
  ],
  "citation": {
    "title": "Example Article",
    "sourceDomain": "example.com",
    "canonicalUrl": "https://example.com/article",
    "fetchedAt": "2026-05-01T12:00:00.000Z"
  },
  "freshness": {
    "fetchedAt": "2026-05-01T12:00:00.000Z",
    "ageDays": 11,
    "isRecent": true
  },
  "quality": {
    "score": 0.88,
    "pageType": "article",
    "hasTables": true
  }
}

MCP setup

Use Sinyx from Claude Code, Codex, Cursor, Windsurf, and other MCP clients with the published sinyx-mcp package.

Open MCP guide View npm package