DocShark

DocShark is a fast, local-first Model Context Protocol (MCP) server that scrapes, indexes, and serves documentation from any website. One package, unlimited doc sources, zero cloud dependency.

Context

Why this exists

Teams often end up with many disconnected MCP integrations. DocShark centralizes discovery, indexing, and retrieval so agents can work from a single source of truth.

Why DocShark?

AI coding assistants often hallucinate API details when they cannot consult the real documentation. Existing solutions each leave a gap:

| Solution | What it does | Gap |
| --- | --- | --- |
| Context7 | Pulls docs from GitHub repos | Gets source code, not rendered docs. Closed-source. |
| docs-mcp-server | Scrapes websites, indexes locally | Heavy deps (Playwright, LangChain, Node 22+). |
| Per-library MCPs | Serve one library each | One MCP per library doesn’t scale. |

DocShark fills the gap: a single, general-purpose MCP that fetches docs from any website, stores them locally with zero-config search, and exposes them via the Model Context Protocol — all in one Bun-installed package.

Single server model

Replace many per-library MCP servers with one local index and one tool surface.

Agent-first outputs

Results are chunked and formatted so coding assistants can act on them immediately.

Rendered docs support

Targets real documentation pages instead of only repository markdown.

Lightweight runtime

Bun + SQLite + FTS5 keeps the stack compact without cloud dependencies.
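
Because search runs on FTS5, free-form user input eventually has to become an FTS5 `MATCH` expression. The sketch below shows one common way to do that safely: wrap each whitespace-separated term in double quotes (doubling any embedded quote), so punctuation in a query cannot be misread as FTS5 operators. The function name `toFts5Match` is hypothetical, and this is an assumption about how such a step could look, not a description of DocShark's actual query handling.

```typescript
// Hypothetical helper: not part of DocShark's documented API.
// Turns free-form input into an FTS5 MATCH expression made of quoted terms.
// In FTS5, space-separated quoted strings are combined with an implicit AND.
function toFts5Match(userQuery: string): string {
  return userQuery
    .split(/\s+/)
    .filter((term) => term.length > 0)
    // FTS5 strings escape an embedded double quote by doubling it.
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(" ");
}

// The result would be bound as a parameter, for example:
//   SELECT ... FROM some_fts_table WHERE some_fts_table MATCH ?
// (table name is illustrative only)
console.log(toFts5Match("runes reactivity"));
```

Quoting every term trades away FTS5's advanced operators (prefix `*`, `NEAR`, `OR`) for predictability, which is often a reasonable default when queries come from an agent.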

Key Features

  • Any docs site — Point DocShark at any URL and it crawls, extracts, and indexes the content
  • SQLite + FTS5 — Blazing-fast full-text search with zero external dependencies
  • Local-first — Everything runs on your machine. No cloud, no API keys, no rate limits
  • Smart chunking — Heading-aware splitting preserves context and hierarchy
  • MCP + CLI — Full MCP server for AI assistants, plus a powerful command-line interface
  • Lightweight — ~375KB core footprint. No LangChain, no heavy browser deps by default
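
To make the "heading-aware splitting" idea concrete, here is a minimal sketch of one way it could work: walk a Markdown document line by line, keep a stack of the headings currently in scope, and attach that heading path to every chunk. The `Chunk` type and `chunkByHeadings` function are hypothetical names used for illustration; DocShark's real chunker may differ (for example, by also enforcing size limits).

```typescript
// Hypothetical types and names: an illustration, not DocShark's actual code.
interface Chunk {
  headingPath: string[]; // headings in scope, outermost first
  content: string;       // body text under the innermost heading
}

function chunkByHeadings(markdown: string): Chunk[] {
  const FENCE = "`".repeat(3); // code-fence marker, built to avoid a literal fence here
  const chunks: Chunk[] = [];
  const stack: { level: number; title: string }[] = [];
  let buffer: string[] = [];
  let inFence = false;

  // Emit the buffered text as one chunk tagged with the current heading path.
  const flush = (): void => {
    const content = buffer.join("\n").trim();
    if (content.length > 0) {
      chunks.push({ headingPath: stack.map((h) => h.title), content });
    }
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    // Lines inside fenced code are never treated as headings.
    if (line.trimStart().startsWith(FENCE)) inFence = !inFence;
    const match = inFence ? null : /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (match) {
      flush();
      const level = match[1].length;
      // Drop headings at the same or a deeper level before entering the new one.
      while (stack.length > 0 && stack[stack.length - 1].level >= level) {
        stack.pop();
      }
      stack.push({ level, title: match[2] });
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}

const sample = ["# Guide", "Intro text.", "## Install", "Run the installer."].join("\n");
for (const chunk of chunkByHeadings(sample)) {
  console.log(chunk.headingPath.join(" > "), "|", chunk.content);
}
```

Carrying the full heading path with each chunk means a search hit can be shown with its position in the page hierarchy, even when the chunk text itself never repeats the section title.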

Quick Start

# Add documentation from any site
bunx docshark add https://svelte.dev/docs

# Search across your indexed docs
bunx docshark search "runes reactivity"

# Start the MCP server
bunx docshark start --stdio
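
An MCP client needs to know how to launch the stdio server started by the last command. Many clients read a JSON file containing an `mcpServers` map of launch commands; the sketch below generates such an entry from the command shown above. The `mcpServers` layout, the `docshark` key, and the function name are assumptions for illustration, so check your client's documentation for the exact file location and schema.

```typescript
// Assumption: the client accepts the widely used "mcpServers" JSON layout.
// The entry name "docshark" and this helper are hypothetical.
interface StdioServerEntry {
  command: string;
  args: string[];
}

function buildDocsharkClientConfig(): { mcpServers: Record<string, StdioServerEntry> } {
  return {
    mcpServers: {
      docshark: {
        // Mirrors the Quick Start command: bunx docshark start --stdio
        command: "bunx",
        args: ["docshark", "start", "--stdio"],
      },
    },
  };
}

// Print the JSON to paste into the client's MCP configuration file.
console.log(JSON.stringify(buildDocsharkClientConfig(), null, 2));
```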

Tech Stack

| Component | Technology |
| --- | --- |
| Runtime | Bun |
| MCP Server | TMCP |
| Database | SQLite (bun:sqlite) + FTS5 |
| Content Extraction | Readability.js + linkedom |
| Markdown Conversion | Turndown + GFM plugin |
| Validation | Valibot |
| CLI | cac |

© 2026 DocShark