html-to-markdown

Convert HTML to Markdown, Djot, or plain text. One Rust core, 12 language bindings, identical output on every runtime. Part of the kreuzberg.dev document intelligence ecosystem.


Why html-to-markdown

Rust core

Single-pass DOM walk written in Rust. The same code path runs from Python, the browser, and the CLI, so there is no per-language conversion logic.

12 bindings

Rust, Python, TypeScript, Go, Ruby, PHP, Java, C#, Elixir, R, C, and WebAssembly. Options map one-to-one across languages: each option has exactly one counterpart in every binding.

Three output formats

Markdown (CommonMark) by default, plus Djot and plain text via `output_format`. The same options apply to every format.

Metadata extraction

Extracts the document title, Open Graph, Twitter Card, JSON-LD, links, and images in one pass. Enabled by default; disable with `extract_metadata: false`.
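To make "one pass" concrete, here is a minimal Python sketch that collects a title, Open Graph tags, and links while reading the document once. It uses only the standard library `html.parser`; the `MetadataCollector` class and `collect_metadata` function are hypothetical names for illustration and are not part of the html-to-markdown API.

```python
from html.parser import HTMLParser


class MetadataCollector(HTMLParser):
    """Hypothetical helper: gathers title, Open Graph tags, and links in one pass."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.open_graph = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("property") or "").startswith("og:"):
            # Open Graph tags look like <meta property="og:title" content="...">
            self.open_graph[attrs["property"]] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def collect_metadata(html):
    """Feed the document through the parser once and return what was found."""
    parser = MetadataCollector()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "open_graph": parser.open_graph,
        "links": parser.links,
    }
```

The library itself also covers Twitter Card, JSON-LD, and images; this sketch leaves those out to stay short.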

Table extraction

Extracts HTML tables into `result.tables` with structured cells, row/column spans, and header flags, alongside the rendered Markdown.
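As a rough picture of what "structured cells, row/column spans, and header flags" could look like, the following Python sketch parses tables into rows of cell records. The `Cell` fields, `TableCollector`, and `extract_tables` are assumptions made for this example; the actual shape of `result.tables` is defined by the library's API reference, not by this code.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Cell:
    # Hypothetical cell record; field names are illustrative only.
    text: str = ""
    is_header: bool = False
    row_span: int = 1
    col_span: int = 1


class TableCollector(HTMLParser):
    """Hypothetical helper: turns each <table> into a list of rows of Cell objects.

    Nested tables and malformed span values are not handled in this sketch.
    """

    def __init__(self):
        super().__init__()
        self.tables = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._cell = Cell(
                is_header=(tag == "th"),
                row_span=int(attrs.get("rowspan") or 1),
                col_span=int(attrs.get("colspan") or 1),
            )
            self.tables[-1][-1].append(self._cell)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None:
                self._cell.text = self._cell.text.strip()
            self._cell = None

    def handle_data(self, data):
        # Only text inside an open cell is recorded.
        if self._cell is not None:
            self._cell.text += data


def extract_tables(html):
    """Return every table in the document as rows of Cell objects."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Keeping spans and header flags on each cell lets a caller rebuild the grid or export it to another format without re-parsing the Markdown output.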

Visitor pattern

42 element-level callbacks on the `HtmlVisitor` trait to skip, replace, or preserve any node. Zero cost when unused.
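The skip, replace, and preserve outcomes can be illustrated with a small Python sketch over a hand-built node tree. Everything here (`Action`, `Replace`, `Node`, `Visitor`, `render`, and the single `visit_element` callback) is a hypothetical simplification; the real `HtmlVisitor` trait exposes many element-specific callbacks whose names and signatures are given in the API reference.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from html import escape


class Action(Enum):
    CONTINUE = auto()   # use the default Markdown rendering
    SKIP = auto()       # drop the node and its children
    PRESERVE = auto()   # keep the node as raw HTML


@dataclass
class Replace:
    markdown: str       # custom output that replaces the node


@dataclass
class Node:
    tag: str                                        # "#text" marks a text node
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    text: str = ""


class Visitor:
    """Hypothetical base visitor: the default keeps normal rendering."""

    def visit_element(self, node):
        return Action.CONTINUE


def to_html(node):
    """Serialize a node back to HTML for the PRESERVE case."""
    if node.tag == "#text":
        return escape(node.text, quote=False)
    attrs = "".join(f' {k}="{escape(v)}"' for k, v in node.attrs.items())
    inner = "".join(to_html(c) for c in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


def render(node, visitor):
    """Walk the tree once, asking the visitor before rendering each element."""
    if node.tag == "#text":
        return node.text
    decision = visitor.visit_element(node)
    if isinstance(decision, Replace):
        return decision.markdown
    if decision is Action.SKIP:
        return ""
    if decision is Action.PRESERVE:
        return to_html(node)
    inner = "".join(render(c, visitor) for c in node.children)
    if node.tag == "a":
        return f"[{inner}]({node.attrs.get('href', '')})"
    if node.tag == "strong":
        return f"**{inner}**"
    return inner  # unknown elements fall through to their text content


class ExampleVisitor(Visitor):
    """Drops scripts, keeps <video> as HTML, and hides mailto links."""

    def visit_element(self, node):
        if node.tag == "script":
            return Action.SKIP
        if node.tag == "video":
            return Action.PRESERVE
        if node.tag == "a" and node.attrs.get("href", "").startswith("mailto:"):
            return Replace("(email hidden)")
        return Action.CONTINUE
```

Because the visitor is consulted before children are rendered, a skipped or replaced subtree is never walked, which is one way a design like this keeps unused callbacks cheap.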

Language Support

| Language | Install | API Reference |
| --- | --- | --- |
| Rust | `cargo add html-to-markdown-rs` | Reference |
| Python | `pip install html-to-markdown` | Reference |
| TypeScript / Node | `npm install @kreuzberg/html-to-markdown` | Reference |
| Go | `go get github.com/kreuzberg-dev/html-to-markdown/packages/go/v3` | Reference |
| Ruby | `gem install html-to-markdown` | Reference |
| PHP | `composer require kreuzberg-dev/html-to-markdown` | Reference |
| Java | Maven `dev.kreuzberg:html-to-markdown` | Reference |
| C# | `dotnet add package KreuzbergDev.HtmlToMarkdown` | Reference |
| Elixir | `{:html_to_markdown, "~> 3.4"}` | Reference |
| R | `install.packages("htmltomarkdown")` | Reference |
| C (FFI) | Shared library + header | Reference |
| WebAssembly | `npm install @kreuzberg/html-to-markdown-wasm` | Reference |
| CLI | `cargo install html-to-markdown-cli` | CLI Guide |