Plugin system (visitors)
The visitor system is the library's extensibility point. Implement HtmlVisitor and you can replace, skip, or augment how any HTML element becomes Markdown. No fork required.
Rust users opt in by enabling the visitor Cargo feature: features = ["visitor"]. Bindings expose the same hooks through their native idiom (an anonymous class in PHP, an instance with handle_* callbacks in Elixir, a Python class with visit_* methods, and so on) and link against a Rust core built with the feature enabled.
The trait
pub trait HtmlVisitor: std::fmt::Debug {
    fn visit_text(&mut self, ctx: &NodeContext, text: &str) -> VisitResult { VisitResult::Continue }
    fn visit_element_start(&mut self, ctx: &NodeContext) -> VisitResult { VisitResult::Continue }
    fn visit_element_end(&mut self, ctx: &NodeContext, output: &str) -> VisitResult { VisitResult::Continue }
    fn visit_link(&mut self, ctx: &NodeContext, href: &str, text: &str, title: Option<&str>) -> VisitResult { VisitResult::Continue }
    fn visit_image(&mut self, ctx: &NodeContext, src: &str, alt: &str, title: Option<&str>) -> VisitResult { VisitResult::Continue }
    fn visit_heading(&mut self, ctx: &NodeContext, level: u32, text: &str, id: Option<&str>) -> VisitResult { VisitResult::Continue }
    // … 36 more element-specific methods, all with `Continue` defaults
}
42 methods in total: text and element pre/post hooks plus a method per HTML element family (links, images, headings, lists, code, tables, definition lists, …). Override only the methods you need.
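As a rough illustration of "override only the methods you need", the sketch below declares a cut-down stand-in for the trait (three methods instead of 42) so that it compiles without the crate. The NodeContext field, the two-variant VisitResult, and the LinkCollector type are assumptions made for this example; with the real crate you would import the library's types instead of defining them.

```rust
// Stand-ins so the sketch compiles on its own; the real crate provides these.
#[derive(Debug, Default)]
pub struct NodeContext {
    pub tag_name: String, // hypothetical field, for illustration only
}

// Reduced to the two variants this example uses.
#[derive(Debug, PartialEq)]
pub enum VisitResult {
    Continue,
    Skip,
}

// Cut-down trait: every method has a `Continue` default, as in the real one.
pub trait HtmlVisitor: std::fmt::Debug {
    fn visit_text(&mut self, _ctx: &NodeContext, _text: &str) -> VisitResult {
        VisitResult::Continue
    }
    fn visit_link(&mut self, _ctx: &NodeContext, _href: &str, _text: &str, _title: Option<&str>) -> VisitResult {
        VisitResult::Continue
    }
    fn visit_image(&mut self, _ctx: &NodeContext, _src: &str, _alt: &str, _title: Option<&str>) -> VisitResult {
        VisitResult::Continue
    }
}

/// Hypothetical visitor: records every link target and drops images.
#[derive(Debug, Default)]
pub struct LinkCollector {
    pub hrefs: Vec<String>,
}

impl HtmlVisitor for LinkCollector {
    fn visit_link(&mut self, _ctx: &NodeContext, href: &str, _text: &str, _title: Option<&str>) -> VisitResult {
        self.hrefs.push(href.to_owned());
        VisitResult::Continue // keep the default Markdown link
    }

    fn visit_image(&mut self, _ctx: &NodeContext, _src: &str, _alt: &str, _title: Option<&str>) -> VisitResult {
        VisitResult::Skip // drop the image entirely
    }

    // visit_text is not overridden, so it keeps the `Continue` default.
}
```

Only two methods are written out; the remaining hook falls through to the trait default, which is the same mechanism that keeps a real visitor small.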
VisitResult
Every callback returns a VisitResult. There are five variants:
| Variant | Effect |
|---|---|
| Continue | Use the default rendering. Default for every method. |
| Custom(String) | Replace the default output with the supplied string. The visitor owns this subtree. |
| Skip | Drop the element and all of its children. |
| PreserveHtml | Emit the raw HTML for this element verbatim, without conversion. |
| Error(String) | Halt conversion. Surfaces as ConversionError::Visitor in Rust. |
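To make the five effects concrete, here is a minimal sketch of how a converter might act on a returned value. The enum is a local stand-in that mirrors the variants in the table, and resolve is a hypothetical helper, not a function from the library; the real converter's internals may look quite different.

```rust
/// Local stand-in mirroring the five variants described in the table.
#[derive(Debug, Clone, PartialEq)]
pub enum VisitResult {
    Continue,
    Custom(String),
    Skip,
    PreserveHtml,
    Error(String),
}

/// Hypothetical helper: decide what text one element contributes to the output.
/// `default_md` is what the converter would have produced on its own;
/// `raw_html` is the element's original markup.
pub fn resolve(result: VisitResult, default_md: &str, raw_html: &str) -> Result<String, String> {
    match result {
        // Keep the built-in rendering.
        VisitResult::Continue => Ok(default_md.to_owned()),
        // The visitor's string replaces the whole subtree.
        VisitResult::Custom(s) => Ok(s),
        // The element and its children contribute nothing.
        VisitResult::Skip => Ok(String::new()),
        // Pass the original markup through untouched.
        VisitResult::PreserveHtml => Ok(raw_html.to_owned()),
        // Stop the conversion and report the message.
        VisitResult::Error(msg) => Err(msg),
    }
}
```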
visit_text is hot — it fires for every text node, often 100+ times per page. Return Continue fast when you don't care, and avoid allocations in the hot path.
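One way to follow that advice is to run a cheap borrowed check before building any String. The sketch below is self-contained and uses assumed names: Redactor is hypothetical, the VisitResult enum is a two-variant stand-in, and the ctx argument is omitted to keep the example short.

```rust
// Two-variant stand-in for the library's result type.
#[derive(Debug, PartialEq)]
pub enum VisitResult {
    Continue,
    Custom(String),
}

/// Hypothetical visitor state: counts how many text nodes were rewritten.
#[derive(Debug, Default)]
pub struct Redactor {
    pub replaced: usize,
}

impl Redactor {
    // Same idea as `visit_text`, minus the `ctx` parameter.
    pub fn visit_text(&mut self, text: &str) -> VisitResult {
        // Fast path: a substring search on the borrowed &str allocates nothing,
        // so the common "not interested" case returns immediately.
        if !text.contains("SECRET") {
            return VisitResult::Continue;
        }
        // Slow path: allocate only when the text actually needs rewriting.
        self.replaced += 1;
        VisitResult::Custom(text.replace("SECRET", "[redacted]"))
    }
}
```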
Registration
In Rust, attach a visitor through the options builder:
use html_to_markdown_rs::visitor::{HtmlVisitor, VisitorHandle};
use html_to_markdown_rs::ConversionOptions;
use std::cell::RefCell;
use std::rc::Rc;
// MyVisitor is your own type; it must implement HtmlVisitor (and Default, for ::default()).
let visitor: VisitorHandle = Rc::new(RefCell::new(MyVisitor::default()));
let options = ConversionOptions::builder().visitor(Some(visitor)).build();
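A practical consequence of the Rc<RefCell<…>> handle is that you can keep your own clone and read the visitor's state once conversion has finished. The sketch below shows the pattern with stand-ins only: it assumes VisitorHandle is an alias for Rc<RefCell<dyn HtmlVisitor>>, uses a one-method trait with a simplified signature, and replaces the real converter with a hypothetical fake_convert function.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// One-method stand-in for the real trait (simplified signature).
pub trait HtmlVisitor: std::fmt::Debug {
    fn visit_heading(&mut self, _level: u32, _text: &str) {}
}

/// Hypothetical visitor that counts headings.
#[derive(Debug, Default)]
pub struct HeadingCounter {
    pub count: usize,
}

impl HtmlVisitor for HeadingCounter {
    fn visit_heading(&mut self, _level: u32, _text: &str) {
        self.count += 1;
    }
}

// Assumed shape of the library's handle type.
pub type VisitorHandle = Rc<RefCell<dyn HtmlVisitor>>;

/// Stand-in for the converter: invokes the visitor once per heading.
pub fn fake_convert(headings: &[&str], visitor: Option<VisitorHandle>) {
    if let Some(v) = visitor {
        for h in headings {
            v.borrow_mut().visit_heading(1, h);
        }
    }
}

pub fn count_headings(headings: &[&str]) -> usize {
    // Keep a concretely typed Rc so the field stays readable afterwards.
    let counter = Rc::new(RefCell::new(HeadingCounter::default()));
    // Cloning the Rc and coercing it yields the trait-object handle.
    let handle: VisitorHandle = counter.clone();
    fake_convert(headings, Some(handle));
    // Both Rc values point at the same visitor, so the count is visible here.
    let n = counter.borrow().count;
    n
}
```

The single-threaded Rc/RefCell pair means the borrow is checked at run time, so the visitor must not be borrowed elsewhere while the converter is calling into it.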
In other languages, the same hook is reached by passing a visitor object to the options builder. For binding-specific examples see Guides → Visitor pattern.
Cost when unused
Without a visitor registered, the dispatch site short-circuits and the default handler runs directly — there is no virtual call, no allocation, no extra branch on the hot path. The visitor feature is opt-in for Rust users specifically so consumers who never need it pay nothing.
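A common Rust technique for this kind of opt-in cost is conditional compilation, sketched below. This is not the library's actual code: Converter, its visitor field, and render_text are hypothetical, and the hook is simplified to a closure. Built without the feature, the gated field and block do not exist in the binary; built with it, an unset visitor falls straight through to the default handler.

```rust
/// Hypothetical converter state.
#[derive(Default)]
pub struct Converter {
    // The field exists only when the crate is built with the `visitor` feature.
    #[cfg(feature = "visitor")]
    pub visitor: Option<Box<dyn FnMut(&str) -> Option<String>>>,
}

impl Converter {
    pub fn render_text(&mut self, text: &str) -> String {
        // This whole block is compiled out unless the feature is enabled.
        #[cfg(feature = "visitor")]
        {
            if let Some(hook) = self.visitor.as_mut() {
                if let Some(custom) = hook(text) {
                    return custom;
                }
            }
        }
        // Default handler: used when the feature is off, when no visitor
        // is registered, or when the hook declines to override.
        text.to_owned()
    }
}
```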
Across the C FFI
When the visitor crosses the FFI boundary (Go, Java, C#, C), the C layer exposes a HtmHtmVisitorCallbacks struct of function pointers. The Rust side wraps each callback in a bridge that marshals strings and NodeContext fields and translates the returned status code into a VisitResult. The status codes are HTM_VISIT_CONTINUE, HTM_VISIT_SKIP, HTM_VISIT_PRESERVE_HTML, HTM_VISIT_CUSTOM, and HTM_VISIT_ERROR.
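The last step of such a bridge, turning a status code into a VisitResult, might look roughly like the sketch below. The numeric values of the HTM_VISIT_* constants are invented for the example (the real header defines them), the enum is a local stand-in, and translate is a hypothetical function that assumes the payload string has already been copied out of C memory.

```rust
// Hypothetical numeric values; consult the real C header for the actual ones.
pub const HTM_VISIT_CONTINUE: i32 = 0;
pub const HTM_VISIT_SKIP: i32 = 1;
pub const HTM_VISIT_PRESERVE_HTML: i32 = 2;
pub const HTM_VISIT_CUSTOM: i32 = 3;
pub const HTM_VISIT_ERROR: i32 = 4;

/// Local stand-in for the library's result type.
#[derive(Debug, PartialEq)]
pub enum VisitResult {
    Continue,
    Custom(String),
    Skip,
    PreserveHtml,
    Error(String),
}

/// Hypothetical bridge step: map a status code plus an optional payload
/// (custom output or error message) to a `VisitResult`.
pub fn translate(status: i32, payload: Option<String>) -> VisitResult {
    match status {
        HTM_VISIT_CONTINUE => VisitResult::Continue,
        HTM_VISIT_SKIP => VisitResult::Skip,
        HTM_VISIT_PRESERVE_HTML => VisitResult::PreserveHtml,
        // A missing payload becomes an empty replacement string.
        HTM_VISIT_CUSTOM => VisitResult::Custom(payload.unwrap_or_default()),
        HTM_VISIT_ERROR => {
            VisitResult::Error(payload.unwrap_or_else(|| "visitor error".to_owned()))
        }
        // Treat anything unrecognised as an error rather than guessing.
        other => VisitResult::Error(format!("unknown visit status {other}")),
    }
}
```

Mapping unknown codes to Error is a design choice made for this sketch; it keeps a misbehaving callback from silently changing the output.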