Tag Preservation v2.5.0¶
Tag preservation allows you to keep specific HTML elements as raw HTML in the Markdown output instead of converting them to Markdown syntax. This is useful when Markdown cannot represent the full richness of certain HTML structures.
What Tag Preservation Does¶
When you add a tag name to the preserve_tags configuration list, any matching HTML element and its contents are passed through verbatim into the Markdown output. The element is not converted -- the raw HTML appears as-is in the final Markdown string.
Without preservation, a <table> that carries an inline style attribute becomes a standard GFM table, and the inline style is lost.
With preserve_tags: ["table"], the original HTML table is kept intact in the output, including its attributes.
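As a rough illustration of that difference, the sketch below builds a toy table converter on Python's standard html.parser module. It is not the library's implementation: table_to_gfm and convert_table are hypothetical names, and only the preserve_tags option name comes from this page.

```python
from html.parser import HTMLParser


class _TableRows(HTMLParser):
    """Collect the cell text of each <tr>, ignoring every attribute."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text chunks of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def table_to_gfm(html):
    """Toy conversion: the first row is the header, styling is dropped."""
    parser = _TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join(" --- " for _ in header) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def convert_table(html, preserve_tags=()):
    """Return the HTML untouched when 'table' is listed, else a GFM table."""
    if "table" in {tag.lower() for tag in preserve_tags}:
        return html
    return table_to_gfm(html)


source = (
    '<table style="width: 100%">'
    "<tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apple</td><td>3</td></tr>"
    "</table>"
)
print(convert_table(source))                            # GFM table, style lost
print(convert_table(source, preserve_tags=["table"]))   # original HTML
```

The first call prints a three-line GFM table with no trace of the style attribute. The second prints the source string unchanged.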
When to Use Tag Preservation¶
Complex Tables¶
Markdown tables are limited to simple grids with no merged cells (colspan or rowspan) and no styling. Preserve <table> when you need:
- Merged cells (colspan, rowspan)
- Cell background colors or alignment styles
- Nested tables
- Caption elements (<caption>)
SVG Content¶
SVG elements cannot be represented in Markdown. Preserve <svg> to keep vector graphics inline.
Custom Elements / Web Components¶
Web components use custom tag names that have no Markdown equivalent.
Preserving a custom tag such as my-widget keeps the element intact for client-side JavaScript to process.
Interactive Elements¶
Elements like <details>/<summary>, <dialog>, or form elements have no Markdown equivalent. Preserve them when the target Markdown renderer supports inline HTML.
Mixed HTML/Markdown Content¶
When generating Markdown that will be rendered in environments supporting inline HTML (GitHub, GitLab, most static site generators), preserving specific elements gives you the best of both worlds:
- Standard content converts cleanly to Markdown
- Complex structures stay as HTML where Markdown falls short
Configuration¶
Tag preservation is configured through the preserve_tags option, which accepts a list of HTML tag names (case-insensitive).
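The case-insensitive matching can be pictured with a small sketch. The options dictionary and both helper functions are hypothetical and are not part of any documented API; only the preserve_tags key and the case-insensitivity come from the text above. Stripping whitespace and dropping duplicates are extra assumptions made for the example.

```python
def normalize_preserve_tags(tags):
    """Lowercase each tag name and drop duplicates, keeping first-seen order."""
    normalized = []
    for tag in tags:
        name = tag.strip().lower()
        if name and name not in normalized:
            normalized.append(name)
    return normalized


def is_preserved(tag_name, normalized_tags):
    """Compare an element's tag name against the normalized list."""
    return tag_name.lower() in normalized_tags


# Hypothetical options mapping; only the key name is taken from this page.
options = {"preserve_tags": ["TABLE", "svg", "Table", "my-widget"]}
wanted = normalize_preserve_tags(options["preserve_tags"])

print(wanted)
print(is_preserved("SVG", wanted), is_preserved("div", wanted))
```

Normalizing once up front means every later lookup is a plain lowercase comparison.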
How It Works¶
During DOM traversal, when the engine encounters an element whose tag name is in the preserve_tags list:
- The element's opening tag, attributes, children, and closing tag are serialized back to HTML
- The HTML string is inserted directly into the Markdown output
- No Markdown conversion is applied to the element or any of its descendants
- Surrounding content continues to be converted normally
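The steps above can be sketched with Python's standard html.parser module. This is a toy model under stated assumptions, not the engine's actual code: PreservingConverter and convert are hypothetical names, only a few inline tags and <p> are converted, the input is assumed to be well formed, and every preserved element is treated as block-level.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}
INLINE_MARKERS = {"strong": "**", "b": "**", "em": "*", "i": "*"}


class PreservingConverter(HTMLParser):
    def __init__(self, preserve_tags=()):
        super().__init__(convert_charrefs=True)
        self.preserve = {tag.lower() for tag in preserve_tags}
        self.out = []
        self.depth = 0  # greater than 0 while inside a preserved element

    def handle_starttag(self, tag, attrs):
        if self.depth or tag in self.preserve:
            # Pass the opening tag through exactly as written in the source.
            self.out.append(self.get_starttag_text())
            if tag not in VOID_TAGS:
                self.depth += 1
        elif tag in INLINE_MARKERS:
            self.out.append(INLINE_MARKERS[tag])

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <circle ... /> never change the depth.
        if self.depth or tag in self.preserve:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.out.append(f"</{tag}>")
            self.depth -= 1
            if self.depth == 0:
                self.out.append("\n\n")  # preserved element ends a block
        elif tag in INLINE_MARKERS:
            self.out.append(INLINE_MARKERS[tag])
        elif tag == "p":
            self.out.append("\n\n")

    def handle_data(self, data):
        # Text inside a preserved element is re-escaped so it stays valid HTML.
        self.out.append(escape(data, quote=False) if self.depth else data)


def convert(html, preserve_tags=()):
    parser = PreservingConverter(preserve_tags)
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


doc = (
    "<p>See <strong>this</strong>:</p>"
    "<details open><summary>More</summary>"
    "<p>Hidden <em>text</em></p></details>"
    "<p>Done</p>"
)
print(convert(doc, preserve_tags=["details"]))
```

With details preserved, the first and last paragraphs are converted while the <details> element, including its inner <p> and <em>, is emitted as HTML. The single depth counter is what keeps descendants out of Markdown conversion until the preserved element closes.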
Children are also preserved
When a tag is preserved, all of its child elements are included as raw HTML too. If you preserve <div>, everything inside that <div> stays as HTML, even elements that would normally convert cleanly to Markdown.
Tag Preservation vs. Other Approaches¶
| Approach | Behavior | Best For |
|---|---|---|
| preserve_tags | Keep entire element as HTML | Complex structures that need full HTML |
| strip_tags | Remove tags, keep text content | Removing unwanted wrappers |
| skip_images | Omit <img> elements entirely | Text-only extraction |
| Visitor PreserveHtml | Per-element decision via callback | Conditional preservation logic |
Visitor for conditional preservation
If you need to preserve some instances of a tag but convert others (e.g., preserve tables with colspan but convert simple tables), use the visitor pattern with a PreserveHtml return value based on element attributes.
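The decision logic for such a callback might look like the sketch below. It does not use the real visitor API: the VisitResult enum and the visit_table function are hypothetical stand-ins, with PRESERVE_HTML named after the PreserveHtml return value mentioned above. The table markup is assumed to be well formed so that xml.etree.ElementTree can parse it.

```python
import xml.etree.ElementTree as ET
from enum import Enum, auto


class VisitResult(Enum):
    CONTINUE = auto()       # hypothetical: fall through to normal conversion
    PRESERVE_HTML = auto()  # stands in for the PreserveHtml return value


def visit_table(table_html):
    """Preserve a table only if some cell spans more than one row or column."""
    root = ET.fromstring(table_html)  # assumes well-formed markup
    for cell in root.iter():
        if cell.tag not in ("td", "th"):
            continue
        for attr in ("colspan", "rowspan"):
            if cell.get(attr) not in (None, "1"):
                return VisitResult.PRESERVE_HTML
    return VisitResult.CONTINUE


simple = "<table><tr><th>A</th><th>B</th></tr><tr><td>1</td><td>2</td></tr></table>"
merged = '<table><tr><th colspan="2">A</th></tr><tr><td>1</td><td>2</td></tr></table>'

print(visit_table(simple))
print(visit_table(merged))
```

Only the second table contains a spanning cell, so only it would be kept as HTML under this rule.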
Further Reading¶
- Configuration Options Guide -- all configuration options including preserve_tags
- Visitor Pattern -- programmatic control over element handling
- Conversion Pipeline -- how tag preservation fits into the pipeline