Convert HTML to Custom Formats with MationHTML.js

Category: Javascript | January 6, 2025
Author: rezzvy
Last Update: January 6, 2025
License: MIT

MationHTML is a lightweight JavaScript library that converts HTML strings into custom formats. It parses an HTML string and applies user-defined conversion rules to each element.

Web developers often need to turn HTML content into another representation. MationHTML handles this by letting you define one conversion rule per tag and applying those rules to any HTML string. Potential use cases include:

  • Markdown Conversion: Transform HTML into Markdown-compatible text
  • Custom Markup Generation: Create specialized markup for specific platforms
  • Content Migration: Convert HTML between different document formats
  • Preprocessing Content: Prepare web content for different rendering environments

How to use it:

1. Download the package and load the MationHTML library in the document.

<script src="/path/to/mationhtml.min.js"></script>

2. Create a new MationHTML instance and define conversion rules. The register() method takes rule objects with these properties:

  • tag: The HTML tag to match (e.g., “b”).
  • to: The output format string. Use {content} for the tag’s inner text. Access attributes with {dataset.attributeName} (e.g., [a='{dataset.src}']{content}[/a]).
  • format: (Optional) A function that overrides the to property. It receives an object with content, dataset, and node.

const mationHTML = new MationHTML();
mationHTML.register([
  {
    tag: "b",
    to: "**{content}**",
  },
  {
    tag: "h1",
    to: "# {content}",
  },
  // more rules here
]);
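
To make the to and format properties more concrete, here is a minimal sketch of how the placeholders could be expanded. The expandTemplate helper and the linkRule object are hypothetical names used only for illustration; they are not part of MationHTML, and the sketch assumes a format function returns the output string.

```javascript
// Hypothetical helper (not part of MationHTML): fills in the {content} and
// {dataset.name} placeholders of a `to` string in a single pass.
function expandTemplate(to, api) {
  return to.replace(/\{(content|dataset\.(\w+))\}/g, (match, name, key) =>
    name === "content" ? api.content : (api.dataset[key] ?? "")
  );
}

// A rule that uses `format` instead of `to`. Assumption: the function
// returns the string that should replace the element.
const linkRule = {
  tag: "a",
  format: ({ content, dataset }) =>
    dataset.src ? `[a='${dataset.src}']${content}[/a]` : content,
};

// Calling both directly with a hand-built argument object:
const bold = expandTemplate("**{content}**", { content: "CSSScript", dataset: {} });
const link = linkRule.format({ content: "Home", dataset: { src: "/" }, node: null });
console.log(bold); // **CSSScript**
console.log(link); // [a='/']Home[/a]
```

A format function is the better fit when the output depends on a condition, such as a missing attribute, which a fixed to string cannot express.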

3. Use the convert() method to transform your HTML string. The second parameter controls whether only the <body> content is converted (true, the default) or the entire document (false).

const output1 = mationHTML.convert("<b>CSSScript</b>");
const output2 = mationHTML.convert("<h1>CSSScript</h1>");

console.log(output1); // **CSSScript**
console.log(output2); // # CSSScript

4. Customize further with available properties:

  • noRuleFallback: A function that runs when no rule matches a tag, overriding the default behavior. It should return the string that replaces the element. For example: (api) => api.content.toUpperCase();
  • ignoreTags: An array of tag names to ignore during conversion. For example: ["i", "u"] ignores <i> and <u> tags.
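
The two options can be sketched as follows. The text above does not show how they are attached, so this example assumes they are plain properties set on the instance. The guard around the instance only exists so the snippet also runs where the library is not loaded.

```javascript
// Fallback for tags without a registered rule: upper-case the inner content.
// The `api.content` field comes from the example given above.
const noRuleFallback = (api) => api.content.toUpperCase();

// Tags to ignore during conversion, as in the example given above.
const ignoreTags = ["i", "u"];

// Assumption: both options are assigned as instance properties.
if (typeof MationHTML !== "undefined") {
  const mationHTML = new MationHTML();
  mationHTML.noRuleFallback = noRuleFallback;
  mationHTML.ignoreTags = ignoreTags;
}

// The fallback is an ordinary function and can be checked on its own:
console.log(noRuleFallback({ content: "no rule here" })); // NO RULE HERE
```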

How It Works:

MationHTML works by parsing the provided HTML string and processing each tag according to the rules you’ve registered. The core functionality involves recursively traversing the HTML tree, checking each element for a corresponding rule. If a rule exists for a tag, it applies the transformation. If no rule is found, it either applies a fallback function (if defined) or returns the content unchanged.

Internally, MationHTML parses the HTML string with DOMParser, which turns the string into a document object. It then processes each element along with its child nodes, attributes, and text content.
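
The recursive traversal described above can be illustrated with a small stand-alone sketch. It is not the library's source code: it walks a hand-built tree of plain objects instead of a DOMParser document so that it runs outside the browser, and the names convertNode, rules, and the { tag, children } node shape are all hypothetical.

```javascript
// Hypothetical node shape: elements are { tag, children }, text nodes are strings.
function convertNode(node, options) {
  // Text nodes are returned as they are.
  if (typeof node === "string") return node;

  // Convert the children first so {content} holds their converted output.
  const content = node.children
    .map((child) => convertNode(child, options))
    .join("");

  // Apply the matching rule if one is registered for this tag.
  const rule = options.rules.find((r) => r.tag === node.tag);
  if (rule) return rule.to.replace(/\{content\}/g, () => content);

  // No rule: use the fallback if defined, else return the content unchanged.
  return options.noRuleFallback
    ? options.noRuleFallback({ content, node })
    : content;
}

const rules = [
  { tag: "b", to: "**{content}**" },
  { tag: "h1", to: "# {content}" },
];

// Stand-in for <h1>Hello <b>world</b></h1>
const tree = { tag: "h1", children: ["Hello ", { tag: "b", children: ["world"] }] };

console.log(convertNode(tree, { rules })); // # Hello **world**
```

Converting children before their parent is what lets nested tags compose: the inner <b> is already rewritten by the time the <h1> rule fills in {content}.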

Changelog:

01/06/2025

  • feat: apply multiple matching rules to node

01/06/2025

  • feat: use selector instead of tag name
