alpaca0984.log

Convert Markdown to HTML with Syntax highlighting

alpaca0984

While I was creating this blog's template, I wanted to convert Markdown files into HTML ones with code syntax highlighting.

Fortunately, there is remark, a tool that processes Markdown as structured data.

As dependencies, I need these npm packages:

npm install --save-dev remark remark-parse remark-rehype rehype-highlight rehype-stringify

For conversion, the code looks like this. You can see the highlighting works :)

import { remark } from 'remark'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeStringify from 'rehype-stringify'
import rehypeHighlight from 'rehype-highlight'
 
export default async function markdownToHtml(markdown: string) {
  const result = await remark()
    .use(remarkParse)
    .use(remarkRehype)
    .use(rehypeHighlight, { subset: false })
    .use(rehypeStringify)
    .process(markdown)
 
  return result.toString()
}
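
To make "Markdown as structured data" a bit more concrete, here is a rough sketch of what happens to a single fenced code block before highlighting kicks in. The `SketchCodeNode` type and the `renderCodeNode` helper are hypothetical, hand-written stand-ins, not the real remark or rehype APIs, which handle far more cases than this.

```typescript
// Simplified stand-in for the node a Markdown parser produces for a fenced code block.
interface SketchCodeNode {
  type: 'code'
  lang: string | null
  value: string
}

// Escape the characters that are unsafe inside HTML text content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Hypothetical helper: render one code node as HTML, carrying the
// language over as a `language-*` class for a highlighter to pick up.
function renderCodeNode(node: SketchCodeNode): string {
  const classAttribute = node.lang ? ` class="language-${node.lang}"` : ''
  return `<pre><code${classAttribute}>${escapeHtml(node.value)}</code></pre>`
}

const codeNode: SketchCodeNode = { type: 'code', lang: 'js', value: 'const ok = 1 < 2' }
const renderedHtml = renderCodeNode(codeNode)
console.log(renderedHtml)
```

Because the language travels along as data on the node, a later step such as rehype-highlight can decide how to highlight the block without re-parsing the Markdown.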

Going one step further: as the rehype-highlight documentation mentions, it's highly recommended to sanitize the Markdown input so that we don't open ourselves up to XSS vulnerabilities.

I installed rehype-sanitize even though I only convert Markdown files statically before deployment:

npm install --save-dev rehype-sanitize

...and updated the code above. rehype-highlight uses highlight.js internally, which automatically adds classes for syntax highlighting, so we need to allow them.

  import { remark } from 'remark'
  import remarkParse from 'remark-parse'
  import remarkRehype from 'remark-rehype'
  import rehypeStringify from 'rehype-stringify'
  import rehypeHighlight from 'rehype-highlight'
+ import rehypeSanitize, { defaultSchema } from 'rehype-sanitize'
 
  export default async function markdownToHtml(markdown: string) {
    const result = await remark()
      .use(remarkParse)
      .use(remarkRehype)
      .use(rehypeHighlight, { subset: false })
+     .use(rehypeSanitize, {
+       // @see https://github.com/rehypejs/rehype-highlight#example-sanitation
+       ...defaultSchema,
+       attributes: {
+         ...defaultSchema.attributes,
+         span: [
+           ...(defaultSchema.attributes?.span || []),
+           // List of all allowed tokens:
+           ['className', 'hljs-addition', 'hljs-attr', 'hljs-attribute', 'hljs-built_in', 'hljs-bullet', 'hljs-char', 'hljs-code', 'hljs-comment', 'hljs-deletion', 'hljs-doctag', 'hljs-emphasis', 'hljs-formula', 'hljs-keyword', 'hljs-link', 'hljs-literal', 'hljs-meta', 'hljs-name', 'hljs-number', 'hljs-operator', 'hljs-params', 'hljs-property', 'hljs-punctuation', 'hljs-quote', 'hljs-regexp', 'hljs-section', 'hljs-selector-attr', 'hljs-selector-class', 'hljs-selector-id', 'hljs-selector-pseudo', 'hljs-selector-tag', 'hljs-string', 'hljs-strong', 'hljs-subst', 'hljs-symbol', 'hljs-tag', 'hljs-template-tag', 'hljs-template-variable', 'hljs-title', 'hljs-type', 'hljs-variable']
+         ]
+       }
+     })
      .use(rehypeStringify)
      .process(markdown)
 
    return result.toString()
  }
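
Conceptually, the `span` entry in the schema acts as an allow-list for class names. The following is only a minimal sketch of that idea, assuming a simplified element shape; `SketchElement` and `filterClassNames` are hypothetical names, and this is not how rehype-sanitize is actually implemented.

```typescript
// Simplified stand-in for an HTML element in the syntax tree.
interface SketchElement {
  tagName: string
  className: string[]
}

// Hypothetical helper: keep only the class names that appear in the allow-list.
function filterClassNames(
  element: SketchElement,
  allowed: ReadonlySet<string>
): SketchElement {
  return {
    ...element,
    className: element.className.filter((name) => allowed.has(name)),
  }
}

// A shortened allow-list; the real list above covers every hljs token.
const allowedClassNames = new Set(['hljs-keyword', 'hljs-string'])

const highlightedSpan: SketchElement = {
  tagName: 'span',
  className: ['hljs-keyword', 'injected-class'],
}

const sanitizedSpan = filterClassNames(highlightedSpan, allowedClassNames)
console.log(sanitizedSpan.className)
```

Classes added by highlight.js pass through, while anything not on the list is dropped. That is why the `hljs-*` tokens have to be listed explicitly: without them, the sanitizer would strip the highlighting classes as well.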

For styling, highlight.js ships with preset CSS files, so we can just pick one and customize it if we want to. My favorite is the GitHub style.