DIY Gatsby: MDX

This is Part III of a series where we'll look into how kimmo.blog works. Part III covers MDX rendering.

Motivation has been covered in Writing, or coding.

What is MDX

MDX is an amazing markup format. It allows familiar authoring experience using markdown syntax – with the capability of adding custom React components along with the text.

For example this source

**Here's** an _example_ of a [toast](https://ux.stackexchange.com/questions/11998/what-is-a-toast-notification):

<ButtonLink onClick={() => info('This is amazing!')}>Spawn a toast</ButtonLink>

would render:

Here's an example of a toast:

Spawn a toast

All this could be achieved with good old React components, but it's just not as smooth of a writing experience. Paragraphs would need to be wrapped in <p> elements, links would become cumbersome to type, and blah.

Writing blog posts in React would make the site generator simpler, but we're already on the over-engineering train so let's continue.

How MDX works

While MDX is an incredible authoring experience, it comes at the price of complexity when done right. By right, I mean with good performance.

It is possible to greatly simplify the rendering, if performance is not a concern. MDX text could be sent as a raw string to the browser and it would do the heavy rendering with e.g. @mdx-js/runtime. It would make the build pipeline simpler, but other aspects worse:

"warning: this is not the preferred way to use MDX since it introduces a substantial amount of overhead and dramatically increases bundle sizes. It must not be used with user input that isn’t sandboxed."

This blog uses next-mdx-remote library to deal with MDX rendering. It's been designed for Next.js, but it also works well for our use case.

To speed things up in the browser, next-mdx-remote compiles the MDX to minified JavaScript server-side. The serialize method does this using @mdx-js/mdx and esbuild under the hood.

This is how the previous example would be serialized:

import { serialize } from "next-mdx-remote/serialize";

const content = `
**Here's** an _example_ of a [toast](https://ux.stackexchange.com/questions/11998/what-is-a-toast-notification):

<ButtonLink onClick={() => info('This is amazing!')}>Spawn a toast</ButtonLink>
`;

const { compiledSource } = await serialize(content);

The compiledSource is a string that contains the compiled JavaScript. Here's the code unminified and simplified:

function MDXContent(n) {
  var e = n,
    {
      components: o
    } = e,
    t = m(e, ["components"]);
  return mdx(MDXLayout, c(a(a({}, layoutProps), t), {
    components: o,
    mdxType: "MDXLayout"
  }), mdx("p", null, mdx("strong", {
    parentName: "p"
  }, "Here's"), " an ", mdx("em", {
    parentName: "p"
  }, "example"), " of a ", mdx("a", a({
    parentName: "p"
  }, {
    href: "https://ux.stackexchange.com/questions/11998/what-is-a-toast-notification"
  }), "a toast"), ":"), mdx(ButtonLink, {
    onClick: () => info("This is amazing!"),
    mdxType: "ButtonLink"
  }, "Spawn a toast"))
}

In the browser, next-mdx-remote evals the code with Reflect.construct(), and renders the MDXContent:

<MDX.MDXProvider components={components}>
  <MDXContent />
</MDX.MDXProvider>

How the blog renders posts

The high-level flow of rendering .mdx pages is similar to React pages, but has more steps:

Find all .mdx blog posts
For each blog post:
- Parse metadata from front matter
- Render PostPage.tsx.template
- Render pageHydrate.tsx.template
- Render page.html.template with the HTML generated from MDX and other metadata
- Save a compiledSource.txt file which contains the MDX content as minified JavaScript.

We'll soon understand why .txt extension is used. Let's go through the templates one by one.

PostPage.tsx.template

The template looks like this:

import { PostLayout } from "src/components/PostLayout";
import { MDXRemote } from "src/components/MDXRemote";
// Will be replaced with "./compiledSource.txt"
import compiledSource from "{{{ compiledSourcePath }}}";

export default function PageComponent(props) {
  return (
    <PostLayout siteData={props.siteData} data={props.pageData}>
      <MDXRemote compiledSource={compiledSource} />
    </PostLayout>
  );
}

The purpose of the rendered file is to represent the React component for a blog post page. The template will be rendered to a verbosely named file: output/posts/<slug>/<slug>-post.tsx.

It's easier to debug the build process when unique names are used instead of tens of "post.tsx" files.

pageHydrate.tsx.template

The same template that we used to render React pages in Part II. In React pages, the import refers to a page component in the source code.

import PageComponent from "src/pages/NotFound404";

With MDX pages on the other hand, the PageComponent is imported from the dynamically rendered post page file:

import PageComponent from "./1-writing-or-coding-post";

The template is saved to output/posts/<slug>/<slug>-post-hydrate.tsx.

page.html.template

Again, the same template as for React pages. The MDX content will be rendered as HTML in <body>. Blog post title, description, and tags from the front matter are rendered into <head>:

<html>
  <head>
    <title>Writing, or coding - kimmo.blog</title>
    <meta name="description" content="Do I want to write content, or do I want to code a blog?" />
    <meta name="keywords" content="meta,blog,content creation" />
  </head>
  <body>
    <div id="react-root"><!-- MDX blog page as HTML --></div>
    <script type="module" src="./1-writing-or-coding-post-hydrate.js"></script>
  </body>
</html>

compiledSource.txt

As mentioned, this file contains the MDX as minified JavaScript.

Why does it have a .txt extension then?

We need to be able to transfer the code exactly as-is to the browser, so that it can be passed as a string to MDXRemote. A good way to do this is using a rollup string plugin. The plugin has been configured so that a TypeScript code can import any .txt file, and the exact contents will be assigned to a variable.

This functionality used to be just a simple text replacement directly into code, but that broke with special characters and line breaks.

To recap

Each .mdx blog post file ends up creating quite some files in the output directory. First TypeScript files and compileSource.txt are rendered, then Rollup transpiles the source code into JS bundles, and finally static assets are copied.