Parsing Markdown

2025-02-19

I’m falling down a rabbit hole, and it’s time to take notes and regroup. I was curious whether Python’s new inline script metadata specification (PEP 723) could work with Jupyter notebooks. (It “just works” with Marimo, since Marimo notebooks are plain Python scripts; Marimo also knows about and supports the inline metadata format.) The answer so far is “no,” although at least one person is gamely layering uv on top in a way that is most likely overwrought and won’t stick long term.

But as a result, I took a closer look at the MyST Markdown project. It seems to be an outgrowth or reorganization of some ideas from the Jupyter Book and Executable Books projects, and it essentially looks like the Pandoc idea. There is apparently an ecosystem of JavaScript tooling built around unifiedjs, which defines a universal syntax tree (unist) described in a Web IDL-like grammar. Extended from that is a Markdown-specific syntax tree spec (mdast), and the MyST project in turn extends mdast to provide reStructuredText-like directives. One downside: the MyST command-line tooling currently has a hard dependency on Node, which seems dumb.

So this is yet again the Document -> AST -> Document paradigm, but unlike in Pandoc land, there is an ecosystem of both JavaScript and Python tools working around the same AST representation, which is really cool.
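To make the tree idea concrete, here is a rough sketch of what an mdast-style tree for `# Hello *world*` might look like as plain Python data. The field names (`type`, `children`, `value`, `depth`) follow the unist and mdast specs as I understand them; the tree is written by hand rather than produced by a parser, and `walk` and `plain_text` are hypothetical helpers, not part of any library.

```python
# Hand-written mdast-style tree for the Markdown "# Hello *world*".
# Assumption: node shapes follow the unist/mdast specs; no parser is involved.
tree = {
    "type": "root",
    "children": [
        {
            "type": "heading",
            "depth": 1,
            "children": [
                {"type": "text", "value": "Hello "},
                {
                    "type": "emphasis",
                    "children": [{"type": "text", "value": "world"}],
                },
            ],
        }
    ],
}


def walk(node):
    """Yield every node in the tree, depth first, parents before children."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def plain_text(node):
    """Concatenate the value of every text node under `node`."""
    return "".join(n["value"] for n in walk(node) if n["type"] == "text")


node_types = [n["type"] for n in walk(tree)]
```

Because every node is just a `type` plus optional `children` or `value`, a generic walker like this works on any unist-shaped tree, which is presumably what lets tools in different languages share the format.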

So maybe a random project idea: is there room for something that converts between a unifiedjs AST and a Pandoc AST?
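A minimal sketch of what one direction of that converter might look like, assuming a hand-built mdast-style dict as input and Pandoc’s JSON AST as output. It covers only a handful of node types (`heading`, `paragraph`, `text`, `emphasis`, `strong`, `inlineCode`); every function name here is hypothetical, and the `pandoc-api-version` value is an assumption that would need to match the installed Pandoc.

```python
import json
import re

# Assumption: adjust to the pandoc-types version your Pandoc expects.
PANDOC_API_VERSION = [1, 23, 1]
EMPTY_ATTR = ["", [], []]  # Pandoc Attr: identifier, classes, key-value pairs


def text_to_inlines(value):
    """Split a text value into Pandoc Str and Space inlines."""
    inlines = []
    for token in re.split(r"(\s+)", value):
        if not token:
            continue
        if token.isspace():
            inlines.append({"t": "Space"})
        else:
            inlines.append({"t": "Str", "c": token})
    return inlines


def convert_inlines(nodes):
    """Convert a list of mdast inline nodes to a flat list of Pandoc inlines."""
    out = []
    for node in nodes:
        kind = node["type"]
        if kind == "text":
            out.extend(text_to_inlines(node["value"]))
        elif kind == "emphasis":
            out.append({"t": "Emph", "c": convert_inlines(node["children"])})
        elif kind == "strong":
            out.append({"t": "Strong", "c": convert_inlines(node["children"])})
        elif kind == "inlineCode":
            out.append({"t": "Code", "c": [EMPTY_ATTR, node["value"]]})
        else:
            raise NotImplementedError(f"unsupported inline node: {kind}")
    return out


def convert_block(node):
    """Convert one mdast block node to a Pandoc block."""
    kind = node["type"]
    if kind == "paragraph":
        return {"t": "Para", "c": convert_inlines(node["children"])}
    if kind == "heading":
        return {
            "t": "Header",
            "c": [node["depth"], EMPTY_ATTR, convert_inlines(node["children"])],
        }
    raise NotImplementedError(f"unsupported block node: {kind}")


def mdast_to_pandoc(root):
    """Wrap converted blocks in a Pandoc JSON document."""
    return {
        "pandoc-api-version": PANDOC_API_VERSION,
        "meta": {},
        "blocks": [convert_block(child) for child in root["children"]],
    }


example = {
    "type": "root",
    "children": [
        {
            "type": "heading",
            "depth": 1,
            "children": [
                {"type": "text", "value": "Hello "},
                {"type": "emphasis", "children": [{"type": "text", "value": "world"}]},
            ],
        }
    ],
}
doc = mdast_to_pandoc(example)
doc_json = json.dumps(doc)
```

In principle the resulting JSON could be fed to `pandoc -f json -t html`. The hard part of a real converter would be everything this skips: lists, links, tables, position info, and MyST-style directives that have no obvious Pandoc counterpart.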

And of course I love Obsidian, but it’s infuriating that it’s introduced yet another Markdown variant that’s not exactly the same as any other extant system. I don’t think the Obsidian team has even released a reference parser for their Markdown, for example.