Writing a plugin for markdown-it

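markdown-it's documentation and API are honestly hard to follow, and so far I have only worked out how to write the one plugin I needed. For now, this article should really be called: writing a markdown-it plugin that renders a custom block syntax.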

What is markdown-it

It is a Markdown renderer that is highly customizable: you can write plugins for it to support custom syntax.

It is used by:

vuepress

The problem

My project needs to convert YAML embedded in Markdown into JSON and display the result.

The Markdown looks like this

    ``` yaml
    a: 1
    ```

The desired output is

{
  "a": 1
}

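PS: in practice you can turn the content of the block into anything you like.

Standard Markdown certainly cannot do this, so markdown-it has to be extended.

Understanding it

The first step is of course to read the official API documentation, and very quickly you will be lost.

I started from a few keywords (the top-level entries of the docs, such as Ruler and Token), searched Baidu for other material (such as write-ups of markdown-it's parsing process), and read how other plugins are written before I could piece together a rough understanding.

Since that other material only explains the principles and not how to change anything, I will write down my own line of thinking here.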

Token

A Token is the data structure that eventually gets rendered.

A Ruler turns # hello into a token like the following (simplified)

{
  render: 'h1',
  content: 'hello'
}

This data format is very easy to render into HTML.
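For comparison, the simplified token above hides some detail. As far as I can tell, markdown-it emits three tokens for # hello: heading_open, inline and heading_close. The sketch below models them as plain objects, showing only a subset of the real fields, and renders them with renderTokens, a hypothetical helper that is not part of markdown-it.

```javascript
// Plain-object model of the tokens markdown-it produces for "# hello".
// Only a subset of the real Token fields is shown.
const tokens = [
  { type: 'heading_open',  tag: 'h1', nesting: 1,  content: '' },
  { type: 'inline',        tag: '',   nesting: 0,  content: 'hello' },
  { type: 'heading_close', tag: 'h1', nesting: -1, content: '' },
];

// Hypothetical helper (not a markdown-it API): walk the flat token list
// and emit opening tags, text, and closing tags according to `nesting`.
function renderTokens(list) {
  let html = '';
  for (const t of list) {
    if (t.nesting === 1) html += '<' + t.tag + '>';
    else if (t.nesting === -1) html += '</' + t.tag + '>';
    else html += t.content;
  }
  return html;
}

console.log(renderTokens(tokens)); // <h1>hello</h1>
```

The point is that the token list is flat: nesting is expressed through the nesting field rather than through nested objects.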

Ruler

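A Ruler converts Markdown into another representation: Tokens.

In markdown-it there are two kinds of Ruler, one for block parsing and one for inline parsing.

block: syntax that works on whole lines, such as code, table and # headings
inline: syntax inside a line, such as * for emphasis

Block rules have higher priority and run first.

If you want a custom plugin to parse Markdown, you must write one of these.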

Parsing

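The thing that runs the Rulers is called Parsing. One Parsing holds several rules, which run in a fixed order. The two parsers that matter here are block and inline.

block always runs before inline.

Render

Render turns tokens into HTML.

If your plugin produces its own token type, you must write one of these as well.

The render function for a # heading looks roughly like this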

function render(tokens, idx){
    let content = tokens[idx].content;
    return '<h1>'+content+'</h1>'
}

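Summary of the flow

When a markdown-it instance is created, it sets up a few defaults: ParserBlock and ParserInline (exposed as md.block and md.inline) together with their default rules.

First the rules in ParserBlock parse the Markdown, then the rules in ParserInline run, and the tokens from the two passes are combined (I did not dig into exactly how) and handed to Render.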
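The flow can be imitated with a toy two-pass pipeline. This is not markdown-it's code: blockPass, inlinePass and render are hypothetical names, and the sketch only understands # headings, paragraphs and *emphasis*.

```javascript
// Pass 1: block rules. Split the source into lines and decide what kind
// of block each non-empty line is. Inline markup is left untouched.
function blockPass(src) {
  const tokens = [];
  for (const line of src.split('\n')) {
    if (line.trim() === '') continue;
    if (line.startsWith('# ')) {
      tokens.push({ type: 'heading', tag: 'h1', content: line.slice(2) });
    } else {
      tokens.push({ type: 'paragraph', tag: 'p', content: line });
    }
  }
  return tokens;
}

// Pass 2: inline rules. Each block token gets a `children` list that
// splits its content into text and emphasis pieces.
function inlinePass(tokens) {
  for (const token of tokens) {
    token.children = [];
    const parts = token.content.split(/\*([^*]+)\*/);
    parts.forEach((part, i) => {
      if (part === '') return;
      // With one capture group, odd indexes hold the captured text.
      token.children.push({ type: i % 2 ? 'em' : 'text', content: part });
    });
  }
  return tokens;
}

// Render: turn the token tree into HTML.
function render(tokens) {
  return tokens.map((token) => {
    const inner = token.children
      .map((c) => (c.type === 'em' ? '<em>' + c.content + '</em>' : c.content))
      .join('');
    return '<' + token.tag + '>' + inner + '</' + token.tag + '>';
  }).join('\n');
}

const html = render(inlinePass(blockPass('# hello\n\nsome *emphasised* text')));
console.log(html);
// <h1>hello</h1>
// <p>some <em>emphasised</em> text</p>
```

As far as I can tell, markdown-it combines the two passes in a similar way: the block rules emit an inline token that holds the raw text, and the inline parser later fills in that token's children.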

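How to write one

To add custom syntax we need a custom rule to parse it and a custom render function to render it.

I suggest studying how existing plugins are written; with documentation this thin, writing a plugin from scratch is close to impossible.

For a custom container, see markdown-it-container
For a block, see markdown-it-math
For inline syntax, see markdown-it-mark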

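Container: only adds a wrapper around the target content and does not affect how the content inside is rendered. If you want a collapsible section, this is the syntax to use.

Block: multi-line syntax, such as a table. Everything inside the block is rendered by this syntax.

Inline: syntax inside a single line, such as *

For the requirement above I simply copied markdown-it-math. Enough talk, time to copy and paste.

Let me briefly go through parts of the code.

Only two lines are needed: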

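// Our rule must run before the built-in rule for fenced blocks (named fence).
// The code rule (indented code) comes before fence, so inserting before code works.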
md.block.ruler.before('code', 'yaml2json', block);
md.renderer.rules.yaml2json = render;
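For context, these two lines normally sit inside a plugin function that receives the markdown-it instance and is registered with md.use(). Here is a minimal sketch of that wiring. It assumes the rule and the render function are passed in; createYaml2JsonPlugin is a hypothetical name, and fakeMd is a stand-in that only mimics the two properties the plugin touches, so the example runs without the library.

```javascript
// Hypothetical factory: returns a markdown-it plugin that wires up a
// block rule and a render function under the token name "yaml2json".
function createYaml2JsonPlugin(block, render) {
  return function yaml2jsonPlugin(md) {
    // Register the rule ahead of the built-in "code" rule.
    md.block.ruler.before('code', 'yaml2json', block);
    // Tokens of type "yaml2json" are rendered by this function.
    md.renderer.rules.yaml2json = render;
  };
}

// Stand-in for a markdown-it instance, only for demonstrating the wiring.
// With the real library this would be: require('markdown-it')().use(plugin)
const calls = [];
const fakeMd = {
  block: { ruler: { before: (...args) => calls.push(args) } },
  renderer: { rules: {} },
};

const blockRule = () => false;      // placeholder rule: never matches
const renderRule = () => '';        // placeholder renderer
createYaml2JsonPlugin(blockRule, renderRule)(fakeMd);

console.log(calls[0].slice(0, 2));  // [ 'code', 'yaml2json' ]
```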

Then write another 100 lines

  let open = '``` yaml'
  let close = '```'

  function block(state, startLine, endLine, silent) {
    var openDelim, len, params, nextLine, token, firstLine, lastLine, lastLinePos,
      haveEndMarker = false,
      pos = state.bMarks[startLine] + state.tShift[startLine],
      max = state.eMarks[startLine];

    if (pos + open.length > max) {
      return false;
    }

    openDelim = state.src.slice(pos, pos + open.length);

    if (openDelim !== open) {
      return false;
    }

    pos += open.length;
    firstLine = state.src.slice(pos, max);

    // Since start is found, we can report success here in validation mode
    if (silent) {
      return true;
    }

    if (firstLine.trim().slice(-close.length) === close) {
      // Single line expression
      firstLine = firstLine.trim().slice(0, -close.length);
      haveEndMarker = true;
    }

    // search end of block
    nextLine = startLine;

    for (; ;) {
      if (haveEndMarker) {
        break;
      }

      nextLine++;

      if (nextLine >= endLine) {
        // unclosed block should be autoclosed by end of document.
        // also block seems to be autoclosed by end of parent
        break;
      }

      pos = state.bMarks[nextLine] + state.tShift[nextLine];
      max = state.eMarks[nextLine];

      if (pos < max && state.tShift[nextLine] < state.blkIndent) {
        // a non-empty line indented less than the parent block ends this block:
        break;
      }

      if (state.src.slice(pos, max).trim().slice(-close.length) !== close) {
        continue;
      }

      if (state.tShift[nextLine] - state.blkIndent >= 4) {
        // the closing marker must be indented less than 4 spaces
        continue;
      }

      lastLinePos = state.src.slice(0, max).lastIndexOf(close);
      lastLine = state.src.slice(pos, lastLinePos);

      pos += lastLine.length + close.length;

      // make sure tail has spaces only
      pos = state.skipSpaces(pos);

      if (pos < max) {
        continue;
      }

      // found!
      haveEndMarker = true;
    }

    // If the block has leading spaces, they should be removed from its inner content
    len = state.tShift[startLine];

    state.line = nextLine + (haveEndMarker ? 1 : 0);

    token = state.push('yaml2json', 'yaml2json', 0);
    token.block = true;
    token.content = (firstLine && firstLine.trim() ? firstLine + '\n' : '') +
      state.getLines(startLine + 1, nextLine, len, true) +
      (lastLine && lastLine.trim() ? lastLine : '');
    token.info = 'yaml'; // params is never assigned above, so record the language explicitly
    token.map = [startLine, state.line];
    token.markup = open;

    return true;
  }

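I cannot fully follow the block function myself; honestly it is still quite complicated. I copied the math plugin mentioned above and deleted the code I did not need (facepalm).

All this code really does is collect the content between the opening marker and the closing marker.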
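Stripped of markdown-it's state bookkeeping, the idea can be shown on a plain array of lines. The function extractBlock below is a hypothetical simplification, not the plugin code: it ignores indentation, single-line blocks and nesting inside other blocks.

```javascript
const OPEN = '``` yaml';
const CLOSE = '```';

// Hypothetical simplification of the block rule: given the lines of a
// document and a starting index, return the content between the markers
// and the index of the line after the block, or null if no block starts here.
function extractBlock(lines, startLine) {
  if (lines[startLine].trim() !== OPEN) return null;

  const body = [];
  let nextLine = startLine + 1;
  let closed = false;
  while (nextLine < lines.length) {
    if (lines[nextLine].trim() === CLOSE) { closed = true; break; }
    body.push(lines[nextLine]);
    nextLine++;
  }

  // Like the real rule, an unclosed block runs to the end of the document.
  return {
    content: body.join('\n') + (body.length ? '\n' : ''),
    nextLine: nextLine + (closed ? 1 : 0),
  };
}

const lines = ['intro', '``` yaml', 'a: 1', 'b: 2', '```', 'outro'];
console.log(extractBlock(lines, 1)); // { content: 'a: 1\nb: 2\n', nextLine: 5 }
console.log(extractBlock(lines, 0)); // null
```

The returned nextLine plays the role of state.line in the real rule: it tells the parser where to resume after the block.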

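import yaml from 'js-yaml'

  function render(tokens, idx) {
    // The token content is the raw YAML text captured by the block rule.
    let content = tokens[idx].content;
    let y = yaml.load(content)
    // A two-space indent matches the desired output shown earlier.
    let json = JSON.stringify(y, null, 2)
    // Escape the JSON so that characters such as < and & are not read as HTML.
    // md is the markdown-it instance the rule was registered on.
    return '<pre>' + md.utils.escapeHtml(json) + '</pre>';
  }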

The input and output of render are both very simple and easy to read at a glance.
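One thing the render function does not cover is invalid input: if the parser throws, rendering the whole document fails. A possible variation, sketched under assumptions: createRender is a hypothetical factory, and the parser and the HTML escaper are injected so the example runs on its own. In the plugin they would be yaml.load and md.utils.escapeHtml; here JSON.parse and a minimal escaper stand in for them.

```javascript
// Hypothetical factory: `parse` turns the block text into a value and
// `escapeHtml` makes text safe to embed in HTML. Both are injected.
function createRender(parse, escapeHtml) {
  return function render(tokens, idx) {
    const content = tokens[idx].content;
    let text;
    try {
      const json = JSON.stringify(parse(content), null, 2);
      // JSON.stringify returns undefined for an undefined value.
      text = json === undefined ? '' : json;
    } catch (err) {
      // Fall back to the raw source plus the error message instead of
      // aborting the whole document.
      text = content + '\n' + String(err.message);
    }
    return '<pre>' + escapeHtml(text) + '</pre>';
  };
}

// Stand-ins for demonstration only.
const escapeHtml = (s) =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;')
   .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
const render = createRender(JSON.parse, escapeHtml);

console.log(render([{ content: '{"a": "<b>"}' }], 0));
console.log(render([{ content: 'not json' }], 0));
```

Whether to show the error message in the page or to log it is a design choice; showing it makes broken blocks easy to spot while writing.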

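Because our rule now handles these blocks instead of the default code parser, they no longer get syntax highlighting.

I do not need highlighting, so I can write it this way. The advantage is that the ruler code stays clean.

If you do need highlighting, you can skip the custom render and use the default code render, in which case the YAML-parsing code belongs in the ruler. Alternatively, you can highlight the code yourself with highlight.js.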
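The first option could look roughly like this. It is a sketch under assumptions: pushJsonFence is a hypothetical helper that would replace the state.push('yaml2json', ...) part at the end of the block rule, convert stands for the YAML-to-JSON step, and fakeState only imitates the push method of markdown-it's block state. As I understand it, the built-in renderer for fence tokens passes token.content and the language from token.info to the highlight option when one is configured.

```javascript
// Hypothetical helper for the end of the block rule: instead of pushing a
// custom "yaml2json" token, push a "fence" token whose content is already
// JSON, so the default fence renderer handles it like a normal json block.
function pushJsonFence(state, yamlText, startLine, nextLine, convert) {
  const token = state.push('fence', 'code', 0);
  token.info = 'json';               // language name seen by the highlighter
  token.content = convert(yamlText) + '\n';
  token.markup = '```';
  token.map = [startLine, nextLine];
  return token;
}

// Stand-in for markdown-it's block state, just enough to run the sketch.
const pushed = [];
const fakeState = {
  push(type, tag, nesting) {
    const token = { type, tag, nesting };
    pushed.push(token);
    return token;
  },
};

// In the plugin, convert would be something like
//   text => JSON.stringify(yaml.load(text), null, 2)
// A fixed string keeps this example free of dependencies.
pushJsonFence(fakeState, 'a: 1', 0, 3, () => '{\n  "a": 1\n}');
console.log(pushed[0].type, pushed[0].info); // fence json
```

With this approach no entry in md.renderer.rules is needed, at the cost of putting the conversion logic into the rule.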