Pandoc Markdown vs standard Markdown

MarkupBinder uses Pandoc 'under the hood' to convert individual content files between formats. Pandoc's Markdown differs from standard Markdown in the ways explained below; MarkupBinder enables these differences and enhancements, which make writing with Markdown easier and more reliable.

The following are excerpts from the Pandoc readme file that are relevant to a blind user; markup that requires careful alignment, mainly for defining tables, is omitted. For a full explanation of Pandoc features, see the Pandoc readme file.

In parsing markdown, Pandoc departs from and extends standard markdown in a few respects.

Backslash escapes

Except inside a code block or inline code, any punctuation or space character preceded by a backslash will be treated literally, even if it would normally indicate formatting. Thus, for example, if one writes

*\*hello\**

one will get

<em>*hello*</em>

instead of

<strong>hello</strong>

This rule is easier to remember than standard markdown’s rule, which allows only the following characters to be backslash-escaped:

\`*_{}[]()>#+-.!

A backslash-escaped space is parsed as a nonbreaking space. It will appear in TeX output as ‘~’ and in HTML and XML as ‘&#160;’ or ‘&nbsp;’.

A backslash-escaped newline (i.e. a backslash occurring at the end of a line) is parsed as a hard line break. It will appear in TeX output as ‘\\’ and in HTML as ‘<br />’. This is a nice alternative to markdown’s “invisible” way of indicating hard line breaks using two trailing spaces on a line.
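
The escape rules above can be sketched in a few lines of Python. This is only an illustration, not Pandoc's parser: the function name apply_backslash_escapes is made up for this example, the output is HTML-style, and the exception for code blocks and inline code is ignored.

```python
import string

def apply_backslash_escapes(text: str) -> str:
    """Resolve backslash escapes roughly as described above (HTML-style output).

    Hypothetical helper for illustration; it does not skip code spans.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and nxt == " ":
            out.append("&nbsp;")      # escaped space -> nonbreaking space
            i += 2
        elif ch == "\\" and nxt == "\n":
            out.append("<br />\n")    # escaped newline -> hard line break
            i += 2
        elif ch == "\\" and nxt and nxt in string.punctuation:
            out.append(nxt)           # escaped punctuation -> literal character
            i += 2
        else:
            out.append(ch)            # everything else is copied unchanged
            i += 1
    return "".join(out)

print(apply_backslash_escapes(r"\*hello\*"))
```

The call at the end prints the two asterisks literally around hello, which is the text that would then be emphasized by the outer pair of asterisks in the example above.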

Subscripts and superscripts

Superscripts may be written by surrounding the superscripted text by ^ characters; subscripts may be written by surrounding the subscripted text by ~ characters. Thus, for example,

H~2~O is a liquid.  2^10^ is 1024.

If the superscripted or subscripted text contains spaces, these spaces must be escaped with backslashes. (This is to prevent accidental superscripting and subscripting through the ordinary use of ~ and ^.) Thus, if you want the letter P with ‘a cat’ in subscripts, use P~a\ cat~, not P~a cat~.
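
As a rough illustration of why unescaped spaces stop a subscript or superscript from forming, here is a small Python sketch. It is a simplification and not how Pandoc is implemented: the name render_sub_sup is invented for this example, the output is HTML, and an escaped space is written out as an ordinary space for readability.

```python
import re

# Delimited text: a run of backslash-escaped spaces or of characters that are
# neither whitespace nor one of the two delimiters.
_BODY = r"((?:\\ |[^\s~^])+)"

def _wrap(tag):
    """Build a replacement function that wraps the match in <tag>...</tag>."""
    def repl(match):
        inner = match.group(1).replace("\\ ", " ")  # drop the escaping backslash
        return f"<{tag}>{inner}</{tag}>"
    return repl

def render_sub_sup(text: str) -> str:
    """Hypothetical helper: convert ^x^ and ~x~ to HTML sup/sub tags."""
    text = re.sub(r"\^" + _BODY + r"\^", _wrap("sup"), text)
    text = re.sub("~" + _BODY + "~", _wrap("sub"), text)
    return text

print(render_sub_sup("H~2~O is a liquid.  2^10^ is 1024."))
print(render_sub_sup(r"P~a\ cat~"))
print(render_sub_sup("P~a cat~"))
```

The last call leaves its input unchanged, because the unescaped space ends the candidate subscript before the closing ~ is reached.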

Strikeout

To strike out a section of text with a horizontal line, begin and end it with ~~. Thus, for example,

This ~~is deleted text.~~

Nested Lists

Pandoc behaves differently from standard markdown on some “edge cases” involving lists. Consider this source:

1.  First
2.  Second:
    -   Fee
    -   Fie
    -   Foe

3.  Third

Pandoc transforms this into a “compact list” (with no <p> tags around “First”, “Second”, or “Third”), while markdown puts <p> tags around “Second” and “Third” (but not “First”), because of the blank space around “Third”. Pandoc follows a simple rule: if the text is followed by a blank line, it is treated as a paragraph. Since “Second” is followed by a list, and not a blank line, it isn’t treated as a paragraph. The fact that the list is followed by a blank line is irrelevant. (Note: Pandoc works this way even when the --strict option is specified. This behavior is consistent with the official markdown syntax description, even though it is different from that of Markdown.pl.)

Ordered Lists

Unlike standard markdown, Pandoc allows ordered list items to be marked with uppercase and lowercase letters and roman numerals, in addition to arabic numerals. (This behavior can be turned off using the --strict option.) List markers may be enclosed in parentheses or followed by a single right-parenthesis or period. They must be separated from the text that follows by at least one space, and, if the list marker is a capital letter with a period, by at least two spaces.

Pandoc also pays attention to the type of list marker used, and to the starting number, and both of these are preserved where possible in the output format. Thus, the following yields a list with numbers followed by a single parenthesis, starting with 9, and a sublist with lowercase roman numerals:

 9)  Ninth
10)  Tenth
11)  Eleventh
       i. subone
      ii. subtwo
     iii. subthree

Note that Pandoc pays attention only to the starting marker in a list. So, the following yields a list numbered sequentially starting from 2:

(2) Two
(5) Three
1.  Four
*   Five
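
The "starting marker only" rule can be pictured with a short Python sketch. It is an assumption-laden illustration, not Pandoc's code: the function name renumber is hypothetical, and only arabic-numeral markers are handled.

```python
import re

def renumber(markers):
    """Return the markers a list would get if only the first marker counts.

    Hypothetical helper: the first marker supplies both the starting number
    and the punctuation style; later markers are ignored.
    """
    first = markers[0]
    match = re.search(r"\d+", first)
    if match is None:
        raise ValueError("this sketch only handles arabic-numeral markers")
    start = int(match.group())
    prefix, suffix = first[:match.start()], first[match.end():]
    # Count upward from the starting number, reusing the first marker's style.
    return [f"{prefix}{start + i}{suffix}" for i in range(len(markers))]

print(renumber(["(2)", "(5)", "1.", "*"]))
```

For the four markers in the example above, the sketch yields (2), (3), (4) and (5): the numbers and styles typed on the later items have no effect.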

If default list markers are desired, use ‘#.’:

#.  one
#.  two
#.  three

Definition lists

Pandoc supports definition lists, using a syntax inspired by PHP Markdown Extra and reStructuredText:

Term 1

:   Definition 1

Term 2 with *inline markup*

:   Definition 2

        { some code, part of Definition 2 }

    Third paragraph of definition 2.

Each term must fit on one line, which may optionally be followed by a blank line, and must be followed by one or more definitions. A definition begins with a colon or tilde, which may be indented one or two spaces. A term may have multiple definitions, and each definition may consist of one or more block elements (paragraph, code block, list, etc.), each indented four spaces or one tab stop.

If you leave space after the definition (as in the example above), the blocks of the definitions will be considered paragraphs. In some output formats, this will mean greater spacing between term/definition pairs. For a compact definition list, do not leave space between the definition and the next term:

Term 1
  ~ Definition 1
Term 2
  ~ Definition 2a
  ~ Definition 2b

Reference links

Pandoc allows implicit reference links with just a single set of brackets. So, the following links are equivalent:

1. Here's my [link]
2. Here's my [link][]

[link]: linky.com

(Note: Pandoc works this way even if --strict is specified, because Markdown.pl 1.0.2b7 allows single-bracket links.)

Footnotes

Pandoc’s markdown allows footnotes, using the following syntax:

Here is a footnote reference,[^1] and another.[^longnote]

[^1]: Here is the footnote.

[^longnote]: Here's one with multiple blocks.

    Subsequent paragraphs are indented to show that they 
belong to the previous footnote.

        { some.code }

    The whole paragraph can be indented, or just the first
    line.  In this way, multi-paragraph footnotes work like
    multi-paragraph list items.

This paragraph won't be part of the note, because it isn't indented.

The identifiers in footnote references may not contain spaces, tabs, or newlines. These identifiers are used only to correlate the footnote reference with the note itself; in the output, footnotes will be numbered sequentially.
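
To show how identifiers relate to the numbers seen in the output, here is a minimal Python sketch. It assumes that notes are numbered in the order their references first appear in the text; the function name number_footnotes is invented for this example and the pattern is a simplification of the real syntax.

```python
import re

def number_footnotes(text: str) -> dict:
    """Map each footnote identifier to a sequential output number.

    Hypothetical helper: a marker followed by a colon is taken to be the
    note's definition rather than a reference, so it is skipped.
    """
    numbers = {}
    for ident in re.findall(r"\[\^([^\s\]]+)\](?!:)", text):
        # Only the first reference to an identifier assigns its number.
        numbers.setdefault(ident, len(numbers) + 1)
    return numbers

source = (
    "Here is a footnote reference,[^1] and another.[^longnote]\n"
    "\n"
    "[^1]: Here is the footnote.\n"
    "\n"
    "[^longnote]: Here's one with multiple blocks.\n"
)
print(number_footnotes(source))
```

An identifier such as longnote is therefore only a label for the writer; the reader of the output sees the number 2.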

The footnotes themselves need not be placed at the end of the document. They may appear anywhere except inside other block elements (lists, block quotes, tables, etc.).

Inline footnotes are also allowed (though, unlike regular notes, they cannot contain multiple paragraphs). The syntax is as follows:

Here is an inline note.^[Inline notes are easier to write, since
you don't have to pick an identifier and move down to type the
note.]

Inline and regular footnotes may be mixed freely.

Delimited Code blocks

In addition to standard indented code blocks, Pandoc supports delimited code blocks. These begin with a row of three or more tildes (~) and end with a row of tildes that must be at least as long as the starting row. Everything between the tilde-lines is treated as code. No indentation is necessary:

~~~~~~~
{code here}
~~~~~~~

Like regular code blocks, delimited code blocks must be separated from surrounding text by blank lines.

If the code itself contains a row of tildes, just use a longer row of tildes at the start and end:

~~~~~~~~~~~~~~~~
~~~~~~~~~~
code including tildes
~~~~~~~~~~
~~~~~~~~~~~~~~~~
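
If code is being wrapped by a script rather than by hand, a suitable row length can be computed. The Python sketch below is one possible approach, not something Pandoc provides: the function name fence_code is hypothetical, and it is deliberately cautious in that it counts any run of tildes in the code, not only rows that stand on a line by themselves.

```python
import re

def fence_code(code: str) -> str:
    """Wrap code in tilde rows longer than any run of tildes it contains."""
    longest = max((len(run) for run in re.findall(r"~+", code)), default=0)
    # At least three tildes are needed, and one more than the longest run.
    fence = "~" * max(3, longest + 1)
    return f"{fence}\n{code}\n{fence}"

print(fence_code("~~~~~~~~~~\ncode including tildes\n~~~~~~~~~~"))
```

Remember that the result still needs a blank line before and after it when it is placed in a document.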

Optionally, you may specify the language of the code block using this syntax:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {.haskell .numberLines}
qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++
               qsort (filter (>= x) xs) 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some output formats can use this information to do syntax highlighting. Currently, the only output format that uses this information is HTML.

If pandoc has been compiled with syntax highlighting support, then the code block above will appear highlighted, with numbered lines. (To see which languages are supported, run pandoc --version.)

If pandoc has not been compiled with syntax highlighting support, the code block above will appear as follows:

<pre class="haskell">
  <code>
  ...
  </code>
</pre>

Images with captions

An image occurring by itself in a paragraph will be rendered as a figure with a caption. (In LaTeX, a figure environment will be used; in HTML, the image will be placed in a div with class figure, together with a caption in a p with class caption.) The image’s alt text will be used as the caption.

![This is the caption](/url/of/image.png)

If you want a regular inline image, make sure it is not the only thing in the paragraph. One way to do this is to insert a nonbreaking space after the image:

![This image won't be a figure](/url/of/image.png)\

Markdown in HTML blocks

While standard markdown leaves HTML blocks exactly as they are, Pandoc treats text between HTML tags as markdown. Thus, for example, Pandoc will turn

<table>
    <tr>
        <td>*one*</td>
        <td>[a link](http://google.com)</td>
    </tr>
</table>

into

<table>
    <tr>
        <td><em>one</em></td>
        <td><a href="http://google.com">a link</a></td>
    </tr>
</table>

whereas Markdown.pl will preserve it as is.

There is one exception to this rule: text between <script> and </script> tags is not interpreted as markdown.

This departure from standard markdown should make it easier to mix markdown with HTML block elements. For example, one can surround a block of markdown text with <div> tags without preventing it from being interpreted as markdown.

Blank lines before headers and blockquotes

Standard markdown syntax does not require a blank line before a header or blockquote. Pandoc does require this (except, of course, at the beginning of the document). The reason for the requirement is that it is all too easy for a > or # to end up at the beginning of a line by accident (perhaps through line wrapping). Consider, for example:

I like several of their flavors of ice cream:  #22, for example, and
#5.
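
The blank-line requirement can be checked mechanically before conversion. The following Python sketch is a simplified illustration only: the function name block_start_lines is invented here, and any line beginning with # or > is treated as a candidate, which is cruder than Pandoc's actual parsing.

```python
def block_start_lines(text: str) -> list:
    """Return the 0-based numbers of lines that would start a header or blockquote.

    Hypothetical helper applying the rule above: a line beginning with # or >
    only counts at the start of the document or after a blank line.
    """
    lines = text.split("\n")
    found = []
    for i, line in enumerate(lines):
        starts_block = line.startswith(("#", ">"))
        preceded_by_blank = i == 0 or lines[i - 1].strip() == ""
        if starts_block and preceded_by_blank:
            found.append(i)
    return found

sample = (
    "I like several of their flavors of ice cream:  #22, for example, and\n"
    "#5."
)
print(block_start_lines(sample))
```

For the ice cream example the result is an empty list: the line beginning with #5. follows ordinary text, so under this rule it stays part of the paragraph instead of becoming a header.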
