Using Highlight With Pandoc

by Tristano Ajmone, MIT License (MIT). Last edited: Oct 26, 2017.

"Highlight.pp" v2.1.1 (2017-10-26)

“Highlight PP-Macros” is a set of macros that add cross-platform, in-document access to Highlight within Markdown and reStructuredText source files via PP (a pandoc preprocessor) and pandoc. It allows the following operations from within source documents:


Introduction

The solution proposed here relies on three tools to achieve integration of Highlight in pandoc source files:

  1. Pandoc
  2. PP (a generic pandoc preprocessor)
  3. The Highlight.pp macros file found in this folder

Pandoc

Pandoc, aka “the universal markup converter”, is a well-known command line tool for converting documents from one format to another, and especially from/to various markdown flavors, including “pandoc markdown” (an extended markdown variant introduced by pandoc itself). This very document was written in markdown and converted to HTML5 using pandoc and a custom template.

Pandoc is widely used for academic papers and technical documentation, and it ships with a built-in syntax highlighter (skylighting). By default, pandoc doesn’t support third-party syntax highlighters, and the supported languages are hardcoded into pandoc and not extensible by end users without recompiling pandoc.[1]

Furthermore, pandoc doesn’t provide any means to import the source code from an external file — every time the source file changes, you need to manually copy-&-paste its contents again.

PP

Highlight PP-Macros leverage PP, a generic preprocessor “written with pandoc in mind.” PP adds to the pandoc workflow the power of user definable macros and a set of built-in macros for cross-platform file I/O operations and conditional branching — and many other great features not covered here.

Since one of PP’s goals was to introduce literate programming functionality in pandoc workflows, it provides a solid base for handling external files, invoking third party tools, and performing operations on source code blocks.

Highlight Macros

The file “Highlight.pp” contains the definitions of the macros for integrating Highlight in pandoc workflow. It was taken from the “Pandoc Goodies” project, this file being one of the various macros modules that make up the PP-Macros Library subproject hosted therein:

ADVICE: If you intend to use the Highlight PP-Macros, you are strongly advised to regularly check the above link for updated versions of “Highlight.pp”. The contents of this folder are intended mainly as a starting example of how to integrate Highlight into a pandoc workflow, and they may not reflect the latest release of the macros found in the Pandoc Goodies project.

How It Works

Currently, the Highlight PP-Macros module works only with HTML output. It works on every OS that supports PP.

There are three macros available:

The !HighlightFile Macro

The !HighlightFile macro takes two mandatory parameters and an optional third one:

  1. FILE — the name of the external source code file.
  2. LANG — the language of the source code.
  3. OPTIONS — additional command line options to be passed to Highlight.

The LANG parameter is mandatory even if the source code file has an extension known to Highlight: the macro needs LANG in order to write its value as the class of the enclosing <pre><code> tags (these tags are written by the macro, not by Highlight). For example, !HighlightFile(somefile.py)(python) will produce:

<pre class="hl python"><code class="python">
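The wrapping step can be pictured with a small shell sketch. The helper name wrap_code_block is hypothetical and not part of Highlight.pp, and the closing tags are assumed to mirror the opening ones shown above; the sketch only illustrates how LANG ends up in the class attributes around Highlight's output.

```shell
# Hypothetical helper (not part of Highlight.pp): wrap already-highlighted
# HTML read from stdin in the tags that the macro writes around it.
# Assumption: the closing tags simply mirror the opening ones.
wrap_code_block() {
  lang="$1"
  printf '<pre class="hl %s"><code class="%s">' "$lang" "$lang"
  cat
  printf '</code></pre>\n'
}

# Demo: a plain string stands in for Highlight's real HTML output.
printf '%s' 'print("hi")' | wrap_code_block python
```

With Highlight installed, its fragment output could be piped into the helper instead, for instance `highlight --fragment --syntax=python somefile.py | wrap_code_block python`. These two options are only an illustration and are not necessarily the ones the macro itself uses.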

By default, all Highlight macros invoke Highlight with the following basic options:

… followed by the literal string value of the OPTIONS parameter (if any), which must consist of options compatible with the above invocation settings, and keeping in mind that Highlight’s output will be enclosed in <pre><code> tags.

Since this macro outputs raw HTML blocks, pandoc’s built-in highlighter will ignore them. Pandoc will still highlight any fenced (or backticked) source code block within the markdown document, unless it is explicitly invoked with the --no-highlight option. This allows using Highlight side by side with pandoc’s highlighter, and even benefiting from dual syntax color themes (this might require some CSS tweaking, or custom stylesheets).

The !Highlight Macro

The !Highlight macro takes three parameters:

  1. LANG — the language of the source code.
  2. OPTIONS — command line options to be passed to Highlight.
  3. CODE BLOCK — the source code to highlight.

The CODE BLOCK parameter is passed as fenced code, i.e. using two equal-length lines of tildes (or backticks) as delimiters, instead of brackets:

!Highlight( LANG )( OPTIONS )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CODE BLOCK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From PP’s documentation:

The last argument can be enclosed between lines of tildas or backquotes (of the same length) instead of parenthesis, brackets or braces. This is useful for literate programming, diagrams or scripts (see examples). Code block arguments are not stripped: spaces and blank lines are preserved.

The !Highlight macro has been rewritten to work both under *nix shells and Windows CMD: the macro detects the OS and the PP-invocation context and, based on the findings, invokes the appropriate internal macro variant. (NOTE: Git Bash for Windows will trigger the shell variant.)

When invoked from CMD, each call to the !Highlight macro creates a temporary file (named “_pp-tempfileX.tmp”, where X is a numeric counter) in the folder of the invoking script, to temporarily store the code to highlight. At each PP invocation the X counter is reset, and the previous temp files are overwritten.

When run inside Shell/Bash (including Git Bash for Windows), !Highlight doesn’t write any temporary files to disk.

If you wish to change the location where temporary files are written, set the PP_MACROS_PATH environment variable to the path of the desired destination folder. Otherwise, you can add to your batch script a delete command to remove all temporary files created:

DEL _pp-tempfile*

WARNING: When using the full “Pandoc-Goodies PP-Macros Library” (as opposed to using the Highlight.pp macros module on its own) you shouldn’t use the PP_MACROS_PATH env var hack just mentioned: this var is set and used by the full library, and you won’t need to concern yourself with temporary files.
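The shell variant writes no temporary files, so a shell counterpart of the DEL command is only useful for tidying up after earlier CMD-based runs, for example from Git Bash. The following is a minimal sketch with a hypothetical helper name, clean_pp_tempfiles. It assumes the files carry the “.tmp” extension described above.

```shell
# Hypothetical helper: delete the "_pp-tempfileX.tmp" files left in a folder
# by earlier CMD-based runs. The folder defaults to the current directory.
clean_pp_tempfiles() {
  dir="${1:-.}"
  # rm -f stays silent when no file matches the pattern.
  rm -f "$dir"/_pp-tempfile*.tmp
}
```

Call it with no argument to clean the current folder, or pass the folder that holds the temporary files, e.g. `clean_pp_tempfiles "$PP_MACROS_PATH"` when that variable was used to relocate them.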

The !HighlightInlineTheme Macro

The macro !HighlightInlineTheme offers a quick solution for theming Highlight code without having to first export a theme to CSS (via Highlight) and then import it back into the final document as an external CSS file (via pandoc): the macro imports a Highlight theme directly into the document, as an inline CSS stylesheet.

It takes a single mandatory parameter:

  1. THEME — the name of the Highlight theme to import (without the “.theme” extension).

It invokes Highlight with the following options:

The macro encloses Highlight’s CSS output within <style type="text/css"> tags.

This macro can be invoked anywhere in the document and will apply to all Highlight code blocks (one theme per document). The imported theme will not affect code blocks highlighted by pandoc.

NOTE: Pandoc’s built-in highlighting styles might or might not interfere with imported Highlight themes, depending on the style’s definition (pandoc’s default style, pygments, usually doesn’t interfere because it doesn’t set a background color). In some cases, slight adjustments to Highlight’s CSS are required to preserve styling integrity; this can be achieved by adding some custom inline CSS that affects the classes used by both highlighters.
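The inline-stylesheet idea can be sketched in shell. The helper name wrap_inline_css is hypothetical and not part of Highlight.pp, and the CSS rule in the demo is hand-written rather than taken from a real Highlight theme; the sketch only shows how a theme's CSS ends up inside the <style> tags mentioned above.

```shell
# Hypothetical helper (not part of Highlight.pp): wrap CSS read from stdin
# in the <style> tags that the macro places around the imported theme.
wrap_inline_css() {
  printf '<style type="text/css">\n'
  cat
  printf '\n</style>\n'
}

# Demo: a hand-written rule stands in for the CSS exported by Highlight.
printf '%s' 'pre.hl { background-color: #fff; }' | wrap_inline_css
```

The same wrapper could carry the custom adjustments mentioned in the note above, since any rule placed inside the tags applies to the whole document.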

Example Files

Building The Examples

This folder contains prebuilt versions of the example files, but if you want to experiment with them, here are the instructions.

Ensure that pandoc, PP and Highlight are available on the system PATH, then execute the build-example script (build-example.bat in CMD, build-example.sh in Shell/Bash).

The script issues two commands:

pp -import Highlight.pp example.md > example-preprocessed.md
pandoc --self-contained -S --normalize example-preprocessed.md -o example.html

The first line invokes pp and tells it to -import the contents of Highlight.pp (so that the macros herein defined are loaded in the current environment) and then process the file example.md. Since PP outputs to STDOUT, we redirect (>) its output to example-preprocessed.md.

If you open example-preprocessed.md in a text editor you’ll see how the macros inside example.md were expanded.

The second line invokes pandoc and tells it to convert example-preprocessed.md and write the output (-o) to the example.html file (pandoc is smart enough to deduce the desired input and output formats from the files’ extensions). The --self-contained option is used to produce a fully standalone HTML file with no external dependencies. The -S (short for --smart) and --normalize options are to produce a nicer output text.

NOTE: In a real work scenario, you would carry out the above tasks with a single command, by piping (|) PP’s output directly to pandoc instead of writing it to the example-preprocessed.md file:

pp -import Highlight.pp example.md | pandoc --self-contained -S --normalize -o example.html

For learning purposes, the build-example scripts write to disk the intermediate preprocessed markdown file, so that you may examine how the macros are expanded by PP before feeding them to pandoc.
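A build script along these lines might first check that the three tools are reachable before running the one-step pipeline. This is only a sketch: the helper names missing_tools and build_example are hypothetical, it is not the actual content of build-example.sh, and it assumes Highlight's executable is named highlight.

```shell
#!/bin/sh
# Sketch of a build script; helper names are hypothetical.

# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Run the one-step build: $1 is the markdown source, $2 the HTML output.
build_example() {
  pp -import Highlight.pp "$1" | pandoc --self-contained -S --normalize -o "$2"
}

# Build only when the input files are present and all tools are found.
if [ -f example.md ] && [ -f Highlight.pp ]; then
  missing=$(missing_tools pandoc pp highlight)
  if [ -z "$missing" ]; then
    build_example example.md example.html
  else
    # $missing is left unquoted on purpose to join the names on one line.
    echo "Not on PATH:" $missing
  fi
fi
```

The pandoc options are copied from the commands above and match pandoc v1; pandoc 2 and later no longer accept -S and --normalize, so they would need adjusting there.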

Enjoy…

Tristano Ajmone
(Italy)

Changelog

2017/10/26

2017/04/16

2017/04/16

( first release )


  1. This is the state of things with pandoc v1; with the upcoming release of pandoc version 2 this will change (see #3334) as dynamic loading of syntax definitions will be supported via the new built-in highlighter. Even so, using Highlight with pandoc will still be possible and desirable because of the features offered by Highlight plugins, or because of syntax definitions not available for pandoc.