reStructuredText and the curious case of how headers are implemented by Matthew Setter

April 18th, 2017

While I use reStructuredText and its companion platform, Sphinx-Doc, a lot, that doesn't mean that I believe they're the best combination for technical writing and documentation.

It's not that I dislike either of them; quite the opposite. On the whole, they're both accomplished tools, ones that provide an enormous level of functionality designed around the needs of technical writing and communication.

But they have shortcomings that I don't like; ones that regularly frustrate me because they reduce both my productivity and effectiveness.

Given that, and that I’ve worked with them for long enough now, I believe it’s fair that I offer some constructive criticism about some of the areas in which I’d love to see them improve.

How reStructuredText Headers Work

Today, I’m going to start with the way in which reStructuredText implements headers. Note: this isn’t intended as a hatchet job.

If you're not familiar with it, let's step through how it works, using the example below as our guidepost.

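This is the first header
========================

This is the second header
-------------------------

This is the third header
~~~~~~~~~~~~~~~~~~~~~~~~

This is the fourth header
^^^^^^^^^^^^^^^^^^^^^^^^^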

To mark up a line of text as a header, you underline the text with a repeating character, such as in the example above. Underlining text with the same repeating character marks the text with the same level of importance.

The first underlined line is the document's top-level header. Each new underline character that appears after it marks the next lower level of importance. You can use a broad range of characters (such as carets, tildes, hyphens, and equals signs) to underline the header text.
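To make the rule concrete, here is a minimal Python sketch of how levels fall out of the order in which underline characters first appear. It is only an illustration of the idea described above, not how docutils or Sphinx-Doc actually parse headers: it assumes underline-only headers, ignores overlines and underscores, and the names `ADORNMENT` and `header_levels` are my own.

```python
import re

# A line made of one repeated punctuation character counts as an underline.
ADORNMENT = re.compile(r"^([^\w\s])\1*$")


def header_levels(text):
    """Return (title, level) pairs for underline-only headers in `text`."""
    lines = text.splitlines()
    seen = []      # underline characters, in order of first appearance
    headers = []
    for title, underline in zip(lines, lines[1:]):
        match = ADORNMENT.match(underline)
        # Skip blank titles and lines that are themselves underlines.
        if not match or not title.strip() or ADORNMENT.match(title):
            continue
        char = match.group(1)
        if char not in seen:
            seen.append(char)   # a new character opens the next level down
        headers.append((title.strip(), seen.index(char) + 1))
    return headers


sample = "\n".join([
    "Guide", "=====", "",
    "Install", "-------", "",
    "Usage", "-----", "",
    "Options", "~~~~~~~",
])

for title, level in header_levels(sample):
    print(level, title)
```

Notice that nothing in the markup itself says "level 2"; the meaning of a hyphen or a tilde depends entirely on what came before it in the same document.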

As a result, you have a lot of flexibility and choice. This is fine, and easy to get used to (with a little practice). But the approach is problematic and encourages errors. Here’s why.

Firstly, you have to remember which character was used for each header. In a large document, this can mean periodically scanning back through it to work out which character represents the header level you’re interested in.

An editor such as Vim can reduce the effort. But it's still problematic.

Secondly, if you want to change a header's level, you have to find the character that represents the level you want and replace the header's underline with it, or pick a new character if you’re adding the next level down. This may not take much time, depending on the size of your document.
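In practice, changing a header's level means rewriting its whole underline with a different character. A small helper can take the tedium out of that. The sketch below is a hypothetical convenience function of my own (`underline` is not part of reStructuredText or Sphinx-Doc), and it assumes plain single-width characters in the title.

```python
def underline(title, char):
    """Return `title` followed by an underline of `char` matching its length."""
    # Only a single punctuation character makes sense as an underline.
    if len(char) != 1 or char.isalnum() or char.isspace():
        raise ValueError("expected a single punctuation character")
    title = title.strip()
    return f"{title}\n{char * len(title)}"


# Re-level a header by regenerating its underline with another character.
print(underline("Installing the package", "="))
print()
print(underline("Installing the package", "-"))
```

You still have to know which character maps to which level in the document at hand; the helper only guarantees that the underline is long enough.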

Thirdly, you have to remember to underline all of the text in the header. You can’t underline part of it, though you can underline more if you wish.

That's my main bugbear with the way headers work in reStructuredText. Insert grumble here: countless times, Sphinx-Doc has emitted warnings while building documentation because one or more headers were missing one or more underline characters.
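A quick check before building can catch these. The following Python sketch flags headers whose underline is shorter than the title text. It is a rough approximation under my own assumptions, not a reimplementation of the real parser: it only looks at underline-only headers, measures length with `len()`, and `short_underlines` is a hypothetical name.

```python
import re

# A line made of one repeated punctuation character counts as an underline.
ADORNMENT = re.compile(r"^([^\w\s])\1*$")


def short_underlines(text):
    """Return (line_number, title) for each header whose underline is too short."""
    lines = text.splitlines()
    problems = []
    for number, (title, under) in enumerate(zip(lines, lines[1:]), start=1):
        title = title.rstrip()
        under = under.rstrip()
        # Skip blank lines and lines that are themselves underlines.
        if not title or ADORNMENT.match(title):
            continue
        if ADORNMENT.match(under) and len(under) < len(title):
            problems.append((number, title))
    return problems


good = "Installing the package"
bad = "Configuring the build"
sample = "\n".join([
    good, "=" * len(good), "",
    bad, "-" * (len(bad) - 2),   # two characters short on purpose
])

print(short_underlines(sample))
```

Here only the second header is reported, along with the line number of its title, which is usually enough to find and fix it.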

Perhaps this sounds more like a rant than a real issue. But I find the setup quite frustrating.

And then there’s another question: Why underline in the first place? Why not, instead, prefix the header text, as Markdown and other formats such as Asciidoc do, with a character that repeats to show the header's level?

How Headers Work in Other File Formats

If you're not familiar with these formats, take the following Asciidoc example:

= Header 1

== Header 2

=== Header 3

Here, a single equals sign indicates the document’s top-level header. Two equals signs indicate a second-level header, and so on. Short, sweet, and straight to the point.
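With a prefix style, a header's level can be read from the line itself, with no need to look at the rest of the document. Here is a minimal Python sketch of that idea. It is a simplification of my own rather than Asciidoctor's actual parsing rules, and `HEADER` and `asciidoc_level` are hypothetical names.

```python
import re

# One or more equals signs, whitespace, then the title text.
HEADER = re.compile(r"^(=+)\s+(\S.*)$")


def asciidoc_level(line):
    """Return (level, title) for a prefix-style header line, else None."""
    match = HEADER.match(line)
    if match is None:
        return None
    # The level is simply the number of leading equals signs.
    return len(match.group(1)), match.group(2).strip()


for line in ["= Header 1", "== Header 2", "=== Header 3", "Plain text"]:
    print(repr(line), "->", asciidoc_level(line))
```

Each line is self-describing, so there is no list of previously seen characters to keep track of.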

This approach has several advantages:

  1. You can quickly see what level of importance the header is.
  2. You use the same character, avoiding any confusion over which character represents which header.
  3. You use fewer characters to define a header as there's one per header level.
  4. You don't need to fully underline the text (this is particularly handy when editing and reviewing).

What’s Your Preference?

Given this brief comparison of both file formats, which one would you rather use? Would you rather use the one that is concise and intuitive? Or would you rather use the one that is cumbersome and tedious?

From having used a range of formats, for me, there’s no choice. The Asciidoc/Markdown style wins, hands-down.

In Conclusion

I'll be honest; I don't know the historical reasons why reStructuredText implements headers in the way that it does. Perhaps there were perfectly logical reasons for doing so. Perhaps there weren't.

But despite what I've said, on the whole reStructuredText is pretty good — especially when combined with the additions that Sphinx-Doc provides. However, there are areas where it can be improved and refined — and this is one of them!

If you're looking for a one-stop shop that combines a file format with a deployment toolchain, then it's a pretty compelling option. That said, I believe that other file formats and toolchains, especially Asciidoc and Asciidoctor, are far more intuitive and efficient.

Matthew Setter. Ethical Hacker, Online Privacy Advocate, and a Software Engineer.

Matthew Setter is a software engineer, ethical hacker, privacy advocate, & technical writer, who loves travelling. He is based in Nuremberg, Germany. When he's not doing all things tech, he's spending time with his family and friends.