RFC Beautification Working Group R. Gieben
Internet-Draft Google
Intended status: Informational January 1, 2013
Expires: July 5, 2013

Writing I-Ds and RFCs using Pandoc and xml2rfc 2.x
draft-gieben-writing-rfcs-pandoc-02

Abstract

This memo presents a technique for using Pandoc syntax as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series.

This version is adapted to work with "xml2rfc" version 2.x.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 5, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction

This document presents a technique for using Pandoc syntax as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series.

This version is adapted to work with xml2rfc version 2.x.

Pandoc is an "almost plain text" format and therefor particularly well suited for editing RFC-like documents.

2. Pandoc to RFC

When writing [RFC4641] we directly wrote the XML. Needless to say it was tedious even thought the XML of xml2rfc is very "light". The latest version of xml2rfc version 2 can be found here.

During the last few years people have been developing markup languages that are very easy to remember and type. These languages have become known as almost plain text-markup languages. One of the first was the Markdown syntax. One that was developed later and incorporates Markdown and a number of extensions is Pandoc. The power of Pandoc also comes from the fact that it can be translated to numerous output formats, including, but not limited to: HTML, (plain) Markdown and docbook XML.

So using Pandoc for writing RFCs seems like a sane choice. As xml2rfc uses XML, the easiest way would be to create docbook XML and transform that using XSLT. Pandoc2rfc does just that. The conversions are, in some way amusing, as we start off with (almost) plain text, use elaborate XML and end up with plain text again.

+-------------------+   pandoc   +---------+  
| ALMOST PLAIN TEXT |   ------>  | DOCBOOK |  
+-------------------+            +---------+  
              |                       |
non-existent  |                       | xsltproc
  faster way  |                       |
              v                       v
      +------------+    xml2rfc  +---------+ 
      | PLAIN TEXT |  <--------  | XML2RFC | 
      +------------+             +---------+ 

Figure 1: Attempt to justify Pandoc2rfc.

The XML generated (the output after the xsltproc step in Figure 1) is suitable for inclusion in either the middle or back section of an RFC. The simplest way is to create a template XML file and include the appropriate XML:

<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
<!ENTITY pandocMiddle PUBLIC '' 'middle.xml'>
<!ENTITY pandocBack   PUBLIC '' 'back.xml'>
]>

<rfc ipr='trust200902' docName='draft-gieben-pandoc-rfcs-02'>
 <front>
    <title>Writing I-Ds and RFCs using Pandoc v2</title>
</front>

<middle>
    &pandocMiddle;
</middle>

<back>
    &pandocBack;
</back>

</rfc>

Figure 2: A minimal template.xml.

See the Makefile for an example of this. In this case you need to edit 3 documents:

  1. middle.pdc - contains the main body of text;
  2. back.pdc - holds appendices and references;
  3. template.xml (probably a fairly static file).

The draft (draft.txt) you are reading now, is automatically created when you call make. The homepage of Pandoc2rfc is this github repository.

2.1. Dependencies

It needs xsltproc and pandoc to be installed. See the Pandoc user manual for the details on how to type in Pandoc style. And of course xml2rfc version two.

When using Pandoc2rfc consider adding the following sentence to an Acknowledgements section:

 This document was produced using the Pandoc2rfc tool.

3. Starting a new project

When starting a new project with pandoc2rfc you'll need to copy the following files:

After that you can start editing.

4. Supported Features

5. Unsupported Features

6. Acknowledgements

The following people have helped to make Pandoc2rfc what it is today: Benno Overeinder, Erlend Hamnaberg, Matthijs Mekking, Trygve Laugstøl.

This document was prepared using Pandoc2rfc.

7. Pandoc Constructs

So, what syntax do you need to use to get the correct output? Well, it is just Pandoc. The best introduction to the Pandoc style is given in this README from Pandoc itself.

For convenience we list the most important ones in the following sections.

7.1. Paragraph

Paragraphs are separated with an empty line.

7.2. Section

Just use the normal sectioning commands available in Pandoc, for instance:

# Section1 One
Bla

Converts to: <section title="Section1 One" anchor="section1-one"> If you have another section that is also named "Section1 One", that anchor will be called "section1-one-1", but only when the sections are in the same source file!

Referencing the section is done with see [](#section1-one), as in see Section 7.2.

7.3. List Styles

A good number of styles are supported.

7.3.1. Symbol

A symbol list.

* Item one;
* Item two.

Converts to <list style="symbol">:

7.3.2. Number

A numbered list.

1. Item one;
1. Item two.

Converts to <list style="numbers">:

  1. Item one;
  2. Item two.

7.3.3. Empty

Using the default list markers from Pandoc:

A list using the default list markers.

#. Item one;
#. Item two.

Converts to <list style="empty">:

7.3.4. Roman

Use the supported Pandoc syntax:

ii. Item one;
ii. Item two.

Converts to <list style="format %i.">:

i.
Item one;
ii.
Item two.

If you use uppercase Roman numerals, they convert to a different style:

II. Item one;
II. Item two.

Yields <list style="format (%d) ">:

(1)
Item one;
(2)
Item two.

7.3.5. Letter

A numbered list.

a.  Item one;
b.  Item two.

Converts to <list style="letters">:

  1. Item one;
  2. Item two.

Uppercasing the letters works too (note two spaces after the letter.

A.  Item one;
B.  Item two.

Becomes:

A.
Item one;
B.
Item two.

7.3.6. Hanging

This is more like a description list, so we need to use:

First item that needs clarification:

:   Explanation one
More stuff, because item is difficult to explain.
* item1
* item2

Second item that needs clarification:

:   Explanation two

Converts to: <list style="hanging"> and <t hangText="First item that...">

If you want a newline after the hangTexts, search for the string OPTION in transform.xsl and uncomment it.

7.4. Figure/Artwork

Indent the paragraph with 4 spaces.

Like this

Converts to: <figure><artwork> ... Note that xml2rfc supports a caption with <artwork>. Pandoc does not support this, but Pandoc2rfc does. If you add a @Figure: some text as the last line, the artwork gets a title attribute with the text after @Figure:. It will also be possible to reference the artwork. If a caption is supplied the artwork will be centered. If a caption is needed but the figure should not be centered use @figure:\.

7.4.1. References

The reference anchor attribute will be: fig: + first 10 (normalized) characters from the caption. Where normalized means:

So the first artwork with a caption will get fig:a-minimal- as a reference. See for instance Figure 2.

This anchoring is completely handled from within the xslt. Note that duplicate anchors are an XML validation error which will make xml2rfc fail.

7.5. Block Quote

Any paragraph like:

> quoted text

Converts to: <t><list style="empty"> ... paragraph, making it indented.

7.6. References

7.6.1. External

Any reference like:

[Click here](URI)

Converts to: <ulink target="URI">Click here ...

7.6.2. Internal

Any reference like:

[Click here](#localid)

Converts to: <link target="localid">Click here ...

For referring to RFCs (for which you manually need add the reference source in the template, with an external XML entity), you can just use:

[](#RFC2119)

And it does the right thing. Referencing sections is done with:

See [](#pandoc-constructs)

The word 'Section' is inserted automatically: ... see Section 7 ... For referencing figures/artworks see Section 7.4. For referencing tables see Section 7.8.

7.7. Spanx Styles

The verb style can be selected with back-tics: `text` Converts to: <spanx style="verb"> ...

And the emphasis style with asterisks: *text* or underscores: _text_ Converts to: <spanx style="emph"> ...

And the emphasis style with double asterisks: **text** Converts to: <spanx style="strong"> ...

7.8. Tables

A table can be entered as:

  Right     Left     Center     Default                                                                  
  -------   ------ ----------   -------                                                                  
       12     12        12        12                                                                   
      123     123       123       123                                                                   
        1     1         1         1       

  Table: A caption describing the table.
  

Figure 3: A caption describing the figure describing the table.

Is translated to <texttable> element in xml2rfc. You can choose multiple styles as input, but they all are converted to the same style (plain <texttable>) table in xml2rfc. The column alignment is copied over to the generated XML.

7.8.1. References

The caption is always translated to a title attribute. If a table has a caption, it will also get a reference. The reference anchor attribute will be: tab- + first 10 (normalized) characters from the caption. Where normalized means:

So the first table with a caption will get tab-a-caption- for reference use. See for instance

This anchoring is completely handled from within the xslt. Note that duplicate anchors are an XML validation error which will make xml2rfc fail.

7.9. Indexes

The footnote syntax of Pandoc is slightly abused to support an index. Footnotes are entered in two steps, you have a marker in the text, and later you give actual footnote text. Like this:

[^1]

[^1]: footnote text

We re-use this syntax for the <iref> tag. The above text translates to:

<iref item="footnote text"/>

Sub items are also supported. Use an exclamation mark (!) to separate them:

[^1]: item!sub item

8. Usage guidelines

8.1. Working with multiple files

As an author you will probably break up a draft in multiple files, each dealing with a subject or section. When doing so sections with the same title will clash with each other. Pandoc can deal with this situation, but only if the different sections are in the same file or processed in the same Pandoc run. Concatenating the different section files before processing them is a solution to this problem. You can, for instance, amend the Makefile and add something like this:

allsections.pdc: section.pdc.1 section.pdc.2 section.pdc.3
        cat $@ > allsections.pdc

And then process allsection.pdc in the normal way.

8.2. Setting the title

If you use double quotes in the documents title in the docName attribute, like:

<rfc ipr="trust200902" docName="draft-gieben-writing-rfcs-pandoc-02">

The Makefile will pick this up automatically and make a symbolic link:

draft-gieben-writing-rfcs-pandoc-00.txt -> draft.txt

This makes uploading the file to the i-d tracker a bit easier.

8.3. Uploading the XML/txt

The draft.xml target will generate an XML file with all XML included, so you can upload just one file to the I-D tracker.

8.4. VIM syntax highlighting

If you are a VIM user you might be interested in a syntax highlighting file (see [VIM]) that slightly lightens up your reading experience while viewing a draft.txt from VIM.

9. Security Considerations

This document raises no security issues.

10. IANA Considerations

This document has no actions for IANA.

11. References

11.1. Informative References

[VIM] Gieben, R., "VIM syntax file for RFCs and I-Ds", October 2012.

11.2. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4641] Kolkman, O. and R. Gieben, "DNSSEC Operational Practices", RFC 4641, September 2006.

Appendix A. Tests

This appendix consists out of a few tests that should all render to proper xml2rfc XML.

A.1. A Very Long Title Considerations With Regards to the Already Deployed Routing Policy

Test a very long title.

A.2. Markup in heading

This is discarded by xml2rfc.

A.3. Blockquote

A.4. Verbatim code blocks

A verbatim code block
jkasjksajassjasjsajsajkas

A.5. Reference Tests

Refer to RFC 2119 if you will. Or maybe you want to inspect Figure 2 in Section 2 again. Or you might want to Click here.

A.6. Spanx Tests

A.7. List Tests

  1. First we do
  2. And then

And the other around.

Description lists:

Item to explain:
It works because of herbs.
Another item to explain:
More explaining.

Multiple paragraphs in such a list.

lists in description lists.

Item to explain:
It works because of
  1. One
  2. Two

Another item to explain:
More explaining
Item to explain:
It works because of
  1. One1
  2. Two1
    • Itemize list
    • Another item

Another item to explain again:
More explaining

list with description lists.

  1. More
    Item to explain:
    Explanation ...
    Item to explain:
    Another explanation ...

  2. Go'bye

Multiple paragraphs in a list.

Artwork

  1. This is the first bullet point and it needs multiple paragraphs...

    ... to be explained properly.
  2. This is the next bullet. New paragraphs should be indented with 4 four spaces.
  3. Another item with some artwork, indented by 8 spaces.
  4. Final item.

xml2rfc does not allow this, so the second paragraph is faked with a

<vspace blankLines='1'>

Ordered lists.

  1. First item
  2. Second item

A lowercase roman list:

i.
Item 1
ii.
Item 2

An uppercase roman list.

(1)
Item1
(2)
Item2
(3)
Item 3

And default list markers.

Some surrounding text, to make it look better.

Text at the end.

Lowercase letters list.

  1. First item
  2. Second item

Uppercase letters list.

A.
First item
B.
Second item

And artwork in a description list.

miek.nl.    IN  NS  a.miek.nl.                             
a.miek.nl.  IN  A   192.0.2.1    ; <- this is glue            

Item1:
Tell something about it. Tell something about it. Tell something about it. Tell something about it. Tell something about it. Tell something about it.

Tell some more about it. Tell some more about it. Tell some more about it.
Item2:
Another description

List with a sublist with a paragraph above the sublist

  1. First Item
  2. Second item
  3. Third item

    A paragraph that comes first
    1. But what do you know
    2. This is another list

A.8. Table Tests

Demonstration of simple table syntax.
Right Left Center Default
12 12 12 12
123 123 123 123
1 1 1 1
Here's the caption. It, too, may span multiple lines. This is a multiline table. This is verbatim text.
Centered Header Default Aligned Right Aligned Left Aligned
First row 12.0 Example of a row that spans multiple lines.
Second row 5.0 Here's another one. Note the blank line between rows.

Sample grid table.
Fruit Price Advantages
Bananas $1.34 built-in wrapper
Oranges $2.10 cures scurvy

Grid tables without a caption

Fruit Price Advantages
Bananas $1.34 built-in wrapper
Oranges $2.10 cures scurvy

This table has no caption, and therefor no reference. But you can refer to some of the other tables, with for instance:

See [](#tab-here-s-the)

Which will become "See Table 2".

We should also be able to refer to the table numbers directly, to say things like 'Look at Tables 1, 2 and 3.'

A.9. Numbered examples

This is another example:

  1. Another bla bla..

as (1) shows...

A.10. Figure tests

This is a figure
This is a figure
This is a figure
This is a figure

Figure 4: This is the caption, with text in `typewriter`. Which isnt converted to a <spanx> style, because this is copied as-is.

And how a figure that is not centered, do to using figure and not Figure.

This is a figure
This is a figure

Figure 5: A non centered figure.

Test the use of @title:

This is a figure with a title
This is a figure with a title
@title: and here it is: a title, don't mess it up *

A.11. Verse tests

This is a verse text This is another line

Index

L
list
  default markers
  Uppercase Letters
T
table
  grid
  simple

Author's Address

R. (Miek) Gieben Google 123 Buckingham Palace Road London, SW1W 9SH UK EMail: miek@miek.nl URI: http://miek.nl/