CodeDown is a simple but very universal converter between code and documentation texts, based on Markdown as the central format. codedown is an application and command that performs all these conversions on demand.
For programmers there are two kinds of texts:
code, that is, text that talks to the computer (i.e. some compiler or interpreter), written in some programming language like C, Perl, Lisp or SQL.
documentation, addressed to human readers. The documentation can be plain text or a standard text format such as HTML, LaTeX, DocBook or PDF, which is usually read through a device like a browser or reader, possibly after a conversion.
Any non-trivial and serious program nowadays comprises both: some code and its documentation. And because both kinds of text are so closely related, they are often put into the same source text. This is usually done in one of two fashions:
In the first version of the single source text approach, the source text is a well-formed code text. The documentation is usually inserted as comments, and a document generator converts the source text into a document.
+------+          document generator
| Code |------------------------------------> Document
+------+                                         |
   |                                             |
   |                                             |
   V                                             V
compiler or interpreter             viewer, printer or browser
A classic example of such a document generator from the source code is Javadoc, that takes Java source code files and generates HTML documents.
In the second version, the source is text in some markup or document format, where code is inserted by a special syntax construction, and a special application extracts these code parts and feeds them to a compiler or interpreter. This method is often called literate programming, but we more appropriately name it code extraction.
              code extractor               +----------+
Code <------------------------------------ | Document |
 |                                         +----------+
 |                                              |
 |                                              |
 V                                              V
compiler or interpreter            viewer, printer or browser
In Haskell, for example, this can be done by inserting Haskell code either with a leading > symbol at the beginning of each line into a plain text document, or by putting it in a \begin{code} ... \end{code} block within a LaTeX document. The code extractor is built into the Haskell compiler (such as GHC).
codedown is a universal document generator and code extractor
CodeDown comes with the codedown command line application, which
is both a document generator and a code extractor in the fashion just described.
is defined for, or at least easily adapted to, virtually all code languages: C, Lisp, etc. This makes it not only a document generator, but even a document generator generator.
is very universal regarding the types of document as well: HTML, LaTeX, EPUB, etc.
The choice in each case is achieved by a specification of the source and target format. For example,
codedown --from=PHP --to=HTML ...
invokes a document generator that takes PHP code as input and produces HTML output
codedown --from=Cpp --to=Markdown ...
is a document generator that takes C++ code as input and produces Markdown text
codedown --from=markdown --to=cpp ...
is a code extractor that takes Markdown source text and returns C++ program code
where in all cases "..." stands for more specifications, such as the input and output files and other options.
By the way,
codedown --help=codes
returns a list of all the types of code understood by the current implementation, and
codedown --help=docs
displays the currently supported document formats.
CodeDown uses Markdown as the central format for all conversions.
PHP    \                                                               / HTML
Java   |            CoreCodeDown                  Pandoc               | LaTeX
Perl   | = Code <------------------> Markdown <-----------> Document = | PDF
Scheme |                                                               | RTF
etc.   /                                                               \ etc.
The back and forth conversions between code and Markdown are done by the CoreCodeDown module; all conversions between Markdown and any other document format are performed by the Pandoc module.
For example, a call of the CodeDown PHP-to-HTML converter works in two steps behind the scenes:
      CoreCodeDown                Pandoc
PHP ----------------> Markdown ------------> HTML
First, the PHP source code is converted to Markdown by functions of the CoreCodeDown module. Second, the intermediate Markdown text is converted to HTML by the Pandoc module.
An invocation of the HTML-to-C converter first applies Pandoc to convert HTML to Markdown, and then calls CoreCodeDown to convert Markdown to C.
     CoreCodeDown                  Pandoc
C <------------------ Markdown <------------- HTML
A PHP-to-Markdown conversion only uses CoreCodeDown to convert the code into Markdown text:
       CoreCodeDown
PHP -----------------> Markdown
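The routing through Markdown described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name plan_conversion, the step labels and the two abbreviated sets of format names are hypothetical and not taken from the CodeDown implementation, and the sketch simply assumes that every listed document format can be both read and written.

```python
# Hypothetical sketch of the routing through Markdown; not the real CodeDown code.
CODE_TYPES = {"c", "cpp", "php", "java", "perl", "scheme", "sml", "javascript"}
DOC_TYPES = {"html", "latex", "docbook", "epub"}  # abbreviated, assumed readable and writable


def plan_conversion(source, target):
    """Return the list of (module, from, to) steps for one conversion."""
    src, dst = source.lower(), target.lower()  # format names are case-insensitive
    steps = []
    # Step 1: bring the source to Markdown (nothing to do if it already is Markdown).
    if src in CODE_TYPES:
        steps.append(("CoreCodeDown", src, "markdown"))
    elif src in DOC_TYPES:
        steps.append(("Pandoc", src, "markdown"))
    elif src != "markdown":
        raise ValueError(f"unknown source format: {source}")
    # Step 2: bring Markdown to the target (nothing to do if the target is Markdown).
    if dst in CODE_TYPES:
        steps.append(("CoreCodeDown", "markdown", dst))
    elif dst in DOC_TYPES:
        steps.append(("Pandoc", "markdown", dst))
    elif dst != "markdown":
        raise ValueError(f"unknown target format: {target}")
    return steps
```

With these assumptions, plan_conversion("PHP", "HTML") yields a CoreCodeDown step followed by a Pandoc step, while plan_conversion("PHP", "Markdown") yields the CoreCodeDown step alone.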
Pandoc is a truly universal document converter, implemented by John MacFarlane as an independent Haskell package. We strongly recommend installing and using it along with the CodeDown package. Pandoc has its own command line application pandoc
with a very rich set of options. The syntax of the codedown
command is designed very closely to the pandoc
syntax1, so that both programs may support each other. 2
Suppose you have a C source file Sample.c, enriched with CodeDown documentation, and you want to turn that into a nice HTML document Sample.html. Besides, you want this document to be a full HTML document (with <head> and everything), so you use the Pandoc option -s (which is short for --standalone). Also, there should be a comfortable table of contents (Pandoc option: --toc or --table-of-contents) and a nice CSS stylesheet integration (Pandoc option: --css=CodeDown.css or -c CodeDown.css).
Altogether, this can be done with a single line:
codedown --from=C --to=HTML --input=Sample.c --output=Sample.html -s --css=CodeDown.css --toc
That call is automatically decomposed into two command calls, namely first
codedown --from=c --to=Markdown --input=Sample.c --output=Sample.c.markdown
that generates an intermediate Markdown file Sample.c.markdown, and secondly
codedown --from=Markdown --to=Html --input=Sample.c.markdown --output=Sample.html -s --css=CodeDown.css --toc
which is just a disguised form of the actual pandoc call of
pandoc --from=Markdown --to=Html --output=Sample.html -s --css=CodeDown.css --toc Sample.c.markdown
(Note the biggest difference between codedown and pandoc calls: codedown has an --input option, but for pandoc the input files are simply appended after the other options.)
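The translation of the second step into a pandoc call can be sketched as follows. This is a minimal illustration, not the actual codedown code: the function name pandoc_command and its parameters are hypothetical, and it only uses the pandoc options that appear in the example above.

```python
# Hypothetical sketch: build the pandoc argument list for a Markdown-to-document step.
def pandoc_command(source, target, inputs, output, extra_options=()):
    """Translate codedown-style settings into a pandoc argument list."""
    argv = ["pandoc", f"--from={source}", f"--to={target}", f"--output={output}"]
    argv.extend(extra_options)  # options such as -s, --css=... or --toc are passed through
    argv.extend(inputs)         # pandoc has no --input option: files are plain arguments
    return argv


if __name__ == "__main__":
    command = pandoc_command(
        "Markdown", "Html", ["Sample.c.markdown"], "Sample.html",
        ["-s", "--css=CodeDown.css", "--toc"],
    )
    print(" ".join(command))
```

The resulting list could be handed to subprocess.run, provided pandoc is installed.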
According to its design, we continue here as follows:
The use of CodeDown only makes sense when the strength and features of Markdown are understood and appreciated. We therefore start with a recap of this lightweight markup language.
CoreCodeDown explains how the code-to-Markdown and the Markdown-to-code conversions are defined. There are only two or three simple rules applied to all types of code, and therein lie the strengths and weaknesses of the whole application.
The codedown user guide explains the use of the application, which combines CoreCodeDown conversions and the power of Pandoc into one command.
Markdown was originally designed as a way to ease the generation and comprehension of HTML source code. But meanwhile, there are a couple of Markdown extensions and implementations (including Pandoc) that suggest Markdown as a default authoring format for documents in general.
Emphasized text like <em>this</em> is simplified to _this_ (alternatively, *this* is also possible)
Strongly emphasized parts like <strong>that one</strong> are simplified to __that one__ (or **that one**)
To produce headers like
<h1>Header One</h1>
<h2>Header Two</h2>
<h3>Header Three</h3>
etc. we just write
# Header One
## Header Two
### Header Three
etc, respectively, each one starting at the beginning of a line.
Links like
the entry on <a href="http://en.wikipedia.org/wiki/Markdown"> Markdown </a> in the Wikipedia
can be induced by writing it inline, with the address inside the text
the entry on [Markdown](http://en.wikipedia.org/wiki/Markdown) in the Wikipedia
or with a reference label, say link1, that points to the address later on
the entry on [Markdown][link1] in the Wikipedia
[link1]: http://en.wikipedia.org/wiki/Markdown
or even shorter with an implicit link label
the entry on [Markdown][] in the Wikipedia
[Markdown]: http://en.wikipedia.org/wiki/Markdown
Paragraphs
<p>This is one paragraph.</p>
<p>This is the next one.</p>
are written more naturally by leaving an empty line between the paragraph blocks
This is one paragraph.

This is the next one.
An unordered list
<ul>
<li>apple</li>
<li>banana</li>
<li>cherry</li>
</ul>
is much more conveniently written
* apple
* banana
* cherry
Suppose you want to write a small document which, nicely rendered, looks as follows (note that this is only an image; the links are not functional):
If you create this document as an HTML file Sample.html, the content would be something like this:
<h3> The superfluous CodeDown manual </h3>
<p>
This "manual" explains the ideas and how to install and use the <code>codedown</code> tool. It is "superfluous", because once you understand the two or three simple CodeDown syntax rules, you won't need a manual anymore.
</p>
<p>
However, the simplicity of CodeDown is built on the power and complexity of two key ingredients:
</p>
<ul>
<li> <strong>Markdown</strong>, either in its <a href="http://daringfireball.net/projects/markdown">original</a> version or with various <a href="http://en.wikipedia.org/wiki/Markdown_extensions">extensions</a> and plenty of <a href="http://http://xbeta.org/wiki/show/Markdown">implementations</a>. </li>
<li> <strong>Pandoc</strong>, that comes with a rich set of features and <a href="http://johnmacfarlane.net/pandoc/README.html#options">options</a>. </li>
</ul>
But you may as well create the following file Sample.markdown, which is much more conveniently written and easier to read:
### The superfluous CodeDown manual

This "manual" explains the ideas and how to install and use the `codedown` tool. It is "superfluous", because once you understand the two or three simple CodeDown syntax rules, you won't need a manual anymore.

However, the simplicity of CodeDown is built on the power and complexity of two key ingredients:

* __Markdown__, either in its [original][] version or with various [extensions][] and plenty of [implementations][].
* __Pandoc__, that comes with a rich set of features and [options][].

[original]: http://daringfireball.net/projects/markdown
[extensions]: http://en.wikipedia.org/wiki/Markdown_extensions
[implementations]: http://http://xbeta.org/wiki/show/Markdown
[options]: http://johnmacfarlane.net/pandoc/README.html#options
and then generate the HTML file from the Markdown file with the original Perl executable
Markdown.pl Sample.markdown > Sample.html
or alternatively with Pandoc by calling
pandoc --from=markdown --to=html --output=Sample.html Sample.markdown
The same effect is also achieved with a codedown call 3
codedown --from=markdown --to=html --output=Sample.html --input=Sample.markdown
In fact, the style for the document image above was achieved by inserting our default CSS stylesheet CodeDown.css with an additional option --css=CodeDown.css to either the pandoc or the codedown call above.
There are three more Markdown syntax rules, that will be particularly important for the CodeDown conversions later on:
Inline code like <code>1+2=3</code> is written as `1+2=3` (surrounded by backticks)
If we want a code block, say
if x == y
then True
else False
to appear as it is, we can wrap it in a <pre>...</pre> tag. Markdown converters use <pre><code>...</code></pre> instead, i.e.
<pre><code>if x == y
then True
else False
</code></pre>
In Markdown, this can be achieved much more easily by just indenting each line of the block by 4 or more spaces (or alternatively by one or more tabs)4 and writing
    if x == y
    then True
    else False
Besides, it is possible to use all characters without replacing them by HTML entities. For example, suppose we want a piece of text that renders in the browser as
if ($x < 7) {
echo "$x is smaller than seven";
}
then we would have to encode this in HTML as
<pre><code>if ($x &lt; 7) {
echo "$x is smaller than seven";
}
</code></pre>
But in Markdown, we can just write the initial text block itself (indented by say 4 spaces).
    if ($x < 7) {
    echo "$x is smaller than seven";
    }
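The entity replacement that a Markdown converter does on our behalf can be illustrated with the Python standard library. The helper name html_code_block is hypothetical; html.escape is the standard function that replaces &, < and > by their HTML entities.

```python
import html


def html_code_block(code):
    """Wrap literal text in <pre><code>, escaping the HTML special characters."""
    # quote=False leaves double quotes untouched; only &, < and > are replaced.
    return "<pre><code>" + html.escape(code, quote=False) + "\n</code></pre>"


if __name__ == "__main__":
    print(html_code_block('if ($x < 7) {\necho "$x is smaller than seven";\n}'))
```

In Markdown none of this is visible to the author: the indented block is written with the plain < character.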
A blockquote
<blockquote>
The biggest source of inspiration for Markdown's syntax is the format of plain text email.
</blockquote>
is written in email style by putting a > (and a space symbol) in front of the quoted text passages
> The biggest source of inspiration for Markdown's syntax is the format of plain text email.
Markdown is an excellent format for writing documents!
And in particular: Markdown is also a great format for the documentation of programming source code!
If you ever have to write a manual for some program or application, this is a very convenient format. It is very easy to read and write; in particular, the syntax for inline code and code blocks just mentioned is very efficient and intuitive. The huge number of Markdown converter implementations, including some online tools, makes it ubiquitously available. And they not only convert to HTML, but to any documentation format you could possibly wish for: groff man pages, PDF, RTF, LaTeX, DocBook XML, you name it. Besides, it is even very readable in its own text style.
By the way, this very document CodeDownManual.html was originally written in Markdown and then converted to HTML.5 The source text CodeDownManual.markdown should thus be a good example for the ease and beauty of the Markdown syntax (in the extended Pandoc version).
CoreCodeDown describes the conversions between code of an arbitrary programming language on one hand and Markdown on the other hand.
          CoreCodeDown
Code <-------------------> Markdown
The rules for these conversions are very universal. CodeDown is a document generator and code extractor for virtually any programming language. In this sense, it is not only a document generator, but a true "document generator generator". And yet, these rules are very simple and just a variation of the same three principles, for every type of code.
All mainstream programming languages (and this even includes other specific formal languages like SQL, HTML, XML and TeX/LaTeX) allow the insertion of comments into the source code. These are text parts that are ignored by the machine, but provide information for human readers and users. The syntax for comments always works according to at least one of the following two conventions:
Line comments
There is a special symbol (a single character or short character string), after which the rest of the current line of code is ignored. In C and JavaScript, this symbol is "//"; in Scheme and Lisp it is the semicolon ";".
Block comments
This is a text part spanning an arbitrary length, wrapped between a begin and an end symbol. For example, in JavaScript, block comments are enclosed between "/*" and "*/". In SML, the delimiters are "(*" and "*)".
Every modern programming language provides at least one of these two kinds of comments. Some only have line comments, such as Scheme, Bash scripts or Perl.6 Others only know block comments, such as SML and SQL. And languages like C and Haskell have both. 7
CodeDown modifies the native comments of a given code language, so that there are special dedicated parts defined in the source code:
Markdown document parts
These are comments written in Markdown format, which are preserved during the document generation process. Depending on the comments defined in the code language, these parts are
Markdown document lines, in case the code language has line comments, or
Markdown document blocks, if block comments are defined.
For example, C has line comments (after "//") and block comments (between "/*" and "*/"). Therefore, it has Markdown document lines (after "// //") and Markdown document blocks (between "/***" and "***/") defined. So if a piece of C source has the form
... some C code ...
... some C code ...
// // ... one Markdown line ...
// // ... another Markdown line ...
... some C code ...
... some C code ...
/***
... one line of Markdown ...
... a second line of Markdown ...
... a third line of Markdown ...
***/
... some C code ...
... some C code ...
// // ... final line of Markdown ...
... some C code ...
then a C-to-Markdown conversion returns
... one Markdown line ...
... another Markdown line ...
... one line of Markdown ...
... a second line of Markdown ...
... a third line of Markdown ...
... final line of Markdown ...
Note that any text after a Markdown block opening symbol (here "/***") is ignored by the converter. For example, if we have a C code snippet of the form
... line of C code ...
/*** First line of text.
Second line of text.
Third line of text.
***/
... line of C code ...
then a C-to-Markdown conversion produces the output
Second line of text.
Third line of text.
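The behaviour shown so far can be sketched in Python as follows. This is a minimal illustration under assumptions, not the CoreCodeDown implementation: the function name c_markdown_parts is hypothetical, and we assume that exactly one space after the "// //" marker is dropped.

```python
# Hypothetical sketch: collect the Markdown document lines and blocks of a C source text.
def c_markdown_parts(source):
    out = []
    in_block = False
    for line in source.splitlines():
        if in_block:
            if line.startswith("***/"):
                in_block = False        # the closing symbol ends the Markdown block
            else:
                out.append(line)        # block content is copied unchanged
        elif line.startswith("/***"):
            in_block = True             # the rest of the opening line is ignored
        elif line.startswith("// //"):
            text = line[len("// //"):]
            # Assumption: one separating space after the marker is removed.
            out.append(text[1:] if text.startswith(" ") else text)
        # Every other line is ordinary code and is dropped by this sketch.
    return "\n".join(out)
```

Applied to the snippet above, the text "First line of text." is lost because it shares a line with the opening symbol.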
We recommend not adding text on the line of the opening symbol (here "/***") or code on the line of the closing symbol ("***/").
Literal code blocks
Placing parts of code between two special comment delimiters tells the document converter to put these parts inside of special Markdown code blocks and preserve them in the generated document. Note, that the code parts are not inside of comments, but they are surrounded by two comment lines, so that the code itself is still read by the machine.
For example, in C these literal code block delimiters are the symbols "///BEGIN///" and "///END///", each one on a single line. A code snippet
///BEGIN///
... line of C code ...
... another line of C code ...
... yet another line of code ...
///END///
will be converted into a Markdown code block (i.e. lines indented by four spaces) inside of a quote block (lines beginning with >)
>     ... line of C code ...
>     ... another line of C code ...
>     ... yet another line of code ...
8 Recall, that a conversion of this Markdown block into say HTML will then be similar to this structure
<blockquote><pre><code>
... line of C code ...
... another line of C code ...
... yet another line of code ...
</code></pre></blockquote>
We recommend not adding code on the same line after a literal block delimiter.
Note that all special CodeDown comment symbols (e.g. in case of C, these are: "// //", "/***", "***/", "///BEGIN///" and "///END///") have to be placed at the beginning of a line.
And again, you should not write anything after the block delimiters (here: "/***", "***/", "///BEGIN///" and "///END///"). 9
Suppose, we have a C source code file HelloWorld.c
with the following content:
/***
# The mother of all programs

This is the __Hello world program__ in C.
***/

#include <stdio.h>

// // ## Implementation of the `main` function:

///BEGIN///
int main (void) {
  printf ("Hello world\n");   // prints a message
  return 0;                   // exit normally
}
///END///

/***
## Compilation and use

To compile the program and generate an executable `hello`, call

    gcc -o hello HelloWorld.c

Subsequently, you apply it by calling

    ./hello

It will answer with

    Hello world
***/
A C-to-Markdown conversion into a Markdown file HelloWorld.c.markdown
would generate the following content:
# The mother of all programs

This is the __Hello world program__ in C.

## Implementation of the `main` function:

>     int main (void) {
>       printf ("Hello world\n");   // prints a message
>       return 0;                   // exit normally
>     }

## Compilation and use

To compile the program and generate an executable `hello`, call

    gcc -o hello HelloWorld.c

Subsequently, you apply it by calling

    ./hello

It will answer with

    Hello world
The process is invoked with a call of
codedown --from=C --to=Markdown --input=HelloWorld.c --output=HelloWorld.c.markdown
Although this is not part of the CoreCodeDown conversion, let us show how the Markdown is further converted into HTML.
The Markdown-to-HTML conversion into a corresponding HTML file HelloWorld.c.html is performed with the Pandoc command
pandoc --from=Markdown --to=HTML --output=HelloWorld.c.html HelloWorld.c.markdown
but also with the codedown
command
codedown --from=Markdown --to=HTML --output=HelloWorld.c.html --input=HelloWorld.c.markdown
(Note the only real difference between a pandoc and a codedown call: pandoc has no --input option; the input file is the last argument of the call.)
Of course, we can also generate the HTML file from the C source file right away with a call of
codedown --from=C --to=HTML --input=HelloWorld.c --output=HelloWorld.c.html
In either case, the HTML file HelloWorld.c.html has the following content (the HTML text is rearranged in many places, in order to make the structure a little more readable):
<h1> The mother of all programs </h1>
<p> This is the <strong>Hello world program</strong> in C. </p>
<h2> Implementation of the <code>main</code> function: </h2>
<blockquote><pre><code>
int main (void) {
printf ("Hello world\n"); // prints a message
return 0; // exit normally
}
</code></pre></blockquote>
<h2> Compilation and use </h2>
<p> To compile the program and generate an executable <code>hello</code>, call </p>
<pre><code>
gcc -o hello HelloWorld.c
</code></pre>
<p> Subsequently, you apply it by calling </p>
<pre><code>
./hello
</code></pre>
<p> It will answer with </p>
<pre><code>
Hello world
</code></pre>
In a standard browser (and when the HTML document was generated with the additional --css=CodeDown.css option), this looks as follows:
The --help=CODE option
We just explained the core CodeDown syntax rules for the C programming language. You can recall these rules at any time with either one of the following commands:
codedown --help=c
codedown -h c
The answer will be this:
Markdown document lines in C:
+-----------------------------------------+
| // // ... one line of Markdown text ... |
+-----------------------------------------+
Markdown document blocks in C:
+--------------------------------+
| /*** |
| ... lines of Markdown text ... |
| ***/ |
+--------------------------------+
Literal C code blocks:
+-------------------------+
| ///BEGIN/// |
| ... lines of C code ... |
| ///END/// |
+-------------------------+
This is the entire summary of all the CodeDown rules for the C programming language.
A call of
codedown --help=CODE
will display a similar overview for any CODE, such as c, php, etc. (The CODE value is case-insensitive, i.e. you can also write C, PHP, etc.)
A call of
codedown --help=codes
shows the list of all defined types of code in the current CodeDown version.
Suppose you would like to enrich JavaScript source code with CodeDown comments to generate a nice document. To see the CodeDown rules for JavaScript, just call
codedown --help=javascript
and that will show you:
Markdown document lines in JAVASCRIPT:
+-----------------------------------------+
| // // ... one line of Markdown text ... |
+-----------------------------------------+
Markdown document blocks in JAVASCRIPT:
+--------------------------------+
| /*** |
| ... lines of Markdown text ... |
| ***/ |
+--------------------------------+
Literal JAVASCRIPT code blocks:
+----------------------------------+
| ///BEGIN/// |
| ... lines of JAVASCRIPT code ... |
| ///END/// |
+----------------------------------+
JavaScript (like Java, PHP and others) has C-like syntax constructions, including the line and block comments. Therefore, JavaScript has Markdown document lines, Markdown document blocks, and literal code blocks similar to the ones in C.
Scheme only has line comments (after the semicolon ";"). It therefore only has Markdown document lines, but no Markdown document blocks. To find out about the CodeDown rules, type
codedown --help=scheme
and we see
Markdown document lines in SCHEME:
+---------------------------------------+
| ; ; ... one line of Markdown text ... |
+---------------------------------------+
Literal SCHEME code blocks:
+------------------------------+
| ;;;BEGIN;;; |
| ... lines of SCHEME code ... |
| ;;;END;;; |
+------------------------------+
For example, the code snippet
; ; # `Hello world` in scheme
;;;BEGIN;;;
(display "Hello world")
;;;END;;;
converts into the following Markdown
# `Hello world` in scheme
>     (display "Hello world")
which in turn turns into the following HTML
<h1><code>Hello world</code> in scheme</h1>
<blockquote><pre><code>(display "Hello world")
</code></pre></blockquote>
Standard ML (SML) has block, but no line comments. Accordingly, we only have Markdown document blocks and literal code blocks available. A call of
codedown --help=sml
shows us the syntax details:
Markdown document blocks in SML:
+--------------------------------+
| (*** |
| ... lines of Markdown text ... |
| ***) |
+--------------------------------+
Literal SML code blocks:
+---------------------------+
| (***BEGIN***) |
| ... lines of SML code ... |
| (***END***) |
+---------------------------+
So this piece of SML code
(***
# `Hello world` in Standard ML
***)
(***BEGIN***)
print "Hello world\n";
(***END***)
will convert into this Markdown text:
# `Hello world` in Standard ML
>     print "Hello world\n";
Some textual document formats like LaTeX, HTML or XML also provide comments. So from the CodeDown point of view, these formats can be treated like any other code language. However, codedown always treats them as document formats in --from and --to options. So in case you want them to be treated as code instead, you add a (case-insensitive) _code suffix to the name and write LaTeX_Code, HTML_CODE or xml_code, respectively.
For example, the CodeDown rules for HTML are displayed with --help=html_code
Markdown document blocks in HTML_CODE:
+--------------------------------+
| <!-- -- |
| ... lines of Markdown text ... |
| -- --> |
+--------------------------------+
Literal HTML_CODE code blocks:
+---------------------------------+
| <!--BEGIN--> |
| ... lines of HTML_CODE code ... |
| <!--END--> |
+---------------------------------+
Accordingly, if a file HtmlSample.html
contains
<!-- --
### Chapter 6. Writing ordered lists
A listing such as
1. one
2. two
3. three
is encoded as follows:
-- -->
<!--BEGIN-->
<ol>
<li> one </li>
<li> two </li>
<li> three </li>
</ol>
<!--END-->
and if we call
codedown --from=html_code --to=markdown --input=HtmlSample.html
we generate the following Markdown text (on the standard output, since no --output is specified)
### Chapter 6. Writing ordered lists
A listing such as
1. one
2. two
3. three
is encoded as follows:
>     <ol>
>     <li> one </li>
>     <li> two </li>
>     <li> three </li>
>     </ol>
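Since the rules differ between languages only in their symbols, a single table-driven converter covers all of them, which is the idea behind the "document generator generator". The following Python sketch illustrates this under assumptions: the names SYMBOLS and code_to_markdown are hypothetical, the table contains only the four languages whose symbols are shown in this manual, and the blank lines placed around literal blocks are our own choice rather than documented behaviour.

```python
# Hypothetical symbol table:
# (line marker, block open, block close, literal begin, literal end)
SYMBOLS = {
    "c":         ("// //", "/***", "***/", "///BEGIN///", "///END///"),
    "scheme":    ("; ;", None, None, ";;;BEGIN;;;", ";;;END;;;"),
    "sml":       (None, "(***", "***)", "(***BEGIN***)", "(***END***)"),
    "html_code": (None, "<!-- --", "-- -->", "<!--BEGIN-->", "<!--END-->"),
}


def code_to_markdown(source, language):
    """Sketch of a code-to-Markdown conversion for the languages in SYMBOLS."""
    line_mark, block_open, block_close, lit_begin, lit_end = SYMBOLS[language.lower()]
    out = []
    state = "code"  # one of "code", "block" (Markdown block) or "literal"
    for line in source.splitlines():
        if state == "block":
            if line.startswith(block_close):
                state = "code"
            else:
                out.append(line)
        elif state == "literal":
            if line.startswith(lit_end):
                state = "code"
                out.append("")  # assumption: blank line after a literal block
            else:
                out.append((">     " + line).rstrip())
        elif line.startswith(lit_begin):
            # Tested before block_open, because "(***BEGIN***)" starts with "(***".
            state = "literal"
            out.append("")      # assumption: blank line before a literal block
        elif block_open and line.startswith(block_open):
            state = "block"     # the rest of the opening line is ignored
        elif line_mark and line.startswith(line_mark):
            text = line[len(line_mark):]
            out.append(text[1:] if text.startswith(" ") else text)
        # Any other line is ordinary code and does not reach the document.
    return "\n".join(out).strip("\n")
```

Adding a further language to this sketch only requires one more row in the table.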
The same CoreCodeDown rules that explain the conversion from code to Markdown also work the other way round. We have both document generation and code extraction:
       document generation
Code -----------------------> Markdown
         code extraction
Code <----------------------- Markdown
You write a text in Markdown with literal code blocks of the form
>     ... line of code ...
>     ... another line of code ...
>     ... etc ...
i.e. each line starts with a ">" plus 5 or more spaces.
The code extraction is performed by calling
codedown --from=markdown --to=CODE ...
where CODE is, say, C (or JavaScript etc.), and the result is code in that language. All Markdown now appears in comments, and the literal code blocks are recovered as C code of the form
///BEGIN///
... line of code ...
... another line of code ...
... etc ...
///END///
In fact:
In CodeDown, document generation and code extraction are inverse operations (up to insignificant features like extra spaces or the switch of line and block comments).
This makes source code and documentation formats equal and interchangeable environments for programmers at any point.
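The reverse direction can be sketched in the same spirit. Again this is only an illustration and not the actual extractor: the function name markdown_to_c is hypothetical, and the sketch wraps all non-code text in Markdown document blocks, which is one of the possible choices allowed by the "switch of line and block comments" mentioned above.

```python
# Hypothetical sketch: turn Markdown with quoted code blocks into C source text.
def markdown_to_c(markdown):
    openers = {"text": "/***", "code": "///BEGIN///"}
    closers = {"text": "***/", "code": "///END///"}
    out = []
    state = None  # None at the start, then "text" or "code"
    for line in markdown.splitlines():
        # A literal code line starts with ">" plus five spaces; a bare ">"
        # inside a code block is taken as an empty code line.
        is_code = line.startswith(">     ") or (state == "code" and line.rstrip() == ">")
        kind = "code" if is_code else "text"
        if kind != state:
            if state is not None:
                out.append(closers[state])  # close the previous part
            out.append(openers[kind])       # open the new part
            state = kind
        out.append(line[6:] if is_code else line)
    if state is not None:
        out.append(closers[state])
    return "\n".join(out)
```

In this sketch the code lines lose exactly the ">" and the five spaces, so any deeper indentation of the code survives the round trip.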
For example, let HelloWorld.c.markdown
be a file containing this:
# The mother of all programs
This is the __Hello world program__ in C.
## Implementation of the `main` function:
>     int main (void) {
>       printf ("Hello world\n");   // prints a message
>       return 0;                   // exit normally
>     }
## Conversion to C
To convert this this Markdown source file `HelloWorld.c.markdown` to a proper
C program, call
codedown -f markdown -t c -i HelloWorld.c.markdown -o HelloWorld.c
The result is the proper C file `HelloWorld.c`.
We call the code extractor with
codedown --from=Markdown --to=C --input=HelloWorld.c.markdown --output=HelloWorld.c
which creates HelloWorld.c
containing this:
/***
# The mother of all programs
This is the __Hello world program__ in C.
## Implementation of the `main` function:
***/
///BEGIN///
int main (void) {
printf ("Hello world\n"); // prints a message
return 0; // exit normally
}
///END///
/***
## Conversion to C
To convert this this Markdown source file `HelloWorld.c.markdown` to a proper
C program, call
codedown -f markdown -t c -i HelloWorld.c.markdown -o HelloWorld.c
The result is the proper C file `HelloWorld.c`.
***/
There is also a plain code extractor, which only extracts the literal code blocks from a document and omits all other text. This is done with the code value for the --to option.
For example, for the file HelloWorld.c.markdown just mentioned, a call of
codedown --from=Markdown --to=code --input=HelloWorld.c.markdown --output=PlainHelloWorld.c
generates the C file PlainHelloWorld.c, which contains this
int main (void) {
printf ("Hello world\n"); // prints a message
return 0; // exit normally
}
Similarly, there is also a plain document generator, which takes any code file and returns this code in one big literal Markdown code block. This is done with the --from option set to the value code.
For example, if we have a file any.code
containing this
code or Markdown 1
code or Markdown 2
code or Markdown 3
then calling
codedown --from=code --to=markdown --input=any.code
returns this (on the standard output, since no --output is specified):
>     code or Markdown 1
>     code or Markdown 2
>     code or Markdown 3
This plain document generator can e.g. be useful if an entire source file needs to be printed and a nice CodeDown layout is preferred over a plain monospace font.10
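Both plain conversions are simple enough to be sketched in a few lines of Python. The names QUOTE, plain_document and plain_code are hypothetical, and the sketch assumes that a literal code line is recognized by a ">" followed by five spaces.

```python
QUOTE = ">     "  # ">" plus five spaces: a code block inside a quote block


def plain_document(code):
    """Sketch of --from=code: put every line into one literal Markdown code block."""
    return "\n".join((QUOTE + line).rstrip() for line in code.splitlines())


def plain_code(markdown):
    """Sketch of --to=code: keep only the literal code lines and drop all other text."""
    return "\n".join(
        line[len(QUOTE):] for line in markdown.splitlines() if line.startswith(QUOTE)
    )
```

For code without empty lines the two functions of this sketch are inverse to each other; empty lines are reduced to a bare ">" by plain_document and are therefore not recovered by plain_code.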
The codedown user guide
We explain the syntax and options of the codedown executable in some detail. A short summary can be obtained at any time by calling the --help option without a value, i.e.
codedown --help
or its short version
codedown -h
The answer will be something like this:
SYNTAX:
codedown OPTION_1 OPTION_2 ... OPTION_N
TYPES OF CODE:
scheme, bash, latex_code, perl, python, ruby, sml, ocaml, sql, html_code, xml_code, c, cpp, java, scala, javascript, php, haskell, lisp
TYPES OF DOCUMENTS:
json, html, s5, slidy, docbook, opendocument, latex, context, texinfo, man, markdown, rst, mediawiki, textile, rtf, org, odt, epub, pdf
HELP OPTIONS:
--help -h this general help message
--help=codes -h codes list of supported types of codes
--help=docs -h docs list of supported document type formats
--help=symbols -h symbols prints the table of code types symbols
--help=pandoc -h pandoc same as: pandoc --help
--help=CODE -h CODE the CodeDown rules for specified type of CODE
CORE OPTIONS:
--from=FORMAT -f FORMAT --read=FORMAT -r FORMAT specifies the FORMAT of the source
--to=FORMAT -t FORMAT --write=FORMAT -w FORMAT specifies the FORMAT of the target
--input=FILE_1,...,FILE_N -i FILE_1 ... FILE_N the input source files
--output=FILE -o FILE the output targe file
ADDITIONAL PANDOC OPTIONS: (see `pandoc -h` for many more):
--standalone -s produce a whole document, with header etc.
--html5 -5 produce HTML5 instead of HTML4
--number-sections -N number the headings
--table-of-contents --toc include an automatically generated table of contents
--css=URL -c URL use URL as CSS stylesheet
LINKS:
http://bucephalus.org/CodeDown
http://johnmacfarlane.net/pandoc
In the sequel, we will explain this information in some more detail.
CodeDown is a universal converter between different text formats, and there are three types of formats:
Markdown as the primary document format
document types, which are all the other target document formats supported by Pandoc, including: HTML, LaTeX, ConTeXt, PDF11, RTF, DocBook XML, OpenDocument XML, ODT, GNU Texinfo, MediaWiki markup, textile, groff man pages, Emacs org-mode, EPUB ebooks, JSON, RST, and slide shows with S5 and Slidy.
For more information on the currently supported target document formats, check out one of the following help calls
codedown -h pandoc
codedown --help=pandoc
which in turn is the same as a call of pandoc --help.
code types, the different programming languages, including Scheme, Bash, Perl, Python, Ruby, SML (i.e. Standard ML), OCaml, SQL, XML, C, Cpp (i.e. C++), Java, Scala, JavaScript, PHP, Haskell, Lisp.
Some of the document types are also code types, namely LaTeX, HTML and XML. When they are treated as code types, we attach a _code suffix to the name, i.e. the values are LATEX_CODE, HTML_CODE and XML_CODE.
The list of all possible code types is shown by calling one of the following two help commands
codedown --help=codes
codedown -h codes
The names of all these formats need to be specified in the source (--from or --read) and target (--to or --write) options of the codedown command. These name values are case-insensitive. For example, LaTeX, latex and LATEX are equally possible.
codedown call
The syntax of the codedown command is
codedown OPTION_1 OPTION_2 ... OPTION_N
The order of the options in such a call is arbitrary.12
Each OPTION is a combination of a key and possible values. Each OPTION has a long form
--KEY
--KEY=VALUE
--KEY=VALUE_1,VALUE_2,...,VALUE_N
and often also an equivalent short version, where the key K stands for just one letter
-K
-K VALUE
-K VALUE_1 VALUE_2 ... VALUE_N
For example, a possible call could be
codedown --from=JavaScript --to=HTML --input=Part1.js,Part2.js --output=Manual.html --standalone --css=MyStyle.css
where each of these options has a short version and can thus be replaced by 13
codedown -f JavaScript -t HTML -i Part1.js Part2.js -o Manual.html -s -c MyStyle.css
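The long option form described above can be illustrated with a small Python sketch. It is not part of CodeDown; the function name parse_long_option is hypothetical, and the sketch only covers the three long forms --KEY, --KEY=VALUE and --KEY=VALUE_1,...,VALUE_N.

```python
def parse_long_option(option):
    """Split a long option into its key and its list of values.

    Hypothetical helper for illustration only; the real codedown
    command may parse its arguments differently.
    """
    if not option.startswith("--"):
        raise ValueError(f"not a long option: {option!r}")
    key, sep, raw = option[2:].partition("=")
    # No '=' means a pure flag; otherwise the values are comma-separated.
    values = [v for v in raw.split(",") if v] if sep else []
    return key, values


# The long-form example call from the text, without the command name.
call = [
    "--from=JavaScript", "--to=HTML", "--input=Part1.js,Part2.js",
    "--output=Manual.html", "--standalone", "--css=MyStyle.css",
]
parsed = dict(parse_long_option(option) for option in call)
```

Since the result is a dictionary keyed by option name, the order of the options in the call does not matter here, which mirrors the rule stated above.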
codedown options
We distinguish three kinds of options
The help option
--help or -h shows a general help message with the syntax of the codedown command.
--help=codes or -h codes shows the list of supported types of code.
--help=docs or -h docs shows a list of the supported document formats (there are more; see pandoc --help).
--help=symbols or -h symbols prints the table with the comment symbols and CodeDown symbols for each supported type of code.
--help=version or -h version prints the version numbers of the currently running CodeDown and Pandoc.
--help=pandoc or -h pandoc prints the help message which is generated by a call of pandoc --help.
--help=CODE or -h CODE shows the CodeDown document conversion rules for the given CODE, i.e. one of the supported types of code.
The essential core options
--from=FORMAT or -f FORMAT (or alternatively --read=FORMAT or -r FORMAT) specifies the format of the input source. If this option is not specified, codedown first attempts to determine it from the extension of the (first) file specified by the --input (or -i) option. If this fails, too, then the FORMAT is set to markdown as the default input.
--to=FORMAT or -t FORMAT (or alternatively --write=FORMAT or -w FORMAT) specifies the format of the output target. If this option is not specified, codedown first attempts to determine it from the extension of the file specified by the --output (or -o) option. If this fails, too, then the FORMAT is set to markdown as the default output.
--input=FILE_1,...,FILE_N or -i FILE_1 ... FILE_N specifies the input text. If the list of files is empty, the input is read from standard input. If there is more than one file in the list, the contents of these files are concatenated.
Note that this is the only CodeDown option that has no equivalent in Pandoc. There, the input files are listed as such, without a preceding --input or -i key. (There is, however, a Pandoc key -i, which is short for --incremental and makes list items in Slidy or S5 slideshows appear incrementally. If you want to control this Pandoc option from a codedown call, you have to use the long form --incremental.)
--output=FILE or -o FILE defines the file for the output of the conversion. If FILE does not exist yet, it will be created; otherwise it will be overwritten. If this option is not specified, all output goes to the standard output (except when the target format is set to odt, epub or pdf).
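The fallback order for the source and target format (explicit option, then file extension, then markdown) can be sketched in Python as follows. This is only an illustration under assumptions: the function name resolve_format and the small EXTENSIONS table are hypothetical, and the extensions that codedown actually recognizes are not listed in this manual.

```python
import os

# Hypothetical extension table; the real mapping used by codedown
# is an assumption here and is certainly larger.
EXTENSIONS = {
    ".markdown": "markdown",
    ".html": "html",
    ".tex": "latex",
    ".js": "javascript",
    ".c": "c",
    ".hs": "haskell",
}


def resolve_format(explicit, files):
    """Pick a format: explicit option, else file extension, else markdown."""
    if explicit:
        # Format names are case-insensitive, so normalize them.
        return explicit.lower()
    if files:
        # Only the first file is inspected.
        extension = os.path.splitext(files[0])[1].lower()
        if extension in EXTENSIONS:
            return EXTENSIONS[extension]
    return "markdown"
```

For example, resolve_format(None, ["Part1.js", "Part2.js"]) yields javascript under this table, while an unknown extension or an empty file list falls back to markdown.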
A special case is the possibility to set one format to code (see above). If the source format is set to code (i.e. --from=code) and the target format is markdown, then the whole input is put into a single code block. This can be useful if you need to display an entire source file in standard CodeDown layout. Conversely, if --from=markdown and --to=code then the code blocks (lines preceded with > plus 5 spaces, at least) are extracted as code blocks, while everything else is ignored.
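The extraction direction (--from=markdown and --to=code) can be pictured with the following Python sketch. It is a simplified model, not the CodeDown implementation: the name extract_code is hypothetical, and how the real tool separates consecutive blocks in its output is not specified here.

```python
import re

# A code line starts with '>' followed by at least five spaces.
# Everything after the fifth space is kept, including extra indentation.
CODE_LINE = re.compile(r"^> {5}(.*)$")


def extract_code(markdown_text):
    """Keep only the code lines of a Markdown text and drop the rest."""
    code = []
    for line in markdown_text.splitlines():
        match = CODE_LINE.match(line)
        if match:
            code.append(match.group(1))
    return "\n".join(code)


sample = (
    "Some explanation.\n"
    "\n"
    ">     int x = 1;\n"
    ">       x++;\n"
    "\n"
    "> an ordinary block quote\n"
)
extracted = extract_code(sample)
```

In this sample the two code lines survive (the second keeps its two extra spaces of indentation), while the explanation and the ordinary block quote are ignored.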
The additional Pandoc options
When you generate a document format other than Markdown, codedown invokes the Pandoc document converter behind the scenes, which turns the (intermediate) Markdown text into the requested document. This Pandoc conversion allows numerous options to fine-tune the result, and all these options can be integrated into the codedown call.
The following small selection only shows the Pandoc options that were used for the generation of the HTML version of this document.
--css=FILE or -c FILE integrates a CSS stylesheet via a link to the given FILE.
--standalone or -s produces a full HTML (or LaTeX or RTF) document instead of a content fragment, with header and footer etc.
--table-of-contents or --toc includes an automatically generated table of contents at the top of the document.
For more Pandoc options use either one of the following two calls
codedown --help=pandoc
pandoc --help
or consult the Pandoc User Guide for even more.
The default installation comprises three steps:
Install the Haskell Platform.
This is an easy and convenient installation of a whole Haskell infrastructure on your system. You don't need to understand Haskell at all in order to use pandoc and codedown, but your system does. The Platform also comes with a cabal (Common Architecture for Building Applications and Libraries) command that enables easy installation of thousands of Haskell packages from HackageDB.
Install Pandoc with
cabal install pandoc
That simple call fetches the latest release from the huge HackageDB repository. You can check if that worked, for example by calling
pandoc --version
Install CodeDown with
cabal install codedown
Again, you can check with
codedown --help=version
For bug fixes, new releases, comments etc. see the CodeDown homepage.
The only main difference between codedown and pandoc is that codedown has an --input=File1,...,FileN option, which makes its overall syntax (codedown Option1 ... OptionM) a bit more elegant, whereas in pandoc the input files are appended (syntax: pandoc Option1 ... OptionM File1 ... FileN). ↩
In a certain sense, codedown is more powerful than pandoc, because it can also be called as a document-to-document converter (e.g. codedown --from=HTML --to=LaTeX ...), in which case the pandoc command is invoked behind the scenes with the same options. But we don't recommend the use of codedown as a document-to-document converter and suggest using pandoc directly instead. ↩
The syntax of the codedown command is very similar to the pandoc command syntax. There is one big difference, however, namely the --input option, which does not exist for pandoc. There, the input files are added at the end of the call, as the example shows. ↩
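The relation between the two call syntaxes can be made concrete with a short Python sketch that rewrites a codedown argument list into the corresponding pandoc argument list. This is an illustration only: to_pandoc_args is a hypothetical name, only the long --input= form is handled, and whether codedown builds its internal pandoc call exactly this way is an assumption.

```python
def to_pandoc_args(codedown_args):
    """Move the files of --input=... to the end, as pandoc expects them."""
    options, files = [], []
    for arg in codedown_args:
        if arg.startswith("--input="):
            # Comma-separated file list after the '=' sign.
            names = arg[len("--input="):].split(",")
            files.extend(name for name in names if name)
        else:
            # All other options are passed through unchanged.
            options.append(arg)
    return ["pandoc"] + options + files


pandoc_call = to_pandoc_args([
    "--from=JavaScript", "--to=HTML", "--input=Part1.js,Part2.js",
    "--output=Manual.html", "--standalone",
])
```

The short form -i is deliberately left out of the sketch, because in pandoc itself -i abbreviates --incremental.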
There is yet another version for code blocks in Markdown, but only in the extended Markdown version of Pandoc, namely delimited code blocks between tilde-lines, with an option to use syntax highlighting for many types of code. You can use that, too, but the official version of CodeDown does not mention this explicitly. ↩
The conversion was done with the command
codedown --from=markdown --to=html --input=CodeDownManual.markdown --output=CodeDownManual.html \
--table-of-contents --standalone --css=CodeDown.css
and that has the same effect as
pandoc --from=markdown --to=html --output=CodeDownManual.html --table-of-contents --standalone \
--css=CodeDown.css CodeDownManual.markdown
↩
In this context, Perl, Python and Ruby are considered languages that only have line comments, because their block comments use a special markup for their own document converters. ↩
In the implementation of the general document generators in the CoreCodeDown.hs module we say that a code language is of type 1 if it has a line comment but no block comment. If it is the other way round, we call it a type 2 code language. If it has both line and block comments, it is of type 3. For example, Scheme and Bash are type 1, SML and SQL are type 2, and C and (Common) Lisp are type 3. ↩
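The three types can be written down as a tiny Python function. This is a sketch and not a transcription of CoreCodeDown.hs: the name comment_type and the LANGUAGES table are hypothetical, and the flags simply follow the classification given in this footnote.

```python
# (has line comment, has block comment), as classified in the footnote.
LANGUAGES = {
    "Scheme": (True, False),
    "Bash": (True, False),
    "SML": (False, True),
    "SQL": (False, True),
    "C": (True, True),
    "Lisp": (True, True),
}


def comment_type(has_line_comment, has_block_comment):
    """Return 1, 2 or 3 according to the available comment styles."""
    if has_line_comment and has_block_comment:
        return 3
    if has_line_comment:
        return 1
    if has_block_comment:
        return 2
    raise ValueError("a code language needs at least one comment style")


types = {name: comment_type(*flags) for name, flags in LANGUAGES.items()}
```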
You may wonder, why a literal code block, say in C
///BEGIN///
... a line of C code ...
... another line of C code ...
///END///
turns into a code block inside of a block quote
>     ... a line of C code ...
>     ... another line of C code ...
instead of the simpler code block
    ... a line of C code ...
    ... another line of C code ...
In terms of HTML this means that the result is
<blockquote><pre><code>
... a line of C code ...
... another line of C code ...
</code></pre></blockquote>
instead of
<pre><code>
... a line of C code ...
... another line of C code ...
</code></pre>
So, why does CodeDown choose the more complicated version?
Well, in earlier versions of CodeDown (e.g. ElephantMark), the simpler version was indeed the rule for literal code blocks. But when source code is annotated with documentation, one often uses standard Markdown code blocks for examples or input-output dialogs, and then it is nice to have a custom layout distinction between these annotations and literal code blocks. For the generation of HTML documents, this can be done very nicely with a CSS stylesheet that defines custom layouts for the corresponding combinations of the <code>, <pre> and <blockquote> tags.
In the browser view of the example of C code the literal block appears with a border around the grey block, and ordinary code blocks have only a grey background.
By the way, in Haskell, this choice has another advantage, namely that these literal code blocks comply with the "Bird tracks" syntax for literate comments. ↩
As mentioned in the core conversion rules, we recommend not to add anything after the delimiters for CodeDown blocks, and use an entire line instead. The first reason for this rule is its simplicity. Another reason is that writing code after literal block delimiters has different effects for different code languages.
If the code language has line comments, the literal block delimiters are by default chosen to be line comments, too, which makes it possible to add comment text after them. For example, the C code of the form
...
///BEGIN///
... first line of C code ...
... second line of C code ...
///END///
...
could be modified to
...
///BEGIN/// Now comes my precious piece of code:
... first line of C code ...
... second line of C code ...
///END/// So far for my precious piece of code.
...
and the C-to-Markdown conversion would still produce exactly the same result.
But if the code language doesn't have line comments, the literal block delimiters have to be block comments, and anything after the delimiters must be proper code. For example, SML only has block comments (between "(*" and "*)") and its literal block delimiters are "(***BEGIN***)" and "(***END***)". If we add any text after the "(***BEGIN***)" on the same line, this text would have to be code, unlike in, say, C.
So, in order to use general conventions for all code languages alike:
Don't write anything after CodeDown block delimiters! ↩
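The C case discussed in this footnote can be checked with a small Python model of the literal block rule. It is a sketch under assumptions: literal_blocks_to_markdown is a hypothetical name, only literal blocks are handled (all other lines are simply dropped, which the real converter does not do), and the prefix of > plus five spaces is taken from the description of code blocks in this manual.

```python
def literal_blocks_to_markdown(c_source):
    """Turn lines between ///BEGIN/// and ///END/// into quoted code lines."""
    output, inside = [], False
    for line in c_source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("///BEGIN///"):
            inside = True       # text after the delimiter is ignored
        elif stripped.startswith("///END///"):
            inside = False      # text after the delimiter is ignored
        elif inside:
            output.append(">     " + line)
    return "\n".join(output)


plain = "///BEGIN///\nint x = 1;\nx++;\n///END///"
annotated = (
    "///BEGIN/// Now comes my precious piece of code:\n"
    "int x = 1;\n"
    "x++;\n"
    "///END/// So far for my precious piece of code."
)
same_result = (
    literal_blocks_to_markdown(plain) == literal_blocks_to_markdown(annotated)
)
```

Because the delimiters are line comments in C, the trailing text never reaches the output, so both variants give the same Markdown in this model.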
The corresponding call is, e.g.,
codedown --from=code --to=html --input=any.code --output=any.code.html --css=CodeDown.css
↩
PDF output is generated via LaTeX and is supported by the markdown2pdf wrapper, included in the Pandoc installation. By using codedown, all this is done automatically. For example, calling codedown -f markdown -t pdf -i example.markdown -o example.pdf should work just fine. ↩
To be precise, the order of the options in a codedown call is not entirely arbitrary, namely in case you specify the same option several times. But this is never intended and average users will avoid doing that, anyway. ↩
As is common for one-letter UNIX command options without values, these one-letter flags can be condensed into a single one. For example, in UNIX, a call of ls -A -l -r -R -S is equivalent to ls -AlrRS. This works in CodeDown and Pandoc, too, but explaining it in detail would probably cost more time and space than these abbreviations can save. ↩