Parser
The MiniLatex parser is written in Elm using Evan Czaplicki's parser combinator package. We will take a look at some of the top level parsers, then examine in detail the parser for environments, which is both the most important and the most complex one.
The top level parser
Here is the top-level parsing function:
The oneOf
parser tries each parser in its argument list in turn, succeeding by applying the first parser to succeed, and failing otherwise.
The expression lazy (\_ -> environment)
is needed to ensure that recursion works. If one reads down the argument list of oneOf
, one can extract a production for a grammar for MiniLatex:
We will come back to this later when we discuss the MiniLatex grammar. There is no one-to-one correspondence between parser functions and productions, but there is a close relations.
Macro parser
Let us look at one of the components of the latexExpression
parser, say, the parsers for macros -- expressions like like \italic{Wow!}
. For this we use theelm/parser
pipeline construction, which sequences parsers:
The rough idea of the pipeline is to apply the parsers macroName
, itemList optionalArg
, itemList arg
, and wsParser
in sequence, chomping away at the input, keeping the parse result in the case of |=
and ignoring it in the case |.
In the case at hand, the pipeline will feed three arguments to the type constructor Macro
. Since succeed
has type succeed : a -> Parser a
, the result will be a Parser LatexExpression
.
Note that the macro
function takes a parser as input and produces a Parser LatexExpression
as output. The input, which has type Parser ()
should parse white space -- perhaps just ' '
, or perhaps newlines as well. This definition of macro
gives added flexibility.
As with the top level parser, we can derive a production for the MiniLatex grammar directlh from the defintion of the parser:
Rather than pursuing the rabbit down its hole to explain the parsers macroName
, itemList
, optionalArg
, arg
, and the possible whitespace parsers, we refer to the source code.
The environment parser
The environnment parser is the most complex of the parsers. We give a simplified version, then discuss the changes needed to obtain the actual parser.
The envName parser recognizes text of the form \\begin{foo}
, extracting the string foo
:
There is a companion parser endWord
, which recognizes text like \\end{foo}
. The parser andThen
is used to sequence parsers:
Consider now the second parser which makes up environment
One sees that the types match, since
The result is that the information gathered by environment
is passed to environmentParser
with arguments of the form \\end{foo}
and foo
. At this point the use of the two arguments is redundant. However, for the "real" parser, envType
needs to be transformed: in some cases, it is passed through as is, while in others it is changed.
Below is the definition of environmentParser
. It is simple application of the parser pipeline construct.
Digression: Monads in Elm
Although Elm does not explicitly mention the much (and mistakenly) feared notion of monad, monads are implicit in Elm. Indeed, the andThen
function is a variant of the bind function for the Parser
monad. Compare the type signatures for andThen
and the monadic bind operator >>=
:
Up to a permutation and renaming of arguments, they are the same.
Note
My favorite way of thinking about bind is to use the function
Consider first functions f : a -> b
and g : b -> c
. They can be composed: we can form g o f : a -> c
. Now let f : a -> M b
and g : b -> M c
be "monadic" functions, where M
is a type constructor (Elm) or a functor (Haskell). They cannot be composed as is. However, with the aid of beta
, they can be:
Conclusion: beta gives a way of composing monadic functions f
and g
, where M
is the monad. Bind does the same thing:
All the rigamarole about functors, monads, etc., is important, because the monad conditions on M
guarantee that the new composition operator o'
does what we expect it to, e.g., obey the associative law:
This is the point of view of Kliesli categories, which I learned from the writings and videos of Bartosz Milewski. (INSERT REFERENCE)
Last updated