Parsing from Scratch

Room 1301, Teaching Building No. 3, Tsinghua University

Parsing is arguably one of the most successful areas in integrating theory and practice: parsing techniques are founded on formal languages and automata theory, and state-of-the-art parser generators and parser combinator libraries are widely used in industry. Parsing plays a role, directly or indirectly, in almost all software: compiler construction, command-line argument parsing, information extraction from structured text, text matching with your favorite regular expressions, and so on. For those who love building their own languages, parsing is very likely the second challenge they encounter when implementing the compiler, right after settling on a satisfactory concrete syntax design. This talk will cover the two mainstream ways of constructing a parser: parser generators and parser combinator libraries. Together we will build some of them from scratch to convince you that parsing is not magic; it is just the art of manipulating text. Finally, we will look at a couple of new ideas from recent programming language research for building parsers in a “safer” and more “correct” way, or in a more “automated” way.
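To give a flavor of what "from scratch" can mean for parser combinators, here is a minimal sketch in Python. It is not the code from the talk; the names `char`, `many`, `seq`, `fmap`, `number` and `sum_expr` are hypothetical, chosen for illustration. A parser is assumed to be a function from an input string and a position to either `(value, new_position)` or `None` on failure.

```python
from typing import Any, Callable, Optional, Tuple

# Illustrative convention (an assumption, not from the talk): a parser takes
# (input, position) and returns (value, new_position), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def char(pred: Callable[[str], bool]) -> Parser:
    """Parse one character that satisfies pred."""
    def parse(s: str, i: int):
        if i < len(s) and pred(s[i]):
            return s[i], i + 1
        return None
    return parse

def many(p: Parser) -> Parser:
    """Apply p zero or more times and collect the results in a list.
    p must consume input on success, otherwise this loops forever."""
    def parse(s: str, i: int):
        out = []
        r = p(s, i)
        while r is not None:
            v, i = r
            out.append(v)
            r = p(s, i)
        return out, i
    return parse

def seq(*ps: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(s: str, i: int):
        out = []
        for p in ps:
            r = p(s, i)
            if r is None:
                return None
            v, i = r
            out.append(v)
        return out, i
    return parse

def fmap(f: Callable[[Any], Any], p: Parser) -> Parser:
    """Transform the value produced by p with f."""
    def parse(s: str, i: int):
        r = p(s, i)
        if r is None:
            return None
        v, j = r
        return f(v), j
    return parse

digit = char(lambda c: c in "0123456789")
plus = char(lambda c: c == "+")

# number ::= digit digit*
number = fmap(lambda r: int(r[0] + "".join(r[1])), seq(digit, many(digit)))

# sum ::= number ("+" number)*   (evaluated while parsing)
sum_expr = fmap(
    lambda r: r[0] + sum(n for _, n in r[1]),
    seq(number, many(seq(plus, number))),
)

result = sum_expr("1+22+3", 0)  # (value, position after the parsed text)
```

Each combinator is an ordinary higher-order function, so a grammar rule becomes a plain expression built from smaller parsers. This sketch omits error messages, backtracking control and left recursion, all of which a real library has to deal with.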

This talk is part of Tunight, held by TUNA. The attached video is in Chinese (4.2 GB) and can be downloaded via cloud.tsinghua.