Monday, January 07, 2008
YEPP, the Eiffel Parser Producer
Bottom-up shift-reduce parsers (such as those generated by Gobo Eiffel Yacc) are flexible and efficient. They run fast because as they scan the input tokens they consider multiple possible constructs at the same time (in parallel, if you like).
Their downside is that it's mighty hard to code them in such a way as to get them to emit really good error messages when they encounter a syntax error. By the time you have added enough error handling to achieve good syntax error messages, you may well have lost the power, convenience and flexibility originally offered by the shift-reduce approach.
An alternative approach is the top-down parser, which in modern programming languages usually makes use of recursive descent to climb down the parse tree. It's likely to be a bit slower than a shift-reduce parser, particularly for grammars that are very complex at the lower levels (expressions downwards).
But if there's a syntax error, the top-down parser can give a very clear and helpful error message based on where it's up to in the descent of the parse tree.
Top-down parsers are fairly straightforward to write, but they are tedious. There's lots of code which follows a fairly repetitive pattern, but with just enough variation from construct to construct that it's hard to abstract out the patterns. If you don't want to write the parser manually, you can use a parser generator (whereas with a shift-reduce parser, you would almost always use a parser generator because of its added complexity).
Cyril Adrian has put together the YEPP Eiffel Parser Producer. It's meant to replace the venerable lex/yacc couple for SmartEiffel users. Its input files use an Eiffelish syntax extended to allow simple grammar declarations in an EBNF notation. Its output is an Eiffel class. It is built using (both in itself and for its output) ESE's parse library, which implements a top-down parsing strategy.
YEPP is part of the Enterprise SmartEiffel project
Top-down parsing is also found in Java's ANTLR parser-generator. Shift-reduce parsing is found in yacc and bison.
Their downside is that it's mighty hard to code them in such a way as to get them to emit really good error messages when they encounter a syntax error. By the time you have added enough error handling to achieve good syntax error messages, you may well have lost the power, convenience and flexibility originally offered by the shift-reduce approach.
An alternative approach is the top-down parser, which in modern programming languages usually makes use of recursive descent to climb down the parse tree. It's likely to be a bit slower than a shift-reduce parser, particularly for grammars that are very complex at the lower levels (expressions downwards).
But if there's a syntax error, the top-down parser can give a very clear and helpful error message based on where it's up to in the descent of the parse tree.
Top-down parsers are fairly straightforward to write, but they are tedious. There's lots of code which follows a fairly repetitive pattern, but with just enough variation from construct to construct that it's hard to abstract out the patterns. If you don't want to write the parser manually, you can use a parser generator (whereas with a shift-reduce parser, you would almost always use a parser generator because of its added complexity).
Cyril Adrian has put together the YEPP Eiffel Parser Producer. It's meant to replace the venerable lex/yacc couple for SmartEiffel users. Its input files use an Eiffelish syntax extended to allow simple grammar declarations in an EBNF notation. Its output is an Eiffel class. It is built using (both in itself and for its output) ESE's parse library, which implements a top-down parsing strategy.
YEPP is part of the Enterprise SmartEiffel project
Top-down parsing is also found in Java's ANTLR parser-generator. Shift-reduce parsing is found in yacc and bison.
Comments:
<< Home
Hi, happy new year to all of you :-)
Thank you Roger for your kind support.
Note that SmartEiffel's parser is also a top-down parser and AFAIK it's quite fast :-) It's a lot faster than yepp, I think by at least an order of magnitude... because it's hand-crafted and optimized using look-ahead techniques (which yepp does not use, only backtrack).
Also note that ESE's parse library supports the whole SmartEiffel grammar. It may be useful for some people.
Post a Comment
Thank you Roger for your kind support.
Note that SmartEiffel's parser is also a top-down parser and AFAIK it's quite fast :-) It's a lot faster than yepp, I think by at least an order of magnitude... because it's hand-crafted and optimized using look-ahead techniques (which yepp does not use, only backtrack).
Also note that ESE's parse library supports the whole SmartEiffel grammar. It may be useful for some people.
<< Home