I had a (third) chat with Brian Andresen today about Lexers and Parser Generators, a type of software that's working overtime to suck worse than most email clients.
Brian Andresen
11:06:49
remember we were looking for Unicode-savvy parsing tools a while back?
Seth Dillingham
11:06:59
I do
Brian Andresen
11:07:04
I'm looking for new parser-generator tools for my project at work
Seth Dillingham
11:09:23
oh man, very interesting
Brian Andresen
11:09:51
I suspect that the author is more mindful of capability than performance
still, I plan to download it and try a small grammar with GOLD
I'll also try out ANTLR (again) and ACCENT (http://accent.compilertools.net/)
(why do all of these tools use all-caps names? dunno.)
Seth Dillingham
11:11:18
ANTLR is an acronym, I don't know about the rest
not that it matters much
Brian Andresen
11:11:40
yeah
Seth Dillingham
11:12:00
I know ANTLR made some more progress, but I haven't played with it.
heh. Check out the first News Item on the ANTLR home page
Brian Andresen
11:12:59
bah, GOLD is Win32-only.
Seth Dillingham
11:13:07
Grr.
This whole class of software is a joke.
Brian Andresen
11:13:24
2009?! we're going to have to wait a long time for that next beta.
Seth Dillingham
11:14:09
But the news is in past tense! I think they're probably just thousands of timezones ahead of us. That would explain it.
Brian Andresen
11:14:17
heh
yeah, there's not much out there that's free. I've found a bunch of commercial tools, but none of them have even inspired me to request a trial version
Seth Dillingham
11:24:57
Brian Andresen
11:25:11
that's very relevant, thank you
Seth Dillingham
11:25:32
Looking at this stuff never makes me happy. :-(
Brian Andresen
11:25:40
no kiddin'
well, that just left me with a grand total of zero tools to investigate. lame.
Seth Dillingham
11:26:18
I think, "I could devote a ton of my time to learning the issues and helping to fix these problems. Or I could make a living and have a life, and make do with what I already have."
Brian Andresen
11:26:31
yep
Seth Dillingham
11:26:58
But it still makes me nuts. My work could really benefit from a good, Unicode-savvy parser generator.
with a C++ target
Brian Andresen
11:28:01
yeah. the thing that got me started on this (for Agilent) was how poorly designed lex/yacc (and flex/bison) are for providing code to be part of a larger project
they were designed for making a standalone executable that doesn't need to do much more beyond the parsing, it seems
Seth Dillingham
11:28:56
they're specifically for feeding a compiler, right?
Brian Andresen
11:29:22
our simulator already has six lex/yacc-based parsers in the code, and we end up having to mangle the yy___ symbols and other globals to even just make it link
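An aside on the linking problem Brian describes: classic lex/yacc output puts its entry points and state in fixed global names (yyparse, yylval, yytext and friends), so two generated parsers in one program collide at link time unless the symbols are renamed. flex and bison do offer prefix options for that (-P and -p respectively). The sketch below shows the alternative shape, with each parser keeping its state in its own object inside its own namespace. It is hand-written, not generated output, and every name in it (netlist, options, Parser, parse) is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// All names here are hypothetical; this is not lex/yacc output.
namespace netlist {
// Per-parse state lives in an object, not in file-scope globals.
struct Parser {
    std::string_view text;
    std::size_t pos = 0;

    // Counts whitespace-separated tokens in the input.
    int parse() {
        int tokens = 0;
        while (pos < text.size()) {
            // Skip leading whitespace.
            while (pos < text.size() &&
                   std::isspace(static_cast<unsigned char>(text[pos]))) ++pos;
            if (pos == text.size()) break;
            ++tokens;
            // Consume the token itself.
            while (pos < text.size() &&
                   !std::isspace(static_cast<unsigned char>(text[pos]))) ++pos;
        }
        return tokens;
    }
};
}  // namespace netlist

namespace options {
// Same type and method names as above, but no clash: the namespace differs.
struct Parser {
    std::string_view text;
    std::size_t pos = 0;

    // Counts '=' characters as a stand-in for key=value pairs.
    int parse() {
        int pairs = 0;
        for (; pos < text.size(); ++pos) {
            if (text[pos] == '=') ++pairs;
        }
        return pairs;
    }
};
}  // namespace options
```

Both parsers can be linked into one program and even run side by side, because nothing is shared between them: netlist::Parser{"R1 n1 n2 10k"}.parse() and options::Parser{"a=1 b=2"}.parse() each touch only their own object.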
Seth Dillingham
11:29:57
(a particular kind of compiler/builder, I mean)
Brian Andresen
11:30:06
yeah
Seth Dillingham
11:30:17
Hey, do you mind if I post this conversation on [tw]? I'll hide your handle.
Brian Andresen
11:31:36
oh right, and my other gripe was memory management. Suppose we're parsing through a line and allocating memory for various things as we go. Then we hit a syntax error. There are ways to design the rules to do error recovery, but designing the rules to allow error recovery to clean up all allocated memory is not obvious at all.
go for it
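An aside on the memory-management gripe: one common way around it is to stop making the grammar rules responsible for cleanup at all, and have a single owner hold everything allocated during a parse. Bison also has a %destructor directive aimed at this problem, but the sketch below uses the owner ("arena") approach instead, in a tiny hand-written parser rather than a yacc grammar. Node, Arena and parse_numbers are hypothetical names made up for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string_view>
#include <vector>

// Hypothetical node type; `live` counts instances currently alive.
struct Node {
    int value;
    static int live;
    explicit Node(int v) : value(v) { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Owns every node created during one parse. Destroying the arena frees
// them all, whether the parse succeeded or stopped at a syntax error.
class Arena {
public:
    Node* make(int v) {
        nodes_.push_back(std::make_unique<Node>(v));
        return nodes_.back().get();
    }
    std::size_t size() const { return nodes_.size(); }
private:
    std::vector<std::unique_ptr<Node>> nodes_;
};

// Parses space-separated non-negative integers; throws on anything else.
// The returned pointers are non-owning views into the arena.
std::vector<Node*> parse_numbers(std::string_view text, Arena& arena) {
    std::vector<Node*> out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] == ' ') { ++i; continue; }
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) {
            throw std::runtime_error("syntax error");
        }
        int v = 0;
        while (i < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[i]))) {
            v = v * 10 + (text[i] - '0');
            ++i;
        }
        out.push_back(arena.make(v));
    }
    return out;
}
```

If parse_numbers throws halfway through a line, the nodes built so far are still owned by the Arena, so nothing leaks: they go away when the arena does. The trade-off is that nodes cannot outlive the arena, which suits a parse-then-consume workflow better than long-lived trees.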
Seth Dillingham
11:32:48
thanks