Saturday, January 14, 2012

I'd like to have a text parser, like Perl's Text::ParseWords from CPAN,
that *only* breaks the text into words,
and which does not transform the words, process escape characters, etc.

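For example,
      shellwords("a b 'c d' e")
returns the words a, b, c d, and e, with the quoted word coming back as
   c d
i.e. it breaks the text up into words,
but it also transforms the words (here, by stripping the quotes).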

I would like to separate the breakup from the transformation,
so that the breakup step alone returns that word untouched:
   'c d'

Note that if you ever encounter such a list whose words can themselves be further broken up
(for example, a bare c d, which would split again into c and d),
then you know that it has been processed by some tool after your original parser.
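
A minimal sketch of what such a break-only parser might look like, written in Python rather than Perl. The names split_words_raw and unquote are hypothetical, and the quoting rules here are an assumption made for illustration (single quotes are literal, a backslash escapes the next character outside quotes and inside double quotes); they are not claimed to match Text::ParseWords exactly. The point is only that the splitter returns slices of the original text, and any transformation is a separate, optional step.

```python
def split_words_raw(text):
    """Break text into words on unquoted whitespace.

    Each word is returned exactly as it appears in the input:
    quotes and backslashes are kept, nothing is transformed.
    """
    words = []
    start = None   # index where the current word began, or None between words
    quote = None   # the currently open quote character, or None
    i = 0
    while i < len(text):
        ch = text[i]
        if quote is not None:
            # Inside quotes: whitespace does not end the word.
            if ch == "\\" and quote == '"' and i + 1 < len(text):
                i += 1            # skip the escaped character (double quotes only)
            elif ch == quote:
                quote = None      # matching quote closes the quoted span
        elif ch.isspace():
            if start is not None:
                words.append(text[start:i])   # slice of the original text
                start = None
        else:
            if start is None:
                start = i
            if ch in "'\"":
                quote = ch
            elif ch == "\\" and i + 1 < len(text):
                i += 1            # escaped character stays part of the word
        i += 1
    if start is not None:
        words.append(text[start:])            # an unterminated quote runs to the end
    return words


def unquote(word):
    """A separate transformation step: strip one pair of enclosing quotes."""
    if len(word) >= 2 and word[0] == word[-1] and word[0] in "'\"":
        return word[1:-1]
    return word


raw = split_words_raw("a b 'c d' e")
print(raw)                        # ['a', 'b', "'c d'", 'e']
print([unquote(w) for w in raw])  # ['a', 'b', 'c d', 'e']
```

Because the first step only slices the input, re-running it on any of its own words gives that word back unchanged; a word that does split further, such as the bare c d, must have gone through a transformation like unquote.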

