Andy "Krazy" Glew is a computer architect, a long time poster on comp.arch ... and an evangelist of collaboration tools such as wikis, calendars, blogs, etc. Plus an occasional commentator on politics, taxes, and policy. Particularly the politics of multi-ethnic societies such as Quebec, my birthplace. Photo credit: http://docs.google.com/View?id=dcxddbtr_23cg5thdfj
Saturday, January 14, 2012
I'd like to have a text parser, like Perl's Text::ParseWords from CPAN,
that *only* breaks the text into words
- but which does not transform the words, strip quotes, handle escape characters, etc.
For example,
Text::ParseWords::shellwords("a b 'c d' e")
returns
a
b
c d
e
i.e. it breaks the text up into words,
but it also transforms the words: here, it strips the quotes from 'c d'.
I would like to separate the breakup from the transformation:
a
b
'c d'
e
Note that if you ever encounter such a list whose words can themselves be further broken up,
then you know that it has been parsed by some tool after your original parser.
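
A minimal sketch of such a split-only parser, in Python rather than Perl, assuming shell-like rules (single quotes, double quotes, and backslash escapes group characters into one word). The names `split_words_raw` and `unquote_words` are hypothetical and not part of any library; the regular expression is one possible approximation of shell quoting, not a full shell grammar.

```python
import re
import shlex

# A word is a run of: a single-quoted string, a double-quoted string,
# a backslash-escaped character, an ordinary non-space character,
# or (as a fallback) a stray quote or backslash that has no partner.
_WORD = re.compile(
    r"""(?:'[^']*'|"(?:\\.|[^"\\])*"|\\.|[^\s'"\\]|['"\\])+""",
    re.DOTALL,
)

def split_words_raw(text):
    """Break text into words, leaving quotes and escapes untouched."""
    return _WORD.findall(text)

def unquote_words(words):
    """Separate transformation step: strip quotes from each word.

    Assumes every word is well formed; shlex raises ValueError
    on an unterminated quote.
    """
    return [shlex.split(w)[0] for w in words]

words = split_words_raw("a b 'c d' e")
print(words)                 # ['a', 'b', "'c d'", 'e']
print(unquote_words(words))  # ['a', 'b', 'c d', 'e']
```

Because every word is an unmodified substring of the input, the breakup loses nothing; the quote removal happens only if and when the second function is called.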
[[Category:Programming]] [[Category:Text]]