The content of this blog is my personal opinion only. Although I am an employee - currently of Imagination Technologies's MIPS group, in the past of other companies such as Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Sunday, November 21, 2010

Moving programming languages beyond ASCII text

Just read Poul-Henning Kamp's CACM article "Sir, Please Step Away from the ASR-33!"


Constraining ourselves to ASCII has led to less readable programming languages and programs, and has probably also led to bugs and security break-ins, as Kamp suggests with C's & and &&.

Kamp advocates using Unicode, with "the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs..."

I am not so sure that I would go all the way to full Unicode, if for no other reason than that I don't know how I would edit a program that used "... 'Dentistry symbol light down and horizontal with wave' (0x23c7)", as Kamp finishes the previous quote. People complain about APL being write-only! But certainly a subset of Unicode would help: subscripted letters in the major alphabets, and ideograms.

Let us end not just the tyranny of ASCII, but the tyranny of English. Variable names with accents for the rest of the Europeans; variable names in Chinese and Japanese ideograms. Internationalization systems for program variable names and keywords!!!!

Okay, maybe that goes a bit far - but nevertheless I feel the pain of programmers whose native language is not English.

Kamp goes on to talk about using wide screens, e.g. "Why can't we pull minor scopes and subroutines out in that right-hand space...?"

I'm not sure about that, but I am a big user of tables. Truth tables, etc., that make it easy to see when you have handled all of the possible cases. Arrange the code in a table; heck, click through to zoom in and see details if the code in a table cell gets too hard to read even on our wide screens.
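
As a minimal sketch of the table idea, here is a decision table written in Python, with a check that every combination of inputs has a cell. The conditions (is_admin, is_owner), the actions, and the helper names are all hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical decision table: (is_admin, is_owner) -> action.
# Each row is one cell of the truth table; laying the rows out together
# makes a missing combination easy to spot by eye.
ACTIONS = {
    (True,  True):  "edit",
    (True,  False): "edit",
    (False, True):  "edit",
    (False, False): "view",
}


def missing_cases(table, n_inputs):
    """Return every combination of n_inputs booleans that table lacks."""
    return [case for case in product((True, False), repeat=n_inputs)
            if case not in table]


def decide(is_admin, is_owner):
    """Look up the action for one combination of inputs."""
    return ACTIONS[(is_admin, is_owner)]
```

Calling missing_cases(ACTIONS, 2) returns an empty list when the table is complete, so the "have I handled every case?" question becomes something a tool can answer rather than something the reader has to eyeball.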

And Kamp goes on to talk about using color for syntax. Again, amen... although that raises the possibility of red/green colorblindness causing new bugs.

Myself, I learned to program from the Algol 60 primer, with boldfaced keywords.

I think, however, most of this is presentation. We need to start using something like XML as the basis of our languages. It can express all of the above. It can be presented using color, fonts, etc., on wide and narrow screens. It can be customized to your local display devices using CSS.
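
A rough sketch of what "most of this is presentation" could look like, in Python with the standard xml.etree.ElementTree module. The element names (stmt, keyword, var, num), the to_html helper, and the stylesheet are assumptions made up for this example, not any existing standard.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML encoding of the statement: if x > 0 then y := x
SOURCE = (
    "<stmt><keyword>if</keyword> <var>x</var> &gt; <num>0</num> "
    "<keyword>then</keyword> <var>y</var> := <var>x</var></stmt>"
)

# Hypothetical stylesheet: boldfaced keywords, Algol 60 primer style.
# Swap in colors or fonts per display device without touching the program.
STYLE = ".keyword { font-weight: bold } .num { font-style: italic }"


def to_html(elem):
    """Render an element as <span class="TAG">...</span>, recursively."""
    parts = [escape(elem.text or "")]
    for child in elem:
        parts.append(to_html(child))
        # Text between this child and the next sibling lives in child.tail.
        parts.append(escape(child.tail or ""))
    return '<span class="%s">%s</span>' % (elem.tag, "".join(parts))
```

The program text never says "bold" or "red"; it only says "this is a keyword". How that shows up on a wide screen, a narrow screen, or for a colorblind reader is left to the stylesheet.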

I know: many people hate XML.  I suggest it mainly because it exists, and is sufficient, and there are many existing tools that can be leveraged.

And did I mention language extensions via domain-specific languages? Many conference speakers and famous California professors advocate them.

Me, I have mixed feelings about domain-specific languages: I want to express things as is natural in my domain, but I want all of my existing tools, my emacs modes, my interface generators, etc., to have a chance of handling them. More important than anything, an XML-based representation can be parsed by language processors that do not necessarily know the syntax of the domain-specific language in use.
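
To make that concrete, here is a small Python sketch of a tool that finds and renames variables without knowing anything about the embedded sub-language. The sql element stands in for some domain-specific language; it, the var element, and both helper functions are hypothetical names for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical program fragment. <sql> stands in for a domain-specific
# sub-language that the tool below knows nothing about.
SOURCE = (
    "<assign><var>rows</var> = "
    "<sql>select name from users where id = <var>user_id</var></sql>"
    "</assign>"
)


def variables_used(root):
    """List every variable name, however deeply it is nested."""
    return sorted({var.text for var in root.iter("var")})


def rename_variable(root, old, new):
    """Rename each <var> whose text is old; return how many were changed."""
    count = 0
    for var in root.iter("var"):
        if var.text == old:
            var.text = new
            count += 1
    return count
```

Neither function contains a line of SQL parsing. The host variable inside the query is found and renamed only because it is marked up the same way as every other variable.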

Think of how hard it was to extend C to C++: whenever they added a new keyword, existing programs using that keyword as a variable broke. This will not happen if your keywords are distinguished in XML, using any of

< keyword > if < / keyword >
< keyword ascii="if" / >
< if type="keyword" / >

I don't terribly care which representation is used, so long as there are standard tools to convert back and forth to a form that my other tools can expect to receive.
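
Here is a sketch in Python of why the markup avoids the C-to-C++ problem, using the first of the three forms above (without the extra spaces). The var element, the to_flat_text helper, and the trailing-underscore convention for colliding names are all assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical old program that uses "class" as a variable name.
# The markup says it is a variable, so it stays unambiguous even if
# the language later gains a "class" keyword.
SOURCE = (
    "<stmt><keyword>if</keyword> (<var>class</var> &gt; 0) "
    "<var>class</var> = 0;</stmt>"
)


def to_flat_text(elem, keywords):
    """Convert to plain text for tools that expect a flat source file.

    A variable whose name collides with a keyword gets a trailing
    underscore; that convention is made up for this sketch.
    """
    text = elem.text or ""
    if elem.tag == "var" and text in keywords:
        text += "_"
    parts = [text]
    for child in elem:
        parts.append(to_flat_text(child, keywords))
        parts.append(child.tail or "")
    return "".join(parts)
```

With keywords set to {"if"} the flat text keeps the variable name class as it is; add "class" to the keyword set and the same stored program is emitted with class_ instead. The stored program itself never has to change.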

(Although the multiple tries it took me to get Google's blogging software to type XML < literals > - I had to add spaces - show how much tool change is needed.)
