The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Wednesday, February 22, 2012

emacs needs FindBin

I was just installing org-mode for the old version of emacs at work.  Don't have root or admin, so installed it in a private directory.  Had to modify my emacs load-path to be able to make things work.

Which led me to realize: emacs doesn't have FindBin.  And needs it.  Something that says "for everything in this file, add the directory of that file to load-path" (but not for stuff elsewhere).

Everyone needs FindBin.

Source relative naming is a key for reducing name collisions.
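Perl's FindBin makes this trivial; the same idea can be sketched in Python terms (the function name here is mine, not any real library's):

```python
import os
import sys

def add_bin_dir(file_path):
    """FindBin's idea: add the directory containing `file_path`
    (normally __file__) to the module search path, so source-relative
    lookups work no matter what the current directory is."""
    here = os.path.dirname(os.path.abspath(file_path))
    if here not in sys.path:
        sys.path.insert(0, here)

# typical use, at the top of a script:
# add_bin_dir(__file__)
```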

Now, about the security implications...

Gordon Moore and the Talent Economy


Friday, February 17, 2012

Email address as website login ID

More and more websites are allowing you to use your email address as your website login ID or user name.  This is good.

What is not so good is that email addresses change over time.  Now, hopefully you can change the email address that the website uses to send email to you.

However, today I encountered a website that allows you to change your email address - but not the user name, which was the email address that you had when you first registered with the site.

So I am now in the situation of

a) Having to login with MyReallyOldObsoleteEmailAddress@defunct-mail-server.net

b) Getting email sent to newaddress@current-mail.net


Online vs. Paper Mail Notifications

Many businesses - stockbrokers, health insurance, telephone companies, etc. - want to save money ("and trees") by having you receive notifications online.

I am a bit of a Luddite in that I like receiving paper mail.  Not for everything - but I know just how fragile the email system is, and how easy it could be to lose access to all of my email accounts.  Less easy now than it used to be, but nevertheless...

Actually, what I really want is both paper and electronic.  Paper as a backup.  Electronic so that I can easily search.

But note that I said "electronic" and "easily search".  Most online notifications are pull - as in, "go to the website and download the bill".  Or, maybe "we will send you email when a new item is posted to your account, but then you must go to the website and pull it down yourself".  Frankly, I am too lazy and too busy to want to do this.

I want push, and/or I want automation.  I want such information aggregated automatically.  (Yes, I know about mint.com)

Also: I keep paper records for all of my financial stuff.  If it's online only I have to print it out myself.  If they paper mail it to me, I save a step.  That's worth it to me, although perhaps not to them.  (I have asked my financial providers to use online for most stuff, but to send me full paper mail annual or quarterly reports.  But they usually don't have that flexibility.)

Here's an idea:

Instead of the only options being (0) Paper mail only, and (1) Online only.

How about

(1.5) Paper and online. Some sites allow pure pull from the website, but decide not to email you when they send you paper mail notifications.

(1.75) Online - but if you have not acknowledged that you have looked at it in a month, or whatever the critical time period is, send a paper version.

I wonder if there could be a "full spectrum aggregator" - one that not only aggregates your financial info, but also aggregates all of these things like health insurance.

The proliferation of private email systems sucks.

Sunday, February 12, 2012


GTD (Getting Things Done)

Tickler file - perpetual calendar.  Replay for humans - place a card in the calendar at about the right time.

It is sad that physical index cards are often easier to do stuff like this with than computer software.  Most calendar programs think that an event occurs, and recurs, only on specific dates and with a specific frequency.

I once had a PalmPilot PDA calendar program that understood stuff like allergy shots: the reminder appeared after the minimum spacing, e.g. 14 days, becoming more insistent as you approached the maximum spacing, say 28 days, and automatically rescheduled itself with the same spacing relative to when you actually said that you got your shot.  I.e. T[n] = T[n-1] + dist(14,28).  Not T[n] = T[0] + n*21 +/- 7.   It makes a big difference.
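A sketch of that scheduling rule, using the example numbers above (the function name and defaults are mine):

```python
from datetime import date, timedelta

def next_window(last_done, min_days=14, max_days=28):
    """The reminder window is anchored to when the task was actually
    done - T[n] = T[n-1] + dist(min, max) - rather than to a fixed
    original schedule like T[n] = T[0] + n*interval."""
    return (last_done + timedelta(days=min_days),
            last_done + timedelta(days=max_days))

lo, hi = next_window(date(2012, 2, 1))
# window opens Feb 15 and closes Feb 29 (2012 is a leap year)
```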

It's sad that paper index cards work better than software. Perhaps I can write something better, in my copious spare time.

I'd like to have the good stuff that computers bring: search, cross indexing, the ability to be filed in more than one category.

Key things: defer.  Don't even show it on the ToDo list until such and such a date.


Standard Perpetual calendar
* to a day 1-31, in the rotating days of month
* to a month Jan-Dec, in the rotating months of year

Not so standard:
* to this afternoon, evening, next morning, lunch
* to the weekend

Precise date and time - sure, maybe.

I would like to keep one rotating tickler file, but I need to separate work and personal.  SW should allow that.  But for now, using paper index cards, I don't create a tickler file for each.

Now/Soon/Someday is often as much as I can be bothered with.

We may want to be able to defer, and not think about, certain actions.  But we may also want to be able to recall a deferred action - e.g. "Tell me the next thing I have to do wrt my personal computing environment - my mind is too slagged for real work.  I don't care if it has been put off to a date in the future."

Thursday, February 09, 2012

PSON examples (possible)

Here are some examples of the notation.

BTW, why am I posting here instead of on wiki?  Mainly because it is quicker to post here than on my wiki.  Mediawiki is slow. Too many clicks.  In twiki I had comment mode. (TBD: install twiki on my webhost. Want ACLs.)

Anyway, examples - of the possible PSON "compact" notation, and then the corresponding fully parenthesized version (although I may not use quotes, etc., everywhere):

a: b c d
a: [b,c,d]

a:b c d

a: b, c, d

a: b, c, d;

a: b c d, e f

a:b c:d e:f

a: b c:d

a: b c: d

a: b=c d=e f=g

a: b=c d e f=g


a: b=c d e f=g h:i

a: b c d
a: [b,c,d]

a b:c d

It seems a bit weird to have
a: b c d => a: [b,c,d]


x a: b c d => [x a:b,c,d]

Is this really what I want?

It seems to mean that label: at the beginning of a string means something different than in the middle.

a,b,c: d e,f
[a,b,c:[d e],f]

a,b, x c:d e,f

a,b,c:d e,f
[a,b,c:[d e],f]


Mixing array and name syntax

So, straight array syntax and straight name=value is handled in many ways
Array: [ a, b, c, d ]
Struct: { k1=v1, k2=v2 }
Or , paren-less (or implicit parens):
Array:  a, b, c, d
Struct:  k1=v1, k2=v2

Q: but what about intermixing - having a key-value appear in the middle of a list of values:
a, b, c, d=e, f, g, h
Should this be interpreted as a struct in an array, or what?

(Again, emphasizing: this is just for convenience. Explicit parens always available.)

(1) implicit struct element

[ a, b, c, {d=e}, f, g, h]

(2)  explicit in same array/struct, inside/in addition to sequence

A different way of looking at it might be to say "All array notation is really just implicit name=value pairs":

Array: [ a, b, c, d ]
is equivalent to
Array: { 0:a, 1:b, 2:c, 3:d }
This would suggest that
[ a, b, c, d=e, f, g, h]
should be
{ 0:a, 1:b, 2:c, d:e, 3:e, 4:f, 5:g, 6:h }
although even that is not so explicit as one might like: we may want to emphasize that the elements "named" d and 3 actually point to the same value, not to different elements that happen to have equal values.

(3) explicit in same array, but outside of sequence

Rather than creating an alias to the positional notation, put it outside the positional sequence:

[ a, b, c, d=e, f, g, h]
should be
{ 0:a, 1:b, 2:c, d:e, 3:f, 4:g, 5:h }
This seems to be what I have wanted to do most often with keyword arguments for functions:

add( 1, 2) 
we want to refer to the positional "1" and "2" using the same names, irrespective of where the keyword parameter is written.

BOTTOM LINE:  I think I prefer (3) explicit in same array, but outside of sequence
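A sketch of option (3) in Python, using 0-based positions as in the { 0:a, 1:b, 2:c, 3:d } equivalence above (the tuple convention for key=value pairs is my own):

```python
def parse_mixed(items):
    """Option (3): key=value pairs sit outside the positional
    sequence, so positional indices are assigned only to the
    unnamed elements."""
    result = {}
    pos = 0
    for item in items:
        if isinstance(item, tuple):      # a (key, value) pair, like d=e
            key, value = item
            result[key] = value          # named; consumes no position
        else:
            result[pos] = item
            pos += 1
    return result

parse_mixed(["a", "b", "c", ("d", "e"), "f", "g", "h"])
# → {0: "a", 1: "b", 2: "c", "d": "e", 3: "f", 4: "g", 5: "h"}
```

This matches how keyword arguments usually behave in function calls: the keyword does not shift the positions of the remaining positional arguments.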



Talking with someone about GNU make passes - first evaluating the dependency graph, and then building it.

.SECONDEXPANSION http://www.gnu.org/software/make/manual/make.html#Secondary-Expansion - a kluge, IMHO.  Although it may be good enough.

One of my friends got annoyed enough by passes that he wrote his own build tool. (Hi, Mark!)

Suggested relaxation - basically, keep passing over, until changes stop happening.

Observation: in a language like Perl or Make, with sigils like $var, each pass might strip a sigil.  1st pass $VAR, second $$VAR, third $$$$VAR, etc.  Ugly!  I hate counting backquotes.

Languages that use unadorned variable names look prettier.

But are they any more or less likely to have infinite relaxation cycles?  Probably no more.
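The relaxation idea can be sketched as a fixed-point loop over $NAME substitutions (the sigil pattern and pass limit here are my choices):

```python
import re

def relax(text, bindings, max_passes=10):
    """Keep substituting $NAME references until a pass changes
    nothing (a fixed point), or give up after max_passes in case
    of an infinite relaxation cycle."""
    for _ in range(max_passes):
        new = re.sub(r"\$(\w+)",
                     lambda m: bindings.get(m.group(1), m.group(0)),
                     text)
        if new == text:              # fixed point reached: done
            return new
        text = new
    raise RuntimeError("possible infinite relaxation cycle")

relax("$A and $B", {"A": "$B", "B": "done"})
# → "done and done"
```

Note that no sigil-doubling is needed: a reference that expands to another reference simply gets picked up on the next pass.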


How does this relate to my desire to set a bit in text to say "Never parse this?"  Not much.  Or, rather, that may be the key to reducing user errors with relaxation. Set the bit, and never substitute again.


Hey, maybe this is the key: there are two types of text, syntactic text - into which substitutions may be done - and non-syntactic, terminal, finalized text.

If you arrive at terminal text you are done, done, done.

Syntactic text may continue, or may stop relaxing when there are no more changes.

Most languages make terminal text harder to express - quoted string.

I've worked with languages that make syntactic text harder and uglier - e.g. Algol W with 'IF' 'ELSE' quoted keywords.

The key is notations that allow us to range from easy syntactic text / ugly terminal text, through to ugly syntactic text / unquoted, easy terminal text.  And points in the middle...

Google calendar scrollbar oscillation

For the past week I have been plagued by oscillation of the scrollbars in Google Calendar.

In Chrome.  Typically when fullscreen on my 1200x1920 monitor.

Scrollbars on, scrollbars off.

Changing the window size remedies it.

Just goes to show that Google can have annoying bugs, just like anyone else.

VNC client evaporation

Today I have been plagued by my VNC client "evaporating".

Probably because my VNC window is pretty large, 1200 high by circa 6000 wide, to span all of my displays.

Nested structure without parentheses

In my thoughts about PSON (Pseudo Object Notation), descended from Pseudo-XML and earlier Perl-SQL notation:

Stipulate the usual: nice parens and quotes.  I'm just looking to strip out visual clutter and create a more human readable subset - e.g. so that my debug lines can be parseable.

Earlier I blogged about letting = be a tighter binding strength than :, so that
s: a=1, b=2, t: 7
s: { a=1, b=2 }, t: 7
 There's no need for binding strength.  Can just have any name = or name : without an atom:
s= a=1 b:2, t=7
s= {a=1 b:2}, t=7

Delineating the end of the struct is an issue. Hinted at above by s= a=1 b:2, t=7

We can use whitespace separated elements, comma-separated, semi-colon separated, period separated as a hierarchy of nesting:

         a b, c, d, e; f g h, i j k

         (((a b), c, d, e); ((f g h), (i j k)))

Not general, because only three levels.  But human friendly, similar to what I do naturally in real life.
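This three-level separator hierarchy is easy to sketch by splitting on the outermost separator first:

```python
def parse_levels(s):
    """Semicolon, comma, and whitespace as three fixed levels of
    nesting, outermost first."""
    return [[group.split() for group in clause.split(",")]
            for clause in s.split(";")]

parse_levels("a b, c, d, e; f g h, i j k")
# → [[['a', 'b'], ['c'], ['d'], ['e']], [['f', 'g', 'h'], ['i', 'j', 'k']]]
```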

Q: intermixing : and = binding, and the blank/comma/semicolon separated lists?  Who binds first?

<angle brackets in titles confuses blogger>

If you put angle brackets in the title of a blogger post, like this:
<angle brackets in titles confuses blogger>
blogger gets confused and decides that it is an untitled post - whereas if there is truly no title, blogger guesses from the first line.

Not clear that this is a bug, but certainly a feature of questionable merit.

Yet another quotification issue.

Monday, February 06, 2012


<indented-block  left-margin="|">
     |      is

< tags with words breaking XML rules >

<tag some attributes>

<test expected-time="10min" start="">

</test elapsed-time=5min>

<tags that occupy a line by themselves are fairly clear/>

and, if so restricted, allowing < and > inside text is simple.

<tag>on same line</tag> and <more/>
not so bad.  But the more that is allowed, the more possibilities for confusion.

Serialization Formats

There are so many: http://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats And yet, I want some more, with my pseudo-XML (or pseudo-JSON, or whatever)
    ASN.1: a schema notation with binary and XML encodings, defining how values are represented (as text, not attributes).
    Netstring: length:strings,
    JSON: hashes and arrays; keys are strings, just like values. 
    OGDL: trees, indentation based, with commas and parentheses for when you don't want new lines. #comments and references. Guaranteed round tripping, except for comments.
    Property lists; NeXT to Apple. 
    YAML: outline indentation, name:, little quoting, {hash: value}, [arrays, ...]; &id anchors and *id references, name: value, name: !!type value, !!binary 
    XML: <tag>barewords</tag>, <tag/>, attribute metadata. Basically simple, and then a whole lot of crap descended from SGML. Schemas and DTDs :-( Namespaces, references, XQuery, XSLT. An occasionally uncomfortable distinction between metadata (attributes) and data (text).
Why can't we just get along? Why can't they all interconvert?

Why I dislike XML

I want to be able to write <matching-bracket>
text, maybe some math, like a <b and c> d </matching-bracket>
without having to write ugly escape codes. Heck, this WYSIWYG blog editor is an example - I can't figure out how to type the escape code for & lt ;

Why I like XML

In XML, or pseudo XML, you can see the matching brackets

<matching bracket> stuff
</matching >

whereas in simply parenthesized notations like JSON

{n1: {n2: n3: "do you have enough matching brackets?" }}}


You can usefully define semantics for mismatching constructs. XML also allows unquoted text, because of the very verbosity of its punctuation.

Using = and : binding operators to delineate structure

I have messed around with "pseudo-XML" a lot. Oh, heck, who cares about the XML part, except that it is a standard?  Structured, simple, data.  Trying to make stuff human readable, but also parseable.

Here's a new-to-me trick:

I often use key=value pairs.

Sometimes I use name: value.

I.e. both = and : are nice ways of binding name and value.

How about:
n1: nA=vA, nB=vB, nC=vC
=>  <n1 nA="vA" nB="vB" nC="vC"/>
or whatever way you want to express name/value in XML.

I.e. : as a binding operator that can bind to a list of name=value pairs

Or non-XML, if that's what tickles your fancy.  JSON?:
{ "n1": {"nA": "vA", "nB": "vB", "nC": "vC" } }
(BTW, if I ever declare that I am doing pseudo-JSON, it will be to have fewer quotes.)

{ n1: {nA: vA, nB: vB, nC: vC } }

The biggest difference between shell script languages and regular languages is quoting. "Barewords" are a big characteristic of scripting languages.

:s can bind until the next colon - i.e., all colons have the same strength
n1: nA=vA, nB=vB, n2: v2, n3: nX=vX, nY=vY
{ n1: { nA:vA, nB:vB}, n2:v2, n3: {nX:vX, nY: vY} }
To get more levels deep, we would need brackets, or indentation, or more operators: ::, ==.
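A sketch of that same-strength-colons rule (the parsing details, such as splitting on commas, are my assumptions):

```python
def parse_colon_groups(s):
    """All colons bind with the same strength: each 'name:' opens a
    group that runs until the next 'name:' (or the end)."""
    result, current = {}, None
    for item in s.split(","):
        item = item.strip()
        if ":" in item:                      # "name: value" or "name: k=v"
            name, rest = item.split(":", 1)
            rest = rest.strip()
            if "=" in rest:                  # a group of k=v pairs begins
                k, v = rest.split("=", 1)
                current = {k.strip(): v.strip()}
                result[name.strip()] = current
            else:                            # a plain name: value binding
                result[name.strip()] = rest
                current = None
        elif "=" in item and current is not None:
            k, v = item.split("=", 1)        # continue the open group
            current[k.strip()] = v.strip()
    return result

parse_colon_groups("n1: nA=vA, nB=vB, n2: v2, n3: nX=vX, nY=vY")
# → {'n1': {'nA': 'vA', 'nB': 'vB'}, 'n2': 'v2',
#    'n3': {'nX': 'vX', 'nY': 'vY'}}
```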

Heck, I'm not trying to define a full language here.  There are at least 3 perfectly good data languages - XML, JSON, S-expressions.

I just want to define some notation that is a bit more user friendly, but can fall back to the other notations if necessary.

Saturday, February 04, 2012

Recognizing errors in build scripts

A very important thing, when writing a Makefile or some other form of build recipe, is recognizing when an error has occurred that might prevent you going on to the next step (of the build DAG).

This is easy when commands are well behaved.  When they indicate errors tidily. (E.g. in the UNIX convention, 0 for success, nonzero for error.)
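For a well-behaved tool, checking the exit status is all a build step needs - a sketch using the standard POSIX true/false utilities:

```python
import subprocess

# UNIX convention: exit status 0 means success, nonzero means failure.
ok = subprocess.run(["true"])
bad = subprocess.run(["false"])
assert ok.returncode == 0
assert bad.returncode != 0
```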

It is less easy for tools that are not so well behaved.  When the tool regularly produces errors, both in the UNIX exit code, and in whatever output it sends to stdout or stderr.  When past users seem to have nearly always been running the tool by hand, and there is folklore about what errors can be ignored, and which errors matter.

Better if the folklore is recorded, e.g. on a webpage or wiki. Bad if the folklore is passed by word of mouth.  Worse, if no single person knows all the folklore.

Not so bad if the tool produces a well characterized set of outputs, e.g. output files, and if the file only exists if it was correctly built.  Bad if the output file can be created, but malformed, not completed, etc.

I am tempted to say "BKM: make all output go to output.tmp, and then mv to output.final when okay."  But in this same effort I have found several examples of where the output file is perfectly fine, but where an error occurred later in the tool producing it.
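The write-to-temp-then-rename BKM, as a sketch (and, as noted, it still doesn't help when the tool errors out after producing a plausible-looking file):

```python
import os

def write_atomically(path, data):
    """Write to a temp name, then rename to the final name only if
    the write succeeded, so a half-written output file never looks
    like a complete one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
    os.replace(tmp, path)   # atomic rename on POSIX
```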

Not so bad if the output file contains error markers, such as "When processing this record, there was an error".  No matter if this occurs at the very end, or in the middle.  Bad if the output file contains no error markers, so you are left wondering, in a partially created output file, why it stopped.  Really bad if the output file format ignores missing data, such as assuming it is all zeroes.  really, Really, REALLY bad.  I wasted lots of time on that, umm, "feature".

Here are some things that I have done to cope with such error-ful output:

First, wrapperize the tools that produce unreliable error indications.

Second, in the wrapper or in the build tool, place code to look at the output. Let this "validation" code decide if it is safe to go on to the next step or not.

At first, perhaps unconditionally go on to the next step, since any particular error may be ignorable.

But as you debug stuff, start recognizing error conditions.  When you are confident that an error is non-ignorable, have the wrapper return an error.  When you are confident that the error is ignorable, or that there was no error, have the wrapper return success.

Print messages when you are in-between.  Probably continue anyway, but at least flag when your validation code - typically pattern matching on error messages - thinks that something suspicious has happened, but is not sure.

Be schizophrenic.  Don't try to give a single answer.  Have your validation code report things like:

GOOD: I saw a passed message
BAD!!: the log file was much smaller than it has ever been on a successful run 
SUSPICIOUS?: many compilation errors, but they were ignored in the past.
BAD!!: the expected output files were not created
Use any heuristic that you can imagine, that you are willing to take the time to code.

Probably most important: save examples of good and bad runs, so that you can start looking for patterns.
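A sketch of such a schizophrenic validator (every pattern and threshold here is a made-up example, not taken from any real tool):

```python
import re

def validate_log(log_text, min_lines=50):
    """Classify a tool run as GOOD / BAD / SUSPICIOUS from its log,
    using heuristics instead of an unreliable exit code.  Returns
    all verdicts rather than forcing a single answer."""
    verdicts = []
    if re.search(r"\bpassed\b", log_text, re.IGNORECASE):
        verdicts.append(("GOOD", "saw a 'passed' message"))
    if len(log_text.splitlines()) < min_lines:
        verdicts.append(("BAD", "log far smaller than any successful run"))
    if re.search(r"\berror\b", log_text, re.IGNORECASE):
        verdicts.append(("SUSPICIOUS", "errors seen, sometimes ignorable"))
    return verdicts or [("SUSPICIOUS", "no known patterns matched")]
```

Start permissive - everything SUSPICIOUS still continues the build - and promote patterns to hard GOOD or BAD as you gain confidence in them.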

A good example of how make is needlessly repetitive is how I have been building makefile lines today:

I copy a command line that I pasted into the shell, e.g.

command < file1 -other_input file2 > file3

I paste it into my makefile, twice

command < file1 -other_input file2 > file3
command < file1 -other_input file2 > file3

and then edit

.PHONY: verb-to-run-command
verb-to-run-command: file3
file3: command file1 file2
        command < file1 -other_input file2 > file3
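That copy-paste-twice dance could be automated by generating the rule from the command written once - a sketch (the dependency list is still supplied by hand, not inferred):

```python
def make_rule(verb, command, deps, output):
    """Emit a make rule (phony verb target plus file rule) from a
    single copy of the command line."""
    return (f".PHONY: {verb}\n"
            f"{verb}: {output}\n"
            f"{output}: {' '.join(deps)}\n"
            f"\t{command}\n")

print(make_rule("verb-to-run-command",
                "command < file1 -other_input file2 > file3",
                ["command", "file1", "file2"],
                "file3"))
```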

Wednesday, February 01, 2012

Using namespace - fixing the problem

We just had the "using namespace std, versus std:: all over the place" conversation at work.  The usual: "using namespace" can lead to hard-to-find bugs.

Behavior can change when a new "using namespace" line imports functions, methods, and operators, possibly leading to an imported name being bound instead of a preexisting function, because of the prioritization implicit in overloading / polymorphism.

(Some of the toughest bugs to find in this area have involved templates. Templates in library Lib1 using a naked name such as foo(T1,T2), that are used in Lib2, and whose naked name foo gets bound to something in Lib3... up until the time Lib4 is imported via "using namespace", at which time it gets changed.)

I spouted the standard reason not to allow "using namespace".  But I hate myself for doing so, since I find code that is cluttered up by std:: ugly to read.

So of course I try to think about what the real problem is.

The real problem is that some names, functions, operators, methods match multiple prototypes in different libraries.  And the match chosen may change in major ways with minor edits (or, possibly even worse, in minor ways with minor edits).

So, let's look at these one at a time:

First, some names, functions, operators, methods match multiple prototypes in different libraries.

(1) Surely the compiler could compare namespaces, and warn the user about names that might possibly conflict - that might possibly be coerced to different instances in different modules/libraries/namespaces. (Heck, I think that such an ability might be useful even inside the same library/module/namespace.)

(2) Specifically, perhaps a given use of a name that is remapped via overloading/polymorphism could warn if there was more than one possible mapping?  Especially if in separate libraries/modules/namespaces?

Second, the match may change:

(3) Perhaps the editing/build system could record which overloading is applied at every point in the source code.  And warn when it changes.

Like so much source annotation, this would have to be resilient across edits.  But, that's what patch, and so many VCS, do.  Heck, it would even be useful if only inexactly resilient across edits - if there was just a table of source module/name -> dest module/instance mappings, and that was diffed.