Krazy Glew's Blog: Thursday, August 02, 2012

Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Thursday, August 02, 2012

Lack of keyword parameters a big cause of programming .... errors, bugs, inefficiency

Sometimes the title says it.

I wonder how much inefficiency has been introduced into programming, how many bugs caused, because positional parameters became the default standard for programming languages long ago.

If not actual bugs, because two parameters are flipped in a way that the type system does not detect, then how about inefficiency? "I can't remember the order of the arguments, so I must look them up..."

Microsoft's IntelliSense helps ... but I'm usually an emacs user, and don't have that. IntelliSense has almost been enough to make me switch.

---

Bjarne says that keyword parameters almost made it into a C++ standard, and lost only because of legacy C issues. He suggests parameter objects. Cool. So, I want to automatically create a parameter object for each and every function.

Hmm.... I think that I could write a preprocessor that did that, munging the symbol table.

Something like

ParameterObject.NamedParameter1()...

Unfortunately, parameter names do not get into the symbol table.

Chaining, or Why You Should Stop Returning Void - Zero Wind :: Jamie Wong

Bjarne Stroustrup's recommendation on how to do keywords in C++:

FunctionFooParameterObject()
.keyword_parameter1(val1)
.keyword_parameter3(val3)
.keyword_parameter5(val5)
.call_it()

Works.

Only real hassle is that it is a lot of dang code to write.
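Python, unlike C++, does keep parameter names available at run time, so the "automatically create a parameter object for every function" wish can at least be sketched there. The following is a minimal illustration, not a C++ solution: the class name ParameterObject and the example function copy_block are made up for this sketch, and the parameter names come from the standard inspect.signature call.

```python
import inspect


class ParameterObject:
    """Fluent builder that collects named arguments for one function."""

    def __init__(self, func):
        self._func = func
        # Parameter names are recoverable at run time in Python.
        self._names = set(inspect.signature(func).parameters)
        self._args = {}

    def __getattr__(self, name):
        # Reached only for attributes not found normally, i.e. parameter names.
        if name.startswith("_") or name not in self._names:
            raise AttributeError(name)

        def setter(value):
            self._args[name] = value
            return self  # returning self is what makes chaining possible

        return setter

    def call_it(self):
        return self._func(**self._args)


# Hypothetical function whose positional arguments are easy to swap.
def copy_block(dst, src, count=1):
    return f"copy {count} from {src} to {dst}"


# The order of the chained calls no longer matters.
result = ParameterObject(copy_block).src("a").dst("b").call_it()
```

The builder is generated from the function itself, so there is no per-function code to write; the cost is that misspelled parameter names are caught only at run time.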

Google "cloud" tool shortcomings

Are Google docs and blogger cloud? Whatever.

Google docs:

unintelligible URLs
not clear how indexed they are - I do know that docs that I have "cross posted" between my wiki (mediawiki) and Google docs have been indexed, by Google, on the wiki, but not on Google docs.

"It doesn't exist if you can't google it"

Blogger:

no formatting in comments
can't link to a particular comment

? permalinks ?

Jamie Wong's An Argument for Mutable Local History

http://jamie-wong.com/2012/05/25/an-argument-for-mutable-local-history/

Argues for mutable local history, and hence rebase.

I think all of these arguments apply also to global history, at whatever level.

But I want immutability as much as possible.

Again, I conclude that it is not the absolute history we want to mutate. It is our VIEW of the history.

(Although there are valid reasons to really want to edit history, e.g. to drop stuff that you are not legally allowed to use.)

Futures make sets safe

Set notation is very convenient.

E.g.

index_set = { 1, 2, 3}
array[ index_set ] = 0

rather than a loop

But a problem with set notation is that intermediate sets can be very large.

E.g.

index_set = 1, 2, 3 ...
array[index_set] = random()
array = array[i] where i < 5

rearranging

index_set = 1, 2, 3 ...
small_index_set = i IN index_set, where i < 5
array[small_index_set] = random()

If the array[index_set] = assignment makes things real, i.e. actually materializes every element of the unbounded index set, the first form is impractical. Whereas array[small_index_set] = is finite sized.

Futures make this sort of manipulation feasible.
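A rough sketch of the rearranged form, using Python generators as a stand-in for futures (lazy evaluation rather than true futures, so this is only an approximation of the idea). The unbounded index set is never materialized; only the four indices below 5 are ever produced. The dict used as a sparse array is an assumption of this sketch.

```python
import itertools
import random

# Unbounded index set 1, 2, 3, ...: a lazy iterator, nothing is materialized yet.
index_set = itertools.count(1)

# Restrict to i < 5. takewhile stops at the first failing element, which is
# valid here only because count() yields increasing values.
small_index_set = itertools.takewhile(lambda i: i < 5, index_set)

array = {}  # sparse "array" keyed by index
for i in small_index_set:
    array[i] = random.random()
```

Writing the eager version, list(itertools.count(1)), would never terminate, which is the impracticality described above.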

Pattern: Ordered Lists, with Associative Lookup and Update

One of my friends (TC) complained that XML had too much order. He said that XML structs should be unordered, like Perl hashes, associative arrays. Indeed, JSON has unordered key=value pairs, and ordered lists: the former is not the latter.

I'm not so sure. I keep running into situations where I want to have an ordered data structure, but where I also want an associative lookup and/or update.

For example, my shell PATH. It is definitely order dependent. But I often want to do operations such as

is DIR already in PATH
add DIR to PATH if it is not already present

prepend or append

add DIR to PATH next to DIR_already_in_path

just before, or just after

replace DIR0 in PATH (assuming it is already there) with DIR1
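The PATH operations listed above can be sketched on a plain list of directories. This is only an illustration: the function names and the example directories are invented here, and splitting on ":" assumes a UNIX-style PATH.

```python
def add_if_absent(dirs, d, prepend=False):
    """Add d unless it is already present; prepend or append."""
    if d in dirs:
        return list(dirs)
    return [d] + list(dirs) if prepend else list(dirs) + [d]


def add_next_to(dirs, d, anchor, before=False):
    """Insert d just before or just after an existing entry."""
    i = dirs.index(anchor)  # raises ValueError if anchor is not in the list
    pos = i if before else i + 1
    return dirs[:pos] + [d] + dirs[pos:]


def replace_dir(dirs, old, new):
    """Replace every occurrence of old with new, keeping the order."""
    return [new if x == old else x for x in dirs]


path = "/usr/bin:/bin".split(":")
present = "/bin" in path  # "is DIR already in PATH"
path = add_if_absent(path, "/usr/local/bin", prepend=True)
path = add_next_to(path, "/opt/bin", "/usr/bin")
path = replace_dir(path, "/bin", "/sbin")
new_path = ":".join(path)
```

Each operation is an associative lookup (find the entry) followed by an order-preserving update, which is the combination the pattern asks for.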

For example, hook lists - like hook lists for emacs.

For example, like Bash PROMPT_COMMAND. This really isn't a hook list, but imagine that it is, using ; as the separator (gets more complicated with IFs, etc.). Here I have been wanting a more general associative lookup, not just exact matching.

E.g. replace any element that looks like a command that invokes my logical-path program, i.e. anything matching .*logical-path.*, with a different invocation of that program

Just as associative arrays (hashes) have greatly improved programmer productivity by being built-in data structures in Perl and Python (and to a weaker extent in C++ STL std::map), it may be worth thinking about standardizing ordered associative arrays. Using Perl associative array syntax (not my favorite, but it's easy at hand):

$OAA{key} # error if more than 1 match?
$OAA{key:1} $OAA{key:2} ... # first match, second match, ...
$OAA{key:#} # number of matches
$OAA{key} = value # replace. error if no match
$OAA{key} = {} # delete - or some other way of indicating droppage
$OAA{key} <<= value # insert just before
$OAA{key} >>= value # insert just after
$OAA{key:*} = ... # replace all occurrences

We might distinguish key from value, as in key=value. Or we might allow the key and the value to be the same, i.e. the entire value might be matched against.

We might require key to be exactly matched - or we might allow regexps, or arbitrary expressions.

$OAA{i WHERE $OAA{i} < 2} = ... # replace all occurrences
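One possible reading of this notation, sketched as a small Python class with regexp matching. The class name OrderedAssoc, its method names, and the example hook commands are all invented for illustration; the comments map each method to the Perl-like line it approximates.

```python
import re


class OrderedAssoc:
    """Ordered list of items with lookup by regular expression."""

    def __init__(self, items):
        self.items = list(items)

    def matches(self, pattern):
        """Indices of all items matching the regex, in list order."""
        rx = re.compile(pattern)
        return [i for i, item in enumerate(self.items) if rx.search(item)]

    def _only(self, pattern):
        hits = self.matches(pattern)
        if len(hits) != 1:  # error if zero or more than one match
            raise KeyError(f"{pattern!r} matched {len(hits)} items, expected 1")
        return hits[0]

    def replace(self, pattern, value):        # $OAA{key} = value
        self.items[self._only(pattern)] = value

    def delete(self, pattern):                # $OAA{key} = {}
        del self.items[self._only(pattern)]

    def insert_before(self, pattern, value):  # $OAA{key} <<= value
        self.items.insert(self._only(pattern), value)

    def insert_after(self, pattern, value):   # $OAA{key} >>= value
        self.items.insert(self._only(pattern) + 1, value)

    def replace_all(self, pattern, value):    # $OAA{key:*} = ...
        for i in self.matches(pattern):
            self.items[i] = value


# Hypothetical hook list; the flags shown for logical-path are made up.
hooks = OrderedAssoc(["history -a", "logical-path --old", "update-title"])
hooks.replace(r"logical-path", "logical-path --new")
hooks.insert_before(r"^update-title$", "check-mail")
count = len(hooks.matches(r"logical-path"))   # $OAA{key:#}
```

Here the whole element is matched against, rather than a separate key; a key=value variant would match on the key part only.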

We might allow the value to have structure, with fields.

$OAA{rec WHERE $OAA{rec}.age > 10 and $OAA{rec}.weight < 70}
$OAA{.age > 10 and .weight < 70}

Nothing that you can't do with map here - but the notation may be more pleasant. And compact, pleasant notation matters. (As opposed to compact unpleasant notation.)

Verges on relations.

---

Extend to multi dimensional?

Quoting is the source of much evil (or the lack thereof)

Quotification, or the lack thereof, is the source of much evil. SQL injection type errors, etc.

I've written already about a modest proposal, to apply a simple taint bit to bytes saying "This is data, not language syntax."

I hesitate to expose my own bugs, but what the heck:

Today I just had a bug that looked like

bash: syntax error near unexpected token `('

terminating .bashrc too early, before all path setup was done.

as I tried to make my UNIX and cygwin environments converge.

Bug was insufficient quoting in

eval $pathvar=$elem:"$pathval"

in some code manipulating the PATH. Had problems with filenames containing special characters like space and ()s. Such filenames are relatively uncommon on *IX, but common on Windows, and Cygwin is both.

Fixed by adding quotes

eval $pathvar="'$elem'":"'$pathval'"

At least I hope it's fixed. What if pathvar has a special character? Nah...

Part of the trouble with quotification is that you have to get it just right. Too many quotes are bad, just like too few.
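Getting it "just right" is easier to delegate than to hand-write. As a hedged illustration in Python rather than bash: the standard shlex.quote function quotes a value so that a POSIX shell reads it back as a single word, including values that themselves contain a single quote, which plain '...' wrapping does not survive. The helper name path_assignment and the example directory are invented; the name check also addresses the "what if pathvar has a special character?" worry.

```python
import re
import shlex

# A conservative pattern for shell variable names.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def path_assignment(pathvar, elem, pathval):
    """Return an assignment line whose value is quoted for a POSIX shell."""
    if not _NAME.fullmatch(pathvar):
        raise ValueError(f"not a valid shell variable name: {pathvar!r}")
    # Quote the whole value once, after joining, instead of piece by piece.
    return f"{pathvar}={shlex.quote(elem + ':' + pathval)}"


# Hypothetical directory with a space and parentheses, as is common on Windows.
line = path_assignment("PATH", "/cygdrive/c/Program Files (x86)/Tool", "/usr/bin:/bin")
```

The quoting is applied exactly once, at the point where data becomes shell text, which is the single level of eval this handles; nested evals would need one quoting pass per level.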

I might like to do

eval SAFE(SAFE(SAFE($pathvar)))=SAFE(SAFE(SAFE($elem))):SAFE(SAFE(SAFE($pathval)))

which might really be considered

eval $pathvar=$elem:$pathval

where I have used green text background to indicate stuff that should have no syntax when eval'ed, and red to indicate syntax.

This works okay if the eval is immediate. But if the results of eval are themselves eval'ed an unknown number of times, a simple bit is not sufficient. Nesting...

---

khb commented:

So to be clear, you suggest that we have (for want of a better term) a bit to indicate that something should be treated as data? Was that the sort of thing the near legendary Intel 432 did?

I'm not saying it's a bad idea, I'm just trying to map it into historical perspective ;>

Responding inline, because one of blogger's weaknesses is no formatting in comments. (Probably deliberate, to prevent complex conversations.)

I do want metadata associated with data

This might be a tag bit per memory location (but it doesn't have to be)

actually, I can't remember if the iAPX 432 had such tag bits. It may not have needed them, because it treated everything as object oriented
one version of the i960, and the IBM AS/400, had such tag bits. However, they used the tag bit for privilege - what they did would not have helped directly for SQL injection type attacks.

But it doesn't need to be a tag bit. Much discussion on

It doesn't even need to be hardware. Although I may sometimes discuss HW/ucode implementations, this particular issue, SQL injection and other non-machine code, eval, attacks, can be addressed completely in software.

E.g. have the language parser just operate on wide bytes, 16 bit or wider. Wider than you use for normal language input.
Reserve the upper bits for these "tags".
Have the compiler refuse to accept a semicolon-with-non-syntactic bit attached as language syntax. Similarly for keywords.
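A toy version of the software-only scheme just described, assuming one tag bit placed above the 21 bits that Unicode code points need. The names DATA_TAG, as_syntax, as_data and split_statements are invented for this sketch, and the "parser" is only a statement splitter, not a real SQL parser.

```python
DATA_TAG = 1 << 24  # hypothetical tag bit, above any Unicode code point


def as_syntax(text):
    """Cells that the parser may interpret as language syntax."""
    return [ord(c) for c in text]


def as_data(text):
    """Cells marked 'this is data, not language syntax'."""
    return [ord(c) | DATA_TAG for c in text]


def split_statements(cells):
    """Split on ';' only where the semicolon is untagged syntax."""
    statements, current = [], []
    for cell in cells:
        if cell == ord(";"):  # a tagged ';' has DATA_TAG set, so it never matches
            statements.append(current)
            current = []
        else:
            current.append(cell)
    statements.append(current)
    # Strip the tag bit again when turning cells back into text.
    return ["".join(chr(c & ~DATA_TAG) for c in s) for s in statements]


attack = "x'; DROP TABLE t; --"
tagged = as_syntax("SELECT * FROM t WHERE name = '") + as_data(attack) + as_syntax("'")
untagged = as_syntax("SELECT * FROM t WHERE name = '" + attack + "'")

safe_count = len(split_statements(tagged))      # injected ';' stays data
unsafe_count = len(split_statements(untagged))  # injected ';' splits statements
```

A real parser would apply the same test to quotes and keywords as well as semicolons; the point is only that the check needs no hardware support.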

Yes, it's tags. But not necessarily HW tags like some old AI machines have. SW tags are good enough. And, in this case, only in strings that you are going to give to a language parser, compiler, evaluator. Basically, use wchar, wide char, 16 bit or wider characters, where some of the

Heck: I edit in gnu emacs, where every character in the buffer can have fairly arbitrary attributes - like 'keyword, 'variable-name, etc. Why not make use of such attributes in the character set that the language parser accepts as input? Especially if it can reduce security bugs.

Note: we might not yet be ready to jump to semantically enforced tags - where keywords like IF THEN ELSE would be *required* to have the 'keyword tag on them. We probably still need to be able to accept untagged ASCII characters as input. But, we might be ready for semantic hints - tagging something 'definitely-not-a-keyword. If the parser tries to take a character tagged 'definitely-not-a-keyword as part of a keyword, that could be an error. (As opposed to making it part of an identifier.)

abort: filename contains ':', which is reserved on Windows: ...

The usual: record one of my stupidities, in the hope that it will save someone else some time in the future - perhaps amnesiac me:

Just spent longer than I should have wondering why I was getting

abort: filename contains ':', which is reserved on Windows: ...

errors - when all I was trying to do was hg update -r default, back to a version that I had been using just yesterday.

This on cygwin, on windows.

Turns out that my current revision had a broken ~/.bashrc, so instead of using cygwin mercurial, /bin/hg, I was using Windows Mercurial, /cygdrive/c/Program Files/TortoiseHg/hg

And the Windows Mercurial enforces Windows naming conventions.

Fix: just say /bin/hg ...
