Krazy Glew's Blog: 06/01/2012

Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Friday, June 29, 2012

Postfix if with else

Today I found myself writing

do A if possible, else do B if possible, else error

English that is.

But it caused me to think about Perl's Postfix if

statement if condition;
versus
if( condition ) {code} [[elsif(cond) {code}]* else {code}]
i.e.

if( condition ) {code}
if( condition ) {code} else {code}
if( condition ) {code} elsif(cond) {code} else {code}
etc.

Strange to many programming languages, but sometimes makes code more readable.

I wonder if the postfix if with else, that I wrote in English, might be worth thinking about.

Syntactically it is a ternary operator:
operand1 IF operand2 ELSE operand3

I vaguely recall disjoint operator and function syntax - Algol 68? or was it Algol 60? - ...
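
For comparison, Python's conditional expression has this shape, `value1 if condition else value2`, and it chains to the right, so the English sentence above maps onto it almost word for word. A minimal sketch; `choose_action`, `fail`, and the parameter names are made up for illustration:

```python
def fail(message):
    """Helper so that 'else error' can appear as an expression."""
    raise RuntimeError(message)


def choose_action(a_possible, b_possible):
    # Reads as: A if possible, else B if possible, else error.
    return ("A" if a_possible
            else "B" if b_possible
            else fail("neither A nor B is possible"))
```

The helper is needed because `raise` is a statement in Python, so the final "else error" arm has to be wrapped in a function call.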

Reading email in more than one place

Andy Glew

1:55 PM - Public

I read email in more than one place, on more than one device, in more than one situation (phone (actually, PDA, but that's a story in itself), tablet, tablet PC by itself, tablet PC docked with 4 big monitors).

I need more dispositions than Archive or Delete.

I need "I can't read this now/here/on this device".

I suppose that it might be useful to be able to say where I could read it - e.g. "defer reading until I'm docked at my desk with 4 external monitors".

But "I can't read it {in this situation}" is a simple button. It could probably fit on my phone's screen. Whereas "Defer to another situation" is a multi-valued control, and harder to fit.

Note that "I can't read it in this situation" is NOT the same as "Later". It may be a superset. But being re-presented an email to read, later, on the same device that it was too painful to read on in the first place doesn't help.

--

Possibility of learning: record where deferred, and where finally read. Infer. Next time the user presses the "I can't read this {in this situation}" button, you might prompt "Read it at home?"
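
A minimal sketch of that learning step, assuming each email produces one (where deferred, where finally read) pair. The class name, method names, and situation labels are hypothetical, and simple counting stands in for whatever inference a real mail client would use:

```python
from collections import Counter, defaultdict


class DeferralLog:
    """Remember where deferred mail was finally read, and suggest from that."""

    def __init__(self):
        # situation where deferred -> counts of situations where finally read
        self._read_after_defer = defaultdict(Counter)

    def record(self, deferred_in, read_in):
        self._read_after_defer[deferred_in][read_in] += 1

    def suggest(self, deferred_in):
        """Return the most frequent read situation, or None if nothing is known."""
        counts = self._read_after_defer.get(deferred_in)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

After two deferrals from "phone" that were read at "home" and one read on "tablet", `suggest("phone")` returns "home", which is what the prompt "Read it at home?" would be built from.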

--

As usual, if already available, I'd like to hear about it.

Error Handling

I sometimes say "the most important part of programming is error handling".

This is not true. Shown by how many programs, how many library routines, how much code, does not handle errors at all. Successful programs, libraries, code.

How about "one of the hardest parts of programming is error handling"? That error handling is hard is one of the reasons it is so often neglected.

And "one of the biggest barriers to reuse is error handling".

In at least two senses:

(1) You pick up a tool or library, try to use it on what you believe to be correct input - and it craps out with something like a Python stack dump. Rather than debug, you try something else. Whereas, if an error message had explained what the problem was - if the code had checked format, rather than assuming - then you might have fixed the input.

(2) In a much narrower sense, the error handling conventions a code assumes may get in the way. E.g. I assume that you throw an exception, you assume that it returns 0. Or -1. If throwing an exception, what type of exception object do I throw? (I have given up, and just throw strings.)
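
A minimal sketch of point (1) in Python: check the format and say what is wrong, rather than assuming and dying with a stack dump. The `name=value` input format, `parse_settings`, and `InputFormatError` are all invented for illustration. On point (2), note that Python 3 refuses to raise a plain string at all (it is a TypeError), so a small exception class like this one is about the least you can get away with:

```python
class InputFormatError(ValueError):
    """Raised when an input line is not of the form 'name=value'."""


def parse_settings(lines):
    """Parse 'name=value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep or not name.strip():
            # Tell the user which line is bad and what was expected.
            raise InputFormatError(
                f"line {lineno}: expected 'name=value', got {line!r}")
        settings[name.strip()] = value.strip()
    return settings
```

Deriving from `ValueError` means callers that already catch `ValueError` keep working, while callers that care can catch the narrower type.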

Can you tell I am debugging Python today? I'm not 100% sure, but the Python code I have encountered has made more such assumptions...

Thursday, June 28, 2012

The Joel Test's "best tools money can buy" and free software

I like the Joel Test:

http://www.joelonsoftware.com/articles/fog0000000043.html

1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?

But some of the questions have set me to musing.

E.g. question 9: Do you use the best tools money can buy?

One might interpret this as saying that it is unreasonable to use free or Open Source tools, if there are tools that can be bought that are better.

But: better is arguable. The fact that something is free, or, better, Open Source, greatly reduces many hassles. E.g. "We can't have your work on XYZZY because that would need a new license."

(By the way: SW can be free but not Open Source, or vice versa. Best if it's free and Open Source.)

Joel's examples talk mainly about being willing to spend money on fast computers for builds and tests, extra monitors, and disk space wars, mentioning a software tool, a bitmap editor, only as an aside.

But let's talk about SW tools, free and Open Source.

I think here the proper way to consider this question is "Do you have the latest and greatest versions of the free and Open Source tools you are using installed?"

Or rather, since I am not an advocate of gratuitous feature creep: are you ever held back by lack of a feature in the version of free and Open Source tools you have installed --- that is fixed in more recent, stable, versions that will probably run on your machines?

One of the nice things about virtual machines is that more and more you can have N different distros running, so you can pick which tool versions you need.

Wednesday, June 27, 2012

ISO phone camera app to recognize pills

Historically I've only ever needed to take 1 pill, an antihistamine for my allergies, circa once a day. Ah, youth!

Unfortunately, now I need to take more than one pill, more than once a day. (I'm an overweight guy over 50, with a history of allergies - you guess at what my medical conditions are.) Enough that I want to carry the pills that I will use in the next day or so around in a pill carrier in my pocket. And I notice that it can be hard to tell the pills apart.

I started off with an old film case, with all the pills in one place. But this became confusing: which pill is which?

I just noticed that there are several different designs, some with compartments by day of the week.

Hmm... I suppose that keeping the pills physically separate may reduce contamination - medication 1 rubbing off on medication 2.

I am sure that there will be some pill carriers that have little codes of letters on the boxes that remind you which pill is which. But those make the box bigger - and you have to remember which code belongs to which pill.

I tried little slips of paper, or little envelopes. Haven't found any tiny enough.

So, thinking about this today, I decided to take photographs of the different pills. So far, all of them have been different enough in color and shape and size and markings that I can tell them apart - so long as I remember what they look like. Or so long as I have a photograph to remind me.

Which causes me to wonder if there is yet a "pill recognizer" app? An app that can run, on Google Goggles or whatever, and tell you what pills it is looking at? May disambiguate based on what pills it knows you are taking.

---

Who would have thought that there could be technology involved in taking pills?

Different varieties and purposes of pillboxes.

Photographs, pill recognizer apps.

Heck: just realizing that I needed a pill carrier in my pocket, and a "health supplies" bag - was a learning experience. I can't just keep all of this stuff on a shelf in my medicine cabinet - especially since I need to move it around if I travel. Bags are not sophisticated technology, but they are ... (By the way, I am looking for a bag that is better suited to carrying bottles and vials. Currently I'm using an old bookbag from a conference.)

---

Here's another: pill dippers. Take some pills that you have received, that are easily confused with other pills that you already have. And dip them in some sort of uniquifying color coating, that dissolves in all the appropriate ways in your mouth, esophagus, or stomach.

---

I have often heard that doctors complain about patient non-compliance: patients who don't take their prescriptions.

There is the issue of cost - many folks in the USA may only be able to afford to take a prescription once a week or as needed, when the doctor prescribes it to be taken daily or more often (fortunately I have good health insurance, so that does not apply to me, at this time).

... I wonder if compliance is better in countries with socialized medicine? ...

Anyway, apart from the issue of cost, this pill taking is just plain confusing:

some have to be taken with food.
some without
some at night
some in the morning

Let alone when your doctor suggests that you take something in a way not described on the Rx label.

Monday, June 25, 2012

abspath and simple diff tests

I like tests.

Lacking a full-fledged test infrastructure, simply diffing output files is often sufficient.

Except for stuff like

a) using absolute paths related to where the source tree is

b) other time and position variant stuff.

I'd almost like a diff

1) that was smart enough to recognize and skip timestamps

2) that was smart enough to recognize absolute/relative file patterns.

I have added relative/absolute file options to many tools - so often that I suspect it is not the tools' fault, but really should be fixed in the test infrastructure.

I have preprocessed expected output files - but that can be fragile, if the preprocess tool breaks and, e.g., gives an empty string.
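
A minimal sketch of doing the normalization in the test infrastructure instead, with a guard against the empty-string failure mode. The timestamp pattern, the placeholder strings, and the function name are assumptions; a real tool's output would need its own pattern:

```python
import re

# Assumed timestamp shape, e.g. "2012-06-25 13:55:02"; adjust per tool.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def normalize(text, source_root):
    """Replace timestamps and the source-tree root with fixed placeholders."""
    if not text.strip():
        # A broken tool or preprocessor often shows up as empty output;
        # fail loudly instead of letting two empty strings compare equal.
        raise ValueError("empty output: refusing to normalize it")
    if not source_root:
        raise ValueError("source_root must not be empty")
    text = TIMESTAMP.sub("<TIMESTAMP>", text)
    return text.replace(source_root, "<ROOT>")
```

Both the expected and the actual output go through `normalize`, each with its own source root, and the results are compared or diffed as usual.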

Saturday, June 23, 2012

Ubiquitous tracking (and versioning)

Sometimes I wish that the whole bloody filesystem was capable of the sort of tracking that a version control system does.

E.g. today I am looking at some saved output files, that I copied from a path I got in email a while back.

But I forgot to record the provenance. I.e. I forgot to record where I got them from.

It sure would be nice if the filesystem remembered that, a few weeks back, I did

cp -R jffggghggh here

to get this data.

--

Just recording the commands might be useful.

Full versioning may not be necessary.

--

Ultimately: log everything?

Apart from the privacy and big brother implications, the real problem with logging everything is the query system. Just recording all of your commands is NOT very useful: it is hard to query.

Want to record, e.g. that command

cp -R jffggghggh here

executing in directory a/aa/h/g

produced file here/b/h

...

Hmm... this is the sort of thing that the automatic build systems do.
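
A minimal sketch of such a record, keyed so that "where did this file come from?" is easy to ask. It only shows the shape of the log entry and the query; how the list of produced files would actually be captured (a shell wrapper, filesystem hooks, a build tool) is left open, and the function names are hypothetical:

```python
import json


def record(log, command, cwd, produced):
    """Append one provenance entry, stored as a JSON line, to the list `log`."""
    log.append(json.dumps(
        {"command": command, "cwd": cwd, "produced": list(produced)}))


def who_produced(log, path):
    """Return (command, cwd) for every logged command that produced `path`."""
    hits = []
    for line in log:
        entry = json.loads(line)
        if path in entry["produced"]:
            hits.append((entry["command"], entry["cwd"]))
    return hits
```

With the example above recorded, `who_produced(log, "here/b/h")` returns the `cp -R jffggghggh here` command together with the directory it ran in.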

One-liners considered dangerous

It is nice to come up with one-liners. E.g. today I was looking for a one-liner hg post-clone hook.

But... even when you can find a one-liner, oftentimes you give up error checking or good error messages to get it. Or, you might create something fragile, that only works on the input you happened to have seen today, when generalizing it even a little bit is easy, but would make it more than one line.

I wonder if one-liners are a source of security bugs?

Thursday, June 21, 2012

recursive make can break parallel make -j

Recursive make is dangerously stupid. There are better tools, but not everyone has realized.

In a system using recursive make, two rules invoke recursive make for different rules of the same directory.
Unfortunately, these rules are not parallelizable. The nonrecursive makefile knows this, but the separate recursive make invocations do not.

I.e. recursive make can break parallel make -j.

how-can-mercurial-or-any-other-dvcs-recognize-partially-overlapped-histories

http://stackoverflow.com/questions/11146096/how-can-mercurial-or-any-other-dvcs-recognize-partially-overlapped-histories

Q: is there any way in Mercurial to usefully merge two repositories
where the lines of history are similar, but not identical?

E.g. where one repo has coarse grain revisions 0,1,2
and the other has fine grain revisions 0, 0.1, 0.2, 1, 1.1, 1.2, 2,
and come up with a single history?

Rather than a mess of branches and heads, which is what I get when I try using what I know of Mercurial?

Or the even fancier
Repo1: 0, 1, 1.1, 1.2, 2
Repo 2: 0, 0.1, 0.2, 1, 2, 3

Merge: 0, 0.1, 0.2, 1, 1.1, 1.2, 2, 3
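
To pin down the ordering being asked for, here is a minimal sketch that interleaves two lists of version labels, using the labels they share as sync points. It works on plain Python lists and says nothing about how Mercurial could be made to do this; it assumes the shared labels occur in the same relative order in both lists and that no label repeats within a list. The function name is made up:

```python
def merge_histories(line1, line2):
    """Interleave two version lists, syncing on the versions they share."""
    shared = set(line1) & set(line2)
    merged = []
    i = j = 0  # first not-yet-emitted position in line1 and line2
    for version in line1:
        if version not in shared:
            continue
        i_sync = line1.index(version)
        j_sync = line2.index(version)
        # Emit what each line has before the sync point, then the sync point.
        merged += line1[i:i_sync] + line2[j:j_sync] + [version]
        i, j = i_sync + 1, j_sync + 1
    # Whatever follows the last shared version on either line.
    return merged + line1[i:] + line2[j:]
```

For the "fancier" case above, `merge_histories(["0", "1", "1.1", "1.2", "2"], ["0", "0.1", "0.2", "1", "2", "3"])` gives 0, 0.1, 0.2, 1, 1.1, 1.2, 2, 3. When both lines have private versions between the same two sync points, the order between them is arbitrary; this sketch puts line1's first.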

In more detail:

What I want is a merge that can recognize when file contents are the same,
or which can recognize that two lines of history are similar, although not all versions in one line
are in the other,
and give something like:

o=o changesets with same file contents on different historical lines
o | (line1)
| | changeset: 2:2a02e67e7b5d
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| o (line2)
| | changeset: 8:089179dde80a
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
| o changeset: 7:615416921e33
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.2
| |
| o changeset: 6:a43a88065141
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.1
| |
| |
o=o changesets with same file contents on different historical lines
o | (line1)
| | changeset: 1:93cbae111269
| | user: Andy Glew
| | date: Thu Jun 21 12:40:13 2012 -0700
| | summary: 1
| o (line2)
| | changeset: 5:fef4050e0162
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 1
| |
| o changeset: 4:b51fbedc72e5
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.2
| |
| o changeset: 3:45b7f64b2a23
| | parent: 0:c80bc10826be
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.1
| |
| |
|/
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

I can imagine that possibly a merge changeset would be necessary at the o=o points.

But I would like to have it recognized automatically.

Here's an example of how such a history would be created.
Contrived in this example, but something similar is happening to me in real-life,
when a project wants coarse grain commits, but where I want to preserve the fine grain commits
(as well as the coarse grain stuff released to the project).

[glew@mipscs587 ~/hack/hg-granularity] 900$ bash 12:39:54>. ./eg

% set verbose

% mkdir hg-repo
% cd hg-repo
% ./hg-repo
% hg init
% echo 0 > a
% hg add a
% hg ci -m0 a

% cd ..
% hg clone hg-repo fine
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
% hg clone hg-repo coarse
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

% cd fine
./fine
% echo 0.1 > a; hg ci -m0.1
% echo 0.2 > a; hg ci -m0.2
% echo 1 > a; hg ci -m1
% cat a
1
% hg push default
pushing to /home/glew/hack/hg-granularity/hg-repo
searching for changes
adding changesets
adding manifests
adding file changes
added 3 changesets with 3 changes to 1 files
% hg glog
@ changeset: 3:fef4050e0162
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 1
|
o changeset: 2:b51fbedc72e5
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.2
|
o changeset: 1:45b7f64b2a23
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% cd ../coarse
% cp ../fine/a .
% cat a
1
% hg ci -m1
% hg glog
@ changeset: 1:93cbae111269
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% cd ../fine
% echo 1.1 > a; hg ci -m1.1
% echo 1.2 > a; hg ci -m1.2
% echo 2 > a; hg ci -m2
% cat a
2
% hg push default
pushing to /home/glew/hack/hg-granularity/hg-repo
searching for changes
adding changesets
adding manifests
adding file changes
added 3 changesets with 3 changes to 1 files
% hg glog
@ changeset: 6:089179dde80a
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:15 2012 -0700
| summary: 2
|
o changeset: 5:615416921e33
| user: Andy Glew
| date: Thu Jun 21 12:40:14 2012 -0700
| summary: 1.2
|
o changeset: 4:a43a88065141
| user: Andy Glew
| date: Thu Jun 21 12:40:14 2012 -0700
| summary: 1.1
|
o changeset: 3:fef4050e0162
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 1
|
o changeset: 2:b51fbedc72e5
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.2
|
o changeset: 1:45b7f64b2a23
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% cd ../coarse
% cp ../fine/a .
% cat a
2
% hg ci -m2
% hg glog
@ changeset: 2:2a02e67e7b5d
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:15 2012 -0700
| summary: 2
|
o changeset: 1:93cbae111269
| user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

OK, so now I have a fine grain history in the fine repo,
and a coarse grain history in the coarse repo. I would like to merge them.
(Forget that the coarse is a subset of the fine: I can easily contrive examples where they are not).

Simply pushing the coarse grain history gives a warning.
I will push it later,
but first I will try merging in a separate clone.

% hg push default
pushing to /home/glew/hack/hg-granularity/hg-repo
searching for changes
abort: push creates new remote head 2a02e67e7b5d!
(you should pull and merge or use push -f to force)

% cd ..

% hg clone coarse merge-fine-and-coarse
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
% cd merge-fine-and-coarse/
./merge-fine-and-coarse/

% hg glog
@ changeset: 2:2a02e67e7b5d
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:15 2012 -0700
| summary: 2
|
o changeset: 1:93cbae111269
| user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% hg pull ../hg-repo
pulling from ../hg-repo
searching for changes
adding changesets
adding manifests
adding file changes
added 6 changesets with 6 changes to 1 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)

% hg heads
changeset: 8:089179dde80a
tag: tip
user: Andy Glew
date: Thu Jun 21 12:40:15 2012 -0700
summary: 2

changeset: 2:2a02e67e7b5d
user: Andy Glew
date: Thu Jun 21 12:40:15 2012 -0700
summary: 2

Here is the merge.

Notice that the pairs
o changeset: 8:089179dde80a
| @ changeset: 2:2a02e67e7b5d
and
o changeset: 5:fef4050e0162
| o changeset: 1:93cbae111269
have the same file contents,
one from the coarse and the other from the fine repo.
But the Mercurial history graph does not reflect this.

% hg glog
o changeset: 8:089179dde80a
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:15 2012 -0700
| summary: 2
|
o changeset: 7:615416921e33
| user: Andy Glew
| date: Thu Jun 21 12:40:14 2012 -0700
| summary: 1.2
|
o changeset: 6:a43a88065141
| user: Andy Glew
| date: Thu Jun 21 12:40:14 2012 -0700
| summary: 1.1
|
o changeset: 5:fef4050e0162
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 1
|
o changeset: 4:b51fbedc72e5
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.2
|
o changeset: 3:45b7f64b2a23
| parent: 0:c80bc10826be
| user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.1
|
| @ changeset: 2:2a02e67e7b5d
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
| o changeset: 1:93cbae111269
|/ user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% hg diff -r 2a02e67e7b5d -r 089179dde80a

So I'll try a merge

% hg merge -r 8
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)

% hg ci -m'merge of fine and coarse at 2'

Better -
this shows that
o changeset: 8:089179dde80a
| @ changeset: 2:2a02e67e7b5d
are a convergence point,
although an extra dummy changeset was necessary.

But it does not show the commonality between
o changeset: 5:fef4050e0162
| o changeset: 1:93cbae111269

Here's the merged graph

% hg glog
@ changeset: 9:328db8187d31
|\ tag: tip
| | parent: 2:2a02e67e7b5d
| | parent: 8:089179dde80a
| | user: Andy Glew
| | date: Thu Jun 21 12:43:51 2012 -0700
| | summary: merge of fine and coarse at 2
| |
| o changeset: 8:089179dde80a
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
| o changeset: 7:615416921e33
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.2
| |
| o changeset: 6:a43a88065141
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.1
| |
| o changeset: 5:fef4050e0162
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 1
| |
| o changeset: 4:b51fbedc72e5
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.2
| |
| o changeset: 3:45b7f64b2a23
| | parent: 0:c80bc10826be
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.1
| |
o | changeset: 2:2a02e67e7b5d
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
o | changeset: 1:93cbae111269
|/ user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

How about another merge?

% hg update -r 1
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

% hg merge -r 5
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)

% hg ci -m'merge of fine and coarse at 1'
created new head

% hg glog
@ changeset: 10:cca7fec90d3f
|\ tag: tip
| | parent: 1:93cbae111269
| | parent: 5:fef4050e0162
| | user: Andy Glew
| | date: Thu Jun 21 12:45:03 2012 -0700
| | summary: merge of fine and coarse at 1
| |
| | o changeset: 9:328db8187d31
| | |\ parent: 2:2a02e67e7b5d
| | | | parent: 8:089179dde80a
| | | | user: Andy Glew
| | | | date: Thu Jun 21 12:43:51 2012 -0700
| | | | summary: merge of fine and coarse at 2
| | | |
| | | o changeset: 8:089179dde80a
| | | | user: Andy Glew
| | | | date: Thu Jun 21 12:40:15 2012 -0700
| | | | summary: 2
| | | |
| | | o changeset: 7:615416921e33
| | | | user: Andy Glew
| | | | date: Thu Jun 21 12:40:14 2012 -0700
| | | | summary: 1.2
| | | |
| +---o changeset: 6:a43a88065141
| | | user: Andy Glew
| | | date: Thu Jun 21 12:40:14 2012 -0700
| | | summary: 1.1
| | |
| o | changeset: 5:fef4050e0162
| | | user: Andy Glew
| | | date: Thu Jun 21 12:40:12 2012 -0700
| | | summary: 1
| | |
| o | changeset: 4:b51fbedc72e5
| | | user: Andy Glew
| | | date: Thu Jun 21 12:40:12 2012 -0700
| | | summary: 0.2
| | |
| o | changeset: 3:45b7f64b2a23
| | | parent: 0:c80bc10826be
| | | user: Andy Glew
| | | date: Thu Jun 21 12:40:12 2012 -0700
| | | summary: 0.1
| | |
+---o changeset: 2:2a02e67e7b5d
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
o | changeset: 1:93cbae111269
|/ user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

This is not right. It has established a new head, whereas
what we wanted was some way of indicating that
o changeset: 5:fef4050e0162
| o changeset: 1:93cbae111269
are the same.

OK, switch back to the original coarse

% cd ../coarse

% hg push default
pushing to /home/glew/hack/hg-granularity/hg-repo
searching for changes
abort: push creates new remote head 2a02e67e7b5d!
(you should pull and merge or use push -f to force)

% hg push -f default
pushing to /home/glew/hack/hg-granularity/hg-repo
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 1 files (+1 heads)

% hg glog
@ changeset: 2:2a02e67e7b5d
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:15 2012 -0700
| summary: 2
|
o changeset: 1:93cbae111269
| user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% cd ../hg-repo/

% hg glog
o changeset: 8:2a02e67e7b5d
| tag: tip
| user: Andy Glew
| date: Thu Jun 21 12:40:15 2012 -0700
| summary: 2
|
o changeset: 7:93cbae111269
| parent: 0:c80bc10826be
| user: Andy Glew
| date: Thu Jun 21 12:40:13 2012 -0700
| summary: 1
|
| o changeset: 6:089179dde80a
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
| o changeset: 5:615416921e33
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.2
| |
| o changeset: 4:a43a88065141
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.1
| |
| o changeset: 3:fef4050e0162
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 1
| |
| o changeset: 2:b51fbedc72e5
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.2
| |
| o changeset: 1:45b7f64b2a23
|/ user: Andy Glew
| date: Thu Jun 21 12:40:12 2012 -0700
| summary: 0.1
|
@ changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

% echo This is not right
This is not right

What I want is a merge that can recognize when file contents are the same,
or which can recognize that two lines of history are similar, although not all versions in one line
are in the other,
and give something like:

o=o changesets with same file contents on different historical lines
o | (line1)
| | changeset: 2:2a02e67e7b5d
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| o (line2)
| | changeset: 8:089179dde80a
| | user: Andy Glew
| | date: Thu Jun 21 12:40:15 2012 -0700
| | summary: 2
| |
| o changeset: 7:615416921e33
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.2
| |
| o changeset: 6:a43a88065141
| | user: Andy Glew
| | date: Thu Jun 21 12:40:14 2012 -0700
| | summary: 1.1
| |
| |
o=o changesets with same file contents on different historical lines
o | (line1)
| | changeset: 1:93cbae111269
| | user: Andy Glew
| | date: Thu Jun 21 12:40:13 2012 -0700
| | summary: 1
| o (line2)
| | changeset: 5:fef4050e0162
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 1
| |
| o changeset: 4:b51fbedc72e5
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.2
| |
| o changeset: 3:45b7f64b2a23
| | parent: 0:c80bc10826be
| | user: Andy Glew
| | date: Thu Jun 21 12:40:12 2012 -0700
| | summary: 0.1
| |
| |
|/
|
o changeset: 0:c80bc10826be
user: Andy Glew
date: Thu Jun 21 12:40:11 2012 -0700
summary: 0

I can imagine that possibly a merge changeset would be necessary at the o=o points.

But I would like to have it recognized automatically.

Heck, forget "recognized". I would like to have a way that I can recognize it manually, but have it represented in Mercurial.
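
As a way of stating the "recognize" half precisely, here is a minimal sketch that pairs up changesets from two lines of history whose file contents are identical. It operates on (changeset id, file contents) pairs supplied by the caller and does not call Mercurial; getting those pairs out of a repository, and recording the result back into the history graph, are exactly the parts still being asked about. The function names are hypothetical:

```python
import hashlib


def content_key(contents):
    """Hash of the file contents, used to compare changesets cheaply."""
    return hashlib.sha1(contents.encode("utf-8")).hexdigest()


def same_content_pairs(line1, line2):
    """Pair changesets on two lines that have identical file contents.

    Each line is a list of (changeset_id, file_contents) tuples.
    Changesets that are literally the same node on both lines are skipped.
    """
    by_content = {}
    for changeset, contents in line2:
        by_content.setdefault(content_key(contents), changeset)
    pairs = []
    for changeset, contents in line1:
        other = by_content.get(content_key(contents))
        if other is not None and other != changeset:
            pairs.append((changeset, other))
    return pairs
```

Fed the coarse and fine histories from the transcript above, it pairs 93cbae111269 with fef4050e0162 and 2a02e67e7b5d with 089179dde80a, which are the two o=o points in the desired graph.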

Wednesday, June 20, 2012

Knobs should be left to right - why (some) getopts are broken

There are two main styles of command line argument or options parsers - which my subculture calls "knobs".

My preference is to parse strictly left to right.

Another common approach is to parse all the knobs, possibly out of order. Often the last value written for a knob overrides all previous values.

This can cause problems - hence my preference for left to right. Here's an example:

Imagine that you have a "big" knob, -big A or -big B, that has a big effect - e.g. swapping in a table of multiple settings.

And imagine that you have a small knob, -small x or -small y, that changes a subsetting.

E.g.
command -big A => table[0] = A0, table[1] = A1, table[2] = A2...

command -big B => table[0] = B0, table[1] = B1, table[2] = B2...

command -big A -big B
=> the later B overrides the earlier A
=> table[0] = B0, table[1] = B1, table[2]=B2...

so far so good.

But imagine that -small changes only one entry.

command -big A -small x => table[0] = A0, table[1] = x, table[2] = A2...

command -big B -small y => table[0] = B0, table[1] = y, table[2] = B2...

Later -small may override earlier -small

command -big A -small x -small y => table[0] = A0, table[1] = y, table[2] = A2...

But... should a later -big override an earlier -small? I think so:

command -big A -small x -big B
=> My preference
=> table[0] = B0, table[1] = B1, table[2] = B2...

But many getopts parsers instead do:

command -big A -small x -big B
=> if -small is processed after -big
=> table[0] = B0, table[1] = x, table[2] = B2...

or, worse, have -big override even earlier -smalls.
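
A minimal sketch of the strictly left-to-right behavior preferred above. The knob names and table values follow the example; the three-entry table, the choice of entry 1 for -small, and the function name are assumptions for illustration:

```python
# Hypothetical tables selected by the -big knob.
BIG_TABLES = {
    "A": ["A0", "A1", "A2"],
    "B": ["B0", "B1", "B2"],
}


def parse_knobs(argv):
    """Apply knobs strictly left to right and return the resulting table."""
    table = [None, None, None]
    args = iter(argv)
    for knob in args:
        try:
            value = next(args)  # every knob here takes exactly one value
        except StopIteration:
            raise ValueError(f"knob {knob} is missing its value") from None
        if knob == "-big":
            if value not in BIG_TABLES:
                raise ValueError(f"unknown -big setting {value!r}")
            table = list(BIG_TABLES[value])  # replaces every entry
        elif knob == "-small":
            table[1] = value  # changes a single entry
        else:
            raise ValueError(f"unknown knob {knob}")
    return table
```

Because each knob takes effect the moment it is seen, `-big A -small x -big B` ends with the plain B table, while `-big A -small x` keeps the x in entry 1.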

List valued knobs don't help much here. Short-sight. Although I admit that I have occasionally used list valued knobs as a kluge when dealing with a non-left-to-right argument parser.

---

Some knobs have side effects. Some even create other knobs. There needs to be a defined order of evaluation. Left to right is as good as any.

---

The problem with left to right is that it tends to imply that you are building a data structure as you parse the knobs. Which sometimes means that a later knob may cause you to want to destroy a data structure that was being built.

Tuesday, June 19, 2012

Monolithics and Makefiles

I think that one of the reasons why some people write big monolithic programs with too many classes and functions and just plain stuff in the same headers and .c/.cpp files is that they do not have good dynamic Make tools. They get the Makefile working, and then don't touch it, since touching it would require editing by hand.

Multihomed workspaces and repos

I often find myself wanting a "multihomed" workspace or repo:

E.g. a workspace where I push to one or more places regularly.

E.g. I may push to my personal backup or tracking repo.
But occasionally I need to push to a project master.

This becomes complicated when a meta-project is composed of several subrepos - e.g. the main project source, but also source for libraries and tools. Then I want to push the whole meta-project to the meta-project-repo, the main project (which is a subrepo of the meta-project) to the main project master repo, and the tools and libraries sometimes to the meta-project-repo, and sometimes to their respective master repos.

I can do this by hand, if I remember the names.

But I want to create nice default behavior. Like "push to my backup or meta-repo".
And "release all of my changes to their respective providers".

Of course, the respective providers may themselves have flows.

--

I don't see a standard way in hg.

This is partly why I created the CVS.1 CVS.2 ... dirs, long ago.

For now, I am manually recording the multiple homes in files such as meta-repo/subrepo/hg-paths.

Trouble is, I may not be allowed to add stuff like that to meta-repo/subrepo, since that tree belongs to the provider.

So I have had to add the metadata about multihoming elsewhere, e.g.
meta-repo/subrepo/hg-paths--for--subrepo.
(I would call it meta-repo/subrepo/hg-paths:subrepo, except I am warned about Windows not liking colons.)

Minor BKM: -comment_knob

I find it convenient to give certain programs a command line option that is just a placeholder, which ignores its string argument. Just to hold comments.

Especially useful in a test, so you can say

command -comment_knob 'expect failure' -illegal-option ...
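
A minimal sketch of such a knob using Python's argparse, which conventionally spells long options with two dashes but also accepts the single-dash form used here. The `-count` option is a made-up stand-in for the program's real options:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="command")
    # Placeholder knob: its string is accepted and then never looked at.
    # action="append" lets a command line carry several comments.
    parser.add_argument("-comment_knob", dest="comments", action="append",
                        metavar="TEXT",
                        help="ignored; a place to put a comment")
    # Hypothetical real option, only here so there is something to parse.
    parser.add_argument("-count", type=int, default=1)
    return parser
```

The program simply never reads `args.comments`; the comment is there for whoever reads the test's command line.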

Monday, June 18, 2012

Else: what's the difference?

What's the difference

between cascaded IFs

if( cond1 ) {
...
} elsif( cond2 ) {
...
} else {
...
}

and a nested IF in the ELSE part:

if( cond1 ) {
} else {
if( cond2 ) {
...
} else {
...
}
}

?

Not much - except for intent. The latter calls out the ELSE part, the IF cond2 part, emphasizing its separateness. The latter pivots easily to a nested IF in the THEN part. The cascaded IF probably needs to be transformed to the nested IF in the ELSE part, and then pivoted.

Note that the cascaded IF is often used to express a concurrent IF:

if cond1 => ...
:: cond2 => ...
else ...
fi

where the difference is meaningful: in the concurrent IF all of the conditions, cond1 and cond2, are evaluated. Indeed, some argue that there should be no ELSE clause for a concurrent IF.
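
A small Python sketch of that difference, under the assumption that conditions and branch bodies are passed around as zero-argument functions. The helper names are made up for illustration. The cascaded form stops testing at the first true condition; the concurrent form tests every condition before choosing a branch.

```python
def cascaded_if(branches, otherwise):
    """if / elsif / else: test conditions in order, stop at the first true one."""
    for cond, action in branches:
        if cond():
            return action()
    return otherwise()


def concurrent_if(branches, otherwise):
    """Guarded-command style: evaluate every condition, then run one enabled branch."""
    enabled = [action for cond, action in branches if cond()]
    if enabled:
        return enabled[0]()  # any enabled branch would be a legal choice
    return otherwise()


def demo():
    evaluated = []

    def cond(name, value):
        def test():
            evaluated.append(name)  # record that this condition was evaluated
            return value
        return test

    branches = [(cond("cond1", True), lambda: "A"),
                (cond("cond2", True), lambda: "B")]
    first = cascaded_if(branches, lambda: "else")
    seen_by_cascade = list(evaluated)
    evaluated.clear()
    second = concurrent_if(branches, lambda: "else")
    return first, seen_by_cascade, second, list(evaluated)
```

With both conditions true, the cascade never looks at cond2, which matters as soon as conditions have side effects or costs.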

--

I wonder if there could be some useful difference made between cascaded IF and nested IF in the ELSE part.

Disabling default-push in Mercurial

(I may have already posted this, but cannot find it at this time.)

On a few occasions I have embarrassed myself by pushing to the project master repository accidentally, when I meant to push to my repo where I keep all of my history. (My project requires me to collapse fine grain commits to a much smaller number of coarse grain commits, losing history but keeping the log small. I keep meaning to try mq for this.)

Some folks on stackoverflow *almost* describe how to do this.

Although the post has a bug, which I have fixed. I would post the fix to stackoverflow, except it is down right now, badly enough that I am not sure that the URL below is correct:

http://stackoverflow.com/can-you-prevent-default-push-but-allow-pull

Here is my current BKM to disable default-push:

I've embellished the idea of setting paths.default-push in ~/.hgrc, making it a little bit more self documenting and less error-prone - since, as I point out below, setting default-push = . does not always disable pushing.

in ~/.hgrc

[paths]

# my main project master repo

project-master = ...

#DISABLING IMPLICIT PUSH

# to prevent embarrassment from accidentally pushing to the project master repo

# instead of, in my case, a repo that has fine grain commits

# that the rest of the team does not want to see in the project master repo

#default-push = .

# this works mostly, but NOT if you use hg on your home directory

# since '.' in ~/.hgrc seems to be interpreted as -R ~

#default-push = /NONEXISTENT_default-push_--_must_specify_push_target_explicitly

# this works ok, but I can clean up the error message using blanks

# keeping this around because blanks in pathnames confuse many tools

default-push = /'NONEXISTENT default-push -- must specify push target explicitly'

# this amounts to disabling implicit push targets.

Friday, June 08, 2012

The trouble with passwords

Security breakins leading to password exposure have been in the news this week, what with LinkedIn.

My last blog, about how biometrics are only a partial solution, was prompted by this. Actually, prompted by a National Public Radio item on the LinkedIn breakin.

Security is getting mindshare when it hits NPR.

LastPass makes me feel a bit complacent, since I now have big, different, random passwords on nearly all of my sites, and, supposedly, LastPass never actually holds the unencrypted passwords. Supposedly the passwords are encrypted before being sent to LastPass. So, a breakin at LastPass should NOT lead to password leakage, unless the crypto is poor. (On the other hand, LinkedIn apparently hashed its passwords, but did not salt them - so their crypto WAS poor.)

(Actually, I must admit: I only recently started using LastPass. All of my newer sites used LastPass, but I still had some old passwords that were pre-LastPass, human memorable and therefore weaker. None of them were the same as my LinkedIn password, but I changed them anyway. I hope I changed them all... It's quite a challenge to scan your password list looking for weak passwords, given the interfaces. That is something I'd like automated - but then again, so would the bad guys.)

If LastPass' crypto is not broken, then the weakness for LastPass is at my end: looking at the passwords via my browser.

But, anyway...

If LastPass' security is good enough - nothing is secure, but probably it is better to look for other problems to solve - then what needs fixing more? I.e. what cloud services need improved security more urgently?

Well... LastPass supposedly doesn't access unencrypted passwords, but account aggregators like Yodlee and Mint do. These services log into many, many, bank and other accounts, so that you can see all of your financial data in one place. Of necessity, if they do this while you are offline, then they have access to all of the passwords.

Ouch!

If Yodlee or Mint are broken into, then thousands, perhaps millions, of people's financial information is exposed.

(Interestingly, the old Wesabe, a dead company in this area (now open source), was adamant about not keeping passwords - or, rather, encrypting. Decrypting passwords only at your PC. But IIRC this meant that Wesabe could only aggregate when your PC was connected. Which loses if your only PC is a laptop, often not connected overnight.)

OK, how do we address this:

* we want no single aggregator to store all of the passwords, so that one breakin cannot expose them all

* aggregators necessarily, given the state of the art [*], have to send passwords (albeit over SSL)

How about:

Split the passwords.

Let no single cloud service store all of the password.

Let the aggregator service be stateless wrt passwords. Let it access 2 or more password storage services, and get all parts (both halves) of the passwords needed to access the user accounts. Access. Download. And then forget.

Breakins at any single one of the password storage services would not disclose all (encrypted) passwords.

A breakin at the aggregator would not disclose stored passwords.

But... if the aggregator was pwned, then the badguys could be intercepting the passwords on the fly.
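
A minimal sketch of one way to do the split, in Python. Note that this swaps in XOR secret splitting rather than literally cutting the password in half: one share is random bytes, the other is the password XORed with those bytes, so either share alone is just noise (apart from revealing the length). The function names are hypothetical; PWstore1 and PWstore2 stand for the two password storage services.

```python
import secrets


def split_password(password):
    """Split a password into two shares; either share alone is random noise."""
    data = password.encode("utf-8")
    share1 = secrets.token_bytes(len(data))                # goes to PWstore1
    share2 = bytes(a ^ b for a, b in zip(data, share1))    # goes to PWstore2
    return share1, share2


def join_password(share1, share2):
    """What the stateless aggregator does just before logging in, then forgets."""
    return bytes(a ^ b for a, b in zip(share1, share2)).decode("utf-8")
```

This does nothing about the last point: a pwned aggregator still sees the joined password at the moment of use.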

---

We can get ornate, and imagine websites "calling back":

* aggregator tells PWstore1 and PWstore2 that it is about to access website W
* aggregator accesses website W
* website W calls PWstore1 and PWstore2 to check

etc.

Plus, of course, generic challenge/response instead of entering passwords into boxes.
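
For concreteness, a minimal sketch of a generic challenge/response exchange in Python, assuming the client and the server already share a secret key. It uses HMAC-SHA256 from the standard library; the three function names are made up for illustration, and this is not a description of any particular site's protocol.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: a fresh random nonce for each login attempt."""
    return secrets.token_bytes(16)


def respond(shared_secret, challenge):
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()


def verify(shared_secret, challenge, response):
    """Server side: recompute the expected response, compare in constant time."""
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Because each challenge is fresh, a recorded response is useless for a later login, which is exactly what a password typed into a box cannot offer.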

---

Passwords would need to get longer so that the split passwords are not so vulnerable.

---

There's probably some fatal flaw in what I propose above. Embarrassing. But I no longer care to remain silent and avoid embarrassment.

"Google Chrome to Phone" is not affiliated with Google???

I installed "Chrome to Phone" on my Android device, from the Google Play Store. At this store, Chrome to Phone is described as

<<< Google Chrome to Phone Google Incorporated Top Developer >>>

I installed the related Chrome extension from the Chrome web store:

<<< Google Chrome to Phone Extension (3412) Fun from chrometophone-extension@google.com >>>

However, when I try to click the link on the "Google Chrome to Phone" icon by the address bar of my browser, I get:

<<< The Chrome To Phone application on your computer is requesting access to your Google Account for the product(s) listed below. Chrome to Phone (chrometophone.appspot.com) - not affiliated with Google If you grant access, you can revoke access at any time under 'My Account'. The Chrome To Phone application will not have access to your password or any other personal information from your Google Account. Learn more The application that directed you here claims to be "Chrome To Phone". We are unable to verify this claim as the application runs on your computer, as opposed to a website. We recommend you deny access unless you trust the application. >>>

So, which is it? "Google Chrome to Phone" is not affiliated with Google? Or have I been hit by malware that is intercepting this? More likely, Google acquired the app, but did not update everything. Sigh.

Thursday, June 07, 2012

Biometrics are the smallest and least important part of replacing passwords

Marketplace Tech Report for Thursday, June 7, 2012 | Marketplace.org

Just heard the item about the LinkedIn security failure, passwords, and the researcher working on password typing rhythm.

Had to write because I am sick and tired of biometrics folks like the password typing rhythm guy saying they could solve password problems - when there is a fundamental limitation that means that they only solve the least important part of the problem, and that not very well.

First, terminology: "biometrics" means anything that measures something about you. Like your password typing rhythm, your fingerprint, your retina scan.

The problem: anything like biometrics can be recorded and replayed by a bad guy. If exact replay is detected, the bad guys are smart enough to vary it.

What this means: biometrics, such as password typing rhythm, is only useful if the device that is being used is PHYSICALLY SECURE as well as clear of viruses, and if it uses encryption to talk to whatever remote server, like Google, you are trying to authenticate to.

For example: probably a bank can trust a fingerprint reader, or a password typing rhythm system, if the readers are kept at the bank.

But the bank should NOT trust password typing rhythm or fingerprints that are read from a remote device that is not physically secure. E.g. say that I go to an internet cafe in Mexico to use the web while on vacation: the fingerprints or password typing rhythm read there should NOT be trusted, because some bad guy may own the computer that is reading them.

Stuff you own is intermediate: your desktop and laptop PCs, and your cell phone, are not completely physically secure. Not as secure as at a bank. But more secure than an internet cafe in Mexico. Plus, they may have malware, such as a keyboard logger, recording your password typing rhythm and giving it to the bad guys.

(By the way, my own most recent expertise is in preventing such malware.)

Here's the bottom line:

* Remote servers cannot trust biometrics like password typing rhythm.

** not unless they can trust you to have a physically secure device, and secure communications.

* If you give your biometrics to multiple sites - like the 528 sites one of the folks you interviewed has passwords on - then chances are that one of them will have a security breach, and the bad guys will know your biometrics.

Is it hopeless? No!

The way forward is what Google is already starting to do: working with your cell phone as well as your PC to authenticate securely.

E.g. now, when I log into Gmail on my PC, Google texts my cell phone as a cross check.

Now, if I lose both my cell phone and my laptop PC together, the bad guy might have both. But this is just a step.

One of the next steps will be for the biometrics, e.g. the fingerprint, to be read on your cellphone. And for your cell phone to reply to Google saying "Yes, I have read Andy's fingerprint". And for the cellphone to automatically contact my PC, saying the same.

Worried about losing your cell phone? Make the device that reads the biometric into something you wear, like a bracelet or amulet on a necklace or ring. I call this a "security amulet". You might regularly rub it, e.g. to read your fingerprint. (We might even imagine surgically implanting it.)

Actually, when we do this, we don't really need to have the cell phone send the biometric back to Google, or any of the other 528 web sites. Perhaps your security amulet will read a fingerprint, while mine is listening to the "core rhythm" of my heartbeat. And perhaps once a day it may check my password typing rhythm.

We still need to worry about losing the biometric device / cell phone / security amulet - although it is harder to lose something like your wedding ring, it is possible. There's no panacea here, except to note that we can make devices that are tamper resistant - if one is lost or stolen, and a bad guy is trying to break in and steal your security data, it can (1) erase itself if it detects a clumsy attempt to break in, and (2) make it hard to break in in an undetectable way. E.g. giving you, say, 48 hours to realize that you have lost your security amulet / wedding ring.

The biometrics is only a small part of this. It's the least important part, actually: more important are the security protocols between your cellphone / security amulet, and all of the servers you want to use securely.

Google's texting of a code to your cellphone is just a start. But it's happening. Slowly, but it's happening.

So, PLEASE stop interviewing biometrics folks as if they can solve security problems. Biometrics is only the smallest and least important part of the problem.

Sunday, June 03, 2012

A make pattern


test: run-test1 run-test2

.PHONY: run-test1
run-test1: test1.x
        ./test1.x
test1.x: test1.cpp ...
        gcc -o test1.x test1.cpp ...

Of course, in a decent build system, this is a macro Build_Targets_For("foo.cpp"), rather than having to write them all out by hand. Scons may not be the best, but it avoids much clutter.
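
A rough sketch of what such a macro could look like, written as a Python function that emits the make rules as text. The function names build_targets_for and makefile_for, and the default compiler argument, are assumptions for illustration; this is not SCons' API.

```python
import os


def build_targets_for(source, compiler="gcc"):
    """Return make rules that build and run one test executable."""
    stem, _ext = os.path.splitext(source)   # "test1.cpp" -> "test1"
    exe = stem + ".x"
    return (
        f".PHONY: run-{stem}\n"
        f"run-{stem}: {exe}\n"
        f"\t./{exe}\n"
        f"{exe}: {source}\n"
        f"\t{compiler} -o {exe} {source}\n"
    )


def makefile_for(sources):
    """Emit the top-level 'test' target followed by the rules for each source."""
    stems = [os.path.splitext(s)[0] for s in sources]
    top = "test: " + " ".join(f"run-{s}" for s in stems) + "\n\n"
    return top + "\n".join(build_targets_for(s) for s in sources)
```

Calling makefile_for(["test1.cpp", "test2.cpp"]) produces the pattern above for both tests, so adding a test means adding one file name rather than five lines.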

.x

I once gave a bad review to a UNIX book that suggested that executables should have a .x suffix.

But I find myself using the .x suffix, in large part because I want to automatically .hgignore (or .cvsignore, or .gitignore) executables created by tests.

I *do* only tend to use .x for test executables. I prefer not to have to type in foo.x to use the tool foo. Although sometimes I will create a script foo that invokes foo.x.

I sometimes wonder about a suffix path as well as the directory path.

Wanted: per-directory .hgignore

One of my big complaints about Mercurial is that it only supports /.hgignore.

It does not support .hgignores in sub-directories.

I have a lot of legacy code with .cvsignores scattered around. In CVS, when you move the directory from one place to another, the .cvsignores just work at the new location. (Although moving the CVS directories around is a challenge.) Whereas in Mercurial, when you move a subdirectory, you also need to go and fix any patterns in the project .hgignore. :-(

Because of this, one is tempted to use patterns that are possibly too broad. I have several times found myself .hgignore'ing files in one part of the tree because they matched a pattern intended for a different part of the tree.

Status versus log

Many people - e.g. the people who want me to edit version control history - really want a status, rather than a log.

A log tends to record diffs. Deltas. Changes. What I did or am doing.

A log message for a version control system tends to describe what has changed between this and the last.

Note that a version control log is less complete than a human log. A human log may describe what you tried, but which failed, and which you did not commit. Sometimes I think it is as important to record that, as it is to record what worked well enough to release.

Oftentimes what people really want is a status. "Here's the state of the project - this works, this does not".

Note that in a human log, sometimes you may step back and write a status.

Tools like hg bisect really want a status. E.g. hg bisect only wants to attempt to test on versions that are known good - good enough to have run some sort of test in the past.

Statuses change. "Passes all tests" may become false, if more tests are added.

Logs

I like keeping a manual log (journal or diary) as I go.

Logs want to be a meta-file concept.

E.g. I only want to edit and view one log. I want to be able to track all of my work in a single place.

But, I also want to associate subsets of my overall log with particular projects.

E.g. I want to view log entries related to a project under ~/work/project/LOG. But I want to view these, and all others, under ~/LOG.

Editing this in ordinary files sucks. If I place it in a subdir/LOG I can't see it in ~/LOG, etc.

Suboptimal interactions with version control. I want version control for logs, but... Even if I am just working in projectdir/, editing projectdir/LOG, if I go off on a branch, edit projectdir/LOG, and switch back, I lose the log. The log wants to transcend branches.

I have occasionally tried to work around this by placing all of my log in the commit log. But that's not all that great an idea.

I want as much automated log support, support for tracking personal history, as possible.

E.g. record the fact that I sent an email.

Automatically copy version control commit messages into my overall log. But, of course, make everything filterable/foldable.

Friday, June 01, 2012

hg subrepos

hg subrepos:

push - by default pushes subrepos

pull - does not recurse by default

nor does hg status

This is almost exactly the wrong thing: the dangerous thing, that can write master repos, is the default. But the things that do not modify anything are not enabled by default for subrepos.

The docs warn: hg subrepos are a feature of last resort.
http://mercurial.selenic.com/wiki/Subrepository
http://mercurial.selenic.com/wiki/FeaturesOfLastResort