Krazy Glew's Blog: 07/01/2012

Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Friday, July 27, 2012

Why Gmail doesn't want me to be efficient

I have long complained that Gmail seems to want me to be inefficient. E.g. instead of giving me a scrollable list of, say, 500 emails in response to a search, they insist on showing me screenfuls at a time. Plus, they don't easily adjust to my current screen size - they always give me circa 20 items at a time, even when I am on a large screen that can support 120 such.

Now, there may be settings, like showing 100 conversations per page. But (1) those don't apply to everything, e.g. not the searches - and if there is a setting that does, I have not found it. (2) They have not provided easy-to-adjust controls, such as dragging to enlarge the number shown.

I used to imagine that this was due to web HTML / JavaScript limitations.

But now I suspect another reason: if they help me be more efficient, I look at fewer ads.

Thursday, July 26, 2012

Positive confirmation of success

Test suites should not just say "all tests passed". Or, even worse, return success silently.

They should at least optionally list what tests have been run. Or at least a summary of how many tests have been run.

It has happened before that tests have been omitted by accident. Test suites themselves can have bugs.

This is just an example of where positive confirmation of an action can be important.

However, it is nice to be able to turn off that positive confirmation, e.g. in UNIX pipelines.

Myself, I like "folding" such confirmation.
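A minimal sketch of the idea in Python, assuming a hypothetical run_suite helper (not any real test framework's API): it always knows how many tests it ran, reports that by default, can list each test, and can be silenced for use in pipelines.

```python
import sys


def run_suite(tests, verbosity=1, out=None):
    """Run (name, callable) pairs and confirm what was actually run.

    verbosity 0: silent (e.g. inside a UNIX pipeline)
    verbosity 1: one summary line with the number of tests run
    verbosity 2: one line per test, then the summary
    """
    out = sys.stdout if out is None else out
    passed = failed = 0
    for name, test in tests:
        try:
            test()
            ok = True
        except AssertionError:
            ok = False
        if ok:
            passed += 1
        else:
            failed += 1
        if verbosity >= 2:
            print(("ok: " if ok else "FAILED: ") + name, file=out)
    if verbosity >= 1:
        # Stating the count makes an accidentally empty suite visible.
        print(f"ran {passed + failed} tests: {passed} passed, {failed} failed", file=out)
    return passed, failed
```

With this shape, a suite that has lost all of its tests by accident reports "ran 0 tests" instead of quietly succeeding.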

Metadata for Mercurial

http://stackoverflow.com/questions/4443712/is-there-a-mercurial-extension-like-svn-propset/11679615#11679615

(I'm not saying anything that I haven't said before: metadata needs to be versioned (everything needs to be versioned), but metadata may be imagined as transcending versions in normal operation. Looking at these Q&As helps me crystallize my thoughts.)

I'm going to add a really late quasi-answer to this question, since it is one that many, many, people using Mercurial, including me, ask - something we want to have, and expect to have, after using other tools like svn properties.

As far as I can tell, there is no such extension for Mercurial. Yet.

Yes, the Mercurial way seems to be to track files and only files. And it might be that the way to write such an extension is to put the necessary metadata into ...repo/.hg* files.

OK. I've been playing around with that. Doing stuff by hand, before trying to write tools.

The key weakness of the versioned .hg files approach is that if you check out a non-tip version, e.g. "hg update -r OLD-VERSION", you get an old version of the metadata.

I think the key thing, therefore, may be to put the metadata in ...repo/.hg* files.

But... most operations should be performed with or upon the latest version of such files. I.e. I think such metadata files "transcend" versions - i.e. they want to be versioned, but in an ideal situation you might imagine them as overlaid upon the normally versioned files that you may have checked out an older version of.

Moreover, in many cases you want the branching of such metadata files to be handled separately. E.g. imagine a file where you are trying to write a description of all branches, together. Perhaps comparing branches, like "branch1.1 is a more recent version of branch1". You don't want that description to be on either branch; or, rather, you want it to be on both branches, at the same time, and you want changes to be reflected on both branches.

Such a putative extension would either operate on "hg cat -r tip ...repo/.hg-my-new-metadata". Or it would somehow overlay the checked-out version of the files with the version-transcending metadata files.

I have made some progress on doing this with subrepos:

superrepo
files // normally-versioned-files <-- a subrepo
metadata // version transcending metadata <-- a subrepo

This allows me to check out the latest metadata alongside an older version of the files.

It's not quite there, because checking out a particular version of the superrepo may get old versions of the metadata subrepo. But at least the newer versions are in the subrepo.

Let me also note that, whether you put the metadata in a neighboring subrepo, or keep it in the same repo (but operate on the tip), there is a problem that you can do

hg clone -r OLD-REVISION repo newrepo

This will strip out metadata later than OLD-REVISION. Including the metadata that says that "OLD-REVISION passed all of the tests", i.e. it will strip out metadata from a later revision that might apply to OLD-REVISION.

This same problem happens with hg tags.

One might say "well, never do that" - never strip out history. Unfortunately, that is often recommended as a way of "tidying up" the repository.

It seems hard to avoid this with Mercurial.

Wednesday, July 25, 2012

Wanted: hg branch -r REV

I want the following in Mercurial: a -r REV argument for hg branch.

Default "hg branch" is equivalent to "hg branch -r ."

Branch of some other REV is "hg branch -r REV"

Actually, what I really want is this in revsets: hg log -r "branch_of(.)"

Then I could do: hg log -r "max(ancestors(.) and not branch(branch_of(.)))"

- the most recent ancestor on a different branch

Ah: matching (just not in the version at work). But it is not quite it... it returns revs, not a branch name.
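To make the wished-for query concrete, here is a small Python model. It is only a sketch over a toy DAG (rev number -> (branch name, parent revs)); branch_of is the hypothetical predicate wished for above, and none of this is Mercurial's own API.

```python
def ancestors(dag, rev):
    """All ancestors of rev in the toy DAG, including rev itself."""
    seen, stack = set(), [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(dag[r][1])
    return seen


def branch_of(dag, rev):
    """Branch name of a single revision (the hypothetical branch_of(.))."""
    return dag[rev][0]


def latest_ancestor_on_other_branch(dag, rev):
    """Toy version of: max(ancestors(.) and not branch(branch_of(.)))."""
    mine = branch_of(dag, rev)
    others = [r for r in ancestors(dag, rev) if branch_of(dag, r) != mine]
    return max(others) if others else None
```

For a branch b that forked from default at rev 1, asking from any rev on b gives rev 1, i.e. the point where the branch left its parent branch.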

Tuesday, July 24, 2012

co -r REV workspace = overlay co -r REV source + co -r LATEST metadata?

The Mercurial way seems to be to handle files and only files.

Metadata, like svn properties, indicating whether tests have passed or not, in the Mercurial mindset lives only in files. See, for example:

http://stackoverflow.com/questions/4443712/is-there-a-mercurial-extension-like-svn-propset

I actually am okay with this. Keep stuff in files. However, I think there are issues wrt how a workspace should be assembled: more and more I think of a workspace as an overlay of metadata, in files, that transcends versions, and the actual versioned source code files.

I.e. if you have a file that describes all of the branches in a repo, and what they are used for, it rather defeats the purpose if, when you update the workspace to a branch, you lose other

Monday, July 23, 2012

Bob Colwell / Craig Barrett

http://newsletter.sigmicro.org/sigmicro-oral-history-transcripts/Bob-Colwell-Transcript.pdf

0:27:22 BC: That never happened. Instead, for example five Intel fellows including me went to visit Craig Barrett in June of 98 with the same Itanium story, that Itanium was not going to be able to deliver what was being promised. The positioning of Itanium relative to the x86 line is wrong, because x86 is going to better than you think and Itanium is going to be worse and they're going to meet in the middle. We're being forced to put a gap in the product lines between Itanium and x86 to try to boost the prospects for Itanium. There's a gap there now that AMD is going to drive a truck through, they're going to, what do you think they're going to hit, they're going to go right after that hole" which in fact they did. It didn't take any deep insight to see all of these things, but Craig essentially got really mad at us, kicked us out of his office and said (and this is a direct quote) "I don't pay you to bring me bad news, I pay you to go make my plans work out".

0:28:26 BC: So and he, and at that point he stood up and walked out and to back of his head. I said, "Well that's just great Craig. You ignored the message and shot the messengers. I'll never be back no matter how strong of a message I've got that you need to hear, I'll never bring it to you now.”

Friday, July 20, 2012

small files may be their own checkin log message

Sometimes I find myself taking the contents of a small file that I am checking in, e.g. with hg ci, and inserting the contents of that file as the checkin log message.

I.e. sometimes the file itself explains what it is better than words can.

Typical for small scripts, one- or two-liners.

This makes some history messages that interleave log message and file contents ugly.

Note: often not the entire checkin log. Often I put a little bit of text explaining.

Thursday, July 19, 2012

Repo state on branches

Interesting:

if you do hg clone -r default

(m)any branches that were closed will now be reopened.

Because the Mercurial close record for a branch is usually a stub in the revision tree.

This is suboptimal.

It is also the sort of thing I have been obsessing about, not only for source code version control, but also for logs: some state may want to transcend branches.

Code phases: Distinction between simulation and instrumentation code

I often find in simulators that I want to distinguish the simulation code, the code actually needed to run, from the instrumentation code. The latter exists only for analysis of the results, but is not necessary if all you care about is functional correctness.

Information should only flow one way, from simulation code to instrumentation. Never backwards.

I have experimented with C preprocessor macros like INSTRUMENTATION( perf_counter++ ), and so on, but these are clumsy. I will admit that INSTRUMENTATION: looks better than INSTRUMENTATION( ).

If you use such macros, you can always do #define INSTRUMENTATION(x) /*nothing*/
and test that it still compiles and runs tests.

However, simulation/instrumentation are just two phases of code.

Debug may be another phase, at least read-only debug observation. (Sometimes there is active debug, that influences the actual running.)

Asserts are yet another phase. Different from debug, because you often leave them on.

In both debug and assert phases, information flows from program main phase to the phases.

How about server/client partitioning? Although in this case data flows both ways.

== and !=

Q: should you always define operator!=() when operator==() is defined?

Uncomparables, where neither == nor != is defined? Essentially, true, false, unknown.
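One way to picture the "uncomparables" case is a three-valued comparison. A minimal Python sketch follows; the names tri_equal and tri_not_equal are made up for illustration, and None stands in for "unknown". It is not a statement about how C++ operators ought to be written.

```python
def tri_equal(a, b):
    """Three-valued ==: True, False, or None if either operand is unknown."""
    if a is None or b is None:
        return None
    return a == b


def tri_not_equal(a, b):
    """Three-valued !=: the negation of tri_equal, except unknown stays unknown."""
    eq = tri_equal(a, b)
    return None if eq is None else not eq
```

Here != is derived from == wherever == has an answer, and for uncomparable operands neither one claims to hold.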

File by file merge

Screw this "treat the whole repository as a single thing".

To disentangle a thrash, I am copying files one or two at a time from a thrashing branch to a new branch.

Something like

hg update -r careful-branch
hg revert -r thrashing-branch file1.cpp file1.h
...
hg ci -m 'file1.cpp/.h merged from thrashing-branch'

Basically doing a file by file merge.

It sucks that Mercurial does not record this as a merge. Doesn't show up in the graphical diagram drawn by glog. I'd like some sort of dotted line.

Hmm : could track workflow - what files remain to be merged. Even on a diff chunk by chunk basis:

merge this chunk ; reject this chunk; defer until next pass.

Wednesday, July 18, 2012

Log entries cloned to status

I want to make logs (diaries, journals, etc.) more useful.

I have discussed the distinction between log entries - "at time T I did or observed this" - and status records - "the current state, e.g. of a code tree or my home directory, is SSSSS".

I.e. log entries are transient and historical.

Status entries are persistent, and, supposedly, current.

The big problem is that status goes stale. So status is always of the form "At time T the status was S (and I assume it still is, unless somebody has changed it, in which case they should please change the status)".

It is a pain to update both status and log.

Many log entries are status. But the log is not status - if for no other reason than that a log grows huge, and periodically gets cleaned out or renamed.

Idea: automatically mark blurbs that are being written as status entries, in addition to adding them to the log. (I want to make everything written get logged.)

E.g. copy them to ~/STATUS as well as ~/LOG. Linked appropriately.
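A rough sketch of what that could look like in Python. Everything here is an assumption for illustration: the record function, the per-topic keying of status, and the JSON layout of the status file are mine, not an existing tool.

```python
import json
import os
import time


def record(text, log_path, status_path=None, topic=None, now=None):
    """Append text to the log; if a topic is given, also make it the current status.

    The status entry carries the same timestamp as the log line, so a reader
    can tell "as of when" the status was true and find it in the log.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp} {text}\n")
    if status_path is not None and topic is not None:
        status = {}
        if os.path.exists(status_path):
            with open(status_path, encoding="utf-8") as f:
                status = json.load(f)
        # A newer blurb on the same topic replaces the stale status.
        status[topic] = {"as_of": stamp, "text": text}
        with open(status_path, "w", encoding="utf-8") as f:
            json.dump(status, f, indent=2)
    return stamp
```

The log keeps every entry, while the status file holds only the latest blurb per topic, each stamped with "at time T the status was S".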

Tuesday, July 17, 2012

Cute trick to snarf Windows dialog box text

I have long been frustrated when trying to report bugs in Windows, because I could not easily snarf the text of a dialog box, etc., to stuff into the bug report.

Well, it turns out there is a way:

http://stackoverflow.com/questions/158151/how-can-i-save-a-screenshot-directly-to-a-file-in-windows

PhiLho:

Little known fact: in most standard Windows (XP) dialogs, you can hit Ctrl+C to have a textual copy of the content of the dialog.

Example: open a file in Notepad, hit space, close the window, hit Ctrl+C on the Confirm Exit dialog, cancel, paste in Notepad the text of the dialog.

Well, I have wasted lots of time not knowing this...

Of course, I would learn about it after I get into the habit of using tools like SnagIt and Windows 7's Snipping Tool to snip bitmaps by default, and paste those into... email, OneNote.

Heck, yesterday I got a primitive ability to paste images into GNU EMACS "text" files.
(Basically, I am making such text files be directories, and using my-org-screenshot (found all over, e.g. http://pastebin.com/QfLb9ZBr) to put the screenshot into a file that EMACS' org-mode can reference. Currently using directories, am modifying EMACS' tar mode to allow the new file to be written. Following my dictum that "UNIX already has all of the support needed for structured files: directories. Archives of directories, to make them convenient to move around.")

Heck, I think the biggest change in my usage patterns over the last three years has been to start blithely throwing bitmaps around. Basically, to treat bitmaps as a first class data type.

Especially useful when you have tools like Microsoft OneNote that can OCR a bitmap. And can therefore do a pretty good job of searching notes composed out of bitmap files.

Centralized Notification Ringtone/Sound Management

Android allows me to have different sounds (ringtones) for different notification events. This is good.

However, each seems to be managed in isolation. This is bad.

The most important thing about sound notifications is that they be distinct, so that you can remember which is for what. Best if there is some natural connection - like the sound of mail dropping through a slot for incoming email. (Although I get so much email... this is the only natural one I can think of.)

But distinct. Not good to have the same sound for incoming email (low priority), as for the reminder to get in the car and drive across town to an appointment you don't want to miss.

Managed in isolation, need to jump to each notifying app and change.

Want centralised management, where I can see a list of all the different types of notifications I use, and can hear their sounds back to back.

I think Blackberry had this. But "all the different types of notifications I use". *I* *USE*. I recall the Blackberry list had all the notifications on the phone, including many I had no idea about. Wasted time setting distinct notifications, in one case thinking that incoming messages meant incoming SMS text messages, not some other service.

--

Want different notifications for calendar items. E.g. different notifications for the "Leave now, drive 1 hour across town", and the "Pick up the phone for a meeting".

Saturday, July 14, 2012

Version control over multiple paths

Changes that are logically associated are often interleaved:

E.g. instead of

v0--(big change a (a1) (a2))-->va--(big change b (b1) (b2))-->vab

we often see

v0--(a1)-->va1--(b1)-->va1b1--(a2)-->va12b1--(b2)-->va12b12

Now, some people like to rewrite history, to get all of change a before change b. I don't want to rewrite history. But I do want to create a more easily understood version of history.

Why can we not have multiple paths to the same end result? E.g.

v0--(a1)-->va1--(b1)-->va1b1--(a2)-->va12b1--(b2)-->va12b12
 \          ||                                        ||
  o--(a1)-->va1--(a2)-->va12'--(b1)-->va12b1'--(b2)-->va12b12

What are the chances of this formatting working? The diagram probably is messed up.

Anyway, this depicts an alternate history. va1 and va12b12 correspond to exactly the same file contents as the original history. But va12' and va12b1' do not.

We can imagine allowing the user to "check out" va12' and va12b1', but warning that it never occurred in the original history, and may not have been tested as well.

We can also imagine back annotating, to say that these versions were okay. Tests passed.

For that matter, we could also have

v0--(a1)-->va1--(b1)-->va1b1--(a2)-->va12b1--(b2)-->va12b12
 \\         ||                                        ||
  o--(a1)-->va1--(a2)-->va12'--(b1)-->va12b1'--(b2)-->va12b12
  ||                                                    ||
  o--(b1)-->vb1'--(b2)-->vb12'--(a1)-->va1b12''--(a2)-->va12b12

and, overall, between any two points we may have an indication of what changesets may be grouped, and which are commutative, as well as a record of which intermediate points in the possible histories have been tested.
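Whether two orderings really are "multiple paths to the same end result" can be checked mechanically. Below is a minimal sketch in Python under a deliberately crude assumption: a changeset is modelled as {file: (old content, new content)} and a version as {file: content}. The function names are made up for illustration; this is not how any real VCS stores changes.

```python
def apply_change(state, change):
    """Apply {file: (old, new)} to a {file: content} state; None means 'absent'."""
    result = dict(state)
    for path, (old, new) in change.items():
        if result.get(path) != old:
            raise ValueError(f"{path}: expected {old!r}, found {result.get(path)!r}")
        if new is None:
            result.pop(path, None)
        else:
            result[path] = new
    return result


def replay(state, changes):
    """Apply a sequence of changes in order and return the final state."""
    for change in changes:
        state = apply_change(state, change)
    return state


def same_endpoint(state, history_a, history_b):
    """True if both orderings apply cleanly and end in identical file contents."""
    try:
        return replay(state, history_a) == replay(state, history_b)
    except ValueError:
        return False
```

With a1, a2 touching one file and b1, b2 touching another, the interleaved history and the grouped histories all replay to the same va12b12. An ordering that does not apply cleanly (say a2 before a1) is rejected. The intermediate states of the alternate paths are exactly the ones that never occurred in the original history and so may be untested.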

DVCS branches

If branches are a first class concept in a DVCS, then it should be possible to rename them.

If branches are just a "tag", then it should be possible to rename.

I don't want to lose history. I just want to revise it - preserving the original.

Email with a friend about Mercurial

A friend:

Reading a hg tutorial now. They make a big deal out of having your own local repo that you can commit to and push to a shared repo when ready. That is essentially how I use branches in Svn. But branches are a pain when you have many trees. Trying to keep track of them all...

Urg. So many brain damaged DVCS people say that a private repo is a branch.

Maybe so ... but it is a dead branch that has fallen off the trunk, and is lying on the floor of the forest waiting to rot, get chippered, or catch fire.

Or maybe, just maybe, get grafted back.

Ok, analogy too far.

But... Mercurial does have real branches. Just like svn, just like cvs, just like ...

First you clone a repo.

Then you make branches in your local repo.

You can check in to the branch in your local repo.

You can push your local repo's changes to the master repo.

You can push the local branch, but they warn you about this - real branches were added late to Mercurial, and many people are scared of them.

(Git, on the other hand, treats branches much more naturally.)

You can merge from the trunk to your branch, in your local repo (you always work in your local repo).

Test that it worked.

When it does, merge from that back to the trunk

(the trunk in Mercurial is called -r default)

Finally push that from your local repo back to the master repo.

Here:

hg clone master-repo local-repo
cd local-repo
hg update -r default
hg branch task-branch
# ..make edits
hg ci -m 'my-edits' # on task branch
hg push master-repo # push, just to save in master repo. don't merge yet.
hg pull master-repo # probably updates the trunk, -r default, and maybe branches other than what you are working on
hg merge -r default # merge from the trunk into your task-branch
# ..test
# ..fix
hg ci -m 'merging from trunk (default branch)'
# ...make more edits
hg ci -m 'more edits on task-branch'
hg pull master-repo
hg merge -r default
# ... test
hg ci -m 'second tracking merge from master / default onto task-branch'
hg update -r default # I find this confusing: "hg update" is what you use to switch branches
hg merge -r task-branch
hg ci -m 'merged from task-branch back into default trunk'
hg diff -r task-branch -r default # should be identical wrt data, although hg's history checksum makes them different
hg push master-repo

One project you know and loved? has a multi-GB git repo now. Checking out a new tree is very time consuming... Maybe there is a way to limit the past? Someone checks in a binary and you lug it around forever--all versions of that binary... That seems like one small advantage for Svn. You check out what is current.... Maybe hg has help here??

hg is a bit better than git. it keeps checkpoints, so doesn't apply diffs.

but, all of these silly DVCSes carry around all history. Stupid.

OK to have all history in a master repo,

but I want my workspace repo to have a subset of the history - recent stuff.

I have tried to do this with Mercurial, but no joy.

There are many tools that allow you to edit history. But I don't want to edit history. I just want to subset it.

E.g. all 6 volumes of Gibbon in the library. But only 1 in my backpack. Yes, even on a Kindle...

By the way, you can configure HG so that several working trees share a repo.

Monday, July 09, 2012

Notifications when changeGROUPs are pushed

The Mercurial mailing list archived at http://mercurial.808500.n3.nabble.com/hgext-notify-as-a-changegroup-hook-generates-a-diff-between-wrong-versions-td3988885.html (and elsewhere, including Google groups) has recently discussed what hgext.notify should provide.

I have been annoyed by this, because currently it diffs tip. And hg tip may not be on the default branch. So if I check in stuff on a branch, the next changegroup after me will see hgext.notify diff against my changes.

I'm not ready to jump into the Mercurial mailing list. Not enough context. But let's think about it:

A changegroup - a set of changes being pushed together - is not necessarily connected. It is possible that, e.g. there may be some changes on the default branch (i.e. the trunk), and some on other branches.

e.g.


old             changegroup 1  changegroup 2
---1d---3d---   ----6d-------  ---8d---
    \
     2b--4b--   --5b-----7b--

Currently, I believe hgext.notify diffs 8d against 7b. Whereas it probably "should" be diffed against 6d. Similarly, 6d should be diffed against 3d. Moreover, in changegroup 1 there are two diffs of interest: 5b against 4b, and 6d against 3d.

By the way, I think that we are soon going to see a semantic split: (a) changegroup, a set of related changes, versus something like pushgroup, a set of changes pushed together. A pushgroup might contain multiple changegroups. But this doesn't matter too much... Even with this distinction, a changegroup may have multiple ancestors. The graph between old stuff and a changegroup may have more than one arc across it.

So, what should you do?

Hypothesis: if handling not changegroups, but individual revisions:

(1.1) The default diff should be against the parent on the same branch. Mercurial has anonymous branches, so there might be two parents on the same branch. Need then to figure out which is most useful. I would say:

(1.2) If there is only one parent on the same branch that is in the old set, prior to the changegroup, use that.

(1.3) If there is more than one parent on the same branch that is in the old set, prior to the changegroup, use the most recent. Or possibly the one that has the most/fewest diffs.

Note that I am prioritizing to the same branch. If a changeset has parents on other branches, don't diff against them - unless they are the only branches (which can happen when a branch morphs - since Mercurial does not allow branches to be renamed retroactively).

So, what about changegroups? The changegroup may be multiheaded, or not.

If the changegroup only has a single head, you probably want to diff against the rev on the same branch as the head, in the old set, prior to the changegroup. Using the same prioritizations as above.

If the changegroup has multiple heads, (a) you may want to diff each, or (b) you may want to select a most interesting head. E.g. the most recent. Or the most recent on the trunk.
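The prioritization above can be sketched in a few lines of Python. This is only a model under stated assumptions: the DAG is a toy dict (rev number -> (branch name, parent revs)), "most recent" is taken to mean the highest rev number, and diff_base is a hypothetical helper, not part of hgext.notify or Mercurial's API.

```python
def diff_base(dag, old, head):
    """Pick the revision in `old` that `head` should be diffed against.

    Walk back from head through revisions that are new (not in `old`).
    Each time the walk crosses into the old set, note whether that old
    revision is on head's branch. Prefer same-branch candidates, most
    recent first; fall back to other branches only if there are none.
    """
    branch = dag[head][0]
    same_branch, other_branch = [], []
    seen, stack = set(), [head]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        for parent in dag[rev][1]:
            if parent not in old:
                stack.append(parent)  # still inside the changegroup
            elif dag[parent][0] == branch:
                same_branch.append(parent)
            else:
                other_branch.append(parent)
    candidates = same_branch or other_branch
    return max(candidates) if candidates else None
```

On the example graph, with revs 1 to 4 as the old set, this gives 3d as the base for 6d and 4b as the base for the branch head 7b; once changegroup 1 is in, 8d gets 6d rather than the tip 7b.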

Thursday, July 05, 2012

hg --close-branch updates tip :-(

Here's something joyful: when you close a branch in Mercurial, e.g. via

hg update -C branch-to-be-closed
hg commit --close-branch -m 'branch to be closed'

it becomes the "tip" - the most recent changeset.

Many tools confuse "tip" with something like "the tip of the default branch". And do stuff like checking out or diffstat'ing with tip.

Which probably is not what you want for a closed branch.

Merge node common to several branches

Noticing a pattern:

hg branch b
... edit b ...
... test ...
hg merge -r default
... test ...
hg ci -m 'merged default into b'
hg update -r default
hg merge -r b
... diff -r default -r b
... test ...
hg ci -m 'merged from branch b into default (after already merged default into branch b, i.e. identical node in both default and branch b)'

This seems suboptimal.

It seems to me that instead of creating the same node, wrt content, on two branches, in this case default and b, we could create it once. And link it in to both branches.

Key: must remember that the node is on both branches, so that afterwards can continue working separately.

I.e. could have updated to the merge node M but be on the default branch. Or on the b branch. (Or possibly both?)

It may be common to end a branch at such a merge node. Perhaps common enough to be the default. But not universal.
