The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Saturday, July 14, 2012

Version control over multiple paths

Changes that are logically associated are often interleaved:

E.g., instead of

v0--(big change a (a1) (a2))-->va--(big change b (b1) (b2))-->vab

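we often see

[Diagram missing: a history in which the pieces a1, a2, b1, b2 are interleaved rather than grouped.]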

Now, some people like to rewrite history so that all of change a comes before change b. I don't want to rewrite history. But I do want to create a more easily understood version of history.

Why can we not have multiple paths to the same end result? E.g.

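[Diagram lost in formatting: an alternate path from v0 to va12b12 that passes through va1, va12' and va12b1', drawn beside the original history.]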

What are the chances of this formatting working? The diagram probably is messed up.

Anyway, this depicts an alternate history.  va1 and va12b12 correspond to exactly the same file contents as the original history.  But va12' and va12b1' do not.

We can imagine allowing the user to "check out" va12' and va12b1', but warning that they never occurred in the original history, and may not have been tested as well.

We can also imagine back-annotating, to say that these versions were okay: tests passed.

For that matter, we could also have

[Diagram lost in formatting: further alternate paths between the same two end points.]

and, overall, between any two points we may have an indication of which changesets may be grouped, and which are commutative, as well as a record of which intermediate points in the possible histories have been tested.
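A minimal sketch of the "multiple paths, same end result" idea, in plain shell with no VCS involved. The changes a1, a2, b1, b2 and the interleaved order are made up for illustration (an assumption, not read off the diagrams), and the a changes touch a different file from the b changes so that they commute. It replays two orderings from an empty v0 and compares the resulting trees.

```shell
#!/bin/sh
# Sketch: two orderings of the same four changes reach the same end state.
# a1/a2 append to a.txt, b1/b2 append to b.txt. These are hypothetical
# changes on disjoint files, so the a-changes commute with the b-changes.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

a1() { echo "a1" >> "$1/a.txt"; }
a2() { echo "a2" >> "$1/a.txt"; }
b1() { echo "b1" >> "$1/b.txt"; }
b2() { echo "b2" >> "$1/b.txt"; }

# replay NAME STEP...: start from an empty v0 in $work/NAME and apply
# the given steps in order.
replay() {
    dir=$work/$1
    shift
    mkdir -p "$dir"
    : > "$dir/a.txt"
    : > "$dir/b.txt"
    for step in "$@"; do
        "$step" "$dir"
    done
}

# same X Y: print "same" if the two trees have identical contents.
same() {
    diff -r "$work/$1" "$work/$2" >/dev/null 2>&1 && echo same || echo different
}

replay original a1 b1 b2 a2   # assumed interleaved order
replay grouped  a1 a2 b1 b2   # all of a, then all of b

end_states=$(same original grouped)
echo "end states: $end_states"
```

With this assumed order, the tree after `a1 a2` on the grouped path never exists on the original path, which is exactly the kind of intermediate point one would want flagged as "never occurred, maybe untested".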

DVCS branches

If branches are a first class concept in a DVCS, then it should be possible to rename them.

If branches are just a "tag", then it should be possible to rename them, too.

I don't want to lose history.  I just want to revise it - preserving the original.
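For comparison, a small sketch of what renaming looks like where a branch really is just a movable label. It uses Git's `git branch -m` in a throwaway repo; the branch names and the demo identity are hypothetical. Mercurial bookmarks can be renamed in a similar way with `hg bookmark --rename OLD NEW`, whereas a Mercurial named branch is recorded inside each changeset.

```shell
#!/bin/sh
# Sketch: renaming a Git branch changes the label only, not the commits.
set -eu
command -v git >/dev/null 2>&1 || { echo "git not found; skipping"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

git init -q "$work/repo"
cd "$work/repo"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m 'v0'

git checkout -q -b task-branch          # hypothetical branch name
before=$(git rev-parse HEAD)

git branch -m task-branch better-name   # rename the label
after=$(git rev-parse HEAD)
current=$(git rev-parse --abbrev-ref HEAD)

echo "branch is now $current; commit unchanged: $before = $after"
```

This renames without losing anything, but it also does not record that the old name ever existed, so it is only half of "revise it, preserving the original".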

Email with a friend about Mercurial

A friend:

Reading a hg tutorial now. They make a big deal out of having your own local repo that you can commit to and push to a shared repo when ready. That is essentially how I use branches in Svn.  But branches are a pain when you have many trees. Trying to keep track of them all...  

Urg.  So many brain damaged DVCS people say that a private repo is a branch.

Maybe so ... but it is a dead branch that has fallen off the trunk,
and is lying on the floor of the forest waiting to rot, get chippered, or catch fire.

Or maybe, just maybe, get grafted back.  

Ok, analogy too far.

But...  Mercurial does have real branches.  Just like svn, just like cvs, just like ...

First you clone a repo.
Then you make branches in your local repo.

You can check in to the branch in your local repo.
You can push your local repo's changes to the master repo.
You can push the local branch, but they warn you about this - real branches were added late
to Mercurial, and many people are scared of them.

(Git, on the other hand, treats branches much more naturally.)

You can merge from the trunk to your branch, in your local repo (you always work in your local repo).
Test that the merge worked.
When it does, merge from your branch back to the trunk
(the trunk in Mercurial is the branch named default, i.e. -r default).

Finally push that from your local repo back to the master repo.


# master-repo stands for the path or URL of the shared repo
hg clone master-repo local-repo
cd local-repo
hg update -r default
hg branch task-branch
# ... make edits
hg ci -m 'my-edits'                # on task-branch
hg push --new-branch master-repo   # push, just to save in master repo. don't merge yet. (hg refuses to push a new named branch without --new-branch)
hg pull master-repo                # probably updates the trunk, -r default, and maybe branches other than the one you are working on
hg merge -r default                # merge from the trunk into your task-branch
hg ci -m 'merging from trunk (default branch)'
# ... make more edits
hg ci -m 'more edits on task-branch'
hg pull master-repo
hg merge -r default
# ... test
hg ci -m 'second tracking merge from master / default onto task-branch'
hg update -r default               # I find this confusing: "hg update" is what you use to switch branches
hg merge -r task-branch
hg ci -m 'merged from task-branch back into default trunk'
hg diff -r task-branch -r default  # should be identical wrt data, although hg's history checksum makes them different
hg push master-repo

One project you know (and loved?) has a multi-GB git repo now.  Checking out a new tree is very time consuming... Maybe there is a way to limit the past? Someone checks in a binary and you lug it around forever--all versions of that binary... That seems like one small advantage for Svn.  You check out what is current....  Maybe hg has help here??

hg is a bit better than git here.  Its storage keeps periodic full snapshots (checkpoints), so it never has to apply a long chain of diffs to reconstruct a version.

But all of these silly DVCSes carry around all history. Stupid.

OK to have all history in a master repo,
but I want my workspace repo to have a subset of the history - recent stuff.

I have tried to do this with Mercurial, but no joy.

There are many tools that allow you to edit history.  But I don't want to edit history.  I just want to subset it.

E.g. all 6 volumes of Gibbon in the library.  But only 1 in my backpack.  Yes, even on a Kindle...
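One partial approximation of "only recent stuff in the workspace" exists in Git rather than Mercurial: a shallow clone. The sketch below builds a throwaway master-repo with three commits and clones only the newest one. The repo names, file name and demo identity are hypothetical, and this is an illustration of `git clone --depth`, not a claim that it solves the general subsetting problem.

```shell
#!/bin/sh
# Sketch: full history stays in master-repo, the workspace carries one commit.
set -eu
command -v git >/dev/null 2>&1 || { echo "git not found; skipping"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

git init -q "$work/master-repo"
cd "$work/master-repo"
for n in 1 2 3; do
    echo "version $n" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "v$n"
done

# The file:// URL matters: for a plain local path, git ignores --depth.
git clone -q --depth 1 "file://$work/master-repo" "$work/workspace"

full=$(git -C "$work/master-repo" rev-list --count HEAD)
recent=$(git -C "$work/workspace" rev-list --count HEAD)
echo "master-repo: $full commits, workspace: $recent"
```

The rest of the history can be pulled in later with `git fetch --unshallow`, so the library still has all six volumes even though the backpack has one.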


By the way, you can configure hg so that several working trees share a repo.