The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Iagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Monday, May 18, 2009

git usage model - individual files, subdirs

My attempts o use git are running into usage model problems.

As mentioned before, I have checked two diverged CVS trees into different branches of the same git repository. I want to merge the branches - but not all at once. I want to merge on a file by file, directory by directory, subprobject by subproject, basis, testing as I go.

Unfortunately, apparently git-merge is all or nothing. It merges all of the files in a branch, but does not have the option of merging one at a time.

I.e. it does not seem to have the ability to do what in CVS might look like

cvs update -rbranch1
cvs update -jbranch2 foo.c
cvs ci

merging only foo.c

This is the rub. Git seems only capable of treating the entire repo, the entire branch, as an object, whereas I look on files, directories, etc. as potential subobjects.

I am falling back to checking out two different git repos for the branches, and merging "manually" by copying files from one to the other. But this sucks.


Mark Seaborn said...

Yep, that's the problem with Git's data model. As far as I know, all the decentralised SCMs have the same basic history-is-a-DAG model, so the same will apply to Bazaar. I'm not familiar with Mercurial, but I suspect it will apply there too.

The exception is Darcs, which has a set-of-patches model.

You could try merging a subdirectory without creating a merge commit ("git reset" before "git commit", I think). i.e. Create a commit with a single parent. Git won't track the fact that you did the merge (merge tracking is all-repository-or-nothing). When you have merged all the subdirectories, a proper full-repository Git merge will be trivial.

AndyGlew said...

Thanks for your comment and suggestion, Mark.

Q: how did you happen to come across my blog? Just curious. Googling for git?

I think that I am just going to merge onto one (or both) or my live branches, and tag when merged. Eventually, I hope to have merged everything... Problem with this approach is that it encourages subsequent spontaneous re-divergences, because it does not identify objects that should be shared between branches from the merge point on.

I suppose I could also create git submodules, which are really a form of shared object. But git submodules are not exactly lightweight or elegant. Almost as bad as CVS modules. That's another post.

Terminology: I do not think that the problem is "history is a DAG". The problem is that history is a DAG, where a node is a contour, i.e. a set of objects.

We could have history is a dag, where nodes are objects, individual files. Then you could draw a boundary around the set of nodes that comprise a version of the entire project. If you collapsed such a meta-node, you would have the current graph.