Long ago I gave up on CVS's modules. I found that checking out subtrees worked reasonably well.
I want to figure out how to check out and in just particular objects or directory subtrees from a git repo. Checking out should be easy enough, particularly if just exporting: scan whatever head of branch or contour you want, and check out the objects that match a criterion such as "under this tree".
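A minimal sketch of the export case, assuming the head of a branch can be modeled as a flat mapping from path to content id. The names `head`, `under_tree`, and `select_for_checkout` are made up for illustration; none of this is a git API.

```python
from pathlib import PurePosixPath


def under_tree(path, subtree):
    """True if path is the subtree itself or lies somewhere beneath it."""
    p = PurePosixPath(path)
    s = PurePosixPath(subtree)
    return p == s or s in p.parents


def select_for_checkout(head, subtree):
    """Pick the entries of a head snapshot that match the criterion."""
    return {path: oid for path, oid in head.items() if under_tree(path, subtree)}


# Hypothetical head snapshot: path -> content id.
head = {
    "src/lib/a.c": "oid-a",
    "src/lib/b.c": "oid-b",
    "src/library/c.c": "oid-c",  # shares a string prefix, but is not under src/lib
    "doc/readme": "oid-d",
}
selected = select_for_checkout(head, "src/lib")
```

Comparing path components rather than raw string prefixes keeps `src/library` from being mistaken for something under `src/lib`.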
Checking out the history is a bit more challenging. If file objects have always been at the same position in the tree, no worries. If file objects have moved, but only within the sub-tree that is being checked out, again no worries. But if they have moved into or out of the sub-tree being checked out, what should you do?
I lean towards checking out the history objects, but somehow preventing checking out of file version objects that are outside the current sub-tree. You could read about them in the log, diff against them, do the equivalent of cvs update -p on them to look at the contents ... but the nice default of checking out a version of such a file object into its historical position in the filesystem would not apply to such out-of-bounds file objects.
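Roughly, the policy I am leaning towards could look like the following sketch. It assumes the history of one file object is available as a list of (version, path) pairs, oldest first; the function name and the action labels are hypothetical.

```python
def plan_history_checkout(history, subtree):
    """Decide, per historical version, whether it may be placed in the filesystem.

    history: list of (version, path) pairs for one file object, oldest first.
    subtree: the directory being checked out, e.g. "src/lib".
    """
    prefix = subtree.rstrip("/") + "/"
    plan = []
    for version, path in history:
        if path.startswith(prefix):
            # In bounds: may be checked out into its historical position.
            action = "checkout-in-place"
        else:
            # Out of bounds: still visible to log, diff, and print-contents,
            # but never written into the working directory.
            action = "view-only"
        plan.append((version, path, action))
    return plan


# Hypothetical file object that moved into the sub-tree at v2.
history = [
    ("v1", "tools/old/parse.c"),
    ("v2", "src/lib/parse.c"),
    ("v3", "src/lib/parser/parse.c"),
]
plan = plan_history_checkout(history, "src/lib")
```

Every version stays in the plan, so the log is complete; only the default of writing a version to its historical path is withheld for the out-of-bounds ones.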
Checking in is more of a challenge, mainly because of assumptions wrt atomicity of multi-file, whole-project, checkins. Clearly this atomicity cannot be supported at all times if my goal of checking in/out individual file objects and subtrees is to be supported.
However, I think that we should support such atomicity as much as possible, since so many people like it. The lack was one of the biggest complaints about CVS.
I think that it boils down to questions such as "What is a branch?" and "What is the head of a branch?" Whole-project commits correspond to a set of file objects, with a guarantee that the next checkin on the branch will be linked backwards to the current set.
E.g. imagine a small multi-file system, all on the same branch:
F1/v1 -> F1/v2 -> F1/v3
F2/v1 -> F2/v2 -> F2/v3
F3/v1 -> F3/v2 -> F3/v3
Checking in/out the whole project conceptually advances the versions of all file objects in lockstep, or, equivalently:
Project/v1{F1,F2,F3} -> Project/v2{F1,F2,F3} -> Project/v3{F1,F2,F3}
Checking in some, but not all, of the files corresponds to something like this:
F1/v1 -> F1/v2 -> F1/v3
F2/v1 -> F2/v2 -> F2/v3 -> F2/v4
F3/v1 -> F3/v2 -> F3/v3 -> F3/v4
or almost equivalently:
Project/v1{F1,F2,F3} -> Project/v2{F1,F2,F3} -> Project/v3{F1,F2,F3}
-> IncompleteProject/v4{F2,F3}
On this branch, you could ask for the most recent whole-project atomic commit, or the latest version of all files.
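The two queries can be sketched like this, assuming a branch is just an ordered list of commits and each commit maps the files it touched to their new version numbers. The data model and the function names are my own invention for illustration.

```python
PROJECT_FILES = {"F1", "F2", "F3"}

# The branch from the example: three whole-project commits, then a partial one.
branch = [
    {"F1": 1, "F2": 1, "F3": 1},
    {"F1": 2, "F2": 2, "F3": 2},
    {"F1": 3, "F2": 3, "F3": 3},
    {"F2": 4, "F3": 4},  # IncompleteProject/v4
]


def latest_whole_project(branch, project_files):
    """Most recent commit that covers every file in the project, or None."""
    for commit in reversed(branch):
        if set(commit) == set(project_files):
            return commit
    return None


def latest_of_all_files(branch):
    """Latest version of each file, regardless of which commit supplied it."""
    latest = {}
    for commit in branch:  # oldest first, so later commits overwrite earlier ones
        latest.update(commit)
    return latest


atomic_head = latest_whole_project(branch, PROJECT_FILES)
file_heads = latest_of_all_files(branch)
```

On the example branch the first query lands on Project/v3, while the second mixes F1/v3 with F2/v4 and F3/v4.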
Part of the problem is that the "filesystem" consists of the logical equivalent of directory files that list all filenames in a given version of the whole project, and point to the version (content) of the corresponding file objects. The filesystem does not really have a representation for a version of an individual file, except its content. Having per-file-object version tracking is conceptually easy, but would be more work, and would be liable to inconsistencies.
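To make the "implementation artifact" concrete: if all that is stored is a sequence of whole-project directory snapshots (name to content id), per-file version chains have to be derived by walking them. A sketch under that assumption, with made-up content ids:

```python
def per_file_history(snapshots):
    """Derive per-file version chains from whole-project snapshots.

    snapshots: list of dicts (filename -> content id), oldest first.
    A new per-file version is recorded only when the content id changes.
    """
    history = {}
    for snapshot in snapshots:
        for name, content_id in snapshot.items():
            chain = history.setdefault(name, [])
            if not chain or chain[-1] != content_id:
                chain.append(content_id)
    return history


# Hypothetical snapshots: F1 changes only once, F2 changes every time.
snapshots = [
    {"F1": "c10", "F2": "c20"},
    {"F1": "c10", "F2": "c21"},
    {"F1": "c11", "F2": "c22"},
]
chains = per_file_history(snapshots)
```

This is the derived, view-like form of per-file tracking: nothing extra is stored, so it cannot drift out of sync with the snapshots, at the cost of a walk over the history.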
This is not a conceptual problem, just an implementation artifact. If we had database views...
We could keep the whole project viewpoint by doing something like
Project/v1{F1,F2,F3} -> Project/v2{F1,F2,F3} -> Project/v3{F1,F2,F3}
-> Project/v4{F1/v3,IncompleteProject/v4{F2,F3}}
 
 