The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Friday, April 25, 2008

Why Wiki Collaboration Beats Email Collaboration

The local group of wiki evangelists mailed this around.

Comments and wishes:

a) I wish that I had quantitative data on time saved by the faster wiki collaboration cycle compared to the slower email collaboration cycle.

b) Email collaboration can be done off-line - e.g. by an executive (or a Joe Engineer, like me) on an airplane without net access.

c) Email has a built-in gateway function - useful when you are compelled to include organizational enemies in your review list.

d) I suspect something similar applies to SharePoint vs. Wiki. To a first order, you cannot collaborate on SharePoint pages, since the SharePoint editor is so klugey. But SharePoint supports collaboration on documents posted to SharePoint pages, and at that it is better than Wikis (at least those lacking WebDAV).
I.e. Wiki wins on collaboration through the pages themselves.
SharePoint wins on collaboration through documents / attachments.

Thursday, April 24, 2008

Version Control Checkout and Sharing of *Partially Populated* Subtrees

I'm on the version control tool warpath again.

I'm a damned good example of why VC tools need subprojects, nested trees, whatever... Never you mind Keith Packard (Mr X Consortium) saying that I shouldn't bother.

At the moment I am messing around with
a) at least 2 diverged CVS trees for my home directory, one on my laptop, one on Linux.
b) a simulator, which uses BitKeeper
c) my own main project, which uses CVS (fortunately, one site only)
d) various other tools at work, using CVS and SVN
e) my TWiki site at work, which uses RCS (although I can make it look like CVS if necessary)
f) friends and coworkers who use GIT

My contributions to b), c), d), and e) make extensive use of stuff from my personal library. But, these other projects do not want to import my whole personal library - rather, they want to import just the modules they need, and their dependences.

Part of my personal source tree is structured as follows:


" debug.h
" lprintf.h
" MathConst.h
" ag.h
" debug.h
" lfile.h
" libag.h
" simplelock.h
" strfamily.h*
" assert
" bitpattern
" bitvector
" builtin_initialization
" class_name
" code_location
" compile
" component
" " SimpleScalar-libag-component
" error
" error_stream
" File_Slurp
" file_slurp
" fmt
" getopt
" getter_setter
" indented_ostream
" Inherit
" interval
" IO_converter
" libuarch
" " cache-array
" " " examples
" " memory_image
" " multilevel_cache
" " SimpleScalar-libag-libuarch
" Memory_Object
" misc
" new-libag-ideas
" " quantiles
" number
" old-stuff-trying-to-reuse
" regd_ptr
" smart_ptr
" stack_trace
" stdio-stream
" test
" thisprintf
" to_bitstring
" to_string
" typename
" types
" unindent
" xml

You get the picture: lots of little modules, each in their own directories or directory trees.

Way back when I started, I put multiple header files in the same directory, like include/debug.h, include/lfile.h. I also created "utility" files like ag.h and libag.h. Over time I learned that it was best to put each module, each header file, in a directory of its own. Such a directory gives you a place to put tests, etc. This leads to a pattern of repeating the directory name in the include name: #include "libag/test/test.h"

Above I have excerpted "header-only" libraries. I also have some C/C++ libraries that need to be compiled, but I have learned that "header-only" libraries, which only need to be #included, are a lot more convenient to use.

Some of my libraries have interdependencies. For example, libag/test/test.hh includes ../code_location/codelocation.hh.
First thing to note: the use of includer-relative paths. If I simply copy (cp -R) libag/{test,code_location} into import/libag, and #include "import/libag/test/test.hh", everything works, because test.hh finds its neighbor relative to its own location.

Second, and most important thing to note: a typical user of some (but not all) of my libraries might want a pruned subtree:

" test
" code_location

Importing all of libag is just clutter.

I.e. they do not want to import just complete subtrees.

They want to import a pruned subtree - stuff that is in the parent directory, libag, as well as subdirectories such as libag/test and libag/code_location.

I no longer put actual library functions at such parent-root directories. (Alright, I still have some files there, libag/stralloc.h, mainly dating back before I decided to use the "directory per module" pattern.) But I still have metadata there - libag/README.libag, explaining library organization. I also have tools like SConstruct files and files that guide test jig runners, so that I can simply say "make test" or "scons test" in libag, and all the tests run. (My test runner does NOT require a list of all the subdirectories.)
I would like to keep some of that metadata in libag when it is checked out for use by other projects.
I.e. I want all of the file contents of libag, but only a subset of the subdirectories.
I say again: I want a PRUNED SUBTREE of the source.
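
To make "pruned subtree" concrete, here is a minimal sketch in Python of what such an import amounts to when done by hand, outside any version control tool. The function name copy_pruned_subtree and its parameters are made up for illustration; it assumes the subdirectories named actually exist under the parent.

```python
import shutil
from pathlib import Path


def copy_pruned_subtree(src_root, dst_root, keep_subdirs):
    """Copy the files directly in src_root, plus only the named subdirectories.

    Hypothetical helper: the name and signature are illustrative only.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)

    # Files that live directly in the parent (README, SConstruct, ...)
    # always come along.
    for entry in src_root.iterdir():
        if entry.is_file():
            shutil.copy2(entry, dst_root / entry.name)

    # Subdirectories come along only when explicitly requested,
    # each as a complete subtree.
    for name in keep_subdirs:
        shutil.copytree(src_root / name, dst_root / name, dirs_exist_ok=True)
```

Something like copy_pruned_subtree("libag", "import/libag", ["test", "code_location"]) would bring along README.libag and the other parent-level files, but none of the other module directories. Of course a plain copy loses the link back to the repository, which is exactly what I want the version control tool to keep.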


Really, what I want is a version control that makes it easy to share small modules. Not heavyweight subprojects, a la git. Not projects that I have to define in advance, like git or CVSROOT/modules. I want really lightweight modules. Directory subtrees seem to be a very natural way of specifying submodules.
I.e. at the very least I want to be able to do "vcs co libag/test libag/code_location".
Unfortunately, this least thing that I want to do is barely supported by any of the modern version control tools (git, hg, bzr). cvs, ironically, supports it fairly well.

But I want to do better than this least.

Ultimately, I want automatic dependency tracking. I want to be able to say "vcs co libag/test", and have something automatically recognize that I need to check out libag/code_location as well. Perl CPAN does it - I want it everywhere.
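
For header-only modules that use includer-relative paths, the dependency information is already sitting in the #include lines. A rough sketch in Python, assuming one module per top-level directory of the library and quoted, relative includes only; the name module_dependencies is hypothetical, and this is not how CPAN or any existing tool does it:

```python
import re
from pathlib import Path

# Matches lines of the form: #include "some/relative/path"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def module_dependencies(lib_root, module):
    """Return the set of top-level module directories that `module` needs,
    including itself, following quoted includes transitively.

    Hypothetical helper: assumes each module is a directory directly
    under lib_root and that includes are relative to the including file.
    """
    lib_root = Path(lib_root).resolve()
    needed, pending = set(), [module]
    while pending:
        name = pending.pop()
        if name in needed:
            continue
        needed.add(name)
        for source in (lib_root / name).rglob("*"):
            if not source.is_file():
                continue
            text = source.read_text(errors="ignore")
            for inc in INCLUDE_RE.findall(text):
                # Resolve the include relative to the including file.
                target = (source.parent / inc).resolve()
                try:
                    rel = target.relative_to(lib_root)
                except ValueError:
                    continue  # points outside the library; ignore it
                # Only includes that land inside another module directory count.
                if len(rel.parts) > 1 and (lib_root / rel.parts[0]).is_dir():
                    pending.append(rel.parts[0])
    return needed
```

Given the libag layout above, asking for the dependencies of "test" should come back with both "test" and "code_location". It ignores angle-bracket includes, include search paths, and conditional compilation, so treat it as a starting point rather than a real dependency tracker.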

But I also want to be able to check out libag, with local files, and a pruned set of subdirectories.


OK, say I have gotten that far: I can check out just the modules I want.

Next, I want to be able to work in a single workspace - and check things into multiple version control repositories.
E.g. I want to be able to use CVS to check changes made to my libraries back into my personal repositories.
But the other projects that are importing my stuff also want to check things into their repositories. E.g. keiko/imprt/libag/test/test.hh gets BitKeepered, as well as CVS'ed.

Years ago, I gave the CVS community patches to allow this: I allowed there to be multiple CVS metadata directories, e.g. CVS1 and CVS2, and I allowed the user to specify which to use, CVS1 to check into repository 1, CVS2 for repository 2.

I've done similar kluges with BitKeeper and Subversion. E.g. I have checked CVS subdirectories, containing CVS/Root, CVS/Repository, and CVS/Entries files, into BitKeeper - so that a bk clone or bk pull can always be kept in synch with CVS.
It is hard to go the other way: distributed version control systems such as git and BitKeeper, that clone an entire repository, *can* be checked into other VC systems, but don't behave nicely, wrt disk space and other issues. The problem: they do not clearly separate the concept of repository and workspace and links between the two.
AMD's "buildlist" methodology - basically a list of all files, and their RCS/CVS versions - lent itself well to this.
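
To illustrate the general idea of a list of files and their versions (this is my own sketch in Python, not AMD's actual tool or file format), here is how one might derive such a list from the CVS/Entries files in a checked-out workspace. The function names are hypothetical; the only thing taken from CVS itself is the Entries line layout, /name/revision/timestamp/options/tagdate for files, with directory lines starting with D.

```python
from pathlib import Path


def parse_entries(entries_text):
    """Yield (filename, revision) for each file line of a CVS/Entries file."""
    for line in entries_text.splitlines():
        # File lines look like "/name/revision/timestamp/options/tagdate".
        # Directory lines start with "D" and are skipped here.
        if not line.startswith("/"):
            continue
        fields = line.split("/")
        if len(fields) >= 3:
            yield fields[1], fields[2]


def build_list(workspace):
    """Return a sorted list of (path relative to workspace, revision) pairs.

    Hypothetical helper: walks every CVS/Entries file under the workspace.
    """
    workspace = Path(workspace)
    result = []
    for entries in workspace.rglob("CVS/Entries"):
        # CVS/Entries describes the directory that contains the CVS directory.
        directory = entries.parent.parent.relative_to(workspace)
        for name, revision in parse_entries(entries.read_text()):
            result.append(((directory / name).as_posix(), revision))
    return sorted(result)
```

A flat list like this is small, is plain text, and can itself be checked into any other version control system, which is why the approach coexists so easily with a second repository.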


Reading through some of the interminable "Bzr versus Git versus Hg" comparisons, I see people asking "Who cares about history? You only need the last few years of changes."
Well, I care about history. I have been accumulating code snippets, library functions, and tools for more than 20 years. Some may date back to 1980. I probably have SCCS (!) files dating back to 1985-1987, copied and/or checked into RCS, and then CVS. There are some big gaps in the record, corresponding to periods where changes I made to my libraries were owned by companies, not me --- but I have a long history. Not very dense - some of these tools may sit unaffected for years, e.g. I am just now brushing off PerlSQL - but long lived.
But why care about this history? Well, I *have* on several occasions had to back up to 10 year old versions of some of my library functions, when the flavors of UNIX/Linux changed under me.

Sunday, April 06, 2008

More Toshiba TabletPC Woes - the Continuing Story

I still don't have a working Toshiba M400 TabletPC - more than 4 months after I purchased it. I purchased it in part to do my taxes and accounting on, but instead I had to use my wife's HP Pavilion tx1420us Entertainment Notebook PC - another Vista TabletPC.

I swear, I will never again buy a machine sight unseen over the Internet!!!! And I will return the machine the moment there is a problem, within the return deadline. Unfortunately, I bought this machine for Xmas, and did not open it up within the requisite 15 days.

I am tempted to say that I will never again buy a Toshiba - but I had a great experience with my last Toshiba TabletPC, a Portege 3505.

I really wish that I could return this new Toshiba TabletPC. Preferably with a refund. Or replace it with a new comparable model of comparable price - although after this experience I am shy of the top end Toshiba/Core 2 Duos. Or even replace it with a working PC of the same model.

I hope that Toshiba will eventually get it all working --- but even so, I have not had any use of it for 4 months, and I would be buying a different model. (E.g. delay repairing a new computer, and the value of the repaired computer plummets.)

I probably just have a lemon. I hope it gets fixed by this trip to the repair depot, as opposed to sending it to the local repair shop.


The latest installment: Toshiba authorized return to depot for service. But the shipping container did not arrive before I left for spring break vacation, and there was a 10 day timeout. Reset by a phone call.

One thing that is becoming obvious: I don't have time to deal with this. With work, plus personal computer work using my wife's working computer (taxes, etc.) the times I can pay attention to this Toshiba are quantized, at particular weekend days or timeslots.


New fillip: after reinstalling the OS, per Toshiba's recommendation, I now am asked for a power-on password. I am unsure whether it is a hard disk password or a BIOS password, or what.

No matter - I don't think I set such a password. I wonder if the power or keyboard problems I suspect are plaguing this machine are causing this password manifestation?

I will just have to rely on the depot service guys zorching the whole thing.