The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

See http://docs.google.com/View?id=dcxddbtr_23cg5thdfj for photo credits.

Tuesday, September 25, 2012

Improving Test Pass Rate and Correlation

I prefer to do TDD, Test Driven Design - heck, I often prefer the original term, Test First Programming.

It's nice to write a test that fails, write the code, pass the tests including the new one (getting a green bar on your testing widget, or a count of tests failed=0 if not so GUI), rinse, lather, and repeat.

Sometimes I write a bunch of tests in advance, but comment, ifdef, or if out all but the first test that I do not expect to pass.  Get the green bar of all current tests passing, enable an already written test, and repeat.  This gives me an idea of where I am going overall, and keeps me from getting bogged down as tests proliferate for minor details.

It is important, useful, good, to incrementally enable such tests, rather than enabling them all at once.

a) It's nice to get the green bar when all tests pass.  Positive reinforcement.  If a whole slew of tests are written in advance and are mostly failing, it can be too easy to lose sight of the progress you are making.  Or to go backwards without noticing - e.g. 2 tests start working and 1 fails, looks like +1 started working, so you may miss the new failure.

b) on a more mundane level, this allows you to sketch out tests, in languages like C++, that do not even compile yet.  It is a waste of time to write a whole slew of tests in advance, spend ages getting them to compile, only to realize that there was a design flaw along the way that becomes evident as you get some earlier tests to run.
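A minimal sketch of that incremental-enable workflow, using Python's unittest, where a skip decorator stands in for commenting or ifdef'ing a test out. The `Stack` class and the test names are hypothetical, made up purely for illustration.

```python
import unittest


class Stack:
    """Hypothetical class under development; only push/pop exist so far."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class StackTests(unittest.TestCase):
    def test_push_then_pop(self):
        # Enabled: this is the test currently being driven to green.
        s = Stack()
        s.push(1)
        self.assertEqual(s.pop(), 1)

    @unittest.skip("sketched in advance; enable once push/pop is green")
    def test_peek_leaves_item(self):
        # Stack.peek does not exist yet, so this would fail if enabled.
        s = Stack()
        s.push(1)
        self.assertEqual(s.peek(), 1)
        self.assertEqual(s.pop(), 1)


def run_suite():
    """Run the tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StackTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Removing the skip decorator is the "enable the next test" step. Unlike an ifdef in C++, a skipped Python test must still parse, but it is free to call methods that do not exist yet.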

OK: TDD good.  Keeping tests mostly running, good.


But unfortunately sometimes we work in less Agile organizations.  Or in Agile teams, coupled to an ordinary QA team.  The sort of QA team that writes a thousand test cases in the first week, and then goes off and does something else while you take months to implement.  (It is especially easy to write such test cases if there is a random pattern stress tester, or some automation that creates many similar tests, slightly different.  A maze of twisty little tests, all slightly different.)

Worse: the QA team may be working concurrently. So, yesterday you had 100 tests passing, 900 failing.  Today you have 110 passing, 990 failing?  Is that an improvement, or did they just write some tests that test already implemented features?
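One way to answer that question is to compare runs by test name rather than by counts. Here is a rough sketch, assuming each run can be reduced to a dict mapping test name to pass/fail; the function name `compare_runs` and the result format are my own invention, not any particular test framework's.

```python
def compare_runs(previous, current):
    """Compare two test runs.

    Each run is a dict mapping test name -> True (passed) or False (failed).
    Returns a dict of sorted name lists describing what changed.
    """
    common = previous.keys() & current.keys()
    added = current.keys() - previous.keys()
    return {
        # Failed before, passes now: real progress.
        "fixed": sorted(t for t in common if not previous[t] and current[t]),
        # Passed before, fails now: a regression hiding in the totals.
        "regressed": sorted(t for t in common if previous[t] and not current[t]),
        # Tests that did not exist in the previous run.
        "new_passing": sorted(t for t in added if current[t]),
        "new_failing": sorted(t for t in added if not current[t]),
        # Tests that disappeared since the previous run.
        "removed": sorted(previous.keys() - current.keys()),
    }
```

With this, "10 more passing" splits into tests you actually fixed versus freshly written tests that happen to exercise already implemented features, and a regression shows up by name even when the totals went up.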

It is depressing to have a thousand tests failing. And to have the pass/fail stats vary almost randomly.

Beyond depressing, it doesn't help you decide what to work on next.  Some of the test failures are expected - unimplemented features.  Some may be the feature you are coding now - higher priority.  Some are for features that were working, but which regressed - possibly higher priority to fix.

Faced with such an amorphous mass of tests, it seems a good idea to carve out a subset that is known good, and grow that.
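A sketch of what "carve out a known good subset and grow it" might look like, under the assumption that results are again a dict of test name to pass/fail. The name `update_known_good` and the promotion rule (only grow the set when nothing in it is broken) are my own choices for illustration.

```python
def update_known_good(known_good, results):
    """Check the known-good subset against a run, and grow it if it is clean.

    known_good: set of test names that are required to pass.
    results:    dict mapping test name -> True (passed) or False (failed).

    Returns (new_known_good, broken). A known-good test that is missing
    from the results counts as broken.
    """
    broken = sorted(t for t in known_good if not results.get(t, False))
    if broken:
        # Red bar: fix the regressions before promoting anything new.
        return set(known_good), broken
    # Green bar: every test that passed joins the known-good subset.
    passing = {t for t, passed in results.items() if passed}
    return set(known_good) | passing, []
```

The known-good subset plays the role of the green bar: it is either all passing or it is not, no matter how many of the other thousand tests are still failing.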

Also, to do a preliminary triage or failure analysis, and create a simple automated tool that classifies the failures according to expected fail/pass, type of error, etc.   This way, you pick a class of failing tests, write the code to make them pass, and then repeat - always choosing a class of failing tests for which the code is incrementally easy to write, rather than arbitrarily choosing a test that may require much blind coding before it can pass.
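Such a classifier does not need to be fancy. Here is a minimal sketch, assuming each test result is a (name, passed, message) tuple and that you keep a hand-maintained set of expected failures. The error patterns and bucket labels are hypothetical; real ones would come from reading your own failure logs.

```python
import re
from collections import defaultdict

# Hypothetical error signatures, checked in order; first match wins.
ERROR_PATTERNS = [
    ("timeout", re.compile(r"timed? ?out", re.IGNORECASE)),
    ("assertion", re.compile(r"assert", re.IGNORECASE)),
    ("crash", re.compile(r"segfault|core dumped", re.IGNORECASE)),
]


def classify(name, passed, message, expected_fail):
    """Return a bucket label for a single test result."""
    if passed:
        # A pass on a test listed as an expected failure deserves a look:
        # either the feature landed or the test is not testing much.
        return "unexpected pass" if name in expected_fail else "pass"
    if name in expected_fail:
        return "expected fail"
    for label, pattern in ERROR_PATTERNS:
        if pattern.search(message):
            return "fail: " + label
    return "fail: unclassified"


def triage(records, expected_fail):
    """Group test names by bucket.

    records: iterable of (name, passed, message) tuples.
    expected_fail: set of test names for unimplemented features.
    """
    buckets = defaultdict(list)
    for name, passed, message in records:
        buckets[classify(name, passed, message, expected_fail)].append(name)
    return dict(buckets)
```

The buckets are then the "classes of failing tests" to choose from: pick one, make it pass, move its tests off the expected-fail list, and rerun.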

Run that regularly.  Use it to guide your work.

Or, at least: I'm using it to guide MY work.

And, along the way... yes, I'll also do some TDD test-first of my own. In my experience, the QA tests often stress the system, but often do not test the most basic features.
