Keeping House

With any non-trivial software project, there are a variety of housekeeping tasks oriented around keeping development going and maintaining quality.  Housekeeping work needs to be kept in balance–too much focus on it and the project makes no progress; too little and the product becomes difficult to maintain and loses quality.  I signed onto the MIT Kerberos Consortium last October with the understanding that I would focus on these housekeeping issues, although I spend a fair amount of time on feature work.  Here are some of the areas I think about, without trying to draw too much focus away from user-visible improvements:

  • Code style consistency: MIT Kerberos has been in development for decades, and historically there has not been a strong focus on consistency of formatting and idioms.  We have agreed on a set of rules for new work; the challenge now is converting our large body of existing code to follow these principles.  For this release cycle we have decided to do a “great reindent”–basically, to eliminate the use of tabs in our C sources and ensure that all code use four-space indents.  That’s only a small part of the style principles, but it’s one of the most noticeable parts.
  • Function size: Ideally, most functions in MIT krb5 would fit within a screenful of text and be understandable at a glance.  Unfortunately, many of our functions have accreted over time to huge lengths, such as the 615-line krb5_get_init_creds().  It is hard to change functions like these with confidence since they are so complicated.  At some point, I would like to spend some time decomposing these monster functions into smaller parts.  Going forward, we can learn from this problem–when adding to the work a function performs, use a helper function rather than open-coding a complicated series of steps into an already moderate-sized function.
  • Naming: At times, krb5 developers have been lazy about choosing intuitive names.  When a TGS request arrives at the KDC (i.e. when a client requests to acquire service tickets using its ticket-granting ticket), the KDC code invokes dispatch(), which invokes process_tgs_req(), which invokes kdc_process_tgs_req() and then does a whole lot of other stuff.  The latter two function names are essentially equivalent, making it unclear how labor should be divided between them.  A related problem is consistenct of abbreviations; for example, in the kadmin library, we have filenames like lib/kadm5/srv/svr_policy.c and lib/kadm5/srv/server_dict.c.  It is hard to even type one of these filenames when the word “server” has to be mentally translated into the keystrokes svr, srv, or server at different times. I would love to spend some time improving this situation at some point.  Going forward, we should pay attention to quality of naming when reviewing contributions and new work.
  • Code documentation: the biggest problem here is that we don’t have krb5 API documentation.  We have plans to use doxygen for this but we haven’t yet done the work.  We also don’t have a culture of documenting our internal function contracts; you generally just have to look at the function name and arguments and guess what it’s supposed to do, or read the code to find out what it actually does.  Changing that in a code base of this size is a pretty massive undertaking, which I hope to make a dent in at some point.  For new code, we should insist on at least a brief comment before each function explaining its purpose.
  • Regression testing: I could talk about testing for a long time, and I will do so in a later blog post.  The good news is that we have a fairly large test suite in krb5.  The bad news is that it isn’t comprehensive, much of it relies on technology which has fallen out of favor (dejagnu, and therefore tcl and expect), it isn’t always easy to add tests for new code, and it isn’t always easy to debug test failures.
  • Portability testing: Thanks to Ken Raeburn, we have some infrastructure for regularly running regression tests on a variety of platforms.  Unfortunately, we have not been spending enough time debugging the resulting test failures.  Some of them are amusingly obscure–for instance, a recent one was that ftpd was sending replies to the client in the wrong order if the server-side user doesn’t have a valid home directory, as was true of one of the test accounts–but we aren’t taking advantage of the testing infrastructure if we aren’t getting to a clean baseline.
  • Interoperability testing: We don’t do enough of this.  Events like the Interoperability Testing Workshop are a great opportunity to test new features against other implementations, but we should also be testing basic functionality against other implementations on a regular basis.  To accomplish this, we probably want to set up static infrastructure based on other implementations (such as a read-only Heimdal or Active Directory KDC) and create a separate suite of functional tests which invokes our code against that infrastructure.  We could also make more use of gssMonger.
  • Static analysis: We make use of Coverity, a commercial product, to scan our code for potential defects.  It tends to find a lot of unimportant problems in old, well-tested code, but is pretty good at uncovering real problems in new code.  Unfortunately, I have had limited time to clean up the defects it finds in the current krb5 code base (although I have mostly purged libkrb5 of defects since I signed on).
  • Build system: Like most open source C projects, we use autoconf for portability (but not automake or libtool) and make for the build itself.  I think autoconf is approaching the end of its useful life.  It’s good at ensuring portability to a lot of Unix systems few people care about, and bad about ensuring portability to Windows.  Autoconf tests are written in an obscure macro language (m4) on top of least-common-denominator Bourne shell; it’s rare to find a developer with the skill set to work with this, and rarer to find a developer who wants to.  Given infinite time, I would love to rework the krb5 build system using scons, which I think is the most promising successor to autotools and make.  (Update 8/25: After more reading, I think we would need to consider more carefully which cross-platform build tool to use, if we were to do this.)
  • Version control: We use Subversion, which is an industry standing for open source projects and mostly does a fine job of maintaining our project history.  However, it’s not very good at merging or distributed development.  At some point it might be worth considering a transition to git, which has probably been gaining the most traction among DVCS tools in the open source world.  Currently Subversion hasn’t reached the pain point to make this transition a priority.
  • Bug tracking: Lest I sound too negative about everything, I think we have a perfectly satisfying bug tracking infrastructure using RT.  I wish we could keep up with the tickets better, but that’s a universal constant in open source work and probably in software work in general.
  • Release management: The accepted practice when I signed on was to have a major release (1.x) every 18 to 24 months, with a three-month testing period, and patch releases (1.x.y) as necessary.  Such a long release schedule makes it difficult to avoid the temptation to cram new features into the release during the testing period or in patch releases.  For the future, we have agreed to use a 6 to 12 month release schedule with a shorter testing period.
  • Project management: We use a pretty informal project management discipline defined here.  We made some modifications to it recently to hold more of the discussion of projects on the mailing list and less on the wiki.  By and large I like how it works, and don’t plan to make any changes.

As you can tell, it would be easy to spend all of my time on these issues alone, and probably all of several other people’s time as well.  For better or for worse, we don’t have that luxury.  If you have opinions about how we should be prioritizing the time we spend on code quality or on approaches we should take, your comments are welcome.

Leave a Reply

You must be logged in to post a comment.