OpenSolaris DSCM Evaluation: Bzr (Interim Report)

1. Introduction

bzr, or Bazaar-NG, is a distributed source code management system being developed as a successor to Bazaar 1.x. Work on bzr is funded by Canonical; bzr is released under the GPL.

2. Version used

bzr 0.7 was used for the evaluation.

3. Requirements

This section will look at the requirements as listed in the Distributed Source Code Management Requirements, version 1.4.

3.1 E0 - Open source

bzr is open source, and available under the GPL, version 2.

3.2 E1 - Unbiased and disconnected distribution

bzr can operate in a disconnected fashion. With the bzrtools extensions, bzr gains a push implementation; the core tool supports only pull by default.

3.3 E2 - Networked operation

bzr supports a number of network transports. sftp can be used for push and pull.

3.4 E3 - Interface stability and completeness

bzr has a plugin interface. Popular plugins have migrated into the core as their value is demonstrated.

bzr has supported upgrades across storage format changes. The development site suggests that the current format will be long-lived, but the developers are also exploring alternate formats that would be more efficient.

3.5 E4 - Standard operations and transactions

History-preserving rename is supported. Merge across rename is supported. Deletion of files is supported.

3.6 E5 - Per changeset metadata

Generic handling of versioned metadata is still an architectural goal. The tested version does support GPG-based signing of revisions.

3.7 C6 - Ease of use

bzr, because of its primarily Python-based implementation, is straightforward to install. For use of the sftp transport, the additional modules paramiko and pyCrypto are required; these modules are produced by members of the Python community separate from the bzr effort.

bzr's CLI is consistent and generally easy to understand. The command has reasonably detailed internal help text. bzr with bzrtools has a number of advanced commands that may cause confusion; in particular, the merge/remerge/commit/uncommit/resolve/shelve subcommand complex encompasses a set of operational choices that could result in user vertigo during synchronization.

3.8 C7 - "No dedicated server" operational mode

bzr does not have a server program.

3.9 C8 - Tool community health

The bzr community appears to have active core developers and contributors.

3.10 C9 - OpenSolaris community implementation expertise

No bzr-specific expertise is known within the community, although a number of community members have Python experience. Due to participation in other OSS communities, that number is increasing.

3.11 C10 - Interface extensibility

The post_commit hook allows a Python function, presumably delivered in a plugin, to be executed after each commit. (Because push is serverless, notification operations like those we use today appear to require a child repository that follows the integration repository and executes the hooks.) A mail-on-commit operation is another possibility.
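As an illustration of the mail-on-commit idea, here is a minimal sketch of what such a hook function might do. The function names, the argument list, and the address are all hypothetical; this does not use bzrlib, and the real post_commit hook signature in bzr may differ. The sketch only builds the notification message and does not send it.

```python
from email.message import EmailMessage


def build_commit_mail(committer, revno, message, to_addr):
    """Build (but do not send) a notification mail for one commit."""
    mail = EmailMessage()
    mail["From"] = committer
    mail["To"] = to_addr
    # Only the first line of the commit message goes into the subject.
    stripped = message.strip()
    first_line = stripped.splitlines()[0] if stripped else "(no message)"
    mail["Subject"] = "commit r%d: %s" % (revno, first_line)
    mail.set_content(message)
    return mail


def post_commit(committer, revno, message):
    # Hypothetical hook entry point: the arguments are assumptions,
    # not the interface that bzr actually passes to its hooks.
    return build_commit_mail(committer, revno, message, "commits@example.org")
```

A plugin would hand the resulting message to whatever mail transport the hosting environment provides.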

3.12 C11 - Transactional operations and corruption recovery

bzr attempts to maintain transactional integrity. The 'check' subcommand is used to verify history consistency; I did not evaluate whether check would catch all possible forms of corruption.

3.13 C12 - Content generality

Binary files are supported. An interesting aspect of bzr's implementation is that it uses Unicode internally, with repositories being stored in UTF-8.

3.14 O13 - Partial trees

Partial trees are not supported.

3.15 O14 - Per-file histories

Per-file histories are not supported. The log subcommand can be used to examine the revisions affecting a specific file or files.

4. Evaluation

4.1 Test hardware used

psrinfo -v output:

The physical processor has 1 virtual processor (0)
x86 (AuthenticAMD family 15 model 5 step 8 clock 2189 MHz)
AMD Opteron(tm) Processor 248
The physical processor has 1 virtual processor (1)
x86 (AuthenticAMD family 15 model 5 step 8 clock 2189 MHz)
AMD Opteron(tm) Processor 248

uname -a output:

SunOS muskoka 5.11 snv_35 i86pc i386 i86pc

Tests were run on a local 37G mirrored UFS filesystem.

4.2 Test results

4.2.1 Speed

First commit (bzr add + bzr commit) of the OpenSolaris source tree: 7m20s

Local clone of this repository: 23m02s - 25m31s over three runs.

Local commit of one file in the repository: 16m47s.

Operations in general appear to require a traversal of the entire tree, as mentioned in the preliminary findings.

In one test, bzr branch exhausted the available swap on the system. After extending a resident-set-size tracking tool to monitor the combined RSS of all progeny of the initial command, I compared the RSS usage of three DSCMs: bzr, Mercurial, and TeamWare. bzr had a peak RSS of 144.6MB, roughly 3.6 times Mercurial's peak of 40.5MB and 2.8 times TeamWare's peak of 51.9MB.
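The tracking tool itself is not included in this report. As a rough sketch of the idea (summing RSS over a command and all of its progeny), the following assumes process data in the three-column form printed by `ps -e -o pid,ppid,rss`, with RSS in kilobytes. The function names are illustrative, and the real tool may work quite differently.

```python
from collections import defaultdict


def parse_ps(text):
    """Parse 'PID PPID RSS' output into (pid, ppid, rss_kb) tuples."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) == 3:
            rows.append(tuple(int(f) for f in fields))
    return rows


def tree_rss_kb(rows, root_pid):
    """Sum RSS (KB) over root_pid and every descendant process."""
    children = defaultdict(list)
    rss = {}
    for pid, ppid, rss_kb in rows:
        children[ppid].append(pid)
        rss[pid] = rss_kb
    total = 0
    stack = [root_pid]
    seen = set()  # guards against a process listed as its own parent
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        total += rss.get(pid, 0)
        stack.extend(children.get(pid, []))
    return total
```

Sampling this total periodically while the command runs and keeping the maximum gives a peak figure comparable to those quoted above.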

4.2.2 Conflict resolution

A test harness was used to test the following conflict scenarios:

  • Two users each have a clone of a central repository. Both make a different change to the same line of the same file.

    bzr detected the diff3 conflict. After manually resolving the conflict, the second user was able to commit and push.

  • Three users each have a clone of a central repository. Two of them move the same files to different locations. The third user renames one of the files in its original directory. All then do a commit and a push.

    bzr detected the conflict. I assume that there is a resolution procedure, but this case (rename conflict) is not well covered by the documentation.
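The first scenario is the classic three-way (diff3-style) conflict. The following is a minimal sketch of the rule involved, restricted to in-place line edits where all three versions have the same number of lines. It is not bzr's merge algorithm, and the function name and conflict marker text are illustrative.

```python
def merge3_lines(base, ours, theirs):
    """Line-by-line three-way merge for equal-length inputs.

    Returns (merged, conflicts), where conflicts lists the indexes of
    lines that both sides changed in different ways.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (changed identically or not at all)
            merged.append(o)
        elif o == b:      # only their side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both changed it differently: conflict
            conflicts.append(i)
            merged.append("<<<<<<< ours\n%s\n=======\n%s\n>>>>>>> theirs" % (o, t))
    return merged, conflicts
```

With a base of `["a", "b", "c"]`, one user changing line 2 to `"B1"` and the other to `"B2"` produces a conflict at index 1, which must be resolved by hand before the commit and push can proceed.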

4.3 Source code

The core of bzr consists of 20,409 lines of Python. The implementation appears to be broken up into sensible modules. There appears to be a commitment to testing at the module level, with coverage for most of the .py files in bzrlib.
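The counting method behind the line total is not recorded here. One simple way to reproduce a figure of this kind is sketched below, assuming every physical line in every .py file under a directory is counted, blank lines and comments included. The function name is illustrative.

```python
import os


def count_py_lines(root):
    """Count physical lines in all .py files beneath root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                # Read as bytes so files in any encoding can be counted.
                with open(os.path.join(dirpath, name), "rb") as fh:
                    total += sum(1 for _ in fh)
    return total
```

Pointing this at a bzrlib checkout gives a comparable number, though the result depends on whether test modules and bundled support code are included.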

The support modules and bzrtools were not examined.

6. Conclusions

bzr appears to be a distributed SCM that has made sound choices regarding algorithms and representation, but is still progressing towards implementation maturity. A repository of the size of the OS/Net consolidation appears to present resource challenges, as local operations require significantly more time and more memory than other DSCMs. The use of sftp for push is not appealing, as it leaves a hosting environment with little choice but to offer sftp access in a general sense. Even a limited server program would greatly constrain the set of actions open to an uninformed or malicious committer.

7. Todo:

  • Storage consumption as yet unexamined.
  • Although we haven't used C elementtree, the timing does not appear to suggest CPU as the scarce resource.