OpenSolaris

OpenSolaris DSCM Evaluation: Mercurial

A. Tool Information

Mercurial version 0.8.

B. Configuration Details

1. CPU architecture and machine specifics

kosh$ psrinfo -v
Status of virtual processor 0 as of: 03/30/2006 14:15:16
  on-line since 03/09/2006 16:44:59.
  The i386 processor operates at 1600 MHz,
        and has an i387 compatible floating point processor.
kosh$ uname -a
SunOS kosh 5.11 snv_35 i86pc i386 i86pc
    

2. Memory available

Memory size: 1024 Megabytes
    

3. Size of the repository (space)

641MB, including working directory. This repository mirrors ON's internal Teamware workspace as of 2006-04-11 16:52:19Z and has history back to build 18 (shortly after the OpenSolaris launch).

4. Number of files in the repository

43,311.

5. Average number of deltas per file

1.3.

More details: 1,884 changesets, 57,173 changes. (57173/43311=1.3)

C. Evaluation Areas

1. Operation Functionality

(Refer to the Requirements Document for detail.)

e1. unbiased and disconnected distribution

Mercurial implements a model of independent repositories, though a repository can be configured to have a de-facto parent.

Mercurial supports updates between two repositories with a common ancestor. If child-2 pulls from child-1 and has one or more changesets that aren't in child-1 (a common scenario in ON development), child-2 will have to do an "update -m", even if the changes are non-conflicting. Unlike Teamware, the merge must then be explicitly committed. (This is a feature, since it means you can test the merge before committing it, but it seems likely to cause confusion for Sun engineers. If we use Mercurial Queues for most development work, this issue probably goes away.)

The "disconnected-use" requirements are all satisfied.

e2. networked operation

As mentioned in the requirements document, Mercurial supports remote access via ssh. Pulls can also be done over HTTP. NFS can, of course, also be used for pushes and pulls.

e3. interface stability and completeness

storage

The storage representation appears to be well-documented. Certainly there's more information on the Mercurial website than we've been able to find for SCCS's storage representation.

Mercurial's storage representation does not use Unix i-numbers, so snapshots such as those provided by ZFS or Network Appliance filers should not cause problems.

A ZFS snapshot can be used as the source for a Mercurial clone operation.

At least some of the on-disk data structures do not appear to be versioned. This is a potential hazard. At least one storage representation change, "RevlogNG", is planned for Mercurial 0.9, and it is expected to address the versioning issue.

On-disk data structures are binary files, but I had no problems using the same repository from both SPARC and x86 systems. Binary files give improved performance, but in the unlikely event that manual repairs are needed, we'll need a binary editing program.

Mercurial does not provide its own access control mechanism for controlling access to subtrees within a repository. While it might be possible to restrict user access to certain subtrees using filesystem ACLs, it would probably be better to use various pre-operation hooks (e.g., pretxnchangegroup) to implement that sort of control. Bryan O'Sullivan reports that there is a recipe in the wiki for doing access control over ssh, without needing to create a separate user account for everyone.
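
The hook-based approach could look roughly like the following. This is only a sketch of the policy check, not a Mercurial API: the ACL table, the user names, and the functions rejected_paths and hook_status are all hypothetical, and a real pretxnchangegroup hook would still have to obtain the user and the list of changed files from Mercurial itself.

```python
# Hypothetical policy: each user may change files under these prefixes only.
ACL = {
    "alice": ["usr/src/cmd/"],
    "bob": ["usr/src/uts/", "usr/src/lib/"],
}


def rejected_paths(user, changed_files, acl=ACL):
    """Return the changed files that `user` is not allowed to modify."""
    allowed = acl.get(user, [])
    return [
        path
        for path in changed_files
        if not any(path.startswith(prefix) for prefix in allowed)
    ]


def hook_status(user, changed_files, acl=ACL):
    """Return 0 to accept the incoming changesets, 1 to reject them."""
    return 1 if rejected_paths(user, changed_files, acl) else 0
```

With this table, a change by "alice" to usr/src/cmd/sort/Makefile.com would be accepted, while a change by an unlisted user would be rejected. A non-zero status is what would cause the hook to abort the operation.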

command-line interface, hooks

The command-line and hook interfaces appear to be adequately documented.

One nit: the current documentation appears to reflect the current development version of the code, rather than the most recent release[1]; there is nothing in the documentation to clarify what version it applies to. If OpenSolaris uses Mercurial, we may wish to place snapshots of the code and documentation on opensolaris.org to avoid confusion.

The hook infrastructure invokes the named hook(s) with a few tokens such as the changeset ID passed in via the environment. This means that the hook may need to invoke various Mercurial commands to find out more about the changeset. A potential issue is that it may not always be possible to get the desired information back from the existing Mercurial command-line interfaces. For example, "hg log" gives the old and new names of a renamed file, along with the names of any other files involved in the changeset, but it can't tell you that file "foo" was renamed to "bar". Fortunately, all the known examples of this problem are considered bugs and have fixes planned.
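
As a rough illustration of the extra work a hook has to do, the sketch below shows the core of an external hook written in Python. It assumes the changeset ID arrives in an environment variable named HG_NODE (the name is an assumption here; check hgrc(5) for the release in use). The function names build_log_command and run_hook are hypothetical.

```python
import subprocess


def build_log_command(env):
    """Return the "hg log" command for the changeset named in env, or None.

    HG_NODE is an assumed variable name; adjust it to match the hook
    documentation for the Mercurial release in use.
    """
    node = env.get("HG_NODE")
    if not node:
        return None
    # -v asks for the full comment and file list, -r selects the changeset.
    return ["hg", "log", "-v", "-r", node]


def run_hook(env, runner=subprocess.call):
    """Look up the changeset and return an exit status for the hook.

    `runner` is injectable so the logic can be exercised without hg.
    """
    cmd = build_log_command(env)
    if cmd is None:
        # No changeset ID was passed in, so there is nothing to inspect.
        return 1
    # The hook's exit status is whatever the "hg log" invocation returns.
    return runner(cmd)
```

A wrapper script would finish with something like sys.exit(run_hook(os.environ)). Note that the hook still depends on "hg log" exposing the information it needs, which is exactly the limitation described above.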

Bryan O'Sullivan reports that the Mercurial team is also considering support for Python hooks that would run in the Mercurial process.

Lock-reentrancy does not appear to be a problem. Mercurial does not need read locks, and hooks are currently limited to only doing read operations.

At least one unexpected behavior was noted while testing: pushing a changeset from repository A to repository B caused A's commit hook to fire. This appears to be a known bug.

network protocol(s)

There is some documentation on the network protocol, though it's a bit sketchy. The protocol is versioned.

e4. standard operations and transactions

Mercurial operates on files. Rename or remove of a directory is translated into a rename or remove on the directory's contents.

Rename is implemented as copy and delete. More work is needed here: merges don't track renames, and rename conflicts are not detected.
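
To make the file-level model concrete, here is a sketch of how a directory rename could be expanded into per-file copy and delete steps. It illustrates the behavior described above and is not Mercurial's code; the function name and the tuple format are hypothetical.

```python
def expand_directory_rename(tracked_files, old_dir, new_dir):
    """Translate a directory rename into per-file (copy, delete) steps.

    Returns a list of ("copy", old, new) and ("delete", old) tuples,
    one pair for each tracked file under old_dir.
    """
    old_prefix = old_dir.rstrip("/") + "/"
    new_prefix = new_dir.rstrip("/") + "/"
    steps = []
    for path in sorted(tracked_files):
        if path.startswith(old_prefix):
            # Keep the path relative to the directory, change the prefix.
            target = new_prefix + path[len(old_prefix):]
            steps.append(("copy", path, target))
            steps.append(("delete", path))
    return steps
```

Because each file is handled independently, nothing in this model records that the directory as a whole moved, which is one way to see why rename conflicts are hard to detect.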

Deleting a file and creating a new file with the same path is supported. The new instance includes the history of the old instance.

Mercurial supports the scenario where a file is deleted in one workspace, the deletion is backed out in the "gate" repository, and the file is edited concurrently in a second workspace. The backout was done via

$ hg revert -r lastrev
$ hg add file
$ hg commit -m "back out deletion of file"

where lastrev refers to the last changeset before the deletion.

The merge in the second workspace was a little messy: after the "update -m", the working copy of the file had the old and new versions of the file concatenated together.

Deleted files can still be referenced using the path and the -r option, for example

$ hg cat -r rev file
$ hg diff -r rev1 -r rev2 file

e5. per-changeset metadata

Mercurial associates a text comment with each changeset; this is added as part of the commit operation. The first line of the comment is displayed as a summary of the changeset for operations like "log", though the full comment can be displayed with the "-v" option. This is somewhat inconsistent with the current conventions for putbacks into ON; we'll want to think about what we want to do. Mercurial supports customization of "log" output, so that's one option.

c6. ease of use

Although I haven't built Mercurial from source, Bryan O'Sullivan reports:

"It builds out of the box on Solaris 10, once the system Python's Makefile is fixed to use gcc."

The primary interface is the hg command, with subcommands for the various operations. This is pretty standard.

The model is straightforward: you commit one or more changesets to the repository that you're working in, then you push or pull them to other repositories. Note that pushing (or pulling) a changeset updates the target repository, but updating the target's source tree is a separate step ("hg update").

Mercurial offers subcommands specifically for generating and accepting source patches.

Mercurial supplies an HTTP server, as well. This can be used for browsing and for pulls over HTTP.

Support for backouts: the revert subcommand can be used to back out all changesets back to a particular revision. Backing out a changeset after the files have been subsequently modified is less straightforward. There is currently an open feature request for this. In the meantime, you can generate a source patch for the changeset that you want to back out, then apply the patch in reverse using "patch -R".

By default, "hg status" lists files that aren't tracked in the repository (e.g., compiled binaries, editor backup files). While this is not peculiar to Mercurial, it's a change from Teamware, and it will generate an impossible level of noise in most real-life scenarios with ON. While doing "dmake clobber" will reduce the noise considerably[2], that is inconvenient for a tree the size of ON. Mercurial does provide mechanisms to filter out noise (e.g., status subcommand options, and .hgignore files), but it's not clear they can be used to give the desired results (show untracked source files and makefiles, but ignore all other untracked files).
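
One possible workaround is to post-process the status output rather than rely on ignore patterns alone. The sketch below assumes that untracked files appear in "hg status" output as lines beginning with "? ", and it uses a hypothetical, deliberately short list of source suffixes; a real list for ON would be considerably longer.

```python
import os

# Hypothetical list of suffixes that count as "source" for this sketch.
SOURCE_SUFFIXES = (".c", ".h", ".s", ".ksh", ".sh", ".py")


def is_source_like(path):
    """True for makefiles and for files with a known source suffix."""
    name = os.path.basename(path)
    if name.startswith("Makefile"):
        return True
    return name.endswith(SOURCE_SUFFIXES)


def untracked_sources(status_lines):
    """Keep only untracked ("? ") entries that look like source files."""
    result = []
    for line in status_lines:
        if line.startswith("? "):
            path = line[2:].strip()
            if is_source_like(path):
                result.append(path)
    return result
```

Feeding it the lines of "hg status" output would drop object files and editor backups while still reporting a forgotten new .c file or makefile. Whether this is preferable to a carefully written .hgignore is an open question.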

Files should be imported with read-write permission. Mercurial keeps track of the permissions, and it complains if you try to update a read-only file (e.g., after a push or pull).

merging

Mercurial's default for resolving conflicts is the hgmerge script. This script checks for the presence of various third-party conflict resolver programs, such as tkdiff. At least some of these programs (e.g., tkdiff) offer functionality that is comparable to what is available with Teamware and Filemerge. It should be easy to add Filemerge support to hgmerge if that's desired.

If hgmerge can't find any of the expected conflict resolvers, it falls back to using diff(1) and patch(1). If patch rejects any part of the diffs, hgmerge invokes an editor so that the user can manually repair the conflict.

While first experimenting with Mercurial, I found it very easy to get my repository into a state where it would keep complaining about "outstanding uncommitted changes", but it was hard to figure out how to get out of that state. (Answer: use "hg update -C".) This probably needs to go into a FAQ.

mismerges

The current version of hgmerge has some problems that make it easy to introduce mismerges. One of the Mercurial developers is looking at reducing the likelihood of mismerges. We may also wish to consider our own changes.

First, if hgmerge uses patch, we may want to force a review of the changes, even if the patch applies cleanly.

Second, the code for invoking the editor and determining whether the conflict was resolved is a bit brittle[3]. This may just be a bug that needs fixing. But we may also want a more explicit "yes I have resolved the conflicts" action from the user (which is something Subversion does).

There is at least one open issue in the Mercurial bug database related to failed merges. Resolving this issue may address the brittleness problem mentioned above.

intermediary snapshots

The current ON convention is that putbacks should not introduce SCCS deltas for intermediary snapshots or Teamware merges. This is achieved by using the redelget and reedit subcommands in wx. Similar functionality is available with Mercurial using the Mercurial Queue (mq) extension.

c7. no-dedicated-server mode

Mercurial can run without any server daemons. ssh support is handled by starting a remote, transient Mercurial server automatically, which communicates with the local system over the ssh connection.

c8. tool community health

Mercurial has an active developer community. At least one developer (Bryan O'Sullivan) has helped with our evaluation of Mercurial, and he is interested in helping to address issues that we have run into so far (e.g., rename).

There have been a few Solaris-specific problems with Mercurial. The Mercurial developers have been quick to respond. And in at least one case they installed Solaris so that they could troubleshoot the issue. Also, the developers have paid attention to larger issues, rather than applying a steady stream of band-aids. For example, to address the recurring shell-compatibility problems with hgmerge, one developer suggested rewriting hgmerge in Python.

c9. OpenSolaris community expertise

Mercurial is almost entirely written in Python; hgmerge is a shell script, though there is some talk of rewriting it in Python, too. It's unclear (to me) how much Python expertise there is in the OpenSolaris community. Fortunately, Python appears easy to learn, and the code tends to be more readable than, say, Perl. Commenting is a bit sparse, but I haven't had problems following the code.

c10. interface extensibility

Mercurial has a hooks mechanism as well as a documented extensions mechanism. Some hooks can abort the current operation.

c11. transactional operations and corruption recovery

Mercurial's state files are updated by appending, so corrupted files can be repaired by rolling back to a consistent set of files.
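
The idea can be illustrated with a small sketch. This is not Mercurial's implementation; it only shows why append-only files are easy to repair. If the length of each file is recorded before a transaction starts, an interrupted transaction can be undone by truncating each file back to its recorded length. The class, method, and file names are hypothetical.

```python
import os
import tempfile


class AppendJournal:
    """Remember file lengths so that appended data can be rolled back."""

    def __init__(self):
        self._lengths = {}

    def begin(self, paths):
        # Record the current size of every file the transaction may touch.
        for path in paths:
            self._lengths[path] = (
                os.path.getsize(path) if os.path.exists(path) else 0
            )

    def rollback(self):
        # Truncate each file to its recorded length, discarding anything
        # appended since begin().
        for path, length in self._lengths.items():
            if os.path.exists(path):
                with open(path, "r+b") as f:
                    f.truncate(length)
        self._lengths.clear()


def demo():
    """Append to a file, roll back, and return the surviving contents."""
    path = os.path.join(tempfile.mkdtemp(), "data.i")
    with open(path, "wb") as f:
        f.write(b"rev0")
    journal = AppendJournal()
    journal.begin([path])
    with open(path, "ab") as f:
        f.write(b"partial")  # simulate an interrupted write
    journal.rollback()
    with open(path, "rb") as f:
        return f.read()
```

In demo(), the partially appended data is discarded and only the original contents remain. This only works because earlier data is never rewritten in place.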

Signal handling (e.g., SIGINT) appears sensible.

I simulated a crash using SIGKILL during a clone operation and was able to get the workspace into a state where there were many missing files. That is, the files were in the repository, but they were not in the visible source tree. This experiment raised a couple issues that we'll want to look at more carefully.

  • the child repository was left with two lock files which would block some operations but not others. A push to the parent was not blocked. Either this is a bug or we need to understand Mercurial's use of locks better.
  • Mercurial seemed confused by the contents of the child. I edited a file in the workspace--README.opensolaris--and tried to commit it, but Mercurial said there were no changes. Mercurial let me clone the child, edit README.opensolaris in the second child, and push the changes to the first child. Trying to work with README.opensolaris in the first child was still troublesome--it was as though Mercurial did not realize that README.opensolaris was already in the repository. For operational simplicity, we'll want some sort of cleanup and repair script, with simple guidelines on when to run it.

There also appear to be a couple open issues related to locking[4].

c12. content generality

Mercurial supports binary files as well as text. However, the merge code appears to assume text files. We'll need to think about how to handle binary files.

o13. partial trees

Not currently supported, though there have been discussions about adding support for it.

o14. per-file histories

Changesets are for the entire repository, not per-file. But you can get a per-file history by specifying the file name with "hg log".

2. Storage

We didn't have any problems running out of storage or swap. No storage spikes were observed.

One thing that deserves further investigation is the storage consumed by the conflict tests in the test harness that we used. The first test introduces a content conflict in usr/src/cmd/sort/Makefile.com. The second test introduces a couple rename conflicts in usr/src/cmd/pwd. This led to a size increase of 2.3MB in the test repository, which seems excessive.

3. Performance

A local clone of the OpenSolaris ON 20060222 tree takes a couple minutes on the above hardware using ZFS; about twice that on UFS.

Using the repo with history, performance looks like this (times are in mm:ss):

operation                             Local (zfs)           LAN (NFS)             LAN (ssh)             WAN (Menlo Park-United Kingdom) (ssh)
clone (6 samples)                     2:55 (std dev: 1:00)  6:54 (std dev: 0:17)  4:01 (std dev: 0:08)  35:42 (std dev: 2:45)
local commit (1 file; 6 samples)      0:23 (std dev: 0:13)  0:17 (std dev: 0:09)  0:19 (std dev: 0:11)  0:19 (std dev: 0:10)
push (1 file, 1-3 deltas; 4 samples)  0:01                  0:02                  0:03                  0:06
pull (1 file, 1-3 deltas; 6 samples)  0:01                  0:01                  0:02                  0:06

D. Changes/Features Required/Desired

Must Have Initially

  • rename/merge integration
  • versioning of on-disk data structures (this should be in the upcoming RevlogNG changes)
  • a reliable mechanism for checking a tree (repository plus working directory) after a crash and making any necessary repairs (see Section c11 above)

Want Eventually

  • detection of rename conflicts
  • supported interface for showing renames and deletions in the log (planned for Mercurial 0.9)
  • better handling of failed merges (issue12 in the Mercurial issue database)

Notes

[1] For example, the 2006-03-22 version of the hgrc(5) man page lists hooks that were apparently added after 0.8, although 0.8 is the most recent release.

[2] Besides leaving editor backup files, our clobber builds leave some generated files behind.

[3] The relevant code is

$EDITOR "$LOCAL" "$LOCAL.rej" && test -s "$LOCAL.rej" || exit 0
    

If $EDITOR can't be invoked for some reason, we'll take the "exit 0", which indicates a successful merge.
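
To make the failure mode concrete, the short-circuit logic can be written out as a truth table. The sketch below is a Python model of the shell line, not the hgmerge code itself. The function names are hypothetical, and it assumes that falling through past the "exit 0" leads to hgmerge's failure path.

```python
def original_outcome(editor_ok, rej_nonempty):
    """Model of: $EDITOR ... && test -s "$LOCAL.rej" || exit 0

    The "exit 0" is skipped only when the editor ran successfully AND
    the reject file is still non-empty. Every other case, including an
    editor that could not be started, is reported as a successful merge.
    """
    if editor_ok and rej_nonempty:
        return "fall through (assumed failure path)"
    return "exit 0 (merge reported as successful)"


def stricter_outcome(editor_ok, rej_nonempty):
    """One possible alternative: report success only if the editor ran
    and no rejects remain."""
    if editor_ok and not rej_nonempty:
        return "success"
    return "failure"


def truth_table():
    """Return (editor_ok, rej_nonempty, original, stricter) for all cases."""
    rows = []
    for editor_ok in (True, False):
        for rej_nonempty in (True, False):
            rows.append((editor_ok, rej_nonempty,
                         original_outcome(editor_ok, rej_nonempty),
                         stricter_outcome(editor_ok, rej_nonempty)))
    return rows
```

The two models differ only in the rows where the editor fails: the original logic reports success there, while the stricter variant does not.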

[4] issue132 "hg should revalidate its data after locking the repo" and issue154 "race between undo and all readers"

History

2006-04-20
Resolved TBDs. More information on use with ZFS snapshots. Minor editorial changes.
2006-04-13
Updates after email discussion with Bryan O'Sullivan <bos at serpentine dot com>
2006-03-31
First draft.
Last change: 2006-04-20 11:29 PDT