Mercurial version 0.8.
kosh$ psrinfo -v
Status of virtual processor 0 as of: 03/30/2006 14:15:16
on-line since 03/09/2006 16:44:59.
The i386 processor operates at 1600 MHz,
and has an i387 compatible floating point processor.
kosh$ uname -a
SunOS kosh 5.11 snv_35 i86pc i386 i86pc
Memory size: 1024 Megabytes
TBD. I'm currently creating a repository from the 20060222 snapshot. However, before the end of testing we plan to create a repository that contains the current sources plus history back to the OpenSolaris Launch in June 2005.
TBD. The 20060222 snapshot has 34,293 files.
TBD. 1 delta per file for the current test repository.
(Refer to the Requirements Document for detail.)
Mercurial implements a model of independent repositories, though a repository can be configured to have a de-facto parent. The preferred model for propagating changes is to pull them from the child, rather than pushing them to the parent.
I believe that Mercurial supports updates between two repositories with a common ancestor, but I haven't tested this (TBD).
I believe that the "disconnected-use" requirements are all satisfied, though I haven't tested them (TBD).
As mentioned in the requirements document, Mercurial supports
remote access via ssh.
The storage representation appears to be well-documented. Certainly there's more information on the Mercurial website than we've been able to find for SCCS's storage representation.
Mercurial's storage representation does not use Unix i-numbers, so snapshots such as those provided by ZFS or Network Appliance filers should not cause problems.
At least some of the on-disk data structures do not appear to be versioned. This is a potential hazard, because at least one storage representation change ("RevlogNG") is already planned for Mercurial 0.9.
On-disk data structures are binary files, but I had no problems using the same repository from both SPARC and x86 systems. Binary files give improved performance, but if manual repairs are needed, we'll need a binary editing program.
Mercurial does not provide its own access control mechanism for
controlling access to subtrees within a repository. While it
might be possible to restrict user access to certain subtrees
using filesystem ACLs, it would probably be better to use
various pre-operation hooks (e.g.,
pretxnchangegroup) to implement that sort of
control.
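As a rough illustration of the hook-based approach, here is a minimal sketch of the decision logic such a hook might use. Everything in it is hypothetical: the function names path_allowed and check_changed_files are mine, and the sketch assumes that some earlier step has already produced the list of files touched by the incoming changesets, one per line. How to get that list from Mercurial is deliberately left out.

```shell
# Hypothetical helper: succeed if FILE lies under one of the allowed prefixes.
# Usage: path_allowed FILE PREFIX...
path_allowed() {
    file=$1
    shift
    for prefix in "$@"; do
        case $file in
            "$prefix"/*) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical hook body: read changed file names (one per line) on stdin
# and fail on the first one that falls outside the allowed prefixes.
# Usage: check_changed_files PREFIX... < list-of-changed-files
check_changed_files() {
    while IFS= read -r file; do
        if path_allowed "$file" "$@"; then
            continue
        fi
        echo "access denied: $file" >&2
        return 1
    done
    return 0
}
```

A hook script built around this would exit non-zero when check_changed_files fails, on the assumption that a failing pre-operation hook aborts the operation.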
The command-line and hook interfaces appear to be adequately documented.
One nit: the current documentation appears to reflect the current development version of the code, rather than the most recent release[1]; there is nothing in the documentation to clarify what version it applies to. If OpenSolaris uses Mercurial, we may wish to place snapshots of the code and documentation on opensolaris.org to avoid confusion.
The hook infrastructure invokes the named hook(s) with a few tokens such as the changeset ID passed in via the environment. This means that the hook may need to invoke various Mercurial commands to find out more about the changeset. Presumably Mercurial's designers have thought about lock re-entrancy issues, but this should be verified. Also, it may not always be possible to get the desired information back from the existing Mercurial command-line interfaces. For example, "hg log" gives the old and new names of a renamed file, along with the names of any other files involved in the changeset, but it can't tell you that file "foo" was renamed to "bar".
At least one unexpected behavior was noted while testing:
pushing a changeset from repository A to repository B caused A's
commit hook to fire. If this is intentional, we'll
need to spend some time to make sure we understand the behavior
of the hooks that we want to depend on.
There is some documentation on the network protocol, though it's a bit sketchy. The protocol is versioned.
TBD (need to do more investigation). Rename support definitely needs work: merges don't track renames, and rename conflicts are not detected.
Mercurial associates a text comment with each changeset; this
is added as part of the commit operation. The
first line of the comment is displayed as a summary of the
changeset for operations like "log", though the full
comment can be displayed with the "-v" option. This
is somewhat inconsistent with the current conventions for
putbacks into ON; we'll want to think about what we want to
do.
Although I haven't built Mercurial from source, my understanding is that it's pretty straightforward. It requires Python, but I don't believe it requires anything beyond a normal Python distribution.
The primary interface is the hg command, with
subcommands for the various operations. This is pretty
standard.
The model is straightforward: you commit
one or more changesets to the repository that you're working in,
then you push or pull them to other
repositories. Note that pushing a changeset updates the target
repository, but updating the target's source tree is a separate
step ("hg update").
Mercurial offers subcommands specifically for generating and accepting source patches.
Mercurial supplies an HTTP server, as well. This can be used for browsing and for pulls over HTTP.
Support for backouts: the revert subcommand can be
used to return the tree to a particular revision, which backs out
every changeset after that revision. Backing out a
changeset after the files have been subsequently modified is
less straightforward. One suggestion is to generate a source
patch for the changeset that you want to back out, then apply
the patch using "patch -R".
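A minimal sketch of the "patch -R" suggestion follows. The function name back_out_patch is hypothetical, and the -p1 strip level is an assumption about how the paths in the saved patch are written; adjust it to match the patch you actually have. The patch file must be given as an absolute path, because the function changes directory before applying it.

```shell
# Hypothetical helper: reverse-apply a saved source patch in a source tree.
# Usage: back_out_patch TREE /absolute/path/to/changeset.patch
back_out_patch() {
    tree=$1
    patchfile=$2
    # -R applies the patch in reverse, undoing the change it describes.
    # -p1 strips the leading path component (an assumption, see above).
    (cd "$tree" && patch -R -p1 < "$patchfile")
}
```

If later changes have touched the same lines, patch will reject some hunks, and those still need to be repaired by hand.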
By default, "hg status" lists files that aren't
tracked in the repository (e.g., compiled binaries, editor
backup files). This will generate an impossible level of noise
in most real-life scenarios with ON. While doing "dmake
clobber" will reduce the noise considerably[2], that is inconvenient for a tree the
size of ON. The status subcommand does offer
options to filter out noise, but it's not clear they can be used
to give the desired results (show untracked source files and
makefiles, but ignore all other untracked files).
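One possible workaround is to post-process the status output instead of relying on the subcommand's own options. The sketch below assumes that untracked files are reported as lines of the form "? path"; the function name untracked_sources and the choice of patterns (.c, .h, and makefiles) are mine, and a real filter for ON would need a longer list.

```shell
# Hypothetical filter: read "hg status"-style lines on stdin and keep only
# untracked ("? ") entries that look like C sources, headers, or makefiles.
untracked_sources() {
    while IFS= read -r line; do
        case $line in
            "? "*.c | "? "*.h | "? "*[Mm]akefile*)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

It would be used as "hg status | untracked_sources".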
Files should be imported with read-write permission. Mercurial keeps track of the permissions, and it complains if you try to update a read-only file (e.g., after a push or pull).
Mercurial's default for resolving conflicts is the
hgmerge script. This script checks for the
presence of various third-party conflict resolver programs, such
as tkdiff. At least some of these programs (e.g.,
tkdiff) offer functionality that is comparable to
what is available with Teamware and Filemerge. It should be
easy to add Filemerge support to hgmerge if that's
desired.
If hgmerge can't find any of the expected conflict
resolvers, it falls back to using diff(1) and
patch(1). If patch rejects any part
of the diffs, hgmerge invokes an editor so that
the user can manually repair the conflict.
While first experimenting with Mercurial, I found it very easy to get my repository into a state where it would keep complaining about "outstanding uncommitted changes", but it was hard to figure out how to get out of that state. (Answer: use "hg update -C".) This probably needs to go into a FAQ.
We'll want to think about possible changes to
hgmerge to reduce the likelihood of mismerges.
First, if hgmerge uses patch, we may
want to force a review of the changes, even if the patch applies
cleanly.
Second, the code for invoking the editor and determining whether the conflict was resolved is a bit brittle[3]. This may just be a bug that needs fixing. But we may also want a more explicit "yes I have resolved the conflicts" action from the user (which is something Subversion does).
There is at least one open issue in the Mercurial bug database related to failed merges. Resolving this issue may address the brittleness problem mentioned above.
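To make the brittleness concern concrete, here is a sketch of a stricter editor step. This is not the actual hgmerge code or a proposed patch to it; the function name resolve_with_editor and the return codes are hypothetical. It keeps the existing convention that an empty or missing .rej file means the conflict was resolved, but it treats a missing or failing editor as an error rather than a success. It avoids "!" and "$(...)" in case it has to run under an older /bin/sh.

```shell
# Hypothetical replacement for the editor step of a merge script.
# Usage: resolve_with_editor FILE   (rejects are expected in FILE.rej)
# Returns 0 if resolved, 1 if rejects remain, 2 if the editor could not run.
resolve_with_editor() {
    local_file=$1
    rej_file=$1.rej

    if [ -z "${EDITOR:-}" ]; then
        echo "merge: EDITOR is not set" >&2
        return 2
    fi

    # A failure to start the editor, or a non-zero exit from it,
    # is reported as an error instead of being taken as success.
    $EDITOR "$local_file" "$rej_file" || {
        echo "merge: editor failed for $local_file" >&2
        return 2
    }

    # Resolved only if the user emptied or removed the .rej file.
    if [ -s "$rej_file" ]; then
        echo "merge: unresolved rejects remain in $rej_file" >&2
        return 1
    fi
    return 0
}
```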
The current ON convention is that putbacks should not introduce
SCCS deltas for intermediary snapshots or Teamware merges. This
is achieved by using the redelget and
reedit subcommands in wx. Similar
functionality is available with Mercurial using the Mercurial
Queue (mq) extension.
Mercurial can run without any server daemons. ssh
support is handled by starting a remote, transient Mercurial
server automatically, which communicates with the local system
over the ssh connection.
Mercurial has an active developer community. At least one developer (Bryan O'Sullivan) has helped with our evaluation of Mercurial, and he is interested in helping to address issues that we have run into so far (e.g., rename).
There have been a few problems with hgmerge, due
to the age of Solaris's /bin/sh. These have typically been
detected and repaired within a couple weeks. Also, at least one
community member has noticed this pattern and suggested a more
fundamental fix (rewriting hgmerge in Python).
Mercurial is almost entirely written in Python; hgmerge is a
shell script, though there is some talk of rewriting it in
Python, too. It's unclear (to me) how much Python expertise
there is in the OpenSolaris community. Fortunately, Python
appears easy to learn, and the code tends to be more readable
than, say, Perl. Commenting is a bit sparse, but I haven't had
problems following the code.
Mercurial has a hooks mechanism as well as a documented extensions mechanism. Some hooks can abort the current operation.
Mercurial's state files are updated by appending. So presumably corrupted files can be repaired by rolling back to a consistent set of files.
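To illustrate why append-only files make this kind of repair plausible, here is a generic sketch. It is not Mercurial's own recovery code, and both function names are hypothetical. The idea is to record a file's length while it is known to be good, then truncate it back to that length if a later append is interrupted.

```shell
# Record the current size of an append-only file as a known-good offset.
mark_offset() {
    # wc -c may pad its output; arithmetic expansion strips the padding.
    echo $(( $(wc -c < "$1") ))
}

# Truncate FILE back to a previously recorded OFFSET (in bytes),
# discarding anything appended after that point.
rollback_to() {
    file=$1
    offset=$2
    dd if="$file" of="$file.tmp" bs=1 count="$offset" 2>/dev/null &&
        mv "$file.tmp" "$file"
}
```

For a set of files to be consistent, the offsets for all of them would have to be recorded together before the operation starts.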
Signal handling (e.g., SIGINT) appears
sensible.
I simulated a crash using SIGKILL during a clone
operation and was able to
get the workspace into a state where there were many missing
files. That is, the files were in the repository, but they were not
in the visible source tree. This experiment raised a couple
issues that we'll want to look at more carefully.
README.opensolaris--and tried to
commit it, but Mercurial said there were no changes.
Mercurial let me clone the child, edit
README.opensolaris in the second child, and push
the changes to the first child. Trying to work with
README.opensolaris in the first child was still
troublesome--it was as though Mercurial did not realize that
README.opensolaris was already in the
repository. For operational simplicity, we'll want some sort
of cleanup and repair script, with simple guidelines on when
to run it. There also appear to be a couple open issues related to locking[4].
Email discussions have indicated that Mercurial is supposed to support binary files as well as text. However, the merge code appears to assume text files.
Not supported.
Changesets are for the entire repository, not per-file. But you can get a per-file history by specifying the file name with "hg log".
Formal evaluation is still TBD. Informal results are that Mercurial does not cause any storage usage spikes.
TBD (still collecting numbers). A local clone of the 20060222 tree takes a couple minutes on the above hardware using ZFS; about twice that on UFS.
[1] For example, the 2006-03-22 version of
the hgrc(5) man page lists hooks that were
apparently added after 0.8, although 0.8 is the most recent
release.
[2] Besides leaving editor backup files, our clobber builds leave some generated files behind.
[3] The relevant code is:
    $EDITOR "$LOCAL" "$LOCAL.rej" && test -s "$LOCAL.rej" || exit 0
If $EDITOR can't be invoked for some reason, we'll take the "exit 0", which indicates a successful merge.
[4] issue132 "hg should revalidate its data after locking the repo" and issue154 "race between undo and all readers"
Last change: 2006-03-31 15:01 PST