OpenSolaris

OpenSolaris Project: Data Migration Manager

View the leaders for this project
Project Observers

Endorsing communities

Storage

News!

  • 04/23 DMM discussed on Storage Stop during the Q/A session at the FAST '08 OpenSolaris BoF (Feb 25-29, San Jose CA)
  • 04/18/08 First DMM source code drop is now available! See Source Code below. We will also be debuting the DMM blog shortly, where discussion about DMM and migration in general can continue. Stay tuned!
  • 03/13/08 We're here!!

Welcome to DMM

And welcome to the evolving DMM project page!

Data Migration Manager (DMM) is a tool that facilitates the movement of file systems from one platform to another over the NFS and/or CIFS protocols. Why not simply use tar, cpio, or cp for this? Those work fine for small file systems, say 10 GB, but they fall short for larger file systems and, in general, cannot answer the three 'W' questions. What are those?

  • Where exactly is file X at this moment?
  • When can I resume reading and writing to it?
  • What happens if a failure occurs while those programs are running?

DMM answers all three of these questions with one simple answer: the file is available immediately on the target platform that you specified. You can read and write to it to your heart's content, and in the unlikely event of link failures or crashes along the way, every byte from the source, along with any newly written bytes, is safely tucked away in a Solaris file system.

DMM is designed to be the solution you trust when you have large amounts of data to move that will take days to complete.

Key features of DMM:

  • Allows read/write access to file system content while a migration is running.
  • Migrates one or more file systems from source platform A to target B.
  • Migrates files and related metadata over the CIFS and NFS protocols.
  • Uses a phased-commit approach to ensure that data is protected from failures.
  • Very mindful of its presence: tries very hard to make sure that an application cannot tell that it is there.
  • Easy-to-use GUI or command-line interface.
  • Event-driven scheduling allows a migration to consume more resources at off-hours, etc.
  • Migrates files in the background while still responding to on-demand requests made by active processes.
  • Limited policy allows influence over the order of files to be migrated asynchronously.

Documentation

We'll have the detailed functional specification for DMM posted here as soon as we can. In the meantime, a high-level overview is below.

High-level architectural overview

DMM is divided into two major pieces: the user-space portion and the kernel portion.

User-space

The DMM user space is a collection of processes which scales to the number of concurrent migrations running at any given time. There is a master migration process which controls each set of tasks representing a file system migration. This master talks to the GUI or CLI and has overall responsibility for operations.

A central component of the user-space environment is the database, which is the brain of DMM. DMM is using MySQL 5.1 for its first release. This database tracks, among other things, the directories and files to be moved as well as the overall state of each migration.

Each file system that is being migrated has a set of processes monitoring and controlling that migration. Two major things are going on here during this time:

  • Events are being posted via the Solaris Doors interface from the DMM kernel module. These events indicate that files are being examined or read/written to by processes running on the target. These are high-priority things that get attention immediately.
  • When events from the kernel have slowed to the point where DMM can do other things, it looks through the database for files to be moved. Some of these may have been specified by the user, through the policy component of the GUI, to be moved at a higher priority.

At some point, all the files from the read-only source have been migrated and the migration process packs up and goes home. The file system on the target is now fully operational with no remaining ties to DMM.
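The two activities described above (urgent kernel events first, background migration when things are quiet) can be pictured with a small sketch. This is only a toy model, not DMM's code: the class name `MigrationScheduler`, its methods, and the numeric priorities are all hypothetical, and in-memory queues stand in for the Doors events and the database.

```python
import heapq
from collections import deque


class MigrationScheduler:
    """Toy model: on-demand events always win; background work fills idle time."""

    def __init__(self):
        self.events = deque()   # stand-in for events posted by the kernel module
        self.backlog = []       # heap of (priority, seq, path); lower priority runs first
        self.migrated = []      # paths in the order they were brought across
        self._seq = 0           # tie-breaker so equal priorities stay in insertion order

    def add_file(self, path, priority=10):
        """Queue a file for background migration (policy may lower the number)."""
        heapq.heappush(self.backlog, (priority, self._seq, path))
        self._seq += 1

    def post_event(self, path):
        """Record that a process on the target touched this file."""
        self.events.append(path)

    def step(self):
        """Do one unit of work; return False once nothing is left."""
        if self.events:
            path = self.events.popleft()             # high-priority, handled first
        elif self.backlog:
            _, _, path = heapq.heappop(self.backlog)
        else:
            return False
        if path not in self.migrated:                # skip files already moved on demand
            self.migrated.append(path)
        return True

    def run(self):
        while self.step():
            pass
        return self.migrated
```

For example, queueing `/a`, `/b` (with `priority=1`), and `/c`, then posting an event for `/c`, migrates `/c` first because a process asked for it, then `/b` because policy raised it, then `/a`.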

Kernel

DMM has a lightweight kernel component designed to interpose between the virtual file system operational layer (VOP) and the file system itself. For each file system being migrated, DMM attaches a Solaris FEM monitor to each vnode that arises for that file system. That's our hook. Whaa Haa Ha!! We're now in control and rule the world! Errr, never mind...

Each vnode operation (VOP_READ, VOP_GETATTR, etc.) likely triggers an event, which is routed to the DMM user-space portion via one of two Solaris Doors that we established for that file system's migration. DMM's user-space then does all the work of determining how to proceed. It may decide to bring the whole file across if it's pretty small, or it may decide to bring across just the data or information being requested on that call. We don't want to hold up the application too much, so we'll be pretty smart about this part.
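To make that decision concrete, here is a minimal sketch of how such a choice might look. It is an illustration under assumptions, not DMM's actual logic: the function `plan_fetch`, the operation names, and the 1 MiB cutoff are all hypothetical, since the text above gives no threshold.

```python
SMALL_FILE_BYTES = 1 << 20  # hypothetical cutoff (1 MiB); the project page names no number


def plan_fetch(op, file_size, offset=0, length=0, small_file_bytes=SMALL_FILE_BYTES):
    """Decide what to pull from the source for one intercepted operation.

    Returns ("attrs",), ("whole", 0, size) or ("range", start, nbytes).
    """
    if op == "getattr":
        return ("attrs",)                 # metadata only, no file data needed
    if file_size <= small_file_bytes:
        return ("whole", 0, file_size)    # small file: cheaper to bring it all across
    start = min(offset, file_size)        # clamp the request to the end of the file
    end = min(offset + length, file_size)
    return ("range", start, end - start)  # large file: fetch only what was asked for
```

A read of 8192 bytes at offset 4096 in a 10 MiB file would plan `("range", 4096, 8192)`, while any read of a 4 KiB file would plan the whole file.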

The key here is we can hold the application inside the kernel while we retrieve the data. That is the heart of what DMM is all about.

Remember that database we talked about above (our brain)? Let's say a file is being deleted before we've brought it across. Worse yet, let's say it's being renamed. Our event to user-space may be one that indicates that we're about to delete or rename the file. Once we get the okey-dokey that the user-space DMM has logged this intent in our database, we'll perform the VOP_RENAME, for example. We'll update the DB when we're done with that portion, and so on. This is the phased-commit approach we talked about above in the features section. We have to assume that the target host could crash at any moment (as unlikely as that is).
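The log-intent, act, then mark-done sequence can be sketched as follows. This is a simplified model under stated assumptions: an in-memory SQLite database stands in for MySQL, a dictionary stands in for the file system namespace, and the table layout and function names are invented for illustration. How DMM actually reconciles a pending intent after a crash is not described here, so the sketch only shows that the intent survives.

```python
import sqlite3


def open_log():
    """Create the (hypothetical) intent table in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE intent ("
        "id INTEGER PRIMARY KEY, op TEXT, src TEXT, dst TEXT, state TEXT)"
    )
    return db


def log_intent(db, op, src, dst=None):
    """Phase 1: record what is about to happen before touching anything."""
    cur = db.execute(
        "INSERT INTO intent (op, src, dst, state) VALUES (?, ?, ?, 'pending')",
        (op, src, dst),
    )
    db.commit()
    return cur.lastrowid


def mark_done(db, intent_id):
    """Phase 3: record that the operation completed."""
    db.execute("UPDATE intent SET state = 'done' WHERE id = ?", (intent_id,))
    db.commit()


def rename_with_intent(db, namespace, src, dst, crash_before_done=False):
    intent_id = log_intent(db, "rename", src, dst)
    namespace[dst] = namespace.pop(src)   # phase 2: stand-in for the real rename
    if crash_before_done:                 # simulate the host dying at this point
        return intent_id
    mark_done(db, intent_id)
    return intent_id


def pending_intents(db):
    """What a recovery pass would have to examine after a crash."""
    return db.execute(
        "SELECT id, op, src, dst FROM intent WHERE state = 'pending' ORDER BY id"
    ).fetchall()
```

Because the intent is committed before the rename happens, a crash between phases 2 and 3 leaves a `pending` row that tells a recovery pass exactly which operation was in flight.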

We're absolutely paranoid in DMM about ever losing a byte of a file, ever.

Once any file is migrated, we remove the FEM monitor from its vnode and all future dealings that the application has with that file are handled directly by the file system.
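The interpose-then-detach pattern can be modeled in a few lines. This is not the Solaris FEM interface, which is a kernel-level C facility; `FileNode`, `make_monitor`, and the dictionary used as the "source" are hypothetical stand-ins meant only to show the control flow.

```python
class FileNode:
    """Stand-in for a vnode: local data plus an optional monitor hook."""

    def __init__(self, name):
        self.name = name
        self.data = None      # nothing on the target yet
        self.monitor = None   # set while the file still needs migrating

    def read(self):
        if self.monitor is not None:
            self.monitor(self)   # interposed: runs before the file system answers
        return self.data         # afterwards, served directly from local data


def make_monitor(source, log):
    """Build a monitor that pulls the file from `source` on first access."""

    def monitor(node):
        log.append(node.name)            # stand-in for posting an event to user-space
        node.data = source[node.name]    # the caller waits here while data is retrieved
        node.monitor = None              # file migrated: remove the hook from this node

    return monitor
```

The first `read()` goes through the monitor and fetches the data; every later `read()` finds no monitor and returns the local copy without any further involvement from the migration code.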

Source code

Our first source code drop can be obtained here: DMM Source code

Download it and take a look. This is the framework which we'll fill in over time but you'll get a chance right now to see the user-space and kernel components and what we're talking about.

While it is functional, it's not very convenient to use at the moment, but this will change over time. One of the first things you'll see is that DMM's source code is still organized as a project unto itself rather than being organized with respect to OpenSolaris. This will change in the next drop.

Have a look and we'll soon be talking about this on our blog!

Related Projects

CIFS Client