OpenSolaris


Thread: Status of zpool remove in raidz and non-redundant stripes


Replies: 7 - Last Post: Jul 10, 2009 4:47 AM by: wmertens
mike_b

Posts: 9
From:

Registered: 12/5/08
Status of zpool remove in raidz and non-redundant stripes
Posted: Dec 5, 2008 7:55 AM
To: Communities » zfs » discuss

I've seen discussions as far back as 2006 that say development is underway to allow the addition and removal of disks in a raidz vdev to grow/shrink the group. Meaning, if a 4x100GB raidz only used 150GB of space, one could do 'zpool remove tank c0t3d0' and data residing on c0t3d0 would be migrated to other disks in the raidz. Then, c0t3d0 would be free for removal and reuse.

What is the status of this support in nv101?

If a pool has multiple raidz vdevs, how would one add a disk to the second raidz vdev?
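
To make the arithmetic in the 4x100GB example concrete, here is a minimal sketch. It assumes single-parity raidz and ignores metadata and padding overhead, so the figures are rough upper bounds; the function names are made up for illustration and are not part of any ZFS tool.

```python
def raidz_usable_gb(disks, disk_gb, parity=1):
    """Approximate usable space of one raidz vdev: (disks - parity) * disk size.

    Assumption: equal-sized disks, no metadata or padding overhead.
    """
    if disks <= parity:
        raise ValueError("a raidz vdev needs more disks than parity devices")
    return (disks - parity) * disk_gb


def fits_after_shrink(disks, disk_gb, used_gb, parity=1):
    """Would the existing data still fit if the vdev had one disk fewer?"""
    return used_gb <= raidz_usable_gb(disks - 1, disk_gb, parity)


# The case from the post: 4 x 100GB raidz holding 150GB of data.
print(raidz_usable_gb(4, 100))         # usable space today
print(raidz_usable_gb(3, 100))         # usable space with one disk removed
print(fits_after_shrink(4, 100, 150))  # whether 150GB fits in the smaller vdev
```

By this estimate the data would fit on three disks, which is why the request is reasonable on capacity grounds; whether ZFS can actually perform the shrink is a separate question.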

relling

Posts: 1,859
From: US

Registered: 6/17/05
Re: [zfs-discuss] Status of zpool remove in raidz and non-redundant stripes
Posted: Dec 5, 2008 10:28 AM   in response to: mike_b


Mike Brancato wrote:
> I've seen discussions as far back as 2006 that say development is underway to allow the addition and removal of disks in a raidz vdev to grow/shrink the group. Meaning, if a 4x100GB raidz only used 150GB of space, one could do 'zpool remove tank c0t3d0' and data residing on c0t3d0 would be migrated to other disks in the raidz. Then, c0t3d0 would be free for removal and reuse.
>
> What is the status of this support in nv101?

Not available. I predict that you will see it mentioned everywhere,
billboards, graffiti, slashdot, etc. when it arrives.
-- richard
_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


mike_b

Posts: 9
From:

Registered: 12/5/08
Re: [zfs-discuss] Status of zpool remove in raidz and non-redundant stripes
Posted: Dec 5, 2008 11:01 AM   in response to: relling
To: Communities » zfs » discuss

Well, I knew it wasn't available. I meant to ask what is the status of the development of the feature? Not started, I presume.

Is there no timeline?

Miles Nordin
carton@Ivy.NET
Re: [zfs-discuss] Status of zpool remove in raidz and non-redundant stripes
Posted: Dec 5, 2008 12:40 PM   in response to: mike_b


>>>>> "mb" == Mike Brancato <mike at mikebrancato dot com> writes:

mb> if a 4x100GB raidz only used 150GB of space, one could do
mb> 'zpool remove tank c0t3d0' and data residing on c0t3d0 would
mb> be migrated to other disks in the raidz.

that sounds like in-place changing of stripe width, and wasn't part of
the discussion I remember. We were wishing for vdev removal, but
you'd have to remove a whole vdev at a time. It would be analogous to
'zpool add', so just as you can't add 1 disk to widen a 3-disk raidz
vdev to 4-disks, you couldn't do the reverse even with the wished-for
feature.

To change from 4x100GB raidz to 3x100GB raidz, you'd have to:

zpool add pool raidz disk5 disk6 disk7
zpool evacuate pool raidz disk1 disk2 disk3 disk4

RFE 4852783 is to create something like zpool evacuate, removing the
whole vdev at once and migrating onto other vdevs, not other disks.
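
The add-then-evacuate sequence above can be pictured with a toy model. This is only a sketch under loud assumptions: 'zpool evacuate' is the wished-for command from the RFE, not something that exists, and the Vdev and Pool classes here are hypothetical bookkeeping that tracks gigabytes, not real ZFS structures.

```python
from dataclasses import dataclass, field


@dataclass
class Vdev:
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self):
        return self.capacity_gb - self.used_gb


@dataclass
class Pool:
    vdevs: list = field(default_factory=list)

    def add(self, vdev):
        """Toy stand-in for 'zpool add': a whole vdev joins the pool."""
        self.vdevs.append(vdev)

    def evacuate(self, name):
        """Toy stand-in for the hypothetical 'zpool evacuate'.

        Moves all data off one whole vdev onto the remaining vdevs,
        then drops the emptied vdev from the pool.
        """
        victims = [v for v in self.vdevs if v.name == name]
        if not victims:
            raise KeyError(name)
        victim = victims[0]
        others = [v for v in self.vdevs if v is not victim]
        if sum(v.free_gb for v in others) < victim.used_gb:
            raise RuntimeError("not enough free space on the remaining vdevs")
        for v in others:
            moved = min(v.free_gb, victim.used_gb)
            v.used_gb += moved
            victim.used_gb -= moved
        self.vdevs = others
        return victim


# 4x100GB raidz (about 300GB usable, 150GB used) replaced by a 3x100GB raidz.
pool = Pool()
pool.add(Vdev("raidz-4disk", capacity_gb=300, used_gb=150))
pool.add(Vdev("raidz-3disk", capacity_gb=200))
pool.evacuate("raidz-4disk")
print([(v.name, v.used_gb) for v in pool.vdevs])
```

Note that the unit of work is always a whole vdev: data lands on other vdevs, never on other disks inside the same raidz, which is the distinction the RFE draws.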

The feature's advantage as-is would be for pools with many vdevs. It
could also be an advantage for pools with just one vdev that are
humongous: you want to change the shape of the 1 vdev, but you need to
do the copy/evacuation online because it takes a week. If not for the
week, on a 1-vdev pool you could destroy the pool and make a new one
without needing any more media than you would with the new feature.

For home storage with big, slow, cheap pools, what you want sounds
nice. Someone once told me he'd gotten Veritas to change a plex's
width with the vg online, but for me I think it's scary because, if it
crashed halfway through, I'm not sure how the system could communicate
to me what's happening in a way I'd understand, much less recover from
it. I'm not saying Veritas doesn't do both, just that I'd chuckle
happily if I saw it actually work (which was the storyteller's
response too). For vdev removal I think you could harmlessly stop the
evacuation at any time with only O(1) quickie-import-time recovery,
without needing to communicate anything. Much easier. So I like the
RFE as-is, analogous to Linux LVM2's pvmove.


myxiplx

Posts: 877
From: GB

Registered: 10/24/07
Re: Status of zpool remove in raidz and non-redundant stripes
Posted: Dec 6, 2008 2:59 PM   in response to: mike_b
To: Communities » zfs » discuss

If I remember right, the code needed for this has implications for a lot of things:

- defrag
- adding disks to raidz vdevs
- removing disks from vols
- restriping volumes (to give consistent performance after expansion)

In fact, I just found the question I asked a year or so back, which had a good reply from Jeff
http://opensolaris.org/jive/message.jspa?messageID=186561

... and while typing this, I also just found this blog post from Adam Leventhal in April, which is also related:
http://blogs.sun.com/ahl/entry/expand_o_matic_raid_z

tobert

Posts: 8
From:

Registered: 11/14/08
Re: Status of zpool remove in raidz and non-redundant stripes
Posted: Dec 6, 2008 3:55 PM   in response to: myxiplx
To: Communities » zfs » discuss

They also mentioned this at some of the ZFS talks at LISA 2008. The general argument is that, while plenty of hobbyists are clamoring for this, not enough paying customers are asking for it to make it a high enough priority to get done.

If you think about it, the code is not only complicated but will be incredibly hard to get right and _prove_ it's right.

Maybe the ZFS guys can just borrow the algorithm from Linux mdraid's experimental CONFIG_MD_RAID5_RESHAPE:

http://git.kernel.org/?p=linux/kernel/git/djbw/md.git;a=blob;f=drivers/md/raid5.c;h=224de022e7c5d6574cf46747947b3c9e326c8632;hb=HEAD#1885

dpolombo

Posts: 8
From: Lille, France

Registered: 4/28/08
Re: Status of zpool remove in raidz and non-redundant stripes
Posted: Feb 17, 2009 6:38 AM   in response to: tobert
To: Communities » zfs » discuss

Honestly, if the main reason we're not getting this feature is that not enough paying customers are asking for it, it is remarkably short-sighted, and quite inconsistent with some of the claims made about ZFS (namely, "completely eliminates the concept of volumes and the associated problems of partitions, provisioning, [...]", from http://www.opensolaris.org/os/community/zfs/whatis/).

The last two companies I've worked for won't even consider ZFS until it's possible to easily remove a vdev from a zpool, and they're never going to ask Sun for this feature. They'll just dismiss the product as unsuitable for their environment - said environment including most notably a bunch of SAN boxes.

Could these companies work with ZFS in its current state? The answer is quite probably yes, but it would mean some significant changes in the way they approach their storage models, and most companies who've forked out big money on EMC or HDS SAN boxes aren't going to do that.

Could they work with this additional feature? I think they would, as ZFS would then seamlessly integrate in their current environment, with the added benefit of not having to pay for Symantec/Veritas licenses.

I find this regrettable, as I really love ZFS and its day-to-day administration (or rather, the absence thereof).

wmertens

Posts: 5
From:

Registered: 1/31/07
Re: Status of zpool remove in raidz and non-redundant stripes
Posted: Jul 10, 2009 4:47 AM   in response to: dpolombo
To: Communities » zfs » discuss

You're right - in my company (a very big one) we just stumbled across this as well and we're strongly considering not using ZFS because of it.

It's easy to type zpool add when you meant zpool replace - and then you can go rebuild your box because it was the root pool. Nice.

At the very least, "zpool add" should have more warnings.
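
Until then, one way to guard against the slip is to preview every add first. The sketch below only builds the command line; it assumes your zpool supports the -n option to 'zpool add' (described in zpool(1M) as showing the resulting configuration without actually adding the vdevs). The helper name is made up for illustration.

```python
import shlex


def zpool_add_dry_run_argv(pool, vdev_spec):
    """Build the argv for a dry-run 'zpool add -n'; nothing is executed here."""
    if not vdev_spec:
        raise ValueError("vdev_spec must name at least one device")
    return ["zpool", "add", "-n", pool, *vdev_spec]


argv = zpool_add_dry_run_argv("tank", ["raidz", "c0t4d0", "c0t5d0", "c0t6d0"])
# Print the command for review instead of running it.
print(shlex.join(argv))
```

Reading the previewed layout before repeating the command without -n gives one chance to notice that the disk would become a new top-level vdev.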




Copyright © 1995-2005 Sun Microsystems, Inc.