OpenSolaris

Thread: tracking error to file

Replies: 11 - Last Post: Feb 20, 2007 10:43 AM by: goo
shawga

Posts: 102
From: Louisville, CO

Registered: 3/12/06
tracking error to file
Posted: May 19, 2006 12:23 PM

In my testing, I've found the following error:

zpool status -v
pool: local
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        local       ONLINE       0     0     0
          c0d1p0    ONLINE       0     0     0
          c2d0p1    ONLINE       0     0     0
          c3d0p1    ONLINE       0     0     0
          c0d0s7    ONLINE       0     0     0

errors: The following persistent errors have been detected:

          DATASET  OBJECT  RANGE
          1b       2402    lvl=0 blkid=1965

I haven't found a way to report in human terms what the above object
refers to. Is there such a method?

I can clear the error using existing tools, but I'd like to know what
is broken before I destroy it.

Thanks!

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273 Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive ULVL4-382 greg dot shaw at sun dot com (work)
Louisville, CO 80028-4382 shaw at fmsoft dot com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus
Torvalds


_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



Matthew Ahrens
ahrens@eng.sun.com
Re: tracking error to file
Posted: May 21, 2006 11:25 PM   in response to: shawga

On Fri, May 19, 2006 at 01:23:02PM -0600, Gregory Shaw wrote:
> DATASET OBJECT RANGE
> 1b 2402 lvl=0 blkid=1965
>
> I haven't found a way to report in human terms what the above object
> refers to. Is there such a method?

There isn't any great method currently, but you can use 'zdb' to find
this information. The quickest way would be to first determine the name
of dataset 0x1b (=27):

# zdb local | grep "ID 27,"
Dataset local/ahrens [ZPL], ID 27, ...

Then get info on that particular object in that filesystem:

# zdb -vvv <dataset_name> 2402
...
    Object  lvl   iblk   dblk  lsize  asize  type
      2402    1    16K  3.50K  3.50K  2.50K  ZFS plain file
                                 264  bonus  ZFS znode
        path    /raidz/usr/src/uts/common/fs/zfs/dmu.c
...

The "path" listed is relative to the filesystem's mountpoint.
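
The two lookups above can be sketched as small shell helpers. This is only an illustration under stated assumptions: the function names dataset_grep_pattern and full_path are made up here, the mountpoint /local/ahrens is an assumed value, and nothing below runs zdb itself. The first helper converts the hexadecimal DATASET value from 'zpool status -v' into the decimal "ID N," string to grep for; the second joins the mountpoint-relative path onto a mountpoint.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the mountpoint are illustrative
# assumptions, not part of zdb or zpool.

# Turn the hex DATASET value from 'zpool status -v' (e.g. 1b) into the
# decimal "ID N," string to grep for in 'zdb <pool>' output.
dataset_grep_pattern() {
    printf 'ID %d,\n' "0x$1"
}

# Join a filesystem mountpoint and the mountpoint-relative path that zdb
# reports, avoiding a doubled slash at the seam.
full_path() {
    printf '%s/%s\n' "${1%/}" "${2#/}"
}

dataset_grep_pattern 1b
full_path /local/ahrens /raidz/usr/src/uts/common/fs/zfs/dmu.c
```

The pattern printed by the first call is what would be passed to grep in the 'zdb local | grep' step.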

--matt



shawga

Posts: 102
From: Louisville, CO

Registered: 3/12/06
Re: tracking error to file
Posted: May 22, 2006 8:21 AM   in response to: Matthew Ahrens

Thanks! I will do that.

I brought it up on the alias, as I thought the problem would be
encountered by a user eventually. They'll want the same information
-- What does the error impact?




Wout Mertens
wmertens@cisco.com
Re: tracking error to file
Posted: May 23, 2006 2:49 AM   in response to: Matthew Ahrens

Can that same method be used to figure out what files changed between
snapshots?

Wout.




Matthew Ahrens
ahrens@eng.sun.com
Re: tracking error to file
Posted: May 23, 2006 9:44 AM   in response to: Wout Mertens

On Tue, May 23, 2006 at 11:49:47AM +0200, Wout Mertens wrote:
> Can that same method be used to figure out what files changed between
> snapshots?

To figure out what files changed, we need to (a) figure out what object
numbers changed, and (b) do the object number to file name translation.

The method I described (using zdb) will not be involved in either step.
zdb is an undocumented interface, and using it for this purpose is only
a workaround. However, the same algorithms implemented in zdb will be
used to do step (b), the object number to file name translation.

--matt



rab

Posts: 95
From: US

Registered: 3/9/05
Re: tracking error to file
Posted: Sep 27, 2006 3:32 PM   in response to: Matthew Ahrens

The zdb object -> path trick doesn't give me a path name:


errors: The following persistent errors have been detected:

DATASET OBJECT RANGE
13 a51b lvl=0 blkid=9

bash-3.00# zdb mypool | grep "ID 19,"
Dataset mypool/rab [ZPL], ID 19, cr_txg 6, last_txg 4391649, 80.3G, 41883 objects

bash-3.00# zdb -vvv mypool/rab a51b
Dataset mypool/rab [ZPL], ID 19, cr_txg 6, last_txg 4391649, 80.3G, 41883 objects, rootbp [L0 DMU objset] 400L/200P DVA[0]=<1:4408daa00:200> DVA[1]=<0:8d7323200:200> DVA[2]=<1:6a1c4ee00:200> fletcher4 lzjb LE contiguous birth=4391649 fill=41883 cksum=b79e8d8b0:469ba0a4696:e05ec517a391:1ea5669d90270d

ZIL header: claim_txg 0, seq 0

first block: [L0 ZIL intent log] 20000L/20000P DVA[0]=<1:31c560000:20000> zilog uncompressed LE contiguous birth=4030488 fill=0 cksum=7e20922ee4d68bf1:e4a75d71f8cd7cb5:13:1

Block seqno 1, won't claim


Object lvl iblk dblk lsize asize type
0 6 16K 16K 22.1M 15.2M DMU dnode

Should I be concerned? If the corruption isn't in my data, and ZFS metadata is self-consistent at all times, does the corruption matter?

bash-3.00# uname -a
SunOS xxxx 5.11 onnv-gate:2006-09-26 i86pc i386 i86pc

ahrens

Posts: 424
From: US

Registered: 3/9/05
Re: Re: tracking error to file
Posted: Sep 27, 2006 3:55 PM   in response to: rab

Russell Blaine wrote:
> The zdb object -> path trick doesn't give me a path name:
>
>
> errors: The following persistent errors have been detected:
>
> DATASET OBJECT RANGE
> 13 a51b lvl=0 blkid=9

> bash-3.00# zdb -vvv mypool/rab a51b

Try 0xa51b.
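
Judging from this reply, the OBJECT column in the error listing is hexadecimal, so the value needs a 0x prefix before zdb will read it as hex. A minimal sketch of normalizing such a value follows. The helper names object_arg and object_decimal are hypothetical, and the decimal form is shown only on the assumption that a tool such as 'find -inum' would want the same number in decimal.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of zdb.

# Add a 0x prefix to an OBJECT value unless one is already present.
object_arg() {
    case "$1" in
        0x*|0X*) printf '%s\n' "$1" ;;
        *)       printf '0x%s\n' "$1" ;;
    esac
}

# Decimal form of the same hexadecimal value.
object_decimal() {
    printf '%d\n' "$(object_arg "$1")"
}

object_arg a51b        # prints 0xa51b
object_decimal a51b    # prints 42267
```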

--matt



rab

Posts: 95
From: US

Registered: 3/9/05
Re: Re: tracking error to file
Posted: Sep 27, 2006 6:45 PM   in response to: ahrens

That was it. Thanks, Matt.

davin

Posts: 1
From: New York Metro Area

Registered: 2/18/07
Re: tracking error to file
Posted: Feb 18, 2007 9:19 PM   in response to: Matthew Ahrens

I have one that looks like this:
pool: preplica-1
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        preplica-1    ONLINE       2     0     2
          c2t0d0      ONLINE       0     0     0
          c2t1d0      ONLINE       0     0     0
          c2t2d0      ONLINE       2     0     2
          c2t3d0      ONLINE       0     0     0

errors: The following persistent errors have been detected:

          DATASET  OBJECT  RANGE
          36       3a2939  lvl=0 blkid=0

% uname -a
SunOS preplica01 5.10 Generic_118833-17 sun4u sparc SUNW,Sun-Fire-V210

% zpool list
NAME         SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
preplica-1   9.06T  8.78T  291G   96%  ONLINE  -


This is a replicated filesystem that is kept up to date with zfs send/recv and is never even mounted locally. Originally the error was in a regular inode, so I did the find -inum thing and found the filename. I cp'ed the file, deleted the old copy on the original filesystem, and did some incremental zfs send|recv's to propagate the fix here. I expected the problem to go away.

But instead it started looking like the output above.

I tried the trick with zdb listed here, but
zdb preplica-1 | grep "ID 36,"
is taking forever to complete. But none of the filesystems listed near the front of the output have ID 36.

So I tried the zdb -vvv of 0x3a2939 on each of the filesystems that I have - and none of them was ID 36! Not even the one that the bad inode had originally been reported in.

Any suggestions?

I know that it's a relatively old version of Solaris 10, with a fairly old patchset.

Should I be concerned about this error? I do know what caused it (a bad disk in the underlying hardware raid5 storage - yes... I know... I know... :-) - which was removed). So I'm not concerned about ongoing corruption from this specific problem. I just want to know what file is impacted by it.

Thanks!
Davin.

goo

Posts: 370
From: US

Registered: 6/13/05
Re: Re: tracking error to file
Posted: Feb 20, 2007 9:27 AM   in response to: davin


On Feb 18, 2007, at 9:19 PM, Davin Milun wrote:

> This is a replicated filesystem, that is kept up to date with zfs
> send/recv, and is never even mounted locally. Originally the error
> was in a regular inode. So I did the find -inum thing, and found
> the filename. I cp'ed the file and deleted the old copy on the
> original filesystem, and did some incremental zfs send|recv's to
> propagate the fix here. And I expected the problem to go away.

If you run a 'zpool scrub preplica-1', then the persistent error log
will be cleaned up. In the future, we'll have a background scrubber
to make your life easier.
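
As a sketch of that sequence, the snippet below only prints the commands instead of running them, since a scrub of a large pool can take a long time. The function name scrub_plan is made up for illustration; 'zpool scrub' and 'zpool status -v' are the commands already used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the commands to scrub a pool and then
# re-check its error list. Nothing is executed against the pool.
scrub_plan() {
    pool=$1
    printf 'zpool scrub %s\n' "$pool"
    # Intended to be run once the scrub has completed.
    printf 'zpool status -v %s\n' "$pool"
}

scrub_plan preplica-1
```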

eric




wstuart

Posts: 125
From: MPLS

Registered: 1/5/07
Re: Re: tracking error to file
Posted: Feb 20, 2007 10:43 AM   in response to: goo

> If you run a 'zpool scrub preplica-1', then the persistent error log
> will be cleaned up. In the future, we'll have a background scrubber
> to make your life easier.
>
> eric

Eric,

Great news! Are there any details about how this will be implemented
yet? I am most curious to how tunable it will be as far as system
resources (CPU/IO etc).

-Wade




goo

Posts: 370
From: US

Registered: 6/13/05
Re: Re: tracking error to file
Posted: Feb 20, 2007 11:54 AM   in response to: wstuart


On Feb 20, 2007, at 10:43 AM, Wade dot Stuart at fallon dot com wrote:

> Eric,
>
> Great news! Are there any details about how this will be implemented
> yet? I am most curious to how tunable it will be as far as system
> resources (CPU/IO etc).
>

No details yet, still working those out along with the infrastructure
to make it happen.

eric






