OpenSolaris


Thread: [zfs-discuss] marvell88sx2 driver build126


Replies: 30 - Last Post: Nov 11, 2009 1:23 PM by: kebabber
tcook

Posts: 591
From: US

Registered: 8/21/06
[zfs-discuss] marvell88sx2 driver build126
Posted: Nov 1, 2009 9:27 PM


I've sent this to the driver list as well, but since the zfs folks tend to be intimately involved with the marvell driver stack, I figured I'd give you guys a shot too.



Does anyone happen to know if there was a driver change with build 126?  I had a pool made up of two 5+1 raidz vdevs.  I moved all the data off temporarily, changed it to one 10+2 raidz2 vdev, and am in the process of moving all the data back.

I've had two drives "fail" in the last 3 hours that have been running fine for over a year, and presented absolutely no issues moving the data out of the original zpool.  My first inclination is this is a driver issue.

I'm currently running 2xMarvell SAT2-MV8 SATA controllers.  6 disks on the first controller, 7 on the second (one hot spare).
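
As a quick sanity check on the layout change described above, here is a minimal Python sketch (an illustration only, not anything from the zfs tooling) that counts data and parity disks for the two layouts. It assumes equal-sized disks and ignores metadata and padding overhead; the `layout` helper and its field names are hypothetical.

```python
def layout(vdevs):
    """Summarize a pool given (data_disks, parity_disks) per raidz vdev."""
    total = sum(data + parity for data, parity in vdevs)
    data_total = sum(data for data, _ in vdevs)
    return {
        "total_disks": total,
        "data_disks": data_total,
        "parity_disks": total - data_total,
        # Any set of this many disk failures is survivable, because every
        # vdev tolerates at least this many failed members.
        "guaranteed_failures": min(parity for _, parity in vdevs),
    }


old = layout([(5, 1), (5, 1)])  # two 5+1 raidz vdevs
new = layout([(10, 2)])         # one 10+2 raidz2 vdev

print("old:", old)
print("new:", new)
```

Both layouts use 12 disks with 10 disks' worth of data capacity. The difference is that the single raidz2 vdev survives any two disk failures, while the two raidz vdevs are only guaranteed to survive one. Adding the hot spare gives the 13 disks spread across the two controllers.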


zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: resilver completed after 1h38m with 0 errors on Sun Nov  1 18:42:16 2009
config:

        NAME          STATE     READ WRITE CKSUM
        fserv         DEGRADED     0     0     0
          raidz2-0    DEGRADED     0     0     0
            c8t0d0    ONLINE       0     0     0
            c8t1d0    ONLINE       0     0     0
            spare-2   DEGRADED     0     0 2.83M
              c8t2d0  REMOVED      0     0     0
              c7t6d0  ONLINE       0     0     0  35.6G resilvered
            c8t3d0    ONLINE       0     0     0
            c8t4d0    ONLINE       0     0     0
            c8t5d0    ONLINE       0     0     0
            c7t0d0    ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c7t4d0    REMOVED      0     0     0
            c7t5d0    ONLINE       0     0     0
        spares
          c7t6d0      INUSE     currently in use
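
For anyone who wants to pull the interesting rows out of output like the above, here is a rough Python sketch that lists every device that is not ONLINE or that has a non-zero checksum count. The sample is a shortened copy of the config section above. The helper names are made up for illustration, and the suffix handling assumes the abbreviated counters (such as 2.83M) are 1024-based, so the converted numbers are approximate.

```python
import re

# Shortened copy of the config section from the `zpool status` output above.
SAMPLE = """\
        NAME          STATE     READ WRITE CKSUM
        fserv         DEGRADED     0     0     0
          raidz2-0    DEGRADED     0     0     0
            c8t0d0    ONLINE       0     0     0
            spare-2   DEGRADED     0     0 2.83M
              c8t2d0  REMOVED      0     0     0
              c7t6d0  ONLINE       0     0     0  35.6G resilvered
            c7t4d0    REMOVED      0     0     0
        spares
          c7t6d0      INUSE     currently in use
"""

# Assumption: abbreviated counters use binary (1024-based) suffixes.
SUFFIX = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_count(text):
    """Convert a counter such as '0', '816K' or '2.83M' to an approximate int."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"not a counter: {text!r}")
    number, suffix = match.groups()
    return int(float(number) * SUFFIX.get(suffix, 1))


def parse_config(text):
    """Return one dict per device row that carries READ/WRITE/CKSUM counters."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # header, blank line, or the bare "spares" label
        try:
            read, write, cksum = (parse_count(f) for f in fields[2:5])
        except ValueError:
            continue  # e.g. the spare line "INUSE currently in use"
        rows.append({"name": fields[0], "state": fields[1],
                     "read": read, "write": write, "cksum": cksum})
    return rows


def suspicious(rows):
    """Rows that are not ONLINE or that report checksum errors."""
    return [r for r in rows if r["state"] != "ONLINE" or r["cksum"] > 0]


for row in suspicious(parse_config(SAMPLE)):
    print(f'{row["name"]:10} {row["state"]:9} cksum={row["cksum"]}')
```

The same functions should work on the longer status listings, since the extra trailing text on a row (for example "35.6G resilvered") is simply ignored.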



Nov  1 16:21:34 fserv sata: [ID 801593 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6:
Nov  1 16:21:34 fserv  SATA device at port 2 - device failed
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   Command failed to complete...Device is gone
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 16:21:40 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:40 fserv   drive offline
Nov  1 17:03:38 fserv marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx2:device on port 4 failed to reset
Nov  1 17:04:08 fserv sata: [ID 801593 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4:
Nov  1 17:04:08 fserv  SATA device at port 4 - device failed
Nov  1 17:04:08 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:08 fserv   Command failed to complete...Device is gone
Nov  1 17:04:08 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:08 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:09 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:09 fserv   drive offline
Nov  1 18:31:59 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 18:31:59 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:32:11 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 18:32:11 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:35:00 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 18:35:00 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:35:12 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 18:35:12 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:35:21 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 18:35:21 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 18:38:36 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 18:38:36 fserv   SYNCHRONIZE CACHE command failed (5)
Nov  1 21:06:31 fserv pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata) instance 2 irq 0xe vector 0x44 ioapic 0x4 intin 0xe is bound to cpu 3
Nov  1 21:06:31 fserv pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata) instance 3 irq 0xf vector 0x44 ioapic 0x4 intin 0xf is bound to cpu 0
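
Since the log is mostly the same two-line pattern repeated, a small script makes it easier to see which disk and controller each message belongs to. This is only a sketch in Python: it assumes the two-line format shown above (a WARNING header naming the device path and sd instance, followed by the message on the next line), counts only the sd messages, and skips the sata, marvell88sx and pcplusmp lines. The regular expressions and the `summarize` helper are my own and not part of any Solaris tool.

```python
import re
from collections import Counter

# A few lines copied from the log above.
SAMPLE = """\
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   Command failed to complete...Device is gone
Nov  1 16:21:34 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0,1/pci11ab,11ab@6/disk@2,0 (sd26):
Nov  1 16:21:34 fserv   drive offline
Nov  1 17:04:08 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:08 fserv   Command failed to complete...Device is gone
Nov  1 17:04:09 fserv scsi: [ID 107833 kern.warning] WARNING: /pci@7c,0/pci10de,376@a/pci1033,125@0/pci11ab,11ab@4/disk@4,0 (sd30):
Nov  1 17:04:09 fserv   SYNCHRONIZE CACHE command failed (5)
"""

# Header line: device path, port number after "disk@", and the sd instance.
HEADER = re.compile(
    r"WARNING: (?P<path>/\S+)/disk@(?P<port>\d+),\d+ \((?P<sd>sd\d+)\):\s*$"
)
# Timestamp and hostname prefix on the message line.
PREFIX = re.compile(r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+\s+")


def summarize(log_text):
    """Count messages per (sd instance, controller node, port, message)."""
    counts = Counter()
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            controller = header["path"].rsplit("/", 1)[-1]
            current = (header["sd"], controller, int(header["port"]))
            continue
        if current is not None:
            message = PREFIX.sub("", line).strip()
            counts[current + (message,)] += 1
            current = None
    return counts


for (sd, controller, port, message), n in sorted(summarize(SAMPLE).items()):
    print(f"{sd} {controller} port {port}: {message} x{n}")
```

Run against the full log, this should group everything under sd26 (port 2 behind pci11ab,11ab@6) and sd30 (port 4 behind pci11ab,11ab@4), which are the only two disks that appear in the warnings.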


_______________________________________________ zfs-discuss mailing list zfs-discuss at opensolaris dot org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 2, 2009 5:34 AM   in response to: tcook

I have the same card and might have seen the same problem. Yesterday I upgraded to b126 and started to migrate all my data to an 8-disk raidz2 connected to such a card. And suddenly ZFS reported checksum errors. I thought the drives were faulty. But you suggest the problem could have been the driver? I also noticed that one of the drives had resilvered a small amount, just like yours.

I now use b125 and there are no checksum errors. So, is there a bug in the new b126 driver?

tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 3, 2009 10:14 AM   in response to: kebabber




On Mon, Nov 2, 2009 at 6:34 AM, Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com> wrote:
> I have the same card and might have seen the same problem. Yesterday I upgraded to b126 and started to migrate all my data to an 8-disk raidz2 connected to such a card. And suddenly ZFS reported checksum errors. I thought the drives were faulty. But you suggest the problem could have been the driver? I also noticed that one of the drives had resilvered a small amount, just like yours.
>
> I now use b125 and there are no checksum errors. So, is there a bug in the new b126 driver?


Can any of you Sun folks comment on this? 

--Tim



kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 6, 2009 8:38 AM   in response to: tcook

No one has noticed this?

picker

Posts: 125
From: US

Registered: 12/1/05
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 6, 2009 11:38 AM   in response to: kebabber


> Nov 1 16:21:34 fserv Command failed to complete...Device is gone
> Nov 1 17:04:08 fserv Command failed to complete...Device is gone

kinda looks like drive FW or cable issue... if it was a driver
issue it might be a lost command or reset for phase resync.

> driver change with build 126?
not for the SATA framework, but for HBAs there is:
http://hub.opensolaris.org/bin/view/Community+Group+on/2009093001

Rob



kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 6, 2009 12:10 PM   in response to: picker

Right now I do not dare to use builds later than 125, because in b126 the problem showed up. Maybe a coincidence, maybe not. But I think it is best to not use b126 or later, until someone has confirmed there are no driver changes.

So, to confirm, there are no driver changes in b126 for the marvell88sx2, right? So I should safely be able to use b126 and later?

tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 6, 2009 3:39 PM   in response to: kebabber




On Fri, Nov 6, 2009 at 2:10 PM, Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com> wrote:
> Right now I do not dare to use builds later than 125, because in b126 the problem showed up. Maybe a coincidence, maybe not. But I think it is best to not use b126 or later, until someone has confirmed there are no driver changes.
>
> So, to confirm, there are no driver changes in b126 for the marvell88sx2, right? So I should safely be able to use b126 and later?



Let me know what your results are if you decide to upgrade.  I've already replaced both drives that were having issues.  I'll do cables later, but I'm still having a hard time believing my cables magically went bad right when I upgraded to build 126.  The new drives, which are a different brand and model, have the same issues the old drives did.

And from what I can tell, I'm getting checksum errors through the roof on the replace as well...



  pool: fserv
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver in progress for 0h34m, 22.60% done, 1h57m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            spare-2                 DEGRADED     0     0     0
              14340903866396142118  UNAVAIL      0     0     0  was /dev/dsk/c8t2d0s0
              c7t6d0                ONLINE       0     0     0
            c8t3d0                  REMOVED      0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            c7t3d0                  ONLINE       0     0     0
            replacing-10            DEGRADED     0     0  816K
              15401866802517339500  FAULTED      0     0     0  was /dev/dsk/c7t4d0s0/old
              c7t4d0                ONLINE       0     0     0  52.3G resilvered
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    INUSE     currently in use





--Tim


kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 7, 2009 2:27 AM   in response to: tcook

Ok, so you changed drives and you still see errors? Are the drives brand new or used? What kind of drives, which brand? 2TB? And if you reboot into an earlier build such as b125 you don't see any errors, right?

Right now I am running b125. I don't dare to run b126, if your observation is correct. Could you just rip out drivers from b125? I could post those drivers here for you, if you tell me which files you need. And then you can see if it is the drivers causing the problem or not.

tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 7, 2009 9:06 AM   in response to: kebabber




On Sat, Nov 7, 2009 at 4:27 AM, Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com> wrote:
> Ok, so you changed drives and you still see errors? Are the drives brand new or used? What kind of drives, which brand? 2TB? And if you reboot into an earlier build such as b125 you don't see any errors, right?

Brand new. I've tried both 1TB Hitachi and 1.5TB Seagate (not the "bad" ones).

I can't boot into an older version because the last version I had was b118 which doesn't have zfs version 19 support.  I've been looking to see if there's a way to downgrade via IPS but that's turned up a lot of nothing.

> Right now I am running b125. I don't dare to run b126, if your observation is correct. Could you just rip out drivers from b125? I could post those drivers here for you, if you tell me which files you need. And then you can see if it is the drivers causing the problem or not.


It's tough to say what exactly is causing the problems.  I would imagine ripping something like sd from the older version would break more than it would fix.

--Tim


cindys

Posts: 404
From: US

Registered: 11/2/05
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 7, 2009 10:02 AM   in response to: tcook


Hi Tim and all,

I believe you are saying that marvell88sx2 driver error messages started
in build 126, along with new disk errors in RAIDZ pools.

Is this correct? If so, please send me the following information:

1. Hardware you are running

2. If you are also seeing new disk errors in your RAIDZ pools
include your zpool status output.

I'm not the right person to be diagnosing driver-level issues but I will
investigate.

Thanks,

Cindy


----- Original Message -----
From: Tim Cook <tim at cook dot ms>
Date: Saturday, November 7, 2009 10:08 am
Subject: Re: [zfs-discuss] marvell88sx2 driver build126
To: Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com>
Cc: zfs-discuss at opensolaris dot org

> On Sat, Nov 7, 2009 at 4:27 AM, Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com> wrote:
>
> > Ok, so you changed drives and you still see errors? Are the drives brand
> > new or used? What kind of drives, which brand? 2TB? And if you reboot into
> > an earlier build such as b125 you don't see any errors, right?
>
> Brand new. I've tried both 1TB Hitachi and 1.5TB Seagate (not the "bad"
> ones).
>
> I can't boot into an older version because the last version I had was b118
> which doesn't have zfs version 19 support. I've been looking to see if
> there's a way to downgrade via IPS but that's turned up a lot of nothing.
>
> > Right now I am running b125. I don't dare to run b126, if your observation
> > is correct. Could you just rip out drivers from b125? I could post those
> > drivers here for you, if you tell me which files you need. And then you can
> > see if it is the drivers causing the problem or not.
>
> It's tough to say what exactly is causing the problems. I would imagine
> ripping something like sd from the older version would break more than it
> would fix.
>
> --Tim


tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 7, 2009 10:44 AM   in response to: cindys




On Sat, Nov 7, 2009 at 12:02 PM, Cindy Swearingen <Cindy dot Swearingen at sun dot com> wrote:
> Hi Tim and all,
>
> I believe you are saying that marvell88sx2 driver error messages started
> in build 126, along with new disk errors in RAIDZ pools.
>
> Is this correct? If so, please send me the following information:

Yes.

> 1. Hardware you are running

Motherboard: SUPERMICRO MBD-H8DAE-2-O
2xAMD Opteron 22xx CPUs  (I forget the exact model, they're 2010 MHz)
8GB Crucial ECC DDR2 memory
2xSupermicro AOC-SAT2-MV8 SATA adapters
Supermicro SC932T-R760B case with 15xSATA passthrough backplane

I also have an nvidia video card in it, but I'm not sure of the model, and doubt it has any role in this troubleshooting.


 

> 2. If you are also seeing new disk errors in your RAIDZ pools
> include your zpool status output.

Well, I can give you a current one, but I've done about a hundred things troubleshooting, so it isn't representative of what the issues were a few days ago.  I'm still trying to figure out why it's choking on any drive I put into c8t2d0.  It's stopped generating errors on c7t4d0, but I haven't changed a thing with that slot outside of stopping the zpool replace and restarting it a few times... which is also extremely odd to me.

r00t@fserv:~$ zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 2h53m with 0 errors on Fri Nov  6 22:09:08 2009
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            spare-2                 DEGRADED     0     0     0
              14340903866396142118  UNAVAIL      0     0     0  was /dev/dsk/c8t2d0s0
              c7t6d0                ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0  2.68G resilvered
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            c7t3d0                  ONLINE       0     0     0
            c7t4d0                  ONLINE       0     0     0  231G resilvered
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    INUSE     currently in use

errors: No known data errors


kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 7, 2009 1:33 PM   in response to: cindys

I saw the same checksum error problem when I booted into b126. I haven't dared try b126 again, I use b125 now, without problems. Here is my hardware:
Intel Q9450 + P45 Gigabyte EP45-DS3P motherboard + Ati 4850
I have the same AOC SATA controller card. And some Samsung Spinpoint F1, 1TB drives. Brand new.

cindys

Posts: 404
From: US

Registered: 11/2/05
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 9, 2009 12:51 PM   in response to: kebabber


Hi,

I can't find any bug-related issues with marvell88sx2 in b126.

I looked over Dave Hollister's shoulder while he searched for
marvell in his webrevs of this putback and nothing came up:

> driver change with build 126?
not for the SATA framework, but for HBAs there is:
http://hub.opensolaris.org/bin/view/Community+Group+on/2009093001

I will find a thumper, load build 125, create a raidz pool, and
upgrade to b126.

I'll also send the error messages that Tim provided to someone who
works in the driver group.

Thanks,

Cindy

On 11/07/09 14:33, Orvar Korvar wrote:
> I saw the same checksum error problem when I booted into b126. I haven't dared try b126 again, I use b125 now, without problems. Here is my hardware:
> Intel Q9450 + P45 Gigabyte EP45-DS3P motherboard + Ati 4850
> I have the same AOC SATA controller card. And some Samsung Spinpoint F1, 1TB drives. Brand new.


tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 9, 2009 11:59 PM   in response to: cindys




On Mon, Nov 9, 2009 at 2:51 PM, Cindy Swearingen <Cindy dot Swearingen at sun dot com> wrote:
> Hi,
>
> I can't find any bug-related issues with marvell88sx2 in b126.
>
> I looked over Dave Hollister's shoulder while he searched for
> marvell in his webrevs of this putback and nothing came up:
>
> > driver change with build 126?
> not for the SATA framework, but for HBAs there is:
> http://hub.opensolaris.org/bin/view/Community+Group+on/2009093001
>
> I will find a thumper, load build 125, create a raidz pool, and
> upgrade to b126.
>
> I'll also send the error messages that Tim provided to someone who
> works in the driver group.
>
> Thanks,
>
> Cindy


I tried the build 125 driver and it didn't make a difference.  The odd part I've just noticed is that it's port 4 on both cards that has been giving me issues.  I guess it's possible it's just a coincidence/bad luck.

I've grabbed the b125 ISO from genunix and am going to try booting off the livecd to see if it produces different results.

--Tim


kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 10, 2009 1:25 AM   in response to: cindys

Does this mean that there are no driver changes in marvell88sx2, between b125 and b126? If no driver changes, then it means that we both had extremely bad luck with our drives, because we both had checksum errors? And my discs were brand new.

How probable is this? Something is weird here. What is your opinion on this? Should we agree that there was a hardware error, and it was just a coincidence?

cindys

Posts: 404
From: US

Registered: 11/2/05
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 10, 2009 7:56 AM   in response to: kebabber


Hi Orvar,

Correct, I don't see any marvell88sx2 driver changes between b125 and b126.

So far, only you and Tim are reporting these issues.

Generally, we see bugs filed by the internal test teams if they see
similar problems.

I will try to reproduce the RAIDZ checksum errors separately from the
marvell88sx2 issue.

Thanks,

Cindy


On 11/10/09 02:25, Orvar Korvar wrote:
> Does this mean that there are no driver changes in marvell88sx2, between b125 and b126? If no driver changes, then it means that we both had extremely bad luck with our drives, because we both had checksum errors? And my discs were brand new.
>
> How probable is this? Something is weird here. What is your opinion on this? Should we agree that there was a hardware error, and it was just a coincidence?


relling

Posts: 1,858
From: US

Registered: 6/17/05
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 10, 2009 8:55 AM   in response to: kebabber



On Nov 10, 2009, at 1:25 AM, Orvar Korvar wrote:

> Does this mean that there are no driver changes in marvell88sx2,
> between b125 and b126? If no driver changes, then it means that we
> both had extremely bad luck with our drives, because we both had
> checksum errors? And my discs were brand new.

There are other drivers in the software stack that may have changed.
-- richard

>
> How probable is this? Something is weird here. What is your opinion
> on this? Should we agree that there was a hardware error, and it was
> just a coincidence?


tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 10, 2009 3:15 PM   in response to: relling




On Tue, Nov 10, 2009 at 10:55 AM, Richard Elling <richard dot elling at gmail dot com> wrote:

> On Nov 10, 2009, at 1:25 AM, Orvar Korvar wrote:
>
> > Does this mean that there are no driver changes in marvell88sx2, between b125 and b126? If no driver changes, then it means that we both had extremely bad luck with our drives, because we both had checksum errors? And my discs were brand new.
>
> There are other drivers in the software stack that may have changed.
>  -- richard
>
> > How probable is this? Something is weird here. What is your opinion on this? Should we agree that there was a hardware error, and it was just a coincidence?


So... I do appear to have reached somewhat of a truce with the system and b126 at the moment.  I'm now going through and replacing the last of my old Maxtor 300GB drives with brand new Hitachi 1TB drives.  One thing I'm noticing is a lot of checksum errors being generated during the resilver.  Is this normal?  Furthermore, since I see "no known data errors", is it safe to assume it's all being corrected, and I'm not losing any data?  I still do have a separate copy of this data on a box at work that should be completely consistent... but I will need to re-purpose that storage soon, and will be without a known good backup for a while (I know, I know).  I'd rather do a fresh zfs send/receive than find out 6 months from now I lost something.

  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h8m, 0.89% done, 15h14m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            c8t2d0                  ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            replacing-9             DEGRADED     0     0  161K
              14274451003165180679  FAULTED      0     0     0  was /dev/dsk/c7t3d0s0/old
              c7t3d0                ONLINE       0     0     0  2.05G resilvered
            c7t4d0                  ONLINE       0     0     0
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    AVAIL
 
errors: No known data errors


--Tim



tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 10, 2009 9:01 PM   in response to: tcook




On Tue, Nov 10, 2009 at 5:15 PM, Tim Cook <tim at cook dot ms> wrote:


On Tue, Nov 10, 2009 at 10:55 AM, Richard Elling <richard dot elling at gmail dot com> wrote:

On Nov 10, 2009, at 1:25 AM, Orvar Korvar wrote:

Does this mean that there are no driver changes in marvell88sx2, between b125 and b126? If no driver changes, then it means that we both had extremely bad luck with our drives, because we both had checksum errors? And my discs were brand new.

There are other drivers in the software stack that may have changed.
 -- richard



How probable is this? Something is weird here. What is your opinion on this? Should we agree that there was a hardware error, and it was just a coincidence?


So... I do appear to have reached somewhat of a truce with the system and b126 at the moment.  I'm now going through and replacing the last of my old Maxtor 300GB drives with brand new Hitachi 1TB drives.  One thing I'm noticing is a lot of checksum errors being generated during the resilver.  Is this normal?  Furthermore, since I see "No known data errors", is it safe to assume it's all being corrected, and I'm not losing any data?  I still do have a separate copy of this data on a box at work that should be completely consistent... but I will need to re-purpose that storage soon, and will be without a known good backup for a while (I know, I know).  I'd rather do a fresh zfs send/receive than find out 6 months from now that I lost something.

  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h8m, 0.89% done, 15h14m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            c8t2d0                  ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            replacing-9             DEGRADED     0     0  161K
              14274451003165180679  FAULTED      0     0     0  was /dev/dsk/c7t3d0s0/old
              c7t3d0                ONLINE       0     0     0  2.05G resilvered
            c7t4d0                  ONLINE       0     0     0
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    AVAIL

errors: No known data errors


--Tim



Anyone?  It's up to 7.35M checksum errors and it's rebuilding extremely slowly (as evidenced by the 10 hours still remaining).  The errors are only showing on the "replacing-9" line, not on the individual drive.


  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 6h56m, 39.61% done, 10h34m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fserv                       DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  ONLINE       0     0     0
            c8t2d0                  ONLINE       0     0     0
            c8t3d0                  ONLINE       0     0     0
            c8t4d0                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c7t0d0                  ONLINE       0     0     0
            c7t1d0                  ONLINE       0     0     0
            c7t2d0                  ONLINE       0     0     0
            replacing-9             DEGRADED     0     0 7.35M
              14274451003165180679  FAULTED      0     0     0  was /dev/dsk/c7t3d0s0/old
              c7t3d0                ONLINE       0     0     0  91.9G resilvered
            c7t4d0                  ONLINE       0     0     0
            c7t5d0                  ONLINE       0     0     0
        spares
          c7t6d0                    AVAIL  

errors: No known data errors



--Tim
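
As a rough cross-check of the "10h34m to go" figure in the status output above, a straight linear extrapolation from "6h56m, 39.61% done" lands on the same number. This is only a sketch: the function name `resilver_eta` is made up here, and the constant-rate assumption is a guess, not necessarily how zpool computes its estimate.

```shell
#!/bin/sh
# Hypothetical helper: estimate remaining resilver time, assuming a constant rate.
# $1 = elapsed hours, $2 = elapsed minutes, $3 = percent done
resilver_eta() {
    awk -v h="$1" -v m="$2" -v p="$3" 'BEGIN {
        elapsed = h * 60 + m          # minutes spent so far
        total   = elapsed * 100 / p   # projected total minutes at the same rate
        left    = total - elapsed     # minutes still to go
        printf "%dh%02dm\n", int(left / 60), int(left % 60)
    }'
}

resilver_eta 6 56 39.61   # prints 10h34m
```

At that rate the whole resilver takes roughly 17.5 hours for a single 1TB drive, which is the slowness being asked about.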



caphill

Posts: 4
From:

Registered: 7/28/09
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 11, 2009 5:24 AM   in response to: tcook


On Nov 11, 2009, at 12:01 AM, Tim Cook wrote:

> On Tue, Nov 10, 2009 at 5:15 PM, Tim Cook <tim at cook dot ms> wrote:
>> One thing I'm
>> noticing is a lot of checksum errors being generated during the
>> resilver.
>> Is this normal?

> Anyone? It's up to 7.35M checksum errors and it's rebuilding
> extremely
> slowly (as evidenced by the 10 hour time). The errors are only
> showing on
> the "replacing-9" line, not the individual drive.

I've only replaced a drive once, but it didn't show any checksum
errors during the resilver. This was a 2 TB WD Green drive in a
mirror pool that had started to show write errors. It was attached to
a SuperMicro AOC-SAT2-MV8.

Good luck,
Ware


taylor

Posts: 29
From:

Registered: 3/9/05
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 11, 2009 9:23 AM   in response to: tcook

The checksum errors are fixed in build 128 with:

6807339 spurious checksum errors when replacing a vdev

No; you're not losing any data due to this.

- Eric
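
To see at a glance which rows of a `zpool status` listing carry a non-zero CKSUM count (here only the "replacing-9" row, not the disks under it), the output can be filtered with awk. A minimal sketch: the function name `cksum_rows` is invented for illustration, and it assumes the five-column NAME/STATE/READ/WRITE/CKSUM layout shown in this thread.

```shell
#!/bin/sh
# Hypothetical filter: read `zpool status` text on stdin and print
# "<vdev> <cksum>" for every row whose CKSUM column is not 0.
cksum_rows() {
    awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ && NF >= 5 && $5 != "0" {
        print $1, $5
    }'
}

# Usage on a live system would be:  zpool status fserv | cksum_rows
# Here it is fed two rows copied from the status output earlier in the thread:
printf '%s\n' \
    '            replacing-9             DEGRADED     0     0 7.35M' \
    '              c7t3d0                ONLINE       0     0     0  91.9G resilvered' \
    | cksum_rows
# prints: replacing-9 7.35M
```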

kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 11, 2009 1:23 PM   in response to: taylor

So he did actually hit a bug? But the bug is not dangerous, as it doesn't destroy data?

But I did not replace any devices and it still showed checksum errors. I think I did a zfs send | zfs receive? I don't remember. But I just copied things back and forth, and the checksum errors showed up. So what does that mean?

kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 11, 2009 1:38 AM   in response to: relling

Other drivers in the stack? Which drivers? And have any of them been changed between b125 and b126?

tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 11, 2009 7:59 AM   in response to: kebabber




On Wed, Nov 11, 2009 at 3:38 AM, Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com> wrote:
Other drivers in the stack? Which drivers? And have any of them been changed between b125 and b126?


Looks like the sd driver, for one.

http://dlc.sun.com/osol/on/downloads/b126/on-changelog-b126.html

--Tim


kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 3:57 AM   in response to: tcook

"I can't boot into an older version because the last version I had was b118 which doesn't have zfs version 19 support. I've been looking to see if there's a way to downgrade via IPS but that's turned up a lot of nothing."

If someone can tell me which files are needed for the driver I can extract them from my b125 and post them here for you, so you can try out. Then we can know if the problem is in b126 drivers or not. If b125 drivers work, we know the problem is in b126. Otherwise there might be some other problem.

Another solution could be to install SXCE b125. There are links to the b125 DVD. And from SXCE you can upgrade to later OpenSolaris builds, I think.

Is it possible to upgrade to a specific build via IPS? When I use the Update Manager, I always upgrade to the latest build. Can I target, say, bXXX? Or is the only way to get bXXX, by installing SXCE?

myxiplx

Posts: 877
From: GB

Registered: 10/24/07
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 6:31 AM   in response to: kebabber

> "I can't boot into an older version because the last
> version I had was b118 which doesn't have zfs version
> 19 support. I've been looking to see if there's a
> way to downgrade via IPS but that's turned up a lot
> of nothing."
>
> If someone can tell me which files are needed for the
> driver I can extract them from my b125 and post them
> here for you, so you can try out. Then we can know if
> the problem is in b126 drivers or not. If b125
> drivers work, we know the problem is in b126.
> Otherwise there might be some other problem.
>
> Another solution could be that you install SXCE b125.
> There are links to that DVD b125. And from SXCE you
> can upgrade to later OpenSolaris builds. I think.
>
> Is it possible to upgrade to a specific build via
> IPS? When I use the Update Manager, I always upgrade
> to the latest build. Can I target, say, bXXX? Or is
> the only way to get bXXX, by installing SXCE?

Here are some notes I stole from the list earlier. I think they might be on a wiki somewhere now, but it seems relatively easy to upgrade to a specific build:

Starting from OpenSolaris 2009.06 (snv_111b) active BE.

1) beadm create snv_111b-dev
2) beadm activate snv_111b-dev
3) reboot
4) pkg set-authority -O http://pkg.opensolaris.org/dev opensolaris.org
5) pkg install SUNWipkg
6) pkg list 'entire*'
7) beadm create snv_118
8) beadm mount snv_118 /mnt
9) pkg -R /mnt refresh
10) pkg -R /mnt install entire@0.5.11-0.118
11) bootadm update-archive -R /mnt
12) beadm umount snv_118
13) beadm activate snv_118
14) reboot

Now you have a snv_118 development environment.
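
Steps 7 to 14 above only differ by the build number, so they can be generated from one parameter. The following is a sketch, not a tested upgrade script: the function name `print_upgrade_steps` is made up, it only prints the commands instead of running them, and it assumes the dev repository has already been set up as in steps 4 and 5.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) steps 7-14 for a given build number.
print_upgrade_steps() {
    build="$1"          # e.g. 118
    be="snv_${build}"   # boot environment name, following the naming used above
    cat <<EOF
beadm create ${be}
beadm mount ${be} /mnt
pkg -R /mnt refresh
pkg -R /mnt install entire@0.5.11-0.${build}
bootadm update-archive -R /mnt
beadm umount ${be}
beadm activate ${be}
reboot
EOF
}

print_upgrade_steps 118
```

Printing the commands first makes it easy to review them before pasting them into a root shell one at a time.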

kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 7:47 AM   in response to: myxiplx

Great! So if I want another build, for instance b125, I just change step 10?
10) pkg -R /mnt install entire@0.5.11-0.125
Yes?

What is this "0.5.11" thing? Should that be changed too, if I try to install b125? Like "0.5.12-0.125"?

tcook

Posts: 591
From: US

Registered: 8/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 8:54 AM   in response to: kebabber




On Sun, Nov 8, 2009 at 9:47 AM, Orvar Korvar <knatte_fnatte_tjatte at yahoo dot com> wrote:
Great! So if I want another build, for instance b125, I just change step 10?
10) pkg -R /mnt install entire@0.5.11-0.125
Yes?

What is this "0.5.11" thing? Should that be changed too, if I try to install b125? Like "0.5.12-0.125"?


No.  That's the SunOS version number, and you should always use 0.5.11- for anything in OpenSolaris today.  Solaris 10 = "5.10", OpenSolaris = "5.11", Solaris 9 = "5.9", and so on.

http://en.wikipedia.org/wiki/Solaris_%28operating_system%29

--Tim
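
To make the two parts of the version string explicit, it can be split with plain shell parameter expansion. A small sketch: the function name `split_entire_version` is invented here, and it only handles the short `entire@0.5.11-0.NNN` form used in this thread, not a full package FMRI.

```shell
#!/bin/sh
# Hypothetical helper: split "entire@0.5.11-0.125" into SunOS release and build.
split_entire_version() {
    ver="${1#*@}"          # drop the "entire@" prefix if present -> 0.5.11-0.125
    release="${ver%%-*}"   # part before the dash -> 0.5.11 (stays fixed)
    branch="${ver#*-}"     # part after the dash  -> 0.125  (changes per build)
    echo "SunOS ${release#0.} build ${branch#0.}"
}

split_entire_version entire@0.5.11-0.125   # prints: SunOS 5.11 build 125
```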



nwsmith

Posts: 426
From: GB

Registered: 3/21/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 12:08 PM   in response to: kebabber

I think you can work out the files for the driver by looking here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/pkgdefs/SUNWmv88sx/prototype_i386

So the 32 bit driver is:

kernel/drv/marvell88sx

And the 64 bit driver is:

kernel/drv/amd64/marvell88sx

It's a pity that the marvell driver is not open source.
For the sata drivers that are open source,

ahci, nv_sata, si3124

..you can see the history of all the changes to the source code
of the drivers, all cross referenced to the bug numbers, using OpenGrok:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/sata/adapters/

Regards
Nigel Smith

kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 2:04 PM   in response to: nwsmith

Ok, here I attached the 64 bit variant. You can try it if you wish and see if the checksum errors disappear.

kebabber

Posts: 778
From:

Registered: 2/14/06
Re: [zfs-discuss] marvell88sx2 driver build126
Posted: Nov 8, 2009 2:05 PM   in response to: kebabber

This is from build 125.



