Heads-up: x86 snv_24, rootnex, and adp.

Date: Tue, 27 Sep 2005 14:22:17 -0400
From: Mark Johnson <mark dot johnson at sun dot com>
To: onnv-gate at onnv dot eng dot sun dot com
Subject: Heads-up: x86 snv_24, rootnex, and adp.


This heads-up only applies to x86/x64 machines.

I have a report of a problem with a BFU to b24 on
x86 which appears to be related to my rootnex putback.

Specifically, the configuration having a problem is
a debug version of the x86 kernel and the adp driver.
If anyone has an adp based adapter w/ b24, and either
sees or doesn't see a problem, please contact me.

  e-mail:     mark dot johnson at sun dot com
  extension:  x20869
  cell phone: 603-566-4364

If you have an Adaptec adapter which uses adp, read on...

Thanks,


MRJ


I believe this problem is isolated to the adp driver
on a debug kernel, and that it will not happen on
a non-debug kernel (or non-debug version of rootnex to
be more specific). But I don't have a config to prove/disprove
this yet. If this turns out to be the case, see the
workarounds below...

The following error message was seen.

WARNING: /pci@0,0/pci9004,7881@6/sd@0,0 (sd2):
         Error setting up next portion of DMA transfer

Here's a portion of my original heads-up:

 > From onnv-gate-request Fri Sep  9 08:45:04 2005
 > This heads-up only applies to x86/x64 machines.
 >
 > My recent putback of:
 >
 > 4699148 some ddivs_dmae assertions FAIL due to ddi_dma* (9f,s) product or manpages bugs
 > 4739176 ddi_dma_sync.9f (ddi_dma_sync()) interface differ from one described in manpage
 > 6213398 x86 rootnex ignores offset and size on ddi_dma_sync()
 > 6218329 rootnex_io_brkup_attr can pass negative segment sizes to rootnex_get_phyaddr
 > 6262957 x86 rootnex should pre-allocate some cookies for performance
 > 6262959 x86 rootnex causes a lot of xcalls when using copy buffers
 > 6264169 x86 rootnex dma routines need cleanup
 > 6288756 Opteron kernel leaks memory and DMA resources when ddi_dma_addr_bind_handle() fails.
 > 6291263 In the i86pc rootnex module, INT_MAX_BUF should be bigger (at least MMU_PAGESIZE byes bigger).
 >
 > makes some considerable changes to the x86 rootnex DDI DMA routines. Although
 > the changes have been very well tested, it is still possible that a couple of
 > drivers could break due to either a bug introduced in the x86 rootnex dma
 > routines or a previously undetected driver bug which is now caught by the ddi
 > dma routines.


Some quick background. The debug version of rootnex now performs additional
checks on some DDI DMA parameters to help catch driver bugs. If the problem
only occurs in the debug kernel, it is highly likely that the adp driver
is failing these checks.

Again, if this turns out to be the problem, you can turn off these checks
in the debug kernel by putting the following in /etc/system.
(rootnex_unbind_verify_buffer is already off due to an elxl bug, but
I included it for future reference).

set rootnex:rootnex_alloc_check_parms=0
set rootnex:rootnex_bind_check_parms=0
set rootnex:rootnex_bind_check_inuse=0
set rootnex:rootnex_unbind_verify_buffer=0
set rootnex:rootnex_sync_check_parms=0

If you already have bfu'd/installed a debug version of snv_24 on
a system with adp, and you cannot boot, you can do the following
to remedy that...

o When the grub menu comes up, select the kernel you're
   going to boot, then hit the 'e' key to edit the
   grub menu (don't worry, these changes will *not* be saved)

o use the arrow key to move down to the
   "kernel /platform/i86pc/.." line

o hit the 'e' key to edit that line. Assuming
   you don't have something custom there already,
   add a space then kmdb -d after multiboot

o hit return to get back to the previous menu.
   Then hit the 'b' key to boot your system.

o Now you should see a kmdb prompt. Do the following..

[0]> ::bp rootnex`rootnex_attach
[0]> :c
[CUT]
Loaded modules: [ specfs ]
kmdb: stop at rootnex`rootnex_attach
kmdb: target stopped at:
rootnex`rootnex_attach: pushq  %rbp
[0]> rootnex_alloc_check_parms?W 0
rootnex`rootnex_alloc_check_parms:              1               =       0x0
[0]> rootnex_bind_check_parms?W 0
rootnex`rootnex_bind_check_parms:               1               =       0x0
[0]> rootnex_bind_check_inuse?W 0
rootnex`rootnex_bind_check_inuse:               1               =       0x0
[0]> rootnex_unbind_verify_buffer?W 0
rootnex`rootnex_unbind_verify_buffer:           0               =       0x0
[0]> rootnex_sync_check_parms?W 0
rootnex`rootnex_sync_check_parms:               1               =       0x0
[0]> :c


Hopefully you're booting successfully now. If so,
add the following to your /etc/system file and you
should be all set.

set rootnex:rootnex_alloc_check_parms=0
set rootnex:rootnex_bind_check_parms=0
set rootnex:rootnex_bind_check_inuse=0
set rootnex:rootnex_unbind_verify_buffer=0
set rootnex:rootnex_sync_check_parms=0
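
If you have several test machines to fix up, the /etc/system edit can be
scripted. What follows is only a minimal sketch and not part of the
putback: the helper name add_rootnex_tunables is hypothetical, and it
assumes you pass it the current contents of /etc/system as a string and
write the result back yourself. It appends only the "set" lines that are
missing, and it leaves any existing setting for these variables alone
(even if its value is not 0).

```python
# Hypothetical helper (an illustration, not from the original heads-up).
# The variable names are the five rootnex debug checks listed above.
ROOTNEX_TUNABLES = [
    "rootnex_alloc_check_parms",
    "rootnex_bind_check_parms",
    "rootnex_bind_check_inuse",
    "rootnex_unbind_verify_buffer",
    "rootnex_sync_check_parms",
]


def add_rootnex_tunables(system_text):
    """Return system_text with any missing 'set rootnex:<name>=0' lines appended."""
    present = set()
    for line in system_text.splitlines():
        parts = line.strip().split(None, 1)
        # Only "set <module>:<variable>=<value>" lines matter here; comments,
        # blank lines and other directives are skipped.
        if len(parts) != 2 or parts[0] != "set":
            continue
        # "rootnex:foo = 0" -> "rootnex:foo"
        present.add(parts[1].split("=", 1)[0].strip())

    missing = [n for n in ROOTNEX_TUNABLES if "rootnex:" + n not in present]
    if not missing:
        return system_text

    out = system_text
    if out and not out.endswith("\n"):
        out += "\n"
    for name in missing:
        out += "set rootnex:%s=0\n" % name
    return out
```

Running it twice on the same text adds nothing the second time, so it
should be safe to re-run. Keep a copy of the original /etc/system before
writing the result back.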


-- 
Mark Johnson <mark dot johnson at sun dot com>
Sun Microsystems, Inc.
(781) 442-0869