OpenSolaris

Thread: conditional variable debugging

Replies: 10 - Last Post: Jun 9, 2006 12:52 PM by: meem
pavan

Posts: 31
From: US

Registered: 11/7/05
conditional variable debugging
Posted: Jun 6, 2006 7:37 PM

Hi All,
I am trying to debug a condition variable in the following situation:

I have a daemon running, and I wake it by calling cv_signal() on a
condition variable. The first request I send to it works fine, but on the
second request I am not able to wake the daemon with cv_signal(). I am guessing
that there is some problem with the condition variable and am trying to debug
that. I can see the daemon kernel thread running.


There is a structure with a condition variable:

struct a {
int ...;
...
...
kcondvar_t something;
};

Suppose the address of "struct a" after allocation is 100 and the address of
the condition variable is 120.

Questions:

1. If I do a "::wchaninfo", should that list the address of the condition variable
as 120?

2. ::wchaninfo ==> dump condition variable
Does this dump the addresses of all condition variables that are active at that
point in time?


Regards,
Pavan

rbourbon

Posts: 505
From: Grenoble, France

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 8, 2006 3:48 AM   in response to: pavan


Quick background check: Is the thread calling cv_signal()
holding the lock passed to cv_wait()? Normally it should be.


-r

_______________________________________________
mdb-discuss mailing list
mdb-discuss at opensolaris dot org



pavan

Posts: 31
From: US

Registered: 11/7/05
Re: conditional variable debugging
Posted: Jun 8, 2006 9:32 AM   in response to: rbourbon

Yes it is.



Roch wrote:

>Quick background check: Is the thread calling cv_signal()
>holding the lock passed to cv_wait ? Normally it should.
>
>
>-r
>
>
>



sommerfe

Posts: 975
From: US

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 8, 2006 1:54 PM   in response to: rbourbon

On Thu, 2006-06-08 at 06:48, Roch wrote:
> Quick background check: Is the thread calling cv_signal()
> holding the lock passed to cv_wait ? Normally it should.

So, for what it's worth, I've seen a lot of people confused about why
this is necessary or even helpful.

IIRC the main reason for this was to avoid priority inversion -- if you
drop the lock and get preempted by a higher-priority thread before you
can cv_signal(), the cv_signal won't happen for a while and so a
higher-priority thread in cv_wait() might not get to run for a while.

Someone please correct me if I'm either (a) wrong about this, or (b)
there are other reasons besides this one...

- Bill





meem

Posts: 3,045
From: US

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 8, 2006 6:52 PM   in response to: sommerfe


> IIRC the main reason for this was to avoid priority inversion -- if you
> drop the lock and get preempted by a higher-priority thread before you
> can cv_signal(), the cv_signal won't happen for a while and so a
> higher-priority thread in cv_wait() might not get to run for a while.

OTOH, if you cv_signal() while you're holding the lock, isn't it possible
the signaled thread will wake up, be unable to grab the lock (since the
signalling thread is still holding it), go back to sleep (or perhaps spin) and
have to be poked again when the signalling thread actually does drop the
lock?

--
meem


Alexander Kolba...
akolb@eng.sun.com
Re: conditional variable debugging
Posted: Jun 9, 2006 11:30 AM   in response to: meem

>
> > IIRC the main reason for this was to avoid priority inversion -- if you
> > drop the lock and get preempted by a higher-priority thread before you
> > can cv_signal(), the cv_signal won't happen for a while and so a
> > higher-priority thread in cv_wait() might not get to run for a while.
>
> OTOH, if you cv_signal() while you're holding the lock, isn't it possible
> the signaled thread will wake up, be unable to grab the lock (since the
> signalling thread is still holding it), go back to sleep (or perhaps spin) and
> have to be poked again when the signalling thread actually does drop the
> lock?

Whether this is good or bad, this is the official requirement. A quote from
the condvar(9F):

cv_signal() signals the condition and wakes one blocked
thread. All blocked threads can be unblocked by calling
cv_broadcast(). You must acquire the mutex passed into
cv_wait() before calling cv_signal() or cv_broadcast().

I bet, though, that some pieces of Solaris code ignore this requirement - for
better or for worse.

- Alex Kolbasov



meem

Posts: 3,045
From: US

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 9, 2006 12:24 PM   in response to: Alexander Kolba...


> Whether this is good or bad, this is the official requirement. A quote from
> the condvar(9F):
>
> cv_signal() signals the condition and wakes one blocked
> thread. All blocked threads can be unblocked by calling
> cv_broadcast(). You must acquire the mutex passed into
> cv_wait() before calling cv_signal() or cv_broadcast().

Further, cond_signal(3C) states:

The cond_broadcast() function unblocks all threads that are
blocked on the condition variable pointed to by cvp.

If no threads are blocked on the condition variable, then
cond_signal() and cond_broadcast() have no effect.

Both functions should be called under the protection of the
same mutex that is used with the condition variable being
signaled. Otherwise, the condition variable may be signaled
between the test of the associated condition and blocking in
cond_wait(). This can cause an infinite wait.

... but that rationale looks bogus to me. In particular, the thread
heading into cond_wait() must have already tested the condition under the
lock and concluded it was false in order to decide to cond_wait(). Since
any thread changing state that would affect the condition must also be
holding the lock, there is no way for the state (and thus the outcome of
the test) to change between the test and the cond_wait(), and thus any
cond_signal() sent during that window would end up being spurious anyway.

Please feel free to prove me wrong :-)
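
The argument above can be sketched with a userland POSIX threads analog. This is an illustration under assumptions, not the kernel cv_wait()/cv_signal() code from the original question: the names struct reqq, daemon_thread, post_request, and run_two_requests are hypothetical. Because the waiter tests its predicate under the lock and the poster changes it under the same lock, a signal cannot slip in between the test and the wait.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical request queue; all names here are illustrative. */
struct reqq {
	pthread_mutex_t lock;
	pthread_cond_t  cv;
	int             pending;  /* predicate state, protected by lock */
	int             handled;  /* requests serviced, protected by lock */
};

/* Waiter: re-tests the predicate under the lock every time it wakes. */
static void *daemon_thread(void *arg)
{
	struct reqq *q = arg;

	pthread_mutex_lock(&q->lock);
	while (q->handled < 2) {
		while (q->pending == 0)
			pthread_cond_wait(&q->cv, &q->lock);
		q->pending--;
		q->handled++;
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/* Poster: changes the predicate and signals while holding the lock. */
static void post_request(struct reqq *q)
{
	pthread_mutex_lock(&q->lock);
	q->pending++;
	pthread_cond_signal(&q->cv);
	pthread_mutex_unlock(&q->lock);
}

/* Posts two requests and returns how many the daemon handled. */
int run_two_requests(void)
{
	struct reqq q;
	pthread_t tid;

	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.cv, NULL);
	q.pending = 0;
	q.handled = 0;

	if (pthread_create(&tid, NULL, daemon_thread, &q) != 0)
		return -1;
	post_request(&q);
	post_request(&q);
	pthread_join(tid, NULL);

	pthread_cond_destroy(&q.cv);
	pthread_mutex_destroy(&q.lock);
	return q.handled;
}
```

Even if both requests are posted before the daemon thread first runs, the counter records them, so the daemon never blocks on a wakeup it has already missed.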

--
meem


jwadams

Posts: 370
From: San Francisco, CA

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 9, 2006 12:52 PM   in response to: meem

On Fri, Jun 09, 2006 at 03:24:00PM -0400, Peter Memishian wrote:
>
> > Whether this is good or bad, this is the official requirement. A quote from
> > the condvar(9F):
> >
> > cv_signal() signals the condition and wakes one blocked
> > thread. All blocked threads can be unblocked by calling
> > cv_broadcast(). You must acquire the mutex passed into
> > cv_wait() before calling cv_signal() or cv_broadcast().
>
> Further, cond_signal(3C) states:
>
> The cond_broadcast() function unblocks all threads that are
> blocked on the condition variable pointed to by cvp.
>
> If no threads are blocked on the condition variable, then
> cond_signal() and cond_broadcast() have no effect.
>
> Both functions should be called under the protection of the
> same mutex that is used with the condition variable being
> signaled. Otherwise, the condition variable may be signaled
> between the test of the associated condition and blocking in
> cond_wait(). This can cause an infinite wait.
>
> ... but that rationale looks bogus to me. In particular, the thread
> heading into cond_wait() must have already tested the condition under the
> lock and concluded it was false in order to decide to cond_wait(). Since
> any thread changing state that would affect the condition must also be
> holding the lock, there is no way for the state (and thus the outcome of
> the test) to change between the test and the cond_wait(), and thus any
> cond_signal() sent during that window would end up being spurious anyway.
>
> Please feel free to prove me wrong :-)

If I remember correctly, the main problem you can run into when signaling
after dropping the lock is a destruction race:

Thread 1                        Thread 2

mutex_exit(&obj->mutex)
    -------------------------->
                                mutex_enter(&obj->mutex)
                                set up object for destruction
                                mutex_exit(&obj->mutex)
                                kmem_free(obj);
    <--------------------------
cv_signal(&obj->cv);
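
One way to close that window is to signal before dropping the lock. Here is a minimal sketch as a userland POSIX threads analog, not kernel code: struct obj, waiter, finish, and run_teardown are hypothetical names chosen for illustration. The destroying thread cannot get past the mutex until the signalling thread has finished touching the object.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical object that owns its own mutex and condition variable. */
struct obj {
	pthread_mutex_t mutex;
	pthread_cond_t  cv;
	int             done;  /* protected by mutex */
};

static int freed_marker;  /* address used as a "freed" return value */

/* "Thread 2": waits for completion, then destroys the object. */
static void *waiter(void *arg)
{
	struct obj *o = arg;

	pthread_mutex_lock(&o->mutex);
	while (!o->done)
		pthread_cond_wait(&o->cv, &o->mutex);
	pthread_mutex_unlock(&o->mutex);

	pthread_cond_destroy(&o->cv);
	pthread_mutex_destroy(&o->mutex);
	free(o);
	return &freed_marker;
}

/* "Thread 1": signals while still holding the lock. */
static void finish(struct obj *o)
{
	pthread_mutex_lock(&o->mutex);
	o->done = 1;
	pthread_cond_signal(&o->cv);
	pthread_mutex_unlock(&o->mutex);  /* last access to o by this thread */
}

/* Returns 1 if the waiter ran to completion and freed the object. */
int run_teardown(void)
{
	struct obj *o = malloc(sizeof (*o));
	pthread_t tid;
	void *ret = NULL;

	if (o == NULL)
		return 0;
	pthread_mutex_init(&o->mutex, NULL);
	pthread_cond_init(&o->cv, NULL);
	o->done = 0;

	if (pthread_create(&tid, NULL, waiter, o) != 0) {
		free(o);
		return 0;
	}
	finish(o);  /* o may already be freed once this returns */
	pthread_join(tid, &ret);
	return ret == &freed_marker;
}
```

Swapping the last two calls in finish() would reproduce the race in the diagram: the waiter could free the object between the unlock and the signal.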

I agree that the argument in cond_signal(3C) is bogus; the standard states
that "signaling under the lock can make scheduling more deterministic", but
it doesn't require anyone to do so. See pthread_cond_signal(3C):

The pthread_cond_signal() or pthread_cond_broadcast() functions
may be called by a thread whether or not it currently
owns the mutex that threads calling pthread_cond_wait() or
pthread_cond_timedwait() have associated with the condition
variable during their waits; however, if predictable
scheduling behavior is required, then that mutex is locked
by the thread calling pthread_cond_signal() or
pthread_cond_broadcast().

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development


meem

Posts: 3,045
From: US

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 9, 2006 2:21 PM   in response to: jwadams


> I agree that the argument in cond_signal(3C) is bogus

So it seems a manpage CR is in order. However, it disturbs me that such
wording ever found its way into our documentation.

--
meem


Michael Shapiro
mws@zion.eng.sun.com
Re: conditional variable debugging
Posted: Jun 8, 2006 11:46 AM   in response to: pavan

>
> Hi All,
> I am trying to debug a condition variable in the following situation:
> I have a daemon running, and I wake it by calling cv_signal() on a
> condition variable. The first request I send to it works fine, but on the
> second request I am not able to wake the daemon with cv_signal(). I am
> guessing that there is some problem with the condition variable and am
> trying to debug that. I can see the daemon kernel thread running.
>
> There is a structure with a conditional variable
>
> struct a {
> int ...;
> ...
> ...
> kcondvar_t something;
> };
>
> Suppose the address of "struct a" after allocation is 100 and the address
> of the condition variable is 120.
>
> Questions:
>
> 1. If I do a "::wchaninfo", should that list the address of the condition
> variable as 120?

Yes, if there are active waiters on that variable. It does not, however,
produce a list of all possible kcondvar_t's in the system.

> 2. ::wchaninfo ==> dump condition variable
> Does this dump the addresses of all condition variables that are active
> at that point in time?

Yes, if there are waiters.
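
In the example from the question, the address to look for is that of the embedded member (120), not of the enclosing structure (100). Here is a minimal sketch of that arithmetic in userland C, under assumptions: pthread_cond_t stands in for kcondvar_t, and the other fields and the names cv_address and cv_address_matches are hypothetical, since the original structure is elided.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the poster's "struct a"; the real fields are not known. */
struct a {
	int            first;      /* placeholder field */
	char           other[16];  /* placeholder field */
	pthread_cond_t something;  /* stand-in for kcondvar_t */
};

/* Address of the condition variable = struct base + member offset. */
uintptr_t cv_address(const struct a *p)
{
	return (uintptr_t)p + offsetof(struct a, something);
}

/* Returns 1 if the computed address equals the member's real address. */
int cv_address_matches(void)
{
	struct a obj;

	return cv_address(&obj) == (uintptr_t)&obj.something;
}
```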

-Mike

--
Mike Shapiro, Solaris Kernel Development. blogs.sun.com/mws/


jwadams

Posts: 370
From: San Francisco, CA

Registered: 3/9/05
Re: conditional variable debugging
Posted: Jun 8, 2006 3:16 PM   in response to: pavan

On Tue, Jun 06, 2006 at 07:37:27PM -0700, Pavan Reddy wrote:
> Hi All,
> I am trying to debug a condition variable in the following situation:
>
> I have a daemon running, and I wake it by calling cv_signal() on a
> condition variable. The first request I send to it works fine, but on the
> second request I am not able to wake the daemon with cv_signal(). I am guessing
> that there is some problem with the condition variable and am trying to debug
> that. I can see the daemon kernel thread running.
>
>
> There is a structure with a conditional variable
>
> struct a {
> int ...;
> ...
> ...
> kcondvar_t something;
> };
>
> Suppose the address of "struct a" after allocation is 100 and the address of
> the condition variable is 120.
>
> Questions:
>
> 1. If I do a "::wchaninfo", should that list the address of the condition variable
> as 120?
>
> 2. ::wchaninfo ==> dump condition variable
> Does this dump the addresses of all condition variables that are active at that
> point in time?

You might look at dtrace for this; sched:::sleep will fire when a thread
sleeps on a condvar, and sched:::wakeup will fire when one thread wakes
another.

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development





Copyright © 1995-2005 Sun Microsystems, Inc.