conditional variable debugging
Posted:
Jun 6, 2006 7:37 PM
Hi All, I am trying to debug a conditional variable in the following situation:
I have a daemon running, and I wake it by calling cv_signal() on a conditional variable. The first request I send works fine, but on the second request I am not able to wake the daemon with cv_signal(). I am guessing that there is some problem with the conditional variable and am trying to debug that. I can see the daemon kernel thread running.
There is a structure with a conditional variable:
struct a {
        int ...;
        ...
        ...
        kcondvar_t something;
};
Suppose the address of "struct a" after allocation is 100 and the address of the conditional variable is 120.
Questions:
1. If I do a "::wchaninfo", should that list the address of the conditional variable as 120?
2. ::wchaninfo ==> dump condition variable. Does this dump all the conditional variable addresses which are active at that point in time?
Regards, Pavan
Re: conditional variable debugging
Posted:
Jun 8, 2006 3:48 AM
in response to: pavan
Quick background check: Is the thread calling cv_signal() holding the lock passed to cv_wait()? Normally it should be.
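For reference, here is a minimal userland sketch of the usual pattern, assuming POSIX pthread_cond_*() as a stand-in for the kernel's cv_wait()/cv_signal(). The names req_slot, daemon_main, submit_request and run_requests are hypothetical and not taken from the original poster's code. The waiter re-tests its predicate in a loop, and the signaller changes the predicate and signals while holding the same lock:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request slot shared between a client and a daemon thread. */
struct req_slot {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int             pending;   /* predicate: requests not yet handled */
    int             handled;   /* requests the daemon has completed */
    bool            shutdown;  /* set once no more requests will arrive */
};

static void *daemon_main(void *arg)
{
    struct req_slot *s = arg;

    pthread_mutex_lock(&s->lock);
    for (;;) {
        /* Re-test the predicate in a loop; a wakeup is only a hint. */
        while (s->pending == 0 && !s->shutdown)
            pthread_cond_wait(&s->cv, &s->lock);
        if (s->pending == 0)
            break;              /* shutdown requested and queue drained */
        s->pending--;
        s->handled++;
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

static void submit_request(struct req_slot *s)
{
    pthread_mutex_lock(&s->lock);
    s->pending++;                   /* change state under the lock ... */
    pthread_cond_signal(&s->cv);    /* ... and signal while still holding it */
    pthread_mutex_unlock(&s->lock);
}

/* Sends n requests to a fresh daemon; returns how many it handled. */
int run_requests(int n)
{
    struct req_slot s = { .pending = 0, .handled = 0, .shutdown = false };
    pthread_t tid;
    int handled;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cv, NULL);
    if (pthread_create(&tid, NULL, daemon_main, &s) != 0)
        return -1;

    for (int i = 0; i < n; i++)
        submit_request(&s);

    pthread_mutex_lock(&s.lock);
    s.shutdown = true;
    pthread_cond_signal(&s.cv);
    pthread_mutex_unlock(&s.lock);

    pthread_join(tid, NULL);
    handled = s.handled;
    pthread_cond_destroy(&s.cv);
    pthread_mutex_destroy(&s.lock);
    return handled;
}
```

Because the daemon counts pending work rather than wakeups, a second request is handled even if its signal arrives while the daemon is still busy with the first one. A daemon that treats each wakeup as exactly one request can miss it.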
-r
_______________________________________________ mdb-discuss mailing list mdb-discuss at opensolaris dot org
Re: conditional variable debugging
Posted:
Jun 8, 2006 9:32 AM
in response to: rbourbon
Yes it is.
Roch wrote:
>Quick background check: Is the thread calling cv_signal()
>holding the lock passed to cv_wait ? Normally it should.
>
>-r
Re: conditional variable debugging
Posted:
Jun 8, 2006 1:54 PM
in response to: rbourbon
On Thu, 2006-06-08 at 06:48, Roch wrote:
> Quick background check: Is the thread calling cv_signal()
> holding the lock passed to cv_wait ? Normally it should.
So, for what it's worth, I've seen a lot of people confused about why this is necessary or even helpful.
IIRC the main reason for this was to avoid priority inversion -- if you drop the lock and get preempted by a higher-priority thread before you can cv_signal(), the cv_signal won't happen for a while and so a higher-priority thread in cv_wait() might not get to run for a while.
Someone please correct me if (a) I'm wrong about this, or (b) there are other reasons besides this one...
- Bill
Re: conditional variable debugging
Posted:
Jun 8, 2006 6:52 PM
in response to: sommerfe
> IIRC the main reason for this was to avoid priority inversion -- if you
> drop the lock and get preempted by a higher-priority thread before you
> can cv_signal(), the cv_signal won't happen for a while and so a
> higher-priority thread in cv_wait() might not get to run for a while.
OTOH, if you cv_signal() while you're holding the lock, isn't it possible the signaled thread will wake up, be unable to grab the lock (since the signalling thread is still holding it), go back to sleep (or perhaps spin) and have to be poked again when the signalling thread actually does drop the lock?
-- meem
Alexander Kolba...
akolb@eng.sun.com
Re: conditional variable debugging
Posted:
Jun 9, 2006 11:30 AM
in response to: meem
> > IIRC the main reason for this was to avoid priority inversion -- if you
> > drop the lock and get preempted by a higher-priority thread before you
> > can cv_signal(), the cv_signal won't happen for a while and so a
> > higher-priority thread in cv_wait() might not get to run for a while.
>
> OTOH, if you cv_signal() while you're holding the lock, isn't it possible
> the signaled thread will wake up, be unable to grab the lock (since the
> signalling is still holding it), go back to sleep (or perhaps spin) and
> have to be poked again when the signalling thread actually does drop the
> lock?
Whether this is good or bad, this is the official requirement. A quote from condvar(9F):
cv_signal() signals the condition and wakes one blocked thread. All blocked threads can be unblocked by calling cv_broadcast(). You must acquire the mutex passed into cv_wait() before calling cv_signal() or cv_broadcast().
I bet, though, that some pieces of Solaris code ignore this requirement - for better or for worse.
- Alex Kolbasov
Re: conditional variable debugging
Posted:
Jun 9, 2006 12:24 PM
in response to: Alexander Kolba...
> Whether this is good or bad, this is the official requirement. A quote from
> the condvar(9F):
>
>     cv_signal() signals the condition and wakes one blocked
>     thread. All blocked threads can be unblocked by calling
>     cv_broadcast(). You must acquire the mutex passed into
>     cv_wait() before calling cv_signal() or cv_broadcast().
Further, cond_signal(3C) states:
The cond_broadcast() function unblocks all threads that are blocked on the condition variable pointed to by cvp.
If no threads are blocked on the condition variable, then cond_signal() and cond_broadcast() have no effect.
Both functions should be called under the protection of the same mutex that is used with the condition variable being signaled. Otherwise, the condition variable may be signaled between the test of the associated condition and blocking in cond_wait(). This can cause an infinite wait.
... but that rationale looks bogus to me. In particular, the thread heading into cond_wait() must have already tested the condition under the lock and concluded it was false in order to decide to cond_wait(). Since any thread changing state that would affect the condition must also be holding the lock, there is no way for the state (and thus the outcome of the test) to change between the test and the cond_wait(), and thus any cond_signal() sent during that window would end up being spurious anyway.
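The argument can be sketched in userland C, assuming POSIX threads. The names gate, waiter and open_gate_unlocked_signal are hypothetical. The condition is only ever changed under the mutex, but the signal is deliberately sent after the mutex has been dropped. The waiter either sees ready == true on its first test and never blocks, or it is already blocked in pthread_cond_wait() when the signal arrives, so the wakeup is not lost:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct gate {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    bool            ready;   /* the condition; only changed under lock */
};

static void *waiter(void *arg)
{
    struct gate *g = arg;

    pthread_mutex_lock(&g->lock);
    while (!g->ready)                          /* test under the lock */
        pthread_cond_wait(&g->cv, &g->lock);   /* atomically unlocks and blocks */
    pthread_mutex_unlock(&g->lock);
    return NULL;
}

/* Sets the condition under the lock, signals AFTER unlocking.
 * Returns 0 once the waiter has been released and joined. */
int open_gate_unlocked_signal(void)
{
    struct gate g = { .ready = false };
    pthread_t tid;
    int rc;

    pthread_mutex_init(&g.lock, NULL);
    pthread_cond_init(&g.cv, NULL);
    if (pthread_create(&tid, NULL, waiter, &g) != 0)
        return -1;

    pthread_mutex_lock(&g.lock);
    g.ready = true;                 /* state change is protected */
    pthread_mutex_unlock(&g.lock);
    pthread_cond_signal(&g.cv);     /* no lock held here */

    rc = pthread_join(tid, NULL);
    pthread_cond_destroy(&g.cv);
    pthread_mutex_destroy(&g.lock);
    return rc;
}
```

This only illustrates the lost-wakeup argument. It works here because the gate outlives the signal (the signaller joins the waiter before destroying anything), which is not guaranteed in general.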
Please feel free to prove me wrong :-)
-- meem
Re: conditional variable debugging
Posted:
Jun 9, 2006 12:52 PM
in response to: meem
On Fri, Jun 09, 2006 at 03:24:00PM -0400, Peter Memishian wrote:
>
> > Whether this is good or bad, this is the official requirement. A quote from
> > the condvar(9F):
> >
> >     cv_signal() signals the condition and wakes one blocked
> >     thread. All blocked threads can be unblocked by calling
> >     cv_broadcast(). You must acquire the mutex passed into
> >     cv_wait() before calling cv_signal() or cv_broadcast().
>
> Further, cond_signal(3C) states:
>
>     The cond_broadcast() function unblocks all threads that are
>     blocked on the condition variable pointed to by cvp.
>
>     If no threads are blocked on the condition variable, then
>     cond_signal() and cond_broadcast() have no effect.
>
>     Both functions should be called under the protection of the
>     same mutex that is used with the condition variable being
>     signaled. Otherwise, the condition variable may be signaled
>     between the test of the associated condition and blocking in
>     cond_wait(). This can cause an infinite wait.
>
> ... but that rationale looks bogus to me. In particular, the thread
> heading into cond_wait() must have already tested the condition under the
> lock and concluded it was false in order to decide to cond_wait(). Since
> any thread changing state that would affect the condition must also be
> holding the lock, there is no way for the state (and thus the outcome of
> the test) to change between the test and the cond_wait(), and thus any
> cond_signal() sent during that window would end up being spurious anyway.
>
> Please feel free to prove me wrong :-)
If I remember correctly, the main problem you can run into when signaling after dropping the lock is a destruction race:
thread 1                                thread 2
mutex_exit(&obj->mutex)
    -------------------------->
                                        mutex_enter(&obj->mutex)
                                        set up object for destruction
                                        mutex_exit(&obj->mutex)
                                        kmem_free(obj);
    <--------------------------
cv_signal(&obj->cv);
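One way to avoid the race in the diagram, sketched in userland C with POSIX threads as a stand-in for the kernel primitives. The names obj, reaper and finish_and_release are hypothetical, and here thread 2 waits on the condition variable before destroying the object, which is a variation on the diagram. The signalling thread signals while it still holds the mutex and never touches the object again after unlocking, so the object cannot be freed before the signal is delivered:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct obj {
    pthread_mutex_t mutex;
    pthread_cond_t  cv;
    bool            done;
};

/* "Thread 2": waits until the object is finished, then tears it down. */
static void *reaper(void *arg)
{
    struct obj *o = arg;

    pthread_mutex_lock(&o->mutex);
    while (!o->done)
        pthread_cond_wait(&o->cv, &o->mutex);
    pthread_mutex_unlock(&o->mutex);

    /* No other thread references *o past this point. */
    pthread_cond_destroy(&o->cv);
    pthread_mutex_destroy(&o->mutex);
    free(o);
    return NULL;
}

/* "Thread 1": signals under the lock. Returns 0 on success. */
int finish_and_release(void)
{
    struct obj *o = malloc(sizeof (*o));
    pthread_t tid;

    if (o == NULL)
        return -1;
    pthread_mutex_init(&o->mutex, NULL);
    pthread_cond_init(&o->cv, NULL);
    o->done = false;

    if (pthread_create(&tid, NULL, reaper, o) != 0) {
        pthread_cond_destroy(&o->cv);
        pthread_mutex_destroy(&o->mutex);
        free(o);
        return -1;
    }

    pthread_mutex_lock(&o->mutex);
    o->done = true;
    pthread_cond_signal(&o->cv);      /* *o is guaranteed alive: we hold the lock */
    pthread_mutex_unlock(&o->mutex);  /* last access to *o by this thread */

    return pthread_join(tid, NULL);
}
```

Swapping the pthread_cond_signal() and pthread_mutex_unlock() calls in finish_and_release() would reintroduce the race: the reaper could observe done, free the object, and leave the signal operating on freed memory.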
I agree that the argument in cond_signal(3C) is bogus; the standard states that "signaling under the lock can make scheduling more deterministic", but it doesn't require anyone to do so. See pthread_cond_signal(3C):
The pthread_cond_signal() or pthread_cond_broadcast() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex is locked by the thread calling pthread_cond_signal() or pthread_cond_broadcast().
Cheers, - jonathan
-- Jonathan Adams, Solaris Kernel Development
Re: conditional variable debugging
Posted:
Jun 9, 2006 2:21 PM
in response to: jwadams
> I agree that the argument in cond_signal(3C) is bogus
So it seems a manpage CR is in order. However, it disturbs me that such wording ever found its way into our documentation.
-- meem
Michael Shapiro
mws@zion.eng.sun.com
Re: conditional variable debugging
Posted:
Jun 8, 2006 11:46 AM
in response to: pavan
> Hi All,
> I am trying to debug a conditional variable at this situation:
> I have a daemon running and I am trying to send a signal to it thru cv_signal
> controlled by a conditional variable. The first request I sent to it works
> fine and when it comes to the second request, I am not able to wake that
> daemon with cv_signal(). I am guessing that there is some problem with the
> conditional variable and trying to debug that. I can see the daemon kernel
> thread running.
>
> There is a structure with a conditional variable
>
> struct a {
>         int ...;
>         ...
>         ...
>         kcondvar_t something;
> };
>
> suppose if the address of "struct a" after allocation is 100 and the address
> of conditional variable is 120.
>
> Questions:
>
> 1. If I do a "::wchaninfo", should that list the address of the conditional
> variable as 120?
Yes, if there are active waiters on that variable. It does not, however, produce a list of all possible kcondvar_t's in the system.
> 2. ::wchaninfo ==> dump condition variable
> Does this dump all the conditional variable addresses which are active
> at that point in time?
Yes, if there are waiters.
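For question 1, the address to look for is the address of the condition variable member itself, that is, the structure's base address plus the member's offset. A small userland sketch of that arithmetic, using pthread_cond_t as a stand-in for kcondvar_t (the kernel type is not available in userland) and hypothetical field names:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real "struct a" fields were elided in the post. */
struct a {
    int            id;
    char           name[16];
    pthread_cond_t something;   /* stand-in for kcondvar_t */
};

/* Address of the embedded condition variable: base + member offset. */
uintptr_t cv_address(const struct a *p)
{
    return (uintptr_t)p + offsetof(struct a, something);
}
```

With the numbers from the question, a base address of 100 and a member offset of 20 give 120, which is the value that would show up as the wait channel while a thread is blocked on that variable.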
-Mike -- Mike Shapiro, Solaris Kernel Development. blogs.sun.com/mws/
Re: conditional variable debugging
Posted:
Jun 8, 2006 3:16 PM
in response to: pavan
On Tue, Jun 06, 2006 at 07:37:27PM -0700, Pavan Reddy wrote:
> Hi All,
> I am trying to debug a conditional variable at this situation:
>
> I have a daemon running and I am trying to send a signal to it thru cv_signal()
> controlled by a conditional variable. The first request I sent to it works fine
> and when it comes to the second request, I am not able to wake that daemon
> with cv_signal(). I am guessing that there is some problem with the conditional
> variable and trying to debug that. I can see the daemon kernel thread running.
>
> There is a structure with a conditional variable
>
> struct a {
>         int ...;
>         ...
>         ...
>         kcondvar_t something;
> };
>
> suppose if the address of "struct a" after allocation is 100 and the address of
> conditional variable is 120.
>
> Questions:
>
> 1. If I do a "::wchaninfo", should that list the address of the conditional variable
> as 120?
>
> 2. ::wchaninfo ==> dump condition variable
> Does this dump all the conditional variable addresses which are active at that point
> in time?
You might look at dtrace for this; sched:::sleep will fire when a thread sleeps on a condvar, and sched:::wakeup will fire when one thread wakes another.
Cheers, - jonathan
-- Jonathan Adams, Solaris Kernel Development