Replies: 44 - Last Post: Jul 17, 2006 8:37 PM by: Dong-Hai Han
Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 2:22 AM
Hi,
I am looking for feedback/comments on the following addition to the DLPI stack:
Thanks, -Thomas
Background: With the current Solaris DLPI implementation, when a network device is put into promiscuous mode, all packets that are sent out through this network device are automatically duplicated and looped back to all streams associated with this network device. This allows applications such as snoop(1m) to observe traffic that is sent out by other applications. However, it also has the side effect that a network application must discard the loopback packets when reading from the network device if it does not want to process its own packets twice. This is not always possible or straightforward, since there might not always exist a way to distinguish traffic coming from the wire from traffic that the application generated earlier and that the system has looped back (compare the Linux DLT_LINUX_SLL information with PACKET_OUTGOING). Even if one assumes that an application is capable of making this distinction, there is a performance penalty: the system goes to great effort to duplicate packets that the application will immediately drop. This becomes an issue at Gigabit speeds or higher.
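To make the cost described above concrete, here is a minimal sketch of the kind of filter an application has to run today. It assumes plain Ethernet framing and that the application only ever transmits with its own source MAC address; the function and macro names are made up for illustration and are not part of any DLPI header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN      6   /* Ethernet address length */
#define ETH_HDR_LEN  14  /* destination + source + type */

/*
 * Hypothetical helper: returns true if the frame carries our own source
 * address, i.e. it is presumably a copy the system looped back to us.
 */
static bool
is_own_loopback(const uint8_t *frame, size_t len, const uint8_t *own_mac)
{
    assert(frame != NULL && own_mac != NULL);
    if (len < ETH_HDR_LEN)
        return false;
    /* The source address follows the 6-byte destination address. */
    return memcmp(frame + MAC_LEN, own_mac, MAC_LEN) == 0;
}
```

Note that this check runs only after the system has already made and delivered the copy, and it gives wrong answers for an application that sends frames with other source addresses (a bridge, for example), which is the "not always possible" case.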
Functional specification: This document proposes the addition of a new DLPI request to the GLDv3 (aka Nemo) framework that will allow an application to control if and how loopback packets are generated in promiscuous mode.
/*
 * DL_PROMLOOP_REQ, M_PROTO type
 */
typedef struct {
        t_uscalar_t     dl_primitive;   /* DL_PROMLOOP_REQ */
        t_uscalar_t     dl_level;       /* Promiscuous loopback mode */
} dl_promloop_req_t;

#define DL_PROMLOOP_REQ         TBD     /* Set promiscuous loopback mode */

/*
 * DLPI promiscuous loopback mode definitions
 */
#define DL_PROMLOOP_DEV_OFF     0x01    /* Disable for device */
#define DL_PROMLOOP_DEV_ON      0x02    /* Enable for device */
#define DL_PROMLOOP_STR_OFF     0x03    /* Disable for stream */
#define DL_PROMLOOP_STR_ON      0x04    /* Enable for stream */
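As a rough illustration of how an application might fill in the proposed request, the sketch below builds a dl_promloop_req_t and validates the level. It is not the POC code: the typedef is a stand-in so the example builds outside Solaris, the numeric value of DL_PROMLOOP_REQ is a placeholder because the proposal leaves it TBD, and promloop_req_init() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the Solaris typedef so the sketch builds anywhere. */
typedef uint32_t t_uscalar_t;

/* Placeholder only: the proposal leaves the primitive number TBD. */
#define DL_PROMLOOP_REQ      0x7fffu

#define DL_PROMLOOP_DEV_OFF  0x01  /* Disable for device */
#define DL_PROMLOOP_DEV_ON   0x02  /* Enable for device */
#define DL_PROMLOOP_STR_OFF  0x03  /* Disable for stream */
#define DL_PROMLOOP_STR_ON   0x04  /* Enable for stream */

typedef struct {
    t_uscalar_t dl_primitive;  /* DL_PROMLOOP_REQ */
    t_uscalar_t dl_level;      /* Promiscuous loopback mode */
} dl_promloop_req_t;

/*
 * Hypothetical helper: fill in a request for the given level.
 * Returns 0 on success, -1 if the level is not one of the four above.
 */
static int
promloop_req_init(dl_promloop_req_t *req, t_uscalar_t level)
{
    assert(req != NULL);
    if (level < DL_PROMLOOP_DEV_OFF || level > DL_PROMLOOP_STR_ON)
        return -1;
    memset(req, 0, sizeof (*req));
    req->dl_primitive = DL_PROMLOOP_REQ;
    req->dl_level = level;
    return 0;
}
```

On a real system the filled-in structure would presumably be sent down the stream as the control part of an M_PROTO message, for instance with putmsg(2), like other DLPI requests.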
We consider two major modes: one that applies at the device level and one that applies at the stream level:
a) Device level operation (DL_PROMLOOP_DEV_OFF and DL_PROMLOOP_DEV_ON)
A DL_PROMLOOP_REQ DLPI request that has a level of DL_PROMLOOP_DEV_OFF or DL_PROMLOOP_DEV_ON will affect all streams that are bound to a particular network device. That is, whether packets are looped back or not is set at the device level and will therefore apply to all open streams for that device. A DL_PROMLOOP_DEV_OFF will disable the loopback of packets device wide. A DL_PROMLOOP_DEV_ON will enable the loopback of packets device wide (default behaviour).
b) Stream level operation (DL_PROMLOOP_STR_OFF and DL_PROMLOOP_STR_ON)
A DL_PROMLOOP_REQ DLPI request that has a level of DL_PROMLOOP_STR_OFF or DL_PROMLOOP_STR_ON will affect the requesting (current) stream only. That is, the system will still loopback packets but only to those streams that have the DL_PROMLOOP_STR_ON level set (default). A DL_PROMLOOP_STR_OFF will disable the loopback of packets for the current stream. A DL_PROMLOOP_STR_ON will enable the loopback of packets for the current stream (default behaviour).
A device level setting takes precedence over a stream level setting. That is, if the device level is set to DL_PROMLOOP_DEV_OFF, no stream will receive looped back packets regardless of its own level setting. The default settings are DL_PROMLOOP_DEV_ON and DL_PROMLOOP_STR_ON in order to preserve backwards compatibility. A device that is put into promiscuous mode with DL_PROMLOOP_DEV_OFF will have the benefit that the system can still use the GLDv3 fast-path, since loopback processing is completely disabled. On the other hand, a stream with DL_PROMLOOP_STR_OFF will have the benefit that it can be fully observed with snoop(1m), albeit using a slower soft-path.
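The precedence rule can be written down as a small predicate. This is only a sketch of the rule as stated in the paragraph above, not the POC implementation; promloop_delivers() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

#define DL_PROMLOOP_DEV_OFF  0x01  /* Disable for device */
#define DL_PROMLOOP_DEV_ON   0x02  /* Enable for device */
#define DL_PROMLOOP_STR_OFF  0x03  /* Disable for stream */
#define DL_PROMLOOP_STR_ON   0x04  /* Enable for stream */

/*
 * Hypothetical helper: does a stream with level str_level receive
 * looped back packets on a device whose level is dev_level?
 */
static bool
promloop_delivers(int dev_level, int str_level)
{
    assert(dev_level == DL_PROMLOOP_DEV_OFF ||
        dev_level == DL_PROMLOOP_DEV_ON);
    assert(str_level == DL_PROMLOOP_STR_OFF ||
        str_level == DL_PROMLOOP_STR_ON);

    /* The device level wins: OFF suppresses loopback for every stream. */
    if (dev_level == DL_PROMLOOP_DEV_OFF)
        return false;
    /* Otherwise the stream's own setting decides. */
    return str_level == DL_PROMLOOP_STR_ON;
}
```

With the defaults (DL_PROMLOOP_DEV_ON and DL_PROMLOOP_STR_ON) the predicate is true, which matches today's behaviour.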
Optional: One can also think of adding two DLPI notifications, DL_NOTE_PROMLOOP_DEV_ON and DL_NOTE_PROMLOOP_DEV_OFF, that would be fired when a stream issues a DL_PROMLOOP_DEV_ON or a DL_PROMLOOP_DEV_OFF respectively. This could be useful for snoop(1m), since there would otherwise be no way of being informed of such changes.
Changes required: Changes will need to be made to the GLDv3 framework (including header, code and man pages). There should also be a (new or extended) system administration command to display and set the device level promiscuous loopback mode. This would be helpful in situations where one wants to restore a known system level.
Applications: Network appliances that use Solaris as their OS and rely heavily on promiscuous mode to perform their task. Ease of porting such applications from Linux to Solaris.
Development estimates: I have done a POC for this feature with OpenSolaris20060102. Changes were made to the DLD driver, the DLS and MAC modules. The code changes amount to about 500 LOC. I would be willing to work on integrating this into OpenSolaris.
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 4:41 AM in response to: tbastian
Thomas Bastian writes:
> We consider two major modes: one that applies at the device level
> and one that applies at the stream level:
What is the motivation for having two separate ways to set this? Why not have this new feature _only_ at the stream level? Are there usage models that correspond to both levels? The only one I see is DLT_LINUX_SLL, which seems to imply stream-level (though I'm not positive).
How do these two mechanisms interact with each other? You say that the device level "takes precedence," but what does that really mean? If I set DL_PROMLOOP_DEV_ON at the device level, does this mean:
a. Current open streams are unaffected, but newly-opened ones will default to having the flag turned "on."
b. All open streams at the point in time in which I send DL_PROMLOOP_DEV_ON will be switched into loop-on mode, even those that had previously set DL_PROMLOOP_STR_OFF.
c. All open streams are switched to loop-on mode, and, because this takes precedence over the stream level control, subsequent use of DL_PROMLOOP_STR_OFF does nothing.
Does the proposal distinguish between looped-back traffic that originates with the stream user and traffic that originates with other streams?
It seems to me that snoop(1M) relies on being able to catch packets transmitted by all streams, but that since it never sends any packets, it doesn't care about self-originated traffic. Further, raw DLPI users simply never (as a matter of design) want to see their own traffic looped back.
So, why not dispense with the knob entirely, and simply change the definition? Fix it so that promiscuous mode in DLPI does not itself loop back traffic to the same stream that generated it. I.e., only cases that cause loopback in the non-promiscuous behavior would loop back. This would simplify the driver changes, the documentation, the user interface, and the porting work required for applications.
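The alternative suggested here can be sketched as a delivery loop that skips the originating stream. This is a simplified model under stated assumptions, not GLDv3 code: it only covers unicast transmits, the stream structure is invented for the example, and the cases that loop back even without promiscuous mode are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-in for a DLPI stream on one device. */
struct stream {
    bool promisc;  /* stream has promiscuous mode enabled */
    int  copies;   /* loopback copies delivered so far */
};

/*
 * Deliver loopback copies of a unicast packet sent by 'sender' to every
 * other promiscuous stream. Returns the number of copies made.
 */
static size_t
loop_back_tx(struct stream *streams, size_t n, const struct stream *sender)
{
    size_t made = 0;

    assert(streams != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!streams[i].promisc)
            continue;          /* not listening promiscuously */
        if (&streams[i] == sender)
            continue;          /* never hand a stream its own packet */
        streams[i].copies++;
        made++;
    }
    return made;
}
```

In this model a read-only observer such as snoop(1M) still sees traffic from every other stream, while a stream that both transmits and listens promiscuously never sees its own unicast packets.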
Is there any case in which seeing the unicast traffic that you generated on your own promiscuous-mode stream is not a bug?
--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 6:33 AM in response to: carlsonj
[No Body]
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 6:33 AM in response to: Guest
Thomas Bastian - Sun Microsystems writes:
> > What is the motivation for having two separate ways to set this? Why
> > not have this new feature _only_ at the stream level? Are there usage
> > models that correspond to both levels? The only one I see is
> > DLT_LINUX_SLL, which seems to imply stream-level (though I'm not
> > positive).
> You are right. From a feature point of view the streams level would be
> enough. Where I can see the benefit with the device level setting is that
> we could avoid using mac_txloop() and therefore use the fast regular way
> to get packets out (since we don't need a loop copy). But maybe the
> speed benefit will be negligible after all. I have not made any
> measurements for this.
I'd rather not expose the details of performance optimizations to uninvolved parties. They change far too often for this to be a good way to entangle the design.
In other words, if there is any performance to be gained here, then the system should detect the special cases itself and set up the right behavior. Thus, if all of the streams either are non-promiscuous or if all of the promiscuous streams elect not to have local copies, then use the "fast" version. Otherwise, don't.
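The detection rule described above is simple enough to state as code. The following is a sketch only: the per-stream structure and the function name are invented for illustration and do not correspond to anything in the DLD, DLS or MAC sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented per-stream view of the two settings that matter here. */
struct stream_cfg {
    bool promisc;         /* stream is in promiscuous mode */
    bool wants_loopback;  /* stream wants local copies of transmits */
};

/*
 * The "fast" transmit path is usable when no stream needs a loop copy,
 * i.e. every stream is either non-promiscuous or has opted out.
 */
static bool
can_use_fast_tx(const struct stream_cfg *s, size_t n)
{
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (s[i].promisc && s[i].wants_loopback)
            return false;
    }
    return true;
}
```

The point of the design is that the system re-evaluates this whenever a stream changes its settings, so no application has to know that a fast path exists.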
(I really think the complexity involved with the pointer management dwarfs any possible gain from avoiding a single, well-designed flag check, and that the current design needs a rethink. But that's probably a different topic.)
> > c. All open streams are switched to loop-on mode, and, because this
> > takes precedence over the stream level control, subsequent use
> > of DL_PROMLOOP_STR_OFF does nothing.
> It's partly c.) I guess my proposal is not clear enough on this point.
> Let me try to rephrase it. The DL_PROMLOOP_DEV_ON enables loopback mode
> for the device, hence this setting is a pre-requisite for any stream to
> see loopback packets at all from this device. If the DL_PROMLOOP_DEV
So ... this means there are really *three* states for the device level flag. It can be "forced on," "forced off," or "unset." There's no way to set that third mode with the new interface; the system starts up that way by default, but if anyone ever sets either of the other modes, it's a one-way trap door. You can't get back (except, perhaps, by unplumbing).
That's a bit confusing, and I'm not sure I see why it's necessary.
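One way to picture the three-state reading is the model below. It encodes only the interpretation given in this reply (the quoted explanation is cut off, so the proposal's actual semantics may differ), and every name in it is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Device level flag as read in this reply: three states, two settable. */
enum dev_loop {
    DEV_LOOP_UNSET,       /* state at startup; not reachable via a request */
    DEV_LOOP_FORCED_ON,   /* after DL_PROMLOOP_DEV_ON */
    DEV_LOOP_FORCED_OFF   /* after DL_PROMLOOP_DEV_OFF */
};

/* The only transitions the proposed interface offers. */
static enum dev_loop
dev_loop_set(bool on)
{
    return on ? DEV_LOOP_FORCED_ON : DEV_LOOP_FORCED_OFF;
}

/* Does a stream whose own setting is stream_on see looped back packets? */
static bool
stream_sees_loopback(enum dev_loop dev, bool stream_on)
{
    switch (dev) {
    case DEV_LOOP_FORCED_ON:
        return true;
    case DEV_LOOP_FORCED_OFF:
        return false;
    case DEV_LOOP_UNSET:
        return stream_on;
    }
    assert(!"invalid dev_loop value");
    return false;
}
```

Since dev_loop_set() can never return DEV_LOOP_UNSET, the per-stream choice stops mattering once anyone has touched the device level, which is the "one-way trap door".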
> > Does the proposal distinguish between looped-back traffic that
> > originates with the stream user and traffic that originates with other
> > streams?
> Not in the POC currently. This is an important point on which I am still
> unclear what the best approach would be.
It seems to me that it's really key to the problem.
> > So, why not dispense with the knob entirely, and simply change the
> > definition? Fix it so that promiscuous mode in DLPI does not itself
> > loop back traffic to the same stream that generated it. I.e., only
> > cases that cause loopback in the non-promiscuous behavior would loop
> > back. This would simplify the driver changes, the documentation, the
> > user interface, and the porting work required for applications.
> I am not sure this is possible. Agreed that it would be the simplest
> approach. I am not 100% positive but I think it is a well known
> "feature" of DLPI that in promiscuous mode, packets are looped back. I
> think this is the way it works on other systems (HP-UX, AIX, etc...) as
> well (to be confirmed). If there is such a requirement for DLPI in
> promiscuous mode, then we could not go down that route because we would
> break compatibility I suppose.
I don't think that's the important question. I think this one is:
> > Is there any case in which seeing the unicast traffic that you
> > generated on your own promiscuous-mode stream is not a bug?
It seems to me that promiscuous DLPI streams are relatively rare. In most (nearly all) cases, they're used for snoop/ethereal/libpcap, and those applications are read-only.
The narrow case where the current DLPI semantics break down for some users is in the rarest of the rare: a promiscuous DLPI stream user who also transmits unicast packets. It seems fair to me to ask whether the current behavior is something that anyone could ever have relied on in any useful way, or whether it's merely a bug. In other words, do those applications _ever_ process those packets beyond just detecting and discarding them?
I'd be strongly tempted to treat this as a bug, and change it in a Minor release along with a suitable release note. The only "tunable" I might provide would be an intentionally undocumented variable (that could be tweaked with /etc/system) to reenable the old behavior, just in case there's some unknown application somewhere that's actually harmed by the new behavior.
The chance of that, though, seems quite remote to me, and the risk looks reasonable for a Minor release, especially in comparison to the complexity and risk of potentially modifying multiple (and largely unknown!) DLPI applications to take advantage of this new feature, and adding lasting complexity to Solaris for the mode switch implementation that could really never be removed.
(For a patch or micro release binding, the default may need to be the other way.)
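A sketch of what such a compatibility switch might look like follows. The variable name is purely hypothetical (no such tunable is named in this thread), and the stream structure is invented for the example; /etc/system entries generally take the form "set module:variable=value".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical undocumented tunable: 0 selects the new behaviour (no
 * loopback to the originating stream), non-zero restores the old one.
 * For a patch or micro release the default might need to be 1.
 */
static int promisc_self_loopback = 0;

/* Invented stand-in for a DLPI stream. */
struct stream {
    bool promisc;  /* stream has promiscuous mode enabled */
};

/* Should a unicast packet sent on 'tx' be looped back to 'rx'? */
static bool
should_loop_to(const struct stream *rx, const struct stream *tx)
{
    assert(rx != NULL && tx != NULL);
    if (!rx->promisc)
        return false;
    if (rx == tx)
        return promisc_self_loopback != 0;
    return true;
}
```

Applications need no changes under this approach; only a site that depends on the old behaviour would ever touch the variable.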
But, yes, I agree that verifying against the standards (which seem to say nothing about the issue) and against other implementations is a good idea. I don't think, though, that if other implementations have bugs, this necessarily means we must as well.
--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 6:54 AM in response to: carlsonj
On Mon, 2006-03-20 at 15:33, James Carlson wrote:
> I'd be strongly tempted to treat this as a bug, and change it in a
> Minor release along with a suitable release note. The only "tunable"
> I might provide would be an intentionally undocumented variable (that
> could be tweaked with /etc/system) to reenable the old behavior, just
> in case there's some unknown application somewhere that's actually
> harmed by the new behavior.
>
> But, yes, I agree that verifying against the standards (which seem to
> say nothing about the issue) and against other implementations is a
> good idea. I don't think, though, that if other implementations have
> bugs, this necessarily means we must as well.

Well, to be honest, I am fine with treating this as a bug, because I fully agree with you that the current promiscuous mode behaviour does not make sense at all. I am happy to hear other people's opinions about this. In the meantime, I will see if I can find anything in the specs about what (if any) the "expected" behaviour should be.
Thanks, -Thomas
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 7:56 AM in response to: tbastian
Just checked the DLPI spec for this topic. There is no requirement that a DLPI user gets its own packets looped back. So I guess we can safely treat this proposal as a bug. I will file a bug instead.
Thanks,
-Thomas
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 8:29 AM in response to: tbastian
On 3/20/06, Thomas Bastian - Sun Microsystems <Thomas dot Bastian at sun dot com> wrote:
> Just checked the DLPI spec for this topic. There is no requirement that
> a DLPI user gets its own packets looped back. So I guess we can safely
> treat this proposal as a bug. I will file a bug instead.
I think the behaviour was adopted from GLD. You may want to check with David Butterfield to see if he can remember any reason why the packets get looped back. It may have only been for consistency; e.g. a token-ring interface in promiscuous mode will always receive its own packets off the wire so one has to jump through a few hoops *not* to loop packets back. You may also want to check whether there are 3rd party applications out there that expect to get their own packets in promiscuous mode.
--
Paul Durrant
http://www.linkedin.com/in/pdurrant
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 8:50 AM in response to: pdurrant
On Mon, 2006-03-20 at 17:29, Paul Durrant wrote:
> On 3/20/06, Thomas Bastian - Sun Microsystems <Thomas dot Bastian at sun dot com> wrote:
> > Just checked the DLPI spec for this topic. There is no requirement that
> > a DLPI user gets its own packets looped back. So I guess we can safely
> > treat this proposal as a bug. I will file a bug instead.
>
> I think the behaviour was adopted from GLD. You may want to check with
> David Butterfield to see if he can remember any reason why the packets
> get looped back.

You mean GLDv2? I will check with David on this one.

> It may have only been for consistency; e.g. a token-ring interface in
> promiscuous mode will always receive its own packets off the wire so
> one has to jump through a few hoops *not* to loop packets back.
> You may also want to check whether there are 3rd party applications
> out there that expect to get their own packets in promiscuous mode.

That will be an impossible task, I am afraid :-( This is why I first thought of adding new functionality to the stack instead of treating this as a bug. It is always easier to leave the world as it is and add a new tweak for the ones who need the change. It is not necessarily a beautiful approach, but it saves a lot of headaches.
Thanks, -Thomas
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 9:04 AM in response to: tbastian
Thomas Bastian - Sun Microsystems writes:
> That will be an impossible task I am afraid :-( This is why in the
> first place I thought adding a new functionality to the stack instead of
> treating this as a bug. It is always easier to leave the world as it is,
> and add a new tweak for the ones who need change. It is not necessarily a
> beautiful approach but it saves a lot of headaches.
I think it just creates a different set of headaches.
--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
Re: Proposal for new DLPI cmd to disable packet loopback in promiscuous mode
Posted: Mar 20, 2006 9:36 AM in response to: tbastian
Hi Thomas

You may also want to check the IEEE specs for the definition of Promiscuous. The DLPI spec follows that. It's always been well understood that 'Promiscuous' means every packet that hits the physical layer, which includes packets sent out. The Software loopback compensates for hardware that cannot do loopback, but in essence it's to achieve the same result which is to see the packets you sent out.

Not sure this is a bug....

cheers
Frank
Thomas Bastian - Sun Microsystems wrote:
> Just checked the DLPI spec for this topic. There is no requirement that
> a DLPI user gets its own packets looped back. So I guess we can safely
> treat this proposal as a bug. I will file a bug instead.
> Thanks,
> -Thomas
>>> >>> (For a patch or micro release binding, the default may need to be the >>> other way.) >>> >>> But, yes, I agree that verifying against the standards (which seem to >>> say nothing about the issue) and against other implementations is a >>> good idea. I don't think, though, that if other implementations have >>> bugs, this necessarily means we must as well. >>> >> Well to be honest I am fine with treating this as a bug because I fully >> agree with you that the current promiscous mode behaviour does not make >> sense at all. I am happy to hear other people's opinion about this. And >> in the meantime, I will see if I find anything in the specs about what >> (if any) the "expected" behaviour should be. >> >> Thanks, >> -Thomas >> >> >>> -- >>> James Carlson, KISS Network <james dot d dot carlson at sun dot com> >>> Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 >>> MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 >>> >> _______________________________________________ >> networking-discuss mailing list >> networking-discuss at opensolaris dot org >> > > _______________________________________________ > networking-discuss mailing list > networking-discuss at opensolaris dot org >
_______________________________________________ networking-discuss mailing list networking-discuss at opensolaris dot org
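Jim's point in the quoted exchange above is that the system should work out for itself when the cheap transmit path is safe, rather than exposing a device-level knob. A minimal sketch of that check follows. The `struct stream_state` record, its two fields, and the function name are hypothetical and chosen only for illustration; they are not GLDv3 or DLPI definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-stream state; not an actual GLDv3/DLPI structure. */
struct stream_state {
	bool promisc;		/* stream has enabled promiscuous mode */
	bool wants_local_copy;	/* stream wants looped-back copies of sent packets */
};

/*
 * Returns true when no stream needs a loopback copy, i.e. every stream is
 * either non-promiscuous or has elected not to receive local copies. In
 * that case the transmit path could skip the duplication entirely (the
 * "fast" version from the discussion).
 */
bool
can_use_fast_tx(const struct stream_state *streams, size_t nstreams)
{
	assert(streams != NULL || nstreams == 0);

	for (size_t i = 0; i < nstreams; i++) {
		if (streams[i].promisc && streams[i].wants_local_copy)
			return (false);
	}
	return (true);
}
```

Under this assumption the result would presumably be recomputed whenever a stream changes its mode and cached as a single flag, so the per-packet cost stays at one check.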
|
|
|
|
Posts:
6
From:
Registered:
7/4/05
|
|
|
|
Re: Proposal for new DLPI cmd to disable packet
loopback in promiscuous mode
Posted:
Mar 20, 2006 9:45 AM
in response to: frd
|
|
Hi Frank, Thanks for the tip. Will check that out. I agree that promiscuous also means to see packets that are sent out. But do you think it makes sense for a stream to receive a copy of its own packets sent out? I mean other streams in the system should receive a copy for sure, but not the originating stream? Thanks, -Thomas
|
|
|
|
Posts:
188
From:
Registered:
3/9/05
|
|
|
|
Re: Proposal for new DLPI cmd to disable packet
loopback in promiscuous mode
Posted:
Mar 20, 2006 10:22 AM
in response to: tbastian
|
|
Several people have asked for an option to not receive their own packets. So it will be a good feature to have. I vote for the simple semantics that Jim is proposing.
Rao.
|
|
|
|
Frank DiMambro
frd@main-man.com
|
|
|
|
Re: Proposal for new DLPI cmd to disable packet
loopback in promiscuous mode
Posted:
Mar 20, 2006 12:16 PM
in response to: tbastian
|
|
Hi Thomas,
How it's supposed to be implemented is that only streams that say they are promiscuous get a copy.
Thomas Bastian - Sun Microsystems wrote:
> Hi Frank,
> Thanks for the tip. Will check that out.
You may also want to check the hardware specs for the definition of promiscuous. In my experience, largely with Ethernet, it has always meant loopback of sent packets.
> I agree that promiscuous also means to see packets that are sent out.
> But do you think it makes sense for a stream to receive a copy of its
> own packets sent out?
If the stream says it's promiscuous, then yes. This does not mean you spray it to other streams that are not marked as promiscuous.
> I mean other streams in the system should receive
> a copy for sure, but not the originating stream?
No, that's not the way it's supposed to work. Only streams that say they are promiscuous should see the sent packets. It's this functionality that lets you snoop an ip stream and see what's going in and out of ip.
cheers Frank
|
|
|
|
Posts:
6
From:
Registered:
7/4/05
|
|
|
|
Re: Proposal for new DLPI cmd to disable packet
loopback in promiscuous mode
Posted:
Mar 21, 2006 12:49 AM
in response to: Frank DiMambro
|
|
On Mon, 2006-03-20 at 21:16, Frank DiMambro wrote: > Hi Thomas > How it's supposed to be implemented is only streams that say > they are promiscuous get a copy. Yes. Sorry. I assumed we talked about promiscuous mode. Mike Ditto's proposal is very comprehensive about the cases. Thanks, -Thomas > > Thomas Bastian - Sun Microsystems wrote: > > >Hi Frank, > >Thanks for the tip. Will check that out. > > > > > You may also want to check out the hardware specs for the definition > of Promiscuous. In my experience largely with Ethernet it has > always meant loopback of sent packets. > > >I agree that promiscuous also means to see packets that are sent out. > >But do you think it makes sense for a stream to receive a copy of its > >own packets sent out? > > > If the stream says it's promiscuous then yes. This does not mean you > spray it to other streams, who are not marked as promiscuous. > > >I mean other streams in the system should receive > >a copy for sure, but not the originating stream? > > > > > No that's not the way it's supposed to work. Only streams that say > they are promiscuous should see the sent packets. It's this > functionality that lets you snoop an ip stream, and see what's going > in and out of ip. > > cheers > Frank > > >Thanks, > >-Thomas > > > >On Mon, 2006-03-20 at 18:36, Frank DiMambro wrote: > > > > > >>Hi Thomas > >> You may also want to check the IEEE specs for the definition of > >>Promiscuous. The DLPI spec follows that. It's always been well > >>understood that 'Promiscuous' means every packet that hits the > >>physical layer, which includes packets sent out. The Software > >>loopback compensates for hardware that cannot do loopback, > >>but in essence it's to achieve the same result which is to see the > >>packets you sent out. Not sure this is a bug.... > >> > >> cheers > >> Frank > >> > >>Thomas Bastian - Sun Microsystems wrote: > >> > >> > >>>Just checked the DLPI spec for this topic. 
There is no requirement that > >>>a DLPI user gets its own packets looped back. So I guess we can safely > >>>treat this proposal as a bug. I will file a bug instead. > >>>Thanks, > >>>-Thomas > >>> > >>>On Mon, 2006-03-20 at 15:54, Thomas Bastian - Sun Microsystems wrote: > >>> > >>> > >>> > >>>>On Mon, 2006-03-20 at 15:33, James Carlson wrote: > >>>> > >>>> > >>>> > >>>>>Thomas Bastian - Sun Microsystems writes: > >>>>> > >>>>> > >>>>> > >>>>>>>What is the motivation for having two separate ways to set this? Why > >>>>>>>not have this new feature _only_ at the stream level? Are there usage > >>>>>>>models that correspond to both levels? The only one I see is > >>>>>>>DLT_LINUX_SLL, which seems to imply stream-level (though I'm not > >>>>>>>positive). > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>You are right. From a feature point of view the streams level would be > >>>>>>enough. Were I can see the benefit with the device level setting is that > >>>>>>we could avoid using mac_txloop() and therefore use the fast regular way > >>>>>>to get packets out (since we don't need a loop copy). But maybe the > >>>>>>speed benefit will be negligible after all. I have not made any > >>>>>>measurements for this. > >>>>>> > >>>>>> > >>>>>> > >>>>>I'd rather not expose the details of performance optimizations to > >>>>>uninvolved parties. They change far too often for this to be a good > >>>>>way to entangle the design. > >>>>> > >>>>>In other words, if there is any performance to be gained here, then > >>>>>the system should detect the special cases itself and set up the right > >>>>>behavior. Thus, if all of the streams either are non-promiscuous or > >>>>>if all of the promiscuous streams elect not to have local copies, then > >>>>>use the "fast" version. Otherwise, don't. 
> >>>>> > >>>>>(I really think the complexity involved with the pointer management > >>>>>dwarfs any possible gain from avoiding a single, well-designed flag > >>>>>check, and that the current design needs a rethink. But that's > >>>>>probably a different topic.) > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>>> c. All open streams are switched to loop-on mode, and, because this > >>>>>>> takes precedence over the stream level control, subsequent use > >>>>>>> of DL_PROMLOOP_STR_OFF does nothing. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>Its partly c.) I guess my proposal is not clear enough on this point. > >>>>>>Let me try to rephrase it. The DL_PROMLOOP_DEV_ON enables loopback mode > >>>>>>for the device, hence this setting is a pre-requisite for any stream to > >>>>>>see loopback packets at all from this device. If the DL_PROMLOOP_DEV > >>>>>> > >>>>>> > >>>>>> > >>>>>So ... this means there are really *three* states for the device level > >>>>>flag. It can be "forced on," "forced off," or "unset." There's no > >>>>>way to set that third mode with the new interface; the system starts > >>>>>up that way by default, but if anyone ever sets either of the other > >>>>>modes, it's a one-way trap door. You can't get back (except, perhaps, > >>>>>by unplumbing). > >>>>> > >>>>>That's a bit confusing, and I'm not sure I see why it's necessary. > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>>>Does the proposal distinguish between looped-back traffic that > >>>>>>>originates with the stream user and traffic that originates with other > >>>>>>>streams? > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>Not in the POC currently. This is an important point on which I am still > >>>>>>unclear what the best approach would be. > >>>>>> > >>>>>> > >>>>>> > >>>>>It seems to me that it's really key to the problem. > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>>>So, why not dispense with the knob entirely, and simply change the > >>>>>>>definition? 
Fix it so that promiscuous mode in DLPI does not itself > >>>>>>>loop back traffic to the same stream that generated it. I.e., only > >>>>>>>cases that cause loopback in the non-promiscuous behavior would loop > >>>>>>>back. This would simplify the driver changes, the documentation, the > >>>>>>>user interface, and the porting work required for applications. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>I am not sure this is possible. Agreed that it would be the simplest > >>>>>>approach. I am not 100% positive but I think it is a well known > >>>>>>"feature" of DLPI that in promiscuous mode, packets are looped back. I > >>>>>>think this is the way it works on other systems (HP-UX, AIX, etc...) as > >>>>>>well (to be confirmed). If there is such a requirement for DLPI in > >>>>>>promiscuous mode, then we could not go down that route because we would > >>>>>>break compatibility I suppose. > >>>>>> > >>>>>> > >>>>>> > >>>>>I don't think that's the important question. I think this one is: > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>>>Is there any case in which seeing the unicast traffic that you > >>>>>>>generated on your own promiscuous-mode stream is not a bug? > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>It seems to me that promiscuous DLPI streams are relatively rare. In > >>>>>most (nearly all) cases, they're used for snoop/ethereal/libpcap, and > >>>>>those applications are read-only. > >>>>> > >>>>>The narrow case where the current DLPI semantics break down for some > >>>>>users is in the rarest of the rare: a promiscuous DLPI stream user who > >>>>>also transmits unicast packets. It seems fair to me to ask whether > >>>>>the current behavior is something that anyone could ever have relied > >>>>>on in any useful way, or whether it's merely a bug. In other words, > >>>>>do those applications _ever_ process those packets beyond just > >>>>>detecting and discarding them? 
> >>>>> > >>>>>I'd be strongly tempted to treat this as a bug, and change it in a > >>>>>Minor release along with a suitable release note. The only "tunable" > >>>>>I might provide would be an intentionally undocumented variable (that > >>>>>could be tweaked with /etc/system) to reenable the old behavior, just > >>>>>in case there's some unknown application somewhere that's actually > >>>>>harmed by the new behavior. > >>>>> > >>>>>The chance of that, though, seems quite remote to me, and the risk > >>>>>looks reasonable for a Minor release, especially in comparison to the > >>>>>complexity and risk of potentially modifying multiple (and largely > >>>>>unknown!) DLPI applications to take advantage of this new feature, > >>>>>and adding lasting complexity to Solaris for the mode switch > >>>>>implementation that could really never be removed. > >>>>> > >>>>>(For a patch or micro release binding, the default may need to be the > >>>>>other way.) > >>>>> > >>>>>But, yes, I agree that verifying against the standards (which seem to > >>>>>say nothing about the issue) and against other implementations is a > >>>>>good idea. I don't think, though, that if other implementations have > >>>>>bugs, this necessarily means we must as well. > >>>>> > >>>>> > >>>>> > >>>>Well to be honest I am fine with treating this as a bug because I fully > >>>>agree with you that the current promiscous mode behaviour does not make > >>>>sense at all. I am happy to hear other people's opinion about this. And > >>>>in the meantime, I will see if I find anything in the specs about what > >>>>(if any) the "expected" behaviour should be. 
> >>>> > >>>>Thanks, > >>>>-Thomas > >>>> > >>>> > >>>> > >>>> > >>>>>-- > >>>>>James Carlson, KISS Network <james dot d dot carlson at sun dot com> > >>>>>Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 > >>>>>MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677 > >>>>> > >>>>> > >>>>> > >>>>_______________________________________________ > >>>>networking-discuss mailing list > >>>>networking-discuss at opensolaris dot org > >>>> > >>>> > >>>> > >>>_______________________________________________ > >>>networking-discuss mailing list > >>>networking-discuss at opensolaris dot org > >>> > >>> > >>> > > > > > > >
|
|
|
|
Posts:
6,810
From:
US
Registered:
3/9/05
|
|
|
|
Re: Proposal for new DLPI cmd to disable packet
loopback in promiscuous mode
Posted:
Mar 20, 2006 10:27 AM
in response to: frd
|
|
Frank DiMambro writes: > You may also want to check the IEEE specs for the definition of > Promiscuous. The DLPI spec follows that. It's always been well
The two are logically independent. The IEEE (I assume you're referring to the 802 specifications, not The Open Group standards) defines a few particular link layers, such as Ethernet, but doesn't define them all. More importantly, it's just incorrect to say that DLPI is defined in terms of what any one link layer (even Ethernet) may or may not do.
-- James Carlson, KISS Network <james dot d dot carlson at sun dot com> Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
|
|
|
|
Posts:
110
From:
US
Registered:
3/9/05
|
|
|
|
Re: Proposal for new DLPI cmd to disable packet
loopback in promiscuous mode
Posted:
Mar 20, 2006 1:37 PM
in response to: tbastian
|
|
I agree that the DLPI spec does not require sent packets to be looped back to the sending stream (whether promiscuous or not) and that it is a bad design to do so. I have already prepared a proposal to change the recommended behavior for DLPI drivers (i.e. change from "unspecified" to "loopback discouraged") as part of fixing some other promiscuous/loopback issues that have come up with bridging. I'll send this proposal here shortly.
-=] Mike [=-
|
|
|
|
Posts:
110
From:
US
Registered:
3/9/05
|
|
|
|
DLPI packet loopback proposal
Posted:
Mar 20, 2006 1:46 PM
in response to: tbastian
|
|
This is a proposal to change the expected behavior of DLPI providers with respect to looping back copies of transmitted packets to the same or other streams that are sharing the device. Brief summary:
Never loop back a packet to its sending stream.
Always loop back a packet to any other stream that would receive that packet if it had arrived from the medium.
DLPI Improved Loopback Handling
Mike Ditto
DRAFT Thu Mar 9 01:01:38 PST 2006
In this document, "user" means a user of a data link service (DLS), implemented by one stream connected to a data link service provider, as described in the Data Link Provider Interface specification.
The problem:
Current DLPI device drivers do not allow two users of the same DLS provider (e.g. a NIC) to communicate with each other. All DLS users can send and receive packets on the physical media, and two DLS users speaking independent protocols (e.g. Appletalk and TCP/IP) can share access to a single provider without any need to see each other's packets. But if two simultaneous DLS users want to participate in the same protocol, for example having two different implementations of the same network protocol in simultaneous use, they can not see each other's packets, even when the packets are properly addressed to the local host's (NIC's) MAC address.
All current Solaris DLPI drivers provide a special hack that allows a promiscuous listening user to see the packets sent by other users. Promiscuous listening has a specific intended function, namely to disable some portion of the inbound packet selection filter, widening the subset of received traffic that will be passed to the associated user. The fact that it also causes a change in the transmit processing of other users is a practical wart, created to support monitoring tools such as snoop(1m). This special loopback handling is not explicitly part of the DLPI standard, but is expected and universally provided by Solaris drivers.
Aside from the snoop case, the need for one user to see another user's packets is rare -- thus the longstanding lack of this capability on Solaris without much complaint. A new application, Ethernet bridging, exposes this lack and calls for a solution.
Complications:
If DLS users in general are to receive a copy of every transmitted packet that matches their protocol of interest, there is a degenerate case where users will receive copies of their own transmitted packets that happen to match their receive filter, such as packets sent to broadcast or multicast addresses. This would require extra processing in the DLS user, and might break existing network layer protocol implementations.
Proposed solution:
I would like to solve the problem by requiring drivers to provide orthogonal loopback behavior. By this I mean that any transmitted packet will--in principle--be presented to all other users of that link interface, whether in promiscuous mode or not, but subject to their respective receive filters, of course. By "in principle" I mean that the behavior must be as if this was done, although there might be optimizations that avoid this extra work when it wouldn't result in a packet being seen by any other users.
Note that we won't ever loop a packet back to the same user that requested its transmission, only to other users of the same link provider. Aside from being the most useful and performant behavior, this mimics the behavior of most LAN hardware.
As a driver implementation note, current logic that considers performing transmit packet loopback whenever any user has one or more promiscuous levels enabled must be changed to perform the loopback whenever any of these conditions is true:
At least one user is in DL_PROMISC_PHYS mode.
At least one user is in DL_PROMISC_MULTI mode and the destination address of the present packet is a multicast address.
At least one user is in DL_PROMISC_SAP mode and the destination address of the present packet is equal to the local MAC address.
At least one user other than the current user is bound to a SAP that matches the present packet.
To efficiently test these conditions, some data can be pre-computed for fast access in the transmit path.
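The four conditions above can be sketched as a single predicate evaluated in the transmit path. This is only an illustrative model, not driver code: `link_summary_t`, `sap_count_t`, `users_bound_to()` and `tx_needs_loopback()` are hypothetical names, and the summary structure stands in for the "pre-computed data" mentioned above, assumed to be refreshed whenever a user binds, unbinds, or changes promiscuous level.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN 6	/* Ethernet-style address length (assumption) */
#define MAX_SAPS 8	/* arbitrary table size for this sketch */

/* Hypothetical count of users bound to one SAP. */
typedef struct sap_count {
	uint16_t sap;
	unsigned users;
} sap_count_t;

/* Hypothetical per-link summary, pre-computed outside the transmit path. */
typedef struct link_summary {
	unsigned promisc_phys;	/* users in DL_PROMISC_PHYS mode */
	unsigned promisc_multi;	/* users in DL_PROMISC_MULTI mode */
	unsigned promisc_sap;	/* users in DL_PROMISC_SAP mode */
	uint8_t local_addr[MAC_ADDR_LEN];
	sap_count_t saps[MAX_SAPS];
	size_t nsaps;
} link_summary_t;

/* The low-order bit of the first octet marks a group (multicast) address. */
static bool
is_multicast(const uint8_t *daddr)
{
	return ((daddr[0] & 0x01) != 0);
}

static unsigned
users_bound_to(const link_summary_t *ls, uint16_t sap)
{
	for (size_t i = 0; i < ls->nsaps; i++) {
		if (ls->saps[i].sap == sap)
			return (ls->saps[i].users);
	}
	return (0);
}

/* Returns true if any of the four listed conditions holds. */
bool
tx_needs_loopback(const link_summary_t *ls, const uint8_t *daddr,
    uint16_t sap, bool sender_bound_to_sap)
{
	if (ls->promisc_phys > 0)
		return (true);
	if (ls->promisc_multi > 0 && is_multicast(daddr))
		return (true);
	if (ls->promisc_sap > 0 &&
	    memcmp(daddr, ls->local_addr, MAC_ADDR_LEN) == 0)
		return (true);

	/* Count only users other than the sender bound to this SAP. */
	unsigned bound = users_bound_to(ls, sap);
	if (sender_bound_to_sap && bound > 0)
		bound--;
	return (bound > 0);
}
```

Note that this predicate only decides whether the loopback path needs to be considered at all; excluding the sending user from delivery would still have to happen when the copies are handed out.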
Analysis of Compatibility Issues:
It is hopeless to expect that all drivers will be enhanced to support this new functionality. We do expect to enhance all the relevant drivers provided by Sun, and we will implement the functionality in the GLD framework which will automatically enhance many, but not all existing non-Sun drivers. Any Solaris feature that might benefit from the new behavior must be implemented in a way that still works on old drivers, or must clearly communicate to the customer that the feature only works with certain drivers, with the latter option not available to existing committed features.
The new requirement that packets not be looped back to the user that transmitted them is a change in behavior for DL_PROMISC_PHYS users. If a user is in DL_PROMISC_PHYS mode and transmits a packet, a copy will no longer be queued for reading. Applications that both receive promiscuously and transmit are unusual, but some might exist that are impacted by this change. In the case of the bridge module, this is a beneficial change.
There is one potentially incompatible behavior change resulting from the IP and ARP streams now being able to receive packets transmitted by other users. An application that uses DLPI to transmit IP packets will have those packets delivered to the local IP stack if their destination MAC address is the local MAC address or the broadcast address or a multicast address currently of interest to IP. Uses of such applications are unusual and probably don't send to the local MAC address. If they do, the new behavior is likely to be a useful improvement, but there could be some situations where it causes a problem.
Practical Considerations for the Ethernet Bridging Project:
Without any of the changes proposed by this case and without any changes to IP, Ethernet bridging is possible, but the local host will be unable to communicate with some of the remote hosts on the bridged LAN -- namely, the ones "across" the bridge from the interface which is plumbed to IP. The feature might still be useful in some situations even with this limitation -- for example one could build a "stealth mode" bridging firewall which was not meant to allow any packets to reach the firewall's IP stack anyway. But it would not be useful in at least one of the major intended uses: Solaris as the OS for Xen domain 0. In that environment we need the (domain 0) OS's IP stack to have full connectivity with the hosts "across" the bridge, which are in fact the guest domains.
We could work around the DLPI driver issue by modifying IP to put its stream in promiscuous (DL_PROMISC_PHYS) mode. This would allow it to receive the packets coming across the bridge when using a plain ordinary NIC driver. But it would drastically increase the amount of traffic that IP must classify and discard. It would also trigger some slower-path processing in most--if not all--existing drivers. So this could not be made the default operating mode for IP; it would have to be enabled when needed, perhaps manually or perhaps automatically triggered by some action by the bridge module.
With the changes proposed by this case, IP will see all of the bridge's transmissions that it's supposed to see, and the bridge won't get useless copies of its own transmissions. If one driver does not have these changes, we don't need to do anything special to keep basic bridging functionality working (except rely on the existing code that discards the useless copies of every transmitted packet that will be passed up by the DLPI driver). But in that case, IP is subject to the same limitations in the two preceding paragraphs - either the local IP stack will be isolated from a portion of the LAN, or we must cause the IP stack to use DL_PROMISC_PHYS mode with the resulting performance penalties.
For at least the initial implementation of Solaris on Xen domain 0, bridging is a key component in the inter-domain network architecture. Even though the inter-domain traffic travels over an in-memory virtual LAN segment between domains, it is actually the driver for the physical NIC that will determine whether IP sees packets from the user domains ("across the bridge"). So it seems that Xen Solaris would have to use some form of the IP-in-promiscuous-mode approach when older NIC drivers are in use. However, the Clearview project plans to deliver a "DLPI shim" component that allows (causes) all non-GLD DLPI drivers to act as MAC drivers beneath the GLD framework, so that will probably solve this problem.
An alternative design for Ethernet bridging -- the one taken by the free bridge-utils implementation for Linux -- modifies the plumbing and administration procedures for IP such that it appears that IP is connected to a pseudointerface provided by the bridge instead of to the NIC. So far, we have chosen not to use this approach because of its disruptive change to the networking administration model. It would be undesirable for the network configuration procedures (specifically the interface name used by ifconfig et al) to substantially change just because you reboot Solaris with (or without) Xen, even when you are not actually using any Xen user domains or features.
|
|
|
|
Posts:
7
From:
San Jose, CA
Registered:
11/11/05
|
|
|
|
Re: DLPI packet loopback proposal
Posted:
Jun 29, 2006 2:50 PM
in response to: mditto
|
|
I would like to post a patch ... how do I do it?
I am with Deepnines Technologies, www.deepnines.com. We are a Sun ISV and have ported our Linux SEP product to Solaris 10; the product is highly reliant on transmission of packets outbound on promiscuously configured interfaces. We share Mike Ditto's view that looping a packet back to the _sender's_ stream is a bug; we cannot imagine a use case for this behavior. Further, in the context of our product, seeing our own packets is a performance nightmare.
I have attached a short patch based on opensolaris Build 42 that attempts to address the loopback issue. I am not an experienced kernel guru and would appreciate feedback on this effort. The good news is that the patch is only about 164 lines and can be applied by cd'ing to $GATE/usr and typing "gpatch -p1 < loopback_fix.patch". The patch is based on changing the signature of mac_txloop() so that it can be aware of its caller's dls_link_t*. We modify mac_txloop() so that it does not call its caller's registered loopback function, but does call everybody else's.
There is a bit of hackery involved, and the patch certainly violates "program to an interface, not to an implementation"; we would certainly be interested in the "better way".
--- usr/sdi_txinfo;
+	dls_impl_t *dip = (dls_impl_t *)dc;
+	const mac_txinfo_t *mtp = dip->di_txinfo;
+	mac_handle_t mh = dip->di_mh;
+
+	if ((void*)mtp->mt_fn == (void*)mac_txloop) {
+		/* Only get here in the promisc loopback case */
+		return ((*(mac_tx_lo_t*)mtp->mt_fn)
+		    (mtp->mt_arg, mp, dip->di_dvp->dv_dlp));
+	}
 	return (mtp->mt_fn(mtp->mt_arg, mp));
 }
--- usr/src/uts/common/io/aggr/aggr_lacp.c.fix_lo	2006-06-14 21:27:39.000000000 -0700
+++ usr/src/uts/common/io/aggr/aggr_lacp.c	2006-06-29 10:32:13.211857000 -0700
@@ -569,7 +569,11 @@
 	 * loading mt_fn and mt_arg.
 	 */
 	mtp = portp->lp_txinfo;
-	mtp->mt_fn(mtp->mt_arg, mp);
+	if ((void*)mtp->mt_fn == (void*)mac_txloop)
+		/* TODO: replace NULL with caller's dls_link_t* */
+		(*(mac_tx_lo_t*)mtp->mt_fn)(mtp->mt_arg, mp, NULL);
+	else
+		mtp->mt_fn(mtp->mt_arg, mp);
 	pl->NTT = B_FALSE;
 	portp->lp_lacp_stats.LACPDUsTx++;
@@ -901,7 +905,11 @@
 	 * loading mt_fn and mt_arg.
 	 */
 	mtp = portp->lp_txinfo;
-	mtp->mt_fn(mtp->mt_arg, mp);
+	if ((void*)mtp->mt_fn == (void*)mac_txloop)
+		/* TODO: replace NULL with caller's dls_link_t* */
+		(*(mac_tx_lo_t*)mtp->mt_fn)(mtp->mt_arg, mp, NULL);
+	else
+		mtp->mt_fn(mtp->mt_arg, mp);
 	return;
 bail:
--- usr/src/uts/common/io/aggr/aggr_send.c.fix_lo	2006-06-14 21:27:39.000000000 -0700
+++ usr/src/uts/common/io/aggr/aggr_send.c	2006-06-29 10:25:16.401166000 -0700
@@ -240,7 +240,14 @@
 		 * changes between loading mt_fn and mt_arg.
 		 */
 		mtp = port->lp_txinfo;
-		if ((mp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
+		if ((void*)mtp->mt_fn == (void*)mac_txloop)
+			/* TODO: replace NULL with caller's
+			   dls_link_t* */
+			mp = (*(mac_tx_lo_t*)mtp->mt_fn)
+			    (mtp->mt_arg, mp, NULL);
+		else
+			mp = mtp->mt_fn(mtp->mt_arg, mp);
+		if (mp != NULL) {
 			mp->b_next = nextp;
 			break;
 		}
--- usr/src/uts/common/io/mac/mac.c.fix_lo	2006-06-14 21:27:39.000000000 -0700
+++ usr/src/uts/common/io/mac/mac.c	2006-06-29 10:20:07.125445000 -0700
@@ -205,7 +205,7 @@
 	 */
 	mip->mi_txinfo.mt_fn = mp->m_tx;
 	mip->mi_txinfo.mt_arg = mp->m_driver;
-	mip->mi_txloopinfo.mt_fn = mac_txloop;
+	mip->mi_txloopinfo.mt_fn = (mac_tx_t) mac_txloop;
 	mip->mi_txloopinfo.mt_arg = mip;
 	/*
@@ -1249,7 +1249,7 @@
  * Transmit function -- ONLY used when there are registered loopback listeners.
  */
 mblk_t *
-mac_txloop(void *arg, mblk_t *bp)
+mac_txloop(void *arg, mblk_t *bp, void *dlp_arg)
 {
 	mac_impl_t *mip = arg;
 	mac_t *mp = mip->mi_mp;
@@ -1272,22 +1272,30 @@
 	rw_enter(&mip->mi_txloop_lock, RW_READER);
 	mtfp = mip->mi_mtfp;
 	while (mtfp != NULL && loop_bp != NULL) {
-		bp = loop_bp;
+		/* Only do the promiscuous loopback call if
+		   the packet was not sent (i.e., tx'ed) on
+		   our own dls_link_t* */
+		if (dlp_arg != mtfp->mtf_arg) {
+			bp = loop_bp;
+
+			/* XXX counter bump if copymsg()
+			   fails? */
+			if (mtfp->mtf_nextp != NULL)
+				loop_bp = copymsg(bp);
+			else
+				loop_bp = NULL;
-		/* XXX counter bump if copymsg() fails? */
-		if (mtfp->mtf_nextp != NULL)
-			loop_bp = copymsg(bp);
-		else
-			loop_bp = NULL;
-
-		mtfp->mtf_fn(mtfp->mtf_arg, bp);
+			mtfp->mtf_fn(mtfp->mtf_arg, bp);
+		}
 		mtfp = mtfp->mtf_nextp;
 	}
 	rw_exit(&mip->mi_txloop_lock);
 	/*
-	 * It's possible we've raced with the disabling of promiscuous
-	 * mode, in which case we can discard our copy.
+	 * It's possible we've raced with the disabling of
+	 * promiscuous mode _or_ our own dls_link_t* was the
+	 * last one in the chain, in which cases we can
+	 * discard our copy.
 	 */
 	if (loop_bp != NULL)
 		freemsg(loop_bp);
--- usr/src/uts/sun4v/io/vsw.c.fix_lo	2006-06-14 21:27:43.000000000 -0700
+++ usr/src/uts/sun4v/io/vsw.c	2006-06-29 10:18:16.400402000 -0700
@@ -1247,7 +1247,14 @@
 		mp->b_next = NULL;
 		mtp = vswp->txinfo;
-		if ((mp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
+		if ((void*)mtp->mt_fn == (void*)mac_txloop)
+			/* TODO: replace NULL with caller's
+			   dls_link_t* */
+			mp = (*(mac_tx_lo_t*)mtp->mt_fn)
+			    (mtp->mt_arg, mp, NULL);
+		else
+			mp = mtp->mt_fn(mtp->mt_arg, mp);
+		if (mp != NULL) {
 			mp->b_next = nextp;
 			break;
 		}
Message was edited by: jmderic
|
|
|
|
Posts:
7
From:
San Jose, CA
Registered:
11/11/05
|
|
|
|
Re: DLPI packet loopback proposal
Posted:
Jun 29, 2006 3:07 PM
in response to: jmderic
|
|
|
|
above patch in file form
|
|
|
|
Posts:
6,810
From:
US
Registered:
3/9/05
|
|
|
|
Re: Re: DLPI packet loopback proposal
Posted:
Jun 30, 2006 4:20 AM
in response to: jmderic
|
|
Mark Deric writes: > I would like to post a patch ... how do I do it?
Posting a patch won't lead to having any of the sources changed. You probably need to start here:
http://www.opensolaris.org/os/about/faq/getting_started_developers/
The short answer is that the place to start is to file a bug or RFE against the part of the system you're looking at. Next, that issue will need to be evaluated and documented. It may require architectural and design review, and will certainly require code review and testing. It can then be integrated.
The code changes are useful at that point in the effort. Without at least a CR to document what's being done, it's unlikely that anyone will want to reverse-engineer your diffs in an attempt to figure out what you're fixing and why.
-- James Carlson, KISS Network <james dot d dot carlson at sun dot com> Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
|
|
|
|
Posts:
7
From:
San Jose, CA
Registered:
11/11/05
|
|
|
|
Re: Re: DLPI packet loopback proposal
Posted:
Jul 10, 2006 1:34 PM
in response to: carlsonj
|
|
James Carlson writes:
> Posting a patch won't lead to having any of the sources changed. > You probably need to start here:
> http://www.opensolaris.org/os/about/faq/getting_started_developers/
> The short answer is that the place to start is to file a bug or RFE > against the part of the system you're looking at. Next, that issue > will need to be evaluated and documented. It may require > architectural and design review, and will certainly require code > review and testing. It can then be integrated.
Jim, the issue of looping back a packet to the transmitting DLS instance is documented in excruciating detail in this thread (i.e., the thread to which we are both posting): http://www.opensolaris.org/jive/thread.jspa?threadID=6883&tstart=0
It is also documented as sub-item #1 in Bug ID: 6402493.
> The code changes are useful at that point in the effort. Without at > least a CR to document what's being done, it's unlikely that anyone > will want to reverse-engineer your diffs in an attempt to figure out > what you're fixing and why.
I offer the following short form "CR" to document the patch:
1) Goal: Modify the Sol 10/11 behavior so a packet transmitted on a promiscuously configured DLPI file descriptor is not returned (i.e., looped back) to be read by the same file descriptor.
2) The promiscuously configured file descriptor from item #1 is established in the kernel as a dls_impl_t instance. In the current code, the dls_accept_loopback() function is called to determine whether to deliver a transmitted packet to a particular dls_impl_t. Unfortunately, the dls_accept_loopback() function does not know the dls_impl_t of the packet's transmitter; the attached patch fixes this deficiency (except for the case of an aggregated link, and the case of the SPARC-only uts/sun4v/io/vsw.c consumer, neither of which is relevant to our product).
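The idea in item 2 can be modelled outside the kernel as follows. This is a toy sketch only: `listener_t` and `loop_to_others()` are hypothetical stand-ins for the per-stream dls_impl_t and the loopback delivery loop, and none of the real locking, filtering, or message copying is represented. It shows just the one comparison the patch relies on, i.e. passing the transmitter's identity down so that it can be skipped.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-stream instance (cf. dls_impl_t). */
typedef struct listener {
	int id;
	int promisc;		/* non-zero if the stream is promiscuous */
	int received;		/* looped-back packets delivered so far */
	struct listener *next;
} listener_t;

/*
 * Deliver one looped-back packet to every promiscuous listener except
 * the sender.  A NULL sender means the transmitter is unknown, in which
 * case nobody is skipped (this mirrors the old behavior).
 * Returns the number of listeners that received a copy.
 */
size_t
loop_to_others(listener_t *head, const listener_t *sender)
{
	size_t delivered = 0;

	for (listener_t *lp = head; lp != NULL; lp = lp->next) {
		if (lp == sender)
			continue;	/* never loop back to the transmitter */
		if (!lp->promisc)
			continue;	/* not interested in loopback copies */
		lp->received++;
		delivered++;
	}
	return (delivered);
}
```

In this model a caller that cannot identify itself (NULL sender) still gets its own copy back, which corresponds to the aggregation and vsw cases that the patch leaves unaddressed.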
Note: the new patch, attached hereto, supersedes the one I posted before your 7/3-7/06 shutdown and has been tested by local rebuilding ("make") within the opensolaris B42 tree at the following directories:
/usr/src/uts/intel/dls
/usr/src/uts/intel/mac
/usr/src/uts/intel/aggr
The following resultant files were used to replace the corresponding files on the S10 606 distribution:
/kernel/misc/amd64/dls
/kernel/misc/amd64/mac
/kernel/misc/dls
/kernel/misc/mac
/kernel/drv/amd64/aggr
/kernel/drv/aggr
The S10 606 system was rebooted and tested with two separate userland programs (a unit test program and our company's SEP product) for the revised behavior described in item #1 above.
Some items wrt your faq/getting_started_developers link above:
A) How does one go about getting a sponsor to be listed on the bug_reports/request_sponsor table?
B) What is the best way to coordinate activity here with activity on the same subject being conducted through Sun's Market Development Engineering unit?
C) Why can I post attachments sometimes, while at other times (now, for example), when I click on "Reply/Attach Files", I get a dialogue that says, "You are not allowed to edit this message."?
Because of item C, above, the revised patch appears inline, below.
Regards, Mark Deric
> James Carlson, KISS Network <james dot d dot carlson at sun dot com> > 1 Network Drive 71.232W Vox +1 781 442 2084 > MS UBUR02-212 / Burlington MA 01803-2757 42.496N > Fax +1 781 442 1677 > _____________________________________________ > networking-discuss mailing list > networking-discuss at opensolaris dot org >
--- usr/src/uts/common/sys/mac.h.fix_lo 2006-06-14 21:27:32.000000000 -0700
+++ usr/src/uts/common/sys/mac.h 2006-07-02 14:25:47.962970000 -0700
@@ -264,6 +264,7 @@
 typedef void (*mac_resources_t)(void *);
 typedef void (*mac_ioctl_t)(void *, queue_t *, mblk_t *);
 typedef mblk_t *(*mac_tx_t)(void *, mblk_t *);
+typedef mblk_t *(*mac_tx_lo_t)(void *, mblk_t *, void *);
 /*
  * MAC extensions. (Currently there are non defined).
@@ -330,7 +331,7 @@
 typedef void (*mac_notify_t)(void *, mac_notify_type_t);
 typedef void (*mac_rx_t)(void *, mac_resource_handle_t, mblk_t *);
-typedef void (*mac_txloop_t)(void *, mblk_t *);
+typedef void (*mac_txloop_t)(void *, mblk_t *, void *);
 typedef void (*mac_blank_t)(void *, time_t, uint_t);
 /*
@@ -398,7 +399,7 @@
 extern void mac_notify(mac_handle_t);
 extern mac_rx_handle_t mac_rx_add(mac_handle_t, mac_rx_t, void *);
 extern void mac_rx_remove(mac_handle_t, mac_rx_handle_t);
-extern mblk_t *mac_txloop(void *, mblk_t *);
+extern mblk_t *mac_txloop(void *, mblk_t *, void *);
 extern mac_txloop_handle_t mac_txloop_add(mac_handle_t, mac_txloop_t, void *);
 extern void mac_txloop_remove(mac_handle_t,
--- usr/src/uts/common/sys/dls_impl.h.fix_lo 2006-06-14 21:27:33.000000000 -0700
+++ usr/src/uts/common/sys/dls_impl.h 2006-07-05 11:05:02.406665000 -0700
@@ -145,8 +145,8 @@
 extern int dls_fini(void);
 extern boolean_t dls_accept(dls_impl_t *, const uint8_t *, dls_rx_t *, void **);
-extern boolean_t dls_accept_loopback(dls_impl_t *, const uint8_t *,
-    dls_rx_t *, void **);
+extern boolean_t dls_accept_loopback(dls_impl_t *, dls_impl_t *,
+    const uint8_t *, dls_rx_t *, void **);
 #ifdef __cplusplus
 }
--- usr/src/uts/common/io/dls/dls.c.fix_lo 2006-06-14 21:27:37.000000000 -0700
+++ usr/src/uts/common/io/dls/dls.c 2006-07-05 14:06:05.229098000 -0700
@@ -795,7 +795,17 @@
 mblk_t *
 dls_tx(dls_channel_t dc, mblk_t *mp)
 {
-    const mac_txinfo_t *mtp = ((dls_impl_t *)dc)->di_txinfo;
+    dls_impl_t *dip = (dls_impl_t *)dc;
+    const mac_txinfo_t *mtp = dip->di_txinfo;
+    mac_handle_t mh = dip->di_mh;
+
+    if ((void*)mtp->mt_fn == (void*)mac_txloop) {
+        /* Only get here in the promisc loopback case */
+        /* cmn_err(CE_WARN,
+            "dls_tx: Send promisc block"); */
+        return (((mac_tx_lo_t)mtp->mt_fn)
+            (mtp->mt_arg, mp, dip));
+    }
     return (mtp->mt_fn(mtp->mt_arg, mp));
 }
@@ -904,9 +914,15 @@
 /*ARGSUSED*/
 boolean_t
-dls_accept_loopback(dls_impl_t *dip, const uint8_t *daddr, dls_rx_t *di_rx,
-    void **di_rx_arg)
+dls_accept_loopback(dls_impl_t *dip, dls_impl_t *dip_caller,
+    const uint8_t *daddr, dls_rx_t *di_rx, void **di_rx_arg)
 {
+    /* Only check accept of the promiscuous loopback call if the packet was
+       not sent (i.e., tx'ed) on our own dls_impl_t* */
+    /* cmn_err(CE_WARN, "dls_accept_loopback: caller: %lu; target: %lu",
+       (size_t)dip_caller, (size_t)dip); */
+    if (dip_caller == dip)
+        return (B_FALSE);
     /*
      * We must not accept packets if the dls_impl_t is not marked as bound
      * or is being removed.
--- usr/src/uts/common/io/dls/dls_link.c.fix_lo 2006-06-14 21:27:37.000000000 -0700
+++ usr/src/uts/common/io/dls/dls_link.c 2006-07-05 11:17:38.480796000 -0700
@@ -593,7 +593,7 @@
 }
 static void
-i_dls_link_ether_loopback(void *arg, mblk_t *mp)
+i_dls_link_ether_loopback(void *arg, mblk_t *mp, void* dip_caller)
 {
     dls_link_t *dlp = arg;
     mod_hash_t *hash = dlp->dl_impl_hash;
@@ -652,8 +652,8 @@
      * Find dls_impl_t that will accept the sub-chain.
      */
     for (dip = dhp->dh_list; dip != NULL; dip = dip->di_nextp) {
-        if (!dls_accept_loopback(dip, daddr, &di_rx,
-            &di_rx_arg))
+        if (!dls_accept_loopback(dip, dip_caller, daddr,
+            &di_rx, &di_rx_arg))
             continue;
         /*
@@ -695,7 +695,8 @@
      * Find the first dls_impl_t that will accept the sub-chain.
      */
     for (dip = dhp->dh_list; dip != NULL; dip = dip->di_nextp)
-        if (dls_accept_loopback(dip, daddr, &di_rx, &di_rx_arg))
+        if (dls_accept_loopback(dip, dip_caller, daddr, &di_rx,
+            &di_rx_arg))
             break;
     /*
@@ -715,8 +716,9 @@
      */
     for (ndip = dip->di_nextp; ndip != NULL; ndip = ndip->di_nextp)
-        if (dls_accept_loopback(ndip, daddr,
-            &ndi_rx, &ndi_rx_arg))
+        if (dls_accept_loopback(ndip, dip_caller,
+            daddr, &ndi_rx,
+            &ndi_rx_arg))
             break;
     /*
--- usr/src/uts/common/io/aggr/aggr_lacp.c.fix_lo 2006-06-14 21:27:39.000000000 -0700
+++ usr/src/uts/common/io/aggr/aggr_lacp.c 2006-06-30 15:49:14.880545000 -0700
@@ -569,7 +569,11 @@
      * loading mt_fn and mt_arg.
      */
     mtp = portp->lp_txinfo;
-    mtp->mt_fn(mtp->mt_arg, mp);
+    if ((void*)mtp->mt_fn == (void*)mac_txloop)
+        /* TODO: replace NULL with caller's dls_link_t* */
+        ((mac_tx_lo_t)mtp->mt_fn)(mtp->mt_arg, mp, NULL);
+    else
+        mtp->mt_fn(mtp->mt_arg, mp);
     pl->NTT = B_FALSE;
     portp->lp_lacp_stats.LACPDUsTx++;
@@ -901,7 +905,11 @@
      * loading mt_fn and mt_arg.
      */
     mtp = portp->lp_txinfo;
-    mtp->mt_fn(mtp->mt_arg, mp);
+    if ((void*)mtp->mt_fn == (void*)mac_txloop)
+        /* TODO: replace NULL with caller's dls_link_t* */
+        ((mac_tx_lo_t)mtp->mt_fn)(mtp->mt_arg, mp, NULL);
+    else
+        mtp->mt_fn(mtp->mt_arg, mp);
     return;
 bail:
--- usr/src/uts/common/io/aggr/aggr_send.c.fix_lo 2006-06-14 21:27:39.000000000 -0700
+++ usr/src/uts/common/io/aggr/aggr_send.c 2006-06-30 15:50:04.066406000 -0700
@@ -240,7 +240,14 @@
      * changes between loading mt_fn and mt_arg.
      */
     mtp = port->lp_txinfo;
-    if ((mp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
+    if ((void*)mtp->mt_fn == (void*)mac_txloop)
+        /* TODO: replace NULL with caller's
+           dls_link_t* */
+        mp = ((mac_tx_lo_t)mtp->mt_fn)
+            (mtp->mt_arg, mp, NULL);
+    else
+        mp = mtp->mt_fn(mtp->mt_arg, mp);
+    if (mp != NULL) {
         mp->b_next = nextp;
         break;
     }
--- usr/src/uts/common/io/mac/mac.c.fix_lo 2006-06-14 21:27:39.000000000 -0700
+++ usr/src/uts/common/io/mac/mac.c 2006-07-05 11:34:02.796287000 -0700
@@ -205,7 +205,7 @@
      */
     mip->mi_txinfo.mt_fn = mp->m_tx;
     mip->mi_txinfo.mt_arg = mp->m_driver;
-    mip->mi_txloopinfo.mt_fn = mac_txloop;
+    mip->mi_txloopinfo.mt_fn = (mac_tx_t) mac_txloop;
     mip->mi_txloopinfo.mt_arg = mip;
     /*
@@ -1249,7 +1249,7 @@
  * Transmit function -- ONLY used when there are registered loopback listeners.
  */
 mblk_t *
-mac_txloop(void *arg, mblk_t *bp)
+mac_txloop(void *arg, mblk_t *bp, void *dip_caller)
 {
     mac_impl_t *mip = arg;
     mac_t *mp = mip->mi_mp;
@@ -1280,7 +1280,7 @@
         else
             loop_bp = NULL;
-        mtfp->mtf_fn(mtfp->mtf_arg, bp);
+        mtfp->mtf_fn(mtfp->mtf_arg, bp, dip_caller);
         mtfp = mtfp->mtf_nextp;
     }
     rw_exit(&mip->mi_txloop_lock);
--- usr/src/uts/sun4v/io/vsw.c.fix_lo 2006-06-14 21:27:43.000000000 -0700
+++ usr/src/uts/sun4v/io/vsw.c 2006-06-30 15:50:57.731978000 -0700
@@ -1247,7 +1247,14 @@
     mp->b_next = NULL;
     mtp = vswp->txinfo;
-    if ((mp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
+    if ((void*)mtp->mt_fn == (void*)mac_txloop)
+        /* TODO: replace NULL with caller's
+           dls_link_t* */
+        mp = ((mac_tx_lo_t)mtp->mt_fn)
+            (mtp->mt_arg, mp, NULL);
+    else
+        mp = mtp->mt_fn(mtp->mt_arg, mp);
+    if (mp != NULL) {
         mp->b_next = nextp;
         break;
     }
Re: Re: DLPI packet loopback proposal
Posted:
Jul 10, 2006 1:35 PM
in response to: jmderic
The patch inlined in the above message is now also provided as a file attachment. Apparently, I can only attach a file when I reply to _my_ _own_ posts. Just guessing, but this seems like a bug in the discussion forum's web-based UI.
re: Re: Re: DLPI packet loopback proposal
Posted:
Jul 10, 2006 2:01 PM
in response to: jmderic
> mtp = portp->lp_txinfo;
> - mtp->mt_fn(mtp->mt_arg, mp);
> + if ((void*)mtp->mt_fn == (void*)mac_txloop)
> + /* TODO: replace NULL with caller's dls_link_t* */
> + ((mac_tx_lo_t)mtp->mt_fn)(mtp->mt_arg, mp, NULL);
> + else
> + mtp->mt_fn(mtp->mt_arg, mp);
Egad! Having a pointer-to-function sometimes be called with a different function signature is dreadful.
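To make the objection concrete: in C, calling a function through a pointer converted to an incompatible function type is undefined behaviour, and comparing mt_fn against mac_txloop ties every caller to one particular implementation. What follows is only a minimal sketch of one alternative, assuming the sender's identity can travel in the opaque argument rather than in an extra parameter. Every name in it (pkt_t, tx_fn_t, client_ctx_t, loop_tx, send_pkt) is a hypothetical stand-in, not a GLDv3 interface, and it is not the approach taken by either patch in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for mblk_t and mac_tx_t; not the real types. */
typedef struct pkt { int len; } pkt_t;
typedef pkt_t *(*tx_fn_t)(void *arg, pkt_t *p);

typedef struct tx_info {
	tx_fn_t	fn;	/* always called with exactly this signature */
	void	*arg;	/* opaque to the caller */
} tx_info_t;

/* Per-client context: the loopback path learns the sender from 'arg'. */
typedef struct client_ctx {
	void	*driver_arg;
	int	client_id;
} client_ctx_t;

static int last_sender = -1;	/* recorded only to make the sketch observable */

/* Plain driver transmit: NULL means the packet was consumed. */
static pkt_t *
driver_tx(void *arg, pkt_t *p)
{
	(void) arg;
	(void) p;
	return (NULL);
}

/* Loopback-aware transmit with the same signature as driver_tx(). */
static pkt_t *
loop_tx(void *arg, pkt_t *p)
{
	client_ctx_t *ctx = arg;

	/* A real version would use this to skip the sender on loopback. */
	last_sender = ctx->client_id;
	return (driver_tx(ctx->driver_arg, p));
}

/* Callers never compare fn against a known function and never cast it. */
static pkt_t *
send_pkt(const tx_info_t *ti, pkt_t *p)
{
	return (ti->fn(ti->arg, p));
}
```

The cost of this shape is one context per client instead of one shared mt_arg, which may or may not be acceptable for the real mac_txinfo_t.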
I believe Dong-Hai Han is working on a fix to the DLPI packet loopback issue, along with several other related issues. I've CC'd him on this email.
-- meem
Re: re: Re: Re: DLPI packet loopback proposal
Posted:
Jul 10, 2006 4:37 PM
in response to: meem
Peter,
Could not agree with you more on the ugliness of casting a function pointer to a different signature. My issue with alternatives was what is declared "public" vs "private" in the rev 7 Nemo design document dtd 3/13/06 (wherein you are one of 4 authors).
Unfortunately, a mac_tx_t in section 3 and 3.16 is "driver interface public"; whereas, the mac_txloop_t is private. To get to the mac_txloop_t interface, the current design requires traversing the mac_tx_t interface. The bad news is the mac_tx_t interface inappropriately considers the driver's mac_t.m_tx function and the mac module's mac_txloop function as polymorphic implementations of itself. The mac_txloop function needs the caller's dls_impl_t* ; the driver's impl does not.
Hence my comment above in the thread wrt "programming to an implementation, not to an interface"; IMHO, the interface as it stands is broken ... and my proposal is a workaround at best (effective, I hope; and current testing is positive). That being said, we want to deliver product on S10; and broken-ness in the esoterica of design is much better than the broken-ness of an app seeing its own transmitted packets! See the behavior of Linux sockets of type PF_PACKET for comparison.
Sometimes getting something working _now_, with humility, is better than not delivering product because "architects" have not passed.
Mark Deric, Director and Chief Software Architect (I'm cringing), DeepNines, Inc.
re: Re: re: Re: Re: DLPI packet loopback proposal
Posted:
Jul 10, 2006 5:00 PM
in response to: jmderic
> Could not agree with you more on the ugliness of casting a function
> pointer to a different signature. My issue with alternatives was what
> is declared "public" vs "private" in the rev 7 Nemo design document dtd
> 3/13/06 (wherein you are one of 4 authors).
Indeed -- and more than that, I added mac_txloop() in order to allow aggregations to be snooped both at the aggregation level and at the individual-link level. Of course, this issue you're looking to solve wasn't on my radar.
> Unfortunately, a mac_tx_t in section 3 and 3.16 is "driver interface
> public"; whereas, the mac_txloop_t is private.
Actually, neither is public at this point -- though in time the MAC driver interfaces will indeed become public, and the MAC client interfaces will remain consolidation-private.
All that said, anyone contributing changes to OpenSolaris is welcome to modify either public or private interfaces, assuming they go through the right channels (e.g., ARC review and the other things that Jim Carlson mentioned). FWIW, it's generally much easier to change private interfaces, since the impact to other parts of the system is much lower.
> To get to the mac_txloop_t interface, the current design requires
> traversing the mac_tx_t interface. The bad news is the mac_tx_t
> interface inappropriately considers the driver's mac_t.m_tx function
> and the mac module's mac_txloop function as polymorphic implementations
> of itself.
This was no accident; the goal was to allow interposing with zero overhead in the common case where the promiscuous behavior was not needed.
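As a rough illustration of that interposition, the choice between the driver's entry point and the loopback wrapper can be made when the client fetches its transmit info, so the send itself stays a single indirect call. The names below (mac_sim_t, sim_tx_get, sim_txloop and friends) are made up for this sketch and only mimic the shape of the mi_txinfo / mi_txloopinfo pair seen in the patch; they are not the mac module's code.

```c
#include <stddef.h>

typedef struct pkt { int len; } pkt_t;		/* stand-in for mblk_t */
typedef pkt_t *(*tx_fn_t)(void *arg, pkt_t *p);

typedef struct tx_info {
	tx_fn_t	fn;
	void	*arg;
} tx_info_t;

typedef struct mac_sim {
	tx_info_t direct;	/* driver entry point, used in the common case */
	tx_info_t loop;		/* wrapper, used only while listeners exist */
	int	listeners;	/* registered loopback listeners */
	int	driver_calls;
	int	copies_made;
} mac_sim_t;

static pkt_t *
sim_driver_tx(void *arg, pkt_t *p)
{
	mac_sim_t *m = arg;

	(void) p;
	m->driver_calls++;
	return (NULL);		/* packet consumed */
}

static pkt_t *
sim_txloop(void *arg, pkt_t *p)
{
	mac_sim_t *m = arg;

	m->copies_made += m->listeners;	/* one duplicate per listener */
	return (sim_driver_tx(m, p));
}

static void
sim_init(mac_sim_t *m)
{
	m->direct.fn = sim_driver_tx;
	m->direct.arg = m;
	m->loop.fn = sim_txloop;
	m->loop.arg = m;
	m->listeners = 0;
	m->driver_calls = 0;
	m->copies_made = 0;
}

/* The decision is made here, once, rather than on every packet. */
static const tx_info_t *
sim_tx_get(const mac_sim_t *m)
{
	return (m->listeners > 0 ? &m->loop : &m->direct);
}
```

With no listeners registered, a client that calls ti->fn(ti->arg, p) reaches the driver directly and pays nothing for the loopback feature.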
> The mac_txloop function needs the caller's dls_impl_t* ; the driver's
> impl does not.
The fact that mac_txloop() needs that is in itself an architectural issue with the proposed solution. DLS is just one consumer of the mac; hardcoding the mac layer to be aware of it is not appropriate.
As I understand Dong-Hai's solution (which hopefully he will discuss shortly), the DLS layer has its own loopback layer, in addition to the loopback done at the mac layer. With the DLS loopback layer, it is straightforward to filter out the sender-side duplicates.
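A sketch of what such sender-side filtering could look like once the loopback code can see the per-client structures follows. It assumes a simple linked list of clients; client_t and loopback_deliver are invented names, and the "bound" check merely echoes the comment in dls_accept_loopback() about not accepting packets on an unbound dls_impl_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for dls_impl_t. */
typedef struct client {
	struct client *next;
	bool	bound;		/* only bound clients may receive */
	int	received;	/* looped-back packets seen so far */
} client_t;

/*
 * Deliver one looped-back packet to every eligible client except the
 * one that transmitted it. Returns the number of deliveries made.
 */
static int
loopback_deliver(client_t *list, const client_t *sender)
{
	client_t *c;
	int delivered = 0;

	for (c = list; c != NULL; c = c->next) {
		if (c == sender || !c->bound)
			continue;
		c->received++;
		delivered++;
	}
	return (delivered);
}
```

Passing NULL as the sender, as the aggr and vsw call sites in the patch do, simply means nobody is filtered out.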
-- meem
Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 10, 2006 5:32 PM
in response to: meem
Thanks for the speedy reply as my mgmt cares a lot about leveraging S10 advances into our SEP product.
My boss is hammering me on understanding your issues wrt the aggr impl, VLANs, and the bigger picture you're working.
Looking to understand the Dong-Hai Han approach better; hoping I can contribute to the cause.
Regards, Mark
P.S.: what's up with the sponsor stuff from my earlier emails in the thread? Is there a way we can formalize or presence?
re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 10, 2006 6:27 PM
in response to: jmderic
> Thanks for the speedy reply as my mgmt cares a lot about leveraging S10
> advances into our SEP product.
We care about that too :-)
> My boss is hammering me on understanding your issues wrt the aggr impl,
> VLANs, and the bigger picture you're working.
>
> Looking to understand the Dong-Hai Han approach better; hoping I can
> contribute to the cause.
He's in Beijing, so it's likely he's just waking up ;-)
> P.S.: what's up with the sponsor stuff from my earlier emails in the
> thread? Is there a way we can formalize or presence?
Sorry, you lost me here. Can you rephrase?
-- meem
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 11, 2006 12:55 AM
in response to: jmderic
Mark,
Thanks for the code/proposal. We also toyed with having the MAC layer pass a dls_impl_t down and back up but we didn't like messing up the DLS/MAC interface so much. I hope you'll like the solution that Dong-Hai and I are experimenting with, which we will post soon (I'm hoping within 24 hours).
> P.S.: what's up with the sponsor stuff from my earlier emails in the thread?
> Is there a way we can formalize or presence?
I don't think it's necessary in this case to request a sponsor since you have people in Sun that are already working on the same issue and we'd be happy to work with you to get the code changed. The sponsor process is basically a proxy for accessing the source code gate or other internal resources when none of the people implementing a change are Sun employees. In this case, once we agree on an approach, someone like Dong-Hai or myself can drive the reviews and putback to the gate and you can still be listed as one of the implementors of the change.
For the problems you had with posting attachments via the web site, you might want to describe them on the website-discuss list. BTW, did you try just sending an email to the list address after you had subscribed?
Also, you may want to check out the webrev tool (it's in the SUNWonbld package) which is the preferred way to present a set of code changes for review purposes. Unfortunately, it's not practical to send a webrev as an email attachment; you pretty much have to put it on a web server.
-=] Mike [=-
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 11, 2006 6:39 AM
in response to: mditto
> Also, you may want to check out the webrev tool (it's in the SUNWonbld
> package) which is the prefered way to present a set of code changes
> for review purposes. Unfortunately, it's not practical to send a
> webrev as an email attachment; you pretty much have to put it on a web
> server.
... and thanks to Steve Lau and Dan Price, cr.grommit.com is available for the purpose; see http://cr.grommit.com for details.
-- meem
Dong-Hai Han
Donghai.Han@Sun.COM
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 11, 2006 7:57 AM
in response to: mditto
> Mark,
>
> Thanks for the code/proposal. We also toyed with having the MAC layer pass
> a dls_impl_t down and back up but we didn't like messing up the DLS/MAC
> interface so much. I hope you'll like solution that Dong-Hai and I are
> experimenting with, which we will post soon (I'm hoping within 24 hours).
(As Mike promised, within 24 hours. :-) )
Hello, Mark,
I am posting the materials Mike mentioned earlier, the webrev of the changes is: http://www.omnicron.com/~ford/webrev_bridging_dlpi.premacplugin/
As the path implies, it's not the latest snv version (I can't post the latest version since some of the other changes are not open yet), but I believe it's newer than the version on opensolaris, and we think you (and others) will have no problem understanding it.
And, below is the proposal we sent out to some internal aliases a while ago:
Best,
Donghai.
----------------------------------------------------------------------------- -----------------------------------------------
DLPI loopback changes proposal by bridging project
Overview
========
The bridging project i-team proposes changing the loopback logic of DLPI. For Nemo specifically, some of the implementation details need to change.
This change is needed to add the bridge function to Solaris, and it is designed and implemented so that it does not affect existing DLPI users. Because the proposal changes the loopback logic, some problems may still surface, and future projects will have to take the new behavior into consideration.
Background
==========
The loopback function has been in Solaris DLPI implementations for a while. Its main use is to let applications like snoop capture packets sent out by the local stack. Until now it has been bound to promiscuous mode: the Data Link Provider loops packets back only when there are promiscuous users.
This logic worked well as long as only promiscuous users were interested in looped-back packets. Once the bridge function (yes, the common bridge, or switch, function) is added to Solaris, things change: a packet bridged by the bridge module may be headed for the local IP stack, which is not promiscuous, so we have to loop packets back to non-promiscuous users as well. This also imposes another requirement: the loopback mechanism must identify the sender of a packet and avoid looping the packet back to the sender.
The current loopback implementation in Nemo does the loopback at the MAC layer, while most of the loopback happens between DLS users. The MAC layer can't (and shouldn't) distinguish DLS users, so it is hard to avoid looping packets back to the sender. Also, although careful design keeps the current MAC layer loopback free of performance problems, doing all the loopback at the MAC layer causes some logic problems.
Proposed Solution
=================
Based on the requirements of the bridge, we propose the following changes[1] to the loopback logic:
1. Whenever a DLPI user receives packets, it is a potential acceptor of loopback packets.
2. Whenever there are multiple DLPI users on a DLS provider, and at least one of the users is sending packets, the potential acceptors should start receiving loopback packets.
[1] For a more complete description of the idea behind these changes, please refer to the article and CR in the Reference section.
Ideally this logic would apply in all situations. In practice, however, it causes performance problems. For example, when we plumb an interface using ifconfig(1M), ip and arp are added to the user list of the interface; with these two users the conditions above are met and loopback is enabled, which of course is not necessary.
Because of this, some restrictions are added (currently, in the Nemo implementation)[2]:
1. At least one of the DLPI users must be in promiscuous mode. This works for now because the bridge is what drives all these changes and the bridge module is a promiscuous user. If that changes in the future, this restriction should change too.
2. If no user is promiscuous, loopback is enabled when there are users with the same sap. Of course, a user with sap X could send packets to sap Y, but in the real world no normal user does that.
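One possible reading of these restrictions, written as a predicate over the users attached to a link, is sketched below. The dl_user_t structure and the loopback_enabled name are invented for illustration; the requirement of at least two users is taken from the "multiple DLPI users" condition in the proposed logic above, and 0x0800 and 0x0806 are simply the IP and ARP ethertypes used as example saps.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-user record; not a real DLS structure. */
typedef struct dl_user {
	unsigned int sap;	/* bound service access point */
	bool	promisc;	/* user is in promiscuous mode */
} dl_user_t;

/* Decide whether loopback should be switched on for this set of users. */
static bool
loopback_enabled(const dl_user_t *users, size_t n)
{
	size_t i, j;

	if (n < 2)			/* nobody else to loop back to */
		return (false);

	for (i = 0; i < n; i++)		/* restriction 1: any promiscuous user */
		if (users[i].promisc)
			return (true);

	for (i = 0; i < n; i++)		/* restriction 2: two users share a sap */
		for (j = i + 1; j < n; j++)
			if (users[i].sap == users[j].sap)
				return (true);

	return (false);
}
```

Under this reading, a freshly plumbed interface with only ip (0x0800) and arp (0x0806) attached does not enable loopback, which is the case the restrictions are meant to exclude.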
[2] Ideally we should do what Mike Ditto has described in his article and the CR: the transmit path should check each packet and see if the destination MAC address "matches" the local MAC address. This step is omitted for fear of performance problems. Adding the destination address comparison would be a purer design, but it's not necessary for the immediate goal of supporting bridging.
Based on the above conceptual changes, we have finished the prototype for Nemo. The changes are:
1. Loopback happens at multiple levels, the MAC layer and the DLS layer. The MAC layer does loopback for MAC users like DLS and AGGR, while the DLS layer does loopback for its users, like the IP stack, snoop, and the bridge. This makes use of the well-designed DLS layer, and makes it very easy to distinguish the sender (dls_impl_t) of the packets.
2. mac_tx_get now returns mi_txloopinfo only when there are users registered through mac_txloop_add, regardless of whether it's promiscuous or not.
With these changes to loopback behaviour, please note that in the new code DLS uses the MAC layer's transmit loopback, receive, and "active" client features in subtly different ways from the original design. It now works this way:
1. DLS will register an rx callback with the MAC provider whenever there is at least one bound DLS user. "active" mode does not enter into this.
2. DLS will register a txloop callback with the MAC module whenever there is at least one bound DLS user and no active DLS users (whether bound or not). This allows the (one) active MAC client to have its transmitted packets looped back to all of the inactive MAC clients, but not to itself.
3. DLS will (attempt to) claim active client status with the MAC module whenever there is at least one active DLS user, whether bound or not. This means that it is possible for an unbound DLS user to occupy the special "active" position without actually being able to send, which is somewhat undesirable but there shouldn't be any reason for any DLS user to invoke this odd situation (DLD, for example, will never do this for more than an instant).
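The three rules above can be restated as a small decision function over two counters. This is only a restatement for clarity, assuming DLS tracks how many of its users are bound and how many are active; mac_regs_t and dls_wanted_regs are hypothetical names and do not appear in the prototype.

```c
#include <stdbool.h>

/* Which MAC-layer registrations DLS wants at a given moment. */
typedef struct mac_regs {
	bool	rx;		/* rule 1: rx callback registered */
	bool	txloop;		/* rule 2: txloop callback registered */
	bool	active;		/* rule 3: active client status claimed */
} mac_regs_t;

/*
 * bound_users:  DLS users currently bound.
 * active_users: DLS users in "active" mode, whether bound or not.
 */
static mac_regs_t
dls_wanted_regs(unsigned int bound_users, unsigned int active_users)
{
	mac_regs_t r;

	r.rx = (bound_users > 0);
	r.txloop = (bound_users > 0 && active_users == 0);
	r.active = (active_users > 0);
	return (r);
}
```

The odd case called out in rule 3 shows up as dls_wanted_regs(0, 1): active status is claimed although no rx or txloop callback is registered.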
Reference
=========
[internal] Article: DLPI Improved Loopback Handling, by Mike Ditto ...
The CR for this issue 6402493 DLPI provider loopback behavior should be improved
Discussion on OpenSolaris.org http://www.opensolaris.org/jive/message.jspa?messageID=28615
[internal] Bridging project wiki ...
Bridging on OpenSolaris http://www.opensolaris.org/os/project/ethbridge/
[internal] Prototype workspace ...
David Edmondson
dme@sun.com
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 6:53 AM
in response to: Dong-Hai Han
On 11 Jul 2006, at 3:57pm, Dong-Hai Han wrote:
> I am posting the materials Mike mentioned earlier, the webrev of
> the changes is:
> http://www.omnicron.com/~ford/webrev_bridging_dlpi.premacplugin/
I've taken a quick look at this and, as you know, have incorporated a previous version of the code into the gate for the Matrix project[1].
The general approach seems workable, but I think that the details have to be right. For example, i_dls_txloop() assumes that a MAC driver will consume either all or none of the packets passed to it:
263 /*
264  * Transmit function, used when the link is doing local loopback
265  */
266 static mblk_t *
267 i_dls_txloop(dls_impl_t *dip, mblk_t *mp)
268 {
269         const mac_txinfo_t *mtp = dip->di_txinfo;
270         dls_link_t *dlp = dip->di_dvp->dv_dlp;
271         mblk_t *nextp;
272         mblk_t *bp;
273
274         while (mp != NULL) {
275                 nextp = mp->b_next;
276                 mp->b_next = NULL;
277
278                 dlp->dl_local_loopback(dip->di_dvp->dv_dlp, mp, dip);
279
280                 if ((bp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
281                         ASSERT(bp == mp);
282                         goto noresources;
283                 }
284
285                 mp = nextp;
286         }
287
288         return (NULL);
289
290 noresources:
291         mp->b_next = nextp;
292         return (mp);
293 }
(Note lines 280/281.)
Quick inspection of the Broadcom driver shows that it's entirely possible that the MAC driver's transmit routine will accept some but not all of the packets passed down. i_dls_txloop() needs to allow for this.
It's also the case that this newer version of the code will loopback packets that are not accepted for transmission by the MAC driver (the loopback happens before the packets are passed down). This differs from the older code, where the loopback occurred only if the MAC driver accepted the packets.
[1] http://www.opensolaris.org/os/community/xen.
dme.
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 7:12 AM
in response to: David Edmondson
> 274         while (mp != NULL) {
> 275                 nextp = mp->b_next;
> 276                 mp->b_next = NULL;
> 277
> 278                 dlp->dl_local_loopback(dip->di_dvp->dv_dlp, mp, dip);
> 279
> 280                 if ((bp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
> 281                         ASSERT(bp == mp);
> 282                         goto noresources;
> [ ... ]
>
> Quick inspection of the Broadcom driver shows that it's entirely
> possible that the MAC driver's transmit routine will accept some but
> not all of the packets passed down. i_dls_txloop() needs to allow
> for this.
On lines 275-276, the packet is unchained before it's passed down -- so there is only one packet being sent down for transmission.
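For anyone who wants to convince themselves, the loop can be modelled in user space. The sketch below assumes a pretend driver that accepts a fixed number of packets and then hands the next one back; msg_t, fake_tx, fake_loopback and txloop_chain are invented names that only mirror the control flow of i_dls_txloop() quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for mblk_t: just the chain pointer and an id. */
typedef struct msg {
	struct msg *b_next;
	int	id;
} msg_t;

static int capacity;	/* packets the pretend driver will still accept */
static int looped;	/* packets handed to the pretend loopback path */

/* Pretend driver: NULL means consumed, otherwise the packet comes back. */
static msg_t *
fake_tx(msg_t *mp)
{
	if (capacity == 0)
		return (mp);
	capacity--;
	return (NULL);
}

static void
fake_loopback(msg_t *mp)
{
	(void) mp;
	looped++;
}

/* Same shape as i_dls_txloop(): unchain, loop back, then transmit. */
static msg_t *
txloop_chain(msg_t *mp)
{
	msg_t *nextp;
	msg_t *bp;

	while (mp != NULL) {
		nextp = mp->b_next;
		mp->b_next = NULL;	/* the driver only ever sees one packet */

		fake_loopback(mp);

		if ((bp = fake_tx(mp)) != NULL) {
			assert(bp == mp);	/* so this can only be that packet */
			mp->b_next = nextp;	/* re-attach the unsent tail */
			return (mp);
		}
		mp = nextp;
	}
	return (NULL);
}
```

With a chain of three packets and room for two, the function returns the third packet, while the loopback counter already reads three: the rejected packet was looped back before the driver refused it, which is the second point in David's message.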
-- meem
David Edmondson
dme@Sun.COM
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 7:34 AM
in response to: meem
On 17 Jul 2006, at 3:12pm, Peter Memishian wrote:
> On lines 275-276, the packet is unchained before it's passed down -- so
> there is only one packet being sent down for transmission.
Of course, sorry.
dme.
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 8:32 AM
in response to: David Edmondson
> > On lines 275-276, the packet is unchained before it's passed down -- so
> > there is only one packet being sent down for transmission.
>
> Of course, sorry.
I should point out that Cathy Zhou has found a limitation with the current design while working on moving ibd(7D) to GLDv3. Specifically, with its transmit entrypoint, there are three conditions that need to be differentiated:
1. A packet was accepted for transmission.
2. A packet was not accepted due to temporary resource exhaustion.
3. A packet was not accepted for other reasons, and will never be accepted.
Unfortunately, I don't think it's possible to differentiate all three cases in the current interface for mi_tx().
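To spell the limitation out, here is a sketch that models the three conditions as an enum and compares them with what a "pointer or NULL" return can express. It rests on an assumption not stated in this thread: that a driver which will never accept a packet has no choice but to free it and return NULL, exactly as it does on success. The tx_status_t type and the helper names are hypothetical.

```c
#include <stdbool.h>

/* The three outcomes listed above, as a hypothetical status code. */
typedef enum tx_status {
	TX_ACCEPTED,	/* 1. accepted for transmission */
	TX_BUSY,	/* 2. temporary resource exhaustion */
	TX_REJECTED	/* 3. will never be accepted */
} tx_status_t;

/*
 * All that a pointer-or-NULL return can say: was the packet handed back?
 * Assumption: only the "busy" case hands the packet back.
 */
static bool
legacy_handed_back(tx_status_t s)
{
	return (s == TX_BUSY);
}

/* What a caller would like to decide if it could see the full status. */
static bool
should_retry(tx_status_t s)
{
	return (s == TX_BUSY);
}

static bool
should_loopback(tx_status_t s)
{
	return (s == TX_ACCEPTED);
}
```

Under that assumption, cases 1 and 3 look identical to the caller, so a loopback decision made after the transmit call cannot tell an accepted packet from a discarded one.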
-- meem
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 3:35 PM
in response to: meem
Peter Memishian wrote:
> Unfortunately, I don't think it's possible to differentiate all three
> cases in the current interface for mi_tx().
If we could rework this interface a bit so that the tx routine never frees packets but only marks their status in the mblk somehow, then the caller (specifically DLS) could make decisions and do things with packets after the driver gives its result, and free them as appropriate. The problem is that the driver probably will want to free them asynchronously. This seems to need a callback from the MAC driver when it has accepted a packet and committed to its transmission, but not yet initiated the process that could lead to an asynchronous freemsg, so that DLS has its opportunity to make any dups that it needs for loopback purposes.
Maybe this is evidence that we're just doing queueing at the wrong point. If the MAC provider continuously provided a tx queue full/not-full indicator and always accepted everything given to it, instead of sometimes returning some packets unprocessed, we wouldn't have this problem.
-=] Mike [=-
Dong-Hai Han
Donghai.Han@Sun.COM
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 8:15 AM
in response to: David Edmondson
Thanks for taking the time to review it; my reply is inline.
----- Original Message -----
From: David Edmondson <dme at sun dot com>
Date: Monday, July 17, 2006 9:53 pm
Subject: Re: [networking-discuss] Promiscuous DLPI packet loopback (was something else)
To: Dong-Hai Han <Donghai dot Han at Sun dot COM>
Cc: networking-discuss at opensolaris dot org
> On 11 Jul 2006, at 3:57pm, Dong-Hai Han wrote:
> > I am posting the materials Mike mentioned earlier, the webrev of
> > the changes is:
> > http://www.omnicron.com/~ford/webrev_bridging_dlpi.premacplugin/
>
> I've taken a quick look at this and, as you know, have incorporated a
> previous version of the code into the gate for the Matrix project[1].
>
> The general approach seems workable, but I think that the details
> have to be right. For example, i_dl_txloop() assumes that a MAC
> driver will consume either all or none of the packets passed to it:
>
> 263 /*
> 264  * Transmit function, used when the link is doing local loopback
> 265  */
> 266 static mblk_t *
> 267 i_dls_txloop(dls_impl_t *dip, mblk_t *mp)
> 268 {
> 269         const mac_txinfo_t *mtp = dip->di_txinfo;
> 270         dls_link_t *dlp = dip->di_dvp->dv_dlp;
> 271         mblk_t *nextp;
> 272         mblk_t *bp;
> 273
> 274         while (mp != NULL) {
> 275                 nextp = mp->b_next;
> 276                 mp->b_next = NULL;
> 277
> 278                 dlp->dl_local_loopback(dip->di_dvp->dv_dlp, mp, dip);
> 279
> 280                 if ((bp = mtp->mt_fn(mtp->mt_arg, mp)) != NULL) {
> 281                         ASSERT(bp == mp);
> 282                         goto noresources;
> 283                 }
> 284
> 285                 mp = nextp;
> 286         }
> 287
> 288         return (NULL);
> 289
> 290 noresources:
> 291         mp->b_next = nextp;
> 292         return (mp);
> 293 }
>
> (Note lines 280/281.)
>
> Quick inspection of the Broadcom driver shows that it's entirely
> possible that the MAC driver's transmit routine will accept some but
> not all of the packets passed down. i_dls_txloop() needs to allow
> for this.
Meem has pointed out what's happening here, thanks Meem.
> It's also the case that this newer version of the code will loopback
> packets that are not accepted for transmission by the MAC driver (the
> loopback happens before the packets are passed down). This differs
> from the older code, where the loopback occurred only if the MAC
> driver accepted the packets.
Yes, it's changed. Mike and I have discussed it, and our conclusions are:
1. Even if the MAC driver returns OK, that doesn't mean the packet is safely on the line, or even that it is being sent out.
2. For the xen case, when domUs are communicating with each other, does it make sense to cut the virtual link when the packet can't go through the physical link?
So, we chose this approach.
> [1] http://www.opensolaris.org/os/community/xen.
>
> dme.
Best,
Donghai.
Re: Promiscuous DLPI packet loopback (was something else)
Posted:
Jul 17, 2006 8:36 AM
in response to: Dong-Hai Han
|
|
> > It's also the case that this newer version of the code will loopback
> > packets that are not accepted for transmission by the MAC driver (the
> > loopback happens before the packets are passed down). This differs
> > from the older code, where the loopback occurred only if the MAC
> > driver accepted the packets.
>
> Yes, it's changed. Mike and I have discussed it, our conclusion is that:
> 1. Even if MAC driver returns OK, it doesn't mean that the packet is safe on
> the line, or even being sent out.
I think it traditionally means that the packet has been passed to the hardware for transmission. That is, I don't think it's OK for us to indicate that the packet has been looped back when in fact the driver has thrown it away and we know it.
-- meem
David Edmondson
dme@Sun.COM

Re: Promiscuous DLPI packet loopback (was something else)
Posted: Jul 17, 2006 9:01 AM
in response to: Dong-Hai Han
On 17 Jul 2006, at 4:15pm, Dong-Hai Han wrote:
>> It's also the case that this newer version of the code will loopback
>> packets that are not accepted for transmission by the MAC driver (the
>> loopback happens before the packets are passed down). This differs
>> from the older code, where the loopback occurred only if the MAC
>> driver accepted the packets.
> Yes, it's changed. Mike and I have discussed it, our conclusion is
> that:
> 1. Even if MAC driver returns OK, it doesn't mean that the packet
> is safe on the line, or even being sent out.
Whilst this is true, the proposed behaviour does not match what is generally accepted (or what I wrote). In most instances, if the tx function indicates that it has accepted the packet, then it's reasonable to assume that it's queued for transmission.
If we go the way you propose, it seems that snoop could see a variety of packets that are not accepted for transmission by the driver. This would be very confusing for an administrator.
> 2. For xen case, when domUs are communicating with each other, does
> it make sense to cut the virtual link when the packet can't go
> through the physical link?
I don't follow this comment, sorry.
With the current Xen backend driver, if the backend and frontend are not both in the 'Connected' state then attempts to transmit packets via the backend driver fail. Similarly, if snoop is monitoring the backend interface before the backend and frontend are both 'Connected' it doesn't see any packets reflected back.
dme.
Posts: 110
From: US
Registered: 3/9/05

Re: Promiscuous DLPI packet loopback (was something else)
Posted: Jul 17, 2006 3:04 PM
in response to: David Edmondson
David Edmondson wrote:
> If we go the way you propose, it seems that snoop could see a variety
> of packets that are not accepted for transmission by the driver. This
> would be very confusing for an administrator.
>
>> 2. For xen case, when domUs are communicating with each other, does
>> it make sense to cut the virtual link when the packet can't go
>> through the physical link?
>
> I don't follow this comment, sorry.
The case here is when the bridge forwards a packet from a Xen DomU to the physical link interface that (Dom0) IP is plumbed on. If the driver is unable to transmit packets because of hardware problems or link failure, should we start dropping packets that the Xen DomU is sending to Dom0? I don't think so, although the right behavior here is not obvious. By generalizing the loopback behavior we are effectively extending the topology of the link. When there are two users of a datalink provider, we consider them to be like two nodes on the link, and there are situations where they are able to talk to each other even while there is a failure elsewhere on the link and some other nodes are unreachable. This is analogous to having a small hub on your desk, connecting your laptop and desktop to the central switch in a distant computer room. If the link to the computer room fails, you still want your laptop and desktop machines to be able to communicate with each other.
But I had forgotten about the queueing aspects. When the driver's transmit buffers are full, it returns some packets unprocessed, allowing the caller to queue them until the hardware is ready. In that case it's wrong to loop them back to snoop or any other user unless we can arrange for them to be queued without being looped back again later. I guess this logic does have to be changed. Unfortunately, it seems to require holding onto a dup of every packet while the driver makes its transmit attempt, as opposed to the loopback-first approach which avoids the dup unless somebody actually needs a copy.
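The queueing concern above can be illustrated with a minimal sketch in C. It is not the DLS code from the webrev: pkt_t, toy_drv_t, toy_tap_t and every function below are hypothetical stand-ins (pkt_t plays the role of an mblk_t chained through b_next, and a full copy stands in for the "dup"). The sketch takes a copy of each packet before the transmit attempt, loops the copy back only if the driver accepted the packet, and returns the unsent remainder of the chain so the caller can queue it without anything having been shown to observers yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for mblk_t: an id plus a b_next-style chain link. */
typedef struct pkt {
	int		id;
	struct pkt	*next;
} pkt_t;

/* Toy driver: accepts packets until its (hypothetical) ring is full. */
typedef struct toy_drv {
	int	free_slots;
} toy_drv_t;

/* Toy observer (think snoop): records the ids of packets it was shown. */
typedef struct toy_tap {
	int	seen[16];
	int	nseen;
} toy_tap_t;

pkt_t *
pkt_alloc(int id, pkt_t *next)
{
	pkt_t *p = malloc(sizeof (*p));

	if (p != NULL) {
		p->id = id;
		p->next = next;
	}
	return (p);
}

/* Returns NULL if the packet was accepted, else the packet itself. */
pkt_t *
toy_tx(toy_drv_t *drv, pkt_t *p)
{
	if (drv->free_slots == 0)
		return (p);
	drv->free_slots--;
	free(p);		/* the driver now owns and consumes it */
	return (NULL);
}

/* Takes ownership of the copy it is given. */
void
toy_loopback(toy_tap_t *tap, pkt_t *copy)
{
	assert(tap->nseen < 16);
	tap->seen[tap->nseen++] = copy->id;
	free(copy);
}

/*
 * Transmit first, loop back second.  Returns the unsent remainder of the
 * chain (NULL if everything was accepted); nothing in the remainder has
 * been looped back, so it can be retried later without duplicates.
 */
pkt_t *
tx_then_loop(toy_drv_t *drv, toy_tap_t *tap, pkt_t *mp)
{
	while (mp != NULL) {
		pkt_t *nextp = mp->next;
		pkt_t *copy;

		mp->next = NULL;
		/* Copy before tx: an accepted packet belongs to the driver. */
		copy = pkt_alloc(mp->id, NULL);

		if (toy_tx(drv, mp) != NULL) {
			/* Rejected: drop the copy, hand the tail back. */
			free(copy);
			mp->next = nextp;
			return (mp);
		}
		if (copy != NULL)
			toy_loopback(tap, copy);
		mp = nextp;
	}
	return (NULL);
}
```

With a three-packet chain and a driver that has two free slots, the observer is shown only the first two packets and the third comes back as the remainder; retrying the remainder once a slot frees up shows it exactly once. The cost Mike mentions is visible here: the copy is made for every packet, even when the transmit attempt fails.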
-=] Mike [=-
Dong-Hai Han
Donghai.Han@Sun.COM

Re: Promiscuous DLPI packet loopback (was something else)
Posted: Jul 17, 2006 8:37 PM
in response to: mditto
Mike Ditto wrote:
> David Edmondson wrote:
>
>> If we go the way you propose, it seems that snoop could see a variety
>> of packets that are not accepted for transmission by the driver.
>> This would be very confusing for an administrator.
>>
>>> 2. For xen case, when domUs are communicating with each other, does
>>> it make sense to cut the virtual link when the packet can't go
>>> through the physical link?
>>
>> I don't follow this comment, sorry.
>
> The case here is when the bridge forwards a packet from a Xen DomU to
> the physical link interface that (Dom0) IP is plumbed on. If the driver
> is unable to transmit packets because of hardware problems or link
> failure, should we start dropping packets that the Xen DomU is sending
> to Dom0? I don't think so, although the right behavior here is not
> obvious. By generalizing the loopback behavior we are effectively
> extending the topology of the link. When there are two users of a
> datalink provider, we consider them to be like two nodes on the link,
> and there are situations where they are able to talk to each other even
> while there is a failure elsewhere on the link and some other nodes are
> unreachable. This is analogous to having a small hub on your desk,
> connecting your laptop and desktop to the central switch in a distant
> computer room. If the link to the computer room fails, you still want
> your laptop and desktop machines to be able to communicate with each
> other.
>
> But I had forgotten about the queueing aspects. When the driver's
> transmit buffers are full, it returns some packets unprocessed,
> allowing the caller to queue them until the hardware is ready. In
> that case it's wrong to loop them back to snoop or any other user
> unless we can arrange for them to be queued without being looped back
> again later.

D'oh!
> I guess this logic does have to be changed.

I agree.

> Unfortunately, it seems to require holding onto a dup of every packet
> while the driver makes its transmit attempt, as opposed to the
> loopback-first approach which avoids the dup unless somebody actually
> needs a copy.

Looks like our "save-one-copy" approach is unsuitable here. :-(

> -=] Mike [=-
Best,
Donghai.
Posts: 7
From: San Jose, CA
Registered: 11/11/05

Re: Promiscuous DLPI packet loopback (was something else)
Posted: Jul 11, 2006 11:48 AM
in response to: mditto
Gents,
Wow! We've been trying to surface this issue within Sun since 2002 and Solaris 8. Now I feel like I've fallen out of the lucky tree and hit all the branches :-) Since my post yesterday afternoon at ~1:30 pm Pacific, I believe there have been 12 posts, and this subject has hit the charts on "most popular networking discussion thread".
I feel _very_ much behind the curve now with my naivety wrt Sun's processes, organization, etc. and want to address all of the well-thought-out items you gentlemen have written above in kind. I am very interested in following early release code from Donghai's, Peter's, or Mike's sectors, but very much want to make sure I stay on the side of the solution with positive contribution.
Thank you for your responses and I will offer more as I give the above items the study they deserve.
Best Regards, Mark
Dong-Hai Han
Donghai.Han@Sun.COM

Re: Re: re: Re: Re: DLPI packet loopback proposal
Posted: Jul 10, 2006 8:36 PM
in response to: meem
Peter Memishian wrote:
> This was no accident; the goal was to allow interposing with zero overhead
> in the common case where the promiscuous behavior was not needed.
>
> > The mac_txloop function needs the caller's dls_impl_t* ; the driver's
> > impl does not.
>
> The fact that mac_txloop() needs that is in itself an architectural issue
> with the proposed solution. DLS is just one consumer of the mac;
> hardcoding the mac layer to be aware of it is not appropriate.
>
> As I understand Dong-Hai's solution (which hopefully he will discuss
> shortly), the DLS layer has its own loopback layer, in addition to the
> loopback done at the mac layer. With the DLS loopback layer, it is
> straightforward to filter out the sender-side duplicates.
The work I was doing on DLPI loopback is, as Meem said, doing multi-layer loopbacks. I don't think I have to go into much detail here, because Mike Ditto told me that he had discussed this issue with Mark, and Mike's proposal on opensolaris.org has already stated the design very well.
Simply speaking, yes, with a DLS loopback layer, it's trivial to find the sender and avoid the unnecessary loopbacks.
Currently we only do this check at the DLS layer; in the future (maybe) we can add code to do it at the MAC layer too. Of course, the MAC layer shouldn't know anything about the internals of DLS, like dls_impl_t. We can resolve this by, for example, using opaque handles.
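As a rough illustration of the opaque-handle idea, here is a minimal sketch in C. It assumes a toy MAC layer and is not the Solaris MAC or DLS interface: toy_mac_t, toy_client_handle_t, toy_mac_open() and toy_mac_loopback() are all hypothetical names. Each client receives a handle when it registers and passes it back on transmit; the MAC layer compares handles to skip the sender and never needs to know what a dls_impl_t is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	TOY_MAX_CLIENTS	8

/* Receive callback a client registers; arg is private to that client. */
typedef void (*toy_rx_fn_t)(void *arg, int pkt_id);

/* Clients hold this pointer but treat it as opaque. */
typedef struct toy_client *toy_client_handle_t;

struct toy_client {
	toy_rx_fn_t	rx;
	void		*arg;
	int		in_use;
};

typedef struct toy_mac {
	struct toy_client	clients[TOY_MAX_CLIENTS];
} toy_mac_t;

void
toy_mac_init(toy_mac_t *mac)
{
	(void) memset(mac, 0, sizeof (*mac));
}

/* Register a client; returns its handle, or NULL if the table is full. */
toy_client_handle_t
toy_mac_open(toy_mac_t *mac, toy_rx_fn_t rx, void *arg)
{
	for (int i = 0; i < TOY_MAX_CLIENTS; i++) {
		if (!mac->clients[i].in_use) {
			mac->clients[i].rx = rx;
			mac->clients[i].arg = arg;
			mac->clients[i].in_use = 1;
			return (&mac->clients[i]);
		}
	}
	return (NULL);
}

/* Loop a transmitted packet back to every client except the sender. */
void
toy_mac_loopback(toy_mac_t *mac, toy_client_handle_t sender, int pkt_id)
{
	for (int i = 0; i < TOY_MAX_CLIENTS; i++) {
		struct toy_client *c = &mac->clients[i];

		if (c->in_use && c != sender)
			c->rx(c->arg, pkt_id);
	}
}

/* Example callback: counts packets delivered to the owning client. */
void
toy_count_rx(void *arg, int pkt_id)
{
	(void) pkt_id;
	(*(int *)arg)++;
}
```

If two clients register with toy_count_rx and one of them transmits, only the other client's counter goes up. The handle is just an address inside the MAC layer's own table, so nothing about the client's internals leaks downward.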
Best,
Donghai.
Posts: 110
From: US
Registered: 3/9/05

Re: Re: re: Re: Re: DLPI packet loopback proposal
Posted: Jul 10, 2006 10:43 PM
in response to: Dong-Hai Han
Dong-Hai Han wrote:
> The work I was doing on DLPI loopback is, as Meem said, doing
> multi-layer loopbacks, I don't think I have to get into much details
> here, cause Mike Ditto told me that he had discussed this issue with
> Mark, and Mike's proposal on opensolaris.org has already stated the
> design very well.
Actually I've only indirectly spoken with someone from DeepNines about getting our prototype fix into DeepNines's hands for early access. Dong-Hai, I think we should go ahead and post a description and webrev of what we have so that Mark can see if it meets their needs. We will still have to go through the PSARC and other reviews once there is agreement that the approach makes sense. Also, we have so far only made the change for GLDv3; I think it is not architecturally complete unless we also at least make the same change for GLDv2 and perhaps some monolithic drivers. Or we could wait until Clearview makes everything go through GLDv3.
> Currently we only do this check at DLS layer, in the future (maybe), we
> can add code to do it at MAC layer too, of course MAC layer shouldn't
> know anything about the internals of DLS, like dls_impl_t. We can
> resolve this by, for example, using opaque handles, etc.
I don't think it is necessary to use even opaque handles if each layer (DLS, MAC) handles only its own loopback and assumes that the other layer will take care of itself.
-=] Mike [=-
Dong-Hai Han
Donghai.Han@Sun.COM

Re: Re: re: Re: Re: DLPI packet loopback proposal
Posted: Jul 10, 2006 10:51 PM
in response to: mditto
Mike Ditto wrote:
> Dong-Hai Han wrote:
>
>> The work I was doing on DLPI loopback is, as Meem said, doing
>> multi-layer loopbacks, I don't think I have to get into much details
>> here, cause Mike Ditto told me that he had discussed this issue with
>> Mark, and Mike's proposal on opensolaris.org has already stated the
>> design very well.
>
> Actually I've only indirectly spoken with someone from DeepNines about
> getting our prototype fix into DeepNines's hands for early access.

Er, I misunderstood the messages, anyway, your post is clear enough...
> Dong-Hai, I think we should go ahead and post a description and webrev
> of what we have so that Mark can see if it meets their needs.

We can do that after the internal review?
>> Currently we only do this check at DLS layer, in the future (maybe), we
>> can add code to do it at MAC layer too, of course MAC layer shouldn't
>> know anything about the internals of DLS, like dls_impl_t. We can
>> resolve this by, for example, using opaque handles, etc.
>
> I don't think it is necessary to use even opaque handles if each layer
> (DLS, MAC) handles only its own loopback and assumes that the other
> layer will take care of itself.

En, maybe I used the wrong word for it; anyway, each layer needs something to identify various upper layers...

> -=] Mike [=-
Best,
Donghai.
Posts: 110
From: US
Registered: 3/9/05

Re: Re: re: Re: Re: DLPI packet loopback proposal
Posted: Jul 10, 2006 11:03 PM
in response to: Dong-Hai Han
Dong-Hai Han wrote:
>> Dong-Hai, I think we should go ahead and post a description and webrev
>> of what we have so that Mark can see if it meets their needs.
>
> We can do that after the internal review?
I think the serious interest by multiple people here on networking-discuss justifies sharing our work in progress now, as long as everybody knows it's just brainstorming/prototyping. Besides, there is no such thing as "internal review"; at this point Mark is as much a contributor to solving this problem as you and I are. :-)
> En, maybe I used wrong word for it, anyway, each layer needs something
> to identify various upper layers...
Ah, I see what you mean. Yes, I think eventually the MAC layer may have to start issuing unique handles to its clients so that it knows who is transmitting and can avoid looping back to the sender. But our interim fix solves 99% of the problem without requiring that more substantial change.
-=] Mike [=-