Thread: Clearview IP-Level Observability Devices design review (due 11/8)


Replies: 48 - Last Post: Dec 7, 2005 12:01 AM by: meem
philk

Posts: 42
From: UK

Registered: 6/29/05
Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 25, 2005 3:21 PM

Hi,

As described in the previous mails, which introduced the IPMP
rearchitecture and the IP Tunneling device, there is an approachability
project underway called Clearview. This project aims to rationalize,
unify, and enhance the way network interfaces are handled in Solaris.
Currently there are four main components to Clearview, namely:

IPMP Rearchitecture
IP Tunnel Device
Vanity Naming and Nemo Unification
IP-Level Observability Devices

The design document for IP-Level Observability Devices is now available
for download at:

http://www.opensolaris.org/os/community/networking/ipobs-design.pdf

The document covers a new set of DLPI devices which will allow access to
packets at the IP layer. This includes packets local to the system, as
well as inter-zone and intra-zone traffic. You will now be able to run
snoop and see the traffic between your zones!

Any feedback is very welcome.

The timer for comments is set at two weeks (8th November).

Thanks for any comments, we're looking forward to working with this
community.

Phil

_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 27, 2005 3:35 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> The document covers a new set of DLPI devices which will allow access to
> packets at the IP layer. This includes packets local to the system, as
> well as inter-zone and intra-zone traffic. You will now be able to run
> snoop and see the traffic between your zones!


One quick question on the design of the /dev/ipnet name space.
Is there any issue if the zone is also in the name space? This
means that in the global zone, /dev/ipnet/ can be
something like

/dev/ipnet/0/lo0
/dev/ipnet/0/eri0
/dev/ipnet/1/lo0
/dev/ipnet/1/eri0

or

/dev/ipnet/lo0-0
/dev/ipnet/eri0-0
/dev/ipnet/lo0-1
/dev/ipnet/eri0-1

And in zone 1, the /dev/ipnet/ can be something like

/dev/ipnet/lo0
/dev/ipnet/eri0

It seems that ancillary data can be avoided using this scheme.
And third-party network sniffers, such as ethereal, can be easily
adapted to use it (without change?). One obvious issue is that
the name is not consistent in the global/non-global zone. But
I guess it is not a big issue. Is there some other issue that
led to this scheme not being chosen?
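
To make the two naming proposals concrete, here is a minimal sketch in
Python. The function name ipnet_paths and its parameters are made up for
illustration; nothing here comes from the design document, and the paths
are only the ones suggested in this mail.

```python
def ipnet_paths(zone_id, interfaces, viewer_is_global, scheme="subdir"):
    """Return the /dev/ipnet names one zone's interfaces would get.

    Hypothetical helper: it only illustrates the two proposed schemes.
    """
    if not viewer_is_global:
        # Inside a non-global zone the zone id is implicit in both schemes.
        return [f"/dev/ipnet/{ifname}" for ifname in interfaces]
    if scheme == "subdir":
        # First proposal: one directory per zone id.
        return [f"/dev/ipnet/{zone_id}/{ifname}" for ifname in interfaces]
    if scheme == "suffix":
        # Second proposal: zone id appended to the interface name.
        return [f"/dev/ipnet/{ifname}-{zone_id}" for ifname in interfaces]
    raise ValueError(f"unknown scheme: {scheme!r}")


# Names for zone 1 as seen from the global zone, then from inside zone 1.
print(ipnet_paths(1, ["lo0", "eri0"], viewer_is_global=True))
print(ipnet_paths(1, ["lo0", "eri0"], viewer_is_global=False))
```

The sketch also shows the inconsistency mentioned above: the same
interface gets a different name depending on where it is viewed from.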


--

K. Poon.
kacheong dot poon at sun dot com




philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 27, 2005 8:30 AM   in response to: kcpoon

Hi Kacheong,

Thanks for the feedback.

One problem I still see with tying the zoneid to the namespace is that
there is no way from a previous capture file of knowing what zoneid the
127.0.0.1 traffic was associated with. In the case where a customer is
sending in snoop files this could cause problems for the engineer in
support.

>One obvious issue is that
>the name is not consistent in the global/non-global zone. But
>I guess it is not a big issue.

Although not a big issue I think consistency for an administrator is
important. I also think having multiple devices for each zone would
be potentially confusing for an administrator. What traffic would an
administrator expect to see if they snoop on the 0/lo0 device? All
localhost traffic or just the global zone's? Given there are separate
lo0 devices for the other zones perhaps the latter. Having just a single
device removes this possible confusion.


Thanks again

Phil





kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 2:29 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> One problem I still see with tying the zoneid to the namespace is that
> there is no way from a previous capture file of knowing what zoneid the
> 127.0.0.1 traffic was associated with. In the case where a customer is
> sending in snoop files this could cause problems for the engineer in
> support.


We probably can ask the customer to use a meaningful file name.


> Although not a big issue I think consistency for an administrator is
> important. I also think having multiple devices for each zone would also
> be potentially confusing for an administrator. What traffic would an
> administrator expect to see if they snoop on the 0/lo0 device? All
> localhost traffic or just the global zone's? Given there are separate
> lo0 devices for the other zones perhaps the latter. Having just a single
> device removes this possible confusion.


If we add zone as a differentiator, it seems logical that
snooping on 0/lo0 should only report traffic for global zone,
nothing for other zones.

The reason I ask the question is that I think it is better
if the new architecture does not require snoop or other
network traffic analysis tools to be changed. Given the info
in the design doc, I understand that a new MAC type is introduced.
So I assume that since the snoop format is not changed, the
dl_ipnetinfo info may be the "header" of the packet for
MAC type DL_IPNET in a trace file. This is just a guess but
the important point is that all those tools need to be changed.

Has the design team considered other approaches (I read some
in the appendix) which do not require a new MAC type? And is
the DL_IPNET MAC type introduced just for differentiating
loopback traffic in different zones? Even on a machine without
any zones configured, a trace file will still have the dl_ipnetinfo
in it with the new architecture. I guess the design doc should
elaborate on why introducing a new MAC type for
snoop is better than other ways which do not require existing
traffic analysis tools to be changed.

To me, having an extra level in the name space is not an
important issue. But I'd love to have tools, like tcptrace,
simply work with files captured by the new architecture. This
is just my opinion though :-)



--

K. Poon.
kacheong dot poon at sun dot com




dme

Posts: 62
From:

Registered: 6/10/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 27, 2005 7:05 AM   in response to: philk

Phil, good work!

Some comments from a read through.

* There's no discussion of what happens should some form of protocol
offload be in use.

For example, if checksum offload is in use with interface foo0, will
outgoing packets observed through the /dev/ipnet/foo0 have a correct
checksum?

* In various places the document refers to "local zones", though I
understood we were to avoid using that term, preferring "non-global
zones". Personally I don't care, though consistency is good.

* The term "IP traffic" is used in lots of places and I think that
this means "IP_DL_SAP and IP6_DL_SAP traffic", which explicitly
excludes ARP, etc. Is that correct?

It would be good to be specific about what things an administrator
might expect to see that won't be there.

* In section 2, requirement 3 doesn't call out the fact that traffic
local to a non-global zone should be visible.

* Section 3.1 describes the rules which govern whether a packet will
be passed to a consumer. The second part of the rule begins:

"DL_PROMISC_PHYS is enabled and the interface...."

Could you say what type of packets would fall into this case?

* Why is the default behaviour of the observability device to _not_
pass up ancillary data? (Particularly given that snoop will always
ask for it.)

* The version management of DL_IOC_IPNETINFO seems odd. If an
application exists that can understand both versions 1 and 2 of the
ancillary data, how does the application indicate this to the
observability device?

If ancillary data were always present the kernel could pass upward
data in the "current" version. If the application doesn't
understand this, it should fail. This approach would mean that a
single application might function over several versions of the
ancillary data.

* Will the project team provide changes to the maintainers of
"ethereal" to support the new observability device?

* Given that the ancillary data forms part of an on-disk file format,
specifying the alignment and sizes of elements of dl_ipnetinfo_t
seems necessary.

* Section 4.2: the zone of an interface can change in much the same
way as the address, meaning that it will be necessary to track such
changes.

* Section 5: are the ipolink_t and ipoaddr_t data structures
necessary? It seems that they shadow the existing ill and ipif
structures - perhaps ill and ipif could be extended to provide the
relevant functionality? This would avoid the need to keep the two
sets of data coordinated.
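
On the on-disk format point above, a minimal Python sketch of what
"specifying the alignment and sizes" could look like. The field set
(version, pad, header length, zone id) is purely hypothetical and is NOT
the real dl_ipnetinfo_t; the only thing being shown is a layout with
fixed widths, an explicit byte order, and explicit padding.

```python
import struct

# Hypothetical fields, NOT the real dl_ipnetinfo_t.
# ">" selects big-endian with no implicit padding, so the encoded size is
# the same on every platform: 1 + 1 + 2 + 4 = 8 bytes.
HEADER = struct.Struct(">BBHI")  # version, explicit pad, header length, zone id


def pack_header(version, hdr_len, zone_id):
    """Encode the hypothetical header; the pad byte is always written as 0."""
    return HEADER.pack(version, 0, hdr_len, zone_id)


def unpack_header(buf):
    """Decode the hypothetical header from the start of buf."""
    version, _pad, hdr_len, zone_id = HEADER.unpack_from(buf, 0)
    return version, hdr_len, zone_id


print(HEADER.size, pack_header(1, HEADER.size, 5).hex())
```

Without the explicit pad byte the packed form would be 7 bytes, while a
C compiler would typically insert padding before the 16-bit field. A
file format should not depend on that kind of implicit layout.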

dme.
--
David Edmondson, Solaris Engineering, Sun Microsystems.



barts

Posts: 1,172
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 27, 2005 8:30 AM   in response to: philk

What are the tradeoffs in supporting snooping on logical devices, e.g.:

snoop -I /dev/bge0:1

You wrote:

> Opening these devices will provide access to all IP packets with
> addresses associated with the interface. This includes both IPv4
> and IPv6 traffic, and addresses hosted on logical interfaces. For
> this reason, there is no /dev/ipnet/eri0:1; instead opening
> /dev/ipnet/eri0 will provide all traffic that is destined for, or
> originating from, any address hosted on eri0.

Clearly, the ability to filter is already there as zones provide access to only their
packets; would the difficulties in extending this to all logical addresses (in global
and local zones) outweigh the utility?

For example, is it useful to be able to snoop on a single zone's traffic on a shared
physical interface in the global zone w/o invoking application-specific filtering
mechanisms?

It would seem to make debugging IPFilter configurations considerably
easier, as well.

- Bart

philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 11:44 AM   in response to: barts

Hi Bart,

The choice of naming scheme for /dev/ipnet was discussed for some time
and appendix B tries to capture the choices considered and the reasons
why they were rejected.

Regarding the use of logical interfaces there were several reasons not
to use this scheme:

1) How do you handle the 0'th interface? An administrator would expect
"snoop -d hme0" to work as before so does that mean you'd have to "snoop
-d hme0:0" to just see the packets sent to the address hosted on the
first interface?

2) Treating logical interfaces as real devices adds to the confusion
that already exists around them. Logical interfaces are not real
devices, they provide address aliasing. Given we are trying to phase out
the use of logical interfaces precisely because people
think that packets flow over them, making snoop suddenly start working
on them would make this considerably harder.

3) The fact logical interfaces don't have unique names is an
implementation flaw which we don't want to expose when implementing the
new devices. An example of this is hme0:1 exists for both IPv4 and IPv6.

4) Given the way logical interfaces are implemented there is no
guarantee that zones will boot in the same order twice, nor that IPv6
stateless address auto configuration will assign prefixes in the same
order twice. Snooping on hme0:1 could therefore give you different
results from one attempt to the other, making it difficult to script.

5) Using logical interfaces would not provide visibility into received
broadcast, multicast packets or packets forwarded by ip. This is because
such packets are not associated with any logical interface.

6) If a zone has more than one address assigned to it, then it would not
be possible to view all the packets associated with it without running
multiple snoops.

Given these issues the use of logical interfaces doesn't really work.

Phil



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 30, 2005 9:43 PM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> 6) If a zone has more than one address assigned to it, then it would not
> possible to view all the packets associated with it without running
> multiple snoops.


I assume you are referring to the snoop limitation that it can
only snoop on one "interface" at a time. I'd suggest we also
implement this snooping on multiple "interfaces" feature as part
of the new snooping architecture.


--

K. Poon.
kacheong dot poon at sun dot com




meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 30, 2005 10:02 PM   in response to: kcpoon


> I assume you are referring to the snoop limitation that it can
> only snoop on one "interface" at a time. I'd suggest we also
> implement this snooping on multiple "interfaces" feature as part
> of the new snooping architecture.

Rearchitecting snoop is completely outside of our project goals.

--
meem



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 30, 2005 10:15 PM   in response to: meem

Peter Memishian wrote:
> > I assume you are referring to the snoop limitation that it can
> > only snoop on one "interface" at a time. I'd suggest we also
> > implement this snooping on multiple "interfaces" feature as part
> > of the new snooping architecture.
>
> Rearchitecting snoop is completely outside of our project goals.


Why not :-) Actually, if the new architecture allows collecting
data from multiple sources in the kernel and sending the collected
data up to snoop via a single stream, I guess the current snoop
program does not need a rearchitecture. It just needs to send
down some new commands to tell the kernel "collector" to do
the right thing. No?

Just a suggestion though, I've not thought about the details ;-)



--

K. Poon.
kacheong dot poon at sun dot com




meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 30, 2005 10:28 PM   in response to: kcpoon


> Why not :-)

As per our charter:

Clearview is a project to rationalize, unify, and enhance the way
network interfaces are handled in Solaris at the programmatic and
administrative levels.

> Actually, if the new architecture allows collecting data from multiple
> sources in the kernel and sending the collected data up to snoop via a
> single stream.

You would still need a way to tag the data to figure out what interface it
was actually associated with, which means introducing a generic form of
ancillary data -- which means a full revision of the snoop file format.
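
The tagging point can be sketched in a few lines of Python. This is an
illustration only: the record shape (timestamp, payload) and the name
merge_streams are assumptions, not anything snoop or the kernel does
today.

```python
import heapq


def merge_streams(streams):
    """Merge per-interface capture lists into one time-ordered stream.

    streams maps an interface name to a list of (timestamp, payload)
    tuples, each list already sorted by timestamp. Every merged record
    carries the interface name, because once the lists are interleaved
    there is no other way to tell where a packet came from.
    """
    tagged = (
        [(ts, ifname, payload) for ts, payload in records]
        for ifname, records in streams.items()
    )
    return list(heapq.merge(*tagged))


captures = {
    "eri0": [(1.0, b"a"), (3.0, b"c")],
    "lo0": [(2.0, b"b")],
}
for record in merge_streams(captures):
    print(record)
```

The interface name in each merged record is exactly the kind of
per-packet tag that would have to be stored in the capture file.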

If you want to have a conversation about rearchitecting snoop, please
start it in a new thread. This is not that project.

--
meem



Frank DiMambro
frd@main-man.com
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 31, 2005 9:08 AM   in response to: meem

Hi Kacheong
You can use snoop on different instances of an interface in different
windows at the same time, and you can also open multiple snoop sessions
on the same interface. So there's no limitation on how many snoop
instances you can run at the same time. They just have to be in different
windows.
Another thing you can do is create snoop sessions which write
their output to files, and make them background processes. Then later
review the files.

snoop -d xxx0 -o xxx0_snoopfile &

Most of the time, when using snoop, you generate a file for
review later, since in a scrolling window the output scrolls by too fast
anyway...

Frank

Peter Memishian wrote:

> > I assume you are referring to the snoop limitation that it can
> > only snoop on one "interface" at a time. I'd suggest we also
> > implement this snooping on multiple "interfaces" feature as part
> > of the new snooping architecture.
>
>Rearchitecting snoop is completely outside of our project goals.




carlsonj

Posts: 6,813
From: US

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 1, 2005 8:24 AM   in response to: Frank DiMambro

Frank DiMambro writes:
> You can use snoop on different instances of an interface in different
> windows at the same time, you can also open multiple snoop sessions
> on the same interface. So there's no limitation on how many snoop
> instances you can run at the same time. Just has to be in different
> windows.

"windows?"

--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



Frank DiMambro
frd@main-man.com
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 1, 2005 9:00 AM   in response to: carlsonj

Hi Jim
Yes, you can log onto Sun machines from a PC, and run snoop,
even with ethereal readily available, I still like to use good old
snoop... Still I meant to say 'xterm'.

Frank

James Carlson wrote:

> Frank DiMambro writes:
> > You can use snoop on different instances of an interface in different
> > windows at the same time, you can also open multiple snoop sessions
> > on the same interface. So there's no limitation on how many snoop
> > instances you can run at the same time. Just has to be in different
> > windows.
>
> "windows?"




carlsonj

Posts: 6,813
From: US

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 1, 2005 9:05 AM   in response to: Frank DiMambro

Frank DiMambro writes:
> Yes, you can log onto Sun machines from a PC, and run snoop,
> even with ethereal readily available, I still like to use good old
> snoop... Still I meant to say 'xterm'.

Even changing the word to "xterm" does no good for me. I see no
necessary relationship between snoop and windows on the screen.

--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



Francesco R. Di...
frd@main-man.com
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 1, 2005 10:29 AM   in response to: carlsonj

Hi Jim
Okay...then you've lost me...

Frank

James Carlson wrote:

> Frank DiMambro writes:
> > Yes, you can log onto Sun machines from a PC, and run snoop,
> > even with ethereal readily available, I still like to use good old
> > snoop... Still I meant to say 'xterm'.
>
> Even changing the word to "xterm" does no good for me. I see no
> necessary relationship between snoop and windows on the screen.




philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 1:56 AM   in response to: philk

Hi Dave,

Thanks for the feedback!

>* There's no discussion of what happens should some form of protocol
> offload be in use.

This also applies to loopback packets which aren't checksummed (bugid
6236645). I'll add something to describe this.

>* In various places the document refers to "local zones", though I
> understood we were to avoid using that term, preferring "non-global
> zones". Personally I don't care, though consistency is good.

Thanks, I didn't know this.

>* The term "IP traffic" is used in lots of places and I think that
> this means "IP_DL_SAP and IP6_DL_SAP traffic", which explicitly
> excludes ARP, etc. Is that correct?

Yes. I'll make it more specific.

>* In section 2, requirement 3 doesn't call out that fact that traffic
> local to a non-global zone should be visible.

I thought that was what the text described but maybe it's not clear?

>* Section 3.1 describes the rules which govern whether a packet will
> be passed to a consumer. The second part of the rule begins:
>
> "DL_PROMISC_PHYS is enabled and the interface...."
>
> Could you say what type of packets would fall into this case?

Forwarded packets.

>* Why is the default behaviour of the observability device to _not_
> pass up ancillary data? (Particularly given that snoop will always
> ask for it.)

Given the device provides access to IP packets we wanted to avoid making
it look like the ancillary data was some sort of pseudo link layer header.

>* The version management of DL_IOC_IPNETINFO seems odd. If an
> application exists that can understand both versions 1 and 2 of the
> ancillary data, how does the application indicate this to the
> observability device?
>
> If ancillary data were always present the kernel could pass upward
> data in the "current" version. If the application doesn't
> understand this, it should fail. This approach would mean that a
> single application might function over several versions of the
> ancillary data.

In the version scheme version 2 would be an extension of 1 so you
wouldn't need to ask for 1 and 2.

>* Will the project team provide changes to the maintainers of
> "ethereal" to support the new observability device?

Yes that's the intention.

>* Given that the ancillary data forms part of an on-disk file format,
> specifying the alignment and sizes of elements of dl_ipnetinfo_t
> seems necessary.

Ok.

>* Section 4.2: the zone of an interface can change in much the same
> way as the address, meaning that it will be necessary to track such
> changes.

Ok.

>* Section 5: are the ipolink_t and ipoaddr_t data structures
> necessary? It seems that they shadow the existing ill and ipif
> structures - perhaps ill and ipif could be extended to provide the
> relevant functionality? This would avoid the need to keep the two
> sets of data coordinated.

Um that's an interesting idea. I'll look into it.

Thanks again for the comments.

Phil



dme

Posts: 62
From:

Registered: 6/10/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 2:05 AM   in response to: philk

* phil dot kirk at sun dot com [20051028T095615]:
> >* In various places the document refers to "local zones", though I
> > understood we were to avoid using that term, preferring "non-global
> > zones". Personally I don't care, though consistency is good.
>
> Thanks, I didn't know this.

It's worth checking with a Zones guru (which I'm not).

> >* The version management of DL_IOC_IPNETINFO seems odd. If an
> > application exists that can understand both versions 1 and 2 of the
> > ancillary data, how does the application indicate this to the
> > observability device?
> >
> > If ancillary data were always present the kernel could pass upward
> > data in the "current" version. If the application doesn't
> > understand this, it should fail. This approach would mean that a
> > single application might function over several versions of the
> > ancillary data.
>
> In the version scheme version 2 would be an extension of 1 so you
> wouldn't need to ask for 1 and 2.

I kind-of hate this.

It means that the kernel has to be able to generate the ancillary data
in multiple formats depending on what the consumer asks for. By the
time we get to version 5 this will be a real pain, and quite possibly
a performance issue.

Having user-level applications that understand multiple versions seems
much easier and more flexible than forcing that responsibility onto
the kernel. User-level applications are also easier to update and
transport between OS versions (e.g. I build my own version of ethereal
with version 1-5 support and can then use it on s10uX, s10uY, s11,
s12, s13, ...).

dme.
--
David Edmondson, Solaris Engineering, Sun Microsystems.



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 3:21 AM   in response to: dme

David dot Edmondson at sun dot com wrote:

> Having user-level applications that understand multiple versions seems
> much easier and more flexible than forcing that responsibility onto
> the kernel. User-level applications are also easier to update and
> transport between OS versions (e.g. I build my own version of ethereal
> with version 1-5 support and can then use it on s10uX, s10uY, s11,
> s12, s13, ...).


If we really have to introduce another MAC type, I strongly
recommend that we also introduce a library to parse those
captured files so that all those traffic analysis tools can
use the library and do not need to change every time we change
the dl_ipnetinfo structure.

Or maybe we should not introduce another MAC type in the first
place :-)


--

K. Poon.
kacheong dot poon at sun dot com




meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 25, 2005 1:53 AM   in response to: dme


[ i didn't see this answered. ]

> > >* The version management of DL_IOC_IPNETINFO seems odd. If an
> > > application exists that can understand both versions 1 and 2 of the
> > > ancillary data, how does the application indicate this to the
> > > observability device?
> > >
> > > If ancillary data were always present the kernel could pass upward
> > > data in the "current" version. If the application doesn't
> > > understand this, it should fail. This approach would mean that a
> > > single application might function over several versions of the
> > > ancillary data.
> >
> > In the version scheme version 2 would be an extension of 1 so you
> > wouldn't need to ask for 1 and 2.
>
> I kind-of hate this.
>
> It means that the kernel has to be able to generate the ancillary data
> in multiple formats depending on what the consumer asks for. By the
> time we get to version 5 this will be a real pain, and quite possibly
> a performance issue.

The primary intent was not to have the kernel simultaneously support
multiple versions, but rather to:

1. Allow "old" applications to work with a "new" kernel. This
is done via the length field, included in each ancillary data
header: applications can parse the fields they grok and skip
over the rest. (Recall that a new version just means more
ancillary fields; the existing fields are not incompatibly
changed.)

2. Allow "new" applications to work with an "old" kernel. This
is done via the version number reported by the kernel. An
application can either fail if it does not recognize the
version, or (ideally) only parse the ancillary data fields
that correspond to that version.

I agree that the DL_IOC_IPNETINFO negotiation suggests (and facilitates)
the kernel supporting multiple versions, but (as per above) that wasn't
the intent.
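
The two compatibility cases above can be sketched from the application
side in Python. The layout is an assumption made up for this example
(a version byte, a pad byte, a 16-bit header length, then one
hypothetical field per version); it is not the real dl_ipnetinfo_t. The
point is only that the length field lets a parser skip what it does not
know, and the version number tells it which fields are present.

```python
import struct

# Hypothetical layout, for illustration only.
BASE = struct.Struct(">BBH")  # version, pad, total header length in bytes
V1 = struct.Struct(">I")      # hypothetical version-1 field: zone id
V2 = struct.Struct(">I")      # hypothetical field added in version 2: flags


def parse_record(buf):
    """Parse one record as an application that understands versions 1-2."""
    version, _pad, hdr_len = BASE.unpack_from(buf, 0)
    if hdr_len < BASE.size + V1.size or hdr_len > len(buf):
        raise ValueError("bad ancillary header length")
    fields = {"version": version}
    (fields["zone_id"],) = V1.unpack_from(buf, BASE.size)
    # "New" application, "old" kernel: only read version-2 fields when the
    # reported version says they are there.
    if version >= 2 and hdr_len >= BASE.size + V1.size + V2.size:
        (fields["flags"],) = V2.unpack_from(buf, BASE.size + V1.size)
    # "Old" application, "new" kernel: anything between the known fields
    # and hdr_len is skipped, and the IP packet starts at hdr_len.
    return fields, buf[hdr_len:]


old_kernel = BASE.pack(1, 0, 8) + V1.pack(3) + b"X"
print(parse_record(old_kernel))
```

A version-3 record with extra trailing fields would go through the same
function unchanged, with the unknown bytes skipped via the length.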

Maybe you'd prefer it if DL_IOC_IPNETINFO instead took a boolean parameter
(0=disable, 1=enable), and, when enabled, returned a non-zero value
indicating the kernel's current DL_IOC_IPNETINFO version? Alternatively,
DL_IOC_IPNETINFO could just return zero when enabled, and the application
could rely on dli_version in each ancillary data header -- but I fear that
would complicate application error reporting (since it would have to wait
to receive a packet before it could determine the kernel's version).
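
A rough Python model of the boolean variant, to show why getting the
version back at enable time helps error reporting. FakeIpnetDevice and
enable_ancillary are stand-ins invented for this sketch, and returning 0
on disable is an assumption; none of this is a real Solaris interface.

```python
class FakeIpnetDevice:
    """Stand-in for the kernel side of an observability device."""

    def __init__(self, kernel_version):
        self.kernel_version = kernel_version
        self.enabled = False

    def ioc_ipnetinfo(self, enable):
        # Models the proposal: 1 enables and returns the kernel's current
        # (non-zero) version. Returning 0 on disable is an assumption.
        self.enabled = bool(enable)
        return self.kernel_version if self.enabled else 0


def enable_ancillary(dev, max_understood):
    """Enable ancillary data and pick the version whose fields to parse."""
    kernel_version = dev.ioc_ipnetinfo(1)
    if kernel_version == 0:
        # The failure is known immediately, before any packet arrives.
        raise RuntimeError("ancillary data could not be enabled")
    # Newer kernel: parse what we understand and skip the rest by length.
    # Older kernel: parse only the fields that version provides.
    return min(kernel_version, max_understood)


print(enable_ancillary(FakeIpnetDevice(kernel_version=1), max_understood=2))
```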

--
meem



dme

Posts: 62
From:

Registered: 6/10/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Dec 6, 2005 11:53 PM   in response to: meem

* peter dot memishian at sun dot com [20051125T095726]:
> Maybe you'd prefer it if DL_IOC_IPNETINFO instead took a boolean
> parameter (0=disable, 1=enable), and, when enabled, returned a
> non-zero value indicating the kernel's current DL_IOC_IPNETINFO
> version?

This looks nicest to me.

dme.
--
David Edmondson, Solaris Engineering, Sun Microsystems.



meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Dec 7, 2005 12:01 AM   in response to: dme


> > Maybe you'd prefer it if DL_IOC_IPNETINFO instead took a boolean
> > parameter (0=disable, 1=enable), and, when enabled, returned a
> > non-zero value indicating the kernel's current DL_IOC_IPNETINFO
> > version?
>
> This looks nicest to me.

Cool; we can do that.

--
meem



pdurrant

Posts: 429
From: GB

Registered: 6/15/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 4:02 AM   in response to: philk

Hi Phil,

I'm uneasy about the DL_IPNET mac type. Is it *really* necessary
or is it just convenience? Section 3.3 does not explore what is wrong
with re-using mac types from the tunnel provider.

Paul

PS: The reference URLs on SWAN aint entirely useful to me ;-)



philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 4:15 AM   in response to: pdurrant

>Section 3.3 does not explore what is wrong
>with re-using mac types from the tunnel provider.

Using the mac types from the tunnel provider was discussed and at one
point this was the plan. However the two devices are very different and
so we decided against it. The "Clearview IP Tunneling Device Driver":

http://www.opensolaris.org/jive/thread.jspa?threadID=2382&tstart=0

discusses some of this. I'll add a more complete discussion to the document.

>PS: The reference URLs on SWAN aint entirely useful to me ;-)

:) Yes we're aware of the problem but there isn't a solution right now.

Thanks

Phil



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 10:09 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> Using the mac types from the tunnel provider was discussed and at one
> point this was the plan. However the two devices are very different and
> so we decided against it. The "Clearview IP Tunneling Device Driver":
>
> http://www.opensolaris.org/jive/thread.jspa?threadID=2382&tstart=0
>
> discusses some of this. I'll add a more complete discussion to the
> document.


I think we can separate the two discussions. We are discussing
an Internet packet observability device. The important point here
is observability. For the loopback case, the underlying "virtual
NIC" is irrelevant. We can really "make" it anything and I don't
think any information is lost. For packets going out/coming in
via a real NIC, the type of the device matters as the link layer
header can provide some information, say for ARP packets. And I
guess in the proposed DL_IPNET type captured traffic, the real link
header is still captured and somehow the "packet format" will
indicate what the underlying device type is. I don't really see
a problem for those observability devices to use the real underlying
device type. And for loopback, just make it the easy one, Ethernet.
In fact, I think it is good.

The bottom line is: all the existing network traffic analysis tools
will simply work :-)
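
A rough sketch of what a synthetic Ethernet header for loopback traffic could look like. The all-zero MAC addresses and both function names are assumptions made for illustration; the thread does not specify them. The point is only that a tool which already strips a 14-byte Ethernet header would find the IP packet unchanged.

```python
import struct

ETHERTYPE_IP = 0x0800      # standard EtherType for IPv4
ZERO_MAC = b"\x00" * 6     # assumed placeholder address for loopback


def fake_ether_frame(ip_packet, src=ZERO_MAC, dst=ZERO_MAC):
    """Prepend a synthetic 14-byte Ethernet header to a raw IPv4 packet."""
    header = struct.pack("!6s6sH", dst, src, ETHERTYPE_IP)
    return header + ip_packet


def strip_ether(frame):
    """What an existing analyzer does: check the EtherType, return the payload."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_IP:
        raise ValueError("not an IPv4 frame")
    return frame[14:]
```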



--

K. Poon.
kacheong dot poon at sun dot com




philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 31, 2005 9:34 AM   in response to: kcpoon

Hi Kacheong,

I think Paul was questioning why we were not re-using the tunnel provider
mac types which is a fair question as architecturally they do appear to
overlap. However, hopefully the thread I pointed to explains why using
the same mac types doesn't make sense.

> The bottom line is: all the existing network traffic analysis tools
> will simply work :-)

This seems to be the argument against introducing a new mac type, namely
that existing applications will need to be modified. However I believe
using DL_ETHER or another existing mac type isn't really an option as
we would end up faking link layer headers that never existed. This in my
opinion would be very confusing for consumers of the devices. Also adding
support for the new type to existing applications is not difficult.

Phil





kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 1, 2005 2:11 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> This seems to be the argument against introducing a new mac type, namely
> that existing applications will need to be modified. However I believe
> using DL_ETHER or another existing mac type isn't really an option as
> we would end up faking link layer headers that never existed. This in my
> opinion would be very confusing for consumers of the devices. Also adding
> support for the new type to existing applications is not difficult.


I suspect that we are touching an interesting question, "are we really
talking about any real link layer?" Maybe we should step back and take
a look at the bigger picture of the new architecture.

What can be captured by the new architecture? It is clearly stated
that the packets captured at IP are either those ready to be sent up
to upper layer protocols, or those ready to be sent down to the device
driver, or packets forwarded (it is not clear at which point of the
forwarding process a packet is captured). So packets captured by
the new architecture and real packets sent/received via a physical
medium can be very different. Saying that those ipnet devices support
DL_PROMISC_PHYS and DL_PROMISC_MULTI does not seem correct. It is also
not clear if ARP/RARP packets are captured (probably not). And if
there is no upper layer protocol user to receive an ICMP packet, will
it be captured? The interesting part is that while the packets
captured by the new architecture on the receiving side are always a
subset of the packets received by the device, on the sending side they
can be a superset if the device driver decides to drop some packets
from IP.

Will the packets captured by the new architecture be the same (the
IP part of the packet) as on the physical medium? Yes and no. It is
probably a yes on the sending side. But it can be a no on the
receiving side if tunnel or IPsec (assuming IPsec is not considered
an upper layer protocol in the design doc) is applied to the packet.
This means that in a trace of an IPsec protected TCP connection, one
can see the incoming packets in clear but outgoing packets are
encrypted. I think the design doc should list clearly where the hooks
for capturing packets are in the packet processing path. And I
suggest that the design team reconsider the placement of those hooks.

Does it then make sense to tie those devices to real physical
interfaces? I think it can be confusing given the above. Maybe the
design team has thought about all those issues, but it is not clear from
the design doc why they are not really confusing issues for users.
And what other options of allowing snoop inside a non-global zone have
been considered? I think both Bart and Peter Tribble have also hinted
that it makes sense to use zone as a differentiator in exporting those
devices, what is the team's response in this?

BTW, while adding support for a new MAC type is not difficult, the
coordination part is hard. Do we actually know how many tools are
out there which are used by customers to analyze snoop trace? So
far, it seems that only ethereal support has been considered by the
team. This is definitely not enough. And given that dl_ipnetinfo
has a version and can be changed, this is not a good choice to go
forward. And since only network trace analysis tools will be
consumers of those devices, why will they be confused by a "fake"
ethernet header for loopback traffic?


P.S. I also suggest that the team include how the architecture works
with IPMP and tunnel in this document, instead of asking the readers
to check the other documents. In fact, it seems that the other
documents actually depend on this snoop architecture. It is good
to have all the details in this document.



--

K. Poon.
kacheong dot poon at sun dot com




pdurrant

Posts: 429
From: GB

Registered: 6/15/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 1, 2005 4:01 AM   in response to: philk

On 10/31/05, Philip Kirk - Solaris Sustaining <phil dot kirk at sun dot com> wrote:
>
> This seems to be the argument against introducing a new mac type, namely
> that existing applications will need to be modified. However I believe
> using DL_ETHER or another existing mac type isn't really an option as
> we would end up faking link layer headers that never existed. This in my
> opinion would be very confusing for consumers of the devices. Also adding
> support for the new type to existing applications is not difficult.
>

Perhaps having a single pseudo mac type is the wrong approach?

IP packets that come in off the wire will have a real MAC header
prepended so snooping a logical IP address amounts to snooping the
underlying link layer and filtering for the correct address.
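
As a minimal sketch of the "snoop the link layer and filter" idea, assuming Ethernet framing and IPv4 only (the function name and the framing assumption are mine, not from the design document):

```python
import socket
import struct

ETHER_HDR_LEN = 14         # dst MAC + src MAC + EtherType
ETHERTYPE_IP = 0x0800      # standard EtherType for IPv4


def matches_address(frame, address):
    """True if an Ethernet frame carries IPv4 to or from `address`."""
    if len(frame) < ETHER_HDR_LEN + 20:
        return False                      # too short for a minimal IPv4 header
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IP:
        return False                      # e.g. ARP: not an IP packet
    ip = frame[ETHER_HDR_LEN:]
    src, dst = ip[12:16], ip[16:20]       # fixed offsets in the IPv4 header
    wanted = socket.inet_aton(address)
    return wanted in (src, dst)
```

Note that this filter is applied by the observer after the fact, which is the duplication of kernel-side filtering that other posters in this thread object to.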

For loopback traffic perhaps you could invent an inter-zone network
MAC type and then just provide a link layer device to export such
packets?

Paul



ptribble

Posts: 1,575
From: GB

Registered: 4/27/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Oct 28, 2005 4:40 AM   in response to: philk

On Tue, 2005-10-25 at 23:21, Philip Kirk - Solaris Sustaining wrote:
...
> The design document for IP-Level Observability Devices is now available
> for download at:
...
> Any feedback is very welcome.


Not necessarily in order, but looking at B.2 to start with because
that's where the first question I had is asked, namely: why not expose
the logical interfaces?


I find the scheme based on logical interface name to be the simplest
and obvious. It provides the closest match to the network
configuration. I do not find such a scheme confusing.

I do not believe that the possibility of non-deterministic allocation
of logical interfaces is a valid criticism. If I want to monitor the
network traffic corresponding to a particular zone, for example, I
expect to have to work out what logical interface is used.

More generally, as an administrator, what actions would I want to do?
What I really want is to run something like:

snoop bge1:3

from the global zone, and see all the traffic that the zone using that
logical interface would see. (In fact, in the case of zones, the output
should be identical to what someone running snoop in the zone itself
would see.)

What I don't want to do is to run snoop on the physical interface and
apply filters to restrict the traffic I see. The system already filters
the traffic, so there's no need for me to duplicate it (and possibly
arrive at a different filter).


What does /dev/ipnet look like in a zone? Does it show all interfaces,
or just the subset of the physical interfaces that the zone is
configured to use?


The document only covers the device design. What kstats will be made
available? (I would like to see decent kstats for both logical
and loopback interfaces, so I can monitor both the packet and data
rates on those interfaces. The loopback already has ipackets/opackets
[but not the 64-bit version], but not obytes64 and rbytes64; currently
logical interfaces have no kstats at all as far as I recall.)
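
To make the monitoring use case concrete, here is a small sketch of turning two counter snapshots into rates. The dictionaries stand in for kstat samples; how they would be collected for logical or loopback interfaces is exactly what is missing today, so the sampling side is left out and the function name is hypothetical.

```python
def rates(before, after, interval_s):
    """Per-second rates from two counter snapshots taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    # Only counters present in both snapshots can be differenced.
    return {name: (after[name] - before[name]) / interval_s
            for name in before if name in after}
```

With 64-bit counters such as rbytes64 and obytes64, wraparound between samples is not a practical concern, which is one reason to ask for them.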

Thanks,

--
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/





philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 4, 2005 12:51 AM   in response to: ptribble

Hi Peter,

The reply I sent to Bart Smaalders tries to explain why we rejected
using logical interfaces.

> What does /dev/ipnet look like in a zone? Does it show all
> interfaces, or just the subset of the physical interfaces that the
> zone is configured to use?

Just the subset.

> The document only covers the device design. What kstats will be made
> available? (I would like to see decent kstats for both logical and
> loopback interfaces, so I can monitor both the packet and data rates
> on those interfaces. The loopback already has ipackets/opackets [but
> not the 64-bit version], but not obytes64 and rbytes64; currently
> logical interfaces have no kstats at all as far as I recall.)

These devices won't make any kstats available although I believe
there is a bug open to provide this. I'll try and find the bugid.

Thanks

Phil



nordmark

Posts: 619
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 2, 2005 6:59 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> The design document for IP-Level Observability Devices is now available
> for download at:
>
> http://www.opensolaris.org/os/community/networking/ipobs-design.pdf

Philip,

Question on section 4.4

It says that things are asymmetric between the transmit and receive side
in the sense that the transmit hook is at the bottom of IP (when IP is
about to send a packet to the driver), and the receive hook is at the
top of IP (when IP is about to send a packet to a ULP).

What's the motivation for this design?

Wouldn't it look odd if e.g. IPsec is in use, and I do
snoop -I bge0
I would see encrypted packets on transmit, but on receive the packets
would have already been decrypted.

Same concern with forwarding; snoop -I bge0
would show packets forwarded *to* bge0, but not received on bge0 and
forwarded to some other interface.

Both of these behaviors are quite different than what snoop -d bge0
provides.
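
The asymmetry can be shown with a toy model. This is only a sketch: the XOR "cipher" stands in for IPsec and is not real cryptography, and the function names are invented for illustration. The transmit tap sits below the IPsec step and the receive tap above it, as section 4.4 describes.

```python
def encrypt(payload):
    """Toy stand-in for IPsec: reversible, NOT real cryptography."""
    return bytes(b ^ 0x5A for b in payload)


decrypt = encrypt  # XOR with a constant is its own inverse


def ip_transmit(payload, tap):
    """Outbound: IPsec runs first, the tap sits at the bottom of IP."""
    wire = encrypt(payload)
    tap.append(("tx", wire))       # observer sees ciphertext
    return wire


def ip_receive(wire, tap):
    """Inbound: IPsec runs first, the tap sits at the top of IP."""
    payload = decrypt(wire)
    tap.append(("rx", payload))    # observer sees cleartext
    return payload
```

Sending a payload through ip_transmit and feeding the result to ip_receive leaves one encrypted and one cleartext record in the same trace.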

Erik




philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 4, 2005 2:43 AM   in response to: philk

Hi Kacheong,

Thanks for the comments, they raise some interesting issues.

> What can be captured by the new architecture? It is clearly stated
> that the packets captured at IP are either those ready to be sent up
> to upper layer protocols, or those ready to be sent down to the
> device driver, or packets forwarded (it is not clear at which point
> of the forwarding process a packet is captured). So packets captured
> by the new architecture and real packets sent/received via a
> physical medium can be very different.

I think this makes it clear we need to define more clearly where the hooks
will be placed and what this means for the traffic captured by these
devices. This is something we will work on.

> Saying that those ipnet devices support DL_PROMISC_PHYS and
> DL_PROMISC_MULTI does not seem correct.

I'm not sure why this seems incorrect. We support them to enable
tools like snoop. We've defined the behaviour to be similar to
physical devices for this reason.

> It is also not clear if ARP/RARP packets are captured (probably not).

Dave Edmondson made a similar point. No, ARP/RARP packets
aren't captured and I will make this clear. For inter-zone
communication ARP/RARP are not needed, so not providing
it doesn't affect the ability to debug these problems.

> And if there is no upper layer protocol user to receive an ICMP
> packet, will it be captured?

Clearly we should capture ICMP packets. I think the text describing
the way packets are passed needs some re-wording so it is more
specific.

> This means that in a trace of an IPsec protected TCP connection, one
> can see the incoming packets in clear but outgoing packets are
> encrypted. I think the design doc should list clearly where the
> hooks for capturing packets are in the packet processing path.

Erik Nordmark has made the same point and this indeed would lead
to confusion. Again I think this just highlights that the placement of
hooks needs more discussion and the document should be clearer.

> Does it then make sense to tie those devices to real physical
> interfaces? I think it can be confusing given the above.

Clearly we will need to work on describing the hook placement in more
detail and on ensuring that there isn't confusion. Getting the hook
placement right is important. This highlights some missing content
in the design but I'm not sure it's a strong argument against the
naming scheme used in /dev/ipnet, namely the tie to the
physical interface.

The main reason for introducing these devices is to give access to
loopback traffic. Extending what traffic these devices give access
to means we can also enable things like snooping on the ipmp
device. The extension though is really secondary to the main
purpose of the devices. It sounds like this should be clarified in
the document.

> And what other options of allowing snoop inside a non-global zone
> have been considered? I think both Bart and Peter Tribble have also
> hinted that it makes sense to use zone as a differentiator in
> exporting those devices, what is the team's response in this?

I'm not entirely sure what the question is here. Are you asking why
don't we use the logical device name i.e. bge0:1? This was what
Peter and Bart seemed to want. I have responded to this in my
reply to Bart.

> BTW, while adding support for a new MAC type is not difficult, the
> coordination part is hard. Do we actually know how many tools are
> out there which are used by customers to analyze snoop trace? So
> far, it seems that only ethereal support has been considered by the
> team.

We have considered other applications. The majority of such applications
use libpcap which we plan on updating. Yes there may be some other
applications that we'll also need to go and fix but I would hope that
by fixing libpcap and ethereal we would capture the majority. It looks
like this needs to be clarified in the document.

> And since only network trace analysis tools will be consumers of
> those devices, why will they be confused by a "fake" ethernet header
> for loopback traffic?

By consumer I meant administrators looking at the data
captured by the devices. This is where there could be
confusion.

> P.S. I also suggest that the team include how the architecture works
> with IPMP and tunnel in this document, instead of asking the readers
> to check the other documents.

The document already includes some detail on how it will allow
an administrator to view the IPMP group as a whole through the
ipmp device. It points the reader at the IPMP document for more
details. What additional details would you like to see here?

Thanks for your comments.

Phil



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 11, 2005 2:09 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

>> Saying that those ipnet devices support DL_PROMISC_PHYS and
>> DL_PROMISC_MULTI does not seem correct.
>
>
> I'm not sure why this seems incorrect. We support them to enable
> tools like snoop. We've defined the behaviour to be similar to
> physical devices for this reason.


They do not seem correct given the placement of hooks as explained
in the current document. Suppose DL_PROMISC_PHYS is set, and
the real NIC driver sends up all packets (it is not clear if
snoop -I bge0 actually has this effect from reading the document).
But since the incoming path hook is placed at where the packets are
already processed by IP, shouldn't IP just drop those packets which
are not destined for the host (let's assume we are not dealing with
forwarding here) and the hook will not receive anything? This means
that DL_PROMISC_PHYS has "no effect" at all with the current design?
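
The concern can be sketched with a toy model, assuming (as the current document seems to say) that the receive tap sits after IP's local-delivery check. LOCAL_ADDRS and ip_input are hypothetical names used only for illustration.

```python
LOCAL_ADDRS = {"192.0.2.1"}   # assumed addresses configured on this host


def ip_input(dst, payload, tap):
    """Inbound IP processing with the tap at the top of IP (no forwarding)."""
    if dst not in LOCAL_ADDRS:
        return False          # dropped by IP before reaching the tap
    tap.append((dst, payload))
    return True
```

Even if a promiscuous NIC hands IP every frame on the wire, only packets addressed to the host ever reach the tap in this model.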


> Dave Edmondson made a similar point. No, ARP/RARP packets
> aren't captured and I will make this clear. For inter-zone
> communication ARP/RARP are not needed, so not providing
> it doesn't affect the ability to debug these problems.


This is the question I raised, is the new architecture just for
inter-machine IP traffic? If it is, I think it is fine. But
it seems to me that the architecture is targeted for more than
that. Then I think it is not correct not capturing such traffic
if the "name" of the sniffing device is actually based on a real
interface name.


> Erik Nordmark has made the same point and this indeed would lead
> to confusion. Again I think this just highlights that the placement of
> hooks needs more discussion and the document should be clearer.


I think it is more than an issue about explaining the placement
of the hooks. We need to look at the goals of the architecture.
Suppose we just want to have the ability to capture inter-machine
traffic (such as the good old loopback), I think using the hook
framework provided makes sense. But if we want to include traffic
with real NIC, I think the current design needs some work. Or maybe
we should not include non-inter-machine traffic at all.


> Clearly we will need to work on describing the hook placement in more
> detail and on ensuring that there isn't confusion. Getting the hook
> placement right is important. This highlights some missing content
> in the design but I'm not sure it's a strong argument against the
> naming scheme used in /dev/ipnet, namely the tie to the
> physical interface.
>
> The main reason for introducing these devices is to give access to
> loopback traffic. Extending what traffic these devices give access
> to means we can also enable things like snooping on the ipmp
> device. The extension though is really secondary to the main
> purpose of the devices. It sounds like this should be clarified in
> the document.


If the main goal of introducing these new sniffing devices is for
inter-machine traffic, then I'd suggest that these devices not
capture traffic going out or coming in via a real NIC, i.e.
non-inter-machine traffic. It is much cleaner and more intuitive. Then
we may as well not
introduce a new snoop option -I. The current -d is sufficient as long
as the device name is clearly associated with a zone.



>> And what other options of allowing snoop inside a non-global zone have
>> been considered? I think both Bart and Peter Tribble have also hinted
>> that it makes sense to use zone as a differentiator in exporting those
>> devices, what is the team's response in this?
>
>
> I'm not entirely sure what the question is here. Are you asking why
> don't we use the logical device name i.e. bge0:1? This was what
> Peter and Bart seemed to want. I have responded to this in my
> reply to Bart.


Note that I used "hinted" :-) I think their underlying thought
is that since the sniffing architecture already does the zone
classification inside the kernel, why not just export that to the
user land? They used the logical interface device names as an example
of how to export that. But this is not necessary. I've already asked
why the /dev/ipnet/ tree should not contain zone info, such as
/dev/ipnet/x/...


> We have considered other applications. The majority of such applications
> use libpcap which we plan on updating. Yes there may be some other
> applications that we'll also need to go and fix but I would hope that
> by fixing libpcap and ethereal we would capture the majority. It looks
> like this needs to be clarified in the document.


My most used TCP traffic analysis tool, tcptrace, does not use
what are mentioned above for understanding snoop captured file. I
think many people use that too. It is much better if we don't
need to introduce a new "MAC" type which has a version and
can be changed whenever we choose to change zoneid_t...

Since this new architecture is also used to capture IPMP traffic,
isn't it better if the captured file for IPMP traffic has the same
MAC type as the individually captured file (using the normal -d
option on the normal device)?


> By consumer I meant administrators looking at the data
> captured by the devices. This is where there could be
> confusion.


I don't think it is confusing. But I guess it is subjective :-)


> The document already includes some detail on how it will allow
> an administrator to view the IPMP group as a whole through the
> ipmp device. It points the reader at the IPMP document for more
> details. What additional details would you like to see here?


Where is the example? It seems that IPMP is only mentioned in
section 3.2. But exactly what is the relationship between the
new IPMP architecture and the new snoop architecture? Why are those
IPMP sniffing devices under /dev/ipnet/? How are they named? What
will be captured? I can read the IPMP document about the new IPMP
architecture. But it is a good idea for this new snoop
document to be much more self-contained. The same idea applies
to snooping in IP tunnels.

And if we restrict this new snoop architecture to only capture
inter-machine traffic, then I guess we don't need to mention
IPMP and tunnel at all in this document. Instead, I'd like those
two documents about IPMP and tunnel to clearly describe what exactly
will be captured if we use snoop on them. I don't think it is clear
in the current documents.


--

K. Poon.
kacheong dot poon at sun dot com




philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 18, 2005 3:27 AM   in response to: kcpoon

Hi Kacheong,

Thanks for the feedback.

> They do not seem correct given the placement of hooks as explained in
> the current document. Suppose DL_PROMISC_PHYS is set, and the real
> NIC driver sends up all packets (it is not clear if snoop -I bge0
> actually has this effect from reading the document).

I might have misunderstood you but it sounds like you think that
snooping on the devices in /dev/ipnet also has an effect on the real
NIC. This isn't the case. Setting DL_PROMISC_PHYS on a device in
/dev/ipnet only affects what this device will pass up. The real NIC
isn't touched at all. As for the hook placement I've already agreed
this is something we need to look at again for several reasons.

> This is the question I raised, is the new architecture just for
> inter-machine IP traffic? If it is, I think it is fine. But it
> seems to me that the architecture is targeted for more than that.
> Then I think it is not correct not capturing such traffic if the
> "name" of the sniffing device is actually based on a real interface
> name.

No, you will see inter and intra machine packets. Given the
devices provide visibility at the IP layer I'm not sure why
you would expect to see ARP traffic as well. Although the
device names are based on the real device name I'd have
hoped placing them in /dev/ipnet makes it clear these
devices are different and that what access to packets they
provide is also different.

> I think using the hook framework provided makes sense. But if we want
> to include traffic with real NIC, I think the current design needs
> some work. Or maybe we should not include non-inter-machine traffic
> at all.

In what way do you think the current design doesn't
work for inter machine traffic?

> My most used TCP traffic analysis tool, tcptrace, does not use what
> are mentioned above for understanding snoop captured file. I think
> many people use that too. It is much better if we don't need to
> introduce a new "MAC" type which has a version and can be changed
> whenever we choose to change zoneid_t...

If there was ever a new version it would just be to pass
up extra data. Existing apps would still work; they just wouldn't
get to see the new additional data. Perhaps the doc isn't
clear here.
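
One way this kind of forward compatibility is commonly achieved is a header-length field that lets older parsers skip fields they do not know. The sketch below assumes such a field; the layout (version, header length, zone id) is invented for illustration and is not the dl_ipnetinfo layout from the design document.

```python
import struct

KNOWN_FMT = "!BBH"                      # version, header length, zone id (assumed)
KNOWN_LEN = struct.calcsize(KNOWN_FMT)  # 4 bytes


def parse(buf):
    """Return (version, zoneid, ip_packet), ignoring unknown trailing fields."""
    version, hdr_len, zoneid = struct.unpack(KNOWN_FMT, buf[:KNOWN_LEN])
    if hdr_len < KNOWN_LEN or hdr_len > len(buf):
        raise ValueError("bad header length")
    # Anything between KNOWN_LEN and hdr_len belongs to a newer version.
    return version, zoneid, buf[hdr_len:]
```

A version 2 header carrying four extra bytes parses to the same zone id and IP payload as a version 1 header; the old parser simply never looks at the new data.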

> I don't think it is confusing. But I guess it is subjective :-)

Yes :)

> Where is the example? It seems that IPMP is only mentioned in
> section 3.2.

It's also in the introduction but it sounds like you want an actual example
of using snoop on the ipmp device in /dev/ipnet?

Thanks

Phil



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 18, 2005 9:23 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> I might have misunderstood you but it sounds like you think that
> snooping on the devices in /dev/ipnet also has an effect on the real
> NIC. This isn't the case. Setting DL_PROMISC_PHYS on a device in
> /dev/ipnet only affects what this device will pass up. The real NIC
> isn't touched at all. As for the hook placement I've already agreed
> this is something we need to look at again for several reasons.


So are you saying that DL_PROMISC_PHYS is just a no-op?


> No, you will see inter and intra machine packets. Given the
> devices provide visibility at the IP layer I'm not sure why
> you would expect to see ARP traffic as well. Although the
> device names are based on the real device name I'd have
> hoped placing them in /dev/ipnet makes it clear these
> devices are different and that what access to packets they
> provide is also different.


This is the confusion I tried to raise. I don't think it
is a good idea to use a real device name for capturing
traffic which may not even "go through" the device at all.
What I tried to ask is whether this project should just limit itself
to the original big goal, capturing intra machine traffic
only.


> In what way do you think the current design doesn't
> work for inter machine traffic?


It is not "not working." It is confusing. So the scheme can
have both "snoop -d hme0" and "snoop -I hme0" and we expect
the user to understand the subtle difference between the two?
Even after the placement of hooks is made more logical, it is
still confusing IMHO. I'd suggest that the project team rethink
the naming scheme if the team really wants to extend this project
to include more than intra machine traffic. (I also hope that
there is no new MAC type :-) And for intra machine traffic,
I hope the new device name can have zone info in it.


> If there was ever a new version it would just be to pass
> up extra data. Existing apps would still work; they just wouldn't
> get to see the new additional data. Perhaps the doc isn't
> clear here.


If we expect those apps to ignore the info, why do we want
to introduce the data in the first place? Why introduce a new
MAC type?


> It's also in the introduction but it sounds like you want an actual example
> of using snoop on the ipmp device in /dev/ipnet?


I guess more than an example is needed. What info is being
captured exactly? For example, can an analyzer figure out
exactly which interface of an IPMP group is being used to send
out or receive a packet? Just mentioning the whole group can be
observed provides very little info on what exactly is happening.


--

K. Poon.
kacheong dot poon at sun dot com




meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 18, 2005 12:39 PM   in response to: kcpoon


> > devices are different and that what access to packets they
> > provide is also different.
>
> This is the confusion I tried to raise. I don't think it
> is a good idea to use a real device name for capturing
> traffic which may not even "go through" the device at all.

it's *not* a device name. it's an ip interface name. hence lo0, ipmp0,
and so forth.

--
meem



carlsonj

Posts: 6,813
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 14, 2005 3:59 AM   in response to: philk

Philip Kirk - Solaris Sustaining writes:
> http://www.opensolaris.org/os/community/networking/ipobs-design.pdf

I'm getting better. Only six days late.

p1 s1:

- It might be good to set out some future intent here. Based on the
way TCP fusion is handled, I'd expect that if we had socket layer
loopback, that would also be disabled when observability is
enabled.

p1 s2:

- This part seems strange to me. Viewing IP as a component, we see
tap points for send and receive that are not symmetric and are
thus confusing. To illustrate:

       txd rxd
            ^
        |   |
    +---+---X---+
    |   |   |   |
    |   |   |   | IP
    |   |   |   |
    +---X---+---+
        |   |
        v

Thus, on transmit, we see the results after compression[1],
encryption, fragmentation, and finally NAT renumbering. The data
represent the view of an external system, minus L2 headers. But
on receive, we see the results after NAT, reassembly, decryption,
and decompression. This means it's asymmetric.

It seems to me that the top half and bottom half of IP represent
two interesting reference points -- and there may be others. The
top reference point is the IP holistic view, where the IP service
is offered to transports that are essentially ignorant of any link
layer dealings within IP.

The bottom reference point is IP's representation of underlying
links, and fairly represents the equivalent of per-link snoop in
an environment where you can't touch the real devices (i.e., a
zone).

The two are both interesting -- the top half is quite useful in
that it gives observability into the "internal" (pre) NAT data and
the cleartext traffic for IPsec. The bottom half is useful for
observing what the resulting network packets look like (save L2
data, perhaps).

But I don't see them as equivalent or replaceable concepts such
that we can just tap transmit at one point and receive in
another. In fact, I'd represent that top half using
"/dev/ipnet/ip" to mean "all of IP."

It's quite possible that there are multiple interesting reference
points, such as looking at data after IPsec handling but before
NAT occurs. It's not clear to me how many such reference points
there might be.

[1] Not that Solaris supports IPcomp today, but it could.
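
The asymmetry described for p1 s2 can be sketched as a toy model. This is only an illustration under stated assumptions: the stage names are taken from the prose above and are hypothetical labels, not real kernel symbols, and each receive stage is assumed to undo exactly one transmit-side transformation.

```python
# Hypothetical stage names taken from the review comment; not kernel symbols.
TX_STAGES = ["compress", "encrypt", "fragment", "nat"]

# Each receive stage is paired with the transmit-side transformation it undoes.
RX_STAGES = [
    ("nat", "nat"),
    ("reassemble", "fragment"),
    ("decrypt", "encrypt"),
    ("decompress", "compress"),
]


def transmit_tap_view():
    """Transformations already applied when the transmit tap (bottom of IP) fires."""
    applied = []
    for stage in TX_STAGES:
        applied.append(stage)
    return applied


def receive_tap_view():
    """Transformations still present when the receive tap (top of IP) fires."""
    remaining = list(TX_STAGES)  # the packet arrives in its on-the-wire form
    for _stage, undoes in RX_STAGES:
        remaining.remove(undoes)
    return remaining


if __name__ == "__main__":
    print("tx tap sees:", transmit_tap_view())
    print("rx tap sees:", receive_tap_view())
```

In this model the transmit tap sees the packet with every transformation applied (the external view), while the receive tap sees it with every transformation removed (the internal view), which is the mismatch the comment points out.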

p3 s3.1:

- I find it confusing to switch back and forth between "interface"
(mostly apparently meaning "physical interface" and not "logical
interface," though the distinction doesn't seem to be kept
throughout the document) and "address" as the identity for the
observability nodes.

Suppose two fragments arrive for a single IP datagram. One
fragment arrives over hme0, the second over bge0, and the
destination address belongs to ce0. What snoops will see this
packet and why?

I suspect that there are two independent tests here. One is "did
this packet go out on [or arrive on] link X?" The other is "is
the source [destination] address assigned as one of the logical
addresses on link X?" Call the former test "L," and the latter
"A."

This gives us at least three possibilities for received packets.
(The fourth is !L & !A, which means nothing matches and the packet
is not received.) What are the dispositions for the other cases?

L & A - pretty sure this is received
L & !A - this might be received if promiscuous
!L & A - is this always received even if not promiscuous?

There are deeper issues here relating to the set of local
addresses on a link. Are all of the possible broadcast and
multicast addresses also on a link and considered to be matches?
Can a user in a zone thus monitor all multicast traffic (even
traffic not generated within that zone or intended for that zone)?
Or is traffic segregated somehow? (If "somehow," then how?)

What about unnumbered links (where the same source address is
shared among two or more links)? Is an unnumbered address "owned"
by just one link, or will you see traffic for that address if you
snoop on *any* of the links?
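
The "L" and "A" tests above can be written down as a small decision function. This is a minimal sketch, not the design document's algorithm: the function and parameter names are hypothetical, and the answer to the open "!L & A" question is exposed as a parameter (`deliver_addr_only`) because the thread does not settle it.

```python
from itertools import product


def disposition(on_link, addr_match, promiscuous, deliver_addr_only=True):
    """Return True if an observer on link X would be handed the packet.

    on_link           -- test "L": the packet went out on (or arrived on) link X
    addr_match        -- test "A": the address is one of X's logical addresses
    promiscuous       -- the observer asked for promiscuous mode
    deliver_addr_only -- ASSUMPTION: the answer to the open "!L & A" question
    """
    if on_link and addr_match:
        return True                # L & A: received
    if on_link:
        return promiscuous         # L & !A: only if promiscuous
    if addr_match:
        return deliver_addr_only   # !L & A: open question in the review
    return False                   # !L & !A: nothing matches


def matrix(promiscuous, deliver_addr_only=True):
    """Enumerate all four (L, A) combinations for one observer setting."""
    return {
        (l, a): disposition(l, a, promiscuous, deliver_addr_only)
        for l, a in product((True, False), repeat=2)
    }


if __name__ == "__main__":
    for (l, a), seen in matrix(promiscuous=False).items():
        print(f"L={l!s:5} A={a!s:5} -> {'received' if seen else 'not received'}")
```

Writing the cases out this way makes it easy to check that every combination has a defined disposition, which is the gap the matrix was meant to expose.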

p4 table 2:

- I got a bit confused by this table. How is it that a packet sent
to 192.168.0.1 is "outbound on lo0?" That doesn't seem possible
to me. At least in standard IP forwarding (also used to direct
local traffic), the outbound link is identified by the destination
address, not the source address. The source address traditionally
plays no role in the selection of the link.

Is this suggesting otherwise?

- What do the 192.168.0.4 cases represent? Forwarding?

p5 s3.2:

- I'm not sure what this first paragraph is saying. It seems to say
that there's some other observability control node that creates
and destroys the other nodes, but it doesn't name that node (what
is it?).

Is this just trying to say that /dev/ipnet/bge0 and the like are
created and destroyed on the fly by the system? Or is it saying
that there's some other node here?

p6 s3.4:

- I'd suggest -1 (ALL_ZONES) as a generic way of saying "unknown
zone."

p9:

- How does flow control work for these observability streams? Do we
just drop on !canput?

p9 s4.3:

- How does the hook framework run "in parallel" with packet
processing? What happens if the packet processing ends up freeing
the packet? (At a guess, I think this means that dupmsg(9F) is
used to bump up the db_refcnts. But that can easily result in
unwanted data copies, and IP generally assumes that db_refcnt is 1
after ip_rput is underway [and it copies first if not] so more
might be needed here.)
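
The guess about dupmsg(9F) and reference counts can be illustrated with a toy model. This is a sketch only, loosely modelled on the idea of message descriptors sharing a reference-counted data block; the class and function names below are illustrative Python, not the STREAMS API, and the copy-before-write step stands in for the "copies first if not" behaviour mentioned above.

```python
class DataBlock:
    """Shared packet payload with a reference count."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refcnt = 1


class Message:
    """Descriptor pointing at a (possibly shared) data block."""

    def __init__(self, block):
        self.block = block


def dupmsg(msg):
    """Create a second descriptor that shares the same data block."""
    msg.block.refcnt += 1
    return Message(msg.block)


def freemsg(msg):
    """Drop this descriptor's reference to its data block."""
    msg.block.refcnt -= 1
    msg.block = None


def make_writable(msg):
    """Copy the data first if another descriptor still references it."""
    if msg.block.refcnt > 1:
        private = DataBlock(msg.block.data)
        msg.block.refcnt -= 1
        msg.block = private
    return msg


if __name__ == "__main__":
    packet = Message(DataBlock(b"abcd"))
    observer_copy = dupmsg(packet)       # handed to the observability stream
    make_writable(packet)                # normal processing wants to modify
    packet.block.data[0] = ord("X")
    freemsg(packet)                      # processing frees its message
    print(bytes(observer_copy.block.data))
```

The observer's view survives both the modification and the free, but only at the cost of the extra copy in `make_writable`, which is the unwanted data copy the comment warns about.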

p10 s4.4:

- What about observing before and after NAT rewrite rules? Today,
we cannot view traffic before the NAT process has taken place, and
cannot relate NAT input to output, so we're reduced to very blunt
instruments like ipnat(1M) to figure out what's going wrong in a
misconfiguration.

p11 s5.1:

- It seems wasteful to me to duplicate the ill_t/ipif_t logic with
essentially identical shadow ipolink_t/ipoaddr_t structures. Why
doesn't netinfo provide a structured view of the system rather
than forcing all clients to build these duplicate lists?

I think there's a bit of a mismatch here between the netinfo
design and the actual requirements of the consumers.

p12 s5.3:

- Why would there be just one possible user of an ipolink_t at a
time? Or am I misreading how ddi_get_soft_state and the minor
node number of the client stream are used here?

p13:

- Pseudo-code doesn't show multicast or broadcast handling. Do
those just not work or is the operation too complex to show?

p14 s5.5:

- Might one day do DL_NOTIFY_* as well. Those events would be
meaningful with links going up and down.

p15:

- Nit on DL_BIND_REQ: what error is used? DL_BADSAP, I'd assume.

p17 sA:

- I think this section reveals a bit of an architectural problem
with the use of IP hooks. If every project just inserts its own
set of hooks in IP, then why not just use direct function calls
from IP into the project and skip the hook complication (and
cost!). The framework artifice buys nothing if there's just one
consumer.

I think that issue is essentially inherent in how the IP hook
"provider" feature is defined. Instead of defining the position
of hooks in terms of what the hooks themselves provide, they're
defined in terms of who actually consumes them.

p18:

- I think that unnumbered and dynamically addressed interfaces are
much more problematic for the address-named node proposal in B.1.

- I'm not sure what B.2 is really asserting. Logical interfaces are
*NOT* real devices. They are, in fact, just address aliases as
those happen to be implemented on Solaris. That's not confusion
-- that's just quirky fact.


--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 23, 2005 8:41 AM   in response to: carlsonj

Hi James,

Thanks for the comments.

> - It might be good to set out some future intent here. Based on the
> way TCP fusion is handled, I'd expect that if we had socket layer
> loopback, that would also be disabled when observability is enabled.

Ok.

> - This part seems strange to me. Viewing IP as a component, we see
> tap points for send and receive that are not symmetric and are thus
> confusing. To illustrate:

Yes, this has been commented on by several other people as well.
I'm re-working the hook placement. The idea of multiple tap points
is interesting but I think outside the scope of what we want to do here.

> I suspect that there are two independent tests here. One is "did
> this packet go out on [or arrive on] link X?" The other is "is the
> source [destination] address assigned as one of the logical addresses
> on link X?" Call the former test "L," and the latter "A."

Yes, this was what the text at the bottom of page 3 and top of page 4
was trying to convey. It also includes the promiscuous test. It seems
like this isn't clear though?

> There are deeper issues here relating to the set of local addresses
> on a link. Are all of the possible broadcast and multicast addresses
> also on a link and considered to be matches?

Yes.

> Can a user in a zone thus monitor all multicast traffic (even traffic
> not generated within that zone or intended for that zone)? Or is
> traffic segregated somehow? (If "somehow," then how?)

We aim to restrict a zone to only its own traffic. I'll
detail the "somehow" in the document.

> p4 table 2: Is this suggesting otherwise?

It's wrong. I think the error slipped in during some late edits to
the table :(

> - What do the 192.168.0.4 cases represent? Forwarding?

Yes.

> Is this just trying to say that /dev/ipnet/bge0 and the like are
> created and destroyed on the fly by the system? Or is it saying that
> there's some other node here?

Yes, I'll make this clearer.

> - I'd suggest -1 (ALL_ZONES) as generic way of saying "unknown zone."
>
>
>

Ok.

> - How does flow control work for these observability streams? Do we
> just drop on !canput?

Yes. Do you have a better suggestion to handle this?
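
The "drop on !canput" policy agreed here can be sketched as follows. This is a toy model under stated assumptions: the class name, the byte-based high-water mark, and the drop counter are hypothetical and only mimic the idea of testing for room before queueing; it is not the driver's actual code.

```python
from collections import deque


class ObserverStream:
    """Toy stand-in for one observability stream with a high-water mark."""

    def __init__(self, hiwat):
        self.hiwat = hiwat          # assumed limit, in bytes
        self.queue = deque()
        self.queued_bytes = 0
        self.dropped = 0

    def canput(self):
        """Room test: is the queue still below its high-water mark?"""
        return self.queued_bytes < self.hiwat

    def tap(self, packet):
        """Deliver a packet to the observer, or drop it if the queue is full."""
        if not self.canput():
            self.dropped += 1       # never block the packet path; just count
            return False
        self.queue.append(packet)
        self.queued_bytes += len(packet)
        return True

    def read(self):
        """The observer (e.g. a sniffer) drains one packet."""
        packet = self.queue.popleft()
        self.queued_bytes -= len(packet)
        return packet


if __name__ == "__main__":
    stream = ObserverStream(hiwat=100)
    for size in (60, 60, 10):
        stream.tap(b"x" * size)
    print("queued:", len(stream.queue), "dropped:", stream.dropped)
```

The point of the policy is that a slow observer loses packets instead of applying back-pressure to normal IP processing.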

> p9 s4.3:
>
> - How does the hook framework run "in parallel" with packet
> processing?

After some discussion about making the hook framework more
generic so it could be used here, we've decided to implement our
own hooks. The document will be updated to reflect this.

> - What about observing before and after NAT rewrite rules? Today, we
> cannot view traffic before the NAT process has taken place, and
> cannot relate NAT input to output, so we're reduced to very blunt
> instruments like ipnat(1M) to figure out what's going wrong in a
> misconfiguration.

I'm not sure that this can be achieved without multiple tap points
and as I said above I don't think this is within the scope of what we
want to do here.

> I think there's a bit of a mismatch here between the netinfo design
> and the actual requirements of the consumers.

I think this part probably needs re-visiting.

> - Why would there be just one possible user of an ipolink_t at a
> time? Or am I misreading how ddi_get_soft_state and the minor node
> number of the client stream are used here?

There would be multiple users of an ipolink_t. Maybe this text needs
re-working to make things clearer.

> - Pseudo-code doesn't show multicast or broadcast handling. Do those
> just not work or is the operation too complex to show?

It was left out to avoid complexity.

> - Might one day do DL_NOTIFY_* as well. Those events would be
> meaningful with links going up and down.

Ok.

> - Nit on DL_BIND_REQ: what error is used? DL_BADSAP, I'd assume.

Yes, I'll add this.

> p17 sA:
>
> - I think this section reveals a bit of an architectural problem with
> the use of IP hooks. If every project just inserts its own set of
> hooks in IP, then why not just use direct function calls from IP into
> the project and skip the hook complication (and cost!).

As mentioned above we're now going to use our own hooks so
this section is now gone.

> - I'm not sure what B.2 is really asserting. Logical interfaces are
> *NOT* real devices. They are, in fact, just address aliases as those
> happen to be implemented on Solaris. That's not confusion -- that's
> just quirky fact.

It was trying to highlight that today some users see logical interfaces
as real devices. They aren't aware that logical interfaces are not real
devices, and many expect them to work as normal devices, including
running snoop on them. If snoop suddenly starts working on logical
interfaces then that belief is reinforced.

Thanks again for the comments.

Phil





carlsonj

Posts: 6,813
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 23, 2005 8:56 AM   in response to: philk

Philip Kirk - Solaris Sustaining writes:
> > - This part seems strange to me. Viewing IP as a component, we see
> > tap points for send and receive that are not symmetric and are thus
> > confusing. To illustrate:
>
> Yes this has been commented on by several other people as well.
> I'm re-working the hook placement. The idea of multiple tap points
> is interesting but I think outside the scope of what we want to do here.

OK.

> > I suspect that there are two independent tests here. One is "did
> > this packet go out on [or arrive on] link X?" The other is "is the
> > source [destination] address assigned as one of the logical addresses
> > on link X?" Call the former test "L," and the latter "A."
>
> Yes, this was what the text at the bottom of page 3 top of page 4
> was trying to convey. It also includes the promiscuous test. It seems
> like this isn't clear though?

I wasn't certain that all of the cases were covered, thus the matrix I
tried to write.

> > Can a user in a zone thus monitor all multicast traffic (even traffic
> > not generated within that zone or intended for that zone)? Or is
> > traffic segregated somehow? (If "somehow," then how?)
>
> We aim to restrict a zone to only it's own traffic. I'll
> detail the "somehow" in the document.

OK. There might be a small problem here on the inbound side as I
suspect there's no good way to know to which zone a multicast address
"belongs." That'll be strange -- to see only the inbound traffic but
not the outbound.

> > - How does flow control work for these observability streams? Do we
> > just drop on !canput?
>
> Yes. Do you have a better suggestion to handle this?

No; just clarifying.

> > - What about observing before and after NAT rewrite rules? Today, we
> > cannot view traffic before the NAT process has taken place, and
> > cannot relate NAT input to output, so we're reduced to very blunt
> > instruments like ipnat(1M) to figure out what's going wrong in a
> > misconfiguration.
>
> I'm not sure that this can be achieved without multiple tap points
> and as I said above I don't think this is within the scope of what we
> want to do here.

OK. I guess I'm not certain at this point what's in and out of scope
for "observability." Is it just "observability for Zones?" Or some
collection of projects/protocols but not others?

I think it's probably fine to rule some things out of scope. I'm not
quite clear on which ones they are. (I'd like to think that they
represent some sort of administrative grouping that makes sense to a
user, rather than a grouping that reflects who "owns" what part of the
code.)

> > - I'm not sure what B.2 is really asserting. Logical interfaces are
> > *NOT* real devices. They are, in fact, just address aliases as those
> > happen to be implemented on Solaris. That's not confusion -- that's
> > just quirky fact.
>
> It was trying to highlight that today some users see logical interfaces
> as real devices. They aren't aware of the fact that they aren't and
> many expect them to work as normal devices including running
> snoop on them. If snoop suddenly starts working on logical interfaces
> then their belief in logical interfaces is re-enforced.

Ah, ok. The text seemed to assert the opposite (that users _should_
think of them as "real").

--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



philk

Posts: 42
From: UK

Registered: 6/29/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 23, 2005 9:18 AM   in response to: carlsonj

> I wasn't certain that all of the cases were covered, thus the matrix
> I tried to write.

Ok. I actually like the matrix you proposed and may try to work something
like this in if that's ok?

> OK. I guess I'm not certain at this point what's in and out of scope
> for "observability." Is it just "observability for Zones?" Or some
> collection of projects/protocols but not others?

I think I may have confused the issue so I'll try and clarify it.
What this component will introduce is observability at the IP layer which
means observability of loopback traffic and thus inter-zone
communication. By extending the design beyond just
loopback traffic we can address some other problems like
viewing an IPMP group. However, for inter-machine packets
we only plan on hooking at two points: where a packet is received,
before IP processes it, and where a packet is about to be sent,
after IP has processed it. What we don't plan on doing is
implementing multiple tap points; it is this part that I think is
out of scope.

Thanks

Phil



carlsonj

Posts: 6,813
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 23, 2005 9:27 AM   in response to: philk

Philip Kirk - Solaris Sustaining writes:
> > I wasn't certain that all of the cases were covered, thus the matrix
> > I tried to write.
>
> Ok. I actually like the matrix you proposed and may try to work something
> like this in if that's ok?

Sure.

> > OK. I guess I'm not certain at this point what's in and out of scope
> > for "observability." Is it just "observability for Zones?" Or some
> > collection of projects/protocols but not others?
>
> I think I may have confused the issue so I'll try and clarify it.
> What this component will introduce is observability at the IP layer which
> means observability of loopback traffic and thus inter-zone
> communication. By extending the design beyond just
> loopback traffic we can address some other problems like
> viewing an IPMP group. However, for inter-machine packets
> we only plan on hooking at the point a packet is received and
> before ip processes it, and when a packet is about to be sent
> after ip has processed it. What we don't plan on doing is
> implementing multiple tap points, it is this part that I think is
> out of scope.

This means that, from a user's perspective, the new functionality
allows you to look at traffic that's inside an IP tunnel, but not
inside an IPsec conversation. Stranger still, you can look at the
"before" picture for NAT using this tool and the "after" picture with
regular DLPI -- but only until the new, better-integrated IP Filter
comes on the scene, and then you'll be back where you started, seeing
only the "after" picture in both cases.

It would be nice to have a more durable definition of what
"observability" means for IP -- one that didn't depend so much on the
system-level implementation -- but I guess that's out of scope here.

--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 24, 2005 12:53 AM   in response to: philk

Philip Kirk - Solaris Sustaining wrote:

> I think I may have confused the issue so I'll try and clarify it.
> What this component will introduce is observability at the IP layer which
> means observability of loopback traffic and thus inter-zone
> communication. By extending the design beyond just
> loopback traffic we can address some other problems like
> viewing an IPMP group. However, for inter-machine packets
> we only plan on hooking at the point a packet is received and
> before ip processes it, and when a packet is about to be sent
> after ip has processed it. What we don't plan on doing is
> implementing multiple tap points, it is this part that I think is
> out of scope.


Let me once again suggest that we restrict this project to
intra-machine traffic (loopback and inter-zone communication)
and not extend it to include other things. I don't think we
can expect a sys admin to understand the difference between
"snoop -d" and "snoop -I." We should address the whole of
network observability in another project. And the tool to use
is probably not snoop, although we may export the data in a
format which snoop and other popular network sniffers understand.
IMHO, it is not right to have a partial and possibly confusing
solution for network observability using snoop at this point.


--

K. Poon.
kacheong dot poon at sun dot com




meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 24, 2005 7:36 AM   in response to: kcpoon


> I don't think we can expect a sys admin to understand the difference
> between "snoop -d" and "snoop -I."

Rapidly evolving tools like dladm(1M) already require the system
administrator to understand the difference between a DLPI link and an IP
interface. Moreover, a clean design of the networking administrative
model mandates this distinction.

--
meem



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 25, 2005 2:59 AM   in response to: meem

Peter Memishian wrote:
> > I don't think we can expect a sys admin to understand the difference
> > between "snoop -d" and "snoop -I."
>
> Rapidly evolving tools like dladm(1M) already require the system
> administrator to understand the difference between a DLPI link and an IP
> interface. Moreover, a clean design of the networking administrative
> model mandates this distinction.


No, it does not mean that we need to modify snoop to support
that. We can have other tools to export such data. I don't
see a strong reason why snoop is the best place to add
this kind of IP layer observability support. In fact, I believe
it will confuse users of this well-known tool. We should
use new tools to support a thorough IP observability mechanism,
not clobber an existing tool which already has a well-known
interface and output format. This is the wrong choice!



--

K. Poon.
kacheong dot poon at sun dot com





meem

Posts: 3,046
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 25, 2005 7:51 AM   in response to: kcpoon


> In fact, I believe it is confusing the user of this well known tool.
> We should use new tools to support a thorough IP observability
> mechanism, not to clobber existing tool which has already a well known
> interface and output format.

There is no "clobbering" of an existing tool -- nothing in snoop has been
damaged: one new command-line flag will be added; that's it. Other packet
monitoring tools such as Ethereal already support this concept. There is
nothing that precludes adding a more thorough IP observability mechanism
-- and no one is advocating that mechanism be modeled through snoop.

This work solves a very real and pressing problem of monitoring traffic
between zones. It also allows IPMP group and loopback traffic in general
to be easily observed. These are all *extremely* useful abilities. It
uses a tool that is already familiar to users, has minimal impact on the
architecture of the system, and is straightforward to implement in a
timely manner.

--
meem



kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 28, 2005 3:58 AM   in response to: meem

Peter Memishian wrote:

> There is no "clobbering" of an existing tool -- nothing in snoop has been
> damaged: one new command-line flag will be added; that's it. Other packet
> monitoring tools such as Ethereal already support this concept. There is
> nothing that precludes adding a more thorough IP observability mechanism
> -- and no one is advocating that mechanism be modeled through snoop.


I'm not sure which concept you are referring to above that Ethereal
also supports. In the Ethereal I am using, there is only one option,
-i, which is used to specify an interface (or pipe or stdin). Can
you clarify?

Let's step back one moment and look at where snoop (or Ethereal)
observes live packets. The observation point is at the device
driver where packets are received from the wire (assuming there
is a wire :-) or sent to the wire. In other UNIXes, they also
support observing loopback traffic. And they put the observation
point for loopback traffic at a more or less equivalent place as
for the normal non-loopback traffic. This is the expected point
where a normal network traffic sniffer observes packets.

The proposed -I option breaks this expected observation point.
This is clobbering snoop. And as it is snoop, it also does not
make sense to allow a user to change where the observation point
should be. This makes it not quite useful.

The reason why it is not good for the future mechanism is the
following. A complete IP observability mechanism obviously should
include the proposed "snoop observation point." So if we put a
hook at that point now, when we are implementing the complete
mechanism, what should we do about the snoop hook? We cannot
remove it as this "will break backward compatibility" for snoop.
But the future mechanism probably won't work with the proposed
snoop hook (this we don't know, but I definitely hope that we
don't need to design it so that it must work with the proposed
snoop hook). This means that we need to have two hooks at exactly
the same point?


> This work solves a very real and pressing problem of monitoring traffic
> between zones. It also allows, IPMP group and loopback traffic in general
> to be easily observed. These are all *extremely* useful abilities. It
> uses a tool that is already familiar to users, has minimal impact on the
> architecture of the system, and is straightforward to implement in a
> timely manner.


Note that I am suggesting that the snoop project focus on
intra-machine traffic. This is the feature which we lack and all
other UNIXes have. It is the pressing need we have to deliver.
I think we should also have a more thorough IP observability
mechanism. But we should not clobber snoop to do a partial job
for that. I do agree that the mechanism should support exporting
data in a format which snoop and other network sniffers understand
so that they can be used to analyze the traffic. And I guess we
all agree that "straightforward to implement something" is not a
good argument to actually implement that thing. I still see no
strong reason why snoop has to be used for this job...


--

K. Poon.
kacheong dot poon at sun dot com




seb

Posts: 2,142
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 29, 2005 7:32 PM   in response to: kcpoon

On Mon, 2005-11-28 at 06:58, Kacheong Poon wrote:
> Let's step back one moment and look at where snoop (or ethereal)
> observe live packets. The observation point is at the device
> driver where packets are received from the wire (assuming there
> is a wire :-) or sent to the wire. In other UNIXes, they also
> support observing loopback traffic. And they put the observation
> point for loopback traffic at a more or less equivalent place as
> for the normal non-loopback traffic. This is the expected point
> where a normal network traffic sniffer observes packets.

With the model provided by other operating systems, if I want to observe
packets associated with a particular application or IP address, I need
to know ahead of time whether those packets are looped back (in which
case I'd observe the loopback interface), or not looped back (in which
case I'd observe the appropriate link-layer device).

If I'm a zone administrator on Solaris and I want to observe the packets
associated with my zone, I don't really know or care if those packets
are going to another zone on the same system or to a different system
altogether. That model doesn't fit with the way we've designed
networking within zones. Whether or not packets associated with a
particular IP interface came from the same system or are destined to the
same system shouldn't be something that the zone administrator needs to
know ahead of time when using a network observability tool. I don't
think the administrator has any way of knowing if a particular IP
address is assigned to a different system or a different zone on the
same system anyway.

The model defined here works quite well for IP observability within a
zone, and that is one of the requirements we're trying to meet here.

>
> The proposed -I option breaks this expected observation point.
> This is clobbering snoop. And as it is snoop, it also does not
> make sense to allow a user to change where the observation point
> should be. This makes it not quite useful.

Snoop and other network observability tools can already decode packets
at various layers, so the observability point seems like a documentation
issue. We can document that when using -I, the data provided comes from
the IP layer. I would agree that if we overloaded -d and, under the
covers, provided a different observability point, this would be
confusing...

>
> The reason why it is not good for the future mechanism is the
> following. A complete IP observability mechanism obviously should
> include the proposed "snoop observation point." So if we put a
> hook at that point now, when we are implementing the complete
> mechanism, what should we do about the snoop hook? We cannot
> remove it as this "will break backward compatibility" for snoop.
> But the future mechanism probably won't work with the proposed
> snoop hook (this we don't know, but I definitely hope that we
> don't need to design it so that it must work with the proposed
> snoop hook). This means that we need to have two hooks at exactly
> the same point?

You're talking about two different things above, and I'm not sure that
it was your intention. The hooks in IP and the DLPI device that snoop
consumes are two different things. Snoop depends on the DLPI device,
and it couldn't care less where the device gets its data from. I don't see
how the latter couldn't change in the future.

Also, DLPI is a simple way for us to provide networking data to tools
that already use this interface, and I think you agree with that based
on what you've stated in your last paragraph.

>
> > This work solves a very real and pressing problem of monitoring traffic
> > between zones. It also allows IPMP group and loopback traffic in general
> > to be easily observed. These are all *extremely* useful abilities. It
> > uses a tool that is already familiar to users, has minimal impact on the
> > architecture of the system, and is straightforward to implement in a
> > timely manner.
>
>
> Note that I am suggesting the snoop project to focus on intra-
> machine traffic. This is the feature which we lack and all
> other UNIXes have. It is the pressing need we have to deliver.

We need to provide the hooks in IP to give access to this data. I don't
see how you can argue both for this and against the hooks in IP as
you've done above... I think I'm missing your point.

> I think we should also have a more thorough IP observability
> mechanism. But we should not clobber snoop to do a partial job
> for that. I do agree that the mechanism should support exporting
> data in a format which snoop and other network sniffers understand
> so that they can be used to analyze the traffic. And I guess we
> all agree that "straightforward to implement something" is not a
> good argument to actually implement that thing. I still see no
> strong reason why snoop has to be used for this job...

I don't think it would necessarily have to be used for that job, but I
also don't see why it necessarily shouldn't be used for that job given
that it already has the mechanisms in place for parsing and filtering
networking data. This project wouldn't hinder a broader network
observability mechanism even if that mechanism didn't involve snoop, or
at least I don't see how it would.

-Seb





kcpoon

Posts: 630
From: HK

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 30, 2005 8:04 AM   in response to: seb

Sebastien Roy wrote:

> With the model provided by other operating systems, if I want to observe
> packets associated with a particular application or IP address, I need
> to know ahead of time whether those packets are looped back (in which
> case I'd observe the loopback interface), or not looped back (in which
> case I'd observe the appropriate link-layer device).


Put it this way, this is the expected behavior of a network
sniffer. It is not a surprise to anyone that snoop also behaves
the same way. Is it enough for observability? No, that is why we
need to have a better IP observability mechanism.


> If I'm a zone administrator on Solaris and I want to observe the packets
> associated with my zone, I don't really know or care if those packets
> are going to another zone on the same system or to a different system
> altogether. That model doesn't fit with the way we've designed
> networking within zones. Whether or not packets associated with a
> particular IP interface came from the same system or are destined to the
> same system shouldn't be something that the zone administrator needs to
> know ahead of time when using a network observability tool. I don't
> think the administrator has any way of knowing if a particular IP
> address is assigned to a different system or a different zone on the
> same system anyway.


Look at it another way. If you are a sys admin of a system
with multiple interfaces (forget about zones for this case), how do
you capture the traffic for a particular application? Note that
the application can also talk to other apps running on the same
machine using a non-loopback IP address. You will probably need
to capture traffic on all the interfaces, including loopback, right?

Use the above example and apply it to a system with zones. I
think for a sys admin, the way to capture the traffic is still the
same: capturing traffic on all the interfaces. There is no surprise.
Is it the best option? No, that's why we need to have a better IP
observability mechanism. Why should we have -I option in snoop? It
is a real surprise to a sys admin to find that the output file
captured by using -I cannot be decoded by the xyz analyzer.


> The model defined here works quite well for IP observability within a
> zone, and that is one of the requirements we're trying to meet here.


To me, it is just a partial solution. What we need is a better
mechanism. And I don't see the argument why snoop is the right
tool for this mechanism. The pressing need is for Solaris to support
the same level of network sniffer functionality the other UNIXes have,
which is being able to capture loopback traffic. With this ability,
the inter-zone traffic observability issue is also solved. The
remaining zone issue is using snoop in a zone. Note that this is not
just a problem with snoop. It is a generic problem with zones and our
IP stack. I'd suggest we consider a complete solution rather than just
doing it for each individual component of our stack. In the meantime,
the sys admin in the global zone can observe all traffic for
all zones.


> Snoop and other network observability tools can already decode packets
> at various layers, so the observability point seems like a documentation
> issue. We can document that when using -I, the data provided comes from
> the IP layer. I would agree that if we overloaded -d and, under the
> covers, provided a different observability point, this would be
> confusing...


Decoding a packet and capturing a packet are two different things.
We are discussing the point of capture here. I can write a program
to do pure packet capturing without the ability to do any decoding.
The other way around is also true. For example, tcptrace can
analyze TCP packets but it is not able to do packet capturing. We
should not mix them up.
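
To make the distinction concrete, here is a minimal sketch of a decode-only routine in Python. It is purely illustrative and is not part of snoop or of the proposed design; the function name `decode_ipv4_header` is made up for this example. It parses the fixed 20-byte IPv4 header from raw bytes and has no idea where those bytes were captured.

```python
import socket
import struct


def decode_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header from raw bytes.

    Hypothetical helper for illustration only. It does no capturing:
    the bytes could come from a link-layer device, an IP-layer device,
    or a saved capture file.
    """
    if len(packet) < 20:
        raise ValueError("need at least 20 bytes for an IPv4 header")

    # Network byte order: version/IHL, TOS, total length, identification,
    # flags/fragment offset, TTL, protocol, checksum, source, destination.
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, _cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])

    version = ver_ihl >> 4
    if version != 4:
        raise ValueError("not an IPv4 packet")

    return {
        "version": version,
        "header_len": (ver_ihl & 0x0F) * 4,  # IHL is counted in 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```

A capture tool would be the mirror image: it would read raw bytes from some observation point and write them out without interpreting them.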

And I guess the way you described above in documenting -I is actually
confusing. What is meant by packets coming from the IP layer? Shouldn't
the IP packets captured by using "snoop -d hme0" also come from the
IP layer? What is the difference between -d and -I? It is this
subtle difference which is confusing. I think your description above
illustrates the point I am trying to make. Note also that using the -d
option to capture loopback traffic is expected by any UNIX sys admin
as it is the way to do it in other UNIXes. Using -I is a surprise
and can be confusing...


> You're talking about two different things above, and I'm not sure that
> it was your intention. The hooks in IP and the DLPI device that snoop
> consumes are two different things. Snoop depends on the DLPI device,
> and it couldn't care less where the device gets its data from. I don't see
> how the latter couldn't change in the future.


I did not talk about the DLPI interface at all (I didn't mention
anything about DLPI interface, did I?). The "snoop hooks" or
"snoop observation points" I mentioned are not related to the DLPI
interface used by snoop... They are the hooks being added by
this project inside the kernel at different places to capture
packets. And I am referring to the difficulty of maintaining those
hooks in future if we are going to have a complete IP observability
mechanism which does not capture and export packets the same way as
the proposed hooks. Then we need to maintain the proposed hooks
and add in the new mechanism at the same place. Is it possible to
change the proposed observability device to use the future
mechanism? Maybe or maybe not. The issue is that I don't think
this should even be a design constraint and consideration for the
IP observability mechanism. Why do we want to have something which
is not a complete solution that we need to deal with in future?


> Also, DLPI is a simple way for us to provide networking data to tools
> that already use this interface, and I think you agree with that based
> on what you've stated in your last paragraph.


As I said, the hooks I mentioned have nothing to do with the DLPI
interface. Maintaining a DLPI interface has nothing to do with
maintaining the actual hooks inside the kernel to capture and export
packets...


> We need to provide the hooks in IP to give access to this data. I don't
> see how you can argue both for this and against the hooks in IP as
> you've done above... I think I'm missing your point.


It really depends on the design. I am not suggesting any design
here. But just as a straw man, the straightforward (but probably
not very good :-) way to support capturing loopback traffic is
to have a loopback device. Just as the proposed solution in
disabling TCP fusion, we can disable the current "IP short cut"
for loopback traffic when snooping and send all the loopback traffic
to the loopback device driver. Then this loopback device driver
can export the IP traffic just like a normal network device
driver. There are no new kinds of hooks (not strictly true, but
I think the code changes required are not considered hooks...)
specifically added for exporting the loopback traffic. I should
stress that this is just a simple approach off the top of my head; there are
too many things to be considered. I'm sure the project team can do
a better job ;-) But this illustrates the fact that adding special
hooks is not necessary for capturing loopback traffic.
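
The straw man above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not kernel code and not a proposed design: `ToyLoopbackDevice`, `ToyStack`, `send_local`, and the callback signature are all hypothetical names invented for this sketch. Local traffic takes the short cut unless a snooper is attached to the loopback device, in which case it is routed through the device, which exports it the way any other device would.

```python
class ToyLoopbackDevice:
    """Hypothetical loopback device that exports packets like any other device."""

    def __init__(self, name, deliver):
        self.name = name
        self._deliver = deliver   # called to hand the packet back to the stack
        self._snoopers = []       # callbacks registered by capture tools

    def attach_snooper(self, callback):
        self._snoopers.append(callback)

    def has_snoopers(self):
        return bool(self._snoopers)

    def transmit(self, packet):
        # Export a copy to every attached capture tool, then deliver.
        for callback in self._snoopers:
            callback(self.name, packet)
        self._deliver(packet)


class ToyStack:
    """Hypothetical IP stack with a short cut for local traffic."""

    def __init__(self):
        self.received = []        # packets delivered to local receivers
        self.shortcut_count = 0   # how many packets bypassed the device
        self.loopback = ToyLoopbackDevice("lo0", self.received.append)

    def send_local(self, packet):
        if self.loopback.has_snoopers():
            # Short cut disabled while snooping: go through the device.
            self.loopback.transmit(packet)
        else:
            # Normal fast path: the device never sees the packet.
            self.shortcut_count += 1
            self.received.append(packet)
```

In this model the only change to the stack is the branch in `send_local`; the export path lives entirely in the device, which is the point of the straw man.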

What I am arguing is that we should probably restrict this project
to support snooping loopback traffic. I think the -I option is
confusing and the new MAC type introduced is probably not a good
idea. I think the proposed additional observation points are only
an incomplete set for IP observability. They are also not the
expected observation points for a normal network sniffer, which
can be confusing to the user. Finally, I think the proposed snoop
hooks may add some unnecessary design constraints/consideration
and/or code redundancy when we are trying to have a better IP
observability mechanism in future.


> I don't think it would necessarily have to be used for that job, but I
> also don't see why it necessarily shouldn't be used for that job given
> that it already has the mechanisms in place for parsing and filtering
> networking data. This project wouldn't hinder a broader network
> observability mechanism even if that mechanism didn't involve snoop, or
> at least I don't see how it would.


So we agree that there is really no good reason to use snoop
to do the job? And do we really want something like the
following in, say, the tunnel code?

/*
* Pre-defined IP observability point, handle the packet
* according to the mechanism.
*/

...

/*
* Oops, we need to support the snoop capturing here
* for backward compatibility, so call the snoop hook
* also :-(
*/


--

K. Poon.
kacheong dot poon at sun dot com




seb

Posts: 2,142
From: US

Registered: 3/9/05
Re: Clearview IP-Level Observability Devices design review (due 11/8)
Posted: Nov 30, 2005 1:23 PM   in response to: kcpoon

I need to think about your previous comments some more, but wanted to
reply to this one as it is slightly tangential and I can address it
immediately.

> And do we really want something like the
> following in, say, the tunnel code?
>
> /*
> * Pre-defined IP observability point, handle the packet
> * according to the mechanism.
> */
>
> ...
>
> /*
> * Oops, we need to support the snoop capturing here
> * for backward compatibility, so call the snoop hook
> * also :-(
> */

No, but I don't see why we would get to that point. GLDv3 will provide
fully functioning DLPI devices for tunnel interfaces, as described in
the Tunnel Device Driver design document. There's nothing in the IP
Observability Devices design document mentioning any hooks in the tunnel
driver, unless I missed something...

-Seb



Copyright © 1995-2005 Sun Microsystems, Inc.