Thread: Clearview IPMP Rearchitecture: high-level design (due 9/22)


Replies: 29 - Last Post: Jul 12, 2006 7:54 AM by: carlsonj
meem

Posts: 3,045
From: US

Registered: 3/9/05
Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 8, 2005 12:57 PM



Folks,

As some of you may be aware, the Solaris Approachability team has a project
underway called "Clearview", whose charter is to rationalize, unify, and
enhance the way network interfaces are handled in Solaris at the
programmatic and administrative levels. Under the Clearview umbrella,
there are currently four components under development:

IPMP Rearchitecture
IP Tunnel Device
Vanity Naming and Nemo Unification
IP-Level Observability Devices

The first design document from this work, covering the IPMP rearchitecture,
is now available for download at:

http://opensolaris.org/os/community/networking/ipmp-highlevel-design.pdf

This document covers the proposed new IPMP architecture, what's changed
from the existing architecture, and the administrative and programmatic
impact of these changes.

If you currently use Solaris IPMP, are interested in using it, or simply
want a chance to shape the future of Solaris, we welcome your feedback.
The timer for comments is set at two weeks (September 22).

Design documents for the other Clearview components listed above will also
be made available for comment within the next few weeks. So, stay tuned!

Thanks for helping us make Solaris rock.
--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



benr

Posts: 917
From:

Registered: 4/28/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 8, 2005 6:19 PM   in response to: meem


Peter Memishian wrote:

>Folks,
>
>As some of you may be aware, the Solaris Approachability team has a project
>underway called "Clearview", whose charter is to rationalize, unify, and
>enhance the way network interfaces are handled in Solaris at the
>programmatic and administrative levels. Under the Clearview umbrella,
>there are currently four components under development:
>
> IPMP Rearchitecture
> IP Tunnel Device
> Vanity Naming and Nemo Unification
> IP-Level Observability Devices
>
>The first design document from this work, covering the IPMP rearchitecture,
>is now available for download at:
>
> http://opensolaris.org/os/community/networking/ipmp-highlevel-design.pdf
>
>This document covers the proposed new IPMP architecture, what's changed
>from the existing architecture, and the administrative and programmatic
>impact of these changes.
>
>If you currently use Solaris IPMP, are interested in using it, or simply
>want a chance to shape the future of Solaris, we welcome your feedback.
>The timer for comments is set at two weeks (September 22).
>
>Design documents for the other Clearview components listed above will also
>be made available for comment within the next few weeks. So, stay tuned!
>
>Thanks for helping us make Solaris rock.
>
>
I'm excited to see this project pop up. Thank you for letting us be
part of it!

Frankly, I'd love to see things closer to the way they were done with
PNM using separate tools. Managing IPMP with ifconfig never seemed
intuitive and was extremely confusing. I've never used IPMP and felt
really rock solid confident in the configuration. But more than that,
I've absolutely hated IPMP configuration being in hostname.(interface)
files. It just doesn't seem polished at all, is completely
non-intuitive and seems to be a rough spot in what is otherwise a very
natural scheme of network configuration files in /etc.

If we must have configuration information in hostname files I'd at least
feel better if the file was structured. Even with the changes proposed
(namely the introduction of hostname.ipmp0) I'd still like something
structured. If you sit down someone new to IPMP (or in fact many that
aren't) and ask them to take a guess at what it means they're gonna
scratch their heads. The new method suggested on page 21 is definitely
better, but by simply doing a little extra parsing you could make life oh
so much easier for admins and reduce the number of configuration mistakes.

Is the deprecated keyword going away? It never seemed like a very
descriptive keyword to me. The term to me tends to mean "broken", which
doesn't suit its use in IPMP.

IPMPstat is really really kool. As soon as I saw the example output,
literally, a smile came to my face.

I'm sad to see that IPMP control won't be part of dladm, but I
understand the argument against doing that and agree.

I'm still mulling ideas over about the implementation. Honestly, in a
perfect world I'd like :

1) a tool called "ipmpadm" that was used to create and manage the
interfaces (such as "ipmpadm create ipmp0 ce0 ce1" and "ipmpadm failover
ipmp0", etc).

2) the already described "ipmpstat" tool to have a very readable and
informative status of my IPMP links at a glance.

3) nothing but associated status information in ifconfig (groupname and
status).

4) use a centralized configuration model, such as an ipmp.conf.


These changes would make things harder from a code perspective but much,
much easier and more manageable from an administrative position, I think.

Just an idea. I'm rolling all this around trying to think of examples
of how best to do them, I came up with several for this message but
deleted them. I'm still thinkin'. :)

benr.









meem

Posts: 3,045
From: US

Registered: 3/9/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 8, 2005 8:08 PM   in response to: benr



> I'm excited to see this project pop up. Thank you for letting us be
> part of it!

we're happy to have you :-) your comments are just what we're looking for.

> Frankly, I'd love to see things closer to the way they were done with
> PNM using separate tools. Managing IPMP with ifconfig never seemed
> intuitive and was extremely confusing. I've never used IPMP and felt
> really rock solid confident in the configuration. But more than that,
> I've absolutely hated IPMP configuration being in hostname.(interface)
> files. It just doesn't seem polished at all, is completely
> non-intuitive and seems to be a rough spot in what is otherwise a very
> natural scheme of network configuration files in /etc.

i understand.

> If we must have configuration information in hostname files I'd at least
> feel better if the file was structured. Even with the changes proposed
> (namely the introduction of hostname.ipmp0) I'd still like something
> structured. If you sit down someone new to IPMP (or in fact many that
> aren't) and ask them to take a guess at what it means they're gonna
> scratch their heads. The new method suggested on page 21 is definitely
> better, but by simply doing a little extra parsing you could make life oh
> so much easier for admins and reduce the number of configuration mistakes.
>
> [ ... ]
>
> 1) a tool called "ipmpadm" that was used to create and manage the
> interfaces (such as "ipmpadm create ipmp0 ce0 ce1" and "ipmpadm failover
> ipmp0", etc).
>
> [ ... ]
>
> 4) use a centralized configuration model, such as an ipmp.conf.

i agree with all of the above. however, there are some key complicating
factors:

1. realize that we are constrained by backward compatibility here --
regardless of what the future brings, we must maintain the ability to
configure things with ifconfig(1M) and the /etc/hostname.<if> files,
at least to the extent that things could be configured prior to the
rearchitecture.

2. given that almost all of the new design maps cleanly into the
existing administrative model of ifconfig(1M) and /etc/hostname.<if>
files (however questionable that model might be ;-), it didn't seem
reasonable to me to force a new model to be used when dealing with
the ipmp group interfaces -- especially since the whole idea of this
work is to allow the group to be managed "just like" any other ip
interface.

3. further to (2), the only real new administrative control point is the
addition of the "ipmp" keyword -- and again, it didn't seem
reasonable to me to force a new tool to be used just to configure
explicit ipmp groups (that's what uses the new keyword).

4. our system configuration story is undergoing some significant change
at the moment. as you know, solaris 10 introduced smf(5), although
much of the networking universe has not yet been converted to use it.
however, this work is imminent -- that is, the sea of networking
configuration files that you are used to in /etc are in the process
of being moved into the smf repository (by other ongoing projects).
introducing new configuration files like an ipmp.conf is not
compatible with this direction.

5. further to (4), it is not yet clear what smf-based administrative
tools are on the horizon for ip interfaces or other networking
abstractions. for instance, it's conceivable that we will decide to
introduce an "ipadm" to provide configuration of all ip interfaces
(imagine a sane ifconfig), including ipmp interfaces. in that
scenario, "ipmpadm" would prove to be too narrow an administrative
tool.

all that said, i am certainly not opposed to the idea of an "ipmpadm".
however, i want to make sure we introduce it at the right time, and given
the background i've provided above, i don't think this is that time. once
the rearchitecture as specified is completed, adding ipmpadm should prove
straightforward if it is indeed the right thing to do. however, without
the rearchitecture, implementing ipmpadm (and ipmpstat!) would be a mess.

> Is the deprecated keyword going away? It never seemed like a very
> descriptive keyword to me. The term to me tends to mean "broken", which
> doesn't suit its use in IPMP.

the "deprecated" keyword will remain since it is needed for ipv6 (e.g.,
see rfc 2462). however, an administrator setting up an ipmp configuration
will no longer have to think about "deprecated".

> IPMPstat is really really kool. As soon as I saw the example output,
> literally, a smile came to my face.

i am happy to hear that :-)

> 3) nothing but associated status information in ifconfig (groupname and
> status).

sadly, backwards compatibility prevents us from making significant changes
to the ifconfig(1M) output. (its output is often parsed because no better
tool was provided.)

> Just an idea. I'm rolling all this around trying to think of examples
> of how best to do them, I came up with several for this message but
> deleted them. I'm still thinkin'. :)

keep 'em coming.

--
meem



dminer

Posts: 1,992
From: US

Registered: 3/9/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 9, 2005 7:36 AM   in response to: meem


Just to add a little to one thing meem wrote:

> 4. our system configuration story is undergoing some significant change
> at the moment. as you know, solaris 10 introduced smf(5), although
> much of the networking universe has not yet been converted to use it.
> however, this work is imminent -- that is, the sea of networking
> configuration files that you are used to in /etc are in the process
> of being moved into the smf repository (by other ongoing projects).
> introducing new configuration files like an ipmp.conf is not
> compatible with this direction.
>

Those "other ongoing projects" are in their very early stages right now
- not as far along as Clearview. We will be putting out more
information on them in the near future, so they'll be as open for review
and input (and perhaps participation!) as what you're seeing here.

Dave



astade2

Posts: 30
From:

Registered: 9/9/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 9, 2005 11:37 AM   in response to: meem


Sorry if this has already been answered, but will IPMP become a little more like CARP (OpenBSD)? It seems as if sending gratuitous ARP messages is a bit antiquated in many cases. If 2 NICs could share the same MAC address and fail over seamlessly, there wouldn't be much need for gratuitous ARP.

Thanks!

nordmark

Posts: 619
From: US

Registered: 3/9/05
Re: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 12, 2005 9:08 AM   in response to: astade2


Alex wrote:
> Sorry if this has already been answered, but will IPMP become a
> little more like CARP (OpenBSD)? It seems as if sending gratuitous
> ARP messages is a bit antiquated in many cases. If 2 NICs could share
> the same MAC address and fail over seamlessly, there wouldn't be much
> need for gratuitous ARP.

Alex,

CARP, as I understand it, is for providing failover between machines, not
between multiple NICs on the same machine. I guess one could run this
between multiple NICs on the same box, but it seems like overkill in
that case.

If two NICs share the same MAC address you couldn't do inbound load
spreading (e.g. by having pop.example.com point at one IP address on one
interface, and smtp.example.com point at the other IP/interface).

But even if you don't want inbound load spreading and go with changing
which NIC is using the single MAC address, you still need to send a
packet when the MAC address moves. This is necessary so that learning
bridges/switches can detect that the MAC address has moved to a
different place (either a different port on the same switch, or a
different Ethernet switch).
Sending a broadcast for this is better than a unicast, because the
broadcast ends up updating the learning tables for all the switches in
the LAN.
FWIW: the Linux nic bonding driver, which does move the MAC address
AFAIK, still sends ARP packets after movement for this reason.
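
To make the point about learning switches concrete, here is a minimal Python sketch of the kind of broadcast frame being described: a gratuitous ARP request announcing that an IP address is reachable via a given MAC. The function name, the example MAC, and the example IP address are illustrative assumptions; the field layout follows the standard Ethernet/ARP format and is not taken from any IPMP or bonding code.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6

def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build a broadcast gratuitous ARP request for (mac, ip)."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = struct.pack("!6s6sH", BROADCAST_MAC, mac, 0x0806)
    # ARP payload: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4, 1,
        mac, ip_bytes,          # sender hardware / protocol address
        b"\x00" * 6, ip_bytes,  # target hardware unset; target IP == sender IP
    )
    return eth_header + arp_payload

# Hypothetical example values (192.0.2.0/24 is a documentation range).
frame = build_gratuitous_arp(bytes.fromhex("0003ba112233"), "192.0.2.10")
```

Because the destination is the broadcast address and the source is the MAC that moved, every switch that floods the frame sees that source MAC arrive on a new port and can update its learning table, which is the effect described above.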

Erik



relling

Posts: 1,859
From: US

Registered: 6/17/05
Re: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 20, 2005 3:29 PM   in response to: nordmark


To further expand on Erik's comments, prior to Sun Cluster 3.1 we
used Network Address Failover (NAFO) to provide redundant public
net connections. NAFO did migrate the MAC address. It also was
an unending source of complaints, confusion, and general network
mayhem. We are greatly relieved to be rid of NAFO in Sun Cluster 3.1,
replacing it with IPMP (and its smaller set of quirks :-)

-- richard

David.Edmondson...
re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 14, 2005 4:21 AM   in response to: meem


General:
* Excellent document!
* I tried to review this using only "public" information (i.e. not
going to look at clearview.east), mostly as an exercise to see what
might be missing to a non-Sun reviewer. On the whole I think that
there is enough detail about the related projects (pfhooks,
Clearview observability, etc.) included in the document to make that
possible. (This is good.)

Page 2:
* Replace "configurations use significant number" with "configurations
use a significant number"

Page 5:
* Are there situations where it might be useful to have an interface
be a member of multiple IPMP groups? In particular where the
interface in question is STANDBY for a couple of groups. It would
mean a bunch of changes and I'm not clear if it's useful, but the
thought occurred to me anyway.

Page 7:
* Missing "." after "several undocumented ioctl operations".

Page 9:
* Seems that there is a "!" missing in "FAILED will now always imply
RUNNING" (should be "imply !RUNNING").

Page 28:
* What is IFF_NOFAILOVER actually used for now? It seems that:
IFF_NOFAILOVER = (group != NULL) & !IFF_IPMP
Hmm - perhaps it relates to the backward compatibility aspects of
allowing addresses to be added to the component links?

Page 30:
* Replace "Further, when an interface placed" with "Further, when an
interface is placed".

* The text indicates that RTM_DELADDR will be sent for the
non-failover addresses when an interface is added to a group.
Presumably RTM_NEWADDR messages will also be sent when the addresses
are added to the IPMP interface? (This was mostly a case of being
puzzled why one part of the process is mentioned but not the other.)

Page 32: Logical Interfaces.
* If non-failover addresses are added to underlying interfaces, they
will be moved to the IPMP interface, yes?

Page 33:
* Replace "The ability associate specific ARP" with "The ability to
associate specific ARP".

dme.
--
David Edmondson, Solaris Engineering, Sun Microsystems.




meem

Posts: 3,045
From: US

Registered: 3/9/05
re: re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 14, 2005 8:43 AM   in response to: David.Edmondson...



david,

first, glad to have you back on board! :-) second, thanks for reviewing
this; my responses are inline below. i've also updated the document to
take your eagle-eye corrections into account (the link in my original post
will now download the new revision).

[ to keep the response short, i've elided your copy-edits -- they are all
accepted (and quite appreciated!) ]

> * Excellent document!

thanks!

> * I tried to review this using only "public" information (i.e. not
> going to look at clearview.east), mostly as an exercise to see what
> might be missing to a non-Sun reviewer. On the whole I think that
> there is enough detail about the related projects (pfhooks,
> Clearview observability, etc.) included in the document to make that
> possible. (This is good.)

this was my hope -- good to hear. however, long-term, we need to do
better, either by "opening up" i-team wikis like clearview.east, or by
providing an external way to access the links referenced in our documents.

our apologies to the community as a whole -- there are clearly some rough
edges on this new process.

> Page 5:
> * Are there situations where it might be useful to have an interface
> be a member of multiple IPMP groups? In particular where the
> interface in question is STANDBY for a couple of groups. It would
> mean a bunch of changes and I'm not clear if it's useful, but the
> thought occurred to me anyway.

offhand, i can't see a benefit over placing everything together into a
single ipmp group (which would have to be semantically possible, given
that the STANDBY interface in question would have to be on the same link
as both of the groups it belonged to). moreover, for multiple ipmp groups
to make sense, one would first have to support having multiple interfaces
on the same link but not part of the same ipmp group.

> Page 9:
> * Seems that there is a "!" missing in "FAILED will now always imply
> RUNNING" (should be "imply !RUNNING").

the original latex source has "~RUNNING", but it appears the "~" got lost
somewhere along the way. hmph! i've changed it to "!RUNNING".

> Page 28:
> * What is IFF_NOFAILOVER actually used for now? It seems that:
> IFF_NOFAILOVER = (group != NULL) & !IFF_IPMP
> Hmm - perhaps it relates to the backward compatibility aspects of
> allowing addresses to be added to the component links?

correct. the kernel will key address "migration" on IFF_NOFAILOVER. keep
in mind that IFF_NOFAILOVER is an address property, not an interface
property. when the address is eventually brought IFF_UP, the kernel can
examine IFF_NOFAILOVER to know whether to migrate the address to the ipmp
group interface. so, the IFF_NOFAILOVER equation you provide above is not
a definition, but rather an invariant that is maintained by the kernel for
all addresses that are IFF_UP.
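
The invariant as stated in this exchange can be written out as a small check. This is only a sketch in Python: the flag values and the `Address` record are hypothetical stand-ins chosen for illustration, not the Solaris definitions, and the split between per-address and per-interface flags is an assumption made to keep the example readable.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical flag bits for illustration only; not the Solaris values.
IFF_UP = 0x1
IFF_NOFAILOVER = 0x2
IFF_IPMP = 0x4

@dataclass
class Address:
    flags: int            # per-address flags (IFF_UP, IFF_NOFAILOVER)
    if_flags: int         # flags of the hosting interface (IFF_IPMP)
    group: Optional[str]  # IPMP group of the hosting interface, if any

def invariant_holds(addr: Address) -> bool:
    """Check IFF_NOFAILOVER == (group != NULL) & !IFF_IPMP for up addresses."""
    if not addr.flags & IFF_UP:
        return True  # the invariant only constrains addresses that are up
    nofailover = bool(addr.flags & IFF_NOFAILOVER)
    on_underlying = addr.group is not None and not (addr.if_flags & IFF_IPMP)
    return nofailover == on_underlying
```

Under these assumptions, an up address without IFF_NOFAILOVER that sits on an underlying interface of a group fails the check, which corresponds to the case where the address would be migrated to the IPMP group interface when it is brought up.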

should i clarify this in the document? i'm not sure if anyone else will
even notice this subtlety, so i fear i may introduce more confusion by
bringing it up.

> * The text indicates that RTM_DELADDR will be sent for the
> non-failover addresses when an interface is added to a group.
> Presumably RTM_NEWADDR messages will also be sent when the addresses
> are added to the IPMP interface? (This was mostly a case of being
> puzzled why one part of the process is mentioned but not the other.)

no, because non-failover addresses do not exist on ipmp interfaces, and
moreover, we do not want existing applications to use them. from the
perspective of existing applications, the system does not "have" test
addresses -- they are entirely invisible. the symmetric RTM_NEWADDR
happens if the underlying interface is later removed from the ipmp group.

do you think some clarification in the document would help?

> Page 32: Logical Interfaces.
> * If non-failover addresses are added to underlying interfaces, they
> will be moved to the IPMP interface, yes?

not when they are added -- when they are brought up. i've added the same
note as in section 5.8 to clarify.

thanks again!
--
meem



David.Edmondson...
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 14, 2005 8:59 AM   in response to: meem


* Peter dot Memishian at Sun dot COM [2005-09-14 16:44:57]
> first, glad to have you back on board! :-)

Thanks!

> > * I tried to review this using only "public" information
> > (i.e. not going to look at clearview.east), mostly as an
> > exercise to see what might be missing to a non-Sun reviewer.
> > On the whole I think that there is enough detail about the
> > related projects (pfhooks, Clearview observability, etc.)
> > included in the document to make that possible. (This is
> > good.)
>
> this was my hope -- good to hear. however, long-term, we need to do
> better, either by "opening up" i-team wikis like clearview.east, or
> by providing an external way to access the links referenced in our
> documents.

Switching many more things to be externally available or externally
developed (I'd prefer the latter) is obviously a requirement moving
forward, but it will take some time to get that all going. Stephen
Hahn's recent email suggests this is understood and work is underway.

> > Page 5:
> > * Are there situations where it might be useful to have an
> > interface be a member of multiple IPMP groups? In particular
> > where the interface in question is STANDBY for a couple of
> > groups. It would mean a bunch of changes and I'm not clear if
> > it's useful, but the thought occurred to me anyway.
>
> offhand, i can't see a benefit over placing everything together into
> a single ipmp group (which would have to be semantically possible,
> given that the STANDBY interface in question would have to be on the
> same link as both of the groups it belonged to).

The only thing that has come to me so far is that it could perhaps be
useful if there was some type of traffic partitioning (which could
imply traffic limiting) taking place. I might have three NICs and two
principal traffic generator/consumer applications. My fabric, whilst
a single layer 3 network, is arranged such that the two applications
are split using IP addressing and each application "owns" a NIC. The
third NIC might be used for failover for both of the first two.

It seems overly convoluted.

> > Page 28:
> > * What is IFF_NOFAILOVER actually used for now? It seems that:
> > IFF_NOFAILOVER = (group != NULL) & !IFF_IPMP
> > Hmm - perhaps it relates to the backward compatibility aspects of
> > allowing addresses to be added to the component links?
> [...]
> should i clarify this in the document? i'm not sure if anyone else will
> even notice this subtlety, so i fear i may introduce more confusion by
> bringing it up.

The document is fine I think. I guess if the backward compatibility
requirement was different we might nuke the flag.

> > * The text indicates that RTM_DELADDR will be sent for the
> > non-failover addresses when an interface is added to a group.
> > Presumably RTM_NEWADDR messages will also be sent when the addresses
> > are added to the IPMP interface? (This was mostly a case of being
> > puzzled why one part of the process is mentioned but not the other.)
>
> no, because non-failover addresses do not exist on ipmp interfaces, and
> moreover, we do not want existing applications to use them. from the
> perspective of existing applications, the system does not "have" test
> addresses -- they are entirely invisible. the symmetric RTM_NEWADDR
> happens if the underlying interface is later removed from the ipmp group.
>
> do you think some clarification in the document would help?

I flipped the bits in my mind when reading the text, sorry. The text
is fine. But...

> > Page 32: Logical Interfaces.
> > * If non-failover addresses are added to underlying interfaces, they
> > will be moved to the IPMP interface, yes?
>
> not when they are added -- when they are brought up. i've added the same
> note as in section 5.8 to clarify.

...it seems that the confusion spread. It's actually non-non-failover
addresses (i.e. normal ones) that will be moved when they come up :-)

There's something a little odd about having addresses that are up or
down, though it is a reasonably natural consequence of the "logical
interface" approach. It often seemed that the BSD "lots of addresses
added to an interface" approach was easier, but that would preclude
some of the things you describe here (admittedly, some of them relate
to things that are strange).

dme.
--
David Edmondson, Solaris Engineering, Sun Microsystems.




meem

Posts: 3,045
From: US

Registered: 3/9/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 14, 2005 9:18 AM   in response to: David.Edmondson...



> Switching many more things to be externally available or externally
> developed (I'd prefer the latter) is obviously a requirement moving
> forward, but it will take some time to get that all going. Stephen
> Hahn's recent email suggests this is understood and work is underway.

indeed.

> > offhand, i can't see a benefit over placing everything together into
> > a single ipmp group (which would have to be semantically possible,
> > given that the STANDBY interface in question would have to be on the
> > same link as both of the groups it belonged to).
>
> The only thing that has come to me so far is that it could perhaps be
> useful if there was some type of traffic partitioning (which could
> imply traffic limiting) taking place. I might have three NICs and two
> principal traffic generator/consumer applications. My fabric, whilst
> a single layer 3 network, is arranged such that the two applications
> are split using IP addressing and each application "owns" a NIC. The
> third NIC might be used for failover for both of the first two.
>
> It seems overly convoluted.

agreed -- though it's possible we will need to support something like this
in the future, i see no urgent need.

> > > Page 32: Logical Interfaces.
> > > * If non-failover addresses are added to underlying interfaces, they
> > > will be moved to the IPMP interface, yes?
> >
> > not when they are added -- when they are brought up. i've added the same
> > note as in section 5.8 to clarify.
>
> ...it seems that the confusion spread. It's actually non-non-failover
> addresses (i.e. normal ones) that will be moved when they come up :-)

the text states "any logical interfaces that are not marked IFF_NOFAILOVER
will be brought up on the appropriate IPMP group interface". that seems
correct to me -- do you just mean that the wording is confusing?
unfortunately, the flag's sense forces use of a double-negative.

> There's something a little odd about having addresses that are up or
> down, though it is a reasonably natural consequence of the "logical
> interface" approach.

indeed -- but that's a longstanding bit of solaris behavior. e.g.,
ifconfig(1M) currently states:

down

Mark a logical interface as "down". (That is, turn off
the IFF_UP bit.) When a logical interface is marked
"down," the system does not attempt to use the address
assigned to that interface as a source address for out-
bound packets and will not recognize inbound packets
destined to that address as being addressed to this
host.

i agree this is confusing and that few understand the subtlety that "up"
and "down" apply to addresses rather than interfaces.

> It often seemed that the BSD "lots of addresses added to an interface"
> approach was easier, but that would preclude some of the things you
> describe here (admittedly, some of them relate to things that are
> strange).

if there wasn't backward compatibility to consider, i think the bsd model
would still work smoothly with the proposed architecture. that is, the
idea of migrating during IFF_UP is only necessary for solaris backward
compatibility.

--
meem



dme

Posts: 62
From:

Registered: 6/10/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 14, 2005 9:31 AM   in response to: meem


* Peter dot Memishian at Sun dot COM [2005-09-14 17:21:54]
> > > > Page 32: Logical Interfaces.
> > > > * If non-failover addresses are added to underlying interfaces, they
> > > > will be moved to the IPMP interface, yes?
> > >
> > > not when they are added -- when they are brought up. i've added the same
> > > note as in section 5.8 to clarify.
> >
> > ...it seems that the confusion spread. It's actually non-non-failover
> > addresses (i.e. normal ones) that will be moved when they come up :-)
>
> the text states "any logical interfaces that are not marked IFF_NOFAILOVER
> will be brought up on the appropriate IPMP group interface". that seems
> correct to me -- do you just mean that the wording is confusing?

I've caused confusion, sorry.

I wrote "If non-failover addresses are added...will be moved" and you
clarified that it's when they are marked UP. In fact it's actually
normal (non-non-failover) addresses that are moved, so my original
question was incorrectly phrased.

The document is correct, there's no need to change.

dme.
--
David Edmondson, Solaris Engineering, Sun Microsystems.




carlsonj

Posts: 6,810
From: US

Registered: 3/9/05
re: re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Sep 14, 2005 9:01 AM   in response to: meem


Peter Memishian writes:
> the original latex source has "~RUNNING", but it appears the "~" got lost
> somewhere along the way. hmph! i've changed it to "!RUNNING".

Use either:

\tilde{1}

or put the expression in verbatim:

\verb@~RUNNING@

The latter's probably better, as this is a bit of code.

--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



hema

Posts: 8
From: Sydney Australia

Registered: 10/4/05
Re: Clearview IPMP Rearchitecture: high-level
Posted: Oct 4, 2005 9:12 PM   in response to: meem


Overall, it is a well written document.

Just wondering about the need to have both kstat and ipmpstat for the IPMP interface. Wouldn't it be better to have just one tool/utility to collect all the statistics?

--Hema

meem

Posts: 3,045
From: US

Registered: 3/9/05
re: Re: Clearview IPMP Rearchitecture: high-level
Posted: Oct 5, 2005 7:26 AM   in response to: hema



> Overall, it is a well written document.
>
> Just wondering the need to have both kstat and ipmpstat for ipmp
> interface. Wouldn't it be better to have just one tool/utility to
> collect all the statistics..

the "stat" in ipmpstat stands for "status", not "statistics" ("stat" as in
lpstat, not as in kstat), and as such there are no statistics reported by
ipmpstat. however, the ipmp interface will have kstats (see section 4.8)
which will be able to be viewed with kstat, netstat, and other existing
tools.

--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



hema

Posts: 8
From: Sydney Australia

Registered: 10/4/05
Re: re: Re: Clearview IPMP Rearchitecture: high-level
Posted: Oct 6, 2005 6:09 PM   in response to: meem


Additional comments:

Sec 3.1

If we have an IPMP group configured to use *link based failure detection* (i.e., no test IP addresses), then what will the proposed new IPMP group look like? Will the data IP addresses be assigned to the IPMP group interface or the physical interface?

Sec 4.6

Is it required to plumb the IPv6 "link-local" address on the IPMP group interface if we are using link based probe detection ?

Sec 4.7.

Since the IPMP group interface is not a physical interface but rather a group interface, what would link_up and link_status mean? Will they be set to 1 or UP or Running so long as at least one of the interfaces in the group is up?

hema

Posts: 8
From: Sydney Australia

Registered: 10/4/05
Re: re: Re: Clearview IPMP Rearchitecture: high-level
Posted: Oct 6, 2005 6:24 PM   in response to: hema


Sec 4.6
>
> Is it required to plumb the IPv6 "link-local"
> address on the IPMP group interface if we are using
> link based probe detection ?

Please read this as --- link based failure detection.

meem

Posts: 3,045
From: US

Registered: 3/9/05
re: Re: re: Re: Clearview IPMP Rearchitecture: high-level
Posted: Oct 6, 2005 7:31 PM   in response to: hema



> Additional comments:
>
> Sec 3.1
>
> If we have an ipmp group configured to use *link based failure
> detection* (i.e., no test ip addresses), then how will the proposed new
> ipmp group look like ? Will the data ip addresses be assigned to the
> ipmp group interface or the physical interface ?

Data addresses are *always* assigned to the IPMP interface in the new
model. If you do not configure any test addresses for an interface, then
the physical interface will simply have a placeholder address of 0.0.0.0.

I will see if I can make this more explicit without making the discussion
too complicated.

> Sec 4.6
>
> Is it required to plumb the IPv6 "link-local" address on the IPMP group
> interface if we are using link based probe detection ?

The link-local addresses on the IPMP group interface are not used for
probe traffic. It would be possible to omit configuration of the
link-local addresses on the underlying interfaces, but I'm not sure I see
the benefit -- currently, probe-based failure detection is always enabled
with IPv6 because link-local address configuration is automatic when an
interface is plumbed.

> Sec 4.7.
>
> Since ipmp group interface is not a physical address but, rather a
> group interface, what would link_up and link_status mean ? Will it be
> set to 1 or UP or Running so long as at least one of the interfaces in
> the group is up.

Right. I will make this explicit in the document.
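
A minimal sketch of that rule in Python, assuming the group's link_up value is simply the logical OR of the underlying interfaces' link states. The function name and the interface names are hypothetical and do not come from the design document:

```python
def group_link_up(underlying_link_up):
    """Return 1 if at least one underlying interface reports link up, else 0.

    `underlying_link_up` maps a (hypothetical) interface name to that
    interface's own link_up value (1 or 0).
    """
    return 1 if any(underlying_link_up.values()) else 0


if __name__ == "__main__":
    print(group_link_up({"ce0": 1, "ce1": 0}))  # one link still up
    print(group_link_up({"ce0": 0, "ce1": 0}))  # every link down
```

Under this assumption, the group only reports link down once every underlying interface has lost link.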

Thanks!
--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



hema

Posts: 8
From: Sydney Australia

Registered: 10/4/05
Re: re: Re: re: Re: Clearview IPMP Rearchitecture:
Posted: Oct 6, 2005 8:06 PM   in response to: meem


> > Sec 4.6
> The link-local addresses on the IPMP group interface are not used for
> probe traffic. It would be possible to omit [...] currently, probe-based
> failure detection is always enabled with IPv6 because link-local address
> configuration is automatic when an interface is plumbed.
>

Would it be possible to re-architect this so that we have an option to omit probe traffic in an IPMP configuration with IPv6 addresses as well?



meem

Posts: 3,045
From: US

Registered: 3/9/05
re: Re: re: Re: re: Re: Clearview IPMP Rearchitecture:
Posted: Oct 6, 2005 9:01 PM   in response to: hema



> > > Sec 4.6
> >
> > The link-local addresses on the IPMP group interface are not used for
> > probe traffic. It would be possible to omit [...] currently, probe-based
> > failure detection is always enabled with IPv6 because link-local address
> > configuration is automatic when an interface is plumbed.
>
> Would it be possible to re-architect this so, we have an option to omit
> probe traffic in an ipmp configuration with IPv6 addresses as well.

Is the goal to avoid the autoconfiguration of the link-local address, or
to avoid use of the address as a test address? It should be possible to
do the latter by simply marking the address down with ifconfig, or by
putting this in the /etc/hostname6.<if> file:

group foo
inet6 down

--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



hema

Posts: 8
From: Sydney Australia

Registered: 10/4/05
Re: re: Re: re: Re: re: Re: Clearview IPMP Rearchitecture:
Posted: Oct 9, 2005 12:30 AM   in response to: meem


> > > > Sec 4.6
> > >
> > > The link-local addresses on the IPMP group interface are not used for
> > > probe traffic. It would be possible to omit [...] currently, probe-based
> > > failure detection is always enabled with IPv6 because link-local address
> > > configuration is automatic when an interface is plumbed.
> >
> > Would it be possible to re-architect this so, we have an option to omit
> > probe traffic in an ipmp configuration with IPv6 addresses as well.
>
> Is the goal to avoid the autoconfiguration of the link-local address, or
> to avoid use of the address as a test address? It should be possible to
> do the latter by simply marking the address down with ifconfig, or by
> putting this in the /etc/hostname6.<if> file:
>
> group foo
> inet6 down
>

I just wanted to get rid of probe/icmp traffic in an IPv6 configuration, should the underlying driver support link up/down notifications.

--Hema


meem

Posts: 3,045
From: US

Registered: 3/9/05
re: Re: re: Re: re: Re: re: Re: Clearview IPMP Rearchitecture:
Posted: Oct 9, 2005 4:58 PM   in response to: hema



> > Is the goal to avoid the autoconfiguration of the
> > link-local address, or
> > to avoid use of the address as a test address? It
> > should be possible to
> > do the latter by simply marking the address down with
> > ifconfig, or by
> > putting this in the /etc/hostname6.<if> file:
> >
> > group foo
> > inet6 down
> >
>
> I just wanted to get rid of probe/icmp traffic in an IPv6
> configuration, should the underlying driver support link up/down
> notifications.

Then the above will work. I agree that we should provide a more intuitive
way to do it, but I think that will need to wait until new administrative
IPMP utilities are added (see my earlier comments to Ben Rockwood
regarding ipmpadm).

--
meem

_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



hema

Posts: 8
From: Sydney Australia

Registered: 10/4/05
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Oct 9, 2005 4:54 AM   in response to: meem


Sec 4.7
Just wondering about the need to have both link_up and link_status in kstat; either of the two should be sufficient. I am not sure if having both will confuse network monitoring agents like SNMP.

Sec 3.5
This section on source address selection is beautifully written. I really appreciate that.

meem

Posts: 3,045
From: US

Registered: 3/9/05
re: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Oct 9, 2005 6:18 PM   in response to: hema



> Just wondering the need to have both link_up and link_status in kstat,
> either of the two should be sufficient. I am not sure if having both
> will confuse network monitoring agents like snmp..

It seems the "standard" kstat is link_up; I'm fine with just supporting
that one.

> Sec 3.5
> This section on source address selection is beautifully written. I
> really appreciate that.

Thanks!

--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



nelsong

Posts: 1
From: Montgomery Alabama

Registered: 7/10/06
Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Jul 10, 2006 8:11 AM   in response to: meem


I have a backup server for which I need to create a bigger network pipe for data coming into the box. How can I do that? It is my understanding that IPMP only allows for outgoing data and failover. Is this something that would be set up on the switch side, or is it even possible? I would need to do this with Solaris 8, 9, and 10 environments. Thanks in advance for any responses.

meem

Posts: 3,045
From: US

Registered: 3/9/05
re: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Jul 10, 2006 10:41 AM   in response to: nelsong



> I have a backup server that I need to create a bigger network pipe for
> data coming into the box, how can I do that? It is my understanding
> IPMP only allows for outgoing data and failover. Is this something
> that would be set up on the switch side or is it even possible? I
> would need to do this with Solaris 8, 9, and 10 environments. Thanks
> in advance for any responses.

Inbound load-spreading is usually done in concert with DNS. For instance,
if you have two interfaces and two IP addresses on the same subnet, you
could use IPMP to assign an address to each interface, and then have DNS
round-robin through the two addresses. You can of course scale this up
with more addresses and more interfaces.

Note that in the current implementation, you need to explicitly spread the
IP addresses across the underlying interfaces, by configuring an address
on each one. After the rearchitecture, the kernel will do that for you,
and you will be able to examine the distribution with ipmpstat.
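
As a rough illustration of the DNS round-robin idea, here is a small Python simulation. It assumes a name server that hands out the group's data addresses in strict rotation and clients that each use the single address they receive; the addresses (from the 192.0.2.0/24 documentation range) and the function names are hypothetical:

```python
from collections import Counter
from itertools import cycle

# Hypothetical data addresses, one per underlying interface.
DATA_ADDRESSES = ["192.0.2.10", "192.0.2.11"]


def round_robin_answers(addresses, lookups):
    """Simulate a name server that hands out `addresses` in rotation."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(lookups)]


def spread(addresses, lookups):
    """Count how many client lookups land on each address."""
    return Counter(round_robin_answers(addresses, lookups))


if __name__ == "__main__":
    for address, count in sorted(spread(DATA_ADDRESSES, 10).items()):
        print(address, count)
```

Real resolvers cache answers and real name servers may rotate differently, so the split seen in practice is usually less even than in this sketch.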

Please let me know if you need more details.
--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



De Mena, Ron
ron.demena@eds.com
RE: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Jul 10, 2006 10:47 AM   in response to: meem


Interesting request... Might I ask if Link Aggregations can be IPMP'd?
While this topic is active.

Currently link aggregation can not be performed between two switches.
If we can link aggregate 2 say even 4 ports together on a switch and
IPMP the aggregation to another 2 or 4 ports... (IE. Aggregate one quad
card and IPMP it with another aggregated quad card?)

Might solve this problem and make much more possible.

-- Ron

-----Original Message-----
From: networking-discuss-bounces at opensolaris dot org
[mailto:networking-discuss-bounces at opensolaris dot org] On Behalf Of Peter
Memishian
Sent: Monday, July 10, 2006 1:41 PM
To: Greg Nelson
Cc: networking-discuss at opensolaris dot org
Subject: re: [networking-discuss] Re: Clearview IPMP Rearchitecture:
high-level design (due 9/22)


> I have a backup server that I need to create a bigger network pipe for
> data coming into the box, how can I do that? It is my understanding
> IPMP only allows for outgoing data and failover. Is this something
> that would be set up on the switch side or is it even possible? I
> would need to do this with Solaris 8, 9, and 10 environments. Thanks
> in advance for any responses.

Inbound load-spreading is usually done in concert with DNS. For
instance, if you have two interfaces and two IP addresses on the same
subnet, you could use IPMP to assign an address to each interface, and
then have DNS round-robin through the two addresses. You can of course
scale this up with more addresses and more interfaces.

Note that in the current implementation, you need to explicitly spread
the IP addresses across the underlying interfaces, by configuring an
address on each one. After the rearchitecture, the kernel will do that
for you, and you will be able to examine the distribution with ipmpstat.

Please let me know if you need more details.
--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



droux

Posts: 352
From: Menlo Park, CA

Registered: 5/23/05
Re: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Jul 10, 2006 11:11 AM   in response to: De Mena, Ron


De Mena, Ron wrote:
> Interesting request... Might I ask if Link Aggregations can be IPMP'd?
> While this topic is active.
>
> Currently link aggregation can not be performed between two switches.
> If we can link aggregate 2 say even 4 ports together on a switch and
> IPMP the aggregation to another 2 or 4 ports... (IE. Aggregate one quad
> card and IPMP it with another aggregated quad card?)
>
> Might solve this problem and make much more possible.

Yes, that's a valid way of combining link aggregations and IPMP.

Nicolas.

--
Nicolas Droux, Solaris Kernel Networking
Sun Microsystems, Inc. http://blogs.sun.com/droux

_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



meem

Posts: 3,045
From: US

Registered: 3/9/05
RE: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Jul 10, 2006 11:12 AM   in response to: De Mena, Ron



> Interesting request... Might I ask if Link Aggregations can be IPMP'd?
> While this topic is active.

Yes, that is supported, and I'm aware of a number of sites that do this.

> Currently link aggregation can not be performed between two switches.
> If we can link aggregate 2 say even 4 ports together on a switch and
> IPMP the aggregation to another 2 or 4 ports... (IE. Aggregate one quad
> card and IPMP it with another aggregated quad card?)

Yes, you can do this.

> Might solve this problem and make much more possible.

I don't believe such a solution is necessary to solve this problem. DNS
round-robin over the set of IP data addresses associated with the IPMP
group should be sufficient.

--
meem
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



carlsonj

Posts: 6,810
From: US

Registered: 3/9/05
Re: Re: Clearview IPMP Rearchitecture: high-level design (due 9/22)
Posted: Jul 12, 2006 7:54 AM   in response to: nelsong


Greg Nelson writes:
> I have a backup server that I need to create a bigger network pipe
> for data coming into the box, how can I do that? It is my
> understanding IPMP only allows for outgoing data and failover. Is
> this something that would be set up on the switch side or is it even
> possible? I would need to do this with Solaris 8, 9, and 10
> environments. Thanks in advance for any responses.

Is this with one TCP connection or many connections?

If it's with many connections, then IPMP will work for inbound load
spreading as well. You need to set up multiple data addresses on the
system and may need to look into your name service infrastructure as
well (setting up multiple DNS A records and round-robin resolution
will help).

If it's with a single TCP connection, you're sunk. There's no decent
Ethernet-like solution that will do this. (RFC 1990 MP does it for
PPP, at the cost of setting the latency to max(all-links), but I know
of no solution for regular LAN-type hardware.)
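
A back-of-the-envelope model of the distinction above, written as a Python sketch. It assumes each TCP connection is pinned to exactly one link (one data address) and that connections are spread as evenly as possible; the function name and the link speeds are hypothetical:

```python
def inbound_capacity(connections, link_mbps):
    """Best-case aggregate inbound rate in Mbit/s.

    Assumes each TCP connection uses exactly one link, so at most
    min(connections, number of links) links can carry traffic at once.
    """
    busy_links = min(connections, len(link_mbps))
    # Best case: the busy links are the fastest ones.
    return sum(sorted(link_mbps, reverse=True)[:busy_links])


if __name__ == "__main__":
    links = [1000, 1000]  # two hypothetical 1 Gbit/s interfaces
    print(inbound_capacity(1, links))  # single connection: one link's worth
    print(inbound_capacity(4, links))  # many connections: both links usable
```

The model ignores protocol overhead and uneven spreading; it only shows why adding links does not speed up a single connection.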

--
James Carlson, KISS Network <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org






Copyright © 1995-2005 Sun Microsystems, Inc.