Clearview: Network Interface Coherence

1. Introduction
---------------

This is an umbrella case representing all of the features to be
introduced by the Clearview project. Each of these features, or
components, will be separately delivered and will be represented by
separate PSARC cases under this umbrella. This case has no
deliverables, and simply gives an overview of the project and ties all
of its components together.

1.1. Terminology
----------------

DLPI
    The Data Link Provider Interface. An Open Group standard that
    specifies an interface for accessing data link service providers.
    In Solaris, network-layer protocols such as IP, and network
    observability tools such as snoop, access data links by using the
    DLPI interfaces provided by data link devices.

GLDv3
    The Generic LAN Driver version 3 (PSARC/2004/571, a.k.a. Nemo).
    A set of kernel modules that implement network link layer
    facilities for network drivers. These facilities include the
    implementation of DLPI interfaces, management of DLPI devices,
    and the implementation of link layer features such as VLANs and
    link aggregations. GLDv3 devices are managed using the dladm(1M)
    command. GLDv3 also provides performance features not present in
    GLDv2. GLDv3 is not compatible with GLDv2.

IPMP
    IP network multipathing (PSARC/1999/225). A popular
    Solaris-specific multipathing technology operating at the IP
    layer. IPMP insulates IP-based applications from changes to
    underlying networking hardware and the system's connectivity to
    the network as a whole. IPMP also implements a form of load
    spreading amongst a set of network interfaces.

Link Aggregation
    802.3ad Link Aggregation. An IEEE standard for combining multiple
    Ethernet ports into one virtual interface. This is also known as
    trunking.

Nemo
    See GLDv3.

VLAN
    802.1q Virtual Local Area Network. An IEEE standard that defines
    a method of tagging Ethernet frames to allow Ethernet topologies
    to be broken up into virtual links.

2. Clearview Overview
---------------------

Clearview is a project that unifies the set of features implemented by
various network interfaces on Solaris, with the goal of simplifying
the way that administrators and networking-related software running on
Solaris handle those interfaces. The idea is that, for administrators
to configure networking on Solaris more easily, all network interfaces
of the same type should support the same set of features. An
additional goal of the project is to let administrators name their
network interfaces any way they choose, easing the burden associated
with DR, network configuration migration, and system failure recovery.

It is difficult for customers to understand the disparity of features
offered by various networking drivers in Solaris. In many cases,
drivers are implemented by projects that have specific customers and
goals in mind, each differing from the other. The end result is a soup
of drivers that have little in common, leaving customers confused
about which features are available on which platforms or drivers. For
example, customers wonder why:

 * Ethernet VLANs only work with some Solaris Ethernet drivers and not
   others.

 * Some network interfaces can be aggregated into 802.3ad link
   aggregations but not others.

 * One can observe network traffic on some network interfaces using
   snoop and Ethereal but not on others.

 * Some network interfaces' IP addresses magically hop around to other
   interfaces (as with IPMP), thus breaking higher level networking
   functions implemented by applications running on the system (such
   as DHCP or routing daemons).

We want to be able to tell customers that they can:

 * configure VLANs on all Ethernet interfaces

 * create link aggregations using any set of Ethernet interfaces

 * administer all of their network data-links using a single command
   set

 * diagnose problems on any network interface using snoop or Ethereal

 * expect some level of stability with respect to their IP addresses
   not moving around unexpectedly

 * expect their applications to "just work" with key technologies such
   as IPMP, link aggregations, and VLANs

In addition, we will give them the ability to assign vanity names to
their network interfaces to ease administration.

Clearview will allow us to do this, and do it in a way that will allow
future data-link features to be developed to include all drivers, not
just those written by the same Sun organization or written to a given
framework.

3. Clearview Components
-----------------------

The project identifies several areas that need to be addressed in
order to meet the goals stated above. They are split into four main
components, each briefly discussed in its own section below:

 * Vanity Naming and Nemo Unification (UV)
 * IP Tunneling Device Driver
 * IPMP Rearchitecture
 * IP Observability Devices

To support these main components, two smaller deliverables will be
implemented (also described below):

 * The G in GLDv3
 * Going Public With libdlpi.so

3.1. Vanity Naming and Nemo Unification (UV)
--------------------------------------------

There is disparity between the data-link features offered by different
network drivers in Solaris. In addition, the way that these features
are configured varies depending on the driver used. This is confusing
to administrators who wonder why they can't use certain common
networking features with the systems they paid good money for while
the features are available on a different type of system, or why
features are configured differently for seemingly arbitrary drivers.

For example, the Nemo network driver framework (a.k.a. GLDv3,
introduced by PSARC/2004/571) implements several useful features for
Ethernet links such as VLANs, 802.3ad link aggregations, and a
command-line tool named dladm(1M) for viewing and configuring network
data-links. These features are only available for drivers written
using the GLDv3 framework, although VLANs are also implemented by some
non-GLDv3 drivers. There is an unbundled product (Sun Trunking) that
allows Ethernet links to be aggregated, but only for a small set of
network drivers, and it is configured using the unbundled tool nettr.

This Clearview component will create a shim module to bring all
network devices under the GLDv3 framework, thus bringing all network
data-links under the administrative control of the dladm(1M) command.
Because GLDv3 currently only supports Ethernet drivers, this case will
depend on a separate fast-track, discussed in section 3.5, that will
add support for arbitrary MAC types to GLDv3. One deliverable of this
component will be plugins for the MAC types necessary to support a
reasonably broad number of network drivers.

As a result, all GLDv3 features (such as VLANs and 802.3ad link
aggregations) will be available for _all_ applicable links! This will
also allow all future data-link features (for example jumbograms,
802.1x access control, or even the upcoming Crossbow data-link
virtualization feature) to be implemented within the GLDv3 framework,
allowing all network interfaces to take advantage of those features.

One such feature that will be introduced by this component is vanity
naming of network interfaces. This will allow administrators to use
the dladm(1M) command to rename data-links. This will ease the
administrative burden of using DR to replace hardware, and will make
migration of system or application configuration between systems much
easier (e.g., Zone migration).
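
The decoupling that vanity naming provides can be pictured with a
small Python sketch. It is not Clearview code: it assumes a simple
table that maps administrator-chosen link names to device instances,
and the names LinkNameTable, rename, and replace_device are
hypothetical, chosen only for illustration. They do not correspond to
dladm(1M) subcommands or to any GLDv3 interface.

```python
class LinkNameTable:
    """Hypothetical table mapping vanity link names to device instances."""

    def __init__(self):
        self._links = {}  # link name -> device instance, e.g. "web0" -> "bge0"

    def add(self, link, device):
        """Register a new link name for a device."""
        if link in self._links:
            raise ValueError(f"link name already in use: {link}")
        self._links[link] = device

    def rename(self, old, new):
        """Give a link a new name; the underlying device is unchanged."""
        if old not in self._links:
            raise KeyError(old)
        if new in self._links:
            raise ValueError(f"link name already in use: {new}")
        self._links[new] = self._links.pop(old)

    def replace_device(self, link, device):
        """Model a hardware swap: the name stays, the device changes."""
        if link not in self._links:
            raise KeyError(link)
        self._links[link] = device

    def device_for(self, link):
        """Return the device instance currently behind a link name."""
        return self._links[link]


# Illustrative use: configuration refers to "web0" throughout, even after
# the hardware behind it is replaced with a different driver instance.
table = LinkNameTable()
table.add("bge0", "bge0")
table.rename("bge0", "web0")
table.replace_device("web0", "e1000g0")
print(table.device_for("web0"))
```

In this model, anything that refers to the link by name (IP
configuration, Zone configuration) needs no change when the device
instance behind the name changes, which is the property the DR and
migration scenarios above rely on.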

A new /dev/net directory will be used to maintain the data-link device
namespace so that administratively chosen names won't conflict with
the names of other devices on the system. Because DLPI applications
today simply look in /dev for DLPI devices, the project will rely on
applications using a generic function in libdlpi.so to open a DLPI
device. This function will be delivered by the libdlpi.so fast-track
described in section 3.6.

3.2. IP Tunneling Device Driver
-------------------------------

One type of network interface on Solaris that is arbitrarily different
from others in its administrative model and feature set is the IP
tunnel interface. An IP tunnel is essentially a virtual link-layer
between two or more IP nodes, on which IP packets can be sent and
received. The existing implementation doesn't administratively treat
IP tunnels as link-layers at all, but as a sequence of STREAMS modules
fabricated by the ifconfig(1M) command. The side-effect of this is
that tunnel interfaces do not have DLPI devices providing the
link-layer services that applications need to perform link-layer
tasks. Such tasks include observing packets using snoop and Ethereal,
configuring firewall modules, and naming the interfaces with dladm(1M)
through the aforementioned Vanity Naming feature.

This component will modify the IP tunneling implementation to use a
GLDv3-based driver that provides link-layer functionality for IP
tunneling. This will result in IP tunnels being fully observable,
having fully functional DLPI devices, and having all of the
capabilities of the GLDv3 administrative model offered through
dladm(1M).

As mentioned in section 3.1, GLDv3 currently supports only Ethernet
devices. This is problematic since IP tunnels are certainly not
Ethernet data-links. They are based on IP headers, and therefore on a
virtual IP data-link.

A separate fast-track (discussed in section 3.5) will introduce
MAC-type plugins for GLDv3 that will allow anyone to add support for
an arbitrary MAC type to GLDv3 by supplying a kernel module. One of
the deliverables of the IP Tunneling Device Driver component will thus
be IP tunneling MAC-type plugins for GLDv3, one for each type of IP
tunnel interface.

3.3. IPMP Rearchitecture
------------------------

Today, an IPMP group is a collection of IP interfaces that are each
connected to the same underlying data-link. The IP connectivity
through these interfaces is monitored for health by a daemon
(in.mpathd), and the system protects network applications from IP
connectivity problems based on the result of this monitoring.
Currently, IPMP does this by moving IP addresses from interfaces that
have connectivity problems to healthy interfaces.

This model causes serious problems for networking applications that
have a close relationship with IP interfaces. Applications such as
DHCP clients, IP routing daemons, IPv6 autoconfiguration daemons, and
IPsec IKE daemons all provide functions that fail in unexpected ways
or behave suboptimally when such address migration occurs.

Furthermore, the representation of an IPMP group as multiple IP
interfaces means that an administrator cannot observe packets flowing
through the group as one would on a single data-link interface. It
also means that IPMP cannot easily be used with Zones, as one cannot
predict the name of the interface associated with a Zone IP address.

This component will represent an IPMP group as a single IP interface.
As a result, all IP addresses in the group will reside on this
interface, and the use of the underlying links by the ip module will
be managed behind the scenes. This isolates applications from network
topology problems (which is the goal of IPMP to begin with) and allows
better integration of IPMP with Zones and other networking-sensitive
technologies.
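
The difference from the current address-migration model can be
sketched in a few lines of Python. This is an illustrative model only,
not the planned implementation: the class IpmpGroup and its methods
are hypothetical, and the interface names and the documentation
address 192.0.2.10 are made up. The assumption it encodes is the one
stated above: addresses belong to the group interface, and only the
set of usable underlying interfaces changes on failure.

```python
class IpmpGroup:
    """Hypothetical model of an IPMP group presented as one IP interface."""

    def __init__(self, name, underlying):
        self.name = name
        self.addresses = []  # addresses live on the group interface itself
        self._healthy = {ifname: True for ifname in underlying}

    def add_address(self, addr):
        """Assign an IP address to the group interface."""
        self.addresses.append(addr)

    def set_health(self, ifname, healthy):
        """Record a monitoring result for one underlying interface."""
        if ifname not in self._healthy:
            raise KeyError(ifname)
        self._healthy[ifname] = healthy

    def usable_interfaces(self):
        """Underlying interfaces currently eligible to carry traffic."""
        return sorted(n for n, ok in self._healthy.items() if ok)


# Illustrative use: a failure of bge0 changes which underlying interface
# carries traffic, but the address stays on the group interface "ipmp0".
group = IpmpGroup("ipmp0", ["bge0", "bge1"])
group.add_address("192.0.2.10")
group.set_health("bge0", False)
print(group.name, group.addresses, group.usable_interfaces())
```

Because the address never leaves the group interface in this model, an
application bound to it, or a Zone configured with it, sees a stable
interface name regardless of which underlying link is in use.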

Using the IP Observability Devices described below in section 3.4, the
entire group will be observable with snoop or other DLPI observability
applications.

3.4. IP Observability Devices
-----------------------------

Another area of disparity in Solaris networking is the observability
of network traffic flowing through the system. Administrators use
tools like snoop(1M) or the open source Ethereal application to
observe packets over data-links. These tools open DLPI devices and
read data promiscuously. Unfortunately, not all network interfaces
provide DLPI devices for use by snoop and Ethereal, mainly because
they do not provide true data-link services. These are interfaces
maintained strictly within the ip module at the network layer, such as
the loopback interface (lo0) and the IPMP interfaces introduced by the
IPMP Rearchitecture component of Clearview described above.

Also, not all network traffic flows down to data-link devices. Packets
sent locally on the system, for example, are looped within the ip
module back to local applications. In addition, DLPI devices are not
accessible within Zones, so Zone administrators are not able to
diagnose their networking problems on any network interface.

This Clearview component will allow administrators (even
administrators within local zones) to observe network traffic flowing
through the ip module, regardless of whether that traffic is
looped-back, inter-Zone, or destined to another system. This will be
done by creating DLPI device nodes for each IP interface on the system
in a new /dev/ipnet directory. These DLPI devices will only implement
enough of DLPI to allow snoop or similar applications to promiscuously
read IP packets.

For the same reasons that applications will use dlpi_open() to access
/dev/net DLPI devices, they will need that function to access
/dev/ipnet devices. As such, this component will depend on the
libdlpi.so fast-track described in section 3.6.

3.5. The G in GLDv3
-------------------

The GLDv3 framework only supports Ethernet devices. As such, it is not
quite "Generic". The "G" in the "The G in GLDv3" case title implies
that this case will make GLDv3 generic, or at least allow it to be
extensible in a generic way.

Two of the main Clearview components require the GLDv3 framework to be
used with drivers that do not implement an Ethernet MAC layer. The
Nemo Unification feature described in section 3.1 will allow
non-Ethernet drivers to be brought under GLDv3 via a shim module, and
the IP Tunneling Device Driver described in section 3.2 implements a
GLDv3 driver that supplies a virtual IP link-layer (not Ethernet). As
a result, the GLDv3 framework must be modified to support MAC types
other than Ethernet.

This fast-track will accomplish this by introducing a plugin framework
that will allow developers to implement GLDv3 support for any required
MAC type without recompiling the framework. Developers needing GLDv3
support for a given MAC type will create a simple kernel module that
conforms to the plugin architecture.

By making GLDv3 extensible in this way, we will not only be meeting
the requirements of the Clearview project, but the network driver
developer community will be much more likely to port existing drivers
to GLDv3 or to choose GLDv3 as the framework in which to develop new
drivers. For example, we've recently discovered that the lack of WIFI
MAC layer support in GLDv3 was a barrier for Solaris WIFI driver
developers to use GLDv3. We're working with the WIFI team (and in fact
also contributing some resources), and an upcoming project is planning
on implementing a WIFI MAC-type plugin that will allow WIFI drivers to
use GLDv3.

This fast-track will also address the fact that GLDv3 is currently not
extensible due to the exposure of private data structures to its
consumers, and therefore cannot be binary compatible from release to
release.
This will result in changes in the way MAC drivers register with the
GLDv3 framework.

3.6. Going Public With libdlpi.so
---------------------------------

Due to the aforementioned need for a centralized public function to
open DLPI devices, the functions in the libdlpi.so library, introduced
by PSARC/2003/375 and currently Consolidation Private, will be made
public. Modifications will be made to the overall architecture of the
library's routines in order to make this possible. For example, the
error handling will be improved, DLPI handles will be implemented, and
new functions will be defined to provide DLPI functionality previously
missing from the library.

One of the functions provided by the library, the dlpi_open()
function, will do the work of determining the correct directory under
/dev in which to open DLPI devices.

4. Recap of Anticipated Forthcoming PSARC Cases
-----------------------------------------------

 A. Vanity Naming and Nemo Unification
 B. IP Tunneling Device Driver
 C. IPMP Rearchitecture
 D. IP Observability Devices
 E. The G in GLDv3
 F. Going Public With libdlpi.so

A and B will depend on E.
A and D will depend on F.
C will depend on D.
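
The dependencies in section 4 imply a partial order on the cases. As a
worked illustration only, the following Python sketch encodes those
dependencies and derives one order that satisfies them, using the
standard library's graphlib module. The order it prints is one of
several valid ones and is not a delivery schedule stated by this case.

```python
from graphlib import TopologicalSorter

# Case letters as listed in section 4, mapped to the cases they depend on.
depends_on = {
    "A": {"E", "F"},  # Vanity Naming and Nemo Unification
    "B": {"E"},       # IP Tunneling Device Driver
    "C": {"D"},       # IPMP Rearchitecture
    "D": {"F"},       # IP Observability Devices
}

# static_order() yields each case only after everything it depends on.
order = list(TopologicalSorter(depends_on).static_order())
print(" -> ".join(order))
```

Any order produced this way places E and F (the two supporting
fast-tracks) ahead of the components that rely on them, and places D
ahead of C.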