|
Replies:
24
-
Last Post:
May 3, 2007 1:19 PM
by: rshoaib
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Kernel sockets design document
Posted:
Apr 23, 2007 3:50 PM
|
|
Hi Everyone,
I just published a design document for the kernel sockets interface[1]. The document is short, so hopefully everyone will have time to read it :) Any comments would be appreciated.
Anders
[1] http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org
|
|
|
Posts:
430
From:
GB
Registered:
6/15/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 24, 2007 2:29 AM
in response to: anders
|
|
On 4/23/07, Anders Persson <anders dot persson at sun dot com> wrote:
> I just published a design document for the kernel sockets interface[1].
> The document is short, so hopefully everyone will have time to read it
> :) Any comments would be appreciated.
Few things...
- Are these sockets going to obey 3SOCKET or 3XNET semantics?
- I think the ksocket_t * should be the last arg. to ksock_socket(). In general my preference is for value-return args. to be at the end of the list.
- Is ksock_set_nonblocking() necessary? Could this not be handled by an option passed to ksock_setsockopt()?
- I think the ksock_callback_t passed to ksock_accept() is slightly confusing. Is it really necessary for an accepted socket to immediately have callbacks? It would be more straightforward if the thread calling ksock_accept() simply called ksock_callback() upon its return.
Paul
-- Paul Durrant http://www.linkedin.com/in/pdurrant
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 24, 2007 10:54 AM
in response to: pdurrant
|
|
Paul Durrant wrote:
<SNIP>
> Few things...
>
> - Are these sockets going to obey 3SOCKET or 3XNET semantics?
They currently obey 3SOCKET semantics, and it should probably stay that way unless people have concerns.
> - I think the ksocket_t * should be the last arg. to ksock_socket().
> In general my preference is for value-return args. to be at the end of
> the list.
OK, noted. If I see more requests to change the location of the argument, I will do so.
> - Is ksock_set_nonblocking() necessary? Could this not be handled by
> an option passed to ksock_setsockopt()?
It could. However, I am hesitant to introduce kernel-socket-only socket options.
> - I think the ksock_callback_t passed to ksock_accept() is slightly
> confusing. Is it really necessary for an accepted socket to
> immediately have callbacks? It would be more straightforward if the
> thread calling ksock_accept() simply called ksock_callback() upon its
> return.
The issue with that is that an event might be missed, and the user would have to check for "pending" events after registering the callbacks. The current approach allows the user to do what you want by simply passing in NULL for the last two arguments. Another approach would be to provide two versions of accept: one that has the "regular" accept behavior, and another that allows for callback registration.
> Paul
Thank you very much for your comments,
Anders
|
|
|
|
Posts:
430
From:
GB
Registered:
6/15/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 25, 2007 6:39 AM
in response to: anders
|
|
On 4/24/07, Anders Persson <anders dot persson at sun dot com> wrote:
> > - I think the ksock_callback_t passed to ksock_accept() is slightly
> > confusing. Is it really necessary for an accepted socket to
> > immediately have callbacks? It would be more straightforward if the
> > thread calling ksock_accept() simply called ksock_callback() upon its
> > return.
> The issue with that is that an event might be missed, and the user would
> have to check for "pending" events after registering the callbacks. The
> current approach allows the user to do what you want by simply passing
> in NULL for the last two arguments. Another approach would be to provide
> two versions of accept; one that has the "regular" accept behavior, and
> another that allows for callback registration.
This yields another question then. When I add a callback function to an existing socket using ksock_callback(), do I not get notification of pending events? I.e., if the socket is connected and already has data, do I not get told about that?
Paul
-- Paul Durrant http://www.linkedin.com/in/pdurrant
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 25, 2007 5:31 PM
in response to: pdurrant
|
|
Paul Durrant wrote:
> On 4/24/07, Anders Persson <anders dot persson at sun dot com> wrote:
>> > - I think the ksock_callback_t passed to ksock_accept() is slightly
>> > confusing. Is it really necessary for an accepted socket to
>> > immediately have callbacks? It would be more straightforward if the
>> > thread calling ksock_accept() simply called ksock_callback() upon its
>> > return.
>> The issue with that is that an event might be missed, and the user would
>> have to check for "pending" events after registering the callbacks. The
>> current approach allows the user to do what you want by simply passing
>> in NULL for the last two arguments. Another approach would be to provide
>> two versions of accept; one that has the "regular" accept behavior, and
>> another that allows for callback registration.
>
> This yields another question then. When I add a callback function to
> an existing socket using ksock_callback() do I not get notification of
> pending events? I.e. if the socket is connected and already has data,
> do I not get told about that?
Right, there is no mechanism in place to notify the user about events that have already taken place. If an event happens while no callback is registered, the notification is lost. So if callbacks are registered after ksock_accept() returns, the user would have to call ksock_recv() to find out about data that arrived before the registration took place.
Anders
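The "register, then check for pending data" pattern described in this reply can be sketched as a small single-threaded simulation in C. Everything here is hypothetical: `struct sim_sock`, `sim_deliver()`, `sim_recv_all()` and `sim_register_and_drain()` are stand-ins invented for illustration and are not part of the proposed ksock interface; `sim_recv_all()` merely plays the role that ksock_recv() would play.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: called when new data is queued. */
typedef void (*data_cb_t)(void *arg, size_t nbytes);

/* Hypothetical stand-in for a kernel socket. */
struct sim_sock {
    size_t    queued;   /* bytes waiting in the receive buffer */
    data_cb_t on_data;  /* NULL until a callback is registered */
    void     *cb_arg;
};

/*
 * Simulated data arrival. The data is always queued, but the
 * notification is silently dropped if no callback is registered yet.
 */
void
sim_deliver(struct sim_sock *s, size_t nbytes)
{
    assert(s != NULL);
    s->queued += nbytes;
    if (s->on_data != NULL)
        s->on_data(s->cb_arg, nbytes);
}

/* Stand-in for a nonblocking receive: take whatever is queued. */
size_t
sim_recv_all(struct sim_sock *s)
{
    size_t n = s->queued;

    s->queued = 0;
    return (n);
}

/*
 * Register the callback first, then drain once to pick up anything
 * that arrived before registration and was therefore never announced.
 */
size_t
sim_register_and_drain(struct sim_sock *s, data_cb_t cb, void *arg)
{
    s->on_data = cb;
    s->cb_arg = arg;
    return (sim_recv_all(s));
}

/* Example callback: adds the announced byte count to a counter. */
void
count_bytes(void *arg, size_t nbytes)
{
    *(size_t *)arg += nbytes;
}
```

In a real concurrent setting, data could still arrive between registration and the drain, in which case it would be both announced and drained; the consumer would need to tolerate a callback that finds nothing left to read.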
|
|
|
|
Posts:
638
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 25, 2007 9:59 AM
in response to: anders
|
|
Anders Persson wrote:
> Paul Durrant wrote:
>
> <SNIP>
>> Few things...
>>
>> - Are these sockets going to obey 3SOCKET or 3XNET semantics?
> They currently obey 3SOCKET semantics, and it should probably stay
> that way unless people have some concerns.
I think it would be good to get the new msghdr support (ancillary data aka msg_control) by providing only 3XNET semantics. That is a superset of the 3SOCKET semantics.
Erik
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 2:55 PM
in response to: nordmark
|
|
Erik Nordmark wrote:
> Anders Persson wrote:
>> Paul Durrant wrote:
>>
>> <SNIP>
>>> Few things...
>>>
>>> - Are these sockets going to obey 3SOCKET or 3XNET semantics?
>> They currently obey 3SOCKET semantics, and it should probably stay
>> that way unless people have some concerns.
>
> I think it would be good to get the new msghdr support (ancillary data
> aka msg_control) by providing only 3XNET semantics. That is a superset
> of the 3SOCKET semantics.
>
> Erik
OK, I will look into that.
Thanks, Anders
|
|
|
|
Jeremy Harris
jgh@wizmail.org
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 24, 2007 1:23 PM
in response to: anders
|
|
Anders Persson wrote:
> http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
Overall, good. Two points:
- I'd have been tempted to stick with a synchronous interface for the initial development; lose the event notification. Is there significant demand from potential customers, which wouldn't be satisfied by them creating threads?
- Special-casing nonblocking mode seems odd. Are no other ioctls relevant? SIOCATMARK? FIONREAD?
Cheers, Jeremy Harris
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 27, 2007 11:04 AM
in response to: Jeremy Harris
|
|
Jeremy,
Sorry about the late reply; somehow the mail slipped past me. My comments are inline.
Jeremy Harris wrote:
> Anders Persson wrote:
>> http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
>
> Overall, good. Two points:
>
> - I'd have been tempted to stick with a synchronous interface for the
> initial development; lose the event notification. Is there significant
> demand from potential customers, which wouldn't be satisfied by them
> creating threads?
Yes, people have expressed interest in having an asynchronous notification system. A synchronous interface is definitely more familiar to use, but such an interface could always be built on top of the current one.
> - Special-casing nonblocking mode seems odd. Are no other ioctls
> relevant?
> SIOCATMARK? FIONREAD?
In the current design, all ioctls that affect sockets would be represented explicitly by a function. SIOCATMARK is handled (ksock_atmark()); I mention it in the text, but it seems I forgot to add it to the function listing. However, I completely forgot about FIONREAD, and a function would have to be added to handle it.
> Cheers,
> Jeremy Harris
Thank you very much for your comments.
Anders
|
|
|
|
Posts:
6,935
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 27, 2007 12:32 PM
in response to: anders
|
|
Anders Persson writes:
> > - Special-casing nonblocking mode seems odd. Are no other ioctls
> > relevant?
> > SIOCATMARK? FIONREAD?
> In the current design, all ioctls that affect sockets would be
> represented explicitly by a function. SIOCATMARK is handled
> (ksock_atmark()); I mention it in the text, but it seems like I forgot to
> add it to the function listing. However, I completely forgot about
> FIONREAD, and a function would have to be added to handle it.
What's the rationale for function-per-ioctl rather than just having ksock_ioctl() and (if necessary) ksock_fcntl()?
-- James Carlson, Solaris Networking <james dot d dot carlson at sun dot com> Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 2:10 PM
in response to: carlsonj
|
|
James Carlson wrote:
> Anders Persson writes:
>
>>> - Special-casing nonblocking mode seems odd. Are no other ioctls
>>> relevant?
>>> SIOCATMARK? FIONREAD?
>>>
>> In the current design, all ioctls that affect sockets would be
>> represented explicitly by a function. SIOCATMARK is handled
>> (ksock_atmark()); I mention it in the text, but it seems like I forgot to
>> add it to the function listing. However, I completely forgot about
>> FIONREAD, and a function would have to be added to handle it.
>>
>
> What's the rationale for function-per-ioctl rather than just having
> ksock_ioctl() and (if necessary) ksock_fcntl()?
Not all ioctls that would normally apply to userland sockets are relevant for kernel sockets (e.g., SIOCSPGRP), so there needs to be some facility to verify that a given ioctl is allowed depending on the type of socket. When I initially went through the list of interesting ioctls it was quite short (FIONBIO, SIOCATMARK), and I thought function-per-ioctl would be a suitable mechanism for filtering out unsupported ioctls.
I reviewed the ioctls again, and it now appears that things are a bit more complex. For example, SCTP has a few additional ioctls that need to be supported. So it appears that the best way would be to provide a generic ksock_ioctl(); however, I still have to look into how to filter out unsupported ioctls.
Anders
|
|
|
|
Posts:
6,935
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 2:18 PM
in response to: anders
|
|
Anders Persson writes:
> > What's the rationale for function-per-ioctl rather than just having
> > ksock_ioctl() and (if necessary) ksock_fcntl()?
>
> Not all ioctls that would normally apply to userland sockets are relevant
> for kernel sockets (e.g., SIOCSPGRP), so there needs to be some facility
> to verify that a given ioctl is allowed depending on the type of socket.
I would think that ksock_ioctl() would be an excellent place to test for that relevance. Just return EINVAL if it's not allowed. Heck, you can even do EINVAL as the "default:" rule and only allow the ones you *know* are good.
I think there's a distinction between the design pattern using function calls versus a single entry point (which was the question), and the scope of the allowable features (not the question).
> When I initially went through the list of interesting ioctls it was quite
> short (FIONBIO, SIOCATMARK), and I thought the function-per-ioctl
> would be suitable mechanism for filtering out unsupported ioctls.
>
> I reviewed the ioctls again, and it now appears that things are a bit
> more complex. For example, SCTP have a few additional ioctls that
> needs to be supported.
Indeed.
> So it appears that the best way would be to
> provide a generic ksock_ioctl(), however, I still have to look into
> how to filter out unsupported ioctls.
'switch' would probably be sufficient. It's what we do elsewhere.
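As a rough illustration of the single-entry-point idea, the following C sketch uses a whitelist `switch` whose `default:` case returns EINVAL. The command codes, `struct ksock_sketch` and `ksock_ioctl_sketch()` are hypothetical stand-ins made up for this example; they are not taken from the design document, and real code would use the FIONBIO, FIONREAD and SIOCATMARK constants from the system headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes standing in for the real ioctl constants. */
enum ksock_cmd {
    KS_FIONBIO = 1,   /* set or clear nonblocking mode */
    KS_FIONREAD,      /* number of bytes available to read */
    KS_SIOCATMARK,    /* is the read pointer at the out-of-band mark? */
    KS_SIOCSPGRP      /* process-group ownership: not meaningful in-kernel */
};

/* Hypothetical per-socket state, just enough for the example. */
struct ksock_sketch {
    int nonblocking;
    int bytes_pending;
    int at_mark;
};

/*
 * Single ioctl entry point. Only commands known to make sense for a
 * kernel socket are listed; everything else falls through to EINVAL.
 * Returns 0 on success or an errno value on failure.
 */
int
ksock_ioctl_sketch(struct ksock_sketch *ks, int cmd, int *arg)
{
    assert(ks != NULL && arg != NULL);

    switch (cmd) {
    case KS_FIONBIO:
        ks->nonblocking = (*arg != 0);
        return (0);
    case KS_FIONREAD:
        *arg = ks->bytes_pending;
        return (0);
    case KS_SIOCATMARK:
        *arg = ks->at_mark;
        return (0);
    default:
        /* Unknown or userland-only command (e.g. KS_SIOCSPGRP). */
        return (EINVAL);
    }
}
```

With this shape, supporting an additional command (for example a protocol-specific one) means adding a `case`, while anything not listed stays rejected.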
There'll likely be a fair bit of overlap between this project and the existing netinfo interfaces. I'd like to see the more-standard-like ksock approach become the norm ... but that might be much longer term.
-- James Carlson, Solaris Networking <james dot d dot carlson at sun dot com> Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
|
|
|
|
Posts:
2,114
From:
Registered:
6/8/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 3:15 PM
in response to: carlsonj
|
|
James Carlson wrote:
>Anders Persson writes:
>
>>So it appears that the best way would be to
>>provide a generic ksock_ioctl(), however, I still have to look into
>>how to filter out unsupported ioctls.
>
>'switch' would probably be sufficient. It's what we do elsewhere.
>
>There'll likely be a fair bit of overlap between this project and the
>existing netinfo interfaces. I'd like to see the more-standard-like
>ksock approach become the norm ... but that might be much longer term
I'm not sure of that. I suspect you're thinking that the routing socket would be the ideal thing to use here...
That's less than ideal (as was the pfild) and what I'd see as being a compromise on the desired interaction with a purist view on design. A big problem with pfild using the routing sockets as "the" solution was the need to maintain a second copy of the relevant pieces of data.
Another problem here is that there is a window of time between when a change happens on a network interface and the routing socket message is delivered/received. The netinfo interfaces were aimed at eliminating that window opening.
But there is definitely some possible overlap, with ioctls such as SIOCGIFADDR, SIOCGIFDSTADDR, etc., being just as useful as the existing functions today.
Darren
|
|
|
|
Posts:
6,935
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 3:23 PM
in response to: darrenr
|
|
Darren dot Reed at Sun dot COM writes:
> But there is definitely some possible overlap with ioctls such as
> SIOCGIFADDR, SIOCGIFDSTADDR, etc, being just as useful as
> the existing functions today.
Yes, those are the ones I was thinking of. Kernel sockets users probably shouldn't have to use netinfo as well if all they want are "normal" things, such as address information and MTU.
-- James Carlson, Solaris Networking <james dot d dot carlson at sun dot com> Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084 MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
|
|
|
|
Posts:
2,114
From:
Registered:
6/8/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 3:46 PM
in response to: carlsonj
|
|
James Carlson wrote:
>Darren dot Reed at Sun dot COM writes:
>
>>But there is definitely some possible overlap with ioctls such as
>>SIOCGIFADDR, SIOCGIFDSTADDR, etc, being just as useful as
>>the existing functions today.
>
>Yes, those are the ones I was thinking of. Kernel sockets users
>probably shouldn't have to use netinfo as well if all they want are
>"normal" things, such as address information and MTU.
I agree completely.
Darren
|
|
|
|
Posts:
3,468
From:
US
Registered:
6/15/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 3:55 PM
in response to: Jeremy Harris
|
|
On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
> - I'd have been tempted to stick with a synchronous interface for the
> initial development; lose the event notification. Is there significant
> demand from potential customers, which wouldn't be satisfied by them
> creating threads?
It's easier to develop an async API and layer sync on top than to first develop a sync API and later re-whack it into an async API.
Of course, it's easier to develop a sync API and stop there, but really, we need async interfaces -- everything seems to nowadays (from GUI programming, starting decades ago, to Ajax now).
Nico
|
|
|
|
Posts:
1,364
From:
US
Registered:
4/27/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 7:28 PM
in response to: nico
|
|
Nicolas Williams wrote:
> On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
>
>> - I'd have been tempted to stick with a synchronous interface for the
>> initial development; lose the event notification. Is there significant
>> demand from potential customers, which wouldn't be satisfied by them
>> creating threads?
>
> It's easier to develop an async API and layer sync on top than to first
> develop a sync API and later re-whack it into an async API.
>
> Of course, it's easier to develop a sync API and stop there, but really,
> we need async interfaces -- everything seems to nowadays (from GUI
> programming, starting decades ago, to Ajax now).
That is certainly true of many APIs that were designed to operate in the absence of threading. However, a kernel API where threads are readily available probably shouldn't have to deal with the complexity of an async. API, IMO.
-- Garrett
|
|
|
|
Posts:
213
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 7:35 PM
in response to: gdamore
|
|
On Apr 30, 2007, at 9:28 PM, Garrett D'Amore wrote:
> Nicolas Williams wrote:
>> On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
>>
>>> - I'd have been tempted to stick with a synchronous interface for the
>>> initial development; lose the event notification. Is there significant
>>> demand from potential customers, which wouldn't be satisfied by them
>>> creating threads?
>>
>> It's easier to develop an async API and layer sync on top than to first
>> develop a sync API and later re-whack it into an async API.
>>
>> Of course, it's easier to develop a sync API and stop there, but really,
>> we need async interfaces -- everything seems to nowadays (from GUI
>> programming, starting decades ago, to Ajax now).
>
> That is certainly true of many APIs that were designed to operate
> in the absence of threading. However, a kernel API where threads
> are readily available probably shouldn't have to deal with the
> complexity of an async. API, IMO.
Think of the NFS server with 1000s of connections to manage. Either the ksocket API provides async events or the kernel RPC layer will need to build something on its own. I prefer the former.
Spencer
|
|
|
|
Posts:
1,364
From:
US
Registered:
4/27/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 7:38 PM
in response to: shepler
|
|
Spencer Shepler wrote:
> On Apr 30, 2007, at 9:28 PM, Garrett D'Amore wrote:
>
>> Nicolas Williams wrote:
>>> On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
>>>
>>>> - I'd have been tempted to stick with a synchronous interface for the
>>>> initial development; lose the event notification. Is there significant
>>>> demand from potential customers, which wouldn't be satisfied by them
>>>> creating threads?
>>>
>>> It's easier to develop an async API and layer sync on top than to first
>>> develop a sync API and later re-whack it into an async API.
>>>
>>> Of course, it's easier to develop a sync API and stop there, but really,
>>> we need async interfaces -- everything seems to nowadays (from GUI
>>> programming, starting decades ago, to Ajax now).
>>
>> That is certainly true of many APIs that were designed to operate in
>> the absence of threading. However, a kernel API where threads are
>> readily available probably shouldn't have to deal with the complexity
>> of an async. API, IMO.
>
> Think of the NFS server with 1000s connections to manage.
> Either the ksocket API provides async events or the kernel RPC
> layer will need to build something on its own. I prefer the former.
Good point. In the face of scalability concerns, async is cheaper than full threads. :-)
Is this the plan going forward for this API, btw?
-- Garrett
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
May 1, 2007 11:31 AM
in response to: gdamore
|
|
Garrett D'Amore wrote:
<SNIP>
>> Think of the NFS server with 1000s connections to manage.
>> Either the ksocket API provides async events or the kernel RPC
>> layer will need to build something on its own. I prefer the former.
>
> Good point. In the face of scalability concerns, async is cheaper
> than full threads. :-)
>
> Is this the plan going forward for this API, btw?
>
> -- Garrett
Yes, as it stands now the notification system will be asynchronous.
Anders
|
|
|
|
Posts:
3,468
From:
US
Registered:
6/15/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 7:53 PM
in response to: gdamore
|
|
On Mon, Apr 30, 2007 at 07:28:29PM -0700, Garrett D'Amore wrote:
> Nicolas Williams wrote:
> >On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
> >
> >>- I'd have been tempted to stick with a synchronous interface for the
> >> initial development; lose the event notification. Is there significant
> >> demand from potential customers, which wouldn't be satisfied by them
> >> creating threads?
> >
> >It's easier to develop an async API and layer sync on top than to first
> >develop a sync API and later re-whack it into an async API.
> >
> >Of course, it's easier to develop a sync API and stop there, but really,
> >we need async interfaces -- everything seems to nowadays (from GUI
> >programming, starting decades ago, to Ajax now).
>
> That is certainly true of many APIs that were designed to operate in the
> absence of threading. However, a kernel API where threads are readily
> available probably shouldn't have to deal with the complexity of an
> async. API, IMO.
Threads aren't cheap.
|
|
|
|
Posts:
213
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
Apr 30, 2007 7:24 PM
in response to: anders
|
|
On Apr 23, 2007, at 5:53 PM, Anders Persson wrote:
> Hi Everyone,
>
> I just published a design document for the kernel sockets interface[1].
> The document is short, so hopefully everyone will have time to
> read it :) Any comments would be appreciated.
>
> Anders
>
> [1] http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
Anders,
Thanks for the pointer. I have been working through a new architecture or refactoring of the kernel RPC services; as you know and allude to in the above document, the NFS client and server rely heavily on direct access (with minimal overhead) to the UDP and TCP stacks; streams today -- ksockets going forward.
The ability to receive asynchronous notification of events is a critical requirement for the kernel RPC services (today's and, eventually, the refactored kernel RPC as well). The refactored kernel RPC interfaces will themselves be fundamentally asynchronous (very different from the traditional RPC APIs).
Given that, I have just a few general questions.
Today, the NFS client and server, in support of RDMA transports like Infiniband, will start with TCP connections and then determine if RDMA is available on the interface used for the connection. Most of this work is done at user-level with a smaller set of code in the kernel for the final setup. It would be helpful to enable the NFS client and server to do this transition completely within the kernel. This is a nice to have; not a requirement.
The kernel RPC interfaces use the streams timer mechanism to timeout and close idle connections; again, a nice to have but not a hard requirement. I should also mention that the NFS server changes the receive buffer size/window size to stop-down the client when it is not receiving data as quickly as it is sending requests. Seems like that will be covered with what you propose.
Finally, what additional thoughts do you have about the event notification mechanism? Will it deliver multiple events simultaneously for a particular socket? Or will it wait for one event delivery to complete before delivering the next? Would it be possible to chain or provide a list of events? If so, I would imagine that the consumer of the interfaces would specify the desired behavior.
Spencer
|
|
|
|
Posts:
91
From:
US
Registered:
1/19/06
|
|
|
|
Re: Kernel sockets design document
Posted:
May 1, 2007 11:26 AM
in response to: shepler
|
|
Spencer Shepler wrote:
<SNIP>
> Today, the NFS client and server, in support of RDMA transports
> like Infiniband, will start with TCP connections and then
> determine if RDMA is available on the interface used for
> the connection. Most of this work is done at user-level
> with a smaller set of code in the kernel for the final
> setup. It would be helpful to enable the NFS client and
> server to do this transition completely within the kernel.
> This is a nice to have; not a requirement.
It might be possible to do; I will look into it. :)
> The kernel RPC interfaces use the streams timer mechanism to
> timeout and close idle connections; again, a nice to have
> but not a hard requirement.
So far there have been no plans for adding any timer mechanism to the interface itself. The user would have to rely on timeout(9F) for that functionality.
> I should also mention that the
> NFS server changes the receive buffer size/window size to
> stop-down the client when it is not receiving data as quickly
> as it is sending requests. Seems like that will be covered
> with what you propose.
Yes, you will be able to modify the buffer size via ksock_setsockopt().
> Finally, what additional thoughts do you have about the
> event notification mechanism. Will it deliver multiple
> events, simultaneously for a particular socket?
That is the current design; there is no mechanism in the interface that synchronizes the events. However, at least for TCP, I suppose those scenarios would be uncommon, since squeues should ensure synchronization.
> Or will
> it wait for one event delivery to be complete before
> delivering the next. Would it be possible chain or provide
> a list of events?
It could be possible, but the consumer can get this type of behavior by adding some additional logic to the callback functions. For that reason I do not think the additional complexity is needed.
Thank you very much, Anders
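One way a consumer might add such "additional logic to the callback functions" is sketched below in C11. It is only an illustration under stated assumptions: `struct ev_serializer`, `ev_init()`, `ev_post()`, the `EV_*` bits and the `ev_log` helper are all hypothetical names invented here, not part of the ksock design. The idea is that every callback merely posts its event bit, and whichever caller finds the handler idle delivers the accumulated bits as one batch, so the handler never runs concurrently with itself for a given socket.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define EV_DATA   0x1u  /* hypothetical event bits */
#define EV_CLOSED 0x2u

typedef void (*batch_fn)(void *arg, unsigned events);

struct ev_serializer {
    atomic_uint pending;  /* events posted but not yet handled */
    atomic_bool busy;     /* true while some caller runs the handler */
    batch_fn    handler;
    void       *arg;
};

void
ev_init(struct ev_serializer *s, batch_fn handler, void *arg)
{
    assert(s != NULL && handler != NULL);
    atomic_init(&s->pending, 0u);
    atomic_init(&s->busy, false);
    s->handler = handler;
    s->arg = arg;
}

/*
 * Called from each socket callback. Records the event, then tries to
 * become the single deliverer. If the handler is already running,
 * return at once: the current deliverer re-checks `pending` after it
 * finishes and picks the new bits up as the next batch.
 */
void
ev_post(struct ev_serializer *s, unsigned events)
{
    atomic_fetch_or(&s->pending, events);
    while (atomic_load(&s->pending) != 0u &&
        !atomic_exchange(&s->busy, true)) {
        unsigned batch = atomic_exchange(&s->pending, 0u);

        if (batch != 0u)
            s->handler(s->arg, batch);
        atomic_store(&s->busy, false);
    }
}

/* Example handler state: records each delivered batch. */
struct ev_log {
    unsigned batches[8];
    size_t count;
    struct ev_serializer *reenter;  /* if set, post EV_CLOSED once */
};

/* Example handler; optionally simulates an event arriving mid-delivery. */
void
log_batch(void *arg, unsigned events)
{
    struct ev_log *l = arg;

    if (l->count < 8)
        l->batches[l->count++] = events;
    if (l->reenter != NULL) {
        struct ev_serializer *s = l->reenter;

        l->reenter = NULL;
        ev_post(s, EV_CLOSED);  /* deferred, not delivered recursively */
    }
}
```

The trade-off of this design is that events of the same kind posted while the handler is busy are coalesced into one bit, so the handler learns that something happened but not how many times; a consumer that needs counts or ordering would use a queue instead of a bitmask.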
|
|
|
|
Posts:
213
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
May 1, 2007 4:03 PM
in response to: anders
|
|
On May 1, 2007, at 1:26 PM, Anders Persson wrote:
> Spencer Shepler wrote:
>
>> The kernel RPC interfaces use the streams timer mechanism to
>> timeout and close idle connections; again, a nice to have
>> but not a hard requirement.
> So far there have been no plans for adding any timer mechanism to
> the interface itself. The user would
> have to rely on timeout(9F) for that functionality.
That's fine.
>> I should also mention that the
>> NFS server changes the receive buffer size/window size to
>> stop-down the client when it is not receiving data as quickly
>> as it is sending requests. Seems like that will be covered
>> with what you propose.
> Yes, you will be able to modify the buffer size via ksock_setsockopt().
>>
>> Finally, what additional thoughts do you have about the
>> event notification mechanism. Will it deliver multiple
>> events, simultaneously for a particular socket?
> That is the current design; there is no mechanism in the interface
> that synchronize the events, however, at least for TCP, I suppose
> that those scenarios would be uncommon since squeues should ensure
> synchronization.
>> Or will
>> it wait for one event delivery to be complete before
>> delivering the next. Would it be possible chain or provide
>> a list of events?
> It could be possible, but the consumer can have this type of
> behavior by adding some additional logic to the callback functions.
> So for that reason I do not think the additional complexity is needed.
Agreed.
Spencer
|
|
|
|
Posts:
190
From:
US
Registered:
3/9/05
|
|
|
|
Re: Kernel sockets design document
Posted:
May 3, 2007 1:19 PM
in response to: anders
|
|
Anders Persson wrote:
> Spencer Shepler wrote:
>
> <SNIP>
>> Today, the NFS client and server, in support of RDMA transports
>> like Infiniband, will start with TCP connections and then
>> determine if RDMA is available on the interface used for
>> the connection. Most of this work is done at user-level
>> with a smaller set of code in the kernel for the final
>> setup. It would be helpful to enable the NFS client and
>> server to do this transition completely within the kernel.
>> This is a nice to have; not a requirement.
> It might be possible to do, I will look into it. :)
I don't think this functionality should be provided by kernel sockets. Since we don't have any RDMA-capable card other than IB yet, it's hard to speculate about or design a general interface. In the case of IB, however, the IB framework should provide the necessary interfaces for NFS to determine whether NFS over RDMA is available.
When/if the general socket interface is updated to account for RDMA-based NICs, we should update the kernel sockets.
Rao.
|
|
|
|
|