OpenSolaris

Thread: Kernel sockets design document

Replies: 24 - Last Post: May 3, 2007 1:19 PM by: rshoaib
anders

Posts: 91
From: US

Registered: 1/19/06
Kernel sockets design document
Posted: Apr 23, 2007 3:50 PM

Hi Everyone,

I just published a design document for the kernel sockets interface[1].
The document is short, so hopefully everyone will have time to read it
:) Any comments would be appreciated.

Anders

[1]
http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
_______________________________________________
networking-discuss mailing list
networking-discuss at opensolaris dot org



pdurrant

Posts: 430
From: GB

Registered: 6/15/05
Re: Kernel sockets design document
Posted: Apr 24, 2007 2:29 AM   in response to: anders

On 4/23/07, Anders Persson <anders dot persson at sun dot com> wrote:
>
> I just published a design document for the kernel sockets interface[1].
> The document is short, so hopefully everyone will have time to read it
> :) Any comments would be appreciated.
>

Few things...

- Are these sockets going to obey 3SOCKET or 3XNET semantics?

- I think the ksocket_t * should be the last arg. to ksock_socket().
In general my preference is for value-return args. to be at the end of
the list.

- Is ksock_set_nonblocking() necessary? Could this not be handled by
an option passed to ksock_setsockopt()?

- I think the ksock_callback_t passed to ksock_accept() is slightly
confusing. Is it really necessary for an accepted socket to
immediately have callbacks? It would be more straightforward if the
thread calling ksock_accept() simply called ksock_callback() upon its
return.

Paul

--
Paul Durrant
http://www.linkedin.com/in/pdurrant



anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: Apr 24, 2007 10:54 AM   in response to: pdurrant

Paul Durrant wrote:

<SNIP>
> Few things...
>
> - Are these sockets going to obey 3SOCKET or 3XNET semantics?
They currently obey 3SOCKET semantics, and it should probably stay
that way unless people have some concerns.
>
>
> - I think the ksocket_t * should be the last arg. to ksock_socket().
> In general my preference is for value-return args. to be at the end of
> the list.
OK, noted. If I see more requests about changing the location of the
argument, then I will do so.
>
>
> - Is ksock_set_nonblocking() necessary? Could this not be handled by
> an option passed to ksock_setsockopt()?
It could. However, I am hesitant to introduce kernel-socket only socket
options.
>
> - I think the ksock_callback_t passed to ksock_accept() is slightly
> confusing. Is it really necessary for an accepted socket to
> immediately have callbacks? It would be more straightforward if the
> thread calling ksock_accept() simply called ksock_callback() upon its
> return.
The issue with that is that an event might be missed, and the user would
have to check for "pending" events after registering the callbacks. The
current approach allows the user to do what you want by simply passing
in NULL for the last two arguments. Another approach would be to provide
two versions of accept; one that has the "regular" accept behavior, and
another that allows for callback registration.
>
> Paul
>
Thank you very much for your comments,

Anders




pdurrant

Posts: 430
From: GB

Registered: 6/15/05
Re: Kernel sockets design document
Posted: Apr 25, 2007 6:39 AM   in response to: anders

On 4/24/07, Anders Persson <anders dot persson at sun dot com> wrote:
> > - I think the ksock_callback_t passed to ksock_accept() is slightly
> > confusing. Is it really necessary for an accepted socket to
> > immediately have callbacks? It would be more straightforward if the
> > thread calling ksock_accept() simply called ksock_callback() upon its
> > return.
> The issue with that is that an event might be missed, and the user would
> have to check for "pending" events after registering the callbacks. The
> current approach allows the user to do what you want by simply passing
> in NULL for the last two arguments. Another approach would be to provide
> two versions of accept; one that has the "regular" accept behavior, and
> another that allows for callback registration.

This yields another question then. When I add a callback function to
an existing socket using ksock_callback() do I not get notification of
pending events? I.e. if the socket is connected and already has data,
do I not get told about that?

Paul

--
Paul Durrant
http://www.linkedin.com/in/pdurrant



anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: Apr 25, 2007 5:31 PM   in response to: pdurrant

Paul Durrant wrote:
> On 4/24/07, Anders Persson <anders dot persson at sun dot com> wrote:
>> > - I think the ksock_callback_t passed to ksock_accept() is slightly
>> > confusing. Is it really necessary for an accepted socket to
>> > immediately have callbacks? It would be more straightforward if the
>> > thread calling ksock_accept() simply called ksock_callback() upon its
>> > return.
>> The issue with that is that an event might be missed, and the user would
>> have to check for "pending" events after registering the callbacks. The
>> current approach allows the user to do what you want by simply passing
>> in NULL for the last two arguments. Another approach would be to provide
>> two versions of accept; one that has the "regular" accept behavior, and
>> another that allows for callback registration.
>
> This yields another question then. When I add a callback function to
> an existing socket using ksock_callback() do I not get notification of
> pending events? I.e. if the socket is connected and already has data,
> do I not get told about that?
Right, there is no mechanism in place to notify the user about events
that have already taken place. If the event happens, and there is no
callback registered, the notification is lost. So if callbacks are
registered after the ksock_accept(), then the user would have to use
ksock_recv() to find out about data that arrived before the registration
took place.
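
[Editor's sketch] The lost-notification behavior described above can be
illustrated with a small user-space mock. Everything here (struct mock_sock,
mock_deliver(), mock_recv(), register_then_drain()) is a hypothetical
stand-in and not part of the proposed ksock interface; it only assumes that
an event arriving with no callback registered produces no notification, so
the consumer has to poll once right after registering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; not the real ksock_callback_t. */
typedef void (*data_cb_t)(void *arg, size_t nbytes);

/* Hypothetical stand-in for a kernel socket. */
struct mock_sock {
    size_t    pending;  /* bytes queued but not yet read */
    data_cb_t on_data;  /* NULL until a callback is registered */
    void     *cb_arg;   /* opaque argument handed to the callback */
};

/* Data arrives: notify only if a callback is registered right now. */
static void mock_deliver(struct mock_sock *s, size_t nbytes)
{
    s->pending += nbytes;
    if (s->on_data != NULL)
        s->on_data(s->cb_arg, nbytes);
    /* Otherwise the notification is lost; the data itself stays queued. */
}

/* Non-blocking read of whatever is queued; returns the bytes consumed. */
static size_t mock_recv(struct mock_sock *s)
{
    size_t n = s->pending;

    s->pending = 0;
    return n;
}

/*
 * Register a callback, then drain anything that arrived before the
 * registration, because no notification will ever be sent for it.
 */
static size_t register_then_drain(struct mock_sock *s, data_cb_t cb, void *arg)
{
    assert(s != NULL && cb != NULL);
    s->on_data = cb;
    s->cb_arg = arg;
    return mock_recv(s);
}

/* Example callback: adds up the bytes it was notified about. */
static void count_bytes(void *arg, size_t nbytes)
{
    *(size_t *)arg += nbytes;
}
```

Passing the callbacks directly to the accept call, as in the current design,
avoids the extra drain step because there is no window in which the socket
exists without callbacks.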

Anders



nordmark

Posts: 638
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: Apr 25, 2007 9:59 AM   in response to: anders

Anders Persson wrote:
> Paul Durrant wrote:
>
> <SNIP>
>> Few things...
>>
>> - Are these sockets going to obey 3SOCKET or 3XNET semantics?
> They currently obey 3SOCKET semantics, and it should probably stay
> that way unless people have some concerns.

I think it would be good to get the new msghdr support (ancillary data
aka msg_control) by providing only 3XNET semantics. That is a superset
of the 3SOCKET semantics.


Erik



anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: Apr 30, 2007 2:55 PM   in response to: nordmark

Erik Nordmark wrote:
> Anders Persson wrote:
>> Paul Durrant wrote:
>>
>> <SNIP>
>>> Few things...
>>>
>>> - Are these sockets going to obey 3SOCKET or 3XNET semantics?
>> They currently obey 3SOCKET semantics, and it should probably stay
>> that way unless people have some concerns.
>
> I think it would be good to get the new msghdr support (ancillary data
> aka msg_control) by providing only 3XNET semantics. That is a superset
> of the 3SOCKET semantics.
>
>
> Erik
OK, I will look into that.

Thanks,
Anders



Jeremy Harris
jgh@wizmail.org
Re: Kernel sockets design document
Posted: Apr 24, 2007 1:23 PM   in response to: anders

Anders Persson wrote:
> http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf

Overall, good. Two points:

- I'd have been tempted to stick with a synchronous interface for the
initial development; lose the event notification. Is there significant
demand from potential customers, which wouldn't be satisfied by them
creating threads?

- Special-casing nonblocking mode seems odd. Are no other ioctls relevant?
SIOCATMARK? FIONREAD?


Cheers,
Jeremy Harris



anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: Apr 27, 2007 11:04 AM   in response to: Jeremy Harris

Jeremy,

Sorry about the late reply, somehow the mail slipped past me. My
comments are inline.

Jeremy Harris wrote:
> Anders Persson wrote:
>> http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
>
>
> Overall, good. Two points:
>
> - I'd have been tempted to stick with a synchronous interface for the
> initial development; lose the event notification. Is there significant
> demand from potential customers, which wouldn't be satisfied by them
> creating threads?
Yes, people have expressed interest in having an asynchronous
notification system. A synchronous interface is definitely more
familiar to use, but such an interface could always be built on top
of the current interface.
>
>
> - Special-casing nonblocking mode seems odd. Are no other ioctls
> relevant?
> SIOCATMARK? FIONREAD?
In the current design, all ioctls that affect sockets would be
represented explicitly by a function. SIOCATMARK is handled
(ksock_atmark()); I mention it in the text, but it seems like I forgot to
add it to the function listing. However, I completely forgot about
FIONREAD, and a function would have to be added to handle it.
>
>
> Cheers,
> Jeremy Harris
Thank you very much for your comments.

Anders



carlsonj

Posts: 6,935
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: Apr 27, 2007 12:32 PM   in response to: anders

Anders Persson writes:
> > - Special-casing nonblocking mode seems odd. Are no other ioctls
> > relevant?
> > SIOCATMARK? FIONREAD?
> In the current design, all ioctls that affect sockets would be
> represented explicitly by a function. SIOCATMARK is handled
> (ksock_atmark()); I mention it in the text, but it seems like I forgot to
> add it to the function listing. However, I completely forgot about
> FIONREAD, and a function would have to be added to handle it.

What's the rationale for function-per-ioctl rather than just having
ksock_ioctl() and (if necessary) ksock_fcntl()?

--
James Carlson, Solaris Networking <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: Apr 30, 2007 2:10 PM   in response to: carlsonj

James Carlson wrote:
> Anders Persson writes:
>
>>> - Special-casing nonblocking mode seems odd. Are no other ioctls
>>> relevant?
>>> SIOCATMARK? FIONREAD?
>>>
>> In the current design, all ioctls that affect sockets would be
>> represented explicitly by a function. SIOCATMARK is handled
>> (ksock_atmark()); I mention it in the text, but it seems like I forgot to
>> add it to the function listing. However, I completely forgot about
>> FIONREAD, and a function would have to be added to handle it.
>>
>
> What's the rationale for function-per-ioctl rather than just having
> ksock_ioctl() and (if necessary) ksock_fcntl()?
>
>
Not all ioctls that would normally apply to userland sockets are relevant
for kernel sockets (e.g., SIOCSPGRP), so there needs to be some facility
to verify that a given ioctl is allowed depending on the type of socket.
When I initially went through the list of interesting ioctls it was quite
short (FIONBIO, SIOCATMARK), and I thought the function-per-ioctl
would be a suitable mechanism for filtering out unsupported ioctls.

I reviewed the ioctls again, and it now appears that things are a bit
more complex. For example, SCTP has a few additional ioctls that
need to be supported. So it appears that the best way would be to
provide a generic ksock_ioctl(); however, I still have to look into
how to filter out unsupported ioctls.

Anders




carlsonj

Posts: 6,935
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 2:18 PM   in response to: anders

Anders Persson writes:
> > What's the rationale for function-per-ioctl rather than just having
> > ksock_ioctl() and (if necessary) ksock_fcntl()?
> >
> >
> Not all ioctls that would normally apply to userland sockets are relevant
> for kernel sockets (e.g., SIOCSPGRP), so there needs to be some facility
> to verify that a given ioctl is allowed depending on the type of socket.

I would think that ksock_ioctl() would be an excellent place to test
for that relevance. Just return EINVAL if it's not allowed. Heck,
you can even do EINVAL as the "default:" rule and only allow the ones
you *know* are good.
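
[Editor's sketch] A minimal illustration of the single-entry-point pattern
suggested here: one switch that handles only the commands known to be good
and returns EINVAL from the default case. The command numbers (DEMO_*),
struct demo_sock, and demo_ksock_ioctl() are hypothetical stand-ins, since
the real ksock_ioctl() signature is not given in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers standing in for the real ioctl constants. */
enum demo_cmd {
    DEMO_FIONBIO = 1,   /* set or clear non-blocking mode */
    DEMO_FIONREAD,      /* query bytes available to read */
    DEMO_SIOCATMARK,    /* query out-of-band mark state */
    DEMO_SIOCSPGRP      /* userland-only: deliberately not handled */
};

/* Hypothetical socket state, just enough to make the commands do something. */
struct demo_sock {
    int nonblocking;
    int bytes_readable;
    int at_mark;
};

/*
 * Single entry point. Only whitelisted commands are accepted; anything
 * else, including DEMO_SIOCSPGRP, falls through to EINVAL.
 */
static int demo_ksock_ioctl(struct demo_sock *s, int cmd, int *arg)
{
    assert(s != NULL && arg != NULL);

    switch (cmd) {
    case DEMO_FIONBIO:
        s->nonblocking = (*arg != 0);
        return 0;
    case DEMO_FIONREAD:
        *arg = s->bytes_readable;
        return 0;
    case DEMO_SIOCATMARK:
        *arg = s->at_mark;
        return 0;
    default:
        return EINVAL;
    }
}
```

With this shape, supporting a new command (for example one of the SCTP
ioctls mentioned in this thread) means adding a case rather than adding a
new exported function.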

I think there's a distinction between the design pattern using
function calls versus a single entry point (which was the question),
and the scope of the allowable features (not the question).

> When I initially went through the list of interesting ioctls it was quite
> short (FIONBIO, SIOCATMARK), and I thought the function-per-ioctl
> would be a suitable mechanism for filtering out unsupported ioctls.
>
> I reviewed the ioctls again, and it now appears that things are a bit
> more complex. For example, SCTP has a few additional ioctls that
> need to be supported.

Indeed.

> So it appears that the best way would be to
> provide a generic ksock_ioctl(), however, I still have to look into
> how to filter out unsupported ioctls.

'switch' would probably be sufficient. It's what we do elsewhere.

There'll likely be a fair bit of overlap between this project and the
existing netinfo interfaces. I'd like to see the more-standard-like
ksock approach become the norm ... but that might be much longer term.

--
James Carlson, Solaris Networking <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



darrenr

Posts: 2,114
From:

Registered: 6/8/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 3:15 PM   in response to: carlsonj

James Carlson wrote:

>Anders Persson writes:
>
>
>>So it appears that the best way would be to
>>provide a generic ksock_ioctl(), however, I still have to look into
>>how to filter out unsupported ioctls.
>>
>>
>
>'switch' would probably be sufficient. It's what we do elsewhere.
>
>There'll likely be a fair bit of overlap between this project and the
>existing netinfo interfaces. I'd like to see the more-standard-like
>ksock approach become the norm ... but that might be much longer term
>

I'm not sure of that.
I suspect you're thinking that the routing socket would be the
ideal thing to use here...

That's less than ideal (as was the pfild) and what I'd see as being
a compromise on the desired interaction with a purist view on
design. A big problem with pfild using the routing sockets as "the"
solution was the need to maintain a second copy of the relevant
pieces of data.

Another problem here is that there is a window of time between
when a change happens on a network interface and the routing
socket message is delivered/received. The netinfo interfaces
were aimed at eliminating that window opening.

But there is definitely some possible overlap with ioctls such as
SIOCGIFADDR, SIOCGIFDSTADDR, etc, being just as useful as
the existing functions today.

Darren




carlsonj

Posts: 6,935
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 3:23 PM   in response to: darrenr

Darren dot Reed at Sun dot COM writes:
> But there is definitely some possible overlap with ioctls such as
> SIOCGIFADDR, SIOCGIFDSTADDR, etc, being just as useful as
> the existing functions today.

Yes, those are the ones I was thinking of. Kernel sockets users
probably shouldn't have to use netinfo as well if all they want are
"normal" things, such as address information and MTU.

--
James Carlson, Solaris Networking <james dot d dot carlson at sun dot com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677



darrenr

Posts: 2,114
From:

Registered: 6/8/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 3:46 PM   in response to: carlsonj

James Carlson wrote:

>Darren dot Reed at Sun dot COM writes:
>
>
>>But there is definitely some possible overlap with ioctls such as
>>SIOCGIFADDR, SIOCGIFDSTADDR, etc, being just as useful as
>>the existing functions today.
>>
>>
>
>Yes, those are the ones I was thinking of. Kernel sockets users
>probably shouldn't have to use netinfo as well if all they want are
>"normal" things, such as address information and MTU.
>
>

I agree completely.

Darren




nico

Posts: 3,468
From: US

Registered: 6/15/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 3:55 PM   in response to: Jeremy Harris

On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
> - I'd have been tempted to stick with a synchronous interface for the
> initial development; lose the event notification. Is there significant
> demand from potential customers, which wouldn't be satisfied by them
> creating threads?

It's easier to develop an async API and layer sync on top than to first
develop a sync API and later re-whack it into an async API.

Of course, it's easier to develop a sync API and stop there, but really,
we need async interfaces -- everything seems to nowadays (from GUI
programming, starting decades ago, to Ajax now).

Nico
--



gdamore

Posts: 1,364
From: US

Registered: 4/27/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 7:28 PM   in response to: nico

Nicolas Williams wrote:
> On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
>
>> - I'd have been tempted to stick with a synchronous interface for the
>> initial development; lose the event notification. Is there significant
>> demand from potential customers, which wouldn't be satisfied by them
>> creating threads?
>>
>
> It's easier to develop an async API and layer sync on top than to first
> develop a sync API and later re-whack it into an async API.
>
> Of course, it's easier to develop a sync API and stop there, but really,
> we need async interfaces -- everything seems to nowadays (from GUI
> programming, starting decades ago, to Ajax now).
>

That is certainly true of many APIs that were designed to operate in the
absence of threading. However, a kernel API where threads are readily
available probably shouldn't have to deal with the complexity of an
async. API, IMO.

-- Garrett
> Nico
>




shepler

Posts: 213
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 7:35 PM   in response to: gdamore


On Apr 30, 2007, at 9:28 PM, Garrett D'Amore wrote:

> Nicolas Williams wrote:
>> On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
>>
>>> - I'd have been tempted to stick with a synchronous interface for
>>> the
>>> initial development; lose the event notification. Is there
>>> significant
>>> demand from potential customers, which wouldn't be satisfied by
>>> them
>>> creating threads?
>>>
>>
>> It's easier to develop an async API and layer sync on top than to
>> first
>> develop a sync API and later re-whack it into an async API.
>>
>> Of course, it's easier to develop a sync API and stop there, but
>> really,
>> we need async interfaces -- everything seems to nowadays (from GUI
>> programming, starting decades ago, to Ajax now).
>>
>
> That is certainly true of many APIs that were designed to operate
> in the absence of threading. However, a kernel API where threads
> are readily available probably shouldn't have to deal with the
> complexity of an async. API, IMO.

Think of the NFS server with 1000s connections to manage.
Either the ksocket API provides async events or the kernel RPC
layer will need to build something on its own. I prefer the former.

Spencer




gdamore

Posts: 1,364
From: US

Registered: 4/27/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 7:38 PM   in response to: shepler

Spencer Shepler wrote:
>
> On Apr 30, 2007, at 9:28 PM, Garrett D'Amore wrote:
>
>> Nicolas Williams wrote:
>>> On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
>>>
>>>> - I'd have been tempted to stick with a synchronous interface for the
>>>> initial development; lose the event notification. Is there
>>>> significant
>>>> demand from potential customers, which wouldn't be satisfied by them
>>>> creating threads?
>>>>
>>>
>>> It's easier to develop an async API and layer sync on top than to first
>>> develop a sync API and later re-whack it into an async API.
>>>
>>> Of course, it's easier to develop a sync API and stop there, but
>>> really,
>>> we need async interfaces -- everything seems to nowadays (from GUI
>>> programming, starting decades ago, to Ajax now).
>>>
>>
>> That is certainly true of many APIs that were designed to operate in
>> the absence of threading. However, a kernel API where threads are
>> readily available probably shouldn't have to deal with the complexity
>> of an async. API, IMO.
>
> Think of the NFS server with 1000s connections to manage.
> Either the ksocket API provides async events or the kernel RPC
> layer will need to build something on its own. I prefer the former.

Good point. In the face of scalability concerns, async is cheaper than
full threads. :-)

Is this the plan going forward for this API, btw?

-- Garrett




anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: May 1, 2007 11:31 AM   in response to: gdamore

Garrett D'Amore wrote:
<SNIP>
>> Think of the NFS server with 1000s connections to manage.
>> Either the ksocket API provides async events or the kernel RPC
>> layer will need to build something on its own. I prefer the former.
>
> Good point. In the face of scalability concerns, async is cheaper
> than full threads. :-)
>
> Is this the plan going forward for this API, btw?
>
> -- Garrett
Yes, as it stands now the notification system will be asynchronous.

Anders



nico

Posts: 3,468
From: US

Registered: 6/15/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 7:53 PM   in response to: gdamore

On Mon, Apr 30, 2007 at 07:28:29PM -0700, Garrett D'Amore wrote:
> Nicolas Williams wrote:
> >On Tue, Apr 24, 2007 at 09:23:30PM +0100, Jeremy Harris wrote:
> >
> >>- I'd have been tempted to stick with a synchronous interface for the
> >> initial development; lose the event notification. Is there significant
> >> demand from potential customers, which wouldn't be satisfied by them
> >> creating threads?
> >>
> >
> >It's easier to develop an async API and layer sync on top than to first
> >develop a sync API and later re-whack it into an async API.
> >
> >Of course, it's easier to develop a sync API and stop there, but really,
> >we need async interfaces -- everything seems to nowadays (from GUI
> >programming, starting decades ago, to Ajax now).
> >
>
> That is certainly true of many APIs that were designed to operate in the
> absence of threading. However, a kernel API where threads are readily
> available probably shouldn't have to deal with the complexity of an
> async. API, IMO.

Threads aren't cheap.



shepler

Posts: 213
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: Apr 30, 2007 7:24 PM   in response to: anders


On Apr 23, 2007, at 5:53 PM, Anders Persson wrote:

> Hi Everyone,
>
> I just published a design document for the kernel sockets interface
> [1]. The document is short, so hopefully everyone will have time to
> read it :) Any comments would be appreciated.
>
> Anders
>
> [1] http://www.opensolaris.org/os/project/kernel-sockets/files/kernel-sockets.pdf
>

Anders,

Thanks for the pointer. I have been working through a new architecture
or refactoring of the kernel RPC services; as you know and allude to
in the above document, the NFS client and server rely heavily on
direct access (with minimal overhead) to the UDP and TCP stacks;
streams today -- ksockets going forward.

The ability to receive asynchronous notification of events is
a critical requirement for the kernel RPC services (today's, and
eventually the refactored kernel RPC as well).
The refactored kernel RPC interfaces will themselves be
fundamentally asynchronous (very different than the traditional
RPC APIs).

Given that, I have just a few general questions.

Today, the NFS client and server, in support of RDMA transports
like Infiniband, will start with TCP connections and then
determine if RDMA is available on the interface used for
the connection. Most of this work is done at user-level
with a smaller set of code in the kernel for the final
setup. It would be helpful to enable the NFS client and
server to do this transition completely within the kernel.
This is a nice to have; not a requirement.

The kernel RPC interfaces use the streams timer mechanism to
timeout and close idle connections; again, a nice to have
but not a hard requirement. I should also mention that the
NFS server changes the receive buffer size/window size to
stop-down the client when it is not receiving data as quickly
as it is sending requests. Seems like that will be covered
with what you propose.

Finally, what additional thoughts do you have about the
event notification mechanism? Will it deliver multiple
events simultaneously for a particular socket? Or will
it wait for one event delivery to be complete before
delivering the next? Would it be possible to chain or provide
a list of events? If so, I would imagine that the consumer
of the interfaces would specify the desired behavior.

Spencer







anders

Posts: 91
From: US

Registered: 1/19/06
Re: Kernel sockets design document
Posted: May 1, 2007 11:26 AM   in response to: shepler

Spencer Shepler wrote:

<SNIP>
> Today, the NFS client and server, in support of RDMA transports
> like Infiniband, will start with TCP connections and then
> determine if RDMA is available on the interface used for
> the connection. Most of this work is done at user-level
> with a smaller set of code in the kernel for the final
> setup. It would be helpful to enable the NFS client and
> server to do this transition completely within the kernel.
> This is a nice to have; not a requirement.
It might be possible to do, I will look into it. :)
>
> The kernel RPC interfaces use the streams timer mechanism to
> timeout and close idle connections; again, a nice to have
> but not a hard requirement.
So far there have been no plans for adding any timer mechanism to the
interface itself. The user would have to rely on timeout(9F) for that
functionality.
> I should also mention that the
> NFS server changes the receive buffer size/window size to
> stop-down the client when it is not receiving data as quickly
> as it is sending requests. Seems like that will be covered
> with what you propose.
Yes, you will be able to modify the buffer size via ksock_setsockopt().
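
[Editor's sketch] The ksock_setsockopt() signature is not shown in this
thread, so the following uses the ordinary userland setsockopt(3SOCKET)
call with SO_RCVBUF as an analogue of the same adjustment. The helper names
set_recv_buffer() and get_recv_buffer() are made up for the example, and
the assumption is that the kernel interface would take the same level and
option name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Request a receive buffer of `bytes`; returns 0 on success, -1 on error. */
static int set_recv_buffer(int fd, int bytes)
{
    assert(bytes > 0);
    return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof (bytes));
}

/* Read back the effective receive buffer size; returns -1 on error. */
static int get_recv_buffer(int fd)
{
    int bytes = 0;
    socklen_t len = sizeof (bytes);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, &len) != 0)
        return -1;
    return bytes;
}
```

The value read back may differ from the value requested, since the system
is free to round or clamp it, so a caller that cares should check with
get_recv_buffer() rather than assume the request was applied exactly.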
>
>
> Finally, what additional thoughts do you have about the
> event notification mechanism. Will it deliver multiple
> events, simultaneously for a particular socket?
That is the current design; there is no mechanism in the interface that
synchronizes the events. However, at least for TCP, I suppose that those
scenarios would be uncommon since squeues should ensure synchronization.
> Or will
> it wait for one event delivery to be complete before
> delivering the next. Would it be possible to chain or provide
> a list of events?
It could be possible, but the consumer can have this type of behavior by
adding some additional logic to the callback functions. So for that
reason I do not think the additional complexity is needed.
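
[Editor's sketch] One way a consumer could get ordered, one-at-a-time
event handling from its own callback logic is to have the callback only
enqueue the event and let a single worker dequeue and process. The event
names and the evq_* helpers below are hypothetical. Locking is omitted:
because the interface may deliver events concurrently, a real kernel
consumer would have to protect the queue with a mutex.

```c
#include <assert.h>
#include <stddef.h>

#define EVQ_CAP 8   /* fixed capacity, chosen arbitrarily for the sketch */

/* Hypothetical event codes; the real interface may define others. */
enum demo_event { EV_CONNECTED = 1, EV_NEWDATA, EV_DISCONNECTED };

/* Fixed-size FIFO ring of events. */
struct evq {
    enum demo_event ev[EVQ_CAP];
    size_t head;    /* index of the oldest queued event */
    size_t count;   /* number of queued events */
};

/* Called from the socket callback. Returns 0, or -1 if the queue is full. */
static int evq_push(struct evq *q, enum demo_event e)
{
    assert(q != NULL);
    if (q->count == EVQ_CAP)
        return -1;
    q->ev[(q->head + q->count) % EVQ_CAP] = e;
    q->count++;
    return 0;
}

/* Called from the worker. Pops the oldest event; -1 if the queue is empty. */
static int evq_pop(struct evq *q, enum demo_event *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->ev[q->head];
    q->head = (q->head + 1) % EVQ_CAP;
    q->count--;
    return 0;
}
```

The worker sees events in arrival order, and what to do when the queue is
full (drop, block, or grow) stays a consumer policy decision instead of
being built into the socket interface.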

Thank you very much,
Anders



shepler

Posts: 213
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: May 1, 2007 4:03 PM   in response to: anders


On May 1, 2007, at 1:26 PM, Anders Persson wrote:

> Spencer Shepler wrote:
>
>>
>> The kernel RPC interfaces use the streams timer mechanism to
>> timeout and close idle connections; again, a nice to have
>> but not a hard requirement.
> So far there have been no plans for adding any timer mechanism to
> the interface itself. The user would
> have to rely on timeout(9F) for that functionality.

That's fine.

>> I should also mention that the
>> NFS server changes the receive buffer size/window size to
>> stop-down the client when it is not receiving data as quickly
>> as it is sending requests. Seems like that will be covered
>> with what you propose.
> Yes, you will be able to modify the buffer size via
> ksock_setsockopt().
>>
>>
>> Finally, what additional thoughts do you have about the
>> event notification mechanism. Will it deliver multiple
>> events, simultaneously for a particular socket?
> That is the current design; there is no mechanism in the interface
> that synchronizes the events; however, at least for TCP, I suppose
> that those scenarios would be uncommon since squeues should ensure
> synchronization.
>> Or will
>> it wait for one event delivery to be complete before
>> delivering the next. Would it be possible to chain or provide
>> a list of events?
> It could be possible, but the consumer can have this type of
> behavior by adding some additional logic to the callback functions.
> So for that reason I do not think the additional complexity is needed.

Agreed.

Spencer





rshoaib

Posts: 190
From: US

Registered: 3/9/05
Re: Kernel sockets design document
Posted: May 3, 2007 1:19 PM   in response to: anders

Anders Persson wrote:
> Spencer Shepler wrote:
>
> <SNIP>
>> Today, the NFS client and server, in support of RDMA transports
>> like Infiniband, will start with TCP connections and then
>> determine if RDMA is available on the interface used for
>> the connection. Most of this work is done at user-level
>> with a smaller set of code in the kernel for the final
>> setup. It would be helpful to enable the NFS client and
>> server to do this transition completely within the kernel.
>> This is a nice to have; not a requirement.
> It might be possible to do, I will look into it. :)
>
I don't think this functionality should be provided by kernel sockets.
Since besides IB we don't have any other RDMA-capable card yet, it's hard
to speculate about or design a general interface. However, in the case of
IB, the IB framework should provide the necessary interfaces for NFS to
determine if NFS over RDMA is available.

When/if the general socket interface is updated to account for RDMA-based
NICs, we should update the kernel sockets.

Rao.
