OpenSolaris

Discussions Communities Projects Download Source Browser

Home » OpenSolaris Forums » nwam » discuss

Thread: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request

Welcome, Guest Help
Login Login
Guest Settings Guest Settings
Reply to this Thread Reply to this Thread Search Forum Search Forum Back to Thread List Back to Thread List

Permlink Replies: 32 - Last Post: Sep 30, 2009 6:07 AM by: amaguire
mph

Posts: 542
From:

Registered: 5/11/05
[nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 8, 2009 8:14 AM

  Click to reply to this thread Reply



Begin forwarded message:

Date: Mon, 06 Jul 2009 15:36:11 -0700
From: Michael Hunter <Michael dot Hunter at Sun dot COM>
To: nwam-dev at opensolaris dot org
Subject: [nwam-dev] NWAM Phase 1 Code Review request


This is a request for people to review the NWAM Phase 1 code.

The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/

For those inside sun there is a cscope database
in /net/coupe.eng/builds/mph/nwam1_cr/usr/src (sorry to those external
to SWAN).

Documentation can both be found in the source tree and on the project
web page at http://opensolaris.org/os/project/nwam/

We request code review be finished by Monday July 20th.

Michael Hunter
_______________________________________________
nwam-dev mailing list
nwam-dev at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-dev
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


Darren J Moffat
darrenm@opensolaris....
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 8, 2009 8:30 AM   in response to: mph

  Click to reply to this thread Reply

Michael Hunter wrote:
>
> Begin forwarded message:
>
> Date: Mon, 06 Jul 2009 15:36:11 -0700
> From: Michael Hunter <Michael dot Hunter at Sun dot COM>
> To: nwam-dev at opensolaris dot org
> Subject: [nwam-dev] NWAM Phase 1 Code Review request
>
>
> This is a request for people to review the NWAM Phase 1 code.
>
> The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/

I've only looked at the new user creation, and the RBAC profile parts:

DJM-1 user_attr why aren't dladm,netadm,netcfg marked as roles ? Users
shouldn't login to them directly right (yes I know they have locked
accounts)

DJM-2 SUNWcrnet/postinstall how is this going to be done in an IPS world ?

DJM-3 i.passwd the change for dladm's group doesn't look like it will
work for upgrade, see how I did the home dir change for the gdm user.

--
Darren J Moffat
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] [networking-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Sep 1, 2009 2:55 PM   in response to: Darren J Moffat

  Click to reply to this thread Reply

Thanks for the review and feedback, Darren. Sorry it's taken so long
to address these...

On Wed, Jul 08, 2009 at 04:30:37PM +0100, Darren J Moffat wrote:
> Michael Hunter wrote:
>>
>> Begin forwarded message:
>>
>> Date: Mon, 06 Jul 2009 15:36:11 -0700
>> From: Michael Hunter <Michael dot Hunter at Sun dot COM>
>> To: nwam-dev at opensolaris dot org
>> Subject: [nwam-dev] NWAM Phase 1 Code Review request
>>
>>
>> This is a request for people to review the NWAM Phase 1 code.
>>
>> The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/
>
> I've only looked at the new user creation, and the RBAC profile parts:
>
> DJM-1 user_attr why aren't dladm,netadm,netcfg marked as roles ? Users
> shouldn't login to them directly right (yes I know they have locked
> accounts)

I'm okay with doing this for netadm and netcfg, as they're newly created
with the nwam phase 1 integration. It seems harder to justify the change
to dladm, though.

What advantages does making them roles offer?

> DJM-2 SUNWcrnet/postinstall how is this going to be done in an IPS world ?

I believe that the owner/group/permissions is part of the packaging
meta-data, and such updates are handled automatically by the IPS
infrastructure.

> DJM-3 i.passwd the change for dladm's group doesn't look like it will
> work for upgrade, see how I did the home dir change for the gdm user.

Good call, and thanks for the pointer to the gdm change. I've made
an analogous change for the dladm section.

-renee
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


Darren J Moffat
darrenm@opensolaris....
Re: [nwam-discuss] [networking-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Sep 2, 2009 1:53 AM   in response to: okie

  Click to reply to this thread Reply

Renee Danson Sommerfeld wrote:
>> DJM-1 user_attr why aren't dladm,netadm,netcfg marked as roles ? Users
>> shouldn't login to them directly right (yes I know they have locked
>> accounts)
>
> I'm okay with doing this for netadm and netcfg, as they're newly created
> with the nwam phase 1 integration. It seems harder to justify the change
> to dladm, though.
>
> What advantages does making them roles offer?

They can't be logged into directly even if someone does set a password
on them. It makes it much clearer what the intent of these accounts is.

>> DJM-3 i.passwd the change for dladm's group doesn't look like it will
>> work for upgrade, see how I did the home dir change for the gdm user.
>
> Good call, and thanks for the pointer to the gdm change. I've made
> an analogous change for the dladm section.

Note that there is an IPS bug for upgrades of changes to passwd entries
that could impact this, it might be ok for group updates though. Check
with David Comay.

http://defect.opensolaris.org/bz/show_bug.cgi?id=9755

--
Darren J Moffat
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] [networking-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Sep 11, 2009 2:44 PM   in response to: Darren J Moffat

  Click to reply to this thread Reply

On Wed, Sep 02, 2009 at 09:53:55AM +0100, Darren J Moffat wrote:
> Renee Danson Sommerfeld wrote:
>>> DJM-1 user_attr why aren't dladm,netadm,netcfg marked as roles ?
>>> Users shouldn't login to them directly right (yes I know they have
>>> locked accounts)
>>
>> I'm okay with doing this for netadm and netcfg, as they're newly
>> created with the nwam phase 1 integration. It seems harder to justify
>> the change
>> to dladm, though.
>>
>> What advantages does making them roles offer?
>
> They can't be logged into directly even if someone does set a password
> on them. It makes it much clearer what the intent of these accounts is.

I think I'm going to stick with my original response: I'll make the
change for the users we're introducing (netadm and netcfg), but I'm
not going to change the dladm user.

>>> DJM-3 i.passwd the change for dladm's group doesn't look like it will
>>> work for upgrade, see how I did the home dir change for the gdm
>>> user.
>>
>> Good call, and thanks for the pointer to the gdm change. I've made
>> an analogous change for the dladm section.
>
> Note that there is an IPS bug for upgrades of changes to passwd entries
> that could impact this, it might be ok for group updates though. Check
> with David Comay.
>
> http://defect.opensolaris.org/bz/show_bug.cgi?id=9755

Thanks, I'll verify where things stand with David.

-renee
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


seb

Posts: 2,205
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request
Posted: Jul 20, 2009 10:56 AM   in response to: mph

  Click to reply to this thread Reply


On Wed, 2009-07-08 at 08:14 -0700, Michael Hunter wrote:
> This is a request for people to review the NWAM Phase 1 code.
>
> The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/

Here are my libinetcfg comments. I'm still working on my nwamd
comments; I'll have that part of my review done by the end of the day
today or tomorrow.

lib/libinetcfg/common/inetcfg.c

* General: I find it strange that an icfg_handle_t can either
represent an IP interface or an IP address (i.e.: "logical
interface"). It gives many of the new functions introduced have
dual-functionality. For example, icfg_unplumb() is actually the
same as icfg_remove_ipaddr() if the handle happens to be a "logical"
interface. I would think that the model would be simpler if a
handle were to strictly refer to an IP interface. You could have a
different handle type for IP addresses if necessary.

* General: Why not simplify ifconfig.c by having it use the new
functions?

* 84: The user probably won't know what "Invalid DLPI link" means.
Most don't know what DLPI is, as it's an API, and not something that
exists at the administrative level. It would make more sense to me
to have the DLPI_ENOLINK libdlpi error map to something that prints
"link does not exist" or something similarly tangible.

* 86-90: These error strings are also quite implementation-specific,
but since there's absolutely nothing that the admin can do about
them, then it's probably not worth making them more friendly (unless
you can think of something better).

* 92: I find it simpler without a count, and instead have a
NULL-terminated array (e.g.: a last entry with a NULL error_desc).

* 116: You've just lost the actual error here. When DL_SYSERR is
returned, errno indicates the actual system error code (see
dlpi_strerror(3DLPI)). It would be good to indicate that error
instead of a cryptic "DLPI error". For example, if it was ENOMEM,
then you at least have a chance to print something meaningful given
that you already have ICFG_NO_MEMORY.

* 2068: The addrlen argument is effectively not used; it should be
removed.

* 2076,2079: You should verify sa_family == icfg_if_protocol() here.

* 2101 and almost every other place where errno gets set: ICFG_FAILURE
ends up printing "Generic failure", which is not useful given that
an actual specific error occured that was stored in errno. A
function that translates errno's to ICFG_* errors would make this
facility better. You'll need something like that anyway to address
my previous concern regarding the mishandling of DL_SYSERR.

* 2171: This restriction seems unwarranted. Moreover, the condition
it checks for can change immediately after the check, so even if it
were intentional, it's not effective.

* 2193,2214,2270,2487: These duplicate code in ifconfig.c. ifconfig.c
should be made to call these functions instead of maintaining two
versions of the same code.

* 2270: There is no longer a plumb_one_device() in ifconfig.c, it's
now ifplumb().

* 2282: Remove spurious blank line. The indentation below also
doesn't match above.

* 2290: An interface with the given name could be plumbed immediately
after the check, so it doesn't add much value. Doesn't SIOCSLIFNAME
already fail with an appropriate error code if the interface already
exists anyway?

* 2347: What is "name string" referring to here?

* 2358,2535: Use strlcpy(). strncpy() can lead to subtle bugs since
it doesn't guarantee that the result is NULL-terminated.

* 2499: Remove spurious blank line.

* 2501: Need blank line between variable declarations and code.

-Seb

_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request
Posted: Sep 22, 2009 10:31 PM   in response to: seb

  Click to reply to this thread Reply

Hi Seb,

On Mon, Jul 20, 2009 at 01:56:54PM -0400, Sebastien Roy wrote:
>
> On Wed, 2009-07-08 at 08:14 -0700, Michael Hunter wrote:
> > This is a request for people to review the NWAM Phase 1 code.
> >
> > The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/
>
> Here are my libinetcfg comments.

Thanks for the review! My responses to individual comments are in-line
below; I've also posted a webrev with the changes at

file:///net/muskogee/export/hg/nwam/nwamp1/webrev/index.html

-renee

> lib/libinetcfg/common/inetcfg.c
>
> * General: I find it strange that an icfg_handle_t can either
> represent an IP interface or an IP address (i.e.: "logical
> interface"). It gives many of the new functions introduced have
> dual-functionality. For example, icfg_unplumb() is actually the
> same as icfg_remove_ipaddr() if the handle happens to be a "logical"
> interface. I would think that the model would be simpler if a
> handle were to strictly refer to an IP interface. You could have a
> different handle type for IP addresses if necessary.

I agree with your point. However, I think the model for addresses
vs. interfaces is still a little fuzzy. "logical inteface" 0 is
still special, and the lif number must still be taken into account
when fiddling with addresses on interfaces.

While it's possible that we could put a cleaner distinction in the
library interface now, I'm not sure whatever we did now would hold
across future changes that might make that model a little more clear.
So I wouldn't prioritize doing that work right now.

I'm also unclear how much work should go into polishing libinetcfg
at this point. We had to make changes to do what we needed right
now; but I think libipadm is really the future in this space, right?

I think that argument could be a recurring theme for the libinetcfg
comments. I'll accept things that are sensible, straightforward
changes; but I don't think it makes sense to do a lot of polishing
here.

> * General: Why not simplify ifconfig.c by having it use the new
> functions?

That would add to the scope of this project by pulling in code that
we are not otherwise touching. I also think changes we make to
ifconfig would be fairly short-lived given the progress Sowmini and
team are making with libipadm.

> * 84: The user probably won't know what "Invalid DLPI link" means.
> Most don't know what DLPI is, as it's an API, and not something that
> exists at the administrative level. It would make more sense to me
> to have the DLPI_ENOLINK libdlpi error map to something that prints
> "link does not exist" or something similarly tangible.

Accept.

> * 86-90: These error strings are also quite implementation-specific,
> but since there's absolutely nothing that the admin can do about
> them, then it's probably not worth making them more friendly (unless
> you can think of something better).

Sorry, I haven't come up with anything better so far.

> * 92: I find it simpler without a count, and instead have a
> NULL-terminated array (e.g.: a last entry with a NULL error_desc).

Accept.

> * 116: You've just lost the actual error here. When DL_SYSERR is
> returned, errno indicates the actual system error code (see
> dlpi_strerror(3DLPI)). It would be good to indicate that error
> instead of a cryptic "DLPI error". For example, if it was ENOMEM,
> then you at least have a chance to print something meaningful given
> that you already have ICFG_NO_MEMORY.

Point taken; but I think given how contained the current usage is,
this falls into the "less important polishing" bucket.

> * 2068: The addrlen argument is effectively not used; it should be
> removed.

I feel dumb here. It's being used to verify that the buffer passed in
is big enough, right (lines 2085-2088)?

> * 2076,2079: You should verify sa_family == icfg_if_protocol() here.

Accept.

> * 2101 and almost every other place where errno gets set: ICFG_FAILURE
> ends up printing "Generic failure", which is not useful given that
> an actual specific error occured that was stored in errno. A
> function that translates errno's to ICFG_* errors would make this
> facility better. You'll need something like that anyway to address
> my previous concern regarding the mishandling of DL_SYSERR.

Agreed; but I'm going to go with the same response here as to your
comment about DL_SYSERR.

> * 2171: This restriction seems unwarranted. Moreover, the condition
> it checks for can change immediately after the check, so even if it
> were intentional, it's not effective.

Accept.

> * 2193,2214,2270,2487: These duplicate code in ifconfig.c. ifconfig.c
> should be made to call these functions instead of maintaining two
> versions of the same code.

Agreed; but the same answer I gave to the general comment about updating
ifconfig to use these interfaces applies here as well.

> * 2270: There is no longer a plumb_one_device() in ifconfig.c, it's
> now ifplumb().

Accept (removed that part of the comment, it seemed pretty superfluous).

> * 2282: Remove spurious blank line. The indentation below also
> doesn't match above.

Accept.

> * 2290: An interface with the given name could be plumbed immediately
> after the check, so it doesn't add much value. Doesn't SIOCSLIFNAME
> already fail with an appropriate error code if the interface already
> exists anyway?

Accept. I removed the check at 2290 and added checks for the errno on
SIOCSLIFNAME so that a more specific error (ICFG_EXISTS) is returned if
the interface already exists.

> * 2347: What is "name string" referring to here?

Misplaced comment; I've cleaned it up.

> * 2358,2535: Use strlcpy(). strncpy() can lead to subtle bugs since
> it doesn't guarantee that the result is NULL-terminated.

Accept.

> * 2499: Remove spurious blank line.

Accept.

> * 2501: Need blank line between variable declarations and code.

line 2501 is the last variable declaration, and it's followed by a blank
line.
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


seb

Posts: 2,205
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request
Posted: Sep 23, 2009 7:09 AM   in response to: okie

  Click to reply to this thread Reply


Hi Renee,

On Tue, 2009-09-22 at 22:31 -0700, Renee Danson Sommerfeld wrote:
> Hi Seb,
>
> On Mon, Jul 20, 2009 at 01:56:54PM -0400, Sebastien Roy wrote:
> >
> > On Wed, 2009-07-08 at 08:14 -0700, Michael Hunter wrote:
> > > This is a request for people to review the NWAM Phase 1 code.
> > >
> > > The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/
> >
> > Here are my libinetcfg comments.
>
> Thanks for the review! My responses to individual comments are in-line
> below; I've also posted a webrev with the changes at
>
> file:///net/muskogee/export/hg/nwam/nwamp1/webrev/index.html
>
> -renee
>
> > lib/libinetcfg/common/inetcfg.c
> >
> > * General: I find it strange that an icfg_handle_t can either
> > represent an IP interface or an IP address (i.e.: "logical
> > interface"). It gives many of the new functions introduced have
> > dual-functionality. For example, icfg_unplumb() is actually the
> > same as icfg_remove_ipaddr() if the handle happens to be a "logical"
> > interface. I would think that the model would be simpler if a
> > handle were to strictly refer to an IP interface. You could have a
> > different handle type for IP addresses if necessary.
>
> I agree with your point. However, I think the model for addresses
> vs. interfaces is still a little fuzzy. "logical inteface" 0 is
> still special, and the lif number must still be taken into account
> when fiddling with addresses on interfaces.

The API can hide those implementation details (libipadm seems to do
that). Anyway, see below.

> While it's possible that we could put a cleaner distinction in the
> library interface now, I'm not sure whatever we did now would hold
> across future changes that might make that model a little more clear.
> So I wouldn't prioritize doing that work right now.
>
> I'm also unclear how much work should go into polishing libinetcfg
> at this point. We had to make changes to do what we needed right
> now; but I think libipadm is really the future in this space, right?
>
> I think that argument could be a recurring theme for the libinetcfg
> comments. I'll accept things that are sensible, straightforward
> changes; but I don't think it makes sense to do a lot of polishing
> here.

Okay, I can live with that. My mindset going in was that you could have
structured the new libinetcfg APIs in such a way that the libipadm
replacement would have been more straightforward. It's probably too
late to go back and do this now, and I'm fine with your response.

>
> > * General: Why not simplify ifconfig.c by having it use the new
> > functions?
>
> That would add to the scope of this project by pulling in code that
> we are not otherwise touching. I also think changes we make to
> ifconfig would be fairly short-lived given the progress Sowmini and
> team are making with libipadm.

Okay.

> > * 86-90: These error strings are also quite implementation-specific,
> > but since there's absolutely nothing that the admin can do about
> > them, then it's probably not worth making them more friendly (unless
> > you can think of something better).
>
> Sorry, I haven't come up with anything better so far.

So be it.

> > * 116: You've just lost the actual error here. When DL_SYSERR is
> > returned, errno indicates the actual system error code (see
> > dlpi_strerror(3DLPI)). It would be good to indicate that error
> > instead of a cryptic "DLPI error". For example, if it was ENOMEM,
> > then you at least have a chance to print something meaningful given
> > that you already have ICFG_NO_MEMORY.
>
> Point taken; but I think given how contained the current usage is,
> this falls into the "less important polishing" bucket.

I don't understand the pushback on this one. It's a few lines of code
with a clear benefit. Here's the code:

case DL_SYSERR:
switch (errno) {
case ENOMEM:
return (ICFG_NO_MEMORY);
case EINVAL:
return (ICFG_INVALID_ARG);
}
/* FALLTHROUGH */
default:
return (ICFG_DLPI_FAILURE);
}

>
> > * 2068: The addrlen argument is effectively not used; it should be
> > removed.
>
> I feel dumb here. It's being used to verify that the buffer passed in
> is big enough, right (lines 2085-2088)?

Sorry for not being explicit. It's effectively unused because there is
only one correct range of values for addrlen, and addr has to point to a
structure of the correct size otherwise the application is borked to
begin with. In other words, the caller must pass in a buffer that is at
least sizeof (struct sockaddr_in) if the protocol is AF_INET and at
least sizeof (struct sockaddr_in6) if the protocol is AF_INET6. There's
no point in checking that because if any value that is out of range
indicates a bug in the application, and a similar bug in the application
would result in the addrlen argument having nothing to do with the
actual size of the addr buffer... The onus is on the caller to pass in
an appropriately sized buffer. The addrlen check is IMO totally
meaningless.

> > * 2101 and almost every other place where errno gets set: ICFG_FAILURE
> > ends up printing "Generic failure", which is not useful given that
> > an actual specific error occured that was stored in errno. A
> > function that translates errno's to ICFG_* errors would make this
> > facility better. You'll need something like that anyway to address
> > my previous concern regarding the mishandling of DL_SYSERR.
>
> Agreed; but I'm going to go with the same response here as to your
> comment about DL_SYSERR.

I see.

> > * 2193,2214,2270,2487: These duplicate code in ifconfig.c. ifconfig.c
> > should be made to call these functions instead of maintaining two
> > versions of the same code.
>
> Agreed; but the same answer I gave to the general comment about updating
> ifconfig to use these interfaces applies here as well.

Okay.

> > * 2290: An interface with the given name could be plumbed immediately
> > after the check, so it doesn't add much value. Doesn't SIOCSLIFNAME
> > already fail with an appropriate error code if the interface already
> > exists anyway?
>
> Accept. I removed the check at 2290 and added checks for the errno on
> SIOCSLIFNAME so that a more specific error (ICFG_EXISTS) is returned if
> the interface already exists.

Okay.

> > * 2501: Need blank line between variable declarations and code.
>
> line 2501 is the last variable declaration, and it's followed by a blank
> line.

I see now.

Thanks,
-Seb


_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request
Posted: Sep 23, 2009 1:25 PM   in response to: seb

  Click to reply to this thread Reply

On Wed, Sep 23, 2009 at 10:09:53AM -0400, Sebastien Roy wrote:
>
> On Tue, 2009-09-22 at 22:31 -0700, Renee Danson Sommerfeld wrote:
> >
> > While it's possible that we could put a cleaner distinction in the
> > library interface now, I'm not sure whatever we did now would hold
> > across future changes that might make that model a little more clear.
> > So I wouldn't prioritize doing that work right now.
> >
> > I'm also unclear how much work should go into polishing libinetcfg
> > at this point. We had to make changes to do what we needed right
> > now; but I think libipadm is really the future in this space, right?
> >
> > I think that argument could be a recurring theme for the libinetcfg
> > comments. I'll accept things that are sensible, straightforward
> > changes; but I don't think it makes sense to do a lot of polishing
> > here.
>
> Okay, I can live with that. My mindset going in was that you could have
> structured the new libinetcfg APIs in such a way that the libipadm
> replacement would have been more straightforward. It's probably too
> late to go back and do this now, and I'm fine with your response.

That would have been a good thing to do; however, we didn't do a good
job of designing the changes needed for libinetcfg from the start, so
they just sort of grew to suit our needs. :-( Apologies to the
libipadm team!!

> > > * 116: You've just lost the actual error here. When DL_SYSERR is
> > > returned, errno indicates the actual system error code (see
> > > dlpi_strerror(3DLPI)). It would be good to indicate that error
> > > instead of a cryptic "DLPI error". For example, if it was ENOMEM,
> > > then you at least have a chance to print something meaningful given
> > > that you already have ICFG_NO_MEMORY.
> >
> > Point taken; but I think given how contained the current usage is,
> > this falls into the "less important polishing" bucket.
>
> I don't understand the pushback on this one. It's a few lines of code
> with a clear benefit. Here's the code:
>
> case DL_SYSERR:
> switch (errno) {
> case ENOMEM:
> return (ICFG_NO_MEMORY);
> case EINVAL:
> return (ICFG_INVALID_ARG);
> }
> /* FALLTHROUGH */
> default:
> return (ICFG_DLPI_FAILURE);
> }

Okay, I have no problem doing that. The pushback was because I was
imagining a more thorough examination/handling of possible return
values from libdlpi. But you're right, doing the simple change you
suggest would be an easy-to-accomplish improvement. I'll include
this in my next update to pull in the IPMP handling (in response to
one of Sowmini's comments).

> > > * 2068: The addrlen argument is effectively not used; it should be
> > > removed.
> >
> > I feel dumb here. It's being used to verify that the buffer passed in
> > is big enough, right (lines 2085-2088)?
>
> Sorry for not being explicit. It's effectively unused because there is
> only one correct range of values for addrlen, and addr has to point to a
> structure of the correct size otherwise the application is borked to
> begin with. In other words, the caller must pass in a buffer that is at
> least sizeof (struct sockaddr_in) if the protocol is AF_INET and at
> least sizeof (struct sockaddr_in6) if the protocol is AF_INET6. There's
> no point in checking that because if any value that is out of range
> indicates a bug in the application, and a similar bug in the application
> would result in the addrlen argument having nothing to do with the
> actual size of the addr buffer... The onus is on the caller to pass in
> an appropriately sized buffer. The addrlen check is IMO totally
> meaningless.

Okay, I see your point now. I'll pull the addrlen parameter.
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request
Posted: Sep 24, 2009 10:56 PM   in response to: okie

  Click to reply to this thread Reply

I've updated my webrev; it can be found at

http://jurassic.eng/~okie/webrev/index.html

This includes all of my libinetcfg changes based on code review feedback
from Seb and Sowmini.

One change from an earlier reply to Seb:

On Wed, Sep 23, 2009 at 01:25:12PM -0700, Renee Danson Sommerfeld wrote:
> > >
> > > > * 2068: The addrlen argument is effectively not used; it should be
> > > > removed.
> > >
> > > I feel dumb here. It's being used to verify that the buffer passed in
> > > is big enough, right (lines 2085-2088)?
> >
> > Sorry for not being explicit. It's effectively unused because there is
> > only one correct range of values for addrlen, and addr has to point to a
> > structure of the correct size otherwise the application is borked to
> > begin with. In other words, the caller must pass in a buffer that is at
> > least sizeof (struct sockaddr_in) if the protocol is AF_INET and at
> > least sizeof (struct sockaddr_in6) if the protocol is AF_INET6. There's
> > no point in checking that because if any value that is out of range
> > indicates a bug in the application, and a similar bug in the application
> > would result in the addrlen argument having nothing to do with the
> > actual size of the addr buffer... The onus is on the caller to pass in
> > an appropriately sized buffer. The addrlen check is IMO totally
> > meaningless.
>
> Okay, I see your point now. I'll pull the addrlen parameter.

I do see your point, but I'm also not entirely comfortable with it.
It seems to me that if the size is wrong (and as the buffer pointer
is to a sockaddr, there's at least potential for address family
confusion there), then yes, the caller is broken; but at least there
will be a clear failure if the address length check is there.

I think you can make a reasonable case either way; but my inclination
and the path of least resistance both point to leaving the code as-is
in this case.

-renee
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


seb

Posts: 2,205
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request
Posted: Sep 25, 2009 5:34 AM   in response to: okie

  Click to reply to this thread Reply


On Thu, 2009-09-24 at 22:56 -0700, Renee Danson Sommerfeld wrote:
> On Wed, Sep 23, 2009 at 01:25:12PM -0700, Renee Danson Sommerfeld wrote:
> > > >
> > > > > * 2068: The addrlen argument is effectively not used; it should be
> > > > > removed.
> > > >
> > > > I feel dumb here. It's being used to verify that the buffer passed in
> > > > is big enough, right (lines 2085-2088)?
> > >
> > > Sorry for not being explicit. It's effectively unused because there is
> > > only one correct range of values for addrlen, and addr has to point to a
> > > structure of the correct size otherwise the application is borked to
> > > begin with. In other words, the caller must pass in a buffer that is at
> > > least sizeof (struct sockaddr_in) if the protocol is AF_INET and at
> > > least sizeof (struct sockaddr_in6) if the protocol is AF_INET6. There's
> > > no point in checking that because if any value that is out of range
> > > indicates a bug in the application, and a similar bug in the application
> > > would result in the addrlen argument having nothing to do with the
> > > actual size of the addr buffer... The onus is on the caller to pass in
> > > an appropriately sized buffer. The addrlen check is IMO totally
> > > meaningless.
> >
> > Okay, I see your point now. I'll pull the addrlen parameter.
>
> I do see your point, but I'm also not entirely comfortable with it.
> It seems to me that if the size is wrong (and as the buffer pointer
> is to a sockaddr, there's at least potential for address family
> confusion there), then yes, the caller is broken; but at least there
> will be a clear failure if the address length check is there.
>
> I think you can make a reasonable case either way; but my inclination
> and the path of least resistance both point to leaving the code as-is
> in this case.

Okay, it's your call, I don't feel strongly about it.
-Seb


_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 20, 2009 7:00 PM   in response to: mph

  Click to reply to this thread Reply


> This is a request for people to review the NWAM Phase 1 code.
>
> The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/
>
> For those inside sun there is a cscope database
> in /net/coupe.eng/builds/mph/nwam1_cr/usr/src (sorry to those external
> to SWAN).
>
> Documentation can both be found in the source tree and on the project
> web page at http://opensolaris.org/os/project/nwam/

Michael,

Here are my comments on nwamadm, netcfgd and the DLPI logic in nwamd.

General:

* A number of functions are named without appropriate prefixes
(e.g. nwam_) in ways that encroach upon actively evolved public
namespaces. For instance, door_switch() and and dlpi_notify().

* A number of functions use chumminess with lint to make arguments
appear to be used (e.g., `foo = foo'). This is not a robust or
clear way to handle unused arguments; please use ARGSUSED or
the NOTE() facility to mark unused arguments.

* A number of places seem to check for pthread_mutex_lock()
failure. This is needless: there are no cases short of
programmer error that will lead to pthread_mutex_lock() failing
(for a normal pthread lock).

nwamadm/Makefile:

* Given the way gettext() is used in nwamadm, you need something
like XGETFLAGS += -a -x $(PROG).xcl and an nwamadm.xcl file.

* 34: Why is LFLAGS set in here?

* 36: -I. is wrong; local header files should be included with
`#include "local.h"'.

* 30: Needless; source is only built from one file.

* 42-44: Needless rules.

* 55: Why not use lint_PROG?

nwamadm.c:

* Throughout: the ofmt API needs to be used to print stuff out
rather than rolling your own printing logic. This will also
provide parsable mode and user-selectable fields for free
(which currently seem to be missing).

* Throughout: what's up with the name zerr()? Am I to assume it's
because this code started life as zoneadm.c? Seems like it
should be called die(), and seems like you could also use a
die_nwamerr() function that takes a nwamerr and does the
nwam_strerror() a la die_dlerr() in dladm.

* Throughout: seems a shame that the libnwam walker APIs require
a non-NULL cb_ret argument, as nothing here seems to want to
consult it. Why not allow it to be NULL?

* 50: Needless blank line.

* 58, 64-72: The use of both command numbers *and* an array of
command entries seems muddled. Seems like all the functionality
tied to command numbers could be folded into state tracked in
the command entries -- e.g., a cmd_help pointer-to-function (or
just a simple const char *), and a flags field that indicates
whether e.g. "-x" can be used or whether the type needs to be
determined.

* 60: Should be cmd_handler for consistency.

* 60: Should define a typedef for this -- e.g.:

typedef int (cmd_handle_func_t)(int, char **);

... then declare cmd_handler as a pointer to this, and the
declarations on 79-84 as instances of this type.

* 77: Seems like this FMRI should be in a library header.

* 151: Unclear why this is a global; seems like it should be
declared in interact_func() and passed to eventhandler().

* 167-277: I'd expect these helper routines to be in libnwam
rather than nwamadm. Some of them already appear to be there,
such as nwam_ncu_type_to_flag(). Why are they here?

* 179, 193, 215, 230, 246, 262, 275, 297: The default cases here
seem unhelpful. I'd expect strings of "?" and enumerations
along the lines of NWAM_OBJECT_TYPE_UNKNOWN.

* 209: Presumably this should be "ncu"?

* 323: Presumably this should be ""%s: <subcommand>".

* 368, 371: This seems like abuse of return values, as
NWAM_WALK_HALTED and NWAM_SUCCESS do not appear to be intended
for this use. Instead, it seems that the walker should return
0 or 1.

* 378: Why is it safe to ignore the return value from this
function? Is it that we assume `state' will only be modified
by this function call upon success? Seems a needless
assumption.

* 451: Would seem more natural to return `type' and remove the
`num' parameter in place of an NWAM_OBJECT_TYPE_UNKNOWN value.

* 514: This seems unnecessarily obscure; don't we know
cmd_to_str(CMD_LIST) == "list"?

* 549-558: Seems like this would just be:

if (cmd_num != CMD_LIST && type == NWAM_OBJECT_TYPE_NONE)

* 560: Needless cast.

* 653-684: This should be factored out into a funciton that takes
the type and then is merely called here for NWAM_NCU_TYPE_LINK
and NWAM_NCU_TYPE_INTERFACE. For bonus points, could take a
function pointer which would be either &nwam_ncu_enable or
&nwam_ncu_disable.

* 651, 690, 693: Should be a switch statement with NWAM_SUCCESS,
NWAM_ENTITY_MULTIPLE_VALUES and default.

* 705-715: Can factor this out to a function and reuse this code with
disable_func(). For I18N to work, you need to restructure the
messages -- e.g.: "cannot enable: <blah>" -- then you'll end up
parameterizing the subcommand name (which won't be translated).

* 719: By the time we get here isn't the enable complete? That
is, seems like this should be "Enabled", not "Enabling".

* 724: `if' block should be replaced with assert(ret !=
NWAM_SUCCESS)' and the code should be unindented.

* 785-868: See previous comments regarding enable_func() code.

* 871: Would seem preferable to return a error code indicating
whether the function succeeded rather than overloading
*_UNINITALIZED for the job.

* 872, 934: Everything that uses determine_object_state() immediately
then calls add_to_profile_entry(). Seems like it would be
simpler to have add_to_profile_entry() get the state itself
rather than taking `state' and `auxstate' arguments.

* 928: How do we get here? Seems like we should assert(0).

* 932: p_entries is a linked list, not an array.

* 956: p_last_entry is gross. Does the order of the list matter?
If not, why not prepend to the list? If so, would prefer to
make the list circular and use p_prev.

* 979, 1020, 1058, 1090: Needless cast.

* 986, 988: `name' appears to be leaked.

* 988: `val' appears to be leaked.

* 1108-1156: See previous comments about using ofmt. As an aside,
the field names are generally considered part of the command
itself and are not something we translate -- and this is
important when -o <field> is supported.

* 1165-1166: Needless "optimization".

* 1196-1199: No voodoo code, please.

* 1207, 1219, 1227, 1234: Needless cast.

* 1250, 1251: Useless comments.

* 1260-1475: This function is really hard to read -- and again,
ofmt should be used to print output.

* 1260-1475: Is this intended as a debugging aid or something that
end-users would actually use? If the latter, seems like some
I18N handling is missing in the various prompts. Also, how is
the end-user supposed to make sense of the stuff output from
this? Further, it seems odd to me that the interactive support
and event monitor have been glommed together into one facility.

* 1266: `action' should be declared properly as a `const char *'
and the bogus casts on line 1278, 1422, and 1449 should be
removed.

* 1338, 1346, 1388: This works without fflush(stdout)?

* 1338, 1346, 1388: Can the user ^C out? If so, how does that
work? If not, seems a usability problem.

* 1366, 1402: Why are errors going to stdout?

* 1378, 1388: Would expect a space after the ":".

* 1432: How do you know this is a sockaddr_in? Sure looks like
this code can be reached with a sockeddr_in6.

* 1474: Remove blank line.

* 1511: See previous comment about I18N with field names.

* 1515-1528: Odd code here seems to stem from mixing interaction
and event monitoring.

* 1543: strdup() can fail. This is the problem with using
basename(3C) rather than just doing the typical:

if ((execname = strrchr(argv[0], '/')) == NULL)
execname = argv[0];
else
execname++;

You can also remove the <libgen.h> #include and -lgen link.

* 1551-1554: This means one cannot observe the events at NWAM
startup with "interact" which seems unfortunate.

* 1531: So using the "interact" subcommand always exits with a
non-zero status?

* 1547, 1554, 1565: Would be clearer as EXIT_FAILURE.

* 1556: I suppose it doesn't matter much, but `state' is leaked.

netcfgd:

* Unclear why this consists of multiple source files; it seems
like everything could be consolidated into a single source file
without sacrificing maintainability.

netcfgd/Makefile:

* 29-30: Pulling in Makefile.lib seems wrong since that's for
building libraries. Just use $(ROOT)/lib.

* 35: Remove needless HEADERS definition.

* 61: What does this $(ROOTCMD) dependency do?

netcfgd/util.h:

* Given that this only has logging functions, seems odd it's named
util.h. (But then again, I can't figure out why this exists.)

* 33: Need a /* PRINTFLIKE2 */ here.

* 39: No externs, please.

netcfgd/logging.c:

* 28-29: Unclear what this comment has to do with anything; no
debug.c here.

* 42: s/The idea for having this function is so that you can
/This function allows you to/

* 43: s/Its/It's/

* 59: Why 256? Why not vasprintf()?

* 81-90: Could be replaced with (in util.h)

#define dprintf(...) nlog(LOG_DEBUG, __VA_ARGS__)

netcfgd/main.c:

* General: I can't figure out what the zonename code is for;
we never seem to use the `zonename' variable. My nits below
assume that somehow it's supposed to do something.

* 28: Comment should provide an overview of what this daemon does
on a technical level.

* 32, 34, 38, 39, 40, et al: Superfluous #includes

* 51, 52: No excuse for these being globals.

* 57-59: Unclear what this comment has to do with anything.

* 68: Where does this code lookup SMF properties?

* 70: Where does this code manage privileges?

* 76: Is there a reason this uses LOG_NDELAY?

* 79-113: Someone *really* needs to put this in a library :-(
Looks like this is cut line-for-line from Nevada nwamd/main.c
which in turn took it from pppoe/pppoed.c.

* 124: Prefer const.

* 126: No need to define `zoneid'; just inline the call to
getzoneid() int the getzonenamebyid() call.

* 132: Why is it correct to charge on with GLOBAL_ZONENAME in
this case?

* 141-142: Just curious: what exactly are we localizing? All of
the messages here go to syslog which is not localized.

* 145: Why is this LOG_INFO? Seems like LOG_DEBUG. Also, LOG_PID
was specified to openlog() so I'm not sure why this includes the
pid as well. Finally, the message should be generated in
start_logging(), not here.

* 149-155: Indenting here is one level off.

* 162: Comment states the obvious.

* 166-168: Unclear why we want to ignore USR1/USR2/INT.

* 170: Need to resolve this XXX for SIGTHAW.

dlpi_events.c:

* Throughout: libdlpi does not use `errno' for errors. All of the
dlpi-related log messages in this file are busted.

* 34: Unclear what net/route.h has to do with this file.

* 65: This `if' clause is needless: nwamd has requested to only
receive events of these two types.

* 67: Use of 'IFF_UP' to have some relationship to the link state
seems wrong. Seems like it would be simpler to change
link_state.link_state to link_state.link_up and just have it be
a boolean.

* 71: Shouldn't this message explain the administrative impact?

* 88-102: Unclear why this function is allocating a buffer to
receive into since it does nothing with this buffer -- and even
more unclear why this buffer is sized to DLPI_PHYSADDR_MAX.
Seems like this whole function should just be:

static void *
nwamd_dlpi_thread(void *arg)
{
int rc;
dlpi_handle_t *dh = arg;

for (;;) {
rc = dlpi_recv(*dh, NULL, NULL, NULL, NULL, -1, NULL);
if (rc != DLPI_SUCCESS) {
nlog(LOG_ERR, "nwam_dlpi_thread: dlpi_recv "
"failed: %s", dlpi_strerror(rc));
break;
}
}
return (NULL);
}

* 94: Bogus cast.

* 99: Unclear where the magic notion of three failures comes from;
one would think any unexpected failure would be worth stopping
for.

* 100: resumably *dh was meant?

* 105: This function would be much easier to read if a local
variable was declared for ncu->ncu_node.u_link.

* 123: Can this happen? If so, what prevents a situation where two
thread simultaneously pass this check and race to both call
pthread_create()?

* 135, 182: Comments only tell me what is clear already from the
code.

* 136-146: dlpi_bind() is no longer needed to use dlpi_enabnotify().

* 150: Sure seems like `strdup(name)' ends up getting leaked here.
Shouldn't this be NULL?

* 160: Presumably the intent is to record the error value returned
from pthread_create() in the nlog() message?

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


mph

Posts: 542
From:

Registered: 5/11/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 23, 2009 7:26 AM   in response to: meem

  Click to reply to this thread Reply

On Mon, 20 Jul 2009 19:00:21 -0700
Peter Memishian <Peter dot Memishian at Sun dot COM> wrote:

[...]
> dlpi_events.c:
>
> * Throughout: libdlpi does not use `errno' for errors. All of the
> dlpi-related log messages in this file are busted.
[...]

The return section section of dlpi_recv(3DLPI) says:

Upon success, DLPI_SUCCESS is returned. If DL_SYSERR is
returned, errno contains the specific UNIX system error
value. Otherwise, a DLPI error value defined in <sys/d
or an error value listed in the following section is
returned.

Our code is busted with respect to the two level error codes. But in
contradiction with your statement this does say that at least dlpi_recv
could return an error via errno.

mph
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


seb

Posts: 2,205
From: US

Registered: 3/9/05
Re: [nwam-discuss] [networking-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 23, 2009 7:54 AM   in response to: mph

  Click to reply to this thread Reply


On Thu, 2009-07-23 at 07:26 -0700, Michael Hunter wrote:
> On Mon, 20 Jul 2009 19:00:21 -0700
> Peter Memishian <Peter dot Memishian at Sun dot COM> wrote:
>
> [...]
> > dlpi_events.c:
> >
> > * Throughout: libdlpi does not use `errno' for errors. All of the
> > dlpi-related log messages in this file are busted.
> [...]
>
> The return section section of dlpi_recv(3DLPI) says:
>
> Upon success, DLPI_SUCCESS is returned. If DL_SYSERR is
> returned, errno contains the specific UNIX system error
> value. Otherwise, a DLPI error value defined in <sys/d
> or an error value listed in the following section is
> returned.
>
> Our code is busted with respect to the two level error codes. But in
> contradiction with your statement this does say that at least dlpi_recv
> could return an error via errno.

All libdlpi functions behave the same way, dlpi_recv() isn't unique.
They return a libdlpi error code. If that code happens to be DL_SYSERR,
then errno indicates the system error.

-Seb


_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 23, 2009 10:17 AM   in response to: mph

  Click to reply to this thread Reply


> > * Throughout: libdlpi does not use `errno' for errors. All of the
> > dlpi-related log messages in this file are busted.
> [...]
>
> The return section section of dlpi_recv(3DLPI) says:
>
> Upon success, DLPI_SUCCESS is returned. If DL_SYSERR is
> returned, errno contains the specific UNIX system error
> value. Otherwise, a DLPI error value defined in <sys/d
> or an error value listed in the following section is
> returned.
>
> Our code is busted with respect to the two level error codes. But in
> contradiction with your statement this does say that at least dlpi_recv
> could return an error via errno.

Yes, it is possible libdlpi will set `errno' in some cases. My primary
point was that the code is broken -- and it is.

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


amaguire

Posts: 304
From:

Registered: 10/17/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 24, 2009 10:17 AM   in response to: meem

  Click to reply to this thread Reply

hi meem

many thanks for the detailed review! I'm going
to work through your netcfgd code review comments
first if that's okay. We probably won't publish
a complete updated webrev for a little while since we want
to address as much of the feedback as possible, but
I've placed an updated webrev which includes these
changes at:

http://zhadum.east/export/ws/amaguire/nwam1-bugs/webrev/

I haven't addressed the issue of moving netcfgd to
usr/src/cmd/netcfgd here yet, though I agree it's a good
idea.

Peter Memishian wrote:
> netcfgd:
>
> * Unclear why this consists of multiple source files; it seems
> like everything could be consolidated into a single source file
> without sacrificing maintainability.
>
accepted - we just have netcfgd.c now.
> netcfgd/Makefile:
>
> * 29-30: Pulling in Makefile.lib seems wrong since that's for
> building libraries. Just use $(ROOT)/lib.
>
>
done.
> * 35: Remove needless HEADERS definition.
>
>
done.


> * 61: What does this $(ROOTCMD) dependency do?
>
>
It's not needed.
> netcfgd/util.h:
>
> * Given that this only has logging functions, seems odd it's named
> util.h. (But then again, I can't figure out why this exists.)
>
>
> * 33: Need a /* PRINTFLIKE2 */ here.
>
> * 39: No externs, please.
>
I've removed this file as per your comment above.
> netcfgd/logging.c:
>
> * 28-29: Unclear what this comment has to do with anything; no
> debug.c here.
>
>
this was copy-and-paste cruft - a fair bit of
this crept into netcfgd I'm afraid.

> * 42: s/The idea for having this function is so that you can
> /This function allows you to/
>
> * 43: s/Its/It's/
>
>
accepted.
> * 59: Why 256? Why not vasprintf()?
>
>
accepted - I've also removed the thread id logging
since it's not useful for netcfgd.
> * 81-90: Could be replaced with (in util.h)
>
> #define dprintf(...) nlog(LOG_DEBUG, __VA_ARGS__)
>
>
We don't really need dprintf anyhow, we can just
use nlog(LOG_DEBUG, ...) so I've removed it.
> netcfgd/main.c:
>
> * General: I can't figure out what the zonename code is for;
> we never seem to use the `zonename' variable. My nits below
> assume that somehow it's supposed to do something.
>
>
cruft again from the nwamd -> netcfgd cut-and-paste
I'm afraid.
> * 28: Comment should provide an overview of what this daemon does
> on a technical level.
>
>
I've added this.
> * 32, 34, 38, 39, 40, et al: Superfluous #includes
>
>
I think we need syslog.h (39), and ditto
for sys/stat.h and sys/types.h since we
call umask(2). We need sys/wait.h for
wait(2) and unistd.h for pause(2). libscf.h,
locale.h, priv.h, strings.h and zone.h aren't
needed though as far as I can see.
> * 51, 52: No excuse for these being globals.
>
>
accepted - I've moved fg to main() and removed
zonename.
> * 57-59: Unclear what this comment has to do with anything.
>
>
removed.
> * 68: Where does this code lookup SMF properties?
>
>
It doesn't anymore.
> * 70: Where does this code manage privileges?
>
>
Ditto - more cruft.
> * 76: Is there a reason this uses LOG_NDELAY?
>
>
I don't think so - particularly since we call syslog()
immediately after openlog() in start_logging() now.
I've removed this.
> * 79-113: Someone *really* needs to put this in a library :-(
> Looks like this is cut line-for-line from Nevada nwamd/main.c
> which in turn took it from pppoe/pppoed.c.
>
>
Agreed.
> * 124: Prefer const.
>
> * 126: No need to define `zoneid'; just inline the call to
> getzoneid() int the getzonenamebyid() call.
>
>
> * 132: Why is it correct to charge on with GLOBAL_ZONENAME in
> this case?
>
>
I've removed the zone code since it's not needed.
>
> * 141-142: Just curious: what exactly are we localizing? All of
> the messages here go to syslog which is not localized.
>
>
nothing needs to be localized - I've removed the
POFILES definition from the Makefile and the
relevant headers. In fact, the parent directory
Makefile wasn't even including netcfgd in
MSGSUBDIRS.
> * 145: Why is this LOG_INFO? Seems like LOG_DEBUG. Also, LOG_PID
> was specified to openlog() so I'm not sure why this includes the
> pid as well. Finally, the message should be generated in
> start_logging(), not here.
>
>
accepted.
> * 149-155: Indenting here is one level off.
>
>
fixed.
> * 162: Comment states the obvious.
>
>
removed it.
> * 166-168: Unclear why we want to ignore USR1/USR2/INT.
>
>
I suspect the reasoning was USR1/USR2 were being used
by nwamd to dump config at one point. I've removed these
3 settings of signal disposition.
> * 170: Need to resolve this XXX for SIGTHAW.
>
>
This was another artifact of the copy-paste-from nwamd -
at the time, we weren't handling SIGTHAW there. Removed.
the XXX as not relevant.

Thanks again for the in-depth review!

Alan
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 24, 2009 3:20 PM   in response to: amaguire

  Click to reply to this thread Reply


> many thanks for the detailed review! I'm going
> to work through your netcfgd code review comments
> first if that's okay. We probably won't publish
> a complete updated webrev for a little while since we want
> to address as much of the feedback as possible, but
> I've placed an updated webrev which includes these
> changes at:
>
> http://zhadum.east/export/ws/amaguire/nwam1-bugs/webrev/

Alan,

The changes look good -- 150 fewer lines, too. I have a few additional
ones below in addition to the Makefile comments I sent you separately.

* General Seems cumbersome to be sending getopt() processing
failures to syslog. Further, most daemons running in the
foreground write to stderr. As such, I'd expect (a) syslog
to only be used when running as a daemon and (b) nlog() to
use fprintf() or syslog() as appropriate.

* 28-36: nit: Rejustify the comment to wrap at 79 columns and it
should format a bit better.

* 170: Default disposition of SIGTHAW is SIG_IGN, so this seems
unnecessary.

> I haven't addressed the issue of moving netcfgd to
> usr/src/cmd/netcfgd here yet, though I agree it's a good

I don't recall suggesting that; I'm fine with its current location.

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


amaguire

Posts: 304
From:

Registered: 10/17/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 27, 2009 3:10 AM   in response to: meem

  Click to reply to this thread Reply

Peter Memishian wrote:
> > many thanks for the detailed review! I'm going
> > to work through your netcfgd code review comments
> > first if that's okay. We probably won't publish
> > a complete updated webrev for a little while since we want
> > to address as much of the feedback as possible, but
> > I've placed an updated webrev which includes these
> > changes at:
> >
> > http://zhadum.east/export/ws/amaguire/nwam1-bugs/webrev/
>
> Alan,
>
> The changes look good -- 150 fewer lines, too. I have a few additional
> ones below in addition to the Makefile comments I sent you separately.
>
> * General Seems cumbersome to be sending getopt() processing
> failures to syslog. Further, most daemons running in the
> foreground write to stderr. As such, I'd expect (a) syslog
> to only be used when running as a daemon and (b) nlog() to
> use fprintf() or syslog() as appropriate.
>
>
I've modified nlog() and the usage error
to write to stderr (the former only if netcfgd
is running in the foreground of course).
> * 28-36: nit: Rejustify the comment to wrap at 79 columns and it
> should format a bit better.
>
> * 170: Default disposition of SIGTHAW is SIG_IGN, so this seems
> unnecessary.
>
>
sure. I also tidied up the Makefile in line with
your other suggestions - see:

http://zhadum.east.sun.com/export/ws/amaguire/nwam1-bugs/webrev/

...for the latest version with these changes.
> > I haven't addressed the issue of moving netcfgd to
> > usr/src/cmd/netcfgd here yet, though I agree it's a good
>
> I don't recall suggesting that; I'm fine with its current location.
>
>
my mistake - that was a code review comment from
Sowmini, based upon the precedent set by dlmgmtd.
Thanks again for all the help!

Alan
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 27, 2009 3:21 PM   in response to: amaguire

  Click to reply to this thread Reply


> > * 28-36: nit: Rejustify the comment to wrap at 79 columns and it
> > should format a bit better.
> >
> > * 170: Default disposition of SIGTHAW is SIG_IGN, so this seems
> > unnecessary.
> >
> >
> sure. I also tidied up the Makefile in line with
> your other suggestions - see:
>
> http://zhadum.east.sun.com/export/ws/amaguire/nwam1-bugs/webrev/

netcfgd/Makefile now has:

clean:
$(RM) $(PROG)

The `clean' target is not supposed to remove "final" objects like this.
That is, the `clean' target should be empty.

No need for `include ../../Makefile.msg' either.

For netcfg.c:

* 51-52, 86: Should be static.

* 70-79: Any reason not to fold this into nlog()?

* 155: Would prefer to properly grab the last component of the
path (e.g., see ipmpstat.c lines 198-201), and then use this
instead of hardcoded strings at lines 100 and 102.

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


amaguire

Posts: 304
From:

Registered: 10/17/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 28, 2009 6:28 AM   in response to: meem

  Click to reply to this thread Reply

Peter Memishian wrote:
> > > * 28-36: nit: Rejustify the comment to wrap at 79 columns and it
> > > should format a bit better.
> > >
> > > * 170: Default disposition of SIGTHAW is SIG_IGN, so this seems
> > > unnecessary.
> > >
> > >
> > sure. I also tidied up the Makefile in line with
> > your other suggestions - see:
> >
> > http://zhadum.east.sun.com/export/ws/amaguire/nwam1-bugs/webrev/
>
> netcfgd/Makefile now has:
>
> clean:
> $(RM) $(PROG)
>
> The `clean' target is not supposed to remove "final" objects like this.
> That is, the `clean' target should be empty.
>
> No need for `include ../../Makefile.msg' either.
>
> For netcfg.c:
>
> * 51-52, 86: Should be static.
>
> * 70-79: Any reason not to fold this into nlog()?
>
> * 155: Would prefer to properly grab the last component of the
> path (e.g., see ipmpstat.c lines 198-201), and then use this
> instead of hardcoded strings at lines 100 and 102.
>
>
all accepted - see

http://zhadum.east.sun.com/export/ws/amaguire/nwam1-bugs/webrev/

thanks!

Alan
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 28, 2009 10:46 AM   in response to: amaguire

  Click to reply to this thread Reply


> all accepted - see
>
> http://zhadum.east.sun.com/export/ws/amaguire/nwam1-bugs/webrev/

Looks good.

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


pwernau

Posts: 90
From: US

Registered: 3/9/05
Re: [nwam-discuss] [networking-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 21, 2009 7:15 AM   in response to: mph

  Click to reply to this thread Reply



Michael Hunter wrote:
>
> Begin forwarded message:
>
> Date: Mon, 06 Jul 2009 15:36:11 -0700
> From: Michael Hunter <Michael dot Hunter at Sun dot COM>
> To: nwam-dev at opensolaris dot org
> Subject: [nwam-dev] NWAM Phase 1 Code Review request
>
>
> This is a request for people to review the NWAM Phase 1 code.
>
> The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/

I just have some comments on how things fit together in a systemic way,
having played with these bits extensively in a VPN context. This is all
I could get to.

- usr/src/cmd/svc/milestone/net-loc

480: s/IPSec/IPsec/ (in other files also, like net-nwam)

265: do_nis()

Keep in mind that the ypservers, if listed by name, need to be in
/etc/hosts. NIS looks explicitly there and does not so regular
nameservice lookup. Not sure if this is really the place to address it,
but there probably should be some warning when the user is doing manual
config and uses a hostname.

399: It would probably be good to not use the default nsswitch.nis file.
Especially if you have both dns and NIS enabled. In general, one only
needs NIS for passwd and automount (and possibly printers). At the very
least, it would be good to have hosts and ipnodes list dns first, if
available.

- usr/src/cmd/svc/milestone/net-nwam

revert_to_legacy_loc() disables IPsec policy and IPfilter policy,
changes some properties, then re-enables them. This leaves open a
window of opportunity where there is no network security policy, which
is a bad thing. Can't you change the config file locations in the SMF
properties and do a refresh/restart instead without first disabling? Or
refresh/restart/innocuous enable in case the service wasn't started before?

- usr/src/lib/libsecdb/exec_attr.txt

So, you have nwamadm and nwamcfg as part of the "Network Management"
profile. But "Network IPsec Management" is part of "Network Security"
and "IP Filter Management" appears to be in its own bucket (which is a
pre-existing bug IMO, it should be in Network Security). It is not part
of "Network Management", possibly on purpose. Anyway, nwamcfg and
nwamadm implicitly give you the ability to manage IPfilter and IPsec
policies if you have the solaris.network.autoconf.write authorization
since you can define a property to override security policy. In fact,
you must supply a property or policy gets blanked out IIRC. This is
leaking the separation of privileges.

What is the intention here as far as the privilege and authorization
model? I'm not sure what the right balance is for your project and the
previous requirements that caused this separation in the first place.

- usr/src/lib/libsecdb/user_attr.txt

Does netadm need the IPfilter service listed?

-Paul
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
[nwam-discuss] code review feedback: exec_attr.txt questions
Posted: Sep 2, 2009 7:40 PM   in response to: pwernau

  Click to reply to this thread Reply

Hi Paul,

Sorry it's taken so long to get back to you on this.

I'm looking specifically at your comments regarding exec_attr.txt; I
think these merit their own discussion. Here are my thoughts so far...

On Tue, Jul 21, 2009 at 10:15:30AM -0400, Paul Wernau wrote:
[...]
>
> - usr/src/lib/libsecdb/exec_attr.txt
>
> So, you have nwamadm and nwamcfg as part of the "Network Management"
> profile. But "Network IPsec Management" is part of "Network Security"
> and "IP Filter Management" appears to be in its own bucket (which is a
> pre-existing bug IMO, it should be in Network Security). It is not part
> of "Network Management", possibly on purpose. Anyway, nwamcfg and
> nwamadm implicitly give you the ability to manage IPfilter and IPsec
> policies if you have the solaris.network.autoconf.write authorization
> since you can define a property to override security policy. In fact,
> you must supply a property or policy gets blanked out IIRC. This is
> leaking the separation of privileges.
>
> What is the intention here as far as the privilege and authorization
> model? I'm not sure what the right balance is for your project and the
> previous requirements that caused this separation in the first place.

It took me a while to figure out all of the issues here; and it's quite
likely there are more than the ones I've uncovered so far. But after a
long conversation with Scott Rotondo, and lots of digging through *_attr
files, here's what I've come up with.

* We can remove the exec_attr entries for nwamadm and nwamcfg; they're
unneeded since netcfgd is actually doing the file manipulation. Any
user may run nwamcfg or nwamadm; but that user must have the appropriate
solaris.network.autoconf.* auths in order to do "privileged" things
with them. This is what netcfgd verifies when the library calls into
its door interface to do reads from/writes to the repository.

* The Network Autoconf profile that's given to the Console User is more
powerful than it might seem. I think this is the core of Paul's concern,
and was of concern to Scott, as well.

The Network Autoconf profile allows you to create/modify/activate
locations. Locations determine what the security (ipsec and/or ipfilter)
policy for the system is. And nwamd, which runs as netadm, which means
it has the Network Management and Network Security profiles, can then
apply that security policy.

What this all boils down to: anyone logged in on the console of a system
that's running NWAM has control over the system's security policy. On a
laptop, that's probably not such a big deal. On desktops, it might be
more of a problem.

I do believe (after further review, Scott!) that it is *not* possible
for the console user to disable network/physical:default and enable nwam
instead; meaning that we do have the fallback to legacy network config
for folks who are concerned about Console User having this power. Or,
the Console User profile could be modified, and the Network Autoconf
profile could be given to a more specific user only.

One small thing I think we should do is remove Network Autoconf from
Network Management; the Network Autoconf profile is pretty powerful,
and should be explicitly assigned, rather than get rolled in with
another profile.

* We could make this scheme more locked down out-of-the-box, and also more
flexible, by splitting the Network Autoconf profile into separate nwam
user and nwam admin functions. The Console User would have the former,
Primary Admin the latter. From the GUI point of view, I think we could
tailor the split such that the panel applet functions (choose a location,
choose a wlan, create a wifi key) would be allowed for the nwam user, and
the config tool functions (create/modify profiles) would be allowed for
the nwam admin.

I think this could be a fairly straightforward enhancement we should
pursue quickly after integration.

A couple other nits I noticed in digging through all the various *_attr
files:

* We didn't intend to remove the Network Observability profile, did we?
It was removed in the changeset (8177) that sync'd with the build (103)
in which it was introduced (changeset 8122 in the nwam1 gate)...so it
looks like it could have been a simple mis-merge.

* Looks like we added the auth 'solaris.smf.modify.application' to the
Network Management profile because we (nwamd? who does this?) need to
be able to create property groups (the change went in with Alan's
changeset 8805, with comment "need additional auth to create property
group"). Adding this to the Network Management profile doesn't seem
quite right; I'm thinking maybe it should be in the Network Autoconf
profile? That's at least more nwam-specific.

Along the same lines, I'm thinking solaris.smf.manage.location belongs
in Network Autoconf, rather than Network Management.
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] code review feedback: exec_attr.txt questions
Posted: Sep 3, 2009 10:52 AM   in response to: okie

  Click to reply to this thread Reply


> * We didn't intend to remove the Network Observability profile, did we?
> It was removed in the changeset (8177) that sync'd with the build (103)
> in which it was introduced (changeset 8122 in the nwam1 gate)...so it
> looks like it could have been a simple mis-merge.
>
> * Looks like we added the auth 'solaris.smf.modify.application' to the
> Network Management profile because we (nwamd? who does this?) need to
> be able to create property groups (the change went in with Alan's
> changeset 8805, with comment "need additional auth to create property
> group"). Adding this to the Network Management profile doesn't seem
> quite right; I'm thinking maybe it should be in the Network Autoconf
> profile? That's at least more nwam-specific.
>
> Along the same lines, I'm thinking solaris.smf.manage.location belongs
> in Network Autoconf, rather than Network Management.

Indeed.

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


pwernau

Posts: 90
From: US

Registered: 3/9/05
Re: [nwam-discuss] code review feedback: exec_attr.txt questions
Posted: Sep 9, 2009 3:10 PM   in response to: okie

  Click to reply to this thread Reply

Hi Renee,

> * We can remove the exec_attr entries for nwamadm and nwamcfg; they're
> unneeded since netcfgd is actually doing the file manipulation. Any
> user may run nwamcfg or nwamadm; but that user must have the appropriate
> solaris.network.autoconf.* auths in order to do "privileged" things
> with them. This is what netcfgd verifies when the library calls into
> its door interface to do reads from/writes to the repository.
>

ok

> * The Network Autoconf profile that's given to the Console User is more
> powerful than it might seem. I think this is the core of Paul's concern,
> and was of concern to Scott, as well.
>

Yes, my concern was there was unintentional privilege escalation. The
console user may purposely or accidentally bypass the network security
policy.

> The Network Autoconf profile allows you to create/modify/activate
> locations. Locations determine what the security (ipsec and/or ipfilter)
> policy for the system is. And nwamd, which runs as netadm, which means
> it has the Network Management and Network Security profiles, can then
> apply that security policy.
>
> What this all boils down to: anyone logged in on the console of a system
> that's running NWAM has control over the system's security policy. On a
> laptop, that's probably not such a big deal. On desktops, it might be
> more of a problem.
>

Yes, for instance, in a situation where users are not given
administrative privileges.

> I do believe (after further review, Scott!) that it is *not* possible
> for the console user to disable network/physical:default and enable nwam
> instead; meaning that we do have the fallback to legacy network config
> for folks who are concerned about Console User having this power. Or,
> the Console User profile could be modified, and the Network Autoconf
> profile could be given to a more specific user only.
>
> One small thing I think we should do is remove Network Autoconf from
> Network Management; the Network Autoconf profile is pretty powerful,
> and should be explicitly assigned, rather than get rolled in with
> another profile.
>

Yes.

> * We could make this scheme more locked down out-of-the-box, and also more
> flexible, by splitting the Network Autoconf profile into separate nwam
> user and nwam admin functions. The Console User would have the former,
> Primary Admin the latter. From the GUI point of view, I think we could
> tailor the split such that the panel applet functions (choose a location,
> choose a wlan, create a wifi key) would be allowed for the nwam user, and
> the config tool functions (create/modify profiles) would be allowed for
> the nwam admin.
>
> I think this could be a fairly straightforward enhancement we should
> pursue quickly after integration.
>

OK. For a corporate-supplied laptop where the user has no
administrative privileges (fairly common), the user should be able to
use NWAM in the sense of plugging into arbitrary ethernet ports or
connecting to arbitrary wifi points, but they shouldn't necessarily be
able to change firewall or IPsec policy with express permission or do
other administrative-like things.

> A couple other nits I noticed in digging through all the various *_attr
> files:
>
> * We didn't intend to remove the Network Observability profile, did we?
> It was removed in the changeset (8177) that sync'd with the build (103)
> in which it was introduced (changeset 8122 in the nwam1 gate)...so it
> looks like it could have been a simple mis-merge.
>
> * Looks like we added the auth 'solaris.smf.modify.application' to the
> Network Management profile because we (nwamd? who does this?) need to
> be able to create property groups (the change went in with Alan's
> changeset 8805, with comment "need additional auth to create property
> group"). Adding this to the Network Management profile doesn't seem
> quite right; I'm thinking maybe it should be in the Network Autoconf
> profile? That's at least more nwam-specific.
>
> Along the same lines, I'm thinking solaris.smf.manage.location belongs
> in Network Autoconf, rather than Network Management.

I don't have a strong opinion here, but anything that really only has a
purpose for Network Autoconf should probably go there for good hygiene,
as you suggest.

Thanks,
Paul
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] code review feedback: exec_attr.txt questions
Posted: Sep 9, 2009 3:51 PM   in response to: pwernau

  Click to reply to this thread Reply

On Wed, Sep 09, 2009 at 06:10:52PM -0400, Paul Wernau wrote:
[...]
> > I do believe (after further review, Scott!) that it is *not* possible
> > for the console user to disable network/physical:default and enable nwam
> > instead; meaning that we do have the fallback to legacy network config
> > for folks who are concerned about Console User having this power. Or,
> > the Console User profile could be modified, and the Network Autoconf
> > profile could be given to a more specific user only.
> >
> > One small thing I think we should do is remove Network Autoconf from
> > Network Management; the Network Autoconf profile is pretty powerful,
> > and should be explicitly assigned, rather than get rolled in with
> > another profile.
> >
>
> Yes.
>
> > * We could make this scheme more locked down out-of-the-box, and also more
> > flexible, by splitting the Network Autoconf profile into separate nwam
> > user and nwam admin functions. The Console User would have the former,
> > Primary Admin the latter. From the GUI point of view, I think we could
> > tailor the split such that the panel applet functions (choose a location,
> > choose a wlan, create a wifi key) would be allowed for the nwam user, and
> > the config tool functions (create/modify profiles) would be allowed for
> > the nwam admin.
> >
> > I think this could be a fairly straightforward enhancement we should
> > pursue quickly after integration.
> >
>
> OK. For a corporate-supplied laptop where the user has no
> administrative privileges (fairly common), the user should be able to
> use NWAM in the sense of plugging into arbitrary ethernet ports or
> connecting to arbitrary wifi points, but they shouldn't necessarily be
> able to change firewall or IPsec policy with express permission or do
> other administrative-like things.

Yep, that's a good example.

If we weren't as far along as we are now, I would happily make this
change now; but given where we are and the primary audience for NWAM
phase 1, I think we need to move forward with just the removal of
Network Autoconf from Network Management. But I will also file the
RFE to break up the Network Autoconf profile, and we will try to
address that quickly. I think that will be very do-able.

I'll also make all the other, smaller changes I mentioned:

* removal of exec_attr entries for nwamadm and nwamcfg

* restoration of the Network Observability profile

* move the auths added specifically for nwam (solaris.smf.manage.location
and solaris.smf.modify.application) from Network Management to
Network Autoconf

* remove Network Autoconf from Network Management

Thanks!
renee
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


okie

Posts: 409
From: US

Registered: 3/9/05
Re: [nwam-discuss] [networking-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Sep 11, 2009 4:08 PM   in response to: pwernau

  Click to reply to this thread Reply

Hi Paul,

One more bit of code review commentary.

On Tue, Jul 21, 2009 at 10:15:30AM -0400, Paul Wernau wrote:
>
> - usr/src/lib/libsecdb/user_attr.txt
>
> Does netadm need the IPfilter service listed?

No, it does not. All of the location-related changes are performed
in the start method for the location service, which runs as root.
It probably does not need IPsec Management, either; I'll check on
that.

-renee
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


seb

Posts: 2,205
From: US

Registered: 3/9/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request (part deux)
Posted: Jul 21, 2009 12:38 PM   in response to: mph

  Click to reply to this thread Reply

And my nwamd comments:


General: How about moving nwamd out of usr/src/cmd/cmd-inet/... and up
to usr/src/cmd?

cmd/cmd-inet/lib/nwamd/README

* General: This file is a very nice touch. More subsystems need this
kind of documentation for developers.

* General: Is there room for an nwam dtrace provider that can help
track the various states of the objects maintained by nwamd to help
with debugging? The strategy for debugging problems could include
"give me the output of this dtrace script" instead of (or maybe in
addition to) the current "set the debug SMF property, enable
debugging in syslog, and give me the output of the syslog file".

* 124: It would help to expand the NCP, NCU, and ENM acronyms at least
once in the README.


usr/src/cmd/cmd-inet/lib/nwamd/Makefile

* General: For a daemon as central and critical to the system's
operation as nwamd, it should be compiled with CTF data so that it
can be debugged with dtrace and mdb. See the in.mpathd Makefile for
an example of how to do this.


cmd/cmd-inet/lib/nwamd/objects.h

* 45-52: There are over 90 other declarations of "object_name" in ON.
Please use nwam-specific prefixes in nwam data structures to help
with navigation of the code.

* 47,48: Use of "void *" for object data defeats compiler
type-checking and is not generally good practice. At a guess,
shouldn't object handle be of type "struct nwam_handle *"?
object_data could be pointer to a union or just a union (does
object_data need to be allocated separately?).


cmd/cmd-inet/lib/nwamd/ncu.h

* 70-99: Members need some identifying prefix. E.g., I have no hope
of getting at "flags" or "dhp" from cscope. Please go through all
data structures and rectify this where necessary.


cmd/cmd-inet/lib/nwamd/main.c

* 71: There's not much point in making this a global variable in
main.c since it's only used in one place in conditions.c, but I'm
wondering why it's not initialized by an SMF property like other
timer-related intervals.

* 71: Related to wlan_scan_interval, if the user manually selected a
specific WLAN link, I would think that the daemon doesn't scan every
120 seconds. For some WiFi drivers, initiating a scan can have
really strange side-effects, including blocking data for extended
periods of time and experiencing link flaps. It would be nice to
have a mode where the daemon doesn't continually scan for no reason.

* 82: And that nice README that comes with the source...

* 143: This isn't used outside of main.c and should be static. In
accordance to csctyle guidelines, it also needs to go at the top of
the file with other globals.

* 170: refresh() calls nwamd_init_ncus() along with other
nwamd_init_<stuff>() functions. Do these init routines handle
cleaning up the existing objects before initializing? If they clean
up, then why is nwamd_fini_ncus() called on line 169? If they
don't, then why don't we need to call nwamd_fini_<stuff>() before
the refresh() call on 173?

* 172: I'm wondering why SIGHUP is any different from SIGTHAW.
Shouldn't both signals essentially result in re-creating the nwamd
universe from scratch as if it were starting anew?

* 242: There's no need for any of these local variables; you can pass
the addresses of the global variables into
nwamd_lookup_boolean_property().

* 288: Consider renaming to nwamd_refresh() for easier cscope browsing.

* 347: I may be misunderstanding how this works, but what is the point
of enqueuing an event here if the daemon is immediately going to
exit() (the event might still be in the queue when that happens).
Did I miss something?

* 414: I don't see how zonename is used anywhere. It should probably
be removed (along with nwamd_lookup_zonename()).

* 424: Using dladm_status2str() to include additional information
would be good.

* 444: What's this SOFT_RESET_FILE about? It doesn't look like it's
used anywhere.

* 480: door_initialize() should return an error code that can be
converted to a printable string passed to pfail(). Another approach
would be to have door_initialize() pfail() upon failure and have it
return void (which is what you do for other init routines).

* 481: What kind of error handling did you have in mind? ;-) Killing
the daemon seems perfectly reasonable since this looks like a fatal
error condition.


cmd/cmd-inet/lib/nwamd/door_if.c

* 73-776: The length of this function and its deep indentation makes
it hard to read and thus review. I'd suggest processing of each
request type in separate functions to help readability and reduce
indentation.

* 85-88: what is this existentialist code about? If it's for lint,
please remove this and use ARGSUSED instead.

* 91: Copying the data would aleviate the need for LINTED tags.

* 116,etc.: There's a lot of duplicate auth-checking code throughout
this file. It might be simpler to have a simple table of door
commands and associated required authorizations that could be
checked prior to the switch.

* 133: Why is there no auth check for this command?

* 136: A WLAN scan request should require an authorization and/or
elevated privileges, as it negatively impacts performance during the
scan (severely for some drivers). The dladm scan-wifi subcommand
requires sys_dl_config.

* 161: I believe that the res data is leaked here. door_return()
never returns... Use alloca() instead.

* 177: What does this do? If it connects to a WLAN, then there should
be some authorization check associated with this command.

* 214,308: The code for enable and disable is almost identical. I'm
sure this could be refactored.

* 229: Why don't we record success for other audit checks in this
function? Is that a bug?

* 251,281,345,375: This locking scheme to check for object state is at
odds with the event queue scheme, and will likely lead to unexpected
results. Locking the object to check its state cannot work reliably
since there could be an event pending ahead of this request that
changes the state immediately after you drop the lock and before you
enqueue the event associated with this request. The
NWAM_ENTITY_INVALID_STATE error returned by these door calls is thus
meaningless. The state of the object cannot be reliably be checked
until previous enqueued events are processed, and this event is
dequeued.

Given the event queue design, the best you probably can do is
enqueue the request without checking the state, since you have no
idea what the pending state of the object is (unless you can think
of something better).

* 255,285,349: The same synchronization problem exists for these
objects. There could be enqueued events that change the current
active object, and thus checking the name of the current active
object without consideration for pending events will lead to
unexpected results.

* 255,285,349: The "active_loc" string could be in the middle of being
modified by nwamd_loc_activate(). Same comment for the other
strings being compared.

* 255,285,349: What verifies the input? The string could be
non-NULL-terminated garbage. There should be a strlcpy() somewhere
that verifies for overrun.

* 480-519: I don't see how this code can reliably work. Each action
gets enqueued, so you have no way to verify if it succeeded. As
such, the strategy of a rename operation being two separate
operations (destroy and refresh) can result in an object being
destroyed or a refresh action being processed when the destroy
actually failed.

* 573,588: This doesn't take into account pending requests in the
event queue. Those could very well be the requests from the door
client to disable the very objects that it wants to destroy...

* 643: Why do you need a special case for the active location? Can it
not be looked up using nwamd_object_find()? If yes, then you don't
need that block of code. If not, then there's a race condition
since "active_loc" could be in the process of being written to by
nwamd_loc_activate(), which will result in an erroneous
NWAM_ENTITY_NOT_FOUND.

* 791: I'd suggest using DOOR_REFUSE_DESC as well if you don't intent
to handle file-descriptor passing through this door.

* 797: cstyle: need blank line between variables and code.

* 827: You could make doorfd a global I suppose, and you could close it


cmd/cmd-inet/lib/nwamd/conditions.c

* 49: Yes, conditions are usually conditional. ;-)

* 94: Need block comment explaining what this does.

* 111,113: I don't grok this code. What does NWAM_CONDITION_IS mean?
This looks needlessly complicated. Why not just have a function
that returns the state of the object and have the caller do what it
wants with it...

* 565: "ret" is not descriptive enough given that it has deeper
semantics. It represents the state of some object.

* 608: There should be a comment somewhere (perhaps at the top of this
file) explaining what CONDITIONAL_ANY and CONDITIONAL_ALL mean. I'm
really not sure what this is designed to do.


usr/src/cmd/cmd-inet/lib/nwamd/dlpi_events.c

* 58,87: Please give these more distinguished names.

* 67: IFF_UP is an IP interface flag; it is not a link state. There
is a link_state_t defined in <sys/m that you can use for this
purpose.

* 82-84: This needs further clarification in the comment. You're only
interested in notifications, and notifications are only processed in
a pending dlpi_recv(). As such, you need to continuously call
dlpi_recv() even if you're not interested in any other DLPI
messages.

* 99: This needs some comments, as I'm not clear on what it implies.
Does the DLPI notification thread for this link simply vanish if it
gets three consecutive read errors? What happens to the rest of the
handling for that link if that happens?

* 103: Need block comment.

* 116: This can't possibly happen, right? Shouldn't this be an
assert()?

* 131,144,156,167: It would be simpler and less error-prone to declare
a name string of appropriate size on the stack, and have
nwam_ncu_typed_name_to_name() (that's quite a mouthful) use it
instead of having it dynamically allocate memory.

* 130,142,154: The use of errno in these locations is a bug:
dlpi_open() only sets errno if the error is DL_SYSERR. You should
use dlpi_strerror() on the returned libdlpi error code to print
something meaningful, it handles this cleanly.


cmd/cmd-inet/lib/nwamd/enm.c


* 50: Please elaborate a little bit here and explain what an "external
network modifier" abstraction is.

* 123: You're leaking object_name here.

* 134: You make this lengthy check numerous times in this function.
I'd suggest defining a boolean_t called goingonline (or something
like that) as a local variable and setting it to
(object->object_state == NWAM_STATE_OFFLINE_TO_ONLINE).

* 165: The only possible first argument for nwamd_start_child() is
CTRUN (nwamd_start_child() is only called in this one location), so
it might as well be implied and not be explicitly passed in.

* 241: What prevents another thread from getting a reference to this
object after you've dropped the lock and before you blow it away by
calling nwamd_object_fini()? I don't see how this is safe. Has
this code been stress-tested?

* 368-390: This code is identical to 304-323. This could be factored
out into its own function.

* 461: Need a block comment; check for what?

* 572: Replace all of the "return (0);" calls in the above switch with
proper "break;" calls, and you won't need this NOTREACHED lint
comment.

* 569: You don't need this default case.

* 625: You unlock the object here, but proceed to read and write to
its state afterwards. This doesn't look safe.


cmd/cmd-inet/lib/nwamd/events.c

* 476,509: Stashing the user context like this is obsoleted by
dtrace and mdb, and neither of those need to have debug enabled to
work. Can we lose this?

* 477: s/returned/return/

* 481,519,550,etc.: Since this isn't a PTHREAD_MUTEX_ERRORCHECK mutex,
failure of pthread_mutex_lock() is impossible.


cmd/cmd-inet/lib/nwamd/loc.c

* 201: Need block comment: check what?

* 224,227: This would be simpler as:

err = nwam_value_get_uint64(activationval, &activation);
nwam_value_free(activationval);
if (err != NWAM_SUCCESS) {
nlog(...);
return (0);
}

* 349: This doesn't look right since another thread could be modifying
active_loc.


cmd/cmd-inet/lib/nwamd/ncu.c

* 51,1138: This doesn't look kosher at all. Basing functionality on
link names is not correct. There looks to be some architecture
missing here.

* 191: Holding the link open is a problem for DR. This will need to
eventually be addressed. This is a more general problem than just
the DLPI interaction, obviously (I'm not sure how the existing rcm
code interacts with nwamd's configuration of IP interfaces). The
overall NWAM interaction with DR will need to be investigated fully
in order for these two features to interoperate. Please file an RFE
to track this.

* 1427: It looks like some other thread can sneak in and grab a
reference to this object after it's been unlocked and before it's
blown away by nwamd_object_fini().

* 1513-1528: This could all be collapsed:

case NWAM_STATE_OFFLINE_TO_ONLINE:
case NWAM_STATE_ONLINE_TO_OFFLINE:
case NWAM_STATE_ONLINE:
nwamd_ncu_state_machine(event->event_object);
break;

* 1545: Won't doing this after the object lock is dropped result in
potential out-of-order issues with these calls?


cmd/cmd-inet/lib/nwamd/ncu_ip.c

* 89: This function leaks both "request" and "reply".

* 271: I think this needs to be newh. Otherwise, you're setting the
broadcast on logical unit 0, and not the address that was created.

* 540-582: The strategy used to figure out if a logical interface is
needed or if it's the first address seems error prone and certainly
not thread safe (although I'm not sure if thread safety is a concern
for this function). This seems to be needed as a side-effect of the
distinction between the newly introduced icfg_add_ipaddr()
vs. icfg_set_addr() functions. It should be possible to fix this by
defining a single icfg_add_addr() function that just does the right
thing.

On a related note, why is one _ipaddr() and the other _addr()?

* 700: There's a XXX here.

* 722: A cleaner interface would be two separate functions, a plumb
function and an unplumb function. They can share the same
implementation if necessary, but at least the callers won't be
saddled with a huge ambiguous function name and needless boolean
argument.


cmd/cmd-inet/lib/nwamd/ncu_phys.c

* 69: The caller should pass in a buffer of size LIFNAMSIZ; there's no
reason for this function to dynamically allocate memory. It could
then return void and callers wouldn't need to handle failure.

* 132: Similar to my comment on the *_plumb_unplumb_*() function, this
really should be two functions whose names are indicative of what
they do instead of one ambiguous one with a boolean argument that
indicates what it should do.

* 185: There's a XXX here.


cmd/cmd-inet/lib/nwamd/objects.c

* 195: pthread_rwlock_wrlock() can't possibly fail. If it fails, then
it's because there's a bug in your code and you might as well
abort().

* 396: nwamd_object_fini() must not fail. It is called in a context
where failure is not an option (there is no way to recover from such
a failure). This must be rectified. It looks like every caller
ignores the return value, so it looks like someone already knew that
this was wrong...


usr/src/cmd/cmd-inet/lib/nwamd/util.c

* 241: Isn't this a little too late? zone_check_datalink() verifies
if a link belongs to a running zone, but nwamd holds all links open
when it starts, which is before zones boot. As such, the zone that
needs this link won't be able to use it (a link's zone property
can't be set if the link is open). I don't see how this works.
Maybe I'm not seeing the bigger picture. How can an exclusive stack
zone boot propertly while nwamd is running?

* 521: A safer model (one less prone to memory leaks) is one where the
caller passes in an appropriately-sized buffer to store the answer.


lib/libnwam/common/libnwam_events.c

* 284: Block comment needed: what does this do?


lib/libnwam/common/libnwam_ncp.c

* 553: Having the caller pass in a buffer and buffer length is a
better model, as it is less prone to memory leaks.

-Seb


_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


amaguire

Posts: 304
From:

Registered: 10/17/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request (part deux)
Posted: Sep 24, 2009 8:33 AM   in response to: seb

  Click to reply to this thread Reply

hi Seb

apologies for the delay, but I'm going to try and
tackle the nwamd enm.c comments below.
There's been a bit of flux with the code since
due to bugs etc, but the webrev at

http://zhadum.east/export/ws/amaguire/nwam1-fixes/webrev

...contains the bulk of the changes you suggest
(I've noted the one that was already fixed).
Responses inline below...

Sebastien Roy wrote:
> And my nwamd comments:
>
>
<snipped out everything bar enm.c comments>

> cmd/cmd-inet/lib/nwamd/enm.c
>
>
> * 50: Please elaborate a little bit here and explain what an "external
> network modifier" abstraction is.
>
>
done.
> * 123: You're leaking object_name here.
>
>
fixed (prior to this webrev so doesn't show up
in the diffs).
> * 134: You make this lengthy check numerous times in this function.
> I'd suggest defining a boolean_t called goingonline (or something
> like that) as a local variable and setting it to
> (object->object_state == NWAM_STATE_OFFLINE_TO_ONLINE).
>
>
good idea, fixed.
> * 165: The only possible first argument for nwamd_start_child() is
> CTRUN (nwamd_start_child() is only called in this one location), so
> it might as well be implied and not be explicitly passed in.
>
>
makes sense, I've renamed the function nwamd_ctrun().
> * 241: What prevents another thread from getting a reference to this
> object after you've dropped the lock and before you blow it away by
> calling nwamd_object_fini()? I don't see how this is safe. Has
> this code been stress-tested?
>
>
these locking issues will need to be tackled together.
Michael has been working on this so I haven't tried
to address this issue here.
> * 368-390: This code is identical to 304-323. This could be factored
> out into its own function.
>
>
done, I've created nwamd_enm_run_method().

> * 461: Need a block comment; check for what?
>
>
added a block comment.
> * 572: Replace all of the "return (0);" calls in the above switch with
> proper "break;" calls, and you won't need this NOTREACHED lint
> comment.
>
>
> * 569: You don't need this default case.
>
>
both accepted.
> * 625: You unlock the object here, but proceed to read and write to
> its state afterwards. This doesn't look safe.
>
>
I've fixed this and we now hold the object lock
while checking the state and enqueueing the
state change.

Thanks again for the review! More to come soon...

Alan
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


mph

Posts: 542
From:

Registered: 5/11/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request (part deux)
Posted: Sep 29, 2009 3:12 PM   in response to: amaguire

  Click to reply to this thread Reply

On Thu, 24 Sep 2009 16:33:05 +0100
Alan Maguire <Alan dot Maguire at Sun dot COM> wrote:

> hi Seb
>
> apologies for the delay, but I'm going to try and
> tackle the nwamd enm.c comments below.
> There's been a bit of flux with the code since
> due to bugs etc, but the webrev at
>
> http://zhadum.east/export/ws/amaguire/nwam1-fixes/webrev

door_if.c:565 current_ncu_priority_group should be accessed in some
atomic fashion

enm.c:290-293 If we are going to do something with the object it might
be better to bump the ref count. When we look it up in the new thread
it could be gone, changed, etc. If it needs to be the same then we
need to hold the lock. But even if we are not concerned about it
changing by bumping the ref count we can at least remove a source of
error paths.

enm.c:294-300 Why not strdup? The nwamd_object_name should already be
of validated name length. If you bump the ref count above then you
don't need to copy this name at all.

ncp.c:126 strlcpy

Michael
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


amaguire

Posts: 304
From:

Registered: 10/17/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request (part deux)
Posted: Sep 30, 2009 6:07 AM   in response to: mph

  Click to reply to this thread Reply

Michael Hunter wrote:
> On Thu, 24 Sep 2009 16:33:05 +0100
> Alan Maguire <Alan dot Maguire at Sun dot COM> wrote:
>
>
>> hi Seb
>>
>> apologies for the delay, but I'm going to try and
>> tackle the nwamd enm.c comments below.
>> There's been a bit of flux with the code since
>> due to bugs etc, but the webrev at
>>
>> http://zhadum.east/export/ws/amaguire/nwam1-fixes/webrev
>>
>
> door_if.c:565 current_ncu_priority_group should be accessed in some
> atomic fashion
>
>
fixed - I've used the active_ncp_mutex here, and for times
when we access active_ncph.
> enm.c:290-293 If we are going to do something with the object it might
> be better to bump the ref count. When we look it up in the new thread
> it could be gone, changed, etc. If it needs to be the same then we
> need to hold the lock. But even if we are not concerned about it
> changing by bumping the ref count we can at least remove a source of
> error paths.
>
>
So rather than taking the lock, when we create the ENM
execution thread, we increase the reference count of the
object so that other threads will not dispose of it. Then
when ENM execution has completed, we drop the reference
count and unlock the object. I've added a few object
functions -

nwamd_object_release_and_preserve() - drops the mutex
without decreasing the refcnt
nwamd_object_release_after_preserve() - drops the mutex
and the refcount by 2 (compensating for the previous preserve)
nwamd_object_release_and_destroy_after_preserve() - drops
the mutex and the refcount by 3 (compensating for the previous
preserve)
> enm.c:294-300 Why not strdup? The nwamd_object_name should already be
> of validated name length. If you bump the ref count above then you
> don't need to copy this name at all.
>
>
fixed.
> ncp.c:126 strlcpy
>
>
done. Webrev has been updated. Thanks!

Alan
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


amaguire

Posts: 304
From:

Registered: 10/17/05
Re: [nwam-discuss] NWAM Phase 1 Code Review request (part deux)
Posted: Sep 29, 2009 7:50 AM   in response to: seb

  Click to reply to this thread Reply

I've respun the webrev with enm.c
code review changes to include the
door_if.c comments responded to below.
Thanks again!

Sebastien Roy wrote:
> And my nwamd comments:
>
>
>
<snip>
> cmd/cmd-inet/lib/nwamd/door_if.c
>
> * 73-776: The length of this function and its deep indentation makes
> it hard to read and thus review. I'd suggest processing of each
> request type in separate functions to help readability and reduce
> indentation.
>
>
I've changed the door handing to utilize a table
containing request type, auth required and the
appropriate processing function (as you suggest
below).
> * 85-88: what is this existentialist code about? If it's for lint,
> please remove this and use ARGSUSED instead.
>
>
Fixed.
> * 91: Copying the data would aleviate the need for LINTED tags.
>
>
That would be a bit expensive I think - the door_if.c
code has been simplified a bit in that we simply fill in the
(maximally-sized) buffer used in the door call with the
data we wish to return.
> * 116,etc.: There's a lot of duplicate auth-checking code throughout
> this file. It might be simpler to have a simple table of door
> commands and associated required authorizations that could be
> checked prior to the switch.
>
>
Fixed (see above).
> * 133: Why is there no auth check for this command?
>
>
Fixed.
> * 136: A WLAN scan request should require an authorization and/or
> elevated privileges, as it negatively impacts performance during the
> scan (severely for some drivers). The dladm scan-wifi subcommand
> requires sys_dl_config.
>
>
Fixed - scan requests require AUTOCONF_REFRESH_AUTH,
while requests to read the scan results require
AUTOCONF_READ_AUTH. WLAN selection/key setting
requires AUTOCONF_WRITE_AUTH.
> * 161: I believe that the res data is leaked here. door_return()
> never returns... Use alloca() instead.
>
>
I've changed the door-related memory management
approach. Now a maximally-sized buffer is used in
door calls, and responses simply fill it in, with no need
to allocate memory.
> * 177: What does this do? If it connects to a WLAN, then there should
> be some authorization check associated with this command.
>
>
This code supports users making explicit WLAN
selections. I've changed it to require
AUTOCONF_WRITE_AUTH (ditto for WLAN
key setting).
> * 214,308: The code for enable and disable is almost identical. I'm
> sure this could be refactored.
>
>
Good point, done.
> * 229: Why don't we record success for other audit checks in this
> function? Is that a bug?
>
>
I pinged Renee on this one, here's her response (thanks
Renee!):

"The only events that *nwamd* audits are enables and disables of
profiles. Config update and remove events (which make up the rest
of what's being handled here) are audited by netcfgd. You can see
this in nwam_backend_door_server() in libnwam (which is really the
guts of netcfgd)."

> * 251,281,345,375: This locking scheme to check for object state is at
> odds with the event queue scheme, and will likely lead to unexpected
> results. Locking the object to check its state cannot work reliably
> since there could be an event pending ahead of this request that
> changes the state immediately after you drop the lock and before you
> enqueue the event associated with this request. The
> NWAM_ENTITY_INVALID_STATE error returned by these door calls is thus
> meaningless. The state of the object cannot be reliably be checked
> until previous enqueued events are processed, and this event is
> dequeued.
>
> Given the event queue design, the best you probably can do is
> enqueue the request without checking the state, since you have no
> idea what the pending state of the object is (unless you can think
> of something better).
>
>
Good point. We've made this change (though prior
to this webrev, so it won't show up in the diffs).
> * 255,285,349: The same synchronization problem exists for these
> objects. There could be enqueued events that change the current
> active object, and thus checking the name of the current active
> object without consideration for pending events will lead to
> unexpected results.
>
Removed these checks. We enqueue the action event
regardless and check the active object when handling it.
> * 255,285,349: The "active_loc" string could be in the middle of being
> modified by nwamd_loc_activate(). Same comment for the other
> strings being compared.
>
>
The active_loc and active_ncp strings are now mutex-protected.
> * 255,285,349: What verifies the input? The string could be
> non-NULL-terminated garbage. There should be a strlcpy() somewhere
> that verifies for overrun.
>
>
Fixed - we now strlcpy() into the char arrays "name" and
"parent", and bail with an error if the string is too big.
> * 480-519: I don't see how this code can reliably work. Each action
> gets enqueued, so you have no way to verify if it succeeded. As
> such, the strategy of a rename operation being two separate
> operations (destroy and refresh) can result in an object being
> destroyed or a refresh action being processed when the destroy
> actually failed.
>
>
I've filed

11614 nwamd needs to handle object renaming in a better way


...and will address this separately as the fix here is
a bit more involved I suspect.
> * 573,588: This doesn't take into account pending requests in the
> event queue. Those could very well be the requests from the door
> client to disable the very objects that it wants to destroy...
>
>
nwam_[enm|loc|known_wlan|ncu]_action() - which
are called in door_if.c to handle destroy requests -
each enqueue an OBJECT_ACTION event. This event
in turn triggers a direct call to nwam_*_handle_fini_event().
So the destruction is handled in the context of the
event handling thread, and though intermediate events may
be in the queue when we receive the destruction request,
they will not interfere with the destruction of the object
(which will be enqueued after any such intermediate events).
> * 643: Why do you need a special case for the active location? Can it
> not be looked up using nwamd_object_find()? If yes, then you don't
> need that block of code.
It can indeed.
> If not, then there's a race condition
> since "active_loc" could be in the process of being written to by
> nwamd_loc_activate(), which will result in an erroneous
> NWAM_ENTITY_NOT_FOUND.
>
> * 791: I'd suggest using DOOR_REFUSE_DESC as well if you don't intent
> to handle file-descriptor passing through this door.
>
>
Done.
> * 797: cstyle: need blank line between variables and code.
>
>
Fixed.
> * 827: You could make doorfd a global I suppose, and you could close it
>
>
Done. I've also added an nwamd_door_fini() function
we call on shutdown to prevent stray door requests
after we destroy configuration objects.

Updated webrev at:

http://zhadum.east/export/ws/amaguire/nwam1-fixes/webrev

Thanks!

Alan
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss


meem

Posts: 3,083
From: US

Registered: 3/9/05
Re: [nwam-discuss] Fw: [nwam-dev] NWAM Phase 1 Code Review request
Posted: Jul 23, 2009 9:36 PM   in response to: mph

  Click to reply to this thread Reply


> Date: Mon, 06 Jul 2009 15:36:11 -0700
> From: Michael Hunter <Michael dot Hunter at Sun dot COM>
> To: nwam-dev at opensolaris dot org
> Subject: [nwam-dev] NWAM Phase 1 Code Review request
>
>
> This is a request for people to review the NWAM Phase 1 code.
>
> The code can be found at http://cr.opensolaris.org/~mph/nwam1_cr/
>
> For those inside sun there is a cscope database
> in /net/coupe.eng/builds/mph/nwam1_cr/usr/src (sorry to those external
> to SWAN).
>
> Documentation can both be found in the source tree and on the project
> web page at http://opensolaris.org/os/project/nwam/

Michael,

Sorry for the delay; my feedback on nwamcfg.c follows. I do not have time
to review all of libnwam, but if there are specific parts the team would
like me to look at, please let me know.

* General: many of the same comments I made for nwamadm apply
to this file -- e.g., zerr(), die_nwamerr(), CMD_* vs command
tables, etc. In the interest of brevity, I have not repeated
previous comments that apply directly to nwamcfg (e.g.,
problematic use of basename()).

* General: a lot of very repetitious code -- I have enumerated a
number of instances below. FWIW, a lot of these issues are due
to the libnwam API itself -- e.g., sets of parallel functions
that differ only in the handle they take and their name. It
seems like an OO design would've simplified the code
considerably -- e.g., each object could've had an ops vector
associated with it that the various callers could use without
repeatedly concerning themselves with the type of the object.

* General: because of the above, I have mostly focused on smaller
opportunities for code cleanup. That is, I find the libnwam API
problematic because it leads to rampant duplication and code
sprawl for consumers like nwamcfg, but aside from a couple of
examples to explain what we could be done "someday" I have
avoided commenting on it unless I have a specific and
straightforward suggestion for streamlining the code.

* General: assorted uses of printf() without localization.
(e.g., 918-920, 1984, 3060, 4331, 4335, ...)

* General: assorted spurious ARGSUSED directives -- e.g.: 3051,
3076, 3101, ...

* General: the various got_*_handle globals seem unnecessary;
shouldn't we be able to just check if the handles are NULL?

* General: numerous calls to check_scope() could be more simply
handled by the function dispatcher, rather than repeated in each
*_func().

* General: several blocks of code have something like:

if (got_enm_handle) {
prop_type = pt_to_prop_type(pt_type, NWAM_OBJECT_TYPE_ENM);
if (prop_type == NULL)
goto prop_error;

/* check if property can be set */
if (is_prop_read_only(prop_type, NWAM_OBJECT_TYPE_ENM))
goto read_only;
if (!show_prop_test(prop_type, enm_prop_display_entry_table,
NWAM_OBJECT_TYPE_ENM))
goto skip_prop;
/* ... */
} else if (got_wlan_handle) {
prop_type = pt_to_prop_type(pt_type,
NWAM_OBJECT_TYPE_KNOWN_WLAN);
if (prop_type == NULL)
goto prop_error;

/* check if property can be set */
if (is_prop_read_only(prop_type, NWAM_OBJECT_TYPE_KNOWN_WLAN))
goto read_only;
if (!show_prop_test(prop_type, wlan_prop_display_entry_table,
NWAM_OBJECT_TYPE_KNOWN_WLAN))
goto skip_prop;
/* ... */
} else if (got_ncu_handle) {
/* ... */

This code could be simplified significantly if there was a
function that consulted the various handles to determine which
one was live and returned the appropriate object type. Then
you could do something like:

handle_type = get_handle_type();
if ((prop_type = pt_to_prop_type(pt_type, handle_type)) == NULL)
goto prop_error;
if (is_prop_read_only(prop_type, handle_type))
goto read_only;
if (!show_prop(prop_type, prop_table[handle_type], handle_type))
goto skip_prop;

... and only switch on the handle_type once you need to get
into logic that cannot easily be parameterized.

* 32: I don't see any block comments near the end of
nwamcfg_grammar.y.

* 163, 167, 187, 189, 193, 214: What is the significance of these
comments?

* 219, 228, 236: Why are these arrays NULL-terminated?

* 240, 286, 304, 359: Why are these all different structures?
They all have identical contents and field names. These should
be folded into a common nwamcfg_prop_table_entry structure and
all members should have a consistent prefix for easy cscoping.

* 399: Seems like `output_file' could be local to export_func()
and passed to the various walker callbacks via the walker
callback argument.

* 405: Seems like quoted_strings could just be an argument to
output_prop_val().

* 447, 451, 455: Indeed, errors need to be reported.

* 483: Doesn't seem to have anything in particular to do with
strings -- would expect `array_free(void **array, int nelem)'.

* 962-986: Assumes English locale.

* 1043: prompt[] only accounts for MAX_NWAM_TYPE_LEN once, but
there can be two types.

* 1084: Why is it safe to ignore the return value from
string_to_yyin()?

* 1218-1265: Should be condensed to a single is_prop_listprop()
function that takes the table pointer. (That said, I would've
expected this attribute to be part of the object itself and
thus tracked by libnwam rather than nwamcfg.)

* 1355-1378: Mapping between NWAM classes and NWAM object types
seems more suited to libnwam.

* 1390-1392: Under what situations do callers pass in NULL?

* 1396, 1398: Crashes if strdup() fails.

* 1418-1659: Most error paths in this function leak `newname'.

* 1459-1519: Handling of `ret' can be centralized (cmd_res1_type
and cmd_res2_type can be converted to strings if precise error
messages are desired).

* 1525-1625: Lots of repetitious parallel logic here. This is
exactly the kind of code that would be simple and clear had
libnwam associated an ops vector with its objects rather than
having a set of equivalent but distinctly-named methods for
every operation.

* 1654-1655, 1925-1926, 1977-1978, 4859-4860, 2140: Shouldn't
alloc_cmd() be able to guarantee this?

* 1664: nit: s/than/the.

* 1671-1738: Logic to process the return codes should be shared
across these functions -- e.g.:

int
destroy_ncp_callback(nwam_ncp_handle_t ncp, void *arg)
{
return (destroy_ret(nwam_ncp_destroy(ncp,
NWAM_FLAG_DO_NOT_FREE));
}

* 1752-1764: What happens to nwamcfg when this to fails part-way
through?

* 1778-1783: Given the way this function is used, it would seem
simpler to change this be have_ncus() { return (1) } and have
the caller check if the NWAM_WALK_HALTED was returned to
determine whether or not the list was empty.

* 1896: If nwam_ncp_walk_ncus() fails, it appears we delete the
ncp regardless.

* 1935-1955: Duplicates logic in cancel_func().

* 1973-1982: Duplicates logic in end_func().

* 2019-2052: Should be backed by a common implementation that
takes a propname -- but moreover, I'd expect these utility
routines to be provided by libnwam.

* 2028, 2030, 2046, 2048: Returning -1 seems incorrect since it's
out-of-range for the type. Would seem simplest to add e.g.
NWAM_NCU_TYPE_UNKNOWN to the enums.

* 2143, 3577: nit: Superfluous `return'.

* 2201, 2221: Leaks `name'.

* 2223: nit: Superfluous blank.

* 2240-2344: This logic could be shortened considerably by having
the table pointers stored in an array that's keyed by the type
-- i.e.:

int
prop_to_pt(const char *nwam_prop, nwam_object_type_t type)
{
nwam_prop_table_t *table;

assert(type >= 0 && type < NWAM_OBJECT_TYPE_KNOWN_WLAN);
table = prop_tables[type]]]]namefunc)(obj, &name);
if (ret != NWAM_SUCCESS) {
warn_nwerr("List error: cannot get %s", objname);
return (1);
}

(void) printf("\t%s\n", name);
free(name);
return (0);
}

... then e.g.:

int
list_enm_callback(nwam_enm_handle_t enm, void *arg)
{
return (list_obj_callback(enm, arg, "ENM", "ENMs"));
}

* 3181-3258: This is really hard to read. One minor improvement
would be to have the logic at 3193-3257 just build the string
arrays, and have a single loop iterate through them at the end
and centralize the fprintf() and NWAM_VALUE_DELIMITER_STR logic.

* 3322-3334: Needs to share a common implementation with
output_propname().

* 3322-3334: Shouldn't bisect the *_listprop() routines.

* 3431, ...: nit: Why is list_msg local to each `if' clause?

* 3556-3558: Why is this needed?

* 3564: Shouldn't list_ncu_callback() already print this?

* 3573, 4120: Superfluous check.

* 3581-3590, 3671-3681, 3717-3727, 3769-3779: Functions can easily
be consolidated into a single function that takes a type.

* 3830-3832: L1 exports would be cleaner to implement as an
export[] array indexed cmd_res1_type -- e.g. 3915-3937 would
essentially be:

if (cmd->cmd_res1_type == RT1_UNKNOWN) {
for (i = RT1_MIN; i < RT1_MAX; i++)
export_l1[i] =cmd_res1_type] = B_TRUE;
}

* 3888: Short comment explaining the purpose of the
write_to_file check would be helpful.

* 4166-4191: See previous comments about get_handle_type().

* 4244-4275: See previous comments about get_handle_type().

* 4296: How can we get here with prop_type != NULL?

* 4343-4611: Significant amount of duplicate code here that
should be refactored -- e.g., lines 4356-4362, 4426-4432,
4493-4500, 4568-4574 all identical (and seems like functionality
that could partially be built into output_prop_val()), and much
of the user input processing could be factored out to a common
function and parameterize based on the type.

* 4338-4340: Seems more complex to do it this way than to
just have 4332 and 4335 print the closing brace.

* 4621-462: nits: s/new/newly/, s/value/a value/,
s/current/the current/

* 4659-4770: Duplicate walking logic should be factored out.

* 4830: Indeed, an error here would be good.

* 4839-4852: Essentially duplicates the logic from 1084-1095.

* 4899-4908: What would go wrong if we attempted to use a
file that wasn't a "regular file"?

* 4854-4865: Duplicates logic in end_func().

* 4918-4925: Duplicates logic in destroy_func().

* 4959: Shouldn't cmd_file_mode already be B_FALSE?

--
meem
_______________________________________________
nwam-discuss mailing list
nwam-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/nwam-discuss





Terms of Use | Privacy | Trademarks | Copyright Policy | Site Guidelines
Your use of this web site or any of its content or software indicates your agreement to be bound by these Terms of Use.
© 2010, Oracle Corporation and/or its affiliates

Oracle