OpenSolaris


User Libraries

Status 11/02/07

Many of the shared libraries are now building. If you do a make -k install in usr/src/lib you'll get a good pass at a number of them. The GNU toolchain has an issue that causes it to create incorrect relocations. Basically, if you have a recursive call to the primary function from within a function, you will get the following after the link stage.

text relocation failed

To correct this, declare the called function extern within the calling function.
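A minimal sketch of the workaround, assuming the failing case is a function that calls itself. The name sum_to is hypothetical. The block-scope extern declaration is the only change from ordinary code; whether it avoids the bad relocation depends on the toolchain version in use.

```c
/* Hypothetical example: a self-recursive function. */
unsigned long
sum_to(unsigned long n)
{
	/*
	 * Workaround: redeclare the callee as extern inside the caller,
	 * ahead of the recursive call.
	 */
	extern unsigned long sum_to(unsigned long);

	if (n == 0)
		return (0);
	return (n + sum_to(n - 1));
}
```

The declaration is legal C and does not change what the function computes, so it should be safe to leave in place after the toolchain is fixed.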

Others may show up.

For libnsl we had to create the front end language for g++ and cpp. You will need to update your GNU toolchain and install the ones we have created.

libc status 01/08/2007

Outline:

  • How libc/ppc was made
    • Copy and modify
    • Trend toward more portable code
    • Makefile.com
  • What is in libc – Issues
    • genassym
    • C Preprocessor used for asm
    • pre-ANSI C function prototypes
    • Assembly code
    • Floating-point
    • Unimplemented functions
  • Porting Schedule
  • Testing
  • Documentation

How libc/ppc was made

Copy and modify


Many aspects of Solaris/PPC, not just libc, are made by copying existing code and modifying. It is a judgement call, made on a case-by-case basis, what existing code to use as the starting point.

Sometimes, it is best to start from the old Solaris/PPC 2.6 code. This is almost always appropriate for small leaf nodes that deal more with instruction-set architecture, which has not changed much over the last decade. Other times, it seems better to start with modern (ONNV) code for another processor, and modify for PowerPC. This is more appropriate for things that interact with operating system services, where the interfaces may have changed considerably.

Then, there is the question of which processor to use as a starting point. Our current work on 32-bit PowerPC is more like x86 in some ways, but more like Sparc in other ways.

PowerPC is more like x86:

  1. Both are 32-bit only. Yes, there are 64-bit PowerPC machines, but we are not porting to those, yet. Yes, there are 64-bit x86 machines, but that is practically a different ISA, so we need not consider that for our purposes, here.
  2. Both are little-endian, see note:1, so layout of some data structures can be cribbed off of IA32 code.
  3. Both carve out a portion of each userland process address space for the kernel address space, whereas sparc has its own completely separate kernel address space.
  4. There is no dynamic reconfiguration to deal with, so physical memory management is easier.

PowerPC is more like sparc:

  1. Both are big-endian, see note:1, so layout of some data structures can be cribbed off of sparc code.
  2. Both have 32 general-purpose (integer) registers, whereas x86 has significantly fewer registers and almost none are truly general-purpose. And, as a consequence …
  3. Both almost always pass all function arguments in registers, whereas x86 must almost always pass arguments on the stack.
  4. Both have 32-bit, 64-bit, and 128-bit binary IEEE 754 floating-point and a set of directly addressed floating-point registers, whereas x86 has its own 80-bit extended format, and has a mixed bag of floating-point stack and SSE registers.
  5. Both use some form of OpenFirmware.

note:1 PowerPC can work in big-endian or little-endian mode. Solaris/PPC supports both. Solaris 2.6 code was little-endian only; Solaris 2.11 code is big-endian, but code is written to work both ways. Some libc functions were built using a merge of x86 and sparc code for 2.11, and Tim Marsland's 2.6 PPC big-endian workspace.
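As an illustration of what "written to work both ways" can look like, here is a small sketch. It is not taken from libc; the helper names load_be32 and host_is_little_endian are hypothetical. The first function reads a big-endian value using shifts, so it gives the same answer on either kind of host. The second shows the usual run-time probe for code that really does need to know.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read a 32-bit big-endian value from a byte
 * buffer.  Only shifts are used, so host byte order does not matter.
 */
uint32_t
load_be32(const unsigned char *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/*
 * Hypothetical helper: returns 1 on a little-endian host and 0 on a
 * big-endian host, by looking at the first byte of a known word.
 */
int
host_is_little_endian(void)
{
	const uint32_t probe = 1;

	return (*(const unsigned char *)&probe == 1);
}
```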

Trend toward more portable code


Solaris 2.6 had a relatively small amount of code in usr/src/lib/libc/port and more code in usr/src/lib/libc/${ISA} for each instruction-set architecture. The trend over the last decade has been to rely more on libc/port. This has been driven by the higher cost of maintaining code for more processors, and by faster processors and better compilers making it less important to do ISA-specific optimizations using hand-coded assembly language.

Many libc/ppc functions from Solaris/PPC 2.6 were abandoned. That is, they simply did not make it into the port to Solaris/PPC 2.11. And, good riddance. The libc/port function was used, instead. There are probably yet more libc/ppc 2.6 functions that could be removed, after a more careful review. I am not completely confident that my first pass did the job perfectly. It was a pretty quick-and-dirty first pass.

Makefile.com


The starting point for the file usr/src/lib/libc/ppc/Makefile.com was usr/src/lib/libc/i386/Makefile.com. There are some files that have a corresponding name, such as i386data.o vs. ppcdata.o. There are files that exist only in i386, such as lxstat.o, xmknod.o and xstat.o. So far, these differences have been an annoyance, because, when there are changes to 2.11 libc/i386, it is very likely that we want to make the corresponding change to libc/ppc, but automatic merging of Makefile.com can fail due to mismatches in surrounding context. I am considering coming up with empty files for lxstat.o, xmknod.o, and xstat.o, and maybe making a few other changes, just to make automatic merging easier.

What is in libc


Subdirectories


Each processor (and libc/port) has the following subdirectories:
  • crt C runtime
  • etc
  • fp floating-point functions
  • gen general-purpose
  • inc header files
  • sys system calls (not header files, like other 'sys' directories)
  • threads
  • unwind

[[ More detail to be added, later. — Guy Shaw ]]

Issues


genassym


genassym.c and offset.in have the same issues as genassym in usr/src/uts. The compiler used on the build machine must be an x86 compiler, which does not necessarily produce the same results as a PowerPC compiler, for things like offsets within data structures.

As more libc functions are fleshed out, this problem may get worse. But, if we move to a strategy of doing more in C, taking advantage of GNU C inline asm functions, then the problem might be reduced, or go away, entirely. But, since there are many .s files in libc/port, they would have to be taken care of by a more clever, more general-purpose solution. But, it could be done.
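To make the offsets problem concrete, here is a rough sketch of the kind of work a genassym-style program does. It is not the real genassym.c; the structure demo_ctx, its members, and the function emit_offset are all hypothetical. The point is that offsetof() is evaluated by whatever compiler builds the generator. Under the i386 System V ABI, for example, a double that follows an int in a structure typically lands at offset 4, while 32-bit PowerPC ABIs typically align it to 8.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure; stands in for a real libc structure. */
struct demo_ctx {
	int	dc_flags;
	double	dc_scale;	/* offset depends on the target ABI */
};

/*
 * Format one assembler-visible constant, in the style of a generated
 * header.  Returns the number of characters written (as snprintf does).
 */
int
emit_offset(char *buf, size_t len, const char *name, size_t value)
{
	return (snprintf(buf, len, "#define\t%s\t0x%lx\n", name,
	    (unsigned long)value));
}
```

A caller would pass something like offsetof(struct demo_ctx, dc_scale) as the value. If the generator is compiled for the build machine rather than the target, the emitted constant reflects the build machine's layout.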

C Preprocessor used for asm


Cpp is not a general-purpose preprocessor. It is specialized for the C programming language. That specialization has advantages and disadvantages.

Any attempt to use cpp for anything other than C programs is likely to run into trouble. The less like C the subject programming language, the more likely it is that there will be some problem with comments, token-pasting behavior, line continuation, preprocessor arithmetic, or something. The problems get worse if there is much in the way of "clever" use of preprocessor features, such as using token pasting to construct identifiers from fragments of identifiers.

Construction of identifiers using ANSI C token pasting fails for assembly language in the case of local identifiers that start with '.'. A period is not part of a valid C identifier, so ANSI C token pasting will not work. Unfortunately, this is not some exotic corner case; construction of local identifiers is very useful. It would not do to confine construction of identifiers to global symbols.

Token pasting is also used to create symbols of the form: identifier@qualifier, where 'qualifier' is 'ha', 'la', 'plt', etc. Sparc has similar kinds of symbols. Fortunately, whitespace is allowed between identifier and qualifier. But that just means we lucked out on that issue, not that we can ignore the cpp problem.

There is a significant amount of code in Solaris that depends on Sun's cpp, which suppresses null comments entirely, rather than treating them as a single space. This is used to construct identifiers a/**/b, where 'ab' is the identifier that is needed. This code occurs not only in processor-specific code for all processors, but in common code.

Due to the legacy of common code, Solaris/PPC 2.11 continues to rely on Sun's cpp.
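For comparison, here is a small sketch of the ANSI C way to build an identifier from fragments, using the ## operator. The macro name CONCAT and the identifiers foo, val, and get_fooval are hypothetical. This form works for ordinary C identifiers. With an ANSI preprocessor the null-comment idiom (a/**/b) yields two separate tokens instead, because the comment is replaced by a space.

```c
/* ANSI C token pasting: the result must be a single valid token. */
#define	CONCAT(a, b)	a##b

/* Expands to: int fooval = 42; */
int CONCAT(foo, val) = 42;

/* The pasted name is an ordinary identifier from here on. */
int
get_fooval(void)
{
	return (fooval);
}
```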

pre-ANSI C function prototypes


Much of the C code that came from Solaris/PPC 2.6 has C functions with pre-ANSI C prototypes. The code builds, and much of it probably still works. But, it would be nice to convert to ANSI-C prototypes. Of course, the most portable thing to do would be to use the preprocessor and generate both ANSI and K&R function declarations, but I don't know if anyone cares about that, now.

Some code has been converted, where I was already visiting it for some other reason, but the conversion is not complete. Not even close.
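For anyone who does care about supporting both, here is a sketch of one function definition written so that either kind of compiler accepts it. The function clamp_add is hypothetical, and keying on __STDC__ is just one common way to make the choice.

```c
/*
 * One definition, two headers: ANSI prototype style when the compiler
 * claims ANSI conformance, K&R style otherwise.
 */
#if defined(__STDC__)
int
clamp_add(int a, int b, int limit)
#else
int
clamp_add(a, b, limit)
	int a;
	int b;
	int limit;
#endif
{
	int sum = a + b;

	return (sum > limit ? limit : sum);
}
```

The cost is visible even in this small case: every function header is written twice, which is a good argument for converting to ANSI prototypes only.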

Assembly code


Assembly language code from Solaris/PPC 2.6 has had to be modified, partly to account for differences in notation, but mostly to update the mnemonics used.

Some changes are to ensure correct code for both 32-bit and 64-bit hardware, others are changes in usage of simplified mnemonics.

Here is a table of opcodes that have had to be modified.

  old     new      page[1]  note
  -----   ------   -------  ----------
  cmp     cmpw     8-28     cmp, L=0
  cmpi    cmpwi    8-29     cmpi, L=0
  cmpl    cmplw    8-30     cmpl, L=0
  cmpli   cmplwi   8-31     cmpli, L=0
  sl      slw      8-158
  sli     slwi     8-155    rlwinm
  sri     srwi     8-155    rlwinm

  [1] page: Page number in MPCFPE32B, Rev. 3, 2005/09.

Floating-point


libc/ppc/fp — quadruple (128-bit) floating-point arithmetic.
  • Not in i386; In sparc; some in sparcv9
  • Low priority
  • Reference:
        Title: Doubled-Precision IEEE Standard 754 Floating-Point Arithmetic
        Author: Kahan, W.
        Date: 1987-Feb-26
  • Reference:
        Title: Software for Doubled-Precision Floating-Point Computations
        Author: Linnainmaa, Seppo
        ACM TOMS vol 7 no 3, September 1981, pages 272-283
        Date: 1981-Sep
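The references above concern doubled-precision arithmetic, in which a value is carried as an unevaluated sum of two floating-point numbers. As an illustration only, and not a description of the libc/ppc/fp code, here is the well-known error-free addition step that such arithmetic is commonly built on. The type dd and the function two_sum are hypothetical names. The sketch assumes IEEE 754 double arithmetic with round-to-nearest and no excess intermediate precision.

```c
/* Hypothetical type: a value held as hi + lo, with lo the rounding error. */
struct dd {
	double	hi;
	double	lo;
};

/*
 * Add two doubles and recover the rounding error exactly.
 * After the call, hi + lo equals a + b with no loss.
 */
struct dd
two_sum(double a, double b)
{
	struct dd r;
	double bv;	/* the part of b that actually landed in the sum */

	r.hi = a + b;
	bv = r.hi - a;
	r.lo = (a - (r.hi - bv)) + (b - bv);
	return (r);
}
```

Code like this is sensitive to compiler options that relax floating-point semantics, which is one reason thorough floating-point testing is costly.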

See also the Testing section, for testing of floating-point.

[[ Fill in more detail. — Guy Shaw ]]

Unimplemented functions


Many functions in libc/ppc are not implemented, at all, or are partially coded, but have some portion that is unimplemented. These can be located by looking for two patterns:

  1. calls to libcunimplemented()
  2. preprocessor directives that conditionally compile based on preprocessor symbols of the form XXX*

All calls to libcunimplemented() in C code pass no arguments. So you can find all calls by searching for the string, 'libcunimplemented()'.

In assembly language, all occurrences can be found by searching for the regular expression, 'bl\t+libcunimplemented', where '\t' is a tab.

Searching for all calls to libcunimplemented and for all /^#if.*defined(XXX_/ would generate a pretty complete list of things to do. However, that list contains no information about priority (urgency and/or importance), schedule, or anything else. See the Porting Schedule section.

Porting Schedule


In order to meet our next milestone, userland process, we do not really need any libraries, not even libc. We can create a statically linked program that does its own traps to do system calls directly. But, the milestone after that, single user prompt with Bourne shell, would require at least a good bit of libc, and some other libraries.

If we do things in the "natural" order, that is, if we wait until userland processes work before we write and test libc functions, then there will be considerable extra delay in the schedule. A great deal of writing and testing needs to be done "out of order", before any testing can be done on a live system running Solaris/PPC.

See the Testing section, immediately below.

We will definitely need many of the general-purpose functions and system calls. We can probably skip many less frequently called functions. I have not done a detailed analysis of what is needed. I will probably run truss on Bourne shell to collect the list of functions and system calls it really needs just to come up and do something light-duty but realistic. We want to make "accommodations" for early bringup, but we don't want to cross the line, where we would be just plain cheating.

We can probably postpone work on floating-point. However, that has not changed much, so we might do it because it is cheap, and effectively just comes along for the ride, since much of the cost of porting and testing can be shared. Thorough testing of floating-point arithmetic, on the other hand, can be very costly.

I am hoping that single-user prompt using Bourne shell will not require any support for multi-threading. Threads is one of those areas that has undergone a great deal of revision over the last decade, including Roger Faulkner's "great reform". This is one area of libc that might be better done by throwing away all Solaris/PPC 2.6 code and starting anew from Solaris/x86 2.11.

What I fear is that, even though Bourne shell does not require fully functional multi-threading support, the new thread model means that support for multi-threading is an integral part of even the simplest process, and so, in modern Solaris, it can never again be treated as a bolt-on. That is as it should be. But, it means we may have to implement the bulk of the libc/ppc/thread functions, up front.

Testing


A nice thing about working on libraries is that, except for things like libc/ppc/sys (system calls), almost everything can be tested on a hosted system, with the same ISA. For example, Linux on an Apple.

Test suite

  • QA at Sun. Q: Are Sun's test suites available under CDDL license?
  • Low priority background task for me: obtain info about Sun's QA arsenal for testing libraries, especially libc.

Paranoia – for testing floating-point

  • Low priority background task for me: obtain, build, and test Paranoia

Documentation


Show work


Much of the existing Solaris code gets low marks for documentation because the source code lacks background material, an overview of program logic, rationale, original work, etc. Only polished code makes it into the gate. Everything else is jettisoned.