Search Results

Search found 7 results on 1 pages for 'user9154181'.

Page 1/1 | 1

Much Ado About Nothing: Stub Objects

- by user9154181

The Solaris 11 link-editor (ld) contains support for a new type of object that we call a stub object. A stub object is a shared object, built entirely from mapfiles, that supplies the same linking interface as the real object, while containing no code or data. Stub objects cannot be executed — the runtime linker will kill any process that attempts to load one. However, you can link to a stub object as a dependency, allowing the stub to act as a proxy for the real version of the object. You may well wonder if there is a point to producing an object that contains nothing but linking interface. As it turns out, stub objects are very useful for building large bodies of code such as Solaris. In the last year, we've had considerable success in applying them to one of our oldest and thorniest build problems. In this discussion, I will describe how we came to invent these objects, and how we apply them to building Solaris. This posting explains where the idea for stub objects came from, and details our long and twisty journey from hallway idea to standard link-editor feature. I expect that these details are mainly of interest to those who work on Solaris and its makefiles, those who have done so in the past, and those who work with other similar bodies of code. A subsequent posting will omit the history and background details, and instead discuss how to build and use stub objects. If you are mainly interested in what stub objects are, and don't care about the underlying software war stories, I encourage you to skip ahead. The Long Road To Stubs This all started for me with an email discussion in May of 2008, regarding a change request that was filed in 2002, entitled: 4631488 lib/Makefile is too patient: .WAITs should be reduced This CR encapsulates a number of cronic issues with Solaris builds: We build Solaris with a parallel make (dmake) that tries to build as much of the code base in parallel as possible. There is a lot of code to build, and we've long made use of parallelized builds to get the job done quicker. This is even more important in today's world of massively multicore hardware. Solaris contains a large number of executables and shared objects. Executables depend on shared objects, and shared objects can depend on each other. Before you can build an object, you need to ensure that the objects it needs have been built. This implies a need for serialization, which is in direct opposition to the desire to build everying in parallel. To accurately build objects in the right order requires an accurate set of make rules defining the things that depend on each other. This sounds simple, but the reality is quite complex. In practice, having programmers explicitly specify these dependencies is a losing strategy: It's really hard to get right. It's really easy to get it wrong and never know it because things build anyway. Even if you get it right, it won't stay that way, because dependencies between objects can change over time, and make cannot help you detect such drifing. You won't know that you got it wrong until the builds break. That can be a long time after the change that triggered the breakage happened, making it hard to connect the cause and the effect. Usually this happens just before a release, when the pressure is on, its hard to think calmly, and there is no time for deep fixes. As a poor compromise, the libraries in core Solaris were built using a set of grossly incomplete hand written rules, supplemented with a number of dmake .WAIT directives used to group the libraries into sets of non-interacting groups that can be built in parallel because we think they don't depend on each other. From time to time, someone will suggest that we could analyze the built objects themselves to determine their dependencies and then generate make rules based on those relationships. This is possible, but but there are complications that limit the usefulness of that approach: To analyze an object, you have to build it first. This is a classic chicken and egg scenario. You could analyze the results of a previous build, but then you're not necessarily going to get accurate rules for the current code. It should be possible to build the code without having a built workspace available. The analysis will take time, and remember that we're constantly trying to make builds faster, not slower. By definition, such an approach will always be approximate, and therefore only incremantally more accurate than the hand written rules described above. The hand written rules are fast and cheap, while this idea is slow and complex, so we stayed with the hand written approach. Solaris was built that way, essentially forever, because these are genuinely difficult problems that had no easy answer. The makefiles were full of build races in which the right outcomes happened reliably for years until a new machine or a change in build server workload upset the accidental balance of things. After figuring out what had happened, you'd mutter "How did that ever work?", add another incomplete and soon to be inaccurate make dependency rule to the system, and move on. This was not a satisfying solution, as we tend to be perfectionists in the Solaris group, but we didn't have a better answer. It worked well enough, approximately. And so it went for years. We needed a different approach — a new idea to cut the Gordian Knot. In that discussion from May 2008, my fellow linker-alien Rod Evans had the initial spark that lead us to a game changing series of realizations: The link-editor is used to link objects together, but it only uses the ELF metadata in the object, consisting of symbol tables, ELF versioning sections, and similar data. Notably, it does not look at, or understand, the machine code that makes an object useful at runtime. If you had an object that only contained the ELF metadata for a dependency, but not the code or data, the link-editor would find it equally useful for linking, and would never know the difference. Call it a stub object. In the core Solaris OS, we require all objects to be built with a link-editor mapfile that describes all of its publically available functions and data. Could we build a stub object using the mapfile for the real object? It ought to be very fast to build stub objects, as there are no input objects to process. Unlike the real object, stub objects would not actually require any dependencies, and so, all of the stubs for the entire system could be built in parallel. When building the real objects, one could link against the stub objects instead of the real dependencies. This means that all the real objects can be built built in parallel too, without any serialization. We could replace a system that requires perfect makefile rules with a system that requires no ordering rules whatsoever. The results would be considerably more robust. We immediately realized that this idea had potential, but also that there were many details to sort out, lots of work to do, and that perhaps it wouldn't really pan out. As is often the case, it would be necessary to do the work and see how it turned out. Following that conversation, I set about trying to build a stub object. We determined that a faithful stub has to do the following: Present the same set of global symbols, with the same ELF versioning, as the real object. Functions are simple — it suffices to have a symbol of the right type, possibly, but not necessarily, referencing a null function in its text segment. Copy relocations make data more complicated to stub. The possibility of a copy relocation means that when you create a stub, the data symbols must have the actual size of the real data. Any error in this will go uncaught at link time, and will cause tragic failures at runtime that are very hard to diagnose. For reasons too obscure to go into here, involving tentative symbols, it is also important that the data reside in bss, or not, matching its placement in the real object. If the real object has more than one symbol pointing at the same data item, we call these aliased symbols. All data symbols in the stub object must exhibit the same aliasing as the real object. We imagined the stub library feature working as follows: A command line option to ld tells it to produce a stub rather than a real object. In this mode, only mapfiles are examined, and any object or shared libraries on the command line are are ignored. The extra information needed (function or data, size, and bss details) would be added to the mapfile. When building the real object instead of the stub, the extra information for building stubs would be validated against the resulting object to ensure that they match. In exploring these ideas, I immediately run headfirst into the reality of the original mapfile syntax, a subject that I would later write about as The Problem(s) With Solaris SVR4 Link-Editor Mapfiles. The idea of extending that poor language was a non-starter. Until a better mapfile syntax became available, which seemed unlikely in 2008, the solution could not involve extentions to the mapfile syntax. Instead, we cooked up the idea (hack) of augmenting mapfiles with stylized comments that would carry the necessary information. A typical definition might look like: # DATA(i386) __iob 0x3c0 # DATA(amd64,sparcv9) __iob 0xa00 # DATA(sparc) __iob 0x140 iob; A further problem then became clear: If we can't extend the mapfile syntax, then there's no good way to extend ld with an option to produce stub objects, and to validate them against the real objects. The idea of having ld read comments in a mapfile and parse them for content is an unacceptable hack. The entire point of comments is that they are strictly for the human reader, and explicitly ignored by the tool. Taking all of these speed bumps into account, I made a new plan: A perl script reads the mapfiles, generates some small C glue code to produce empty functions and data definitions, compiles and links the stub object from the generated glue code, and then deletes the generated glue code. Another perl script used after both objects have been built, to compare the real and stub objects, using data from elfdump, and validate that they present the same linking interface. By June 2008, I had written the above, and generated a stub object for libc. It was a useful prototype process to go through, and it allowed me to explore the ideas at a deep level. Ultimately though, the result was unsatisfactory as a basis for real product. There were so many issues: The use of stylized comments were fine for a prototype, but not close to professional enough for shipping product. The idea of having to document and support it was a large concern. The ideal solution for stub objects really does involve having the link-editor accept the same arguments used to build the real object, augmented with a single extra command line option. Any other solution, such as our prototype script, will require makefiles to be modified in deeper ways to support building stubs, and so, will raise barriers to converting existing code. A validation script that rederives what the linker knew when it built an object will always be at a disadvantage relative to the actual linker that did the work. A stub object should be identifyable as such. In the prototype, there was no tag or other metadata that would let you know that they weren't real objects. Being able to identify a stub object in this way means that the file command can tell you what it is, and that the runtime linker can refuse to try and run a program that loads one. At that point, we needed to apply this prototype to building Solaris. As you might imagine, the task of modifying all the makefiles in the core Solaris code base in order to do this is a massive task, and not something you'd enter into lightly. The quality of the prototype just wasn't good enough to justify that sort of time commitment, so I tabled the project, putting it on my list of long term things to think about, and moved on to other work. It would sit there for a couple of years. Semi-coincidentally, one of the projects I tacked after that was to create a new mapfile syntax for the Solaris link-editor. We had wanted to do something about the old mapfile syntax for many years. Others before me had done some paper designs, and a great deal of thought had already gone into the features it should, and should not have, but for various reasons things had never moved beyond the idea stage. When I joined Sun in late 2005, I got involved in reviewing those things and thinking about the problem. Now in 2008, fresh from relearning for the Nth time why the old mapfile syntax was a huge impediment to linker progress, it seemed like the right time to tackle the mapfile issue. Paving the way for proper stub object support was not the driving force behind that effort, but I certainly had them in mind as I moved forward. The new mapfile syntax, which we call version 2, integrated into Nevada build snv_135 in in February 2010: 6916788 ld version 2 mapfile syntax PSARC/2009/688 Human readable and extensible ld mapfile syntax In order to prove that the new mapfile syntax was adequate for general purpose use, I had also done an overhaul of the ON consolidation to convert all mapfiles to use the new syntax, and put checks in place that would ensure that no use of the old syntax would creep back in. That work went back into snv_144 in June 2010: 6916796 OSnet mapfiles should use version 2 link-editor syntax That was a big putback, modifying 517 files, adding 18 new files, and removing 110 old ones. I would have done this putback anyway, as the work was already done, and the benefits of human readable syntax are obvious. However, among the justifications listed in CR 6916796 was this We anticipate adding additional features to the new mapfile language that will be applicable to ON, and which will require all sharable object mapfiles to use the new syntax. I never explained what those additional features were, and no one asked. It was premature to say so, but this was a reference to stub objects. By that point, I had already put together a working prototype link-editor with the necessary support for stub objects. I was pleased to find that building stubs was indeed very fast. On my desktop system (Ultra 24), an amd64 stub for libc can can be built in a fraction of a second: % ptime ld -64 -z stub -o stubs/libc.so.1 -G -hlibc.so.1 \ -ztext -zdefs -Bdirect ... real 0.019708910 user 0.010101680 sys 0.008528431 In order to go from prototype to integrated link-editor feature, I knew that I would need to prove that stub objects were valuable. And to do that, I knew that I'd have to switch the Solaris ON consolidation to use stub objects and evaluate the outcome. And in order to do that experiment, ON would first need to be converted to version 2 mapfiles. Sub-mission accomplished. Normally when you design a new feature, you can devise reasonably small tests to show it works, and then deploy it incrementally, letting it prove its value as it goes. The entire point of stub objects however was to demonstrate that they could be successfully applied to an extremely large and complex code base, and specifically to solve the Solaris build issues detailed above. There was no way to finesse the matter — in order to move ahead, I would have to successfully use stub objects to build the entire ON consolidation and demonstrate their value. In software, the need to boil the ocean can often be a warning sign that things are trending in the wrong direction. Conversely, sometimes progress demands that you build something large and new all at once. A big win, or a big loss — sometimes all you can do is try it and see what happens. And so, I spent some time staring at ON makefiles trying to get a handle on how things work, and how they'd have to change. It's a big and messy world, full of complex interactions, unspecified dependencies, special cases, and knowledge of arcane makefile features... ...and so, I backed away, put it down for a few months and did other work... ...until the fall, when I felt like it was time to stop thinking and pondering (some would say stalling) and get on with it. Without stubs, the following gives a simplified high level view of how Solaris is built: An initially empty directory known as the proto, and referenced via the ROOT makefile macro is established to receive the files that make up the Solaris distribution. A top level setup rule creates the proto area, and performs operations needed to initialize the workspace so that the main build operations can be launched, such as copying needed header files into the proto area. Parallel builds are launched to build the kernel (usr/src/uts), libraries (usr/src/lib), and commands. The install makefile target builds each item and delivers a copy to the proto area. All libraries and executables link against the objects previously installed in the proto, implying the need to synchronize the order in which things are built. Subsequent passes run lint, and do packaging. Given this structure, the additions to use stub objects are: A new second proto area is established, known as the stub proto and referenced via the STUBROOT makefile macro. The stub proto has the same structure as the real proto, but is used to hold stub objects. All files in the real proto are delivered as part of the Solaris product. In contrast, the stub proto is used to build the product, and then thrown away. A new target is added to library Makefiles called stub. This rule builds the stub objects. The ld command is designed so that you can build a stub object using the same ld command line you'd use to build the real object, with the addition of a single -z stub option. This means that the makefile rules for building the stub objects are very similar to those used to build the real objects, and many existing makefile definitions can be shared between them. A new target is added to the Makefiles called stubinstall which delivers the stub objects built by the stub rule into the stub proto. These rules reuse much of existing plumbing used by the existing install rule. The setup rule runs stubinstall over the entire lib subtree as part of its initialization. All libraries and executables link against the objects in the stub proto rather than the main proto, and can therefore be built in parallel without any synchronization. There was no small way to try this that would yield meaningful results. I would have to take a leap of faith and edit approximately 1850 makefiles and 300 mapfiles first, trusting that it would all work out. Once the editing was done, I'd type make and see what happened. This took about 6 weeks to do, and there were many dark days when I'd question the entire project, or struggle to understand some of the many twisted and complex situations I'd uncover in the makefiles. I even found a couple of new issues that required changes to the new stub object related code I'd added to ld. With a substantial amount of encouragement and help from some key people in the Solaris group, I eventually got the editing done and stub objects for the entire workspace built. I found that my desktop system could build all the stub objects in the workspace in roughly a minute. This was great news, as it meant that use of the feature is effectively free — no one was likely to notice or care about the cost of building them. After another week of typing make, fixing whatever failed, and doing it again, I succeeded in getting a complete build! The next step was to remove all of the make rules and .WAIT statements dedicated to controlling the order in which libraries under usr/src/lib are built. This came together pretty quickly, and after a few more speed bumps, I had a workspace that built cleanly and looked like something you might actually be able to integrate someday. This was a significant milestone, but there was still much left to do. I turned to doing full nightly builds. Every type of build (open, closed, OpenSolaris, export, domestic) had to be tried. Each type failed in a new and unique way, requiring some thinking and rework. As things came together, I became aware of things that could have been done better, simpler, or cleaner, and those things also required some rethinking, the seeking of wisdom from others, and some rework. After another couple of weeks, it was in close to final form. My focus turned towards the end game and integration. This was a huge workspace, and needed to go back soon, before changes in the gate would made merging increasingly difficult. At this point, I knew that the stub objects had greatly simplified the makefile logic and uncovered a number of race conditions, some of which had been there for years. I assumed that the builds were faster too, so I did some builds intended to quantify the speedup in build time that resulted from this approach. It had never occurred to me that there might not be one. And so, I was very surprised to find that the wall clock build times for a stock ON workspace were essentially identical to the times for my stub library enabled version! This is why it is important to always measure, and not just to assume. One can tell from first principles, based on all those removed dependency rules in the library makefile, that the stub object version of ON gives dmake considerably more opportunities to overlap library construction. Some hypothesis were proposed, and shot down: Could we have disabled dmakes parallel feature? No, a quick check showed things being build in parallel. It was suggested that we might be I/O bound, and so, the threads would be mostly idle. That's a plausible explanation, but system stats didn't really support it. Plus, the timing between the stub and non-stub cases were just too suspiciously identical. Are our machines already handling as much parallelism as they are capable of, and unable to exploit these additional opportunities? Once again, we didn't see the evidence to back this up. Eventually, a more plausible and obvious reason emerged: We build the libraries and commands (usr/src/lib, usr/src/cmd) in parallel with the kernel (usr/src/uts). The kernel is the long leg in that race, and so, wall clock measurements of build time are essentially showing how long it takes to build uts. Although it would have been nice to post a huge speedup immediately, we can take solace in knowing that stub objects simplify the makefiles and reduce the possibility of race conditions. The next step in reducing build time should be to find ways to reduce or overlap the uts part of the builds. When that leg of the build becomes shorter, then the increased parallelism in the libs and commands will pay additional dividends. Until then, we'll just have to settle for simpler and more robust. And so, I integrated the link-editor support for creating stub objects into snv_153 (November 2010) with 6993877 ld should produce stub objects PSARC/2010/397 ELF Stub Objects followed by the work to convert the ON consolidation in snv_161 (February 2011) with 7009826 OSnet should use stub objects 4631488 lib/Makefile is too patient: .WAITs should be reduced This was a huge putback, with 2108 modified files, 8 new files, and 2 removed files. Due to the size, I was allowed a window after snv_160 closed in which to do the putback. It went pretty smoothly for something this big, a few more preexisting race conditions would be discovered and addressed over the next few weeks, and things have been quiet since then. Conclusions and Looking Forward Solaris has been built with stub objects since February. The fact that developers no longer specify the order in which libraries are built has been a big success, and we've eliminated an entire class of build error. That's not to say that there are no build races left in the ON makefiles, but we've taken a substantial bite out of the problem while generally simplifying and improving things. The introduction of a stub proto area has also opened some interesting new possibilities for other build improvements. As this article has become quite long, and as those uses do not involve stub objects, I will defer that discussion to a future article.

Read the article
Using Stub Objects

- by user9154181

Having told the long and winding tale of where stub objects came from and how we use them to build Solaris, I'd like to focus now on the the nuts and bolts of building and using them. The following new features were added to the Solaris link-editor (ld) to support the production and use of stub objects: -z stub This new command line option informs ld that it is to build a stub object rather than a normal object. In this mode, it accepts the same command line arguments as usual, but will quietly ignore any objects and sharable object dependencies. STUB_OBJECT Mapfile Directive In order to build a stub version of an object, its mapfile must specify the STUB_OBJECT directive. When producing a non-stub object, the presence of STUB_OBJECT causes the link-editor to perform extra validation to ensure that the stub and non-stub objects will be compatible. ASSERT Mapfile Directive All data symbols exported from the object must have an ASSERT symbol directive in the mapfile that declares them as data and supplies the size, binding, bss attributes, and symbol aliasing details. When building the stub objects, the information in these ASSERT directives is used to create the data symbols. When building the real object, these ASSERT directives will ensure that the real object matches the linking interface presented by the stub. Although ASSERT was added to the link-editor in order to support stub objects, they are a general purpose feature that can be used independently of stub objects. For instance you might choose to use an ASSERT directive if you have a symbol that must have a specific address in order for the object to operate properly and you want to automatically ensure that this will always be the case. The material presented here is derived from a document I originally wrote during the development effort, which had the dual goals of providing supplemental materials for the stub object PSARC case, and as a set of edits that were eventually applied to the Oracle Solaris Linker and Libraries Manual (LLM). The Solaris 11 LLM contains this information in a more polished form. Stub Objects A stub object is a shared object, built entirely from mapfiles, that supplies the same linking interface as the real object, while containing no code or data. Stub objects cannot be used at runtime. However, an application can be built against a stub object, where the stub object provides the real object name to be used at runtime, and then use the real object at runtime. When building a stub object, the link-editor ignores any object or library files specified on the command line, and these files need not exist in order to build a stub. Since the compilation step can be omitted, and because the link-editor has relatively little work to do, stub objects can be built very quickly. Stub objects can be used to solve a variety of build problems: Speed Modern machines, using a version of make with the ability to parallelize operations, are capable of compiling and linking many objects simultaneously, and doing so offers significant speedups. However, it is typical that a given object will depend on other objects, and that there will be a core set of objects that nearly everything else depends on. It is necessary to impose an ordering that builds each object before any other object that requires it. This ordering creates bottlenecks that reduce the amount of parallelization that is possible and limits the overall speed at which the code can be built. Complexity/Correctness In a large body of code, there can be a large number of dependencies between the various objects. The makefiles or other build descriptions for these objects can become very complex and difficult to understand or maintain. The dependencies can change as the system evolves. This can cause a given set of makefiles to become slightly incorrect over time, leading to race conditions and mysterious rare build failures. Dependency Cycles It might be desirable to organize code as cooperating shared objects, each of which draw on the resources provided by the other. Such cycles cannot be supported in an environment where objects must be built before the objects that use them, even though the runtime linker is fully capable of loading and using such objects if they could be built. Stub shared objects offer an alternative method for building code that sidesteps the above issues. Stub objects can be quickly built for all the shared objects produced by the build. Then, all the real shared objects and executables can be built in parallel, in any order, using the stub objects to stand in for the real objects at link-time. Afterwards, the executables and real shared objects are kept, and the stub shared objects are discarded. Stub objects are built from a mapfile, which must satisfy the following requirements. The mapfile must specify the STUB_OBJECT directive. This directive informs the link-editor that the object can be built as a stub object, and as such causes the link-editor to perform validation and sanity checking intended to guarantee that an object and its stub will always provide identical linking interfaces. All function and data symbols that make up the external interface to the object must be explicitly listed in the mapfile. The mapfile must use symbol scope reduction ('*'), to remove any symbols not explicitly listed from the external interface. All global data exported from the object must have an ASSERT symbol attribute in the mapfile to specify the symbol type, size, and bss attributes. In the case where there are multiple symbols that reference the same data, the ASSERT for one of these symbols must specify the TYPE and SIZE attributes, while the others must use the ALIAS attribute to reference this primary symbol. Given such a mapfile, the stub and real versions of the shared object can be built using the same command line for each, adding the '-z stub' option to the link for the stub object, and omiting the option from the link for the real object. To demonstrate these ideas, the following code implements a shared object named idx5, which exports data from a 5 element array of integers, with each element initialized to contain its zero-based array index. This data is available as a global array, via an alternative alias data symbol with weak binding, and via a functional interface. % cat idx5.c int _idx5[5] = { 0, 1, 2, 3, 4 }; #pragma weak idx5 = _idx5 int idx5_func(int index) { if ((index 4)) return (-1); return (_idx5[index]); } A mapfile is required to describe the interface provided by this shared object. % cat mapfile $mapfile_version 2 STUB_OBJECT; SYMBOL_SCOPE { _idx5 { ASSERT { TYPE=data; SIZE=4[5] }; }; idx5 { ASSERT { BINDING=weak; ALIAS=_idx5 }; }; idx5_func; local: *; }; The following main program is used to print all the index values available from the idx5 shared object. % cat main.c #include <stdio.h> extern int _idx5[5], idx5[5], idx5_func(int); int main(int argc, char **argv) { int i; for (i = 0; i The following commands create a stub version of this shared object in a subdirectory named stublib. elfdump is used to verify that the resulting object is a stub. The command used to build the stub differs from that of the real object only in the addition of the -z stub option, and the use of a different output file name. This demonstrates the ease with which stub generation can be added to an existing makefile. % cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o stublib/libidx5.so.1 -zstub % ln -s libidx5.so.1 stublib/libidx5.so % elfdump -d stublib/libidx5.so | grep STUB [11] FLAGS_1 0x4000000 [ STUB ] The main program can now be built, using the stub object to stand in for the real shared object, and setting a runpath that will find the real object at runtime. However, as we have not yet built the real object, this program cannot yet be run. Attempts to cause the system to load the stub object are rejected, as the runtime linker knows that stub objects lack the actual code and data found in the real object, and cannot execute. % cc main.c -L stublib -R '$ORIGIN/lib' -lidx5 -lc % ./a.out ld.so.1: a.out: fatal: libidx5.so.1: open failed: No such file or directory Killed % LD_PRELOAD=stublib/libidx5.so.1 ./a.out ld.so.1: a.out: fatal: stublib/libidx5.so.1: stub shared object cannot be used at runtime Killed We build the real object using the same command as we used to build the stub, omitting the -z stub option, and writing the results to a different file. % cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o lib/libidx5.so.1 Once the real object has been built in the lib subdirectory, the program can be run. % ./a.out [0] 0 0 0 [1] 1 1 1 [2] 2 2 2 [3] 3 3 3 [4] 4 4 4 Mapfile Changes The version 2 mapfile syntax was extended in a number of places to accommodate stub objects. Conditional Input The version 2 mapfile syntax has the ability conditionalize mapfile input using the $if control directive. As you might imagine, these directives are used frequently with ASSERT directives for data, because a given data symbol will frequently have a different size in 32 or 64-bit code, or on differing hardware such as x86 versus sparc. The link-editor maintains an internal table of names that can be used in the logical expressions evaluated by $if and $elif. At startup, this table is initialized with items that describe the class of object (_ELF32 or _ELF64) and the type of the target machine (_sparc or _x86). We found that there were a small number of cases in the Solaris code base in which we needed to know what kind of object we were producing, so we added the following new predefined items in order to address that need: NameMeaning ...... _ET_DYNshared object _ET_EXECexecutable object _ET_RELrelocatable object ...... STUB_OBJECT Directive The new STUB_OBJECT directive informs the link-editor that the object described by the mapfile can be built as a stub object. STUB_OBJECT; A stub shared object is built entirely from the information in the mapfiles supplied on the command line. When the -z stub option is specified to build a stub object, the presence of the STUB_OBJECT directive in a mapfile is required, and the link-editor uses the information in symbol ASSERT attributes to create global symbols that match those of the real object. When the real object is built, the presence of STUB_OBJECT causes the link-editor to verify that the mapfiles accurately describe the real object interface, and that a stub object built from them will provide the same linking interface as the real object it represents. All function and data symbols that make up the external interface to the object must be explicitly listed in the mapfile. The mapfile must use symbol scope reduction ('*'), to remove any symbols not explicitly listed from the external interface. All global data in the object is required to have an ASSERT attribute that specifies the symbol type and size. If the ASSERT BIND attribute is not present, the link-editor provides a default assertion that the symbol must be GLOBAL. If the ASSERT SH_ATTR attribute is not present, or does not specify that the section is one of BITS or NOBITS, the link-editor provides a default assertion that the associated section is BITS. All data symbols that describe the same address and size are required to have ASSERT ALIAS attributes specified in the mapfile. If aliased symbols are discovered that do not have an ASSERT ALIAS specified, the link fails and no object is produced. These rules ensure that the mapfiles contain a description of the real shared object's linking interface that is sufficient to produce a stub object with a completely compatible linking interface. SYMBOL_SCOPE/SYMBOL_VERSION ASSERT Attribute The SYMBOL_SCOPE and SYMBOL_VERSION mapfile directives were extended with a symbol attribute named ASSERT. The syntax for the ASSERT attribute is as follows: ASSERT { ALIAS = symbol_name; BINDING = symbol_binding; TYPE = symbol_type; SH_ATTR = section_attributes; SIZE = size_value; SIZE = size_value[count]; }; The ASSERT attribute is used to specify the expected characteristics of the symbol. The link-editor compares the symbol characteristics that result from the link to those given by ASSERT attributes. If the real and asserted attributes do not agree, a fatal error is issued and the output object is not created. In normal use, the link editor evaluates the ASSERT attribute when present, but does not require them, or provide default values for them. The presence of the STUB_OBJECT directive in a mapfile alters the interpretation of ASSERT to require them under some circumstances, and to supply default assertions if explicit ones are not present. See the definition of the STUB_OBJECT Directive for the details. When the -z stub command line option is specified to build a stub object, the information provided by ASSERT attributes is used to define the attributes of the global symbols provided by the object. ASSERT accepts the following: ALIAS Name of a previously defined symbol that this symbol is an alias for. An alias symbol has the same type, value, and size as the main symbol. The ALIAS attribute is mutually exclusive to the TYPE, SIZE, and SH_ATTR attributes, and cannot be used with them. When ALIAS is specified, the type, size, and section attributes are obtained from the alias symbol. BIND Specifies an ELF symbol binding, which can be any of the STB_ constants defined in <sys/elf.h>, with the STB_ prefix removed (e.g. GLOBAL, WEAK). TYPE Specifies an ELF symbol type, which can be any of the STT_ constants defined in <sys/elf.h>, with the STT_ prefix removed (e.g. OBJECT, COMMON, FUNC). In addition, for compatibility with other mapfile usage, FUNCTION and DATA can be specified, for STT_FUNC and STT_OBJECT, respectively. TYPE is mutually exclusive to ALIAS, and cannot be used in conjunction with it. SH_ATTR Specifies attributes of the section associated with the symbol. The section_attributes that can be specified are given in the following table: Section AttributeMeaning BITSSection is not of type SHT_NOBITS NOBITSSection is of type SHT_NOBITS SH_ATTR is mutually exclusive to ALIAS, and cannot be used in conjunction with it. SIZE Specifies the expected symbol size. SIZE is mutually exclusive to ALIAS, and cannot be used in conjunction with it. The syntax for the size_value argument is as described in the discussion of the SIZE attribute below. SIZE The SIZE symbol attribute existed before support for stub objects was introduced. It is used to set the size attribute of a given symbol. This attribute results in the creation of a symbol definition. Prior to the introduction of the ASSERT SIZE attribute, the value of a SIZE attribute was always numeric. While attempting to apply ASSERT SIZE to the objects in the Solaris ON consolidation, I found that many data symbols have a size based on the natural machine wordsize for the class of object being produced. Variables declared as long, or as a pointer, will be 4 bytes in size in a 32-bit object, and 8 bytes in a 64-bit object. Initially, I employed the conditional $if directive to handle these cases as follows: $if _ELF32 foo { ASSERT { TYPE=data; SIZE=4 } }; bar { ASSERT { TYPE=data; SIZE=20 } }; $elif _ELF64 foo { ASSERT { TYPE=data; SIZE=8 } }; bar { ASSERT { TYPE=data; SIZE=40 } }; $else $error UNKNOWN ELFCLASS $endif I found that the situation occurs frequently enough that this is cumbersome. To simplify this case, I introduced the idea of the addrsize symbolic name, and of a repeat count, which together make it simple to specify machine word scalar or array symbols. Both the SIZE, and ASSERT SIZE attributes support this syntax: The size_value argument can be a numeric value, or it can be the symbolic name addrsize. addrsize represents the size of a machine word capable of holding a memory address. The link-editor substitutes the value 4 for addrsize when building 32-bit objects, and the value 8 when building 64-bit objects. addrsize is useful for representing the size of pointer variables and C variables of type long, as it automatically adjusts for 32 and 64-bit objects without requiring the use of conditional input. The size_value argument can be optionally suffixed with a count value, enclosed in square brackets. If count is present, size_value and count are multiplied together to obtain the final size value. Using this feature, the example above can be written more naturally as: foo { ASSERT { TYPE=data; SIZE=addrsize } }; bar { ASSERT { TYPE=data; SIZE=addrsize[5] } }; Exported Global Data Is Still A Bad Idea As you can see, the additional plumbing added to the Solaris link-editor to support stub objects is minimal. Furthermore, about 90% of that plumbing is dedicated to handling global data. We have long advised against global data exported from shared objects. There are many ways in which global data does not fit well with dynamic linking. Stub objects simply provide one more reason to avoid this practice. It is always better to export all data via a functional interface. You should always hide your data, and make it available to your users via a function that they can call to acquire the address of the data item. However, If you do have to support global data for a stub, perhaps because you are working with an already existing object, it is still easilily done, as shown above. Oracle does not like us to discuss hypothetical new features that don't exist in shipping product, so I'll end this section with a speculation. It might be possible to do more in this area to ease the difficulty of dealing with objects that have global data that the users of the library don't need. Perhaps someday... Conclusions It is easy to create stub objects for most objects. If your library only exports function symbols, all you have to do to build a faithful stub object is to add STUB_OBJECT; and then to use the same link command you're currently using, with the addition of the -z stub option. Happy Stubbing!

Read the article
64-bit Archives Needed

- by user9154181

A little over a year ago, we received a question from someone who was trying to build software on Solaris. He was getting errors from the ar command when creating an archive. At that time, the ar command on Solaris was a 32-bit command. There was more than 2GB of data, and the ar command was hitting the file size limit for a 32-bit process that doesn't use the largefile APIs. Even in 2011, 2GB is a very large amount of code, so we had not heard this one before. Most of our toolchain was extended to handle 64-bit sized data back in the 1990's, but archives were not changed, presumably because there was no perceived need for it. Since then of course, programs have continued to get larger, and in 2010, the time had finally come to investigate the issue and find a way to provide for larger archives. As part of that process, I had to do a deep dive into the archive format, and also do some Unix archeology. I'm going to record what I learned here, to document what Solaris does, and in the hope that it might help someone else trying to solve the same problem for their platform. Archive Format Details Archives are hardly cutting edge technology. They are still used of course, but their basic form hasn't changed in decades. Other than to fix a bug, which is rare, we don't tend to touch that code much. The archive file format is described in /usr/include/ar.h, and I won't repeat the details here. Instead, here is a rough overview of the archive file format, implemented by System V Release 4 (SVR4) Unix systems such as Solaris: Every archive starts with a "magic number". This is a sequence of 8 characters: "!<arch>\n". The magic number is followed by 1 or more members. A member starts with a fixed header, defined by the ar_hdr structure in/usr/include/ar.h. Immediately following the header comes the data for the member. Members must be padded at the end with newline characters so that they have even length. The requirement to pad members to an even length is a dead giveaway as to the age of the archive format. It tells you that this format dates from the 1970's, and more specifically from the era of 16-bit systems such as the PDP-11 that Unix was originally developed on. A 32-bit system would have required 4 bytes, and 64-bit systems such as we use today would probably have required 8 bytes. 2 byte alignment is a poor choice for ELF object archive members. 32-bit objects require 4 byte alignment, and 64-bit objects require 64-bit alignment. The link-editor uses mmap() to process archives, and if the members have the wrong alignment, we have to slide (copy) them to the correct alignment before we can access the ELF data structures inside. The archive format requires 2 byte padding, but it doesn't prohibit more. The Solaris ar command takes advantage of this, and pads ELF object members to 8 byte boundaries. Anything else is padded to 2 as required by the format. The archive header (ar_hdr) represents all numeric values using an ASCII text representation rather than as binary integers. This means that an archive that contains only text members can be viewed using tools such as cat, more, or a text editor. The original designers of this format clearly thought that archives would be used for many file types, and not just for objects. Things didn't turn out that way of course — nearly all archives contain relocatable objects for a single operating system and machine, and are used primarily as input to the link-editor (ld). Archives can have special members that are created by the ar command rather than being supplied by the user. These special members are all distinguished by having a name that starts with the slash (/) character. This is an unambiguous marker that says that the user could not have supplied it. The reason for this is that regular archive members are given the plain name of the file that was inserted to create them, and any path components are stripped off. Slash is the delimiter character used by Unix to separate path components, and as such cannot occur within a plain file name. The ar command hides the special members from you when you list the contents of an archive, so most users don't know that they exist. There are only two possible special members: A symbol table that maps ELF symbols to the object archive member that provides it, and a string table used to hold member names that exceed 15 characters. The '/' convention for tagging special members provides room for adding more such members should the need arise. As I will discuss below, we took advantage of this fact to add an alternate 64-bit symbol table special member which is used in archives that are larger than 4GB. When an archive contains ELF object members, the ar command builds a special archive member known as the symbol table that maps all ELF symbols in the object to the archive member that provides it. The link-editor uses this symbol table to determine which symbols are provided by the objects in that archive. If an archive has a symbol table, it will always be the first member in the archive, immediately following the magic number. Unlike member headers, symbol tables do use binary integers to represent offsets. These integers are always stored in big-endian format, even on a little endian host such as x86. The archive header (ar_hdr) provides 15 characters for representing the member name. If any member has a name that is longer than this, then the real name is written into a special archive member called the string table, and the member's name field instead contains a slash (/) character followed by a decimal representation of the offset of the real name within the string table. The string table is required to precede all normal archive members, so it will be the second member if the archive contains a symbol table, and the first member otherwise. The archive format is not designed to make finding a given member easy. Such operations move through the archive from front to back examining each member in turn, and run in O(n) time. This would be bad if archives were commonly used in that manner, but in general, they are not. Typically, the ar command is used to build an new archive from scratch, inserting all the objects in one operation, and then the link-editor accesses the members in the archive in constant time by using the offsets provided by the symbol table. Both of these operations are reasonably efficient. However, listing the contents of a large archive with the ar command can be rather slow. Factors That Limit Solaris Archive Size As is often the case, there was more than one limiting factor preventing Solaris archives from growing beyond the 32-bit limits of 2GB (32-bit signed) and 4GB (32-bit unsigned). These limits are listed in the order they are hit as archive size grows, so the earlier ones mask those that follow. The original Solaris archive file format can handle sizes up to 4GB without issue. However, the ar command was delivered as a 32-bit executable that did not use the largefile APIs. As such, the ar command itself could not create a file larger than 2GB. One can solve this by building ar with the largefile APIs which would allow it to reach 4GB, but a simpler and better answer is to deliver a 64-bit ar, which has the ability to scale well past 4GB. Symbol table offsets are stored as 32-bit big-endian binary integers, which limits the maximum archive size to 4GB. To get around this limit requires a different symbol table format, or an extension mechanism to the current one, similar in nature to the way member names longer than 15 characters are handled in member headers. The size field in the archive member header (ar_hdr) is an ASCII string capable of representing a 32-bit unsigned value. This places a 4GB size limit on the size of any individual member in an archive. In considering format extensions to get past these limits, it is important to remember that very few archives will require the ability to scale past 4GB for many years. The old format, while no beauty, continues to be sufficient for its purpose. This argues for a backward compatible fix that allows newer versions of Solaris to produce archives that are compatible with older versions of the system unless the size of the archive exceeds 4GB. Archive Format Differences Among Unix Variants While considering how to extend Solaris archives to scale to 64-bits, I wanted to know how similar archives from other Unix systems are to those produced by Solaris, and whether they had already solved the 64-bit issue. I've successfully moved archives between different Unix systems before with good luck, so I knew that there was some commonality. If it turned out that there was already a viable defacto standard for 64-bit archives, it would obviously be better to adopt that rather than invent something new. The archive file format is not formally standardized. However, the ar command and archive format were part of the original Unix from Bell Labs. Other systems started with that format, extending it in various often incompatible ways, but usually with the same common shared core. Most of these systems use the same magic number to identify their archives, despite the fact that their archives are not always fully compatible with each other. It is often true that archives can be copied between different Unix variants, and if the member names are short enough, the ar command from one system can often read archives produced on another. In practice, it is rare to find an archive containing anything other than objects for a single operating system and machine type. Such an archive is only of use on the type of system that created it, and is only used on that system. This is probably why cross platform compatibility of archives between Unix variants has never been an issue. Otherwise, the use of the same magic number in archives with incompatible formats would be a problem. I was able to find information for a number of Unix variants, described below. These can be divided roughly into three tribes, SVR4 Unix, BSD Unix, and IBM AIX. Solaris is a SVR4 Unix, and its archives are completely compatible with those from the other members of that group (GNU/Linux, HP-UX, and SGI IRIX). AIX AIX is an exception to rule that Unix archive formats are all based on the original Bell labs Unix format. It appears that AIX supports 2 formats (small and big), both of which differ in fundamental ways from other Unix systems: These formats use a different magic number than the standard one used by Solaris and other Unix variants. They include support for removing archive members from a file without reallocating the file, marking dead areas as unused, and reusing them when new archive items are inserted. They have a special table of contents member (File Member Header) which lets you find out everything that's in the archive without having to actually traverse the entire file. Their symbol table members are quite similar to those from other systems though. Their member headers are doubly linked, containing offsets to both the previous and next members. Of the Unix systems described here, AIX has the only format I saw that will have reasonable insert/delete performance for really large archives. Everyone else has O(n) performance, and are going to be slow to use with large archives. BSD BSD has gone through 4 versions of archive format, which are described in their manpage. They use the same member header as SVR4, but their symbol table format is different, and their scheme for long member names puts the name directly after the member header rather than into a string table. GNU/Linux The GNU toolchain uses the SVR4 format, and is compatible with Solaris. HP-UX HP-UX seems to follow the SVR4 model, and is compatible with Solaris. IRIX IRIX has 32 and 64-bit archives. The 32-bit format is the standard SVR4 format, and is compatible with Solaris. The 64-bit format is the same, except that the symbol table uses 64-bit integers. IRIX assumes that an archive contains objects of a single ELFCLASS/MACHINE, and any archive containing ELFCLASS64 objects receives a 64-bit symbol table. Although they only use it for 64-bit objects, nothing in the archive format limits it to ELFCLASS64. It would be perfectly valid to produce a 64-bit symbol table in an archive containing 32-bit objects, text files, or anything else. Tru64 Unix (Digital/Compaq/HP) Tru64 Unix uses a format much like ours, but their symbol table is a hash table, making specific symbol lookup much faster. The Solaris link-editor uses archives by examining the entire symbol table looking for unsatisfied symbols for the link, and not by looking up individual symbols, so there would be no benefit to Solaris from such a hash table. The Tru64 ld must use a different approach in which the hash table pays off for them. Widening the existing SVR4 archive symbol tables rather than inventing something new is the simplest path forward. There is ample precedent for this approach in the ELF world. When ELF was extended to support 64-bit objects, the approach was largely to take the existing data structures, and define 64-bit versions of them. We called the old set ELF32, and the new set ELF64. My guess is that there was no need to widen the archive format at that time, but had there been, it seems obvious that this is how it would have been done. The Implementation of 64-bit Solaris Archives As mentioned earlier, there was no desire to improve the fundamental nature of archives. They have always had O(n) insert/delete behavior, and for the most part it hasn't mattered. AIX made efforts to improve this, but those efforts did not find widespread adoption. For the purposes of link-editing, which is essentially the only thing that archives are used for, the existing format is adequate, and issues of backward compatibility trump the desire to do something technically better. Widening the existing symbol table format to 64-bits is therefore the obvious way to proceed. For Solaris 11, I implemented that, and I also updated the ar command so that a 64-bit version is run by default. This eliminates the 2 most significant limits to archive size, leaving only the limit on an individual archive member. We only generate a 64-bit symbol table if the archive exceeds 4GB, or when the new -S option to the ar command is used. This maximizes backward compatibility, as an archive produced by Solaris 11 is highly likely to be less than 4GB in size, and will therefore employ the same format understood by older versions of the system. The main reason for the existence of the -S option is to allow us to test the 64-bit format without having to construct huge archives to do so. I don't believe it will find much use outside of that. Other than the new ability to create and use extremely large archives, this change is largely invisible to the end user. When reading an archive, the ar command will transparently accept either form of symbol table. Similarly, the ELF library (libelf) has been updated to understand either format. Users of libelf (such as the link-editor ld) do not need to be modified to use the new format, because these changes are encapsulated behind the existing functions provided by libelf. As mentioned above, this work did not lift the limit on the maximum size of an individual archive member. That limit remains fixed at 4GB for now. This is not because we think objects will never get that large, for the history of computing says otherwise. Rather, this is based on an estimation that single relocatable objects of that size will not appear for a decade or two. A lot can change in that time, and it is better not to overengineer things by writing code that will sit and rot for years without being used. It is not too soon however to have a plan for that eventuality. When the time comes when this limit needs to be lifted, I believe that there is a simple solution that is consistent with the existing format. The archive member header size field is an ASCII string, like the name, and as such, the overflow scheme used for long names can also be used to handle the size. The size string would be placed into the archive string table, and its offset in the string table would then be written into the archive header size field using the same format "/ddd" used for overflowed names.

Read the article
Nagging As A Strategy For Better Linking: -z guidance

- by user9154181

The link-editor (ld) in Solaris 11 has a new feature that we call guidance that is intended to help you build better objects. The basic idea behind guidance is that if (and only if) you request it, the link-editor will issue messages suggesting better options and other changes you might make to your ld command to get better results. You can choose to take the advice, or you can disable specific types of guidance while acting on others. In some ways, this works like an experienced friend leaning over your shoulder and giving you advice — you're free to take it or leave it as you see fit, but you get nudged to do a better job than you might have otherwise. We use guidance to build the core Solaris OS, and it has proven to be useful, both in improving our objects, and in making sure that regressions don't creep back in later. In this article, I'm going to describe the evolution in thinking and design that led to the implementation of the -z guidance option, as well as give a brief description of how it works. The guidance feature issues non-fatal warnings. However, experience shows that once developers get used to ignoring warnings, it is inevitable that real problems will be lost in the noise and ignored or missed. This is why we have a zero tolerance policy against build noise in the core Solaris OS. In order to get maximum benefit from -z guidance while maintaining this policy, I added the -z fatal-warnings option at the same time. Much of the material presented here is adapted from the arc case: PSARC 2010/312 Link-editor guidance The History Of Unfortunate Link-Editor Defaults The Solaris link-editor is one of the oldest Unix commands. It stands to reason that this would be true — in order to write an operating system, you need the ability to compile and link code. The original link-editor (ld) had defaults that made sense at the time. As new features were needed, command line option switches were added to let the user use them, while maintaining backward compatibility for those who didn't. Backward compatibility is always a concern in system design, but is particularly important in the case of the tool chain (compilers, linker, and related tools), since it is a basic building block for the entire system. Over the years, applications have grown in size and complexity. Important concepts like dynamic linking that didn't exist in the original Unix system were invented. Object file formats changed. In the case of System V Release 4 Unix derivatives like Solaris, the ELF (Extensible Linking Format) was adopted. Since then, the ELF system has evolved to provide tools needed to manage today's larger and more complex environments. Features such as lazy loading, and direct bindings have been added. In an ideal world, many of these options would be defaults, with rarely used options that allow the user to turn them off. However, the reality is exactly the reverse: For backward compatibility, these features are all options that must be explicitly turned on by the user. This has led to a situation in which most applications do not take advantage of the many improvements that have been made in linking over the last 20 years. If their code seems to link and run without issue, what motivation does a developer have to read a complex manpage, absorb the information provided, choose the features that matter for their application, and apply them? Experience shows that only the most motivated and diligent programmers will make that effort. We know that most programs would be improved if we could just get you to use the various whizzy features that we provide, but the defaults conspire against us. We have long wanted to do something to make it easier for our users to use the linkers more effectively. There have been many conversations over the years regarding this issue, and how to address it. They always break down along the following lines: Change ld Defaults Since the world would be a better place the newer ld features were the defaults, why not change things to make it so? This idea is simple, elegant, and impossible. Doing so would break a large number of existing applications, including those of ISVs, big customers, and a plethora of existing open source packages. In each case, the owner of that code may choose to follow our lead and fix their code, or they may view it as an invitation to reconsider their commitment to our platform. Backward compatibility, and our installed base of working software, is one of our greatest assets, and not something to be lightly put at risk. Breaking backward compatibility at this level of the system is likely to do more harm than good. But, it sure is tempting. New Link-Editor One might create a new linker command, not called 'ld', leaving the old command as it is. The new one could use the same code as ld, but would offer only modern options, with the proper defaults for features such as direct binding. The resulting link-editor would be a pleasure to use. However, the approach is doomed to niche status. There is a vast pile of exiting code in the world built around the existing ld command, that reaches back to the 1970's. ld use is embedded in large and unknown numbers of makefiles, and is used by name by compilers that execute it. A Unix link-editor that is not named ld will not find a majority audience no matter how good it might be. Finally, a new linker command will eventually cease to be new, and will accumulate its own burden of backward compatibility issues. An Option To Make ld Do The Right Things Automatically This line of reasoning is best summarized by a CR filed in 2005, entitled 6239804 make it easier for ld(1) to do what's best The idea is to have a '-z best' option that unchains ld from its backward compatibility commitment, and allows it to turn on the "best" set of features, as determined by the authors of ld. The specific set of features enabled by -z best would be subject to change over time, as requirements change. This idea is more realistic than the other two, but was never implemented because it has some important issues that we could never answer to our satisfaction: The -z best proposal assumes that the user can turn it on, and trust it to select good options without the user needing to be aware of the options being applied. This is a fallacy. Features such as direct bindings require the user to do some analysis to ensure that the resulting program will still operate properly. A user who is willing to do the work to verify that what -z best does will be OK for their application is capable of turning on those features directly, and therefore gains little added benefit from -z best. The intent is that when a user opts into -z best, that they understand that z best is subject to sometimes incompatible evolution. Experience teaches us that this won't work. People will use this feature, the meaning of -z best will change, code that used to build will fail, and then there will be complaints and demands to retract the change. When (not if) this occurs, we will of course defend our actions, and point at the disclaimer. We'll win some of those debates, and lose others. Ultimately, we'll end up with -z best2 (-z better), or other compromises, and our goal of simplifying the world will have failed. The -z best idea rolls up a set of features that may or may not be related to each other into a unit that must be taken wholesale, or not at all. It could be that only a subset of what it does is compatible with a given application, in which case the user is expected to abandon -z best and instead set the options that apply to their application directly. In doing so, they lose one of the benefits of -z best, that if you use it, future versions of ld may choose a different set of options, and automatically improve the object through the act of rebuilding it. I drew two conclusions from the above history: For a link-editor, backward compatibility is vital. If a given command line linked your application 10 years ago, you have every reason to expect that it will link today, assuming that the libraries you're linking against are still available and compatible with their previous interfaces. For an application of any size or complexity, there is no substitute for the work involved in examining the code and determining which linker options apply and which do not. These options are largely orthogonal to each other, and it can be reasonable not to use any or all of them, depending on the situation, even in modern applications. It is a mistake to tie them together. The idea for -z guidance came from consideration of these points. By decoupling the advice from the act of taking the advice, we can retain the good aspects of -z best while avoiding its pitfalls: -z guidance gives advice, but the decision to take that advice remains with the user who must evaluate its merit and make a decision to take it or not. As such, we are free to change the specific guidance given in future releases of ld, without breaking existing applications. The only fallout from this will be some new warnings in the build output, which can be ignored or dealt with at the user's convenience. It does not couple the various features given into a single "take it or leave it" option, meaning that there will never be a need to offer "-zguidance2", or other such variants as things change over time. Guidance has the potential to be our final word on this subject. The user is given the flexibility to disable specific categories of guidance without losing the benefit of others, including those that might be added to future versions of the system. Although -z fatal-warnings stands on its own as a useful feature, it is of particular interest in combination with -z guidance. Used together, the guidance turns from advice to hard requirement: The user must either make the suggested change, or explicitly reject the advice by specifying a guidance exception token, in order to get a build. This is valuable in environments with high coding standards. ld Command Line Options The guidance effort resulted in new link-editor options for guidance and for turning warnings into fatal errors. Before I reproduce that text here, I'd like to highlight the strategic decisions embedded in the guidance feature: In order to get guidance, you have to opt in. We hope you will opt in, and believe you'll get better objects if you do, but our default mode of operation will continue as it always has, with full backward compatibility, and without judgement. Guidance suggestions always offers specific advice, and not vague generalizations. You can disable some guidance without turning off the entire feature. When you get guidance warnings, you can choose to take the advice, or you can specify a keyword to disable guidance for just that category. This allows you to get guidance for things that are useful to you, without being bothered about things that you've already considered and dismissed. As the world changes, we will add new guidance to steer you in the right direction. All such new guidance will come with a keyword that let's you turn it off. In order to facilitate building your code on different versions of Solaris, we quietly ignore any guidance keywords we don't recognize, assuming that they are intended for newer versions of the link-editor. If you want to see what guidance tokens ld does and does not recognize on your system, you can use the ld debugging feature as follows: % ld -Dargs -z guidance=foo,nodefs debug: debug: Solaris Linkers: 5.11-1.2275 debug: debug: arg[1] option=-D: option-argument: args debug: arg[2] option=-z: option-argument: guidance=foo,nodefs debug: warning: unrecognized -z guidance item: foo The -z fatal-warning option is straightforward, and generally useful in environments with strict coding standards. Note that the GNU ld already had this feature, and we accept their option names as synonyms: -z fatal-warnings | nofatal-warnings --fatal-warnings | --no-fatal-warnings The -z fatal-warnings and the --fatal-warnings option cause the link-editor to treat warnings as fatal errors. The -z nofatal-warnings and the --no-fatal-warnings option cause the link-editor to treat warnings as non-fatal. This is the default behavior. The -z guidance option is defined as follows: -z guidance[=item1,item2,...] Provide guidance messages to suggest ld options that can improve the quality of the resulting object, or which are otherwise considered to be beneficial. The specific guidance offered is subject to change over time as the system evolves. Obsolete guidance offered by older versions of ld may be dropped in new versions. Similarly, new guidance may be added to new versions of ld. Guidance therefore always represents current best practices. It is possible to enable guidance, while preventing specific guidance messages, by providing a list of item tokens, representing the class of guidance to be suppressed. In this way, unwanted advice can be suppressed without losing the benefit of other guidance. Unrecognized item tokens are quietly ignored by ld, allowing a given ld command line to be executed on a variety of older or newer versions of Solaris. The guidance offered by the current version of ld, and the item tokens used to disable these messages, are as follows. Specify Required Dependencies Dynamic executables and shared objects should explicitly define all of the dependencies they require. Guidance recommends the use of the -z defs option, should any symbol references remain unsatisfied when building dynamic objects. This guidance can be disabled with -z guidance=nodefs. Do Not Specify Non-Required Dependencies Dynamic executables and shared objects should not define any dependencies that do not satisfy the symbol references made by the dynamic object. Guidance recommends that unused dependencies be removed. This guidance can be disabled with -z guidance=nounused. Lazy Loading Dependencies should be identified for lazy loading. Guidance recommends the use of the -z lazyload option should any dependency be processed before either a -z lazyload or -z nolazyload option is encountered. This guidance can be disabled with -z guidance=nolazyload. Direct Bindings Dependencies should be referenced with direct bindings. Guidance recommends the use of the -B direct, or -z direct options should any dependency be processed before either of these options, or the -z nodirect option is encountered. This guidance can be disabled with -z guidance=nodirect. Pure Text Segment Dynamic objects should not contain relocations to non-writable, allocable sections. Guidance recommends compiling objects with Position Independent Code (PIC) should any relocations against the text segment remain, and neither the -z textwarn or -z textoff options are encountered. This guidance can be disabled with -z guidance=notext. Mapfile Syntax All mapfiles should use the version 2 mapfile syntax. Guidance recommends the use of the version 2 syntax should any mapfiles be encountered that use the version 1 syntax. This guidance can be disabled with -z guidance=nomapfile. Library Search Path Inappropriate dependencies that are encountered by ld are quietly ignored. For example, a 32-bit dependency that is encountered when generating a 64-bit object is ignored. These dependencies can result from incorrect search path settings, such as supplying an incorrect -L option. Although benign, this dependency processing is wasteful, and might hide a build problem that should be solved. Guidance recommends the removal of any inappropriate dependencies. This guidance can be disabled with -z guidance=nolibpath. In addition, -z guidance=noall can be used to entirely disable the guidance feature. See Chapter 7, Link-Editor Quick Reference, in the Linker and Libraries Guide for more information on guidance and advice for building better objects. Example The following example demonstrates how the guidance feature is intended to work. We will build a shared object that has a variety of shortcomings: Does not specify all it's dependencies Specifies dependencies it does not use Does not use direct bindings Uses a version 1 mapfile Contains relocations to the readonly allocable text (not PIC) This scenario is sadly very common — many shared objects have one or more of these issues. % cat hello.c #include <stdio.h> #include <unistd.h> void hello(void) { printf("hello user %d\n", getpid()); } % cat mapfile.v1 # This version 1 mapfile will trigger a guidance message % cc hello.c -o hello.so -G -M mapfile.v1 -lelf As you can see, the operation completes without error, resulting in a usable object. However, turning on guidance reveals a number of things that could be better: % cc hello.c -o hello.so -G -M mapfile.v1 -lelf -zguidance ld: guidance: version 2 mapfile syntax recommended: mapfile.v1 ld: guidance: -z lazyload option recommended before first dependency ld: guidance: -B direct or -z direct option recommended before first dependency Undefined first referenced symbol in file getpid hello.o (symbol belongs to implicit dependency /lib/libc.so.1) printf hello.o (symbol belongs to implicit dependency /lib/libc.so.1) ld: warning: symbol referencing errors ld: guidance: -z defs option recommended for shared objects ld: guidance: removal of unused dependency recommended: libelf.so.1 warning: Text relocation remains referenced against symbol offset in file .rodata1 (section) 0xa hello.o getpid 0x4 hello.o printf 0xf hello.o ld: guidance: position independent (PIC) code recommended for shared objects ld: guidance: see ld(1) -z guidance for more information Given the explicit advice in the above guidance messages, it is relatively easy to modify the example to do the right things: % cat mapfile.v2 # This version 2 mapfile will not trigger a guidance message $mapfile_version 2 % cc hello.c -o hello.so -Kpic -G -Bdirect -M mapfile.v2 -lc -zguidance There are situations in which the guidance does not fit the object being built. For instance, you want to build an object without direct bindings: % cc -Kpic hello.c -o hello.so -G -M mapfile.v2 -lc -zguidance ld: guidance: -B direct or -z direct option recommended before first dependency ld: guidance: see ld(1) -z guidance for more information It is easy to disable that specific guidance warning without losing the overall benefit from allowing the remainder of the guidance feature to operate: % cc -Kpic hello.c -o hello.so -G -M mapfile.v2 -lc -zguidance=nodirect Conclusions The linking guidelines enforced by the ld guidance feature correspond rather directly to our standards for building the core Solaris OS. I'm sure that comes as no surprise. It only makes sense that we would want to build our own product as well as we know how. Solaris is usually the first significant test for any new linker feature. We now enable guidance by default for all builds, and the effect has been very positive. Guidance helps us find suboptimal objects more quickly. Programmers get concrete advice for what to change instead of vague generalities. Even in the cases where we override the guidance, the makefile rules to do so serve as documentation of the fact. Deciding to use guidance is likely to cause some up front work for most code, as it forces you to consider using new features such as direct bindings. Such investigation is worthwhile, but does not come for free. However, the guidance suggestions offer a structured and straightforward way to tackle modernizing your objects, and once that work is done, for keeping them that way. The investment is often worth it, and will replay you in terms of better performance and fewer problems. I hope that you find guidance to be as useful as we have.

Read the article
The Stub Proto: Not Just For Stub Objects Anymore

- by user9154181

One of the great pleasures of programming is to invent something for a narrow purpose, and then to realize that it is a general solution to a broader problem. In hindsight, these things seem perfectly natural and obvious. The stub proto area used to build the core Solaris consolidation has turned out to be one of those things. As discussed in an earlier article, the stub proto area was invented as part of the effort to use stub objects to build the core ON consolidation. Its purpose was merely as a place to hold stub objects. However, we keep finding other uses for it. It turns out that the stub proto should be more properly thought of as an auxiliary place to put things that we would like to put into the proto to help us build the product, but which we do not wish to package or deliver to the end user. Stub objects are one example, but private lint libraries, header files, archives, and relocatable objects, are all examples of things that might profitably go into the stub proto. Without a stub proto, these items were handled in a variety of ad hoc ways: If one part of the workspace needed private header files, libraries, or other such items, it might modify its Makefile to reach up and over to the place in the workspace where those things live and use them from there. There are several problems with this: Each component invents its own approach, meaning that programmers maintaining the system have to invest extra effort to understand what things mean. In the past, this has created makefile ghettos in which only the person who wrote the makefiles feels confident to modify them, while everyone else ignores them. This causes many difficulties and benefits no one. These interdependencies are not obvious to the make, utility, and can lead to races. They are not obvious to the human reader, who may therefore not realize that they exist, and break them. Our policy in ON is not to deliver files into the proto unless those files are intended to be packaged and delivered to the end user. However, sometimes non-shipping files were copied into the proto anyway, causing a different set of problems: It requires a long list of exceptions to silence our normal unused proto item error checking. In the past, we have accidentally shipped files that we did not intend to deliver to the end user. Mixing cruft with valuable items makes it hard to discern which is which. The stub proto area offers a convenient and robust solution. Files needed to build the workspace that are not delivered to the end user can instead be installed into the stub proto. No special exceptions or custom make rules are needed, and the intent is always clear. We are already accessing some private lint libraries and compilation symlinks in this manner. Ultimately, I'd like to see all of the files in the proto that have a packaging exception delivered to the stub proto instead, and for the elimination of all existing special case makefile rules. This would include shared objects, header files, and lint libraries. I don't expect this to happen overnight — it will be a long term case by case project, but the overall trend is clear. The Stub Proto, -z assert_deflib, And The End Of Accidental System Object Linking We recently used the stub proto to solve an annoying build issue that goes back to the earliest days of Solaris: How to ensure that we're linking to the OS bits we're building instead of to those from the running system. The Solaris product is made up of objects and files from a number of different consolidations, each of which is built separately from the others from an independent code base called a gate. The core Solaris OS consolidation is ON, which stands for "Operating System and Networking". You will frequently also see ON called the OSnet. There are consolidations for X11 graphics, the desktop environment, open source utilities, compilers and development tools, and many others. The collection of consolidations that make up Solaris is known as the "Wad Of Stuff", usually referred to simply as the WOS. None of these consolidations is self contained. Even the core ON consolidation has some dependencies on libraries that come from other consolidations. The build server used to build the OSnet must be running a relatively recent version of Solaris, which means that its objects will be very similar to the new ones being built. However, it is necessarily true that the build system objects will always be a little behind, and that incompatible differences may exist. The objects built by the OSnet link to other objects. Some of these dependencies come from the OSnet, while others come from other consolidations. The objects from other consolidations are provided by the standard library directories on the build system (/lib, /usr/lib). The objects from the OSnet itself are supposed to come from the proto areas in the workspace, and not from the build server. In order to achieve this, we make use of the -L command line option to the link-editor. The link-editor finds dependencies by looking in the directories specified by the caller using the -L command line option. If the desired dependency is not found in one of these locations, ld will then fall back to looking at the default locations (/lib, /usr/lib). In order to use OSnet objects from the workspace instead of the system, while still accessing non-OSnet objects from the system, our Makefiles set -L link-editor options that point at the workspace proto areas. In general, this works well and dependencies are found in the right places. However, there have always been failures: Building objects in the wrong order might mean that an OSnet dependency hasn't been built before an object that needs it. If so, the dependency will not be seen in the proto, and the link-editor will silently fall back to the one on the build server. Errors in the makefiles can wipe out the -L options that our top level makefiles establish to cause ld to look at the workspace proto first. In this case, all objects will be found on the build server. These failures were rarely if ever caught. As I mentioned earlier, the objects on the build server are generally quite close to the objects built in the workspace. If they offer compatible linking interfaces, then the objects that link to them will behave properly, and no issue will ever be seen. However, if they do not offer compatible linking interfaces, the failure modes can be puzzling and hard to pin down. Either way, there won't be a compile-time warning or error. The advent of the stub proto eliminated the first type of failure. With stub objects, there is no dependency ordering, and the necessary stub object dependency will always be in place for any OSnet object that needs it. However, makefile errors do still occur, and so, the second form of error was still possible. While working on the stub object project, we realized that the stub proto was also the key to solving the second form of failure caused by makefile errors: Due to the way we set the -L options to point at our workspace proto areas, any valid object from the OSnet should be found via a path specified by -L, and not from the default locations (/lib, /usr/lib). Any OSnet object found via the default locations means that we've linked to the build server, which is an error we'd like to catch. Non-OSnet objects don't exist in the proto areas, and so are found via the default paths. However, if we were to create a symlink in the stub proto pointing at each non-OSnet dependency that we require, then the non-OSnet objects would also be found via the paths specified by -L, and not from the link-editor defaults. Given the above, we should not find any dependency objects from the link-editor defaults. Any dependency found via the link-editor defaults means that we have a Makefile error, and that we are linking to the build server inappropriately. All we need to make use of this fact is a linker option to produce a warning when it happens. Although warnings are nice, we in the OSnet have a zero tolerance policy for build noise. The -z fatal-warnings option that was recently introduced with -z guidance can be used to turn the warnings into fatal build errors, forcing the programmer to fix them. This was too easy to resist. I integrated 7021198 ld option to warn when link accesses a library via default path PSARC/2011/068 ld -z assert-deflib option into snv_161 (February 2011), shortly after the stub proto was introduced into ON. This putback introduced the -z assert-deflib option to the link-editor: -z assert-deflib=[libname] Enables warning messages for libraries specified with the -l command line option that are found by examining the default search paths provided by the link-editor. If a libname value is provided, the default library warning feature is enabled, and the specified library is added to a list of libraries for which no warnings will be issued. Multiple -z assert-deflib options can be specified in order to specify multiple libraries for which warnings should not be issued. The libname value should be the name of the library file, as found by the link-editor, without any path components. For example, the following enables default library warnings, and excludes the standard C library. ld ... -z assert-deflib=libc.so ... -z assert-deflib is a specialized option, primarily of interest in build environments where multiple objects with the same name exist and tight control over the library used is required. If is not intended for general use. Note that the definition of -z assert-deflib allows for exceptions to be specified as arguments to the option. In general, the idea of using a symlink from the stub proto is superior because it does not clutter up the link command with a long list of objects. When building the OSnet, we usually use the plain from of -z deflib, and make symlinks for the non-OSnet dependencies. The exception to this are dependencies supplied by the compiler itself, which are usually found at whatever arbitrary location the compiler happens to be installed at. To handle these special cases, the command line version works better. Following the integration of the link-editor change, I made use of -z assert-deflib in OSnet builds with 7021896 Prevent OSnet from accidentally linking to build system which integrated into snv_162 (March 2011). Turning on -z assert-deflib exposed between 10 and 20 existing errors in our Makefiles, which were all fixed in the same putback. The errors we found in our Makefiles underscore how difficult they can be prevent without an automatic system in place to catch them. Conclusions The stub proto is proving to be a generally useful construct for ON builds that goes beyond serving as a place to hold stub objects. Although invented to hold stub objects, it has already allowed us to simplify a number of previously difficult situations in our makefiles and builds. I expect that we'll find uses for it beyond those described here as we go forward.

Read the article
elffile: ELF Specific File Identification Utility

- by user9154181

Solaris 11 has a new standard user level command, /usr/bin/elffile. elffile is a variant of the file utility that is focused exclusively on linker related files: ELF objects, archives, and runtime linker configuration files. All other files are simply identified as "non-ELF". The primary advantage of elffile over the existing file utility is in the area of archives — elffile examines the archive members and can produce a summary of the contents, or per-member details. The impetus to add elffile to Solaris came from the effort to extend the format of Solaris archives so that they could grow beyond their previous 32-bit file limits. That work introduced a new archive symbol table format. Now that there was more than one possible format, I thought it would be useful if the file utility could identify which format a given archive is using, leading me to extend the file utility: % cc -c ~/hello.c % ar r foo.a hello.o % file foo.a foo.a: current ar archive, 32-bit symbol table % ar r -S foo.a hello.o % file foo.a foo.a: current ar archive, 64-bit symbol table In turn, this caused me to think about all the things that I would like the file utility to be able to tell me about an archive. In particular, I'd like to be able to know what's inside without having to unpack it. The end result of that train of thought was elffile. Much of the discussion in this article is adapted from the PSARC case I filed for elffile in December 2010: PSARC 2010/432 elffile Why file Is No Good For Archives And Yet Should Not Be Fixed The standard /usr/bin/file utility is not very useful when applied to archives. When identifying an archive, a user typically wants to know 2 things: Is this an archive? Presupposing that the archive contains objects, which is by far the most common use for archives, what platform are the objects for? Are they for sparc or x86? 32 or 64-bit? Some confusing combination from varying platforms? The file utility provides a quick answer to question (1), as it identifies all archives as "current ar archive". It does nothing to answer the more interesting question (2). To answer that question, requires a multi-step process: Extract all archive members Use the file utility on the extracted files, examine the output for each file in turn, and compare the results to generate a suitable summary description. Remove the extracted files It should be easier and more efficient to answer such an obvious question. It would be reasonable to extend the file utility to examine archive contents in place and produce a description. However, there are several reasons why I decided not to do so: The correct design for this feature within the file utility would have file examine each archive member in turn, applying its full abilities to each member. This would be elegant, but also represents a rather dramatic redesign and re-implementation of file. Archives nearly always contain nothing but ELF objects for a single platform, so such generality in the file utility would be of little practical benefit. It is best to avoid adding new options to standard utilities for which other implementations of interest exist. In the case of the file utility, one concern is that we might add an option which later appears in the GNU version of file with a different and incompatible meaning. Indeed, there have been discussions about replacing the Solaris file with the GNU version in the past. This may or may not be desirable, and may or may not ever happen. Either way, I don't want to preclude it. Examining archive members is an O(n) operation, and can be relatively slow with large archives. The file utility is supposed to be a very fast operation. I decided that extending file in this way is overkill, and that an investment in the file utility for better archive support would not be worth the cost. A solution that is more narrowly focused on ELF and other linker related files is really all that we need. The necessary code for doing this already exists within libelf. All that is missing is a small user-level wrapper to make that functionality available at the command line. In that vein, I considered adding an option for this to the elfdump utility. I examined elfdump carefully, and even wrote a prototype implementation. The added code is small and simple, but the conceptual fit with the rest of elfdump is poor. The result complicates elfdump syntax and documentation, definite signs that this functionality does not belong there. And so, I added this functionality as a new user level command. The elffile Command The syntax for this new command is elffile [-s basic | detail | summary] filename... Please see the elffile(1) manpage for additional details. To demonstrate how output from elffile looks, I will use the following files: FileDescription configA runtime linker configuration file produced with crle dwarf.oAn ELF object /etc/passwdA text file mixed.aArchive containing a mixture of ELF and non-ELF members mixed_elf.aArchive containing ELF objects for different machines not_elf.aArchive containing no ELF objects same_elf.aArchive containing a collection of ELF objects for the same machine. This is the most common type of archive. The file utility identifies these files as follows: % file config dwarf.o /etc/passwd mixed.a mixed_elf.a not_elf.a same_elf.a config: Runtime Linking Configuration 64-bit MSB SPARCV9 dwarf.o: ELF 64-bit LSB relocatable AMD64 Version 1 /etc/passwd: ascii text mixed.a: current ar archive, 32-bit symbol table mixed_elf.a: current ar archive, 32-bit symbol table not_elf.a: current ar archive same_elf.a: current ar archive, 32-bit symbol table By default, elffile uses its "summary" output style. This output differs from the output from the file utility in 2 significant ways: Files that are not an ELF object, archive, or runtime linker configuration file are identified as "non-ELF", whereas the file utility attempts further identification for such files. When applied to an archive, the elffile output includes a description of the archive's contents, without requiring member extraction or other additional steps. Applying elffile to the above files: % elffile config dwarf.o /etc/passwd mixed.a mixed_elf.a not_elf.a same_elf.a config: Runtime Linking Configuration 64-bit MSB SPARCV9 dwarf.o: ELF 64-bit LSB relocatable AMD64 Version 1 /etc/passwd: non-ELF mixed.a: current ar archive, 32-bit symbol table, mixed ELF and non-ELF content mixed_elf.a: current ar archive, 32-bit symbol table, mixed ELF content not_elf.a: current ar archive, non-ELF content same_elf.a: current ar archive, 32-bit symbol table, ELF 64-bit LSB relocatable AMD64 Version 1 The output for same_elf.a is of particular interest: The vast majority of archives contain only ELF objects for a single platform, and in this case, the default output from elffile answers both of the questions about archives posed at the beginning of this discussion, in a single efficient step. This makes elffile considerably more useful than file, within the realm of linker-related files. elffile can produce output in two other styles, "basic", and "detail". The basic style produces output that is the same as that from 'file', for linker-related files. The detail style produces per-member identification of archive contents. This can be useful when the archive contents are not homogeneous ELF object, and more information is desired than the summary output provides: % elffile -s detail mixed.a mixed.a: current ar archive, 32-bit symbol table mixed.a(dwarf.o): ELF 32-bit LSB relocatable 80386 Version 1 mixed.a(main.c): non-ELF content mixed.a(main.o): ELF 64-bit LSB relocatable AMD64 Version 1 [SSE]

Read the article
Solaris 11

- by user9154181

Oracle has a strict policy about not discussing product features until they appear in shipping product. Now that Solaris 11 is publically available, it is time to catch up. I will be shortly posting articles on a variety of new developments in the Solaris linkers and related bits: 64-bit Archives After 40+ years of Unix, the archive file format has run out of room. The ar and link-editor (ld) commands have been enhanced to allow archives to grow past their previous 32-bit limits. Guidance The link-editor is now willing and able to tell you how to alter your link lines in order to build better objects. Stub Objects This is one of the bigger projects I've undertaken since joining the Solaris group. Stub objects are shared objects, built entirely from mapfiles, that supply the same linking interface as the real object, while containing no code or data. You can link to them, but cannot use them at runtime. It was pretty simple to add this ability to the link-editor, but the changes to the OSnet in order to apply them to building Solaris were massive. I discuss how we came to invent stub objects, how we apply them to build the OSnet in a more parallel and scalable manner, and about the follow on opportunities that have emerged from the new stub proto area we created to hold them. The elffile Utility A new standard Solaris utility, elffile is a variant of the file utility, focused exclusively on linker related files. elffile is of particular value for examining archives, as it allows you to find out what is inside them without having to first extract the archive members into temporary files. This release has been a long time coming. I joined the Solaris group in late 2005, and this will be my first FCS. From a user perspective, Solaris 11 is probably the biggest change to Solaris since Solaris 2.0. Solaris 11 polishes the ground breaking features from Solaris 10 (DTrace, FMA, ZFS, Zones), and uses them to add a powerful new packaging system, numerous other enhacements and features, along with a huge modernization effort. I'm excited to see it go out into the world. I hope you enjoy using it as much as we did creating it. Software is never done. On to the next one...

Read the article

1