Observations
Use, Use, Re-use
Many platform, or product family, programmes aim to 'design for re-use': identifying generic functionality upfront and designing and implementing the software such that it can be used several times without being adapted. Although theoretically possible, practice shows it to be very difficult, if not infeasible, to identify and implement generic functionality upfront and re-use it multiple times as is. Software typically works almost but not quite right and, as a result, developers often end up devising workarounds or glue code to make someone else's software fit their needs.
Open Source and Inner Source stem from the opposite mindset: re-usable components are consolidated rather than designed upfront – a component needs to be used once and (adapted to be) used again before its generic functionality can be identified and consolidated. This is often dubbed the 'use, use, re-use' paradigm. Contrary to the 'design for re-use' paradigm, 'use, use, re-use' encourages components to be adapted to people's needs because these adaptations are, in essence, feedback.
Open Within Community Scope
A development team can grant the community an increasing amount of rights, or freedoms: the right to use, read, modify, and (re)distribute the software.
- The basic right is the right to use the software. In this model, only the team itself has access to the source code; the rest of the community may use the software as a black box. In theory, this should be an effective and efficient approach but, in the reality of today's software engineering – with unclear requirements specifications, requirements that change during development, and bugs – it often results in fighting rather than using a black box.
- In addition, the team can allow the community to read the source code, which greatly helps them understand the internals of the software and how to use it. The community can look but not touch: they can read the code and learn all about the implementation but still have to report bugs and change requests to the team to get anything changed or fixed.
- Moreover, the team can allow the community to modify the source code if and only if those modifications are returned to the team. Hence, the community is allowed to contribute, but the team decides what goes in when, and maintains absolute control. The main problem with this model is that there is very little incentive for the community to contribute because they depend entirely on the team to accept the contributions.
- Finally, the team can allow the community to (re)distribute their modified versions of the source code. This allows the community to contribute to the software and, when the team doesn't want to accept a contribution at this time, to (re)distribute it to others. In essence, this removes the team as a possible bottleneck, preventing anyone in the community (whether part of the team or not) from “being more equal than others”.
These rights can be granted on different scopes. For example, Microsoft's Shared Source and Sun Microsystems' Java Community Source were both disclosed-source models on a world-wide scope: everybody was allowed to modify the source code provided those modifications were given back to, and would be owned and controlled by, Microsoft and Sun Microsystems respectively. Neither initiative was very successful, perhaps because they did not offer an equal partnership: contributors do not get anything in return.
Open Source is an example of the (re)distribute model on a world-wide scope: the generally accepted definition of Open Source denotes software to be Open Source if it provides full access to its source code and the permission to use (on any computer, in any situation), modify (improving it, fixing bugs, augmenting functionality), and redistribute it provided the distribution terms remain unchanged. Inner Source is an example of the (re)distribute model on a limited scope, typically a programme within a division, company, or consortium. There are increasingly many examples of these, but they are typically not as visible from outside those communities as Open Source projects are.
Key Aspects
Accessibility
Accessibility of source code, bug tracking, and communication is the single most important requirement for effective co-operation. As outlined in section 1.1, software typically needs to be slightly adapted before it can be re-used. To adapt software, a developer not only needs access to its source code but also the opportunity to talk to the developers who built the software in the first place.
It is important to recognise that accessibility is not a black and white issue – there is a significant difference between easy access and being able to access information at all. If the threshold is too high, accessibility in effect decreases because most people simply don’t take the time and effort to get access.
Inner Source insists on using standard, lightweight tools that get the job done with little or no customisation. Customising tools, or using heavily specialised tools, results in a high threshold to access, which conflicts directly with effective co-operation! For instance, using the same version control system at different sites should make getting access to each other's systems easy, but if each site deploys proprietary customisations, collaboration becomes impossible.
Decentralised Ownership & Control
Making source code, lines of communication, and information in general so easily accessible enables developers to co-operate much better, but it also demands a clear definition of ownership, control, and responsibilities. This section defines ownership of source code and distinguishes four different roles a developer can play with respect to a component before explaining how ownership and control are organised in an Inner Source approach.
So what does “ownership of code” mean when code is infinitely re-duplicable, highly malleable, and the surrounding community has limited coercive power relationships (that is, one part of an organisation can influence but not force another)? In his analysis of ownership and control in Open Source software engineering, Eric Raymond defines the owner of code as “the person who has the exclusive right, recognised by the community at large, to distribute modified versions”. Similarly, we define the owner of code in Inner Source as “the team that has the exclusive right, recognised by the community at large, to distribute modified versions”. In short, this means the team developing and maintaining a component owns the code.
A developer can play certain roles with respect to a component: core developer, contributor, expert user, or end user.
- Core developers of a component are the developers who work mainly on this component. They know it inside out, own it, maintain it, and decide what goes in and when. The team of core developers typically delivers more than 80% of the functionality and bug fixes of a project.
- Contributors to a component use the component and need to adapt it to their needs. They use the component, know it fairly well, do not own it, and, being the first users of a component, typically provide bug reports and patches. Optionally, they maintain patches on the component but prefer to transfer their patches to the core developers because it saves time for everybody.
- Expert users of a component are developers who use the component as is (possibly with patches provided by a contributor). Typically, they don't know the component's internals all that well and barely ever provide bug reports or patches.
- End users are users (that is, people who, unlike developers, are not interested in source code) who simply use the component as is. They never look at the source code and, hence, barely ever provide bug reports and never provide patches.
The team of core developers owns their component, whereas contributors and expert users own the patches they maintain. Effectively, this results in a distributed model of ownership and control. If a patch is accepted into the component, ownership of the patch transfers to the team owning the component. The next section discusses patching in more detail. All source code is readable by the entire community, while write access is under the strict control of the core developers.
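These ownership rules can be sketched as a small model; the team and component names below are hypothetical, purely for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Patch:
    description: str
    owner: str  # team currently maintaining the patch

@dataclass
class Component:
    name: str
    owner: str  # team of core developers owning the component
    patches: list = field(default_factory=list)

    def accept_patch(self, patch: Patch) -> None:
        # On acceptance, ownership of the patch transfers to the
        # team that owns the component.
        patch.owner = self.owner
        self.patches.append(patch)

# A contributing team owns its patch until the core team accepts it.
scheduler = Component("scheduler", owner="platform-team")
fix = Patch("fix race condition", owner="product-team-a")
scheduler.accept_patch(fix)
print(fix.owner)  # platform-team
```

If the core team rejects the patch, its `owner` simply stays with the contributing team, which then maintains the patch separately.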
Adapt Rather Than Workaround
As discussed in section 1.2, using an existing component 'as is' is the ideal but often not feasible. There are five ways for a team to develop functionality that is very similar to another team's component but suits their own needs:
1. adapt, or patch, the component and either
   a. present the patch to the original owner, or
   b. maintain the patch separately
2. create workaround or glue code
3. fork the existing component
4. re-develop from scratch
These options are ordered by the amount of effort required and the amount of control gained. That is, patching costs a team the least effort and gains them little control, whereas re-developing the component from scratch costs them a lot of effort but yields full control without legacy. This range of options effectively enables a team to balance the amount of control they need against the amount of effort they are prepared to invest.
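This effort/control trade-off can be made concrete with a toy decision model; the numeric scores below are illustrative assumptions, not measurements:

```python
# Hypothetical effort/control scores (1 = low, 5 = high) for the five
# options; the numbers are illustrative only.
OPTIONS = {
    "1a. patch, transferred to owner": (1, 1),
    "1b. patch, maintained separately": (2, 2),
    "2.  workaround / glue code": (3, 1),
    "3.  fork": (4, 5),
    "4.  re-develop from scratch": (5, 5),
}

def cheapest_option(required_control: int) -> str:
    """Pick the lowest-effort option that still yields enough control."""
    viable = [(effort, name)
              for name, (effort, control) in OPTIONS.items()
              if control >= required_control]
    return min(viable)[1]

print(cheapest_option(1))  # 1a. patch, transferred to owner
print(cheapest_option(5))  # 3.  fork
```

Note that option 2 never wins in this model: it costs more effort than patching while gaining no more control, mirroring the argument below that workarounds are the worst option.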
By far the most efficient option for a team is to adapt the original component, developing a patch. The team owning the original component decides whether to accept the patch based on clear, well-documented rules, which prevents people from getting frustrated and demotivated when their patch is rejected. If accepted, ownership and maintenance of the patch transfer together with the patch itself.
If rejected, the team can maintain the patch themselves and, although this requires effort, it is typically an order of magnitude less than maintaining a workaround. Often, a patch will live separately for a limited time before being accepted into the original component, owing to the different timing and priorities of teams. Patches can range from a single line to several tens of thousands of lines of code but, to be accepted into the original component, have to adhere to the component's architecture. In other words, a team patching a component has to follow the owners' lead and, with that, has limited control while spending a limited amount of effort.
Alternatively, a team can take a component as is and create workarounds or glue code to make it work in their system. Generally, this is the worst option of all because it typically requires a lot more effort without any gain; developing a patch is usually much more efficient, and workarounds are only efficient as a temporary quick fix.
A team can decide to fork an existing component by taking its source code and starting their own version of it. This results in two components, owned by different teams, that provide similar functionality. A fork results in significant duplicate effort, but the forking team gains full control as owners of the new project. Typically, forks only happen when the teams involved have a fundamental difference of opinion, for instance on the architecture or features.
Finally, a team can decide to re-develop a component from scratch. This takes the most effort, but the team gains full control and carries no legacy code. Obviously, this more or less defeats the intent of a product family, or platform, approach. If re-development of a component seems necessary, the component probably shouldn't be part of the platform to begin with.
Most teams default to option 2, which is the worst in that it requires much additional effort at no gain. The Inner Source initiative aims to maximise flexibility and productivity by enabling, organising, and stimulating product teams to use options 1a and 1b whenever and wherever needed. The next section discusses in detail how to release early and often and why this is vital for efficient patching.
Release Early, Release Often
To really enable others to develop and maintain patches, it is vital to provide various entry levels: a team that currently plays the role of end user may in the future become an expert user or contributor. Depending on their role, a team should be able to find the right balance between ease of use and having the latest features. A contributor will insist on having the latest features and be prepared to put in some extra effort to get up to speed, whereas an end user will insist on getting a system that works out of the box. Inner Source comprises two kinds of teams: development teams and a distribution team, both of which provide multiple releases.
Development
Development teams focus on development of the software, augmenting functionality, finding and fixing bugs, and refactoring the system (for instance to beautify code, to fix a concrete problem, or to generalise a part). Examples of development teams in Open Source are Linux and KMail. At MIP, both the product and platform teams are development teams.
They provide three versions, each targeting a different audience:
- The bleeding edge provides a view on active development to enable contributors to track and participate in developers' discussions. This version is (almost) continuous and lags less than a day behind the repository. It is typically provided via a web interface with on-the-fly generated tarballs to make it easy to browse and download. Obviously, this version may often break and perhaps not even build, depending on the team's check-in procedure.
- Snapshots are released early and often (for instance, weekly) and aim to make it as easy as possible for contributors and expert users to contribute to active development. Snapshots provide the latest features and are somewhat tested but may require considerable work to get working in the context of particular versions of the platform at large.
- The stable release provides a properly tested and stable version to expert users and end users. It will be easy to build and use. The frequency of stable releases depends on the project's activity, but they will be a lot less frequent than snapshots.
Each version enables a specific aspect of close collaboration: the bleeding edge is essential for contributor involvement, whereas snapshots are vital for efficient development of patches and bug fixes – contributors need to be sufficiently close to the bleeding edge to minimise duplicate effort but, at the same time, it should take as little effort as possible to get started. The stable release lowers the threshold for use even further, at the expense of lagging further behind active development.
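The mapping from roles to the releases they can reasonably work from can be summarised in a small lookup (a sketch of the descriptions above, not prescribed tooling):

```python
# Which roles each release kind serves, per the descriptions above.
AUDIENCE = {
    "bleeding edge": {"contributor"},
    "snapshot": {"contributor", "expert user"},
    "stable": {"contributor", "expert user", "end user"},
}

def suitable_releases(role: str) -> list:
    """List the releases a team in the given role can work from,
    ordered from closest to the bleeding edge to most stable."""
    return [release for release, roles in AUDIENCE.items() if role in roles]

print(suitable_releases("expert user"))  # ['snapshot', 'stable']
print(suitable_releases("end user"))     # ['stable']
```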
All versions come with limited assistance on a best effort basis: the development team will assist contributors and expert users to a certain extent but always keeps its main focus on development. Users should turn to the distribution team for more extensively tested and integrated releases with full support.
Note that any release should obviously be usable. That is, a developer (whether contributor or expert user) should be able to get it to work without much effort. In practice, this means the usual demands on source code – readable code, a fairly logical structure or architecture, and a fairly clean code tree – as well as concise(!) documentation (e.g. a short readme or how-to outlining what other software is required and some pointers on installing and configuring it) to minimise the threshold to use the component. Mind you, it is perfectly fine if a snapshot requires more effort than the stable release, but keeping the threshold as low as possible is essential for efficient collaboration.
Also note that large development projects often split their development into a stable and an experimental branch to dodge the so-called “deadliness of deadlines” trap, in which case the stable branch delivers the stable release, the experimental branch delivers the snapshots, and the active development provides views on both branches. (“An immutable feature list is scheduled to be delivered at a fixed deadline and, as a result, quality drops. Relaxing either of these constraints can make scheduling workable again.” [DL87])
Distribution
The distribution team focusses on stability, ease of use, and support of an integrated solution. Examples of distribution teams in Open Source are Debian and KDE. They provide integrated, tested, and easy-to-use releases, geared towards (but not limited to) non-expert users:
- major and minor releases
- works-in-progress (beta) releases
Using the (stable) releases from the development teams, the distribution team tests and integrates the components into a major release that works out of the box. In addition, they provide extra documentation guiding inexperienced users and provide support for the system. Optionally, they deliver minor releases in between major releases to provide security updates or to fix high-priority bugs not found in time to make it into the major release.
Optionally, the distribution team delivers works-in-progress, or beta, releases in between major releases. These releases aim to increase feedback from expert and end users, stimulating bug reporting to catch as many bugs as possible before the next stable release.
Autonomous life cycles
Decoupling the dynamics of development teams – that is, having independent release schedules – dramatically increases flexibility, both for the teams themselves and their users.
Teams that are working on a component which is still in its infancy or adolescence phase typically need to release frequently whereas a much slower pace suits teams that evolve mature components. It makes little sense to force teams working on components of different maturity into the same pace of releasing.
Moreover, coupling team dynamics allows (and probably will cause) development to become tightly intertwined, with illegal dependencies creeping in that will be detected much later, if at all. Even if the teams work with a component-and-interface-based approach, coupling team dynamics typically results in a system that is much more monolith-like, defeating one of the key drivers for using such an approach in the first place. Decoupled team dynamics forces the teams to keep their parts properly separated, minimising the threshold for contributors and expert users to combine a snapshot of one component with an otherwise stable system. This, in turn, maximises the flexibility for teams to develop and maintain the diversity they need.
Balancing Effort Against Control
Together, the releases provide a range of alternatives that enable teams building on top of the software to balance how close they need to be to the bleeding edge against how much effort they are willing to invest.
However, for this to be effective and efficient (that is, for teams to choose the proper option), it is absolutely vital that teams have a thorough understanding of the costs and benefits associated with each option. For instance, a team will need very good reasons for forking because it is very expensive (in time to market, quality of the component, and duplicate effort) for the community at large. Many engineers tend to prefer option 3 or even 4 when option 1 is perfectly feasible, because they gain full control or suffer from what is sometimes called the “not invented here” syndrome. Hence, it is important either to set incentives guiding teams to choose the most efficient option or to disallow options 3 and 4.
One way to set incentives is to partly evaluate teams on their co-operative performance, based on key statistics such as how many times their components are re-used, how much (as a percentage of lines of code) those components needed to be adapted to be re-usable, how many times they use components from others, and how many patches they supplied and needed.
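As an illustration, such statistics could be combined into a single score; the formula and weights below are entirely hypothetical, not part of any Inner Source standard:

```python
def cooperation_score(times_reused: int,
                      adaptation_pct: float,
                      components_used: int,
                      patches_supplied: int,
                      patches_needed: int) -> float:
    # Reward components that are re-used widely with little adaptation,
    # re-using others' components, and supplying patches upstream;
    # penalise heavy reliance on patches from others. Weights are
    # illustrative assumptions.
    reuse = times_reused * (1.0 - adaptation_pct / 100.0)
    return reuse + components_used + patches_supplied - 0.5 * patches_needed

# A team whose component is re-used 4 times with 10% adaptation,
# that uses 3 other components, supplied 5 patches, and needed 2:
print(round(cooperation_score(4, 10.0, 3, 5, 2), 1))  # 10.6
```

In practice, the choice and weighting of such statistics would need to be agreed within the programme, as they directly shape the incentives teams respond to.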
Alternatively, a platform programme can decide not to allow forks and re-development. In essence, the freedom to fork provides an escalation mechanism, which is vital to prevent some teams from “being more equal than others”. In Open Source software engineering, this works very well – forking usually only happens for a valid reason, for instance a difference in focus (such as GNU Emacs versus XEmacs) or a fundamental difference of opinion (such as the Qt licensing issue) – but in corporations, the management hierarchy already provides such escalation mechanisms and, as such, forms an alternative to the freedom to fork.
Meritocracy of Peers
In an Inner Source community, developers collegially compete with each other through their ideas, arguments, and especially their working software submissions. Concepts and solutions are judged on their technical merit via the consensus of critical, multi-level peer review and rigorous testing in a wide variety of different software and platform environments.
Over time, the developers who consistently devise more elegant, complete, and better-performing solutions gain well-deserved peer recognition, stature within the project, and, consequently, more influence in technical discussions and in reviewing the work of others. This enables the most talented software developers to stand out and gain influence while allowing the less gifted to contribute what they can.
In essence, Inner Source results in a meritocracy of peers: the more you contribute and the higher the quality of your contributions, the more respect, status, and influence you earn. This is generally a very (if not the most) effective way to motivate a competent engineer.
© 2023 ASERVO Software GmbH, CC-BY-SA-4.0 License