The Linux Kernel celebrates its 30th anniversary and it still has a lot to give
At the beginning of the month we published a note on the 30th anniversary of the first website, a milestone that undoubtedly made history and one I have always related to Linux, since the first website and the first prototype of the Linux kernel go hand in hand: both were released in the same year.
On August 25, 1991, after five months of development, the 21-year-old student Linus Torvalds announced in the comp.os.minix newsgroup that he was working on a working prototype of a new operating system, Linux, for which ports of bash 1.08 and gcc 1.40 had already been completed. The first public version of the Linux kernel was released on September 17.
Kernel 0.0.1 was 62 KB in compressed form and contained about 10,000 lines of source code; by comparison, today's Linux kernel has more than 28 million lines of code.
According to a study commissioned by the European Union in 2010, the approximate cost of developing a project similar to a modern Linux kernel from scratch would have been more than $1 billion (calculated when the kernel had 13 million lines of code); another estimate puts it at more than $3 billion.
A bit about Linux
The Linux kernel was inspired by the MINIX operating system, whose restrictive license Linus disliked. Later, when Linux became a famous project, detractors tried to accuse Linus of directly copying the code of some MINIX subsystems.
The attack was repelled by MINIX's author, Andrew Tanenbaum, who commissioned a student to do a detailed comparison of the MINIX code with the first public versions of Linux. The study found only four negligible matching code blocks, all attributable to POSIX and ANSI C requirements.
Linus originally thought of calling the kernel Freax, from "free," "freak" and X (Unix). But the kernel got the name "Linux" thanks to Ari Lemmke, who, at Linus's request, put the kernel on the university's FTP server and named the directory containing the file not "freax," as Torvalds had asked, but "linux."
Notably, the opportunistic entrepreneur William Della Croce managed to trademark "Linux" and initially intended to collect royalties, but later changed his mind and transferred all rights to the trademark to Linus. The official mascot of the Linux kernel, the penguin Tux, was selected through a competition held in 1996. The name Tux stands for Torvalds UniX.
Regarding the kernel's growth over the last 30 years (a quick growth-rate calculation follows the list):
- 0.0.1 - September 1991, 10 thousand lines of code
- 1.0.0 - March 1994, 176 thousand lines
- 1.2.0 - March 1995, 311 thousand lines
- 2.0.0 - June 1996, 778 thousand lines
- 2.2.0 - January 1999, 1.8 million lines
- 2.4.0 - January 2001, 3.4 million lines
- 2.6.0 - December 2003, 5.9 million lines
- 2.6.28 - December 2008, 10.2 million lines
- 2.6.35 - August 2010, 13.4 million lines
- 3.0 - August 2011, 14.6 million lines
- 3.5 - July 2012, 15.5 million lines
- 3.10 - July 2013, 15.8 million lines
- 3.16 - August 2014, 17.5 million lines
- 4.1 - June 2015, 19.5 million lines
- 4.7 - July 2016, 21.7 million lines
- 4.12 - July 2017, 24.1 million lines
- 4.18 - August 2018, 25.3 million lines
- 5.2 - July 2019, 26.55 million lines
- 5.8 - August 2020, 28.4 million lines
- 5.13 - June 2021, 29.2 million lines
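To put those figures in perspective, here is a small, back-of-the-envelope C program that computes the average annual growth rate implied by the first and last entries of the list above; nothing beyond those two numbers is assumed:

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Figures taken from the list above. */
    const double loc_1991 = 10000.0;     /* kernel 0.0.1, 1991 */
    const double loc_2021 = 29200000.0;  /* kernel 5.13, 2021  */
    const double years    = 30.0;

    /* Compound annual growth rate: (end / start)^(1 / years) - 1. */
    double cagr = pow(loc_2021 / loc_1991, 1.0 / years) - 1.0;

    /* Prints roughly 30%: the code base has grown by about a third
     * per year, sustained over three decades.
     * Build with: cc growth.c -o growth -lm */
    printf("average annual growth: %.1f%%\n", cagr * 100.0);
    return 0;
}
```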
And on the development and milestone side:
- September 1991: Linux 0.0.1, the first public release; it supported only i386 CPUs and booted from a floppy disk.
- January 1992: Linux 0.12, with which the code began to be distributed under the GPLv2 license.
- March 1992: Linux 0.95, which could run the X Window System and added support for virtual memory and swap partitions; the first distributions, SLS and Yggdrasil, appeared.
- Summer 1993: the Slackware and Debian projects were founded.
- March 1994: Linux 1.0, the first officially stable version.
- March 1995: Linux 1.2, with a significant increase in the number of drivers, support for the Alpha, MIPS and SPARC platforms, an expanded network stack, a packet filter, and NFS support.
- June 1996: Linux 2.0, with support for multiprocessor systems.
- January 1999: Linux 2.2, with a more efficient memory management system, support for IPv6, a new firewall implementation, and a new sound subsystem.
- January 2001: Linux 2.4, with support for 8-processor systems and 64 GB of RAM, the Ext3 file system, USB, and ACPI.
- December 2003: Linux 2.6, with SELinux support, automatic kernel tuning tools, sysfs, and a redesigned memory management system.
- September 2008: the first version of the Android platform, based on the Linux kernel, was released.
- July 2011: after 10 years of development of the 2.6.x branch, the transition to 3.x numbering was made.
- April 2015: Linux 4.0; the number of git objects in the repository reached 4 million.
- April 2018: the repository passed the 6 million git object mark.
- January 2019: the Linux 5.0 kernel branch was formed.
- August 2020: kernel 5.8 was released, the largest in number of changes of any kernel in the project's history.
- 2021: initial code enabling drivers to be written in the Rust language was added to the linux-next branch.
The Linux kernel is one of the most popular operating system kernels in the world. Less than 30 years after its humble beginnings in 1991, the Linux kernel now underpins modern computing infrastructure, with 2019 estimates of the number of running Linux kernels ranging upwards of twenty billion. To put that in perspective: There are about 3 Linux kernels for every living person.
The Linux kernel powers household appliances, smartphones, industrial automation, Internet data centers, almost all of the cloud, financial services, and supercomputers. It even powers a few percent of the world’s desktop systems, including the one that I am typing these words into. But the year of the Linux desktop continues its decades-long tradition of being next year.
But it wasn’t always that way.
A brave choice, big commitment in Linux’s early days
The Linux kernel was still a brave choice when IBM joined the Linux community in the late 1990s. IBM began its Linux-kernel work with a skunkworks port of Linux to the IBM mainframe and a corporate commitment that resulted in IBM investing $1B in Linux in 2001. Linux was ported to all IBM servers, and even to IBM Research's Linux-powered wristwatch. Linux soon enjoyed widespread use within IBM's hardware, software, and services.
Of course, IBM wasn’t the only player betting on Linux. For example, an IBM sales team spent much time preparing to convince a long-standing, technically conservative client to start moving towards Linux. When the team went to give their pitch, the client opened the discussion with: “We have decided that we are going with Linux. Will you be coming with us?” Although this destroyed untold hours of preparation, it produced a result beyond the sales team’s wildest imaginations.
And it wasn’t an isolated incident.
Setting Linux up for success
This widespread enthusiasm motivated IBM not only to make substantial contributions to Linux, but also to come to its defense. First, we made patent pledges committing not to use our patents against Linux. We took it a step further and opted to co-found the Open Invention Network, which helped defend open source projects such as the Linux kernel against attacks by patent holders. We made numerous visits to the courtroom to defend ourselves against a lawsuit related to our Linux involvement, and we co-founded several umbrella organizations to facilitate open source projects, perhaps most notably helping to found the Linux Foundation.
IBM is also a strong technical contributor to the Linux kernel, ranking in the top ten corporate contributors and having maintainers for a wide range of Linux-kernel subsystems. Of course, IBM contributes heavily to support its own offerings, but it is also a strong contributor in the areas of scalability, robustness, security, and other areas that benefit the Linux ecosystem.
Of course, not everything that IBM attempted worked out. IBM’s scalability work in the scheduler was never accepted into the Linux kernel. Although its journaling filesystem (JFS) was accepted and remains part of the Linux kernel, it seems safe to say that JFS never achieved the level of popularity that IBM had hoped for. Nevertheless, it seems likely that IBM’s efforts helped to inspire the work leading to the Linux kernel’s excellent scalability, features, and functionality in its filesystems and scheduler.
In addition, these experiences taught IBM to work more closely with the community, paving the way to later substantial contributions. One example is the CPU-groups feature of the community’s scheduler that now underpins containers technologies such as Docker, along with the virtio feature that plays into the more traditional hypervisor-based virtualization. Another example is numerous improvements leading up to the community’s EXT4 filesystem. A final example is the device-tree hardware specification feature, originally developed for IBM’s Power servers but now also used by many embedded Linux systems.
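Since the CPU-groups work surfaces in today's kernels as the cgroup filesystem, a minimal C sketch can show the core mechanism that container runtimes build on. Everything here is illustrative: it assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, and the "demo" group name is made up:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    /* Creating a control group is just creating a directory
     * under the cgroup filesystem (requires root). */
    const char *group = "/sys/fs/cgroup/demo";
    if (mkdir(group, 0755) != 0 && errno != EEXIST) {
        perror("mkdir");
        return 1;
    }

    /* Moving a process into the group is just writing its PID
     * to the group's cgroup.procs file. */
    char path[256];
    snprintf(path, sizeof(path), "%s/cgroup.procs", group);
    FILE *f = fopen(path, "w");
    if (!f) {
        perror("fopen");
        return 1;
    }
    fprintf(f, "%d\n", (int)getpid());
    fclose(f);

    /* From here, writes to files such as cpu.max or memory.max in the
     * same directory would limit this process's resources. */
    return 0;
}
```

Container engines such as Docker do essentially this, at much larger scale and with many more controllers, which is why that scheduler-era groundwork still matters.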
Celebrating 30 Years of Open
Achieving impossible results
It has also been a great privilege for IBM to be involved in a number of Linux-kernel efforts that produced results widely believed to be impossible.
First, at the time that IBM joined the Linux kernel community, the kernel could scale to perhaps two or four CPUs. At the time there was a large patchset from SGI that permitted much higher scalability, but this patchset primarily addressed HPC workloads. About ten years of hard work across the community changed this situation dramatically, so that, despite the naysayers, the same Linux-kernel source code supports both deep embedded systems and huge servers with more than one thousand CPUs and terabytes of memory.
Second, it was once common knowledge that achieving sub-millisecond response times required a special-purpose, real-time operating system; in other words, sub-millisecond response times certainly could not be achieved by a general-purpose operating system such as the Linux kernel. IBM was an important part of a broad community effort that proved this wrong, as part of an award-winning effort including Raytheon and the US Navy. Although the real-time patchset has not yet been fully integrated into the mainline Linux kernel, it achieves not merely deep sub-millisecond response times, but deep sub-hundred-microsecond response times. And these response times are achieved not only on single-CPU embedded systems, but also on systems with thousands of CPUs.
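To make those response-time numbers concrete, here is a minimal, hedged C sketch of the kind of measurement involved, in the spirit of the community's cyclictest tool: the program requests a real-time priority and records how late it wakes from a periodic 1 ms timer. It assumes a Linux system and root privileges; on a PREEMPT_RT kernel the worst case should be far smaller:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    /* Ask for a real-time FIFO priority; continue anyway if refused. */
    struct sched_param sp = { .sched_priority = 80 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler (try running as root)");

    struct timespec next, now;
    clock_gettime(CLOCK_MONOTONIC, &next);
    long worst_ns = 0;

    for (int i = 0; i < 1000; i++) {
        /* Schedule the next wakeup exactly 1 ms in the future. */
        next.tv_nsec += 1000000L;
        if (next.tv_nsec >= 1000000000L) {
            next.tv_sec += 1;
            next.tv_nsec -= 1000000000L;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);

        /* Wakeup latency = how far past the deadline we actually woke. */
        clock_gettime(CLOCK_MONOTONIC, &now);
        long lat_ns = (now.tv_sec - next.tv_sec) * 1000000000L
                    + (now.tv_nsec - next.tv_nsec);
        if (lat_ns > worst_ns)
            worst_ns = lat_ns;
    }

    printf("worst-case wakeup latency over 1000 cycles: %ld us\n",
           worst_ns / 1000);
    return 0;
}
```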
Third, only about a decade ago, it was common knowledge that battery-powered embedded systems required special-purpose operating systems. You might be surprised that IBM would be involved in kernel work in support of such systems. One reason for IBM's involvement was that some of the same code that improves battery lifetime also improves the Linux kernel's virtualization capabilities, capabilities important to the IBM mainframe. A second reason was the large volume of ARM chips then produced by IBM's semiconductor technology partners. This latter reason motivated IBM to co-found the Linaro consortium, which improved Linux support for ARM's processor families. The result, as billions of Android smartphone users can attest, is that the Linux kernel has added battery-powered systems to its repertoire.
Fourth and finally, version 5.2 of the Linux kernel comprises 13,600 changes from 1,716 kernel developers. The vast majority of these changes were applied during the two-week merge window immediately following the release of version 5.1, with only a few hundred new changes appearing in each of the weekly release candidates that followed the merge window. This represents a huge volume of changes from a great many contributors, with little formal coordination. Validating these changes is both a challenge and a first-class concern.
One of IBM’s contributions to validation is “-next” integration testing, which checks for conflicts among the contributions intended for the next merge window. The effects of -next integration testing, combined with a number of other much-appreciated efforts throughout the community, have not been subtle. Ten years ago, serious kernel testing had to wait for the third or fourth release candidate because of bugs introduced during the preceding merge window. Today, serious kernel testing can almost always begin with the first release candidate, which comes out immediately at the close of the merge window.
But is Linux done yet?
Not yet.
A continuing effort
Although developers should be proud of the great increases in the stability of the Linux kernel over the years, the kernel still has bugs, some of which are exploitable. There is a wide range of possible improvements, from more aggressively applying tried-and-true testing techniques to over-the-horizon research topics such as formal verification. In addition, existing techniques are finding new applications: CPU hotplug (which IBM spearheaded in the early 2000s) has recently been used to mitigate hardware side-channel attack vectors.
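As a concrete illustration of that CPU-hotplug interface, offlining a CPU from user space is a single sysfs write. A minimal sketch, assuming root privileges and a hotplug-capable SMP kernel (cpu1 is just an example; cpu0 often cannot be offlined):

```c
#include <stdio.h>

int main(void)
{
    /* Each hotpluggable CPU exposes an 'online' file in sysfs. */
    const char *path = "/sys/devices/system/cpu/cpu1/online";

    FILE *f = fopen(path, "w");
    if (!f) {
        perror("fopen (requires root and hotplug support)");
        return 1;
    }

    /* Writing '0' takes the CPU offline; writing '1' brings it back. */
    if (fputc('0', f) == EOF)
        perror("fputc");
    fclose(f);
    return 0;
}
```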
The size of hardware systems is still increasing, which will require additional work on scalability. Many of these larger systems will be used in various cloud-computing environments, some of which will pose new mixed-workload challenges. Changing hardware devices, including accelerators and non-volatile memory, will require additional support from Linux, as well as from hardware features such as IBM’s Power Systems servers’ support of the NVLink and CAPI interconnects.
Finally, there is more to security than simply fixing bugs faster than attackers can exploit them (though that would be challenge enough!). Although there is a great deal of security work needed in a great many areas, one important advance is Pervasive Encryption for IBM Z.
IBM congratulates the Linux kernel community on its excellent progress over the decades, and looks forward to being part of future efforts overturning yet more morsels of common wisdom!
What 30 Years of Linux Taught the Software Industry
Over the last 30 years, Linux has become the largest collaborative development project in the history of computing. Reflecting on what made this possible, and on how its open source philosophy ultimately won over the industry, offers software vendors valuable lessons from this amazing success story.
The web may not have reached full adulthood yet, but it has already crafted its own mythology.
August 25, 1991: Linus Torvalds, a 21-year-old university student from Finland, writes a post to a Usenet group: “Hello everybody out there using minix — I’m doing a (free) operating system (just a hobby, won’t be big and professional like gnu) for 386 (486) AT clones […]”. A few weeks later, the project, which will eventually be known as Linux, is published for the first time.
This is the starting point of an epic that few could have foreseen.
Fast-forward 30 years and the Linux kernel isn’t only running on most of the web servers and smartphones around the globe, but it also supports virtually all of the much more recent cloud infrastructure. Without open source programs like Linux, cloud computing wouldn’t have happened.
Among the major factors that propelled Linux to success is security. Today, the largest software companies in the world are taking open source security to new levels, but the Linux project was one of the first to emphasize this.
How Linux Became the Backbone of the Modern IT World
Brief History
Open source predates the Linux project by many years and is arguably as old as software itself. Yet it is the success of the latter that propelled the movement in the 1990s. When Torvalds first released it in 1991, the Linux kernel was the GNU project's 'missing link' to a completely free software operating system, one that could be distributed and even sold without restrictions. In the following years, as the project started to incorporate proprietary licensed components and grow in popularity, a clarification of the meaning of "free software" became necessary.
This led to the coining of the term “open source” as we use it today, thanks in part to Eric Raymond’s seminal paper The Cathedral and the Bazaar, a “reflective analysis of the hacker community and free software principles.” Open source was chosen to qualify software in which the source code is available to the general public for use or modification from its original design, depending on the terms of the license. People may then download, modify and publish their version of source code (fork) back to the community.
Open source projects started gaining traction in the late nineties thanks to the popularity of software like the Apache HTTP Server, MySQL and PHP, which ran the first dynamic websites on the internet.
Facts and Figures
Today, not only is Linux powering most of the digital era, but open source has become the leading model for how we build and ship software. Though most people don’t realize it, much of the technology we rely on every day runs on free and open source software (FOSS). Phones, cars, planes and even many cutting-edge artificial intelligence programs use open source software. According to the Linux Foundation, 96.3% of the world’s top one million servers run on Linux and 95% of all cloud infrastructure operates on it. Other infrastructure also relies on open source: 70% of global mobile subscribers use devices running on networks built using ONAP (Open Network Automation Platform).
Linux adoption is very high in professional IT, where it has become a de facto standard, especially with the advent of the cloud era. In fact, 83.1% of developers say Linux is the platform they prefer to work on. This success is due in large part to the community that has contributed to its source code since its creation: more than 15,000 developers from more than 1,500 companies. Linux went on to become, arguably, the biggest success story of the free software movement, proving that open source could produce software as powerful as any sold by a corporation.
The Linux Foundation, a non-profit technology consortium founded in 2000 to support the collaborative development of Linux and open source software projects, is itself a big success. It now has more than 100 projects under its umbrella, spread across technology sectors like artificial intelligence, autonomous vehicles, networking and security. Several subset foundations have emerged over the years, including the Cloud Foundry Foundation, the influential Cloud Native Computing Foundation, and the recently announced Open Source Security Foundation. The Foundation estimates the total shared value created from the collective contributions of its community at a whopping $54.1 billion.
All these achievements may not have been possible without the embrace of open source by the enterprise world, which may represent its biggest win.
Enterprise Adoption
Companies began to realize that many open source projects were easier and cheaper to implement than asking their developers to build the basic pieces of an internet business over and over again from scratch.
Twenty years ago, most businesses ran atop proprietary software from Microsoft, Oracle and IBM, and the idea of collaborating on big software projects might have sounded laughable to them. Today, these companies, along with relative newcomers such as Google, Facebook and Amazon, are not only employing thousands of full-time contributors to work on open source projects like Linux, they also regularly choose to open source some of their state-of-the-art projects; from Google Brain’s machine learning platform TensorFlow and container orchestration platform Kubernetes to Facebook’s React.
There’s no question that open source software created a new wave of business opportunities. As more companies took an interest in open source projects, they realized they didn’t necessarily have the in-house expertise to manage those projects themselves and turned to startups and larger companies for help.
Even Microsoft, which famously warred against the very concept of Linux for nearly a decade, made a strategic shift to embrace open source in the 2010s, led by CEO Satya Nadella. The IT giant finally joined the Linux Foundation in 2016 and acquired GitHub, the largest host for open source projects, two years later. It has since become one of the biggest sponsors of open source projects.
As a consequence, the stakes have been raised for open source software, which is the engine powering the shift toward the cloud for virtually every company. In this context, security is becoming a topic of the utmost importance, and the commitment to secure the open source ecosystem is growing fast.
Setting a Standard for Security and Trust
Open Source Security Issues
Following the OSS adoption boom, the sustainability, stability and security of these software packages are now a major concern for every company that uses them.
The Census II report on structural and security complexities in the modern-day supply chain, "where open source is pervasive but not always understood," revealed two concerning trends that could make FOSS more vulnerable to security breaches. First, the report said it is common to see popular packages published under individual developers' accounts, which raises questions of security and reliability. Second, it is very common to see outdated versions of open source programs in use, meaning they lack recent security patches.
The 2021 OSSRA report agrees: “98% of the codebases audited over the past year contain at least one open source component, with open source comprising 75% of the code overall.” The report also noted that 85% of the audited codebases contained components “more than four years out of date”.
This highlights the mounting security risk posed by “unmanaged” open source: “84% of audited codebases containing open source components with known security vulnerabilities, up from 75% the previous year. Similarly, 60% of the codebases contained high-risk vulnerabilities, compared to 49% just 12 months prior.” Not only is the security posture affected, but there are also compliance issues that can arise from unsupervised integration of open source content because licenses can be conflicting or even absent.
Because large corporations are now a big part of the open source ecosystem, their sponsorship is a welcome source of financing for many contributors who had until now worked for free, yet it may not be enough. The open source community is well known for its commitment to independence, its sense of belonging and its self-sufficiency, and simply expecting contributors to voluntarily address security issues is unlikely to succeed.
This is where the experience of building Linux over 30 years and coordinating the work of thousands of individual contributors may be an example to follow.
Linux Foundations
In Linux kernel development, security is taken very seriously. Because the kernel is an underlying layer for so many public and private software "bricks" in the digital world, any mistake can cost businesses millions, if not lives. From the beginning, the project adopted a decentralized development approach with a large number of contributors collaborating continuously, and it has consolidated a strong peer-review process as the community development effort grew and expanded.
The last stable release at the time of writing is 5.14, released on August 29th, 2021, only a few days before the 30th birthday of the project. The most important features in the release are security-related: One is intended to help mitigate processor-level vulnerabilities like Spectre and Meltdown and the other concerns system memory protection, which is a primary attack surface to exploit. Each Linux kernel release sees close to 100 new fixes per week committed by individuals and professionals from the likes of Intel, AMD, IBM, Oracle and Samsung.
With such broad adoption and a long history, the Linux project has reached a level of maturity that few, if any, other FOSS projects have seen. The review process and release model have built confidence among numerous downstream vendors. Although the world is not perfect and it is arguably difficult for them to keep up with such a high rate of change, they can at least benefit from strong security enforcement mechanisms and adapt their security posture in accordance with their "risk appetite": vendors are able to do the calculus of determining how old a kernel they can tolerate exposing users to.
Pushing the Boundaries of Open Source Security
Heartbleed and the Fragility of OS Security
In April 2014, a major security incident affecting the OpenSSL cryptography library was disclosed as "Heartbleed." The developer who introduced the bug acknowledged that, though he was working on the project with a handful of other engineers:
“I am responsible for the error because I wrote the code and missed the necessary validation by an oversight. Unfortunately, this mistake also slipped through the review process and therefore made its way into the released version.” OpenSSL, an open source project, is widely used to implement the Transport Layer Security (TLS) protocol. In other words, it’s a fundamental piece used to secure a large part of the web.
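To see what "missed the necessary validation" means in practice, consider the sketch below. It is illustrative only, not the actual OpenSSL code, but the Heartbleed bug was of exactly this class: a length field supplied by the peer was trusted when echoing a buffer back, so the reply could include adjacent heap memory:

```c
#include <stdlib.h>
#include <string.h>

/* Echo a 'heartbeat' payload back to the peer.
 * claimed_len comes from the request itself; actual_len is how many
 * bytes the peer really sent. */
unsigned char *echo_reply(const unsigned char *payload,
                          size_t claimed_len, size_t actual_len)
{
    /* THE FIX (the missing validation): reject mismatched lengths. */
    if (claimed_len > actual_len)
        return NULL;

    unsigned char *reply = malloc(claimed_len);
    if (!reply)
        return NULL;

    /* Without the check above, this memcpy would read past the end of
     * the received data and leak up to 64 KB of adjacent heap memory
     * per request, potentially including private keys. */
    memcpy(reply, payload, claimed_len);
    return reply;
}
```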
Open source was seen as fundamentally secure for a long time because the more people examine a line of code, the better the chances of spotting any weakness. Additionally, this model prevents “security by obscurity,” whereby the bulk of the protection comes from people not knowing how the security software works—which can result in the whole edifice tumbling down if that confidential information is released or discovered externally.
This incident was a major turning point for a large share of the biggest web corporations: They realized that many open source technologies underpinning their core operations could not be “assumed to be secure” anymore. Any human error could have huge implications; therefore, a specific effort had to be made to improve the security in this specific space.
A New Era for Open Source
As we advance into an era where open source is omnipresent in codebases, tooling, networks and infrastructure, and is reaching fields beyond software, security awareness is starting to take hold. But it needs a lot more work.
A big part of the challenge, to begin with, is for the industry to understand the scope of the problem.
Google just announced that it will be committing “$100 million to support third-party foundations that manage open source security priorities and help fix vulnerabilities.”
The Secure Open Source (SOS) pilot program, run by the Linux Foundation, will reward developers for enhancing the security of critical open source projects that we all depend on.
In doing so, Google leads the way in expanding the financial sponsorship of open source by big players such as companies and governments, which are increasingly funding it both directly and indirectly. However, Google also recommends that organizations "understand the impact they have on the future of the FOSS ecosystem and follow a few guiding principles."
What could these principles look like?
A Roadmap to Safely Use and Contribute to Open Source
The Linux Foundation has proposed a Trust and Security Initiative, which describes a collection of eight best practices (with three degrees of maturity) that open source teams should use to secure the software they produce, and that a wider audience can adopt to "raise the collective security bar." Here they are:
- Clarify roles and responsibilities, making sure everyone is aware of their security responsibilities across the organization.
- Set up a security policy for everyone; in other words, a clear north star for all members of the organization.
- "Know your contributors": a set of practices for making risk-based decisions about whom to trust and for countering offensive techniques such as the poisoning of upstream code.
- Lock down the software supply chain: it has become a preferred target, as adversaries have understood that they can have a bigger and more effective impact with less effort than by targeting individual systems.
- Provide technical security guidance to narrow potential solutions down to the most appropriate ones in terms of security.
- Deploy security playbooks that define how to carry out specific security processes, notably incident response and vulnerability management, from assigning roles and responsibilities to publishing security policies. This may feel formal and old-school, but pre-defined playbooks mean teams can focus on shipping software rather than learning how to do security at the least convenient and most stressful time.
- Develop security testing techniques, with automated testing strongly recommended since it scales better, imposes less friction and cost on teams, and aligns well with modern continuous delivery pipelines.
However, the authors of the guide are aware that some major challenges are still facing the industry and, as such, need to be addressed. They mention:
- The lack of open source security testing tools
- The fact that open source package distribution is broken
- The fact that the CVE format for vulnerability disclosure is also broken
- The lack of a standard for a security build certificate, which would allow any consumer to transparently verify that a product or component complies with the announced specifications
“The types of verification can and should include the use of automated security tools like SAST, DAST and SCA, as well as verification of security processes like the presence of security readmes in repos and that security response emails are valid.”
A scheme like this could have a significant and lasting effect on the security quality of open source software and the internet at large.
The Linux project, born 30 years ago, is present today in every layer of the modern software stack. It runs the largest server clusters powering the modern web, and any business going digital will use it at some point. This unparalleled longevity and success demonstrate that the open source model is compatible with the requirements of enterprise-grade services and economically viable. Now that open source is all the rage in the software industry, a consensus and an action plan on how to ensure the sustainability of this ecosystem have become urgent. The top priority for businesses that depend on it is to adopt strong application security guidelines, like the ones promoted by the Linux Foundation, which have proven their value.
One last note on the nature of open source: as businesses are now bound together by their common use of open source components, they should not fall into the "tragedy of the commons" trap, waiting for others to take action, for instance, to improve the global software security landscape. This may be one of the biggest challenges confronting our highly collaborative industry.
More Information:
https://developer.ibm.com/blogs/ibm-and-the-linux-kernel/
https://devops.com/what-30-years-of-linux-taught-the-software-industry/
https://www.howtogeek.com/754345/linux-turns-30-how-a-hobby-project-conquered-the-world/