
Transparent data encryption (TDE)


What is Transparent data encryption (TDE)

What is data encryption?

Data encryption is a system that encodes your data so other people can’t read it. Consider this:

Xibu jt ebub fodszqujpo? No, that’s not a massive typo — that’s the phrase “What is data encryption?” encrypted with a simple Caesar cipher, or shift cipher. Each letter is replaced by the letter that follows it in the alphabet, so when you see the encrypted phrase, it’s just gibberish. You can’t decrypt it if you don’t know the encryption system.

  • What is data encryption?
  • Why use data encryption?
  • How does data encryption work?
  • Common encryption algorithms
  • Data at rest vs. data in transit
  • How to encrypt your PC
  • Mobile data encryption
  • Wireless encryption types
  • End-to-end vs. VPN encryption
  • Different types of VPN encryption
  • Can encrypted data be hacked?
  • Encrypt your data securely and easily with a VPN

What Is Data Encryption and How Does it Work?

Data encryption protects your data from being seen, hacked, or stolen. VPNs provide data encryption at the consumer level, but how about end-to-end encryption? Is a VPN the best option, or are there other solutions out there? What does data encryption even mean? Find out with our guide to everything you need to know about data encryption.

Data encryption works along the same lines as that simple cipher, but with far more complex encryption systems. These transform regular data, stored as plaintext, into what’s known as “ciphertext” — a seemingly nonsensical string of letters, numbers, and symbols. You can only unscramble the data, or decrypt it, with a specific decryption key.

Introducing Oracle Key Vault: Centralized Keys, Wallets, and Java Keystores

<iframe width="560" height="315" src="https://www.youtube.com/embed/2B5cURPPCIM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Why use data encryption?

Data encryption is all about protecting your personal information from anyone who’d like to get their hands on it. This idea stems from humanity’s long history of encoded communications, the use and study of which is known as cryptography. Some of these encryption systems, such as the writing used in the Renaissance-era Voynich manuscript, still remain uncracked, even with the aid of modern computing.

Enforcing in-depth Data protection & privacy with Database Security essentials

<iframe width="560" height="315" src="https://www.youtube.com/embed/Lf1lt2sdVC8" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

So why is data encryption important? In short, using encryption protects your personal data. You can use data encryption to safeguard yourself against a multitude of online threats, including identity theft, hacking, and fraud.

Many businesses also use encryption algorithms in network security to defend against spyware and other malware. Anyone who manages to obtain encrypted data won’t be able to read it — preventing hackers from gaining access to business secrets. That means data encryption also protects against certain strains of ransomware that hijack data and threaten to publish it unless a ransom is paid.

Enforcing in-depth Data protection & privacy with Database Security essentials

<iframe src="//www.slideshare.net/slideshow/embed_code/key/FebEwQmGvsfoAw" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="//www.slideshare.net/oracle_imc_team/partner-webcast-enforcing-indepth-data-protection-privacy-with-database-security-essentials" title="Partner Webcast – Enforcing in-depth Data protection &amp; privacy with Database Security essentials" target="_blank">Partner Webcast – Enforcing in-depth Data protection &amp; privacy with Database Security essentials</a> </strong> from <strong><a href="https://www.slideshare.net/oracle_imc_team" target="_blank">OracleIMC (Innovation &amp; Modernization Center)</a></strong> </div>

How can encryption be used to protect information?

Did you know that you’re benefiting from data encryption nearly every time you use the internet? Here are a few uses of encryption that you may encounter in your daily online life:

Many modern websites feature HTTPS encryption: you’ll know because the URL begins with https, or because your browser shows a little padlock icon in the address bar.

HTTPS encryption protects your internet traffic while it travels between your device and the website you’re using, preventing anyone from either listening in or altering the data while it’s in transit. You should never divulge any sensitive personal data, such as credit card numbers, while on an unsecured website with plain old HTTP. If you don’t know how secure a certain site is, it’s always best to do a quick website safety check before entering any personal information.

Gmail and Outlook — two of the most widely used email platforms — encrypt all emails by default. The encryption they provide should be sufficient for the average email user, but there are more secure options available. Both Gmail and Outlook offer upgraded encryption with premium accounts, and ProtonMail is a securely encrypted email service that anyone can use.

Many messaging apps also protect users with data encryption. Signal and Wickr are two popular options providing end-to-end encryption: the data is encrypted all the way from the sender to the receiver.

If you’ve dabbled at all in cryptocurrencies such as Bitcoin (BTC) or Ethereum (ETH), you’ve also enjoyed the protections of data encryption — though if you’re savvy enough to be using these, you probably already knew that. Cryptocurrencies protect their users by encrypting transactions and storing them in a shared historical record known as the “blockchain.” Once a transaction joins the blockchain, it can’t be reversed or forged.

VPNs are a popular solution for data encryption — you can even download a VPN on your mobile phone for encryption on the go. If you’re on an unsecured public Wi-Fi network, a VPN is an ideal solution for keeping your data safe. We’ll explore VPNs in more detail later in this piece, but for now, think of them as on-demand data encryption that’s both convenient and secure.

Data Security Essentials

<iframe width="560" height="315" src="https://www.youtube.com/embed/9WckTTqpD_M" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

How does data encryption work?

Data encryption revolves around two essential elements: the algorithm and the key.

The algorithm is the set of rules that determine how the encryption works. The Caesar cipher algorithm we used earlier in this article substitutes each letter with another letter that sits a fixed distance away from it in the alphabet.

The key determines the encryption implementation. Keys are randomly generated and combined with the algorithm to encrypt and decrypt data. In our Caesar cipher, we used a key of +1. A is replaced by B, B is replaced by C, and so on. In data encryption, keys are defined by their length in bits.

The algorithm and the keys it generates both contribute to the overall security of the encryption method. Key length is one factor in encryption security, but it’s not an exclusive determinant — the mathematical systems behind the algorithm also influence encryption security as well. Some algorithms with shorter keys may have equivalent or greater security when compared to other algorithms with longer keys.

Privacy Essentials for Security Professionals

<iframe width="560" height="315" src="https://www.youtube.com/embed/OJRM4i_VaE0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Cryptographic keys

Modern cryptography algorithms generate new data encryption keys for each use, so that two users of the same algorithm can’t decrypt each other’s communications. Symmetric-key algorithms use the same key for encrypting and decrypting, while public-key algorithms (also known as asymmetric-key algorithms) have separate keys for each process:

In a symmetric-key algorithm, the encrypting and decrypting parties all share the same key. Everyone who needs to receive the encrypted data will have the same key as everyone else. It’s a simpler system but with greater risk, as it takes just one leak to expose the data being transmitted by all involved parties.

Symmetric encryption uses either stream ciphers or block ciphers to encrypt plaintext data.


Stream ciphers encrypt data one byte at a time: each byte of plaintext is combined with a corresponding byte of a keystream, so every byte is effectively encrypted with its own key. Compared to block ciphers, though, reversal is relatively easy.

Block ciphers encrypt data in blocks of 64 bits (8 bytes) or larger. Reversing block cipher encryption is much harder than with stream cipher encryption.

Our Caesar cipher example is a symmetric-key algorithm, since you can encrypt and decrypt a message using the same key: the number of letters in the shift from plaintext to ciphertext and back.


A public-key algorithm is more secure than its symmetric-key counterpart. The public key is widely available for anyone to use in sending communications, but there’s a second key — the private key — that’s needed to decrypt the message. The algorithm creates both keys at once, and only these two exact keys can work together.

So how does data encryption protect data? Without the decryption key, you can’t unscramble the data — unless you’re willing to invest a lot of time and effort into other means of breaking the encryption. We’ll dive into what those measures look like towards the end of this piece.

What about hashing?

Hashing is a process that uses an algorithm to convert plaintext into numerical values. Any website worth using will hash user credentials to protect them in the event of a data breach. If you encounter a website that still stores passwords as plaintext, run away and never look back.
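
As a rough illustration of the idea, the sketch below hashes a credential with a random per-user salt using SQL Server's built-in HASHBYTES and CRYPT_GEN_RANDOM functions. The password value and column aliases are invented for the example, and a production system should prefer a dedicated, deliberately slow password-hashing scheme (bcrypt, PBKDF2, Argon2) over a single SHA-256 pass.

-- Hypothetical example: hash a password together with a random 16-byte salt
DECLARE @password NVARCHAR(128) = N'CorrectHorseBatteryStaple';
DECLARE @salt     VARBINARY(16) = CRYPT_GEN_RANDOM(16);

SELECT @salt AS salt,
       HASHBYTES('SHA2_256', CAST(@password AS VARBINARY(256)) + @salt) AS password_hash;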

Common encryption algorithms

There’s not just one data encryption algorithm out there. Here, we look at several of the most common encryption algorithms and quickly break down how they work.

Advanced Encryption Standard (AES)

AES is a secure symmetric algorithm that’s easy to use, making it ideal for situations in which secrecy is important. Users can set the key length to 128, 192, or 256 bits, and AES uses a fixed block length of 128 bits for block cipher encryption.

Rivest–Shamir–Adleman (RSA)

Named for its three creators, RSA is one of the earliest public-key algorithms and still sees widespread use. RSA uses large prime numbers to create its keys, and compared to other systems, it’s rather slow. For this reason, RSA is most often used to share a symmetric key, which is used in turn to encrypt the actual data that needs protecting.

Triple DES

Triple DES (or TDES/3DES) is a symmetrical block-cipher algorithm that encrypts each block three times over using a 56-bit data encryption standard (DES) key. But what is the data encryption standard in the first place?

DES is a pioneering encryption algorithm developed in the 1970s that was used as the US federal standard until being replaced in 2002 by AES. At the time, DES was strong enough to defend against contemporary threats. Even with its three layers of encryption, TDES is no longer considered reliably secure by modern standards.

Perfect forward secrecy (PFS)

PFS isn’t an algorithm, but a property that an encryption protocol can have. An encryption protocol is a system that defines how, when, and where an algorithm should be used in order to achieve encryption. When a protocol has PFS, it means that if the private key in a public-key algorithm becomes compromised, prior instances of encryption will still be protected. This is because PFS protocols create new keys for every encryption session.

The 5 Ws of Database Encryption

<iframe width="560" height="315" src="https://www.youtube.com/embed/H0fNABKw2hI" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Because of the way PFS protects prior sessions from future attacks, it is a critical feature for the security of any encryption system. You’ll also see PFS referred to simply as “forward secrecy” or FS.

Data at rest vs. data in transit

The majority of the encryption conversation focuses on data in motion encryption, or how to protect data in transit — in other words, data that’s on its way from one place to another. When you encrypt your web traffic with a VPN, that’s data in transit encryption in action.

But not all data is constantly in motion. Data that’s stored in one place is called “data at rest.” There’s plenty of data on your computer that isn’t going anywhere, but may be even more sensitive than anything you’d be communicating to other parties.

It’s just as important to practice data at rest encryption as well, in case your device gets hacked or stolen. You can easily protect your local data by encrypting or password-protecting files and folders on your computer or external storage device.

We’ll show you some encryption best practices for data at rest in the following sections, “How to encrypt your PC” and “Mobile data encryption.”

Transparent data encryption (TDE)

Transparent data encryption (TDE), which Microsoft introduced in SQL Server 2008, protects databases by encrypting the data files on the server as well as any backups. Microsoft, IBM, and Oracle all offer TDE to provide enterprises with database-level encryption.

The encrypted files are automatically decrypted by any authorized applications or users when accessing the database. This is why it’s “transparent” — if you’re already allowed to access the data, you don’t need to do anything extra to see it. Think of TDE like an employee ID badge that grants entrance to a secure facility. If you have a badge, you can waltz right on in.

As an additional security measure, TDE stores the encryption keys separately from the encrypted data files. This way, if the physical storage media or files are stolen, they’ll still be protected against unauthorized access. You can’t open the data files without the correct key.

Monitoring Beyond SQL Server Oracle DB, DB2, Informix, and MySQL

<iframe width="560" height="315" src="https://www.youtube.com/embed/qUnCAmtujyg" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

What Is Transparent Data Encryption?

Transparent Data Encryption (TDE) enables you to encrypt sensitive data that you store in tables and tablespaces.

After the data is encrypted, this data is transparently decrypted for authorized users or applications when they access this data. TDE helps protect data stored on media (also called data at rest) in the event that the storage media or data file is stolen.

Oracle Database uses authentication, authorization, and auditing mechanisms to secure data in the database, but not in the operating system data files where data is stored. To protect these data files, Oracle Database provides Transparent Data Encryption (TDE). TDE encrypts sensitive data stored in data files. To prevent unauthorized decryption, TDE stores the encryption keys in a security module external to the database, called a keystore.

You can configure Oracle Key Vault as part of the TDE implementation. This enables you to centrally manage TDE keystores (called TDE wallets in Oracle Key Vault) in your enterprise. For example, you can upload a software keystore to Oracle Key Vault and then make the contents of this keystore available to other TDE-enabled databases. See Oracle Key Vault Administrator's Guide for more information.
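
As a rough sketch of what a software-keystore TDE setup can look like in Oracle SQL (this is the 12c-style ADMINISTER KEY MANAGEMENT syntax; the keystore path, password, and tablespace name are placeholders, and the exact steps differ if you use Oracle Key Vault or a hardware keystore):

-- Create and open a software keystore, then set the TDE master encryption key
ADMINISTER KEY MANAGEMENT CREATE KEYSTORE '/etc/ORACLE/WALLETS/orcl' IDENTIFIED BY keystore_pwd;
ADMINISTER KEY MANAGEMENT SET KEYSTORE OPEN IDENTIFIED BY keystore_pwd;
ADMINISTER KEY MANAGEMENT SET KEY IDENTIFIED BY keystore_pwd WITH BACKUP;

-- Create a tablespace whose data files are encrypted with AES256
-- (assumes Oracle Managed Files; otherwise specify a data file path)
CREATE TABLESPACE secure_data
  DATAFILE SIZE 100M
  ENCRYPTION USING 'AES256'
  DEFAULT STORAGE (ENCRYPT);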

Enhance Security with OCI Web Application Firewall (WAF)

<iframe src="//www.slideshare.net/slideshow/embed_code/key/JhLrXFIbqyPbXA" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="//www.slideshare.net/oracle_imc_team/partner-webcast-enhance-security-with-oci-web-application-firewall-waf" title="Partner Webcast – Enhance Security with OCI Web Application Firewall (WAF)" target="_blank">Partner Webcast – Enhance Security with OCI Web Application Firewall (WAF)</a> </strong> from <strong><a href="https://www.slideshare.net/oracle_imc_team" target="_blank">OracleIMC (Innovation &amp; Modernization Center)</a></strong> </div>

Benefits of Using Transparent Data Encryption

Transparent Data Encryption (TDE) ensures that sensitive data is encrypted, meets compliance, and provides functionality that streamlines encryption operations.

Benefits are as follows:

As a security administrator, you can be sure that sensitive data is encrypted and therefore safe in the event that the storage media or data file is stolen.

Using TDE helps you address security-related regulatory compliance issues.

You do not need to create auxiliary tables, triggers, or views to decrypt data for the authorized user or application. Data from tables is transparently decrypted for the database user and application. An application that processes sensitive data can use TDE to provide strong data encryption with little or no change to the application.

Data is transparently decrypted for database users and applications that access this data. Database users and applications do not need to be aware that the data they are accessing is stored in encrypted form.

You can encrypt data with zero downtime on production systems by using online table redefinition or you can encrypt it offline during maintenance periods. (See Oracle Database Administrator’s Guide for more information about online table redefinition.)

You do not need to modify your applications to handle the encrypted data. The database manages the data encryption and decryption.

Oracle Database automates TDE master encryption key and keystore management operations. The user or application does not need to manage TDE master encryption keys.

Monitoring your cloud-based Infrastructure with Oracle Management Cloud

<iframe width="560" height="315" src="https://www.youtube.com/embed/RFGxrlW1b1g" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Pros and Cons of Transparent Data Encryption (TDE) 

Introduction

Transparent Data Encryption (TDE) encrypts all the data that’s stored within the database’s physical files and also any backup files created from the database. With data security becoming more and more important there’s no doubt that encryption of data using technologies such as TDE will become increasingly relevant. However as always there’s a price to be paid for implementing TDE and this article discusses some of the pros and cons.

Transparent Data Encryption

First of all it’s important to understand the scope of TDE, as it’s not a complete end to end encryption solution. TDE will encrypt the data files and transaction log files (.mdf, .ndf and .ldf files) and the backup files (.bak files). This means that so called “data at rest” is encrypted, however traffic between the database and application is not encrypted (at least not by TDE, but you can use SSL to achieve this), and data held within the application is also not encrypted. TDE is implemented at the database level and is an all or nothing solution – so all data within the database will be encrypted – you can’t just encrypt the sensitive columns.

Another point to watch is that even if only one database on a server has TDE enabled then TempDB will be encrypted, so the performance of other non-encrypted databases on the same server may be affected. However although there’s inevitably a performance impact when using TDE on a database, Microsoft claims this is only 2 – 4% compared to a non-encrypted database.

I thought it would be useful to summarise some of the pros and cons of TDE :

Advantages of TDE

  • Fairly simple to implement.
  • No changes to the application tier required.
  • Is invisible to the user.
  • Works with high availability features, such as mirroring, AlwaysOn and log shipping.
  • Works with older versions of SQL Server, back to 2008.

Disadvantages of TDE

  • Only encrypts data at rest, so data in motion or held within an application is not encrypted.
  • All data in the database is encrypted – not just the sensitive data.
  • Requires the more expensive Enterprise Edition (or Developer or DataCenter Edition) of SQL Server (a quick edition check is shown after this list).
  • The amount of compression achieved with compressed backups will be significantly reduced.
  • There is a small performance impact.
  • FileStream data is not encrypted.
  • Some DBA tasks require extra complexity, for instance restoring a backup onto another server.
  • As TempDB is encrypted, there is potentially an impact on non-encrypted databases on the same server.
  • The master database, which contains various metadata, user data, and server-level information, is not encrypted.
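
If you're unsure whether your instance meets the edition requirement, a quick check such as the one below uses the standard SERVERPROPERTY function; an EngineEdition value of 3 corresponds to the Enterprise, Developer, and Evaluation family. Note that later releases (SQL Server 2019 onwards) relaxed the Enterprise-only restriction for TDE, so check the documentation for your version.

SELECT SERVERPROPERTY('Edition')       AS edition,
       SERVERPROPERTY('EngineEdition') AS engine_edition;  -- 3 = Enterprise, Developer, or Evaluation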

Hopefully the above summary is useful, if you have any other pros and cons then please let me know and I will add them to the list.

For completeness TDE isn’t the only database encryption technique available within SQL Server, some of the others are:

The business logic within individual stored procedures can be encrypted using the ‘ENCRYPTION’ keyword.
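
For instance, a procedure's definition can be obfuscated as follows (the procedure and table names are invented for the example); note that WITH ENCRYPTION only hides the module text from casual inspection and is not strong cryptographic protection.

CREATE PROCEDURE dbo.usp_GetCustomer
    @CustomerId INT
WITH ENCRYPTION
AS
BEGIN
    -- The procedure body is stored obfuscated in the catalog
    SELECT CustomerId, Name
    FROM   dbo.Customers
    WHERE  CustomerId = @CustomerId;
END;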

Individual data items (i.e. column or cell-level encryption) can be encrypted and decrypted using the ‘ENCRYPTBYPASSPHRASE’ and ‘DECRYPTBYPASSPHRASE’ statement along with a pass phrase. ENCRYPTBYKEY/DECRYPTBYKEY and ENCRYPTBYCERT/DECRYPTBYCERT are similar but use a key or certificate to encrypt the data.
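
A minimal sketch of pass-phrase based cell-level encryption might look like this (the pass phrase and value are placeholders; ENCRYPTBYPASSPHRASE returns VARBINARY, so the decrypted result has to be converted back to its original type):

DECLARE @secret     NVARCHAR(50)    = N'4111-1111-1111-1111';
DECLARE @ciphertext VARBINARY(8000) = ENCRYPTBYPASSPHRASE('a long pass phrase', @secret);

SELECT @ciphertext AS ciphertext,
       CONVERT(NVARCHAR(50), DECRYPTBYPASSPHRASE('a long pass phrase', @ciphertext)) AS plaintext;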

Oracle Cloud Native Ecosystem for Containers

<iframe src="//www.slideshare.net/slideshow/embed_code/key/LmGqilWBS8FXA4" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="//www.slideshare.net/oracle_imc_team/partner-webcast-oracle-cloud-native-ecosystem-for-containers" title="Partner Webcast – Oracle Cloud Native Ecosystem for Containers" target="_blank">Partner Webcast – Oracle Cloud Native Ecosystem for Containers</a> </strong> from <strong><a href="https://www.slideshare.net/oracle_imc_team" target="_blank">OracleIMC (Innovation &amp; Modernization Center)</a></strong> </div>

Setting up Transparent Data Encryption (TDE) 

Transparent Data Encryption (TDE) encrypts all the data that’s stored within the database’s physical files and also any backup files created from the database. Encrypting a database with TDE is a very straightforward process, involving 3 simple steps. This article shows how to do this.

ENCRYPTING A DATABASE WITH TDE IN 3 STEPS

STEP 1 : Set up an instance level master key and certificate

The first step is to set up a master key by running the following SQL, but make sure you change the password to something more secure. If you already have a master key set up on the instance then you don’t need to run this (you may well already have one as it’s used by other encryption functions).

USE Master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MyPassword';

Then we need to create a certificate at the instance level by running the following SQL :

USE Master;
CREATE CERTIFICATE MyInstance_ServerCert WITH SUBJECT = 'Certificate for MyInstance';

STEP 2 : Set up a database encryption key within the database to be encrypted

To set up a database encryption key run the following SQL in the database to be encrypted (in this example the database is called TDETest). Note that this uses the server certificate created in the previous step (I called it MyInstance_ServerCert) :

USE TDETest;
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_128
ENCRYPTION BY SERVER CERTIFICATE MyInstance_ServerCert;

STEP 3 : Enable TDE for the database

To switch on encryption for a database run the following SQL :

ALTER DATABASE TDETest SET ENCRYPTION ON;

That’s it ! The database files should now be encrypted.

MORE INFORMATION

I’ve briefly described how to set up TDE, but will now discuss how to verify TDE, check on encryption progress and backup certificates and other matters relating to TDE.

Verifying whether a Database is Encrypted

You can check whether a database is encrypted by looking at the database properties (right-click the database in SQL Server Management Studio, select Properties, then open the Options page) and checking the “Encryption Enabled” setting.

Alternatively a list of all the encrypted databases on an instance can be obtained by running the following query :

SELECT    D.name, DEK.*
FROM      sys.databases D
JOIN      sys.dm_database_encryption_keys DEK
ON        DEK.database_id = D.database_id
ORDER BY  D.name;

On my server I have just the one encrypted database, so the query returns rows for that database and for tempdb (as there is at least one encrypted database, tempdb is encrypted too).

Note the encryption_state column. The value of 3 indicates that the database is encrypted, a value of 2 indicates that encryption is in progress (a newly encrypted database is encrypted in the background, which may take a while for larger databases).
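
If you want to watch the background encryption scan, the same DMV also exposes a percent_complete column; a query along these lines shows each database's state and progress:

SELECT DB_NAME(database_id) AS database_name,
       encryption_state,            -- 2 = encryption in progress, 3 = encrypted
       percent_complete             -- returns to 0 once the scan has finished
FROM   sys.dm_database_encryption_keys;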

Backing Up Certificates

It’s important to back up the certificate and keep it safe, as it will be needed if you want to restore the database onto another server. To back up the certificate, use the 'BACKUP CERTIFICATE' command; for instance, I ran:

USE Master;
BACKUP CERTIFICATE MyInstance_ServerCert
TO FILE = 'D:\SQL2012_SQLCert.cer'
WITH PRIVATE KEY (FILE = 'D:\SQL2012_MasterKey.pvk',
                  ENCRYPTION BY PASSWORD = 'D8LsiBL62mLPhi0z7CoM');


Removing Encryption from a Database

If encryption is no longer required then it can be removed along with the database encryption key by running the following commands (obviously you’d need to change the database name) :

-- Remove encryption
USE TDETest;
ALTER DATABASE TDETest SET ENCRYPTION OFF;
GO

-- Remove the database encryption key
DROP DATABASE ENCRYPTION KEY;

You may need to leave a gap between running these statements as TDE is removed asynchronously in the background, and needs to be removed before the DEK can be dropped. Also note that tempdb will remain encrypted even if encryption is removed from all databases on the instance. However if the instance is restarted then tempdb will be recreated without encryption.
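
One way to confirm that decryption has finished before dropping the key is to poll the same DMV until the database reports encryption_state = 1 (unencrypted), for example:

SELECT encryption_state              -- 5 = decryption in progress, 1 = unencrypted
FROM   sys.dm_database_encryption_keys
WHERE  database_id = DB_ID('TDETest');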

Restoring Encrypted Backups to another Server

If you are restoring a database backup to the same instance (either overwriting the same database or as a new database), this can be done in the same way as for a non-encrypted database, and the restored database will be encrypted using the same encryption key. However, if you try to restore the backup to another instance, the chances are that you’ll get a ‘Cannot find server certificate with thumbprint’ error. This is because the new server needs the certificate from the original server. To create the certificate that we backed up in the ‘Backing Up Certificates’ section above, we can just run the statement below on the new server:

CREATE CERTIFICATE MyServer_SQL2012_SQLCert
FROM FILE = 'D:\SQL2012_SQLCert.cer'
WITH PRIVATE KEY (FILE = 'D:\SQL2012_MasterKey.pvk',
                  DECRYPTION BY PASSWORD = 'D8LsiBL62mLPhi0z7CoM');

Note that the new server will need to have a master key set up in order to encrypt the certificate; however, the password does not need to be the same as on the original server. If you don’t have a master key set up, you can create one using the following statement (but it’s best to use a different password):

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MyPassword';

In summary if you are restoring the encrypted database to another instance you will need the certificate and its private key as well as the backup file.

Checking What Certificates Exist

You can check which certificates exist using the following query :

SELECT * FROM sys.certificates;
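
A slightly more targeted variation, using the same catalog views, lists each certificate with its expiry date and shows which certificate protects which database encryption key via the thumbprint (run this in master, since that is where the certificates protecting DEKs live):

SELECT name, subject, start_date, expiry_date
FROM   sys.certificates;

SELECT DB_NAME(dek.database_id) AS database_name,
       c.name                   AS certificate_name
FROM   sys.dm_database_encryption_keys dek
JOIN   sys.certificates c
  ON   c.thumbprint = dek.encryptor_thumbprint;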

Best Practices for Transparent Data Encryption (TDE) 

Transparent Data Encryption (TDE) encrypts all the data that’s stored within the database’s physical files and also any backup files created from the database. With data security becoming more and more important there’s no doubt that encryption of data using technologies such as TDE will become increasingly relevant.

In previous articles I discussed some of the advantages and disadvantages of using Transparent Data Encryption as part of a security solution as well as specific details of how to encrypt a database with TDE.

To finish the series this article discusses some best practices and recommendations for implementing TDE.

Untangling SaaS Security in the Enterprise

<iframe width="560" height="315" src="https://www.youtube.com/embed/l04oZJiJ_jM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Recommendations and Best Practice

If your database doesn’t need encryption, don’t implement TDE on it: there is a small performance impact when querying an encrypted database, so don’t encrypt needlessly.

Backups – always backup your databases before encrypting them, just in case.

Storage of encryption keys – make sure these are stored safely, as these will be needed to remove encryption. If disaster occurs and you need to restore the database to another server from a backup file then the backup will be useless without the certificate and private key.

Extended backup duration – encrypted backups don’t compress well, so expect backups to be larger, and take longer to run.

TDE isn’t an end to end encryption solution - don’t expect data to be encrypted in transit, or within the application even if you have TDE enabled. TDE encrypts the data (e.g. .mdf and .ldf files) and backup files (e.g. .bak), nothing more.

Implement other data access controls - TDE complements, but does not replace, other methods of securing data, so access control (via permissions), password encryption and securing network traffic are still important.

Ultimate Encryption with TDE and Break Glass


Picture the Break Glass architecture as a set of pillars. The left pillar is your environment, which you access through the application user interface (App UX). The center pillar is the Break Glass setup and process that Oracle personnel are required to follow in order to gain access to your environment while both Transparent Data Encryption and Database Vault are securing the database. When Break Glass is set up, all credentials in the system are reset and stored in the Credential Store Framework (CSF) and in an escrow system called the Oracle Privileged Access Manager (OPAM), where Oracle stores a subset (12) of the accounts. The remaining accounts are stored in CSF, which is used for programmatic access through scripts.

Anytime Oracle personnel need access to the system, they're granted access to the escrow account only once they have gone through the approval process. The Oracle Enterprise Manager on top acts as the access controller and removes access after the duration expires, including resetting passwords, terminating sessions, and storing new passwords in OPAM and CSF.

Data Protection 

<iframe width="560" height="315" src="https://www.youtube.com/embed/uTBjh3r3mvQ" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

The Oracle Privileged Account Manager (OPAM) is a password management solution designed to generate, provision, and manage access to passwords for privileged accounts like Linux/Unix “root” or Oracle database “sys” accounts. It enables auditing and establishes accountability for users who normally share privileged account credentials and additional user Session Management and Recording. OPAM's integration with the Oracle Identity Governance platform provides central governance for regular users and privileged users, auditing, reporting and certification of user's regular accounts and shared accounts, and lifecycle management from request and approval to certification and usage tracking.

Meanwhile, Oracle Platform Security Services includes the Credential Store Framework (CSF), which is a set of APIs that applications can use to create, read, update, and manage credentials. A credential store is a repository of security data. Credentials can hold user name and password combinations, tickets, or public key certificates, and are used during authentication and during authorization when determining what actions the person can perform.

Understanding Transparent Data Encryption

TDE is a database feature that encrypts individual table columns or a tablespace. When a user inserts data into an encrypted column, the database automatically encrypts the data. And when users select the column, the data is decrypted. This form of encryption is transparent, provides high performance, and is easy to implement.
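
For column-level TDE the Oracle syntax is roughly as follows (the table and column names are invented, the TDE master key must already exist in the keystore, and options such as the algorithm and SALT/NO SALT vary by version):

-- New table with an encrypted column
CREATE TABLE customers (
  customer_id NUMBER,
  card_number VARCHAR2(19) ENCRYPT USING 'AES256'
);

-- Encrypt a column in an existing table
ALTER TABLE customers MODIFY (card_number ENCRYPT USING 'AES256');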


TDE is important because it is kind of like adding a moat around a castle for an additional layer of protection. TDE protects against threats to Oracle Fusion Applications data by encrypting Oracle Fusion Applications data when it is saved to the disk. DBF files and database backups are encrypted, and they cannot be read even in the unlikely case that they are accessed, copied, or stolen on removable media.

Staying Safe with Tablespace Encryption

Oracle Fusion application cloud instances use tablespace-level encryption on all tablespaces that contain Fusion Application business data. Like securing a home to prevent an intruder from entering through a window, encrypted tablespaces primarily protect data from unauthorized access by means other than through the database. Additionally, encrypted tablespaces protect data from users who try to circumvent the security features of the database by accessing database files directly through the operating system file system. To maximize security, data from an encrypted tablespace remains encrypted when written to the undo tablespace, to the redo logs, or to any temporary tablespace.

Maximizing Security by Managing Encryption Keys

To keep your credit cards and money safe, the easiest thing to do is store them in a wallet, right? The situation is pretty similar with encryption keys, which are coincidentally also kept in a wallet for safekeeping. To prevent unauthorized decryption, TDE stores encryption keys in the Oracle Wallet, which is a security module external to the database. TDE uses a two-tier key architecture for flexible and non-intrusive key rotation and least operational and performance impact. Each encrypted tablespace has its own tablespace key. Tablespace keys are stored in the header of the tablespace and in the header of each underlying OS file that makes up the tablespace. Each of these keys is encrypted with the TDE master encryption key, which is stored outside of the database in an Oracle Wallet (a PKCS#12 formatted file that is encrypted using a passphrase supplied by the designated security administrator during setup).

Tablespace keys are AES (with 128-bit key length), while the TDE master key is always an AES256 key. Each tablespace within the database has its own encryption key, and those keys are then encrypted within the Oracle Wallet by the master encryption key. But the security doesn't stop there – Oracle Wallet is additionally secured by a password.
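
You can see this arrangement from the data dictionary; assuming the standard dynamic performance views, a quick check of the wallet status and of which tablespaces are encrypted looks like this:

-- Wallet (keystore) location and status
SELECT wrl_type, wrl_parameter, status FROM v$encryption_wallet;

-- Tablespaces currently protected by TDE
SELECT * FROM v$encrypted_tablespaces;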

Oracle Break Glass: So Secure It's Shatterproof

Hopefully you're ready to learn the ins and outs of Break Glass, because we have a lot of important information to toss your way. Oracle Break Glass is an advanced security product that restricts and manages Oracle employees' access to the customer's cloud environment and data through non-application interfaces and restricts administrative access to systems and services. This means access isn't easy around here – instead, it definitely involves jumping through a few hoops. Any request by Oracle personnel for privileged access to the customer environment (for support, maintenance, or operational purposes) will need to go through an internal Oracle approval workflow and up to three levels of approval from the customer's organization before any access is granted.

Running Oracle Weblogic Server in Kubernetes using WebLogic Operator

<iframe src="//www.slideshare.net/slideshow/embed_code/key/VEaPBpCZPrfr" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="//www.slideshare.net/oracle_imc_team/partner-webcast-running-oracle-weblogic-server-in-kubernetes-using-weblogic-operator" title="Partner Webcast – Running Oracle Weblogic Server in Kubernetes using WebLogic Operator" target="_blank">Partner Webcast – Running Oracle Weblogic Server in Kubernetes using WebLogic Operator</a> </strong> from <strong><a href="https://www.slideshare.net/oracle_imc_team" target="_blank">OracleIMC (Innovation &amp; Modernization Center)</a></strong> </div>

Seeing Clear through the Glass: Customer Benefits

Customers benefit from Oracle Break Glass for a few different reasons. For one, Break Glass supports stringent access control requirements. These:

  • Enable customers to have custom restrictions on Oracle personnel accessing their Cloud Environment
  • Allow for active involvement of customer organization in approvals
  • Allow access control to be restricted via pre-defined windows and approvals prior to use

Another benefit is the improved visibility into Access Control. Here, Break Glass:

  • Enables the customer to view audit reports for when administrative access was leveraged, which is recommended for customers in highly regulated industries
  • Augments the rigorous defense in-depth security posture of Oracle Public Cloud
  • Helps customers meet compliance laws and regulations

Implementing Break Glass: Know Your Roles and Responsibilities

In this section we'll be talking about roles and responsibilities required for a successful Oracle Break Glass implementation. There will be a set of roles and responsibilities applicable to Oracle and another set applicable to the customer.

When privileged access to the customer's environment is required to perform maintenance, upgrades, support, or a service request response, the Oracle personnel will need to follow the Oracle Break Glass workflow to gain access. 

Encrypting data in Kubernetes deployments. Protect your data, not just your Secrets

<iframe width="560" height="315" src="https://www.youtube.com/embed/gEv9JACCCXA" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Let's Talk Access

Privileged access requests submitted through the Break Glass Security workflow require two levels of approval within Oracle before your approval is required. If the customer has preapproved certain types of access entitlements, Oracle will notify the customer about access through the Oracle Break Glass workflow. Otherwise, customers are notified to gain their approval before access is fully granted.

With an Automatic Revocation of Access, Oracle automatically revokes privileged access and resets the password after the end of the access duration.

With an Access history audit, Oracle audits all approved requests. Customers can request a copy of the access history audit report by raising a Service Request.

Customers Have Responsibilities Too

Now is a good time to break down the customer's responsibilities for a successful Break Glass implementation.

You'll configure Oracle Break Glass to define the access duration, the approval levels within your organization (up to three), and the preapproved access entitlements.

You must provide timely approval to each Break Glass access request from Oracle in order for Oracle operations and support personnel to conduct their activities.

As a best practice, at least once each year you should validate the accuracy of the preapproved access list and update it to reflect the most current approval hierarchy. An easy way to do this is by submitting a service request through My Oracle Support.

Be sure to whitelist the email address “OIMCORP-NOTIFICATIONS_WW@oracle.com” on your inbound and outbound email servers so that you can receive and respond to Break Glass request notifications from Oracle.

Here's How to Manage Break Glass Approvals

You can enable up to three levels of approvers for each of the three entitlement types, and each approval level can have multiple approvers. Approvers will be contacted by email with approval requests. Approval by any of the listed approvers in a level will advance the request to the next approval level, while rejection of the request by any of the listed approvers will stop the approval process and access will not be provided. 

More Information:

https://learn.oracle.com/ords/launchpad/learn?page=what-is-tde-and-bg&context=0:42508:42481

https://blogs.oracle.com/oraclemagazine/transparent-data-encryption-v2

https://blogs.oracle.com/oraclemagazine/encrypt-your-data-assets

https://learn.oracle.com/ords/launchpad/learn?page=transparent-data-encryption-break-glass&context=0:42508

https://study.com/academy/lesson/what-is-transparent-data-encryption-tde.html

https://oraexpertacademy.wordpress.com/2018/03/07/transparent-data-encryption-in-oracle-database-complete-reference/

https://www.slideshare.net/oracle_imc_team

https://www.golinuxcloud.com/

https://www.technicalmint.com/linux/linux-boot-process-in-rhel7-rhel6/

https://blog.malwarebytes.com/threat-analysis/2018/03/encryption-101-how-to-break-encryption/

https://www.avg.com/en/signal/data-encryption

https://docs.oracle.com/database/121/ASOAG/introduction-to-transparent-data-encryption.htm#ASOAG10117

https://oracle-base.com/articles/10g/transparent-data-encryption-10gr2

https://www.sqlmatters.com/Articles/Pros%20and%20Cons%20of%20Transparent%20Data%20Encryption%20(TDE).aspx



Get the cloud you want






The new IBM Z single-frame and multi-frame systems can transform your application and data portfolio with innovative cloud native development, data privacy, security, and cyber resiliency – all delivered in a hybrid cloud environment.


SUSE, IBM Z and LinuxONE: Celebrating the First 20 Years

In 1999, as the digitized world approached the year 2000, all thoughts were on “Y2K,” and what it would mean for the millions of lines of code that ran the computers for business and government. Would switching the two-digit year from “99” to “00” cause massive systems failures and chaos? Would our ATMs stop working, and airplanes fall out of the sky? In the midst of this perceived crisis, hardly anybody was thinking about the future impact of porting Linux to the mainframe.

The new IBM z15 Part 1 - 4/15/20


Skip ahead 20 years, and many technological changes have fundamentally altered the ways we live, work and see the world. Camera phones have given everyone the power to document our times. The internet has gone from a “like to have” to an essential component of consumer-facing commerce, back-end operations, education, social services and even medicine. Far from being the “end of the world” for digital devices, the year 2000 marked the beginning of an unprecedented boom in innovation and collaboration – perhaps none more profound in the world of business IT than combining the power of the IBM mainframe with the possibilities of Linux.

The new IBM z15 Part 2 - 4/23/20

The concept of open source software was still in its infancy when Linux began making significant inroads into commercial data centers – assuming increasingly critical roles in business operations. Opportunities for innovation were wide open as SUSE introduced commercial Linux to the IBM s/390 mainframe – now IBM Z – in the fall of 2000. Since then, SUSE and IBM have continued to watch Linux grow and gain acceptance throughout the business world.

Today, more businesses choose SUSE Linux Enterprise Server (SLES) for IBM Z and LinuxONE than any other Linux for running workloads on IBM mainframes. The reasons are self-evident. SLES is optimized for IBM mainframes, and businesses want to focus on serving their customers – not managing their IT. And the proven engineering excellence of the IBM mainframe, combined with the agility and business value of Linux, enable businesses to accelerate innovation to keep pace with constantly changing market dynamics.






Despite the tendency of some to dismiss mature technologies, neither Linux nor the mainframe could in any way be considered “outmoded” or “antique.” The issue isn’t age. It’s quality, reliability, security and an ongoing ability to innovate and adapt to change. As countless commoditization cycles within the IT industry have written lesser technologies into the history books, Linux on the mainframe is enabling businesses to write the next chapter in their story of digital transformation. Indeed, Gartner’s 2019 assessment that “open source is becoming the backbone for driving digital innovation” speaks both to the innovative capabilities of Linux and to the continued reliability and security of the IBM mainframe.

Introducing the new IBM z15 T02  

Many of the world’s exciting innovations in business IT are being developed and deployed via Linux on the mainframe. Traditional industries such as finance and retail are finding new ways to remain competitive using hybrid cloud and AI – running on Linux and the mainframe – to make the most of their data in service to their customers. While newer types of workloads to manage digital currencies, global scientific research and emerging industries continue to demand the highly available, reliable and secure computing provided by Linux on the mainframe.

The distinguishing features of Linux and the mainframe will enable both to play essential roles in mission-critical business operations for many years to come. We’ve come a long way in our first 20 years, and have much to be proud of. But envisioning the possibilities of the next 20 years makes us even more excited about the future of Linux on IBM Z.

LinuxONE: Determining Your Own IT Strategy

Dogs are supposed to wag their tails, not the other way around. Yet, too many enterprises have found themselves in situations where their IT infrastructure dictates the way they run their business.

Introducing the new IBM z15 T02


Lest anyone forget, the purpose of IT is to enable and support business outcomes, not to determine or limit them. But how do enterprises get control of their IT when it’s already up and running, and comprises piece parts from a variety of vendors? How do enterprises migrate their mission-critical operations to a hybrid cloud environment so they can move quickly and manage growth?

With the introduction of LinuxONE five years ago, IBM answered those questions with secure, reliable and scalable architecture, complementing the capabilities of the underlying architecture in unique ways. Massive power, extreme virtualization, and open and disruptive technologies make the combination greater than the sum of its parts.

Unlike other Linux platforms, LinuxONE lets users scale up without disruption. Having this “cloud in a box” capability means enterprises can add database layers and new applications to their IT infrastructure without taking everything offline. They can change their tires and even upgrade their horsepower while staying in the race—critical capabilities in any industry with constantly changing demands.

The key is being able to define what’s valuable to an organization, versus what an IT platform will let it do. With its values determined, the enterprise is free to establish its own cloud roadmap, manage its own cloud services consumption, and position itself for innovation and market disruption.

LinuxONE represents the culmination of years of innovation and integration in optimizing open source workloads on a trusted architecture. Add to that the capabilities of Red Hat OpenShift, and you have a hybrid cloud infrastructure that:

  • Optimizes the value of existing IT infrastructure
  • Hosts mission-critical applications while protecting sensitive data
  • Maintains security and scalability in the public cloud
  • Enables “write once/run anywhere” application portability
  • Installs and upgrades without disrupting ongoing business processes

Enterprises with traditional workloads can capitalize on the elegance of managing and scaling their cloud-native system from a single control point that enables previously unheard-of agility in their digital reinvention. Businesses with emerging workloads — such as enterprises in the Confidential Computing space — can count on the secure service containers and hardware security modules of LinuxONE to establish and build trust in their marketplace relationships. And all users can benefit from the systems’ containerization, encryption and virtualization that allow them to maintain control of their own security keys. In other words, the enterprise — not the IT infrastructure — is in charge.

As LinuxONE celebrates its fifth anniversary, it has emerged as a “lighthouse” platform of global collaboration to simplify IT management, even as tasks have become more complex. As a result, we stand on the verge of a period of dramatic change, in which AI running on hybrid cloud will enable breakthroughs in classical computing and its tremendous potential to improve countless aspects of our lives. For businesses and governments making this important journey, LinuxONE is an essential partner to progress.

IBM and Red Hat: Nearly two decades of Linux innovation across computing architectures

In the decades since its inception, Linux has become synonymous with collaboration, from both a technical and an organizational standpoint. This community work, from independent contributors, end users and IT vendors, has helped Linux adapt and embrace change, rather than fight it. A powerful example of this collaboration was the launch of Red Hat Enterprise Linux (RHEL) 2.1 in 2002, heralding the march of Linux across the enterprise world. Today, Red Hat Enterprise Linux is a bellwether for Linux in production systems, serving as the world’s leading enterprise Linux platform to power organizations across the world and across the open hybrid cloud.

All of this innovation and industry leadership wouldn’t have been possible without a strong partner ecosystem, including the close ties we’ve long had with IBM. IBM was one of the first major technology players to recognize the value in Linux, especially RHEL. As IBM Z and IBM LinuxONE celebrate 20 years of powering enterprise IT today, this benchmark provides further validation of the need for enterprise-grade Linux across architectures, especially as the requirements of modern businesses change dynamically.

One Linux platform, spanning the mainframe to the open hybrid cloud
For more than five years, Red Hat’s vision of IT’s future has rested in the hybrid cloud, where operations and services don’t fully reside in a corporate datacenter or in a public cloud environment. While the open hybrid cloud provides a myriad of benefits, from greater control over resources to extended flexibility and scalability, it also delivers choice: choice of architecture, choice of cloud provider and choice of workload.

RHEL encompasses a vast selection of certified hardware configurations and environments, including IBM Z and LinuxONE - this ecosystem recently expanded to include IBM z15 and LinuxONE III single frame systems. Working with IBM as a long-time partner, we’ve optimized RHEL across nearly all computing architectures, from mainframes and Power systems to x86 and Arm processors. It’s this ability to deliver choice that makes RHEL an ideal backbone for the hybrid cloud.

Linux is just the beginning
Linux is crucial to the success of the hybrid cloud, but it’s just the first step. RHEL lays the foundation for organizations to extend their operations into new environments, like public cloud, or new technologies, like Kubernetes. Choice remains key throughout this evolution, as innovation is worth nothing if it cannot answer the specific and evolving needs of individual enterprises.

RHEL is the starting point for Red Hat’s open innovation, including Red Hat OpenShift. Again, thanks to our close collaboration with IBM, the value of RHEL, OpenShift and Red Hat’s open hybrid cloud technologies encompasses IBM Z and LinuxONE systems. This makes it easier for organizations to use their existing investments in IBM’s powerful, scalable mainframe technologies while still taking advantage of cloud-native technologies.

Supporting IT choice and supporting IT’s future
The open hybrid cloud isn’t a set of technologies delivered in a box - rather, it’s an organizational strategy that brings the power and flexibility of new infrastructure and emerging technologies to wherever the best footprint is for a given enterprise’s needs. IBM Z and LinuxONE represent a powerful architecture for organizations to build out modern, forward-looking datacenter implementations, while RHEL provides the common plane to unite these advanced systems with the next wave of open source innovations, including Red Hat OpenShift.

Twenty years of open source software for IBM Z and LinuxONE

It’s been 20 years since IBM first released Linux on IBM Z, so I thought it appropriate to mark the occasion by exploring the history, the details, and the large ecosystem of open source software that’s now available for the IBM Z and LinuxONE platforms.

IBM has deep roots in the open source community. We have been backing emerging communities from a very early stage — including the Linux Foundation, the Apache Software Foundation, and the Eclipse Foundation. This includes years of contributions to the development of open source code, licenses, advocating for open governance, and open standards in addition to being an active contributor to many projects.

As open source continues to gain momentum in the software world, we see growth reflected across different hardware and processor architectures. The processor architecture for IBM Z and LinuxONE is known as s390x.

If you’re new to these two hardware platforms, they are commonly known as mainframes. IBM Z has had a tremendous evolution with world-class, enterprise-grade features for performance, security, reliability, and scale. The latest version, IBM z15, can co-locate different operating systems including Linux, z/OS, z/VSE, and z/TPF. The LinuxONE III model has the same features as IBM Z, but was designed exclusively for the Linux operating system, including most commercial and open source Linux distributions.

When we talk about commonalities, there’s one that is not very well known related to mainframes — open source software. Did you know that open source software (OSS) for mainframes existed as far back as 1955? SHARE, a volunteer-run user group, was founded in 1955 to share technical information related to mainframe software. They created an open SHARE library with available source code, and undertook distributed development. It was not called “open source” back then, but we can consider that one of the early origins of open source.

Open source software, Linux, and IBM

The popularity of open source software originated in large part as a result of years of cultural evolution through sharing libraries across all programming languages. Innovating and sharing software with reusable functionality has become a common practice led by open source communities and some of the largest organizations in the world. Another factor is that all of the latest technologies are being developed in the open — AI, machine learning, blockchain, virtual reality, and autonomous cars, just to name a few.




As mentioned earlier, open source is not new to mainframes — another example is Linux, which has been used for more than 20 years. In 1999, IBM published a collection of patches and additions to the Linux kernel to facilitate the use of Linux in IBM Z. Then, in 2000, more features were added to the mainframes, including the Integrated Facility for Linux (IFL), which hosts Linux with or without hypervisors for virtual machines (VMs).

Over the last 20+ years, IBM has committed significant resources to Linux. In 2000, IBM announced a $1 billion investment to make Linux a key part of the company strategy, establishing IBM as a champion for contributions to the Linux kernel and subsystems.

One of IBM’s key contributions to Linux has always been enhancements that take advantage of the unique capabilities of the mainframe. Today, IBM Z and LinuxONE run a much-improved open source Linux that allows amazing technology for high I/O transactions, cryptographic capabilities, scalability, reliability, compression, and performance.

Most major commercial and open source Linux distributions are available for IBM Z and LinuxONE, including Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Ubuntu, Fedora, Debian, openSUSE, CentOS, Alpine, and ClefOS.

The use of Linux over the course of 20 years has opened the doors to a vast ecosystem of open source software for IBM Z and LinuxONE.

The open source software ecosystem for s390x
Today, in line with its commitment to Linux, IBM contributes to many open source projects. In fact, together with Red Hat, which is now part of IBM, it has the largest number of active open source contributors in the world — an amazing feat.

Because IBM is committed to continuing to develop the open source software ecosystem for IBM Z and LinuxONE, the company has teams of full-time developers who contribute upstream to open source communities. In general terms, all you need is a Linux distribution compiled for s390x; then, if you want to port existing software, you build or compile it again on IBM Z or LinuxONE.

Open source communities and IBM upstream developers address technical items specific to s390x, especially when related to existing open source software for x86 processors that need to be ported and validated on an IBM Z or LinuxONE (s390x).

Technical considerations for porting OSS to s390x
First, it’s important to note that most software recompiles or builds with minimal to no changes; however, x86-specific components will cause compilation or runtime errors. In those cases, code needs to be added or adapted so that those libraries or components work on s390x.
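To make that concrete, here is a minimal, hypothetical C sketch (not taken from any particular project) showing the usual pattern: portable code rebuilds on s390x unchanged, and only an architecture-specific path sits behind a preprocessor guard. The __s390x__ macro is the one GCC and Clang predefine for this target; everything else in the sketch is generic.

/* port_check.c - keep the architecture-specific path isolated so the
 * rest of the program builds unchanged on x86_64 and on s390x. */
#include <stdio.h>

static const char *build_arch(void)
{
#if defined(__s390x__)
    return "s390x (IBM Z / LinuxONE)";   /* big-endian, 64-bit */
#elif defined(__x86_64__)
    return "x86_64";
#else
    return "other";
#endif
}

int main(void)
{
    /* Portable code needs no change at all; only the guarded
     * function above differs per architecture. */
    printf("Built for: %s\n", build_arch());
    return 0;
}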

s390x is a big-endian architecture. The big-endian scheme stores the most significant byte (MSB) of a multi-byte value first, while the little-endian scheme used by Arm and x86 processors stores the least significant byte (LSB) first. In practice, this means that if software makes byte-order assumptions in low-level memory or I/O handling written for a little-endian machine, that code needs to be adjusted to handle big-endian data so the application continues to work properly on the mainframe.
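A minimal, hypothetical C sketch of that pitfall, assuming a Linux/glibc build on both sides: copying an integer’s raw bytes into a buffer produces different byte sequences on x86 and s390x, while converting through the standard htobe32()/be32toh() helpers from <endian.h> gives the same on-the-wire result on both.

/* endian_demo.c - why low-level byte handling needs care on s390x.
 * The "wire format" below is defined as big-endian (network order). */
#include <endian.h>   /* htobe32, be32toh (glibc) */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t value = 0x11223344;
    unsigned char wire[4];

    /* Non-portable: byte order of 'wire' now depends on the CPU.
     * On x86 this writes 44 33 22 11, on s390x it writes 11 22 33 44. */
    memcpy(wire, &value, sizeof value);
    printf("raw copy: %02x %02x %02x %02x\n",
           wire[0], wire[1], wire[2], wire[3]);

    /* Portable: always big-endian on the wire, identical on x86 and s390x. */
    uint32_t be = htobe32(value);
    memcpy(wire, &be, sizeof be);
    printf("htobe32 : %02x %02x %02x %02x\n",
           wire[0], wire[1], wire[2], wire[3]);

    /* And back again, regardless of host byte order. */
    uint32_t decoded;
    memcpy(&decoded, wire, sizeof decoded);
    printf("decoded : 0x%08x\n", be32toh(decoded));
    return 0;
}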

The same considerations apply to library dependencies (transitive libraries) in which functionality specific to other processor architectures needs to change to work on s390x.

Every tool, script, and piece of software is different, but for the most part, the previous technical considerations apply, and in many cases, no code changes are required — all you have to do is build or compile the software again.

Growing the open source ecosystem
There you have it! Coding and building OSS are basically the same on any platform. The use of Linux and re-use of open source technologies, together with commonly used open source development tools and languages, have helped to grow the ecosystem of OSS for IBM Z and LinuxONE. We have seen more interest in recent months, and we are looking forward to having more OSS (especially in the AI space) being available for s390x.

Open source on IBM z/OS is a topic for another blog post, but it too is seeing growth including Linux Foundation projects like Zowe.

Open source ecosystem - logos

We invite you to participate. We have a growing community, and there are resources available for you to try in the IBM LinuxONE Community Cloud as well as a variety of other resources listed in this blog post. Developers and enterprises are sure to enjoy the benefits of working in a familiar open source environment.


Explore and make use of the advanced capabilities of the IBM z15
 
More than any other platform, the z15 offers a high-value architecture that can give you the cloud you want with the privacy and security you need. From the hundreds of microprocessors to the Z software stack, the z15 is built to be open, secure, resilient, and flexible from the ground up.

The z15 is offered as a single air-cooled 19-inch frame called the z15 T02, or as a multi-frame (1 to 4 19-inch frames) called the z15 T01.

The IBM Redbooks team brought together experts from around the world to help you explore and realize the potential of the IBM z15. Let IBM Redbooks guide you through the opportunities that this new technology can bring to your business.

https://www.redbooks.ibm.com/redbooks.nsf/pages/z15


The new IBM Z single-frame and multi-frame systems bring security, privacy and resiliency to your hybrid cloud infrastructure

WHY THE NEW IBM Z15 T02 MAINFRAME TECHNOLOGY MATTERS MORE THAN EVER

The New IBM z15 T02 Mainframe


Observations On IBM’s Announcement

As anticipated, IBM announced the follow-on to the z14 ZR1 server, the z15 model T02. This class of server from IBM is designed to deliver the scalability and granularity required for small and medium-sized enterprises across all industries. IBM architected this server by building off the enterprise-class chipset introduced in the Enterprise Class counterpart known as the z15 model T01. It is interesting to note, the 19” form factor found in the z15 T01 was first introduced in the prior small and medium-sized enterprise-class mainframe server – the ZR1.

If you have not kept up to speed on mainframe technology, things have changed dramatically. The image below highlights what IBM’s latest Mainframe family looks like. The number of racks in the T01 model is a function of the size of the client configuration. The T02 will always be a single rack system.

Lasting Technologies Innovate

Due to the recent changes in our world, it is comforting to know that the mainframe is the most secure, reliable, available and innovative platform, and it continues to support the backbone of our economy. It is amazing to see the IBM Z enhancements made over just the last decade, as highlighted below.

Mainframes have continued to see new innovations and technologies capabilities over the years. For example, you can see above how the number of cores and memory continues to increase with each server generation.



Network of Experts: Bodo Hoppe and IBM z15 – a developer perspective | IBM Client Center Boeblingen


For those platform nay-sayers that claim the “mainframe is dead,” why are the memory and core numbers increasing? The answer is simple; it is because clients are still dependent on this platform, and their workloads are continuing to grow and demand more resources.

What a great technology story! The first IBM mainframe was introduced in 1964. It just goes to show that the mainframe is a lasting technology and that IBM and its partners will continue to innovate on the platform while preserving its core values.


Enterprise Modernization And The T02

During the uncertainty in our world right now, many states and their mainframe environments have made headlines. Several states are scrambling to locate programming talent to scale their legacy mainframe applications, which are written in Enterprise COBOL. These applications support the unemployment systems, which are seeing a dramatic spike in claim submissions. Having read these articles, there appear to be common themes: the organizations decided to no longer invest in the platform, complacency may have set in, or some organizations favored a workload refactor and re-engineering approach. Organizations that embark on a transformation journey focus on the promise of reduced costs, improved customer experience, and revenue growth. The challenge is that none of those benefits are realized until the activity results in true Maintenance and Operation (M&O) of the refactored workload. That can happen, but it takes a concerted investment effort and time.

Interesting Features Of IBM Z15 T02

Turning our attention back to IBM’s announcement, this new server offers five hardware models and more than 250 unique software capacity settings, providing a highly granular and scalable system. The base single-engine speed of 98 MIPS is found on the A01; the full-speed unit (Z01) climbs to 1,761 MIPS, up from 1,570 MIPS on the prior generation. The server clock speed held steady at 4.5 GHz, yet average single-core performance increases 14% compared to the ZR1. Memory configurations and the number of cores available within a single system image for Linux-centric workloads also increase. Docker containers can be deployed natively on this system, and doing so allows your microservices to access native z/OS services within the same LPAR. Talk about zero network latency!

The T02 server also includes key innovative features. One such feature is known as Compression Acceleration, and a second is Instant Recovery. Let’s briefly review both within the context of the T02.

Compression Acceleration

Compression Acceleration is made possible by a new on-chip accelerator known as the Nest Acceleration Unit (NXU). The NXU implements the industry-standard DEFLATE compression algorithm, the same algorithm used by gzip and zlib. Using on-chip compression provides higher compression throughput and ratios, and it operates in one of two modes: synchronous or asynchronous. Synchronous mode goes straight through to the on-chip accelerator; asynchronous mode requires a corresponding priced feature on z/OS.
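For a sense of the code path this accelerates, here is a minimal sketch using the standard zlib API, which produces DEFLATE data (with a zlib wrapper). Whether a given call is actually serviced by the on-chip NXU depends on the environment, for example a zlib build with the z15 DEFLATE-conversion (DFLTCC) support on Linux on Z or the corresponding z/OS services, so the hardware offload here is an assumption about the deployment rather than something this code controls.

/* deflate_demo.c - compress a buffer with the standard zlib API.
 * On z15 hardware, a zlib build with DEFLATE acceleration enabled
 * can service the same calls through the on-chip NXU transparently. */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    const char *text =
        "Mainframe data sets are often highly repetitive, which is "
        "exactly where DEFLATE-style compression pays off. ";
    uLong src_len = (uLong)strlen(text) + 1;

    /* compressBound() gives a safe worst-case size for the output. */
    uLongf dst_len = compressBound(src_len);
    Bytef dst[1024];

    if (compress2(dst, &dst_len, (const Bytef *)text, src_len,
                  Z_BEST_SPEED) != Z_OK) {
        fprintf(stderr, "compress2 failed\n");
        return 1;
    }
    printf("compressed %lu bytes down to %lu bytes\n",
           (unsigned long)src_len, (unsigned long)dst_len);

    /* Round-trip to confirm the stream is valid DEFLATE/zlib data. */
    Bytef back[1024];
    uLongf back_len = sizeof back;
    if (uncompress(back, &back_len, dst, dst_len) != Z_OK ||
        strcmp((const char *)back, text) != 0) {
        fprintf(stderr, "round-trip failed\n");
        return 1;
    }
    puts("round-trip OK");
    return 0;
}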



Instant Recovery
 
Did you know the mainframe system that you currently own embraces a three-pronged system availability strategy? IBM designs the mainframe around three principles:

  • Begin with a mindset that centers on keeping the system up and running.
  • Eliminate unplanned as well as planned outages.
  • Architect the hardware and OS so that your applications remain available should an LPAR become unavailable for whatever reason, such as to apply maintenance. IBM instrumentation includes support for rolling IPLs and platform clustering technology.
To continue improving the focus on system availability, another new feature, IBM System Recovery Boost, has been released.

According to IBM, System Recovery Boost is an innovative solution that diminishes the impact of downtime, planned or unplanned, so you can restore service and recover workloads substantially faster than on previous IBM Z generations, with zero increase in IBM software MSU consumption or cost.

With System Recovery Boost, you can use your already-entitled Central Processors and zIIPs to unleash additional processing capacity on an LPAR-by-LPAR basis, providing a fixed-duration performance boost to reduce the length and mitigate the impact of downtime:
  • Faster shutdown
  • Faster GDPS automation actions
  • Faster restart & recovery
  • Faster elimination of workload backlog
Key features include:
  • Speed Boost: Enables general-purpose processors on sub-capacity machine models to run at full-capacity speed in the image(s) being boosted.
  • zIIP Boost: Provides additional capacity and parallelism by enabling general-purpose workloads to run on zIIP processors that are available to the image(s) being boosted.
Let’s dive further into this. How does the operating system know you are shutting down one of your LPARs and that a shutdown boost period of 30 minutes should start? It’s quite simple. At shutdown time, the operator has to explicitly activate the boost by starting the new started procedure IEASDBS (Shut Down Boost Start).

Upon re-IPL of that same LPAR, Boost would be “On by Default” for that image, offering sixty minutes of boosted capacity to get the operating system and subsystems up. During the Boost period, workloads will also continue processing at an accelerated pace.

For those familiar with this platform, you know that zIIPs traditionally only host DRDA, IPSec and IBM Db2 utility workloads, along with non-IBM Software solutions that have chosen to leverage the zIIP API. During System Recovery Boost, if you have at least one zIIP engine available to the LPAR, it can run both traditional zIIP-only workloads as well as General Purpose CP Workload. IBM dubbed this capability CP Blurring.  Just like Speed Boost, zIIP Boost will last thirty minutes on shutdown and sixty minutes on restart.

What runs on the zIIP during the boost period? The short answer: any program!



On-Demand Webinar: Preparing Enterprise IT for the Next 50 Years of the Mainframe

Announcing IBM z15 Model T02, IBM LinuxONE III Model LT2 and IBM Secure Execution for Linux

Every day, clients of all sizes are examining their hybrid IT environments, looking for flexibility, responsiveness, and ways to cut costs to fuel their digital transformations. To help address these needs, today IBM is making two announcements. The first is two new single-frame, air-cooled platforms, IBM z15 Model T02 and IBM LinuxONE III Model LT2, designed to build on the capabilities of z15. The second is IBM Secure Execution for Linux, a new offering designed to help protect against internal and external threats across the hybrid cloud. The platforms and offering will become generally available on May 15, 2020.

Expanding privacy with IBM Secure Execution for Linux

According to the Ponemon Institute’s 2020 Cost of an Insider Breach Report, sponsored by IBM, insider threats are steadily increasing. From 2016 to 2019, the average number of incidents involving employee or contractor negligence increased from 10.5 to 14.5, and the average number of credential theft incidents per company has tripled over the past three years, from 1.0 to 3.2. IBM Secure Execution for Linux helps to mitigate these concerns by enabling clients to isolate large numbers of workloads with granularity and at scale, within a trusted execution environment available on all members of the z15 and LinuxONE III families.

Read the Ponemon Institute Report https://www.ibm.com/downloads/cas/LQZ4RONE

For clients with highly sensitive workloads such as cryptocurrency and blockchain services, keeping data secure is even more critical. That’s why IBM Secure Execution for Linux works by establishing secured enclaves that can scale to host these sensitive workloads and provide both enterprise-grade confidentiality and protection for sensitive and regulated data. For our clients, this is the latest step toward delivering a highly secure platform for mission-critical workloads.

For years, Vicom has worked with LinuxONE and Linux® on Z to solve clients’ business challenges as a reseller and integrator. On learning how IBM Secure Execution for Linux can help clients, Tom Amodio, President, Vicom Infinity said, “IBM’s Secure Execution, and the evolution of confidential computing on LinuxONE, give our clients the confidence they need to build and deploy secure hybrid clouds at scale.”

Simplifying your regulatory requirements for highly sensitive workloads
In addition to the growing risk of insider threats, our clients are also facing complexity around new compliance regulations such as GDPR and the California Consumer Privacy Act, demonstrating that workload isolation and separation of control are becoming even more important for companies of all sizes to ensure the integrity of each application and its data across platforms. IBM Secure Execution for Linux provides an alternative to air-gapped or separated dedicated hardware typically required for sensitive workloads.






TechU Talks Replay: Introducing IBM z15 Data Privacy Passports - 4/9/20




Learn more about IBM Storage offerings for IBM Z

Delivering cyber resiliency and flexible compute

Building on recent announcements around encrypting everywhere, cloud-native and IBM Z Instant Recovery capabilities, as well as support for Red Hat OpenShift Container Platform and Red Hat Ansible Certified Content for IBM Z, these two new members of the IBM Z and LinuxONE families bring new cyber resiliency and flexible compute capabilities to clients including:

Enterprise Key Management Foundation–Web Edition provides centralized, secured management of keys for robust IBM z/OS® management.
Flexible compute: Increased core and memory density with 2 central processor complex drawer design provides increased physical capacity and an enhanced high availability option. Clients can have up to 3 I/O drawers and can now support up to 40 crypto processors.
Red Hat OpenShift Container Platform 4.3: The latest release, planned for general availability this month on IBM Z and LinuxONE.
Complementary IBM Storage enhancements
In addition, IBM also announced new updates to our IBM Storage offerings for IBM Z. The IBM DS8900F all-flash array and the IBM TS7700 virtual tape library both now offer smaller footprint options. This week, the TS7700 family gained a smaller footprint with flexible configurations for businesses of all sizes and different needs, and it can be mounted in an industry-standard 19-inch rack.





IBM Reveals Next-Generation IBM POWER10 Processor


New CPU co-optimized for Red Hat OpenShift for enterprise hybrid cloud

IBM revealed the next generation of its IBM POWER central processing unit (CPU) family: IBM POWER10. 

OpenPOWER Summit 2020 Sponsor Showcase: IBM POWER10

Intel and AMD have some fresh competition in the enterprise and data center markets as IBM just launched its next-generation Power10 processor.

The Power9 processor was introduced back in 2017. It's a 14nm processor that was used in the Summit supercomputer, which held the top spot as the world's fastest supercomputer from June 2018 to June 2020. Now IBM is set to replace Power9 with the company's first 7nm processor, and Power10 will be manufactured through a partnership with Samsung.

Power10 promises some massive improvements over Power9. IBM claims a 3x improvement in both capacity and processor energy efficiency over its previous chip generation within the same power envelope. Power10 also includes a new feature called "memory inception," allowing clusters of physical memory to be shared across a pool of systems. Each system in the pool can access all of the memory, and memory clusters can be scaled up to petabytes in size.

IBM says there's up to a 20x improvement in speed for artificial intelligence workloads compared to Power9, and there's also been a focus on bolstering security. IBM added "quadruple the number of AES encryption engines per core" while also anticipating "future cryptographic standards like quantum-safe cryptography and fully homomorphic encryption."

"Enterprise-grade hybrid clouds require a robust on-premises and off-site architecture inclusive of hardware and co-optimized software," said Stephen Leonard, GM of IBM Cognitive Systems. "With IBM POWER10 we've designed the premier processor for enterprise hybrid cloud, delivering the performance and security that clients expect from IBM. With our stated goal of making Red Hat OpenShift the default choice for hybrid cloud, IBM POWER10 brings hardware-based capacity and security enhancements for containers to the IT infrastructure level."

Considering that the Summit supercomputer has only dropped to second place on the fastest list and still counts as the fifth most efficient supercomputer operating today, it seems likely that a supercomputer using Power10 processors will appear and jump immediately to the top of the charts within a few years.


Designed to offer a platform to meet the unique needs of enterprise hybrid cloud computing, the IBM POWER10 processor uses a design focused on energy efficiency and performance in a 7nm form factor, with an expected improvement of up to 3x greater processor energy efficiency, workload capacity, and container density than the IBM POWER9 processor.

Designed over five years with hundreds of new and pending patents, the IBM POWER10 processor is an important evolution in IBM's roadmap for POWER. Systems taking advantage of IBM POWER10 are expected to be available in the second half of 2021. Some of the new processor innovations include:

  • IBM's First Commercialized 7nm Processor, expected to deliver up to a 3x improvement in capacity and processor energy efficiency within the same power envelope as IBM POWER9, allowing for greater performance.
  • Support for Multi-Petabyte Memory Clusters with a breakthrough new technology called Memory Inception, designed to improve cloud capacity and economics for memory-intensive workloads from ISVs like SAP, the SAS Institute, and others, as well as large-model AI inference.
  • New Hardware-Enabled Security Capabilities, including transparent memory encryption designed to support end-to-end security. The IBM POWER10 processor is engineered to achieve significantly faster encryption performance, with quadruple the number of AES encryption engines per core compared to IBM POWER9, for today's most demanding standards and anticipated future cryptographic standards like quantum-safe cryptography and fully homomorphic encryption. It also brings new enhancements to container security.
  • New Processor Core Architectures in the IBM POWER10 processor with an embedded Matrix Math Accelerator, which is extrapolated to provide 10x, 15x, and 20x faster AI inference for FP32, BFloat16, and INT8 calculations per socket, respectively, than the IBM POWER9 processor, to infuse AI into business applications and drive greater insights.

"Enterprise-grade hybrid clouds require a robust on-premises and off-site architecture inclusive of hardware and co-optimized software," said Stephen Leonard, GM of IBM Cognitive Systems. "With IBM POWER10 we've designed the premier processor for enterprise hybrid cloud, delivering the performance and security that clients expect from IBM. With our stated goal of making Red Hat OpenShift the default choice for hybrid cloud, IBM POWER10 brings hardware-based capacity and security enhancements for containers to the IT infrastructure level."

IBM's POWER10 Processor - William Starke & Brian W. Thompto, IBM

IBM POWER10 7nm Form Factor Delivers Energy Efficiency and Capacity Gains

IBM POWER10 is IBM's first commercialized processor built using 7nm process technology. IBM Research has been partnering with Samsung Electronics Co., Ltd. on research and development for more than a decade, including demonstration of the semiconductor industry's first 7nm test chips through IBM's Research Alliance.

With this updated technology and a focus on designing for performance and efficiency, IBM POWER10 is expected to deliver up to a 3x gain in processor energy efficiency per socket, increasing workload capacity in the same power envelope as IBM POWER9. This anticipated improvement in capacity is designed to allow IBM POWER10-based systems to support up to 3x increases in users, workloads and OpenShift container density for hybrid cloud workloads as compared to IBM POWER9-based systems. 

This can affect multiple datacenter attributes to drive greater efficiency and reduce costs, such as space and energy use, while also allowing hybrid cloud users to achieve more work in a smaller footprint.

Hardware Enhancements to Further Secure the Hybrid Cloud

IBM POWER10 offers hardware memory encryption for end-to-end security and faster cryptography performance thanks to additional AES encryption engines for both today's leading encryption standards as well as anticipated future encryption protocols like quantum-safe cryptography and fully homomorphic encryption.
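To illustrate the kind of workload those extra AES engines speed up, here is a hedged, generic sketch (not POWER10-specific code and not IBM's implementation) that encrypts a buffer with AES-256-GCM through OpenSSL's EVP interface. On a processor with per-core AES acceleration, these same library calls simply run faster; no source changes are needed.

/* aes_gcm_demo.c - bulk AES-256-GCM encryption through OpenSSL's EVP API.
 * Hardware AES engines accelerate this path transparently; the source
 * code is identical on any platform. Build with: -lcrypto */
#include <openssl/evp.h>
#include <openssl/rand.h>
#include <stdio.h>

int main(void)
{
    unsigned char key[32], iv[12], tag[16];
    unsigned char plaintext[] = "Sensitive record to be protected in flight.";
    unsigned char ciphertext[sizeof plaintext];
    int len = 0, out_len = 0;

    /* Random key and nonce for the demo; real code manages keys properly. */
    if (RAND_bytes(key, sizeof key) != 1 || RAND_bytes(iv, sizeof iv) != 1)
        return 1;

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    if (!ctx)
        return 1;

    EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), NULL, key, iv);
    EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, sizeof plaintext);
    out_len = len;
    EVP_EncryptFinal_ex(ctx, ciphertext + len, &len);
    out_len += len;
    /* GCM produces an authentication tag alongside the ciphertext. */
    EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, sizeof tag, tag);
    EVP_CIPHER_CTX_free(ctx);

    printf("encrypted %d bytes, 16-byte GCM tag attached\n", out_len);
    return 0;
}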

Further, to address new security considerations associated with the higher density of containers, IBM POWER10 is designed to deliver new hardware-enforced container protection and isolation capabilities co-developed with the IBM POWER10 firmware. If a container were to be compromised, the POWER10 processor is designed to be able to prevent other containers in the same Virtual Machine (VM) from being affected by the same intrusion.

Cyberattacks are continuing to evolve, and newly discovered vulnerabilities can cause disruptions as organizations wait for fixes. To better enable clients to proactively defend against certain new application vulnerabilities in real-time, IBM POWER10 is designed to give users dynamic execution register control, meaning users could design applications that are more resistant to attacks with minimal performance loss.

Multi-Petabyte Size Memory Clustering Gives Flexibility for Multiple Hybrid Deployments

IBM POWER has long been a leader in supporting a wide range of flexible deployments for hybrid cloud and on-premises workloads through a combination of hardware and software capabilities. The IBM POWER10 processor is designed to elevate this with the ability to pool or cluster physical memory across IBM POWER10-based systems, once available, in a variety of configurations. In a breakthrough new technology called Memory Inception, the IBM POWER10 processor is designed to allow any of the IBM POWER10 processor-based systems in a cluster to access and share each other's memory, creating multi-Petabyte sized memory clusters.

For both cloud users and providers, Memory Inception offers the potential to drive cost and energy savings, as cloud providers can offer more capability using fewer servers, while cloud users can lease fewer resources to meet their IT needs. 

Infusing AI into the Enterprise Hybrid Cloud to Drive Deeper Insights

As AI continues to be more and more embedded into business applications in transactional and analytical workflows, AI inferencing is becoming central to enterprise applications. The IBM POWER10 processor is designed to enhance in-core AI inferencing capability without requiring additional specialized hardware.

With an embedded Matrix Math Accelerator, the IBM POWER10 processor is expected to achieve 10x, 15x, and 20x faster AI inference for FP32, BFloat16, and INT8 calculations respectively, improving performance for enterprise AI inference workloads compared to IBM POWER9 and helping enterprises take the AI models they trained and put them to work in the field. With IBM's broad portfolio of AI software, IBM POWER10 is expected to help infuse AI workloads into typical enterprise applications to glean more impactful insights from data.

Building the Enterprise Hybrid Cloud of the Future

With hardware co-optimized for Red Hat OpenShift, IBM POWER10-based servers will deliver the future of the hybrid cloud when they become available in the second half of 2021. Samsung Electronics will manufacture the IBM POWER10 processor, combining Samsung's industry-leading semiconductor manufacturing technology with IBM's CPU designs.

OpenPOWER Summit EU 2019: Microwatt: Make Your Own POWER CPU

IBM today introduced its next-generation Power10 microprocessor, a 7nm device manufactured by Samsung. The chip features a new microarchitecture, broad new memory support, PCIe Gen 5 connectivity, hardware-enabled security, impressive energy efficiency, and a host of other improvements. Unveiled at the annual Hot Chips conference (virtual this year), Power10 won’t turn up in IBM systems until this time next year. IBM didn’t disclose when the chip would be available to other systems makers.

IBM says Power10 offers a ~3x performance gain and ~2.6x core efficiency gain over Power9. No benchmarks against non-IBM chips were presented. Power9, of course, was introduced in 2017 and manufactured by GlobalFoundries on a 14nm process. While the move to a 7nm process provides many of Power10’s gains, there are also significant new features, not least what IBM calls Memory Inception, which allows Power10 to access up to “multi petabytes” of pooled memory from diverse sources.

“You’re able to kind of trick a system into thinking that memory in another system belongs to this system. It isn’t like traditional [techniques] of doing an RDMA over InfiniBand to get access to people’s memory. This is programs running on my computer [that] can do load-store access directly, coherently,” said William Starke, IBM distinguished engineer and a Power10 architect, in a pre-briefing. “They use their caches [to] play with memory as if it’s in my system, even if it’s bridged by a cable over to another system. If we’re using short-reach cabling, we can actually do this with only 50-to-100 nanoseconds of additional latency. We’re not talking about adding a microsecond or something like you might have over an RDMA.”

IBM is promoting Inception as a major achievement.

“HP came out with their big thing a few years ago. They called it The Machine and it was going to be their way of revolutionizing things, largely by disaggregating memory. Intel, you’ve seen from their charts, is talking about the Rack Scale architectures they’re evolving toward. Well, this is IBM’s version of this and we have it today, in silicon. We are announcing we are able to take things outside of the system and aggregate multiple systems together to directly share memory.”

OpenPOWER Summit NA 2019: An Overview of the Self Boot Engine (SBE) in POWER9 base OpenPOWER Systems

Inception is just one of many interesting features of Power10, which has roughly 18 billion transistors. IBM plans to offer two core types – SMT4 (simultaneous multi-threaded, four threads per core) and SMT8 cores – and IBM focused on the latter in today’s presentation. There are 16 cores on the chip, and on-/off-chip bandwidth via the OMI interface, PowerAXON (for adding OpenCAPI accelerators), or the PCIe 5 interface is shown delivering up to 1 terabyte per second on IBM’s slides.

CXL interconnect is not supported by Power10, which is perhaps surprising given the increasingly favorable comments about CXL from IBM over the past year.

Starke said as part of a Slack conversation tied to Hot Chips, “Does POWER10 support CXL? No, it does not. IBM created OpenCAPI because we believe in Open, and we have 10+ years of experience in this space that we want to share with the industry. We know that an asymmetric, host-dominant attach is the only way to make these things work across multiple companies. We are encouraged to see the same underpinnings in CXL. It’s open. It’s asymmetric. So it’s built on the right foundations. We are CXL members and we want to bring our know-how into CXL. But right now, CXL is a few years behind OpenCAPI. Until it catches up, we cannot afford to take a step backwards. Right now OpenCAPI provides a great opportunity to get in front of things that will become more mainstream as CXL matures.”

Below is the block diagram of IBM’s new Power10 chip showing major architecture elements.




How open is OpenPOWER? - DevConf.CZ 2020

The process shrink plays a role in allowing IBM to offer the two packaging options shown below.


IBM is offering two versions of the processor module, and it was able to do this primarily because of the energy efficiency gains. “We’re bringing out a single chip module. There is one Power10 chip exposing all those high-bandwidth interfaces, so very high bandwidth per compute type of characteristics. On the upper right you can see it. We build a 16-socket, large system that’s very robustly scalable. We’ve enjoyed success over the last several generations with this type of offering, and Power10 is going to be no different.

“On the bottom you see something a little new. We can basically take two Power10 processor chips and cram them into the same form factor where we used to put just one Power9 processor. We’re taking 1,200 square millimeters of silicon and putting it into the same form factor. That’s going to be very valuable in compute-dense, energy-dense, volumetric space-dense cloud configurations, where we can build systems ranging from one to four sockets, where those are dual-chip-module sockets as shown.”

IBM POWER10 technical preview of chip capabilities

It will be interesting to see what sort of traction the two different offerings gain among non-IBM systems builders as well as hyperscalers. Broadly IBM is positioning Power10 as a strong fit for hybrid cloud, AI, and HPC environments. Hardware and firmware enhancements were made to support security, containerization, and inferencing, with IBM pointedly suggesting Power10 will be able to handle most inferencing workflows as well as GPUs.

Talking about security, Satya Sharma, IBM Fellow and CTO, IBM Cognitive Systems, said “Power10 implements transparent memory encryption, which is memory encryption without any performance degradation. When you do memory encryption in software, it usually leads to performance degradation. Power10 implements transparent hardware memory encryption.”

Sharma cited similar features for containers and accelerated cryptographic standards. IBM’s official announcement says Power10 is designed to deliver hardware-enforced container protection and isolation optimized with IBM firmware, and that Power10 can encrypt data 40 percent faster than Power9.


Architecture innovations in POWER ISA v3.01 and POWER10

IBM also reports Power10 delivers a 10x-to-20x advantage over Power9 on inferencing workloads. Memory bandwidth and new instructions helped achieve those gains. One example is a new purpose-built matrix math accelerator tailored to the demands of machine learning and deep learning inference, with support for a wide range of AI data types.

Focusing for a moment on the dense-math-engine microarchitecture, Brian Thompto, distinguished engineer and Power10 designer, noted, “We also focused on algorithms that were hungry for flops, such as the matrix math utilized in deep learning. Every core has built-in matrix math acceleration and efficiently performs matrix outer product operations. These operations were optimized across a wide range of data types. Recognizing that various precisions can be best suited for specific machine learning algorithms, we included very broad support: double precision, single precision, two flavors of half precision covering both IEEE and bfloat16, as well as reduced-precision integer 16-, eight-, and four-bit. The result is 64 flops per cycle at double precision, and up to 1K flops per cycle at reduced precision, per SMT core. These operations were tailor-made to be efficient while applying machine learning.

“At the socket level, you get 10 times the performance per socket for double and single precision, and using reduced precision, bfloat16 speeds up to over 15x and INT8 inference speeds up to over 20x over Power9.” More broadly, he said, “We have a host of new capabilities in ISA version 3.1. This is the new instruction set architecture that supports Power10 and is contributed to the OpenPOWER Foundation. The new ISA supports 64-bit prefixed instructions in a RISC-friendly way. This is in addition to the classic way that we’ve delivered 32-bit instructions for many decades. It opens the door to adding new capabilities such as new addressing modes, as well as providing rich new opcode space for future expansion.”
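To make the “matrix outer product” wording concrete, here is a plain, hypothetical C reference for the rank-1 update C += a * b^T that such matrix-math units accelerate on small tiles; repeating the update along the shared dimension builds a full matrix multiply. The scalar loop only shows the arithmetic and does not use the MMA instructions themselves.

/* outer_product_demo.c - scalar reference for the rank-1 update
 * C += a * b^T that matrix-math accelerators perform on small tiles.
 * Repeating this update over the shared K dimension yields a full GEMM. */
#include <stdio.h>

#define M 4
#define N 4

static void rank1_update(float C[M][N], const float a[M], const float b[N])
{
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            C[i][j] += a[i] * b[j];   /* one multiply-accumulate per element */
}

int main(void)
{
    float C[M][N] = {{0}};
    const float a[M] = {1, 2, 3, 4};
    const float b[N] = {0.5f, 1.0f, 1.5f, 2.0f};

    rank1_update(C, a, b);   /* an accelerator does a whole tile of these */

    for (int i = 0; i < M; i++) {
        for (int j = 0; j < N; j++)
            printf("%6.2f ", C[i][j]);
        putchar('\n');
    }
    return 0;
}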

POWER Up Your Insights - IBM System Summit


IBM promises 1000-qubit quantum computer—a milestone—by 2023

IBM  today, for the first time, published its road map for the future of its quantum computing hardware. There is a lot to digest here, but the most important news in the short term is that the company believes it is on its way to building a quantum processor with more than 1,000 qubits — and somewhere between 10 and 50 logical qubits — by the end of 2023.

Currently, the company’s quantum processors top out at 65 qubits. It plans to launch a 127-qubit processor next year and a 433-qubit machine in 2022. To get to this point, IBM is also building a completely new dilution refrigerator to house these larger chips, as well as the technology to connect multiple of these units to build a system akin to today’s multi-core architectures in classical chips.

Dario Gil, IBM’s director of research, believes that 2023 will be an inflection point in the industry, with the road to the 1,121-qubit machine driving improvements across the stack. The most important — and ambitious — of these performance improvements that IBM is trying to execute on is bringing down the error rate from about 1% today to something closer to 0.0001%. But looking at the trajectory of where its machines were just a few years ago, that’s the number the line is pointing toward.

Q-CTRL and Quantum Machines, two of the better-known startups in the quantum control ecosystem, today announced a new partnership that will see Quantum Machines integrate Q-CTRL‘s quantum firmware into Quantum Machines’ Quantum Orchestration hardware and software solution.

Building quantum computers takes so much specialized knowledge that it’s no surprise that we are now seeing some of the best-of-breed startups cooperate — and that’s pretty much why these two companies are now working together and why we’ll likely see more of these collaborations over time.

“The motivation [for quantum computing] is this immense computational power that we could get from quantum computers and while it exists, we didn’t make it happen yet. We don’t have full-fledged quantum computers yet,” Itamar Sivan, the co-founder and CEO of Quantum Machines, told me.

IBM Power10 A Glimpse Into the Future of Servers

For 20 years scientists and engineers have been saying that “someday” they’ll build a full-fledged quantum computer able to perform useful calculations that would overwhelm any conventional supercomputer. But current machines contain just a few dozen quantum bits, or qubits, too few to do anything dazzling. Today, IBM made its aspirations more concrete by publicly announcing a “road map” for the development of its quantum computers, including the ambitious goal of building one containing 1000 qubits by 2023. IBM’s current largest quantum computer, revealed this month, contains 65 qubits.

“We’re very excited,” says Prineha Narang, co-founder and chief technology officer of Aliro Quantum, a startup that specializes in code that helps higher level software efficiently run on different quantum computers. “We didn’t know the specific milestones and numbers that they’ve announced,” she says. The plan includes building intermediate-size machines of 127 and 433 qubits in 2021 and 2022, respectively, and envisions following up with a million-qubit machine at some unspecified date. Dario Gil, IBM’s director of research, says he is confident his team can keep to the schedule. “A road map is more than a plan and a PowerPoint presentation,” he says. “It’s execution.”

IBM is not the only company with a road map to build a full-fledged quantum computer—a machine that would take advantage of the strange rules of quantum mechanics to breeze through certain computations that just overwhelm conventional computers. At least in terms of public relations, IBM has been playing catch-up to Google, which 1 year ago grabbed headlines when the company announced its researchers had used their 53-qubit quantum computer to solve a particular abstract problem that they claimed would overwhelm any conventional computer—reaching a milestone known as quantum supremacy. Google has its own plan to build a million-qubit quantum computer within 10 years, as Hartmut Neven, who leads Google’s quantum computing effort, explained in an April interview, although he declined to reveal a specific timeline for advances.


AI in Automobile :Solutions for ADAS and AI data engineering using OpenPOWER/POWER systems

IBM’s declared timeline comes with an obvious risk that everyone will know if it misses its milestones. But the company decided to reveal its plans so that its clients and collaborators would know what to expect. Dozens of quantum-computing startup companies use IBM’s current machines to develop their own software products, and knowing IBM’s milestones should help developers better tailor their efforts to the hardware, Gil says.

One company joining those efforts is Q-CTRL, which develops software to optimize the control and performance of the individual qubits. The IBM announcement shows venture capitalists the company is serious about developing the challenging technology, says Michael Biercuk, founder and CEO of Q-CTRL. “It’s relevant to convincing investors that this large hardware manufacturer is pushing hard on this and investing significant resources,” he says.

A 1000-qubit machine is a particularly important milestone in the development of a full-fledged quantum computer, researchers say. Such a machine would still be 1000 times too small to fulfill quantum computing’s full potential—such as breaking current internet encryption schemes—but it would be big enough to spot and correct the myriad errors that ordinarily plague the finicky quantum bits.

IBM Power Systems at FIS InFocus 2019

A bit in an ordinary computer is an electrical switch that can be set to either zero or one. In contrast, a qubit is a quantum device—in IBM’s and Google’s machines, each is a tiny circuit of superconducting metal chilled to nearly absolute zero—that can be set to zero, one, or, thanks to the strange rules of quantum mechanics, zero and one at the same time. But the slightest interaction with the environment tends to distort those delicate two-ways-at-once states, so researchers have developed error-correction protocols to spread information ordinarily encoded in a single physical qubit to many of them in a way that the state of that “logical qubit” can be maintained indefinitely.

With their planned 1121-qubit machine, IBM researchers would be able to maintain a handful of logical qubits and make them interact, says Jay Gambetta, a physicist who leads IBM’s quantum computing efforts. That’s exactly what will be required to start to make a full-fledged quantum computer with thousands of logical qubits. Such a machine would mark an “inflection point” in which researchers’ focus would switch from beating down the error rate in the individual qubits to optimizing the architecture and performance of the entire system, Gambetta says.

IBM is already preparing a jumbo liquid-helium refrigerator, or cryostat, to hold a quantum computer with 1 million qubits. The IBM road map doesn’t specify when such a machine could be built. But if company researchers really can build a 1000-qubit computer in the next 2 years, that ultimate goal will sound far less fantastical than it does now.

IBM Power Systems at the heart of Cognitive Solutions


More Information:

https://www.ibm.com/it-infrastructure/power/supercomputing

https://techcrunch.com/2020/09/15/ibm-publishes-its-quantum-roadmap-says-it-will-have-a-1000-qubit-machine-in-2023/?guccounter=1

https://techcrunch.com/2020/09/09/q-ctrl-and-quantum-machines-team-up-to-accelerate-quantum-computing/

https://www.sciencemag.org/news/2020/09/ibm-promises-1000-qubit-quantum-computer-milestone-2023

https://newsroom.ibm.com/2020-08-17-IBM-Reveals-Next-Generation-IBM-POWER10-Processor

https://www.hpcwire.com/2020/08/17/ibm-debuts-power10-touts-new-memory-scheme-security-and-inferencing/

https://newsroom.ibm.com/Stephen-Leonard-POWER10

https://www.olcf.ornl.gov/summit/

https://www.olcf.ornl.gov/wp-content/uploads/2018/06/Summit_bythenumbers_FIN-1.pdf

https://www.pcmag.com/news/ibm-launches-7nm-power10-processor

https://www.nextplatform.com/2020/09/03/the-memory-area-network-at-the-heart-of-ibms-power10/

https://www.servethehome.com/ibm-power10-searching-for-the-holy-grail-of-compute/

Supercomputer Fugaku



About the Project


The supercomputer Fugaku development plan, initiated by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) in 2014, has set the goal to develop: (1) the next-generation flagship supercomputer of Japan (the successor to the K computer); and (2) a wide range of applications that will address social and scientific issues of high priority.

RIKEN Center for Computational Science (R-CCS) has been appointed to lead the development of Fugaku with the aim to start public service in FY2021. We are committed to develop a world-leading, versatile supercomputer and its applications, building on the research, technologies, and experience obtained through the use of the K computer.

The MEXT, R-CCS and its corporate partner will collaborate with several research institutions and universities to co-design the system and applications in order to address the high priority social and scientific issues identified by the MEXT.


Outline of the Development of the Supercomputer Fugaku

The supercomputer Fugaku will be developed based on the following guiding principles: 

Top priority on problem-solving research
During development, highest priority will be given to creating a system capable of contributing to the solution of various scientific and societal issues. For this, the hardware and software will be developed in a coordinated way (Co-design), with the aim to make it usable in a variety of fields. 

World-leading performance
Create the most advanced general-use system in the world.

Improve performance through international cooperation
While leveraging Japan’s strengths, cooperate internationally to achieve world-leading technologies of the highest quality and become the international standard.

Continue the legacy of the K computer
Make the fullest use of the technologies, human resources, and applications of the K computer project for developing the Fugaku system.

Introduction to Fujitsu ARM A64FX


Japan’s Fugaku gains title as world’s fastest supercomputer

The supercomputer Fugaku, which is being developed jointly by RIKEN and Fujitsu Limited based on Arm® technology, has taken the top spot on the Top500 list, a ranking of the world’s fastest supercomputers. It also swept the other rankings of supercomputer performance, taking first place on the HPCG, a ranking of supercomputers running real-world applications; HPL-AI, which ranks supercomputers based on their performance capabilities for tasks typically used in artificial intelligence applications; and Graph500, which ranks systems based on data-intensive loads. This is the first time in history that the same supercomputer has become No. 1 on Top500, HPCG, and Graph500 simultaneously. The awards were announced on June 22 at ISC High Performance 2020 Digital, an international high-performance computing conference.


On the Top500, it achieved a LINPACK score of 415.53 petaflops, a much higher score than the 148.6 petaflops of its nearest competitor, Summit in the United States, using 152,064 of its eventual 158,976 nodes. This marks the first time a Japanese system has taken the top ranking since June 2011, when the K computer—Fugaku’s predecessor—took first place. On HPCG, it scored 13,400 teraflops using 138,240 nodes, and on HPL-AI it gained a score of 1.421 exaflops—the first time a computer has ever earned an exascale rating on any list—using 126,720 nodes.

The top ranking on Graph500 was won by a collaboration involving RIKEN, Kyushu University, Fixstars Corporation, and Fujitsu Limited. Using 92,160 nodes, Fugaku solved a breadth-first search of an enormous graph with 1.1 trillion nodes and 17.6 trillion edges in approximately 0.25 seconds, earning it a score of 70,980 gigaTEPS, more than double the K computer’s score of 31,303 gigaTEPS and far surpassing China’s Sunway TaihuLight, which is currently second on the list with 23,756 gigaTEPS.
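As a rough sanity check on that figure (this arithmetic is an illustration, not part of the announcement), traversed edges per second is simply the edge count divided by the breadth-first-search time:

17.6 × 10^12 edges ÷ 0.248 s ≈ 7.1 × 10^13 TEPS ≈ 70,980 gigaTEPS, which is consistent with the reported time of approximately 0.25 seconds.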

Fugaku, which is currently installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan, is being developed under a national plan to design Japan’s next-generation flagship supercomputer and to carry out a wide range of applications that will address high-priority social and scientific issues. It will be put to use in applications aimed at achieving the Society 5.0 plan, by running applications in areas such as drug discovery; personalized and preventive medicine; simulations of natural disasters; weather and climate forecasting; energy creation, storage, and use; development of clean energy; new material development; new design and production processes; and—as a purely scientific endeavor—elucidation of the fundamental laws and evolution of the universe. In addition, Fugaku is currently being used on an experimental basis for research on COVID-19, including on diagnostics, therapeutics, and simulations of the spread of the virus. The new supercomputer is scheduled to begin full operation in fiscal 2021 (which starts in April 2021).

According to Satoshi Matsuoka, director of RIKEN R-CCS, “Ten years after the initial concept was proposed, and six years after the official start of the project, Fugaku is now near completion. Fugaku was developed based on the idea of achieving high performance on a variety of applications of great public interest, such as the achievement of Society 5.0, and we are very happy that it has shown itself to be outstanding on all the major supercomputer benchmarks. In addition to its use as a supercomputer, I hope that the leading-edge IT developed for it will contribute to major advances on difficult social challenges such as COVID-19.”

According to Naoki Shinjo, Corporate Executive Officer of Fujitsu Limited, “I believe that our decision to use a co-design process for Fugaku, which involved working with RIKEN and other parties to create the system, was a key to our winning the top position on a number of rankings. I am particularly proud that we were able to do this just one month after the delivery of the system was finished, even during the COVID-19 crisis. I would like to express our sincere gratitude to RIKEN and all the other parties for their generous cooperation and support. I very much hope that Fugaku will show itself to be highly effective in real-world applications and will help to realize Society 5.0.”

“The supercomputer Fugaku illustrates a dramatic shift in the type of compute that has been traditionally used in these powerful machines, and it is proof of the innovation that can happen with flexible computing solutions driven by a strong ecosystem,” said Rene Haas, President, IPG, Arm. “For Arm, this achievement showcases the power efficiency, performance and scalability of our compute platform, which spans from smartphones to the world’s fastest supercomputer. We congratulate RIKEN and Fujitsu Limited for challenging the status quo and showing the world what is possible in Arm-based high-performance computing.”

Fujitsu High Performance CPU for the Post K Computer

The most important thing you need to understand about the role Arm processor architecture plays in any computing or communications market — smartphones, personal computers, servers, or otherwise — is this: Arm Holdings, Ltd., which is based in Cambridge, UK, designs the components of processors for others to build. Arm owns these designs, along with the architecture of their instruction sets, such as 64-bit ARM64. Its business model is to license the intellectual property (IP) for these components and the instruction set to other companies, enabling them to build systems around them that incorporate their own designs as well as Arm's. For its customers who build systems around these chips, Arm has done the hard part for them.

Arm Holdings, Ltd. does not manufacture its own chips. It has no fabrication facilities of its own. Instead, it licenses these rights to other companies, which Arm Holdings calls "partners." They utilize Arm's architectural model as a kind of template, building systems that use Arm cores as their central processors.

A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exascale Performance
 

These Arm partners are allowed to design the rest of their own systems, perhaps manufacture those systems -- or outsource their production to others -- and then sell them as their own. Many Samsung and Apple smartphones and tablets, and essentially all devices produced by Qualcomm, utilize some Arm intellectual property. A new wave of servers produced with Arm-based systems-on-a-chip (SoC) has already made headway in competing against x86, especially with low-power or special-use models. Each device incorporating an Arm processor tends to be its own unique system, like the multi-part Qualcomm Snapdragon 845 mobile processor depicted above. (Qualcomm announced its 865 Plus 5G mobile platform in early July.)

Last August, Arm announced it had signed a partnership agreement with the US Defense Dept.'s DARPA agency, giving Pentagon research teams access to Arm's technology portfolio for research purposes.

Arm processors: Everything you need to know

CPU? GPU? This new ARM chip is BOTH

WHAT WILL NVIDIA'S ROLE BE IN OPERATING ARM?

On September 13, Nvidia announced a deal to acquire Arm Holdings, Ltd. from its parent company, Tokyo-based SoftBank Group Corp., in a cash and stock exchange valued at $40 billion. The deal is pending regulatory review in the European Union, United States, Japan, and China, in separate processes that could take as long as 18 months to conclude.

In a September 14 press conference, Nvidia CEO Jensen Huang told reporters his intention is to maintain Arm's current business model, without influencing its current mix of partners. However, Huang also stated his intention to "add" access to Nvidia's GPU technology to Arm's portfolio of IP offered to partners, giving Arm licensees access to Nvidia designs. What was unclear at the time the deal was announced is what a prospective partner would want with a GPU design, besides the opportunity to compete against Nvidia.

Arm designs are created with the intention of being mixed-and-matched in various configurations, depending on the unique needs of its partners. The Arm Foundry program is a partnership between Arm Holdings and fabricators of semiconductors, such as Taiwan-based TSMC and US-based Intel, giving licensees multiple options for producing systems that incorporate Arm technology.  (Prior to the September announcement, when Arm was considered for sale, rumored potential buyers included TSMC and Samsung.)  By comparison, Nvidia produces exclusive GPU designs, with the intention of being exclusively produced at a foundry of its choosing — originally IBM, then largely TSMC, and most recently Samsung. Nvidia's designs are expressly intended for these particular foundries — for instance, to take advantage of Samsung's Extreme Ultra-Violet (EUV) lithography process.


Arm processors: Everything you need to know

After a colossal $40 billion deal with GPU maker Nvidia closes in 2021 or early 2022, there’s a good chance Arm’s intellectual property may be part of every widely distributed processor that is not x86.

 

HOW IS ARM DIFFERENT FROM X86 CPUS OR NVIDIA GPUS?

An x86-based PC or server is built to some common set of specifications for performance and compatibility. Such a PC isn't so much designed as assembled. This keeps costs low for hardware vendors, but it also relegates most of the innovation and feature-level premiums to software, and perhaps a few nuances of implementation. The x86 device ecosystem is populated by interchangeable parts, at least insofar as architecture is concerned (granted, AMD and Intel processors have not been socket-compatible for quite some time). 

The Arm ecosystem is populated by some of the same components, such as memory, storage, and interfaces, but otherwise by complete systems designed and optimized for the components they utilize.

This does not necessarily give Arm devices, appliances, or servers any automatic advantage over Intel and AMD. Intel and x86 have been dominant in the computing processor space for the better part of four decades, and Arm chips have existed in one form or another for nearly all of that time -- since 1985. Arm's entire history has been about finding success in markets that x86 technology had not fully exploited, in which x86 was showing weakness, or in which x86 simply could not be adapted.

In tablet computers, more recently in data center servers, and soon once again in desktop and laptop computers, the vendor of an Arm-based device or system is no longer relegated to being simply an assembler of parts. This makes any direct, unit-to-unit comparison of Arm vs. x86 processor components somewhat frivolous, as a device or system based on one could easily and consistently outperform the other, based on how that system was designed, assembled, and even packaged.

The class of processor now known as GPU originated as a graphics co-processor for PCs, and is still prominently used for that purpose. However, largely due to the influence of Nvidia in the artificial intelligence space, the GPU has come to be regarded as one class of general-purpose accelerator, as well as a principal computing component in supercomputers — coupled with, rather than subordinate to, the CPU. The GPU's strong suit is its ability to execute many clusters of instructions, or threads, in parallel, greatly accelerating many scientific and mathematical tasks.

By definition and by design, an Arm processor is not a GPU, though a system could be constructed using both. Last November, Nvidia announced its introduction of a reference platform enabling systems architects to couple Arm-based server designs with Nvidia GPU accelerators.

The Tofu Interconnect D

WHAT'S THE RELATIONSHIP BETWEEN ARM AND APPLE?


Apple CEO Tim Cook announces his company's chip manufacturing unit at WWDC 2020.

Apple Silicon is the phrase Apple presently uses to describe its own processor production, beginning last June with Apple's announcement of the replacement of its x86 Mac processor line. In its place, in Mac laptop units that are reportedly already shipping, will be a new system-on-a-chip called A12Z, code-named "Bionic," produced by Apple using the 64-bit instruction set licensed to it by Arm Holdings. In this case, Arm isn't the designer, but the licensor of the instruction set around which Apple makes its original design. Apple is widely expected to choose TSMC as the fabricator for its A12Z.

For MacOS 11 to continue to run software compiled for Intel processors, the new Apple system will run a kind of "just-in-time" instruction translator called Rosetta 2. Rather than run an old MacOS image in a virtual machine, the new OS will run a live x86 machine code translator that re-fashions x86 code into what Apple now calls Universal 2 binary code -- an intermediate-level code that can still be made to run on older Intel-based Macs -- in real-time. That code will run in what sources outside of Apple call an "emulator," but which isn't really an emulator in that it doesn't simulate the execution of code in an actual, physical machine (there is no "Universal 2" chip).

The first results of independent performance benchmarks comparing an iPad Pro using the A12Z chip planned for the first Arm-based Macs, against Microsoft Surface models, looked promising. Geekbench results give the Bionic-powered tablet a multi-core processing score of 4669 (higher is better), versus 2966 for the Arm-based Surface Pro X, and 3033 for the Core i5-powered Surface Pro 6.

Apple's newly claimed ability to produce its own SoC for Mac, just as it does for iPhone and iPad, could save the company over time as much as 60 percent on production costs, according to its own estimates. Of course, Apple is typically tight-lipped as to how it arrives at that estimate, and how long such savings will take to be realized.

The relationship between Apple and Arm Holdings dates back to 1990, when Apple Computer UK became a founding co-stakeholder. The other co-partners at that time were the Arm concept's originator, Acorn Computers Ltd. (more about Acorn later) and custom semiconductor maker VLSI Technology (named for the common semiconductor manufacturing process called "very large-scale integration"). Today, Arm Holdings is a wholly-owned subsidiary of SoftBank, which announced its intent to purchase the licensor in July 2016. At the time, the acquisition deal was the largest for a Europe-based technology firm.

WHY X86 IS SOLD AND ARM IS LICENSED

The maker of an Intel- or AMD-based x86 computer neither designs nor owns any portion of the intellectual property for the CPU. It also cannot reproduce x86 IP for its own purposes. "Intel Inside" is a seal certifying a license for the device manufacturer to build a machine around Intel's processor. An Arm-based device, by contrast, may be designed to incorporate the processor, perhaps even making adaptations to its architecture and functionality. For that reason, rather than a "central processing unit" (CPU), an Arm processor is instead called a system-on-a-chip (SoC). Much of the functionality of the device may be fabricated onto the chip itself, cohabiting the die with Arm's exclusive cores, rather than built around the chip in separate processors, accelerators, or expansions.

As a result, a device run by an Arm processor, such as one of the Cortex series, is a different order of machine from one run by an Intel Xeon or an AMD Epyc. It means something quite different to be an original device based around an Arm chip. Most importantly from a manufacturer's perspective, it means a somewhat different, and hopefully more manageable, supply chain. Since Arm has no interest in marketing itself to end-users, you don't typically hear much about "Arm Inside."

Equally important, however, is the fact that an Arm chip is not necessarily a central processor. Depending on the design of its system, it can be the heart of a device controller, a microcontroller (MCU), or some other subordinate component in a system.

Perhaps the best explanation of Arm's business model, as well as its relationship with its own intellectual property, is to be found in a 2002 filing with the US Securities and Exchange Commission:

We take great care to establish and maintain the proprietary integrity of our products. We focus on designing and implementing our products in a "cleanroom" fashion, without the use of intellectual property belonging to other third parties, except under strictly maintained procedures and express license rights. In the event that we discover that a third party has intellectual property protections covering a product that we are interested in developing, we would take steps to either purchase a license to use the technology or work around the technology in developing our own solution so as to avoid infringement of that other company's intellectual property rights. Notwithstanding such efforts, third parties may yet make claims that we have infringed their proprietary rights, which we would defend.


What types of Arm processors are produced today?

To stay competitive, Arm offers a variety of processor core styles or series. Some are marketed for a variety of use cases; others are earmarked for just one or two. It's important to note here that Intel uses the term "microarchitecture," and sometimes by extension "architecture," to refer to the specific stage of evolution of its processors' features and functionality -- for example, its most recently shipped generation of Xeon server processors is a microarchitecture Intel has codenamed Cascade Lake. By comparison, Arm architecture encompasses the entire history of Arm RISC processors. Each iteration of this architecture has been called a variety of things, but most recently a series. All that having been said, Arm processors' instruction sets have evolved at their own pace, with each iteration generally referred to using the same abbreviation Intel uses for x86: ISA. And yes, here the "A" stands for "architecture."

Intel manufactures Celeron, Core, and Xeon processors for very different classes of customers; AMD manufactures Ryzen for desktop and laptop computers, and Epyc for servers. By contrast, Arm produces designs for complete processors that may be utilized by partners as-is, or customized by those partners for their own purposes. Here are the principal Arm Holdings, Ltd. designs at the time of this publication:

  • Cortex-A has been marketed as the workhorse of the Arm family, with the "A" in this instance standing for application. As originally conceived, the client looking to build a system around Cortex-A had a particular application in mind for it, such as a digital audio amplifier, digital video processor, the microcontroller for a fire suppression system, or a sophisticated heart rate monitor. As things turned out, Cortex-A ended up being the heart of two emerging classes of device: Single-board computers capable of being programmed for a variety of applications, such as cash register processing; and most importantly of all, smartphones. Importantly, Cortex-A processors include memory management units (MMU) on-chip. Decades ago, it was the inclusion of the MMU on-chip by Intel's 80286 CPU that changed the game in its competition against Motorola chips, which at that time powered the Macintosh. The principal tool in Cortex-A's arsenal is its advanced single-instruction, multiple-data (SIMD) instruction set, code-named NEON, which executes instructions like accessing memory and processing data in parallel over a larger set of vectors. Imagine pulling into a filling station and loading up with enough fuel for 8 or 16 tanks, and you'll get the basic idea.
  • Cortex-R is a class of processor with a much narrower set of use cases: Mainly microcontroller applications that require real-time processing. One big case-in-point is 4G LTE and 5G modems, where time (or what a music composer might more accurately call "tempo") is a critical factor in achieving modulation. Cortex-R's architecture is tailored in such a way that it responds to interrupts -- the requests for attention that trigger processes to run -- not only quickly but predictably. This enables R to run more consistently and deterministically and is one reason why Arm is promoting its use as a high-capacity storage controller for solid-state flash memory.
  • Cortex-M is a more miniaturized form factor, making it more suitable for tight spaces: For example, automotive control and braking systems, and high-definition digital cameras with image recognition. A principal use for M is as a digital signal processor (DSP), which responds to and manages analog signals for applications such as sound synthesis, voice recognition, and radar. Since 2018, Arm has taken to referring to all its Cortex series collectively under the umbrella term Cosmos.
  • Ethos-N is a series of processors specifically intended for applications that may involve machine learning or some other form of neural network processing. Arm calls this series a neural processor, although it's not quite the same class as Google's tensor processing unit, which Google itself admits is actually a co-processor and not a stand-alone controller [PDF]. Arm's concept of the neural processor includes routines used in drawing logical inferences from data, which are the building blocks of artificial intelligence used in image and pattern recognition, as well as machine learning.
  • Ethos-U is a slimmed-down edition of Ethos-N that is designed to work more like a co-processor, particularly in conjunction with Cortex-A.
  • Neoverse, launched in October 2018, represents a new and more concentrated effort by Arm to design cores that are more applicable in servers and the data centers that host them -- especially the smaller varieties. The term Arm uses in marketing Neoverse is "infrastructure" -- without being too specific, but still targeting the emerging use cases for mini and micro data centers stationed at the "customer edge," closer to where end-users will actually consume processor power.
  • SecurCore is a class of processor designed by Arm exclusively for use in smart card, USB-based certification, and embedded security applications.
The above are all series whose designs are licensed for others to produce processors and microcontrollers. All this being said, Arm also licenses certain custom and semi-custom versions of its architecture exclusively, enabling these clients to build unique processors that are available to no other producer. These special clients include:

  • Apple, which has fabricated for itself a variety of Arm-based designs over the years for iPhone and iPad, and announced last June an entirely new SoC for Mac (see above);
  • Marvell, which acquired chip maker Cavium in November 2017, and has since doubled down on investments in the ThunderX series of processors originally designed for Cavium;
  • Nvidia, which co-designed two processor series with Arm, the most recent of which is called Carmel. Known generally as a GPU producer, Nvidia leverages the Carmel design to produce its 64-bit Tegra Xavier SoC. That chip powers the company's small-form-factor edge computing device, called Jetson AGX Xavier.
  • Samsung, which produces a variety of 32-bit and 64-bit Arm processors for its entire consumer electronics line, under the internal brand Exynos. Some have used a Samsung core design called Mongoose, while most others have utilized versions of Cortex-A. Notably (or perhaps notoriously) Samsung manufactures variations of its Galaxy Note, Galaxy S, and Galaxy A series smartphones with either its own Exynos SoCs (outside the US) or Qualcomm Snapdragons (the US only).
  • Qualcomm, whose most recent Snapdragon SoC models utilize a core design called Kryo, which is a semi-custom variation of Cortex-A. Earlier Snapdragon models were based on a core design called Krait, which was still officially an Arm-based SoC even though it was a purely Qualcomm design. Analysts estimate Snapdragon 855, 855 Plus, and 865 together to comprise the nucleus of greater than half the world's 5G smartphones. Although Qualcomm did give it a go in November 2017 with producing Arm chips for data center servers, with a product line called Centriq, it began winding down production of that line in December 2018, turning over the rights to continue its production to China-based Huaxintong Semiconductor (HXT), at the time a joint venture partner. That partnership was terminated the following April.
  • Ampere Computing, a startup launched by ex-Intel president Renee James, produces a super-high core-count server processor line called Altra. The 128-core Altra Max edition will begin sampling in Q4 2020, notwithstanding the pandemic.

IS A SYSTEM-ON-A-CHIP THE SAME AS A CHIPSET?

Technically speaking, the class of processor to which an Arm chip belongs is an application-specific integrated circuit (ASIC). Consider a hardware platform whose common element is a set of processing cores. That's not too difficult; that describes essentially every device ever manufactured. But miniaturize these components so that they all fit on one die -- on the same physical platform -- interconnected using an exclusive mesh bus, and you have something closer to a system-on-a-chip.

A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exascale Performance 

As you know, for a computer, the application program is rendered as software. In many appliances such as Internet routers, front-door security systems, and "smart" HDTVs, the memory in which operations programs are stored is non-volatile, so we often call it firmware. In a device whose core processor is an ASIC, its main functionality is rendered onto the chip, as a permanent component. So the functionality that makes a device a "system" shares the die with the processor cores, and an Arm chip can have dozens of those.

Some analysis firms have taken to using the broad phrase applications processor, or AP, to refer to ASICs, but this has not caught on generally. In more casual use, an SoC is also called a chipset, even though in recent years, more often than not, the number of chips in the set is just one. In general use, a chipset is a set of one or more processors that collectively function as a complete system. A CPU executes the main program, while a chipset manages attached components and communicates with the user. On a PC motherboard, the chipset is separate from the CPU. On an SoC, the main processor and the system components share the same die.

What makes Arm processor architecture unique?

The "R" in "Arm" actually stands for another acronym: Reduced Instruction Set Computer (RISC). Its purpose is to leverage the efficiency of simplicity, to render all of the processor's functionality on a single chip. Keeping a processor's instruction set small means it can be coded using a fewer number of bits, thus reducing memory consumption as well as execution cycle time. Back in 1982, students at the University of California, Berkeley, were able to produce the first working RISC architectures by judiciously selecting which functions would be used most often, and rendering only those in hardware -- with the remaining functions rendered as software. Indeed, that's what makes an SoC with a set of small cores feasible: Relegating as much functionality to software as possible.

Retroactively, architectures such as x86, which adopted strategies quite opposite to RISC, were dubbed Complex Instruction Set Computers (CISC), although Intel has historically avoided using that term for itself. The power of x86 comes from being able to accomplish so much with just a single instruction. For instance, with Intel's vector processing, it's possible to execute 16 single-precision math operations, or 8 double-precision operations, simultaneously; here, the vector acts as a kind of "skewer," if you will, poking through all the operands in a parallel operation and racking them up.

That makes complex math easier, at least conceptually. With a RISC system, math operations are decomposed into fundamentals. Everything that would happen automatically with a CISC architecture -- for example, clearing up the active registers when a process is completed -- takes a full, recorded step with RISC. However, because fewer bits (binary digits) are required to encapsulate the entire RISC instruction set, it may end up taking about as many bits in the end to encode a sequence of fundamental operations in a RISC processor -- perhaps even fewer -- than a complex CISC instruction where all the properties and arguments are piled together in a big clump.
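To make that contrast concrete, here is a small Python sketch (an analogy using NumPy, not actual processor code): the single vectorized expression plays the role of one wide CISC-style vector instruction covering 16 single-precision operands at once, while the explicit loop plays the role of a RISC-style sequence of simple, fundamental steps that arrive at the same result.

```python
import numpy as np

# 16 single-precision operands, as in the vector-width example above
a = np.arange(16, dtype=np.float32)
b = np.ones(16, dtype=np.float32)

# "CISC/vector" style: one wide operation covers all 16 lanes at once
wide_result = a * b + 2.0

# "RISC" style: the same work decomposed into simple, explicit steps
narrow_result = np.empty_like(a)
for i in range(16):
    t = a[i] * b[i]              # one fundamental multiply
    narrow_result[i] = t + 2.0   # one fundamental add

assert np.allclose(wide_result, narrow_result)
```

Both paths compute the same values; the difference, as described above, is how much work each individual instruction is asked to carry.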

Intel can demonstrate, and has demonstrated, very complex instructions with higher performance statistics than the equivalent sequences on Arm processors, or other RISC chips. But sometimes such performance gains come at an overall performance cost for the rest of the system, making RISC architectures somewhat more efficient than CISC at general-purpose tasks.

Then there's the issue of customization. Intel enhances its more premium CPUs with functionality by way of programs that would normally be rendered as software, but are instead embedded as microcode. These are routines designed to be quickly executed at the machine code level, and that can be referenced by that code indirectly, by name. This way, for example, a program that needs to invoke a common method for decrypting messages on a network can address very fast processor code, very close to where that code will be executed. (Conveniently, many of the routines that end up in microcode are the ones often employed in performance benchmarks.) These microcode routines are stored in read-only memory (ROM) near the x86 cores.

An Arm processor, by contrast, does not use digital microcode in its on-die memory. The current implementation of Arm's alternative is a concept called custom instructions [PDF]. It enables the inclusion of completely client-customizable, on-die modules, whose logic is effectively "pre-decoded." All the program has to do to invoke this logic is cue up a dependent instruction for the processor core, which passes control to the custom module as though it were another arithmetic logic unit (ALU). Arm asks its partners who want to implement custom modules to present it with a configuration file, and map out the custom data path from the core to the custom ALU. Using just these items, the core can determine the dependencies and instruction interlocking mechanisms for itself.

This is how an Arm partner builds up an exclusive design for itself, using Arm cores as its starting ingredients.
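The distinction between Intel-style microcode routines and Arm-style custom instruction modules can be caricatured in ordinary software. The sketch below is purely a toy model in Python, and nothing in it comes from Arm's actual specification: built-in operations live in a fixed table (the microcode analogy), while a partner registers its own "pre-decoded" module that the core then dispatches to as though it were another ALU.

```python
# Toy model only: illustrates the dispatch idea described above,
# not Arm's actual custom-instruction interface.

BUILT_IN = {
    "ADD": lambda a, b: a + b,   # stands in for a hard-wired ALU operation
    "MUL": lambda a, b: a * b,
}

CUSTOM = {}  # partner-supplied, "pre-decoded" modules

def register_custom(name, fn):
    """A partner maps a custom data path to a named module."""
    CUSTOM[name] = fn

def execute(op, a, b):
    """The 'core' dispatches to a built-in unit or a custom module."""
    if op in BUILT_IN:
        return BUILT_IN[op](a, b)
    if op in CUSTOM:
        return CUSTOM[op](a, b)
    raise ValueError(f"unknown instruction: {op}")

# A hypothetical partner adds a fused multiply-accumulate-style module
register_custom("MACC3", lambda a, b: a * b + 3)

print(execute("ADD", 2, 5))    # 7, built-in path
print(execute("MACC3", 2, 5))  # 13, custom module path
```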

Although Arm did not create the concept of RISC, it had a great deal to do with realizing the concept, and making it publicly available. One branch of the original Berkeley architecture to have emerged in its own right is RISC-V, whose core specification was made open source under the Creative Commons 4.0 license. Nvidia, along with Qualcomm, Samsung, Huawei, and Micron Technology, among others, was a founding member of the RISC-V Foundation. When asked, Nvidia CEO Jensen Huang indicated he intends for his company to continue contributing to the RISC-V effort, maintaining that its ecosystem is naturally separate from that of Arm.

The rising prospects for Arm in servers

RIKEN Center for Computational Science
Just last month, a Fujitsu Arm-powered supercomputer named Fugaku, built for Japan's RIKEN Center for Computational Science, seized the #1 spot on the semi-annual Top 500 Supercomputer list.

But of all the differences between an x86 CPU and an Arm SoC, this may be the only one that matters to a data center's facilities manager: Given any pair of samples of both classes of processor, it's the Arm chip that is less likely to require an active cooling system. Put another way, if you open up your smartphone, chances are you won't find a fan. Or a liquid cooling apparatus.

The buildout of 5G wireless technology is, ironically enough, driving the expansion of fiber optic connectivity to locations near the "customer edge" -- the furthest point from the network operations center. This opens up the opportunity to station edge computing devices and servers at or near such points, but preferably without the heat exchanger units that typically accompany racks of x86 servers.

Bamboo Systems

This is where startups such as Bamboo Systems come in. Radical reductions in the size and power requirements for cooling systems enable server designers to devise new ways to think "out-of-the-box" -- for instance, by shrinking the box. A Bamboo server node is a card not much larger than the span of most folks' hands, eight of which may be securely installed in a 1U box that typically supports 1, maybe 2, x86 servers. Bamboo aims to produce servers, the company says, that use as little as one-fifth the rack space and consume one-fourth the power of x86 racks with comparable performance levels.

Where did Arm processors come from?
An Acorn. Indeed, that's what the "A" originally stood for.

Back in 1981, a Cambridge, UK-based company called Acorn Computers was marketing a microcomputer (what we used to call "PCs" back before IBM popularized the term) based on MOS Technology's 6502 processor -- which had powered the venerable Apple II, the Commodore 64, and the Atari 400 and 800. Although the name "Acorn" was a clever trick to appear earlier on an alphabetized list than "Apple," its computer had been partly subsidized by the BBC and was thus known nationwide as the BBC Micro.

All 6502-based machines used 8-bit processor architecture, and in 1981, Intel was working towards a fully compatible 16-bit architecture to replace the 8088 used in the IBM PC and PC/XT. The following year, Intel's 80286 would enable IBM to produce its PC AT so that MS-DOS, and all the software that ran on DOS, would not have to be changed or recompiled to run on 16-bit architecture. It was a tremendous success, and Motorola could not match it. Although Apple's first Macintosh was based on the 16-bit Motorola 68000 series, its architecture was only "inspired" by the earlier 8-bit design, not compatible with it. (Eventually, Apple would produce a 16-bit Apple IIGS based on the 65C816 processor, but only after several months waiting for the makers of the 65816 to ship a working test model. The IIGS did have an "Apple II" step-down mode but technically not full compatibility.)

Acorn's engineers wanted a way forward, and Motorola was leaving them at a dead end. After experimenting with a surprisingly fast co-processor for the 6502 called Tube that just wasn't fast enough, they opted to take the plunge with a full 32-bit pipeline. Following the lead of the Berkeley RISC project, in 1983, they built a simulator for a processor they called Arm1 that was so simple, it ran on the BASIC language interpreter of the BBC Micro (albeit not at speed). Collaborating with VLSI, they produced their first working Arm1 model two years later, with a 6 MHz clock speed. It utilized so little power that, as one project engineer tells the story, one day they noticed the chip was running without its power supply connected. It was actually being powered by leakage from the power rails leading to the I/O chip.

At this early stage, the Arm1, Arm2, and Arm3 processors were all technically CPUs, not SoCs. Yet in the same sense that today's Intel Core processors are architectural successors of its original 4004, Cortex-A is the architectural successor to Arm1.

More Information:



Supercomputer Fugaku Documents

Microsoft Topological Quantum Computing



Microsoft Topological Quantum Computing Approach




For faster quantum computing, Microsoft builds a better qubit


Microsoft's new approach to quantum computing is "very close," an executive says.

Mostly borrowed & updated from Steve Lamb in Microsoft Land….

Google just announced quantum supremacy, a milestone in which the radically different nature of a quantum computer lets it vastly outpace a traditional machine. But Microsoft expects progress of its own by redesigning the core element of quantum computing, the qubit.

Microsoft has been working on a qubit technology called a topological qubit that it expects will deliver benefits from quantum computing technology that today are mostly just a promise. After spending five years figuring out the complicated hardware of topological qubits, the company is almost ready to put them to use, said Krysta Svore, general manager of Microsoft's quantum computing software work.

"We've really spent the recent few years developing that technology," Svore said Thursday after a talk at the IEEE International Conference on Rebooting Computing. "We believe we're very close to having that."

Quantum computers are hard to understand, hard to build, hard to operate and hard to program. Since they only work when chilled to a tiny fraction of a degree above absolute zero -- colder than outer space -- you're not likely to have a quantum laptop anytime soon.

But running them in data centers where customers can tap into them could deliver profound benefits by tackling computing challenges that classical computers can't handle. Among examples Svore offered are solving chemistry problems like making fertilizer more efficiently, or routing trucks to speed deliveries and cut traffic.
Better qubits

Classical computers store data as a bit that represents either a 0 or a 1. Qubits, though, can store a combination of 0 and 1 simultaneously through a peculiar quantum physics principle called superposition. And qubits can be ganged together through another phenomenon called entanglement. Together, the phenomena should enable quantum computers to explore an enormous number of possible solutions to a problem at the same time.
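A short NumPy sketch (my own illustration, not Microsoft code) makes these two ideas concrete: a Hadamard gate puts one qubit into an equal superposition of 0 and 1, and a CNOT gate then entangles it with a second qubit, producing a Bell state in which the two measurement outcomes are perfectly correlated.

```python
import numpy as np

# Single-qubit basis state |0>
zero = np.array([1, 0], dtype=complex)

# Gates
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Superposition: H|0> = (|0> + |1>) / sqrt(2)
superposed = H @ zero

# Entanglement: CNOT on (H|0>) tensor |0> gives the Bell state (|00> + |11>) / sqrt(2)
state = CNOT @ np.kron(superposed, zero)

probs = np.abs(state) ** 2
print(dict(zip(["00", "01", "10", "11"], probs.round(3))))
# {'00': 0.5, '01': 0.0, '10': 0.0, '11': 0.5}
```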


Quantum Computing 101



One of the basic quantum computing problems is that qubits are easily perturbed. That's why the heart of a quantum computer is housed in a refrigerated container the size of a 55-gallon drum.

Even with that isolation, though, individual qubits today can only perform useful work for a fraction of a second. To compensate, quantum computer designers plan technology called error correction that yokes many qubits together into a single effective qubit, called a logical qubit. The idea is that logical qubits can continue to perform useful processing work even when many of their underlying physical qubits have gone astray.
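As a loose classical analogy (not a real quantum error-correcting code), the sketch below groups several noisy "physical" bits into one "logical" bit by majority vote. The logical bit survives even when individual copies flip, which is the intuition behind yoking many physical qubits into a logical qubit.

```python
import random

def logical_readout(value, n_physical=7, flip_prob=0.1):
    """Encode one logical bit into n_physical copies, flip each with
    some probability (the 'noise'), then recover it by majority vote."""
    noisy = [value ^ (random.random() < flip_prob) for _ in range(n_physical)]
    return int(sum(noisy) > n_physical / 2)

random.seed(0)
trials = 10_000
errors = sum(logical_readout(1) != 1 for _ in range(trials))
print(f"logical error rate: {errors / trials:.4f}  (physical error rate: 0.1)")
```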

The key advantage of Microsoft's topological qubit is that fewer physical qubits are needed to make one logical qubit, Svore said.

Specifically, she thinks one logical qubit will require 10 to 100 physical qubits with Microsoft's topological qubits. That compares to something like 1,000 to 20,000 physical qubits for other approaches.

"We believe that overhead will be far less," she said. That'll mean quantum computers will become practical with far fewer qubits.

By comparison, Google's Sycamore quantum computing chip used 53 physical qubits. For serious quantum computing work, researchers are hoping to reach qubit levels of at least a million.
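Taking the figures quoted above at face value, a quick back-of-the-envelope calculation (my arithmetic, not Microsoft's or Google's) shows what those ratios would mean at the million-physical-qubit scale researchers are aiming for:

```python
physical_budget = 1_000_000  # the scale researchers are hoping to reach

ratios = {
    "topological qubits (10-100 physical per logical)": (10, 100),
    "other approaches (1,000-20,000 physical per logical)": (1_000, 20_000),
}

for label, (low, high) in ratios.items():
    best = physical_budget // low
    worst = physical_budget // high
    print(f"{label}: roughly {worst:,} to {best:,} logical qubits")
```

Under these assumptions, a million physical qubits yields tens of thousands of logical qubits with the topological approach, versus as few as a several dozen with the heaviest error-correction overheads.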


Topological quantum computing with Majorana Fermions


One drawback of Microsoft's topological qubits, though, is that they're not available yet. Alternative designs might not work as well, but they're in real-world testing today.

Better quantum computing algorithms

Microsoft is also trying to improve other aspects of quantum computing. One is the control system, which in today's quantum computers is a snarl of hundreds of wires, each an expensive coaxial cable used to communicate with qubits.

On Monday at Microsoft's Ignite conference, the company also showed off a new quantum computer control system developed with the University of Sydney that uses many fewer wires -- down from 216 to just three, Svore said. "We think this will scale to tens of thousands of qubits and beyond."

And Svore pushed for progress on quantum computing software, too, urging professors to introduce their students to learning and improving quantum computing algorithms.

In one example of those benefits, Microsoft tackled an aspect of that nitrogen-fixing fertilizer problem that simply couldn't be solved on a classical machine -- but found that a quantum computer would still take 30,000 years.

That's faster than a classical computer that would require "the lifetime of the universe," but still not practical, she said. But with algorithm improvements, Microsoft found a way to shorten that to just a day and a half.
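For a sense of scale, the claimed improvement from 30,000 years to a day and a half corresponds to a speedup factor of roughly seven million (a quick calculation on the numbers quoted above, nothing more):

```python
seconds_per_year = 365.25 * 24 * 3600
before = 30_000 * seconds_per_year   # about 30,000 years
after = 1.5 * 24 * 3600              # a day and a half

print(f"speedup factor: {before / after:,.0f}x")  # roughly 7,300,000x
```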


Non-Abelian Anyons & Topological Quantum Computation ESE 523


"New algorithms can be a breakthrough in how to solve something," Svore said. "We need to make them better, we need to optimize them, we need to be pushing."

Developing a Topological Qubit

As quantum technologies advance, we get closer to finding solutions to some of the world’s most challenging problems. While this new paradigm holds incredible possibility, quantum computing is very much in its infancy. To fully embrace the power and potential of quantum computing, the system must be engineered to meet the demands of the solutions the world needs most.

The fragile nature of qubits is well-known as one of the most significant hurdles in quantum computing. Even the slightest interference can cause qubits to collapse, making the solutions we’re pursuing impossible to identify because the computations cannot be completed.

Microsoft is addressing this challenge by developing a topological qubit. Topological qubits are protected from noise due to their values existing at two distinct points, making our quantum computer more robust against outside interference. This increased stability will help the quantum computer scale to complete longer, more complex computations, bringing the solutions we need within reach. 

Quantum computing explained to my (Schrödinger) cat - Alessandro Vozza - Codemotion Amsterdam 2018

Topology and quantum computing

Topology is a branch of mathematics describing structures that experience physical changes such as being bent, twisted, compacted, or stretched, yet still maintain the properties of the original form. When applied to quantum computing, topological properties create a level of protection that helps a qubit retain information despite what’s happening in the environment. The topological qubit achieves this extra protection in two different ways: 

  • Electron fractionalization. By splitting the electron, quantum information is stored in both halves, behaving similarly to data redundancy. If one half of the electron runs into interference, there is still enough information stored in the other half to allow the computation to continue.
  • Ground state degeneracy. Topological qubits are engineered to have two ground states—known as ground state degeneracy—making them much more resistant to environmental noise. Normally, achieving this protection isn't feasible because there's no way to discriminate between the two ground states. However, topological systems can use braiding or measurement to distinguish the difference, allowing them to achieve this additional protection.

From Bohr’s Atom to the Topological Quantum Computer | Charles Marcus

The path to the topological qubit

Now years into the development of the topological qubit, the team can trace the journey back to a single question: "Could a topological qubit be achieved?" Working with theory as a starting point, Microsoft brought together mathematicians, computer scientists, physicists, and engineers to explore possible approaches. These experts collaborated, discussed methods, and completed countless equations to take the first steps on the path toward realizing a topological qubit.

Modeling and experimentation work hand-in-hand as an ongoing, iterative cycle, guiding the design of the topological qubit. Throughout this process, the Microsoft team explored possible materials, ways to apply control structure, and methods to stabilize the topological qubit.

A team member proposed the use of a superconductor in conjunction with a strong magnetic field to create a topological phase of matter—an approach that has been adopted toward realizing the topological qubit. While bridging these properties had long been discussed in theory, it had never been done in such a controlled way prior to this work.

To create the exact surface layer needed for the qubit, chemical compounds are currently being grown in Microsoft labs using a technique called “selective area growth.” Chosen for its atomic-level precision, this unique method can be described as spraying atoms in the exact arrangement needed to achieve the properties required.


Topological Quantum Computing




The team continues testing functional accuracy through device simulation, to ensure that every qubit will be properly tuned, characterized, and validated.

Bridging fields to advance technology

Many fields of knowledge have come together to realize the topological qubit, including mathematics, theoretical physics, solid state physics, materials science, instrumentation and measurement technology, computer science, quantum algorithms, quantum error correction, and software applications development.

Bridging these fields has led to breakthrough techniques across all aspects of realizing a topological qubit, including:

  • Theory and simulation – Turning a vision into reality by creating a rapid design, simulation, and prototyping process
  • Fabrication – Pioneering unique fabrication approaches and finding new ways to bridge properties
  • Materials growth – Developing inventive methods to create materials using special growth techniques to create the exact properties required at nanoscale
  • Measurement and quantum control – Tuning devices for accuracy in function and measurement
At Microsoft, the development of the topological qubit continues, bringing us closer to scalable quantum computing and finding solutions to some of the world’s most challenging problems.


Introduction to Topological Quantum Computing

A complete quantum system from hardware to software

The process of building a quantum computer includes creating the raw materials needed to make topological quantum devices, fabricating the cold electronics and refrigeration systems, and developing the overall infrastructure needed to bring the solution to life. In addition, our system includes everything you need to program the quantum computer, including a control system, software, development tools, and Azure services—a combination we refer to as our full quantum stack.

Because quantum and classical work together, Microsoft Azure is a perfect environment for quantum processing and deployment. With data stored in Azure, developers will be able to access quantum processing alongside classical processing, creating a streamlined experience.

Using the complete Microsoft quantum system, what would the start-to-finish experience look like?

Beginning with a problem you may be able to solve with a quantum algorithm…

You would start by building your solution in Visual Studio, using the tools found in the Microsoft Quantum Development Kit.

Using Q#, a language created specifically for quantum development, you would write the code for your solution with the help of the extensive Microsoft quantum libraries.

When your code is complete, you would run a quantum simulation to check for bugs and validate that your solution is ready for deployment.

Once validated, you would be ready to run your solution on the quantum computer.
Your quantum solution would be deployed from within Microsoft Azure, using the quantum computer as a co-processor. As many scenarios will use both quantum and classical processing, Azure will streamline workflows as real-time or batch applications, later connecting results directly into your business processes.

Together, this full quantum stack pairs with familiar tools to create an integrated, streamlined environment for quantum processing.


Scalability, from top to bottom

Quantum computers can help address some of the world’s toughest problems, provided the quantum computer has enough high-quality qubits to find the solution. While the quantum systems of today may be able to add a high number of qubits, the quality of the qubits is the key factor in creating useful scale.  From the cooling system to qubits to algorithms, scalability is a fundamental part of the Microsoft vision for quantum computing.

The topological qubit is a key ingredient in our scalable quantum system. Different from traditional qubits, a topological qubit is built in a way that automatically protects the information it holds and processes. Due to the fragile nature of conventional qubits, this protection offers a landmark improvement in performance, providing added stability and requiring fewer qubits overall. This critical benefit makes the ability to scale possible.

Microsoft has been working on scalable quantum computing for nearly two decades, creating its first quantum computing group—known as Station Q—in 2006. Investing in scalable quantum computing for over a decade, we have connected some of the brightest minds in the industry and academia to make this dream a reality. Blending physics, mathematics, engineering, and computer science, teams around the globe work daily to advance the development of the topological qubit and the Microsoft vision for quantum computing.

Empowering the quantum revolution

At Microsoft, we envision a future where quantum computing is available to a broad audience, scaling as needed to solve some of the world’s toughest challenges. Our quantum approach begins within familiar tools you know and use such as Visual Studio. It provides development resources to build and simulate your quantum solutions. And it continues with deployment through Azure for a streamlined combination of both quantum and classical processing.

As the path to build a quantum computer continues, challenges from across industries await solutions from this new computational power. One of the many examples of high-impact problems that can be solved on a quantum computer is developing a new alternative to fertilizer production. Making fertilizer requires a notable percentage of the world’s annual production of natural gas. This implies high cost, high energy waste, and substantial greenhouse emissions. Quantum computers can help identify a new alternative by analyzing nitrogenase, an enzyme in plants that converts nitrogen to ammonia naturally. To address this problem, a quantum computer would require at least 200 fault-free qubits—far beyond the small quantum systems of today. In order to find a solution, quantum computers must scale up. The challenge, however, is that scaling a quantum computer isn’t merely as simple as adding more qubits.

Building a quantum computer differs greatly from building a classical computer. The underlying physics, the operating environment, and the engineering each pose their own obstacles. With so many unique challenges, how can a quantum computer scale in a way that makes it possible to solve some of the world’s most challenging problems?

Introduction to Quantum Computer

Navigating obstacles

Most quantum computers require temperatures colder than those found in deep space. To reach these temperatures, all the components and hardware are contained within a dilution refrigerator—highly specialized equipment that cools the qubits to just above absolute zero. Because standard electronics don’t work at these temperatures, a majority of quantum computers today use room-temperature control. With this method, controls on the outside of the refrigerator send signals through cables, communicating with the qubits inside. The challenge is that this method ultimately reaches a roadblock: the heat created by the sheer number of cables limits the output of signals, restraining the number of qubits that can be added.

As more control electronics are added, more effort is needed to maintain the very low temperature the system requires. Increasing both the size of the refrigerator and the cooling capacity is a potential option; however, this would require additional logistics to interface with the room temperature electronics, which may not be a feasible approach.

Another alternative would be to break the system into separate refrigerators. Unfortunately, this isn’t ideal either because the transfer of quantum data between the refrigerators is likely to be slow and inefficient.

At this stage in the development of quantum computers, size is therefore limited by the cooling capacity of the specialized refrigerator. Given these parameters, the electronics controlling the qubits must be as efficient as possible.

Physical qubits, logical qubits, and the role of error correction

By nature, qubits are fragile. They require a precise environment and state to operate correctly, and they’re highly prone to outside interference. This interference is referred to as ‘noise’, which is a consistent challenge and a well-known reality of quantum computing. As a result, error correction plays a significant role.

As a computation begins, the initial set of qubits in the quantum computer are referred to as ‘physical qubits’. Error correction works by grouping many of these fragile physical qubits, which creates a smaller number of usable qubits that can remain immune to noise long enough to complete the computation. These stronger, more stable qubits used in the computation are referred to as ‘logical qubits’.

In classical computing, noisy bits are fixed through duplication (parity and Hamming codes), which is a way to correct errors as they occur. A similar process occurs in quantum computing, but is more difficult to achieve. This results in significantly more physical qubits than the number of logical qubits required for the computation. The ratio of physical to logical qubits is influenced by two factors: 1) the type of qubits used in the quantum computer, and 2) the overall size of the quantum computation performed. And due to the known difficulty of scaling the system size, reducing the ratio of physical to logical qubits is critical. This means that instead of just aiming for more qubits, it is crucial to aim for better qubits.
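For readers unfamiliar with the classical side of this comparison, here is a minimal Hamming(7,4) sketch in Python: four data bits are protected by three parity bits, and a single flipped bit can be located and corrected from the parity checks. This is the classical idea that quantum codes generalize; it is an illustration only, not a quantum error-correcting code.

```python
import numpy as np

# Hamming(7,4): columns of H are the binary encodings of positions 1..7,
# so the syndrome of a single bit flip spells out the flipped position.
H = np.array([[(pos >> k) & 1 for pos in range(1, 8)] for k in range(3)])

def encode(data4):
    code = np.zeros(7, dtype=int)
    code[[2, 4, 5, 6]] = data4        # data bits at positions 3, 5, 6, 7
    syndrome = H @ code % 2
    code[[0, 1, 3]] = syndrome        # parity bits at positions 1, 2, 4
    return code

def correct(received):
    syndrome = H @ received % 2
    pos = int(syndrome[0] + 2 * syndrome[1] + 4 * syndrome[2])
    if pos:                           # non-zero syndrome: flip that position
        received = received.copy()
        received[pos - 1] ^= 1
    return received

data = np.array([1, 0, 1, 1])
sent = encode(data)
noisy = sent.copy()
noisy[5] ^= 1                         # flip one bit "in transit"
print("corrected matches sent:", np.array_equal(correct(noisy), sent))  # True
```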

Stability and Scale with a Topological Qubit

The topological qubit is a type of qubit that offers more immunity to noise than many traditional types of qubits. Topological qubits are more robust against outside interference, meaning fewer total physical qubits are needed when compared to other quantum systems. With this improved performance, the ratio of physical to logical qubits is reduced, which in turn, creates the ability to scale.

As we know from Schrödinger’s cat, outside interactions can destroy quantum information. Any interaction from a stray particle, such as an electron, a photon, a cosmic ray, etc., can cause the quantum computer to decohere.

There is a way to prevent this: parts of the electron can be separated, creating an increased level of protection for the information stored. This is a form of topological protection known as a Majorana quasi-particle. The Majorana quasi-particle was predicted in 1937 and was detected for the first time in the Microsoft Quantum lab in the Netherlands in 2012. This separation of the quantum information creates a stable, robust building block for a qubit. The topological qubit provides a better foundation with lower error rates, reducing the ratio of physical to logical qubits. With this reduced ratio, more logical qubits are able to fit inside the refrigerator, creating the ability to scale.

If topological qubits were used in the example of nitrogenase simulation, the required 200 logical qubits would be built out of thousands of physical qubits. However, if more traditional types of qubits were used, tens or even hundreds of thousands of physical qubits would be needed to achieve 200 logical qubits. The topological qubit’s improved performance causes this dramatic difference; fewer physical qubits are needed to achieve the logical qubits required.

Developing a topological qubit is extremely challenging and is still underway, but these benefits make the pursuit well worth the effort.

A solid foundation to tackle problems unsolved by today’s computers
A significant number of logical qubits are required to address some of the important problems currently unsolvable by today’s computers. Yet common approaches to quantum computing require massive numbers of physical qubits in order to reach these quantities of logical qubits—creating a huge roadblock to scalability. Instead, a topological approach to quantum computing requires far fewer physical qubits than other quantum systems, making scalability much more achievable.

Providing a more solid foundation, the topological approach offers robust, stable qubits, and helps to bring the solutions to some of our most challenging problems within reach.

Microsoft's Approach: Topological Systems

At Microsoft Quantum, our ambition is to help solve some of the world’s most complex problems by developing scalable quantum technology. Our global team of researchers, scientists, and engineers are addressing this challenging task by developing a topological qubit.

To realize this vision, our teams have been making advances in materials and device fabrication, designing the precise physical environment required to support the topological state of matter. The latest discovery by the team expands the landscape for creating and controlling the exotic particles critical for enabling topological superconductivity in nanoscale devices.

Discovery: a new route to topology

Our qubit architecture is based on nanowires, which under certain conditions (low-temperature, magnetic field, material choice) can enter a topological state. Topological quantum hardware is intrinsically robust against local sources of noise, making it particularly appealing as we scale up the number of qubits.

An intriguing feature of topological nanowires is that they support Majorana zero modes (MZMs) that are neither fermions nor bosons. Instead, they obey different, more exotic quantum exchange rules. If kept apart and braided around each other, similar to strands of hair, MZMs remember when they encircle each other. Such braiding operations act as quantum gates on a state, allowing for a new kind of computation that relies on the topology of the braiding pattern.
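For readers who want the standard formalism behind "braiding acts as a quantum gate" (this is the textbook result for Majorana zero modes in general, not a detail specific to Microsoft's devices): exchanging two MZMs $\gamma_i$ and $\gamma_j$ acts on the degenerate ground space by the unitary

$$
B_{ij} \;=\; \exp\!\left(\pm\frac{\pi}{4}\,\gamma_i\gamma_j\right) \;=\; \frac{1}{\sqrt{2}}\left(\mathbf{1} \pm \gamma_i\gamma_j\right),
$$

and because these braid operators generally do not commute with one another, the order in which braids are performed matters, which is exactly what lets the braiding pattern encode a computation.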

A topological qubit is constructed by arranging several nanowires hosting MZMs in a comb-like structure and coupling them in a specific way that lets them share multiple MZMs. The first step in building a topological qubit is to reliably establish the topological phase in these nanowires.

While exploring the conditions for the creation of topological superconductivity, the team discovered a topological quantum vortex state in the core of a semiconductor nanowire surrounded on all sides by a superconducting shell. They were very surprised to find Majorana modes in the structure, akin to a topological vortex residing inside of a nanoscale coaxial cable.

With hindsight, the findings can now be understood as a novel topological extension of a 50-year-old piece of physics known as the Little-Parks effect. In the Little-Parks effect, a superconductor in the shape of a cylindrical shell – analogous to a soda straw – adjusts to an external magnetic field threading the cylinder by jumping to a "vortex state" where the quantum wavefunction around the cylinder carries a twist. The quantum wavefunction must close on itself.

Thus, the wavefunction phase accumulated by going around the cylinder must take the values zero, one, two, and so on, in units of 2π. This has been known for decades. What had not been explored in depth was what those twists do to the semiconductor core inside the superconducting shell. The surprising discovery made by the Microsoft team—in both experiment and theory—was that a twist in the shell, under appropriate conditions, can create a topological state in the core, with MZMs localized at the opposite ends.
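The single-valuedness requirement described above can be written compactly (standard Little-Parks physics, included here only for clarity): going once around the cylindrical shell, the superconducting phase $\varphi$ must wind by an integer multiple of $2\pi$,

$$
\oint \nabla\varphi \cdot d\boldsymbol{\ell} \;=\; 2\pi n, \qquad n = 0, 1, 2, \ldots
$$

and the shell jumps between these integer "vortex states" as the magnetic flux threading the cylinder changes; the team's result is that a nonzero winding, under the right conditions, drives the semiconductor core into a topological phase.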

While signatures of Majorana modes have been reported in related systems without the fully surrounding cylindrical shell, these previous realizations placed rather stringent requirements on materials and required large magnetic fields. This discovery places few requirements on materials and needs a smaller magnetic field, expanding the landscape for creating and controlling Majoranas.

Worldwide collaboration

What started as two separate papers – one experimental, the other theoretical – was combined into a single publication that tells the complete story, with mutual support of experiment, theory, and numerics.

Of course, looking back, deep connections to previous ideas and experiments can now be recognized, and results that were first mysterious now seem inevitable. That is the nature of scientific progress: from seemingly impossible to seemingly obvious after a few months of making, measuring, and thinking.

Saulius Vaitiekėnas, then a PhD student and postdoc at the Niels Bohr Institute, University of Copenhagen, and now a newly minted Microsoft researcher, was the main experimentalist. As he comments, “The paper represents a series of surprises. And it was really exciting to see so many different disciplines come together, all in a united activity.”

Roman Lutchyn, Principal Research Manager and lead of the theoretical effort, reflected on the collaboration process. “Microsoft Quantum started with just a small group in Santa Barbara. Now we’ve grown into a much broader organization with labs all around the world – Copenhagen, Delft, Purdue, Sydney, Redmond, among others. I think this paper is a landmark in our partnership between teams and is a model of how we can work effectively together as one team – around the world – on related ideas in physics, ultimately generating new and potentially important results.”

Charles Marcus, Scientific Director of Microsoft Quantum Lab – Copenhagen and lead for the experimental effort, concurs, “[This paper is an example] where two results – from theory and experiment – help each other to make more conclusive statements about physics. Otherwise, we would have been left with more abstract theory; and experimentally, we would have measurements but may have hedged on interpretation. By merging theory and experiment, the overall story is stronger and also more interesting, seeing the connection to related phenomena in different systems.”

Inside Microsoft’s Quest to Make Quantum Computing Scalable

The company’s researchers are building a system that’s unlike any other quantum computer being developed today.


Introduction to topological superconductivity and Majorana fermions




There’s no shortage of big tech companies building quantum computers, but Microsoft claims its approach to manufacturing qubits will make its quantum computing systems more powerful than others’. The company’s researchers are pursuing “topological” qubits, which store data in the path of moving exotic Majorana particles. This is different from storing it in the state of electrons, which is fragile.

That’s according to Krysta Svore, research manager in Microsoft’s Quantum Architectures and Computation group. The Majorana particle paths -- with a fractionalized electron appearing in many places along them -- weave together like a braid, which makes for a much more robust and efficient system, she said in an interview with Data Center Knowledge. These qubits are called “topological cubits,” and the systems are called “topological quantum computers.”

With other approaches, it may take 10,000 physical qubits to create a logical qubit that’s stable enough for useful computation, because the state of the qubits storing the answer to your problem “decoheres” very easily, she said. It’s harder to disrupt an electron that’s been split up along a topological qubit, because the information is stored in more places.

In quantum mechanics, particles are represented by wavefunctions. Coherence is achieved when waves that interfere with each other have the same frequency and constant phase relation. In other words, they don't have to be in phase with each other, but the difference between the phases has to remain constant. If it does not, the particle states are said to decohere.
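In symbols (a standard textbook formulation, added here for clarity): a qubit in the superposition

$$
|\psi\rangle \;=\; \tfrac{1}{\sqrt{2}}\left(|0\rangle + e^{i\phi}\,|1\rangle\right)
$$

remains coherent so long as the relative phase $\phi$ between the two components stays fixed, or at least changes predictably; when interactions with the environment randomize $\phi$, the superposition degrades into a classical mixture and the state has decohered.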
“We’re working on a universally programmable circuit model, so any other circuit-based quantum machine will be able to run the same class of algorithms, but we have a big differentiator,” Svore said. “Because the fidelity of the qubit promises to be several orders of magnitude better, I can run an algorithm that’s several orders of magnitude bigger. If I can run many more operations without decohering, I could run a class of algorithm that in theory would run on other quantum machines but that physically won’t give a good result. Let’s say we’re three orders of magnitude better; then I can run three orders of magnitude more operations in my quantum circuit.”
Theoretically, that could mean a clear advantage of a quantum computer over a classical one. “We can have a much larger circuit which could theoretically be the difference between something that shows quantum advantage or not. And for larger algorithms, where error corrections are required, we need several orders of magnitude less overhead to run that algorithm,” she explained.
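A crude rule of thumb (my own estimate, not Microsoft's model) shows why better fidelity translates so directly into bigger circuits: if each operation fails with probability p, roughly 1/p operations can run before an error is expected, so improving p by three orders of magnitude buys roughly a thousand times more operations.

```python
# Hypothetical per-operation error rates, for illustration only
for error_rate in (1e-3, 1e-6):
    ops_before_expected_error = 1 / error_rate
    print(f"error rate {error_rate:.0e} -> roughly {ops_before_expected_error:,.0f} operations")
```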

A Hardware and Software System that Scales

Microsoft has chosen to focus on topological qubits because the researchers believe it will scale, and the company is also building a complete hardware stack to support the scaling. “We’re building a cryogenic computer to control the topological quantum chip; then we're building a software system where you can compile millions of operations and beyond.”

The algorithms running on the system could be doing things like quantum chemistry – looking for more efficient fertilizer or a room-temperature superconductor – or improving machine learning. Microsoft Research has already shown that deep learning trains faster with a quantum computer. With the same deep learning models in use today, Svore says, the research shows "quadratic speedups" even before you start adding quantum terms to the data model, which seems to improve performance even further.

Redesigning a Programming Language

To get developers used to the rather different style of quantum programming, Microsoft will offer a new set of tools in preview later this year (the toolset doesn't have a name yet). It is a superset built on what the company learned from the academics, researchers, students, and developers who used Liquid (LIQUi|>), an embedded domain-specific language in F# that Microsoft created some years ago.

The language itself has familiar concepts like functions, if statements, variables, and branches, but it also has quantum-specific elements and a growing set of libraries developers can call to help them build quantum apps.

“We’ve almost completely redesigned the language; we will offer all the things Liquid had, but also much more, and it’s not an embedded language. It’s really a domain-specific language designed upfront for scalable quantum computing, and what we’ve tried to do is raise the level of abstraction in this high-level language with the ability to call vast numbers of libraries and subroutines.”

Some of those are low-level subroutines like an adder, a multiplier, and trigonometry functions, but there are also higher-level functions that are commonly used in quantum computing. “Tools like phase estimation, amplitude amplification, amplitude estimation -- these are very common frameworks for your quantum algorithms. They’re the core framework for setting up your algorithm to measure and get the answer out at the end [of the computation], and they’re available in a very pluggable way.”


Quantum computing



A key part of making the language accessible is the way it’s integrated into Visual Studio, Microsoft’s IDE. “I think this is a huge step forward,” Svore told us. “It makes it so much easier to read the code because you get the syntax coloring and the debugging; you can set a breakpoint, you can visualise the quantum state.”

Being able to step through your code to understand how it works is critical to learning a new language or a new style of programming, and quantum computing is a very different style of computing.

“As we’ve learned about quantum algorithms and applications, we’ve put what we’ve learned into libraries to make it easier for a future generation of quantum developers,” Svore said. “Our hope is that as a developer you’re not having to think at the lower level of circuits and probabilities. The ability to use these higher-level constructs is key.”

Hybrid Applications

The new language will also make it easier to develop hybrid applications that use both quantum and classical computing, which Svore predicts will be a common pattern. “With the quantum computer, many of the quantum apps and algorithms are hybrid. You're doing pre and post-processing or in some algorithms you’ll even be doing a very tight loop with a classical supercomputer.”

How Many Qubits Can You Handle?

Microsoft, she says, is making progress with its topological qubits, but, as it’s impossible to put any kind of date on when a working system might emerge from all this work, the company will come out with a quantum simulator to actually run the programs you write, along with the other development tools.

Depending on how powerful your system is, you’ll be able to simulate between 30 and 33 qubits on your own hardware. For 40 qubits and more, you can do the simulation on Azure.

“At 30 qubits, it takes roughly 16GB of classical memory to store that quantum state, and each operation takes a few seconds,” Svore explains. But as you simulate more qubits, you need a lot more resources: adding ten qubits multiplies the memory requirement by 2^10, so 40 qubits need roughly 16TB, and going from 40 to 41 qubits doubles that again. Pretty soon, you’re hitting petabytes of memory. “At 230 qubits, the amount of memory you need is 10^80 bytes, which is more bytes than there are particles in the physical universe, and one operation takes the lifetime of the universe,” Svore said. “But in a quantum computer, that one operation takes 100 nanoseconds.”
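
As a back-of-the-envelope check on those numbers, a few lines of Python reproduce the scaling, assuming (as in the quote) 16 bytes of classical memory per complex amplitude and 2^n amplitudes for n qubits; the helper function name is ours.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to hold the 2**n complex amplitudes of an n-qubit state."""
    return bytes_per_amplitude * 2 ** n_qubits

for n in (30, 40, 41, 50):
    print(f"{n} qubits: {statevector_bytes(n) / 2**30:,.0f} GiB")
# 30 qubits: 16 GiB; 40 qubits: 16,384 GiB (~16 TiB); 41 qubits: 32,768 GiB;
# 50 qubits: 16,777,216 GiB (~16 PiB)
```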

Microsoft’s broad-based quantum effort


LIQUi|> is one of a number of quantum computing projects Microsoft researchers have been spearheading for more than a decade, in the quest to create the next generation of computing that will have a profound effect on society.
In addition to the QuArC research group, Microsoft’s Station Q research lab, led by renowned mathematician Michael Freedman, is pursuing an approach called topological quantum computing that they believe will be more stable than other quantum computing methods.
The idea is to design software, hardware and other elements of quantum computing all at the same time.
“This isn’t just, ‘Make the qubits.’ This is, ‘Make the system,’” Wecker said.
A qubit is a unit of quantum information, and it’s the key building block to a quantum computer. Using qubits, researchers believe that quantum computers could very quickly evaluate multiple solutions to a problem at the same time, rather than sequentially. That would give scientists the ability to do high-speed, complex calculations, allowing biologists, physicists and chemists to get information they never thought possible before.

LIQUiD - Station Q overview

Fertilizer, batteries and climate change

Take fertilizer, for example. Fertilizers are crucial to feeding the world’s growing population because they allow plants to develop better and faster. But synthetic fertilizer relies on natural gas, and lots of it: That’s expensive, depletes an important natural resource and adds to pollution.
Using a quantum computer, Wecker said, scientists think they could map the chemical process used by bacteria that naturally create fertilizer, making it easier to create an alternative to the current natural-gas-based synthetic fertilizer.
The incredible power of quantum computers also could be used to figure out how to create organic batteries that don’t rely on lithium, and Wecker said they could help to create systems for capturing carbon emissions effectively, potentially reducing the effects of climate change.
Researchers believe that quantum computers will be ideal for challenges like this, which involve mapping complex physical systems, but they also know that they won’t be the best choice for all computing problems. That’s because quantum computers operate very differently from classical digital computers.
Although quantum computers can process data much faster, it’s much more difficult to get the results of their calculations because of how qubits are structured. A person using a quantum system needs to know the right question to ask in order to efficiently get the answer they want.
For now at least, quantum computer scientists also are struggling to create systems that can run lots of qubits. Because qubits are essentially a scarce resource, Svore said another big research focus is on how to minimize the number of qubits needed to do any algorithm or calculation. That’s also one of the main focuses of Station Q, which is using an area of math called topology to find ways to use fewer qubits.
Wecker said that’s another major advantage to a system like LIQUi|>: It will help researchers figure out how best to use these unique computers.

As quantum computing technology becomes increasingly sophisticated, the techniques required to calibrate and certify device performance are becoming commensurately sophisticated. In this talk, I will discuss the need for QCVV (quantum characterization, verification, and validation) protocols to facilitate advances towards fault-tolerant universal quantum computation. In particular, I'll examine what kind of errors we expect nascent quantum information processors to suffer from, and how the QCVV tools may be used for detecting, diagnosing, and ultimately correcting such errors. To illustrate this point, I will examine the role gate set tomography (GST) played in characterizing quantum operations on a trapped-Yb-ion qubit, and how GST was iteratively used to a) make the qubit gate behavior Markovian and b) verify that the errors on the qubit operations were below the threshold for fault-tolerance. Lastly, several "GST-adjacent" QCVV protocols, such as drift- and cross-talk detection, will be examined, and the future of QCVV research will be discussed.
This work was supported by the Intelligence Advanced Research Projects Activity (IARPA), and Sandia's Laboratory Directed Research and Development (LDRD) Program. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA- 0003525.

Better Quantum Living Through QCVV

Quantum Computing Research helps IBM win Top Spot in Patent Race


These exotic, radical new machines have matured enough to secure a place at CES 2021.

IBM secured 9,130 US patents in 2020, more than any other company as measured by an annual ranking, and this year quantum computing showed up as part of Big Blue's research effort. The company wouldn't disclose how many of the patents were related to quantum computing -- certainly fewer than the 2,300 it received for artificial intelligence work and 3,000 for cloud computing -- but it's clear the company sees them as key to the future of computing.

The IFI Claims patent monitoring service compiles the list annually, and IBM is a fixture at the top. The IBM Research division, with labs around the globe, has for decades invested in projects that are far away from commercialization. Even though the work doesn't always pay dividends, it's produced Nobel prizes and led to entire industries like hard drives, computer memory and database software.

"A lot of the work we do in R&D really is not just about the number of patents, but a way of thinking," Jerry Chow, director of quantum hardware system development, said in an exclusive interview. "New ideas come out of it."

IFI's US patent list is dominated by computer technology companies. Second place went to Samsung with 6,415 patents, followed by Canon with 3,225, Microsoft with 2,905 and Intel with 2,867. Next on the list are Taiwan Semiconductor Manufacturing Corp., LG, Apple, Huawei and Qualcomm. The first non-computing company is Toyota, in 14th place.

Quantum Computing Fundamentals

Internationally, IBM ranked second to Samsung in patents for 2020, and industrial companies Bosch and General Electric cracked the top 10. Many patents are duplicative internationally since it's possible to file for a single patent in 153 countries.

The quantum computing priority

Quantum computing holds the potential to tackle computing problems out of reach of conventional computers. During a time when it's getting harder to improve ordinary microprocessors, quantum computers could pioneer new high-tech materials for solar panels and batteries, improve chemical processes, speed up package delivery, make factories more efficient and lower financial risks for investors.

What's Next: The Future of Quantum Computing

Industrywide, quantum computing is a top research priority, with dozens of companies investing millions of dollars even though most don't expect a payoff for years. The US government is bolstering that effort with a massive multilab research effort. It's even become a headline event at this year's CES, a conference that more typically focuses on new TVs, laptops and other consumer products.

"Tactical and strategic funding is critical" to quantum computing's success, said Hyperion Research analyst Bob Sorensen. That's because, unlike more mature technologies, there's not yet any virtuous cycle where profits from today's quantum computing products and services fund the development of tomorrow's more capable successors.

European Quantum Leadership - Session 1: Quantum Computing

IBM has taken a strong early position in quantum computing, but it's too early to pick winners in the market, Sorensen added.

The long-term goal is what's called a fault tolerant quantum computer, one that uses error correction to keep calculations humming even when individual qubits, the data processing element at the heart of quantum computers, are perturbed. In the nearer term, some customers like financial services giant JPMorgan Chase, carmaker Daimler and aerospace company Airbus are investing in quantum computing work today with the hope that it'll pay off later.

IBM quantum computing patents

Quantum computing is complicated to say the least, but a few patents illustrate what's going on in IBM's labs.

Patent No. 10,622,536 governs different lattices in which IBM lays out its qubits. Today's 27-qubit "Falcon" quantum computers use this approach, as do the newer 65-qubit "Hummingbird" machines and the much more powerful 1,121-qubit "Condor" systems due in 2023.


What to do with a near-term quantum computer?


IBM's lattices are designed to minimize "crosstalk," in which a control signal for one qubit ends up influencing others, too. That's key to IBM's ability to manufacture working quantum processors and will become more important as qubit counts increase, letting quantum computers tackle harder problems and incorporate error correction, Chow said.

Patent No. 10,810,665 governs a higher-level quantum computing application for assessing risk -- a key part of financial services companies figuring out how to invest money. The more complex the options being judged, the slower the computation, but the IBM approach still outpaces classical computers.


Patent No. 10,599,989 describes a way of speeding up some molecular simulations, a key potential promise of quantum computers, by finding symmetries in molecules that can reduce computational complexity.

Most customers will tap into the new technology through quantum computing as a service. Because quantum computers typically must be supercooled to within a hair's breadth of absolute zero to avoid perturbing the qubits, and require spools of complicated wiring, most quantum computing customers are likely to tap into online services from companies like IBM, Google, Amazon and Microsoft that offer access to their own carefully managed machines.

One of the best sources for initial education on the basics of Quantum Computing is Quantum Computing for the Determined by Michael Nielsen. This consists of 22 short videos that discuss The Basics, Superdense Coding, Quantum Teleportation, and The Postulates of Quantum Mechanics. Highly recommended.

QBN Webinar: Quantum Computing for Material Science & Pharma


Michael Nielsen and Andy Matuschak are developing a new online course called Quantum computing for the very curious, which uses a new experimental mnemonic medium designed to make it almost effortless to remember what you read. The first episode has just been posted, with additional ones coming soon.

IBM is providing an online, open-source textbook called  Learn Quantum Computation Using Qiskit that will connect theory with practice and help students explore practical problem sets that can run on real quantum systems. IBM has dozens of different videos available on IBM’s YouTube Qiskit channel.  A number of different playlists are available covering topics including Coding with Qiskit, 1 Minute Qiskit, Quantum Fundamentals, Circuit Sessions, Quantum Information Science Seminar Series and others.
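
To give a flavour of the kind of material the Qiskit textbook starts with, here is a minimal example that prepares and inspects a two-qubit Bell state. It uses only the long-standing public Qiskit API (QuantumCircuit and Statevector.from_instruction); exact import paths have shifted between Qiskit releases, so treat it as a sketch rather than version-specific reference code.

```python
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

# Prepare the Bell state (|00> + |11>)/sqrt(2): Hadamard on qubit 0, then CNOT.
bell = QuantumCircuit(2)
bell.h(0)
bell.cx(0, 1)

print(bell.draw())                          # ASCII drawing of the two-gate circuit
print(Statevector.from_instruction(bell))   # equal amplitudes on |00> and |11>
```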

Q-CTRL has provided a series of educational videos, technical seminars, and tutorials that you can find on Q-CTRL’s YouTube channel here.

Toptica Photonics has developed a Quantum Quiz app where players can test and improve their knowledge about quantum technologies.  It is available for both Apple iOS and Android smartphones. The quiz has three increasingly difficult levels of play and can be used in either Solo or Multiplayer mode. 

FutureLearn in association with Keio University is offering a free online course called Understanding Quantum Computers.  This is a four week course requiring an estimated five hours per week of study that will discuss the motivation for building quantum computers, cover the important principles in quantum computing, take a look at some of the important quantum computing algorithms and provide a brief look at quantum computing hardware and the budding quantum information technology industry. It is meant for high school students, college students, and computer professionals interested in developing a qualitative understanding of quantum computing.

Impact of Ionizing Radiation on Superconducting Qubit Coherence - Antti Vepsäläinen

A blog post that lists Quantum Computing Resources for High School Students is available on the Unitary Fund website. It was written as a guest posting by Jack Ceroni and describes both programs and educational resources that may be of interest to a quantum-curious high school student.

David Deutsch has posted six video Lectures on Quantum Computation designed as an introduction to the quantum theory of computation.

The Perimeter Institute has posted a series of 14 hour-long lectures by Daniel Gottesman called the Quantum Information Review. The lecture series was recorded in 2015, and the lectures can be downloaded in multiple formats (MP4, MP3, and PDF).

Overview of Qiskit Ignis - Struggle with errors

Microsoft and Brilliant.org have teamed up to create an online course on Quantum Computing. It is a 33-chapter course that teaches quantum computing concepts and some well-known algorithms using Microsoft’s Q# language with Python. The first two chapters of the course are free, but there is a fee to access the remaining chapters.

Programming Existing Quantum Computers

Microsoft has created a series of tutorials called the Quantum Katas. These tutorials are an open-source project containing a series of programming exercises in the Q# programming language that allow users to learn at their own pace. They are used with the Microsoft Quantum Development Kit and consist of a sequence of quantum computing tasks that require a user to fill in some code. The katas use simple learning principles including active learning, incremental complexity growth, and feedback. For more information you can read the Microsoft blog description here and download the code and instructions on how to install it at GitHub here.

Pengfei Zhang – An obstacle to sub-AdS holography for SYK-like models

QuTech has available online the QuTech Academy which currently includes four courses on edX.org.  These include:

  • Quantum internet & quantum computers: how will they change the world? – an introduction to the various potential applications of a quantum computer and a quantum internet.
  • Building blocks of a quantum computer – part 1: what does a quantum computer look like, what components will it have and how does a quantum computer operate? Part 1 focusses on the layers of the qubit.
  • Building blocks of a quantum computer – part 2: as a continuation of part 1, part 2 will explain the other layers of a quantum computer, ranging from the electronics, to hardware, software and algorithms needed to operate a quantum computer.
  • Quantum Cryptography: this course dives deeper into the quantum protocols and how this will lead to secure communication.

Caltech has online the course material for Physics 219, Quantum Computation. This is a course which has evolved for over 10 years and now has over 400 pages of material online in nine chapters. You can find this course at: http://www.theory.caltech.edu/people/preskill/ph229/

Umesh Vazirani of UC Berkeley has recorded a series of 64 video lectures for a course titled: Quantum Mechanics and Quantum Computation. The videos are short ranging from 5 to 20 minutes in length and provide a good introduction to basic quantum mechanical principles, qubits, and quantum algorithms.  The videos have been uploaded onto YouTube and you can find them at https://www.youtube.com/watch?v=Z1uoz_8dLH0&list=PL74Rel4IAsETUwZS_Se_P-fSEyEVQwni7.

Fabrizio Renzi, Ivano Tavernelli - IBM Q: building the first universal quantum computers 

Daniel Colomer of Quantum Intuition has created a YouTube channel containing several hundred videos covering a broad range of topics related to quantum algorithms and programming quantum computers. The videos are divided into six areas including Project Reviews, Quantum AI/ML, Textbook Algorithms, Useful Primitives, Quantum Error Correction, and Book & Online Course Reviews.  The videos range in length from 2 minutes to 2 hours, but the average is roughly in the 30 minute range.  These videos are a good source for those who want to better understand a particular topic because the videos show online demonstrations using several different software platforms such as Qiskit, Cirq, Pennylane, Quirk and others.  You can view the videos on the Quantum Intuition YouTube channel at https://www.youtube.com/channel/UC-2knDbf4kzT3uzOeizWo7iTJyw.

MIT offers an xPRO series consisting of two series with two courses in each series. The courses consist of video lectures from MIT professors with associated problem sets and each lasts for four weeks. The Quantum Computing Fundamentals series has two courses. The first is called Introduction to Quantum Computing and the second is called Quantum Computing Algorithms for Cybersecurity, Chemistry, and Optimization. The Quantum Computing Realities series has Practical Realities of Quantum Computation and Quantum Communications as the first course and Requirements for Large-Scale Universal Quantum Computation as the second. Details on these courses and links to enroll in them can be found on the MIT web site at https://learn-xpro.mit.edu/quantum-computing.

Boson Sampling and Quantum Simulations in Circuit QED - Qiskit Seminar Series - Steve Girvin

Quantum Computing UK provides a web site that contains several tutorials that introduce the reader to quantum computing. In addition, they maintain a code repository that allows someone to run programs on quantum computers, and they also perform research and publish papers on quantum computing algorithms.

Dr. James Wootton of the University of Basel has developed a blog site called Decodoku and associated games devoted to the topic of quantum error correction. The site features two games, Decodoku and Decodoku Puzzles, which are available for download on both iOS and Android. Playing the games allows one to learn and do research on quantum error correction. In addition, the blog has a series of posts that provide a good tutorial on quantum error correction.

Two different companies, Qubitekk and Phase Space Computing, have developed educational toolkits suitable for classroom use that provide students with hands-on experience with quantum phenomena. The Qubitekk product, called the Quantum Mechanics Lab Kit, includes all of the equipment and instructions needed to perform seven fundamental experiments in quantum mechanics. The kit is based on photonic technology and includes a laser, bi-photon source, photon counting module, coincidence counter and various fiber optic components to demonstrate entanglement, superposition and other quantum phenomena. The Phase Space Computing Toolkits consist of electronic circuit boards that approximate the behavior of quantum gates. They use patent-pending complementary pass-transistor logic to simulate the behavior of reversible quantum gates. Their toolboxes can demonstrate functions such as quantum key distribution, teleportation, superdense coding, the Deutsch-Jozsa algorithm and Shor’s algorithm.

Quantum Programs

qutools GmbH is offering three different Quantum Physics Education and Science Kits. These include quED, an entanglement demonstrator; Quantenkoffer, a plug-and-play quantum science kit with single and entangled photon pairs and multiple tokens with different optical functions that enable a huge variety of experiments; and quNV, for investigating quantum sensing using nitrogen-vacancy (NV) centers.

There is a concise, yet very understandable brief on quantum annealing written by Brianna Gopaul.  The brief describes how quantum annealing works, what organizations are developing quantum annealers, and applications where they may be used.  You can view this brief at: https://www.linkedin.com/pulse/quantum-annealers-solving-worlds-optimization-problems-brianna-gopaul/.

More Information:

https://quantumcomputingreport.com/education/

https://learn-xpro.mit.edu/quantum-computing

https://quantumcomputinguk.org

https://www.youtube.com/c/UncertainSystems/videos

https://iqim.caltech.edu

http://www.theory.caltech.edu/resources

https://quantumfrontiers.com

https://quantumfrontiers.com/2020/08/30/if-the-quantum-metrology-key-fits/

https://iqim.caltech.edu/nsf-poster-session/

https://www.physicsforums.com/forums/programming-and-computer-science.165/

https://www.physicsforums.com/forums/computing-and-technology.188/

https://orangeqs.com

https://delft-circuits.com

https://quandco.com/news/quco-impaqt-consortium

https://quandco.com/blog

https://arcb.csc.ncsu.edu/~mueller/qc/qc19/readings/


Quantum Computing Inc. Introduces Qatalyst, a Quantum Application Accelerator


Qatalyst, a Quantum Application Accelerator

One of the best places where a quantum computer can be used for advantage over a classical approach is with optimization problems. These are problems where one seeks the assignment of binary variables that minimizes an objective function expressed as a QUBO (Quadratic Unconstrained Binary Optimization), and they arise in a great many areas including finance, logistics, drug discovery, cybersecurity, machine learning, and many others. To this end, Quantum Computing Inc. (QCI) has announced commercial availability of its Qatalyst software (formerly called Mukai) to solve these types of problems efficiently on a variety of hardware platforms, both classical and quantum.
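
To make the QUBO formulation concrete, here is a small, self-contained Python sketch of a made-up toy problem; it is not QCI's Qatalyst API, and solve_qubo_bruteforce is our own helper. The objective is x^T Q x over binary variables x, and for a handful of variables it can simply be minimized by exhaustive search; real solvers, classical or quantum, matter only when the variable count makes 2^n enumeration hopeless.

```python
import itertools
import numpy as np

def solve_qubo_bruteforce(Q):
    """Minimize x^T Q x over binary vectors x by enumeration (fine for small n)."""
    n = Q.shape[0]
    best_x, best_e = None, np.inf
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits)
        energy = x @ Q @ x
        if energy < best_e:
            best_x, best_e = x, energy
    return best_x, best_e

# Toy QUBO encoding "pick exactly one of three options" via the penalty
# (x0 + x1 + x2 - 1)^2, expanded into QUBO form with the constant term dropped.
Q = np.array([[-1, 2, 2],
              [0, -1, 2],
              [0, 0, -1]])
print(solve_qubo_bruteforce(Q))   # one variable set to 1, minimum energy -1
```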

One advantage of the Qatalyst software is that many data scientists are already familiar with formulating optimization problems for classical solvers, so they should be able to quickly learn the software and compare solutions to these problems across several different computing platforms, both classical and quantum, including quantum machines from Rigetti, D-Wave, and IonQ. That minimizes the big hurdle application developers often face with other quantum computing approaches: learning completely new algorithms, languages, and development kits. QCI has indicated that its software has solved large optimization problems containing up to 110,000 variables with 8,000 constraints. The Los Alamos National Laboratory (LANL) tested the software and recently posted a paper on arXiv comparing the performance of the Qatalyst software (referred to as Mukai in the paper) with another QUBO solver available from D-Wave called Qbsolv, and found that Qatalyst had better performance.

From qubits to Quantum Accelerator - The Full Stack Vision

Additional information about QCI’s announcement can be found in their press release here. You can also view previous articles we published about the Mukai software here and here. In addition, QCI has created a QikStart initiative for accelerating quantum use cases that we reported on here.

Majorana qubits for topological quantum computing

Researchers are trying to store robust quantum information in Majorana particles and to generate quantum gates by exploiting the bizarre non-abelian statistics of Majorana zero modes bound to topological defects.

Soon after Enrico Fermi became a professor of physics at Italy’s University of Rome in 1927, Ettore Majorana joined his research group. Majorana’s colleagues described him as humble because he considered some of his work unexceptional. For example, Majorana correctly predicted in 1932 the existence of the neutron, which he dubbed a neutral proton, based on an atomic-structure experiment by Irène Joliot-Curie and Frédéric Joliot-Curie. Despite Fermi’s urging, Majorana didn’t write a paper. Later that year James Chadwick experimentally confirmed the neutron’s existence and was awarded the 1935 Nobel Prize in Physics for the discovery.

Nevertheless, Fermi thought highly of Majorana, as is captured in the following quote: “There are various categories of scientists, people of a secondary or tertiary standing, who do their best but do not go very far. There are also those of high standing, who come to discoveries of great importance, fundamental for the development of science. But then there are geniuses like Galileo and Newton. Well, Ettore was one of them.” Majorana only wrote nine papers, and the last one, about the now-eponymous fermions, was published in 1937 at Fermi’s insistence. A few months later, Majorana took a night boat to Palermo and was never seen again.

In that final article, Majorana presented an alternative representation of the relativistic Dirac equation in terms of real wave functions. The representation has profound consequences because a real wave function describes particles that are their own antiparticles, unlike electrons and positrons. Since particles and antiparticles have opposite charges, fermions in his new representation must have zero charge. Majorana postulated that the neutrino could be one of those exotic fermions.

Although physicists have observed neutrinos for more than 60 years, whether Majorana’s hypothesis is true remains unclear. For example, the discovery of neutrino oscillations, which earned Takaaki Kajita and Arthur McDonald the 2015 Nobel Prize in Physics, demonstrates that neutrinos have mass.

But the standard model requires that neutrinos be massless, so various possibilities have been hypothesized to explain the discrepancy. One answer could come from massive neutrinos that do not interact through the weak nuclear force. Such sterile neutrinos could be the particles that Majorana predicted. Whereas conclusive evidence for the existence of Majorana neutrinos remains elusive, researchers are now using Majorana’s idea for other applications, including exotic excitations in superconductors.

The next generation of Quantum Analyzers: SHFQA Launch Event

Majorana quasiparticles in superconductors

From the condensed-matter viewpoint, Majoranas are not elementary particles but rather emergent quasiparticles. Interestingly, the equation that describes quasiparticle excitations in superconductors has the same mathematical structure as the Majorana equation. The reason for the similarity arises from the underlying particle–hole symmetry in superconductors: Unlike quasiparticles in a metal, which have a well-defined charge, quasiparticles in a superconductor comprise coherent superpositions of electrons and holes. For the special zero-energy eigenmode, the electron and the hole, which each contribute half probability, form a quasiparticle. The operators describing the zero-energy particle–hole superpositions are invariant under charge conjugation, and zero-energy modes are therefore condensed-matter Majorana particles.

Particle–hole symmetry dictates that excitations in superconductors should occur in pairs at energies ±E. Therefore, zero-energy excitations are seemingly unreachable because they cannot emerge by any smooth deformation of the Hamiltonian, which would require that one of the solutions disappear. Rather, the only way to generate zero-energy excitations in superconductors is through a topological transition, a process that separates the phase of Majorana zero modes from the phase without them by closing and then reopening the superconducting gap (see the article by Nick Read, Physics Today, July 2012, page 38).

Majorana zero modes are located at topological defects, such as vortices, boundaries, and domain walls in topological superconductors. Remarkably, Majorana zero modes bound to defects do not obey fermion statistics. Unlike the original particles predicted by Majorana, the zero modes possess non-abelian exchange statistics, also known as non-abelian braiding, which makes them promising for applications in topological quantum computing, as detailed in box 1. Quasiparticles with non-abelian exchange statistics were first predicted in 1991 to occur in the filling factor ν = 5⁄2 of the fractional quantum Hall state. In 2000, researchers demonstrated that similar physics occur in superconductors with intrinsic p-wave pairing, an exotic form of superconductivity in which Cooper pairs bind through rare triplet-like pairing instead of the more standard singlet-like pairing in s-wave superconductors.2 Conventional s-wave pairing can be converted to p-wave pairing by combining the superconducting proximity effect in materials with strong spin–orbit interactions and an external magnetic field that breaks time-reversal symmetry.

Box 1. Non-abelian braiding

Quantum mechanics dictates that particles obey either Fermi–Dirac or Bose–Einstein statistics in three dimensions, which means that the wavefunction Ψ of a system of indistinguishable particles is necessarily bosonic or fermionic upon particle exchange. From that point of view, fermions and bosons are not exotic because exchanging them leaves the ground state invariant, up to a sign: Ψ → ±Ψ.

Two dimensions are richer. Now, the possibilities go beyond the fermionic or bosonic cases. A system can exhibit anyon statistics in which the wavefunction picks up an arbitrary phase under an exchange: Ψ → e^(iθ)Ψ. Such behavior generalizes the boson and fermion cases, where the phases can only be θ = 0 or θ = π. Because the phase factors are ordinary commuting numbers, the order of successive exchanges doesn’t matter, and the anyon statistics are called abelian.

The weirdness starts in systems with a degenerate many-body ground state containing several quasiparticles. When quasiparticles are exchanged, the system goes from one ground state, Ψa, to another, MabΨb. Because the unitary transformations Mab that operate in the subspace of degenerate ground states are generally noncommuting, the anyonic statistics take a non-abelian form. The final state of the system, therefore, depends on the order of the exchange operations, similar to braiding cords in a necklace.

Using Majorana zero modes to store and manipulate quantum information is one case where non-abelian braiding statistics form the basis of topological quantum computation (see the article by Sankar Das Sarma, Michael Freedman, and Chetan Nayak, Physics Today, July 2006, page 32). Quantum computation in such a system also benefits from protection against environmental decoherence because of the nonlocal character of Majorana-based qubits.
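
The non-commuting exchange operations described in box 1 can be written down explicitly for the simplest interesting case of four Majorana modes. The sketch below is an illustrative textbook-style calculation, not any group's production code: it represents γ1…γ4 as Pauli-matrix products on two qubits via a Jordan–Wigner-type mapping and builds the braid unitaries B_ij = exp(π γi γj / 4) = (1 + γi γj)/√2. The final check confirms that exchanging Majoranas 1–2 and then 2–3 is not the same as doing the exchanges in the opposite order.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Four Majorana operators on two qubits (a Jordan-Wigner-type representation):
# each is Hermitian, squares to the identity, and they mutually anticommute.
g = [np.kron(X, I2), np.kron(Y, I2), np.kron(Z, X), np.kron(Z, Y)]

def braid(i, j):
    """Unitary exchanging Majoranas i and j: exp(pi g_i g_j / 4) = (1 + g_i g_j)/sqrt(2)."""
    return (np.eye(4) + g[i] @ g[j]) / np.sqrt(2)

B12, B23 = braid(0, 1), braid(1, 2)
print(np.allclose(B12 @ B23, B23 @ B12))   # False: the exchange order matters
```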

The nanowire proposal

In 2010 two research groups made an elegant theoretical proposal, shown schematically in figure 1. If a semiconducting nanowire with strong spin–orbit coupling, such as indium arsenide or indium antimonide, is coupled to a standard s-wave superconductor, Majorana zero modes will emerge at both ends of the nanowire, provided that a magnetic field is applied parallel to it.3 The proposal realistically implements the paradigmatic one-dimensional model for p-wave superconductivity that was discussed in 2001 for the first time by Alexei Kitaev.4

Figure 1. (a) The nanowire proposal3 takes a nanowire of a semiconductor, such as indium arsenide or indium antimonide, that has strong spin–orbit coupling and places it in contact with an s-wave superconductor, such as aluminum, in the presence of an external magnetic field B. As in the original model for one-dimensional p-wave superconductors,4 the nanowire device experiences a topological nontrivial phase with exponentially decaying Majorana bound states, denoted γL, at both ends of the nanowire. (b) An actual device from Delft University of Technology includes various metallic gates for tuning it to the topological phase by adjusting the nanowire’s chemical potential. (Panel a adapted from ref. 3, R. M. Lutchyn, J. D. Sau, S. Das Sarma; panel b adapted from H. Zhang et al., Nature 556 74, 2018.)

The Majorana zero modes are localized at opposite ends of the wire and decay with position x as e^(−x/ξ), where ξ is the localization length. But together they form a highly delocalized fermion, which can be seen mathematically as a fermion operator that decomposes into two real, self-adjoint operators. The nonlocal fermion defines two parity states—the empty state and the full fermion one—that are degenerate at zero energy except for exponentially small corrections of order e^(−L/ξ), where L is the length of the wire. Those two states can be used to define a qubit. Because the states are stored nonlocally, the qubit is resilient against local perturbations from the environment.
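
Kitaev's one-dimensional model mentioned above can be diagonalized in a few lines, which makes the end-localized zero modes easy to see. The following numpy sketch uses illustrative parameter values (in units of the hopping t) and our own kitaev_bdg helper: it builds the Bogoliubov–de Gennes matrix of a finite open chain and prints its two lowest excitation energies. In the topological regime |μ| < 2t they are exponentially close to zero, the two Majorana end modes combining into one nonlocal fermion, while in the trivial regime the spectrum stays gapped.

```python
import numpy as np

def kitaev_bdg(n_sites, mu, t, delta):
    """Bogoliubov-de Gennes matrix of an open 1D Kitaev chain."""
    h = -mu * np.eye(n_sites)                   # chemical potential on each site
    d = np.zeros((n_sites, n_sites))            # antisymmetric p-wave pairing
    for j in range(n_sites - 1):
        h[j, j + 1] = h[j + 1, j] = -t          # nearest-neighbour hopping
        d[j, j + 1], d[j + 1, j] = delta, -delta
    return np.block([[h, d], [d.T, -h.T]])

for mu in (0.5, 3.0):   # |mu| < 2t is topological, |mu| > 2t is trivial
    spectrum = np.sort(np.abs(np.linalg.eigvalsh(kitaev_bdg(40, mu, t=1.0, delta=0.5))))
    print(f"mu = {mu}: two lowest excitation energies {spectrum[:2]}")
```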

To induce a closing and reopening of an energy gap in the nanowire platform, researchers exploit the competition among three effects. The first, the s-wave superconducting proximity effect, pairs electrons of opposite spin and opens a superconducting gap Δ at the Fermi level. In the second effect, an external magnetic field B generates a Zeeman energy E_Z = gμ_B·B/2—with g the nanowire’s Landé g-factor and μ_B the Bohr magneton—which tends to break Cooper pairs by aligning their electron spins and closing the gap. The third effect, spin–orbit coupling, counteracts the external magnetic field by preventing the spins from reaching full alignment.

The competition between the second and third effects creates regions in parameter space where the gap closes and reopens again. At low electron densities, the transition occurs when the Zeeman energy is of the same magnitude as the induced superconducting gap, and it can be reached either by increasing the magnetic field, as shown in figure 2, or by tuning the wire’s chemical potential. Apart from choosing semiconductors with a large spin–orbit coupling and good proximity effect with conventional superconductors, researchers need large g factors to induce a large Zeeman effect with moderate magnetic fields below the critical field of the superconductor. Materials such as the heavy-element semiconductors InAs and InSb have proven to be excellent choices.
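
Plugging in rough numbers shows why those material choices matter. The sketch below uses illustrative values only (a g-factor of about 50 for an InSb nanowire and an aluminium-induced gap of about 0.2 meV) to estimate the field at which the Zeeman energy E_Z = gμ_B·B/2 reaches the induced gap, which is where the topological transition is expected at low electron density.

```python
MU_B_EV_PER_T = 5.7883818060e-5   # Bohr magneton in eV per tesla

g_factor = 50.0     # rough g-factor for an InSb nanowire (illustrative)
delta_ev = 0.2e-3   # induced superconducting gap from aluminium, ~0.2 meV (illustrative)

# At low electron density the topological transition sits where E_Z = g*mu_B*B/2
# equals the induced gap Delta, so the critical field is B = 2*Delta/(g*mu_B).
b_critical = 2 * delta_ev / (g_factor * MU_B_EV_PER_T)
print(f"estimated critical field ~ {b_critical * 1e3:.0f} mT")   # roughly 140 mT
```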

Figure 2. Andreev reflections of electrons and holes to form Cooper pairs at the semiconducting– superconducting interface induce superconductivity in a nanowire. As a result, Majorana zero modes (flat red line) emerge in the energy spectrum as the external magnetic field increases. The Majoranas appear beyond some critical value of the external field (black dotted line) where the superconducting gap closes and reopens again, which signals a topological phase transition. Theory predicts that the emergent Majorana zero modes can be detected as a zero-bias anomaly in electrical conductance dI/dV. (Image by R. Aguado and L. P. Kouwenhoven.)

Topological superconductivity can also be engineered using similar ideas in alternative platforms. Some examples include chains of magnetic impurities above superconductors; proximitized 2D materials; and vortices in proximitized topological insulators such as quantum spin-Hall insulators, quantum anomalous-Hall insulators, and iron-based topological surface states.

Measuring Majoranas

At energies below the superconducting gap, an electron incident on a superconductor (S) from a normal conductor (N) can be reflected either as an electron or as a hole. Whereas the electron process is a standard, normal reflection, the hole process, known as Andreev reflection, is subtler because electrons are reflected as holes on the normal side while creating a Cooper pair on the superconducting side. In a standard NS junction, such Andreev processes are rare in the tunneling limit, and the conductance is small. But in a topological NS junction containing Majorana bound states, an incident electron is always reflected as a hole with unit probability.

As a result of that resonant Andreev process, the electrical conductance G at zero voltage is expected to be perfectly quantized: G = 2e²/h, where e is the electron charge and h, Planck’s constant. The Andreev process underscores the particle–antiparticle duality of Majorana bound states: Because the electron and hole contribute equally to form a Majorana quasiparticle, the tunneling rates for electrons and holes should be equal. Therefore, researchers can use tunneling spectroscopy to directly detect a Majorana bound state as a zero-bias anomaly (ZBA). The differential conductance dI/dV, with I the current across the junction, is a function of the applied bias voltage V, and the ZBA should emerge as an increasing magnetic field induces a topological transition in the nanowire.
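
Numerically, the quantized value and the expected line shape are easy to write down. The snippet below is an idealized zero-temperature toy model, with Γ a made-up tunnel-coupling value: it evaluates G = 2e²/h and a Lorentzian dI/dV of width Γ, a form commonly used to describe resonant Andreev reflection through a Majorana bound state.

```python
import numpy as np
from scipy.constants import e, h   # elementary charge (C) and Planck constant (J s)

G0 = 2 * e**2 / h
print(f"2e^2/h = {G0 * 1e6:.1f} microsiemens")      # about 77.5 uS

gamma = 20e-6                                       # tunnel coupling in eV (made up)
bias = np.linspace(-200e-6, 200e-6, 9)              # bias voltage in volts (1 V <-> 1 eV here)
dIdV = G0 * gamma**2 / (gamma**2 + bias**2)         # Lorentzian zero-bias anomaly
print(dIdV.max() / G0)                              # 1.0: the peak reaches exactly 2e^2/h
```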

In 2012, researchers showed that the nanowire proposal could indeed be realized.5 A typical measurement from that experiment is illustrated in figure 3a, which shows conductance versus applied bias voltage and magnetic field. For intermediate values of the magnetic field, a clear ZBA emerges in the middle of the superconducting gap and is consistent with the existence of zero-energy Majorana bound states in the nanowire. Subsequent experiments showed similar results.

Noise is a qubit killer. This is the main challenge faced by quantum hardware makers and the reason why Microsoft decided to chase topological particles called Majoranas for their qubits, instead of the usual two-level systems. It is a “three-little-pigs” situation: you can choose the material that will give you a qubit faster, such as superconducting circuits, but you will build a fragile quantum processor. On the other end of the spectrum are the hard-to-build yet all-enduring topological qubits. The problem that topological-qubit advocates are facing now is that it seems that they are even harder to build than first thought. This creates a major setback to Microsoft’s main quantum hardware program, as discussed by Quantum Computing Report in a previous article.

Topological properties are believed to be impervious to noise because they are not easily affected by stray electromagnetic fields. As an analogy, think of a sailboat taking laps around a lake. While the changes in wind might make the path of the sailboat wiggle around its intended route, the number of laps that the boat takes is unaffected (unless you have catastrophic winds). In this analogy, the wiggly path would be the noise that affects traditional qubits, while the number of laps is the robust topological property. But topological states are not easy to synthesize – Majorana particles, for instance, do not occur naturally.

These Majorana particles were theorised to emerge in complex devices made from superconductors and special semiconducting nanowires. In 2018, a team led by Professor Leo Kouwenhoven from QuTech, in the Netherlands, developed ingenious fabrication techniques and brought together all the necessary ingredients (they had partial results before, but their main result only came in 2018). They then measured currents through these nanowires and concluded that they saw signs of these topological particles, kickstarting Microsoft’s global effort to make Majorana-based quantum computers. In a surprising twist, the authors of the original work decided now to retract the paper from Nature and publish extended data that reveals that their main conclusion was incorrect. Majoranas were not measured in those nanowires.

Why did it take two years for an extremely well-funded, global scientific team to notice something was off? Firstly, it should be noted that the “extended data” published now is not new data acquired through more recent measurements – it is data that had been cut out of the original paper. This data was analysed by experts in the field, and it contains enough information to discard the main conclusion of the original paper. If this information had been made public at the time (as good scientific conduct dictates), the issue could have been identified by trained eyes earlier.

But on top of that, the collective understanding of the physics behind these complex devices is in its infancy, and effects of unavoidable disorder in the nanowires are only now starting to be understood. That is the risk that Microsoft took for itself when deciding to embark on a journey to create quantum computers out of particles that had not yet been observed in laboratory.

Zurich Instruments talks - Scaling up quantum computing control systems to 100 qubits and beyond

This opened a Pandora’s box of reactions. The specialised media was fast to capitalise on the “Microsoft vs Google vs IBM vs Intel quantum race”, claiming that the result shows how far behind Microsoft is when compared, for instance, with Google’s 53-qubit superconducting chip. But anyone who has been paying close attention knows that Google is not the world leader in quantum computing – 53 qubits are an impressive display of engineering skills, but these are very faulty qubits that cannot be used for any real-life applications. While we wait for Google to show how its Sycamore processor performs at quantum error correction, it is hard to gauge whether Google really has something that could move forward as a viable universal quantum processor.

On the topic of quantum error correction, that could well be where other technologies stall and Microsoft’s Majoranas catch up. One of the theoretical masterminds behind the idea of topological qubits and co-author in the retracted paper, Professor Sankar Das Sarma from University of Maryland fired on Twitter that the “…idea of using surface codes to do error correction is in some crude sense trying to produce approximate topological qubits…”. Indeed, the surface code is largely based on the ideas from 2016 Nobel laureate Duncan Haldane. By organising qubits in a two-dimensional array and repetitively performing operations and measurements, one tries to force the faulty qubits into collectively maintaining bits of information that are protected from the local stray fields acting on each separate qubit. This is, once again, leveraging the idea that global properties (like the number of laps of a sailboat around a lake) are more resistant than local properties (such as the position of the sailboat at any given moment, as affected by winds and waves).

Professor Das Sarma goes on about the shortcomings of the usual two-level-system approach in his Twitter account (called Condensed Matter Theory Center after the UMD-based institution that he directs). He says most of the mediatic noise regarding this episode is the result of “…total ignorance of how a quantum computer works—you must have LOGICAL qubits which NOBODY is even close to having”. This is arguably incorrect – the group of Professor Christopher Monroe from the same University of Maryland posted to an online preprint repository a manuscript showing a logical qubit based on ion traps with fault-tolerant operation levels (as covered by Quantum Computing Report here).

One of the key findings by Professor Monroe is that a significant improvement is achieved when, unlike in the surface code, long range coupling between qubits is attained. The surface code assumes that a qubit can only be entangled with its immediate neighbours, but some technologies allow for qubits to be moved around or even to achieve pairwise interactions mediated by the collective movement of all qubits (which is the case for ions in a trap). Hard to believe that Professor Das Sarma is unaware of the work by Professor Monroe – it is unclear whether he thinks there is something wrong with Professor Monroe’s conclusions or if he is (perhaps ironically) waiting for the manuscript to be published in a peer-reviewed journal.

Declaring topological qubits dead prematurely might be a mistake. But undoubtedly this retraction reveals how ignorant we still are about these ethereal topological particles and how Microsoft is not ready to firmly progress in harnessing them for technological applications. Scientists are not nearly ready to pass this baton to engineers.

It’s a tale oft told in physics: researchers are, yet again, excited about a phenomenon that may or may not exist. This time, it’s Majorana fermions—weird objects that act as their own antiparticles. Some condensed matter physicists think they’ve seen these elusive beasts, but others aren’t so sure. Either way, Microsoft has put out a bounty for the Majoranas and hopes one day to harness them for quantum computing.

While particle physicists also study a version of the Majorana fermion (neutrinos might be of this ilk), the ones of interest to quantum computing are quasiparticles—many electrons acting collectively in materials to mimic particles. In 2012, researchers at the Delft University of Technology in the Netherlands first reported experimental evidence of the quasiparticle in a semiconductor nanowire attached to a superconductor. Subsequent measurements by several other research groups also match theoretical predictions, although it is still possible that the signals could come from some other interaction in the nanowire.

Topological Defect Networks for Fractons of all Types (Dominic Williamson)

Trajectories of Majorana quasiparticles

Trajectories of Majorana quasiparticles can be arranged to be topologically distinct and might form the basis for robust qubits in quantum computing.

Because of these tantalizing experimental results, researchers think that someone will conclusively nail down the quasiparticle soon. "It looks like a dog, and it walks like a dog," says Mihir Pendharkar, a Microsoft-funded graduate student at the University of California, Santa Barbara (UCSB), who presented his research at the 2018 March Meeting. But still, he adds, it might not be a dog.

Quantum computing researchers want to use a specific kind of Majorana fermion, known as a Majorana zero mode, as a qubit. "In classical terms, a Majorana zero mode is like half an electron," says Pendharkar. Theory, along with supporting experiments, suggests that these half-electron quasiparticles can exist at the ends of one-dimensional semiconducting wires that are attached to a superconductor.

One predicted property of these quasiparticles is that they have a "memory" of how they’ve been moved around. For example, if you swap two quasiparticles’ positions on a nanowire, "they would remember whether they had been moved clockwise or counterclockwise around each other," says Pendharkar.

You can store information in a pair of quasiparticles by exploiting this property, says Christina Knapp, a graduate student at UCSB who also presented in the same session. For example, in a simplistic encoding scheme, moving one quasiparticle clockwise with respect to the other could correspond to a 1, while moving counterclockwise could correspond to a 0. To read out the qubit, in principle, you would collide the two half-electron quasiparticles together on the nanowire and measure the outcome, which would yield a different signal depending on whether they were in a 0, 1, or a superposition state.

Researchers predict that these quasiparticles will be more robust at holding information than the qubits that Google and IBM are currently building. The latter are error-prone because of "local" noise, such as ambient electromagnetic fields. Consequently, thousands of superconducting qubits are required to lower the error rate enough to perform a logical operation. Google, the current record-holder, has put only 72 of these qubits together. This constrains researchers to design algorithms that are still useful despite inevitable errors.

Zurich Instruments - Qubit control for 100 qubits and more

Unlike Google’s computer, which stores information in a single localized object, a Majorana-based qubit would encode a single bit of information in multiple quasiparticles. According to theory, this type of quantum information should be much less likely to go bad. The quasiparticle still "remembers" whether it has been moved clockwise or counterclockwise with respect to its twin, even if you move it around on a nanowire. The information is also immune to local environmental noise.

The researchers liken these qubits to a knot on a shoestring, so that how the knot is tied indicates the information stored. "The knot doesn’t really change if you tug at the part of the shoestring," says Knapp. "It doesn’t care about little changes in the system."

To be clear, no one is tying physical knots in a nanowire—but you can mathematically visualize the timeline of these quasiparticles as you swap their positions as knots or braids. These knots are known as topologically protected states; hence, the proposed quantum computers built with Majorana fermions are known as topological quantum computers.

Theorists have already begun designing solid-state qubits using the hypothesized quasiparticle, although Pendharkar and his adviser, physicist Chris Palmstrøm of UCSB, say that it will likely be decades before anyone makes a topological qubit. "Right now, we don’t even know if the fundamental thing actually exists," says Palmstrøm. To conclude once and for all that they’ve created Majorana zero modes, Pendharkar says, a research group must demonstrate that a pair of them yields the predicted properties when swapped.

However, Palmstrøm’s group is already working to design a chip-based architecture for the expected qubits. They have designed a layered chip made primarily of indium-based materials containing sheets of electrons that interact only two-dimensionally. They can then etch those sheets into one-dimensional "wires" that they can couple to a superconductor to create the Majorana edge modes. Etching is a much more feasible—and scalable—manufacturing process than laying single nanowires in parallel, says Pendharkar.

Pendharkar and Palmstrøm are careful not to over-promise their device. After all, unlike Google, IBM, and Intel’s quantum computers, theirs doesn’t exist yet. "There are different bottlenecks for different technologies," says Palmstrøm. "We’re at the bottleneck where we don’t even know whether the technology works."

But other quantum computing architectures could hit a different bottleneck, Palmstrøm says: They’ll be difficult to expand into the thousand and million qubit devices that will ultimately be broadly useful to society. Because a topological qubit doesn’t need the same type of error correction as superconducting qubits, it should be easier to make a working thousand-qubit quantum computer out of topological qubits. A topological qubit should be a fundamentally better piece of hardware—they just have to figure out how to make it.

Microsoft has been working on a qubit technology called a topological qubit that it expects will deliver benefits from quantum computing technology that today are mostly just a promise. After spending five years figuring out the complicated hardware of topological qubits, the company is almost ready to put them to use, said Krysta Svore, general manager of Microsoft's quantum computing software work.

"We've really spent the recent few years developing that technology," Svore said Thursday after a talk at the IEEE International Conference on Rebooting Computing. "We believe we're very close to having that."

The company claims it is leading the field with a type of quantum computing called the topological qubit, which it says is far less error-prone than rival qubit systems.

“We are very close to figuring out a topological qubit. We are working on the cryogenic process to control it, and we are working on 3D nano printing,” said Todd Holmdahl, Microsoft corporate vice-president in charge of quantum computing.

“Competitors will need to connect a million qubits, compared with 1,000 in our quantum computing machine. It is about quality.”

Why Topological qubits are better

The reason Holmdahl believes Microsoft has the edge in quantum computing is because its researchers are close to cracking what is known as a topological qubit. It is also developing a system architecture at the Niels Bohr Institute in Copenhagen, where qubits operate at just above absolute zero, at 30 millikelvin. The extreme cold minimises interference. Microsoft has also created a high-level language Q# for Visual Studio, plus it is working on a quantum computer simulator, which will run locally on a PC or on Azure.

Full stack ahead: Pioneering quantum hardware allows for controlling up to thousands of qubits at cryogenic temperatures



The topological qubit is the centrepiece of Microsoft’s efforts in quantum computing. Work began two decades ago in Microsoft’s theoretical research centre, when mathematician Michael Freedman joined. Freedman is renowned for his research in a field of mathematics known as topology.

According to Microsoft, Freedman began a push into quantum computing 12 years ago, backed by the company’s chief research and strategy officer, Craig Mundie.

At the time, Mundie said quantum computing was in a bit of a doldrums. Although physicists had been talking about the possibility of building quantum computers for years, they were struggling to create a working qubit with high enough fidelity to be useful in building a working computer.

According to Holmdahl, physical qubits are error-prone, so roughly 10,000 of them are required to make one “logical” qubit – a qubit reliable enough for any truly useful computation.

Quantum computing researchers have found that if a qubit is disrupted, it will “decohere”, which means it stops being in a physical state where it can be used for computation.

According to Microsoft, Freedman had been exploring the idea that topological qubits are more robust because their topological properties potentially make them more stable and provide more innate error protection.

Holmdahl said a topological qubit would have far fewer errors, meaning more of its processing power could be used for solving problems rather than correcting errors. “The more qubits you have, the more errors you have,” he said. This, in turn, means that more qubits must be connected together.

According to Holmdahl, there is a theoretical limit to how much a quantum computer can scale, due to the complexity of networking all the qubits together and the error handling. “We are taking a different approach. Our error rate is three to four orders of magnitude better.”

Zurich Instruments QCCS Quantum Computing Control System

Key Features

  • Scalable design: new inputs and outputs can be added at any time, and a high channel density and consistent performance are guaranteed for all setup sizes.
  • Productivity-boosting software: LabOne efficiently connects high-level quantum algorithms with the analog signals from the quantum device.
  • Hardware specifications that match the application: low noise, high resolution, and large bandwidth.
  • A thought-through and tested systems approach: precise synchronization, reliable operation.
  • Feedback operation: fast data propagation across the system, powerful decoding capability.


Zurich Instruments talks - IEEE Quantum Week Workshop - part 1/3


Zurich Instruments introduced the first commercial Quantum Computing Control System (QCCS), designed to control more than 100 superconducting and spin qubits. Each component of the QCCS is conceived to play a specific role in qubit control, readout and feedback, and operates in a fully synchronized manner with the other parts of the system. LabOne®, the Zurich Instruments control software, enables fast access to qubit data and facilitates the integration into higher-level software frameworks.

Zurich Instruments talks - IEEE Quantum Week Workshop - part 2/3

The Zurich Instruments QCCS supports researchers and engineers by allowing them to focus on the development of quantum processors and other elements of the quantum stack while benefiting from the most advanced classical control electronics and software.

Efficient workflows, tailored specifications and feature sets, and a high degree of reliability are the characteristics most valued by our customers.

Zurich Instruments talks - IEEE Quantum Week Workshop - part 3/3

The scientific achievements already accomplished with the QCCS (see below for a list of publications) are a testimony to our close engagement with some of the most ambitious research groups in this area. The recent launch of the SHFQA Quantum Analyzer introduces the second generation of QCCS products, which operate directly at qubit frequencies, offer higher density and lower cost per qubit, and provide new features that take into account the most recent developments in quantum computing.

More Information:

https://en.wikipedia.org/wiki/Quadratic_unconstrained_binary_optimization

https://www.quantumcomputinginc.com/

https://arxiv.org/abs/2102.01225

https://quantumcomputingreport.com/news-2020/#QCIMUKAI

https://quantumcomputingreport.com/quantum-computing-inc-announces-new-support-for-cloud-based-quantum-computers-with-their-mukai-software/

https://quantumcomputingreport.com/news/#QCIQIKSTART

https://www.zhinst.com/europe/en/quantum-computing-systems/qccs

https://escholarship.org/uc/item/04305656

https://www.qutube.nl/quantum-computer-12/topological-quantum-computing-majorana-fermions-and-where-to-find-them

https://www.hpcwire.com/2020/08/19/intel-connects-the-quantum-dots-in-accelerating-quantum-computing-effort/

https://www.aps.org/publications/apsnews/201804/hunt.cfm

https://physicstoday.scitation.org/doi/10.1063/PT.3.4499#:~:text=Majorana%20qubits%20for%20topological%20quantum%20computing%20Researchers%20are,of%20Majorana%20zero%20modes%20bound%20to%20topological%20defects.

https://arxiv.org/abs/2004.02124

https://www.hpcwire.com/2021/01/28/microsoft-develops-cryo-controller-chip-gooseberry-for-quantum-computing/

Hybrid Cloud and Multi-Cloud the Future says IBM


IBM Hybrid Cloud and Multi-Cloud the Best Cloud Platform So Far


Hybrid cloud is an IT architecture that incorporates some degree of workload portability, orchestration, and management across 2 or more environments. Depending on whom you ask, those environments may need to include:

  • At least 1 private cloud and at least 1 public cloud
  • 2 or more private clouds
  • 2 or more public clouds
  • A bare-metal or virtual environment connected to at least 1 cloud—public or private

These varying requirements are an evolution from the earlier age of cloud computing, where the differences between public clouds and private clouds were easily defined by location and ownership. But today’s cloud types are far more complex, because location and ownership are abstract considerations. 


Schlumberger, IBM and Red Hat Announce Major Hybrid Cloud Collaboration for the Energy Industry

This is why it can be more helpful to define hybrid cloud computing by what it does. All hybrid clouds should:

  • Connect multiple computers through a network.
  • Consolidate IT resources.
  • Scale out and quickly provision new resources.
  • Be able to move workloads between environments.
  • Incorporate a single, unified management tool.
  • Orchestrate processes with the help of automation.
Open hybrid cloud: Red Hat's vision for the future of IT

How do hybrid clouds work?

The way public clouds and private clouds work as part of a hybrid cloud is no different from how standalone public clouds or private clouds work:

A local area network (LAN), wide area network (WAN), virtual private network (VPN), and/or application programming interfaces (APIs) connect multiple computers together.

Virtualization, containers, or software-defined storage abstract resources, which can be pooled into data lakes.

Management software allocates those resources into environments where applications can run, which are then provisioned on-demand with help from an authentication service.

Separate clouds become hybrid when those environments are connected as seamlessly as possible. That interconnectivity is the only way hybrid clouds work—and it’s why hybrid clouds are the foundation of edge computing. That interconnectivity is how workloads are moved, management is unified, and processes are orchestrated. How well-developed those connections are has a direct impact on how well your hybrid cloud works.

Make Hybrid Cloud Work for Your Business

Modern hybrid cloud architecture

Today’s hybrid clouds are architected differently. Instead of connecting the environments themselves, modern IT teams build hybrid clouds by focusing on the portability of the apps that run in the environments.

Think about it like this: Instead of building a local 2-lane road (fixed middleware instances) to connect 2 interstate highways (a public cloud and a private cloud), you could instead focus on creating an all-purpose vehicle that can drive, fly, and float. Either strategy still gets you from one place to another, but there's a lot less permitting, construction, permanence, and ecological impact if you focus on a universally capable vehicle.

IBM Data and Multi-Cloud and Hybrid Cloud


Modern IT teams build hybrid clouds by focusing on the car—the app. They develop and deploy apps as collections of small, independent, and loosely coupled services. By running the same operating system in every IT environment and managing everything through a unified platform, the app's universality is extended to the environments below it. In more practical terms, a hybrid cloud can be the result of:

  • Running Linux® everywhere
  • Building and deploying cloud-native apps
  • Managing everything using an orchestration engine like Kubernetes or Red Hat OpenShift®

Using the same operating system abstracts all the hardware requirements, while the orchestration platform abstracts all the app requirements. This creates an interconnected, consistent computing environment where apps can be moved from one environment to another without maintaining a complex map of APIs that breaks every time apps are updated or you change cloud providers.
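To make that idea concrete, here is a minimal sketch in Python, using the kubernetes client package, of how a single orchestration API can address clusters running in different environments simply by switching kubeconfig contexts. The context names ("onprem", "publiccloud") and the "demo-app" namespace are hypothetical placeholders for your own clusters; this is an illustrative sketch, not a Red Hat or IBM reference example.

from kubernetes import client, config

def list_app_pods(context_name: str, namespace: str = "demo-app") -> None:
    """Load the given kubeconfig context and print one application's pods."""
    config.load_kube_config(context=context_name)   # same call for any cluster
    core = client.CoreV1Api()
    pods = core.list_namespaced_pod(namespace=namespace)
    print(f"[{context_name}] {len(pods.items)} pod(s) in '{namespace}':")
    for pod in pods.items:
        print(f"  {pod.metadata.name:40s} {pod.status.phase}")

if __name__ == "__main__":
    # The same code path manages workloads on premises and in a public cloud.
    for ctx in ("onprem", "publiccloud"):
        list_app_pods(ctx)

The point is not the listing itself but that nothing in the code cares which cloud it is talking to; the platform abstraction carries the portability.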

It starts with Linux

This interconnectivity allows development and operations teams to work together in a DevOps model: A process by which teams work collaboratively across integrated environments using a microservice architecture supported by containers.

Operating system

Every cloud is unique. That means you need an OS that can do anything, and the only operating systems that can are open source, like Linux. So start with Red Hat Enterprise Linux. It lets you run cloud-native apps with the control, confidence, and freedom that come from a consistent foundation across any cloud deployment.

As the most deployed commercial Linux distribution in the public cloud, Red Hat Enterprise Linux is certified to run on hundreds of public cloud and service providers and is built off the native Linux OS containers are supposed to run on. Plus, customers running Red Hat Enterprise Linux gain economic advantages of more than US$1 trillion each year, just because of the OS.

Practical DevOps in a Hybrid World

The new generation of hybrid cloud enables you to build and manage across any cloud with a common platform. That means you can skill once, build once and manage from a single pane of glass. ... IBM also offers IBM Cloud Pak® solutions, an AI-infused software portfolio that runs on Red Hat OpenShift.

Multicloud is a cloud approach made up of more than 1 cloud service, from more than 1 cloud vendor—public or private.

For example, your enterprise invests in expanding a cloud infrastructure. You've moved from bare-metal servers to virtualization-based workloads, and now you're evaluating public cloud options—not for everything, but to support a specific customer-facing application with highly variable use rates. After some research, you find the public cloud provider that has the right blend of service-level agreements (SLAs), security protocols, and uptime to host your custom application. You’re happy with your choice. But eventually, customers start asking for features that are only available through a different vendor’s proprietary app. Integrating these features into your custom app requires that you not only purchase the vendor’s app, but also host the app in that vendor’s proprietary public cloud—a solution that allows both apps to scale with demand.

You now have a multicloud.

Multicloud - Evolution of Core Infrastructure Strategy


What’s the difference between multicloud and hybrid cloud?

Multicloud refers to the presence of more than 1 cloud deployment of the same type (public or private), sourced from different vendors. Hybrid cloud refers to the presence of multiple deployment types (public or private) with some form of integration or orchestration between them.

A multicloud approach could involve 2 public cloud environments or 2 private cloud environments. A hybrid cloud approach could involve a public cloud environment and a private cloud environment with infrastructure (facilitated by application programming interfaces, middleware, or containers) facilitating workload portability.

These cloud approaches are mutually exclusive: you can't have both simultaneously, because the clouds will either be interconnected (hybrid cloud) or not (multicloud). Having multiple cloud deployments, both public and private, is becoming more common across enterprises as they seek to improve security and performance through an expanded portfolio of environments.

How to Create a Multi-Cloud Strategy

Managing and automating multicloud environments


IT is becoming more dynamic, based on virtual infrastructure both on-premise and off. This introduces significant complexity around self-service, governance and compliance, resource management, financial controls, and capacity planning. Cloud management and automation tools help maintain greater visibility and oversight across these disparate resources.

Automation has been used discretely within enterprises, with different tools used by different teams for individual management domains. But today’s automation technologies (like Red Hat® Ansible® Automation Platform) are capable of automating assets across environments. Adding modern automation capabilities to multicloud environments limits the environment’s complexity while enhancing workload security and performance for traditional and cloud-native applications.

Multicloud and containers

Linux® containers give enterprises choices when it comes to public cloud vendors. Because containers package and isolate apps with their entire runtime environment, users can move the contained app between clouds while retaining full functionality. This gives enterprises the freedom to choose public cloud providers, based on universal standards (e.g. uptime, storage space, cost) instead of whether it will—or won’t—support your workload due to proprietary restrictions.

This portability is facilitated by microservices, an architectural approach to writing software where applications are broken down into their smallest components, independent from each other. Containers—which are Linux—just happen to be the ideal place to run microservice-based apps. Together, they can be the key to taking your apps to any cloud.

Why Red Hat?

Multicloud helps enterprises avoid the pitfalls of single-vendor reliance. Spreading workloads across multiple cloud vendors gives enterprises flexibility to use (or stop using) a cloud whenever they want. There's nothing evil about having multiple clouds—in fact, it’s a good thing. And open source software magnifies that good. Our open source technologies bring a consistent foundation to any cloud deployment: public, private, hybrid, or multi.

Distributed Cloud for Telco Networks and Edge - Bill Lambertson, IBM

A new HPC solution from IBM Cloud

IBM has a long history of leadership in high performance computing (HPC). With ground-breaking advancements in systems, software, and services, IBM has enabled enterprises across many industries to manage distributed environments for running their HPC workloads for decades.

Customers are taking advantage of the massive computational power that cloud computing brings to HPC workloads. Identifying and scripting the necessary provisioning and configuration steps required to build scalable compute environments in the cloud can be daunting. When the need to encrypt valuable data and algorithms is added, the complexity only increases.

IBM Cloud is now announcing a new automated solution that enables customers to quickly and easily build scalable, encrypted compute environments in the IBM Cloud.

DevOps.com Webinar: Protecting OpenShift Container-Based Applications with Cloud-Native Backup

Introducing IBM Cloud HPC Cluster

IBM Cloud is excited to announce the general availability of IBM Cloud HPC Cluster. This scalable and repeatable service simplifies the process of building encrypted HPC environments in the IBM Cloud. The automated service provided by HPC Cluster eases the burden on IT administrators and shortens time to results.

The HPC Cluster service includes two deployment plans: Encrypted Bare Metal HPC Cluster and Encrypted VSI HPC Cluster.  The HPC Cluster service is ideal for users who require high levels of security and encryption for their computationally intensive workloads. It enables customers to bring their own encrypted operating system image and bring their own keys to protect the confidential nature of data and algorithms. With the built-in security and encryption features provided by IBM Cloud HPC Cluster, customers have complete control over their HPC environment in the cloud.

IBM Cloud Pak for Applications Overview

Key benefits of IBM Cloud HPC Cluster

The HPC Cluster service greatly simplifies the process to create and manage encrypted compute environments in the IBM Cloud.

By applying advanced encryption, automation, and monitoring, customers can quickly create scalable compute environments with their choice of compute resources. These environments can be used to execute multiple HPC workloads while ensuring data privacy. The environments can be easily modified by adding and deleting compute resources depending on workload needs.

HPC Cluster key features include the following:

  • Support for bring your own encrypted operating system and Bring Your Own Key (BYOK)
  • Automated deployment and configuration of single-tenant, redundant LUNA Hardware Security Modules (HSM)
  • Automated deployment of compute resources, with boot encryption using customer provided keys
  • Integration with IBM Cloud Object Storage
  • Integration with Activity Tracker to view, manage, and audit cloud activity

The HPC Cluster service is deployed into the customer’s own IBM Cloud account and offers high degrees of customization and control, allowing clients to replicate their on-premise HPC environments in the cloud or extend their HPC workloads to the cloud. All compute configurations offered through this service provide hourly, consumption-based pricing, which helps customers to control spending.

Introducing Migration Toolkit for Virtualization - Miguel Perez Colino (Red Hat) OpenShift Commons

Hybrid cloud security is the protection of the data, applications, and infrastructure associated with an IT architecture that incorporates some degree of workload portability, orchestration, and management across multiple IT environments, including at least 1 cloud—public or private.

Hybrid clouds offer the opportunity to reduce the potential exposure of your data. You can keep sensitive or critical data off the public cloud while still taking advantage of the cloud for data that doesn’t have the same kinds of risk associated with it.

Why choose hybrid cloud for enhanced security?

Hybrid clouds let enterprises choose where to place workloads and data based on compliance, audit, policy, or security requirements.

While the various environments that make up a hybrid cloud remain unique and separate entities, migrating between them is facilitated by containers or encrypted application programming interfaces (APIs) that help transmit resources and workloads. This separate—yet connected—architecture is what allows enterprises to run critical workloads in the private cloud and less sensitive workloads in the public cloud. It’s an arrangement that minimizes data exposure and allows enterprises to customize a flexible IT portfolio.

The components of hybrid cloud security

Hybrid cloud security, like computer security in general, consists of three components: physical, technical, and administrative.

Physical controls are for securing your actual hardware. Examples include locks, guards, and security cameras.

Technical controls are protections designed into IT systems themselves, such as encryption, network authentication, and management software. Many of the strongest security tools for hybrid cloud are technical controls.

Finally, administrative controls are programs to help people act in ways that enhance security, such as training and disaster planning.

Overview of IBM Cloud Pak for Data

Physical controls for hybrid cloud security

Hybrid clouds can span multiple locations, which makes physical security a special challenge. You can’t build a perimeter around all your machines and lock the door.

In the case of shared resources like a public cloud, you may have Service Level Agreements (SLAs) with your cloud provider that define which physical security standards will be met. For example, some public cloud providers have arrangements with government clients to restrict which personnel have access to the physical hardware.

But even with good SLAs, you’re giving up some level of control when you’re relying on a public cloud provider. This means other security controls become even more important.


Technical controls for hybrid cloud security



Technical controls are the heart of hybrid cloud security. The centralized management of a hybrid cloud makes technical controls easier to implement.

Some of the most powerful technical controls in your hybrid cloud toolbox are encryption, automation, orchestration, access control, and endpoint security.

Encryption

Encryption greatly reduces the risk that any readable data would be exposed even if a physical machine is compromised.

You can encrypt data at rest and data in motion. Here’s how:

Protect your data at rest:

Full disk (partition) encryption protects your data while your computer is off. Try the Linux Unified Key Setup-on-disk (LUKS) format, which can encrypt your hard drive partitions in bulk.

Use hardware encryption to protect the hard drive from unauthorized access. Try the Trusted Platform Module (TPM), a hardware chip that stores cryptographic keys. When the TPM is enabled, the hard drive is locked until the user can authenticate.

Encrypt root volumes without manually entering your passwords. If you have built a highly automated cloud environment, build upon that work with automated encryption. If you are using Linux, try Network Bound Disk Encryption (NBDE), which works on both physical and virtual machines. Bonus: make the TPM part of NBDE to provide two layers of security (NBDE helps protect networked environments, while the TPM works on premises).
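The controls above work at the disk and platform level. As a purely illustrative complement, the short Python sketch below shows application-level encryption at rest with the cryptography package's Fernet recipe. It is not a substitute for LUKS, TPM, or NBDE, and the sample record and key handling are simplified for clarity; in practice the key would live in a key-management service or HSM, never beside the data.

from cryptography.fernet import Fernet

# Generate a symmetric key; in practice keep it in a key-management service
# or HSM, never next to the data it protects.
key = Fernet.generate_key()
fernet = Fernet(key)

plaintext = b"customer-record: account=12345, balance=100.00"   # sample data
ciphertext = fernet.encrypt(plaintext)   # what actually lands on disk or object storage

# Later, an authorized service holding the key can recover the data.
assert fernet.decrypt(ciphertext) == plaintext
print("round trip OK, ciphertext length:", len(ciphertext))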

Protect your data in motion:

Encrypt your network session. Data in motion is at a much higher risk of interception and alteration. Try the Internet Protocol Security (IPsec) which is an extension of the Internet Protocol that uses cryptography.

Select products that already implement security standards.  Look for products that support the Federal Information Processing Standard (FIPS) Publication 140-2 which uses cryptographic modules to protect high-risk data.

Pathways to Multicloud Transformation

Automation

To appreciate why automation is a natural fit for hybrid clouds, consider the drawbacks of manual monitoring and patching.

Manual monitoring for security and compliance often has more risks than rewards. Manual patches and configuration management risk being implemented asynchronously. It also makes implementing self-service systems more difficult. If there is a security breach, records of manual patches and configurations risk being lost and can lead to team in-fighting and finger-pointing. Additionally, manual processes tend to be more error prone and take more time.

Automation, by contrast, allows you to stay ahead of risks, rather than react to them. Automation gives you the ability to set rules, share, and verify processes which ultimately make it easier to pass security audits. As you evaluate your hybrid cloud environments, think about automating the following processes:

Assembling your cloud orchestra: A field guide to multi-cloud management


  • Monitoring your environments
  • Checking for compliance
  • Implementing patches
  • Implementing custom or regulatory security baselines

Orchestration

Cloud orchestration goes a step further. You can think of automation as defining specific ingredients, and orchestration as a cookbook of recipes that bring the ingredients together.

Orchestration makes it possible to manage cloud resources and their software components as a single unit, and then deploy them in an automated, repeatable way through a template.

Orchestration’s biggest boon to security is standardization. You can deliver the flexibility of the cloud while still making sure the systems deployed meet your standards for security and compliance.

CIO Think Tank: Pathways to Multi-Cloud Transformation

Access control

Hybrid clouds also depend on access control. Restrict user accounts to only the privileges they need and consider requiring two-factor authentication. Limiting access to users connected to a Virtual Private Network (VPN) can also help you maintain security standards.
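As a rough illustration of least-privilege thinking (not any specific cloud provider's RBAC model), the toy Python check below grants each role only the permissions it explicitly needs and denies everything else by default. The role and permission names are invented for the example.

# Each role gets only the permissions it needs; anything not listed is denied.
ROLE_PERMISSIONS = {
    "reader":   {"vm:read"},
    "operator": {"vm:read", "vm:restart"},
    "admin":    {"vm:read", "vm:restart", "vm:delete", "rbac:assign"},
}

def is_allowed(role: str, permission: str) -> bool:
    """Return True only if the role explicitly includes the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

assert is_allowed("operator", "vm:restart")
assert not is_allowed("reader", "vm:delete")       # least privilege: deny by default
print("access checks passed")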

Endpoint security

Endpoint security often means using software to remotely revoke access or wipe sensitive data if a user’s smartphone, tablet, or computer gets lost, stolen, or hacked.

Users can connect to a hybrid cloud with personal devices from anywhere, making endpoint security an essential control. Adversaries may target your systems with phishing attacks on individual users and malware that compromises individual devices.

We’re listing it here as a technical control, but endpoint security combines physical, technical and administrative controls: Keep physical devices secure, use technical controls to limit the risks if a device falls into the wrong hands, and train users in good security practices.

Hybrid- and Multi-Cloud by design - IBM Cloud and your journey to Cloud

Administrative controls for hybrid cloud security

Lastly, administrative controls in hybrid cloud security are implemented to account for human factors. Because hybrid cloud environments are highly connected, security is every user’s responsibility.

Disaster preparedness and recovery are an example of an administrative control. If part of your hybrid cloud is knocked offline, who’s responsible for what actions? Do you have protocols in place for data recovery?

Hybrid architecture offers significant advantages for administrative security. With your resources potentially distributed among on-site and off-site hardware, you have options for backups and redundancies. In hybrid clouds that involve public and private clouds, you can fail over to the public cloud if a system on your private data center cloud fails.

The IBM Cloud is the cloud made for business



More Information:

https://www.redhat.com/en/topics/cloud-computing/what-is-hybrid-cloud

https://www.redhat.com/en/engage/cloud-native-meets-hybrid-cloud-strategy-guide?intcmp=701f2000001OMH6AAO

https://www.redhat.com/en/topics/cloud-computing/what-is-multicloud

https://www.ibm.com/cloud/hybrid

https://www.ibm.com/cloud/blog/ibm-cloud-raises-the-bar-for-secure-high-performance-computing

https://thenewstack.io/ibms-red-hat-buy-aims-to-bring-the-hybrid-cloud-to-the-enterprise/

https://www.redhat.com/en/engage/boost-hybrid-cloud-security

https://www.redhat.com/en/topics/security/what-is-hybrid-cloud-security

https://www.harbourit.com.au/everything-you-need-to-know-about-hybrid-cloud/

https://www.redhat.com/en/topics/security




Azure Well-Architected Framework Best Practices


Microsoft Azure Well-Architected Framework

The Azure Well-Architected Framework is a set of guiding tenets that can be used to improve the quality of a workload. The framework consists of five pillars of architecture excellence: Cost Optimization, Operational Excellence, Performance Efficiency, Reliability, and Security.

To assess your workload using the tenets found in the Microsoft Azure Well-Architected Framework, see the Microsoft Azure Well-Architected Review.



Cost Optimization

When you are designing a cloud solution, focus on generating incremental value early. Apply the principles of Build-Measure-Learn to accelerate your time to market while avoiding capital-intensive solutions. Use the pay-as-you-go strategy for your architecture, and invest in scaling out, rather than delivering a large-investment first version. Consider opportunity costs in your architecture, and the balance between first-mover advantage versus "fast follow".

Cost Guidance

  • Review cost principles
  • Develop a cost model
  • Create budgets and alerts
  • Review the cost optimization checklist

A cost-effective workload is driven by business goals and the return on investment (ROI) while staying within a given budget. The principles of cost optimization are a series of important considerations that can help achieve both business objectives and cost justification.

Use the Azure Well Architected Framework to optimize your workload

To assess your workload using the tenets found in the Azure Well-Architected Framework, see the Microsoft Azure Well-Architected Review.

Keep within the cost constraints

Every design choice has cost implications. Before choosing an architectural pattern, Azure service, or a price model for the service, consider the budget constraints set by the company. As part of design, identify acceptable boundaries on scale, redundancy, and performance against cost. After estimating the initial cost, set budgets and alerts at different scopes to measure the cost. One cost driver can be unrestricted resources; these typically need to scale and consume more cost to meet demand.

Aim for scalable costs

A key benefit of the cloud is the ability to scale dynamically. The workload cost should scale linearly with demand. You can save cost through automatic scaling. Consider the usage metrics and performance to determine the number of instances. Choose smaller instances for a highly variable workload and scale out to get the required level of performance, rather than up. This choice will enable you to make your cost calculations and estimates granular.

Pay for consumption

Adopt a leasing model instead of owning infrastructure. Azure offers many SaaS and PaaS resources that simplify overall architecture. The cost of hardware, software, development, operations, security, and data center space is included in the pricing model.

Also, choose pay-as-you-go over fixed pricing. That way, as a consumer, you're charged for only what you use.
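To see why this matters, here is a back-of-the-envelope Python comparison of pay-as-you-go versus an always-on, fixed-price deployment for a workload that is only busy part of the day. The hourly rate, flat monthly cost, and usage pattern are invented for illustration and are not real Azure prices.

# Hypothetical prices and usage, for illustration only.
PAYG_RATE_PER_HOUR = 0.50     # assumed on-demand price per instance-hour
FIXED_MONTHLY_COST = 900.00   # assumed flat price for one always-on instance

busy_hours_per_day = 8        # workload only needed during business hours
instances_when_busy = 3
days_per_month = 30

payg_cost = PAYG_RATE_PER_HOUR * busy_hours_per_day * instances_when_busy * days_per_month
fixed_cost = FIXED_MONTHLY_COST * instances_when_busy   # sized for peak, running 24/7

print(f"pay-as-you-go: ${payg_cost:,.2f} per month")
print(f"fixed pricing: ${fixed_cost:,.2f} per month")
print("cheaper option:", "pay-as-you-go" if payg_cost < fixed_cost else "fixed")

With these assumed numbers the pay-as-you-go bill is a fraction of the always-on cost, which is exactly the effect of paying only for what you use.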

Right resources, right size

Choose the right resources that are aligned with business goals and can handle the performance of the workload. An inappropriate or misconfigured service can impact cost. For example, building a multi-region service when the service levels don't require high-availability or geo-redundancy will increase cost without any reasonable business justification.

Certain infrastructure resources are delivered as fixed-size building blocks. Ensure that these blocks are adequately sized to meet capacity demand and deliver the expected performance without wasting resources.

Monitor and optimize

Treat cost monitoring and optimization as a process, rather than a point-in-time activity. Conduct regular cost reviews and measure and forecast the capacity needs so that you can provision resources dynamically and scale with demand. Review the cost management recommendations and take action.

If you're just starting in this process, review how to enable success during a cloud adoption journey.

Operational Excellence

This pillar covers the operations processes that keep an application running in production. Deployments must be reliable and predictable. They should be automated to reduce the chance of human error. They should be a fast and routine process, so they don't slow down the release of new features or bug fixes. Equally important, you must be able to quickly roll back or roll forward if an update has problems.

Monitoring and diagnostics are crucial. Cloud applications run in a remote data-center where you do not have full control of the infrastructure or, in some cases, the operating system. In a large application, it's not practical to log into VMs to troubleshoot an issue or sift through log files. With PaaS services, there may not even be a dedicated VM to log into. Monitoring and diagnostics give insight into the system, so that you know when and where failures occur. 

Configure and Manage Azure Virtual Networking

All systems must be observable. Use a common and consistent logging schema that lets you correlate events across systems.
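As a small illustration of a consistent logging schema, the Python sketch below emits JSON records that share a correlation ID across services so events can be joined later in a central store. The field names and service names are conventions chosen for this example, not a prescribed schema.

import json
import logging
import uuid
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_event(service: str, level: str, message: str, correlation_id: str) -> None:
    """Emit one JSON log record that can be correlated across systems."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "service": service,
        "level": level,
        "message": message,
        "correlation_id": correlation_id,   # same id for every hop of one request
    }
    logging.getLogger(service).info(json.dumps(record))

request_id = str(uuid.uuid4())
log_event("web-frontend", "INFO", "order received", request_id)
log_event("billing-api", "INFO", "payment authorized", request_id)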

The monitoring and diagnostics process has several distinct phases:

  • Instrumentation. Generating the raw data, from application logs, web server logs, diagnostics built into the Azure platform, and other sources.
  • Collection and storage. Consolidating the data into one place.
  • Analysis and diagnosis. To troubleshoot issues and see the overall health.
  • Visualization and alerts. Using telemetry data to spot trends or alert the operations team.

Use the DevOps checklist to review your design from a management and DevOps standpoint.

Performance efficiency

Performance efficiency is the ability of your workload to scale to meet the demands placed on it by users in an efficient manner. The main ways to achieve this are by using scaling appropriately and implementing PaaS offerings that have scaling built in.

There are two main ways that an application can scale. Vertical scaling (scaling up) means increasing the capacity of a resource, for example by using a larger VM size. Horizontal scaling (scaling out) is adding new instances of a resource, such as VMs or database replicas.

Demystifying Azure Cloud Adoption and Well-Architected Frameworks

Horizontal scaling has significant advantages over vertical scaling:

  • True cloud scale. Applications can be designed to run on hundreds or even thousands of nodes, reaching scales that are not possible on a single node.
  • Horizontal scale is elastic. You can add more instances if load increases, or remove them during quieter periods.
  • Scaling out can be triggered automatically, either on a schedule or in response to changes in load.
  • Scaling out may be cheaper than scaling up. Running several small VMs can cost less than a single large VM.
  • Horizontal scaling can also improve resiliency, by adding redundancy. If an instance goes down, the application keeps running.

An advantage of vertical scaling is that you can do it without making any changes to the application. But at some point you'll hit a limit, where you can't scale up any more. At that point, any further scaling must be horizontal.

Horizontal scale must be designed into the system. For example, you can scale out VMs by placing them behind a load balancer. But each VM in the pool must be able to handle any client request, so the application must be stateless or store state externally (say, in a distributed cache). Managed PaaS services often have horizontal scaling and autoscaling built in. The ease of scaling these services is a major advantage of using PaaS services.
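As a rough sketch of how a scale-out decision can be made, the Python function below computes an instance count from a target per-instance capacity and keeps a minimum for redundancy. The request rates and per-instance capacity are illustrative assumptions, not values taken from any Azure autoscale rule.

import math

def required_instances(requests_per_second: float,
                       capacity_per_instance: float = 200.0,
                       min_instances: int = 2) -> int:
    """Scale out to meet demand while keeping a minimum for redundancy."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(needed, min_instances)

for load in (50, 450, 1800):                       # requests per second
    print(f"{load:5d} req/s -> {required_instances(load)} instance(s)")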

Just adding more instances doesn't mean an application will scale, however. It might simply push the bottleneck somewhere else. For example, if you scale a web front end to handle more client requests, that might trigger lock contentions in the database. You would then need to consider additional measures, such as optimistic concurrency or data partitioning, to enable more throughput to the database.

Always conduct performance and load testing to find these potential bottlenecks. The stateful parts of a system, such as databases, are the most common cause of bottlenecks, and require careful design to scale horizontally.

Resolving one bottleneck may reveal other bottlenecks elsewhere.

Azure Well-Architected Framework Overview

Reliability

A reliable workload is one that is both resilient and available. Resiliency is the ability of the system to recover from failures and continue to function. The goal of resiliency is to return the application to a fully functioning state after a failure occurs. Availability is whether your users can access your workload when they need to.

In traditional application development, there has been a focus on increasing the mean time between failures (MTBF). Effort was spent trying to prevent the system from failing. In cloud computing, a different mindset is required, due to several factors:

  • Distributed systems are complex, and a failure at one point can potentially cascade throughout the system.
  • Costs for cloud environments are kept low through the use of commodity hardware, so occasional hardware failures must be expected.
  • Applications often depend on external services, which may become temporarily unavailable or throttle high-volume users.
  • Today's users expect an application to be available 24/7 without ever going offline.

Introduction to Microsoft Azure Well-Architected Framework - Vaibhav Gujral

All of these factors mean that cloud applications must be designed to expect occasional failures and recover from them. Azure has many resiliency features already built into the platform. For example:

  • Azure Storage, SQL Database, and Cosmos DB all provide built-in data replication, both within a region and across regions.
  • Azure managed disks are automatically placed in different storage scale units to limit the effects of hardware failures.
  • VMs in an availability set are spread across several fault domains. A fault domain is a group of VMs that share a common power source and network switch. Spreading VMs across fault domains limits the impact of physical hardware failures, network outages, or power interruptions.

That said, you still need to build resiliency into your application. Resiliency strategies can be applied at all levels of the architecture. Some mitigations are more tactical in nature — for example, retrying a remote call after a transient network failure. Other mitigations are more strategic, such as failing over the entire application to a secondary region. Tactical mitigations can make a big difference. While it's rare for an entire region to experience a disruption, transient problems such as network congestion are more common — so target these first. Having the right monitoring and diagnostics is also important, both to detect failures when they happen, and to find the root causes.
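A classic tactical mitigation is a retry with exponential backoff. The Python sketch below is a minimal, generic version of that pattern, assuming a hypothetical flaky_call() stand-in for any remote dependency; production code would typically use a library or the retry policies built into the relevant SDK.

import random
import time

class TransientError(Exception):
    """A failure that is expected to clear up on its own."""

def flaky_call() -> str:
    """Hypothetical stand-in for a remote dependency (database, queue, API)."""
    if random.random() < 0.3:                      # simulate intermittent trouble
        raise TransientError("temporary network congestion")
    return "ok"

def call_with_retries(max_attempts: int = 5, base_delay: float = 0.5) -> str:
    """Retry with exponential backoff and a little jitter, then give up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return flaky_call()
        except TransientError as exc:
            if attempt == max_attempts:
                raise                              # escalate to the caller
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            time.sleep(delay)

if __name__ == "__main__":
    try:
        print("result:", call_with_retries())
    except TransientError:
        print("dependency stayed down; surfacing the failure to the caller")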

When designing an application to be resilient, you must understand your availability requirements. How much downtime is acceptable? This is partly a function of cost. How much will potential downtime cost your business? How much should you invest in making the application highly available?

Security

Think about security throughout the entire lifecycle of an application, from design and implementation to deployment and operations. The Azure platform provides protections against a variety of threats, such as network intrusion and DDoS attacks. But you still need to build security into your application and into your DevOps processes.

Here are some broad security areas to consider.

Identity management

Consider using Azure Active Directory (Azure AD) to authenticate and authorize users. Azure AD is a fully managed identity and access management service. You can use it to create domains that exist purely on Azure, or integrate with your on-premises Active Directory identities. Azure AD also integrates with Office 365, Dynamics CRM Online, and many third-party SaaS applications. For consumer-facing applications, Azure Active Directory B2C lets users authenticate with their existing social accounts (such as Facebook, Google, or LinkedIn), or create a new user account that is managed by Azure AD.

If you want to integrate an on-premises Active Directory environment with an Azure network, several approaches are possible, depending on your requirements. For more information, see our Identity Management reference architectures.

Protecting your infrastructure

Control access to the Azure resources that you deploy. Every Azure subscription has a trust relationship with an Azure AD tenant. Use Azure role-based access control (Azure RBAC) to grant users within your organization the correct permissions to Azure resources. Grant access by assigning Azure roles to users or groups at a certain scope. The scope can be a subscription, a resource group, or a single resource. Audit all changes to infrastructure.

Application security

In general, the security best practices for application development still apply in the cloud. These include things like using SSL everywhere, protecting against CSRF and XSS attacks, preventing SQL injection attacks, and so on.

Cloud applications often use managed services that have access keys. Never check these into source control. Consider storing application secrets in Azure Key Vault.
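As a hedged sketch of that advice, the Python snippet below reads a secret at runtime from Azure Key Vault using the azure-identity and azure-keyvault-secrets packages rather than embedding a key in source control. The vault URL and secret name are placeholders for your own resources, and DefaultAzureCredential assumes a managed identity or developer login is available.

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

VAULT_URL = "https://my-demo-vault.vault.azure.net"      # hypothetical vault

credential = DefaultAzureCredential()                    # env vars, managed identity, or az login
client = SecretClient(vault_url=VAULT_URL, credential=credential)

secret = client.get_secret("storage-connection-string")  # hypothetical secret name
connection_string = secret.value                         # use it; never log or commit it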

Data sovereignty and encryption

Make sure that your data remains in the correct geopolitical zone when using Azure data services. Azure's geo-replicated storage uses the concept of a paired region in the same geopolitical region.

Use Key Vault to safeguard cryptographic keys and secrets. By using Key Vault, you can encrypt keys and secrets by using keys that are protected by hardware security modules (HSMs). Many Azure storage and DB services support data encryption at rest, including Azure Storage, Azure SQL Database, Azure Synapse Analytics, and Cosmos DB.

Evaluate and optimize your costs using the Microsoft Azure Well-Architected Framework

Diving deeper into Azure workload reliability (Part 1) | Well-Architected Framework

Operational excellence principles

Considering and improving how software is developed, deployed, operated, and maintained is one part of achieving a higher competency in operations. Equally important is providing a team culture of experimentation and growth, solutions for rationalizing the current state of operations, and incident response plans. The principles of operational excellence are a series of considerations that can help achieve excellent operational practices.

Azure Well-Architected Framework Operational Excellence 1

To assess your workload using the tenets found in the Azure Well-Architected Framework, see the Microsoft Azure Well-Architected Review.

DevOps methodologies

The contraction of "Dev" and "Ops" refers to replacing siloed Development and Operations to create multidisciplinary teams that now work together with shared and efficient practices and tools. Essential DevOps practices include agile planning, continuous integration, continuous delivery, and monitoring of applications.

Separation of roles

A DevOps model positions the responsibility of operations with developers. Still, many organizations do not fully embrace DevOps and maintain some degree of team separation between operations and development, either to enforce clear segregation of duties for regulated environments or to share operations as a business function.

Team collaboration

It is essential to understand if developers are responsible for production deployments end-to-end, or if a handover point exists where responsibility is passed to an alternative operations team, potentially to ensure strict segregation of duties such as the Sarbanes-Oxley Act where developers cannot touch financial reporting systems.

It is crucial to understand how operations and development teams collaborate to address operational issues and what processes exist to support and structure this collaboration. Moreover, mitigating issues might require various teams outside of development or operations, such as networking and external parties. The processes to support this collaboration should also be understood.

Workload isolation

The goal of workload isolation is to associate an application's specific resources to a team to independently manage all aspects of those resources.

Operational lifecycles

Reviewing operational incidents where the response and remediation to issues either failed or could have been optimized is vital to improving overall operational effectiveness. Failures provide a valuable learning opportunity, and in some cases, these learnings can also be shared across the entire organization. Finally, operational procedures should be updated based on outcomes from frequent testing.

Operational metadata

Azure Tags provide the ability to associate critical metadata as name-value pairs, such as billing information (e.g., cost center code) and environment information (e.g., environment type), with Azure resources, resource groups, and subscriptions. See Tagging Strategies for best practices.
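One way to make a tagging strategy stick is to validate tags in code before resources are provisioned. The small Python sketch below checks for a set of required tag keys and allowed environment values; the keys and values are examples only, not an official taxonomy.

REQUIRED_TAGS = {"costCenter", "environment", "owner"}   # example policy
ALLOWED_ENVIRONMENTS = {"dev", "test", "prod"}

def validate_tags(tags: dict) -> list:
    """Return a list of problems; an empty list means the tags pass the policy."""
    problems = [f"missing tag: {key}" for key in REQUIRED_TAGS - tags.keys()]
    environment = tags.get("environment")
    if environment and environment not in ALLOWED_ENVIRONMENTS:
        problems.append(f"unknown environment: {environment}")
    return problems

print(validate_tags({"costCenter": "CC-1234", "environment": "prod", "owner": "dba-team"}))
print(validate_tags({"environment": "staging"}))          # flags the gaps in this one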

Azure Well-Architected Framework Operational Excellence 2

Optimize build and release processes

This spans provisioning with Infrastructure as Code, building and releasing with CI/CD pipelines, automated testing, and embracing software engineering disciplines across your entire environment. This approach ensures that the creation and management of environments throughout the software development lifecycle is consistent and repeatable, and enables early detection of issues.

Monitor the entire system and understand operational health

Implement systems and processes to monitor build and release processes, infrastructure health, and application health. Telemetry is critical to understanding the health of a workload and whether the service is meeting the business goals.

Rehearse recovery and practice failure

Run DR drills on a regular cadence and use engineering practices to identify and remediate weak points in application reliability. Regular rehearsal of failure will validate the effectiveness of recovery processes and ensure teams are familiar with their responsibilities.

Embrace operational improvement

Continuously evaluate and refine operational procedures and tasks while striving to reduce complexity and ambiguity. This approach enables an organization to evolve processes over time, optimizing inefficiencies, and learning from failures.

Use loosely coupled architecture

Enable teams to independently test, deploy, and update their systems on demand without depending on other teams for support, services, resources, or approvals.

Incident management

When incidents occur, have well thought out plans and solutions for incident management, incident communication, and feedback loops. Take the lessons learned from each incident and build telemetry and monitoring elements to prevent future occurrences.

Diving deeper into Azure workload reliability (Part 2) | Well-Architected Framework

Overview of the performance efficiency pillar

Performance efficiency is the ability of your workload to scale to meet the demands placed on it by users in an efficient manner. Before the cloud became popular, when it came to planning how a system would handle increases in load, many organizations intentionally provisioned workloads to be oversized to meet business requirements. This might make sense in on-premises environments because it ensured capacity during peak usage. Capacity reflects resource availability (CPU and memory). This was a major consideration for processes that would be in place for a number of years.

Just as you needed to anticipate increases in load in on-premises environments, you need to anticipate increases in cloud environments to meet business requirements. One difference is that you may no longer need to make long-term predictions for anticipated changes to ensure that you will have enough capacity in the future. Another difference is in the approach used to manage performance.

What is scalability and why is it important?

An important consideration in achieving performance efficiency is to consider how your application scales and to implement PaaS offerings that have built-in scaling operations. Scalability is the ability of a system to handle increased load. Services covered by Azure Autoscale can scale automatically to match demand to accommodate workload. They will scale out to ensure capacity during workload peaks and scaling will return to normal automatically when the peak drops.

In the cloud, the ability to take advantage of scalability depends on your infrastructure and services. Some platforms, such as Kubernetes, were built with scaling in mind. Virtual machines, on the other hand, may not scale as easily although scale operations are possible. With virtual machines, you may want to plan ahead to avoid scaling infrastructure in the future to meet demand. Another option is to select a different platform such as Azure virtual machines scale sets.

When using scalability, you need only predict the current average and peak times for your workload. Payment plan options allow you to manage this. You pay either per minute or per hour, depending on the service, for a designated time period.

Principles

Follow these principles to guide you through improving performance efficiency:

  • Become Data-driven - Embrace a data-driven culture to deliver timely insights to everyone in your organization across all your data. To harness this culture, get the best performance from your analytics solution across all your data, ensure data has the security and privacy needed for your business environment, and make sure you have tools that enable everyone in your organization to gain insights from your data.
  • Avoid antipatterns - A performance antipattern is a common practice that is likely to cause scalability problems when an application is under pressure. For example, you can have an application that behaves as expected during performance testing. However, when it is released to production and starts to handle live workloads, performance decreases. Scalability problems such as rejecting user requests, stalling, or throwing exceptions may arise. To learn how to identify and fix these antipatterns, see Performance antipatterns for cloud applications.
  • Perform load testing to set limits - Load testing helps ensure that your applications can scale and do not go down during peak traffic. Load test each application to understand how it performs at various scales. To learn about Azure service limits, see Managing limits.
  • Understand billing for metered resources - Your business requirements will determine the tradeoffs between cost and level of performance efficiency. Azure doesn't directly bill based on the resource cost. Charges for a resource, such as a virtual machine, are calculated by using one or more meters. Meters are used to track a resource’s usage over time. These meters are then used to calculate the bill.
  • Monitor and optimize - Lack of monitoring new services and the health of current workloads are major inhibitors in workload quality. The overall monitoring strategy should take into consideration not only scalability, but resiliency (infrastructure, application, and dependent services) and application performance as well. For purposes of scalability, looking at the metrics would allow you to provision resources dynamically and scale with demand.


Microsoft Azure Well-Architected - Joni Leskinen Sr. Cloud Solution Architect

Principles of the reliability pillar

Building a reliable application in the cloud is different from traditional application development. While historically you may have purchased levels of redundant higher-end hardware to minimize the chance of an entire application platform failing, in the cloud, we acknowledge up front that failures will happen. Instead of trying to prevent failures altogether, the goal is to minimize the effects of a single failing component.

Application framework

These critical principles are used as lenses to assess the reliability of an application deployed on Azure. They provide a framework for the application assessment questions that follow.

To assess your workload using the tenets found in the Microsoft Azure Well-Architected Framework, see the Microsoft Azure Well-Architected Review.

Define and test availability and recovery targets - Availability targets, such as Service Level Agreements (SLA) and Service Level Objectives (SLO), and Recovery targets, such as Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO), should be defined and tested to ensure application reliability aligns with business requirements.

Design applications to be resistant to failures - Resilient application architectures should be designed to recover gracefully from failures in alignment with defined reliability targets.

Ensure required capacity and services are available in targeted regions - Azure services and capacity can vary by region, so it's important to understand if targeted regions offer required capabilities.

Plan for disaster recovery - Disaster recovery is the process of restoring application functionality in the wake of a catastrophic failure. It might be acceptable for some applications to be unavailable or partially available with reduced functionality for a period of time, while other applications may not be able to tolerate reduced functionality.

Design the application platform to meet reliability requirements - Designing application platform resiliency and availability is critical to ensuring overall application reliability.

Design the data platform to meet reliability requirements - Designing data platform resiliency and availability is critical to ensuring overall application reliability.

Recover from errors - Resilient applications should be able to automatically recover from errors by leveraging modern cloud application code patterns.
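One widely used code pattern behind this principle is retry with exponential backoff and jitter, so transient faults are absorbed instead of surfacing to users. The sketch below is a generic Python illustration with assumed limits; it is not a specific Azure SDK feature.

import random
import time

def call_with_retries(operation, max_attempts=5, base_delay=0.5):
    """Retry a transient-failure-prone operation with exponential backoff
    and jitter. 'operation' is any zero-argument callable (an assumption
    made for this sketch)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise                      # give up after the final attempt
            # Exponential backoff plus random jitter to avoid retry storms.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            time.sleep(delay)

# Usage example with a simulated flaky dependency; it may still raise if
# every attempt fails, which callers should be prepared to handle.
def flaky_dependency():
    if random.random() < 0.5:
        raise ConnectionError("transient failure")
    return "ok"

print(call_with_retries(flaky_dependency))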

Ensure networking and connectivity meet reliability requirements - Identifying and mitigating potential network bottlenecks or points of failure supports a reliable and scalable foundation over which resilient application components can communicate.

Allow for reliability in scalability and performance - Resilient applications should be able to automatically scale in response to changing load to maintain application availability and meet performance requirements.

Address security-related risks - Identifying and addressing security-related risks helps to minimize application downtime and data loss caused by unexpected security exposures.

Define, automate, and test operational processes - Operational processes for application deployment, such as roll-forward and roll-back, should be defined, sufficiently automated, and tested to help ensure alignment with reliability targets.

Test for fault tolerance - Application workloads should be tested to validate reliability against defined reliability targets.

Monitor and measure application health - Monitoring and measuring application availability is vital to qualifying overall application health and progress towards defined reliability targets.


Azure Reliability


Overview of the security pillar


Information Security has always been a complex subject, and it evolves quickly with the creative ideas and implementations of attackers and security researchers. Security vulnerabilities originally stemmed from identifying and exploiting common programming errors and unexpected edge cases. However, over time, the attack surface that an attacker may explore and exploit has expanded well beyond that. Attackers now freely exploit vulnerabilities in system configurations, operational practices, and the social habits of the systems' users. As system complexity, connectedness, and the variety of users increase, attackers have more opportunities to identify unprotected edge cases and to "hack" systems into doing things they were not designed to do.

Security is one of the most important aspects of any architecture. It provides confidentiality, integrity, and availability assurances against deliberate attacks and abuse of your valuable data and systems. Losing these assurances can negatively impact your business operations and revenue, as well as your organization’s reputation in the marketplace. In the following series of articles, we’ll discuss key architectural considerations and principles for security and how they apply to Azure.

Security design principles

These principles support these three key strategies and describe a securely architected system hosted on cloud or on-premises datacenters (or a combination of both). Application of these principles will dramatically increase the likelihood your security architecture will maintain assurances of confidentiality, integrity, and availability.

Each recommendation in this document includes a description of why it is recommended, which maps to one or more of these principles:

Align Security Priorities to Mission – Security resources are almost always limited, so prioritize efforts and assurances by aligning security strategy and technical controls to the business using classification of data and systems. Security resources should be focused first on people and assets (systems, data, accounts, etc.) with intrinsic business value and those with administrative privileges over business critical assets.

Build a Comprehensive Strategy – A security strategy should consider investments in culture, processes, and security controls across all system components. The strategy should also consider security for the full lifecycle of system components including the supply chain of software, hardware, and services.

Drive Simplicity – Complexity in systems leads to increased human confusion, errors, automation failures, and difficulty of recovering from an issue. Favor simple and consistent architectures and implementations.

Design for Attackers – Your security design and prioritization should be focused on the way attackers see your environment, which is often not the way IT and application teams see it. Inform your security design and test it with penetration testing to simulate one-time attacks. Use red teams to simulate long-term persistent attack groups. Design your enterprise segmentation strategy and other security controls to contain attacker lateral movement within your environment. Actively measure and reduce the potential attack surface that attackers target for exploitation of resources within the environment.

Leverage Native Controls – Favor native security controls built into cloud services over external controls from third parties. Native security controls are maintained and supported by the service provider, eliminating or reducing effort required to integrate external security tooling and update those integrations over time.

Use Identity as Primary Access Control – Access to resources in cloud architectures is primarily governed by identity-based authentication and authorization for access controls. Your account control strategy should rely on identity systems for controlling access rather than relying on network controls or direct use of cryptographic keys.

Accountability – Designate clear ownership of assets and security responsibilities and ensure actions are traceable for nonrepudiation. You should also ensure entities have been granted the least privilege required (to a manageable level of granularity).

Embrace Automation - Automation of tasks decreases the chance of human error that can create risk, so both IT operations and security best practices should be automated as much as possible to reduce human errors (while ensuring skilled humans govern and audit the automation).

Focus on Information Protection – Intellectual property is frequently one of the biggest repositories of organizational value and this data should be protected anywhere it goes including cloud services, mobile devices, workstations, or collaboration platforms (without impeding collaboration that allows for business value creation). Your security strategy should be built around classifying information and assets to enable security prioritization, leveraging strong access control and encryption technology, and meeting business needs like productivity, usability, and flexibility.

Design for Resilience – Your security strategy should assume that controls will fail and design accordingly. Making your security posture more resilient requires several approaches working together:

Balanced Investment – Invest across core functions spanning the full NIST Cybersecurity Framework lifecycle (identify, protect, detect, respond, and recover) to ensure that attackers who successfully evade preventive controls lose access from detection, response, and recovery capabilities.

Ongoing Maintenance – Maintain security controls and assurances to ensure that they don’t decay over time with changes to the environment or neglect.

Ongoing Vigilance – Ensure that anomalies and potential threats that could pose risks to the organization are addressed in a timely manner.

Defense in Depth – Consider additional controls in the design to mitigate risk to the organization in the event a primary security control fails. This design should consider how likely the primary control is to fail, the potential organizational risk if it does, and the effectiveness of the additional control (especially in the likely cases that would cause the primary control to fail).

Least Privilege – This is a form of defense in depth to limit the damage that can be done by any one account. Accounts should be granted the least amount of privilege required to accomplish their assigned tasks. Restrict the access by permission level and by time. This helps mitigate the damage of an external attacker who gains access to the account and/or an internal employee who inadvertently or deliberately (for example, insider attack) compromises security assurances.

Baseline and Benchmark – To ensure your organization considers current thinking from outside sources, evaluate your strategy and configuration against external references (including compliance requirements). This helps to validate your approaches and to minimize the risk of inadvertent oversight as well as the risk of punitive fines for noncompliance.

Drive Continuous Improvement – Systems and existing practices should be regularly evaluated and improved so that they remain effective against attackers, who continuously improve their methods, and in step with the continuous digital transformation of the enterprise. This should include processes that proactively integrate learnings from real-world attacks, realistic penetration testing and red team activities, and other sources as available.

Assume Zero Trust – When evaluating access requests, all requesting users, devices, and applications should be considered untrusted until their integrity can be sufficiently validated. Access requests should be granted conditionally based on the requestor's trust level and the target resource’s sensitivity. Reasonable attempts should be made to offer means to increase trust validation (for example, request multi-factor authentication) and remediate known risks (change known-leaked password, remediate malware infection) to support productivity goals.

Educate and Incentivize Security – The humans that are designing and operating the cloud workloads are part of the whole system. It is critical to ensure that these people are educated, informed, and incentivized to support the security assurance goals of the system. This is particularly important for people with accounts granted broad administrative privileges.

Azure Well-Architected Framework Security Overview

Overview of a hybrid workload

Customer workloads are becoming increasingly complex, with many applications often running on different hardware across on-premises, multicloud, and the edge. Managing these disparate workload architectures, ensuring uncompromised security, and enabling developer agility are critical to success.

Azure uniquely helps you meet these challenges, giving you the flexibility to innovate anywhere in your hybrid environment while operating seamlessly and securely. The Well-Architected Framework includes a hybrid description for each of the five pillars: cost optimization, operational excellence, performance efficiency, reliability, and security. These descriptions create clarity on the considerations needed for your workloads to operate effectively across hybrid environments.

Adopting a hybrid model offers multiple solutions that enable you to confidently deliver hybrid workloads: run Azure data services anywhere, modernize applications anywhere, and manage your workloads anywhere.

Extend Azure management to any infrastructure

 Tip

Applying the principles in this article series to each of your workloads will better prepare you for hybrid adoption. For larger or centrally managed organizations, hybrid and multicloud are commonly part of a broader strategic objective. If you need to scale these principles across a portfolio of workloads using hybrid and multicloud environments, you may want to start with the Cloud Adoption Framework's hybrid and multicloud scenario and best practices. Then return to this series to refine each of your workload architectures.


Use Azure Arc enabled infrastructure to extend Azure management to any infrastructure in a hybrid environment. Key features of Azure Arc enabled infrastructure are:

Unified Operations

  • Organize resources such as virtual machines, Kubernetes clusters and Azure services deployed across your entire IT environment.
  • Manage and govern resources with a single pane of glass from Azure.
  • Integrated with Azure Lighthouse for managed service provider support.

Adopt cloud practices

Easily adopt DevOps techniques such as infrastructure as code.

Empower developers with self-service and choice of tools.

Standardize change control with configuration management systems, such as GitOps and DSC.


More Information:

https://www.microsoft.com/en-us/us-partner-blog/tag/well-architected-framework/

https://docs.microsoft.com/en-us/learn/paths/azure-well-architected-framework/

https://www.microsoft.com/azure/partners/well-architected#well-architected-framework

https://azure.microsoft.com/en-us/blog/introducing-the-microsoft-azure-wellarchitected-framework/

https://docs.microsoft.com/en-us/azure/architecture/framework/

https://www.microsoft.com/en-us/us-partner-blog/2021/01/26/an-introduction-to-azures-well-architected-framework/

https://www.capgemini.com/2020/10/microsoft-azure-well-architected-framework/

https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/innovate/considerations/adoption

https://azure.microsoft.com/en-us/features/reliability/#features

https://docs.microsoft.com/en-us/azure/cost-management-billing/

https://azureinfohub.azurewebsites.net/Service/Videos?serviceTitle=Azure%20Cost%20Management

https://docs.microsoft.com/en-us/azure/architecture/

https://cloudsecurityalliance.org/blog/2020/08/26/shared-responsibility-model-explained/

https://docs.microsoft.com/en-us/azure/architecture/browse/

https://visualstudiomagazine.com/articles/2020/08/04/azure-well-architected-framework.aspx

https://docs.microsoft.com/en-us/azure/architecture/framework/scalability/test-checklist

https://docs.microsoft.com/en-us/learn/modules/azure-well-architected-security/


As-a-Service: What it is and how it's changing the face of IT and business (XaaS Business Model)



Anything-as-a-Service (XaaS) solutions, Is your business ready?

Everything as a service is increasingly becoming the preferred business model. Thanks to major platforms like Uber for ridesharing, Netflix for streaming video and Google for cloud services, businesses are now seeing the benefits of embracing on-demand, ‘as a service’ business models.

Everything as a service uses X as a placeholder for any kind of product, meaning that you don’t sell the product itself but charge for the usage or the output of the product, such as pay-per-use or a monthly flat fee, like Uber or Netflix, respectively. In financial terms, the customer exchanges capital expenses for operational expenses. Although XaaS sounds like it’s a standard leasing or renting model, that is not the case.
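Put in numbers, the capex-to-opex shift looks something like the Python sketch below; every price and usage figure is invented purely to illustrate how the three payment models compare over a planning horizon.

# Illustrative comparison of owning a product outright (capex) versus
# consuming it as a service (opex). All figures below are assumptions.
purchase_price = 12_000            # one-off capital expense
monthly_flat_fee = 400             # subscription model
price_per_use = 2.50               # pay-per-use rate
expected_uses_per_month = 120
months = 36                        # planning horizon

own_outright_total = purchase_price
flat_fee_total = monthly_flat_fee * months
pay_per_use_total = price_per_use * expected_uses_per_month * months

print(f"Own outright: {own_outright_total:>8.0f}")
print(f"Flat fee:     {flat_fee_total:>8.0f}")
print(f"Pay per use:  {pay_per_use_total:>8.0f}")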

Digital Transformation With XaaS (Everything-as-a-Service)

The concept isn't new, either. Sixty years ago, the Xerox Corporation (at the time named Haloid) introduced a new business model to make widescale office use of its copy machines affordable: leasing the machines rather than selling them. Xerox supplied the machine, service, and support, included a specific number of copies in the agreement, and charged for usage above and beyond that amount.

Today, anything as a service business models are based on the supplier taking on the responsibility for the data analysis and maintenance of the service and using information via the Internet of Things (IoT) to provide real-time upgrades and improvements.

Rolls-Royce, an early adopter of the XaaS model with its turbine engines, charges aerospace customers a fixed price for the number of hours they fly. Maintenance is covered, engine downtime is reduced, and companies have a fixed, predictable cost. These engines also come loaded with IoT sensors. For example, Pratt & Whitney's Geared Turbo Fan (GTF) engine is fitted with 5,000 sensors that generate up to 10 GB of data per second. As the cloud and IoT become available to a much broader audience, XaaS is becoming a central pillar of business transformation.
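A little arithmetic puts those sensor figures in context. The Python sketch below uses only the peak rate quoted above to estimate the data a single engine could generate per flight hour; it is a back-of-the-envelope illustration, not a vendor specification.

# Back-of-the-envelope data volume for the sensor rate quoted above.
peak_rate_gb_per_second = 10           # "up to 10 GB of data per second"
seconds_per_hour = 3600

gb_per_flight_hour = peak_rate_gb_per_second * seconds_per_hour
print(f"At peak, one engine could produce ~{gb_per_flight_hour:,} GB "
      f"(~{gb_per_flight_hour / 1000:.0f} TB) of sensor data per flight hour.")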

The machine itself must be adaptable either through modularization or offering open standards, such as using a digital twin to engineer the product before integration begins.

A successful everything as a service business model must do the following:

Avoid downtimes. If the machine isn’t capable of working 24/7 or you can’t deliver it on time, you lose money. Predicting failures can help machine builders know when something may likely need maintenance or repairs, including access to spare parts.

Improve service efficiency. This is a point where you either gain or lose money. Parts and service experts must be available to manage any issues and ensure the machine has minimal downtime and up-to-date functionality.

Performance financing approach. Ideally, XaaS offers a model as flexible as pay-per-use, meaning you pay for what has actually been used from the machine. Metering that usage also establishes current utilization and the value of the service.

Positive user experience. When selling a service, a high retention rate is essential to success. Interfaces need to be available on desktop and mobile devices so stakeholders can view and share information with ease, and there must be a comprehensive transformation of how the business operates, including training for accounting, R&D, and sales support, since they will be selling services rather than products.

HPE CEO Pledges to Sell ‘Everything as a Service’ by 2022

In its boldest move yet to make on-prem IT more like public cloud, the company says GreenLake is its future.

Three years from now, every product Hewlett Packard Enterprise sells will be available as a service. That’s the pledge CEO Antonio Neri made from stage Tuesday afternoon during his keynote at the company’s Discover conference in Las Vegas. The pledge covers both hardware and software in the enterprise tech giant’s sprawling portfolio.

Disrupted by public cloud providers like Amazon Web Services and Microsoft Azure, market incumbents like HPE, Dell, and IBM have all been looking for ways to bring the experience of using their products closer to the experience of using cloud services. That experience includes not having to sink capital in infrastructure, not spending money to run and maintain it, and paying only for what you use. It also includes a constant stream of new features to select from and frequent upgrades to the latest hardware.

“In the next three years HPE will be a consumption-driven company and everything delivered to you will be delivered as a service,” Neri said. “You choose what you want, where you want it, and only pay for what you consume.”

The company started on this path in 2017, when it launched GreenLake, the overarching brand for HPE’s on-premises solutions offered as a service. Instead of buying a hardware system and the necessary infrastructure management software to support SAP Hana, for example, a customer can have HPE deploy the system in their data center, manage it, and provide it to them as a service, the same way AWS provides its myriad of services, with the key difference being that HPE’s services are running out of the customer’s own facilities, not AWS’s.

According to Neri, the GreenLake business, now serving about 600 customers, has been growing faster than any other HPE business. “We now deliver HPE GreenLake in 56 countries and lead the industry in consumption-based services on-premises,” he said. “This is HPE’s fastest-growing business.”

Dell Technologies, HPE’s biggest rival in the data center market, earlier this year rolled out an as-a-service offering for on-premises hardware, a service operated by its subsidiary VMware. The service is for Dell EMC’s VxRail hyperconverged infrastructure, combined with VMware Cloud Foundation, the software stack through which VMware extends between on-prem environments and its cloud-provider partners.

Delivering everything as-a-service, whether at the edge, in the cloud, or in the datacenter

Expanding HPE GreenLake

The GreenLake portfolio already includes Azure Stack, Microsoft’s software that simulates the experience of using its public cloud on-premises. It also includes solutions like backup, databases, Big Data, and edge computing, among others.

Monday, the day before Discover kicked off, HPE announced a new hybrid cloud partnership with Google Cloud, expanding the GreenLake portfolio. The hybrid cloud will combine HPE’s ProLiant servers and Nimble storage with Anthos, Google’s recently unveiled software platform for running applications in Kubernetes-managed containers on customers’ own hardware running in their own data centers, and in Google’s public cloud. According to Google, customers will also be able to use Anthos to manage their workloads in its competitors’ clouds, such as Microsoft Azure and Amazon Web Services.

Until recently, GreenLake was only available to large enterprises. On Tuesday, however, the company announced that it’s expanding the business to also target mid-size customers.

To take more friction out of the GreenLake customer experience, companies unable or unwilling to allocate their own data center space for their HPE-as-a-Service solution can turn to one of its two new colocation partners: CyrusOne and Equinix. Besides readily available space, power, and network connectivity at their facilities, large customers get the benefit of private network links to hyperscale cloud platforms if they want to use GreenLake as part of their hybrid-cloud setups.

Also being added to GreenLake is the vast portfolio of network technologies by Aruba, the enterprise mobility specialist HP acquired in 2015. At Discover this week, HPE launched a new Aruba “Network as a Service” offering sold through GreenLake.

New Storage Box With 100 Percent Uptime Guaranteed

Of course, no HPE Discover keynote is complete without a new data center hardware rollout. Neri announced two new pieces of hardware in his talk: a storage box and a compute box.

The storage box, announced together with HPE’s chief sales officer Philip Davis, is called Primera, and it combines hardware by Nimble Storage, a company HPE bought in 2017, with InfoSight, HPE’s machine learning and analytics-heavy infrastructure management software.

In what looks set to become part of all future product announcements by HPE, the company said it will offer Primera as a service, either as a subscription or on a pay-for-what-you-use basis. But customers will have the choice to buy and manage the new storage array on their own.

The company is so confident in the InfoSight machine-learning engine's ability to stave off infrastructure issues that it's guaranteeing 100 percent availability for Primera, Davis said. "And it's standard for everyone," he added, meaning there aren't lower-tier Primera offerings that aren't guaranteed to never go down.

Primera’s designers also aimed to make it the easiest box of its kind to set up. “It delivers a consumer-grade user experience that customers can install and upgrade on their own,” Davis said. “You plug in six cables, make a few clicks, and go from rack to rack in less than 20 minutes.”

Competing storage arrays require extensive planning and room in the budget for expert services, he said. Primera upgrades take less than five minutes, he added, this time taking a direct swipe at Dell EMC by saying he challenged anyone to upgrade a Dell EMC PowerMax storage array in less than five minutes.

HPE said it will start taking orders for Primera this August.

The Machine in a ProLiant Box

According to Neri, the company is already taking orders on the compute box he announced at the show. The box is an HPE ProLiant server with HPE’s moonshot “memory-driven computing” architecture inside.

The company first unveiled the memory-driven concept in 2014, and in 2017 showed a prototype system built around it called The Machine. The idea, basically, is to create a single big pool of memory, with a single address space, interconnected with the CPU using silicon photonics, a technology that relies on light rather than electrons for communication between components in a system.

The Machine HPE showed two years ago was a 40-node cluster with 160 terabytes of memory, all with a single address space. Without sharing much detail, Neri this week promised a ProLiant-based memory-driven development platform that combines CPU, accelerators, and memory, interconnected with a “photonic mesh,” and said the company was “now taking orders” for the box.

The SaaS business model & metrics: Understand the key drivers for success

As-a-Service: What it is and how it's changing the face of IT and business

As organizations push forward with their transformation efforts, as-a-Service computing is laying the foundation for greater agility, flexibility, speed, and more. In this Technology Untangled episode, experts discuss the evolution of cloud-based IT services and how the model is revolutionizing the way businesses compute―and compete.

Everyone's talking about as-a-Service computing models, and with good reason: As-a-Service offers businesses a long list of benefits, including greater agility, rapid elasticity, on-demand consumption, and more.

From platform and infrastructure to storage and networking, as-a-Service has upended traditional IT―and leveled the playing field in business.

"What the cloud has given us is the ability to move fast, and what the public cloud has done is it's made those services available to anyone," says Tony Clement, strategic hybrid cloud adviser at HPE Pointnext Services. "So, as a small business, I can compete from a compute perspective. I can compete with just about anybody."

In this episode of Technology Untangled, Clement joins colleagues Paul Kennedy, business development manager for HPE's advisory and professional services team, and Reuben Melville and Scott Thomson from HPE's GreenLake team to explain why as-a-Service is key to today's transformation efforts. They unravel all the terminology and buzz around public, private, and on-premises cloud models; explain why hybrid often provides the best of both worlds; and explore the as-a-Service capabilities that can help your business better compete.

Excerpts from the podcast follow:

Bird: Infrastructure-as-a-Service provides the same tech and capabilities as a traditional data center, including servers, network, OS, and storage, without the need to physically maintain or manage all of it.

An infrastructure middle ground between as-a-Service and on premises is hosted infrastructure, which was pretty popular in the mid-2000s. Now, it's usually referred to as colocation. It lets organizations rent physical space for servers and other hardware.

Clement: So the SaaS model has disrupted traditional IT. And more importantly, it's created pressure on IT to move faster.

Bird: These days, pretty much everyone uses SaaS on a daily basis, from Google apps, to Netflix, to Dropbox, and for some organizations, Salesforce.

The lines between Platform- and Infrastructure-as-a-Service are becoming more and more blurred with providers such as Microsoft and Google offering services that span both.

"Today, when I see IT leaders and business leaders holding on to the past―holding on to their VHS tapes, holding on to that VCR player, holding on to those old remote controls, holding on to that old stuff―that is a cultural problem within many organizations that's inhibiting digital transformation."

TONY CLEMENT STRATEGIC ADVISER, HYBRID CLOUD, HPE POINTNEXT SERVICES

The five characteristics of cloud

Bird: All of these as-a-Service models were made possible by one innovation: cloud computing, which is defined by NIST as having five key characteristics: on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service.

These characteristics are the same, whether we're talking about the public cloud, like AWS, or the private cloud―computing services only available to select users, whether over the Internet or a private internal network.

But why are we talking about this? Well, the trajectory didn't stop at the end of our timeline. There's an upward trend in every imaginable as-a-Service offering, from Security-as-a-Service and Analytics-as-a-Service to even Big Data-as-a-Service.

Clement: What the cloud has given us is the ability to move fast, and what the public cloud has done is it's made those services available to anyone. So, as a small business, I can compete from a compute perspective. I can compete with just about anybody.

Bird: As-a-Service offerings available on the public cloud were revolutionary for everyone, from start-ups to large organizations.

Digital transformation: XaaS-Business model with MindSphere

'Tell me about security'

Kennedy: I remember this big, scary CIO walking into Google's HQ. And this must have been about 10 years ago. … And this guy sat down, looked some of the head honchos of Google in the eye and said, "Tell me about security."

And there was about two hours of one of the public cloud specialists talking to John, the CIO, about the security of the public cloud. After that two hours, the CIO sort of sat back in his chair, thought about it, looked us in the eye, and said, "Have you any idea how much money it costs me to run my infrastructure from a security perspective? Have you any idea how much my security budget is? And you're telling me that I don't need to worry about the security; you'll do it for me for a fee of 33 pounds per user, per year?"

That was a really radical change that he was thinking about.

Bird: So radical, cheap, and agile. Why wouldn't you want to use the public cloud for everything?

Kennedy: Is the public cloud a destination? Absolutely not. It's about choice, and choice is far more important than a single destination. The public cloud is not the answer to everything in the same way that running things in your own data center is not the answer for everything. There are some organizations where that's probably appropriate but really not many.

What we now need to think about is that the public cloud is a choice.

Enter private cloud

Bird: In fact, the very nature of some organizations meant that their workloads and applications needed to remain very much under wraps, but they still wanted some of the agility, a cloud like experience, if you will. Enter the private cloud.

Clement: So there'll be policies around the type of information or the type of processing that a particular service executes being proprietary: "This process that we have, it is our strategic advantage. There is no way I would ever let this out into the public ever. We've spent millions and millions over the last 20 years, and this is our proprietary stuff."

That is a good reason to run into private cloud because you want to be able to control security―physical security.

Bird: Safe and secure sound great in theory, but for most organizations, this just isn't practical.

Clement: In order to effectively run private cloud, you need to have a pretty strong motivator. The best practices that are being implemented today are very expensive. Google, Amazon, Microsoft―their infrastructures cost billions, right? And there aren't that many organizations on the planet that are willing to invest that much in technology when it could be available cheaper as a service.

Bird: The public cloud is perfect for certain apps and workloads, and the private cloud could be a fit for some organizations, but at what cost? If only there was some kind of blended environment that moves beyond these conversations about public versus private clouds.

Clement: The hybrid cloud is exactly what the definition says: It's the interoperation between private cloud and public cloud.

Bird: Yes, that's right, the hybrid cloud, otherwise known as the best of both worlds.

Scale, simplicity, speed

Kennedy: The main benefits of the hybrid model are around three things. And I talk about the three S's: scale, simplicity, and speed.

If you think about scale, you need to be able to have the capability to scale, whatever you're building, as quickly as you can, either on premises or in the clouds. Now, there could well be things that you'll want to build and be able to scale very, very quickly in the public cloud, and you can certainly do that. But having that switching capability to say, well, actually some of it needs to run on premises because of security or latency issues, but still knowing that you've got that scalability capability, that's actually independent of either public cloud or on premises.

So think about simplicity. Simplicity is all about having the things available to you whenever you want it. So what you don't want to do is be in a world where you could say, well, actually you can build that on premises but you can't do it in the public cloud. You want to have that simplicity to actually go, "Well, it doesn't really matter where it sits―I need to be able to get hold of that dataset, or I want to run a containerized infrastructure, or I need people to be able to access this application as quickly as possible."

And then you want things to run at speed. You want to be able to make those changes quickly and easily so you can stay ahead of the competition.

Bird: The hybrid environment lets organizations switch between these public and private clouds, and this simplification and streamlining of an organization's IT can be packaged together using the as-a-Service model.

Migrating Your Data Center To Azure (How to: Lift and Shift)

Everything-as-a-Service defined

Kennedy: What does Everything-as-a-Service fundamentally mean? Certainly, first of all, if things are as a service, we want them really to be location-independent. So we don't want to be deterministic about whether things are in our data centers or in the public cloud. We want things to be independent by very nature and, where possible, to be able to move around.

And when we want … everything to be as a service, we want the payment model, the cost model to also be the same. So what we want to be able to do is for customers and organizations to look at the cost model and think about it from a customer's perspective and think about it from their business perspective and not be determined by an architecture, be it on the public cloud or in the private cloud.

Bird: Depending on your business needs, having Infrastructure- or Storage-as-a-Service might seem a bit pie in the sky, but hardware manufacturers have been preparing for this for quite some time.

Clement: Infrastructure manufacturers―whether it's HPE or IBM or Dell, Cisco, EMC―they're building in cloud functionality within the hardware now. Right? So hardware itself has the hooks in it for virtualization: hardware, physical compute, storage infrastructure.

New applications, modern applications, are architected to run in the cloud. That's where you get this cloud-native term. Right? All of IT on-premises infrastructure, if we accept the assertion that that will become private cloud, being able to run my application workloads across both the private cloud and the public cloud could be of a big benefit. So there could be application functionality and application data that I want to keep on premises, but then there also could be application functionality and data that I want to run in the cloud. But I want this service to span both. I want it to be one service. I don't want to have two business services, so to speak.

Bird: So to ground this idea of hardware and infrastructure as a service, I called up Reuben Melville and Scott Thomson from HPE GreenLake.

Thomson: My name's Scott Thomson [cloud services specialist].

Melville: My name is Reuben Melville, worldwide category manager for GreenLake in the channel.

Bird: OK, really quick disclaimer: There are myriad examples of Everything-as-a-Service out there and GreenLake is HPE's offering.

Universal benefits of on-demand consumption

We don't want to make this all about us, but GreenLake is a really useful case study to explain universal benefits of the consumption model to organizations.

Accelerate Digital Transformation with the Microsoft Cloud that comes to you | OD429

Melville: Now, a couple of years ago, customers talked about this cloud journey, but what they tended to mean by that was public cloud. You know, "we're going on a cloud journey" really ended up being "we're taking it to the public cloud." However, we've now started to see a change in that. So customers are now looking for more of a hybrid-type solution.

And the reason for that is because they've started to realize that, yes, they're on this cloud journey, but they can't move all of the data that they have to the public cloud.

Bird: One of the biggest problems for traditional IT departments is provisioning. Organizations have always had to estimate how much of everything they need in advance, from storage to compute power.

Melville: So, where you're buying technology, for purchase of infrastructure, a customer is actually spending a lump sum upfront on capacity they don't actually know when they're going to use. So, if you think about it, most customers overprovision. In fact, industry talks about 60 percent overprovisioning for compute, for example, and around 50 percent for storage. So that means they're actually spending money upfront on capacity they don't actually require from day one.

So what we do with GreenLake is actually deploy on day one what the customer actually requires. So not overprovisioning, but actually what are they going to use? But we also give the customer a buffer because obviously capacity can go up and go down. So we'll give them a buffer on site of around 10 percent to 20 percent over what they require. That means they can scale into it when required.
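To put those percentages in numbers, here is a small Python sketch comparing the traditional overprovisioned purchase with a deploy-what-you-need-plus-buffer approach; the capacity figure and factors are assumptions based only on the percentages quoted in this conversation.

# Illustrative capacity math for the provisioning figures quoted above.
# The required capacity and factors below are assumptions for the example.
required_capacity_tb = 100             # what the workload needs on day one

# Traditional purchase: industry-cited ~50-60% overprovisioning.
overprovision_factor = 1.6
purchased_tb = required_capacity_tb * overprovision_factor

# Consumption model: deploy what is required plus a 10-20% on-site buffer.
buffer_factor = 1.15
deployed_tb = required_capacity_tb * buffer_factor

print(f"Traditional purchase: ~{purchased_tb:.0f} TB installed and paid for up front")
print(f"Consumption model:    ~{deployed_tb:.0f} TB installed, billed on metered use")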

Bird: Infrastructure-as-a-Service avoids overprovisioning, which very much fulfills the simplification requirement of the hybrid environment. The ability to scale up and scale down is key, both from an agility and a cost-control perspective.

Thomson: Financial services is an area that this resonates with from the point of view that they think a lot about the return on investment.

[In the case of one company], they were really growing quite fast; their customers were putting demands on the organization to launch new products and ways that they wanted to see them be able to work. So they were trying to release all these new products and keep up with the demands as an IT division to release these products at the same time as helping the organization run the day to day and grow at the 8 percent year on year that they were looking to grow. Well, now that they have HPE GreenLake, they're able to effectively capacity plan and deploy projects much, much faster and have no lag time between the need for increased capacity and the supply of increased capacity.

SUSE: Enabling Transformation Edge to Core to Cloud with Brent Schroeder

Bird: Isn't as-a-Service just the same as rental?

A rental or lease?

Melville: Good question. We get asked this quite a lot. Is GreenLake basically the same as rental, or is it a lease?

I would say no. And I can understand why you would ask that question, because there are a lot of competitive solutions that kind of describe as-a-Service when in reality, if you just look at it, it's like a lease.

What we're actually doing is metering the usage of the environment. And we do that for our software consumption analytics, and that's what makes it stand out from being a rental or being a lease.

Bird: Well, it's fair to say that in recent times, particularly in 2020, most organizations have experienced doing quite a bit more with quite a bit less.

Thomson: There's been a huge increase in remote working for obvious reasons. You know, we've seen particularly an increase in demand on virtual desktop infrastructure projects.

So risk and cost are critical and, of course, more than ever.

Melville: What we're doing with GreenLake is actually helping them with that because we're deploying capacity as and when required.

Thomson: Due to the fact that the consumption model is metered and billed on invoice on a monthly basis, there's no upfront investment required for an organization to start working with as-a-Service.

Bird: Well, as I mentioned earlier, GreenLake is just one example of the possibilities of as-a-Service, a model which in itself delivers those key public cloud concepts of agility and consumption pricing with a level of control more akin to the private cloud or even on-premises infrastructure.

Everything-as-a-Service is about flexibility. When organizations aren't held back by their hardware, they can really hone in on what's important to them.

Kennedy: For an organization to look at an Everything-as-a-Service model, there's a couple of key values that they get out of it. I think the first thing that they get from it is the capability to focus on the key elements of their business.

If you're a bank, you can focus on being a bank. If you're a retailer, you can focus on the customer experience. If you're a government organization, you can focus on the services that you provide.

Bird: Everything-as-a-service opens up new ways of working in every industry, but at its core, it's not necessarily a technological shift. The biggest changes and, perhaps, challenges are operational and organizational.

Thomson: It is challenging. There's a lot to be taken account of. We talked earlier about the multicloud environment and the hybrid environment―all of those different operating models need different skill sets to manage that complexity of various environments. It's tricky. And also making those decisions around where the workloads are going to be optimized is difficult as well.

Succeeding with Secure Access Service Edge (SASE) | OD416

Innovate or die

Clement: You can't be thinking about how I can do something better than my competitors today with an Industrial Age mindset. That's impossible. And this challenge, this change of mindset, is the real transformation.

Would it make any sense for me to use, do the same tasks that I performed with my VHS player with Netflix? No. Why would you bother? And that's the way people are thinking in the digital age, thinking about traditional IT. Why are you bothering doing those things that we know are inefficient?

Today, when I see IT leaders and business leaders holding on to the past―holding on to their VHS tapes, holding on to that VCR player, holding on to those old remote controls, holding on to that old stuff―that is a cultural problem within many organizations that's inhibiting digital transformation.

Bird: Digital transformation is difficult but totally necessary. We often hear the phrase "innovate or die." And it really is a stark [reminder] that if organizations don't transform, they can't compete with those that do.

Clement: You may have heard the term agile allows you to fail fast. Yeah, it does allow you to fail fast because you can see what's working and what is not working very quickly. But the key isn't just failing fast; it's also failing fast, learning, and then pivoting. You have to think on the fly, you have to be nimble, you have to be agile, you have to come up with new answers, you have to be innovative, you have to be creative. And those attributes need to be part of your culture―they don't need to be an exception; you need to be doing that all the time.

Kennedy: Some organizations have struggled with the amount of investment that would be required. Certainly in some traditional organizations, IT has still been seen as somewhat of a cost and somewhat of a drag. You know, there was always that thought that IT, you know, the guys in the IT crowd, sit in the basement. You only really go there if you've got a problem with your computer.

Now, organizations need to look at that and fundamentally think about digital transformation in a new world. And the IT department needs to be at the point where they can lead that or at least be at the table to help facilitate, understand, and act on the way that the business needs to change. Because I'll tell you for one, there is some little startups somewhere that are going where they want to be and will be there quicker than them.

IT leaders are business leaders

Bird: Both Tony and Paul described CTOs and CIOs as playing important roles in driving organizations forward―not just in terms of technology, by informing where apps, workloads, and data sit and how they're managed, but in actually digitizing and modernizing the entire business approach.

Kennedy: I think number one, think about where the business needs to go in the future. Number two, really think about your customer―not now, but in the future. Number three, and I've spoken about this before, think about the culture, the culture not only of the organization, but think about the culture of your customers as well and how they want to interact with you both now and in the future.

Bird: The as-a-Service market is projected to keep on climbing, and as for the public, private, and hybrid clouds, word on the street is we'll be getting rid of that terminology altogether.

Clement: I don't think we'll be calling it hybrid for a very long time. It'll just be computing … just like we don't talk about the Worldwide Web anymore. You almost never hear anyone say WWW. We are on that trajectory. We will be in a hybrid world where private and public cloud need to interoperate effectively as one for the foreseeable future. Hybrid is the way of the future, and organizations need to move quickly to adopt it or they will compromise their organization's ability to win in the marketplace.

Bird: Harness the power of hybrid to get the best of both worlds; embrace Everything-as-a-Service to improve agility and flexibility; cut costs but stay in control; and digitally transform to keep your competitive edge. So how do you get started? Tony says there's no substitute for just jumping in.

Just do it

Clement: You could read as much as you want and go to as many workshops as you want, but until you actually start to live it and experience it, you don't know what it is, right?

It's just like anything else. It's only by diving in and experiencing it where you learn how your organization or any individual organization will execute this and then take advantage of it, because it will be slightly different in every company because of the culture, because of everything. And that requires that day-to-day leadership.

More Information:

https://www.hpe.com/us/en/insights/articles/as-a-service--what-it-is-and-how-it-s-changing-the-face-of-it-an-2008.html

https://www.datacenterknowledge.com/hewlett-packard-enterprise/hpe-ceo-pledges-sell-everything-service-2022

https://blogs.sw.siemens.com/thought-leadership/2019/07/11/everything-as-a-service-a-closer-look-at-the-business-model-of-the-future/

https://fowmedia.com/future-cloud-service/

https://www.nearform.com/blog/a-closer-look-at-the-business-benefits-of-serverless/

https://www.delltechnologies.com/en-us/blog/the-x-factor-tapping-into-everything-as-a-service-2/

https://www.govexec.com/feature/mission-ready/

https://www.business.com/articles/everything-as-a-service-gamechanger/


How do we avert our impending data storage crisis?


Translation software enables efficient DNA data storage



In support of a major collaborative project to store massive amounts of data in DNA molecules, a Los Alamos National Laboratory–led team has developed a key enabling technology that translates digital binary files into the four-letter genetic alphabet needed for molecular storage. 

“Our software, the Adaptive DNA Storage Codec (ADS Codex), translates data files from what a computer understands into what biology understands,” said Latchesar Ionkov, a computer scientist at Los Alamos and principal investigator on the project. “It’s like translating from English to Chinese, only harder.”

DNA Data Storage - The Solution to Data Storage Shortage

The work is a key part of the Intelligence Advanced Research Projects Activity (IARPA) Molecular Information Storage (MIST) program to bring cheaper, bigger, longer-lasting storage to big-data operations in government and the private sector. The short-term goal of MIST is to write 1 terabyte—a trillion bytes—and read 10 terabytes within 24 hours for $1,000. Other teams are refining the writing (DNA synthesis) and retrieval (DNA sequencing) components of the initiative, while Los Alamos is working on coding and decoding.

“DNA offers a promising solution compared to tape, the prevailing method of cold storage, which is a technology dating to 1951,” said Bradley Settlemyer, a storage systems researcher and systems programmer specializing in high-performance computing at Los Alamos. “DNA storage could disrupt the way we think about archival storage, because the data retention is so long and the data density so high. You could store all of YouTube in your refrigerator, instead of in acres and acres of data centers. But researchers first have to clear a few daunting technological hurdles related to integrating different technologies.”

Data Storage in DNA techniques explained

Not Lost in Translation

Compared to the traditional long-term storage method that uses pizza-sized reels of magnetic tape, DNA storage is potentially less expensive, far more physically compact, more energy efficient, and longer lasting—DNA survives for hundreds of years and doesn’t require maintenance. Files stored in DNA also can be very easily copied for negligible cost. 

DNA’s storage density is staggering. Consider this: humanity will generate an estimated 33 zettabytes by 2025—that’s 3.3 followed by 22 zeroes. All that information would fit into a ping pong ball, with room to spare. The Library of Congress has about 74 terabytes, or 74 million million bytes, of information—6,000 such libraries would fit in a DNA archive the size of a poppy seed. Facebook’s 300 petabytes (300,000 terabytes) could be stored in a half poppy seed. 

DNA Data Storage is the Future!

Encoding a binary file into a molecule is done by DNA synthesis. A fairly well understood technology, synthesis organizes the building blocks of DNA into various arrangements, which are indicated by sequences of the letters A, C, G, and T. They are the basis of all DNA code, providing the instructions for building every living thing on earth. 

The Los Alamos team's ADS Codex tells exactly how to translate the binary data—all 0s and 1s—into sequences of the four letters A, C, G, and T. The Codex also handles the decoding back into binary. DNA can be synthesized by several methods, and ADS Codex can accommodate them all. The Los Alamos team has completed version 1.0 of ADS Codex and in November 2021 plans to use it to evaluate the storage and retrieval systems developed by the other MIST teams.
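The core translation step is easy to picture: each of the four letters can carry two bits. The Python sketch below is a deliberately simplified stand-in for ADS Codex, showing only the raw binary-to-ACGT mapping and back; the real codec also has to deal with error correction and the chemistry constraints discussed below.

# Toy binary <-> DNA translation: two bits per nucleotide. This is a
# simplified illustration, not the actual ADS Codex encoding scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a strand of A/C/G/T, two bits at a time."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Turn a strand of A/C/G/T back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

strand = encode(b"DNA")
print(strand)                      # 12 bases encode the 3 input bytes
assert decode(strand) == b"DNA"    # round trip recovers the original data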

Unfortunately, DNA synthesis sometimes makes mistakes in the coding, so ADS Codex addresses two big obstacles to creating DNA data files. 

First, compared to traditional digital systems, the error rates while writing to molecular storage are very high, so the team had to figure out new strategies for error correction. Second, errors in DNA storage arise from a different source than they do in the digital world, making the errors trickier to correct. 

“On a digital hard disk, binary errors occur when a 0 flips to a 1, or vice versa, but with DNA, you have more problems that come from insertion and deletion errors,” Ionkov said. “You’re writing A, C, G, and T, but sometimes you try to write A, and nothing appears, so the sequence of letters shifts to the left, or it types AAA. Normal error correction codes don’t work well with that.”

ADS Codex adds additional information called error detection codes that can be used to validate the data. When the software converts the data back to binary, it tests whether the codes match. If they don't, the codec tries removing or adding nucleotides until the verification succeeds.
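A minimal way to picture the validation step is a checksum appended to the payload before encoding and re-checked after decoding, as in the Python sketch below. This is only an illustration: a plain CRC can detect that something went wrong, but ADS Codex needs codes designed for insertion and deletion errors, which shift the whole sequence rather than flipping single symbols.

import zlib

# Toy error-detection wrapper: append a CRC32 of the payload, then verify
# it after decoding. Illustrative only; not the ADS Codex error scheme.
def add_check(data: bytes) -> bytes:
    return data + zlib.crc32(data).to_bytes(4, "big")

def verify(data_with_check: bytes) -> bytes:
    data, check = data_with_check[:-4], data_with_check[-4:]
    if zlib.crc32(data).to_bytes(4, "big") != check:
        raise ValueError("checksum mismatch - decoded data is corrupted")
    return data

protected = add_check(b"hello, molecular storage")
assert verify(protected) == b"hello, molecular storage"

corrupted = bytes([protected[0] ^ 0xFF]) + protected[1:]   # damage one byte
try:
    verify(corrupted)
except ValueError as err:
    print(err)                     # the corruption is detected on decode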

Microsoft and University of Washington DNA Storage Research Project

Smart Scale-up

Large warehouses contain today's largest data centers, with storage at the exabyte scale—that's a trillion million bytes or more. Costing billions to build, power, and run, this type of digitally based data center may not be the best option as the need for data storage continues to grow exponentially.

Long-term storage with cheaper media is important for the national security mission of Los Alamos and others. “At Los Alamos, we have some of the oldest digital-only data and largest stores of data, starting from the 1940s,” Settlemyer said. “It still has tremendous value. Because we keep data forever, we’ve been at the tip of the spear for a long time when it comes to finding a cold-storage solution.”

Settlemyer said DNA storage has the potential to be a disruptive technology because it crosses between fields ripe with innovation. The MIST project is stimulating a new coalition among legacy storage vendors who make tape, DNA synthesis companies, DNA sequencing companies, and high-performance computing organizations like Los Alamos that are driving computers into ever-larger-scale regimes of science-based simulations that yield mind-boggling amounts of data that must be analyzed. 

Deeper Dive into DNA

When most people think of DNA, they think of life, not computers. But DNA is itself a four-letter code for passing along information about an organism. DNA molecules are made from four types of bases, or nucleotides, each identified by a letter: adenine (A), thymine (T), guanine (G), and cytosine (C). 

These bases wrap in a twisted chain around each other—the familiar double helix—to form the molecule. The arrangement of these letters into sequences creates a code that tells an organism how to form. The complete set of DNA molecules makes up the genome—the blueprint of your body.  

By synthesizing DNA molecules—making them from scratch—researchers have found they can specify, or write, long strings of the letters A, C, G, and T and then read those sequences back. The process is analogous to how a computer stores information using 0s and 1s. The method has been proven to work, but reading and writing the DNA-encoded files currently takes a long time, Ionkov said. 

“Appending a single nucleotide to DNA is very slow. It takes a minute,” Ionkov said. “Imagine writing a file to a hard drive taking more than a decade. So that problem is solved by going massively parallel. You write tens of millions of molecules simultaneously to speed it up.” 
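That quote is essentially a throughput calculation: a very slow per-nucleotide write rate multiplied by massive parallelism. The Python sketch below runs those numbers with assumed figures (one nucleotide per minute, ten million parallel strands) to show why parallel synthesis is the only practical route.

# Back-of-the-envelope synthesis throughput, using assumed figures based
# on the quote above (one nucleotide per minute per strand).
seconds_per_nucleotide = 60
bits_per_nucleotide = 2                   # A/C/G/T carry two bits each
parallel_strands = 10_000_000             # "tens of millions of molecules"

bits_per_second = parallel_strands * bits_per_nucleotide / seconds_per_nucleotide
bytes_per_second = bits_per_second / 8

print(f"Serial write speed:   {bits_per_nucleotide / seconds_per_nucleotide:.3f} bits/s")
print(f"Parallel write speed: ~{bytes_per_second / 1e3:.0f} KB/s "
      f"(~{bytes_per_second * 86400 / 1e9:.1f} GB/day)")

Even at that rate, the arithmetic suggests reaching the MIST goal of a terabyte per day would take parallelism in the billions of strands or considerably faster chemistry, which is roughly the part of the problem the synthesis teams are tackling.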

While various companies are working on different ways of synthesizing to address this problem, ADS Codex can be adapted to every approach. 

DNA Data storage

DNA Storage as a Solution to our Data Storage Crisis

There are those who argue that, over the last two decades, we’ve moved from an oil-centric economy to a data-focused one. While this may seem like something of an exaggeration, it’s impossible to deny that data is playing an increasing role in our day-to-day life.  And this is most apparent in the world of business and big data.

Unfortunately, with an increased need for data comes a greater need for data storage – which has created a very big problem!

The current situation

As we’ve discussed previously, data is now the driving force behind a multitude of business decisions. Additionally, an immeasurable number of valuable insights can be gleaned from the vast swathes of data that organizations are yet to analyze. In short, big data is big business. And this – coupled with increasingly more affordable technology – has seen us create data at a truly unprecedented rate.

Karin Strauss - DNA Storage

In April 2013, 90% of all of the world’s data had been created within the previous two years. Today, the total amount of data in existence is doubling every other year. Such growth is not sustainable, but with data having proven indispensable, simply creating less of it is not an option. In order to avert this crisis, what is needed is storage media that offers storage density that is vastly superior to what is currently available. Here are the most likely possibilities along with a summary of their pros and cons:

We improve existing media

Storage media stalwart Seagate has officially manufactured more than 40,000 hard drives featuring HAMR (heat-assisted magnetic recording) technology and they plan to begin shipping them later this year.

Unlocking the Potential of NVMe

HAMR – whereby a disk's platter is heated prior to the writing process – has been developed in order to significantly increase hard drives' storage capacity. This heating means that less space is needed to store data, resulting in hard drives capable of achieving higher storage densities.

Infinite Memory Engine

Potential storage capacity

Seagate has stated that HAMR technology will allow it to produce drives with more than 20TB of storage before the end of 2019 and 40TB by 2023. No further estimates are provided, though the company does state that it has already begun developing HAMR’s successor, heated-dot magnetic recording (HDMR), suggesting that there is more to come.

Practicality

As we’ve stated previously, Seagate plans to ship HAMR drives before the end of the year. These drives also use the standard 3.5-inch form factor, meaning they’ll slot easily into existing arrays. The drives also remain relatively affordable in spite of the inclusion of this new technology.


The cons

Whilst a 40TB HDD would represent a significant improvement on what’s currently available, it’s still unlikely to represent a long-term solution to the problem – unless HDMR proves capable of significantly boosting their capacity, that is.

We use our existing storage more efficiently

Sia is, in our opinion, a unique and highly innovative potential answer to our current storage problems: it identifies unused space on storage media around the globe, rents it from those users, and then sells it to the general public as remote cloud storage.

Potential storage capacity

Sia’s website claims its entire network of drives boasts 4.2 petabytes (4,200TB) of storage. This may seem like a lot at first glance, but with the entire cloud currently storing just under 1,500 exabytes (that’s 1,500,000 petabytes), it doesn’t offer the kind of capacity needed to provide a real solution to our data storage crisis.

That said, 4.2 Petabytes is a considerable amount of storage that would otherwise have been wasted and Sia are also not the only company leveraging this technology. So, whilst decentralized cloud storage alone isn’t the answer we’re looking for, it’s far from ineffectual and is certainly an efficient way of utilizing existing storage space.

Practicality

As with most cloud storage, it’s easy to use and, thanks to the use of blockchain, is extremely cheap at just $2 per TB of storage.

Cons

We’ve already said that the decentralized cloud’s unlikely to offer the kind of storage capacity the world’s going to need to avert our impending storage crisis. We also know that trust in the cloud tends to diminish with each high-profile data breach so we’d expect the adoption rate to be somewhat slow.

What You Need to Know about DNA Data Storage Today

We use something ground-breaking

It may sound like pure fiction, but DNA has already been used to store and retrieve data. In fact, DNA data storage is something that has a lot of people very, very excited.

Potential storage capacity

This is what sets DNA data storage apart from its competitors: just one gram of DNA could store 215 Petabytes of data. With storage capacities like this, it’s clear to see why many believe DNA could be the answer to our storage conundrum.

Practicality

As I’m sure you can imagine, the process of storing and retrieving data from DNA is cumbersome and, whilst it was first achieved five years ago, it’s still far from an accessible and practical means of storing data.

Although using DNA to create a usable piece of storage media is proving problematic, it could ultimately yield a device capable of storing an entire data center’s worth of data in a 3.5-inch HDD cradle, and one robust enough to last for a millennium.

Cons

As we’ve said previously, no uniform way of reading and writing data to and from DNA currently exists. It’s also been widely reported that the task of retrieving the data itself is both a slow and cumbersome one. These, however, are not the greatest hurdles scientists face in trying to make DNA the world’s de facto storage media: that honor belongs to cost.

Ultra-dense data storage and extreme parallelism with electronic-molecular systems

In 2017, data was successfully stored in and then retrieved from DNA, but the cost of doing so was astonishingly high: synthesizing the data cost $7,000 and retrieving it a further $2,000. These costs are expected to drop significantly, though that could take as much as a decade.

“One of the challenges for us as a company, and us as an industry, is that many of the technologies we rely on are beginning to get to the point where either they are at the end, or they’re starting to get to the point where you can see the end. Moore’s Law is a well-publicized one and we hit it some time ago. And that’s a great opportunity, because whenever you get that rollover, you get an opportunity to be able do things differently, to have new ways of doing things.”

– ANT ROWSTRON, DISTINGUISHED ENGINEER AND DEPUTY LAB DIRECTOR, MICROSOFT RESEARCH CAMBRIDGE

It is projected that around 125 zettabytes of data will be generated annually by 2024. Storing this data efficiently and cost-effectively will be a huge challenge. Growth in storage capabilities using SSDs, HDDs or magnetic tape has not kept up with the exponential growth in compute capacity, the surge in data being generated, or the novel economics and storage needs of cloud services.

Future demands from intelligent edge and Internet of Things (IoT) deployments, streaming audio, video, virtual and mixed reality, “digital twins” and use cases we haven’t yet predicted will generate lots of bits – but where will we keep them?

This requires more than incremental improvements – it demands disruptive innovation. For many years, Microsoft researchers and their collaborators have been exploring ways to make existing storage approaches more efficient and cost-effective, while also forging entirely new paths – including storing data in media such as glass, holograms and even DNA.

DNA Data Storage and Near-Molecule Processing for the Yottabyte Era

Re-Imagining Storage

Researchers have taken a holistic approach to making storage more efficient and cost-effective, using the emergence of the cloud as an opportunity to completely re-think storage in an end-to-end fashion. They are co-designing new approaches across layers that are traditionally thought of as independent – blurring the lines between storage, memory and network. At the same time, they’re re-thinking storage from the media up – including storing data in media such as glass, holograms and even DNA.

This work extends back more than two decades: in 1999, Microsoft researchers began work on Farsite, a secure and scalable file system that logically functions as a centralized file server, but is physically distributed among a set of untrusted computers. This approach would utilize the unused storage and network resources of desktop computers to provide a service that is reliable, available and secure despite running on machines that are unreliable, often unavailable and of limited security. In 2007, researchers published a paper that explored the conditions under which such a system could be scalable, the software engineering environment used to build the system, and the lessons learned in its development.

At Microsoft Research Cambridge, researchers began exploring optimizing enterprise storage using off-the-shelf hardware in the late 2000s. They explored off-loading data from overloaded volumes to virtual stores to reduce power consumption and to better accommodate peak I/O request rates. This was an early example of storage virtualization and software-defined storage, ideas that are now widely used in the cloud. As solid-state drives (SSDs) became more commonplace in PCs, researchers considered their application in the datacenter – and concluded that they were not yet cost-effective for most workloads at current prices. In 2012, they analyzed potential applications of non-volatile main memory (NVRAM) and proposed whole-system persistence (WSP), an approach to database and key-value store recovery in which memory rather than disk is used to recover an application’s state when it fails – blurring the lines between memory and storage.

In 2011, researchers established the Software-Defined Storage Architectures project, which brought the idea of separating control flow from data from networking to storage, to provide predictable performance and reduced cost. IOFlow is a software-defined storage architecture that uses a logically centralized control plane to enable end-to-end policies and QoS guarantees, which required rethinking across the data center storage and network stack. This principle was extended to other cloud resources to create a virtual data center per tenant. In this 2017 article the researchers describe the advantages of intentionally blurring the lines between virtual storage and virtual networking.

Established in 2012, the FaRM project explored new approaches to using main memory for storing data with a distributed computing platform that exploits remote direct memory access (RDMA) communication to improve both latency and throughput by an order of magnitude compared to main memory systems that use TCP/IP. By providing both strong consistency and high performance – challenging the conventional wisdom that high performance required weak consistency – FaRM allowed developers to focus on program logic rather than handling consistency violations. Initial developer experience highlighted the need for strong consistency for aborted transactions as well as committed ones – and this was then achieved using loosely synchronized clocks.

At the same time, Project Pelican addressed the storage needs of “cold” or infrequently-accessed data.  Pelican is a rack-scale disk-based storage unit that trades latency for cost, using a unique data layout and IO scheduling scheme to constrain resource usage so that only 8% of its drives spin concurrently. Pelican was an example of rack-scale co-design: rethinking the storage, compute, and networking hardware as well as the software stack at a rack scale to deliver value at cloud scale.

DNA Storage for Digital Preservation

To further challenge traditional ways of thinking about the storage media and controller stack, researchers began to consider whether a general-purpose CPU was even necessary for many operations. To this end, Project Honeycomb tackles the challenges of building complex abstractions using FPGAs in CPU-free custom hardware, leaving CPU-based units to focus on control-plane operations.

DNA synthesis and sequencing: writing and reading the code

DNA is the carrier of genetic information in nearly all living organisms. This information is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). The order of these bases – called the sequence – then determines what information is available for building and maintaining an organism.

DNA bases pair up: A with T, and C with G. These base pairs, along with a couple of other components (sugar and phosphate), then arrange themselves along two long strands to form the ladder-like helix we’re so accustomed to seeing when we think about DNA.

What does this all have to do with data storage? As it turns out, binary code (all those 0’s and 1’s) can be translated into DNA base pairings.

DNA data storage is the process of encoding binary data into synthetic strands of DNA. Binary digits (bits) are converted from 0s and 1s to the four chemical bases (A, T, C, and G), such that the DNA sequence corresponds to the order of the bits in a digital file. In this way, the physical storage medium becomes a man-made chain of DNA.

Recovering the data is then a matter of sequencing that DNA. DNA sequencing determines the order of those four chemical building blocks, or bases, that make up the DNA molecule, and is generally used to determine the genetic information carried in a particular DNA strand.

By running digital data-containing synthetic DNA through a sequencer, the genetic code – or sequence of bases – can be obtained and translated back into the original binary bits to access that stored data.
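As a rough illustration of that round trip, here is a minimal Python sketch using one possible two-bits-per-base mapping (00→A, 01→C, 10→G, 11→T). It is only a toy: real codecs such as ADS Codex add error correction and avoid troublesome sequences (for example, long runs of the same base), which this example ignores.

# Toy DNA data storage: map bits to bases and back (no error correction).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn a byte string into a DNA sequence (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(sequence: str) -> bytes:
    """Turn a DNA sequence back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"DNA data storage"
dna = encode(message)
assert decode(dna) == message
print(dna[:16], "...")   # first few bases of the encoded file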

HovercRaft: Achieving Scalability and Fault-tolerance for Microsecond-scale Datacenter Services

Why we need DNA Storage

On Earth right now, there are about 10 trillion gigabytes of digital data, and every day, humans produce emails, photos, tweets, and other digital files that add up to another 2.5 million gigabytes of data. Much of this data is stored in enormous facilities known as exabyte data centers (an exabyte is 1 billion gigabytes), which can be the size of several football fields and cost around $1 billion to build and maintain.

Many scientists believe that an alternative solution lies in the molecule that contains our genetic information: DNA, which evolved to store massive quantities of information at very high density. A coffee mug full of DNA could theoretically store all of the world’s data, says Mark Bathe, an MIT professor of biological engineering.

“We need new solutions for storing these massive amounts of data that the world is accumulating, especially the archival data,” says Bathe, who is also an associate member of the Broad Institute of MIT and Harvard. “DNA is a thousandfold denser than even flash memory, and another property that’s interesting is that once you make the DNA polymer, it doesn’t consume any energy. You can write the DNA and then store it forever.”

Scientists have already demonstrated that they can encode images and pages of text as DNA. However, an easy way to pick out the desired file from a mixture of many pieces of DNA will also be needed. Bathe and his colleagues have now demonstrated one way to do that, by encapsulating each data file into a 6-micrometer particle of silica, which is labeled with short DNA sequences that reveal the contents.

Using this approach, the researchers demonstrated that they could accurately pull out individual images stored as DNA sequences from a set of 20 images. Given the number of possible labels that could be used, this approach could scale up to 10²⁰ files.

Bathe is the senior author of the study, which appears today in Nature Materials. The lead authors of the paper are MIT senior postdoc James Banal, former MIT research associate Tyson Shepherd, and MIT graduate student Joseph Berleant.

Post-quantum cryptography: Supersingular isogenies for beginners

Stable storage

Digital storage systems encode text, photos, or any other kind of information as a series of 0s and 1s. This same information can be encoded in DNA using the four nucleotides that make up the genetic code: A, T, G, and C. For example, G and C could be used to represent 0 while A and T represent 1.

DNA has several other features that make it desirable as a storage medium: It is extremely stable, and it is fairly easy (but expensive) to synthesize and sequence. Also, because of its high density — each nucleotide, equivalent to up to two bits, is about 1 cubic nanometer — an exabyte of data stored as DNA could fit in the palm of your hand.
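A back-of-envelope calculation in Python, using the figures quoted above (roughly 1 cubic nanometer and up to two bits per nucleotide), shows why an exabyte of DNA fits in the palm of your hand. This is only a sketch of the arithmetic, ignoring packaging and redundancy overheads.

# Back-of-envelope DNA storage density, using the figures quoted above.
NM3_PER_NUCLEOTIDE = 1.0        # ~1 cubic nanometre per nucleotide
BITS_PER_NUCLEOTIDE = 2         # up to two bits per nucleotide

exabyte_in_bits = 8 * 1000**6   # 1 EB = 10^18 bytes
nucleotides = exabyte_in_bits / BITS_PER_NUCLEOTIDE
volume_mm3 = nucleotides * NM3_PER_NUCLEOTIDE / 1e18   # 1 mm^3 = 10^18 nm^3

print(f"~{volume_mm3:.0f} mm^3 of DNA per exabyte")     # about 4 cubic millimetres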

One obstacle to this kind of data storage is the cost of synthesizing such large amounts of DNA. Currently it would cost $1 trillion to write one petabyte of data (1 million gigabytes). To become competitive with magnetic tape, which is often used to store archival data, Bathe estimates that the cost of DNA synthesis would need to drop by about six orders of magnitude. Bathe says he anticipates that will happen within a decade or two, similar to how the cost of storing information on flash drives has dropped dramatically over the past couple of decades.

Aside from the cost, the other major bottleneck in using DNA to store data is the difficulty in picking out the file you want from all the others.

“Assuming that the technologies for writing DNA get to a point where it’s cost-effective to write an exabyte or zettabyte of data in DNA, then what? You're going to have a pile of DNA, which is a gazillion files, images or movies and other stuff, and you need to find the one picture or movie you’re looking for,” Bathe says. “It’s like trying to find a needle in a haystack.”

Currently, DNA files are conventionally retrieved using PCR (polymerase chain reaction). Each DNA data file includes a sequence that binds to a particular PCR primer. To pull out a specific file, that primer is added to the sample to find and amplify the desired sequence. However, one drawback to this approach is that there can be crosstalk between the primer and off-target DNA sequences, leading unwanted files to be pulled out. Also, the PCR retrieval process requires enzymes and ends up consuming most of the DNA that was in the pool.

“You’re kind of burning the haystack to find the needle, because all the other DNA is not getting amplified and you’re basically throwing it away,” Bathe says.

Quantum-safe cryptography: Securing today’s data against tomorrow’s computers

File retrieval

As an alternative approach, the MIT team developed a new retrieval technique that involves encapsulating each DNA file into a small silica particle. Each capsule is labeled with single-stranded DNA “barcodes” that correspond to the contents of the file. To demonstrate this approach in a cost-effective manner, the researchers encoded 20 different images into pieces of DNA about 3,000 nucleotides long, which is equivalent to about 100 bytes. (They also showed that the capsules could fit DNA files up to a gigabyte in size.)

Each file was labeled with barcodes corresponding to labels such as “cat” or “airplane.” When the researchers want to pull out a specific image, they remove a sample of the DNA and add primers that correspond to the labels they’re looking for — for example, “cat,” “orange,” and “wild” for an image of a tiger, or “cat,” “orange,” and “domestic” for a housecat.

The primers are labeled with fluorescent or magnetic particles, making it easy to pull out and identify any matches from the sample. This allows the desired file to be removed while leaving the rest of the DNA intact to be put back into storage. Their retrieval process allows Boolean logic statements such as “president AND 18th century” to generate George Washington as a result, similar to what is retrieved with a Google image search.
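To make the retrieval idea concrete, here is a toy Python sketch (with made-up files and labels, not the actual MIT dataset) that treats each capsule’s barcodes as a set of labels and answers Boolean AND queries over them. In the real system the selection is done chemically, with fluorescently or magnetically labeled primers.

# Toy model of barcode-based retrieval: each "capsule" carries a set of labels,
# and a query pulls out every capsule whose labels contain all query terms.
capsules = {
    "tiger.jpg":      {"cat", "orange", "wild"},
    "housecat.jpg":   {"cat", "orange", "domestic"},
    "airplane.jpg":   {"airplane", "metal", "sky"},
    "washington.jpg": {"president", "18th century"},
}

def retrieve(query_labels):
    """Return the files whose barcode labels include every query label."""
    wanted = set(query_labels)
    return [name for name, labels in capsules.items() if wanted <= labels]

print(retrieve(["cat", "orange", "wild"]))       # ['tiger.jpg']
print(retrieve(["president", "18th century"]))   # ['washington.jpg']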

“At the current state of our proof-of-concept, we’re at the 1 kilobyte per second search rate. Our file system’s search rate is determined by the data size per capsule, which is currently limited by the prohibitive cost to write even 100 megabytes worth of data on DNA, and the number of sorters we can use in parallel. If DNA synthesis becomes cheap enough, we would be able to maximize the data size we can store per file with our approach,” Banal says.

For their barcodes, the researchers used single-stranded DNA sequences from a library of 100,000 sequences, each about 25 nucleotides long, developed by Stephen Elledge, a professor of genetics and medicine at Harvard Medical School. If you put two of these labels on each file, you can uniquely label 10¹⁰ (10 billion) different files, and with four labels on each, you can uniquely label 10²⁰ files.
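The labelling arithmetic is easy to check, treating each file’s labels as an ordered pick from the 100,000-sequence library:

# Rough count of uniquely addressable files for 2 and 4 barcodes per file.
library_size = 100_000                  # sequences in the Elledge barcode library
print(f"{library_size**2:.0e} files")   # two labels per file  -> ~1e+10
print(f"{library_size**4:.0e} files")   # four labels per file -> ~1e+20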

George Church, a professor of genetics at Harvard Medical School, describes the technique as “a giant leap for knowledge management and search tech.”

“The rapid progress in writing, copying, reading, and low-energy archival data storage in DNA form has left poorly explored opportunities for precise retrieval of data files from huge (10²¹ byte, zetta-scale) databases,” says Church, who was not involved in the study. “The new study spectacularly addresses this using a completely independent outer layer of DNA and leveraging different properties of DNA (hybridization rather than sequencing), and moreover, using existing instruments and chemistries.”

Bathe envisions that this kind of DNA encapsulation could be useful for storing “cold” data, that is, data that is kept in an archive and not accessed very often. His lab is spinning out a startup, Cache DNA, that is now developing technology for long-term storage of DNA, both for DNA data storage in the long-term, and clinical and other preexisting DNA samples in the near-term.

“While it may be a while before DNA is viable as a data storage medium, there already exists a pressing need today for low-cost, massive storage solutions for preexisting DNA and RNA samples from Covid-19 testing, human genomic sequencing, and other areas of genomics,” Bathe says.


More Information

https://www.microsoft.com/en-us/research/project/dna-storage/#!publications

http://thewindowsupdate.com/2020/11/09/research-collection-re-inventing-storage-for-the-cloud-era/

https://www.microsoft.com/en-us/research/project/dna-storage/

https://community.arm.com/developer/research/b/articles/posts/research-in-a-post-moore-era-hpca-2019

https://www.newswise.com/articles/translation-software-enables-efficient-dna-data-storage

https://leciir.com/?blog_post=the-dna-based-solution-to-our-data-storage-crisis-where-its-at-in-2020

https://news.mit.edu/2021/dna-data-storage-0610

https://www.businesswire.com/news/home/20210610005250/en/DNA-Data-Storage-Alliance-Publishes-First-White-Paper-Launches-Website

https://researchr.org/publication/cidr-2019

https://news.microsoft.com/innovation-stories/hello-data-dna-storage/

https://www.nature.com/articles/s41598-019-41228-8

https://blog.dshr.org/2021/05/storage-update.html

https://www.ddn.com/products/ime-flash-native-data-cache/



Yottabytes: Big Data Storage

1. Zettabyte Era

Data is measured in bits and bytes. One bit contains a value of 0 or 1. Eight bits make a byte. Then we have kilobytes (1,000 bytes), megabytes (1000² bytes), gigabytes (1000³ bytes), terabytes (1000⁴ bytes), petabytes (1000⁵ bytes), exabytes (1000⁶ bytes) and zettabytes (1000⁷ bytes).
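A tiny Python helper makes these decimal (SI) units concrete; note that the binary units (KiB, MiB and so on, based on powers of 1024) are a separate convention not used here.

# Decimal (SI) storage units, as listed above.
UNITS = {
    "kilobyte": 1000**1,
    "megabyte": 1000**2,
    "gigabyte": 1000**3,
    "terabyte": 1000**4,
    "petabyte": 1000**5,
    "exabyte":  1000**6,
    "zettabyte": 1000**7,
    "yottabyte": 1000**8,
}

def to_bytes(value, unit):
    """Convert e.g. (2, 'zettabyte') into a byte count."""
    return value * UNITS[unit]

print(f"{to_bytes(2, 'zettabyte'):.3e} bytes")   # ~2020 annual traffic estimate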

Cisco estimated that in 2016 we passed one zettabyte in total annual Internet traffic, that is, all the data we have uploaded and shared on the world wide web, most of it file sharing. A zettabyte is a measure of storage capacity equal to 1000⁷ bytes (1,000,000,000,000,000,000,000 bytes). One zettabyte is equal to a thousand exabytes, a billion terabytes, or a trillion gigabytes. In other words — that’s a lot! Especially if we take into account that the Internet is not even 40 years old. Cisco also estimated that by 2020 annual traffic would grow to over 2 zettabytes.

From Block Storage to GPUDirect | Past & Future of Data Platform

Internet traffic is only one part of total data storage, which also includes all personal and business devices. Estimates for the total data storage capacity we have right now, in 2019, vary, but they are already in the 10–50 zettabyte range. By 2025 this is estimated to grow to the range of 150–200 zettabytes.

Data creation will certainly only accelerate in the upcoming years, so you might wonder: is there any limit to data storage? Not really; or rather, there are limits, but they are so far away that we won’t get anywhere near them anytime soon. For example, just a gram of DNA can store 700 terabytes of data, which means that we could store all the data we have right now on 1,500 kg of DNA — packed densely, it would fit into an ordinary room. That, however, is very far from what we are able to manufacture currently. The largest hard drive being manufactured holds 15 terabytes, and the largest SSD reaches 100 terabytes.

The term Big Data refers to a dataset which is too large or too complex for ordinary computing devices to process. As such, it is relative to the available computing power on the market. If you look at the recent history of data, in 1999 we had a total of 1.5 exabytes of data and 1 gigabyte was considered big data. Already by 2006, total data was estimated at around 160 exabytes — more than a hundredfold increase in 7 years. In our Zettabyte Era, 1 gigabyte is no longer really big data, and it makes sense to talk about big data starting with at least 1 terabyte. If we were to put that in more mathematical terms, then it seems natural to talk about Big Data with regard to datasets which exceed the total data created in the world divided by 1000³.

2. Petaflops

For data to be useful, it’s not enough to store it; you also have to access and process it. One can measure a computer’s processing power by either instructions per second (IPS) or floating-point operations per second (FLOPS). While IPS is broader than FLOPS, it is also less precise and depends on the programming language used. FLOPS, on the other hand, are pretty easy to picture, as they are directly related to the number of multiplications/divisions we can do per second. For example, a simple handheld calculator needs several FLOPS to be functional, while most modern CPUs are in the range of 20–60 GFLOPS (gigaFLOPS = 1000³ FLOPS). The record-breaking supercomputer built by IBM in 2018 reached 122.3 petaFLOPS (1000⁵ FLOPS) sustained, with a peak performance of about 200 petaFLOPS, which is roughly two million times faster than an ordinary PC.
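A quick sanity check on that speed-up figure, using the round numbers quoted above (a sketch, not a benchmark):

# Rough speed-up of a 2018 top supercomputer over a commodity CPU, in FLOPS.
cpu_flops = 60e9          # ~60 GFLOPS, upper end of an ordinary desktop CPU
summit_flops = 122.3e15   # 122.3 petaFLOPS sustained (about 200 petaFLOPS peak)

print(f"speed-up: ~{summit_flops / cpu_flops:,.0f}x")   # roughly two million times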

DataOps –The Foundation for Your Agile Data Architecture

GPUs perform better with floating-point computations, with mass-market devices reaching several hundred GFLOPS. Things get interesting when you look at specialized architectures. The latest trend is building hardware to accelerate machine learning, and the best-known example is Google’s TPU, which reaches 45 teraFLOPS (1000⁴ FLOPS) and can be accessed through the cloud.

If you need to perform large computations and you don’t have yourself a supercomputer, the next best thing is to rent it, or compute on the cloud. Amazon gives you up to 1 petaFLOPS with P3 while Google offers a pod of TPUs with speed up to 11.5 petaFLOPS.

3. Artificial Intelligence and Big Data

Let’s put it all together: you have the data, you have computing power to match it, so it’s time to use them in order to gain new insights. To really benefit from both you have to turn to machine learning. Artificial Intelligence is at the forefront of data usage, helping in making predictions about weather, traffic or health (from discovering new drugs to early detection of cancer).


AI needs training to perform specialized tasks, and looking at how much training is needed to achieve peak performance is a great indicator of computing power vs data. There’s a great report by OpenAI from 2018 evaluating those metrics and concluding that since 2012 the compute used for AI training, measured in petaflop/s-days (petaFD), has been doubling every 3.5 months. One petaFD consists of performing 1000⁵ neural-net operations per second for one day, or a total of about 10²⁰ operations. The great thing about this metric is that it not only takes into account the architecture of a network (in the form of the number of operations needed) but also connects it with the implementation on current devices (compute time).


You can compare how much petaFD was used in recent advances in AI, by looking at the following chart:

The leader is, unsurprisingly, AlphaGo Zero by DeepMind, with over 1,000 petaFD used, or 1 exaFD. How much is that really in terms of resources? If you were to replicate the training yourself with the same hardware, you could easily end up spending close to $3m, as estimated here in detail. To put a lower estimate on it: based on the above chart, 1,000 petaFD is at least like using the best available Amazon P3 instance for 1,000 days. With the current price at $31.218 per hour, this would give $31.218 x 24 (hours) x 1,000 (days) = $749,232. This is a lower bound, as it assumes that one neural-net operation is one floating-point operation and that you get the same performance on the P3 as on the different GPUs/TPUs used by DeepMind.
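The lower-bound arithmetic is easy to reproduce (a sketch, using the same assumptions as above: one neural-net operation equals one floating-point operation, and identical per-hour performance on a rented P3 instance):

# Lower-bound cost of ~1,000 petaflop/s-days of training on a rented P3 instance.
petafd_needed = 1_000       # AlphaGo Zero, per the OpenAI compute chart
price_per_hour = 31.218     # USD per hour, the P3 price quoted above
hours = petafd_needed * 24  # one petaFD ~ a one-petaFLOPS machine for one day

print(f"${price_per_hour * hours:,.0f}")   # $749,232 -- the figure in the text

# For scale: one petaflop/s-day is about 1e15 ops/s * 86,400 s ~ 8.6e19 operations.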

Computational Storage: Edge Compute Deployment

This shows that AI needs a lot of power and resources to be trained. There are examples of recent advances in machine learning where not much was needed in terms of computing power or data, but more often than not, additional computing power is quite helpful. That is why building better supercomputers and larger data centers makes sense if we want to develop artificial intelligence, and thus our civilization as a whole. You can think about supercomputers similarly to the Large Hadron Collider — you build larger and larger colliders so that you can access deeper truths about our universe. The same is true of computing power and artificial intelligence. We don’t understand our own intelligence or how we perform creative tasks, but increasing the scale of FLOPS can help unravel the mystery.

Embrace the Zettabyte Era! And you had better profit from it quickly, as the Yottabyte Era is not far away.

As data gets bigger, what comes after a yottabyte?

An exabyte of data is created on the Internet each day, which equates to 250 million DVDs worth of information. And the idea of even larger amounts of data — a zettabyte — isn’t too far off when it comes to the amount of info traversing the web in any one year. Cisco estimates we’ll see 1.3 zettabytes of traffic annually over the internet in 2016 — and soon enough, we might need to start talking about even bigger volumes.

After a zettabyte comes the yottabyte, which big data scientists use to talk about how much government data the NSA or FBI have on people altogether. Put in terms of DVDs, a yottabyte would require 250 trillion of them. But we’ll eventually have to think bigger, and thanks to a presentation from Shantanu Gupta, director of Connected Intelligent Solutions at Intel, we now know the next-generation prefixes for going beyond the yottabyte: a brontobyte and a gegobyte.

A brontobyte, which isn’t an official SI prefix but is apparently recognized by some people in the measurement community, is a 1 followed by 27 zeros. Gupta uses it to describe the type of sensor data we’ll get from the internet of things. A gegobyte is 10 to the power of 30. It’s meaningless to think about how many DVDs that would be, but suffice it to say it’s more than I could watch in a lifetime.

Big Data Storage Options & Recommendations

And to drive home the influx of data, Gupta offered the following stats (although in the case of CERN, the SKA telescope and maybe the jet engine sensors, not all of that data needs to be stored):

  • On YouTube, 72 hours of video are uploaded per minute, translating to a terabyte every four minutes.
  • 500 terabytes of new data per day are ingested in Facebook databases.
  • The CERN Large Hadron Collider generates 1 petabyte per second.
  • The proposed Square Kilometer Array telescope will generate an exabyte of data per day.
  • Sensors from a Boeing jet engine create 20 terabytes of data every hour.

This begs the question: which tools should we use?

8 Best Big Data Hadoop Analytics Tools in 2021

Most companies have big data but are unaware of how to use it. Firms have started realizing how important it is for them to start analyzing data to make better business decisions.

With the help of big data analytics tools, organizations can now use the data to harness new business opportunities. This in turn will lead to smarter business decisions, happy customers, and higher profits. Big data tools are crucial and can help an organization in multiple ways: better decision-making, new products and services for customers, and greater cost-efficiency.

Let us further explore the top data analytics tools which are useful in big data:

1. Apache Hive

A Java-based, cross-platform tool, Apache Hive is a data warehouse built on top of Hadoop. A data warehouse is simply a place where data generated from multiple sources is stored on a single platform. Apache Hive is considered one of the best tools for data analysis, and a big data professional who is well acquainted with SQL can easily use it. The query language used here is HiveQL, or HQL.
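To give a flavour of HQL, here is a hedged sketch of querying Hive from Python using the third-party PyHive client; the host, database, table and column names are illustrative placeholders, not part of any real deployment.

# Minimal Hive query from Python via PyHive (pip install 'pyhive[hive]').
from pyhive import hive

conn = hive.Connection(host="hive-server.example.com", port=10000,
                       username="analyst", database="sales")
cursor = conn.cursor()

# HiveQL looks almost exactly like SQL:
cursor.execute("""
    SELECT region, SUM(amount) AS total
    FROM orders
    GROUP BY region
    ORDER BY total DESC
    LIMIT 10
""")

for region, total in cursor.fetchall():
    print(region, total)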

Pros:

Hive supports several storage formats, such as ORC, HBase, and plain text.

The HQL queries resemble SQL queries.

Hive can operate on compressed data stored inside the Hadoop ecosystem.

It offers built-in functions that are useful for data mining.

2. Apache Mahout

The name Mahout comes from the Hindi word for a person who rides an elephant; since its algorithms run on top of Hadoop (whose mascot is an elephant), the project was named Mahout. Apache Mahout is ideal for implementing machine learning algorithms on the Hadoop ecosystem. An important feature worth mentioning is that Mahout can also implement machine learning algorithms without requiring Hadoop integration.

Pros:

Composed of matrix and vector libraries.

Used for analyzing large datasets.

Ideal for machine learning algorithms.

Big data processing with apache spark

3. Apache Impala

Designed for Hadoop, Apache Impala is an open-source SQL engine. It offers faster processing and overcomes the speed-related issues found in Apache Hive. Impala uses SQL-like syntax and the same user interface and ODBC driver as Apache Hive, and it integrates easily with the Hadoop ecosystem for big data analytics.

Pros:

Offers easy integration.

It is scalable.

Provides security.

Offers in-memory data processing.

4. Apache Spark

It is an open-source framework used in data analytics, fast cluster computing, and even machine learning. Apache Spark is ideally designed for batch applications, interactive queries, streaming data processing, and machine learning.
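For example, a minimal PySpark sketch of a batch-style aggregation; the input path and column names are illustrative placeholders.

# Minimal PySpark job: read a CSV, aggregate, and show the result.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-per-user").getOrCreate()

events = spark.read.csv("hdfs:///data/events.csv", header=True, inferSchema=True)

(events
    .groupBy("user_id")
    .agg(F.count("*").alias("events"))
    .orderBy(F.desc("events"))
    .show(10))

spark.stop()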

Pros:

Easy and cost-efficient.

Spark offers a high-level library that is used for streaming.

Due to the powerful processing engine, it runs at a faster pace.

It has in-memory processing.

5. Apache Pig

Apache Pig was first developed by Yahoo to make programming easier for developers, and ever since it has offered the advantage of processing extensive datasets. Pig is used to analyze large datasets and represents them as data flows. Most of these tools can be learned through professional certifications from some of the top big data certification platforms available online. As big data keeps evolving, big data tools will be of the utmost significance to most industries.

Pros:

Known to handle multiple types of data.

Easily extensible.

Easy to program.

Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger

6. Apache Storm

Apache Storm is a free, open-source distributed real-time computation system, built with programming languages such as Java and Clojure. Apache Storm is used for streaming because of its speed, and it can also be used for real-time processing and machine learning workloads. Apache Storm is used by top companies such as Twitter, Spotify, and Yahoo.

Pros:

The operational level is easy.

Fault tolerance.

Scalable

7. Apache Sqoop

If there is a command-line tool developed by Apache for data transfer, it is Sqoop. Apache Sqoop’s major purpose is to import structured data from relational database management systems (RDBMS) such as Oracle and MySQL into the Hadoop Distributed File System (HDFS). Apache Sqoop can also transfer data from HDFS back into an RDBMS.
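For illustration, here is a hedged sketch of launching a Sqoop import from Python; the JDBC URL, credentials, table name and HDFS target directory are placeholders, and the flags should be checked against your installed Sqoop version.

# Kick off a Sqoop import (a thin Python wrapper around the sqoop CLI).
import subprocess

subprocess.run([
    "sqoop", "import",
    "--connect", "jdbc:mysql://db.example.com:3306/sales",
    "--username", "etl_user",
    "--password-file", "hdfs:///user/etl/.db_password",
    "--table", "orders",
    "--target-dir", "/data/raw/orders",
    "--num-mappers", "4",    # Sqoop parallelism: four map tasks
], check=True)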

Pros:

Sqoop controls parallelism.

Helps connect to the database server.

Offers feature to import data to HBase or Hive.

Floating on a RAFT: HBase Durability with Apache Ratis

8. HBase

HBase is a distributed, column-oriented, non-relational database. It is composed of multiple tables, and these tables consist of many rows of data. Each row has multiple column families, and each column family consists of key-value pairs. HBase is ideal when looking up small amounts of data within very large datasets.
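A hedged sketch of that row / column-family / key-value model from Python, using the third-party happybase Thrift client; the host, table and column names are made up, and the table (with its column families) is assumed to already exist.

# Store and fetch a row in HBase via happybase (pip install happybase);
# requires the HBase Thrift server to be running.
import happybase

connection = happybase.Connection("hbase-thrift.example.com")
table = connection.table("user_profiles")

# Each row key maps to column families ("info", "stats") holding key-value pairs.
table.put(b"user42", {
    b"info:name": b"Ada",
    b"stats:logins": b"17",
})

row = table.row(b"user42")
print(row[b"info:name"])   # b'Ada'

connection.close()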

Pros:

The Java API is used for client access.

It provides a block cache for real-time data queries.

Offers modularity and linear scalability.


Besides the above-mentioned tools, you can also use Tableau to provide interactive visualizations of the insights drawn from the data, and MapReduce, the parallel processing framework that helps Hadoop process data faster.

However, you need to pick the right tool for your project.

Data driven sustainability along the supply chain | Big-Data.AI Summit 2021

As Big Data Explodes, Are You Ready For Yottabytes?

The inescapable truth about big data, the thing you must plan for, is that it just keeps getting bigger. As transactions, electronic records, and images flow in by the millions, terabytes grow into petabytes, which swell into exabytes. Next come zettabytes and, beyond those, yottabytes.

A yottabyte is a billion petabytes. Most calculators can’t even display a number of that size, yet the federal government’s most ambitious research efforts are already moving in that direction. In April, the White House announced a new scientific program, called the Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, to “map” the human brain. Francis Collins, the director of the National Institutes of Health, said the project, which was launched with $100 million in initial funding, could eventually entail yottabytes of data.

Overcoming Kubernetes Storage Challenges with Composable Infrastructure

And earlier this year, the US Department of Defense solicited bids for up to 4 exabytes of storage, to be used for image files generated by satellites and drones. That’s right—4 exabytes! The contract award has been put on hold temporarily as the Pentagon weighs its options, but the request for proposals is a sign of where things are heading.

Businesses also are racing to capitalize on the vast amounts of data they’re generating from internal operations, customer interactions, and many other sources that, when analyzed, provide actionable insights. An important first step in scoping out these big data projects is to calculate how much data you’ve got—then multiply by a thousand.

If you think I’m exaggerating, I’m not. It’s easy to underestimate just how much data is really pouring into your company. Businesses are collecting more data, new types of data, and bulkier data, and it’s coming from new and unforeseen sources. Before you know it, your company’s all-encompassing data store isn’t just two or three times what it had been; it’s a hundred times more, then a thousand.

Not that long ago, the benchmark for databases was a terabyte, or a trillion bytes. Say you had a 1 terabyte database and it doubled in size every year—a robust growth rate, but not unheard of these days. That system would exceed a petabyte (a thousand terabytes) in 10 years.
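The doubling arithmetic is easy to check with a few lines of Python:

# A 1 TB database doubling every year crosses the petabyte mark at year 10.
size_tb = 1
for year in range(1, 11):
    size_tb *= 2
    print(f"year {year:2d}: {size_tb:5d} TB")
# year 10: 1024 TB, i.e. just over a petabyte (1,000 TB).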

Challenges of the Software-Defined Data Center of the Future: Datera

And many businesses are accumulating data even faster. For example, data is doubling every six months at Novation, a healthcare supply contracting company, according to Alex Latham, the company’s vice president of e-business and systems development. Novation has deployed Oracle Exadata Database Machine and Oracle’s Sun ZFS Storage appliance products to scale linearly—in other words, without any slowdown in performance—as data volumes keep growing. (In this short video interview, Latham explains the business strategy behind Novation’s tech investment.)

Terabytes are still the norm in most places, but a growing number of data-intensive businesses and government agencies are pushing into the petabyte realm. In the latest survey of the Independent Oracle Users Group, 5 percent of respondents said their organizations were managing 1 to 10 petabytes of data, and 6 percent had more than 10 petabytes. You can find the full results of the survey, titled “Big Data, Big Challenges, Big Opportunities,” here.

These burgeoning databases are forcing CIOs to rethink their IT infrastructures. Turkcell, the leading mobile communications and technology company in Turkey, has also turned to Oracle Exadata Database Machine, which combines advanced compression, flash memory, and other performance-boosting features, to condense 1.2 petabytes of data into 100 terabytes for speedier analysis and reporting.

Envisioning a Yottabyte

Some of these big data projects involve public-private partnerships, making best practices of utmost importance as petabytes of information are stored and shared. On the new federal brain-mapping initiative, the National Institutes of Health is collaborating with other government agencies, businesses, foundations, and neuroscience researchers, including the Allen Institute, the Howard Hughes Medical Institute, the Kavli Foundation, and the Salk Institute for Biological Studies.

Lifting the Clouds: Storage Challenges in a Containerized Environment


Space exploration and national intelligence are other government missions soon to generate yottabytes of data. The National Security Agency’s new 1-million-square-foot data center in Utah will reportedly be capable of storing a yottabyte.

That brings up a fascinating question: Just how much storage media and real-world physical space are necessary to house so much data that a trillion bytes are considered teensy-weensy? By one estimate, a zettabyte (that’s 10 to the twenty-first power) of data is the equivalent of all of the grains of sand on all of Earth’s beaches.

Of course, IT pros in business and government manage data centers, not beachfront, so the real question is how can they possibly cram so much raw information into their data centers, and do so when budget pressures are forcing them to find ways to consolidate, not expand, those facilities?

The answer is to optimize big data systems to do more with less—actually much, much more with far less. I mentioned earlier that mobile communications company Turkcell is churning out analysis and reports nearly 10 times faster than before. What I didn’t say was that, in the process, the company also shrank its floor space requirements by 90 percent and energy consumption by 80 percent through its investment in Oracle Exadata Database Machine, which is tuned for these workloads.

Businesses will find that there are a growing number of IT platforms designed for petabyte and even exabyte workloads. A case in point is Oracle’s StorageTek SL8500 modular library system, the world’s first exabyte storage system. And if one isn’t enough, 32 of those systems can be connected to create 33.8 exabytes of storage managed through a single interface.

So, as your organization generates, collects, and manages terabytes upon terabytes of data, and pursues an analytics strategy to take advantage of all of that pent-up business value, don’t underestimate how quickly it adds up. Think about all of the grains of sand on all of Earth’s beaches, and remember: The goal is to build sand castles, not get buried by the sand.

The BRAIN Initiative

The BRAIN Initiative — short for Brain Research through Advancing Innovative Neurotechnologies — builds on the President’s State of the Union call for historic investments in research and development to fuel the innovation, job creation, and economic growth that together create a thriving middle class.

The Initiative promises to accelerate the invention of new technologies that will help researchers produce real-time pictures of complex neural circuits and visualize the rapid-fire interactions of cells that occur at the speed of thought. Such cutting-edge capabilities, applied to both simple and complex systems, will open new doors to understanding how brain function is linked to human behavior and learning, and the mechanisms of brain disease.

In his remarks this morning, the President highlighted the BRAIN Initiative as one of the Administration’s “Grand Challenges” – ambitious but achievable goals that require advances in science and technology to accomplish. The President called on companies, research universities, foundations, and philanthropies to join with him in identifying and pursuing additional Grand Challenges of the 21st century—challenges that can create the jobs and industries of the future while improving lives.

In addition to fueling invaluable advances that improve lives, the pursuit of Grand Challenges can create the jobs and industries of the future.

Data Science Crash Course

That’s what happened when the Nation took on the Grand Challenge of the Human Genome Project. As a result of that daunting but focused endeavor, the cost of sequencing a single human genome has declined from $100 million to $7,000, opening the door to personalized medicine.

Like sequencing the human genome, President Obama’s BRAIN Initiative provides an opportunity to rally innovative capacities in every corner of the Nation and leverage the diverse skills, tools, and resources from a variety of sectors to have a lasting positive impact on lives, the economy, and our national security.

That’s why we’re so excited that critical partners from within and outside government are already stepping up to the President’s BRAIN Initiative Grand Challenge.

The BRAIN Initiative is launching with approximately $100 million in funding for research supported by the National Institutes of Health (NIH), the Defense Advanced Research Projects Agency (DARPA), and the National Science Foundation (NSF) in the President’s Fiscal Year 2014 budget. 

Foundations and private research institutions are also investing in the neuroscience that will advance the BRAIN Initiative.  The Allen Institute for Brain Science, for example, will spend at least $60 million annually to support projects related to this initiative.  The Kavli Foundation plans to support BRAIN Initiative-related activities with approximately $4 million dollars per year over the next ten years.  The Howard Hughes Medical Institute and the Salk Institute for Biological Studies will also dedicate research funding for projects that support the BRAIN Initiative.

This is just the beginning. We hope many more foundations, Federal agencies, philanthropists, non-profits, companies, and others will step up to the President’s call to action. 

CBL-Mariner

Microsoft Linux is Here!


CBL-Mariner is an internal Linux distribution for Microsoft’s cloud infrastructure and edge products and services. CBL-Mariner is designed to provide a consistent platform for these devices and services and will enhance Microsoft’s ability to stay current on Linux updates. This initiative is part of Microsoft’s increasing investment in a wide range of Linux technologies, such as SONiC, Azure Sphere OS and Windows Subsystem for Linux (WSL). CBL-Mariner is being shared publicly as part of Microsoft’s commitment to Open Source and to contribute back to the Linux community. CBL-Mariner does not change our approach or commitment to any existing third-party Linux distribution offerings.

CBL-Mariner has been engineered with the notion that a small common core set of packages can address the universal needs of first party cloud and edge services while allowing individual teams to layer additional packages on top of the common core to produce images for their workloads. This is made possible by a simple build system that enables:

Package Generation: This produces the desired set of RPM packages from SPEC files and source files.

Image Generation: This produces the desired image artifacts like ISOs or VHDs from a given set of packages.

Whether deployed as a container or a container host, CBL-Mariner consumes limited disk and memory resources. The lightweight characteristics of CBL-Mariner also provide faster boot times and a minimal attack surface. By focusing the features in the core image on just what is needed for our internal cloud customers, there are fewer services to load and fewer attack vectors.

When security vulnerabilities arise, CBL-Mariner supports both a package-based update model and an image-based update model. Leveraging the common RPM Package Manager system, CBL-Mariner makes the latest security patches and fixes available for download, with the goal of fast turnaround times.

More Information:

https://towardsdatascience.com/how-big-is-big-data-3fb14d5351ba

https://www.hadoop360.datasciencecentral.com

https://docs.microsoft.com/en-us/windows/wsl/about

https://github.com/microsoft/CBL-Mariner

https://www.forbes.com/sites/oracle/2013/06/21/as-big-data-explodes-are-you-ready-for-yottabytes/

https://obamawhitehouse.archives.gov/blog/2013/04/02/brain-initiative-challenges-researchers-unlock-mysteries-human-mind

https://obamawhitehouse.archives.gov/the-press-office/2013/04/02/fact-sheet-brain-initiative

https://braininitiative.nih.gov/resources/publications?combine=&apply_filter=yes&field_priority_area_tid%5B%5D=4&field_program_tid%5B%5D=48&field_year_value%5Bvalue%5D%5Byear%5D=&apply_filter=yes

https://brainblog.nih.gov

https://pubmed.ncbi.nlm.nih.gov/31566565/

https://www.delltechnologies.com/ru-az/events/webinar/index.htm

https://crgconferences.com/datasciencewebinar

https://elifesciences.org/articles/48750

https://www.wired.com/2012/03/ff-nsadatacenter/

http://www.technologybloggers.org/internet/how-many-human-brains-would-it-take-to-store-the-internet/


WHAT IS LF EDGE

LF Edge is an umbrella organization that aims to establish an open, interoperable framework for edge computing independent of hardware, silicon, cloud, or operating system. By bringing together industry leaders, LF Edge will create a common framework for hardware and software standards and best practices critical to sustaining current and future generations of IoT and edge devices.

We are fostering collaboration and innovation across multiple industries, including industrial manufacturing, cities and government, energy, transportation, retail, home and building automation, automotive, logistics and health care — all of which stand to be transformed by edge computing.

What is LF Edge

Project EVE Promotes Cloud-Native Approach to Edge Computing

The LF Edge umbrella organization for open source edge computing that was announced by The Linux Foundation last week includes two new projects: Samsung Home Edge and Project EVE. We don’t know much about Samsung’s project for home automation, but we found out more about Project EVE, which is based on Zededa’s edge virtualization technology. Last week, we spoke with Zededa co-founder Roman Shaposhnik about Project EVE, which provides a cloud-native based virtualization engine for developing and deploying containers for industrial edge computers (see below).

LF Edge aims to establish “an open, interoperable framework for edge computing independent of hardware, silicon, cloud, or operating system.” It is built around The Linux Foundation’s telecom-oriented Akraino Edge Stack, as well as its EdgeX Foundry, an industrial IoT middleware project.

Like the mostly proprietary cloud-to-edge platforms emerging from Google (Google Cloud IoT Edge), Amazon (AWS IoT), Microsoft (Azure Sphere), and most recently Baidu (OpenEdge), among others, LF Edge envisions a world where software running on IoT gateway and edge devices evolves top down from the cloud rather than from the ground up with traditional embedded platforms.

The Linux Foundation also supports numerous “ground up” embedded projects, such as the Yocto Project and IoTivity, but with LF Edge it has taken a substantial step toward the cloud-centric paradigm. The touted benefits of a cloud-native approach for embedded include easier software development, especially when multiple apps are needed, and improved security via virtualized, regularly updated container apps. Cloud-native edge computing should also enable more effective deployment of cloud-based analytics on the edge while reducing expensive, high-latency cloud communications.

None of the four major cloud operators listed above are currently members of LF Edge, which poses a challenge for the organization. However, there’s already a deep roster of companies onboard, including Arm, AT&T, Dell EMC, Ericsson, HPE, Huawei, IBM, Intel, Nokia Solutions, Qualcomm, Radisys, Red Hat, Samsung, Seagate, and WindRiver (see the LF Edge announcement for the full list.)

With developers coming at the edge computing problem from both the top-down and bottom-up perspectives, often with limited knowledge of the opposite realm, the first step is agreeing on terminology. Back in June, the Linux Foundation launched an Open Glossary of Edge Computing project to address this issue. Now part of LF Edge, the Open Glossary effort “seeks to provide a concise collection of terms related to the field of edge computing.”

There’s no mention of Linux in the announcements for the LF Edge projects, all of which propose open source, OS-agnostic, approaches to edge computing. Yet, there’s no question that Linux will be the driving force here.

Project EVE aims to be the Android of edge computing

Project EVE is developing an “open, agnostic and standardized architecture unifying the approach to developing and orchestrating cloud-native applications across the enterprise edge,” says the Linux Foundation. Built around an open source EVE (Edge Virtualization Engine) version of the proprietary Edge Virtualization X (EVx) engine from Santa Clara startup Zededa, Project EVE aims to reinvent embedded using Docker containers and other open source cloud-native software such as Kubernetes. Cloud-native edge computing’s “simple, standardized orchestration” will enable developers to “extend cloud applications to edge devices safely without the need for specialized engineering tied to specific hardware platforms,” says the project.

Earlier this year, Zededa joined the EdgeX Foundry project, and its technology similarly targets the industrial realm. However, Project EVE primarily concerns the higher application level rather than middleware. The project’s cloud-native approach to edge software also connects it to another LF project: the Cloud Native Computing Foundation.

In addition to its lightweight virtualization engine, Project EVE also provides a zero-trust security framework. In conversation with Linux.com, Zededa co-founder Roman Shaposhnik proposed to consign the word “embedded” to the lower levels of simple, MCU-based IoT devices that can’t run Linux. “To learn embedded you have to go back in time, which is no longer cutting it,” said Shaposhnik. “We have millions of cloud-native software developers who can drive edge computing. If you are familiar with cloud-native, you should have no problem in developing edge-native applications.”

If Shaposhnik is critical of traditional, ground-up embedded development, with all its complexity and lack of security, he is also dismissive of the proprietary cloud-to-edge solutions. “It’s clear that building silo’d end-to-end integration cloud applications is not really flying,” he says, noting the dangers of vendor lock-in and lack of interoperability and privacy.

To achieve the goals of edge computing, what’s needed is a standardized, open source approach to edge virtualization that can work with any cloud, says Shaposhnik. Project EVE can accomplish this, he says, by being the edge computing equivalent of Android.

“The edge market today is where mobile was in the early 2000s,” said Shaposhnik, referring to an era when early mobile OSes such as Palm, BlackBerry, and Windows Mobile created proprietary silos. The iPhone changed the paradigm with apps and other advanced features, but it was the far more open Android that really kicked the mobile world into overdrive.

“Project EVE is doing with edge what Android has done with mobile,” said Shaposhnik. The project’s standardized edge virtualization technology is the equivalent of Android package management and Dalvik VM for Java combined, he added. “As a mobile developer you don’t think about what driver is being used. In the same way our technology protects the developer from hardware complexity.”

Project EVE is based on Zededa’s EVx edge virtualization engine, which currently runs on edge hardware from partners including Advantech, Lanner, SuperMicro, and Scalys. Zededa’s customers are mostly large industrial or energy companies that need timely analytics, which increasingly requires multiple applications.

“We have customers who want to optimize their wind turbines and need predictive maintenance and vibration analytics,” said Shaposhnik. “There are a half dozen machine learning and AI companies that could help, but the only way they can deliver their product is by giving them a new box, which adds to cost and complexity.”

A typical edge computer may need only a handful of different apps rather than the hundreds found on a typical smartphone. Yet, without an application management solution such as virtualized containers, there’s no easy way to host them. Other open source cloud-to-edge solutions that use embedded container technology to provide apps include the Balena IoT fleet management solution from Balena (formerly Resin.io) and Canonical’s container-like Ubuntu Core distribution.

Right now, the focus is on getting the open source version of EVx out the door. Project EVE plans to release a 1.0 version of EVE in the second quarter, along with an SDK for developing EVE edge containers. An app store platform will follow later in the year.

Whether or not edge computing serves as the backbone of mission-critical business worldwide depends on the success of the underlying network.

Linux Foundation's Project EVE: a Cloud-Native Edge Computing Platform

Recognizing the edge’s potential and the urgency of supporting edge networks, the Linux Foundation earlier this year created LF Edge, an umbrella organization dedicated to creating an open, agnostic and interoperable framework for edge computing. Similar to what the Cloud Native Computing Foundation (CNCF) has done for cloud development, LF Edge aims to enhance cooperation among key players so that the industry as a whole can advance more quickly.

By 2021, Gartner forecasts that there will be approximately 25 billion IoT devices in use around the world. Each of those devices, in turn, has the capacity to produce immense volumes of valuable data. Much of this data could be used to improve business-critical operations — but only if we’re able to analyze it in a timely and efficient manner. As mentioned above, it’s this combination of factors that has led to the rise of edge computing as one of the most rapidly developing technology spaces today.

This idea of interoperability at the edge is particularly important because the hardware that makes up edge devices is so diverse — much more so than servers in a data center. Yet for edge computing to succeed, we need to be able to run applications right on local gateway devices to analyze and respond to IoT and Industry 4.0 data in near-real time. How do you design applications that are compatible with a huge variety of hardware and capable of running without a reliable cloud connection? This is the challenge that LF Edge is helping to solve.

Part of the solution is Project EVE, an Edge Virtualization Engine donated to LF Edge by ZEDEDA last month. I think of EVE as doing for the edge what Android did for mobile phones and what VMware did for data centers: decoupling software from hardware to make application development and deployment easier.

This curious (and somewhat unexpected) interplay between mobile and server computing requirements is exactly what makes edge so exciting. As an open source project, EVE now has a unique opportunity to blend the best parts of building blocks from projects as diverse as Android, ChromeOS, CoreOS, Qubes OS, Xen, Linuxkit, Linuxboot, Docker, Kubernetes and unikernels (AKA library operating systems — out of which AtmanOS is our favorite). And if you are still not convinced that all of these projects have much in common, simply consider this:

Today’s edge hardware is nothing like the underpowered, specialized embedded hardware of yesterday. All of these boxes typically come with a few gigabytes of RAM, dozens (if not hundreds) of GBs of flash and modern, high-speed CPUs with the latest features (like virtualization extensions) available by default. In short, they are very capable of supporting exactly the same cloud-native software abstractions developers now take for granted in any public cloud: containers, immutable infrastructure, 12-factor apps and continuous delivery software pipelines. From this perspective, edge hardware starts to look very much like servers in a data center (be it a public cloud or a private colo). At the same time:

These boxes are deployed out in the wild. Which means when it comes to security and network requirements, they exist in a world that looks nothing like a traditional data center. In fact, it looks a lot like the world mobile computing platforms have evolved in. Just like iPhones, these boxes get stolen, disassembled and hacked all the time in the hopes that secrets inside of them can be revealed and used as attack vectors. On the networking side, the similarity is even more striking: the way our smartphones have to constantly cope with ill-defined, flaky and heterogeneous networks (hopping between WiFi and LTE, for example) sets up a really good model for how to approach edge computing networking.

There’s no denying that EVE stands on the shoulders of all these open source giants that came before it and yet it has plenty of its own open source development to be done. In the remainder of this article, I’ll cover some of the technical details of Project EVE.

Project EVE overview

Fundamentally, EVE is a replacement for traditional (or even some of the real-time) operating systems (Linux, Windows, VxWorks, etc.) that are commonplace today in IoT and edge deployments. EVE takes control right after UEFI/BIOS and we have future plans around Linuxboot to have EVE actually replace your UEFI/BIOS altogether.

There are three key components of EVE: a type-1 hypervisor, running directly on bare metal; an Edge Container runtime that allows you to run applications in either a virtual machine or container; and a hardened root-of-trust implementation for security. A full list of hardware that EVE was tested on is available on the project’s Wiki page, but we expect EVE to run on most modern edge computing hardware (including products from major companies like Advantech and Supermicro, as well as architectures from ARM and Intel).

Project EVE Introduction

Once the EVE instance is up and running, the first thing it does is contact a pre-defined controller and receive instructions from the controller on how to configure itself and what workloads to start executing. The controller builds these instruction manifests for every EVE-enabled device that it knows about, based on the overall orchestration requests it receives from the DevOps team rolling out a given deployment.

The API that EVE uses to talk to the controller is part of the LF Edge standardization efforts, and we fully expect it to evolve into the de facto industry standard for how edge virtualization infrastructure is controlled and monitored. You can see the current version of the API and documentation in EVE’s GitHub repository.
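To make the device-to-controller flow above concrete, here is a minimal, hypothetical sketch in Go of a device-side polling loop. The endpoint path, JSON field names, poll intervals, and the absence of authentication are all illustrative assumptions for explanation; the real EVE/controller API is defined in the project’s GitHub repository.

```go
// Hypothetical sketch of the device-side polling loop described above.
// Endpoint paths, field names, and intervals are illustrative assumptions,
// not the actual EVE/controller API.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// DeviceConfig is a stand-in for the manifest a controller might return.
type DeviceConfig struct {
	Version   string   `json:"version"`
	Workloads []string `json:"workloads"` // edge containers this device should run
}

func main() {
	const controller = "https://controller.example.com/api/v1/device/config"

	for {
		resp, err := http.Get(controller) // a real device would authenticate, e.g. with a device certificate
		if err != nil {
			fmt.Println("controller unreachable, will retry:", err)
			time.Sleep(30 * time.Second)
			continue
		}
		var cfg DeviceConfig
		if err := json.NewDecoder(resp.Body).Decode(&cfg); err == nil {
			fmt.Printf("apply config %s with %d workload(s)\n", cfg.Version, len(cfg.Workloads))
		}
		resp.Body.Close()
		time.Sleep(60 * time.Second) // the device always polls; the controller never dials in
	}
}
```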

The kinds of workloads that DevOps teams will deploy to EVE-enabled devices are packaged as Edge Containers. Edge Containers are meant to be an extension of traditional OCI Containers, and the effort around their standardization will be ongoing in LF Edge in the coming months. The idea behind Edge Container extensions is to allow for seamless integration between virtual machine, unikernel and container workloads through a single packaging and distribution format.

Continuing with our Android analogy, one may say that while EVE is trying to do for the edge what Android has done for mobile, Edge Containers are meant to be the APKs of the edge.

All of EVE’s functionality is provided by a series of individual Go microservices that run in full isolation from each other, similar to the pioneering ideas of radical isolation introduced by Qubes OS. Our ultimate goal is to make each of those microservices a standalone unikernel running directly on top of a type-1 hypervisor without requiring any operating system at all. We are planning to leverage the excellent work done by the AtmanOS community in order to achieve that.

All of EVE’s microservices and infrastructure elements (think boot loader, Linux kernel, etc.) are tied together into a Linuxkit-like distribution that allows us to provide bootable EVE images ready to be deployed on Intel- and ARM-based edge hardware.

Our root-of-trust architecture leverages TPM and TEE hardware elements and provides a solid foundation for implementing flexible secret management, data encryption and measured boot capabilities without burdening application developers with any of that complexity.

Finally, on the connectivity side, EVE offers flexible networking capabilities to its Edge Containers through transparent integration of the LISP protocol and crypto-routing. That way, EVE can provide SD-WAN and mesh networking functionality right out of the box, without requiring additional integration efforts.

Putting it all together, the internals of EVE’s architecture look something like this:

While this architecture may seem complex and daunting at times, we’re rapidly investing in documenting it and making it more flexible to work with. The EVE community shares the spirit of the Apache Way and believes in “Community over Code.” We welcome any and all types of contributions that benefit the community at large, not just code contributions:

  • Providing user feedback;
  • Sharing your use cases;
  • Evangelizing or collaborating with related products and technologies;
  • Maintaining our wiki;
  • Improving documentation;
  • Contributing test scenarios and test code;
  • Adding or improving hardware support;
  • Fixing bugs and adding new features.

The most important part of Project EVE is that it’s an open standard for the community, designed to make it easier for others to create and deploy applications for the edge. Now that the code is officially open sourced through LF Edge, it’s also available for anyone to contribute to and explore.

Shaposhnik: I think through the introduction, it's pretty clear who I am. If you're interested in talking to me about some of the other things that I do in the open source, feel free to do that. I happen to be very involved in Apache Software Foundation and Linux Foundation.

Today, we will be talking about edge computing. Let's start by defining the term, what is edge computing? I think we started a long time ago with IoT, Internet of Things. Then Cisco introduced this term called fog computing, which was telco-ish, IoT view. I think edge computing to me is very simple. It is basically cloud native IoT. It is when the small devices, I call them computers outside of data centers, they start to be treated by developers in a very cloud native way. People say, "We've been doing it for years. What's different?" The difference is it's all of the APIs and all of the things that we take for granted in the cloud and even in the private data center today. That actually took time to develop. We didn't start with Kubernetes, and Docker, and orchestration tools, and mesh networks. We started with individual machines. We started with individual rackable servers. That's basically what IoT is still, individual machines. The whole hope is that we can make it much better and much more exciting by applying some of the cloud native paradigms like liquid software, pipeline delivery, CI/CD, DevOps, that type of thing, but with the software running outside of your data center.

When I talk about edge, let's actually be very specific, because there are different types of edge. I will cover the edge I will not be talking about. Very specifically, let's talk about the edge that's very interesting to me, and I think it should be interesting to all of you. These are the type of devices that some people called deep edge, some people call enterprise edge. These are basically computers that are attached to some physical object. That physical object could be a moving vehicle. It could be a big turbine generating electricity. It could be a construction site. The point being is that something is happening in the real world and you either need to capture data about that something, or you need to drive the process of that something. Manufacturing is a really good example. You have your pipeline. You're manufacturing your product. You need to control that process. You have a computer that is typically called industrial PC attached to it. Same deal with a construction site, or even your local McDonald's. In McDonald's, you want to orchestrate the experience of your customers. You have a little computer that's attached to the cash register. You have a little computer that's attached to the display, and all of that needs to be orchestrated.

What I'm not talking about, I'm not actually talking about two things. I'm not talking about Raspberry Pis. There's definitely a lot of excitement about Raspberry Pis. It's interesting because if you think about the original motivation for the Raspberry Pi, it was to give underprivileged kids access to computing. It was basically to replace your personal laptop or desktop with essentially a very inexpensive device. The fact that Raspberry Pis now find their way into pretty much every single personal IoT project is almost a byproduct of how they designed the thing. I have yet to see Raspberry Pis being used for business; most of the time they just stop at the level of you personally doing something, or maybe you doing something with your friends, your hackerspace. Today, we'll not be talking about any of that. The reason we're not talking about that is because just like with container orchestration and Docker, you don't really need those tools unless you actually do some level of production. You don't really need those tools if you're just tinkering. You don't need Kubernetes to basically run your application if you're just writing an application for yourself. You only need Kubernetes if that is something that actually generates some business. We will not be talking about Raspberry Pis. We'll not be talking about telco edge, edge of the network, all of that.

Even this slice of the edge computing alone, given various estimations, represents a huge total addressable market. The biggest reason for that is the size of the data. These computers are connected to something that is in the real world. The data originates in the real world. The previous presentation today about self-driving vehicle from Uber is a perfect example of that. There's so much data that the vehicle is gathering, even if it was legally allowed, it is completely impossible to transfer all of that data to the big cloud in the sky for any processing. You have to orchestrate that behavior on the edge. As practitioners, we actually have to figure out how to do that. I was a little bit underwhelmed that Uber is focusing more on the machine learning. I understand why, but I'm an infrastructure guy. Today, I will be talking to you about infrastructure, how to make those types of applications easily deployable.

The good news is the total addressable market. The bad news is that it's a little bit of a situation like building the airplane while it's in flight. I think it would be fair to say that edge computing today is where cloud computing was in 2006. 2006, Amazon was starting to introduce EC2. Everybody was saying, it's crazy, it will never work. People at Netflix started doing microservices. Everybody says it's crazy, it will never work. The rest is history. Edge computing is a little bit of that. My goal today is to give you enough understanding of the space, to give you enough understanding of the challenges in this space but also the opportunities in this space. Also, explain maybe a little bit of the vocabulary of this space so you can orient yourself. I cannot give you the tools. I cannot really give you something that you will be immediately productive at your workplace, the same way that I can talk about Kubernetes, or Kafka, or any other tool that's fairly mature. Edge computing is just happening in front of our eyes. To me, that's what makes it exciting.

In a way, when I say cloud native, to me, edge computing represents basically one final cloud that we're building, because we've built a lot of the public clouds. There's Google. There is Microsoft. There is obviously Amazon. All of these are essentially in the business of getting all of the applications that don't have to have any physicality attached to them. What we're trying to do is we're trying to basically build a distributed cloud from the API perspective that will be executing on the equipment that doesn't belong to the same people who run public clouds. Edge computing is where ownership belongs to somebody else, not the infrastructure provider. From any other perspective, it's just the cloud. People always ask me, "If edge is just another cloud, can we actually reuse all of the software that we developed for the cloud and run it on these small computers"?

Project EVE Architecture Overview

It used to be a challenge even to do that, because those computers used to be really small. The good news now is that the whole space of IoT bifurcated. The only constraint that you have from now on is power budget. It might still be the case that you have to count every single milliamp. If you're in that type of a business, you're doing essentially snowflake and bespoke things all the time. There's really no commonality that I can give you because everything has to be so super tightly integrated, because you're really in a very constrained power budget. For everything else, where power is not a problem, silicon cost used to be a problem, but that's not the case anymore. Thanks to the economy of scale, you can basically get Raspberry Pi class devices for essentially a couple dozen bucks. It actually costs more to encase them in a way that would make them weatherproof than to actually produce the silicon.

The computers are actually pretty powerful. These are the type of computers we used to have in our data centers five years ago. Five years ago, public cloud existed. Five years ago, Kubernetes already existed. Docker definitely existed. The temptation is to take that software and run it at the edge. There have been numerous attempts to rub some Kubernetes on it because, obviously, that's what we do. We try to reuse as much as possible. Pretty much every attempt of reusing the implementation that I know of failed. I can talk in greater details of why that is. APIs are still very useful. If you're taking the implementation that Kubernetes gives you today, that will not work for two reasons. First of all, it will not work because of the network issues. All of those devices happen to be offline more than they are online. Kubernetes is not happy about that type of situation. Second of all, and this is where you need to start appreciating the differences of why edge is different, interestingly enough, in the data center, the game that Kubernetes and all of these orchestration technologies play is essentially a game of workload consolidation. You're trying to run as many containers on as few servers as possible. The scalability requirements that we're building the Kubernetes-like platforms with are essentially not as many servers and tons of containers and applications. On the edge, it's exactly the reverse. On the edge, you basically have maybe half a dozen applications on each box, because boxes are ok, but they're still 4, 8 gigs of memory. It's not like your rackable server, but you have a lot of them.

Here's one data point that was given to us by one of our biggest customers. There's an industrial company called Siemens. That industrial company is in the business of managing and supporting industrial PCs that are attached to all things. Today, they have a challenge of managing 10 million of those industrial PCs. By various estimations, total number of servers inside of all of the Amazon data centers is single digit millions. That gives you a feel for what scale we should actually be building this for.

Finally, the economics of the edge is not the same as with the data center. All of these challenges essentially make you think: we can reuse some of the principles that made cloud so successful and so developer friendly nowadays, but we actually have to come up with slightly different implementations. My thesis is that edge computing will be this really interesting, weird mix of traditional data center requirements and, actually, mobile requirements. Because the original edge computing, I would argue, is the Microsoft Xbox. With it we really got our first taste for what an edge computing-like platform could look like. All of the things that made it so, the platforms, Android or iOS, the mobile device management approaches, cloud, Google Play Store or Google services, all of that will actually find its way into the edge. We have to think about how that will look. We also need to think about traditional data center architectures, like operating systems, hypervisors, all of that. I will try to outline and map out how the Linux Foundation is trying to approach this space.

Open Source Edge Computing Platforms - Overview

Edge is actually pretty diverse, not just in terms of the ownership, but also in terms of the hardware and applications. Today, let's take industrial PCs. Pretty much all of them are running Windows. They're all x86 based hardware running Windows. When I say Windows, I actually mean Windows XP. Yes, it exists. A lot of SCADA applications are still based on Windows XP. If you show up as a developer and start razzle-dazzling these customers with your cloud native microservices-based architectures, the first question that they're going to ask you is, "It's all great. This is the new stuff. What about my old stuff? I want to keep running my old stuff. Can you give me a platform that would be able to support my old stuff, while I am slowly rebuilding it in this new next-generation architecture?" That becomes one of the fundamental requirements.

Scale, we already talked about the geographic aspect of it and deployments and the maintenance. The security is also interesting. Edge computing, unlike the data center, is much closer to the mobile world. Because edge computing is physical, which means you cannot really rely on physical security to protect it. It's not like there is a guy holding a machine gun in front of a data center; you cannot put that guy in front of every single edge computing device. You basically have to build your platform very similarly to how iOS and Android are protecting all of your personal data. That's not something that data center people are even thinking about, because in a data center, you have your physical security and you have your network security. We are done with that. On a perimeter, you pay a lot of attention to it, but within the data center, not so much.

Also, interestingly enough, what I like about edge is that edge is probably the hardest one to really succumb to a vendor lock-in. Because the diversity is such that not a single vendor like a big cloud provider can actually handle it all. Edge is driven a lot by system integrator companies, SIs. SIs are typically pretty vertical. There may be an SI that is specializing in industrial, in retail, this and that. That diversity is actually good news for us as developers because we will not see the same concentration of power like we're seeing in the public cloud today, so I think it's good for us.

A lot of what I will be covering in this talk overlaps with this other talk that was just made publicly available, so I wanted to pitch it. This is the first time ever that the Microsoft Xbox team talked about how they developed the platform for Xbox. That was done about a month ago, maybe two months ago, first time ever. A lot of the same principles apply, which makes me happy because we thought about them independently. The tricks that they played are really fascinating. The challenges they faced are very similar to the edge. If you want to hear from somebody who can claim that they successfully developed an edge platform, listen to those guys. I'm talking about the platform that's being developed. Mine can still fail; theirs is pretty successful.


Enterprise Edge Computing with Project EVE - Jason Shepherd, Zededa

Let's switch gears a little bit and talk about how Linux Foundation got involved in all of this. I shouldn't be the one to tell you that Cloud Native Compute Foundation has been super successful. In a way, I would say that Kubernetes was the first Google project that was successful precisely because of CNCF. I love Google, but they have a tendency of just throwing their open-source project over the wall and basically say, "If you like it, use it, if you don't, not our problem." Kubernetes was the first one where they actively tried to build a community. The fact that they went and donated it to Linux Foundation, and that was the anchor tenant for the Cloud Native Compute Foundation, I think made all the difference. Obviously, Linux Foundation itself was pretty happy about this outcome. They would like to do more of it.

The thought process went exactly like what I was talking about. When I say inside of data centers, I mean public cloud or your private data center. It doesn't matter. It's just a computer inside of a data center. For all of that, there's basically a forum of technologists that can decide, what is the common set of best practices that we all need to apply to the space to be more productive, more effective? That's CNCF, Cloud Native Compute Foundation. For all of the computers outside of data centers, it feels like we at least need to provide that type of forum even if we don't really have an anchor tenant like Kubernetes still. We need to give people a chance to talk among themselves, because otherwise there is really no way for them to synchronize on how the technology gets developed. That's LF EDGE.

Linux Foundation Edge Initiative was announced, not that long ago, actually, this year. It was announced in January, February this year. My company, ZEDEDA, we ended up being one of the founding members. We donated our project. There are a lot of companies in the space that are now part of the LF EDGE, so if you're interested, you can go to this lfedge.org website. The membership is pretty vast at this point. These are the premium members. There are also tons of general members. A lot of the good discussions are already happening within LF EDGE.

To give you a complete picture, what does LF EDGE cover? LF EDGE basically covers all of the computers outside of data centers. It starts with what we consider to be partial edge. A partial edge would be a quasi data center. It's not quite a data center, but it looks almost like a data center if you squint. A good example of that would be a telco central office, a telco CO. It's not really built to the same specification that a telco data center or a hyperscale data center would be built for, but a lot of technologies still apply. That's definitely in scope for LF EDGE. Then we basically go to telco access points. These are already physical devices. We're talking base stations. We're talking 5G deployments. These are all of the things in the CD infrastructure, or any infrastructure that would have to run some compute on them. That's definitely in scope for LF EDGE. Both of these are pretty dominated by telcos today, for good reason, because they're probably the best example of that type of an edge computing.

Then there are two other examples of edge. One that I will spend a lot of time talking about, we call it, for now, enterprise edge. This is basically all of those industrial PCs, IoT gateways. An example of the enterprise edge would be also a self-driving vehicle. Uber or Tesla building it would be also an example. Finally, there's obviously consumer edge. This is all of your washers, and dryers, and your refrigerators, all of that is in scope for LF EDGE. Every single one of these areas basically has a project that was donated by one of the founding companies. HomeEdge is from Samsung, which is not surprising because they're making all of these devices that you buy. Enterprise edge is us, ZEDEDA, and a few big enterprise companies like Dell, those types of guys. There's project Akraino that's dominated by telcos.

Interestingly enough, I have a friend of mine from Dell, Jason Shepherd, who keeps joking that this edge thing, it's very similar to how this country was settled. Because it feels we're now running away from the big hyperscale cloud providers, just like in the good old days people were running away for big businesses on the East Coast. The only place for us to actually build this exciting technology now is on the edge because everything else is dominated, and you have to join Google or Facebook to have a play in there. Go West, young man, go Edge.

These are the projects. I will be specifically talking about one of them, Edge Virtualization Engine. Check out the rest on the Linux Foundation website. I think you will find it very useful. Edge Virtualization Engine is what was donated by my company, ZEDEDA. We're actually working very closely with Fledge. Fledge is a middleware that runs on top of the project EVE. EVE stands for Edge Virtualization Engine.

Specifically, what requirements does EVE try to address? We basically approach looking at these boxes essentially from the ground up. We feel that we have to take control pretty much from the BIOS level up. I will talk about why that is important, because a lot of the technology that you would find at the BIOS and board management level in the data center simply doesn't exist on the edge. For those of you who know BMCs and iLOs, those things are not present on the edge for obvious reasons, because the control plane is not really to be had on the edge. Who are you going to talk to even if you have a BMC? Which creates an interesting challenge for how you can cut down on BIOS, and things like that. We feel that we need to start supporting hardware from the ground up. The hardware at the same time has to be zero touch. The experience of actually deploying the edge computing device should be as much similar to you buying a mobile device as possible. You get a device with an Android pre-installed. You turn it on, and you can run any applications that are compatible with an Android platform, so zero touch deployment.

We also feel that we need to run legacy applications. The legacy applications would include Windows XP. For Windows XP, you actually have to make sure that the application can access a floppy drive. That's a requirement. You also need to run real-time operating systems for control processes. You need to basically do hard partitioning of the hardware to guarantee the real-time SLAs on these applications. You need to build it at IoT scale, but what it really means is it needs to be at the same scale that all of the services that support your mobile devices operate at. What it means is that when you talk about edge computing, just building a service, a control plane in a single data center is not good enough, because your customers will be all over the place, sometimes even in Antarctica, or in the middle of the ocean. That also happens. You have to figure that one out. The platform has to be built with zero trust, absolutely zero trust, because we all know the stories of hacks that happened at uranium enrichment plant at Iranian facilities. The attack vector was very simple. It was a physical attack vector. Those things will keep happening unless we secure the platforms, and make them trustworthy as much as possible.

Finally, and that's where all of you come in, those platforms have to be made cloud native, in a sense that what APIs we give to developers to actually provide applications on top of them. Because if you look at the state of the industry today, and I already scared you at least a little bit with my Windows XP story, but Windows XP is actually a good story. The rest of the industry is still stuck in the embedded mindset. It's not a good embedded mindset. It's not like using Yocto or something. It's using some god-awful, embedded operating system that the company purchased 12, 15, 20 years ago, where people cannot even use modern GCC to compile the binary. That's the development experience in the edge and IoT today. I think it is only if we allow the same developers who built the cloud to actually develop for these platforms, it's only then that edge computing will actually take off. Because we are artificially restricting the number of innovative people that can come to the platform by not allowing the same tools that allowed us to make cloud as successful as it is today.

I talked a lot about various things that we plan to tackle. As developers, when I talk about cloud native, people tend to really just focus and assume app deployments. They're like, "Give me app deployments, and I'm done." The trouble is, app deployments, the way we think about them in a data center is just the tip of the iceberg on the edge. My favorite example that I give to everyone is, even if you assume virtualization, on the edge you basically have to solve the following problem. Suppose you decided on Docker containers, and now there is one Docker container that needs to drive a certain process, and another Docker container that needs to get a certain set of data. The process and the data happened to be connected to the single GPIO. This is a single physical device that basically has a pin out. Now you're in business of making sure that one container gets these two pins, and the other container gets those two pins. It's not something that would even come up as a problem in a data center. Because in a data center, all of your IO is basically restricted to networking, maybe a little bit of GPU. That's about it. Edge, is all about IO. All of that data that we're trying to get access to and unlock, that is the data that we can only access through a reasonable IO.

There are a lot of interesting plumbing challenges that need to be solved first before we can even start deploying our Docker containers. Docker containers are great. I think the thesis that we have at LF EDGE, at least within the project EVE, is basically very similar to what you would see in a data center, but with a certain set of specific details attached to it. We feel that the edge needs to be treated exactly like you treat your Kubernetes cluster. The physical nodes, like your pods, will be out there. There will be a controller sitting typically in the cloud, or it can sit on-prem, either one. All of these devices will basically talk to the controller just like your pods talk to the Kubernetes controller. Then somebody deploying the applications would talk to the controller, typically through a Kubernetes-like API. It is very much guaranteed to be a Kubernetes-like API. I think the API itself is great. That's very familiar to all of you. The question is, how do we build the layer that actually makes it all possible? That's where the project EVE comes in.

If I were to go through EVE's architecture, high level view, very quickly. It all starts with the hardware. Actually, it starts with the physical devices that you attach to the hardware. Then there needs to be some operating system that would allow you to do all of the above. That operating system needs to be open source. It needs to be Android of the edge type of an offering. That operating system will talk to the control plane. The control plane will sit in the cloud. On top of that offering of an operating system, you would be running your applications just like you do today in a data center, so a very typical, very familiar architecture.

Typically, your applications will talk to the big clouds in the sky from time to time, because that's where the data ends up anyway. You need to help them do that. Because a lot of times, people will talk to me and say, "I'm deploying my edge application today using Docker." I'm like, "That's great." They're like, "Now we need to make sure that the traffic flows into this particular Amazon VPC. How can we do that?" It just so happens that now you have to read a lot of documentation, because there's strongSwan involved, there's IPsec. It's not really configured by default. It's like, how can we actually connect the big cloud in the sky with this last cloud that we're building called edge computing? That has to come out of the box. These are essentially the requirements. That's the high-level architecture. I will deep dive into one specific component, which is EVE today.

State of the Edge: Exploring the Intersection of IoT, AI, 5G and Edge Computing

What we're trying to accomplish is, at the open-source layer, we need to standardize on two components. One is the runtime itself. The other one is the notion of an application. The application standard we're now trying to define is what we're calling edge containers. The runtime is project EVE. At the top you basically have catalogs, and you have control planes. That's where companies can innovate and monetize. I would expect a lot of big cloud providers to basically join LF EDGE and essentially start building their controller offerings. Just like Amazon today gives you a lot of managed services, that will be one of the services that they would give you.

Deep diving into project EVE. EVE today is based on the type-1 hypervisor, currently Xen. We actually just integrated patches for ACRN. ACRN is Intel's type-1 hypervisor. It's a pretty simple layered cake, very traditional virtualization architecture. I will explain why virtualization is involved. It's hardware, a hypervisor, then there's a bunch of microservices that are running on that hypervisor. Finally, you get to run your containers.

That is to say that we're building the very same architecture that Android had to build for the mobile. The biggest difference being that Android built it in 2003. They essentially answered the same questions that we're answering just in a different way, because those were different times. The hardware was different. The questions are still the same. The questions are, how can you do application and operating system sandboxing because you don't want your applications to affect the operating system and vice versa? How do you do application bundling? How do you do application deployment? What hardware do you support? We are answering it more closely to a traditional virtualization play. Android basically did it through the sandboxing on top of JVM, because it made sense at the time. At the end of the day, I think Android also had this idea in mind that mobile platforms will only be successful if we invite all of the developers to actually develop for them. At the time developing for mobile was painful. It was that type of an embedded development experience. It's god-awful compilers, tool chains from the '80s. One of the key pieces of innovation of Android was like, let's actually pick a language that everybody understands and can program in called Java. We're essentially doing the same, but we're saying, language nowadays doesn't matter because we have this technology called Docker container. Language can be anything. It's the same idea of opening it up to the biggest amount of people who can actually bring their workloads to the platform.

EVE happens to be a post-, post-modern operating system. When I say it like that, I've built a couple of operating systems. I used to work at Sun Microsystems for a long time. I've built a couple of those. I used to hack on Plan 9. I spent a bit of time doing that. All throughout my career, an operating system wanted to be a point of aggregation for anything that you do, hence packaging, shared libraries. An operating system wanted to be that point, that skeleton on which you hang everything. What happened a few years ago, with basically the help of virtualization and technologies like unikernels, is that we no longer view an operating system as that central aggregation point. An operating system these days is basically just enough operating system to run my Docker engine. I don't actually update my operating system, hence CoreOS. I don't really care about my operating system that much. I care about it running a certain type of workload. That's about it. That's what I mean by post-, post-modern operating system. It is an operating system in support of a certain type of workload. In the case of EVE, that workload happens to be an edge container.

Testing Challenges and Approaches in Edge Computing

Inside of EVE, there is a lot of moving parts. I will be talking about a few of those today. If you're interested, we actually have a really good documentation, which I'm proud of, because most of the open source projects lack that aspect of it. Go to our GitHub if you want to read some of the other stuff, so it's LF EDGE EVE, and click on the docs folder. There's the whole design and implementation of EVE that would be available to you. Let's quickly cover a few interesting bits and pieces. Here, I'm doing this hopefully to explain to you that what we're building is legit, but also maybe generate some interest so you can help us build it. If anything like that sounds interesting to you just talk to me after the presentation, we can figure out what pull request and GitHub issues I can assign to you.

EVE was inspired by a few operating systems that I had privilege to be associated with, one is Qubes OS. How many of you do know about Qubes OS? That's surprisingly few. You absolutely should check out Qubes OS. Qubes OS is the only operating system that Edward Snowden trusts. That's what he's running on his laptop, because that is the only one that he trusts. When he was escaping, his whole journey was Qubes OS that was running on his laptop. It's not perfect, but it's probably the best in terms of security thinking that I have seen in a long while.

Then there is Chrome OS. It's basically this idea that you can take an operating system and make it available on devices that you don't really manage. SmartOS was like Chrome OS or CoreOS, but derived from Solaris. EVE today is based on a type-1 hypervisor. People always ask me, why type-1? Why is KVM not allowed? The answer is simple. It's that requirement for the real-time workloads. Yes, patches for the real-time Linux kernel exist. They are really tricky. If you're talking about a pretty heterogeneous set of hardware, it's actually really tricky to maintain this single view of guaranteeing that your scheduler in the Linux kernel would really be real-time. We use type-1 hypervisors; Xen and ACRN are our choices today. We're running containers. We're running VMs. We're running unikernels. Basically, everything gets partitioned into its own domain by the hypervisor, but those domains can be super lightweight. With projects like Firecracker, that becomes faster and faster and pretty much indistinguishable from just starting a container.

DomU, basically where all of the microservices run, that is based on LinuxKit. LinuxKit is one of the most exciting projects in building specialized Linux-based distributions that I found in the last five years. It came out of Docker. It basically came out of Docker trying to build Docker Desktop. LinuxKit is how Docker manages that VM that happens to give you all of the Docker Desktop Services. It's also based on Alpine Linux. We get a lot of Alpine Linux dependencies.

We're driving towards a unikernel architecture. Every single instance of a service will be running in its own domain. All of our stuff is implemented in Go. One of the really interesting projects that we're looking at is called AtmanOS, which basically allows you to do this: you set GOOS to Xen, and you just do go build. AtmanOS figured out that you can create very little infrastructure to allow a binary to run without an operating system, because it so happens that Go is actually pretty good about sandboxing you. Go needs a few services from an operating system, like memory management and scheduling, and that's about it. All of those services are provided directly by the hypervisor. You can actually do go build with GOOS set to Xen and have a binary that's a unikernel.
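To illustrate the build flow described above, here is a trivial Go program of the kind that could be compiled this way. The build command in the comment follows the "GOOS equals Xen" description from the talk and should be read as an illustrative sketch of the AtmanOS approach, not as exact, verified toolchain instructions.

```go
// A minimal Go program of the kind that could be built as a unikernel with
// the AtmanOS toolchain mentioned above. The build line below is illustrative
// (per the talk's description), not authoritative:
//
//   GOOS=xen GOARCH=amd64 go build -o hello-unikernel .
//
// The program needs nothing beyond memory management and scheduling from the
// layer underneath, which is exactly the point being made.
package main

import "fmt"

func main() {
	fmt.Println("hello from a would-be unikernel on a type-1 hypervisor")
}
```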

Edge Computing Standardisation and Initiatives

Finally, we're actually trying to standardize edge containers, which is pretty exciting. We are trying to truly extend the OCI specification. There have been a few areas in the OCI that we're looking at. The image specification itself doesn't require much of a change. The biggest focus that we have is on registry support. We don't actually need a runtime specification, because OCI had this problem that it needed to integrate with other tools. Remember, classical operating systems were black-box execution engines. We don't need to integrate with anything but ourselves, hence a runtime specification is not really needed. The good news is that there are actually a lot of parallel efforts extending the OCI into supporting different types of containers. Two that I would mention are Kata Containers, which are more traditional OCI, but also Singularity Containers, which came more from HPC and give you access to hardware. Weaveworks is doing some of the same thing. Check them out. Obviously, Firecracker is pretty cool as a container execution environment that also gives you the isolation of a hypervisor.

The top three goals that we have for edge containers are, first, to allow you not only file-system-level composition, which is what a traditional container gives you. You can basically compose layers; a container happens to be a glorified tarball. You do everything at the level of the file system: you add this file, you remove that file. Second, we're also allowing you block-level composition. You can compose block-level devices, which then allows you to manage disks, VMs, unikernels, this and that. Third, we allow you hardware mapping. You can associate how the hardware maps to a given container, not at the runtime level, but at the container level itself.
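As a purely illustrative sketch of those three goals, the hypothetical Go types below model file-system layers, block-level composition, and hardware mapping in a single spec. The type and field names are invented for explanation; they are not the actual Edge Container specification being standardized in LF Edge.

```go
// Illustrative-only model of the three edge-container goals described above.
// These types are assumptions made for explanation, not the real spec.
package main

import "fmt"

type EdgeContainerSpec struct {
	// Traditional OCI-style file-system layers (the "glorified tarball").
	FSLayers []string
	// Block-level composition: disk images for VMs or unikernels.
	BlockDevices []BlockDevice
	// Hardware mapping declared on the container itself, e.g. GPIO pins.
	Hardware []HardwareClaim
}

type BlockDevice struct {
	Image  string // e.g. a disk image reference
	Target string // e.g. "vda"
}

type HardwareClaim struct {
	Kind string // "gpio", "serial", "usb", ...
	IDs  []int  // e.g. which GPIO pins this workload gets
}

func main() {
	spec := EdgeContainerSpec{
		FSLayers:     []string{"sha256:base", "sha256:app"},
		BlockDevices: []BlockDevice{{Image: "registry.example.com/legacy-xp.img", Target: "vda"}},
		Hardware:     []HardwareClaim{{Kind: "gpio", IDs: []int{17, 18}}},
	}
	fmt.Printf("%+v\n", spec)
}
```

Note how the hardware mapping field also captures the earlier GPIO example, where one workload gets two pins and another workload gets the other two.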

We still feel that the registry is the best thing that ever happened to Docker. The fact that you can produce a container is not interesting enough. The fact that you can share that container with everybody else, that is interesting. We feel that the registry basically has to take ownership of managing as many artifacts as possible, which seems to be the trajectory of OCI anyway. Things like Helm charts and all the other things that you need for orchestration, I would love for them to exist in the registry. Because that becomes my single choke point for any deployment that then happens throughout my enterprise.

EVE's networking is intent-based. You will find it very familiar from the networking architecture that exists in VMware or any virtualization product, with a couple of exceptions. One is the cloud network, where the intent is, literally, connect me to that cloud. I don't care how. I'm willing to give you my credentials, but I need my traffic to flow into the Google, the Amazon, or the Microsoft cloud. Just make it happen. The way we make it happen is that each container or each VM, because everything is virtualized, basically gets a virtualized NIC, a network interface card. What happens on the other side of that NIC? Think of it as one glorified sidecar, but instead of a sidecar that has to communicate through the operating system, we communicate through the hypervisor. Basically, the VM is none the wiser about what happens to the traffic. All of that is configured by the system, which allows us really interesting tricks, like networking that Windows XP into the Amazon cloud. Otherwise, it would be impossible. You can install IPsec on Windows XP, but it's super tricky. With a Windows XP instance that just communicates over the virtualized NIC while the traffic happens to flow through IPsec into the Amazon cloud, that Windows XP instance is none the wiser.
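A hedged sketch of what such a declarative "cloud network" intent could look like follows. The type, field names, and values are invented for illustration; EVE's real configuration model lives in its API definitions on GitHub. The point is that the workload declares what connectivity it needs, while the platform terminates the tunnel on the far side of the virtual NIC.

```go
// Invented-for-illustration model of a "cloud network" intent. The guest
// (even Windows XP) only ever sees a plain virtual NIC; the platform builds
// and terminates the IPsec/VPN plumbing behind it.
package main

import "fmt"

type NetworkIntent struct {
	Name        string
	Kind        string // "cloud", "mesh", "local"
	CloudTarget string // e.g. an AWS VPC or Azure VNet identifier
	Credentials string // reference to a secret held by the platform, never by the guest
}

func main() {
	intent := NetworkIntent{
		Name:        "scada-to-vpc",
		Kind:        "cloud",
		CloudTarget: "vpc-0abc1234",
		Credentials: "secret://aws/edge-site-42",
	}
	fmt.Printf("declare intent: %+v\n", intent)
}
```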

Another cool thing that we do networking-wise is a mesh network. It is basically based on a standard called LISP, RFC 6830. It allows you to have a flat IPv6 overlay namespace where anything can see anything else. That IPv6 address is a true overlay address: it doesn't change if you move the device. What it allows you to do is basically bypass all of the NAT boxes and all of the things that may be in between this edge device and that edge device, so that they can directly communicate with each other. Think about it as one gigantic Skype or peer-to-peer system that allows everything to basically have a service mesh that is based on IPv6 instead of some interesting service discovery. That's networking.

On trust, we're basically building everything through the root-of-trust that's rooted at the hardware element. On Intel most of the time it happens to be TPM. TPMs exist in pretty much every single system I've seen. Yet nobody but Microsoft seems to be using them. Why? Because the developer support still sucks. On Linux, it actually takes a lot of time to enable TPM and configure TPM. We're virtualizing the TPM. We use it internally, but then the applications, the edge containers get the virtualized view of the TPM. We also deal with a lot of the crap that exists today in the modern x86 based system. Because a lot of people don't realize it but there is a lot of processors and software that runs on your x86 system that you don't know about. Your operating system, even your hypervisor is not the only piece of software. We're trying to either disable it or make it manageable. Our management starts from the BIOS level up. Thanks to Qubes for pioneering this. Everything runs in its own domain. We're even disaggregating device drivers. If you have a device driver for Bluetooth and it gets compromised, since it's running in its own domain, that will not compromise the rest of the system. Stuff like that.

EVE's software update model is super easy for applications. It's your traditional cloud native deployment. You push to the cloud and the application happens to run. If you don't like it, you push the next version. You can do canary deployments. You can do all of the stuff that you expect to see from Kubernetes. EVE itself needs to be updated. That's where ideas from Chrome OS and CoreOS kick in. It's pretty similar to what happens on your cell phone. It's dual partitioned with multiple levels of fallback, lots of burn-in testing that we do. We're trying to avoid the need for physical contact with edge nodes as much as possible, which means that a lot of things that would have you press a key would have to be simulated by us. That's a whole tricky area of how to do that. That's something that we also do in EVE. We are really big fans of the open-source BIOS reimplementation from coreboot, and especially u-root on top of coreboot. That allows us to basically have a complete open-source stack on everything from the BIOS level up.
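The dual-partition update idea can be illustrated with a toy sketch: boot the freshly written image, and only keep it if it proves healthy; otherwise fall back to the last known good one. The slot names and the health flag below are invented for illustration and are not EVE's actual update logic.

```go
// Toy illustration of an A/B ("dual partition") update scheme with fallback.
// Slot names and the health check are invented for this sketch.
package main

import "fmt"

type slot struct {
	name    string
	healthy bool // set only after the new image passes burn-in testing
}

// pickBootSlot prefers the freshly updated slot but falls back to the
// last-known-good slot if the new image never reported itself healthy.
func pickBootSlot(lastGood, updated slot) slot {
	if updated.healthy {
		return updated
	}
	return lastGood
}

func main() {
	a := slot{name: "partition-A", healthy: true}  // last known good image
	b := slot{name: "partition-B", healthy: false} // just written, burn-in failed
	fmt.Println("booting from", pickBootSlot(a, b).name)
}
```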

The most interesting work that we're doing with TPM, and I have to plug it because I get excited about every single time, we're trying to basically do a hardware-protected vTPM, something that hasn't been done before, even in the data center. There's a group of us who is doing it, if you're interested you can contact any one of us. TrenchBoot is the name of the project. There's Dave Smith and LF EDGE in general.

EVE itself is actually super easy to develop. That's the demo that I wanted to give, because it's not a QCon without a demo. EVE is based on LinuxKit. There is a little makefile infrastructure that allows you to do all of the traditional operating system developer things. Basically, typing make run would allow you to manage the operating system, run the operating system. The only reason I'm mentioning this is because people get afraid a lot of times if I talk about operating system development and design, because there's a little bit of a stigma. It's like, "I need a real device. I need some JTAG connector. I need a serial port to debug it." No, with EVE, you can actually debug it all in the comfort of your terminal window on macOS.

The entire build system is Docker based. Basically, all of the artifacts in EVE get packaged as Docker containers. It's actually super easy to develop within a single artifact. Because we're developing edge containers in parallel, we are planning to start using that for the unikernel development as well, which might, interestingly enough, bifurcate and be its own project. Because, I think when it comes to unikernels, developers still don't really have the tools. There's a few available like UniK and a few others. There's not, really, that same level of usefulness of the tools that Docker Desktop just gives me. We're looking into that as well.

Edge computing today is where public cloud was in '06. Sorry, I cannot give you ready-made tools, but I can invite you to actually build the tools with me and us at Linux Foundation. Edge computing is one final cloud that's left. I think it's the cloud that will never ever be taken away from us. By us, I mean people who actually run the actual physical hardware. Because you could tell, I'm an infrastructure guy. It sucked when people stopped buying servers and operating systems, and now everything just moved to the cloud. My refuge is edge. Edge computing is a huge total addressable market. As a founder of a startup company, I can assure you that there is tremendous amount of VC activity in the space. It's a good place to be if you're trying to build a company. Kubernetes as an implementation is dead, but long live Kubernetes as an API. That stays with us. Edge computing is a lot of fun. Just help us build either EVE, a super exciting project, or there are a few projects to pick in the LF EDGE in general.

Participant 1: We see the clouds, AWS, Azure, and all that. Is there L2 connectivity? Are you using, for example, the AWS Direct Connect APIs, and for Azure, ExpressRoute? That's what you're doing?

Shaposhnik: Yes, exactly.

Participant 1: I belong to Aconex and we are delving into a similar thing, we already allow people to connect to the cloud. We'll look deeper into this.

Shaposhnik: Absolutely. That's exactly right. That's why I'm saying it's a different approach to running an operating system because I see a lot of companies trying to still integrate with Linux, which is great. There is a lot of business in that. What we're saying is Linux itself doesn't matter anymore. It's the Docker container that matters. We're extending it into the edge container. Docker container is an edge container. It almost doesn't matter what an operating system is. We're replacing all layers of it with this very built for purpose engine. While it's still absolutely a valid approach to still say, "I need Yocto," or some traditional Linux distribution that integrates with that. I think my only call to action would be, let's build tools that would be applicable in both scenarios. That way we can help each other grow.

Participant 2: I know in your presentation you mentioned that edge is going to be more diverse. What's your opinion on cloud providers extending to the edge through projects like Azure Sphere and Azure IoT Edge?

Shaposhnik: They will be doing it, no question about it. I think they will come from the cloud side. Remember that long range of what's edge and what's not edge. They will basically start addressing the issues at the CO, the central office. They will start addressing the issues at the maybe Mac access points. I don't see them completely flipping and basically running on the deep edge. The reason for that is, business-wise, they're not set up to do that. The only company that I see that potentially can do that is Microsoft. Because if you want to run on the deep edge, you need to develop and foster your ecosystem, the same way that Microsoft developed and fostered the ecosystem that made every single PC run Windows. Amazon and ecosystem don't go together in the same sentence. Google is just confused. If anybody tackles it, that would be Microsoft, but they are distracted by so much of a low-hanging fruit in front of them just moving their traditional customers into the cloud, that I just don't see them as applying effort in that space. It may happen in five years, but for now, running this company, at least I don't see any of that happening.

Participant 3: What about drivers for sensors on these edge devices? It seems EVE abstracts the OS away from you, but in industrial, for instance, you need to detect things, so you need peripherals.

Shaposhnik: Correct. What about drivers? Because it's a hypervisor based architecture, we can just assign the hardware directly to you. If you want to have that Windows XP based VM drive your hardware, we can do that. That's not interesting, because we need software abstractions that will make it easier for developers to basically not think about it. That is the work that is a very nascent chunk of work. How do you provide software abstractions for a lot of things that we took for granted, like there's a file in /dev someplace, and I do something with it through Yocto. Now we're flipping it back and saying, "If I'm running a Docker container, what would be the most natural abstraction to a particular hardware resource?" A lot of times, surprisingly, to me, that abstraction happens to be a network socket. We can manage the driver on the other side of the hypervisor. Again, we will still run the driver in its own domain. To all of the containers that want to use it, we will basically present a nice software abstraction such as network socket.

More Information:

https://www.infoq.com/presentations/linux-eve/

https://landscape.lfedge.org/card-mode?license=apache-license-2-0

https://www.lfedge.org/resources/publications/

https://www.lfedge.org/#

https://www.lfedge.org/news-events/blog/

https://www.lfedge.org/2021/03/12/state-of-the-edge-2021-report/

https://www.linux.com/topic/networking/project-eve-promotes-cloud-native-approach-edge-computing/

https://zededa.com/product/

https://thenewstack.io/how-the-linux-foundations-eve-can-replace-windows-linux-for-edge-computing/

Tesla Dojo and Hydranet and AI and Deep Learning with New Super Computer Dojo and D1 Chip


Tesla Dojo and Hydranet and AI and Deep Learning

Tesla Has Done Something No Other Automaker Has: Assumed The Mantle Of Moore’s Law

Steve Jurvetson shared on Twitter that Tesla now holds the mantle of Moore’s law in the same manner NVIDIA took leadership from Intel a decade ago. He noted that the substrates have shifted several times, but humanity’s capacity to compute has compounded for 122 years. He shared a log scale with details.

https://www.flickr.com/photos/jurvetson/51391518506/

The link Jurvetson shared included a detailed article explaining how Tesla holds the mantle of Moore’s Law. Tesla introduced its D1 chip for the Dojo supercomputer, and he said:


“This should not be a surprise, as Intel ceded leadership to NVIDIA a decade ago, and further handoffs were inevitable. The computational frontier has shifted across many technology substrates over the past 120 years, most recently from the CPU to the GPU to ASICs optimized for neural networks (the majority of new compute cycles).”


“Of all of the depictions of Moore’s Law, this is the one I find to be most useful, as it captures what customers actually value — computation per $ spent (note: on a log scale, so a straight line is an exponential; each y-axis tick is 100x).”

“Humanity’s capacity to compute has compounded for as long as we can measure it, exogenous to the economy, and starting long before Intel co-founder Gordon Moore noticed a refraction of the longer-term trend in the belly of the fledgling semiconductor industry in 1965.”

Project Dojo: Check out Tesla Bot AI chip! (full presentation)


“In the modern era of accelerating change, it is hard to find even five-year trends with any predictive value, let alone trends that span the centuries. I would go further and assert that this is the most important graph ever conceived (my earlier blog post on its origins and importance).”

“Why the transition within the integrated circuit era? Intel lost to NVIDIA for neural networks because the fine-grained parallel compute architecture of a GPU maps better to the needs of deep learning. There is a poetic beauty to the computational similarity of a processor optimized for graphics processing and the computational needs of a sensory cortex, as commonly seen in neural networks today. A custom chip (like the Tesla D1 ASIC) optimized for neural networks extends that trend to its inevitable future in the digital domain. Further advances are possible in analog in-memory compute, an even closer biomimicry of the human cortex. The best business planning assumption is that Moore’s Law, as depicted here, will continue for the next 20 years as it has for the past 120.”

In the detailed description of the chart, Jurvetson pointed out that in the popular perception of Moore’s Law, computer chips compound in their complexity at near-constant per-unit cost. He explained that this is just one of many abstractions of the law: Moore’s Law is both a prediction and an abstraction, and this particular abstraction relates to the compounding of transistor density in two dimensions. Other abstractions relate to speed or computational power.

He also added:

“What Moore observed in the belly of the early IC industry was a derivative metric, a refracted signal, from a longer-term trend, a trend that begs various philosophical questions and predicts mind-bending futures.”

“Ray Kurzweil’s abstraction of Moore’s Law shows computational power on a logarithmic scale, and finds a double exponential curve that holds over 120 years! A straight line would represent a geometrically compounding curve of progress.”



He explained that, through five paradigm shifts, the computation power that $1,000 buys has doubled every two years. And it has been doubling every year for the past 30 years. In this graph, he explained that each dot represented a frontier of the computational price performance of the day. He gave these examples: one machine used in the 1890 Census, one cracked the Nazi Enigma cipher in WW2, and one predicted Eisenhower’s win in the 1956 presidential election.

He also pointed out that each dot represents a human drama, and that before Moore’s first paper in 1965, none of them realized they were on a predictive curve. The dots represent attempts to build the best computer with the tools of the day, he explained, and those computers are in turn used to make better design software and manufacturing control algorithms.

“Notice that the pace of innovation is exogenous to the economy. The Great Depression and the World Wars and various recessions do not introduce a meaningful change in the long-term trajectory of Moore’s Law. Certainly, the adoption rates, revenue, profits, and economic fates of the computer companies behind the various dots on the graph may go through wild oscillations, but the long-term trend emerges nevertheless.”

Tesla now holds the mantle of Moore’s Law, with the D1 chip introduced last night for the DOJO supercomputer (video, news summary).

Tesla’s BREAKTHROUGH DOJO Supercomputer Hardware Explained

This should not be a surprise, as Intel ceded leadership to NVIDIA a decade ago, and further handoffs were inevitable. The computational frontier has shifted across many technology substrates over the past 120 years, most recently from the CPU to the GPU to ASICs optimized for neural networks (the majority of new compute cycles). The ASIC approach is being pursued by scores of new companies, and Google TPUs have now been added to the chart by popular request (see note below for methodology), as well as the Mythic analog M.2.

By taking on the mantle of Moore’s Law, Tesla is achieving something that no other automaker has achieved. I used the term “automaker” since Tesla is often referred to as such by the media, friends, family, and those who don’t really follow the company closely. Tesla started out as an automaker and that’s what people remember most about it: “a car for rich people,” one of my close friends told me. (She was shocked when I told her how much a Model 3 cost. She thought it was over $100K for the base model.)

Jurvetson’s post is very technical, but it reflects the truth: Tesla has done something unique for the auto industry. Tesla has progressed an industry that was outdated and challenged the legacy OEMs to evolve. This is a hard thing for them to do, as there hasn’t been any new revolutionary technology introduced to this industry since Henry Ford moved humanity from the horse and buggy to automobiles.

Sure, over the years, designs of vehicles changed along with pricing, specs, and other details, but until Tesla, none of these changes affected the industry largely as a whole. None of these changes made the industry so uncomfortable that it laughed at the idea before later getting scared of being left behind. The only company to have done this is Tesla, and now new companies are trying to be the next Tesla or create competing cars — and do whatever they can to keep up with Tesla’s lead.

Teaching a Car to Drive Itself by Imitation and Imagination (Google I/O'19)

For the auto industry, Tesla represents a jump in evolution, and not many people understand this. I think most automakers have figured this out, though. Ford and VW especially.

Of all of the depictions of Moore’s Law, this is the one I find to be most useful, as it captures what customers actually value — computation per $ spent (note: on a log scale, so a straight line is an exponential; each y-axis tick is 100x).

Humanity’s capacity to compute has compounded for as long as we can measure it, exogenous to the economy, and starting long before Intel co-founder Gordon Moore noticed a refraction of the longer-term trend in the belly of the fledgling semiconductor industry in 1965.

Why the transition within the integrated circuit era? Intel lost to NVIDIA for neural networks because the fine-grained parallel compute architecture of a GPU maps better to the needs of deep learning. There is a poetic beauty to the computational similarity of a processor optimized for graphics processing and the computational needs of a sensory cortex, as commonly seen in neural networks today. A custom chip (like the Tesla D1 ASIC) optimized for neural networks extends that trend to its inevitable future in the digital domain. Further advances are possible in analog in-memory compute, an even closer biomimicry of the human cortex. The best business planning assumption is that Moore’s Law, as depicted here, will continue for the next 20 years as it has for the past 120.

For those unfamiliar with this chart, here is a more detailed description:

Moore's Law is both a prediction and an abstraction

Moore’s Law is commonly reported as a doubling of transistor density every 18 months. But this is not something the co-founder of Intel, Gordon Moore, has ever said. It is a nice blending of his two predictions; in 1965, he predicted an annual doubling of transistor counts in the most cost effective chip and revised it in 1975 to every 24 months. With a little hand waving, most reports attribute 18 months to Moore’s Law, but there is quite a bit of variability. The popular perception of Moore’s Law is that computer chips are compounding in their complexity at near constant per unit cost. This is one of the many abstractions of Moore’s Law, and it relates to the compounding of transistor density in two dimensions. Others relate to speed (the signals have less distance to travel) and computational power (speed x density).

Unless you work for a chip company and focus on fab-yield optimization, you do not care about transistor counts. Integrated circuit customers do not buy transistors. Consumers of technology purchase computational speed and data storage density. When recast in these terms, Moore’s Law is no longer a transistor-centric metric, and this abstraction allows for longer-term analysis.

Tesla’s MIND BLOWING Dojo AI Chip (changes everything)

What Moore observed in the belly of the early IC industry was a derivative metric, a refracted signal, from a longer-term trend, a trend that begs various philosophical questions and predicts mind-bending futures.

Ray Kurzweil’s abstraction of Moore’s Law shows computational power on a logarithmic scale, and finds a double exponential curve that holds over 120 years! A straight line would represent a geometrically compounding curve of progress. 

Through five paradigm shifts – such as electro-mechanical calculators and vacuum tube computers – the computational power that $1000 buys has doubled every two years. For the past 35 years, it has been doubling every year. 

Each dot is the frontier of computational price performance of the day. One machine was used in the 1890 Census; one cracked the Nazi Enigma cipher in World War II; one predicted Eisenhower’s win in the 1956 Presidential election. Many of them can be seen in the Computer History Museum. 

Each dot represents a human drama. Prior to Moore’s first paper in 1965, none of them even knew they were on a predictive curve. Each dot represents an attempt to build the best computer with the tools of the day. Of course, we use these computers to make better design software and manufacturing control algorithms. And so the progress continues.

Notice that the pace of innovation is exogenous to the economy. The Great Depression and the World Wars and various recessions do not introduce a meaningful change in the long-term trajectory of Moore’s Law. Certainly, the adoption rates, revenue, profits and economic fates of the computer companies behind the various dots on the graph may go through wild oscillations, but the long-term trend emerges nevertheless.

Any one technology, such as the CMOS transistor, follows an elongated S-shaped curve of slow progress during initial development, upward progress during a rapid adoption phase, and then slower growth from market saturation over time. But a more generalized capability, such as computation, storage, or bandwidth, tends to follow a pure exponential – bridging across a variety of technologies and their cascade of S-curves.

In the modern era of accelerating change in the tech industry, it is hard to find even five-year trends with any predictive value, let alone trends that span the centuries. I would go further and assert that this is the most important graph ever conceived.

Why is this the most important graph in human history?

A large and growing set of industries depends on continued exponential cost declines in computational power and storage density. Moore’s Law drives electronics, communications and computers and has become a primary driver in drug discovery, biotech and bioinformatics, medical imaging and diagnostics. As Moore’s Law crosses critical thresholds, a formerly lab science of trial and error experimentation becomes a simulation science, and the pace of progress accelerates dramatically, creating opportunities for new entrants in new industries. Boeing used to rely on the wind tunnels to test novel aircraft design performance. Ever since CFD modeling became powerful enough, design moves to the rapid pace of iterative simulations, and the nearby wind tunnels of NASA Ames lie fallow. The engineer can iterate at a rapid rate while simply sitting at their desk.

Tesla unveils "Dojo" Computer Chip | Tesla AI Day 

Every industry on our planet is going to become an information business. Consider agriculture. If you ask a farmer in 20 years’ time about how they compete, it will depend on how they use information, from satellite imagery driving robotic field optimization to the code in their seeds. It will have nothing to do with workmanship or labor. That will eventually percolate through every industry as IT innervates the economy.

Non-linear shifts in the marketplace are also essential for entrepreneurship and meaningful change. Technology’s exponential pace of progress has been the primary juggernaut of perpetual market disruption, spawning wave after wave of opportunities for new companies. Without disruption, entrepreneurs would not exist.

Moore’s Law is not just exogenous to the economy; it is why we have economic growth and an accelerating pace of progress. At Future Ventures, we see that in the growing diversity and global impact of the entrepreneurial ideas that we see each year. The industries impacted by the current wave of tech entrepreneurs are more diverse, and an order of magnitude larger than those of the 90’s — from automobiles and aerospace to energy and chemicals.

At the cutting edge of computational capture is biology; we are actively reengineering the information systems of biology and creating synthetic microbes whose DNA is manufactured from bare computer code and an organic chemistry printer. But what to build? So far, we largely copy large tracts of code from nature. But the question spans across all the complex systems that we might wish to build, from cities to designer microbes, to computer intelligence.

Reengineering engineering

As these systems transcend human comprehension, we will shift from traditional engineering to evolutionary algorithms and iterative learning algorithms like deep learning and machine learning. As we design for evolvability, the locus of learning shifts from the artifacts themselves to the process that created them. There is no mathematical shortcut for the decomposition of a neural network or genetic program, no way to "reverse evolve" with the ease that we can reverse engineer the artifacts of purposeful design. The beauty of compounding iterative algorithms (evolution, fractals, organic growth, art) derives from their irreducibility. And it empowers us to design complex systems that exceed human understanding.

Tesla AI Day

Why does progress perpetually accelerate?

All new technologies are combinations of technologies that already exist. Innovation does not occur in a vacuum; it is a combination of ideas from before. In any academic field, the advances today are built on a large edifice of history. This is why major innovations tend to be 'ripe' and tend to be discovered at nearly the same time by multiple people. The compounding of ideas is the foundation of progress, something that was not so evident to the casual observer before the age of science. Science tuned the process parameters for innovation, and became the best method for a culture to learn.

From this conceptual base comes the origin of economic growth and accelerating technological change, as the combinatorial explosion of possible idea pairings grows exponentially as new ideas come into the mix (on the order of 2^n possible groupings per Reed’s Law). It explains the innovative power of urbanization and networked globalization. And it explains why interdisciplinary ideas are so powerfully disruptive; it is like the differential immunity of epidemiology, whereby islands of cognitive isolation (e.g., academic disciplines) are vulnerable to disruptive memes hopping across, much like South America was to smallpox from Cortés and the Conquistadors. If disruption is what you seek, cognitive island-hopping is a good place to start, mining the interstices between academic disciplines.

Predicting cut-ins (Andrej Karpathy)

It is the combinatorial explosion of possible innovation-pairings that creates economic growth, and it’s about to go into overdrive. In recent years, we have begun to see the global innovation effects of a new factor: the internet. People can exchange ideas like never before. Long ago, people were not communicating across continents; ideas were partitioned, and so the success of nations and regions pivoted on their own innovations. Richard Dawkins states that in biology it is genes that really matter, and we as people are just vessels for the conveyance of genes. It’s the same with ideas, or “memes”. We are the vessels that hold and communicate ideas, and now that pool of ideas percolates on a global basis more rapidly than ever before.

In the next six years, three billion minds will come online for the first time to join this global conversation (via inexpensive smartphones in the developing world). This rapid influx of three billion people into the global economy is unprecedented in human history, and so too will be the pace of idea-pairings and progress.

We live in interesting times, at the cusp of the frontiers of the unknown and breathtaking advances. But, it should always feel that way, engendering a perpetual sense of future shock.


Is the ‘D1’ AI chip speeding Tesla towards full autonomy?

The company has designed a super-powerful and efficient chip for self-driving, but it can be used for many other things

Tesla, at its AI Day, unveiled a custom chip for training artificial intelligence networks in data centers

The D1 chip, part of Tesla’s Dojo supercomputer system, uses a 7nm manufacturing process and delivers 362 teraflops of processing power

The chips can help train models to recognize items from camera feeds inside Tesla vehicles

Will the just-announced Tesla Bot make future working optional for humans - or obsolete?

Elon Musk says Tesla robot will make physical work a ‘choice’

Back at the Tesla 2019 Autonomy Day, CEO Elon Musk unveiled its first custom artificial intelligence (AI) chip, which promised to propel the company toward its goal of full autonomy. The automaker then started producing cars with its custom AI within the same year. This year, as the world grapples with a chip shortage conundrum, the company presented its in-house D1 chip — the processor that will power its Dojo supercomputer.

Tesla's Dojo Supercomputer, Full Analysis (Part 1/2)

Tesla's Dojo Supercomputer, Full Analysis (Part 2/2)


The D1 is the second semiconductor designed internally by Tesla, following the in-car supercomputer released in 2019. According to Tesla Official, each D1 packs 362 teraflops (TFLOPs) of processing power, meaning it can perform 362 trillion floating-point operations per second. 

Tesla combines 25 chips into a training tile and links 120 training tiles together across several server cabinets. In simple terms, each training tile clocks in at 9 petaflops, meaning Dojo will boast over 1 exaflop of computing power. In other words, Dojo can easily be the most powerful AI training machine in the world.
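Using only the figures quoted above (362 TFLOPs per D1, 25 chips per training tile, 120 tiles per system), a quick back-of-the-envelope check in Python reproduces the 9-petaflop tile and the just-over-one-exaflop system:

# Back-of-the-envelope check of the Dojo figures quoted in the article.
D1_TFLOPS = 362            # BF16/CFP8 TFLOPs per D1 chip (Tesla's figure)
CHIPS_PER_TILE = 25        # D1 chips per training tile
TILES_PER_SYSTEM = 120     # training tiles in the first Dojo build

tile_pflops = D1_TFLOPS * CHIPS_PER_TILE / 1_000          # TFLOPs -> PFLOPs
system_eflops = tile_pflops * TILES_PER_SYSTEM / 1_000    # PFLOPs -> EFLOPs

print(f"Per tile:   {tile_pflops:.2f} PFLOPs")    # ~9.05 PFLOPs
print(f"Per system: {system_eflops:.2f} EFLOPs")  # ~1.09 EFLOPs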

The company believes that AI has limitless possibilities and that its systems are getting smarter than the average human. Tesla announced that, to speed up its AI software workloads, it will rely on the D1 Dojo, a custom application-specific integrated circuit (ASIC) for AI training that the company presented during this year’s AI Day, held last week.

Although many companies, including tech giants like Amazon, Baidu, Intel, and NVIDIA, are building ASICs for AI workloads, not everyone has the right formula or satisfies each workload perfectly. Experts reckon this is the reason why Tesla opted to develop its own ASIC for AI training purposes.

Tesla and its foray into AI

The chip, called the D1, forms part of the Dojo supercomputer used to train AI models inside Tesla’s headquarters. It is manufactured by Taiwan’s TSMC on a 7nm semiconductor node. The chip is reportedly packed with over 50 billion transistors and boasts a large die size of 645 mm².

Now, with the introduction of an exascale supercomputer which management says will be operational next year, Tesla has reinforced that advantage. Since AI training requires two things: massive amounts of data, and a powerful supercomputer that can use that data to train deep neural nets, Tesla has the added advantage. With over one million autopilot-enabled EVs on the road, Tesla already has a vast dataset edge over other automakers. 

All this work comes two years after Tesla began producing vehicles containing AI chips it built in-house. Those chips help the car’s onboard software make decisions very quickly in response to what’s happening on the road. This time, Musk noted that its latest supercomputer tech can be used for many other things and that Tesla is willing to open it up to other automakers and tech companies who are interested.


“At first it seems improbable — how could it be that Tesla, who has never designed a chip before, would design the best chip in the world? But that is objectively what has occurred. Not best by a small margin, best by a huge margin. It’s in the cars right now,” Musk said. With that, his newest big prediction is that Tesla will have self-driving cars on the road next year — without humans inside — operating in a so-called robo-taxi fleet.


Tesla introduced the Tesla D1, a new chip designed specifically for artificial intelligence that is capable of delivering a power of 362 TFLOPs in BF16 / CFP8. This was announced at Tesla’s recent AI Day event.

The Tesla D1 packs a total of 354 training nodes that form a network of functional units, which are interconnected to create a massive chip. Each functional unit comes with a quad-core, 64-bit ISA CPU that uses a specialized, custom design for transpositions, compilations, broadcasts, and link traversal. This CPU adopts a superscalar implementation (4-wide scalar and 2-wide vector pipelines).

This new Tesla silicon is manufactured on a 7nm process, has a total of 50 billion transistors, and occupies an area of 645 mm², which means it is smaller than the GA100 GPU used in the NVIDIA A100 accelerator, which measures 826 mm².

Each functional unit has 1.25 MB of SRAM and 512 GB/s of bandwidth in each direction on the unit network. The D1 chips are in turn joined in multichip configurations of 25 units, the training tiles described above, which interface with the host system through what Tesla calls "Dojo Interface Processors" (DIPs).
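Scaling those per-node figures up gives a feel for the totals on one die. The short calculation below uses only the numbers quoted above (354 nodes, 1.25 MB of SRAM and 512 GB/s per node); it is a rough estimate, not an official Tesla specification.

# Rough on-die totals implied by the per-node figures quoted above.
NODES_PER_D1 = 354        # training nodes / functional units per D1
SRAM_MB_PER_NODE = 1.25   # MB of SRAM per node
BW_GBS_PER_NODE = 512     # GB/s of on-chip bandwidth per node (per direction)

total_sram_mb = NODES_PER_D1 * SRAM_MB_PER_NODE
total_bw_tbs = NODES_PER_D1 * BW_GBS_PER_NODE / 1_000

print(f"Total SRAM on die:        ~{total_sram_mb:.0f} MB")   # ~443 MB
print(f"Aggregate node bandwidth: ~{total_bw_tbs:.0f} TB/s")  # ~181 TB/s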



Tesla claims its Dojo chip will process computer vision data four times faster than existing systems, enabling the company to bring its self-driving system to full autonomy. However, the two most difficult technological feats have not been accomplished by Tesla yet: the tile-to-tile interconnect and the software. Each tile has more external bandwidth than the highest-end networking switches, and to achieve this, Tesla developed custom interconnects. Tesla says the first Dojo cluster will be running by next year.

The same technology that undergirds Tesla’s cars will drive the forthcoming Tesla Bot, which is intended to perform mundane tasks like grocery shopping or assembly-line work. Its design spec calls for 45-pound carrying capacity, “human-level hands,” and a top speed of 5 miles per hour (so humans can outrun it).

IBM’s Telum processor is the company’s latest chip and a competitor to the Tesla D1. Telum is IBM’s first commercial processor to contain on-chip acceleration for AI inferencing, allowing clients to run deep-learning inference at scale. IBM claims that the on-chip acceleration empowers the system to conduct inference at great speed.

IBM’s Telum is aimed primarily at fraud detection during the early stages of transaction processing, while Tesla’s Dojo is mainly intended for computer vision for self-driving cars using cameras. And while Telum is a conventional silicon chip, Dojo has gone against industry norms: its chips are designed to connect to one another without any glue logic.

The most powerful supercomputer in the world, Fugaku, lives at the RIKEN Center for Computational Science in Japan. At its tested limit it is capable of 442,010 teraflops, and theoretically it could perform up to 537,212 teraflops. Dojo, Tesla said, could end up breaking the exaflop barrier, something that no supercomputing company, university, or government has been able to do.

Tesla unveils "Dojo" Computer Chip | Tesla AI Day

Dojo is made up of a mere 10 cabinets and is thus also the smallest supercomputer in the world when it comes to size. Fugaku on the other hand is made up of 256 cabinets. If Tesla was to add 54 cabinets to Dojo V1 for a total of 64 cabinets, Dojo would surpass Fugaku in computer performance.
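A rough per-cabinet comparison can be made from the figures quoted above, keeping in mind that Fugaku's number is a measured benchmark result while Dojo's is a projected peak, so this is an illustration rather than a like-for-like benchmark:

# Rough per-cabinet comparison using only the figures quoted in the article.
# Fugaku's figure is a measured result; Dojo's is a projected peak.
DOJO_EFLOPS, DOJO_CABINETS = 1.1, 10
FUGAKU_PFLOPS, FUGAKU_CABINETS = 442_010 / 1_000, 256   # 442,010 TFLOPs tested

dojo_per_cabinet = DOJO_EFLOPS * 1_000 / DOJO_CABINETS   # PFLOPs per cabinet
fugaku_per_cabinet = FUGAKU_PFLOPS / FUGAKU_CABINETS     # PFLOPs per cabinet

print(f"Dojo:   ~{dojo_per_cabinet:.0f} PFLOPs per cabinet")    # ~110
print(f"Fugaku: ~{fugaku_per_cabinet:.1f} PFLOPs per cabinet")  # ~1.7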

All along, Tesla seemed positioned to gain an edge in artificial intelligence. Sure, Elon Musk’s Neuralink — along with SpaceX and The Boring Company — are separately held companies from Tesla, but certainly seepage among the companies occurs. So, at the Tesla AI event last month, when the company announced it would be designing its own silicon chips, more than ever it seemed Tesla had an advantage.

The AI event culminated with a dancing human posing as a humanoid robot, previewing the Tesla Bot the company intends to build. But the more immediate and important reveal was the custom AI chip “D1,” which would be used for training the machine-learning algorithm behind Tesla’s Autopilot self-driving system. Tesla has a keen focus on this technology, with a single giant neural network known as a “transformer” receiving input from 8 cameras at once.

“We are effectively building a synthetic animal from the ground up,” Tesla’s AI chief, Andrej Karpathy, said during the August 2021 event. “The car can be thought of as an animal. It moves around autonomously, senses the environment, and acts autonomously.”

CleanTechnica‘s Johnna Crider, who attended the AI event, shared that, “At the very beginning of the event, Tesla CEO Musk said that Tesla is much more than an electric car company, and that it has ‘deep AI activity in hardware on the inference level and on the training level.’” She concluded that, “by unveiling the Dojo supercomputer plans and getting into the details of how it is solving computer vision problems, Tesla showed the world another side to its identity.”

Tesla’s Foray into Silicon Chips

Tesla is the latest nontraditional chipmaker, as described in a recent Wired analysis. Intel Corporation is the world’s largest semiconductor chip maker, based on its 2020 sales. It is the inventor of the x86 series of microprocessors found in most personal computers today. Yet, as AI gains prominence and silicon chips become essential ingredients in technology-integrated manufacturing, many others, including Google, Amazon, and Microsoft, are now designing their own chips.

Tesla FSD chip explained! Tesla vs Nvidia vs Intel chips

For Tesla, the key to silicon chip success will be deriving optimal performance out of the computer system used to train the company’s neural network. “If it takes a couple of days for a model to train versus a couple of hours,” CEO Elon Musk said at the AI event, “it’s a big deal.”

Initially, Tesla relied on Nvidia hardware for its silicon chips. That changed in 2019, when Tesla turned in-house to design chips that interpret sensor input in its cars. However, manufacturing the chips needed to train AI algorithms — moving the creative process from vision to execution — is quite a sophisticated, costly, and demanding endeavor.

The D1 chip, part of Tesla’s Dojo supercomputer system, uses a 7-nanometer manufacturing process, with 362 teraflops of processing power, said Ganesh Venkataramanan, senior director of Autopilot hardware. Tesla places 25 of these chips onto a single “training tile,” and 120 of these tiles come together across several server cabinets, amounting to over an exaflop of power. “We are assembling our first cabinets pretty soon,” Venkataramanan disclosed.

CleanTechnica‘s Chanan Bos deconstructed the D1 chip in detail in a series of articles (in case you missed them) and related that, per its specifications, the D1 chip has 50 billion transistors. When it comes to processors, that beats the current record held by AMD’s Epyc Rome chip of 39.54 billion transistors.


Tesla says on its website that the company believes “that an approach based on advanced AI for vision and planning, supported by efficient use of inference hardware, is the only way to achieve a general solution for full self-driving and beyond.” To do so, the company will:

Build silicon chips that power the full self-driving software from the ground up, taking every small architectural and micro-architectural improvement into account while pushing hard to squeeze maximum silicon performance-per-watt;

Perform floor-planning, timing, and power analyses on the design;

Write robust, randomized tests and scoreboards to verify functionality and performance;

Implement compilers and drivers to program and communicate with the chip, with a strong focus on performance optimization and power savings; and,

Validate the silicon chip and bring it to mass production.

“We should have Dojo operational next year,” CEO Elon Musk affirmed.

Keynote - Andrej Karpathy, Tesla


The Tesla Neural Network & Data Training

Tesla’s approach to full self-driving is grounded in its neural network. Most companies that are developing self-driving technology look to lidar, which is an acronym for “Light Detection and Ranging.” It’s a remote sensing method that uses light in the form of a pulsed laser to measure ranges — i.e., variable distances — to the Earth. These light pulses are combined with other data recorded by the airborne system to generate precise, 3-dimensional information about the shape of the Earth and its surface characteristics.

PyTorch at Tesla - Andrej Karpathy, Tesla

Tesla, however, rejected lidar, partially due to its expensive cost and the amount of technology required per vehicle. Instead, it interprets scenes by using the neural network algorithm to dissect input from its cameras and radar. Chris Gerdes, director of the Center for Automotive Research at Stanford, says this approach is “computationally formidable. The algorithm has to reconstruct a map of its surroundings from the camera feeds rather than relying on sensors that can capture that picture directly.”

Tesla explains on its website the protocols it has embraced to develop its neural networks:

Apply cutting-edge research to train deep neural networks on problems ranging from perception to control;

Per-camera networks analyze raw images to perform semantic segmentation, object detection, and monocular depth estimation;

Birds-eye-view networks take video from all cameras to output the road layout, static infrastructure, and 3D objects directly in the top-down view;

Networks learn from the most complicated and diverse scenarios in the world, iteratively sourced from a fleet of nearly 1M vehicles in real time; and,

A full build of Autopilot neural networks involves 48 networks that take 70,000 GPU hours to train, and, together, output 1,000 distinct tensors (predictions) at each timestep.
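The "HydraNet" idea behind these networks, one shared backbone feeding many task-specific heads, can be sketched in a few lines of PyTorch. This is a simplified illustration of the multi-head pattern only; the layer sizes and task names below are invented and are not Tesla's actual architecture.

# Simplified multi-head ("HydraNet"-style) network: one shared backbone,
# several task heads. Layer sizes and task names are illustrative only.
import torch
import torch.nn as nn

class TinyHydraNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared backbone: extracts features used by every task head.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Task-specific heads sharing the same features.
        self.heads = nn.ModuleDict({
            "object_detection": nn.Linear(64, 10),
            "lane_segmentation": nn.Linear(64, 4),
            "depth_estimation": nn.Linear(64, 1),
        })

    def forward(self, x):
        features = self.backbone(x)
        return {name: head(features) for name, head in self.heads.items()}

frame = torch.randn(1, 3, 128, 128)     # one dummy camera frame
outputs = TinyHydraNet()(frame)         # dict of per-task predictions
print({k: v.shape for k, v in outputs.items()})

The design point worth noting is that the expensive backbone runs once per frame, while each additional task only adds a comparatively cheap head.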

Training Teslas via Videofeeds

Tesla gathers more training data than other car companies. Each of the more than 1 million Teslas on the road sends the video feeds from its 8 cameras back to the company. The Hardware 3 onboard computer processes more than 40 times the data compared to Tesla’s previous-generation system. The company employs 1,000 people who label those images — noting cars, trucks, traffic signs, lane markings, and other features — to help train the large transformer.


At the August event, Tesla also said it can automatically select which images to prioritize in labeling to make the process more efficient. This is one of the many pieces that sets Tesla apart from its competitors.

Conclusion

Tesla has an advantage over Waymo (and other competitors) in three key areas thanks to its fleet of roughly 500,000 vehicles:

  • Computer vision
  • Prediction
  • Path planning/driving policy

Concerns about collecting the right data, paying people to label it, or paying for bandwidth and storage don’t obviate these advantages. These concerns are addressed by designing good triggers, using data that doesn’t need human labelling, and using abstracted representations (replays) instead of raw video.
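A "trigger" in this sense is simply a predicate that runs in the car and decides whether a clip is interesting enough to upload for labelling. The sketch below illustrates the pattern; the field names and thresholds are hypothetical and not taken from Tesla's software.

# Hypothetical fleet-data trigger: decide whether a driving clip is worth
# uploading for labelling. Field names and thresholds are illustrative.
def should_upload(clip: dict) -> bool:
    rare_object = clip.get("detected_rare_object", False)      # e.g. road debris
    disagreement = clip.get("radar_vision_disagreement", 0.0)  # 0..1 score
    intervention = clip.get("driver_intervention", False)      # driver took over
    return rare_object or intervention or disagreement > 0.5

clips = [
    {"detected_rare_object": True},
    {"radar_vision_disagreement": 0.1},
    {"driver_intervention": True},
]
print([should_upload(c) for c in clips])   # [True, False, True]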

The majority view among business analysts, journalists, and the general public appears to be that Waymo is far in the lead with autonomous driving, and Tesla isn’t close. This view doesn’t make sense when you look at the first principles of neural networks.

Wafer-Scale Hardware for ML and Beyond

What’s more, AlphaStar is a proof of concept of large-scale imitation learning for complex tasks. If you are skeptical that Tesla’s approach is the right one, or that path planning/driving policy is a tractable problem, you have to explain why imitation learning worked for StarCraft but won’t work for driving.

I predict that – barring a radical move by Waymo to increase the size of its fleet – in the next 1-3 years, the view that Waymo is far in the lead and Tesla is far behind will be widely abandoned. People have been focusing too much on demos that don’t inform us about system robustness, deeply limited disengagement metrics, and Google/Waymo’s access to top machine learning engineers and researchers. They have been focusing too little on training data, particularly for rare objects and behaviours where Waymo doesn’t have enough data to do machine learning well, or at all.

Wafer-scale AI for science and HPC (Cerebras)

Simulation isn’t an advantage for Waymo because Tesla (like all autonomous vehicle companies) also uses simulation. More importantly, a simulation can’t generate rare objects and rare behaviours that the simulation’s creators can’t anticipate or don’t know how to model accurately.

Pure reinforcement learning didn’t work for AlphaStar because the action space of StarCraft is too large for random exploration to hit upon good strategies. So, DeepMind had to bootstrap with imitation learning. This shows a weakness in the supposition that, as with AlphaGo Zero, pure simulated experience will solve any problem. Especially when it comes to a problem like driving where anticipating the behaviour of humans is a key component. Anticipating human behaviour requires empirical information about the real world.

Compiler Construction for Hardware Acceleration: Challenges and Opportunities

Observers of the autonomous vehicles space may be underestimating Tesla’s ability to attract top machine learning talent. A survey of tech workers found that Tesla is the 2nd most sought-after company in the Bay Area, one rank behind Google. It also found Tesla is the 4th most sought-after company globally, two ranks behind Google at 2nd place. (Shopify is in 3rd place globally, and SpaceX is in 1st.) It also bears noting that fundamental advances in machine learning are often shared openly by academia, OpenAI, and corporate labs at Google, Facebook, and DeepMind. The difference between what Tesla can do and what Waymo can do may not be that big.

2020 LLVM in HPC Workshop: Keynote: MLIR: an Agile Infrastructure for Building a Compiler Ecosystem

The big difference between the two companies is data. As Tesla’s fleet grows to 1 million vehicles, its monthly mileage will be about 1 billion miles, 1000x more than Waymo’s monthly rate of about 1 million miles. What that 1000x difference implies for Tesla is superior detection for rare objects, superior prediction for rare behaviours, and superior path planning/driving policy for rare situations. The self-driving challenge is more about handling the 0.001% of miles that contain rare edge cases than the 99.999% of miles that are unremarkable. So, it stands to reason that the company that can collect a large number of training examples from this 0.001% of miles will do better than the companies that can’t.
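Plugging the article's own figures into the 0.001% edge-case framing shows why the fleet-size gap matters (a rough illustration, assuming the quoted mileage rates):

# Edge-case mileage implied by the fleet figures quoted above.
TESLA_MILES_PER_MONTH = 1_000_000_000   # ~1 billion miles/month at 1M cars
WAYMO_MILES_PER_MONTH = 1_000_000       # ~1 million miles/month
EDGE_CASE_FRACTION = 0.00001            # the "0.001% of miles" in the text

tesla_edge = TESLA_MILES_PER_MONTH * EDGE_CASE_FRACTION
waymo_edge = WAYMO_MILES_PER_MONTH * EDGE_CASE_FRACTION

print(f"Tesla edge-case miles/month: ~{tesla_edge:,.0f}")   # ~10,000
print(f"Waymo edge-case miles/month: ~{waymo_edge:,.0f}")   # ~10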

More Information:

https://www.datacenterdynamics.com/en/news/tesla-detail-pre-dojo-supercomputer-could-be-up-to-80-petaflops/

https://www.allaboutcircuits.com/news/a-circuit-level-assessment-teslas-proposed-supercomputer-dojo/

https://heartbeat.fritz.ai/computer-vision-at-tesla-cd5e88074376

https://towardsdatascience.com/teslas-deep-learning-at-scale-7eed85b235d3

https://www.autopilotreview.com/teslas-andrej-karpathy-details-autopilot-inner-workings/

https://phucnsp.github.io/blog/self-taught/2020/04/30/tesla-nn-in-production.html

https://asiliconvalleyinsider.com/2020/03/08/waymo-chauffeurnet-versus-telsa-hydranet/

https://www.infoworld.com/article/3597904/why-enterprises-are-turning-from-tensorflow-to-pytorch.html

https://cleantechnica.com/2021/08/22/teslas-dojo-supercomputer-breaks-all-established-industry-standards-cleantechnica-deep-dive-part-3/

https://semianalysis.com/the-tesla-dojo-chip-is-impressive-but-there-are-some-major-technical-issues/

https://www.tweaktown.com/news/81229/teslas-insane-new-dojo-d1-ai-chip-full-transcript-of-its-unveiling/index.html

https://www.inverse.com/innovation/tesla-full-self-driving-release-date-ai-day

https://videocardz.com/newz/tesla-d1-chip-features-50-billion-transistors-scales-up-to-1-1-exaflops-with-exapod

https://cleantechnica.com/2021/09/15/what-advantage-will-tesla-gain-by-making-its-own-silicon-chips/


NeuroMorphic Photonic Computing and Better AI


Taking Neuromorphic Computing to the Next Level with Loihi 2

Intel Labs’ new Loihi 2 research chip outperforms its predecessor by up to 10x and comes with an open-source, community-driven neuromorphic computing framework

Today, Intel introduced Loihi 2, its second-generation neuromorphic research chip, and Lava, an open-source software framework for developing neuro-inspired applications. Their introduction signals Intel’s ongoing progress in advancing neuromorphic technology.

“Loihi 2 and Lava harvest insights from several years of collaborative research using Loihi. Our second-generation chip greatly improves the speed, programmability, and capacity of neuromorphic processing, broadening its usages in power and latency constrained intelligent computing applications. We are open sourcing Lava to address the need for software convergence, benchmarking, and cross-platform collaboration in the field, and to accelerate our progress toward commercial viability.”

–Mike Davies, director of Intel’s Neuromorphic Computing Lab

Why It Matters: Neuromorphic computing, which draws insights from neuroscience to create chips that function more like the biological brain, aspires to deliver orders of magnitude improvements in energy efficiency, speed of computation and efficiency of learning across a range of edge applications: from vision, voice and gesture recognition to search retrieval, robotics, and constrained optimization problems.

Neuromorphic Chipsets - Industry Adoption Analysis


Applications Intel and its partners have demonstrated to date include robotic arms, neuromorphic skins and olfactory sensing.

About Loihi 2: The research chip incorporates learnings from three years of use with the first-generation research chip and leverages progress in Intel’s process technology and asynchronous design methods.

Advances in Loihi 2 allow the architecture to support new classes of neuro-inspired algorithms and applications, while providing up to 10 times faster processing1, up to 15 times greater resource density2 with up to 1 million neurons per chip, and improved energy efficiency. Benefitting from a close collaboration with Intel’s Technology Development Group, Loihi 2 has been fabricated with a pre-production version of the Intel 4 process, which underscores the health and progress of Intel 4. The use of extreme ultraviolet (EUV) lithography in Intel 4 has simplified the layout design rules compared to past process technologies. This has made it possible to rapidly develop Loihi 2.

The Lava software framework addresses the need for a common software framework in the neuromorphic research community. As an open, modular, and extensible framework, Lava will allow researchers and application developers to build on each other’s progress and converge on a common set of tools, methods, and libraries. Lava runs seamlessly on heterogeneous architectures across conventional and neuromorphic processors, enabling cross-platform execution and interoperability with a variety of artificial intelligence, neuromorphic and robotics frameworks. Developers can begin building neuromorphic applications without access to specialized neuromorphic hardware and can contribute to the Lava code base, including porting it to run on other platforms.
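To give a flavour of what Lava code looks like, the sketch below follows the pattern of the public lava-nc tutorials: two populations of leaky integrate-and-fire (LIF) neurons connected through a dense weight matrix and run on the CPU simulator backend. The module paths and parameters reflect my reading of the open-source release and may differ between versions, so treat this as an assumption-laden sketch rather than canonical Lava usage.

# Sketch in the style of the public Lava tutorials (lava-nc): two LIF
# populations connected by dense weights, run on the CPU backend.
# Module paths and parameters may change between releases.
import numpy as np
from lava.proc.lif.process import LIF
from lava.proc.dense.process import Dense
from lava.magma.core.run_configs import Loihi1SimCfg
from lava.magma.core.run_conditions import RunSteps

lif_in = LIF(shape=(3,), bias_mant=4, vth=10)   # input population
dense = Dense(weights=np.eye(3) * 5)            # 3x3 connection weights
lif_out = LIF(shape=(3,), vth=10)               # output population

lif_in.s_out.connect(dense.s_in)                # spikes -> synapses
dense.a_out.connect(lif_out.a_in)               # weighted input -> neurons

lif_in.run(condition=RunSteps(num_steps=50), run_cfg=Loihi1SimCfg())
print(lif_out.v.get())                          # read membrane voltages
lif_in.stop()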

Architectures for Accelerating Deep Neural Nets

"Investigators at Los Alamos National Laboratory have been using the Loihi neuromorphic platform to investigate the trade-offs between quantum and neuromorphic computing, as well as implementing learning processes on-chip,” said Dr. Gerd J. Kunde, staff scientist, Los Alamos National Laboratory. “This research has shown some exciting equivalences between spiking neural networks and quantum annealing approaches for solving hard optimization problems. We have also demonstrated that the backpropagation algorithm, a foundational building block for training neural networks and previously believed not to be implementable on neuromorphic architectures, can be realized efficiently on Loihi. Our team is excited to continue this research with the second generation Loihi 2 chip."

About Key Breakthroughs: Loihi 2 and Lava provide tools for researchers to develop and characterize new neuro-inspired applications for real-time processing, problem-solving, adaptation and learning. Notable highlights include:

  • Faster and more general optimization: Loihi 2’s greater programmability will allow a wider class of difficult optimization problems to be supported, including real-time optimization, planning, and decision-making from edge to datacenter systems.
  • New approaches for continual and associative learning: Loihi 2 improves support for advanced learning methods, including variations of backpropagation, the workhorse algorithm of deep learning. This expands the scope of adaptation and data efficient learning algorithms that can be supported by low-power form factors operating in online settings.
  • Novel neural networks trainable by deep learning: Fully programmable neuron models and generalized spike messaging in Loihi 2 open the door to a wide range of new neural network models that can be trained in deep learning. Early evaluations suggest over 60 times fewer ops per inference on Loihi 2 compared to standard deep networks running on the original Loihi, without loss in accuracy3.
  • Seamless integration with real-world robotics systems, conventional processors, and novel sensors: Loihi 2 addresses a practical limitation of Loihi by incorporating faster, more flexible, and more standard input/output interfaces. Loihi 2 chips will support Ethernet interfaces, glueless integration with a wider range of event-based vision sensors, and larger meshed networks of Loihi 2 chips.

More details may be found in the Loihi 2/Lava technical brief.

About the Intel Neuromorphic Research Community: The Intel Neuromorphic Research Community (INRC) has grown to nearly 150 members, with several new additions this year, including Ford, Georgia Institute of Technology, Southwest Research Institute (SwRI) and Teledyne-FLIR. New partners join a robust community of academic, government and industry partners that are working with Intel to drive advances in real-world commercial usages of neuromorphic computing. (Read what our partners are saying about Loihi technology.)

“Advances like the new Loihi 2 chip and the Lava API are important steps forward in neuromorphic computing,” said Edy Liongosari, chief research scientist and managing director at Accenture Labs. “Next-generation neuromorphic architecture will be crucial for Accenture Labs’ research on brain-inspired computer vision algorithms for intelligent edge computing that could power future extended-reality headsets or intelligent mobile robots. The new chip provides features that will make it more efficient for hyper-dimensional computing and can enable more advanced on-chip learning, while the Lava API provides developers with a simpler and more streamlined interface to build neuromorphic systems.”

Deep learning: Hardware Landscape

About the Path to Commercialization: Advancing neuromorphic computing from laboratory research to commercially viable technology is a three-pronged effort. It requires continual iterative improvement of neuromorphic hardware in response to the results of algorithmic and application research; development of a common cross-platform software framework so developers can benchmark, integrate, and improve on the best algorithmic ideas from different groups; and deep collaborations across industry, academia and governments to build a rich, productive neuromorphic ecosystem for exploring commercial use cases that offer near-term business value.

Today’s announcements from Intel span all these areas, putting new tools into the hands of an expanding ecosystem of neuromorphic researchers engaged in re-thinking computing from its foundations to deliver breakthroughs in intelligent information processing.

What’s Next: Intel currently offers two Loihi 2-based neuromorphic systems through the Neuromorphic Research Cloud to engaged members of the INRC: Oheo Gulch, a single-chip system for early evaluation, and Kapoho Point, an eight-chip system that will be available soon.

Introduction

Recent breakthroughs in AI have swelled our appetite for intelligence in computing devices at all scales and form factors. This new intelligence ranges from recommendation systems, automated call centers, and gaming systems in the data center to autonomous vehicles and robots to more intuitive and predictive interfacing with our personal computing devices to smart city and road infrastructure that immediately responds to emergencies.

Meanwhile, as today’s AI technology matures, a clear view of its limitations is emerging. While deep neural networks (DNNs) demonstrate a near limitless capacity to scale to solve large problems, these gains come at a very high price in computational power and pre-collected data. Many emerging AI applications—especially those that must operate in unpredictable real-world environments with power, latency, and data constraints—require fundamentally new approaches.

Neuromorphic computing represents a fundamental rethinking of computer architecture at the transistor level, inspired by the form and function of the brain’s biological neural networks. Despite many decades of progress in computing, biological neural circuits remain unrivaled in their ability to intelligently process, respond to, and learn from real-world data at microwatt power levels and millisecond response times. Guided by the principles of biological neural computation, neuromorphic computing intentionally departs from the familiar algorithms and programming abstractions of conventional computing so it can unlock orders of magnitude gains in efficiency and performance compared to conventional architectures. The goal is to discover a computer architecture that is inherently suited for the full breadth of intelligent information processing that living brains effortlessly support.

Advances in neuromorphic computing technology

Three Years of Loihi Research

Intel Labs is pioneering research that drives the evolution of compute and algorithms toward next-generation AI. In 2018, Intel Labs launched the Intel Neuromorphic Research Community (Intel NRC) and released the Loihi research processor for external use. The Loihi chip represented a milestone in the neuromorphic research field. It incorporated self-learning capabilities, novel neuron models, asynchronous spike-based communication, and many other properties inspired from neuroscience modeling, with leading silicon integration scale and circuit speeds. Over the past three years, Intel NRC members have evaluated Loihi in a wide range of application demonstrations. Some examples include:

  • Adaptive robot arm control
  • Visual-tactile sensory perception
  • Learning and recognizing new odors and gestures
  • Drone motor control with state-of-the-art latency in response to visual input
  • Fast database similarity search
  • Modeling diffusion processes for scientific computing applications
  • Solving hard optimization problems such as railway scheduling

In most of these demonstrations, Loihi consumes far less than 1 watt of power, compared to the tens to hundreds of watts that standard CPU and GPU solutions consume.

With relative gains often reaching several orders of magnitude, these Loihi demonstrations represent breakthroughs in energy efficiency.1 Furthermore, for the best applications, Loihi simultaneously demonstrates state-of-the-art response times to arriving data samples, while also adapting and learning from incoming data streams. 

This combination of low power and low latency, with continuous adaptation, has the potential to bring new intelligent functionality to power- and latency-constrained systems at a scale and versatility beyond what any other programmable architecture supports today. Loihi has also exposed limitations and weaknesses found in today’s neuromorphic computing approaches.

While Loihi has one of the most flexible feature sets of any neuromorphic chip, many of the more promising applications stretch the range of its capabilities, such as its supported neuron models and learning rules. Interfacing with conventional sensors, processors, and data formats proved to be a challenge and often a bottleneck for performance. 

While Loihi applications show good scalability in large-scale systems such as the 768-chip Pohoiki Springs system, with gains often increasing relative to conventional solutions at larger scales, congestion in inter-chip links limited application performance. Loihi’s integrated compute-and-memory architecture foregoes off-chip DRAM memory, so scaling up workloads requires increasing the number of Loihi chips in an application. This means the economic viability of the technology depends on achieving significant improvements in the resource density of neuromorphic chips to minimize the number of required chips in commercial deployments. 

Wei Lu (U Mich) Neuromorphic Computing Based on Memristive Materials and Devices

Photonics for Computing: from Optical Interconnects to Neuromorphic Architectures

One of the biggest challenges holding back the commercialization of neuromorphic technology is the lack of software maturity and convergence. Since neuromorphic architecture is fundamentally incompatible with standard programming models, including today’s machine-learning and AI frameworks in wide use, neuromorphic software and application development is often fragmented across research teams, with different groups taking different approaches and often reinventing common functionality. 

Yet to emerge is a single, common software framework for neuromorphic computing that supports the full range of approaches pursued by the research community that presents compelling and productive abstractions to application developers. 

The Nx SDK software developed by Intel Labs for programming Loihi focused on low-level programming abstractions and did not attempt to address the larger community’s need for a more comprehensive and open neuromorphic software framework that runs on a wide range of platforms and allows contributions from throughout the community. This changes with the release of Lava.



Loihi 2: A New Generation of Neuromorphic Computing Architecture 

Building on the insights gained from the research performed on the Loihi chip, Intel Labs introduces Loihi 2. A complete tour of the new features, optimizations, and innovations of this chip is provided in the final section. Here are some highlights:

 • Generalized event-based messaging. Loihi originally supported only binary-valued spike messages. Loihi 2 permits spikes to carry integer-valued payloads with little extra cost in either performance or energy. These generalized spike messages support event-based messaging, preserving the desirable sparse and time-coded communication properties of spiking neural networks (SNNs), while also providing greater numerical precision.

 • Greater neuron model programmability. Loihi was specialized for a specific SNN model. Loihi 2 now implements its neuron models with a programmable pipeline in each neuromorphic core to support common arithmetic, comparison, and program control flow instructions. Loihi 2’s programmability greatly expands its range of neuron models without compromising performance or efficiency compared to Loihi, thereby enabling a richer space of use cases and applications.

 • Enhanced learning capabilities. Loihi primarily supported two-factor learning rules on its synapses, with a third modulatory term available from nonlocalized “reward” broadcasts. Loihi 2 allows networks to map localized “third factors” to specific synapses. This provides support for many of the latest neuroinspired learning algorithms under study, including approximations of the error backpropagation algorithm, the workhorse of deep learning. While Loihi was able to prototype some of these algorithms in proof-of-concept demonstrations, Loihi 2 will be able to scale these examples up, for example, so new gestures can be learned faster with a greater range of presented hand motions. 

 • Numerous capacity optimizations to improve resource density. Loihi 2 has been fabricated with a preproduction version of the Intel 4 process to address the need to achieve greater application scales within a single neuromorphic chip. Loihi 2 also incorporates numerous architectural optimizations to compress and maximize the efficiency of each chip’s neural memory resources. Together, these innovations improve the overall resource density of Intel’s neuromorphic silicon architecture from 2x to over 160x, depending on properties of the programmed networks. 

 • Faster circuit speeds. Loihi 2’s asynchronous circuits have been fully redesigned and optimized, improving on Loihi down to the lowest levels of pipeline sequencing. This has provided gains in processing speeds from 2x for simple neuron state updates to 5x for synaptic operations to 10x for spike generation.2 Loihi 2 supports minimum chip-wide time steps under 200ns; it can now process neuromorphic networks up to 5000x faster than biological neurons. 

 • Interface improvements. Loihi 2 offers more standard chip interfaces than Loihi. These interfaces are both faster and higher-radix. Loihi 2 chips support 4x faster asynchronous chip-to-chip signaling bandwidths,3 a destination spike broadcast feature that reduces interchip bandwidth utilization by 10x or more in common networks,4 and three-dimensional mesh network topologies with six scalability ports per chip. Loihi 2 supports glueless integration with a wider range of devices, from standard chips over its new Ethernet interface to emerging event-based vision (and other) sensor devices. 

Photonic reservoir computing for high-speed neuromorphic computing applications - A.Lugnan

 Using these enhancements, Loihi 2 now supports a new deep neural network (DNN) implementation known as the Sigma-Delta Neural Network (SDNN) that provides significant gains in speed and efficiency compared to the rate-coded spiking neural network approach commonly used on Loihi. SDNNs compute graded activation values in the same way that conventional DNNs do, but they only communicate significant changes as they happen in a sparse, event-driven manner. Simulation characterizations show that SDNNs on Loihi 2 can improve on Loihi’s rate-coded SNNs for DNN inference workloads by over 10x in both inference speed and energy efficiency.

A First Tour of Loihi 2 

 Loihi 2 has the same base architecture as its predecessor Loihi, but comes with several improvements to extend its functionality, improve its flexibility, increase its capacity, accelerate its performance, and make it easier to both scale and integrate into a larger system (see Figure 1). 

Base Architecture

Building on the strengths of its predecessor, each Loihi 2 chip consists of microprocessor cores and up to 128 fully asynchronous neuron cores connected by a network-on-chip (NoC). The neuron cores are optimized for neuromorphic workloads, each implementing a group of spiking neurons, including all synapses connecting to the neurons. All communication between neuron cores is in the form of spike messages. The number of embedded microprocessor cores has doubled from three in Loihi to six in Loihi 2. Microprocessor cores are optimized for spike-based communication and execute standard C code to assist with data I/O as well as network configuration, management, and monitoring. Parallel I/O interfaces extend the on-chip mesh across multiple chips—up to 16,384—with direct pin-to-pin wiring between neighbors.

Programmable Photonic Integrated Circuits for Quantum Information Processing and Machine Learning

New Functionality

Loihi 2 supports fully programmable neuron models with graded spikes. Each neuron model takes the form of a program, which is a short sequence of microcode instructions describing the behavior of a single neuron. The microcode instruction set supports bitwise and basic math operations in addition to conditional branching, memory access, and specialized instructions for spike generation and probing.
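
To make the idea of a neuron "program" concrete, here is a minimal, purely illustrative Python sketch of a discrete-time leaky integrate-and-fire update that emits a graded (integer-valued) spike when a threshold is crossed. This is not Loihi 2 microcode; the state variables, decay factor, and threshold are assumptions chosen for readability.

```python
# Illustrative sketch only: a conceptual neuron "program" in Python, not Loihi 2 microcode.
# It mimics the kind of per-time-step update a programmable neuromorphic core might run:
# integrate weighted input, apply leak, compare against a threshold, emit a graded spike.

def lif_neuron_step(v, u, weighted_input, decay=0.9, threshold=100):
    """One discrete time step of a leaky integrate-and-fire neuron.

    v: membrane potential (neuron state)
    u: synaptic current accumulator
    weighted_input: sum of incoming spike payloads times synaptic weights
    Returns the updated (v, u) state and an integer spike payload (0 = no spike).
    """
    u = int(u * decay) + weighted_input      # accumulate and leak synaptic current
    v = int(v * decay) + u                   # integrate current into membrane potential
    spike_payload = 0
    if v >= threshold:                       # conditional branch, as in a microcode pipeline
        spike_payload = v                    # graded spike: carry the magnitude, not just 0/1
        v = 0                                # reset after firing
    return v, u, spike_payload


# Drive the neuron with a short input sequence and print any graded spikes it emits.
v, u = 0, 0
for t, x in enumerate([0, 30, 40, 50, 0, 0, 60]):
    v, u, spike = lif_neuron_step(v, u, x)
    if spike:
        print(f"t={t}: graded spike with payload {spike}")
```

A real neuron program would be written in the core's microcode instruction set rather than Python, but the structure (integrate, compare, branch, emit) is the same kind of per-time-step loop.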

The second-generation “Loihi” processor from Intel has been made available to advance research into neuromorphic computing approaches that more closely mimic the behavior of biological cognitive processes. Loihi 2 outperforms the previous chip version in terms of density, energy efficiency, and other factors. This is part of an effort to create semiconductors that are more like a biological brain, which might lead to significant improvements in computer performance and efficiency.

Intel Announces Loihi 2, Lava Software Framework For Advancing Neuromorphic Computing - Phoronix

The first generation of artificial intelligence was built on the foundation of defining rules and emulating classical logic to arrive at rational conclusions within a narrowly defined problem domain. It was ideal for monitoring and optimizing operations. The second generation is dominated by deep learning networks that analyze content and data, mostly for sensing and perception tasks. The third generation of AI focuses on drawing similarities to human cognitive processes, like interpretation and autonomous adaptation. 

This is achieved by simulating neurons firing in the same way as humans’ nervous systems do, a method known as neuromorphic computing.

Neuromorphic computing is not a new concept. It was initially suggested in the 1980s by Carver Mead, who coined the phrase “neuromorphic engineering.” Mead spent more than four decades building analog systems that mimicked human senses and processing mechanisms such as seeing, hearing, and thinking. Neuromorphic computing is a subset of neuromorphic engineering that focuses on the “thinking” and “processing” capabilities of these human-like systems. Today, neuromorphic computing is gaining traction as the next milestone in artificial intelligence technology.

Intel Rolls Out New Loihi 2 Neuromorphic Chip: Built on Early Intel 4 Process

In 2017, Intel released the first-generation Loihi chip, a 14-nanometer chip with a 60 mm² die. It has more than 2 billion transistors, three orchestration Lakemont cores, 128 neuromorphic cores, and a configurable microcode engine for on-chip training of asynchronous spiking neural networks. Spiking neural networks allow Loihi to be entirely asynchronous and event-driven, rather than being active and updating on a synchronized clock signal. When charge builds up in a neuron, “spikes” are sent along active synapses. These spikes are largely time-based, with timing recorded as part of the data. When the spikes accumulating in a neuron reach a certain threshold within a given time window, the neuron fires its own spikes to the neurons it is connected to.

Even though Loihi 2 still has 128 neuromorphic cores, each core now supports eight times the number of neurons and synapses. Each of the 128 cores has 192 KB of flexible memory, and each neuron may now be assigned up to 4,096 states depending on the model, compared to the previous limit of 24. The neuron model is now fully programmable, similar to an FPGA, which gives the chip more versatility and allows for new kinds of neuromorphic applications.

One of the drawbacks of Loihi was that spike signals were not programmable and carried no context or range of values. Loihi 2 addresses these issues while also providing 2-10x faster circuits (2x for neuron state updates, up to 10x for spike generation), eight times more neurons, and four times more link bandwidth for increased scalability.

Loihi 2 was created using the Intel 4 pre-production process and benefited from that node's use of EUV technology. The Intel 4 process allowed Intel to halve the die size from 60 mm² to 31 mm², while the transistor count rose to 2.3 billion. Compared to previous process technologies, the extreme ultraviolet (EUV) lithography used in Intel 4 simplifies the layout design rules, which allowed Loihi 2 to be developed quickly.

Programmable Photonic Circuits: a flexible way of manipulating light on chips

Support for three-factor learning rules has been added to the Loihi 2 architecture, as well as improved synaptic (internal interconnections) compression for quicker internal data transmission. Loihi 2 also features parallel off-chip connections (that enable the same types of compression as internal synapses) that may be utilized to extend an on-chip mesh network across many physical chips to create a very powerful neuromorphic computer system. Loihi 2 also features new approaches for continual and associative learning. Furthermore, the chip features 10GbE, GPIO, and SPI interfaces to make it easier to integrate Loihi 2 with traditional systems.

Loihi 2 further improves flexibility by integrating faster, standardized I/O interfaces that support Ethernet connections, vision sensors, and bigger mesh networks. These improvements are intended to improve the chip’s compatibility with robots and sensors, which have long been a part of Loihi’s use cases.

Another significant change is in the portion of the processor that assesses the condition of a neuron before deciding whether or not to transmit a spike. In the original processor, users could express this decision only with a simple bit of fixed arithmetic; in Loihi 2, a programmable pipeline also lets them perform comparisons and control the flow of instructions.

ESA+ Colloquium - Programmable Photonics - Wim Bogaerts - 3 May 2021

Intel claims Loihi 2’s enhanced architecture makes it capable of carrying out back-propagation, a key component of many AI models, which may help accelerate the commercialization of neuromorphic chips. Loihi 2 has also been shown to execute inference calculations, which AI models use to interpret given data, with up to 60 times fewer operations per inference compared to Loihi, without any loss in accuracy.

The Neuromorphic Research Cloud is presently offering two Loihi 2-based neuromorphic devices to researchers. These are:

Oheo Gulch is a single-chip add-in card, intended for early evaluation, that comes with an Intel Arria 10 FPGA for interfacing with Loihi 2.

Kapoho Point, a system board that mounts eight Loihi 2 chips in a 4x4-inch form factor, will be available shortly. It will have GPIO pins along with “standard synchronous and asynchronous interfaces” that allow it to be used with sensors and actuators for embedded robotics applications.

These systems will be available via a cloud service to members of the Intel Neuromorphic Research Community (INRC), while Lava is freely available on GitHub.

Intel has also created Lava to address the requirement for software convergence, benchmarking, and cross-platform collaboration in the realm of neuromorphic computing. As an open, modular, and extendable framework, it will enable academics and application developers to build on one another's efforts and eventually converge on a common set of tools, techniques, and libraries.

Intel Announces Loihi 2, Lava Software Framework For Advancing Neuromorphic Computing - Phoronix

Lava operates on a range of conventional and neuromorphic processor architectures, allowing for cross-platform execution and compatibility with a variety of artificial intelligence, neuromorphic, and robotics frameworks. Users can get the Lava Software Framework for free on GitHub.
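
As a rough illustration of Lava's process-based programming model, the sketch below wires a small two-layer spiking network and runs it on the CPU simulation backend. It is loosely based on the publicly documented Lava tutorials; the module paths, class names, and port names (LIF, Dense, s_out, a_in, Loihi1SimCfg) are assumptions that may differ between Lava releases, so treat it as a sketch rather than copy-paste code.

```python
# Sketch of a tiny two-layer spiking network in Lava, assuming the API shown in the
# early Lava tutorials (module paths and parameter names may have changed since).
import numpy as np

from lava.proc.lif.process import LIF          # leaky integrate-and-fire neuron process
from lava.proc.dense.process import Dense      # dense synaptic connection process
from lava.magma.core.run_conditions import RunSteps
from lava.magma.core.run_configs import Loihi1SimCfg

# Three input neurons, fully connected to two output neurons.
lif_in = LIF(shape=(3,), bias_mant=4, vth=10)
dense = Dense(weights=np.ones((2, 3), dtype=int))
lif_out = LIF(shape=(2,), vth=10)

# Wire spike output ports to synaptic/activation input ports.
lif_in.s_out.connect(dense.s_in)
dense.a_out.connect(lif_out.a_in)

# Run the connected network for 20 time steps on the CPU-based simulation backend.
lif_out.run(condition=RunSteps(num_steps=20), run_cfg=Loihi1SimCfg())
print("output membrane voltages:", lif_out.v.get())
lif_out.stop()
```

The same process graph can, in principle, be retargeted from the CPU simulator to neuromorphic hardware by swapping the run configuration, which is the cross-platform story Lava is aiming for.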

Edy Liongosari, chief research scientist and managing director for Accenture Labs, believes that advances like the new Loihi 2 chip and the Lava API will be crucial to the future of neuromorphic computing. “Next-generation neuromorphic architecture will be crucial for Accenture Labs’ research on brain-inspired computer vision algorithms for intelligent edge computing that could power future extended-reality headsets or intelligent mobile robots,” says Liongosari.

For now, Loihi 2 has piqued the interest of the Queensland University of Technology. The institute is looking to work on more sophisticated neural modules to aid in the implementation of biologically inspired navigation and map formation algorithms. The first-generation Loihi is already being used at Los Alamos National Lab to study tradeoffs between quantum and neuromorphic computing, and to investigate the backpropagation algorithm used to train neural networks.

Intel has unveiled its second-generation neuromorphic computing chip, Loihi 2, the first chip to be built on its Intel 4 process technology. Designed for research into cutting-edge neuromorphic neural networks, Loihi 2 brings a range of improvements. They include a new instruction set for neurons that provides more programmability, allowing spikes to have integer values beyond just 1 and 0, and the ability to scale into three-dimensional meshes of chips for larger systems.

The chipmaker also unveiled Lava, an open-source software framework for developing neuro-inspired applications. Intel hopes to engage neuromorphic researchers in development of Lava, which when up and running will allow research teams to build on each other’s work.

Loihi is Intel’s version of what neuromorphic hardware, designed for brain-inspired spiking neural networks (SNNs), should look like. SNNs are used in event-based computing, in which the timing of input spikes encodes the information. In general, spikes that arrive sooner have more computational effect than those arriving later.

Karlheinz Meier - How neuromorphic computing may affect our future life HBP

Intel’s Loihi 2 second-generation neuromorphic processor. (Source: Intel)

Among the key differences between neuromorphic hardware and standard CPUs is fine-grained distribution of memory, meaning Loihi’s memory is embedded into individual cores. Since Loihi’s spikes rely on timing, the architecture is asynchronous.

“In neuromorphic computing, the computation is emerging through the interaction between these dynamical elements,” explained Mike Davies, director of Intel’s Neuromorphic Computing Lab. “In this case, it’s neurons that have this dynamical property of adapting online to the input it receives, and the programmer may not know the precise trajectory of steps that the chip will go through to arrive at an answer.

“It goes through a dynamical process of self-organizing its states and it settles into some new condition. That final fixed point as we call it, or equilibrium state, is what is encoding the answer to the problem that you want to solve,” Davies added. “So it’s very fundamentally different from how we even think about computing in other architectures.”

First-generation Loihi chips have thus far been demonstrated in a variety of research applications, including adaptive robot arm control, where the motion adapts to changes in the system, reducing friction and wear on the arm. Loihi is able to adapt its control algorithm to compensate for errors or unpredictable behavior, enabling robots to operate with the desired accuracy. Loihi has also been used in a system that recognizes different smells. In this scenario, it can learn and detect new odors much more efficiently than a deep learning-based equivalent. A project with Deutsche Bahn also used Loihi for train scheduling. The system reacted quickly to changes such as track closures or stalled trains.

Second-gen features

Built on a pre-production version of the Intel 4 process, Loihi 2 aims to increase programmability and performance without compromising energy efficiency. Like its predecessor, it typically consumes around 100 mW (up to 1 W).

An increase in resource density is one of the most important changes; while the chip still incorporates 128 cores, the neuron count jumps by a factor of eight.

“Getting to a higher amount of storage, neurons and synapses in a single chip is essential for the commercial viability… and commercializing them in a way that makes sense for customer applications,” said Davies.

Loihi 2 features. (Source: Intel)

With Loihi 1, workloads would often map onto the architecture in non-optimal ways. For example, the neuron count would often max out while free memory was still available. The amount of memory in Loihi 2 is similar in total, but has been broken up into memory banks that are more flexible. Additional compression has been added to network parameters to minimize the amount of memory required for larger models. This frees up memory that can be reallocated for neurons.

The upshot is that Loihi 2 can tackle larger problems with the same amount of memory, delivering a roughly 15-fold increase in neural network capacity per mm² of chip area (bearing in mind that die area is halved overall by the new process technology).

Neuron programmability

Programmability is another important architectural modification. Neurons that were previously fixed-function, though configurable, in Loihi 1 gain a full instruction set in Loihi 2. The instruction set includes common arithmetic, comparison and program control flow instructions. That level of programmability would allow varied SNN types to be run more efficiently.

“This is a kind of microcode that allows us to program almost arbitrary neuron models,” Davies said. “This covers the limits of Loihi [1], and where generally we’re finding more application value could be unlocked with even more complex and richer neuron models, which is not what we were expecting at the beginning of Loihi. But now we can actually encompass that full extent of neuron models that our partners are trying to investigate, and what the computational neuroscience domain [is] proposing and characterizing.”

The Loihi 2 die is the first to be fabricated on a pre-production version of Intel 4 process technology. (Source: Intel)

Programmable Photonic Circuits

For Loihi 2, the idea of spikes has also been generalized. Loihi 1 employed strict binary spikes to mirror what is seen in biology, where spikes have no magnitude. All information is represented by spike timing, and earlier spikes would have greater computational effect than later spikes. In Loihi 2, spikes carry a configurable integer payload available to the programmable neuron model. While biological brains don’t do this, Davies said it was relatively easy for Intel to add to the silicon architecture without compromising performance.

“This is an instance where we’re departing from the strict biological fidelity, specifically because we understand what the importance is, the time-coding aspect of it,” he said. “But [we realized] that we can do better, and we can solve the same problems with fewer resources if we have this extra magnitude that can be sent alongside with this spike.”

Generalized event-based messaging is key to Loihi 2’s support of a deep neural network called the sigma-delta neural network (SDNN), which is much faster than the timing approach used on Loihi 1. SDNNs compute graded-activation values in the same way that conventional DNNs do, but only communicate significant changes as they happen in a sparse, event-driven manner.
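
The change-driven communication behind SDNNs can be illustrated with a short Python sketch of sigma-delta encoding: the sender emits an event only when a value has drifted from the last transmitted value by more than a threshold, and the receiver integrates the deltas. This is a conceptual illustration, not the Loihi 2 implementation; the threshold and signal values are made up.

```python
# Conceptual sigma-delta messaging: send only significant changes, integrate at the receiver.

def sigma_delta_encode(activations, threshold=0.1):
    """Yield (time_step, delta) events only when the activation has changed enough."""
    last_sent = 0.0
    for t, a in enumerate(activations):
        delta = a - last_sent
        if abs(delta) >= threshold:          # significant change: emit an event
            last_sent = a
            yield t, delta                   # graded payload, sparse in time


def sigma_delta_decode(events, num_steps):
    """Reconstruct an approximate activation trace from the sparse delta events."""
    value, trace, events = 0.0, [], dict(events)
    for t in range(num_steps):
        value += events.get(t, 0.0)          # most steps carry no event at all
        trace.append(value)
    return trace


signal = [0.0, 0.02, 0.05, 0.5, 0.52, 0.53, 0.1, 0.1, 0.1]
events = list(sigma_delta_encode(signal))
print("events sent:", events)                # only a couple of messages for 9 samples
print("reconstruction:", sigma_delta_decode(events, len(signal)))
```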

3D Scaling

Loihi 2 is billed as up to 10 times faster than its predecessor at the circuit level. Combined with functional improvements, the design can deliver up to 10X speed gains, Davies claimed. Loihi 2 supports minimum chip-wide time steps under 200ns; it can also process neuromorphic networks up to 5,000 times faster than biological neurons.

Programmable Photonics - Wim Bogaerts - Stanford

The new chip also features scalability ports which allow Intel to scale neural networks into the third dimension. Without external memory on which to run larger neural networks, Loihi 1 required multiple devices (such as in Intel’s 768-Loihi chip system, Pohoiki Springs). Planar meshes of Loihi 1 chips become 3D meshes in Loihi 2. Meanwhile, chip-to-chip bandwidth has been improved by a factor of four, with compression and new protocols providing one-tenth the redundant spike traffic sent between chips. Davies said the combined capacity boost is around 60-fold for most workloads, avoiding bottlenecks caused by inter-chip links.

Also supported is three-factor learning, which is popular in cutting-edge neuromorphic algorithm research. The same modification, which maps third factors to specific synapses, can be used to approximate back-propagation, the training method used in deep learning. That creates new ways of learning via Loihi.

Loihi 2 will be available to researchers as a single-chip board for developing edge applications (Oheo Gulch). It will also be offered as an eight-chip board intended to scale for more demanding applications. (Source: Intel)

Lava

The Lava software framework rounds out the Loihi enhancements. The open-source project is available to the neuromorphic research community.

“Software continues to hold back the field,” Davies said. “There hasn’t been a lot of progress, not at the same pace as the hardware over the past several years. And there hasn’t been an emergence of a single software framework, as we’ve seen in the deep learning world where we have TensorFlow and PyTorch gathering huge momentum and a user base.”

While Intel has a portfolio of applications demonstrated for Loihi, code sharing among development teams has been limited. That makes it harder for developers to build on progress made elsewhere.

Promoted as a new project, not a product, Davies said Lava is intended as a way to build a framework that supports Loihi researchers working on a range of algorithms. While Lava is aimed at event-based asynchronous message passing, it will also support heterogeneous execution. That allows researchers to develop applications that initially run on CPUs. With access to Loihi hardware, researchers can then map parts of the workload onto the neuromorphic chip. The hope is that approach would help lower the barrier to entry.

“We see a need for convergence and a communal development here towards this greater goal which is going to be necessary for commercializing neuromorphic technology,” Davies said.

Loihi 2 will be used by researchers developing advanced neuromorphic algorithms. Oheo Gulch, a single-chip system for lab testing, will initially be available to researchers, followed by Kapoho Point, an eight-chip Loihi 2 version of Kapoho Bay. Kapoho Point includes an Ethernet interface designed to allow boards to be stacked for applications such as robotics requiring more computing power.

More Information:

https://www.youtube.com/c/PhotonicsResearchGroupUGentimec/videos

https://ecosystem.photonhub.eu/trainings/product/?action=view&id_form=7&id_form_data=14

https://aip.scitation.org/doi/10.1063/5.0047946

https://www.intel.com/content/www/us/en/research/neuromorphic-computing.html

https://www.intel.com/content/www/us/en/newsroom/resources/press-kits-neuromorphic-computing.html

https://www.photonics.com/Articles/Neuromorphic_Processing_Set_to_Propel_Growth_in_AI/a66821

https://www.embedded.com/intel-offers-loihi-2-neuromorphic-chip-and-software-framework/

https://github.com/Linaro/lava





Achieving Scalability in Quantum Computing



Achieving Scalability in Quantum Computing


As the path to build a quantum computer continues, challenges from across industries await solutions from this new computational power. One of the many examples of high-impact problems that can be solved on a quantum computer is developing a new alternative to fertilizer production. Making fertilizer requires a notable percentage of the world’s annual production of natural gas. This implies high cost, high energy waste, and substantial greenhouse emissions. Quantum computers can help identify a new alternative by analyzing nitrogenase, an enzyme in plants that converts nitrogen to ammonia naturally. To address this problem, a quantum computer would require at least 200 fault-free qubits—far beyond the small quantum systems of today. In order to find a solution, quantum computers must scale up. The challenge, however, is that scaling a quantum computer isn’t as simple as adding more qubits.

Building a quantum computer differs greatly from building a classical computer. The underlying physics, the operating environment, and the engineering each pose their own obstacles. With so many unique challenges, how can a quantum computer scale in a way that makes it possible to solve some of the world’s most challenging problems?

Experience quantum impact with Azure Quantum

Navigating obstacles

Most quantum computers require temperatures colder than those found in deep space. To reach these temperatures, all the components and hardware are contained within a dilution refrigerator—highly specialized equipment that cools the qubits to just above absolute zero. Because standard electronics don’t work at these temperatures, a majority of quantum computers today use room-temperature control. With this method, controls on the outside of the refrigerator send signals through cables, communicating with the qubits inside. The challenge is that this method ultimately reaches a roadblock: the heat created by the sheer number of cables limits the output of signals, restraining the number of qubits that can be added.

As more control electronics are added, more effort is needed to maintain the very low temperature the system requires. Increasing both the size of the refrigerator and the cooling capacity is a potential option; however, this would require additional logistics to interface with the room-temperature electronics, which may not be a feasible approach.

Another alternative would be to break the system into separate refrigerators. Unfortunately, this isn’t ideal either because the transfer of quantum data between the refrigerators is likely to be slow and inefficient.

At this stage in the development of quantum computers, size is therefore limited by the cooling capacity of the specialized refrigerator. Given these parameters, the electronics controlling the qubits must be as efficient as possible.

Physical qubits, logical qubits, and the role of error correction

By nature, qubits are fragile. They require a precise environment and state to operate correctly, and they’re highly prone to outside interference. This interference is referred to as ‘noise’, which is a consistent challenge and a well-known reality of quantum computing. As a result, error correction plays a significant role.

As a computation begins, the initial set of qubits in the quantum computer are referred to as ‘physical qubits’. Error correction works by grouping many of these fragile physical qubits, which creates a smaller number of usable qubits that can remain immune to noise long enough to complete the computation. These stronger, more stable qubits used in the computation are referred to as ‘logical qubits’.

In classical computing, noisy bits are corrected through redundancy (duplication, parity, and Hamming codes) as errors occur. A similar process occurs in quantum computing, but it is more difficult to achieve. This results in significantly more physical qubits than the number of logical qubits required for the computation. The ratio of physical to logical qubits is influenced by two factors: 1) the type of qubits used in the quantum computer, and 2) the overall size of the quantum computation performed. And due to the known difficulty of scaling the system size, reducing the ratio of physical to logical qubits is critical. This means that instead of just aiming for more qubits, it is crucial to aim for better qubits.
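
The intuition behind this overhead can be shown with a toy Python sketch of a classical repetition code, where one logical bit is stored as several noisy physical bits and recovered by majority vote. Quantum error correction is far more involved (errors are continuous and measurement disturbs the state), but the physical-to-logical ratio idea is similar. The error rate below is an illustrative number, not a real qubit figure.

```python
# Toy classical analogy for error-correction overhead: a repetition code with majority vote.
from math import comb

def logical_error_rate(p, n):
    """Probability that a majority of n independent physical bits flip, given per-bit rate p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

p = 0.01  # illustrative physical error rate (made up, not a real qubit figure)
for n in (1, 3, 5, 7):
    print(f"{n} physical bits per logical bit -> logical error rate ~ {logical_error_rate(p, n):.2e}")
```

Each extra layer of redundancy suppresses the logical error rate further, which is exactly why better (less noisy) physical qubits translate into a much smaller physical-to-logical ratio.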

Quantum Algorithms Landscape

Stability and scale with a topological qubit

The topological qubit is a type of qubit that offers more immunity to noise than many traditional types of qubits. Topological qubits are more robust against outside interference, meaning fewer total physical qubits are needed when compared to other quantum systems. With this improved performance, the ratio of physical to logical qubits is reduced, which in turn, creates the ability to scale.

As we know from Schrödinger’s cat, outside interactions can destroy quantum information. Any interaction from a stray particle, such as an electron, a photon, a cosmic ray, etc., can cause the quantum computer to decohere.

There is a way to prevent this: parts of the electron can be separated, creating an increased level of protection for the information stored. This is a form of topological protection known as a Majorana quasi-particle. The Majorana quasi-particle was predicted in 1937 and was detected for the first time in the Microsoft Quantum lab in the Netherlands in 2012. This separation of the quantum information creates a stable, robust building block for a qubit. The topological qubit provides a better foundation with lower error rates, reducing the ratio of physical to logical qubits. With this reduced ratio, more logical qubits are able to fit inside the refrigerator, creating the ability to scale.

If topological qubits were used in the example of nitrogenase simulation, the required 200 logical qubits would be built out of thousands of physical qubits. However, if more traditional types of qubits were used, tens or even hundreds of thousands of physical qubits would be needed to achieve 200 logical qubits. The topological qubit’s improved performance causes this dramatic difference; fewer physical qubits are needed to achieve the logical qubits required.

Developing a topological qubit is extremely challenging and is still underway, but these benefits make the pursuit well worth the effort.

A solid foundation to tackle problems unsolved by today’s computers

A significant number of logical qubits are required to address some of the important problems currently unsolvable by today’s computers. Yet common approaches to quantum computing require massive numbers of physical qubits in order to reach these quantities of logical qubits—creating a huge roadblock to scalability. Instead, a topological approach to quantum computing requires far fewer physical qubits than other quantum systems, making scalability much more achievable.

Providing a more solid foundation, the topological approach offers robust, stable qubits, and helps to bring the solutions to some of our most challenging problems within reach.

Myth vs. reality: a practical perspective on quantum computing

There’s a lot of speculation about the potential for quantum computing, but to get a clearer vision of the future impact, we need to disentangle myth from reality. At this week’s virtual Q2B conference, we take a pragmatic perspective to cut through the hype and discuss the practicality of quantum computers, how to future-proof quantum software development, and the real value obtained today through quantum-inspired solutions on classical computers.

Azure Quantum (Italian Video)

Achieving practical quantum advantage

Dr. Matthias Troyer, Distinguished Scientist with Microsoft Quantum, explains what will be needed for quantum computing to be better and faster than classical computing in his talk Disentangling Hype from Reality: Achieving Practical Quantum Advantage. People talk about many potential problems they hope quantum computers can help with, including fighting cancer, forecasting the weather, or countering climate change. Having a pragmatic approach to determining real speedups will enable us to focus the work on the areas that will deliver impact.

For example, quantum computers have limited I/O capability and will thus not be good at big data problems. However, the area where quantum does excel is large compute problems on small data. This includes chemistry and materials science, for game-changing solutions like designing better batteries, new catalysts, quantum materials, or countering climate change. But even for compute-intensive problems, we need to take a closer look. Troyer explains that each operation in a quantum algorithm is slower by more than 10 orders of magnitude compared to a classical computer. This means we need a large speedup advantage in the algorithm to overcome the slowdowns intrinsic to the quantum system; we need superquadratic speedups.
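
A back-of-envelope Python sketch makes this argument concrete. Assume, purely as an illustration, that a classical machine needs N operations at 1 ns each, a quantum algorithm with a quadratic speedup needs sqrt(N) operations, and each logical quantum operation is ten orders of magnitude slower, in line with the figure above. The constants are assumptions, not measured numbers.

```python
# Back-of-envelope crossover estimate for a quadratic quantum speedup.
# Illustrative assumptions: classical ops take 1 ns, each logical quantum op is
# ~10 orders of magnitude slower, and the quantum algorithm needs sqrt(N) ops
# where the classical algorithm needs N.

t_classical_op = 1e-9                    # seconds per classical operation (assumed)
slowdown = 1e10                          # quantum op cost relative to classical (from the talk)
t_quantum_op = t_classical_op * slowdown

# Crossover: N * t_classical_op == sqrt(N) * t_quantum_op  =>  sqrt(N) == slowdown
crossover_n = slowdown ** 2
classical_runtime_s = crossover_n * t_classical_op

print(f"crossover problem size N ~ {crossover_n:.1e} operations")
print(f"classical runtime at the crossover ~ {classical_runtime_s / (3600 * 24 * 365):.0f} years")
```

Under these assumptions the classical run at the crossover already takes on the order of thousands of years, which is why quadratic speedups alone are rarely enough and superquadratic speedups are needed.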

Troyer is optimistic about the potential for quantum computing but brings a realistic perspective to what is needed to get to practical quantum advantage: small data/big compute problems, superquadratic speedup, fault-tolerant quantum computers scaling to millions of qubits and beyond, and the tools and systems to develop the algorithms to run the quantum systems.

Experiencing Quantum impact with Microsoft today | Julie Love | Microsoft


Future-proofing quantum development

Developers and researchers want to ensure they invest in languages and tools that will adapt to the capabilities of more powerful quantum systems in the future. Microsoft’s open-source Quantum Intermediate Representation (QIR) and the Q# programming language provide developers with a flexible foundation that protects their development investments.

QIR is a new Microsoft-developed intermediate representation for quantum programs that is hardware and language agnostic, so it can be a common interface between many languages and target quantum computation platforms. Based on the popular open-source LLVM intermediate language, QIR is designed to enable the development of a broad and flexible ecosystem of software tools for quantum development.

As quantum computing capabilities evolve, we expect large-scale quantum applications will take full advantage of both classical and quantum computing resources working together. QIR provides full capabilities for describing rich classical computation fully integrated with quantum computation. It’s a key layer in achieving a scaled quantum system that can be programmed and controlled for general algorithms.

In his presentation at the Q2B conference, Future-Proofing Your Quantum Development with Q# and QIR, Microsoft Senior Software Engineer Stefan Wernli explains to a technical audience why QIR and Q# are practical investments for long-term quantum development. Learn more about QIR in our recent Quantum Blog post.

Quantum-inspired optimization solutions today

At the same time, there are ways to get practical value today through “quantum-inspired” solutions that apply quantum principles for increased speed and accuracy to algorithms running on classical computers.

We are already seeing how quantum-inspired optimization solutions can solve complex transportation and logistics challenges. An example is Microsoft’s collaboration with Trimble Transportation to optimize its transportation supply chain, presented at the Q2B conference in Freight for the Future: Quantum-Inspired Optimization for Transportation by Anita Ramanan, Microsoft Quantum Software Engineer, and Scott Vanselous, VP Digital Supply Chain Solutions at Trimble.

Trimble’s Vanselous explains how today’s increased dependence on e-commerce and shipping has fundamentally raised expectations across the supply chain. However, there was friction in the supply chain because of siloed data between shippers, carriers, and brokers; limited visibility; and a focus on task optimization vs. system optimization. Trimble and Microsoft are designing quantum-inspired load matching algorithms for a platform that enables all supply chain members to increase efficiency, minimize costs, and take advantage of newly visible opportunities. 

EdX Grover's Search Algorithm

Many industries—automotive, aerospace, healthcare, government, finance, manufacturing, and energy—have tough optimization problems where these quantum-inspired solutions can save time and money. And these solutions will only get more valuable when scaled quantum hardware becomes available and provides further acceleration.

Building a bridge to the future of supercomputing with quantum acceleration

Using supercomputing and new tools for understanding quantum algorithms in advance of scaled hardware gives us a view of what may be possible in a future with scaled quantum computing. Microsoft’s new Quantum Intermediate Representation (QIR), designed to bridge different languages and different target quantum computation platforms, is bringing us closer to that goal. Several Department of Energy (DOE) national laboratories are using this Microsoft technology in their research at the new National Quantum Initiative (NQI) quantum research centers.

As quantum computing capabilities mature, we expect most large-scale quantum applications will take full advantage of both classical and quantum computing resources working together. QIR provides a vital bridge between these two worlds by providing full capabilities for describing rich classical computation fully integrated with quantum computation.

QIR is central to a new collaboration between Microsoft and DOE’s Pacific Northwest National Laboratory (PNNL) born out of NQI’s Quantum Science Center (QSC) led by DOE’s Oak Ridge National Laboratory (ORNL). The goal of the PNNL project is to measure the impact of noisy qubits on the accuracy of quantum algorithms, specifically the Variational Quantum Eigensolver (VQE). In order to run it in simulation on the supercomputer, they needed a language to write the algorithm, and another representation to map it to run on the supercomputer. PNNL used Microsoft’s Q# language to write the VQE algorithm and then QIR provides the bridge, allowing easy translation and mapping to the supercomputer for the simulation.

The PNNL team is showcasing the simulation running on ORNL’s Summit supercomputer at this week’s virtual International Conference for High Performance Computing, Networking, Storage, and Analysis (SC20). You can view their presentation here: Running Quantum Programs at Scale through an Open-Source, Extensible Framework.

Q# and QIR are also helping to advance research at ORNL, which is accelerating progress by enabling the use of the Q# language for all QSC members, including four national labs, three industry partners, and nine universities. ORNL is integrating Q# and QIR into its existing quantum computing framework, so ORNL researchers can run Q# code on a wide variety of targets including both supercomputer-based simulators and actual hardware devices. Supporting Q# is important to ORNL’s efforts to encourage experimentation with quantum programming in high-level languages.

The ORNL team is using QIR to develop quantum optimizations that work for multiple quantum programming languages. Having a shared intermediate representation allows the team to write optimizations and transformations that are independent of the original programming language. ORNL chose to use QIR because, being based on the popular LLVM suite, it integrates seamlessly with ORNL’s existing platform and provides a common platform that can support all of the different quantum and hybrid quantum/classical programming paradigms.

Shifting left to scale up: shortening the road to scalable quantum computing | Quantum Week 2021

Since QIR is based on the open source LLVM intermediate language, it will enable the development of a broad ecosystem of software tools around the Q# language. The community can use QIR to experiment and develop optimizations and code transformations that will be crucial for unlocking quantum computing.

Microsoft technology is playing a crucial role in DOE’s NQI initiative connecting experts in industry, national labs, and academia to accelerate our nation’s progress towards a future with scaled quantum computing.

Learn more about the latest developments in quantum computing from Microsoft and our QSC national lab partner PNNL in these virtual SC20 conference sessions.

Complex quantum programs will require programming frameworks with many of the same features as classical software development, including tools to visualize the behavior of programs and diagnose issues. The Microsoft Quantum team presents new visualization tools being added to the Microsoft Quantum Development Kit (QDK) for visualizing the execution flow of a quantum program at each step during its execution. These tools are valuable for experienced developers and researchers as well as students and newcomers to the field who want to explore and understand quantum algorithms interactively.

Dr. Krysta Svore, Microsoft’s General Manager of Quantum Systems and Software, is on this year’s exotic system panel. The SC20 panel will discuss predictions from past year sessions, what actually happened, and predict what will be available for computing systems in 2025, 2030 and 2035.

As quantum computers evolve, simulations of quantum programs on classical computers will be essential in validating quantum algorithms, understanding the effect of system noise, and designing applications for future quantum computers. In a related SC20 paper, PNNL researchers propose a new multi-GPU programming methodology that constructs a virtual BSP machine on top of modern multi-GPU platforms, and they apply this methodology to build a multi-GPU density-matrix quantum simulator. Their simulator is more than 10x faster than a corresponding state-vector quantum simulator on various platforms.

Full stack ahead: Pioneering quantum hardware allows for controlling up to thousands of qubits at cryogenic temperatures

Quantum computing offers the promise of solutions to previously unsolvable problems, but in order to deliver on this promise, it will be necessary to preserve and manipulate information that is contained in the most delicate of resources: highly entangled quantum states. One thing that makes this so challenging is that quantum devices must be ensconced in an extreme environment in order to preserve quantum information, but signals must be sent to each qubit in order to manipulate this information—requiring, in essence, an information superhighway into this extreme environment. Both of these problems must, moreover, be solved at a scale far beyond that of present-day quantum device technology.

Microsoft’s David Reilly, leading a team of Microsoft and University of Sydney researchers, has developed a novel approach to the latter problem. Rather than employing a rack of room-temperature electronics to generate voltage pulses to control qubits in a special-purpose refrigerator whose base temperature is 20 times colder than interstellar space, they invented a control chip, dubbed Gooseberry, that sits next to the quantum device and operates in the extreme conditions prevalent at the base of the fridge. They’ve also developed a general-purpose cryo-compute core that operates at the slightly warmer temperatures comparable to that of interstellar space, which can be achieved by immersion in liquid Helium. This core performs the classical computations needed to determine the instructions that are sent to Gooseberry which, in turn, feeds voltage pulses to the qubits. These novel classical computing technologies solve the I/O nightmares associated with controlling thousands of qubits.

Quantum Algorithms for Hamiltonian Simulation | Quantum Colloquium

Quantum computing could impact chemistry, cryptography, and many more fields in game-changing ways. The building blocks of quantum computers are not just zeroes and ones but superpositions of zeroes and ones. These foundational units of quantum computation are known as qubits (short for quantum bits). Combining qubits into complex devices and manipulating them can open the door to solutions that would take lifetimes for even the most powerful classical computers.

Despite the unmatched potential computing power of qubits, they have an Achilles’ heel: great instability. Since quantum states are easily disturbed by the environment, researchers must go to extraordinary lengths to protect them. This involves cooling them nearly down to absolute zero temperature and isolating them from outside disruptions, like electrical noise. Hence, it is necessary to develop a full system, made up of many components, that maintains a regulated, stable environment. But all of this must be accomplished while enabling communication with the qubits. Until now, this has necessitated a bird’s nest-like tangle of cables, which could work for limited numbers of qubits (and, perhaps, even at an “intermediate scale”) but not for large-scale quantum computers.

Azure Quantum Developer Workshop | Part 3

Microsoft Quantum researchers are playing the long game, using a holistic approach to aim for quantum computers at the larger scale needed for applications with real impact. Aiming for this bigger goal takes time, forethought, and a commitment to looking toward the future. In that context, the challenge of controlling large numbers of qubits looms large, even though quantum computing devices with thousands of qubits are still years in the future.

Enter the team of Microsoft and University of Sydney researchers, headed by Dr. David Reilly, who have developed a cryogenic quantum control platform that uses specialized CMOS circuits to take digital inputs and generate many parallel qubit control signals—allowing scaled-up support for thousands of qubits—a leap ahead from previous technology. The chip powering this platform, called Gooseberry, resolves several issues with I/O in quantum computers by operating at 100 milliKelvin (mK) while dissipating sufficiently low power so that it does not exceed the cooling power of a standard commercially-available research refrigerator at these temperatures. This sidesteps the otherwise insurmountable challenge of running thousands of wires into a fridge.

Harnessing the problem-solving power of quantum computing

Their work is detailed in a paper published in Nature this month, called “A Cryogenic Interface for Controlling Many Qubits.” They’ve also extended this research to create the first-of-its-kind general-purpose cryo-compute core, one step up the quantum stack. This operates at around 2 Kelvin (K), a temperature that can be reached by immersing it in liquid Helium. Although this is still very cold, it is 20 times warmer than the temperatures at which Gooseberry operates and, therefore, 400 times as much cooling power is available. With the luxury of dissipating 400 times as much heat, the core is capable of general-purpose computing. Both visionary pieces of hardware are critical advances toward large-scale quantum computer processes and are the result of years of work.

Both chips help manage communication between different parts of a large-scale quantum computer—and between the computer and its user. They are the key elements of a complex “nervous system” of sorts to send and receive information to and from every qubit, but in a way that maintains a stable cold environment, which is a significant challenge for a large-scale commercial system with tens of thousands of qubits or more. The Microsoft team has navigated many hurdles to accomplish this feat.

The big picture: Topological quantum computing and the quantum stack

Quantum computing devices are often measured by how many qubits they contain. However, all qubits are not created equal, so these qubit counts are often apples-to-oranges comparisons. Microsoft Quantum researchers are pioneering the development of topological qubits, which have a high level of error protection built in at the hardware level. This reduces the overhead needed for software-level error correction and enables meaningful computations to be done with fewer physical qubits.

Although this is one of the unique features of Microsoft’s approach, it is not the only one. In the quantum stack, qubits make up its base. The quantum plane (at the bottom of Figure 1) is made up of a series of topological qubits (themselves made up of semiconductors, superconductors, and dielectrics), gates, wiring, and other packaging that help to process information from raw qubits. The vital processes of communication occur in the next layer higher in the stack (labeled “Quantum-Classical Interface” in Figure 1 above). The Gooseberry chip and cryo-compute core work together to bookend this communication. The latter sits at the bottom of the “Classical Compute” portion of the stack, and Gooseberry is unique relative to other control platforms in that it sits right down with the qubits at the same temperature as the quantum plane—able to convert classical instructions from the cryo-compute core into voltage signals sent to the qubits.

Play it cool: Dissipating heat in a CMOS-based control platform

Why does it matter where the Gooseberry chip sits? It is partly an issue of heat. When the wires that connect the control chip to the qubits are long (as they would have to be if the control chip were at room temperature), significant heat can be generated inside the fridge. Putting a control chip near the qubits avoids this problem. The tradeoff is that the chip is now near the qubits, and the heat generated by the chip could potentially warm up the qubits. Gooseberry navigates these competing effects by putting the control chip near, but not too near, the qubits. By putting Gooseberry in the refrigerator but thermally isolated from the qubits, heat created by the chip is drawn away from the qubits and into the mixing chamber. (See Figure 2 below).

Placing the chip near the qubits at the quantum plane solves one set of problems with temperature but creates another. To operate a chip where the qubits are, it needs to function at the same temperature as the qubits—100 mK. Operating standard bulk CMOS chips at this temperature is challenging, so this chip uses fully-depleted silicon-on-insulator (FDSOI) technology, which optimizes the system for operation at cryogenic temperatures. It has a back-gate bias, with transistors having a fourth terminal that can be used to compensate for changes in temperature. This system of transistors and gates allows qubits to be calibrated individually, and the transistors send individualized voltages to each qubit.

Gates galore: No need for separate control lines from room temperature to every qubit

Another advantage of Gooseberry is that the chip is designed in such a way that the electrical gates controlling the qubits are charged from a single voltage source that cycles through the gates in a “round-robin” fashion, charging as necessary. Previous qubit controllers required one-to-one cables from multiple voltage sources at room temperature or 4K, compromising the ability to operate qubits at large scale. The design pioneered by Dr. Reilly’s team greatly reduces the heat dissipated by such a controller. The cryogenic temperatures also come into play here to make this possible—the extreme cold allows capacitors to hold their charge longer. This means that the gates need to be charged less frequently and produce less heat and other disruptions to qubit stability.
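
The round-robin charge-lock idea can be sketched as a toy simulation: each gate's hold capacitor slowly droops, and a single shared voltage source visits one gate per cycle to top it up. The gate count, droop rate, and target voltage below are invented for illustration and are unrelated to the real device parameters.

```python
# Toy simulation of round-robin charge-locking: one voltage source refreshes many
# gate capacitors in turn, while each capacitor slowly leaks between visits.
# All numbers are illustrative only.

NUM_GATES = 8
TARGET_V = 1.0            # per-gate target voltage (arbitrary units)
LEAK_PER_CYCLE = 0.001    # fractional droop per cycle while not being refreshed

def run_cycles(voltages, cycles):
    """Advance the round-robin refresher: leak everywhere, then recharge one gate."""
    for cycle in range(cycles):
        voltages = [v * (1 - LEAK_PER_CYCLE) for v in voltages]  # everyone droops a little
        voltages[cycle % NUM_GATES] = TARGET_V                   # one gate gets topped up
    return voltages

voltages = run_cycles([TARGET_V] * NUM_GATES, 1000)
worst_droop = TARGET_V - min(voltages)
print(f"worst-case droop after 1000 cycles: {worst_droop:.4f} (arbitrary units)")
```

The slower the leak (and cryogenic capacitors leak very slowly), the less often each gate needs a visit, which is exactly what keeps the dissipated power so low.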

Azure Quantum Developer Workshop | July 2020

The Gooseberry chip is made up of both digital and analog blocks. Coupled digital logic circuits perform communication, waveform memory, and autonomous operation of the chip through a finite-state machine (FSM), and the digital part of the chip also includes a master oscillator (see Figure 3). The chip also uses a Serial Peripheral Interface (SPI) for easy communication higher up the quantum stack. The analog component of the chip is a series of cells, called “charge-lock fast-gate” (CLFG) cells, that perform two functions. First, the charge-lock function is the process for charging gates, as described above. The voltage stored on each gate is tailored to individual qubits. Information is processed in qubits by changing the voltages on the gate, and that happens in the second function, “fast-gating.” This creates pulses that physically manipulate the qubits, ultimately directing the processing of information in the qubits.

Benchmarking results of the cryo-CMOS control with a quantum dot chip

Low power dissipation is a key challenge when it comes to communicating with qubits efficiently via these pulses. There are three variables that impact power dissipation: voltage level, frequency, and capacitance. The voltage needed in this case is set by the qubit, and the frequency is set by both the qubit and clock rate of the quantum plane. This leaves capacitance as the only variable you can adjust to create low power dissipation when charging gates and sending pulses—low capacitance means low dissipation. The capacitors in this system are tiny, spaced close together, and are very near the quantum plane, so they require as little power as possible to shuffle charge between capacitors to communicate with the qubits.
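
The scaling argument here is essentially the standard dynamic switching-power relation, roughly P ≈ C·V²·f for charging and discharging a capacitance C to a voltage V at a frequency f. The small Python sketch below uses illustrative placeholder values, not Gooseberry design numbers, to show why capacitance is the lever left to pull once V and f are fixed by the qubits.

```python
# Dynamic power of charging/discharging a capacitive load: P ~ C * V^2 * f.
# Values are illustrative placeholders, not Gooseberry design parameters.

def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Approximate switching power (watts) for a capacitive load."""
    return capacitance_f * voltage_v**2 * frequency_hz

base = dynamic_power(capacitance_f=1e-15, voltage_v=0.5, frequency_hz=100e6)
print(f"baseline: {base * 1e9:.3f} nW per gate line")

# With voltage and frequency fixed by the qubits, halving capacitance halves dissipation.
print(f"half the capacitance: {dynamic_power(0.5e-15, 0.5, 100e6) * 1e9:.3f} nW per gate line")
```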

Disentangling hype from reality: Achieving practical quantum advantage

The researchers tested the Gooseberry chip to see how it would perform by connecting it with a GaAs-based quantum dot (QD) device. Some of the gates in the quantum dot device were connected to a digital-analog converter (DAC) at room temperature to compare these results with standard control approaches. Power leakage from the CLFG cells is measured by a second quantum dot in the device, and measurements of the QD conductance provide a way to monitor the charge-locking process. The temperature of all the components of the chip are measured as the control chip is being powered up, revealing that temperature stays below 100 mK within the necessary range of frequencies or clock speeds (see figure 4). See the paper for more details on the benchmarking process.

Extrapolating these results, the researchers estimated the total system power needed for the Gooseberry control chip as a function of frequency and the number of output gates. These results take into account both the clock speed and temperature needed for topological qubits, and Figure 5 shows that this chip is able to operate within the acceptable limits while communicating with thousands of qubits. This CMOS-based control approach also appears feasible for qubit platforms based on electron spins or gatemons.

Proof of principle that general-purpose compute is possible at cryogenic temperatures

The general-purpose cryo-compute core is a recent development that continues the progress made by Gooseberry. This is a general-purpose CPU operating at cryogenic temperatures. At present, the core operates at approximately 2 K, and it handles some triggering manipulation and handling of data. With fewer limitations from temperature, it also deals with branching decision logic, which requires more digital circuit blocks and transistors than Gooseberry has. The core acts as an intermediary between Gooseberry and executable code that can be written by developers, allowing for software-configurable communication between the qubits and the outside world. This technology proves it’s possible to compile and run many different types of code (written on current tools) in a cryogenic environment, allowing for greater possibilities of what can be accomplished with qubits being controlled by the Gooseberry chip.

Journey before destination: The zen behind the Microsoft approach to quantum computers

Trapped-ion qubit, the maglev train of a quantum computer

There’s no doubt that both Gooseberry and the cryo-compute core represent big steps forward for quantum computing, and having these concepts peer-reviewed and validated by other scientists is another leap ahead. But there are still many more leaps needed by researchers before a meaningful quantum computer can be realized. This is one of the reasons Microsoft has chosen to focus on the long game. While it might be nice to ramp up one aspect of quantum computers—such as the number of qubits—there are many concepts to be developed beyond the fundamental building blocks of quantum computers, and researchers at Microsoft Quantum and the University of Sydney aren’t stopping with these results.

The Transmon qubit | QuTech Academy

 

Projects like the Gooseberry chip and cryo-compute core take years to develop, but these researchers aren’t waiting to put new quantum projects into motion. The idea is to keep scaffolding prior work with new ideas so that all of the components necessary for quantum computing at large scale will be in place, enabling Microsoft to deliver solutions to many of the world’s most challenging problems.

More Information

https://cloudblogs.microsoft.com/quantum/2018/05/16/achieving-scalability-in-quantum-computing/

https://cloudblogs.microsoft.com/quantum/2018/06/06/the-microsoft-approach-to-quantum-computing/

https://cloudblogs.microsoft.com/quantum/2021/10/07/the-azure-quantum-ecosystem-expands-to-welcome-qiskit-and-cirq-developer-community/

https://news.microsoft.com/europe/2018/09/24/microsoft-and-the-university-of-copenhagen-are-building-the-worlds-first-scalable-quantum-computer/

https://www.microsoft.com/en-us/research/research-area/quantum-computing/?facet%5Btax%5D%5Bmsr-research-area%5D%5B%5D=243138&facet%5Btax%5D%5Bmsr-content-type%5D%5B%5D=post

https://azure.microsoft.com/en-us/resources/whitepapers/search/?term=quantum

https://azure.microsoft.com/en-us/solutions/quantum-computing/#news-blogs

https://sc20.supercomputing.org/

https://www.microsoft.com/en-us/research/research-area/quantum-computing/?facet%5Btax%5D%5Bmsr-research-area%5D%5B0%5D=243138&sort_by=most-recent

https://www.microsoft.com/en-us/research/blog/state-of-the-art-algorithm-accelerates-path-for-quantum-computers-to-address-climate-change/

https://www.microsoft.com/en-us/research/blog/full-stack-ahead-pioneering-quantum-hardware-allows-for-controlling-up-to-thousands-of-qubits-at-cryogenic-temperatures/

https://arxiv.org/abs/2007.14460

https://www.microsoft.com/en-us/research/publication/quantum-computing-enhanced-computational-catalysis/

https://ionq.com/

https://www.honeywell.com/us/en/company/quantum

https://www.honeywell.com/us/en/news/2020/06/quantum-scientific-papers












Linux Kernel 30 Years


The Linux Kernel celebrates its 30th anniversary and it still has a lot to give

At the beginning of the month we published a note on the 30th anniversary of the first website, a milestone that undoubtedly made history and that I have always associated with Linux, since the first website and the first prototype of the Linux kernel go hand in hand: both appeared in the same year.

On August 25, 1991, after five months of development, 21-year-old student Linus Torvalds announced in the comp.os.minix newsgroup that he was working on a prototype of a new operating system, Linux, for which the porting of bash 1.08 and gcc 1.40 had been completed. The first public version of the Linux kernel was released on September 17.

Kernel 0.0.1 was 62 KB in compressed form and contained about 10 thousand lines of source code; today's Linux kernel contains more than 28 million lines of code.

According to a study commissioned by the European Union in 2010, the approximate cost of developing a project similar to a modern Linux kernel from scratch would have been more than a billion dollars (calculated when the kernel had 13 million lines of code); another estimate puts it at more than 3 billion.

30 Years of Linux 1991-2021

A bit about Linux

The Linux kernel was inspired by the MINIX operating system, whose restrictive license Linus disliked. Later, when Linux became a famous project, detractors tried to accuse Linus of directly copying the code of some MINIX subsystems.

The accusation was rebutted by the author of MINIX, Andrew Tanenbaum, who commissioned a student to do a detailed comparison of the MINIX code with the first public versions of Linux. The study found only four negligible matching code blocks, all due to POSIX and ANSI C requirements.

Linus originally thought of calling the kernel Freax, from free, freak and X (Unix). But the kernel ended up being called "Linux" thanks to Ari Lemmke, who, when uploading the kernel to the university's FTP server at Linus's request, named the directory not "freax," as Torvalds had asked, but "linux."

Notably, an enterprising businessman named William Della Croce managed to trademark Linux and initially intended to collect royalties, but later changed his mind and transferred all rights to the trademark to Linus. The official mascot of the Linux kernel, the Tux penguin, was selected through a competition held in 1996. The name Tux stands for Torvalds UniX.

Regarding the growth of the Kernel during the last 30 years:

  • 0.0.1 - September 1991, 10 thousand lines of code
  • 1.0.0 - March 1994, 176 thousand lines
  • 1.2.0 - March 1995, 311 thousand lines
  • 2.0.0 - June 1996, 778 thousand lines
  • 2.2.0 - January 1999, 1.8 million lines
  • 2.4.0 - January 2001, 3.4 million lines
  • 2.6.0 - December 2003, 5.9 million lines
  • 2.6.28 - December 2008, 10.2 million lines
  • 2.6.35 - August 2010, 13.4 million lines
  • 3.0 - August 2011, 14.6 million lines
  • 3.5 - July 2012, 15.5 million lines
  • 3.10 - July 2013, 15.8 million lines
  • 3.16 - August 2014, 17.5 million lines
  • 4.1 - June 2015, 19.5 million lines
  • 4.7 - July 2016, 21.7 million lines
  • 4.12 - July 2017, 24.1 million lines
  • 4.18 - August 2018, 25.3 million lines
  • 5.2 - July 2019, 26.55 million lines
  • 5.8 - August 2020, 28.4 million lines
  • 5.13 - June 2021, 29.2 million lines

While for the part of development and news:

  • September 1991: Linux 0.0.1, first public release that only supports i386 CPU and boots from floppy disk.
  • January 1992: Linux 0.12, the code began to be distributed under the GPLv2 license
  • March 1992: Linux 0.95, provided the ability to run the X Window System, support for virtual memory and partition swapping, and the first SLS and Yggdrasil distributions appeared.
  • In the summer of 1993, the Slackware and Debian projects were founded.
  • March 1994: Linux 1.0, first officially stable version.
  • March 1995: Linux 1.2, significant increase in the number of drivers, support for Alpha, MIPS and SPARC platforms, expansion of network stack capabilities, appearance of a packet filter, NFS support.
  • June 1996: Linux 2.0, support for multiprocessor systems.
  • January 1999: Linux 2.2, increased memory management system efficiency, added support for IPv6, implementation of a new firewall, introduced a new sound subsystem
  • February 2001: Linux 2.4, support for 8-processor systems and 64 GB of RAM, Ext3 file system, USB, ACPI support.
  • December 2003: Linux 2.6, SELinux support, automatic kernel tuning tools, sysfs, redesigned memory management system.
  • In September 2008, the first version of the Android platform, based on the Linux kernel, was released.
  • In July 2011, after 10 years of development of the 2.6.x branch, the transition to 3.x numbering was made.
  • In 2015, with Linux 4.0, the number of git objects in the repository reached 4 million.
  • In April 2018, the repository passed the barrier of 6 million git objects.
  • In January 2019, the Linux 5.0 kernel branch was formed.
  • Released in August 2020, kernel 5.8 was the largest kernel in terms of the number of changes in the entire life of the project.
  • In 2021, code for developing drivers in the Rust language was added to the linux-next branch of the Linux kernel.

The Linux kernel is one of the most popular operating system kernels in the world. Less than 30 years after its humble beginnings in 1991, the Linux kernel now underpins modern computing infrastructure, with 2019 estimates of the number of running Linux kernels ranging upwards of twenty billion. To put that in perspective: There are about 3 Linux kernels for every living person.



The Linux kernel powers household appliances, smartphones, industrial automation, Internet data centers, almost all of the cloud, financial services, and supercomputers. It even powers a few percent of the world’s desktop systems, including the one that I am typing these words into. But the year of the Linux desktop continues its decades-long tradition of being next year.

But it wasn’t always that way.

A brave choice, big commitment in Linux’s early days

The Linux kernel was still a brave choice when IBM joined the Linux community in the late 1990s. IBM began its Linux-kernel work with a skunkworks port of Linux to the IBM mainframe and a corporate commitment that resulted in IBM’s investing $1B on Linux in 2001. Linux was ported to all IBM servers, and even to IBM Research’s Linux-powered wristwatch. Linux soon enjoyed widespread use within IBM’s hardware, software, and services.

Of course, IBM wasn’t the only player betting on Linux. For example, an IBM sales team spent much time preparing to convince a long-standing, technically conservative client to start moving towards Linux. When the team went to give their pitch, the client opened the discussion with: “We have decided that we are going with Linux. Will you be coming with us?” Although this destroyed untold hours of preparation, it produced a result beyond the sales team’s wildest imaginations.

And it wasn’t an isolated incident.

Keynote: Linus Torvalds in conversation with Dirk Hohndel

Setting Linux up for success

This widespread enthusiasm motivated IBM not only to make substantial contributions to Linux, but also to come to its defense. First, we committed to not attack Linux in the form of patent pledges. We took it a step further and opted to co-found the Open Invention Network, which helped defend open source projects such as the Linux kernel against attacks by patent holders. We made numerous visits to the courtroom to defend ourselves against a lawsuit related to our Linux involvement, and co-founded several umbrella organizations to facilitate open source projects, perhaps most notably helping to found the Linux Foundation.

IBM is also a strong technical contributor to the Linux kernel, ranking in the top ten corporate contributors and having maintainers for a wide range of Linux-kernel subsystems. Of course, IBM contributes heavily to support its own offerings, but it is also a strong contributor in the areas of scalability, robustness, security, and other areas that benefit the Linux ecosystem.

Of course, not everything that IBM attempted worked out. IBM’s scalability work in the scheduler was never accepted into the Linux kernel. Although its journaling filesystem (JFS) was accepted and remains part of the Linux kernel, it seems safe to say that JFS never achieved the level of popularity that IBM had hoped for. Nevertheless, it seems likely that IBM’s efforts helped to inspire the work leading to the Linux kernel’s excellent scalability, features, and functionality in its filesystems and scheduler.

In addition, these experiences taught IBM to work more closely with the community, paving the way to later substantial contributions. One example is the CPU-groups feature of the community’s scheduler that now underpins containers technologies such as Docker, along with the virtio feature that plays into the more traditional hypervisor-based virtualization. Another example is numerous improvements leading up to the community’s EXT4 filesystem. A final example is the device-tree hardware specification feature, originally developed for IBM’s Power servers but now also used by many embedded Linux systems.

Celebrating 30 Years of Open

Achieving impossible results

It has also been a great privilege for IBM to be involved in a number of Linux-kernel efforts that produced results widely believed to be impossible.

First, at the time that IBM joined the Linux kernel community, the kernel could scale to perhaps two or four CPUs. At the time there was a large patchset from SGI that permitted much higher scalability, but this patchset primarily addressed HPC workloads. About ten years of hard work across the community changed this situation dramatically, so that, despite the naysayers, the same Linux-kernel source code supports both deep embedded systems and huge servers with more than one thousand CPUs and terabytes of memory.

Second, it was once common knowledge that achieving sub-millisecond response times required a special-purpose, real-time operating system. In other words, sub-millisecond response times certainly could not be achieved by a general-purpose operating system such as the Linux kernel. IBM was an important part of a broad community effort that proved this to be wrong, as part of an award-winning effort including Raytheon and the US Navy. Although the real-time patchset has not yet been fully integrated into the mainline Linux kernel, it does achieve not merely deep sub-millisecond response times, but rather deep sub-hundred-microsecond response times. And these response times are achieved not only on single-CPU embedded systems, but also on systems with thousands of CPUs.

Third, only about a decade ago, it was common knowledge that battery-powered embedded systems required special-purpose operating systems. You might be surprised that IBM would be involved in kernel work in support of such systems. One reason for IBM’s involvement was that some of the same code that improves battery lifetime also improves the Linux kernel’s virtualization capabilities — capabilities important to the IBM mainframe. A second reason for IBM’s involvement was the large volume of ARM chips then produced by its semiconductor technology partners. This latter reason motivated IBM to cofound the Linaro consortium, which improved Linux support for ARM’s processor families. The result, as billions of Android smartphone users can attest, is that the Linux kernel has added battery-powered systems to its repertoire.

Fourth and finally, version 5.2 of the Linux kernel comprises 13,600 changes from 1,716 kernel developers. The vast majority of these changes were applied during the two-week merge window immediately following the release of version 5.1, with only a few hundred new changes appearing in each of the release candidates that appear at the end of each weekend following the merge window. This represents a huge volume of changes from a great many contributors, and with little formal coordination. Validating these changes is both a challenge and a first-class concern.

One of IBM’s contributions to validation is “-next” integration testing, which checks for conflicts among the contributions intended for the next merge window. The effect of -next integration testing, combined with a number of other much-appreciated efforts throughout the community, has not been subtle. Ten years ago, serious kernel testing had to wait for the third or fourth release candidate due to bugs introduced during the preceding merge window. Today, serious kernel testing can almost always proceed with the first release candidate that comes out immediately at the close of the merge window.

But is Linux done yet?

Not yet.

Happy Birthday Linux! Plus Highlights of DLN MegaFest Celebrating 30 Years of Linux!

A continuing effort

Although developers should be proud of the great increases in stability of the Linux kernel over the years, the kernel still has bugs, some of which are exploitable. There is a wide range of possible improvements, from more aggressively applying tried-and-true testing techniques to over-the-horizon research topics such as formal verification. In addition, existing techniques are finding new applications, so that CPU hotplug (which was spearheaded by IBM in the early 2000s) has recently been used to mitigate hardware side-channel attack vectors.
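
As a small illustration of how CPU hotplug can be driven as a mitigation knob from ordinary automation tooling, here is a minimal Ansible sketch (matching the playbook snippets used later in this post) that turns off SMT sibling threads through the kernel's sysfs interface. The inventory scope is an assumption, and whether disabling SMT is appropriate depends entirely on your workload and threat model; treat this as an illustration, not a recommendation from the text above.

- name: Offline SMT sibling threads via the CPU-hotplug sysfs interface (illustrative sketch)
  hosts: all          # assumed inventory scope
  become: true
  tasks:
    - name: Write "off" to the kernel's SMT control file (available since kernel 4.19)
      ansible.builtin.shell: echo off > /sys/devices/system/cpu/smt/control
      # Writing "off" uses the same hotplug machinery to take sibling CPUs offline;
      # writing "on" brings them back online.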

The size of hardware systems is still increasing, which will require additional work on scalability. Many of these larger systems will be used in various cloud-computing environments, some of which will pose new mixed-workload challenges. Changing hardware devices, including accelerators and non-volatile memory, will require additional support from Linux, as well as from hardware features such as IBM’s Power Systems servers’ support of the NVLink and CAPI interconnects.

Finally, there is more to security than simply fixing bugs faster than attackers can exploit them (though that would be challenge enough!). Although there is a great deal of security work needed in a great many areas, one important advance is Pervasive Encryption for IBM Z.

IBM congratulates the Linux kernel community on its excellent progress over the decades, and looks forward to being part of future efforts overturning yet more morsels of common wisdom!

Linux Kernel Internals

What 30 Years of Linux Taught the Software Industry

Linux has become the largest collaborative development project in the history of computing over the last 30 years. Reflecting on what made this possible and how its open source philosophy finally prevailed in the industry can offer software vendors valuable lessons from this amazing success story.

The web may not have reached full adulthood yet, but it has already crafted its own mythology.

August 25, 1991: Linus Torvalds, a 21-year-old university student from Finland, writes a post to a Usenet group: “Hello everybody out there using minix — I’m doing a (free) operating system (just a hobby, won’t be big and professional like gnu) for 386 (486) AT clones […]”. A few weeks later, the project, which will eventually be known as Linux, is published for the first time.

This is the starting point of an epic that few could have foreseen.

Fast-forward 30 years and the Linux kernel isn’t only running on most of the web servers and smartphones around the globe, but it also supports virtually all of the much more recent cloud infrastructure. Without open source programs like Linux, cloud computing wouldn’t have happened.

Among the major factors that propelled Linux to success is security. Today, the largest software companies in the world are taking open source security to new levels, but the Linux project was one of the first to emphasize this.

HotCloud '14 - Cloudy with a Chance of …

How Linux Became the Backbone of the Modern IT World

Brief History

Open source predates the Linux project by many years and is arguably as old as software itself. Yet, it is the success of the latter that propelled this movement in the 1990s. When it was first submitted for contribution in 1991 by Torvalds, the Linux kernel was the GNU project’s ‘missing link’ to a completely free software operating system, which could be distributed and even sold without restrictions. In the following years, and as the project started to incorporate proprietary licensed components and grow in popularity, a clarification on the meaning of “free software” became necessary.

This led to the coining of the term “open source” as we use it today, thanks in part to Eric Raymond’s seminal paper The Cathedral and the Bazaar, a “reflective analysis of the hacker community and free software principles.” Open source was chosen to qualify software in which the source code is available to the general public for use or modification from its original design, depending on the terms of the license. People may then download, modify and publish their version of source code (fork) back to the community.

Open source projects started gaining traction in the late nineties thanks to the popularity of software like Apache HTTP Server, MySQL and PHP to run the first dynamic websites on the internet.

Facts and Figures

Today, not only is Linux powering most of the digital era, but open source has become the leading model for how we build and ship software. Though most people don’t realize it, much of the technology we rely on every day runs on free and open source software (FOSS). Phones, cars, planes and even many cutting-edge artificial intelligence programs use open source software. According to the Linux Foundation, 96.3% of the world’s top one million servers run on Linux and 95% of all cloud infrastructure operates on it. Other infrastructure also relies on open source: 70% of global mobile subscribers use devices running on networks built using ONAP (Open Network Automation Platform).

Linux adoption is very high in professional IT, where it’s become a de facto standard, especially with the advent of the cloud era. In fact, 83.1% of developers said Linux is the platform they prefer to work on. This success is due, in large part, to the community that contributed to its source code since its creation: More than 15,000 developers from more than 1,500 companies. Linux went on to become, arguably, the biggest success story of the free software movement, proving that open source could lead to the creation of software as powerful as any sold by a corporation.

The Linux Foundation, a non-profit technology consortium founded in 2000 to support the collaborative development of Linux and open source software projects, is itself a big success. It now has more than 100 projects under its umbrella, spread across technology sectors like artificial intelligence, autonomous vehicles, networking and security. Several subset foundations have also emerged over the years, including the Cloud Foundry Foundation, the influential Cloud Native Computing Foundation, and the recently announced Open Source Security Foundation. The Foundation estimates the total shared value created from the collective contributions of its community at a whopping $54.1 billion.

All these achievements may not have been possible without the embrace of open source by the enterprise world, which may represent its biggest win.

New Generation of Mainframers - John Mertic, The Linux Foundation & Len Santalucia, Vicom Infinity

Enterprise Adoption

Companies began to realize that many open source projects were easier and cheaper to implement than asking their developers to build the basic pieces of an internet business over and over again from scratch.

Twenty years ago, most businesses ran atop proprietary software from Microsoft, Oracle and IBM, and the idea of collaborating on big software projects might have sounded laughable to them. Today, these companies, along with relative newcomers such as Google, Facebook and Amazon, are not only employing thousands of full-time contributors to work on open source projects like Linux, they also regularly choose to open source some of their state-of-the-art projects; from Google Brain’s machine learning platform TensorFlow and container orchestration platform Kubernetes to Facebook’s React.

There’s no question that open source software created a new wave of business opportunities. As more companies took an interest in open source projects, they realized they didn’t necessarily have the in-house expertise to manage those projects themselves and turned to startups and larger companies for help.

Even Microsoft, which famously warred against the very concept of Linux for nearly a decade, made a strategic shift to embrace open source in the 2010s, led by CEO Satya Nadella. The IT giant finally joined the Linux Foundation in 2016 and acquired GitHub, the largest host for open source projects, two years later. It has since become one of the biggest sponsors of open source projects.

As a consequence, the stakes have been raised for open source software, which is the engine powering the shift toward the cloud for virtually every company. In this context, security is becoming a topic of the utmost importance, and the commitment to secure the open source ecosystem is growing fast.

LINUX Kernel

Setting a Standard for Security and Trust

Open Source Security Issues

Following the OSS adoption boom, the sustainability, stability and security of these software packages is now a major concern for every company that uses them.

The Census II report on structural and security complexities in the modern-day supply chain “where open source is pervasive but not always understood” revealed two concerning trends that could make FOSS more vulnerable to security breaches. First, the report said it is common to see popular packages published under individual developers’ accounts, raising the issue of security and reliability. Second, it is very common to see outdated versions of open source programs in use, meaning they contain fewer security patches.

The 2021 OSSRA report agrees: “98% of the codebases audited over the past year contain at least one open source component, with open source comprising 75% of the code overall.” The report also noted that 85% of the audited codebases contained components “more than four years out of date”.

This highlights the mounting security risk posed by “unmanaged” open source: “84% of audited codebases containing open source components with known security vulnerabilities, up from 75% the previous year. Similarly, 60% of the codebases contained high-risk vulnerabilities, compared to 49% just 12 months prior.” Not only is the security posture affected, but there are also compliance issues that can arise from unsupervised integration of open source content because licenses can be conflicting or even absent.

Because large corporations are now a big part of the open source ecosystem, their sponsorship is a welcome source of financing for many people whose work had been done for free until now, yet it may not be enough. The open source community is well-known for its commitment to independence, its sense of belonging and its self-sufficiency, and expecting contributors to voluntarily address security issues is unlikely to succeed.

This is where the experience of building Linux over 30 years and coordinating the work of thousands of individual contributors may be an example to follow.

Linux on IBM Z and LinuxONE: What's New

Linux Foundations

In Linux kernel development, security is taken very seriously. Because it is an underlying layer for so many public and private software ‘bricks’ in the digital world, any mistake can cost businesses millions, if not lives. Since the beginning, the project has adopted a decentralized development approach with a large number of contributors collaborating continuously, and it has consolidated a strong peer-review process as the community development effort grew and expanded.

The last stable release at the time of writing is 5.14, released on August 29th, 2021, only a few days before the 30th birthday of the project. The most important features in the release are security-related: One is intended to help mitigate processor-level vulnerabilities like Spectre and Meltdown and the other concerns system memory protection, which is a primary attack surface to exploit. Each Linux kernel release sees close to 100 new fixes per week committed by individuals and professionals from the likes of Intel, AMD, IBM, Oracle and Samsung.

With such broad adoption and long history, the Linux project has reached a level of maturity that few, if any, other FOSS projects have seen. The review process and release model have built confidence for numerous downstream vendors. Although the world is not perfect and it is arguably difficult for them to keep up with such a high rate of change, they can at least benefit from strong security enforcement mechanisms and they can adapt their security posture in accordance with their “risk appetite”: Vendors are able to do the calculus of determining how old a kernel they can tolerate exposing users to.

A maintainable, scalable, and verifiable SW architectural design model

Pushing the Boundaries of Open Source Security

Heartbleed and the Fragility of OS Security

In April 2014, a major security incident affecting the OpenSSL cryptography library was disclosed as “Heartbleed.” The developer who introduced the bug, who was working on the project alongside a handful of other engineers, acknowledged:

“I am responsible for the error because I wrote the code and missed the necessary validation by an oversight. Unfortunately, this mistake also slipped through the review process and therefore made its way into the released version.” OpenSSL, an open source project, is widely used to implement the Transport Layer Security (TLS) protocol. In other words, it’s a fundamental piece used to secure a large part of the web.

Open source was seen as fundamentally secure for a long time because the more people examine a line of code, the better the chances of spotting any weakness. Additionally, this model prevents “security by obscurity,” whereby the bulk of the protection comes from people not knowing how the security software works—which can result in the whole edifice tumbling down if that confidential information is released or discovered externally.

This incident was a major turning point for a large share of the biggest web corporations: They realized that many open source technologies underpinning their core operations could not be “assumed to be secure” anymore. Any human error could have huge implications; therefore, a specific effort had to be made to improve the security in this specific space.

Linux Kernel Development

A New Era for Open Source

As we advance in an era where open source is omnipresent in codebases, tooling, networks and infrastructure and is even in fields other than software, security awareness is starting to take hold. But it needs a lot more work.

A big part of the challenge, to begin with, is for the industry to understand the scope of the problem.

Google just announced that it will be committing “$100 million to support third-party foundations that manage open source security priorities and help fix vulnerabilities.”

The Secure Open Source (SOS) pilot program, run by the Linux Foundation, will reward developers for enhancing the security of critical open source projects that we all depend on.

In doing so, Google leads the way in enlarging the financial sponsorship of big players like companies and governments — which are increasingly sponsoring open source both directly and indirectly. However, they also recommend that organizations “understand the impact they have on the future of the FOSS ecosystem and follow a few guiding principles.”

What could these principles look like?

Modernizing on IBM Z Made Easier With Open Source Software

A Roadmap to Safely Use and Contribute to Open Source

The Linux Foundation proposed a specific Trust and Security Initiative, which describes a collection of eight best practices (with three degrees of maturity) that open source teams should use to secure the software they produce, and that a larger audience can adopt to “raise the collective security bar.” Here they are:

Clarifying the roles and responsibilities, and making sure everyone is aware of their security responsibilities across the organization.

Setting up a security policy for everyone; in other words, a clear north star for all members of the organization.

‘Know your contributors’ is defined as a set of practices for making risk-based decisions on whom to trust and for countering offensive cyberwarfare techniques, such as the poisoning of upstream code.

Locking down the software supply chain: This has become a preferred target as adversaries clearly understood that they can have a bigger and more effective impact with less effort than targeting individual systems.

Provide technical security guidance to narrow potential solutions down to the more appropriate ones in terms of security.

Deploy security playbooks to define how to do specific security processes, specifically incident response and vulnerability management processes, like creating roles and responsibilities or publishing security policies. This may feel formal, antiquated and old-school but having pre-defined playbooks means that teams can focus on shipping software and not learning how to do security, especially at the least convenient and most stressful time.

Securing Linux VM boot with AMD SEV measurement - Dov Murik & Hubertus Franke, IBM Research

Develop security testing techniques with automated testing strongly recommended since it scales better, has less friction and less cost to the teams and aligns well to modern continuous delivery pipelines.

However, the authors of the guide are aware that some major challenges are still facing the industry and, as such, need to be addressed. They mention:

  • The lack of open source security testing tools
  • The fact that open source package distribution is broken
  • The fact that the CVE format for vulnerability disclosure is also broken
  • The lack of a standard for a security build certificate, which would allow any consumer to transparently verify that a product or component complies with the announced specifications

“The types of verification can and should include the use of automated security tools like SAST, DAST and SCA, as well as verification of security processes like the presence of security readmes in repos and that security response emails are valid.”

A scheme like this could have a significant and lasting effect on the security quality of open source software and the internet at large.

The Linux project, born 30 years ago, is present in all layers of the modern software stack today. It is used by all the largest server clusters powering the modern web and any business going digital will use it at some point. This unparalleled longevity and success have demonstrated that the open source model was compatible with the requirements of enterprise-grade services and economically viable. Now that open source is all the rage in the software industry, a consensus and action plan on how to ensure the sustainability of this ecosystem becomes urgent. The top priority for businesses that depend on it is to adopt strong application security guidelines, like the ones promoted by the Linux Foundation, which have proven their value.

One last note on the nature of open source: As businesses are now much more bound by the common use of open source components to build upon, they should not fall into the “tragedy of the commons” trap. This would mean waiting until others take action; for instance, to improve the global software security landscape. This might be one of the biggest challenges confronting our highly collaborative industry.

More Information:

https://www.linuxadictos.com/en/hoy-el-kernel-de-linux-cumple-su-30-aniversario-y-aun-le-queda-mucho-por-dar.html

https://developer.ibm.com/blogs/ibm-and-the-linux-kernel/

https://devops.com/what-30-years-of-linux-taught-the-software-industry/

https://www.howtogeek.com/754345/linux-turns-30-how-a-hobby-project-conquered-the-world/


Cyber Ark Security and Ansible Automation



CyberArk offers enhanced end-to-end security for critical assets

As an established leader in privileged access management and identity security capabilities, CyberArk helps the world’s leading organizations secure their most critical digital assets. CyberArk partnered to build Red Hat certified integrations, which offer joint customers end-to-end security for Red Hat OpenShift Container Platform and Red Hat Ansible Automation Platform. This unique offering allows Red Hat and CyberArk to increase revenue and grow their accounts through more efficient and secure business solutions.

“It’s really rewarding to see this win-win partnership between Red Hat and CyberArk that truly benefits both companies—and their customers.”



Benefits

  • Increased revenue year over year
  • Opened access to customer’s wider IT organization to build new relationships
  • Enhanced security for users by automating credentials management
  • Protecting critical digital assets worldwide

For more than a decade, the world’s leading organizations have trusted CyberArk to help them secure their most critical digital assets. Today, the growing software security company protects more than 6,600 global businesses—including most of the Fortune 500—and a majority of Fortune banks and insurance, pharmaceutical, energy, and manufacturing companies rely on CyberArk.

Red Hat Ansible Security Automation Overview

With its U.S. headquarters in Massachusetts and main office in Illinois, CyberArk offers customers solutions focused on privileged access management (PAM), identity security, and DevSecOps. More than 2,000 staff members, located in offices around the globe, help security leaders get ahead of cyber threats, specifically cyberattacks against an organization’s most critical assets. “Typically the most privileged users inside an organization have access to the most sensitive information,” said John Walsh, Senior Product Marketing Manager at CyberArk. But the lines between the trusted insider, third-party vendor, and outsiders have started to blur and even disappear as sophisticated supply chain attacks like SolarWinds materialize. “It’s zero-trust—you really can’t tell who the outsiders and the insiders are anymore.”

CyberArk is helping customers move their critical strategies forward more securely. Work from home dynamics and demand for efficiency have motivated companies to accelerate their digital transformations and cloud migration plans. And, with that, customers have a heightened sense of urgency around CyberArk’s PAM and identity solutions. 

Offering customers end-to-end security for Ansible and OpenShift

A Red Hat partner since 2016, CyberArk is in the top four strategic security partners, and one of the highest revenue generating in the Global Partner Alliances (GPA) program, specifically in the security segment. GPA helps Red Hat partners build, expand, and sell software applications. As a result of the continued collaboration between the two organizations, CyberArk was recently awarded the Collaboration Technology Independent Software Vendor (ISV) Partner Of The Year, announced at Red Hat Summit 2021.

The partnership is not localized to a specific region—it covers North America, Europe, the Middle East and Africa, and Asia Pacific and Japan, across a range of sectors. Red Hat elevated CyberArk to Globally Managed Partner in 2018. “We have a dedicated Red Hat resource,” said Joanne Wu, VP of Business Development at CyberArk. “When Red Hat runs campaigns or events, or goes to market, we are often, if not always, one of the top partners approached for these invitation-only strategic initiatives.”

The partnership offers Red Hat customers enhanced security for Red Hat OpenShift and Red Hat Ansible Automation Platform. “Just like any good partnership, we complement and support each other as Red Hat is a market leader in container management and automation,” said Walsh, “while CyberArk is a market leader in privileged access management and identity security. Together, we not only help each other, but we also offer a better solution to our customers.” 

Red Hat and CyberArk collaborate on rich content such as whitepapers, a hands-on workshop that shows how the technologies integrate, and videos to increase customer skill levels.

CyberArk Secrets Management in Red Hat OpenShift

Integrating leading solutions

Red Hat and CyberArk work together on Red Hat certified integrations to offer a solution that secures secrets and credentials in  Red Hat Ansible Automation Platform and within the DevOps environments of Red Hat OpenShift. Red Hat OpenShift is an enterprise-ready Kubernetes container platform with full-stack automated operations to manage hybrid cloud, multicloud, and edge deployments. “CyberArk secures application secrets and the access they provide for Red Hat technologies, rotating them, auditing, and authenticating access according to best practices,” said Walsh.

CyberArk’s Conjur provides a comprehensive, centralized solution for securing credentials and secrets for applications, containers, and continuous integration and continuous delivery (CI/CD) tools across native cloud and DevOps environments. CyberArk Conjur integrates with Red Hat OpenShift to provide ways to simplify and strengthen security by safeguarding the credentials used by applications running in OpenShift containers.

AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer and OpenSCAP

CyberArk and Red Hat provide more than 10 integrations to enhance security and protect automation environments for Red Hat OpenShift and Red Hat Ansible Automation Platform. CyberArk makes these available as certified integrations on its marketplace, empowering DevOps and security teams to automatically secure and manage the credentials and secrets used by IT resources and CI/CD tools. 

These integrations simplify how operations teams write and use playbooks to more securely access credentials. Credentials are centrally managed and secured by CyberArk. Secrets used by Ansible Playbooks are automatically secured and rotated by CyberArk based on the organization’s policy.

Building a strong alliance: Red Hat and CyberArk increase revenue through partnership

Increased revenue year over year

Red Hat and CyberArk increase revenue for each other through their partnership, with revenue growing year over year. “CyberArk influences a Red Hat deal being closed and, vice versa, Red Hat helps CyberArk to find opportunities and close deals,” said Wu. “Both companies benefit from the value proposition. It’s a true win-win.” 

By mutually developing their pipeline over the years, both Red Hat and CyberArk have witnessed exponential growth in the number of accounts where they jointly present their value proposition.

Shifting Security Left: Streamlining Enterprise Secrets Management With CyberArk & Red Hat OpenShift

Opened access to the wider organization

Red Hat helps CyberArk gain access to the DevOps team, and CyberArk helps Red Hat gain access to security teams. “CyberArk is mostly speaking to the security teams, all the way up to the CSO [Chief Security Officer],” said Wu. “Red Hat has given us visibility to the infrastructure side of the house.”

Most importantly, the partnership with Red Hat helps CyberArk build relationships with DevOps teams using Ansible Automation Platform for their CI/CD pipeline, and looking for security solutions. CyberArk is then able to include security solutions with those DevOps projects. “Red Hat has helped CyberArk reach the IT organization,” said Walsh. “Red Hat enables CyberArk to provide our security solutions and Red Hat integrations as a stronger solution, to raise awareness, and to expand our market reach.”

Stayed aware of the latest developments

CyberArk’s close relationship with Red Hat means it is always fully informed about how Red Hat technologies are evolving, and, with that, it can ensure its security solutions are always fully aligned with new Red Hat features and products. “Having visibility into the Red Hat Ansible Automation Platform roadmap means we can stay ahead while developing our integrations,” said Wu.

When Red Hat released Ansible security automation, CyberArk was one of the first ISVs to develop an integration. And when Ansible Automation Platform first included collections, CyberArk quickly packaged its collection to ensure it was available on Ansible Automation Hub.

Container Technologies and Transformational value

Enhanced security for users

The partnership ensures customers get a more efficient and hardened implementation, whether with Red Hat OpenShift or Red Hat Ansible Automation Platform. 

Joint customers can find CyberArk’s Red Hat Certified integrations on the Red Hat Ecosystem Catalog and Ansible Automation Hub. CyberArk also has native integration with Ansible Automation Platform, built in at the product level.

The integrations are not only free but also jointly supported by both Red Hat and CyberArk. Customers do not need to invest any development resources because the integrations do not require any code.

Expanding on successes with Red Hat

Looking to the future, CyberArk is planning to build on its already strong partnership with Red Hat. “We’ve had a tremendous co-selling effort in the U.S. and EMEA [Europe, Middle East, and Africa], and I’d like to see that expand even more so to APJ [Asia Pacific and Japan] and South America,” said Wu. “And we’re also planning to get closer and increase reach in the public sector.” 

The security solutions company is also eager to expand its Red Hat Ansible Automation Platform integrations. CyberArk will soon be the first partner to develop a reference architecture with Ansible Automation Platform.

CyberArk is a leader in PAM and identity security. Red Hat is a leader in DevOps and hybrid cloud technology. Their strong alliance offers significant benefits and value for customers. “It’s really rewarding to see this win-win partnership between Red Hat and CyberArk that truly benefits both companies—and their customers,” said Wu. 

(OCB) Identity, Access and Security Management for DevOps: RedHat and CyberArk

The Inside Playbook

Automating Security with CyberArk and Red Hat Ansible Automation Platform

Proper privilege management is crucial with automation. Automation has the power to perform multiple functions across many different systems. When automation is deployed enterprise-wide, across sometimes siloed teams and functions, enterprise credential management can simplify adoption of automation — even complex authentication processes can be integrated into the setup seamlessly, while adding additional security in managing and handling those credentials.

Depending on how they are defined, Ansible Playbooks may require access to credentials and secrets that grant wide access to organizational systems. These credentials are necessary for reaching systems and IT resources and accomplishing automation tasks, but they’re also a very attractive target for bad actors. In particular, they are tempting targets for advanced persistent threat (APT) intruders. Gaining access to these credentials could give the attacker the keys to the entire organization.

Introduction to Red Hat Ansible Automation Platform

Most breaches involve stolen credentials, and APT intruders prefer to leverage privileged accounts like administrators, service accounts with domain privileges, and even local admin or privileged user accounts.

You’re probably familiar with the traditional attack flow: compromise an environment, escalate privilege, move laterally, continue to escalate, then own and exfiltrate. It works, but it also requires a lot of work and a lot of time. According to the Mandiant Report, median dwell time for an exploit, while well down from over 400 days in 2011, remained over 50 days in 2019. However, if you can steal privileged passwords or the API keys to a сloud environment, the next step is complete compromise. Put yourself into an attacker’s shoes: what would be more efficient? 

While Ansible Tower, one of the components of Red Hat Ansible Automation Platform, introduced built-in credentials and secret management capabilities, some may have the need for tighter integration with the enterprise management strategy. CyberArk works with Ansible Automation Platform, automating privileged access management (PAM), which involves the policies, processes and tools that monitor and protect privileged users and credentials.

Getting Started with OpenShift 4 Security

Why Privileged Access Management Matters

Technologies like cloud infrastructure, virtualization and containerization are being adopted by organizations and their development teams alongside DevOps practices that make the need for security practices based on identity and access management critical. Identity and access management isn't just about employees; it includes managing secrets and access granted to applications and infrastructure resources as well.

A PAM solution ideally handles the following key tasks for your organization:

  • Continuously scan an environment to detect privileged accounts and credentials. 
  • Add accounts to a pending list to validate privileges.
  • Perform automated discovery of privileged accounts and credentials.
  • Provide protected control points to prevent credential exposure and isolate critical assets.
  • Record privileged sessions for audit and forensic purposes.
  • View privileged activity by going directly to specified activities and even keystrokes.

  • Detect anomalous behavior aiming to bypass or circumvent privileged controls, and alert SOC and IT admins to such anomalies.
  • Suspend or terminate privileged sessions automatically based on risk score and activity type.
  • Initiate automatic credential rotation based on risk in the case of compromise or theft.

The common theme in the preceding functions is automation. There’s a reason for that: Automation is not just a “nice to have” feature. It’s absolutely essential to PAM. Large organizations may have thousands of resources that need privileged access, and tens of thousands of employees who may need various levels of privilege to get their work done. Even smaller organizations need to monitor and scale privileged access as they grow. Automated PAM solutions handle the trivial aspects of identity and access management so your team can focus on business goals and critical threats. 

WebLogic Continuous Deployment with Red Hat Ansible Automation Platform

Automation is what you use to:

  • Onboard and discover powerful secrets, where you auto-discover secrets, put them in a designated vault and trigger rotation, just to be on the safe side.
  • Apply compliance standards, such as auto-disabling certain network interfaces. 
  • Harden devices via OS- and network-level controls — like blocking SSH connections as root (see the sketch after this list).
  • Track and maintain configurations.
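
For instance, the “blocking SSH connections as root” item above can be expressed as a short hardening play. This is only a minimal sketch; the config path and service name are common defaults assumed here rather than taken from the text:

- name: Harden SSH by disabling root logins (illustrative sketch)
  hosts: all                     # assumed inventory scope
  become: true
  tasks:
    - name: Ensure root login over SSH is disabled
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PermitRootLogin'
        line: 'PermitRootLogin no'
      notify: Restart sshd
  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted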

And, of course, automation becomes indispensable in the remediation and response (R&R) stage. When you’re under attack, the absolute worst-case scenario is having to undertake manual R&R. We’ve seen many times — as you probably have — that it puts security and operations teams at odds with each other, and makes both of these look at development as a source of continuous trouble. 

Security can, and should, exist as code. Integrating Ansible with CyberArk implements security-as-code, which allows security, operations and developers to work in sync as your “first responder” group, giving them the time and peace of mind to meaningfully respond to the threat — and likely to find a way to prevent it from recurring.

Automatically Respond to Threats

For most teams, keeping a constant watch on every detail of privileged access is unsustainable and hard to scale. The default reaction is often to simply lock down access, making growth and development difficult. PAM automation can make responding to threats much more scalable. Your team can focus on setting identity and access parameters, and let automated tools apply those rules to daily access needs. 

For example, Ansible Automation Platform, working with CyberArk Response Manager (CARM), can respond to threats automatically by managing users, security policies and credentials based on preconfigured parameters. CARM is part of the CyberArk PAM Collection, developed as part of the Ansible security automation initiative. 

At a high level, the CARM algorithm works like this:

1. An event is detected. For example:

  • A user leaves the company
  • User credentials get compromised
  • An email address gets compromised

2. An automated workflow is triggered.

3. A credential is retrieved to authenticate to CyberArk.

4. The relevant module is invoked:

  • cyberark_user
  • cyberark_policy
  • cyberark_account
  • cyberark_credential

5. A remediation is performed through the module.
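
As a hedged sketch of how those five steps can be strung together (the modules come from the cyberark.pas collection discussed below; the host value, account names and the compromised_user variable are purely illustrative):

- name: CARM-style automated remediation (illustrative sketch)
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Step 3 - authenticate to CyberArk
      cyberark_authentication:
        api_base_url: "https://components.cyberark.local"
        username: "automation_user"            # placeholder account
        password: "{{ vault_admin_password }}" # supplied via Ansible Vault

    - name: Steps 4 and 5 - disable the affected user (e.g. a user who left the company)
      cyberark_user:
        username: "{{ compromised_user }}"     # set by the detection workflow
        disabled: true
        state: present
        cyberark_session: "{{ cyberark_session }}"

    - name: Log off from CyberArk
      cyberark_authentication:
        state: absent
        cyberark_session: "{{ cyberark_session }}"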

Depending on the specifics of the detected threat and the CyberArk platform configuration, the security action might, for example, be to:

  • Reset a user’s credentials or disable the user so that the user must reset their password.
  • Enhance or relax a security policy or workflow.
  • Trigger a credential rotation, in which a vaulted credential is rotated.

As your environment goes about its daily business of deploying, testing and updating payloads, as well as processing and maintaining data, security operators can use Ansible to call CARM automatically, and CARM then carries out the required security actions.

Incident Response and Incident Remediation | E5: Ask CyberArk Podcast

Automating threat responses that previously required human intervention now serves as the basis for proactive defense in depth.

Credential retrieval is the first step in many scenarios using Ansible and CARM. This step is performed by the cyberark_credential module of the cyberark.pas Collection. The module can receive credentials from the Central Credential Provider. That way, we can obviate the need to hard code the credential in the environment:

- name: credential retrieval basic
  cyberark_credential:
    api_base_url: "http://10.10.0.1"
    app_id: "TestID"
    query: "Safe=test;UserName=admin"

As can be seen in this example, a target URL needs to be provided in addition to the application ID authorized for retrieving the credential. 

The central parameter is the query: it contains the details of the object actually being queried, in this case the “UserName” and “Safe”. The query parameters depend on the use case, and possible values are “Folder”, “Object”, “Address”, “Database” and “PolicyID”. 

If you are more familiar with the CyberArk API, here is the actual URI request that is created out of these parameter values:

{ api_base_url }/AIMWebService/api/Accounts?AppId={ app_id }&Query={ query }

The return value of the module contains — among other information — the actual credentials, and can be reused in further automation steps.
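
As a small, hedged illustration of that reuse (the register variable name is ours, and the exact structure of the returned object should be checked against the collection documentation):

- name: credential retrieval basic
  cyberark_credential:
    api_base_url: "http://10.10.0.1"
    app_id: "TestID"
    query: "Safe=test;UserName=admin"
  register: cred_result              # holds the API response, including the secret

- name: confirm the lookup succeeded without printing the secret itself
  ansible.builtin.debug:
    msg: "Credential object retrieved for Safe=test"
  when: cred_result is succeeded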

A more production-level approach is to also encrypt the communication to the API via client certificates:

- name: credential retrieval advanced
  cyberark_credential:
    api_base_url: "https://components.cyberark.local"
    validate_certs: yes
    client_cert: /etc/pki/ca-trust/source/client.pem
    client_key: /etc/pki/ca-trust/source/priv-key.pem
    app_id: "TestID"
    query: "Safe=test;UserName=admin"
    connection_timeout: 60
    query_format: Exact
    fail_request_on_password_change: True
    reason: "requesting credential for Ansible deployment"

Now, let’s look at an example where the detected “bad” event requires rotation of account credentials. With the help of the cyberark_account module, we can change the credentials of the compromised account. The module supports account object creation, deletion and modification using the PAS Web Services SDK.

    - name: Rotate credential via reconcile and provide new password
      cyberark_account:
        identified_by: "address,username"
        safe: "Domain_Admins"
        address: "prod.cyberark.local"
        username: "admin"
        platform_id: WinDomain
        platform_account_properties:
          LogonDomain: "PROD"
        secret_management:
          new_secret: "Ama123ah12@#!Xaamdjbdkl@#112"
          management_action: "reconcile"
          automatic_management_enabled: true
        state: present
        cyberark_session: "{{ cyberark_session }}"

In this example, we changed the password for the user “admin”. Note that authentication is handled via the cyberark_session value, which is usually obtained from the cyberark_authentication module.
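
For completeness, here is a minimal sketch of obtaining that session (host name and account are placeholders; in the collection’s documented examples the module exposes the session as the cyberark_session fact used above):

- name: Logon to CyberArk using the PAS Web Services SDK
  cyberark_authentication:
    api_base_url: "https://components.cyberark.local"
    validate_certs: yes
    username: "admin_user"                 # placeholder
    password: "{{ vault_admin_password }}" # supplied via Ansible Vault

A matching task with state: absent logs the session off again once the remediation tasks have finished.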

Ansible Automates 2021: Session 1 - Modern Governance - John Willis

More Information:

https://www.redhat.com/en/resources/cyberark-partner-case-study

https://www.redhat.com/en/technologies/management/ansible

https://www.redhat.com/en/technologies/cloud-computing/openshift/container-platform

https://www.redhat.com/en/technologies/management/ansible/automation-execution-environments

https://www.redhat.com/en/technologies/management/ansible/features


IBM z16 and the Telum Chip


The Other IBM Big Iron That Is On The Horizon

The Hot Chips conference is underway this week; historically held at Stanford University, it is being run virtually this year, as it was last year, thanks to the coronavirus pandemic. There are a lot of new chips being discussed in detail, and one of them is not the forthcoming Power10 chip from IBM, which is expected to make its debut sometime in September and which was one of the hot items at last year’s Hot Chips event.

Previewing IBM Telum Processor

The one processor that IBM is talking about, however, is the “Telum” z16 processor for System z mainframes, and unlike in times past, IBM is revealing the latest of its epically long line of mainframe central processing units (1964 through 2022, and counting) before they are launched in systems rather than after. We happen to think IBM had hoped to ship the Telum processors and their System z16 machines before the end of 2021, and that the move from former foundry partner GlobalFoundries to the 7 nanometer processes of current foundry partner Samsung has delayed the z16 introduction from its usual cadence. As it stands, the z16 chip will come out in early 2022, after the Power10 chips with fat cores (meaning eight threads per core and only 15 cores per chip) come to market. The skinny Power10 cores (four threads per core but 30 cores on a die) used in so-called “scale out” systems are not expected until the second quarter of 2022. It is rough to change foundries, processes and microarchitectures all at the same time, so a delay from the original plan for both z16 and Power10 is to be expected.

It will be up to a judge to accept or reject IBM’s lawsuit against GlobalFoundries, which we talked about back in June, and it will be up to a jury to decide economic damages should Big Blue prevail and win its case in the courts. Or, Intel could buy GlobalFoundries, settle the case, and have IBM as its foundry partner. There are a lot of possible scenarios here. The good news is that IBM and Samsung have been able to get the z16 and Power10 processors designed and are ramping production on the Samsung 7 nanometer process, trying to drive up yields. If IBM could not deliver these chips in the near term, it would not be saying anything at this point, as was the case when the process shrinks for the Power6+ and Power7+ chips were not panning out, for instance.

The Telum z16 processor is interesting from a technology standpoint because it shows what IBM can do and what it might do with future Power processors, and it is important from an economic standpoint because the System z mainframe still accounts for a large percentage of IBM’s revenues and an even larger share of its profits. (It is hard to say with any precision.) As the saying goes around here, anything that makes IBM Systems stronger helps IBM i last longer.

Besides, it is just plain fun to look at enterprise server chips. So, without further ado, take a gander at the Telum z16 processor:

According to Ross Mauri, the general manager of the System z product line, “Telum” refers to one of the weapons sported by Artemis, the Greek goddess of the hunt, known for bow hunting but also for her skill with the javelin. This particular javelin has to hit its enterprise target and help Big Blue maintain its mainframe customer base and make them enthusiastic about investing in new servers. The secret sauce in the Telum chip, as it turns out, will be an integrated AI accelerator chip that was developed by IBM Research and that has been modified and incorporated into the design, thus allowing for machine learning inference algorithms to be run natively and in memory alongside production data and woven into mainframe applications.

This is important, and bodes well for the Power10 chip, which is also getting its own native machine learning inference acceleration, albeit of a different variety. The z16 chip has an on-die mixed-precision accelerator for floating point and integer data, while the Power10 chip has a matrix math overlay for its vector math units. The net result is the same, however: Machine learning inference can stay within the compute and memory footprint of the server, and that means it will not be offloaded to external systems or external GPUs or other kinds of ASICs and will therefore be inside the bulwark of legendary mainframe security. There will be no compliance or regulatory issues because customer data that is feeding the machine learning inference and the response or recommendation from that inference will all be in the same memory space. For this reason, we expect a lot of machine learning inference to stay on the CPUs of enterprise servers, while machine learning training will continue to be offloaded to GPUs and sometimes other kinds of ASICs or accelerators. (FPGAs are a good alternative for inference.)

Partner Preview of Telum 7nm Processor

https://cdnapisec.kaltura.com/index.php/extwidget/preview/partner_id/1773841/uiconf_id/27941801/entry_id/1_zkn3b6gd/embed/dynamic

The Telum chip measures 530 square millimeters in area and weighs in at about 22.5 billion transistors. By Power standards, the z16 cores are big fat ones, with lots of registers, branch target table entries, and such, which is why IBM can only get eight fat cores on that die. The Power10 chip, which we have nicknamed “Cirrus” because IBM had a lame name for it, can get sixteen fat cores (and 32 skinny cores) using the same 7 nanometer transistors on a die that weighs in at 602 square millimeters but has only 18 billion transistors. The Telum chip will have a base clock speed of more than 5 GHz, which is normal for recent vintages of mainframe CPUs.

A whole bunch of things have changed with the Telum design compared to the z14 and z15 designs. IBM has used special versions of the chip called Service Processors, or SPs, to act as external I/O processors, offloading from Central Processors, or CPs, which actually do the compute. With the Telum design, IBM is doing away with this split and tightly coupling the chips together with on-die interconnects, much as it has done with Power processors for many generations. Mainframe processors in recent years also had lots of dedicated L3 cache and an external L4 cache that also housed the system interconnect bus (called the X-Bus). The z15 chip implemented in GlobalFoundries 14 nanometer processes had a dozen cores and 256 MB of L3 cache, plus 4 MB of L2 data cache and 4 MB of L2 instruction cache allocated for each core. Each core had 128 KB of L1 instruction cache and 128 KB of L1 data cache. It ran at 5.2 GHz, and supported up to 40 TB of RAID-protected DDR4 main memory across a complex of 190 active compute processors.

With the z16 design, the cache is being brought way down. Each core has only 32 MB of L2 cache, which is made possible in part because the branch predictors on the front end of the chip have been redesigned. The core has four pipelines and supports SMT2 multithreading, but it doesn’t have physical L3 cache or physical L4 cache any more. Rather, according to Christian Jacobi, distinguished engineer and chief architect of the z16 processor, it implements a virtual 256 MB L3 cache across those physical L2 caches and a virtual 2 GB cache across eight chips in a system drawer. How this cache is all carved up is interesting, and it speaks to the idea that caches often are inclusive anyway (meaning everything in L1 is in L2, everything in L2 is in L3, and everything in L3 is in L4), which is a kind of dark silicon. Why not determine the hierarchy on the fly based on actual workload needs?

To make this virtual L3 cache, there are a pair of 320 GB/sec rings. Two chips are linked together in a single package using a synchronous transfer interface, shown at the bottom two thirds of the Telum chip, and four sockets of these dual-chip modules (DCMs) are interlinked in a flat, all-to-all topology through on-drawer interfaces and fabric controllers, which run across the top of the Telum chip. At the bottom left is the AI Accelerator, which has more than 6 teraflops of mixed precision integer and floating point processing power that is accessible through the z16 instruction set and is not using a weird offload model as is the case with CPUs that offload machine learning inference to GPUs, FPGAs, or custom ASICs. This accelerator, says Jacobi, takes up a little less real estate on the chip than a core does. And clearly, if IBM wanted to raise the ratio, it could add more accelerators. This ratio is interesting in that it shows how much AI inference IBM expects – and its customers expect – to be woven into their applications.

That is the key insight here.

This on-chip AI Accelerator has 128 compute tiles that can do 8-way half precision (FP16) floating point SIMD operations, which is optimized for matrix multiplication and convolutions used in neural network training. The AI Accelerator also has 32 compute tiles that implement 8-way FP16/FP32 units that are optimized for activation functions and more complex operations. The accelerator also has what IBM calls an intelligent prefetcher and write-back block, which can move data to an internal scratchpad at more than 120 GB/sec and that can store data out to the processor caches at more than 80 GB/sec. The two collections of AI math units have what IBM calls an intelligent data mover and formatter that prepares incoming data for compute and then write-back after it has been chewed on by the math units, and this has an aggregate of 600 GB/sec of bandwidth.

That’s an impressive set of numbers for a small block of chips, and a 32-chip complex (four sets of four-sockets of DCMs) can deliver over 200 teraflops of machine learning inference performance. (There doesn’t seem to be INT8 or INT4 integer support on this device, but don’t be surprised if IBM turns it on eventually, thereby doubling and quadrupling the inference performance for some use cases that have relatively coarse data.)

Jacobi says that a z16 socket with an aggregate of 16 cores will deliver 40 percent more performance than a z15 socket, which had 12 cores. If you do the math, 33 percent of that increase came from the core count increase; the rest comes from microarchitecture tweaks and process shrinks. We don’t expect the clock speed to be much more than a few hundred megahertz more than the frequencies used in the z15 chip, in fact. There may be some refinements in the Samsung 7 nanometer process further down the road that allow IBM to crank it up and boost performance with some kickers. The same thing could happen with Power10 chips, by the way.
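
A rough back-of-the-envelope check of that split (our arithmetic, not IBM’s):

\[
\frac{16\ \text{cores}}{12\ \text{cores}} \approx 1.33, \qquad \frac{1.40}{1.33} \approx 1.05,
\]

so the move from 12 to 16 cores accounts for roughly 33 points of the 40 percent gain, leaving only about 5 percent per core for the microarchitecture tweaks and the process shrink.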

One final thought, and it is where the rubber hits the road with this AI Accelerator. A customer in the financial services industry worked with IBM to adapt its recurrent neural network (RNN) to the AI Accelerator, allowing it to do inference on the machine for a credit card fraud model. This workload was simulated on a z16 system simulator, so take it with a grain of salt. It illustrates the principle:

With only one chip, the simulated System z16 machine could handle 116,000 inferences per second with an average latency of 1.1 milliseconds, which is acceptable throughput and latency for a financial transaction not to be stalled by the fraud detection and for it to be done in real time rather than after the fact. With 32 chips in a full System z16 machine, the AI Accelerator could scale linearly, yielding 3.5 million inferences per second with an average latency of 1.2 milliseconds. That’s a scalability factor of 94.3 percent of perfect linear scaling, and we think this has as much to do with the flat, fast topology in the new z16 interconnect and with the flatter cache hierarchy as it has to do with the robustness of the AI Accelerator.
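
The quoted scaling factor is easy to verify (our arithmetic):

\[
116{,}000 \times 32 = 3{,}712{,}000 \ \text{inferences/sec (ideal)}, \qquad \frac{3{,}500{,}000}{3{,}712{,}000} \approx 0.943,
\]

which is where the 94.3 percent figure comes from.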

IBM updates its mainframe processor to help AI

IBM's Telum processor will have on-chip acceleration for artificial intelligence inferencing.

IBM has introduced a new CPU for its Z Series mainframe that’s designed for transactions like banking, trading, insurance, customer interactions, and fraud detection.

The Telum processor was unveiled at the annual Hot Chips conference and has been in development for three years to provide high-volume, real-time inferencing needed for artificial intelligence.

The Telum design is very different from its System z15 predecessor. It features 8 CPU cores, on-chip workload accelerators, and 32MB of what IBM calls Level 2 semi-private cache. The L2 cache is called semi-private because it is used to build a shared virtual 256MB L3 connection between the cores on the chip. This is a 1.5x growth in cache size over the z15.

The CPU comes in a module design that includes two closely coupled Telum processors, so you get 16 cores per socket running at 5GHz. IBM Z systems pack their processors in what are known as drawers, with four sockets per drawer. The Telum processor will be manufactured by Samsung using a 7nm process, as compared to the 14nm process used for the z15 processor.

Stopping Fraud

IBM mainframes are still heavily used in online transaction processing (OLTP) and one of the problems that bedevils OLTP is that fraud usually isn’t caught until after it is committed.

Doing real-time analysis on millions of transactions is just not doable, particularly when fraud analysis and detection is conducted far away from mission-critical transactions and data, IBM says. AI could help, but AI workloads have much larger computational requirements than operating workloads.

“Due to latency requirements, complex fraud detection often cannot be completed in real-time—meaning a bad actor could have already successfully purchased goods with a stolen credit card before the retailer is aware fraud has taken place,” the company said in a blog post announcing Telum.

So the new chip is designed for real-time, AI-specific financial workloads. Just how it will work is not exactly known. Telum-based z16 mainframes are not expected until the second half of 2022.

A brief overview of IBM’s new 7 nm Telum mainframe CPU

A typical Telum-powered mainframe offers 256 cores at a base clock of 5+GHz.

From the perspective of a traditional x86 computing enthusiast—or professional—mainframes are strange, archaic beasts. They're physically enormous, power-hungry, and expensive by comparison to more traditional data-center gear, generally offering less compute per rack at a higher cost.

IBM MainFrame Life Cycle History

This raises the question, "Why keep using mainframes, then?" Once you hand-wave the cynical answers that boil down to "because that's how we've always done it," the practical answers largely come down to reliability and consistency. As AnandTech's Ian Cutress points out in a speculative piece focused on the Telum's redesigned cache, "downtime of these [IBM Z] systems is measured in milliseconds per year." (If true, that's at least seven nines.)

IBM's own announcement of the Telum hints at just how different mainframe and commodity computing's priorities are. It casually describes Telum's memory interface as "capable of tolerating complete channel or DIMM failures, and designed to transparently recover data without impact to response time."

When you pull a DIMM from a live, running x86 server, that server does not "transparently recover data"—it simply crashes.

IBM Z-series architecture

Telum is designed to be something of a one-chip-to-rule-them-all for mainframes, replacing a much more heterogeneous setup in earlier IBM mainframes.

The 14 nm IBM z15 CPU that Telum is replacing features five total processors—two pairs of 12-core Compute Processors and one System Controller. Each Compute Processor hosts 256MiB of L3 cache shared between its 12 cores, while the System Controller hosts a whopping 960MiB of L4 cache shared between the four Compute Processors.

Five of these z15 processors—four Compute Processors and one System Controller—constitute a "drawer." Four drawers come together in a single z15-powered mainframe.

Although the concept of multiple processors to a drawer and multiple drawers to a system remains, the architecture inside Telum itself is radically different—and considerably simplified.

Telum architecture

Telum is somewhat simpler at first glance than z15 was—it's an eight-core processor built on Samsung's 7nm process, with two processors combined on each package (similar to AMD's chiplet approach for Ryzen). There is no separate System Controller processor—all of Telum's processors are identical.

From here, four Telum CPU packages combine to make one four-socket "drawer," and four of those drawers go into a single mainframe system. This provides 256 total cores on 32 CPUs. Each core runs at a base clockrate over 5 GHz—providing more predictable and consistent latency for real-time transactions than a lower base with higher turbo rate would.

Pockets full of cache

Doing away with the central System Controller processor meant redesigning Telum's cache as well—the enormous 960MiB L4 cache is gone, as is the per-die shared L3 cache. In Telum, each individual core has a private 32MiB L2 cache—and that's it. There is no hardware L3 or L4 cache at all.

This is where things get deeply weird—while each Telum core's 32MiB L2 cache is technically private, it's really only virtually private. When a line from one core's L2 cache is evicted, the processor looks for empty space in the other cores' L2. If it finds some, the evicted L2 cache line from core x is tagged as an L3 cache line and stored in core y's L2.

OK, so we have a virtual, shared up-to-256MiB L3 cache on each Telum processor, composed of the 32MiB "private" L2 cache on each of its eight cores. From here, things go one step further—that 256MiB of shared "virtual L3" on each processor can, in turn, be used as shared "virtual L4" among all processors in a system.

Telum's "virtual L4" works largely the same way its "virtual L3" did in the first place—evicted L3 cache lines from one processor look for a home on a different processor. If another processor in the same Telum system has spare room, the evicted L3 cache line gets retagged as L4 and lives in the virtual L3 on the other processor (which is made up of the "private" L2s of its eight cores) instead.

AnandTech's Ian Cutress goes into more detail on Telum's cache mechanisms. He eventually sums them up by answering "How is this possible?" with a simple "magic."

IBM Telum Processor brings deep learning inference to enterprise workloads

AI inference acceleration

IBM's Christian Jacobi briefly outlines Telum's AI acceleration in this two-minute clip.

Telum also introduces a 6TFLOPS on-die inference accelerator. It's intended to be used for—among other things—real-time fraud detection during financial transactions (as opposed to shortly after the transaction).

In the quest for maximum performance and minimal latency, IBM threads several needles. The new inference accelerator is placed on-die, which allows for lower latency interconnects between the accelerator and CPU cores—but it's not built into the cores themselves, a la Intel's AVX-512 instruction set.

The problem with in-core inference acceleration like Intel's is that it typically limits the AI processing power available to any single core. A Xeon core running an AVX-512 instruction only has the hardware inside its own core available to it, meaning larger inference jobs must be split among multiple Xeon cores to extract the full performance available.

Telum's accelerator is on-die but off-core. This allows a single core to run inference workloads with the might of the entire on-die accelerator, not just the portion built into itself.

In a major refresh of its Z Series chips, IBM is adding on-chip AI acceleration capabilities to allow enterprise customers to perform deep learning inferencing while transactions are taking place to capture business insights and fight fraud in real-time.

IBM is set to unveil the latest Z chip Aug. 23 (Monday) at the annual Hot Chips 33 conference, which is being held virtually due to the ongoing COVID-19 pandemic. The company provided advance details in a media pre-briefing last week.

This will be the first Z chip, used in IBM's System Z mainframes, that won't follow a traditional numeric naming pattern used in the past. Instead of following the previous z15 chip with a z16 moniker, the new processor is being called IBM Telum.

This will be IBM's first processor to include on-chip AI acceleration, according to the company. Designed for customers across a wide variety of uses, including banking, finance, trading, insurance applications and customer interactions, the Telum processors have been in development for the past three years. The first Telum-based systems are planned for release in the first half of 2022.

Previewing IBM Telum Processor

Ross Mauri of IBM

One of the major strengths of the new Telum chips is that they are designed to enable applications to run efficiently where their data resides, giving enterprises more flexibility with their most critical workloads, Ross Mauri, the general manager of IBM Z, said in a briefing with reporters on the announcement before Hot Chips.

"From an AI point of view, I have been listening to our clients for several years and they are telling me that they can't run their AI deep learning inferences in their transactions the way they want to," said Mauri. "They really want to bring AI into every transaction. And the types of clients I am talking to are running 1,000, 10,000, 50,000 transactions per second. We are talking high volume, high velocity transactions that are complex, with multiple database reads and writes and full recovery for … transactions in banking, finance, retail, insurance and more."

By integrating mid-transaction AI inferencing into the Telum chips, it will be a huge breakthrough for fraud detection, said Mauri.

"You hear about fraud detection all the time," he said. "Well, I think we are going to be able to move from fraud detection to fraud prevention. I think this is a real game changer when it comes to the business world, and I am really excited about that."

Inside the Telum Architecture

Christian Jacobi of IBM

Christian Jacobi, an IBM distinguished engineer and the chief architect for IBM Z processor design, said that the Telum chip design is specifically optimized to run these kinds of mission critical, heavy duty transaction processing and batch processing workloads, while ensuring top-notch security and availability.

"We have designed this accelerator using the AI core coming from the IBM AI research center," in cooperation with the IBM Research team, the IBM Z chip design team and the IBM Z firmware and software development team, said Jacobi. It is the first IBM chip created using technology from the IBM Research AI hardware center.

"The goal for this processor and the AI accelerator was to enable embedding AI with super low latency directly into the transactions without needing to send the data off-platform, which brings all sorts of latency inconsistencies and security concerns," said Jacobi. "Sending data over a network, oftentimes personal and private data, requires cryptography and auditing of security standards that creates a lot of complexity in an enterprise environment. We have designed this accelerator to operate directly on the data using virtual address mechanisms and the data protection mechanisms that naturally apply to any other thing on the IBM Z processor."

To achieve the low latencies, IBM engineers directly connected the accelerator to the on-chip cache infrastructure, which can directly access model data and transaction data from the caches, he explained. "It enables low batch counts so that we do not have to wait for multiple transactions to arrive of the same model type. All of that is geared towards enabling millisecond range inference tasks so that they can be done as part of the transaction processing without impacting their service level agreements … and have the results available to be able to utilize the AI inference result as part of the transaction processing."

The new Telum chips are manufactured for IBM by Samsung using a 7nm Extreme Ultraviolet Lithography (EUV) process.

The chips have a new design compared to IBM's existing z15 chips, according to Jacobi. Today's z15 chips use single-chip modules, but the Telum chips will use a dual-chip module.

"Four of those modules will be plugged into one drawer, basically a motherboard with four sockets," he said. "In the prior generations, we used two different chip types – a processor chip and a system control chip that contained a physical L4 cache, and that system control hub chip also acted as a hub whenever two processor chips needed to communicate with each other."

Dropping the system control chip enabled the designers to reduce the latencies, he said. "When one chip needs to talk to another chip, we can implement a flat topology, where each of the eight chips in a drawer has a direct connection to the seven other chips in the drawer. That optimizes latencies for cache interactions and memory access across the drawer."

Jacobi said that the Telum chips will provide a 40 percent performance improvement at the socket level over the company's previous z15 chips, which is boosted in part by its extra cores.

Each Telum processor contains eight processor cores and uses a deep super-scalar, out-of-order instruction pipeline. Running at a clock frequency of more than 5GHz, the redesigned cache and chip-interconnection infrastructure provides 32MB of cache per core and can scale to 32 Telum chips. The dual-chip module design contains 22 billion transistors and 19 miles of wire on 17 metal layers.

Introducing the new IBM z15 T02

An Analyst Weighs In

Karl Freund, analyst

Karl Freund, founder and principal analyst of Cambrian AI Research, told EnterpriseAI that the upcoming Telum chips directly target enterprise users with a wide range of useful capabilities.

"In addition to a new cache architecture that will significantly pump-up performance, the Telum processor will provide Z customers with the ability to run real-time AI on the same platform that conducts the mission critical transactions and analyses on which enterprises depend," said Freund. "Until Telum, Z applications could run machine learning on the Z cores, or offload data to another platform for deep learning analysis. The latter introduces unacceptable latencies and significant costs, as well as introducing entirely new and now unnecessary security risk."

For customers, these could be compelling improvements, said Freund.

"I believe Telum provides the biggest reason we have seen in a while for applications to remain in the Z fold," he said. "Honestly, enterprise AI is about to get very real with the Telum processor. After all, the Z is the custodian of some of the most important data in an enterprise. Being able to run deep learning models directly on the Z will unlock tremendous value for Z clients, value that has been hidden until Telum."

What still needs to be determined, he said, is how it all will perform when the development work is completed. "We need to see whether the small accelerator at 'only' 6 TFLOPS can provide adequate processing power to eight very fast cores needing AI services. However, since the data involved here is highly structured numerical floating point or decimal data, instead of images, voice or video, I suspect the accelerator will prove up to the task."

More Details to Come Later

Jacobi said that IBM is not yet providing any additional system performance specifications until the company's engineers complete further optimizations of the firmware and software stacks that will run on the systems.

"We will provide more performance gains out of Optimization across those layers of the entire stack when we get to that point next year," he said.

"I will add that one of the unique things about our design … is that every core when it does AI … are performing a hybrid – they are performing the complex transaction work including the databases and the business logic, and then switch over to perform AI as part of the transaction," said Jacobi. "When they switch over, they can use the entire capacity of the AI accelerator and not just a sliver that is assigned to the core. It is a dedicated piece of silicon on the chip. And the core can access that piece of silicon and use the entire compute capacity of that AI accelerator. That is important for us to achieve the low latencies that we need to make it all fit within the transaction response budget."

About That Z Series Name Change

The upcoming Telum chips will show up in IBM's Z Series and LinuxONE systems as the main processor chip for both product lines, said IBM's Mauri. They are not superseding the Z Series chips, he said.

So, does that mean that z16, z17 and other future Z chips are no longer on the roadmap?

"No, said Mauri. "It just means that we never named our chips before. We are naming it this time because we are proud of the innovation and breakthroughs and think that it is unique in what it does. And that is the only thing. I think z15 will still be around and there will be future generations, many future generations."

Still to Come: New z/OS

To prepare for the arrival of the Telum chips, IBM has already slated the debut of the next version of its updated z/OS 2.5 operating system for Z Series hardware sometime in September, according to an earlier report by The Register. The mainframe operating system is expected to get more AI and hybrid cloud features, as well as expanded co-processor support, based on a preview of the OS that was unveiled in March.

More Information

https://www.ibm.com/blogs/systems/ibm-telum-processor-the-next-gen-microprocessor-for-ibm-z-and-ibm-linuxone/

https://research.ibm.com/blog/telum-processor

https://www.extremetech.com/computing/326402-ibms-new-system-z-cpu-offers-40-percent-more-performance-per-socket-integrated-ai

Restricted Boltzmann machine and Boltzmann brain in AI and learning explained


Restricted Boltzmann machine (RBM)

A restricted Boltzmann machine (RBM) is a type of artificial neural network (ANN) for machine learning of probability distributions. An artificial neural network is a system of hardware and/or software patterned after the operation of neurons in the human brain.

Created by Geoff Hinton, RBM algorithms are useful for dimensionality reduction, classification, regression, collaborative filtering, feature learning and topic modeling. Like perceptrons, they are a relatively simple type of neural network.

RBMs fall into the categories of stochastic and generative models of artificial intelligence. Stochastic refers to anything based on probabilities, and generative means that the model uses AI to produce (generate) a desired output. Generative models contrast with discriminative models, which classify existing data.

Like all multi-layer neural networks, RBMs have layers of artificial neurons, in their case two. The first layer is the input layer. The second is a hidden layer that only accepts what the first layer passes on. The restriction spoken of in RBM is that the different neurons within the same layer can’t communicate with one another. Instead, neurons can only communicate with other layers. (In a standard Boltzmann machine, neurons in the hidden layer intercommunicate.) Each node within a layer performs its own calculations. After performing its calculations, the node then makes a stochastic decision about whether to pass its output on to the next layer.
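
For reference, the standard RBM formulation (not spelled out in the text above) makes that stochastic decision explicit. With visible units v, hidden units h, weight matrix W and bias vectors a and b:

\[
E(v,h) = -a^{\top}v - b^{\top}h - v^{\top}Wh, \qquad P(v,h) = \frac{e^{-E(v,h)}}{Z},
\]

\[
P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}.
\]

Each hidden node switches on with this probability, which is exactly the pass-or-not decision described above, and the restriction (no within-layer connections) is what makes the conditional factorize over the hidden units.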

Though RBMs are still sometimes used, they have mostly been replaced by generative adversarial networks or variational auto-encoders.

Boltzmann Machine -A Probabilistic Graphical Models

Geoffrey Hinton, the “Godfather of Deep Learning,” coined the term Boltzmann machine in 1985. A well-known figure in the deep learning community, Hinton is also a professor at the University of Toronto.

A Boltzmann machine is a kind of stochastic recurrent neural network that is usually interpreted as a probabilistic graphical model. Put concisely, it is a fully connected neural network consisting of visible and hidden units, and it operates asynchronously with stochastic updates for each of its units.

These machines can also be described as probability distributions on high-dimensional binary vectors. A Boltzmann machine is a generative, unsupervised model that learns a probability distribution from an original dataset. It is demanding in terms of computation power, but by restricting its network topology its behaviour can be controlled.

It is an algorithm that is useful for dimensionality reduction, classification, regression, collaborative filtering, feature learning and topic modelling. Like any other neural network, these machines (both BM and RBM) have an input layer, also referred to as the visible layer, and one or several hidden layers, referred to as the hidden layer.

Restricted Boltzmann Machines

Boltzmann machines are probability distributions on high dimensional binary vectors which are analogous to Gaussian Markov Random Fields in that they are fully determined by first and second-order moments.
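
Written out (our notation), that statement says the distribution over binary state vectors x is fixed entirely by a bias vector b (the first-order moments) and a symmetric weight matrix W (the second-order moments):

\[
P(x) = \frac{1}{Z}\exp\Big(b^{\top}x + \tfrac{1}{2}\,x^{\top}Wx\Big), \qquad Z = \sum_{x'}\exp\Big(b^{\top}x' + \tfrac{1}{2}\,x'^{\top}Wx'\Big).
\]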

It is used for pattern storage and retrieval. As Wikipedia puts it, “A Boltzmann machine (also called a stochastic Hopfield network with hidden units) is a type of stochastic recurrent neural network and Markov random field.” The RBM itself has many applications, some of which are listed below:

  • Collaborative filtering
  • Multiclass classification
  • Information retrieval
  • Motion capture modelling
  • Segmentation
  • Modelling natural images

Deep belief nets use the Boltzmann machine, and especially the restricted Boltzmann machine, as a key component, relying only on first-order weight updates.

Lecture 10 Boltzmann machine

Limitations of neural networks grow clearer in business

AI often means neural networks, but intensive training requirements are prompting enterprises to look for alternatives to neural networks that are easier to implement.

The rise in prominence of AI today can be credited largely to improvements in one algorithm category: the neural network. But experts say that the limitations of neural networks mean enterprises will need to embrace a fuller lineup of algorithms to advance AI.

"With neural networks, there's this huge complication," said David Barrett, founder and CEO Expensify Inc. "You end up with trillions dimensions. If you want to change something, you need to start entirely from scratch. The more we tried [neural networks], we still couldn't get them to work."

Neural network technology is seen as cutting-edge today, but the underlying algorithms are nothing new. They were proposed as theoretically possible decades ago.

What's new is that we now have the massive stores of data needed to train algorithms and the robust compute power to process all this data in a reasonable period of time. As neural networks have moved from theoretical to practical, they've come to power some of the most advanced AI applications, like computer vision, language translation and self-driving cars.

Tom Goldstein: "An empirical look at generalization in neural nets"

Training requirements for neural networks are too high

But the problem, as Barrett and others see it, is that neural networks simply require too much brute force. For example, if you show the algorithm a billion example images containing a certain object, it will learn to classify that object in new images effectively. But that's a high bar for training, and meeting that requirement is sometimes impossible.

That was the case for Barrett and his team. At the 2018 Artificial Intelligence Conference in New York, he described how Expensify is using natural language processing to automate customer service for its expense reporting software. Neural networks weren't a good fit for Expensify because the San Francisco company didn't have the corpus of historical data necessary.

Expensify's customer inquiries are often esoteric, Barrett said. Even when customers' concerns map to common problems, their phrasing is unique and, therefore, hard to classify using a system that demands many training examples.

So, Barrett and his team developed their own approach. He didn't identify the specific type of algorithms their tool is based on, but he said it compares pieces of conversations to conversations that have proceeded successfully in the past. It doesn't need to classify queries with precision like a neural network would because it's more focused on moving the conversation along a path rather than delivering the right response to a given query. This gives the bot a chance to ask clarifying questions that reduce ambiguity.

"The challenge of AI is it's built to answer perfectly formed questions," Barrett said. "The challenge of the real world is different."

Deep Boltzmann Machines

A 'broad church' of algorithms is needed in AI

Part of the reason for the enthusiasm around neural network technology is that many people are just finding out about it, said Zoubin Ghahramani, chief scientist at Uber. But for those that have known about and used it for years, the limitations of neural networks are well known.

That doesn't mean it's time for people to ignore neural networks, however. Instead, Ghahramani said it comes down to using the right tool for the right job. He described an approach to incorporating Bayesian inference, in which the estimated probability of something occurring is updated when more evidence becomes available, into machine learning models.

"To have successful AI applications that solve challenging real-world problems, you have to have a broad church of methods," he said in a press conference. "You can't come in with one hammer trying to solve all problems."

Another alternative to neural network technology is deep reinforcement learning, which is optimized to achieve a goal over many steps by incentivizing effective steps and penalizing unfavorable steps. The AlphaGo program, which beat human champions at the game Go, used a combination of neural networks and deep reinforcement learning to learn the game.

Deep reinforcement learning algorithms essentially learn through trial and error, whereas neural networks learn through example. This means deep reinforcement learning requires less labeled training data upfront.

Kathryn Hume, vice president of product and strategy at Integrate.ai Inc., a Toronto-based software company that helps enterprises integrate AI into existing business processes, said any type of model that reduces the reliance on labeled training data is important. She mentioned Bayesian parametric models, which assess the probability of an occurrence based on existing data rather than requiring some minimum threshold of prior examples, which is one of the primary limitations of neural networks.

"We need not rely on just throwing a bunch of information into a pot," she said. "It can move us away from the reliance on labeled training data when we can infer the structure of data," rather than using algorithms like neural networks, which require millions or billions of examples of labeled training before they can make predictions.

What is a Neural Network and How Does it Work?

Research on artificial neural networks was motivated by the observation that human intelligence emerges from highly parallel networks of relatively simple, non-linear neurons that learn by adjusting the strengths of their connections. This observation leads to a central computational question: How is it possible for networks of this general kind to learn the complicated internal representations that are required for difficult tasks such as recognizing objects or understanding language? Deep learning seeks to answer this question by using many layers of activity vectors as representations and learning the connection strengths that give rise to these vectors by following the stochastic gradient of an objective function that measures how well the network is performing. It is very surprising that such a conceptually simple approach has proved to be so effective when applied to large training sets using huge amounts of computation and it appears that a key ingredient is depth: shallow networks simply do not work as well.
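
In symbols, “following the stochastic gradient of an objective function” is the familiar update rule, with parameters θ, learning rate η and objective L evaluated on a random example or mini-batch (x_t, y_t):

\[
\theta_{t+1} = \theta_t - \eta\, \nabla_{\theta} L\big(\theta_t;\, x_t, y_t\big).
\]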


We reviewed the basic concepts and some of the breakthrough achievements of deep learning several years ago.63 Here we briefly describe the origins of deep learning, describe a few of the more recent advances, and discuss some of the future challenges. These challenges include learning with little or no external supervision, coping with test examples that come from a different distribution than the training examples, and using the deep learning approach for tasks that humans solve by using a deliberate sequence of steps which we attend to consciously—tasks that Kahneman56 calls system 2 tasks as opposed to system 1 tasks like object recognition or immediate natural language understanding, which generally feel effortless.

From Hand-Coded Symbolic Expressions to Learned Distributed Representations

There are two quite different paradigms for AI. Put simply, the logic-inspired paradigm views sequential reasoning as the essence of intelligence and aims to implement reasoning in computers using hand-designed rules of inference that operate on hand-designed symbolic expressions that formalize knowledge. The brain-inspired paradigm views learning representations from data as the essence of intelligence and aims to implement learning by hand-designing or evolving rules for modifying the connection strengths in simulated networks of artificial neurons.

In the logic-inspired paradigm, a symbol has no meaningful internal structure: Its meaning resides in its relationships to other symbols which can be represented by a set of symbolic expressions or by a relational graph. By contrast, in the brain-inspired paradigm the external symbols that are used for communication are converted into internal vectors of neural activity and these vectors have a rich similarity structure. Activity vectors can be used to model the structure inherent in a set of symbol strings by learning appropriate activity vectors for each symbol and learning non-linear transformations that allow the activity vectors that correspond to missing elements of a symbol string to be filled in. This was first demonstrated in Rumelhart et al.74 on toy data and then by Bengio et al.14 on real sentences. A very impressive recent demonstration is BERT,22 which also exploits self-attention to dynamically connect groups of units, as described later.

The main advantage of using vectors of neural activity to represent concepts and weight matrices to capture relationships between concepts is that this leads to automatic generalization. If Tuesday and Thursday are represented by very similar vectors, they will have very similar causal effects on other vectors of neural activity. This facilitates analogical reasoning and suggests that immediate, intuitive analogical reasoning is our primary mode of reasoning, with logical sequential reasoning being a much later development,56 which we will discuss.

BOLTZMANN MACHINES

The Rise of Deep Learning

Deep learning re-energized neural network research in the early 2000s by introducing a few elements which made it easy to train deeper networks. The emergence of GPUs and the availability of large datasets were key enablers of deep learning and they were greatly enhanced by the development of open source, flexible software platforms with automatic differentiation such as Theano,16 Torch,25 Caffe,55 TensorFlow,1 and PyTorch.71 This made it easy to train complicated deep nets and to reuse the latest models and their building blocks. But the composition of more layers is what allowed more complex non-linearities and achieved surprisingly good results in perception tasks, as summarized here.

Why depth? Although the intuition that deeper neural networks could be more powerful pre-dated modern deep learning techniques, it was a series of advances in both architecture and training procedures,15,35,48 which ushered in the remarkable advances which are associated with the rise of deep learning. But why might deeper networks generalize better for the kinds of input-output relationships we are interested in modeling? It is important to realize that it is not simply a question of having more parameters, since deep networks often generalize better than shallow networks with the same number of parameters. The practice confirms this. The most popular class of convolutional net architecture for computer vision is the ResNet family of which the most common representative, ResNet-50 has 50 layers. Other ingredients not mentioned in this article but which turned out to be very useful include image deformations, drop-out, and batch normalization.

We believe that deep networks excel because they exploit a particular form of compositionality in which features in one layer are combined in many different ways to create more abstract features in the next layer.

For tasks like perception, this kind of compositionality works very well and there is strong evidence that it is used by biological perceptual systems.

Unsupervised pre-training. When the number of labeled training examples is small compared with the complexity of the neural network required to perform the task, it makes sense to start by using some other source of information to create layers of feature detectors and then to fine-tune these feature detectors using the limited supply of labels. In transfer learning, the source of information is another supervised learning task that has plentiful labels. But it is also possible to create layers of feature detectors without using any labels at all by stacking auto-encoders.

Deep Learning Lecture 10.3 - Restricted Boltzmann Machines

First, we learn a layer of feature detectors whose activities allow us to reconstruct the input. Then we learn a second layer of feature detectors whose activities allow us to reconstruct the activities of the first layer of feature detectors. After learning several hidden layers in this way, we then try to predict the label from the activities in the last hidden layer and we backpropagate the errors through all of the layers in order to fine-tune the feature detectors that were initially discovered without using the precious information in the labels. The pre-training may well extract all sorts of structure that is irrelevant to the final classification but, in the regime where computation is cheap and labeled data is expensive, this is fine so long as the pre-training transforms the input into a representation that makes classification easier.
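
A compact way to write the layer-wise procedure just described (our notation): each layer \(\ell\) learns an encoder \(f_\ell\) and decoder \(g_\ell\) that reconstruct the previous layer’s activities \(a^{(\ell-1)}\), with \(a^{(0)} = x\), after which the whole stack is fine-tuned with backpropagation on the labels:

\[
\min_{f_\ell,\, g_\ell}\; \big\lVert a^{(\ell-1)} - g_\ell\big(f_\ell(a^{(\ell-1)})\big) \big\rVert^2, \qquad a^{(\ell)} = f_\ell\big(a^{(\ell-1)}\big).
\]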

In addition to improving generalization, unsupervised pre-training initializes the weights in such a way that it is easy to fine-tune a deep neural network with backpropagation. The effect of pre-training on optimization was historically important for overcoming the accepted wisdom that deep nets were hard to train, but it is much less relevant now that people use rectified linear units (see next section) and residual connections.43 However, the effect of pre-training on generalization has proved to be very important. It makes it possible to train very large models by leveraging large quantities of unlabeled data, for example, in natural language processing, for which huge corpora are available. The general principle of pre-training and fine-tuning has turned out to be an important tool in the deep learning toolbox, for example, when it comes to transfer learning or even as an ingredient of modern meta-learning.

The mysterious success of rectified linear units. The early successes of deep networks involved unsupervised pre-training of layers of units that used the logistic sigmoid nonlinearity or the closely related hyperbolic tangent. Rectified linear units had long been hypothesized in neuroscience29 and already used in some variants of RBMs70 and convolutional neural networks.54 It was an unexpected and pleasant surprise to discover35 that rectifying non-linearities (now called ReLUs, with many modern variants) made it easy to train deep networks by backprop and stochastic gradient descent, without the need for layerwise pre-training. This was one of the technical advances that enabled deep learning to outperform previous methods for object recognition, as outlined here.
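
The non-linearity in question is simply

\[
\mathrm{ReLU}(x) = \max(0, x),
\]

which passes positive inputs through unchanged and zeroes out negative ones, so gradients flow easily through whichever units are active.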

Breakthroughs in speech and object recognition. An acoustic model converts a representation of the sound wave into a probability distribution over fragments of phonemes. Heroic efforts by Robinson using transputers and by Morgan et al. using DSP chips had already shown that, with sufficient processing power, neural networks were competitive with the state of the art for acoustic modeling. In 2009, two graduate students68 using Nvidia GPUs showed that pre-trained deep neural nets could slightly outperform the SOTA on the TIMIT dataset. This result reignited the interest of several leading speech groups in neural networks. In 2010, essentially the same deep network was shown to beat the SOTA for large vocabulary speech recognition without requiring speaker-dependent training and by 2012, Google had engineered a production version that significantly improved voice search on Android. This was an early demonstration of the disruptive power of deep learning.

Dr. Meir Shimon - ARE YOU A BOLTZMANN BRAIN?

At about the same time, deep learning scored a dramatic victory in the 2012 ImageNet competition, almost halving the error rate for recognizing a thousand different classes of object in natural images.60 The keys to this victory were the major effort by Fei-Fei Li and her collaborators in collecting more than a million labeled images31 for the training set and the very efficient use of multiple GPUs by Alex Krizhevsky. Current hardware, including GPUs, encourages the use of large mini-batches in order to amortize the cost of fetching a weight from memory across many uses of that weight. Pure online stochastic gradient descent which uses each weight once converges faster and future hardware may just use weights in place rather than fetching them from memory.

The deep convolutional neural net contained a few novelties such as the use of ReLUs to make learning faster and the use of dropout to prevent over-fitting, but it was basically just a feed-forward convolutional neural net of the kind that Yann LeCun and his collaborators had been developing for many years.64,65 The response of the computer vision community to this breakthrough was admirable. Given this incontrovertible evidence of the superiority of convolutional neural nets, the community rapidly abandoned previous hand-engineered approaches and switched to deep learning.

Recent Advances

Here we selectively touch on some of the more recent advances in deep learning, clearly leaving out many important subjects, such as deep reinforcement learning, graph neural networks and meta-learning.

Soft attention and the transformer architecture. A significant development in deep learning, especially when it comes to sequential processing, is the use of multiplicative interactions, particularly in the form of soft attention.7,32,39,78 This is a transformative addition to the neural net toolbox, in that it changes neural nets from purely vector transformation machines into architectures which can dynamically choose which inputs they operate on, and can store information in differentiable associative memories. A key property of such architectures is that they can effectively operate on different kinds of data structures including sets and graphs.

Soft attention can be used by modules in a layer to dynamically select which vectors from the previous layer they will combine to compute their outputs. This can serve to make the output independent of the order in which the inputs are presented (treating them as a set) or to use relationships between different inputs (treating them as a graph).

The transformer architecture,85 which has become the dominant architecture in many applications, stacks many layers of "self-attention" modules. Each module in a layer uses a scalar product to compute the match between its query vector and the key vectors of other modules in that layer. The matches are normalized to sum to 1, and the resulting scalar coefficients are then used to form a convex combination of the value vectors produced by the other modules in the previous layer. The resulting vector forms an input for a module of the next stage of computation. Modules can be made multi-headed so that each module computes several different query, key and value vectors, thus making it possible for each module to have several distinct inputs, each selected from the previous stage modules in a different way. The order and number of modules does not matter in this operation, making it possible to operate on sets of vectors rather than single vectors as in traditional neural networks. For instance, a language translation system, when producing a word in the output sentence, can choose to pay attention to the corresponding group of words in the input sentence, independently of their position in the text. While multiplicative gating is an old idea for such things as coordinate transforms and powerful forms of recurrent networks, its recent forms have made it mainstream. Another way to think about attention mechanisms is that they make it possible to dynamically route information through appropriately selected modules and combine these modules in potentially novel ways for improved out-of-distribution generalization.
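
A minimal NumPy sketch of the scalar-product attention step described above (a single head, with the query, key and value projections shown as plain matrices); the shapes and names are illustrative assumptions, not the reference transformer implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)      # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (n_tokens, d_model). Each position emits a query, key and value;
    the query-key matches are normalized to sum to 1 and used to form a
    convex combination of the value vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # pairwise query-key matches
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ V                           # convex combination of values

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))                 # 5 tokens, d_model = 16
Wq, Wk, Wv = (rng.standard_normal((16, 16)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)       # (5, 16)
```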

How a Boltzmann machine models data

We believe that deep networks excel because they exploit a particular form of compositionality in which features in one layer are combined in many different ways to create more abstract features in the next layer.

Transformers have produced dramatic performance improvements that have revolutionized natural language processing,27,32 and they are now being used routinely in industry. These systems are all pre-trained in a self-supervised manner to predict missing words in a segment of text.

Perhaps more surprisingly, transformers have been used successfully to solve integral and differential equations symbolically.62 A very promising recent trend uses transformers on top of convolutional nets for object detection and localization in images with state-of-the-art performance.19 The transformer performs post-processing and object-based reasoning in a differentiable manner, enabling the system to be trained end-to-end.

Unsupervised and self-supervised learning. Supervised learning, while successful in a wide variety of tasks, typically requires a large amount of human-labeled data. Similarly, when reinforcement learning is based only on rewards, it requires a very large number of interactions. These learning methods tend to produce task-specific, specialized systems that are often brittle outside of the narrow domain they have been trained on. Reducing the number of human-labeled samples or interactions with the world that are required to learn a task and increasing the out-of-domain robustness is of crucial importance for applications such as low-resource language translation, medical image analysis, autonomous driving, and content filtering.

Humans and animals seem to be able to learn massive amounts of background knowledge about the world, largely by observation, in a task-independent manner. This knowledge underpins common sense and allows humans to learn complex tasks, such as driving, with just a few hours of practice. A key question for the future of AI is: how do humans learn so much from observation alone?

In supervised learning, a label for one of N categories conveys, on average, at most log2(N) bits of information about the world. In model-free reinforcement learning, a reward similarly conveys only a few bits of information. In contrast, audio, images and video are high-bandwidth modalities that implicitly convey large amounts of information about the structure of the world. This motivates a form of prediction or reconstruction called self-supervised learning which is training to "fill in the blanks" by predicting masked or corrupted portions of the data. Self-supervised learning has been very successful for training transformers to extract vectors that capture the context-dependent meaning of a word or word fragment and these vectors work very well for downstream tasks.
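
The "fill in the blanks" objective can be illustrated with a tiny masking routine that hides a random subset of token ids and keeps the originals as prediction targets. This is a hedged sketch: the reserved MASK_ID value and the masking rate are common conventions assumed here, not details from the text.

```python
import numpy as np

MASK_ID = 0                                    # assumed id reserved for the mask token

def mask_tokens(token_ids, mask_prob=0.15, rng=np.random.default_rng(0)):
    """Return (corrupted_input, targets): targets hold the original ids at the
    masked positions and -1 elsewhere (positions the loss should ignore)."""
    token_ids = np.asarray(token_ids)
    mask = rng.random(token_ids.shape) < mask_prob
    corrupted = np.where(mask, MASK_ID, token_ids)
    targets = np.where(mask, token_ids, -1)
    return corrupted, targets

ids = np.array([17, 42, 8, 99, 5, 23, 61, 7])
print(mask_tokens(ids, mask_prob=0.3))
```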

For text, the transformer is trained to predict missing words from a discrete set of possibilities. But in high-dimensional continuous domains such as video, the set of plausible continuations of a particular video segment is large and complex and representing the distribution of plausible continuations properly is essentially an unsolved problem.

Contrastive learning. One way to approach this problem is through latent variable models that assign an energy (that is, a badness) to examples of a video and a possible continuation.

Given an input video X and a proposed continuation Y, we want a model to indicate whether Y is compatible with X by using an energy function E(X, Y) which takes low values when X and Y are compatible, and higher values otherwise.

E(X, Y) can be computed by a deep neural net which, for a given X, is trained in a contrastive way to give a low energy to values Y that are compatible with X (such as examples of (X, Y) pairs from a training set), and high energy to other values of Y that are incompatible with X. For a given X, inference consists in finding a value Ŷ that minimizes E(X, Y) or perhaps sampling from the Ys that have low values of E(X, Y). This energy-based approach to representing the way Y depends on X makes it possible to model a diverse, multi-modal set of plausible continuations.
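
As a sketch of this energy-based view, the snippet below defines a toy energy function E(X, Y) as a small network and performs inference by scoring a set of candidate continuations and keeping the lowest-energy one; the network, dimensions, and candidates are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 16))              # toy parameters of E(X, Y)
w2 = rng.standard_normal(16)

def energy(x, y):
    """Low values should mean 'y is a compatible continuation of x'."""
    h = np.tanh(np.concatenate([x, y]) @ W1)
    return float(h @ w2)

x = rng.standard_normal(4)                     # the "video so far"
candidates = rng.standard_normal((10, 4))      # proposed continuations
scores = np.array([energy(x, y) for y in candidates])
y_hat = candidates[scores.argmin()]            # inference: argmin over Y of E(X, Y)
print(y_hat)
```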

The key difficulty with contrastive learning is to pick good "negative" samples: suitable points Y whose energy will be pushed up. When the set of possible negative examples is not too large, we can just consider them all. This is what a softmax does, so in this case contrastive learning reduces to standard supervised or self-supervised learning over a finite discrete set of symbols. But in a real-valued high-dimensional space, there are far too many ways a vector Ŷ could be different from Y, and to improve the model we need to focus on those Ys that should have high energy but currently have low energy. Early methods to pick negative samples were based on Monte-Carlo methods, such as contrastive divergence for restricted Boltzmann machines48 and noise-contrastive estimation.

The Deep Learning Revolution

Generative Adversarial Networks (GANs)36 train a generative neural net to produce contrastive samples by applying a neural network to latent samples from a known distribution (for example, a Gaussian). The generator trains itself to produce outputs ŷ to which the model gives low energy. It can do so using backpropagation to get the gradient of E(ŷ) with respect to ŷ. The generator and the model are trained simultaneously, with the model attempting to give low energy to training samples, and high energy to generated contrastive samples.
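
A compact sketch of the adversarial objective just described, written as two loss terms given the discriminator's outputs on real and generated batches; the non-saturating generator loss shown here is a common variant and an assumption of this example, not a detail taken from the article.

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """d_real, d_fake: discriminator scores in (0, 1) for real and generated samples.
    The model gives 'low energy' (high score) to real data; the generator is
    trained so that its samples also receive low energy."""
    disc_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    gen_loss = -np.mean(np.log(d_fake + eps))    # non-saturating generator loss
    return disc_loss, gen_loss

print(gan_losses(np.array([0.9, 0.8]), np.array([0.2, 0.1])))
```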

GANs are somewhat tricky to optimize, but adversarial training ideas have proved extremely fertile, producing impressive results in image synthesis, and opening up many new applications in content creation and domain adaptation34 as well as domain or style transfer.87

Making representations agree using contrastive learning. Contrastive learning provides a way to discover good feature vectors without having to reconstruct or generate pixels. The idea is to learn a feed-forward neural network that produces very similar output vectors when given two different crops of the same image10 or two different views of the same object17 but dissimilar output vectors for crops from different images or views of different objects. The squared distance between the two output vectors can be treated as an energy, which is pushed down for compatible pairs and pushed up for incompatible pairs.
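
The squared-distance energy with a push-down/push-up rule can be written as a margin-based contrastive loss; the sketch below uses an arbitrary margin and is a generic formulation, not the exact loss from any particular paper cited here.

```python
import numpy as np

def contrastive_loss(z1, z2, same, margin=1.0):
    """z1, z2: embedding vectors from the two branches of a Siamese net.
    same=True for two crops/views of the same image, False otherwise."""
    d2 = np.sum((z1 - z2) ** 2)                       # squared distance = energy
    if same:
        return d2                                     # push the energy down
    return max(0.0, margin - np.sqrt(d2)) ** 2        # push it up, up to a margin

z_a, z_b = np.array([0.1, 0.9]), np.array([0.2, 0.8])
print(contrastive_loss(z_a, z_b, same=True))
print(contrastive_loss(z_a, z_b, same=False))
```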

A series of recent papers that use convolutional nets for extracting representations that agree have produced promising results in visual feature learning. The positive pairs are composed of different versions of the same image that are distorted through cropping, scaling, rotation, color shift, blurring, and so on. The negative pairs are similarly distorted versions of different images, which may be cleverly picked from the dataset through a process called hard negative mining or may simply be all of the distorted versions of other images in a minibatch. The hidden activity vector of one of the higher-level layers of the network is subsequently used as input to a linear classifier trained in a supervised manner. This Siamese net approach has yielded excellent results on standard image recognition benchmarks.6 Very recently, two Siamese net approaches have managed to eschew the need for contrastive samples. The first one, dubbed SwAV, quantizes the output of one network to train the other network;20 the second one, dubbed BYOL, smooths the weight trajectory of one of the two networks, which is apparently enough to prevent a collapse.

Restricted Boltzmann machine - definition

Variational auto-encoders. A popular recent self-supervised learning method is the Variational Auto-Encoder (VAE).58 This consists of an encoder network that maps the image into a latent code space and a decoder network that generates an image from a latent code. The VAE limits the information capacity of the latent code by adding Gaussian noise to the output of the encoder before it is passed to the decoder. This is akin to packing small noisy spheres into a larger sphere of minimum radius. The information capacity is limited by how many noisy spheres fit inside the containing sphere. The noisy spheres repel each other because a good reconstruction error requires a small overlap between codes that correspond to different samples. Mathematically, the system minimizes a free energy obtained through marginalization of the latent code over the noise distribution. However, minimizing this free energy with respect to the parameters is intractable, and one has to rely on variational approximation methods from statistical physics that minimize an upper bound of the free energy.
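
A minimal sketch of the quantity a VAE optimizes: reconstruction error plus the KL term that plays the role of the noisy-sphere capacity limit, together with the reparameterization step that adds Gaussian noise to the encoder output. The diagonal-Gaussian code and squared-error reconstruction are illustrative simplifications, not the article's exact formulation.

```python
import numpy as np

def vae_free_energy(x, x_recon, mu, log_var):
    """Negative ELBO for a diagonal-Gaussian latent code.
    recon: how well the decoder reproduces x from the noisy code.
    kl:    how far the code distribution is from the unit-Gaussian prior."""
    recon = np.sum((x - x_recon) ** 2)
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))
    return recon + kl

def sample_latent(mu, log_var, rng=np.random.default_rng(0)):
    # "Adding Gaussian noise to the output of the encoder" (reparameterization)
    return mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)

mu, log_var = np.zeros(3), np.zeros(3)
x = np.ones(5)
x_recon = 0.9 * x
print(vae_free_energy(x, x_recon, mu, log_var))
```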

The Future of Deep Learning

The performance of deep learning systems can often be dramatically improved by simply scaling them up. With a lot more data and a lot more computation, they generally work a lot better. The language model GPT-3,18 with 175 billion parameters (which is still tiny compared with the number of synapses in the human brain), generates noticeably better text than GPT-2, which has only 1.5 billion parameters. The chatbots Meena2 and BlenderBot73 also keep improving as they get bigger. Enormous effort is now going into scaling up, and it will improve existing systems a lot, but there are fundamental deficiencies of current deep learning that cannot be overcome by scaling alone, as discussed here.

Comparing human learning abilities with current AI suggests several directions for improvement:

Supervised learning requires too much labeled data and model-free reinforcement learning requires far too many trials. Humans seem to be able to generalize well with far less experience.

Current systems are not as robust to changes in distribution as humans, who can quickly adapt to such changes with very few examples.

Current deep learning is most successful at perception tasks and generally what are called system 1 tasks. Using deep learning for system 2 tasks that require a deliberate sequence of steps is an exciting area that is still in its infancy.

What needs to be improved. From the early days, theoreticians of machine learning have focused on the iid assumption, which states that the test cases are expected to come from the same distribution as the training examples. Unfortunately, this is not a realistic assumption in the real world: just consider the non-stationarities due to actions of various agents changing the world, or the gradually expanding mental horizon of a learning agent which always has more to learn and discover. As a practical consequence, the performance of today's best AI systems tends to take a hit when they go from the lab to the field.

Our desire to achieve greater robustness when confronted with changes in distribution (called out-of-distribution generalization) is a special case of the more general objective of reducing sample complexity (the number of examples needed to generalize well) when faced with a new task—as in transfer learning and lifelong learning81—or simply with a change in distribution or in the relationship between states of the world and rewards. Current supervised learning systems require many more examples than humans (when having to learn a new task) and the situation is even worse for model-free reinforcement learning23 since each rewarded trial provides less information about the task than each labeled example. It has already been noted61,76 that humans can generalize in a way that is different and more powerful than ordinary iid generalization: we can correctly interpret novel combinations of existing concepts, even if those combinations are extremely unlikely under our training distribution, so long as they respect high-level syntactic and semantic patterns we have already learned. Recent studies help us clarify how different neural net architectures fare in terms of this systematic generalization ability. How can we design future machine learning systems with these abilities to generalize better or adapt faster out-of-distribution?

Lecture 12.3 — Restricted Boltzmann Machines — [ Deep Learning | Geoffrey Hinton | UofT ]

From homogeneous layers to groups of neurons that represent entities. Evidence from neuroscience suggests that groups of nearby neurons (forming what is called a hyper-column) are tightly connected and might represent a kind of higher-level vector-valued unit able to send not just a scalar quantity but rather a set of coordinated values. This idea is at the heart of the capsule architectures,47,59 and it is also inherent in the use of soft-attention mechanisms, where each element in the set is associated with a vector, from which one can read a key vector and a value vector (and sometimes also a query vector). One way to think about these vector-level units is as representing the detection of an object along with its attributes (like pose information, in capsules). Recent papers in computer vision are exploring extensions of convolutional neural networks in which the top level of the hierarchy represents a set of candidate objects detected in the input image, and operations on these candidates are performed with transformer-like architectures. Neural networks that assign intrinsic frames of reference to objects and their parts and recognize objects by using the geometric relationships between parts should be far less vulnerable to directed adversarial attacks,79 which rely on the large difference between the information used by people and that used by neural nets to recognize objects.

Multiple time scales of adaptation. Most neural nets have only two timescales: the weights adapt slowly over many examples and the activities adapt rapidly with each new input. Adding an overlay of rapidly adapting and rapidly decaying "fast weights"49 introduces interesting new computational abilities. In particular, it creates a high-capacity, short-term memory,4 which allows a neural net to perform true recursion, in which the same neurons can be reused in a recursive call because their activity vector in the higher-level call can be reconstructed later using the information in the fast weights. Multiple time scales of adaptation also arise in learning to learn, or meta-learning.

Higher-level cognition. When thinking about a new challenge, such as driving in a city with unusual traffic rules, or even imagining driving a vehicle on the moon, we can take advantage of pieces of knowledge and generic skills we have already mastered and recombine them dynamically in new ways. This form of systematic generalization allows humans to generalize fairly well in contexts that are very unlikely under their training distribution. We can then further improve with practice, fine-tuning and compiling these new skills so they do not need conscious attention anymore. How could we endow neural networks with the ability to adapt quickly to new settings by mostly reusing already known pieces of knowledge, thus avoiding interference with known skills? Initial steps in that direction include Transformers32 and Recurrent Independent Mechanisms.

It seems that our implicit (system 1) processing abilities allow us to guess potentially good or dangerous futures, when planning or reasoning. This raises the question of how system 1 networks could guide search and planning at the higher (system 2) level, maybe in the spirit of the value functions which guide Monte-Carlo tree search for AlphaGo.77

Machine learning research relies on inductive biases or priors in order to encourage learning in directions which are compatible with some assumptions about the world. The nature of system 2 processing and cognitive neuroscience theories for them5,30 suggest several such inductive biases and architectures,11,45 which may be exploited to design novel deep learning systems. How do we design deep learning architectures and training frameworks which incorporate such inductive biases?

The ability of young children to perform causal discovery37 suggests this may be a basic property of the human brain, and recent work suggests that optimizing out-of-distribution generalization under interventional changes can be used to train neural networks to discover causal dependencies or causal variables. How should we structure and train neural nets so they can capture these underlying causal properties of the world?

How are the directions suggested by these open questions related to the symbolic AI research program from the 20th century? Clearly, this symbolic AI program aimed at achieving system 2 abilities, such as reasoning, being able to factorize knowledge into pieces which can easily be recombined in a sequence of computational steps, and being able to manipulate abstract variables, types, and instances. We would like to design neural networks which can do all these things while working with real-valued vectors so as to preserve the strengths of deep learning, which include efficient large-scale learning using differentiable computation and gradient-based adaptation, grounding of high-level concepts in low-level perception and action, handling uncertain data, and using distributed representations.

Introduction to Restricted Boltzmann Machines.

Invented by Geoffrey Hinton (sometimes referred to as the Godfather of Deep Learning), a Restricted Boltzmann Machine is an algorithm useful for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling.

Before moving forward, let us first understand what Boltzmann Machines are.

What are Boltzmann Machines?

A Boltzmann machine is a stochastic (non-deterministic), generative deep learning model that has only visible (input) and hidden nodes.

The image below presents ten nodes, all of which are inter-connected; they are also often referred to as states. Brown ones represent hidden nodes (h) and blue ones represent visible nodes (v). If you are already familiar with artificial, convolutional, and recurrent neural networks, you will notice that their input nodes are never connected to each other, whereas a Boltzmann Machine's input nodes are connected, and that is what makes it fundamentally unconventional. Because all these nodes exchange information among themselves and can self-generate subsequent data, the Boltzmann Machine is termed a generative deep model.

There is no output node in this model, so unlike our usual classifiers we cannot make it learn a 1 or 0 from the target variable of the training dataset by applying gradient descent or stochastic gradient descent (SGD). The same holds for our regressor models: there is no target variable whose pattern the model could learn. These attributes make the model non-deterministic. So how does this model learn and predict?

Here, visible nodes are what we measure and hidden nodes are what we do not measure. When we input data, these nodes learn the parameters, the patterns, and the correlations between the variables on their own and form an efficient system; hence the Boltzmann Machine is termed an unsupervised deep learning model. Once trained, the model can, for example, monitor a system and flag abnormal behavior based on what it has learned.

Hinton once used the illustration of a nuclear power plant as an example for understanding Boltzmann Machines. This is a complex topic, so we shall proceed slowly, building the intuition behind each concept with a minimum amount of mathematics and physics involved.

So, in the simplest introductory terms, Boltzmann Machines belong to the family of Energy-based Models (EBMs), and their most widely used variant is the Restricted Boltzmann Machine (RBM). When RBMs are stacked on top of each other, they form a Deep Belief Network (DBN).

What are Restricted Boltzmann Machines?

A Restricted Boltzmann Machine (RBM) is a generative, stochastic, and 2-layer artificial neural network that can learn a probability distribution over its set of inputs.

Stochastic means “randomly determined”, and in RBMs, the coefficients that modify inputs are randomly initialized.

The first layer of the RBM is called the visible, or input layer, and the second is the hidden layer. Each circle represents a neuron-like unit called a node. Each node in the input layer is connected to every node of the hidden layer.

The restriction in a Restricted Boltzmann Machine is that there is no intra-layer communication (nodes of the same layer are not connected). This restriction allows for more efficient training algorithms than what is available for the general class of Boltzmann machines, in particular, the gradient-based contrastive divergence algorithm. Each node is a locus of computation that processes input and begins by making stochastic decisions about whether to transmit that input or not.

RBMs received a lot of attention after being proposed as the building blocks of multi-layer learning architectures called Deep Belief Networks (DBNs): stacking RBMs on top of each other yields a DBN.

Difference between Autoencoders & RBMs

An autoencoder is a simple 3-layer neural network whose output units are connected back to the input units, that is, the output layer has the same size as the input layer and the network is trained to reproduce its input. Typically, the number of hidden units is much smaller than the number of visible ones. The training task is to minimize the reconstruction error, i.e. to find the most efficient compact representation of the input data.
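
For concreteness, here is a tiny NumPy autoencoder with a narrow hidden layer and its reconstruction error; the layer sizes and tanh activation are arbitrary choices for the sketch, and training (adjusting the weights to minimize that error) is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 8, 3                       # hidden layer much smaller than the input

W_enc = rng.standard_normal((n_visible, n_hidden)) * 0.1
W_dec = rng.standard_normal((n_hidden, n_visible)) * 0.1

def autoencode(x):
    """Compress x into a small code and reconstruct it; training would
    adjust W_enc and W_dec to minimize the reconstruction error below."""
    code = np.tanh(x @ W_enc)                    # compact representation
    x_recon = code @ W_dec
    return x_recon, np.mean((x - x_recon) ** 2)  # reconstruction error

x = rng.standard_normal(n_visible)
print(autoencode(x)[1])
```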

Working of Restricted Boltzmann Machine

One aspect that distinguishes an RBM from other neural networks is that it has two sets of biases:

  • The hidden bias helps the RBM produce the activations on the forward pass, while
  • the visible layer's biases help the RBM learn the reconstructions on the backward pass.

The reconstructed input is always different from the actual input, as there are no connections among the visible nodes and therefore no way for them to transfer information among themselves.

The above image shows the first step in training an RBM with multiple inputs. The inputs are multiplied by the weights and then added to the bias. The result is passed through a sigmoid activation function, and the output determines whether the hidden state gets activated or not. The weights form a matrix with the number of input nodes as the number of rows and the number of hidden nodes as the number of columns. The first hidden node receives the dot product of the input vector with the first column of weights, before the corresponding bias term is added to it, as sketched below.
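
The step just described (inputs times weights, plus the hidden bias, through a sigmoid, then a stochastic activation), together with the reconstruction pass that uses the visible bias, can be sketched as follows; the layer sizes and the Bernoulli sampling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 4

W = rng.standard_normal((n_visible, n_hidden)) * 0.1   # rows: input nodes, cols: hidden nodes
b_hidden = np.zeros(n_hidden)                          # hidden bias (forward pass)
b_visible = np.zeros(n_visible)                        # visible bias (reconstruction)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v):
    p_h = sigmoid(v @ W + b_hidden)            # probability each hidden node activates
    h = (rng.random(n_hidden) < p_h) * 1.0     # stochastic binary activation
    return p_h, h

def reconstruct(h):
    return sigmoid(h @ W.T + b_visible)        # backward pass: reconstructed visible units

v = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 1.0])
p_h, h = forward(v)
print(reconstruct(h))
```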

More Information:

https://medium.com/edureka/restricted-boltzmann-machine-tutorial-991ae688c154

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6997788/

https://www.frontiersin.org/articles/10.3389/fphar.2019.01631/full

https://cacm.acm.org/magazines/2021/7/253464-deep-learning-for-ai/fulltext

https://www.theaidream.com/post/introduction-to-restricted-boltzmann-machines-rbms

https://onlinelibrary.wiley.com/doi/abs/10.1207/s15516709cog0901_7

https://vinodsblog.com/2020/07/28/deep-learning-introduction-to-boltzmann-machines/

http://www.cs.toronto.edu/~hinton/papers.html

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6997788/pdf/fphar-10-01631.pdf

https://research.google.com/pubs/GeoffreyHinton.html?source=post_page---------------------------

