Transistor-like Qubits Hit Key Benchmark



A team in Australia has recently demonstrated a key advance in metal-oxide-semiconductor-based (or MOS-based) quantum computers. They showed that their two-qubit gates—logical operations that involve more than one quantum bit, or qubit—perform without errors 99 percent of the time. This number is important, because it is the baseline necessary to perform error correction, which is believed to be necessary to build a large-scale quantum computer. What’s more, these MOS-based quantum computers are compatible with existing CMOS technology, which will make it more straightforward to manufacture a large number of qubits on a single chip than with other techniques.

“Getting over 99 percent is significant because that is considered by many to be the error correction threshold, in the sense that if your fidelity is lower than 99 percent, it doesn’t really matter what you’re going to do in error correction,” says Yuval Boger, CCO of quantum computing company QuEra, who wasn’t involved in the work. “You’re never going to fix errors faster than they accumulate.”
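To get a feel for why gate fidelity matters so much, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that gate errors are independent, so the chance that a circuit of n two-qubit gates runs with no error at all is the fidelity raised to the power n. The function name and gate counts are made up for this example and do not come from the team's work.

```python
def error_free_probability(fidelity: float, num_gates: int) -> float:
    """Chance that every gate succeeds, assuming independent gate errors."""
    return fidelity ** num_gates


# Compare 99 percent fidelity with a hypothetical 99.9 percent device.
for fidelity in (0.99, 0.999):
    for num_gates in (10, 100, 1000):
        p = error_free_probability(fidelity, num_gates)
        print(f"fidelity={fidelity}, gates={num_gates}: {p:.4f}")
```

Under this simple model, only about 37 percent of 100-gate runs at 99 percent fidelity finish without a single error, which is why deep circuits depend on error correction.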

There are many contending platforms in the race to build a useful quantum computer. IBM, Google, and others are building their machines out of superconducting qubits. Quantinuum and IonQ use individual trapped ions. QuEra and Atom Computing use neutral atoms. Xanadu and PsiQuantum are betting on photons. The list goes on.

In the new result, a collaboration between the University of New South Wales (UNSW) and Sydney-based startup Diraq, with contributors from Japan, Germany, Canada, and the U.S., has taken yet another approach: trapping single electrons in MOS devices. “What we are trying to do is we are trying to make qubits that are as close to traditional transistors as they can be,” says Tuomo Tanttu, a research fellow at UNSW who led the effort.

Qubits That Act Like Transistors

These qubits are indeed very similar to a regular transistor, gated in such a way as to have only a single electron in the channel. The biggest advantage of this approach is that it can be manufactured using traditional CMOS technologies, making it theoretically possible to scale to millions of qubits on a single chip. Another advantage is that MOS qubits can be integrated on-chip with standard transistors for simplified input, output, and control, says Diraq CEO Andrew Dzurak.

The drawback of this approach, however, is that MOS qubits have historically suffered from device-to-device variability, causing significant noise on the qubits.

“The sensitivity in [MOS] qubits is going to be more than in transistors, because in transistors, you still have 20, 30, 40 electrons carrying the current. In a qubit device, you’re really down to a single electron,” says Ravi Pillarisetty, a senior device engineer for Intel quantum hardware who wasn’t involved in the work.

The team’s result not only demonstrated 99 percent accuracy on the two-qubit gates of the test devices, but also helped the researchers better understand the sources of device-to-device variability. The team tested three devices with three qubits each. In addition to measuring the error rate, they also performed comprehensive studies to glean the underlying physical mechanisms that contribute to noise.

The researchers found that one of the sources of noise was isotopic impurities in the silicon layer, which, when controlled, greatly reduced the circuit complexity necessary to run the device. The next leading cause of noise was small variations in electric fields, likely due to imperfections in the oxide layer of the device. Tanttu says this is likely to improve by transitioning from a laboratory clean room to a foundry environment.

“It’s a great result and great progress. And I think it’s setting the right direction for the community in terms of thinking less about one individual device, or demonstrating something on an individual device, versus thinking more longer term about the scaling path,” Pillarisetty says.

Now, the challenge will be to scale up these devices to more qubits. One difficulty with scaling is the number of input/output channels required. The quantum team at Intel, which is pursuing a similar technology, has recently pioneered a chip it calls Pando Tree to try to address this issue. Pando Tree will be on the same plane as the quantum processor, enabling faster inputs and outputs to the qubits. The Intel team hopes to use it to scale to thousands of qubits. “A lot of our approach is thinking about, how do we make our qubit processor look more like a modern CPU?” says Pillarisetty.

Similarly, Diraq CEO Dzurak says his team plans to scale its technology to thousands of qubits in the near future through a recently announced partnership with GlobalFoundries. “With GlobalFoundries, we designed a chip that will have thousands of these [MOS qubits]. And these will be interconnected by using classical transistor circuitry that we designed. This is unprecedented in the quantum computing world,” Dzurak says.

NIST Announces Post-Quantum Cryptography Standards



Today, almost all data on the Internet, including bank transactions, medical records, and secure chats, is protected with an encryption scheme called RSA (named after its creators Rivest, Shamir, and Adleman). This scheme is based on a simple fact—it is virtually impossible to calculate the prime factors of a large number in a reasonable amount of time, even on the world’s most powerful supercomputer. Unfortunately, large quantum computers, if and when they are built, would find this task a breeze, thus undermining the security of the entire Internet.

Luckily, quantum computers are only better than classical ones at a select class of problems, and there are plenty of encryption schemes where quantum computers don’t offer any advantage. Today, the U.S. National Institute of Standards and Technology (NIST) announced the standardization of three post-quantum cryptography schemes. With these standards in hand, NIST is encouraging computer system administrators to begin transitioning to post-quantum security as soon as possible.

These standards are likely to be a big element of the Internet’s future. NIST’s previous cryptography standards, developed in the 1970s, are used in almost all devices, including Internet routers, phones, and laptops, says Lily Chen, head of the cryptography group at NIST, who led the standardization process. But adoption will not happen overnight.

“Today, public key cryptography is used everywhere in every device,” Chen says. “Now our task is to replace the protocol in every device, which is not an easy task.”

Why we need post-quantum cryptography now

Most experts believe large-scale quantum computers won’t be built for at least another decade. So why is NIST worried about this now? There are two main reasons.

First, many devices that use RSA security, like cars and some IoT devices, are expected to remain in use for at least a decade. So they need to be equipped with quantum-safe cryptography before they are released into the field.

Second, a nefarious individual could potentially download and store encrypted data today, and decrypt it once a large enough quantum computer comes online. This concept is called “harvest now, decrypt later,” and by its nature, it poses a threat to sensitive data now, even if that data can only be cracked in the future.

Security experts in various industries are starting to take the threat of quantum computers seriously, says Joost Renes, principal security architect and cryptographer at NXP Semiconductors. “Back in 2017, 2018, people would ask ‘What’s a quantum computer?’” Renes says. “Now, they’re asking ‘When will the PQC standards come out and which one should we implement?’”

Richard Marty, chief technology officer at LGT Financial Services, agrees. “For us, it’s not an option to just wait and see what happens. We want to be ready and implement solutions as soon as possible, to avoid harvest now and decrypt later.”

NIST’s competition for the best quantum-safe algorithm

NIST announced a public competition for the best PQC algorithm back in 2016. It received a whopping 82 submissions from teams in 25 different countries. Since then, NIST has gone through four elimination rounds, finally whittling the pool down to four algorithms in 2022.

This lengthy process was a community-wide effort, with NIST taking input from the cryptographic research community, industry, and government stakeholders. “Industry has provided very valuable feedback,” says NIST’s Chen.

These four winning algorithms had intense-sounding names: CRYSTALS-Kyber, CRYSTALS-Dilithium, SPHINCS+, and FALCON. Sadly, the names did not survive standardization: The algorithms are now known as Federal Information Processing Standard (FIPS) 203 through 206. FIPS 203, 204, and 205 are the focus of today’s announcement from NIST. FIPS 206, the algorithm previously known as FALCON, is expected to be standardized in late 2024.

The algorithms fall into two categories: general encryption, used to protect information transferred via a public network, and digital signature, used to authenticate individuals. Digital signatures are essential for preventing malware attacks, says Chen.

Every cryptography protocol is based on a math problem that’s hard to solve but easy to check once you have the correct answer. For RSA, it’s factoring large numbers into two primes—it’s hard to figure out what those two primes are (for a classical computer), but once you have one it’s straightforward to divide and get the other.
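The asymmetry between solving and checking can be sketched with a toy version of RSA. The primes, exponent, and message below are tiny illustrative values chosen for this example (real keys use primes hundreds of digits long), and the brute-force factoring helper is a hypothetical stand-in for an attacker, not anything from a real library.

```python
from math import gcd

# Toy primes, far too small for real security (assumption for illustration).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # kept secret; easy to compute only if p and q are known
e = 17                         # public exponent, must be coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

message = 65
ciphertext = pow(message, e, n)      # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)    # decrypt with the private key (d, n)


def factor_by_trial_division(modulus: int) -> tuple:
    """Brute-force search for a factor; hopeless at real key sizes."""
    candidate = 2
    while candidate * candidate <= modulus:
        if modulus % candidate == 0:
            return candidate, modulus // candidate
        candidate += 1
    raise ValueError("modulus is prime")


# Checking is easy: given one prime, a single division yields the other.
assert n % p == 0 and n // p == q
print(factor_by_trial_division(n), recovered == message)
```

Trial division works instantly here only because the modulus is four digits long. The search grows exponentially with the number of digits, which is the gap RSA relies on.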

Two out of the three schemes already standardized by NIST, FIPS 203 and FIPS 204 (as well as the upcoming FIPS 206), are based on another hard problem, called lattice cryptography. Lattice cryptography rests on the tricky problem of finding the least common multiple among a set of numbers. Usually, this is implemented in many dimensions, or on a lattice, where the least common multiple is a vector.

The third standardized scheme, FIPS 205, is based on hash functions, which convert a message to a fixed-length string that’s difficult to reverse.
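As an illustration of how a digital signature can be built from nothing but a hash function, here is a minimal sketch of a Lamport one-time signature, a classic textbook construction. It is not the FIPS 205 algorithm, which is considerably more elaborate, and the function names are chosen for this example only.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    """SHA-256: easy to compute, believed infeasible to reverse."""
    return hashlib.sha256(data).digest()


def keygen():
    # 256 pairs of random secrets; the public key is the hash of each secret.
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk


def message_bits(message: bytes):
    # The 256 bits of the message digest, most significant bit first.
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, message: bytes):
    # Reveal one secret per digest bit. A key pair must sign only one message.
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    # Hash each revealed secret and compare it with the published value.
    bits = message_bits(message)
    return all(H(sig) == pk[i][bit] for i, (sig, bit) in enumerate(zip(signature, bits)))


sk, pk = keygen()
signature = sign(sk, b"hello")
print(verify(pk, b"hello", signature), verify(pk, b"tampered", signature))
```

Forging a signature would require finding an input that hashes to a published value, which is exactly the reversal that a good hash function makes impractical.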

The standards include the encryption algorithms’ computer code, instructions for how to implement it, and intended uses. There are three levels of security for each protocol, designed to future-proof the standards in case some weaknesses or vulnerabilities are found in the algorithms.

Lattice cryptography survives alarms over vulnerabilities

Earlier this year, a pre-print published to the arXiv alarmed the PQC community. The paper, authored by Yilei Chen of Tsinghua University in Beijing, claimed to show that lattice-based cryptography, the basis of two out of the three NIST protocols, was not, in fact, immune to quantum attacks. On further inspection, Yilei Chen’s argument turned out to have a flaw—and lattice cryptography is still believed to be secure against quantum attacks.

On the one hand, this incident highlights the central problem at the heart of all cryptography schemes: There is no proof that any of the math problems the schemes are based on are actually “hard.” The only proof, even for the standard RSA algorithms, is that people have been trying to break the encryption for a long time, and have all failed. Since post-quantum cryptography standards, including lattice cryptography, are newer, there is less certainty that no one will find a way to break them.

That said, the failure of this latest attempt only builds on the algorithm’s credibility. The flaw in the paper’s argument was discovered within a week, signaling that there is an active community of experts working on this problem. “The result of that paper is not valid, that means the pedigree of the lattice-based cryptography is still secure,” says NIST’s Lily Chen (no relation to Tsinghua University’s Yilei Chen). “People have tried hard to break this algorithm. A lot of people are trying, they try very hard, and this actually gives us confidence.”

NIST’s announcement is exciting, but the work of transitioning all devices to the new standards has only just begun. It is going to take time, and money, to fully protect the world from the threat of future quantum computers.

“We’ve spent 18 months on the transition and spent about half a million dollars on it,” says Marty of LGT Financial Services. “We have a few instances of [PQC], but for a full transition, I couldn’t give you a number, but there’s a lot to do.”

Giant Chips Give Supercomputers a Run for Their Money



As large supercomputers keep getting larger, Sunnyvale, California-based Cerebras has been taking a different approach. Instead of connecting more and more GPUs together, the company has been squeezing as many processors as it can onto one giant wafer. The main advantage is in the interconnects—by wiring processors together on-chip, the wafer-scale chip bypasses many of the computational speed losses that come from many GPUs talking to each other, as well as losses from loading data to and from memory.

Now, Cerebras has flaunted the advantages of its wafer-scale chips in two separate but related results. First, the company demonstrated that its second-generation wafer-scale engine, WSE-2, was significantly faster than the world’s fastest supercomputer, Frontier, in molecular dynamics calculations, the field that underlies protein folding, the modeling of radiation damage in nuclear reactors, and other problems in materials science. Second, in collaboration with machine learning model optimization company Neural Magic, Cerebras demonstrated that a sparse large language model could perform inference at one-third of the energy cost of a full model without losing any accuracy. Although the results are in vastly different fields, they were both possible because of the interconnects and fast memory access enabled by Cerebras’ hardware.

Speeding Through the Molecular World

“Imagine there’s a tailor and he can make a suit in a week,” says Cerebras CEO and co-founder Andrew Feldman. “He buys the neighboring tailor, and she can also make a suit in a week, but they can’t work together. Now, they can make two suits in a week. But what they can’t do is make a suit in three and a half days.”

According to Feldman, GPUs are like tailors that can’t work together, at least when it comes to some problems in molecular dynamics. As you connect more and more GPUs, they can simulate more atoms at the same time, but they can’t simulate the same number of atoms more quickly.

Cerebras’ wafer-scale engine, however, scales in a fundamentally different way. Because the chips are not limited by interconnect bandwidth, they can communicate quickly, like two tailors collaborating perfectly to make a suit in three and a half days.

To demonstrate this advantage, the team simulated 800,000 atoms interacting with each other, calculating the interactions in increments of one femtosecond at a time. Each step took just microseconds to compute on their hardware. Although that’s still nine orders of magnitude slower than the actual interactions, it was also 179 times as fast as the Frontier supercomputer. The achievement effectively reduced a year’s worth of computation to just two days.
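The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes a compute time of about one microsecond per step, since the article gives only the order of magnitude; the variable names are illustrative.

```python
SPEEDUP_VS_FRONTIER = 179      # reported speedup over Frontier
DAYS_PER_YEAR = 365

# A year of Frontier time shrinks by the speedup factor.
days_on_wse2 = DAYS_PER_YEAR / SPEEDUP_VS_FRONTIER
print(f"One year on Frontier is about {days_on_wse2:.1f} days on WSE-2")

simulated_step_s = 1e-15       # one femtosecond of simulated time per step
compute_step_s = 1e-6          # assumed: about one microsecond of wall-clock time per step

# How much slower the simulation runs than the physical process it models.
slowdown = compute_step_s / simulated_step_s
print(f"Simulation runs about {slowdown:.0e} times slower than real time")
```

A slowdown of roughly a billion matches the nine orders of magnitude quoted above, and 365 divided by 179 comes to just over two days.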

This work was done in collaboration with Sandia, Lawrence Livermore, and Los Alamos National Laboratories. Tomas Oppelstrup, staff scientist at Lawrence Livermore National Laboratory, says this advance makes it feasible to simulate molecular interactions that were previously inaccessible.

Oppelstrup says this will be particularly useful for understanding the longer-term stability of materials in extreme conditions. “When you build advanced machines that operate at high temperatures, like jet engines, nuclear reactors, or fusion reactors for energy production,” he says, “you need materials that can withstand these high temperatures and very harsh environments. It’s difficult to create materials that have the right properties, that have a long lifetime and sufficient strength and don’t break.” Being able to simulate the behavior of candidate materials for longer, Oppelstrup says, will be crucial to the material design and development process.

Ilya Sharapov, principal engineer at Cerebras, says the company is looking forward to extending applications of its wafer-scale engine to a larger class of problems, including molecular dynamics simulations of biological processes and simulations of airflow around cars or aircraft.

Downsizing Large Language Models

As large language models (LLMs) are becoming more popular, the energy costs of using them are starting to overshadow the training costs, potentially by as much as a factor of ten in some estimates. “Inference is the primary workload of AI today because everyone is using ChatGPT,” says James Wang, director of product marketing at Cerebras, “and it’s very expensive to run especially at scale.”

One way to reduce the energy cost (and increase the speed) of inference is through sparsity, which essentially means harnessing the power of zeros. LLMs are made up of huge numbers of parameters. The open-source Llama model used by Cerebras, for example, has 7 billion parameters. During inference, each of those parameters is used to crunch through the input data and spit out the output. If, however, a significant fraction of those parameters are zeros, they can be skipped during the calculation, saving both time and energy.
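The idea of skipping zeros can be sketched with a single dot product, the basic operation behind inference. This is a toy illustration in plain Python with made-up weights and inputs, not a description of how Cerebras’ hardware or software works.

```python
def dense_dot(weights, inputs):
    """Multiply-accumulate over every parameter, zeros included."""
    return sum(w * x for w, x in zip(weights, inputs))


def sparse_dot(weights, inputs):
    """Skip zero weights and count the multiplications actually performed."""
    total, multiplies = 0.0, 0
    for w, x in zip(weights, inputs):
        if w == 0:
            continue  # a zero weight contributes nothing, so no work is needed
        total += w * x
        multiplies += 1
    return total, multiplies


# Hypothetical parameters: six of the ten weights are zero.
weights = [0.5, 0.0, 0.0, -1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.25]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

result, multiplies = sparse_dot(weights, inputs)
print(result == dense_dot(weights, inputs), multiplies, "of", len(weights))
```

Both functions return the same answer, but the sparse version performs only as many multiplications as there are nonzero weights.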

The problem is that skipping specific parameters is difficult to do on a GPU. Reading from memory on a GPU is relatively slow, because GPUs are designed to read memory in chunks, which means taking in groups of parameters at a time. This doesn’t allow GPUs to skip zeros that are randomly interspersed in the parameter set. Cerebras CEO Feldman offered another analogy: “It’s equivalent to a shipper, only wanting to move stuff on pallets because they don’t want to examine each box. Memory bandwidth is the ability to examine each box to make sure it’s not empty. If it’s empty, set it aside and then not move it.”

Some GPUs are equipped for a particular kind of sparsity, called 2:4, where exactly two out of every four consecutively stored parameters are zeros. State-of-the-art GPUs have terabytes per second of memory bandwidth. The memory bandwidth of Cerebras’ WSE-2 is more than one thousand times as high, at 20 petabytes per second. This allows for harnessing unstructured sparsity, meaning the researchers can zero out parameters as needed, wherever in the model they happen to be, and check each one on the fly during a computation. “Our hardware is built right from day one to support unstructured sparsity,” Wang says.
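The difference between 2:4 sparsity and unstructured sparsity can be made concrete with a small check. The helper below is a hypothetical illustration written for this example, and the two parameter lists are invented. Both are half zeros, but only one follows the 2:4 pattern.

```python
def is_two_four_sparse(params) -> bool:
    """True if every consecutive group of four holds exactly two zeros."""
    if len(params) % 4 != 0:
        return False
    return all(params[i:i + 4].count(0) == 2 for i in range(0, len(params), 4))


# Two zeros in each group of four: fits the 2:4 pattern.
structured = [1.5, 0, 0, -2.0, 0, 0.3, 0.7, 0]

# Also half zeros overall, but three fall in the first group and one in the second.
unstructured = [0, 0, 0, -2.0, 1.1, 0.3, 0.7, 0]

print(is_two_four_sparse(structured), is_two_four_sparse(unstructured))
```

Hardware limited to the 2:4 pattern can exploit only the first layout, while unstructured sparsity places no constraint on where the zeros fall.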

Even with the appropriate hardware, zeroing out many of the model’s parameters results in a worse model. But the joint team from Neural Magic and Cerebras figured out a way to recover the full accuracy of the original model. After slashing 70 percent of the parameters to zero, the team performed two further phases of training to give the non-zero parameters a chance to compensate for the new zeros.
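The article does not say how the team chose which parameters to zero out. One common approach, used here purely as an assumption for illustration, is magnitude pruning: zero the weights with the smallest absolute values first. The sketch below shows only that pruning step on made-up weights; the two follow-up training phases are not shown.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights, smallest magnitudes first."""
    num_to_zero = round(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_zero]:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights; 70 percent sparsity matches the figure in the text.
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.02, 0.15, -0.08, 0.04, 0.6]
pruned = prune_by_magnitude(weights, 0.7)
print(pruned)
```

Only the three largest-magnitude weights survive, which is why further training is needed to let the remaining parameters compensate.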

This extra training uses about 7 percent of the original training energy, and the companies found that they recover full model accuracy with this training. The smaller model takes one-third of the time and energy during inference as the original, full model. “What makes these novel applications possible in our hardware,” Sharapov says, “is that there’s a million cores in a very tight package, meaning that the cores have very low latency, high bandwidth interactions between them.”
