Last week the organization tasked with running the biggest chunk of the U.S. CHIPS Act’s US $13 billion R&D program made significant strides: The National Semiconductor Technology Center (NSTC) released a strategic plan and selected the sites of two of its three planned facilities. The locations of the two sites—a “design and collaboration” center in Sunnyvale, Calif., and a lab devoted to advancing the leading edge of chipmaking, in Albany, N.Y.—build on an existing ecosystem at each location, experts say. The location of the third planned center—a chip prototyping and packaging site that could be especially critical for speeding semiconductor startups—is still a matter of speculation.
“The NSTC represents a once-in-a-generation opportunity for the U.S. to accelerate the pace of innovation in semiconductor technology,” Deirdre Hanford, CEO of Natcast, the nonprofit that runs the NSTC centers, said in a statement. According to the strategic plan, which covers 2025 to 2027, the NSTC is meant to accomplish three goals: extend U.S. technology leadership, reduce the time and cost to prototype, and build and sustain a semiconductor workforce development ecosystem. The three centers are meant to do a mix of all three.
New York gets extreme ultraviolet lithography
NSTC plans to direct $825 million into the Albany project. The site will be dedicated to extreme ultraviolet (EUV) lithography, a technology that’s essential to making the most advanced logic chips. The Albany Nanotech Complex, which has already seen more than $25 billion in investments from the state and industry partners over two decades, will form the heart of the future NSTC center. It already has an EUV lithography machine on site and has begun an expansion to install a next-generation version, called high-NA EUV, which promises to produce even finer chip features. Working with a tool recently installed in Europe, IBM, a long-time tenant of the Albany research facility, reported record yields of copper interconnects at a 21-nanometer pitch, several nanometers tighter than is possible with ordinary EUV.
“It’s fulfilling to see that this ecosystem can be taken to the national and global level through CHIPS Act funding,” said Mukesh Khare, general manager of IBM’s semiconductors division, speaking from the future site of the NSTC EUV center. “It’s the right time, and we have all the ingredients.”
While only a few companies are capable of manufacturing cutting-edge logic using EUV, the impact of the NSTC center will be much broader, Khare argues. It will extend down as far as early-stage startups with ideas or materials for improving the chipmaking process. “An EUV R&D center doesn’t mean just one machine,” says Khare. “It needs so many machines around it… It’s a very large ecosystem.”
Silicon Valley lands the design center
The design center is tasked with conducting advanced research in chip design, electronic design automation (EDA), chip and system architectures, and hardware security. It will also host the NSTC’s design enablement gateway—a program that provides NSTC members with secure, cloud-based access to design tools, reference processes and designs, and shared data sets, with the goal of reducing the time and cost of design. Additionally, it will house workforce development, member convening, and administration functions.
Situating the design center in Silicon Valley, with its concentration of research universities, venture capital, and workforce, seems like the obvious choice to many experts. “I can’t think of a better place,” says Patrick Soheili, co-founder of interconnect technology startup Eliyan, which is based in Santa Clara, Calif.
Abhijeet Chakraborty, vice president of engineering in the technology and product group at Silicon Valley-based Synopsys, a leading maker of EDA software, sees Silicon Valley’s expansive tech ecosystem as one of its main advantages in landing the NSTC’s design center. The region concentrates companies and researchers involved in the whole spectrum of the industry from semiconductor process technology to cloud software.
Access to such a broad range of industries is increasingly important for chip design startups, he says. “To design a chip or component these days you need to go from concept to design to validation in an environment that takes care of the entire stack,” he says. It’s prohibitively expensive for a startup to do that alone, so one of Chakraborty’s hopes for the design center is that it will help startups access the design kits and other data needed to operate in this new environment.
Packaging and prototyping still to come
A third promised center for prototyping and packaging is still to come. “The big question is where does the packaging and prototyping go?” says Mark Granahan, cofounder and CEO of Pennsylvania-based power semiconductor startup Ideal Semiconductor. “To me that’s a great opportunity.” He points out that because there is so little packaging technology infrastructure in the United States, any ambitious state or region should have a shot at hosting such a center. One of the original intentions of the act, after all, was to expand the number of regions of the country that are involved in the semiconductor industry.
But that hasn’t stopped some already tech-heavy regions from wanting it. “Oregon offers the strongest ecosystem for such a facility,” said a spokesperson for Intel, which does its technology development there. “The state is uniquely positioned to contribute to the success of the NSTC and help drive technological advancements in the U.S. semiconductor industry.”
As NSTC makes progress, Granahan’s concern is that bureaucracy will expand with it and slow efforts to boost the U.S. chip industry. Already the layers of control are multiplying. The CHIPS office at the National Institute of Standards and Technology executes the act. The NSTC is administered by the nonprofit Natcast, which directs the EUV center, which in turn sits in a facility run by another nonprofit, NY CREATES. “We want these things to be agile and make local decisions,” Granahan says.
Yesterday, NASA successfully launched the Europa Clipper, the largest spacecraft the agency has ever built for a planetary mission. Clipper is now on its multiyear journey to Europa, bristling with equipment to study the Jovian moon’s potential to support life—but just a few months ago, the mission was almost doomed. In July, researchers at NASA discovered that a group of Europa Clipper’s transistors would fail under Jupiter’s extreme radiation levels. They spent months testing devices, updating their flight trajectories, and ultimately adding a warning “canary box” to monitor the effects of radiation as the mission progresses.
The canary box “is a very logical engineering solution to a problem,” says Alan Mantooth, an IEEE Fellow and a professor of electrical engineering at the University of Arkansas. But ideally, it wouldn’t have been needed at all. If NASA had caught the issues with these transistors earlier or designed its circuits with built-in monitoring, this last-minute scramble wouldn’t have occurred. “It’s a clever patch,” says Mantooth, “but it’s a patch.”
Scientists have been “radiation hardening” electronics—designing them to function in a radioactive environment—since the 1960s. But as missions to space become more ambitious, radiation hardening techniques have had to evolve. “It’s kind of like cybersecurity,” says Mantooth. “You’re always trying to get better. There’s always a more harsh environment.”
With the rapid acceleration of companies like SpaceX, the space industry is at “a massive inflection point,” says Eric Faraci, an engineer at Infineon who works on aerospace and defense projects. “Everything we used to take for granted about how you do something, what’s accepted, best practices—everything’s been questioned.”
In future space exploration, we’ll see more systems made with alternative semiconductors like silicon carbide, specialized CMOS transistors, integrated photonics, and new kinds of radiation-resistant memory. Here’s your guide to the next generation of radiation hardened technology.
Silicon Carbide’s Ultra Wide Band Gap
Most power devices in spacecraft today use silicon as the semiconductor, but the next generation will use silicon carbide, says Enxia Zhang, a researcher at the University of Central Florida who has been developing radiation-hard microelectronics for over 20 years. Silicon carbide is more resistant to radiation because of its wider band gap: the energy electrons need to jump from the valence band, where they are bound to atoms, into the conduction band, where they can carry current. Silicon has a band gap of 1.1 electron volts, while silicon carbide’s ranges from 3.3 to 3.4 eV. This means more energy is required to disturb an electron in silicon carbide, so it’s less likely that a dose of stray radiation will manage to do it.
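To get a feel for what those numbers mean, here’s a back-of-the-envelope sketch (not from the article): in simple semiconductor statistics, the ease of exciting carriers across the gap scales roughly as exp(-Eg/2kT). The analogy to radiation damage is loose (radiation deposits far more energy than thermal fluctuations), but it shows how strongly a wider gap suppresses unwanted carriers.

```python
import math

K_B = 8.617e-5  # Boltzmann constant, in eV per kelvin


def excitation_factor(band_gap_ev: float, temp_k: float = 300.0) -> float:
    """Boltzmann factor exp(-Eg / 2kT), proportional to the intrinsic
    carrier concentration: a rough proxy for how easily electrons
    end up in the conduction band."""
    return math.exp(-band_gap_ev / (2 * K_B * temp_k))


silicon = excitation_factor(1.1)          # Si band gap
silicon_carbide = excitation_factor(3.3)  # SiC band gap (lower bound cited)

# At room temperature, silicon carriers are many orders of magnitude
# easier to excite than silicon carbide carriers.
ratio = silicon / silicon_carbide
```

The roughly 18-orders-of-magnitude gap in this toy calculation is why wide-band-gap devices tolerate both heat and (to a lesser, more complicated extent) radiation better than silicon.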
Silicon carbide chips are being manufactured right now, and NASA holds a weekly meeting to test them for space missions, says Zhang. NASA’s silicon carbide devices are expected to be used on missions to the Moon and Venus in the future.
“People are flying silicon carbide” devices right now, says Infineon’s Faraci. They are getting around a lack of standards by using them at parameters well below what they are designed for on Earth, a technique called derating.
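Derating itself is simple bookkeeping. As a hedged sketch (the part rating and the 0.6 factor below are invented for illustration; real programs take derating factors per parameter from mission-specific standards), a check might look like:

```python
def derated_limit(rated_value: float, fraction: float) -> float:
    """Maximum allowed operating value after derating."""
    return rated_value * fraction


def within_limit(operating: float, rated: float, fraction: float) -> bool:
    """True if an operating point stays at or under the derated limit."""
    return operating <= derated_limit(rated, fraction)


# Hypothetical SiC MOSFET rated for 1,200 V on Earth, derated to 60 percent
limit_v = derated_limit(1200, 0.6)        # 720 V allowed in orbit
ok = within_limit(650, 1200, 0.6)         # 650 V: acceptable
too_high = within_limit(800, 1200, 0.6)   # 800 V: rejected
```

The point is that derating trades away performance margin for confidence in a device whose radiation behavior isn’t yet standardized.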
Another semiconductor with a suitably wide band gap is gallium nitride (3.2 eV). Most commonly found in LEDs, it is also used in laptop chargers and other lower power consumer electronics. While it’s a “very exciting” material for space applications, it’s still a new material, which means it has to go through a lot of testing to be trusted, says Faraci.
Gallium nitride is best suited for cold temperatures, like on Mars or the dark side of the Moon, says Mantooth. But “if we’re doing something on Mercury or we’re doing something close to the Sun—any high temperature stuff … silicon carbide’s your winner.”
Silicon-on-Insulator Designs and FinFETs for Radiation-Hardened CMOS
New materials aren’t the only frontier in radiation hardening; researchers are also exploring new ways of designing silicon transistors. Two CMOS production methods already have radiation-hardened forms: silicon-on-insulator (SOI) and fin field-effect transistors (FinFETs). Both are designed to prevent a kind of radiation damage called single-event effects, in which a high-energy particle hits an electronic device, jolting its electrons into places they shouldn’t be and flipping bits.
In ordinary bulk CMOS, current flows from the source to the drain through the channel, with a gate acting as a switch that blocks or allows the current’s flow. These structures sit in the top layer of silicon. Radiation can excite charges deeper down in the silicon, bypassing the gate’s control and allowing current to flow when it shouldn’t. Radiation hardening methods work by impeding the movement of these excited electrons.
SOI designs add a layer of an insulator like silicon oxide below the source and the drain, so that charges cannot flow as easily below the channel. FinFET designs raise the drain, source, and the channel between them into one or more 3D “fins”. Excited charges now have to flow down, around, and back up in order to bypass the gate. FinFETs are also naturally resistant to another form of radiation damage: the total ionizing dose, which occurs when a slow buildup of charged particles changes the properties of the insulating layer between the channel and gate of a device.
The techniques to produce SOI devices and FinFETs have existed for decades. In the 2000s, they weren’t used as much in radiation hardening, because circuit designers could still use ordinary, bulk CMOS devices, mitigating radiation risks in their circuit design and layout, according to Hugh Barnaby, a professor of electrical engineering at Arizona State University. But lately, as CMOS devices have gotten smaller and therefore more vulnerable to radiation, there’s been renewed interest in producing these naturally radiation hard varieties of CMOS devices, even if they are more specialized and expensive.
Barnaby is working with a team on improving radiation hardness in FinFETs. They found that adding more fins increased the device’s ability to control current, but reduced its radiation hardness. Now they are working to rearrange where the fins are to maximize the effectiveness of radiation resistant circuits. “We haven’t done this quite yet,” says Barnaby, “but I’m sure it will work.”
Photonic Systems for High Bandwidth, Faster Data Transfer
Photonic systems use light instead of electrons to transfer information over long distances with little energy. For example, the Internet uses optical fibers to quickly transfer large amounts of data. Within the last decade, researchers have developed silicon photonics integrated circuits which are currently used for high bandwidth information transmission in data centers, but would also enable us to move high volumes of data around in spacecraft, according to John Cressler, a professor of electronics at Georgia Tech.
“If you think of some of the systems that are up in space, either maybe they’re remote sensing or communication,” says Cressler, “they have a lot of data that they’re gathering or moving and that’s much easier to do in photonics.”
The best part? Photonics integrated circuits are naturally radiation hard, because their data transfer is done using photons instead of electrons. A high energy dose of radiation won’t disrupt a photon as it would an electron, because photons are not electrically charged.
Cressler anticipates that integrated photonics will be used in spacecraft in the next two years. “NASA and the [U.S. Department of Defense] and even commercial space [companies] are very interested in photonics,” he says.
Nonvolatile Memory in Space
Another promising area of research for radiation hardness in space is new kinds of nonvolatile memory. Computers usually use static random-access memory (SRAM) or dynamic random-access memory (DRAM). These are volatile memories: They lose their stored state once the power is off. Nonvolatile memories, by contrast, retain their state without continuous power, and therefore reduce power consumption.
There are two front-runners in nonvolatile memory for use in space: Magnetoresistive-RAM (MRAM), and Resistive-RAM (ReRAM). MRAM uses magnetic states to store data, and ReRAM uses a quality called memristance. Both technologies are radiation hard simply by how they are designed; radiation won’t affect the magnetic fields of MRAM or the resistances of ReRAM.
“Resistive RAM is one of the technologies that has the potential to get to neuromorphic, low energy computing,” says Michael Alles, the director of the Institute for Space and Defense Electronics at Vanderbilt University, referring to a form of computing inspired by how brains work. Satellites usually are not equipped with the ability to process much of their own data, and have to send it back to Earth. But with the lower power consumption of memristor-based circuits, satellites could do computations onboard, saving communications bandwidth and time.
Though the technology is still in the research phase, Zhang predicts we will see nonvolatile memory in space in the next 10 to 15 years. Last year, the U.S. Space Force awarded Western Digital a $35 million contract to develop nonvolatile radiation-hardened memory.
A Note of Caution and Hope
Alles cautions, however, that the true test for these new technologies will not be how they do on their own, but rather how they can be integrated to work as a system. You always have to ask: “What’s the weak link?” A powerful and radiation-hard memory device could be for naught if it depends on a silicon transistor that fails under radiation.
As space exploration and satellite launches continue to ramp up, radiation hardening will only become more vital to our designs. “What’s exciting is that as we advance our capabilities, we’re able to go places we haven’t been able to go before and stay there longer,” says Mantooth. “We can’t fly electronics into the Sun right now. But one day, maybe we will.”
In space, high-energy gamma radiation can change the properties of semiconductors, altering how they work or rendering them completely unusable. Finding devices that can withstand radiation is important not just to keep astronauts safe but also to ensure that a spacecraft lasts the many years of its mission. Constructing a device that can easily measure radiation exposure is just as valuable. Now, a globe-spanning group of researchers has found that a type of memristor, a device that stores data as resistance even in the absence of a power supply, can not only measure gamma radiation but also heal itself after being exposed to it.
Memristors have demonstrated the ability to self-heal under radiation before, says Firman Simanjuntak, a professor of materials science and engineering at the University of Southampton, in England, whose team developed this memristor. But until recently, no one really understood how they healed—or how best to apply the devices. Recently, there’s been “a new space race,” he says, with more satellites in orbit and more deep-space missions on the launchpad, so “everyone wants to make their devices…tolerant towards radiation.” Simanjuntak’s team has been exploring the properties of different types of memristors since 2019, but now wanted to test how their devices change when exposed to blasts of gamma radiation.
Normally, memristors set their resistance according to their exposure to high-enough voltage. One voltage boosts the resistance, which then remains at that level when subject to lower voltages. The opposite voltage decreases the resistance, resetting the device. The relationship between voltage and resistance depends on the previous voltage, which is why the devices are said to have a memory.
The hafnium oxide memristor used by Simanjuntak’s team is a type of memristor that cannot be reset, called a WORM (write once, read many) device, suitable for permanent storage. Once it is set with a negative or positive voltage, the opposing voltage does not change the device. It consists of several layers of material: first conductive platinum, then aluminum-doped hafnium oxide (an insulator), then a layer of titanium, then a layer of conductive silver at the top.
When voltage is applied to these memristors, a bridge of silver ions forms in the hafnium oxide, which allows the current to flow through, setting its conductance value. Unlike in other memristors, this device’s silver bridge is stable and fixes in place, which is why once the device is set, it usually can’t be returned to a rest state.
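The set-once behavior described above can be captured in a toy state model. This is purely illustrative; the resistance and voltage values below are invented, not taken from the paper:

```python
class WormMemristor:
    """Toy model of a write-once-read-many memristor."""

    HIGH_R = 1e6       # ohms, pristine state (hypothetical value)
    LOW_R = 1e3        # ohms, after the silver bridge forms (hypothetical)
    SET_VOLTAGE = 1.0  # volts needed to form the bridge (hypothetical)

    def __init__(self):
        self.resistance = self.HIGH_R
        self._written = False

    def apply_voltage(self, volts: float) -> None:
        # A large enough voltage of either polarity forms the bridge once.
        # Later voltages, including the opposite polarity, do NOT reset
        # a WORM device: the silver bridge is stable and fixed in place.
        if not self._written and abs(volts) >= self.SET_VOLTAGE:
            self.resistance = self.LOW_R
            self._written = True

    def read(self) -> float:
        return self.resistance


d = WormMemristor()
d.apply_voltage(1.2)   # set: the bridge forms, resistance drops
d.apply_voltage(-1.2)  # attempted reset: no effect on a WORM device
```

Under gamma radiation, as described next, the device stops behaving like this model: the bridge weakens and the memristor becomes resettable.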
That is, unless radiation is involved. The first discovery the researchers made was that under gamma radiation, the device acts as a resettable switch. They believe that the gamma rays break the bond between the hafnium and oxygen atoms, causing a layer of titanium oxide to form at the top of the memristor, and a layer of platinum oxide to form at the bottom. The titanium oxide layer creates an extra barrier for the silver ions to cross, so a weaker bridge is formed, one that can be broken and reset by a new voltage.
The extra platinum oxide layer caused by the gamma rays also serves as a barrier to incoming electrons. This means a higher voltage is required to set the memristor. Using this knowledge, the researchers were able to create a simple circuit that measured amounts of radiation by checking the voltage that was required to set the memristor. A higher voltage meant the device had encountered more radiation.
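In other words, the memristor’s set voltage becomes a dosimeter reading. A minimal sketch of such a readout, assuming an entirely hypothetical calibration table (the paper does not publish one):

```python
# Hypothetical calibration points: (set voltage in volts, dose in kilorads).
CALIBRATION = [(1.0, 0.0), (1.5, 100.0), (2.2, 1000.0), (3.0, 5000.0)]


def dose_from_set_voltage(v_set: float) -> float:
    """Estimate accumulated dose by piecewise-linear interpolation of
    the voltage that was required to set the memristor."""
    points = sorted(CALIBRATION)
    if v_set <= points[0][0]:
        return points[0][1]
    for (v0, d0), (v1, d1) in zip(points, points[1:]):
        if v_set <= v1:
            # Linear interpolation between neighboring calibration points
            return d0 + (d1 - d0) * (v_set - v0) / (v1 - v0)
    return points[-1][1]  # above the calibrated range: clamp to the max
```

The higher the voltage needed to set the device, the more radiation it has seen; a real instrument would need a measured calibration curve in place of the invented table here.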
From a regular state, the hafnium oxide memristor forms a stable conductive bridge. Under radiation, a thicker layer of titanium oxide creates a slower-forming, weaker conductive bridge. OM Kumar et al./IEEE Electron Device Letters
But the true marvel of these hafnium oxide memristors is their ability to self-heal after a big dose of radiation. The researchers treated the memristor with 5 megarads of radiation—500 times as much as a lethal dose in humans. Once the gamma radiation was removed, the titanium oxide and platinum oxide layers gradually dissipated, the oxygen atoms returning to form hafnium oxide again. After 30 days, instead of still requiring a higher-than-normal voltage to form, the devices that were exposed to radiation required the same voltage to form as untouched devices.
“It’s quite exciting what they’re doing,” says Pavel Borisov, a researcher at Loughborough University, in England, who studies how to use memristors to mimic the synapses in the human brain. His team conducted similar experiments with a silicon oxide based memristor, and also found that radiation changed the behavior of the device. In Borisov’s experiments, however, the memristors did not heal after the radiation.
Memristors are simple, lightweight, and low power, which already makes them ideal for use in space applications. In the future, Simanjuntak hopes to use memristors to develop radiation-proof memory devices that would enable satellites in space to do onboard calculations. “You can use a memristor for data storage, but also you can use it for computation,” he says, “So you could make everything simpler, and reduce the costs as well.”
This research was accepted for publication in a future issue of Electron Device Letters.
Big-name makers of processors, especially those geared toward cloud-based AI, such as AMD and Nvidia, have been showing signs of wanting to own more of the business of computing, purchasing makers of software, interconnects, and servers. The hope is that control of the “full stack” will give them an edge in designing what their customers want.
Amazon Web Services (AWS) got there ahead of most of the competition, when it purchased chip designer Annapurna Labs in 2015 and proceeded to design CPUs, AI accelerators, servers, and data centers as a vertically integrated operation. Ali Saidi, the technical lead for the Graviton series of CPUs, and Rami Sinno, director of engineering at Annapurna Labs, explained the advantages of vertically integrated design at Amazon’s scale and showed IEEE Spectrum around the company’s hardware testing labs in Austin, Tex., on 27 August.
Rami Sinno: Amazon is my first vertically integrated company. And that was on purpose. I was working at Arm, and I was looking for the next adventure, looking at where the industry is heading and what I want my legacy to be. I looked at two things:
One is vertically integrated companies, because this is where most of the innovation is—the interesting stuff is happening when you control the full hardware and software stack and deliver directly to customers.
And the second thing is, I realized that machine learning, AI in general, is going to be very, very big. I didn’t know exactly which direction it was going to take, but I knew that there is something that is going to be generational, and I wanted to be part of that. I already had that experience prior when I was part of the group that was building the chips that go into the Blackberries; that was a fundamental shift in the industry. That feeling was incredible, to be part of something so big, so fundamental. And I thought, “Okay, I have another chance to be part of something fundamental.”
Does working at a vertically-integrated company require a different kind of chip design engineer?
Sinno: Absolutely. When I hire people, the interview process is going after people that have that mindset. Let me give you a specific example: Say I need a signal integrity engineer. (Signal integrity makes sure a signal going from point A to point B, wherever it is in the system, makes it there correctly.) Typically, you hire signal integrity engineers that have a lot of experience in analysis for signal integrity, that understand layout impacts, can do measurements in the lab. Well, this is not sufficient for our group, because we want our signal integrity engineers also to be coders. We want them to be able to take a workload or a test that will run at the system level and be able to modify it or build a new one from scratch in order to look at the signal integrity impact at the system level under workload. This is where being trained to be flexible, to think outside of the little box has paid off huge dividends in the way that we do development and the way we serve our customers.
“By the time that we get the silicon back, the software’s done”
—Ali Saidi, Annapurna Labs
At the end of the day, our responsibility is to deliver complete servers in the data center directly for our customers. And if you think from that perspective, you’ll be able to optimize and innovate across the full stack. A design engineer or a test engineer should be able to look at the full picture because that’s his or her job, deliver the complete server to the data center and look where best to do optimization. It might not be at the transistor level or at the substrate level or at the board level. It could be something completely different. It could be purely software. And having that knowledge, having that visibility, will allow the engineers to be significantly more productive and deliver to the customer significantly faster. We’re not going to bang our head against the wall to optimize the transistor where three lines of code downstream will solve these problems, right?
Do you feel like people are trained in that way these days?
Sinno: We’ve had very good luck with recent college grads. Recent college grads, especially the past couple of years, have been absolutely phenomenal. I’m very, very pleased with the way that the education system is graduating the engineers and the computer scientists that are interested in the type of jobs that we have for them.
The other place that we have been super successful in finding the right people is at startups. They know what it takes, because at a startup, by definition, you have to do so many different things. People who’ve done startups before completely understand the culture and the mindset that we have at Amazon.
Ali Saidi: I’ve been here about seven and a half years. When I joined AWS, I joined a secret project at the time. I was told: “We’re going to build some Arm servers. Tell no one.”
We started with Graviton 1. Graviton 1 was really the vehicle for us to prove that we could offer the same experience in AWS with a different architecture.
The cloud gave us an ability for a customer to try it in a very low-cost, low barrier of entry way and say, “Does it work for my workload?” So Graviton 1 was really just the vehicle to demonstrate that we could do this, and to start signaling to the world that we want software around Arm servers to grow and that they’re going to be more relevant.
Graviton 2—announced in 2019—was kind of our first… what we think is a market-leading device that’s targeting general-purpose workloads, web servers, and those types of things.
It’s done very well. We have people running databases, web servers, key-value stores, lots of applications... When customers adopt Graviton, they bring one workload, and they see the benefits of bringing that one workload. And then the next question they ask is, “Well, I want to bring some more workloads. What should I bring?” There were some where it wasn’t powerful enough effectively, particularly around things like media encoding, taking videos and encoding them or re-encoding them or encoding them to multiple streams. It’s a very math-heavy operation and required more [single-instruction multiple data] bandwidth. We need cores that could do more math.
We also wanted to enable the [high-performance computing] market. So we have an instance type called HPC 7G where we’ve got customers like Formula One. They do computational fluid dynamics of how this car is going to disturb the air and how that affects following cars. It’s really just expanding the portfolio of applications. We did the same thing when we went to Graviton 4, which has 96 cores versus Graviton 3’s 64.
How do you know what to improve from one generation to the next?
Saidi: Far and wide, most customers find great success when they adopt Graviton. Occasionally, they see performance that isn’t the same level as their other migrations. They might say “I moved these three apps, and I got 20 percent higher performance; that’s great. But I moved this app over here, and I didn’t get any performance improvement. Why?” It’s really great to see the 20 percent. But for me, in the kind of weird way I am, the 0 percent is actually more interesting, because it gives us something to go and explore with them.
Most of our customers are very open to those kinds of engagements. So we can understand what their application is and build some kind of proxy for it. Or if it’s an internal workload, then we could just use the original software. And then we can use that to kind of close the loop and work on what the next generation of Graviton will have and how we’re going to enable better performance there.
What’s different about designing chips at AWS?
Saidi: In chip design, there are many different competing optimization points. You have all of these conflicting requirements, you have cost, you have scheduling, you’ve got power consumption, you’ve got size, what DRAM technologies are available and when you’re going to intersect them… It ends up being this fun, multifaceted optimization problem to figure out what’s the best thing that you can build in a timeframe. And you need to get it right.
One thing that we’ve done very well is taken our initial silicon to production.
How?
Saidi: This might sound weird, but I’ve seen other places where the software and the hardware people effectively don’t talk. The hardware and software people in Annapurna and AWS work together from day one. The software people are writing the software that will ultimately be the production software and firmware while the hardware is being developed in cooperation with the hardware engineers. By working together, we’re closing that iteration loop. When you are carrying the piece of hardware over to the software engineer’s desk, your iteration loop is years and years. Here, we are iterating constantly. We’re running virtual machines in our emulators before we have the silicon ready. We are taking an emulation of [a complete system] and running most of the software we’re going to run.
So by the time that we get the silicon back [from the foundry], the software’s done. And we’ve seen most of the software work at this point. So we have very high confidence that it’s going to work.
The other piece of it, I think, is just being absolutely laser-focused on what we are going to deliver. You get a lot of ideas, but your design resources are approximately fixed. No matter how many ideas I put in the bucket, I’m not going to be able to hire that many more people, and my budget’s probably fixed. So every idea I throw in the bucket is going to use some resources. And if that feature isn’t really important to the success of the project, I’m risking the rest of the project. And I think that’s a mistake that people frequently make.
Are those decisions easier in a vertically integrated situation?
Saidi: Certainly. We know we’re going to build a motherboard and a server and put it in a rack, and we know what that looks like… So we know the features we need. We’re not trying to build a superset product that could allow us to go into multiple markets. We’re laser-focused into one.
What else is unique about the AWS chip design environment?
Saidi: One thing that’s very interesting for AWS is that we’re the cloud and we’re also developing these chips in the cloud. We were the first company to really push on running [electronic design automation (EDA)] in the cloud. We changed the model from “I’ve got 80 servers and this is what I use for EDA” to “Today, I have 80 servers. If I want, tomorrow I can have 300. The next day, I can have 1,000.”
We can compress some of the time by varying the resources that we use. At the beginning of the project, we don’t need as many resources. We can turn a lot of stuff off and effectively not pay for it. As we get to the end of the project, now we need many more resources. And instead of saying, “Well, I can’t iterate this fast, because I’ve got this one machine, and it’s busy,” I can change that and instead say, “Well, I don’t want one machine; I’ll have 10 machines today.”
Instead of my iteration cycle being two days for a big design like this, instead of being even one day, with these 10 machines I can bring it down to three or four hours. That’s huge.
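Saidi’s point about elastic capacity reduces to simple arithmetic: one iteration’s wall-clock time is a fixed serial portion plus a parallelizable portion divided across however many machines you spin up. A minimal sketch of that arithmetic (the `iteration_hours` helper and all numbers are illustrative assumptions, not AWS figures):

```python
# Illustrative sketch of the elastic-EDA arithmetic described above.
# The helper and all numbers are hypothetical, not AWS data.

def iteration_hours(parallel_hours: float, machines: int,
                    serial_hours: float = 0.0) -> float:
    """Wall-clock time for one design iteration: a serial portion
    (job setup, merging results) plus compute split across machines."""
    return serial_hours + parallel_hours / machines

# One busy machine: a two-day (48-hour) turnaround for a big design.
print(iteration_hours(parallel_hours=46.0, machines=1, serial_hours=2.0))   # 48.0

# Scale out to 10 machines for the same job: a few hours instead.
print(iteration_hours(parallel_hours=46.0, machines=10, serial_hours=2.0))  # 6.6
```

The serial portion is why 10 machines gives somewhat less than a 10x speedup; in practice the parallel fraction of an EDA regression run determines how far scaling out helps.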
How important is Amazon.com as a customer?
Saidi: They have a wealth of workloads, and we obviously are the same company, so we have access to some of those workloads in ways that with third parties, we don’t. But we also have very close relationships with other external customers.
So last Prime Day, we said that 2,600 Amazon.com services were running on Graviton processors. This Prime Day, that number more than doubled to 5,800 services running on Graviton. And the retail side of Amazon used over 250,000 Graviton CPUs in support of the retail website and the services around that for Prime Day.
The AI accelerator team is colocated with the labs that test everything from chips through racks of servers. Why?
Sinno: So Annapurna Labs has multiple labs in multiple locations as well. This location here is in Austin… is one of the smaller labs. But what’s so interesting about the lab here in Austin is that you have all of the hardware and many software development engineers for machine learning servers and for Trainium and Inferentia [AWS’s AI chips] effectively co-located on this floor. For hardware developers, engineers, having the labs co-located on the same floor has been very, very effective. It speeds execution and iteration for delivery to the customers. This lab is set up to be self-sufficient with anything that we need to do, at the chip level, at the server level, at the board level. Because again, as I convey to our teams, our job is not the chip; our job is not the board; our job is the full server to the customer.
How does vertical integration help you design and test chips for data-center-scale deployment?
Sinno: It’s relatively easy to create a bar-raising server: something that’s very high-performance, very low-power. If we create 10 of them, 100 of them, maybe 1,000 of them, it’s easy. You can cherry-pick this, you can fix this, you can fix that. But the scale that AWS operates at is significantly higher. We need to train models that require 100,000 of these chips. 100,000! And for training, it’s not run in five minutes. It’s run in hours or days or even weeks. Those 100,000 chips have to be up for the duration. Everything that we do here is to get to that point.
We start from a “what are all the things that can go wrong?” mindset. And we implement all the things that we know. But when you’re talking about cloud scale, there are always things that you have not thought of that come up. These are the 0.001-percent type issues.
In this case, we do the debug first in the fleet. And in certain cases, we have to do debugs in the lab to find the root cause. And if we can fix it immediately, we fix it immediately. Being vertically integrated, in many cases we can do a software fix for it. We use our agility to rush a fix while at the same time making sure that the next generation has it already figured out from the get-go.
In March, India announced a major investment to establish a semiconductor-manufacturing industry. With US $15 billion in investments from companies, state governments, and the central government, India now has plans for several chip-packaging plants and the country’s first modern chip fab as part of a larger effort to grow its electronics industry.
But turning India into a chipmaking powerhouse will also require a substantial investment in R&D. And so the Indian government turned to IEEE Fellow and retired Georgia Tech professor Rao Tummala, a pioneer of some of the chip-packaging technologies that have become critical to modern computers. Tummala spoke with IEEE Spectrum during the IEEE Electronic Component Technology Conference in Denver, Colo., in May.
Rao Tummala
Rao Tummala is a pioneer of semiconductor packaging and a longtime research leader at Georgia Tech.
What are you helping the government of India to develop?
Rao Tummala: I’m helping to develop the R&D side of India’s semiconductor efforts. We picked 12 strategic research areas. If you explore research in those areas, you can make almost any electronic system. For each of those 12 areas, there’ll be one primary center of excellence. And that’ll be typically at an IIT (Indian Institute of Technology) campus. Then there’ll be satellite centers attached to those throughout India. So when we’re done with it, in about five years, I expect to see probably almost all the institutions involved.
Why did you decide to spend your retirement doing this?
Tummala: It’s my giving back. India gave me the best education possible at the right time.
I’ve been going to India and wanting to help for 20 years. But I wasn’t successful until the current government decided they’re going to make manufacturing and semiconductors important for the country. They asked themselves: What would be the need for semiconductors, in 10 years, 20 years, 30 years? And they quickly concluded that if you have 1.4 billion people, each consuming, say, $5,000 worth of electronics each year, it requires billions and billions of dollars’ worth of semiconductors.
“It’s my giving back. India gave me the best education possible at the right time.” —Rao Tummala, advisor to the government of India
What advantages does India have in the global semiconductor space?
Tummala: India has the best educational system in the world for the masses. It produces the very best students in science and engineering at the undergrad level and lots of them. India is already a success in design and software. All the major U.S. tech companies have facilities in India. And they go to India for two reasons. It has a lot of people with a lot of knowledge in the design and software areas, and those people are cheaper [to employ].
What are India’s weaknesses, and is the government response adequate to overcoming them?
Tummala: India is clearly behind in semiconductor manufacturing. It’s behind in knowledge and behind in infrastructure. Government doesn’t solve these problems. All that the government does is set the policies and give the money. This has given companies incentives to come to India, and therefore the semiconductor industry is beginning to flourish.
Will India ever have leading-edge chip fabs?
Tummala: Absolutely. Not only will it have leading-edge fabs, but in about 20 years, it will have the most comprehensive system-level approach of any country, including the United States. In about 10 years, the size of the electronics industry in India will probably have grown about 10 times.
This article appears in the August 2024 print issue as “5 Questions for Rao Tummala.”
The CHIPS for America Act was a response to a worsening shortfall in engineers equipped to meet the growing demand for advanced electronic devices. That need persists. In its 2023 policy report, Chipping Away: Assessing and Addressing the Labor Market Gap Facing the U.S. Semiconductor Industry, the Semiconductor Industry Association forecast a demand for 69,000 microelectronic and semiconductor engineers between 2023 and 2030—including 28,900 new positions created by industry expansion and 40,100 openings to replace engineers who retire or leave the field.
This number does not include another 34,500 computer scientists (13,200 new jobs, 21,300 replacements), nor does it count jobs in other industries that require advanced or custom-designed semiconductors for controls, automation, communication, product design, and the emerging systems-of-systems technology ecosystem.
Purdue University is taking charge, leading semiconductor technology and workforce development in the U.S. In spring 2022, it became the first top engineering school to offer an online Master’s Degree in Microelectronics and Semiconductors.
“The degree was developed as part of Purdue’s overall semiconductor degrees program,” says Purdue Prof. Vijay Raghunathan, one of the architects of the semiconductor program. “It was what I would describe as the nation’s most ambitious semiconductor workforce development effort.”
Prof. Vijay Raghunathan, one of the architects of the online Master’s Degree in Microelectronics and Semiconductors at Purdue. Purdue University
Purdue built and announced its bold high-technology online program while the U.S. Congress was still debating the $53 billion “Creating Helpful Incentives to Produce Semiconductors for America Act” (CHIPS for America Act), which would be passed in July 2022 and signed into law in August.
Today, the online Master’s in Microelectronics and Semiconductors is well underway. Students train on leading-edge equipment and software and prepare to meet the challenges they will face in a rejuvenated, and critical, U.S. semiconductor industry.
Is the drive for semiconductor education succeeding?
“I think we have conclusively established that the answer is a resounding ‘Yes,’” says Raghunathan. Like understanding big data, or being able to program, “the ability to understand how semiconductors and semiconductor-based systems work, even at a rudimentary level, is something that everybody should know. Virtually any product you design or make is going to have chips inside it. You need to understand how they work, what the significance is, and what the risks are.”
Earning a Master’s in Microelectronics and Semiconductors
Students pursuing the Master’s Degree in Microelectronics and Semiconductors take courses in circuit design, devices and engineering, systems design, and supply chain management offered by several schools across the university, including Purdue’s Mitch Daniels School of Business, the Purdue Polytechnic Institute, the Elmore Family School of Electrical and Computer Engineering, and the School of Materials Engineering.
Professionals can also take one-credit-hour courses, which are intended to help students build “breadth at the edges,” a notion that grew out of feedback from employers: Tomorrow’s engineering leaders will need broad knowledge to connect with other specialties in the increasingly interdisciplinary world of artificial intelligence, robotics, and the Internet of Things.
“This was something that we embarked on as an experiment 5 or 6 years ago,” says Raghunathan of the one-credit courses. “I think, in hindsight, that it’s turned out spectacularly.”
A researcher adjusts imaging equipment in a lab in Birck Nanotechnology Center, home to Purdue’s advanced research and development on semiconductors and other technology at the atomic scale. Rebecca Robiños/Purdue University
The Semiconductor Engineering Education Leader
Purdue, which opened its first classes in 1874, is today an acknowledged leader in engineering education.
U.S. News & World Report has ranked the university’s graduate engineering program among America’s 10 best every year since 2012 (and among the top 4 since 2022). And Purdue’s online graduate engineering program has ranked in the country’s top three since the publication started evaluating online grad programs in 2020. (Purdue has offered distance Master’s degrees since the 1980s. Back then, of course, course lectures were videotaped and mailed to students. With the growth of the web, “distance” became “online,” and the program has swelled.)
Thus, Microelectronics and Semiconductors Master’s Degree candidates can study online or on-campus. Both tracks take the same courses from the same instructors and earn the same degree. There are no footnotes, asterisks, or parentheses on the diploma to denote online or in-person study.
“If you look at our program, it will become clear why Purdue is increasingly considered America’s leading semiconductors university” —Prof. Vijay Raghunathan, Purdue University
Students take classes at their own pace, using an integrated suite of proven online-learning applications for attending lectures, submitting homework, taking tests, and communicating with faculty and one another. Texts may be purchased or downloaded from the school library. And there is frequent use of modeling and analytical tools like MATLAB. Purdue is also home to the national design-computing resources
nanoHUB.org (with hundreds of modeling, simulation, teaching, and software-development tools) and its offspring, chipshub.org (specializing in tools for chip design and fabrication).
From R&D to Workforce and Economic Development
“If you look at our program, it will become clear why Purdue is increasingly considered America’s leading semiconductors university, because this is such a strategic priority
for the entire university, from our President all the way down,” Prof. Raghunathan sums up. “We have a task force that reports directly to the President, a task force focused only on semiconductors and microelectronics. On all aspects—R&D, the innovation pipeline, workforce development, economic development to bring companies to the state. We’re all in as far as chips are concerned.”