IEEE Spectrum Recent Content full text
How and Why Gary Marcus Became AI's Leading Critic
17 September 2024 at 14:00

How and Why Gary Marcus Became AI's Leading Critic

By: Eliza Strickland

17 September 2024 at 14:00

Maybe you’ve read about Gary Marcus’s testimony before the Senate in May of 2023, when he sat next to Sam Altman and called for strict regulation of Altman’s company, OpenAI, as well as the other tech companies that were suddenly all-in on generative AI. Maybe you’ve caught some of his arguments on Twitter with Geoffrey Hinton and Yann LeCun, two of the so-called “godfathers of AI.” One way or another, most people who are paying attention to artificial intelligence today know Gary Marcus’s name, and know that he is not happy with the current state of AI.

He lays out his concerns in full in his new book, Taming Silicon Valley: How We Can Ensure That AI Works for Us, which was published today by MIT Press. Marcus goes through the immediate dangers posed by generative AI, which include things like mass-produced disinformation, the easy creation of deepfake pornography, and the theft of creative intellectual property to train new models (he doesn’t include an AI apocalypse as a danger, he’s not a doomer). He also takes issue with how Silicon Valley has manipulated public opinion and government policy, and explains his ideas for regulating AI companies.

Marcus studied cognitive science under the legendary Steven Pinker, was a professor at New York University for many years, and co-founded two AI companies, Geometric Intelligence and Robust.AI. He spoke with IEEE Spectrum about his path to this point.

What was your first introduction to AI?

portrait of a man wearing a red checkered shirt and a black jacket with glasses Gary MarcusBen Wong

Gary Marcus: Well, I started coding when I was eight years old. One of the reasons I was able to skip the last two years of high school was because I wrote a Latin-to-English translator in the programming language Logo on my Commodore 64. So I was already, by the time I was 16, in college and working on AI and cognitive science.

So you were already interested in AI, but you studied cognitive science both in undergrad and for your Ph.D. at MIT.

Marcus: Part of why I went into cognitive science is I thought maybe if I understood how people think, it might lead to new approaches to AI. I suspect we need to take a broad view of how the human mind works if we’re to build really advanced AI. As a scientist and a philosopher, I would say it’s still unknown how we will build artificial general intelligence or even just trustworthy general AI. But we have not been able to do that with these big statistical models, and we have given them a huge chance. There’s basically been $75 billion spent on generative AI, another $100 billion on driverless cars. And neither of them has really yielded stable AI that we can trust. We don’t know for sure what we need to do, but we have very good reason to think that merely scaling things up will not work. The current approach keeps coming up against the same problems over and over again.

What do you see as the main problems it keeps coming up against?

Marcus: Number one is hallucinations. These systems smear together a lot of words, and they come up with things that are true sometimes and not others. Like saying that I have a pet chicken named Henrietta is just not true. And they do this a lot. We’ve seen this play out, for example, in lawyers writing briefs with made-up cases.

Second, their reasoning is very poor. My favorite examples lately are these river-crossing word problems where you have a man and a cabbage and a wolf and a goat that have to get across. The system has a lot of memorized examples, but it doesn’t really understand what’s going on. If you give it a simpler problem, like one Doug Hofstadter sent to me, like: “A man and a woman have a boat and want to get across the river. What do they do?” It comes up with this crazy solution where the man goes across the river, leaves the boat there, swims back, something or other happens.

Sometimes he brings a cabbage along, just for fun.

Marcus: So those are boneheaded errors of reasoning where there’s something obviously amiss. Every time we point these errors out somebody says, “Yeah, but we’ll get more data. We’ll get it fixed.” Well, I’ve been hearing that for almost 30 years. And although there is some progress, the core problems have not changed.

Let’s go back to 2014 when you founded your first AI company, Geometric Intelligence. At that time, I imagine you were feeling more bullish on AI?

Marcus: Yeah, I was a lot more bullish. I was not only more bullish on the technical side. I was also more bullish about people using AI for good. AI used to feel like a small research community of people that really wanted to help the world.

So when did the disillusionment and doubt creep in?

Marcus: In 2018 I already thought deep learning was getting overhyped. That year I wrote this piece called “Deep Learning, a Critical Appraisal,” which Yann LeCun really hated at the time. I already wasn’t happy with this approach and I didn’t think it was likely to succeed. But that’s not the same as being disillusioned, right?

Then when large language models became popular [around 2019], I immediately thought they were a bad idea. I just thought this is the wrong way to pursue AI from a philosophical and technical perspective. And it became clear that the media and some people in machine learning were getting seduced by hype. That bothered me. So I was writing pieces about GPT-3 [an early version of OpenAI's large language model] being a bullshit artist in 2020. As a scientist, I was pretty disappointed in the field at that point. And then things got much worse when ChatGPT came out in 2022, and most of the world lost all perspective. I began to get more and more concerned about misinformation and how large language models were going to potentiate that.

You’ve been concerned not just about the startups, but also the big entrenched tech companies that jumped on the generative AI bandwagon, right? Like Microsoft, which has partnered with OpenAI?

Marcus: The last straw that made me move from doing research in AI to working on policy was when it became clear that Microsoft was going to race ahead no matter what. That was very different from 2016 when they released [an early chatbot named] Tay. It was bad, they took it off the market 12 hours later, and then Brad Smith wrote a book about responsible AI and what they had learned. But by the end of the month of February 2023, it was clear that Microsoft had really changed how they were thinking about this. And then they had this ridiculous “Sparks of AGI” paper, which I think was the ultimate in hype. And they didn’t take down Sydney after the crazy Kevin Roose conversation where [the chatbot] Sydney told him to get a divorce and all this stuff. It just became clear to me that the mood and the values of Silicon Valley had really changed, and not in a good way.

I also became disillusioned with the U.S. government. I think the Biden administration did a good job with its executive order. But it became clear that the Senate was not going to take the action that it needed. I spoke at the Senate in May 2023. At the time, I felt like both parties recognized that we can’t just leave all this to self-regulation. And then I became disillusioned [with Congress] over the course of the last year, and that’s what led to writing this book.

You talk a lot about the risks inherent in today’s generative AI technology. But then you also say, “It doesn’t work very well.” Are those two views coherent?

Marcus: There was a headline: “Gary Marcus Used to Call AI Stupid, Now He Calls It Dangerous.” The implication was that those two things can’t coexist. But in fact, they do coexist. I still think gen AI is stupid, and certainly cannot be trusted or counted on. And yet it is dangerous. And some of the danger actually stems from its stupidity. So for example, it’s not well-grounded in the world, so it’s easy for a bad actor to manipulate it into saying all kinds of garbage. Now, there might be a future AI that might be dangerous for a different reason, because it’s so smart and wily that it outfoxes the humans. But that’s not the current state of affairs.

You’ve said that generative AI is a bubble that will soon burst. Why do you think that?

Marcus: Let’s clarify: I don’t think generative AI is going to disappear. For some purposes, it is a fine method. You want to build autocomplete, it is the best method ever invented. But there’s a financial bubble because people are valuing AI companies as if they’re going to solve artificial general intelligence. In my view, it’s not realistic. I don’t think we’re anywhere near AGI. So then you’re left with, “Okay, what can you do with generative AI?”

Last year, because Sam Altman was such a good salesman, everybody fantasized that we were about to have AGI and that you could use this tool in every aspect of every corporation. And a whole bunch of companies spent a bunch of money testing generative AI out on all kinds of different things. So they spent 2023 doing that. And then what you’ve seen in 2024 are reports where researchers go to the users of Microsoft’s Copilot—not the coding tool, but the more general AI tool—and they’re like, “Yeah, it doesn’t really work that well.” There’s been a lot of reviews like that this last year.

The reality is, right now, the gen AI companies are actually losing money. OpenAI had an operating loss of something like $5 billion last year. Maybe you can sell $2 billion worth of gen AI to people who are experimenting. But unless they adopt it on a permanent basis and pay you a lot more money, it’s not going to work. I started calling OpenAI the possible WeWork of AI after it was valued at $86 billion. The math just didn’t make sense to me.

What would it take to convince you that you’re wrong? What would be the head-spinning moment?

Marcus: Well, I’ve made a lot of different claims, and all of them could be wrong. On the technical side, if someone could get a pure large language model to not hallucinate and to reason reliably all the time, I would be wrong about that very core claim that I have made about how these things work. So that would be one way of refuting me. It hasn’t happened yet, but it’s at least logically possible.

On the financial side, I could easily be wrong. But the thing about bubbles is that they’re mostly a function of psychology. Do I think the market is rational? No. So even if the stuff doesn’t make money for the next five years, people could keep pouring money into it.

The place that I’d like to prove me wrong is the U.S. Senate. They could get their act together, right? I’m running around saying, “They’re not moving fast enough,” but I would love to be proven wrong on that. In the book, I have a list of the 12 biggest risks of generative AI. If the Senate passed something that actually addressed all 12, then my cynicism would have been mislaid. I would feel like I’d wasted a year writing the book, and I would be very, very happy.

Neuroscience News
AI Conversations Help Conspiracy Theorists Change Their Views
12 September 2024 at 15:09

AI Conversations Help Conspiracy Theorists Change Their Views

Neuroscience News

By: Neuroscience News

12 September 2024 at 15:09

AI-powered conversations can reduce belief in conspiracy theories by 20%. Researchers found that AI provided tailored, fact-based rebuttals to participants' conspiracy claims, leading to a lasting change in their beliefs. In one out of four cases, participants disavowed the conspiracy entirely. The study suggests that AI has the potential to combat misinformation by engaging people directly and personally.

Neuroscience News
Unlocking the Brain’s “Neural Code” Could Lead to Superhuman AI
9 September 2024 at 15:11

Unlocking the Brain’s “Neural Code” Could Lead to Superhuman AI

Neuroscience News

By: Neuroscience News

9 September 2024 at 15:11

Researchers believe that cracking the brain's "neural code" could lead to AI surpassing human intelligence in capacity and speed. This neural code refers to how the brain processes sensory information and performs cognitive tasks like learning and problem-solving.

Neuroscience News
Robot Deception: Some Lies Accepted, Others Rejected
6 September 2024 at 22:54

Robot Deception: Some Lies Accepted, Others Rejected

Neuroscience News

By: Neuroscience News

6 September 2024 at 22:54

A new study examined how humans perceive different types of deception by robots, revealing that people accept some lies more than others. Researchers presented nearly 500 participants with scenarios where robots engaged in external, hidden, and superficial deceptions in medical, cleaning, and retail settings. Participants disapproved most of hidden deceptions, such as a cleaning robot secretly filming, while external lies, like sparing a patient from emotional pain, were viewed more favorably.

Neuroscience News
AI Models Complex Molecular States with Precision
26 August 2024 at 02:34

AI Models Complex Molecular States with Precision

Neuroscience News

By: Neuroscience News

26 August 2024 at 02:34

Researchers developed a brain-inspired AI technique using neural networks to model the challenging quantum states of molecules, crucial for technologies like solar panels and photocatalysts. This new approach significantly improves accuracy, enabling better prediction of molecular behaviors during energy transitions. By enhancing our understanding of molecular excited states, this research could revolutionize material prototyping and chemical synthesis.

Neuroscience News
AI Model Predicts Autism in Toddlers with 80% Accuracy
19 August 2024 at 22:27

AI Model Predicts Autism in Toddlers with 80% Accuracy

Neuroscience News

By: Neuroscience News

19 August 2024 at 22:27

A new machine learning model, AutMedAI, can predict autism in children under two with nearly 80% accuracy, offering a promising tool for early detection and intervention. The model analyzes 28 parameters available before 24 months, such as age of first smile and eating difficulties, to identify children likely to have autism. Early diagnosis is crucial for optimal development, and further validation of the model is underway.

Neuroscience News
AI Lacks Independent Learning, Poses No Existential Threat
12 August 2024 at 23:41

AI Lacks Independent Learning, Poses No Existential Threat

Neuroscience News

By: Neuroscience News

12 August 2024 at 23:41

New research reveals that large language models (LLMs) like ChatGPT cannot learn independently or acquire new skills without explicit instructions, making them predictable and controllable. The study dispels fears of these models developing complex reasoning abilities, emphasizing that while LLMs can generate sophisticated language, they are unlikely to pose existential threats. However, the potential misuse of AI, such as generating fake news, still requires attention.

Neuroscience News
Can We Hear Temperature? New Study Says Yes
8 August 2024 at 22:10

Can We Hear Temperature? New Study Says Yes

Neuroscience News

By: Neuroscience News

8 August 2024 at 22:10

This shows a man wearing headphones on a cold day.

Researchers discovered that humans can detect water temperature through its sound. Using machine learning, they analyzed how people perceive thermal properties via auditory cues.

AI Helps Decode the Language of DNA

Neuroscience News

By: Neuroscience News

5 August 2024 at 23:31

Researchers have developed GROVER, an AI language model trained on human DNA, to decode the complex information in our genome. GROVER treats DNA as a language, learning its rules and context to extract biological meanings, such as gene promoters and protein binding sites. This innovative approach could revolutionize genomics and personalized medicine by unlocking hidden layers of genetic information. The findings suggest that DNA functions are encoded in sequences, offering new insights into disease predispositions and treatments.

Neuroscience News
Study Finds Faces Evolve to Match Names Over Time
29 July 2024 at 23:44

Study Finds Faces Evolve to Match Names Over Time

Neuroscience News

By: Neuroscience News

29 July 2024 at 23:44

A new study reveals that a person's face tends to evolve to suit their name, demonstrating the profound impact of social expectations. The research showed that adults' faces could be matched to their names with high accuracy, while children's faces could not.

Neuroscience News
Consciousness in AI: Distinguishing Reality from Simulation
19 July 2024 at 22:44

Consciousness in AI: Distinguishing Reality from Simulation

Neuroscience News

By: Neuroscience News

19 July 2024 at 22:44

A new study examines the possibility of consciousness in artificial systems, focusing on ruling out scenarios where AI appears conscious without actually being so. Using the free energy principle, the study highlights that while some information processes of living organisms can be simulated by computers, the causal structure differences between brains and computers may be crucial for consciousness. This approach aims to prevent the inadvertent creation of artificial consciousness and mitigate deception by seemingly conscious AI.

Neuroscience News
AI Identifies Three Parkinson’s Subtypes
17 July 2024 at 00:10

AI Identifies Three Parkinson’s Subtypes

Neuroscience News

By: Neuroscience News

17 July 2024 at 00:10

Researchers used machine learning to identify three subtypes of Parkinson’s disease based on progression speed. These subtypes, marked by distinct genetic drivers, could enhance diagnosis and treatment strategies.

Neuroscience News
People Prefer AI in Fairness-Related Decisions
16 July 2024 at 22:21

People Prefer AI in Fairness-Related Decisions

Neuroscience News

By: Neuroscience News

16 July 2024 at 22:21

A new study reveals that over 60% of participants prefer AI over humans for redistributive decisions, despite finding AI decisions less satisfying and fair. Researchers conducted an online experiment with over 200 participants from the UK and Germany.

Andrew Ng: Unbiggen AI

IEEE Spectrum Recent Content full text

By: Eliza Strickland

9 February 2022 at 16:31

Andrew Ng has serious street cred in artificial intelligence. He pioneered the use of graphics processing units (GPUs) to train deep learning models in the late 2000s with his students at Stanford University, cofounded Google Brain in 2011, and then served for three years as chief scientist for Baidu, where he helped build the Chinese tech giant’s AI group. So when he says he has identified the next big shift in artificial intelligence, people listen. And that’s what he told IEEE Spectrum in an exclusive Q&A.

Ng’s current efforts are focused on his company Landing AI, which built a platform called LandingLens to help manufacturers improve visual inspection with computer vision. He has also become something of an evangelist for what he calls the data-centric AI movement, which he says can yield “small data” solutions to big issues in AI, including model efficiency, accuracy, and bias.

Andrew Ng on...

The great advances in deep learning over the past decade or so have been powered by ever-bigger models crunching ever-bigger amounts of data. Some people argue that that’s an unsustainable trajectory. Do you agree that it can’t go on that way?

Andrew Ng: This is a big question. We’ve seen foundation models in NLP [natural language processing]. I’m excited about NLP models getting even bigger, and also about the potential of building foundation models in computer vision. I think there’s lots of signal to still be exploited in video: We have not been able to build foundation models yet for video because of compute bandwidth and the cost of processing video, as opposed to tokenized text. So I think that this engine of scaling up deep learning algorithms, which has been running for something like 15 years now, still has steam in it. Having said that, it only applies to certain problems, and there’s a set of other problems that need small data solutions.

When you say you want a foundation model for computer vision, what do you mean by that?

Ng: This is a term coined by Percy Liang and some of my friends at Stanford to refer to very large models, trained on very large data sets, that can be tuned for specific applications. For example, GPT-3 is an example of a foundation model [for NLP]. Foundation models offer a lot of promise as a new paradigm in developing machine learning applications, but also challenges in terms of making sure that they’re reasonably fair and free from bias, especially if many of us will be building on top of them.

What needs to happen for someone to build a foundation model for video?

Ng: I think there is a scalability problem. The compute power needed to process the large volume of images for video is significant, and I think that’s why foundation models have arisen first in NLP. Many researchers are working on this, and I think we’re seeing early signs of such models being developed in computer vision. But I’m confident that if a semiconductor maker gave us 10 times more processor power, we could easily find 10 times more video to build such models for vision.

Having said that, a lot of what’s happened over the past decade is that deep learning has happened in consumer-facing companies that have large user bases, sometimes billions of users, and therefore very large data sets. While that paradigm of machine learning has driven a lot of economic value in consumer software, I find that that recipe of scale doesn’t work for other industries.

Back to top

It’s funny to hear you say that, because your early work was at a consumer-facing company with millions of users.

Ng: Over a decade ago, when I proposed starting the Google Brain project to use Google’s compute infrastructure to build very large neural networks, it was a controversial step. One very senior person pulled me aside and warned me that starting Google Brain would be bad for my career. I think he felt that the action couldn’t just be in scaling up, and that I should instead focus on architecture innovation.

“In many industries where giant data sets simply don’t exist, I think the focus has to shift from big data to good data. Having 50 thoughtfully engineered examples can be sufficient to explain to the neural network what you want it to learn.”
—Andrew Ng, CEO & Founder, Landing AI

I remember when my students and I published the first NeurIPS workshop paper advocating using CUDA, a platform for processing on GPUs, for deep learning—a different senior person in AI sat me down and said, “CUDA is really complicated to program. As a programming paradigm, this seems like too much work.” I did manage to convince him; the other person I did not convince.

I expect they’re both convinced now.

Ng: I think so, yes.

Over the past year as I’ve been speaking to people about the data-centric AI movement, I’ve been getting flashbacks to when I was speaking to people about deep learning and scalability 10 or 15 years ago. In the past year, I’ve been getting the same mix of “there’s nothing new here” and “this seems like the wrong direction.”

Back to top

How do you define data-centric AI, and why do you consider it a movement?

Ng: Data-centric AI is the discipline of systematically engineering the data needed to successfully build an AI system. For an AI system, you have to implement some algorithm, say a neural network, in code and then train it on your data set. The dominant paradigm over the last decade was to download the data set while you focus on improving the code. Thanks to that paradigm, over the last decade deep learning networks have improved significantly, to the point where for a lot of applications the code—the neural network architecture—is basically a solved problem. So for many practical applications, it’s now more productive to hold the neural network architecture fixed, and instead find ways to improve the data.

When I started speaking about this, there were many practitioners who, completely appropriately, raised their hands and said, “Yes, we’ve been doing this for 20 years.” This is the time to take the things that some individuals have been doing intuitively and make it a systematic engineering discipline.

The data-centric AI movement is much bigger than one company or group of researchers. My collaborators and I organized a data-centric AI workshop at NeurIPS, and I was really delighted at the number of authors and presenters that showed up.

You often talk about companies or institutions that have only a small amount of data to work with. How can data-centric AI help them?

Ng: You hear a lot about vision systems built with millions of images—I once built a face recognition system using 350 million images. Architectures built for hundreds of millions of images don’t work with only 50 images. But it turns out, if you have 50 really good examples, you can build something valuable, like a defect-inspection system. In many industries where giant data sets simply don’t exist, I think the focus has to shift from big data to good data. Having 50 thoughtfully engineered examples can be sufficient to explain to the neural network what you want it to learn.

When you talk about training a model with just 50 images, does that really mean you’re taking an existing model that was trained on a very large data set and fine-tuning it? Or do you mean a brand new model that’s designed to learn only from that small data set?

Ng: Let me describe what Landing AI does. When doing visual inspection for manufacturers, we often use our own flavor of RetinaNet. It is a pretrained model. Having said that, the pretraining is a small piece of the puzzle. What’s a bigger piece of the puzzle is providing tools that enable the manufacturer to pick the right set of images [to use for fine-tuning] and label them in a consistent way. There’s a very practical problem we’ve seen spanning vision, NLP, and speech, where even human annotators don’t agree on the appropriate label. For big data applications, the common response has been: If the data is noisy, let’s just get a lot of data and the algorithm will average over it. But if you can develop tools that flag where the data’s inconsistent and give you a very targeted way to improve the consistency of the data, that turns out to be a more efficient way to get a high-performing system.

“Collecting more data often helps, but if you try to collect more data for everything, that can be a very expensive activity.”
—Andrew Ng

For example, if you have 10,000 images where 30 images are of one class, and those 30 images are labeled inconsistently, one of the things we do is build tools to draw your attention to the subset of data that’s inconsistent. So you can very quickly relabel those images to be more consistent, and this leads to improvement in performance.

Could this focus on high-quality data help with bias in data sets? If you’re able to curate the data more before training?

Ng: Very much so. Many researchers have pointed out that biased data is one factor among many leading to biased systems. There have been many thoughtful efforts to engineer the data. At the NeurIPS workshop, Olga Russakovsky gave a really nice talk on this. At the main NeurIPS conference, I also really enjoyed Mary Gray’s presentation, which touched on how data-centric AI is one piece of the solution, but not the entire solution. New tools like Datasheets for Datasets also seem like an important piece of the puzzle.

One of the powerful tools that data-centric AI gives us is the ability to engineer a subset of the data. Imagine training a machine-learning system and finding that its performance is okay for most of the data set, but its performance is biased for just a subset of the data. If you try to change the whole neural network architecture to improve the performance on just that subset, it’s quite difficult. But if you can engineer a subset of the data you can address the problem in a much more targeted way.

When you talk about engineering the data, what do you mean exactly?

Ng: In AI, data cleaning is important, but the way the data has been cleaned has often been in very manual ways. In computer vision, someone may visualize images through a Jupyter notebook and maybe spot the problem, and maybe fix it. But I’m excited about tools that allow you to have a very large data set, tools that draw your attention quickly and efficiently to the subset of data where, say, the labels are noisy. Or to quickly bring your attention to the one class among 100 classes where it would benefit you to collect more data. Collecting more data often helps, but if you try to collect more data for everything, that can be a very expensive activity.

For example, I once figured out that a speech-recognition system was performing poorly when there was car noise in the background. Knowing that allowed me to collect more data with car noise in the background, rather than trying to collect more data for everything, which would have been expensive and slow.

Back to top

What about using synthetic data, is that often a good solution?

Ng: I think synthetic data is an important tool in the tool chest of data-centric AI. At the NeurIPS workshop, Anima Anandkumar gave a great talk that touched on synthetic data. I think there are important uses of synthetic data that go beyond just being a preprocessing step for increasing the data set for a learning algorithm. I’d love to see more tools to let developers use synthetic data generation as part of the closed loop of iterative machine learning development.

Do you mean that synthetic data would allow you to try the model on more data sets?

Ng: Not really. Here’s an example. Let’s say you’re trying to detect defects in a smartphone casing. There are many different types of defects on smartphones. It could be a scratch, a dent, pit marks, discoloration of the material, other types of blemishes. If you train the model and then find through error analysis that it’s doing well overall but it’s performing poorly on pit marks, then synthetic data generation allows you to address the problem in a more targeted way. You could generate more data just for the pit-mark category.

“In the consumer software Internet, we could train a handful of machine-learning models to serve a billion users. In manufacturing, you might have 10,000 manufacturers building 10,000 custom AI models.”
—Andrew Ng

Synthetic data generation is a very powerful tool, but there are many simpler tools that I will often try first. Such as data augmentation, improving labeling consistency, or just asking a factory to collect more data.

Back to top

To make these issues more concrete, can you walk me through an example? When a company approaches Landing AI and says it has a problem with visual inspection, how do you onboard them and work toward deployment?

Ng: When a customer approaches us we usually have a conversation about their inspection problem and look at a few images to verify that the problem is feasible with computer vision. Assuming it is, we ask them to upload the data to the LandingLens platform. We often advise them on the methodology of data-centric AI and help them label the data.

One of the foci of Landing AI is to empower manufacturing companies to do the machine learning work themselves. A lot of our work is making sure the software is fast and easy to use. Through the iterative process of machine learning development, we advise customers on things like how to train models on the platform, when and how to improve the labeling of data so the performance of the model improves. Our training and software supports them all the way through deploying the trained model to an edge device in the factory.

How do you deal with changing needs? If products change or lighting conditions change in the factory, can the model keep up?

Ng: It varies by manufacturer. There is data drift in many contexts. But there are some manufacturers that have been running the same manufacturing line for 20 years now with few changes, so they don’t expect changes in the next five years. Those stable environments make things easier. For other manufacturers, we provide tools to flag when there’s a significant data-drift issue. I find it really important to empower manufacturing customers to correct data, retrain, and update the model. Because if something changes and it’s 3 a.m. in the United States, I want them to be able to adapt their learning algorithm right away to maintain operations.

In the consumer software Internet, we could train a handful of machine-learning models to serve a billion users. In manufacturing, you might have 10,000 manufacturers building 10,000 custom AI models. The challenge is, how do you do that without Landing AI having to hire 10,000 machine learning specialists?

So you’re saying that to make it scale, you have to empower customers to do a lot of the training and other work.

Ng: Yes, exactly! This is an industry-wide problem in AI, not just in manufacturing. Look at health care. Every hospital has its own slightly different format for electronic health records. How can every hospital train its own custom AI model? Expecting every hospital’s IT personnel to invent new neural-network architectures is unrealistic. The only way out of this dilemma is to build tools that empower the customers to build their own models by giving them tools to engineer the data and express their domain knowledge. That’s what Landing AI is executing in computer vision, and the field of AI needs other teams to execute this in other domains.

Is there anything else you think it’s important for people to understand about the work you’re doing or the data-centric AI movement?

Ng: In the last decade, the biggest shift in AI was a shift to deep learning. I think it’s quite possible that in this decade the biggest shift will be to data-centric AI. With the maturity of today’s neural network architectures, I think for a lot of the practical applications the bottleneck will be whether we can efficiently get the data we need to develop systems that work well. The data-centric AI movement has tremendous energy and momentum across the whole community. I hope more researchers and developers will jump in and work on it.

Back to top

This article appears in the April 2022 print issue as “Andrew Ng, AI Minimalist.”

Normal view