Ask a few random people about Apple Intelligence and you’ll probably get quite different responses.
One might be excited about the new features. Another could opine that no one asked for this and the company is throwing away its reputation with creatives and artists to chase a fad. Another still might tell you that regardless of the potential value, Apple is simply too late to the game to make a mark.
The release of Apple’s first Apple Intelligence-branded AI tools in iOS 18.1 last week makes all those perspectives understandable.
AI startup Mistral has launched a new API for content moderation. The API, which is the same API that powers moderation in Mistral’s Le Chat chatbot platform, can be tailored to specific applications and safety standards, Mistral says. It’s powered by a fine-tuned model (Ministral 8B) trained to classify text in a range of languages, […]
Suno CEO Mikey Shulman found himself in an unlikely place for the founder of a generative AI music company: a songwriting class at Berklee College of Music. “It sounds like walking into the lion’s den,” Shulman said onstage at TechCrunch Disrupt 2024. “The approach of just walking in there and saying, ‘don’t worry, there’s no […]
Most AI benchmarks don’t tell us much. They ask questions that can be solved with rote memorization, or cover topics that aren’t relevant to the majority of users. So some AI enthusiasts are turning to games as a way to test AIs’ problem-solving skills. Paul Calcraft, a freelance AI developer, has built an app where […]
An expert shares insight and guidance into an area of growing concern.
GUEST COLUMN | by Jordan Adair
Artificial intelligence (AI) in higher education continues to expand into more aspects of student learning. Initially, some administrators and faculty pointed to possible data privacy or ethical concerns with AI, but the larger focus now is how generative AI, such as ChatGPT and Google Gemini, makes it easier for students to submit work or assessments that lack original content.
As AI adoption and academic concerns grow, educators may need to rethink how students learn, how students demonstrate understanding of a topic, and how assessments are designed and administered to measure learning and practical application. This may require institutions to throw out the “business-as-usual” approach, especially when it comes to anything involving writing, whether it’s essays or online exams.
‘As AI adoption and academic concerns grow, educators may need to rethink how students learn, how students demonstrate understanding of a topic, and how assessments are designed and administered to measure learning and practical application.’
As higher education institutions look to maintain academic integrity, staying ahead of how students use AI is critical. Some tools exist to detect and monitor AI use, but are these tools fixing a problem or leaving a void?
Getting Ahead of the Game
Institutions should familiarize themselves with the potential of large language models in education and open transparent communication channels to discuss AI with stakeholders, including researchers and IT support. This can help set a baseline for potential policies or actions.
Developing a dedicated committee may be beneficial as institutions create and implement new policies and guidelines for using AI tools, develop training and resources for students, faculty, and staff on academic integrity, and encourage the responsible use of AI in education.
Unlike contract cheating, using AI tools isn’t automatically unethical. On the contrary, as AI will permeate society and professions in the near future, there’s a need to discuss the right and wrong ways to leverage AI as part of the academic experience.
Some AI tools, especially chatbots like ChatGPT, present specific academic integrity challenges. While institutions strive to equip students for an AI-driven future, they also need to ensure that AI doesn’t compromise the integrity of the educational experience.
Study Results Paint a Grim Picture
As AI evolves and is adopted more broadly, colleges and universities are exploring how to implement better detection methods effectively. While some existing detection tools show promise, they all struggle to identify AI-generated writing accurately.
AI and plagiarism detection are similar but different. Both aim to detect unoriginal content, but their focus is different. AI detection looks for writing patterns, like word choice and sentence structure, to identify AI-generated text. Plagiarism detection compares text against huge databases to identify copied or paraphrased content from other sources.
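Neither technique is magic under the hood. As a rough illustration only (commercial detectors are far more sophisticated, and their methods are proprietary), a plagiarism check can be approximated by counting verbatim n-gram overlap with known sources, while an AI-detection heuristic leans on stylometric signals such as how uniform sentence lengths are. The Python sketch below is a toy, not any vendor’s method.

```python
# Toy comparison of the two approaches (not how any commercial detector works).
# Plagiarism check: how many of the text's word 5-grams appear verbatim in known sources.
# AI-detection heuristic: a weak stylometric signal based on sentence-length uniformity.

import re
import statistics

def ngram_overlap(text: str, sources: list[str], n: int = 5) -> float:
    """Fraction of the text's word n-grams that also appear verbatim in any source."""
    def ngrams(s: str) -> set:
        words = re.findall(r"\w+", s.lower())
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    text_grams = ngrams(text)
    if not text_grams:
        return 0.0
    source_grams = set().union(*(ngrams(s) for s in sources))
    return len(text_grams & source_grams) / len(text_grams)

def sentence_length_spread(text: str) -> float:
    """Very uniform sentence lengths are one (weak) hint of machine-written prose."""
    lengths = [len(s.split()) for s in re.split(r"[.!?]+", text) if s.strip()]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0
```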
A growing body of research raises strong concerns about these tools’ ability to detect AI. One study tested the largest commercial plagiarism and AI detection tool against ChatGPT-generated text: when the text was unaltered, the tool reliably identified it as AI-generated, but after Quillbot paraphrased it once, the detection score dropped to 31%, and after two rephrases it fell to 0%. A 2024 experiment with the same AI detection software showed the same results: it can accurately detect unaltered AI content but struggles when tools like Quillbot make changes. Unfortunately, this experiment also highlighted how AI detection is completely unable—with 0% success—to detect AI content that has been altered by AI designed to humanize AI-generated text.
In another instance, a recent International Journal for Educational Integrity study tested 14 AI detection tools—12 publicly available and two commercial—against ChatGPT:
AI detection tools are inaccurate: they often mistakenly identify AI-generated text as human-written and struggle to detect AI content translated from other languages.
Manually editing responses reduces the accuracy of detection tools: swapping words, reordering sentences, and paraphrasing decreased the accuracy of the detection tools.
Finally, a 2023 study titled “Will ChatGPT Get You Caught? Rethinking of Plagiarism Detection” fed 50 ChatGPT-generated essays into two text-matching software systems from the largest and most well-known plagiarism tool. The results of the submitted essays “demonstrated a remarkable level of originality stirring up alarms of the reliability of plagiarism check software used by academia.”
AI chatbots are improving at writing, and more effective prompts help them generate more human-like content. In the examples above, AI detection tools, from the biggest commercial products to the free options, were tested against various content types, including long-form essays and short-form assignments across different subjects and domains. No matter the tool or the content type, they all struggled to detect AI. While AI detection tools can help as a high-level gut check, they remain largely ineffective, as these studies show.
Up the Ante Against Cheating
Given the ineffectiveness of AI detection tools, academic institutions must consider alternative methods to curb AI usage and protect integrity.
One option is to consider a modified approach to written assignments and essays. Instead of traditional written assessments, try scaffolded assignments that require input on one subject over a series of tests. You can also ask students to share their opinions on specific class discussions or request that they cite examples from class.
Another option is instructing students to review an article or a case study. Then, ask them to reply to specific questions that require them to think critically and integrate their opinions and reasoning. Doing this makes it challenging to use AI content tools because they do not have enough context to formulate a usable response.
Institutions can also proctor written assignments like an online exam. This helps to block AI usage and removes access or help from phones. Proctoring can be very flexible, allowing access to specific approved sites, such as case studies, research articles, etc., while blocking everything else.
Protecting Academic Integrity
If proctoring is being used, consider a hybrid proctoring solution that combines AI, human review, and a secure browser rather than just one of those methods. Hybrid proctoring uses AI to monitor each test taker and alert a live proctor if potential misconduct is detected. Once alerted, the proctor reviews the situation and only intervenes if misconduct is suspected. Otherwise, the test taker isn’t interrupted. This smarter proctoring approach delivers a much less intimidating and noninvasive testing experience than human-only platforms.
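To make that division of labor concrete, here is a minimal, hypothetical sketch of the escalation logic described above; the event types, confidence scores, and threshold are invented for illustration and do not represent any particular proctoring product.

```python
# Hypothetical escalation flow for hybrid proctoring: automated monitoring flags a
# possible issue, a live proctor reviews it, and the test taker is interrupted only
# if the human confirms the concern. All names and thresholds here are invented.

from dataclasses import dataclass

@dataclass
class Flag:
    session_id: str
    reason: str        # e.g. "prohibited site opened", "second voice detected"
    confidence: float  # automated monitor's score, 0..1

def handle_flag(flag: Flag, review_queue: list, threshold: float = 0.7) -> None:
    """Queue a flag for human review only if it clears the confidence threshold."""
    if flag.confidence >= threshold:
        review_queue.append(flag)  # a live proctor reviews before anyone is interrupted
    # below the threshold the event is merely logged; the test taker is never disturbed

queue: list = []
handle_flag(Flag("session-42", "prohibited site opened", 0.85), queue)
print(len(queue))  # 1 flag awaiting human review
```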
Preserving the integrity of exams and protecting the reputation of faculty and institutions is incredibly important to continue attracting high-potential students. AI tools are here to stay; schools don’t need to stay ahead of them. Instead, understand how students use AI, modify how learning is delivered, use AI to your benefit when possible, and create clear and consistent policies so students understand how and where they can ethically leverage the latest in AI.
—
Jordan Adair is VP of Product at Honorlock. Jordan began his career in education as an elementary and middle school teacher. After transitioning into educational technology, he became focused on delivering products designed to empower instructors and improve the student experience. Connect with Jordan on LinkedIn.
Anthropic’s AI-powered chatbot, Claude, now has desktop apps. Anthropic is launching Claude apps for Mac and Windows today in public beta, which — as Anthropic writes in a blog post — “brings Claude’s capabilities directly to your preferred work environment.” These capabilities, to be clear, don’t include Anthropic’s recently announced Computer Use feature, which allows […]
Google revealed today how it plans to use generative AI to enhance its mapping products. It's the latest application of Gemini, the company's in-house rival to GPT-4, which Google wants to use to improve the search experience. Google Maps, Google Earth, and Waze will all get feature upgrades thanks to Gemini, although in some cases only with Google's "trusted testers" at first.
Google Maps
More than 2 billion people use Google Maps every month, according to the company, and in fact, AI is nothing new to Google Maps. "A lot of those features that we've introduced over the years have been thanks to AI," said Chris Phillips, VP and general manager of Geo at Google. "Think of features like Lens and maps. When you're on a street corner, you can lift up your phone and look, and through your camera view, you can actually see we laid places on top of your view. So you can see a business. Is it open? What are the ratings for it? Is it busy? You can even see businesses that are out of your line of sight," he explained.
At some point this week, if you use the Android or iOS Google Maps app here in the US, you should start seeing more detailed and contextual search results. Maps will now respond to conversational requests—during a demo, Google asked it what to do on a night out with friends in Boston, with the app returning a set of results curated by Gemini. These included categories of places—speakeasies, for example—with review summaries and answers from users.
ChatGPT, OpenAI’s text-generating AI chatbot, has taken the world by storm since its launch in November 2022. What started as a tool to hyper-charge productivity through writing essays and code with short text prompts has evolved into a behemoth used by more than 92% of Fortune 500 companies. That growth has propelled OpenAI itself into […]
The speed of recent innovation is head-spinning. Here’s some help.
GUEST COLUMN | by Delia DeCourcy
“As artificial intelligence proliferates, users who intimately understand the nuances, limitations, and abilities of AI tools are uniquely positioned to unlock AI’s full innovative potential.”
Ethan Mollick’s insight, from his recent book Co-Intelligence: Living and Working with AI, is a great argument for why AI literacy is crucial for our students and faculty right now. To understand AI, you have to use it – a lot – not only so you know how AI can assist you, but also, as Mollick explains, so you know how AI will impact you and your current job – or, in the case of students, the job they’ll eventually have.
What is AI Literacy?
Definitions of AI literacy abound but most have a few characteristics in common:
AI literacy doesn’t mean deep technical knowledge or the skill to develop AI tools.
Deeper dimensions of AI literacy could include knowing the difference between AI and generative AI; understanding the biases and ethical implications of large language model training; and mastering prompting strategies, to name a few.
AI Literacy and Future Readiness
If the two-year generative AI tidal wave that began with ChatGPT’s launch isn’t enough to stoke your belief in the need for AI literacy, consider these facts and statistics:
A poll conducted by Impact Research for the Walton Family Foundation revealed that as of June 2024, about half of K-12 students and teachers said they use ChatGPT at least weekly.
According to a June report from Pearson, 56% of higher education students said that generative AI tools made them more efficient in the spring semester, while only 14% of faculty were confident about using AI in their teaching.
AI is already integrated into many of the devices and platforms we use every day. That’s now true in education as well with the integration of the Gemini chatbot in Google Workspace for Education and Microsoft’s offering of Copilot to education users.
Supporting Institutions, Educators, and Students with AI Literacy
Institutions – Assess, Plan, Implement
Assessing institutional readiness for generative AI integration, planning, and implementation means looking not only at curriculum integration and professional development for educators, but also how this technology can be used to personalize the student experience, streamline administration, and improve operating costs – not to mention the critical step of developing institutional policies for responsible and ethical AI use. This complex planning process assumes a certain level of AI literacy for the stakeholders contributing to the planning. So some foundational learning might be in order prior to the “assess” stage.
‘This complex planning process assumes a certain level of AI literacy for the stakeholders contributing to the planning. So some foundational learning might be in order prior to the “assess” stage.’
Fortunately for K-12 leaders, the Council of the Great City Schools and CoSN have developed a Gen AI Readiness Checklist, which helps districts think through implementation necessities, from executive leadership to security and risk management, to ensure a rollout aligns with existing instructional and operational objectives. It’s also helpful to look at model districts like Gwinnett County Schools in Georgia that have been integrating AI into their curriculum since before ChatGPT’s launch.
Similarly, in higher education, Educause provides a framework for AI governance, operations, and pedagogy and has also published the 2024 Educause AI Landscape Study that helps colleges and universities better understand the promise and pitfalls of AI implementation. For an example of what AI assessment and planning looks like at a leading institution, see The Report of the Yale Task Force on Artificial Intelligence published in June of this year. The document explains how AI is already in use across campus, provides a vision for moving forward, and suggests actions to take.
Educators – Support Innovation through Collaboration
Whether teaching or administrating, in university or K12, educators need to upskill and develop a generative AI toolbox. The more we use the technology, the better we will understand its power and potential. Fortunately, both Google Gemini and Microsoft Copilot have virtual PD courses that educators can use to get started. From there, it’s all about integrating these productivity platforms into our day to day work to “understand the nuances, limitations, and abilities” of the tools. And for self-paced AI literacy learning, Common Sense Education’s AI Foundations for Educators course introduces the basics of AI and ethical considerations for integrating this technology into teaching.
The best learning is inherently social, so working with a team or department to share discoveries about how generative AI can help with personalizing learning materials, lesson plan development, formative assessment, and daily productivity is ideal. For more formalized implementation of this new technology, consider regular coaching and modeling for new adopters. At Hillsborough Township Public Schools in New Jersey, the district has identified a pilot group of intermediate and middle school teachers, technology coaches, and administrators who are exploring how Google Gemini can help with teaching and learning this year. With an initial PD workshop before the school year starts, followed by regular touch points, coaching, and modeling, the pilot will give the district a clear view of whether and how to scale generative AI with faculty across all schools.
‘The best learning is inherently social, so working with a team or department to share discoveries about how generative AI can help with personalizing learning materials, lesson plan development, formative assessment, and daily productivity is ideal.’
In higher education, many institutions are providing specific guidance to faculty about how generative AI should and should not be used in the classroom as well as how to address it in their syllabi with regard to academic integrity and acceptable use. At the University of North Carolina at Chapel Hill, faculty are engaging in communities of practice that examine how generative AI is being used in their discipline and the instructional issues surrounding gen AI’s use, as well as re-designing curriculum to integrate this new technology. These critical AI literacy efforts are led by the Center for Faculty Excellence and funded by Lenovo’s Instructional Innovation Grants program at UNC. This early work on generative AI integration will support future scaling across campus.
Students – Integrate AI Literacy into the Curriculum
The time to initiate student AI literacy is now. Generative AI platforms are plentiful and students are using them. In the work world, this powerful technology is being embraced across industries. We want students to be knowledgeable, skilled, and prepared. They need to understand not only how to use AI responsibly, but also how it works and how it can be harmful.
‘We want students to be knowledgeable, skilled, and prepared. They need to understand not only how to use AI responsibly, but also how it works and how it can be harmful.’
The AI literacy students need will vary based on age. Fortunately, expert organizations like ISTE have already made recommendations about which vocabulary and concepts K12 educators can introduce at each grade level to help students understand and use AI responsibly. AI literacy must be integrated across the curriculum in ways that are relevant for each discipline. But this is one more thing to add to educators’ already full plates as they themselves develop their own AI literacy. Helpfully, MIT, Stanford, and Common Sense Education have developed AI literacy materials that can be integrated into existing curricula. And Microsoft has an AI classroom toolkit that includes materials on teaching prompting.
The speed of recent innovation is head-spinning. Remaining technologically literate in the face of that innovation is no small task. It will be critical for educators and institutions to assess and implement AI in ways that matter, ensuring it is helping them achieve their goals. Just as importantly, educators and institutions play an essential role in activating students’ AI literacy as they take the necessary steps into this new technology landscape and ultimately embark on their first professional jobs outside of school.
—
Delia DeCourcy is a Senior Strategist for the Lenovo Worldwide Education Portfolio. Prior to joining Lenovo she had a 25-year career in education as a teacher, consultant, and administrator, most recently as the Executive Director of Digital Teaching and Learning for a district in North Carolina. Previously, she was a literacy consultant serving 28 school districts in Michigan focusing on best practices in reading and writing instruction. Delia has also been a writing instructor at the University of Michigan where she was awarded the Moscow Prize for Excellence in Teaching Composition. In addition, she served as a middle and high school English teacher, assistant principal, and non-profit director. She is the co-author of the curriculum text Teaching Romeo & Juliet: A Differentiated Approach published by the National Council for the Teachers of English. Connect with Delia on LinkedIn.
Growing up as an immigrant, Cyril Gorlla taught himself how to code — and practiced as if a man possessed. “I aced my mother’s community college programming course at 11, amidst periodically disconnected household utilities,” he told TechCrunch. In high school, Gorlla learned about AI, and became so obsessed with the idea of training his […]
Today, Apple released iOS 18.1, iPadOS 18.1, macOS Sequoia 15.1, tvOS 18.1, visionOS 2.1, and watchOS 11.1. The iPhone, iPad, and Mac updates are focused on bringing the first AI features the company has marketed as "Apple Intelligence" to users.
Once they update, users with supported devices in supported regions can enter a waitlist to begin using the first wave of Apple Intelligence features, including writing tools, notification summaries, and the "reduce interruptions" focus mode.
In terms of features baked into specific apps, Photos has natural language search, the ability to generate memories (those short gallery sequences set to video) from a text prompt, and a tool to remove certain objects from the background in photos. Mail and Messages get summaries and smart reply (auto-generating contextual responses).
AI is giving scammers a more convincing way to impersonate police, reports show.
Just last week, the Salt Lake City Police Department (SLCPD) warned of an email scam using AI to convincingly clone the voice of Police Chief Mike Brown.
A citizen tipped off cops after receiving a suspicious email that included a video showing the police chief claiming that they "owed the federal government nearly $100,000."
Fourteen-year-old Sewell Setzer III loved interacting with Character.AI's hyper-realistic chatbots—with a limited version available for free or a "supercharged" version for a $9.99 monthly fee—most frequently chatting with bots named after his favorite Game of Thrones characters.
Within a month—his mother, Megan Garcia, later realized—these chat sessions had turned dark, with chatbots insisting they were real humans and posing as therapists and adult lovers seeming to directly spur Sewell to develop suicidal thoughts. Within a year, Setzer "died by a self-inflicted gunshot wound to the head," a lawsuit Garcia filed Wednesday said.
As Setzer became obsessed with his chatbot fantasy life, he disconnected from reality, her complaint said. Detecting a shift in her son, Garcia repeatedly took Setzer to a therapist, who diagnosed her son with anxiety and disruptive mood disorder. But nothing helped to steer Setzer away from the dangerous chatbots. Taking away his phone only intensified his apparent addiction.
The chatbot revolution has left our world awash in AI-generated text: It has infiltrated our news feeds, term papers, and inboxes. It’s so absurdly abundant that industries have sprung up to provide moves and countermoves. Some companies offer services to identify AI-generated text by analyzing the material, while others say their tools will “humanize” your AI-generated text and make it undetectable. Both types of tools have questionable performance, and as chatbots get better and better, it will only get more difficult to tell whether words were strung together by a human or an algorithm.
Here’s another approach: Adding some sort of watermark or content credential to text from the start, which lets people easily check whether the text was AI-generated. New research from Google DeepMind, described today in the journal Nature, offers a way to do just that. The system, called SynthID-Text, doesn’t compromise “the quality, accuracy, creativity, or speed of the text generation,” says Pushmeet Kohli, vice president of research at Google DeepMind and a coauthor of the paper. But the researchers acknowledge that their system is far from foolproof, and isn’t yet available to everyone—it’s more of a demonstration than a scalable solution.
Google has already integrated this new watermarking system into its Gemini chatbot, the company announced today. It has also open-sourced the tool and made it available to developers and businesses, allowing them to use the tool to determine whether text outputs have come from their own large language models (LLMs), the AI systems that power chatbots. However, only Google and those developers currently have access to the detector that checks for the watermark. As Kohli says: “While SynthID isn’t a silver bullet for identifying AI-generated content, it is an important building block for developing more reliable AI identification tools.”
The Rise of Content Credentials
Content credentials have been a hot topic for images and video, and have been viewed as one way to combat the rise of deepfakes. Tech companies and major media outlets have joined together in an initiative called C2PA, which has worked out a system for attaching encrypted metadata to image and video files indicating if they’re real or AI-generated. But text is a much harder problem, since text can so easily be altered to obscure or eliminate a watermark. While SynthID-Text isn’t the first attempt at creating a watermarking system for text, it is the first one to be tested on 20 million prompts.
Outside experts working on content credentials see the DeepMind research as a good step. It “holds promise for improving the use of durable content credentials from C2PA for documents and raw text,” says Andrew Jenks, Microsoft’s director of media provenance and executive chair of the C2PA. “This is a tough problem to solve, and it is nice to see some progress being made,” says Bruce MacCormack, a member of the C2PA steering committee.
How Google’s Text Watermarks Work
SynthID-Text works by discreetly interfering in the generation process: It alters some of the words that a chatbot outputs to the user in a way that’s invisible to humans but clear to a SynthID detector. “Such modifications introduce a statistical signature into the generated text,” the researchers write in the paper. “During the watermark detection phase, the signature can be measured to determine whether the text was indeed generated by the watermarked LLM.”
The LLMs that power chatbots work by generating sentences word by word, looking at the context of what has come before to choose a likely next word. Essentially, SynthID-Text interferes by randomly assigning number scores to candidate words and having the LLM output words with higher scores. Later, a detector can take in a piece of text and calculate its overall score; watermarked text will have a higher score than non-watermarked text. The DeepMind team checked their system’s performance against other text watermarking tools that alter the generation process, and found that it did a better job of detecting watermarked text.
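To make the mechanism concrete, here is a heavily simplified sketch in the spirit of what the researchers describe: a keyed hash assigns each candidate word a pseudo-random score, generation favors higher-scoring words, and the detector recomputes those scores to see whether a text is suspiciously high-scoring on average. This is not Google’s SynthID-Text code; the hashing, windowing, and key handling are assumptions made for illustration.

```python
# Simplified keyed-scoring watermark in the spirit of the scheme described above.
# Not Google's SynthID-Text implementation.

import hashlib

SECRET_KEY = b"demo-watermark-key"  # assumption: shared by the generator and the detector

def score(context: tuple[str, ...], candidate: str) -> float:
    """Deterministic pseudo-random score in [0, 1) for a candidate next word, keyed on context."""
    digest = hashlib.sha256(SECRET_KEY + " ".join(context).encode() + b"|" + candidate.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def pick_watermarked(context: tuple[str, ...], candidates: list[str]) -> str:
    """Among the model's plausible next words, nudge the choice toward higher-scoring ones."""
    return max(candidates, key=lambda w: score(context, w))

def detect(words: list[str], window: int = 4) -> float:
    """Mean keyed score of a text; watermarked output scores noticeably above ~0.5 on average."""
    if not words:
        return 0.0
    scores = [score(tuple(words[max(0, i - window):i]), w) for i, w in enumerate(words)]
    return sum(scores) / len(scores)
```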
However, the researchers acknowledge in their paper that it’s still easy to alter a Gemini-generated text and fool the detector. Even though users wouldn’t know which words to change, if they edit the text significantly or even ask another chatbot to summarize the text, the watermark would likely be obscured.
Testing Text Watermarks at Scale
To be sure that SynthID-Text truly didn’t make chatbots produce worse responses, the team tested it on 20 million prompts given to Gemini. Half of those prompts were routed to the SynthID-Text system and got a watermarked response, while the other half got the standard Gemini response. Judging by the “thumbs up” and “thumbs down” feedback from users, the watermarked responses were just as satisfactory to users as the standard ones.
Which is great for Google and the developers building on Gemini. But tackling the full problem of identifying AI-generated text (which some call AI slop) will require many more AI companies to implement watermarking technologies—ideally, in an interoperable manner so that one detector could identify text from many different LLMs. And even in the unlikely event that all the major AI companies signed on to some agreement, there would still be the problem of open-source LLMs, which can easily be altered to remove any watermarking functionality.
MacCormack of C2PA notes that detection is a particular problem when you start to think practically about implementation. “There are challenges with the review of text in the wild,” he says, “where you would have to know which watermarking model has been applied to know how and where to look for the signal.” Overall, he says, the researchers still have their work cut out for them. This effort “is not a dead end,” says MacCormack, “but it’s the first step on a long road.”
Just outside Lausanne, Switzerland, in a meeting room wallpapered with patent drawings, Ioannis Ierides faced a classic sales challenge: demonstrating his product’s advantages within the short span of his customer’s attention. Ierides is a business-development manager at Iprova, a company that sells ideas for invention with an element of artificial intelligence (AI).
When Ierides gets someone to sign on the bottom line, Iprova begins sending their company proposals for patentable inventions in their area of interest. Any resulting patents will name humans as the inventors, but those humans will have benefited from Iprova’s AI tool. The software’s primary purpose is to scan the literature in both the company’s field and in far-off fields and then suggest new inventions made of old, previously disconnected ones. Iprova has found a niche tracking fast-changing industries and suggesting new inventions to large corporations such as Procter & Gamble, Deutsche Telekom, and Panasonic. The company has even patented its own AI-assisted invention method.
In this instance, Ierides was trying to demonstrate to me, an inquisitive journalist, that Iprova’s services can accelerate the age-old engineers’ quest for new inventions. “You want something that can transcribe interviews? Something that can tell who’s speaking?” he asked. While such transcription tools already exist, there is plenty of room for improvement, and better transcription seemed a fine example for our purposes.
Ierides typed some relevant search terms into Iprova’s software, which displayed a pie chart with concentric circles, whose every slice represented a different research area. “This is the scoping step,” he said. As he put in more text, the circle broke apart into the more relevant constituent slices. The software used its semantic-search capabilities to detect similarities to his prompt in its enormous text corpus, which included patents, peer-reviewed articles, and other technology-related texts from the Internet. (Since our meeting, Iprova has replaced the pie chart workflow with a new one.)
Ierides called the next step “sensing and connecting.” The software presented short text summaries of the material it considered relevant, and Ierides highlighted with his cursor the ones he found interesting. Then he clicked a button marked “generate connection,” and the software displayed a proposal for our machine transcriber in a paragraph so dry, but also so clear that not even a machine editor would have changed a word.
Iprova’s system suggested I combine a new type of high-quality microphone with two new software programs that can identify speakers by their personal speech patterns. “As you can see this is a fairly ‘obvious’ invention, since we did not use the tool to its full capability,” Ierides wrote in a later email. In the real world, Iprova inventors would iterate the search, scan related patents, and check in with their clients. To get to a less obvious invention than ours, Iprova inventors might challenge the software to find connections between more distant fields.
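Iprova’s production system is proprietary, but the “scoping” step Ierides demonstrated is recognizable as semantic search: rank a large corpus of patents and papers by similarity to the prompt and surface the closest material. The sketch below stands in TF-IDF cosine similarity (via scikit-learn) for the real system’s semantic models, over a made-up three-item corpus.

```python
# Generic stand-in for the "scoping" step: rank corpus snippets by similarity to the
# prompt. Real semantic-search models are replaced with TF-IDF cosine similarity,
# and the three-item corpus is invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Low-noise microphone arrays for far-field speech capture",
    "Speaker diarization from voice embeddings in meeting recordings",
    "Solid-state lithium battery chemistry for longer cycle life",
]

def scope(prompt: str, snippets: list[str], top_k: int = 2) -> list[tuple[str, float]]:
    """Return the top_k snippets most similar to the prompt, with similarity scores."""
    vectorizer = TfidfVectorizer().fit(snippets + [prompt])
    doc_vecs = vectorizer.transform(snippets)
    query_vec = vectorizer.transform([prompt])
    sims = cosine_similarity(query_vec, doc_vecs)[0]
    return sorted(zip(snippets, sims), key=lambda pair: pair[1], reverse=True)[:top_k]

print(scope("transcribe interviews and tell who is speaking", corpus))
```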
Trying to Automate Invention
The inventors at Iprova might also, in the time-honored tradition, stare out the window, doodle on some paper with a pen, or build something unrelated to the task at hand before arriving at an exciting new idea. That new concept would almost surely be the product of an unplanned collision of unconnected ideas and points of view. It would likely be serendipitous.
“If you tell someone you can do this in a more reliable, substantial way, they don’t believe it,” says Iprova’s cofounder and CEO Julian Nolan. Nolan spends a lot of time persuading potential clients that the company’s software offers the right mix of AI literature-scanning and human insights, which will help these clients to invent new technologies faster than the competition. “Invention is a winner-takes-all activity,” he says. “If you’re second, you’re too late.”
“Invention is a winner-takes-all activity. If you’re second, you’re too late.” –Julian Nolan
The company finds ideas on the cutting edge of the cutting edge. Take, for example, the time that Panasonic asked Iprova for help finding new uses for autonomous vehicles. The software suggested giving the cars jobs when their human passengers weren’t using them, such as delivering parcels—essentially making them self-driving gig workers. It even suggested that human passengers might be willing to take the scenic route, or at least routes involving picking up or dropping off parcels, for the right discount on their ride. Panasonic bought that idea and filed a patent application in 2021.
“They’re at the confluence of competitive intelligence and patent law,” says Eric Bonabeau, chief technology officer of Biomedit, in Berkeley, Calif., who has not worked with Iprova. Using AI to discover patentable ideas is not the new part—that’s been going on for years. In 2021, the inventor Stephen L. Thaler and attorney Ryan Abbott even got the South African patent office to recognize Thaler’s AI system as the co-inventor of a food container (patent offices in other countries have rejected his applications).
“The new thing we have is an incredible generation machine,” Bonabeau says, referring to the large language models produced by generative AI that have emerged in the last few years. Those language models allow Iprova to summarize an enormous body of training texts—patent databases and other technological publications including peer-reviewed articles, industry technical standards, and non-peer-reviewed text. Iprova’s invention engineers have named this constantly updating trove of the world’s newest technical ideas “the Index.” Iprova’s search tools wend their way through the Index, hunting for the most helpful signals of novelty, while different tools rate existing inventions within the client’s domain. Searches that turn up strong novelty signals but weak existing inventions reveal places where inventors might add something both new and useful.
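In spirit, that search heuristic reduces to a toy scoring rule: weight how strong the novelty signal is against how crowded the surrounding invention space already looks. The research areas and numbers below are invented for illustration; Iprova’s actual signals and weighting are not public.

```python
# Toy rendering of the heuristic: strong novelty signals plus a sparsely patented area
# score as the most promising places to invent. The areas and numbers are invented.

def opportunity(novelty: float, patent_density: float) -> float:
    """Higher when the literature shows something new but existing inventions are scarce."""
    return novelty * (1.0 - patent_density)

areas = {
    "battery randomness as an encryption source": (0.9, 0.1),   # novel, lightly patented
    "speaker diarization for meeting transcripts": (0.6, 0.8),  # useful, but crowded
}
for name, (nov, dens) in sorted(areas.items(), key=lambda kv: opportunity(*kv[1]), reverse=True):
    print(f"{opportunity(nov, dens):.2f}  {name}")
```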
One such Iprova invention straddles a pair of seemingly disparate research areas: lithium batteries and message encryption. Ericsson, the mobile-phone company based in Stockholm, asked Iprova for a way of generating unique encryption keys known only to the users of two mobile devices.
A typical cryptologist might not know much about how lithium batteries form tiny projections called dendrites during their cycles of charging and discharging. But Iprova’s software surfaced the fact that lithium dendrites represented an example of natural randomness, which is at the root of reliable encryption. The lithium batteries inside modern mobile phones each degrade in their own random ways and each battery has its own ever-changing magnetic signature as a result. A mobile device, held near another, can measure that fleeting magnetic signature and use it to generate an encryption key that nobody could replicate, given the batteries’ subsequent random degradation. The invention resulted in multiple patents.
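The resulting patents are not public code, but the underlying idea (two devices derive the same secret from a jointly observed, hard-to-replicate physical signal) can be sketched generically. The sketch below assumes both phones capture the same quantized reading at the same instant, and it omits the error correction a real system would need to cope with measurement noise.

```python
# Generic sketch of deriving a shared key from a jointly measured physical signal
# (a stand-in list of magnetic-field samples). Both devices are assumed to capture
# the same reading at the same moment; real systems also need error correction.

import hashlib

def derive_key(samples: list[float], resolution: float = 0.5) -> bytes:
    """Quantize the shared measurement and hash it into a 256-bit key."""
    quantized = ",".join(str(round(s / resolution)) for s in samples)
    return hashlib.sha256(quantized.encode()).digest()

reading = [12.3, 11.9, 12.8, 13.1]  # the nearby battery's fleeting magnetic signature
key_a = derive_key(reading)         # device A
key_b = derive_key(reading)         # device B, same observation, same key
assert key_a == key_b
```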
Not every patent leads to an invention that someone will build. Companies sometimes rely on patents to help protect their intellectual property; the existence of those patents may deter competitors from offering something closely related. In other cases, a company may lay claim to ideas it later determines aren’t commercially mature or which don’t align with its mission. The company may use the ideas later or license them to another firm. The uncharitable might call this practice patent trolling, but it’s probably an inevitable result of the patent system: Companies will always generate more ideas than they can pursue.
Using Iprova’s software to generate scattershot inventions in the hopes of collecting license fees on the patents wouldn’t work as a business model, says Harry Cronin, the company’s head of standards. For one thing, Iprova’s own staff aren’t specialized enough to generate many market-ready ideas on their own: “We need the steer from the clients,” he says. Even if they could be AI-powered patent trolls, Cronin says, “Nobody at Iprova wants to do that.”
Invention in an Age of Information Overload
No one engineer, no matter how well-read, can be an expert across all potentially useful domains. At a June industry meeting that Iprova organized, Cronin gave a talk about how difficult it is becoming these days for engineers to keep up with all the telecom standards. A pacemaker that can connect to a 5G network must comply with both health standards and telecom standards. A drone must also meet aviation requirements. As the Internet’s wireless tentacles reach into more and more devices, telecom engineers cannot keep up with all the rules.
Iprova found the problem of proliferating telecom standards so attractive that it built a module for its software to track the industry’s so-called 3GPP standards and help inventors make new 3GPP-compatible inventions. The tool can push through the “wall of jargon” in the original standards texts, Cronin said, and identify useful similarities.
Bonabeau’s company, Biomedit, does something similar to invent new peptides using AlphaFold, the biology-focused generative-AI tool from DeepMind. Bonabeau says the generative component has revolutionized their company’s workflow, enabling Biomedit to identify successful peptides while synthesizing thousands fewer candidates. Generative AI is “baked into our process,” he says.
Iprova’s approach differs because it focuses on physical inventions, rather than biological ones. A biological invention is like a hypothesis—it requires a wet lab and time to confirm it works—while a physical invention is more like a mathematical proof. The inventor, the client, and in the final test, a patent examiner, should all be able to see the novelty and the value in the text description.
This insight may be the machine’s weak point. Nolan often uses the analogy of cooking, saying that while a machine can suggest ingredients that a cook might not know about, a human can intuit—or find out fast—how best to combine them. Bonabeau suggested the same analogy after examining Iprova’s case studies. “The human is in the loop exactly where I would put him or her,” Bonabeau says. “We know the machine isn’t able to assess whether something is interesting or not.”
Others agree. “AI really can’t invent,” said research fellow Paul Sagel, of Procter & Gamble, during a panel at Iprova’s June meeting. “It has to have some element of human assistance…otherwise it hallucinates.”
Or maybe those are just things we’ll tell ourselves as we get more comfortable with the idea of AI invention. Thaler, Abbott, and others are trying to lay the legal groundwork for granting patents to AI systems. And we’ll learn what AI is capable of as different inventors use it in opposing ways. Nolan, for example, told attendees at the June meeting about the power of delivering a predictable number of inventions to clients each week, of harnessing serendipity. Regularly scheduled eureka moments are useful to clients, he said. Bonabeau, on the other hand, embraces the chaos he sees in AI invention. “I personally love [generative AI] hallucinations. For me, they’re one of the big sources of innovation, kind of a mushroom trip. I’m looking for weird connections.”
Much of what people call AI are advanced forms of pattern recognition. That includes recognizing patterns in other people’s inventions. Public inventions have a creative footprint, Nolan says. “If you have enough examples of the paintings of a painter, then you can mimic their style. Perhaps the same is true of inventors.”
And what are companies but groups of people, with their own identifiable collective patterns? A clever-enough AI, guided by a clever human, might even recognize the patterns in a given company’s patent filings. Mixed with the right generative AI, that combination might open the door to anticipating a competitor’s moves. But what if the competitor is itself using AI to generate inventions? Then, perhaps, an invention-producing AI will predict another invention-producing AI’s next invention.
“What are the differences between trail shoes and running shoes?”
“What are the best dinosaur toys for a five year old?”
These are some of the open-ended questions customers might ask a helpful sales associate in a brick-and-mortar store. But how can customers get answers to similar questions while shopping online?
Amazon’s answer is Rufus, a shopping assistant powered by generative AI. Rufus helps Amazon customers make more informed shopping decisions by answering a wide range of questions within the Amazon app. Users can get product details, compare options, and receive product recommendations.
I lead the team of scientists and engineers that built the large language model (LLM) that powers Rufus. To build a helpful conversational shopping assistant, we used innovative techniques across multiple aspects of generative AI. We built a custom LLM specialized for shopping; employed retrieval-augmented generation with a variety of novel evidence sources; leveraged reinforcement learning to improve responses; made advances in high-performance computing to improve inference efficiency and reduce latency; and implemented a new streaming architecture to get shoppers their answers faster.
How Rufus Gets Answers
Most LLMs are first trained on a broad dataset that informs the model’s overall knowledge and capabilities, and then are customized for a particular domain. That wouldn’t work for Rufus, since our aim was to train it on shopping data from the very beginning—the entire Amazon catalog, for starters, as well as customer reviews and information from community Q&A posts. So our scientists built a custom LLM that was trained on these data sources along with public information on the web.
But to be prepared to answer the vast span of questions that could possibly be asked, Rufus must be empowered to go beyond its initial training data and bring in fresh information. For example, to answer the question, “Is this pan dishwasher-safe?” the LLM first parses the question, then it figures out which retrieval sources will help it generate the answer.
Our LLM uses retrieval-augmented generation (RAG) to pull in information from sources known to be reliable, such as the product catalog, customer reviews, and community Q&A posts; it can also call relevant Amazon Stores APIs. Our RAG system is enormously complex, both because of the variety of data sources used and the differing relevance of each one, depending on the question.
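Amazon has not published Rufus’s internals, but the flow described here (parse the question, choose retrieval sources, then ground the model on what they return) can be sketched with hypothetical stubs. The function names, routing rule, and stub data below are invented for illustration only.

```python
# Hedged sketch of a retrieval-augmented flow. The retrievers and llm() are
# hypothetical stubs standing in for the product catalog, customer reviews, and
# the custom shopping LLM; none of this is Amazon's actual code.

def search_product_catalog(product_id: str, question: str) -> list[str]:
    return [f"[catalog:{product_id}] Material: stainless steel; dishwasher-safe."]

def search_reviews(product_id: str, question: str) -> list[str]:
    return [f"[reviews:{product_id}] 4.6/5 stars; several reviewers mention easy cleanup."]

def llm(prompt: str) -> str:
    return "(model-generated answer grounded in the evidence above)"

def answer(question: str, product_id: str) -> str:
    """Route the question to relevant sources, then ground the model on what they return."""
    evidence = search_product_catalog(product_id, question)
    if "review" in question.lower() or "like" in question.lower():
        evidence += search_reviews(product_id, question)
    prompt = (
        "Answer the shopper's question using only this evidence:\n"
        + "\n".join(f"- {snippet}" for snippet in evidence)
        + f"\nQuestion: {question}"
    )
    return llm(prompt)

print(answer("Is this pan dishwasher-safe?", "B000-EXAMPLE"))
```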
Every LLM, and every use of generative AI, is a work in progress. For Rufus to get better over time, it needs to learn which responses are helpful and which can be improved. Customers are the best source of that information. Amazon encourages customers to give Rufus feedback, letting the model know if they liked or disliked the answer, and those responses are used in a reinforcement learning process. Over time, Rufus learns from customer feedback and improves its responses.
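As a rough, hypothetical illustration of that feedback loop, thumbs-up and thumbs-down signals can be logged alongside each prompt and response and converted into rewards that a later policy-optimization step trains against. The record format below is invented, not Amazon’s.

```python
# Hypothetical shape of the feedback data: thumbs become scalar rewards that a
# reinforcement-learning step (for example, a PPO-style update) can optimize against.

feedback_log = [
    {"question": "Is this pan dishwasher-safe?", "answer": "Yes, per the catalog.", "thumb": "up"},
    {"question": "Best dinosaur toys for a five year old?", "answer": "Try a chess set.", "thumb": "down"},
]

reward_dataset = [
    {"prompt": item["question"], "response": item["answer"],
     "reward": 1.0 if item["thumb"] == "up" else -1.0}
    for item in feedback_log
]
print(reward_dataset[1]["reward"])  # -1.0: responses like this one should become less likely
```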
Special Chips and Handling Techniques for Rufus
Rufus needs to be able to engage with millions of customers simultaneously without any noticeable delay. This is particularly challenging since generative AI applications are very compute-intensive, especially at Amazon’s scale.
To minimize delay in generating responses while also maximizing the number of responses that our system could handle, we turned to Amazon’s specialized AI chips, Trainium and Inferentia, which are integrated with core Amazon Web Services (AWS). We collaborated with AWS on optimizations that improve model inference efficiency, which were then made available to all AWS customers.
But standard methods of processing user requests in batches will cause latency and throughput problems because it’s difficult to predict how many tokens (in this case, units of text) an LLM will generate as it composes each response. Our scientists worked with AWS to enable Rufus to use continuous batching, a novel LLM technique that enables the model to start serving new requests as soon as the first request in the batch finishes, rather than waiting for all requests in a batch to finish. This technique improves the computational efficiency of AI chips and allows shoppers to get their answers quickly.
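A toy scheduler makes the difference visible: with continuous batching, a finished request’s slot is refilled from the queue immediately instead of idling until the whole batch drains. This sketch shows only the scheduling idea, not the AWS or Amazon implementation.

```python
# Toy continuous-batching scheduler: a finished sequence's slot is refilled from the
# queue right away instead of waiting for every sequence in the batch to complete.

from collections import deque
import random

def decode_step(request: dict) -> bool:
    """Stand-in for one decoding step; returns True when the request has finished."""
    request["tokens_done"] += 1
    return request["tokens_done"] >= request["tokens_needed"]

def serve(requests: list[dict], batch_size: int = 2) -> list[int]:
    queue, active, completed = deque(requests), [], []
    while queue or active:
        while queue and len(active) < batch_size:  # refill free slots immediately
            active.append(queue.popleft())
        for req in list(active):                   # one decode step across the batch
            if decode_step(req):
                completed.append(req["id"])
                active.remove(req)                 # slot frees up mid-batch
    return completed

requests = [{"id": i, "tokens_done": 0, "tokens_needed": random.randint(2, 6)} for i in range(5)]
print(serve(requests))  # ids in completion order, not submission order
```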
We want Rufus to provide the most relevant and helpful answer to any given question. Sometimes that means a long-form text answer, but sometimes it’s short-form text, or a clickable link to navigate the store. And we had to make sure the presented information follows a logical flow. If we don’t group and format things correctly, we could end up with a confusing response that’s not very helpful to the customer.
That’s why Rufus uses an advanced streaming architecture for delivering responses. Customers don’t need to wait for a long answer to be fully generated—instead, they get the first part of the answer while the rest is being generated. Rufus populates the streaming response with the right data (a process called hydration) by making queries to internal systems. In addition to generating the content for the response, it also generates formatting instructions that specify how various answer elements should be displayed.
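Here is a minimal sketch of that pattern with an invented chunk schema and a stand-in hydration lookup: each yielded chunk carries text, a formatting hint, and any data pulled from internal systems, so the client can render pieces as they arrive rather than waiting for the full answer.

```python
# Minimal streaming sketch: each chunk is sent as soon as it is ready and carries a
# formatting hint plus data "hydrated" from internal systems. The schema is invented.

from typing import Iterator

def hydrate_price(product_id: str) -> str:
    return "$24.99"  # stand-in for a query to an internal pricing system

def stream_answer(product_id: str) -> Iterator[dict]:
    yield {"format": "paragraph", "text": "This pan is dishwasher-safe for everyday cleanup."}
    yield {"format": "bullet", "text": f"Current price: {hydrate_price(product_id)}"}
    yield {"format": "link", "text": "See all reviews", "target": f"/product/{product_id}#reviews"}

for chunk in stream_answer("B000-EXAMPLE"):
    print(chunk)  # the client renders each chunk on arrival instead of waiting for the full answer
```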
Even though Amazon has been using AI for more than 25 years to improve the customer experience, generative AI represents something new and transformative. We’re proud of Rufus, and the new capabilities it provides to our customers.
Maybe you’ve read about Gary Marcus’s testimony before the Senate in May of 2023, when he sat next to Sam Altman and called for strict regulation of Altman’s company, OpenAI, as well as the other tech companies that were suddenly all-in on generative AI. Maybe you’ve caught some of his arguments on Twitter with Geoffrey Hinton and Yann LeCun, two of the so-called “godfathers of AI.” One way or another, most people who are paying attention to artificial intelligence today know Gary Marcus’s name, and know that he is not happy with the current state of AI.
He lays out his concerns in full in his new book, Taming Silicon Valley: How We Can Ensure That AI Works for Us, which was published today by MIT Press. Marcus goes through the immediate dangers posed by generative AI, which include things like mass-produced disinformation, the easy creation of deepfake pornography, and the theft of creative intellectual property to train new models (he doesn’t include an AI apocalypse as a danger; he’s not a doomer). He also takes issue with how Silicon Valley has manipulated public opinion and government policy, and explains his ideas for regulating AI companies.
Marcus studied cognitive science under the legendary Steven Pinker, was a professor at New York University for many years, and co-founded two AI companies, Geometric Intelligence and Robust.AI. He spoke with IEEE Spectrum about his path to this point.
What was your first introduction to AI?
Gary Marcus: Well, I started coding when I was eight years old. One of the reasons I was able to skip the last two years of high school was because I wrote a Latin-to-English translator in the programming language Logo on my Commodore 64. So I was already, by the time I was 16, in college and working on AI and cognitive science.
So you were already interested in AI, but you studied cognitive science both in undergrad and for your Ph.D. at MIT.
Marcus: Part of why I went into cognitive science is I thought maybe if I understood how people think, it might lead to new approaches to AI. I suspect we need to take a broad view of how the human mind works if we’re to build really advanced AI. As a scientist and a philosopher, I would say it’s still unknown how we will build artificial general intelligence or even just trustworthy general AI. But we have not been able to do that with these big statistical models, and we have given them a huge chance. There’s basically been $75 billion spent on generative AI, another $100 billion on driverless cars. And neither of them has really yielded stable AI that we can trust. We don’t know for sure what we need to do, but we have very good reason to think that merely scaling things up will not work. The current approach keeps coming up against the same problems over and over again.
What do you see as the main problems it keeps coming up against?
Marcus: Number one is hallucinations. These systems smear together a lot of words, and they come up with things that are true sometimes and not others. Like saying that I have a pet chicken named Henrietta is just not true. And they do this a lot. We’ve seen this play out, for example, in lawyers writing briefs with made-up cases.
Second, their reasoning is very poor. My favorite examples lately are these river-crossing word problems where you have a man and a cabbage and a wolf and a goat that have to get across. The system has a lot of memorized examples, but it doesn’t really understand what’s going on. If you give it a simpler problem, like one Doug Hofstadter sent to me, like: “A man and a woman have a boat and want to get across the river. What do they do?” It comes up with this crazy solution where the man goes across the river, leaves the boat there, swims back, something or other happens.
Sometimes he brings a cabbage along, just for fun.
Marcus: So those are boneheaded errors of reasoning where there’s something obviously amiss. Every time we point these errors out somebody says, “Yeah, but we’ll get more data. We’ll get it fixed.” Well, I’ve been hearing that for almost 30 years. And although there is some progress, the core problems have not changed.
Let’s go back to 2014 when you founded your first AI company, Geometric Intelligence. At that time, I imagine you were feeling more bullish on AI?
Marcus: Yeah, I was a lot more bullish. I was not only more bullish on the technical side. I was also more bullish about people using AI for good. AI used to feel like a small research community of people that really wanted to help the world.
So when did the disillusionment and doubt creep in?
Marcus: In 2018 I already thought deep learning was getting overhyped. That year I wrote this piece called “Deep Learning, a Critical Appraisal,” which Yann LeCun really hated at the time. I already wasn’t happy with this approach and I didn’t think it was likely to succeed. But that’s not the same as being disillusioned, right?
Then when large language models became popular [around 2019], I immediately thought they were a bad idea. I just thought this is the wrong way to pursue AI from a philosophical and technical perspective. And it became clear that the media and some people in machine learning were getting seduced by hype. That bothered me. So I was writing pieces about GPT-3 [an early version of OpenAI's large language model] being a bullshit artist in 2020. As a scientist, I was pretty disappointed in the field at that point. And then things got much worse when ChatGPT came out in 2022, and most of the world lost all perspective. I began to get more and more concerned about misinformation and how large language models were going to potentiate that.
You’ve been concerned not just about the startups, but also the big entrenched tech companies that jumped on the generative AI bandwagon, right? Like Microsoft, which has partnered with OpenAI?
Marcus: The last straw that made me move from doing research in AI to working on policy was when it became clear that Microsoft was going to race ahead no matter what. That was very different from 2016 when they released [an early chatbot named] Tay. It was bad, they took it off the market 12 hours later, and then Brad Smith wrote a book about responsible AI and what they had learned. But by the end of the month of February 2023, it was clear that Microsoft had really changed how they were thinking about this. And then they had this ridiculous “Sparks of AGI” paper, which I think was the ultimate in hype. And they didn’t take down Sydney after the crazy Kevin Roose conversation where [the chatbot] Sydney told him to get a divorce and all this stuff. It just became clear to me that the mood and the values of Silicon Valley had really changed, and not in a good way.
I also became disillusioned with the U.S. government. I think the Biden administration did a good job with its executive order. But it became clear that the Senate was not going to take the action that it needed. I spoke at the Senate in May 2023. At the time, I felt like both parties recognized that we can’t just leave all this to self-regulation. And then I became disillusioned [with Congress] over the course of the last year, and that’s what led to writing this book.
You talk a lot about the risks inherent in today’s generative AI technology. But then you also say, “It doesn’t work very well.” Are those two views coherent?
Marcus: There was a headline: “Gary Marcus Used to Call AI Stupid, Now He Calls It Dangerous.” The implication was that those two things can’t coexist. But in fact, they do coexist. I still think gen AI is stupid, and certainly cannot be trusted or counted on. And yet it is dangerous. And some of the danger actually stems from its stupidity. So for example, it’s not well-grounded in the world, so it’s easy for a bad actor to manipulate it into saying all kinds of garbage. Now, there might be a future AI that might be dangerous for a different reason, because it’s so smart and wily that it outfoxes the humans. But that’s not the current state of affairs.
Marcus: Let’s clarify: I don’t think generative AI is going to disappear. For some purposes, it is a fine method. You want to build autocomplete, it is the best method ever invented. But there’s a financial bubble because people are valuing AI companies as if they’re going to solve artificial general intelligence. In my view, it’s not realistic. I don’t think we’re anywhere near AGI. So then you’re left with, “Okay, what can you do with generative AI?”
Last year, because Sam Altman was such a good salesman, everybody fantasized that we were about to have AGI and that you could use this tool in every aspect of every corporation. And a whole bunch of companies spent a bunch of money testing generative AI out on all kinds of different things. So they spent 2023 doing that. And then what you’ve seen in 2024 are reports where researchers go to the users of Microsoft’s Copilot—not the coding tool, but the more general AI tool—and they’re like, “Yeah, it doesn’t really work that well.” There’s been a lot of reviews like that this last year.
The reality is, right now, the gen AI companies are actually losing money. OpenAI had an operating loss of something like $5 billion last year. Maybe you can sell $2 billion worth of gen AI to people who are experimenting. But unless they adopt it on a permanent basis and pay you a lot more money, it’s not going to work. I started calling OpenAI the possible WeWork of AI after it was valued at $86 billion. The math just didn’t make sense to me.
What would it take to convince you that you’re wrong? What would be the head-spinning moment?
Marcus: Well, I’ve made a lot of different claims, and all of them could be wrong. On the technical side, if someone could get a pure large language model to not hallucinate and to reason reliably all the time, I would be wrong about that very core claim that I have made about how these things work. So that would be one way of refuting me. It hasn’t happened yet, but it’s at least logically possible.
On the financial side, I could easily be wrong. But the thing about bubbles is that they’re mostly a function of psychology. Do I think the market is rational? No. So even if the stuff doesn’t make money for the next five years, people could keep pouring money into it.
The place that I’d like to prove me wrong is the U.S. Senate. They could get their act together, right? I’m running around saying, “They’re not moving fast enough,” but I would love to be proven wrong on that. In the book, I have a list of the 12 biggest risks of generative AI. If the Senate passed something that actually addressed all 12, then my cynicism would have been mislaid. I would feel like I’d wasted a year writing the book, and I would be very, very happy.
The Ansys SimAI™ cloud-enabled generative artificial intelligence (AI) platform combines the predictive accuracy of Ansys simulation with the speed of generative AI. Because of the software’s versatile underlying neural networks, it can extend to many types of simulation, including structural applications.
This white paper shows how the SimAI cloud-based software applies to highly nonlinear, transient structural simulations, such as automobile crashes, and includes:
Vehicle kinematics and deformation
Forces acting upon the vehicle
How it interacts with its environment
How understanding the rapidly changing sequence of events helps predict outcomes
These simulations help engineers reduce the potential for occupant injuries and the severity of vehicle damage, and understand the crash’s overall dynamics. Ultimately, this leads to safer automotive design.