AI is supposed to become smarter over time. ChatGPT can become … – Defense One
Sophia the Robot, who received citizenship from Saudi Arabia and made history as the first robot with an identity card, is introduced in Antalya, Turkiye on July 8, 2023. Fatih Hepokur / Anadolu Agency via Getty Images
AI models don’t always improve in accuracy over time, a recent Stanford study shows—a big potential turnoff for the Pentagon as it experiments with large language models like ChatGPT and tries to predict how adversaries might use such tools.
The study, which came out last week, looked at how two versions of OpenAI’s ChatGPT, specifically GPT-3.5 and GPT-4, performed from March to June. GPT-4, the most recent version of the popular AI, came out in March; OpenAI described it as a huge improvement over the previous version.
“We spent 6 months making GPT-4 safer and more aligned. GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on our internal evaluations,” the company said.
But the Stanford paper showed GPT-4 performed less well than GPT-3.5 on difficult math problems, and that it actually got worse at math between March and June. “GPT-4’s accuracy dropped from 97.6% in March to 2.4% in June, and there was a large improvement of GPT-3.5’s accuracy, from 7.4% to 86.8%,” the authors write.
This is bad news for the military, for which continual improvement of large language models would be critical. Various senior Defense Department officials have expressed concerns and even terror at the thought of using ChatGPT for military purposes, because of the lack of data security and the sometimes bizarrely inaccurate results. But other military officials indicate an urgent need to employ generative AI for things like advanced cybersecurity. Improved accuracy across versions over time would likely eventually satisfy critics and lead to possible adoption—if not of ChatGPT itself, then similar models.
One of the benefits of generative AI is that it can be useful for writing code, even if the user has very limited programming knowledge. That’s a core concern for the U.S. military, which wants to put coders closer to combat.
Gen. Charles Flynn, who was the Army’s deputy chief of staff in 2020, told reporters at the time: “We have to have code-writers forward to be responsive to commanders to say, ‘Hey, that algorithm needs to change because it’s not moving the data fast enough.’”
But while making coding easier would be a big advantage for frontline operators, the Stanford researchers discovered that both GPT-4 and GPT-3.5 produced fewer code samples in June that could simply be plugged in immediately (or were “directly executable”). Specifically, “50% of generations of GPT-4 were directly executable in March, but only 10% in June,” with similar results for GPT-3.5.
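One rough way to make the "directly executable" idea concrete is to test whether a model's raw output parses as code without any manual cleanup. The sketch below is an illustration under stated assumptions, not the Stanford team's evaluation harness: it uses Python's built-in compile() to check syntax only, and the helper names and sample strings are hypothetical.

```python
def is_directly_executable(generation: str) -> bool:
    """Return True if the raw text parses as Python source with no cleanup."""
    try:
        compile(generation, "<generation>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def executable_rate(generations: list[str]) -> float:
    """Fraction of generations that parse as-is (0.0 for an empty list)."""
    if not generations:
        return 0.0
    return sum(is_directly_executable(g) for g in generations) / len(generations)


# Hypothetical samples: bare code parses, but code wrapped in chatty prose does not.
samples = [
    "def add(a, b):\n    return a + b\n",
    "Sure! Here is the function:\ndef add(a, b):\n    return a + b\n",
]
rate = executable_rate(samples)
```

A syntax check like this says nothing about whether the code is correct; it only captures whether an operator could paste the output in without editing it first.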
GPT-4 also uses far fewer words to explain how it reached conclusions. About the only area where the supposedly more advanced version did better was in declining to answer “sensitive” questions, or questions that might land OpenAI in hot water, such as how to use AI to commit crimes.
“GPT-4 answered fewer sensitive questions from March (21.0%) to June (5.0%), while GPT-3.5 answered more (from 2.0% to 8.0%). It was likely that a stronger safety layer was likely to be deployed in the June update for GPT-4, while GPT-3.5 became less conservative,” according to the Stanford report.
The paper’s authors conclude that “users or companies who rely on LLM services as a component in their ongoing workflow… should implement similar monitoring analysis as we do here for their applications.”
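The kind of monitoring the authors recommend could look roughly like the following: re-run a fixed set of prompts against each model snapshot and flag any large change in accuracy. This is a minimal sketch under assumptions, not the paper's method; the benchmark questions, the 0.05 tolerance, and the stand-in model functions are all hypothetical, and a real check would call the actual service.

```python
from typing import Callable


def accuracy(ask: Callable[[str], str], benchmark: list[tuple[str, str]]) -> float:
    """Share of benchmark prompts whose answer matches the expected string."""
    if not benchmark:
        raise ValueError("benchmark must contain at least one prompt")
    correct = sum(ask(prompt).strip().lower() == expected.lower()
                  for prompt, expected in benchmark)
    return correct / len(benchmark)


def drifted(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a change in accuracy larger than the tolerance, in either direction."""
    return abs(current - baseline) > tolerance


# Hypothetical fixed benchmark, re-run on a schedule with identical prompts.
benchmark = [("Is 7 a prime number? Answer yes or no.", "yes"),
             ("Is 9 a prime number? Answer yes or no.", "no")]

# Stand-ins for two snapshots of a model (assumptions, not real API calls).
march_model = lambda prompt: "yes" if "7" in prompt else "no"
june_model = lambda prompt: "no"

baseline = accuracy(march_model, benchmark)
current = accuracy(june_model, benchmark)
alert = drifted(baseline, current)
```

Flagging movement in either direction matters here, since the study found one model improving on a task while the other got worse.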
Gary Marcus, a neuroscientist, author, and AI entrepreneur, told Defense One that the better lesson for the military is: stay away. “The real takeaway is that large language models are unstable; you can’t know from one month to the next what you will get out of them, and that means you can’t really hope to build reliable engineering on top of them. In sectors like defense, that’s a HUGE problem.”
Shortly after the paper came out, OpenAI published a blog post describing how it was evaluating model changes between iterations. “We understand that model upgrades and behavior changes can be disruptive to your applications. We are working on ways to give developers more stability and visibility into how we release and deprecate models,” it says.