Google Teaches ChatGPT How to Solve Math Problems – Analytics India Magazine

OpenAI’s chatbot ChatGPT excels at a wide range of tasks, such as script writing, explaining complex topics, and debugging and explaining code, but performs poorly when it comes to maths.
Researchers from Stanford University and the University of California, Berkeley, recently published a paper stating that large language models (LLMs) can perform simple maths operations when the numbers are small but struggle with large numbers, suggesting that LLMs have not learned the underlying rules needed to perform these arithmetic operations. It further noted that, even with GPT-4’s improvements on the MATH dataset, errors largely occur due to arithmetic and calculation mistakes.
OpenAI’s rival Google has acknowledged this weakness in LLMs and stepped in to teach models like ChatGPT to reason better algorithmically. The work by Google researchers, titled ‘Teaching language models to reason algorithmically’, takes an in-context learning approach aimed at making models better at algorithmic reasoning.
In-context learning is akin to teaching a new skill by guiding the learner through the process step by step instead of overwhelming them with all the instructions upfront. The term refers to a model’s ability to perform a task after seeing a few examples of it in its prompt, without any retraining.
They also presented a prompting technique that lets general-purpose language models generalise strongly to maths problems more difficult than the ones in the prompt. The technique builds upon other rationale-augmented approaches (e.g., scratchpad and chain-of-thought). Lastly, they demonstrated that a model can ‘reliably execute algorithms on out-of-distribution examples with the right prompts’.
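To make the idea of a rationale-augmented prompt concrete, here is a minimal sketch of how worked addition examples, with every carry spelled out, could be assembled into a few-shot prompt. The step format and the helper names `addition_trace` and `build_prompt` are illustrative assumptions and are not taken from the Google paper; the sketch also assumes non-negative integers.

```python
def addition_trace(a: int, b: int) -> str:
    """Spell out schoolbook addition of two non-negative integers, one digit per step."""
    xs, ys = str(a)[::-1], str(b)[::-1]  # reversed so index 0 is the units digit
    carry = 0
    digits = []
    lines = [f"Problem: {a} + {b}"]
    for i in range(max(len(xs), len(ys))):
        x = int(xs[i]) if i < len(xs) else 0
        y = int(ys[i]) if i < len(ys) else 0
        total = x + y + carry
        lines.append(
            f"Step {i + 1}: {x} + {y} + carry {carry} = {total}; "
            f"write {total % 10}, carry {total // 10}"
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    lines.append("Answer: " + "".join(reversed(digits)))
    return "\n".join(lines)


def build_prompt(examples, question):
    """Join fully worked examples, then pose a new problem for the model to continue.

    The layout is a hypothetical illustration, not the prompt used in the paper.
    """
    blocks = [addition_trace(a, b) for a, b in examples]
    a, b = question
    blocks.append(f"Problem: {a} + {b}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Short worked examples followed by a longer, harder question.
    print(build_prompt([(58, 67), (904, 198)], (123456789, 987654321)))
```

The point of such a prompt is that the worked examples show the procedure rather than just the answers, which is what would let a model apply the same steps to numbers longer than any it saw in the examples.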
Meanwhile, ChatGPT has become worse at certain basic maths operations even as it gets better at other things. The same study found that the high-profile chatbot performed worse in June than it had in March.
Researchers said the deterioration is due to an AI phenomenon known as drift, where attempts to improve one part of complex models make other parts of the models worse.
To track performance, James Zou, a Stanford professor affiliated with the school’s AI lab, and his colleagues Matei Zaharia and Lingjiao Chen fed ChatGPT 1,000 different numbers. In March, the paid GPT-4 version correctly identified whether a number was prime for 84% of them. By June, the success rate had dropped to 51%.
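A success rate of this kind can be computed by comparing each of the chatbot’s yes/no answers with a deterministic primality check. The sketch below assumes the answers have already been collected into a mapping from number to the model’s verdict; the function names and the sample answers are hypothetical placeholders, not the researchers’ code or data.

```python
from typing import Mapping


def is_prime(n: int) -> bool:
    """Deterministic trial division; fast enough for numbers of modest size."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def accuracy(model_answers: Mapping[int, bool]) -> float:
    """Fraction of numbers for which the model's verdict matches the ground truth."""
    if not model_answers:
        raise ValueError("no answers to score")
    correct = sum(1 for n, said in model_answers.items() if said == is_prime(n))
    return correct / len(model_answers)


if __name__ == "__main__":
    # Made-up verdicts for illustration only; 9 is wrongly called prime.
    sample = {7: True, 9: True, 11: True, 15: False}
    print(f"{accuracy(sample):.0%}")
```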
Apart from getting the answers wrong, ChatGPT also got a thumbs down for its attempt to show the researchers how it arrived at certain conclusions. As part of the research, the researchers additionally asked the chatbot to lay out its “chain of thought”, the term for when a chatbot explains its reasoning. In March, it did so, but by June it stopped showing its step-by-step reasoning.
The recent Google study tries to tackle this issue with its in-context learning approach. Its findings suggest that exploring longer contexts and prompting for more informative explanations could be valuable directions for research.
A pioneer in fusing technology with maths education, Wolfram Research has been working with ChatGPT’s parent, OpenAI, to bring better maths capabilities to AI models. “We have seen some interesting results with our LLM. I tried to run a British ‘A’ level maths, an exam students take before University, and ChatGPT alone got 43% which is quite impressive, but Wolfram plus ChatGPT got 96%,” Conrad Wolfram, cofounder of the company, revealed in an interview with AIM.
“Game over for humans on that one,” he quipped.
Notably, when the same maths teaser (what is the smallest integer greater than 95,555 that contains four identical digits?) was put to ChatGPT versions 3.5 and 4 and to the Wolfram plugin, only the last got it right on the first attempt.
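The teaser is small enough to check by brute force. The sketch below assumes that ‘four identical numbers’ means the same digit appearing at least four times in the integer; the function names are illustrative.

```python
from collections import Counter


def has_repeated_digit(n: int, times: int = 4) -> bool:
    """True if some digit occurs at least `times` times in n."""
    return max(Counter(str(n)).values()) >= times


def smallest_above(start: int, times: int = 4) -> int:
    """Smallest integer strictly greater than `start` with a digit repeated `times` times."""
    n = start + 1
    while not has_repeated_digit(n, times):
        n += 1
    return n


if __name__ == "__main__":
    print(smallest_above(95_555))  # prints 95999
```

Under that reading the answer is 95,999: any five-digit number starting with 95 needs its last three digits to be all 5s or all 9s, and 95,555 itself is excluded.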
The Wolfram + ChatGPT plugin not only solves maths problems step by step but can also present the solutions visually if specifically prompted to do so. Depending on the prompt, it can go a step further and represent the data in different forms, such as graphs, charts and histograms.
The plugin can turn natural-language queries into beautiful mathematical equations. It can do so because it combines ChatGPT’s human-mimicking technology with Wolfram’s strong foundation in a symbolic programming language that focuses on expressing ideas in computational form.
On the one hand, Wolfram is making strides with its plugin; on the other, researchers show models’ performance worsening. In the current landscape, Google’s latest in-context learning approach can help AI chatbots become above-average students.
© Analytics India Magazine Pvt Ltd & AIM Media House LLC 2023