OpenAI’s ChatGPT is a wonderful tool, albeit flawed in several respects. For now, the sensible approach is to make use of the Large Language Model’s (LLM) capabilities while keeping its limitations in view.
Recently, a paper made waves by claiming that GPT-4 can score 100 percent on MIT’s Mathematics and EECS curriculum. What followed, however, was a sordid tale of unethical data sourcing and repeated prompting to obtain the desired outcome. Let’s delve deeper.
On June 15th, Professor Iddo Drori posted a paper on arXiv titled “Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models.” The paper drew on a “comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree.” Its headline conclusion was striking:
“Our results demonstrate that GPT-3.5 successfully solves a third of the entire MIT curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set excluding questions based on images.”
On the strength of these astonishing claims, the paper went viral on social media, garnering over 500 retweets in a single day.
Raunak Chowdhuri and his colleagues then examined the paper’s claims. Contrary to its assertions, they found glaring problems in the methodology.
On June 24th, Armando Solar-Lezama (Professor in EECS and COO/Associate Director of CSAIL, MIT), Tonio Buonassisi (Professor of Mechanical Engineering, MIT), and Yoon Kim (Assistant Professor in EECS and CSAIL, MIT) released a public statement regarding the paper. In it, the professors disclosed that the paper sourced the MIT dataset without authorization:
“On June 15th, Iddo Drori posted on arXiv a working paper associated with a dataset of exams and assignments from dozens of MIT courses. He did so without the consent of many of his co-authors and despite having been told of problems that should be corrected before publication.”
The statement concludes with a pointed one-liner:
“And no, GPT-4 cannot get an MIT degree.”
Do you think that ChatGPT’s potential is being damaged by unethical papers? Let us know your thoughts in the comments section below.
