ChatGPT comes up short when compared to Stack Overflow … – CIO Dive


When presented with 517 user-written software engineering questions, ChatGPT answers fewer than half correctly, according to researchers from Purdue University.
Stack Overflow was one of the first organizations to restrict the use of ChatGPT, but unlike other enterprises that cited data privacy concerns, Stack Overflow was more concerned with accuracy.
Less than a week after the generative AI chatbot launched, the company banned developers and engineers from generating answers using the tool as it feared incorrect answers would lower the credibility of the site.
But as enthusiasm over ChatGPT’s effect on coding and broader IT operations spread, there were concerns that Stack Overflow would lose users to the faster alternative.
One-third of developers believe a productivity boost is the greatest upside to enhancing the software creation process with AI, according to a Stack Overflow survey of nearly 90,000 engineers in June.
The company changed its tune on generative AI in April, when it announced in a blog post that it would begin incorporating the technology into its public platform and paid service. But users of the site were still concerned about the validity of AI-generated answers, information overload and data privacy as it relates to individual contributors on the platform.
“We aren’t surprised by the research paper’s findings that AI tools can be inaccurate,” Ellen Brandenberger, director of product innovation at Stack Overflow, said via email in reference to the work by Purdue researchers. “For the last several months, our team has been outlining our vision for community and AI coming together as the inevitable next phase of growth in generative AI’s trajectory.”
Last month, the company launched OverflowAI, which serves as a platform for users to check, validate, attribute and confirm accuracy and trustworthiness across its more than 58 million questions and answers.
Researchers from the University of California, Berkeley, found that in some cases the behavior of OpenAI’s large language models is getting significantly worse over time. When presented with 50 code-generation problems from LeetCode’s easy category, the share of GPT-4-generated code that was executable dropped from 52% in March to 10% in June. For GPT-3.5, the share fell from 22% to 2%.
Stack Overflow has experienced a small decline in traffic this year, dipping an average of 5% compared to 2022, according to a company blog post last week.
“The future of the internet and the modern tech landscape isn’t going to be measured by web traffic alone — it’s about the quality of content, trust in the content, and the communities of experts and human beings curating the content,” the company said.
The company expects to continue to see traffic fluctuate from historical norms as first-time coders leverage generative AI tools more often and as the technology spurs new questions that will bring users to the platform.
OpenAI did not immediately respond to a request for comment.

Jesse

https://playwithchatgtp.com