AI Showdown, Part 2: ChatGPT, Claude, Bing, And Bard Tackle Blogging – Forbes
In my last article, Part 1 of the AI Showdown series, I shared my surprise when speaking with a prospect who hadn’t heard of Claude or Bard, two of the major generative AI tools making news. That surprise led me to organize this AI showdown.
The first article introduced the contenders: ChatGPT, Claude, Bing Chat, and Bard. For this article, I tested each model by having it create content for a popular marketing use case—blogging.
Since I ran the experiment, Google has announced an update to Bard, which is supposed to have made it the “most capable model yet.” I reran the test with the new model, but nothing changed. Bard performed just as it had in the first run.
My original hypothesis stands: ChatGPT will outperform the other AI tools. Was I right? Let’s start with the content brief I prepared to put the chatbots on even ground.
What’s the first thing you do when you want a freelance content creator to write a blog post for you? You craft a content brief. That’s what I did for this experiment, too.
When creating a content brief for an AI tool, include all the information it needs to write the content you want, for the audience you want, in the voice and style you want.
If you’ve been following my Forbes.com articles on ChatGPT, you may notice that this article’s content brief has more elements because I’ve expanded what goes into my standard brief. In the following brief, I included the new elements and left out a few others that were irrelevant to the task.
Create content briefs for your AI chatbots just as you’d create them for human writers.
Let’s dive into the brief, starting with the first element: the AI’s role in the work to come.
You’re a world-class expert in content marketing. For the task to come, think deeply about how you’ll go about the work given the context in this brief.
Write a listicle blog post on 10 ways content teams can generate ideas for the content calendar. Also, to drive people to the blog post, write four social media posts—one each for LinkedIn, Twitter, Facebook, and Instagram—and one email.
To showcase my creativity, ideation power, and 20+ years of experience in content marketing. The objective of the blog post is to book an exploratory call. The objective of the email and social posts is to drive people to read the blog post.
[Note: The following content is included in ChatGPT’s custom instructions so the AI can access it for each conversation. The other AI tools do not have the same functionality, so I included the details in the content brief.]
My brand voice is expert, insightful, passionate, and empathetic. I position myself as an expert and share deep insights about the importance of the holistic content experience—how content looks, sounds, feels, and functions, all from the reader’s perspective. My love for great content shines through as I describe its effect on the reading experience. I empathize with readers because I understand the frustration of encountering poorly conceived, written, designed, and delivered content.
I use a conversational tone that makes my content approachable and easy to digest. I use reflective questions to engage readers in dialogue. I incorporate humor in my writing, using playful analogies to emphasize the transformative power of exceptional content experiences. I also encourage content marketers to build respectful and trusting relationships with readers.
My style is narrative, descriptive, and persuasive. I use storytelling elements to draw readers in, such as personal anecdotes and quotes that set the stage for the discussion. I vividly describe the outcomes of great content experiences, painting pictures of smooth reading and hearts filled with hope and trust. I persuasively argue for the value of great content experiences, showcasing their ability to smooth the reader’s learning and buying journeys.
The audience comprises CEOs, founders, marketing leaders, and content teams in B2B SaaS, technology, and consulting startups and established companies.
The content will be a blog post, an email message, and social media posts with informative and engaging narratives.
The key message is the connection between consistently producing excellent, reader-friendly content and increased leads and sales.
In my experiments with AI chatbots for content creation, I discovered that asking the AI tools if they had questions led to better content.
My second step was to feed the brief into each tool with the following instructions:
I’ll give you a content brief and then ask you to create some content. First, consume the brief and let me know if you have any questions.
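If you plan to run the same brief through several chatbots, it can help to store the brief once and render identical text for each tool. The following is a minimal sketch of that idea, not the process I used in this experiment. The field names, the `ContentBrief` class, and the `build_prompt` helper are all hypothetical, and the field values are shortened from the brief above.

```python
from dataclasses import dataclass, fields


@dataclass
class ContentBrief:
    """One field per brief element; the labels here are illustrative assumptions."""
    role: str
    task: str
    objective: str
    voice: str
    audience: str
    content_format: str
    key_message: str


# The questions-first instruction quoted in the article.
QUESTIONS_FIRST = (
    "I'll give you a content brief and then ask you to create some content. "
    "First, consume the brief and let me know if you have any questions."
)


def build_prompt(brief: ContentBrief) -> str:
    """Render the instruction plus every brief element as 'Label: value' lines."""
    lines = [QUESTIONS_FIRST, ""]
    for field in fields(brief):
        label = field.name.replace("_", " ").capitalize()
        lines.append(f"{label}: {getattr(brief, field.name)}")
    return "\n".join(lines)


brief = ContentBrief(
    role="You're a world-class expert in content marketing.",
    task="Write a listicle blog post on 10 ways content teams can generate ideas.",
    objective="Book an exploratory call.",
    voice="Expert, insightful, passionate, and empathetic.",
    audience="CEOs, founders, marketing leaders, and content teams.",
    content_format="A blog post, an email message, and social media posts.",
    key_message="Excellent, reader-friendly content leads to more leads and sales.",
)

prompt = build_prompt(brief)
print(prompt)
```

Pasting the same rendered prompt into each tool keeps the starting conditions as even as possible.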
Let’s see what they came up with.
ChatGPT asked a few questions, but none that would change how I create content.
Ask your AI chatbots to review your brief and, in turn, ask you clarifying questions.
Claude asked more profound, business-focused questions than ChatGPT.
I was surprised by Claude’s question about tone of voice because the content brief included three paragraphs about it.
Even so, the question made me pause: Do I want to write the piece in my brand voice as described in the brief? Or do I want to write it in a less conversational, more informational voice?
I was impressed that Claude understands the nuance of voice enough to cause me to stop and reflect, just as a good editor’s questions would.
Because ChatGPT and Claude asked questions, I was surprised that Bing Chat had no questions for me.
Not all AIs will ask you clarifying questions. In this case, Bing Chat did not.
Let’s see what Bard had to say.
In a race to be the biggest and best, AI chatbots continually add new functionality. Sometimes, the developers trumpet new goodness through press releases and platform notes; other times, the new functionality just appears. That’s what happened when I visited Bard.
In the following screenshot, notice how:
Except for the questions hiding in Bard’s second and third drafts, ChatGPT and Claude were the only two models to ask follow-up questions. Because the two asked different questions and because I wanted to keep all four models on the same page for this experiment, I fed my aggregate answers into each AI tool. For your reference, this is the text I submitted:
Here are the answers to your questions, plus a little more information that may be helpful.
After feeding in my answers to the clarifying questions, I told each AI to write the blog post. Here’s what they delivered.
ChatGPT’s blog post came in at 723 words, an appropriate length for a blog post designed to generate awareness.
Although Yoast, the SEO tool built into WordPress, says to write 300 words, minimum, for a regular blog post, I recommend at least 750 words to ensure enough content depth to interest readers. If I had asked the models to write cornerstone blog posts, I’d expect much longer content—at least 2,500 words.
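To make those length guidelines concrete, here is a minimal sketch that counts words and sorts a draft into a length band. The three thresholds come from the paragraph above; the function names and band labels are hypothetical, and splitting on whitespace is an assumption, so the count may differ slightly from what your word processor reports.

```python
YOAST_MINIMUM = 300          # Yoast's minimum for a regular post, as cited above
RECOMMENDED_MINIMUM = 750    # the minimum recommended in this article
CORNERSTONE_MINIMUM = 2500   # the length expected of a cornerstone post


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (a simplifying assumption)."""
    return len(text.split())


def classify_length(word_count: int) -> str:
    """Return a hypothetical label for the band a draft falls into."""
    if word_count < YOAST_MINIMUM:
        return "below the Yoast minimum"
    if word_count < RECOMMENDED_MINIMUM:
        return "above the Yoast minimum, below the recommended length"
    if word_count < CORNERSTONE_MINIMUM:
        return "regular post length"
    return "cornerstone length"


# ChatGPT's draft was 723 words.
print(classify_length(723))
```

By this measure, ChatGPT's 723-word draft clears Yoast's bar but lands just under the 750 words I recommend.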
I’ll present the blog post content first, and then I’ll evaluate it.
Images 1-3 of a blog post written by ChatGPT
How did the chatbots perform? Let’s analyze.
I’ll evaluate each chatbot’s post on the same elements.
The content ChatGPT provided is relevant and, as far as I can tell, accurate. If I were publishing this blog post, I’d double-check and link to each example. I’d also include screenshots and, if possible, quotes from one or more companies mentioned.
I’d also want to make the content more tactical and how-to, perhaps by writing a fuller, deeper blog post on each way and linking to those posts from this original. As it is now, the post “looks” weak, with just two or three sentences per heading.
I don’t see anything innovative in any of the 10 ways. They’re all old news, likely written about to death. What would make the content innovative in my eyes would be new, rich examples for each of the 10 ways. For instance, under “Collaborative content,” ChatGPT said, “Look at the collaborations between Adobe and various artists….” What collaborations? What artists? What should I “look at”?
I’m not a fan of the “we” voice for a personal brand. I generally reserve “we” writing for content coming from companies. The content also has a few strange elements, such as the overly conversational, out-of-flow “See that?” in the first paragraph.
I also see nothing special about ChatGPT’s narrative and descriptive abilities. It did incorporate two analogies—one about a treadmill and one about a well—in the introduction, but other than that, the narration falls flat.
That said, I’d never expect to publish a blog post directly from the mouth of ChatGPT or any other tool. The tools simply give you a starting point. After that, it’s up to you and your team to be humans in the loop, asking questions, researching, expounding, and adding the human elements that make blog posts exciting and worth reading.
Nothing of note here.
ChatGPT gets most grammar correct, although it referred to “Moz” as a “who” when it’s actually the name of a company.
It also failed to use a consistent form for each heading. For instance, to match the noun forms of the other headings, “Leveraging user-generated content” should be “User-generated content.”
ChatGPT’s draft is, as expected, average. It’s usable as a starting point, but I’d need to do much more back and forth or research before publishing it.
Claude’s blog post was lighter than ChatGPT’s, coming in at 588 words.
Here’s the post; my analysis follows.
Images 1-2 of a blog post written by Claude
Before my analysis, let’s talk about the elephant in the blog post—Claude’s use of a biblical framework, complete with a title using the 10 commandments and the words thou, thy, and thee in the content.
I have no idea why Claude would draw on such an analogy in our global world, where it makes more sense to rely on what unites than on what divides.
I didn’t want to proceed with the draft because I couldn’t bear to read the content in that state. Kudos to you if you were able to slog through it.
I asked Claude to try again using another framework or approach. The result follows.
A blog post written by Claude
With the biblical framework gone, I could read and analyze the post.
Claude’s post is relevant and possibly accurate but too short to be immediately useful. The post presents some neat ideas—like filtering topics through a journalistic lens—but just like ChatGPT’s post, Claude’s needs a lot more substance. The stat in the introduction is a nice touch, but I’d check its accuracy before publishing.
Whereas ChatGPT took the route of naming each “way” with a noun phrase, for instance, “Analytical insights,” Claude used more creative language: “Let analytics uncover resonance.” I’m not saying Claude’s language is better, just more creative.
The proposed ways for generating more content ideas are also more creative. For instance, Claude said to draw from everyday experiences for ideas. ChatGPT talked about a “feedback loop,” which could be similar.
I like Claude’s writing in this blog post more than ChatGPT’s. It feels richer and flows better. That said, I’m disappointed by the lack of depth. I also cringed at the word “skyrocket” in the last sentence.
Claude’s use of language is tighter and more elevated than ChatGPT’s. The first sentence describing each “way” is more creative than ChatGPT’s offerings. Claude also included a hypothetical example with each, helping readers understand how they might use the advice.
Nothing to note.
No issues noted.
Claude’s post is better than ChatGPT’s in terms of creativity and reader experience. However, the post would need much work to be useful to readers. I would not want to carry on building out the post with Claude’s draft.
Now, let’s see how Bing Chat performed.
Screenshot of Bing Chat
I was blown away when Bing’s content began populating my screen. “It’s a tome!” I thought.
Then Bing stopped writing after about 1,800 words, in the middle of the fourth point. “Can you continue writing the post?” I asked.
It did, this time getting through the start of item 10 before stopping again—but only briefly. Then, without prompting, Bing continued responding—but with the exact words it had already written.
I knew then that Bing was confused; the following screenshot shows where it happened. Notice how Bing had already started writing the tenth point but suddenly stopped and returned to the fourth.
In my experience, it’s difficult to unconfuse a confused model, so I didn’t ask Bing to finish item 10 or to write a conclusion.
In the end, even in its unfinished state, Bing’s blog post weighed in at 3,155 words.
Warning: Because the post is so long, there are 10 screenshots.
Images 1-10 of a blog post written by Bing Chat
Even though I knew Bing Chat’s reputation for being chatty, I was surprised by the length and depth of the post. Based on my experience using the chatbot since its release, I expected Bing to be one of the weaker competitors in the AI showdown. That proved not to be the case.
Just because a post is long doesn’t mean it’s relevant and accurate.
Bing Chat included many links in its draft, but most led to 404 “not found” errors, a significant and common problem with all four chatbots in this experiment.
Broken links are a major problem because you have to ask the chatbots for correct links (a fruitless task, I’ve discovered), research the included facts yourself, or come up with new sources and examples for which you can find links. Whichever route you take, it’s a time suck.
Accuracy issues aside, the post seems relevant because it offers sound tactics for filling the content calendar.
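Before you start that manual research, you can at least find out which links are dead. The following is a minimal sketch, assuming the draft is available as plain text with full URLs in it. The helper names are hypothetical, the URL pattern is deliberately simple, and some sites reject the HEAD requests used here, so treat a flagged link as a prompt to check by hand rather than as proof that it is broken.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, Optional

# Simple pattern: an http(s) URL up to whitespace, a quote, or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def extract_urls(draft: str) -> list[str]:
    """Pull URLs out of the draft and trim trailing sentence punctuation."""
    return [url.rstrip(".,;:!?") for url in URL_PATTERN.findall(draft)]


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for a HEAD request, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example, 404 "not found"
    except (urllib.error.URLError, TimeoutError, ValueError):
        return None  # DNS failure, timeout, or malformed URL


def find_broken_links(
    draft: str,
    get_status: Callable[[str], Optional[int]] = fetch_status,
) -> list[str]:
    """List URLs whose status is missing or is a 4xx/5xx error."""
    broken = []
    for url in extract_urls(draft):
        status = get_status(url)
        if status is None or status >= 400:
            broken.append(url)
    return broken
```

Calling `find_broken_links(draft_text)` on a chatbot's draft returns the URLs that need replacing. The `get_status` parameter exists so you can swap in a stub while testing, without touching the network.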
Although Bing’s post is entirely formulaic instead of innovative, formulaic isn’t always bad. For each numbered “way,” Bing presents the tactic, how it helps with idea generation, an example, and tools to help with the idea.
I don’t see anything innovative or new in the post, though.
Bing presents the best reader experience. The content is long, suggesting depth. It’s well structured, with many bullet points to break up the text. Bing also suggests images and examples for the content, elements that positively affect the reader’s experience.
Bing’s narrative and descriptive abilities exceeded ChatGPT’s and Claude’s for this experiment. Each example Bing presented relates to content marketers, showing how they might use the idea in practice.
I didn’t see any humor in Bing’s post, but the potential for engagement is high based on other factors I mentioned.
Bing gets good grades for technical writing ability; although I didn’t scour the post word by word, I didn’t notice any glaring errors.
Bing delivered the most thorough, detailed draft, making it the winner thus far. Despite the broken links—a major letdown—I’d still want to work with this draft as a starting point.
Next up—Bard’s draft.
Based on my earlier experiences with Bard, I expected the AI to bomb in this experiment. It didn’t bomb, but it didn’t blow me away.
Take a look.
Images 1-2 of a blog post written by Bard
As a fellow content creator, I bet you’d agree with me when I say Bing Chat’s 3,000+ word draft makes any draft with fewer words seem poor by comparison. Bard’s post was much poorer, coming in at 549 words.
The content is relevant because it targets content creators and meets the brief. It’s accurate, too, but only because it lacks the content to be inaccurate.
I saw neither creativity nor innovation in the content of this post. It’s just a typical “meh” blog post, like thousands—likely millions—of others online.
Bard gave a good experience, speaking directly to readers and asking many questions. The part about the holistic content experience is tacked on at the end, though. And the quote at the end felt out of place, which left me out of sorts.
Bard excelled at narrative and connection in the introduction by opening with a quote. “I’m so tired of writing blog posts!” it wrote, followed by, “I hear content marketers say this all the time.” Of all the introductions, this one grabbed my attention the most.
The opening quote and the questions throughout engage readers. I didn’t see humor, though. The supposedly funny quote at the end of the post seemed misplaced.
Bard can write correctly but needs to learn finesse and flow. Almost every “way” includes a sentence that begins with, “This is a great way to….”
Bard’s post cries out for added substance, meaning more work querying the AI or more research on my part. I wouldn’t want to use this draft as a starting point.
Whew! It’s been a lot of work to get to this point of the experiment. So far, based solely on the blog posts, here are the winners and losers:
In the third and final article in this AI showdown series, we’ll see how ChatGPT, Claude, Bing Chat, and Bard performed when creating social media content and an email to drive readers to the blog post.
Stay tuned!