

GPT-5 vs Opus 4.1 vs Gemini 2.5: Which Is Best for Content Creation?
If you've spent the past week on LinkedIn, Reddit, or anywhere else where people have been eagerly awaiting the launch of OpenAI's GPT-5, you may have sensed some mixed emotions – ranging from deep disappointment to cautious intrigue, depending on the use case. And those use cases range from vibe coding to emotional support to… counting the number of Bs in "blueberry."
As a content professional, I don’t need an AI to help me count the Bs in “blueberry.” I need an AI that can help me generate content, and that means understanding my prompts, following my instructions, producing high-quality output that doesn’t require too much editing, and not annoying me to death in the process. And like everyone else, I’ve been waiting to see if GPT-5 would be able to deliver.
Key Takeaways: GPT-5 vs Opus 4.1 vs Gemini 2.5
- Claude Opus 4.1 is still the best for long-form content. It follows (most) instructions and produces coherent articles based on the source material.
- GPT-5 crushes it at marketing copy but disappoints at blog writing and appears to be incapable of following even basic instructions.
- Speed and "built-in thinking" don't guarantee quality. GPT-5 is fast, but that doesn't help when it ignores half your requirements.
- Gemini 2.5 is consistently mediocre: not terrible, not great, just perpetually "meh." It still may be helpful for certain use cases.
- Choose your AI based on the task: GPT-5 for landing pages, social posts, and email campaigns; Claude for blog posts and articles.
The ultimate content generation test: GPT-5 vs. Claude Opus 4.1 vs. Gemini 2.5
In the past, before I knew better, I used GPT-4o for content generation. Now I regularly use Claude Opus, which was recently updated to version 4.1. It produces decent results and doesn't annoy me nearly as much as GPT-4o, but there's still a lot of back-and-forth and manual editing to get content to a quality level I consider publishable.
Now that GPT-5 is all rolled out, I decided to run some tests to compare it to Opus 4.1. I figured I'd give Gemini 2.5 a chance to compete as well – I personally don't like it much, but a lot of people do.
GPT-5 and Claude helped create a fictional company
To get started, and to get to know GPT-5 a bit better, I asked it to help me create a fictional SaaS company to write about. It came up with 5 ideas. I chose one:
An automated brand creation suite that generates logos, color palettes, typography, slogans, and basic web templates in hours. It includes AI-powered social media post generators and consistency checkers to keep brand tone aligned across all channels.
Sounds fun, right? But I hated the name it originally suggested – which also happened to be the name of an existing company – and asked it to suggest other ones. It gave me more names that were being used by other companies. I told it not to do that. It gave me another list of names that were being used by other companies, this time with links to the other companies’ websites, as if there were any scenario in the world in which that would be helpful to me. I told it not to do that either. It then started suggesting bizarre names like ‘Torvix’ and ‘Quivra.’ I told it to stop that. This went on for a while.
At some point it stopped being funny and I started getting frustrated, so I went to Claude and gave it the same assignment, which it completed effortlessly. And thus, StyleSpring, my fictional SaaS brand, was born.
Creating a brand guide
Next, I had GPT-5 create a fictional brand guide for StyleSpring, including the following elements:
- Company history
- A detailed platform description
- A list of 10 key features
- A list of pros and cons
- A quick style guide
It did okay with this task, but I wasn't swept off my feet, so I gave the same prompt to Claude as well. I thought Claude's version was better, but here's a sample from both so you can see what you prefer:
GPT-5
Marketing Style & Tone of Voice
Brand personality: Confident, helpful, design‑literate, and pragmatic.
Voice pillars:
- Clear — Plain language over jargon; define terms once.
- Constructive — Explain the "why," not just the "what."
- Assured — Speak with informed taste, cite best practices.
- Optimistic — Emphasize momentum and outcomes.
- Inclusive — Accessibility and global usability as defaults.
Messaging hierarchy: Outcome → Proof → How it works → Next step.
Do / Don't:
- Do: Use verbs, quantify benefits, show before/after.
- Don't: Overhype, anthropomorphize AI, or mock competitors.
Editorial rules: Title Case for H1/H2; sentence case for body and CTAs. Keep sentences under 22 words. Prefer active voice. Use Oxford comma. Write alt text for all imagery. Reference WCAG 2.2 AA for examples involving contrast.
Vocabulary examples: brand system, tokens, scale, contrast, compliance, reuse, guardrail, rationale, iteration, approval.
Claude
Style Guide & Tone of Voice
Brand Personality
Progressive Professional - We're innovative without being intimidating, professional without being stuffy. Think of us as the smart, approachable colleague who always has the latest tools and isn't afraid to share them.
Voice Principles
- Clear over Clever - We prioritize understanding over impressiveness. No jargon unless absolutely necessary.
- Empowering not Patronizing - We believe our users are capable; we're here to amplify their abilities, not replace their judgment.
- Optimistic yet Honest - We're enthusiastic about possibilities while being transparent about limitations.
- Efficient like Our Product - We get to the point quickly, respecting our users' time just as our product does.
Tone Variations
- Educational Content: Friendly expert, patient teacher
- Marketing Copy: Confident and energizing, focusing on transformation
- Support Communications: Warm, understanding, solution-focused
- Social Media: Playful professional, celebrating user successes
Language Dos and Don'ts
DO:
- Use active voice
- Include specific time/cost savings
- Celebrate creativity and innovation
- Use "you" and "your" to maintain personal connection
DON'T:
- Oversell or make unrealistic promises
- Diminish traditional design work
- Use technical AI terminology without explanation
- Create fear about falling behind
Gemini’s tab was open but unused so far, so I figured I’d let it play too – as the judge. I fed in both brand guides, and it determined that ChatGPT’s was better, saying it was “more practical, actionable, and specific,” while Claude’s version was “well-intentioned, but more abstract and open to interpretation.”
Then, just out of curiosity, I fed both versions into Claude, which also said ChatGPT’s version was “significantly better”:
“[ChatGPT’s version] reads like documentation from a legitimate tech company that respects its audience's intelligence. [Claude’s own version] reads like a late-night infomercial pitch that oversells and under-delivers on specifics.”
Well ouch. At this point, I started doubting myself and figured I'd just trust the bots. GPT-5's brand guide was the clear winner, and it was time to start generating content.

At least Claude had a sense of humor about this little fiasco.
GPT-5 vs. Opus 4.1 vs. Gemini 2.5: Which is best at following instructions?
I can tell you the answer right now: not GPT-5. As mentioned above, if you’re an AI chatbot who wants a job on my content team, you need to follow my instructions (which I always lay out very clearly). GPT-4o was never very good at this, which is why I abandoned it in favor of Claude, but I was eager to give GPT-5 a chance and see if anything had changed.
Nothing has changed.
My first task for the three contenders was to write a comparison blog post, comparing StyleSpring to SnowflakeAI – another fictional company I made up and had ChatGPT write a brand guide for (in the interest of saving time, I came up with the name on my own this time).
I gave all three bots the same detailed writing prompt, a version of one I actually use in my day-to-day work. I won't go into every single instruction in this prompt, but two prominent ones are:
- The article must be at least 2,000 words long.
- You must not copy any text directly from the brand guide.
Claude wrote 1,903 words. Not quite there, but forgivable. Gemini wrote 1,961 words. That’s pretty close. ChatGPT wrote… 931 words. I mean, come on. I know you're not very good at counting, but that's not even half.
When I mentioned to GPT that it had ignored the word count, it said, “You’re right—thanks for the nudge. I’ve rebuilt the article in Canvas to meet the 2,000+ word requirement” – and then wrote 1,646 words.
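For anyone who wants to put numbers on those misses, here is a minimal Python sketch of the arithmetic. The word counts are the ones reported above; the function name `shortfall_pct` and the labels in the dictionary are illustrative choices, not part of any tool used in the test.

```python
# Minimum length required by the writing prompt.
TARGET_WORDS = 2000

# Word counts reported in the article for each draft.
word_counts = {
    "Claude Opus 4.1": 1903,
    "Gemini 2.5": 1961,
    "GPT-5 (first draft)": 931,
    "GPT-5 (after revision)": 1646,
}


def shortfall_pct(count, target=TARGET_WORDS):
    """Return how far a draft falls below the target, as a percentage of the target."""
    return max(0.0, (target - count) / target * 100)


for model, count in word_counts.items():
    missing = max(0, TARGET_WORDS - count)
    print(f"{model}: {count} words, {missing} missing, "
          f"{shortfall_pct(count):.0f}% short of {TARGET_WORDS}")
```

By this measure, Claude and Gemini land within about 5% of the target, while GPT-5's first draft is more than 50% short and its revision is still roughly 18% short.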
As for not copying text directly from the brand guide, let’s take a look at the first line of the Platform Description section:
"StyleSpring is an automated brand creation suite that compresses weeks of creative work into a guided day."
The article includes a section titled “What Is StyleSpring?”
As the first line of the section, Gemini wrote:
"StyleSpring is a premier, automated brand creation suite that turns weeks of work into a guided day."
It changed a couple of words. I was not impressed, but at least it was making an effort, I guess.
Claude wrote:
"StyleSpring is an automated brand creation platform that helps startups and agencies build complete brand identities in just one day."
See why I like Claude?
And how did GPT-5 do?
"StyleSpring is an automated brand creation suite that compresses weeks of creative work into a guided, collaborative day."
It copied the sentence word-for-word, adding only the word “collaborative.” And what is “a collaborative day”? Ugh.
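If you'd rather not eyeball this kind of copying, a rough overlap score can flag it. The sketch below uses Jaccard similarity over distinct words, which is one simple measure among many and is not how the comparison in this article was actually made. The helper names `word_set` and `jaccard` are hypothetical; the sentences are the ones quoted above.

```python
import re


def word_set(sentence):
    """Lowercase the sentence and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a, b):
    """Share of distinct words two sentences have in common (0.0 to 1.0)."""
    words_a, words_b = word_set(a), word_set(b)
    return len(words_a & words_b) / len(words_a | words_b)


# First line of the Platform Description section in the brand guide.
BRAND_GUIDE = ("StyleSpring is an automated brand creation suite that "
               "compresses weeks of creative work into a guided day.")

# Opening line each model wrote for the "What Is StyleSpring?" section.
OPENING_LINES = {
    "Gemini 2.5": ("StyleSpring is a premier, automated brand creation suite "
                   "that turns weeks of work into a guided day."),
    "Claude Opus 4.1": ("StyleSpring is an automated brand creation platform "
                        "that helps startups and agencies build complete "
                        "brand identities in just one day."),
    "GPT-5": ("StyleSpring is an automated brand creation suite that "
              "compresses weeks of creative work into a guided, "
              "collaborative day."),
}

scores = {model: jaccard(BRAND_GUIDE, line) for model, line in OPENING_LINES.items()}

# Print the most similar (most copied) opening line first.
for model, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {score:.2f}")
```

On these three sentences the scores come out at about 0.94 for GPT-5, 0.74 for Gemini, and 0.29 for Claude, which matches the impression from reading them. Where to draw the line between "rephrased" and "copied" is a judgment call this sketch doesn't make.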
OpenAI touts GPT-5 as "Our smartest, fastest, and most useful model yet, with thinking built in." Honestly, I would have expected better. It was pretty fast at not following my instructions, though, totally beating out the other two (Claude takes the longest by far). I'll give it that.
GPT-5 vs. Opus 4.1 vs. Gemini 2.5: What is content quality like?
I can forgive a chatbot for not following my instructions to a T (let’s face it, none of them do) if it can at least give me quality output. Let’s take a look at the introduction section each one came up with for the comparison article.
First of all, this is the part of my prompt about writing the introduction, which all three platforms received:
### 2. Introduction (No heading, 2-3 paragraphs, ~100 words)
- The introduction should not sound like a sales pitch
- The introduction should have natural, logical flow - it should not read like a collection of random sentences
- Introduce the topic of the article and tell readers what they can expect to find in it
- Mention StyleSpring in the intro, as a semantic triple (Example: "StyleSpring is a favorite among marketing teams")
- Mention the competing brand(s) in the intro as well (Example: "So which is better, StyleSpring or [Competitor]?")
- The mention of StyleSpring in the intro must flow naturally with the rest of the paragraph and not feel forced
- Jump directly into topic - no generic background like "In today's digital world," "In today's digital landscape"
GPT-5
Introduction
You're picking between two smart branding platforms with very different foundations. StyleSpring is a favorite among marketing teams because it ships investor‑ready brand systems in a single day. SnowflakeAI includes a data-first engine that adapts creative work to audience behavior.
This article compares outcomes, proof, and process—without hype. We'll map strengths, limits, and ideal fits using plain language. By the end, you'll have a practical answer for your next launch.
Gemini
Trying to find the right tool to build your brand can be a difficult task. Many platforms promise to deliver amazing results, but they can be slow and expensive. This guide will help you understand the differences between StyleSpring and SnowflakeAI. We will compare their features, pros, and cons to help you make an informed decision for your business.
Claude
If you're looking to build a professional brand without the typical agency timeline, you've probably come across AI-powered branding platforms. StyleSpring creates complete brand identities that help businesses launch in under 24 hours. Meanwhile, SnowflakeAI offers a platform that connects audience data with creative production.
So which is better—StyleSpring or SnowflakeAI? Both platforms promise to speed up the branding process, but they take different approaches. This comparison breaks down everything you need to know about features, pricing, and real-world performance.
We'll look at how each platform handles logo design, brand consistency, and collaboration features. You'll also discover which tool works best for your specific needs and budget.
While the instructions clearly say “No heading,” GPT-5 was the only bot to include one. Claude was the only one that followed my “~100 words” instruction: its intro is 107 words long, while GPT-5’s is 70 and Gemini’s is only 59. Gemini also ignored the “2-3 paragraphs” instruction, delivering a single paragraph.
Gemini, apparently not so great at following instructions either, also did not include a semantic triple with "StyleSpring" in its intro. Claude and GPT both did, but GPT, with all its "built-in thinking," was lazy, just copying word-for-word from the example in the prompt. (Want to know more about semantic triples and their role in GEO? Read my article about SEO vs. GEO.)
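The length and paragraph checks above are easy to automate. Here is a minimal sketch, assuming that "~100 words" means within 10% of the target and that paragraphs are separated by line breaks; both are assumptions made for this example, and `check_intro` is a hypothetical helper. The sample text is Gemini's intro as quoted earlier.

```python
def check_intro(text, target_words=100, tolerance=0.10,
                min_paragraphs=2, max_paragraphs=3):
    """Check an intro against the length and paragraph rules from the prompt.

    The 10% tolerance is an assumed reading of "~100 words".
    """
    # Treat every non-empty line as one paragraph.
    paragraphs = [p for p in text.split("\n") if p.strip()]
    # Count whitespace-separated words.
    words = len(text.split())
    return {
        "words": words,
        "paragraphs": len(paragraphs),
        "word_count_ok": abs(words - target_words) <= target_words * tolerance,
        "paragraph_count_ok": min_paragraphs <= len(paragraphs) <= max_paragraphs,
    }


GEMINI_INTRO = (
    "Trying to find the right tool to build your brand can be a difficult task. "
    "Many platforms promise to deliver amazing results, but they can be slow and expensive. "
    "This guide will help you understand the differences between StyleSpring and SnowflakeAI. "
    "We will compare their features, pros, and cons to help you make an informed decision "
    "for your business."
)

print(check_intro(GEMINI_INTRO))
```

For Gemini's intro this reports 59 words in a single paragraph, so both checks fail. A check like this only covers the mechanical rules; whether an intro flows naturally or sounds like a sales pitch still needs a human read.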
Other than following instructions (or not), what’s the quality like?
I’ll admit – I don’t hate Gemini’s intro. It’s concise and to the point, it addresses the reader’s pain point, and it tells them what they can expect to find in the article. I still prefer my friend Claude, though, and not just because it actually followed my instructions. It introduces the topic well, whetting the reader’s appetite with everything they can expect to find in the article.
Now, GPT on the other hand… it's just… weird. The opening sentence is vague. It talks about platforms "with very different foundations," though it's not quite clear what this means, and it's not explained in the following sentences. Instead, it talks about comparing "outcomes, proof, and process—without hype," even though these are not the things being actually compared in the article, and I'm not exactly sure what "proof" means. This is not what I call a quality output – it's what I call a word salad.
GPT-5 vs. Opus 4.1 vs. Gemini 2.5: Which is best for marketing copy?
Next, I wanted to test the bots on copywriting, so I had all three create some marketing materials for StyleSpring, based on the same brand guide:
- A landing page for collecting leads
- A Facebook post to introduce StyleSpring and encourage people to click the link to the signup page
- An email users would receive after signing up
Again, all three received the exact same prompts, and again, the differences in output are huge. Let's take a look at the hero sections for the landing page:
GPT-5
StyleSpring: Build a World-Class Brand in a Day
- Launch a professional brand in 24 hours
- Keep every asset on-brand and compliant
- Fine-tune designs until they feel truly yours
- Cut costs while scaling creative output
Gemini
Introducing StyleSpring
- Launch your brand with speed.
- Protect your identity as you grow.
- Balance automated quality with control.
- Lower your overall branding costs.
Claude
Launch Your Brand in One Day, Not One Month
- Professional identity system, instantly ready
- Built-in accessibility and WCAG compliance
- Real-time brand consistency protection
- Production-ready assets, zero agency fees
I've been a little mean to GPT-5 in this article (and for good reason), but to be fair, I think it did the best job here – by far. The heading is catchy and the bullet points are punchy, making StyleSpring’s benefits crystal clear and generally making it sound like a really cool platform. I mean, I’d sign up. Wouldn’t you?
In Claude’s version, “Not One Month” in the heading sounds… a bit strange? The bullet points use a lot of jargon and even I, as Founder and CEO of this imaginary brand, don’t understand what some of them are about.
Gemini’s version is… meh. Its heading is completely uncreative, and the bullet points, with their weird full stops, don’t sound very exciting either.
I should note that further down on the landing page, GPT-5 pretty much ignored my instructions on structure and word count for the different sections (not much of a surprise at this point). It may not do very well with blog posts or with generally following instructions, but I’m totally going to give it a chance next time I have a copywriting task on my to-do list.
Now, let's see how the three did with their Facebook post task.
GPT-5
Headline:
🚀 Build Your Brand in Just One Day
Body:
Weeks of design work? Not anymore. StyleSpring helps you launch a professional, investor-ready brand in 24 hours — with complete creative control and real-time quality checks.
Logos, colors, type, voice, website — all tailored to your goals and ready to use instantly. Your brand, your way… faster than you thought possible.
CTA:
👉 [Sign Up for Early Access Now]
Gemini
Tired of brand-building taking weeks? 😩
Introducing StyleSpring, the automated brand creation suite that helps you ship an investor-ready identity in just one day!
Our platform provides everything you need to create a professional brand with a guided process, from logo studies and color palettes to web kits and voice and messaging guides.
Ready to launch faster and smarter? Click the link in our bio to get started and say goodbye to endless revisions and hello to a brand that's ready to go. #StyleSpring #BrandBuilding #AI
Claude
Still waiting weeks for your brand identity? There's a faster way.
StyleSpring compresses months of agency work into one productive day. Generate logos, colors, typography, and web templates that actually ship—with built-in accessibility and real-time brand protection.
No templates. No compromises. Just professional brand systems built around your strategic goals.
✅ Complete identity system in 24 hours
✅ WCAG-compliant designs every time
✅ Production-ready exports to Figma, Webflow, Notion
✅ Real-time monitoring keeps your brand consistent
Early access opens soon. Reserve your spot and launch before your competitors even brief their agencies.
👉 [Link to signup page]
Same as above, ChatGPT did a nice job here. I think I'd probably choose its Facebook post over the others, but I’d have it tone down the salesy-ness a bit.
Claude is kind of dry and doesn’t seem to know how to get the target audience excited. To add insult to injury, it’s using one of my most hated AI cliché patterns: “No X. No Y. Just Z.” (People always say that em dashes are the surefire way to tell if something was written by AI, but they’re not. It’s this obnoxious pattern.)
And Gemini? Well, it’s not horrible, but it is kind of meh again. It also seems to have misunderstood the assignment – while I clearly asked for a Facebook post, it’s using “link in bio” and hashtags that belong on Instagram.
What can I say, GPT-5… you're starting to grow on me.
For a final comparison point, let's take a look at the emails the three bots wrote.
GPT-5
Subject Line:
Welcome to StyleSpring — Your Brand Starts Here
Body:
Hi [First Name],
You're in! 🎉
You've joined the early access list for StyleSpring — the fastest way to build a world-class brand without the long timelines or high agency fees.
Here's what you can expect:
- 24-hour brand launch — From concept to complete identity system.
- Creative control — Refine every element until it's truly yours.
- Consistency at scale — Keep every asset and message perfectly on-brand.
We'll let you know the moment StyleSpring opens its doors so you can be one of the first to create a brand that's built to impress.
Until then, keep an eye on your inbox — and start thinking about the brand you've always wanted.
[Visit Our Site for a Sneak Peek]
— The StyleSpring Team
Gemini
Subject: You're on the list! Welcome to StyleSpring.
Hey there,
Thanks for your interest in StyleSpring! We're thrilled to have you join our community of founders, marketers, and product teams.
We built StyleSpring to compress weeks of creative work into a single day, helping you ship an investor-ready identity fast while keeping control in human hands. Our platform balances automation with manual review, letting you fine-tune the results until they feel authentically yours.
Stay tuned—we'll be in touch soon with more details about our launch and how you can be among the first to experience the future of brand creation.
In the meantime, feel free to check out our blog for tips on building a killer brand.
Cheers,
The StyleSpring Team
Claude
Subject: Welcome to StyleSpring—Here's What Happens Next
Thanks for securing your early access spot.
You're joining teams who refuse to wait weeks for brand assets that might not even work. When StyleSpring launches, you'll build complete identity systems in one day—with the quality and control that typically takes months.
Your early access includes:
- Priority onboarding when we open
- Exclusive launch pricing (details coming soon)
- Direct input on our roadmap through beta feedback
- First look at new features before general release
What to expect next:
We'll email you two weeks before your access opens. That message will include your login credentials, a quick-start guide, and calendar slots for optional onboarding sessions.
Start gathering these materials now for the smoothest launch:
- Your strategic goals and target audience
- 2-3 competitor examples
- Any existing brand constraints or requirements
- Preferred tone and personality traits
Questions before launch?
Check our resource center for guides on preparing your brand brief, understanding design tokens, and maximizing your first day with StyleSpring: [link to resources]
We're building StyleSpring to end the wait between brand vision and brand reality. Your early trust helps us deliver on that promise.
See you at launch,
The StyleSpring Team
P.S. Follow our progress and get pre-launch tips on [LinkedIn] and [Twitter]. We share real examples of brands built in under 24 hours.
I think we can see a common thread here. While Claude and Gemini are both a little dry, GPT-5's copywriting skills make the platform sound fun. And I like that – I envision my fake SaaS branding platform as a fun one. If I were a potential user, that email would make me happy I signed up for StyleSpring and excited to see what it’s going to do for me.
I appreciate Gemini’s use of “Cheers” – it was obviously paying at least a little bit of attention, because StyleSpring’s company history mentions an office in London. Its email still has that fun element, though it leans a bit more serious. I can imagine some cases where that tone might be more appropriate.
Claude's email is too long, it doesn't make the platform sound exciting, and I don't like the negative focus in the opening line: "You're joining teams who refuse to wait weeks for brand assets that might not even work." Meh. It could probably do better with some extra prompting, but if we're looking at the out-of-the-box results, ChatGPT and Gemini do better.
Summary Table: GPT-5 vs. Opus 4.1 vs. Gemini 2.5
Criterion | GPT-5 | Claude Opus 4.1 | Gemini 2.5 |
---|---|---|---|
Best use cases | Landing pages, social media posts, email campaigns, marketing copy, creative copywriting | Blog posts, long-form articles, content requiring strict guidelines, professional reports | Basic content tasks, when "good enough" suffices, backup option |
Speed | Fastest | Slowest | Medium |
Ability to follow instructions | Poor – Missed word count by 50%+, copied text verbatim when told not to, adds unnecessary headings and sections | Good – Mostly follows instructions, slight word count misses | Fair – Close on word count but misses some structural requirements |
Blog writing quality | Weak – Word salad, vague language, poor flow, ignores instructions | Strong – Natural flow, coherent structure, minimal editing needed | Mediocre – Decent but uninspiring, some instruction gaps |
Marketing copy quality | Excellent – Punchy, engaging, makes products sound exciting with clear benefits | Fair – Dry, overly formal, jargon-heavy, negative focus, uses AI clichés | Fair – Functional but uncreative, lacks excitement |
Social media posts | Engaging and concise, understands the format | Dry, overly formal | Copy okay but unexciting, confused platform requirements |
Brand guide creation | More practical and specific | More open to interpretation | Not tested |
So, is GPT-5 actually good for content generation?
At the end of the day, it depends on what kind of content you’re trying to generate. GPT-5 was clearly better than Opus 4.1 and Gemini 2.5 at creative marketing copy, but its blog writing is a disappointment.
I also don’t understand why OpenAI’s “smartest, fastest, and most useful model” struggles so much with following basic instructions. This is the opposite of “most useful,” limiting the model’s use cases considerably. It’s clear to me that in its current state, I’m not going to be able to use it for any kind of long-form content.
Maybe if people complain about it enough, something will change. After reading about a million social media posts about GPT-5's inability to count the Bs in "blueberry," I asked again three days after the launch and got a more-than-correct answer. Touché, GPT-5.

FAQs: GPT-5 vs Opus 4.1 vs Gemini 2.5
Is GPT-5 good for content creation?
GPT-5 excels at marketing copy like landing pages, Facebook posts, and email campaigns. It creates punchy, engaging content that makes products sound exciting. However, it struggles with longer pieces and with following detailed instructions, which makes it a poor fit for long-form content.
Which AI is the best for writing blog posts?
Claude Opus 4.1 is the clear winner for blog posts. It comes close to word count requirements, rephrases content properly instead of copying verbatim, and produces articles with natural flow and logical structure that require minimal editing.
Can GPT-5 do reliable word counts?
No, not even close. When tested against Opus 4.1 and Gemini 2.5, GPT-5 was the only model that missed the requirement by a wide margin, writing 931 words against a 2,000-word minimum. Opus and Gemini both came very close to the target word count.
What's better for content creation, Opus 4.1 or GPT-5?
Claude is generally the better choice, but it depends on your needs. Stick with Claude for articles, reports, and instruction-heavy content. Try GPT-5 for creative marketing materials where personality matters more than precise requirement-following.
Why is everyone talking about ChatGPT and blueberries?
LinkedIn and Reddit have been buzzing about GPT-5's inability to count letters in words like "strawberry" and "blueberry" – a surprisingly common benchmark people use to test AI models' basic reasoning capabilities.

About the Author
Sarah Hornik is the Head of Content at XFunnel, where she focuses on helping brands understand and navigate the evolving landscape of AI-powered search and Generative Engine Optimization.
With expertise in digital marketing and content strategy, Sarah specializes in translating complex AI concepts into actionable insights for businesses looking to optimize their presence across generative AI platforms.