Generative AI like ChatGPT isn't just answering questions; it's steering buying decisions. Today, 58% of consumers rely on AI for product and service recommendations (up from 25% in 2023), and AI-driven search referrals surged 1,300% last holiday season [hbr.org]. This shift is reflected in data from 3,000+ marketers using our AI SEO agent, KIVA, which tracks where ChatGPT sources and cites content across synthetic query workflows.
At Wellows, we dug into real-world data from thousands of KIVA users, analyzing 7,785 ChatGPT queries and 485,000+ citations across 38,000+ domains. The verdict? AI now plays a significant role in product discovery, with consumer reliance on recommendations increasing by 132% year-over-year.
But it's not just about getting mentioned. We set out to decode which sites large language models (LLMs) trust and cite, and why. What did we find? The rules have changed, and what works might surprise you.
In this post, I'll unpack those insights and what they mean for content creators and marketers trying to stay ahead of the curve.
- Top 50 domains get 48% of citations, but 52% go to long-tail sites - niche expertise can compete with authority
- Commercial queries favor product sites (20%) and tech media (22%) - content strategy must align with user intent
- Recent content wins for time-sensitive queries, authority wins for evergreen topics - balance is key
- Authoritative + structured + intent-aligned = higher citation chances in AI responses
Let's dive into the data and see what it reveals about aligning your content strategy with the era of generative search.
This report is based on 7,785 anonymized ChatGPT queries captured through the KIVA platform in 2024, generating over 485,000 citations across 38,000+ unique domains. All data was gathered in compliance with privacy standards and reflects enterprise-grade usage across a wide range of content creation intents.
We gathered anonymized logs from 3,000+ marketers. KIVA generated synthetic queries for keywords, simulating searches and tracking which sources ChatGPT cited to reveal AI-native citation behavior.
KIVA classified synthetic queries by intent, domain type, and timing. This structure traced how LLMs respond to machine-generated prompts, linking each citation to context for deeper behavioral insight.
Cited domains were tagged and grouped by type, time, and intent. This analysis showed which content LLMs prefer, helping marketers boost visibility and increase chances of being cited by ChatGPT.
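The tagging pipeline described above can be sketched roughly as follows. This is a minimal illustration, not KIVA's actual schema: the field names, intent labels, and domain types are assumptions for the example.

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class Citation:
    query: str        # the synthetic query ChatGPT answered
    domain: str       # domain of the cited source
    intent: str       # e.g. "commercial", "informational", "transactional" (assumed labels)
    domain_type: str  # e.g. "tech_media", "product", "education" (assumed labels)

def summarize(citations):
    """Group citation counts by domain type within each intent bucket."""
    summary = {}
    for c in citations:
        summary.setdefault(c.intent, Counter())[c.domain_type] += 1
    return summary

# Toy sample to show the shape of the output
sample = [
    Citation("best AI writing tools", "techradar.com", "commercial", "tech_media"),
    Citation("best AI writing tools", "jasper.ai", "commercial", "product"),
    Citation("what is generative search", "hbr.org", "informational", "education"),
]
print(summarize(sample))
```

Grouping every citation this way is what lets the later sections compare citation shares by intent and by domain category.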
One of the first analyses we ran was to categorize the top-cited domains by their industry or content archetype. In other words, what kinds of sites does ChatGPT tend to pull answers from?
The results showed a clear divide between tech media publishers and product/service websites, with a few educational and consulting sources also included.
To start, here are the 50 most-cited domains in our dataset:
TechRadar leads with around 14,500 citations, followed closely by trusted tech publishers like CNET, PCMag, Tom's Guide, and TechCrunch. These sites frequently appeared in queries such as "top AI optimization tools" or "best tools for productivity," suggesting that ChatGPT leans heavily on authoritative review platforms for curated content.
What stands out even more is the high visibility of official product websites, such as OpenAI and HubSpot. These domains often show up when ChatGPT references tool features or offers product comparisons.
The trend is clear: tech media and product vendor pages dominate citation rankings, highlighting the SEO value of third-party credibility and well-structured, informative product content.
It's important to note that beyond the leaders, citation frequency drops off into a long tail. The top 50 domains accounted for almost 48% of all citations, while the long tail of other domains accounted for the remaining 52%.
Our data included references to over 38,000 unique domains, indicating that ChatGPT drew information from an extensive range of sources. This "fat head, long tail" distribution is visualized below:
This indicates that, although a handful of sites are frequently cited, ChatGPT also draws answers from a broad long tail of niche sources. In practical terms, authority sites command a significant chunk of citations, but there's plenty of opportunity in the tail. Many smaller blogs, company sites, and forums each contributed a few citations here and there.
It suggests that if your content is highly relevant to a specific question, ChatGPT may find and cite it even if your site isn't a household name, though building authority helps (more on that later).
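The "fat head, long tail" split is straightforward to measure on any citation dataset. Here is a minimal sketch, assuming you have per-domain citation counts; the numbers below are made up for illustration:

```python
def head_tail_share(counts, head_size=50):
    """Return (head_share, tail_share) of citations for the top-N domains."""
    ranked = sorted(counts.values(), reverse=True)
    total = sum(ranked)
    head = sum(ranked[:head_size])
    return head / total, (total - head) / total

# Toy data: 3 "head" domains plus a long tail of small sites
counts = {"techradar.com": 900, "cnet.com": 700, "pcmag.com": 400}
counts.update({f"niche-blog-{i}.com": 5 for i in range(400)})  # 2,000 tail citations

head, tail = head_tail_share(counts, head_size=3)
print(f"head: {head:.0%}, tail: {tail:.0%}")  # head: 50%, tail: 50%
```

Running the same computation with `head_size=50` against real citation logs is how a 48%/52% head/tail split like ours would be derived.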
We manually categorized a sample of the top 50 domains into a few archetypes and estimated their share of total citations.
Here's a breakdown:
| Domain Category | Examples (Top Cited Sites) | Approx. Citation Share |
|---|---|---|
| Tech Media & Publishers | TechRadar, CNET, PCMag, Forbes, TomsGuide, TechCrunch, Wired, TheVerge, Technology Review, Technadu, VentureBeat, CIO, ZDNet | ~46.5% |
| Product & SaaS Websites | OpenAI, HubSpot, Jasper.ai, Copy.ai, Semrush, Salesforce, Writesonic, Grammarly, Adobe, Moz, Microsoft, Ahrefs, Hootsuite, Neil Patel, AI.Google, Netflix, Canva, Buffer | ~25.3% |
| Others | Comparitech, VPNPro, EFF, Norton, McAfee, Search Engine Journal, TorrentFreak, EdSurge, CyberNews, Mayo Clinic, Towards Data Science, etc. | ~16.8% |
| Education & Research | Harvard Business Review, Brookings.edu, Business Insider | ~6.6% |
| Consulting & Analysts | McKinsey, Gartner, Deloitte, IBM | ~4.8% |
Classification of cited domains by type.
Tech publishers and review outlets are highly influential on LLM queries. Established sites like TechRadar and CNET have the kind of authoritative, comprehensive articles that ChatGPT loves to cite. These sites often update their content regularly and cover "top X" lists that align well with user queries, making them prime targets for citations.
Official product pages for leading tools get cited frequently, more than we initially expected. In queries asking for the "best" products, ChatGPT would not only list those products but often cite the product's website to back up claims about features or pricing.
For example, in answers about VPN services, ChatGPT might say "NordVPN offers fast speeds and strong encryption" and cite NordVPN's site as evidence. This underscores that having clear, factual information on your official site can directly earn you citations in AI-generated answers.
Educational and research sources (like HBR.org, Brookings, or arXiv papers) constituted a smaller slice of total citations (~9%), but were prominent for certain types of questions (primarily conceptual or trend-focused queries).
For instance, Harvard Business Review articles appeared often in answers to questions about AI's impact on industry or strategy. These sources tend to provide deep insight or data, indicating that ChatGPT turns to them for authoritative context.
Consulting firms and industry analysts (McKinsey, Gartner, etc.) were only rarely cited (~1%). This was a bit surprising, given their thought leadership reports.
It's possible that their content is gated or not as frequently surfaced by web search, or that ChatGPT had sufficient information from other sources. In any case, traditional consulting whitepapers didn't significantly influence ChatGPT's answers in our sample.
The long tail of "other" domains (over 50% of citations) includes everything from niche blogs to community forums and Q&A sites. This diversity means that almost any quality content, even from lesser-known sites, can potentially be pulled in by ChatGPT if it directly addresses the user's query.
We saw citations from government sites (e.g., NIST.gov for cybersecurity standards), developer documentation (IBM's knowledge base for programming tool queries), and even Reddit threads for specific troubleshooting questions.
The caveat: each of these individual sites contributed only a few citations. Together, they form the fabric of the AI's broad knowledge base.
In short, dominant citation sources fall into two camps: authoritative publishers and trusted product/service sites, with a healthy mix of others filling specific niches.
For brands, this highlights two paths to being cited: become a go-to publisher of information in your domain, or build your product's site into a trusted source of facts and specs. Ideally, do both.
Does ChatGPT prefer fresh, recently updated content, or does it lean on evergreen information? This is a crucial question for content strategy, since one might assume an AI with up-to-date browsing would always fetch the latest articles.
Our analysis suggests a nuanced answer: ChatGPT's citations tend to reflect content freshness when it's relevant, but not at the expense of authority.
Keep your content updated when appropriate. If you operate a site that publishes rankings, how-tos, or guides, updating them with current information can increase the chances that an LLM will pick your page as a source for current queries.
That said, don't update just for the sake of a new timestamp; update to maintain relevance and accuracy.
ChatGPT's behavior suggests it values a mix of up-to-date and authoritative information. The best scenario is having content that is both definitive and regularly updated with new data, for example, a comprehensive guide in your field that stays current over time.
Not all queries are created equal. We categorized the user queries into a few intent types (broadly Informational, Commercial Investigation, and Transactional) to see how the citation patterns differed. Here's how we defined them:
When we mapped queries to intent types, ~73% were Commercial, ~23% Informational, and ~4% Transactional. The skew toward "best/tools" queries reflects the dataset's focus on tool-related topics. Still, citation patterns varied notably across these categories.
- **Commercial Investigation:** comparison and product-discovery queries.
- **Informational:** educational and explanatory queries.
- **Transactional:** action-oriented queries.
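A crude way to approximate this kind of intent bucketing is keyword heuristics. This is a simplified stand-in for KIVA's classifier, and the keyword lists are assumptions, not its actual rules:

```python
# Assumed keyword lists; a production classifier would be far more nuanced.
COMMERCIAL = ("best", "top", "vs", "alternatives", "review")
TRANSACTIONAL = ("buy", "pricing", "discount", "sign up", "download")

def classify_intent(query: str) -> str:
    """Bucket a query as transactional, commercial, or informational."""
    q = query.lower()
    if any(k in q for k in TRANSACTIONAL):
        return "transactional"
    if any(k in q for k in COMMERCIAL):
        return "commercial"
    return "informational"

print(classify_intent("best AI optimization tools"))  # commercial
print(classify_intent("HubSpot pricing plans"))       # transactional
print(classify_intent("what is generative search"))   # informational
```

Even this blunt heuristic reproduces the skew we saw: "best/top tools" phrasing dominates tool-related keyword sets, which is why Commercial queries made up ~73% of our sample.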
As a founder, analyzing these citation patterns completely changed my perspective. Talking about AI's impact on search is one thing, but seeing how an LLM selects its sources is eye-opening.
Here are some honest insights and lessons we at Wellows learned firsthand:
We noticed how often ChatGPT cited official product sites for top-ranked tools. This drove home a point: if you manage to be one of the top recommendations in your category, the AI will likely mention you and even use your site as evidence.
It's like winning two prizes: first, the human-crafted "best of" lists include you, then AI amplifies your presence by citing your own domain. For us, this emphasized the importance of product excellence and reputation. No amount of SEO tricks will get you cited by ChatGPT as "one of the best" if you truly aren't regarded as such by the source material it trusts.
In an age of generative search, quality and brand reputation are more critical than ever - there's no gaming the system when an AI is synthesizing consensus opinion.
Initially, I thought big-name domains would completely dominate citations. While high-authority sites did lead, the long tail showed that authority, in AI eyes, is very context-specific. If someone asks about machine learning frameworks, an obscure blog by an ML engineer might get cited because it has the precise answer, even if that blog would never outrank TensorFlow's official site on Google.
This taught us not to discount niche expertise. At Wellows, we encourage deep-dive content on specific subtopics even if it targets a niche. If it's the best answer on that topic, ChatGPT might surface it. In short, be the Wikipedia of your niche.
We now do "SEO and GEO" - Search Engine Optimization and Generative Engine Optimization. Traditional SEO isn't going away, but we now also evaluate how LLMs consume our content.
Are we using semantically rich language that mirrors the kinds of queries users type? Our LLM Pattern Analysis Checklist breaks down these patterns into actionable steps, ensuring content is both SEO-friendly and optimized for AI-driven discovery.
We began adding FAQ sections and summary bullets to our articles, anticipating that AI might favor well-formatted snippets. We also started paying more attention to schema markup and metadata to boost contextual clarity. Even though LLMs don't directly use schema the way search engines do, it still helps signal intent.
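As a concrete example, an FAQ section can be mirrored in FAQPage structured data. This is a generic schema.org JSON-LD snippet built in Python; the question and answer text are illustrative, and nothing here is a claim about how any particular LLM consumes the markup:

```python
import json

# schema.org FAQPage structured data, built as a plain dict
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Which sites does ChatGPT cite most often?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "In our dataset, tech media such as TechRadar and CNET "
                        "led citations, followed by official product sites.",
            },
        }
    ],
}

# Emit the JSON-LD you would embed in a <script type="application/ld+json"> tag.
print(json.dumps(faq, indent=2))
```

The same page content serves both audiences: readers see the FAQ prose, while crawlers get an unambiguous question/answer structure.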
Some domains, like HBR or Mayo Clinic, consistently popped up in ChatGPT's answers for critical topics. That's because ChatGPT appears to favor sources that ooze credibility, reflecting an internalized sense of trust and authority.
This reinforced our strategy of becoming a trusted resource in our domain. In practice, that means E-E-A-T: Experience, Expertise, Authoritativeness, and Trustworthiness, principles that now influence both SEO and LLM relevance [lseo.com].
Throughout this journey, a humbling realization is that we're all learning how these AI systems work. The data gave us clues, but the algorithms are complex and evolving. We have to remain adaptive.
One week, a certain site might be the source for ChatGPT, and a month later, a new, better source could replace it as the model's knowledge updates.
So our mindset at Wellows is to stay curious, keep testing, and never assume we've "figured it out" completely. The goal is to continuously align our strategies with the AI's behavior, which, in a way, keeps us honest about making genuinely useful content.
Wellows is envisioned as an AI-first marketing platform that acts as an autonomous marketer for your brand. It deploys a "Visibility Stack" of intelligent agents to uncover the exact questions your audience is asking across Google and leading LLMs like ChatGPT, Gemini, Claude, and DeepSeek.
Wellows identifies visibility gaps, prioritizes high-impact content opportunities, generates citation-worthy content, and structures it for maximum discoverability. In addition, it publishes with precision, continuously monitors every mention, flags drops, and automatically updates or prunes assets so your growth engine remains self-learning, always-on, and headcount-free.
To earn AI citations, be the best answer: publish the most useful, accurate, and relevant content. Algorithms reward quality, authority, and relevance. Focus on standout value and clarity.
Early findings from our ChatGPT analysis indicate that brands with strong content tend to receive more citations. Keep adapting as AI evolves.
Bottom line: Experiment, analyze, and keep improving. The field is new, so there's room to lead.