Creating Synthetic Users for Free with ChatGPT

We get it ... it's extremely hard to get people to respond to surveys, or answer questions in general. Really, we have a 2% answer rate when we ask users for feedback 🥲 This is why synthetic users are a growing trend in the insights industry. These AI-generated personas mimic specific personality traits (e.g. interests, pain points, etc.) and enable you to interact with them as if they were real-life human beings. They're increasingly used in various fields (e.g. market research, user research, customer feedback surveys, etc.) to complement human-generated data.

How? Well, in theory, a synthetic user allows you to ask AI directly what you would otherwise painstakingly try to get from actual human beings. Assuming that the synthetic user accurately represents that human to begin with!

This article will focus on how to create accurate synthetic users in practice, for free, using ChatGPT. You'll learn the concrete steps and ChatGPT prompts needed to create a synthetic persona of your customer or user. However, this isn't a "AI is going to replace everything" blog post. There are clear limitations to synthetic users, as you'll find out from a real-world example. Let's dive in!

P.S.: We promise, this article wasn’t written by a synthetic persona! Only 100% flesh-and-bones humans behind the writing.

What Are Synthetic Users?

Synthetic users are virtual personas created using artificial intelligence trained on demographic data and behavior patterns. The goal is to approximate the user as closely as possible, and subsequently run a battery of tests on the synthetic user. Compared to running the test with real users you'll obviously gain user insights a lot faster and cheaper. However, this increased efficiency typically comes at the cost of less trust worthy insights and test results.

As an interesting side note, synthetic users have been around before the 2022 LLM boom. For example, synthetic users have been used to perform load testing on servers and software. However, the interest from the insights industry for AI personas in general has clearly boomed together with the advent of LLMs as shown below.

Graph showing the increasing search interest for synthetic users and AI personas on Google

Synthetic users used to be rather one-dimensional, mimicking very specific sets of human behavior (e.g. clicking on websites, loading pages in browsers, etc.). However, recent advances in AI have made it possible to mimic complex human behavior, by relying on vast amounts of training data.

For example, large language models such as ChatGPT can be asked to impersonate a specific persona (e.g. a middle-aged woman, interested in entrepreneurship living in suburban setting). ChatGPT can do this because it went through acting classes together with Brad Pitt. Just kidding, ChatGPT was trained on bucketloads (yes, that's a computer science term) of data from real humans online that it assimilated rather precise behavior patterns.

Besides these general-purpose AI tools, we've also seen the emergence of dedicated AI tools for user research (e.g. Syntheticusers.com) and market research (e.g. Lakmoos). These tools typically sport proprietary machine learning algorithms, claiming to increase the accuracy of the data and insights they generate compared to tools such as ChatGPT.

How to Use AI and Synthetic Users in Research

How do you use synthetic users for conducting research? Here are two concrete AI variations on traditional research methods that might inspire you:

Competitor Benchmarking & Market Research

Doing competitor benchmarking is something which is typically done internally. You, as a marketeer, will map out competitors along various axes to determine how your product or service fares compared to others. An interesting use case for synthetic users is to let them analyse your competitors for you. Basically, you ask the AI to impersonate a particular persona and analyse a competitor's website.

Moreover, LLM providers are increasingly adding web searching capabilities to their AIs. Meaning that in the future we expect these synthetic benchmarks to not only be more in depth (i.e. better insights gathered from the AI personas) but broader as well (i.e. more tools contained in the benchmark as the AI's search capabilities improve).

Synthetic Focus Groups

Traditional user groups can be expensive, time consuming, and prone to the occasional participant who just loves to hear themselves talk. Synthetic focus groups, powered by AI-generated personas, aim to solve this by simulating realistic customer conversations without the logistical headache. Instead of recruiting human participants, you generate AI personas that match your target audience or user base. These personas then engage in simulated discussions, reacting to new product ideas, marketing messages, or pricing models.

The upside of AI-generated focus groups isn't only their cost effectiveness. Analysis of these AI-generated focus groups is also significantly easier, given that you can instruct the AI to immediately categorise discussions according to sentiment, pain points, etc. Of course, AI personas don’t feel human emotions, but for rapid, cost-effective insights, they’re the closest thing to having an on-demand, scalable focus group in your back pocket.

The Problem with Synthetic Users

Large language models like ChatGPT can generate synthetic users that mimic real people based on the vast amount of data they've been trained on. But while ChatGPT’s impersonations might sound convincing, they’re ultimately generic. They're essentially more of a well-educated guess than a accurate representation of your actual customers.

You'll likely think that ChatGPT's synthetic personas are rather obvious if you've spent time with your users (e.g. reading their feedback, analyzing product behavior, or just plain old talking). These synthetic users might not tell you anything that you couldn't have guessed yourself. ChatGPT might "know" what drives a particular demographic in general from it's training data. But your audience is probably too unique for ChatGPT to have seen in-depth data on it during training.

"Yes, but blog-writing man how do I then make my ChatGPT personas more accurate" I hear you say? We'll dive into how to do this in ChatGPT in a second. But essentially it comes down to two key techniques:

Feed ChatGPT your actual user data. Instead of relying on its general knowledge of how a 'customer in your industry' might behave, you can give ChatGPT real inputs: past survey responses, customer reviews, or user interviews. Then ask it to impersonate a user based on that data. This allows ChatGPT to make informed decisions based on real-world insights, rather than assumptions based on generic synthetic data. Please make sure that you're allowed to share this information first, knowing that OpenAI might use it as training data!
Create feedback loops to refine accuracy. The best way to make synthetic users truly useful is to continuously fine tune the AI models with fresh data. In other words, feeding it new customer insights as they occur (e.g. new product feedback, changes in your product strategy, etc.). Think of it as teaching ChatGPT about your customers the same way you’d onboard a new team member: it improves the more it learns.

Enough talking now, let's get down to it and show you how you can get started building synthetic users for yourself in ChatGPT.

ChatGPT Prompts to Create Relevant Synthetic Users

We're going to assume that you have user data at hand (e.g. survey results, interview recordings, reviews, etc.). If this isn't the case we'll show you how we used ChatGPT to generate synthetic data at the end of this section. This allows you to play around with the concept before you commit real user data.

Two Methods to Turn User Data into Synthetic Users

There are two methods we recommend to turn your user data into synthetic users for research studies:

If you're working with bulk, or anonymized data, you'll want to use the first method. It processes the data, extracts personas and applies them in research.
If you have access to precise user data (e.g. recordings of interviews, reviews left by a specific user), you'll want to use the second method. This method uses the data from individual users as context for further prompting.

Method 1: From Bulk to Breakthroughs

ChatGPT workflow to create synthetic users from bulk user data

Say you have a CSV file with survey results from a new marketing campaign, and the data doesn't allow you to identify precise customers. To generate synthetic customers you'll first want to analyse this bulk data using ChatGPT to have it distill key learnings. For example using the following prompt:

“Analyse survey data from a CSV file to identify recurring themes, motivations, and frustrations across user feedback. Distill user personas based on these patterns and structure them into actionable profiles (e.g., name, motivations, pain points, feature priorities).”

This is only a summary of the actual prompt we use, you can find the full prompt here. It's important to note that, according to our own experience, it's better to already instruct ChatGPT on the personas you have identified .

You can either directly continue with the research part, or ask ChatGPT to generate a separate document with the personas which you can later reuse in other prompts. For example, in the following prompt we use one of these AI generated "persona documents" to test how different kinds of synthetic users would react to a new product concept.

“You are an AI trained to analyse user personas from a provided context document. I will paste the document containing user personas, their motivations, pain points, and feature priorities after this prompt. Analyse the following feature idea: A new option allowing users to request colleague feedback directly within the ‘preview’ mode of a form before publishing it. Summarise how each persona would react to this feature, assess whether this feature fits into their workflow and if they would use it regularly, and evaluate whether they would consider this a premium feature worth paying for. Provide a structured summary of each persona’s perspective, highlighting their enthusiasm, hesitations, or concerns. If certain personas are more likely to adopt and pay for the feature, explain why.”

This is only a summary of the actual prompt we use, you can find the full prompt here.

Method 2: From Precise to Predictive

ChatGPT workflow to create a synthetic users from precise behavioral data

In this second method you'll be using all data you have on a specific user to have ChatGPT impersonate them. To do this you'll be first issuing a system prompt (i.e. a prompt instructing how ChatGPT should behave for the rest of the conversation). For example:

“You are an AI trained to analyze and synthesize detailed user data to create a comprehensive profile of a specific user. I will provide information about a single user, including their feedback, preferences, motivations, pain points, and behavior. Your task is to process this data to understand their customer experience and summarize their key traits, motivations, and frustrations. Using this profile, you will impersonate the user in all subsequent prompts, providing realistic responses to questions about product features, workflows, and decisions. The goal is to simulate their thought process and perspective, ensuring accurate and actionable insights.”

This is only a summary of the actual prompt we use, you can find the full prompt here.

You can now interact with ChatGPT as if it were that particular user. Except that you can fire an endless onslaught of questions without running the risk of scaring them away!

Generating Synthetic Data to Play Around With

Not ready yet to create synthetic users from real data? We get it, we also first generated some synthetic data in order to play around with prompts and methods. Here's how we went about it, in case you want to follow our example.

First we asked ChatGPT to create a fake customer feedback survey. The goal of this survey is to be broad enough for ChatGPT to then create patterns, synthetically, according to various personas. You can find the prompt we used to create this fake survey here.

Next, prompt ChatGPT to create a CSV file with answers to your survey questions. The goal here is to simulate human behavior by telling ChatGPT which generic user personas it should impersonate while filling in the survey. For example, we used marketeers, entrepeneurs and customer success managers in our prompt.

Synthetic Personas: Real-World Example and Limitations

If you're anything like us you might be wondering to what extent the answers of synthetic users are trustworthy at all. Is there any proof that synthetic users don't hallucinate their way through qualitative research interviews, instead of accurately simulating real people? Fortunately, there's an increasing body of peer-reviewed scientific literature examining this exact question. As you would expect, the answer is nuanced. Below we present the results of the existing studies on the effectiveness of using synthetic personas.

Research Results Against Synthetic Users

A German study explored the potential of ChatGPT (3.5) to estimate public opinion in Germany, focusing on voting behavior. They created synthetic personas mirroring respondents from the 2017 German election study . These personas were then entered into ChatGPT, prompting the model to predict each individual's voting choice. Subsequently, the results obtained from ChatGPT were compared to the actual 2017 survey results.

The researchers found that ChatGPT wasn't able to accurately represent voter opinions. It clearly displayed a bias towards the "green" and "left" parties. It successfully captured voting patterns of particular subgroups (e.g. partisans) but failed overall.

Research Results in Favour of Synthetic Users

Another study from, amongst others, Google DeepMind (Google's AI division) and Stanford University also investigated the accuracy of synthetic agents. They first conducted two-hour qualitative interviews to gather comprehensive life histories from a diverse set of 1000 people. These interviews were given to GPT (4-o) in order to create synthetic personas (or agents in their terminology). They then asked the real users and the AI agents to complete a set of established tests in the social sciences (e.g. the General Social Survey (GSS), the Big Five Personality Inventory, etc.)

The results show that the agents replicated participants’ GSS responses with 85% accuracy. In other words, GPT was able to accurately create a synthetic user from the participants. This further drives home the point that we've been making. The data you use to fine tune AI models plays a significant role in the accuracy of the persona. By leveraging human feedback you ensure that the responses given by your synthetic users more closely match real customer behavior.

Synthetic Users FAQ

What is a synthetic persona?

A synthetic persona is an AI-generated profile that mimics real user behaviors and preferences. These personas are created using data from actual users and are used in market research, product development, and testing to uncover insights efficiently.

What is synthetic user testing?

Synthetic user testing uses AI personas to simulate interactions with products, services, or campaigns. This allows businesses to explore user experiences and refine ideas faster and more cost-effectively than with real participants.

Can you trust synthetic users?

Synthetic users can predict trends with up to 85% accuracy but may miss nuances like political or behavioral subtleties. They are best used as a complement to real user feedback, offering scalable insights without replacing human validation.

Do synthetic customers generate good insights?

Synthetic customers help explore ideas when real feedback isn’t available, but their effectiveness depends on data quality. They’re most useful alongside real input.

How can synthetic personas help marketing teams?

Synthetic personas allow marketing teams to simulate user behavior for specific purposes, such as testing campaigns, exploring new user groups, or understanding how different audiences might react to their messaging. This enables faster insights without requiring live user feedback.

Can synthetic personas replace real users?

Synthetic personas are excellent for specific purposes like rapid testing or early exploration, but they don’t fully replace real users. While they help marketing and product teams simulate behavior and understand trends, real user feedback is still essential for validating insights.

‍