One of the challenges of building software systems and algorithms is that you often don’t have the real-world data you need to test them before they go into production or before customers start using them. It’s all too common to design a product interface or algorithm on paper, only to discover that once it is in production with real data, the output doesn’t look the way you expected. Large language models like OpenAI’s GPT-4 and Anthropic’s Claude can be a game changer in these situations.
We ran into this issue at Nomad Data while building a new product, Data Relationship Manager, which is similar to a CRM for data. The product helps firms keep track of their data vendors, datasets, purchases, interactions, meetings, tests and more. After we had a working version of the application, we found it hard to visualize what the screens would look like in a real-world setting. We had no actual user data, and most screens sat empty. That made it difficult both to validate the UI and to demo the product. We were pondering where to get a meaningful amount of test data when we realized that generative AI was the obvious solution.
Generative AI let us do something that hadn’t previously been practical: generate all the usage data we needed. Current generative AI models handle text very well. The key is to give them context about what you need created.
Nomad’s product is used by a variety of user types across business functions, each performing specific activities. We needed to generate data that simulated many of these user types using our product to get their jobs done. Their activities are spread over time and need to happen in a logical order. We accomplished this in a few steps.
Step 1: We needed to give the GPT models a general introduction to what we were trying to accomplish
You are a system that is designed to generate useful testing data for a Customer Relationship Management (CRM) product. Here are the steps:
First, you will make up a fictitious management consulting firm with a need for data to use on client projects ranging from market sizing to competitive analysis to pricing studies. Make up a very specific storyline of what specific data they are looking for and why across a number of projects.
Second, make up 10 users that work in this company. Assign random job roles and titles based on the definitions below.
Step 2: We needed to explain to GPT what the different user types spend their time doing so it could construct a realistic set of events
Here is an example of one such user type we teach it about in the prompt:
Data Sourcer: The employee who searches for data after receiving a request from a consultant.
Role: A data sourcer specializes in finding and gathering relevant data based on what consultants ask them for in response to a consulting project. They search for data vendors, initiate communication with them, ensure data quality and accuracy meet the project requirements, coordinate with the consultant and then ultimately pass the vendor off to procurement if the consultant agrees to purchase. They log all early engagements with a data vendor such as that they filled out a contact form, exchanged an email, had a meeting, received test data, ran a data test or initiated a purchase discussion with their internal procurement people.
Job Titles: Data Sourcer, Data Researcher, Data Acquisition Specialist
We ultimately taught it about five different roles but could have just as easily done this for dozens.
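Because each role follows the same three-part shape (a one-line summary, a role description and a list of job titles), the prompt can be assembled from structured definitions rather than edited by hand. The following is a minimal Python sketch of that idea, not the code we actually used. The `Role` class and `build_prompt` function are hypothetical names introduced for illustration, and the role description is abbreviated.

```python
from dataclasses import dataclass


@dataclass
class Role:
    """One user type, in the same three-part shape as the prompt text."""
    name: str
    summary: str
    description: str
    job_titles: list[str]

    def render(self) -> str:
        # Mirrors the "Name: ... / Role: ... / Job Titles: ..." layout.
        return (
            f"{self.name}: {self.summary}\n"
            f"Role: {self.description}\n"
            f"Job Titles: {', '.join(self.job_titles)}"
        )


def build_prompt(intro: str, roles: list[Role], task: str) -> str:
    """Join the introduction, every role definition and the task text."""
    role_text = "\n\n".join(role.render() for role in roles)
    return f"{intro}\n\n{role_text}\n\n{task}"


data_sourcer = Role(
    name="Data Sourcer",
    summary="The employee who searches for data after receiving a request from a consultant.",
    description="A data sourcer specializes in finding and gathering relevant data (abbreviated here).",
    job_titles=["Data Sourcer", "Data Researcher", "Data Acquisition Specialist"],
)

# intro and task would hold the Step 1 and Step 3 text respectively.
prompt = build_prompt("<Step 1 text>", [data_sourcer], "<Step 3 text>")
```

Adding another role then means adding one more `Role` object to the list, which is what makes scaling from five roles to dozens cheap.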
Step 3: We needed to explain what we needed the model to do with this information
This company is logging their activities around data vendors that they work with and evaluate into our CRM to keep track of everything that has happened. Any work they do with the data or data vendor is logged so that their colleagues are aware of what is happening surrounding a data vendor and its products.
Create a set of activities between two years ago and today for each, to tell a story/dialogue of how these users communicate and work with the data from specific vendors. Create activities for between five and 10 people for each data vendor. Each user is to create three to five activities for each data vendor they are working with.
Make sure there are activities that mention experiences actually using the data. How well did it work? Was there missing data? Was it a problem?
The output should be in a CSV format. Each row should be in the format:
Date (mm/dd/YYYY), User Full Name, Data Vendor Name, Data Vendor ID, Activity Text
Examples:
9/10/2021, Sarah Chang, AI Global Insights, Sent an introductory email to AI Global Insights expressing the need for AI market data.
9/15/2021, Lisa Martin, SSC, Discussed SSC’s requirements with Sarah Chang and shared a high-level overview of AI Global Insights’ data capabilities.
9/16/2021, Michael Johnson, TechIntel, Requested a subset of AI industry data from TechIntel for preliminary analysis.
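Asking for a fixed row format makes the output easy to check by machine before it goes anywhere near the product. Below is a minimal Python sketch of such a check, under a few assumptions: rows carry four fields as in the examples above (which omit the vendor ID), dates parse as month/day/year, and unquoted commas inside the activity text are rejoined into a single field. The names `Activity` and `parse_activities` are hypothetical.

```python
import csv
import io
from datetime import datetime
from typing import NamedTuple


class Activity(NamedTuple):
    date: datetime
    user: str
    vendor: str
    text: str


def parse_activities(raw: str) -> tuple[list[Activity], list[list[str]]]:
    """Split model output into valid activities and rejected rows."""
    good: list[Activity] = []
    bad: list[list[str]] = []
    # skipinitialspace lets the reader handle ", " between fields.
    reader = csv.reader(io.StringIO(raw), skipinitialspace=True)
    for fields in reader:
        fields = [f.strip() for f in fields]
        if not any(fields):
            continue  # ignore blank lines
        try:
            if len(fields) < 4:
                raise ValueError("too few fields")
            date_s, user, vendor = fields[:3]
            # Assumption: extra fields come from unquoted commas in the
            # activity text, so rejoin everything after the third column.
            text = ", ".join(fields[3:])
            when = datetime.strptime(date_s, "%m/%d/%Y")
        except ValueError:
            bad.append(fields)
            continue
        good.append(Activity(when, user, vendor, text))
    return good, bad
```

Rows that land in the rejected list are a useful signal for where the prompt needs to be more specific.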
Step 4: Test, tweak and test more
After we ran this, we noticed areas where we needed to be more specific. In less than an hour, GPT-4 was producing highly realistic test data:
"06/24/2021", "Emma Smith", "AgriDataCorp", "Reached out to AgriDataCorp for initial discussion on South American organic farming data needs."
"06/28/2021", "John Davis", "AgriDataCorp", "Received AgriDataCorp’s data product catalogue. Initiated discussions on cost and licensing agreement."
"06/30/2021", "Alice Williams", "AgriDataCorp", "Received initial data sample from AgriDataCorp. Started cleaning and integration with our system."
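Part of the test-and-tweak loop can be automated by checking each batch against the rules stated in the prompt: dates within the last two years, and three to five activities per user for each vendor. The Python sketch below is one way to do that, not a description of our actual process. The `check_activities` function is a hypothetical helper, and it approximates "two years" as 730 days.

```python
from collections import Counter
from datetime import date, timedelta


def check_activities(rows, today, min_per_pair=3, max_per_pair=5):
    """Check (date, user, vendor) tuples against the prompt's rules.

    Returns a list of human-readable problems; empty means all rules held.
    """
    problems = []
    earliest = today - timedelta(days=730)  # approximation of two years
    counts = Counter()
    for when, user, vendor in rows:
        if not earliest <= when <= today:
            problems.append(f"{when.isoformat()} is outside the two-year window")
        counts[(user, vendor)] += 1
    # Each user should log three to five activities per vendor.
    for (user, vendor), n in sorted(counts.items()):
        if not min_per_pair <= n <= max_per_pair:
            problems.append(
                f"{user} has {n} activities for {vendor}, "
                f"expected {min_per_pair}-{max_per_pair}"
            )
    return problems
```

A non-empty result points to the instruction the model ignored, which suggests which part of the prompt to tighten before the next run.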
We were quickly able to generate as much test data as we needed, something that would have been either incredibly expensive or time-consuming only a few months earlier.
Whether you are building products or algorithms, GPT-powered models are well worth using to generate test and demo data. In seconds, you can breathe life into an empty product demo. You can just as easily see what your products will look like in the hands of real users and companies.