Step-by-step prompt engineering proves key in GPT-4 brainstorming tasks


Researchers have studied how prompting methods can be used to generate ideas, and which methods produce the greatest diversity of ideas.

The working paper by Lennart Meincke, Ethan Mollick and Christian Terwiesch from the Wharton School of the University of Pennsylvania focuses on idea generation with GPT-4.

The team investigated how different prompting methods can influence the diversity of ideas generated. Specifically, the goal was to develop new products for students that cost less than $50.

The researchers tested different prompting methods, including minimal prompts, prompts in which the AI model adopts different personas, and prompts in which the AI model applies creativity techniques from the existing literature.



The eight top groups of prompts tested. | Image: Meincke et al.

The diversity of ideas was measured using cosine similarity between pairs of generated ideas; a lower average similarity indicates a more diverse set. The comparison was made only within the generated set, not against existing ideas. The researchers also measured the number of unique ideas and the rate at which the idea space was exhausted.
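To illustrate the measurement, here is a minimal sketch of pairwise cosine similarity over idea embeddings. It assumes the ideas have already been converted to embedding vectors by some model; the toy vectors below are hypothetical stand-ins, not data from the study.

```python
from itertools import combinations
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mean_pairwise_similarity(embeddings):
    """Average cosine similarity over all pairs of ideas.

    Lower values mean the ideas point in more different directions,
    i.e. the set is more diverse.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)

# Toy embeddings standing in for real idea embeddings (hypothetical values).
ideas = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
print(round(mean_pairwise_similarity(ideas), 3))  # → 0.471
```

In practice the embeddings would come from a sentence-embedding model, but the diversity score itself is just this average over all idea pairs.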

The team found that the prompting methods differed in how diverse the resulting ideas were. "Chain-of-thought" (CoT) prompting, a long-established method, came out on top by a wide margin and nearly reached the diversity of one of the student groups used as a benchmark.

Image: Meincke et al.

This method was also the one that generated the most unique ideas. This suggests that CoT prompting can help to open up the idea space more effectively and generate a greater variety of possible solutions.

Many of the "advanced" prompts tested were inferior to a simple basic prompt. Only the step-by-step prompts showed a significant improvement in idea diversity. | Image: Meincke et al.

Getting to better AI ideas – step by step

CoT prompting asks the AI model to solve a task in multiple steps. You don't have to specify the steps yourself; simply asking the model to proceed step by step can improve the outcome.
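A quick sketch of what this looks like in practice: the same brainstorming task, once as a plain prompt and once with a step-by-step instruction appended. The wording here is illustrative, not the exact prompt used in the study, and the message format simply follows the common OpenAI-style chat layout.

```python
# The study's task: new products for college students costing under $50.
TASK = ("Generate 10 ideas for new products for college students "
        "that cost less than $50.")

# Baseline: just the task.
baseline_prompt = TASK

# CoT variant: no specific steps are given, the model is only told
# to work through the task step by step before answering.
cot_prompt = TASK + " Think about this step by step before listing your ideas."

def build_messages(prompt):
    """Shape a prompt as chat messages for an OpenAI-style chat API."""
    return [
        {"role": "system", "content": "You are a creative product designer."},
        {"role": "user", "content": prompt},
    ]

print(build_messages(cot_prompt)[1]["content"])
```

The only difference between the two variants is the trailing instruction; that small addition is what the study found to widen the idea space.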

It's not entirely clear why this works. One hypothesis is that the prompt steers the model toward the higher-quality, more analytical data in its training set.


Ethan Mollick has published a GPT for idea generation that follows the step-by-step principle, though it is not the prompt used in the study.

Another recent study showed that the length of reasoning steps in CoT prompts is directly related to the performance of language models in complex problem-solving tasks. This was true even when the longer prompt did not contain significant new information.
