Situation
In market research, it is difficult to collect rich, high-quality data through online surveys, and in-depth interviews are too costly to run at scale.
Task
Improve the respondent interface to promote an effortless transfer of information, reducing fatigue and increasing data richness and quality without increasing cost.
Action
Built and tested voice response prototypes to
a) test whether voice responses outperform typed responses, and
b) determine which feature set performed best.
Result
Quality data: -15% bad data
Verbatim length: +32%
LOI: -17%
Satisfaction: +15%
Situation
In online market research, there is a tension between the efficiency and rigor of quantitative data and the illustrative, actionable insights of qualitative data. Many research firms have tried to combine the benefits of both; "insights of qualitative data at the scale of quantitative data" is a common claim in marketing briefs. However, online surveys have one main flaw: they are tiring. Working at the ground level, you quickly notice that respondents fatigue and look to complete the survey for their reward in the most efficient way possible. When it comes to open ends, this means bare answers that just meet the minimum word requirement.
Task
Our task was to turn this around. While most researchers focus on increasing incentives or finding higher-quality sample, our approach was to combat respondent fatigue directly with an engaging interface that promotes an effortless transfer of information. Removing these barriers is the mindset most market research firms brush over, because what could be easier than typing?
Given our recruitment cost restrictions, our goal became improving the respondent interface to promote an effortless transfer of information, reducing respondent fatigue and increasing data richness and quality at the same cost as a standard online survey.
Key KPIs:
Data quality (measured by verbatim length, bad data frequency, and qualitative assessment; see the sketch after this list)
Length of interview (LOI)
Survey satisfaction.
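To make these measures concrete, here is a minimal sketch of how they could be computed from raw response records. The field names, thresholds, and data are hypothetical placeholders, not Protobrand's actual pipeline.

```python
# Minimal sketch (hypothetical field names and data) of computing the key KPIs
# from a list of survey response records.
from statistics import mean

responses = [
    {"verbatim": "Calm and tired, wanting to be productive", "loi_minutes": 12.4, "satisfaction": 4},
    {"verbatim": "good", "loi_minutes": 6.1, "satisfaction": 3},
]

def is_bad_data(verbatim: str) -> bool:
    """Very rough bad-data flag: too short or a single repeated word."""
    words = verbatim.lower().split()
    return len(words) < 3 or len(set(words)) == 1

avg_verbatim_length = mean(len(r["verbatim"].split()) for r in responses)  # words per answer
bad_data_rate = mean(is_bad_data(r["verbatim"]) for r in responses)        # share of flagged answers
avg_loi = mean(r["loi_minutes"] for r in responses)                        # length of interview
avg_satisfaction = mean(r["satisfaction"] for r in responses)              # post-survey rating

print(f"Avg verbatim length: {avg_verbatim_length:.1f} words")
print(f"Bad data rate:       {bad_data_rate:.0%}")
print(f"Avg LOI:             {avg_loi:.1f} min")
print(f"Avg satisfaction:    {avg_satisfaction:.1f} / 5")
```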
Action
Exploratory phase
This problem was made clear to us by researchers. Protobrand has collected gigabytes of open-ended data over the years, all telling the same story: respondents don't like open-ended questions. They require effort to answer, a respondent's worst nightmare.
We started by identifying the pain points for respondents. It was clear that the length of interview (LOI) was the biggest predictor of survey satisfaction; this was already known from previous research across years of projects. The keystone open-ended exercises that Protobrand uses were then identified as a more precise area of friction.
Hypothesis
Respondents can read, think, and speak much faster than they can type, so why not let them? This was our hypothesis: responding by submitting a voice memo would be easier for respondents, and we expected it would also feel natural within our chat interface for our keystone exercises.
Secondary Research
A good deal of research has been done on whether voice input works for online surveys. Much of it discussed respondents' stated preference for typing. This makes sense: how often do people use the dictation button on their keyboard, for example? The consensus was that, between technical problems and respondent preference, voice input was not ready. However, that research had been published years earlier, and we anticipated that respondents' views on voice input might have softened since. With voice memos gaining popularity in messaging apps, we strove to ride this wave. Additionally, the context of our use case did not exactly match those in the published studies, so we decided to proceed with our own primary research. That said, this research did give us an important metric to watch: drop rate. Keeping the drop rate relatively unchanged would be crucial for both the cost and feasibility of recruitment.
""
Prototypes
With this in mind, we built three prototypes we could test along with a control, which was the standard typing input. These prototypes mixed features such as playback, live transcription, transcription editing and more.
We recognized the drawbacks of these prototypes. For one, they did not live in our chat interface, so the natural feel and familiarity were missing; this was mainly due to resource constraints. We expected drop rates and other KPIs to improve once the feature moved past the prototyping stage and into the chat interface.
The results of the prototype testing were generally positive. Most KPIs improved, with the exception of the drop rate, which we had partially expected. With this in mind, we set a ceiling of no more than a +15% drop rate, calculated by weighing the benefits of the tool against the costs of a higher drop rate.
We used a weighted average score across the KPIs to select treatment group 1, then built an MVP that ran in parallel with a real client project so we could A/B test the feature. This pushed the direction away from a transcription-style input and towards a voice-memo-style input, matching our hypothesis about the rising popularity of voice messages.
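The weighted average score itself was a simple roll-up of KPI movements per variant. Below is a minimal sketch of that kind of scoring; the weights, variant names, and deltas are hypothetical placeholders rather than the actual project figures.

```python
# Hypothetical sketch of the weighted-average scoring used to pick a prototype.
# Weights and per-variant deltas are illustrative, not the actual project data.
weights = {
    "verbatim_length": 0.35,  # longer answers weighted most heavily
    "bad_data": 0.20,
    "loi": 0.15,
    "satisfaction": 0.15,
    "drop_rate": 0.15,
}

# Percent change vs. the typing control for each treatment group (made-up numbers).
variants = {
    "treatment_1_voice_memo": {"verbatim_length": 30, "bad_data": -12, "loi": -15, "satisfaction": 10, "drop_rate": 12},
    "treatment_2_live_transcription": {"verbatim_length": 18, "bad_data": -8, "loi": -5, "satisfaction": 6, "drop_rate": 9},
    "treatment_3_transcript_editing": {"verbatim_length": 15, "bad_data": -10, "loi": 4, "satisfaction": 4, "drop_rate": 14},
}

# For these metrics a decrease is an improvement, so their sign is flipped before weighting.
lower_is_better = {"bad_data", "loi", "drop_rate"}

def score(deltas: dict) -> float:
    return sum(weights[k] * (-v if k in lower_is_better else v) for k, v in deltas.items())

for name, deltas in sorted(variants.items(), key=lambda kv: score(kv[1]), reverse=True):
    print(f"{name}: {score(deltas):+.1f}")
# The top-scoring variant is the one carried forward into the MVP.
```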
Result
As this is a recent development, we are still measuring the feature's effect on key KPIs; however, the initial results show improvements similar to those from the client A/B test (a sketch of the kind of comparison behind such figures follows the list below).
+32% length of response
-15% bad data
-17% length of interview
+15% satisfaction
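As context on how differences like these can be sanity-checked, here is a minimal sketch of a two-sample comparison of verbatim lengths between the typing control and the voice group. The word counts are made up, and this is not the actual analysis pipeline.

```python
# Hypothetical significance check: are voice verbatims longer than typed ones?
# Word counts below are illustrative stand-ins for an A/B test export.
from scipy import stats

word_counts_typed = [6, 9, 5, 12, 7, 8, 10, 6]
word_counts_voice = [14, 22, 9, 31, 18, 12, 25, 16]

# Welch's t-test: compares the group means without assuming equal variances.
t_stat, p_value = stats.ttest_ind(word_counts_voice, word_counts_typed, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```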
The most important improvement, and the largest, came in the qualitative assessment of the verbatims. Below is an example of typed verbatims compared with spoken verbatims from the same general prompt: "Tell me a story and describe your emotions."
*data changed to protect client and respondent privacy
BEFORE
"Calm and tired, wanting to be productive"
"Full and happy when I eat lunch"
AFTER
"I was with my husband and my three children. We were sitting around in the kitchen talking about the great day we had. We felt happy, we were having fun. We decided to play a game even. This time we had together was really special to me."
We saw this reflected in client feedback on these respond-by-voice projects. Below are some client testimonials:
""
While this project has been a success overall, some flaws were identified in the initial rollout. Technical difficulties for edge-case users caused some drop rates to increase more than expected. Our team worked quickly to fix these bugs so projects could resume. This taught us the importance of thorough testing and QA, which were implemented to a greater degree in later projects.