Situation
Protobrand faced delays as researchers spent 8+ hours cleaning data per project, driving up costs and stretching timelines. This hurt both project acquisition and client satisfaction.
Task
Reduce researcher labor costs by adopting a two-pronged strategy for more efficient data-cleaning workflows.
Action
Built solutions to reduce labor-intensive tasks by integrating AI pre-cleaning, simplifying interfaces, and adding flexible data review tools.
Result
The project cut data-cleaning time by 44%, speeding delivery and lowering costs. Researchers adopted the tool quickly due to its user-friendly design.
Situation
Our market research firm is known for its premium, high-quality survey analysis, but the data-cleaning process was a bottleneck in delivering timely results to clients. On average, researchers were spending up to 8 hours cleaning datasets before analysis, which increased project costs and stretched timelines. To stay competitive, we needed a solution to reduce this labor-intensive step without compromising data quality. As the Product Manager, I took ownership of this challenge, driving a strategy to optimize and streamline the cleaning process for quicker, more cost-effective project turnaround.
Task
My primary goal was to reduce the time researchers spent cleaning data, ideally by at least 30%. This project involved creating a seamless, intuitive platform feature that incorporated AI for initial data quality assessment and gave researchers tools to finalize the cleaning efficiently. My responsibilities included the full lifecycle of this project, from ideation and prompt engineering to design, testing, and implementation.
Action
To address the problem, I designed a two-pronged solution that focused on reducing the cleaning load on researchers and making the remaining process fast and user-friendly.
1. Reducing Data to be Cleaned
AI-Driven Pre-Cleaning
I engineered prompts to help the AI evaluate data quality, terminating low-quality responses and flagging suspicious entries. Through numerous iterations, I tested each approach on real datasets, refining the prompts until the AI consistently provided accurate quality ratings. This step was crucial for building a solid foundation in pre-cleaning, allowing researchers to focus on the most relevant responses.
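To make the pre-cleaning step concrete, below is a minimal sketch of a prompt-driven quality check. The prompt wording, model name, score thresholds, and use of the OpenAI client are all illustrative assumptions, not the production implementation; the real prompts were iterated against actual datasets as described above.

```python
# Hypothetical sketch of an AI pre-cleaning pass: each survey response is
# scored against a quality rubric, then terminated, flagged, or passed.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

QUALITY_PROMPT = """You are reviewing open-ended survey responses for quality.
Rate the response from 1 (gibberish or off-topic) to 5 (thoughtful, on-topic).
Return JSON only: {{"score": <1-5>, "reason": "<one short sentence>"}}

Question: {question}
Response: {response}"""

def pre_clean(question: str, response: str) -> dict:
    """Score one response and map the score to terminate / flag / pass."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": QUALITY_PROMPT.format(question=question, response=response),
        }],
    )
    rating = json.loads(completion.choices[0].message.content)
    if rating["score"] <= 1:
        verdict = "terminate"   # drop clearly low-quality responses up front
    elif rating["score"] == 2:
        verdict = "flag"        # route suspicious entries to researcher review
    else:
        verdict = "pass"        # good enough to skip manual cleaning
    return {"verdict": verdict, **rating}
```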
Termination of Unsuccessful Methods
At the board’s suggestion, we initially included a keystroke analysis to measure data quality based on typing patterns. However, testing revealed that keystrokes had minimal correlation with data quality, despite repeated attempts to refine the model. By presenting clear, data-backed evidence of its ineffectiveness, I was able to convince stakeholders to discontinue this approach and reallocate resources.
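The evidence came from straightforward correlation testing. The sketch below shows the general shape of that analysis; the feature names, file, and data are hypothetical, and only the method mirrors what we ran.

```python
# Illustrative version of the keystroke check: correlate per-respondent typing
# features against researcher-assigned quality labels.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("keystroke_features.csv")  # hypothetical export of test data

for feature in ["mean_interkey_ms", "backspace_rate", "paste_events"]:
    r, p = pearsonr(df[feature], df["researcher_quality_score"])
    print(f"{feature}: r={r:+.2f} (p={p:.3f})")

# Near-zero correlations across features like these were the data-backed
# evidence that persuaded stakeholders to retire the keystroke approach.
```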
2. Streamlining the Manual Cleaning Process
Flexible Data Review Tools
For the data not filtered by AI, I created a tool allowing researchers to review and finalize data quickly within the platform, rather than downloading files for manual cleanup. Researchers could easily approve, reject, or flag responses based on AI recommendations, which provided guidance without restricting their judgment.
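Below is a minimal sketch of the review model behind this tool, with illustrative names rather than the production schema: the AI recommendation travels with each response as a default, and the researcher's decision always takes precedence.

```python
# Minimal sketch of the in-platform review model. The AI recommendation is
# guidance, never a constraint: the researcher's decision always wins.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Decision(str, Enum):
    APPROVE = "approve"
    REJECT = "reject"
    FLAG = "flag"

@dataclass
class ReviewItem:
    response_id: str
    text: str
    ai_recommendation: Decision                     # from the pre-cleaning pass
    researcher_decision: Optional[Decision] = None  # set during in-platform review

    def final_decision(self) -> Decision:
        # Default to the AI's suggestion only when the researcher hasn't ruled.
        return self.researcher_decision or self.ai_recommendation
```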
Feature Design Adjustments
In initial usability tests, I discovered that the backend-focused design (with benchmarks, scores, and statuses) was too complex for researchers, who needed a simpler interface. Based on this feedback, I redesigned the interface to remove unnecessary metrics, resulting in a more intuitive, streamlined process. This simplification paid off during rollout, significantly reducing training time.
Handling Varied Inputs
I collaborated closely with developers to build a versatile system capable of handling diverse response types: open-ended answers, images, keystroke data, length of interaction (LOI), and other metrics. This step required extensive mock testing to ensure consistent quality evaluation across the various data types respondents submitted on the platform, as sketched below.
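One way to picture the type-aware evaluation is a registry of per-type checkers, as in this hypothetical sketch; the handlers and thresholds are stand-ins, not the actual rules.

```python
# Hypothetical sketch of type-aware evaluation: each response type registers
# its own checker, so open ends, behavioral metrics like LOI, and other
# inputs are scored consistently.
from typing import Any, Callable, Dict

def check_open_end(text: str) -> bool:
    return len(text.split()) >= 3        # e.g. reject one-word open ends

def check_loi(seconds: float) -> bool:
    return seconds >= 120                # e.g. flag sub-two-minute completes

CHECKERS: Dict[str, Callable[[Any], bool]] = {
    "open_end": check_open_end,
    "loi": check_loi,
    # "image", "keystrokes", etc. would register their own checkers here
}

def passes_quality(response_type: str, value: Any) -> bool:
    checker = CHECKERS.get(response_type)
    return checker(value) if checker else True   # unknown types pass through
```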
Compared below are the before-and-after cleaning workflows, showing how the above changes combine into a more streamlined process.
Result
Compared below are the before-and-after entrant data flows, illustrating how the new workflow enabled a 40% decrease in uncleaned data.
The project achieved a 44%* reduction in data-cleaning time, exceeding our initial goal and enabling faster project delivery and lower costs. Due to the user-friendly design and intuitive features, researchers quickly adopted the new tool, with minimal training required. The rollout went smoothly compared to previous features, thanks to the iterative usability testing and simplification efforts. Researchers found the tool not only efficient but essential, making it a valuable addition to our product offering.
*This project was recently deployed; the time reduction is based on a small sample size.
Key Skills
Prompt Engineering Expertise
Iterating on prompt design taught me how to fine-tune an AI's performance, measuring each version against real data to optimize quality and efficiency.
Stakeholder Alignment
Managing expectations and communicating data-backed results helped align stakeholders on resource allocation, especially in discontinuing keystroke analysis.
User-Centered Design Success
Simplifying the UI and focusing on real user needs led to a product that researchers adopted eagerly—proving the impact of usability testing and feedback loops.