Scaling Data Science and Experimentation: Lessons from Shuling Ding’s Journey at Tubi

Applied AI

Data Analytics

Data Engineering

Introduction

The streaming TV platform Tubi launched in 2014. Its revenue reached $150 million by 2019 and $900 million by 2023, while monthly active users grew from 20 million to 78 million over the same period (source: https://www.businessofapps.com/data/tubi-statistics/). We spoke with Shuling Ding, who led Tubi’s data science efforts during this period.

Shuling led the data science team through hypergrowth, navigating challenges in data infrastructure, collaboration with data engineering, and the development of a scalable A/B testing framework.

Scaling a data science team isn’t just about hiring great talent—it’s about building the right infrastructure, collaboration processes, and analytical frameworks that enable teams to drive business impact. This post explores key lessons from scaling a data science team and building an experimentation platform that empowers both technical and business teams.

Building a Scalable Data Infrastructure for Growth

When Shuling joined Tubi, the company was growing exponentially, and the data warehouse had to be migrated and restructured multiple times to keep up. Without a scalable data infrastructure, data science teams struggle with inefficiencies and inconsistent insights.

Key challenges included:

⦁ Fragmented data sources – Disparate datasets across content, ads, growth marketing, and product domains created integration difficulties.

⦁ Lack of data consistency – Changing schemas and metrics definitions made it difficult to track historical trends and measure performance effectively.

⦁ Engineering bottlenecks – An understaffed data engineering team forced data scientists to build proxy models to fill gaps, introducing potential bugs.

🚀 Solutions that worked at Tubi:

⦁ dbt for modular transformations – Standardized data modeling across teams, improving maintainability and reducing redundant work.

⦁ Databricks for scalable processing – Enabled machine learning workflows and seamless integration with existing analytics pipelines.

⦁ Self-service analytics for business teams – Empowered Content, Sales, Marketing and BizOps teams to access insights independently, reducing dependency on engineering and data teams.
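To make the “modular transformations” idea concrete, here is a minimal Python sketch of the pattern dbt encourages: small, named models that build on one another instead of one monolithic query. The table and field names (`stg_view_events`, `uid`, `ms`) are purely illustrative, not Tubi’s actual schema.

```python
# Hypothetical sketch of layered, reusable data models (the dbt pattern),
# expressed as plain Python functions over lists of dicts.

def stg_view_events(raw_events):
    """Staging model: normalize raw events into one consistent schema."""
    return [
        {"user_id": e["uid"], "minutes": e["ms"] / 60000}
        for e in raw_events
        if e.get("uid") is not None  # drop events with no user attached
    ]

def mart_minutes_per_user(staged):
    """Mart model: aggregate the staged layer for self-service consumers."""
    totals = {}
    for row in staged:
        totals[row["user_id"]] = totals.get(row["user_id"], 0.0) + row["minutes"]
    return totals
```

Because the staging layer owns cleaning and naming, any number of downstream marts can reuse it without re-implementing those rules, which is what keeps metric definitions consistent across teams.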

Aligning Data Science and Engineering for Efficiency

A common challenge in fast-growing companies is the misalignment between data science and data engineering. At Tubi, this misalignment resulted in:

⦁ Understaffed data engineering teams that couldn’t keep up with business demands.

⦁ Data scientists creating temporary fixes to cover data gaps, leading to inefficiencies and inconsistent data definitions.

⦁ Siloed workflows where engineering teams were isolated from direct business interactions, reducing their understanding of real business needs.

✅ Best practices for better collaboration:

⦁ Establish clear ownership and work on data requirements collaboratively. Data science teams often clarify the data requirements, working in partnership with data engineering to ensure accountability for data pipelines and modeling.

⦁ Embed data engineers into data science projects to improve collaboration and speed up development cycles.

⦁ Invest in automation and lineage tracking to reduce the need for manual data fixes and improve trust in data quality.

Scaling A/B Testing to Drive Business Decisions

A/B testing is a critical tool for data-driven decision-making, but scaling experimentation in a streaming business like Tubi presents unique challenges. Unlike SaaS or e-commerce platforms, where controlled experiments can run over long periods, streaming companies must analyze massive datasets in real-time to optimize recommendations, ad placements, and content decisions.

Key components of an effective A/B testing framework:

⦁ Randomization & event tracking – Ensures experiment methodology is statistically sound and unbiased.

⦁ Business-friendly UI for experiment analysis – Allows GTM teams to access and interpret results without needing SQL or engineering support.

⦁ Automated data pipelines – Uses dbt to transform experiment data efficiently, reducing the need for manual intervention.

⦁ Scalable storage and processing – Data infrastructure that can handle complex queries, machine learning workflows, and real-time analysis. This isn’t unique to A/B testing, but a scalable A/B testing environment requires a well-thought-out foundational data model: reusable building blocks that also allow separate, additional data models to be developed. The transformation layer (in this case, dbt) sees continuous activity as improvements are implemented.
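One common way to implement the randomization component above is deterministic hash-based bucketing: the same user always lands in the same variant, and salting the hash with the experiment name keeps assignments independent across experiments. This is a generic sketch of that technique, not Tubi’s actual implementation; the function and variant names are assumptions.

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")):
    """Deterministically assign a user to an experiment variant.

    Hashing the experiment name together with the user id gives a
    stable, unbiased split: no assignment table is needed, and each
    experiment gets its own independent randomization.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]
```

Event tracking then only needs to log `(user_id, experiment, variant)` once per exposure; the assignment itself can be recomputed anywhere from the same hash.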

Without a scalable experimentation infrastructure, teams rely on outdated spreadsheets and manual tracking, leading to misinterpretations and slow decision-making. Implementing an automated framework accelerates insights and improves confidence in A/B test results.
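The “automated framework” part of the analysis step can be as simple as a standard two-proportion z-test computed directly from experiment counts, so a UI can surface significance without anyone writing SQL. A minimal stdlib-only sketch (illustrative, not Tubi’s actual methodology):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a/conv_b: converted users in control/treatment
    n_a/n_b:       total users exposed in control/treatment
    Returns (z statistic, two-sided p-value).
    """
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))); two-sided tail probability:
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

For example, 100 conversions out of 1,000 control users versus 150 out of 1,000 treatment users yields a clearly significant lift; wiring this calculation into the pipeline is what replaces the manual spreadsheet step described above.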

Accelerating Growth with Data-Driven Experimentation

A well-designed experimentation platform benefits multiple teams across the organization:

💡 For Data Teams:

⦁ Standardized processes reduce time spent cleaning and structuring experiment data.

⦁ Improved reliability of tracking ensures accurate analysis and decision-making.

💡 For GTM & RevOps Teams:

⦁ Faster insights enable better campaign optimization and content strategy.

⦁ A user-friendly interface makes it easy to access results, reducing reliance on data teams.

🚀 Business impact of scalable experimentation:

⦁ Faster iteration on product and marketing strategies, leading to better business outcomes.

⦁ Increased confidence in data-driven decision-making across teams.

⦁ Reduced bottlenecks in running and analyzing experiments, allowing teams to scale their efforts efficiently.

By investing in scalable A/B testing infrastructure and aligning data teams effectively, companies can turn experimentation into a true growth driver.

Final Thoughts: The Ongoing Balance of Technology and Business Needs

Scaling data science and experimentation capabilities is not a one-time effort—it’s a continuous process of balancing technological resources with evolving business needs. As companies grow, their data infrastructure, engineering capacity, and analytical frameworks must adapt to new challenges and opportunities.

A rigid, one-size-fits-all approach can lead to inefficiencies, whether through over-investment in complex systems too early or underinvestment that slows down decision-making. The key is building adaptable systems—ones that scale with demand, remain cost-effective, and enable teams to make better decisions without unnecessary friction.

This adaptability depends not only on technology but also on the people driving it. A strong data team needs the right mix of technical expertise, problem-solving skills, and a willingness to learn new technologies as the landscape evolves. For managers, the challenge is to balance individual capabilities and interests with the demands of an ever-changing data environment, ensuring that both the team and the business continue to thrive.

These insights come from Shuling’s experience at Tubi, where she navigated the complexities of scaling data infrastructure, aligning teams, and implementing a robust experimentation framework. Her insights highlight the importance of adaptability, collaboration, and strategic investment in technology to keep pace with rapid growth.
