Client Background:

Our client, a global investment firm, wanted to develop a data-driven, automated, and innovative approach for identifying start-ups with talented founders as qualified leads or their investment operations.

Challenges

Our client wanted to integrate data from external job and company sources and the firm’s network. They wanted to operationalize the data - so the output would be more than reports and visualization, but a workflow product. Lastly, our client wanted to develop a robust platform to build on further and continuously improve over time.

Our Approach:

In line with our client’s technology priorities and existing data stack, we used a combination of modern data stack tools for this project.

The final output was a product that continuously retrieves input from operational teams, prepares the data for required calls from external data sources, retrieves the data, continues the transformation, and then exposes the data back to the operational and reporting layers.

Prefect performs the orchestration of these steps.

Operational Analytics - input from the ops group

  • We used Airtable as the operations team’s workflow tool and interface for the operations team. The operations team provides data sets inputs, adjusts parameters, and keeps information on work items updated.
  • Hightouch synchronizes data from Airtable to S3 - where Snowflake accesses it as an external table.

Data Extraction

  • Data is sourced from external sources via API calls to S3 buckets and then accessed within snowflake as external tables,

Transformation

  • A dbt project within Snowflake performs the data transformation.
  • The transformation happens in multiple steps where:
    1. Data is prepared for API calls that perform data enrichment
    2. The resulting data is stored in S3 folders and are queried as views by snowflake near real-time.
    3. Finally, data is further transformed for operational use and aggregated reporting.

Closing the loop: operationalizing the data

  • Updated enriched data for operational purposes is migrated back to Airtable vis Census.
  • Aggregated reporting and visualization are with Looker directly accessing snowflake.
  • The above steps are orchestrated with Prefect and run on predetermined schedules.

Impact

  • The automated one-click process increased the operational capability by a multiple.
  • The operational teams can entirely focus on the actual content of their work rather than data manipulation, cleaning, and report preparation.
  • Management gained visibility into the operational funnel and access to up-to-date data at the aggregate and detail levels.

Technologies Used

Prefect, Snowflake, dbt, looker, Airtable, Census, Hightouch