Case Study: Building and maintaining a data pipeline and data warehouse for the enterprise


The amount of load that Reflective Data was able to lift from the shoulders of my team was unbelievable. Without their help, we’d still be building out the pipelines and would never have gotten to the level of advanced analysis and ML that we’re able to do now. They let us focus on what brings the most value to our business.

Melanie, Director of Data Analytics, Frankfurt

At Reflective Data, we’ve worked with companies big and small. This means we have seen all levels of maturity when it comes to the infrastructure and knowledge around data pipelines and data warehouses.

Some of the most challenging projects have been with enterprises that already have substantial infrastructure, legacy pipelines and, of course, opinions. Smaller businesses are just starting to adopt the concept of storing all of their data in a data warehouse, but many enterprises have been doing this for a decade!

The challenge

When many of the enterprises that we’ve worked with started building their data pipelines, they didn’t have tools like Airflow and BigQuery that we use and love today. This means the bulk of the pipeline was built in-house. Even the concept of cloud computing was in its early days, and most operations were kept on-premises.

The challenge with this kind of setup starts with understanding what already exists. In some cases, there is close to no documentation and the people who built the system are no longer with the company. This alone can take a month or so: mapping everything out, understanding the structure and creating the plan for moving forward.

Another challenge is getting everyone on the team on board. More often than not, there are people who value the work that has been put into the old system over the years so much that it blinds them to the obvious benefits of moving to a much more modern infrastructure.

The solution

When working with enterprises and legacy infrastructure, nothing happens overnight. Below are the phases of a typical project for moving an enterprise client onto a modern cloud-based data infrastructure.

Phase 1: understanding and mapping the existing situation

With most enterprises, it’s not just one team or system that depends on the data infrastructure. More often than not, this is the backbone of the entire business. This means we need to make sure we understand every aspect of the current system: where it gets the data, how that data is processed and what processes depend on it.
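
One way to record the result of this mapping is as a dependency graph. The sketch below is a minimal illustration, not Reflective Data’s actual tooling: the asset names and the helper functions downstream_of and migration_order are hypothetical. It shows how such a map could answer two questions: what breaks if a given source changes, and in what order assets can be moved.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical inventory: each asset maps to the assets it reads from.
DEPENDS_ON = {
    "crm_export": set(),
    "web_events": set(),
    "orders_db": set(),
    "customer_table": {"crm_export", "orders_db"},
    "sessions_table": {"web_events"},
    "revenue_report": {"customer_table", "orders_db"},
    "churn_model": {"customer_table", "sessions_table"},
}


def downstream_of(asset, depends_on):
    """Return every asset that directly or indirectly reads from `asset`."""
    affected = set()
    frontier = [asset]
    while frontier:
        current = frontier.pop()
        for name, parents in depends_on.items():
            if current in parents and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


def migration_order(depends_on):
    """Order assets so each one comes after everything it reads from."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(sorted(downstream_of("orders_db", DEPENDS_ON)))
    print(migration_order(DEPENDS_ON))
```

In this made-up inventory, a change to orders_db touches the customer table, the revenue report and the churn model, which is the kind of finding that shapes the plan in the next phase.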

Phase 2: planning the infrastructure

We do our best to work closely with all teams involved to make sure their needs are taken into account. This means a series of hands-on meetings where we learn about their use cases and the problems they’re having with the existing setup. The output of this phase is a clear plan for moving forward, including the tool stack and reporting mechanisms, refined over several feedback rounds so that everyone’s requirements are covered.

Phase 3: implementation

Depending on in-house knowledge, resources and other factors, a company can decide to implement the plan themselves and keep Reflective Data on as a consultant, or hire us to handle the technical execution as well. By far the most effective arrangement in our experience has been one where we do the bulk of the work while including a few technical people from the client’s side in every step of the process. In some cases, those people are hired specifically for this purpose.

Phase 4: monitoring, reporting and integrations

The whole point of having high-quality data is to make it actionable. Of course, we handle core integrations within the implementation phase, but in a sense, data infrastructure is a growing organism that needs constant attention. Reflective Data is here to build long-term relationships with its clients, ready to help whenever there’s a new data source to be added, a report to be built or a new team member to be trained.
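
As a small illustration of what ongoing monitoring can look like, here is a hedged sketch of a data freshness check. It assumes you can obtain a last-loaded timestamp for each table; the table names, the thresholds and the stale_tables function are hypothetical and only meant to show the idea.

```python
from datetime import datetime, timedelta, timezone


def stale_tables(last_loaded, max_age, default_max_age=timedelta(hours=24), now=None):
    """Return the names of tables whose latest load is older than allowed.

    last_loaded: dict of table name -> timezone-aware datetime of the last load
    max_age: dict of table name -> timedelta; tables not listed use default_max_age
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, loaded_at in last_loaded.items()
        if now - loaded_at > max_age.get(name, default_max_age)
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical load times; in practice these would come from the warehouse.
    last_loaded = {
        "orders": now - timedelta(hours=2),
        "sessions": now - timedelta(hours=30),
    }
    print(stale_tables(last_loaded, max_age={"orders": timedelta(hours=6)}, now=now))
```

A check like this could run on a schedule and feed an alert or a report, so that a silent failure in one source is noticed before it reaches the teams that depend on it.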

Conclusion

Moving away from a legacy data infrastructure is one of the best actions an enterprise can take towards being more data-driven, more effective in managing the infrastructure and, in all reality, keeping up with the competition.

We’ll end with another customer quote.

I guess you could say we were your average enterprise with LOTS of legacy data infrastructure that had been built over many years. This system was extremely complex and very expensive to maintain. When Reflective Data came in, they acted as true professionals, worked very closely with our IT team and came up with a plan that pleased everyone.

Today, having been on the new cloud-based data infrastructure for almost a year, I can say with all certainty that this project was a success. Not only are we saving tens of thousands of dollars every month on the infrastructure alone, but the number of hours it takes to maintain the system has also gone from hundreds down to ten or so. This has a huge impact on our business. Implementing anything new would’ve taken at least 6 months with the old system. Now it’s a matter of a week or two to get everything up and running.

Julien, VP of Marketing, Austin

It’s feedback like this that makes us love the work we do even more! Get in touch and learn how we can help you, too.

For more case studies, see here.
