In the first of a blog series featuring data engineers talking about all things data and DataOps, we meet Martin Getov, Bulgaria-based CloudOps & DataOps Engineer.
I love data - it’s as simple as that. For many people, data engineering is a process of collecting data and providing it to people to make business decisions. For me, it’s all about revealing the knowledge that’s hidden in the data.
It’s like a jigsaw puzzle. You have all the pieces of the puzzle available, but you don’t see the whole picture until you put each piece in precisely the right place. Data engineering is the practice that puts together the pieces from the data puzzle in each organization.
And as with jigsaw puzzles, sometimes you have 20 pieces, and sometimes you have more complex challenges with more than 5,000 pieces. The difference with my work is that I’m not only doing this for entertainment, for the pleasure of completing a puzzle (although jigsaw puzzle competitions do exist). In our world, we’re focused on specific outcomes when we put the pieces together, on delivering new value, and on completing a puzzle in which the pieces themselves and the final outcome required can be constantly changing.
One of the biggest challenges for data engineers has always been time to deliver and, for businesses, time to market. The time needed to complete a 20-piece puzzle is obviously not the same as for a 5,000-piece one. But what if you have eight different 5,000-piece puzzles, and you need to figure out which pieces you need to complete each one while more puzzles arrive and more pieces are added to the pile? This is often the reality in the data world. At this point, you need automation, orchestration and testing to complete your puzzle in a timely fashion. You want to deliver fully-shaped data products in days or even hours, no matter the complexity. This is the magic of DataOps.live.
The complexity of doing data engineering successfully at scale is why we need #TrueDataOps best practices. Data engineering isn’t just about creating software to move or aggregate data. It’s driven by natural curiosity, and the essential questions are WHY we are collecting these data points and HOW we can deliver more value to our customers.
Data engineering is the intersection point between the engineering team and the business teams. And being effective in that role is more important than ever.
As data science pioneer Clive Humby said, way back in 2006, data is the new oil. In the same way that we use so many derivatives of oil products in our day-to-day lives, all recommendations, customizations, personalization and other client-oriented products are based on the data that organizations collect. As data engineers, we should be enabling our organizations to use and benefit from 100% of their data. As with so many different oil derivatives, we have different data products that companies want to use and exploit in different ways.
For example, in many organizations, the people taking strategic decisions are not necessarily the same people who are talking and interacting with customers on a day-to-day basis. This is why a data-driven approach is so important. #TrueDataOps makes it possible for us to deliver those new and higher-value data products faster and more effectively. It helps the people taking strategic decisions stay informed about what’s happening, and so make more timely, proactive and relevant decisions.
Each decision in a business should be based on data points: no gut feelings, no assumptions. One of the biggest mistakes a company can make is to have an idea and then look for data points to validate that idea as being the right one. These are not data-driven companies, and confirmation bias won’t get you very far. Data-driven means you look at the data first and, if you have created the right data products utilizing the right data points, the ideas and options will flow. You run a small Proof of Concept, validate it and - if the data shows it’s a good investment - continue to develop it. If an idea doesn’t fly, no harm done and no time wasted: it fails fast, and you look again at the data.
So, what are the key attributes for driving that culture, for successful data engineering? Top of the list, for me, are curiosity, analytical thinking, a combination of business knowledge and data knowledge, and having the right technology to make the most of all that. Like everything else in life, the best way to effect change – in this case, to develop a true data-driven culture - is to lead by example. And the DataOps.live platform enables you to do precisely that.
I’ve worked in this field for more than a decade, first as an ETL developer, then data modeling for different information systems, and eventually leading a team of data engineers. I’ve worked with many different tools and techniques. I struggled to find a single platform that brings together all the different pieces and requirements. But now I’ve found it, in DataOps.live, and I couldn’t be happier. Even better, I’m also helping to constantly improve it.
Join us at this year’s Snowflake BUILD ‘22, a two-day virtual event filled with technical product deep-dive sessions and hands-on labs on November 15th–16th.
DataOps.live was fortunate to be one of only five technology partners invited to present at this event. Our session on November 16th at 3:30pm EST, presented by our CTO, Guy Adams, and our talented dev cloud engineers, will showcase how we bring all the DevOps speed, automation and agility to the data world so you can build on Snowflake even more easily and quickly.
We will cover how to create a secure, sandboxed, rapid development experience; use a bullet-proof release process; and do a one-click rollback when things go wrong. You will get a deep dive into an optimized developer experience for management of objects within Snowflake, data ingestion, Snowflake ecosystem orchestration, data transformation, automated testing, Snowpark, and Streamlit, a recent Snowflake acquisition. Register for Snowflake BUILD here.
Based in Sofia, Bulgaria, Martin has been with DataOps.live since August 2022. He was previously a Data Platform Technical Lead at Financial Times Sofia. Martin has a degree in Information Systems from Sofia University St. Kliment Ohridski. Connect with him on LinkedIn: https://www.linkedin.com/in/martingetov/