Doug 'The Data Guy' Needham | Aug 16, 2022 10:32:15 AM | 3 min read

Powering Data Science: How DataOps Can Enrich Your Activities and Deliver More For the Business

#TrueDataOps is a valuable catalyst in helping organizations take their use of data science, artificial intelligence and machine learning to the next level, expanding the scope of analytics way beyond business intelligence.

Any company carrying out sophisticated analytics and data science is most likely a step ahead of its competition already. As the use of data science becomes more widespread, however, you can take further steps to ensure that the data science work meeting today’s business needs keeps evolving in scalable, predictable and stable ways.

In my experience, the more mature organizations in data science terms already use tools like Dataiku and DataRobot while the less mature ones do not. Those tools are about automating and streamlining stuff you’ve already been doing. On the other hand, I’ve been involved in projects that are about moving the organization itself into a place where it can start using and then develop data science to adapt to changing business requirements.

I use a number of levels to describe the different stages in an organization’s analytics evolution. You may recognize your own organization somewhere along the line.

Level 0

Level Zero, from a data science perspective, is about achieving some type of BI environment: an enriched platform of data that business analysts use to see, for example, key performance indicators, helping the business make better decisions.

Level 1

Level One takes things a step further: the process of ‘proving’ data science works for you. This is about asking ‘Does data science work for our business, and how would we apply the predictions that we gain?’, given you are not, say, a Facebook or a Google. You may hire one or more data scientists, you have a reasonable amount of data to play with, and you want to know what data science can deliver. As such, you can draw on basic Python tools like pandas, NumPy, SciPy or scikit-learn (sklearn), which should be enough to solve basic issues and provide fundamental use cases.

Level 2

Level Two is when things start getting sophisticated, with more technical people performing data science, and making it easier by building ‘pipelines’ to handle different stages of the process. These organizations will probably build their own Docker containers, with trained models built into the container; what goes into a production pipeline is the actual prediction work, with the results written out to a Snowflake table.
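The pipeline idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a DataOps.live or Snowflake API: each stage is a plain Python function, the ‘trained model’ is a hypothetical hard-coded scoring rule standing in for a model baked into a container image, and `write_predictions` appends to an in-memory list where a real pipeline would write to a Snowflake table. All names and sample rows are made up.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    customer_id: int
    churn_score: float


def extract():
    """Stage 1: fetch input rows (hypothetical hard-coded sample)."""
    return [
        {"customer_id": 1, "days_since_last_order": 5},
        {"customer_id": 2, "days_since_last_order": 90},
    ]


def build_features(rows):
    """Stage 2: derive the feature the model expects, capped at 1.0."""
    return [
        {"customer_id": r["customer_id"],
         "inactivity": min(r["days_since_last_order"] / 100, 1.0)}
        for r in rows
    ]


def predict(features):
    """Stage 3: score each row. Stand-in for a trained model shipped in the container."""
    return [Prediction(f["customer_id"], round(f["inactivity"], 2)) for f in features]


def write_predictions(predictions, table):
    """Stage 4: persist results. A list stands in for the target table."""
    table.extend(predictions)
    return len(predictions)


def run_pipeline(table):
    """Run the stages in order and return how many rows were written."""
    return write_predictions(predict(build_features(extract())), table)


if __name__ == "__main__":
    predictions_table = []
    written = run_pipeline(predictions_table)
    print(written, predictions_table)
```

Keeping each stage as its own function means any one of them can be swapped out, for example replacing the scoring rule with a real model, without touching the rest.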

Level 3

Level Three is when you want to go further still, and so look to the aforementioned tools like DataRobot and Dataiku. However, there is a way to include true data science at every level: DataOps. 

Conclusion

On my travels, I often come across data scientists doing predictions on data sets they’ve built themselves. The problem is, not all data scientists are strong in SQL. So on the one side, you have data sets and KPIs being used by analysts in BI, looking historically, and on the other side you have data scientists projecting forward, but the data sets are not exactly the same. This means the predictions being made, once they get feedback, don’t necessarily tie in with the reported numbers. Less DataOps, more data oops.
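One cheap guard against that kind of drift is a reconciliation check: compute the same KPI from the BI data set and from the data scientist’s data set, and flag any gap. The sketch below is an illustration only; the row layout, the `revenue` field, and the `reconcile` helper are hypothetical, and the sample rows are made up.

```python
def kpi_total(rows, field="revenue"):
    """Sum a numeric field across rows; this is the KPI both sides should agree on."""
    return sum(row[field] for row in rows)


def reconcile(bi_rows, ds_rows, field="revenue", tolerance=0.01):
    """Compare the KPI from both data sets and report whether they match within tolerance."""
    bi_value = kpi_total(bi_rows, field)
    ds_value = kpi_total(ds_rows, field)
    difference = ds_value - bi_value
    return {
        "bi": bi_value,
        "ds": ds_value,
        "difference": difference,
        "consistent": abs(difference) <= tolerance,
    }


if __name__ == "__main__":
    # Hypothetical sample rows: the data science extract silently dropped order 3.
    bi_rows = [{"order": 1, "revenue": 100.0}, {"order": 2, "revenue": 250.0},
               {"order": 3, "revenue": 75.0}]
    ds_rows = [{"order": 1, "revenue": 100.0}, {"order": 2, "revenue": 250.0}]
    print(reconcile(bi_rows, ds_rows))
```

A check like this only detects the mismatch; the more durable fix is to have both audiences read from the same governed data set.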

By contrast, a massive benefit of the #TrueDataOps approach is ensuring data is entirely consistent across the organization, managing and orchestrating that process on your behalf, which includes feeding into whatever data science tool you’re using, whether homegrown, R, basic Python tools, or more sophisticated approaches. #TrueDataOps provides that consistent pipeline from Level Zero BI through to Level Three full-on data science.

This means you gain consistency from a business reporting perspective as well as a predictive reporting perspective, which is extremely important. One thing missing from a lot of conversations around data science is ensuring your data science predictions are consistent with your BI reporting. This is a way to achieve that, and you don’t need to start from scratch. 

The #TrueDataOps concept ties it all together, bringing consistency of environment and visibility of data across production, development, QA, staging, test, and ultimately your data science models, providing full access to all the data that your people need. In short, you gain the power to see more, predict better, and ultimately take the business forward in smarter ways.

 

A hugely experienced Data Scientist and author of The Enrichment Game: A Story About Making Data More Powerful, Cincinnati-based Doug ‘The Data Guy’ Needham is a Senior Solutions Architect with DataOps.live. 

 


Doug 'The Data Guy' Needham

“The Data Guy” Needham started his career as a Marine Database Administrator supporting operational systems that spanned the globe in support of the Marine Corps missions. Since then, Doug has worked as a consultant, data engineer, and data architect for enterprises of all sizes. He is currently working as a data scientist tinkering with graphs and enrichment platforms – showing others how to get more meaning from data.
