DataOps.live | Mar 10, 2022 10:53:48 AM | 6 min read

#TrueDataOps: ELT and the Spirit of ELT

By now, our clients probably know that before developing our DataOps for Snowflake data orchestration platform, we set out to define the principles we felt were critical to laying the proper foundations for a data platform that would change global enterprise data management.  

We returned to the DevOps philosophy because it has been battle-hardened in the software development industry for more than twenty years. DevOps has been successful since its inception. And it works.  

To quote the DataOps for Dummies book:  

DevOps is a “set of guiding principles that allow for agility while maintaining governance over the development and deployment of code.” 

And “DevOps has led to principles and tools to provide the ability to maintain configuration and code in repositories, check-in/check-out functionality, and… the ability to rollback code in the event of a failure.”   

We used this “set of guiding principles” as a foundation on which we built the truest form of DataOps (Data + Operations) known as #TrueDataOps.  

Because of exploding data volumes, one of the biggest challenges we initially faced was the need to balance governance and agility. Data must be governed and secured to protect it from unauthorized access. The long-accepted constraint is that the two trade off against each other: to increase agility, governance must decrease, and vice versa.  

However, this premise is not necessarily so. The good news is that, with the development of the #TrueDataOps philosophy, one of the promises we can make is that our data orchestration platform increases both governance and agility. Neither one must suffer because of the other.  

This set of articles aims to drill down into each of the seven #TrueDataOps pillars, one at a time, to understand why DataOps.live has so much value to add to every enterprise organization’s data ecosystem and the generation of data analytics products used to inform all strategic decisions.  

ELT and the new spirit of ELT  

 As highlighted throughout this article, we believe that to truly understand the role the #TrueDataOps philosophy plays in our company ethos, our product, and consequently the value we add to "Snowflake environment management, end-to-end orchestration, CI/CD, testing & observability, and code management," the following questions are valid and deserve a considered response:  

  • What is the difference between ETL and ELT?  
  • Why adopt ELT instead of ETL?  
  • Where does EtLT fit into the picture?  
  • What is the (new) spirit of ELT?  

Now that we have ringfenced these discussion points, let’s look at each one individually.  

1. What is the difference between ETL and ELT?

Both the DataOps for Dummies book and the TrueDataOps.org website provide a comprehensive description of the most important differences between ETL (extract, transform, load) and ELT (extract, load, transform).  

Note: Both these constructs perform the same function: to ingest data from multiple, disparate data sources (or data producers) and load it into a centralized data store or data cloud. The primary difference is in the HOW, or more precisely the WHEN: the point at which the data is transformed into useful information or usable datasets for data analytics, data science, and BI processes. Cloud platforms tend to favor ELT for cost and performance considerations.  

1.1. ETL  

ETL, or extract, transform, load, is the traditional way to move company-generated data from source to destination. The basic model functions as follows:  

  • The initial step is to extract the data from its source and place it in a staging area. This is the “E” in ETL.  
  • The second step is to transform (clean, process, and convert) this data into meaningful information, the “T” in ETL.   
  • Lastly, the transformed data is loaded into a data warehouse, data lakehouse, or data lake, the “L” in ETL.  

1.2. ELT 

ELT, or extract, load, transform, moves the transformation stage from the middle of the process to the end. Therefore, the steps in this model are as follows:  

  • As with the ETL model, the first step is to extract the data from its multiple, different data sources as raw data. 
  • The second step is to load or ingest the raw data into the data platform.  
  • Thirdly, the data is transformed and served to the business as valuable datasets or data analytics reports and dashboards. 
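
The two orderings above can be sketched side by side. This is a minimal illustration only, assuming an in-memory SQLite database stands in for the data platform; the table names, the sample rows, and the `run_etl` and `run_elt` helpers are hypothetical and not part of any DataOps.live product.

```python
import sqlite3

# Hypothetical source records; names and values are illustrative only.
SOURCE_ROWS = [
    ("2022-03-01", "north", "120.50"),
    ("2022-03-01", "south", "80.00"),
    ("2022-03-02", "north", "99.50"),
]


def extract():
    """E: pull records from the source as-is (amounts are still strings)."""
    return list(SOURCE_ROWS)


def run_etl(conn):
    """ETL: transform in application code, then load only the result."""
    totals = {}
    for _day, region, amount in extract():
        totals[region] = totals.get(region, 0.0) + float(amount)
    conn.execute("CREATE TABLE etl_sales_by_region (region TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO etl_sales_by_region VALUES (?, ?)", sorted(totals.items())
    )


def run_elt(conn):
    """ELT: load the raw rows first, then transform inside the platform."""
    conn.execute("CREATE TABLE raw_sales (day TEXT, region TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", extract())
    conn.execute(
        "CREATE TABLE elt_sales_by_region AS "
        "SELECT region, SUM(CAST(amount AS REAL)) AS total "
        "FROM raw_sales GROUP BY region"
    )


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)

etl_result = conn.execute(
    "SELECT region, total FROM etl_sales_by_region ORDER BY region"
).fetchall()
elt_result = conn.execute(
    "SELECT region, total FROM elt_sales_by_region ORDER BY region"
).fetchall()
# Only the ELT path still has the untransformed rows available afterwards.
raw_count = conn.execute("SELECT COUNT(*) FROM raw_sales").fetchone()[0]
```

Both paths produce the same regional totals. The difference is that the ELT path leaves a `raw_sales` table in the platform, while the ETL path has already discarded the per-day detail by the time anything is loaded.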

2. Why adopt ELT instead of ETL? 

Even though the differences between these two models might seem fairly insignificant, they are substantial in practice.  

The point at which the data is transformed plays a considerable role in the overall management of company-generated data. 

Why? 

Raw data is valuable and should never be deleted. When the data is transformed, it is permanently degraded in some form. Therefore, the data provenance and lineage are maintained by storing the raw data in the data platform and using it as a foundation for requested data products. This then allows for the continual creation of unique data insights. 

On the other hand, ETL transforms data before it is loaded into the data platform. This might seem beneficial, especially when considering the cost of storing massive volumes of structured, semi-structured, and unstructured data. Nonetheless, cloud data storage is inexpensive, which largely negates this benefit. 

3. Where does EtLT fit into the picture? 

Another seeming benefit of ETL is that sensitive data or PII covered by regulations such as GDPR and CCPA is masked or anonymized before being loaded into the data store.   

Therefore, the general recommendation seems to be to use ETL instead of ELT when working with sensitive data that must be masked.   

The challenge here is that the raw data is transformed (or deformed) before it reaches the cloud data storage point. However, #TrueDataOps posits that instead of using ETL to load this data into a data platform, EtLT must be used.  

Let’s turn once again to the DataOps for Dummies book for a description of EtLT:  

“In some cases, you can’t avoid regulations that require some data to be removed, encrypted, anonymized for privacy. In these cases, a small “t” is inserted into the ETL acronym (EtLT), signifying minimal, but required change.” 
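
As a rough sketch of what that small “t” could look like in practice, the example below pseudonymizes a single regulated column before the load and leaves every other field untouched for the later, larger transformation. It assumes a salted SHA-256 hash is an acceptable masking technique for the use case, which is an assumption and not a statement about any regulation; the `small_t` helper, the salt, the table, and the sample records are all hypothetical.

```python
import hashlib
import sqlite3

# Hypothetical extracted records containing PII (the email column).
EXTRACTED = [
    {"customer_email": "ana@example.com", "country": "DE", "order_total": "42.00"},
    {"customer_email": "bo@example.com", "country": "US", "order_total": "17.50"},
    {"customer_email": "ana@example.com", "country": "DE", "order_total": "8.00"},
]


def small_t(record, salt="demo-salt"):
    """The small "t": mask only the regulated field, change nothing else."""
    masked = dict(record)
    payload = (salt + record["customer_email"]).encode("utf-8")
    masked["customer_email"] = hashlib.sha256(payload).hexdigest()
    return masked


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_orders (customer_email TEXT, country TEXT, order_total TEXT)"
)
# L: load the minimally changed records into the platform.
conn.executemany(
    "INSERT INTO raw_orders VALUES (:customer_email, :country, :order_total)",
    [small_t(record) for record in EXTRACTED],
)

# T: the main transformation still happens after the load, inside the platform.
spend = conn.execute(
    "SELECT customer_email, SUM(CAST(order_total AS REAL)) AS total "
    "FROM raw_orders GROUP BY customer_email ORDER BY total DESC"
).fetchall()
stored_emails = [
    row[0] for row in conn.execute("SELECT customer_email FROM raw_orders")
]
```

Because the same email always maps to the same hash, orders can still be grouped per customer after the load, even though no plain-text address ever reaches the platform.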

4. What is the (new) spirit of ELT?  

The new spirit of ELT is fundamental to the #TrueDataOps philosophy. The truedataops.org website has the following to say about the spirit of ELT:  

“The Spirit of ELT takes the concept of ELT further and advocates that we avoid ALL actions which remove data that could be useful or valuable to someone later, including changes in data.” 

In other words, it is about maximizing the ability to derive future value from the data by pushing its transformation down the pipeline and making sure that any value that we might derive in the future is considered and not discarded. 
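
One way to picture this is a raw layer that is only ever appended to, with every data product defined as a view on top of it. The sketch below is illustrative only and assumes an in-memory SQLite database in place of a real data platform; the schema, the sample events, and the view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Raw layer: loaded as-is and never updated or deleted (hypothetical schema).
conn.execute(
    "CREATE TABLE raw_events (user_id TEXT, event TEXT, device TEXT, ts TEXT)"
)
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?, ?)",
    [
        ("u1", "purchase", "mobile", "2022-03-01T09:00:00"),
        ("u2", "purchase", "desktop", "2022-03-01T10:30:00"),
        ("u1", "refund", "mobile", "2022-03-02T11:15:00"),
        ("u3", "purchase", "mobile", "2022-03-02T12:00:00"),
    ],
)

# First data product: purchases per day, defined as a view so that
# nothing is removed from or changed in the raw layer.
conn.execute(
    "CREATE VIEW purchases_per_day AS "
    "SELECT substr(ts, 1, 10) AS day, COUNT(*) AS purchases "
    "FROM raw_events WHERE event = 'purchase' "
    "GROUP BY substr(ts, 1, 10)"
)

# A later question that nobody anticipated at load time. It is answerable
# only because the device column was kept instead of being dropped early.
conn.execute(
    "CREATE VIEW purchases_per_device AS "
    "SELECT device, COUNT(*) AS purchases "
    "FROM raw_events WHERE event = 'purchase' GROUP BY device"
)

per_day = conn.execute(
    "SELECT day, purchases FROM purchases_per_day ORDER BY day"
).fetchall()
per_device = conn.execute(
    "SELECT device, purchases FROM purchases_per_device ORDER BY device"
).fetchall()
raw_rows = conn.execute("SELECT COUNT(*) FROM raw_events").fetchone()[0]
```

Had the pipeline stored only the daily purchase counts, the per-device question could not be answered without going back to the source. Keeping the raw rows intact leaves that option open.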

Conclusion 

#TrueDataOps is critical to the success of DataOps for Snowflake. As described above, it forms the foundation upon which our data orchestration platform is built. And because ELT and the spirit of ELT is the first pillar of this philosophy, it provides the foundation for the other six pillars:  

  • Agility and CI/CD 
  • Component design and maintainability 
  • Environment management  
  • Governance, security, and change control 
  • Automated testing and monitoring 
  • Collaboration and self-service 

Lastly, simplicity is one of the principal goals of #TrueDataOps. This first pillar, ELT and the spirit of ELT, streamlines the potentially complex process of moving data from data producers to data consumers. It’s no wonder DataOps.live is a leader in the data management and orchestration lifecycle. 

 
