The rapid, continued expansion of data systems, together with the exponential growth of data itself, is driving new use cases for advanced data analytics and data science applications. However, without adopting the principles and philosophy of #TrueDataOps, it will always be challenging to develop, test, and deploy data pipelines that deliver trusted, reliable data to analytics applications and machine learning models quickly and in line with business stakeholder requirements.
Posts by DataOps:
As the importance and value attributed to data grow exponentially, it has never been more critical to test, observe, and monitor the quality of the data used to build data products that drive strategic decision-making at an organizational level. To facilitate the development of the highest-quality data products, we recently announced our support for Soda SQL and Soda Cloud. Succinctly stated, Soda SQL has been fully integrated into our DataOps platform, with full support for Soda Cloud.
The cloud data warehouse, or data cloud, is growing in importance as more organizations recognize the value of data-driven insights as a foundation for, and critical part of, any decision-making process. As a result, it is essential to move raw structured, semi-structured, and unstructured data from its source to a centralized location (the cloud data warehouse), where it can be processed, transformed, modeled, and analyzed to derive meaningful insights.
We recently (14 July 2021) completed a masterclass with Kent Graziano, Chief Technical Evangelist, Snowflake, discussing Snowpark, the use of Scala and Java UDFs, and how we integrate this new technology into our DataOps platform. In particular, we discussed how we are using our Snowflake Object Lifecycle Engine to recycle these Snowpark objects through our DataOps platform via CI/CD pipelines and automated regression testing.
Recently, DataOps.live announced our support for Snowflake Java UDFs. This new Snowflake feature is another important step forward, especially when combined with the release of Snowpark (see our blog about this here).
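For context, a Snowflake Java UDF is declared in SQL with an inline Java handler. The function, class, and method names below are illustrative sketches, not taken from the announcement:

```sql
-- Hypothetical Snowflake Java UDF with an inline handler class.
CREATE OR REPLACE FUNCTION echo_varchar(s VARCHAR)
  RETURNS VARCHAR
  LANGUAGE JAVA
  HANDLER = 'EchoHandler.echo'
  AS
$$
class EchoHandler {
    public static String echo(String s) {
        return s;
    }
}
$$;

-- Once created, the UDF is called like any SQL function:
-- SELECT echo_varchar('hello');
```

Because the UDF is defined as a database object in SQL, it can be version-controlled and deployed through CI/CD pipelines like any other schema object.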
Recently DataOps.live announced our support for Snowflake Snowpark.
Snowflake is known for its performance, scalability, and concurrency. Before Snowpark, users interacted with Snowflake predominantly through SQL. Now, customers will be able to execute more workflows entirely within Snowflake’s Data Cloud, without the need to manage additional processing systems.
FUTURE GRANTS and ALL TABLES in the Snowflake Data Cloud (and similar constructs in other databases) are necessary and powerful tools for manually administered databases. However, they have significant downsides in terms of flexibility, auditability, potential information bleeds, and the Principle of Least Privilege. A DataOps approach offers all the same convenience while addressing all of these limitations.
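As a concrete illustration (the database, schema, and role names are hypothetical), a future grant in Snowflake looks like this:

```sql
-- Grant SELECT on every table created in this schema from now on,
-- without granting each new table individually.
GRANT SELECT ON FUTURE TABLES IN SCHEMA analytics.reporting
  TO ROLE reporting_role;
```

The convenience is also the downside: every future table in the schema, including any sensitive one added later, becomes automatically readable by the role, which is exactly the kind of information bleed and Least Privilege violation described above.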
Following on from Imperative vs Declarative for Data, this blog post looks at different ways that the Imperative Approach can be implemented and gives an overview of how a basic Declarative Approach could work.
Let's now consider this in the context of Data and Databases. The most typical example of changing the state of a database is creating a table. We would all initially jump to something like:
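The "something like" here is presumably the familiar imperative DDL statement; the table and column names below are illustrative:

```sql
-- Imperative: a command that directly changes the state of the database.
CREATE TABLE customers (
    id    INTEGER,
    name  VARCHAR,
    email VARCHAR
);
```

Every subsequent change then needs a further imperative statement (an ALTER TABLE, for example), whereas a declarative approach would instead state the desired end state and leave it to tooling to work out which changes to apply.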
Thanks to everyone who attended the Technical Masterclass on CI/CD and DataOps for Snowflake two weeks ago, and to everyone who has watched the recording since.