Community blog | DataOps.live

DataOps builds integrated governance solutions with Snowflake’s Data Governance Accelerated Program

Written by DataOps.live | Nov 16, 2021 8:00:00 AM

DataOps.live builds integrated governance solutions with Snowflake’s Data Governance Accelerated Program and new capabilities around Object Tagging, Access History, Row Access Policies, and Dynamic Data Masking policies.

Even when your data is all in one place, it still requires visibility, control, and security. In the Snowflake Data Cloud you get the governance and security capabilities required to know your data, comply with regulatory mandates, and collaborate with confidence inside your organization and beyond. Snowflake describes three key pillars for governing your data:

  • Know what is in the data, where it is, and who is accessing it. 
  • Protect your data by controlling access to your data based on the user’s context and the sensitivity of the data. 
  • Unlock the full potential of your data by making it easier to share it and collaborate on it in a secure, governed manner.

DataOps.live has announced an expanded partnership with Snowflake, the Cloud Data Platform, with immediate support for the latest Snowflake governance and security capabilities required to know your data, comply with regulatory mandates, and collaborate with confidence inside your organization and beyond.

This includes all the Snowflake features below, whether they are released, in public preview, or in private preview.

 

Snowflake Governance  

Governance Accelerated is a new program for partners who integrate with Snowflake’s governance capabilities. 

Governance is a core pillar of Snowflake and also one of the core pillars of DataOps. Governance on Snowflake is broadly split into three areas:

  • Creation of the Governance Objects (e.g. tags, masking policies, row access policies) and making sure the right groups of people have permissions to use them 
  • Applying these Governance Objects to the actual objects to be governed (tables and views in most cases, but tagging goes far wider). This also includes all of the GRANT management on every object
  • Observability and Reporting, to report independently on everything that is going on. In order to achieve great governance, just doing the right thing isn’t enough: you also need to be able to prove you did the right thing at a point in time

Let’s look at each of these areas in turn.

Creation & lifecycle management of Governance Objects 

There is now a new set of governance objects in Snowflake to be managed. These include:

  • Tags enable data stewards to track sensitive data for compliance, discovery, protection, and resource usage use cases through either a centralized or decentralized data governance management approach. A tag is a schema-level object that can be associated with another Snowflake object. A tag can be assigned an arbitrary string value when it is associated with a Snowflake object. Snowflake stores the tag and its string value as a key-value pair in the form key = 'value'. For example, in cost_center = 'sales', cost_center is the tag and 'sales' is the string value. The tag must be unique within your schema and the tag value is always a string. The maximum number of characters for the tag value is 256. The maximum number of tags that can be set on a single object is 20.
  • Row access policies implement row-level security by determining which rows are visible in the query result. A row access policy is a schema-level object that determines whether a given row in a table or view can be viewed. Snowflake’s row access policies simplify data governance and improve organizations’ security posture by eliminating the need for data silos for different groups of users. Row access policies allow you to consolidate data by controlling access dynamically based on user authorization.
  • Masking policies are part of Dynamic Data Masking, a Column-level Security feature that selectively masks plain-text data in table and view columns at query time. In Snowflake, masking policies are schema-level objects, which means a database and schema must exist in Snowflake before a masking policy can be applied to a column. Currently, Snowflake supports using Dynamic Data Masking on tables and views. At query runtime, the masking policy is applied to the column at every location where the column appears. Depending on the masking policy conditions, the SQL execution context, and role hierarchy, Snowflake query operators may see the plain-text value, a partially masked value, or a fully masked value.
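
To make the three object types above a little more concrete, here is a minimal Python sketch that renders the Snowflake DDL for a tag, a row access policy, and a masking policy. The helper functions, the object names (cost_center, sales_region_policy, email_mask), and the role names are all hypothetical and chosen purely for illustration. The SQL shapes follow Snowflake's CREATE TAG, CREATE ROW ACCESS POLICY, and CREATE MASKING POLICY statements, but please check them against the current Snowflake documentation before relying on them.

```python
def create_tag(name: str) -> str:
    """Render DDL for a tag (a schema-level object)."""
    return f"CREATE TAG IF NOT EXISTS {name};"


def create_row_access_policy(name: str, arg: str, arg_type: str, predicate: str) -> str:
    """Render DDL for a row access policy; the predicate decides row visibility."""
    return (
        f"CREATE ROW ACCESS POLICY IF NOT EXISTS {name} "
        f"AS ({arg} {arg_type}) RETURNS BOOLEAN -> {predicate};"
    )


def create_masking_policy(name: str, data_type: str, expression: str) -> str:
    """Render DDL for a masking policy over a single column value named val."""
    return (
        f"CREATE MASKING POLICY IF NOT EXISTS {name} "
        f"AS (val {data_type}) RETURNS {data_type} -> {expression};"
    )


# Hypothetical object, role, and column names, for illustration only.
statements = [
    create_tag("cost_center"),
    create_row_access_policy(
        "sales_region_policy", "region", "VARCHAR",
        "CURRENT_ROLE() = 'SALES_ADMIN' OR region = 'EMEA'",
    ),
    create_masking_policy(
        "email_mask", "STRING",
        "CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN val ELSE '*********' END",
    ),
]

for statement in statements:
    print(statement)
```

The sketch only builds strings; executing them against Snowflake, and granting the right roles permission to use each object, is left out.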

All these governance objects need to be managed (created, updated, removed, and granted to different roles) as they change over time and in line with organizational policies.

DataOps.live provides full declarative lifecycle management for all of these objects as part of its Snowflake Object Lifecycle Engine.
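
As a rough illustration of what "declarative" means here, the sketch below compares a desired set of governance objects with what currently exists and works out what to create, update, or drop. This is not the Snowflake Object Lifecycle Engine or its configuration format. The plan_changes function, the dictionary layout, and the object names are assumptions made up for this example.

```python
from typing import Dict, List, Tuple


def plan_changes(desired: Dict[str, str], current: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compare desired and current definitions and return (action, name) pairs.

    Both arguments map an object name to its definition text. This is a
    hypothetical model, not the format used by any real tool.
    """
    plan: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in current:
            plan.append(("create", name))
        elif desired[name] != current[name]:
            plan.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            plan.append(("drop", name))
    return plan


# Hypothetical definitions: what the configuration asks for vs. what exists today.
desired = {"cost_center": "tag", "email_mask": "mask all but ANALYST"}
current = {"email_mask": "mask all but ADMIN", "old_policy": "unused"}

plan = plan_changes(desired, current)
print(plan)
```

The point of the design is that you describe the end state once and let the tooling derive the individual changes, rather than hand-writing each change yourself.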

Applying these Governance Objects to the actual objects to be governed, including grant management on every object 

Once these policies have been defined, they need to be applied to specific objects within a database. Typically, these are tables and views and the attributes thereof. But tagging is a much wider capability, as tags can be applied to almost any object within Snowflake and each tag can have any number of unique string values.

Programmatically applying and reapplying tags to objects is critical, as a manual approach is hard to manage, fraught with maintainability issues, and carries the risk that something will go wrong at some point. As these are governance and data security related objects, getting something wrong could result in data security breaches.

DataOps.live provides full declarative control of the application of these policy objects and tags to specific data objects (tables and views), and the assignment of specific values as key-value pairs.
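
To show what applying these objects programmatically might look like, here is a small Python sketch that turns a mapping of tags into ALTER TABLE statements and renders the statements that attach a masking policy to a column and a row access policy to a table. The function names and the table, column, and policy names are hypothetical. The two limits are the ones quoted earlier in this post, so treat them as assumptions that may change over time.

```python
from typing import Dict, List

MAX_TAG_VALUE_LENGTH = 256  # limit quoted in this post; may differ today
MAX_TAGS_PER_OBJECT = 20    # limit quoted in this post; may differ today


def set_tag_statements(table: str, tags: Dict[str, str]) -> List[str]:
    """Render one ALTER TABLE ... SET TAG statement per tag, after validation."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(f"{table}: more than {MAX_TAGS_PER_OBJECT} tags")
    statements = []
    for tag, value in sorted(tags.items()):
        if len(value) > MAX_TAG_VALUE_LENGTH:
            raise ValueError(f"{table}: value for tag {tag} is too long")
        escaped = value.replace("'", "''")  # escape single quotes for SQL
        statements.append(f"ALTER TABLE {table} SET TAG {tag} = '{escaped}';")
    return statements


def set_masking_policy_statement(table: str, column: str, policy: str) -> str:
    """Render the statement that attaches a masking policy to a column."""
    return f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy};"


def add_row_access_policy_statement(table: str, policy: str, columns: List[str]) -> str:
    """Render the statement that attaches a row access policy to a table."""
    return f"ALTER TABLE {table} ADD ROW ACCESS POLICY {policy} ON ({', '.join(columns)});"


# Hypothetical table, column, and policy names.
for line in set_tag_statements("sales.orders", {"cost_center": "sales", "pii": "email"}):
    print(line)
print(set_masking_policy_statement("sales.orders", "email", "email_mask"))
print(add_row_access_policy_statement("sales.orders", "sales_region_policy", ["region"]))
```

Because the statements are generated from data, the same mapping can be reapplied on every run, which is what makes the approach repeatable.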

 
Observability & reporting to independently report everything that is going on 

In order to achieve great governance, just doing the right thing isn’t enough: you also need to be able to prove you did the right thing at EVERY point in time. Proving this at EVERY point in time is essential not only to full governance and auditability, but also to creating the organizational confidence needed to build and change the data platform in a faster and more agile way.

In DataOps we build observability and reporting into every run so that we can look at these reports again and again for every point in time a pipeline has run. 

We have pre-built a variety of governance reports that run every time a pipeline is run and can be accessed alongside the pipeline view.  This list is customizable and extendable. 

Examples of governance and observability reports include: 

Monitoring & Reporting of Tag Usage 

This report shows which tags are applied to which objects in the current environment. 

Monitoring & Reporting of Policy Usage 

This report shows which masking policies are applied to which objects in the current environment. 

Monitoring & Reporting of Snowflake Access Time 

This report uses the Snowflake Access History to show which Tables and Views have been accessed the greatest number of times and also for the largest duration of time.
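
As a simplified illustration of the counting half of such a report (not the report's actual implementation), the Python sketch below tallies how often each table or view appears in rows shaped like Snowflake's ACCESS_HISTORY view, where DIRECT_OBJECTS_ACCESSED holds a JSON array of accessed objects. The sample rows, user names, and object names are invented. Durations are left out because, as far as we are aware, they would need to be joined in from query history.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def count_object_accesses(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count how many queries touched each table or view."""
    counts: Counter = Counter()
    for row in rows:
        for obj in json.loads(row["DIRECT_OBJECTS_ACCESSED"]):
            # objectDomain values such as "Table" and "View" are assumed here.
            if obj.get("objectDomain") in ("Table", "View"):
                counts[obj["objectName"]] += 1
    return counts


# Invented sample rows standing in for the result of querying access history.
rows = [
    {
        "USER_NAME": "FRED",
        "DIRECT_OBJECTS_ACCESSED": json.dumps(
            [{"objectDomain": "Table", "objectName": "SALES.PUBLIC.ORDERS"}]
        ),
    },
    {
        "USER_NAME": "MARY",
        "DIRECT_OBJECTS_ACCESSED": json.dumps(
            [
                {"objectDomain": "Table", "objectName": "SALES.PUBLIC.ORDERS"},
                {"objectDomain": "View", "objectName": "SALES.PUBLIC.ORDERS_V"},
            ]
        ),
    },
]

counts = count_object_accesses(rows)
for name, total in counts.most_common():
    print(name, total)
```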

Monitoring & Reporting of Roles, Users & Objects 

This is a highly functional, interactive and powerful tool to explore the relationships between Users, Roles and Functional Objects. Snowflake has an extremely powerful system for managing users, roles, permissions and inheritance. However, from a Reporting and Observability perspective, it’s very important to be able to answer questions like:

  • What access does Fred have? 
  • How and by whom is this role used?
  • Who has access to this object in Snowflake, and through which Role(s) did they get that access?

Not just currently, but at any point in history. 
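
Questions like these come down to walking the graph of users, roles, and grants. The sketch below is a minimal Python illustration under simplifying assumptions, not the tool itself: the three dictionaries, the function names, and every user, role, and object name are hypothetical. A role granted to another role is modelled as the second role inheriting the first one's access.

```python
from typing import Dict, List, Set


def effective_roles(role: str, granted_roles: Dict[str, List[str]]) -> Set[str]:
    """Return the role plus every role it inherits through the hierarchy."""
    seen: Set[str] = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(granted_roles.get(current, []))
    return seen


def who_can_access(
    obj: str,
    user_roles: Dict[str, List[str]],
    granted_roles: Dict[str, List[str]],
    object_grants: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Map each user with access to the directly granted roles that provide it."""
    holders = set(object_grants.get(obj, []))
    result: Dict[str, List[str]] = {}
    for user, roles in user_roles.items():
        via = sorted(r for r in roles if effective_roles(r, granted_roles) & holders)
        if via:
            result[user] = via
    return result


# Hypothetical snapshot of grants.
user_roles = {"FRED": ["ANALYST"], "MARY": ["LOADER"]}
granted_roles = {"ANALYST": ["READER"]}       # READER is granted to ANALYST
object_grants = {"SALES.ORDERS": ["READER"]}  # READER can select from the table

access = who_can_access("SALES.ORDERS", user_roles, granted_roles, object_grants)
print(access)
```

Running the same functions against a snapshot of grants captured at each pipeline run is one way the historical version of these questions could be answered.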

 

Summary

Organizations, regardless of market or mission, need to manage data, minimize data risk, and meet data-focused regulatory compliance mandates. This means rules, universal agreement on rules, and rules enforcement: in other words, a data governance framework and strategy. The larger the organization, the (usually) larger the data sets, so larger organizations need to lock down even more stringent data governance rules than the average SMB.

At minimum, a data governance framework involves rules, regulations, and processes for data management, data security, data quality, data ownership, and data access.   

Without a data governance framework or strategy, organizations can run into issues with data quality (impacting decision making) and run afoul of regulatory requirements (which can be expensive and even catastrophic). An organization-wide commitment to building a data governance framework and investing in its implementation and adoption can mitigate risk and help ensure future business success. 

Snowflake is continually raising the bar and providing more and more functional capabilities to support data governance and data security. However, with more capabilities come more moving parts, and therefore more things (objects) to manage to ensure these rules, policies, and processes for data management, data security, data quality, data ownership, and data access are applied consistently and without mistakes. The management and maintenance of these objects has become as critical as, if not more critical than, the management of primary functional objects like tables and views.

DataOps.live is the first and only platform to provide full lifecycle management control over these objects and their application. Just as importantly, DataOps.live provides the observability and reporting capabilities to evidence and assure the business and governance teams which policies and rules were applied and at what point in time, and to enable those teams to forensically audit that information if required.

 
