Skip to content
DataOps.live Professional EditionNEW
Purpose-built environment for small data teams and dbt Core developers.
DataOps.live Enterprise Edition
DataOps.live is the leading provider of Snowflake environment management, end-to-end orchestration, CI/CD, automated testing & observability, and code management, wrapped in an elegant developer interface.
Spendview for Snowflake FREE

An inexpensive, quick and easy way to build beautiful responsive website pages without coding knowledge.


Pricing and Edition

See whats included in our Professional and Enterprise Editions.

Getting Started
Docs- New to DataOps.liveStart learning by doing. Create your first project and set up your DataOps execution environment.
Join the Community
Join the CommunityFind answers to your DataOps questions, collaborate with your peers, share your knowledge!
#TrueDataOps Podcast
#TrueDataOps PodcastWelcome to the #TrueDataOps podcast with your host Kent Graziano, The Data Warrior!
Academy
DataOps AcademyEnroll in the DataOps.live Academy to take advantage of training courses. These courses will help you make the most out of DataOps.live.
Resource Hub
On-Demand Resources: eBooks, White Papers, Videos, Webinars

Learning Resources
A collection of resources to support your learning journey.

Customer stories
Events
Connect with fellow professionals, expand your network, and gain knowledge from our esteemed product and industry experts.
#TrueDataOps.org
#TrueDataOps.Org#TrueDataOps is defined by seven key characteristics or pillars:
Blogs
Stay informed with the latest insights from the DataOps team and the vibrant DataOps Community through our engaging DataOps blog. Explore updates, news, and valuable content that keep you in the loop about the ever-evolving world of DataOps.
In The News

In The News

Stay up-to-date with the latest developments, press releases, and news.
About Us
About UsFounded in 2020 with a vision to enhance customer insights and value, our company has since developed technologies focused on DataOps.
Careers

Careers

Join the DataOps.live team today! We're looking for colleagues on our Sales, Marketing, Engineering, Product, and Support teams.
DataOps.liveApr 29, 2021 4:12:48 PM4 min read

FUTURE PERMISSION GRANTs: why they are very useful and why you shouldn't use them in a DataOps world

FUTURE GRANTS, ALL TABLES in the Snowflake Data Cloud (and similar constructs in other databases) are necessary and powerful tools for manually administered databases.  However, they have significant downsides in terms of flexibility, auditability, potential information bleeds, Principle of Least Privilege etc. In a DataOps approach all the same convenience is possible, but all of these limitations are addressed.

Most databases have concepts of GRANTing permissions to all existing objects in a container (Database, Schema etc). In a system like MySQL this can be done with wildcard GRANTS. These wildcards apply to all current, and all future objects that match the rules.

In Snowflake this is split into two parts, GRANT [permissions] on ALL TABLES/VIEWS in [container] for all objects that are currently there and grant [permissions] on future tables in [container]for all objects that don’t currently exist but will be added in the future.

In both cases there is similar principle i.e., “Set some permissions now that will apply to objects in the future”. Why is this so important? Pre-DataOps approaches virtually all databases are configured by hand, meaning that the action of creating a table/view is a manual one and therefore the process of granting permissions to see it would be similarly manual. This creates a high administrative burden on people, and a significant chance of permissions being forgotten. By essentially preloading these permissions, users will get to see new objects without the person creating them having to remember to grant explicit permissions i.e., these permissions are implicit permissions.

However, there are a few problems with this approach which make it incompatible with TrueDataOps principles.

Firstly, it very often violates the Principle of Least Privilegewe aren’t always granting just to those who need it, we’d often be granting to a wider group to ensure that at least those who need permissions get them.

Secondly, there are a number of activities that can apply to an object which mean that it may not be ‘safe’ through its entire life. Consider the very powerful Snowflake feature of Dynamic Masking. This is applied after a table/view is created. In the case of a VIEW, or a TABLE created via CTAS, a user given access via FUTURE GRANTS would have access to see this VIEW/TABLE the moment it was created, before the Dynamic Masking had been applied. Hopefully, this is done soon after, but from a security perspective, even a short period where users that should only see masked data see the raw data is totally unacceptable.

Thirdly, from an auditing and governance perspective, an implicit approach makes it very hard to determine/prove “who actually got to see what, when”with an explicit approach, managed under TrueDataOps principles, it’s far clearer to see who was allow to see what, and when these were actually applied.

Finally, a FUTURE GRANTs approach locks you into a very rigid structure for the future. For example, it becomes extremely hard to say “Actually I don’t want these specific users to have access to those view tables and views, but otherwise keep everything as it is”. When you need to move to a granularity below the schema level, it’s very hard to do.

Given our understanding of how important these approaches are in a manually administered database world, but the challenges they create, how can we solve this in a DataOps world?

In a DataOps world, all GRANTs and permission must be stored in the central source of truth and applied to the target platform via automated DataOps pipelines. This:

  • allows testing of GRANTs and permissions to be done in branches other than production
  • removes the possibility of manual mistakes. If the configuration has been changed in a development branch, peer reviewed and tested outside of production, we can be extremely confidence that, once in production, it will perform correctly. Given the importance of GRANTs and permissions we can be extremely confidence that, once in production, it will perform correctly. Given the importance of GRANTs and permissions (and the potential catastrophic impacts of them being misapplied due to human error) this is one of the most important elements that should be automated in a DataOps approach.

Furthermore, applying fine grained, explicit permissions to objects, doesn’t have to mean that there is any additional burden to users. For example, we can still define a high-level intent e.g.:

[my_database]:

[my_schema]:

apply_grants(role='ANALYST_RO_ROLE' privileges='SELECT')

Is still saying that I want my ANALYST_RO_ROLE to have SELECT permissions on everything in my_database.my_schema, but the pipeline execution will convert this to individual GRANT statements that can audited later. This also allows me to be more selective only when I need to be e.g.

[my_database]:

[my_schema]:

apply_grants(role='ANALYST_RO_ROLE' privileges='SELECT')

[my_table]:

apply_grants(role='FINANCE_RO_ROLE' privileges='SELECT')

Still keeps the broad and generalised definition for the ANALYST_RO_ROLE, but adds a much more fine grained permission for the FINANCE_RO_ROLE.

To address the information bleed potential race condition, in a DataOps pipeline you would ensure that the GRANTs were only applied AFTER the TABLE/VIEW was in it’s secured state. By performing explicit GRANTs, we can choose when to do these, as opposed to implicit GRANTs which apply whether we are ready or not.

RELATED ARTICLES