Community blog | DataOps.live

Connecting to Snowflake from a Snowpark Container Services (SPCS) container

Written by Colin Bradford | Mar 6, 2024 12:19:13 PM

Many workloads running as containers in Snowpark Container Services (SPCS) will want to connect to a Snowflake warehouse to interact with data stored in Snowflake. You can use any of the supported Snowflake clients from within the container. However, there are several prerequisites for setting up this access:

  • Credentials (username/password or private key) must be injected into the container. Snowflake provides secrets for this purpose.
  • An external access integration is required to allow outbound network connections from the container. See Creating and using an external access integration | Snowflake Documentation.
  • Any network policy on the Snowflake account has to allow incoming connections from SPCS. The address ranges of SPCS are not yet well defined, so the allowed range has to be large.

To simplify access, Snowpark Container Services provides a token to the container in the file /snowflake/session/token that can be used for authentication. This token has many benefits:

  • The token is automatically provisioned. No user interaction is required, so users do not have to handle sensitive credential information.
  • The token can only be used within the container. If the token leaks, it cannot be used externally.
  • The token lifetime is 10 minutes, reducing the impact of a credential leak. The refresh is automatic and does not have to be configured or managed by a user.
  • The token supports internal, private connections to Snowflake using the connection parameter host. Using the host parameter forces the connection to stay internal to Snowflake. No data is sent to the internet to connect to Snowflake.
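
Because the token is short-lived and refreshed in place, a long-running service probably wants to read the file again each time it opens a new connection rather than caching the value at startup. The following is a minimal sketch of that idea. The helper name read_session_token, the whitespace stripping, and the empty-file check are illustrative assumptions, not part of any Snowflake API.

```python
from pathlib import Path

# Token location described above; SPCS refreshes this file automatically.
TOKEN_PATH = Path("/snowflake/session/token")


def read_session_token(path: Path = TOKEN_PATH) -> str:
    """Return the current token, re-reading the file on every call.

    Hypothetical helper: stripping surrounding whitespace is an assumption
    made for safety, not documented Snowflake behaviour.
    """
    token = path.read_text(encoding="utf-8").strip()
    if not token:
        raise RuntimeError(f"token file {path} is empty")
    return token
```

Calling the helper immediately before each new connection means a refreshed token is picked up without restarting the container.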

The Snowflake connection parameter host is particularly important. Using host means that the connection resolves to a private IP address and therefore does not connect to a public endpoint on the internet. Further, the private endpoint is not controlled by network policies, which avoids having to open up large IP address ranges in a network policy to allow access from SPCS containers to Snowflake.

Using a token instead of a username/password with a client library such as the Snowflake Connector for Python is straightforward. The sample code below opens a connection:

import os

import snowflake.connector


def get_login_token():
    # SPCS writes a short-lived OAuth token to this file and refreshes it automatically.
    with open('/snowflake/session/token', 'r') as f:
        return f.read()


conn = snowflake.connector.connect(
    host=os.getenv('SNOWFLAKE_HOST'),  # keeps the connection internal to Snowflake
    account=os.getenv('SNOWFLAKE_ACCOUNT'),
    token=get_login_token(),
    authenticator='oauth',
    database=os.getenv('SNOWFLAKE_DATABASE'),
    schema=os.getenv('SNOWFLAKE_SCHEMA'),
)


In addition to the token and host parameters discussed above, the connection uses the predefined environment variables SNOWFLAKE_ACCOUNT, SNOWFLAKE_DATABASE, and SNOWFLAKE_SCHEMA. The values default to the same account, database, and schema in which the container’s Snowpark Container Services image registry resides.
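
Since the sample relies on several environment variables being present, it can help to fail early with a clear message when one is missing, for example when the same image is run outside SPCS. The sketch below shows one way to do that. The helper name build_connection_params and the REQUIRED_VARS tuple are hypothetical; only the variable names and the connection parameters come from the sample above.

```python
import os
from typing import Mapping

# Variable names as used in the sample above.
REQUIRED_VARS = (
    "SNOWFLAKE_HOST",
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_DATABASE",
    "SNOWFLAKE_SCHEMA",
)


def build_connection_params(token: str, env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: collect connect() keyword arguments.

    Raises RuntimeError listing every required variable that is unset or empty.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["SNOWFLAKE_HOST"],
        "account": env["SNOWFLAKE_ACCOUNT"],
        "database": env["SNOWFLAKE_DATABASE"],
        "schema": env["SNOWFLAKE_SCHEMA"],
        "token": token,
        "authenticator": "oauth",
    }
```

The resulting dictionary could then be passed on as snowflake.connector.connect(**params), which keeps the validation separate from the connection call.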

Currently, the connection will have the same role as the one that created the service, but this may change in the future.
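
Because the role behaviour may change, a service that depends on specific privileges might want to check which role its session actually received. A minimal sketch, assuming a connection opened as in the sample above: the helper name current_role is hypothetical, and it uses only the standard cursor calls (cursor, execute, fetchone, close) together with the CURRENT_ROLE() SQL function.

```python
def current_role(conn) -> str:
    """Hypothetical helper: return the role of an open connection."""
    cur = conn.cursor()
    try:
        cur.execute("SELECT CURRENT_ROLE()")
        row = cur.fetchone()
        return row[0]
    finally:
        # Always release the cursor, even if the query fails.
        cur.close()
```

Logging the returned value at startup makes it easier to notice if the role assigned to the session ever differs from the one that created the service.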

In summary, using the Snowflake-provided token to authenticate a connection to Snowflake from an SPCS container makes configuration more straightforward and provides a more secure connection.