DataOps for Snowflake Cortex | Build Apps with Cortex ML and LLMs

Written by Thomas Steinborn, SVP Products | May 13, 2024

Snowflake Cortex is the latest in a rich set of Artificial Intelligence (AI) and Machine Learning (ML) capabilities built into Snowflake.

Cortex makes it easier than ever to build AI-powered applications directly inside Snowflake. But moving fast without testing, governance, and deployment discipline creates production risk. That’s where DataOps.live helps. 

Snowflake Cortex vs Snowpark ML: What’s the Difference?

Snowflake offers a variety of AI, ML, and LLM capabilities that support both the model development and application layers:

  • Cortex Search Service
  • Document AI
  • Snowflake Copilot
  • Snowpark ML
  • Snowpark ML Packages
  • Snowpark Model Registry
  • Snowpark Feature Store
  • Snowpark Container Services (with Nvidia GPUs)

Most data engineers are already familiar with Snowpark and Snowpark Container Services (SPCS).

Snowpark provides great flexibility in your choice of data science tasks. You can choose which Python packages and models to use, including libraries like Pandas.

SPCS gives more freedom and is highly effective for AI tasks when used with GPU compute pools. One common scenario is to train a model on SPCS and then use Snowpark ML to run predictions. This offers a good balance between cost and performance.

What is Snowflake Cortex adding to the mix? Snowflake Cortex provides serverless ML and LLM functions on top of your data in Snowflake. One example is Cortex Code, which replaces Snowflake Copilot and helps you write better Snowflake code more quickly.

Now, you can use Snowflake Cortex and DataOps.live to build data products rapidly, with the rigor required for enterprise AI, right in Snowflake.

What can you do with Cortex LLM functions and DataOps?

Snowflake Cortex LLM functions cover common use cases for text analytics and chatbots. No special Snowflake Cortex user role is required to use them, provided that the functions are available in your region and an admin has not revoked the default Cortex privileges.

If you’re still relying on manual processes for data validation and governance, reducing the friction of creating new data apps opens the door to quality issues and governance gaps. To build safely, you’ll need to operationalize DataOps in your Snowflake Cortex workflow.

Getting Started With DataOps for Snowflake Cortex LLMs

Let's start with text analytics against the transcriptions of all your meeting recordings. Snowflake Cortex LLMs give you summaries without any prompt engineering. A simple call to SNOWFLAKE.CORTEX.SUMMARIZE against your table of transcriptions is sufficient.
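As a minimal sketch of that call, the Python below builds the SQL statement and runs it through a session object. The table name MEETING_TRANSCRIPTS and the columns MEETING_ID and TRANSCRIPT are hypothetical placeholders, not names from this article, and the session is assumed to be a Snowpark session that you have already created.

```python
import re

# Accepts plain or qualified names such as DB.SCHEMA.TABLE; anything else is rejected.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}")


def safe_identifier(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_summarize_query(table: str, id_column: str, text_column: str) -> str:
    """Build a query that summarizes the text column of every row in the table."""
    table = safe_identifier(table)
    id_column = safe_identifier(id_column)
    text_column = safe_identifier(text_column)
    return (
        f"SELECT {id_column}, "
        f"SNOWFLAKE.CORTEX.SUMMARIZE({text_column}) AS SUMMARY "
        f"FROM {table}"
    )


def summarize_transcripts(
    session,
    table: str = "MEETING_TRANSCRIPTS",  # hypothetical table name
    id_column: str = "MEETING_ID",       # hypothetical column name
    text_column: str = "TRANSCRIPT",     # hypothetical column name
):
    """Run the query; `session` is assumed to be a snowflake.snowpark.Session."""
    query = build_summarize_query(table, id_column, text_column)
    return session.sql(query).collect()
```

The identifier check is a design choice for the sketch: table and column names cannot be passed as bind parameters, so they are validated before being placed in the statement.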

Within DataOps.live, you can develop a full Streamlit application that calls the native function from Python.
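A Streamlit page along these lines might look like the sketch below. It is not the application from this article: the table and column names are hypothetical, and it assumes the script runs as Streamlit in Snowflake, where get_active_session() returns the current Snowpark session.

```python
def rows_to_records(rows):
    """Convert (meeting_id, summary) rows into dicts that st.dataframe can display."""
    return [{"Meeting": row[0], "Summary": row[1]} for row in rows]


def render_app():
    """Render the summary page; meant to run inside Streamlit in Snowflake."""
    # Imported lazily so the helper above can be used without these packages.
    import streamlit as st
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()
    # MEETING_TRANSCRIPTS, MEETING_ID, and TRANSCRIPT are hypothetical names.
    rows = session.sql(
        "SELECT MEETING_ID, SNOWFLAKE.CORTEX.SUMMARIZE(TRANSCRIPT) AS SUMMARY "
        "FROM MEETING_TRANSCRIPTS"
    ).collect()

    records = rows_to_records(rows)
    st.title("Meeting summaries")
    st.dataframe(records)
    # Let the user pick a recording to analyze next.
    st.selectbox("Choose a meeting", [record["Meeting"] for record in records])
```

To use it, call render_app() as the last line of the Streamlit script.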

The final result can be a rich user experience built with DataOps.live and deployed as Streamlit in Snowflake. Based on the summary, you can then choose a meeting recording and analyze it.

Building and Deploying an App With Snowflake Cortex LLMs

Once you find a recording that interests you, start interacting with it.

Let’s create a chatbot to query the full transcription in natural language. The SNOWFLAKE.CORTEX.COMPLETE function is the right choice here: it passes your input as a prompt to a Large Language Model (LLM).

Snowflake offers a choice of LLMs from:

  • OpenAI
  • Anthropic
  • Meta
  • Mistral AI
  • DeepSeek

In addition, you can use Snowflake's own Arctic model. Choose the one best suited to your use case. For our example, Mistral gave the best results.
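The chatbot call can be sketched as follows. The prompt wording and the model name "mistral-large" are assumptions chosen for illustration, so check which models are available in your region. The sketch also assumes a Snowpark version whose Session.sql accepts qmark bind parameters through params.

```python
def build_prompt(transcript: str, question: str) -> str:
    """Combine the transcript and the user's question into one prompt."""
    return (
        "Answer the question using only the meeting transcript below.\n"
        f"Transcript:\n{transcript}\n\n"
        f"Question: {question}"
    )


def ask_transcript(session, transcript: str, question: str,
                   model: str = "mistral-large") -> str:
    """Send the prompt to SNOWFLAKE.CORTEX.COMPLETE and return the answer text.

    `session` is assumed to be a snowflake.snowpark.Session. The model name
    default is an assumption, not a recommendation from the article.
    """
    rows = session.sql(
        "SELECT SNOWFLAKE.CORTEX.COMPLETE(?, ?) AS ANSWER",
        params=[model, build_prompt(transcript, question)],
    ).collect()
    # One row with one column: the generated answer.
    return rows[0][0]
```

Binding the model and prompt as parameters avoids quoting problems when a transcript contains apostrophes or other SQL-significant characters.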

Build and deploy your chatbot with DataOps.live and provide a fully immersive experience to your users.

What can you do with Snowflake Cortex ML functions and DataOps?  

Cortex ML functions work on top of your Snowflake data to give you powerful prediction and analysis insights, for example, time-series forecasting or anomaly detection. Time-series forecasting employs a machine learning algorithm to predict future data using historical time series data. Anomaly detection identifies outliers in data.

DataOps for Snowflake Cortex ML Functions

When you want to use time-series forecasts, you can use the Snowflake Object Lifecycle Engine (SOLE) to create your data tables, run your data pipeline to ingest the necessary data, and then launch into our development environment, DataOps.live Develop.

You can explore the underlying Snowflake data directly in our browser-based IDE. We will use a Jupyter Notebook to connect to Snowflake. Then, DataOps.live will run a Pandas query with Snowpark on your data table.
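A notebook cell for that step might look like the sketch below. The SNOWFLAKE_* environment variable names are assumptions made for this example, and the table name passed in is up to you. The Snowpark calls used are Session.builder.configs(...).create(), session.table(...), and to_pandas().

```python
import os

# Connection keys understood by Snowpark's Session.builder.configs().
_KEYS = ("account", "user", "password", "role", "warehouse", "database", "schema")


def connection_parameters(env=os.environ) -> dict:
    """Collect connection settings from SNOWFLAKE_* variables (assumed names)."""
    params = {}
    for key in _KEYS:
        value = env.get(f"SNOWFLAKE_{key.upper()}")
        if value:
            params[key] = value
    return params


def load_table_as_pandas(table_name: str, env=os.environ):
    """Open a Snowpark session, read one table, and return it as a Pandas DataFrame."""
    # Imported lazily so the helper above works without Snowpark installed.
    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters(env)).create()
    try:
        return session.table(table_name).to_pandas()
    finally:
        session.close()
```

From there, the returned DataFrame can be explored and plotted with the usual Pandas tooling.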

Creating and Calling New Cortex ML Functions

Once you have reviewed the data, plotted it, and found the interesting data pattern, you can prototype the forecast and visualize the upper and lower bounds as well as the expected forecast for the next few months.

Next, you can create a Snowflake view for your training data. Once done, you can create your SNOWFLAKE.ML.FORECAST model, here named my_forecast_model. Later, you can use that model in standard SQL with CALL my_forecast_model!forecast.
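The two statements can be sketched as Python helpers that return the SQL text. The view name and column names in the usage note are hypothetical, and the sketch assumes the training view is passed by reference with SYSTEM$REFERENCE.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}")


def _check(name: str) -> str:
    """Reject anything that is not a plain or qualified identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def create_forecast_model_sql(model_name: str, view_name: str,
                              timestamp_col: str, target_col: str) -> str:
    """Build the statement that trains a forecast model on a view."""
    return (
        f"CREATE OR REPLACE SNOWFLAKE.ML.FORECAST {_check(model_name)}(\n"
        f"  INPUT_DATA => SYSTEM$REFERENCE('VIEW', '{_check(view_name)}'),\n"
        f"  TIMESTAMP_COLNAME => '{_check(timestamp_col)}',\n"
        f"  TARGET_COLNAME => '{_check(target_col)}'\n"
        ")"
    )


def call_forecast_sql(model_name: str, periods: int) -> str:
    """Build the statement that forecasts the next `periods` time steps."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    return f"CALL {_check(model_name)}!FORECAST(FORECASTING_PERIODS => {int(periods)})"
```

For example, create_forecast_model_sql("my_forecast_model", "my_training_view", "TS", "SALES") followed by call_forecast_sql("my_forecast_model", 3) yields statements you can run with session.sql(...).collect(). The forecast result includes the expected value together with the lower and upper bounds.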

Build reliable apps quickly with Snowflake Cortex LLMs and DataOps

What is Snowflake Cortex going to change for data engineers? That depends on how well you operationalize the testing, governance, and monitoring of the AI applications you use. Automating DataOps keeps your data safe in production by validating data at every step and applying policies consistently.

DataOps.live and Snowflake work together to give you a complete delivery model that embeds CI/CD, testing, governance, and observability into every data app’s lifecycle. Start your 30-day free trial of DataOps.live and start building confidently with Snowflake Cortex LLMs.