Snowflake Data Cloud - Insights from the field
My colleagues Marc an Voort and Pascal Pfäffle are experts who regularly share their knowledge and insights about Snowflake’s Data Cloud on LinkedIn and Medium.
But that is not all: they also recorded a series of Data Innovation Podcast episodes (in German) in which they talk about their experiences with Snowflake and about making it work in real projects. I have wrapped up the most important insights from my colleagues in this article.
“Das Snowflake Special - Teil 1” - About their favorite features
Insights from the podcast (first published in German on LinkedIn):
✅ Shared storage and distributed compute make it possible to access the same data with differently sized compute resources
✅ Elastic scalability of both storage and compute power through customizable warehouses (from XS to 6XL) and multi-cluster options
✅ “Time Travel” allows restoring data to an earlier point in time, for example after an accidental deletion (the “Oh-Shit Feature”)
✅ Snowpark enables the use of Python directly in Snowflake for data processing (see the sketch after this list)
✅ The Marketplace allows you to consume data from other companies or offer and monetize your own data products, which supports data democratization
✅ Internal marketplaces support the data mesh concept and data exchange within organizations
✅ Streamlit can be used to create your own applications on the Snowflake platform
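To make the Snowpark, warehouse scaling and Time Travel points above a bit more tangible, here is a minimal Python sketch using the Snowpark API. The connection parameters, the warehouse MY_WH and the table ORDERS are placeholder assumptions for illustration, and the Time Travel retention window depends on your edition and table settings.

```python
# Minimal Snowpark sketch - connection details and object names are placeholders.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# In practice the credentials would come from a config file or secret store.
session = Session.builder.configs({
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "role": "SYSADMIN",
    "warehouse": "MY_WH",
    "database": "MY_DB",
    "schema": "MY_SCHEMA",
}).create()

# Elastic compute: resize the virtual warehouse on demand (XS up to 6XL).
session.sql("ALTER WAREHOUSE MY_WH SET WAREHOUSE_SIZE = 'LARGE'").collect()

# Snowpark DataFrame API: the transformation is pushed down and executed inside Snowflake.
orders = session.table("ORDERS")
orders.filter(col("AMOUNT") > 1000).group_by("CUSTOMER_ID").count().show()

# Time Travel ("Oh-Shit Feature"): query the table as it looked one hour ago.
session.sql("SELECT * FROM ORDERS AT(OFFSET => -60*60)").show()
```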
But why is Snowflake now really interesting for modern data management?
✔️ No configuration needed - users can log in and start working on their workloads right away
✔️ Continuous updates and new features become available without long patch cycles - you can simply get started
✔️ High flexibility and freedom in scaling resources
“Das Snowflake Special - Teil 2” - About project experiences
Insights from the podcast (first published in German on LinkedIn):
✅ Use Snowflake as a centralized platform: There is value in using Snowflake as an end-to-end platform for data processing and even model execution to minimize the need for separate services
✅ Greenfield projects for optimal results: Greenfield projects developed specifically for Snowflake lead to better performance and cost efficiency in the long run compared to pure lift-and-shift migrations
✅ Refactoring after lift and shift is essential: To fully utilize Snowflake's potential, refactoring after a lift-and-shift migration is essential, as inefficient legacy code can otherwise increase costs
✅ Real-time processing through tasks and streams: Snowflake's tasks and streams enable near real-time data processing and transformation, which is especially beneficial for high data volumes (see the sketch after this list)
✅ Leveraging native integrations and pushdown: Using native connectors for ETL/ELT and BI tools as well as pushdown processing within Snowflake optimizes performance and scalability
✅ Consider integration of different data sources: Integrating data from different sources requires consideration of available tools such as the Snowflake Marketplace and specialized connectors
✅ Security and governance are seamless: Snowflake's security features such as Masking Policies and Row Level Security are seamlessly integrated with reporting tools and are based on user roles (see the governance sketch after this list)
✅ Use modern data engineering tools: Using modern tools like dbt in conjunction with Snowflake improves workflows and utilizes the platform more effectively
✅ Implement native Snowflake features early: Early implementation of native features such as Dynamic Tables can significantly improve performance
✅ Internal monitoring through dashboards: Snowflake's simple dashboarding capabilities enable monitoring of internal processes without external BI tools
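As referenced in the list above, here is a rough sketch of how Streams, Tasks and Dynamic Tables fit together for near real-time, incremental processing, issued through a Snowpark session. All object names (RAW_ORDERS, ORDERS_STREAM, ORDERS_CLEAN, ORDERS_BY_CUSTOMER, MY_WH) and the schedules are illustrative assumptions, not the setup discussed in the podcast.

```python
# Sketch: incremental, near real-time processing with Streams, Tasks and a Dynamic Table.
# 'session' is a Snowpark Session; all object names and schedules are illustrative.

# 1) A stream records the changes (CDC) arriving in the raw table.
session.sql("CREATE OR REPLACE STREAM ORDERS_STREAM ON TABLE RAW_ORDERS").collect()

# 2) A task periodically loads only the new changes into the cleaned table.
session.sql("""
    CREATE OR REPLACE TASK LOAD_ORDERS_CLEAN
      WAREHOUSE = MY_WH
      SCHEDULE  = '5 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
    AS
      INSERT INTO ORDERS_CLEAN
      SELECT ORDER_ID, CUSTOMER_ID, AMOUNT
      FROM ORDERS_STREAM
      WHERE METADATA$ACTION = 'INSERT'
""").collect()
session.sql("ALTER TASK LOAD_ORDERS_CLEAN RESUME").collect()

# 3) Alternatively, a Dynamic Table keeps an aggregate incrementally up to date.
session.sql("""
    CREATE OR REPLACE DYNAMIC TABLE ORDERS_BY_CUSTOMER
      TARGET_LAG = '5 minutes'
      WAREHOUSE  = MY_WH
    AS
      SELECT CUSTOMER_ID, SUM(AMOUNT) AS TOTAL_AMOUNT
      FROM ORDERS_CLEAN
      GROUP BY CUSTOMER_ID
""").collect()
```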
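And for the security and governance point, a hedged sketch of a masking policy and a row access policy bound to roles. The policy names, roles, tables and the mapping table REGION_MAPPING are assumptions for illustration only.

```python
# Sketch: column masking and row-level security tied to roles.
# 'session' is a Snowpark Session; all names are illustrative.

# Mask e-mail addresses for every role except ANALYST.
session.sql("""
    CREATE OR REPLACE MASKING POLICY MASK_EMAIL AS (VAL STRING) RETURNS STRING ->
      CASE
        WHEN CURRENT_ROLE() IN ('ANALYST') THEN VAL
        ELSE '***MASKED***'
      END
""").collect()
session.sql(
    "ALTER TABLE CUSTOMERS MODIFY COLUMN EMAIL SET MASKING POLICY MASK_EMAIL"
).collect()

# Row access policy: roles only see the regions mapped to them in REGION_MAPPING.
session.sql("""
    CREATE OR REPLACE ROW ACCESS POLICY REGION_POLICY AS (SALES_REGION STRING) RETURNS BOOLEAN ->
      CURRENT_ROLE() = 'ADMIN'
      OR EXISTS (
        SELECT 1 FROM REGION_MAPPING M
        WHERE M.ROLE_NAME = CURRENT_ROLE()
          AND M.REGION    = SALES_REGION
      )
""").collect()
session.sql("ALTER TABLE SALES ADD ROW ACCESS POLICY REGION_POLICY ON (REGION)").collect()
```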
“Das Snowflake Special - Teil 3” - About the future of Snowflake
Insights from the podcast:
✅ Given the continuous stream of new features, it is recommended to experiment with them in a separate environment, for example to improve data loading processes and exploit the full potential of the platform
✅ The AI features in particular are a reason why it is fun to work with Snowflake, but you have to stay up to date
✅ Regarding AI, Snowflake always tries to prioritize the benefit for the customer; Cortex Studio for comparing LLMs and Copilot for generating SQL are examples of making AI more accessible (a small sketch of the Cortex functions follows after this list)
✅ Concerning the end-to-end vision, Snowflake wants to be the data platform where, whenever it comes to data, you are in good hands - the all-round carefree package
✅ Regarding Snowpark, it simply expands the possibilities for running more complex algorithms by allowing direct access to data and the native use of Python packages on the platform
✅ For data sharing, the internal and external marketplaces are valuable features, especially in the context of data mesh and data monetization. Secure data sharing uses pointers to the data, avoiding replication and improving performance (see the sharing sketch after this list).
✅ It is increasingly easy to integrate Snowflake with other systems, reinforcing the end-to-end vision. The native data lineage feature is another example of Snowflake evolving into a comprehensive platform.
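As mentioned in the AI point above, here is a small sketch of how Cortex LLM functions can be called from plain SQL (issued here via Snowpark). The model name, table and columns are assumptions; the available models and functions depend on your region and edition.

```python
# Sketch: calling Cortex LLM functions directly in a query.
# 'session' is a Snowpark Session; model, table and column names are illustrative.
session.sql("""
    SELECT
        TICKET_ID,
        SNOWFLAKE.CORTEX.SUMMARIZE(FEEDBACK_TEXT) AS SUMMARY,
        SNOWFLAKE.CORTEX.SENTIMENT(FEEDBACK_TEXT) AS SENTIMENT_SCORE,
        SNOWFLAKE.CORTEX.COMPLETE(
            'mistral-large',
            'Classify this feedback as bug, feature request or praise: ' || FEEDBACK_TEXT
        ) AS CATEGORY
    FROM SUPPORT_TICKETS
    LIMIT 10
""").show()
```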
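And for the data sharing point: a sketch of a secure share that exposes live data to a consumer account via pointers instead of copies. Database, schema, table and account identifiers are placeholders.

```python
# Sketch: secure data sharing - the consumer queries the data live, nothing is copied.
# 'session' is a Snowpark Session; all identifiers are placeholders.
session.sql("CREATE OR REPLACE SHARE SALES_SHARE").collect()
session.sql("GRANT USAGE ON DATABASE MY_DB TO SHARE SALES_SHARE").collect()
session.sql("GRANT USAGE ON SCHEMA MY_DB.MY_SCHEMA TO SHARE SALES_SHARE").collect()
session.sql("GRANT SELECT ON TABLE MY_DB.MY_SCHEMA.ORDERS TO SHARE SALES_SHARE").collect()

# Add the consumer account (placeholder organization and account name).
session.sql("ALTER SHARE SALES_SHARE ADD ACCOUNTS = MYORG.CONSUMER_ACCOUNT").collect()
```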
Snowflake is certainly a good example of a modern, flexible data platform that keeps improving its developer experience and adds important features for today's data needs.
For those interested, I already wrote an article about Snowflake and SAP some time ago: Snowflake Data Cloud in a SAP Ecosystem for Data & Analytics
👉 What do you think about Snowflake Data Cloud, its features and the value it brings to customers?
For transparency: I used AI tools like Google’s NotebookLM, ChatGPT and DeepL to work on this text. While everything was read, adapted and compiled by myself, these tools helped me wrap up the knowledge and supported the writing process.
GLOSSARY
Snowflake Data Cloud
A cloud-native data platform that enables data storage, processing, and analytics across multiple cloud providers with near-unlimited scalability and high performance.
Data Sharing
A Snowflake feature that allows real-time, secure sharing of live data between accounts and organizations without the need to copy or move data.
Data Lineage
The ability to track the flow, transformation, and origin of data within Snowflake to support data governance, quality, and compliance.
Snowpark
A developer framework that enables writing complex data transformations in Java, Scala, or Python and executing them directly inside Snowflake.
Cortex Studio
An interface within Snowflake for building, deploying, and managing machine learning and AI applications using built-in LLMs and other models.
LLM (Large Language Model)
An advanced AI model trained on large volumes of text data, used within Snowflake (e.g., via Cortex) for tasks like summarization, sentiment analysis, or natural language querying.
Dynamic Tables
Tables in Snowflake that automatically stay up to date by continuously applying incremental changes, simplifying ELT pipelines and reducing latency.
dbt (data build tool)
An open-source tool that integrates with Snowflake to help manage and automate SQL-based data transformations using version-controlled, modular code.
Snowflake Marketplace
A platform where users can discover, access, and share third-party datasets, applications, and services directly within the Snowflake ecosystem.
Snowflake Tasks and Streams
Streams track changes in data, and Tasks allow scheduling SQL or procedural logic—together, they enable robust, incremental data processing workflows.
Greenfield Project
A new implementation of Snowflake without any legacy constraints, allowing full freedom to design systems and architecture from the ground up.
Streamlit
An open-source Python framework for building and deploying interactive data applications—integrated directly into Snowflake for low-code dashboarding.
Time Travel
A feature in Snowflake that allows users to query, restore, or clone data from a previous state within a specified retention window, aiding recovery and auditing.