Authors: Jakub Puchalski, Xin Huang
June 13, 2024

Ingest Data Faster, Easier and Cost-Effectively with New Connectors and Product Updates

The journey toward a robust data platform that secures all your data in one place can seem like a daunting one. But at Snowflake, we're committed to making the first step the easiest, with seamless, cost-effective data ingestion that brings your workloads into the AI Data Cloud.

Snowflake is launching native integrations with some of the most popular databases, including PostgreSQL and MySQL. With other ingestion improvements and our new database connectors, we are smoothing out the data ingestion process, making it radically simple and efficient to bring data to Snowflake. That means fewer tools and licenses, lower costs, and a more frictionless experience for your organization. 

Like any first step, data ingestion is a critical foundational block, and ingestion with Snowflake should feel like a breeze. There are many different ways to ingest data, so in this blog we will walk through the various methods, calling out the latest announcements and improvements we've made.

Bringing in batch and streaming data efficiently and cost-effectively 

Ingest and transform batch or streaming data in under 10 seconds: use COPY INTO for batch ingestion, Snowpipe to auto-ingest files, or bring in row-set data with single-digit-second latency using Snowpipe Streaming.
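
As a rough sketch of what the batch and auto-ingest paths look like in SQL, the statements below load staged files with COPY INTO and then wrap the same load in a Snowpipe so new files are picked up continuously; the table, stage and path names are placeholders, not objects from this post. (Snowpipe Streaming, by contrast, is driven from a client SDK rather than SQL.)

  -- Batch: load staged Parquet files into a table (all names are illustrative).
  COPY INTO raw_events
    FROM @my_ext_stage/events/
    FILE_FORMAT = (TYPE = 'PARQUET')
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

  -- Continuous: a serverless pipe that auto-ingests new files as they land in the stage
  -- (AUTO_INGEST assumes cloud event notifications are configured for the stage).
  CREATE PIPE raw_events_pipe AUTO_INGEST = TRUE AS
    COPY INTO raw_events
      FROM @my_ext_stage/events/
      FILE_FORMAT = (TYPE = 'PARQUET')
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;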

COPY INTO now supports unstructured data through new ingestion capabilities for Document AI (generally available soon). Users can create a Document AI model and apply it during automated batch ingestion of unstructured documents in formats such as PDF, JPEG and HTML. Snowflake customers can then take the analytical insights extracted from those documents and operationalize them directly in their data pipelines.
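
As a hedged sketch of how a trained Document AI model might be applied to staged documents, the query below assumes a hypothetical model build named invoice_model and a stage @invoice_docs with a directory table enabled; neither object comes from this post.

  -- Run a previously trained Document AI model build over every file on the stage.
  -- The result is a JSON object of extracted fields that can feed downstream pipelines.
  SELECT
    relative_path,
    invoice_model!PREDICT(GET_PRESIGNED_URL(@invoice_docs, relative_path), 1) AS extracted
  FROM DIRECTORY(@invoice_docs);

The extracted values can then be flattened and loaded into regular tables as part of a pipeline, which is the pattern these new batch ingestion capabilities streamline.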

Both Snowpipe and Snowpipe Streaming are serverless, leading to better scalability and cost efficiency. Snowpipe Streaming, compared to Snowpipe, can handle high volumes of data at a lower cost and low latency — without complex manual client configuration and management. Exactly-once delivery, data ordering and availability are automatically managed by Snowflake, freeing up expensive developer resources for more mission-critical work. Users can also unify data pipelines, no longer needing to separate streaming and batch data. Ingest and transform easily in a single system without having to stitch solutions together or build additional data pipelines to move data around. 

Snowpipe and Snowpipe Streaming also serve as foundations for Snowflake’s native connectors and partner integrations, such as AWS Data Firehose, Striim and Streamkap. Customers benefit from the same cost efficiency and low latency. 

Simplifying ingestion with Snowflake native connectors 

Building on the success of Snowflake native connectors (the Snowflake Connector for Kafka and connectors for SaaS applications such as ServiceNow and Google Analytics), we have just announced that connectors for two of the leading open source relational databases, PostgreSQL and MySQL, will soon be in public preview. The new database connectors are built on top of Snowpipe Streaming, which means they provide the same cost-effective, lower-latency pipelines for customers. They further our commitment to offering simple native connectors for change data capture (CDC) from the top online transaction processing (OLTP) database systems, and we soon expect to expand the roster to leading proprietary databases as well.

These native connectors are built with the Snowflake Native App Framework, which means customers can connect their data through the Snowflake Marketplace with built-in security and reliability. Instead of transporting files between systems, data flows directly from the source right into Snowflake, and the data is always encrypted, whether in motion or at rest. Additionally, you can pay as you consume, with no need for additional licenses or procurement processes. 

Developers can operationalize their analytics, AI and ML workflows by bringing Postgres and MySQL data into Snowflake with lower latency. Customers have already unlocked incredible value from these connectors across retail, healthcare, high-tech, media, financial services and other industries. 

Figure 1: Snowflake’s Native Connectors can be found and used from Snowflake Marketplace

Now, let’s take a deeper look at how the native connectors work in Snowflake. 

The OLTP database connectors are built on a strong foundation of capabilities already well recognized by our customers. They offer the same set of benefits as our SaaS native connectors and Snowpipe Streaming, namely ease of use, high scalability, cost-effectiveness and low latency, while requiring little operational oversight.

Snowflake database connectors consist of two components:

  • The agent: a standalone application, distributed as a Docker image on Docker Hub and deployed in the customer's infrastructure. It performs the initial snapshot load and the incremental loads by reading data changes from the source database's CDC stream.
  • The Snowflake Native App: an object that resides in the customer's Snowflake account and acts as the brain behind the connector. It manages the replication process, controls the agent's state and creates all database objects, including the target database.
Figure 2: Example database connector configuration

Users can connect a single agent to multiple data sources and synchronize the data, whether continuously or at prescribed intervals, into a single Snowflake account. From inside the Snowflake Native App, they can select which tables and columns are replicated. In case of errors (e.g., a network issue or a lost connection with the agent), users are notified with an email alert. And, soon available in public preview, if a table in the source database changes its schema (e.g., a column is added, removed or renamed), the connector will automatically adjust and continue syncing the table with the new schema.

Customer use cases across industries 

E-commerce and retail: A developer at an e-commerce platform, tasked with personalizing the shopping experience for millions of users, can now use Snowflake’s native database connectors to tap into near real-time website interaction data from globally distributed Postgres databases, continuously analyze that data in Snowflake and serve personalized recommendations without expensive ETL.

Healthcare: A healthcare company planning to optimize patient-care experiences through data-driven insights can securely integrate patient-interaction data from the Postgres database behind its hospital management system into Snowflake, without needing a third-party processor, and leverage Snowflake Cortex AI to analyze trends and improve service quality in real time.

Gaming: With Snowflake’s native connectors, developers can quickly and continuously stream billing and customer usage data from thousands of Postgres databases into Snowflake, enabling them to make lightning-fast decisions to optimize user engagement in their game and user portals.

You will soon be able to try out the Snowflake Connectors for PostgreSQL and MySQL by installing them from Snowflake Marketplace and downloading the agent from Docker Hub.

Connecting to more data with the Marketplace ecosystem and Connector SDK

In addition to the connectors delivered natively by Snowflake, customers can also benefit from a broad ecosystem of partners who have built Snowflake Native Apps to distribute connectors via Snowflake Marketplace. For example, SNP developed SNP Glue to ingest SAP data directly into Snowflake, and Omnata offers out-of-the-box SaaS connectors for applications such as Monday.com, HubSpot and Zendesk, as do many other providers, including Nimbus and Informatica, to name just a few.

Figure 3: Example of connectors available via Snowflake Marketplace. 

Additionally, developers have the option to build their own connectors. Snowflake Native SDK for Connectors offers core libraries and templates so developers can build connectors faster. 

Of course, one of the key reasons data engineering with Snowflake is revolutionary is that easy data sharing means fewer data pipelines are needed. Customers have access to live data sets from Snowflake Marketplace, which reduces the cost and burden of traditional ETL pipelines and API-based integrations.
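
To make the contrast with pipeline-based integration concrete, here is a minimal sketch of consuming shared data: a share is mounted as a read-only database and queried in place, with nothing to extract, load or schedule. The provider account, share, database and table names are placeholders.

  -- Mount a share from a provider as a read-only database; no data is copied.
  CREATE DATABASE partner_weather FROM SHARE provider_acct.weather_share;

  -- Query the live, provider-maintained data directly, with no ETL pipeline to build or run.
  SELECT station_id, AVG(temp_c) AS avg_temp_c
  FROM partner_weather.public.daily_readings
  GROUP BY station_id;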

Continuous improvements in performance and usability

To make data ingestion even more cost-effective and effortless, Snowflake continues to invest in higher performance and a better user interface. We have improved JSON file loading performance by up to 25% and Parquet file loading by up to 50%, with no action required from customers.

Snowsight makes it simple to get your data into Snowflake, and it is now even easier to navigate, with a centralized location for a range of generally available features: creating stages, uploading files to create a table, loading files into an existing table, installing a connector, and automatic schema inference with the ability to update or override it.

Snowsight now allows users to create tables directly using Schema Detection and to load tables and stages with little to no coding, and users can now upload files as large as 250 MB, up from 50 MB. Learn more here.
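
For users who prefer SQL to the Snowsight UI, roughly the same schema-inference flow is available through the INFER_SCHEMA table function; the stage, named file format and table names below are illustrative.

  -- Preview the column names and types Snowflake infers from the staged Parquet files.
  SELECT *
  FROM TABLE(INFER_SCHEMA(LOCATION => '@my_stage/clicks/', FILE_FORMAT => 'my_parquet_format'));

  -- Create a table from that inferred schema, then load the files into it.
  CREATE TABLE clickstream USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(INFER_SCHEMA(LOCATION => '@my_stage/clicks/', FILE_FORMAT => 'my_parquet_format'))
  );

  COPY INTO clickstream
    FROM @my_stage/clicks/
    FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;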

You can learn more about data ingestion here. Or simply visit Snowflake Marketplace or Snowsight to jumpstart your ingestion pipelines. 
