❄️ Snowflake & YData
YData integrates seamlessly with Snowflake, so you can connect to, query, and manage your Snowflake data with ease. This section guides you through the benefits, setup, and use of the Snowflake connector.
Benefits of Integration
Integrating YData SDK with Snowflake offers several key benefits:
- Scalability: Snowflake's architecture scales effortlessly with your data needs, while YData SDK ensures efficient data integration and management.
- Performance: Leveraging Snowflake's high performance for data querying and YData SDK's optimization techniques enhances overall data processing speed.
- Security: Snowflake's robust security features, combined with YData SDK's data governance capabilities, ensure your data remains secure and compliant.
- Interoperability: YData SDK simplifies the process of connecting to Snowflake, allowing you to quickly set up and start using the data without extensive configuration. You also benefit from YData functionality such as data preparation with Python, synthetic data generation, and data profiling.
Setting Up the Snowflake Connector
👨‍💻 The complete code example and recipe can be found here.
# Importing YData's package
from ydata.connectors import SnowflakeConnector
# Build your connection string
USERNAME = "insert-username"
PASSWORD = "insert-password"
ACCOUNT_IDENTIFIER = "insert-account-identifier"
PORT = 443
DATABASE_NAME = "insert-database-name"
SCHEMA = "insert-schema-name"
WAREHOUSE = "insert-warehouse-name"
conn_str = {
    "hostname": ACCOUNT_IDENTIFIER,
    "username": USERNAME,
    "password": PASSWORD,
    "port": PORT,
    "database": DATABASE_NAME,
    "warehouse": WAREHOUSE,
}
# Create the Snowflake Connector
conn = SnowflakeConnector(conn_string=conn_str)
print(conn)
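Hardcoding credentials as in the snippet above is convenient for a first test but risky in shared notebooks. The following is a minimal sketch of building the same dictionary from environment variables instead. The variable names (`SNOWFLAKE_USERNAME` and so on) and the `build_conn_str` helper are hypothetical and not part of YData SDK; only the dictionary keys come from the example above.

```python
import os


def build_conn_str(env=os.environ):
    """Build the connection dictionary from environment variables.

    The environment variable names are illustrative assumptions;
    the dictionary keys mirror the example above.
    """
    required = {
        "hostname": "SNOWFLAKE_ACCOUNT_IDENTIFIER",
        "username": "SNOWFLAKE_USERNAME",
        "password": "SNOWFLAKE_PASSWORD",
        "database": "SNOWFLAKE_DATABASE",
        "warehouse": "SNOWFLAKE_WAREHOUSE",
    }
    # Fail early with a clear message if anything is missing.
    missing = [var for var in required.values() if var not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    conn_str = {key: env[var] for key, var in required.items()}
    # The port falls back to 443, the value used in the example above.
    conn_str["port"] = int(env.get("SNOWFLAKE_PORT", 443))
    return conn_str
```

The result can then be passed to the connector in the same way as before, for example `SnowflakeConnector(conn_string=build_conn_str())`.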
Navigate your Snowflake database
With the connector created, you can now explore your database and its available datasets.
# returns a list of schemas
schemas = conn.list_schemas()
# get the metadata of a database schema, including columns and relations between tables (PK and FK)
schema = conn.get_database_schema('PATIENTS')
Read from a Snowflake instance
Using the Snowflake connector it is possible to:
- Get the data from a Snowflake table
- Get a sample from a Snowflake table
- Get the data from a query to a Snowflake instance
- Get the full data from a selected database
# returns the whole data from a given table
table = conn.get_table('cardio_test')
print(table)
# Get a sample with n rows from a given table
table_sample = conn.get_table_sample(table='cardio_test', sample_size=50)
print(table_sample)
# returns the result of a SQL query run against the Snowflake instance
query_output = conn.query('SELECT * FROM patients.cardio_test;')
print(query_output)
Write to a Snowflake instance
If you need to write your data into a Snowflake instance you can also leverage your Snowflake connector for the following actions:
- Write the data into a table
- Write a new database schema
The if_exists parameter allows you to decide whether to append, replace, or fail when a table with the same name already exists in the schema.
conn.write_table(data=tables['cardio_test'],
                 name='cardio',
                 if_exists='fail')
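To make the three `if_exists` modes concrete, here is a minimal sketch of what they conventionally mean (the same convention used by pandas `DataFrame.to_sql`). It uses Python's built-in `sqlite3` as a local stand-in, and `write_rows` is a hypothetical helper, not the YData connector's implementation; the connector's exact behaviour, such as the error it raises in `fail` mode, may differ.

```python
import sqlite3


def write_rows(db, name, rows, if_exists="fail"):
    """Illustrative stand-in: write (id, value) rows into table `name`."""
    if if_exists not in ("fail", "replace", "append"):
        raise ValueError(f"Unsupported if_exists value: {if_exists!r}")

    # Check whether the target table is already present.
    exists = db.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone() is not None

    if exists and if_exists == "fail":
        # 'fail': refuse to touch an existing table.
        raise ValueError(f"Table {name!r} already exists")
    if exists and if_exists == "replace":
        # 'replace': drop the old table, then recreate it below.
        db.execute(f'DROP TABLE "{name}"')
        exists = False
    if not exists:
        db.execute(f'CREATE TABLE "{name}" (id INTEGER, value TEXT)')

    # 'append' (or a freshly created table): add the new rows.
    db.executemany(f'INSERT INTO "{name}" VALUES (?, ?)', rows)
    db.commit()


db = sqlite3.connect(":memory:")
write_rows(db, "cardio", [(1, "a")])                       # table created
write_rows(db, "cardio", [(2, "b")], if_exists="append")   # rows added
write_rows(db, "cardio", [(3, "c")], if_exists="replace")  # table rebuilt
```

Using `if_exists='fail'` as the default, as in the example above, is the safest choice because it never overwrites existing data silently.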
The table_names parameter allows you to define a new name for each table in the database. If it is not provided, the table names from your dataset are used.
conn.write_database(data=database,
                    schema_name='new_cardio',
                    table_names={'cardio_test': 'cardio'})
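The renaming rule described above can be sketched as a small mapping step. This is only an illustration under assumptions: `resolve_table_names` is a hypothetical helper, the second table name `patients_info` is made up for the example, and the handling of partial mappings (unmapped tables keep their original name) is an assumption about the connector's behaviour rather than something the source confirms.

```python
def resolve_table_names(dataset_tables, table_names=None):
    """Map each table in the dataset to its target name in the database.

    Tables without an entry in `table_names` keep their original name
    (assumed behaviour, for illustration only).
    """
    table_names = table_names or {}
    # Reject mapping keys that do not correspond to any dataset table.
    unknown = set(table_names) - set(dataset_tables)
    if unknown:
        raise KeyError(f"Not in dataset: {sorted(unknown)}")
    return {table: table_names.get(table, table) for table in dataset_tables}


# 'patients_info' is a made-up table name used only for this example.
targets = resolve_table_names(
    ["cardio_test", "patients_info"],
    {"cardio_test": "cardio"},
)
print(targets)
```

Here `cardio_test` is written as `cardio`, while the table without a mapping keeps its dataset name.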
I hope you enjoyed this quick tutorial on seamlessly integrating Snowflake with your data preparation workflows. ❄️🚀