Delivery options
This article describes the options for getting the dataset once the snapshot is ready.
To set your delivery preferences for the dataset, simply click on the ‘Delivery settings’ tab:
- Choose file format:
  - JSON
  - NDJSON
  - CSV
  - JSON lines
- Choose how to receive the data:
  - Amazon S3 (see AWS S3 User Role Permissions)
  - Google Cloud Storage (see How to find your Google Cloud Private Key)
  - Google Cloud PubSub
  - Microsoft Azure Storage
  - SFTP/FTP
  - Snowflake (see Snowflake Delivery Configuration Guide)
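The file formats above differ mainly in how records are laid out, which matters when you read a delivered file. The sketch below is a minimal illustration, assuming a JSON delivery is a single array of records, NDJSON and JSON lines both hold one JSON object per line, and a CSV delivery has a header row. The function name `parse_delivery` is hypothetical and not part of any Bright Data tooling.

```python
import csv
import io
import json


def parse_delivery(text: str, file_format: str) -> list:
    """Parse the text of a delivered file into a list of records."""
    fmt = file_format.lower()
    if fmt == "json":
        # Assumed layout: one JSON array holding every record.
        return json.loads(text)
    if fmt in ("ndjson", "json lines"):
        # One JSON object per non-empty line.
        return [json.loads(line) for line in text.split("\n") if line.strip()]
    if fmt == "csv":
        # Assumed layout: the first row is the header; values stay strings.
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {file_format}")
```

Line-based formats can be processed one record at a time, which tends to suit large snapshots better than loading a single JSON array into memory.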
You can also set delivery preferences per snapshot: click the snapshot, click ‘Export’, and set the required storage parameters.
You can also download the dataset directly from the control panel (CP) by clicking the ‘download + format’ button.
AWS S3 User Role Permissions
To control access to S3 resources, you can use IAM (Identity and Access Management) to create and manage AWS users and their permissions. One way to do this is by creating IAM roles and attaching them to S3 resources.
Create a Policy
Go to the “Policies” section in the IAM console.
Create a new policy that defines the permissions for the S3 resources you want to grant access to.
Example of AWS policy:
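As a rough illustration, a policy that lets a delivery role upload objects into a single bucket could be built as in the sketch below. The bucket name `my-delivery-bucket`, the `datasets/` prefix, the statement ID, and the choice of `s3:PutObject` as the only action are all assumptions; check the permissions your delivery setup actually requires before using it.

```python
import json


def build_delivery_policy(bucket: str, prefix: str = "") -> dict:
    """Return an IAM policy document allowing uploads into one bucket."""
    # Object-level ARN: every key in the bucket, or only keys under the prefix.
    object_arn = f"arn:aws:s3:::{bucket}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDeliveryUploads",
                "Effect": "Allow",
                # Assumption: writing objects is all that delivery needs.
                "Action": ["s3:PutObject"],
                "Resource": [object_arn],
            }
        ],
    }


if __name__ == "__main__":
    # "my-delivery-bucket" and "datasets/" are placeholder values.
    policy = build_delivery_policy("my-delivery-bucket", "datasets/")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the JSON editor when creating the policy in the IAM console.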
Create a Role
Go to the “Roles” section in the IAM console.
Create a new role and specify the policy created in step 1 in the “Permission policies” section.
Make a note of the ARN of the role, which will be used for delivery credentials. (The ARN will look like arn:aws:iam::<ROLE_ID>:role/<ROLE_NAME>, where <ROLE_ID> is your 12-digit AWS account ID.)
Example of User Role:
Use the ARN of the role
In the S3 resources that you want to grant access to, attach the role created in step 2 by using the ARN.
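Before pasting the role ARN into the delivery credentials, a quick format check can catch copy-paste mistakes. The sketch below assumes the standard `aws` partition (not `aws-cn` or `aws-us-gov`); the helper name `parse_role_arn` is hypothetical.

```python
import re

# IAM role ARN in the standard "aws" partition:
# arn:aws:iam::<12-digit account ID>:role/<optional path/><role name>
ROLE_ARN_RE = re.compile(
    r"arn:aws:iam::(\d{12}):role/((?:[\w+=,.@-]+/)*[\w+=,.@-]+)",
    re.ASCII,
)


def parse_role_arn(arn: str) -> tuple:
    """Return (account_id, role_path_and_name) or raise ValueError."""
    match = ROLE_ARN_RE.fullmatch(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return match.group(1), match.group(2)
```

For example, `parse_role_arn("arn:aws:iam::123456789012:role/delivery-role")` returns the account ID and the role name as a pair, while an S3 bucket ARN or a truncated string raises `ValueError`.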
How to find your Google Cloud Private Key
- Go to the Google Cloud Platform Console home page: https://console.cloud.google.com/
- Expand the Google Cloud Platform menu and click IAM & Admin.
- Click Service accounts.
- Choose an existing service account from the list or create one. If the Create Service Account button is not visible, create a project first.
- To create the service account, enter the name, ID, and description in the Create Service Account process. Then grant the access and create the account.
- Click the Email of the service account.
- To access the keys, click the ‘KEYS’ tab. Click the “Add Key” dropdown and then select ‘Create New Key’.
- Choose JSON as the key type.
- The service account key JSON file is automatically downloaded to your local machine.
- Copy private_key from the downloaded JSON file and paste it into the Data Collector delivery settings.
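Copying the key by hand from the JSON file is error-prone because the value is long and contains escaped line breaks. The sketch below shows one way to pull it out with Python's standard library; the function name `extract_private_key` is hypothetical, and whether the delivery settings expect real line breaks or the escaped `\n` form is not stated here, so treat the output format as an assumption.

```python
import json


def extract_private_key(key_json: str) -> str:
    """Return the private_key field from service account key JSON text."""
    key_data = json.loads(key_json)
    if key_data.get("type") != "service_account":
        raise ValueError("not a service account key file")
    # json.loads turns the escaped "\n" sequences into real newlines,
    # so the returned value is the multi-line PEM block.
    return key_data["private_key"]
```

With the downloaded file saved as, say, `key.json`, the call would be `extract_private_key(open("key.json", encoding="utf-8").read())`. Treat the result as a secret and avoid writing it to logs.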
Snowflake Delivery Configuration Guide
Getting Started
In order to allow efficient delivery of Datasets to your Snowflake environment, we provide a step-by-step guide to set it up. Just follow these steps:
Select or Create a Database
First, decide whether you will use an existing database or create a new one. If you opt for a new database, create it with a CREATE DATABASE statement, replacing <database> with the name you want for your database.
Select or Create a Schema
Decide whether you will use an existing schema or create a new one. By default, every database has a PUBLIC schema. If you wish to use a different schema, create it with a CREATE SCHEMA statement, replacing <schema> with your own schema name.
Select or Create a Warehouse
Choose an existing warehouse or create a new one. When creating a new warehouse, consider Snowflake’s recommendations for configuring a warehouse specifically for data loading. Create the warehouse with a CREATE WAREHOUSE statement, replacing <warehouse> with your desired warehouse name.
Select or Create an Internal Named Stage
Next, choose an existing internal named stage or create a new one. To create a new stage, use a CREATE STAGE statement, replacing <stage> with your preferred stage name.
Create a Role
You’ll need a role that can write to your chosen stage. To create one, use a CREATE ROLE statement, replacing <role_name> with your chosen role name.
Grant Warehouse Rights to the Role
Now, use a GRANT statement to give your new role the necessary rights to operate on your chosen warehouse, replacing <warehouse> and <role_name> with your specific warehouse and role names.
Enable Write Operations on the Stage for the Role
To enable your role to write to the stage, use a GRANT statement on the stage, replacing <stage> and <role_name> with your chosen stage and role names.
Create a BrightData User
Next, create a new user for BrightData that will be used to upload data directly into Snowflake. Use a CREATE USER statement, replacing <user_name>, <password>, and <login> with your chosen username, password, and login name.
Grant Role Privileges to the New User
Finally, grant your new user the privileges of the role you created with a GRANT ROLE statement, replacing <role_name> and <user_name> with your role and user names.
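The setup steps above can be sketched as a short script that renders one statement per step from your chosen names. The statements are standard Snowflake SQL inferred from the step descriptions, not copied from an official Bright Data script, and the function name `snowflake_setup_statements` is hypothetical. Two assumptions are worth flagging: the warehouse right granted is USAGE, and READ is granted together with WRITE on the internal stage.

```python
def snowflake_setup_statements(database, schema, warehouse, stage,
                               role_name, user_name, password, login):
    """Return the setup statements, in the order of the steps above."""
    stage_fqn = f"{database}.{schema}.{stage}"
    # Double any single quote so the password stays a valid SQL string literal.
    safe_password = password.replace("'", "''")
    return [
        f"CREATE DATABASE IF NOT EXISTS {database};",
        f"CREATE SCHEMA IF NOT EXISTS {database}.{schema};",
        f"CREATE WAREHOUSE IF NOT EXISTS {warehouse};",
        f"CREATE STAGE IF NOT EXISTS {stage_fqn};",
        f"CREATE ROLE IF NOT EXISTS {role_name};",
        # Assumption: USAGE is the warehouse right the role needs.
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role_name};",
        # Assumption: READ is granted along with WRITE on the internal stage.
        f"GRANT READ, WRITE ON STAGE {stage_fqn} TO ROLE {role_name};",
        f"CREATE USER IF NOT EXISTS {user_name} "
        f"PASSWORD = '{safe_password}' LOGIN_NAME = '{login}';",
        f"GRANT ROLE {role_name} TO USER {user_name};",
    ]
```

Printing the list with `print("\n".join(...))` gives a script you can review and run in a Snowflake worksheet. Depending on your account, the role may also need USAGE on the database and schema that contain the stage, which the steps above do not cover.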
Whitelist IPs
If you have an active Network Policy applied in your Snowflake account, you need to whitelist the following IPs:
Replace <policy_name> with your network policy name. Replace <existing_whitelisted_ips> with the list of existing whitelisted IPs.
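Setting a network policy's allowed IP list replaces the whole list, which is why the existing whitelisted IPs have to be included alongside the new ones. The sketch below builds such a statement with Snowflake's ALTER NETWORK POLICY syntax. The function name `allowed_ip_statement` is hypothetical, and the addresses in the usage example come from ranges reserved for documentation; they are not Bright Data's delivery IPs, which you need to obtain separately.

```python
def allowed_ip_statement(policy_name, existing_ips, delivery_ips):
    """Build an ALTER NETWORK POLICY statement that keeps the existing
    IPs and appends the delivery IPs."""
    # De-duplicate while keeping the original order.
    merged = list(dict.fromkeys([*existing_ips, *delivery_ips]))
    ip_list = ", ".join(f"'{ip}'" for ip in merged)
    return f"ALTER NETWORK POLICY {policy_name} SET ALLOWED_IP_LIST = ({ip_list});"


if __name__ == "__main__":
    # Placeholder addresses only (documentation ranges).
    print(allowed_ip_statement("my_policy", ["192.0.2.0/24"], ["203.0.113.10"]))
```

Review the generated statement before running it, since a wrong list can lock existing clients out of the account.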
And that’s it! You have now configured your Snowflake environment to receive data from our platform.
If you have any issues or need further assistance, please contact our support team.
To learn more about data loading performance and warehouse size considerations, see Snowflake’s documentation on the topic.