Superset Docker Requirements
Introduction
In this post, we will expand our Superset configuration to support additional databases: Snowflake, Google BigQuery, Google Sheets, and Elasticsearch. Although we will be focusing on these specific databases for this tutorial, Superset supports many more. You can refer to the full list of supported databases and their respective drivers in the official Superset documentation.
Step 1: Create a requirements-local.txt file
First, we need to create a requirements-local.txt file within the ./docker directory. This file will be used to specify the additional database drivers required for our desired databases.
# Navigate to the docker directory
cd ./docker
# Create the requirements-local.txt file
touch requirements-local.txt
Step 2: Add database drivers to the requirements-local.txt file
Open the requirements-local.txt file in your favorite text editor and add the following lines:
snowflake-sqlalchemy<=1.2.4
pybigquery
elasticsearch-dbapi
gsheetsdb
These lines specify the necessary drivers for Snowflake, Google BigQuery, Elasticsearch, and Google Sheets, respectively.
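Once the containers have been rebuilt or restarted, it can be useful to confirm that the drivers were actually installed. The following is a minimal sketch, assuming it is run with the same Python interpreter that Superset uses inside the container. The helper name installed_versions is ours, not part of Superset.

```python
from importlib import metadata

# Distribution names as listed in requirements-local.txt.
DRIVERS = [
    "snowflake-sqlalchemy",
    "pybigquery",
    "elasticsearch-dbapi",
    "gsheetsdb",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(DRIVERS).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any driver reported as NOT INSTALLED points to a problem with the requirements file or with the install step.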
Step 3: Modify the superset_config.py file
To display the preferred databases in the Superset UI, we need to modify the superset_config.py file. Add the following code snippet:
PREFERRED_DATABASES = [
    "Apache Druid",
    "Google BigQuery",
    "Snowflake",
    "Google Sheets",
    "PostgreSQL",
]
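The databases in this list are displayed prominently, in the order given, in the dialog for adding a database. Each entry must match the engine_name of the corresponding database engine spec in Superset. Note that Elasticsearch is not in this list, so it will not be featured there.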
Step 4 (Optional): Add database images to the UI
This step is optional but recommended for a more polished user experience. We will add custom images for each supported database in the Superset UI. Please note that these changes may not work when using Docker Compose, but they will be effective in production when building the image.
First, create a new YAML file named superset_text.yml with the following content:
DB_IMAGES:
  snowflake: "/static/assets/images/database_logo/snowflake.jpeg"
  postgresql: "/static/assets/images/database_logo/postgres.jpg"
  druid: "/static/assets/images/database_logo/druid.png"
  bigquery: "/static/assets/images/database_logo/bq.png"
  gsheets: "/static/assets/images/database_logo/gsheets.png"
  presto: "/static/assets/images/database_logo/prestodb.png"
Make sure to place the corresponding image files for each database in the specified paths within the Superset static assets directory.
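A missing image is easy to overlook, so a quick check before building the image can help. Below is a minimal sketch, assuming the /static/ URL prefix maps onto a local static directory. The superset/static path and the missing_images helper are illustrative assumptions, so adjust the path to your own layout.

```python
from pathlib import Path

# Engine -> URL path, mirroring the DB_IMAGES entries in superset_text.yml.
DB_IMAGES = {
    "snowflake": "/static/assets/images/database_logo/snowflake.jpeg",
    "postgresql": "/static/assets/images/database_logo/postgres.jpg",
    "druid": "/static/assets/images/database_logo/druid.png",
    "bigquery": "/static/assets/images/database_logo/bq.png",
    "gsheets": "/static/assets/images/database_logo/gsheets.png",
    "presto": "/static/assets/images/database_logo/prestodb.png",
}

URL_PREFIX = "/static/"


def missing_images(db_images, static_root):
    """Return the engines whose image file does not exist under static_root."""
    missing = []
    for engine, url_path in db_images.items():
        # Strip the URL prefix to get a path relative to the static directory.
        if url_path.startswith(URL_PREFIX):
            relative = url_path[len(URL_PREFIX):]
        else:
            relative = url_path.lstrip("/")
        if not (Path(static_root) / relative).is_file():
            missing.append(engine)
    return missing


if __name__ == "__main__":
    # Assumed location of the static directory; change as needed.
    for engine in missing_images(DB_IMAGES, "superset/static"):
        print(f"No image found for: {engine}")
```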
That's it! You've successfully extended your Superset configuration to support Snowflake, Google BigQuery, Google Sheets, and Elasticsearch. You can now use these databases as data sources within your Superset instance.
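When you add one of these databases in the UI, Superset asks for a SQLAlchemy URI. As a starting point, here is a small sketch that fills in the URI formats commonly documented for these drivers. The build_uri helper and all parameter values are hypothetical, and the exact format may vary with the driver version, so check the driver documentation before relying on it.

```python
from urllib.parse import quote_plus

# URI templates per engine; placeholder names are illustrative.
URI_TEMPLATES = {
    "snowflake": (
        "snowflake://{user}:{password}@{account}.{region}/{database}"
        "?role={role}&warehouse={warehouse}"
    ),
    "bigquery": "bigquery://{project_id}",
    "elasticsearch": "elasticsearch+http://{user}:{password}@{host}:9200/",
    "gsheets": "gsheets://",
}


def build_uri(engine, **params):
    """Fill in the template for an engine, URL-encoding every value."""
    # Encoding keeps characters such as '@' in a password from breaking the URI.
    encoded = {key: quote_plus(str(value)) for key, value in params.items()}
    return URI_TEMPLATES[engine].format(**encoded)


if __name__ == "__main__":
    print(build_uri("bigquery", project_id="my-project"))
    print(build_uri("elasticsearch", user="admin", password="p@ss", host="localhost"))
```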