Databricks
Databricks is a cloud-based data platform that helps companies manage and analyze large amounts of data from various sources.
Databricks was originally created as a way to easily run Apache Spark, a powerful open-source data processing engine, without having to worry about the underlying infrastructure. It provided a user-friendly "notebook" interface where you could write code and run it on a scalable, distributed computing cluster in the cloud.
Databricks parameters
Required parameters are in red and optional parameters are in blue.
- server_hostname, your SQL Warehouse's host name
this can be found on the Connection Details tab
- http_path, your SQL Warehouse's path
this can be found on the Connection Details tab
- access_token, generate a personal access token from your SQL Warehouse
this can be generated on the Connection Details tab
- catalog, the catalog to write new data to
if tables do not exist in the catalog.schema location already, we'll auto-create them
- schema, the schema to write new data to
if tables do not exist in the catalog.schema location already, we'll auto-create them
- port, the port number on which your Databricks instance is exposed
default: 443
- load_strategy, how to write new data into existing tables
default: APPEND (allowed: APPEND, TRUNCATE, UPSERT)
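The defaults and allowed values listed above can be summarized in a short Python sketch. The `resolve_parameters` helper is hypothetical and not part of cs_tools; it only applies the two documented defaults and checks `load_strategy` against the documented values, and it does not try to decide which parameters are required.

```python
# Hypothetical helper, not part of cs_tools: applies the documented defaults.
ALLOWED_LOAD_STRATEGIES = {"APPEND", "TRUNCATE", "UPSERT"}
DEFAULTS = {"port": 443, "load_strategy": "APPEND"}


def resolve_parameters(user_params: dict) -> dict:
    """Merge user-supplied parameters over the documented defaults."""
    params = {**DEFAULTS, **user_params}

    # Reject any load strategy that is not in the documented list.
    if params["load_strategy"] not in ALLOWED_LOAD_STRATEGIES:
        allowed = ", ".join(sorted(ALLOWED_LOAD_STRATEGIES))
        raise ValueError(f"load_strategy must be one of: {allowed}")

    # The port may arrive as a string (for example from a command line).
    params["port"] = int(params["port"])
    return params


if __name__ == "__main__":
    # Placeholder values; substitute your own Connection Details.
    print(resolve_parameters({"catalog": "my_catalog", "schema": "my_schema"}))
```

Omitting `port` and `load_strategy` yields 443 and APPEND, matching the defaults above.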
How do I use the Databricks syncer in commands?
cs_tools tools searchable bi-server --syncer "databricks://server_hostname=...&http_path=...&access_token=...&catalog=..."
- or -
cs_tools tools searchable bi-server --syncer databricks://definition.toml
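If you assemble the inline form programmatically, the string is simply `key=value` pairs joined with `&` after the `databricks://` prefix. The sketch below is a hypothetical helper, not a cs_tools function, and the bracketed values are placeholders. It does not escape values, since whether special characters need escaping is not covered here.

```python
# Hypothetical helper, not part of cs_tools: builds the inline syncer string.
def build_syncer_string(params: dict) -> str:
    """Join parameters as key=value pairs separated by '&'."""
    pairs = "&".join(f"{key}={value}" for key, value in params.items())
    return f"databricks://{pairs}"


if __name__ == "__main__":
    # Placeholder values; substitute your own Connection Details.
    syncer = build_syncer_string(
        {
            "server_hostname": "<server_hostname>",
            "http_path": "<http_path>",
            "access_token": "<access_token>",
            "catalog": "<catalog>",
        }
    )
    # Quote the string so the shell does not interpret '&'.
    print(f'cs_tools tools searchable bi-server --syncer "{syncer}"')
```

Keeping the access token in a definition file rather than on the command line avoids leaving it in shell history.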
Definition TOML Example
definition.toml
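A definition file might look like the output of the sketch below. It assumes the file holds the same parameters as key/value pairs under a `[configuration]` table; that table name and layout are assumptions, so check the cs_tools documentation for the exact format. The `render_definition_toml` helper is hypothetical and the bracketed values are placeholders.

```python
# Hypothetical helper, not part of cs_tools: renders an ASSUMED file layout.
def render_definition_toml(params: dict) -> str:
    """Render parameters as TOML key/value pairs under one table."""
    lines = ["[configuration]"]  # assumed table name
    for key, value in params.items():
        if isinstance(value, int):
            lines.append(f"{key} = {value}")  # integers are written bare
        else:
            # Simple quoting only; values containing quotes are not escaped.
            lines.append(f'{key} = "{value}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; substitute your own Connection Details.
    text = render_definition_toml(
        {
            "server_hostname": "<server_hostname>",
            "http_path": "<http_path>",
            "access_token": "<access_token>",
            "catalog": "<catalog>",
            "schema": "<schema>",
            "port": 443,
            "load_strategy": "APPEND",
        }
    )
    print(text)
```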