Databricks

There is No Magic!

This Syncer uses Databricks's SQLAlchemy driver under the hood.

Databricks states that the driver is intended to connect to Unity Catalog, and that usage with hive_metastore is untested.
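
Under the hood, the dialect (typically provided by the databricks-sql-connector package, with the SQLAlchemy pieces in databricks-sqlalchemy on newer releases) builds its connection from the same values documented in the Parameters section below. As a rough sketch only — this is not CS Tools' internal code, and it reuses the placeholder values from the examples on this page — a direct connection looks something like this:

    # A minimal sketch of connecting with the Databricks SQLAlchemy dialect directly.
    # This is NOT CS Tools' internal code; it only illustrates how the syncer
    # parameters map onto the driver's connection URL. The hostname, http_path, and
    # access token below are the same placeholders used elsewhere on this page.
    import sqlalchemy as sa

    engine = sa.create_engine(
        "databricks://token:dapi0123456789abcdef0123456789abcdef"
        "@dbc-abc1234-efgh.cloud.databricks.com"
        "?http_path=/sql/protocolv1/o/1234567890123456/0123-456789-abcdef01"
        "&catalog=thoughtspot"
        "&schema=cs_tools"
    )

    # Quick connectivity check against the SQL Warehouse.
    with engine.connect() as cnxn:
        print(cnxn.execute(sa.text("SELECT 1")).scalar())

The load_strategy and use_legacy_dataload parameters control how CS Tools writes data rather than how it connects, so they have no equivalent in the URL above.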

Parameters

Required parameters are listed first, and Optional parameters are shown with their default values.


  • server_hostname, your SQL Warehouse's host name
    this can be found on the Connection Details tab

  • http_path, your SQL Warehouse's path
    this can be found on the Connection Details tab

  • access_token, a personal access token for your SQL Warehouse
    this can be generated from the Connection Details tab

  • catalog, the catalog to write new data to
    if tables do not exist in the catalog.schema location already, we'll auto-create them

  • schema, the schema to write new data to
    if tables do not exist in the catalog.schema location already, we'll auto-create them

  • port, the port number your Databricks instance is exposed on
    default: 443

  • use_legacy_dataload, fall back to slower data loading with JDBC-style INSERTs
    default: false ( allowed: true, false )

  • load_strategy, how to write new data into existing tables
    default: APPEND ( allowed: APPEND, TRUNCATE, UPSERT )

Serverless Requirements

If you're running CS Tools serverless, you'll want to ensure you install these Python requirements (a quick import check is sketched below).

🧙 Don't know what this means? It's probably safe to ignore it.
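
Not sure whether the driver is present in your serverless environment? A quick check like the one below can help; the module names are assumptions based on the databricks-sql-connector and databricks-sqlalchemy packages and may differ across CS Tools versions.

    # Hypothetical sanity check -- not part of CS Tools. It only reports whether the
    # Databricks driver modules the syncer is assumed to rely on can be imported here.
    import importlib.util

    for module in ("databricks.sql", "databricks.sqlalchemy", "sqlalchemy"):
        try:
            found = importlib.util.find_spec(module) is not None
        except ModuleNotFoundError:
            found = False
        print(f"{module}: {'installed' if found else 'MISSING'}")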

How do I use the Syncer in commands?

CS Tools accepts syncer definitions in either declarative or configuration file form.

Simply write the parameters out alongside the command.

cs_tools tools searchable metadata --syncer "databricks://server_hostname=dbc-abc1234-efgh.cloud.databricks.com&http_path=/sql/protocolv1/o/1234567890123456/0123-456789-abcdef01&access_token=dapi0123456789abcdef0123456789abcdef&catalog=thoughtspot" --config dogfood

* when declaring multiple parameters inline, you should wrap the entire value in quotes.
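
If you build CS Tools invocations from a script, the inline definition is simply key=value pairs joined with & after the databricks:// prefix. Below is a small sketch (a hypothetical snippet, not part of CS Tools) that assembles the example string above:

    # Hypothetical helper -- not part of CS Tools -- that assembles the inline
    # syncer definition shown above from a dict of parameters.
    params = {
        "server_hostname": "dbc-abc1234-efgh.cloud.databricks.com",
        "http_path": "/sql/protocolv1/o/1234567890123456/0123-456789-abcdef01",
        "access_token": "dapi0123456789abcdef0123456789abcdef",
        "catalog": "thoughtspot",
    }

    definition = "databricks://" + "&".join(f"{key}={value}" for key, value in params.items())
    print(definition)  # pass this (wrapped in quotes) to --syncer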

  1. Create a file with the .toml extension.

    syncer-overwrite.toml

    [configuration]
    server_hostname = "dbc-abc1234-efgh.cloud.databricks.com"
    http_path = "/sql/protocolv1/o/1234567890123456/0123-456789-abcdef01"
    access_token = "dapi0123456789abcdef0123456789abcdef"
    catalog = "thoughtspot"
    schema = "cs_tools"
    port = 443
    load_strategy = "TRUNCATE"
    
    * this is a complete example, not all parameters are required.

  2. Write the filename in your command in place of the parameters.

    cs_tools tools searchable metadata --syncer databricks://syncer-overwrite.toml --config dogfood