
Connecting to Databricks

Connect Kyomi to your Databricks SQL warehouse for AI-powered analytics.

Connection Details

| Field | Description | Example |
| --- | --- | --- |
| Server Hostname | Databricks workspace URL | dbc-xxxxxxxx-xxxx.cloud.databricks.com |
| HTTP Path | SQL warehouse path | /sql/1.0/warehouses/xxxx |
| Catalog | Unity Catalog name or hive_metastore | main |
| Default Schema | Default schema for queries | default |

Prerequisites

  • Databricks workspace with SQL warehouse
  • Authentication: Personal Access Token (PAT) or OAuth
  • Appropriate permissions on catalogs and schemas

Setup Steps

Step 1: Get Connection Details

  1. Open your Databricks workspace
  2. Go to SQL Warehouses
  3. Select your warehouse
  4. Click Connection Details tab
  5. Note the Server hostname and HTTP path

SQL Warehouse Types

  • Serverless: Instant startup, auto-scales, pay-per-query
  • Pro: Fixed size, good for predictable workloads
  • Classic: Legacy option, full control over compute

Step 2: Choose Authentication Method

Databricks supports two authentication methods:

Option A: Personal Access Token (PAT)

Best for quick setup and individual users.

  1. In Databricks, click your username → Settings
  2. Go to Developer → Access tokens
  3. Click Generate new token
  4. Give it a description (e.g., "Kyomi Analytics")
  5. Set expiration (or no expiration for long-term use)
  6. Copy the token immediately (you won't see it again)

Option B: OAuth

Best for organizations that want users to authenticate with their own Databricks accounts.

Admin Setup (one-time):

  1. Go to Databricks Account Console at accounts.cloud.databricks.com
  2. Navigate to Settings → App connections
  3. Click Add connection
  4. Configure:
    • Name: Kyomi Analytics
    • Redirect URI: https://your-kyomi-domain.com/auth/oauth/databricks/callback
    • Scopes: Enable all-apis, sql, offline_access
  5. Save and copy the Client ID and Client Secret

User Setup:

  1. In Kyomi, select OAuth as authentication mode
  2. Admin enters the Client ID and Client Secret (if not already configured)
  3. Click Connect with Databricks
  4. Sign in with your Databricks account in the popup
  5. Authorize Kyomi to access your workspace

Step 3: Configure Connection in Kyomi

  1. In the datasource modal, select Databricks as the datasource type
  2. Enter the Server Hostname (e.g., dbc-xxxxxxxx-xxxx.cloud.databricks.com)
  3. Enter the HTTP Path (e.g., /sql/1.0/warehouses/xxxx)
  4. Click Connect to test the connection

Step 4: Select Catalog and Schema

  1. Choose your Catalog from the dropdown:
    • hive_metastore - Legacy Hive metastore
    • main or a custom catalog name - Unity Catalog
  2. Select a Default Schema (e.g., default)
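
The catalog and default schema you choose here act as session defaults: unqualified table names in queries resolve against them. A minimal sketch of the equivalent SQL (events is a hypothetical table name):

```sql
-- Set the session defaults explicitly
USE CATALOG main;
USE SCHEMA default;

-- With the defaults above, the bare name resolves to main.default.events
-- (events is a hypothetical table)
SELECT * FROM events LIMIT 10;
```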

Step 5: Enter Credentials

For PAT authentication:

  1. Paste your Personal Access Token in the credentials section
  2. Click Save

For OAuth authentication:

  1. Your credentials are stored automatically after OAuth authorization
  2. Click Save to complete setup

Step 6: Configure Catalog Indexing

Select which catalogs Kyomi should index:

  • Tables and columns from these catalogs will appear in Kyomi's data catalog
  • The AI uses this metadata to help write queries
  • Leave empty to index all accessible catalogs (the sketch below shows how to list the catalogs your credentials can see)
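
Before choosing, you can check which catalogs the connecting user can actually see from the Databricks SQL editor; Kyomi can only index catalogs that appear here:

```sql
-- Catalogs visible to the connecting user; anything missing here
-- is a permissions issue, not an indexing setting
SHOW CATALOGS;
```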

Unity Catalog Structure

Databricks Unity Catalog uses a three-level namespace:

Catalog → Schema → Table
   │         │        │
   │         │        └── Individual tables/views
   │         └── Grouping of tables (like database schemas)
   └── Top-level container (like a database)

Examples:

  • main.sales.orders - Table orders in schema sales in catalog main
  • hive_metastore.default.users - Legacy Hive table
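
For instance, a query can reference the orders table by its fully qualified three-part name (the column names here are hypothetical):

```sql
-- catalog.schema.table
SELECT order_id, amount
FROM main.sales.orders
LIMIT 10;

-- Legacy Hive metastore tables follow the same pattern
SELECT * FROM hive_metastore.default.users LIMIT 10;
```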

Required Permissions

Your Databricks user needs:

```sql
-- Unity Catalog permissions (scoped to one schema)
GRANT USE CATALOG ON CATALOG main TO `user@example.com`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `user@example.com`;
GRANT SELECT ON SCHEMA main.sales TO `user@example.com`; -- covers every table in the schema

-- Or broader access: privileges granted on a catalog are inherited
-- by every schema and table inside it
GRANT USE CATALOG ON CATALOG main TO `user@example.com`;
GRANT USE SCHEMA ON CATALOG main TO `user@example.com`;
GRANT SELECT ON CATALOG main TO `user@example.com`;
```

For legacy Hive metastore:

```sql
-- Legacy table ACLs require USAGE on the database as well as SELECT
GRANT USAGE ON DATABASE default TO `user@example.com`;
GRANT SELECT ON DATABASE default TO `user@example.com`;
```
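
To confirm the grants took effect, you can inspect them from the SQL editor. A quick check against the examples above:

```sql
-- List the privileges the user holds on the catalog and schema
SHOW GRANTS `user@example.com` ON CATALOG main;
SHOW GRANTS `user@example.com` ON SCHEMA main.sales;
```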

Troubleshooting

"Invalid access token" error (PAT)

  • Verify the token is correct and not expired
  • Generate a new token if needed
  • Ensure the token has appropriate permissions

OAuth connection failed

  • Verify the redirect URI matches exactly: https://your-kyomi-domain.com/auth/oauth/databricks/callback
  • Check that offline_access scope is enabled (required for refresh tokens)
  • Ensure your Databricks user has access to the workspace
  • Try disconnecting and reconnecting OAuth

"SQL warehouse not found" error

  • Verify the HTTP path is correct
  • Check that the SQL warehouse exists and is running
  • Ensure you have access to the warehouse

"Catalog not found" error

  • Verify the catalog name is correct
  • Check you have USE CATALOG permission
  • For hive_metastore, ensure it's enabled in your workspace
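
The first two points can be checked directly in the SQL editor:

```sql
-- Confirm the catalog exists and is visible to you
SHOW CATALOGS LIKE 'main';

-- Confirm which catalog and schema your session defaults to
SELECT current_catalog(), current_schema();
```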

Slow queries

  • Check if the SQL warehouse is starting up (Pro and Classic warehouses can take several minutes to start; serverless starts in seconds)
  • Consider a serverless warehouse to avoid cold starts, or a Pro warehouse for predictable, sustained workloads
  • Review query efficiency and table partitioning; the sketch below shows how to inspect a table's layout
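
One way to inspect a Delta table's layout and a query's plan (main.sales.orders and the amount column are the hypothetical examples from earlier):

```sql
-- partitionColumns, numFiles, and sizeInBytes help explain slow scans
DESCRIBE DETAIL main.sales.orders;

-- Review the plan for an expensive query
EXPLAIN SELECT * FROM main.sales.orders WHERE amount > 100;
```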

Can't see expected tables

  • Verify permissions on the catalog/schema/table
  • Check if using Unity Catalog vs legacy Hive metastore
  • Ensure "Catalogs to Index" includes the desired catalogs

Best Practices

Token Management

  • Use service principals for production (not personal tokens)
  • Set appropriate token expiration
  • Rotate tokens periodically
  • Store tokens securely

Performance

  • Use Delta Lake tables for best performance
  • Partition large tables appropriately
  • Use Photon-enabled warehouses when available
  • Consider caching for frequently accessed data
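
As a sketch of the first two points, a hypothetical Delta table partitioned by date (all names are illustrative):

```sql
CREATE TABLE main.sales.events (
  event_id BIGINT,
  event_date DATE,
  payload STRING
)
USING DELTA
PARTITIONED BY (event_date);
```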

Cost Management

  • Use serverless for variable workloads
  • Set auto-stop for SQL warehouses
  • Monitor query costs in the Databricks console
