Connecting to Databricks
Connect Kyomi to your Databricks SQL warehouse for AI-powered analytics.
Connection Details
| Field | Description | Example |
|---|---|---|
| Server Hostname | Databricks workspace URL | dbc-xxxxxxxx-xxxx.cloud.databricks.com |
| HTTP Path | SQL warehouse path | /sql/1.0/warehouses/xxxx |
| Catalog | Unity Catalog or hive_metastore | main |
| Default Schema | Default schema for queries | default |
Prerequisites
- Databricks workspace with SQL warehouse
- Authentication: Personal Access Token (PAT) or OAuth
- Appropriate permissions on catalogs and schemas
Setup Steps
Step 1: Get Connection Details
- Open your Databricks workspace
- Go to SQL Warehouses
- Select your warehouse
- Click Connection Details tab
- Note the Server hostname and HTTP path
SQL Warehouse Types
- Serverless: Instant startup, auto-scales, pay-per-query
- Pro: Fixed size, good for predictable workloads
- Classic: Legacy option, full control over compute
Step 2: Choose Authentication Method
Databricks supports two authentication methods:
Option A: Personal Access Token (PAT)
Best for quick setup and individual users.
- In Databricks, click your username → Settings
- Go to Developer → Access tokens
- Click Generate new token
- Give it a description (e.g., "Kyomi Analytics")
- Set expiration (or no expiration for long-term use)
- Copy the token immediately (you won't see it again)
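Before pasting the token into Kyomi, you can confirm it works with a quick REST call. A minimal sketch in Python using the `requests` package: it lists the workspace's SQL warehouses, so a successful response both validates the token and shows the warehouse you will connect to. The hostname and token values are placeholders.

```python
import requests

HOST = "dbc-xxxxxxxx-xxxx.cloud.databricks.com"  # your Server Hostname (placeholder)
TOKEN = "dapi..."  # the PAT you just generated (placeholder)

# List SQL warehouses; an HTTP 200 confirms the token authenticates.
resp = requests.get(
    f"https://{HOST}/api/2.0/sql/warehouses",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()

for wh in resp.json().get("warehouses", []):
    print(wh.get("name"), wh.get("id"), wh.get("state"))
```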
Option B: OAuth (Recommended for Teams)
Best for organizations that want users to authenticate with their Databricks accounts.
Admin Setup (one-time):
- Go to Databricks Account Console at accounts.cloud.databricks.com
- Navigate to Settings → App connections
- Click Add connection
- Configure:
  - Name: Kyomi Analytics
  - Redirect URI: https://your-kyomi-domain.com/auth/oauth/databricks/callback
  - Scopes: Enable `all-apis`, `sql`, `offline_access`
- Save and copy the Client ID and Client Secret
User Setup:
- In Kyomi, select OAuth as authentication mode
- Admin enters the Client ID and Client Secret (if not already configured)
- Click Connect with Databricks
- Sign in with your Databricks account in the popup
- Authorize Kyomi to access your workspace
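To verify OAuth works outside Kyomi, recent versions of the `databricks-sql-connector` Python package support browser-based (user-to-machine) OAuth through an `auth_type` argument; check your connector version before relying on it. A minimal sketch, with placeholder hostname and HTTP path:

```python
from databricks import sql

# Opens a browser window for Databricks sign-in (U2M OAuth); no token in code.
# Requires a databricks-sql-connector version with OAuth support.
with sql.connect(
    server_hostname="dbc-xxxxxxxx-xxxx.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/xxxx",                      # placeholder
    auth_type="databricks-oauth",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_user()")
        print(cursor.fetchone())
```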
Step 3: Configure Connection in Kyomi
- In the datasource modal, select Databricks as the datasource type
- Enter the Server Hostname (e.g., `dbc-xxxxxxxx-xxxx.cloud.databricks.com`)
- Enter the HTTP Path (e.g., `/sql/1.0/warehouses/xxxx`)
- Click Connect to test the connection
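If the in-app test fails, you can sanity-check the same three values outside Kyomi. A minimal sketch with the `databricks-sql-connector` package, assuming PAT authentication; all values are placeholders from the steps above:

```python
from databricks import sql

# The same values you entered in Kyomi (placeholders).
with sql.connect(
    server_hostname="dbc-xxxxxxxx-xxxx.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/xxxx",
    access_token="dapi...",  # PAT from Step 2
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchone())  # (1,) means hostname, path, and token all work
```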
Step 4: Select Catalog and Schema
- Choose your Catalog from the dropdown:
  - `hive_metastore` - Legacy Hive metastore
  - `main` or custom - Unity Catalog
- Select a Default Schema (e.g., `default`)
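If the dropdowns don't show what you expect, the same information is available as SQL. A sketch reusing the placeholder connection details from Step 3:

```python
from databricks import sql

with sql.connect(
    server_hostname="dbc-xxxxxxxx-xxxx.cloud.databricks.com",  # placeholders, as in Step 3
    http_path="/sql/1.0/warehouses/xxxx",
    access_token="dapi...",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SHOW CATALOGS")         # catalogs visible to your user
        print(cursor.fetchall())
        cursor.execute("SHOW SCHEMAS IN main")  # schemas in one catalog
        print(cursor.fetchall())
```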
Step 5: Enter Credentials
For PAT authentication:
- Paste your Personal Access Token in the credentials section
- Click Save
For OAuth authentication:
- Your credentials are stored automatically after OAuth authorization
- Click Save to complete setup
Step 6: Configure Catalog Indexing
Select which catalogs Kyomi should index:
- Tables and columns from these catalogs will appear in Kyomi's data catalog
- The AI will use this information to help write queries
- Leave empty to index all accessible catalogs
Unity Catalog Structure
Databricks Unity Catalog uses a three-level namespace:
```
Catalog → Schema → Table
   │         │        │
   │         │        └── Individual tables/views
   │         └── Grouping of tables (like database schemas)
   └── Top-level container (like a database)
```

Examples:
- `main.sales.orders` - Table `orders` in schema `sales` in catalog `main`
- `hive_metastore.default.users` - Legacy Hive table
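Concretely, the namespace means a table can be addressed fully qualified from anywhere, or with short names once the context is set. A sketch using the example table `main.sales.orders` from above (it may not exist in your workspace), with placeholder connection details as in Step 3:

```python
from databricks import sql

connection = sql.connect(
    server_hostname="dbc-xxxxxxxx-xxxx.cloud.databricks.com",  # placeholders, as in Step 3
    http_path="/sql/1.0/warehouses/xxxx",
    access_token="dapi...",
)
cursor = connection.cursor()

# Fully qualified: catalog.schema.table works regardless of context.
cursor.execute("SELECT * FROM main.sales.orders LIMIT 5")

# Or set the context once, then use short names.
cursor.execute("USE CATALOG main")
cursor.execute("USE SCHEMA sales")
cursor.execute("SELECT * FROM orders LIMIT 5")

cursor.close()
connection.close()
```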
Required Permissions
Your Databricks user needs:
```sql
-- Unity Catalog permissions
GRANT USE CATALOG ON CATALOG main TO `user@example.com`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `user@example.com`;
-- SELECT on a schema covers all tables in it (wildcards are not supported)
GRANT SELECT ON SCHEMA main.sales TO `user@example.com`;

-- Or broader access (privileges granted on a catalog are inherited
-- by every schema and table inside it)
GRANT USE CATALOG ON CATALOG main TO `user@example.com`;
GRANT USE SCHEMA ON CATALOG main TO `user@example.com`;
GRANT SELECT ON CATALOG main TO `user@example.com`;
```

For legacy Hive metastore:

```sql
GRANT SELECT ON DATABASE default TO `user@example.com`;
```

Troubleshooting
"Invalid access token" error (PAT)
- Verify the token is correct and not expired
- Generate a new token if needed
- Ensure the token has appropriate permissions
OAuth connection failed
- Verify the redirect URI matches exactly: `https://your-domain/auth/oauth/databricks/callback`
- Check that the `offline_access` scope is enabled (required for refresh tokens)
- Ensure your Databricks user has access to the workspace
- Try disconnecting and reconnecting OAuth
"SQL warehouse not found" error
- Verify the HTTP path is correct
- Check that the SQL warehouse exists and is running
- Ensure you have access to the warehouse
"Catalog not found" error
- Verify the catalog name is correct
- Check you have `USE CATALOG` permission
- For `hive_metastore`, ensure it's enabled in your workspace
Slow queries
- Check if the SQL warehouse is starting up (pro and classic warehouses can take time to cold-start; serverless starts almost instantly)
- Consider using a Pro warehouse for consistent performance
- Review query efficiency and table partitioning
Can't see expected tables
- Verify permissions on the catalog/schema/table
- Check if using Unity Catalog vs legacy Hive metastore
- Ensure "Catalogs to Index" includes the desired catalogs
Best Practices
Token Management
- Use service principals for production (not personal tokens)
- Set appropriate token expiration
- Rotate tokens periodically
- Store tokens securely
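For production, that means a service principal with OAuth machine-to-machine credentials instead of a PAT tied to a person. A sketch of the token exchange, assuming the standard Databricks workspace OAuth token endpoint; the client ID and secret come from a service principal created in the account console:

```python
import requests

HOST = "dbc-xxxxxxxx-xxxx.cloud.databricks.com"  # placeholder
CLIENT_ID = "..."      # service principal client ID (placeholder)
CLIENT_SECRET = "..."  # service principal OAuth secret (placeholder)

# Exchange client credentials for a short-lived access token (M2M OAuth).
resp = requests.post(
    f"https://{HOST}/oidc/v1/token",
    auth=(CLIENT_ID, CLIENT_SECRET),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
    timeout=30,
)
resp.raise_for_status()
token = resp.json()["access_token"]  # short-lived; fetch a fresh one as needed
```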
Performance
- Use Delta Lake tables for best performance
- Partition large tables appropriately
- Use Photon-enabled warehouses when available
- Consider caching for frequently accessed data
Cost Management
- Use serverless for variable workloads
- Set auto-stop for SQL warehouses
- Monitor query costs in the Databricks console
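Auto-stop can also be set programmatically rather than in the UI. A sketch, assuming the `/edit` endpoint and `auto_stop_mins` field of the SQL Warehouses API (verify both against the current API reference); placeholders throughout:

```python
import requests

HOST = "dbc-xxxxxxxx-xxxx.cloud.databricks.com"  # placeholder
TOKEN = "dapi..."      # PAT or service-principal token (placeholder)
WAREHOUSE_ID = "xxxx"  # the id segment of the HTTP path

# Stop the warehouse after 15 idle minutes to cap costs.
resp = requests.post(
    f"https://{HOST}/api/2.0/sql/warehouses/{WAREHOUSE_ID}/edit",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"auto_stop_mins": 15},
    timeout=30,
)
resp.raise_for_status()
```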