Confluence Connector
Sync spaces, pages, and blog posts from Atlassian Confluence into LH42.
Features
- Spaces - Sync entire spaces or selected ones
- Pages - Full page content with hierarchy
- Blog posts - Company blog content
- Attachments - PDF and Office documents
- XHTML conversion - Confluence storage format → markdown
- Incremental sync - Only changed pages re-indexed
Prerequisites
- Atlassian Cloud account (or Data Center/Server)
- Read access to the Confluence spaces you want to sync
Setup
Step 1: Connect via OAuth
- Go to Settings > Integrations
- Find Confluence and click Connect
- Sign in with your Atlassian account
- Grant the requested permissions:
  - Read Confluence spaces
  - Read Confluence content
  - Read user information
Step 2: Select Spaces
After connecting, choose which spaces to sync:
python
# `client` is assumed to be an already-authenticated LH42 client instance.
client.connectors.configure("confluence", {
    "settings": {
        "space_keys": ["TEAM", "DOCS", "ENG"],  # Specific spaces
        "include_blogs": True,                  # Include blog posts
        "include_attachments": True             # Sync attachments
    }
})
Options:
- Omit space_keys to sync all accessible spaces
- Set include_blogs to False to skip blog posts
- Set include_attachments to False to skip attachments
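The options above can be assembled programmatically. The following is a minimal sketch, assuming the payload shape shown in the Step 2 example; `build_confluence_settings` is a hypothetical helper and not part of the LH42 SDK.

```python
def build_confluence_settings(space_keys=None, include_blogs=True,
                              include_attachments=True):
    """Build a settings payload in the shape used by the Step 2 example.

    Hypothetical helper for illustration; not an LH42 SDK function.
    """
    settings = {
        "include_blogs": include_blogs,
        "include_attachments": include_attachments,
    }
    # Leaving space_keys out entirely means "sync all accessible spaces".
    if space_keys:
        settings["space_keys"] = list(space_keys)
    return {"settings": settings}


# Pages only, from every accessible space.
payload = build_confluence_settings(include_blogs=False,
                                    include_attachments=False)
print(payload)
```

The resulting dictionary could then be passed as the second argument to `client.connectors.configure("confluence", payload)`.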
Step 3: Start Initial Sync
python
client.connectors.sync("confluence", mode="full")
Content Transformation
Confluence XHTML storage format is converted to searchable markdown:
| Confluence Element | Output |
|---|---|
| ac:structured-macro | Expanded content |
| ac:rich-text-body | Plain text |
| ac:image | Image reference |
| ac:link | Markdown link |
| table | Markdown table |
| code | Code block |
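To make the table row of this mapping concrete, here is a minimal sketch of turning an XHTML table into a markdown table with Python's standard-library `html.parser`. The class and function names are hypothetical, and this is not LH42's actual converter; it ignores nested tables, merged cells, and pipe characters inside cells.

```python
from html.parser import HTMLParser


class TableToMarkdown(HTMLParser):
    """Collect the text of each table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Collapse internal whitespace so each cell stays on one line.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_markdown(xhtml):
    """Render the first row as the header and the rest as body rows."""
    parser = TableToMarkdown()
    parser.feed(xhtml)
    parser.close()
    if not parser.rows:
        return ""
    header, *body = parser.rows
    lines = ["| " + " | ".join(header) + " |", "|" + "---|" * len(header)]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


sample = (
    "<table><tbody>"
    "<tr><th>Name</th><th>Role</th></tr>"
    "<tr><td>Ada</td><td>Engineer</td></tr>"
    "</tbody></table>"
)
print(table_to_markdown(sample))
```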
API Reference
List Spaces
bash
GET /api/connectors/{connector_id}/spaces
# Response
{
  "spaces": [
    {
      "key": "TEAM",
      "name": "Team Documentation",
      "type": "global",
      "page_count": 150,
      "synced": true
    }
  ]
}
List Pages in Space
bash
GET /api/connectors/{connector_id}/pages?space_key=TEAM&limit=50
Sync Specific Space
bash
POST /api/connectors/{connector_id}/sync
{
  "mode": "incremental",
  "filters": {
    "space_keys": ["TEAM"]
  }
}
Hierarchy Preservation
Confluence page hierarchy is preserved:
- Parent-child relationships tracked in metadata
- Breadcrumbs available for navigation context
- Space structure maintained for filtering
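As an illustration of how parent-child metadata can yield breadcrumbs, here is a minimal sketch. The field names `id`, `title`, and `parent_id` are assumptions made for this example; the actual metadata keys LH42 stores are not specified here.

```python
def breadcrumbs(page_id, pages_by_id):
    """Return page titles from the root ancestor down to the given page.

    Assumes each page record has hypothetical "id", "title", and
    "parent_id" fields, with parent_id set to None for top-level pages.
    """
    trail = []
    seen = set()  # guards against accidental cycles in the metadata
    current = pages_by_id.get(page_id)
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        trail.append(current["title"])
        current = pages_by_id.get(current.get("parent_id"))
    return list(reversed(trail))


pages = {
    "1": {"id": "1", "title": "Engineering", "parent_id": None},
    "2": {"id": "2", "title": "Runbooks", "parent_id": "1"},
    "3": {"id": "3", "title": "Database Failover", "parent_id": "2"},
}
print(" > ".join(breadcrumbs("3", pages)))
```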
Cloud vs Data Center
| Feature | Cloud | Data Center |
|---|---|---|
| Authentication | OAuth | API token |
| Webhooks | Yes | Limited |
| Sync method | REST API | REST API |
For Data Center/Server, use API token authentication:
python
client.connectors.create("confluence", {
    "auth_type": "api_token",
    "credentials": {
        "base_url": "https://confluence.yourcompany.com",
        "username": "user@company.com",
        "api_token": "your-api-token"
    }
})
Sync Frequency
| Mode | Frequency | Use Case |
|---|---|---|
| Scheduled | Every 4 hours | Standard sync |
| On-demand | Manual trigger | Immediate updates |
| Webhook | Near real-time | Cloud only |
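When deciding whether an on-demand sync is worth triggering, it can help to estimate when the next scheduled run will happen. The sketch below assumes the 4-hour interval from the table and that you track the last sync time yourself; both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Interval taken from the "Scheduled" row of the table above.
SCHEDULED_INTERVAL = timedelta(hours=4)


def next_scheduled_sync(last_sync):
    """Estimate when the next scheduled sync should start."""
    return last_sync + SCHEDULED_INTERVAL


def is_sync_due(last_sync, now=None):
    """Return True once the scheduled interval has fully elapsed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= next_scheduled_sync(last_sync)


last = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
print(next_scheduled_sync(last).isoformat())
```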
Troubleshooting
"Space not found" errors
- Verify you have access to the space
- Check space key is correct (case-sensitive)
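Both checks above can be done against the List Spaces response. The following is a minimal sketch, assuming the response shape shown in the API Reference; `find_unknown_space_keys` is a hypothetical helper that also flags keys which differ only in letter case.

```python
def find_unknown_space_keys(configured_keys, spaces_response):
    """Map each configured key that is not accessible to a suggestion.

    The suggestion is the correctly cased key when one differs only by
    case, otherwise None (the space is missing or not accessible).
    """
    known = {space["key"] for space in spaces_response.get("spaces", [])}
    by_folded = {key.casefold(): key for key in known}
    report = {}
    for key in configured_keys:
        if key in known:
            continue  # exact, case-sensitive match
        report[key] = by_folded.get(key.casefold())
    return report


response = {
    "spaces": [
        {"key": "TEAM", "name": "Team Documentation", "type": "global",
         "page_count": 150, "synced": True}
    ]
}
print(find_unknown_space_keys(["TEAM", "team", "DOCS"], response))
```

A key mapped to a suggestion points to a casing mistake; a key mapped to `None` is more likely an access or naming problem.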
Missing page content
- Some macros may not be fully supported
- Check for restricted pages
Large spaces timing out
- Sync spaces individually
- Increase timeout settings
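For the first suggestion, one option is to issue a separate sync request per space. Here is a minimal sketch that builds one request body per space key, in the shape shown under Sync Specific Space; `per_space_sync_bodies` is a hypothetical helper, and sending each body to `POST /api/connectors/{connector_id}/sync` is left to your HTTP client.

```python
import json


def per_space_sync_bodies(space_keys, mode="incremental"):
    """Build one sync request body per space instead of one large request."""
    return [
        {"mode": mode, "filters": {"space_keys": [key]}}
        for key in space_keys
    ]


# Each printed body targets exactly one space.
for body in per_space_sync_bodies(["TEAM", "DOCS", "ENG"]):
    print(json.dumps(body))
```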
Next Steps
- SharePoint Connector - Connect SharePoint
- Integrations Overview - Architecture overview