Confluence Connector

Sync spaces, pages, and blog posts from Atlassian Confluence into LH42.

Features

Spaces - Sync entire spaces or selected ones
Pages - Full page content with hierarchy
Blog posts - Company blog content
Attachments - PDF and Office documents
XHTML conversion - Confluence storage format → Markdown
Incremental sync - Only changed pages re-indexed
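
As an illustration of what incremental sync means, the sketch below compares page version numbers from a previous run with the current ones and returns only the pages that need re-indexing. This document does not describe how the connector detects changes, so the function name and the version-map shape are assumptions made for the example.

```python
def pages_to_reindex(indexed_versions: dict, current_versions: dict) -> list:
    """Return IDs of pages that are new or whose version number changed.

    Both arguments map page ID -> version number (a hypothetical shape).
    """
    return sorted(
        page_id
        for page_id, version in current_versions.items()
        if indexed_versions.get(page_id) != version
    )


indexed = {"101": 3, "102": 7}
current = {"101": 3, "102": 8, "103": 1}
changed = pages_to_reindex(indexed, current)
print(changed)
```

Here page 101 is unchanged and skipped, while 102 (edited) and 103 (new) are selected.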

Prerequisites

Atlassian Cloud account (or a Data Center/Server instance)
Read access to the Confluence spaces you want to sync

Setup

Step 1: Connect via OAuth (Cloud)

Go to Settings > Integrations
Find Confluence and click Connect
Sign in with your Atlassian account
Grant the requested permissions:

- Read Confluence spaces

- Read Confluence content

- Read user information

Step 2: Select Spaces

After connecting, choose which spaces to sync:

python

client.connectors.configure("confluence", {
    "settings": {
        "space_keys": ["TEAM", "DOCS", "ENG"],  # Specific spaces
        "include_blogs": True,                    # Include blog posts
        "include_attachments": True               # Sync attachments
    }
})

Options:

Omit space_keys to sync all accessible spaces
Set include_blogs to False to skip blog posts
Set include_attachments to False to skip attachments
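
The options above can be captured in a small helper that builds the settings payload. This is a minimal sketch: build_confluence_settings is a hypothetical name, not part of the client library, and it only assembles the dictionary shown in Step 2.

```python
from typing import Any, Optional, Sequence


def build_confluence_settings(
    space_keys: Optional[Sequence[str]] = None,
    include_blogs: bool = True,
    include_attachments: bool = True,
) -> dict:
    """Build the payload passed to client.connectors.configure (hypothetical helper)."""
    settings: dict = {
        "include_blogs": include_blogs,
        "include_attachments": include_attachments,
    }
    # Leaving "space_keys" out entirely means "sync all accessible spaces".
    if space_keys:
        settings["space_keys"] = list(space_keys)
    return {"settings": settings}


all_spaces = build_confluence_settings(include_attachments=False)
selected = build_confluence_settings(["TEAM", "DOCS"], include_blogs=False)
print(all_spaces)
print(selected)
```

The result could then be passed as the second argument to client.connectors.configure("confluence", ...).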

Step 3: Start Initial Sync

python

client.connectors.sync("confluence", mode="full")

Content Transformation

Confluence XHTML storage format is converted to searchable Markdown:

Confluence Element	Output
ac:structured-macro	Expanded content
ac:rich-text-body	Plain text
ac:image	Image reference
ac:link	Markdown link
table	Markdown table
code	Code block
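
To make one row of this mapping concrete, the sketch below converts a simple XHTML table into a Markdown table using only the standard library. It is an illustration of the idea, not the connector's converter: it treats the first row as the header and ignores nested markup, merged cells, and macros.

```python
import xml.etree.ElementTree as ET


def table_to_markdown(xhtml: str) -> str:
    """Convert a simple XHTML <table> into a Markdown table."""
    table = ET.fromstring(xhtml)
    rows = []
    for tr in table.iter("tr"):
        # Collect header (th) and data (td) cells in document order.
        cells = [
            "".join(cell.itertext()).strip()
            for cell in tr
            if cell.tag in ("th", "td")
        ]
        rows.append(cells)
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


sample = (
    "<table><tbody>"
    "<tr><th>Mode</th><th>Frequency</th></tr>"
    "<tr><td>Scheduled</td><td>Every 4 hours</td></tr>"
    "</tbody></table>"
)
print(table_to_markdown(sample))
```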

API Reference

List Spaces

http

GET /api/connectors/{connector_id}/spaces

# Response
{
  "spaces": [
    {
      "key": "TEAM",
      "name": "Team Documentation",
      "type": "global",
      "page_count": 150,
      "synced": true
    }
  ]
}
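
A response of this shape can be filtered client-side, for example to find spaces that have not been synced yet. The sketch below assumes the field names shown above; the second space in the sample is made-up data added for illustration.

```python
import json

# Sample body in the shape shown above; "ARCHIVE" is invented for the example.
response_body = """
{
  "spaces": [
    {"key": "TEAM", "name": "Team Documentation", "type": "global",
     "page_count": 150, "synced": true},
    {"key": "ARCHIVE", "name": "Old Docs", "type": "global",
     "page_count": 40, "synced": false}
  ]
}
"""


def unsynced_space_keys(body: str) -> list:
    """Return the keys of spaces whose "synced" flag is false or missing."""
    spaces = json.loads(body).get("spaces", [])
    return [space["key"] for space in spaces if not space.get("synced", False)]


print(unsynced_space_keys(response_body))
```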

List Pages in Space

http

GET /api/connectors/{connector_id}/pages?space_key=TEAM&limit=50
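
When building this URL in code, it is safer to encode the query parameters than to concatenate strings. The sketch below uses the path and parameters from the request above; the host lh42.example.com and the connector ID conn_123 are placeholders, not real values.

```python
from urllib.parse import quote, urlencode


def pages_url(base_url: str, connector_id: str, space_key: str, limit: int = 50) -> str:
    """Build the "list pages in space" URL with properly encoded parameters."""
    path = f"/api/connectors/{quote(connector_id, safe='')}/pages"
    query = urlencode({"space_key": space_key, "limit": limit})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Placeholder host and connector ID, for illustration only.
url = pages_url("https://lh42.example.com/", "conn_123", "TEAM")
print(url)
```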

Sync Specific Space

http

POST /api/connectors/{connector_id}/sync
{
  "mode": "incremental",
  "filters": {
    "space_keys": ["TEAM"]
  }
}
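
The same request can be prepared with the Python standard library. This sketch only constructs the request object and does not send it; the host and connector ID are placeholders, authentication headers are omitted because this document does not specify them, and the allowed modes are limited to the two that appear on this page.

```python
import json
import urllib.request

ALLOWED_MODES = {"full", "incremental"}  # the only modes shown in this document


def build_sync_request(base_url, connector_id, space_keys, mode="incremental"):
    """Prepare (but do not send) a POST request that syncs selected spaces."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported sync mode: {mode!r}")
    body = {"mode": mode, "filters": {"space_keys": list(space_keys)}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/connectors/{connector_id}/sync",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and connector ID, for illustration only.
request = build_sync_request("https://lh42.example.com", "conn_123", ["TEAM"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a call to urllib.request.urlopen(request) once the required credentials are attached.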

Hierarchy Preservation

Confluence page hierarchy is preserved:

Parent-child relationships tracked in metadata
Breadcrumbs available for navigation context
Space structure maintained for filtering
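
As a sketch of how breadcrumbs can be derived from parent-child metadata, the function below walks from a page up to its root. The field names title and parent_id are assumptions for the example; the actual metadata keys are not listed in this document.

```python
def breadcrumbs(page_id: str, pages: dict) -> list:
    """Return page titles from the root ancestor down to the given page."""
    trail = []
    seen = set()
    current = page_id
    # Walk up the parent chain; "seen" guards against accidental cycles.
    while current is not None and current not in seen:
        seen.add(current)
        page = pages[current]
        trail.append(page["title"])
        current = page.get("parent_id")
    return list(reversed(trail))


# Hypothetical metadata keyed by page ID.
pages = {
    "1": {"title": "Engineering", "parent_id": None},
    "2": {"title": "Runbooks", "parent_id": "1"},
    "3": {"title": "Database failover", "parent_id": "2"},
}
print(" > ".join(breadcrumbs("3", pages)))
```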

Cloud vs Data Center

Feature	Cloud	Data Center
Authentication	OAuth	API token
Webhooks	Yes	Limited
Sync	REST API	REST API

For Data Center/Server, use API token authentication:

python

client.connectors.create("confluence", {
    "auth_type": "api_token",
    "credentials": {
        "base_url": "https://confluence.yourcompany.com",
        "username": "user@company.com",
        "api_token": "your-api-token"
    }
})
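
To keep the token out of source code, the credentials can be read from environment variables. This is a minimal sketch: the variable names CONFLUENCE_BASE_URL, CONFLUENCE_USERNAME, and CONFLUENCE_API_TOKEN and the helper name are assumptions, and the returned dictionary mirrors the one shown above.

```python
import os
from typing import Mapping

# Hypothetical environment variable names chosen for this example.
REQUIRED_VARS = ("CONFLUENCE_BASE_URL", "CONFLUENCE_USERNAME", "CONFLUENCE_API_TOKEN")


def confluence_config_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Build the api_token connector config from environment variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "auth_type": "api_token",
        "credentials": {
            "base_url": env["CONFLUENCE_BASE_URL"].rstrip("/"),
            "username": env["CONFLUENCE_USERNAME"],
            "api_token": env["CONFLUENCE_API_TOKEN"],
        },
    }


# A plain dict stands in for the real environment in this example.
example_env = {
    "CONFLUENCE_BASE_URL": "https://confluence.yourcompany.com/",
    "CONFLUENCE_USERNAME": "user@company.com",
    "CONFLUENCE_API_TOKEN": "your-api-token",
}
config = confluence_config_from_env(example_env)
print(config["credentials"]["base_url"])
```

In real use, calling confluence_config_from_env() with no argument reads from the process environment, and the result can be passed to client.connectors.create("confluence", ...).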

Sync Frequency

Mode	Frequency	Use Case
Scheduled	Every 4 hours	Standard sync
On-demand	Manual trigger	Immediate updates
Webhook	Near real-time	Cloud only
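
For monitoring, it can help to estimate when the next scheduled sync is due. The sketch below assumes the 4-hour interval is measured from the last completed sync, which this document does not state; the scheduler may instead run at fixed clock times.

```python
from datetime import datetime, timedelta, timezone

SYNC_INTERVAL = timedelta(hours=4)  # scheduled frequency from the table above


def next_scheduled_sync(last_sync: datetime) -> datetime:
    """Estimate the next scheduled sync, assuming it runs 4 hours after the last."""
    return last_sync + SYNC_INTERVAL


def is_sync_due(last_sync: datetime, now: datetime) -> bool:
    """True once the estimated next sync time has been reached."""
    return now >= next_scheduled_sync(last_sync)


last = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
print(next_scheduled_sync(last).isoformat())
print(is_sync_due(last, datetime(2024, 1, 1, 11, 59, tzinfo=timezone.utc)))
```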

Troubleshooting

"Space not found" errors

Verify that the connected account has access to the space
Check that the space key is correct (keys are case-sensitive)

Missing page content

Some macros may not be fully supported, so their content can be missing from the synced page
Check whether the page is restricted for the connected account

Large spaces timing out

Sync spaces individually
Increase timeout settings
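
Syncing spaces individually can be scripted so that one slow space does not block the rest. In this sketch, start_sync stands in for whatever call triggers a sync (for example the POST request under Sync Specific Space); the helper name and the simulated timeout are assumptions for illustration.

```python
from typing import Callable, Iterable


def sync_spaces_individually(
    start_sync: Callable[[dict], None],
    space_keys: Iterable[str],
) -> dict:
    """Trigger one incremental sync per space and record the outcome of each."""
    results = {}
    for key in space_keys:
        payload = {"mode": "incremental", "filters": {"space_keys": [key]}}
        try:
            start_sync(payload)
            results[key] = "ok"
        except TimeoutError as exc:
            # Record the failure and continue with the remaining spaces.
            results[key] = f"timed out: {exc}"
    return results


def fake_start_sync(payload: dict) -> None:
    """Stand-in for a real sync call; simulates a timeout for the ENG space."""
    if payload["filters"]["space_keys"] == ["ENG"]:
        raise TimeoutError("simulated timeout")


results = sync_spaces_individually(fake_start_sync, ["TEAM", "DOCS", "ENG"])
print(results)
```

Spaces that report a timeout can then be retried on their own or investigated separately.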

Next Steps

SharePoint Connector - Connect SharePoint
Integrations Overview - Architecture overview
