Knowledge Graph

LH42 builds a knowledge graph automatically by extracting entities (people, organizations, concepts, locations) and the relationships between them from your documents. You can then run graph-based queries, link mentions of the same entity across documents, and discover relationships.

What is a Knowledge Graph?

A knowledge graph represents information as a network of:

Nodes (Entities): People, organizations, concepts, locations, dates, etc.
Edges (Relationships): Connections between entities (e.g., "works_for", "located_in", "reports_to")

[John Smith] --works_for--> [Acme Corp]
[Acme Corp] --located_in--> [San Francisco]
[John Smith] --reports_to--> [Jane Doe]
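
As a minimal sketch of the structure above, written in plain Python and independent of the LH42 SDK, the three example relationships can be stored as an adjacency map keyed by source entity. The names `triples` and `build_adjacency` are illustrative and not part of LH42.

```python
from collections import defaultdict

# The example relationships from above as (source, relationship, target) triples.
triples = [
    ("John Smith", "works_for", "Acme Corp"),
    ("Acme Corp", "located_in", "San Francisco"),
    ("John Smith", "reports_to", "Jane Doe"),
]

def build_adjacency(triples):
    """Map each source entity to its outgoing (relationship, target) pairs."""
    adjacency = defaultdict(list)
    for source, relationship, target in triples:
        adjacency[source].append((relationship, target))
    return dict(adjacency)

adjacency = build_adjacency(triples)
for relationship, target in adjacency["John Smith"]:
    print(f"John Smith --{relationship}--> {target}")
```

Entities that only ever appear as targets (here, "Jane Doe" and "San Francisco") have no key in this map, which is why graph APIs usually return nodes and edges as separate lists.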

Automatic Entity Extraction

When you upload a document, LH42 extracts its entities automatically using a combination of NLP and LLM-based techniques:

python

# Upload a document - entities are extracted automatically
doc = await client.documents.upload(file, metadata={"source": "legal"})

# View extracted entities
graph = await client.entities.get_graph(document_id=doc.id)
for node in graph.nodes:
    print(f"{node.name} ({node.type})")

Entity Types

LH42 classifies each extracted entity into one of the following types:

Type	Examples
`person`	Names, job titles
`organization`	Companies, departments, agencies
`concept`	Technical terms, abstract ideas
`location`	Cities, countries, addresses
`date`	Dates, time periods
`product`	Product names, services
`technology`	Software, frameworks, tools

Querying the Knowledge Graph

Get the Full Graph

python

# Python
graph = await client.entities.get_graph(limit=200)

print(f"Nodes: {graph.stats.total_nodes}")
print(f"Edges: {graph.stats.total_edges}")
print(f"Avg connections: {graph.stats.avg_connections:.2f}")

for node in graph.nodes:
    print(f"- {node.name} ({node.type})")

typescript

// TypeScript
const graph = await client.knowledgeGraph.get();

console.log('Nodes:', graph.stats.totalNodes);
console.log('Edges:', graph.stats.totalEdges);

for (const node of graph.nodes) {
  console.log(node.label, node.type);
}

for (const edge of graph.edges) {
  console.log(edge.source, '->', edge.target, ':', edge.label);
}

Get Graph for a Specific Document

python

graph = await client.entities.get_graph(document_id="doc_123")

typescript

const graph = await client.knowledgeGraph.forDocument('doc-uuid');

Filter by Entity Type

python

# Get only person entities
result = await client.entities.list(entity_type="person")
for entity in result.get("entities", []):
    print(f"- {entity['name']}")

typescript

const people = await client.knowledgeGraph.getNodesByType('person');

Find Related Entities

python

related = await client.entities.get_related(
    "entity_123",
    relationship_types=["works_for", "reports_to"]
)
for rel in related:
    print(f"{rel.source} --{rel.relationship_type}--> {rel.target}")

typescript

const related = await client.knowledgeGraph.getRelatedNodes('node-uuid');
console.log('Inbound:', related.inbound.length);
console.log('Outbound:', related.outbound.length);

Search Entities by Name

python

entities = await client.entities.search(
    "John",
    entity_types=["person"]
)

typescript

const results = await client.knowledgeGraph.searchNodes('machine learning');

REST API

Get Knowledge Graph

bash

GET /api/knowledge-graph?limit=100

# Response
{
  "nodes": [
    {
      "id": "node_abc",
      "label": "John Smith",
      "type": "person",
      "metadata": {
        "normalizedName": "john smith",
        "entityType": "person",
        "documentCount": 5
      }
    }
  ],
  "edges": [
    {
      "source": "node_abc",
      "target": "node_xyz",
      "label": "works_for",
      "weight": 0.95
    }
  ],
  "stats": {
    "totalNodes": 150,
    "totalEdges": 320,
    "avgConnections": 4.2
  }
}
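
Edges refer to nodes by `id`, so printing readable relationships requires a lookup from `id` to `label`. The following is a minimal sketch in plain Python, assuming the response body has been parsed into a dict with the shape shown above. The `node_xyz` entry ("Acme Corp") is sample data added for illustration, and `describe_edges` is a hypothetical helper rather than an SDK function.

```python
import json

# Sample payload in the response shape shown above. The "node_xyz" entry is
# made-up sample data so that the edge has a target to resolve.
payload = json.loads("""
{
  "nodes": [
    {"id": "node_abc", "label": "John Smith", "type": "person"},
    {"id": "node_xyz", "label": "Acme Corp", "type": "organization"}
  ],
  "edges": [
    {"source": "node_abc", "target": "node_xyz", "label": "works_for", "weight": 0.95}
  ]
}
""")

def describe_edges(graph):
    """Return one readable line per edge, resolving node ids to labels."""
    labels = {node["id"]: node["label"] for node in graph["nodes"]}
    lines = []
    for edge in graph["edges"]:
        # Fall back to the raw id if a node is missing from the payload.
        source = labels.get(edge["source"], edge["source"])
        target = labels.get(edge["target"], edge["target"])
        lines.append(f"{source} --{edge['label']}--> {target} (weight {edge['weight']})")
    return lines

for line in describe_edges(payload):
    print(line)
```

The fallback to the raw id matters when a `limit` is applied, since an edge may then point at a node that was not included in the response.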

Get Entities for a Document

bash

GET /api/knowledge-graph?documentId=doc_123&limit=50

List Entities with Pagination

bash

GET /v1/entities?limit=20&entity_type=person&cursor=abc123
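
The `cursor` parameter suggests cursor-based pagination. The loop below is a sketch of that general pattern, not the documented LH42 contract: it assumes each page is a dict with an `entities` list and a `next_cursor` field that is missing or null on the last page. Both field names are assumptions, and `fetch_page` stands in for whatever HTTP call you use.

```python
def iter_entities(fetch_page, limit=20, entity_type=None):
    """Yield entities page by page until no further cursor is returned.

    fetch_page(params) is a caller-supplied function that performs the request
    and returns the parsed JSON body as a dict.
    """
    cursor = None
    while True:
        params = {"limit": limit}
        if entity_type is not None:
            params["entity_type"] = entity_type
        if cursor is not None:
            params["cursor"] = cursor
        page = fetch_page(params)
        yield from page.get("entities", [])
        cursor = page.get("next_cursor")  # assumed field name
        if not cursor:
            break

# Stand-in for the real endpoint: two fake pages keyed by cursor.
FAKE_PAGES = {
    None: {"entities": [{"name": "John Smith"}], "next_cursor": "abc123"},
    "abc123": {"entities": [{"name": "Jane Doe"}], "next_cursor": None},
}

def fake_fetch(params):
    return FAKE_PAGES[params.get("cursor")]

names = [e["name"] for e in iter_entities(fake_fetch, entity_type="person")]
print(names)
```

Writing the loop as a generator lets callers stop early without fetching pages they do not need.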

Get Entity Relationships

bash

GET /v1/entities/{entity_id}/relationships?limit=10

Use Cases

Relationship Discovery

Find hidden connections between entities across your document corpus:

python

# Find all relationships for a person
related = await client.entities.get_related("person_123")

# Print the people they work with or collaborate with
for rel in related:
    if rel.relationship_type in ["works_with", "collaborates_with"]:
        print(f"Collaborator: {rel.target}")
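
One way to surface indirect connections is a breadth-first search over the edge list. The sketch below works on plain `(source, target)` pairs rather than SDK objects, treats relationships as undirected, and uses illustrative entity names. `shortest_path` is a hypothetical helper, not an LH42 function.

```python
from collections import defaultdict, deque

def shortest_path(edges, start, goal):
    """Return the shortest chain of entities linking start to goal, or None."""
    neighbors = defaultdict(set)
    for source, target in edges:
        # Treat relationships as undirected for the purpose of discovery.
        neighbors[source].add(target)
        neighbors[target].add(source)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbors[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

edges = [
    ("John Smith", "Acme Corp"),
    ("Acme Corp", "San Francisco"),
    ("John Smith", "Jane Doe"),
]
print(shortest_path(edges, "Jane Doe", "San Francisco"))
```

Here "Jane Doe" and "San Francisco" share no direct edge, yet the search links them through "John Smith" and "Acme Corp".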

Entity Linking

Link mentions of the same entity across different documents:

python

# Search for an entity across all documents
entities = await client.entities.search("Acme Corporation")

# The knowledge graph normalizes variations:
# "Acme Corp", "Acme Corporation", "ACME" -> same entity
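
How LH42 normalizes names internally is not specified here. Purely as an illustration of the idea, a normalizer might lowercase the name, drop punctuation, and strip common corporate suffixes so that variants collapse to one key. The suffix list and `normalize_name` are assumptions made for this sketch.

```python
import re

# Illustrative suffix list; a real system would need a far richer one.
CORPORATE_SUFFIXES = {"corp", "corporation", "inc", "incorporated", "ltd", "llc"}

def normalize_name(name):
    """Collapse case, punctuation, and trailing corporate suffixes."""
    words = re.sub(r"[^\w\s]", "", name.lower()).split()
    while len(words) > 1 and words[-1] in CORPORATE_SUFFIXES:
        words.pop()
    return " ".join(words)

variants = ["Acme Corp", "Acme Corporation", "ACME", "Acme Corp."]
keys = {normalize_name(v) for v in variants}
print(keys)
```

All four variants map to the same key, so a lookup table built on normalized keys would treat them as one entity.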

Org Chart Extraction

Build organizational hierarchies from document analysis:

python

graph = await client.entities.get_graph()

# Find reporting relationships
for edge in graph.edges:
    if edge.relationship_type == "reports_to":
        print(f"{edge.source} reports to {edge.target}")
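
To turn flat `reports_to` edges into a hierarchy, group employees under their manager and walk the result recursively. The sketch below uses plain `(employee, manager)` tuples instead of SDK edge objects. Apart from the John Smith and Jane Doe pair, the names are made up, and `build_org_chart` and `render` are hypothetical helpers.

```python
from collections import defaultdict

def build_org_chart(reports_to):
    """Group direct reports under each manager from (employee, manager) pairs."""
    direct_reports = defaultdict(list)
    for employee, manager in reports_to:
        direct_reports[manager].append(employee)
    return direct_reports

def render(direct_reports, person, depth=0, seen=None):
    """Return indented lines for person and everyone below them."""
    seen = set() if seen is None else seen
    if person in seen:  # guard against cycles in extracted data
        return []
    seen.add(person)
    lines = ["  " * depth + person]
    for report in sorted(direct_reports.get(person, [])):
        lines.extend(render(direct_reports, report, depth + 1, seen))
    return lines

# Illustrative pairs; only the first comes from the example earlier on this page.
reports_to = [
    ("John Smith", "Jane Doe"),
    ("Alex Lee", "Jane Doe"),
    ("Sam Park", "John Smith"),
]
chart = build_org_chart(reports_to)
print("\n".join(render(chart, "Jane Doe")))
```

The cycle guard is worth keeping because automatically extracted relationships can contain contradictions, such as two people each reported as the other's manager.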

Topic Mapping

Identify key concepts and how they relate:

typescript

const concepts = await client.knowledgeGraph.getNodesByType('concept');
const stats = await client.knowledgeGraph.getStats();

console.log(`Found ${concepts.length} concepts`);
console.log(`Average connections: ${stats.avgConnections}`);
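
A similar ranking can be computed client-side from a graph payload. This Python sketch assumes the REST response shape (nodes with `id`, `label`, `type`; edges with `source`, `target`) and uses made-up sample data, including the `part_of` and `uses` relationship labels. `rank_by_connections` is a hypothetical helper.

```python
from collections import Counter

def rank_by_connections(graph, node_type):
    """Return (label, connection_count) for nodes of node_type, busiest first."""
    degree = Counter()
    for edge in graph["edges"]:
        degree[edge["source"]] += 1
        degree[edge["target"]] += 1
    ranked = [
        (node["label"], degree[node["id"]])
        for node in graph["nodes"]
        if node["type"] == node_type
    ]
    # Sort by descending count, then alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

# Made-up sample graph for illustration.
graph = {
    "nodes": [
        {"id": "n1", "label": "machine learning", "type": "concept"},
        {"id": "n2", "label": "neural networks", "type": "concept"},
        {"id": "n3", "label": "Acme Corp", "type": "organization"},
    ],
    "edges": [
        {"source": "n2", "target": "n1", "label": "part_of"},
        {"source": "n3", "target": "n1", "label": "uses"},
    ],
}
print(rank_by_connections(graph, "concept"))
```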

Best Practices

Use filters - Limit results with `limit` and `entity_type` for better performance
Cache the graph - For visualization, cache the graph and refresh it periodically
Rely on normalization - Entity names are normalized, so search matches variations of a name
Combine with search - Use the knowledge graph alongside hybrid search for richer results
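
The caching advice can be sketched as a small time-to-live wrapper. This is an illustration under stated assumptions, not an LH42 feature: `GraphCache` is a hypothetical class, the loader is a synchronous stand-in for your graph request, and the 300-second lifetime is an arbitrary example value.

```python
import time

class GraphCache:
    """Hold one graph snapshot and reload it once it is older than ttl_seconds."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader      # zero-argument callable that fetches the graph
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value

# Stand-in loader; in practice this would wrap your graph request.
calls = []

def load_graph():
    calls.append(1)
    return {"nodes": [], "edges": []}

cache = GraphCache(load_graph, ttl_seconds=300.0)
cache.get()
cache.get()  # served from the cache, so the loader is not called again
print(len(calls))
```

Using a monotonic clock avoids spurious refreshes or stale data when the system wall clock is adjusted.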

Next Steps

Hybrid Retrieval - Combine graph queries with semantic search
Python SDK - Full Python SDK reference
JavaScript SDK - Full TypeScript SDK reference