Knowledge Triples

KnowledgeGraphs
GraphAI
Various ways to create a knowledge graph.

Open In Colab

Graph AI is in large part about creating and extracting triples so that an LLM can be augmented for a particular question. In the past years (we are in 2024) there has been a lot of flux in this direction:

There is general agreement that graphs lead to more accurate RAG (see Neo4j’s comparison for instance), but it’s equally clear that creating graphs is more demanding than simply dumping and querying vectors. It’s also unclear whether knowledge graphs mean triples (as in RDF) or property graphs. Frameworks typically output triples but store them in property graphs. Many people, on the other hand, highlight the importance of ontologies and the virtues of RDF. The wisdom, as so often, is: it all depends on what you’re after and on lots of details (budget, tech stack, vision…).

There are many examples out there and the essence of graph RAG is not complex. It’s only when you go a few steps further that things become challenging.

Below I approach the basics of graph RAG in various ways. It’s also a set of snippets I try to keep up to date, since the APIs of OpenAI, LangChain and Ollama are constantly changing. Having these basic snippets ready to go helps a lot to experiment and to move on to more sophisticated things.

The essence

Getting triples out of an LLM is really not difficult. You just ask.

You can use the Ollama API or the OpenAI one with the appropriate rewiring to Ollama. You can use all sorts of models and tools (like LM Studio), but the crucial ingredients are the same.

# pip install openai
from openai import OpenAI
# the following is used in the various approaches
content = "Return the triples for the following text: John knows Mary and works at Microsoft. The weather in Brussels is 20 degrees."
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # Ollama ignores the key, but the client requires one
response = client.chat.completions.create(
    model="llama3.1",
    messages=[
        {
            "role": "system", 
            "content": """
        You convert text to a knowledge graphs with the given ontology.
        Return triples in the form of (subject, predicate, object).
        Return the triples and nothing else.
        Do not return the triples if the text does not match the ontology.
        The ontology is:
        - (Person, Knows, Person)
        - (Person, WorksAt, Company)
        - (City, Weather, Temperature)
        
        """},
        {
            "role": "user", 
            "content": content
        }
    ],
    temperature=0
)
print(response.choices[0].message.content)
[(John, Knows, Mary), (John, WorksAt, Microsoft), (Brussels, Weather, 20)]
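
The same request can also be sent through the native Ollama Python client instead of the OpenAI-compatible endpoint. A minimal sketch, assuming the ollama package is installed (the response supports dict-style access in both older and recent versions):

# pip install ollama
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[
        {
            "role": "system",
            "content": "Return triples in the form (subject, predicate, object) and nothing else."
        },
        {"role": "user", "content": content},
    ],
    options={"temperature": 0}  # same deterministic setting as above
)
print(response["message"]["content"])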

Using special models

Some models have been trained specifically to output triples, which increases performance and accuracy. Triplex from SciPhi AI is one such model, and below you can see how it generates the same graph as above, only a lot faster.

The format is not exactly a triple, but converting it is just a post-processing detail (a small parsing sketch follows the output below).

response = client.chat.completions.create(
    model="sciphi/triplex-tiny:latest",
    messages=[       
        {
            "role": "user", 
            "content": """
    Perform Named Entity Recognition (NER) and extract knowledge graph triplets from the text. NER identifies named entities of given entity types, and triple extraction identifies relationships between entities using specified predicates.
    **Entity Types:**
    {"entity_types": ["Person", "City", "Company", "Temperature"]}
    **Predicates:**
    {"predicates": ["Knows", "WorksAt", "Weather"]}
    **Text:**
    
            """ + content
        }
    ],
    temperature=0
)
print(response.choices[0].message.content)
{
    "entities_and_triples": [
        "[1], PERSON:John",
        "[2], PERSON:Mary",
        "[1] KNOWS [2]",
        "[3], COMPANY:Microsoft",
        "[1] WORKS_AT [3]",
        "[4], CITY:Brussels",
        "[5], TEMPERATURE:20 degrees",
        "[4] Weather [5]"
    ]
}
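
Turning this format back into plain triples is indeed just a bit of parsing. A rough sketch, assuming the entity and relation lines keep the shape shown above:

import json
import re

def to_triples(result: dict) -> list:
    # map entity ids to names, then resolve the relation lines against them
    entities, triples = {}, []
    for item in result["entities_and_triples"]:
        # entity lines look like "[1], PERSON:John"
        m = re.match(r"\[(\d+)\],\s*\w+:(.+)", item)
        if m:
            entities[m.group(1)] = m.group(2).strip()
            continue
        # relation lines look like "[1] KNOWS [2]"
        m = re.match(r"\[(\d+)\]\s+(\S+)\s+\[(\d+)\]", item)
        if m:
            triples.append((entities[m.group(1)], m.group(2), entities[m.group(3)]))
    return triples

print(to_triples(json.loads(response.choices[0].message.content)))
# [('John', 'KNOWS', 'Mary'), ('John', 'WORKS_AT', 'Microsoft'), ('Brussels', 'Weather', '20 degrees')]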

LlamaIndex

Frameworks wrap the basic ingredient above in sophisticated (and presumably flexible) classes and constructs. Whether you need this depends on your bigger picture. The more you trust the framework, the less you ‘see’ what is really happening and the more difficult it often is to debug things.

The above snippets extract triples (a.k.a. a knowledge graph), but this does not immediately allow you to query the information or use it for downstream tasks. Typically one uses vector comparisons, and so you need vectors. The conversion from text to vectors is a topic on its own, but the easiest way is to use LLMs here as well: pick an embedding model and it returns vectors.
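
Without any framework this is a one-liner. Ollama also exposes an OpenAI-compatible embeddings endpoint (an assumption about a reasonably recent Ollama version), so the client from the first snippet can produce vectors directly:

# reuse the OpenAI client pointed at Ollama from the first snippet
emb = client.embeddings.create(model="nomic-embed-text", input=content)
vector = emb.data[0].embedding
print(len(vector))  # the embedding dimension (768 for nomic-embed-text)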

LlamaIndex works with Documents and ‘nodes’. These nodes combine text fragments and vectors. So, we start with creating a document:

from llama_index.core import Document
content = "Return the triples for the following text: John knows Mary and works at Microsoft. The weather in Brussels is 20 degrees."
doc = Document(text=content)

The whole of LlamaIndex depends on an LLM and various other bits you can configure globally in the Settings:

import nest_asyncio
nest_asyncio.apply()

#  pip install llama-index-llms-ollama llama-index-embeddings-ollama
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
 
Settings.llm = Ollama(model="llama3.1", request_timeout=300.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text", base_url="http://localhost:11434") 

Here is where things get turned into a graph (via the PropertyGraphIndex), but notice that the schema (entities and relations) is created automatically and the precise prompt is hidden inside the LlamaIndex framework; a way to pin down the schema yourself is shown after the output below:

from llama_index.core import PropertyGraphIndex
index = PropertyGraphIndex.from_documents(
    [doc],   
    show_progress=True,
)
Extracting paths from text: 100%|██████████| 1/1 [00:03<00:00,  3.07s/it]
Extracting implicit paths: 100%|██████████| 1/1 [00:00<00:00, 15141.89it/s]
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  5.37it/s]
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  6.47it/s]
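
If you prefer to keep control over the schema rather than letting LlamaIndex improvise, you can pass an explicit extractor. A sketch using SchemaLLMPathExtractor, with the caveat that these arguments reflect the API at the time of writing and may shift between releases:

from typing import Literal
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

# the same toy ontology as in the first snippet
entities = Literal["PERSON", "CITY", "COMPANY", "TEMPERATURE"]
relations = Literal["KNOWS", "WORKS_AT", "WEATHER"]
validation_schema = {
    "PERSON": ["KNOWS", "WORKS_AT"],
    "CITY": ["WEATHER"],
}

kg_extractor = SchemaLLMPathExtractor(
    llm=Settings.llm,
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    strict=True,  # drop anything that falls outside the schema
)

index = PropertyGraphIndex.from_documents(
    [doc],
    kg_extractors=[kg_extractor],
    show_progress=True,
)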

The good thing is that you don’t need to wonder how to compare vectors, fetch the relevant data and so on: it all happens automatically. You can directly ask a question and underneath the knowledge graph will be consulted:

query_engine = index.as_query_engine(include_text=True)
response = query_engine.query("What temperature in Brussels?")
print(str(response))
20 degrees Celsius.
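
If you do want to see what is consulted underneath, the index also exposes a plain retriever (same API caveat as above) that returns the graph fragments backing the answer:

retriever = index.as_retriever(include_text=False)
nodes = retriever.retrieve("What temperature in Brussels?")
for node in nodes:
    print(node.text)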

This simple example is a great starting point to go beyond and experiment. LlamaIndex is great for this sort of thing and you have tons of connectors and utils (to spend the rest of your life with AI).

LangChain

LangChain is very similar to LlamaIndex but is in general much less focused on knowledge graphs. A straightforward example based on the documentation, with input similar to the one above, does not yield an answer.

# pip install langchain langchain_community langchain_openai langchain_ollama
from langchain_community.graphs.index_creator import GraphIndexCreator
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI
# llm = ChatOllama( model="llama3.1", temperature=0.5)
llm = ChatOpenAI(temperature=0.5, model="gpt-3.5-turbo-0613", max_tokens=1000)

This creates the knowledge graph:

index_creator = GraphIndexCreator(llm=llm)
content = "Return the triples for the following text: John knows Mary and works at Microsoft. The weather in Brussels is 20 degrees."
graph = index_creator.from_text(content)

but do notice the unexpected inversion of predicate and object:

graph.get_triples()
[('John', 'Mary', 'knows'),
 ('John', 'Microsoft', 'works at'),
 ('weather in Brussels', '20 degrees', 'is')]
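
If you want them in the usual (subject, predicate, object) order, a one-line reorder does it:

triples = [(s, p, o) for (s, o, p) in graph.get_triples()]
# [('John', 'knows', 'Mary'), ('John', 'works at', 'Microsoft'), ('weather in Brussels', 'is', '20 degrees')]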

This inversion might explain why this does not get answered:

from langchain.chains import GraphQAChain
chain = GraphQAChain.from_llm(llm, graph=graph, verbose=True)
chain.run("what is the weather in Brussels?")


> Entering new GraphQAChain chain...
Entities Extracted:
Brussels
Full Context:


> Finished chain.
"I don't know."

I have not investigated further what the reason is.

Like many, my opinion is that a bespoke codebase is probably easier to debug and leaner than using premade frameworks like LangChain, at least with respect to ingestion. Agents and workflows (LangGraph and LlamaIndex Workflows) are another topic.