This post is based on Lecture 10, Knowledge Graph Embeddings, of CS224W.
Knowledge Graphs# Knowledge in graph form:
Capture entities, types, and relationships
Nodes are entities
Nodes are labeled with their types
Edges between two nodes capture relationships between entities
Knowledge Graph (KG) is an example of a heterogeneous graph
Bibliographic Networks
Node types: paper, title, author, conference, year
Relation types: pubWhere, pubYear, hasTitle, hasAuthor, cite
Bio Knowledge Graphs
Node types: drug, disease, adverse event, protein, pathways
Relation types: has_func, causes, assoc, treats, is_a
Examples of knowledge graphs:
Google Knowledge Graph
Amazon Product Graph
Facebook Graph API
IBM Watson
Microsoft Satori
Project Hanover/Literome
LinkedIn Knowledge Graph
Yandex Object Answer
(Image source: Building a Knowledge Graph-based Dialogue System at the 2nd ConvAI Summer School)
Datasets# Publicly available KGs:
FreeBase, Wikidata, Dbpedia, YAGO, NELL, etc.
Common characteristics: they are massive and incomplete; enumerating all the possible facts is intractable!
Given an enormous KG, can we complete the KG?
For a given (head, relation), we predict missing tails.
(Note this is slightly different from the link prediction task)
KG Representation# Edges in a KG are represented as triples $(h, r, t)$:
head ($h$) has relation ($r$) with tail ($t$)
Key Idea:
Model entities and relations in the embedding/vector space $\mathbb{R}^d$.
Associate entities and relations with shallow embeddings
Note we do not learn a GNN here!
Given a true triple $(h, r, t)$, the goal is that the embedding of $(h, r)$ should be close to the embedding of $t$.
How to embed $(h, r)$?
How to define closeness?
We are going to learn about different KG embedding models (shallow/transductive embeddings):
Different models are…
…based on different geometric intuitions
…capture different types of relations (have different expressivity)
Relations:#
Symmetric Relation
Antisymmetric Relation
Inverse Relation
Composition (Transitive) Relation
1-to-N Relation
TransE# https://papers.nips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html
Translation Intuition:
For a triple $(h, r, t)$ with $\mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{R}^d$, we want $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ if the given fact is true, else $\mathbf{h} + \mathbf{r} \neq \mathbf{t}$.
Scoring function: $f_r(h, t) = -\| \mathbf{h} + \mathbf{r} - \mathbf{t} \|$
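To make the scoring function concrete, here is a minimal NumPy sketch (the toy dimension and vector names are my own, not from the lecture):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE score: negative L2 distance between h + r and t (0 is the best possible score)."""
    return -np.linalg.norm(h + r - t)

d = 8                                   # embedding dimension (arbitrary for this demo)
rng = np.random.default_rng(0)
h, t = rng.normal(size=d), rng.normal(size=d)

r_true = t - h                          # a relation vector that makes (h, r, t) exactly true
r_fake = rng.normal(size=d)             # an unrelated relation vector

print(transe_score(h, r_true, t))       # 0.0  -> plausible triple
print(transe_score(h, r_fake, t))       # negative -> implausible triple
```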
Relations in a heterogeneous KG have different properties:
Example:
Symmetry: If the edge ($h$, “Roommate”, $t$) exists in the KG, then the edge ($t$, “Roommate”, $h$) should also exist.
Inverse relation: If the edge ($h$, “Advisor”, $t$) exists in the KG, then the edge ($t$, “Advisee”, $h$) should also exist.
Can we categorize these relation patterns?
Are KG embedding methods (e.g., TransE) expressive enough to model these patterns?
Antisymmetric Relations:# $r(h,t) \Rightarrow \neg r(t,h),\ \forall h,t$
Example: Hypernym (an antisymmetric relation)
TransE can model antisymmetric relations: $\mathbf{h} + \mathbf{r} = \mathbf{t}$, but $\mathbf{t} + \mathbf{r} \neq \mathbf{h}$
Inverse Relations:# $r_2(h,t) \Rightarrow r_1(t,h)$
Example: (Advisor, Advisee)
TransE can model inverse relations: if $\mathbf{h} + \mathbf{r}_2 = \mathbf{t}$, we can set $\mathbf{r}_1 = -\mathbf{r}_2$
Composition (Transitive) Relations:# $r_1(x,y) \land r_2(y,z) \Rightarrow r_3(x,z),\ \forall x,y,z$
Example: My mother’s husband is my father.
TransE can model composition relations: $\mathbf{r}_3 = \mathbf{r}_1 + \mathbf{r}_2$
1-to-N Relations:# $r(h, t_1), r(h, t_2), \dots, r(h, t_n)$ are all true.
Example: $(h, r, t_1)$ and $(h, r, t_2)$ both exist in the knowledge graph, e.g., $r$ is “StudentsOf”
TransE cannot model 1-to-N relations: $\mathbf{t}_1$ and $\mathbf{t}_2$ will map to the same vector, although they are different entities
$\mathbf{t}_1 = \mathbf{h} + \mathbf{r} = \mathbf{t}_2$, contradicting $\mathbf{t}_1 \neq \mathbf{t}_2$
Symmetric Relations:# $r(h,t) \Rightarrow r(t,h),\ \forall h,t$
Example: Family, Roommate
TransE cannot model symmetric relations:
$\mathbf{h} + \mathbf{r} = \mathbf{t}$ and $\mathbf{t} + \mathbf{r} = \mathbf{h}$ can both hold only if $\mathbf{r} = 0$ and $\mathbf{h} = \mathbf{t}$
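The patterns above can also be checked numerically. The following sketch (with arbitrary toy vectors of my own choosing) shows that inverse and composition relations can be satisfied exactly, while the comments note why symmetry and 1-to-N fail:

```python
import numpy as np

rng = np.random.default_rng(42)
d = 8
h, t = rng.normal(size=d), rng.normal(size=d)

def score(h, r, t):
    """TransE score: 0 is a perfect match, more negative means less plausible."""
    return -np.linalg.norm(h + r - t)

# Inverse relations: with r2 = -r1, (t, r2, h) is exact whenever (h, r1, t) is.
r1 = t - h
r2 = -r1
print(score(h, r1, t), score(t, r2, h))        # 0.0 0.0

# Composition: r3 = r1 + r2 makes the two-hop triple (x, r3, z) exact.
x = rng.normal(size=d)
ra, rb = rng.normal(size=d), rng.normal(size=d)
y = x + ra                                     # intermediate entity
z = y + rb
print(score(x, ra + rb, z))                    # 0.0

# Symmetry would need h + r = t and t + r = h simultaneously, i.e. r = 0 and h = t.
# A 1-to-N relation would need t1 = h + r = t2, i.e. distinct tails collapse to one embedding.
```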
TransR# Learning Entity and Relation Embeddings for Knowledge Graph Completion
TransE models translation of any relation in the same embedding space.
Can we design a new space for each relation and do translation in relation-specific space?
TransR: model entities as vectors in the entity space $\mathbb{R}^d$ and model each relation as a vector $\mathbf{r} \in \mathbb{R}^k$ in relation space, with $\mathbf{M}_r \in \mathbb{R}^{k \times d}$ as the projection matrix.
$\mathbf{h}_\perp = \mathbf{M}_r \mathbf{h}$, $\mathbf{t}_\perp = \mathbf{M}_r \mathbf{t}$
Score function: $f_r(h, t) = -\| \mathbf{h}_\perp + \mathbf{r} - \mathbf{t}_\perp \|$
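A minimal NumPy sketch of the TransR score under these definitions (the projection matrix and dimensions are toy values of my own choosing):

```python
import numpy as np

def transr_score(h, r, t, M_r):
    """TransR score: project entities into the relation space with M_r, then translate by r."""
    h_proj = M_r @ h                    # h_perp in R^k
    t_proj = M_r @ t                    # t_perp in R^k
    return -np.linalg.norm(h_proj + r - t_proj)

d, k = 6, 4                             # entity space R^d, relation space R^k (toy sizes)
rng = np.random.default_rng(0)
h, t = rng.normal(size=d), rng.normal(size=d)
M_r = rng.normal(size=(k, d))           # relation-specific projection matrix
r = M_r @ t - M_r @ h                   # a relation vector that makes (h, r, t) exactly true

print(transr_score(h, r, t, M_r))       # 0.0
```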
Mark
Specifically, in the equation $\mathbf{h}_\perp = \mathbf{M}_r \mathbf{h}$, the vector $\mathbf{h}_\perp$ is simply the image of $\mathbf{h}$ under the relation-specific projection matrix $\mathbf{M}_r$, i.e., the entity embedding projected into the relation space. The subscript $\perp$ only distinguishes the projected vector from the original one.
In general mathematical notation, the symbol $\perp$ denotes orthogonality or perpendicularity between two objects, such as vectors or lines. In the TransR equations, however, it is used in a non-standard way: as a label for the projected vectors rather than as a statement about orthogonality.
Note different symmetric relations may have different $\mathbf{M}_r$.
Symmetric Relations:# $r(h,t) \Rightarrow r(t,h),\ \forall h,t$
Example: Family, Roommate
TransR can model symmetric relations:
$\mathbf{r} = 0,\ \mathbf{h}_\perp = \mathbf{M}_r \mathbf{h} = \mathbf{M}_r \mathbf{t} = \mathbf{t}_\perp$
Antisymmetric Relations:#
TransR can model antisymmetric relations:
$\mathbf{r} \neq 0,\ \mathbf{M}_r \mathbf{h} + \mathbf{r} = \mathbf{M}_r \mathbf{t}$. Then $\mathbf{M}_r \mathbf{t} + \mathbf{r} \neq \mathbf{M}_r \mathbf{h}$
1-to-N Relations:# $r(h, t_1), r(h, t_2), \dots, r(h, t_n)$ are all true.
TransR can model 1-to-N relations: we can learn $\mathbf{M}_r$ so that $\mathbf{t}_\perp = \mathbf{M}_r \mathbf{t}_1 = \mathbf{M}_r \mathbf{t}_2$, while $\mathbf{t}_1$ does not need to equal $\mathbf{t}_2$.
Inverse Relations:#
TransR can model inverse relations:
$\mathbf{r}_2 = -\mathbf{r}_1$, $\mathbf{M}_{r_1} = \mathbf{M}_{r_2}$. Then $\mathbf{M}_{r_1} \mathbf{t} + \mathbf{r}_1 = \mathbf{M}_{r_1} \mathbf{h}$ and $\mathbf{M}_{r_2} \mathbf{h} + \mathbf{r}_2 = \mathbf{M}_{r_2} \mathbf{t}$
Composition Relations:#
TransR can model composition relations
High-level intuition: TransR models a triple with linear functions, so they are chainable.
Assume $\mathbf{M}_{r_1} \mathbf{g}_1 = \mathbf{r}_1$ and $\mathbf{M}_{r_2} \mathbf{g}_2 = \mathbf{r}_2$ (i.e., $\mathbf{g}_i$ is any vector that $\mathbf{M}_{r_i}$ maps to $\mathbf{r}_i$).
For $r_1(x, y)$:
$$\begin{aligned}
&\mathbf{M}_{r_1} \mathbf{x} + \mathbf{r}_1 = \mathbf{M}_{r_1} \mathbf{y} \rightarrow\\
&\mathbf{y} - \mathbf{x} \in \mathbf{g}_1 + \text{Ker}(\mathbf{M}_{r_1}) \rightarrow\\
&\mathbf{y} \in \mathbf{x} + \mathbf{g}_1 + \text{Ker}(\mathbf{M}_{r_1})
\end{aligned}$$
For $r_2(y, z)$:
$$\begin{aligned}
&\mathbf{M}_{r_2} \mathbf{y} + \mathbf{r}_2 = \mathbf{M}_{r_2} \mathbf{z} \rightarrow\\
&\mathbf{z} - \mathbf{y} \in \mathbf{g}_2 + \text{Ker}(\mathbf{M}_{r_2}) \rightarrow\\
&\mathbf{z} \in \mathbf{y} + \mathbf{g}_2 + \text{Ker}(\mathbf{M}_{r_2})
\end{aligned}$$
Then,
$$\mathbf{z} \in \mathbf{x} + \mathbf{g}_1 + \mathbf{g}_2 + \text{Ker}(\mathbf{M}_{r_1}) + \text{Ker}(\mathbf{M}_{r_2})$$
Construct $\mathbf{M}_{r_3}$ s.t. $\text{Ker}(\mathbf{M}_{r_3}) = \text{Ker}(\mathbf{M}_{r_1}) + \text{Ker}(\mathbf{M}_{r_2})$
Since
- $\dim\left(\text{Ker}(\mathbf{M}_{r_3})\right) \ge \dim\left(\text{Ker}(\mathbf{M}_{r_1})\right)$
- $\mathbf{M}_{r_3}$ has the same shape as $\mathbf{M}_{r_1}$

we know $\mathbf{M}_{r_3}$ exists!
Set $\mathbf{r}_3 = \mathbf{M}_{r_3}(\mathbf{g}_1 + \mathbf{g}_2)$
We are done! We have $\mathbf{M}_{r_3} \mathbf{x} + \mathbf{r}_3 = \mathbf{M}_{r_3} \mathbf{z}$
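As a sanity check on this construction, the following NumPy sketch (all matrices, dimensions, and the `null_space` helper are hypothetical choices of mine) builds $\mathbf{M}_{r_3}$ and $\mathbf{r}_3$ exactly as above and verifies $\mathbf{M}_{r_3}\mathbf{x} + \mathbf{r}_3 = \mathbf{M}_{r_3}\mathbf{z}$:

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Return an orthonormal basis (as columns) of the kernel of M, computed via SVD."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

rng = np.random.default_rng(0)
d, k = 4, 3                                   # toy entity / relation space dimensions
M1, M2 = rng.normal(size=(k, d)), rng.normal(size=(k, d))
r1, r2 = rng.normal(size=k), rng.normal(size=k)

# g_i is any pre-image of r_i under M_i (least squares gives one, since M_i has full row rank here)
g1, *_ = np.linalg.lstsq(M1, r1, rcond=None)
g2, *_ = np.linalg.lstsq(M2, r2, rcond=None)

# Build a chain x --r1--> y --r2--> z that satisfies the TransR equations exactly;
# any offset inside the kernels is allowed.
K1, K2 = null_space(M1), null_space(M2)
x = rng.normal(size=d)
y = x + g1 + K1 @ rng.normal(size=K1.shape[1])
z = y + g2 + K2 @ rng.normal(size=K2.shape[1])

# Construct M3 whose kernel is Ker(M1) + Ker(M2): its rows span the orthogonal
# complement of that sum (remaining rows stay zero so M3 has the same shape as M1).
S = np.hstack([K1, K2])
complement = null_space(S.T)                  # basis of the orthogonal complement of span(S)
M3 = np.zeros((k, d))
M3[:complement.shape[1]] = complement.T
r3 = M3 @ (g1 + g2)

print(np.allclose(M3 @ x + r3, M3 @ z))       # True: r3(x, z) holds under TransR
```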
DistMult# Embedding Entities and Relations for Learning and Inference in Knowledge Bases
https://arxiv.org/abs/1412.6575
Mark
The idea is that DistMult learns to place these hyperplanes so that
the similarity between the heads and tails of known (true) triples is maximized, and
the similarity between the heads and tails of incorrect triples is minimized.
This allows the model to make accurate predictions about the missing element in a given triple.
In DistMult, each (head, relation) pair is represented by a hyperplane, a geometric object that can be thought of as a flat, (n-1)-dimensional surface that divides an n-dimensional space into two regions. The hyperplane is defined by a set of parameters that determine its orientation and position in the space.
Mark
In the context of knowledge representation and reasoning, a “triple” typically refers to a statement in the form (subject, relation, object), where the relation is the predicate that connects the subject and object entities.
For example, in the knowledge graph of movies, a triple might be (The Matrix, hasGenre, Action). In this triple, “The Matrix” is the subject, “hasGenre” is the relation, and “Action” is the object.
Recall that the scoring function $f_r(h, t)$ is the negative of an L1/L2 distance in TransE and TransR.
DistMult models entities and relations using vectors in $\mathbb{R}^k$.
Score function: $f_r(h, t) = \langle \mathbf{h}, \mathbf{r}, \mathbf{t} \rangle = \sum_i \mathbf{h}_i \cdot \mathbf{r}_i \cdot \mathbf{t}_i, \quad \mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{R}^k$
Bilinear modeling
Intuition of the score function: it can be viewed as a cosine similarity between $\mathbf{h} \circ \mathbf{r}$ and $\mathbf{t}$, where $\circ$ denotes the element-wise product.
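A minimal NumPy sketch of the DistMult score (toy vectors of my own choosing); note already that swapping head and tail leaves the score unchanged, which matters for the relation patterns discussed below:

```python
import numpy as np

def distmult_score(h, r, t):
    """DistMult score: sum_i h_i * r_i * t_i (a bilinear form with a diagonal relation matrix)."""
    return np.sum(h * r * t)

k = 8
rng = np.random.default_rng(0)
h, r, t = rng.normal(size=k), rng.normal(size=k), rng.normal(size=k)

print(distmult_score(h, r, t))   # some value s
print(distmult_score(t, r, h))   # exactly the same value s: symmetric by construction
```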
1-to-N Relations:# $r(h, t_1), r(h, t_2), \dots, r(h, t_n)$ are all true.
DistMult can model 1-to-N relations: $\langle \mathbf{h}, \mathbf{r}, \mathbf{t}_1 \rangle = \langle \mathbf{h}, \mathbf{r}, \mathbf{t}_2 \rangle$ can hold even though $\mathbf{t}_1 \neq \mathbf{t}_2$.
Symmetric Relations:# $r(h,t) \Rightarrow r(t,h),\ \forall h,t$
$f_r(h, t) = \langle \mathbf{h}, \mathbf{r}, \mathbf{t} \rangle = \sum_i \mathbf{h}_i \cdot \mathbf{r}_i \cdot \mathbf{t}_i = \langle \mathbf{t}, \mathbf{r}, \mathbf{h} \rangle = f_r(t, h)$
Then $r(h, t)$ and $r(t, h)$ always have the same score, so DistMult can model symmetric relations, but it cannot model antisymmetric relations.
Inverse relations are also not possible:
$f_{r_2}(h, t) = \langle \mathbf{h}, \mathbf{r}_2, \mathbf{t} \rangle = \langle \mathbf{t}, \mathbf{r}_1, \mathbf{h} \rangle = f_{r_1}(t, h)$
This would force $\mathbf{r}_2 = \mathbf{r}_1$.
But semantically this does not make sense: the embedding of “Advisor” should not be the same as that of “Advisee”.
Composition Relations:# $r_1(x,y) \land r_2(y,z) \Rightarrow r_3(x,z),\ \forall x,y,z$
DistMult cannot model composition relations.
Mark
To understand this limitation, it may be helpful to think of the hyperplane as a boundary that separates two regions of the space. When two relations are combined in a multi-hop path, the resulting boundary is more complex and may not be able to be represented by a single hyperplane. This means that the DistMult model may not be able to learn the correct relationship between the head and tail of the triple in these cases, and may make inaccurate predictions.
The score can be thought of as the dot product of the tail vector with $\mathbf{h} \circ \mathbf{r}$, so each (head, relation) pair defines a hyperplane of equally scored tails.
ComplEx# Trouillon et al., Complex Embeddings for Simple Link Prediction, ICML 2016
Based on DistMult, ComplEx embeds entities and relations in a complex vector space.
ComplEx: model entities and relations using vectors in $\mathbb{C}^k$
(“Imaginary” here refers to the imaginary part of a complex number.)
Score function: $f_r(h, t) = \text{Re}\left(\sum_i \mathbf{h}_i \cdot \mathbf{r}_i \cdot \bar{\mathbf{t}}_i\right)$
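A minimal NumPy sketch of the ComplEx score with toy complex vectors (names and sizes are my own); the complex conjugate makes the score asymmetric in general, but symmetric when $\text{Im}(\mathbf{r}) = 0$:

```python
import numpy as np

def complex_score(h, r, t):
    """ComplEx score: Re( sum_i h_i * r_i * conj(t_i) )."""
    return np.real(np.sum(h * r * np.conj(t)))

k = 8
rng = np.random.default_rng(0)
h = rng.normal(size=k) + 1j * rng.normal(size=k)
r = rng.normal(size=k) + 1j * rng.normal(size=k)
t = rng.normal(size=k) + 1j * rng.normal(size=k)

print(complex_score(h, r, t), complex_score(t, r, h))    # generally different: asymmetric scoring

r_real = rng.normal(size=k) + 0j                          # Im(r) = 0
print(np.isclose(complex_score(h, r_real, t),
                 complex_score(t, r_real, h)))            # True: symmetric when r is real
```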
Antisymmetric Relations:#
The model is expressive enough to learn:
high $f_r(h, t) = \text{Re}\left(\sum_i \mathbf{h}_i \cdot \mathbf{r}_i \cdot \bar{\mathbf{t}}_i\right)$
low $f_r(t, h) = \text{Re}\left(\sum_i \mathbf{t}_i \cdot \mathbf{r}_i \cdot \bar{\mathbf{h}}_i\right)$
This is due to the asymmetric modeling with the complex conjugate.
Symmetric Relations:# $r(h,t) \Rightarrow r(t,h),\ \forall h,t$
When $\text{Im}(\mathbf{r}) = 0$, we have
$$\begin{aligned}
f_r(h, t) &= \text{Re}\left(\sum_i \mathbf{h}_i \cdot \mathbf{r}_i \cdot \bar{\mathbf{t}}_i\right)\\
&= \sum_i \text{Re}(\mathbf{r}_i \cdot \mathbf{h}_i \cdot \bar{\mathbf{t}}_i) \\
&= \sum_i \mathbf{r}_i \cdot \text{Re}(\mathbf{h}_i \cdot \bar{\mathbf{t}}_i) \\
&= \sum_i \mathbf{r}_i \cdot \text{Re}(\bar{\mathbf{h}}_i \cdot \mathbf{t}_i) \\
&= \sum_i \text{Re}(\mathbf{r}_i \cdot \bar{\mathbf{h}}_i \cdot \mathbf{t}_i) = f_r(t, h)
\end{aligned}$$
Inverse Relations:#
ComplEx can model inverse relations: $\mathbf{r}_1 = \bar{\mathbf{r}}_2$
$\mathbf{r}_2 = \argmax_r \text{Re}(\langle \mathbf{h}, \mathbf{r}, \bar{\mathbf{t}} \rangle)$ and $\mathbf{r}_1 = \argmax_r \text{Re}(\langle \mathbf{t}, \mathbf{r}, \bar{\mathbf{h}} \rangle)$, and the two maximizers are complex conjugates of each other.
For the remaining patterns, ComplEx shares the same properties as DistMult:
It cannot model composition relations.
It can model 1-to-N relations.
KG Embeddings in Practice#
Different KGs may have drastically different relation patterns!
There is no general embedding that works for all KGs; use the table to select models.
Try TransE for a quick run if the target KG does not have many symmetric relations.
Then use more expressive models, e.g., ComplEx or RotatE (TransE in complex space).
📌 Link prediction / Graph completion is one of the prominent tasks on knowledge graphs
Mark
A linear layer is a type of neural network layer that performs a linear transformation of the input vector, which is often used to transform the input vector into a higher-dimensional space. This transformation can be thought of as defining a hyperplane in the new space that separates the input vectors into two regions.
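For illustration, a tiny NumPy sketch of such a linear transformation with toy sizes of my own choosing (3-dimensional input, 5-dimensional output):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))   # weight matrix of the linear layer
b = rng.normal(size=5)        # bias
x = rng.normal(size=3)        # input vector

y = W @ x + b                 # linear transformation into the 5-dimensional space
# Each output coordinate w_i . x + b_i = 0 defines a hyperplane that splits the input space in two.
print(y.shape)                # (5,)
```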