Machine learning with Graphs#
Overview#
Different types of tasks:
- Node level
- Edge level
- Graph level
Choice of a graph representation:
- Directed, undirected, bipartite, weighted, adjacency matrix
Applications and use cases:
- Node classification: Predict a property of a node
- Example: Categorize online users / items
- Link prediction: Predict whether there are missing links between two nodes
- Example: Knowledge graph completion
- Graph classification: Categorize different graphs
- Example: Molecule property prediction
- Clustering: Detect if nodes form a community
- Example: Social circle detection
- Other tasks:
- Graph generation: Drug discovery
- Graph evolution: Physical simulation
Node-level tasks#
- AlphaFold - Spatial graph
- nodes: amino acids in a protein sequence
- edges: proximity between amino acids (residues)
Edge-level tasks#
Recommender system#
Users interact with items:
- Watch movies, buy merchandise, listen to music
- Nodes: Users and items
- Edges: User-item interactions
Goal: Recommend items users might like
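The user-item setup above can be sketched as a bipartite graph in which recommendation becomes link prediction. The following is a minimal illustration only: the interaction data and the `recommend` helper are hypothetical, and the path-counting score is a simple baseline chosen for clarity, not a method these notes prescribe.

```python
from collections import defaultdict

# Hypothetical toy interactions (user, item); not taken from the notes.
interactions = [
    ("alice", "movie_a"),
    ("alice", "movie_b"),
    ("bob", "movie_a"),
    ("bob", "movie_c"),
    ("carol", "movie_b"),
    ("carol", "movie_c"),
    ("carol", "movie_d"),
]

# Bipartite graph stored as two adjacency maps: user -> items, item -> users.
items_of = defaultdict(set)
users_of = defaultdict(set)
for user, item in interactions:
    items_of[user].add(item)
    users_of[item].add(user)


def recommend(user, k=2):
    """Rank unseen items by the number of length-3 paths
    user -> item -> other user -> candidate item."""
    scores = defaultdict(int)
    for item in items_of[user]:
        for other in users_of[item]:
            if other == user:
                continue
            for candidate in items_of[other]:
                if candidate not in items_of[user]:
                    scores[candidate] += 1
    # Sort by score (descending), then by name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


print(recommend("alice"))
```

Each recommended item corresponds to a user-item edge that is currently missing from the graph, which is why this is treated as an edge-level task.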
Drug Side Effects#
Many patients take multiple drugs to treat complex or co-existing diseases:
- 46% of people ages 70-79 take more than 5 drugs
- Many patients take more than 20 drugs to treat heart disease, depression, insomnia, etc.
Task: Given a pair of drugs, predict adverse side effects
Biomedical Graph Link Prediction#
Nodes: Drugs & Proteins
Edges: Interactions
Subgraph-level tasks#
Traffic prediction#
Road Network as a Graph
Nodes: road segments
Edges: Connectivity between road segments
Prediction: Estimated Time of Arrival (ETA)
Graph-level tasks#
Drug Discovery#
A molecule is a graph.
Nodes: Atoms
Edges: Chemical bonds
- Graph classification (predict activity, for virtual high-throughput screening, vHTS)
- Molecule Generation
- Molecule Optimization
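As a small illustration of "atoms as nodes, bonds as edges", the sketch below encodes a water molecule. The variable names and the two helper functions are assumptions made for this example. Real pipelines usually derive such graphs from a chemistry toolkit rather than writing them by hand.

```python
from collections import Counter

# Water (H2O): one oxygen atom bonded to two hydrogen atoms.
atoms = ["O", "H", "H"]            # node i carries the element symbol atoms[i]
bonds = [(0, 1, 1), (0, 2, 1)]     # (atom i, atom j, bond order); both are single bonds


def degrees(num_atoms, bonds):
    """Node-level feature: number of bonds attached to each atom."""
    deg = [0] * num_atoms
    for i, j, _order in bonds:
        deg[i] += 1
        deg[j] += 1
    return deg


def element_counts(atoms):
    """Graph-level feature: how many atoms of each element the molecule contains."""
    return Counter(atoms)


print(degrees(len(atoms), bonds))
print(element_counts(atoms))
```

A graph-level task such as activity prediction would map the whole structure (all nodes and edges together) to a single label, in contrast to the per-atom degree feature computed here.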
Physics Simulation#
Nodes: Particles
Edges: Interaction between particles
Representing Graphs#
Undirected
- Links: undirected (symmetrical, reciprocal)
- Examples:
- Collaborations
- Friendship on Facebook
Directed
- Links: directed (arcs)
- Examples:
- Phone calls
- Following on Twitter
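The difference between the two link types can be sketched with an adjacency list. This is a minimal example with hypothetical node names; `build_adjacency` is an illustrative helper, not part of any library.

```python
def build_adjacency(edges, directed):
    """Return a dict mapping each node to the set of its (out-)neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())      # make sure v appears even with no outgoing links
        if not directed:
            adj[v].add(u)             # undirected: the link is reciprocal
    return adj


edges = [("ann", "ben"), ("ben", "cat")]

friends = build_adjacency(edges, directed=False)  # e.g. friendship
follows = build_adjacency(edges, directed=True)   # e.g. following

print(friends["ben"])   # ben is linked to both ann and cat
print(follows["ben"])   # ben only points to cat
```

With the same edge list, the undirected graph stores every link in both directions, while the directed graph keeps only the source-to-target arc.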
Heterogeneous graph#
$G = (V, E, R, T)$
Nodes with node types $v_i \in V$
Edges with relation types $(v_i, r, v_j) \in E$
Node type $T(v_i)$
Relation type $r \in R$
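One possible way to hold the four components in code is sketched below. The `HeteroGraph` class, its method names, and the drug/protein identifiers are all hypothetical and chosen only to mirror the definition above. Graph learning libraries provide their own heterogeneous graph containers with different interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    node_type: dict = field(default_factory=dict)  # T: maps each node in V to its type
    edges: set = field(default_factory=set)        # E: set of (v_i, r, v_j) triples

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, src, relation, dst):
        # Both endpoints must already be typed nodes.
        if src not in self.node_type or dst not in self.node_type:
            raise KeyError("add both endpoints with add_node first")
        self.edges.add((src, relation, dst))

    @property
    def relations(self):
        """R: the set of relation types that appear on at least one edge."""
        return {r for _, r, _ in self.edges}

    def neighbors(self, node, relation):
        """Targets reachable from `node` through edges of one relation type."""
        return {dst for src, r, dst in self.edges if src == node and r == relation}


g = HeteroGraph()
g.add_node("drug_a", "drug")        # placeholder identifiers, not real compounds
g.add_node("drug_b", "drug")
g.add_node("protein_x", "protein")
g.add_edge("drug_a", "targets", "protein_x")
g.add_edge("drug_a", "interacts_with", "drug_b")

print(g.relations)
print(g.neighbors("drug_a", "targets"))
```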
Adjacency Matrix#
$A_{ij} = 1$ if there is an edge from node $i$ to node $j$
$A_{ij} = 0$ otherwise
The adjacency matrix of an undirected graph is symmetric.
The adjacency matrix of a directed graph is generally not symmetric.
Most real-world networks are sparse.
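The definition, the symmetry property, and the notion of sparsity can be checked on a toy graph. This is a sketch under simple assumptions: nodes are numbered 0 to n-1, the matrix is a plain list of lists, and the helper names are illustrative.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix with A[i][j] = 1 if there is an edge from i to j."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1   # an undirected edge is stored in both directions
    return A


def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


def density(A):
    """Fraction of nonzero entries among all n * n entries."""
    n = len(A)
    nonzero = sum(1 for row in A for x in row if x != 0)
    return nonzero / (n * n)


edges = [(0, 1), (1, 2), (2, 3)]           # a path on four nodes
A_und = adjacency_matrix(4, edges)
A_dir = adjacency_matrix(4, edges, directed=True)

print(is_symmetric(A_und), is_symmetric(A_dir))
print(density(A_und), density(A_dir))
```

For large sparse networks, almost all entries are zero, so storing only the existing edges (an edge list or adjacency list) is typically far more memory-efficient than a dense n x n matrix.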
Unweighted vs Weighted
Self-edge, Multigraph
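These variants change what an adjacency-matrix entry stores. The sketch below assumes undirected graphs and uses hypothetical helper names. It also counts a self-edge once on the diagonal, which is only one of the conventions in use (some texts count it twice).

```python
def weighted_matrix(n, weighted_edges):
    """Weighted graph: A[i][j] holds the edge weight, 0.0 if there is no edge."""
    A = [[0.0] * n for _ in range(n)]
    for i, j, w in weighted_edges:
        A[i][j] = w
        A[j][i] = w
    return A


def multigraph_matrix(n, edges):
    """Multigraph: A[i][j] counts parallel edges; a self-edge lands on the diagonal."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] += 1
        if i != j:
            A[j][i] += 1
    return A


W = weighted_matrix(3, [(0, 1, 0.5), (1, 2, 2.0)])
M = multigraph_matrix(3, [(0, 1), (0, 1), (2, 2)])  # two parallel edges and one self-edge

print(W[0][1], W[1][2])
print(M[0][1], M[2][2])
```

In the unweighted simple-graph case, every entry is 0 or 1 and the diagonal is all zeros.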