Learning combinatorial optimization algorithms over graphs
NIPS 2017 9-22-2020
Motivation
Current methods for combinatorial optimization problems cannot learn from experience.
Can we learn from experience?
Given a graph optimization problem and a distribution D of problem instances, can we learn better greedy heuristics that generalize to unseen instances drawn from D?
Three common greedy algorithms on Graphs
Given a weighted graph G = (V, E, w), where w : E → ℝ⁺ is the edge weight function and w(u, v) is the weight of edge (u, v).
Minimum Vertex Cover (MVC):
Given a graph G, find a subset of nodes S ⊆ V s.t. every edge is covered, i.e., for each (u, v) ∈ E, u ∈ S or v ∈ S.
|S| is minimized.
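As a concrete baseline, the textbook greedy heuristic for MVC (repeatedly pick the node touching the most uncovered edges) can be sketched as follows; the function name and edge-list encoding are my own:

```python
def greedy_vertex_cover(edges):
    """Greedy MVC heuristic: repeatedly add the node that
    covers the most currently uncovered edges."""
    cover = set()
    uncovered = set(edges)
    while uncovered:
        # count how many uncovered edges each node touches
        degree = {}
        for u, v in uncovered:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        best = max(degree, key=degree.get)
        cover.add(best)
        uncovered = {(u, v) for (u, v) in uncovered if best not in (u, v)}
    return cover

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
print(greedy_vertex_cover(edges))  # prints a valid cover, e.g. {0, 2}
```

This hand-designed rule is exactly the kind of heuristic the paper proposes to replace with a learned scoring function.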
Maximum Cut (MAXCUT):
Given a graph G, find a subset of nodes S ⊆ V s.t. the weight of the cut-set, Σ_{(u,v)∈C} w(u, v), is maximized.
cut-set: C = {(u, v) ∈ E | u ∈ S, v ∈ V \ S}, the set of edges with one end in S and the other end in V \ S.
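The cut-set weight being maximized is easy to evaluate directly; a minimal sketch (names are mine), assuming edges are stored as tuples with a weight dict:

```python
def cut_weight(S, edges, w):
    """Weight of the cut-set: sum over edges with exactly one endpoint in S."""
    return sum(w[e] for e in edges if (e[0] in S) != (e[1] in S))

edges = [(0, 1), (1, 2), (0, 2)]
w = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}
print(cut_weight({0}, edges, w))  # 1.0 + 3.0 = 4.0
```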
Traveling Salesman Problem (TSP):
Given a graph G, find a closed tour of minimum total weight that visits each node exactly once.
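For TSP the objective is the total length of a closed tour; a small helper to evaluate it over 2D points (names are my own):

```python
import math

def tour_length(points, tour):
    """Total Euclidean length of a closed tour visiting each node once."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))

pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(tour_length(pts, [0, 1, 2, 3]))  # 4.0, the unit-square perimeter
```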
Overview
basic idea
1. start from the original graph with an initial (empty) state
2. feed the graph into an embedding model (structure2vec) for T steps
3. each node gets a score (Q-value) based on its structure2vec embedding
4. add the node with the highest score to the partial solution (one iteration)
5. go back to step 2 and repeat until termination
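The loop above can be sketched as follows, with `embed`, `score`, and `terminated` standing in for structure2vec, the learned Q-function, and the problem-specific stopping rule (all names are hypothetical):

```python
def greedy_solve(graph, embed, score, terminated, T=4):
    S = []                       # partial solution (ordered list of nodes)
    while not terminated(graph, S):
        mu = embed(graph, S, T)  # run structure2vec for T propagation steps
        candidates = [v for v in graph.nodes if v not in S]
        v = max(candidates, key=lambda v: score(mu, v))  # highest Q-value
        S.append(v)              # greedily extend the partial solution
    return S

class G:                         # tiny stand-in graph for a smoke test
    nodes = [0, 1, 2]

sol = greedy_solve(
    G,
    embed=lambda g, S, T: {v: -v for v in g.nodes},  # dummy embeddings
    score=lambda mu, v: mu[v],                       # dummy Q-function
    terminated=lambda g, S: len(S) == 2)
print(sol)  # [0, 1]
```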
Relation to Q-learning (taking MVC for example)
| Reinforcement Learning (e.g., Atari) | MVC (Minimum Vertex Cover) |
| --- | --- |
| Reward: score we earn at the current step | Reward: −1 for each node added, so maximizing total reward minimizes the cover size |
| State: current screen | State: the set of currently selected nodes S |
| Action: move your paddle left/right | Action: select a node v ∉ S |
| Action-value function Q(s, a): predicted future total reward | Q̂(h(S), v): parameterized by the structure2vec embedding |
How to choose the action: v* = argmax_{v ∈ V\S} Q̂(h(S), v), with ε-greedy exploration during training.
How to represent the node embedding (structure2vec, iterated for t = 0, …, T−1):
μ_v^(t+1) = relu( θ1 x_v + θ2 Σ_{u∈N(v)} μ_u^(t) + θ3 Σ_{u∈N(v)} relu(θ4 · w(v, u)) )
How to measure the Q-value for each node:
Q̂(h(S), v; Θ) = θ5ᵀ relu( [ θ6 Σ_{u∈V} μ_u^(T) , θ7 μ_v^(T) ] )
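A toy NumPy sketch of the embedding and Q-value computations, with random (untrained) parameters θ1–θ7 purely for illustration; in the paper these are learned end-to-end with fitted Q-learning:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 5, 8, 4                        # nodes, embedding dim, propagation rounds
A = (rng.random((n, n)) < 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T           # random undirected adjacency matrix
W = A * rng.random((n, n)); W = np.maximum(W, W.T)  # symmetric edge weights
x = np.array([1., 0., 0., 1., 0.])       # x_v = 1 if node v is already in S

# random stand-ins for the learned parameters theta1..theta7
th1 = rng.normal(size=p)
th2, th3 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
th4 = rng.normal(size=p)
th5 = rng.normal(size=2 * p)
th6, th7 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
relu = lambda z: np.maximum(z, 0.0)

# sum_{u in N(v)} relu(th4 * w(v, u)); zero-weight non-edges contribute nothing
edge_in = relu(W[:, :, None] * th4).sum(axis=1)           # (n, p)

mu = np.zeros((n, p))
for _ in range(T):
    nbr = A @ mu                                          # sum of neighbor embeddings
    mu = relu(x[:, None] * th1 + nbr @ th2.T + edge_in @ th3.T)

pooled = th6 @ mu.sum(axis=0)                             # graph-level summary
Q = np.array([th5 @ relu(np.concatenate([pooled, th7 @ mu[v]]))
              for v in range(n)])
best = int(np.argmax(np.where(x == 0, Q, -np.inf)))       # greedy action over v not in S
print(best, Q.round(2))
```

With trained parameters, `best` is the node the greedy policy adds next; here the scores are meaningless and only the shapes and data flow match the formulas above.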
Reference
Khalil, E., Dai, H., Zhang, Y., Dilkina, B., & Song, L. (2017). Learning Combinatorial Optimization Algorithms over Graphs. NIPS 2017.