分布式图算法Pregel

发布时间 2023-11-06 05:11:32作者: 真昼小天使daisuki

最近看了Google的Pregel论文,图算法有一些经典且不可被替代的应用场景,如社交网络,相互引用等。但是在单个结点上的运算量往往过少,注重的是消息传播和逻辑处理,而不是单纯的大规模计算。虽然已经过去了十几年,但是其中的messsage passing,combiner,aggregator,group partition,状态机等机制还是设计分布式图算法的有效方法。这里针对设计和实现部分做一点点笔记。

Pregel: A System for Large-Scale Graph Processing

key point:

distributed computer clusters -> graph task
graph algorithm

introduction

Target:

Many practical computing problems concern large graphs. e.g. the Web graph and various social networks.

Parallelism for graph challenging:

  1. Graph algorithms often exhibit poor locality of memory access,
  2. very little work per vertex,
  3. and a changing degree of parallelism over the course of execution.

We want a scalable general-purpose system for graph! Compared with existed options:

  • efficient
  • scalable
  • fault-tolerant

inspired by: Valiant’s Bulk Synchronous Parallel model


Model

data structure

input: directed graph(both have value):

  • vertex: vertex identifier!
  • edge: associated with source vertices

process

superstep: working unit with state machine -> sync!
vertex: first class citizens

output: may not as same as input(the graph structure can change)
terminal: voting to halt


API

Compute()
GetValue()
MutableValue()

combiner:

aggregator: global monitor (in my opinion, aggregator is global combiner + global coordinator(more logical than combiner))
aggregator: get global information and do a calculation, not only the combination of data.

e.g. do a particular operation when all the vertices meet a particualr condition


Implementation

Architecture

group partition: default(hash(ID) mod N)

self-define partition allocator: use locality

hierarchy(up to bottom):

  • user program(copies of app) = 1 master + N-1 workers
  • physical machine
  • group partition
  • vertex & outer-edge

Fault Tolerance

checkpoint

worker: partition state -> persistant storage
master: aggregator

regular "ping" message