Problem overview

The Minimum Spanning Tree (MST) problem in an undirected graph with positive real weights assigned to each edge is suggested. More information about the problem can be found in Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein Introduction to Algorithms, third edition, 2009 (ISBN 9780262033848).

Two synthetic graph classes, namely R-MAT and SSCA2, are used for the performance evaluation. The R-MAT graphs are similar to the real-world graphs, e.g. social networks and Internet graphs: R-MAT: A Recursive Model for Graph Mining (English paper, PDF). We consider R-MAT graphs with an average vertex degree 32 and a power of 2 number of vertices. The SSCA2 graphs consist of sets of independent components interconnected with each other: Design and Implementation of the HPCS Graph Analysis Benchmark on Symmetric Multiprocessors (English paper, sections 2 and 3, PDF). We consider SSCA2 graphs with a very large number of components.

Contestants can choose graph sizes for the performance evaluation. The reference implementation includes parallel and serial graph generators.

Contestants can implement any MST algorithm, as well as modify the internal representation of graphs. Graph construction time is not included in the MST algorithm execution time.

Allowed modifyings of the internal graph representation are:

  • edges sorting of each vertex by weights,
  • all edges sorting not using weights,
  • removing duplicated information,
  • improving of temporal and spatial locality of memory access.

Note! Performing of semantic calculations is not allowed during graph construction:

  • searching connected components, its count,
  • all edges sorting using weights.

Requirements to the MST algorithm result output:

  • vertices or edges set of determined trees forest,
  • stored information about residing of vertices or edges in the connected components.

A special data structure for the result validation should be filled at the end of the MST execution (an example of this procedure is given in the reference implementation). That time is not included in the MST algorithm execution time.

The validation routine consists of several steps. First, a number of found trees in the contestant MST implementation and in the reference MST implementation are compared. Second, all trees are arranged in the increasing order of their weights and the validation routine checks that the weight of the current tree is equal to the weight of the tree from the reference implementation as well as whether the current tree is really a tree (using the fact that a tree is a connected graph containing N vertices and N-1 edges). Finally, the validation routine checks that all considered trees contain all non-isolated vertices of the graph.

Testing systems

There are two classes of testing systems which will be regarded separately:

  1. Single-node system.
  2. Multi-node cluster system.

Three computing systems will be available to contestants for the performance evaluation: 2 single-node and 1 multi-node. Two single-node systems are an SMP compute node with two 6-core Intel Xeon E5-2630 processors and a heterogeneous compute node with NVIDIA Tesla K20x GPU accelerator. The multi-node system is an 8 node cluster with Infiniband 4x FDR interconnect, each compute node has two 6-core Intel Xeon E5-2630 processors. Access to these systems will be provided using automatic contest system.

Implementations for other systems and architectures are encouraged, performance results in that case will be considered separately.

Reference implementation

The MST reference implementation includes serial and parallel graph generators, solution validation tools. It is available here: GraphHPC-1.0.tgz.

Questions

If you have any questions, please feel free to contact: contest@dislab.org.