## bisquik – fast random graph sampling

Generating a random instance of a graph with a prescribed degree sequence is a surprisingly tricky problem.  Such random objects are necessary to assess the statistical significance of various network analysis measures.  They can be used to answer the question: is property X present in many graphs with this degree distribution, or just the graph I’m looking at?

The Erdős-Gallai conditions provide a check of whether a connected graph with a degree sequence exists and also construct such a graph.  But these graphs are not random.

The standard algorithm for generating a random instance is a Markov Chain Monte Carlo procedure that starts with a graph with the prescribed degree sequence — possibly the output of the Erdős-Gallai consruction — and then performs a series of edge swaps.  That is, we pick two edges in the graph randomly, and swap their endpoints.  To be concrete, let $(u,v)$ and $(r,s)$ be the two edges.  The swap replaces $(u,v)$ with \$(u,r)\$ and $(v,s)$ if the resulting graph is a simple graph (no loops).  If we let this procedure run “long enough” then it will produce a random graph with the prescribed degree distribution.

The weakness of the algorithm is that “long enough” isn’t always well understood.  Most of the theory for related algorithms requires something like $O(n^7)$ steps for a random sample, where $n$ is the number of vertices. In practice, many people often run it for as little as $O(|E|)$ steps.

Bayati, Kim, and Saberi set out to devise a better approach, and they produced one:

Bayati, Kim, and Saberi.  A sequential algorithm for sampling random graph.  Algorithmica 58(4):860-910, 2010. doi:10.1007/s00453-009-9340-1

This algorithm has a slightly complicated implementation.  I’ve provided my implementation online in a github repository:

https://github.com/dgleich/bisquik

The name bisquik is a play on Bayati/Kim/Saberi.  But it’s also a play on Bisquick the baking ingredient.  The idea is that bisquik is the ingredient you need to make graph analysis good.    I’m currently working on a python wrapper for this algorithm to make it useful for packages like igraph and networkx as well.

So give it a try and let me know if you run into any problems.

I’ve tested it under Linux, OSX, and Cygwin.  (The makefile requires a small edit on OSX, remove the -std=c++0x flag from the CFLAGS.)

I’m currently working on a quick paper on some of the implementation decisions.

This entry was posted in Uncategorized. Bookmark the permalink.