bisquik – fast random graph sampling

Generating a random instance of a graph with a prescribed degree sequence is a surprisingly tricky problem.  Such random objects are necessary to assess the statistical significance of various network analysis measures.  They can be used to answer the question: is property X present in many graphs with this degree distribution, or just the graph I’m looking at?

The Erdős-Gallai conditions provide a check of whether a connected graph with a degree sequence exists and also construct such a graph.  But these graphs are not random.

The standard algorithm for generating a random instance is a Markov Chain Monte Carlo procedure that starts with a graph with the prescribed degree sequence — possibly the output of the Erdős-Gallai consruction — and then performs a series of edge swaps.  That is, we pick two edges in the graph randomly, and swap their endpoints.  To be concrete, let (u,v) and (r,s) be the two edges.  The swap replaces (u,v) with $(u,r)$ and (v,s) if the resulting graph is a simple graph (no loops).  If we let this procedure run “long enough” then it will produce a random graph with the prescribed degree distribution.

The weakness of the algorithm is that “long enough” isn’t always well understood.  Most of the theory for related algorithms requires something like O(n^7) steps for a random sample, where n is the number of vertices. In practice, many people often run it for as little as O(|E|) steps.

Bayati, Kim, and Saberi set out to devise a better approach, and they produced one:

Bayati, Kim, and Saberi.  A sequential algorithm for sampling random graph.  Algorithmica 58(4):860-910, 2010. doi:10.1007/s00453-009-9340-1

This algorithm has a slightly complicated implementation.  I’ve provided my implementation online in a github repository:

https://github.com/dgleich/bisquik

The name bisquik is a play on Bayati/Kim/Saberi.  But it’s also a play on Bisquick the baking ingredient.  The idea is that bisquik is the ingredient you need to make graph analysis good.    I’m currently working on a python wrapper for this algorithm to make it useful for packages like igraph and networkx as well.

So give it a try and let me know if you run into any problems.

I’ve tested it under Linux, OSX, and Cygwin.  (The makefile requires a small edit on OSX, remove the -std=c++0x flag from the CFLAGS.)

I’m currently working on a quick paper on some of the implementation decisions.

 

About these ads
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s