I posted a journal version of the “Fast Katz and commute time” paper from WAW2010 to arXiv entitled “Fast matrix computations for pair-wise and column-wise commute times and Katz scores”
Commute time scores and Katz scores are useful ranking, link predictions, and clustering metrics. Given a graph, they are defined between any pair of nodes. Computing all of the scores at once takes work and memory ( is the number of nodes). What we investigate are faster algorithms that compute a single score in memory and time; and column-wise algorithms that approximate the score between a single node and all other nodes (think Dijkstra’s algorithm here) with similar computational efficiency.
The major difference from the conference version is a much more thorough description of the pairwise algorithms, more extensive experiments, a fully explained convergence theory, and a column-wise algorithm for commute time. Although, the latter algorithm is not that great. However, that problem is not likely to be important as column-wise commute time distances are well approximated by the following constant time algorithm: where is the degree of node . This formula was derived by von Luxburg et al. A recent paper they wrote explains the issue clearly:
Ulrike von Luxburg, Agnes Radl, and Matthias Hein. Getting lost in space: Large sample analysis of the commute distance. In Advances in Neural Information Processing Systems 24 (NIPS2010), 2010. URL http://books.nips.cc/papers/files/nips23/NIPS2010_0929.pdf.
We test this approximation with graphs up to 100,000 nodes and show that it’s really good.
As with most of my papers, the code for the project is online: