10.15.14

Automatic Parameter Learning for Multiple Network Alignment

Jason Flannick1, Antal Novak1, Chuong B. Do1, Balaji S. Srinivasan2, and Serafim Batzoglou1

1Department of Computer Science, Stanford University, Stanford, CA 94305, USA

2Department of Statistics, Stanford University, Stanford, CA 94305, USA

We developed Græmlin 2.0, a new multiple network aligner with (1) a novel scoring func-

tion; (2) an algorithm that automatically learns the scoring function’s parameters; and (3) an

algorithm that uses the scoring function to globally align multiple networks. Existing alignment

tools use heuristic scoring functions, which must be hand-tuned to a given set of networks and

do not apply to multiple network alignment.

Our scoring function can use arbitrary features of a multiple network alignment, such as

protein deletions, protein duplications, protein mutations, and interaction losses. Our parameter

learning algorithm uses a training set of known network alignments to learn parameters for

our scoring function and thereby automatically adapts it to any set of networks. Our global

alignment algorithm finds approximate multiple network alignments in linear time.

We tested Græmlin 2.0’s accuracy on protein interaction networks from IntAct, DIP, and

the Stanford Network Database. We show that, on each of these datasets, Græmlin 2.0 has

higher sensitivity and specificity than existing network aligners. Græmlin 2.0 is available under

the GNU public license at http://graemlin.stanford.edu.