We use the simple eight node network of Figure 1 to provide the intuition behind our algorithm. Each node in the network maintains a route to its immediate successor. In addition, node 1 has a route to nodes 2 and 4, node 4 has a route to node 6 and node 2 has a complete routing table. We focus on the situation of node 0 looking up content on node 7. This lookup for a traditional P2P system is shown by the dashed line where the system greedily selects the route which moves the query along the farthest to its destination. It is easy to see however that because of node 2's complete routing table, node 0 would benefit from learning to forward queries to node 2. The optimal route, with respect to hops in the overlay, is shown by the solid line of 1.
Figure 2 shows node 0's estimation of the cost to lookup node 7. The learning algorithm converges after approximately 100 lookup operations where node 0 has determined that node 2 is the superior route with a cost of two hops where as taking the route to either node 1 or 4 results in three hop lookups. We show a similar plot for node 4 in Figure 3.
The number of hops for the lookup versus iteration is shown in Figure 4. The initial exploration phase of the algorithm is evident by the large hop variance in the first few hundred lookups. After 200 lookups, the simulated annealing in the algorithm allows the node exploit its learned information. A hop comparison with the greedy approach is shown in Figure 5. A negative value indicates instances of exploration where the algorithm performed worse than the greedy algorithm whereas positive values show the algorithm exploiting a better path.
Next we look at the ability of the algorithm to adapt to membership changes. We run the a simulation on the same eight node network starting in the same initial configuration, but force node 2 to leave after 300 iterations, node 1 to install a new route to node 4 after 500 iterations and node 1 to install a new route to node 7 after 700 iterations.
Figure 6 depicts node 0's estimations for destination node 7. At iteration 300 when node 2 leaves there are two resulting effects. First, there is no longer a 2 hop route through the overlay from node 0 to 7. Second, the cost for node 0 to send to node 1 increases to 5 since node 1 has lost node 2 as a path. Thus, the network transitions from forwarding the query to node 2 to node 4. After 500 iterations, node 1 installs a route to node 4, lowering the cost for node 0 to use node 1 to 4. Still though, node 4 is the best choice. Finally at iteration 700, node 1 installs a direct route to destination node 7. Node 0 discovers this better two hop route and proceeds to use it. Each of these discrete events are evident in Figure 6.
The number of lookup hops in our example of membership changes is shown in Figure 7. We see the same exploration phase early in the progress of the lookups. From iteration 300 to 800 we see the resulting instability, but the majority of the lookups succeed in 3 hops. Finally after iteration 800 the network is exploiting node 1's new route and is achieving 2 hop lookups again.