A significant challenge with the Q-learning approach is that a node must maintain per-route estimates, each of which has an entry per node in the system. The memory and processing requirements for maintaining per-destination Q-values are prohibitive in large networks. Thus, we introduce the notion of Q-value slices. The P2P identifier space is divided into equal-sized slices, as shown in Figure 12.
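To make the slicing idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 16-bit identifier space split into 8 slices, and it assumes a node keeps one Q-value per (next hop, slice) pair instead of one per (next hop, destination) pair, which the text implies but does not spell out. The names `ID_BITS`, `NUM_SLICES`, `slice_index`, and `SlicedQTable` are hypothetical and chosen only for illustration.

```python
ID_BITS = 16     # assumed identifier width in bits (hypothetical value)
NUM_SLICES = 8   # assumed number of equal-sized slices (hypothetical value)


def slice_index(node_id, id_bits=ID_BITS, num_slices=NUM_SLICES):
    """Return the index of the equal-sized slice that contains node_id."""
    space = 1 << id_bits
    if not 0 <= node_id < space:
        raise ValueError("identifier outside the ID space")
    # Integer arithmetic avoids rounding problems at slice boundaries.
    return node_id * num_slices // space


class SlicedQTable:
    """Q-values keyed by (next hop, destination slice), an assumed layout."""

    def __init__(self, initial=0.0):
        self._initial = initial
        self._q = {}

    def get(self, next_hop, dest_id):
        # All destinations in the same slice share one estimate.
        return self._q.get((next_hop, slice_index(dest_id)), self._initial)

    def set(self, next_hop, dest_id, value):
        self._q[(next_hop, slice_index(dest_id))] = value

    def __len__(self):
        return len(self._q)


table = SlicedQTable()
table.set("n1", 100, 0.5)        # destination 100 falls in slice 0
shared = table.get("n1", 200)    # destination 200 is also in slice 0
other = table.get("n1", 9000)    # destination 9000 falls in slice 1
```

Under these assumptions the table holds at most one entry per neighbor per slice, so its size is bounded by the number of neighbors times `NUM_SLICES` and no longer grows with the number of nodes in the network. The cost is coarser estimates, since destinations within a slice are treated alike.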