Dynamic Task Allocation in Turn Based Strategy Games

 

Introduction

Colony.png
In this work, we use a strategy game based on Sid Meier’s “Colonization” (1994) to demonstrate how one can use different biology inspired techniques to make an efficient artificial player. In Colonization (or rather in FreeCol, an open-source clone of the original game), a player is tasked with replaying the development of the new World colonies, from the initial discovery of the Americas in 1492 until the declaration of independence (which may or may not be 1776 depending on the skill of the player). At each turn, the player has to choose where to allocate his units to grow and defend his colonies optimally. We show that an artificial player using the response threshold mechanism set forth by Wilson (Wilson, 1984) can prove a simple and elegant way to allocate units to the different tasks. We also show that this simple model can be augmented with planning-like functionalities and improves on the general result.

 

Dynamic Task Allocation in Insect Colonies

Insect colonies, like ants, bees or wasps, have no centralized command. The queen inside the colony is only responsible for breeding and does not possess any capabilities to order the other colony inhabitants around, unlike the medieval royalties. Furthermore, a single individual is a very simple and primal agent so it seems nothing short of a miracle to see what the colony has a whole is capable of. Task allocation of the colony’s inhabitants is both efficient and robust despite the fact that individuals do not have a language or a means to communicate directly with each other. Instead, they lay down pheromones that can be detected by other individuals. This process of modifying the environment to provide information is called stigmergy. Through stigmergy, ants, bees or wasps can communicate stimuli to one another and take the appropriate action. The beauty of it all is that an individual ant has no conscience of what it is doing but together, their collective actions result in emergent behavior, such as the reallocation of majors when the level of minors in the colony drops. It is to explain Wilson’s findings about such dynamic task allocation (Wilson, 1984) that the response threshold model was proposed, first with the Fixed Threshold Model (FTM) in (Bonabeau et al., 1996).

The FTM provides a simple dynamic task allocation mechanism that partially matches what Wilson saw in his experiments. Under this model, each task is associated with a stimulus, each individual has a response threshold for each task, and they will engage in a task when the associated stimulus exceeds the threshold. More formally, for a system with only one task and two castes: Let X be the state of the individual (X = 0 corresponds to inactivity, X = 1 corresponds to performing the task). If S is the stimuli associated with the task, and θi is the threshold of an individual of caste i (i = 1, 2), an inactive individual belonging to caste i will start performing the task with a probability Pi per time unit:

F1.png

Thus, the probability that an individual will perform a task depends on the stimulus and the threshold. It will have exactly 1⁄2 chance of responding to it if the threshold and the task are equal, while the square will ensure that the probability will increase rapidly once the threshold is reached. Furthermore, we can explain the division of labor observed between minors and majors by postulating that the majors possess a higher threshold to that task. On the other hand, an active individual will become inactive with a probability p (assumed to be fixed and equal for both castes) per unit of time:

F2.png

This means an individual will on average spend 1/p time units performing the task and could keep performing a task even after it is no longer necessary. While this behavior is not particularly efficient, it was observed in several species of ants and (Bonabeau et al., 1996) did not find it necessary to come up with a more elaborate model for the return to inactivity. Finally, the variations in stimulus intensity were set to depend both from the task performance α, which reduces the stimulus intensity by meeting the demand for action, and from a natural increase of the demand δ, irrespective of whether or not the task is performed. The resulting equation for stimulus intensity S is therefore (in discrete time):

F3.png

where N is the total number of potentially active individuals in the colony, and Ni is the number of individuals from caste i performing the task. It is assumed here that both castes are equally efficient at performing the task and that the strength δ of the stimulus scales up with the size of the colony. With those three simple equations, (Bonabeau et al., 1996) obtained results very similar to observations made with real insect colonies (Wilson, 1984).

 

Contributions

Our contributions are threefold. First, we showed that using the threshold response method, with very simple rules, an efficient resource management system can emerge, even with intricate constraints such as the ones present in a strategy game. Without any fine tuning, our basic AI player was able to perform at around 66% of an expert player on average, over a full game episode.
 
Mapping.png
 
Second, we showed that while adding another response threshold layer to emulate planning capabilities brings better average results, the variation in the player’s performances from one game to another increases beyond that of our initial player. Thus it will make such an approach inappropriate for most processes with low risk tolerance, and it shows this kind of architecture might be a dead end.
Results.png

Finally, we showed that with a few minor modifications, such as adding rule-based constraints to our stochastic algorithm, we obtained an AI player that can play the resource management part of a FreeCol game at near expert level, with excellent results on average and good results in the worst case.

Although applications of the response threshold algorithm have been developed academically before, it is, to our knowledge, only the second time it has been applied to a real problem, and the first time where multiple agents have to choose between multiple tasks at each time step. Beyond the obvious use in computer games where this problem is very common, we believe it could be applied to a wide range of task allocation problems where worker allocation has to be done in an efficient manner (automatic scheduling for managers, weapons-target allocation, etc...). But even just in the computer game field, it could potentially be a big improvement to traditionally cheating or rigidly scripted artificial players, which are very hard to develop in an efficient manner.

 

Materials

 

References

  • Bonabeau, E., Theraulaz, G., & Deneubourg, J.-L. (1996). Quantitative study of the fixed threshold model for the regulation of division of labour in insect societies. Proceedings: Biological Sciences, 263(1376), 1565–1569.
  • FreeCol - The Colonization of America http://www.freecol.org/
  • Wilson, E. O. (1984). The relation between caste ratios and division of labor in the ant genus Pheidole (Hymenoptera: Formicidae). Behavioral Ecology and Sociobiology, 16(1), 89–98. Retrieved from http://www.springerlink.com/content/u5067p1121415116/