Gunnar

Gunnar

Class of worlds

This agent is much more complex than Einar. This complexity makes it handle worlds with the following properties quite well:

  • All interior tiles are walls or clear and some of them have items.

  • The border tiles are walls.

  • There are no enemies.

Strategy

Gunnar's basic principle is to always explore the nearest unknown tile. Near is defined as low cost to reach it. But there are exceptions to the basic principle; the right-rule and priorities.

Right-rule

The right-rule means that if the tile to the right of Gunnar is unexplored (and has the highest priority, se below about priorities), gunnar will explore that tile.

If the right-rule is not used, Gunnar will move in the direction he is facing when he is exploring, and only turn when he has bumped into something. That behaviour tends to make him leave tiles behind, so he has to return later, thus making him less efficient.

When the right-rule is used, Gunnar will not always explore the tile he is facing if it is unknown with highest priority just because it is the cheapest tile to explore. Instead he will spend an extra amount, Turn_Right_Cost, to explore the tile to the right first instead.

A test in a particular world of the size 60×52 with no homes, determined with the wall probability 0.3 and the item probability 0.1, Gunnar achieved the score 13691 with the right-rule and only 12983 without it.

Excercise.  Test the effect of the right-rule yourself. Use a world similar to the one described above, a much larger world is not necessary to see the effect and may take unnecessarily much time.

Caution

There is an important detail; Gunnar must avoid getting stuck in an infinite rotation because of the right-rule. If he is surrounded by unknown tiles with the highest priority, he could be tempted to turn right to explore the tile to the right of him. But the next turn he is facing the same situation and would turn right again, and so on. This is particularly likely to happen at the beginning of the simulation. Therefore Gunnar only follows the right-rule when his previous motion was a move. He uses the state variable Previous_Motion for this purpouse. The variable is initialized to Turn_Right so that the right-rule is avoided when selecting the first action in the simulation.

Priorities

Even with the right-rule in place, Gunnar would leave many unexplored tiles behind until later, if it was not for the priorities. Gunnar does not just consider every unexplored tile by itself, he can also think of them in terms of contiguous unexplored areas, which are equivalent to groups of unexplored tiles. When exploring a tile, it can happen that such a group of unexplored tiles is divided into 2 or even 3 groups. This is called a split. It seems wise to explore the smallest groups first to get them done instead of wandering away exploring a large group and then spend a lot of movement to return to the small group. This is why Gunnar assigns different priorities to the tiles in the different groups after such a split.

Map

Obviously gunnar needs a world map store and use the information he has gathered. It is a 2-dimensional array indexed with relative (to Gunnar's starting location) coordinates.

Searching

When gunnar wants to find a tile to explore or a home to return to he has to search for a way to it. He uses a breadth-first search, which is optimal.

Speed optimizations

This section is about optimizations whose purpouse is not to achieve a higher score, but to make Gunnar decide faster which action to perform.

Note

Even the design of the simulator, where agent and client communicates over internet sockets, was developed with performance in mind. It makes it possible to run simulator and agent on different computers, for example the simulator on a workstation and the agent on a remote supercomuter.

Chosing Ada as a language for developing agents enables the relatively easy use of tasking and distributed computing on the agent side.

Irrelevant tiles

Gunnar marks tiles as irrelevant if he can prove they are. Irrelevant tiles are clear tiles which he will never need to visit again, for example dead ends. This lets the search algorithm search faster by pruning the search tree when the search reaches irrelevant tiles.

Note

This is such an obvious optimization that a large fraction of agents are expected implement it. Therefore the simulator has special support for debugging it. The agent can send a world map symbol when it marks a tile as irrelevant, and the simulator can show it on the map. This makes it easier to debug the agent code for finding irrelevant tiles.

Of course there is a small cost to check if tiles are irrelevant. Gunnar only makes some quick and simple irrelevance tests and fails to identify some dead ends to avoid wasting too much computation time on it.

More excercises

Excercise.  Make Gunnar react faster. Start a background task when the simulation begins (in Initialize or Inform_About_Rules). Make the background task perform calculations while the agent waits for a new percept from the simulator. Suspend the background task when the percept arrives to devote the system resources to Get_Action. Let Get_Action store hints about worthwile things to do for the background task. For example instead of letting Get_Action call Mark_Row_As_Irrelevant_If_Irrelevant and Mark_Col_As_Irrelevant_If_Irrelevant, store the data needed to do so in a record and add it to the background task's list of things to do. Find something to do for the background task when the list of hints is empty. This could be for example:

  • Make more complex searches to identify dead ends to mark the tiles there as irrelevant. Here are 2 image examples of dead ends that Gunnar does not identify that could be identified.


    Here the agent has left the dead end but it is not marked as irrelevant because there is a home inside. But there is a home in the orifice of the dead end, and because the agent is outside, it will never need to visit the inner home. Whenever it intends to visit a home, it will be nearer to the orifice home. So the dead end should be marked as irrelevant.

    Note

    Marking this dead end as irrelevant will not prune the search tree of searches for homes, because the orifice home will be found first anyway, but it will improve other searches and maybe even make it possible to forget entire rows/columns that could not be forgotten otherwise (see an excercise below).



    Here the agent fails to detect a dead end because it contains a loop.


  • Try to detect if the agent has explored the complete edge of the available world. When that is the case, there will be no need to extend the map any more and all unknown tiles that belong to groups at the edge of the map can be marked as unreachable. The criterion for this state is that no way can be found from the agent location to a location at the edge of the map. The way is allowed to pass over tiles marked as clear, home and unknown of any priority. If such a way is found, there is no use to test for this criterion again until the agent has perceived a bump (against an unknown tile of course).

  • When the state described above has been detected, the agent could safely forget about entire rows/columns if they are beyond rows/columns that only have walls or irrelevant tiles.

Excercise.  Improve Gunnar to handle fatal dangers and enemies. He should be able to infer that a tile is safe when possible. He should also be able to neutralize without wasting ammunition and cost. But most important, he should make sure to stay alive.

Tip

If an agent comes in a situation where it has explored all tiles it could prove to be safe it may not know how to continue. But if it can neutralize and there is a tile where it percieves hostility, it can try to neutralize in one of the directions where an enemy could be. This could waste ammunition by not neutralizing an enemy, but it gives something else; the agent can be certain that there is no enemy in the tile it faces (and maybe the tiles beyond that, depending on the circumstances). So it can continue exploring. If it is lucky it might get a Neutralize_Reward. It is even nicer for the agent if it also no longer percieves hostility after the neutralization, because that means that there was only 1 adjacent enemy and that all adjacent tiles are now free from enemies.

KDE Logo