Resume
Work Experiences
Course Projects
Source Codes
Relate Links
Contact Me
 
Copyright © 2004
Guang Huang
Hit Counter
Logo
 HomeResumeWork ExperiencesCourse ProjectsSource CodesLinkContact
 

HOME > Course Projects > Artificial Intelligence > Empire Builder Game Agent


Requirement
AI Methods Used
...Heuristic Search
...HTN planning
...Learning
Main Algorithms
Challenges & Future Work

Download the codes from HERE

Requirement

In this project, we need to build an agent who acts as a game player to play a simplified Empire Builder Game. The map used is a hex map, which is simpler than the original one in the real game. The board is from (0,0) to (21,59).

AI Methods Used

Heuristic Search
We are using a revised version of A* search in this project. Generally, A* search can search a least cost path between two points. However, if we use such kind of search in our project, we will meet some problems.

Problems

Agent wants (At A)
1) deliver goods at B
2) pick up goods at C

Problem 1

1. Normally, agent finds an optimal way to build from A to B, delivery goods to B, then finds an optimal way to build again and pick up the goods from C. It is quite stupid for the agent to work like this. At least, the agent can choose a point as reference to search the optimal path. By considering the point C, agent can find a more proper path to build from A to B. Even if the path from A to B is not an optimal one, it is a better path for us to build between three points. The first change we made is to introduce another point, which is in the next step of the current plan, as a reference to search and find the path.

1) A is the current point that the agent stays
2) B is the destination point that it wants to reach
3) Lines are railways built by the agent itself.

Problem 2

2) It is quite stupid for the agent to build a brand new path from A to B. And because some part of the path has been build before, the agent no longer needs to pay for building this part of path. To utilize what has been built before is what we need to change. For this reason, we define a new list to what we have built. The points on these paths are points we can build from. The path costs of each step in these paths are set to zero. When the agent wants to build from A to B, it will try to end to the points that railways have been built on rather than simply building from A directly to B. In searching, all that has been built will cost zero, rather than the real cost agent needs to pay. It is clear that the path the agent finds finally will no longer the optimal one.
3) In the game, there is more than one agent, what will happen is that others will build the path. If this is happened before the search begins, it is not so difficult to solve. In the search, we mark this kind of paths and set their costs to infinite large. Thus all these paths will never appear in the whole search procedure

Heuristic Functions
Some admissible functions (such as functions based on straight-line distance or other distance) will help agent to find an optimal path. But they are not fast enough. Others are not as good as these functions, but they are faster. We have tested several functions, typically, below are some fast and useful functions.
1) distance / (sqrt 2) not admissible
2) |x1-x2| + |y1-Y2| not admissible
3) |x1-x2| + |y1-Y2|/2 admissible

HTN planning
In project, use HTN planning as a kind of technology for our project mainly because in the high level, it is quite clear what the agent need to do in order to reach the goal.


HTN Planning

It is just a very simple graph of Empire Builder. Actually, the agent needs to deal with the result returned from the server after sending its actions.
The three actions in the first level
1) Choose to start: choose a city to start.
2) Choose to build: choose tracks to build
3) Choose to move: choose the place to move, deliver or pick up goods.
Choose to build can be decomposed into:
1) Upgrade the train
2) Choose track to build
Choose to start can be decomposed into:
1) Move to the city
2) Deliver the goods
3) Pick up the goods.

Learning
IThey are mainly two parts that adding learning technologies into the system.
Heuristic Search
Heuristic functions are very important to heuristic search. Some are quite good, admissible and can find optimal solutions. Others are not so complicated, not so good, only can find good solutions in most situations. But they are quicker. In an interactive game, both the good solution (track to be built and move) and the search speed are important. In the first part of the Empire Builder game, there are very few railways in the map, to find and build optimal railways are important. When most of the railways are large part of them are built by others, it is not so easy to search and find an optimal road. To find a good road to build is much more important than find an optimal one.
We add learning technologies to the heuristic search so that the heuristic search can choose different heuristic functions at different situation.
Choose the city to move
When the agent moves to a city, it will deliver or pick up goods or does both of the actions. In the program, how to choose the city to move is based on three demand cards. To choose goods from a demand card based on different elements. At the beginning, because the railways are not built too much, it is good to choose a city very near. Therefore, agent can deliver and pick up goods very quickly, and then can earn money quickly. But after some turns passes, to choose good in the demand card so that the agent can get larger profit becomes more important, because the train has been upgraded and the railway are built. Adding learning skills in city selection help agent to choose different cities under different situations.

Main Algorithms

Choose goods from demand card
In order to choose a reasonable goods from a demand card to delivery, the program will according the current player position to use the hfvalue (amount, hf3cities(pick-up-city, delivery-city, current)) to determinate which city is the best pick-up-city for this good in the demand, then selected the best demand in the card.
Choose the demand card
If the train does not upgrade, the load in the train is 1, so we only need to select the best card according to the hfcard in the cards, and push it to the planning cards. If the train has upgraded, the load of the train is 3, we can push all 3 cards to the planning cards.
Heuristic search
It is like the general A* search, the most different parts are:
1. if the track is occupied by myself, the a* will regard the cost of the track as 0
2. if the track is occupied by myself, the a* will regard the cost of the track as + infinite.
3. can according the current planning to select the path from source to target.
4. can adapt the city-limit-p
4.When searching, begins from one point that in the railway one has built than directly from the beginning point.
First plan
Due to the goods on the card the agent chooses as the first card and other goods on the other two cards, find appropriate pick up point. Introduce all these points into the planning. The result has three parts, one for the first point to pick up or deliver some goods, the other two are for other demand cards, and the actions in the second part should be executed before the actions in the third part. Generate the list to build and move.
Choose start
Based on the card selected as the first card, choose a pick up place as start place
Re-planning
When agent delivers one goods, then call this function to re-planning. The procedure is similar to the first planning. Because now it has a new card, the whole planning needs to be rescheduled. Some part will become the first part of the planning, which will be executed the next turn.. Others may upgrade from the third part to the second part.
Choose build
First check to upgrade the train. Then build the list generated by the planning. Each time the node build one path, then waiting for the Process Result function to process the result and then continue to build until it cannot build any more or the list is nil.
Choose move
Move along the list generated by the planning. Continue to move until it reaches the move limited or the move list is null. For the next situation, the agent will then execute the actions attached by that point, deliver or pick up good or both. Then re-planning.
Process Result
This function processes all the results returned from the server.

Challenges & Future Work

Heuristic function
It is very important to find a good heuristic function. But on the other hand, it is not so easy to find a good heuristic function that works not only good but also quick.
Using other’s information
This is a big part that can continue to do some work on it. Now consider only when a track is already built by other player, the agent can then call the heuristic function to re-search. And we add code into the heuristic search function to help search.
Forward building and re-planning
If a kind of goods in a demand card has been delivered, the agent will get a new card. Then the agent needs to re-plan its existents planning. It is clear that the program can plan the whole six cities and six activities at the beginning. But the agent will only build the track to the first deliver place. Because when the new card comes ,the agent will re-plan the rest part of the planning and add more activities into the plan, it is not a good way to build some track into a no use city.
This means to build the tracks after the first deliver city is not so easy to decide. We tried to build some extra tracks before re-planning happens, but we still did not reach a good evaluation that what is worth building.
On the other side, the re-planning is not so easy to do. It is one of the hardest parts of the project. One thing is that we need to utilize the existent planning. Another thing is we should merge activities if they are in the same city. It is not necessary for an agent to go to a city twice in a plan.
Interrupt others
A good agent should not only do its own work. It should also interrupt others in such an open and competitive environment.
This is a part that we can do if we have more time. Unlike to use the other’s information to help the agent’s own decisions, the agent will use what it get to interrupt others and let others cannot work well. It is difficult but interesting.
Utilizing the specific information of the map
To learn and utilize the specific information is a work we planed to do but actually did nothing. Because each map has its own specific information, we may have some ways to utilize such kind of information