Overview

Project: Master Of None

Estimated Development Time: 25-35 hours

Lines Of Code: 500 (counting only the system code, not individual actions)

Description: The action planner in Master Of None is heavily influenced by traditional HTNs (Hierarchical Task Networks) and the GOAP (Goal-Oriented Action Planning) approach seen in F.E.A.R. and several other games. At its most basic, the planner starts out with a prime goal, such as "Increase Wealth" or "Survive," chosen based on the NPC's current situation, then builds out a stack of subgoals/actions. Its largest differentiator is its use of subgoals to efficiently re-plan when a particular element of the overall strategy no longer seems effective or profitable.

Objectives:

  1. Have goals and options adjust to what is available to that specific NPC.
  2. Plan in small chunks.
  3. Allow subgoals to re-plan.
  4. Allow NPCs to manage multiple action stacks at once.

 

Thought Process and Choices

 

Why I chose to use an Action Planner:

Master Of None is centered around the NPCs' ability to satisfy their own goals while the player leads and supports them instead of directing them. I did consider using a modified behavior tree that relied heavily on priorities, but I soon realized during the prototype stage that it was going to become overly complex, with each new behavior needing to be weighed and prioritized. I also didn't like the concept of having to evaluate the entire tree on a regular basis.

Using a planner allowed me to naturally associate cost to actions and create the dynamic behavior the game required.

 

Risks:

  1. The planner I desired required many dependencies to be taken care of prior to its development, the main one being the Teamwide Knowledge System. Instead of creating a light version of the planner, I decided to go with a temporary behavior tree that was already partially implemented.
  2. I had never developed a planner before, and I had a strong feeling it would require many iterations not only to work but also to be designer friendly.

 

Results:

Although the action planner currently works, it is not designer friendly in any shape or form. The actions currently in the game are:

** - The action is not fully implemented

Normal Actions

  • Collect Item
  • Explore
  • Hunt
  • Lock Door
  • Unlock Door
  • Destroy Door**
  • Open Door
  • Shut Door
  • Sell Loot
  • Trade**
  • Throw Item**
  • Travel To
  • Reload
  • Scavenge Room
  • Scavenge Container

 

Combat Actions

  • Defensive Tactics
  • Offensive Tactics
  • Flee**
  • Check For Enemies**
  • Kite**
  • Stand Ground
  • Cover Tactics
  • Enter Cover
  • Cover: Popup
  • Cover: Lean Out
  • Cover: Wait
  • Flank
  • Charge
  • Attack Position**

Design and Development

Summary:

When you look into the future of a game, even the next 30 seconds, details about the world become less about prediction and more about making naive guesses. And although the player may be trying to work out what an NPC is building up to, it is unlikely they are considering the behaviors of an NPC even 10 seconds in the future.

So why plan the small, behavioral actions so far out? The answer is typically that you need to evaluate actions to find not only a successful condition but the condition that has the least cost. However, I believe this answer to be self-limiting in game development. Perhaps conditions and actions do not need to be exact.

Graphics systems use LOD (Level Of Detail), displaying lower quality textures based on how far away the player is from an object. The player's experience is not diminished by this optimization, because they most likely never noticed the lower quality texture on a box on the opposite side of the map. What if a similar concept could be applied to Action Planning?

Instead of purely planning actions to find the best solution, you use subgoals to make assumptions for a grouping of actions. Actions typically have a single function like "GoTo" or "Reload." Subgoals are, in a sense, higher level actions. For example, they could be "Defensive Tactics" or "Explore."

When a subgoal is originally planned, it is passed an estimated activation time alongside the world state. This activation time is used to determine the level of detail required when evaluating conditions. The further out the activation time, the more abstract the condition. And rather than performing a specific behavior when active, the subgoal plans out an additional set of actions or subgoals in order to reach its successful condition. This allows for easy and efficient re-planning when a trigger occurs.
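As a rough illustration of the idea (not the game's actual code), a condition can pick how strictly it checks the world based on its estimated activation time. The `WorldState` fields, the `can_take_cover` name, and the 3 and 10 second thresholds below are all assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class WorldState:
    # Hypothetical fields; the real State structure is not described in detail.
    ammo: int
    known_enemy_count: int
    distance_to_cover: float  # in game units


def can_take_cover(state: WorldState, activation_in: float) -> bool:
    """Evaluate a 'can take cover' condition at a level of detail chosen
    from the estimated activation time (seconds from now)."""
    if activation_in > 10.0:
        # Far out: a coarse guess is enough. Any known enemy makes cover plausible.
        return state.known_enemy_count > 0
    if activation_in > 3.0:
        # Mid range: also require that cover is reasonably close.
        return state.known_enemy_count > 0 and state.distance_to_cover < 30.0
    # Imminent: run the full, most specific check.
    return (
        state.known_enemy_count > 0
        and state.distance_to_cover < 30.0
        and state.ammo > 0
    )
```

The same world state can therefore pass a distant, abstract check and fail the detailed one later, which is exactly the point at which a subgoal would re-plan.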

(I have recently learned that LOD techniques are already being applied to action conditions. I haven't yet investigated others' research into this area, but I plan to soon and hopefully will be able to improve the work I have already done.)

 

Differences from GOAP:

  • The main difference behind my action planner is that it does not plan from a prime goal, like "Increase Wealth," all the way to picking up a coin in the next room. Instead, it finds the next best subgoal that predicts both success and the highest potential profit. Once these subgoals are executed, they will have their own subgoals or actions planned out. This continues over time until a final action is reached, such as Collect Item.
  • Another variation comes from the fact that actions do not simply have a cost associated with them, but instead calculate their cost or potential profit during the planning process. In some cases, the final value of an action stack will be positive.
  • A single NPC can have multiple action stacks at once, which are themselves stacked based on priority.
  • Actions can have links to each other, allowing knowledge of success or failure of actions in other stacks.
  • The actions themselves are not static, but instead change on properties passed from the goal and the current circumstances of the NPC.
  • Subgoals always trim their own branches based on the most profitable solution.
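The first two points can be sketched as follows. This is only an illustration under assumed names: `Candidate`, `pick_next`, and the state keys are hypothetical, and the profit numbers are invented. The idea shown is that each candidate computes its estimated profit from the current state at planning time, and the planner keeps only the most profitable candidate that predicts success.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, float]


@dataclass
class Candidate:
    name: str
    predicts_success: Callable[[State], bool]
    estimated_profit: Callable[[State], float]  # may be negative (a net cost)


def pick_next(candidates: List[Candidate], state: State) -> Optional[Candidate]:
    """Return the viable candidate with the highest estimated profit, or None."""
    viable = [c for c in candidates if c.predicts_success(state)]
    if not viable:
        return None
    return max(viable, key=lambda c: c.estimated_profit(state))


# Hypothetical candidates for an "Increase Wealth" goal.
CANDIDATES = [
    Candidate("Scavenge Room", lambda s: True,
              lambda s: s["known_loot_value"] - s["travel_cost"]),
    Candidate("Hunt", lambda s: s["ammo"] > 0, lambda s: 20.0),
    Candidate("Sell Loot", lambda s: s["carried_loot_value"] > 0,
              lambda s: s["carried_loot_value"]),
]
```

Because only the winning branch is kept, the rest of the options are trimmed immediately rather than being expanded all the way down to individual actions.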

 

Architecture:

There are 6 main components to the action planner system: States, Goals, Actions/Subgoals, Action Stacks, the Handler, and the Planner.

 

1. The State is a structure of references to all the pertinent data that the Planner will use to evaluate goals.

2. Goals contain a list of actions/subgoals that are to be planned out. A Goal also contains its type (mission, combat, required) and any additional information that is to be sent to the starting actions/subgoals.

3. Actions and Subgoals are actually the exact same object, and could technically be referred to as simply an action. However, I like to refer to them as separate elements since they handle two different types of logic. Subgoals do not directly execute an NPC's behavior, but instead work as checkpoints to evaluate new actions or subgoals. Actions have a specific function such as picking up an item or moving to a location.

4. Action Stacks are lists of actions/subgoals that are executed from the top down. These stacks are stored and executed by the Handler. When the top (active) action completes, it is removed from the stack and the next action on top is considered active. If the top action fails, the entire stack fails.

5. The Handler is the component attached to the NPC that manages its action stacks, executes the active actions, keeps track of the state data, and is the only component to interact with the Planner.

6. The Planner stores goals that are ready to be evaluated and plans as many of them as the available computing time allows. If a goal fails, it is simply discarded. If a solution is found, the Planner strips away the unnecessary data and sends the result back as an action stack to the NPC's Handler.
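The Action Stack behavior in item 4 can be sketched in a few lines. This is a minimal illustration, not the project's code: the `Status`, `Action`, and `ActionStack` names and the `tick` callback are assumptions.

```python
from enum import Enum, auto
from typing import Callable, List


class Status(Enum):
    RUNNING = auto()
    SUCCESS = auto()
    FAILURE = auto()


class Action:
    def __init__(self, name: str, tick: Callable[[], Status]):
        self.name = name
        self.tick = tick  # called once per update while this action is active


class ActionStack:
    def __init__(self, actions: List[Action]):
        # The last element of the list is the top (active) action.
        self.actions = list(actions)
        self.failed = False

    @property
    def done(self) -> bool:
        return self.failed or not self.actions

    def update(self) -> None:
        if self.done:
            return
        status = self.actions[-1].tick()
        if status is Status.SUCCESS:
            self.actions.pop()    # the next action down becomes active
        elif status is Status.FAILURE:
            self.failed = True    # one failure fails the entire stack
            self.actions.clear()
```

A Handler in this sketch would simply call `update()` on whichever stack is currently active and discard the stack once `done` is true.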

 

Example 1:

For the purpose of demonstrating the advantages of this system, let's walk through the entire process with a soldier NPC who has just spawned into a map. At the beginning of the soldier's self-evaluation he notices he has no current goal. Since the soldier is currently not in danger and has no reason to upgrade/resupply, he decides to contact the Goal Factory, requesting the prime goal "Increase Wealth."

The Goal Factory builds the "Increase Wealth" goal, making specifications based on the fact that he is a soldier and a part of the player's team. Once the prime goal is created, it is sent back to the soldier, where it is then re-routed to his action Handler.

The Handler adds on the State structure and fills in additional data before sending all of it to the Planner. The Planner doesn't evaluate it immediately, but rather stores it. It is only when the Planner goes through its update that it determines which goals will be planned out, based on the priority attached to them. And this only happens if it thinks there is enough time left in the frame to go through the planning process.
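One way to picture this deferred, time-budgeted planning is a priority queue that is drained only while the frame still has time left. The sketch below is an assumption-heavy illustration: the `Planner` interface, the `plan_fn` callback, and the `budget_seconds` parameter are hypothetical names, not the game's API.

```python
import heapq
import itertools
import time
from typing import Any, Callable, List, Optional


class Planner:
    def __init__(self, plan_fn: Callable[[Any], Optional[list]]):
        self._queue: List[tuple] = []
        self._order = itertools.count()  # tie-breaker keeps submission order
        self._plan_fn = plan_fn          # hypothetical: goal -> action stack or None

    def submit(self, priority: int, goal: Any, on_result: Callable[[list], None]) -> None:
        # Goals are only stored here; nothing is planned until update().
        heapq.heappush(self._queue, (-priority, next(self._order), goal, on_result))

    def update(self, budget_seconds: float, clock: Callable[[], float] = time.perf_counter) -> int:
        """Plan queued goals, highest priority first, until the budget runs out.
        Returns how many goals were processed this update."""
        deadline = clock() + budget_seconds
        processed = 0
        while self._queue and clock() < deadline:
            _, _, goal, on_result = heapq.heappop(self._queue)
            stack = self._plan_fn(goal)
            if stack is not None:
                on_result(stack)  # send the finished stack back to the Handler
            # A goal with no solution is simply discarded.
            processed += 1
        return processed
```

Any goals still in the queue when the budget runs out wait for the next update, so a burst of requests from many NPCs is spread across frames.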

When the goal is finally up to be evaluated, the Planner starts with each of the subgoals and the conditions they contain. If a condition fails, it looks to see if there is a possible solution (another action to be planned) that would resolve the condition.
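A minimal sketch of that lookup, assuming conditions are simple named facts and that each unmet condition maps to at most one action able to satisfy it. The `ActionDef` structure, the `resolve` function, and the fact names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ActionDef:
    name: str
    requires: List[str] = field(default_factory=list)  # conditions that must hold
    provides: List[str] = field(default_factory=list)  # facts true after success


def resolve(action: ActionDef, known: Set[str], solutions: Dict[str, ActionDef],
            depth: int = 0, max_depth: int = 8) -> Optional[List[str]]:
    """Return action names in execution order, or None if a condition
    cannot be resolved. `known` is updated in place, so pass a copy."""
    if depth > max_depth:
        return None  # guard against circular requirements
    plan: List[str] = []
    for condition in action.requires:
        if condition in known:
            continue
        fixer = solutions.get(condition)  # another action to be planned
        if fixer is None:
            return None
        sub_plan = resolve(fixer, known, solutions, depth + 1, max_depth)
        if sub_plan is None:
            return None
        plan.extend(sub_plan)
    plan.append(action.name)
    known.update(action.provides)
    return plan
```

For example, if "Scavenge Room" requires an open door, the door is locked, and the NPC holds a key, this sketch yields Unlock Door, then Open Door, then Scavenge Room.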

It is worth noting once more that the estimated profits (EP) of actions and subgoals are not static. Exploring rooms for items and exploring rooms for enemies evaluate to different estimated profits. This is because their EP is determined based on properties sent from their parent actions. Eventually attributes of the NPC itself will also come into play, like personality and previous experiences. At the moment, though, the EP only takes into account the current circumstances.
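To make the point about parent-supplied properties concrete, here is a small sketch in which the same "Explore" logic produces different estimated profits depending on what the parent passes down. The `ExploreProperties` fields and all numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExploreProperties:
    # Hypothetical properties a parent action might pass to "Explore".
    looking_for: str                # e.g. "items" or "enemies"
    expected_value_per_room: float
    risk_per_room: float


def explore_estimated_profit(props: ExploreProperties, unexplored_rooms: int,
                             travel_cost_per_room: float) -> float:
    """Estimated profit of exploring, computed at planning time."""
    per_room = props.expected_value_per_room - props.risk_per_room - travel_cost_per_room
    return unexplored_rooms * per_room
```

With three unexplored rooms and a travel cost of 2 per room, a parent asking for items (value 10, risk 1) would get an estimate of 21, while a parent asking for enemies (value 6, risk 4) would get 0.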

When the Planner evaluates the goal fully and finds a successful solution, it converts it into an Action Stack that ends up looking like this:

 

The Action Stack is then sent back to the Handler to be managed. Since the Handler can contain multiple Action Stacks at once, it places the new one within its list of Action Stacks based on its type. Required actions always go on top; these can be things like opening doors or reloading. Next is the Combat type, which revolves around the survival goals. And on the bottom are the Mission types, which have to do with making money or upgrading equipment.

This means that when an NPC runs into an enemy, they don't automatically drop their goal of scavenging corpses. Instead, they add their combat action stack on top of it, returning to their previous mission once the combat is over.
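The ordering described above can be sketched as a list kept sorted by stack type, with the active stack always at the end. The `StackType`, `TypedStack`, and `Handler` names and methods here are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class StackType(IntEnum):
    MISSION = 0   # making money, upgrading equipment
    COMBAT = 1    # survival goals
    REQUIRED = 2  # opening doors, reloading


@dataclass
class TypedStack:
    stack_type: StackType
    actions: List[str]  # last element is the top action


class Handler:
    def __init__(self) -> None:
        # Ordered from lowest to highest type; the active stack is the last one.
        self.stacks: List[TypedStack] = []

    def add(self, stack: TypedStack) -> None:
        # Place the stack above stacks of the same or lower type,
        # but below any stack of a higher type.
        index = len(self.stacks)
        while index > 0 and self.stacks[index - 1].stack_type > stack.stack_type:
            index -= 1
        self.stacks.insert(index, stack)

    def active(self) -> Optional[TypedStack]:
        return self.stacks[-1] if self.stacks else None

    def finish_active(self) -> None:
        # Called when the active stack completes or fails.
        if self.stacks:
            self.stacks.pop()
```

In this sketch the mission stack is never thrown away when combat starts; it simply waits underneath until the stacks above it are finished.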

 

Example 2:

Below is an example of what a soldier's action Handler could contain at a single moment:

At the top of the stack is the action "Reload." When it completes, it will be popped off, allowing the NPC to perform the combat Action Stack below it. Once that stack completes or fails, it will be popped off, and the next one on the stack will be executed.

As you may have concluded, "Cover Tactics" is functionally a subgoal rather than a direct action. Its purpose is to re-plan and add on actions, such as popping up from cover to attack.

Not having to plan from the prime goal "Survive" every time makes it easier to change up strategies and adapt. Eventually the stack will drop back down to "Defensive Tactics" and then down to "Combat," re-planning at each stage as often as possible.
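The re-planning loop can be sketched as follows: whenever a subgoal is on top of the stack it plans children instead of acting, and it is popped only when it has nothing left to plan. Everything here is illustrative; the `step` function, the two planner functions, and the tiny `state` dictionary are assumptions, not the game's logic.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
SubgoalPlanner = Callable[[State], List[str]]


def step(stack: List[str], planners: Dict[str, SubgoalPlanner],
         execute: Callable[[str, State], None], state: State) -> str:
    """Advance the stack by one step and describe what happened."""
    top = stack[-1]
    planner = planners.get(top)
    if planner is None:
        execute(top, state)  # a plain action: do it and pop it
        stack.pop()
        return f"did {top}"
    children = planner(state)  # a subgoal: re-plan from the current state
    if not children:
        stack.pop()  # nothing left to do here; drop back to the parent
        return f"finished {top}"
    stack.extend(reversed(children))  # first child ends up on top
    return f"planned {top}"


# Hypothetical planners and action effects for the combat example.
def plan_defensive(state: State) -> List[str]:
    return ["Cover Tactics"] if state["enemies"] > 0 else []


def plan_cover(state: State) -> List[str]:
    if state["enemies"] == 0:
        return []
    return ["Cover: Popup"] if state["in_cover"] else ["Enter Cover"]


def execute(action: str, state: State) -> None:
    if action == "Enter Cover":
        state["in_cover"] = 1
    elif action == "Cover: Popup":
        state["enemies"] -= 1  # pretend the attack always succeeds


def run(stack: List[str], state: State) -> List[str]:
    planners = {"Defensive Tactics": plan_defensive, "Cover Tactics": plan_cover}
    log = []
    while stack:
        log.append(step(stack, planners, execute, state))
    return log
```

Starting from a stack holding only "Defensive Tactics" with one enemy present, the run plans "Cover Tactics", enters cover, re-plans, pops up to attack, and then unwinds both subgoals once no enemies remain.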

 

 

Final Notes:

For the system itself, I am very pleased with how well it works, particularly how easy it is to re-plan at times that might cause issues for other Action Planners. An example that commonly occurs in Master Of None is a soldier running to a cover object that is far away. By the time he gets halfway to the cover object, the position may no longer be ideal, given all the moving around the enemy and his own team have done. At this point, his current actions are popped off the stack, making "Defensive Tactics" the active action. From there the soldier will decide whether a different piece of cover would work better, or perhaps stand his ground to take advantage of exposed enemies while they are running to their cover.
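The unwinding step in that example amounts to popping everything above the nearest subgoal that should re-plan. A minimal sketch, where the `unwind_to` helper is a hypothetical name and the stack contents are illustrative:

```python
from typing import List


def unwind_to(stack: List[str], subgoal: str) -> List[str]:
    """Pop actions until `subgoal` is on top; return what was removed.
    The last element of `stack` is the top (active) action."""
    if subgoal not in stack:
        raise ValueError(f"{subgoal!r} is not on the stack")
    removed = []
    while stack[-1] != subgoal:
        removed.append(stack.pop())
    return removed
```

Only the stale actions above the subgoal are discarded, so the subgoal can choose new cover or a different tactic without planning again from the prime goal.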

I'm still actively adding actions and subgoals, but eventually I do plan to revamp the entire planner to work with a visual editor. This should make it more accessible to designers.