The following points highlight the two main planning methods used to solve AI problems. The methods are: 1. Failure of the Linear Method (Method of Repair) 2. Planning Graphs.
Method # 1. Failure of the Linear Method (Method of Repair):
Now let us take a difficult problem shown in Fig. 8.7. Sussman anamoly
There are two ways to start with the problem solution, giving the goal stacks:
Suppose we choose alternative 1 and produce the following goal stack to put A on B:
ON(C, A)
CLEAR(C)
ARMEMPTY
ADVERTISEMENTS:
ON(C, A) ˄ CLEAR (C) ˄ ARMEMPTY
UNSTACK(C, A)
ARMEMPTY
CLEAR (A) ˄ ARMEMPTY
ADVERTISEMENTS:
PICKUP (A)
CLEAR (B) ˄ HOLDING (A)
STACK (A, B)
ON (B, C)
ADVERTISEMENTS:
ON (A, B) ˄ ON (B, C)
We can then pop off the stack goals which have already been satisfied, until we reach the ARMEMPTY precondition of PICKUP (A). To satisfy it, we need to PUTDOWN(C).
Then we can continue popping it until the goal stack is:
ON (B, C) ˄
ADVERTISEMENTS:
ONTABLE(A) ˄
ONTABLE(C) ˄
ARMEMPTY
When the remaining goal on stack-
ON (A, B) ˄ ON (B, C)
The sequence of operators applied so far is:
i. UNSACK(C, A)
ii. PUTDOWN(C)
iii. PICKUP (A)
iv. STACK (A, B)
Now let us satisfy ON (B, C). This can be done by stacking B on C. (The reader should go through the algorithm of doing this). But to do that, it has to un-stack A from B. After this goal ON (B, C) is achieved and popped off the goal stack: we would need additional sequence of operators.
v. UNSTACK (A, B)
vi. PUTDOWN (A)
vii. PICKUP (B)
viii. STACK (B, C)
And the corresponding problem state would become:
ON (B, C) ˄
ONTABLE(A) ˄
ONTABLE(C) ˄
ARMEMPTY
When the remaining goal on stack
ON (A, B) ˄ ON (B, C)
is checked it is not found to be satisfied. In the process of achieving ON (B, C) the action ON (A, B) need to be undone and the difference between the goal and the current state is ON (A, B).
This difference is added in the goal stack to achieve the goal and the operators needed are:
ix. PICKUP (A)
x. STACK (A, B)
Now the combined goal is again checked and this time it is satisfied. The complete plan is the combination of the ten operators enlisted above and obviously the solution is not that efficient.
A similar sequence of operators would have been produced if the two major sub-goals in the opposite order had been considered. These plans are not good. How do we decide whether a good solution has been found?
There are two alternatives:
1. Repair the plan we have already made. We can look for the places in the plan where we perform an operation and then immediately undo it. If such places are found, we eliminate both the doing and undoing steps from the plan. In the above plan eliminate the steps 4 and 5 and then steps 3 and 6.
The remaining plan will be:
i. UNSTACK(C, A)
ii. PUTDOWN(C)
iii. PICKUP (B)
iv. STACK (B, C)
v. PICKUP (A)
vi. STACK (A, B)
The plan contains the minimum numbers of operators to solve this problem. But for more complex problems the interfering operations may be farther apart in the plan and thus much more difficult to detect. Moreover, much efforts have been wasted which is obvious in the method of repair.
2. The second and better indeed alternative could be to devise direct plan for the planning rather than repairing. We now turn to be second alternative.
Non-Linear Planning using Constrains (Partial Order Planning):
Planners in the early 1970s generally worked with totally ordered sequences forward and backward state space search. Problem decomposition was achieved by computing a sub-plan for each sub-goal and the stringing the sub-plans together in some order. This approach, called linear (goal stacking) planning by Sacredoti (1975) was soon discovered to be in competent.
It cannot solve some simple problems such as the Susman anomaly (Fig. 8.7) found by Allen Brown during experimentation with HACKER system. A complete planner must allow for interleaving of actions from different sub-plans within a single sequence (multiple subprograms are worked simultaneously). Such a plan is called non-linear plan. A non- interleaved planner is a planner which, when given two sub-goals G1 and G2 produces either a plan for G1 concatenated with a plan for G2 or vice-versa. We have seen how a linear (stacked) plan does not produce solution for an interleaved solution such as Sussman problem.
Here we describe a better plan for this anomly:
i. Start on the goal ON (A, B) by clearing A, thus by putting C on the table.
ii. Complete the goal ON (A, B) by stacking A on B.
iii. Achieve the goal ON (B, C) by stacking Bon C.
Many programs are available which can deal with such non-linear problems. A few are- HACKER by Sussman (1975), NOAH, NONLIN, MOLGEN and TWEAK. The last two programs used constraint posting as a pivotal technique.
Heuristics are required to tackle non-linear problems, but before that the concept of constraint posting, a central technique for such a method need clarification.
In constraint posting a plan is prepared when:
i. Some operators are hypothesised
ii. They are then partially ordered
iii. And then variables within operators are bound.
There is no clear idea as how the operators should be ordered with respect to each other, they are only partially ordered to generate a partial plan. This is then converted into a number of total ordered plans, hence this method is also called Partial-order Planning. The partial order solution then corresponds to possible total order plans, each of these is called a lineraisation of the partial order plan. Partial order planning can be implemented as a search in the space of partial order plans (or just plans).
According to this plan we start with a null plan, that is a plan with no steps. Next we look at state, and finally suggest certain steps for achieving the goal. The actions in this search are not actions in the world, but actions on plans: adding a step to the plan, imposing an ordering which puts one action before another, and so on.
The states of our search problem will be (mostly) unfinished plans. To continue with the Sussman anomly we begin with the null plan. Next we look at goal plans (we call plans rather than states) and add some steps.
Mean-Ends-Analysis tells us to choose two steps with respective post conditions ON (A, B) and ON (B, C):
Each step (sub-goal) is written with its preconditions above it and its post conditions below it. Delete post conditions are marked with¬(negation). We may note that at present the steps are not ordered with respect to each other. Also we want to achieve both of them ultimately. But these, steps cannot be achieved as such because some pre-conditions are not achieved (marked with *). These star marked conditions HOLDING are not achieved because at the start of the problem, the robot’s arm does not hold anything.
A set of actions which make up the steps of the plan are added. These are taken from actions in the planning problem (process called addition). This is one of the heuristics we would be using in generating non-linear plans. Addition is a basic method, used by Newell and Simon (1963) where Means-Ends-Analysis was used to pick up operators with post conditions corresponding to the desired plans. Fig. 8.8., defines step addition along with other heuristics, introduced later, to arguably first simple and readable description of a complete partial order planner.
Using heuristic (step addition) the pre-conditions of above two steps become:
Adding these PICKUP steps (enclosed in two lines) is not enough for satisfying the *HOLDING pre-conditions of the STACK steps. This is because there are no ordering constrains present among the steps. If in a successful plan the PICKUP steps were to follow the STACK steps, the *HOLDING preconditions would need to be satisfied by some other set of steps. Thus, we solve this problem by introducing ordering constraints whenever we make use of step addition.
In our case we can say that each PICKUP step should precede its corresponding STACK step:
PICKUP (A) ← STACK (A, B)
PICKUP (B) ← STACK (B, C)
where S1 ← S2 means that the step S1 must precede S2 or “S1 before S2” written as S1 < S2.
There are four unachieved pre-conditions in the above form partially ordered steps, and let us talk about them now:
*CLEAR (A) is unachieved since at the start block A is not clear. CLEAR(B) is unachieved because although B is clear in the initial state but there exists a step STACK(A, B) with post condition ¬ CLEAR(B), and that step might precede the step with *CLEAR(B) as a precondition. To achieve CLEAR (B) we use a second heuristic PROMOTION.
In the spirit of this PICKUP (B) must come before STACK (A, B):
PICKUP (B) ←STACK (A, B)
The unachieved condition *PICKUP (A) will be taken up later after dealing with the pre-condition * ARMEMPTY. This strategy of delaying a choice during search is called least commitment strategy. While the initial state has an empty arm, each of the two pickup operators contains ¬ ARMEMPTY as post conditions after HOLDING (A) or HOLDING (B). Either operator could prevent the other from executing.
So we should use again, heuristic, PROMOTION, in order to achieve at least one of the two pre-conditions:
PICKUP (B) ← PICKUP (A)
Since the initial situation contains an empty arm and no step preceding PICKUP(B) could make it occupied (unempty), so the preconditions of PICKUP(B) are all satisfied. Now let us take the precondition *ARMEMPTY in PICKUP (A) step.
We have introduced above PICKUP (B) to precede PICKUP (A) and PICKUP (B) asserts ¬ ARMEMPTY which contradicts the present goal (ARMEMPTY), so heuristic DECLOBBERING need to be brought in between PICKUP (B) and PICKUP (A). That is another step should be introduced; the STAPK (B, C) does the needful in the following sequence.
PICKUP (B) ←STACK (B, C) ← PICKUP (A)
The step STACK (B, C) has de-clobbered, PICKUP (B), which in turn clobbers PICKUP (A)’s precondition.
It is now right point to take up the unachieved precondition *CLEAR (A) from the PICKUP (A) step.
Using step addition we have CLEAR (A) occurring as precondition in UNSTACK(x, A):
*ON(x, A)
*CLEAR (x)
*ARMEMPTY
UNSTACK(x,A)
¬ ARMEMPTY
CLEAR (A)
HOLDING (A)
¬ ON(x, A)
We have introduced the variable x because the only post condition, we are interested in the condition, CLEAR (A). Whatever block is on top of A is irrelevant. Constraint posting allows us to create plans which are incomplete in term of ordering of the steps. So the variable ‘x’ is invoked which allows us to avoid committing to particular instantiations of operators.
Out of the three un achieved conditions ON(x, A) can be easily achieved by assigning value C to the variable x; because this makes block C is on block A in the initial state, x = C in step UNSTACK(x, A). This heuristic is called simple establishment.
The unachieved pre-conditions CLEAR(C) and ARMEMPTY, can be tackled by again using heuristic PROMOTION:
UNSTACK(x, A) ←STACK (B, C)
UN STACK(x, A) ← PICKUP (A)
UNSTACK(x, A) ←PICKUP (B)
A word of caution requiring the use of promotion: we must always ensure that the new step does not clobber some pre-condition of a later, already existing step. This has in fact already happened in our example. The step PICKUP (B) requires ARMEMPTY, but this is denied by the new UNSTACK(x, A) step.
A solution to this problem is to add new de-clobbering step to the plan:
HOLDING(C)
PUTDOWN (C)
¬ HOLDING(C)
ONTABLE(x)
ARMEMPTY
ordered as
UNSTACK(x, A) ←PUTDOWN(C) ←PICKUP (B)
We may further notice that there are two types of de-clobbering-one in which an existing de-clobbering step is used to de-clobber another and the second in which a new clobbering step is introduced. The reader is advised to search for these two types of de-clobbering agents in the above example.
Lucky for us, the precondition of our newest step PUTDOWN, HOLDING(x) is satisfied and with this all the preconditions of all the steps are satisfied except to use the plan ordering and variable binding constraints to build a concrete plan:
1. UNSTACK(C, A)
2. PUTDOWN (C)
3. PICKUP (B)
4. STACK (B, C)
5. PICKUP (A)
6. STACK (A, B)
The same plan was used to solve the same problem by Goal stack Planning technique.
Out of the five heuristics defined above, four only were used here (find which one is not used). These heuristics plans are called PLAN MODIFICATION OPERATIONS.
Work on the TWEAK planner presented formal definition of the five plan modification operations and prove that they are sufficient for solving ANY SOLVABLE non-linear planning problem. Thus, TWEAK removed the aphorism in the results of non-linear planning research.
The following is the algorithm of non-linear planning (TWEAK):
1. Initialise plan, S to be the set of propositions in the goal state.
2. Remove some unachieved propositions P from S.
3. Achieve P by using step addition, promotion, de-clobbering, simple establishment or separation.
4. Review all the steps in the plan, including any new steps introduced by step addition, to see if any of their pre-conditions are unachieved. Add to S the new set of unachieved pre-conditions.
5. If S is empty, complete the plan by converting the partial order of steps into a total order, and instantiate any variables as necessary.
6. Otherwise, go to step 2.
It is a matter of experience that very sequence of plan modification operations does not lead to a solution. For instance, infinite time use of step addition may not ever give solution. Steps 2 and 3 in the above algorithm may not be determined with certainly. So some smart heuristic may be applied, which in this case may be to try with promotion first when both promotion and addition become applicable.
Method # 2. Planning Graphs:
All of the suggested heuristics for total order and partial order planning can suffer from inaccuracies. For better estimation of heuristics a special data structure planning graph can be used. These heuristics can be applied to any of the search techniques available or the solution can be directly extracted from the planning graph, using a specialised algorithm, called GRAPHPLAN.
A planning graph consists of sequence of levels which correspond to time steps in the plan; the level ɸ being the initial state. Each level contains a set of literals and a set of actions. These have the same meaning as in knowledge representation schemes.
1. Literals are those which could be true at that point of time, depending upon the actions completed at the preceding time step.
2. Actions are those actions which could have their pre-conditions satisfied at that time step, depending on which of literals actually hold true.
The planning graph records only a restricted subset of possible negative interactions among actions, therefore, it might be optimistic about the minimum number of time steps for literal to become true. None the less this number of steps in the planning graph provides a good estimate of how difficult it is to achieve a given literal from the initial state. Actually, the planning graph is defined in such a way that it can be constructed very efficiently.
Planning graphs work only for propositional planning that is for those problems which contain no variables. As already mentioned both STRIPS and ADL representations can be propositionalised. For problems with large number of objects this could result in a very substational blow up in the number of action schemes. In such cases this planning graphs have proved to be effective tools for solving hard planning problem. Fig. 8.10 shows the planning graph for the problem shown in Fig. 8.9.
We start with state level, so which represents the problem “initial state” and action level A0 in which we place all actions whose pre-conditions are satisfied in the previous level. Each action is connected to its preconditions in S0 and its effects in S1, introducing new literals (into S1which were not present in S0).
The planning graph needs a way to represent in action as well as action, that is, it needs the equivalent of frame axioms in situation calculus. (This theorem states that literals remain true from one situation to the next if no action alters it). In the planning graph this is done with the help of a set of persistence actions.
For every positive and negative literal C we add to the problem a persistence action with precondition C and effect C. Fig. 8.10 shows one “real” action Eat Cake in A0 along with two persistence actions. Rectangles indicate actions, small squares indicate persistence, straight lines show preconditions and effects.
Dotted lines indicate mutual exclusion (or mutex lines). Level A0 contains all the actions which could occur in state S0, it also records conflicts between actions which would prevent them from occurring together (shown by dotted curved lines). For example, Eat(Cake) is mutually exclusive with the persistence of either Have (Cake) or ¬ Eaten(Cake).
Level S1contains all the literals which could result from picking any subset of actions in A0. It also contains mutex links indicating the literals which could not appear together regardless of the choice of actions: for example, Have (Cake) and Eaten (Cake) are mutex, depending on the choice of actions in A0, one or the other, but not both could be the result. In other words, S1 represents multiple states; just as regression state space search does, and the mutex links are constraints which define the possible states.
We continue in this way, alternating between state level S and action level A, until we reach a level where two consecutive levels are identical, we say that the graph has levelled off. At this stage further expansion stops.
So the structure of the plan graph becomes:
1. Level Ai contains all the actions which are applicable in Si’ along with constraints stating which pairs of actions cannot be executed.
2. Every level Si contains all the literals which could result from any possible choice of actions in As-1 level, and the constraints saying which literals could not pair up.
Thus, it is worth noting that the process of constructing the planning graph does not require choosing between all possible actions, which otherwise would cause combinatorial explosion. Instead it just records the impossibility of certain choice, using mutex links. Thus, the complexity of constructing this data structure is a low order polynomial in terms of the number of actions and literals as compared to state space method of planning which is exponential in terms of the number of literals.
Mutex Conditions for Actions and Literals:
A mutex relation holds between two actions at a given level if any of the following conditions holds:
i. Inconsistent Effects:
One action negates an effect of the other. For example, Eat (Cake) and the persistence of Have (Cake) have inconsistent effects because they disagree on the effect Have (Cake).
ii. Interference:
One of the effects of one action is the negation of a precondition of the other. For example, Eat (Cake) interferes with the persistence of Have (Cake) by negating its precondition.
iii. Competing Needs:
One of the preconditions of one action is mutually exclusive with a precondition of the other. For example, Bake (Cake) and Eat (Cake) are mutex because Bake (Cake) is the precondition of Have (Cake).
A mutex condition holds between two literals at the same level if one is the negation of the other or if each possible pair of actions which could achieve the two literals is mutually exclusive. This condition is called inconsistent support. For example, Have (Cake) and Eaten (Cake) are mutex in S1 because the only way of achieving Have (Cake), the persistent action, is mutex with the only way of achieving Eaten (Cake), namely Eat (Cake). In S2, the two literals Have (Cake) and Eaten (Cake) arenotmutex (can you find the reason?)
Heuristic Estimation through Planning Graph:
A planning graph, once constructed becomes a rich source of information about the problem solving (can you brood over at this point of time?) A literal which does not appear in the final level of the graph cannot be achieved by any plan.
This observation can be used in backward search as shown below:
Any state containing an unachievable literal has a cost (n) = ∞. This is similar to partial order planning (stack-planning method), wherein any plan with unachieved open-condition has h (n) = ∞.
The idea can be generalized:
We can estimate the cost of achieving any goal literal as the level at which it appears in the planning graph (we can call it as the cost of the goal). In Fig. 8.10, Have (Cake) has cost 0 and Eaten (Cake) has the level cost 1. It can be shown (not given here) that these estimates are admissible for the individual goals.
The estimate may not be very good, however, because planning graphs allow several actions at each level, whereas the heuristic counts just the levels and not the number of actions. For this reason it is common to use a serial planning graph for computing heuristics. In this type of graph only one action can actually occur at any given time step, by adding mutex links between every pair of actions, except persistence actions.
To estimate the heuristic for a conjunction of goals, no unified approach is available. Three different approaches-maximum level heuristic, level sum heuristic and set level heuristic are used depending upon the efficiency required. Towards an attempt to make use of planning graph for generating accurate heuristic is to treat it as a relaxed problem.
In the sprit of relaxed problem; for a literal to appear in the planning graph at level Si with a guarantee g there should exist a plan with i action levels which achieves g and if g does not appear in the plan then the plan does not exist at all.
But if g goes appear, then all that the planning graph promises is that there is does the plan which possibly achieves g and has no ‘obvious’ flaws. Any obvious flaw is the flaw which can be detected by looking at mutex relations. If there are more flaws, then it is not worth continuing with this method of planning, as is the case with constraint posting method.
Graph Plan Algorithm:
The algorithm for the graph plan helps in extracting a plan directly from the plan graph rather than using the graph to provide a heuristic. Two main steps iterate within a loop. First, it checks whether all the goal literals are present in the current level with no mutex links between any pair of them.
If this is so then a solution might exist within the current graph, so the algorithm tries to extract that solution. Otherwise, it expands the graph by adding the actions for the current level and the state literals for the next level. The procedure continues until either a solution is found or no solution exists. In the event when there is no solution the graph plan algorithm does not loop forever.
The planning graph has the following characteristics:
1. Literals Increase Monotonically:
Once a literal appears at a given level, it will appear at all subsequent levels, because the persistence actions cause a literal to stay there forever.
2. Actions Increase Monotonically:
Once an action appears at a given level it will appear at all subsequent levels. This is because, if the pre-conditions of an action appear at one level they will appear at subsequent levels and so will be the action.
3. Mutexes Decrease Monotonically:
If two actions are mutex at a given level A then they will also be mutex for all previous levels at which they both appear. The same holds for mutexes between literals.
Because the actions and literals increase and the mutexes decrease and because there are only a definite number of actions and literals every planning graph will eventually level-off. Once a graph has levelled off and still it is missing one of the goals or if two goals are mutex, then the problem can never be solved, we should stop the GRAPHPLAN algorithm and return failure.