Planning Competition Results

The 1st International Planning Competition, 1998

The original web page is included below. Here is some complementary material:

- A zip file collecting the benchmarks used in the competition, per domain in a simple form. (The benchmarks are also contained in the original compressed tar file, but in a more complicated form.)
- The original PDDL manual.
- The AI Magazine article about the competition, providing amongst other things a description of PDDL and a summary of the competition results.

AIPS98 Planning Competition Results

Five contestants participated in the First Planning Systems Competition:

- Planner Name: Blackbox
  - Creators: Henry Kautz (ATT) and Bart Selman (ATT & Cornell)
  - Tracks: Strips
  - Language/compiler: C
- Planner Name: HSP
  - Creators: Blai Bonet and Hector Geffner (Simon Bolivar University)
  - Tracks: Strips
  - Language/compiler: C/gcc
- Planner Name: IPP
  - Creator: Jana Koehler (Freiburg University)
  - Tracks: Strips, ADL
  - Language/compiler: C/C++ (gcc/gpp)
- Planner Name: SGP
  - Creators: Corin Anderson and Dan Weld (University of Washington)
  - Tracks: Strips, ADL
  - Language/compiler: Lisp
- Planner Name: STAN
  - Creators: Derek Long and Maria Fox (Durham University)
  - Tracks: Strips
  - Language/compiler: Gnu C++ (g++) and Gnu C (gcc)

The competition seemed to succeed in its goals of generating excitement and establishing how well top-quality planning algorithms actually perform. It did not, however, succeed in finding a clear-cut winner. For details, see the README file in the compressed tar file. This file contains all the problems, solutions, and scoring procedures. We encourage others to run their systems on the problems and compare the results.

Round 1 of the competition consisted of two tracks, labeled "ADL" and "Strips." The difference between the two is that ADL allows context-dependent action effects and quantified preconditions. (For details, see the PDDL manual.) When there are both an ADL version and a Strips version of a domain, there is usually a slight difference between them.

Here is a brief description of each domain (roughly in order of complexity). Within each domain, problems are numbered in approximate order of increasing complexity, although for artificially generated problems it is hard to guarantee that kind of ordering.

Movie: In this domain, the goal is always the same (to have lots of snacks in order to watch a movie), but the number of constants increases with problem number. Some planners have combinatorial problems in such cases. This domain was created by Corin Anderson.
Gripper: There is a robot with two grippers. It can carry a ball in each. The goal is to take N balls from one room to another; N rises with problem number. Some planners treat the two grippers asymmetrically, giving rise to an unnecessary combinatorial explosion. This domain was created by Jana Koehler.
Logistics: There are several cities, each containing several locations, some of which are airports. There are also trucks, which can drive within a single city, and airplanes, which can fly between airports. The goal is to get some packages from various locations to various new locations. This domain was created by Bart Selman and Henry Kautz, based on an earlier domain by Manuela Veloso.
Mystery: There is a planar graph of nodes. At each node are vehicles, cargo items, and some amount of fuel. Objects can be loaded onto vehicles (up to their capacity), and the vehicles can move between nodes; but a vehicle can leave a node only if there is a nonzero amount of fuel there, and the amount decreases by one unit. The goal is to get cargo items from various nodes to various new nodes. To disguise the domain, the nodes were called emotions, the cargo items were pains, the vehicles were pleasures, and fuel and capacity numbers were encoded as geographical entities. This domain was created by Drew McDermott.
Mprime: This is the Mystery domain with one extra action, the ability to squirt a unit of fuel from any node to a neighboring node, provided the originating node has at least two units. This domain was created by Drew McDermott.
Grid: There is a square grid of locations. A robot can move one grid square at a time horizontally and vertically. If a square is locked, the robot can move to it only by unlocking it, which requires having a key of the same shape as the lock. The goal is to get keys from various locations to various new locations. This domain was created by Jana Koehler, based on an earlier domain by Drew McDermott.
Assembly: The goal is to assemble a complex object made out of subassemblies. The sequence of steps must obey a given partial order. In addition, through poor engineering design, many subassemblies must be installed temporarily in one assembly, then removed and given a permanent home in another. This domain was created by Drew McDermott.
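
The fuel rules of the Mystery and Mprime domains described above can be illustrated with a small state model. This is only a sketch in Python, not the PDDL encoding used in the competition; the names World, Vehicle, load, unload, move, and squirt are hypothetical and chosen for readability, and the disguised vocabulary (emotions, pains, pleasures) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Vehicle:
    node: str                      # node where the vehicle currently is
    capacity: int                  # maximum number of cargo items on board
    cargo: set = field(default_factory=set)


@dataclass
class World:
    fuel: dict       # node -> units of fuel stored at that node
    edges: set       # frozenset({a, b}) for each pair of neighboring nodes
    vehicles: dict   # vehicle name -> Vehicle
    cargo_at: dict   # cargo item -> node (item is absent while loaded)

    def adjacent(self, a, b):
        return frozenset((a, b)) in self.edges

    def load(self, v, item):
        veh = self.vehicles[v]
        # The item must be at the vehicle's node and there must be room.
        if self.cargo_at.get(item) != veh.node or len(veh.cargo) >= veh.capacity:
            return False
        del self.cargo_at[item]
        veh.cargo.add(item)
        return True

    def unload(self, v, item):
        veh = self.vehicles[v]
        if item not in veh.cargo:
            return False
        veh.cargo.remove(item)
        self.cargo_at[item] = veh.node
        return True

    def move(self, v, dest):
        veh = self.vehicles[v]
        # A vehicle can leave a node only if that node has fuel left;
        # leaving consumes one unit of the node's fuel.
        if not self.adjacent(veh.node, dest) or self.fuel[veh.node] < 1:
            return False
        self.fuel[veh.node] -= 1
        veh.node = dest
        return True

    def squirt(self, src, dest):
        # Mprime only: transfer one unit of fuel to a neighboring node,
        # allowed when the originating node has at least two units.
        if not self.adjacent(src, dest) or self.fuel[src] < 2:
            return False
        self.fuel[src] -= 1
        self.fuel[dest] += 1
        return True


world = World(
    fuel={"a": 1, "b": 0, "c": 2},
    edges={frozenset(("a", "b")), frozenset(("b", "c"))},
    vehicles={"v": Vehicle(node="a", capacity=1)},
    cargo_at={"x": "a"},
)
world.load("v", "x")
world.move("v", "b")               # uses the last unit of fuel at "a"
stuck = not world.move("v", "c")   # "b" has no fuel, so the vehicle is stuck
world.squirt("c", "b")             # Mprime's extra action unblocks it
world.move("v", "c")
world.unload("v", "x")
```

In this toy instance the cargo cannot reach node "c" under the Mystery rules alone, because the vehicle is stranded at a node without fuel; the extra Mprime action makes the problem solvable.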

Two planners, IPP and SGP, competed in the ADL track. Four, Blackbox, HSP, IPP, and STAN, competed in the Strips track. Problems were drawn from domains Assembly (ADL only), Gripper, Logistics, Movie, Mystery, and Mprime.

The contestants were given two or three days to run their planners on these problems (depending on when they arrived in Pittsburgh). The idea was to allow them to do any last-minute tuning of their planners in Round 1, then do Round 2 without any further tuning. Round 1 ended at 5 PM on Monday, June 8. After much discussion (but, fortunately, no fatalities), the Committee decided to declare IPP the winner of the ADL track, and focus Round 2 on Strips problems only, with all four Strips planners (Blackbox, HSP, IPP, and STAN) as finalists. In addition, we decided to compute statistics for all the systems, but avoid assigning a single number and declaring a winner.

For Round 2, we used the Grid, Logistics, and Mprime domains, all in their Strips versions. In Round 1, we had 140 Strips problems, of which 52 could not be solved by any of the planners. Having established the range the systems could realistically strive for, we deliberately chose a smaller number of problems, closer to that range, for Round 2.

The results are given in the following files:

- round1/results/adl-round1.results
- round1/results/strips-round1.results
- round2/results/round2.results

As explained in the README file, we were not entirely satisfied with the output, and have provided an alternative view of the data in:

- scoring/uniform-adl-round1.results
- scoring/uniform-strips-round1.results
- scoring/uniform-round2.results

After this last iteration, it is really hard to declare a winner. In Round 2, IPP solved more problems than any other program and found shorter plans. STAN ran faster than any other on the problems it solved. But HSP solved the most problems in Round 1, using different domains. Blackbox ran fastest in Round 1.

Data

Here are the "uniform" results for ADL round 1:

ADL Round 1

Planner	Av. Time	Solved	Fastest	Shortest	Score
IPP	21396	69	68	68	199.34
SGP	14343	38	5	35	45.02

The two planners solved 69 problems total; they were tied for fastest time on 4 problems.
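
The figures in the ADL table are consistent with one another if a tie for fastest time is credited to both planners. That crediting rule is an assumption, not something stated on this page; under it, the number of distinct problems with a fastest solver follows from inclusion-exclusion over the two planners, as this short Python check shows (the function name is hypothetical).

```python
def distinct_fastest(count_a, count_b, ties):
    """Number of distinct problems on which at least one of two planners
    was fastest, assuming each tie is counted once for each planner."""
    return count_a + count_b - ties


# "Fastest" column from the ADL Round 1 table: IPP 68, SGP 5; 4 ties.
adl_distinct = distinct_fastest(68, 5, 4)
print(adl_distinct)
```

The result equals the 69 problems that the two planners solved in total.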

Here are the results for Strips round 1 and round 2:

Strips Round 1

Planner	Av. Time	Solved	Fastest	Shortest	Score
Blackbox	1498	63	16	55	163.83
HSP	35483	82	19	61	233.64
IPP	7408	63	29	49	158.78
STAN	55413	64	24	47	177.56

Round 2

Planner	Av. Time	Solved	Fastest	Shortest	Score
Blackbox	2464	8	3	6	172.28
HSP	25875	9	1	5	119.90
IPP	17375	11	3	8	271.28
STAN	1334	7	5	4	180.62

Note that the planners are sorted in alphabetical order.
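
To see why no single winner emerges, it can help to look at which planner leads each column. The following Python sketch copies the numbers from the two Strips tables and picks the best entry per column. It assumes that a lower average time is better and that higher is better for every other column; the averages may well be taken over different sets of solved problems, so the time comparison is only indicative. The names COLUMNS, LOWER_IS_BETTER, and leaders are hypothetical.

```python
COLUMNS = ("Av. Time", "Solved", "Fastest", "Shortest", "Score")
LOWER_IS_BETTER = {"Av. Time"}   # assumption: smaller average time is better

ROUND1 = {
    "Blackbox": (1498, 63, 16, 55, 163.83),
    "HSP": (35483, 82, 19, 61, 233.64),
    "IPP": (7408, 63, 29, 49, 158.78),
    "STAN": (55413, 64, 24, 47, 177.56),
}
ROUND2 = {
    "Blackbox": (2464, 8, 3, 6, 172.28),
    "HSP": (25875, 9, 1, 5, 119.90),
    "IPP": (17375, 11, 3, 8, 271.28),
    "STAN": (1334, 7, 5, 4, 180.62),
}


def leaders(table):
    """Return, for each column, the planner with the best value."""
    result = {}
    for i, column in enumerate(COLUMNS):
        pick = min if column in LOWER_IS_BETTER else max
        result[column] = pick(table, key=lambda name: table[name][i])
    return result


round1_leaders = leaders(ROUND1)
round2_leaders = leaders(ROUND2)

# 140 Strips problems in Round 1, 52 of them solved by no planner.
solvable_round1 = 140 - 52

for label, result in (("Round 1", round1_leaders), ("Round 2", round2_leaders)):
    print(label, result)
print("Round 1 problems solved by at least one planner:", solvable_round1)
```

With these numbers, the leaders differ between the two rounds and between columns within a round, which matches the remark that it is hard to declare a winner.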

It is hard to draw any conclusion from these data, except to note that all of these planners performed very well, compared to the state of the art a few years ago. Many of the plans found were 30 or 40 steps long, and some were longer than 100 steps.