Incomplete draft: do not cite!

A grammar is a set of rules describing the arrangements of words within phrases.1  Nearly all grammars in games and CS are variations of a specific formalism called a context-free grammar (CFG).  Basic CFGs are one of the simplest declarative formalisms.  They're used for describing programming languages, dialog for non-player characters, patterns for generating random names for things in a game, and many other applications.

A grammar for treasure

From a practical standpoint, a CFG defines a number of phrase types, each in terms of a set of text templates: fixed text strings with the ability to substitute other phrase types in particular spots.  For example, if we were making a random treasure generator for a game, we might say that a treasure can be a weapon, a piece or armor, or an item:

Treasure: [Weapon]
Treasure: [Armor]
Treasure: [Item]

Here, the name before the colon is a phrase type and the text after it is the template for generating it.  A type can have many alternative templates, and if a phrase type appears in brackets in the template, then a phrase of that type is to be substituted for the bracketed expression.  So the grammar above says “a treasure is a weapon, armor, or item.”

Great.  What’s a weapon?

Weapon: [PreBuff] [WeaponType] [PostBuff]
WeaponType: sword
WeaponType: mace
...

Since “sword” and “mace” are not in brackets, they’re printed verbatim.  So this says that a weapon is a weapon type (sword or mace), surrounded by two buffs, a “pre-buff” and a “post-buff”:

PreBuff: +1
PreBuff: +2
...
PreBuff: cursed
...

PostBuff: of [Element]
PostBuff: of [Effect]
...

Element: earth
Element: air
Element: fire
Element: water
Element: soul

Effect: insanity
Effect: sleep
Effect: poison
Effect: social anxiety
...

Generating treasure

To generate a treasure using this grammar, we start with [Treasure] and randomly choose one of its rules, e.g.:

Treasure: [Weapon]

to change [Treasure] into [Weapon].  Then, we use the rule for Weapon to turn it into [PreBuff] [WeaponType] [PostBuff].  Then, we pick a rule to replace [PreBuff], e.g.:

PreBuff: cursed

and change our text to cursed [WeaponType] [PostBuff].  Then, we pick rules for [WeaponType] and [PostBuff] and we get something like:

cursed mace of social anxiety

Producing text using a grammar is called generation.  Conversely, using a grammar to break existing text up into its constituent phrases and sub-phrases, is called parsing.2  Nearly all programming languages are defined in terms of CFGs.  Compilers or interpreters nearly always start by using the grammar to parse the program into its pieces.

This is the grammar above written as a Step program.  To try it, click the code box below to open a Step window.  Then type [Treasure] in the command box and press Run:

# Try: [Treasure]
Treasure: [Weapon]

Weapon: [PreBuff] [WeaponType] [PostBuff]

[randomly]
WeaponType: sword
WeaponType: mace

[randomly]
PreBuff: +1
PreBuff: +2
PreBuff: cursed

[randomly]
PostBuff: of [Element]
PostBuff: of [Effect]

[randomly]
Element: earth
Element: air
Element: fire
Element: water
Element: soul

[randomly]
Effect: insanity
Effect: sleep
Effect: poison
Effect: social anxiety

The line starting with # is called a comment.  It's a note from the human programmer to other humans.  Step ignores the rest of a line after a #.

Fancier grammars

In games and computer science, we frequently augment the basic CFG formalism with mechanisms for:

  • Passing information into and out of the parsing or generation process, and
  • Placing conditions on when a given template can be used based on that information

This has led to the development of endless variations on CFGs.  In this book, we’ll focus on a particular class of “unification grammars” called definite clause grammars, but you don’t really need to know that term, especially since it’s an extremely non-obvious name.3

In this chapter, we’ll walk you through how to write CFGs using the programming language Step.  It will also serve as an introduction to the Step language.  In the next chapter, we’ll introduce how Step can be used for logical reasoning, after which we’ll return to grammars and discuss how to use logic inside grammars.

Notes


  1. Technically, those are a particular kind of grammar called a phrase structure grammar.  There are other kinds of grammars such as dependency grammars.  However, most grammars in computer science and game development are phrase-structure grammars.

  2. From the Latin pars meaning “part” or “piece.”

  3. Esoterica: The name comes from the fact that DCGs can be expressed in terms of sentences in a formal logic called definite clauses.