BP2 grammars

The language generated by a BP2 grammar is the intersection of a pattern language and an unrestricted (phrase-structure or type 0) language (Salomaa 1973:15, Révész 1985:6, Bel 1989). A production rule is written

P --> Q

in which "P" (the left argument ) and "Q" (the right argument ) are strings of variables and/or terminals; the left argument should contain at least one variable.

The right argument may be the empty string. In formal language literature the empty string is often notated 'λ', for instance:

S --> λ

Since 'λ' is not accessible on the keyboard you may type any of the following:

S -->
S --> lambda
S --> nil
S --> empty
S --> null

Reserved tokens 'lambda"', 'nil', 'empty' and 'null' may not be redefined as terminal symbols.

In derivations of a formal grammar, a string containing several occurrences of a single variable is normally not handled as a pattern: each occurrence may be derived independently. Therefore a special format is needed for patterns. If for instance "YabZXcYXdX" is a pattern, it is necessary to indicate explicitly that all occurrences of "X" (resp. "Y") must be derived identically. The notation used in BP2 grammars is

(=Y) abZ (=X) c (:Y) (:X) d (:X)

in which brackets flagged with '=' are reference expressions and brackets with ':' slave expressions. Multilevel bracketing is possible. For instance, the following BP2 grammar

S --> (= (= X) S (:X)) (: (= X) S (:X))
S --> nil
X --> a
X --> a S b
X --> b

generates self-similar fractals.

Bracketing is also used to indicate homomorphic transformations. For instance, given the TRANS and OCT homomorphisms (see §4.1 supra), pattern

TRANS TRANS (=X) (:X) OCT TRANS TRANS TRANS TRANS (:X)
would represent three occurrences of the same structure of simple notes "X", the first one being two semitones higher, and the last one four semitones and one octave higher than the second one.

Brackets and other structural markers are removed once a terminal string (a sentence or item) has been generated.

BP grammars are generally arranged in several layers of subgrammars separated with lines of hyphens. These may be called transformational grammars (in a non-linguistic sense, see Kain 1981:24-5). Each subgrammar may have its own derivation mode :

-- "RND" mode: rules are applied in a random order. (Default mode)
-- "LIN" mode: candidate rule chosen randomly among rules yielding a leftmost derivation.
-- "ORD" mode: candidate rules are applied in the order in which they appear in the grammar.
-- "SUB" mode: same as "RND", but all occurrences of the selected rule's left argument in the work string are rewritten simultaneously. In this mode, for instance, p-substitutions and other kinds of automatic (infinite) sequences may be generated. (See §4.14 infra.) These derivations stop either when a predefined buffer length has been reached, or the weights of all candidate rules are null as the result of having been dynamically decremented during computation, or by clicking the mouse.
-- "SUB1" mode: same as "SUB", but only one rewriting of variables will be allowed. This mode is much faster than "SUB" and may be used when it is certain that the first substitution yields only terminal symbols. (See example in "-gr.Mozart", §5.4 of QuickStart)
Below is an illustration of a "LIN" subgrammar ("-gr.tryLIN"). We need to produce strings of 'a', 'b', 'c' with lengths 18, in which no two consecutive occurrences of b are found, and c's always come in pairs. The following grammar does it:

ORD
S --> X X X X X X X X X X X X X X X X X X
----------------------------------------------------------
LIN
X --> a
#b X --> #b b [Negative context. Never two consecutive 'b']
X ? --> c Y ? [Can't be applied if "X" is the rightmost symbol]
Y X --> c [This rule is length-decreasing]

Typical productions of this grammar are:

b a a b c c a c c a b a a b a b c c
b c c b c c c c c c a b a b c c b a
b c c c c c c c c c c c c c c b a a
a c c b a c c a a c c c c b c c c c
c c a a c c a b a c c a a b c c b a
a b a b a b a c c b a b a a b c c a
b c c a b a a a c c a b a c c c c b
a c c c c c c b c c c c a a a c c b
a a a c c a b c c a c c a b c c c c

The detailed derivation of the first example is self-explanatory:

S
X X X X X X X X X X X X X X X X X X
b X X X X X X X X X X X X X X X X X
b a X X X X X X X X X X X X X X X X
b a a X X X X X X X X X X X X X X X
b a a b X X X X X X X X X X X X X X
b a a b c Y X X X X X X X X X X X X X
b a a b c c X X X X X X X X X X X X
b a a b c c a X X X X X X X X X X X
b a a b c c a c Y X X X X X X X X X X
b a a b c c a c c X X X X X X X X X
b a a b c c a c c a X X X X X X X X
b a a b c c a c c a b X X X X X X X
b a a b c c a c c a b a X X X X X X
b a a b c c a c c a b a a X X X X X
b a a b c c a c c a b a a b X X X X
b a a b c c a c c a b a a b a X X X
b a a b c c a c c a b a a b a b X X
b a a b c c a c c a b a a b a b c Y X
b a a b c c a c c a b a a b a b c c

A "RND" subgrammar instead of a "LIN" subgrammar would yield, for instance

S
X X X X X X X X X X X X X X X X X X
X X X b X X X X X X X X X X X X X X
X X X b X X X X X a X X X X X X X X
X X X b X X X X X a X X c Y X X X X X
X X a b X X X X X a X X c Y X X X X X
X X a b X X X a X a X X c Y X X X X X
X X a b X X b a X a X X c Y X X X X X
X X a b X X b a X a X X c c X X X X
X X a b X b b a X a X X c c X X X X
X X a b X b b a X a X X c c X X b X
X X a b X b b a X a b X c c X X b X
X X a b X b b a X a b X c c X c Y b X
X X a b X b b a X a b X c c c Y c Y b X
X X a b X b b a b a b X c c c Y c Y b X
b X a b X b b a b a b X c c c Y c Y b X
b X a b X b b a b a b a c c c Y c Y b X
b a a b X b b a b a b a c c c Y c Y b X
b a a b a b b a b a b a c c c Y c Y b X
b a a b a b b a b a b a c c c Y c Y b a
b a a b a b b a b a b a c c c Y c Y b a

which (like many context-sensitive grammars) does not produce a terminal derivation.



Each rule has its own derivation mode. As suggested above, in "LIN" grammars the leftmost occurrence of the first argument of the rule is searched for. However, in "RND" or "ORD" grammars the derivation mode needs to be specified. The default derivation mode, notated "RND", means that the derivation position will be taken randomly among candidate positions in the work string. If you want to rewrite the leftmost (resp. rightmost) occurrence of the left argument of the rule, insert "LEFT" (resp. "RIGHT") before the first argument of the rule (after its weight). Note that "LEFT" or "RIGHT" rules take significantly less computation time.

You must also keep in mind that a "RND" grammar with "LEFT" rules does necessarily yield a leftmost derivation. For instance, the following variant of the "-gr.tryLIN" grammar

ORD
S --> X X X X X X X X X X X X X X X X X X
----------------------------------------------------------
RND
LEFT X --> a
LEFT #b X --> #b b
LEFT X ? --> c Y ?
LEFT Y X --> c

produces the (incorrect) derivation:

S
X X X X X X X X X X X X X X X X X X
b X X X X X X X X X X X X X X X X X
b a X X X X X X X X X X X X X X X X
b a a X X X X X X X X X X X X X X X
b a a b X X X X X X X X X X X X X X
b a a b c Y X X X X X X X X X X X X X
b a a b c Y a X X X X X X X X X X X X [Here it failed because "X --> a" was also candidate]
b a a b c Y a c Y X X X X X X X X X X X
b a a b c Y a c Y a X X X X X X X X X X
b a a b c Y a c Y a b X X X X X X X X X
b a a b c Y a c Y a b a X X X X X X X X
b a a b c Y a c Y a b a a X X X X X X X
b a a b c Y a c Y a b a a b X X X X X X
b a a b c Y a c Y a b a a b a X X X X X
b a a b c Y a c Y a b a a b a b X X X X
b a a b c Y a c Y a b a a b a b c Y X X X
b a a b c Y a c Y a b a a b a b c Y c Y X X
b a a b c Y a c Y a b a a b a b c Y c c X
b a a b c Y a c Y a b a a b a b c Y c c b
b a a b c Y a c Y a b a a b a b c Y c c b

A partial control of derivations is possible either after an interruption or before starting a production, by choosing the "Step-by-step" and "Select candidates" options. (See QuickStart §12)