BP2
grammars
The
language generated by a BP2 grammar is the intersection of a pattern language
and an unrestricted (phrase-structure or type 0) language (Salomaa 1973:15,
Révész 1985:6, Bel 1989). A
production
rule
is written
P
--> Q
in
which "P" (the
left
argument
)
and "Q" (the
right
argument
)
are strings of variables and/or terminals; the left argument should contain at
least one variable.
The
right argument may be the
empty
string.
In formal language literature the empty string is often notated 'λ', for
instance:
S
--> λ
Since
'λ' is not accessible on the keyboard you may type any of the following:
S
-->
S
--> lambda
S
--> nil
S
--> empty
S
--> null
Reserved
tokens 'lambda"', 'nil', 'empty' and 'null' may not be redefined as terminal
symbols.
In
derivations of a formal grammar, a string containing several occurrences of a
single variable is normally not handled as a pattern: each occurrence may be
derived independently. Therefore a special format is needed for patterns. If
for instance "YabZXcYXdX" is a pattern, it is necessary to indicate explicitly
that all occurrences of "X" (resp. "Y") must be derived identically. The
notation used in BP2 grammars is
(=Y)
abZ (=X) c (:Y) (:X) d (:X)
in
which brackets flagged with '=' are
reference
expressions
and brackets with ':'
slave
expressions.
Multilevel bracketin
g
is possible. For instance, the following BP2 grammar
S
--> (= (= X) S (:X)) (: (= X) S (:X))
S
--> nil
X
--> a
X
--> a S b
X
--> b
generates
self-similar fracta
ls.
Bracketing
is also used to indicate homomorphic transformatio
ns.
For instance, given the TRANS and OCT homomorphisms (see §4.1 supra),
pattern
TRANS
TRANS (=X) (:X) OCT TRANS TRANS TRANS TRANS (:X)
would
represent three occurrences of the same structure of simple notes "X", the
first one being two semitones higher, and the last one four semitones and one
octave higher than the second one.
Brackets
and other structural markers are removed once a terminal string (a
sentence
or
item)
has been generated.
BP
grammars are generally arranged in several layers of
subgrammars
separated with lines of hyphens. These may be called
transformational
grammars
(in a non-linguistic sense, see Kain 1981:24-5). Each
subgrammar
may have its own
derivation
mode
:
--
"
RN
D"
mode: rules are applied in a random order. (Default mode)
--
"LI
N"
mode: candidate rule chosen randomly among rules yielding a leftmost derivation.
--
"OR
D"
mode: candidate rules are applied in the order in which they appear in the
grammar.
--
"SU
B"
mode: same as "RND", but all occurrences of the selected rule's left argument
in the work string are rewritten simultaneously. In this mode, for instance,
p-substitutions
and other kinds of automatic (infinite) sequences may be generated. (See
§4.14 infra.) These derivations stop either when a predefined buffer
length has been reached, or the weights of all candidate rules are null as the
result of having been dynamically decremented during computation, or by
clicking the mouse.
--
"SUB
1"
mode: same as "SUB", but only one rewriting of variables will be allowed. This
mode is much faster than "SUB" and may be used when it is certain that the
first substitution yields only terminal symbols. (See example in "-gr.Mozar
t",
§5.4 of QuickStart)
Below
is an illustration of a "LIN" subgrammar ("-gr.tryLI
N").
We need to produce strings of 'a', 'b', 'c' with lengths 18, in which no two
consecutive occurrences of b are found, and c's always come in pairs. The
following grammar does it:
ORD
S
--> X X X X X X X X X X X X X X X X X X
----------------------------------------------------------
LIN
X
--> a
#b
X --> #b b
[Negative
context. Never two consecutive 'b']
X
? --> c Y ?
[Can't
be applied if "X" is the rightmost symbol]
Y
X --> c
[This
rule is length-decreasing]
Typical
productions of this grammar are:
b
a a b c c a c c a b a a b a b c c
b
c c b c c c c c c a b a b c c b a
b
c c c c c c c c c c c c c c b a a
a
c c b a c c a a c c c c b c c c c
c
c a a c c a b a c c a a b c c b a
a
b a b a b a c c b a b a a b c c a
b
c c a b a a a c c a b a c c c c b
a
c c c c c c b c c c c a a a c c b
a
a a c c a b c c a c c a b c c c c
The
detailed derivation of the first example is self-explanatory:
S
X
X X X X X X X X X X X X X X X X X
b
X X X X X X X X X X X X X X X X X
b
a X X X X X X X X X X X X X X X X
b
a a X X X X X X X X X X X X X X X
b
a a b X X X X X X X X X X X X X X
b
a a b c Y X X X X X X X X X X X X X
b
a a b c c X X X X X X X X X X X X
b
a a b c c a X X X X X X X X X X X
b
a a b c c a c Y X X X X X X X X X X
b
a a b c c a c c X X X X X X X X X
b
a a b c c a c c a X X X X X X X X
b
a a b c c a c c a b X X X X X X X
b
a a b c c a c c a b a X X X X X X
b
a a b c c a c c a b a a X X X X X
b
a a b c c a c c a b a a b X X X X
b
a a b c c a c c a b a a b a X X X
b
a a b c c a c c a b a a b a b X X
b
a a b c c a c c a b a a b a b c Y X
b
a a b c c a c c a b a a b a b c c
A
"RND" subgrammar instead of a "LIN" subgrammar would yield, for instance
S
X
X X X X X X X X X X X X X X X X X
X
X X b X X X X X X X X X X X X X X
X
X X b X X X X X a X X X X X X X X
X
X X b X X X X X a X X c Y X X X X X
X
X a b X X X X X a X X c Y X X X X X
X
X a b X X X a X a X X c Y X X X X X
X
X a b X X b a X a X X c Y X X X X X
X
X a b X X b a X a X X c c X X X X
X
X a b X b b a X a X X c c X X X X
X
X a b X b b a X a X X c c X X b X
X
X a b X b b a X a b X c c X X b X
X
X a b X b b a X a b X c c X c Y b X
X
X a b X b b a X a b X c c c Y c Y b X
X
X a b X b b a b a b X c c c Y c Y b X
b
X a b X b b a b a b X c c c Y c Y b X
b
X a b X b b a b a b a c c c Y c Y b X
b
a a b X b b a b a b a c c c Y c Y b X
b
a a b a b b a b a b a c c c Y c Y b X
b
a a b a b b a b a b a c c c Y c Y b a
b
a a b a b b a b a b a c c c Y c Y b a
which
(like many context-sensitive grammars) does not produce a terminal derivation.
Each
rule has its own derivation mode. As suggested above, in "
LI
N"
grammars the leftmost occurrence of the first argument of the rule is searched
for. However, in "RN
D"
or "OR
D"
grammars the derivation mode needs to be specified. The default derivation
mode, notated "RND", means that the derivation position will be taken randomly
among candidate positions in the work string. If you want to rewrite the
leftmost (resp. rightmost) occurrence of the left argument of the rule, insert
"LEF
T"
(resp. "RIGH
T")
before the first argument of the rule (after its weight). Note that "LEFT" or
"RIGHT" rules take significantly less computation time.
You
must also keep in mind that a "RND" grammar with "LEFT" rules does necessarily
yield a leftmost derivation. For instance, the following variant of the
"-gr.tryLI
N"
grammar
ORD
S
--> X X X X X X X X X X X X X X X X X X
----------------------------------------------------------
RND
LEFT
X --> a
LEFT
#b X --> #b b
LEFT
X ? --> c Y ?
LEFT
Y X --> c
produces
the (incorrect) derivation:
S
X
X X X X X X X X X X X X X X X X X
b
X X X X X X X X X X X X X X X X X
b
a X X X X X X X X X X X X X X X X
b
a a X X X X X X X X X X X X X X X
b
a a b X X X X X X X X X X X X X X
b
a a b c Y X X X X X X X X X X X X X
b
a a b c Y a X X X X X X X X X X X X
[Here
it failed because "X --> a" was also candidate]
b
a a b c Y a c Y X X X X X X X X X X X
b
a a b c Y a c Y a X X X X X X X X X X
b
a a b c Y a c Y a b X X X X X X X X X
b
a a b c Y a c Y a b a X X X X X X X X
b
a a b c Y a c Y a b a a X X X X X X X
b
a a b c Y a c Y a b a a b X X X X X X
b
a a b c Y a c Y a b a a b a X X X X X
b
a a b c Y a c Y a b a a b a b X X X X
b
a a b c Y a c Y a b a a b a b c Y X X X
b
a a b c Y a c Y a b a a b a b c Y c Y X X
b
a a b c Y a c Y a b a a b a b c Y c c X
b
a a b c Y a c Y a b a a b a b c Y c c b
b
a a b c Y a c Y a b a a b a b c Y c c b
A
partial control of derivations is possible either after an interruption or
before starting a production, by choosing the "Step-by-step" and "Select
candidates" options. (See QuickStart §12)