Words and standard forms

By a word, we mean a string written using the symbols

\[X \cup \Omega = \{x_1, \dotsc, x_r\} \cup \{\alpha_1, \dotsc, \alpha_n, \lambda\}.\]

We treat the \(x_i\) are constants, the \(\alpha_i\) are unary operators and \(\lambda\) as an n-ary operation. We refer to the parameters \(n\) and \(r\) as the arity and alphabet size respectively.


If we collect together all such words, we obtain an algebra (in the sense of universal algebra) of words. We consider three different algebras; using the notation of [Cohn81]:

  1. \(W(\Omega; X)\), the set of all finite strings written using letters in \(X \cup \Omega\). In other words/jargon, this is the free monoid on \(X \cup \Omega\). Cohn calls these strings \(\Omega\)-rows.
  1. \(W_\Omega(X)\), the subset of \(W(\Omega; X)\) whose strings represent a valid series of operations. Valid means that each the operations \(\alpha_i\) and \(\lambda\) always recieve the correct number of arguments. Cohn calls these strings \(\Omega\)-words.
  1. \(V_{n, r}(X) = W_\Omega(X) / \mathfrak{q}\). This is equivalent to the set of words in \(W_\Omega(X)\) which are in Higman’s standard form. Roughly speaking, a word is in standard form if is valid (type 2) and it cannot be reduced to a shorter word using the following two rules.

    1. \(w_1 \dots w_n \lambda \alpha_i = w_i\), where each \(w_i\) is in standard form.
    2. \(w \alpha_1 \dots w \alpha_n \lambda = w\), where \(w\) is in standard form.

Sometimes we need to refer to a string which consists only of \(\alpha\)-s. Write \(A\) for the set \(\{\alpha_1, \dotsc, \alpha_n\}\). We define \(A^*\) to be the set of finite strings over \(A\). (Again, this is the free monoid on \(A\).)

Finally, let \(S\) be a set of words. We define \(X\langle A\rangle\) to be the set of concatenations \(s \Gamma\), where \(s \in S\) and \(\Gamma \in A^*\). It is sometimes helpful to think of \(S\langle A\rangle\) as the set of words below \(S\).

See also

Sections 2 and 2.3 and Remark 3.3 of the paper.


It may happen that we need to work in different algebras \(V_{n,r}, V_{m,s}\) at the same time. To keep track of the algebras that different words belong to, we associate a Signature to each word \(w \in V_{n,r}\). [1] Signatures are implemented as a glorified tuple \((n, r)\). This means they are immutable.

class thompson.word.Signature[source]
static __new__(cls, arity, alphabet_size)[source]

Signatures store an arity \(n \geq 2\) and an alphabet_size \(r \geq 1\).


We can use the in operator to see if a Word lies in the algebra that this signature describes.

>>> s = Signature(2, 1)
>>> Word('x1 a1 a2', s) in s
>>> Word('x1 a1 a2', (2, 2)) in s

We can test the (signatures of) two algebras \(V_{n,r}\) and \(V_{m,s}\) to see if they are isomorphic. This happens precisely when \(n = m\) and \(r \equiv s \pmod{n-1}\).

>>> Signature(2, 1).is_isomorphic_to(Signature(2, 4))
>>> Signature(2, 1).is_isomorphic_to(Signature(3, 1))
>>> Signature(3, 2).is_isomorphic_to(Signature(3, 1))
>>> Signature(3, 3).is_isomorphic_to(Signature(3, 1))

See also

Corollary 3.14 of the paper.

Operations on words

We represent a word as a tuple of integers, where:

  • \(x_i\) is represented by \(i\),
  • \(\alpha_i\) is represented by \(-i\), and
  • \(\lambda\) is represented by \(0\).

We can write words of all types in this format, but we’re only interested in the standard forms (type 3). To this end, the validate() detects those which are of type 2, and standardise() reduces type 2 words to type 3.


Turns a Word, or a sequence of integers representing a word into a nicely formatted string. Can be thought of as an inverse to from_string(). The word is not processed or reduced in any way by this function.

>>> format(Word("x1 a1 a2 a1 a1", (2, 1)))
'x1 a1 a2 a1 a1'
>>> format([2, -1, 2, -2, 0])
'x2 a1 x2 a2 L'
>>> format([])
'<the empty word>'
thompson.word.format_cantor(word, subset=False)[source]

An alternative, more concise notation for simple words. We use the Cantor set notation, where we write \(n-1\) in place of lpha_n. All other symbols \(\lambda\) and \(x_i\) are omitted, so this should really only be used on simple words. The subset argument changes how we display a single \(x_i\). It’s here for experimental use by CantorSubset.__str__.

>>> format_cantor(Word("x1 a1 a2 a1 a1", (2, 1)))
>>> format_cantor([2, -1, 2, -2, 0])
>>> format_cantor([])
'<the empty word>'
>>> format_cantor([1])
>>> format_cantor([1], subset=True)
'<entire Cantor set>'
>>> format_cantor([-1])

Converts a string representing a word to the internal format—a tuple of integers. No kind of validation or processing is performed. This is used internally by other functions and doesn’t typically need to be called directly. Two forms of string are accepted: ‘Higman’ and ‘Cantor’. We assume the former is in use if the string contains an x, a or L; else we assume the latter. The first is the Higman notation (involving \(x`s, :math:\)alpha`s and \(\lambda`s). Anything which does not denote a basis letter (e.g. `\)‘x’, ``'x2', 'x45') a descendant operator (e.g. 'a1', 'a3', 'a27') or a contraction ('L' for \(\lambda\)) is ignored. Note that 'x' is interpreted as shorthand for 'x1'.

>>> x = from_string('x2 a1 x2 a2 L')
>>> print(x, format(x), sep='\n')
(2, -1, 2, -2, 0)
x2 a1 x2 a2 L
>>> x = from_string('x a1 a2 a3')
>>> print(x, format(x), sep='\n')
(1, -1, -2, -3)
x1 a1 a2 a3
>>> w = random_simple_word()
>>> from_string(str(w)) == w

The second form is the Cantor-set notation. This only applies when there is a single basis letter \(x\), which is not written down as part of the format. We use this mode if the first non-whitespace character of the input string is not x. A descendant operator \(\alpha_n\) is written as the integer \(n-1\). All characters outside the range 0–9 are ignored.

>>> x = from_string("001101")
>>> print(x, format(x), format_cantor(x), sep='\n')
(1, -1, -1, -2, -2, -1, -2)
x1 a1 a1 a2 a2 a1 a2
Return type:tuple of integers


More convenient ways of inputting words, e.g. x a1^3 instead of x a1 a1 a1. Use of unicode characters α and λ?

thompson.word.validate(letters, signature)[source]

Checks that the given letters describe a valid word belonging to the algebra with the given signature = (arity, alphabet_size). If no errors are raised when running this function, then we know that:

  • The first letter is an \(x_i\).
  • Every \(\lambda\) has arity arguments to its left.
  • If the word contains a \(\lambda\), every subword is an argument of some \(\lambda\).
  • No \(\alpha_i\) appears with \(i >\) arity.
  • No \(x_i\) occurs with \(i >\) alphabet_size.

Calling this function does not modify letters. The argument letters need not be in standard form.

>>> w = from_string('a1 x a2 L'); w
(-1, 1, -2, 0)
>>> validate(w, (2, 1))
Traceback (most recent call last):
ValueError: The first letter (a1) should be an x.
>>> validate(from_string('x1 a2 x12 a1 a2 L'), (2, 4))
Traceback (most recent call last):
IndexError: Letter x12 at index 2 is not in the alphabet (maximum is x4).
>>> validate(from_string('x1 a1 a2 a3 a4 a5 x2 a2 L'), (3, 2))
Traceback (most recent call last):
IndexError: Letter a4 at index 4 is not in the alphabet (maximum is a3).
>>> validate(from_string('x1 a2 L'), (4, 1))
Traceback (most recent call last):
ValueError: Letter L at index 2 is invalid. Check that lambda has 4 arguments.
>>> validate(from_string('x1 x1 L x1 L x1'), (2, 1))
Traceback (most recent call last):
ValueError: Word is invalid: valency is 2 (should be 1).
  • ValueError – if the first letter is not an \(x_i\).
  • IndexError – if this word contains an \(x_i\) with \(i\) outside of the range 1 … alphabet_size
  • IndexError – if this word contains an \(\alpha_i\) outside of the range 1 … arity
  • IndexError – if this word is empty (i.e. consists of 0 letters).
  • ValueError – if this word fails the valency test.

See also

Proposition 2.12 of the paper for the valency test.

thompson.word.standardise(letters, signature, tail=())[source]

Accepts a valid word as a tuple of letters and reduces it to standard form. The result—a new tuple of letters—is returned. The signature must be given so we know how many arguments to pass to each \(\lambda\). The tail argument is used internally in recursive calls to this function and should be omitted.

In the examples below, the Word class standardises its input before creating a Word.

>>> #Extraction only
>>> print(Word("x a1 a2 a1 x a2 a1 L a1 a2", (2, 1)))
x1 a1 a2 a1 a2
>>> #Contraction only
>>> print(Word("x2 a2 a2 a1 x2 a2 a2 a2 L", (2, 2)))
x2 a2 a2
>>> #Contraction only, arity 3
>>> print(Word("x1 a1 a1 x1 a1 a2 x1 a1 a3 L", (3, 2)))
x1 a1
>>> #Both extraction and contraction
>>> print(Word("x a1 a2 a1 a1 x a1 a2 a1 x a2 a1 L a1 a2 a1 x a1 a2 a1 a2 a2 L L", (2, 1)))
x1 a1 a2 a1
>>> #Lambda concatenation
>>> print(Word("x a1 a2 a1 x a1 a2 a1 x a2 a1 L a1 a2 a1 x a1 a2 a1 a2 a2 L L", (2, 1)))
x1 a1 a2 a1 x1 a1 a2 a1 a2 L
>>> #A mix of contraction and extraction
>>> print(Word("x a2 x a1 L x a1 L a1 a2 a1 a2", (2, 1)))
x1 a1 a1 a2
>>> #Can't contract different x-es
>>> print(Word('x1 a1 x2 a2 L', (2, 2)))
x1 a1 x2 a2 L
>>> #Something meaty, arity 3
>>> print(Word('x1 a1 x1 a2 x1 a3 L a2 a1 a3 x1 a2 a1 x2 a1 x1 a1 a1 x1 a1 a2 x2 L x1 a2 L a2 a1 a1 L a3 a3 a2', (3, 2)))
x1 a1 a1 a1 a3 a2
Return type:

tuple of integers.

See also

Remark 3.3 of the paper.

thompson.word.lambda_arguments(word, signature=None)[source]

This function takes a valid word which ends in a \(\lambda\), and returns the arguments of the rightmost \(\lambda\) as a list of Words.

If word is given as a Word instance, then signature may be omitted. Otherwise the signature should be provided.

>>> w = Word('x a1 a2 x a2 a2 L x a1 a1 L', (2, 1))
>>> subwords = lambda_arguments(w)
>>> for subword in subwords: print(subword)
x1 a1 a2 x1 a2 a2 L
x1 a1 a1
>>> w = 'x x x x a1 x L x a1 x a2 x L L x a1 a2 x L x a1 a2 x a2 a1 x L L'
>>> subwords = lambda_arguments(Word(w, (3, 1)))
>>> for subword in subwords: print(subword)
x1 x1 x1 a1 x1 L x1 a1 x1 a2 x1 L L x1 a1 a2 x1 L
x1 a1 a2 x1 a2 a1 x1 L
  • IndexError – if word is an empty list of letters.
  • ValueError – if the last letter in word is not a lambda.
  • TypeError – if no arity is provided and word has no arity attribute.
Return type:

list of Words.


Let words be a list of words, either as sequences of integers or as full Word objects. This function tests to see if words is a list of the form \((w\alpha_1, \dotsc, w\alpha_n)\), where n == len(words).

Returns:\(w\) (as a tuple of integers) if the test passes; the empty tuple if the test fails.


If there is a prefix \(w\), this function does not check to see if

  • \(w\) is a valid word, or
  • \(n\) is the same as the arity of the context we’re working in.
>>> prefix = are_contractible(
...     [Word("x a1 a2 a3 a1", (3, 1)), Word("x a1 a2 a3 a2", (3, 1)), Word("x a1 a2 a3 a3", (3, 1))])
>>> format(prefix)
'x1 a1 a2 a3'

An infinite iterator which enumerates the elements of \(A^*\) in shortlex order.

>>> for i, gamma in enumerate(free_monoid_on_alphas(4)):
...     if i >= 10: break
...     print(format(gamma))
<the empty word>
a1 a1
a1 a2
a1 a3
a1 a4
a2 a1

Given a sequence \(\Gamma\in A^*\), this function computes the root \(\sqrt\Gamma \in A^*\) and the root power \(r \in \mathbb N\). These are objects such that \((\sqrt\Gamma)^r = \Gamma\), and \(r\) is maximal with this property.

Returns:the pair \((\sqrt\Gamma, r)\)
>>> root('abababab')
('ab', 4)
>>> root([1, 2, 3, 1, 2, 3])
([1, 2, 3], 2)

See also

The discussion following Corollary 6.7 in the paper.

The Word class

It’s important to know when a sequence of letters denotes a word in standard form. The Word class addresses this problem by standardising its input upon creation. We can think of a Word object as a fixed list of letters with the guarantee that it is in standard form. Words are also given a Signature at creation time, so that we know which algebra the word comes from.

We also need to know that once a Word has been created, it cannot be modified to something not in standard form. We achieve this simply by making it impossible to modify a Word. Words are implemented as (subclasses of) tuple, which are immutable. One useful side effect of this is that Words can be used as dictionary keys. [2]

>>> w = Word('x1 a1', (2, 1))
>>> x = {}; x[w] = 'stored value'
>>> x[w]
'stored value'
>>> #Can also use the underlying tuple as a key
>>> x[1, -1]
'stored value'
>>> #the reason this works
>>> hash(w) == hash((1, -1))
class thompson.word.Word[source]
  • signature – the signature of the algebra this word belongs to.
  • lambda_length – the number of times the contraction symbol \(\lambda\) occurs in this word after it has been written in standard form.
static __new__(cls, letters, signature, preprocess=True)[source]

Creates a new Word consisting of the given letters belonging the algebra with the given signature. The letters may be given as a list of integers or as a string (which is passed to the from_string() function). The signature can be given as a tuple or a fully-fledged Signature.

>>> Word("x2 a1 a2 a3 x1 a1 a2 a1 x2 L a2", (3, 2))
Word('x1 a1 a2 a1', (3, 2))

By default, the argument preprocess is True. This means that words are validated and reduced to standard form before they are stored as a tuple. These steps can be disabled by using preprocess=False. This option is used internally when we know we have letters which are already vaild and in standard form.

Raises:ValueError – See the errors raised by validate().

A shorthand to format_cantor() for printing simple words. By default, it discards the the root \(x_i\).

>>> w = Word("x2 a1 a3 a1 a1 a2", (3, 4))
>>> w.address()
>>> Word("x2 a1 a3 a1 a1 a2", (3, 4)).address(include_root=True)

Let us assign a total order to the alphabet \(X \cup \Omega\) by declaring:

\[\alpha_1 < \dots < \alpha_n < x_1 < \dots < x_r < \lambda \]

We can use this to order simple words by using dictionary order.

>>> Word('x1', (2, 2)) < Word('x1', (2, 2))
>>> Word('x1', (2, 2)) < Word('x2', (2, 2))
>>> Word('x1 a2', (2, 2)) < Word('x1 a1', (2, 2))
>>> Word('x1 a1 a2 a3 a4', (4, 1)) < Word('x1 a1 a2 a3 a3', (4, 1))

We extend this to non-simple words in standard form in the following way. Let \(\lambda(u)\) denote the lambda-length of \(u\).

  1. If \(\lambda(u) < \lambda(v)\), then \(u < v\).
>>> Word('x2 x1 L', (2, 2)) < Word('x1 x2 x1 L L', (2, 2))
  1. If \(u=v\), then neither is strictly less than the other.
>>> Word('x1 x2 L', (2, 2)) < Word('x1 x2 L', (2, 2))
  1. Otherwise \(u \neq v\) are different words with the same \(\lambda\)-length. Break both words into the \(n\) arguments of the outmost lambda, so that \(u = u_1 \dots u_n \lambda\) and \(v = v_1 \dots v_n \lambda\), where each subword is in standard form.
  2. Let \(i\) be the first index for which \(u_i \neq v_i\). Test if \(u_i < v_i\) by applying this definition recursively. If this is the case, then \(u < v\); otherwise, \(u > v\).
>>> #True, because x2 < x2 a2
>>> Word('x1 x2 L', (2, 2)) < Word('x1 x2 a2 L', (2, 2))
>>> #True, because words of lambda-length 1 are less than those of lambda-length 2.
>>> Word('x1 x2 L x3 x4 L L', (2, 4)) < Word('x1 x2 L x3 L x4 L', (2, 4))

The other three comparison operators (<=, >, >=) are also implemented.

>>> Word('x1 x2 L', (2, 2)) <= Word('x1 x2 L', (2, 2))
>>> Word('x1 a1', (2, 2)) <= Word('x1 a1', (2, 2)) <= Word('x1 a1 a2', (2, 2))
>>> Word('x1', (2, 2)) > Word('x1', (2, 2))
>>> Word('x1', (2, 2)) >= Word('x1', (2, 2))
>>> Word('x1 a2', (2, 2)) > Word('x1', (2, 2))
>>> Word('x1 a2', (2, 2)) >= Word('x1', (2, 2))


I think this is a total order—try to prove this. I based the idea on [Zaks] (section 2, definition 1) which describes a total order on k-ary trees. On another note, isn’t this similar to shortlex order?


Let us call a Word simple if it has a lambda length of zero once it has been reduced to standard form.

>>> Word("x1 a1 a2", (2, 1)).is_simple()
>>> Word("x1 a1 a2 x2 a2 L", (2, 2)).is_simple()
>>> #Simplify a contraction
>>> Word("x1 a1 a1 x1 a1 a2 L", (2, 1)).is_simple()
>>> #Lambda-alpha extraction
>>> Word("x1 x2 L a2", (2, 2)).is_simple()

Tests to see if the current word is an initial segment of word. Returns True if the test passes and False if the test fails.

>>> w = Word('x1 a1', (2, 2))
>>> w.is_above(Word('x1 a1 a1 a2', (2, 2)))
>>> w.is_above(Word('x1', (2, 2)))
>>> w.is_above(w)
>>> v = Word('x1 a1 x1 a2 L', (2, 2))
>>> w.is_above(v)
>>> v.is_above(w)

Tests to see if the current word \(c\) is an initial segment of the given word \(w\). In symbols, we’re testing if \(w = c \Gamma\), where \(\Gamma \in \langle A \rangle\) is some string of alphas only.

The test returns \(\Gamma\) (as a tuple of integers) if the test passes; note that \(\Gamma\) could be the empty word \(1\) (if \(c = w\)). If the test fails, returns None.

>>> c = Word('x1 a1', (2, 2))
>>> c.test_above(Word('x1 a1 a1 a2', (2, 2)))
(-1, -2)
>>> c.test_above(Word('x1', (2, 2))) is None
>>> c.test_above(c)
>>> w = Word('x1 a2 x1 a1 L', (2, 2))
>>> print(c.test_above(w))
>>> w.test_above(c)
>>> v = Word('x1 a2 x1 a1 L x1 a2 a2 L x1 a2 x1 a1 L L', (2, 2))
>>> print(c.test_above(v))
>>> #There are two possible \Gamma values here; only one is returned.
>>> v.test_above(c)
(-1, -1, -2)


There may be many different values of \(\Gamma\) for which \(w = c \Gamma\); this method simply returns the first such \(\Gamma\) that it finds.


Let \(w\) be the current word and let \(X\) be a basis. Choose a (possibly empty) string of alphas \(\Gamma\) of length \(s \geq 0\) at random. What is the smallest value of \(s\) for which we can guarantee that \(w\Gamma\) is below \(X\), i.e. in \(X\langle A\rangle\)?

>>> from thompson.generators import Generators
>>> basis = Generators.standard_basis((2, 1)).expand(0).expand(0).expand(0)
>>> basis
Generators((2, 1), ['x1 a1 a1 a1', 'x1 a1 a1 a2', 'x1 a1 a2', 'x1 a2'])
>>> Word('x', (2, 1)).max_depth_to(basis)
>>> Word('x a1', (2, 1)).max_depth_to(basis)
>>> Word('x a2 x a1 L x1 a1 L', (2, 1)).max_depth_to(basis)
>>> Word('x a2', (2, 1)).max_depth_to(basis)


If basis doesn’t actually generate the algebra we’re working in, this method could potentially loop forever.


Returns a pair (start, end) of Fractions which describe the interval \(I \subseteq [0,1]\) that this word corresponds to.

>>> def print_interval(w, s):
...     start, end = Word(w, s).as_interval()
...     print('[{}, {}]'.format(start, end))
>>> print_interval('x a1 a2 a1 x L', (2, 1))
Traceback (most recent call last):
ValueError: The non-simple word x1 a1 a2 a1 x1 L does not correspond to a interval.
>>> print_interval('x1', (2, 1))
[0, 1]
>>> print_interval('x1', (3, 1))
[0, 1]
>>> print_interval('x1', (4, 2))
[0, 1/2]
>>> print_interval('x1 a1', (2, 1))
[0, 1/2]
>>> print_interval('x1 a1', (3, 1))
[0, 1/3]
>>> print_interval('x1 a1', (4, 2))
[0, 1/8]
>>> print_interval('x1 a2 a1 a2', (2, 1))
[5/8, 3/4]
>>> print_interval('x1 a2 a1 a2', (3, 1))
[10/27, 11/27]
>>> print_interval('x1 a2 a1 a2', (4, 2))
[17/128, 9/64]
>>> from thompson.examples import random_simple_word
>>> s, e = random_simple_word().as_interval(); 0 <= s < e <= 1
Raises:ValueError – if the word is not simple.
static ray_as_rational(base, spine)[source]

Converts the infinite string \(\text{base} \, \text{spine}^\infty\) to a rational in the unit interval.

>>> Word.ray_as_rational(Word("x", (2,1)), from_string("a1 a2"))
Fraction(1, 3)
>>> Word.ray_as_rational(Word("x a1", (2,1)), from_string("a2 a1"))
Fraction(1, 3)

Let \(w\) stand for the current word. This method creates and returns a new word \(w\alpha_\text{index}\).

>>> Word("x a1 a2", (3, 2)).alpha(3)
Word('x1 a1 a2 a3', (3, 2))
Raises:IndexError – if index is not in the range 1… arity.

Concatenates the current word with the series of letters tail to form a new word.The argument tail can be given as either a string or a tuple of integers.

>>> Word('x1 a2 a1', (2, 1)).extend('a2 a2')
Word('x1 a2 a1 a2 a2', (2, 1))
>>> Word('x1 a2 a1', (2, 1)).extend('x1 a2 a2')
Traceback (most recent call last):
ValueError: Word is invalid: valency is 2 (should be 1).
>>> Word('x1 a2 a1', (2, 1)).extend('x1 a2 a2 L')
Word('x1 a2', (2, 1))

Returns an iterator that yields the arity descendants of this word.

>>> w = Word("x a1", (5, 1))
>>> for child in w.expand():
...     print(child)
x1 a1 a1
x1 a1 a2
x1 a1 a3
x1 a1 a4
x1 a1 a5
>>> w = Word("x a1 a1 x a2 a1 L", (2, 1))
>>> for child in w.expand():
...     print(child)
x1 a1 a1
x1 a2 a1
Return type:iterator which yields Words.

Splits the current word w into a pair of tuples head, tail where len(head) == head_width. The segments of the word are returned as tuples of integers, i.e. not fully fledged Words.

Raises:IndexError – if head_width is outside of the range 0 <= head_width <= len(w).
>>> Word('x1 a1 a2 a3 a1 a2', (3, 1)).split(4)
((1, -1, -2, -3), (-1, -2))
>>> Word('x1 a1 a2 a3 a1 a2', (3, 1)).split(10)
Traceback (most recent call last):
IndexError: The head width 10 is not in the range 0 to 6.

The same as split(), but this time tail_width counts from the right hand side. This means that len(tail) == tail_width.

Raises:IndexError – if tail_width is outside of the range 0 <= tail_width <= len(w).
>>> Word('x1 a1 a2 a3 a1 a2', (3, 1)).rsplit(4)
((1, -1), (-2, -3, -1, -2))
>>> Word('x1 a1 a2 a3 a1 a2', (3, 1)).rsplit(10)
Traceback (most recent call last):
IndexError: The tail width 10 is not in the range 0 to 6.
shift(delta=1, signature=None)[source]

Takes a simple word \(x_i\Gamma\) and returns the word \(x_{i+\delta}\Gamma\). The target word has signature \((n, r+\delta)\) by default where \((n, r)\) is the current word’s signature. This can be overridden by passing in a signature.

>>> w = Word("x1 a1 a2 x1 a2 x2 L", (3, 2))
>>> w.shift()
Traceback (most recent call last):
ValueError: Can only shift simple words.
>>> v = Word("x1 a2 a3", (3, 1))
>>> v.shift()
Word('x2 a2 a3', (3, 2))

An iterator method which yields the anscestors of a simple word. Use discard_root if you want to ignore words which correspond to \(x_i\).

>>> w = Word("x1 a1 a2 a1 a2", (2, 1))
>>> print( *w.subwords(), sep="\n")
x1 a1
x1 a1 a2
x1 a1 a2 a1
x1 a1 a2 a1 a2
>>> print( *w.subwords(discard_root=True), sep="\n" )
x1 a1
x1 a1 a2
x1 a1 a2 a1
x1 a1 a2 a1 a2


[1]This made it easier to catch bugs before they happened when I was passing words around as arguments to functions.
[2]This made it so much easier to cache the images of homomorphisms, because I could just use a vanilla Python dictionary.

Next steps

Now that we can work with words in \(V_{n,r}\), we need to be able to work with collections of words. The generators module will let us treat a list of words as a generating set and examine the subalgebra that the collection generates.