Introduction to Finite State Transducers
Weighted Finite State Transducers is a generalisations of finite state machines. They can be used for many purposed, including implementing algorithms that are hard to write out otherwise – such as HMMs, as well as for the representation of knowledge – similar to a grammar.
Other places to get information
- A descent set of slides can be found here
- The OpenFst documentation and FST Examples are nonawful, though the shell and C++ sections are intermixed.
- Speech Recognition with Weighted Finite-state Transducers a book chapter.
Above: An FST for pronouncing the digits 1-9 and two pronouncations of zero as: O (o) and zero (z), as used in TIDIGITS
Terminology
Symbols and Strings
Symbols come from some alphabet. They could be letters, words, phonemes, etc.
A string is a series of symbols from an alphabet, it can include the empty string. Matching the examples above, a string could be a word (spelt out), a sentence, a word (spelt out phonetically), etc.
A string can be represented as a Finite State Acceptor, where each symbol links to the state which links to the next.
Finite State Acceptor (FSA)
A Finite State Acceptor has the components of:
- a number of States
- one or more of which is initial
- one or more of which is terminal
- connections between states, with a input symbol (IE label)
- the symbol could be the empty string (often written “-” or “
” or “ε”) - Not necessarily a one to one label to next state mapping (IE nondeterministic)
- the symbol could be the empty string (often written “-” or “
A FSA can be used to check if a string matches its pattern – it is computationally equivalent to a regular expression. It can also be used to generate strings which match that pattern.
FSA’s can be treated as FSTs with same input and output symbols at each edge. Kaldi example scripts sometimes write them this way.
Finite State Transducers (FST)
A Finite State Transducer extends the Finite State Acceptor with the addition of:
- output labels on each edge
- again the output can be the empty string.
- it is common (such as in the TIDIGIT example above), to see only the first transition in a nonbranching substructure to be labels – the other states have nothing to add other than confirming we are in that chain. (which we might Not be)
- The input alphabet and output alphabet do not have to be the same, and indeed are normally not.
- again the output can be the empty string.
A FST can be used to translate strings in its input alphabet to strings in its output alphabet, iff the input string matches the FSTs structure of allowed transitions. Thus if a FSA accepting its input alphabet is composed with it, it can translate the FSA. A series of FSAs can be composed, translating (matched) alphabet to alphabet, to get the desired output.
Weighted Finite State Acceptor/Transducer
As per the ordinal, but with a weight associated with each edge (as well as input, and output for transducers) This weight has a ⊕ and ⊗ operation defined on it, so that weight of alternatives and that cumulative weight along a path can be found.
- e.g. weight along a path is product of probabilities, and represents the probability of that input string.
- e.g. sum of weights on two edges is the probability of either of those alternitives.
Finite State Transducers in Kaldi
Kaldi uses FSTs (and FSAs), as a common knowledge representation for all things.
OpenFST
Filetypes
Textual FST/FSA definition: .fst.txt
, .fsa.txt
, .txt
Textual Representation of the finite state transducer or finite state acceptor respectively. These are the files you write to get things done, to describe your system.
In most of kaldi the .fst.txt
/.fsa.txt
is used. In other places it is just called .txt
. In this document, it is always referred to by the former terms.
Line format:
Normal line fromState toState inSymbol [outSymbol] [weight]
Terminal state line terminalState
fromState
,toState
, andterminalState
are integer state labelsinSymbol
,outSymbol
are textual strings being the name of the symbols from the respective input and output alphabets.outSymbol
should not be present in FSAs, and should always be present in FSTs
weight
is a decimal number, indicating the weight of the edge. It must be present in Weighted FSTs/FSAs
Symbol table file: .isyms
, .osyms
, .syms
, .dict
, .txt
OpenFst like to refer to symbols by a positive integer. Since any finite alphabet is isomorphic to a subset of the positive integers, such a bijection exists, and can be created by enumerating each symbol.
For each FST you should have two of these files, one for the input alphabet and one for the output alphabet. For an FSA you should only have one – for the input alphabet. Under most circumstances these can be generated from the .fst.txt
/.fsa.txt
programatically. One such script for that is provided here in . Others exist throughout the kaldi example scripts, often using AWK oneliners.
In different places different extensions are used.
The example script uses .isyms
for symbol files generated from the input alphabet in the textual FSA/FSA description, and .osyms
for that generated from the output alphabet.
Line Format:
symbol integer
symbol
is a symbol from the alphabet being mapsinteger
is a unique positive integer (that is to say each integer only appears once in this file).
Binary FST/FSA: .fst
, .fsa
This is the binary representation of the finite state transducer/acceptor.
It is produced from the textual representation and symbol tables using
fstcompile
.
Graph of FST/FSA: .dot
It is a Graph Description Language File, produced by fstdraw
.
Piping in through dot
can convert it into another more common format.
E.g.: cat example.dot | dot -Tsvg > example.svg
will convert example.dot
to a SVG file.
This is often done directly from the line that calls fstdraw
.
OpenFST components
OpenFST is made up of several different command line applications. The three most used in kaldi are details briefly below:
Common convention
Input and Output
OpenFST commands which take a single input and produce a single output
(such as fstdraw
and fstcompile
)
have the basic usage of
fstcommand [FLAGS] [inputfile [outputfile]]
Which is to say an inputfile
can optionally be provided,
and if it is, then optionally an outputfile
can be provided also.
If either is missing then input will be taken from standard in (IE piped in, or read from keyboard if no input pipe), and output will be sent to standard output (IE piped out, or printed to the terminal if there is no output pipe.), respectively.
Accessing Help (manpages)
Because OpenFST is not properly installed, it does not have entries in the man pages. To get help with a command use:
fstcommand --help | less
Compile: fstcompile
this converts a textural FST/FSA into a binary one.
- FSA Usage:
fstcompile --acceptor --isymbols=<input.sym> [--keep_isymbols];
- FST Usage:
fstcompile --isymbols=<input.sym> -osymbols=<output.sym> [--keep_isymbols] [--keep_osymbols];
Flags:
--acceptor
: compiles it as an FSA, rather than a FST--isymbols=
,--osymbols=
: specifies the input and output symbol tables--keep_isymbols
,--keep_osymbols
: If set then the symbol stables as keeps in the binary file and do not need to be specified at later steps such asfstdraw
Draw: fstdraw
produces a .dot
file graph, from a binary FST/FSA
- FSA Usage:
fstdraw --acceptor --portait [--isymbols=<input.sym>] [--osymbols=<output.sym>]
- FST Usage:
fstdraw --portait [--isymbols=<input.sym>] [--osymbols=<output.sym>]
- Common Use example:
cat eg.fst | fstdraw --portait --isymbols=eg.isyms --osymbols | dot -Tsvg > eg.svg
Flags:
--portrait
this flag should always be set. If not set then image comes out rotated 90 degrees, and on a overly large canvas.--isymbols
,--osymbols
, as before, but if not provided then symbols in the graphic will be replaced with their numeric representation, unless--keep_isymbols
or--keep_osymbols
was set in the compile step--acceptor
: draws a FSA, rather than a FST. Without it it will label the FSA with output labels.
Compose: fstcompose
Composed a FSA/FST with a FST
- Usage:
fstcompose [--fst_compat_symbols=false] outer.[fst|fsa] inner.fst output.fst
Applying an input to the Output FST is equivelent to first applying it to the Inner then applying the output of that to the Outer. i.e. output(x)=outer(inner(x))
--fst_compat_symbols=false
: setting this to false (it defaults to true), may be required when composing FSTs/FSA where--keep_isymbols
or--keep_osymbols
was used and that the symbol files embedded while actually compatible are not the same files (it seems to store the filenames, which can be seen by runningstrings
on a fst).
Other useful Commands
All the commands in OpenFst have a use. Other commands which I have found particularly useful, but do not have space to detail include;
fstsymbols
manipulate and export the symbols tables in the binary FST/FSAfstproject
convert the FST into a FSA in either the input or output space by discarding the appropriate labels
Examples provided here
Several scripts are provided here to demonstrate how to make use of OpenFST, and to make using it easier. They can be downloaded from the Git backing this site. The section names below are also hyperlinks to download those scripts/files.
NOTE:
The example scripts assume openfst binaries are in your PATH
.
If you added all kaldi binerys during install step you will already have them.
Otherwise you can add just the Openfst binaries by:
Add to your .bashrc
(or similar) PATH="<...>/kaldi-trunk/tools/openfst/bin:${PATH}"
, where <...>
is the math to the kaldi-trunk folder.
then source ~/.bashrc
makeSymbols.py
makeSymbols.py is a script to make creating the symbol tables (which map symbols to arbitary unique integers) easier.
Usage: python makeSymbols.py file fieldNumber
file
: the textual FST/FSA file (.fst.txt
or.fsa.txt usually
), to extract the symbols fromfieldNumber
: which column of the file to take symbols from- input symbols use
fieldNumber
of 2 - output symbols use
fieldNumber
of 3
- input symbols use
The Symbols Table is output to standard out, and can be piped into a file
compileAndDraw.sh
the compileAndDraw.sh is a simple bash script that runs the whole process of compiling then drawing a FST/FSA.
Usage FSA: bash compileAndDraw.sh filename.fsa.txt
Usage FST: bash compileAndDraw.sh filename.fst.txt
Note: unlike openfst programs this is file extension sensitive. It will make the appropriate call for a FSA or a FST based on the extension.
composeExample.sh
The composeExample.sh script runs though the creation then composition of the dict.fst
and sent.fsa
. It then outputs some sentences generated using the language model descried.
Usage: bash composeExample.sh
Example FSTs/FSAs
This folder contains 3 examples: The later two examples of sentence construction are based on ones provided in these lecture notes
simple.fsa.txt
simple.fsa.txt is a very simple Finite State Accepter.
dict.fst.txt
dict.fst.txt is a dictionary containing several words. There is only state in the dictionary – as far it its concerns words can be in any order
sent.fsa.txt
sent.fsa.txt is a grammar for a simple sentence, expressed as a finite state acceptor. Sentences can either be determiner noun verb
or determiner noun verb determiner noun
.