Lab 3 - FST

Lab Session #3 #

Agenda #

  • Finite State Transducers
  • Exercises on Finite State Transducer

Finite State Transducer #

A finite-state transducer (FST) is a finite-state machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finite-state automaton, which has a single tape. An FST is a type of finite-state automaton (FSA) that maps between two sets of symbols. An FST is more general than an FSA.

An FSA defines a formal language by defining a set of accepted strings, while an FST defines relations between sets of strings. An FST will read a set of strings on the input tape and generates a set of relations on the output tape. An FST can be thought of as a translator or relater between strings in a set. In morphological parsing, an example would be inputting a string of letters into the FST, the FST would then output a string of morphemes.

FSTs are used in a variety of applications, including:

Natural language processing: FSTs are used in natural language processing for tasks such as morphological analysis, part-of-speech tagging, and named entity recognition. Speech recognition: FSTs are used in speech recognition for tasks such as acoustic modeling and decoding. Machine translation: FSTs are used in machine translation for tasks such as lexical translation and grammar conversion. Compilers: FSTs are used in compilers for tasks such as lexical analysis and syntax analysis. FSTs are a powerful tool for modeling and manipulating strings. They are used in a variety of applications, including natural language processing, speech recognition, machine translation, and compilers.

Finite State Transducer formula #

A finite state transducer (FST) is a tuple \( \langle \) \(Q\) , \(I\) , \(\delta\) , \(q_0\) , \(F\) , \(O\) , \(\eta\) \(\rangle\) , where:

  • \(Q\) is a finite set of states. Each state represents a possible configuration of the FST.
  • \(I\) is a finite input alphabet. Each symbol in the input alphabet represents a possible input that the FST can read.
  • \(\delta\) is the transition function. It maps a state and an input symbol to a new state.
  • \(q_0\) is the initial state. This is the state that the FST starts in when it is first activated.
  • \(F\) is the set of final states. A state is a final state if it represents a valid output.
  • \(O\) is the output alphabet. Each symbol in the output alphabet represents a possible output that the FST can produce.
  • \(\eta\) is the output function. It maps a state and an input symbol to a sequence of output symbols. The rule that all of these variables follow is that they must be finite. This means that there can only be a finite number of states, input symbols, final states, and output symbols. This is because an FST is a finite-state machine, which is a machine that can only be in a finite number of states at any given time.

Remark

  • the condition for acceptance remains the same as in acceptors
  • the translation is performed only on accepted strings

Example #

Build a complete FST accepting the following language over the alphabet \( A = \{0,1 \}\) .
\[ L = \{ x \isin A^* | \textit { the number of } 0's \textit{ is even} \} \]

The FST outputs the string obtained by removing every odd occurrence of 0 and doubling every occurrence of 1. Examples of inputs recognised by L and their respective outputs:

  • Input: \(010010\) , Output: \( 110110\)
  • Input: \(00\) , Output: \(0\)
  • Input: \(000100011\) , Output: \(011001111\)
\[ L = \{ x \isin A^* | \textit { the number of } 0's \textit{ is even} \} \]

Exercises #

Build complete FSAs over the languages given below:

  1. \( A = \{w,t,a,l,k,e,d\} \) that accepts only the verb ”walked” or ”talked”. The FST will translate the input verb to present form ex: walked to walk.
  2. \(A = \{a, b\} \) that accepts only strings ending with the letter b. The FST will translate the input string where every second symbol a in the input is erased.
  3. \(A = \{0, 1\} \) that accepts strings that are binary representation of integers divisible by 2. The FST will translate the input string into result of division by 2.
  4. \(A = \{0, 1\} \) that accepts strings that are binary representation of integers divisible by 3. The FST will translate the input string into result of division by 3.

Operations on FSA #

Intersection #


Intersection Formula #

Suppose \( M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen \) and \( M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen\) are finite state automata accepting \( L_1 \) and \(L_2 \) , respectively. Let \( M \) be the complete \( \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen \) ,
where:

\[ Q = Q_1 \times Q_2 \] \[ q_0 = \lparen q^1_0, q^2_0 \rparen \] the transition function \( \delta\) is defined by the formula \[ \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen \] for every \( q \isin Q_1 \) , every \( p \isin Q_2 \) , and every \( a \isin A \) . And the set of the final states is defined as: \[ F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \isin F_2 \rbrace \] \( M \) accepts the language \( L_1 \cap L_2 \) .


Intersection Example #

Let \( M_1\) be a complete \(\text{FSA}\) defined as

\[ M_1 = \lang \lbrace q_0, q_1 \rbrace, \lbrace a \rbrace, \lbrace \lparen \lparen q_0,a \rparen, q_1 \rparen, \lparen \lparen q_1,a \rparen, q_0 \rparen \rbrace, q_0, \lbrace q_1 \rbrace \rang\]
  • \( \lbrace q_0, q_1 \rbrace \rarr\) set of states;

  • \( \lbrace a \rbrace \rarr \) input alphabet;

  • \( \lbrace \lparen \lparen q_0,a \rparen, q_1 \rparen, \lparen q_1,a \rparen, q_0 \rparen \rbrace \rarr \) partial transition function;

  • \( q_0 \rarr \) initial state;

  • \( \lbrace q_1 \rbrace \rarr \) set of final states;

Let \( M_2\) be a complete \(\text{FSA}\) defined as

\[ M_2 = \lang \lbrace p_0 \rbrace, \lbrace a \rbrace, \lbrace \lparen \lparen p_0,a \rparen, p_0 \rparen \rbrace, p_0, \lbrace p_0 \rbrace \rang\]

Then :

\[ \lparen M_1 \cap M_2 \rparen = \lang \lbrace \lparen q_0, p_0 \rparen , \lparen q_1, p_0 \rparen \rbrace, \lbrace a \rbrace, \lbrace \lparen \lparen \lparen q_0, p_0 \rparen, a \rparen , \lparen q_1, p_0 \rparen \rparen , \] \[\lparen \lparen \lparen q_1, p_0 \rparen ,a \rparen , \lparen q_0,p_0 \rparen \rparen \rbrace , \lparen q_0, p_0 \rparen ,\lbrace \lparen q_1, p_0 \rparen \rbrace \rang .\]


Intersection Graph #

Intersection Graph


Union #

Suppose \( M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen \) and \( M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen\) are finite state automata accepting \( L_1 \) and \(L_2 \) , respectively. Let \( M \) be the complete \( \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen \) ,
where:

\[ Q = Q_1 \times Q_2 \] \[ q_0 = \lparen q^1_0, q^2_0 \rparen \] the transition function \( \delta\) is defined by the formula \[ \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen \] for every \( q \isin Q_1 \) , every \( p \isin Q_2 \) , and every \( a \isin A \) . And the set of the final states is defined as: \[ F= \lbrace \lparen q,p \rparen | q \isin F_1 \lor p \isin F_2 \rbrace \] \( M \) accepts the language \( L_1 \cup L_2 \) .


Difference #

Suppose \( M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen \) and \( M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen\) are finite state automata accepting \( L_1 \) and \(L_2 \) , respectively. Let \( M \) be the complete \( \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen \) ,
where:

\[ Q = Q_1 \times Q_2 \] \[ q_0 = \lparen q^1_0, q^2_0 \rparen \] the transition function \( \delta\) is defined by the formula \[ \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen \] for every \( q \isin Q_1 \) , every \( p \isin Q_2 \) , and every \( a \isin A \) . And the set of the final states is defined as: \[ F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \notin F_2 \rbrace \] \( M \) accepts the language \( L_1 \setminus L_2 \) .


Complement #

Suppose \( M = \lparen Q, A, \delta, q_0, F \rparen \) is a complete finite state automaton accepting \( L \) . A complement \( M^c\) is a complete \( \text {FSA} \) \( M^c = \lparen Q,A,\delta,q_0,F^c \rparen \) , where:

\[ F^c = Q \setminus F \]

\( M^c \) accepts the language \(L\) .


Summary of Operations on FSA #

Suppose \( M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen \) and \( M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen\) are finite state automata accepting \( L_1 \) and \(L_2 \) , respectively. Let \( M \) be the complete \( \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen \) ,
where:

\[ Q = Q_1 \times Q_2 \] \[ q_0 = \lparen q^1_0, q^2_0 \rparen \] \[ \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen \]

The set of final states will be defined as

\( L_1 \setminus L_2 : F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \notin F_2 \rbrace \) .
\( L_1 \cup L_2 : F= \lbrace \lparen q,p \rparen | q \isin F_1 \lor p \isin F_2 \rbrace \) .
\( L_1 \cap L_2 : F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \isin F_2 \rbrace \) .

Exercises #

Part 1 #

Let \( A = \lbrace 0, 1 \rbrace \) be the alphabet.

  1. Build a complete \( \text { FSA } M_1 \) that recognises the language: \[ L_1= \lbrace x \isin A^* | x \text { has an even number of } 1's \rbrace \]

  2. Build a complete \( \text { FSA } M_2 \) that recognises the language: \[ L_2= \lbrace x \isin A^* | x \text { has an odd number of } 0's \rbrace \]

  3. Build a complete \( \text { FSA } \) that accepts when either \( M_1 \) or \( M_2 \) accepts.

  4. Build a complete \( \text { FSA } \) that accepts when both \( M_1 \) and \( M_2 \) accept.

  5. Build a complete \( \text { FSA } \) that accepts when \( M_1 \) accepts and \( M_2 \) rejects.

  6. Build a complement for \( M_1 \) .


Part 2 #

Construct a complement for the following \( \text {FSA} \)

placeholdertext


Part 3 #

Let \( A = \lbrace 0, 1 \rbrace \) be the alphabet.

  1. Build a complete \( \text { FSA } M_a \) that recognises the language: \[ L_a= \lbrace x \isin A^* | x \text { is the binary representation of an integer, and it is divisible by } 2 \rbrace \]

  2. Build a complete \( \text { FSA } M_b \) that recognises the language: \[ L_b= \lbrace x \isin A^* | x \text { is the binary representation of an integer, and it is divisible by } 3 \rbrace \]

  3. Build a complete \( \text { FSA } \) that accepts when both \( M_a \) and \( M_b \) accept.

  4. Build a complete \( \text { FSA } \) that accepts when either \( M_a \) or \( M_b \) accepts.

  5. Build a complete \( \text { FSA } \) that accepts when \( M_a \) accepts and \( M_b \) rejects.