Lab 3 - FST

Lab Session #3 #

Agenda #

  • Finite State Transducers
  • Exercises on Finite State Transducer

Finite State Transducer #

A finite-state transducer (FST) is a finite-state machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finite-state automaton, which has a single tape. An FST is a type of finite-state automaton (FSA) that maps between two sets of symbols. An FST is more general than an FSA.

An FSA defines a formal language by defining a set of accepted strings, while an FST defines relations between sets of strings. An FST will read a set of strings on the input tape and generates a set of relations on the output tape. An FST can be thought of as a translator or relater between strings in a set. In morphological parsing, an example would be inputting a string of letters into the FST, the FST would then output a string of morphemes.

FSTs are used in a variety of applications, including:

Natural language processing: FSTs are used in natural language processing for tasks such as morphological analysis, part-of-speech tagging, and named entity recognition. Speech recognition: FSTs are used in speech recognition for tasks such as acoustic modeling and decoding. Machine translation: FSTs are used in machine translation for tasks such as lexical translation and grammar conversion. Compilers: FSTs are used in compilers for tasks such as lexical analysis and syntax analysis. FSTs are a powerful tool for modeling and manipulating strings. They are used in a variety of applications, including natural language processing, speech recognition, machine translation, and compilers.

Finite State Transducer formula #

A finite state transducer (FST) is a tuple \langle QQ , II , δ\delta , q0q_0 , FF , OO , η\eta \rangle , where:

  • QQ is a finite set of states. Each state represents a possible configuration of the FST.
  • II is a finite input alphabet. Each symbol in the input alphabet represents a possible input that the FST can read.
  • δ\delta is the transition function. It maps a state and an input symbol to a new state.
  • q0q_0 is the initial state. This is the state that the FST starts in when it is first activated.
  • FF is the set of final states. A state is a final state if it represents a valid output.
  • OO is the output alphabet. Each symbol in the output alphabet represents a possible output that the FST can produce.
  • η\eta is the output function. It maps a state and an input symbol to a sequence of output symbols. The rule that all of these variables follow is that they must be finite. This means that there can only be a finite number of states, input symbols, final states, and output symbols. This is because an FST is a finite-state machine, which is a machine that can only be in a finite number of states at any given time.

Remark

  • the condition for acceptance remains the same as in acceptors
  • the translation is performed only on accepted strings

Example #

Build a complete FST accepting the following language over the alphabet A={0,1} A = \{0,1 \} .
L={xA the number of 0s is even} L = \{ x \isin A^* | \textit { the number of } 0's \textit{ is even} \}

The FST outputs the string obtained by removing every odd occurrence of 0 and doubling every occurrence of 1. Examples of inputs recognised by L and their respective outputs:

  • Input: 010010010010 , Output: 110110 110110
  • Input: 0000 , Output: 00
  • Input: 000100011000100011 , Output: 011001111011001111
L={xA the number of 0s is even} L = \{ x \isin A^* | \textit { the number of } 0's \textit{ is even} \}

Exercises #

Build complete FSAs over the languages given below:

  1. A={w,t,a,l,k,e,d} A = \{w,t,a,l,k,e,d\} that accepts only the verb ”walked” or ”talked”. The FST will translate the input verb to present form ex: walked to walk.
  2. A={a,b}A = \{a, b\} that accepts only strings ending with the letter b. The FST will translate the input string where every second symbol a in the input is erased.
  3. A={0,1}A = \{0, 1\} that accepts strings that are binary representation of integers divisible by 2. The FST will translate the input string into result of division by 2.
  4. A={0,1}A = \{0, 1\} that accepts strings that are binary representation of integers divisible by 3. The FST will translate the input string into result of division by 3.

Operations on FSA #

Intersection #


Intersection Formula #

Suppose M1=(Q1,A,δ1,q01,F1) M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen and M2=(Q2,A,δ2,q02,F2) M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen are finite state automata accepting L1 L_1 and L2L_2 , respectively. Let M M be the complete FSA M=(Q,A,δ,q0,F) \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen ,
where:

Q=Q1×Q2 Q = Q_1 \times Q_2 q0=(q01,q02) q_0 = \lparen q^1_0, q^2_0 \rparen the transition function δ \delta is defined by the formula δ((q,p),a)=(δ1(q,a),δ2(p,a)) \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen for every qQ1 q \isin Q_1 , every pQ2 p \isin Q_2 , and every aA a \isin A . And the set of the final states is defined as: F={(q,p)qF1pF2} F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \isin F_2 \rbrace M M accepts the language L1L2 L_1 \cap L_2 .


Intersection Example #

Let M1 M_1 be a complete FSA\text{FSA} defined as

M1={q0,q1},{a},{((q0,a),q1),((q1,a),q0)},q0,{q1} M_1 = \lang \lbrace q_0, q_1 \rbrace, \lbrace a \rbrace, \lbrace \lparen \lparen q_0,a \rparen, q_1 \rparen, \lparen \lparen q_1,a \rparen, q_0 \rparen \rbrace, q_0, \lbrace q_1 \rbrace \rang
  • {q0,q1} \lbrace q_0, q_1 \rbrace \rarr set of states;

  • {a} \lbrace a \rbrace \rarr input alphabet;

  • {((q0,a),q1),(q1,a),q0)} \lbrace \lparen \lparen q_0,a \rparen, q_1 \rparen, \lparen q_1,a \rparen, q_0 \rparen \rbrace \rarr partial transition function;

  • q0 q_0 \rarr initial state;

  • {q1} \lbrace q_1 \rbrace \rarr set of final states;

Let M2 M_2 be a complete FSA\text{FSA} defined as

M2={p0},{a},{((p0,a),p0)},p0,{p0} M_2 = \lang \lbrace p_0 \rbrace, \lbrace a \rbrace, \lbrace \lparen \lparen p_0,a \rparen, p_0 \rparen \rbrace, p_0, \lbrace p_0 \rbrace \rang

Then :

(M1M2)={(q0,p0),(q1,p0)},{a},{(((q0,p0),a),(q1,p0)), \lparen M_1 \cap M_2 \rparen = \lang \lbrace \lparen q_0, p_0 \rparen , \lparen q_1, p_0 \rparen \rbrace, \lbrace a \rbrace, \lbrace \lparen \lparen \lparen q_0, p_0 \rparen, a \rparen , \lparen q_1, p_0 \rparen \rparen , (((q1,p0),a),(q0,p0))},(q0,p0),{(q1,p0)}.\lparen \lparen \lparen q_1, p_0 \rparen ,a \rparen , \lparen q_0,p_0 \rparen \rparen \rbrace , \lparen q_0, p_0 \rparen ,\lbrace \lparen q_1, p_0 \rparen \rbrace \rang .


Intersection Graph #

Intersection Graph


Union #

Suppose M1=(Q1,A,δ1,q01,F1) M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen and M2=(Q2,A,δ2,q02,F2) M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen are finite state automata accepting L1 L_1 and L2L_2 , respectively. Let M M be the complete FSA M=(Q,A,δ,q0,F) \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen ,
where:

Q=Q1×Q2 Q = Q_1 \times Q_2 q0=(q01,q02) q_0 = \lparen q^1_0, q^2_0 \rparen the transition function δ \delta is defined by the formula δ((q,p),a)=(δ1(q,a),δ2(p,a)) \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen for every qQ1 q \isin Q_1 , every pQ2 p \isin Q_2 , and every aA a \isin A . And the set of the final states is defined as: F={(q,p)qF1pF2} F= \lbrace \lparen q,p \rparen | q \isin F_1 \lor p \isin F_2 \rbrace M M accepts the language L1L2 L_1 \cup L_2 .


Difference #

Suppose M1=(Q1,A,δ1,q01,F1) M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen and M2=(Q2,A,δ2,q02,F2) M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen are finite state automata accepting L1 L_1 and L2L_2 , respectively. Let M M be the complete FSA M=(Q,A,δ,q0,F) \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen ,
where:

Q=Q1×Q2 Q = Q_1 \times Q_2 q0=(q01,q02) q_0 = \lparen q^1_0, q^2_0 \rparen the transition function δ \delta is defined by the formula δ((q,p),a)=(δ1(q,a),δ2(p,a)) \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen for every qQ1 q \isin Q_1 , every pQ2 p \isin Q_2 , and every aA a \isin A . And the set of the final states is defined as: F={(q,p)qF1pF2} F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \notin F_2 \rbrace M M accepts the language L1L2 L_1 \setminus L_2 .


Complement #

Suppose M=(Q,A,δ,q0,F) M = \lparen Q, A, \delta, q_0, F \rparen is a complete finite state automaton accepting L L . A complement Mc M^c is a complete FSA \text {FSA} Mc=(Q,A,δ,q0,Fc) M^c = \lparen Q,A,\delta,q_0,F^c \rparen , where:

Fc=QF F^c = Q \setminus F

Mc M^c accepts the language LL .


Summary of Operations on FSA #

Suppose M1=(Q1,A,δ1,q01,F1) M_1 = \lparen Q_1, A, \delta_1, q^1_0, F_1 \rparen and M2=(Q2,A,δ2,q02,F2) M_2 = \lparen Q_2, A, \delta_2, q^2_0, F_2 \rparen are finite state automata accepting L1 L_1 and L2L_2 , respectively. Let M M be the complete FSA M=(Q,A,δ,q0,F) \text {FSA M} = \lparen Q,A,\delta,q_0,F \rparen ,
where:

Q=Q1×Q2 Q = Q_1 \times Q_2 q0=(q01,q02) q_0 = \lparen q^1_0, q^2_0 \rparen δ((q,p),a)=(δ1(q,a),δ2(p,a)) \delta \lparen \lparen q, p \rparen, a \rparen = \lparen \delta_1 \lparen q, a \rparen , \delta_2 \lparen p, a \rparen \rparen

The set of final states will be defined as

L1L2:F={(q,p)qF1pF2} L_1 \setminus L_2 : F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \notin F_2 \rbrace .
L1L2:F={(q,p)qF1pF2} L_1 \cup L_2 : F= \lbrace \lparen q,p \rparen | q \isin F_1 \lor p \isin F_2 \rbrace .
L1L2:F={(q,p)qF1pF2} L_1 \cap L_2 : F= \lbrace \lparen q,p \rparen | q \isin F_1 \land p \isin F_2 \rbrace .

Exercises #

Part 1 #

Let A={0,1} A = \lbrace 0, 1 \rbrace be the alphabet.

  1. Build a complete  FSA M1 \text { FSA } M_1 that recognises the language: L1={xAx has an even number of 1s} L_1= \lbrace x \isin A^* | x \text { has an even number of } 1's \rbrace

  2. Build a complete  FSA M2 \text { FSA } M_2 that recognises the language: L2={xAx has an odd number of 0s} L_2= \lbrace x \isin A^* | x \text { has an odd number of } 0's \rbrace

  3. Build a complete  FSA  \text { FSA } that accepts when either M1 M_1 or M2 M_2 accepts.

  4. Build a complete  FSA  \text { FSA } that accepts when both M1 M_1 and M2 M_2 accept.

  5. Build a complete  FSA  \text { FSA } that accepts when M1 M_1 accepts and M2 M_2 rejects.

  6. Build a complement for M1 M_1 .


Part 2 #

Construct a complement for the following FSA \text {FSA}

placeholdertext


Part 3 #

Let A={0,1} A = \lbrace 0, 1 \rbrace be the alphabet.

  1. Build a complete  FSA Ma \text { FSA } M_a that recognises the language: La={xAx is the binary representation of an integer, and it is divisible by 2} L_a= \lbrace x \isin A^* | x \text { is the binary representation of an integer, and it is divisible by } 2 \rbrace

  2. Build a complete  FSA Mb \text { FSA } M_b that recognises the language: Lb={xAx is the binary representation of an integer, and it is divisible by 3} L_b= \lbrace x \isin A^* | x \text { is the binary representation of an integer, and it is divisible by } 3 \rbrace

  3. Build a complete  FSA  \text { FSA } that accepts when both Ma M_a and Mb M_b accept.

  4. Build a complete  FSA  \text { FSA } that accepts when either Ma M_a or Mb M_b accepts.

  5. Build a complete  FSA  \text { FSA } that accepts when Ma M_a accepts and Mb M_b rejects.