A high level sketch: Another High Level Sketch Step 1: Write some RegExps Do that. Step 2: Construct R For regular expressions you have defined for keywords, identifies, numbers, etc… We want to construct an uber union regular expression:

\begin{align} R &= \text{Keyword} + \text{Identifier} + \text{Number} + \dots \\ &= R_1 | R_2 | R_3 | \dots \end{align}

Step 3: Tokenization For input x_1, …, x_{n}, for i \in 1 …n inclusive, we check:

\begin{equation} x_1 \dots x_{i} \stackrel{?}{\in} L\left( R\right) \end{equation}

Upon success for some i, we know that x_{1}, …, x_{i} \in L\left(R_{j}\right) for some j. We can then send x_1, …, x_{i} to the parser and then continue lexing for input x_{i+1}…x_{n}. ambiguity multiple matches, different lengths: we pick the longest possible substring multiple matches, same length: pick the first rule nothing matches: we just put a catch all rule in the bottom How exactly does “matching” mean? DFAs, NFAs, subset construction! Conventions used in this class: \Sigma alphabet S states n start F \subseteq S accepting states S \to^{x} S transition and recall regular expressions are equivalent to regular languages. When you attempt to take a transition that doesn’t exist in NFAs, you end into nothing. algorithm for casting RegExp to NFA By convention every NFA will have a single starting state and a single ending state. We will then compose every composition’s machine together. concatenation For parser of A and B, we compose them by composing the accept state of A with an \epsilon move to B, and make A no longer an accepting state. union We make an epsilon move from the starting state to both A and B. And then when you hit the accepting state of either A or B you epsilon move to the accepting state. kleene star implementing DFAs as a table columns: alphabet rows: states the way you implement this is to just look up the (state, symbol) pair, take the transition, and look up the new state, symbol, etc. 0 1 a a b b a b is and so on. To construct this table, you just read off the transition function.

[[curator]]
I'm the Curator. I can help you navigate, organize, and curate this wiki. What would you like to do?