To Study about Lexical Analyzer Generator(LEX) and Flex(Fast Lexical Analyzer)

LEX : 
         LEX is a tool used to generate a lexical analyzer. This document is a tutorial for the use of LEX for SIL Compiler development. Technically, LEX translates a set of  regular expression specifications (given as input in input_file.l) into a C  implementation of a corresponding finite state machine (lex.yy.c). This C program, when compiled, yields an executable lexical analyzer.
The source SIL program is fed as the input to the the lexical analyzer which produces a sequence of tokens as output. (Tokens are explained below). Conceptually, a lexical analyzer scans a given source SIL program and produces an output of tokens. A token is a single element of the SIL programming language that is recognized by the compiler . For instance integer, boolean, begin, end, if, while etc. are tokens in SIL. 
Example:
Integer͇ {return ID_TYPE_INTEGER;}
 LEX
C compiler \
a.out
lex.yy.c
a.out
Input Output
lex.yy.c
input_file.l
This example demonstrates the specification of rule in LEX. This rule in this example specifies that the lexical analyzer must return the token named ID_TYPE_INTEGER when the pattern ͆integer͇ is found in the input file. A rule in a LEX program comprises of a pattern part (specified by a regular expression) and a corresponding (semantic) action part (a sequence of C statements). In the above example, ͆integer͇ is the pattern and {return ID_TYPE_INTEGER;} is the corresponding action. The statements in the action part will be executed when the pattern is detected in the input. 
Declarations
The declarations section consists of two parts, regular definitions and auxiliary declarations . LEX allows the use of short-hands and extensions to regular expressions for the regular definitions. The auxiliary declarations are copied as such by LEX to the output lex.yy.c file. 
Rules in a LEX program consists of two parts :
i. The pattern to be matched
ii. The corresponding action to be executed 
LEX generates C code for the rules specified in the Rules section and places this code into a single function called yylex(). (To be discussed in detail later). In addition to this LEX generated code, the programmer may wish to add his own code to the lex.yy.c file. The auxiliary functions section allows the programmer to achieve this.

FLEX :

Flex (fast lexical analyzer generator) is a free and open-source software alternative to lex.[2] It is a computer program that generates lexical analyzers (also known as "scanners" or "lexers").[3][4] It is frequently used as the lex implementation together with Berkeley Yacc parser generator on BSD-derived operating systems (as both lex and yacc are part of POSIX),[5][6][7] or together with GNU bison (a version of yacc) in *BSD ports[8] and in Linux distributions. Unlike Bison, flex is not part of the GNU Project and is not released under the GNU General Public License.

Issues

Time complexity :
A Flex lexical analyzer usually has time complexity  in the length of the input. That is, it performs a constant number of operations for each input symbol. This constant is quite low: GCC generates 12 instructions for the DFA match loop.[citation needed] Note that the constant is independent of the length of the token, the length of the regular expression and the size of the DFA.
However, using the REJECT macro in a scanner with the potential to match extremely long tokens can cause Flex to generate a scanner with non-linear performance. This feature is optional. In this case, the programmer has explicitly told Flex to "go back and try again" after it has already matched some input. This will cause the DFA to backtrack to find other accept states. The REJECT feature is not enabled by default, and because of its performance implications its use is discouraged in the Flex manual.[11]
Reentrancy :By default the scanner generated by Flex is not reentrant. This can cause serious problems for programs that use the generated scanner from different threads. To overcome this issue there are options that Flex provides in order to achieve reentrancy. A detailed description of these options can be found in the Flex manual.
Usage under non-Unix environments:
Normally the generated scanner contains references to unistd.h header file which is Unix specific. To avoid generating code that includes unistd.h%option nounistd should be used. Another issue is the call to isatty (a Unix library function), which can be found in the generated code. The %option never-interactive forces flex to generate code that doesn't use isatty.

Using flex from other languages :
Flex can only generate code for C and C++. To use the scanner code generated by flex from other languages a language binding tool such as SWIG can be used.

 Issues
Time complexity :
A Flex lexical analyzer usually has time complexity  in the length of the input. That is, it performs a constant number of operations for each input symbol. This constant is quite low: GCC generates 12 instructions for the DFA match loop.[citation needed] Note that the constant is independent of the length of the token, the length of the regular expression and the size of the DFA.
However, using the REJECT macro in a scanner with the potential to match extremely long tokens can cause Flex to generate a scanner with non-linear performance. This feature is optional. In this case, the programmer has explicitly told Flex to "go back and try again" after it has already matched some input. This will cause the DFA to backtrack to find other accept states. The REJECT feature is not enabled by default, and because of its performance implications its use is discouraged in the Flex manual.[11]

Reentrancy

By default the scanner generated by Flex is not reentrant. This can cause serious problems for programs that use the generated scanner from different threads. To overcome this issue there are options that Flex provides in order to achieve reentrancy. A detailed description of these options can be found in the Flex manual.[12]

Usage under non-Unix environments

Normally the generated scanner contains references to unistd.h header file which is Unix specific. To avoid generating code that includes unistd.h%option nounistd should be used. Another issue is the call to isatty (a Unix library function), which can be found in the generated code. The %option never-interactive forces flex to generate code that doesn't use isatty.[13]

Using flex from other languages

Flex can only generate code for C and C++. To use the scanner code generated by flex from other languages a language binding tool such as SWIG can be used.