flex: Simple Examples

 
 4 Some Simple Examples
 **********************
 
 The following simple examples give the flavor of how 'flex' is used.
 
    The following 'flex' input specifies a scanner which, when it
 encounters the string 'username', replaces it with the user's login
 name:
 
          %%
          username    printf( "%s", getlogin() );
 
    By default, any text not matched by a 'flex' scanner is copied to the
 output, so the net effect of this scanner is to copy its input file to
 its output with each occurrence of 'username' expanded.  In this input,
 there is just one rule.  'username' is the "pattern" and the 'printf' is
 the "action".  The '%%' symbol marks the beginning of the rules.
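    To make the "one rule plus default copy" behavior concrete, here is
 a hand-written C sketch of roughly what this scanner does to its input.
 It is not code that 'flex' generates: the function name
 'expand_username' and its buffer-based interface are assumptions made
 for illustration, and the login name is passed in as a parameter rather
 than fetched with 'getlogin()'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of flex.  Copies `in` to `out`,
 * replacing every occurrence of the text "username" with `login`.
 * Returns the number of characters written (not counting the
 * terminating NUL), or (size_t)-1 if `out` (capacity `cap`) is
 * too small. */
size_t expand_username(const char *in, const char *login,
                       char *out, size_t cap)
{
    static const char pattern[] = "username";
    const size_t plen = sizeof pattern - 1;
    const size_t llen = strlen(login);
    size_t n = 0;

    while (*in != '\0') {
        if (strncmp(in, pattern, plen) == 0) {
            /* the single rule: the pattern matched, so run the action */
            if (n + llen >= cap)
                return (size_t)-1;
            memcpy(out + n, login, llen);
            n += llen;
            in += plen;
        } else {
            /* default behavior: unmatched text is copied unchanged */
            if (n + 1 >= cap)
                return (size_t)-1;
            out[n++] = *in++;
        }
    }
    if (cap == 0)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

    Note that, like the 'flex' rule, this sketch matches 'username'
 wherever it occurs, including inside a longer word such as 'usernames'.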
 
    Here's another simple example:
 
                  int num_lines = 0, num_chars = 0;
      
          %%
          \n      ++num_lines; ++num_chars;
          .       ++num_chars;
      
          %%
      
          int main()
                  {
                  yylex();
                  printf( "# of lines = %d, # of chars = %d\n",
                          num_lines, num_chars );
                  }
 
    This scanner counts the number of characters and the number of lines
 in its input.  It produces no output other than the final report on the
 character and line counts.  The first line declares two globals,
 'num_lines' and 'num_chars', which are accessible both inside 'yylex()'
 and in the 'main()' routine declared after the second '%%'.  There are
 two rules, one which matches a newline ('\n') and increments both the
 line count and the character count, and one which matches any character
 other than a newline (indicated by the '.' regular expression).
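    The two rules can be read as the two branches of a loop over the
 input.  The following plain C sketch mirrors that logic on an in-memory
 buffer.  The 'struct counts' type and the 'count_text' function are
 hypothetical names introduced here for illustration; a real 'flex'
 scanner reads from 'yyin' and keeps its counts in the globals shown
 above.

```c
#include <stddef.h>

/* Hypothetical container for the two counters kept by the scanner. */
struct counts {
    int num_lines;
    int num_chars;
};

/* Hypothetical helper: applies the two rules to `len` bytes of `text`. */
struct counts count_text(const char *text, size_t len)
{
    struct counts c = {0, 0};

    for (size_t i = 0; i < len; ++i) {
        if (text[i] == '\n') {
            /* rule 1: a newline counts as a line and as a character */
            ++c.num_lines;
            ++c.num_chars;
        } else {
            /* rule 2: '.' matches any character other than a newline */
            ++c.num_chars;
        }
    }
    return c;
}
```

    Because every character is matched by exactly one of the two
 rules, nothing falls through to the default copy action, which is why
 the scanner prints only the final report.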
 
    A somewhat more complicated example:
 
          /* scanner for a toy Pascal-like language */
      
          %{
          /* need this for the calls to atoi() and atof() below */
          #include <stdlib.h>
          %}
      
          DIGIT    [0-9]
          ID       [a-z][a-z0-9]*
      
          %%
      
          {DIGIT}+    {
                      printf( "An integer: %s (%d)\n", yytext,
                              atoi( yytext ) );
                      }
      
          {DIGIT}+"."{DIGIT}*        {
                      printf( "A float: %s (%g)\n", yytext,
                              atof( yytext ) );
                      }
      
          if|then|begin|end|procedure|function        {
                      printf( "A keyword: %s\n", yytext );
                      }
      
          {ID}        printf( "An identifier: %s\n", yytext );
      
          "+"|"-"|"*"|"/"   printf( "An operator: %s\n", yytext );
      
          "{"[^{}\n]*"}"     /* eat up one-line comments */
      
          [ \t\n]+          /* eat up whitespace */
      
          .           printf( "Unrecognized character: %s\n", yytext );
      
          %%
      
          int main( int argc, char **argv )
              {
              ++argv, --argc;  /* skip over program name */
              if ( argc > 0 )
                      yyin = fopen( argv[0], "r" );
              else
                      yyin = stdin;
      
              yylex();
              }
 
    This is the beginning of a simple scanner for a language like
 Pascal.  It identifies different types of "tokens" and reports on what
 it has seen.
 
    The details of this example will be explained in the following
 sections.
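
    As a small illustration of the two numeric actions in the scanner
 above, the following C sketch formats the same reports for a lexeme
 that has already been matched.  The function 'report_number' is a
 hypothetical helper, not part of 'flex', and it assumes that a '.' in
 the lexeme means the float rule matched; in the real scanner that
 decision is made by 'flex' when it selects the rule.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes into `out` the report that the integer
 * or float action would print for `lexeme` (the text held in yytext).
 * Returns the value returned by snprintf(). */
int report_number(const char *lexeme, char *out, size_t cap)
{
    if (strchr(lexeme, '.') != NULL) {
        /* matched by {DIGIT}+"."{DIGIT}*  (digits after '.' are optional) */
        return snprintf(out, cap, "A float: %s (%g)\n",
                        lexeme, atof(lexeme));
    }
    /* matched by {DIGIT}+ */
    return snprintf(out, cap, "An integer: %s (%d)\n",
                    lexeme, atoi(lexeme));
}
```

    For example, the lexeme '42' yields 'An integer: 42 (42)', and
 '3.14' yields 'A float: 3.14 (3.14)'.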