flex and bison usage in mysql

发布时间 2023-10-19 14:51:19作者: winter-loo

query parsing in mysql

mysql source code version: 8.0.34 (from MYSQL_VERSION file)

This an article from questions to understandings.

  • which file does mysql use to define sql grammar?
    sql/sql_yacc.yy

  • what is the name yyparse replaced with in mysql?
    Searching name-prefix in mysql source code, you will get --name-prefix=MYSQL in sql/CMakeLists.txt. In short, yyparse function is replaced with MYSQLparse function. In latest version of mysql, yyparse is named as my_sql_parser_parse due to the directive %define api.prefix {my_sql_parser_}.

  • where does the MYSQLparse function be called?
    In THD::sql_parser() method, which is defined in sql/sql_class.cc file.

  • what is the signature of MYSQLparse function and how is it generated?
    Here's the signature,

    extern int MYSQLparse(class THD * thd, class Parse_tree_root * *root);
    

    This signature will be generated by bison command with following declaration:

    %parse-param { class THD *YYTHD }
    %parse-param { class Parse_tree_root **parse_tree }
    
  • How does the parameters YYTHD and parse_tree be used in semantic actions?
    Here's how YYTHD used,

    prepare:
          PREPARE_SYM ident FROM prepare_src
          {
            THD *thd= YYTHD;
            LEX *lex= thd->lex;
            lex->sql_command= SQLCOM_PREPARE;
            lex->prepared_stmt_name= to_lex_cstring($2);
            lex->contains_plaintext_password= true;
          }
        ;
    
    

    Here's how parse_tree used,

    simple_statement_or_begin:
          simple_statement      { *parse_tree= $1; }
        | begin_stmt
        ;
    
  • How does mysql define the bison YYSTYPE?
    Generally, YYSTYPE definition is generated by bison tool with directive %union. Such as, in .yy file,

    %union {
    	int ival;
    }
    

    and use it as type for nonterminal symbol,

    %type <ival> Iconst
    

    Bison tool will generate a definition in .c or .cc file,

    union YYSTYPE {
      int ival;
    };
    

    However, mysql does not use the %union directive and defines YYSTYPE type in a separate header file sql/parser_yystype.h,

    union YYSTYPE {
      Lexer_yystype lexer;  // terminal values from the lexical scanner
    };
    

    The Lexer_yystype is defined as,

    union Lexer_yystype {
      LEX_STRING lex_str;
      LEX_SYMBOL keyword;
      const CHARSET_INFO *charset;
      PT_hint_list *optimizer_hints;
      LEX_CSTRING hint_string;
    };
    

    Then, mysql defines its nonterminal symbol type as,

    %type <lexer.lex_str> TEXT_STRING_sys TEXT_STRING
    
  • In general, yyparse function calls yylex function internally. How does mysql implement its lexical analyzer?
    From previous question, we know yyparse is replaced with MYSQLparse. As a consequence, yylex is named MYSQLlex. However, mysql does not use flex tool as its lexical analyzer. Instead, mysql implements its own lexical analysis process in class Lex_input_stream.