[Mesa-dev] [PATCH 07/23] glsl/glcpp: Stop using a lexer start condition (<SKIP>) for token skipping.

Thu Jul 17 16:36:37 PDT 2014

On Thu, Jun 26, 2014 at 3:19 PM, Carl Worth <cworth at cworth.org> wrote:
> Here, "skipping" refers to the lexer not emitting any tokens for portions of
> the file within an #if condition (or similar) that evaluates to false.
>
> Previously, the lexer had a special <SKIP> start condition used to control
> this skipping. This start condition was not handled like a normal start
> condition. Instead, there was a particularly ugly block of code set to be
> included at the top of the generated lexing loop that would change from
> <INITIAL> to <SKIP> or from <SKIP> to <INITIAL> depending on various pieces of
> parser state, (such as parser->skip_state and parser->lexing_directive).
>
> Not only was that an ugly approach, but the <SKIP> start condition was
> complicating several glcpp bug fixes I attempted recently that want to use
> start conditions for other purposes, (such as a new <HASH> start condition).
>
> The recently added RETURN_TOKEN macro gives us a convenient way to implement
> skipping without using a lexer start condition. Now, at the top of the
> generated lexer, we examine all the necessary parser state and set a new
> parser->skipping bit. Then, in RETURN_TOKEN, we examine parser->skipping to
> determine whether to actually emit the token or not.
>
> Besides this, there are only a couple of other places where we need to examine
> the skipping bit (other than when returning a token):
>
>         * To avoid emitting an error for #error if skipped.
>
>         * To avoid entering the <DEFINE> start condition for a #define that is
>           skipped.
>
> With all of this in place in the present commit, there are hopefully no
> behavioral changes with this patch, ("make check" still passes all of the
> glcpp tests at least).
> ---
>  src/glsl/glcpp/glcpp-lex.l   | 160 ++++++++++++++++++++++++++-----------------
>  src/glsl/glcpp/glcpp-parse.y |   1 +
>  src/glsl/glcpp/glcpp.h       |   1 +
>  3 files changed, 99 insertions(+), 63 deletions(-)
>
> diff --git a/src/glsl/glcpp/glcpp-lex.l b/src/glsl/glcpp/glcpp-lex.l
> index 37fcc84..cb06bb8 100644
> --- a/src/glsl/glcpp/glcpp-lex.l
> +++ b/src/glsl/glcpp/glcpp-lex.l
> @@ -61,19 +61,52 @@ void glcpp_set_column (int  column_no , yyscan_t yyscanner);
>                 yylloc->source = 0;     \
>         } while(0)
>
> -#define RETURN_TOKEN(token)                                    \
> +/* It's ugly to have macros that have return statements inside of
> + * them, but flex-based lexer generation is all built around the
> + * return statement.
> + *
> + * To mitigate the ugliness, we defer as much of the logic as possible
> + * to an actual function, not a macro (see
> + * glcpplex_update_state_per_token) and we make the word RETURN
> + * prominent in all of the macros which may return.
> + *
> + * The most-commonly-used macro is RETURN_TOKEN which will perform all
> + * necessary state updates based on the provided token, then
> + * conditionally return the token. It will not return a token if the
> + * parser is currently skipping tokens, (such as within #if
> + * 0...#else).
> + *
> + * The RETURN_TOKEN_NEVER_SKIP macro is a lower-level variant that
> + * makes the token returning unconditional. This is needed for things
> + * like #if and the tokens of its condition, (since these must be
> + * evaluated by the parser even when otherwise skipping).
> + *
> + * Finally, RETURN_STRING_TOKEN is a simple convenience wrapper on top
> + * of RETURN_TOKEN that performs a string copy of yytext before the
> + * return.
> + */
> +#define RETURN_TOKEN_NEVER_SKIP(token)                         \
>         do {                                                    \
>                 if (token == NEWLINE)                           \
>                         parser->last_token_was_newline = 1;     \
>                 else                                            \
>                         parser->last_token_was_newline = 0;     \
>                 return (token);                                 \
> +       } while (0)
> +
> +#define RETURN_TOKEN(token)                                            \
> +       do {                                                            \
> +               if (! parser->skipping) {                               \
> +                       RETURN_TOKEN_NEVER_SKIP(token);                 \

It looks like parser->last_token_was_newline will not be updated while
skipping, since only RETURN_TOKEN_NEVER_SKIP sets it. Should we still
update it while skipping, even though we're not returning tokens?
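
To make the question concrete, here is a minimal standalone sketch of one
way the newline state could be tracked even for skipped tokens. It is not
taken from the patch: sketch_parser_t, UPDATE_NEWLINE_STATE, NO_TOKEN and
sketch_lex_one are hypothetical stand-ins, and a real flex action would
fall through and keep scanning rather than return a placeholder.

```c
/* Hypothetical stand-ins for the glcpp token values and parser state. */
enum { NO_TOKEN = 0, NEWLINE = 1, IDENTIFIER = 2 };

typedef struct {
        int skipping;
        int last_token_was_newline;
} sketch_parser_t;

/* Hypothetical helper: record the newline state for every token,
 * whether or not the token ends up being returned. */
#define UPDATE_NEWLINE_STATE(token)                                     \
        do {                                                            \
                parser->last_token_was_newline = ((token) == NEWLINE);  \
        } while (0)

#define RETURN_TOKEN_NEVER_SKIP(token)                                  \
        do {                                                            \
                UPDATE_NEWLINE_STATE(token);                            \
                return (token);                                         \
        } while (0)

/* When skipping, nothing is returned, but the state is still updated. */
#define RETURN_TOKEN(token)                                             \
        do {                                                            \
                if (! parser->skipping) {                               \
                        RETURN_TOKEN_NEVER_SKIP(token);                 \
                }                                                       \
                UPDATE_NEWLINE_STATE(token);                            \
        } while (0)

/* Stand-in for a single lexer rule action. Returns NO_TOKEN when the
 * token was skipped. */
static int
sketch_lex_one(sketch_parser_t *parser, int token)
{
        RETURN_TOKEN(token);
        return NO_TOKEN;
}
```

Whether this is actually wanted depends on how last_token_was_newline is
consumed elsewhere in the lexer, which I have not checked.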

> +               }                                                       \
>         } while(0)
>
> -#define RETURN_STRING_TOKEN(token)                             \
> -       do {                                                    \
> -               yylval->str = ralloc_strdup (yyextra, yytext);  \
> -               RETURN_TOKEN (token);                           \
> +#define RETURN_STRING_TOKEN(token)                                     \
> +       do {                                                            \
> +               if (! parser->skipping) {                               \
> +                       yylval->str = ralloc_strdup (yyextra, yytext);  \
> +                       RETURN_TOKEN (token);                           \

Since parser->skipping has already been checked at this point, I guess
this could use RETURN_TOKEN_NEVER_SKIP directly.
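
Roughly what I have in mind, as a standalone sketch rather than a drop-in
replacement: sketch_parser_t, sketch_strdup (standing in for
ralloc_strdup), NO_TOKEN and sketch_lex_string are hypothetical names
used only so the example compiles outside of flex.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the glcpp token values and parser state. */
enum { NO_TOKEN = 0, NEWLINE = 1, IDENTIFIER = 2 };

typedef struct {
        int skipping;
        int last_token_was_newline;
} sketch_parser_t;

/* Plain-malloc stand-in for ralloc_strdup (yyextra, yytext). */
static char *
sketch_strdup(const char *s)
{
        size_t n = strlen(s) + 1;
        char *copy = malloc(n);
        if (copy != NULL)
                memcpy(copy, s, n);
        return copy;
}

#define RETURN_TOKEN_NEVER_SKIP(token)                                  \
        do {                                                            \
                if ((token) == NEWLINE)                                 \
                        parser->last_token_was_newline = 1;             \
                else                                                    \
                        parser->last_token_was_newline = 0;             \
                return (token);                                         \
        } while (0)

/* The skipping test happens once here, so the inner macro does not
 * need to repeat it. */
#define RETURN_STRING_TOKEN(token)                                      \
        do {                                                            \
                if (! parser->skipping) {                               \
                        *str_out = sketch_strdup(text);                 \
                        RETURN_TOKEN_NEVER_SKIP(token);                 \
                }                                                       \
        } while (0)

/* Stand-in for a lexer rule action that returns a string token. */
static int
sketch_lex_string(sketch_parser_t *parser, int token,
                  const char *text, char **str_out)
{
        RETURN_STRING_TOKEN(token);
        return NO_TOKEN;
}
```

The behavior should be the same as with RETURN_TOKEN; it only avoids
testing parser->skipping twice.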

-Jordan