| |
| =-----------------------------------------------------------------------------= |
| * * |
| * Paren Stack * |
| * * |
| =-----------------------------------------------------------------------------= |
| |
| At the heart of this algorithm are two stacks. |
| There is the Paren Stack (PS) and the Frame stack. |
| |
| The PS (pse in the code) keeps track of braces, parens, |
| if/else/switch/do/while/etc items -- anything that is nestable. |
| |
| Complex statements go through some of these BS_ stages: |
| BS_PAREN1 - paren on if/for/switch/while, etc |
| BS_PAREN2 - paren on do{}while() |
| BS_BRACE_DO - brace set on do{} |
| BS_BRACE2 - brace on if/else/for/switch/while |
| BS_ELSE - expecting 'else' after 'if' |
| BS_ELSEIF - expecting 'if' after 'else' |
| BS_WHILE - expecting 'while' after 'do' |
| |
| The file is processed one token at a time to support #if/#else/#endif |
| preprocessors at any point. |
| |
| Take this simple if statement as an example: |
| |
| if ( x ) |
| { |
| x--; |
| } |
| |
| The stack would look like so: |
| The format is first the token processed and then the PSE stack as it appears |
| AFTER the token is processed. |
| |
| 'if' [IF - PAREN1] |
| '(' [IF - PAREN1] [SPAREN OPEN] |
| 'x' [IF - PAREN1] [SPAREN OPEN] |
| ')' [IF - BRACE2] <- note that the stage was changed on SPAREN_CLOSE |
| '{' [IF - BRACE2] [BRACE OPEN] |
| 'x' [IF - BRACE2] [BRACE OPEN] |
| '--' [IF - BRACE2] [BRACE OPEN] |
| ';' [IF - BRACE2] [BRACE OPEN] |
| '}' [IF - ELSE] |
| <- lack of else kills the ELSE, closes statement |
| |
| Virtual brace example: |
| if ( x ) |
| x--; |
| else if (y) |
| y--; |
| else |
| z++; |
| |
| 'if' [IF - PAREN1] |
| '(' [IF - PAREN1] [SPAREN OPEN] |
| 'x' [IF - PAREN1] [SPAREN OPEN] |
| ')' [IF - BRACE2] |
| 'x' [IF - BRACE2] [VBRACE OPEN] <- VBrace open inserted before because |
| the token was not '{' |
| '--' [IF - BRACE2] [VBRACE OPEN] |
| ';' [IF - ELSE] <- Semicolon causes a VBrace close to be |
| inserted after the semicolon |
| 'else' [ELSE - ELSEIF] <- IF changed into ELSE, expect IF or BRACE |
| 'x' [ELSE - BRACE2] [VBRACE OPEN] <- lack of '{' -> VBrace |
| '++' [ELSE - BRACE2] [VBRACE OPEN] |
| ';' <- VBrace close inserted after semicolon |
| ELSE removed after statement close |
| |
| Nested virtual brace example: (EOF represents the end of the file) |
| if ( x ) |
| if (y) |
| y--; |
| else |
| z++; |
| EOF |
| |
| 'if' [IF - PAREN1] |
| '(' [IF - PAREN1] [PAREN OPEN] |
| 'x' [IF - PAREN1] [PAREN OPEN] |
| ')' [IF - BRACE2] |
| 'if' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] <- VBrace on BRACE2, IF opened |
| '(' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] [SPAREN OPEN] |
| 'y' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] [SPAREN OPEN] |
| ')' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] |
| 'y' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] [VBRACE OPEN] |
| '--' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] [VBRACE OPEN] |
| ';' [IF - BRACE2] [VBRACE OPEN] [IF - ELSE] |
| 'else' [IF - BRACE2] [VBRACE OPEN] [ELSE - ELSEIF] |
| 'z' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN] |
| '++' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN] |
| ';' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] - step1 |
| [IF - BRACE2] [VBRACE OPEN] - step2 |
| [IF - ELSE] - step3 |
| EOF |
| |
| -- this last semi is more complicated - first it terminates the VBRACE and then |
| the else, which then, since it is the end of a statement, terminates the |
| VBRACE. That bumps the IF stage to ELSE. |
| The EOF kills that off (since it is not an else) |
| |
| Order of operation: |
| 1) if TOS=VBRACE && PC=SEMI, insert VBRACE close, PC=>VBRACE close |
| 2) if PC=VBRACE close or PC=BRACE close, and TOS is complex (if/else/etc) |
| then advance complex stage. If statement ends, pop and advance |
| |
| |
| Stages for each complex statement: |
| if |
| IF-PAREN1, IF-BRACE2, IF-ELSE |
| |
| if/else |
| IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, ELSE-BRACE2 |
| |
| if/else if/else |
| IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, ELSE-BRACE2 |
| |
| for |
| FOR-PAREN1, FOR-BRACE2 |
| |
| while |
| WHILE-PAREN1, WHILE-BRACE2 |
| |
| switch |
| SWITCH-PAREN1, SWITCH-BRACE2 |
| |
| do/while |
| DO-BRACE_DO, DO-WHILE, WHILE-PAREN2 |
| |
| |
| Another less-interesting example: |
| |
| { |
| if (x) |
| volatile |
| { |
| y++; |
| } |
| return y; |
| } |
| |
| '{' [BRACE OPEN] |
| 'if' [BRACE OPEN] [IF - PAREN1] |
| '(' [BRACE OPEN] [IF - PAREN1] [PAREN OPEN] |
| 'x' [BRACE OPEN] [IF - PAREN1] [PAREN OPEN] |
| ')' [BRACE OPEN] [IF - BRACE2] |
| 'volatile' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] |
| '{' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN] |
| 'y' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN] |
| '++' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN] |
| ';' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN] |
| '}' [BRACE OPEN] [IF - ELSE] <- the brace close ends brace-open, |
| volatile-brace2 and vbrace-open |
| 'return' [BRACE OPEN] <- not else |
| 'y' [BRACE OPEN] |
| ';' [BRACE OPEN] |
| '}' <- empties the stack |
| |
| |
| =-----------------------------------------------------------------------------= |
| * * |
| * Parse Frames * |
| * * |
| =-----------------------------------------------------------------------------= |
| |
| The pse stack is kept on a frame stack. |
| The frame stack is need for languages that support preprocessors (C, C++, C#) |
| that can arbitrarily change code flow. It also isolates #define macros so |
| that they are indented independently and do not affect the rest of the program. |
| |
| When an #if is hit, a copy of the current frame is push on the frame stack. |
| When an #else/#elif is hit, a copy of the current stack is pushed under the |
| #if frame and the original (pre-#if) frame is copied to the current frame. |
| When #endif is hit, the top frame is popped. |
| This has the following effects: |
| - a simple #if / #endif does not affect program flow |
| - #if / #else /#endif - continues from the #if clause |
| |
| When a #define is entered, the current frame is pushed and cleared. |
| When a #define is exited, the frame is popped. |
| |
| |
| Take this example, which isn't very exciting, as both the #if and #else parts |
| end up with the same paren stack. This is the usual case. |
| |
| { |
| foo(param1, |
| #ifdef DEBUG |
| "debug"); |
| #else |
| "release"); |
| #endif |
| } |
| |
| Right before the #ifdef, we have this for the paren stack: |
| Top> [BRACE OPEN] [PAREN OPEN] |
| |
| The #ifdef pushes a copy of the current stack, so we have this: |
| Top> [BRACE OPEN] [PAREN OPEN] |
| [BRACE OPEN] [PAREN OPEN] |
| |
| The close paren after "debug" closes out the PAREN-OPEN on the top of the stack. |
| Top> [BRACE OPEN] |
| [BRACE OPEN] [PAREN OPEN] |
| |
| The #else swaps the top two frames. |
| Top> [BRACE OPEN] [PAREN OPEN] |
| [BRACE OPEN] |
| |
| Right after the #else, we hit another close paren after the "release". |
| Top> [BRACE OPEN] |
| [BRACE OPEN] |
| |
| At the #endif, the top of stack is thrown out, which restores us to the #if path. |
| Top> [BRACE OPEN] |
| |
| |