Ignore certain characters in a token's text?

I have the following token definition in my lexer defining a CharacterString (e.g. 'abcd'):

CharacterString:
  Apostrophe
  (Alphanumeric)*
  Apostrophe
;

Is it possible to ignore the two apostrophes in the lexer, so that I can get the token text without them (via $CharacterString.text->chars)?

I tried ...

CharacterString:
  Apostrophe { $channel = HIDDEN; }
  (Alphanumeric)*
  Apostrophe { $channel = HIDDEN; }
;

... without success. With this rule my string no longer even matches (e.g. 'oiu' fails in the parser with a MismatchedSetException).

Thank you :)

Answers


The inline code {$channel=HIDDEN;} affects the entire CharacterString token, not just the apostrophe it follows, so the approach you tried cannot work.

You will need to add some custom code and remove the quotes yourself. Here's a small C demo:

grammar T;

options {
  language=C;
}

parse
  :  (t=. {printf(">\%s<\n", $t.text->chars);})+ EOF
  ;

CharacterString
  :  '\'' ~'\''* '\''
     {
       pANTLR3_STRING quoted = GETTEXT();
       SETTEXT(quoted->subString(quoted, 1, quoted->len-1));
     }
  ;

Any
  :  .
  ;

and a little test function:

#include "TLexer.h"
#include "TParser.h"

int main(int argc, char *argv[])
{
  pANTLR3_UINT8 fName = (pANTLR3_UINT8)"input.txt";
  pANTLR3_INPUT_STREAM input = antlr3AsciiFileStreamNew(fName);

  if(input == NULL)
  {
    fprintf(stderr, "Failed to open file %s\n", (char *)fName);
    exit(1);
  }

  pTLexer lexer = TLexerNew(input);

  if(lexer == NULL)
  {
    fprintf(stderr, "Unable to create the lexer due to malloc() failure\n");
    exit(1);
  }

  pANTLR3_COMMON_TOKEN_STREAM tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lexer));

  if(tstream == NULL)
  {
    fprintf(stderr, "Out of memory trying to allocate token stream\n");
    exit(1);
  }

  pTParser parser = TParserNew(tstream);

  if(parser == NULL)
  {
    fprintf(stderr, "Out of memory trying to allocate parser\n");
    exit(ANTLR3_ERR_NOMEM);
  }

  parser->parse(parser);

  parser->free(parser);   parser = NULL;
  tstream->free(tstream); tstream = NULL;
  lexer->free(lexer);     lexer = NULL;
  input->close(input);    input = NULL;

  return 0;
}

and the test input.txt file contains:

'abc'

If you now 1) generate the lexer and parser, 2) compile all .c source files, and 3) run main:

# 1
java -cp antlr-3.3.jar org.antlr.Tool T.g

# 2
gcc -Wall main.c TLexer.c TParser.c -l antlr3c -o main

# 3
./main

you'll see that >abc< (the token text without the quotes) is printed to the console.
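
If the index arithmetic in the lexer action is unclear, here is a minimal plain-C sketch of the same idea, independent of ANTLR. The helper name strip_delimiters is hypothetical and not part of the ANTLR3 C runtime. It assumes, as the demo output suggests, that subString(quoted, 1, quoted->len-1) starts at index 1 and stops before index len-1, so the result is len-2 characters long.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an ANTLR3 runtime function): returns a newly
 * allocated copy of `quoted` without its first and last character.
 * Returns NULL if the input is shorter than two characters or if
 * allocation fails. The caller must free() the result. */
char *strip_delimiters(const char *quoted)
{
    assert(quoted != NULL);

    size_t len = strlen(quoted);
    if (len < 2) {
        return NULL;                /* no room for two delimiters */
    }

    /* Characters 1 .. len-2, i.e. start index 1, exclusive end len-1. */
    size_t inner_len = len - 2;
    char *inner = malloc(inner_len + 1);
    if (inner == NULL) {
        return NULL;
    }

    memcpy(inner, quoted + 1, inner_len);
    inner[inner_len] = '\0';
    return inner;
}
```

Calling strip_delimiters("'abc'") yields a copy of abc. Like the grammar action, it does not check that the delimiters really are apostrophes; in the lexer, the rule itself has already guaranteed that.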

