Posted to java-user@lucene.apache.org by Pinky Iyer <pi...@yahoo.com> on 2003/02/28 18:04:02 UTC
HTMLParser problem... anybody knowledgeable about JavaCC?
Hi!
I am trying to parse some JSP files, and I am changing the HTMLParser.jj code to accommodate this. As mentioned in the FAQ, I created a third comment tag type in the void CommentTag() :, TOKEN :, and <WithinCommentN> TOKEN : sections of HTMLParser.jj.
Here it is:
void CommentTag() :
{}
{
(<Comment1> ( <CommentText1> )* <CommentEnd1>)
|
(<Comment2> ( <CommentText2> )* <CommentEnd2>)
|
(<Comment3> ( <CommentText3> )* <CommentEnd3>)
}
The TOKEN section has the following:
< Comment3: "<%" > : WithinComment3
and WithinComment3 is as follows:
<WithinComment3> TOKEN :
{
< CommentText3: (~[">"])+>
| < CommentEnd3: "%>" > : DEFAULT
}
However, I get a lexical error when parsing the JSP file:
Parse Aborted: Lexical error at line 2, column 96. Encountered: ">" (62), after : ""
Title:
Summary:
The title and summary are not picked up. Does anybody have any idea what mistake I am making? I do not know the parsing language.
Any help appreciated!
Thanks!
Pinky
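One possible cause, offered as a sketch rather than a confirmed diagnosis (the thread contains no answer to this question): JavaCC's token manager always takes the longest match available in the current lexical state. With CommentText3 defined as (~[">"])+, the text token also consumes the "%" that should start "%>", so the lexer is left sitting on a bare ">" that no token in WithinComment3 accepts. That would be consistent with the reported error. The small Java program below simulates that longest-match behaviour with java.util.regex. The class WithinComment3Demo, its methods, the sample input, and the adjusted definition (~["%"])+ | "%" are all illustrative assumptions, not part of HTMLParser.jj.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: simulates longest-match lexing in the WithinComment3 state.
public class WithinComment3Demo {
    // Java regex equivalents of the JavaCC token definitions.
    static final Pattern POSTED_TEXT = Pattern.compile("[^>]+");      // (~[">"])+ as posted
    static final Pattern ADJUSTED_TEXT = Pattern.compile("[^%]+|%");  // (~["%"])+ | "%" (assumed fix)
    static final Pattern COMMENT_END = Pattern.compile("%>");         // CommentEnd3

    // Length of the match of p that starts exactly at pos, or 0 if there is none.
    static int matchLength(Pattern p, String input, int pos) {
        Matcher m = p.matcher(input);
        m.region(pos, input.length());
        return m.lookingAt() ? m.end() - pos : 0;
    }

    // Returns the token kinds produced for the text that follows "<%".
    static List<String> lex(String input, Pattern commentText) {
        List<String> kinds = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int textLen = matchLength(commentText, input, pos);
            int endLen = matchLength(COMMENT_END, input, pos);
            if (textLen == 0 && endLen == 0) {
                kinds.add("LexicalError"); // no token in this state matches here
                return kinds;
            }
            if (endLen > textLen) {
                kinds.add("CommentEnd3"); // the real lexer would switch back to DEFAULT here
                return kinds;
            }
            kinds.add("CommentText3");
            pos += textLen;
        }
        return kinds;
    }

    public static void main(String[] args) {
        // Made-up JSP directive body, i.e. everything after the opening "<%".
        String afterOpenTag = "@ page import=\"java.util.*\" %><html>";
        System.out.println(lex(afterOpenTag, POSTED_TEXT));   // [CommentText3, LexicalError]
        System.out.println(lex(afterOpenTag, ADJUSTED_TEXT)); // [CommentText3, CommentEnd3]
    }
}
```

If this is indeed the problem, the corresponding grammar change would be to define CommentText3 as (~["%"])+ | "%" so that "%>" stays available for CommentEnd3. This has not been tested against the actual HTMLParser.jj.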
Re: Word doc parser
Posted by Ryan Ackley <sa...@cfl.rr.com>.
Go to http://www.textmining.org
----- Original Message -----
From: "Pinky Iyer" <pi...@yahoo.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Friday, February 28, 2003 3:44 PM
Subject: Word doc parser
>
> Does anybody know of a good Word document parser?
> Thanks !
> P Iyer
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Word doc parser
Posted by Clemens Marschner <cm...@lanlab.de>.
You may want to think about using POI from Jakarta
http://jakarta.apache.org/poi
Clemens
----- Original Message -----
From: "Pinky Iyer" <pi...@yahoo.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Friday, February 28, 2003 9:44 PM
Subject: Word doc parser
>
> Does anybody know of a good Word document parser?
> Thanks !
> P Iyer
Word doc parser
Posted by Pinky Iyer <pi...@yahoo.com>.
Does anybody know of a good Word document parser?
Thanks !
P Iyer