Posted to dev@uima.apache.org by vijay vijay <vi...@gmail.com> on 2007/10/17 05:34:39 UTC

can we put word doc in place of text files.

Hi,

I have built a sample UIMA web application, using ExampleApplication as a
reference. It works fine with text files, and I can see the results
dynamically. I then took it one step further and used Word documents in place
of text files. The problem is that it recognizes the Word document and gives
results, but it does not pick up the tables and screens.

Instead of reading the Word file directly, I also used POI to convert it to
text and passed the output to my annotator. I get the same problem there too.

So if we want to search Word documents, do we need to use other techniques?
Can anyone help me here?

vijay

Re: can we put word doc in place of text files.

Posted by Thilo Goetz <tw...@gmx.de>.
Michael Baessler wrote:
> Sorry, I have no pointers to a word doc parser.
> 
> -- Michael
> 
> vijay vijay wrote:
...

Michael,

he is using POI, which is an Apache project that handles Word
documents (amongst other things).  Vijay, if you have trouble
using POI, please ask the POI folks.  UIMA only handles text;
document parsing is not something we support out of the box.

--Thilo
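
To illustrate the division of labour described above, here is a minimal
sketch in plain Java: every input is turned into plain text first, and only
that text would be handed to the analysis. The names `PlainTextGate` and
`toPlainText` and the `wordParser` argument are hypothetical. The Word branch
is only a placeholder where a POI-based extractor could be plugged in; no
POI or UIMA calls are shown.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.function.Function;

class PlainTextGate {
    /**
     * Returns plain text for one input file.
     * wordParser is a hypothetical hook: a real application would plug in
     * a Word parser (for example one built on POI) at this point.
     */
    static String toPlainText(String fileName, byte[] content,
                              Function<byte[], String> wordParser) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".doc")) {
            // Binary Word file: it must be parsed, not decoded as characters.
            return wordParser.apply(content);
        }
        // Assumption: everything else is UTF-8 encoded plain text.
        return new String(content, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stub parser standing in for a real Word extractor.
        Function<byte[], String> stub = bytes -> "text extracted by the parser";
        byte[] plain = "plain text input".getBytes(StandardCharsets.UTF_8);
        System.out.println(toPlainText("notes.txt", plain, stub));
        System.out.println(toPlainText("report.doc", new byte[] {1, 2, 3}, stub));
    }
}
```

Keeping the parser behind a single hook means the analysis side never sees
anything other than a String, whichever parser is chosen.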


Re: can we put word doc in place of text files.

Posted by Michael Baessler <mb...@michael-baessler.de>.
Sorry, I have no pointers to a word doc parser.

-- Michael



Re: can we put word doc in place of text files.

Posted by vijay vijay <vi...@gmail.com>.
Thank you.

Do you have any references? I think I might be troubling you, but I have been
stuck on this for almost 2 weeks; that is why I posted this to you. Can you
tell me what kind of parser we need?

Actually, I have tried with

    resp.setContentType("application/binary");
    resp.setHeader("Content-Disposition", "attachment; filename=\"" +
        fname.getName() + "\";");

and

    response.setContentType("application/vnd.ms-word");

Still I got the same result.

vijay

Re: can we put word doc in place of text files.

Posted by Michael Baessler <mb...@michael-baessler.de>.
No, you need a special parser to parse the content of the Word file and
extract the plain text.
If you only replace the files, you will get confusing results.

-- Michael
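
A minimal sketch of why simply swapping the files does not work, in plain
Java: a binary .doc file is an OLE2 compound file, not character data, so an
application can check the leading signature bytes and route such files to a
parser instead of reading them as text. The class and method names here are
hypothetical, and the check covers only the binary .doc container.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class WordFileCheck {
    // Signature at the start of OLE2 compound files, the container
    // used by binary Word .doc files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** True if the content starts with the OLE2 signature. */
    static boolean looksLikeOle2(byte[] content) {
        if (content.length < OLE2_MAGIC.length) {
            return false;
        }
        byte[] head = Arrays.copyOf(content, OLE2_MAGIC.length);
        return Arrays.equals(head, OLE2_MAGIC);
    }

    public static void main(String[] args) {
        byte[] text = "Hello UIMA".getBytes(StandardCharsets.UTF_8);
        // Fake .doc content: the signature followed by zero padding.
        byte[] fakeDoc = Arrays.copyOf(OLE2_MAGIC, 512);
        System.out.println("text file needs a parser: " + looksLikeOle2(text));
        System.out.println(".doc file needs a parser: " + looksLikeOle2(fakeDoc));
    }
}
```

A file that passes this check should go through a Word parser first; only
the extracted plain text should then be given to the analysis.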



Re: can we put word doc in place of text files.

Posted by vijay vijay <vi...@gmail.com>.
Hi Michael,

I have seen many postings from you. My question is: can I keep Word files in
place of text files in the input directory?



Re: can we put word doc in place of text files.

Posted by Michael Baessler <mb...@michael-baessler.de>.
I don't understand your problem, so I can't help you. I'm also not sure
whether you are talking about a UIMA problem.

-- Michael


Re: can we put word doc in place of text files.

Posted by vijay vijay <vi...@gmail.com>.
Hi Michael,

Can you help me with my topic? If you can only give me some URLs, that is no
problem. I have not been posting because I don't receive any replies. Today I
posted out of curiosity.

I am successfully getting results with UIMA as a web application, and I am
able to search for strings (text) dynamically. When I gave it a Word document
in place of text, the problems started: it reads only the text, and tables
and figures are not recognized. I used POI to convert the Word document into
a text file and then ran the search, and the same thing happened.

So can you help me here, Michael?

vijay



Re: can we put word doc in place of text files.

Posted by Michael Baessler <mb...@michael-baessler.de>.
Again, do not post the same question to both UIMA lists!

-- Michael