Posted to user@manifoldcf.apache.org by Shinichiro Abe <sh...@gmail.com> on 2011/07/04 15:37:58 UTC
Changing Include files doesn't work
Hi.
I scheduled a run-once crawl job and set the Paths tab to include only *.txt.
I started the job, and the text files were indexed.
After that, I changed *.txt to *.xls.
I started the job again. It should have indexed the xls files, but the text files were not deleted and the xls files were not indexed.
This problem occurs with Tomcat + a separate agent process, not with Jetty.
(Restarting the agent process after changing the target files resolves it.)
Which is the correct behavior?
It seems to be caused by cache management, or it looks like a bug around seeding,
or something is wrong in my Tomcat environment.
It occurs with both run-once and continuous crawling,
and with both jcifs and filesystem.
Regards, Shinichiro Abe
Re: Changing Include files doesn't work
Posted by Karl Wright <da...@gmail.com>.
Feel free to open a ticket. But this cannot be done at the script
level - there's a race condition if you were to do that. So any fix
needs to be in the AgentRun class.
Karl
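
A minimal sketch of the kind of in-process check discussed above, assuming an OS-level file lock is an acceptable way to detect a second instance. A script that tests "is it running?" and then launches leaves a window between the test and the launch, whereas acquiring a lock is a single atomic step. The class name SingleInstanceGuard, the method tryAcquire, and the lock file path are hypothetical; this is not ManifoldCF's AgentRun code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical single-instance guard; an illustration, not part of ManifoldCF. */
final class SingleInstanceGuard implements AutoCloseable {
    // Lock files already held by this JVM. Tracking them here avoids opening a
    // second channel on the same file, which on some systems would drop the lock
    // when that second channel is closed.
    private static final Set<Path> HELD = ConcurrentHashMap.newKeySet();

    private final Path path;
    private final FileChannel channel;
    private final FileLock lock;

    private SingleInstanceGuard(Path path, FileChannel channel, FileLock lock) {
        this.path = path;
        this.channel = channel;
        this.lock = lock;
    }

    /** Returns a guard holding the lock, or null if the lock is already held. */
    static SingleInstanceGuard tryAcquire(Path lockFile) throws IOException {
        Path path = lockFile.toAbsolutePath().normalize();
        if (!HELD.add(path)) {
            return null; // this JVM already holds the lock
        }
        boolean acquired = false;
        FileChannel channel = null;
        try {
            channel = FileChannel.open(path,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            FileLock lock = channel.tryLock(); // null if another process holds it
            if (lock == null) {
                return null;
            }
            acquired = true;
            return new SingleInstanceGuard(path, channel, lock);
        } finally {
            if (!acquired) {
                HELD.remove(path);
                if (channel != null) {
                    channel.close();
                }
            }
        }
    }

    /** Releases the lock so that another instance may start. */
    @Override
    public void close() throws IOException {
        try {
            lock.release();
        } finally {
            try {
                channel.close();
            } finally {
                HELD.remove(path);
            }
        }
    }
}
```

An agent entry point could call tryAcquire on a well-known lock file at startup and exit with a warning when it returns null. The operating system releases the lock when the process exits, so a crashed agent does not leave a stale lock behind.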
Re: Changing Include files doesn't work
Posted by Shinichiro Abe <sh...@gmail.com>.
Thank you.
I set up the sync directory property, and then it worked well.
BTW, though I don't know whether it is related to the synchronization,
can it check whether more than one instance of the agent is running?
For example, if one runs "executecommand.sh ~ agents.AgentRun" twice,
can the command warn that it is already running?
Currently, if one runs the command twice, two Java processes run.
Thank you,
Shinichiro Abe
Re: Changing Include files doesn't work
Posted by Karl Wright <da...@gmail.com>.
It sounds like you have not set up synchronization properly for the
multi-process installation you are running. See:
http://incubator.apache.org/connectors/how-to-build-and-deploy.html#Examples
The synch directory needs to be specified for multi-process installations.
Karl
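
A minimal sketch of a pre-flight check for the setting mentioned above, assuming the configuration lives in a properties.xml file made of <property name="..." value="..."/> entries and that the synch directory is set through a property named org.apache.manifoldcf.synchdirectory. Both are assumptions to verify against the deployment page linked above for your release. The class SynchDirectoryCheck and its methods are hypothetical helpers, not ManifoldCF code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper; checks that a synch directory is configured and usable. */
final class SynchDirectoryCheck {
    // Assumed property name; confirm it against the documentation for your release.
    static final String SYNCH_PROPERTY = "org.apache.manifoldcf.synchdirectory";

    /** Returns the configured synch directory, if the property is present. */
    static Optional<String> findSynchDirectory(InputStream propertiesXml)
            throws IOException, SAXException, ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(propertiesXml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element property = (Element) props.item(i);
            if (SYNCH_PROPERTY.equals(property.getAttribute("name"))) {
                return Optional.of(property.getAttribute("value"));
            }
        }
        return Optional.empty();
    }

    /** True if the path is an existing directory this process can read and write. */
    static boolean isUsable(String directory) {
        Path path = Paths.get(directory);
        return Files.isDirectory(path)
                && Files.isReadable(path)
                && Files.isWritable(path);
    }
}
```

Running the same check as each process owner (the Tomcat user and the agent user) shows whether every process points at the same directory and can read and write it, which is what a multi-process installation needs.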