Posted to user@manifoldcf.apache.org by ritika jain <ri...@gmail.com> on 2020/12/31 13:23:47 UTC

Indexation Not OK

Hi,

I am using ManifoldCF 2.14 with the JCIFS connector to ingest some billions
of records into Elasticsearch.
When a job has been running for a while, indexing succeeds at first, but
after some time ManifoldCF starts looping over the same records and their
Indexation result is no longer OK.

[image: image.png]

It keeps retrying those specific records. To get things going again, I have
to restart the Docker container every time; after the restart, indexation
works fine for those records too.
I also checked that the JSON built by the Elasticsearch connector is well
formed, which suggests that the files themselves are not the problem.
Can anybody please help me find the reason for this?

Thanks
Ritika

Re: Indexation Not OK

Posted by Karl Wright <da...@gmail.com>.
Sorry, I couldn't quite understand everything in your email, but it sounds
like the problem is in the ES connection.  It is possible that ES expires
your connection and the indexing fails after that happens.  If that is
happening, however, I would expect to see a much more detailed error
message in both the logs and in the simple history.  Can you provide any
error messages from the log that seem to be coming from the output
connection?

Thanks,
Karl
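
For anyone following along, here is a minimal sketch of one way to pull candidate lines out of a ManifoldCF log before posting them. It is not part of ManifoldCF. It assumes that each log line carries a level such as ERROR or WARN and that lines from the output connection mention "elasticsearch" somewhere; the function name and both patterns are made up for illustration and may need adjusting to the actual log format.

```python
import re
from typing import Iterable, List

# Assumption: log levels appear as plain words on the line.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARN|FATAL)\b")
# Assumption: output-connection lines mention Elasticsearch by name
# (written either as one word or as two).
KEYWORD_PATTERN = re.compile(r"elastic\s?search", re.IGNORECASE)


def output_connection_errors(lines: Iterable[str]) -> List[str]:
    """Return lines that look like errors or warnings about Elasticsearch."""
    return [
        line.rstrip("\n")
        for line in lines
        if LEVEL_PATTERN.search(line) and KEYWORD_PATTERN.search(line)
    ]
```

To use it, open the log file (its location depends on the deployment) and pass the file object to output_connection_errors. Only the first line of each entry is matched, so a multi-line stack trace still has to be copied from the log by hand.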


On Thu, Dec 31, 2020 at 8:30 AM Karl Wright <da...@gmail.com> wrote:

> Hi,
> Can you let us know what you are using for the output connector?
> Thanks,
> Karl

Re: Indexation Not OK

Posted by Karl Wright <da...@gmail.com>.
Hi,
I don't have the ability to delete mail from mailing lists.  You have to
request Apache Infra do that.

Karl


On Thu, Dec 31, 2020 at 11:38 AM Michael Cizmar <mi...@mcplusa.com>
wrote:

> Ritika, we have had some discussions regarding Docker and so on. The
> public one that is out there builds a single node and does not use an
> RDBMS. I would not recommend using that to index billions of documents.
> You can turn on debugging in the connector and look at the logs to see
> whether that traffic is actually going to Elasticsearch.
>
> Karl, I believe Ritika said Elastic.
>
> --
> Michael Cizmar
>

RE: Indexation Not OK

Posted by Michael Cizmar <mi...@mcplusa.com>.
Ritika, we have had some discussions regarding Docker and so on. The public one that is out there builds a single node and does not use an RDBMS. I would not recommend using that to index billions of documents. You can turn on debugging in the connector and look at the logs to see whether that traffic is actually going to Elasticsearch.

Karl, I believe Ritika said Elastic.


--
Michael Cizmar


From: ritika jain <ri...@gmail.com>
Sent: Thursday, December 31, 2020 7:33 AM
To: user@manifoldcf.apache.org
Subject: Re: Indexation Not OK

Elasticsearch output connector, with some custom changes for some fields.




Re: Indexation Not OK

Posted by ritika jain <ri...@gmail.com>.
Elasticsearch output connector, with some custom changes for some fields.

On Thursday, December 31, 2020, Karl Wright <da...@gmail.com> wrote:

> Hi,
> Can you let us know what you are using for the output connector?
> Thanks,
> Karl

Re: Indexation Not OK

Posted by Karl Wright <da...@gmail.com>.
Hi,
Can you let us know what you are using for the output connector?
Thanks,
Karl

