Posted to users@solr.apache.org by gnandre <ar...@gmail.com> on 2022/09/23 15:51:39 UTC
Atomic indexing as default indexing
Is there a way to make atomic indexing default?
Say, even if some clients send non-atomic indexing requests, they would get
converted to atomic indexing requests on the Solr end. Is that possible?
I am asking because we usually run into the following issue:
1. Client A is the major contributor of almost all the fields of a Solr
document. This is non-atomic indexing.
2. Client B contributes some additional fields to the same document and
does this with atomic indexing.
3. If Client A indexes again, the fields populated by Client B are wiped
out.
If we make all indexing atomic on the Solr end, then we won't run into
this problem (except in the rare case where Client A deletes the document
and then indexes it back; that is fine and we can deal with it because it
is rare).
Re: Exception with embedded Solr (was: Re: Atomic indexing as default indexing)
Posted by Shawn Heisey <el...@elyograg.org.INVALID>.
On 9/23/22 15:08, Shawn Heisey wrote:
> have removed the email headers that would bury this message inside a
> thread that has nothing to do with it
I *thought* I had removed those headers. But the message got buried anyway.
Shawn
Exception with embedded Solr (was: Re: Atomic indexing as default indexing)
Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
On 9/23/22 12:07, L H wrote:
> Hello dear colleagues,
>
> I was using Embedded solr on JAVA 8 for caching some data - however, I am
> required to update JAVA to version 17.
>
> I can see that core container is not able to access home directory.
>
> Below is the exception I get; could someone please help me to know to fix
> the issue?
I have removed the email headers that would bury this message inside a
thread that has nothing to do with it, which is where I found your
message. You didn't even change the subject. Please do not reply to an
existing message unless that message is directly related to what you are
sending. Start a brand new message with a new subject for a new topic.
https://www.dropbox.com/s/3avr9o03gpx7rko/solr-user-buried-thread-2022-09.png?dl=0
What version of Solr/SolrJ are you using? I suspect that you're using a
version that was not qualified with any Java version later than 8. You
might need to upgrade Solr to have it work right with Java 17. In
recent years Java has gotten a lot better at not introducing breaking
changes, but you have just jumped NINE major versions. Any software is
likely to change in extreme ways across that many major versions.
The sweet spot for Solr 7 or 8 seems to be Java 11, but these Solr
versions only require Java 8. Solr 9.x *requires* Java 11, and it is
the only version I personally would run with anything newer than Java
11. For Solr 6, I would not run anything newer than Java 8. Solr 7.0
was the first version that was qualified to run in Java 9, and I recall
code changes being required to achieve that.
Thanks,
Shawn
Re: Atomic indexing as default indexing
Posted by L H <le...@gmail.com>.
Hello dear colleagues,
I was using embedded Solr on Java 8 for caching some data. However, I am
required to update Java to version 17.
I can see that the core container is not able to access the home directory.
Below is the exception I get; could someone please help me figure out how
to fix the issue?
============================ exception ======================:
Caused by: org.apache.solr.common.SolrException: JVM Error creating core [invoiceconfig]: null
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:856)
Caused by: org.apache.solr.common.SolrException: JVM Error creating core [invoiceconfig]: null
at org.apache.solr.core.CoreContainer.lambda$load$0(CoreContainer.java:494)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:889)
Caused by: java.lang.ExceptionInInitializerError
Caused by: java.lang.ExceptionInInitializerError
at java.base/java.lang.J9VMInternals.ensureError(J9VMInternals.java:185)
at java.base/java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:174)
at org.apache.solr.core.MMapDirectoryFactory.init(MMapDirectoryFactory.java:51)
at org.apache.solr.core.SolrCore.initDirectoryFactory(SolrCore.java:528)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:724)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:688)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:838)
... 6 more
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make public jdk.internal.ref.Cleaner java.nio.DirectByteBuffer.cleaner() accessible: module java.base does not "opens java.nio" to unnamed module @f0b0647f
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make public jdk.internal.ref.Cleaner java.nio.DirectByteBuffer.cleaner() accessible: module java.base does not "opens java.nio" to unnamed module @f0b0647f
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:199)
at java.base/java.lang.reflect.Method.setAccessible(Method.java:193)
at org.apache.lucene.store.MMapDirectory.unmapHackImpl(MMapDirectory.java:345)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:692)
at org.apache.lucene.store.MMapDirectory.<clinit>(MMapDirectory.java:326)
... 11 more
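The innermost cause in the trace above names the module and package that the Java module system refuses to open for reflection. As a rough sketch (the helper name add_opens_flag is made up for illustration), the Python snippet below derives the matching --add-opens JVM option from such a message. For embedded Solr, the option would go on the host application's JVM command line. It is an assumption that this flag alone is enough for the Solr version in use; other reflective accesses may still fail, and upgrading Solr, as suggested in Shawn's reply, may be the more reliable fix.

```python
import re

# Matches the module/package pair in an InaccessibleObjectException message.
_OPENS_PATTERN = re.compile(
    r'module (\S+) does not "opens (\S+)" to unnamed module'
)


def add_opens_flag(message):
    """Return the --add-opens option for the message, or None if no match.

    Hypothetical helper: it only reformats information that is already
    present in the exception text.
    """
    # Mail clients wrap long lines, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = _OPENS_PATTERN.search(flat)
    if match is None:
        return None
    module, package = match.groups()
    # ALL-UNNAMED opens the package to code loaded from the classpath.
    return f"--add-opens={module}/{package}=ALL-UNNAMED"
```

For the message in this trace, the helper yields --add-opens=java.base/java.nio=ALL-UNNAMED.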
Re: Atomic indexing as default indexing
Posted by Thomas Corthals <th...@klascement.net>.
On Fri, 23 Sep 2022 at 18:17, Shawn Heisey
<ap...@elyograg.org.invalid> wrote:
> On 9/23/22 09:51, gnandre wrote:
> > Is there a way to make atomic indexing default?
> >
> > Say, even if some clients send non-atomic indexing requests, it should
> get
> > converted to atomic indexing requests on Solr end, is that possible?
> >
> > I am asking because we usually run into the following issue:
> > 1. Client A is the major contributor of almost all the fields of a Solr
> > document. This is non-atomic indexing.
> > 2. Client B contributes some additional fields to the same document and
> > does this with atomic indexing.
> > 3. If Client A indexes again, the fields populated by Client B are wiped
> > out.
> >
> > If we make all indexing atomic indexing on Solr end then we won't run
> into
> > this problem (except in a rare case where Client A deletes the document
> > then indexes it back, this is fine and we can deal with it because it is
> > rare)
>
> We would be surprising a LOT of users if we did that. Right now they
> can simply reindex a document to delete fields that were indexed before
> but shouldn't be there. If we made atomic indexing the default, we
> would definitely get complaints about the fact that these fields did not
> get removed.
>
> And what about users that have a schema that is not appropriate for
> atomic indexing? Quite a lot of users, me included, have fields that
> are indexed but not stored and have no docValues. I can guarantee you
> that if we made atomic indexing the default, that users would assume
> that all their existing fields will be preserved, and that might not be
> the case.
>
> It sounds like what you should do is have client A be aware that a
> document might have changes done after they indexed it, and they should
> do a check to see whether a doc already exists, and if it does, change
> their indexing to atomic.
>
> It is extremely problematic to have one index be built by two different
> entities in this way. Maybe instead you should have separate indexes
> for each client and use Solr's join capability to combine the info from
> both indexes into one result. Just be aware that Solr's join capability
> will NOT do everything a relational database expert might expect.
>
> Thanks,
> Shawn
>
>
Client A can use Optimistic Concurrency
<https://solr.apache.org/guide/solr/latest/indexing-guide/partial-document-updates.html#optimistic-concurrency>
to check if a document has been updated by client B.
Use the /get handler from client A to get the _version_ after indexing and
store it locally. Use that _version_ for further updates from client A to
check if the document was changed by client B.
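A minimal sketch of that flow in Python, using only the standard library. The base URL, collection name, and function names are assumptions for illustration. The /get handler, the _version_ field, and the HTTP 409 response on a version mismatch follow the reference guide section linked above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Hypothetical Solr location and collection name.
SOLR = "http://localhost:8983/solr/mycollection"


def get_url(doc_id):
    """URL for a real-time get that returns only the id and _version_."""
    query = urllib.parse.urlencode({"id": doc_id, "fl": "id,_version_"})
    return f"{SOLR}/get?{query}"


def versioned_update_body(doc, expected_version):
    """JSON body that Solr accepts only if the stored _version_ matches."""
    guarded = dict(doc)
    guarded["_version_"] = expected_version
    return json.dumps([guarded]).encode("utf-8")


def fetch_version(doc_id):
    """Return the current _version_ of the document, or None if absent."""
    with urllib.request.urlopen(get_url(doc_id)) as resp:
        doc = json.load(resp).get("doc")
    return doc["_version_"] if doc else None


def reindex_if_unchanged(doc, expected_version):
    """Send a full reindex; return False if someone else changed the doc."""
    request = urllib.request.Request(
        f"{SOLR}/update",
        data=versioned_update_body(doc, expected_version),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(request).close()
        return True
    except urllib.error.HTTPError as err:
        if err.code == 409:  # version conflict: the document was modified
            return False
        raise
```

Client A would call fetch_version() after each indexing run and store the result. When reindex_if_unchanged() returns False, client B has touched the document in the meantime, and client A can fall back to an atomic update instead of overwriting it.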
Thomas
Re: Atomic indexing as default indexing
Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
On 9/23/22 09:51, gnandre wrote:
> Is there a way to make atomic indexing default?
>
> Say, even if some clients send non-atomic indexing requests, it should get
> converted to atomic indexing requests on Solr end, is that possible?
>
> I am asking because we usually run into the following issue:
> 1. Client A is the major contributor of almost all the fields of a Solr
> document. This is non-atomic indexing.
> 2. Client B contributes some additional fields to the same document and
> does this with atomic indexing.
> 3. If Client A indexes again, the fields populated by Client B are wiped
> out.
>
> If we make all indexing atomic indexing on Solr end then we won't run into
> this problem (except in a rare case where Client A deletes the document
> then indexes it back, this is fine and we can deal with it because it is
> rare)
We would be surprising a LOT of users if we did that. Right now they
can simply reindex a document to delete fields that were indexed before
but shouldn't be there. If we made atomic indexing the default, we
would definitely get complaints about the fact that these fields did not
get removed.
And what about users that have a schema that is not appropriate for
atomic indexing? Quite a lot of users, me included, have fields that
are indexed but not stored and have no docValues. I can guarantee you
that if we made atomic indexing the default, users would assume that
all their existing fields will be preserved, and that might not be
the case.
It sounds like client A needs to be aware that a document might have
been changed after client A indexed it. Client A should check whether
a doc already exists, and if it does, switch its indexing to atomic.
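As an illustration of what switching to atomic could look like on the client side, the sketch below rewrites a full document into an atomic update that uses the "set" modifier for every field client A owns. The function name and the example field names are hypothetical. Fields that client A does not mention, such as those written by client B, are left untouched by such an update, provided the schema keeps them stored or in docValues.

```python
import json


def to_atomic_update(doc, id_field="id"):
    """Turn a full document into an atomic update using "set" modifiers.

    Hypothetical helper: the unique key is passed through unchanged and
    every other field is wrapped as {"set": value}.
    """
    atomic = {id_field: doc[id_field]}
    for name, value in doc.items():
        if name in (id_field, "_version_"):
            continue  # never wrap the key or the version field
        atomic[name] = {"set": value}
    return atomic


if __name__ == "__main__":
    full_doc = {"id": "doc1", "title": "Hello", "tags": ["a", "b"]}
    # The printed list is what client A would POST to the /update handler.
    print(json.dumps([to_atomic_update(full_doc)], indent=2))
```

With this approach, reindexing no longer removes fields that client A stops sending. To drop such a field, client A would have to send it explicitly with {"set": null}.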
It is extremely problematic to have one index be built by two different
entities in this way. Maybe instead you should have separate indexes
for each client and use Solr's join capability to combine the info from
both indexes into one result. Just be aware that Solr's join capability
will NOT do everything a relational database expert might expect.
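For illustration, a join across two indexes is expressed with the {!join} query parser. The sketch below only builds the query string; the index name client_b and the fields id and reviewed are made-up examples. Note that the results contain documents from the index being queried only, so the join behaves like a filter rather than a merge of fields from both sides.

```python
import urllib.parse


def join_query(from_field, to_field, from_index, inner_query):
    """Build a {!join} query against another index (hypothetical names)."""
    return (
        f"{{!join from={from_field} to={to_field} "
        f"fromIndex={from_index}}}{inner_query}"
    )


def select_query_string(q):
    """URL-encode the query for use with the /select handler."""
    return urllib.parse.urlencode({"q": q})


if __name__ == "__main__":
    # Documents in client A's index whose id matches a document in the
    # client_b index where reviewed is true.
    q = join_query("id", "id", "client_b", "reviewed:true")
    print(q)
    print(select_query_string(q))
```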
Thanks,
Shawn