Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2020/10/20 12:59:28 UTC

schema changes and reindexing and user's questions

I found myself yet again trying to explain on the user’s list that if you change your schema, you almost certainly have to delete your index and start over. Then a lightbulb went off and I said “Wow! Hasn’t someone written this up somewhere?” The ref guide page “Reindexing.adoc” seems like the right place for this. It already has a section outlining deleting the existing index; I think just a bit of rearranging would help.

I also found: https://cwiki.apache.org/confluence/display/SOLR/HowToReindex, which talks a lot about DataImportHandler. It’s also tagged “old wiki”. Should I raise a JIRA about that given DIH is being moved to a package?

Erick
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: schema changes and reindexing and user's questions

Posted by Cassandra Targett <ca...@gmail.com>.
If the old wiki page is old and out of date, just edit it to remove the old content and point people who might find the old page to the Ref Guide page. There’s no need for a Jira to edit the “old wiki”, just do it IMO. I mean, who’s attached to that content? No one is editing a lot of it, that’s for sure.

Every time someone suggests “fixing” the reindexing page, they’ve tended to want to repeat all the instructions for how to reindex at the top of the page, but I pretty much disagree with that approach - we need to explain WHY and WHEN before HOW. The issue isn’t that the info isn’t available IMO; the issue is that people don’t look for it because they have a mental model that schema = data dictionary, like a database. I mean, every time someone is surprised that changing a field type or properties of a field gave them all sorts of problems, it’s because they assumed changing the schema changes the index.
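
To make the "changing the schema does not change the index" point concrete, here is a rough toy model in Python. It is only a sketch under loud assumptions: ToyIndex, analyze_verbatim and analyze_lowercase are hypothetical names invented for illustration, and none of this is Solr's or Lucene's actual code or API. The one idea it tries to show is that terms are produced by the field's analysis at index time and stay as written until the documents are indexed again.

```python
# Toy model only: NOT Solr's or Lucene's implementation. All names are hypothetical.

def analyze_verbatim(text):
    """Stand-in for a 'string'-like field type: the whole value is one term."""
    return [text]

def analyze_lowercase(text):
    """Stand-in for a 'text'-like field type: split on whitespace, lowercase."""
    return [tok.lower() for tok in text.split()]

class ToyIndex:
    def __init__(self, analyzer):
        self.analyzer = analyzer  # plays the role of the schema's field type
        self.postings = {}        # term -> set of doc ids, written at index time

    def add(self, doc_id, text):
        # Terms are computed once, with whatever analyzer is current right now.
        for term in self.analyzer(text):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # Query terms come from the current analyzer; postings may be older.
        hits = [self.postings.get(term, set()) for term in self.analyzer(query)]
        return set.intersection(*hits) if hits else set()

    def change_schema(self, analyzer):
        # Only swaps the analyzer. Terms already in the index are untouched.
        self.analyzer = analyzer

    def reindex(self, docs):
        # Throw the old terms away and rebuild them from the source documents.
        self.postings = {}
        for doc_id, text in docs.items():
            self.add(doc_id, text)

docs = {1: "Apache Solr"}
index = ToyIndex(analyze_verbatim)
for doc_id, text in docs.items():
    index.add(doc_id, text)

before = index.search("Apache Solr")    # exact value matches the single stored term
index.change_schema(analyze_lowercase)
stale = index.search("solr")            # index still holds the old term "Apache Solr"
index.reindex(docs)
fresh = index.search("solr")            # terms rebuilt with the new analysis

print(before, stale, fresh)
```

In this sketch the search after change_schema finds nothing, because the query is analyzed the new way while the index still holds terms produced the old way; only the reindex brings the two back in line. That mismatch is the "all sorts of problems" case in miniature.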

It occurs to me now that what might be helpful is making sure all the docs about schema and fields and indexing mention that changes to the schema require reindexing, in order to encourage changing the mental model before they ever start adding any docs.

This is also, I think, why people want to get rid of schemaless mode, right? Because of this mismatch between expectations and reality.
On Oct 20, 2020, 7:59 AM -0500, Erick Erickson <er...@gmail.com>, wrote:
> I found myself yet again trying to explain on the user’s list that if you change your schema, you almost certainly have to delete your index and start over. Then a lightbulb went off and I said “Wow! Hasn’t someone written this up somewhere?” The ref guide page “Reindexing.adoc” seems like the right place for this. It already has a section outlining deleting the existing index; I think just a bit of rearranging would help.
>
> I also found: https://cwiki.apache.org/confluence/display/SOLR/HowToReindex, which talks a lot about DataImportHandler. It’s also tagged “old wiki”. Should I raise a JIRA about that given DIH is being moved to a package?
>
> Erick