Posted to solr-user@lucene.apache.org by Adam Estrada <es...@gmail.com> on 2010/12/03 04:52:04 UTC

Batch Update Fields

OK part 2 of my previous question...

Is there a way to batch update field values based on certain criteria? For example, if thousands of documents have a field value of 'US', can I update all of them to 'United States' programmatically?

Adam

Re: Batch Update Fields

Posted by Markus Jelsma <ma...@openindex.io>.

On Friday 03 December 2010 18:20:44 Adam Estrada wrote:
> I wonder...I know that sed would work to find and replace the terms in all
> of the csv files that I am indexing but would it work to find and replace
> key terms in the index?

It'll most likely corrupt your index. The index files are binary, so offsets,
positions, etc. won't have the proper meaning anymore.
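
To illustrate the point, here is a toy sketch in Python. It does NOT use Lucene's real file format; the length-prefixed layout below is a made-up stand-in, and `encode`/`decode` are hypothetical helpers. It only shows what tends to happen when a byte-level find-and-replace changes the length of a value inside a binary file whose reader relies on stored lengths.

```python
import struct


def encode(terms):
    """Pack each term as a 2-byte big-endian length followed by its UTF-8 bytes."""
    out = bytearray()
    for term in terms:
        raw = term.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw
    return bytes(out)


def decode(blob):
    """Read terms back, trusting the stored length prefixes."""
    terms, pos = [], 0
    while pos < len(blob):
        (length,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        terms.append(blob[pos:pos + length].decode("utf-8", errors="replace"))
        pos += length
    return terms


blob = encode(["AF", "US"])

# What sed effectively does: swap bytes without touching the length prefix.
patched = blob.replace(b"AF", b"AFGHANISTAN")

intact = decode(blob)      # the original terms come back unchanged
broken = decode(patched)   # the reader still believes the first term is 2 bytes long
```

After the patch, the reader still takes only two bytes for the first term and then interprets the leftover letters as the next length prefix, so everything after that point is misread.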

> find C:\\tmp\\index\\data -type f -exec sed -i 's/AF/AFGHANISTAN/g' {} \;
> 
> That command would iterate through all the files in the data directory and
> replace the country code with the full country name. I may just back up
> the directory and try it. I have it running on csv files right now and
> it's working wonderfully. For those of you interested, I am indexing the
> entire Geonames dataset http://download.geonames.org/export/dump/
> (allCountries.zip) which gives me a pretty comprehensive world gazetteer.
> My next step is gonna be to display the results as KML to view over a
> google globe.
> 
> Thoughts?
> 
> Adam

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Re: Batch Update Fields

Posted by Adam Estrada <es...@gmail.com>.
OK so the way I understand this is that if there is a synonym on a specific
field at index time, that value will be stored rather than the one in the
csv that I am indexing? I will give it a whirl and report back...

Thanks!
Adam

On Sat, Dec 4, 2010 at 2:27 PM, Erick Erickson <er...@gmail.com> wrote:

> When you define your fieldType at index time. My idea
> was that you substitute these on the way into your
> index. You may need a specific field type just for your
> country conversion.... Perhaps in a copyField if
> you need both the code and full name....
>
> Best
> Erick

Re: Batch Update Fields

Posted by Erick Erickson <er...@gmail.com>.
You set that up when you define your fieldType, as part of its
index-time analysis. My idea was that you substitute these on the
way into your index. You may need a specific field type just for
your country conversion, perhaps fed by a copyField if you need
both the code and the full name.
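
A sketch of what that could look like in schema.xml. The field names, the type name and the synonyms file name are hypothetical, and the attributes are worth checking against your Solr version. The `tokenizerFactory` attribute on the synonym filter is there so that multi-word names such as "AMERICAN SAMOA" are kept as one token rather than split on whitespace.

```xml
<!-- schema.xml (sketch): names below are illustrative, not from the thread -->
<fieldType name="country_name" class="solr.TextField">
  <analyzer type="index">
    <!-- Treat the whole field value (the country code) as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- country_synonyms.txt holds lines such as: AF => AFGHANISTAN -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="country_synonyms.txt"
            ignoreCase="false" expand="false"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

<field name="country_code" type="string" indexed="true" stored="true"/>
<field name="country" type="country_name" indexed="true" stored="true"/>
<copyField source="country_code" dest="country"/>
```

Note that analysis only changes the indexed terms. The stored value of `country` would still be the raw code, but facet counts are built from the indexed terms, so faceting on `country` should show the full names.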

Best
Erick

On Sat, Dec 4, 2010 at 12:16 PM, Adam Estrada <estrada.adam.groups@gmail.com> wrote:

> Synonyms eh? I have a synonym list like the following, so how do I apply
> the synonyms to a specific field? The only place the field is used is as a
> facet.
>
> original field => country name
>
> AF => AFGHANISTAN
> AX => ÅLAND ISLANDS
> AL => ALBANIA
> DZ => ALGERIA
> AS => AMERICAN SAMOA
> AD => ANDORRA
> AO => ANGOLA
> AI => ANGUILLA
> AQ => ANTARCTICA
> AG => ANTIGUA AND BARBUDA
> AR => ARGENTINA
> AM => ARMENIA
> AW => ARUBA
> AU => AUSTRALIA
> AT => AUSTRIA
> etc...
>
> Any advice on that would be great and very much appreciated!
>
> Adam

Re: Batch Update Fields

Posted by Adam Estrada <es...@gmail.com>.
Synonyms eh? I have a synonym list like the following, so how do I apply
the synonyms to a specific field? The only place the field is used is as a
facet.

original field => country name

AF => AFGHANISTAN
AX => ÅLAND ISLANDS
AL => ALBANIA
DZ => ALGERIA
AS => AMERICAN SAMOA
AD => ANDORRA
AO => ANGOLA
AI => ANGUILLA
AQ => ANTARCTICA
AG => ANTIGUA AND BARBUDA
AR => ARGENTINA
AM => ARMENIA
AW => ARUBA
AU => AUSTRALIA
AT => AUSTRIA
etc...

Any advice on that would be great and very much appreciated!

Adam

On Fri, Dec 3, 2010 at 3:55 PM, Erick Erickson <er...@gmail.com> wrote:

> That will certainly work. Another option, assuming the country codes are
> in their own field would be to put the transformations into a synonym file
> that was only used on that field. That way you'd get this without having
> to do the pre-process step of the raw data...
>
> That said, if your pre-processing is working for you, it may not be worth
> your while to worry about doing it differently.
>
> Best
> Erick

Re: Batch Update Fields

Posted by Erick Erickson <er...@gmail.com>.
That will certainly work. Another option, assuming the country codes are
in their own field, would be to put the transformations into a synonym file
that is only used on that field. That way you'd get this without having
to do the pre-processing step on the raw data...

That said, if your pre-processing is working for you, it may not be worth
your while to worry about doing it differently.

Best
Erick

On Fri, Dec 3, 2010 at 12:51 PM, Adam Estrada <estrada.adam.groups@gmail.com> wrote:

> First off...I know enough about Solr to be VERY dangerous so please bear
> with me ;-) I am indexing the GeoNames database, which only provides country
> codes. I can facet on the codes, but to an end user who may not know all 249
> codes, that isn't really all that helpful. Therefore, I want to map the
> country codes provided in the GeoNames db to the full country names.
> http://download.geonames.org/export/dump/
>
> I used a simple split function to chop the 850 MB txt file into
> manageable CSVs that I can import into Solr. Now that all 7 million+
> documents are in there, I want to change the country codes to the actual
> country names. I would have liked to have done it in the index, but
> finding and replacing the strings in the CSVs seems to be working fine.
> After that I can just reindex the entire thing.
>
> Adam

Re: Batch Update Fields

Posted by Adam Estrada <es...@gmail.com>.
First off...I know enough about Solr to be VERY dangerous so please bear
with me ;-) I am indexing the GeoNames database, which only provides country
codes. I can facet on the codes, but to an end user who may not know all 249
codes, that isn't really all that helpful. Therefore, I want to map the
country codes provided in the GeoNames db to the full country names.
http://download.geonames.org/export/dump/

I used a simple split function to chop the 850 MB txt file into
manageable CSVs that I can import into Solr. Now that all 7 million+
documents are in there, I want to change the country codes to the actual
country names. I would have liked to have done it in the index, but
finding and replacing the strings in the CSVs seems to be working fine.
After that I can just reindex the entire thing.
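
One caveat with a global find-and-replace on the CSVs is that a pattern like 's/AF/AFGHANISTAN/g' also rewrites "AF" wherever it appears inside other values. A minimal Python sketch of a column-aware alternative follows. The function name, the column position and the sample rows are assumptions for illustration (the real column depends on how the files were split); the three mappings are taken from the synonym list posted elsewhere in this thread.

```python
import csv
import io

# A few entries from the code-to-name list posted in this thread.
COUNTRY_NAMES = {
    "AF": "AFGHANISTAN",
    "AX": "ÅLAND ISLANDS",
    "AL": "ALBANIA",
}


def expand_country_codes(src, dst, column, names, delimiter=","):
    """Copy CSV rows from src to dst, replacing the code in one column only.

    Codes missing from `names` are left as they are, and no other
    column is touched.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        if column < len(row):
            row[column] = names.get(row[column], row[column])
        writer.writerow(row)


# Usage with in-memory data; real use would pass open file objects instead.
src = io.StringIO("1,KABUL AFB,AF\n2,Tirana,AL\n3,Paris,FR\n")
dst = io.StringIO()
expand_country_codes(src, dst, column=2, names=COUNTRY_NAMES)
result = dst.getvalue()
```

Here "KABUL AFB" stays untouched because only the third column is rewritten, whereas the sed pattern would turn it into "KABUL AFGHANISTANB".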

Adam

On Fri, Dec 3, 2010 at 12:42 PM, Erick Erickson <er...@gmail.com> wrote:

> Have you considered defining synonyms for your code <-> country
> conversion at index time (or query time for that matter)?
>
> We may have an XY problem here. Could you state the high-level
> problem you're trying to solve? Maybe there's a better solution...
>
> Best
> Erick

Re: Batch Update Fields

Posted by Erick Erickson <er...@gmail.com>.
Have you considered defining synonyms for your code <-> country
conversion at index time (or query time for that matter)?

We may have an XY problem here. Could you state the high-level
problem you're trying to solve? Maybe there's a better solution...

Best
Erick

On Fri, Dec 3, 2010 at 12:20 PM, Adam Estrada <estrada.adam.groups@gmail.com> wrote:

> I wonder...I know that sed would work to find and replace the terms in all
> of the csv files that I am indexing but would it work to find and replace
> key terms in the index?
>
> find C:\\tmp\\index\\data -type f -exec sed -i 's/AF/AFGHANISTAN/g' {} \;
>
> That command would iterate through all the files in the data directory and
> replace the country code with the full country name. I may just back up
> the directory and try it. I have it running on csv files right now and it's
> working wonderfully. For those of you interested, I am indexing the entire
> Geonames dataset http://download.geonames.org/export/dump/ (allCountries.zip)
> which gives me a pretty comprehensive world gazetteer. My next step is
> gonna be to display the results as KML to view over a google globe.
>
> Thoughts?
>
> Adam

Re: Batch Update Fields

Posted by Adam Estrada <es...@gmail.com>.
I wonder...I know that sed would work to find and replace the terms in all
of the csv files that I am indexing but would it work to find and replace
key terms in the index?

find C:\\tmp\\index\\data -type f -exec sed -i 's/AF/AFGHANISTAN/g' {} \;

That command would iterate through all the files in the data directory and
replace the country code with the full country name. I may just back up the
directory and try it. I have it running on csv files right now and it's
working wonderfully. For those of you interested, I am indexing the entire
Geonames dataset http://download.geonames.org/export/dump/ (allCountries.zip)
which gives me a pretty comprehensive world gazetteer. My next step is gonna
be to display the results as KML to view over a google globe.

Thoughts?

Adam

On Fri, Dec 3, 2010 at 7:57 AM, Erick Erickson <er...@gmail.com> wrote:

> No, there's no equivalent to SQL update for all values in a column. You'll
> have to reindex all the documents.

Re: Batch Update Fields

Posted by Erick Erickson <er...@gmail.com>.
No, there's no equivalent to SQL update for all values in a column. You'll
have to reindex all the documents.
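
If the original source data is not at hand, a reindex can sometimes be driven from the index itself, but only if every field is stored so the full documents can be read back. A rough Python sketch of the rewrite step is below. The helper names and sample documents are hypothetical, and fetching the matching documents (for example with a query such as country:US) and posting the result to Solr's update handler are left out. The `<add><doc><field name="...">` layout is Solr's XML update message; re-adding a document with the same unique key replaces the old one.

```python
import xml.etree.ElementTree as ET


def rewrite_field(docs, field, old, new):
    """Return modified copies of the documents whose `field` equals `old`."""
    updated = []
    for doc in docs:
        if doc.get(field) == old:
            changed = dict(doc)  # copy so the caller's data is not mutated
            changed[field] = new
            updated.append(changed)
    return updated


def to_add_xml(docs):
    """Serialize documents as a Solr XML <add> message.

    Every field of every document must be present, because Solr
    replaces the whole document rather than a single field.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            values = value if isinstance(value, list) else [value]
            for item in values:
                field_el = ET.SubElement(doc_el, "field", name=name)
                field_el.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Hypothetical documents as they might come back from a query.
docs = [
    {"id": "1", "name": "Springfield", "country": "US"},
    {"id": "2", "name": "Paris", "country": "FR"},
]
updated = rewrite_field(docs, "country", "US", "United States")
payload = to_add_xml(updated)
```

The payload would then be posted to the update handler and followed by a commit. With millions of documents it is probably worth sending it in batches.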

On Thu, Dec 2, 2010 at 10:52 PM, Adam Estrada <estrada.adam.groups@gmail.com> wrote:

> OK part 2 of my previous question...
>
> Is there a way to batch update field values based on certain criteria?
> For example, if thousands of documents have a field value of 'US' can I
> update all of them to 'United States' programmatically?
>
> Adam

Re: Batch Update Fields

Posted by Markus Jelsma <ma...@openindex.io>.
You must reindex the complete document, even if you just want to update a 
single field.

On Friday 03 December 2010 04:52:04 Adam Estrada wrote:
> OK part 2 of my previous question...
> 
> Is there a way to batch update field values based on certain criteria?
> For example, if thousands of documents have a field value of 'US' can I
> update all of them to 'United States' programmatically?
> 
> Adam

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350