Posted to solr-user@lucene.apache.org by Utkarsh Sengar <ut...@gmail.com> on 2013/09/16 20:47:35 UTC

Dynamic row sizing for documents via UpdateCSV

Hello,

I am using UpdateCSV to load data into Solr.

Currently I load data with a static set of columns:
userid,name,age,location
john8322,John,32,CA
tom22,Tom,30,NY


But now I have a use case where john8322 might have a state-specific
dynamic field, for example:
userid,name,age,location, ca_count_i
john8322,John,32,CA, 7

And tom22 might have different dynamic fields:
userid,name,age,location, ny_count_i,oh_count_i
tom22,Tom,30,NY, 981,11

So is it possible to pass a different number of columns for each row,
something like this:
john8322,John,32,CA,ca_count_i:7
tom22,Tom,30,NY, ny_count_i:981,oh_count_i:11

I understand that the above syntax is not possible, but is there any other
way of solving this problem?

-- 
Thanks,
-Utkarsh

Re: Dynamic row sizing for documents via UpdateCSV

Posted by Utkarsh Sengar <ut...@gmail.com>.
Yeah, I think the only way to go about it is via SolrJ. The CSV file is
generated by a Pig job which computes the data to be loaded into Solr.
I think this is what I will end up doing: load all the possible columns in
the CSV, with a value of 0 if the value doesn't exist for a specific record.
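
A minimal sketch of that padding step, written in Python purely for illustration (the real job is in Pig). The names `pad_rows` and `BASE_FIELDS` are made up for this example, and the records are the two sample users from this thread:

```python
import csv
import io

# Fixed columns that every record has (taken from the sample data in this thread).
BASE_FIELDS = ["userid", "name", "age", "location"]

def pad_rows(records, fill="0"):
    """Return CSV text in which every row carries every column seen in any record."""
    # Collect the dynamic field names across all records; sort for a stable header order.
    dynamic = sorted({key for rec in records for key in rec if key not in BASE_FIELDS})
    header = BASE_FIELDS + dynamic
    out = io.StringIO()
    # restval is written for any header column missing from a given record.
    writer = csv.DictWriter(out, fieldnames=header, restval=fill, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        writer.writerow(rec)
    return out.getvalue()

records = [
    {"userid": "john8322", "name": "John", "age": 32, "location": "CA",
     "ca_count_i": 7},
    {"userid": "tom22", "name": "Tom", "age": 30, "location": "NY",
     "ny_count_i": 981, "oh_count_i": 11},
]

print(pad_rows(records))
```

Calling `pad_rows(records, fill="")` would leave the missing cells empty instead of writing 0; which of the two is preferable depends on how the fields are queried later.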

I was just trying to avoid it and find an optimal solution with UpdateCSV.

Thanks,
-Utkarsh


On Tue, Sep 17, 2013 at 5:43 AM, Erick Erickson <er...@gmail.com> wrote:

> Well, it's reasonably easy if you have empty columns, in the same
> order, for _all_ of the possible dynamic fields, but I really doubt
> you are that fortunate... It's especially ugly in that you have the
> different dynamic fields scattered around.
>
> How is the csv file generated? Could you force every row to have
> _all_ the possible columns in the same order with spaces or something
> in the columns that are empty?
>
> Otherwise I'd think about parsing them externally and using, say, SolrJ
> to transmit the individual records to Solr.
>
> Best,
> Erick

Re: Dynamic row sizing for documents via UpdateCSV

Posted by Erick Erickson <er...@gmail.com>.
Well, it's reasonably easy if you have empty columns, in the same
order, for _all_ of the possible dynamic fields, but I really doubt
you are that fortunate... It's especially ugly in that you have the
different dynamic fields scattered around.

How is the csv file generated? Could you force every row to have
_all_ the possible columns in the same order with spaces or something
in the columns that are empty?

Otherwise I'd think about parsing them externally and using, say, SolrJ
to transmit the individual records to Solr.
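
As a rough illustration of the external-parsing route, here is a sketch in Python rather than SolrJ. It assumes the hypothetical `name:value` row syntax from the original question, and `row_to_doc` and `BASE_FIELDS` are names invented for this example. It only builds one document per row; actually sending the documents (through SolrJ, or as JSON to an update handler, depending on what your Solr version supports) is left out:

```python
import json

# Fixed leading columns, in the order used by the sample rows in this thread.
BASE_FIELDS = ["userid", "name", "age", "location"]

def row_to_doc(line):
    """Turn 'john8322,John,32,CA,ca_count_i:7' into a dict with one key per field."""
    # Naive split: assumes no quoted commas inside values.
    cells = [cell.strip() for cell in line.split(",")]
    doc = dict(zip(BASE_FIELDS, cells[:len(BASE_FIELDS)]))
    doc["age"] = int(doc["age"])
    # Every remaining cell is expected to look like 'fieldname:value'.
    for cell in cells[len(BASE_FIELDS):]:
        name, _, value = cell.partition(":")
        doc[name] = int(value)  # the *_count_i fields in the examples hold integers
    return doc

lines = [
    "john8322,John,32,CA,ca_count_i:7",
    "tom22,Tom,30,NY, ny_count_i:981,oh_count_i:11",
]

docs = [row_to_doc(line) for line in lines]
payload = json.dumps(docs)  # one JSON list holding all documents
print(payload)
```

Each document carries only the dynamic fields that its row actually has, so no padding columns are needed.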

Best,
Erick

