You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-dev@lucene.apache.org by "Ryan McKinley (JIRA)" <ji...@apache.org> on 2009/04/28 01:03:30 UTC

[jira] Created: (SOLR-1131) Allow a single field to index multiple fields

Allow a single field to index multiple fields
---------------------------------------------

                 Key: SOLR-1131
                 URL: https://issues.apache.org/jira/browse/SOLR-1131
             Project: Solr
          Issue Type: New Feature
          Components: Analysis
            Reporter: Ryan McKinley


In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
 * geohash (sincle lucene field)
 * lat field, lon field (two double fields)
 * cartesian tiers (a series of fields with tokens to say if it exists within that region)



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792864#action_12792864 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}I imagine 99% of the users of Point will be those wanting straight 2D geo search (limit by simple distance, or sort by simple distance).
I don't see how elevation would help in isolation since more accurate distance measures would presumably need map info for driving distance.
I could see people wanting to store 3D points, etc, but they can do that today.{quote}

But it's not like the code is any faster or more complicated and neither is the interface the user sees (I'm sure most people know what a point is).  If I have a generic Point, now I have a Vector.  With a Vector, I can do all kinds of interesting things.  But rather than fret over it, we can have both.  LatLonFieldType and PointFieldType.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794413#action_12794413 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Committed revision 893746.

Leaving open for a little while to deal with any side effects.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Added Chris's comments.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703545#action_12703545 ] 

Erik Hatcher commented on SOLR-1131:
------------------------------------

Perhaps this is a case where the TeeTokenFilter and friends can come into play in Solr finally?

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>         Attachments: SOLR-1131-IndexMultipleFields.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789240#action_12789240 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}
Quick comment based on a spot check of the changes to IndexSchema: rather than make polyField special somehow w.r.t IndexSchema, and add a FieldType.getPolyFieldNames, etc, I had been thinking more along the lines of having an IndexSchema.registerDynamicFieldDefinition - just like the existing registerDynamicCopyField. This would (optionally) allow any field type to add other definitions to the IndexSchema. I continue to think it would be good to stay away of special logic for "polyfields" in the IndexSchema.
{quote}

So, then the FieldType would register it's Dynamic Fields in it's own init() method by calling this method?  That can work.

bq. Why use Dynamic Field array

The array is sorted and array access is much faster and we often have to loop over it to look it up.

{quote}
CoordinateFieldType: why process > 1 sub field types and then throw an exception at the end? I cleaned this up to throw the Exception when it occurs.
{quote}

OK.  Actually, this should just be in the derived class, as it may be the case some other CoordinateFieldType has multiple sub types.

{quote}
# parsePoint in DistanceUtils, why use ',' as the separator - use ' ' (at least conforms to georss point then). I guess because you are supporting N-dimensional points, right?
# parsePoint - instead of complicated isolation loops, why not just use trim()? I've taken that approach in the patch I've attached.
{quote}

I think comma makes sense.  As for the optimization stuff, I agree w/ Yonik, this is code that will be called a lot.

    * when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.

{quote}

It is logged, but for the poly fields, if the dyn field is already defined, that's just fine.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788307#action_12788307 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Note, I don't think the distance function queries will work w/ my patch yet.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Added some range tests.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792052#action_12792052 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Hey Grant:

Let me give this a quick review. Won't take longer than 20 mins. Thanks for pushing forward on this.

Cheers,
Chris


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789003#action_12789003 ] 

Chris A. Mattmann edited comment on SOLR-1131 at 12/10/09 11:08 PM:
--------------------------------------------------------------------

Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

# Why use 
{code}
private DynamicField[] dynamicFields;
{code}

instead of 
{code}
List<DynamicField>
{code}
or 
{code}
Collection<DynamicField>
{code}

in IndexSchema?

# There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the existing patches. The final patch probably shouldn't include those since it makes it difficult to understand what was actually changed.
# IndexSchema:
#* when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.
# IndexSchema:
#* what happens if subs.isEmpty() == true?
#* maybe log message that says, dyn field definition is up to you?
#* I took the approach in my attached patch to log it.
# Why does getPolyFieldType(String) throw an exception if the field is not a poly field type -- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that one?). In the patch I've attached, I took the approach of only including a getPolyFieldType that returns null rather than throwing an ex (the NoEx version).
# CoordinateFieldType: why process > 1 sub field types and then throw an exception at the end? I cleaned this up to throw the Exception when it occurs.
# parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms to georss point then). I guess because you are supporting N-dimensional points, right?
# parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed yet. Hopefully I'll get a chance to put that up later tonight.



      was (Author: chrismattmann):
    Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

1. Why use 
{code}
private DynamicField[] dynamicFields;
{code}

instead of 
{code}
List<DynamicField>
{code}
or 
{code}
Collection<DynamicField>
{code}

in IndexSchema?

2. There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the existing patches. The final patch probably shouldn't include those since it makes it difficult to understand what was actually changed.
3. IndexSchema:
    - when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.
4. IndexSchema:
    - what happens if subs.isEmpty() == true?
    - maybe log message that says, dyn field definition is up to you?
    - I took the approach in my attached patch to log it.
5. Why does getPolyFieldType(String) throw an exception if the field is not a poly field type -- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that one?). In the patch I've attached, I took the approach of only including a getPolyFieldType that returns null rather than throwing an ex (the NoEx version).
6. CoordinateFieldType: why process > 1 sub field types and then throw an exception at the end? I cleaned this up to throw the Exception when it occurs.
7. parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms to georss point then). I guess because you are supporting N-dimensional points, right?
8. parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed yet. Hopefully I'll get a chance to put that up later tonight.


  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790720#action_12790720 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

bq.Because there is no way to change the name on a SchemaField w/o changing SchemaField to be non-final. I don't think SchemaField should be non-final.

Even if SchemaField is final we can precreate and cache the SchemaField objects because the properties of the synthetic field is known in advance. For instance, if you have a dimension of 2 ,the PointType instance will always have 2 well known synthetic names and types that can be created well in advance and they can be reused

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780956#action_12780956 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

I definitely agree, Chris, the interesting part is how that manifests itself in terms of implementation, which is where I am digging in at the moment.  It means the Query parsers need to handle it as well as the ResponseWriters, etc.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794445#action_12794445 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

OK, I thought I had got them all, but feel free to commit as you see fit.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786586#action_12786586 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Hey Yonik,

One of the things I was debating was whether it was worthwhile to keep the single field creation or not.  I see in your patch you drop it.  I've got a patch that keeps it.  I will try to put it up this week.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792725#action_12792725 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Actually, we should probably move the creation of the actual PointType off to SOLR-1586, anyway.  The reason I want n-dimensional is b/c this stuff is useful for more than just traditional lat/lon.  

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789354#action_12789354 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

bq. Maybe. It took me all of 30 seconds to figure out what it was doing. I'll put some comments on it. While readability is important, Solr's goal is not to make a product that a CS101 grad can read, it's too build a blazing fast search server. That call could hit millions of times when indexing points.

Sure, maybe the goal isn't for a CS101 grad to be able to easily understand the code, but SOLR's goal should include being open to ideas from the community that involve reusing standard Java library functions and not rewriting based on perception of speed and memory without empirical proof. Where's the evidence that using things like split and trim are so much more costly than rewriting those basic capabilities that they warrant not using them?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788113#action_12788113 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

Some minor nits about tests in general that I've also noticed in the past:
IMO, unit tests can be too low level.  They can also be too fragile.

The test below is a pain to maintain... it essentially means you can't change the schema at all w/o breaking the test.
+    Map<String,FieldType> polyFs = schema.getPolyFieldTypes();
+    assertNotNull(polyFs);
+    assertTrue(polyFs.size() == 3);

I also prefer testing low level *behavior* as opposed to testing low level implementation.
It would be nice, for example, if testPointFieldType indexed a few couments (with various combinations of stored / indexed)  and then queried the index, with our high level xpath validation code to test that the field was correctly matched and had stored, or had not stored the value.

Rendundant null checks, trivial strings, etc:
+    assertNotNull("topDocs is null", topDocs);
+    assertTrue(topDocs.totalHits + " does not equal: " + 1, topDocs.totalHits == 1)

Things like the above can be replaced with the much more concise and readable:
assertEquals(1, topDocs.totalHits)

But really, stuff like this:
+ TopDocs topDocs = core.getSearcher().get().search(bq, 1);

Should normally use the higher level search and xpath validate  functionallity.  The code above actually leads to a refcount leak.

schema.xml: geo will be core... let's not add a new/different schema file in tests for this and simply add it to the latest schema12


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790237#action_12790237 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

Given the concerns of some people around automatically registering a dynamic field, perhaps we should optionally allow a subFieldSuffix to be passed instead of subFieldTypes.  

minor nit: why is "subFieldTypes" plural?  We're only specifying a single type, right?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788103#action_12788103 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. 1. Why is TextField overriding getFieldQuery when it isn't called, except possibly via the FieldQParserPlugin?

As you point out, it is called by FieldQParserPlugin, and it's a move to make things a bit more orthogonal - with a little more work it could even be used by the SolrQueryParser for text field types as well.  It also opened the (expert) possibility of creating a new TextField type that handled things a bit differently.

bq. 2. I'm not sure I understand the getDistance, getBoundingBox methods on the GeoFieldType. It seems like that precludes one from picking a specific distance (for instance, some times you may want a faster approx. and others a slower more accurate calculation)

This decision will often be made for the user by the choice of field-types.  End users and app clients should be able to specify something like a bounding box filter and get the most performant implementation w/o having to know if it resolves to range queries, cartesian grids, or whatever.

fq={!gbox point=110,220 r=1.5}   #specify a point and a radius

This does not necessarily preclude users from calling exact functions if they know they are supported for that field type.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791789#action_12791789 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

I guess we need to revamp the API.

The FieldType should act as a factory of SchemaField. And SchemaField does not have to be a final class. Solr Should do all the operations through that SchemaField

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788164#action_12788164 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

Perhaps we should return to the design level a bit instead of me reading code and maybe making mistakes and trying to infer intent.

Assume we have this:
  <fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/>
  <field name="home" type="xy" indexed="true" stored="true"/>

What are the exact field names that are indexed?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792751#action_12792751 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

{quote}
Actually, we should probably move the creation of the actual PointType off to SOLR-1586, anyway. The reason I want n-dimensional is b/c this stuff is useful for more than just traditional lat/lon. 
{quote}

+1, I was going to suggest the same thing. BTW, the geohash field type is ready when you have a chance to take a look.

Cheers,
Chris


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793699#action_12793699 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Note, I also changed PointValueSource to be ToMultiValueSource, as it really isn't just a point.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792495#action_12792495 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

it is possible to cache the objects in FieldType as 
{code:java}
private final Map<SchemaField, List<SchemaField>> subFields;
{code}

We can lazily initialize this Map for each SchemaField .

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788119#action_12788119 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

Regarding polyfields... it's not clear why they are special enough to have to change the IndexSchema? (IndexSchema.isPolyField, getPolyField, getPolyFieldType, getPolyFieldTypeNoEx, etc).  Can't we just store them as normal field types?


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794842#action_12794842 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Changed toMultiVS to vector(): Committed revision 894183. 

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Cleaned up the field creation a bit, more documentation in the example.  I think this is ready to go.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706438#action_12706438 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. FieldType#getAnalyzer() - I guess the same analyzer would have to apply to every field?

Fields in Lucene can now be pre-analyzed (which this was available when Solr was first being developed!)
See Field constructors that take a TokenStream..

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>         Attachments: SOLR-1131-IndexMultipleFields.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789003#action_12789003 ] 

Chris A. Mattmann edited comment on SOLR-1131 at 12/10/09 11:08 PM:
--------------------------------------------------------------------

Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

# Why use 
{code}
private DynamicField[] dynamicFields;
{code} instead of 
{code}
List<DynamicField>
{code} or 
{code}
Collection<DynamicField>
{code} in IndexSchema?
# There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the existing patches. The final patch probably shouldn't include those since it makes it difficult to understand what was actually changed.
# IndexSchema:
#* when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.
# IndexSchema:
#* what happens if subs.isEmpty() == true?
#* maybe log message that says, dyn field definition is up to you?
#* I took the approach in my attached patch to log it.
# Why does getPolyFieldType(String) throw an exception if the field is not a poly field type -- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that one?). In the patch I've attached, I took the approach of only including a getPolyFieldType that returns null rather than throwing an ex (the NoEx version).
# CoordinateFieldType: why process > 1 sub field types and then throw an exception at the end? I cleaned this up to throw the Exception when it occurs.
# parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms to georss point then). I guess because you are supporting N-dimensional points, right?
# parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed yet. Hopefully I'll get a chance to put that up later tonight.



      was (Author: chrismattmann):
    Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

# Why use 
{code}
private DynamicField[] dynamicFields;
{code}

instead of 
{code}
List<DynamicField>
{code}
or 
{code}
Collection<DynamicField>
{code}

in IndexSchema?

# There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the existing patches. The final patch probably shouldn't include those since it makes it difficult to understand what was actually changed.
# IndexSchema:
#* when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.
# IndexSchema:
#* what happens if subs.isEmpty() == true?
#* maybe log message that says, dyn field definition is up to you?
#* I took the approach in my attached patch to log it.
# Why does getPolyFieldType(String) throw an exception if the field is not a poly field type -- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that one?). In the patch I've attached, I took the approach of only including a getPolyFieldType that returns null rather than throwing an ex (the NoEx version).
# CoordinateFieldType: why process > 1 sub field types and then throw an exception at the end? I cleaned this up to throw the Exception when it occurs.
# parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms to georss point then). I guess because you are supporting N-dimensional points, right?
# parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed yet. Hopefully I'll get a chance to put that up later tonight.


  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793482#action_12793482 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

> >    It's more of an issue of directly creating fields... that's currently up to the schema 
> I'm not following. Is the problem in:
[...]
No, I meant directly creating SchemaFields doesn't seem great.  If we put a cache in, it would bypass that too.

bq. This seems a bit weird (even though I understand why) due to the fact that the other createField methods actually return Field and not Fieldable. 

If it weren't for back compat, we would have already changed createField to return Fieldable (and I think there's a SOLR issue somewhere that either does this or depends on it).  Fields are a lot more limiting than Fieldables (but createField was in Solr before there even was a Fieldable).

if SOLR-1664 goes ahead it might be a natural place to make everything in SchemaField deal in Fieldabes?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792055#action_12792055 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

+1. Looks good, Grant. Let's get this sucker committed...

Thanks!

Cheers,
Chris


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791665#action_12791665 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

I have a new patch in the works that makes creating the SchemaField lighter weight.  I agree w/ Yonik, I don't think this can be cached in general.  Also, I've done away with the Delegating Field Type.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789342#action_12789342 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

bq. Speed and memory. 

Unfortunately, it's at the cost of readability and maintainability. 

bq. As for logging, that code is all going away in the next patch, I think

I'll take a look when you throw it up, thanks.

Cheers,
Chris



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792448#action_12792448 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. Are you sure that we want to create a new String for every query/field creation?

I don't see anyway around it.  The caching doesn't work b/c there may be field types where the number of SchemaFields may not be known. 

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-1131:
-------------------------------

    Attachment: SOLR-1131.patch

Here's a completely untested prototype patch along the lines of how I was thinking this would work with geo.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790357#action_12790357 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. That's inherited from a more general mechanism that would allow multiple field types.

But multiple sub-field types would be used for different things, and should hence be separate parameters?  So if a field type indexed points separately and indexed a field that contains a list of cartesian tiers, I could see the params being

<fieldType ..., coordinateType="..."  cartesianTierType="..."/>
(or coordinateSuffix / cartesianTierSuffix)

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793521#action_12793521 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

bq.if SOLR-1664 goes ahead it might be a natural place to make everything in SchemaField deal in Fieldabes?

What stops us from resolving SOLR-1664 ? 

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-1131:
--------------------------------

    Attachment: SOLR-1131-IndexMultipleFields.patch

This is a sketch to see how things look if we have SchemaField/FieldType return Field[] rather then Field:

{code:java}
+  /**
+   * @deprecated use {@link #createFields(String, float)}
+   */
   public Field createField(String val, float boost) {
     return type.createField(this,val,boost);
   }
+  
+  public Field[] createFields(String val, float boost) {
+    return type.createFields(this,val,boost);
+  }
{code}

I think this could work -- this would let FieldType#createFields() return a list of fields.  The issues i see are:
 * indexing each field adds a for loop (maybe not a big deal?)
 * FieldType#toInternal() may or may not relate to the actual indexed value
 * FieldType#getAnalyzer() -- I guess the same analyzer would have to apply to every field?  I'm not sure what the implication is on this.
 * what about #getRangeQuery()

 

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>         Attachments: SOLR-1131-IndexMultipleFields.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786894#action_12786894 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. One of the things I was debating was whether it was worthwhile to keep the single field creation or not. I see in your patch you drop it.

No, I kept it.  And I borrowed the isPolyField() from your email thread.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Missing an & in DistanceUtils.parsePoint

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788805#action_12788805 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. DocValues.getPoint(double[] point)

OK, let me see how that plays out.

See also http://www.lucidimagination.com/search/document/fd804bcd78d7bec1/solr_1131_poly_fields_and_valuesource

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793313#action_12793313 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. It's more of an issue of directly creating fields... that's currently up to the schema 

I'm not following.  Is the problem in:
{code}
@Override
  public Fieldable[] createFields(SchemaField field, String externalVal, float boost) {
    String[] point = DistanceUtils.parsePoint(null, externalVal, dimension);
    return createFields(field, dynFieldProps, subType, externalVal, boost, point);
  }
{code}



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Updated to include some of Yonik's concerns about implementation.  I didn't get his Javadoc changes b/c they were too hard to determine the differences.

Other notes:
# Removed PlusMinusField and just used the PointType for those tests.  This then allowed me to move the registerPoly static method to the CoordinateFieldType
# I kept the n-dimensional point.  All of our distances work on vectors, I see no reason not to keep them.  Performance wise, most people w/ dimension of 2 or 3 will see little if any difference between this and specifically calling out a lat/lon field type.
# I believe I cleaned up the public method stuff so that helper methods are now package private.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789561#action_12789561 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Hi Grant:

{quote}
My tests show it to be at least 7 times faster. But this should be obvious from static analysis, too. First of all, String.split() uses a regex which then makes a pass through the underlying character array. Then, trim has to go back through and analyze the char array too, not to mention the extra String creations. The optimized version here makes one pass and deals solely at the char array level and only has to do the substring, which I think can be optimized by the JVM to be a copy on write.
{quote}

Got it. A couple of points:

1. 7x faster is great, but could end up being noise if x = 2 ms. It matters if x is say 2 minutes, agreed. If it's on the ms end then the expense of more lines of (uncommented) code isn't worth it.
2. This code is likely to get called heavily on the indexing side, so performance, though still an issue, is not as hugely important as say on the searching side.
3. If you feel strongly about an optimized version of this magic splitAndTrim function, how ability a utility function and refactor then? I would guess this code could be used elsewhere, and that would help to satisfy my hunger for reusability. I'll even javadoc the function and do the refactor if you'd like.

Cheers,
Chris



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791414#action_12791414 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

I'm spot-checking mutiple different patches at this point... but in general, we should strive to not expose the complexity further up the type hierarchy, and we should not limit what subclasses can do.

isPolyField() returns true if more than one Fieldable *can* be returned from createFields()
createFields() is free to return whatever the heck it likes.
And from SchemaField and FieldType's perspective,that's it. Implementation details are up to subclasses and we shouldn't add assumptions in base classes.  There should be *no* concept of subFieldTypes or whatever baked into anything.

So, from Noble's patch: we shouldn't try caching subfields in SchemaField... and esp not via "if (type instanceof DelegatingFieldType)"... it really doesn't belong there.



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788751#action_12788751 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}
Seems like SolrQueryParser should use getFieldQuery for everything (except TextField... but it could even be used for that if we make it such that we could call back to getBooleanQuery, etc). I had this in my patch.
{quote}

Yonik, could you elaborate on this?  It seems kind of weird to have that instanceof check in SolrQueryParser.getFieldQuery() to see if we have a TextField or not.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788151#action_12788151 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. What if, instead, dynamic fields are directly used for subfields?

That then requires those dynamic fields to be present, which I'd rather not have to do.  Part of the goal of this issue is to hide the implementation.  Having said that, I still don't know whether that means I need to keep the IndexSchema changes.  Let me do another iteration.

bq. Another thing to keep in mind - not all subfields will always be of the same type.

Agreed, but I don't think this is baked in to the generic capabilities, just the Point stuff, where I think it is fine to have the same sub-type.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792405#action_12792405 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

bq.Cleaned up the field creation a bit, more documentation in the example. I think this is ready to go.

Are you sure that we want to create a new String for every query/field creation?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794439#action_12794439 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

OK, I'll try to diff what I had done before and re-make those changes.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788117#action_12788117 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

Please see the DocumentBuilder changes I had added... there was starting to be too much duplicated code and I pulled it out into a utility method.

Seems like SolrQueryParser should use getFieldQuery for everything (except TextField... but it could even be used for that if we make it such that we could call back to getBooleanQuery, etc).  I had this in my patch.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794439#action_12794439 ] 

Yonik Seeley edited comment on SOLR-1131 at 12/24/09 3:46 PM:
--------------------------------------------------------------

OK, I'll try to diff what I had done before and re-make those changes.
edit: I also see bug fixes I had made that got lost... I'll do a full re-review.
edit: It appears the subFieldSuffix is broken again too.

      was (Author: yseeley@gmail.com):
    OK, I'll try to diff what I had done before and re-make those changes.
edit: I also see bug fixes I had made that got lost... I'll do a full re-review.
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791505#action_12791505 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

bq.we shouldn't try caching subfields in SchemaField

I believe The SchemaField is an ideal place to cache the 'synthetic' field info. 

bq.and esp not via "if (type instanceof DelegatingFieldType)"... it really doesn't belong there.

true. It was a quick and dirty way to demo the idea. 

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780954#action_12780954 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

I'm also looking for ideas on how to handle the naming of the fields that are produced by this.  I think a FieldType that produces multiple fields should hide the logistics of the naming, which this patch doesn't even begin to scratch the surface of and also on the search side, how does one search against just one of the fields?

Would appreciated thoughts on that.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788228#action_12788228 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

OK... so the real issue is that this introduces a new mechanism to look up field types... not necessarily a horrible thing, but we should definitely think twice before doing so.

home___0 and home___1 are not dynamic fields as I understand it (in that there is no ___0 dynamic field.  The lookup is done by adding new support to the IndexSchema to strip off ___foo off of any field and use that as it's type?

But... that scheme seems to limit us to a single subField type (in addition to the other downsides of requiring a new lookup mechanism).

I do want to separate these two issues though:
1) field lookup mechanism (currently just exact name in schema followed by a dynamic field check)
2) if and when fields or field types should be explicitly defined in the schema vs being created by the polyField

Aside: it looks like the code for getFieldOrNull isn't right?  Seems like it will return a field with both the wrong type and the wrong name?
{code}
   public SchemaField getFieldOrNull(String fieldName) {
      SchemaField f = fields.get(fieldName);
@@ -1055,25 +1071,28 @@
     for (DynamicField df : dynamicFields) {
       if (df.matches(fieldName)) return df.makeSchemaField(fieldName);
     }
-    
+    int idx = fieldName.indexOf(FieldType.POLY_FIELD_SEPARATOR);
+    if (idx != -1){
+      String fn = fieldName.substring(0, idx);
+      f = getFieldOrNull(fn);
+    }
     return f;
{code}

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793699#action_12793699 ] 

Grant Ingersoll edited comment on SOLR-1131 at 12/22/09 6:08 PM:
-----------------------------------------------------------------

Note, I also changed PointValueSource to be ToMultiValueSource, as it really isn't just a point, but I'm not married to that name either.

      was (Author: gsingers):
    Note, I also changed PointValueSource to be ToMultiValueSource, as it really isn't just a point.
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Summary: Allow a single field type to index multiple fields  (was: Allow a single field to index multiple fields)

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

This implements Option B as laid out at: http://search.lucidimagination.com/search/document/83a5442ab155686/solr_1131_multiple_fields_per_field_type#a600de441418a798

Next up:  Implement ValueSource support for PointType.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated SOLR-1131:
------------------------------------

    Attachment: SOLR-1131.Mattmann.121109.patch.txt

OK, this is getting a lot closer to ready to commit.

Changes:
Hi All,

Updated patch:

{quote}
# Introduced a MultiValueSource - ValueSource that abstractly represents ValueSources for poly fields, and other things.
{quote}

I added javadoc to this and the ASF license header.

{quote}
# Introduced PointValueSource - point(x,y,z) - a MultiValueSource that wraps other value sources (could be called something else, I suppose)
{quote}

I put the ASF header before the package decl, to be consistent with the other SOLR java files.

{quote}
# Add in SchemaAware callback mechanism so that Field Types and other schema stuff can register dynamic fields, etc. after the schema has been created
{quote}

Added more javadoc here, and ASF license.

{quote}
# Incorporated various comments from Chris and Yonik.
{quote}

Thanks, I appreciate it. I'm still -1 on the way this patch deals with the "optimization" issue. I'd like to see evidence that it makes sense to not use split and trim.

Cheers,
Chris



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789484#action_12789484 ] 

Chris A. Mattmann edited comment on SOLR-1131 at 12/11/09 7:39 PM:
-------------------------------------------------------------------

Hi All,

Updated patch:

{quote}
# Introduced a MultiValueSource - ValueSource that abstractly represents ValueSources for poly fields, and other things.
{quote}

I added javadoc to this and the ASF license header.

{quote}
# Introduced PointValueSource - point(x,y,z) - a MultiValueSource that wraps other value sources (could be called something else, I suppose)
{quote}

I put the ASF header before the package decl, to be consistent with the other SOLR java files.

{quote}
# Add in SchemaAware callback mechanism so that Field Types and other schema stuff can register dynamic fields, etc. after the schema has been created
{quote}

Added more javadoc here, and ASF license.

{quote}
# Incorporated various comments from Chris and Yonik.
{quote}

Thanks, I appreciate it. I'm still -1 on the way this patch deals with the "optimization" issue. I'd like to see evidence that it makes sense to not use split and trim.

Cheers,
Chris



      was (Author: chrismattmann):
    OK, this is getting a lot closer to ready to commit.

Changes:
Hi All,

Updated patch:

{quote}
# Introduced a MultiValueSource - ValueSource that abstractly represents ValueSources for poly fields, and other things.
{quote}

I added javadoc to this and the ASF license header.

{quote}
# Introduced PointValueSource - point(x,y,z) - a MultiValueSource that wraps other value sources (could be called something else, I suppose)
{quote}

I put the ASF header before the package decl, to be consistent with the other SOLR java files.

{quote}
# Add in SchemaAware callback mechanism so that Field Types and other schema stuff can register dynamic fields, etc. after the schema has been created
{quote}

Added more javadoc here, and ASF license.

{quote}
# Incorporated various comments from Chris and Yonik.
{quote}

Thanks, I appreciate it. I'm still -1 on the way this patch deals with the "optimization" issue. I'd like to see evidence that it makes sense to not use split and trim.

Cheers,
Chris


  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788170#action_12788170 ] 

Grant Ingersoll edited comment on SOLR-1131 at 12/9/09 5:18 PM:
----------------------------------------------------------------

{quote}
<fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/>
<field name="home" type="xy" indexed="true" stored="true"/>
{quote}
Two indexed fields
home___0
home___1

One stored field:
home

      was (Author: gsingers):
    {quote}
<fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/>
<field name="home" type="xy" indexed="true" stored="true"/>
{quote}

home___0
home___1
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792522#action_12792522 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

It doesn't look like subFieldSuffix actually works correctly - I've made a lot of little changes already, so I'll just go ahead and take a shot at fixing it.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789241#action_12789241 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

I've got a patch almost ready that brings in the ValueSource stuff.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789330#action_12789330 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Hi Grant:

bq.The array is sorted and array access is much faster and we often have to loop over it to look it up.

OK, fair enough.

bq. I think comma makes sense. 

OK, so you think it makes sense -- why? Because it's an N-dimensional array and spaces are less "user friendly" as Yonik put it? 

bq. As for the optimization stuff, I agree w/ Yonik, this is code that will be called a lot.

I'm wondering what there is to agree with since "optimization" was never defined. Are you talking speed? Are you talking memory efficiency? Code readability? Maintainability? Some combination of all of those? There are tradeoffs in everything. You could rewrite some of the provided java runtime methods to squeeze out extra performance, but what's the point of libraries or reusable functions then? The prior code that was in there basically rewrote exactly what split() and trim() do, so why not reuse what's there? If you throw up the performance flag, I would push back on readability and maintainability. 

bq. It is logged, but for the poly fields, if the dyn field is already defined, that's just fine.

Where is it logged? It wasn't in the most up-to-date patch, provided on 2009-12-10 04:34 PM. Here was the code snipped that was there:

{code}
+    //For each poly field, go through and add the appropriate Dynamic field
+      for (FieldType fieldType : polyFieldTypes.values()) {
+        if (fieldType instanceof DelegatingFieldType){
+          List<FieldType> subs = ((DelegatingFieldType) fieldType).getSubTypes();
+          if (subs.isEmpty() == false){
+            //add a new dynamic field for each sub field type
+            for (FieldType type : subs) {
+              log.debug("dynamic field creation for sub type: " + type.typeName);
+              SchemaField df = SchemaField.create("*" + FieldType.POLY_FIELD_SEPARATOR + type.typeName,
+                      type, type.args);//TODO: is type.args right?
+              if (isDuplicateDynField(dFields, df) == false){
+                addDynamicFieldNoDupCheck(dFields, df);
+              }
               // NOTE: there is no else here, so I added an else and a log message
+            }
+          }
            // NOTE: there is no else here, so I added an else and a log message
+        }
+      }
+
{code}

Here's what I added:

{code}
+    //For each poly field, go through and add the appropriate Dynamic field
+      for (FieldType fieldType : polyFieldTypes.values()) {
+        if (fieldType instanceof DelegatingFieldType){
+          List<FieldType> subs = ((DelegatingFieldType) fieldType).getSubTypes();
+          if (!subs.isEmpty()){
+            //add a new dynamic field for each sub field type
+            for (FieldType type : subs) {
+              log.debug("dynamic field creation for sub type: " + type.typeName);
+              SchemaField df = SchemaField.create("*" + FieldType.POLY_FIELD_SEPARATOR + type.typeName,
+                      type, type.args);//TODO: is type.args right?
+              if (!isDuplicateDynField(dFields, df)){
+                addDynamicFieldNoDupCheck(dFields, df);
+              }
+              else{
+                log.debug("dynamic field creation avoided: dynamic field: ["+df.getName()+"] " +
+                		"already defined in the schema!");
+              }
+              
+            }
+          }
+          else{
+            log.debug("field type: ["+fieldType.getTypeName()+"]: no sub fields defined");
+          }
+        }
+      }
+
{code}

Also, I get that it's fine for the poly fields if the dyn field is already defined (in fact, based on my mailing lists comments http://old.nabble.com/SOLR-1131:-disconnect-between-fields-created-by-poly-fields-td26736431.html I think this should _always_ be the case), but whether it's fine or not, it's still worthy to log to provide someone more information. 



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-1131:
-------------------------------

    Attachment: diff.patch
                SOLR-1131.patch

Here's an updated patch that fixes a lot of little bugs - hopefully others can use this as a base so we don't lose all the little changes.  I also attached a diff of the patch to the previous patch to help people see what's changed (yuck... doesn't seem that readable though).  This isn't finished though - I only got to the FieldType / IndexSchema changes, and *some* of the ValueSource stuff.  I didn't get to distance and value source parsing stuff.

Some of the changes:
- small javadoc cleanups
- fix subFieldSuffix so that it actually uses that suffix
- make any utility methods on FieldType/IndexSchema dealing with "poly" field creation package protected - I don't think we want these public... it's specific for a field that adds only to other fields that it defines (with a specific naming convention) all of the same type.  They probably don't even belong on the base classes, but I don't care so much if they aren't public or protected, we can remove later.
- make PointType actually delegate to the subFieldType... before it was assuming thinks like TermQuery and TermRangeQuery... this would have actually disabled NumericRangeQuery speedups!
- remove SchemaField creation from PointType - we should get fields from the schema
- fixed some value sources that didn't weight correctly
- fix createFields() to return Fieldable instead of Field

When I fixed up point type, I did so in many places by assuming 2 points (so it will break for other dimensions).

I had been working off the assumption that we wanted a geo specific base class to delegate some things to (like the most efficient way to get a bounding box, etc).  If so, we need to decide what that class will be.  Making it point or coordinate already bakes in a lot if implementation details (subType stuff).  Do we want geo to just work off of a generic n dimentional point class, or should we have a 2d lat/lon?  It does feel like we're loosing something by trying to over-generalize.  The PointTypeValueSource is inner-loop stuff, so I did specialize that for lat/lon.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788170#action_12788170 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}
<fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/>
<field name="home" type="xy" indexed="true" stored="true"/>
{quote}

home___0
home___1

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786902#action_12786902 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

I've got a patch with search and all the existing tests working.  Still trying to work on one test that is failing due to toInternal/external conflicts.  In the patch, I hide all the details of the internal fields, thus not requiring the dynamic field stuff.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790330#action_12790330 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. minor nit: why is "subFieldTypes" plural? We're only specifying a single type, right?

That's inherited from a more general mechanism that would allow multiple field types.  So, the PointType only allows one, but other implementations may allow more.

bq. Given the concerns of some people around automatically registering a dynamic field, perhaps we should optionally allow a subFieldSuffix to be passed instead of subFieldTypes.

That's fine by me.  Given my workload, I will try to do this and commit on Thursday of this week, unless someone wants to take it before then.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-1131:
----------------------------------------

    Attachment: SOLR-1131.patch

I guess Noble was referring to something like what is done in this patch.

# DelegatingFieldType has a new method:
{code}
public SchemaField[] getSubFields(SchemaField mainField);
{code}
# PointType and PlusMinusField implement this new method. It is not the prettiest way but this is one way to do it.
# With this approach, we can get the names from the subFields wherever the name is used (not implemented in this patch).

The PlusMinusField is actually a field type and not a field so we should probably rename it to PlusMinusFieldType.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780953#action_12780953 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

bq.dd a boolean on FT for isMultiField which returns false by default, then we could check that

not bad

bq.My gut, however, says the object is very short lived and is likely to be of negligible cost.

but, for a huge ingestion it just means several million objects created and that much extra GC


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Steven Rowe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792875#action_12792875 ] 

Steven Rowe commented on SOLR-1131:
-----------------------------------

Maybe Point -> { Point2D, Point3D } ?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780955#action_12780955 ] 

Chris Male commented on SOLR-1131:
----------------------------------

My initial feeling is, is searching against just one field something this functionality needs to concern itself with? If someone creates a field of type Point for example, which behind the scenes is indexed as 2 fields, from a Solr schema.xml perspective it is just 1 field, and so it should be the same at the querying level.  We are trying to encapsulate the fact that the FieldType results in multiple fields.  This then frees us up to choose a naming convention that is easy for us to implement, because we don't have to concern users with the convention.

If someone does want to be able to search against just one field, such as maybe being able to find documents at a certain x coordinate, rather than an x,y Point, then I think we can simply recommend they index that data in a separate field.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788123#action_12788123 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}
Regarding polyfields... it's not clear why they are special enough to have to change the IndexSchema? (IndexSchema.isPolyField, getPolyField, getPolyFieldType, getPolyFieldTypeNoEx, etc). Can't we just store them as normal field types?
{quote}

My thinking was that a Query Parser or other things might need to know look up this information, but you are right, I don't have a specific use case for them at the moment.  At the same time, poly fields _feel_ like a hybrid between regular fields and dynamic fields and thus fit at the same level they do.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780924#action_12780924 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

Is this a good idea? This creates an extra Object (Field[]) for every Field . For a corner case we are introducing an overhead to all the field types.

Why don't we add a new interface MutlValuedFieldType which extends FieldType for this 

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

OK, this is getting a lot closer to ready to commit.

Changes:
# Introduced a MultiValueSource - ValueSource that abstractly represents ValueSources for poly fields, and other things.  
# Introduced PointValueSource - point(x,y,z) - a MultiValueSource that wraps other value sources (could be called something else, I suppose)
# Implemented PointTypeValueSource to represent ValueSource for the PointType class. 
# Hooked in multivalue callbacks to DocValues.  In addition to making functions work with Points (et. al) it should be possible to write functions that work on multivalued fields, but I did not undertake this work.
# Add in SchemaAware callback mechanism so that Field Types and other schema stuff can register dynamic fields, etc. after the schema has been created
# Updated the example to have spatial information in the docs, etc.  See http://wiki.apache.org/solr/SpatialSearch
# Modified the distance functions to work with MultiValueSources
# cleaned up the tests
# Incorporated various comments from Chris and Yonik.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-1131:
-------------------------------

    Attachment: SOLR-1131.patch

Second try - forgot to "svn add" the new files.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790745#action_12790745 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

OK, I see what you mean.  I don't think we should add it onto the interface, though.  I think it can just be handled by changing the signature of the createField method that takes in the delegatedFieldType.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793320#action_12793320 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. fix createFields() to return Fieldable instead of Field

This seems a bit weird (even though I understand why) due to the fact that the other createField methods actually return Field and not Fieldable.  

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792675#action_12792675 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. Do we want geo to just work off of a generic n dimentional point class, or should we have a 2d lat/lon?

I think we want generic.  Some geo stuff will want 3D (elevation).  

bq. remove SchemaField creation from PointType - we should get fields from the schema

I thought about that, but why should we have to figure out what the dynamic field is every time when we already know it?

bq, make PointType actually delegate to the subFieldType... before it was assuming thinks like TermQuery and TermRangeQuery... this would have actually disabled NumericRangeQuery speedups!

Where do you mean?  In createField?  Or in the getFieldQuery?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786951#action_12786951 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Patch is looking good! I'm pouring through it right now -- I'll try and test this as part of work I'm doing on SOLR-1586 -- maybe even update that issue if I get a sec today :)

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781948#action_12781948 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

See discussion at http://search.lucidimagination.com/search/document/d24c920ddf05b4f7/solr_1131_multiple_fields_per_field_type

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789339#action_12789339 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}
I'm wondering what there is to agree with since "optimization" was never defined. Are you talking speed? Are you talking memory efficiency? Code readability? Maintainability? Some combination of all of those?
{quote}

Speed and memory.  

As for logging, that code is all going away in the next patch, I think

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788168#action_12788168 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

{quote}
Perhaps we should return to the design level a bit instead of me reading code and maybe making mistakes and trying to infer intent.

Assume we have this:
<fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/>
<field name="home" type="xy" indexed="true" stored="true"/>

What are the exact field names that are indexed?
{quote}

Regarding the fieldType subFieldType attribute -- a question popped into my mind. How do we handle poly fields where each type is different? I.e., where subFieldType="double,tint" or whatever...

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703661#action_12703661 ] 

Ryan McKinley commented on SOLR-1131:
-------------------------------------

[TeeTokenFilter|http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/analysis/TeeTokenFilter.html] look interesting.  How do you imagine it could help with letting a solr Field index multiple fields?

Using the TeeTokenFilter seems like a matter of plumbing.  Perhaps we could add some properties to FieldType that create a SinkTokenizer and then use copyField (or something similar) to use the existing SinkTokenizer.

I'm not familiar enought with the Sink/Tee stuff to know what we would need -- but we should make sure any SchemaField/FieldType changes open the door to this.

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>         Attachments: SOLR-1131-IndexMultipleFields.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788298#action_12788298 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. OK... so the real issue is that this introduces a new mechanism to look up field types... not necessarily a horrible thing, but we should definitely think twice before doing so. 

Agreed.  I'm not wedded to this approach, just want to see the discussion through.  I do feel strongly that the goal is such that an app designer should be able to use a FieldType just as they always have, either dynamic or static.  How we get to that I don't care so much as long as it works and performs.

bq. But... that scheme seems to limit us to a single subField type (in addition to the other downsides of requiring a new lookup mechanism).

I don't follow this.  In this particular implementation, I have a single subFieldType, but I don't see why a different implementation couldn't do something like:
{code}
<fieldType name="foo" type="solr.MultiSubPointType" dimension="3" subFieldTypes="double,tdouble,int"/>
{code}

bq. Aside: it looks like the code for getFieldOrNull isn't right? Seems like it will return a field with both the wrong type and the wrong name?

Hmmm, I _think_ it should return the "owning" Schema Field, i.e. the one that exists in the schema.xml file.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789520#action_12789520 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. I'm still -1 on the way this patch deals with the "optimization" issue. I'd like to see evidence that it makes sense to not use split and trim.

My tests show it to be at least 7 times faster.  But this should be obvious from static analysis, too.  First of all, String.split() uses a regex which then makes a pass through the underlying character array.  Then, trim has to go back through and analyze the char array too, not to mention the extra String creations.  The optimized version here makes one pass and deals solely at the char array level and only has to do the substring, which I think can be optimized by the JVM to be a copy on write.

{code}

  public void testDistPerf() throws Exception {
    String [] input = new String[1000000];
    Random random = new Random();
    for (int i = 0; i < input.length; i++){
      input[i] = random.nextInt() + ", " + random.nextInt();
    }
    String [] out = new String[2];
    long time = 0;
    long start = System.currentTimeMillis();
    for (int j = 0; j < 50; j++) {
      for (int i = 0; i < input.length; i++){
        split(input[i], out, 2);
      }
    }
    time = (System.currentTimeMillis() - start);
    System.out.println("Time: " + time);
    time = 0;
    start = System.currentTimeMillis();
    for (int j = 0; j < 50; j++) {
      for (int i = 0; i < input.length; i++){
        DistanceUtils.parsePoint(out, input[i], 2);
      }
    }
    time = (System.currentTimeMillis() - start);
    System.out.println("Time: " + time);
  }

  private String[] split(String externalVal, String[] out, int dimension) {
    out = externalVal.split(",");
    if (out.length != dimension) {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "incompatible dimension (" + dimension +
              ") and values (" + externalVal + ").  Only " + out.length + " values specified");
    }
    for (int j = 0; j < out.length; j++) {
      out[j] = out[j].trim();
    }
    return out;
  }
{code}

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791831#action_12791831 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

I have opened an issue for the same SOLR-1664

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789526#action_12789526 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

I think this is ready to commit.  I'd like to do so on Monday or Tuesday of next week, so that should give plenty of time for further review

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790715#action_12790715 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. It is not clear as to why can't the 'sf' instance be cached and reused? 

Because there is no way to change the name on a SchemaField w/o changing SchemaField to be non-final.  I don't think SchemaField should be non-final.

bq. Why do we have a map for flags why not use a bitset?

Yeah, we could add a new method that takes a bitset, b/c I believe that is what is used under the hood anyway.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789351#action_12789351 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. Unfortunately, it's at the cost of readability and maintainability. 

Maybe.  It took me all of 30 seconds to figure out what it was doing.  I'll put some comments on it.  While readability is important, Solr's goal is not to make a product that a CS101 grad can read, it's too build a blazing fast search server.  That call could hit millions of times when indexing points.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-1131:
-----------------------------

    Attachment: SOLR-1131.patch

Modified the Query creation to use the cached SchemaField names. 

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll resolved SOLR-1131.
-----------------------------------

    Resolution: Fixed

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790728#action_12790728 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. Even if SchemaField is final we can precreate and cache the SchemaField objects because the properties of the synthetic field is known in advance. For instance, if you have a dimension of 2 ,the PointType instance will always have 2 well known synthetic names and types that can be created well in advance and they can be reused

True, but you need to also be able to change the name and it needs to be able to rely on the existing createField signature, which uses these values on the SchemaField.  Earlier patches had a separate, internal createField() method that took in all the options (thus not requiring the SF at all) but they don't work for the delegation.  

I'm open to ideas, though, so throw up some code.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780950#action_12780950 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. Is this a good idea?

Not sure yet. 

bq. Why don't we add a new interface MutlValuedFieldType which extends FieldType for this 

Aren't we just substituting a very simple construction for an instanceof check?

I was possibly thinking of a couple of other options, too:
1. add a boolean on FT for isMultiField which returns false by default, then we could check that
2. Add a threadlocal that stores a preconstructed array of size one which could then simply be set for the single field case, which is the most common case.

My gut, however, says the object is very short lived and is likely to be of negligible cost.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Fix Version/s: 1.5

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789833#action_12789833 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Hi Grant:

Thanks. Your latest patch omits class-level javadoc I wrote for DelegatingFieldType and for the #inform method in SchemaAware.

{code}
+/**
+ * An interface for {@link FieldType}s that are poly fields, as defined in <a
+ * href="http://issues.apache.org/jira/browse/SOLR-1131">SOLR-1131</a>, so that
+ * poly fields can declare the {@link FieldType}s of their sub-fields.
+ * 
+ * @since SOLR-1131
+ * 
+ **/
+public interface DelegatingFieldType {
+
+  /**
+   * 
+   * Returns the {@link FieldType}s of the sub-fields for this poly-field.
+   * 
+   * @return A {@link List} of {@link FieldType}s for the sub-fields of a poly
+   *         field.
+   */
+  public List<FieldType> getSubTypes();
+}
{code}

{code}
+public interface SchemaAware {
+
+  /**
+   * Informs the {@link IndexSchema} provided by the <code>schema</code>
+   * parameter of an event (e.g., a new {@link FieldType} was added, etc.
+   * 
+   * @param schema
+   *          The {@link IndexSchema} instance that inform of the update to.
+   * 
+   * @since SOLR-1131
+   */
+  public void inform(IndexSchema schema);
+}
{code}

Other than that +1. Thanks for seeing this through to a great patch.

Cheers,
Chris


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788800#action_12788800 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. Next up: Implement ValueSource support for PointType.

Exactly!

My thinking was perhaps to add the ability to get lat+lon from DocValues... but for efficiency have DocValues fill in values on an object passed to it (this is inner-loop stuff.... we don't want to be creating objects per doc).

DocValues getPoint(Point point)
or perhaps
DocValues.getPoint(double[] point)


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790635#action_12790635 ] 

Noble Paul edited comment on SOLR-1131 at 12/15/09 10:06 AM:
-------------------------------------------------------------

in FieldType#createFields(SchemaField field, FieldType delegatedType, String storageVal, boost, String ... externalVals)
{code:java}
String name = field.getName();
      Map<String, String> props = new HashMap<String, String>();
      //Just set these, delegate everything else to the field type
      props.put("indexed", "true");
      props.put("stored", "false");
      //props.put("omitNorms", "true");
      //props.put("tokenized", "false");
      if (field.indexed()) {

        for (int j = 0; j < externalVals.length; j++) {
          //SchemaField is final, as is name, so we need to recreate each time
          //put the counter before the separator, b/c dynamic fields can't be asterisks on both the front and the end of the String
          SchemaField sf = SchemaField.create(name  + "_" + j + POLY_FIELD_SEPARATOR + delegatedType.typeName, delegatedType, props);
          //QUESTION: should we allow for vectors, etc?  Not sure that it makes sense
          results[j] = delegatedType.createField(sf, externalVals[j], boost);
        }
      }
{code}

It is not clear as to why can't the 'sf' instance be cached and reused? 

 we can also  avoid creating the synthetic field name at query time in PointField#.getFieldQuery


Why do we have a map for flags why not use a bitset?

      was (Author: noble.paul):
    in FieldType#createFields(SchemaField field, FieldType delegatedType, String storageVal, boost, String ... externalVals)
{code:java}
String name = field.getName();
      Map<String, String> props = new HashMap<String, String>();
      //Just set these, delegate everything else to the field type
      props.put("indexed", "true");
      props.put("stored", "false");
      //props.put("omitNorms", "true");
      //props.put("tokenized", "false");
      if (field.indexed()) {

        for (int j = 0; j < externalVals.length; j++) {
          //SchemaField is final, as is name, so we need to recreate each time
          //put the counter before the separator, b/c dynamic fields can't be asterisks on both the front and the end of the String
          SchemaField sf = SchemaField.create(name  + "_" + j + POLY_FIELD_SEPARATOR + delegatedType.typeName, delegatedType, props);
          //QUESTION: should we allow for vectors, etc?  Not sure that it makes sense
          results[j] = delegatedType.createField(sf, externalVals[j], boost);
        }
      }
{code}

It is not clear as to why can't the 'sf' instance be cached and reused? 

Why do we have a map for flags why not use a bitset?
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794439#action_12794439 ] 

Yonik Seeley edited comment on SOLR-1131 at 12/24/09 3:18 PM:
--------------------------------------------------------------

OK, I'll try to diff what I had done before and re-make those changes.
edit: I also see bug fixes I had made that got lost... I'll do a full re-review.

      was (Author: yseeley@gmail.com):
    OK, I'll try to diff what I had done before and re-make those changes.
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788138#action_12788138 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. poly fields feel like a hybrid between regular fields and dynamic fields and thus fit at the same level they do.

The schema needs to know about dynamic fields because it affects field names.

Actually, I think I just saw why you currently need *some* support in IndexSchema with the current way you are doing things:
If you have a dynamicField _pt, then I think you use field names like home_pt__0 and home_pt__1?
So in essence, it's a new type of dynamic field?  This seems like it might be hard to actually get right in all of the corner cases.

What if, instead, dynamic fields are directly used for subfields?
So for a field name "home" instead of home__0 and home__1, you would use  home__0_d, home__1_d
Not as short, but it avoids having to add new capabilities to the IndexSchema.

Another alternative: use a prefix for subfields and the suffix for the type. _0_home_d,  _1_home_d

Another thing to keep in mind - not all subfields will always be of the same type.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

OK, here's my take on this.  I took Yonik's and merged it w/ a patch I had in the works.  It's not done, but all tests pass, including the new on I added (PolyFieldTest).  Yonik's move to put getFieldQuery in FieldType was just the key to answering the question of how to generate queries given a FieldType.

Notes:
1. I changed the Geo examples to be CoordinateFieldType (representing an abstract coordinate system) and then PointFieldType which represents a point in an n-dimensional space (default 2D).  I think from this, we could easily add things like PolygonFieldType, etc. which would allow us to create more sophisticated shapes and do things like intersections, etc.  For instance, imagine saying:  Does this point lie within this shape?  I think that might be able to be expressed as a RangeQuery
2. I'm not sure I care for the name of the new abstract FieldType that is a base class of CoordinateFieldType called DelegatingFieldType
3. I'm not sure yet on the properties of the generated fields just yet.  Right now, I'm delegating the handling to the sub FieldType except I'm overriding to turn off storage, which I think is pretty cool (could even work as a copy field like functionality)
4. I'm not thrilled about creating a SchemaField every time in the createFields protected helper method, but SchemaField is final and doesn't have a setName method (which makes sense)

Questions for Yonik on his patch:
1. Why is TextField overriding getFieldQuery when it isn't called, except possibly via the FieldQParserPlugin?
2. I'm not sure I understand the getDistance, getBoundingBox methods on the GeoFieldType.   It seems like that precludes one from picking a specific distance (for instance, some times you may want a faster approx. and others a slower)


Needs:
1. Write up changes.txt
2. More tests, including performance testing
3. Patch doesn't support dynamic fields yet, but it should

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789106#action_12789106 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

{quote}
A comma is more user friendly - spaces are often already used as delimiters in quite a few places.
Why did you replace more optimized code that was already written in parsePoint with less optimized code?
{quote}

Meh, I could go either way on the comma/space issue. It would be nice to be compatible with an existing GeoPoint standard. I know georss uses space as the delimeter -- do you know of any that use ","?

RE: optimized code, can you be explicit? I would argue the code I inserted is more optimized from a readiability standpoint. It's a bit easier for your typical CS101 grad to understand. All that was being done in the prior patch is a set of forwards/backwards isolation loops to determine the start/end index to substring out, in case you have:

34.333      ,100.1 OR
34.333,100.1 OR
34.333,       100.1

At first blush, trying to understand that code was a bit harder than simply tokenizing on the known delimeter, and then trimming each tokenized value.

{code}
  out = externalVal.split(",");
+      if(out.length != dimension){
+        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "incompatible dimension (" + dimension +
+            ") and values (" + externalVal + ").  Only " + i + " values specified");        
+      }
+      for(int j=0; j < out.length; j++){
+        out[j] = out[j].trim();
+      }
{code}




> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789082#action_12789082 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. parsePoint in DistanceUtils, why use ',' as the separator

A comma is more user friendly - spaces are often already used as delimiters in quite a few places.
Why did you replace more optimized code that was already written in parsePoint with less optimized code?


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789079#action_12789079 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

Quick comment based on a spot check of the changes to IndexSchema: rather than make polyField special somehow w.r.t IndexSchema, and add a FieldType.getPolyFieldNames, etc, I had been thinking more along the lines of having an IndexSchema.registerDynamicFieldDefinition - just like the existing registerDynamicCopyField.  This would (optionally) allow any field type to add other definitions to the IndexSchema.  I continue to think it would be good to stay away of special logic for "polyfields" in the IndexSchema.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated SOLR-1131:
------------------------------------

    Attachment: SOLR-1131.Mattmann.121009.patch.txt

Hi All:

Here's a cut on the patch. Some questions/comments on the existing patch(es):

1. Why use 
{code}
private DynamicField[] dynamicFields;
{code}

instead of 
{code}
List<DynamicField>
{code}
or 
{code}
Collection<DynamicField>
{code}

in IndexSchema?

2. There are a bunch of useless whitespace changes (e.g., in IndexSchema, FieldType) in the existing patches. The final patch probably shouldn't include those since it makes it difficult to understand what was actually changed.
3. IndexSchema:
    - when checking for isDuplicateDynField, if it is, nothing is done. Shouldn't this be where an exception is thrown or a message is logged? In the patch I'm attaching I took the log approach.
4. IndexSchema:
    - what happens if subs.isEmpty() == true?
    - maybe log message that says, dyn field definition is up to you?
    - I took the approach in my attached patch to log it.
5. Why does getPolyFieldType(String) throw an exception if the field is not a poly field type -- that seems a bit brittle? Also there's the NoEx version anyways (why not just keep that one?). In the patch I've attached, I took the approach of only including a getPolyFieldType that returns null rather than throwing an ex (the NoEx version).
6. CoordinateFieldType: why process > 1 sub field types and then throw an exception at the end? I cleaned this up to throw the Exception when it occurs.
7. parsePoint in DistanceUtils, why use ',' as the separator -- use ' ' (at least conforms to georss point then). I guess because you are supporting N-dimensional points, right?
8. parsePoint -- instead of complicated isolation loops, why not just use trim()? I've taken that approach in the patch I've attached.

This patch passes all unit tests as well. This doesn't implement option C that I proposed yet. Hopefully I'll get a chance to put that up later tonight.



> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Brings it up to trunk.  Still needs test cases.  All other tests pass.

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792848#action_12792848 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. I think we want generic. Some geo stuff will want 3D (elevation).

I imagine 99% of the users of Point will be those wanting straight 2D geo search (limit by simple distance, or sort by simple distance).
I don't see how elevation would help in isolation since more accurate distance measures would presumably need map info for driving distance.
I could see people wanting to *store* 3D points, etc, but they can do that today.

bq. I thought about that, but why should we have to figure out what the dynamic field is every time when we already know it?

It's more of an issue of directly creating fields... that's currently up to the schema (or the FieldType if some form of SOLR-1664 goes in).

bq. > make PointType actually delegate to the subFieldType... before it was assuming thinks like TermQuery and TermRangeQuery... this would have actually disabled NumericRangeQuery speedups!

bq. Where do you mean? In createField? Or in the getFieldQuery?

In getFieldQuery and getRangeQuery... I changed to actually delegate to the subFieldType.getFieldQuery and getRangeQuery.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780961#action_12780961 ] 

Chris Male commented on SOLR-1131:
----------------------------------

Those are definitely big problems.  

The ResponseWriter problem could be simplified if they used SolrDocuments rather than retrieving raw Lucene Documents.  When constructing the SolrDocuments, which is done in cooperation with an IndexSchema instance, we have the information needed to bring the multiple fields together as one.  I'm not sure of the performance impact of doing this, but it seems like having the ResponseWriters retrieve the data in a single consistent fashion is a good thing in the long run anyway.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788116#action_12788116 ] 

Chris A. Mattmann commented on SOLR-1131:
-----------------------------------------

Hi Yonik:

I agree in general with your points above regarding unit tests. However, there seems to be a contradiction in your last statement to what you proposed above:

bq. schema.xml: geo will be core... let's not add a new/different schema file in tests for this and simply add it to the latest schema12

Why have unit tests point at the actual schema? That explicitly ties your unit tests to the shipped ops schema, and then encourages people to write unit tests against it (which could lead to the specific number checks that will break when the schema is updated as you mentioned). Instead +1 for having a separate test schema even if it causes duplication it insulates change.

Cheers,
Chris


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (SOLR-1131) Allow a single field to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll reassigned SOLR-1131:
-------------------------------------

    Assignee: Grant Ingersoll

> Allow a single field to index multiple fields
> ---------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788802#action_12788802 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. Yonik, could you elaborate on this? It seems kind of weird to have that instanceof check in SolrQueryParser.getFieldQuery() to see if we have a TextField or not.

If you look at the impl in TextField, I had to comment out stuff like "newTermQuery" and replace it with "new TermQuery".

+            // Query currentQuery = newTermQuery(new Term(field, term));
+            Query currentQuery = new TermQuery(new Term(field, term));

To be fully back compatible, all we would need to do is check if the parser was an instance of QueryParser, and if so, delegate to newTermQuery.  Then we could use fieldType.getFieldQuery() absolutely everywhere.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790635#action_12790635 ] 

Noble Paul commented on SOLR-1131:
----------------------------------

in FieldType#createFields(SchemaField field, FieldType delegatedType, String storageVal, boost, String ... externalVals)
{code:java}
String name = field.getName();
      Map<String, String> props = new HashMap<String, String>();
      //Just set these, delegate everything else to the field type
      props.put("indexed", "true");
      props.put("stored", "false");
      //props.put("omitNorms", "true");
      //props.put("tokenized", "false");
      if (field.indexed()) {

        for (int j = 0; j < externalVals.length; j++) {
          //SchemaField is final, as is name, so we need to recreate each time
          //put the counter before the separator, b/c dynamic fields can't be asterisks on both the front and the end of the String
          SchemaField sf = SchemaField.create(name  + "_" + j + POLY_FIELD_SEPARATOR + delegatedType.typeName, delegatedType, props);
          //QUESTION: should we allow for vectors, etc?  Not sure that it makes sense
          results[j] = delegatedType.createField(sf, externalVals[j], boost);
        }
      }
{code}

It is not clear as to why can't the 'sf' instance be cached and reused? 

Why do we have a map for flags why not use a bitset?

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1131:
----------------------------------

    Attachment: SOLR-1131.patch

Starting to add unit tests.  Still no support on the search/response side, but groundwork for adding multiple fields per SchemaField/FieldType is now laid.  Still need a way to know that a field/fieldtype is going to output multiple fields so that we can detect them when searching, etc.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788125#action_12788125 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. I need to update my IntelliJ "Live Templates", as I have them setup to spit out a pattern like above

lol... so that's where all that comes from...  I was going to say something like "this looks like it came out of a code generator" but it sounded a bit too harsh in the off chance that it wasn't ;-)  I'm very relieved to find you weren't typing out that crap by hand.

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788319#action_12788319 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

>  Aside: it looks like the code for getFieldOrNull isn't right? Seems like it will return a field with both the wrong type and the wrong name?
> > Hmmm, I think it should return the "owning" Schema Field, i.e. the one that exists in the schema.xml file.

Those fields probably will be exposed at least internally to other parts of solr, so they should really return the correct field / fieldType.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788156#action_12788156 ] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

bq. That then requires those dynamic fields to be present, which I'd rather not have to do. 

That's sort of a separate question: if one were allowed to register a dynamic field (not sure if this capability is present), then it could be registered if it didn't exist.   

Also, you have subFieldType="double" in the schema... and that requires that the "double" field type be defined.  Why not have subFieldSuffix="_d" and require the "_d" dynamic field be defined?  Seems like the same complexity level.
 
> > Another thing to keep in mind - not all subfields will always be of the same type.
> Agreed, but I don't think this is baked in to the generic capabilities, just the Point stuff,

For a specific point implementation, that's fine.  But if you use a point type that can do cartesian grid stuff, then you already have different field types.  But I guess subFieldType="double" need only apply to some of the subfields (the ones that index the points).

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786944#action_12786944 ] 

Grant Ingersoll edited comment on SOLR-1131 at 12/9/09 1:33 PM:
----------------------------------------------------------------

OK, here's my take on this.  I took Yonik's and merged it w/ a patch I had in the works.  It's not done, but all tests pass, including the new on I added (PolyFieldTest).  Yonik's move to put getFieldQuery in FieldType was just the key to answering the question of how to generate queries given a FieldType.

Notes:
1. I changed the Geo examples to be CoordinateFieldType (representing an abstract coordinate system) and then PointFieldType which represents a point in an n-dimensional space (default 2D).  I think from this, we could easily add things like PolygonFieldType, etc. which would allow us to create more sophisticated shapes and do things like intersections, etc.  For instance, imagine saying:  Does this point lie within this shape?  I think that might be able to be expressed as a RangeQuery
2. I'm not sure I care for the name of the new abstract FieldType that is a base class of CoordinateFieldType called DelegatingFieldType
3. I'm not sure yet on the properties of the generated fields just yet.  Right now, I'm delegating the handling to the sub FieldType except I'm overriding to turn off storage, which I think is pretty cool (could even work as a copy field like functionality)
4. I'm not thrilled about creating a SchemaField every time in the createFields protected helper method, but SchemaField is final and doesn't have a setName method (which makes sense)

Questions for Yonik on his patch:
1. Why is TextField overriding getFieldQuery when it isn't called, except possibly via the FieldQParserPlugin?
2. I'm not sure I understand the getDistance, getBoundingBox methods on the GeoFieldType.   It seems like that precludes one from picking a specific distance (for instance, some times you may want a faster approx. and others a slower more accurate calculation)


Needs:
1. Write up changes.txt
2. More tests, including performance testing
3. Patch doesn't support dynamic fields yet, but it should

      was (Author: gsingers):
    OK, here's my take on this.  I took Yonik's and merged it w/ a patch I had in the works.  It's not done, but all tests pass, including the new on I added (PolyFieldTest).  Yonik's move to put getFieldQuery in FieldType was just the key to answering the question of how to generate queries given a FieldType.

Notes:
1. I changed the Geo examples to be CoordinateFieldType (representing an abstract coordinate system) and then PointFieldType which represents a point in an n-dimensional space (default 2D).  I think from this, we could easily add things like PolygonFieldType, etc. which would allow us to create more sophisticated shapes and do things like intersections, etc.  For instance, imagine saying:  Does this point lie within this shape?  I think that might be able to be expressed as a RangeQuery
2. I'm not sure I care for the name of the new abstract FieldType that is a base class of CoordinateFieldType called DelegatingFieldType
3. I'm not sure yet on the properties of the generated fields just yet.  Right now, I'm delegating the handling to the sub FieldType except I'm overriding to turn off storage, which I think is pretty cool (could even work as a copy field like functionality)
4. I'm not thrilled about creating a SchemaField every time in the createFields protected helper method, but SchemaField is final and doesn't have a setName method (which makes sense)

Questions for Yonik on his patch:
1. Why is TextField overriding getFieldQuery when it isn't called, except possibly via the FieldQParserPlugin?
2. I'm not sure I understand the getDistance, getBoundingBox methods on the GeoFieldType.   It seems like that precludes one from picking a specific distance (for instance, some times you may want a faster approx. and others a slower)


Needs:
1. Write up changes.txt
2. More tests, including performance testing
3. Patch doesn't support dynamic fields yet, but it should
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788163#action_12788163 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

{quote}
Also, you have subFieldType="double" in the schema... and that requires that the "double" field type be defined. Why not have subFieldSuffix="_d" and require the "_d" dynamic field be defined? Seems like the same complexity level
{quote}

I think it makes more sense for the subFieldType to be present to be tied to a type than a Field (subFieldSuffix), as it seems weird to have a field type have a dependency on a Field, whereas it seems fine for a field type to have a dependency on another field type.

{quote}
For a specific point implementation, that's fine. But if you use a point type that can do cartesian grid stuff, then you already have different field types. But I guess subFieldType="double" need only apply to some of the subfields (the ones that index the points).
{quote}
I'm not sure I see this.  If and when we implement CartesianPointType, it will still need to have a type for the sub fields (depending on the tiers specified) but I don't see why the subFieldType wouldn't be the same for all of them.  AIUI, they all have the same precision requirements.

I think part of what's missing is that for some of these attributes, it would be better for them to be field properties and not fieldType properties.  For instance for the Cartesian case, you will need to declare what levels to support.  If that is specified on the FieldType, then you have a proliferation of Field Type declarations, whereas if it is on the Field, that is a lot cleaner and less verbose.  I'm just not sure how that gets implemented just yet, as having to specify startTier and endTier doesn't seem like the same level as multiValued or stored.  

> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788121#action_12788121 ] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. IMO, unit tests can be too low level. They can also be too fragile. 

I guess it all comes down to what you call a unit.  

bq. t would be nice, for example, if testPointFieldType indexed a few couments (with various combinations of stored / indexed) and then queried the index,

This is done in testIndexing()

bq. TopDocs topDocs = core.getSearcher().get().search(bq, 1);

Yeah, see my comment there even!  I wanted a way to validate that the correct query is created, but I don't even really need to run a search for that.

{quote}
Rendundant null checks, trivial strings, etc:
+ assertNotNull("topDocs is null", topDocs);
+ assertTrue(topDocs.totalHits + " does not equal: " + 1, topDocs.totalHits == 1)
{quote}

I need to update my IntelliJ "Live Templates", as I have them setup to spit out a pattern like above

bq. Please see the DocumentBuilder changes I had added... 

Will do.

{quote}
Seems like SolrQueryParser should use getFieldQuery for everything (except TextField... but it could even be used for that if we make it such that we could call back to getBooleanQuery, etc). I had this in my patch.
{quote}

I thought I captured that, but will look again.


> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780954#action_12780954 ] 

Grant Ingersoll edited comment on SOLR-1131 at 11/21/09 12:11 PM:
------------------------------------------------------------------

I'm also looking for ideas on how to handle the naming of the fields that are produced by this.  I think a FieldType that produces multiple fields should hide the logistics of the naming, which this patch doesn't even begin to scratch the surface of and also on the search side, how does one search against just one of the fields?

Would appreciate thoughts on that.

      was (Author: gsingers):
    I'm also looking for ideas on how to handle the naming of the fields that are produced by this.  I think a FieldType that produces multiple fields should hide the logistics of the naming, which this patch doesn't even begin to scratch the surface of and also on the search side, how does one search against just one of the fields?

Would appreciated thoughts on that.
  
> Allow a single field type to index multiple fields
> --------------------------------------------------
>
>                 Key: SOLR-1131
>                 URL: https://issues.apache.org/jira/browse/SOLR-1131
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed in a variety of ways:
>  * geohash (sincle lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.