Posted to solr-dev@lucene.apache.org by "Jared Flatow (JIRA)" <ji...@apache.org> on 2009/01/29 03:27:02 UTC

[jira] Created: (SOLR-994) EnumeratedEntityTransformer

EnumeratedEntityTransformer
---------------------------

                 Key: SOLR-994
                 URL: https://issues.apache.org/jira/browse/SOLR-994
             Project: Solr
          Issue Type: New Feature
          Components: contrib - DataImportHandler
    Affects Versions: 1.4
            Reporter: Jared Flatow
            Priority: Minor
             Fix For: 1.4


An EnumeratedEntityTransformer numbers the entities it sees, so that the Nth entity has an accessible ${<entity>.n} == N. In addition, the entity may specify a chunkSize attribute, which causes the chunkSize'th entity to gain the attribute $hasMore=true. The entity may also specify a nextUrl template that differs from the url template.

Consider an API:

http://host:port/path/to/resource?maximum_number_returned=50&return_start_index=0

An entity could specify:

<entity name="myEntity" processor="XPathEntityProcessor" transformer="EnumeratedEntityTransformer" url="http://host:port/path/to/resource?maximum_number_returned=50&amp;return_start_index=${myEntity.n}" chunkSize="50">...</entity>

This allows fetching entities in chunks until a request returns fewer than chunkSize entities.
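
The counting rule described above could be sketched roughly as follows. This is only an illustration and not the attached patch: the class name EnumeratingSketch is hypothetical, it uses plain maps instead of Solr classes, and it assumes that the counter keeps running across requests so that every multiple of chunkSize is flagged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed transformer; it does not use Solr classes.
class EnumeratingSketch {
    private final int chunkSize;
    private int n = 0; // running count of rows seen so far

    EnumeratingSketch(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    // Numbers the row and flags it when it completes a full chunk.
    Map<String, Object> transformRow(Map<String, Object> row) {
        n++;
        row.put("n", n);
        // Assumption: every multiple of chunkSize marks the end of a full chunk.
        if (n % chunkSize == 0) {
            row.put("$hasMore", "true");
        }
        return row;
    }

    public static void main(String[] args) {
        EnumeratingSketch sketch = new EnumeratingSketch(3);
        for (int i = 0; i < 4; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("id", "doc" + i);
            System.out.println(sketch.transformRow(row));
        }
    }
}
```

With a chunkSize of 3, only the third row carries $hasMore, and its n value of 3 is what the url template would use as the next start index.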

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (SOLR-994) EnumeratedEntityTransformer

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
But it is not yet accessible through the VariableResolver.

On Sun, May 3, 2009 at 12:30 AM, Shalin Shekhar Mangar (JIRA)
<ji...@apache.org> wrote:
>
>    [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705318#action_12705318 ]
>
> Shalin Shekhar Mangar commented on SOLR-994:
> --------------------------------------------
>
> bq. One extra addition I can think of is to put an implicit variable 'rowsFetchedCount' into the VariableResolver so that it can be used directly.
>
> With SOLR-989, Context exposes the statistics map, so rowCount is already available. Should we close this issue?



-- 
--Noble Paul

[jira] Commented: (SOLR-994) EnumeratedEntityTransformer

Posted by "Jared Flatow (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668350#action_12668350 ] 

Jared Flatow commented on SOLR-994:
-----------------------------------

The transformer must somehow know when $hasMore should be true. If the transformer always gives $hasMore the value "true", will infinite requests be made, or will it stop on the first empty request? Using the EnumeratedEntityTransformer, a user can specify in the configuration XML when $hasMore should be true, using the chunkSize attribute. This solves the general case of "request N rows at a time until no more are available". I agree that a combination of 'rowsFetchedCount' and a HasMoreUntilEmptyTransformer would also make this doable from the configuration.
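
To compare the two stopping rules, here is a small simulation. It is a sketch under stated assumptions, not a description of what XPathEntityProcessor actually does: PagingSketch and its methods are hypothetical names, the remote service is faked by simple arithmetic, and the "until empty" variant assumes the processor stops once a request returns no rows.

```java
// Hypothetical simulation; no Solr classes or real HTTP requests are involved.
class PagingSketch {
    // Number of rows a fake service holding `total` rows returns for one request.
    static int fetch(int total, int start, int max) {
        return Math.max(0, Math.min(max, total - start));
    }

    // chunkSize rule: stop as soon as a request returns fewer than chunkSize rows.
    static int requestsWithChunkSize(int total, int chunkSize) {
        int start = 0;
        int requests = 0;
        while (true) {
            int got = fetch(total, start, chunkSize);
            requests++;
            if (got < chunkSize) {
                return requests;
            }
            start += got;
        }
    }

    // Always-true rule: keep going until a request comes back empty (assumed behavior).
    static int requestsUntilEmpty(int total, int chunkSize) {
        int start = 0;
        int requests = 0;
        while (true) {
            int got = fetch(total, start, chunkSize);
            requests++;
            if (got == 0) {
                return requests;
            }
            start += got;
        }
    }

    public static void main(String[] args) {
        System.out.println(requestsWithChunkSize(120, 50)); // prints 3
        System.out.println(requestsUntilEmpty(120, 50));    // prints 4
    }
}
```

Under these assumptions both rules terminate. The chunkSize rule saves the final empty request whenever the last chunk is only partially filled, and the two rules issue the same number of requests when the total is an exact multiple of chunkSize.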



[jira] Updated: (SOLR-994) EnumeratedEntityTransformer

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-994:
----------------------------

    Fix Version/s:     (was: 1.4)
                   1.5

It is currently possible to add this Transformer explicitly. A more elegant solution would be to make it happen automatically without a Transformer. That could be a bigger initiative, so I am pushing this to 1.5.



[jira] Assigned: (SOLR-994) EnumeratedEntityTransformer

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar reassigned SOLR-994:
------------------------------------------

    Assignee: Shalin Shekhar Mangar



[jira] Commented: (SOLR-994) EnumeratedEntityTransformer

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705318#action_12705318 ] 

Shalin Shekhar Mangar commented on SOLR-994:
--------------------------------------------

bq. One extra addition I can think of is to put an implicit variable 'rowsFetchedCount' into the VariableResolver so that it can be used directly.

With SOLR-989, Context exposes the statistics map, so rowCount is already available. Should we close this issue?



[jira] Commented: (SOLR-994) EnumeratedEntityTransformer

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668348#action_12668348 ] 

Noble Paul commented on SOLR-994:
---------------------------------

I guess this is already possible without any change.

{code}
<entity name="myEntity" processor="XPathEntityProcessor" transformer="MyTrans" url="http://host:port/path/to/resource?maximum_number_returned=50&amp;return_start_index=${myEntity.somename}" chunkSize="50"></entity>
{code}

The MyTrans transformer can put a variable called 'somename' into the row, plus one extra variable called '$hasMore' with the value "true". XPathEntityProcessor automatically picks up the value of 'somename' and makes another request after the current set of rows has been processed.
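
To illustrate how a value such as 'somename' could end up in the next request URL, here is a minimal placeholder-substitution sketch. It is not the DataImportHandler's own resolver: TemplateSketch and resolve are hypothetical names, variables are looked up by their full dotted name in a flat map, and replacing an unknown placeholder with an empty string is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the real resolver in the DataImportHandler may work differently.
class TemplateSketch {
    // Matches ${name} and captures the name between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String resolve(String template, Map<String, Object> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = vars.get(m.group(1));
            // Assumption: an unknown variable resolves to the empty string.
            String replacement = (value == null) ? "" : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("myEntity.somename", 50);
        String url = "http://host:port/path/to/resource?maximum_number_returned=50"
                + "&return_start_index=${myEntity.somename}";
        System.out.println(resolve(url, vars));
    }
}
```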

One extra addition I can think of is to put an implicit variable 'rowsFetchedCount' into the VariableResolver so that it can be used directly.



[jira] Updated: (SOLR-994) EnumeratedEntityTransformer

Posted by "Jared Flatow (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jared Flatow updated SOLR-994:
------------------------------

    Attachment: SOLR-994.patch



[jira] Updated: (SOLR-994) EnumeratedEntityTransformer

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-994:
---------------------------------------

    Assignee:     (was: Shalin Shekhar Mangar)
