Posted to dev@nutch.apache.org by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2009/02/20 11:33:01 UTC

[jira] Created: (NUTCH-699) Add an "official" solr schema for solr integration

Add an "official" solr schema for solr integration
--------------------------------------------------

                 Key: NUTCH-699
                 URL: https://issues.apache.org/jira/browse/NUTCH-699
             Project: Nutch
          Issue Type: New Feature
          Components: indexer
            Reporter: Doğacan Güney
            Assignee: Doğacan Güney
             Fix For: 1.0.0


See Andrzej's comments on NUTCH-684 for more info.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676233#action_12676233 ] 

Sami Siren commented on NUTCH-699:
----------------------------------

We could put it under conf/?



[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682223#action_12682223 ] 

Dmitry Lihachev commented on NUTCH-699:
---------------------------------------

In some cases (e.g. when using MoreIndexingFilter) the 'title' field might have more than one value. This causes an exception when indexing with Solr. Adding the attribute multiValued="true" to the title definition solves the problem.
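
A minimal sketch of that change, assuming the title field is otherwise defined as in the index-basic section of the schema proposed in this thread (type="text", stored and indexed); the committed schema may differ:
{noformat}
  <!-- assumption: other attributes copied from the proposed schema; only multiValued is new -->
  <field name="title" type="text" stored="true" indexed="true" multiValued="true"/>
{noformat}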



[jira] Resolved: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sami Siren resolved NUTCH-699.
------------------------------

    Resolution: Fixed

committed



[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Dmitry Lihachev (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675324#action_12675324 ] 

Dmitry Lihachev commented on NUTCH-699:
---------------------------------------

I think we must extend the field set for each plugin, as in the code below:
{noformat} 
<fields>
  <field name="id" type="string" stored="true" indexed="true"/>

  <!-- core fields -->
  <field name="segment" type="string" stored="true" indexed="false"/>
  <field name="digest" type="string" stored="true" indexed="false"/>
  <field name="boost" type="float" stored="true" indexed="false"/>

  <!-- fields for index-basic plugin -->
  <field name="host" type="url" stored="false" indexed="true"/>
  <field name="site" type="string" stored="false" indexed="true"/>
  <field name="url" type="url" stored="true" indexed="true" required="true"/>
  <field name="content" type="text" stored="false" indexed="true"/>
  <field name="title" type="text" stored="true" indexed="true"/>
  <field name="cache" type="string" stored="true" indexed="false"/>
  <field name="tstamp" type="long" stored="true" indexed="false"/>

  <!-- fields for index-anchor plugin -->
  <field name="anchor" type="string" stored="true" indexed="true" multiValued="true"/>

  <!-- fields for index-more plugin -->
  <field name="type" type="string" stored="true" indexed="true" multiValued="true"/>
  <field name="contentLength" type="long" stored="true" indexed="false"/>
  <field name="lastModified" type="long" stored="true" indexed="false"/>
  <field name="date" type="string" stored="true" indexed="true"/>

  <!-- fields for languageidentifier plugin -->
  <field name="lang" type="string" stored="true" indexed="true"/>

  <!-- fields for subcollection plugin -->
  <field name="subcollection" type="string" stored="true" indexed="true"/>

  <!-- fields for feed plugin -->
  <field name="author" type="string" stored="true" indexed="true"/>
  <field name="tag" type="string" stored="true" indexed="true"/>
  <field name="feed" type="string" stored="true" indexed="true"/>
  <field name="publishedDate" type="string" stored="true" indexed="true"/>
  <field name="updatedDate" type="string" stored="true" indexed="true"/>

</fields>

{noformat} 



[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677647#action_12677647 ] 

Hudson commented on NUTCH-699:
------------------------------

Integrated in Nutch-trunk #738 (See [http://hudson.zones.apache.org/hudson/job/Nutch-trunk/738/])
     - Add an "official" solr schema for solr integration. Contributed by dogacan, Dmitry Lihachev




[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676245#action_12676245 ] 

Andrzej Bialecki  commented on NUTCH-699:
-----------------------------------------

+1. Perhaps add a comment on top that explains what users are supposed to do with this file.



[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675317#action_12675317 ] 

Doğacan Güney commented on NUTCH-699:
-------------------------------------

Schema in NUTCH-442 may be a good starting point.

Question: where do you think is a good place for this schema in the Nutch codebase?
