Posted to dev@nutch.apache.org by "Claudio Martella (JIRA)" <ji...@apache.org> on 2010/11/26 14:47:13 UTC

[jira] Created: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Added -dir command line option to Indexer and SolrIndexer,  allowing to specify directory containing segments
-------------------------------------------------------------------------------------------------------------

                 Key: NUTCH-939
                 URL: https://issues.apache.org/jira/browse/NUTCH-939
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 1.2
            Reporter: Claudio Martella
            Priority: Minor
             Fix For: 1.2


The patches add a -dir option so that the user can specify the directory that contains the segments. Currently the segments must be listed individually, which is awkward on HDFS. The -dir option is also already implemented in LinkDB and SegmentMerger, for example.
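
As a rough illustration of what a -dir option has to do, the sketch below expands one directory into the list of segment directories it contains. It is not taken from the attached patches: the class and method names (SegmentDirs, listSegments) are hypothetical, and it uses java.nio.file on a local filesystem for the sake of a self-contained example, whereas code running against HDFS would presumably go through Hadoop's FileSystem API instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not Nutch code: expands a segments directory into
// the list of segment paths it contains.
final class SegmentDirs {

    // Returns every immediate subdirectory of segmentsDir, sorted by name.
    // Plain files in segmentsDir are skipped.
    static List<Path> listSegments(Path segmentsDir) throws IOException {
        List<Path> segments = new ArrayList<>();
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(segmentsDir, entry -> Files.isDirectory(entry))) {
            for (Path entry : entries) {
                segments.add(entry);
            }
        }
        Collections.sort(segments);
        return segments;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory is the first argument, defaulting to ".".
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path segment : listSegments(dir)) {
            System.out.println(segment);
        }
    }
}
```

With a helper like this, the caller passes a single directory and no longer has to enumerate each segment by hand.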

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  resolved NUTCH-939.
-------------------------------------

    Resolution: Fixed
      Assignee: Andrzej Bialecki 

I modified the patch slightly to allow more flexibility (you can mix individual segment names and -dir options) and to allow segments located on different filesystems. Committed in rev. 1051505. Thank you!
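
The mixing of individual segment names and -dir options could look roughly like the sketch below. It is an illustration only, not the code committed in rev. 1051505: SegmentArgs, collect and the dirLister parameter are hypothetical names, the sketch handles only the segment portion of the command line, and the directory listing is passed in as a function so that the example runs without a Hadoop dependency. In a Hadoop-based version, each path would presumably be resolved against its own filesystem (for instance via Path.getFileSystem), which is what would allow segments on different filesystems.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not the committed Nutch code.
final class SegmentArgs {

    // Collects segment paths from args, preserving order:
    //   "-dir <path>" contributes every segment that dirLister reports for <path>;
    //   any other argument is taken as a single segment path.
    static List<String> collect(String[] args, Function<String, List<String>> dirLister) {
        List<String> segments = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-dir")) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-dir requires a directory argument");
                }
                // Consume the directory argument that follows -dir.
                segments.addAll(dirLister.apply(args[++i]));
            } else {
                segments.add(args[i]);
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // Stand-in lister for illustration: pretends each directory holds two segments.
        Function<String, List<String>> fakeLister =
            dir -> Arrays.asList(dir + "/seg1", dir + "/seg2");
        // Made-up paths: one explicit segment plus one -dir on another filesystem.
        String[] demo = {"crawl/segments/20101126", "-dir", "hdfs://nn/crawl2/segments"};
        System.out.println(collect(demo, fakeLister));
    }
}
```

Because the -dir handling sits inside the same loop as plain segment arguments, the option can appear any number of times and in any position among them.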



[jira] [Closed] (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Markus Jelsma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma closed NUTCH-939.
-------------------------------


Bulk close of resolved issues for 1.3.


[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Claudio Martella (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973706#action_12973706 ] 

Claudio Martella commented on NUTCH-939:
----------------------------------------

Great. What about the Indexer patch?



[jira] Updated: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Julien Nioche (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Nioche updated NUTCH-939:
--------------------------------

    Affects Version/s:     (was: 1.2)
                       1.3
        Fix Version/s:     (was: 1.2)
                       1.3

1.2 has been released. Marking as 1.3



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Markus Jelsma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12936003#action_12936003 ] 

Markus Jelsma commented on NUTCH-939:
-------------------------------------

This is a useful patch! Could you also submit a patch for trunk?



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Claudio Martella (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12936017#action_12936017 ] 

Claudio Martella commented on NUTCH-939:
----------------------------------------

Alex, that's exactly the idea behind this patch. Not all commands need segments, though, and it looks like all of those that do are covered after this patch. Did I miss any?



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Claudio Martella (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12936075#action_12936075 ] 

Claudio Martella commented on NUTCH-939:
----------------------------------------

Yes, that's what I guessed from looking at the code. I suppose this applies only to 1.x, then.



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973915#action_12973915 ] 

Andrzej Bialecki  commented on NUTCH-939:
-----------------------------------------

The 1.2 release is out, and branch-1.2 is unlikely to produce another release; most users seem to be interested in either 1.3 or trunk.



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Alex McLintock (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12936009#action_12936009 ] 

Alex McLintock commented on NUTCH-939:
--------------------------------------

Although I haven't checked these patches yet, it would be good to have a consistent way of specifying segments, such as a -dir directory option. It would be good if the option behaved the same everywhere.



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Claudio Martella (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12936014#action_12936014 ] 

Claudio Martella commented on NUTCH-939:
----------------------------------------

To be honest, I haven't looked closely at nutchbase. I just had a quick look, and the mechanism seems quite different: those jobs take no such command-line arguments. Or am I missing something?



[jira] Updated: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Claudio Martella (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated NUTCH-939:
-----------------------------------

    Attachment: SolrIndexer.patch
                Indexer.patch



[jira] Commented: (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12936047#action_12936047 ] 

Andrzej Bialecki  commented on NUTCH-939:
-----------------------------------------

Please note that trunk uses a very different method of working with segments (called batches there), and -dir is not applicable there.
