Posted to dev@nutch.apache.org by Stefan Groschupf <sg...@media-style.com> on 2006/07/24 20:17:06 UTC
segread vs. readseg
Hi developers,
We have commands like readdb and readlinkdb, but segread. Wouldn't it be
more consistent to name the command readseg instead of segread?
... just a thought.
Stefan
Re: segread vs. readseg
Posted by Stefan Groschupf <sg...@media-style.com>.
I like it!
On 24.07.2006 at 16:10, Andrzej Bialecki wrote:
> Stefan Neufeind wrote:
>> Andrzej Bialecki wrote:
>>> Stefan Groschupf wrote:
>>>> Hi developers,
>>>>
>>>> We have commands like readdb and readlinkdb, but segread. Wouldn't it
>>>> be more consistent to name the command readseg instead of segread?
>>>> ... just a thought.
>>>
>>> Yes, it seems more consistent. However, if we change it then
>>> scripts people wrote would break. We could support both aliases
>>> in 0.8, and give a deprecation message.
>>>
>>> What do others think?
>>
>> Same feeling here. Agreed.
>
> What about the following?
>
> Index: bin/nutch
> ===================================================================
> --- bin/nutch (revision 424960)
> +++ bin/nutch (working copy)
> @@ -40,7 +40,7 @@
> echo " generate generate new segments to fetch"
> echo " fetch fetch a segment's pages"
> echo " parse parse a segment's pages"
> - echo " segread read / dump segment data"
> + echo " readseg read / dump segment data"
> echo " mergesegs merge several segments, with optional filtering and slicing"
> echo " updatedb update crawl db from segments after fetching"
> echo " invertlinks create a linkdb from parsed segments"
> @@ -158,7 +158,10 @@
> CLASS=org.apache.nutch.crawl.CrawlDbMerger
> elif [ "$COMMAND" = "readlinkdb" ] ; then
> CLASS=org.apache.nutch.crawl.LinkDbReader
> +elif [ "$COMMAND" = "readseg" ] ; then
> + CLASS=org.apache.nutch.segment.SegmentReader
> elif [ "$COMMAND" = "segread" ] ; then
> + echo "[DEPRECATED] Command 'segread' is deprecated, use 'readseg' instead."
> CLASS=org.apache.nutch.segment.SegmentReader
> elif [ "$COMMAND" = "mergesegs" ] ; then
> CLASS=org.apache.nutch.segment.SegmentMerger
>
>
> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
>
>
>
Re: segread vs. readseg
Posted by Andrzej Bialecki <ab...@getopt.org>.
Stefan Neufeind wrote:
> Andrzej Bialecki wrote:
>> Stefan Groschupf wrote:
>>> Hi developers,
>>>
>>> We have commands like readdb and readlinkdb, but segread. Wouldn't it be
>>> more consistent to name the command readseg instead of segread?
>>> ... just a thought.
>>
>> Yes, it seems more consistent. However, if we change it then scripts
>> people wrote would break. We could support both aliases in 0.8, and
>> give a deprecation message.
>>
>> What do others think?
>
> Same feeling here. Agreed.
What about the following?
Index: bin/nutch
===================================================================
--- bin/nutch (revision 424960)
+++ bin/nutch (working copy)
@@ -40,7 +40,7 @@
echo " generate generate new segments to fetch"
echo " fetch fetch a segment's pages"
echo " parse parse a segment's pages"
- echo " segread read / dump segment data"
+ echo " readseg read / dump segment data"
echo " mergesegs merge several segments, with optional filtering and slicing"
echo " updatedb update crawl db from segments after fetching"
echo " invertlinks create a linkdb from parsed segments"
@@ -158,7 +158,10 @@
CLASS=org.apache.nutch.crawl.CrawlDbMerger
elif [ "$COMMAND" = "readlinkdb" ] ; then
CLASS=org.apache.nutch.crawl.LinkDbReader
+elif [ "$COMMAND" = "readseg" ] ; then
+ CLASS=org.apache.nutch.segment.SegmentReader
elif [ "$COMMAND" = "segread" ] ; then
+ echo "[DEPRECATED] Command 'segread' is deprecated, use 'readseg' instead."
CLASS=org.apache.nutch.segment.SegmentReader
elif [ "$COMMAND" = "mergesegs" ] ; then
CLASS=org.apache.nutch.segment.SegmentMerger
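
For readers who want to try the aliasing idea outside of bin/nutch, here is a minimal sketch of the same dispatch written as a case statement. The function name resolve_class is hypothetical and not part of Nutch; only the command names, class names, and message text come from the patch above. Unlike the patch, which echoes the notice to standard output, this sketch assumes it is preferable to send the warning to standard error so that scripts capturing the output of segread are not affected.

```shell
#!/bin/sh
# Hypothetical helper (not part of bin/nutch): map a command name to the
# class that implements it, accepting the old name as a deprecated alias.
resolve_class() {
  case "$1" in
    readseg)
      # New, consistent name.
      echo "org.apache.nutch.segment.SegmentReader"
      ;;
    segread)
      # Old name still resolves, but the warning goes to stderr
      # (an assumption of this sketch; the patch prints to stdout).
      echo "[DEPRECATED] Command 'segread' is deprecated, use 'readseg' instead." >&2
      echo "org.apache.nutch.segment.SegmentReader"
      ;;
    mergesegs)
      echo "org.apache.nutch.segment.SegmentMerger"
      ;;
    *)
      # Anything else is rejected with a non-zero status.
      echo "unknown command: $1" >&2
      return 1
      ;;
  esac
}
```

A caller could then write CLASS=$(resolve_class "$COMMAND") || exit 1, which keeps the class name on standard output and the deprecation notice on standard error.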
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: segread vs. readseg
Posted by Stefan Neufeind <ap...@stefan-neufeind.de>.
Andrzej Bialecki wrote:
> Stefan Groschupf wrote:
>> Hi developers,
>>
>> We have commands like readdb and readlinkdb, but segread. Wouldn't it be
>> more consistent to name the command readseg instead of segread?
>> ... just a thought.
>
> Yes, it seems more consistent. However, if we change it then scripts
> people wrote would break. We could support both aliases in 0.8, and give
> a deprecation message.
>
> What do others think?
Same feeling here. Agreed.
Stefan
Re: segread vs. readseg
Posted by Andrzej Bialecki <ab...@getopt.org>.
Stefan Groschupf wrote:
> Hi developers,
>
> We have commands like readdb and readlinkdb, but segread. Wouldn't it be
> more consistent to name the command readseg instead of segread?
> ... just a thought.
Yes, it seems more consistent. However, if we change it then scripts
people wrote would break. We could support both aliases in 0.8, and give
a deprecation message.
What do others think?
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com