Posted to user@manifoldcf.apache.org by Erlend Garåsen <e....@usit.uio.no> on 2012/06/27 13:27:31 UTC

Exporting crawler configuration easier?

We keep all configuration files for our search project in SVN, 
including our MCF crawler configuration. Each time we change our MCF 
settings, e.g. add something to the seed list, we usually export the 
configuration and commit that change to SVN.

This can be a time-consuming process, since we have to unzip the 
generated export file in order to edit the files within it. In 
particular, we need to edit the output file, which includes the 
password to our Solr server.

Then we must zip all these files again to create an equivalent export 
file. The order of the files is very important: you cannot simply 
create a zip file without paying attention to the order of the included 
files. Otherwise, MCF will complain when you try to import that file 
later.

Any suggestions for a smoother way to have a version-controlled 
configuration? Perhaps I should create a script which does all the steps 
mentioned above? As far as I know, it's not possible to edit the files 
directly inside a zip file from a terminal on UNIX.
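
A script for the unzip/edit/re-zip cycle described above could look 
roughly like the following. This is a minimal sketch, assuming the 
export is a standard zip archive: it copies every entry to a new archive 
in the original order and passes each entry's bytes through a 
caller-supplied function. The names rewrite_zip and blank_passwords, the 
entry name "outputs", and the password="..." attribute pattern are 
hypothetical illustrations, not MCF's actual export layout.

```python
import re
import zipfile


def rewrite_zip(src_path, dst_path, transform):
    """Copy all entries from src_path to dst_path in their original order.

    transform(name, data) receives each entry's name and bytes and returns
    the bytes to store in the new archive.
    """
    with zipfile.ZipFile(src_path, "r") as src, \
            zipfile.ZipFile(dst_path, "w") as dst:
        # infolist() returns entries in the order they appear in the archive.
        for info in src.infolist():
            data = transform(info.filename, src.read(info.filename))
            # Use a fresh ZipInfo so sizes and CRC are computed for the new data.
            new_info = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            new_info.compress_type = info.compress_type
            new_info.external_attr = info.external_attr
            dst.writestr(new_info, data)


def blank_passwords(name, data):
    # Hypothetical: assumes an entry named "outputs" that stores the password
    # as a password="..." attribute. Adjust to the real export contents.
    if name == "outputs":
        return re.sub(rb'password="[^"]*"', b'password=""', data)
    return data
```

Calling rewrite_zip with an input path, an output path, and 
blank_passwords would then produce a copy of the export with the same 
entry order and the matching attribute blanked. Whether MCF accepts such 
a rewritten file on import has not been verified here.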

Thanks,
Erlend

-- 
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050


Re: Exporting crawler configuration easier?

Posted by Erlend Garåsen <e....@usit.uio.no>.
On 27.06.12 13.33, Karl Wright wrote:
> The export happens to be a zip, but that is not meant to be used to
> actually edit the stored information.
>
> It sounds like the reason that you want to edit it is to remove the
> passwords from the file.  Perhaps we should look at it from that point
> of view and add an export option that does not include any passwords?

I agree, but I think adding that functionality will be somewhat 
complicated, for reasons we can discuss further on the dev list.

Erlend



Re: Exporting crawler configuration easier?

Posted by Erlend Garåsen <e....@usit.uio.no>.
Maybe we should add an argument to the 
"org.apache.manifoldcf.crawler.ExportConfiguration" command class for 
skipping passwords in the export file?

For example:
executecommand.sh ...ExportConfiguration export_file.zip nopass

I'm looking at the source code right now. Currently, the export simply 
writes the complete map returned by the method getConfigParams() as XML.

I suggest that we create a new method solely for the exports:
getConfigParamsWithoutPassword();
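
The filtering idea behind such a method can be sketched as follows. 
This is illustrative Python rather than MCF's Java, and it assumes that 
password parameters can be recognized by the word "password" in their 
key, which nothing in this thread confirms. The function name 
without_passwords and the marker default are hypothetical.

```python
from typing import Dict


def without_passwords(params: Dict[str, str], marker: str = "password") -> Dict[str, str]:
    """Return a copy of params without keys that contain marker.

    The match is case-insensitive. Assumption: password parameters are
    identifiable by their key name alone.
    """
    return {key: value for key, value in params.items()
            if marker not in key.lower()}
```

The original map is left untouched, so the same parameters could still 
be used for a full export when passwords are wanted.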

I had hoped to test my changes locally using the quick-start example 
(Jetty), but it seems that I need to run a multiprocess setup.

If my approach seems reasonable, I can work on this issue. Since I'm 
going to Shanghai tomorrow, I'm afraid I will have to finish my 
contribution once I'm back.

Erlend




Re: Exporting crawler configuration easier?

Posted by Karl Wright <da...@gmail.com>.
The export happens to be a zip, but that is not meant to be used to
actually edit the stored information.

It sounds like the reason that you want to edit it is to remove the
passwords from the file.  Perhaps we should look at it from that point
of view and add an export option that does not include any passwords?

Karl
