Posted to docs-dev@perl.apache.org by Randy Kobes <ra...@theoryx5.uwinnipeg.ca> on 2002/12/03 17:31:53 UTC

Re: site mirrors

On Tue, 3 Dec 2002, Stas Bekman wrote:

> So I see that Randy has put up a fully functional mirror of 
> perl.apache.org: http://theoryx5.uwinnipeg.ca/modperl/index.html
> which is very fast (faster than perl.apache.org :). How often does it 
> get updated Randy?

I do it once a day, including regenerating the swish-e indices.

> Do we have any other mirrors already? I remember Thomas was talking 
> about making one.
> 
> Even though I was reluctant to have a mirrors page, we probably should
> have one for official mirrors, as long as we make sure that the listed
> mirrors are up-to-date and perform a full mirror. You can't imagine how
> many outdated copies of the mod_perl guide can be found on the web. Some 
> of them are 3 years old :(

There was talk earlier of making up a script to fetch some
oft-changed page from perl.apache.org (the top-level index.html?)
and compare the time-stamp with that of a mirror. Here's a
start:

=================================================================
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Date;
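my $master = 'perl.apache.org';
# Without the master's timestamp there is nothing to compare against
my $master_time = last_modified($master)
  or die "Cannot get Last-Modified time for $master\n";
while (<DATA>) {
  chomp;
  next unless /\S/;   # skip blank lines in the mirror list
  my $mirror_time;
  unless ($mirror_time = last_modified($_)) {
    warn "Cannot get Last-Modified time for $_\n";
    next;
  }
  # Difference in days; positive means the mirror is older than the master
  my $diff = ($master_time - $mirror_time) / 86400;
  printf("%s is %.2f days out of sync\n", $_, $diff) if ($diff > 0);
}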

sub last_modified {
  my $site = shift;
  my $ua = LWP::UserAgent->new();
  my $req = HTTP::Request->new(HEAD => "http://$site/index.html");
  my $res = $ua->request($req);
  if ($res->is_success) {
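    # Read the header directly, so a missing Last-Modified cannot
    # leave a stale $2 from an earlier match
    my $lm = $res->header('Last-Modified');
    my $time = defined $lm ? str2time($lm) : undef;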
    unless ($time) {
      warn "Couldn't determine Last-Modified time for $site\n";
      return undef;
    }
    return $time;
  }
  else {
    warn ("Error for $site: " . $res->status_line . "\n");
    return undef;
  }
}

__DATA__
theoryx5.uwinnipeg.ca/modperl

=============================================================
-- 
best regards,
randy


---------------------------------------------------------------------
To unsubscribe, e-mail: docs-dev-unsubscribe@perl.apache.org
For additional commands, e-mail: docs-dev-help@perl.apache.org


Re: cleaning mirrors

Posted by Bill Moseley <mo...@hank.org>.
On Wed, 4 Dec 2002, Stas Bekman wrote:

> Randy, do you have any idea why the same file exists in two places?
> http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=Controlling+the+Apache%3A%3ATest&sbm=&submit=search
> 
> docs/2.0/devel/testing/testing.pod has been moved to
> docs/general/testing/testing.pod a few months ago, and I understand that 
> the old autogenerated file may reside in the dest dir. What I don't 
> understand is how it was picked by swish-e, since it shouldn't be linked 
> from any place. e.g. it's not linked from: 
> http://theoryx5.uwinnipeg.ca/modperl/docs/2.0/devel/index.html and also 
> not from: http://perl.apache.org/sitemap.html
> 
> Ideas? can you please grep for testing/testing.html in your dst_html?

We could turn on indexing of hrefs in swish and then you can find what
links to a given page in the collection.



-- 
Bill Moseley moseley@hank.org




cleaning mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy, do you have any idea why the same file exists in two places?
http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=Controlling+the+Apache%3A%3ATest&sbm=&submit=search

docs/2.0/devel/testing/testing.pod has been moved to
docs/general/testing/testing.pod a few months ago, and I understand that 
the old autogenerated file may reside in the dest dir. What I don't 
understand is how it was picked by swish-e, since it shouldn't be linked 
from any place. e.g. it's not linked from: 
http://theoryx5.uwinnipeg.ca/modperl/docs/2.0/devel/index.html and also 
not from: http://perl.apache.org/sitemap.html

Ideas? can you please grep for testing/testing.html in your dst_html?

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Bill Moseley <mo...@hank.org>.
At 01:12 PM 12/04/02 +0800, Stas Bekman wrote:
>> ==================================================================
>> Alias /modperl/ "/usr/local/modperl-docs/dst_html/"
>> <Directory "/usr/local/modperl-docs/dst_html">
>>     Options Indexes MultiViews

Is MultiViews so you can request PDF over HTML?

-- 
Bill Moseley
mailto:moseley@hank.org



Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy Kobes wrote:
> On Wed, 4 Dec 2002, Stas Bekman wrote:
> [ .. ]
> 
>>    But encouraging cvs mirrors sounds like a good idea to me.
>>We could also offer the cron script which syncs the cvs, rebuilds the 
>>docs and indices.
> 
> 
> That sounds good - makes it as easy as possible to do a
> full mirror ... Here's the cron script I use ...
> =================================================================
> #!/bin/bash
> export PATH=/usr/local/bin:/bin:/usr/bin
> export MODPERL_SITE='http://theoryx5.uwinnipeg.ca/modperl'
> export SWISH_BINARY_PATH='/usr/local/bin/swish-e'
> cd /usr/local/modperl-docs
> cvs -z9 up -dR
> bin/build
> bin/makeindex

The problem here is that PDFs don't get created/updated, and if the
templates change, the change won't be picked up (you need to force a
rebuild once in a while). See my cvs commit with a more elaborate cron
script, which is tuned to handle all situations while trying to use as
little CPU as possible.
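
A minimal sketch of the "force a rebuild once in a while" idea, not the cron script from the cvs commit mentioned above. It assumes a weekly forced build on Sundays is often enough; build_flags and run_mirror_update are hypothetical names. Only bin/build -f (the forced build used elsewhere in this thread) is relied on; which option regenerates the PDFs is not stated here, so check bin/build itself before adding one.

```shell
#!/bin/bash
# build_flags DAY_OF_WEEK: print the bin/build flags for that day
# (1 = Monday .. 7 = Sunday, as printed by `date +%u`).
# Assumption: one forced rebuild per week, on Sunday.
build_flags() {
  if [ "$1" -eq 7 ]; then
    echo "-f"
  else
    echo ""
  fi
}

# run_mirror_update: sync cvs, rebuild the docs, regenerate the indices.
# Paths are the ones from Randy's cron script.
run_mirror_update() {
  cd /usr/local/modperl-docs || return 1
  cvs -z9 up -dR || return 1
  # The substitution is left unquoted on purpose, so that an empty
  # flag string adds no argument to bin/build.
  bin/build $(build_flags "$(date +%u)") || return 1
  bin/makeindex
}
```

Calling run_mirror_update from the daily cron entry would then do a normal incremental build six days a week and a forced one on the seventh.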

> ==================================================================
> The docs are in /usr/local/modperl-docs, and my 
> httpd.conf has
> ==================================================================
> Alias /modperl/ "/usr/local/modperl-docs/dst_html/"
> <Directory "/usr/local/modperl-docs/dst_html">
>     Options Indexes MultiViews
>     AllowOverride None
>     Order allow,deny
>     Allow from all
> </Directory>
> <Directory "/usr/local/modperl-docs/dst_html/search">
>     SetEnv SWISH_BINARY_PATH "/usr/local/bin/swish-e"
>     SetEnv PERL5LIB "/usr/local/modperl-docs/dst_html/search/modules"
>     Options +ExecCGI
>     AddHandler cgi-script cgi
> </Directory>
> =====================================================================

That's cool, I've added it to the /docs/download/docs.html, Thanks!

> The site layout and the search have been really
> well thought out, making it easy to mirror.

;)

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Randy Kobes <ra...@theoryx5.uwinnipeg.ca>.
On Wed, 4 Dec 2002, Stas Bekman wrote:
[ .. ]
>     But encouraging cvs mirrors sounds like a good idea to me.
> We could also offer the cron script which syncs the cvs, rebuilds the 
> docs and indices.

That sounds good - makes it as easy as possible to do a
full mirror ... Here's the cron script I use ...
=================================================================
#!/bin/bash
export PATH=/usr/local/bin:/bin:/usr/bin
export MODPERL_SITE='http://theoryx5.uwinnipeg.ca/modperl'
export SWISH_BINARY_PATH='/usr/local/bin/swish-e'
cd /usr/local/modperl-docs
cvs -z9 up -dR
bin/build
bin/makeindex
==================================================================
The docs are in /usr/local/modperl-docs, and my 
httpd.conf has
==================================================================
Alias /modperl/ "/usr/local/modperl-docs/dst_html/"
<Directory "/usr/local/modperl-docs/dst_html">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
<Directory "/usr/local/modperl-docs/dst_html/search">
    SetEnv SWISH_BINARY_PATH "/usr/local/bin/swish-e"
    SetEnv PERL5LIB "/usr/local/modperl-docs/dst_html/search/modules"
    Options +ExecCGI
    AddHandler cgi-script cgi
</Directory>
=====================================================================

The site layout and the search have been really
well thought out, making it easy to mirror.

-- 
best regards,
randy




Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy Kobes wrote:
> On Wed, 4 Dec 2002, Stas Bekman wrote:
> 
> 
>>Randy Kobes wrote:
>>
>>>There was talk earlier of making up a script to fetch some
>>>oft-changed page from perl.apache.org (the top-level index.html?)
>>>and compare the time-stamp with that of a mirror. Here's a
>>>start:
>>
>>you are talking about real mirroring, not 'cvs up && bin/build', right?
> 
> 
> I'm not sure I appreciate the difference ... 

The differences:

- you don't get the swish indexes
- an incorrectly set up mirror might use a lot of bandwidth, since the whole
site with PDFs currently weighs ~30MB and is growing.

> I'm doing a 'cvs up
> && bin/build', in part because it's just as easy as using a
> mirroring program, but also because it's more natural to do it
> that way to regenerate the swish-e indices locally. But apart
> from generating these indices, should it make much of a
> difference here in how the site is mirrored, as long as it's done
> often enough?

Nope. But encouraging cvs mirrors sounds like a good idea to me.
We could also offer the cron script which syncs the cvs, rebuilds the 
docs and indices.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Randy Kobes <ra...@theoryx5.uwinnipeg.ca>.
On Wed, 4 Dec 2002, Stas Bekman wrote:

> Randy Kobes wrote:
> > 
> > There was talk earlier of making up a script to fetch some
> > oft-changed page from perl.apache.org (the top-level index.html?)
> > and compare the time-stamp with that of a mirror. Here's a
> > start:
> 
> you are talking about real mirroring, not 'cvs up && bin/build', right?

I'm not sure I appreciate the difference ... I'm doing a 'cvs up
&& bin/build', in part because it's just as easy as using a
mirroring program, but also because it's more natural to do it
that way to regenerate the swish-e indices locally. But apart
from generating these indices, should it make much of a
difference here in how the site is mirrored, as long as it's done
often enough?

-- 
best regards,
randy




Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy Kobes wrote:
> On Tue, 3 Dec 2002, Stas Bekman wrote:
> 
> 
>>So I see that Randy has put up a fully functional mirror of 
>>perl.apache.org: http://theoryx5.uwinnipeg.ca/modperl/index.html
>>which is very fast (faster than perl.apache.org :). How often does it 
>>get updated Randy?
> 
> 
> I do it once a day, including regenerating the swish-e indices.

Great!

>>Do we have any other mirrors already? I remember Thomas was talking 
>>about making one.
>>
>>Even though I was reluctant to have a mirrors page, we probably should
>>have one for official mirrors, as long as we make sure that the listed
>>mirrors are up-to-date and perform a full mirror. You can't imagine how
>>many outdated copies of the mod_perl guide can be found on the web. Some 
>>of them are 3 years old :(
> 
> 
> There was talk earlier of making up a script to fetch some
> oft-changed page from perl.apache.org (the top-level index.html?)
> and compare the time-stamp with that of a mirror. Here's a
> start:

you are talking about real mirroring, not 'cvs up && bin/build', right?


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Bill Moseley <mo...@hank.org>.
At 10:16 PM 12/04/02 -0600, Randy Kobes wrote:
>Bill, how do you tell swish-e to record from where a page is
>referred from in the indexing? Perhaps this might also give a
>clue as to why there's also missing pages ...

 http://swish-e.org/current/docs/SWISH-CONFIG.html#item_HTMLLinksMetaName

So basically you can add to the config file:

  HTMLLinksMetaName links

then if you want to find out where foo.html is linked:

  -w 'links=("foo.html")'
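
As a rough sketch of how that query could be wrapped for repeated use: links_query is a hypothetical helper, and it assumes the index was built with the HTMLLinksMetaName setting shown above. The index file name in the usage comment is a placeholder.

```shell
# links_query PAGE: print the swish-e -w expression that searches the
# "links" metaname for PAGE.
links_query() {
  printf 'links=("%s")\n' "$1"
}

# Usage (index.swish-e is a placeholder for the real index file):
#   swish-e -f index.swish-e -w "$(links_query testing/testing.html)"
```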





-- 
Bill Moseley
mailto:moseley@hank.org



Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
>>Bill, should we upgrade the binary on daedalus as well? I'm not
>>symlinking to your copy. I use a real copy. Or should we leave things as is?
> 
> 
> http://www.swish-e.org/current/docs/CHANGES.html doesn't show anything
> that affects what we are doing.  Might as well wait for 2.4.  I would have
> thought that HTMLLinksMetaName was in our version, but I'd have to look at
> the cvs to see when that was added.

ok

> In cvs swish.cgi can use the swish-e library and also has a feature
> suggested by Brian B. to run /bin/ps and not run swish-e if there's too
> many running already.
> 
> search.apache.org was getting hit hard and there were a bunch of swish-e
> processes using up resources.  My tests showed that running the script
> under mod_perl with max_clients set only to two was a better solution
> than running /bin/ps ;)
> 
> It might be helpful if search.apache.org wasn't listed as an example in the
> flood docs!
> 
> Hey, is there any reason to have search.apache.org index perl.apache.org?
> I'm indexing almost 80,000 files (Jakarta has 10,000 dirs)  and it
> generates a 200M index file so any
> reduction would be nice.  

My opinion is that if you can keep it in, please keep it. I'm sure some
people come to perl.apache.org because they were looking for something
on apache.org and, surprisingly, found it on our site. So it's a good
thing. We have only about 500+ docs, so it's not that big of a saving,
is it?

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Bill Moseley <mo...@hank.org>.
On Thu, 5 Dec 2002, Stas Bekman wrote:

> Bill, should we upgrade the binary on daedalus as well? I'm not 
> symlinking to your copy. I use a real copy. Or should we leave things as is?

http://www.swish-e.org/current/docs/CHANGES.html doesn't show anything
that affects what we are doing.  Might as well wait for 2.4.  I would have
thought that HTMLLinksMetaName was in our version, but I'd have to look at
the cvs to see when that was added.

In cvs swish.cgi can use the swish-e library and also has a feature
suggested by Brian B. to run /bin/ps and not run swish-e if there's too
many running already.
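
For illustration only, the /bin/ps check described above might look roughly like this in shell. This is not the code in swish.cgi; count_running, may_run and the limit of 2 are hypothetical, and the output of `ps -o comm=` differs between systems (some print a full path), so the exact match may need adjusting.

```shell
# count_running NAME: print how many processes have exactly NAME as
# their command name. grep exits non-zero when the count is 0, so
# "|| true" keeps the function's own exit status clean.
count_running() {
  ps -e -o comm= | grep -c -x -- "$1" || true
}

# may_run LIMIT CURRENT: succeed only while CURRENT is below LIMIT.
may_run() {
  [ "$2" -lt "$1" ]
}

# Usage: refuse to start another search when 2 are already running.
#   if may_run 2 "$(count_running swish-e)"; then
#     swish-e ...   # run the search
#   else
#     echo "Search is busy, please try again later"
#   fi
```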

search.apache.org was getting hit hard and there were a bunch of swish-e
processes using up resources.  My tests showed that running the script
under mod_perl with max_clients set only to two was a better solution
than running /bin/ps ;)

It might be helpful if search.apache.org wasn't listed as an example in the
flood docs!

Hey, is there any reason to have search.apache.org index perl.apache.org?
I'm indexing almost 80,000 files (Jakarta has 10,000 dirs)  and it
generates a 200M index file so any
reduction would be nice.  


-- 
Bill Moseley moseley@hank.org




Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy Kobes wrote:
> On Thu, 5 Dec 2002, Stas Bekman wrote:
> 
> 
>>Randy Kobes wrote:
>>
>>>On Wed, 4 Dec 2002, Stas Bekman wrote:
>>
> 
>>>>Randy, what swish-e version are you using? it must be 2.1-dev
>>>
>>>2.1-dev-25.
>>
>>should be fine.
> 
> 
> I've now upgraded to 2.2 (what Bill suggested about recording
> the linkage wasn't available in 2.1-dev-25).

Bill, should we upgrade the binary on daedalus as well? I'm not 
symlinking to your copy. I use a real copy. Or should we leave things as is?

>>>I'm not sure what the problem is ... The problem about two
>>>versions of testing/testing.html I can't readily see a reason for
>>>- there are two versions of this file in the modperl-docs
>>>dst_html, and by following the indexing both get picked up, but
>>>I'm not sure from where ...
>>
>>If you rm -r dst_html and then bin/build -df there will be only one. But 
>>first I'd love to know how did this happen that that stale page was 
>>picked by swish-e's spider.
> 
> 
> There's also two such pages at perl.apache.org - 
>   docs/2.0/devel/testing/testing.html
>   docs/general/testing/testing.html

That's fine. I need to do a cleanup. Though the old page is correctly 
not linked.

> What's strange is that I took Bill's suggestion for finding the
> linking page, but only the docs/general/testing/testing.html page
> was reported, but still both got indexed (I also tried removing
> the indices and reindexing). However, as in one of your earlier
> messages, I tried a forced build (bin/build -f), and now only the
> one correct page gets indexed. However, there still remains the
> problem that a search for "PerlSetVar" yields fewer hits here
> than at perl.apache.org - I'll look into that later today ...

As Alice was saying: "things are getting weirder and weirder..."

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Randy Kobes <ra...@theoryx5.uwinnipeg.ca>.
On Thu, 5 Dec 2002, Stas Bekman wrote:

> Randy Kobes wrote:
> > On Wed, 4 Dec 2002, Stas Bekman wrote:

> >>Randy, what swish-e version are you using? it must be 2.1-dev
> > 
> > 2.1-dev-25.
> 
> should be fine.

I've now upgraded to 2.2 (what Bill suggested about recording
the linkage wasn't available in 2.1-dev-25).

> 
> > I'm not sure what the problem is ... The problem about two
> > versions of testing/testing.html I can't readily see a reason for
> > - there are two versions of this file in the modperl-docs
> > dst_html, and by following the indexing both get picked up, but
> > I'm not sure from where ...
> 
> If you rm -r dst_html and then bin/build -df there will be only one. But 
> first I'd love to know how did this happen that that stale page was 
> picked by swish-e's spider.

There's also two such pages at perl.apache.org - 
  docs/2.0/devel/testing/testing.html
  docs/general/testing/testing.html

What's strange is that I took Bill's suggestion for finding the
linking page, but only the docs/general/testing/testing.html page
was reported, but still both got indexed (I also tried removing
the indices and reindexing). However, as in one of your earlier
messages, I tried a forced build (bin/build -f), and now only the
one correct page gets indexed. However, there still remains the
problem that a search for "PerlSetVar" yields fewer hits here
than at perl.apache.org - I'll look into that later today ...

-- 
best regards,
randy




Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy Kobes wrote:
> On Wed, 4 Dec 2002, Stas Bekman wrote:
> 
> 
>>More discrepancies in the search at Randy's mirror:
>>
>>http://perl.apache.org/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
>>gives 40 results
>>
>>http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
>>only 28
>>
>>e.g. if you search inside the guide 1.0, Randy's mirror has no hits at all.
>>
>>So, now we have pages that were mysteriously picked by the index and 
>>pages that have gone unpicked.
>>
>>Randy, what swish-e version are you using? it must be 2.1-dev
> 
> 
> 2.1-dev-25.

should be fine.

> I'm not sure what the problem is ... The problem about two
> versions of testing/testing.html I can't readily see a reason for
> - there are two versions of this file in the modperl-docs
> dst_html, and by following the indexing both get picked up, but
> I'm not sure from where ...

If you rm -r dst_html and then bin/build -df there will be only one. But 
first I'd love to know how did this happen that that stale page was 
picked by swish-e's spider.

Also, as you've seen in my other email, src/docs/1.0/guide doesn't seem
to be indexed at all.

> Bill, how do you tell swish-e to record from where a page is
> referred from in the indexing? Perhaps this might also give a
> clue as to why there's also missing pages ...
> 


-- 


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Randy Kobes <ra...@theoryx5.uwinnipeg.ca>.
On Wed, 4 Dec 2002, Stas Bekman wrote:

> More discrepancies in the search at Randy's mirror:
> 
> http://perl.apache.org/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
> gives 40 results
> 
> http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
> only 28
> 
> e.g. if you search inside the guide 1.0, Randy's mirror has no hits at all.
> 
> So, now we have pages that were mysteriously picked by the index and 
> pages that have gone unpicked.
> 
> Randy, what swish-e version are you using? it must be 2.1-dev

2.1-dev-25.

I'm not sure what the problem is ... The problem about two
versions of testing/testing.html I can't readily see a reason for
- there are two versions of this file in the modperl-docs
dst_html, and by following the indexing both get picked up, but
I'm not sure from where ...

Bill, how do you tell swish-e to record from where a page is
referred from in the indexing? Perhaps this might also give a
clue as to why there's also missing pages ...

-- 
best regards,
randy




Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
Randy Kobes wrote:
> On Wed, 4 Dec 2002, Stas Bekman wrote:
> 
> 
>>More discrepancies in the search at Randy's mirror:
>>
>>http://perl.apache.org/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
>>gives 40 results
>>
>>http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
>>only 28
>>
>>e.g. if you search inside the guide 1.0, Randy's mirror has no hits at all.
>>
>>So, now we have pages that were mysteriously picked by the index and 
>>pages that have gone unpicked.
> 
> 
> It's taken a "bit" of time, but I found out the cause of the
> above problem - I was doing some redirections for people wanting
> the old guide, and these erroneously redirected some pages in the
> new doc tree. It's been fixed now.

Thanks, Randy!

One more issue that would be nice to handle gracefully is the cleanup of 
the dead docs. DocSet currently has no clue if some document has been 
removed. The simplest thing is to remove dst_html/dst_ps once in a while 
and the new build will include only the active docs. The problem with 
our setup on perl.apache.org is that dst_html is mixed with dist/, 
mail/, embperl/ and a few other things which aren't under DocSet's 
control, and therefore rsync --delete can't be used :(
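
One report-only way to spot such dead docs, sketched under assumptions: build into a separate, fresh directory and list the files that exist in the live dst_html but not in the fresh build. list_stale and the fresh directory in the usage comment are hypothetical. Anything outside DocSet's control (dist/, mail/, embperl/ and so on) will show up in the list too, so the output needs filtering by hand before anything is removed.

```shell
# list_stale OLD_DIR NEW_DIR: print files present under OLD_DIR that
# are absent from NEW_DIR, as paths relative to the directory roots.
list_stale() {
  old_list=$(mktemp) || return 1
  new_list=$(mktemp) || { rm -f "$old_list"; return 1; }
  # Sort both listings in the C locale so comm sees the same ordering
  (cd "$1" && find . -type f | LC_ALL=C sort) > "$old_list"
  (cd "$2" && find . -type f | LC_ALL=C sort) > "$new_list"
  # -23 keeps only the lines unique to the first (old) listing
  LC_ALL=C comm -23 "$old_list" "$new_list"
  rm -f "$old_list" "$new_list"
}

# Usage (/tmp/fresh_dst_html is a placeholder for a clean build):
#   list_stale /usr/local/modperl-docs/dst_html /tmp/fresh_dst_html
```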

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




Re: site mirrors

Posted by Randy Kobes <ra...@theoryx5.uwinnipeg.ca>.
On Wed, 4 Dec 2002, Stas Bekman wrote:

> More discrepancies in the search at Randy's mirror:
> 
> http://perl.apache.org/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
> gives 40 results
> 
> http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
> only 28
> 
> e.g. if you search inside the guide 1.0, Randy's mirror has no hits at all.
> 
> So, now we have pages that were mysteriously picked by the index and 
> pages that have gone unpicked.

It's taken a "bit" of time, but I found out the cause of the
above problem - I was doing some redirections for people wanting
the old guide, and these erroneously redirected some pages in the
new doc tree. It's been fixed now.

-- 
best regards,
randy




Re: site mirrors

Posted by Stas Bekman <st...@stason.org>.
More discrepancies in the search at Randy's mirror:

http://perl.apache.org/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
gives 40 results

http://theoryx5.uwinnipeg.ca/modperl/search/swish.cgi?query=PerlSetVar&sbm=&submit=search
only 28

e.g. if you search inside the guide 1.0, Randy's mirror has no hits at all.

So, now we have pages that were mysteriously picked by the index and 
pages that have gone unpicked.

Randy, what swish-e version are you using? it must be 2.1-dev

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

