Posted to docs-dev@perl.apache.org by Bill Moseley <mo...@hank.org> on 2002/01/30 21:45:14 UTC

sub-sections for searching

I'm looking for a way to index sections individually. What I'd like to do
is take a document, grab the <head> section, then combine it with each
section and index as if each one was its own page.

Could we add a <div> for each section?

That is in the page_body template:

    # render the content
    "<!-- SwishCommand index -->";
    FOREACH sec = doc.body;
        '<div class="section">';
        sec;
        "<br><br>";
        IF loop.count == loop.size;
            INCLUDE navbar_local_bottom
                nav=doc.nav
                rel_doc_root=doc.dir.rel_doc_root;
        ELSE;
            INCLUDE top_link;
        END;
        "<br><br>";
        "</div>";
    END;
    "<!-- SwishCommand noindex -->";

The idea is then I could probably use HTML::TreeBuilder and grab each
section one-by-one, combine with the <head> for the entire page, and index
that.

I could also simply use a regex to split up the page, but TreeBuilder might
be more fun.
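
A rough sketch of the TreeBuilder idea, assuming the rendered page wraps
each section in <div class="section"> as in the template above. The
split_sections() name and the minimal <html>/<body> wrapper are only
illustrative, and error handling is left out:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return one standalone HTML page per
# <div class="section">, each one carrying the page's original <head>.
sub split_sections {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse_content($html);

    # Serialize the <head> once so it can be reused for every section.
    my $head      = $tree->look_down( _tag => 'head' );
    my $head_html = $head ? $head->as_HTML : '';

    my @pages;
    for my $div ( $tree->look_down( _tag => 'div', class => 'section' ) ) {
        push @pages,
            "<html>$head_html<body>" . $div->as_HTML . "</body></html>";
    }

    $tree->delete;    # release the parse tree
    return @pages;
}

# Usage with a tiny made-up page.
my @pages = split_sections(
    '<html><head><title>Doc</title></head><body>'
  . '<div class="section"><h2>One</h2></div>'
  . '<div class="section"><h2>Two</h2></div>'
  . '</body></html>'
);
print scalar(@pages), " section page(s)\n";
```

Each returned string could then be handed to the indexer as if it were
its own document, with the <title> and <meta> tags intact.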

Anyway, would that <div> mess up formatting?
-- 
Bill Moseley
mailto:moseley@hank.org

---------------------------------------------------------------------
To unsubscribe, e-mail: docs-dev-unsubscribe@perl.apache.org
For additional commands, e-mail: docs-dev-help@perl.apache.org


Re: sub-sections for searching

Posted by Stas Bekman <st...@stason.org>.
Bill Moseley wrote:

> At 09:47 AM 01/31/02 +0800, Stas Bekman wrote:
> 
> Good morning!


morning indeed :)


>>sure we can. Why do you need to add the <head> section though?
>>
> 
> Because I wanted to maintain the complete <head> section for each document.
>  That way I can pass the <title> to swish, plus swish has the <meta> tags,
> just in case we ever wanted them.  Really, just to do it "Right".


cool!


>>I have suggested before to add <!-- sections --> instead. That's the 
>>safest method, no?
>>
> 
> As you may have seen by now, with wrapping in <div> I can use
> HTML::Element's look_down() to fetch these sections of the tree for me.
> Makes it really easy.

Yup, that's cool! But maybe performance-wise it'd be much faster to 
split on tags, no? We (can) have well-defined delimiters. But I guess 
since it's spidering, not a real-time search, it doesn't matter at all. 
So let's keep it that way.
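
For comparison, splitting on delimiters could look roughly like the
sketch below. It assumes the templates emit a comment marker before each
section; the exact marker text, the split_on_markers() name and the
simple <head>/<body> regexes are assumptions made for illustration, not
something the templates produce today:

```perl
use strict;
use warnings;

# Hypothetical delimiter the templates would have to emit before each section.
my $MARKER = qr/<!--\s*section\s*-->/;

sub split_on_markers {
    my ($html) = @_;

    # Grab the <head> element and the body content with plain regexes.
    my ($head) = $html =~ m{(<head\b.*?</head>)}is;
    my ($body) = $html =~ m{<body\b[^>]*>(.*?)</body>}is;
    return unless defined $body;
    $head = '' unless defined $head;

    # Whatever precedes the first marker is not a section, so drop it.
    my ( undef, @sections ) = split $MARKER, $body;

    return map { "<html>$head<body>$_</body></html>" } @sections;
}

# Usage with a tiny made-up page.
my @pages = split_on_markers(
    '<html><head><title>Doc</title></head><body>'
  . '<!-- section --><h2>One</h2>'
  . '<!-- section --><h2>Two</h2>'
  . '</body></html>'
);
print scalar(@pages), " section page(s)\n";
```

This avoids building a parse tree, at the cost of depending on the
markers being emitted consistently.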


_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/




Re: sub-sections for searching

Posted by Bill Moseley <mo...@hank.org>.
At 09:47 AM 01/31/02 +0800, Stas Bekman wrote:

Good morning!

>sure we can. Why do you need to add the <head> section though?

Because I wanted to maintain the complete <head> section for each document.
 That way I can pass the <title> to swish, plus swish has the <meta> tags,
just in case we ever wanted them.  Really, just to do it "Right".


>I have suggested before to add <!-- sections --> instead. That's the 
>safest method, no?

As you may have seen by now, with wrapping in <div> I can use
HTML::Element's look_down() to fetch these sections of the tree for me.
Makes it really easy.


-- 
Bill Moseley
mailto:moseley@hank.org



Re: sub-sections for searching

Posted by Stas Bekman <st...@stason.org>.
Bill Moseley wrote:

> I'm looking for a way to index sections individually. What I'd like to do
> is take a document, grab the <head> section, then combine it with each
> section and index as if each one was its own page.
> 
> Could we add a <div> for each section?
> 
> That is in the page_body template:
> 
>     # render the content
>     "<!-- SwishCommand index -->";
>     FOREACH sec = doc.body;
>         '<div class="section">';
>         sec;
>         "<br><br>";
>         IF loop.count == loop.size;
>             INCLUDE navbar_local_bottom
>                 nav=doc.nav
>                 rel_doc_root=doc.dir.rel_doc_root;
>         ELSE;
>             INCLUDE top_link;
>         END;
>         "<br><br>";
>         "</div>";
>     END;
>     "<!-- SwishCommand noindex -->";


sure we can. Why do you need to add the <head> section though?

 
> The idea is then I could probably use HTML::TreeBuilder and grab each
> section one-by-one, combine with the <head> for the entire page, and index
> that.
> 
> I could also simply use a regex to split up the page, but TreeBuilder might
> be more fun.


I use HTML::Parser


> Anyway, would that <div> mess up formatting?


I have suggested before to add <!-- sections --> instead. That's the 
safest method, no?





-- 


_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/

