Posted to docs-cvs@perl.apache.org by st...@apache.org on 2002/01/05 19:51:59 UTC
cvs commit: modperl-docs/lib/DocSet/Template/Plugin NavigateCache.pm
stas 02/01/05 10:51:59
Added: lib DocSet.pm
lib/DocSet 5005compat.pm Cache.pm Config.pm Doc.pm DocSet.pm
NavigateCache.pm RunTime.pm Util.pm
lib/DocSet/Doc HTML2HTML.pm POD2HTML.pm Text2HTML.pm
lib/DocSet/DocSet HTML.pm PSPDF.pm
lib/DocSet/Source HTML.pm POD.pm Text.pm
lib/DocSet/Template/Plugin NavigateCache.pm
Log:
- add the DocSet package locally until it gets released on CPAN
Revision Changes Path
1.1 modperl-docs/lib/DocSet.pm
Index: DocSet.pm
===================================================================
package DocSet;
$VERSION = '0.08';
=head1 NAME
DocSet - documentation projects builder in HTML, PS and PDF formats
=head1 SYNOPSIS
pod2hpp [options] base_full_path relative_to_base_configuration_file_location
Options:
-h this help
-v verbose
-i podify pseudo-pod items (s/^* /=item */)
-s create the split html version (not implemented)
-t create tar.gz (not implemented)
-p generate PS file
-d generate PDF file
-f force a complete rebuild
-a print available hypertext anchors (not implemented)
-l do hypertext links validation (not implemented)
-e slides mode (for presentations) (not implemented)
-m executed from Makefile (forces rebuild,
no PS/PDF file,
no tgz archive!)
=head1 DESCRIPTION
This package builds a docset from sources in different formats. The
generated documents can all be nicely interlinked and share the same
look and feel.
Currently it knows how to handle these input formats:
* POD
* HTML
and knows how to generate:
* HTML
* PS
* PDF
=head2 Modification control
Each output mode maintains its own cache (per docset) which is used
when certain source documents weren't modified since last build and
the build is running in a non-force-rebuild mode.
=head2 Definitions:
* Chapter is a single document (file).
* Link is a URL.
* Docset is a collection of docsets, chapters and links.
=head2 Application Specific Features
=over
=item 1
META: not ported yet!
Generate a split HTML version, creating an HTML file for each pod
section, with everything interlinked of course. This version is
best suited for searching.
=item 1
Complete the POD on the fly from the files in POD format. This eases
the generation of presentation slides, since one can use C<*> instead
of long =over/=item/.../=item/=back sequences. The rest is done as
before. Take a look at the special html2ps configuration for
generating nice slides in I<conf/html2ps-slides.conf>.
=item 1
META: not ported yet!
If you turn the slides mode on, it automatically turns on the C<-i>
(C<*> preprocessing) mode and inserts a page break before each =head tag.
=back
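The C<*> preprocessing can be pictured with a minimal sketch. The
C<podify_items> helper below is hypothetical and is not part of DocSet; it
assumes that each run of consecutive lines starting with C<* > forms one list:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DocSet): wrap each run of "* text"
# lines in an over/item/back list and leave all other lines untouched.
sub podify_items {
    my @in = split /\n/, shift;
    my @out;
    my $in_list = 0;
    for my $line (@in) {
        if ($line =~ /^\* (.*)/) {
            push @out, '=over', '' unless $in_list;   # open a new list
            $in_list = 1;
            push @out, "=item * $1", '';
        }
        elsif ($line =~ /^\s*$/ && $in_list) {
            next;    # blank separators are regenerated after each item
        }
        else {
            push @out, '=back', '' if $in_list;       # close the list
            $in_list = 0;
            push @out, $line;
        }
    }
    push @out, '=back', '' if $in_list;
    return join "\n", @out;
}

print podify_items("Intro\n\n* one\n* two\n\nOutro\n"), "\n";
```

Paragraphs that belong to an item are not handled here; the sketch only covers the simple one-line-per-item case used in slides.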
=head2 Look-n-Feel Customization
You can customise the look and feel of the output by adjusting the
templates in the directory I<example/tmpl/custom>.
You can change the look and feel of the PS (PDF) versions by modifying
I<example/conf/html2ps.conf>. Be aware that if the documentation you
want to put into one PS or PDF file is very big, and you tell html2ps
to put the TOC at the beginning, you will need lots of memory, because
it won't write a single byte to the disk before it has converted all
the HTML markup to PS.
=head1 CONFIGURATION
All you have to prepare is a single config file that you then pass as
an argument to C<pod2hpp>:
pod2hpp [options] /abs/project/root/path /full/path/to/config/file
Every directory in the source tree may have a configuration file,
which designates a docset's root. See the I<config> files for
examples. Usually the file in the root (I<example/src>) sets
operational directories and other arguments, which you don't have to
repeat in sub-docsets. Modify these files to suit your documentation
project layout.
Note that the I<example/bin/build> script automatically locates your
project's directory, so you can move your project around the filesystem
without changing anything.
I<example/README> explains the layout of the directories.
The C<DocSet::Config> manpage explains the layout of the configuration
file.
=head1 PREREQUISITES
None of these are required if all you want is to generate the HTML
version.
=over 4
=item * ps2pdf
Needed to generate the PDF version
=item * Storable
Perl module available from CPAN (http://cpan.org/)
Allows source modification control, so if you modify only one file you
will not have to rebuild everything to get the updated HTML/PS/PDF
files.
=back
=head1 SUPPORT
Notice that this tool relies on two tools (ps2pdf and html2ps) which I
don't support. So if you have any problem, first make sure that it's
not a problem with these tools.
Note that while C<html2ps> is included in this distribution, it's
written in old-style Perl, so if you have patches send them along,
but I won't try to fix/modify this code otherwise. I didn't write this
utility.
=head1 BUGS
Huh? Probably many...
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=head1 SEE ALSO
perl(1), Pod::Html(3), html2ps(1), ps2pdf(1), Storable(3)
=head1 COPYRIGHT
This program is distributed under the Artistic License, like Perl
itself.
=cut
1.1 modperl-docs/lib/DocSet/5005compat.pm
Index: 5005compat.pm
===================================================================
package DocSet::5005compat;
use strict;
use Symbol ();
use File::Basename;
use File::Path;
my %compat_files = (
'lib/warnings.pm' => \&warnings_pm,
);
sub import {
if ($] >= 5.006) {
# make sure old compat stubs don't wipe out installed versions
unlink for keys %compat_files;
return;
}
eval { require File::Spec::Functions; } or
die "this is only Perl $], you need to install File-Spec from CPAN";
my $min_version = 0.82;
unless ($File::Spec::VERSION >= $min_version) {
die "you need to install File-Spec-$min_version or higher from CPAN";
}
while (my($file, $sub) = each %compat_files) {
$sub->($file);
}
}
sub open_file {
my $file = shift;
unless (-d 'lib') {
$file = "Apache-Test/$file";
}
my $dir = dirname $file;
unless (-d $dir) {
mkpath([$dir], 0, 0755);
}
my $fh = Symbol::gensym();
print "creating $file\n";
open $fh, ">$file" or die "open $file: $!";
return $fh;
}
sub warnings_pm {
return if eval { require warnings };
my $fh = open_file(shift);
print $fh <<'EOF';
package warnings;
sub import {}
1;
EOF
close $fh;
}
1;
1.1 modperl-docs/lib/DocSet/Cache.pm
Index: Cache.pm
===================================================================
package DocSet::Cache;
use strict;
use warnings;
use DocSet::RunTime;
use DocSet::Util;
use Storable;
use Carp;
my %attrs = map {$_ => 1} qw(toc meta order);
sub new {
my($class, $path) = @_;
die "no cache path specified" unless defined $path;
my $self = bless {
path => $path,
dirty => 0,
}, ref($class)||$class;
$self->read();
return $self;
}
sub path {
my($self) = @_;
$self->{path};
}
sub read {
my($self) = @_;
if (-w $self->{path} && DocSet::RunTime::has_storable_module()) {
note "+++ Reading cache from $self->{path}";
$self->{cache} = Storable::retrieve($self->{path});
} else {
note "+++ Initializing a new cache for $self->{path}";
$self->{cache} = {};
}
}
sub write {
my($self) = @_;
if (DocSet::RunTime::has_storable_module()) {
note "+++ Storing the docset's cache to $self->{path}";
Storable::store($self->{cache}, $self->{path});
$self->{dirty} = 0; # mark as synced (clean)
}
}
# set a cache entry (overrides a prev entry if any exists)
sub set {
my($self, $id, $attr, $data, $hidden) = @_;
croak "must specify a unique id" unless defined $id;
croak "must specify an attribute" unless defined $attr;
croak "unknown attribute $attr" unless exists $attrs{$attr};
# remember the addition order (unless it's an update)
unless (exists $self->{cache}{$id}) {
push @{ $self->{cache}{_ordered_ids} }, $id;
$self->{cache}{$id}{seq} = $#{ $self->{cache}{_ordered_ids} };
}
$self->{cache}{$id}{$attr} = $data;
$self->{cache}{$id}{_hidden} = $hidden;
$self->{dirty} = 1;
}
# get a cache entry
sub get {
my($self, $id, $attr) = @_;
croak "must specify a unique id" unless defined $id;
croak "must specify an attribute" unless defined $attr;
croak "unknown attribute $attr" unless exists $attrs{$attr};
if (exists $self->{cache}{$id} && exists $self->{cache}{$id}{$attr}) {
return $self->{cache}{$id}{$attr};
}
}
# check whether a cached entry exists
sub is_cached {
my($self, $id, $attr) = @_;
croak "must specify a unique id" unless defined $id;
croak "must specify an attribute" unless defined $attr;
croak "unknown attribute $attr" unless exists $attrs{$attr};
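# test the id first so the lookup doesn't autovivify an empty entry,
# which would make set() skip the ordering bookkeeping later on
return exists $self->{cache}{$id} && exists $self->{cache}{$id}{$attr};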
}
# invalidate cache (i.e. when a complete rebuild is forced)
sub invalidate {
my($self) = @_;
$self->{cache} = {};
}
# delete an entry in the cache
sub unset {
my($self, $id, $attr) = @_;
croak "must specify a unique id" unless defined $id;
croak "must specify an attribute" unless defined $attr;
croak "unknown attribute $attr" unless exists $attrs{$attr};
if (exists $self->{cache}{$id} && exists $self->{cache}{$id}{$attr}) {
delete $self->{cache}{$id}{$attr};
$self->{dirty} = 1;
}
}
sub is_hidden {
my($self, $id) = @_;
#print "$id is hidden\n" if $self->{cache}{$id}{_hidden};
# don't autovivify an entry for an unknown id
return exists $self->{cache}{$id} ? $self->{cache}{$id}{_hidden} : undef;
}
# return the sequence number of $id in the list of linked objects (0..N)
sub id2seq {
my($self, $id) = @_;
croak "must specify a unique id" unless defined $id;
if (exists $self->{cache}{$id}) {
return $self->{cache}{$id}{seq};
}
else {
# this shouldn't happen!
die "Cannot find $id in $self->{path} cache",
dumper $self;
}
}
# return the $id at the place $seq in the list of linked objects (0..N)
sub seq2id {
my($self, $seq) = @_;
croak "must specify a seq number" unless defined $seq;
if ($self->{cache}{_ordered_ids}) {
return $self->{cache}{_ordered_ids}->[$seq];
}
else {
die "Cannot find $seq in $self->{path} cache",
dumper $self;
}
}
sub ordered_ids {
my($self) = @_;
return @{ $self->{cache}{_ordered_ids}||[] };
}
sub total_ids {
my($self) = @_;
return scalar @{ $self->{cache}{_ordered_ids}||[] };
}
# remember the meta data of the index node
sub index_node {
my($self) = shift;
if (@_) {
# set
my($id, $title, $abstract) = @_;
croak "must specify the index_node's id" unless defined $id;
croak "must specify the index_node's title" unless defined $title;
$self->{cache}{_index}{id} = $id;
$self->{cache}{_index}{title} = $title;
$self->{cache}{_index}{abstract} = $abstract;
}
else {
# get
return exists $self->{cache}{_index}
? $self->{cache}{_index}
: undef;
}
}
# set/get the path to the parent cache
sub parent_node {
my($self) = shift;
if (@_) {
# set
my($cache_path, $id, $rel_path) = @_;
croak "must specify a path to the parent cache" unless defined $cache_path;
croak "must specify a relative to parent path" unless defined $rel_path;
croak "must specify a parent id" unless defined $id;
$self->{cache}{_parent}{cache_path} = $cache_path;
$self->{cache}{_parent}{id} = $id;
$self->{cache}{_parent}{rel_path} = $rel_path;
}
else {
# get
return exists $self->{cache}{_parent}
? ($self->{cache}{_parent}{cache_path},
$self->{cache}{_parent}{id},
$self->{cache}{_parent}{rel_path})
: (undef, undef, undef);
}
}
# set/get the path to the node_groups cache
sub node_groups {
my($self) = shift;
if (@_) { # set
$self->{cache}{_node_groups} = shift;
}
else { # get
return $self->{cache}{_node_groups};
}
}
sub is_dirty { shift->{dirty};}
sub DESTROY {
my($self) = @_;
# flush the cache if destroyed before having a chance to sync to the disk
$self->write if $self->is_dirty;
}
1;
__END__
=head1 NAME
C<DocSet::Cache> - Maintain a Non-Volatile Cache of DocSet's Data
=head1 SYNOPSIS
use DocSet::Cache ();
my $cache = DocSet::Cache->new($cache_path);
$cache->read;
$cache->write;
$cache->set($id, $attr, $data);
my $data = $cache->get($id, $attr);
print "$id is cached" if $cache->is_cached($id, $attr);
$cache->invalidate();
$cache->unset($id, $attr);
my $seq = $cache->id2seq($id);
my $id = $cache->seq2id($seq);
my @ids = $cache->ordered_ids;
my $total_ids = $cache->total_ids;
$cache->index_node($id, $title, $abstract);
my $index_node = $cache->index_node();
$cache->parent_node($cache_path, $id, $rel_path);
my($cache_path, $id, $rel_path) = $cache->parent_node();
=head1 DESCRIPTION
C<DocSet::Cache> maintains a non-volatile cache of docset's data.
The cache is initialized from the frozen file at the provided path.
When the file is empty or doesn't exist, a new cache is initialized.
When the cache is modified it should be saved, but if for some reason
it doesn't get saved, the C<DESTROY> method will check whether the
cache has been synced to the disk and, if not, will perform the sync
itself.
Each docset's node can create an entry in the cache, and store its
data in it. The creator has to ensure that it supplies a unique id for
each node that is added. The cache's internal representation is a
hash, with internal data keys starting with _ (underscore), therefore
the only restriction on a node's id value is that it must not start
with an underscore.
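The layout described above can be illustrated with a plain hash and the
C<Storable> module. This is only a sketch: the ids, the attribute data and
the temporary file are made up, and the real class wraps these steps in its
methods.

```perl
use strict;
use warnings;
use Storable ();
use File::Temp ();

# Node ids are ordinary keys; internal bookkeeping keys start with "_".
my %cache;
for my $id (qw(intro.pod install.pod)) {
    push @{ $cache{_ordered_ids} }, $id;            # remember insertion order
    $cache{$id}{seq}  = $#{ $cache{_ordered_ids} };  # position of this id
    $cache{$id}{meta} = { title => "Title of $id" }; # invented meta data
}

# Freeze the cache to disk and read it back, as a later build would.
my $tmp = File::Temp->new(SUFFIX => '.cache');
Storable::store(\%cache, $tmp->filename);
my $restored = Storable::retrieve($tmp->filename);

print "$_ => seq $restored->{$_}{seq}\n" for @{ $restored->{_ordered_ids} };
```

Because the ordered list lives under an underscore key, a node id such as C<_ordered_ids> would collide with it, which is why ids must not start with an underscore.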
=head2 METHODS
META: to be written (see SYNOPSIS meanwhile)
=over
=item *
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Config.pm
Index: Config.pm
===================================================================
package DocSet::Config;
use strict;
use warnings;
use Carp;
use File::Find;
use File::Basename ();
use File::Spec::Functions;
use DocSet::Util;
use constant TRACE => 1;
# uri extension to MIME type mapping
my %ext2mime = (
map({$_ => 'text/html' } qw(htm html)),
map({$_ => 'text/plain'} qw(txt text)),
map({$_ => 'text/pod' } qw(pod pm)),
);
my %conv_class= (
'text/pod' => {
'text/html' => 'DocSet::Doc::POD2HTML',
'text/ps' => 'DocSet::Doc::POD2PS',
},
'text/html' => {
'text/html' => 'DocSet::Doc::HTML2HTML',
'text/ps' => 'DocSet::Doc::HTML2PS',
},
'text/plain' => {
'text/html' => 'DocSet::Doc::Text2HTML',
'text/pdf' => 'DocSet::Doc::Text2PDF',
},
);
sub ext2mime {
my($self, $ext) = @_;
exists $ext2mime{$ext} ? $ext2mime{$ext} : undef;
}
sub conv_class {
my($self, $src_mime, $dst_mime) = @_;
# convert
die "src_mime is not defined" unless defined $src_mime;
die "dst_mime is not defined" unless defined $dst_mime;
my $conv_class = $conv_class{$src_mime}{$dst_mime}
or die "unknown input/output MIME mapping: $src_mime => $dst_mime";
return $conv_class;
}
my %attr = map {$_ => 1} qw(chapters docsets links);
sub read_config {
my($self, $config_file) = @_;
die "Configuration file is not specified" unless $config_file;
my $package = path2package($config_file);
$self->{package} = $package;
my $content;
read_file($config_file, \$content);
eval join '',
"package $package;",
$content, ";1;";
die "failed to eval config file at $config_file:\n$@" if $@;
# parse the attributes of the docset's config file
no strict 'refs';
use vars qw(@c);
*c = \@{"$package\::c"};
my @groups = ();
my $current_group = '';
my $group_size;
for ( my $i=0; $i < @c; $i +=2 ) {
my($key, $val) = @c[$i, $i+1];
if ($key eq 'group') {
# close the previous group by storing its name and its size
if ($current_group) {
push @{ $self->{node_groups} }, $current_group, $group_size;
}
# start the new group
$current_group = $val;
$group_size = 0;
}
elsif ($key eq 'hidden') {
die "hidden's value must be an ARRAY reference"
unless ref $val eq 'ARRAY';
my @h = @$val;
for ( my $j=0; $j < @h; $j +=2 ) {
my($key1, $val1) = @h[$j, $j+1];
die "hidden's can include only 'chapters' and 'docsets', " .
"$key1 is invalid" unless $key1 =~ /^(docsets|chapters)$/;
$self->add_node($key1, $val1, 1);
}
}
elsif (exists $attr{$key}) {
$group_size += $self->add_node($key, $val, 0);
}
else {
$self->{$key} = $val;
#print "$key = $val\n";
}
}
if ($current_group) {
push @{ $self->{node_groups} }, $current_group, $group_size;
}
# merge_config will adjust this value, for nested docsets
# so this value is relevant only for the real top parent node
$self->{dir}{abs_doc_root} = '.';
$self->{dir}{src_root} = File::Basename::dirname $config_file;
# dumper $self;
}
#
# 1. put chapters together, docsets together, links together
# 2. store the normal nodes in the order they were listed in 'ordered_nodes'
# 3. store the hidden nodes in the order they were listed in 'hidden_nodes'
#
# return the number of added items
sub add_node {
my($self, $key, $value, $hidden) = @_;
my @values = ref $value eq 'ARRAY' ? @$value : $value;
if ($hidden) {
push @{ $self->{hidden_nodes} }, $key, $_ for @values;
}
else {
push @{ $self->{ordered_nodes} }, $key, $_ for @values;
}
return scalar @values;
}
# child config inherits from the parent config
# and adjusts its paths
sub merge_config {
my($self, $src_rel_dir) = @_;
my $parent_o = $self->{parent_o};
my $files = $self->{file} || {};
while ( my($k, $v) = each %{ $parent_o->{file}||{} }) {
$self->{file}{$k} = $v unless $files->{$k};
}
my $dirs = $self->{dir} || {};
while ( my($k, $v) = each %{ $parent_o->{dir}||{} }) {
$self->{dir}{$k} = $v unless $dirs->{$k};
}
# a chapter object won't set this one
if ($src_rel_dir) {
$self->{dir}{src_rel_dir} = $src_rel_dir;
# append the relative to parent_o's src dir segments
# META: hardcoded paths!
for my $k ( qw(dst_html dst_ps dst_split_html) ) {
$self->{dir}{$k} .= "/$src_rel_dir";
}
# set path to the abs_doc_root
# META: hardcoded paths! (but in this case it doesn't matter,
# as long as it's set in the config file
$self->{dir}{abs_doc_root} =
join '/', ("..") x ($self->{dir}{dst_html} =~ tr|/|/|);
}
}
# return a list of files to be copied
#
# due to a potentially huge list of files to be copied (e.g. the
# splash library) currently it's assumed that this function is called
# only once. Therefore no caching is done to save memory.
#
# The following conventions are used for $self->{copy_glob}
# 1. Explicitly specified files and directories are copied as is
# (directories aren't descended into)
# 2. Shell metachars (*?[]) can be used. e.g. if you want to grab
# directory foo and its contents, make sure to specify foo/*.
sub files_to_copy {
my($self) = @_;
my $copy_skip_patterns = $self->{copy_skip} || [];
# build one sub that will match many regex at once.
my $rsub_filter_out = build_matchmany_sub($copy_skip_patterns);
my $src_root = $self->get_dir('src_root');
# expand $self->{copy_glob}, applying the filter to skip unwanted
# files
my @files =
grep !$rsub_filter_out->($_), # skip unwanted
# grep s|^(?:\./)?||, # strip the leading ./
grep !-d $_, # skip empty dirs
map { -d $_ ? @{ expand_dir($_) } : $_ } # expand dirs
map { $_ =~ /[\*\?\[\]]/ ? glob($_) : $_ } # expand globs
map { "$src_root/$_" } # prefix with src_root
@{ $self->{copy_glob}||[] };
return \@files;
}
sub expand_dir {
my @files = ();
if ($] >= 5.006) {
find(sub {push @files, $File::Find::name}, $_[0]);
}
else {
# perl 5.005.03 on FreeBSD doesn't set the dir it chdir'ed to
# need to move this to compat level?
require Cwd;
my $cwd;
find(sub {$cwd = Cwd::cwd(); push @files, catfile $cwd, $_}, $_[0]);
}
return \@files;
}
sub set {
my($self, %args) = @_;
@{$self}{keys %args} = values %args;
}
sub set_dir {
my($self, %args) = @_;
@{ $self->{dir} }{keys %args} = values %args;
}
sub get {
my $self = shift;
return () unless @_;
my @values = map {exists $self->{$_} ? $self->{$_} : ''} @_;
return wantarray ? @values : $values[0];
}
sub get_file {
my $self = shift;
return () unless @_;
my @values = map {exists $self->{file}{$_} ? $self->{file}{$_} : ''} @_;
return wantarray ? @values : $values[0];
}
sub get_dir {
my $self = shift;
return () unless @_;
my @values = map {exists $self->{dir}{$_} ? $self->{dir}{$_} : ''} @_;
return wantarray ? @values : $values[0];
}
sub nodes_by_type {
my $self = shift;
return $self->{ordered_nodes} || [];
}
sub hidden_nodes_by_type {
my $self = shift;
return $self->{hidden_nodes} || [];
}
sub node_groups {
my $self = shift;
return $self->{node_groups} || [];
}
sub docsets {
my $self = shift;
return exists $self->{docsets} ? @{ $self->{docsets} } : ();
}
sub links {
my $self = shift;
return exists $self->{links} ? @{ $self->{links} } : ();
}
sub src_chapters {
my $self = shift;
return exists $self->{chapters} ? @{ $self->{chapters} } : ();
}
# chapter paths as they go into production
# $self->trg_chapters(@paths) : push a chapter(s)
# $self->trg_chapters : retrieve the list
sub trg_chapters {
my $self = shift;
if (@_) {
push @{ $self->{chapters_prod} }, @_;
} else {
return exists $self->{chapters_prod} ? @{ $self->{chapters_prod} } : ();
}
}
# set/get cache
sub cache {
my $self = shift;
if (@_) {
$self->{cache} = shift;
}
$self->{cache};
}
sub path2package {
my $path = shift;
$path =~ s|[\W\.]|_|g;
return "MyDocSet::X$path";
}
sub object_store {
my($self, $object) = @_;
croak "no object passed" unless defined $object and ref $object;
push @{ $self->{_objects_store} }, $object;
}
sub stored_objects {
my($self) = @_;
return @{ $self->{_objects_store}||[] };
}
#sub chapter_data {
# my $self = shift;
# my $id = shift;
# if (@_) {
# $self->{chapter_data}{$id} = shift;
# }
# else {
# $self->{chapter_data}{$id};
# }
#}
1;
__END__
=head1 NAME
C<DocSet::Config> - A superclass that handles object's configuration and data
=head1 SYNOPSIS
use DocSet::Config ();
my $mime = $self->ext2mime($ext);
my $class = $self->conv_class($src_mime, $dst_mime);
$self->read_config($config_file);
$self->merge_config($src_rel_dir);
my $files = $self->files_to_copy();
my $dir_files = DocSet::Config::expand_dir($dir);
$self->set($key => $val);
$self->set_dir($dir_name => $val);
$val = $self->get($key);
$self->get_file($key);
$self->get_dir($dir_name);
my @docsets = $self->docsets();
my @links = $self->links();
my @chapters = $self->src_chapters();
my @chapters = $self->trg_chapters();
$self->cache($cache);
my $cache = $self->cache();
$package = DocSet::Config::path2package($path);
$self->object_store($object);
my @objects = $self->stored_objects();
=head1 DESCRIPTION
This class lies at the base of the DocSet class and provides
configuration and internal data storage/retrieval methods.
The generic configuration file is explained at the end of this
document.
=head2 METHODS
META: to be completed (see SYNOPSIS meanwhile)
=over
=item * ext2mime
=item * conv_class
=item * read_config
=item * merge_config
=item * files_to_copy
=item * expand_dir
=item * set
=item * set_dir
=item * get
=item * get_file
=item * get_dir
=item * docsets
=item * links
=item * src_chapters
=item * trg_chapters
=item * cache
=item * path2package
=item * object_store
=item * stored_objects
=back
=head1 CONFIGURATION FILE
Each DocSet has its own configuration file.
=head2 Structure
Currently the configuration file is a simple perl script that is
expected to declare an array C<@c> with all the docset properties in
it. Later on more configuration formats will be supported.
We use the C<@c> array because some of the configuration attributes
may be repeated, so the hash datatype is not suitable here. Otherwise
this array looks exactly like a hash:
key1 => val1,
key2 => val2,
...
keyN => valN
Of course you can declare any other perl variables and do whatever
you want, but after the config file is run, it should have C<@c> set.
Don't forget to end the file with C<1;>.
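To see why a list is used rather than a hash, the following sketch walks a
small C<@c> two items at a time, in the same way C<read_config> does. The
keys and values here are invented for illustration:

```perl
use strict;
use warnings;

# An invented configuration: "chapters" appears twice, which a hash
# could not represent without losing the first value.
my @c = (
    id       => 'example',
    title    => 'Example DocSet',
    chapters => ['intro.pod'],
    docsets  => ['download'],
    chapters => 'faq.pod',
);

my @seen;
for (my $i = 0; $i < @c; $i += 2) {
    my ($key, $val) = @c[$i, $i + 1];
    # a value is either a single item or a reference to an array of items
    my @items = ref $val eq 'ARRAY' ? @$val : ($val);
    push @seen, "$key: @items";
}
print "$_\n" for @seen;
```

Both C<chapters> entries survive, and they keep their position relative to the C<docsets> entry between them.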
=head2 Declare once attributes
The following attributes must be declared at least in the top-level
I<config.cfg> file:
=over
=item * dir
dir => {
# the resulting html files directory
dst_html => "dst_html",
# the resulting ps and pdf files directory (and special
# set of html files used for creating the ps and pdf
# versions.)
dst_ps => "dst_ps",
# the resulting split version html files directory
dst_split_html => "dst_split_html",
# location of the templates relative to the root dir
# (searched left to right)
tmpl => [qw(tmpl/custom tmpl/std tmpl)],
},
=item * file
file => {
# the html2ps configuration file
html2ps_conf => "conf/html2ps.conf",
},
=back
Generally you should specify these only in the top-level config file,
and only specify these again in sub-level config files, if you want to
override things for the sub-docset and its successors.
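The inheritance rule can be pictured as a merge in which the child's own
settings win. The sketch below uses plain hashes with made-up values;
C<merge_config> applies the same idea to the I<dir> and I<file> attributes.

```perl
use strict;
use warnings;

# Made-up settings: the parent defines everything, the child overrides one key.
my %parent_dir = (dst_html => 'dst_html', dst_ps => 'dst_ps');
my %child_dir  = (dst_ps   => 'custom_ps');

# Copy each parent value that the child has not set itself.
for my $k (keys %parent_dir) {
    $child_dir{$k} = $parent_dir{$k} unless $child_dir{$k};
}

print "$_ => $child_dir{$_}\n" for sort keys %child_dir;
```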
=head2 DocSet must attributes
The following attributes must be declared in every docset configuration:
=over
=item * id
a unique id of the docset. The uniqueness should be preserved across
any parallel docsets.
=item * title
the title of the docset
=item * abstract
a short abstract
=back
=head2 DocSet Components
Any DocSet component can be repeated as many times as wanted. This
allows you to mix various types of nodes and still have them ordered
the way you want. You can have a chapter followed by a docset,
followed by a few more chapters and ending with a link.
The value of each component can be either a single item or a reference
to an array of items.
=over
=item * docsets
The docset can recursively include other docsets: simply list the
directories in which the other docsets can be found (where the
I<config.cfg> file can be found).
=item * chapters
Each chapter can be specified as a path to its source document.
=item * links
The docset supports hyperlinks. Each link must be declared as a hash
reference with keys: I<id>, I<link>, I<title> and I<abstract>.
If you want to link to an external resource, start the link with a URI
scheme (e.g. C<http://>). This attribute also works for local links, for
example if the same generated page should be linked from more than
one place, or if there is some non-parsed object that needs to be
linked to after it gets copied via the I<copy_glob> attribute in the
same or another docset.
=back
This is an example:
docsets => ['docs', 'cool_docset'],
chapters => [
qw(
about/about.html
)
],
docsets => [
qw(
download
)
],
chapters => 'foo/bar/zed.pod',
links => [
{
id => 'asf',
link => 'http://apache.org/foundation/projects.html',
title => 'The ASF Projects',
abstract => "There are many other ASF Projects",
},
],
Since normally books consist of parts which group chapters by a common
theme, we support this feature as well. So the index can now be
generated as:
part I: Installation
* Starting
* Installing
part II: Troubleshooting
* Debugging
* Errors
* Help Links
* Offline Help
This happens only if this feature is used; otherwise a plain flat toc
is used. To enable this feature, simply intersperse the nodes with
declarations of new groups using the I<group> attribute:
group => 'Installation',
chapters => [qw(start.pod install.pod)],
group => 'Troubleshooting',
chapters => [qw(debug.pod errors.pod)],
links => [{put link data here}],
chapters => ['offline_help.pod'],
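As a rough sketch of the bookkeeping behind this, the loop below counts how
many nodes follow each I<group> declaration, mirroring the name and size
pairs that C<read_config> collects. The variable names and the link data
are illustrative only:

```perl
use strict;
use warnings;

my @c = (
    group    => 'Installation',
    chapters => [qw(start.pod install.pod)],
    group    => 'Troubleshooting',
    chapters => [qw(debug.pod errors.pod)],
    links    => [{ id => 'help', title => 'Help Links' }],  # invented link
    chapters => ['offline_help.pod'],
);

my (@groups, $current, $size);
for (my $i = 0; $i < @c; $i += 2) {
    my ($key, $val) = @c[$i, $i + 1];
    if ($key eq 'group') {
        # close the previous group before starting a new one
        push @groups, $current, $size if defined $current;
        ($current, $size) = ($val, 0);
    }
    else {
        # a value holds one node or a reference to an array of nodes
        $size += ref $val eq 'ARRAY' ? @$val : 1;
    }
}
push @groups, $current, $size if defined $current;   # close the last group

print "@groups\n";
```

The second group ends up with four nodes, matching the four entries listed under part II in the index above.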
=head2 Hidden Objects
I<docsets> and I<chapters> can be marked as hidden. This means that
they will be normally processed but won't be linked from anywhere.
Since the hidden objects cannot belong to any group and it doesn't
matter where they are listed in the config file, you simply put one or
more I<docsets> and I<chapters> into a special attribute I<hidden>,
whose value must be a reference to an array and which of course can be
repeated many times just like most of the attributes.
For example:
...
chapters => [qw(start.pod install.pod)],
hidden => [
chapters => ['offline_help.pod'],
docsets => ['hidden_docset'],
],
...
The cool thing is that the hidden I<docsets> and I<chapters> will see
all the unhidden objects, so those who know the "secret" URL will be
able to navigate back to the non-hidden objects transparently.
This feature could be useful, for example, to create pages normally not
accessed by users. For example, if you want to create a page used for
Apache's I<ErrorDocument> handler, you want to mark it hidden,
because it shouldn't be linked from anywhere, but once the user hits it
(because a non-existing URL has been entered) the user will get a
perfect page with all the proper navigation widgets (I<menu>, etc.) in
it.
=head2 Copy unmodified
Usually the generated UI includes images and CSS files, and of course
some files must be copied without any modifications, like files
containing pure code, archives, etc. There are two attributes to
handle this:
=over
=item * copy_glob
Accepts a reference to an array of files and directories to copy. Note
that you must use shell wildcard characters if you want deep directory
copies, which also works for things like: C<*.html>. If you simply
specify a directory name it'll be copied without any contents (this is
a feature!). For example:
# non-pod/html files or dirs to be copied unmodified
copy_glob => [
qw(
style.css
images/*
)
],
will copy I<style.css> and all the files under the I<images/>
directory.
=item * copy_skip
While I<copy_glob> allows specifying complete dirs with potentially
many nested sub-dirs to be copied, this becomes inconvenient if we
want to copy all but a few files in these directories. The
I<copy_skip> rule helps here. It accepts a reference to an array of
regular expressions that will be applied to each candidate to be
copied as suggested by the I<copy_glob> attribute. If a regular
expression matches, the file won't be copied.
One of the useful examples would be:
copy_skip => [
'(?:^|\/)CVS(?:\/|$)', # skip cvs control files
'#|~', # skip emacs backup files
],
META: does copy_skip apply to all sub-docsets, if sub-docsets specify
their own copy_glob?
=back
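The filtering step performed by I<copy_skip> can be sketched as follows. The
C<build_skip_sub> helper is hypothetical (DocSet's own helper is not shown
here), and the candidate file names are invented:

```perl
use strict;
use warnings;

# Hypothetical helper: compile the patterns once and return a sub that
# reports whether any of them matches its argument.
sub build_skip_sub {
    my @re = map { qr/$_/ } @_;
    return sub {
        my $path = shift;
        for my $re (@re) {
            return 1 if $path =~ $re;
        }
        return 0;
    };
}

my $skip = build_skip_sub('(?:^|\/)CVS(?:\/|$)', '#|~');

# Invented candidates, as they might come out of the copy_glob expansion.
my @candidates = qw(style.css images/logo.png images/CVS/Entries notes.txt~);
my @to_copy    = grep { !$skip->($_) } @candidates;

print "$_\n" for @to_copy;
```

Compiling the patterns once keeps the per-file check cheap when the list of candidates is long.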
=head2 Extra Features
If you want the index file to include special top and bottom
sections in addition to the linked list of the docset contents, you
can do:
body => {
top => 'index_top.html',
bot => 'index_bot.html',
},
Both the I<top> and I<bot> sub-attributes are optional. If these source
docs are, for example, in HTML, they have to be written in proper
HTML, so the parser will be able to extract the body.
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Doc.pm
Index: Doc.pm
===================================================================
package DocSet::Doc;
use strict;
use warnings;
use DocSet::Util;
use URI;
sub new {
my $class = shift;
my $self = bless {}, ref($class)||$class;
$self->init(@_);
return $self;
}
sub init {
my($self, %args) = @_;
while (my($k, $v) = each %args) {
$self->{$k} = $v;
}
}
sub scan {
my($self) = @_;
note "Scanning $self->{src_uri}";
$self->src_read();
$self->retrieve_meta_data();
}
sub render {
my($self, $cache) = @_;
# if the object wasn't stored rescan
#$self->scan() unless $self->meta;
my $src_uri = $self->{src_uri};
my $dst_path = $self->{dst_path};
my $rel_doc_root = $self->{rel_doc_root};
my $abs_doc_root = $self->{abs_doc_root};
$abs_doc_root .= "/$rel_doc_root" if defined $rel_doc_root;
$self->{dir} = {
abs_doc_root => $abs_doc_root,
rel_doc_root => $rel_doc_root,
};
$self->{nav} = DocSet::NavigateCache->new($cache->path, $src_uri);
note "Rendering $dst_path";
$self->convert();
write_file($dst_path, $self->{output});
}
# read the source and remember the mod time
# sets $self->{content}
# $self->{timestamp}
sub src_read {
my($self) = @_;
# META: at this moment everything is a file path
my $src_uri = "file://" . $self->{src_path};
my $u = URI->new($src_uri);
my $scheme = $u->scheme;
if ($scheme eq 'file') {
my $path = $u->path;
my $content = '';
read_file($path, \$content);
$self->{content} = \$content;
# file change timestamp
my($mon, $day, $year) = (localtime ( (stat($path))[9] ) )[4,3,5];
$self->{timestamp} = sprintf "%02d/%02d/%04d", ++$mon,$day,1900+$year;
}
else {
die "$scheme is not implemented yet";
}
if (my $sub = $self->can('src_filter')) {
$self->$sub();
}
}
sub meta {
my $self = shift;
if (@_) {
$self->{meta} = shift;
}
else {
$self->{meta};
}
}
sub toc {
my $self = shift;
if (@_) {
$self->{toc} = shift;
}
else {
$self->{toc};
}
}
# abstract methods
#sub src_filter {}
1;
__END__
=head1 NAME
C<DocSet::Doc> - A Base Document Class
=head1 SYNOPSIS
use DocSet::Doc::HTML2HTML ();
my $doc = DocSet::Doc::HTML2HTML->new(%args);
$doc->scan();
my $meta = $doc->meta();
my $toc = $doc->toc();
$doc->render($cache);
# internal methods
$doc->src_read();
$doc->src_filter();
=head1 DESCRIPTION
This superclass implements core methods for scanning a single document
of a given format and rendering it into another format. It provides
sub-classes with hooks that can change the default behavior. Note that
this class cannot be used as it is; you have to subclass it and
implement the required methods listed later.
=head1 METHODS
=over
=item * new
=item * init
=item * scan
scan the document into a parsed tree and retrieve its meta and toc
data if possible.
=item * render
render the output document and write it to its final destination.
=item * src_read
Fetches the source of the document. The source can be read from
different media, e.g. a file://, http://, relational DB or OCR :)
At the moment only file paths are implemented.
A subclass may implement a "source" filter. For example if the source
document is written in an extended POD the source filter may convert
it into a standard POD. If the source includes some template
directives these can be pre-processed as well.
The document's content comes out of this method ready for parsing
and converting into other formats.
=item * meta
a simple set/get-able accessor to the I<meta> attribute.
=item * toc
a simple set/get-able accessor to the I<toc> attribute
=back
=head1 ABSTRACT METHODS
These methods must be implemented by the sub-classes:
=over
=item retrieve_meta_data
Retrieve and set the meta data that describes the input document into
the I<meta> object attribute. Various documents may provide different
meta information. The only required meta field is I<title>.
=back
These methods can be implemented by the sub-classes:
=over
=item src_filter
A subclass may want to preprocess the source document before it is
processed. This method is called after the source has been read. By
default nothing happens.
=back
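The src_filter hook can be sketched as follows. This is only an
illustration: the package C<My::Doc::PseudoPOD> and its stripped-down
new() and src_read() are hypothetical stand-ins, not part of DocSet.
The one thing taken from the real class is that the content is kept as
a reference to a scalar in the I<content> attribute and that the filter
is called once the source has been read.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DocSet::Doc subclass; only the parts
# needed to show the src_filter() hook are reproduced here.
package My::Doc::PseudoPOD;

sub new {
    my($class, %args) = @_;
    return bless { %args }, ref($class) || $class;
}

# Mimics the tail of src_read(): the content is stored as a reference
# to a scalar, then the optional filter hook is called.
sub src_read {
    my($self, $content) = @_;
    $self->{content} = \$content;
    if (my $sub = $self->can('src_filter')) {
        $self->$sub();
    }
}

# The hook itself: turn "* foo" pseudo-items into real POD items.
sub src_filter {
    my($self) = @_;
    ${ $self->{content} } =~ s/^\* /=item * /mg;
}

package main;

my $doc = My::Doc::PseudoPOD->new();
$doc->src_read("* first\n* second\n");
print ${ $doc->{content} };
```

Since the hook is looked up with can(), a subclass that doesn't define
src_filter simply skips this step.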
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/DocSet.pm
Index: DocSet.pm
===================================================================
package DocSet::DocSet;
use strict;
use warnings;
use DocSet::Util;
use DocSet::RunTime;
use DocSet::Cache ();
use DocSet::Doc ();
use DocSet::NavigateCache ();
use vars qw(@ISA);
use DocSet::Config ();
@ISA = qw(DocSet::Config);
########
sub new {
my $class = shift;
my $self = bless {}, ref($class)||$class;
$self->init(@_);
return $self;
}
sub init {
my($self, $config_file, $parent_o, $src_rel_dir) = @_;
$self->read_config($config_file);
# are we inside a super docset?
if ($parent_o and ref($parent_o)) {
$self->{parent_o} = $parent_o;
$self->merge_config($src_rel_dir);
}
}
sub scan {
my($self) = @_;
my $src_root = $self->get_dir('src_root');
# each output mode needs its own cache, because the destination
# links are different
my $mode = $self->get('tmpl_mode');
my $cache = DocSet::Cache->new("$src_root/cache.$mode.dat");
$self->cache($cache); # store away
# cleanup the cache or rebuild
$cache->invalidate if get_opts('rebuild_all');
# cache the index node meta data
$cache->index_node($self->get('id'),
$self->get('title'),
$self->get('abstract')
);
# cache the location of the parent node cache
if (my $parent_o = $self->get('parent_o')) {
my $parent_src_root = $parent_o->get_dir('src_root');
(my $rel2parent_src_root = $src_root) =~ s|\Q$parent_src_root\E||;
my $rel_dir = join '/', ("..") x ($rel2parent_src_root =~ tr|/|/|);
my $parent_cache_path = "$parent_src_root/cache.$mode.dat";
$cache->parent_node($parent_cache_path,
$self->get('id'),
$rel_dir);
$self->set_dir(rel_parent_root => $rel_dir);
}
###
# scan the nodes of the current level and cache the meta and other
# data
my $hidden = 0;
my @nodes_by_type = @{ $self->nodes_by_type };
while (@nodes_by_type) {
my($type, $data) = splice @nodes_by_type, 0, 2;
if ($type eq 'docsets') {
my $docset = $self->docset_scan_n_cache($data, $hidden);
$self->object_store($docset)
if defined $docset and ref $docset;
} elsif ($type eq 'chapters') {
my $chapter = $self->chapter_scan_n_cache($data, $hidden);
$self->object_store($chapter)
if defined $chapter and ref $chapter;
} elsif ($type eq 'links') {
$self->link_scan_n_cache($data, $hidden);
# we don't need to process links
} else {
# nothing
}
}
# the same but for the hidden objects
$hidden = 1;
my @hidden_nodes_by_type = @{ $self->hidden_nodes_by_type };
while (@hidden_nodes_by_type) {
my($type, $data) = splice @hidden_nodes_by_type, 0, 2;
if ($type eq 'docsets') {
my $docset = $self->docset_scan_n_cache($data, $hidden);
$self->object_store($docset)
if defined $docset and ref $docset;
} elsif ($type eq 'chapters') {
my $chapter = $self->chapter_scan_n_cache($data, $hidden);
$self->object_store($chapter)
if defined $chapter and ref $chapter;
} else {
# nothing
}
}
$cache->node_groups($self->node_groups);
# sync the cache
$cache->write;
}
sub docset_scan_n_cache {
my($self, $src_rel_dir, $hidden) = @_;
my $src_root = $self->get_dir('src_root');
my $cfg_file = "$src_root/$src_rel_dir/config.cfg";
my $docset = $self->new($cfg_file, $self, $src_rel_dir);
$docset->scan;
# cache the children meta data
my $id = $docset->get('id');
my $meta = {
title => $docset->get('title'),
link => "$src_rel_dir/index.html",
abstract => $docset->get('abstract'),
};
$self->cache->set($id, 'meta', $meta, $hidden);
return $docset;
}
sub link_scan_n_cache {
my($self, $link, $hidden) = @_;
my %meta = %$link; # make a copy
my $id = delete $meta{id};
$self->cache->set($id, 'meta', \%meta, $hidden);
}
sub chapter_scan_n_cache {
my($self, $src_file, $hidden) = @_;
my $trg_ext = $self->trg_ext();
my $src_root = $self->get_dir('src_root');
my $dst_root = $self->get_dir('dst_root');
my $abs_doc_root = $self->get_dir('abs_doc_root');
my $src_path = "$src_root/$src_file";
my $src_ext = filename_ext($src_file)
or die "cannot get an extension for $src_file";
my $src_mime = $self->ext2mime($src_ext)
or die "unknown extension: $src_ext";
(my $basename = $src_file) =~ s/\.\Q$src_ext\E$//;
# destination paths
my $rel_dst_path = "$basename.$trg_ext";
my $rel_doc_root = "./";
$rel_dst_path =~ s|^\./||; # strip the leading './'
$rel_doc_root .= join '/', ("..") x ($rel_dst_path =~ tr|/|/|);
$rel_doc_root =~ s|/$||; # remove the last '/'
my $dst_path = "$dst_root/$rel_dst_path";
# push to the list of final chapter paths
# e.g. used by PS/PDF build, which needs all the chapters
$self->trg_chapters($rel_dst_path);
### to rebuild or not to rebuild
my($should_update, $reason) = should_update($src_path, $dst_path);
if (!$should_update) {
note "--- $src_file: skipping ($reason)";
return undef;
}
### init
note "+++ $src_file: processing ($reason)";
my $dst_mime = $self->get('dst_mime');
my $conv_class = $self->conv_class($src_mime, $dst_mime);
require_package($conv_class);
my $chapter = $conv_class->new(
tmpl_mode => $self->get('tmpl_mode'),
tmpl_root => $self->get_dir('tmpl'),
src_uri => $src_file,
src_path => $src_path,
dst_path => $dst_path,
rel_dst_path => $rel_dst_path,
rel_doc_root => $rel_doc_root,
abs_doc_root => $abs_doc_root,
);
$chapter->scan();
# cache the chapter's meta and toc data
$self->cache->set($src_file, 'meta', $chapter->meta, $hidden);
$self->cache->set($src_file, 'toc', $chapter->toc, $hidden);
return $chapter;
}
sub render {
my($self) = @_;
# copy non-pod files like images and stylesheets
$self->copy_the_rest;
my $src_root = $self->get_dir('src_root');
# each output mode needs its own cache, because the destination
# links are different
my $mode = $self->get('tmpl_mode');
my $cache = DocSet::Cache->new("$src_root/cache.$mode.dat");
# render the objects no matter what kind they are
for my $obj ($self->stored_objects) {
$obj->render($cache);
}
$self->complete;
}
####################
sub copy_the_rest {
my($self) = @_;
my @copy_files = @{ $self->files_to_copy || [] };
return unless @copy_files;
my $src_root = $self->get_dir('src_root');
my $dst_root = $self->get_dir('dst_root');
note "+++ Copying the non-processed files from $src_root to $dst_root";
foreach my $src_path (@copy_files){
my $dst_path = $src_path;
# # some OSs's File::Find returns files with no dir prefix root
# # (that's what ()* is for
# $dst_path =~ s/(?:$src_root)*/$dst_root/;
$dst_path =~ s/\Q$src_root\E/$dst_root/;
# to rebuild or not to rebuild
my($should_update, $reason) =
should_update($src_path, $dst_path);
if (!$should_update) {
note "--- skipping cp $src_path $dst_path ($reason)";
next;
}
note "+++ processing cp $src_path $dst_path ($reason)";
copy_file($src_path, $dst_path);
}
}
# abstract methods
sub complete {}
1;
__END__
=head1 NAME
C<DocSet::DocSet> - An abstract docset generation class
=head1 SYNOPSIS
use DocSet::DocSet::HTML ();
my $docset = DocSet::DocSet::HTML->new($config_file);
# must start from the abs root
chdir $abs_root;
# must be a relative path to be able to move the generated code from
# location to location, without adjusting the links
$docset->set_dir(abs_root => ".");
$docset->scan;
$docset->render;
=head1 DESCRIPTION
C<DocSet::DocSet> processes a docset, which can include other docsets,
documents and links. In the first pass it scans the documents and the
docsets linked from it and caches their meta data, along with the
objects themselves, for later use. In the second pass the stored
objects are rendered and the docset is completed.
This class cannot be used on its own: it has to be extended by
sub-classes that know the specific input and output formats of the
documents to be processed. It implements only the functionality that
doesn't require format-specific knowledge.
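Because the generated tree is meant to be movable from location to
location, the scan pass derives for each chapter a relative path back
to the docset root. The sketch below follows the same steps as the
chapter scanning code; the helper name C<rel_doc_root> and the sample
paths are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DocSet): given a destination path
# relative to the docset root, return the relative path that leads
# back to that root.
sub rel_doc_root {
    my($rel_dst_path) = @_;
    $rel_dst_path =~ s|^\./||;                 # strip a leading './'
    my $depth = ($rel_dst_path =~ tr|/|/|);    # count directory levels
    my $root  = "./" . join '/', ("..") x $depth;
    $root =~ s|/$||;                           # "./" becomes "."
    return $root;
}

for my $path (qw(index.html guide/intro.html guide/install/unix.html)) {
    printf "%-26s => %s\n", $path, rel_doc_root($path);
}
```

A file at the top level gets ".", and each extra directory level adds
one "..", so links built from this value keep working wherever the
output tree is placed.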
=head2 METHODS
This class inherits from C<DocSet::Config> and you will find the
documentation of methods inherited from this class in its pod.
The following "public" methods are implemented in this super-class:
=over
=item * new
$class->new($config_file, $parent_o, $src_rel_dir);
=item * init
$self->init($config_file, $parent_o, $src_rel_dir);
=item * scan
$self->scan();
Scans the docset for meta data and tocs of its items and caches this
information and the item objects.
=item * render
$self->render();
Calls the render() method of each of the stored objects and creates an
index page linking all the items.
=item * copy_the_rest
$self->copy_the_rest()
Copies the items which aren't processed (e.g. images, css files, etc).
=back
=head2 ABSTRACT METHODS
The following methods should be implemented by the sub-classes.
=over
=item * parse
=item * retrieve_meta_data
=item * convert
=item * complete
$self->complete();
Put here anything that should be run after all the items have been
rendered and all the meta info has been collected, e.g. generation of
the I<index> file, which links to all the items and to the parent node
if such exists.
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/NavigateCache.pm
Index: NavigateCache.pm
===================================================================
package DocSet::NavigateCache;
use strict;
use warnings;
use DocSet::RunTime;
use DocSet::Util;
use Storable;
use Carp;
# cache the loaded cache files
use vars qw(%CACHE);
%CACHE = ();
#use vars qw(@ISA);
use DocSet::Cache ();
#@ISA = qw(DocSet::Cache);
use constant OBJ => 0;
use constant ID => 1;
use constant CUR_PATH => 2;
use constant REL_PATH => 3;
# $rel_path (to the parent) is optional (e.g. root doesn't have a parent)
sub new {
my($class, $cache_path, $id, $rel_path) = @_;
croak "no cache path specified" unless defined $cache_path;
croak "no id specified" unless defined $id;
my $cache = get_cache($cache_path);
my $self = bless [], ref($class)||$class;
$self->[OBJ] = $cache;
$self->[CUR_PATH] = $cache_path;
$self->[REL_PATH] = $rel_path if $rel_path;
$self->[ID] = $id;
return $self;
}
sub parent_rel_path {
my($self) = @_;
return defined $self->[REL_PATH] ? $self->[REL_PATH] : undef;
}
# get next item's object or undef if there are no more
sub next {
my($self) = @_;
my $cache = $self->[OBJ];
my $seq = $cache->id2seq($self->[ID]);
my $last_seq = $cache->total_ids - 1;
# if the next object is hidden, it's like there is no next object,
# because the hidden objects, if any, are always coming last
if ($seq < $last_seq) {
my $id = $cache->seq2id($seq + 1);
if ($cache->is_hidden($id)) {
return undef;
}
else {
return $self->new($self->[CUR_PATH], $id);
}
} else {
return undef;
}
}
# get prev item's object or undef if there are no more
sub prev {
my($self) = @_;
my $cache = $self->[OBJ];
my $seq = $cache->id2seq($self->[ID]);
# since the hidden objects, if any, are always coming last
# we need to go to the last of the non-hidden objects.
if ($seq) {
my $id = $cache->seq2id($seq - 1);
if ($cache->is_hidden($id)) {
return $self->new($self->[CUR_PATH], $id)->prev();
}
else {
return $self->new($self->[CUR_PATH], $id);
}
} else {
return undef;
}
}
# get the object of the first item on the same level
sub first {
my($self) = @_;
my $cache = $self->[OBJ];
# it's possible that the whole docset is made of hidden objects.
# since the hidden objects, if any, are always coming last
# we simply return undef in such a case
if ($cache->total_ids) {
my $id = $cache->seq2id(0);
if ($cache->is_hidden($id)) {
return undef;
}
else {
return $self->new($self->[CUR_PATH], $id);
}
}
else {
return undef;
}
}
# the index node of the current level
sub index_node {
my($self) = @_;
return $self->[OBJ]->index_node;
}
# get the object of the parent
sub up {
my($self) = @_;
my($path, $id, $rel_path) = $self->[OBJ]->parent_node;
$rel_path = "." unless defined $rel_path;
if (defined $self->[REL_PATH] && length $self->[REL_PATH]) {
# append the relative path of each child, so the overall
# relative path is correct
$rel_path .= "/$self->[REL_PATH]";
}
# it's ok to have a hidden parent: we can treat it
# as non-hidden, since the children of the hidden parent aren't
# linked from other non-hidden pages. In fact we must ignore the
# fact that it's hidden (if it is) because otherwise the navigation
# won't work.
if ($path) {
return $self->new($path, $id, $rel_path);
}
else {
return undef;
}
}
# retrieve the meta data of the current node
sub meta {
my($self) = @_;
return $self->[OBJ]->get($self->[ID], 'meta');
}
# retrieve the node groups
sub node_groups {
my($self) = @_;
#print "OK: ";
#dumper $self->[OBJ]->node_groups;
return $self->[OBJ]->node_groups;
}
sub id {
shift->[ID];
}
sub get_cache {
my($cache_path) = @_;
$CACHE{$cache_path} ||= DocSet::Cache->new($cache_path);
return $CACHE{$cache_path};
}
1;
__END__
=head1 NAME
C<DocSet::NavigateCache> - Navigate the DocSet's caches in a readonly mode
=head1 SYNOPSIS
my $nav = DocSet::NavigateCache->new($cache_path, $id, $rel_path);
# go through all nodes from left to right, and remember the sequence
# number of the $nav node (from which we have started)
my $iterator = $nav->first;
my $seq = 0;
my $counter = 0;
my @meta = ();
while ($iterator) {
$seq = $counter if $iterator->id eq $nav->id;
push @meta, $iterator->meta;
$iterator = $iterator->next;
$counter++;
}
# add index node's meta data
push @meta, $nav->index_node;
# prev object
my $prev = $nav->prev;
# get all the ancestry
my @parents = ();
my $p = $nav->up;
while ($p) {
push @parents, $p;
$p = $p->up;
}
=head1 DESCRIPTION
C<DocSet::NavigateCache> navigates the cache created by docset objects
during their scan stage. Once the navigator handle is obtained, it's
possible to move between the nodes of the same level, using the next()
and prev() methods or going up one level using the up() method. The
first() method returns the object of the first node on the same
level. Each of these methods returns a new C<DocSet::NavigateCache>
object or undef if the object cannot be created.
This object can be used to retrieve a node's meta data, its id and its
index node's meta data.
Currently it is used in the templates to create the internal
navigation widgets. That's where you will find examples of its use
(e.g. I<tmpl/custom/html/menu_top_level> and
I<tmpl/custom/html/navbar_global>).
As C<DocSet::NavigateCache> reads cache files in, it caches them, since
usually the same file is required many times in a few subsequent
calls.
Note that C<DocSet::NavigateCache> doesn't see any hidden objects
stored in the cache.
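The way next() and prev() skip hidden nodes can be sketched without a
real cache file. Everything below is a toy model: the ids, the
C<%hidden> table and the helpers C<next_id> and C<prev_id> are made up
for this illustration and work on plain ids rather than on navigator
objects. The only assumption taken from the module is that hidden
nodes, if any, always come last in the sequence.

```perl
use strict;
use warnings;

# Toy stand-in for the cache: ids in sequence order, hidden ones last.
my @ids    = qw(intro.pod install.pod config.pod secret.pod);
my %hidden = ('secret.pod' => 1);
my %seq;
@seq{@ids} = (0 .. $#ids);

# id of the next visible node, or undef at the end of the level
sub next_id {
    my($id) = @_;
    my $seq = $seq{$id};
    return undef if $seq >= $#ids;
    my $next = $ids[$seq + 1];
    return $hidden{$next} ? undef : $next;
}

# id of the previous visible node, or undef at the start of the level
sub prev_id {
    my($id) = @_;
    my $seq = $seq{$id};
    while ($seq > 0) {
        my $prev = $ids[--$seq];
        return $prev unless $hidden{$prev};
    }
    return undef;
}

# walk the visible nodes from left to right
my @visible;
for (my $id = $ids[0]; defined $id; $id = next_id($id)) {
    push @visible, $id;
}
print join(", ", @visible), "\n";
```

Going forward, the first hidden node ends the walk, because nothing
visible can follow it. Going backward, hidden nodes are skipped until a
visible one is found.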
=head2 METHODS
META: to be completed (see SYNOPSIS meanwhile)
=over
=item * new
DocSet::NavigateCache->new($cache_path, $id, $rel_path);
C<$cache_path> is the path of the cache file to read.
C<$id> is the id of the current node.
C<$rel_path> is optional and passed if an object has a parent node. It
contains a relative path from the current node to its parent.
=item * parent_rel_path
=item * next
=item * prev
=item * first
=item * up
=item * index_node
=item * meta
=item * id
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/RunTime.pm
Index: RunTime.pm
===================================================================
package DocSet::RunTime;
use strict;
use warnings;
use vars qw(@ISA @EXPORT %opts);
require Exporter;
@ISA = qw(Exporter);
@EXPORT = qw(get_opts);
sub set_opt {
my(%args) = ();
if (@_ == 1) {
my $arg = shift;
my $ref = ref $arg;
if ($ref) {
%args = $ref eq 'HASH' ? %$arg : @$arg;
} else {
die "the argument must be a reference to an array or a hash";
}
} else {
%args = @_;
}
@opts{keys %args} = values %args;
}
sub get_opts {
my $opt = shift;
exists $opts{$opt} ? $opts{$opt} : '';
}
# check whether we have Storable available
use constant HAS_STORABLE => eval { require Storable; };
sub has_storable_module {
return HAS_STORABLE;
}
my $html2ps_exec = `which html2ps` || '';
chomp $html2ps_exec;
sub can_create_ps {
# html2ps is bundled, so we can always create PS
return $html2ps_exec;
# if you unbundle it make sure you write here a code similar to
# can_create_pdf()
}
my $ps2pdf_exec = `which ps2pdf` || '';
chomp $ps2pdf_exec;
sub can_create_pdf {
# check whether ps2pdf exists
return $ps2pdf_exec if $ps2pdf_exec;
print(qq{It seems that you do not have ps2pdf installed! You have
to install it if you want to generate the PDF file
});
return 0;
}
1;
__END__
=head1 NAME
C<DocSet::RunTime> - RunTime Configuration
=head1 SYNOPSIS
use DocSet::RunTime;
if (get_opts('verbose')) {
print "verbose mode";
}
DocSet::RunTime::set_opt(\%args);
DocSet::RunTime::has_storable_module();
DocSet::RunTime::can_create_ps();
DocSet::RunTime::can_create_pdf();
=head1 DESCRIPTION
This module is a part of the docset application, and it stores the
run-time arguments, e.g. whether to build PS and PDF, or whether to
run in verbose mode.
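A stand-alone sketch of the option store follows. It mirrors the logic
of set_opt() and get_opts() but lives in package main, so it runs
without DocSet installed; treat it as an illustration rather than as
the module's code. The option names are the ones used elsewhere in the
DocSet sources.

```perl
use strict;
use warnings;

# Stand-alone sketch of the option store; %opts, set_opt() and
# get_opts() mirror the module's names but live in package main here.
my %opts;

sub set_opt {
    my %args;
    if (@_ == 1) {
        my $arg = shift;
        my $ref = ref $arg
            or die "the argument must be a reference to an array or a hash";
        %args = $ref eq 'HASH' ? %$arg : @$arg;
    }
    else {
        %args = @_;
    }
    @opts{keys %args} = values %args;
}

sub get_opts {
    my $opt = shift;
    return exists $opts{$opt} ? $opts{$opt} : '';
}

set_opt({ verbose => 1 });         # hash reference
set_opt([ rebuild_all => 0 ]);     # array reference
set_opt(generate_pdf => 1);        # plain list

print "verbose mode\n" if get_opts('verbose');
print "unknown options read as an empty string\n"
    if get_opts('no_such_option') eq '';
```

Returning an empty string for options that were never set lets callers
test any option in boolean context without an "uninitialized value"
warning.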
=head1 FUNCTIONS
META: To be completed, see SYNOPSIS
=over
=item * set_opt
=item * get_opts
=item * has_storable_module
=item * can_create_ps
=item * can_create_pdf
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Util.pm
Index: Util.pm
===================================================================
package DocSet::Util;
use strict;
use warnings;
use Symbol ();
use File::Basename ();
use File::Copy ();
use File::Path ();
use Data::Dumper;
use Carp;
use Template;
use DocSet::RunTime;
use vars qw(@ISA @EXPORT);
@ISA = qw(Exporter);
@EXPORT = qw(read_file read_file_paras copy_file write_file create_dir
filename_ext require_package dumper sub_trace note
get_date get_timestamp proc_tmpl build_matchmany_sub
banner should_update confess cluck);
# copy_file($src_path, $dst_path);
# copy a file at $src_path to $dst_path,
# if one of the directories of the $dst_path doesn't exist -- it'll
# be created.
###############
sub copy_file {
my($src, $dst) = @_;
# make sure that the directory exists or create one
my $base_dir = File::Basename::dirname $dst;
create_dir($base_dir) unless (-d $base_dir);
File::Copy::copy($src, $dst);
}
# write_file($filename, $ref_to_array||scalar);
# the content is written to the file either from the passed array of
# paragraphs or from a plain scalar
###############
sub write_file {
my($filename, $content) = @_;
# make sure that the directory exists or create one
my $dir = File::Basename::dirname $filename;
create_dir($dir) unless -d $dir;
my $fh = Symbol::gensym;
open $fh, ">$filename" or croak "Can't open $filename for writing: $!";
print $fh ref $content ? @$content : defined $content ? $content : '';
close $fh;
}
# recursively creates a multi-layer directory
###############
sub create_dir {
my $path = shift;
return if !defined($path) || -e $path;
# META: mode could be made configurable
File::Path::mkpath($path, 0, 0755) or croak "Couldn't create $path: $!";
}
# read_file($filename, $ref);
# assign to a ref to a scalar
###############
sub read_file {
my($filename, $r_content) = @_;
my $fh = Symbol::gensym;
open $fh, $filename or croak "Can't open $filename for reading: $!";
local $/;
$$r_content = <$fh>;
close $fh;
}
# read_file_paras($filename, $ref_to_array);
# read by paragraph
# content will be set into a ref to an array
###############
sub read_file_paras {
my($filename, $ra_content) = @_;
my $fh = Symbol::gensym;
open $fh, $filename or croak "Can't open $filename for reading: $!";
local $/ = "";
@$ra_content = <$fh>;
close $fh;
}
# return the passed file's extension or '' if there is none
# note: only the part after the last dot counts, so '/foo/bar.conf.in'
# returns the extension 'in'
# note: a hidden file .foo will be recognized as an extension 'foo'
sub filename_ext {
my($filename) = @_;
my $ext = (File::Basename::fileparse($filename, '\.[^\.]*'))[2] || '';
$ext =~ s/^\.(.*)/lc $1/e;
$ext;
}
sub get_date {
sprintf "%s %d, %d", (split /\s+/, scalar localtime)[1,2,4];
}
sub get_timestamp {
my ($mon,$day,$year) = (localtime ( time ) )[4,3,5];
sprintf "%02d/%02d/%04d", ++$mon, $day, 1900+$year;
}
# convert Foo::Bar into Foo/Bar.pm and require
sub require_package {
my $package = shift;
die "no package passed" unless $package;
$package =~ s|::|/|g;
$package .= '.pm';
require $package;
}
# convert the template into the release version
# $tmpl_root: a ref to an array of tmpl base dirs
# tmpl_file: which template file to process
# mode : in what mode (html, ps, ...)
# vars : ref to a hash with vars to pass to the template
#
# returns the processed template
###################
sub proc_tmpl {
my($tmpl_root, $tmpl_file, $mode, $vars) = @_;
# append the specific rendering mode, so the correct template will
# be picked (e.g. in 'ps' mode, the ps sub-dir(s) will be searched
# first)
my $search_path = join ':',
map { ("$_/$mode", "$_/common", "$_") }
(ref $tmpl_root ? @$tmpl_root : $tmpl_root);
my $template = Template->new
({
INCLUDE_PATH => $search_path,
RECURSION => 1,
PLUGINS => {
cnavigator => 'DocSet::Template::Plugin::NavigateCache',
},
}) || die $Template::ERROR, "\n";
# use Data::Dumper;
# print Dumper \@search_path;
my $output;
$template->process($tmpl_file, $vars, \$output)
|| die "error: ", $template->error(), "\n";
return $output;
}
# compare the timestamps/existence of src and dst paths
# and return (true,reason) if src is newer than dst
# otherwise return (false, reason)
#
# if rebuild_all runtime is on, this always returns (true, reason)
#
sub should_update {
my($src_path, $dst_path) = @_;
# to rebuild or not to rebuild
my $not_modified =
(-e $dst_path and -M $dst_path < -M $src_path) ? 1 : 0;
my $reason = $not_modified ? 'not modified' : 'modified';
if (get_opts('rebuild_all')) {
return (1, "$reason / forced");
} else {
return (!$not_modified, $reason);
}
}
sub banner {
my($string) = @_;
my $len = length($string) + 8;
note(
"#" x $len,
"### $string ###",
"#" x $len,
);
}
# see DocSet::Config::files_to_copy() for usage
#########################
sub build_matchmany_sub {
my $ra_regex = shift;
my $expr = join '||', map { "\$_[0] =~ m/$_/o" } @$ra_regex;
# note $expr;
my $matchsub = eval "sub { ($expr) ? 1 : 0}";
die "Failed in building regex [@$ra_regex]: $@" if $@;
$matchsub;
}
sub dumper {
print Dumper @_;
}
#sub sub_trace {
## my($package) = (caller(0))[0];
# my($sub) = (caller(1))[3];
# print "=> $sub: @_\n";
#}
*confess = \*Carp::confess;
*cluck = \*Carp::cluck;
sub note {
return unless get_opts('verbose');
print join("\n", @_), "\n";
}
1;
__END__
=head1 NAME
C<DocSet::Util> - Commonly used functions
=head1 SYNOPSIS
use DocSet::Util;
copy_file($src, $dst);
write_file($filename, $content);
create_dir($path);
read_file($filename, $r_content);
read_file_paras($filename, $ra_content);
my $ext = filename_ext($filename);
my $date = get_date();
my $timestamp = get_timestamp();
require_package($package);
my $output = proc_tmpl($tmpl_root, $tmpl_file, $mode, $vars);
my($should_update, $reason) = should_update($src_path, $dst_path);
banner($string);
my $sub_ref = build_matchmany_sub($ra_regex);
dumper($ref);
confess($string);
note($string);
=head1 DESCRIPTION
All the functions are exported by default.
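The rebuild decision made by should_update() can be tried out with two
temporary files. The function below is a local copy of the timestamp
test without the I<rebuild_all> override, so that the sketch runs
without DocSet installed; the file names are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Local copy of the timestamp test, without the rebuild_all override.
sub should_update {
    my($src_path, $dst_path) = @_;
    my $not_modified =
        (-e $dst_path and -M $dst_path < -M $src_path) ? 1 : 0;
    return (!$not_modified, $not_modified ? 'not modified' : 'modified');
}

my $dir = tempdir(CLEANUP => 1);
my($src, $dst) = ("$dir/chapter.pod", "$dir/chapter.html");

# create the source file and give it an old modification time
open my $fh, '>', $src or die "Can't open $src: $!";
print $fh "placeholder source\n";
close $fh;
my $hour_ago = time - 3600;
utime $hour_ago, $hour_ago, $src or die "Can't set mtime on $src: $!";

# no destination yet: must build
my($update, $reason) = should_update($src, $dst);
print "first run:  ", ($update ? "build" : "skip"), " ($reason)\n";

# destination written after the source: nothing to do
open $fh, '>', $dst or die "Can't open $dst: $!";
close $fh;
($update, $reason) = should_update($src, $dst);
print "second run: ", ($update ? "build" : "skip"), " ($reason)\n";
```

The function returns a two-element list, so it should be called in list
context. In scalar context the call would yield the reason string,
which is always true.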
=head2 METHODS
META: to be completed (see SYNOPSIS meanwhile)
=over
=item * copy_file
=item * write_file
=item * create_dir
=item * read_file
=item * read_file_paras
=item * filename_ext
=item * get_date
=item * get_timestamp
=item * require_package
=item * proc_tmpl
=item * should_update
=item * banner
=item * build_matchmany_sub
=item * dumper
=item * confess
=item * note
=back
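To see what filename_ext() returns for a few typical names, the
following sketch copies its logic into a stand-alone script. The file
names are made up for the illustration.

```perl
use strict;
use warnings;
use File::Basename ();

# Same logic as filename_ext(), copied here so the sketch is stand-alone.
sub filename_ext {
    my($filename) = @_;
    my $ext = (File::Basename::fileparse($filename, '\.[^\.]*'))[2] || '';
    $ext =~ s/^\.(.*)/lc $1/e;
    return $ext;
}

for my $file (qw(guide/intro.pod INDEX.HTML README .cvsignore)) {
    printf "%-16s => '%s'\n", $file, filename_ext($file);
}
```

The extension is always lowercased, a name without a dot gives an
empty string, and a hidden file such as I<.cvsignore> is reported as
having the extension I<cvsignore>.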
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Doc/HTML2HTML.pm
Index: HTML2HTML.pm
===================================================================
package DocSet::Doc::HTML2HTML;
use strict;
use warnings;
use DocSet::Util;
use vars qw(@ISA);
require DocSet::Source::HTML;
@ISA = qw(DocSet::Source::HTML);
sub convert {
my($self) = @_;
my @body = $self->{parsed_tree}->{body};
my $vars = {
meta => $self->{meta},
body => \@body,
dir => $self->{dir},
nav => $self->{nav},
last_modified => $self->{timestamp},
};
my $tmpl_file = 'page';
my $mode = $self->{tmpl_mode};
my $tmpl_root = $self->{tmpl_root};
$self->{output} = proc_tmpl($tmpl_root, $tmpl_file, $mode, {doc => $vars} );
}
# needed for plugging docs into index files
sub converted_body {
my($self) = @_;
return $self->{parsed_tree}->{body};
}
1;
__END__
=head1 NAME
C<DocSet::Doc::HTML2HTML> - HTML source to HTML target converter
=head1 SYNOPSIS
=head1 DESCRIPTION
Implements a C<DocSet::Doc> sub-class which converts a source
document in HTML, into an output document in HTML.
=head1 METHODS
For the rest of the super class methods see C<DocSet::Doc>.
=over
=item * convert
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Doc/POD2HTML.pm
Index: POD2HTML.pm
===================================================================
package DocSet::Doc::POD2HTML;
use strict;
use warnings;
use DocSet::Util;
require Pod::POM;
#require Pod::POM::View::HTML;
#my $view_mode = 'Pod::POM::View::HTML';
my $view_mode = 'DocSet::Doc::POD2HTML::View::HTML';
use vars qw(@ISA);
require DocSet::Source::POD;
@ISA = qw(DocSet::Source::POD);
sub convert {
my($self) = @_;
my $pom = $self->{parsed_tree};
# my @sections = $pom->head1();
# shift @sections; # skip the title
my @sections = $pom->content();
shift @sections; # skip the title
my @body = ();
foreach my $node (@sections) {
# my $type = $node->type();
# print "$type\n";
push @body, $node->present($view_mode);
}
# for my $head1 (@sections) {
# push @body, $head1->title->present($view_mode);
# push @body, $head1->content->present($view_mode);
# for my $head2 ($head1->head2) {
# push @body, $head2->present($view_mode);
# for my $head3 ($head2->head3) {
# push @body, $head3->present($view_mode);
# for my $head4 ($head3->head4) {
# push @body, $head4->present($view_mode);
# }
# }
# }
# }
my $vars = {
meta => $self->{meta},
toc => $self->{toc},
body => \@body,
dir => $self->{dir},
nav => $self->{nav},
last_modified => $self->{timestamp},
};
my $tmpl_file = 'page';
my $mode = $self->{tmpl_mode};
my $tmpl_root = $self->{tmpl_root};
$self->{output} = proc_tmpl($tmpl_root, $tmpl_file, $mode, {doc => $vars} );
}
1;
package DocSet::Doc::POD2HTML::View::HTML;
use vars qw(@ISA);
require Pod::POM::View::HTML;
@ISA = qw( Pod::POM::View::HTML);
sub view_head1 {
my ($self, $node) = @_;
return $self->anchor($node->title) . $self->SUPER::view_head1($node);
}
sub view_head2 {
my ($self, $node) = @_;
return $self->anchor($node->title) . $self->SUPER::view_head2($node);
}
sub view_head3 {
my ($self, $node) = @_;
return $self->anchor($node->title) . $self->SUPER::view_head3($node);
}
sub view_head4 {
my ($self, $node) = @_;
return $self->anchor($node->title) . $self->SUPER::view_head4($node);
}
sub anchor {
my($self, $title) = @_;
my $anchor = "$title";
$anchor =~ s/\W/_/g;
return qq{<a name="$anchor"></a>\n};
}
1;
__END__
=head1 NAME
C<DocSet::Doc::POD2HTML> - POD source to HTML target converter
=head1 SYNOPSIS
=head1 DESCRIPTION
Implements a C<DocSet::Doc> sub-class which converts a source
document in POD, into an output document in HTML.
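The HTML view used by this converter puts a named anchor in front of
every heading, derived from the heading's title. The transformation is
small enough to show on its own; the function below repeats what the
view's anchor() method does, outside of any class, and the sample
titles are made up.

```perl
use strict;
use warnings;

# Same transformation as the anchor() method of the view class:
# every non-word character in the title becomes an underscore.
sub anchor {
    my($title) = @_;
    (my $name = "$title") =~ s/\W/_/g;
    return qq{<a name="$name"></a>\n};
}

print anchor("Modification control");
print anchor("What's new?");
```

Note that two titles which differ only in punctuation or spacing end up
with the same anchor name.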
=head1 METHODS
For the rest of the super class methods see C<DocSet::Doc>.
=over
=item * convert
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Doc/Text2HTML.pm
Index: Text2HTML.pm
===================================================================
package DocSet::Doc::Text2HTML;
use strict;
use warnings;
use vars qw(@ISA);
require DocSet::Source::Text;
@ISA = qw(DocSet::Source::Text);
###########################
### not implemented yet ###
###########################
1;
__END__
=head1 NAME
C<DocSet::Doc::Text2HTML> - Text source to HTML target converter
=head1 SYNOPSIS
=head1 DESCRIPTION
Implements a C<DocSet::Doc> sub-class which converts a source
document in Text, into an output document in HTML.
=cut
1.1 modperl-docs/lib/DocSet/DocSet/HTML.pm
Index: HTML.pm
===================================================================
package DocSet::DocSet::HTML;
use strict;
use warnings;
use DocSet::Util;
use DocSet::NavigateCache ();
use vars qw(@ISA);
use DocSet::DocSet ();
@ISA = qw(DocSet::DocSet);
# what's the output format
sub trg_ext {
return 'html';
}
sub init {
my $self = shift;
$self->SUPER::init(@_);
# configure HTML specific run-time
$self->set(dst_mime => 'text/html');
$self->set(tmpl_mode => 'html');
$self->set_dir(dst_root => $self->get_dir('dst_html'));
banner("HTML DocSet: " . $self->get('title') );
}
sub complete {
my($self) = @_;
$self->write_index_file();
}
# generate the index.html based on the doc entities it includes, in
# the following order: docsets, books, chapters
#
# Using the same template file create the long and the short index
# html files
##################################
sub write_index_file {
my($self) = @_;
my @toc = ();
my $cache = $self->cache;
# TOC
my @node_groups = @{ $self->node_groups };
my @ids = $cache->ordered_ids;
# create the toc while skipping over hidden files
if (@node_groups && @ids) {
# index's toc is built from groups of items' meta data
while (@node_groups) {
my($title, $count) = splice @node_groups, 0, 2;
push @toc, {
group_title => $title,
subs => [map {$cache->get($_, 'meta')}
grep !$cache->is_hidden($_),
splice @ids, 0, $count],
};
}
}
else {
# index's toc is built from items' meta data
for my $id (grep !$cache->is_hidden($_), $cache->ordered_ids) {
push @toc, $cache->get($id, 'meta');
}
}
my $dir = {
abs_doc_root => $self->get_dir('abs_doc_root'),
rel_doc_root => $self->get_dir('rel_parent_root'),
};
my $meta = {
title => $self->get('title'),
abstract => $self->get('abstract'),
};
my $navigator = DocSet::NavigateCache->new($self->cache->path, $self->get('id'));
my %args = (
nav => $navigator,
toc => \@toc,
meta => $meta,
dir => $dir,
version => $self->get('version')||'',
date => get_date(),
last_modified => get_timestamp(),
# body => top
);
# plug in the index top and bottom docs if defined (after converting them)
if (my $body = $self->get('body')) {
my $src_root = $self->get_dir('src_root');
my $dst_mime = $self->get('dst_mime');
for my $sec (qw(top bot)) {
my $src_file = $body->{$sec};
next unless $src_file;
my $src_ext = filename_ext($src_file)
or die "cannot get an extension for $src_file";
my $src_mime = $self->ext2mime($src_ext)
or die "unknown extension: $src_ext";
my $conv_class = $self->conv_class($src_mime, $dst_mime);
require_package($conv_class);
my $chapter = $conv_class->new(
tmpl_mode => $self->get('tmpl_mode'),
tmpl_root => $self->get_dir('tmpl'),
src_uri => $src_file,
src_path => "$src_root/$src_file",
);
$chapter->scan();
$args{body}{$sec} = $chapter->converted_body();
}
}
my $dst_root = $self->get_dir('dst_html');
my $dst_file = "$dst_root/index.html";
my $mode = $self->get('tmpl_mode');
my $tmpl_file = 'index';
my $vars = { doc => \%args };
my $tmpl_root = $self->get_dir('tmpl');
my $content = proc_tmpl($tmpl_root, $tmpl_file, $mode, $vars);
note "+++ Creating $dst_file";
DocSet::Util::write_file($dst_file, $content);
}
1;
__END__
=head1 NAME
C<DocSet::DocSet::HTML> - A subclass of C<DocSet::DocSet> for generating an HTML docset
=head1 SYNOPSIS
See C<DocSet::DocSet>
=head1 DESCRIPTION
This subclass of C<DocSet::DocSet> converts the source docset into a
set of HTML documents and links its items together with an
autogenerated I<index.html>.
=head2 METHODS
Most of the methods are documented in C<DocSet::DocSet>.
=over
=item * trg_ext
$self->trg_ext();
returns the extension of the target files. I<html> in the case of this
sub-class.
=item * init
$self->init(@_);
calls C<DocSet::DocSet::init> and then initializes its own HTML output
specific settings.
=item * complete
see C<DocSet::DocSet>
=item * write_index_file
$self->write_index_file();
creates the I<index.html> file linking all the items of the docset
together.
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/DocSet/PSPDF.pm
Index: PSPDF.pm
===================================================================
package DocSet::DocSet::PSPDF;
use strict;
use warnings;
use DocSet::Util;
use DocSet::RunTime;
use DocSet::NavigateCache ();
use vars qw(@ISA);
use DocSet::DocSet ();
@ISA = qw(DocSet::DocSet);
# what's the output format
sub trg_ext {
return 'html';
}
sub init {
my $self = shift;
$self->SUPER::init(@_);
# configure PS/PDF specific run-time
# though we build PS/PDF, the intermediate product is HTML
$self->set(dst_mime => 'text/html');
$self->set(tmpl_mode => 'ps');
$self->set_dir(dst_root => $self->get_dir('dst_ps'));
banner("PS/PDF DocSet: " . $self->get('title') );
}
sub complete {
my($self) = @_;
$self->write_index_file();
$self->create_ps_book;
$self->create_pdf_book if get_opts('generate_pdf');
}
# generate the index.html file based on the doc entities it includes,
# in the following order: docsets, books, chapters
#
# Using the same template file create the long and the short index
# html files
##################################
sub write_index_file{
my($self) = @_;
my $dir = {
abs_doc_root => $self->get_dir('abs_doc_root'),
rel_doc_root => '..', # META: probably wrong! (see write_index_html_file())
};
my $meta = {
title => $self->get('title'),
abstract => $self->get('abstract'),
};
use DocSet::NavigateCache;
my $navigator = DocSet::NavigateCache->new($self->cache->path, $self->get('id'));
my %args =
(
nav => $navigator,
meta => $meta,
dir => $dir,
version => $self->get('version')||'',
date => get_date(),
last_modified => get_timestamp(),
);
my $dst_root = $self->get_dir('dst_root');
my $dst_file = "$dst_root/index.html";
my $mode = $self->get('tmpl_mode');
my $tmpl_file = 'index';
my $vars = { doc => \%args };
my $tmpl_root = $self->get_dir('tmpl');
my $content = proc_tmpl($tmpl_root, $tmpl_file, $mode, $vars);
note "+++ Creating $dst_file";
DocSet::Util::write_file($dst_file, $content);
}
# generate the PS book
####################
sub create_ps_book{
my($self) = @_;
note "+++ Generating a PostScript Book";
my $html2ps_exec = DocSet::RunTime::can_create_ps();
my $html2ps_conf = $self->get_file('html2ps_conf');
my $id = $self->get('id');
my $dst_root = $self->get_dir('dst_root');
my $command = "$html2ps_exec -f $html2ps_conf -o $dst_root/${id}.ps ";
$command .= join " ", map {"$dst_root/$_"} "index.html", $self->trg_chapters;
note "% $command";
system $command;
}
# generate the PDF book
####################
sub create_pdf_book{
my($self) = @_;
note "+++ Converting PS => PDF";
my $dst_root = $self->get_dir('dst_root');
my $id = $self->get('id');
my $command = "ps2pdf $dst_root/$id.ps $dst_root/$id.pdf";
note "% $command";
system $command;
}
1;
__END__
=head1 NAME
C<DocSet::DocSet::PSPDF> - A subclass of C<DocSet::DocSet> for generating a PS/PDF docset
=head1 SYNOPSIS
See C<DocSet::DocSet>
=head1 DESCRIPTION
This subclass of C<DocSet::DocSet> converts the source docset into PS
and PDF "books". It uses C<html2ps> to generate the PS file, therefore
it uses HTML as its intermediate product, though it uses different
templates than C<DocSet::DocSet::HTML> since PS/PDF doesn't require
the navigation widgets.
=head2 METHODS
Most of the methods are documented in C<DocSet::DocSet>.
=over
=item * trg_ext
$self->trg_ext();
returns the extension of the target files. I<html> in the case of this
sub-class.
=item * init
$self->init(@_);
calls C<DocSet::DocSet::init> and then initializes its own PS/PDF output
specific settings (HTML remains the intermediate format).
=item * complete
see C<DocSet::DocSet>
=item * write_index_file
$self->write_index_file();
creates the I<index.html> file linking all the items of the docset
together.
=item * create_ps_book
Generates a PostScript book
=item * create_pdf_book
Converts the PS file into PDF (if the I<generate_pdf> runtime option is set)
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Source/HTML.pm
Index: HTML.pm
===================================================================
package DocSet::Source::HTML;
use strict;
use warnings;
use DocSet::Util;
use vars qw(@ISA);
require DocSet::Doc;
@ISA = qw(DocSet::Doc);
sub retrieve_meta_data {
my($self) = @_;
$self->parse;
use Pod::POM::View::HTML;
my $mode = 'Pod::POM::View::HTML';
#print Pod::POM::View::HTML->print($pom);
$self->{meta} =
{
title => $self->{parsed_tree}->{title},
link => $self->{rel_dst_path},
};
# there is no autogenerated TOC for HTML files
}
sub parse {
my($self) = @_;
# already parsed
return if exists $self->{parsed_tree} && $self->{parsed_tree};
# print ${ $self->{content} };
#my %segments = map {$_ => ''} qw(title body);
# this one retrieves the body and the title of the given html
require HTML::Parser;
sub start_h {}
sub end_h {
my($self, $tagname, $skipped_text) = @_;
# use $p itself as a tmp storage (ok according to the docs)
$self->{parsed_tree}->{$tagname} = $skipped_text;
}
my $p = HTML::Parser->new(api_version => 3,
report_tags => [qw(title body)],
start_h => [\&start_h],
end_h => [\&end_h, "self,tagname,skipped_text"],
);
# Parse document text chunk by chunk
$p->parse(${ $self->{content} });
$p->eof;
# store the tree away
$self->{parsed_tree} = $p->{parsed_tree};
}
1;
__END__
=head1 NAME
C<DocSet::Source::HTML> - A class for parsing input documents in the HTML format
=head1 SYNOPSIS
See C<DocSet::Source>
=head1 DESCRIPTION
=head1 METHODS
=over
=item * parse
Converts the source HTML document into a parsed tree.
=item * retrieve_meta_data
Retrieves the meta data that describes the input document and stores
it in the I<meta> object attribute. The I<title> and I<link> meta
attributes are set.
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Source/POD.pm
Index: POD.pm
===================================================================
package DocSet::Source::POD;
use strict;
use warnings;
use DocSet::Util;
use DocSet::RunTime;
use vars qw(@ISA);
require DocSet::Doc;
@ISA = qw(DocSet::Doc);
sub retrieve_meta_data {
my($self) = @_;
$self->parse_pod;
use Pod::POM::View::HTML;
my $mode = 'Pod::POM::View::HTML';
#print Pod::POM::View::HTML->print($pom);
my $meta = {};
my $pom = $self->{parsed_tree};
my @sections = $pom->head1();
# deliberately not calling ->present($mode): there should be no markup in NAME
my $name_sec = shift @sections;
if ($name_sec) {
$meta->{title} = $name_sec->content();
}
else {
$meta->{title} = 'No Title';
}
$meta->{title} =~ s/^\s*|\s*$//sg;
$meta->{link} = $self->{rel_dst_path};
# put all the meta data under the same attribute
$self->{meta} = $meta;
# build the toc datastructure
my @toc = ();
my $level = 1;
for my $node (@sections) {
push @toc, $self->render_toc_level($node, $level);
}
$self->{toc} = \@toc;
}
sub render_toc_level {
my($self, $node, $level) = @_;
my %toc_entry = ();
my $title = $node->title;
$toc_entry{link} = $toc_entry{title} = "$title"; # must stringify
$toc_entry{link} =~ s/\W/_/g; # META: put into a sub?
$toc_entry{link} = "#$toc_entry{link}"; # prepend '#' for internal links
my @sub = ();
$level++;
if ($level < 5) {
# if there are levels deeper than =head4 we don't descend into them (spec is 1-4)
my $method = "head$level";
for my $sub_node ($node->$method()) {
push @sub, $self->render_toc_level($sub_node, $level);
}
}
$toc_entry{subs} = \@sub if @sub;
return \%toc_entry;
}
sub parse_pod {
my($self) = @_;
# already parsed
return if exists $self->{parsed_tree} && $self->{parsed_tree};
$self->podify_items() if get_opts('podify_items');
# print ${ $self->{content} };
use Pod::POM;
my %options;
my $parser = Pod::POM->new(\%options);
my $pom = $parser->parse_text(${ $self->{content} })
or die $parser->error();
$self->{parsed_tree} = $pom;
# examine any warnings raised
if (my @warnings = $parser->warnings()) {
print "\n", '-' x 40, "\n";
print "File: $self->{src_path}\n";
warn "$_\n" for @warnings;
}
}
sub src_filter {
my ($self) = @_;
$self->extract_pod;
$self->podify_items if get_opts('podify_items');
}
sub extract_pod {
my($self) = @_;
my @pod = ();
my $in_pod = 0;
for (split /\n\n/, ${ $self->{content} }) {
$in_pod ||= /^=/s;
next unless $in_pod;
$in_pod = 0 if /^=cut/;
push @pod, $_;
}
# handle empty files
unless (@pod) {
push @pod, "=head1 NAME", "=head1 Not documented", "=cut";
}
my $content = join "\n\n", @pod;
$self->{content} = \$content;
}
sub podify_items {
my($self) = @_;
# tmp storage
my @paras = ();
my $items = 0;
my $second = 0;
# we want the source in paragraphs
my @content = split /\n\n/, ${ $self->{content} };
foreach (@content) {
# is it an item?
if (/^(\*|\d+)\s+((\*|\d+)\s+)?/) {
$items++;
if ($2) {
$second++;
s/^(\*|\d+)\s+//; # strip the first level shortcut
s/^(\*|\d+)\s+/=item $1\n\n/; # do the second
s/^/=over 4\n\n/ if $second == 1; # start 2nd level
} else {
# first time insert the =over pod tag
s/^(\*|\d+)\s+/=item $1\n\n/; # start 1st level
s/^/=over 4\n\n/ if $items == 1;
s/^/=back\n\n/ if $second; # complete 2nd level
$second = 0; # end 2nd level section
}
push @paras, split /\n\n/, $_;
} else {
# complete the =over =item =back tag
$second=0, push @paras, "=back" if $second; # if 2nd level is not closed
push @paras, "=back" if $items;
push @paras, $_;
# not a tag item
$items = 0;
}
}
my $content = join "\n\n", @paras;
$self->{content} = \$content;
}
1;
__END__
=head1 NAME
C<DocSet::Source::POD> - A class for parsing input documents in the POD format
=head1 SYNOPSIS
=head1 DESCRIPTION
=head2 METHODS
=over
=item retrieve_meta_data()
=item parse_pod()
=item podify_items()
podify_items();
Podify text to represent items in pod, e.g.:
1 Some text from item Item1
2 Some text from item Item2
becomes:
=over 4
=item 1
Some text from item Item1
=item 2
Some text from item Item2
=back
podify_items() accepts 'C<*>' and digits as bullets.
podify_items() operates on the document content held in the object
and modifies it in place. Nothing is returned.
Moreover, you can use a second level of indentation. So you can have
* title
* * item
* * item
or
* title
* 1 item
* 2 item
where the second mark determines whether the item is rendered with a
bullet or as a numbered item.
=back
=head1 AUTHORS
Stas Bekman E<lt>stas (at) stason.orgE<gt>
=cut
1.1 modperl-docs/lib/DocSet/Source/Text.pm
Index: Text.pm
===================================================================
package DocSet::Source::Text;
use strict;
use warnings;
use vars qw(@ISA);
require DocSet::Doc;
@ISA = qw(DocSet::Doc);
###########################
### not implemented yet ###
###########################
1;
__END__
=head1 NAME
C<DocSet::Source::Text> - A class for parsing input documents in the text format
=head1 SYNOPSIS
=head1 DESCRIPTION
=cut
1.1 modperl-docs/lib/DocSet/Template/Plugin/NavigateCache.pm
Index: NavigateCache.pm
===================================================================
package DocSet::Template::Plugin::NavigateCache;
use strict;
use warnings;
use vars qw(@ISA);
use Template::Plugin;
@ISA = qw(Template::Plugin);
use DocSet::NavigateCache ();
sub new {
my $class = shift;
my $context = shift;
DocSet::NavigateCache->new(@_);
}
1;
__END__