Posted to docs-cvs@perl.apache.org by st...@apache.org on 2002/01/05 19:51:59 UTC
cvs commit: modperl-docs/lib/DocSet/Template/Plugin NavigateCache.pm

stas        02/01/05 10:51:59

  Added:       lib      DocSet.pm
               lib/DocSet 5005compat.pm Cache.pm Config.pm Doc.pm DocSet.pm
                        NavigateCache.pm RunTime.pm Util.pm
               lib/DocSet/Doc HTML2HTML.pm POD2HTML.pm Text2HTML.pm
               lib/DocSet/DocSet HTML.pm PSPDF.pm
               lib/DocSet/Source HTML.pm POD.pm Text.pm
               lib/DocSet/Template/Plugin NavigateCache.pm
  Log:
  - add the DocSet package locally until it gets released on CPAN
  
  Revision  Changes    Path
  1.1                  modperl-docs/lib/DocSet.pm
  
  Index: DocSet.pm
  ===================================================================
  package DocSet;
  
  $VERSION = '0.08';
  
  =head1 NAME
  
  DocSet - documentation projects builder in HTML, PS and PDF formats
  
  =head1 SYNOPSIS
  
    pod2hpp [options] base_full_path relative_to_base_configuration_file_location
  
  Options:
  
    -h    this help
    -v    verbose
    -i    podify pseudo-pod items (s/^\* /=item * /)
    -s    create the split html version (not implemented)
    -t    create tar.gz (not implemented)
    -p    generate PS file
    -d    generate PDF file
    -f    force a complete rebuild
    -a    print available hypertext anchors (not implemented)
    -l    do hypertext links validation (not implemented)
    -e    slides mode (for presentations) (not implemented)
    -m    executed from Makefile (forces rebuild,
  				no PS/PDF file,
  				no tgz archive!)
  
  =head1 DESCRIPTION
  
  This package builds a docset from sources in different formats. The
  generated documents can all be interlinked and share the same look
  and feel.
  
  Currently it can handle the following input formats:
  
  * POD
  * HTML
  
  and can generate:
  
  * HTML
  * PS
  * PDF
  
  =head2 Modification control
  
  Each output mode maintains its own cache (per docset). The cache is
  used to skip source documents that haven't been modified since the
  last build, unless the build is running in force-rebuild mode.
  
  =head2 Definitions:
  
  * Chapter is a single document (file).
  
  * Link is a URL.
  
  * Docset is a collection of docsets, chapters and links.
  
  =head2 Application Specific Features
  
  =over
  
  =item 1
  
  META: not ported yet!
  
  Generate a split HTML version, creating an html file for each pod
  section, with everything interlinked. This version works best for
  search.
  
  =item 2
  
  Complete the POD on the fly from files in pseudo-POD format. This
  eases the writing of presentation slides, since one can use C<*>
  instead of long =over/=item/.../=item/=back sequences. The rest is
  done as before. To generate nice slides, take a look at the special
  version of the html2ps configuration in I<conf/html2ps-slides.conf>.
  
  =item 3
  
  META: not ported yet!
  
  If you turn the slides mode on, it automatically turns on the C<-i>
  (C<*> preprocessing) mode and inserts a page break before each =head
  tag.
  
  =back
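  
  The C<*> preprocessing (the C<-i> option) can be sketched as follows.
  This is only an illustration of the idea, not the code DocSet uses:
  the C<podify_items> helper is a hypothetical name, and the sketch
  assumes that any non-blank line that doesn't start with C<* > ends
  the list.
  
  ```perl
    use strict;
    use warnings;
  
    # Wrap runs of "* text" lines into an =over/=item/=back list.
    # Hypothetical helper, written for illustration only.
    sub podify_items {
        my @out;
        my $in_list = 0;
        for my $line (@_) {
            if ($line =~ /^\* (.*)/) {
                # open the list on the first item of a run
                push @out, '=over', '' unless $in_list;
                $in_list = 1;
                push @out, "=item * $1";
            }
            elsif ($in_list && $line =~ /\S/) {
                # a normal paragraph closes the list
                push @out, '=back', '', $line;
                $in_list = 0;
            }
            else {
                push @out, $line;
            }
        }
        # close a list that runs to the end of the input
        push @out, '', '=back' if $in_list;
        return @out;
    }
  
    my @pod = podify_items(
        '=head1 Agenda', '',
        '* first point', '',
        '* second point', '',
        'Closing paragraph.',
    );
    print "$_\n" for @pod;
  ```
  
  On this sample input the two C<*> lines end up inside a single
  =over/=back pair and the other paragraphs are left untouched.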
  
  =head2 Look-n-Feel Customization
  
  You can customise the look and feel of the output by adjusting the
  templates in the directory I<example/tmpl/custom>.
  
  You can change the look and feel of the PS (PDF) versions by modifying
  I<example/conf/html2ps.conf>.  Be careful: if the documentation that
  you want to put in one PS or PDF file is very big and you tell
  html2ps to put the TOC at the beginning, you will need lots of memory,
  because html2ps won't write a single byte to the disk before it has
  converted all the HTML markup to PS.
  
  
  =head1 CONFIGURATION
  
  All you have to prepare is a single config file that you then pass as
  an argument to C<pod2hpp>:
  
    pod2hpp [options] /abs/project/root/path /full/path/to/config/file
  
  Every directory in the source tree may have a configuration file,
  which designates a docset's root. See the I<config> files for
  examples. Usually the file in the root (I<example/src>) sets
  operational directories and other arguments, which you don't have to
  repeat in sub-docsets. Modify these files to suit your documentation
  project layout.
  
  Note that the I<example/bin/build> script automatically locates your
  project's directory, so you can move your project around the
  filesystem without changing anything.
  
  I<example/README> explains the layout of the directories.
  
  The C<DocSet::Config> manpage explains the layout of the configuration
  file.
  
  =head1 PREREQUISITES
  
  None of these is required if all you want is to generate the html
  version.
  
  =over 4
  
  =item * ps2pdf
  
  Needed to generate the PDF version
  
  =item * Storable
  
  Perl module available from CPAN (http://cpan.org/)
  
  Allows source modification control, so if you modify only one file you
  will not have to rebuild everything to get the updated HTML/PS/PDF
  files.
  
  =back
  
  =head1 SUPPORT
  
  Notice that this tool relies on two tools (ps2pdf and html2ps) which I
  don't support. So if you have any problem, first make sure that it's
  not a problem with these tools.
  
  Note that while C<html2ps> is included in this distribution, it's
  written in old-style Perl, so if you have patches send them along,
  but I won't try to fix/modify this code otherwise. I didn't write this
  utility.
  
  =head1 BUGS
  
  Huh? Probably many...
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =head1 SEE ALSO
  
  perl(1), Pod::Html(3), html2ps(1), ps2pdf(1), Storable(3)
  
  =head1 COPYRIGHT
  
  This program is distributed under the Artistic License, like Perl
  itself.
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/5005compat.pm
  
  Index: 5005compat.pm
  ===================================================================
  package DocSet::5005compat;
  
  use strict;
  use Symbol ();
  use File::Basename;
  use File::Path;
  
  my %compat_files = (
       'lib/warnings.pm' => \&warnings_pm,
  );
  
  sub import {
      if ($] >= 5.006) {
          #make sure old compat stubs dont wipe out installed versions
          unlink for keys %compat_files;
          return;
      }
  
      eval { require File::Spec::Functions; } or
        die "this is only Perl $], you need to install File-Spec from CPAN";
  
      my $min_version = 0.82;
      unless ($File::Spec::VERSION >= $min_version) {
          die "you need to install File-Spec-$min_version or higher from CPAN";
      }
  
      while (my($file, $sub) = each %compat_files) {
          $sub->($file);
      }
  }
  
  sub open_file {
      my $file = shift;
  
      unless (-d 'lib') {
          $file = "Apache-Test/$file";
      }
  
      my $dir = dirname $file;
  
      unless (-d $dir) {
          mkpath([$dir], 0, 0755);
      }
  
      my $fh = Symbol::gensym();
      print "creating $file\n";
      open $fh, ">$file" or die "open $file: $!";
  
      return $fh;
  }
  
  sub warnings_pm {
      return if eval { require warnings };
  
      my $fh = open_file(shift);
  
      print $fh <<'EOF';
  package warnings;
  
  sub import {}
  
  1;
  EOF
  
      close $fh;
  }
  
  1;
  
  
  
  1.1                  modperl-docs/lib/DocSet/Cache.pm
  
  Index: Cache.pm
  ===================================================================
  package DocSet::Cache;
  
  use strict;
  use warnings;
  
  use DocSet::RunTime;
  use DocSet::Util;
  use Storable;
  use Carp;
  
  my %attrs = map {$_ => 1} qw(toc meta order);
  
  sub new {
      my($class, $path) = @_;
  
      die "no cache path specified" unless defined $path;
  
      my $self = bless {
                        path   => $path,
                        dirty  => 0,
                       }, ref($class)||$class;
      $self->read();
  
      return $self;
  }
  
  sub path {
      my($self) = @_;
      $self->{path};
  }
  
  sub read {
      my($self) = @_;
  
      if (-w $self->{path} && DocSet::RunTime::has_storable_module()) {
          note "+++ Reading cache from $self->{path}";
          $self->{cache} = Storable::retrieve($self->{path});
      } else {
          note "+++ Initializing a new cache for $self->{path}";
          $self->{cache} = {};
      }
  }
  
  sub write {
      my($self) = @_;
  
      if (DocSet::RunTime::has_storable_module()) {
          note "+++ Storing the docset's cache to $self->{path}";
          Storable::store($self->{cache}, $self->{path});
          $self->{dirty} = 0; # mark as synced (clean)
      }
  }
  
  # set a cache entry (overrides a prev entry if any exists)
  sub set {
      my($self, $id, $attr, $data, $hidden) = @_;
  
      croak "must specify a unique id"  unless defined $id;
      croak "must specify an attribute" unless defined $attr;
      croak "unknown attribute $attr"   unless exists $attrs{$attr};
  
      # remember the addition order (unless it's an update)
      unless (exists $self->{cache}{$id}) {
          push @{ $self->{cache}{_ordered_ids} }, $id;
          $self->{cache}{$id}{seq} = $#{ $self->{cache}{_ordered_ids} };
      }
      $self->{cache}{$id}{$attr} = $data;
      $self->{cache}{$id}{_hidden} = $hidden;
      $self->{dirty} = 1;
  }
  
  # get a cache entry
  sub get {
      my($self, $id, $attr) = @_;
  
      croak "must specify a unique id"  unless defined $id;
      croak "must specify an attribute" unless defined $attr;
      croak "unknown attribute $attr"   unless exists $attrs{$attr};
  
      if (exists $self->{cache}{$id} && exists $self->{cache}{$id}{$attr}) {
          return $self->{cache}{$id}{$attr};
      }
  }
  
  
  
  
  
  # check whether a cached entry exists
  sub is_cached {
      my($self, $id, $attr) = @_;
  
      croak "must specify a unique id"  unless defined $id;
      croak "must specify an attribute" unless defined $attr;
      croak "unknown attribute $attr"   unless exists $attrs{$attr};
  
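      # test the id first, so the check doesn't autovivify an entry for $id
      exists $self->{cache}{$id} && exists $self->{cache}{$id}{$attr};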
  }
  
  # invalidate cache (i.e. when a complete rebuild is forced)
  sub invalidate {
      my($self) = @_;
  
      $self->{cache} = {};
  }
  
  # delete an entry in the cache
  sub unset {
      my($self, $id, $attr) = @_;
  
      croak "must specify a unique id"  unless defined $id;
      croak "must specify an attribute" unless defined $attr;
      croak "unknown attribute $attr"   unless exists $attrs{$attr};
  
      if (exists $self->{cache}{$id} && exists $self->{cache}{$id}{$attr}) {
          delete $self->{cache}{$id}{$attr};
          $self->{dirty} = 1;
      }
  
  }
  
  sub is_hidden {
      my($self, $id) = @_;
      #print "$id is hidden\n" if $self->{cache}{$id}{_hidden};
      # don't autovivify an entry for an unknown $id
      return exists $self->{cache}{$id} ? $self->{cache}{$id}{_hidden} : undef;
  }
  
  # return the sequence number of $id in the list of linked objects (0..N)
  sub id2seq {
      my($self, $id) = @_;
      croak "must specify a unique id"  unless defined $id;
      if (exists $self->{cache}{$id}) {
          return $self->{cache}{$id}{seq};
      } 
      else {
          # this shouldn't happen!
          die "Cannot find $id in $self->{path} cache",
              dumper $self;
      }
  
  }
  
  # return the $id at the place $seq in the list of linked objects (0..N)
  sub seq2id {
      my($self, $seq) = @_;
  
      croak "must specify a seq number"  unless defined $seq;
      if ($self->{cache}{_ordered_ids}) {
          return $self->{cache}{_ordered_ids}->[$seq];
      }
      else {
          die "Cannot find $seq in $self->{path} cache",
              dumper $self;
      }
  }
  
  
  sub ordered_ids {
      my($self) = @_;
      return @{ $self->{cache}{_ordered_ids}||[] };
  }
  
  sub total_ids {
      my($self) = @_;
      return scalar @{ $self->{cache}{_ordered_ids}||[] };
  }
  
  # remember the meta data of the index node
  sub index_node {
      my($self) = shift;
  
      if (@_) {
          # set
          my($id, $title, $abstract) = @_;
          croak "must specify the index_node's id" unless defined $id;
          croak "must specify the index_node's title" unless defined $title;
          $self->{cache}{_index}{id}       = $id;
          $self->{cache}{_index}{title}    = $title;
          $self->{cache}{_index}{abstract} = $abstract;
      }
      else {
          # get
          return exists $self->{cache}{_index}
              ? $self->{cache}{_index}
              : undef;
      }
  
  }
  
  # set/get the path to the parent cache
  sub parent_node {
      my($self) = shift;
  
      if (@_) {
          # set
          my($cache_path, $id, $rel_path) = @_;
          croak "must specify a path to the parent cache"  unless defined $cache_path;
          croak "must specify a relative to parent path"  unless defined $rel_path;
          croak "must specify a parent id"  unless defined $id;
          $self->{cache}{_parent}{cache_path} = $cache_path;
          $self->{cache}{_parent}{id}         = $id;
          $self->{cache}{_parent}{rel_path}   = $rel_path;
      }
      else {
          # get
          return exists $self->{cache}{_parent}
              ? ($self->{cache}{_parent}{cache_path},
                 $self->{cache}{_parent}{id},
                 $self->{cache}{_parent}{rel_path})
              : (undef, undef, undef);
      }
  }
  
  
  # set/get the path to the node_groups cache
  sub node_groups {
      my($self) = shift;
  
      if (@_) { # set
          $self->{cache}{_node_groups} = shift;
      }
      else { # get
          return $self->{cache}{_node_groups};
      }
  }
  
  sub is_dirty { shift->{dirty};}
  
  sub DESTROY {
      my($self) = @_;
  
      # flush the cache if destroyed before having a chance to sync to the disk
      $self->write if $self->is_dirty;
  }
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Cache> - Maintain a Non-Volatile Cache of DocSet's Data
  
  =head1 SYNOPSIS
  
    use DocSet::Cache ();
  
    my $cache = DocSet::Cache->new($cache_path);
    $cache->read;
    $cache->write;
  
    $cache->set($id, $attr, $data);
    my $data = $cache->get($id, $attr);
    print "$id is cached" if $cache->is_cached($id, $attr);
    $cache->invalidate();
    $cache->unset($id, $attr);
  
    my $seq = $cache->id2seq($id);
    my $id = $cache->seq2id($seq);
    my @ids = $cache->ordered_ids;
    my $total_ids = $cache->total_ids;
  
    $cache->index_node($id, $title, $abstract);
    my $index_node = $cache->index_node(); # hash ref or undef
  
    $cache->parent_node($cache_path, $id, $rel_path);
    my($cache_path, $id, $rel_path) = $cache->parent_node();
  
  
  =head1 DESCRIPTION
  
  C<DocSet::Cache> maintains a non-volatile cache of docset's data. 
  
  The cache is initialized from the frozen file at the provided path.
  When the file is empty or doesn't exist, a new cache is initialized.
  When the cache is modified it should be saved, but if for some reason
  it doesn't get saved, the C<DESTROY> method will check whether the
  cache has been synced to the disk and, if not, will perform the sync
  itself.
  
  Each docset's node can create an entry in the cache, and store its
  data in it. The creator has to ensure that it supplies a unique id for
  each node that is added.  The cache's internal representation is a
  hash, with internal data keys starting with _ (underscore); therefore
  the only restriction on a node's id value is that it must not start
  with an underscore.
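  
  The layout described above can be pictured with a plain hash stored
  via C<Storable>. This is a minimal sketch, not the module's own code:
  it assumes a temporary file in place of a real cache path, and the
  node ids and their data are made up.
  
  ```perl
    use strict;
    use warnings;
    use Storable ();
    use File::Temp ();
  
    # made-up node ids; keys starting with an underscore are internal
    my %cache = (
        _ordered_ids => [qw(intro install)],
        intro        => { seq => 0, toc => 'Introduction' },
        install      => { seq => 1, toc => 'Installing'   },
    );
  
    # freeze the hash to a temporary file and read it back
    my($fh, $path) = File::Temp::tempfile(UNLINK => 1);
    close $fh;
    Storable::store(\%cache, $path);
    my $copy = Storable::retrieve($path);
  
    # node entries are the keys that don't start with an underscore
    my @node_ids = sort grep { !/^_/ } keys %$copy;
    print "nodes: @node_ids\n";
    print "second in order: $copy->{_ordered_ids}[1]\n";
  ```
  
  Because internal keys and node ids share one hash, a node id such as
  C<_index> would collide with the cache's own bookkeeping.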
  
  =head2 METHODS
  
  META: to be written (see SYNOPSIS meanwhile)
  
  =over
  
  =item * 
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Config.pm
  
  Index: Config.pm
  ===================================================================
  package DocSet::Config;
  
  use strict;
  use warnings;
  
  use Carp;
  
  use File::Find;
  use File::Basename ();
  use File::Spec::Functions;
  
  use DocSet::Util;
  
  use constant TRACE => 1;
  
  # uri extension to MIME type mapping
  my %ext2mime = (
      map({$_ => 'text/html' } qw(htm html)),
      map({$_ => 'text/plain'} qw(txt text)),
      map({$_ => 'text/pod'  } qw(pod pm)),
  );
  
  my %conv_class= (
      'text/pod'  => {
                      'text/html' => 'DocSet::Doc::POD2HTML',
                      'text/ps'   => 'DocSet::Doc::POD2PS',
                     },
      'text/html' => {
                      'text/html' => 'DocSet::Doc::HTML2HTML',
                      'text/ps'   => 'DocSet::Doc::HTML2PS',
                     },
      'text/plain' => {
                      'text/html' => 'DocSet::Doc::Text2HTML',
                      'text/pdf'  => 'DocSet::Doc::Text2PDF',
                     },
  );
  
  sub ext2mime {
      my($self, $ext) = @_;
      exists $ext2mime{$ext} ? $ext2mime{$ext} : undef;
  }
  
  
  sub conv_class {
      my($self, $src_mime, $dst_mime) = @_;
      # convert
      die "src_mime is not defined" unless defined $src_mime;
      die "dst_mime is not defined" unless defined $dst_mime;
      my $conv_class = $conv_class{$src_mime}{$dst_mime}
          or die "unknown input/output MIME mapping: $src_mime => $dst_mime";
      return $conv_class;
  }
  
  
  my %attr = map {$_ => 1} qw(chapters docsets links);
  sub read_config {
      my($self, $config_file) = @_;
      die "Configuration file is not specified" unless $config_file;
  
      my $package = path2package($config_file);
      $self->{package} = $package;
  
      my $content;
      read_file($config_file, \$content);
  
      eval join '',
          "package $package;",
          $content, ";1;";
      die "failed to eval config file at $config_file:\n$@" if $@;
  
      # parse the attributes of the docset's config file
      no strict 'refs';
      use vars qw(@c);
      *c = \@{"$package\::c"};
      my @groups = ();
      my $current_group = '';
      my $group_size;
      for ( my $i=0; $i < @c; $i +=2 ) {
          my($key, $val) = @c[$i, $i+1];
          if ($key eq 'group') {
              # close the previous group by storing the key of its last node
              if ($current_group) {
                  push @{ $self->{node_groups} }, $current_group, $group_size;
              }
              # start the new group
              $current_group = $val;
              $group_size = 0;
          }
          elsif ($key eq 'hidden') {
              die "hidden's value must be an ARRAY reference" 
                  unless ref $val eq 'ARRAY';
              my @h = @$val;
              for ( my $j=0; $j < @h; $j +=2 ) {
                  my($key1, $val1) = @h[$j, $j+1];
                  die "hidden's can include only 'chapters' and 'docsets', " .
                      "$key1 is invalid" unless $key1 =~ /^(docsets|chapters)$/;
                  $self->add_node($key1, $val1, 1);
              }
          }
          elsif (exists $attr{$key}) {
              $group_size += $self->add_node($key, $val, 0);
          }
          else {
              $self->{$key} = $val;
              #print "$key = $val\n";
          }
      }
      if ($current_group) {
          push @{ $self->{node_groups} }, $current_group, $group_size;
      }
  
      # merge_config will adjust this value, for nested docsets
      # so this value is relevant only for the real top parent node
      $self->{dir}{abs_doc_root} = '.';
  
      $self->{dir}{src_root} = File::Basename::dirname $config_file;
      # dumper $self;
  
  }
  
  
  #
  # 1. put chapters together, docsets together, links together
  # 2. store the normal nodes in the order they were listed in 'ordered_nodes'
  # 3. store the hidden nodes in the order they were listed in 'hidden_nodes'
  #
  # return the number of added items
  sub add_node {
      my($self, $key, $value, $hidden) = @_;
  
      my @values = ref $value eq 'ARRAY' ? @$value : $value;
  
      if ($hidden) {
          push @{ $self->{hidden_nodes} }, $key, $_ for @values;
      }
      else {
          push @{ $self->{ordered_nodes} }, $key, $_ for @values;
      }
  
      return scalar @values;
  }
  
  # child config inherits from the parent config
  # and adjusts its paths
  sub merge_config {
      my($self, $src_rel_dir) = @_;
  
      my $parent_o = $self->{parent_o};
  
      my $files = $self->{file} || {};
      while ( my($k, $v) = each %{ $parent_o->{file}||{} }) {
          $self->{file}{$k} = $v unless $files->{$k};
      }
  
      my $dirs = $self->{dir} || {};
      while ( my($k, $v) = each %{ $parent_o->{dir}||{} }) {
          $self->{dir}{$k} = $v unless $dirs->{$k};
      }
  
      # a chapter object won't set this one
      if ($src_rel_dir) {
          $self->{dir}{src_rel_dir} = $src_rel_dir;
  
          # append the relative to parent_o's src dir segments
          # META: hardcoded paths!
          for my $k ( qw(dst_html dst_ps dst_split_html) ) {
              $self->{dir}{$k} .= "/$src_rel_dir";
          }
  
          # set path to the abs_doc_root
          # META: hardcoded paths! (but in this case it doesn't matter,
          # as long as it's set in the config file)
          $self->{dir}{abs_doc_root} = 
              join '/', ("..") x ($self->{dir}{dst_html} =~ tr|/|/|);
  
      }
  
  }
  
  # return a list of files to be copied
  #
  # due to a potentially huge list of files to be copied (e.g. the
  # splash library) currently it's assumed that this function is called
  # only once. Therefore no caching is done to save memory.
  #
  # The following conventions are used for $self->{copy_glob}
  # 1. Explicitly specified files and directories are copied as is
  #    (directories aren't descended into)
  # 2. Shell metachars (*?[]) can be used. e.g. if you want to grab
  #    directory foo and its contents, make sure to specify foo/*.
  sub files_to_copy {
      my($self) = @_;
  
      my $copy_skip_patterns = $self->{copy_skip} || [];
      # build one sub that will match many regex at once.
      my $rsub_filter_out = build_matchmany_sub($copy_skip_patterns);
  
      my $src_root  = $self->get_dir('src_root');
  
      # expand $self->{copy_glob}, applying the filter to skip unwanted
      # files
      my @files = 
          grep !$rsub_filter_out->($_),              # skip unwanted
  #        grep s|^(?:\./)?||,                        # strip the leading ./
          grep !-d $_,                               # skip empty dirs
          map { -d $_ ? @{ expand_dir($_) } : $_ }   # expand dirs
          map { $_ =~ /[\*\?\[\]]/ ? glob($_) : $_ } # expand globs
          map { "$src_root/$_" }                     # prefix with src_root
              @{ $self->{copy_glob}||[] };
  
      return \@files;
  }
  
  sub expand_dir {
      my @files = ();
      if ($] >= 5.006) {
         find(sub {push @files, $File::Find::name}, $_[0]);
      }
      else {
          # perl 5.005.03 on FreeBSD doesn't set the dir it chdir'ed to
          # need to move this to compat level?
          require Cwd;
          my $cwd;
          find(sub {$cwd = Cwd::cwd(); push @files, catfile $cwd, $_}, $_[0]);
      }
  
      return \@files;
  }
  
  sub set {
      my($self, %args) = @_;
      @{$self}{keys %args} = values %args;
  }
  
  sub set_dir {
      my($self, %args) = @_;
      @{ $self->{dir} }{keys %args} = values %args;
  }
  
  sub get {
      my $self = shift;
      return () unless @_;
      my @values = map {exists $self->{$_} ? $self->{$_} : ''} @_;
      return wantarray ? @values : $values[0];
  }
  
  
  sub get_file {
      my $self = shift;
      return () unless @_;
      my @values = map {exists $self->{file}{$_} ? $self->{file}{$_} : ''} @_;
      return wantarray ? @values : $values[0];
  }
  
  sub get_dir {
      my $self = shift;
      return () unless @_;
      my @values = map {exists $self->{dir}{$_} ? $self->{dir}{$_} : ''} @_;
      return wantarray ? @values : $values[0];
  }
  
  sub nodes_by_type {
      my $self = shift;
      return $self->{ordered_nodes} || [];
  }
  
  sub hidden_nodes_by_type {
      my $self = shift;
      return $self->{hidden_nodes} || [];
  }
  
  sub node_groups {
      my $self = shift;
      return $self->{node_groups} || [];
  }
  
  
  sub docsets {
      my $self = shift;
      return exists $self->{docsets} ? @{ $self->{docsets} } : ();
  }
  
  sub links {
      my $self = shift;
      return exists $self->{links} ? @{ $self->{links} } : ();
  }
  
  sub src_chapters {
      my $self = shift;
      return exists $self->{chapters} ? @{ $self->{chapters} } : ();
  }
  
  # chapter paths as they go into production
  # $self->trg_chapters(@paths) : push a chapter(s) 
  # $self->trg_chapters         : retrieve the list
  sub trg_chapters {
      my $self = shift;
      if (@_) {
          push @{ $self->{chapters_prod} }, @_;
      } else {
          return exists $self->{chapters_prod} ? @{ $self->{chapters_prod} } : ();
      }
  
  }
  
  # set/get cache
  sub cache { 
      my $self = shift;
  
      if (@_) {
          $self->{cache} = shift;
      }
      $self->{cache};
  }
  
  sub path2package {
      my $path = shift;
      $path =~ s|[\W\.]|_|g;
      return "MyDocSet::X$path";
  }
  
  
  sub object_store {
      my($self, $object) = @_;
      croak "no object passed" unless defined $object and ref $object;
      push @{ $self->{_objects_store} }, $object;
  }
  
  sub stored_objects {
      my($self) = @_;
      return @{ $self->{_objects_store}||[] };
  }
  
  
  
  #sub chapter_data {
  #   my $self = shift;
  #   my $id = shift;
  
  #   if (@_) {
  #       $self->{chapter_data}{$id} = shift;
  #   }
  #   else {
  #       $self->{chapter_data}{$id};
  #   }
  #}
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Config> - A superclass that handles object's configuration and data
  
  =head1 SYNOPSIS
  
    use DocSet::Config ();
  
    my $mime = $self->ext2mime($ext);
    my $class = $self->conv_class($src_mime, $dst_mime);
  
    $self->read_config($config_file);
    $self->merge_config($src_rel_dir);
  
    my $files = $self->files_to_copy();
    my $dir_files = DocSet::Config::expand_dir($dir);
  
    $self->set($key => $val);
    $self->set_dir($dir_name => $val);
    $val = $self->get($key);
    $self->get_file($key);
    $self->get_dir($dir_name);
  
    my @docsets = $self->docsets();
    my @links = $self->links();
    my @chapters = $self->src_chapters();
    my @chapters = $self->trg_chapters();
  
    $self->cache($cache); 
    my $cache = $self->cache(); 
  
    $package = DocSet::Config::path2package($path);
    $self->object_store($object);
    my @objects = $self->stored_objects();
  
  =head1 DESCRIPTION
  
  This class forms the base of the DocSet class and provides
  configuration and internal data storage/retrieval methods.
  
  At the end of this document the generic configuration file is
  explained.
  
  =head2 METHODS
  
  META: to be completed (see SYNOPSIS meanwhile)
  
  =over
  
  =item * ext2mime
  
  =item * conv_class
  
  =item * read_config
  
  =item * merge_config
  
  =item * files_to_copy
  
  =item * expand_dir
  
  =item * set
  
  =item * set_dir
  
  =item * get
  
  =item * get_file
  
  =item * get_dir
  
  =item * docsets
  
  =item * links
  
  =item * src_chapters
  
  =item * trg_chapters
  
  =item * cache 
  
  =item * path2package
  
  =item * object_store
  
  =item * stored_objects
  
  =back
  
  =head1 CONFIGURATION FILE
  
  Each DocSet has its own configuration file.
  
  =head2 Structure
  
  Currently the configuration file is a simple perl script that is
  expected to declare an array C<@c> with all the docset properties in
  it. Later on more configuration formats will be supported.
  
  We use the C<@c> array because some of the configuration attributes
  may be repeated, so the hash datatype is not suitable here. Otherwise
  this array looks exactly like a hash:
  
    key1 => val1,
    key2 => val2,
    ...
    keyN => valN
  
  Of course you can declare any other perl variables and do whatever
  you want, but after the config file is run, it should have C<@c> set.
  
  Don't forget to end the file with C<1;>.
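  
  To illustrate why an array is used, here is a minimal sketch that
  walks a made-up C<@c> pairwise, the same way C<read_config> steps
  through it. The keys and file names are invented for the example.
  
  ```perl
    use strict;
    use warnings;
  
    # 'chapters' appears twice, which a hash could not represent
    my @c = (
        id       => 'docs',
        title    => 'Documentation',
        chapters => 'intro.pod',
        docsets  => ['guide'],
        chapters => ['faq.pod', 'help.pod'],
    );
  
    my @seen;
    for (my $i = 0; $i < @c; $i += 2) {
        my($key, $val) = @c[$i, $i+1];
        # a value is either a single item or a reference to an array
        my @values = ref $val eq 'ARRAY' ? @$val : ($val);
        push @seen, "$key=$_" for @values;
    }
    print "$_\n" for @seen;
  ```
  
  Both C<chapters> entries survive, and they keep their position
  relative to the C<docsets> entry between them.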
  
  =head2 Declare once attributes
  
  The following attributes must be declared at least in the top-level
  I<config.cfg> file:
  
  =over
  
  =item * dir
  
       dir => {
               # the resulting html files directory
               dst_html   => "dst_html",
  
               # the resulting ps and pdf files directory (and special
               # set of html files used for creating the ps and pdf
               # versions.)
               dst_ps     => "dst_ps",
  
               # the resulting split version html files directory
               dst_split_html => "dst_split_html",
  
               # location of the templates relative to the root dir
               # (searched left to right)
               tmpl       => [qw(tmpl/custom tmpl/std tmpl)],
              },
  
  =item * file
  
       file => {
                # the html2ps configuration file
                html2ps_conf  => "conf/html2ps.conf",
               },
  
  =back
  
  Generally you should specify these only in the top-level config file,
  and repeat them in a sub-level config file only if you want to
  override things for that sub-docset and its successors.
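  
  The inheritance can be sketched as follows: a sub-docset keeps the
  values it sets itself and takes everything else from its parent.
  This is a simplified illustration with made-up directory names; the
  real C<merge_config> additionally adjusts the destination paths for
  nested docsets.
  
  ```perl
    use strict;
    use warnings;
  
    my %parent_dir = (
        dst_html => 'dst_html',
        dst_ps   => 'dst_ps',
    );
  
    # the sub-docset overrides only dst_ps (a made-up value)
    my %child_dir = (dst_ps => 'my_ps');
  
    # inherit whatever the child didn't set itself
    while (my($k, $v) = each %parent_dir) {
        $child_dir{$k} = $v unless $child_dir{$k};
    }
  
    print "$_ => $child_dir{$_}\n" for sort keys %child_dir;
  ```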
  
  =head2 DocSet must attributes
  
  The following attributes must be declared in every docset configuration:
  
  =over
  
  =item * id
  
  a unique id of the docset. The uniqueness should be preserved across
  any parallel docsets.
  
  =item * title
  
  the title of the docset
  
  =item * abstract
  
  a short abstract
  
  =back
  
  
  =head2 DocSet Components
  
  Any DocSet component can be repeated as many times as wanted. This
  allows you to mix various types of nodes and still have them ordered
  the way you want. You can have a chapter followed by a docset,
  followed by a few more chapters and ending with a link.
  
  The value of each component can be either a single item or a reference
  to an array of items.
  
  =over
  
  =item * docsets
  
  The docset can recursively include other docsets. Simply list the
  directories in which the other docsets (i.e. their I<config.cfg>
  files) can be found.
  
  =item * chapters
  
  Each chapter can be specified as a path to its source document.
  
  =item * links
  
  The docset supports hyperlinks. Each link must be declared as a hash
  reference with keys: I<id>, I<link>, I<title> and I<abstract>.
  
  If you want to link to an external resource, start the link with a
  URI scheme (e.g. C<http://>). But this attribute also works for local
  links, for example, if the same generated page should be linked from
  more than one place, or if there is some non-parsed object that needs
  to be linked to after it gets copied via the I<copy_glob> attribute in
  the same or another docset.
  
  =back
  
  This is an example:
  
       docsets =>  ['docs', 'cool_docset'],
       chapters => [
           qw(
              about/about.html
             )
       ],
       docsets => [
           qw(
              download
             )
       ],
       chapters => 'foo/bar/zed.pod',
       links => [
           {
            id       => 'asf',
            link     => 'http://apache.org/foundation/projects.html',
            title    => 'The ASF Projects',
            abstract => "There are many other ASF Projects",
           },
       ],
  
  Since normally books consist of parts which group chapters by a common
  theme, we support this feature as well. So the index can now be
  generated as:
  
    part I: Installation
    * Starting
    * Installing
  
    part II: Troubleshooting
    * Debugging
    * Errors
    * Help Links
    * Offline Help
  
  This happens only if this feature is used; otherwise a plain flat toc
  is used. To enable this feature, simply splice in declarations of new
  groups between the nodes, using the I<group> attribute:
  
    group => 'Installation',
    chapters => [qw(start.pod install.pod)],
  
    group => 'Troubleshooting',
    chapters => [qw(debug.pod errors.pod)],
    links    => [{put link data here}],
    chapters => ['offline_help.pod'],
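  
  Because attributes such as I<chapters> may appear several times, the
  contents cannot be stored in a plain hash; they have to be walked as
  an ordered list of attribute/value pairs. The following is only a
  minimal sketch of that idea, not DocSet's actual implementation: the
  C<@contents> and C<@groups> variables and the layout of each group
  record are assumptions made for this illustration.
  
  ```perl
  use strict;
  use warnings;
  
  # hypothetical flat list of attribute/value pairs, in config order
  my @contents = (
      group    => 'Installation',
      chapters => [qw(start.pod install.pod)],
      group    => 'Troubleshooting',
      chapters => [qw(debug.pod errors.pod)],
      chapters => ['offline_help.pod'],
  );
  
  my @groups;             # each entry: { title => ..., nodes => [...] }
  my @pairs = @contents;  # work on a copy, since splice is destructive
  while (@pairs) {
      my($type, $data) = splice @pairs, 0, 2;
      if ($type eq 'group') {
          # a group declaration opens a new bucket
          push @groups, { title => $data, nodes => [] };
          next;
      }
      # nodes listed before any group go into an untitled bucket
      push @groups, { title => '', nodes => [] } unless @groups;
      push @{ $groups[-1]{nodes} }, ref $data ? @$data : $data;
  }
  
  for my $g (@groups) {
      print "part: $g->{title}\n";
      print "  * $_\n" for @{ $g->{nodes} };
  }
  ```
  
  With the list above, the chapters that follow the second I<group>
  declaration, including the repeated I<chapters> attribute, all end up
  under I<Troubleshooting>.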
  
  
  =head2 Hidden Objects
  
  
  I<docsets> and I<chapters> can be marked as hidden. This means that
  they will be processed normally but won't be linked from anywhere.
  
  Since hidden objects cannot belong to any group, it doesn't matter
  where they are listed in the config file. Simply put one or more
  I<docsets> and I<chapters> into the special attribute I<hidden>,
  which can be repeated many times, just like most of the other
  attributes.
  
  For example:
  
    ...
    chapters => [qw(start.pod install.pod)],
    hidden => {
        chapters => ['offline_help.pod'],
        docsets  => ['hidden_docset'],
    },
    ...
  
  The cool thing is that the hidden I<docsets> and I<chapters> will see
  all the unhidden objects, so those who know the "secret" URL will be
  able to navigate back to the non-hidden objects transparently.
  
  This feature is useful, for example, for creating pages that users
  don't normally access. If you want to create a page used by Apache's
  I<ErrorDocument> handler, mark it as hidden, because it shouldn't be
  linked from anywhere. Once a user hits it (because a non-existing
  URL has been entered), the user gets a complete page with all the
  proper navigation widgets (I<menu>, etc.) in it.
  
  =head2 Copy unmodified
  
  The generated UI usually includes images and CSS files, and some
  other files must be copied without any modification, such as files
  containing pure code, archives, etc. Two attributes handle this:
  
  =over
  
  =item * copy_glob
  
  Accepts a reference to an array of files and directories to copy. Note
  that you must use shell wildcards if you want deep directory copies;
  wildcards also work for patterns like C<*.html>. If you simply
  specify a directory name, it will be copied without any of its
  contents (this is a feature!). For example:
  
       # non-pod/html files or dirs to be copied unmodified
       copy_glob => [
           qw(
              style.css
              images/*
             )
       ],
  
  will copy I<style.css> and all the files under the I<images/>
  directory.
  
  =item * copy_skip
  
  While I<copy_glob> allows specifying complete directories with
  potentially many nested sub-directories to be copied, this becomes
  inconvenient if we want to copy all but a few files in these
  directories. This is where the I<copy_skip> rule helps. It accepts a
  reference to an array of regular expressions that are applied to
  each candidate selected by the I<copy_glob> attribute. If any of the
  regular expressions matches, the file won't be copied.
  
  One of the useful examples would be:
  
       copy_skip => [
           '(?:^|\/)CVS(?:\/|$)', # skip cvs control files
           '#|~',                 # skip emacs backup files
       ],
  
  META: does copy_skip apply to all sub-docsets, if sub-docsets specify
  their own copy_glob?
  
  =back
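  
  The interplay of the two attributes can be pictured as a simple
  filter: I<copy_glob> yields the candidates and I<copy_skip> removes
  some of them. The sketch below only illustrates that idea and is not
  taken from DocSet itself; the candidate paths and the variable names
  are made up, and the candidates are hard-coded instead of being
  expanded from the file system.
  
  ```perl
  use strict;
  use warnings;
  
  # hypothetical candidates, as if already expanded from copy_glob
  my @candidates = qw(
      style.css
      images/logo.png
      images/logo.png~
      images/CVS/Entries
      CVS/Root
  );
  
  # the copy_skip rules from the example above
  my @copy_skip = (
      '(?:^|\/)CVS(?:\/|$)', # skip cvs control files
      '#|~',                 # skip emacs backup files
  );
  
  # compile each rule once
  my @skip_re = map { qr/$_/ } @copy_skip;
  
  # keep only the candidates that match none of the rules
  my @to_copy = grep {
      my $path = $_;
      !grep { $path =~ $_ } @skip_re;
  } @candidates;
  
  print "$_\n" for @to_copy;
  ```
  
  Only I<style.css> and I<images/logo.png> survive the filter here; the
  CVS control files and the backup file are skipped.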
  
  
  =head2 Extra Features
  
  If you want the index file to include special top and bottom
  sections in addition to the linked list of the docset contents, you
  can do:
  
       body => {
           top => 'index_top.html',
           bot => 'index_bot.html',
       },
  
  Both the I<top> and I<bot> sub-attributes are optional. If these
  source docs are, for example, in HTML, they have to be written in
  proper HTML so that the parser is able to extract the body.
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  
  =cut
  
  
  
  
  1.1                  modperl-docs/lib/DocSet/Doc.pm
  
  Index: Doc.pm
  ===================================================================
  package DocSet::Doc;
  
  use strict;
  use warnings;
  use DocSet::Util;
  use DocSet::NavigateCache ();
  use URI;
  
  sub new {
      my $class = shift;
      my $self = bless {}, ref($class)||$class;
      $self->init(@_);
      return $self;
  }
  
  sub init {
      my($self, %args) = @_;
      while (my($k, $v) = each %args) {
          $self->{$k} = $v;
      }
  }
  
  sub scan {
      my($self) = @_;
  
      note "Scanning $self->{src_uri}";
      $self->src_read();
  
      $self->retrieve_meta_data();
  }
  
  sub render {
      my($self, $cache) = @_;
  
      # if the object wasn't stored rescan
      #$self->scan() unless $self->meta;
  
      my $src_uri       = $self->{src_uri};
      my $dst_path      = $self->{dst_path};
  
      my $rel_doc_root  = $self->{rel_doc_root};
      my $abs_doc_root  = $self->{abs_doc_root};
      $abs_doc_root .= "/$rel_doc_root" if defined $rel_doc_root;
  
      $self->{dir} = {
          abs_doc_root => $abs_doc_root,
          rel_doc_root => $rel_doc_root,
      };
  
      $self->{nav} = DocSet::NavigateCache->new($cache->path, $src_uri);
  
      note "Rendering $dst_path";
      $self->convert();
      write_file($dst_path, $self->{output});
  }
  
  # read the source and remember the mod time
  # sets $self->{content}
  #      $self->{timestamp}
  sub src_read {
      my($self) = @_;
  
      # META: at this moment everything is a file path
      my $src_uri = "file://" . $self->{src_path};
      my $u = URI->new($src_uri);
  
      my $scheme = $u->scheme;
  
      if ($scheme eq 'file') {
          my $path = $u->path;
  
          my $content = '';
          read_file($path, \$content);
          $self->{content} = \$content;
  
          # file change timestamp
          my($mon, $day, $year) = (localtime ( (stat($path))[9] ) )[4,3,5];
          $self->{timestamp} = sprintf "%02d/%02d/%04d", ++$mon,$day,1900+$year;
  
      }
      else {
          die "$scheme is not implemented yet";
      }
  
      if (my $sub = $self->can('src_filter')) {
          $self->$sub();
      }
  
  
  }
  
  sub meta {
      my $self = shift;
  
      if (@_) {
          $self->{meta} = shift;
      }
      else {
          $self->{meta};
      }
  }
  
  sub toc {
      my $self = shift;
  
      if (@_) {
          $self->{toc} = shift;
      }
      else {
          $self->{toc};
      }
  }
  
  
  # abstract methods
  #sub src_filter {}
  
  
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Doc> - A Base Document Class
  
  =head1 SYNOPSIS
  
     use DocSet::Doc::HTML ();
     my $doc = DocSet::Doc::HTML->new(%args);
     $doc->scan();
     my $meta = $doc->meta();
     my $toc  = $doc->toc();
     # $cache is the DocSet::Cache object of the enclosing docset
     $doc->render($cache);
  
     # internal methods
     $doc->src_read();
     $doc->src_filter();
  
  =head1 DESCRIPTION
  
  This super class implements the core methods for scanning a single
  document of a given format and rendering it into another format. It
  provides sub-classes with hooks that can change the default
  behavior. Note that this class cannot be used as is: you have to
  subclass it and implement the required methods listed later.
  
  =head1 METHODS
  
  =over
  
  =item * new
  
  =item * init
  
  =item * scan
  
  Scans the document into a parsed tree and retrieves its meta and toc
  data if possible.
  
  =item * render
  
  Renders the output document and writes it to its final destination.
  It relies on the sub-class's convert() method to produce the output.
  
  =item * src_read
  
  Fetches the source of the document. In principle the source can be
  read from different media, e.g. file://, http://, a relational DB or
  OCR :) Currently only local files are implemented.
  
  A subclass may implement a "source" filter. For example, if the
  source document is written in an extended POD, the source filter may
  convert it into standard POD. If the source includes template
  directives, these can be pre-processed as well.
  
  The document's content comes out of this method ready for parsing
  and conversion into other formats.
  
  =item * meta
  
  A simple get/set accessor for the I<meta> attribute.
  
  =item * toc
  
  A simple get/set accessor for the I<toc> attribute.
  
  =back
  
  =head1 ABSTRACT METHODS
  
  These methods must be implemented by the sub-classes:
  
  =over
  
  =item retrieve_meta_data
  
  Retrieve and set the meta data that describes the input document into
  the I<meta> object attribute. Various documents may provide different
  meta information. The only required meta field is I<title>.
  
  =back
  
  These methods can be implemented by the sub-classes:
  
  =over
  
  =item src_filter
  
  A subclass may want to preprocess the source document before it is
  processed further. This method is called after the source has been
  read. By default nothing happens.
  
  =back
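  
  The optional hook mechanism can be illustrated in isolation. The
  sketch below is not the real class: C<My::Doc> is a stand-in reduced
  to the hook call, C<My::Doc::Podify> is a hypothetical sub-class, and
  the content is kept as a plain string rather than a reference. It
  shows a base method that calls C<src_filter> only when a sub-class
  provides one.
  
  ```perl
  use strict;
  use warnings;
  
  # stand-in for the base class, reduced to the hook mechanism
  package My::Doc;
  
  sub new {
      my($class, %args) = @_;
      return bless { %args }, ref($class)||$class;
  }
  
  sub src_read {
      my($self) = @_;
      # the optional hook is called only if a sub-class provides it
      if (my $sub = $self->can('src_filter')) {
          $self->$sub();
      }
      return $self->{content};
  }
  
  # hypothetical sub-class that podifies pseudo-POD items
  package My::Doc::Podify;
  use vars qw(@ISA);
  @ISA = qw(My::Doc);
  
  sub src_filter {
      my($self) = @_;
      $self->{content} =~ s/^\* /=item * /mg;
  }
  
  package main;
  
  my $plain    = My::Doc->new(content => "* one\n")->src_read;
  my $podified = My::Doc::Podify->new(content => "* one\n")->src_read;
  print $plain, $podified;
  ```
  
  The base object returns its content untouched, while the sub-class
  gets a chance to rewrite it before any further processing.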
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/DocSet.pm
  
  Index: DocSet.pm
  ===================================================================
  package DocSet::DocSet;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  use DocSet::RunTime;
  use DocSet::Cache ();
  use DocSet::Doc ();
  use DocSet::NavigateCache ();
  
  use vars qw(@ISA);
  use DocSet::Config ();
  @ISA = qw(DocSet::Config);
  
  ########
  sub new {
      my $class = shift;
      my $self = bless {}, ref($class)||$class;
      $self->init(@_);
      return $self;
  }
  
  
  sub init {
      my($self, $config_file, $parent_o, $src_rel_dir) = @_;
  
      $self->read_config($config_file);
  
      # are we inside a super docset?
      if ($parent_o and ref($parent_o)) {
          $self->{parent_o} = $parent_o;
          $self->merge_config($src_rel_dir);
      }
  
  }
  
  sub scan {
      my($self) = @_;
  
      my $src_root = $self->get_dir('src_root');
  
      # each output mode needs its own cache, because of the destination
      # links which are different
      my $mode = $self->get('tmpl_mode');
      my $cache = DocSet::Cache->new("$src_root/cache.$mode.dat");
      $self->cache($cache); # store away
  
      # cleanup the cache or rebuild
      $cache->invalidate if get_opts('rebuild_all');
  
      # cache the index node meta data
      $cache->index_node($self->get('id'),
                         $self->get('title'),
                         $self->get('abstract')
                        );
  
      # cache the location of the parent node cache
      if (my $parent_o = $self->get('parent_o')) {
          my $parent_src_root   = $parent_o->get_dir('src_root');
          (my $rel2parent_src_root = $src_root) =~ s|^\Q$parent_src_root\E||;
          my $rel_dir = join '/', ("..") x ($rel2parent_src_root =~ tr|/|/|);
          my $parent_cache_path = "$parent_src_root/cache.$mode.dat";
          $cache->parent_node($parent_cache_path,
                              $self->get('id'),
                              $rel_dir);
          $self->set_dir(rel_parent_root => $rel_dir);
      }
  
      ###
      # scan the nodes of the current level and cache the meta and other
      # data
  
      my $hidden = 0;
      my @nodes_by_type = @{ $self->nodes_by_type };
      while (@nodes_by_type) {
          my($type, $data) = splice @nodes_by_type, 0, 2;
          if ($type eq 'docsets') {
              my $docset = $self->docset_scan_n_cache($data, $hidden);
              $self->object_store($docset)
                  if defined $docset and ref $docset;
  
          } elsif ($type eq 'chapters') {
              my $chapter = $self->chapter_scan_n_cache($data, $hidden);
              $self->object_store($chapter)
                  if defined $chapter and ref $chapter;
  
          } elsif ($type eq 'links') {
              $self->link_scan_n_cache($data, $hidden);
              # we don't need to process links
  
          } else {
              # nothing
          }
  
      }
  
      # the same but for the hidden objects
      $hidden = 1;
      my @hidden_nodes_by_type = @{ $self->hidden_nodes_by_type };
      while (@hidden_nodes_by_type) {
          my($type, $data) = splice @hidden_nodes_by_type, 0, 2;
          if ($type eq 'docsets') {
              my $docset = $self->docset_scan_n_cache($data, $hidden);
              $self->object_store($docset)
                  if defined $docset and ref $docset;
  
          } elsif ($type eq 'chapters') {
              my $chapter = $self->chapter_scan_n_cache($data, $hidden);
              $self->object_store($chapter)
                  if defined $chapter and ref $chapter;
  
          } else {
              # nothing
          }
      }
  
      $cache->node_groups($self->node_groups);
  
      # sync the cache
      $cache->write;
  
  }
  
  
  sub docset_scan_n_cache {
      my($self, $src_rel_dir, $hidden) = @_;
  
      my $src_root = $self->get_dir('src_root');
      my $cfg_file =  "$src_root/$src_rel_dir/config.cfg";
      my $docset = $self->new($cfg_file, $self, $src_rel_dir);
      $docset->scan;
  
      # cache the children meta data
      my $id = $docset->get('id');
      my $meta = {
                  title    => $docset->get('title'),
                  link     => "$src_rel_dir/index.html",
                  abstract => $docset->get('abstract'),
                 };
      $self->cache->set($id, 'meta', $meta, $hidden);
  
      return $docset;
  }
  
  
  sub link_scan_n_cache {
      my($self, $link, $hidden) = @_;
      my %meta = %$link; # make a copy
      my $id = delete $meta{id};
      $self->cache->set($id, 'meta', \%meta, $hidden);
  }
  
  
  sub chapter_scan_n_cache {
      my($self, $src_file, $hidden) = @_;
  
      my $trg_ext = $self->trg_ext();
  
      my $src_root      = $self->get_dir('src_root');
      my $dst_root      = $self->get_dir('dst_root');
      my $abs_doc_root  = $self->get_dir('abs_doc_root');
      my $src_path      = "$src_root/$src_file";
  
      my $src_ext = filename_ext($src_file)
          or die "cannot get an extension for $src_file";
      my $src_mime = $self->ext2mime($src_ext)
          or die "unknown extension: $src_ext";
      (my $basename = $src_file) =~ s/\.\Q$src_ext\E$//;
  
      # destination paths
      my $rel_dst_path = "$basename.$trg_ext";
      my $rel_doc_root = "./";
      $rel_dst_path =~ s|^\./||; # strip the leading './'
      $rel_doc_root .= join '/', ("..") x ($rel_dst_path =~ tr|/|/|);
      $rel_doc_root =~ s|/$||; # remove the last '/'
      my $dst_path  = "$dst_root/$rel_dst_path";
  
      # push to the list of final chapter paths
      # e.g. used by PS/PDF build, which needs all the chapters
      $self->trg_chapters($rel_dst_path);
  
      ### to rebuild or not to rebuild
      my($should_update, $reason) = should_update($src_path, $dst_path);
      if (!$should_update) {
          note "--- $src_file: skipping ($reason)";
          return undef;
      }
  
      ### init
      note "+++ $src_file: processing ($reason)";
      my $dst_mime = $self->get('dst_mime');
      my $conv_class = $self->conv_class($src_mime, $dst_mime);
      require_package($conv_class);
  
      my $chapter = $conv_class->new(
           tmpl_mode    => $self->get('tmpl_mode'),
           tmpl_root    => $self->get_dir('tmpl'),
           src_uri      => $src_file,
           src_path     => $src_path,
           dst_path     => $dst_path,
           rel_dst_path => $rel_dst_path,
           rel_doc_root => $rel_doc_root,
           abs_doc_root => $abs_doc_root,
          );
  
      $chapter->scan();
  
      # cache the chapter's meta and toc data
      $self->cache->set($src_file, 'meta', $chapter->meta, $hidden);
      $self->cache->set($src_file, 'toc',  $chapter->toc,  $hidden);
  
      return $chapter;
  
  }
  
  sub render {
      my($self) = @_;
  
      # copy non-pod files like images and stylesheets
      $self->copy_the_rest;
  
      my $src_root = $self->get_dir('src_root');
  
      # each output mode needs its own cache, because of the destination
      # links which are different
      my $mode = $self->get('tmpl_mode');
      my $cache = DocSet::Cache->new("$src_root/cache.$mode.dat");
  
      # render the objects no matter what kind they are
      for my $obj ($self->stored_objects) {
          $obj->render($cache);
      }
  
      $self->complete;
  
  }
  
  ####################
  sub copy_the_rest {
      my($self) = @_;
  
      my @copy_files = @{ $self->files_to_copy || [] };
  
      return unless @copy_files;
  
      my $src_root = $self->get_dir('src_root');
      my $dst_root = $self->get_dir('dst_root');
      note "+++ Copying the non-processed files from $src_root to $dst_root";
      foreach my $src_path (@copy_files){
          my $dst_path = $src_path;
  #        # some OSs's File::Find returns files with no dir prefix root
  #        # (that's what ()* is for
  #        $dst_path =~ s/(?:$src_root)*/$dst_root/; 
          $dst_path =~ s/^\Q$src_root\E/$dst_root/;
              
          # to rebuild or not to rebuild
          my($should_update, $reason) = 
              should_update($src_path, $dst_path);
          if (!$should_update) {
              note "--- skipping cp $src_path $dst_path ($reason)";
              next;
          }
          note "+++ processing cp $src_path $dst_path ($reason)";
          copy_file($src_path, $dst_path);
      }
  }
  
  
  # abstract methods
  sub complete {}
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::DocSet> - An abstract docset generation class
  
  =head1 SYNOPSIS
  
    use DocSet::DocSet::HTML ();
    my $docset = DocSet::DocSet::HTML->new($config_file);
    
    # must start from the abs root
    chdir $abs_root;
    
    # must be a relative path to be able to move the generated code from
    # location to location, without adjusting the links
    $docset->set_dir(abs_root => ".");
    $docset->scan;
    $docset->render;
  
  =head1 DESCRIPTION
  
  C<DocSet::DocSet> processes a docset, which can include other docsets,
  documents and links. In the first pass it scans the documents and
  other docsets linked to it, and caches this information and the
  resulting objects for later use. In the second pass the stored
  objects are rendered and the docset is completed.
  
  This class cannot be used on its own; it has to be subclassed and
  extended by sub-classes that know the specific input and output
  formats of the documents to be processed. It handles only the part
  of the functionality that doesn't require format-specific knowledge.
  
  =head2 METHODS
  
  This class inherits from C<DocSet::Config> and you will find the
  documentation of methods inherited from this class in its pod.
  
  The following "public" methods are implemented in this super-class:
  
  =over
  
  =item * new
  
    $class->new($config_file, $parent_o, $src_rel_dir);
  
  =item * init
  
    $self->init($config_file, $parent_o, $src_rel_dir);
  
  =item * scan
  
    $self->scan();
  
  Scans the docset for meta data and tocs of its items and caches this
  information and the item objects.
  
  =item * render
  
    $self->render();
  
  Copies the unprocessed files (see copy_the_rest()), calls the
  render() method of each of the stored objects and finally calls
  complete(), where sub-classes create an index page linking all the
  items.
  
  =item * copy_the_rest
  
    $self->copy_the_rest();
  
  Copies the items which aren't processed (e.g. images, CSS files, etc.).
  
  =back
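  
  During the scan stage each chapter also records a relative path from
  its destination back to the docset root, which is what allows the
  generated files to be moved around without adjusting the links. The
  calculation boils down to counting the directory separators in the
  relative destination path. Here is a stand-alone sketch of it; the
  C<rel_doc_root> function name is made up for this example, since in
  the module the logic is inlined.
  
  ```perl
  use strict;
  use warnings;
  
  # hypothetical helper wrapping the inlined calculation
  sub rel_doc_root {
      my($rel_dst_path) = @_;
      $rel_dst_path =~ s|^\./||;               # strip the leading './'
      my $depth = ($rel_dst_path =~ tr|/|/|);  # count the sub-directory levels
      my $root  = "./" . join '/', ("..") x $depth;
      $root =~ s|/$||;                         # remove the trailing '/'
      return $root;
  }
  
  for my $path (qw(index.html guide/intro.html guide/api/ref.html)) {
      printf "%-20s => %s\n", $path, rel_doc_root($path);
  }
  ```
  
  A chapter at the top level gets C<.>, one level down it gets
  C<./..>, two levels down C<./../..>, and so on.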
  
  =head2 ABSTRACT METHODS
  
  The following methods should be implemented by the sub-classes.
  
  =over
  
  =item * parse
  
  =item * retrieve_meta_data
  
  =item * convert
  
  =item * complete
  
    $self->complete();
  
  Put here anything that should run after all the items have been
  rendered and all the meta info has been collected, e.g. generation
  of the I<index> file, which links to all the items and to the parent
  node if one exists.
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/NavigateCache.pm
  
  Index: NavigateCache.pm
  ===================================================================
  package DocSet::NavigateCache;
  
  use strict;
  use warnings;
  
  use DocSet::RunTime;
  use DocSet::Util;
  use Storable;
  use Carp;
  
  # cache the loaded cache files
  use vars qw(%CACHE);
  %CACHE = ();
  
  #use vars qw(@ISA);
  use DocSet::Cache ();
  #@ISA = qw(DocSet::Cache);
  
  use constant OBJ         => 0;
  use constant ID          => 1;
  use constant CUR_PATH    => 2;
  use constant REL_PATH    => 3;
  
  # $rel_path (to the parent) is optional (e.g. root doesn't have a parent)
  sub new {
      my($class, $cache_path, $id, $rel_path) = @_;
  
      croak "no cache path specified" unless defined $cache_path;
      croak "no id specified"         unless defined $id;
  
      my $cache = get_cache($cache_path);
      my $self = bless [], ref($class)||$class;
      $self->[OBJ]         = $cache;
      $self->[CUR_PATH]    = $cache_path;
      $self->[REL_PATH]    = $rel_path if $rel_path;
      $self->[ID]          = $id;
  
      return $self;
  }
  
  sub parent_rel_path {
      my($self) = @_;
      return defined $self->[REL_PATH] ? $self->[REL_PATH] : undef;
  }
  
  # get next item's object or undef if there are no more
  sub next {
      my($self) = @_;
      my $cache    = $self->[OBJ];
  
      my $seq      = $cache->id2seq($self->[ID]);
      my $last_seq = $cache->total_ids - 1;
  
      # if the next object is hidden, it's like there is no next object,
      # because the hidden objects, if any, are always coming last
      if ($seq < $last_seq) {
          my $id = $cache->seq2id($seq + 1);
          if ($cache->is_hidden($id)) {
              return undef;
          }
          else {
              return $self->new($self->[CUR_PATH], $id);
          }
      } else {
          return undef;
      }
  
  }
  
  # get prev item's object or undef if there are no more
  sub prev {
      my($self) = @_;
      my $cache    = $self->[OBJ];
      my $seq = $cache->id2seq($self->[ID]);
      
      # since the hidden objects, if any, are always coming last
      # we need to go to the last of the non-hidden objects.
      if ($seq) {
          my $id = $cache->seq2id($seq - 1);
          if ($cache->is_hidden($id)) {
              return $self->new($self->[CUR_PATH], $id)->prev();
          }
          else {
              return $self->new($self->[CUR_PATH], $id);
          }
      } else {
          return undef;
      }
  }
  
  
  
  # get the object of the first item on the same level
  sub first {
      my($self) = @_;
      my $cache    = $self->[OBJ];
  
      # it's possible that the whole docset is made of hidden objects.
      # since the hidden objects, if any, are always coming last
      # we simply return undef in such a case
      if ($cache->total_ids) {
          my $id = $cache->seq2id(0);
          if ($cache->is_hidden($id)) {
              return undef;
          }
          else {
              return $self->new($self->[CUR_PATH], $id);
          }
      }
      else {
          return undef;
      }
  }
  
  
  # the index node of the current level
  sub index_node {
      my($self) = @_;
      return $self->[OBJ]->index_node;
  }
  
  # get the object of the parent
  sub up {
      my($self) = @_;
      my($path, $id, $rel_path) = $self->[OBJ]->parent_node;
  
      $rel_path = "." unless defined $rel_path;
      if (defined $self->[REL_PATH] && length $self->[REL_PATH]) {
          # append the relative path of each child, so the overall
          # relative path is correct
          $rel_path .= "/$self->[REL_PATH]";
      }
  
      # it's ok to have a hidden parent, we don't mind to see it
      # as non-hidden, since the children of the hidden parent aren't
      # linked from other non-hidden pages. In fact we must ignore the
      # fact that it's hidden (if it is) because otherwise the navigation
      # won't work.
      if ($path) {
          return $self->new($path, $id, $rel_path);
      }
      else {
          return undef;
      }
  }
  
  # retrieve the meta data of the current node
  sub meta {
      my($self) = @_;
      return $self->[OBJ]->get($self->[ID], 'meta');
  }
  
  # retrieve the node groups
  sub node_groups {
      my($self) = @_;
  #print "OK: "; 
  #dumper $self->[OBJ]->node_groups;
      return $self->[OBJ]->node_groups;
  }
  
  sub id {
      shift->[ID];
  }
  
  sub get_cache {
      my($cache_path) = @_;
      $CACHE{$cache_path} ||= DocSet::Cache->new($cache_path);
      return $CACHE{$cache_path};
  }
  
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::NavigateCache> - Navigate the DocSet's caches in a readonly mode
  
  =head1 SYNOPSIS
  
    my $nav = DocSet::NavigateCache->new($cache_path, $id, $rel_path);
  
    # go through all nodes from left to right, and remember the sequence
    # number of the $nav node (from which we have started)
    my $iterator = $nav->first;
    my $seq = 0;
    my $counter = 0;
    my @meta = ();
    while ($iterator) {
       $seq = $counter if $iterator->id eq $nav->id;
       push @meta, $iterator->meta;
       $iterator = $iterator->next;
       $counter++;
    }
    # add index node's meta data
    push @meta, $nav->index_node;
  
    # prev object
    my $prev  = $nav->prev;
  
    # get all the ancestry
    my @parents = ();
    my $p = $nav->up;
    while ($p) {
        push @parents, $p;
        $p = $p->up;
    }
  
  =head1 DESCRIPTION
  
  C<DocSet::NavigateCache> navigates the cache created by docset objects
  during their scan stage. Once the navigator handle is obtained, it's
  possible to move between the nodes of the same level, using the next()
  and prev() methods or going up one level using the up() method. The
  first() method returns the object of the first node on the same
  level. Each of these methods returns a new C<DocSet::NavigateCache>
  object or undef if the object cannot be created.
  
  This object can be used to retrieve node's meta data, its id and its
  index node's meta data.
  
  Currently it is used in the templates to create the internal
  navigation widgets. That's where you will find examples of its use
  (e.g. I<tmpl/custom/html/menu_top_level> and
  I<tmpl/custom/html/navbar_global>).
  
  As C<DocSet::NavigateCache> reads cache files in, it caches them, since
  usually the same file is required many times in a few subsequent
  calls.
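  
  This kind of caching is a plain memoization keyed by the cache file
  path. The following sketch shows the pattern on its own; the
  C<load_cache> function is a hypothetical stand-in for the expensive
  step of reading a cache file from disk, and the C<$loads> counter
  exists only to make the effect visible.
  
  ```perl
  use strict;
  use warnings;
  
  my %CACHE;       # loaded caches, keyed by path
  my $loads = 0;   # counts the expensive loads, for demonstration only
  
  # hypothetical loader standing in for reading a cache file from disk
  sub load_cache {
      my($cache_path) = @_;
      $loads++;
      return { path => $cache_path };
  }
  
  # load each cache at most once and reuse it afterwards
  sub get_cache {
      my($cache_path) = @_;
      $CACHE{$cache_path} ||= load_cache($cache_path);
      return $CACHE{$cache_path};
  }
  
  get_cache('docs/cache.html.dat') for 1..3;
  get_cache('tutorials/cache.html.dat');
  print "loads: $loads\n";   # two distinct paths, so two loads
  ```
  
  Repeated requests for the same path return the very same object, so
  the cost of loading is paid once per cache file.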
  
  Note that C<DocSet::NavigateCache> doesn't see any hidden objects
  stored in the cache.
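  
  Since hidden nodes are always stored after the visible ones, skipping
  them during navigation is straightforward. The sketch below models
  this with a plain array instead of a real cache; the node ids and the
  C<next_visible> and C<prev_visible> function names are assumptions
  made for the illustration and are not part of the module.
  
  ```perl
  use strict;
  use warnings;
  
  # hypothetical node ids in cache order; hidden nodes always come last
  my @ids    = qw(intro install debug error_404);
  my %hidden = (error_404 => 1);
  
  # id of the next visible node after position $seq, or undef
  sub next_visible {
      my($seq) = @_;
      return undef if $seq >= $#ids;
      my $id = $ids[$seq + 1];
      return $hidden{$id} ? undef : $id;
  }
  
  # id of the closest visible node before position $seq, or undef
  sub prev_visible {
      my($seq) = @_;
      while ($seq > 0) {
          my $id = $ids[--$seq];
          return $id unless $hidden{$id};
      }
      return undef;
  }
  
  my $after_intro = next_visible(0);   # the node following 'intro'
  my $after_debug = next_visible(2);   # the next node is hidden
  my $before_404  = prev_visible(3);   # going back from the hidden node
  print defined $_ ? "$_\n" : "(none)\n"
      for $after_intro, $after_debug, $before_404;
  ```
  
  Going forward stops at the last visible node, while going backward
  from a hidden node lands on the closest visible one.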
  
  =head2 METHODS
  
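  =over
  
  =item * new
  
    DocSet::NavigateCache->new($cache_path, $id, $rel_path);
  
  C<$cache_path> is the path of the cache file to read.
  
  C<$id> is the id of the current node.
  
  C<$rel_path> is optional and passed if an object has a parent node. It
  contains a relative path from the current node to its parent.
  
  =item * parent_rel_path
  
  Returns the relative path to the parent that was passed to new(), or
  undef if there is none.
  
  =item * next
  
  Returns the navigator object of the next node on the same level, or
  undef if the current node is the last one or the next node is hidden.
  
  =item * prev
  
  Returns the navigator object of the closest preceding non-hidden node
  on the same level, or undef if there is none.
  
  =item * first
  
  Returns the navigator object of the first node on the same level, or
  undef if there are no nodes or the first node is hidden.
  
  =item * up
  
  Returns the navigator object of the parent node, or undef if there is
  no parent.
  
  =item * index_node
  
  Returns the index node of the current level.
  
  =item * meta
  
  Returns the meta data of the current node.
  
  =item * id
  
  Returns the id of the current node.
  
  =item * node_groups
  
  Returns the node groups stored in the cache.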
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/RunTime.pm
  
  Index: RunTime.pm
  ===================================================================
  package DocSet::RunTime;
  
  use strict;
  use warnings;
  
  require Exporter;
  use vars qw(@ISA @EXPORT %opts);
  @ISA    = qw(Exporter);
  @EXPORT = qw(get_opts);
  
  
  sub set_opt {
      my(%args) = ();
      if (@_ == 1) {
          my $arg = shift;
          my $ref = ref $arg;
          if ($ref) {
              %args = $ref eq 'HASH' ? %$arg : @$arg;
          } else {
              die "a single argument must be a reference to an array or a hash";
          }
      } else {
          %args = @_;
      }
      @opts{keys %args} = values %args;
  }
  
  sub get_opts {
      my $opt = shift;
      exists $opts{$opt} ? $opts{$opt} : '';
  }
  
  # check whether we have Storable available
  use constant HAS_STORABLE => eval { require Storable; };
  sub has_storable_module {
      return HAS_STORABLE;
  }
  
  my $html2ps_exec = `which html2ps` || '';
  chomp $html2ps_exec;
  sub can_create_ps {
      # html2ps is bundled, so we can always create PS
      return $html2ps_exec;
  
      # if you unbundle it make sure you write here a code similar to
      # can_create_pdf()
  }
  
  my $ps2pdf_exec = `which ps2pdf` || '';
  chomp $ps2pdf_exec;
  sub can_create_pdf {
      # check whether ps2pdf exists
      return $ps2pdf_exec if $ps2pdf_exec;
  
      print(qq{It seems that you do not have ps2pdf installed! You have
               to install it if you want to generate the PDF file
              });
      return 0;
  }
  
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::RunTime> - RunTime Configuration
  
  =head1 SYNOPSIS
  
    use DocSet::RunTime;
    if (get_opts('verbose')) {
        print "verbose mode";
    }
  
    DocSet::RunTime::set_opt(\%args);
  
    DocSet::RunTime::has_storable_module();
    DocSet::RunTime::can_create_ps();
    DocSet::RunTime::can_create_pdf();
  
  
  =head1 DESCRIPTION
  
  This module is a part of the docset application, and it stores the
  run time arguments, e.g. whether to build PS and PDF or whether to
  run in verbose mode.
  
  =head1 FUNCTIONS
  
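  =over
  
  =item * set_opt
  
  Stores one or more options. Accepts a list of key/value pairs, or a
  single reference to a hash or to an array of key/value pairs.
  
  =item * get_opts
  
  Returns the value of the given option, or an empty string if the
  option hasn't been set. This is the only function exported by
  default.
  
  =item * has_storable_module
  
  Returns true if the C<Storable> module can be loaded.
  
  =item * can_create_ps
  
  Returns the path to the I<html2ps> executable if it was found, an
  empty string otherwise.
  
  =item * can_create_pdf
  
  Returns the path to the I<ps2pdf> executable if it was found.
  Otherwise prints a notice and returns 0.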
  
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Util.pm
  
  Index: Util.pm
  ===================================================================
  package DocSet::Util;
  
  use strict;
  use warnings;
  
  use Symbol ();
  use File::Basename ();
  use File::Copy ();
  use File::Path ();
  use Data::Dumper;
  use Carp;
  use Template;
  
  use DocSet::RunTime;
  
  use vars qw(@ISA @EXPORT);
  @ISA    = qw(Exporter);
  @EXPORT = qw(read_file read_file_paras copy_file write_file create_dir
               filename_ext require_package dumper sub_trace note
               get_date get_timestamp proc_tmpl build_matchmany_sub
               banner should_update confess cluck);
  
  # copy_file($src_path, $dst_path);
  # copy a file at $src_path to $dst_path, 
  # if one of the directories of the $dst_path doesn't exist -- it'll
  # be created.
  ###############
  sub copy_file {
      my($src, $dst) = @_;
  
      # make sure that the directory exists or create it
      my $base_dir = File::Basename::dirname $dst;
      create_dir($base_dir) unless (-d $base_dir);
  
      File::Copy::copy($src, $dst)
          or croak "Can't copy $src to $dst: $!";
  }
  
  
  # write_file($filename, $ref_to_array||scalar);
  # content will be written to the file from the passed array of
  # paragraphs
  ###############
  sub write_file {
      my($filename, $content) = @_;
  
      # make sure that the directory exists or create it
      my $dir = File::Basename::dirname $filename;
      create_dir($dir) unless -d $dir;
  
      my $fh = Symbol::gensym;
      open $fh, ">$filename" or croak "Can't open $filename for writing: $!";
      print $fh ref $content ? @$content : defined $content ? $content : '';
      close $fh;
  }
  
  
  # recursively creates a multi-layer directory
  ###############
  sub create_dir {
      my $path = shift;
      return if !defined($path) || -e $path;
      # META: mode could be made configurable
      File::Path::mkpath($path, 0, 0755) or croak "Couldn't create $path: $!";
  }
  
  # read_file($filename, $ref);
  # assign to a ref to a scalar
  ###############
  sub read_file {
      my($filename, $r_content) = @_;
  
      my $fh = Symbol::gensym;
      open $fh, $filename  or croak "Can't open $filename for reading: $!";
      local $/;
      $$r_content = <$fh>;
      close $fh;
  
  }
  
  # read_file_paras($filename, $ref_to_array);
  # read by paragraph
  # content will be set into a ref to an array
  ###############
  sub read_file_paras {
      my($filename, $ra_content) = @_;
  
      my $fh = Symbol::gensym;
      open $fh, $filename  or croak "Can't open $filename for reading: $!";
      local $/ = "";
      @$ra_content = <$fh>;
      close $fh;
  
  }
  
  # return the passed file's extension (lowercased), or '' if there is none
  # note: only the last component counts: '/foo/bar.conf.in' returns 'in'
  # note: a hidden file .foo will be recognized as an extension 'foo'
  sub filename_ext {
      my($filename) = @_;
      my $ext = (File::Basename::fileparse($filename, '\.[^\.]*'))[2] || '';
      $ext =~ s/^\.(.*)/lc $1/e;
      $ext;
  }
  
  sub get_date {
      sprintf "%s %d, %d", (split /\s+/, scalar localtime)[1,2,4];
  }
  
  sub get_timestamp {
      my ($mon,$day,$year) = (localtime ( time ) )[4,3,5];
      sprintf "%02d/%02d/%04d", ++$mon, $day, 1900+$year;
  }
  
  # convert Foo::Bar into Foo/Bar.pm and require
  sub require_package {
      my $package = shift;
      die "no package passed" unless $package;
      $package =~ s|::|/|g;
      $package .= '.pm';
      require $package;
  }
  
  
  
  # convert the template into the release version
  # $tmpl_root: a ref to an array of tmpl base dirs
  # tmpl_file: which template file to process
  # mode     : in what mode (html, ps, ...)
  # vars     : ref to a hash with vars to pass to the template
  #
  # returns the processed template
  ###################
  sub proc_tmpl {
      my($tmpl_root, $tmpl_file, $mode, $vars) = @_;
  
      # append the specific rendering mode, so the correct template will
      # be picked (e.g. in 'ps' mode, the ps sub-dir(s) will be searched
      # first)
      my $search_path = join ':',
          map { ("$_/$mode", "$_/common", "$_") }
              (ref $tmpl_root ? @$tmpl_root : $tmpl_root);
  
      my $template = Template->new
          ({
            INCLUDE_PATH => $search_path,
            RECURSION => 1,
            PLUGINS => {
                cnavigator => 'DocSet::Template::Plugin::NavigateCache',
            },
           }) || die $Template::ERROR, "\n";
  
      #  use Data::Dumper;
      #  print Dumper \@search_path;
  
      my $output;
      $template->process($tmpl_file, $vars, \$output)
          || die "error: ", $template->error(), "\n";
  
      return $output;
  
  }
  
  # compare the timestamps/existence of src and dst paths
  # and return (true, reason) if src is newer than dst (or dst doesn't exist),
  # otherwise return (false, reason)
  #
  # if rebuild_all runtime is on, this always returns (true, reason)
  #
  sub should_update {
      my($src_path, $dst_path) = @_;
  
      # to rebuild or not to rebuild
      my $not_modified = 
          (-e $dst_path and -M $dst_path < -M $src_path) ? 1 : 0;
  
      my $reason = $not_modified ? 'not modified' : 'modified';
      if (get_opts('rebuild_all')) {
          return (1, "$reason / forced");
      } else {
          return (!$not_modified, $reason);
      }
      
  
  }
  
  sub banner {
      my($string) = @_;
  
      my $len = length($string) + 8;
      note(
           "#" x $len,
           "### $string ###",
           "#" x $len,
          );
  
  }
  
  # see DocSet::Config::files_to_copy() for usage
  #########################
  sub build_matchmany_sub {
      my $ra_regex = shift;
      my $expr = join '||', map { "\$_[0] =~ m/$_/o" } @$ra_regex;
      # note $expr;
      my $matchsub = eval "sub { ($expr) ? 1 : 0}";
      die "Failed in building regex [@$ra_regex]: $@" if $@;
      $matchsub;
  }
  
  sub dumper {
      print Dumper @_;
  }
  
  
  #sub sub_trace {
  ##    my($package) = (caller(0))[0];
  #    my($sub) = (caller(1))[3];
  #    print "=> $sub: @_\n";
  #}
  
  *confess = \*Carp::confess;
  *cluck = \*Carp::cluck;
  
  sub note {
      return unless get_opts('verbose');
      print join("\n", @_), "\n";
  
  }
  
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Util> - Commonly used functions
  
  =head1 SYNOPSIS
  
    use DocSet::Util;
  
    copy_file($src, $dst);
    write_file($filename, $content);
    create_dir($path);
  
    read_file($filename, $r_content);
    read_file_paras($filename, $ra_content);
  
    my $ext = filename_ext($filename);
    my $date = get_date();
    my $timestamp = get_timestamp();
  
    require_package($package);
    my $output = proc_tmpl($tmpl_root, $tmpl_file, $mode, $vars);
    my($should_update, $reason) = should_update($src_path, $dst_path);
    banner($string);
  
    my $sub_ref = build_matchmany_sub($ra_regex);
    dumper($ref);
    confess($string);
    note($string);
  
  
  =head1 DESCRIPTION
  
  All the functions are exported by default.
  
  =head2 METHODS
  
  META: to be completed (see SYNOPSIS meanwhile)
  
  =over
  
  =item * copy_file
  
  =item * write_file
  
  =item * create_dir
  
  =item * read_file
  
  =item * read_file_paras
  
  =item * filename_ext
  
  =item * get_date
  
  =item * get_timestamp
  
  =item * require_package
  
  =item * proc_tmpl
  
  =item * should_update
  
  =item * banner
  
  =item * build_matchmany_sub
  
  =item * dumper
  
  =item * confess
  
  =item * note
  
  =back
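
Since the function list above has no descriptions yet, here is a minimal sketch of what filename_ext returns for a few inputs. The function body is repeated from the module so the example runs on its own; the sample paths are made up for illustration. Because fileparse anchors the suffix pattern at the end of the name, only the last dot-separated component is reported.

```perl
use strict;
use warnings;
use File::Basename ();

# Same body as DocSet::Util::filename_ext, repeated here so the
# example runs without DocSet installed.
sub filename_ext {
    my($filename) = @_;
    my $ext = (File::Basename::fileparse($filename, '\.[^\.]*'))[2] || '';
    $ext =~ s/^\.(.*)/lc $1/e;   # drop the leading dot and lowercase
    $ext;
}

# Hypothetical sample paths, chosen to cover the edge cases.
for my $path (qw(/foo/bar.pod /foo/bar.conf.in /foo/.hidden /foo/README /foo/PAGE.HTML)) {
    printf "%-18s => '%s'\n", $path, filename_ext($path);
}
```

A caller that needs the full multi-part suffix (such as conf.in) would have to pass a different pattern to fileparse; that is outside what this function does.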
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  
  =cut
  
  
  
  
  1.1                  modperl-docs/lib/DocSet/Doc/HTML2HTML.pm
  
  Index: HTML2HTML.pm
  ===================================================================
  package DocSet::Doc::HTML2HTML;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  
  use vars qw(@ISA);
  require DocSet::Source::HTML;
  @ISA = qw(DocSet::Source::HTML);
  
  sub convert {
      my($self) = @_;
  
      my @body = $self->{parsed_tree}->{body};
      my $vars = {
                  meta => $self->{meta},
                  body => \@body,
                  dir  => $self->{dir},
                  nav  => $self->{nav},
                  last_modified => $self->{timestamp},
                 };
      my $tmpl_file = 'page';
      my $mode = $self->{tmpl_mode};
      my $tmpl_root = $self->{tmpl_root};
      $self->{output} = proc_tmpl($tmpl_root, $tmpl_file, $mode, {doc => $vars} );
  }
  
  
  # needed for plugging docs into index files
  sub converted_body {
      my($self) = @_;
  
      return $self->{parsed_tree}->{body};
  }
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Doc::HTML2HTML> - HTML source to HTML target converter
  
  =head1 SYNOPSIS
  
  
  
  =head1 DESCRIPTION
  
  Implements a C<DocSet::Doc> sub-class which converts a source
  document in HTML into an output document in HTML.
  
  =head1 METHODS
  
  For the rest of the super class methods see C<DocSet::Doc>.
  
  =over
  
  =item * convert
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Doc/POD2HTML.pm
  
  Index: POD2HTML.pm
  ===================================================================
  package DocSet::Doc::POD2HTML;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  require Pod::POM;
  #require Pod::POM::View::HTML;
  #my $view_mode = 'Pod::POM::View::HTML';
  my $view_mode = 'DocSet::Doc::POD2HTML::View::HTML';
  
  use vars qw(@ISA);
  require DocSet::Source::POD;
  @ISA = qw(DocSet::Source::POD);
  
  sub convert {
      my($self) = @_;
  
      my $pom = $self->{parsed_tree};
      
  #    my @sections = $pom->head1();
  #    shift @sections; # skip the title
  
      my @sections = $pom->content();
      shift @sections; # skip the title
  
      my @body = ();
      foreach my $node (@sections) {
  #	my $type = $node->type();
  #        print "$type\n";
  	push @body, $node->present($view_mode);
      }
  
  #    for my $head1 (@sections) {
  #        push @body, $head1->title->present($view_mode);
  #        push @body, $head1->content->present($view_mode);
  #        for my $head2 ($head1->head2) {
  #            push @body, $head2->present($view_mode);
  #            for my $head3 ($head2->head3) {
  #                push @body, $head3->present($view_mode);
  #                for my $head4 ($head3->head4) {
  #                    push @body, $head4->present($view_mode);
  #                }
  #            }
  #        }
  #    }
  
      my $vars = {
                  meta => $self->{meta},
                  toc  => $self->{toc},
                  body => \@body,
                  dir  => $self->{dir},
                  nav  => $self->{nav},
                  last_modified => $self->{timestamp},
                 };
  
      my $tmpl_file = 'page';
      my $mode = $self->{tmpl_mode};
      my $tmpl_root = $self->{tmpl_root};
      $self->{output} = proc_tmpl($tmpl_root, $tmpl_file, $mode, {doc => $vars} );
  
  }
  
  1;
  
  
  package DocSet::Doc::POD2HTML::View::HTML;
  
  use vars qw(@ISA);
  require Pod::POM::View::HTML;
  @ISA = qw( Pod::POM::View::HTML);
  
  sub view_head1 {
      my ($self, $node) = @_;
      return $self->anchor($node->title) . $self->SUPER::view_head1($node);
  }
  
  sub view_head2 {
      my ($self, $node) = @_;
      return $self->anchor($node->title) . $self->SUPER::view_head2($node);
  }
  
  sub view_head3 {
      my ($self, $node) = @_;
      return $self->anchor($node->title) . $self->SUPER::view_head3($node);
  }
  
  sub view_head4 {
      my ($self, $node) = @_;
      return $self->anchor($node->title) . $self->SUPER::view_head4($node);
  }
  
  sub anchor {
      my($self, $title) = @_;
      my $anchor = "$title";
      $anchor =~ s/\W/_/g;
      return qq{<a name="$anchor"></a>\n};
  }
  
  1;
  
  
  
  __END__
  
  =head1 NAME
  
  C<DocSet::Doc::POD2HTML> - POD source to HTML target converter
  
  =head1 SYNOPSIS
  
  
  
  =head1 DESCRIPTION
  
  Implements a C<DocSet::Doc> sub-class which converts a source
  document in POD into an output document in HTML.
  
  =head1 METHODS
  
  For the rest of the super class methods see C<DocSet::Doc>.
  
  =over
  
  =item * convert
  
  =back
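
The view class in this module emits a named anchor before every heading, and the name is derived from the heading title by replacing each non-word character with an underscore. The sketch below isolates that rule. The helper name anchor_name and the sample title are hypothetical, and the href line only illustrates how a TOC link could refer to the anchor; the real markup comes from the templates.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the s/\W/_/g rule used by anchor()
# in this module and by render_toc_level() in DocSet::Source::POD.
sub anchor_name {
    my($title) = @_;
    my $anchor = "$title";   # stringify, in case a title object is passed
    $anchor =~ s/\W/_/g;     # every non-word character becomes '_'
    return $anchor;
}

my $title = 'Getting Started: Step 1';      # made-up heading
my $name  = anchor_name($title);

print qq{<a name="$name"></a>\n};           # emitted before the heading
print qq{<a href="#$name">$title</a>\n};    # illustrative TOC entry
```

Note that two headings differing only in punctuation or spacing map to the same anchor name under this rule.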
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  
  1.1                  modperl-docs/lib/DocSet/Doc/Text2HTML.pm
  
  Index: Text2HTML.pm
  ===================================================================
  package DocSet::Doc::Text2HTML;
  
  use strict;
  use warnings;
  
  use vars qw(@ISA);
  require DocSet::Source::Text;
  @ISA = qw(DocSet::Source::Text);
  
  ###########################
  ### not implemented yet ###
  ###########################
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Doc::Text2HTML> - Text source to HTML target converter
  
  =head1 SYNOPSIS
  
  
  
  =head1 DESCRIPTION
  
  Implements a C<DocSet::Doc> sub-class which converts a source
  document in Text into an output document in HTML.
  
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/DocSet/HTML.pm
  
  Index: HTML.pm
  ===================================================================
  package DocSet::DocSet::HTML;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  use DocSet::NavigateCache ();
  
  use vars qw(@ISA);
  use DocSet::DocSet ();
  @ISA = qw(DocSet::DocSet);
  
  # what's the output format
  sub trg_ext {
      return 'html';
  }
  
  sub init {
      my $self = shift;
  
      $self->SUPER::init(@_);
  
      # configure HTML specific run-time
      $self->set(dst_mime => 'text/html');
      $self->set(tmpl_mode => 'html');
      $self->set_dir(dst_root => $self->get_dir('dst_html'));
      banner("HTML DocSet: " . $self->get('title') );
  }
  
  sub complete {
      my($self) = @_;
  
      $self->write_index_file();
  }
  
  # generate the index.html based on the doc entities it includes, in
  # the following order: docsets, books, chapters
  #
  # Using the same template file create the long and the short index
  # html files
  ##################################
  sub write_index_file {
      my($self) = @_;
  
      my @toc  = ();
      my $cache = $self->cache;
  
      # TOC
      my @node_groups = @{ $self->node_groups };
      my @ids = $cache->ordered_ids;
  
      # create the toc while skipping over hidden files
      if (@node_groups && @ids) {
          # index's toc is built from groups of items' meta data
          while (@node_groups) {
              my($title, $count) = splice @node_groups, 0, 2;
              push @toc, {
                  group_title => $title,
                  subs  => [map {$cache->get($_, 'meta')} 
                            grep !$cache->is_hidden($_), 
                            splice @ids, 0, $count],
              };
          }
      }
      else {
          # index's toc is built from items' meta data
          for my $id (grep !$cache->is_hidden($_), $cache->ordered_ids) {
              push @toc, $cache->get($id, 'meta');
          }
      }
  
      my $dir = {
          abs_doc_root => $self->get_dir('abs_doc_root'),
          rel_doc_root => $self->get_dir('rel_parent_root'),
      };
  
      my $meta = {
           title    => $self->get('title'),
           abstract => $self->get('abstract'),
      };
  
      my $navigator = DocSet::NavigateCache->new($self->cache->path, $self->get('id'));
      my %args = (
           nav      => $navigator,
           toc      => \@toc,
           meta     => $meta,
           dir      => $dir,
           version  => $self->get('version')||'',
           date     => get_date(),
           last_modified => get_timestamp(),
  #         body     => top
      );
  
  
      # plug in the index top and bottom docs if defined (after converting them)
      if (my $body = $self->get('body')) {
          my $src_root = $self->get_dir('src_root');
          my $dst_mime = $self->get('dst_mime');
  
          for my $sec (qw(top bot)) {
              my $src_file = $body->{$sec};
              next unless $src_file;
  
              my $src_ext = filename_ext($src_file)
                  or die "cannot get an extension for $src_file";
              my $src_mime = $self->ext2mime($src_ext)
                  or die "unknown extension: $src_ext";
              my $conv_class = $self->conv_class($src_mime, $dst_mime);
              require_package($conv_class);
  
              my $chapter = $conv_class->new(
                  tmpl_mode    => $self->get('tmpl_mode'),
                  tmpl_root    => $self->get_dir('tmpl'),
                  src_uri      => $src_file,
                  src_path     => "$src_root/$src_file",
              );
              $chapter->scan();
              $args{body}{$sec} = $chapter->converted_body();
          }
  
      }
  
  
  
  
  
      my $dst_root  = $self->get_dir('dst_html');
      my $dst_file = "$dst_root/index.html";
      my $mode = $self->get('tmpl_mode');
      my $tmpl_file = 'index';
      my $vars = { doc => \%args };
      my $tmpl_root = $self->get_dir('tmpl');
      my $content = proc_tmpl($tmpl_root, $tmpl_file, $mode, $vars);
      note "+++ Creating $dst_file";
      DocSet::Util::write_file($dst_file, $content);
  }
  
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::DocSet::HTML> - A subclass of C<DocSet::DocSet> for generating HTML docset
  
  =head1 SYNOPSIS
  
  See C<DocSet::DocSet>
  
  =head1 DESCRIPTION
  
  This subclass of C<DocSet::DocSet> converts the source docset into a
  set of HTML documents, linking its items with an autogenerated
  I<index.html>.
  
  =head2 METHODS
  
  Most of the methods are documented in C<DocSet::DocSet>.
  
  =over
  
  =item * trg_ext
  
    $self->trg_ext();
  
  returns the extension of the target files. I<html> in the case of this
  sub-class.
  
  =item * init
  
    $self->init(@_);
  
  calls C<DocSet::DocSet::init> and then initializes its own HTML output
  specific settings.
  
  =item * complete
  
  see C<DocSet::DocSet>
  
  =item * write_index_file
  
    $self->write_index_file();
  
  creates the I<index.html> file linking all the items of the docset
  together.
  
  =back
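
The grouped branch of write_index_file walks a flat list of (title, count) pairs and consumes that many ids per group, dropping hidden ones. The sketch below reproduces just that step with plain data. The function name group_toc, the sample titles and ids, and the hash standing in for the cache's is_hidden check are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the grouping loop.
#   $ra_groups : flat list of (title => count) pairs
#   $ra_ids    : ordered document ids
#   $rh_hidden : ids to leave out (stands in for $cache->is_hidden)
sub group_toc {
    my($ra_groups, $ra_ids, $rh_hidden) = @_;
    my @groups = @$ra_groups;   # work on copies, splice is destructive
    my @ids    = @$ra_ids;
    my @toc = ();
    while (@groups) {
        my($title, $count) = splice @groups, 0, 2;
        my @members = splice @ids, 0, $count;   # take the group's ids first
        push @toc, {
            group_title => $title,
            subs        => [grep { !$rh_hidden->{$_} } @members],
        };
    }
    return \@toc;
}

my $toc = group_toc(['Guide' => 2, 'Reference' => 1],
                    [qw(intro install api)],
                    {install => 1});
for my $group (@$toc) {
    print "$group->{group_title}: @{ $group->{subs} }\n";
}
```

Because the ids are spliced before the hidden ones are filtered out, a hidden document still counts toward its group's size, which matches the order of operations in the module's loop.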
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/DocSet/PSPDF.pm
  
  Index: PSPDF.pm
  ===================================================================
  package DocSet::DocSet::PSPDF;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  use DocSet::RunTime;
  use DocSet::NavigateCache ();
  
  use vars qw(@ISA);
  use DocSet::DocSet ();
  @ISA = qw(DocSet::DocSet);
  
  # what's the output format
  sub trg_ext {
      return 'html';
  }
  
  sub init {
      my $self = shift;
  
      $self->SUPER::init(@_);
  
      # configure PS/PDF specific run-time
      # though we build PS/PDF, the intermediate product is HTML
      $self->set(dst_mime => 'text/html');
      $self->set(tmpl_mode => 'ps');
      $self->set_dir(dst_root => $self->get_dir('dst_ps'));
      banner("PS/PDF DocSet: " . $self->get('title') );
  }
  
  sub complete {
      my($self) = @_;
  
      $self->write_index_file();
  
      $self->create_ps_book;
      $self->create_pdf_book if get_opts('generate_pdf');
  }
  
  
  
  # generate the index.html file based on the doc entities it includes,
  # in the following order: docsets, books, chapters
  #
  # Using the same template file create the long and the short index
  # html files
  ##################################
  sub write_index_file{
      my($self) = @_;
  
      my $dir = {
                 abs_doc_root => $self->get_dir('abs_doc_root'),
                 rel_doc_root => '..', # META: probably wrong! (see write_index_html_file())
                };
  
      my $meta = {
                  title    => $self->get('title'),
                  abstract => $self->get('abstract'),
                 };
  
      use DocSet::NavigateCache;
      my $navigator = DocSet::NavigateCache->new($self->cache->path, $self->get('id'));
  
      my %args = 
          (
           nav      => $navigator,
           meta     => $meta,
           dir      => $dir,
           version  => $self->get('version')||'',
           date     => get_date(),
           last_modified => get_timestamp(),
          );
  
      my $dst_root  = $self->get_dir('dst_root');
      my $dst_file = "$dst_root/index.html";
      my $mode = $self->get('tmpl_mode');
      my $tmpl_file = 'index';
      my $vars = { doc => \%args };
      my $tmpl_root = $self->get_dir('tmpl');
      my $content = proc_tmpl($tmpl_root, $tmpl_file, $mode, $vars);
      note "+++ Creating $dst_file";
      DocSet::Util::write_file($dst_file, $content);
  }
  
  # generate the PS book
  ####################
  sub create_ps_book{
      my($self) = @_;
  
      note "+++ Generating a PostScript Book";
  
      my $html2ps_exec = DocSet::RunTime::can_create_ps();
      my $html2ps_conf = $self->get_file('html2ps_conf');
      my $id = $self->get('id');
      my $dst_root = $self->get_dir('dst_root');
      my $command = "$html2ps_exec -f $html2ps_conf -o $dst_root/${id}.ps ";
      $command .= join " ", map {"$dst_root/$_"} "index.html", $self->trg_chapters;
      note "% $command";
      system $command;
  
  }
  
  # generate the PDF book
  ####################
  sub create_pdf_book{
      my($self) = @_;
  
      note "+++ Converting PS => PDF";
      my $dst_root = $self->get_dir('dst_root');
      my $id = $self->get('id');
      my $command = "ps2pdf $dst_root/$id.ps $dst_root/$id.pdf";
      note "% $command";
      system $command;
  
  }
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::DocSet::PSPDF> - A subclass of C<DocSet::DocSet> for generating PS/PDF docset
  
  =head1 SYNOPSIS
  
  See C<DocSet::DocSet>
  
  =head1 DESCRIPTION
  
  This subclass of C<DocSet::DocSet> converts the source docset into PS
  and PDF "books". It uses C<html2ps> to generate the PS file, therefore
  it uses HTML as its intermediate product, though it uses different
  templates than C<DocSet::DocSet::HTML> since PS/PDF doesn't require
  the navigation widgets.
  
  =head2 METHODS
  
  Most of the methods are documented in C<DocSet::DocSet>.
  
  =over
  
  =item * trg_ext
  
    $self->trg_ext();
  
  returns the extension of the target files. I<html> in the case of this
  sub-class.
  
  =item * init
  
    $self->init(@_);
  
  calls C<DocSet::DocSet::init> and then initializes its own PS/PDF
  output specific settings.
  
  =item * complete
  
  see C<DocSet::DocSet>
  
  =item * write_index_file
  
    $self->write_index_file();
  
  creates the I<index.html> file linking all the items of the docset
  together.
  
  =item * create_ps_book
  
  Generates a PostScript book.
  
  =item * create_pdf_book
  
  Converts PS into PDF (if the I<generate_pdf> runtime option is set).
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Source/HTML.pm
  
  Index: HTML.pm
  ===================================================================
  package DocSet::Source::HTML;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  
  use vars qw(@ISA);
  require DocSet::Doc;
  @ISA = qw(DocSet::Doc);
  
  sub retrieve_meta_data {
      my($self) = @_;
  
      $self->parse;
  
      use Pod::POM::View::HTML;
      my $mode = 'Pod::POM::View::HTML';
      #print Pod::POM::View::HTML->print($pom);
  
      $self->{meta} = 
          {
           title => $self->{parsed_tree}->{title},
           link  => $self->{rel_dst_path},
          };
  
      # there is no autogenerated TOC for HTML files
  }
  
  sub parse {
      my($self) = @_;
      
      # already parsed
      return if exists $self->{parsed_tree} && $self->{parsed_tree};
  
      # print ${ $self->{content} };
      #my %segments = map {$_ => ''} qw(title body);
  
      # this one retrieves the body and the title of the given html
      require HTML::Parser;
      sub start_h {}
      sub end_h {
          my($self, $tagname, $skipped_text) = @_;
          # use $p itself as a tmp storage (ok according to the docs)
          $self->{parsed_tree}->{$tagname} = $skipped_text;
      }
      my $p = HTML::Parser->new(api_version => 3,
                                report_tags => [qw(title body)],
                                start_h => [\&start_h],
                                end_h   => [\&end_h, "self,tagname,skipped_text"],
                               );
      # Parse document text chunk by chunk
      $p->parse(${ $self->{content} });
      $p->eof;
  
      # store the tree away
      $self->{parsed_tree} = $p->{parsed_tree};
  }
  
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Source::HTML> - A class for parsing input document in the HTML format
  
  =head1 SYNOPSIS
  
  See C<DocSet::Source>
  
  =head1 DESCRIPTION
  
  =head1 METHODS
  
  =over
  
  =item * parse
  
  Converts the source HTML document into a parsed tree.
  
  =item * retrieve_meta_data
  
  Retrieves the meta data that describes the input document and stores
  it in the I<meta> object attribute. The I<title> and I<link> meta
  attributes are set.
  
  =back
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Source/POD.pm
  
  Index: POD.pm
  ===================================================================
  package DocSet::Source::POD;
  
  use strict;
  use warnings;
  
  use DocSet::Util;
  use DocSet::RunTime;
  
  use vars qw(@ISA);
  require DocSet::Doc;
  @ISA = qw(DocSet::Doc);
  
  sub retrieve_meta_data {
      my($self) = @_;
  
      $self->parse_pod;
  
      use Pod::POM::View::HTML;
      my $mode = 'Pod::POM::View::HTML';
      #print Pod::POM::View::HTML->print($pom);
  
      my $meta = {};
  
      my $pom = $self->{parsed_tree};
      my @sections = $pom->head1();
      # don't present on purpose ->present($mode); there should be no markup in NAME
      my $name_sec = shift @sections;
      if ($name_sec) {
          $meta->{title} = $name_sec->content();
      }
      else {
          $meta->{title} = 'No Title';
      }
      $meta->{title} =~ s/^\s*|\s*$//sg;
  
      $meta->{link} = $self->{rel_dst_path};
      # put all the meta data under the same attribute
      $self->{meta} = $meta;
  
      # build the toc datastructure
      my @toc = ();
      my $level = 1;
      for my $node (@sections) {
          push @toc, $self->render_toc_level($node, $level);
      }
      $self->{toc} = \@toc;
  
  }
  
  sub render_toc_level {
      my($self, $node, $level) = @_;
      my %toc_entry = ();
      my $title = $node->title;
      $toc_entry{link} = $toc_entry{title} = "$title"; # must stringify
      $toc_entry{link} =~ s/\W/_/g; # META: put into a sub?
      $toc_entry{link} = "#$toc_entry{link}"; # prepend '#' for internal links
      my @sub = ();
      $level++;
      if ($level < 5) {
          # if there are deeper than =head4 levels we don't go down (spec is 1-4)
          my $method = "head$level";
          for my $sub_node ($node->$method()) {
              push @sub, $self->render_toc_level($sub_node, $level);
          }
      }
      $toc_entry{subs} = \@sub if @sub;
      return \%toc_entry;
  }
  
  
  
  sub parse_pod {
      my($self) = @_;
      
      # already parsed
      return if exists $self->{parsed_tree} && $self->{parsed_tree};
  
      $self->podify_items() if get_opts('podify_items');
  
  #    print ${ $self->{content} };
  
      use Pod::POM;
      my %options;
      my $parser = Pod::POM->new(\%options);
      my $pom = $parser->parse_text(${ $self->{content} })
          or die $parser->error();
  
      $self->{parsed_tree} = $pom;
  
      # examine any warnings raised
      if (my @warnings = $parser->warnings()) {
          print "\n", '-' x 40, "\n";
          print "File: $self->{src_path}\n";
          warn "$_\n" for @warnings;
      }
  }
  
  sub src_filter {
      my ($self) = @_;
  
      $self->extract_pod;
  
      $self->podify_items if get_opts('podify_items');
  }
  
  sub extract_pod {
      my($self) = @_;
  
      my @pod = ();
      my $in_pod = 0;
      for (split /\n\n/, ${ $self->{content} }) {
          $in_pod ||= /^=/s;
          next unless $in_pod;
          $in_pod = 0 if /^=cut/;
          push @pod, $_;
      }
  
      # handle empty files
      unless (@pod) {
          push @pod, "=head1 NAME", "=head1 Not documented", "=cut";
      }
  
      my $content = join "\n\n", @pod;
      $self->{content} = \$content;
  }
  
  sub podify_items {
      my($self) = @_;
    
      # tmp storage
      my @paras = ();
      my $items = 0;
      my $second = 0;
  
      # we want the source in paragraphs
      my @content = split /\n\n/, ${ $self->{content} };
  
      foreach (@content) {
          # is it an item?
          if (/^(\*|\d+)\s+((\*|\d+)\s+)?/) {
              $items++;
              if ($2) {
                  $second++;
                  s/^(\*|\d+)\s+//; # strip the first level shortcut
                  s/^(\*|\d+)\s+/=item $1\n\n/; # do the second
                  s/^/=over 4\n\n/ if $second == 1; # start 2nd level
              } else {
                  # first time insert the =over pod tag
                  s/^(\*|\d+)\s+/=item $1\n\n/; # start 1st level
                  s/^/=over 4\n\n/ if $items == 1;
                  s/^/=back\n\n/   if $second; # complete 2nd level
                  $second = 0; # end 2nd level section
              }
              push @paras, split /\n\n/, $_;
          } else {
            # complete the =over =item =back tag
              $second=0, push @paras, "=back" if $second; # if 2nd level is not closed
              push @paras, "=back" if $items;
              push @paras, $_;
            # not a tag item
              $items = 0;
          }
      }
  
      my $content = join "\n\n", @paras;
      $self->{content} = \$content;
  
  }
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Source::POD> - A class for parsing input document in the POD format
  
  =head1 SYNOPSIS
  
  
  
  =head1 DESCRIPTION
  
  =head2 METHODS
  
  =over 
  
  =item retrieve_meta_data()
  
  =item parse_pod()
  
  =item podify_items()
  
    $self->podify_items();
  
  Podify text to represent items in pod, e.g:
  
    1 Some text from item Item1
    
    2 Some text from item Item2
  
  becomes:
  
    =over 4
   
    =item 1
   
    Some text from item Item1
  
    =item 2
   
    Some text from item Item2
  
    =back
  
  podify_items() accepts 'C<*>' and digits as bullets.

  podify_items() takes no parameters: it reads the source from the
  object's I<content> attribute (a reference to a scalar) and replaces
  it with the podified text. Nothing is returned.
  
  Moreover, you can use a second level of indentation. So you can have
  
    * title
  
    * * item
  
    * * item
  
  or 
  
    * title
  
    * 1 item
  
    * 2 item
  
  where the second mark tells whether to use a bullet or a numbered
  item.
  
  
  =back
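
The first-level transformation described for podify_items can be sketched as a plain function over a string. This is a simplified illustration under stated assumptions, not the module's code: the name podify_first_level is hypothetical, the second level of indentation is not handled, and unlike the method above the sketch also closes a list that runs to the end of the text.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-alone variant: first-level items only.
sub podify_first_level {
    my($text) = @_;
    my @paras   = ();
    my $in_list = 0;
    for my $para (split /\n\n/, $text) {
        if ($para =~ s/^(\*|\d+)\s+//) {    # '* text' or '1 text'
            my $mark = $1;
            push @paras, "=over 4" unless $in_list;   # open the list once
            push @paras, "=item $mark", $para;
            $in_list = 1;
        }
        else {
            push @paras, "=back" if $in_list;         # close before normal text
            push @paras, $para;
            $in_list = 0;
        }
    }
    push @paras, "=back" if $in_list;   # close a list that ends the text
    return join "\n\n", @paras;
}

print podify_first_level("Intro\n\n1 first\n\n2 second\n\nOutro"), "\n";
```

Working on a string and returning the result keeps the sketch easy to test; the method itself works through the object's content attribute instead.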
  
  =head1 AUTHORS
  
  Stas Bekman E<lt>stas (at) stason.orgE<gt>
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Source/Text.pm
  
  Index: Text.pm
  ===================================================================
  package DocSet::Source::Text;
  
  use strict;
  use warnings;
  
  use vars qw(@ISA);
  require DocSet::Doc;
  @ISA = qw(DocSet::Doc);
  
  ###########################
  ### not implemented yet ###
  ###########################
  
  1;
  __END__
  
  =head1 NAME
  
  C<DocSet::Source::Text> - A class for parsing input document in the text format
  
  
  =head1 SYNOPSIS
  
  
  
  =head1 DESCRIPTION
  
  
  
  =cut
  
  
  
  1.1                  modperl-docs/lib/DocSet/Template/Plugin/NavigateCache.pm
  
  Index: NavigateCache.pm
  ===================================================================
  package DocSet::Template::Plugin::NavigateCache;
  
  use strict;
  use warnings;
  
  use vars qw(@ISA);
  use Template::Plugin;
  @ISA = qw(Template::Plugin);
  
  use DocSet::NavigateCache ();
  
  sub new {
      my $class   = shift;
      my $context = shift;
      DocSet::NavigateCache->new(@_);
  }
  
  1;
  
  __END__
  
  
  
