Posted to dev@nutch.apache.org by eyal edri <ey...@gmail.com> on 2007/10/10 12:59:47 UTC

download code works in fetch class but not in plugins class

Hi,

I've written a piece of code for downloading files from URLs (shown below).
The code works well when I insert it into the Fetcher.java source class, where
it captures the desired content types.

I want to move the code into the plugin source classes so that each plugin
automatically downloads the files it handles (such as zip; for exe I will
need to write a plugin).
But when I put it in, for example, ZipParser.java (right at the beginning of
the "getParse(Content content)" function), it doesn't do anything. Any idea why?

the code:

 public ParseResult getParse(final Content content) {

    String resultText = null;
    String resultTitle = null;
    Outlink[] outlinks = null;
    List outLinksList = new ArrayList();
    Properties properties = null;

    try {

      // my code: save the fetched content under DOWNLOADS/<domain>/<file name>
      LOG.info("edri:: found file type: " + content.getContentType());
      Pattern regex = Pattern.compile("http://([^/]*).*/([^/]*)$");
      Matcher urlMatcher = regex.matcher(content.getUrl());

      String domain = null;
      String fileLast = null;
      // groups 1 and 2 are equivalent to $1 and $2 in a regex
      if (urlMatcher.find()) {
        domain = urlMatcher.group(1);
        fileLast = urlMatcher.group(2);
      }
      LOG.info("edri:: filename " + fileLast);
      LOG.info("edri:: domain " + domain);
      File downloadDir = new File("/home/eyale/nutch/DOWNLOADS/" + domain);
      if (!downloadDir.exists()) {
        // mkdirs() also creates missing parent directories
        downloadDir.mkdirs();
      }
      String filename = downloadDir + "/" + fileLast;
      LOG.info("edri:: saving filename: " + filename);
      byte[] contentBArray = content.getContent();
      FileOutputStream out = new FileOutputStream(new File(filename));
      try {
        // write the whole buffer at once instead of byte by byte
        out.write(contentBArray);
      } finally {
        out.close();
      }

... the rest of the function here....
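
To test the saving step outside Nutch, the logic above can be sketched as a
standalone class. This is only a rough sketch, not the plugin code: the
ContentSaver class and its method names are made up for illustration, it
uses java.net.URI instead of my regex to pull out the host and the last path
segment, and it assumes Java 7 or later for java.nio.file.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: saves bytes under baseDir/host/last-path-segment. */
final class ContentSaver {

    /** Returns the target path for a URL, or null if no usable host or file name is found. */
    static Path targetFor(Path baseDir, String url) {
        URI uri = URI.create(url); // throws IllegalArgumentException on a malformed URL
        String host = uri.getHost();
        String path = uri.getPath();
        if (host == null || path == null) {
            return null;
        }
        // Everything after the last '/' is treated as the file name.
        String name = path.substring(path.lastIndexOf('/') + 1);
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return null;
        }
        return baseDir.resolve(host).resolve(name);
    }

    /** Writes data to the derived target path, creating directories as needed. */
    static Path save(Path baseDir, String url, byte[] data) throws IOException {
        Path target = targetFor(baseDir, url);
        if (target == null) {
            throw new IOException("cannot derive a file name from " + url);
        }
        Files.createDirectories(target.getParent());
        return Files.write(target, data); // replaces an existing file
    }
}
```

Inside a plugin this would be called with something like
ContentSaver.save(downloadRoot, content.getUrl(), content.getContent()),
where downloadRoot is whatever base directory is chosen.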

thanks,

Eyal.

-- 
Eyal Edri