Posted to dev@any23.apache.org by Simone Tripodi <si...@apache.org> on 2012/01/10 18:08:27 UTC

Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./ any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/ any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/or

Hi Mic,
this is great, thanks for the hard work on the merge!
The next step is renaming the packages to org.apache.any23 :)
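
To make the rename concrete, here is a minimal sketch of the mapping from the current org.deri.any23 root to org.apache.any23. The class and method names (PackageRename, renameClass, renamePath) are hypothetical and exist only for illustration; the real rename would be done with IDE refactoring or svn move, not with this code.

```java
public class PackageRename {
    // Old and new root packages, as mentioned in this thread.
    static final String OLD_PKG = "org.deri.any23";
    static final String NEW_PKG = "org.apache.any23";

    /** Rewrites a fully qualified name that lives under the old root package. */
    static String renameClass(String fqcn) {
        // Match the root itself or a true sub-package, not a look-alike prefix.
        if (fqcn.equals(OLD_PKG) || fqcn.startsWith(OLD_PKG + ".")) {
            return NEW_PKG + fqcn.substring(OLD_PKG.length());
        }
        return fqcn;
    }

    /** Rewrites a source path by swapping the package directory segment. */
    static String renamePath(String path) {
        String oldDir = "/" + OLD_PKG.replace('.', '/') + "/";
        String newDir = "/" + NEW_PKG.replace('.', '/') + "/";
        return path.replace(oldDir, newDir);
    }

    public static void main(String[] args) {
        // prints org.apache.any23.util.LogUtils
        System.out.println(renameClass("org.deri.any23.util.LogUtils"));
        // prints any23-core/src/main/java/org/apache/any23/Any23.java
        System.out.println(renamePath("any23-core/src/main/java/org/deri/any23/Any23.java"));
    }
}
```

Names outside the old root (for example third-party imports) pass through unchanged, so the same mapping can be applied to every import line.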

All the best, have a nice day!
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Tue, Jan 10, 2012 at 5:32 PM,  <mo...@apache.org> wrote:
> Author: mostarda
> Date: Tue Jan 10 16:32:28 2012
> New Revision: 1229627
>
> URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
> Log:
> This commit synchronizes the discontinued Any23 Google Code SVN repo [1]
> with the current Apache Any23 SVN repo, including the issues
> resolved during the initial import transition phase.
> These issues were tracked on the original Any23 Google Code Issue Tracker [2].
> Below is an extract of the original repository commit log.
>
> This commit is related to issue ANY23-27.
>
> [1] http://any23.googlecode.com/svn/trunk/
> [2] http://code.google.com/p/any23/issues/list
>
> ==== BEGIN: Original Log ====
>
> hardest-mac:gcode-svn hardest$ svn log -r 1548:HEAD
> ------------------------------------------------------------------------
> r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>
> Improved numeric datatype assignment. This commit fixes issue #208.
> ------------------------------------------------------------------------
> r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line
>
> Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'. Fixed HTMLMetaExtractorTest.java to match this new
> namespace. Discovered and fixed an issue in the SINDICE.java vocabulary: NS was declared as a resource instead of as a URI. Fixed
> RDFSchemaUtilsTest.java, whose expected sizes were wrong due to the wrong NS declaration. This commit is related to issue #203.
> ------------------------------------------------------------------------
> r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line
>
> Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found wrong declaration of Class(Resource) in the WO.java
> vocabulary. Fixed and updated RDFSchemaUtils.java test. This commit is related to issue #198.
> ------------------------------------------------------------------------
> r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line
>
> Added utility method.
> ------------------------------------------------------------------------
> r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line
>
> Improved Vocabulary.java class: added support for comments to any resource. Improved RDFSchemaUtils.java serialization
> support, added separators to RDFXML serialization. This commit is related to issue #198.
> ------------------------------------------------------------------------
> r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) | 1 line
>
> Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved prefix declaration parsing in RDFa11Parser, this
> new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for prefix mapping resolution in
> RDFa11Parser, this allows the correct support for the structured properties introduced by the latest version of the Open
> Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest to the new output of vocabularies serialization.
> Updated Any23PluginManagerTest to include a new class. This commit is related to issue #206.
> ------------------------------------------------------------------------
> r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) | 1 line
>
> Restricted scope of testGetClassesFromClasspath to avoid updating it every time a new class is added.
> ------------------------------------------------------------------------
> r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) | 1 line
>
> Improved validation mode support. Improved descriptions of Validation and Report fields. This commit is related to issue
> #209.
> ------------------------------------------------------------------------
> r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) | 1 line
>
> Improved Any23 Service XML Report format documentation.
> ------------------------------------------------------------------------
> r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) | 1 line
>
> Added URL encoding to the source location path. This commit fixes issue #205. Chose not to write a formal test, since it
> would require the creation of folders with spaces.
> ------------------------------------------------------------------------
> r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) | 1 line
>
> Removed obsolete section.
> ------------------------------------------------------------------------
> r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) | 1 line
>
> Improved Any23 facade, added method createDocumentSource() to simplify the extraction setup.
> ------------------------------------------------------------------------
> r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) | 1 line
>
> Refactored Rover CLI class to make it extensible by other CLI implementations.
> ------------------------------------------------------------------------
> r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) | 3 lines
>
> Removed wrong artifact name.
>
>
> ------------------------------------------------------------------------
> r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) | 1 line
>
> Removed no longer used jspf lib. Added crawler4j dependencies. Added README. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) | 1 line
>
> Changed attribute visibility to facilitate class extensibility.
> ------------------------------------------------------------------------
> r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) | 1 line
>
> Added helper methods to extract file lines as list of strings. Improved javadoc.
> ------------------------------------------------------------------------
> r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) | 1 line
>
> Added first version of basic-crawler plugin. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) | 1 line
>
> Added plugins README.
> ------------------------------------------------------------------------
> r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) | 1 line
>
> Updated main README, added references to plugin and lib.
> ------------------------------------------------------------------------
> r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) | 1 line
>
> Fixed assembly name.
> ------------------------------------------------------------------------
> r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) | 1 line
>
> Fixed Tool signature. This commit is related to #211.
> ------------------------------------------------------------------------
> r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) | 1 line
>
> Improved logging.
> ------------------------------------------------------------------------
> r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) | 1 line
>
> Included plugin basic-crawler in reactor. Improved ToolRunner and Any23PluginManager tests to be compliant to the new
> plugin classes. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) | 1 line
>
> Fixed Crawler4j group id. Related to issue #211.
> ------------------------------------------------------------------------
> r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) | 1 line
>
> Improved plugin documentation. Introduced Office Scraper specific page. This commit is related to issue #213.
> ------------------------------------------------------------------------
> r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed POST method documentation. Related to issue #213.
> ------------------------------------------------------------------------
> r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed code snippets, prettified, added missing finalization logic. See issue #187.
> ------------------------------------------------------------------------
> r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed var name. See #187.
> ------------------------------------------------------------------------
> r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) | 1 line
>
> Updated code snippets and tutorial, added explicit TripleHandler closure. This commit is related to issue #187.
> ------------------------------------------------------------------------
> r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed data type handling management in NQuadsParser. This commit is related to issue #210.
> ------------------------------------------------------------------------
> r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) | 1 line
>
> Added missing JSON output format. See #214.
> ------------------------------------------------------------------------
> r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) | 1 line
>
> Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output format support to Rover. This commit is related to
> issue #215.
> ------------------------------------------------------------------------
> r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) | 1 line
>
> Added Sesame TriX IO dependency. This commit is related to #215.
> ------------------------------------------------------------------------
> r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) | 1 line
>
> Some suppressed tests have been reactivated as Ignored.
> ------------------------------------------------------------------------
> r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) | 1 line
>
> Added TriX output format to the Any23 Service. Commit related to issue #215.
> ------------------------------------------------------------------------
> r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) | 1 line
>
> Improved FormatWriter management, added WriterRegistry. Improved Writer format management in Rover and WebResponder.
> This commit is related to issues #215 and #216.
> ------------------------------------------------------------------------
> r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) | 6 lines
>
> Added TriXExtractor and textual example (example-trix.trx), added trix support in RDFParserFactory.
> Registered TriXExtractor to the ExtractorRegistry.
> Added TriX mimetype support in TikaMIMETypeDetector (through mimetypes.xml) and added specific test.
> Added support and doc to TriX format in Any23 Service web page (form.html).
> This commit is related to issue #215.
>
> ------------------------------------------------------------------------
> r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) | 1 line
>
> Fixed number of extractors (+1 after adding TriXExtractor). Commit related to issue #215.
> ------------------------------------------------------------------------
> r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) | 1 line
>
> Added method getExtractorType().
> ------------------------------------------------------------------------
> r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) | 4 lines
>
> Improved ExtractorDocumentation support, added missing format examples.
> Improved output layout. This commit is related to issue #194.
>
>
> ------------------------------------------------------------------------
> r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) | 1 line
>
> Improved classpath management in Any23PluginManager. Renamed getClasses* to loadClasses*. This commit is related to
> issue #212.
> ------------------------------------------------------------------------
> r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) | 1 line
>
> Separated log messages from specific output data.
> ------------------------------------------------------------------------
> r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) | 1 line
>
> Added human readable report printing support in ReportingTripleHandler and Rover.
> ------------------------------------------------------------------------
> r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) | 1 line
>
> Fixed major issue in output generation, added final activity report, help prettification. This commit is related to
> issue #211.
> ------------------------------------------------------------------------
> r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) | 1 line
>
> Upgraded to Sesame 2.6.1. See issue #217.
> ------------------------------------------------------------------------
> r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) | 1 line
>
> Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils. See issue #216.
> ------------------------------------------------------------------------
> r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) | 1 line
>
> Moved org.deri.any23.parser to org.deri.any23.io.nquads. See issue #216.
> ------------------------------------------------------------------------
> r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) | 1 line
>
> Added specific Crawler CLI documentation. Updated general CLI documentation. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) | 4 lines
>
> The Eval CLI Tool has been removed as well as the org.deri.any23.eval package classes related to it.
> Updated tests verifying CLI tool detection.
> This commit is related to issue #218.
>
> ------------------------------------------------------------------------
> r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) | 5 lines
>
> Added MimeDetector CLI Tool and test case, removed main() from
> TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
> Updated CLI doc.
> This commit is related to issue #219.
>
> ------------------------------------------------------------------------
> r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) | 1 line
>
> Added support for comment serialization. Related to issue #158.
> ------------------------------------------------------------------------
> r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) | 1 line
>
> Add support for annotation writing in FormatWriter implementations. This commit is related to issue #158.
> ------------------------------------------------------------------------
> r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) | 1 line
>
> Added support for 'annotate' flag in Any23 Service.
> ------------------------------------------------------------------------
>
> ==== END  : Original Log ====
>
>
> Added:
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
>    incubator/any23/trunk/any23-core/src/test/resources/application/trix/
>    incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
>    incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
>    incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
>    incubator/any23/trunk/lib/README.txt
>    incubator/any23/trunk/plugins/README.txt
>    incubator/any23/trunk/plugins/basic-crawler/
>    incubator/any23/trunk/plugins/basic-crawler/pom.xml
>    incubator/any23/trunk/plugins/basic-crawler/src/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
>    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
>    incubator/any23/trunk/plugins/basic-crawler/src/test/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
>    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
>    incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
> Removed:
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
> Modified:
>    incubator/any23/trunk/README.txt
>    incubator/any23/trunk/any23-core/bin/any23
>    incubator/any23/trunk/any23-core/bin/any23tools
>    incubator/any23/trunk/any23-core/pom.xml
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
>    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
>    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
>    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
>    incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
>    incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
>    incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
>    incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
>    incubator/any23/trunk/lib/install-deps.sh
>    incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
>    incubator/any23/trunk/pom.xml
>    incubator/any23/trunk/src/site/apt/any23-plugins.apt
>    incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
>    incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
>    incubator/any23/trunk/src/site/apt/getting-started.apt
>    incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
>    incubator/any23/trunk/src/site/apt/service.apt
>    incubator/any23/trunk/src/site/apt/supported-formats.apt
>
> Modified: incubator/any23/trunk/README.txt
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/README.txt (original)
> +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
> @@ -20,7 +20,8 @@ Distribution Content
>
>  any23-core           The library core codebase.
>  any23-service        The library HTTP service codebase.
> -plugins              Library plugins codebase.
> +lib                  Contains the Any23 external deps (read lib/README.txt for further details).
> +plugins              Library plugins codebase (read plugins/README.txt for further details).
>  RELEASE-NOTES.txt    File reporting main release notes for every version.
>  LICENSE.txt          Applicable project license.
>  README.txt           This file.
> @@ -240,15 +241,14 @@ Upload the produced packages in download
>
>    http://code.google.com/p/any23/downloads/list
>
> +--------------------
> +Manage External Deps
> +--------------------
>
> -Fix Release Procedure
> ----------------------
> -
> -   Currently the *plugins/integration-test* module is excluded from the parent
> -   reactor.
> -   To fix it in tag follow procedure as described at issue #171:
> -
> -        http://code.google.com/p/any23/issues/detail?id=171
> +::Developers interest only.::
>
> +External Deps are libraries used by some Any23 modules which are
> +not available in public Maven repositories. Such libraries are
> +managed within the 'lib' dir.
>
>  EOF
>
> Modified: incubator/any23/trunk/any23-core/bin/any23
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/bin/any23 (original)
> +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
> @@ -9,12 +9,12 @@
>  ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>
>  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> -    echo "Generating executable JAR..."
> -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> +    echo "Generating executable JAR..." >&2
> +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>         ||\
> -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>        ||\
> -    { echo "Error while generating commandline assembly."; exit 1; }
> +    { echo "Error while generating commandline assembly."  >&2; exit 1; }
>  fi
>
>  SEP=':'
>
> Modified: incubator/any23/trunk/any23-core/bin/any23tools
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/bin/any23tools (original)
> +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
> @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
>  PLUGINS_DIR=plugins
>
>  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> -    echo "Generating executable JAR..."
> -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> +    echo "Generating executable JAR..." >&2
> +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>         ||\
> -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>        ||\
> -    { echo "Error while generating commandline assembly."; exit 1; }
> +    { echo "Error while generating commandline assembly." >&2; exit 1; }
>  fi
>
>  SEP=':'
> @@ -30,6 +30,7 @@ done
>  # Plugins classpath.
>  for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
>  do
> +  echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
>   if [ ! -e "$jar" ]; then continue; fi
>   CP="$CP$SEP$jar"
>  done
>
> Modified: incubator/any23/trunk/any23-core/pom.xml
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/pom.xml (original)
> +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
> @@ -92,6 +92,10 @@
>         </dependency>
>         <dependency>
>             <groupId>org.openrdf.sesame</groupId>
> +            <artifactId>sesame-rio-trix</artifactId>
> +        </dependency>
> +        <dependency>
> +            <groupId>org.openrdf.sesame</groupId>
>             <artifactId>sesame-repository-sail</artifactId>
>         </dependency>
>         <dependency>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
> @@ -258,6 +258,28 @@ public class Any23 {
>     }
>
>     /**
> +     * Returns the most appropriate {@link DocumentSource} for the given <code>documentURI</code>.
> +     *
> +     * @param documentURI the document <i>URI</i>.
> +     * @return a new instance of DocumentSource.
> +     * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
> +     * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
> +     */
> +    public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
> +        if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
> +        if (documentURI.toLowerCase().startsWith("file:")) {
> +            return new FileDocumentSource( new File(new URI(documentURI)) );
> +        }
> +        if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> +            return new HTTPDocumentSource(getHTTPClient(), documentURI);
> +        }
> +        throw new IllegalArgumentException(
> +                String.format("Unsupported protocol for document URI: '%s' .", documentURI)
> +        );
> +    }
> +
> +
> +    /**
>      * Performs metadata extraction from the content of the given
>      * <code>in</code> document source, sending the generated events
>      * to the specified <code>outputHandler</code>.
> @@ -363,13 +385,7 @@ public class Any23 {
>     public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
>     throws IOException, ExtractionException {
>         try {
> -            if (documentURI.toLowerCase().startsWith("file:")) {
> -                return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
> -            }
> -            if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> -                return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
> -            }
> -            throw new ExtractionException("Not a valid absolute URI: " + documentURI);
> +            return extract(eps, createDocumentSource(documentURI), outputHandler);
>         } catch (URISyntaxException ex) {
>             throw new ExtractionException("Error while extracting data from document URI.", ex);
>         }
>
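A note on the Any23.java hunk above: the new createDocumentSource() picks the source type with a case-insensitive prefix check on the URI scheme. Below is a minimal, self-contained sketch of only that check. The names SchemeDispatchSketch, SourceKind and classify are hypothetical and not part of Any23, and the sketch returns an enum instead of building a real DocumentSource. It also lower-cases with Locale.ROOT, which the committed code does not do.

```java
import java.util.Locale;

// Hypothetical illustration of the scheme dispatch; not Any23 code.
final class SchemeDispatchSketch {

    /** Stand-in for the concrete DocumentSource implementations. */
    enum SourceKind { FILE, HTTP }

    static SourceKind classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // Locale.ROOT keeps the comparison stable on locales where
        // lower-casing "FILE" would not yield "file" (assumption: the
        // committed code relies on the default locale instead).
        final String lower = documentURI.toLowerCase(Locale.ROOT);
        if (lower.startsWith("file:")) {
            return SourceKind.FILE;
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return SourceKind.HTTP;
        }
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s'.", documentURI));
    }

    public static void main(String[] args) {
        System.out.println(classify("FILE:///tmp/page.html"));
        System.out.println(classify("https://example.org/"));
    }
}
```

One thing that seems worth checking in the real code: as far as this hunk shows, extract(eps, String, handler) used to throw ExtractionException for an unsupported scheme, while it now lets the IllegalArgumentException from createDocumentSource() propagate, since only URISyntaxException is caught there.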
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
> @@ -16,7 +16,7 @@
>
>  package org.deri.any23.cli;
>
> -import org.deri.any23.LogUtil;
> +import org.deri.any23.util.LogUtils;
>  import org.deri.any23.extractor.ExampleInputOutput;
>  import org.deri.any23.extractor.ExtractionException;
>  import org.deri.any23.extractor.Extractor;
> @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
>     }
>
>     public int run(String[] args) {
> -        LogUtil.setDefaultLogging();
> +        LogUtils.setDefaultLogging();
>         try {
>             if (args.length == 0) {
>                 printUsage();
> @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
>      * Prints the list of all the available extractors.
>      */
>     public void printExtractorList() {
> -        for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
> -            System.out.println(extractorName);
> +        for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
> +            System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
>         }
>     }
>
> @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
>             ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
>             ExampleInputOutput example = new ExampleInputOutput(factory);
>             System.out.println("Extractor: " + extractorName);
> -            System.out.println("  type: " + getType(factory));
> -            String output = example.getExampleOutput();
> -            if (output == null) {
> -                System.out.println("(no example output)");
> +            System.out.println("\ttype: " + getType(factory));
> +            System.out.println();
> +            final String exampleInput = example.getExampleInput();
> +            if(exampleInput == null) {
> +                System.out.println("(No Example Available)");
>             } else {
> -                System.out.println("-------- example output --------");
> -                System.out.println(output);
> +                System.out.println("-------- Example Input  --------");
> +                System.out.println(exampleInput);
> +                System.out.println("-------- Example Output --------");
> +                String output = example.getExampleOutput();
> +                System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
>             }
> -            System.out.println();
>             System.out.println("================================");
> +            System.out.println();
>         }
>     }
>
>
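On the new printExtractorList() output above: "%25s [%15s]" right-aligns each value by padding it on the left to the given width, and values longer than the width are printed in full rather than truncated. A small sketch of that behaviour follows. ExtractorListFormatSketch and formatRow are hypothetical names, and the sample name and type strings are made up for illustration, not taken from the extractor registry.

```java
// Hypothetical illustration of the row format; not Any23 code.
final class ExtractorListFormatSketch {

    /** Formats one row the way the new printExtractorList() does. */
    static String formatRow(String name, String type) {
        // %25s and %15s pad on the left with spaces (right alignment);
        // longer values are kept whole, so the columns may shift.
        return String.format("%25s [%15s]", name, type);
    }

    public static void main(String[] args) {
        // Made-up sample values.
        System.out.println(formatRow("example-extractor", "EXAMPLE_TYPE"));
        System.out.println(formatRow("csv", "X"));
    }
}
```

If left-aligned names turn out to be easier to scan, "%-25s" would do that, though this is only a suggestion and not what was committed.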
> Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
> @@ -0,0 +1,113 @@
> +/*
> + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + *          http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +package org.deri.any23.cli;
> +
> +import org.deri.any23.configuration.DefaultConfiguration;
> +import org.deri.any23.http.DefaultHTTPClient;
> +import org.deri.any23.http.HTTPClient;
> +import org.deri.any23.http.HTTPClientConfiguration;
> +import org.deri.any23.mime.MIMEType;
> +import org.deri.any23.mime.MIMETypeDetector;
> +import org.deri.any23.mime.TikaMIMETypeDetector;
> +import org.deri.any23.source.DocumentSource;
> +import org.deri.any23.source.FileDocumentSource;
> +import org.deri.any23.source.HTTPDocumentSource;
> +import org.deri.any23.source.StringDocumentSource;
> +
> +import java.io.File;
> +import java.net.URISyntaxException;
> +
> +/**
> + * Commandline tool to detect <b>MIME Type</b>s from
> + * file, HTTP and direct input sources.
> + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
> + *
> + * @author Michele Mostarda (mostarda@fbk.eu)
> + */
> +@ToolRunner.Description("MIME Type Detector Tool.")
> +public class MimeDetector implements Tool{
> +
> +    public static final String FILE_DOCUMENT_PREFIX   = "file://";
> +    public static final String INLINE_DOCUMENT_PREFIX = "inline://";
> +    public static final String URL_DOCUMENT_RE        = "^https?://.*";
> +
> +    public static void main(String[] args) {
> +        System.exit( new MimeDetector().run(args) );
> +    }
> +
> +    @Override
> +    public int run(String[] args) {
> +          if(args.length != 1) {
> +            System.err.println("USAGE: {http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
> +            return 1;
> +        }
> +
> +        final String document = args[0];
> +        try {
> +            final DocumentSource documentSource = createDocumentSource(document);
> +            final MIMETypeDetector detector = new TikaMIMETypeDetector();
> +            final MIMEType mimeType = detector.guessMIMEType(
> +                    documentSource.getDocumentURI(),
> +                    documentSource.openInputStream(),
> +                    MIMEType.parse(documentSource.getContentType())
> +            );
> +            System.out.println(mimeType);
> +            return 0;
> +        } catch (Exception e) {
> +            System.err.print("Error while detecting MIME Type.");
> +            e.printStackTrace(System.err);
> +            return 1;
> +        }
> +    }
> +
> +    private DocumentSource createDocumentSource(String document) throws URISyntaxException {
> +        if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
> +            return new FileDocumentSource(
> +                    new File(
> +                            document.substring(FILE_DOCUMENT_PREFIX.length())
> +                    )
> +            );
> +        }
> +        if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
> +            return new StringDocumentSource(
> +                    document.substring(INLINE_DOCUMENT_PREFIX.length()),
> +                    ""
> +            );
> +        }
> +        if(document.matches(URL_DOCUMENT_RE)) {
> +            final HTTPClient client = new DefaultHTTPClient();
> +            // TODO: anonymous config class also used in Any23. centralize.
> +            client.init(new HTTPClientConfiguration() {
> +                public String getUserAgent() {
> +                    return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
> +                }
> +                public String getAcceptHeader() {
> +                    return "";
> +                }
> +                public int getDefaultTimeout() {
> +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
> +                }
> +                public int getMaxConnections() {
> +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
> +                }
> +            });
> +            return new HTTPDocumentSource(client, document);
> +        }
> +        throw new IllegalArgumentException("Unsupported protocol for document " + document);
> +    }
> +
> +}
>
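On the new MimeDetector above: its private createDocumentSource() strips a literal prefix to get the payload, so "file:///tmp/a.txt" becomes the path "/tmp/a.txt" and "inline://..." becomes the raw content. The sketch below reproduces only that string handling with the same three constants. MimeDetectorInputSketch and describe are hypothetical names, and the "kind:payload" return value is invented for the example in place of real DocumentSource objects.

```java
// Hypothetical illustration of the input handling; not Any23 code.
final class MimeDetectorInputSketch {

    static final String FILE_DOCUMENT_PREFIX   = "file://";
    static final String INLINE_DOCUMENT_PREFIX = "inline://";
    static final String URL_DOCUMENT_RE        = "^https?://.*";

    /** Returns "kind:payload" for a supported input string. */
    static String describe(String document) {
        if (document.startsWith(FILE_DOCUMENT_PREFIX)) {
            // "file:///tmp/a.txt" -> "/tmp/a.txt"
            return "file:" + document.substring(FILE_DOCUMENT_PREFIX.length());
        }
        if (document.startsWith(INLINE_DOCUMENT_PREFIX)) {
            return "inline:" + document.substring(INLINE_DOCUMENT_PREFIX.length());
        }
        // String.matches() must match the whole input, so the leading '^'
        // is redundant but harmless.
        if (document.matches(URL_DOCUMENT_RE)) {
            return "url:" + document;
        }
        throw new IllegalArgumentException("Unsupported protocol for document " + document);
    }

    public static void main(String[] args) {
        System.out.println(describe("file:///tmp/a.txt"));
        System.out.println(describe("inline://<html/>"));
        System.out.println(describe("https://example.org/x"));
    }
}
```

Unlike Any23.createDocumentSource(), these checks are case-sensitive, so an input such as "HTTP://example.org/" would be rejected here. That may be intended, but the two entry points could probably share one code path once the TODO about centralizing the HTTP client configuration is addressed.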
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
> @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
>  import org.apache.commons.cli.Options;
>  import org.apache.commons.cli.PosixParser;
>  import org.deri.any23.Any23;
> -import org.deri.any23.LogUtil;
> +import org.deri.any23.util.LogUtils;
>  import org.deri.any23.configuration.Configuration;
>  import org.deri.any23.configuration.DefaultConfiguration;
>  import org.deri.any23.extractor.ExtractionException;
> @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
>  import org.deri.any23.extractor.SingleDocumentExtraction;
>  import org.deri.any23.filter.IgnoreAccidentalRDFa;
>  import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
> +import org.deri.any23.source.DocumentSource;
>  import org.deri.any23.writer.BenchmarkTripleHandler;
>  import org.deri.any23.writer.LoggingTripleHandler;
> -import org.deri.any23.writer.NQuadsWriter;
> -import org.deri.any23.writer.NTriplesWriter;
> -import org.deri.any23.writer.RDFXMLWriter;
>  import org.deri.any23.writer.ReportingTripleHandler;
>  import org.deri.any23.writer.TripleHandler;
>  import org.deri.any23.writer.TripleHandlerException;
> -import org.deri.any23.writer.TurtleWriter;
> -import org.deri.any23.writer.URIListWriter;
> +import org.deri.any23.writer.WriterRegistry;
>  import org.slf4j.Logger;
>  import org.slf4j.LoggerFactory;
>
> @@ -51,6 +48,7 @@ import java.io.OutputStream;
>  import java.io.PrintStream;
>  import java.io.PrintWriter;
>  import java.net.MalformedURLException;
> +import java.net.URISyntaxException;
>  import java.net.URL;
>
>  import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
> @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
>  * A default rover implementation. Goes and fetches a URL using an hint
>  * as to what format should require, then tries to convert it to RDF.
>  *
> - * @author Gabriele Renzi
> - * @author Richard Cyganiak (richard@cyganiak.de)
>  * @author Michele Mostarda (mostarda@fbk.eu)
> + * @author Richard Cyganiak (richard@cyganiak.de)
> + * @author Gabriele Renzi
>  */
>  @ToolRunner.Description("Any23 Command Line Tool.")
>  public class Rover implements Tool {
>
> -    // Supported formats.
> -    private static final String TURTLE_FORMAT  = "turtle";
> -    private static final String NTRIPLE_FORMAT = "ntriples";
> -    private static final String RDFXML_FORMAT  = "rdfxml";
> -    private static final String NQUADS_FORMAT  = "nquads";
> -    private static final String URIS_FORMAT    = "uris";
> -
> -    private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
> +    private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
> +    private static final int DEFAULT_FORMAT_INDEX = 0;
>
>     private static final Logger logger = LoggerFactory.getLogger(Rover.class);
>
> -    private static Options options;
> +    private Options options;
>
> -    public static void main(String[] args) {
> -        System.exit( new Rover().run(args) );
> -    }
> +    private CommandLine commandLine;
>
> -    public int run(String[] args) {
> -        final CommandLineParser parser = new PosixParser();
> -        final CommandLine commandLine;
> +    private boolean verbose = false;
>
> -        boolean verbose = false;
> -        try {
> -            options = createOptions();
> -            commandLine = parser.parse(options, args);
> +    private PrintStream outputStream;
> +    private TripleHandler tripleHandler;
> +    private ReportingTripleHandler reportingTripleHandler;
> +    private BenchmarkTripleHandler benchmarkTripleHandler;
>
> -            if (commandLine.hasOption("h")) {
> -                printHelp();
> -                return 0;
> -            }
> +    private ExtractionParameters eps;
> +    private Any23 any23;
>
> -            if (commandLine.hasOption('v')) {
> -                verbose = true;
> -                LogUtil.setVerboseLogging();
> -            } else {
> -                LogUtil.setDefaultLogging();
> -            }
> -
> -            if (commandLine.getArgs().length < 1) {
> -                printHelp();
> -                throw new IllegalArgumentException("Expected at least 1 argument.");
> -            }
> +    protected boolean isVerbose() {
> +        return verbose;
> +    }
>
> -            final String[] inputURIs      = argumentsToURIs(commandLine.getArgs());
> -            final String[] extractorNames = getExtractors(commandLine);
> +    public static void main(String[] args) {
> +        System.exit( new Rover().run(args) );
> +    }
>
> -            PrintStream outputStream    = null;
> -            TripleHandler tripleHandler = null;
> -            try {
> -                outputStream  = getOutputStream(commandLine);
> +    public int run(String[] args) {
> +        try {
> +            final String[] uris = configure(args);
> +            performExtraction(uris);
> +            return 0;
> +        } catch (Exception e) {
> +            System.err.println( e.getMessage() );
> +            final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
> +            if(verbose) e.printStackTrace(System.err);
> +            return exitCode;
> +        }
> +    }
>
> -                tripleHandler = getTripleHandler(commandLine, outputStream);
> +    protected CommandLine getCommandLine() {
> +        if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
> +        return commandLine;
> +    }
>
> -                tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> +    protected String[] configure(String[] args) throws Exception {
> +        final CommandLineParser parser = new PosixParser();
> +        options = createOptions();
> +        commandLine = parser.parse(options, args);
>
> -                tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
> -                final BenchmarkTripleHandler benchmarkTripleHandler =
> -                        tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
> +        if (commandLine.hasOption("h")) {
> +            printHelp();
> +            throw new ExitCodeException(0);
> +        }
>
> -                tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
> +        if (commandLine.hasOption('v')) {
> +            verbose = true;
> +            LogUtils.setVerboseLogging();
> +        } else {
> +            LogUtils.setDefaultLogging();
> +        }
>
> -                final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
> +        if (commandLine.getArgs().length < 1) {
> +            printHelp();
> +            throw new IllegalArgumentException("Expected at least 1 argument.");
> +        }
>
> -                final ExtractionParameters eps = getExtractionParameters(commandLine);
> +        final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
> +        final String[] extractorNames = getExtractors(commandLine);
>
> -                final Any23 any23 = createAny23(extractorNames);
> +        try {
> +            outputStream  = getOutputStream(commandLine);
> +            tripleHandler = getTripleHandler(commandLine, outputStream);
> +            tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> +            tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>
> -                final long start = System.currentTimeMillis();
> -                for(String inputURI : inputURIs) {
> -                    performExtraction(any23, eps, inputURI, reportingTripleHandler);
> -                }
> -                final long elapsed = System.currentTimeMillis() - start;
> +            benchmarkTripleHandler =
> +                    tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>
> -                closeAll(tripleHandler, outputStream);
> +            tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>
> -                if (benchmarkTripleHandler != null) {
> -                    System.err.println( benchmarkTripleHandler.report() );
> -                }
> +            reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
> +            eps = getExtractionParameters(commandLine);
> +            any23 = createAny23(extractorNames);
>
> -                logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
> -                logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> -            } finally {
> -                closeAll(tripleHandler, outputStream);
> -            }
> +            return inputURIs;
>         } catch (Exception e) {
> -            System.err.println(e.getMessage());
> -            final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
> -            if(verbose) e.printStackTrace(System.err);
> -            return exitCode;
> +            closeStreams();
> +            throw e;
>         }
> -        return 0;
>     }
>
> -    private Options createOptions() {
> +    protected Options createOptions() {
>         final Options options = new Options();
>         options.addOption(
>                 new Option("v", "verbose", false, "Show debug and progress information.")
> @@ -178,13 +175,7 @@ public class Rover implements Tool {
>                         "f",
>                         "Output format",
>                         true,
> -                        "[" +
> -                                TURTLE_FORMAT  + " (default), " +
> -                                NTRIPLE_FORMAT + ", " +
> -                                RDFXML_FORMAT  + ", " +
> -                                NQUADS_FORMAT  + ", " +
> -                                URIS_FORMAT    +
> -                        "]"
> +                        "[" +  printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
>                 )
>         );
>         options.addOption(
> @@ -208,11 +199,51 @@ public class Rover implements Tool {
>         return options;
>     }
>
> +    protected void performExtraction(DocumentSource documentSource) {
> +        performExtraction(any23, eps, documentSource, reportingTripleHandler);
> +    }
> +
> +    protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
> +        try {
> +            final long start = System.currentTimeMillis();
> +            for (String inputURI : inputURIs) {
> +                performExtraction( any23.createDocumentSource(inputURI) );
> +            }
> +            final long elapsed = System.currentTimeMillis() - start;
> +
> +            if (benchmarkTripleHandler != null) {
> +                System.err.println(benchmarkTripleHandler.report());
> +            }
> +
> +            logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
> +            logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> +        } finally {
> +            closeStreams();
> +        }
> +    }
> +
> +    protected String printReports() {
> +        final StringBuilder sb = new StringBuilder();
> +        if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
> +        if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
> +        return sb.toString();
> +    }
> +
>     private void printHelp() {
>         HelpFormatter formatter = new HelpFormatter();
>         formatter.printHelp("[{<url>|<file>}]+", options, true);
>     }
>
> +    private String printFormats(String[] formats, int defaultIndex) {
> +        final StringBuilder sb = new StringBuilder();
> +        for (int i = 0; i < formats.length; i++) {
> +            sb.append(formats[i]);
> +            if(i == defaultIndex) sb.append(" (default)");
> +            if(i < formats.length - 1) sb.append(", ");
> +        }
> +        return sb.toString();
> +    }
> +
>     private String argumentToURI(String uri) {
>         uri = uri.trim();
>         if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
> @@ -268,27 +299,17 @@ public class Rover implements Tool {
>
>     private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
>         final String FORMAT_OPTION = "f";
> -        String format = DEFAULT_FORMAT;
> +        String format = FORMATS[DEFAULT_FORMAT_INDEX];
>         if (cl.hasOption(FORMAT_OPTION)) {
> -            format = cl.getOptionValue(FORMAT_OPTION);
> +            format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
>         }
> -        final TripleHandler outputHandler;
> -        if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
> -            outputHandler = new TurtleWriter(os);
> -        } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
> -            outputHandler = new NTriplesWriter(os);
> -        } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
> -            outputHandler = new RDFXMLWriter(os);
> -        } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
> -            outputHandler = new NQuadsWriter(os);
> -        } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
> -            outputHandler = new URIListWriter(os);
> -        } else {
> +        try {
> +            return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
> +        } catch (Exception e) {
>             throw new IllegalArgumentException(
>                     String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
>             );
>         }
> -        return outputHandler;
>     }
>
>     private TripleHandler decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
> @@ -346,44 +367,54 @@ public class Rover implements Tool {
>         return any23;
>     }
>
> -    private void performExtraction(Any23 any23, ExtractionParameters eps, String documentURI, TripleHandler th) {
> +    private void performExtraction(
> +            Any23 any23, ExtractionParameters eps, DocumentSource documentSource, TripleHandler th
> +    ) {
>         try {
> -            if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
> -                throw new SpecificExitException("No suitable extractors found.", 2);
> +            if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
> +                throw new ExitCodeException("No suitable extractors found.", 2);
>             }
>         } catch (ExtractionException ex) {
> -            throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
> +            throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
>         } catch (IOException ex) {
> -            throw new SpecificExitException("Exception while producing output.", ex, 4);
> +            throw new ExitCodeException("Exception while producing output.", ex, 4);
>         }
>     }
>
> -    private void closeHandler(TripleHandler th) {
> -        if(th == null) return;
> +    private void closeHandler() {
> +        if(tripleHandler == null) return;
>         try {
> -            th.close();
> +            tripleHandler.close();
>         } catch (TripleHandlerException the) {
> -            throw new SpecificExitException("Error while closing TripleHandler", the, 5);
> +            throw new ExitCodeException("Error while closing TripleHandler", the, 5);
>         }
>     }
>
> -    private void closeAll(TripleHandler th, PrintStream os) {
> -             closeHandler(th);
> -            if(os != null) os.close();
> +    private void closeStreams() {
> +             closeHandler();
> +            if(outputStream != null) outputStream.close();
>     }
>
> -    private class SpecificExitException extends RuntimeException {
> +    protected class ExitCodeException extends RuntimeException {
>
>         private final int exitCode;
>
> -        public SpecificExitException(String message, Throwable cause, int exitCode) {
> +        public ExitCodeException(String message, Throwable cause, int exitCode) {
>             super(message, cause);
>             this.exitCode = exitCode;
>         }
> -        public SpecificExitException(String message, int exitCode) {
> +        public ExitCodeException(String message, int exitCode) {
>             super(message);
>             this.exitCode = exitCode;
>         }
> +        public ExitCodeException(int exitCode) {
> +            super();
> +            this.exitCode = exitCode;
> +        }
> +
> +        protected int getExitCode() {
> +            return exitCode;
> +        }
>     }
>
>  }
>
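The diff shows ExitCodeException carrying an exit code and exposing it through getExitCode(), but the place where it is caught is not part of this excerpt. One way a CLI entry point might consume such an exception is sketched below. The run helper and the ExitCodeSketch class are hypothetical, and the nested exception is a cut-down copy for illustration only.

```java
// Hypothetical sketch; the real catch site in Rover is not shown in this diff.
final class ExitCodeSketch {

    // Cut-down copy of the exception from the diff, kept static so it can be
    // created without an enclosing instance.
    static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        int getExitCode() {
            return exitCode;
        }
    }

    // Runs the tool body and translates a failure into a process exit code:
    // 0 on success, otherwise the code carried by the exception.
    static int run(Runnable tool) {
        try {
            tool.run();
            return 0;
        } catch (ExitCodeException e) {
            System.err.println(e.getMessage());
            return e.getExitCode();
        }
    }

    public static void main(String[] args) {
        int code = run(() -> {
            throw new ExitCodeException("No suitable extractors found.", 2);
        });
        System.exit(code);
    }
}
```

Keeping the exit code on the exception lets the extraction code stay free of System.exit calls, which also makes it easier to exercise from tests.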
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java Tue Jan 10 16:32:28 2012
> @@ -29,6 +29,13 @@ import java.util.Collection;
>  public interface ExtractorFactory<T extends Extractor<?>> extends ExtractorDescription {
>
>     /**
> +     * Returns the extractor type.
> +     *
> +     * @return the not <code>null</code> extractor class.
> +     */
> +    Class<T> getExtractorType();
> +
> +    /**
>      * Creates an extractor instance.
>      *
>      * @return an instance of the extractor associated to this factory.
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java Tue Jan 10 16:32:28 2012
> @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
>  import org.deri.any23.extractor.rdf.NQuadsExtractor;
>  import org.deri.any23.extractor.rdf.NTriplesExtractor;
>  import org.deri.any23.extractor.rdf.RDFXMLExtractor;
> +import org.deri.any23.extractor.rdf.TriXExtractor;
>  import org.deri.any23.extractor.rdf.TurtleExtractor;
>  import org.deri.any23.extractor.rdfa.RDFa11Extractor;
>  import org.deri.any23.extractor.rdfa.RDFaExtractor;
> @@ -79,6 +80,7 @@ public class ExtractorRegistry {
>                 instance.register(TurtleExtractor.factory);
>                 instance.register(NTriplesExtractor.factory);
>                 instance.register(NQuadsExtractor.factory);
> +                instance.register(TriXExtractor.factory);
>                 if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
>                     instance.register(RDFa11Extractor.factory);
>                 } else {
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java Tue Jan 10 16:32:28 2012
> @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
>         return supportedMIMETypes;
>     }
>
> +    @Override
> +    public Class<T> getExtractorType() {
> +        return extractorClass;
> +    }
> +
>     /**
>      * @return an instance of type T concrete implementation of {@link org.deri.any23.extractor.Extractor}
>      */
> +    @Override
>     public T createExtractor() {
>         try {
>             return extractorClass.newInstance();
> @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
>     /**
>      * @return an input example
>      */
> +    @Override
>     public String getExampleInput() {
>         return exampleInput;
>     }
>
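The new getExtractorType() method exposes the Class<T> that the factory already holds for instantiation. A small self-contained sketch of that type-token pattern is below. TypeTokenFactorySketch is a hypothetical name, and it uses getDeclaredConstructor().newInstance() instead of the Class.newInstance() call seen in the diff, since the latter is deprecated in recent JDKs.

```java
// Hypothetical sketch of a factory that keeps a Class<T> type token.
final class TypeTokenFactorySketch<T> {

    private final Class<T> type;

    TypeTokenFactorySketch(Class<T> type) {
        this.type = type;
    }

    // Counterpart of getExtractorType(): lets callers inspect the type
    // without having to create an instance first.
    Class<T> getType() {
        return type;
    }

    // Counterpart of createExtractor(): requires an accessible no-arg constructor.
    T create() {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}
```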
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java Tue Jan 10 16:32:28 2012
> @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
>                     Arrays.asList(
>                             "text/csv;q=0.1"
>                     ),
> -                    null,
> +                    "example-csv.csv",
>                     CSVExtractor.class
>             );
>
> @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
>     }
>
>     /**
> +     * Check whether a number is an integer.
> +     *
> +     * @param number
> +     * @return
> +     */
> +    private boolean isInteger(String number) {
> +        try {
> +            Integer.valueOf(number);
> +            return true;
> +        } catch (NumberFormatException e) {
> +            return false;
> +        }
> +    }
> +
> +    /**
> +     * Check whether a number is a float.
> +     *
>      * @param number
>      * @return
>      */
> -    private boolean isNumber(String number) {
> +    private boolean isFloat(String number) {
>         try {
> -            Double.valueOf(number);
> +            Float.valueOf(number);
>             return true;
>         } catch (NumberFormatException e) {
>             return false;
> @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
>             object = new URIImpl(cell);
>         } else {
>             URI datatype = XMLSchema.STRING;
> -            if (isNumber(cell)) {
> +            if (isInteger(cell)) {
>                 datatype = XMLSchema.INTEGER;
> +            } else if(isFloat(cell)) {
> +                datatype = XMLSchema.FLOAT;
>             }
>             object = new LiteralImpl(cell, datatype);
>         }
>
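The CSVExtractor change above tries the integer check first and falls back to float, then to string. The sketch below mirrors that order with plain strings in place of the Sesame XMLSchema constants, so it runs without any dependency. CsvDatatypeSketch and datatypeFor are hypothetical names. One consequence worth noting: a value that does not fit in an int, such as 99999999999, fails Integer.valueOf and is then accepted by Float.valueOf, so it ends up typed as a float.

```java
// Hypothetical sketch mirroring the integer -> float -> string fallback.
final class CsvDatatypeSketch {

    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    static boolean isInteger(String cell) {
        try {
            Integer.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isFloat(String cell) {
        try {
            Float.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Order matters: every int literal also parses as a float,
    // so the integer check has to come first.
    static String datatypeFor(String cell) {
        if (isInteger(cell)) {
            return XSD + "integer";
        }
        if (isFloat(cell)) {
            return XSD + "float";
        }
        return XSD + "string";
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "hello", "99999999999"}) {
            System.out.println(cell + " -> " + datatypeFor(cell));
        }
    }
}
```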
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java Tue Jan 10 16:32:28 2012
> @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
>                     "html-mf-adr",
>                     PopularPrefixes.createSubset("rdf", "vcard"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-adr.html",
>                     AdrExtractor.class
>             );
>  }
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java Tue Jan 10 16:32:28 2012
> @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
>                 "html-mf-geo",
>                 PopularPrefixes.createSubset("rdf", "vcard"),
>                 Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                null,
> +                "example-mf-geo.html",
>                 GeoExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java Tue Jan 10 16:32:28 2012
> @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
>                     "html-mf-hcalendar",
>                     PopularPrefixes.createSubset("rdf", "ical"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-hcalendar.html",
>                     HCalendarExtractor.class);
>
>     private static final String[] Components = {"Vevent", "Vtodo", "Vjournal", "Vfreebusy"};
> @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
>     private boolean extractComponent(Node node, Resource cal, String component) throws ExtractionException {
>         HTMLDocument compoNode = new HTMLDocument(node);
>         BNode evt = valueFactory.createBNode();
> -        addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
> +        addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
>         addTextProps(compoNode, evt);
>         addUrl(compoNode, evt);
>         addRRule(compoNode, evt);
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java Tue Jan 10 16:32:28 2012
> @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
>                     "html-mf-hcard",
>                     PopularPrefixes.createSubset("rdf", "vcard"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-hcard.html",
>                     HCardExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java Tue Jan 10 16:32:28 2012
> @@ -82,7 +82,7 @@ public class HListingExtractor extends E
>                     "html-mf-hlisting",
>                     PopularPrefixes.createSubset("rdf", "hlisting"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-hlisting.html",
>                     HListingExtractor.class
>             );
>
> @@ -106,7 +106,7 @@ public class HListingExtractor extends E
>         out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>
>         for (String action : findActions(fragment)) {
> -            out.writeTriple(listing, hLISTING.action, hLISTING.getResource(action));
> +            out.writeTriple(listing, hLISTING.action, hLISTING.getClass(action));
>         }
>         out.writeTriple(listing, hLISTING.lister, addLister() );
>         addItem(listing);
> @@ -154,7 +154,7 @@ public class HListingExtractor extends E
>                     String value = node.getNodeValue();
>                     // do not use conditionallyAdd, it won't work cause of evaluation rules
>                     if (!(null == value || "".equals(value))) {
> -                        URI property = hLISTING.getPropertyCamelized(klass);
> +                        URI property = hLISTING.getPropertyCamelCase(klass);
>                         conditionallyAddLiteralProperty(
>                                 node,
>                                 blankItem, property, valueFactory.createLiteral(value)
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java Tue Jan 10 16:32:28 2012
> @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
>                     "html-mf-hrecipe",
>                     PopularPrefixes.createSubset("rdf", "hrecipe"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-hrecipe.html",
>                     HRecipeExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java Tue Jan 10 16:32:28 2012
> @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
>                     "html-mf-hresume",
>                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-hresume.html",
>                     HResumeExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java Tue Jan 10 16:32:28 2012
> @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
>                     "html-mf-hreview",
>                     PopularPrefixes.createSubset("rdf", "vcard", "rev"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-hreview.html",
>                     HReviewExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java Tue Jan 10 16:32:28 2012
> @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
>                     "html-head-links",
>                     PopularPrefixes.createSubset("xhtml", "dcterms"),
>                     Arrays.asList("text/html;q=0.05", "application/xhtml+xml;q=0.05"),
> -                    null,
> +                    "example-head-link.html",
>                     HeadLinkExtractor.class);
>  }
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java Tue Jan 10 16:32:28 2012
> @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
>                     "html-head-icbm",
>                     PopularPrefixes.createSubset("geo", "rdf"),
>                     Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
> -                    null,
> +                    "example-icbm.html",
>                     ICBMExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java Tue Jan 10 16:32:28 2012
> @@ -51,7 +51,7 @@ public class LicenseExtractor implements
>                     "html-mf-license",
>                     PopularPrefixes.createSubset("xhtml"),
>                     Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
> -                    null,
> +                    "example-mf-license.html",
>                     LicenseExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java Tue Jan 10 16:32:28 2012
> @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
>                     "html-mf-species",
>                     PopularPrefixes.createSubset("rdf", "wo"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-mf-species.html",
>                     SpeciesExtractor.class
>             );
>
> @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>
>     private URI resolveClassName(String clazz) {
>         String upperCaseClass = clazz.substring(0, 1);
> -        return vWO.getResource(
> +        return vWO.getClass(
>                 String.format("%s%s",
>                         upperCaseClass.toUpperCase(),
>                         clazz.substring(1)
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java Tue Jan 10 16:32:28 2012
> @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
>                     NAME,
>                     PopularPrefixes.get(),
>                     Arrays.asList("text/html;q=0.02", "application/xhtml+xml;q=0.02"),
> -                    null,
> +                    "example-script-turtle.html",
>                     TurtleHTMLExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java Tue Jan 10 16:32:28 2012
> @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
>                 "html-mf-xfn",
>                 PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
>                 Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                null,
> +                "example-mf-xfn.html",
>                 XFNExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java Tue Jan 10 16:32:28 2012
> @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
>                     "html-microdata",
>                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> -                    null,
> +                    "example-microdata.html",
>                     MicrodataExtractor.class
>             );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java Tue Jan 10 16:32:28 2012
> @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
>  import org.deri.any23.extractor.ErrorReporter;
>  import org.deri.any23.extractor.ExtractionContext;
>  import org.deri.any23.extractor.ExtractionResult;
> -import org.deri.any23.parser.NQuadsParser;
> +import org.deri.any23.io.nquads.NQuadsParser;
>  import org.deri.any23.rdf.Any23ValueFactoryWrapper;
>  import org.openrdf.model.impl.ValueFactoryImpl;
>  import org.openrdf.rio.ParseErrorListener;
> @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
>  import org.openrdf.rio.RDFParser;
>  import org.openrdf.rio.ntriples.NTriplesParser;
>  import org.openrdf.rio.rdfxml.RDFXMLParser;
> +import org.openrdf.rio.trix.TriXParser;
>  import org.openrdf.rio.turtle.TurtleParser;
>  import org.slf4j.Logger;
>  import org.slf4j.LoggerFactory;
> @@ -38,7 +39,7 @@ import java.io.Reader;
>
>  /**
>  * This factory provides a common logic for creating and configuring correctly
> - * any RDF parser used within the library.
> + * any <i>RDF</i> parser used within the library.
>  *
>  * @author Michele Mostarda (mostarda@fbk.eu)
>  */
> @@ -119,7 +120,7 @@ public class RDFParserFactory {
>     }
>
>     /**
> -     * Returns a new instance of a configured {@link org.deri.any23.parser.NQuadsParser}.
> +     * Returns a new instance of a configured {@link org.deri.any23.io.nquads.NQuadsParser}.
>      *
>      * @param verifyDataType data verification enable if <code>true</code>.
>      * @param stopAtFirstError the parser stops at first error if <code>true</code>.
> @@ -139,6 +140,26 @@ public class RDFParserFactory {
>     }
>
>     /**
> +     * Returns a new instance of a configured {@link TriXParser}.
> +     *
> +     * @param verifyDataType data verification enable if <code>true</code>.
> +     * @param stopAtFirstError the parser stops at first error if <code>true</code>.
> +     * @param extractionContext the extraction context where the parser is used.
> +     * @param extractionResult the output extraction result.
> +     * @return a new instance of a configured TriX parser.
> +     */
> +    public TriXParser getTriXParser(
> +            final boolean verifyDataType,
> +            final boolean stopAtFirstError,
> +            final ExtractionContext extractionContext,
> +            final ExtractionResult extractionResult
> +    ) {
> +        final TriXParser parser = new TriXParser();
> +        configureParser(parser, verifyDataType, stopAtFirstError, extractionContext, extractionResult);
> +        return parser;
> +    }
> +
> +    /**
>      * Configures the given parser on the specified extraction result
>      * setting the policies for data verification and error handling.
>      *
>
>

Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./ any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/ any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/or

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks Guys, appreciate it!

Cheers,
Chris

On Jan 10, 2012, at 9:08 AM, Simone Tripodi wrote:

> Hi Mic,
> this is something great, thanks for the hard work of merging!
> next step is renaming the packages in org.apache.any23 :)
>
> All the best, have a nice day!
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/
>
>
>
> On Tue, Jan 10, 2012 at 5:32 PM,  <mo...@apache.org> wrote:
>> Author: mostarda
>> Date: Tue Jan 10 16:32:28 2012
>> New Revision: 1229627
>>
>> URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
>> Log:
>> This commit synchronizes the dismissed Any23 Google Code SVN repo [1]
>> with the current Apache Any23 SVN repo, including the issues
>> developed during the initial import transition phase.
>> Such issues have been tracked on the original Any23 Google Code Issue Tracker [2].
>> Below is an extract of the original repository commit log.
>>
>> This commit is related to issue ANY23-27.
>>
>> [1] http://any23.googlecode.com/svn/trunk/
>> [2] http://code.google.com/p/any23/issues/list
>>
>> ==== BEGIN: Original Log ====
>>
>> ------------------------------------------------------------------------
>> r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>>
>> Improved numeric datatype assignment. This commit fixes issue #208.
>> ------------------------------------------------------------------------
>> r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'. Fixed HTMLMetaExtractorTest.java to match this new
>> namespace. Discovered and fixed an issue in the SINDICE.java vocabulary: NS was declared as a resource instead of as a URI. Fixed
>> RDFSchemaUtilsTest.java, whose sizes were wrong due to the wrong NS declaration. This commit is related to issue #203.
>> ------------------------------------------------------------------------
>> r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found wrong declaration of Class(Resource) in WO.java
>> vocab. Fixed and updated RDFSchemaUtils.java test. This commit is related to issue #198.
>> ------------------------------------------------------------------------
>> r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Added utility method.
>> ------------------------------------------------------------------------
>> r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Improved Vocabulary.java class: added support for comments to any resource. Improved RDFSchemaUtils.java serialization
>> support, added separators to RDFXML serialization. This commit is related to issue #198.
>> ------------------------------------------------------------------------
>> r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) | 1 line
>>
>> Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved prefix declaration parsing in RDFa11Parser; the
>> new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for prefix mapping resolution in
>> RDFa11Parser, which enables correct support for the structured properties introduced by the latest version of the Open
>> Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest to the new output of vocabularies serialization.
>> Updated Any23PluginManagerTest to include a new class. This commit is related to issue #206.
>> ------------------------------------------------------------------------
>> r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) | 1 line
>>
>> Restricted scope of testGetClassesFromClasspath to avoid updating it every time a new class is added.
>> ------------------------------------------------------------------------
>> r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Improved validation mode support. Improved descriptions of Validation and Report fields. This commit is related to issue
>> #209.
>> ------------------------------------------------------------------------
>> r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Improved Any23 Service XML Report format documentation.
>> ------------------------------------------------------------------------
>> r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Added URL encoding to the source location path. This commit fixes issue #205. Chose not to write a formal test, since
>> it would require the creation of folders with spaces.
>> ------------------------------------------------------------------------
>> r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Removed obsolete section.
>> ------------------------------------------------------------------------
>> r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) | 1 line
>>
>> Improved Any23 facade, added method createDocumentSource() to simplify the extraction setup.
>> ------------------------------------------------------------------------
>> r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) | 1 line
>>
>> Refactored Rover CLI class to make it extensible by other CLI implementations.
>> ------------------------------------------------------------------------
>> r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) | 3 lines
>>
>> Removed wrong artifact name.
>>
>>
>> ------------------------------------------------------------------------
>> r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Removed no longer used jspf lib. Added crawler4j dependencies. Added README. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Changed attribute visibility to facilitate class extensibility.
>> ------------------------------------------------------------------------
>> r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Added helper methods to extract file lines as list of strings. Improved javadoc.
>> ------------------------------------------------------------------------
>> r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Added first version of basic-crawler plugin. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Added plugins README.
>> ------------------------------------------------------------------------
>> r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Updated main README, added references to plugin and lib.
>> ------------------------------------------------------------------------
>> r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Fixed assembly name.
>> ------------------------------------------------------------------------
>> r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Fixed Tool signature. This commit is related to #211.
>> ------------------------------------------------------------------------
>> r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Improved logging.
>> ------------------------------------------------------------------------
>> r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Included plugin basic-crawler in reactor. Improved ToolRunner and Any23PluginManager tests to be compliant with the new
>> plugin classes. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Fixed Crawler4j group id. Related to issue #211.
>> ------------------------------------------------------------------------
>> r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Improved plugin documentation. Introduced Office Scraper specific page. This commit is related to issue #213.
>> ------------------------------------------------------------------------
>> r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed POST method documentation. Related to issue #213.
>> ------------------------------------------------------------------------
>> r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed code snippets, prettified, added missing finalization logic. See issue #187.
>> ------------------------------------------------------------------------
>> r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed var name. See #187.
>> ------------------------------------------------------------------------
>> r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Updated code snippets and tutorial, added explicit TripleHandler closure. This commit is related to issue #187.
>> ------------------------------------------------------------------------
>> r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed data type handling management in NQuadsParser. This commit is related to issue #210.
>> ------------------------------------------------------------------------
>> r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Added missing JSON output format. See #214.
>> ------------------------------------------------------------------------
>> r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output format support to Rover. This commit is related to
>> issue #215.
>> ------------------------------------------------------------------------
>> r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Added Sesame TriX IO dependency. This commit is related to #215.
>> ------------------------------------------------------------------------
>> r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Some suppressed tests have been reactivated as Ignored.
>> ------------------------------------------------------------------------
>> r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Added TriX output format to the Any23 Service. Commit related to issue #215.
>> ------------------------------------------------------------------------
>> r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Improved FormatWriter management, added WriterRegistry. Improved Writer format management in Rover and WebResponder.
>> This commit is related to issues #215 and #216.
>> ------------------------------------------------------------------------
>> r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) | 6 lines
>>
>> Added TriXExtractor and textual example (example-trix.trx), added trix support in RDFParserFactory.
>> Registered TriXExtractor to the ExtractorRegistry.
>> Added TriX mimetype support in TikaMIMETypeDetector (through mimetypes.xml) and added specific test.
>> Added support and doc to TriX format in Any23 Service web page (form.html).
>> This commit is related to issue #215.
>>
>> ------------------------------------------------------------------------
>> r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) | 1 line
>>
>> Fixed number of extractors (+1 after adding TriXExtractor). Commit related to issue #215.
>> ------------------------------------------------------------------------
>> r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Added method getExtractorType() .
>> ------------------------------------------------------------------------
>> r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) | 4 lines
>>
>> Improved ExtractorDocumentation support, added missing format examples.
>> Improved output layout. This commit is related to issue #194.
>>
>>
>> ------------------------------------------------------------------------
>> r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Improved classpath management in Any23PluginManager. Renamed getClasses* to loadClasses*. This commit is related to
>> issue #212.
>> ------------------------------------------------------------------------
>> r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Separated log messages from specific output data.
>> ------------------------------------------------------------------------
>> r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Added human readable report printing support in ReportingTripleHandler and Rover.
>> ------------------------------------------------------------------------
>> r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Fixed major issue in output generation, added final activity report, help prettification. This commit is related to
>> issue #211.
>> ------------------------------------------------------------------------
>> r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Upgraded to Sesame 2.6.1. See issue #217.
>> ------------------------------------------------------------------------
>> r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue #216
>> ------------------------------------------------------------------------
>> r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
>> ------------------------------------------------------------------------
>> r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) | 1 line
>>
>> Added specific Crawler CLI documentation. Updated general CLI documentation. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) | 4 lines
>>
>> The Eval CLI Tool has been removed as well as the org.deri.any23.eval package classes related to it.
>> Updated tests verifying CLI tool detection.
>> This commit is related to issue #218.
>>
>> ------------------------------------------------------------------------
>> r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) | 5 lines
>>
>> Added MimeDetector CLI Tool and test case, removed main() from
>> TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
>> Updated CLI doc.
>> This commit is related to issue #219.
>>
>> ------------------------------------------------------------------------
>> r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) | 1 line
>>
>> Added support for comment serialization. Related to issue #158.
>> ------------------------------------------------------------------------
>> r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) | 1 line
>>
>> Add support for annotation writing in FormatWriter implementations. This commit is related to issue #158.
>> ------------------------------------------------------------------------
>> r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) | 1 line
>>
>> Added support for 'annotate' flag in Any23 Service.
>> ------------------------------------------------------------------------
>>
>> ==== END  : Original Log ====
>>
>>
>> Added:
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
>>   incubator/any23/trunk/any23-core/src/test/resources/application/trix/
>>   incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
>>   incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
>>   incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
>>   incubator/any23/trunk/lib/README.txt
>>   incubator/any23/trunk/plugins/README.txt
>>   incubator/any23/trunk/plugins/basic-crawler/
>>   incubator/any23/trunk/plugins/basic-crawler/pom.xml
>>   incubator/any23/trunk/plugins/basic-crawler/src/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
>>   incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
>>   incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
>> Removed:
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
>> Modified:
>>   incubator/any23/trunk/README.txt
>>   incubator/any23/trunk/any23-core/bin/any23
>>   incubator/any23/trunk/any23-core/bin/any23tools
>>   incubator/any23/trunk/any23-core/pom.xml
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
>>   incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
>>   incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
>>   incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
>>   incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
>>   incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
>>   incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
>>   incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
>>   incubator/any23/trunk/lib/install-deps.sh
>>   incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
>>   incubator/any23/trunk/pom.xml
>>   incubator/any23/trunk/src/site/apt/any23-plugins.apt
>>   incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
>>   incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
>>   incubator/any23/trunk/src/site/apt/getting-started.apt
>>   incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
>>   incubator/any23/trunk/src/site/apt/service.apt
>>   incubator/any23/trunk/src/site/apt/supported-formats.apt
>>
>> Modified: incubator/any23/trunk/README.txt
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/README.txt (original)
>> +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
>> @@ -20,7 +20,8 @@ Distribution Content
>>
>> any23-core           The library core codebase.
>> any23-service        The library HTTP service codebase.
>> -plugins              Library plugins codebase.
>> +lib                  Contains the Any23 the external deps (read lib/README.txt for further details).
>> +plugins              Library plugins codebase (read plugins/README.txt for further details).
>> RELEASE-NOTES.txt    File reporting main release notes for every version.
>> LICENSE.txt          Applicable project license.
>> README.txt           This file.
>> @@ -240,15 +241,14 @@ Upload the produced packages in download
>>
>>   http://code.google.com/p/any23/downloads/list
>>
>> +--------------------
>> +Manage External Deps
>> +--------------------
>>
>> -Fix Release Procedure
>> ----------------------
>> -
>> -   Currently the *plugins/integration-test* module is excluded from the parent
>> -   reactor.
>> -   To fix it in tag follow procedure as described at issue #171:
>> -
>> -        http://code.google.com/p/any23/issues/detail?id=171
>> +::Developers interest only.::
>>
>> +External Deps are libraries used by some Any23 modules which are
>> +not available in public Maven repositories. Such libraries are
>> +managed within the 'lib' dir.
>>
>> EOF
>>
>> Modified: incubator/any23/trunk/any23-core/bin/any23
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/bin/any23 (original)
>> +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
>> @@ -9,12 +9,12 @@
>> ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>>
>> if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> -    echo "Generating executable JAR..."
>> -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> +    echo "Generating executable JAR..." >&2
>> +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>>        ||\
>> -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>>       ||\
>> -    { echo "Error while generating commandline assembly."; exit 1; }
>> +    { echo "Error while generating commandline assembly."  >&2; exit 1; }
>> fi
>>
>> SEP=':'
>>
>> Modified: incubator/any23/trunk/any23-core/bin/any23tools
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/bin/any23tools (original)
>> +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
>> @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
>> PLUGINS_DIR=plugins
>>
>> if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> -    echo "Generating executable JAR..."
>> -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> +    echo "Generating executable JAR..." >&2
>> +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>>        ||\
>> -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>>       ||\
>> -    { echo "Error while generating commandline assembly."; exit 1; }
>> +    { echo "Error while generating commandline assembly." >&2; exit 1; }
>> fi
>>
>> SEP=':'
>> @@ -30,6 +30,7 @@ done
>> # Plugins classpath.
>> for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
>> do
>> +  echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
>>  if [ ! -e "$jar" ]; then continue; fi
>>  CP="$CP$SEP$jar"
>> done
>>
>> Modified: incubator/any23/trunk/any23-core/pom.xml
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/pom.xml (original)
>> +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
>> @@ -92,6 +92,10 @@
>>        </dependency>
>>        <dependency>
>>            <groupId>org.openrdf.sesame</groupId>
>> +            <artifactId>sesame-rio-trix</artifactId>
>> +        </dependency>
>> +        <dependency>
>> +            <groupId>org.openrdf.sesame</groupId>
>>            <artifactId>sesame-repository-sail</artifactId>
>>        </dependency>
>>        <dependency>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
>> @@ -258,6 +258,28 @@ public class Any23 {
>>    }
>>
>>    /**
>> +     * Returns the most appropriate {@link DocumentSource} for the given <code>documentURI</code>.
>> +     *
>> +     * @param documentURI the document <i>URI</i>.
>> +     * @return a new instance of DocumentSource.
>> +     * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
>> +     * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
>> +     */
>> +    public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
>> +        if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
>> +        if (documentURI.toLowerCase().startsWith("file:")) {
>> +            return new FileDocumentSource( new File(new URI(documentURI)) );
>> +        }
>> +        if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> +            return new HTTPDocumentSource(getHTTPClient(), documentURI);
>> +        }
>> +        throw new IllegalArgumentException(
>> +                String.format("Unsupported protocol for document URI: '%s' .", documentURI)
>> +        );
>> +    }
>> +
>> +
>> +    /**
>>     * Performs metadata extraction from the content of the given
>>     * <code>in</code> document source, sending the generated events
>>     * to the specified <code>outputHandler</code>.
>> @@ -363,13 +385,7 @@ public class Any23 {
>>    public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
>>    throws IOException, ExtractionException {
>>        try {
>> -            if (documentURI.toLowerCase().startsWith("file:")) {
>> -                return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
>> -            }
>> -            if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> -                return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
>> -            }
>> -            throw new ExtractionException("Not a valid absolute URI: " + documentURI);
>> +            return extract(eps, createDocumentSource(documentURI), outputHandler);
>>        } catch (URISyntaxException ex) {
>>            throw new ExtractionException("Error while extracting data from document URI.", ex);
>>        }
>>
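
A note on the Any23.java hunks above: the scheme check moved from extract() into the new createDocumentSource(). As far as I can read the diff, an unsupported scheme now surfaces as an unchecked IllegalArgumentException rather than the old ExtractionException, since extract() only catches URISyntaxException. Below is a minimal, self-contained sketch of the same dispatch. DocumentSchemes and Kind are hypothetical names of mine, not Any23 classes, and Locale.ROOT is my addition so the lowercasing does not depend on the default locale.

```java
/** Hypothetical stand-in for the scheme check in Any23.createDocumentSource(). */
final class DocumentSchemes {

    /** The two families of sources handled by the commit. */
    enum Kind { FILE, HTTP }

    /** Classifies a document URI by its scheme prefix, ignoring case. */
    static Kind classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // Assumption: Locale.ROOT is used here; the commit calls toLowerCase() without a locale.
        final String lower = documentURI.toLowerCase(java.util.Locale.ROOT);
        if (lower.startsWith("file:")) {
            return Kind.FILE;
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return Kind.HTTP;
        }
        // Unchecked, as in the commit: callers that only catch checked exceptions will not see it.
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s'.", documentURI)
        );
    }

    public static void main(String[] args) {
        System.out.println(classify("FILE:/tmp/page.html"));
        System.out.println(classify("https://example.org/"));
    }
}
```

Callers that used to rely on ExtractionException for a bad URI may want to check how they handle the new exception type.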
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
>> @@ -16,7 +16,7 @@
>>
>> package org.deri.any23.cli;
>>
>> -import org.deri.any23.LogUtil;
>> +import org.deri.any23.util.LogUtils;
>> import org.deri.any23.extractor.ExampleInputOutput;
>> import org.deri.any23.extractor.ExtractionException;
>> import org.deri.any23.extractor.Extractor;
>> @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
>>    }
>>
>>    public int run(String[] args) {
>> -        LogUtil.setDefaultLogging();
>> +        LogUtils.setDefaultLogging();
>>        try {
>>            if (args.length == 0) {
>>                printUsage();
>> @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
>>     * Prints the list of all the available extractors.
>>     */
>>    public void printExtractorList() {
>> -        for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
>> -            System.out.println(extractorName);
>> +        for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
>> +            System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
>>        }
>>    }
>>
>> @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
>>            ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
>>            ExampleInputOutput example = new ExampleInputOutput(factory);
>>            System.out.println("Extractor: " + extractorName);
>> -            System.out.println("  type: " + getType(factory));
>> -            String output = example.getExampleOutput();
>> -            if (output == null) {
>> -                System.out.println("(no example output)");
>> +            System.out.println("\ttype: " + getType(factory));
>> +            System.out.println();
>> +            final String exampleInput = example.getExampleInput();
>> +            if(exampleInput == null) {
>> +                System.out.println("(No Example Available)");
>>            } else {
>> -                System.out.println("-------- example output --------");
>> -                System.out.println(output);
>> +                System.out.println("-------- Example Input  --------");
>> +                System.out.println(exampleInput);
>> +                System.out.println("-------- Example Output --------");
>> +                String output = example.getExampleOutput();
>> +                System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
>>            }
>> -            System.out.println();
>>            System.out.println("================================");
>> +            System.out.println();
>>        }
>>    }
>>
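
One small observation on printExtractorList() above: getExtractorType() returns a Class, so "%15s" formats it through Class.toString(), which yields "class " plus the fully qualified name. The width in a format specifier is a minimum, so such values are never truncated and the columns will probably not line up. A sketch of an alternative using the simple class name follows. ExtractorListFormat and row() are hypothetical names, and String.class only stands in for an extractor class.

```java
/** Hypothetical helper showing how the "%25s [%15s]" row format behaves. */
final class ExtractorListFormat {

    /** Formats one row, right-aligning the name and the simple class name. */
    static String row(String name, Class<?> type) {
        // getSimpleName() drops the package, so short class names fit the 15-character column.
        return String.format("%25s [%15s]", name, type.getSimpleName());
    }

    public static void main(String[] args) {
        // String.class is only a placeholder for an extractor class.
        System.out.println(row("example-extractor", String.class));
        // For comparison: a Class formatted with %s uses Class.toString().
        System.out.println(String.format("%25s [%15s]", "example-extractor", String.class));
    }
}
```

Names longer than the column width are still printed in full; only the padding changes.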
>>
>> Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
>> @@ -0,0 +1,113 @@
>> +/*
>> + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
>> + *
>> + * Licensed under the Apache License, Version 2.0 (the "License");
>> + * you may not use this file except in compliance with the License.
>> + * You may obtain a copy of the License at
>> + *
>> + *          http://www.apache.org/licenses/LICENSE-2.0
>> + *
>> + * Unless required by applicable law or agreed to in writing, software
>> + * distributed under the License is distributed on an "AS IS" BASIS,
>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> + * See the License for the specific language governing permissions and
>> + * limitations under the License.
>> + */
>> +
>> +package org.deri.any23.cli;
>> +
>> +import org.deri.any23.configuration.DefaultConfiguration;
>> +import org.deri.any23.http.DefaultHTTPClient;
>> +import org.deri.any23.http.HTTPClient;
>> +import org.deri.any23.http.HTTPClientConfiguration;
>> +import org.deri.any23.mime.MIMEType;
>> +import org.deri.any23.mime.MIMETypeDetector;
>> +import org.deri.any23.mime.TikaMIMETypeDetector;
>> +import org.deri.any23.source.DocumentSource;
>> +import org.deri.any23.source.FileDocumentSource;
>> +import org.deri.any23.source.HTTPDocumentSource;
>> +import org.deri.any23.source.StringDocumentSource;
>> +
>> +import java.io.File;
>> +import java.net.URISyntaxException;
>> +
>> +/**
>> + * Commandline tool to detect <b>MIME Type</b>s from
>> + * file, HTTP and direct input sources.
>> + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
>> + *
>> + * @author Michele Mostarda (mostarda@fbk.eu)
>> + */
>> +@ToolRunner.Description("MIME Type Detector Tool.")
>> +public class MimeDetector implements Tool{
>> +
>> +    public static final String FILE_DOCUMENT_PREFIX   = "file://";
>> +    public static final String INLINE_DOCUMENT_PREFIX = "inline://";
>> +    public static final String URL_DOCUMENT_RE        = "^https?://.*";
>> +
>> +    public static void main(String[] args) {
>> +        System.exit( new MimeDetector().run(args) );
>> +    }
>> +
>> +    @Override
>> +    public int run(String[] args) {
>> +          if(args.length != 1) {
>> +            System.err.println("USAGE: {http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
>> +            return 1;
>> +        }
>> +
>> +        final String document = args[0];
>> +        try {
>> +            final DocumentSource documentSource = createDocumentSource(document);
>> +            final MIMETypeDetector detector = new TikaMIMETypeDetector();
>> +            final MIMEType mimeType = detector.guessMIMEType(
>> +                    documentSource.getDocumentURI(),
>> +                    documentSource.openInputStream(),
>> +                    MIMEType.parse(documentSource.getContentType())
>> +            );
>> +            System.out.println(mimeType);
>> +            return 0;
>> +        } catch (Exception e) {
>> +            System.err.print("Error while detecting MIME Type.");
>> +            e.printStackTrace(System.err);
>> +            return 1;
>> +        }
>> +    }
>> +
>> +    private DocumentSource createDocumentSource(String document) throws URISyntaxException {
>> +        if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
>> +            return new FileDocumentSource(
>> +                    new File(
>> +                            document.substring(FILE_DOCUMENT_PREFIX.length())
>> +                    )
>> +            );
>> +        }
>> +        if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
>> +            return new StringDocumentSource(
>> +                    document.substring(INLINE_DOCUMENT_PREFIX.length()),
>> +                    ""
>> +            );
>> +        }
>> +        if(document.matches(URL_DOCUMENT_RE)) {
>> +            final HTTPClient client = new DefaultHTTPClient();
>> +            // TODO: anonymous config class also used in Any23. centralize.
>> +            client.init(new HTTPClientConfiguration() {
>> +                public String getUserAgent() {
>> +                    return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
>> +                }
>> +                public String getAcceptHeader() {
>> +                    return "";
>> +                }
>> +                public int getDefaultTimeout() {
>> +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
>> +                }
>> +                public int getMaxConnections() {
>> +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
>> +                }
>> +            });
>> +            return new HTTPDocumentSource(client, document);
>> +        }
>> +        throw new IllegalArgumentException("Unsupported protocol for document " + document);
>> +    }
>> +
>> +}
>>
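
The argument handling in MimeDetector differs slightly from Any23.createDocumentSource(): here the prefixes are matched case-sensitively and the file form needs the full "file://" prefix, so an argument like "file:/tmp/x" would be rejected even though Any23 accepts it. The sketch below reproduces only the prefix logic, using the three constants from the commit. MimeDetectorArgs and describe() are hypothetical names, and the returned strings are just labels for illustration.

```java
/** Hypothetical helper mirroring the prefix checks in MimeDetector.createDocumentSource(). */
final class MimeDetectorArgs {

    static final String FILE_DOCUMENT_PREFIX   = "file://";
    static final String INLINE_DOCUMENT_PREFIX = "inline://";
    static final String URL_DOCUMENT_RE        = "^https?://.*";

    /** Returns "kind:payload" for a command-line argument, or throws if unsupported. */
    static String describe(String document) {
        if (document.startsWith(FILE_DOCUMENT_PREFIX)) {
            // Everything after "file://" is treated as a local path.
            return "file:" + document.substring(FILE_DOCUMENT_PREFIX.length());
        }
        if (document.startsWith(INLINE_DOCUMENT_PREFIX)) {
            // Everything after "inline://" is the document content itself.
            return "inline:" + document.substring(INLINE_DOCUMENT_PREFIX.length());
        }
        if (document.matches(URL_DOCUMENT_RE)) {
            return "url:" + document;
        }
        throw new IllegalArgumentException("Unsupported protocol for document " + document);
    }

    public static void main(String[] args) {
        System.out.println(describe("file:///tmp/local.html"));
        System.out.println(describe("inline://<html/>"));
        System.out.println(describe("http://example.org/resource.html"));
    }
}
```

If the two entry points are meant to accept the same inputs, sharing one implementation (as the TODO in the commit already hints for the HTTP client configuration) might be worth considering.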
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
>> @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
>> import org.apache.commons.cli.Options;
>> import org.apache.commons.cli.PosixParser;
>> import org.deri.any23.Any23;
>> -import org.deri.any23.LogUtil;
>> +import org.deri.any23.util.LogUtils;
>> import org.deri.any23.configuration.Configuration;
>> import org.deri.any23.configuration.DefaultConfiguration;
>> import org.deri.any23.extractor.ExtractionException;
>> @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
>> import org.deri.any23.extractor.SingleDocumentExtraction;
>> import org.deri.any23.filter.IgnoreAccidentalRDFa;
>> import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
>> +import org.deri.any23.source.DocumentSource;
>> import org.deri.any23.writer.BenchmarkTripleHandler;
>> import org.deri.any23.writer.LoggingTripleHandler;
>> -import org.deri.any23.writer.NQuadsWriter;
>> -import org.deri.any23.writer.NTriplesWriter;
>> -import org.deri.any23.writer.RDFXMLWriter;
>> import org.deri.any23.writer.ReportingTripleHandler;
>> import org.deri.any23.writer.TripleHandler;
>> import org.deri.any23.writer.TripleHandlerException;
>> -import org.deri.any23.writer.TurtleWriter;
>> -import org.deri.any23.writer.URIListWriter;
>> +import org.deri.any23.writer.WriterRegistry;
>> import org.slf4j.Logger;
>> import org.slf4j.LoggerFactory;
>>
>> @@ -51,6 +48,7 @@ import java.io.OutputStream;
>> import java.io.PrintStream;
>> import java.io.PrintWriter;
>> import java.net.MalformedURLException;
>> +import java.net.URISyntaxException;
>> import java.net.URL;
>>
>> import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
>> @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
>> * A default rover implementation. Goes and fetches a URL using an hint
>> * as to what format should require, then tries to convert it to RDF.
>> *
>> - * @author Gabriele Renzi
>> - * @author Richard Cyganiak (richard@cyganiak.de)
>> * @author Michele Mostarda (mostarda@fbk.eu)
>> + * @author Richard Cyganiak (richard@cyganiak.de)
>> + * @author Gabriele Renzi
>> */
>> @ToolRunner.Description("Any23 Command Line Tool.")
>> public class Rover implements Tool {
>>
>> -    // Supported formats.
>> -    private static final String TURTLE_FORMAT  = "turtle";
>> -    private static final String NTRIPLE_FORMAT = "ntriples";
>> -    private static final String RDFXML_FORMAT  = "rdfxml";
>> -    private static final String NQUADS_FORMAT  = "nquads";
>> -    private static final String URIS_FORMAT    = "uris";
>> -
>> -    private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
>> +    private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
>> +    private static final int DEFAULT_FORMAT_INDEX = 0;
>>
>>    private static final Logger logger = LoggerFactory.getLogger(Rover.class);
>>
>> -    private static Options options;
>> +    private Options options;
>>
>> -    public static void main(String[] args) {
>> -        System.exit( new Rover().run(args) );
>> -    }
>> +    private CommandLine commandLine;
>>
>> -    public int run(String[] args) {
>> -        final CommandLineParser parser = new PosixParser();
>> -        final CommandLine commandLine;
>> +    private boolean verbose = false;
>>
>> -        boolean verbose = false;
>> -        try {
>> -            options = createOptions();
>> -            commandLine = parser.parse(options, args);
>> +    private PrintStream outputStream;
>> +    private TripleHandler tripleHandler;
>> +    private ReportingTripleHandler reportingTripleHandler;
>> +    private BenchmarkTripleHandler benchmarkTripleHandler;
>>
>> -            if (commandLine.hasOption("h")) {
>> -                printHelp();
>> -                return 0;
>> -            }
>> +    private ExtractionParameters eps;
>> +    private Any23 any23;
>>
>> -            if (commandLine.hasOption('v')) {
>> -                verbose = true;
>> -                LogUtil.setVerboseLogging();
>> -            } else {
>> -                LogUtil.setDefaultLogging();
>> -            }
>> -
>> -            if (commandLine.getArgs().length < 1) {
>> -                printHelp();
>> -                throw new IllegalArgumentException("Expected at least 1 argument.");
>> -            }
>> +    protected boolean isVerbose() {
>> +        return verbose;
>> +    }
>>
>> -            final String[] inputURIs      = argumentsToURIs(commandLine.getArgs());
>> -            final String[] extractorNames = getExtractors(commandLine);
>> +    public static void main(String[] args) {
>> +        System.exit( new Rover().run(args) );
>> +    }
>>
>> -            PrintStream outputStream    = null;
>> -            TripleHandler tripleHandler = null;
>> -            try {
>> -                outputStream  = getOutputStream(commandLine);
>> +    public int run(String[] args) {
>> +        try {
>> +            final String[] uris = configure(args);
>> +            performExtraction(uris);
>> +            return 0;
>> +        } catch (Exception e) {
>> +            System.err.println( e.getMessage() );
>> +            final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
>> +            if(verbose) e.printStackTrace(System.err);
>> +            return exitCode;
>> +        }
>> +    }
>>
>> -                tripleHandler = getTripleHandler(commandLine, outputStream);
>> +    protected CommandLine getCommandLine() {
>> +        if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
>> +        return commandLine;
>> +    }
>>
>> -                tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
>> +    protected String[] configure(String[] args) throws Exception {
>> +        final CommandLineParser parser = new PosixParser();
>> +        options = createOptions();
>> +        commandLine = parser.parse(options, args);
>>
>> -                tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>> -                final BenchmarkTripleHandler benchmarkTripleHandler =
>> -                        tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>> +        if (commandLine.hasOption("h")) {
>> +            printHelp();
>> +            throw new ExitCodeException(0);
>> +        }
>>
>> -                tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> +        if (commandLine.hasOption('v')) {
>> +            verbose = true;
>> +            LogUtils.setVerboseLogging();
>> +        } else {
>> +            LogUtils.setDefaultLogging();
>> +        }
>>
>> -                final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
>> +        if (commandLine.getArgs().length < 1) {
>> +            printHelp();
>> +            throw new IllegalArgumentException("Expected at least 1 argument.");
>> +        }
>>
>> -                final ExtractionParameters eps = getExtractionParameters(commandLine);
>> +        final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
>> +        final String[] extractorNames = getExtractors(commandLine);
>>
>> -                final Any23 any23 = createAny23(extractorNames);
>> +        try {
>> +            outputStream  = getOutputStream(commandLine);
>> +            tripleHandler = getTripleHandler(commandLine, outputStream);
>> +            tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
>> +            tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>>
>> -                final long start = System.currentTimeMillis();
>> -                for(String inputURI : inputURIs) {
>> -                    performExtraction(any23, eps, inputURI, reportingTripleHandler);
>> -                }
>> -                final long elapsed = System.currentTimeMillis() - start;
>> +            benchmarkTripleHandler =
>> +                    tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>>
>> -                closeAll(tripleHandler, outputStream);
>> +            tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>>
>> -                if (benchmarkTripleHandler != null) {
>> -                    System.err.println( benchmarkTripleHandler.report() );
>> -                }
>> +            reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
>> +            eps = getExtractionParameters(commandLine);
>> +            any23 = createAny23(extractorNames);
>>
>> -                logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
>> -                logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> -            } finally {
>> -                closeAll(tripleHandler, outputStream);
>> -            }
>> +            return inputURIs;
>>        } catch (Exception e) {
>> -            System.err.println(e.getMessage());
>> -            final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
>> -            if(verbose) e.printStackTrace(System.err);
>> -            return exitCode;
>> +            closeStreams();
>> +            throw e;
>>        }
>> -        return 0;
>>    }
>>
>> -    private Options createOptions() {
>> +    protected Options createOptions() {
>>        final Options options = new Options();
>>        options.addOption(
>>                new Option("v", "verbose", false, "Show debug and progress information.")
>> @@ -178,13 +175,7 @@ public class Rover implements Tool {
>>                        "f",
>>                        "Output format",
>>                        true,
>> -                        "[" +
>> -                                TURTLE_FORMAT  + " (default), " +
>> -                                NTRIPLE_FORMAT + ", " +
>> -                                RDFXML_FORMAT  + ", " +
>> -                                NQUADS_FORMAT  + ", " +
>> -                                URIS_FORMAT    +
>> -                        "]"
>> +                        "[" +  printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
>>                )
>>        );
>>        options.addOption(
>> @@ -208,11 +199,51 @@ public class Rover implements Tool {
>>        return options;
>>    }
>>
>> +    protected void performExtraction(DocumentSource documentSource) {
>> +        performExtraction(any23, eps, documentSource, reportingTripleHandler);
>> +    }
>> +
>> +    protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
>> +        try {
>> +            final long start = System.currentTimeMillis();
>> +            for (String inputURI : inputURIs) {
>> +                performExtraction( any23.createDocumentSource(inputURI) );
>> +            }
>> +            final long elapsed = System.currentTimeMillis() - start;
>> +
>> +            if (benchmarkTripleHandler != null) {
>> +                System.err.println(benchmarkTripleHandler.report());
>> +            }
>> +
>> +            logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
>> +            logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> +        } finally {
>> +            closeStreams();
>> +        }
>> +    }
>> +
>> +    protected String printReports() {
>> +        final StringBuilder sb = new StringBuilder();
>> +        if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
>> +        if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
>> +        return sb.toString();
>> +    }
>> +
>>    private void printHelp() {
>>        HelpFormatter formatter = new HelpFormatter();
>>        formatter.printHelp("[{<url>|<file>}]+", options, true);
>>    }
>>
>> +    private String printFormats(String[] formats, int defaultIndex) {
>> +        final StringBuilder sb = new StringBuilder();
>> +        for (int i = 0; i < formats.length; i++) {
>> +            sb.append(formats[i]);
>> +            if(i == defaultIndex) sb.append(" (default)");
>> +            if(i < formats.length - 1) sb.append(", ");
>> +        }
>> +        return sb.toString();
>> +    }
>> +
>>    private String argumentToURI(String uri) {
>>        uri = uri.trim();
>>        if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
>> @@ -268,27 +299,17 @@ public class Rover implements Tool {
>>
>>    private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
>>        final String FORMAT_OPTION = "f";
>> -        String format = DEFAULT_FORMAT;
>> +        String format = FORMATS[DEFAULT_FORMAT_INDEX];
>>        if (cl.hasOption(FORMAT_OPTION)) {
>> -            format = cl.getOptionValue(FORMAT_OPTION);
>> +            format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
>>        }
>> -        final TripleHandler outputHandler;
>> -        if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
>> -            outputHandler = new TurtleWriter(os);
>> -        } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
>> -            outputHandler = new NTriplesWriter(os);
>> -        } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
>> -            outputHandler = new RDFXMLWriter(os);
>> -        } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
>> -            outputHandler = new NQuadsWriter(os);
>> -        } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
>> -            outputHandler = new URIListWriter(os);
>> -        } else {
>> +        try {
>> +            return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
>> +        } catch (Exception e) {
>>            throw new IllegalArgumentException(
>>                    String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
>>            );
>>        }
>> -        return outputHandler;
>>    }
>>
>>    private TripleHandler decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
>> @@ -346,44 +367,54 @@ public class Rover implements Tool {
>>        return any23;
>>    }
>>
>> -    private void performExtraction(Any23 any23, ExtractionParameters eps, String documentURI, TripleHandler th) {
>> +    private void performExtraction(
>> +            Any23 any23, ExtractionParameters eps, DocumentSource documentSource, TripleHandler th
>> +    ) {
>>        try {
>> -            if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
>> -                throw new SpecificExitException("No suitable extractors found.", 2);
>> +            if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
>> +                throw new ExitCodeException("No suitable extractors found.", 2);
>>            }
>>        } catch (ExtractionException ex) {
>> -            throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
>> +            throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
>>        } catch (IOException ex) {
>> -            throw new SpecificExitException("Exception while producing output.", ex, 4);
>> +            throw new ExitCodeException("Exception while producing output.", ex, 4);
>>        }
>>    }
>>
>> -    private void closeHandler(TripleHandler th) {
>> -        if(th == null) return;
>> +    private void closeHandler() {
>> +        if(tripleHandler == null) return;
>>        try {
>> -            th.close();
>> +            tripleHandler.close();
>>        } catch (TripleHandlerException the) {
>> -            throw new SpecificExitException("Error while closing TripleHandler", the, 5);
>> +            throw new ExitCodeException("Error while closing TripleHandler", the, 5);
>>        }
>>    }
>>
>> -    private void closeAll(TripleHandler th, PrintStream os) {
>> -             closeHandler(th);
>> -            if(os != null) os.close();
>> +    private void closeStreams() {
>> +             closeHandler();
>> +            if(outputStream != null) outputStream.close();
>>    }
>>
>> -    private class SpecificExitException extends RuntimeException {
>> +    protected class ExitCodeException extends RuntimeException {
>>
>>        private final int exitCode;
>>
>> -        public SpecificExitException(String message, Throwable cause, int exitCode) {
>> +        public ExitCodeException(String message, Throwable cause, int exitCode) {
>>            super(message, cause);
>>            this.exitCode = exitCode;
>>        }
>> -        public SpecificExitException(String message, int exitCode) {
>> +        public ExitCodeException(String message, int exitCode) {
>>            super(message);
>>            this.exitCode = exitCode;
>>        }
>> +        public ExitCodeException(int exitCode) {
>> +            super();
>> +            this.exitCode = exitCode;
>> +        }
>> +
>> +        protected int getExitCode() {
>> +            return exitCode;
>> +        }
>>    }
>>
>> }
>>
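
Inline note on the hunk above: it renames SpecificExitException to ExitCodeException and adds a getExitCode() accessor. A minimal sketch of how a caller might turn that exception into a process exit status follows. It is only an illustration under assumptions: the exception is a static nested class here rather than the inner class in the diff, and ExitCodeSketch, extract() and run() are hypothetical names that do not appear in the commit. How the real CLI consumes getExitCode() is not visible in this part of the diff.

```java
// Sketch only: ExitCodeSketch, extract() and run() are hypothetical names.
class ExitCodeSketch {

    // Cut-down analogue of the ExitCodeException shown in the diff.
    static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        int getExitCode() {
            return exitCode;
        }
    }

    // Hypothetical stand-in for the extraction step; code 2 matches the diff.
    static void extract(boolean hasMatchingExtractors) {
        if (!hasMatchingExtractors) {
            throw new ExitCodeException("No suitable extractors found.", 2);
        }
    }

    // Maps a failure to its exit code; 0 means success.
    static int run(boolean hasMatchingExtractors) {
        try {
            extract(hasMatchingExtractors);
            return 0;
        } catch (ExitCodeException e) {
            System.err.println(e.getMessage());
            return e.getExitCode();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

Returning the code from run() instead of calling System.exit() inside the catch block keeps the mapping testable; a real entry point would pass the returned value to System.exit().
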
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java Tue Jan 10 16:32:28 2012
>> @@ -29,6 +29,13 @@ import java.util.Collection;
>> public interface ExtractorFactory<T extends Extractor<?>> extends ExtractorDescription {
>>
>>    /**
>> +     * Returns the extractor type.
>> +     *
>> +     * @return the not <code>null</code> extractor class.
>> +     */
>> +    Class<T> getExtractorType();
>> +
>> +    /**
>>     * Creates an extractor instance.
>>     *
>>     * @return an instance of the extractor associated to this factory.
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java Tue Jan 10 16:32:28 2012
>> @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
>> import org.deri.any23.extractor.rdf.NQuadsExtractor;
>> import org.deri.any23.extractor.rdf.NTriplesExtractor;
>> import org.deri.any23.extractor.rdf.RDFXMLExtractor;
>> +import org.deri.any23.extractor.rdf.TriXExtractor;
>> import org.deri.any23.extractor.rdf.TurtleExtractor;
>> import org.deri.any23.extractor.rdfa.RDFa11Extractor;
>> import org.deri.any23.extractor.rdfa.RDFaExtractor;
>> @@ -79,6 +80,7 @@ public class ExtractorRegistry {
>>                instance.register(TurtleExtractor.factory);
>>                instance.register(NTriplesExtractor.factory);
>>                instance.register(NQuadsExtractor.factory);
>> +                instance.register(TriXExtractor.factory);
>>                if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
>>                    instance.register(RDFa11Extractor.factory);
>>                } else {
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java Tue Jan 10 16:32:28 2012
>> @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
>>        return supportedMIMETypes;
>>    }
>>
>> +    @Override
>> +    public Class<T> getExtractorType() {
>> +        return extractorClass;
>> +    }
>> +
>>    /**
>>     * @return an instance of type T concrete implementation of {@link org.deri.any23.extractor.Extractor}
>>     */
>> +    @Override
>>    public T createExtractor() {
>>        try {
>>            return extractorClass.newInstance();
>> @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
>>    /**
>>     * @return an input example
>>     */
>> +    @Override
>>    public String getExampleInput() {
>>        return exampleInput;
>>    }
>>
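
Inline note on the ExtractorFactory / SimpleExtractorFactory hunks above: the new getExtractorType() simply exposes the class token the factory already holds. A self-contained sketch of that pattern is below. Extractor, CsvLikeExtractor and SimpleFactory are hypothetical stand-ins, not the Any23 types. The sketch also swaps Class.newInstance() (used in the diff, deprecated since Java 9) for the no-arg constructor lookup, which is a substitution on my part and not what the commit does.

```java
// Sketch only: every type here is a hypothetical stand-in for the Any23 classes.
class FactorySketch {

    interface Extractor { }

    static class CsvLikeExtractor implements Extractor { }

    static class SimpleFactory<T extends Extractor> {
        private final Class<T> extractorClass;

        SimpleFactory(Class<T> extractorClass) {
            this.extractorClass = extractorClass;
        }

        // Counterpart of the new getExtractorType(): returns the class token.
        Class<T> getExtractorType() {
            return extractorClass;
        }

        // Instantiates through the no-arg constructor instead of the
        // deprecated Class.newInstance() that appears in the diff.
        T createExtractor() {
            try {
                return extractorClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(
                        "Cannot instantiate " + extractorClass.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        SimpleFactory<CsvLikeExtractor> factory =
                new SimpleFactory<>(CsvLikeExtractor.class);
        System.out.println(factory.getExtractorType().getSimpleName());
        System.out.println(factory.createExtractor() != null);
    }
}
```
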
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
>>                    Arrays.asList(
>>                            "text/csv;q=0.1"
>>                    ),
>> -                    null,
>> +                    "example-csv.csv",
>>                    CSVExtractor.class
>>            );
>>
>> @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
>>    }
>>
>>    /**
>> +     * Check whether a number is an integer.
>> +     *
>> +     * @param number
>> +     * @return
>> +     */
>> +    private boolean isInteger(String number) {
>> +        try {
>> +            Integer.valueOf(number);
>> +            return true;
>> +        } catch (NumberFormatException e) {
>> +            return false;
>> +        }
>> +    }
>> +
>> +    /**
>> +     * Check whether a number is a float.
>> +     *
>>     * @param number
>>     * @return
>>     */
>> -    private boolean isNumber(String number) {
>> +    private boolean isFloat(String number) {
>>        try {
>> -            Double.valueOf(number);
>> +            Float.valueOf(number);
>>            return true;
>>        } catch (NumberFormatException e) {
>>            return false;
>> @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
>>            object = new URIImpl(cell);
>>        } else {
>>            URI datatype = XMLSchema.STRING;
>> -            if (isNumber(cell)) {
>> +            if (isInteger(cell)) {
>>                datatype = XMLSchema.INTEGER;
>> +            } else if(isFloat(cell)) {
>> +                datatype = XMLSchema.FLOAT;
>>            }
>>            object = new LiteralImpl(cell, datatype);
>>        }
>>
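
Inline note on the CSVExtractor hunk above: the cell datatype is now chosen by trying Integer.valueOf first and Float.valueOf second, falling back to string. The sketch below reproduces that order so the edge cases are easy to try out. CsvDatatypeSketch and the plain-string labels are hypothetical; the real code assigns the XMLSchema constants from Sesame. One consequence worth noting: a value such as "3000000000" overflows int, so Integer.valueOf rejects it and it falls through to the float branch, even though xsd:integer itself is unbounded.

```java
// Sketch only: mirrors the isInteger/isFloat checks from the diff.
// The class name and the string labels are hypothetical.
class CsvDatatypeSketch {

    static boolean isInteger(String cell) {
        try {
            Integer.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isFloat(String cell) {
        try {
            Float.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Integer is tested first, so "42" is never classified as a float.
    static String detectDatatype(String cell) {
        if (isInteger(cell)) {
            return "xsd:integer";
        }
        if (isFloat(cell)) {
            return "xsd:float";
        }
        return "xsd:string";
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "hello", "3000000000"}) {
            System.out.println(cell + " -> " + detectDatatype(cell));
        }
    }
}
```

If wide integers matter for the CSV inputs in practice, parsing with java.math.BigInteger before the float check would be one way to keep them typed as integers; that is a suggestion, not something this commit does.
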
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
>>                    "html-mf-adr",
>>                    PopularPrefixes.createSubset("rdf", "vcard"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-adr.html",
>>                    AdrExtractor.class
>>            );
>> }
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
>>                "html-mf-geo",
>>                PopularPrefixes.createSubset("rdf", "vcard"),
>>                Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                null,
>> +                "example-mf-geo.html",
>>                GeoExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
>>                    "html-mf-hcalendar",
>>                    PopularPrefixes.createSubset("rdf", "ical"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-hcalendar.html",
>>                    HCalendarExtractor.class);
>>
>>    private static final String[] Components = {"Vevent", "Vtodo", "Vjournal", "Vfreebusy"};
>> @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
>>    private boolean extractComponent(Node node, Resource cal, String component) throws ExtractionException {
>>        HTMLDocument compoNode = new HTMLDocument(node);
>>        BNode evt = valueFactory.createBNode();
>> -        addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
>> +        addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
>>        addTextProps(compoNode, evt);
>>        addUrl(compoNode, evt);
>>        addRRule(compoNode, evt);
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
>>                    "html-mf-hcard",
>>                    PopularPrefixes.createSubset("rdf", "vcard"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-hcard.html",
>>                    HCardExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -82,7 +82,7 @@ public class HListingExtractor extends E
>>                    "html-mf-hlisting",
>>                    PopularPrefixes.createSubset("rdf", "hlisting"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-hlisting.html",
>>                    HListingExtractor.class
>>            );
>>
>> @@ -106,7 +106,7 @@ public class HListingExtractor extends E
>>        out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>>
>>        for (String action : findActions(fragment)) {
>> -            out.writeTriple(listing, hLISTING.action, hLISTING.getResource(action));
>> +            out.writeTriple(listing, hLISTING.action, hLISTING.getClass(action));
>>        }
>>        out.writeTriple(listing, hLISTING.lister, addLister() );
>>        addItem(listing);
>> @@ -154,7 +154,7 @@ public class HListingExtractor extends E
>>                    String value = node.getNodeValue();
>>                    // do not use conditionallyAdd, it won't work cause of evaluation rules
>>                    if (!(null == value || "".equals(value))) {
>> -                        URI property = hLISTING.getPropertyCamelized(klass);
>> +                        URI property = hLISTING.getPropertyCamelCase(klass);
>>                        conditionallyAddLiteralProperty(
>>                                node,
>>                                blankItem, property, valueFactory.createLiteral(value)
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
>>                    "html-mf-hrecipe",
>>                    PopularPrefixes.createSubset("rdf", "hrecipe"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-hrecipe.html",
>>                    HRecipeExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
>>                    "html-mf-hresume",
>>                    PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-hresume.html",
>>                    HResumeExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
>>                    "html-mf-hreview",
>>                    PopularPrefixes.createSubset("rdf", "vcard", "rev"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-hreview.html",
>>                    HReviewExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
>>                    "html-head-links",
>>                    PopularPrefixes.createSubset("xhtml", "dcterms"),
>>                    Arrays.asList("text/html;q=0.05", "application/xhtml+xml;q=0.05"),
>> -                    null,
>> +                    "example-head-link.html",
>>                    HeadLinkExtractor.class);
>> }
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
>>                    "html-head-icbm",
>>                    PopularPrefixes.createSubset("geo", "rdf"),
>>                    Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
>> -                    null,
>> +                    "example-icbm.html",
>>                    ICBMExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -51,7 +51,7 @@ public class LicenseExtractor implements
>>                    "html-mf-license",
>>                    PopularPrefixes.createSubset("xhtml"),
>>                    Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
>> -                    null,
>> +                    "example-mf-license.html",
>>                    LicenseExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
>>                    "html-mf-species",
>>                    PopularPrefixes.createSubset("rdf", "wo"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-mf-species.html",
>>                    SpeciesExtractor.class
>>            );
>>
>> @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>>
>>    private URI resolveClassName(String clazz) {
>>        String upperCaseClass = clazz.substring(0, 1);
>> -        return vWO.getResource(
>> +        return vWO.getClass(
>>                String.format("%s%s",
>>                        upperCaseClass.toUpperCase(),
>>                        clazz.substring(1)
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
>>                    NAME,
>>                    PopularPrefixes.get(),
>>                    Arrays.asList("text/html;q=0.02", "application/xhtml+xml;q=0.02"),
>> -                    null,
>> +                    "example-script-turtle.html",
>>                    TurtleHTMLExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
>>                "html-mf-xfn",
>>                PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
>>                Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                null,
>> +                "example-mf-xfn.html",
>>                XFNExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
>>                    "html-microdata",
>>                    PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>>                    Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> -                    null,
>> +                    "example-microdata.html",
>>                    MicrodataExtractor.class
>>            );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java Tue Jan 10 16:32:28 2012
>> @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
>> import org.deri.any23.extractor.ErrorReporter;
>> import org.deri.any23.extractor.ExtractionContext;
>> import org.deri.any23.extractor.ExtractionResult;
>> -import org.deri.any23.parser.NQuadsParser;
>> +import org.deri.any23.io.nquads.NQuadsParser;
>> import org.deri.any23.rdf.Any23ValueFactoryWrapper;
>> import org.openrdf.model.impl.ValueFactoryImpl;
>> import org.openrdf.rio.ParseErrorListener;
>> @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
>> import org.openrdf.rio.RDFParser;
>> import org.openrdf.rio.ntriples.NTriplesParser;
>> import org.openrdf.rio.rdfxml.RDFXMLParser;
>> +import org.openrdf.rio.trix.TriXParser;
>> import org.openrdf.rio.turtle.TurtleParser;
>> import org.slf4j.Logger;
>> import org.slf4j.LoggerFactory;
>> @@ -38,7 +39,7 @@ import java.io.Reader;
>>
>> /**
>> * This factory provides a common logic for creating and configuring correctly
>> - * any RDF parser used within the library.
>> + * any <i>RDF</i> parser used within the library.
>> *
>> * @author Michele Mostarda (mostarda@fbk.eu)
>> */
>> @@ -119,7 +120,7 @@ public class RDFParserFactory {
>>    }
>>
>>    /**
>> -     * Returns a new instance of a configured {@link org.deri.any23.parser.NQuadsParser}.
>> +     * Returns a new instance of a configured {@link org.deri.any23.io.nquads.NQuadsParser}.
>>     *
>>     * @param verifyDataType data verification enable if <code>true</code>.
>>     * @param stopAtFirstError the parser stops at first error if <code>true</code>.
>> @@ -139,6 +140,26 @@ public class RDFParserFactory {
>>    }
>>
>>    /**
>> +     * Returns a new instance of a configured {@link TriXParser}.
>> +     *
>> +     * @param verifyDataType data verification enable if <code>true</code>.
>> +     * @param stopAtFirstError the parser stops at first error if <code>true</code>.
>> +     * @param extractionContext the extraction context where the parser is used.
>> +     * @param extractionResult the output extraction result.
>> +     * @return a new instance of a configured TriX parser.
>> +     */
>> +    public TriXParser getTriXParser(
>> +            final boolean verifyDataType,
>> +            final boolean stopAtFirstError,
>> +            final ExtractionContext extractionContext,
>> +            final ExtractionResult extractionResult
>> +    ) {
>> +        final TriXParser parser = new TriXParser();
>> +        configureParser(parser, verifyDataType, stopAtFirstError, extractionContext, extractionResult);
>> +        return parser;
>> +    }
>> +
>> +    /**
>>     * Configures the given parser on the specified extraction result
>>     * setting the policies for data verification and error handling.
>>     *
>>
>>


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./ any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/ any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/or

Posted by Simone Tripodi <si...@apache.org>.
Hi Mic,

happy new year to you too indeed :P

Please shout if you need any help on reorganizing stuff, I would be
more than glad to provide my help!

TIA!
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Tue, Jan 10, 2012 at 6:13 PM, Michele Mostarda
<mi...@gmail.com> wrote:
> On 10 January 2012 18:08, Simone Tripodi <si...@apache.org> wrote:
>
>> Hi Mic,
>>
>
> Hi Simo, happy new year !
>
>> this is something great, thanks for the hard work of merging!
>> next step is renaming the packages in org.apache.any23 :)
>>
>
> Sure :) It is the next critical issue scheduled on Jira.
> Then we can start discussing the release.
>
> Ciao
>
> Mic
>
>
>>
>> All the best, have a nice day!
>> -Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://simonetripodi.livejournal.com/
>> http://twitter.com/simonetripodi
>> http://www.99soft.org/
>>
>>
>>
>> On Tue, Jan 10, 2012 at 5:32 PM,  <mo...@apache.org> wrote:
>> > Author: mostarda
>> > Date: Tue Jan 10 16:32:28 2012
>> > New Revision: 1229627
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
>> > Log:
>> > This commit synchronizes the retired Any23 Google Code SVN repo [1]
>> > with the current Apache Any23 SVN repo, including the fixes
>> > developed during the initial import transition phase.
>> > Those issues were tracked on the original Any23 Google Code Issue Tracker [2].
>> > Below is an extract of the original repository commit log.
>> >
>> > This commit is related to issue ANY23-27.
>> >
>> > [1] http://any23.googlecode.com/svn/trunk/
>> > [2] http://code.google.com/p/any23/issues/list
>> >
>> > ==== BEGIN: Original Log ====
>> >
>> > ------------------------------------------------------------------------
>> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>> >
>> > Improved numeric datatype assignment. This commit fixes issue #208.
>> > ------------------------------------------------------------------------
>> > r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'.
>> > Fixed HTMLMetaExtractorTest.java to match this new namespace.
>> > Discovered and fixed an issue in the SINDICE.java vocabulary: NS was
>> > declared as a resource instead of as a URI. Fixed RDFSchemaUtilsTest.java,
>> > whose expected sizes were wrong due to the wrong NS declaration.
>> > This commit is related to issue #203.
>> > ------------------------------------------------------------------------
>> > r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Improved glossary in Vocab.java, replaced 'Resource' with 'Class'.
>> > Found a wrong declaration of Class(Resource) in the WO.java vocab.
>> > Fixed and updated the RDFSchemaUtils.java test.
>> > This commit is related to issue #198.
>> > ------------------------------------------------------------------------
>> > r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Added utility method.
>> > ------------------------------------------------------------------------
>> > r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Improved Vocabulary.java class: added support for comments to any resource.
>> > Improved RDFSchemaUtils.java serialization support, added separators to
>> > RDFXML serialization. This commit is related to issue #198.
>> > ------------------------------------------------------------------------
>> > r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) |
>> 1 line
>> >
>> > Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved
>> > prefix declaration parsing in RDFa11Parser; the new parser is more
>> > tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for
>> > prefix mapping resolution in RDFa11Parser, which allows correct support
>> > for the structured properties introduced by the latest version of the
>> > Open Graph Protocol (http://ogp.me/#structured). Updated
>> > RDFSchemaUtilsTest to the new output of vocabularies serialization.
>> > Updated Any23PluginManagerTest to include a new class. This commit is
>> > related to issue #206.
>> > ------------------------------------------------------------------------
>> > r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) |
>> 1 line
>> >
>> > Restricted scope of testGetClassesFromClasspath to avoid updating it
>> every time a new class is added.
>> > ------------------------------------------------------------------------
>> > r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Improved validation mode support. Improved descriptions of Validation
>> and Report fields. This commit is related to issue
>> > #209.
>> > ------------------------------------------------------------------------
>> > r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Improved Any23 Service XML Report format documentation.
>> > ------------------------------------------------------------------------
>> > r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Added URL encoding to the source location path. This commit fixes issue
>> > #205. Chose not to write a formal test, as it would require the creation
>> > of folders with spaces.
>> > ------------------------------------------------------------------------
>> > r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Removed obsolete section.
>> > ------------------------------------------------------------------------
>> > r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) |
>> 1 line
>> >
>> > Improved Any23 facade, added method createDocumentSource() to simplify
>> the extraction setup.
>> > ------------------------------------------------------------------------
>> > r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) |
>> 1 line
>> >
>> > Refactored Rover CLI class to make it extensible by other CLI
>> > implementations.
>> > ------------------------------------------------------------------------
>> > r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) |
>> 3 lines
>> >
>> > Removed wrong artifact name.
>> >
>> >
>> > ------------------------------------------------------------------------
>> > r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Removed no longer used jspf lib. Added crawler4j dependencies. Added
>> README. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Changed attributes visibility to facilitate the class extensibility.
>> > ------------------------------------------------------------------------
>> > r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Added helper methods to extract file lines as list of strings. Improved
>> javadoc.
>> > ------------------------------------------------------------------------
>> > r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Added first version of basic-crawler plugin. This commit is related to
>> issue #211.
>> > ------------------------------------------------------------------------
>> > r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Added plugins README.
>> > ------------------------------------------------------------------------
>> > r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Updated main README, added references to plugin and lib.
>> > ------------------------------------------------------------------------
>> > r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Fixed assembly name.
>> > ------------------------------------------------------------------------
>> > r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Fixed Tool signature. This commit is related to #211.
>> > ------------------------------------------------------------------------
>> > r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Improved logging.
>> > ------------------------------------------------------------------------
>> > r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Included plugin basic-crawler in reactor. Improved ToolRunner and
>> Any23PluginManager tests to be compliant to the new
>> > plugin classes. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Fixed Crawler4j group id. Related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Improved plugin documentation. Introduced Office Scraper specific page.
>> This commit is related to issue #213.
>> > ------------------------------------------------------------------------
>> > r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed POST method documentation. Related to issue #213.
>> > ------------------------------------------------------------------------
>> > r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed code snippets, prettified, added missing finalization logic. See
>> issue #187.
>> > ------------------------------------------------------------------------
>> > r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed var name. See #187.
>> > ------------------------------------------------------------------------
>> > r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Updated code snippets and tutorial, added explicit TripleHandler
>> closure. This commit is related to issue #187.
>> > ------------------------------------------------------------------------
>> > r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed data type handling management in NQuadsParser. This commit is
>> related to issue #210.
>> > ------------------------------------------------------------------------
>> > r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Added missing JSON output format. See #214.
>> > ------------------------------------------------------------------------
>> > r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output
>> format support to Rover. This commit is related to
>> > issue #215.
>> > ------------------------------------------------------------------------
>> > r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Added Sesame TriX IO dependency. This commit is related to #215.
>> > ------------------------------------------------------------------------
>> > r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Some suppressed tests have been reactivated as Ignored.
>> > ------------------------------------------------------------------------
>> > r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Added TriX output format to the Any23 Service. Commit related to issue
>> #215.
>> > ------------------------------------------------------------------------
>> > r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Improved FormatWriter management, added WriterRegistry. Improved Writer
>> format management in Rover and WebResponder.
>> > This commit is related to issues #215 and #216.
>> > ------------------------------------------------------------------------
>> > r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) |
>> 6 lines
>> >
>> > Added TriXExtractor and textual example (example-trix.trx), added trix
>> support in RDFParserFactory.
>> > Registered TriXExtractor to the ExtractorRegistry.
>> > Added TriX mimetype support in TikaMIMETypeDetector (through
>> mimetypes.xml) and added specific test.
>> > Added support and doc to TriX format in Any23 Service web page
>> (form.html).
>> > This commit is related to issue #215.
>> >
>> > ------------------------------------------------------------------------
>> > r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) |
>> 1 line
>> >
>> > Fixed number of extractors (+1 after adding TriXExtractor). Commit
>> related to issue #215.
>> > ------------------------------------------------------------------------
>> > r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Added method getExtractorType().
>> > ------------------------------------------------------------------------
>> > r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) |
>> 4 lines
>> >
>> > Improved ExtractorDocumentation support, added missing format examples.
>> > Improved output layout. This commit is related to issue #194.
>> >
>> >
>> > ------------------------------------------------------------------------
>> > r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Improved classpath management in Any23PluginManager. Renamed
>> > getClasses* to loadClasses*. This commit is related to issue #212.
>> > ------------------------------------------------------------------------
>> > r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Separated log messages from specific output data.
>> > ------------------------------------------------------------------------
>> > r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Added human readable report printing support in ReportingTripleHandler
>> and Rover.
>> > ------------------------------------------------------------------------
>> > r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Fixed major issue in output generation, added final activity report,
>> help prettification. This commit is related to
>> > issue #211.
>> > ------------------------------------------------------------------------
>> > r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Upgraded to Sesame 2.6.1. See issue #217.
>> > ------------------------------------------------------------------------
>> > r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils. See issue
>> > #216.
>> > ------------------------------------------------------------------------
>> > r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
>> > ------------------------------------------------------------------------
>> > r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) |
>> 1 line
>> >
>> > Added specific Crawler CLI documentation. Updated general CLI
>> documentation. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) |
>> 4 lines
>> >
>> > The Eval CLI Tool has been removed as well as the org.deri.any23.eval
>> package classes related to it.
>> > Updated tests verifying CLI tool detection.
>> > This commit is related to issue #218.
>> >
>> > ------------------------------------------------------------------------
>> > r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) |
>> 5 lines
>> >
>> > Added MimeDetector CLI Tool and test case, removed main() from
>> > TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
>> > Updated CLI doc.
>> > This commit is related to issue #219.
>> >
>> > ------------------------------------------------------------------------
>> > r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) |
>> 1 line
>> >
>> > Added support for comment serialization. Related to issue #158.
>> > ------------------------------------------------------------------------
>> > r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) |
>> 1 line
>> >
>> > Add support for annotation writing in FormatWriter implementations. This
>> commit is related to issue #158.
>> > ------------------------------------------------------------------------
>> > r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) |
>> 1 line
>> >
>> > Added support for 'annotate' flag in Any23 Service.
>> > ------------------------------------------------------------------------
>> >
>> > ==== END  : Original Log ====
>> >
>> >
>> > Added:
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
>> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
>> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
>> >    incubator/any23/trunk/any23-core/src/test/resources/application/trix/
>> >
>>  incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
>> >
>>  incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
>> >
>>  incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
>> >    incubator/any23/trunk/lib/README.txt
>> >    incubator/any23/trunk/plugins/README.txt
>> >    incubator/any23/trunk/plugins/basic-crawler/
>> >    incubator/any23/trunk/plugins/basic-crawler/pom.xml
>> >    incubator/any23/trunk/plugins/basic-crawler/src/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
>> >    incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
>> > Removed:
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
>> > Modified:
>> >    incubator/any23/trunk/README.txt
>> >    incubator/any23/trunk/any23-core/bin/any23
>> >    incubator/any23/trunk/any23-core/bin/any23tools
>> >    incubator/any23/trunk/any23-core/pom.xml
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
>> >
>>  incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
>> >
>>  incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
>> >
>>  incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
>> >
>>  incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
>> >    incubator/any23/trunk/lib/install-deps.sh
>> >
>>  incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
>> >    incubator/any23/trunk/pom.xml
>> >    incubator/any23/trunk/src/site/apt/any23-plugins.apt
>> >    incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
>> >    incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
>> >    incubator/any23/trunk/src/site/apt/getting-started.apt
>> >    incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
>> >    incubator/any23/trunk/src/site/apt/service.apt
>> >    incubator/any23/trunk/src/site/apt/supported-formats.apt
>> >
>> > Modified: incubator/any23/trunk/README.txt
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > --- incubator/any23/trunk/README.txt (original)
>> > +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
>> > @@ -20,7 +20,8 @@ Distribution Content
>> >
>> >  any23-core           The library core codebase.
>> >  any23-service        The library HTTP service codebase.
>> > -plugins              Library plugins codebase.
>> > +lib                  Contains the Any23 the external deps (read
>> lib/README.txt for further details).
>> > +plugins              Library plugins codebase (read plugins/README.txt
>> for further details).
>> >  RELEASE-NOTES.txt    File reporting main release notes for every
>> version.
>> >  LICENSE.txt          Applicable project license.
>> >  README.txt           This file.
>> > @@ -240,15 +241,14 @@ Upload the produced packages in download
>> >
>> >    http://code.google.com/p/any23/downloads/list
>> >
>> > +--------------------
>> > +Manage External Deps
>> > +--------------------
>> >
>> > -Fix Release Procedure
>> > ----------------------
>> > -
>> > -   Currently the *plugins/integration-test* module is excluded from the parent
>> > -   reactor.
>> > -   To fix it in tag follow procedure as described at issue #171:
>> > -
>> > -        http://code.google.com/p/any23/issues/detail?id=171
>> > +::Developers interest only.::
>> >
>> > +External Deps are libraries used by some Any23 modules which are
>> > +not available in public Maven repositories. Such libraries are
>> > +managed within the 'lib' dir.
>> >
>> >  EOF
>> >
>> > Modified: incubator/any23/trunk/any23-core/bin/any23
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/bin/any23 (original)
>> > +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
>> > @@ -9,12 +9,12 @@
>> >  ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>> >
>> >  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> > -    echo "Generating executable JAR..."
>> > -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    echo "Generating executable JAR..." >&2
>> > +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >         ||\
>> > -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >        ||\
>> > -    { echo "Error while generating commandline assembly."; exit 1; }
>> > +    { echo "Error while generating commandline assembly."  >&2; exit 1; }
>> >  fi
>> >
>> >  SEP=':'
>> >
>> > Modified: incubator/any23/trunk/any23-core/bin/any23tools
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/bin/any23tools (original)
>> > +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
>> > @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
>> >  PLUGINS_DIR=plugins
>> >
>> >  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> > -    echo "Generating executable JAR..."
>> > -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    echo "Generating executable JAR..." >&2
>> > +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >         ||\
>> > -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >        ||\
>> > -    { echo "Error while generating commandline assembly."; exit 1; }
>> > +    { echo "Error while generating commandline assembly." >&2; exit 1; }
>> >  fi
>> >
>> >  SEP=':'
>> > @@ -30,6 +30,7 @@ done
>> >  # Plugins classpath.
>> >  for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
>> >  do
>> > +  echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
>> >   if [ ! -e "$jar" ]; then continue; fi
>> >   CP="$CP$SEP$jar"
>> >  done
>> >
>> > Modified: incubator/any23/trunk/any23-core/pom.xml
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/pom.xml (original)
>> > +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
>> > @@ -92,6 +92,10 @@
>> >         </dependency>
>> >         <dependency>
>> >             <groupId>org.openrdf.sesame</groupId>
>> > +            <artifactId>sesame-rio-trix</artifactId>
>> > +        </dependency>
>> > +        <dependency>
>> > +            <groupId>org.openrdf.sesame</groupId>
>> >             <artifactId>sesame-repository-sail</artifactId>
>> >         </dependency>
>> >         <dependency>
>> >
>> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
>> > @@ -258,6 +258,28 @@ public class Any23 {
>> >     }
>> >
>> >     /**
>> > +     * Returns the most appropriate {@link DocumentSource} for the given<code>documentURI</code>.
>> > +     *
>> > +     * @param documentURI the document <i>URI</i>.
>> > +     * @return a new instance of DocumentSource.
>> > +     * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
>> > +     * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
>> > +     */
>> > +    public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
>> > +        if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
>> > +        if (documentURI.toLowerCase().startsWith("file:")) {
>> > +            return new FileDocumentSource( new File(new URI(documentURI)) );
>> > +        }
>> > +        if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> > +            return new HTTPDocumentSource(getHTTPClient(), documentURI);
>> > +        }
>> > +        throw new IllegalArgumentException(
>> > +                String.format("Unsupported protocol for document URI: '%s' .", documentURI)
>> > +        );
>> > +    }
>> > +
>> > +
>> > +    /**
>> >      * Performs metadata extraction from the content of the given
>> >      * <code>in</code> document source, sending the generated events
>> >      * to the specified <code>outputHandler</code>.
>> > @@ -363,13 +385,7 @@ public class Any23 {
>> >     public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
>> >     throws IOException, ExtractionException {
>> >         try {
>> > -            if (documentURI.toLowerCase().startsWith("file:")) {
>> > -                return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
>> > -            }
>> > -            if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> > -                return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
>> > -            }
>> > -            throw new ExtractionException("Not a valid absolute URI: " + documentURI);
>> > +            return extract(eps, createDocumentSource(documentURI), outputHandler);
>> >         } catch (URISyntaxException ex) {
>> >             throw new ExtractionException("Error while extracting data from document URI.", ex);
>> >         }
>> >
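
Note on the Any23.java change above: the new createDocumentSource() pulls the scheme check out of extract(), so callers such as Rover can obtain a DocumentSource directly. A minimal JDK-only sketch of that dispatch is shown here. The class name DocumentSourceDispatch, the SourceKind enum and the classify() method are hypothetical and exist only for illustration; the committed method returns FileDocumentSource and HTTPDocumentSource instances instead of an enum value.

```java
import java.util.Locale;

public class DocumentSourceDispatch {

    /** Hypothetical stand-in for the concrete DocumentSource implementations. */
    enum SourceKind { FILE, HTTP }

    /**
     * Chooses a source kind from the URI scheme, mirroring the order of checks
     * in the diff: null check first, then "file:", then "http:" / "https:".
     */
    static SourceKind classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // Locale.ROOT is an addition of this sketch; the diff calls toLowerCase() without a locale.
        final String lower = documentURI.toLowerCase(Locale.ROOT);
        if (lower.startsWith("file:")) {
            return SourceKind.FILE;
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return SourceKind.HTTP;
        }
        // Anything else is rejected with an unchecked exception, as in the new method.
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s'.", documentURI));
    }

    public static void main(String[] args) {
        System.out.println(classify("file:///tmp/page.html"));
        System.out.println(classify("HTTPS://example.org/"));
    }
}
```

One visible consequence in the diff: an unsupported scheme used to raise ExtractionException from extract(), whereas the refactored path raises IllegalArgumentException, which the remaining catch clause for URISyntaxException does not intercept.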
>> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
>> > @@ -16,7 +16,7 @@
>> >
>> >  package org.deri.any23.cli;
>> >
>> > -import org.deri.any23.LogUtil;
>> > +import org.deri.any23.util.LogUtils;
>> >  import org.deri.any23.extractor.ExampleInputOutput;
>> >  import org.deri.any23.extractor.ExtractionException;
>> >  import org.deri.any23.extractor.Extractor;
>> > @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
>> >     }
>> >
>> >     public int run(String[] args) {
>> > -        LogUtil.setDefaultLogging();
>> > +        LogUtils.setDefaultLogging();
>> >         try {
>> >             if (args.length == 0) {
>> >                 printUsage();
>> > @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
>> >      * Prints the list of all the available extractors.
>> >      */
>> >     public void printExtractorList() {
>> > -        for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
>> > -            System.out.println(extractorName);
>> > +        for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
>> > +            System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
>> >         }
>> >     }
>> >
>> > @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
>> >             ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
>> >             ExampleInputOutput example = new ExampleInputOutput(factory);
>> >             System.out.println("Extractor: " + extractorName);
>> > -            System.out.println("  type: " + getType(factory));
>> > -            String output = example.getExampleOutput();
>> > -            if (output == null) {
>> > -                System.out.println("(no example output)");
>> > +            System.out.println("\ttype: " + getType(factory));
>> > +            System.out.println();
>> > +            final String exampleInput = example.getExampleInput();
>> > +            if(exampleInput == null) {
>> > +                System.out.println("(No Example Available)");
>> >             } else {
>> > -                System.out.println("-------- example output --------");
>> > -                System.out.println(output);
>> > +                System.out.println("-------- Example Input  --------");
>> > +                System.out.println(exampleInput);
>> > +                System.out.println("-------- Example Output --------");
>> > +                String output = example.getExampleOutput();
>> > +                System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
>> >             }
>> > -            System.out.println();
>> >             System.out.println("================================");
>> > +            System.out.println();
>> >         }
>> >     }
>> >
>> >
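
Note on the printExtractorList() change above: the new format string "%25s [%15s]" right-aligns the extractor name in a 25-character column and the extractor type in a 15-character column. A small sketch of the resulting layout follows; the class name ExtractorListFormat, the formatRow() helper and the sample values are hypothetical placeholders, not real Any23 extractor names or types.

```java
public class ExtractorListFormat {

    /** Formats one row with the same format string used in the diff. */
    static String formatRow(String name, String type) {
        // %25s and %15s pad on the left, so both columns are right-aligned.
        return String.format("%25s [%15s]", name, type);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(formatRow("example-extractor", "ExampleType"));
        System.out.println(formatRow("other", "Other"));
    }
}
```

Values longer than the column width are not truncated by these specifiers, so an unusually long name would push its row out of alignment.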
>> > Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
>> > @@ -0,0 +1,113 @@
>> > +/*
>> > + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
>> > + *
>> > + * Licensed under the Apache License, Version 2.0 (the "License");
>> > + * you may not use this file except in compliance with the License.
>> > + * You may obtain a copy of the License at
>> > + *
>> > + *          http://www.apache.org/licenses/LICENSE-2.0
>> > + *
>> > + * Unless required by applicable law or agreed to in writing, software
>> > + * distributed under the License is distributed on an "AS IS" BASIS,
>> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> > + * See the License for the specific language governing permissions and
>> > + * limitations under the License.
>> > + */
>> > +
>> > +package org.deri.any23.cli;
>> > +
>> > +import org.deri.any23.configuration.DefaultConfiguration;
>> > +import org.deri.any23.http.DefaultHTTPClient;
>> > +import org.deri.any23.http.HTTPClient;
>> > +import org.deri.any23.http.HTTPClientConfiguration;
>> > +import org.deri.any23.mime.MIMEType;
>> > +import org.deri.any23.mime.MIMETypeDetector;
>> > +import org.deri.any23.mime.TikaMIMETypeDetector;
>> > +import org.deri.any23.source.DocumentSource;
>> > +import org.deri.any23.source.FileDocumentSource;
>> > +import org.deri.any23.source.HTTPDocumentSource;
>> > +import org.deri.any23.source.StringDocumentSource;
>> > +
>> > +import java.io.File;
>> > +import java.net.URISyntaxException;
>> > +
>> > +/**
>> > + * Commandline tool to detect <b>MIME Type</b>s from
>> > + * file, HTTP and direct input sources.
>> > + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
>> > + *
>> > + * @author Michele Mostarda (mostarda@fbk.eu)
>> > + */
>> > +@ToolRunner.Description("MIME Type Detector Tool.")
>> > +public class MimeDetector implements Tool{
>> > +
>> > +    public static final String FILE_DOCUMENT_PREFIX   = "file://";
>> > +    public static final String INLINE_DOCUMENT_PREFIX = "inline://";
>> > +    public static final String URL_DOCUMENT_RE        = "^https?://.*";
>> > +
>> > +    public static void main(String[] args) {
>> > +        System.exit( new MimeDetector().run(args) );
>> > +    }
>> > +
>> > +    @Override
>> > +    public int run(String[] args) {
>> > +          if(args.length != 1) {
>> > +            System.err.println("USAGE: { http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
>> > +            return 1;
>> > +        }
>> > +
>> > +        final String document = args[0];
>> > +        try {
>> > +            final DocumentSource documentSource = createDocumentSource(document);
>> > +            final MIMETypeDetector detector = new TikaMIMETypeDetector();
>> > +            final MIMEType mimeType = detector.guessMIMEType(
>> > +                    documentSource.getDocumentURI(),
>> > +                    documentSource.openInputStream(),
>> > +                    MIMEType.parse(documentSource.getContentType())
>> > +            );
>> > +            System.out.println(mimeType);
>> > +            return 0;
>> > +        } catch (Exception e) {
>> > +            System.err.print("Error while detecting MIME Type.");
>> > +            e.printStackTrace(System.err);
>> > +            return 1;
>> > +        }
>> > +    }
>> > +
>> > +    private DocumentSource createDocumentSource(String document) throws URISyntaxException {
>> > +        if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
>> > +            return new FileDocumentSource(
>> > +                    new File(
>> > +                            document.substring(FILE_DOCUMENT_PREFIX.length())
>> > +                    )
>> > +            );
>> > +        }
>> > +        if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
>> > +            return new StringDocumentSource(
>> > +                    document.substring(INLINE_DOCUMENT_PREFIX.length()),
>> > +                    ""
>> > +            );
>> > +        }
>> > +        if(document.matches(URL_DOCUMENT_RE)) {
>> > +            final HTTPClient client = new DefaultHTTPClient();
>> > +            // TODO: anonymous config class also used in Any23. centralize.
>> > +            client.init(new HTTPClientConfiguration() {
>> > +                public String getUserAgent() {
>> > +                    return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
>> > +                }
>> > +                public String getAcceptHeader() {
>> > +                    return "";
>> > +                }
>> > +                public int getDefaultTimeout() {
>> > +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
>> > +                }
>> > +                public int getMaxConnections() {
>> > +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
>> > +                }
>> > +            });
>> > +            return new HTTPDocumentSource(client, document);
>> > +        }
>> > +        throw new IllegalArgumentException("Unsupported protocol for document " + document);
>> > +    }
>> > +
>> > +}
>> >
>> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
>> > @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
>> >  import org.apache.commons.cli.Options;
>> >  import org.apache.commons.cli.PosixParser;
>> >  import org.deri.any23.Any23;
>> > -import org.deri.any23.LogUtil;
>> > +import org.deri.any23.util.LogUtils;
>> >  import org.deri.any23.configuration.Configuration;
>> >  import org.deri.any23.configuration.DefaultConfiguration;
>> >  import org.deri.any23.extractor.ExtractionException;
>> > @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
>> >  import org.deri.any23.extractor.SingleDocumentExtraction;
>> >  import org.deri.any23.filter.IgnoreAccidentalRDFa;
>> >  import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
>> > +import org.deri.any23.source.DocumentSource;
>> >  import org.deri.any23.writer.BenchmarkTripleHandler;
>> >  import org.deri.any23.writer.LoggingTripleHandler;
>> > -import org.deri.any23.writer.NQuadsWriter;
>> > -import org.deri.any23.writer.NTriplesWriter;
>> > -import org.deri.any23.writer.RDFXMLWriter;
>> >  import org.deri.any23.writer.ReportingTripleHandler;
>> >  import org.deri.any23.writer.TripleHandler;
>> >  import org.deri.any23.writer.TripleHandlerException;
>> > -import org.deri.any23.writer.TurtleWriter;
>> > -import org.deri.any23.writer.URIListWriter;
>> > +import org.deri.any23.writer.WriterRegistry;
>> >  import org.slf4j.Logger;
>> >  import org.slf4j.LoggerFactory;
>> >
>> > @@ -51,6 +48,7 @@ import java.io.OutputStream;
>> >  import java.io.PrintStream;
>> >  import java.io.PrintWriter;
>> >  import java.net.MalformedURLException;
>> > +import java.net.URISyntaxException;
>> >  import java.net.URL;
>> >
>> >  import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
>> > @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
>> >  * A default rover implementation. Goes and fetches a URL using an hint
>> >  * as to what format should require, then tries to convert it to RDF.
>> >  *
>> > - * @author Gabriele Renzi
>> > - * @author Richard Cyganiak (richard@cyganiak.de)
>> >  * @author Michele Mostarda (mostarda@fbk.eu)
>> > + * @author Richard Cyganiak (richard@cyganiak.de)
>> > + * @author Gabriele Renzi
>> >  */
>> >  @ToolRunner.Description("Any23 Command Line Tool.")
>> >  public class Rover implements Tool {
>> >
>> > -    // Supported formats.
>> > -    private static final String TURTLE_FORMAT  = "turtle";
>> > -    private static final String NTRIPLE_FORMAT = "ntriples";
>> > -    private static final String RDFXML_FORMAT  = "rdfxml";
>> > -    private static final String NQUADS_FORMAT  = "nquads";
>> > -    private static final String URIS_FORMAT    = "uris";
>> > -
>> > -    private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
>> > +    private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
>> > +    private static final int DEFAULT_FORMAT_INDEX = 0;
>> >
>> >     private static final Logger logger = LoggerFactory.getLogger(Rover.class);
>> >
>> > -    private static Options options;
>> > +    private Options options;
>> >
>> > -    public static void main(String[] args) {
>> > -        System.exit( new Rover().run(args) );
>> > -    }
>> > +    private CommandLine commandLine;
>> >
>> > -    public int run(String[] args) {
>> > -        final CommandLineParser parser = new PosixParser();
>> > -        final CommandLine commandLine;
>> > +    private boolean verbose = false;
>> >
>> > -        boolean verbose = false;
>> > -        try {
>> > -            options = createOptions();
>> > -            commandLine = parser.parse(options, args);
>> > +    private PrintStream outputStream;
>> > +    private TripleHandler tripleHandler;
>> > +    private ReportingTripleHandler reportingTripleHandler;
>> > +    private BenchmarkTripleHandler benchmarkTripleHandler;
>> >
>> > -            if (commandLine.hasOption("h")) {
>> > -                printHelp();
>> > -                return 0;
>> > -            }
>> > +    private ExtractionParameters eps;
>> > +    private Any23 any23;
>> >
>> > -            if (commandLine.hasOption('v')) {
>> > -                verbose = true;
>> > -                LogUtil.setVerboseLogging();
>> > -            } else {
>> > -                LogUtil.setDefaultLogging();
>> > -            }
>> > -
>> > -            if (commandLine.getArgs().length < 1) {
>> > -                printHelp();
>> > -                throw new IllegalArgumentException("Expected at least 1 argument.");
>> > -            }
>> > +    protected boolean isVerbose() {
>> > +        return verbose;
>> > +    }
>> >
>> > -            final String[] inputURIs      = argumentsToURIs(commandLine.getArgs());
>> > -            final String[] extractorNames = getExtractors(commandLine);
>> > +    public static void main(String[] args) {
>> > +        System.exit( new Rover().run(args) );
>> > +    }
>> >
>> > -            PrintStream outputStream    = null;
>> > -            TripleHandler tripleHandler = null;
>> > -            try {
>> > -                outputStream  = getOutputStream(commandLine);
>> > +    public int run(String[] args) {
>> > +        try {
>> > +            final String[] uris = configure(args);
>> > +            performExtraction(uris);
>> > +            return 0;
>> > +        } catch (Exception e) {
>> > +            System.err.println( e.getMessage() );
>> > +            final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
>> > +            if(verbose) e.printStackTrace(System.err);
>> > +            return exitCode;
>> > +        }
>> > +    }
>> >
>> > -                tripleHandler = getTripleHandler(commandLine, outputStream);
>> > +    protected CommandLine getCommandLine() {
>> > +        if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
>> > +        return commandLine;
>> > +    }
>> >
>> > -                tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
>> > +    protected String[] configure(String[] args) throws Exception {
>> > +        final CommandLineParser parser = new PosixParser();
>> > +        options = createOptions();
>> > +        commandLine = parser.parse(options, args);
>> >
>> > -                tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>> > -                final BenchmarkTripleHandler benchmarkTripleHandler =
>> > -                        tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>> > +        if (commandLine.hasOption("h")) {
>> > +            printHelp();
>> > +            throw new ExitCodeException(0);
>> > +        }
>> >
>> > -                tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> > +        if (commandLine.hasOption('v')) {
>> > +            verbose = true;
>> > +            LogUtils.setVerboseLogging();
>> > +        } else {
>> > +            LogUtils.setDefaultLogging();
>> > +        }
>> >
>> > -                final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
>> > +        if (commandLine.getArgs().length < 1) {
>> > +            printHelp();
>> > +            throw new IllegalArgumentException("Expected at least 1 argument.");
>> > +        }
>> >
>> > -                final ExtractionParameters eps = getExtractionParameters(commandLine);
>> > +        final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
>> > +        final String[] extractorNames = getExtractors(commandLine);
>> >
>> > -                final Any23 any23 = createAny23(extractorNames);
>> > +        try {
>> > +            outputStream  = getOutputStream(commandLine);
>> > +            tripleHandler = getTripleHandler(commandLine, outputStream);
>> > +            tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
>> > +            tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>> >
>> > -                final long start = System.currentTimeMillis();
>> > -                for(String inputURI : inputURIs) {
>> > -                    performExtraction(any23, eps, inputURI, reportingTripleHandler);
>> > -                }
>> > -                final long elapsed = System.currentTimeMillis() - start;
>> > +            benchmarkTripleHandler =
>> > +                    tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>> >
>> > -                closeAll(tripleHandler, outputStream);
>> > +            tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> >
>> > -                if (benchmarkTripleHandler != null) {
>> > -                    System.err.println( benchmarkTripleHandler.report() );
>> > -                }
>> > +            reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
>> > +            eps = getExtractionParameters(commandLine);
>> > +            any23 = createAny23(extractorNames);
>> >
>> > -                logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
>> > -                logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> > -            } finally {
>> > -                closeAll(tripleHandler, outputStream);
>> > -            }
>> > +            return inputURIs;
>> >         } catch (Exception e) {
>> > -            System.err.println(e.getMessage());
>> > -            final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
>> > -            if(verbose) e.printStackTrace(System.err);
>> > -            return exitCode;
>> > +            closeStreams();
>> > +            throw e;
>> >         }
>> > -        return 0;
>> >     }
>> >
>> > -    private Options createOptions() {
>> > +    protected Options createOptions() {
>> >         final Options options = new Options();
>> >         options.addOption(
>> >                 new Option("v", "verbose", false, "Show debug and progress information.")
>> > @@ -178,13 +175,7 @@ public class Rover implements Tool {
>> >                         "f",
>> >                         "Output format",
>> >                         true,
>> > -                        "[" +
>> > -                                TURTLE_FORMAT  + " (default), " +
>> > -                                NTRIPLE_FORMAT + ", " +
>> > -                                RDFXML_FORMAT  + ", " +
>> > -                                NQUADS_FORMAT  + ", " +
>> > -                                URIS_FORMAT    +
>> > -                        "]"
>> > +                        "[" +  printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
>> >                 )
>> >         );
>> >         options.addOption(
>> > @@ -208,11 +199,51 @@ public class Rover implements Tool {
>> >         return options;
>> >     }
>> >
>> > +    protected void performExtraction(DocumentSource documentSource) {
>> > +        performExtraction(any23, eps, documentSource, reportingTripleHandler);
>> > +    }
>> > +
>> > +    protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
>> > +        try {
>> > +            final long start = System.currentTimeMillis();
>> > +            for (String inputURI : inputURIs) {
>> > +                performExtraction( any23.createDocumentSource(inputURI) );
>> > +            }
>> > +            final long elapsed = System.currentTimeMillis() - start;
>> > +
>> > +            if (benchmarkTripleHandler != null) {
>> > +                System.err.println(benchmarkTripleHandler.report());
>> > +            }
>> > +
>> > +            logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
>> > +            logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> > +        } finally {
>> > +            closeStreams();
>> > +        }
>> > +    }
>> > +
>> > +    protected String printReports() {
>> > +        final StringBuilder sb = new StringBuilder();
>> > +        if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
>> > +        if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
>> > +        return sb.toString();
>> > +    }
>> > +
>> >     private void printHelp() {
>> >         HelpFormatter formatter = new HelpFormatter();
>> >         formatter.printHelp("[{<url>|<file>}]+", options, true);
>> >     }
>> >
>> > +    private String printFormats(String[] formats, int defaultIndex) {
>> > +        final StringBuilder sb = new StringBuilder();
>> > +        for (int i = 0; i < formats.length; i++) {
>> > +            sb.append(formats[i]);
>> > +            if(i == defaultIndex) sb.append(" (default)");
>> > +            if(i < formats.length - 1) sb.append(", ");
>> > +        }
>> > +        return sb.toString();
>> > +    }
>> > +
>> >     private String argumentToURI(String uri) {
>> >         uri = uri.trim();
>> >         if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
>> > @@ -268,27 +299,17 @@ public class Rover implements Tool {
>> >
>> >     private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
>> >         final String FORMAT_OPTION = "f";
>> > -        String format = DEFAULT_FORMAT;
>> > +        String format = FORMATS[DEFAULT_FORMAT_INDEX];
>> >         if (cl.hasOption(FORMAT_OPTION)) {
>> > -            format = cl.getOptionValue(FORMAT_OPTION);
>> > +            format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
>> >         }
>> > -        final TripleHandler outputHandler;
>> > -        if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new TurtleWriter(os);
>> > -        } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new NTriplesWriter(os);
>> > -        } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new RDFXMLWriter(os);
>> > -        } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new NQuadsWriter(os);
>> > -        } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new URIListWriter(os);
>> > -        } else {
>> > +        try {
>> > +            return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
>> > +        } catch (Exception e) {
>> >             throw new IllegalArgumentException(
>> >                     String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
>> >             );
>> >         }
>> > -        return outputHandler;
>> >     }
>> >
>> >     private TripleHandler
>> decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
>> > @@ -346,44 +367,54 @@ public class Rover implements Tool {
>> >         return any23;
>> >     }
>> >
>> > -    private void performExtraction(Any23 any23, ExtractionParameters
>> eps, String documentURI, TripleHandler th) {
>> > +    private void performExtraction(
>> > +            Any23 any23, ExtractionParameters eps, DocumentSource
>> documentSource, TripleHandler th
>> > +    ) {
>> >         try {
>> > -            if (! any23.extract(eps, documentURI,
>> th).hasMatchingExtractors()) {
>> > -                throw new SpecificExitException("No suitable extractors
>> found.", 2);
>> > +            if (! any23.extract(eps, documentSource,
>> th).hasMatchingExtractors()) {
>> > +                throw new ExitCodeException("No suitable extractors
>> found.", 2);
>> >             }
>> >         } catch (ExtractionException ex) {
>> > -            throw new SpecificExitException("Exception while extracting
>> metadata.", ex, 3);
>> > +            throw new ExitCodeException("Exception while extracting
>> metadata.", ex, 3);
>> >         } catch (IOException ex) {
>> > -            throw new SpecificExitException("Exception while producing
>> output.", ex, 4);
>> > +            throw new ExitCodeException("Exception while producing
>> output.", ex, 4);
>> >         }
>> >     }
>> >
>> > -    private void closeHandler(TripleHandler th) {
>> > -        if(th == null) return;
>> > +    private void closeHandler() {
>> > +        if(tripleHandler == null) return;
>> >         try {
>> > -            th.close();
>> > +            tripleHandler.close();
>> >         } catch (TripleHandlerException the) {
>> > -            throw new SpecificExitException("Error while closing
>> TripleHandler", the, 5);
>> > +            throw new ExitCodeException("Error while closing
>> TripleHandler", the, 5);
>> >         }
>> >     }
>> >
>> > -    private void closeAll(TripleHandler th, PrintStream os) {
>> > -             closeHandler(th);
>> > -            if(os != null) os.close();
>> > +    private void closeStreams() {
>> > +             closeHandler();
>> > +            if(outputStream != null) outputStream.close();
>> >     }
>> >
>> > -    private class SpecificExitException extends RuntimeException {
>> > +    protected class ExitCodeException extends RuntimeException {
>> >
>> >         private final int exitCode;
>> >
>> > -        public SpecificExitException(String message, Throwable cause,
>> int exitCode) {
>> > +        public ExitCodeException(String message, Throwable cause, int
>> exitCode) {
>> >             super(message, cause);
>> >             this.exitCode = exitCode;
>> >         }
>> > -        public SpecificExitException(String message, int exitCode) {
>> > +        public ExitCodeException(String message, int exitCode) {
>> >             super(message);
>> >             this.exitCode = exitCode;
>> >         }
>> > +        public ExitCodeException(int exitCode) {
>> > +            super();
>> > +            this.exitCode = exitCode;
>> > +        }
>> > +
>> > +        protected int getExitCode() {
>> > +            return exitCode;
>> > +        }
>> >     }
>> >
>> >  }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -29,6 +29,13 @@ import java.util.Collection;
>> >  public interface ExtractorFactory<T extends Extractor<?>> extends
>> ExtractorDescription {
>> >
>> >     /**
>> > +     * Returns the extractor type.
>> > +     *
>> > +     * @return the not <code>null</code> extractor class.
>> > +     */
>> > +    Class<T> getExtractorType();
>> > +
>> > +    /**
>> >      * Creates an extractor instance.
>> >      *
>> >      * @return an instance of the extractor associated to this factory.
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
>> >  import org.deri.any23.extractor.rdf.NQuadsExtractor;
>> >  import org.deri.any23.extractor.rdf.NTriplesExtractor;
>> >  import org.deri.any23.extractor.rdf.RDFXMLExtractor;
>> > +import org.deri.any23.extractor.rdf.TriXExtractor;
>> >  import org.deri.any23.extractor.rdf.TurtleExtractor;
>> >  import org.deri.any23.extractor.rdfa.RDFa11Extractor;
>> >  import org.deri.any23.extractor.rdfa.RDFaExtractor;
>> > @@ -79,6 +80,7 @@ public class ExtractorRegistry {
>> >                 instance.register(TurtleExtractor.factory);
>> >                 instance.register(NTriplesExtractor.factory);
>> >                 instance.register(NQuadsExtractor.factory);
>> > +                instance.register(TriXExtractor.factory);
>> >
>> if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
>> >                     instance.register(RDFa11Extractor.factory);
>> >                 } else {
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
>> >         return supportedMIMETypes;
>> >     }
>> >
>> > +    @Override
>> > +    public Class<T> getExtractorType() {
>> > +        return extractorClass;
>> > +    }
>> > +
>> >     /**
>> >      * @return an instance of type T concrete implementation of {@link
>> org.deri.any23.extractor.Extractor}
>> >      */
>> > +    @Override
>> >     public T createExtractor() {
>> >         try {
>> >             return extractorClass.newInstance();
>> > @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
>> >     /**
>> >      * @return an input example
>> >      */
>> > +    @Override
>> >     public String getExampleInput() {
>> >         return exampleInput;
>> >     }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
>> >                     Arrays.asList(
>> >                             "text/csv;q=0.1"
>> >                     ),
>> > -                    null,
>> > +                    "example-csv.csv",
>> >                     CSVExtractor.class
>> >             );
>> >
>> > @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
>> >     }
>> >
>> >     /**
>> > +     * Check whether a number is an integer.
>> > +     *
>> > +     * @param number
>> > +     * @return
>> > +     */
>> > +    private boolean isInteger(String number) {
>> > +        try {
>> > +            Integer.valueOf(number);
>> > +            return true;
>> > +        } catch (NumberFormatException e) {
>> > +            return false;
>> > +        }
>> > +    }
>> > +
>> > +    /**
>> > +     * Check whether a number is a float.
>> > +     *
>> >      * @param number
>> >      * @return
>> >      */
>> > -    private boolean isNumber(String number) {
>> > +    private boolean isFloat(String number) {
>> >         try {
>> > -            Double.valueOf(number);
>> > +            Float.valueOf(number);
>> >             return true;
>> >         } catch (NumberFormatException e) {
>> >             return false;
>> > @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
>> >             object = new URIImpl(cell);
>> >         } else {
>> >             URI datatype = XMLSchema.STRING;
>> > -            if (isNumber(cell)) {
>> > +            if (isInteger(cell)) {
>> >                 datatype = XMLSchema.INTEGER;
>> > +            } else if(isFloat(cell)) {
>> > +                datatype = XMLSchema.FLOAT;
>> >             }
>> >             object = new LiteralImpl(cell, datatype);
>> >         }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
>> >                     "html-mf-adr",
>> >                     PopularPrefixes.createSubset("rdf", "vcard"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-adr.html",
>> >                     AdrExtractor.class
>> >             );
>> >  }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
>> >                 "html-mf-geo",
>> >                 PopularPrefixes.createSubset("rdf", "vcard"),
>> >                 Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                null,
>> > +                "example-mf-geo.html",
>> >                 GeoExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
>> >                     "html-mf-hcalendar",
>> >                     PopularPrefixes.createSubset("rdf", "ical"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hcalendar.html",
>> >                     HCalendarExtractor.class);
>> >
>> >     private static final String[] Components = {"Vevent", "Vtodo",
>> "Vjournal", "Vfreebusy"};
>> > @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
>> >     private boolean extractComponent(Node node, Resource cal, String
>> component) throws ExtractionException {
>> >         HTMLDocument compoNode = new HTMLDocument(node);
>> >         BNode evt = valueFactory.createBNode();
>> > -        addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
>> > +        addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
>> >         addTextProps(compoNode, evt);
>> >         addUrl(compoNode, evt);
>> >         addRRule(compoNode, evt);
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
>> >                     "html-mf-hcard",
>> >                     PopularPrefixes.createSubset("rdf", "vcard"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hcard.html",
>> >                     HCardExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -82,7 +82,7 @@ public class HListingExtractor extends E
>> >                     "html-mf-hlisting",
>> >                     PopularPrefixes.createSubset("rdf", "hlisting"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hlisting.html",
>> >                     HListingExtractor.class
>> >             );
>> >
>> > @@ -106,7 +106,7 @@ public class HListingExtractor extends E
>> >         out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>> >
>> >         for (String action : findActions(fragment)) {
>> > -            out.writeTriple(listing, hLISTING.action,
>> hLISTING.getResource(action));
>> > +            out.writeTriple(listing, hLISTING.action,
>> hLISTING.getClass(action));
>> >         }
>> >         out.writeTriple(listing, hLISTING.lister, addLister() );
>> >         addItem(listing);
>> > @@ -154,7 +154,7 @@ public class HListingExtractor extends E
>> >                     String value = node.getNodeValue();
>> >                     // do not use conditionallyAdd, it won't work cause
>> of evaluation rules
>> >                     if (!(null == value || "".equals(value))) {
>> > -                        URI property =
>> hLISTING.getPropertyCamelized(klass);
>> > +                        URI property =
>> hLISTING.getPropertyCamelCase(klass);
>> >                         conditionallyAddLiteralProperty(
>> >                                 node,
>> >                                 blankItem, property,
>> valueFactory.createLiteral(value)
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
>> >                     "html-mf-hrecipe",
>> >                     PopularPrefixes.createSubset("rdf", "hrecipe"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hrecipe.html",
>> >                     HRecipeExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
>> >                     "html-mf-hresume",
>> >                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hresume.html",
>> >                     HResumeExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
>> >                     "html-mf-hreview",
>> >                     PopularPrefixes.createSubset("rdf", "vcard", "rev"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hreview.html",
>> >                     HReviewExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
>> >                     "html-head-links",
>> >                     PopularPrefixes.createSubset("xhtml", "dcterms"),
>> >                     Arrays.asList("text/html;q=0.05",
>> "application/xhtml+xml;q=0.05"),
>> > -                    null,
>> > +                    "example-head-link.html",
>> >                     HeadLinkExtractor.class);
>> >  }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
>> >                     "html-head-icbm",
>> >                     PopularPrefixes.createSubset("geo", "rdf"),
>> >                     Arrays.asList("text/html;q=0.01",
>> "application/xhtml+xml;q=0.01"),
>> > -                    null,
>> > +                    "example-icbm.html",
>> >                     ICBMExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -51,7 +51,7 @@ public class LicenseExtractor implements
>> >                     "html-mf-license",
>> >                     PopularPrefixes.createSubset("xhtml"),
>> >                     Arrays.asList("text/html;q=0.01",
>> "application/xhtml+xml;q=0.01"),
>> > -                    null,
>> > +                    "example-mf-license.html",
>> >                     LicenseExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
>> >                     "html-mf-species",
>> >                     PopularPrefixes.createSubset("rdf", "wo"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-species.html",
>> >                     SpeciesExtractor.class
>> >             );
>> >
>> > @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>> >
>> >     private URI resolveClassName(String clazz) {
>> >         String upperCaseClass = clazz.substring(0, 1);
>> > -        return vWO.getResource(
>> > +        return vWO.getClass(
>> >                 String.format("%s%s",
>> >                         upperCaseClass.toUpperCase(),
>> >                         clazz.substring(1)
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
>> >                     NAME,
>> >                     PopularPrefixes.get(),
>> >                     Arrays.asList("text/html;q=0.02",
>> "application/xhtml+xml;q=0.02"),
>> > -                    null,
>> > +                    "example-script-turtle.html",
>> >                     TurtleHTMLExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
>> >                 "html-mf-xfn",
>> >                 PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
>> >                 Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                null,
>> > +                "example-mf-xfn.html",
>> >                 XFNExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
>> >                     "html-microdata",
>> >                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-microdata.html",
>> >                     MicrodataExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
>> >  import org.deri.any23.extractor.ErrorReporter;
>> >  import org.deri.any23.extractor.ExtractionContext;
>> >  import org.deri.any23.extractor.ExtractionResult;
>> > -import org.deri.any23.parser.NQuadsParser;
>> > +import org.deri.any23.io.nquads.NQuadsParser;
>> >  import org.deri.any23.rdf.Any23ValueFactoryWrapper;
>> >  import org.openrdf.model.impl.ValueFactoryImpl;
>> >  import org.openrdf.rio.ParseErrorListener;
>> > @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
>> >  import org.openrdf.rio.RDFParser;
>> >  import org.openrdf.rio.ntriples.NTriplesParser;
>> >  import org.openrdf.rio.rdfxml.RDFXMLParser;
>> > +import org.openrdf.rio.trix.TriXParser;
>> >  import org.openrdf.rio.turtle.TurtleParser;
>> >  import org.slf4j.Logger;
>> >  import org.slf4j.LoggerFactory;
>> > @@ -38,7 +39,7 @@ import java.io.Reader;
>> >
>> >  /**
>> >  * This factory provides a common logic for creating and configuring
>> correctly
>> > - * any RDF parser used within the library.
>> > + * any <i>RDF</i> parser used within the library.
>> >  *
>> >  * @author Michele Mostarda (mostarda@fbk.eu)
>> >  */
>> > @@ -119,7 +120,7 @@ public class RDFParserFactory {
>> >     }
>> >
>> >     /**
>> > -     * Returns a new instance of a configured {@link
>> org.deri.any23.parser.NQuadsParser}.
>> > +     * Returns a new instance of a configured {@link
>> org.deri.any23.io.nquads.NQuadsParser}.
>> >      *
>> >      * @param verifyDataType data verification enable if
>> <code>true</code>.
>> >      * @param stopAtFirstError the parser stops at first error if
>> <code>true</code>.
>> > @@ -139,6 +140,26 @@ public class RDFParserFactory {
>> >     }
>> >
>> >     /**
>> > +     * Returns a new instance of a configured {@link TriXParser}.
>> > +     *
>> > +     * @param verifyDataType data verification enable if
>> <code>true</code>.
>> > +     * @param stopAtFirstError the parser stops at first error if
>> <code>true</code>.
>> > +     * @param extractionContext the extraction context where the parser
>> is used.
>> > +     * @param extractionResult the output extraction result.
>> > +     * @return a new instance of a configured TriX parser.
>> > +     */
>> > +    public TriXParser getTriXParser(
>> > +            final boolean verifyDataType,
>> > +            final boolean stopAtFirstError,
>> > +            final ExtractionContext extractionContext,
>> > +            final ExtractionResult extractionResult
>> > +    ) {
>> > +        final TriXParser parser = new TriXParser();
>> > +        configureParser(parser, verifyDataType, stopAtFirstError,
>> extractionContext, extractionResult);
>> > +        return parser;
>> > +    }
>> > +
>> > +    /**
>> >      * Configures the given parser on the specified extraction result
>> >      * setting the policies for data verification and error handling.
>> >      *
>> >
>> >
>>
>
>
>
> --
> Michele Mostarda
> Senior Software Engineer
> skype: michele.mostarda
> twitter: micmos
> mail: me@michelemostarda.com
> site : http://www.michelemostarda.com

Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./ any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/ any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/or

Posted by Michele Mostarda <mi...@gmail.com>.
On 10 January 2012 18:08, Simone Tripodi <si...@apache.org> wrote:

> Hi Mic,
>

Hi Simo, happy new year!

> this is something great, thanks for the hard work of merging!
> next step is renaming the packages in org.apache.any23 :)
>

Sure :) It is the next critical issue scheduled on Jira.
Then we can start discussing the release.

Ciao

Mic


>
> All the best, have a nice day!
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/
>
>
>
> On Tue, Jan 10, 2012 at 5:32 PM,  <mo...@apache.org> wrote:
> > Author: mostarda
> > Date: Tue Jan 10 16:32:28 2012
> > New Revision: 1229627
> >
> > URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
> > Log:
> > This commit synchronizes the dismissed Any23 Google Code SVN repo [1]
> > with the current Apache Any23 SVN repo, including the issues
> > developed during the initial import transition phase.
> > Such issues have been tracked on the original Any23 Google Code Issue
> Tracker [2].
> > Below the extract of the original repository commit log.
> >
> > This commit is related to issue ANY23-27.
> >
> > [1] http://any23.googlecode.com/svn/trunk/
> > [2] http://code.google.com/p/any23/issues/list
> >
> > ==== BEGIN: Original Log ====
> >
> > ------------------------------------------------------------------------
> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) |
> 1 line
> >
> > Improved numeric datatype assignment. This commit fixes issue #208.
> > ------------------------------------------------------------------------
> > hardest-mac:gcode-svn hardest$ svn log -r 1548:HEAD
> > ------------------------------------------------------------------------
> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) |
> 1 line
> >
> > Improved numeric datatype assignment. This commit fixes issue #208.
> > ------------------------------------------------------------------------
> > r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'.
> > Fixed HTMLMetaExtractorTest.java to match this new namespace. Discovered
> > and fixed an issue in the SINDICE.java vocabulary: NS was declared as a
> > resource instead of as a URI. Fixed RDFSchemaUtilsTest.java, whose sizes
> > were wrong due to the wrong NS declaration. This commit is related to
> > issue #203.
> > ------------------------------------------------------------------------
> > r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found a
> > wrong declaration of Class(Resource) in the WO.java vocab. Fixed and updated
> > the RDFSchemaUtils.java test. This commit is related to issue #198.
> > ------------------------------------------------------------------------
> > r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Added utility method.
> > ------------------------------------------------------------------------
> > r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Improved Vocabulary.java class: added support for comments to any
> resource. Improved RDFSchemaUtils.java serialization
> > support, added separators to RDFXML serialization. This commit is
> related to issue #198.
> > ------------------------------------------------------------------------
> > r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) |
> 1 line
> >
> > Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved
> prefix declaration parsing in RDFa11Parser, this
> > new parser is more tolerant on RDFa 1.0 and RDFa 1.1 prefix
> declarations. Fixed support for prefix mapping resolution in
> > RDFa11Parser, this allows the correct support for the structured
> properties introduced by the latest version of the Open
> > Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest
> to the new output of vocabularies serialization.
> > Updated Any23PluginManagerTest to include a new class. This commit is
> related to issue #206.
> > ------------------------------------------------------------------------
> > r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) |
> 1 line
> >
> > Restricted scope of testGetClassesFromClasspath to avoid updating it
> every time a new class is added.
> > ------------------------------------------------------------------------
> > r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Improved validation mode support. Improved descriptions of Validation
> and Report fields. This commit is related to issue
> > #209.
> > ------------------------------------------------------------------------
> > r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Improved Any23 Service XML Report format documentation.
> > ------------------------------------------------------------------------
> > r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Added URL encoding to the source location path. This commit fixes issue
> #205. Chosen not to write a formal test which
> > requires the creation of folders with spaces
> > ------------------------------------------------------------------------
> > r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Removed obsolete section.
> > ------------------------------------------------------------------------
> > r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) |
> 1 line
> >
> > Improved Any23 facade, added method createDocumentSource() to simplify
> the extraction setup.
> > ------------------------------------------------------------------------
> > r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) |
> 1 line
> >
> > Refactored Rover CLI class to make it extensible by other CLI
> > implementations.
> > ------------------------------------------------------------------------
> > r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) |
> 3 lines
> >
> > Removed wrong artifact name.
> >
> >
> > ------------------------------------------------------------------------
> > r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Removed no longer used jspf lib. Added crawler4j dependencies. Added
> README. This commit is related to issue #211.
> > ------------------------------------------------------------------------
> > r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Changed attributes visibility to facilitate the class extensibility.
> > ------------------------------------------------------------------------
> > r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Added helper methods to extract file lines as list of strings. Improved
> javadoc.
> > ------------------------------------------------------------------------
> > r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Added first version of basic-crawler plugin. This commit is related to
> issue #211.
> > ------------------------------------------------------------------------
> > r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Added plugins README.
> > ------------------------------------------------------------------------
> > r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Updated main README, added references to plugin and lib.
> > ------------------------------------------------------------------------
> > r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Fixed assembly name.
> > ------------------------------------------------------------------------
> > r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Fixed Tool signature. This commit is related to #211.
> > ------------------------------------------------------------------------
> > r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Improved logging.
> > ------------------------------------------------------------------------
> > r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Included plugin basic-crawler in reactor. Improved ToolRunner and
> Any23PluginManager tests to be compliant to the new
> > plugin classes. This commit is related to issue #211.
> > ------------------------------------------------------------------------
> > r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Fixed Crawler4j group id. Related to issue #211.
> > ------------------------------------------------------------------------
> > r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Improved plugin documentation. Introduced Office Scraper specific page.
> This commit is related to issue #213.
> > ------------------------------------------------------------------------
> > r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed POST method documentation. Related to issue #213.
> > ------------------------------------------------------------------------
> > r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed code snippets, prettified, added missing finalization logic. See
> issue #187.
> > ------------------------------------------------------------------------
> > r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed var name. See #187.
> > ------------------------------------------------------------------------
> > r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Updated code snippets and tutorial, added explicit TripleHandler
> closure. This commit is related to issue #187.
> > ------------------------------------------------------------------------
> > r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed data type handling management in NQuadsParser. This commit is
> related to issue #210.
> > ------------------------------------------------------------------------
> > r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Added missing JSON output format. See #214.
> > ------------------------------------------------------------------------
> > r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output
> format support to Rover. This commit is related to
> > issue #215.
> > ------------------------------------------------------------------------
> > r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Added Sesame TriX IO dependency. This commit is related to #215.
> > ------------------------------------------------------------------------
> > r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Some suppressed tests have been reactivated as Ignored.
> > ------------------------------------------------------------------------
> > r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Added TriX output format to the Any23 Service. Commit related to issue
> #215.
> > ------------------------------------------------------------------------
> > r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Improved FormatWriter management, added WriterRegistry. Improved Writer
> format management in Rover and WebResponder.
> > This commit is related to issues #215 and #216.
> > ------------------------------------------------------------------------
> > r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) |
> 6 lines
> >
> > Added TriXExtractor and textual example (example-trix.trx), added trix
> support in RDFParserFactory.
> > Registered TriXExtractor to the ExtractorRegistry.
> > Added TriX mimetype support in TikaMIMETypeDetector (through
> mimetypes.xml) and added specific test.
> > Added support and doc to TriX format in Any23 Service web page
> (form.html).
> > This commit is related to issue #215.
> >
> > ------------------------------------------------------------------------
> > r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) |
> 1 line
> >
> > Fixed number of extractors (+1 after adding TriXExtractor). Commit
> related to issue #215.
> > ------------------------------------------------------------------------
> > r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Added method getExtractorType() .
> > ------------------------------------------------------------------------
> > r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) |
> 4 lines
> >
> > Improved ExtractorDocumentation support, added missing format examples.
> > Improved output layout. This commit is related to issue #194.
> >
> >
> > ------------------------------------------------------------------------
> > r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Improved classpath management in Any23PluginManager. Renamed
> getClasses\* in loadClasses\* . This commit is related to
> > issue #212.
> > ------------------------------------------------------------------------
> > r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Separated log messages from specific output data.
> > ------------------------------------------------------------------------
> > r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Added human readable report printing support in ReportingTripleHandler
> and Rover.
> > ------------------------------------------------------------------------
> > r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Fixed major issue in output generation, added final activity report,
> help prettification. This commit is related to
> > issue #211.
> > ------------------------------------------------------------------------
> > r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Upgraded to Sesame 2.6.1. See issue #217.
> > ------------------------------------------------------------------------
> > r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue
> #216
> > ------------------------------------------------------------------------
> > r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
> > ------------------------------------------------------------------------
> > r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) |
> 1 line
> >
> > Added specific Crawler CLI documentation. Updated general CLI
> documentation. This commit is related to issue #211.
> > ------------------------------------------------------------------------
> > r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) |
> 4 lines
> >
> > The Eval CLI Tool has been removed as well as the org.deri.any23.eval
> package classes related to it.
> > Updated tests verifying CLI tool detection.
> > This commit is related to issue #218.
> >
> > ------------------------------------------------------------------------
> > r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) |
> 5 lines
> >
> > Added MimeDetector CLI Tool and test case, removed main() from
> > TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
> > Updated CLI doc.
> > This commit is related to issue #219.
> >
> > ------------------------------------------------------------------------
> > r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) |
> 1 line
> >
> > Added support for comment serialization. Related to issue #158.
> > ------------------------------------------------------------------------
> > r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) |
> 1 line
> >
> > Add support for annotation writing in FormatWriter implementations. This
> commit is related to issue #158.
> > ------------------------------------------------------------------------
> > r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) |
> 1 line
> >
> > Added support for 'annotate' flag in Any23 Service.
> > ------------------------------------------------------------------------
> >
> > ==== END  : Original Log ====
> >
> >
> > Added:
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
> >
>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
> >    incubator/any23/trunk/any23-core/src/test/resources/application/trix/
> >
>  incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
> >
>  incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
> >
>  incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
> >    incubator/any23/trunk/lib/README.txt
> >    incubator/any23/trunk/plugins/README.txt
> >    incubator/any23/trunk/plugins/basic-crawler/
> >    incubator/any23/trunk/plugins/basic-crawler/pom.xml
> >    incubator/any23/trunk/plugins/basic-crawler/src/
> >    incubator/any23/trunk/plugins/basic-crawler/src/main/
> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/
> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
> >    incubator/any23/trunk/plugins/basic-crawler/src/test/
> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/
> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
> >
>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
> >    incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
> > Removed:
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
> >
>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
> > Modified:
> >    incubator/any23/trunk/README.txt
> >    incubator/any23/trunk/any23-core/bin/any23
> >    incubator/any23/trunk/any23-core/bin/any23tools
> >    incubator/any23/trunk/any23-core/pom.xml
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
> >
>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
> >    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
> >    incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
> >    incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
> >    incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
> >    incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
> >    incubator/any23/trunk/lib/install-deps.sh
> >    incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
> >    incubator/any23/trunk/pom.xml
> >    incubator/any23/trunk/src/site/apt/any23-plugins.apt
> >    incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
> >    incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
> >    incubator/any23/trunk/src/site/apt/getting-started.apt
> >    incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
> >    incubator/any23/trunk/src/site/apt/service.apt
> >    incubator/any23/trunk/src/site/apt/supported-formats.apt
> >
> > Modified: incubator/any23/trunk/README.txt
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/README.txt (original)
> > +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
> > @@ -20,7 +20,8 @@ Distribution Content
> >
> >  any23-core           The library core codebase.
> >  any23-service        The library HTTP service codebase.
> > -plugins              Library plugins codebase.
> > +lib                  Contains the Any23 the external deps (read lib/README.txt for further details).
> > +plugins              Library plugins codebase (read plugins/README.txt for further details).
> >  RELEASE-NOTES.txt    File reporting main release notes for every version.
> >  LICENSE.txt          Applicable project license.
> >  README.txt           This file.
> > @@ -240,15 +241,14 @@ Upload the produced packages in download
> >
> >    http://code.google.com/p/any23/downloads/list
> >
> > +--------------------
> > +Manage External Deps
> > +--------------------
> >
> > -Fix Release Procedure
> > ----------------------
> > -
> > -   Currently the *plugins/integration-test* module is excluded from the parent
> > -   reactor.
> > -   To fix it in tag follow procedure as described at issue #171:
> > -
> > -        http://code.google.com/p/any23/issues/detail?id=171
> > +::Developers interest only.::
> >
> > +External Deps are libraries used by some Any23 modules which are
> > +not available in public Maven repositories. Such libraries are
> > +managed within the 'lib' dir.
> >
> >  EOF
> >
> > Modified: incubator/any23/trunk/any23-core/bin/any23
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/bin/any23 (original)
> > +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
> > @@ -9,12 +9,12 @@
> >  ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
> >
> >  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> > -    echo "Generating executable JAR..."
> > -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > +    echo "Generating executable JAR..." >&2
> > +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> >         ||\
> > -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> >        ||\
> > -    { echo "Error while generating commandline assembly."; exit 1; }
> > +    { echo "Error while generating commandline assembly."  >&2; exit 1; }
> >  fi
> >
> >  SEP=':'
> >
> > Modified: incubator/any23/trunk/any23-core/bin/any23tools
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/bin/any23tools (original)
> > +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
> > @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
> >  PLUGINS_DIR=plugins
> >
> >  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> > -    echo "Generating executable JAR..."
> > -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > +    echo "Generating executable JAR..." >&2
> > +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> >         ||\
> > -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> >        ||\
> > -    { echo "Error while generating commandline assembly."; exit 1; }
> > +    { echo "Error while generating commandline assembly." >&2; exit 1; }
> >  fi
> >
> >  SEP=':'
> > @@ -30,6 +30,7 @@ done
> >  # Plugins classpath.
> >  for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
> >  do
> > +  echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
> >   if [ ! -e "$jar" ]; then continue; fi
> >   CP="$CP$SEP$jar"
> >  done
> >
> > Modified: incubator/any23/trunk/any23-core/pom.xml
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/pom.xml (original)
> > +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
> > @@ -92,6 +92,10 @@
> >         </dependency>
> >         <dependency>
> >             <groupId>org.openrdf.sesame</groupId>
> > +            <artifactId>sesame-rio-trix</artifactId>
> > +        </dependency>
> > +        <dependency>
> > +            <groupId>org.openrdf.sesame</groupId>
> >             <artifactId>sesame-repository-sail</artifactId>
> >         </dependency>
> >         <dependency>
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
> > @@ -258,6 +258,28 @@ public class Any23 {
> >     }
> >
> >     /**
> > +     * Returns the most appropriate {@link DocumentSource} for the given<code>documentURI</code>.
> > +     *
> > +     * @param documentURI the document <i>URI</i>.
> > +     * @return a new instance of DocumentSource.
> > +     * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
> > +     * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
> > +     */
> > +    public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
> > +        if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
> > +        if (documentURI.toLowerCase().startsWith("file:")) {
> > +            return new FileDocumentSource( new File(new URI(documentURI)) );
> > +        }
> > +        if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> > +            return new HTTPDocumentSource(getHTTPClient(), documentURI);
> > +        }
> > +        throw new IllegalArgumentException(
> > +                String.format("Unsupported protocol for document URI: '%s' .", documentURI)
> > +        );
> > +    }
> > +
> > +
> > +    /**
> >      * Performs metadata extraction from the content of the given
> >      * <code>in</code> document source, sending the generated events
> >      * to the specified <code>outputHandler</code>.
> > @@ -363,13 +385,7 @@ public class Any23 {
> >     public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
> >     throws IOException, ExtractionException {
> >         try {
> > -            if (documentURI.toLowerCase().startsWith("file:")) {
> > -                return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
> > -            }
> > -            if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> > -                return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
> > -            }
> > -            throw new ExtractionException("Not a valid absolute URI: " + documentURI);
> > +            return extract(eps, createDocumentSource(documentURI), outputHandler);
> >         } catch (URISyntaxException ex) {
> >             throw new ExtractionException("Error while extracting data from document URI.", ex);
> >         }
> >
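For anyone reading the createDocumentSource() hunk above without the rest of Any23 at hand, here is a minimal, JDK-only sketch of the same scheme dispatch. The class name DocumentSourceDispatchSketch, the Kind enum and the helper names are made up for illustration and are not part of Any23; only the prefix checks and the File(URI) conversion are taken from the diff.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical stand-in for the dispatch done in Any23.createDocumentSource();
// none of these names exist in Any23 itself.
final class DocumentSourceDispatchSketch {

    enum Kind { FILE, HTTP }

    // Mirrors the prefix checks in the diff: case-insensitive "file:", "http:" and "https:".
    static Kind classify(String documentURI) {
        if (documentURI == null) throw new NullPointerException("documentURI cannot be null.");
        final String lower = documentURI.toLowerCase();
        if (lower.startsWith("file:")) return Kind.FILE;
        if (lower.startsWith("http:") || lower.startsWith("https:")) return Kind.HTTP;
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s' .", documentURI));
    }

    // Same conversion the diff applies to file: URIs before wrapping them in a document source.
    static File toFile(String documentURI) throws URISyntaxException {
        return new File(new URI(documentURI));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(classify("HTTPS://example.org/page.html"));
        System.out.println(classify("file:///tmp/doc.html"));
        System.out.println(toFile("file:///tmp/doc.html").getName());
    }
}
```

Note that an unsupported scheme now surfaces as an IllegalArgumentException, whereas the removed code in extract() threw an ExtractionException for the same case.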
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
> > @@ -16,7 +16,7 @@
> >
> >  package org.deri.any23.cli;
> >
> > -import org.deri.any23.LogUtil;
> > +import org.deri.any23.util.LogUtils;
> >  import org.deri.any23.extractor.ExampleInputOutput;
> >  import org.deri.any23.extractor.ExtractionException;
> >  import org.deri.any23.extractor.Extractor;
> > @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
> >     }
> >
> >     public int run(String[] args) {
> > -        LogUtil.setDefaultLogging();
> > +        LogUtils.setDefaultLogging();
> >         try {
> >             if (args.length == 0) {
> >                 printUsage();
> > @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
> >      * Prints the list of all the available extractors.
> >      */
> >     public void printExtractorList() {
> > -        for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
> > -            System.out.println(extractorName);
> > +        for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
> > +            System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
> >         }
> >     }
> >
> > @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
> >             ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
> >             ExampleInputOutput example = new ExampleInputOutput(factory);
> >             System.out.println("Extractor: " + extractorName);
> > -            System.out.println("  type: " + getType(factory));
> > -            String output = example.getExampleOutput();
> > -            if (output == null) {
> > -                System.out.println("(no example output)");
> > +            System.out.println("\ttype: " + getType(factory));
> > +            System.out.println();
> > +            final String exampleInput = example.getExampleInput();
> > +            if(exampleInput == null) {
> > +                System.out.println("(No Example Available)");
> >             } else {
> > -                System.out.println("-------- example output --------");
> > -                System.out.println(output);
> > +                System.out.println("-------- Example Input  --------");
> > +                System.out.println(exampleInput);
> > +                System.out.println("-------- Example Output --------");
> > +                String output = example.getExampleOutput();
> > +                System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
> >             }
> > -            System.out.println();
> >             System.out.println("================================");
> > +            System.out.println();
> >         }
> >     }
> >
> >
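The new printExtractorList() relies on the format string "%25s [%15s]". A small sketch of what that string does on its own, using only java.lang; the class name ExtractorListFormatSketch and the sample values passed to it are invented for illustration and are not actual Any23 extractor names or types.

```java
// Hypothetical helper showing the column layout produced by the format
// string used in the diff; the sample values are illustrative only.
final class ExtractorListFormatSketch {

    // "%25s" right-aligns the name in 25 columns, "%15s" right-aligns the
    // type in 15 columns between square brackets. Longer values are not truncated.
    static String formatRow(String extractorName, String extractorType) {
        return String.format("%25s [%15s]", extractorName, extractorType);
    }

    public static void main(String[] args) {
        System.out.println(formatRow("example-extractor", "ExampleType"));
        System.out.println(formatRow("x", "y"));
    }
}
```

Because both fields are padded on the left, rows line up on the closing bracket as long as names stay within 25 characters and types within 15.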
> > Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
> > @@ -0,0 +1,113 @@
> > +/*
> > + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
> > + *
> > + * Licensed under the Apache License, Version 2.0 (the "License");
> > + * you may not use this file except in compliance with the License.
> > + * You may obtain a copy of the License at
> > + *
> > + *          http://www.apache.org/licenses/LICENSE-2.0
> > + *
> > + * Unless required by applicable law or agreed to in writing, software
> > + * distributed under the License is distributed on an "AS IS" BASIS,
> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> > + * See the License for the specific language governing permissions and
> > + * limitations under the License.
> > + */
> > +
> > +package org.deri.any23.cli;
> > +
> > +import org.deri.any23.configuration.DefaultConfiguration;
> > +import org.deri.any23.http.DefaultHTTPClient;
> > +import org.deri.any23.http.HTTPClient;
> > +import org.deri.any23.http.HTTPClientConfiguration;
> > +import org.deri.any23.mime.MIMEType;
> > +import org.deri.any23.mime.MIMETypeDetector;
> > +import org.deri.any23.mime.TikaMIMETypeDetector;
> > +import org.deri.any23.source.DocumentSource;
> > +import org.deri.any23.source.FileDocumentSource;
> > +import org.deri.any23.source.HTTPDocumentSource;
> > +import org.deri.any23.source.StringDocumentSource;
> > +
> > +import java.io.File;
> > +import java.net.URISyntaxException;
> > +
> > +/**
> > + * Commandline tool to detect <b>MIME Type</b>s from
> > + * file, HTTP and direct input sources.
> > + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
> > + *
> > + * @author Michele Mostarda (mostarda@fbk.eu)
> > + */
> > +@ToolRunner.Description("MIME Type Detector Tool.")
> > +public class MimeDetector implements Tool{
> > +
> > +    public static final String FILE_DOCUMENT_PREFIX   = "file://";
> > +    public static final String INLINE_DOCUMENT_PREFIX = "inline://";
> > +    public static final String URL_DOCUMENT_RE        = "^https?://.*";
> > +
> > +    public static void main(String[] args) {
> > +        System.exit( new MimeDetector().run(args) );
> > +    }
> > +
> > +    @Override
> > +    public int run(String[] args) {
> > +          if(args.length != 1) {
> > +            System.err.println("USAGE: { http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
> > +            return 1;
> > +        }
> > +
> > +        final String document = args[0];
> > +        try {
> > +            final DocumentSource documentSource = createDocumentSource(document);
> > +            final MIMETypeDetector detector = new TikaMIMETypeDetector();
> > +            final MIMEType mimeType = detector.guessMIMEType(
> > +                    documentSource.getDocumentURI(),
> > +                    documentSource.openInputStream(),
> > +                    MIMEType.parse(documentSource.getContentType())
> > +            );
> > +            System.out.println(mimeType);
> > +            return 0;
> > +        } catch (Exception e) {
> > +            System.err.print("Error while detecting MIME Type.");
> > +            e.printStackTrace(System.err);
> > +            return 1;
> > +        }
> > +    }
> > +
> > +    private DocumentSource createDocumentSource(String document) throws URISyntaxException {
> > +        if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
> > +            return new FileDocumentSource(
> > +                    new File(
> > +                        document.substring(FILE_DOCUMENT_PREFIX.length())
> > +                    )
> > +            );
> > +        }
> > +        if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
> > +            return new StringDocumentSource(
> > +                    document.substring(INLINE_DOCUMENT_PREFIX.length()),
> > +                    ""
> > +            );
> > +        }
> > +        if(document.matches(URL_DOCUMENT_RE)) {
> > +            final HTTPClient client = new DefaultHTTPClient();
> > +            // TODO: anonymous config class also used in Any23. centralize.
> > +            client.init(new HTTPClientConfiguration() {
> > +                public String getUserAgent() {
> > +                    return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
> > +                }
> > +                public String getAcceptHeader() {
> > +                    return "";
> > +                }
> > +                public int getDefaultTimeout() {
> > +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
> > +                }
> > +                public int getMaxConnections() {
> > +                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
> > +                }
> > +            });
> > +            return new HTTPDocumentSource(client, document);
> > +        }
> > +        throw new IllegalArgumentException("Unsupported protocol for document " + document);
> > +    }
> > +
> > +}
> >
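The argument handling in MimeDetector.createDocumentSource() differs a bit from the one in Any23: it strips a literal prefix instead of parsing a URI, and it matches case-sensitively. A rough JDK-only sketch of just that parsing step follows; the three constants are copied from the diff, while the class name MimeDetectorInputSketch, the describe() method and its return strings are hypothetical and exist only to make the branches visible.

```java
// Hypothetical sketch of the input classification in MimeDetector;
// it returns a description instead of building Any23 document sources.
final class MimeDetectorInputSketch {

    static final String FILE_DOCUMENT_PREFIX   = "file://";
    static final String INLINE_DOCUMENT_PREFIX = "inline://";
    static final String URL_DOCUMENT_RE        = "^https?://.*";

    static String describe(String document) {
        if (document.startsWith(FILE_DOCUMENT_PREFIX)) {
            // Everything after "file://" is used as a plain file system path.
            return "file path: " + document.substring(FILE_DOCUMENT_PREFIX.length());
        }
        if (document.startsWith(INLINE_DOCUMENT_PREFIX)) {
            // Everything after "inline://" is the document content itself.
            return "inline content: " + document.substring(INLINE_DOCUMENT_PREFIX.length());
        }
        if (document.matches(URL_DOCUMENT_RE)) {
            return "url: " + document;
        }
        throw new IllegalArgumentException("Unsupported protocol for document " + document);
    }

    public static void main(String[] args) {
        System.out.println(describe("file:///path/to/local.file"));
        System.out.println(describe("inline://some inline content"));
        System.out.println(describe("http://path/to/resource.html"));
    }
}
```

With these checks an upper-case scheme such as "HTTP://..." is rejected, while Any23.createDocumentSource() lower-cases the URI before comparing prefixes.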
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
> > @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
> >  import org.apache.commons.cli.Options;
> >  import org.apache.commons.cli.PosixParser;
> >  import org.deri.any23.Any23;
> > -import org.deri.any23.LogUtil;
> > +import org.deri.any23.util.LogUtils;
> >  import org.deri.any23.configuration.Configuration;
> >  import org.deri.any23.configuration.DefaultConfiguration;
> >  import org.deri.any23.extractor.ExtractionException;
> > @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
> >  import org.deri.any23.extractor.SingleDocumentExtraction;
> >  import org.deri.any23.filter.IgnoreAccidentalRDFa;
> >  import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
> > +import org.deri.any23.source.DocumentSource;
> >  import org.deri.any23.writer.BenchmarkTripleHandler;
> >  import org.deri.any23.writer.LoggingTripleHandler;
> > -import org.deri.any23.writer.NQuadsWriter;
> > -import org.deri.any23.writer.NTriplesWriter;
> > -import org.deri.any23.writer.RDFXMLWriter;
> >  import org.deri.any23.writer.ReportingTripleHandler;
> >  import org.deri.any23.writer.TripleHandler;
> >  import org.deri.any23.writer.TripleHandlerException;
> > -import org.deri.any23.writer.TurtleWriter;
> > -import org.deri.any23.writer.URIListWriter;
> > +import org.deri.any23.writer.WriterRegistry;
> >  import org.slf4j.Logger;
> >  import org.slf4j.LoggerFactory;
> >
> > @@ -51,6 +48,7 @@ import java.io.OutputStream;
> >  import java.io.PrintStream;
> >  import java.io.PrintWriter;
> >  import java.net.MalformedURLException;
> > +import java.net.URISyntaxException;
> >  import java.net.URL;
> >
> >  import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
> > @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
> >  * A default rover implementation. Goes and fetches a URL using an hint
> >  * as to what format should require, then tries to convert it to RDF.
> >  *
> > - * @author Gabriele Renzi
> > - * @author Richard Cyganiak (richard@cyganiak.de)
> >  * @author Michele Mostarda (mostarda@fbk.eu)
> > + * @author Richard Cyganiak (richard@cyganiak.de)
> > + * @author Gabriele Renzi
> >  */
> >  @ToolRunner.Description("Any23 Command Line Tool.")
> >  public class Rover implements Tool {
> >
> > -    // Supported formats.
> > -    private static final String TURTLE_FORMAT  = "turtle";
> > -    private static final String NTRIPLE_FORMAT = "ntriples";
> > -    private static final String RDFXML_FORMAT  = "rdfxml";
> > -    private static final String NQUADS_FORMAT  = "nquads";
> > -    private static final String URIS_FORMAT    = "uris";
> > -
> > -    private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
> > +    private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
> > +    private static final int DEFAULT_FORMAT_INDEX = 0;
> >
> >     private static final Logger logger = LoggerFactory.getLogger(Rover.class);
> >
> > -    private static Options options;
> > +    private Options options;
> >
> > -    public static void main(String[] args) {
> > -        System.exit( new Rover().run(args) );
> > -    }
> > +    private CommandLine commandLine;
> >
> > -    public int run(String[] args) {
> > -        final CommandLineParser parser = new PosixParser();
> > -        final CommandLine commandLine;
> > +    private boolean verbose = false;
> >
> > -        boolean verbose = false;
> > -        try {
> > -            options = createOptions();
> > -            commandLine = parser.parse(options, args);
> > +    private PrintStream outputStream;
> > +    private TripleHandler tripleHandler;
> > +    private ReportingTripleHandler reportingTripleHandler;
> > +    private BenchmarkTripleHandler benchmarkTripleHandler;
> >
> > -            if (commandLine.hasOption("h")) {
> > -                printHelp();
> > -                return 0;
> > -            }
> > +    private ExtractionParameters eps;
> > +    private Any23 any23;
> >
> > -            if (commandLine.hasOption('v')) {
> > -                verbose = true;
> > -                LogUtil.setVerboseLogging();
> > -            } else {
> > -                LogUtil.setDefaultLogging();
> > -            }
> > -
> > -            if (commandLine.getArgs().length < 1) {
> > -                printHelp();
> > -                throw new IllegalArgumentException("Expected at least 1 argument.");
> > -            }
> > +    protected boolean isVerbose() {
> > +        return verbose;
> > +    }
> >
> > -            final String[] inputURIs      = argumentsToURIs(commandLine.getArgs());
> > -            final String[] extractorNames = getExtractors(commandLine);
> > +    public static void main(String[] args) {
> > +        System.exit( new Rover().run(args) );
> > +    }
> >
> > -            PrintStream outputStream    = null;
> > -            TripleHandler tripleHandler = null;
> > -            try {
> > -                outputStream  = getOutputStream(commandLine);
> > +    public int run(String[] args) {
> > +        try {
> > +            final String[] uris = configure(args);
> > +            performExtraction(uris);
> > +            return 0;
> > +        } catch (Exception e) {
> > +            System.err.println( e.getMessage() );
> > +            final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
> > +            if(verbose) e.printStackTrace(System.err);
> > +            return exitCode;
> > +        }
> > +    }
> >
> > -                tripleHandler = getTripleHandler(commandLine, outputStream);
> > +    protected CommandLine getCommandLine() {
> > +        if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
> > +        return commandLine;
> > +    }
> >
> > -                tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> > +    protected String[] configure(String[] args) throws Exception {
> > +        final CommandLineParser parser = new PosixParser();
> > +        options = createOptions();
> > +        commandLine = parser.parse(options, args);
> >
> > -                tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
> > -                final BenchmarkTripleHandler benchmarkTripleHandler =
> > -                        tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
> > +        if (commandLine.hasOption("h")) {
> > +            printHelp();
> > +            throw new ExitCodeException(0);
> > +        }
> >
> > -                tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
> > +        if (commandLine.hasOption('v')) {
> > +            verbose = true;
> > +            LogUtils.setVerboseLogging();
> > +        } else {
> > +            LogUtils.setDefaultLogging();
> > +        }
> >
> > -                final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
> > +        if (commandLine.getArgs().length < 1) {
> > +            printHelp();
> > +            throw new IllegalArgumentException("Expected at least 1 argument.");
> > +        }
> >
> > -                final ExtractionParameters eps = getExtractionParameters(commandLine);
> > +        final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
> > +        final String[] extractorNames = getExtractors(commandLine);
> >
> > -                final Any23 any23 = createAny23(extractorNames);
> > +        try {
> > +            outputStream  = getOutputStream(commandLine);
> > +            tripleHandler = getTripleHandler(commandLine, outputStream);
> > +            tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> > +            tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
> >
> > -                final long start = System.currentTimeMillis();
> > -                for(String inputURI : inputURIs) {
> > -                    performExtraction(any23, eps, inputURI, reportingTripleHandler);
> > -                }
> > -                final long elapsed = System.currentTimeMillis() - start;
> > +            benchmarkTripleHandler =
> > +                    tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
> >
> > -                closeAll(tripleHandler, outputStream);
> > +            tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
> >
> > -                if (benchmarkTripleHandler != null) {
> > -                    System.err.println( benchmarkTripleHandler.report() );
> > -                }
> > +            reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
> > +            eps = getExtractionParameters(commandLine);
> > +            any23 = createAny23(extractorNames);
> >
> > -                logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
> > -                logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> > -            } finally {
> > -                closeAll(tripleHandler, outputStream);
> > -            }
> > +            return inputURIs;
> >         } catch (Exception e) {
> > -            System.err.println(e.getMessage());
> > -            final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
> > -            if(verbose) e.printStackTrace(System.err);
> > -            return exitCode;
> > +            closeStreams();
> > +            throw e;
> >         }
> > -        return 0;
> >     }
> >
> > -    private Options createOptions() {
> > +    protected Options createOptions() {
> >         final Options options = new Options();
> >         options.addOption(
> >                 new Option("v", "verbose", false, "Show debug and progress information.")
> > @@ -178,13 +175,7 @@ public class Rover implements Tool {
> >                         "f",
> >                         "Output format",
> >                         true,
> > -                        "[" +
> > -                                TURTLE_FORMAT  + " (default), " +
> > -                                NTRIPLE_FORMAT + ", " +
> > -                                RDFXML_FORMAT  + ", " +
> > -                                NQUADS_FORMAT  + ", " +
> > -                                URIS_FORMAT    +
> > -                        "]"
> > +                        "[" +  printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
> >                 )
> >         );
> >         options.addOption(
> > @@ -208,11 +199,51 @@ public class Rover implements Tool {
> >         return options;
> >     }
> >
> > +    protected void performExtraction(DocumentSource documentSource) {
> > +        performExtraction(any23, eps, documentSource, reportingTripleHandler);
> > +    }
> > +
> > +    protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
> > +        try {
> > +            final long start = System.currentTimeMillis();
> > +            for (String inputURI : inputURIs) {
> > +                performExtraction( any23.createDocumentSource(inputURI) );
> > +            }
> > +            final long elapsed = System.currentTimeMillis() - start;
> > +
> > +            if (benchmarkTripleHandler != null) {
> > +                System.err.println(benchmarkTripleHandler.report());
> > +            }
> > +
> > +            logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
> > +            logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> > +        } finally {
> > +            closeStreams();
> > +        }
> > +    }
> > +
> > +    protected String printReports() {
> > +        final StringBuilder sb = new StringBuilder();
> > +        if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
> > +        if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
> > +        return sb.toString();
> > +    }
> > +
> >     private void printHelp() {
> >         HelpFormatter formatter = new HelpFormatter();
> >         formatter.printHelp("[{<url>|<file>}]+", options, true);
> >     }
> >
> > +    private String printFormats(String[] formats, int defaultIndex) {
> > +        final StringBuilder sb = new StringBuilder();
> > +        for (int i = 0; i < formats.length; i++) {
> > +            sb.append(formats[i]);
> > +            if(i == defaultIndex) sb.append(" (default)");
> > +            if(i < formats.length - 1) sb.append(", ");
> > +        }
> > +        return sb.toString();
> > +    }
> > +
> >     private String argumentToURI(String uri) {
> >         uri = uri.trim();
> >         if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
> > @@ -268,27 +299,17 @@ public class Rover implements Tool {
> >
> >     private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
> >         final String FORMAT_OPTION = "f";
> > -        String format = DEFAULT_FORMAT;
> > +        String format = FORMATS[DEFAULT_FORMAT_INDEX];
> >         if (cl.hasOption(FORMAT_OPTION)) {
> > -            format = cl.getOptionValue(FORMAT_OPTION);
> > +            format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
> >         }
> > -        final TripleHandler outputHandler;
> > -        if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
> > -            outputHandler = new TurtleWriter(os);
> > -        } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
> > -            outputHandler = new NTriplesWriter(os);
> > -        } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
> > -            outputHandler = new RDFXMLWriter(os);
> > -        } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
> > -            outputHandler = new NQuadsWriter(os);
> > -        } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
> > -            outputHandler = new URIListWriter(os);
> > -        } else {
> > +        try {
> > +            return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
> > +        } catch (Exception e) {
> >             throw new IllegalArgumentException(
> >                     String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
> >             );
> >         }
> > -        return outputHandler;
> >     }
> >
> >     private TripleHandler decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
> > @@ -346,44 +367,54 @@ public class Rover implements Tool {
> >         return any23;
> >     }
> >
> > -    private void performExtraction(Any23 any23, ExtractionParameters eps, String documentURI, TripleHandler th) {
> > +    private void performExtraction(
> > +            Any23 any23, ExtractionParameters eps, DocumentSource documentSource, TripleHandler th
> > +    ) {
> >         try {
> > -            if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
> > -                throw new SpecificExitException("No suitable extractors found.", 2);
> > +            if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
> > +                throw new ExitCodeException("No suitable extractors found.", 2);
> >             }
> >         } catch (ExtractionException ex) {
> > -            throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
> > +            throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
> >         } catch (IOException ex) {
> > -            throw new SpecificExitException("Exception while producing output.", ex, 4);
> > +            throw new ExitCodeException("Exception while producing output.", ex, 4);
> >         }
> >     }
> >
> > -    private void closeHandler(TripleHandler th) {
> > -        if(th == null) return;
> > +    private void closeHandler() {
> > +        if(tripleHandler == null) return;
> >         try {
> > -            th.close();
> > +            tripleHandler.close();
> >         } catch (TripleHandlerException the) {
> > -            throw new SpecificExitException("Error while closing TripleHandler", the, 5);
> > +            throw new ExitCodeException("Error while closing TripleHandler", the, 5);
> >         }
> >     }
> >
> > -    private void closeAll(TripleHandler th, PrintStream os) {
> > -             closeHandler(th);
> > -            if(os != null) os.close();
> > +    private void closeStreams() {
> > +             closeHandler();
> > +            if(outputStream != null) outputStream.close();
> >     }
> >
> > -    private class SpecificExitException extends RuntimeException {
> > +    protected class ExitCodeException extends RuntimeException {
> >
> >         private final int exitCode;
> >
> > -        public SpecificExitException(String message, Throwable cause, int exitCode) {
> > +        public ExitCodeException(String message, Throwable cause, int exitCode) {
> >             super(message, cause);
> >             this.exitCode = exitCode;
> >         }
> > -        public SpecificExitException(String message, int exitCode) {
> > +        public ExitCodeException(String message, int exitCode) {
> >             super(message);
> >             this.exitCode = exitCode;
> >         }
> > +        public ExitCodeException(int exitCode) {
> > +            super();
> > +            this.exitCode = exitCode;
> > +        }
> > +
> > +        protected int getExitCode() {
> > +            return exitCode;
> > +        }
> >     }
> >
> >  }
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java Tue Jan 10 16:32:28 2012
> > @@ -29,6 +29,13 @@ import java.util.Collection;
> >  public interface ExtractorFactory<T extends Extractor<?>> extends ExtractorDescription {
> >
> >     /**
> > +     * Returns the extractor type.
> > +     *
> > +     * @return the not <code>null</code> extractor class.
> > +     */
> > +    Class<T> getExtractorType();
> > +
> > +    /**
> >      * Creates an extractor instance.
> >      *
> >      * @return an instance of the extractor associated to this factory.
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java Tue Jan 10 16:32:28 2012
> > @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
> >  import org.deri.any23.extractor.rdf.NQuadsExtractor;
> >  import org.deri.any23.extractor.rdf.NTriplesExtractor;
> >  import org.deri.any23.extractor.rdf.RDFXMLExtractor;
> > +import org.deri.any23.extractor.rdf.TriXExtractor;
> >  import org.deri.any23.extractor.rdf.TurtleExtractor;
> >  import org.deri.any23.extractor.rdfa.RDFa11Extractor;
> >  import org.deri.any23.extractor.rdfa.RDFaExtractor;
> > @@ -79,6 +80,7 @@ public class ExtractorRegistry {
> >                 instance.register(TurtleExtractor.factory);
> >                 instance.register(NTriplesExtractor.factory);
> >                 instance.register(NQuadsExtractor.factory);
> > +                instance.register(TriXExtractor.factory);
> >                 if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
> >                     instance.register(RDFa11Extractor.factory);
> >                 } else {
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java Tue Jan 10 16:32:28 2012
> > @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
> >         return supportedMIMETypes;
> >     }
> >
> > +    @Override
> > +    public Class<T> getExtractorType() {
> > +        return extractorClass;
> > +    }
> > +
> >     /**
> >      * @return an instance of type T concrete implementation of {@link org.deri.any23.extractor.Extractor}
> >      */
> > +    @Override
> >     public T createExtractor() {
> >         try {
> >             return extractorClass.newInstance();
> > @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
> >     /**
> >      * @return an input example
> >      */
> > +    @Override
> >     public String getExampleInput() {
> >         return exampleInput;
> >     }
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
> >                     Arrays.asList(
> >                             "text/csv;q=0.1"
> >                     ),
> > -                    null,
> > +                    "example-csv.csv",
> >                     CSVExtractor.class
> >             );
> >
> > @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
> >     }
> >
> >     /**
> > +     * Check whether a number is an integer.
> > +     *
> > +     * @param number
> > +     * @return
> > +     */
> > +    private boolean isInteger(String number) {
> > +        try {
> > +            Integer.valueOf(number);
> > +            return true;
> > +        } catch (NumberFormatException e) {
> > +            return false;
> > +        }
> > +    }
> > +
> > +    /**
> > +     * Check whether a number is a float.
> > +     *
> >      * @param number
> >      * @return
> >      */
> > -    private boolean isNumber(String number) {
> > +    private boolean isFloat(String number) {
> >         try {
> > -            Double.valueOf(number);
> > +            Float.valueOf(number);
> >             return true;
> >         } catch (NumberFormatException e) {
> >             return false;
> > @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
> >             object = new URIImpl(cell);
> >         } else {
> >             URI datatype = XMLSchema.STRING;
> > -            if (isNumber(cell)) {
> > +            if (isInteger(cell)) {
> >                 datatype = XMLSchema.INTEGER;
> > +            } else if(isFloat(cell)) {
> > +                datatype = XMLSchema.FLOAT;
> >             }
> >             object = new LiteralImpl(cell, datatype);
> >         }
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
> >                     "html-mf-adr",
> >                     PopularPrefixes.createSubset("rdf", "vcard"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-adr.html",
> >                     AdrExtractor.class
> >             );
> >  }
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
> >                 "html-mf-geo",
> >                 PopularPrefixes.createSubset("rdf", "vcard"),
> >                 Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                null,
> > +                "example-mf-geo.html",
> >                 GeoExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
> >                     "html-mf-hcalendar",
> >                     PopularPrefixes.createSubset("rdf", "ical"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-hcalendar.html",
> >                     HCalendarExtractor.class);
> >
> >     private static final String[] Components = {"Vevent", "Vtodo", "Vjournal", "Vfreebusy"};
> > @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
> >     private boolean extractComponent(Node node, Resource cal, String component) throws ExtractionException {
> >         HTMLDocument compoNode = new HTMLDocument(node);
> >         BNode evt = valueFactory.createBNode();
> > -        addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
> > +        addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
> >         addTextProps(compoNode, evt);
> >         addUrl(compoNode, evt);
> >         addRRule(compoNode, evt);
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
> >                     "html-mf-hcard",
> >                     PopularPrefixes.createSubset("rdf", "vcard"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-hcard.html",
> >                     HCardExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -82,7 +82,7 @@ public class HListingExtractor extends E
> >                     "html-mf-hlisting",
> >                     PopularPrefixes.createSubset("rdf", "hlisting"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-hlisting.html",
> >                     HListingExtractor.class
> >             );
> >
> > @@ -106,7 +106,7 @@ public class HListingExtractor extends E
> >         out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
> >
> >         for (String action : findActions(fragment)) {
> > -            out.writeTriple(listing, hLISTING.action, hLISTING.getResource(action));
> > +            out.writeTriple(listing, hLISTING.action, hLISTING.getClass(action));
> >         }
> >         out.writeTriple(listing, hLISTING.lister, addLister() );
> >         addItem(listing);
> > @@ -154,7 +154,7 @@ public class HListingExtractor extends E
> >                     String value = node.getNodeValue();
> >                     // do not use conditionallyAdd, it won't work cause of evaluation rules
> >                     if (!(null == value || "".equals(value))) {
> > -                        URI property = hLISTING.getPropertyCamelized(klass);
> > +                        URI property = hLISTING.getPropertyCamelCase(klass);
> >                         conditionallyAddLiteralProperty(
> >                                 node,
> >                                 blankItem, property, valueFactory.createLiteral(value)
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
> >                     "html-mf-hrecipe",
> >                     PopularPrefixes.createSubset("rdf", "hrecipe"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-hrecipe.html",
> >                     HRecipeExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
> >                     "html-mf-hresume",
> >                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-hresume.html",
> >                     HResumeExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
> >                     "html-mf-hreview",
> >                     PopularPrefixes.createSubset("rdf", "vcard", "rev"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-hreview.html",
> >                     HReviewExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
> >                     "html-head-links",
> >                     PopularPrefixes.createSubset("xhtml", "dcterms"),
> >                     Arrays.asList("text/html;q=0.05", "application/xhtml+xml;q=0.05"),
> > -                    null,
> > +                    "example-head-link.html",
> >                     HeadLinkExtractor.class);
> >  }
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
> >                     "html-head-icbm",
> >                     PopularPrefixes.createSubset("geo", "rdf"),
> >                     Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
> > -                    null,
> > +                    "example-icbm.html",
> >                     ICBMExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -51,7 +51,7 @@ public class LicenseExtractor implements
> >                     "html-mf-license",
> >                     PopularPrefixes.createSubset("xhtml"),
> >                     Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
> > -                    null,
> > +                    "example-mf-license.html",
> >                     LicenseExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
> >                     "html-mf-species",
> >                     PopularPrefixes.createSubset("rdf", "wo"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-mf-species.html",
> >                     SpeciesExtractor.class
> >             );
> >
> > @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
> >
> >     private URI resolveClassName(String clazz) {
> >         String upperCaseClass = clazz.substring(0, 1);
> > -        return vWO.getResource(
> > +        return vWO.getClass(
> >                 String.format("%s%s",
> >                         upperCaseClass.toUpperCase(),
> >                         clazz.substring(1)
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
> >                     NAME,
> >                     PopularPrefixes.get(),
> >                     Arrays.asList("text/html;q=0.02", "application/xhtml+xml;q=0.02"),
> > -                    null,
> > +                    "example-script-turtle.html",
> >                     TurtleHTMLExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
> >                 "html-mf-xfn",
> >                 PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
> >                 Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                null,
> > +                "example-mf-xfn.html",
> >                 XFNExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java Tue Jan 10 16:32:28 2012
> > @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
> >                     "html-microdata",
> >                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
> >                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> > -                    null,
> > +                    "example-microdata.html",
> >                     MicrodataExtractor.class
> >             );
> >
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java Tue Jan 10 16:32:28 2012
> > @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
> >  import org.deri.any23.extractor.ErrorReporter;
> >  import org.deri.any23.extractor.ExtractionContext;
> >  import org.deri.any23.extractor.ExtractionResult;
> > -import org.deri.any23.parser.NQuadsParser;
> > +import org.deri.any23.io.nquads.NQuadsParser;
> >  import org.deri.any23.rdf.Any23ValueFactoryWrapper;
> >  import org.openrdf.model.impl.ValueFactoryImpl;
> >  import org.openrdf.rio.ParseErrorListener;
> > @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
> >  import org.openrdf.rio.RDFParser;
> >  import org.openrdf.rio.ntriples.NTriplesParser;
> >  import org.openrdf.rio.rdfxml.RDFXMLParser;
> > +import org.openrdf.rio.trix.TriXParser;
> >  import org.openrdf.rio.turtle.TurtleParser;
> >  import org.slf4j.Logger;
> >  import org.slf4j.LoggerFactory;
> > @@ -38,7 +39,7 @@ import java.io.Reader;
> >
> >  /**
> >  * This factory provides a common logic for creating and configuring correctly
> > - * any RDF parser used within the library.
> > + * any <i>RDF</i> parser used within the library.
> >  *
> >  * @author Michele Mostarda (mostarda@fbk.eu)
> >  */
> > @@ -119,7 +120,7 @@ public class RDFParserFactory {
> >     }
> >
> >     /**
> > -     * Returns a new instance of a configured {@link org.deri.any23.parser.NQuadsParser}.
> > +     * Returns a new instance of a configured {@link org.deri.any23.io.nquads.NQuadsParser}.
> >      *
> >      * @param verifyDataType data verification enable if <code>true</code>.
> >      * @param stopAtFirstError the parser stops at first error if <code>true</code>.
> > @@ -139,6 +140,26 @@ public class RDFParserFactory {
> >     }
> >
> >     /**
> > +     * Returns a new instance of a configured {@link TriXParser}.
> > +     *
> > +     * @param verifyDataType data verification enable if <code>true</code>.
> > +     * @param stopAtFirstError the parser stops at first error if <code>true</code>.
> > +     * @param extractionContext the extraction context where the parser is used.
> > +     * @param extractionResult the output extraction result.
> > +     * @return a new instance of a configured TriX parser.
> > +     */
> > +    public TriXParser getTriXParser(
> > +            final boolean verifyDataType,
> > +            final boolean stopAtFirstError,
> > +            final ExtractionContext extractionContext,
> > +            final ExtractionResult extractionResult
> > +    ) {
> > +        final TriXParser parser = new TriXParser();
> > +        configureParser(parser, verifyDataType, stopAtFirstError, extractionContext, extractionResult);
> > +        return parser;
> > +    }
> > +
> > +    /**
> >      * Configures the given parser on the specified extraction result
> >      * setting the policies for data verification and error handling.
> >      *
> >
> >
>
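
For anyone reading the CSVExtractor hunk quoted above, here is a minimal standalone sketch of the integer-then-float check it introduces. This is not the committed code: the class name CsvDatatypeSketch and the plain string labels (standing in for the XMLSchema.INTEGER, XMLSchema.FLOAT and XMLSchema.STRING constants used in the diff) are assumptions, chosen so the example runs with the JDK alone.

```java
public class CsvDatatypeSketch {

    // Hypothetical labels standing in for the XMLSchema constants in the diff.
    static final String XSD_INTEGER = "xsd:integer";
    static final String XSD_FLOAT   = "xsd:float";
    static final String XSD_STRING  = "xsd:string";

    // True if the cell parses as a Java int, as in the diff's isInteger().
    static boolean isInteger(String cell) {
        try {
            Integer.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the cell parses as a Java float, as in the diff's isFloat().
    static boolean isFloat(String cell) {
        try {
            Float.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Integer is tested first; float is the fallback; anything else is a string.
    static String inferDatatype(String cell) {
        if (isInteger(cell)) {
            return XSD_INTEGER;
        } else if (isFloat(cell)) {
            return XSD_FLOAT;
        }
        return XSD_STRING;
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "hello", "3000000000"}) {
            System.out.println(cell + " -> " + inferDatatype(cell));
        }
    }
}
```

The order of the two checks matters: every int literal also parses as a float, so the integer test has to come first. A value outside the int range, such as 3000000000, fails Integer.valueOf and therefore falls through to the float branch.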



-- 
Michele Mostarda
Senior Software Engineer
skype: michele.mostarda
twitter: micmos
mail: me@michelemostarda.com
site : http://www.michelemostarda.com