Posted to dev@any23.apache.org by Simone Tripodi <si...@apache.org> on 2012/01/10 18:08:27 UTC
Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./
any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/
any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/
any23-core/src/main/java/or
Hi Mic,
this is great, thanks for the hard work of merging!
The next step is renaming the packages to org.apache.any23 :)
All the best, have a nice day!
-Simo
http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/
On Tue, Jan 10, 2012 at 5:32 PM, <mo...@apache.org> wrote:
> Author: mostarda
> Date: Tue Jan 10 16:32:28 2012
> New Revision: 1229627
>
> URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
> Log:
> This commit synchronizes the discontinued Any23 Google Code SVN repo [1]
> with the current Apache Any23 SVN repo, including the issues
> developed during the initial import transition phase.
> These issues were tracked on the original Any23 Google Code Issue Tracker [2].
> Below is an extract of the original repository commit log.
>
> This commit is related to issue ANY23-27.
>
> [1] http://any23.googlecode.com/svn/trunk/
> [2] http://code.google.com/p/any23/issues/list
>
> ==== BEGIN: Original Log ====
>
> hardest-mac:gcode-svn hardest$ svn log -r 1548:HEAD
> ------------------------------------------------------------------------
> r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>
> Improved numeric datatype assignment. This commit fixes issue #208.
> ------------------------------------------------------------------------
> r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line
>
> Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'. Fixed HTMLMetaExtractorTest.java to match this new
> namespace. Discovered and fixed an issue in the SINDICE.java vocabulary: NS was declared as a resource instead of as a URI. Fixed
> RDFSchemaUtilsTest.java, whose sizes were wrong due to the wrong NS declaration. This commit is related to issue #203.
> ------------------------------------------------------------------------
> r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line
>
> Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found a wrong declaration of Class(Resource) in the WO.java
> vocab. Fixed and updated RDFSchemaUtils.java test. This commit is related to issue #198.
> ------------------------------------------------------------------------
> r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line
>
> Added utility method.
> ------------------------------------------------------------------------
> r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line
>
> Improved Vocabulary.java class: added support for comments to any resource. Improved RDFSchemaUtils.java serialization
> support, added separators to RDFXML serialization. This commit is related to issue #198.
> ------------------------------------------------------------------------
> r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) | 1 line
>
> Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved prefix declaration parsing in RDFa11Parser; the
> new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for prefix mapping resolution in
> RDFa11Parser, which allows correct support for the structured properties introduced by the latest version of the Open
> Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest to the new output of vocabularies serialization.
> Updated Any23PluginManagerTest to include a new class. This commit is related to issue #206.
> ------------------------------------------------------------------------
> r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) | 1 line
>
> Restricted scope of testGetClassesFromClasspath to avoid updating it every time a new class is added.
> ------------------------------------------------------------------------
> r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) | 1 line
>
> Improved validation mode support. Improved descriptions of Validation and Report fields. This commit is related to issue
> #209.
> ------------------------------------------------------------------------
> r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) | 1 line
>
> Improved Any23 Service XML Report format documentation.
> ------------------------------------------------------------------------
> r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) | 1 line
>
> Added URL encoding to the source location path. This commit fixes issue #205. Chose not to write a formal test, since it
> would require the creation of folders with spaces.
> ------------------------------------------------------------------------
> r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) | 1 line
>
> Removed obsolete section.
> ------------------------------------------------------------------------
> r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) | 1 line
>
> Improved Any23 facade, added method createDocumentSource() to simplify the extraction setup.
> ------------------------------------------------------------------------
> r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) | 1 line
>
> Refactored Rover CLI class to make it extensible from other CLI implementations.
> ------------------------------------------------------------------------
> r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) | 3 lines
>
> Removed wrong artifact name.
>
>
> ------------------------------------------------------------------------
> r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) | 1 line
>
> Upload by wagon-svn
> ------------------------------------------------------------------------
> r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) | 1 line
>
> Removed no longer used jspf lib. Added crawler4j dependencies. Added README. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) | 1 line
>
> Changed attributes visibility to facilitate the class extensibility.
> ------------------------------------------------------------------------
> r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) | 1 line
>
> Added helper methods to extract file lines as list of strings. Improved javadoc.
> ------------------------------------------------------------------------
> r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) | 1 line
>
> Added first version of basic-crawler plugin. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) | 1 line
>
> Added plugins README.
> ------------------------------------------------------------------------
> r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) | 1 line
>
> Updated main README, added references to plugin and lib.
> ------------------------------------------------------------------------
> r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) | 1 line
>
> Fixed assembly name.
> ------------------------------------------------------------------------
> r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) | 1 line
>
> Fixed Tool signature. This commit is related to #211.
> ------------------------------------------------------------------------
> r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) | 1 line
>
> Improved logging.
> ------------------------------------------------------------------------
> r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) | 1 line
>
> Included plugin basic-crawler in reactor. Improved ToolRunner and Any23PluginManager tests to be compliant with the new
> plugin classes. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) | 1 line
>
> Fixed Crawler4j group id. Related to issue #211.
> ------------------------------------------------------------------------
> r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) | 1 line
>
> Improved plugin documentation. Introduced Office Scraper specific page. This commit is related to issue #213.
> ------------------------------------------------------------------------
> r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed POST method documentation. Related to issue #213.
> ------------------------------------------------------------------------
> r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed code snippets, prettified, added missing finalization logic. See issue #187.
> ------------------------------------------------------------------------
> r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed var name. See #187.
> ------------------------------------------------------------------------
> r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) | 1 line
>
> Updated code snippets and tutorial, added explicit TripleHandler closure. This commit is related to issue #187.
> ------------------------------------------------------------------------
> r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) | 1 line
>
> Fixed data type handling management in NQuadsParser. This commit is related to issue #210.
> ------------------------------------------------------------------------
> r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) | 1 line
>
> Added missing JSON output format. See #214.
> ------------------------------------------------------------------------
> r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) | 1 line
>
> Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output format support to Rover. This commit is related to
> issue #215.
> ------------------------------------------------------------------------
> r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) | 1 line
>
> Added Sesame TriX IO dependency. This commit is related to #215.
> ------------------------------------------------------------------------
> r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) | 1 line
>
> Some suppressed tests have been reactivated as Ignored.
> ------------------------------------------------------------------------
> r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) | 1 line
>
> Added TriX output format to the Any23 Service. Commit related to issue #215.
> ------------------------------------------------------------------------
> r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) | 1 line
>
> Improved FormatWriter management, added WriterRegistry. Improved Writer format management in Rover and WebResponder.
> This commit is related to issues #215 and #216.
> ------------------------------------------------------------------------
> r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) | 6 lines
>
> Added TriXExtractor and textual example (example-trix.trx), added trix support in RDFParserFactory.
> Registered TriXExtractor to the ExtractorRegistry.
> Added TriX mimetype support in TikaMIMETypeDetector (through mimetypes.xml) and added specific test.
> Added support and doc to TriX format in Any23 Service web page (form.html).
> This commit is related to issue #215.
>
> ------------------------------------------------------------------------
> r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) | 1 line
>
> Fixed number of extractors (+1 after adding TriXExtractor). Commit related to issue #215.
> ------------------------------------------------------------------------
> r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) | 1 line
>
> Added method getExtractorType() .
> ------------------------------------------------------------------------
> r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) | 4 lines
>
> Improved ExtractorDocumentation support, added missing format examples.
> Improved output layout. This commit is related to issue #194.
>
>
> ------------------------------------------------------------------------
> r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) | 1 line
>
> Improved classpath management in Any23PluginManager. Renamed getClasses* to loadClasses* . This commit is related to
> issue #212.
> ------------------------------------------------------------------------
> r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) | 1 line
>
> Separated log messages from specific output data.
> ------------------------------------------------------------------------
> r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) | 1 line
>
> Added human readable report printing support in ReportingTripleHandler and Rover.
> ------------------------------------------------------------------------
> r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) | 1 line
>
> Fixed major issue in output generation, added final activity report, help prettification. This commit is related to
> issue #211.
> ------------------------------------------------------------------------
> r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) | 1 line
>
> Upgraded to Sesame 2.6.1. See issue #217.
> ------------------------------------------------------------------------
> r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) | 1 line
>
> Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue #216
> ------------------------------------------------------------------------
> r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) | 1 line
>
> Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
> ------------------------------------------------------------------------
> r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) | 1 line
>
> Added specific Crawler CLI documentation. Updated general CLI documentation. This commit is related to issue #211.
> ------------------------------------------------------------------------
> r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) | 4 lines
>
> The Eval CLI Tool has been removed as well as the org.deri.any23.eval package classes related to it.
> Updated tests verifying CLI tool detection.
> This commit is related to issue #218.
>
> ------------------------------------------------------------------------
> r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) | 5 lines
>
> Added MimeDetector CLI Tool and test case, removed main() from
> TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
> Updated CLI doc.
> This commit is related to issue #219.
>
> ------------------------------------------------------------------------
> r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) | 1 line
>
> Added support for comment serialization. Related to issue #158.
> ------------------------------------------------------------------------
> r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) | 1 line
>
> Add support for annotation writing in FormatWriter implementations. This commit is related to issue #158.
> ------------------------------------------------------------------------
> r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) | 1 line
>
> Added support for 'annotate' flag in Any23 Service.
> ------------------------------------------------------------------------
>
> ==== END : Original Log ====
>
>
> Added:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
> incubator/any23/trunk/any23-core/src/test/resources/application/trix/
> incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
> incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
> incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
> incubator/any23/trunk/lib/README.txt
> incubator/any23/trunk/plugins/README.txt
> incubator/any23/trunk/plugins/basic-crawler/
> incubator/any23/trunk/plugins/basic-crawler/pom.xml
> incubator/any23/trunk/plugins/basic-crawler/src/
> incubator/any23/trunk/plugins/basic-crawler/src/main/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
> incubator/any23/trunk/plugins/basic-crawler/src/test/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
> incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
> Removed:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
> Modified:
> incubator/any23/trunk/README.txt
> incubator/any23/trunk/any23-core/bin/any23
> incubator/any23/trunk/any23-core/bin/any23tools
> incubator/any23/trunk/any23-core/pom.xml
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
> incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
> incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
> incubator/any23/trunk/lib/install-deps.sh
> incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
> incubator/any23/trunk/pom.xml
> incubator/any23/trunk/src/site/apt/any23-plugins.apt
> incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
> incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
> incubator/any23/trunk/src/site/apt/getting-started.apt
> incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
> incubator/any23/trunk/src/site/apt/service.apt
> incubator/any23/trunk/src/site/apt/supported-formats.apt
>
> Modified: incubator/any23/trunk/README.txt
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/README.txt (original)
> +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
> @@ -20,7 +20,8 @@ Distribution Content
>
> any23-core The library core codebase.
> any23-service The library HTTP service codebase.
> -plugins Library plugins codebase.
> +lib Contains the Any23 the external deps (read lib/README.txt for further details).
> +plugins Library plugins codebase (read plugins/README.txt for further details).
> RELEASE-NOTES.txt File reporting main release notes for every version.
> LICENSE.txt Applicable project license.
> README.txt This file.
> @@ -240,15 +241,14 @@ Upload the produced packages in download
>
> http://code.google.com/p/any23/downloads/list
>
> +--------------------
> +Manage External Deps
> +--------------------
>
> -Fix Release Procedure
> ----------------------
> -
> - Currently the *plugins/integration-test* module is excluded from the parent
> - reactor.
> - To fix it in tag follow procedure as described at issue #171:
> -
> - http://code.google.com/p/any23/issues/detail?id=171
> +::Developers interest only.::
>
> +External Deps are libraries used by some Any23 modules which are
> +not available in public Maven repositories. Such libraries are
> +managed within the 'lib' dir.
>
> EOF
>
> Modified: incubator/any23/trunk/any23-core/bin/any23
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/bin/any23 (original)
> +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
> @@ -9,12 +9,12 @@
> ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>
> if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> - echo "Generating executable JAR..."
> - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> + echo "Generating executable JAR..." >&2
> + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> ||\
> - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> ||\
> - { echo "Error while generating commandline assembly."; exit 1; }
> + { echo "Error while generating commandline assembly." >&2; exit 1; }
> fi
>
> SEP=':'
>
> Modified: incubator/any23/trunk/any23-core/bin/any23tools
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/bin/any23tools (original)
> +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
> @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
> PLUGINS_DIR=plugins
>
> if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> - echo "Generating executable JAR..."
> - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> + echo "Generating executable JAR..." >&2
> + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> ||\
> - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> ||\
> - { echo "Error while generating commandline assembly."; exit 1; }
> + { echo "Error while generating commandline assembly." >&2; exit 1; }
> fi
>
> SEP=':'
> @@ -30,6 +30,7 @@ done
> # Plugins classpath.
> for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
> do
> + echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
> if [ ! -e "$jar" ]; then continue; fi
> CP="$CP$SEP$jar"
> done
>
> Modified: incubator/any23/trunk/any23-core/pom.xml
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/pom.xml (original)
> +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
> @@ -92,6 +92,10 @@
> </dependency>
> <dependency>
> <groupId>org.openrdf.sesame</groupId>
> + <artifactId>sesame-rio-trix</artifactId>
> + </dependency>
> + <dependency>
> + <groupId>org.openrdf.sesame</groupId>
> <artifactId>sesame-repository-sail</artifactId>
> </dependency>
> <dependency>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
> @@ -258,6 +258,28 @@ public class Any23 {
> }
>
> /**
> + * Returns the most appropriate {@link DocumentSource} for the given<code>documentURI</code>.
> + *
> + * @param documentURI the document <i>URI</i>.
> + * @return a new instance of DocumentSource.
> + * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
> + * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
> + */
> + public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
> + if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
> + if (documentURI.toLowerCase().startsWith("file:")) {
> + return new FileDocumentSource( new File(new URI(documentURI)) );
> + }
> + if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> + return new HTTPDocumentSource(getHTTPClient(), documentURI);
> + }
> + throw new IllegalArgumentException(
> + String.format("Unsupported protocol for document URI: '%s' .", documentURI)
> + );
> + }
> +
> +
> + /**
> * Performs metadata extraction from the content of the given
> * <code>in</code> document source, sending the generated events
> * to the specified <code>outputHandler</code>.
> @@ -363,13 +385,7 @@ public class Any23 {
> public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
> throws IOException, ExtractionException {
> try {
> - if (documentURI.toLowerCase().startsWith("file:")) {
> - return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
> - }
> - if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> - return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
> - }
> - throw new ExtractionException("Not a valid absolute URI: " + documentURI);
> + return extract(eps, createDocumentSource(documentURI), outputHandler);
> } catch (URISyntaxException ex) {
> throw new ExtractionException("Error while extracting data from document URI.", ex);
> }
>
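
A minimal, standalone sketch of the scheme dispatch that the new createDocumentSource() performs in the hunk above. The names SchemeDispatchSketch, SourceKind and classify are hypothetical; the enum only stands in for the real FileDocumentSource and HTTPDocumentSource classes, which are not reproduced here.

```java
import java.util.Locale;

public class SchemeDispatchSketch {

    /** Hypothetical stand-in for the concrete DocumentSource implementations. */
    public enum SourceKind { FILE, HTTP }

    /** Picks a source kind from the URI scheme, ignoring case. */
    public static SourceKind classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // Locale.ROOT keeps the lower-casing independent of the default locale.
        final String lower = documentURI.toLowerCase(Locale.ROOT);
        if (lower.startsWith("file:")) {
            return SourceKind.FILE;
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return SourceKind.HTTP;
        }
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s'.", documentURI)
        );
    }

    public static void main(String[] args) {
        System.out.println(classify("file:///tmp/page.html"));
        System.out.println(classify("HTTPS://example.org/"));
    }
}
```

One thing that may be worth checking: as far as this hunk shows, extract(eps, documentURI, outputHandler) only catches URISyntaxException, so an unsupported scheme would now surface as an unchecked IllegalArgumentException instead of the former ExtractionException.
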
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
> @@ -16,7 +16,7 @@
>
> package org.deri.any23.cli;
>
> -import org.deri.any23.LogUtil;
> +import org.deri.any23.util.LogUtils;
> import org.deri.any23.extractor.ExampleInputOutput;
> import org.deri.any23.extractor.ExtractionException;
> import org.deri.any23.extractor.Extractor;
> @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
> }
>
> public int run(String[] args) {
> - LogUtil.setDefaultLogging();
> + LogUtils.setDefaultLogging();
> try {
> if (args.length == 0) {
> printUsage();
> @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
> * Prints the list of all the available extractors.
> */
> public void printExtractorList() {
> - for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
> - System.out.println(extractorName);
> + for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
> + System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
> }
> }
>
> @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
> ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
> ExampleInputOutput example = new ExampleInputOutput(factory);
> System.out.println("Extractor: " + extractorName);
> - System.out.println(" type: " + getType(factory));
> - String output = example.getExampleOutput();
> - if (output == null) {
> - System.out.println("(no example output)");
> + System.out.println("\ttype: " + getType(factory));
> + System.out.println();
> + final String exampleInput = example.getExampleInput();
> + if(exampleInput == null) {
> + System.out.println("(No Example Available)");
> } else {
> - System.out.println("-------- example output --------");
> - System.out.println(output);
> + System.out.println("-------- Example Input --------");
> + System.out.println(exampleInput);
> + System.out.println("-------- Example Output --------");
> + String output = example.getExampleOutput();
> + System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
> }
> - System.out.println();
> System.out.println("================================");
> + System.out.println();
> }
> }
>
>
> Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
> @@ -0,0 +1,113 @@
> +/*
> + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + * http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +package org.deri.any23.cli;
> +
> +import org.deri.any23.configuration.DefaultConfiguration;
> +import org.deri.any23.http.DefaultHTTPClient;
> +import org.deri.any23.http.HTTPClient;
> +import org.deri.any23.http.HTTPClientConfiguration;
> +import org.deri.any23.mime.MIMEType;
> +import org.deri.any23.mime.MIMETypeDetector;
> +import org.deri.any23.mime.TikaMIMETypeDetector;
> +import org.deri.any23.source.DocumentSource;
> +import org.deri.any23.source.FileDocumentSource;
> +import org.deri.any23.source.HTTPDocumentSource;
> +import org.deri.any23.source.StringDocumentSource;
> +
> +import java.io.File;
> +import java.net.URISyntaxException;
> +
> +/**
> + * Commandline tool to detect <b>MIME Type</b>s from
> + * file, HTTP and direct input sources.
> + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
> + *
> + * @author Michele Mostarda (mostarda@fbk.eu)
> + */
> +@ToolRunner.Description("MIME Type Detector Tool.")
> +public class MimeDetector implements Tool{
> +
> + public static final String FILE_DOCUMENT_PREFIX = "file://";
> + public static final String INLINE_DOCUMENT_PREFIX = "inline://";
> + public static final String URL_DOCUMENT_RE = "^https?://.*";
> +
> + public static void main(String[] args) {
> + System.exit( new MimeDetector().run(args) );
> + }
> +
> + @Override
> + public int run(String[] args) {
> + if(args.length != 1) {
> + System.err.println("USAGE: {http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
> + return 1;
> + }
> +
> + final String document = args[0];
> + try {
> + final DocumentSource documentSource = createDocumentSource(document);
> + final MIMETypeDetector detector = new TikaMIMETypeDetector();
> + final MIMEType mimeType = detector.guessMIMEType(
> + documentSource.getDocumentURI(),
> + documentSource.openInputStream(),
> + MIMEType.parse(documentSource.getContentType())
> + );
> + System.out.println(mimeType);
> + return 0;
> + } catch (Exception e) {
> + System.err.print("Error while detecting MIME Type.");
> + e.printStackTrace(System.err);
> + return 1;
> + }
> + }
> +
> + private DocumentSource createDocumentSource(String document) throws URISyntaxException {
> + if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
> + return new FileDocumentSource(
> + new File(
> + document.substring(FILE_DOCUMENT_PREFIX.length())
> + )
> + );
> + }
> + if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
> + return new StringDocumentSource(
> + document.substring(INLINE_DOCUMENT_PREFIX.length()),
> + ""
> + );
> + }
> + if(document.matches(URL_DOCUMENT_RE)) {
> + final HTTPClient client = new DefaultHTTPClient();
> + // TODO: anonymous config class also used in Any23. centralize.
> + client.init(new HTTPClientConfiguration() {
> + public String getUserAgent() {
> + return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
> + }
> + public String getAcceptHeader() {
> + return "";
> + }
> + public int getDefaultTimeout() {
> + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
> + }
> + public int getMaxConnections() {
> + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
> + }
> + });
> + return new HTTPDocumentSource(client, document);
> + }
> + throw new IllegalArgumentException("Unsupported protocol for document " + document);
> + }
> +
> +}
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
> @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
> import org.apache.commons.cli.Options;
> import org.apache.commons.cli.PosixParser;
> import org.deri.any23.Any23;
> -import org.deri.any23.LogUtil;
> +import org.deri.any23.util.LogUtils;
> import org.deri.any23.configuration.Configuration;
> import org.deri.any23.configuration.DefaultConfiguration;
> import org.deri.any23.extractor.ExtractionException;
> @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
> import org.deri.any23.extractor.SingleDocumentExtraction;
> import org.deri.any23.filter.IgnoreAccidentalRDFa;
> import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
> +import org.deri.any23.source.DocumentSource;
> import org.deri.any23.writer.BenchmarkTripleHandler;
> import org.deri.any23.writer.LoggingTripleHandler;
> -import org.deri.any23.writer.NQuadsWriter;
> -import org.deri.any23.writer.NTriplesWriter;
> -import org.deri.any23.writer.RDFXMLWriter;
> import org.deri.any23.writer.ReportingTripleHandler;
> import org.deri.any23.writer.TripleHandler;
> import org.deri.any23.writer.TripleHandlerException;
> -import org.deri.any23.writer.TurtleWriter;
> -import org.deri.any23.writer.URIListWriter;
> +import org.deri.any23.writer.WriterRegistry;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> @@ -51,6 +48,7 @@ import java.io.OutputStream;
> import java.io.PrintStream;
> import java.io.PrintWriter;
> import java.net.MalformedURLException;
> +import java.net.URISyntaxException;
> import java.net.URL;
>
> import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
> @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
> * A default rover implementation. Goes and fetches a URL using an hint
> * as to what format should require, then tries to convert it to RDF.
> *
> - * @author Gabriele Renzi
> - * @author Richard Cyganiak (richard@cyganiak.de)
> * @author Michele Mostarda (mostarda@fbk.eu)
> + * @author Richard Cyganiak (richard@cyganiak.de)
> + * @author Gabriele Renzi
> */
> @ToolRunner.Description("Any23 Command Line Tool.")
> public class Rover implements Tool {
>
> - // Supported formats.
> - private static final String TURTLE_FORMAT = "turtle";
> - private static final String NTRIPLE_FORMAT = "ntriples";
> - private static final String RDFXML_FORMAT = "rdfxml";
> - private static final String NQUADS_FORMAT = "nquads";
> - private static final String URIS_FORMAT = "uris";
> -
> - private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
> + private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
> + private static final int DEFAULT_FORMAT_INDEX = 0;
>
> private static final Logger logger = LoggerFactory.getLogger(Rover.class);
>
> - private static Options options;
> + private Options options;
>
> - public static void main(String[] args) {
> - System.exit( new Rover().run(args) );
> - }
> + private CommandLine commandLine;
>
> - public int run(String[] args) {
> - final CommandLineParser parser = new PosixParser();
> - final CommandLine commandLine;
> + private boolean verbose = false;
>
> - boolean verbose = false;
> - try {
> - options = createOptions();
> - commandLine = parser.parse(options, args);
> + private PrintStream outputStream;
> + private TripleHandler tripleHandler;
> + private ReportingTripleHandler reportingTripleHandler;
> + private BenchmarkTripleHandler benchmarkTripleHandler;
>
> - if (commandLine.hasOption("h")) {
> - printHelp();
> - return 0;
> - }
> + private ExtractionParameters eps;
> + private Any23 any23;
>
> - if (commandLine.hasOption('v')) {
> - verbose = true;
> - LogUtil.setVerboseLogging();
> - } else {
> - LogUtil.setDefaultLogging();
> - }
> -
> - if (commandLine.getArgs().length < 1) {
> - printHelp();
> - throw new IllegalArgumentException("Expected at least 1 argument.");
> - }
> + protected boolean isVerbose() {
> + return verbose;
> + }
>
> - final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
> - final String[] extractorNames = getExtractors(commandLine);
> + public static void main(String[] args) {
> + System.exit( new Rover().run(args) );
> + }
>
> - PrintStream outputStream = null;
> - TripleHandler tripleHandler = null;
> - try {
> - outputStream = getOutputStream(commandLine);
> + public int run(String[] args) {
> + try {
> + final String[] uris = configure(args);
> + performExtraction(uris);
> + return 0;
> + } catch (Exception e) {
> + System.err.println( e.getMessage() );
> + final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
> + if(verbose) e.printStackTrace(System.err);
> + return exitCode;
> + }
> + }
>
> - tripleHandler = getTripleHandler(commandLine, outputStream);
> + protected CommandLine getCommandLine() {
> + if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
> + return commandLine;
> + }
>
> - tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> + protected String[] configure(String[] args) throws Exception {
> + final CommandLineParser parser = new PosixParser();
> + options = createOptions();
> + commandLine = parser.parse(options, args);
>
> - tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
> - final BenchmarkTripleHandler benchmarkTripleHandler =
> - tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
> + if (commandLine.hasOption("h")) {
> + printHelp();
> + throw new ExitCodeException(0);
> + }
>
> - tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
> + if (commandLine.hasOption('v')) {
> + verbose = true;
> + LogUtils.setVerboseLogging();
> + } else {
> + LogUtils.setDefaultLogging();
> + }
>
> - final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
> + if (commandLine.getArgs().length < 1) {
> + printHelp();
> + throw new IllegalArgumentException("Expected at least 1 argument.");
> + }
>
> - final ExtractionParameters eps = getExtractionParameters(commandLine);
> + final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
> + final String[] extractorNames = getExtractors(commandLine);
>
> - final Any23 any23 = createAny23(extractorNames);
> + try {
> + outputStream = getOutputStream(commandLine);
> + tripleHandler = getTripleHandler(commandLine, outputStream);
> + tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> + tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>
> - final long start = System.currentTimeMillis();
> - for(String inputURI : inputURIs) {
> - performExtraction(any23, eps, inputURI, reportingTripleHandler);
> - }
> - final long elapsed = System.currentTimeMillis() - start;
> + benchmarkTripleHandler =
> + tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>
> - closeAll(tripleHandler, outputStream);
> + tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>
> - if (benchmarkTripleHandler != null) {
> - System.err.println( benchmarkTripleHandler.report() );
> - }
> + reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
> + eps = getExtractionParameters(commandLine);
> + any23 = createAny23(extractorNames);
>
> - logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
> - logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> - } finally {
> - closeAll(tripleHandler, outputStream);
> - }
> + return inputURIs;
> } catch (Exception e) {
> - System.err.println(e.getMessage());
> - final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
> - if(verbose) e.printStackTrace(System.err);
> - return exitCode;
> + closeStreams();
> + throw e;
> }
> - return 0;
> }
>
> - private Options createOptions() {
> + protected Options createOptions() {
> final Options options = new Options();
> options.addOption(
> new Option("v", "verbose", false, "Show debug and progress information.")
> @@ -178,13 +175,7 @@ public class Rover implements Tool {
> "f",
> "Output format",
> true,
> - "[" +
> - TURTLE_FORMAT + " (default), " +
> - NTRIPLE_FORMAT + ", " +
> - RDFXML_FORMAT + ", " +
> - NQUADS_FORMAT + ", " +
> - URIS_FORMAT +
> - "]"
> + "[" + printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
> )
> );
> options.addOption(
> @@ -208,11 +199,51 @@ public class Rover implements Tool {
> return options;
> }
>
> + protected void performExtraction(DocumentSource documentSource) {
> + performExtraction(any23, eps, documentSource, reportingTripleHandler);
> + }
> +
> + protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
> + try {
> + final long start = System.currentTimeMillis();
> + for (String inputURI : inputURIs) {
> + performExtraction( any23.createDocumentSource(inputURI) );
> + }
> + final long elapsed = System.currentTimeMillis() - start;
> +
> + if (benchmarkTripleHandler != null) {
> + System.err.println(benchmarkTripleHandler.report());
> + }
> +
> + logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
> + logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> + } finally {
> + closeStreams();
> + }
> + }
> +
> + protected String printReports() {
> + final StringBuilder sb = new StringBuilder();
> + if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
> + if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
> + return sb.toString();
> + }
> +
> private void printHelp() {
> HelpFormatter formatter = new HelpFormatter();
> formatter.printHelp("[{<url>|<file>}]+", options, true);
> }
>
> + private String printFormats(String[] formats, int defaultIndex) {
> + final StringBuilder sb = new StringBuilder();
> + for (int i = 0; i < formats.length; i++) {
> + sb.append(formats[i]);
> + if(i == defaultIndex) sb.append(" (default)");
> + if(i < formats.length - 1) sb.append(", ");
> + }
> + return sb.toString();
> + }
> +
> private String argumentToURI(String uri) {
> uri = uri.trim();
> if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
> @@ -268,27 +299,17 @@ public class Rover implements Tool {
>
> private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
> final String FORMAT_OPTION = "f";
> - String format = DEFAULT_FORMAT;
> + String format = FORMATS[DEFAULT_FORMAT_INDEX];
> if (cl.hasOption(FORMAT_OPTION)) {
> - format = cl.getOptionValue(FORMAT_OPTION);
> + format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
> }
> - final TripleHandler outputHandler;
> - if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
> - outputHandler = new TurtleWriter(os);
> - } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
> - outputHandler = new NTriplesWriter(os);
> - } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
> - outputHandler = new RDFXMLWriter(os);
> - } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
> - outputHandler = new NQuadsWriter(os);
> - } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
> - outputHandler = new URIListWriter(os);
> - } else {
> + try {
> + return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
> + } catch (Exception e) {
> throw new IllegalArgumentException(
> String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
> );
> }
> - return outputHandler;
> }
>
> private TripleHandler decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
> @@ -346,44 +367,54 @@ public class Rover implements Tool {
> return any23;
> }
>
> - private void performExtraction(Any23 any23, ExtractionParameters eps, String documentURI, TripleHandler th) {
> + private void performExtraction(
> + Any23 any23, ExtractionParameters eps, DocumentSource documentSource, TripleHandler th
> + ) {
> try {
> - if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
> - throw new SpecificExitException("No suitable extractors found.", 2);
> + if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
> + throw new ExitCodeException("No suitable extractors found.", 2);
> }
> } catch (ExtractionException ex) {
> - throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
> + throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
> } catch (IOException ex) {
> - throw new SpecificExitException("Exception while producing output.", ex, 4);
> + throw new ExitCodeException("Exception while producing output.", ex, 4);
> }
> }
>
> - private void closeHandler(TripleHandler th) {
> - if(th == null) return;
> + private void closeHandler() {
> + if(tripleHandler == null) return;
> try {
> - th.close();
> + tripleHandler.close();
> } catch (TripleHandlerException the) {
> - throw new SpecificExitException("Error while closing TripleHandler", the, 5);
> + throw new ExitCodeException("Error while closing TripleHandler", the, 5);
> }
> }
>
> - private void closeAll(TripleHandler th, PrintStream os) {
> - closeHandler(th);
> - if(os != null) os.close();
> + private void closeStreams() {
> + closeHandler();
> + if(outputStream != null) outputStream.close();
> }
>
> - private class SpecificExitException extends RuntimeException {
> + protected class ExitCodeException extends RuntimeException {
>
> private final int exitCode;
>
> - public SpecificExitException(String message, Throwable cause, int exitCode) {
> + public ExitCodeException(String message, Throwable cause, int exitCode) {
> super(message, cause);
> this.exitCode = exitCode;
> }
> - public SpecificExitException(String message, int exitCode) {
> + public ExitCodeException(String message, int exitCode) {
> super(message);
> this.exitCode = exitCode;
> }
> + public ExitCodeException(int exitCode) {
> + super();
> + this.exitCode = exitCode;
> + }
> +
> + protected int getExitCode() {
> + return exitCode;
> + }
> }
>
> }
>
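
The reworked Rover.run() maps exceptions to process exit codes through ExitCodeException. Below is a small, self-contained sketch of that pattern; ExitCodeSketch, Task and the static run() helper are hypothetical names, not part of Any23, and the nested exception only mirrors the shape shown in the diff.

```java
public class ExitCodeSketch {

    /** Hypothetical analogue of the nested Rover.ExitCodeException. */
    public static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        public ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        public ExitCodeException(int exitCode) {
            super();
            this.exitCode = exitCode;
        }

        public int getExitCode() {
            return exitCode;
        }
    }

    /** Hypothetical unit of work that may fail with any exception. */
    public interface Task {
        void execute() throws Exception;
    }

    /** Runs the task and translates the outcome into an exit code. */
    public static int run(Task task) {
        try {
            task.execute();
            return 0;
        } catch (Exception e) {
            // Skip the message when there is none, e.g. ExitCodeException(0).
            if (e.getMessage() != null) {
                System.err.println(e.getMessage());
            }
            return e instanceof ExitCodeException
                    ? ((ExitCodeException) e).getExitCode()
                    : 1;
        }
    }

    public static void main(String[] args) {
        final int code = run(() -> {
            throw new ExitCodeException("No suitable extractors found.", 2);
        });
        System.out.println(code);
    }
}
```

The null check is the only deliberate difference from the diff: with the -h option, configure() throws ExitCodeException(0) without a message, so the System.err.println( e.getMessage() ) in run() would print "null" after the help text.
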
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java Tue Jan 10 16:32:28 2012
> @@ -29,6 +29,13 @@ import java.util.Collection;
> public interface ExtractorFactory<T extends Extractor<?>> extends ExtractorDescription {
>
> /**
> + * Returns the extractor type.
> + *
> + * @return the not <code>null</code> extractor class.
> + */
> + Class<T> getExtractorType();
> +
> + /**
> * Creates an extractor instance.
> *
> * @return an instance of the extractor associated to this factory.
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java Tue Jan 10 16:32:28 2012
> @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
> import org.deri.any23.extractor.rdf.NQuadsExtractor;
> import org.deri.any23.extractor.rdf.NTriplesExtractor;
> import org.deri.any23.extractor.rdf.RDFXMLExtractor;
> +import org.deri.any23.extractor.rdf.TriXExtractor;
> import org.deri.any23.extractor.rdf.TurtleExtractor;
> import org.deri.any23.extractor.rdfa.RDFa11Extractor;
> import org.deri.any23.extractor.rdfa.RDFaExtractor;
> @@ -79,6 +80,7 @@ public class ExtractorRegistry {
> instance.register(TurtleExtractor.factory);
> instance.register(NTriplesExtractor.factory);
> instance.register(NQuadsExtractor.factory);
> + instance.register(TriXExtractor.factory);
> if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
> instance.register(RDFa11Extractor.factory);
> } else {
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java Tue Jan 10 16:32:28 2012
> @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
> return supportedMIMETypes;
> }
>
> + @Override
> + public Class<T> getExtractorType() {
> + return extractorClass;
> + }
> +
> /**
> * @return an instance of type T concrete implementation of {@link org.deri.any23.extractor.Extractor}
> */
> + @Override
> public T createExtractor() {
> try {
> return extractorClass.newInstance();
> @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
> /**
> * @return an input example
> */
> + @Override
> public String getExampleInput() {
> return exampleInput;
> }
>
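
A short sketch of what the getExtractorType() addition in the two hunks above makes possible: a caller can select a factory by extractor class without instantiating any extractor. Everything below (ExtractorTypeSketch, its nested Extractor and Factory interfaces, SimpleFactory, CsvLike, TurtleLike, defaults() and find()) is hypothetical stand-in code written for this note, not the Any23 API. Only the idea of returning the Class<T> already held by the factory is taken from the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: every type here is a hypothetical stand-in,
// not the Any23 API.
final class ExtractorTypeSketch {

    interface Extractor {
        String name();
    }

    interface Factory<T extends Extractor> {
        Class<T> getExtractorType();
        T createExtractor();
    }

    static final class SimpleFactory<T extends Extractor> implements Factory<T> {
        private final Class<T> extractorClass;

        SimpleFactory(Class<T> extractorClass) {
            this.extractorClass = extractorClass;
        }

        @Override
        public Class<T> getExtractorType() {
            // No instance is created: the class token is simply handed back.
            return extractorClass;
        }

        @Override
        public T createExtractor() {
            try {
                // The diff uses Class.newInstance(), which is deprecated since Java 9;
                // this sketch goes through the no-argument constructor instead.
                return extractorClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + extractorClass, e);
            }
        }
    }

    static final class CsvLike implements Extractor {
        @Override public String name() { return "csv"; }
    }

    static final class TurtleLike implements Extractor {
        @Override public String name() { return "turtle"; }
    }

    /** Builds a small, made-up list of registered factories. */
    static List<Factory<?>> defaults() {
        List<Factory<?>> factories = new ArrayList<>();
        factories.add(new SimpleFactory<CsvLike>(CsvLike.class));
        factories.add(new SimpleFactory<TurtleLike>(TurtleLike.class));
        return factories;
    }

    /** Returns the first factory producing the wanted class, or null if none matches. */
    static Factory<?> find(List<Factory<?>> factories, Class<? extends Extractor> wanted) {
        for (Factory<?> factory : factories) {
            if (factory.getExtractorType().equals(wanted)) {
                return factory;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Factory<?> factory = find(defaults(), TurtleLike.class);
        System.out.println(factory.getExtractorType().getSimpleName()
                + " -> " + factory.createExtractor().name());
    }
}
```

The lookup compares class tokens only, so a registry built this way never pays the cost of constructing extractors it does not use.
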
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java Tue Jan 10 16:32:28 2012
> @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
> Arrays.asList(
> "text/csv;q=0.1"
> ),
> - null,
> + "example-csv.csv",
> CSVExtractor.class
> );
>
> @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
> }
>
> /**
> + * Check whether a number is an integer.
> + *
> + * @param number
> + * @return
> + */
> + private boolean isInteger(String number) {
> + try {
> + Integer.valueOf(number);
> + return true;
> + } catch (NumberFormatException e) {
> + return false;
> + }
> + }
> +
> + /**
> + * Check whether a number is a float.
> + *
> * @param number
> * @return
> */
> - private boolean isNumber(String number) {
> + private boolean isFloat(String number) {
> try {
> - Double.valueOf(number);
> + Float.valueOf(number);
> return true;
> } catch (NumberFormatException e) {
> return false;
> @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
> object = new URIImpl(cell);
> } else {
> URI datatype = XMLSchema.STRING;
> - if (isNumber(cell)) {
> + if (isInteger(cell)) {
> datatype = XMLSchema.INTEGER;
> + } else if(isFloat(cell)) {
> + datatype = XMLSchema.FLOAT;
> }
> object = new LiteralImpl(cell, datatype);
> }
>
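
For reference, the effect of the isInteger/isFloat cascade in the CSVExtractor hunk above can be reproduced with plain JDK calls. The sketch below is illustrative only: the class name CsvDatatypeSketch and the detect() helper are made up for this note and are not part of Any23, and the strings it returns merely stand in for XMLSchema.INTEGER, XMLSchema.FLOAT and XMLSchema.STRING.

```java
// Illustrative sketch only: CsvDatatypeSketch and detect() are hypothetical
// names, not part of Any23.
final class CsvDatatypeSketch {

    /** True if the cell parses as a 32-bit int, as Integer.valueOf requires. */
    static boolean isInteger(String cell) {
        try {
            Integer.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** True if the cell parses as a float (decimal or exponent notation). */
    static boolean isFloat(String cell) {
        try {
            Float.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /**
     * Mirrors the if / else-if order of the hunk: integer first, then float,
     * otherwise the default string datatype.
     */
    static String detect(String cell) {
        if (isInteger(cell)) {
            return "integer";
        } else if (isFloat(cell)) {
            return "float";
        }
        return "string";
    }

    public static void main(String[] args) {
        String[] cells = {"42", "-7", "3.14", "1e3", "99999999999", "abc", ""};
        for (String cell : cells) {
            System.out.println("'" + cell + "' -> " + detect(cell));
        }
    }
}
```

One thing the sketch makes visible: because the integer check relies on Integer.valueOf, a value outside the 32-bit int range such as 99999999999 fails it and falls through to the float branch, although xsd:integer itself is unbounded. Whether that matters for real CSV input is a judgment call for the project.
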
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java Tue Jan 10 16:32:28 2012
> @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
> "html-mf-adr",
> PopularPrefixes.createSubset("rdf", "vcard"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-adr.html",
> AdrExtractor.class
> );
> }
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java Tue Jan 10 16:32:28 2012
> @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
> "html-mf-geo",
> PopularPrefixes.createSubset("rdf", "vcard"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-geo.html",
> GeoExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java Tue Jan 10 16:32:28 2012
> @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
> "html-mf-hcalendar",
> PopularPrefixes.createSubset("rdf", "ical"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-hcalendar.html",
> HCalendarExtractor.class);
>
> private static final String[] Components = {"Vevent", "Vtodo", "Vjournal", "Vfreebusy"};
> @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
> private boolean extractComponent(Node node, Resource cal, String component) throws ExtractionException {
> HTMLDocument compoNode = new HTMLDocument(node);
> BNode evt = valueFactory.createBNode();
> - addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
> + addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
> addTextProps(compoNode, evt);
> addUrl(compoNode, evt);
> addRRule(compoNode, evt);
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java Tue Jan 10 16:32:28 2012
> @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
> "html-mf-hcard",
> PopularPrefixes.createSubset("rdf", "vcard"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-hcard.html",
> HCardExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java Tue Jan 10 16:32:28 2012
> @@ -82,7 +82,7 @@ public class HListingExtractor extends E
> "html-mf-hlisting",
> PopularPrefixes.createSubset("rdf", "hlisting"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-hlisting.html",
> HListingExtractor.class
> );
>
> @@ -106,7 +106,7 @@ public class HListingExtractor extends E
> out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>
> for (String action : findActions(fragment)) {
> - out.writeTriple(listing, hLISTING.action, hLISTING.getResource(action));
> + out.writeTriple(listing, hLISTING.action, hLISTING.getClass(action));
> }
> out.writeTriple(listing, hLISTING.lister, addLister() );
> addItem(listing);
> @@ -154,7 +154,7 @@ public class HListingExtractor extends E
> String value = node.getNodeValue();
> // do not use conditionallyAdd, it won't work cause of evaluation rules
> if (!(null == value || "".equals(value))) {
> - URI property = hLISTING.getPropertyCamelized(klass);
> + URI property = hLISTING.getPropertyCamelCase(klass);
> conditionallyAddLiteralProperty(
> node,
> blankItem, property, valueFactory.createLiteral(value)
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java Tue Jan 10 16:32:28 2012
> @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
> "html-mf-hrecipe",
> PopularPrefixes.createSubset("rdf", "hrecipe"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-hrecipe.html",
> HRecipeExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java Tue Jan 10 16:32:28 2012
> @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
> "html-mf-hresume",
> PopularPrefixes.createSubset("rdf", "doac", "foaf"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-hresume.html",
> HResumeExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java Tue Jan 10 16:32:28 2012
> @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
> "html-mf-hreview",
> PopularPrefixes.createSubset("rdf", "vcard", "rev"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-hreview.html",
> HReviewExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java Tue Jan 10 16:32:28 2012
> @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
> "html-head-links",
> PopularPrefixes.createSubset("xhtml", "dcterms"),
> Arrays.asList("text/html;q=0.05", "application/xhtml+xml;q=0.05"),
> - null,
> + "example-head-link.html",
> HeadLinkExtractor.class);
> }
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java Tue Jan 10 16:32:28 2012
> @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
> "html-head-icbm",
> PopularPrefixes.createSubset("geo", "rdf"),
> Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
> - null,
> + "example-icbm.html",
> ICBMExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java Tue Jan 10 16:32:28 2012
> @@ -51,7 +51,7 @@ public class LicenseExtractor implements
> "html-mf-license",
> PopularPrefixes.createSubset("xhtml"),
> Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
> - null,
> + "example-mf-license.html",
> LicenseExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java Tue Jan 10 16:32:28 2012
> @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
> "html-mf-species",
> PopularPrefixes.createSubset("rdf", "wo"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-species.html",
> SpeciesExtractor.class
> );
>
> @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>
> private URI resolveClassName(String clazz) {
> String upperCaseClass = clazz.substring(0, 1);
> - return vWO.getResource(
> + return vWO.getClass(
> String.format("%s%s",
> upperCaseClass.toUpperCase(),
> clazz.substring(1)
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java Tue Jan 10 16:32:28 2012
> @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
> NAME,
> PopularPrefixes.get(),
> Arrays.asList("text/html;q=0.02", "application/xhtml+xml;q=0.02"),
> - null,
> + "example-script-turtle.html",
> TurtleHTMLExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java Tue Jan 10 16:32:28 2012
> @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
> "html-mf-xfn",
> PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-mf-xfn.html",
> XFNExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java Tue Jan 10 16:32:28 2012
> @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
> "html-microdata",
> PopularPrefixes.createSubset("rdf", "doac", "foaf"),
> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
> - null,
> + "example-microdata.html",
> MicrodataExtractor.class
> );
>
>
> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> ==============================================================================
> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java (original)
> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java Tue Jan 10 16:32:28 2012
> @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
> import org.deri.any23.extractor.ErrorReporter;
> import org.deri.any23.extractor.ExtractionContext;
> import org.deri.any23.extractor.ExtractionResult;
> -import org.deri.any23.parser.NQuadsParser;
> +import org.deri.any23.io.nquads.NQuadsParser;
> import org.deri.any23.rdf.Any23ValueFactoryWrapper;
> import org.openrdf.model.impl.ValueFactoryImpl;
> import org.openrdf.rio.ParseErrorListener;
> @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
> import org.openrdf.rio.RDFParser;
> import org.openrdf.rio.ntriples.NTriplesParser;
> import org.openrdf.rio.rdfxml.RDFXMLParser;
> +import org.openrdf.rio.trix.TriXParser;
> import org.openrdf.rio.turtle.TurtleParser;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> @@ -38,7 +39,7 @@ import java.io.Reader;
>
> /**
> * This factory provides a common logic for creating and configuring correctly
> - * any RDF parser used within the library.
> + * any <i>RDF</i> parser used within the library.
> *
> * @author Michele Mostarda (mostarda@fbk.eu)
> */
> @@ -119,7 +120,7 @@ public class RDFParserFactory {
> }
>
> /**
> - * Returns a new instance of a configured {@link org.deri.any23.parser.NQuadsParser}.
> + * Returns a new instance of a configured {@link org.deri.any23.io.nquads.NQuadsParser}.
> *
> * @param verifyDataType data verification enable if <code>true</code>.
> * @param stopAtFirstError the parser stops at first error if <code>true</code>.
> @@ -139,6 +140,26 @@ public class RDFParserFactory {
> }
>
> /**
> + * Returns a new instance of a configured {@link TriXParser}.
> + *
> + * @param verifyDataType data verification enable if <code>true</code>.
> + * @param stopAtFirstError the parser stops at first error if <code>true</code>.
> + * @param extractionContext the extraction context where the parser is used.
> + * @param extractionResult the output extraction result.
> + * @return a new instance of a configured TriX parser.
> + */
> + public TriXParser getTriXParser(
> + final boolean verifyDataType,
> + final boolean stopAtFirstError,
> + final ExtractionContext extractionContext,
> + final ExtractionResult extractionResult
> + ) {
> + final TriXParser parser = new TriXParser();
> + configureParser(parser, verifyDataType, stopAtFirstError, extractionContext, extractionResult);
> + return parser;
> + }
> +
> + /**
> * Configures the given parser on the specified extraction result
> * setting the policies for data verification and error handling.
> *
>
>
Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./
any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/
any23-core/src/main/java/org/deri/any23/cli/
any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/or
Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks Guys, appreciate it!
Cheers,
Chris
On Jan 10, 2012, at 9:08 AM, Simone Tripodi wrote:
> Hi Mic,
> this is something great, thanks for the hard work of merging!
> next step is renaming the packages in org.apache.any23 :)
>
> All the best, have a nice day!
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/
>
>
>
> On Tue, Jan 10, 2012 at 5:32 PM, <mo...@apache.org> wrote:
>> Author: mostarda
>> Date: Tue Jan 10 16:32:28 2012
>> New Revision: 1229627
>>
>> URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
>> Log:
>> This commit synchronizes the dismissed Any23 Google Code SVN repo [1]
>> with the current Apache Any23 SVN repo, including the issues
>> developed during the initial import transition phase.
>> Such issues have been tracked on the original Any23 Google Code Issue Tracker [2].
>> Below the extract of the original repository commit log.
>>
>> This commit is related to issue ANY23-27.
>>
>> [1] http://any23.googlecode.com/svn/trunk/
>> [2] http://code.google.com/p/any23/issues/list
>>
>> ==== BEGIN: Original Log ====
>>
>> ------------------------------------------------------------------------
>> r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>>
>> Improved numeric datatype assignment. This commit fixes issue #208.
>> ------------------------------------------------------------------------
>> hardest-mac:gcode-svn hardest$ svn log -r 1548:HEAD
>> ------------------------------------------------------------------------
>> r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>>
>> Improved numeric datatype assignment. This commit fixes issue #208.
>> ------------------------------------------------------------------------
>> r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'. Fixed HTMLMetaExtractorTest.java to match this new
>> namespace. Discovered and fixed an issue in the SINDICE.java vocabulary: NS was declared as a resource instead of as a URI. Fixed
>> RDFSchemaUtilsTest.java, whose sizes were wrong due to the wrong NS declaration. This commit is related to issue #203.
>> ------------------------------------------------------------------------
>> r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found a wrong declaration of Class(Resource) in the WO.java
>> vocab. Fixed and updated RDFSchemaUtils.java test. This commit is related to issue #198.
>> ------------------------------------------------------------------------
>> r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Added utility method.
>> ------------------------------------------------------------------------
>> r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line
>>
>> Improved Vocabulary.java class: added support for comments to any resource. Improved RDFSchemaUtils.java serialization
>> support, added separators to RDFXML serialization. This commit is related to issue #198.
>> ------------------------------------------------------------------------
>> r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) | 1 line
>>
>> Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved prefix declaration parsing in RDFa11Parser, this
>> new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for prefix mapping resolution in
>> RDFa11Parser, this allows the correct support for the structured properties introduced by the latest version of the Open
>> Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest to the new output of vocabularies serialization.
>> Updated Any23PluginManagerTest to include a new class. This commit is related to issue #206.
>> ------------------------------------------------------------------------
>> r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) | 1 line
>>
>> Restricted scope of testGetClassesFromClasspath to avoid updating it every time a new class is added.
>> ------------------------------------------------------------------------
>> r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Improved validation mode support. Improved descriptions of Validation and Report fields. This commit is related to issue
>> #209.
>> ------------------------------------------------------------------------
>> r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Improved Any23 Service XML Report format documentation.
>> ------------------------------------------------------------------------
>> r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Added URL encoding to the source location path. This commit fixes issue #205. Chose not to write a formal test, which
>> would require the creation of folders with spaces.
>> ------------------------------------------------------------------------
>> r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) | 1 line
>>
>> Removed obsolete section.
>> ------------------------------------------------------------------------
>> r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) | 1 line
>>
>> Improved Any23 facade, added method createDocumentSource() to simplify the extraction setup.
>> ------------------------------------------------------------------------
>> r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) | 1 line
>>
>> Refactored Rover CLI class to make it extensible from other CLI implementations.
>> ------------------------------------------------------------------------
>> r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) | 3 lines
>>
>> Removed wrong artifact name.
>>
>>
>> ------------------------------------------------------------------------
>> r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Upload by wagon-svn
>> ------------------------------------------------------------------------
>> r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Removed no longer used jspf lib. Added crawler4j dependencies. Added README. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Changed attributes visibility to facilitate the class extensibility.
>> ------------------------------------------------------------------------
>> r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Added helper methods to extract file lines as list of strings. Improved javadoc.
>> ------------------------------------------------------------------------
>> r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Added first version of basic-crawler plugin. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Added plugins README.
>> ------------------------------------------------------------------------
>> r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Updated main README, added references to plugin and lib.
>> ------------------------------------------------------------------------
>> r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Fixed assembly name.
>> ------------------------------------------------------------------------
>> r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Fixed Tool signature. This commit is related to #211.
>> ------------------------------------------------------------------------
>> r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Improved logging.
>> ------------------------------------------------------------------------
>> r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Included plugin basic-crawler in reactor. Improved ToolRunner and Any23PluginManager tests to be compliant with the new
>> plugin classes. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) | 1 line
>>
>> Fixed Crawler4j group id. Related to issue #211.
>> ------------------------------------------------------------------------
>> r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Improved plugin documentation. Introduced Office Scraper specific page. This commit is related to issue #213.
>> ------------------------------------------------------------------------
>> r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed POST method documentation. Related to issue #213.
>> ------------------------------------------------------------------------
>> r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed code snippets, prettified, added missing finalization logic. See issue #187.
>> ------------------------------------------------------------------------
>> r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed var name. See #187.
>> ------------------------------------------------------------------------
>> r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Updated code snippets and tutorial, added explicit TripleHandler closure. This commit is related to issue #187.
>> ------------------------------------------------------------------------
>> r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Fixed data type handling management in NQuadsParser. This commit is related to issue #210.
>> ------------------------------------------------------------------------
>> r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Added missing JSON output format. See #214.
>> ------------------------------------------------------------------------
>> r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) | 1 line
>>
>> Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output format support to Rover. This commit is related to
>> issue #215.
>> ------------------------------------------------------------------------
>> r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Added Sesame TriX IO dependency. This commit is related to #215.
>> ------------------------------------------------------------------------
>> r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Some suppressed tests have been reactivated as Ignored.
>> ------------------------------------------------------------------------
>> r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Added TriX output format to the Any23 Service. Commit related to issue #215.
>> ------------------------------------------------------------------------
>> r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) | 1 line
>>
>> Improved FormatWriter management, added WriterRegistry. Improved Writer format management in Rover and WebResponder.
>> This commit is related to issues #215 and #216.
>> ------------------------------------------------------------------------
>> r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) | 6 lines
>>
>> Added TriXExtractor and textual example (example-trix.trx), added trix support in RDFParserFactory.
>> Registered TriXExtractor to the ExtractorRegistry.
>> Added TriX mimetype support in TikaMIMETypeDetector (through mimetypes.xml) and added specific test.
>> Added support and doc to TriX format in Any23 Service web page (form.html).
>> This commit is related to issue #215.
>>
>> ------------------------------------------------------------------------
>> r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) | 1 line
>>
>> Fixed number of extractors (+1 after adding TriXExtractor). Commit related to issue #215.
>> ------------------------------------------------------------------------
>> r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Added method getExtractorType() .
>> ------------------------------------------------------------------------
>> r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) | 4 lines
>>
>> Improved ExtractorDocumentation support, added missing format examples.
>> Improved output layout. This commit is related to issue #194.
>>
>>
>> ------------------------------------------------------------------------
>> r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Improved classpath management in Any23PluginManager. Renamed getClasses* to loadClasses*. This commit is related to
>> issue #212.
>> ------------------------------------------------------------------------
>> r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Separated log messages from specific output data.
>> ------------------------------------------------------------------------
>> r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Added human readable report printing support in ReportingTripleHandler and Rover.
>> ------------------------------------------------------------------------
>> r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Fixed major issue in output generation, added final activity report, help prettification. This commit is related to
>> issue #211.
>> ------------------------------------------------------------------------
>> r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Upgraded to Sesame 2.6.1. See issue #217.
>> ------------------------------------------------------------------------
>> r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils. See issue #216.
>> ------------------------------------------------------------------------
>> r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) | 1 line
>>
>> Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
>> ------------------------------------------------------------------------
>> r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) | 1 line
>>
>> Added specific Crawler CLI documentation. Updated general CLI documentation. This commit is related to issue #211.
>> ------------------------------------------------------------------------
>> r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) | 4 lines
>>
>> The Eval CLI Tool has been removed as well as the org.deri.any23.eval package classes related to it.
>> Updated tests verifying CLI tool detection.
>> This commit is related to issue #218.
>>
>> ------------------------------------------------------------------------
>> r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) | 5 lines
>>
>> Added MimeDetector CLI Tool and test case, removed main() from
>> TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
>> Updated CLI doc.
>> This commit is related to issue #219.
>>
>> ------------------------------------------------------------------------
>> r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) | 1 line
>>
>> Added support for comment serialization. Related to issue #158.
>> ------------------------------------------------------------------------
>> r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) | 1 line
>>
>> Add support for annotation writing in FormatWriter implementations. This commit is related to issue #158.
>> ------------------------------------------------------------------------
>> r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) | 1 line
>>
>> Added support for 'annotate' flag in Any23 Service.
>> ------------------------------------------------------------------------
>>
>> ==== END : Original Log ====
>>
>>
>> Added:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
>> incubator/any23/trunk/any23-core/src/test/resources/application/trix/
>> incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
>> incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
>> incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
>> incubator/any23/trunk/lib/README.txt
>> incubator/any23/trunk/plugins/README.txt
>> incubator/any23/trunk/plugins/basic-crawler/
>> incubator/any23/trunk/plugins/basic-crawler/pom.xml
>> incubator/any23/trunk/plugins/basic-crawler/src/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
>> incubator/any23/trunk/plugins/basic-crawler/src/test/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
>> incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
>> Removed:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
>> Modified:
>> incubator/any23/trunk/README.txt
>> incubator/any23/trunk/any23-core/bin/any23
>> incubator/any23/trunk/any23-core/bin/any23tools
>> incubator/any23/trunk/any23-core/pom.xml
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
>> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
>> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
>> incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
>> incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
>> incubator/any23/trunk/lib/install-deps.sh
>> incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
>> incubator/any23/trunk/pom.xml
>> incubator/any23/trunk/src/site/apt/any23-plugins.apt
>> incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
>> incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
>> incubator/any23/trunk/src/site/apt/getting-started.apt
>> incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
>> incubator/any23/trunk/src/site/apt/service.apt
>> incubator/any23/trunk/src/site/apt/supported-formats.apt
>>
>> Modified: incubator/any23/trunk/README.txt
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/README.txt (original)
>> +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
>> @@ -20,7 +20,8 @@ Distribution Content
>>
>> any23-core The library core codebase.
>> any23-service The library HTTP service codebase.
>> -plugins Library plugins codebase.
>> +lib Contains the Any23 the external deps (read lib/README.txt for further details).
>> +plugins Library plugins codebase (read plugins/README.txt for further details).
>> RELEASE-NOTES.txt File reporting main release notes for every version.
>> LICENSE.txt Applicable project license.
>> README.txt This file.
>> @@ -240,15 +241,14 @@ Upload the produced packages in download
>>
>> http://code.google.com/p/any23/downloads/list
>>
>> +--------------------
>> +Manage External Deps
>> +--------------------
>>
>> -Fix Release Procedure
>> ----------------------
>> -
>> - Currently the *plugins/integration-test* module is excluded from the parent
>> - reactor.
>> - To fix it in tag follow procedure as described at issue #171:
>> -
>> - http://code.google.com/p/any23/issues/detail?id=171
>> +::Developers interest only.::
>>
>> +External Deps are libraries used by some Any23 modules which are
>> +not available in public Maven repositories. Such libraries are
>> +managed within the 'lib' dir.
>>
>> EOF
>>
>> Modified: incubator/any23/trunk/any23-core/bin/any23
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/bin/any23 (original)
>> +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
>> @@ -9,12 +9,12 @@
>> ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>>
>> if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> - echo "Generating executable JAR..."
>> - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> + echo "Generating executable JAR..." >&2
>> + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> ||\
>> - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> ||\
>> - { echo "Error while generating commandline assembly."; exit 1; }
>> + { echo "Error while generating commandline assembly." >&2; exit 1; }
>> fi
>>
>> SEP=':'
>>
>> Modified: incubator/any23/trunk/any23-core/bin/any23tools
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/bin/any23tools (original)
>> +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
>> @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
>> PLUGINS_DIR=plugins
>>
>> if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> - echo "Generating executable JAR..."
>> - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> + echo "Generating executable JAR..." >&2
>> + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> ||\
>> - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> ||\
>> - { echo "Error while generating commandline assembly."; exit 1; }
>> + { echo "Error while generating commandline assembly." >&2; exit 1; }
>> fi
>>
>> SEP=':'
>> @@ -30,6 +30,7 @@ done
>> # Plugins classpath.
>> for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
>> do
>> + echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
>> if [ ! -e "$jar" ]; then continue; fi
>> CP="$CP$SEP$jar"
>> done
>>
>> Modified: incubator/any23/trunk/any23-core/pom.xml
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/pom.xml (original)
>> +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
>> @@ -92,6 +92,10 @@
>> </dependency>
>> <dependency>
>> <groupId>org.openrdf.sesame</groupId>
>> + <artifactId>sesame-rio-trix</artifactId>
>> + </dependency>
>> + <dependency>
>> + <groupId>org.openrdf.sesame</groupId>
>> <artifactId>sesame-repository-sail</artifactId>
>> </dependency>
>> <dependency>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
>> @@ -258,6 +258,28 @@ public class Any23 {
>> }
>>
>> /**
>> + * Returns the most appropriate {@link DocumentSource} for the given<code>documentURI</code>.
>> + *
>> + * @param documentURI the document <i>URI</i>.
>> + * @return a new instance of DocumentSource.
>> + * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
>> + * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
>> + */
>> + public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
>> + if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
>> + if (documentURI.toLowerCase().startsWith("file:")) {
>> + return new FileDocumentSource( new File(new URI(documentURI)) );
>> + }
>> + if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> + return new HTTPDocumentSource(getHTTPClient(), documentURI);
>> + }
>> + throw new IllegalArgumentException(
>> + String.format("Unsupported protocol for document URI: '%s' .", documentURI)
>> + );
>> + }
>> +
>> +
>> + /**
>> * Performs metadata extraction from the content of the given
>> * <code>in</code> document source, sending the generated events
>> * to the specified <code>outputHandler</code>.
>> @@ -363,13 +385,7 @@ public class Any23 {
>> public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
>> throws IOException, ExtractionException {
>> try {
>> - if (documentURI.toLowerCase().startsWith("file:")) {
>> - return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
>> - }
>> - if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> - return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
>> - }
>> - throw new ExtractionException("Not a valid absolute URI: " + documentURI);
>> + return extract(eps, createDocumentSource(documentURI), outputHandler);
>> } catch (URISyntaxException ex) {
>> throw new ExtractionException("Error while extracting data from document URI.", ex);
>> }
>>
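The scheme dispatch that the hunk above moves into createDocumentSource() can be sketched on its own. This is only an illustration, not Any23 code: the class SchemeDispatchSketch, the Kind enum and the kindOf() method are hypothetical names that stand in for the real DocumentSource classes so the snippet compiles without the library. It also lowercases with Locale.ROOT, which the quoted code does not do.

```java
import java.util.Locale;

/** Hypothetical stand-in for the scheme check in createDocumentSource(); not Any23 code. */
public class SchemeDispatchSketch {

    /** The two kinds of source the quoted method distinguishes. */
    public enum Kind { FILE, HTTP }

    /**
     * Mirrors the quoted control flow: null check first, then a
     * case-insensitive prefix test, then IllegalArgumentException.
     */
    public static Kind kindOf(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // Assumption, not in the quoted code: Locale.ROOT avoids
        // locale-specific lowercasing (e.g. "FILE:" under a Turkish locale).
        final String lower = documentURI.toLowerCase(Locale.ROOT);
        if (lower.startsWith("file:")) {
            return Kind.FILE;
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return Kind.HTTP;
        }
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s' .", documentURI));
    }

    public static void main(String[] args) {
        System.out.println(kindOf("file:///tmp/page.html"));
        System.out.println(kindOf("HTTPS://example.org/"));
    }
}
```

One thing the sketch makes visible: as far as the quoted hunk shows, extract() only catches URISyntaxException, so an unsupported scheme would now surface as an unchecked IllegalArgumentException where the removed lines threw ExtractionException.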
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
>> @@ -16,7 +16,7 @@
>>
>> package org.deri.any23.cli;
>>
>> -import org.deri.any23.LogUtil;
>> +import org.deri.any23.util.LogUtils;
>> import org.deri.any23.extractor.ExampleInputOutput;
>> import org.deri.any23.extractor.ExtractionException;
>> import org.deri.any23.extractor.Extractor;
>> @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
>> }
>>
>> public int run(String[] args) {
>> - LogUtil.setDefaultLogging();
>> + LogUtils.setDefaultLogging();
>> try {
>> if (args.length == 0) {
>> printUsage();
>> @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
>> * Prints the list of all the available extractors.
>> */
>> public void printExtractorList() {
>> - for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
>> - System.out.println(extractorName);
>> + for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
>> + System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
>> }
>> }
>>
>> @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
>> ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
>> ExampleInputOutput example = new ExampleInputOutput(factory);
>> System.out.println("Extractor: " + extractorName);
>> - System.out.println(" type: " + getType(factory));
>> - String output = example.getExampleOutput();
>> - if (output == null) {
>> - System.out.println("(no example output)");
>> + System.out.println("\ttype: " + getType(factory));
>> + System.out.println();
>> + final String exampleInput = example.getExampleInput();
>> + if(exampleInput == null) {
>> + System.out.println("(No Example Available)");
>> } else {
>> - System.out.println("-------- example output --------");
>> - System.out.println(output);
>> + System.out.println("-------- Example Input --------");
>> + System.out.println(exampleInput);
>> + System.out.println("-------- Example Output --------");
>> + String output = example.getExampleOutput();
>> + System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
>> }
>> - System.out.println();
>> System.out.println("================================");
>> + System.out.println();
>> }
>> }
>>
>>
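For readers wondering what the new printExtractorList() layout looks like, here is a small sketch of the same format string in isolation. The class and method names (ExtractorListFormatSketch, formatRow) and the sample values are made up for illustration; only the "%25s [%15s]" pattern comes from the diff.

```java
/** Hypothetical demo of the "%25s [%15s]" layout; not Any23 code. */
public class ExtractorListFormatSketch {

    /**
     * Same format string as the quoted printExtractorList(): both columns
     * are right-aligned and padded to a minimum width of 25 and 15 characters.
     */
    public static String formatRow(String name, String type) {
        return String.format("%25s [%15s]", name, type);
    }

    public static void main(String[] args) {
        // Sample values below are invented, not real extractor names or types.
        System.out.println(formatRow("sample-extractor", "SampleType"));
        System.out.println(formatRow("csv", "type"));
    }
}
```

The widths are minimums, so a name longer than 25 characters is printed in full and simply pushes the bracketed column to the right.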
>> Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
>> @@ -0,0 +1,113 @@
>> +/*
>> + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
>> + *
>> + * Licensed under the Apache License, Version 2.0 (the "License");
>> + * you may not use this file except in compliance with the License.
>> + * You may obtain a copy of the License at
>> + *
>> + * http://www.apache.org/licenses/LICENSE-2.0
>> + *
>> + * Unless required by applicable law or agreed to in writing, software
>> + * distributed under the License is distributed on an "AS IS" BASIS,
>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> + * See the License for the specific language governing permissions and
>> + * limitations under the License.
>> + */
>> +
>> +package org.deri.any23.cli;
>> +
>> +import org.deri.any23.configuration.DefaultConfiguration;
>> +import org.deri.any23.http.DefaultHTTPClient;
>> +import org.deri.any23.http.HTTPClient;
>> +import org.deri.any23.http.HTTPClientConfiguration;
>> +import org.deri.any23.mime.MIMEType;
>> +import org.deri.any23.mime.MIMETypeDetector;
>> +import org.deri.any23.mime.TikaMIMETypeDetector;
>> +import org.deri.any23.source.DocumentSource;
>> +import org.deri.any23.source.FileDocumentSource;
>> +import org.deri.any23.source.HTTPDocumentSource;
>> +import org.deri.any23.source.StringDocumentSource;
>> +
>> +import java.io.File;
>> +import java.net.URISyntaxException;
>> +
>> +/**
>> + * Commandline tool to detect <b>MIME Type</b>s from
>> + * file, HTTP and direct input sources.
>> + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
>> + *
>> + * @author Michele Mostarda (mostarda@fbk.eu)
>> + */
>> +@ToolRunner.Description("MIME Type Detector Tool.")
>> +public class MimeDetector implements Tool{
>> +
>> + public static final String FILE_DOCUMENT_PREFIX = "file://";
>> + public static final String INLINE_DOCUMENT_PREFIX = "inline://";
>> + public static final String URL_DOCUMENT_RE = "^https?://.*";
>> +
>> + public static void main(String[] args) {
>> + System.exit( new MimeDetector().run(args) );
>> + }
>> +
>> + @Override
>> + public int run(String[] args) {
>> + if(args.length != 1) {
>> + System.err.println("USAGE: {http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
>> + return 1;
>> + }
>> +
>> + final String document = args[0];
>> + try {
>> + final DocumentSource documentSource = createDocumentSource(document);
>> + final MIMETypeDetector detector = new TikaMIMETypeDetector();
>> + final MIMEType mimeType = detector.guessMIMEType(
>> + documentSource.getDocumentURI(),
>> + documentSource.openInputStream(),
>> + MIMEType.parse(documentSource.getContentType())
>> + );
>> + System.out.println(mimeType);
>> + return 0;
>> + } catch (Exception e) {
>> + System.err.print("Error while detecting MIME Type.");
>> + e.printStackTrace(System.err);
>> + return 1;
>> + }
>> + }
>> +
>> + private DocumentSource createDocumentSource(String document) throws URISyntaxException {
>> + if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
>> + return new FileDocumentSource(
>> + new File(
>> + document.substring(FILE_DOCUMENT_PREFIX.length())
>> + )
>> + );
>> + }
>> + if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
>> + return new StringDocumentSource(
>> + document.substring(INLINE_DOCUMENT_PREFIX.length()),
>> + ""
>> + );
>> + }
>> + if(document.matches(URL_DOCUMENT_RE)) {
>> + final HTTPClient client = new DefaultHTTPClient();
>> + // TODO: anonymous config class also used in Any23. centralize.
>> + client.init(new HTTPClientConfiguration() {
>> + public String getUserAgent() {
>> + return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
>> + }
>> + public String getAcceptHeader() {
>> + return "";
>> + }
>> + public int getDefaultTimeout() {
>> + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
>> + }
>> + public int getMaxConnections() {
>> + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
>> + }
>> + });
>> + return new HTTPDocumentSource(client, document);
>> + }
>> + throw new IllegalArgumentException("Unsupported protocol for document " + document);
>> + }
>> +
>> +}
>>
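For readers skimming the diff, the argument dispatch in createDocumentSource() can be sketched without any Any23 dependency. The three constants are copied from the patch; the Kind enum and the classify/payload helpers are hypothetical names used only here, standing in for the FileDocumentSource, StringDocumentSource and HTTPDocumentSource branches.

```java
class DocumentPrefixSketch {

    // Constants copied from MimeDetector in the patch.
    static final String FILE_DOCUMENT_PREFIX   = "file://";
    static final String INLINE_DOCUMENT_PREFIX = "inline://";
    static final String URL_DOCUMENT_RE        = "^https?://.*";

    // Hypothetical stand-in for the three DocumentSource implementations.
    enum Kind { FILE, INLINE, HTTP }

    // Mirrors the order of the checks in createDocumentSource().
    static Kind classify(String document) {
        if (document.startsWith(FILE_DOCUMENT_PREFIX))   return Kind.FILE;
        if (document.startsWith(INLINE_DOCUMENT_PREFIX)) return Kind.INLINE;
        if (document.matches(URL_DOCUMENT_RE))           return Kind.HTTP;
        throw new IllegalArgumentException("Unsupported protocol for document " + document);
    }

    // What each branch hands on: the path or content after the prefix,
    // or the whole URL for HTTP sources.
    static String payload(String document) {
        switch (classify(document)) {
            case FILE:   return document.substring(FILE_DOCUMENT_PREFIX.length());
            case INLINE: return document.substring(INLINE_DOCUMENT_PREFIX.length());
            default:     return document;
        }
    }

    public static void main(String[] args) {
        for (String doc : new String[] {
                "file:///path/to/local.file", "inline://<html/>", "http://path/to/resource.html"}) {
            System.out.println(classify(doc) + " -> " + payload(doc));
        }
    }
}
```

Since run() accepts exactly one argument, inline content containing spaces would have to reach the tool as a single shell-quoted argument.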
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
>> @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
>> import org.apache.commons.cli.Options;
>> import org.apache.commons.cli.PosixParser;
>> import org.deri.any23.Any23;
>> -import org.deri.any23.LogUtil;
>> +import org.deri.any23.util.LogUtils;
>> import org.deri.any23.configuration.Configuration;
>> import org.deri.any23.configuration.DefaultConfiguration;
>> import org.deri.any23.extractor.ExtractionException;
>> @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
>> import org.deri.any23.extractor.SingleDocumentExtraction;
>> import org.deri.any23.filter.IgnoreAccidentalRDFa;
>> import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
>> +import org.deri.any23.source.DocumentSource;
>> import org.deri.any23.writer.BenchmarkTripleHandler;
>> import org.deri.any23.writer.LoggingTripleHandler;
>> -import org.deri.any23.writer.NQuadsWriter;
>> -import org.deri.any23.writer.NTriplesWriter;
>> -import org.deri.any23.writer.RDFXMLWriter;
>> import org.deri.any23.writer.ReportingTripleHandler;
>> import org.deri.any23.writer.TripleHandler;
>> import org.deri.any23.writer.TripleHandlerException;
>> -import org.deri.any23.writer.TurtleWriter;
>> -import org.deri.any23.writer.URIListWriter;
>> +import org.deri.any23.writer.WriterRegistry;
>> import org.slf4j.Logger;
>> import org.slf4j.LoggerFactory;
>>
>> @@ -51,6 +48,7 @@ import java.io.OutputStream;
>> import java.io.PrintStream;
>> import java.io.PrintWriter;
>> import java.net.MalformedURLException;
>> +import java.net.URISyntaxException;
>> import java.net.URL;
>>
>> import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
>> @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
>> * A default rover implementation. Goes and fetches a URL using an hint
>> * as to what format should require, then tries to convert it to RDF.
>> *
>> - * @author Gabriele Renzi
>> - * @author Richard Cyganiak (richard@cyganiak.de)
>> * @author Michele Mostarda (mostarda@fbk.eu)
>> + * @author Richard Cyganiak (richard@cyganiak.de)
>> + * @author Gabriele Renzi
>> */
>> @ToolRunner.Description("Any23 Command Line Tool.")
>> public class Rover implements Tool {
>>
>> - // Supported formats.
>> - private static final String TURTLE_FORMAT = "turtle";
>> - private static final String NTRIPLE_FORMAT = "ntriples";
>> - private static final String RDFXML_FORMAT = "rdfxml";
>> - private static final String NQUADS_FORMAT = "nquads";
>> - private static final String URIS_FORMAT = "uris";
>> -
>> - private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
>> + private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
>> + private static final int DEFAULT_FORMAT_INDEX = 0;
>>
>> private static final Logger logger = LoggerFactory.getLogger(Rover.class);
>>
>> - private static Options options;
>> + private Options options;
>>
>> - public static void main(String[] args) {
>> - System.exit( new Rover().run(args) );
>> - }
>> + private CommandLine commandLine;
>>
>> - public int run(String[] args) {
>> - final CommandLineParser parser = new PosixParser();
>> - final CommandLine commandLine;
>> + private boolean verbose = false;
>>
>> - boolean verbose = false;
>> - try {
>> - options = createOptions();
>> - commandLine = parser.parse(options, args);
>> + private PrintStream outputStream;
>> + private TripleHandler tripleHandler;
>> + private ReportingTripleHandler reportingTripleHandler;
>> + private BenchmarkTripleHandler benchmarkTripleHandler;
>>
>> - if (commandLine.hasOption("h")) {
>> - printHelp();
>> - return 0;
>> - }
>> + private ExtractionParameters eps;
>> + private Any23 any23;
>>
>> - if (commandLine.hasOption('v')) {
>> - verbose = true;
>> - LogUtil.setVerboseLogging();
>> - } else {
>> - LogUtil.setDefaultLogging();
>> - }
>> -
>> - if (commandLine.getArgs().length < 1) {
>> - printHelp();
>> - throw new IllegalArgumentException("Expected at least 1 argument.");
>> - }
>> + protected boolean isVerbose() {
>> + return verbose;
>> + }
>>
>> - final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
>> - final String[] extractorNames = getExtractors(commandLine);
>> + public static void main(String[] args) {
>> + System.exit( new Rover().run(args) );
>> + }
>>
>> - PrintStream outputStream = null;
>> - TripleHandler tripleHandler = null;
>> - try {
>> - outputStream = getOutputStream(commandLine);
>> + public int run(String[] args) {
>> + try {
>> + final String[] uris = configure(args);
>> + performExtraction(uris);
>> + return 0;
>> + } catch (Exception e) {
>> + System.err.println( e.getMessage() );
>> + final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
>> + if(verbose) e.printStackTrace(System.err);
>> + return exitCode;
>> + }
>> + }
>>
>> - tripleHandler = getTripleHandler(commandLine, outputStream);
>> + protected CommandLine getCommandLine() {
>> + if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
>> + return commandLine;
>> + }
>>
>> - tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
>> + protected String[] configure(String[] args) throws Exception {
>> + final CommandLineParser parser = new PosixParser();
>> + options = createOptions();
>> + commandLine = parser.parse(options, args);
>>
>> - tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>> - final BenchmarkTripleHandler benchmarkTripleHandler =
>> - tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>> + if (commandLine.hasOption("h")) {
>> + printHelp();
>> + throw new ExitCodeException(0);
>> + }
>>
>> - tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> + if (commandLine.hasOption('v')) {
>> + verbose = true;
>> + LogUtils.setVerboseLogging();
>> + } else {
>> + LogUtils.setDefaultLogging();
>> + }
>>
>> - final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
>> + if (commandLine.getArgs().length < 1) {
>> + printHelp();
>> + throw new IllegalArgumentException("Expected at least 1 argument.");
>> + }
>>
>> - final ExtractionParameters eps = getExtractionParameters(commandLine);
>> + final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
>> + final String[] extractorNames = getExtractors(commandLine);
>>
>> - final Any23 any23 = createAny23(extractorNames);
>> + try {
>> + outputStream = getOutputStream(commandLine);
>> + tripleHandler = getTripleHandler(commandLine, outputStream);
>> + tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
>> + tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
>>
>> - final long start = System.currentTimeMillis();
>> - for(String inputURI : inputURIs) {
>> - performExtraction(any23, eps, inputURI, reportingTripleHandler);
>> - }
>> - final long elapsed = System.currentTimeMillis() - start;
>> + benchmarkTripleHandler =
>> + tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
>>
>> - closeAll(tripleHandler, outputStream);
>> + tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>>
>> - if (benchmarkTripleHandler != null) {
>> - System.err.println( benchmarkTripleHandler.report() );
>> - }
>> + reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
>> + eps = getExtractionParameters(commandLine);
>> + any23 = createAny23(extractorNames);
>>
>> - logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
>> - logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> - } finally {
>> - closeAll(tripleHandler, outputStream);
>> - }
>> + return inputURIs;
>> } catch (Exception e) {
>> - System.err.println(e.getMessage());
>> - final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
>> - if(verbose) e.printStackTrace(System.err);
>> - return exitCode;
>> + closeStreams();
>> + throw e;
>> }
>> - return 0;
>> }
>>
>> - private Options createOptions() {
>> + protected Options createOptions() {
>> final Options options = new Options();
>> options.addOption(
>> new Option("v", "verbose", false, "Show debug and progress information.")
>> @@ -178,13 +175,7 @@ public class Rover implements Tool {
>> "f",
>> "Output format",
>> true,
>> - "[" +
>> - TURTLE_FORMAT + " (default), " +
>> - NTRIPLE_FORMAT + ", " +
>> - RDFXML_FORMAT + ", " +
>> - NQUADS_FORMAT + ", " +
>> - URIS_FORMAT +
>> - "]"
>> + "[" + printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
>> )
>> );
>> options.addOption(
>> @@ -208,11 +199,51 @@ public class Rover implements Tool {
>> return options;
>> }
>>
>> + protected void performExtraction(DocumentSource documentSource) {
>> + performExtraction(any23, eps, documentSource, reportingTripleHandler);
>> + }
>> +
>> + protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
>> + try {
>> + final long start = System.currentTimeMillis();
>> + for (String inputURI : inputURIs) {
>> + performExtraction( any23.createDocumentSource(inputURI) );
>> + }
>> + final long elapsed = System.currentTimeMillis() - start;
>> +
>> + if (benchmarkTripleHandler != null) {
>> + System.err.println(benchmarkTripleHandler.report());
>> + }
>> +
>> + logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
>> + logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> + } finally {
>> + closeStreams();
>> + }
>> + }
>> +
>> + protected String printReports() {
>> + final StringBuilder sb = new StringBuilder();
>> + if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
>> + if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
>> + return sb.toString();
>> + }
>> +
>> private void printHelp() {
>> HelpFormatter formatter = new HelpFormatter();
>> formatter.printHelp("[{<url>|<file>}]+", options, true);
>> }
>>
>> + private String printFormats(String[] formats, int defaultIndex) {
>> + final StringBuilder sb = new StringBuilder();
>> + for (int i = 0; i < formats.length; i++) {
>> + sb.append(formats[i]);
>> + if(i == defaultIndex) sb.append(" (default)");
>> + if(i < formats.length - 1) sb.append(", ");
>> + }
>> + return sb.toString();
>> + }
>> +
>> private String argumentToURI(String uri) {
>> uri = uri.trim();
>> if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
>> @@ -268,27 +299,17 @@ public class Rover implements Tool {
>>
>> private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
>> final String FORMAT_OPTION = "f";
>> - String format = DEFAULT_FORMAT;
>> + String format = FORMATS[DEFAULT_FORMAT_INDEX];
>> if (cl.hasOption(FORMAT_OPTION)) {
>> - format = cl.getOptionValue(FORMAT_OPTION);
>> + format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
>> }
>> - final TripleHandler outputHandler;
>> - if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
>> - outputHandler = new TurtleWriter(os);
>> - } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
>> - outputHandler = new NTriplesWriter(os);
>> - } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
>> - outputHandler = new RDFXMLWriter(os);
>> - } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
>> - outputHandler = new NQuadsWriter(os);
>> - } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
>> - outputHandler = new URIListWriter(os);
>> - } else {
>> + try {
>> + return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
>> + } catch (Exception e) {
>> throw new IllegalArgumentException(
>> String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
>> );
>> }
>> - return outputHandler;
>> }
>>
>> private TripleHandler decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
>> @@ -346,44 +367,54 @@ public class Rover implements Tool {
>> return any23;
>> }
>>
>> - private void performExtraction(Any23 any23, ExtractionParameters eps, String documentURI, TripleHandler th) {
>> + private void performExtraction(
>> + Any23 any23, ExtractionParameters eps, DocumentSource documentSource, TripleHandler th
>> + ) {
>> try {
>> - if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
>> - throw new SpecificExitException("No suitable extractors found.", 2);
>> + if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
>> + throw new ExitCodeException("No suitable extractors found.", 2);
>> }
>> } catch (ExtractionException ex) {
>> - throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
>> + throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
>> } catch (IOException ex) {
>> - throw new SpecificExitException("Exception while producing output.", ex, 4);
>> + throw new ExitCodeException("Exception while producing output.", ex, 4);
>> }
>> }
>>
>> - private void closeHandler(TripleHandler th) {
>> - if(th == null) return;
>> + private void closeHandler() {
>> + if(tripleHandler == null) return;
>> try {
>> - th.close();
>> + tripleHandler.close();
>> } catch (TripleHandlerException the) {
>> - throw new SpecificExitException("Error while closing TripleHandler", the, 5);
>> + throw new ExitCodeException("Error while closing TripleHandler", the, 5);
>> }
>> }
>>
>> - private void closeAll(TripleHandler th, PrintStream os) {
>> - closeHandler(th);
>> - if(os != null) os.close();
>> + private void closeStreams() {
>> + closeHandler();
>> + if(outputStream != null) outputStream.close();
>> }
>>
>> - private class SpecificExitException extends RuntimeException {
>> + protected class ExitCodeException extends RuntimeException {
>>
>> private final int exitCode;
>>
>> - public SpecificExitException(String message, Throwable cause, int exitCode) {
>> + public ExitCodeException(String message, Throwable cause, int exitCode) {
>> super(message, cause);
>> this.exitCode = exitCode;
>> }
>> - public SpecificExitException(String message, int exitCode) {
>> + public ExitCodeException(String message, int exitCode) {
>> super(message);
>> this.exitCode = exitCode;
>> }
>> + public ExitCodeException(int exitCode) {
>> + super();
>> + this.exitCode = exitCode;
>> + }
>> +
>> + protected int getExitCode() {
>> + return exitCode;
>> + }
>> }
>>
>> }
>>
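The refactored Rover.run() turns any exception into a process exit code, with ExitCodeException carrying a specific value and everything else mapping to 1. A minimal sketch of that pattern, assuming Java 8 lambdas and a hypothetical Task interface in place of configure() plus performExtraction(); it is not the Rover code itself.

```java
class ExitCodeSketch {

    // Reduced version of the exception introduced by the patch.
    static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        ExitCodeException(int exitCode) {
            super(); // no detail message: getMessage() returns null
            this.exitCode = exitCode;
        }

        int getExitCode() {
            return exitCode;
        }
    }

    // Hypothetical stand-in for the work done inside run().
    interface Task {
        void execute() throws Exception;
    }

    // 0 on success, the carried code for ExitCodeException, 1 otherwise.
    static int run(Task task) {
        try {
            task.execute();
            return 0;
        } catch (Exception e) {
            return e instanceof ExitCodeException ? ((ExitCodeException) e).getExitCode() : 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { }));
        System.out.println(run(() -> { throw new ExitCodeException("No suitable extractors found.", 2); }));
        System.out.println(run(() -> { throw new java.io.IOException("disk full"); }));
    }
}
```

One detail worth keeping in mind with this design: the message-less constructor used for the help option leaves getMessage() as null, so a caller that prints the message unconditionally will print the text "null".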
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java Tue Jan 10 16:32:28 2012
>> @@ -29,6 +29,13 @@ import java.util.Collection;
>> public interface ExtractorFactory<T extends Extractor<?>> extends ExtractorDescription {
>>
>> /**
>> + * Returns the extractor type.
>> + *
>> + * @return the not <code>null</code> extractor class.
>> + */
>> + Class<T> getExtractorType();
>> +
>> + /**
>> * Creates an extractor instance.
>> *
>> * @return an instance of the extractor associated to this factory.
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java Tue Jan 10 16:32:28 2012
>> @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
>> import org.deri.any23.extractor.rdf.NQuadsExtractor;
>> import org.deri.any23.extractor.rdf.NTriplesExtractor;
>> import org.deri.any23.extractor.rdf.RDFXMLExtractor;
>> +import org.deri.any23.extractor.rdf.TriXExtractor;
>> import org.deri.any23.extractor.rdf.TurtleExtractor;
>> import org.deri.any23.extractor.rdfa.RDFa11Extractor;
>> import org.deri.any23.extractor.rdfa.RDFaExtractor;
>> @@ -79,6 +80,7 @@ public class ExtractorRegistry {
>> instance.register(TurtleExtractor.factory);
>> instance.register(NTriplesExtractor.factory);
>> instance.register(NQuadsExtractor.factory);
>> + instance.register(TriXExtractor.factory);
>> if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
>> instance.register(RDFa11Extractor.factory);
>> } else {
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java Tue Jan 10 16:32:28 2012
>> @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
>> return supportedMIMETypes;
>> }
>>
>> + @Override
>> + public Class<T> getExtractorType() {
>> + return extractorClass;
>> + }
>> +
>> /**
>> * @return an instance of type T concrete implementation of {@link org.deri.any23.extractor.Extractor}
>> */
>> + @Override
>> public T createExtractor() {
>> try {
>> return extractorClass.newInstance();
>> @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
>> /**
>> * @return an input example
>> */
>> + @Override
>> public String getExampleInput() {
>> return exampleInput;
>> }
>>
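The getExtractorType() addition amounts to a factory exposing the Class token it already holds. A generic sketch of that idea follows; Factory, SimpleFactory, getType and create are illustrative names, not the Any23 interfaces. The sketch uses getDeclaredConstructor().newInstance() rather than the Class.newInstance() call seen in the patch, which is a substitution on my side.

```java
class FactorySketch {

    // Hypothetical, reduced counterpart of ExtractorFactory<T>.
    interface Factory<T> {
        Class<T> getType();
        T create();
    }

    // Keeps the Class token so it can both report the type and instantiate it.
    static class SimpleFactory<T> implements Factory<T> {
        private final Class<T> type;

        SimpleFactory(Class<T> type) {
            this.type = type;
        }

        @Override
        public Class<T> getType() {
            return type;
        }

        @Override
        public T create() {
            try {
                // Requires an accessible no-argument constructor.
                return type.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        Factory<StringBuilder> factory = new SimpleFactory<>(StringBuilder.class);
        System.out.println(factory.getType().getName());
        System.out.println(factory.create().append("created").toString());
    }
}
```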
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
>> Arrays.asList(
>> "text/csv;q=0.1"
>> ),
>> - null,
>> + "example-csv.csv",
>> CSVExtractor.class
>> );
>>
>> @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
>> }
>>
>> /**
>> + * Check whether a number is an integer.
>> + *
>> + * @param number
>> + * @return
>> + */
>> + private boolean isInteger(String number) {
>> + try {
>> + Integer.valueOf(number);
>> + return true;
>> + } catch (NumberFormatException e) {
>> + return false;
>> + }
>> + }
>> +
>> + /**
>> + * Check whether a number is a float.
>> + *
>> * @param number
>> * @return
>> */
>> - private boolean isNumber(String number) {
>> + private boolean isFloat(String number) {
>> try {
>> - Double.valueOf(number);
>> + Float.valueOf(number);
>> return true;
>> } catch (NumberFormatException e) {
>> return false;
>> @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
>> object = new URIImpl(cell);
>> } else {
>> URI datatype = XMLSchema.STRING;
>> - if (isNumber(cell)) {
>> + if (isInteger(cell)) {
>> datatype = XMLSchema.INTEGER;
>> + } else if(isFloat(cell)) {
>> + datatype = XMLSchema.FLOAT;
>> }
>> object = new LiteralImpl(cell, datatype);
>> }
>>
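The CSVExtractor change replaces a single isNumber() check, which tagged every numeric cell as an integer, with an integer-then-float cascade. A standalone sketch of that cascade, with plain strings standing in for the XMLSchema.INTEGER, XMLSchema.FLOAT and XMLSchema.STRING constants (the datatypeOf helper and the "xsd:..." labels are assumptions for illustration):

```java
class CsvDatatypeSketch {

    // Same check as the patched isInteger().
    static boolean isInteger(String number) {
        try {
            Integer.valueOf(number);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Same check as the patched isFloat().
    static boolean isFloat(String number) {
        try {
            Float.valueOf(number);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Integer is tried first; anything else that parses as a float is a float;
    // the remainder stays a plain string.
    static String datatypeOf(String cell) {
        if (isInteger(cell)) return "xsd:integer";
        if (isFloat(cell))   return "xsd:float";
        return "xsd:string";
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "abc", "99999999999"}) {
            System.out.println(cell + " -> " + datatypeOf(cell));
        }
    }
}
```

In this sketch a whole number outside the 32-bit int range (the last cell above) fails Integer.valueOf and is therefore classified as a float; whether that is the desired behaviour for the extractor is a separate question.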
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
>> "html-mf-adr",
>> PopularPrefixes.createSubset("rdf", "vcard"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-adr.html",
>> AdrExtractor.class
>> );
>> }
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
>> "html-mf-geo",
>> PopularPrefixes.createSubset("rdf", "vcard"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-geo.html",
>> GeoExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
>> "html-mf-hcalendar",
>> PopularPrefixes.createSubset("rdf", "ical"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-hcalendar.html",
>> HCalendarExtractor.class);
>>
>> private static final String[] Components = {"Vevent", "Vtodo", "Vjournal", "Vfreebusy"};
>> @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
>> private boolean extractComponent(Node node, Resource cal, String component) throws ExtractionException {
>> HTMLDocument compoNode = new HTMLDocument(node);
>> BNode evt = valueFactory.createBNode();
>> - addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
>> + addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
>> addTextProps(compoNode, evt);
>> addUrl(compoNode, evt);
>> addRRule(compoNode, evt);
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
>> "html-mf-hcard",
>> PopularPrefixes.createSubset("rdf", "vcard"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-hcard.html",
>> HCardExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -82,7 +82,7 @@ public class HListingExtractor extends E
>> "html-mf-hlisting",
>> PopularPrefixes.createSubset("rdf", "hlisting"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-hlisting.html",
>> HListingExtractor.class
>> );
>>
>> @@ -106,7 +106,7 @@ public class HListingExtractor extends E
>> out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>>
>> for (String action : findActions(fragment)) {
>> - out.writeTriple(listing, hLISTING.action, hLISTING.getResource(action));
>> + out.writeTriple(listing, hLISTING.action, hLISTING.getClass(action));
>> }
>> out.writeTriple(listing, hLISTING.lister, addLister() );
>> addItem(listing);
>> @@ -154,7 +154,7 @@ public class HListingExtractor extends E
>> String value = node.getNodeValue();
>> // do not use conditionallyAdd, it won't work cause of evaluation rules
>> if (!(null == value || "".equals(value))) {
>> - URI property = hLISTING.getPropertyCamelized(klass);
>> + URI property = hLISTING.getPropertyCamelCase(klass);
>> conditionallyAddLiteralProperty(
>> node,
>> blankItem, property, valueFactory.createLiteral(value)
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
>> "html-mf-hrecipe",
>> PopularPrefixes.createSubset("rdf", "hrecipe"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-hrecipe.html",
>> HRecipeExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
>> "html-mf-hresume",
>> PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-hresume.html",
>> HResumeExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
>> "html-mf-hreview",
>> PopularPrefixes.createSubset("rdf", "vcard", "rev"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-hreview.html",
>> HReviewExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
>> "html-head-links",
>> PopularPrefixes.createSubset("xhtml", "dcterms"),
>> Arrays.asList("text/html;q=0.05", "application/xhtml+xml;q=0.05"),
>> - null,
>> + "example-head-link.html",
>> HeadLinkExtractor.class);
>> }
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
>> "html-head-icbm",
>> PopularPrefixes.createSubset("geo", "rdf"),
>> Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
>> - null,
>> + "example-icbm.html",
>> ICBMExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -51,7 +51,7 @@ public class LicenseExtractor implements
>> "html-mf-license",
>> PopularPrefixes.createSubset("xhtml"),
>> Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
>> - null,
>> + "example-mf-license.html",
>> LicenseExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
>> "html-mf-species",
>> PopularPrefixes.createSubset("rdf", "wo"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-species.html",
>> SpeciesExtractor.class
>> );
>>
>> @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>>
>> private URI resolveClassName(String clazz) {
>> String upperCaseClass = clazz.substring(0, 1);
>> - return vWO.getResource(
>> + return vWO.getClass(
>> String.format("%s%s",
>> upperCaseClass.toUpperCase(),
>> clazz.substring(1)
>>
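The resolveClassName() hunk above capitalizes the first letter of a microformat class name before looking it up in the vocabulary. A minimal, self-contained sketch of that capitalization step follows. The class and method names (ClassNameSketch, capitalizeFirst) are hypothetical and are not part of Any23; the empty-input guard and the use of Locale.ROOT are assumptions added here so that the result does not depend on the JVM default locale.

```java
import java.util.Locale;

// Hypothetical illustration only: not taken from the Any23 code base.
public final class ClassNameSketch {

    private ClassNameSketch() {
        // static helper, no instances
    }

    // Upper-cases the first character and leaves the rest of the name untouched.
    static String capitalizeFirst(String clazz) {
        if (clazz == null || clazz.isEmpty()) {
            // substring(0, 1) would fail on an empty string, so reject it explicitly
            throw new IllegalArgumentException("class name must be non-empty");
        }
        return clazz.substring(0, 1).toUpperCase(Locale.ROOT) + clazz.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(capitalizeFirst("species"));
        System.out.println(capitalizeFirst("Genus"));
    }
}
```

Without an explicit locale, toUpperCase() follows the default locale, which can map "i" to a dotted capital under a Turkish locale and so produce a different identifier.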
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
>> NAME,
>> PopularPrefixes.get(),
>> Arrays.asList("text/html;q=0.02", "application/xhtml+xml;q=0.02"),
>> - null,
>> + "example-script-turtle.html",
>> TurtleHTMLExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
>> "html-mf-xfn",
>> PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-mf-xfn.html",
>> XFNExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java Tue Jan 10 16:32:28 2012
>> @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
>> "html-microdata",
>> PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
>> - null,
>> + "example-microdata.html",
>> MicrodataExtractor.class
>> );
>>
>>
>> Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> ==============================================================================
>> --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java (original)
>> +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java Tue Jan 10 16:32:28 2012
>> @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
>> import org.deri.any23.extractor.ErrorReporter;
>> import org.deri.any23.extractor.ExtractionContext;
>> import org.deri.any23.extractor.ExtractionResult;
>> -import org.deri.any23.parser.NQuadsParser;
>> +import org.deri.any23.io.nquads.NQuadsParser;
>> import org.deri.any23.rdf.Any23ValueFactoryWrapper;
>> import org.openrdf.model.impl.ValueFactoryImpl;
>> import org.openrdf.rio.ParseErrorListener;
>> @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
>> import org.openrdf.rio.RDFParser;
>> import org.openrdf.rio.ntriples.NTriplesParser;
>> import org.openrdf.rio.rdfxml.RDFXMLParser;
>> +import org.openrdf.rio.trix.TriXParser;
>> import org.openrdf.rio.turtle.TurtleParser;
>> import org.slf4j.Logger;
>> import org.slf4j.LoggerFactory;
>> @@ -38,7 +39,7 @@ import java.io.Reader;
>>
>> /**
>> * This factory provides a common logic for creating and configuring correctly
>> - * any RDF parser used within the library.
>> + * any <i>RDF</i> parser used within the library.
>> *
>> * @author Michele Mostarda (mostarda@fbk.eu)
>> */
>> @@ -119,7 +120,7 @@ public class RDFParserFactory {
>> }
>>
>> /**
>> - * Returns a new instance of a configured {@link org.deri.any23.parser.NQuadsParser}.
>> + * Returns a new instance of a configured {@link org.deri.any23.io.nquads.NQuadsParser}.
>> *
>> * @param verifyDataType data verification enable if <code>true</code>.
>> * @param stopAtFirstError the parser stops at first error if <code>true</code>.
>> @@ -139,6 +140,26 @@ public class RDFParserFactory {
>> }
>>
>> /**
>> + * Returns a new instance of a configured {@link TriXParser}.
>> + *
>> + * @param verifyDataType data verification enable if <code>true</code>.
>> + * @param stopAtFirstError the parser stops at first error if <code>true</code>.
>> + * @param extractionContext the extraction context where the parser is used.
>> + * @param extractionResult the output extraction result.
>> + * @return a new instance of a configured TriX parser.
>> + */
>> + public TriXParser getTriXParser(
>> + final boolean verifyDataType,
>> + final boolean stopAtFirstError,
>> + final ExtractionContext extractionContext,
>> + final ExtractionResult extractionResult
>> + ) {
>> + final TriXParser parser = new TriXParser();
>> + configureParser(parser, verifyDataType, stopAtFirstError, extractionContext, extractionResult);
>> + return parser;
>> + }
>> +
>> + /**
>> * Configures the given parser on the specified extraction result
>> * setting the policies for data verification and error handling.
>> *
>>
>>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./
any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/
any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/
any23-core/src/main/java/or
Posted by Simone Tripodi <si...@apache.org>.
Hi Mic,
happy new year to you too indeed :P
Please shout if you need any help on reorganizing stuff, I would be
more than glad to provide my help!
TIA!
-Simo
http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/
On Tue, Jan 10, 2012 at 6:13 PM, Michele Mostarda
<mi...@gmail.com> wrote:
> On 10 January 2012 18:08, Simone Tripodi <si...@apache.org> wrote:
>
>> Hi Mic,
>>
>
> Hi Simo, happy new year !
>
>> this is something great, thanks for the hard work of merging!
>> next step is renaming the packages in org.apache.any23 :)
>>
>
> Sure :) It is the next critical issue scheduled on Jira.
> Then we can start discussing the release.
>
> Ciao
>
> Mic
>
>
>>
>> All the best, have a nice day!
>> -Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://simonetripodi.livejournal.com/
>> http://twitter.com/simonetripodi
>> http://www.99soft.org/
>>
>>
>>
>> On Tue, Jan 10, 2012 at 5:32 PM, <mo...@apache.org> wrote:
>> > Author: mostarda
>> > Date: Tue Jan 10 16:32:28 2012
>> > New Revision: 1229627
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
>> > Log:
>> > This commit synchronizes the discontinued Any23 Google Code SVN repo [1]
>> > with the current Apache Any23 SVN repo, including the issues
>> > developed during the initial import transition phase.
>> > Such issues have been tracked on the original Any23 Google Code Issue Tracker [2].
>> > Below is an extract of the original repository commit log.
>> >
>> > This commit is related to issue ANY23-27.
>> >
>> > [1] http://any23.googlecode.com/svn/trunk/
>> > [2] http://code.google.com/p/any23/issues/list
>> >
>> > ==== BEGIN: Original Log ====
>> >
>> > ------------------------------------------------------------------------
>> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line
>> >
>> > Improved numeric datatype assignment. This commit fixes issue #208.
>> > ------------------------------------------------------------------------
>> > r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'. Fixed HTMLMetaExtractorTest.java to match this new
>> > namespace. Discovered and fixed issue in SINDICE.java vocabulary: NS was declared as a resource instead of as a URI. Fixed
>> > RDFSchemaUtilsTest.java, whose sizes were wrong due to the wrong NS declaration. This commit is related to issue #203.
>> > ------------------------------------------------------------------------
>> > r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found wrong declaration of Class(Resource) in WO.java
>> > vocabulary. Fixed and updated RDFSchemaUtils.java test. This commit is related to issue #198.
>> > ------------------------------------------------------------------------
>> > r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Added utility method.
>> > ------------------------------------------------------------------------
>> > r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line
>> >
>> > Improved Vocabulary.java class: added support for comments to any resource. Improved RDFSchemaUtils.java serialization
>> > support, added separators to RDFXML serialization. This commit is related to issue #198.
>> > ------------------------------------------------------------------------
>> > r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) | 1 line
>> >
>> > Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved prefix declaration parsing in RDFa11Parser; this
>> > new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for prefix mapping resolution in
>> > RDFa11Parser; this allows the correct support for the structured properties introduced by the latest version of the Open
>> > Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest to the new output of vocabularies serialization.
>> > Updated Any23PluginManagerTest to include a new class. This commit is related to issue #206.
>> > ------------------------------------------------------------------------
>> > r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) | 1 line
>> >
>> > Restricted scope of testGetClassesFromClasspath to avoid updating it every time a new class is added.
>> > ------------------------------------------------------------------------
>> > r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) | 1 line
>> >
>> > Improved validation mode support. Improved descriptions of Validation and Report fields. This commit is related to issue
>> > #209.
>> > ------------------------------------------------------------------------
>> > r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) | 1 line
>> >
>> > Improved Any23 Service XML Report format documentation.
>> > ------------------------------------------------------------------------
>> > r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) | 1 line
>> >
>> > Added URL encoding to the source location path. This commit fixes issue #205. Chose not to write a formal test, which
>> > would require the creation of folders with spaces.
>> > ------------------------------------------------------------------------
>> > r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) | 1 line
>> >
>> > Removed obsolete section.
>> > ------------------------------------------------------------------------
>> > r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) | 1 line
>> >
>> > Improved Any23 facade, added method createDocumentSource() to simplify the extraction setup.
>> > ------------------------------------------------------------------------
>> > r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) | 1 line
>> >
>> > Refactored Rover CLI class to make it extensible from other CLI implementations.
>> > ------------------------------------------------------------------------
>> > r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) | 3 lines
>> >
>> > Removed wrong artifact name.
>> >
>> >
>> > ------------------------------------------------------------------------
>> > r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Removed no longer used jspf lib. Added crawler4j dependencies. Added README. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Changed attributes visibility to facilitate the class extensibility.
>> > ------------------------------------------------------------------------
>> > r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Added helper methods to extract file lines as list of strings. Improved javadoc.
>> > ------------------------------------------------------------------------
>> > r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Added first version of basic-crawler plugin. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Added plugins README.
>> > ------------------------------------------------------------------------
>> > r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Updated main README, added references to plugin and lib.
>> > ------------------------------------------------------------------------
>> > r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Fixed assembly name.
>> > ------------------------------------------------------------------------
>> > r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Fixed Tool signature. This commit is related to #211.
>> > ------------------------------------------------------------------------
>> > r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Improved logging.
>> > ------------------------------------------------------------------------
>> > r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Included plugin basic-crawler in reactor. Improved ToolRunner and Any23PluginManager tests to be compliant with the new
>> > plugin classes. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) | 1 line
>> >
>> > Fixed Crawler4j group id. Related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Improved plugin documentation. Introduced Office Scraper specific page. This commit is related to issue #213.
>> > ------------------------------------------------------------------------
>> > r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Fixed POST method documentation. Related to issue #213.
>> > ------------------------------------------------------------------------
>> > r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Fixed code snippets, prettified, added missing finalization logic. See issue #187.
>> > ------------------------------------------------------------------------
>> > r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Fixed var name. See #187.
>> > ------------------------------------------------------------------------
>> > r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Updated code snippets and tutorial, added explicit TripleHandler closure. This commit is related to issue #187.
>> > ------------------------------------------------------------------------
>> > r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Fixed data type handling management in NQuadsParser. This commit is related to issue #210.
>> > ------------------------------------------------------------------------
>> > r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Added missing JSON output format. See #214.
>> > ------------------------------------------------------------------------
>> > r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) | 1 line
>> >
>> > Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output format support to Rover. This commit is related to
>> > issue #215.
>> > ------------------------------------------------------------------------
>> > r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) | 1 line
>> >
>> > Added Sesame TriX IO dependency. This commit is related to #215.
>> > ------------------------------------------------------------------------
>> > r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) | 1 line
>> >
>> > Some suppressed tests have been reactivated as Ignored.
>> > ------------------------------------------------------------------------
>> > r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) | 1 line
>> >
>> > Added TriX output format to the Any23 Service. Commit related to issue #215.
>> > ------------------------------------------------------------------------
>> > r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) | 1 line
>> >
>> > Improved FormatWriter management, added WriterRegistry. Improved Writer format management in Rover and WebResponder.
>> > This commit is related to issues #215 and #216.
>> > ------------------------------------------------------------------------
>> > r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) | 6 lines
>> >
>> > Added TriXExtractor and textual example (example-trix.trx), added trix support in RDFParserFactory.
>> > Registered TriXExtractor to the ExtractorRegistry.
>> > Added TriX mimetype support in TikaMIMETypeDetector (through mimetypes.xml) and added specific test.
>> > Added support and doc to TriX format in Any23 Service web page (form.html).
>> > This commit is related to issue #215.
>> >
>> > ------------------------------------------------------------------------
>> > r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) | 1 line
>> >
>> > Fixed number of extractors (+1 after adding TriXExtractor). Commit related to issue #215.
>> > ------------------------------------------------------------------------
>> > r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Added method getExtractorType() .
>> > ------------------------------------------------------------------------
>> > r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) | 4 lines
>> >
>> > Improved ExtractorDocumentation support, added missing format examples.
>> > Improved output layout. This commit is related to issue #194.
>> >
>> >
>> > ------------------------------------------------------------------------
>> > r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Improved classpath management in Any23PluginManager. Renamed getClasses* to loadClasses*. This commit is related to
>> > issue #212.
>> > ------------------------------------------------------------------------
>> > r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Separated log messages from specific output data.
>> > ------------------------------------------------------------------------
>> > r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Added human readable report printing support in ReportingTripleHandler and Rover.
>> > ------------------------------------------------------------------------
>> > r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Fixed major issue in output generation, added final activity report, help prettification. This commit is related to
>> > issue #211.
>> > ------------------------------------------------------------------------
>> > r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Upgraded to Sesame 2.6.1 See issue #217.
>> > ------------------------------------------------------------------------
>> > r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue #216
>> > ------------------------------------------------------------------------
>> > r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) | 1 line
>> >
>> > Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
>> > ------------------------------------------------------------------------
>> > r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) | 1 line
>> >
>> > Added specific Crawler CLI documentation. Updated general CLI documentation. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) | 4 lines
>> >
>> > The Eval CLI Tool has been removed as well as the org.deri.any23.eval package classes related to it.
>> > Updated tests verifying CLI tool detection.
>> > This commit is related to issue #218.
>> >
>> > ------------------------------------------------------------------------
>> > r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) | 5 lines
>> >
>> > Added MimeDetector CLI Tool and test case, removed main() from
>> > TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
>> > Updated CLI doc.
>> > This commit is related to issue #219.
>> >
>> > ------------------------------------------------------------------------
>> > r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) | 1 line
>> >
>> > Added support for comment serialization. Related to issue #158.
>> > ------------------------------------------------------------------------
>> > r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) | 1 line
>> >
>> > Add support for annotation writing in FormatWriter implementations. This commit is related to issue #158.
>> > ------------------------------------------------------------------------
>> > r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) | 1 line
>> >
>> > Added support for 'annotate' flag in Any23 Service.
>> > ------------------------------------------------------------------------
>> >
>> > ==== END : Original Log ====
>> >
>> >
>> > Added:
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
>> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
>> > incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
>> >
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
>> >
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
>> >
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
>> > incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
>> > incubator/any23/trunk/any23-core/src/test/resources/application/trix/
>> >
>> incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
>> >
>> incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
>> >
>> incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
>> > incubator/any23/trunk/lib/README.txt
>> > incubator/any23/trunk/plugins/README.txt
>> > incubator/any23/trunk/plugins/basic-crawler/
>> > incubator/any23/trunk/plugins/basic-crawler/pom.xml
>> > incubator/any23/trunk/plugins/basic-crawler/src/
>> > incubator/any23/trunk/plugins/basic-crawler/src/main/
>> > incubator/any23/trunk/plugins/basic-crawler/src/main/java/
>> > incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
>> > incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
>> > incubator/any23/trunk/plugins/basic-crawler/src/test/
>> > incubator/any23/trunk/plugins/basic-crawler/src/test/java/
>> > incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
>> > incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
>> >
>> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
>> > incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
>> > Removed:
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
>> > Modified:
>> > incubator/any23/trunk/README.txt
>> > incubator/any23/trunk/any23-core/bin/any23
>> > incubator/any23/trunk/any23-core/bin/any23tools
>> > incubator/any23/trunk/any23-core/pom.xml
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
>> >
>> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
>> >
>> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
>> >
>> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
>> >
>> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
>> >
>> incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
>> >
>> incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
>> > incubator/any23/trunk/lib/install-deps.sh
>> >
>> incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
>> > incubator/any23/trunk/pom.xml
>> > incubator/any23/trunk/src/site/apt/any23-plugins.apt
>> > incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
>> > incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
>> > incubator/any23/trunk/src/site/apt/getting-started.apt
>> > incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
>> > incubator/any23/trunk/src/site/apt/service.apt
>> > incubator/any23/trunk/src/site/apt/supported-formats.apt
>> >
>> > Modified: incubator/any23/trunk/README.txt
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/README.txt (original)
>> > +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
>> > @@ -20,7 +20,8 @@ Distribution Content
>> >
>> > any23-core The library core codebase.
>> > any23-service The library HTTP service codebase.
>> > -plugins Library plugins codebase.
>> > +lib Contains the Any23 the external deps (read lib/README.txt for further details).
>> > +plugins Library plugins codebase (read plugins/README.txt for further details).
>> > RELEASE-NOTES.txt File reporting main release notes for every version.
>> > LICENSE.txt Applicable project license.
>> > README.txt This file.
>> > @@ -240,15 +241,14 @@ Upload the produced packages in download
>> >
>> > http://code.google.com/p/any23/downloads/list
>> >
>> > +--------------------
>> > +Manage External Deps
>> > +--------------------
>> >
>> > -Fix Release Procedure
>> > ----------------------
>> > -
>> > - Currently the *plugins/integration-test* module is excluded from the parent
>> > - reactor.
>> > - To fix it in tag follow procedure as described at issue #171:
>> > -
>> > - http://code.google.com/p/any23/issues/detail?id=171
>> > +::Developers interest only.::
>> >
>> > +External Deps are libraries used by some Any23 modules which are
>> > +not available in public Maven repositories. Such libraries are
>> > +managed within the 'lib' dir.
>> >
>> > EOF
>> >
>> > Modified: incubator/any23/trunk/any23-core/bin/any23
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/bin/any23 (original)
>> > +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
>> > @@ -9,12 +9,12 @@
>> > ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>> >
>> > if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> > - echo "Generating executable JAR..."
>> > - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > + echo "Generating executable JAR..." >&2
>> > + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> > ||\
>> > - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> > ||\
>> > - { echo "Error while generating commandline assembly."; exit 1; }
>> > + { echo "Error while generating commandline assembly." >&2; exit 1; }
>> > fi
>> >
>> > SEP=':'
>> >
>> > Modified: incubator/any23/trunk/any23-core/bin/any23tools
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/bin/any23tools (original)
>> > +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
>> > @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
>> > PLUGINS_DIR=plugins
>> >
>> > if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> > - echo "Generating executable JAR..."
>> > - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > + echo "Generating executable JAR..." >&2
>> > + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> > ||\
>> > - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> > ||\
>> > - { echo "Error while generating commandline assembly."; exit 1; }
>> > + { echo "Error while generating commandline assembly." >&2; exit 1; }
>> > fi
>> >
>> > SEP=':'
>> > @@ -30,6 +30,7 @@ done
>> > # Plugins classpath.
>> > for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
>> > do
>> > + echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
>> > if [ ! -e "$jar" ]; then continue; fi
>> > CP="$CP$SEP$jar"
>> > done
>> >
>> > Modified: incubator/any23/trunk/any23-core/pom.xml
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/pom.xml (original)
>> > +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
>> > @@ -92,6 +92,10 @@
>> > </dependency>
>> > <dependency>
>> > <groupId>org.openrdf.sesame</groupId>
>> > + <artifactId>sesame-rio-trix</artifactId>
>> > + </dependency>
>> > + <dependency>
>> > + <groupId>org.openrdf.sesame</groupId>
>> > <artifactId>sesame-repository-sail</artifactId>
>> > </dependency>
>> > <dependency>
>> >
>> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
>> > @@ -258,6 +258,28 @@ public class Any23 {
>> > }
>> >
>> > /**
>> > + * Returns the most appropriate {@link DocumentSource} for the given<code>documentURI</code>.
>> > + *
>> > + * @param documentURI the document <i>URI</i>.
>> > + * @return a new instance of DocumentSource.
>> > + * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
>> > + * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
>> > + */
>> > + public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
>> > + if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
>> > + if (documentURI.toLowerCase().startsWith("file:")) {
>> > + return new FileDocumentSource( new File(new URI(documentURI)) );
>> > + }
>> > + if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> > + return new HTTPDocumentSource(getHTTPClient(), documentURI);
>> > + }
>> > + throw new IllegalArgumentException(
>> > + String.format("Unsupported protocol for document URI: '%s' .", documentURI)
>> > + );
>> > + }
>> > +
>> > +
>> > + /**
>> > * Performs metadata extraction from the content of the given
>> > * <code>in</code> document source, sending the generated events
>> > * to the specified <code>outputHandler</code>.
>> > @@ -363,13 +385,7 @@ public class Any23 {
>> > public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
>> > throws IOException, ExtractionException {
>> > try {
>> > - if (documentURI.toLowerCase().startsWith("file:")) {
>> > - return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
>> > - }
>> > - if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
>> > - return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
>> > - }
>> > - throw new ExtractionException("Not a valid absolute URI: " + documentURI);
>> > + return extract(eps, createDocumentSource(documentURI), outputHandler);
>> > } catch (URISyntaxException ex) {
>> > throw new ExtractionException("Error while extracting data from document URI.", ex);
>> > }
>> >
>> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
>> > @@ -16,7 +16,7 @@
>> >
>> > package org.deri.any23.cli;
>> >
>> > -import org.deri.any23.LogUtil;
>> > +import org.deri.any23.util.LogUtils;
>> > import org.deri.any23.extractor.ExampleInputOutput;
>> > import org.deri.any23.extractor.ExtractionException;
>> > import org.deri.any23.extractor.Extractor;
>> > @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
>> > }
>> >
>> > public int run(String[] args) {
>> > - LogUtil.setDefaultLogging();
>> > + LogUtils.setDefaultLogging();
>> > try {
>> > if (args.length == 0) {
>> > printUsage();
>> > @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
>> > * Prints the list of all the available extractors.
>> > */
>> > public void printExtractorList() {
>> > - for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
>> > - System.out.println(extractorName);
>> > + for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
>> > + System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
>> > }
>> > }
>> >
>> > @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
>> > ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
>> > ExampleInputOutput example = new ExampleInputOutput(factory);
>> > System.out.println("Extractor: " + extractorName);
>> > - System.out.println(" type: " + getType(factory));
>> > - String output = example.getExampleOutput();
>> > - if (output == null) {
>> > - System.out.println("(no example output)");
>> > + System.out.println("\ttype: " + getType(factory));
>> > + System.out.println();
>> > + final String exampleInput = example.getExampleInput();
>> > + if(exampleInput == null) {
>> > + System.out.println("(No Example Available)");
>> > } else {
>> > - System.out.println("-------- example output --------");
>> > - System.out.println(output);
>> > + System.out.println("-------- Example Input --------");
>> > + System.out.println(exampleInput);
>> > + System.out.println("-------- Example Output --------");
>> > + String output = example.getExampleOutput();
>> > + System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
>> > }
>> > - System.out.println();
>> > System.out.println("================================");
>> > + System.out.println();
>> > }
>> > }
>> >
>> >
>> > Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
>> > @@ -0,0 +1,113 @@
>> > +/*
>> > + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
>> > + *
>> > + * Licensed under the Apache License, Version 2.0 (the "License");
>> > + * you may not use this file except in compliance with the License.
>> > + * You may obtain a copy of the License at
>> > + *
>> > + * http://www.apache.org/licenses/LICENSE-2.0
>> > + *
>> > + * Unless required by applicable law or agreed to in writing, software
>> > + * distributed under the License is distributed on an "AS IS" BASIS,
>> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> > + * See the License for the specific language governing permissions and
>> > + * limitations under the License.
>> > + */
>> > +
>> > +package org.deri.any23.cli;
>> > +
>> > +import org.deri.any23.configuration.DefaultConfiguration;
>> > +import org.deri.any23.http.DefaultHTTPClient;
>> > +import org.deri.any23.http.HTTPClient;
>> > +import org.deri.any23.http.HTTPClientConfiguration;
>> > +import org.deri.any23.mime.MIMEType;
>> > +import org.deri.any23.mime.MIMETypeDetector;
>> > +import org.deri.any23.mime.TikaMIMETypeDetector;
>> > +import org.deri.any23.source.DocumentSource;
>> > +import org.deri.any23.source.FileDocumentSource;
>> > +import org.deri.any23.source.HTTPDocumentSource;
>> > +import org.deri.any23.source.StringDocumentSource;
>> > +
>> > +import java.io.File;
>> > +import java.net.URISyntaxException;
>> > +
>> > +/**
>> > + * Commandline tool to detect <b>MIME Type</b>s from
>> > + * file, HTTP and direct input sources.
>> > + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
>> > + *
>> > + * @author Michele Mostarda (mostarda@fbk.eu)
>> > + */
>> > +@ToolRunner.Description("MIME Type Detector Tool.")
>> > +public class MimeDetector implements Tool{
>> > +
>> > + public static final String FILE_DOCUMENT_PREFIX = "file://";
>> > + public static final String INLINE_DOCUMENT_PREFIX = "inline://";
>> > + public static final String URL_DOCUMENT_RE = "^https?://.*";
>> > +
>> > + public static void main(String[] args) {
>> > + System.exit( new MimeDetector().run(args) );
>> > + }
>> > +
>> > + @Override
>> > + public int run(String[] args) {
>> > + if(args.length != 1) {
>> > + System.err.println("USAGE: { http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
>> > + return 1;
>> > + }
>> > +
>> > + final String document = args[0];
>> > + try {
>> > + final DocumentSource documentSource = createDocumentSource(document);
>> > + final MIMETypeDetector detector = new TikaMIMETypeDetector();
>> > + final MIMEType mimeType = detector.guessMIMEType(
>> > + documentSource.getDocumentURI(),
>> > + documentSource.openInputStream(),
>> > + MIMEType.parse(documentSource.getContentType())
>> > + );
>> > + System.out.println(mimeType);
>> > + return 0;
>> > + } catch (Exception e) {
>> > + System.err.print("Error while detecting MIME Type.");
>> > + e.printStackTrace(System.err);
>> > + return 1;
>> > + }
>> > + }
>> > +
>> > + private DocumentSource createDocumentSource(String document) throws URISyntaxException {
>> > + if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
>> > + return new FileDocumentSource(
>> > + new File(
>> > + document.substring(FILE_DOCUMENT_PREFIX.length())
>> > + )
>> > + );
>> > + }
>> > + if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
>> > + return new StringDocumentSource(
>> > + document.substring(INLINE_DOCUMENT_PREFIX.length()),
>> > + ""
>> > + );
>> > + }
>> > + if(document.matches(URL_DOCUMENT_RE)) {
>> > + final HTTPClient client = new DefaultHTTPClient();
>> > + // TODO: anonymous config class also used in Any23. centralize.
>> > + client.init(new HTTPClientConfiguration() {
>> > + public String getUserAgent() {
>> > + return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
>> > + }
>> > + public String getAcceptHeader() {
>> > + return "";
>> > + }
>> > + public int getDefaultTimeout() {
>> > + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
>> > + }
>> > + public int getMaxConnections() {
>> > + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
>> > + }
>> > + });
>> > + return new HTTPDocumentSource(client, document);
>> > + }
>> > + throw new IllegalArgumentException("Unsupported protocol for document " + document);
>> > + }
>> > +
>> > +}
>> >
>> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> > ==============================================================================
>> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
>> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
>> > @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
>> > import org.apache.commons.cli.Options;
>> > import org.apache.commons.cli.PosixParser;
>> > import org.deri.any23.Any23;
>> > -import org.deri.any23.LogUtil;
>> > +import org.deri.any23.util.LogUtils;
>> > import org.deri.any23.configuration.Configuration;
>> > import org.deri.any23.configuration.DefaultConfiguration;
>> > import org.deri.any23.extractor.ExtractionException;
>> > @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
>> > import org.deri.any23.extractor.SingleDocumentExtraction;
>> > import org.deri.any23.filter.IgnoreAccidentalRDFa;
>> > import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
>> > +import org.deri.any23.source.DocumentSource;
>> > import org.deri.any23.writer.BenchmarkTripleHandler;
>> > import org.deri.any23.writer.LoggingTripleHandler;
>> > -import org.deri.any23.writer.NQuadsWriter;
>> > -import org.deri.any23.writer.NTriplesWriter;
>> > -import org.deri.any23.writer.RDFXMLWriter;
>> > import org.deri.any23.writer.ReportingTripleHandler;
>> > import org.deri.any23.writer.TripleHandler;
>> > import org.deri.any23.writer.TripleHandlerException;
>> > -import org.deri.any23.writer.TurtleWriter;
>> > -import org.deri.any23.writer.URIListWriter;
>> > +import org.deri.any23.writer.WriterRegistry;
>> > import org.slf4j.Logger;
>> > import org.slf4j.LoggerFactory;
>> >
>> > @@ -51,6 +48,7 @@ import java.io.OutputStream;
>> > import java.io.PrintStream;
>> > import java.io.PrintWriter;
>> > import java.net.MalformedURLException;
>> > +import java.net.URISyntaxException;
>> > import java.net.URL;
>> >
>> > import static
>> org.deri.any23.extractor.ExtractionParameters.ValidationMode;
>> > @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
>> > * A default rover implementation. Goes and fetches a URL using an hint
>> > * as to what format should require, then tries to convert it to RDF.
>> > *
>> > - * @author Gabriele Renzi
>> > - * @author Richard Cyganiak (richard@cyganiak.de)
>> > * @author Michele Mostarda (mostarda@fbk.eu)
>> > + * @author Richard Cyganiak (richard@cyganiak.de)
>> > + * @author Gabriele Renzi
>> > */
>> > @ToolRunner.Description("Any23 Command Line Tool.")
>> > public class Rover implements Tool {
>> >
>> > - // Supported formats.
>> > - private static final String TURTLE_FORMAT = "turtle";
>> > - private static final String NTRIPLE_FORMAT = "ntriples";
>> > - private static final String RDFXML_FORMAT = "rdfxml";
>> > - private static final String NQUADS_FORMAT = "nquads";
>> > - private static final String URIS_FORMAT = "uris";
>> > -
>> > - private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
>> > + private static final String[] FORMATS =
>> WriterRegistry.getInstance().getIdentifiers();
>> > + private static final int DEFAULT_FORMAT_INDEX = 0;
>> >
>> > private static final Logger logger =
>> LoggerFactory.getLogger(Rover.class);
>> >
>> > - private static Options options;
>> > + private Options options;
>> >
>> > - public static void main(String[] args) {
>> > - System.exit( new Rover().run(args) );
>> > - }
>> > + private CommandLine commandLine;
>> >
>> > - public int run(String[] args) {
>> > - final CommandLineParser parser = new PosixParser();
>> > - final CommandLine commandLine;
>> > + private boolean verbose = false;
>> >
>> > - boolean verbose = false;
>> > - try {
>> > - options = createOptions();
>> > - commandLine = parser.parse(options, args);
>> > + private PrintStream outputStream;
>> > + private TripleHandler tripleHandler;
>> > + private ReportingTripleHandler reportingTripleHandler;
>> > + private BenchmarkTripleHandler benchmarkTripleHandler;
>> >
>> > - if (commandLine.hasOption("h")) {
>> > - printHelp();
>> > - return 0;
>> > - }
>> > + private ExtractionParameters eps;
>> > + private Any23 any23;
>> >
>> > - if (commandLine.hasOption('v')) {
>> > - verbose = true;
>> > - LogUtil.setVerboseLogging();
>> > - } else {
>> > - LogUtil.setDefaultLogging();
>> > - }
>> > -
>> > - if (commandLine.getArgs().length < 1) {
>> > - printHelp();
>> > - throw new IllegalArgumentException("Expected at least 1 argument.");
>> > - }
>> > + protected boolean isVerbose() {
>> > + return verbose;
>> > + }
>> >
>> > - final String[] inputURIs =
>> argumentsToURIs(commandLine.getArgs());
>> > - final String[] extractorNames = getExtractors(commandLine);
>> > + public static void main(String[] args) {
>> > + System.exit( new Rover().run(args) );
>> > + }
>> >
>> > - PrintStream outputStream = null;
>> > - TripleHandler tripleHandler = null;
>> > - try {
>> > - outputStream = getOutputStream(commandLine);
>> > + public int run(String[] args) {
>> > + try {
>> > + final String[] uris = configure(args);
>> > + performExtraction(uris);
>> > + return 0;
>> > + } catch (Exception e) {
>> > + System.err.println( e.getMessage() );
>> > + final int exitCode = e instanceof ExitCodeException ?
>> ((ExitCodeException) e).exitCode : 1;
>> > + if(verbose) e.printStackTrace(System.err);
>> > + return exitCode;
>> > + }
>> > + }
>> >
>> > - tripleHandler = getTripleHandler(commandLine,
>> outputStream);
>> > + protected CommandLine getCommandLine() {
>> > + if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
>> > + return commandLine;
>> > + }
>> >
>> > - tripleHandler = decorateWithLogHandler(commandLine,
>> tripleHandler);
>> > + protected String[] configure(String[] args) throws Exception {
>> > + final CommandLineParser parser = new PosixParser();
>> > + options = createOptions();
>> > + commandLine = parser.parse(options, args);
>> >
>> > - tripleHandler =
>> decorateWithStatisticsHandler(commandLine, tripleHandler);
>> > - final BenchmarkTripleHandler benchmarkTripleHandler =
>> > - tripleHandler instanceof BenchmarkTripleHandler
>> ? (BenchmarkTripleHandler) tripleHandler : null;
>> > + if (commandLine.hasOption("h")) {
>> > + printHelp();
>> > + throw new ExitCodeException(0);
>> > + }
>> >
>> > - tripleHandler =
>> decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> > + if (commandLine.hasOption('v')) {
>> > + verbose = true;
>> > + LogUtils.setVerboseLogging();
>> > + } else {
>> > + LogUtils.setDefaultLogging();
>> > + }
>> >
>> > - final ReportingTripleHandler reportingTripleHandler =
>> new ReportingTripleHandler(tripleHandler);
>> > + if (commandLine.getArgs().length < 1) {
>> > + printHelp();
>> > + throw new IllegalArgumentException("Expected at least 1 argument.");
>> > + }
>> >
>> > - final ExtractionParameters eps =
>> getExtractionParameters(commandLine);
>> > + final String[] inputURIs =
>> argumentsToURIs(commandLine.getArgs());
>> > + final String[] extractorNames = getExtractors(commandLine);
>> >
>> > - final Any23 any23 = createAny23(extractorNames);
>> > + try {
>> > + outputStream = getOutputStream(commandLine);
>> > + tripleHandler = getTripleHandler(commandLine, outputStream);
>> > + tripleHandler = decorateWithLogHandler(commandLine,
>> tripleHandler);
>> > + tripleHandler = decorateWithStatisticsHandler(commandLine,
>> tripleHandler);
>> >
>> > - final long start = System.currentTimeMillis();
>> > - for(String inputURI : inputURIs) {
>> > - performExtraction(any23, eps, inputURI,
>> reportingTripleHandler);
>> > - }
>> > - final long elapsed = System.currentTimeMillis() - start;
>> > + benchmarkTripleHandler =
>> > + tripleHandler instanceof BenchmarkTripleHandler ?
>> (BenchmarkTripleHandler) tripleHandler : null;
>> >
>> > - closeAll(tripleHandler, outputStream);
>> > + tripleHandler =
>> decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> >
>> > - if (benchmarkTripleHandler != null) {
>> > - System.err.println( benchmarkTripleHandler.report()
>> );
>> > - }
>> > + reportingTripleHandler = new
>> ReportingTripleHandler(tripleHandler);
>> > + eps = getExtractionParameters(commandLine);
>> > + any23 = createAny23(extractorNames);
>> >
>> > - logger.info("Extractors used: " +
>> reportingTripleHandler.getExtractorNames());
>> > - logger.info(reportingTripleHandler.getTotalTriples() +
>> " triples, " + elapsed + "ms");
>> > - } finally {
>> > - closeAll(tripleHandler, outputStream);
>> > - }
>> > + return inputURIs;
>> > } catch (Exception e) {
>> > - System.err.println(e.getMessage());
>> > - final int exitCode = e instanceof SpecificExitException ?
>> ((SpecificExitException) e).exitCode : 1;
>> > - if(verbose) e.printStackTrace(System.err);
>> > - return exitCode;
>> > + closeStreams();
>> > + throw e;
>> > }
>> > - return 0;
>> > }
>> >
>> > - private Options createOptions() {
>> > + protected Options createOptions() {
>> > final Options options = new Options();
>> > options.addOption(
>> > new Option("v", "verbose", false, "Show debug and progress information.")
>> > @@ -178,13 +175,7 @@ public class Rover implements Tool {
>> > "f",
>> > "Output format",
>> > true,
>> > - "[" +
>> > - TURTLE_FORMAT + " (default), " +
>> > - NTRIPLE_FORMAT + ", " +
>> > - RDFXML_FORMAT + ", " +
>> > - NQUADS_FORMAT + ", " +
>> > - URIS_FORMAT +
>> > - "]"
>> > + "[" + printFormats(FORMATS,
>> DEFAULT_FORMAT_INDEX) + "]"
>> > )
>> > );
>> > options.addOption(
>> > @@ -208,11 +199,51 @@ public class Rover implements Tool {
>> > return options;
>> > }
>> >
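The help text for the -f option is now built from the identifiers returned by WriterRegistry instead of a hard-coded list. A minimal standalone sketch of that joining logic follows, mirroring the printFormats helper added further down in this diff. The class name FormatListSketch and the sample identifiers are illustrative assumptions, not the registry's actual contents.

```java
class FormatListSketch {

    // Joins the identifiers with ", " and marks the entry at defaultIndex as the default.
    static String printFormats(String[] formats, int defaultIndex) {
        final StringBuilder sb = new StringBuilder();
        for (int i = 0; i < formats.length; i++) {
            sb.append(formats[i]);
            if (i == defaultIndex) sb.append(" (default)");
            if (i < formats.length - 1) sb.append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample identifiers only (assumption); Rover reads the real list from WriterRegistry.
        final String[] formats = {"turtle", "ntriples", "rdfxml"};
        System.out.println("[" + printFormats(formats, 0) + "]");
    }
}
```

Since DEFAULT_FORMAT_INDEX is 0, the default output format appears to be whichever identifier the registry returns first, so the registration order matters.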
>> > + protected void performExtraction(DocumentSource documentSource) {
>> > + performExtraction(any23, eps, documentSource,
>> reportingTripleHandler);
>> > + }
>> > +
>> > + protected void performExtraction(String[] inputURIs) throws
>> URISyntaxException, IOException {
>> > + try {
>> > + final long start = System.currentTimeMillis();
>> > + for (String inputURI : inputURIs) {
>> > + performExtraction( any23.createDocumentSource(inputURI)
>> );
>> > + }
>> > + final long elapsed = System.currentTimeMillis() - start;
>> > +
>> > + if (benchmarkTripleHandler != null) {
>> > + System.err.println(benchmarkTripleHandler.report());
>> > + }
>> > +
>> > + logger.info("Extractors used: " +
>> reportingTripleHandler.getExtractorNames());
>> > + logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
>> > + } finally {
>> > + closeStreams();
>> > + }
>> > + }
>> > +
>> > + protected String printReports() {
>> > + final StringBuilder sb = new StringBuilder();
>> > + if(benchmarkTripleHandler != null) sb.append(
>> benchmarkTripleHandler.report() ).append('\n');
>> > + if(reportingTripleHandler != null) sb.append(
>> reportingTripleHandler.printReport() ).append('\n');
>> > + return sb.toString();
>> > + }
>> > +
>> > private void printHelp() {
>> > HelpFormatter formatter = new HelpFormatter();
>> > formatter.printHelp("[{<url>|<file>}]+", options, true);
>> > }
>> >
>> > + private String printFormats(String[] formats, int defaultIndex) {
>> > + final StringBuilder sb = new StringBuilder();
>> > + for (int i = 0; i < formats.length; i++) {
>> > + sb.append(formats[i]);
>> > + if(i == defaultIndex) sb.append(" (default)");
>> > + if(i < formats.length - 1) sb.append(", ");
>> > + }
>> > + return sb.toString();
>> > + }
>> > +
>> > private String argumentToURI(String uri) {
>> > uri = uri.trim();
>> > if (uri.toLowerCase().startsWith("http:") ||
>> uri.toLowerCase().startsWith("https:")) {
>> > @@ -268,27 +299,17 @@ public class Rover implements Tool {
>> >
>> > private TripleHandler getTripleHandler(CommandLine cl, OutputStream
>> os) {
>> > final String FORMAT_OPTION = "f";
>> > - String format = DEFAULT_FORMAT;
>> > + String format = FORMATS[DEFAULT_FORMAT_INDEX];
>> > if (cl.hasOption(FORMAT_OPTION)) {
>> > - format = cl.getOptionValue(FORMAT_OPTION);
>> > + format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
>> > }
>> > - final TripleHandler outputHandler;
>> > - if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
>> > - outputHandler = new TurtleWriter(os);
>> > - } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
>> > - outputHandler = new NTriplesWriter(os);
>> > - } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
>> > - outputHandler = new RDFXMLWriter(os);
>> > - } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
>> > - outputHandler = new NQuadsWriter(os);
>> > - } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
>> > - outputHandler = new URIListWriter(os);
>> > - } else {
>> > + try {
>> > + return
>> WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
>> > + } catch (Exception e) {
>> > throw new IllegalArgumentException(
>> > String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
>> > );
>> > }
>> > - return outputHandler;
>> > }
>> >
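The if/else chain over writer classes is replaced here by a lookup through WriterRegistry. The registry source is not part of this excerpt, so the following is only a rough sketch of the general pattern (identifier mapped to a factory), not the actual Any23 class. RegistrySketch, Handler, register and getByIdentifier are all hypothetical names.

```java
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

class RegistrySketch {

    // Hypothetical stand-in for TripleHandler.
    interface Handler {
        String name();
    }

    // Insertion-ordered, so getIdentifiers() returns entries in registration order.
    private final Map<String, Function<OutputStream, Handler>> factories = new LinkedHashMap<>();

    void register(String id, Function<OutputStream, Handler> factory) {
        factories.put(id.toLowerCase(Locale.ROOT), factory);
    }

    String[] getIdentifiers() {
        return factories.keySet().toArray(new String[0]);
    }

    // Fails with IllegalArgumentException when no writer is registered under the identifier.
    Handler getByIdentifier(String id, OutputStream os) {
        final Function<OutputStream, Handler> factory = factories.get(id);
        if (factory == null) throw new IllegalArgumentException("Unknown identifier: " + id);
        return factory.apply(os);
    }

    public static void main(String[] args) {
        final RegistrySketch registry = new RegistrySketch();
        registry.register("turtle", os -> () -> "turtle-writer");
        // As in the diff, the caller lower-cases the user-supplied value before the lookup.
        final String requested = "TURTLE".toLowerCase(Locale.ROOT);
        System.out.println(registry.getByIdentifier(requested, System.out).name());
    }
}
```

With this shape a new output format only needs a registration call; the command line tool itself does not change.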
>> > private TripleHandler
>> decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
>> > @@ -346,44 +367,54 @@ public class Rover implements Tool {
>> > return any23;
>> > }
>> >
>> > - private void performExtraction(Any23 any23, ExtractionParameters
>> eps, String documentURI, TripleHandler th) {
>> > + private void performExtraction(
>> > + Any23 any23, ExtractionParameters eps, DocumentSource
>> documentSource, TripleHandler th
>> > + ) {
>> > try {
>> > - if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
>> > - throw new SpecificExitException("No suitable extractors found.", 2);
>> > + if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
>> > + throw new ExitCodeException("No suitable extractors found.", 2);
>> > }
>> > } catch (ExtractionException ex) {
>> > - throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
>> > + throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
>> > } catch (IOException ex) {
>> > - throw new SpecificExitException("Exception while producing output.", ex, 4);
>> > + throw new ExitCodeException("Exception while producing output.", ex, 4);
>> > }
>> > }
>> >
>> > - private void closeHandler(TripleHandler th) {
>> > - if(th == null) return;
>> > + private void closeHandler() {
>> > + if(tripleHandler == null) return;
>> > try {
>> > - th.close();
>> > + tripleHandler.close();
>> > } catch (TripleHandlerException the) {
>> > - throw new SpecificExitException("Error while closing TripleHandler", the, 5);
>> > + throw new ExitCodeException("Error while closing TripleHandler", the, 5);
>> > }
>> > }
>> >
>> > - private void closeAll(TripleHandler th, PrintStream os) {
>> > - closeHandler(th);
>> > - if(os != null) os.close();
>> > + private void closeStreams() {
>> > + closeHandler();
>> > + if(outputStream != null) outputStream.close();
>> > }
>> >
>> > - private class SpecificExitException extends RuntimeException {
>> > + protected class ExitCodeException extends RuntimeException {
>> >
>> > private final int exitCode;
>> >
>> > - public SpecificExitException(String message, Throwable cause,
>> int exitCode) {
>> > + public ExitCodeException(String message, Throwable cause, int
>> exitCode) {
>> > super(message, cause);
>> > this.exitCode = exitCode;
>> > }
>> > - public SpecificExitException(String message, int exitCode) {
>> > + public ExitCodeException(String message, int exitCode) {
>> > super(message);
>> > this.exitCode = exitCode;
>> > }
>> > + public ExitCodeException(int exitCode) {
>> > + super();
>> > + this.exitCode = exitCode;
>> > + }
>> > +
>> > + protected int getExitCode() {
>> > + return exitCode;
>> > + }
>> > }
>> >
>> > }
>> >
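In the refactored Rover, run() maps any exception to a process exit code, and the -h branch now signals a clean exit by throwing ExitCodeException(0). That exception has no message, so, if this reading of the diff is right, the println of e.getMessage() in run() would print the text "null" after the help output. A minimal sketch of the same pattern with a null guard follows. ExitCodeSketch and the Runnable-based run method are hypothetical simplifications, not the Rover API.

```java
class ExitCodeSketch {

    // Unchecked exception carrying the desired process exit code, as in the diff above.
    static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        ExitCodeException(int exitCode) {
            super();
            this.exitCode = exitCode;
        }

        int getExitCode() {
            return exitCode;
        }
    }

    // Hypothetical stand-in for Rover.run(): executes the task and maps failures to exit codes.
    static int run(Runnable task) {
        try {
            task.run();
            return 0;
        } catch (Exception e) {
            // Print only when a message exists; ExitCodeException(int) has none.
            if (e.getMessage() != null) System.err.println(e.getMessage());
            return e instanceof ExitCodeException ? ((ExitCodeException) e).getExitCode() : 1;
        }
    }

    public static void main(String[] args) {
        System.exit(run(() -> {
            throw new ExitCodeException("No suitable extractors found.", 2);
        }));
    }
}
```

Any exception that is not an ExitCodeException still falls back to exit code 1, matching the behaviour in the diff.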
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -29,6 +29,13 @@ import java.util.Collection;
>> > public interface ExtractorFactory<T extends Extractor<?>> extends
>> ExtractorDescription {
>> >
>> > /**
>> > + * Returns the extractor type.
>> > + *
>> > + * @return the not <code>null</code> extractor class.
>> > + */
>> > + Class<T> getExtractorType();
>> > +
>> > + /**
>> > * Creates an extractor instance.
>> > *
>> > * @return an instance of the extractor associated to this factory.
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
>> > import org.deri.any23.extractor.rdf.NQuadsExtractor;
>> > import org.deri.any23.extractor.rdf.NTriplesExtractor;
>> > import org.deri.any23.extractor.rdf.RDFXMLExtractor;
>> > +import org.deri.any23.extractor.rdf.TriXExtractor;
>> > import org.deri.any23.extractor.rdf.TurtleExtractor;
>> > import org.deri.any23.extractor.rdfa.RDFa11Extractor;
>> > import org.deri.any23.extractor.rdfa.RDFaExtractor;
>> > @@ -79,6 +80,7 @@ public class ExtractorRegistry {
>> > instance.register(TurtleExtractor.factory);
>> > instance.register(NTriplesExtractor.factory);
>> > instance.register(NQuadsExtractor.factory);
>> > + instance.register(TriXExtractor.factory);
>> >
>> if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
>> > instance.register(RDFa11Extractor.factory);
>> > } else {
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
>> > return supportedMIMETypes;
>> > }
>> >
>> > + @Override
>> > + public Class<T> getExtractorType() {
>> > + return extractorClass;
>> > + }
>> > +
>> > /**
>> > * @return an instance of type T concrete implementation of {@link
>> org.deri.any23.extractor.Extractor}
>> > */
>> > + @Override
>> > public T createExtractor() {
>> > try {
>> > return extractorClass.newInstance();
>> > @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
>> > /**
>> > * @return an input example
>> > */
>> > + @Override
>> > public String getExampleInput() {
>> > return exampleInput;
>> > }
>> >
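The new getExtractorType() accessor exposes the class token that SimpleExtractorFactory already uses to instantiate extractors. A small sketch of a factory built around such a token follows. TypedFactorySketch and SimpleFactory are hypothetical names, and the sketch uses getDeclaredConstructor().newInstance() in place of the Class.newInstance() call seen in the diff, since the latter is deprecated in current Java versions.

```java
class TypedFactorySketch {

    // Hypothetical minimal factory; the real SimpleExtractorFactory carries more metadata.
    static class SimpleFactory<T> {
        private final Class<T> type;

        SimpleFactory(Class<T> type) {
            this.type = type;
        }

        // Counterpart of getExtractorType(): returns the class token itself.
        Class<T> getType() {
            return type;
        }

        // Creates a new instance through the public no-argument constructor.
        T create() {
            try {
                return type.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot instantiate " + type.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        final SimpleFactory<StringBuilder> factory = new SimpleFactory<>(StringBuilder.class);
        System.out.println(factory.getType().getSimpleName());
        System.out.println(factory.create().append("ok"));
    }
}
```

Exposing the type lets callers compare or filter factories by extractor class without creating an instance first.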
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
>> > Arrays.asList(
>> > "text/csv;q=0.1"
>> > ),
>> > - null,
>> > + "example-csv.csv",
>> > CSVExtractor.class
>> > );
>> >
>> > @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
>> > }
>> >
>> > /**
>> > + * Check whether a number is an integer.
>> > + *
>> > + * @param number
>> > + * @return
>> > + */
>> > + private boolean isInteger(String number) {
>> > + try {
>> > + Integer.valueOf(number);
>> > + return true;
>> > + } catch (NumberFormatException e) {
>> > + return false;
>> > + }
>> > + }
>> > +
>> > + /**
>> > + * Check whether a number is a float.
>> > + *
>> > * @param number
>> > * @return
>> > */
>> > - private boolean isNumber(String number) {
>> > + private boolean isFloat(String number) {
>> > try {
>> > - Double.valueOf(number);
>> > + Float.valueOf(number);
>> > return true;
>> > } catch (NumberFormatException e) {
>> > return false;
>> > @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
>> > object = new URIImpl(cell);
>> > } else {
>> > URI datatype = XMLSchema.STRING;
>> > - if (isNumber(cell)) {
>> > + if (isInteger(cell)) {
>> > datatype = XMLSchema.INTEGER;
>> > + } else if(isFloat(cell)) {
>> > + datatype = XMLSchema.FLOAT;
>> > }
>> > object = new LiteralImpl(cell, datatype);
>> > }
>> >
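A note on the integer/float split above: Integer.valueOf only accepts values that fit in 32 bits, so a cell such as 3000000000 fails the integer check and ends up typed as a float, while Float.valueOf also accepts inputs like NaN, Infinity or 1.0f. A minimal sketch of a variant that uses BigInteger for the integer test follows. CellTypeSketch and its enum are hypothetical, and the float check is deliberately left as in the commit.

```java
import java.math.BigInteger;

class CellTypeSketch {

    enum CellType { INTEGER, FLOAT, STRING }

    // Classifies a non-null CSV cell; integers of any size are recognised.
    static CellType classify(String cell) {
        try {
            new BigInteger(cell);
            return CellType.INTEGER;
        } catch (NumberFormatException e) {
            // Not an integer literal, fall through to the float check.
        }
        try {
            Float.valueOf(cell);
            return CellType.FLOAT;
        } catch (NumberFormatException e) {
            return CellType.STRING;
        }
    }

    public static void main(String[] args) {
        for (String cell : new String[]{"42", "3000000000", "3.14", "abc"}) {
            System.out.println(cell + " -> " + classify(cell));
        }
    }
}
```

Whether values outside the 32-bit range should map to xsd:integer is a design decision for the extractor; the sketch only shows one way to keep them out of the float branch.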
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
>> > "html-mf-adr",
>> > PopularPrefixes.createSubset("rdf", "vcard"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-adr.html",
>> > AdrExtractor.class
>> > );
>> > }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
>> > "html-mf-geo",
>> > PopularPrefixes.createSubset("rdf", "vcard"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-geo.html",
>> > GeoExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
>> > "html-mf-hcalendar",
>> > PopularPrefixes.createSubset("rdf", "ical"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-hcalendar.html",
>> > HCalendarExtractor.class);
>> >
>> > private static final String[] Components = {"Vevent", "Vtodo",
>> "Vjournal", "Vfreebusy"};
>> > @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
>> > private boolean extractComponent(Node node, Resource cal, String
>> component) throws ExtractionException {
>> > HTMLDocument compoNode = new HTMLDocument(node);
>> > BNode evt = valueFactory.createBNode();
>> > - addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
>> > + addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
>> > addTextProps(compoNode, evt);
>> > addUrl(compoNode, evt);
>> > addRRule(compoNode, evt);
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
>> > "html-mf-hcard",
>> > PopularPrefixes.createSubset("rdf", "vcard"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-hcard.html",
>> > HCardExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -82,7 +82,7 @@ public class HListingExtractor extends E
>> > "html-mf-hlisting",
>> > PopularPrefixes.createSubset("rdf", "hlisting"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-hlisting.html",
>> > HListingExtractor.class
>> > );
>> >
>> > @@ -106,7 +106,7 @@ public class HListingExtractor extends E
>> > out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>> >
>> > for (String action : findActions(fragment)) {
>> > - out.writeTriple(listing, hLISTING.action,
>> hLISTING.getResource(action));
>> > + out.writeTriple(listing, hLISTING.action,
>> hLISTING.getClass(action));
>> > }
>> > out.writeTriple(listing, hLISTING.lister, addLister() );
>> > addItem(listing);
>> > @@ -154,7 +154,7 @@ public class HListingExtractor extends E
>> > String value = node.getNodeValue();
>> > // do not use conditionallyAdd, it won't work cause of evaluation rules
>> > if (!(null == value || "".equals(value))) {
>> > - URI property =
>> hLISTING.getPropertyCamelized(klass);
>> > + URI property =
>> hLISTING.getPropertyCamelCase(klass);
>> > conditionallyAddLiteralProperty(
>> > node,
>> > blankItem, property,
>> valueFactory.createLiteral(value)
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
>> > "html-mf-hrecipe",
>> > PopularPrefixes.createSubset("rdf", "hrecipe"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-hrecipe.html",
>> > HRecipeExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
>> > "html-mf-hresume",
>> > PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-hresume.html",
>> > HResumeExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
>> > "html-mf-hreview",
>> > PopularPrefixes.createSubset("rdf", "vcard", "rev"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-hreview.html",
>> > HReviewExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
>> > "html-head-links",
>> > PopularPrefixes.createSubset("xhtml", "dcterms"),
>> > Arrays.asList("text/html;q=0.05",
>> "application/xhtml+xml;q=0.05"),
>> > - null,
>> > + "example-head-link.html",
>> > HeadLinkExtractor.class);
>> > }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
>> > "html-head-icbm",
>> > PopularPrefixes.createSubset("geo", "rdf"),
>> > Arrays.asList("text/html;q=0.01",
>> "application/xhtml+xml;q=0.01"),
>> > - null,
>> > + "example-icbm.html",
>> > ICBMExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -51,7 +51,7 @@ public class LicenseExtractor implements
>> > "html-mf-license",
>> > PopularPrefixes.createSubset("xhtml"),
>> > Arrays.asList("text/html;q=0.01",
>> "application/xhtml+xml;q=0.01"),
>> > - null,
>> > + "example-mf-license.html",
>> > LicenseExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
>> > "html-mf-species",
>> > PopularPrefixes.createSubset("rdf", "wo"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-species.html",
>> > SpeciesExtractor.class
>> > );
>> >
>> > @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>> >
>> > private URI resolveClassName(String clazz) {
>> > String upperCaseClass = clazz.substring(0, 1);
>> > - return vWO.getResource(
>> > + return vWO.getClass(
>> > String.format("%s%s",
>> > upperCaseClass.toUpperCase(),
>> > clazz.substring(1)
>> >
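A side note on the resolveClassName() hunk above: the method capitalizes the first character of the microformat class name before looking it up in the vocabulary. The sketch below isolates that string step in plain Java. ClassNameSketch, NAMESPACE and the example.org URI are hypothetical names made up for illustration and are not part of Any23. Locale.ROOT is added here as an assumption, to keep the result independent of the default locale; the committed code calls toUpperCase() without arguments.

```java
// Self-contained sketch; ClassNameSketch and NAMESPACE are hypothetical,
// they are NOT part of the Any23 API.
final class ClassNameSketch {

    // Hypothetical namespace, used only to have something to prepend.
    static final String NAMESPACE = "http://example.org/vocab#";

    /** Upper-cases the first character and leaves the rest untouched. */
    static String capitalize(String clazz) {
        if (clazz == null || clazz.isEmpty()) {
            throw new IllegalArgumentException("class name must not be empty");
        }
        final String first = clazz.substring(0, 1);
        // Locale.ROOT is an assumption of this sketch (see the note above).
        return String.format(
                "%s%s",
                first.toUpperCase(java.util.Locale.ROOT),
                clazz.substring(1)
        );
    }

    /** Builds a class identifier from a lower-case microformat class name. */
    static String resolveClassName(String clazz) {
        return NAMESPACE + capitalize(clazz);
    }

    public static void main(String[] args) {
        System.out.println(resolveClassName("kingdom"));
    }
}
```

The empty-string guard is also an addition of the sketch: substring(0, 1) on an empty string would otherwise throw StringIndexOutOfBoundsException.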
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
>> > NAME,
>> > PopularPrefixes.get(),
>> > Arrays.asList("text/html;q=0.02",
>> "application/xhtml+xml;q=0.02"),
>> > - null,
>> > + "example-script-turtle.html",
>> > TurtleHTMLExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
>> > "html-mf-xfn",
>> > PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-mf-xfn.html",
>> > XFNExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
>> > "html-microdata",
>> > PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> > Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > - null,
>> > + "example-microdata.html",
>> > MicrodataExtractor.class
>> > );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
>> > import org.deri.any23.extractor.ErrorReporter;
>> > import org.deri.any23.extractor.ExtractionContext;
>> > import org.deri.any23.extractor.ExtractionResult;
>> > -import org.deri.any23.parser.NQuadsParser;
>> > +import org.deri.any23.io.nquads.NQuadsParser;
>> > import org.deri.any23.rdf.Any23ValueFactoryWrapper;
>> > import org.openrdf.model.impl.ValueFactoryImpl;
>> > import org.openrdf.rio.ParseErrorListener;
>> > @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
>> > import org.openrdf.rio.RDFParser;
>> > import org.openrdf.rio.ntriples.NTriplesParser;
>> > import org.openrdf.rio.rdfxml.RDFXMLParser;
>> > +import org.openrdf.rio.trix.TriXParser;
>> > import org.openrdf.rio.turtle.TurtleParser;
>> > import org.slf4j.Logger;
>> > import org.slf4j.LoggerFactory;
>> > @@ -38,7 +39,7 @@ import java.io.Reader;
>> >
>> > /**
>> > * This factory provides a common logic for creating and configuring
>> correctly
>> > - * any RDF parser used within the library.
>> > + * any <i>RDF</i> parser used within the library.
>> > *
>> > * @author Michele Mostarda (mostarda@fbk.eu)
>> > */
>> > @@ -119,7 +120,7 @@ public class RDFParserFactory {
>> > }
>> >
>> > /**
>> > - * Returns a new instance of a configured {@link
>> org.deri.any23.parser.NQuadsParser}.
>> > + * Returns a new instance of a configured {@link
>> org.deri.any23.io.nquads.NQuadsParser}.
>> > *
>> > * @param verifyDataType data verification enable if
>> <code>true</code>.
>> > * @param stopAtFirstError the parser stops at first error if
>> <code>true</code>.
>> > @@ -139,6 +140,26 @@ public class RDFParserFactory {
>> > }
>> >
>> > /**
>> > + * Returns a new instance of a configured {@link TriXParser}.
>> > + *
>> > + * @param verifyDataType data verification enable if
>> <code>true</code>.
>> > + * @param stopAtFirstError the parser stops at first error if
>> <code>true</code>.
>> > + * @param extractionContext the extraction context where the parser
>> is used.
>> > + * @param extractionResult the output extraction result.
>> > + * @return a new instance of a configured TriX parser.
>> > + */
>> > + public TriXParser getTriXParser(
>> > + final boolean verifyDataType,
>> > + final boolean stopAtFirstError,
>> > + final ExtractionContext extractionContext,
>> > + final ExtractionResult extractionResult
>> > + ) {
>> > + final TriXParser parser = new TriXParser();
>> > + configureParser(parser, verifyDataType, stopAtFirstError,
>> extractionContext, extractionResult);
>> > + return parser;
>> > + }
>> > +
>> > + /**
>> > * Configures the given parser on the specified extraction result
>> > * setting the policies for data verification and error handling.
>> > *
>> >
>> >
>>
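For readers skimming the RDFParserFactory hunk: the new getTriXParser() follows the same shape as the other getters, that is, create the parser and then pass it through one shared configuration method. The sketch below shows only that shape, using stand-in types. SketchParser, SketchTriXParser and SketchParserFactory are hypothetical and do not exist in Any23 or Sesame, and the two setters are assumptions about what "configuring" might involve, not the real configureParser() body.

```java
// Hypothetical stand-ins for illustration; none of these types belong
// to Any23 or Sesame.
interface SketchParser {
    void setVerifyData(boolean verify);
    void setStopAtFirstError(boolean stop);
    String describe();
}

final class SketchTriXParser implements SketchParser {
    private boolean verify;
    private boolean stop;

    public void setVerifyData(boolean verify) { this.verify = verify; }
    public void setStopAtFirstError(boolean stop) { this.stop = stop; }
    public String describe() {
        return "trix[verify=" + verify + ",stop=" + stop + "]";
    }
}

final class SketchParserFactory {

    /** Creates a parser, then applies the shared configuration step. */
    SketchTriXParser getTriXParser(boolean verifyDataType, boolean stopAtFirstError) {
        final SketchTriXParser parser = new SketchTriXParser();
        configureParser(parser, verifyDataType, stopAtFirstError);
        return parser;
    }

    /** Single place where every parser created by the factory is configured. */
    private void configureParser(
            SketchParser parser, boolean verifyDataType, boolean stopAtFirstError
    ) {
        parser.setVerifyData(verifyDataType);
        parser.setStopAtFirstError(stopAtFirstError);
    }

    public static void main(String[] args) {
        System.out.println(new SketchParserFactory().getTriXParser(true, false).describe());
    }
}
```

Keeping the configuration in one private method means a new format only needs a new getter, which seems to be what the commit does for TriX.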
>
>
>
> --
> Michele Mostarda
> Senior Software Engineer
> skype: michele.mostarda
> twitter: micmos
> mail: me@michelemostarda.com
> site : http://www.michelemostarda.com
Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./
any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/
any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/
any23-core/src/main/java/or
Posted by Michele Mostarda <mi...@gmail.com>.
On 10 January 2012 18:08, Simone Tripodi <si...@apache.org> wrote:
> Hi Mic,
>
Hi Simo, happy new year !
> this is something great, thanks for the hard work of merging!
> next step is renaming the packages in org.apache.any23 :)
>
Sure :) It is the next critical issue scheduled on Jira.
Then we can start discussing the release.
Ciao
Mic
>
> All the best, have a nice day!
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/
>
>
>
> On Tue, Jan 10, 2012 at 5:32 PM, <mo...@apache.org> wrote:
> > Author: mostarda
> > Date: Tue Jan 10 16:32:28 2012
> > New Revision: 1229627
> >
> > URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
> > Log:
> > This commit synchronizes the dismissed Any23 Google Code SVN repo [1]
> > with the current Apache Any23 SVN repo, including the issues
> > developed during the initial import transition phase.
> > Such issues have been tracked on the original Any23 Google Code Issue
> Tracker [2].
> > Below the extract of the original repository commit log.
> >
> > This commit is related to issue ANY23-27.
> >
> > [1] http://any23.googlecode.com/svn/trunk/
> > [2] http://code.google.com/p/any23/issues/list
> >
> > ==== BEGIN: Original Log ====
> >
> > ------------------------------------------------------------------------
> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) |
> 1 line
> >
> > Improved numeric datatype assignment. This commit fixes issue #208.
> > ------------------------------------------------------------------------
> > hardest-mac:gcode-svn hardest$ svn log -r 1548:HEAD
> > ------------------------------------------------------------------------
> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) |
> 1 line
> >
> > Improved numeric datatype assignment. This commit fixes issue #208.
> > ------------------------------------------------------------------------
> > r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'.
> Fixed HTMLMetaExtractorTest.java to match this new
> > namespace. Discovered and fixed issue in SINDICE.java vocabulary, NS
> declared as a resource instead of as a URI. Fixed
> > RDFSchemaUtilsTest.java, whose sizes were wrong due to the wrong NS declaration.
> This commit is related to issue #203.
> > ------------------------------------------------------------------------
> > r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found
> wrong declaration of Class(Resource) in WO.java
> > vocabulary. Fixed and updated RDFSchemaUtils.java test. This commit is related
> to issue #198.
> > ------------------------------------------------------------------------
> > r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Added utility method.
> > ------------------------------------------------------------------------
> > r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) |
> 1 line
> >
> > Improved Vocabulary.java class: added support for comments to any
> resource. Improved RDFSchemaUtils.java serialization
> > support, added separators to RDFXML serialization. This commit is
> related to issue #198.
> > ------------------------------------------------------------------------
> > r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) |
> 1 line
> >
> > Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved
> prefix declaration parsing in RDFa11Parser, this
> > new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix
> declarations. Fixed support for prefix mapping resolution in
> > RDFa11Parser, this allows the correct support for the structured
> properties introduced by the latest version of the Open
> > Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest
> to the new output of vocabularies serialization.
> > Updated Any23PluginManagerTest to include a new class. This commit is
> related to issue #206.
> > ------------------------------------------------------------------------
> > r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) |
> 1 line
> >
> > Restricted scope of testGetClassesFromClasspath to avoid updating it
> every time a new class is added.
> > ------------------------------------------------------------------------
> > r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Improved validation mode support. Improved descriptions of Validation
> and Report fields. This commit is related to issue
> > #209.
> > ------------------------------------------------------------------------
> > r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Improved Any23 Service XML Report format documentation.
> > ------------------------------------------------------------------------
> > r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Added URL encoding to the source location path. This commit fixes issue
> #205. Chosen not to write a formal test which
> > requires the creation of folders with spaces
> > ------------------------------------------------------------------------
> > r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) |
> 1 line
> >
> > Removed obsolete section.
> > ------------------------------------------------------------------------
> > r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) |
> 1 line
> >
> > Improved Any23 facade, added method createDocumentSource() to simplify
> the extraction setup.
> > ------------------------------------------------------------------------
> > r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) |
> 1 line
> >
> > Refactored Rover CLI class to make it extensible from other CLI
> implementations.
> > ------------------------------------------------------------------------
> > r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) |
> 3 lines
> >
> > Removed wrong artifact name.
> >
> >
> > ------------------------------------------------------------------------
> > r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Upload by wagon-svn
> > ------------------------------------------------------------------------
> > r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Removed no longer used jspf lib. Added crawler4j dependencies. Added
> README. This commit is related to issue #211.
> > ------------------------------------------------------------------------
> > r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Changed attributes visibility to facilitate the class extensibility.
> > ------------------------------------------------------------------------
> > r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Added helper methods to extract file lines as list of strings. Improved
> javadoc.
> > ------------------------------------------------------------------------
> > r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Added first version of basic-crawler plugin. This commit is related to
> issue #211.
> > ------------------------------------------------------------------------
> > r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Added plugins README.
> > ------------------------------------------------------------------------
> > r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Updated main README, added references to plugin and lib.
> > ------------------------------------------------------------------------
> > r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Fixed assembly name.
> > ------------------------------------------------------------------------
> > r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Fixed Tool signature. This commit is related to #211.
> > ------------------------------------------------------------------------
> > r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Improved logging.
> > ------------------------------------------------------------------------
> > r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Included plugin basic-crawler in reactor. Improved ToolRunner and
> Any23PluginManager tests to be compliant to the new
> > plugin classes. This commit is related to issue #211.
> > ------------------------------------------------------------------------
> > r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) |
> 1 line
> >
> > Fixed Crawler4j group id. Related to issue #211.
> > ------------------------------------------------------------------------
> > r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Improved plugin documentation. Introduced Office Scraper specific page.
> This commit is related to issue #213.
> > ------------------------------------------------------------------------
> > r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed POST method documentation. Related to issue #213.
> > ------------------------------------------------------------------------
> > r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed code snippets, prettified, added missing finalization logic. See
> issue #187.
> > ------------------------------------------------------------------------
> > r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed var name. See #187.
> > ------------------------------------------------------------------------
> > r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Updated code snippets and tutorial, added explicit TripleHandler
> closure. This commit is related to issue #187.
> > ------------------------------------------------------------------------
> > r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Fixed data type handling management in NQuadsParser. This commit is
> related to issue #210.
> > ------------------------------------------------------------------------
> > r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Added missing JSON output format. See #214.
> > ------------------------------------------------------------------------
> > r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) |
> 1 line
> >
> > Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output
> format support to Rover. This commit is related to
> > issue #215.
> > ------------------------------------------------------------------------
> > r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Added Sesame TriX IO dependency. This commit is related to #215.
> > ------------------------------------------------------------------------
> > r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Some suppressed tests have been reactivated as Ignored.
> > ------------------------------------------------------------------------
> > r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Added TriX output format to the Any23 Service. Commit related to issue
> #215.
> > ------------------------------------------------------------------------
> > r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) |
> 1 line
> >
> > Improved FormatWriter management, added WriterRegistry. Improved Writer
> format management in Rover and WebResponder.
> > This commit is related to issues #215 and #216.
> > ------------------------------------------------------------------------
> > r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) |
> 6 lines
> >
> > Added TriXExtractor and textual example (example-trix.trx), added trix
> support in RDFParserFactory.
> > Registered TriXExtractor to the ExtractorRegistry.
> > Added TriX mimetype support in TikaMIMETypeDetector (through
> mimetypes.xml) and added specific test.
> > Added support and doc to TriX format in Any23 Service web page
> (form.html).
> > This commit is related to issue #215.
> >
> > ------------------------------------------------------------------------
> > r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) |
> 1 line
> >
> > Fixed number of extractors (+1 after adding TriXExtractor). Commit
> related to issue #215.
> > ------------------------------------------------------------------------
> > r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Added method getExtractorType() .
> > ------------------------------------------------------------------------
> > r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) |
> 4 lines
> >
> > Improved ExtractorDocumentation support, added missing format examples.
> > Improved output layout. This commit is related to issue #194.
> >
> >
> > ------------------------------------------------------------------------
> > r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Improved classpath management in Any23PluginManager. Renamed
> getClasses* to loadClasses*. This commit is related to
> > issue #212.
> > ------------------------------------------------------------------------
> > r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Separated log messages from specific output data.
> > ------------------------------------------------------------------------
> > r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Added human readable report printing support in ReportingTripleHandler
> and Rover.
> > ------------------------------------------------------------------------
> > r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Fixed major issue in output generation, added final activity report,
> help prettification. This commit is related to
> > issue #211.
> > ------------------------------------------------------------------------
> > r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Upgraded to Sesame 2.6.1 See issue #217.
> > ------------------------------------------------------------------------
> > r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue
> #216
> > ------------------------------------------------------------------------
> > r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) |
> 1 line
> >
> > Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
> > ------------------------------------------------------------------------
> > r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) |
> 1 line
> >
> > Added specific Crawler CLI documentation. Updated general CLI
> documentation. This commit is related to issue #211.
> > ------------------------------------------------------------------------
> > r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) |
> 4 lines
> >
> > The Eval CLI Tool has been removed as well as the org.deri.any23.eval
> package classes related to it.
> > Updated tests verifying CLI tool detection.
> > This commit is related to issue #218.
> >
> > ------------------------------------------------------------------------
> > r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) |
> 5 lines
> >
> > Added MimeDetector CLI Tool and test case, removed main() from
> > TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
> > Updated CLI doc.
> > This commit is related to issue #219.
> >
> > ------------------------------------------------------------------------
> > r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) |
> 1 line
> >
> > Added support for comment serialization. Related to issue #158.
> > ------------------------------------------------------------------------
> > r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) |
> 1 line
> >
> > Add support for annotation writing in FormatWriter implementations. This
> commit is related to issue #158.
> > ------------------------------------------------------------------------
> > r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) |
> 1 line
> >
> > Added support for 'annotate' flag in Any23 Service.
> > ------------------------------------------------------------------------
> >
> > ==== END : Original Log ====
> >
> >
> > Added:
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
> > incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
> > incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
> > incubator/any23/trunk/any23-core/src/test/resources/application/trix/
> >
> incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
> >
> incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
> >
> incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
> > incubator/any23/trunk/lib/README.txt
> > incubator/any23/trunk/plugins/README.txt
> > incubator/any23/trunk/plugins/basic-crawler/
> > incubator/any23/trunk/plugins/basic-crawler/pom.xml
> > incubator/any23/trunk/plugins/basic-crawler/src/
> > incubator/any23/trunk/plugins/basic-crawler/src/main/
> > incubator/any23/trunk/plugins/basic-crawler/src/main/java/
> > incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
> > incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
> > incubator/any23/trunk/plugins/basic-crawler/src/test/
> > incubator/any23/trunk/plugins/basic-crawler/src/test/java/
> > incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
> > incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
> >
> incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
> > incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
> > Removed:
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
> > Modified:
> > incubator/any23/trunk/README.txt
> > incubator/any23/trunk/any23-core/bin/any23
> > incubator/any23/trunk/any23-core/bin/any23tools
> > incubator/any23/trunk/any23-core/pom.xml
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
> >
> incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
> >
> incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
> >
> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
> >
> incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
> >
> incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
> >
> incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
> > incubator/any23/trunk/lib/install-deps.sh
> >
> incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
> > incubator/any23/trunk/pom.xml
> > incubator/any23/trunk/src/site/apt/any23-plugins.apt
> > incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
> > incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
> > incubator/any23/trunk/src/site/apt/getting-started.apt
> > incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
> > incubator/any23/trunk/src/site/apt/service.apt
> > incubator/any23/trunk/src/site/apt/supported-formats.apt
> >
> > Modified: incubator/any23/trunk/README.txt
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/README.txt (original)
> > +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
> > @@ -20,7 +20,8 @@ Distribution Content
> >
> > any23-core The library core codebase.
> > any23-service The library HTTP service codebase.
> > -plugins Library plugins codebase.
> > +lib Contains the Any23 the external deps (read lib/README.txt for further details).
> > +plugins Library plugins codebase (read plugins/README.txt for further details).
> > RELEASE-NOTES.txt File reporting main release notes for every version.
> > LICENSE.txt Applicable project license.
> > README.txt This file.
> > @@ -240,15 +241,14 @@ Upload the produced packages in download
> >
> > http://code.google.com/p/any23/downloads/list
> >
> > +--------------------
> > +Manage External Deps
> > +--------------------
> >
> > -Fix Release Procedure
> > ----------------------
> > -
> > - Currently the *plugins/integration-test* module is excluded from the parent
> > - reactor.
> > - To fix it in tag follow procedure as described at issue #171:
> > -
> > - http://code.google.com/p/any23/issues/detail?id=171
> > +::Developers interest only.::
> >
> > +External Deps are libraries used by some Any23 modules which are
> > +not available in public Maven repositories. Such libraries are
> > +managed within the 'lib' dir.
> >
> > EOF
> >
> > Modified: incubator/any23/trunk/any23-core/bin/any23
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/bin/any23 (original)
> > +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
> > @@ -9,12 +9,12 @@
> > ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
> >
> > if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> > - echo "Generating executable JAR..."
> > - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > + echo "Generating executable JAR..." >&2
> > + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> > ||\
> > - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> > ||\
> > - { echo "Error while generating commandline assembly."; exit 1; }
> > + { echo "Error while generating commandline assembly." >&2; exit 1; }
> > fi
> >
> > SEP=':'
> >
> > Modified: incubator/any23/trunk/any23-core/bin/any23tools
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/bin/any23tools (original)
> > +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
> > @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
> > PLUGINS_DIR=plugins
> >
> > if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
> > - echo "Generating executable JAR..."
> > - mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > + echo "Generating executable JAR..." >&2
> > + mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> > ||\
> > - mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
> > + mvn -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
> > ||\
> > - { echo "Error while generating commandline assembly."; exit 1; }
> > + { echo "Error while generating commandline assembly." >&2; exit 1; }
> > fi
> >
> > SEP=':'
> > @@ -30,6 +30,7 @@ done
> > # Plugins classpath.
> > for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
> > do
> > + echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
> > if [ ! -e "$jar" ]; then continue; fi
> > CP="$CP$SEP$jar"
> > done
> >
> > Modified: incubator/any23/trunk/any23-core/pom.xml
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/pom.xml (original)
> > +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
> > @@ -92,6 +92,10 @@
> > </dependency>
> > <dependency>
> > <groupId>org.openrdf.sesame</groupId>
> > + <artifactId>sesame-rio-trix</artifactId>
> > + </dependency>
> > + <dependency>
> > + <groupId>org.openrdf.sesame</groupId>
> > <artifactId>sesame-repository-sail</artifactId>
> > </dependency>
> > <dependency>
> >
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
> > @@ -258,6 +258,28 @@ public class Any23 {
> > }
> >
> > /**
> > + * Returns the most appropriate {@link DocumentSource} for the given<code>documentURI</code>.
> > + *
> > + * @param documentURI the document <i>URI</i>.
> > + * @return a new instance of DocumentSource.
> > + * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
> > + * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
> > + */
> > + public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
> > + if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
> > + if (documentURI.toLowerCase().startsWith("file:")) {
> > + return new FileDocumentSource( new File(new URI(documentURI)) );
> > + }
> > + if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> > + return new HTTPDocumentSource(getHTTPClient(), documentURI);
> > + }
> > + throw new IllegalArgumentException(
> > + String.format("Unsupported protocol for document URI: '%s' .", documentURI)
> > + );
> > + }
> > +
> > +
> > + /**
> > * Performs metadata extraction from the content of the given
> > * <code>in</code> document source, sending the generated events
> > * to the specified <code>outputHandler</code>.
> > @@ -363,13 +385,7 @@ public class Any23 {
> > public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
> > throws IOException, ExtractionException {
> > try {
> > - if (documentURI.toLowerCase().startsWith("file:")) {
> > - return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
> > - }
> > - if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
> > - return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
> > - }
> > - throw new ExtractionException("Not a valid absolute URI: " + documentURI);
> > + return extract(eps, createDocumentSource(documentURI), outputHandler);
> > } catch (URISyntaxException ex) {
> > throw new ExtractionException("Error while extracting data from document URI.", ex);
> > }
> >
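A side note on the createDocumentSource() refactoring quoted above: the dispatch on the URI scheme can be tried in isolation. What follows is only a minimal sketch, not Any23 code. The class DocumentKindSketch and its Kind enum are hypothetical stand-ins for FileDocumentSource / HTTPDocumentSource, and it parses the scheme with java.net.URI.create() (which reports malformed input as an unchecked IllegalArgumentException) instead of the startsWith() checks used in the commit.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical stand-in for the scheme dispatch in createDocumentSource(); not Any23 code. */
final class DocumentKindSketch {

    /** Assumption: one constant per document source type chosen in the commit. */
    enum Kind { FILE, HTTP }

    static Kind classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // URI.create() throws IllegalArgumentException on malformed input.
        final String scheme = URI.create(documentURI).getScheme();
        // Relative references have no scheme; treat them as unsupported.
        final String lower = scheme == null ? "" : scheme.toLowerCase(Locale.ROOT);
        switch (lower) {
            case "file":
                return Kind.FILE;
            case "http":
            case "https":
                return Kind.HTTP;
            default:
                throw new IllegalArgumentException(
                        String.format("Unsupported protocol for document URI: '%s'.", documentURI));
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("file:///tmp/page.html"));
        System.out.println(classify("HTTPS://example.org/"));
    }
}
```

Lower-casing only the scheme keeps the match case-insensitive, as in the commit, without touching the rest of the URI.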
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
> > @@ -16,7 +16,7 @@
> >
> > package org.deri.any23.cli;
> >
> > -import org.deri.any23.LogUtil;
> > +import org.deri.any23.util.LogUtils;
> > import org.deri.any23.extractor.ExampleInputOutput;
> > import org.deri.any23.extractor.ExtractionException;
> > import org.deri.any23.extractor.Extractor;
> > @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
> > }
> >
> > public int run(String[] args) {
> > - LogUtil.setDefaultLogging();
> > + LogUtils.setDefaultLogging();
> > try {
> > if (args.length == 0) {
> > printUsage();
> > @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
> > * Prints the list of all the available extractors.
> > */
> > public void printExtractorList() {
> > - for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
> > - System.out.println(extractorName);
> > + for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
> > + System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
> > }
> > }
> >
> > @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
> > ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
> > ExampleInputOutput example = new ExampleInputOutput(factory);
> > System.out.println("Extractor: " + extractorName);
> > - System.out.println(" type: " + getType(factory));
> > - String output = example.getExampleOutput();
> > - if (output == null) {
> > - System.out.println("(no example output)");
> > + System.out.println("\ttype: " + getType(factory));
> > + System.out.println();
> > + final String exampleInput = example.getExampleInput();
> > + if(exampleInput == null) {
> > + System.out.println("(No Example Available)");
> > } else {
> > - System.out.println("-------- example output --------");
> > - System.out.println(output);
> > + System.out.println("-------- Example Input --------");
> > + System.out.println(exampleInput);
> > + System.out.println("-------- Example Output --------");
> > + String output = example.getExampleOutput();
> > + System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
> > }
> > - System.out.println();
> > System.out.println("================================");
> > + System.out.println();
> > }
> > }
> >
> >
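For reference, the "%25s [%15s]" pattern introduced in printExtractorList() right-aligns both columns by left-padding with spaces, and does not truncate longer values. A small sketch of that behaviour follows; the class name ExtractorListFormatSketch and the sample name/type values are made up for illustration and are not taken from the extractor registry.

```java
/** Hypothetical demo of the "%25s [%15s]" layout; the sample values are assumptions. */
final class ExtractorListFormatSketch {

    static String row(String name, String type) {
        // %25s and %15s pad on the left, so each column is right-aligned.
        return String.format("%25s [%15s]", name, type);
    }

    public static void main(String[] args) {
        System.out.println(row("html-rdfa", "CONTENT"));
        System.out.println(row("html-head-title", "TAG"));
    }
}
```

Values longer than the column width are printed in full, so a very long extractor name would push its row out of alignment rather than being cut.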
> > Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
> > @@ -0,0 +1,113 @@
> > +/*
> > + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
> > + *
> > + * Licensed under the Apache License, Version 2.0 (the "License");
> > + * you may not use this file except in compliance with the License.
> > + * You may obtain a copy of the License at
> > + *
> > + * http://www.apache.org/licenses/LICENSE-2.0
> > + *
> > + * Unless required by applicable law or agreed to in writing, software
> > + * distributed under the License is distributed on an "AS IS" BASIS,
> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> > + * See the License for the specific language governing permissions and
> > + * limitations under the License.
> > + */
> > +
> > +package org.deri.any23.cli;
> > +
> > +import org.deri.any23.configuration.DefaultConfiguration;
> > +import org.deri.any23.http.DefaultHTTPClient;
> > +import org.deri.any23.http.HTTPClient;
> > +import org.deri.any23.http.HTTPClientConfiguration;
> > +import org.deri.any23.mime.MIMEType;
> > +import org.deri.any23.mime.MIMETypeDetector;
> > +import org.deri.any23.mime.TikaMIMETypeDetector;
> > +import org.deri.any23.source.DocumentSource;
> > +import org.deri.any23.source.FileDocumentSource;
> > +import org.deri.any23.source.HTTPDocumentSource;
> > +import org.deri.any23.source.StringDocumentSource;
> > +
> > +import java.io.File;
> > +import java.net.URISyntaxException;
> > +
> > +/**
> > + * Commandline tool to detect <b>MIME Type</b>s from
> > + * file, HTTP and direct input sources.
> > + * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
> > + *
> > + * @author Michele Mostarda (mostarda@fbk.eu)
> > + */
> > +@ToolRunner.Description("MIME Type Detector Tool.")
> > +public class MimeDetector implements Tool{
> > +
> > + public static final String FILE_DOCUMENT_PREFIX = "file://";
> > + public static final String INLINE_DOCUMENT_PREFIX = "inline://";
> > + public static final String URL_DOCUMENT_RE = "^https?://.*";
> > +
> > + public static void main(String[] args) {
> > + System.exit( new MimeDetector().run(args) );
> > + }
> > +
> > + @Override
> > + public int run(String[] args) {
> > + if(args.length != 1) {
> > + System.err.println("USAGE: { http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
> > + return 1;
> > + }
> > +
> > + final String document = args[0];
> > + try {
> > + final DocumentSource documentSource = createDocumentSource(document);
> > + final MIMETypeDetector detector = new TikaMIMETypeDetector();
> > + final MIMEType mimeType = detector.guessMIMEType(
> > + documentSource.getDocumentURI(),
> > + documentSource.openInputStream(),
> > + MIMEType.parse(documentSource.getContentType())
> > + );
> > + System.out.println(mimeType);
> > + return 0;
> > + } catch (Exception e) {
> > + System.err.print("Error while detecting MIME Type.");
> > + e.printStackTrace(System.err);
> > + return 1;
> > + }
> > + }
> > +
> > + private DocumentSource createDocumentSource(String document) throws URISyntaxException {
> > + if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
> > + return new FileDocumentSource(
> > + new File(
> > + document.substring(FILE_DOCUMENT_PREFIX.length())
> > + )
> > + );
> > + }
> > + if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
> > + return new StringDocumentSource(
> > + document.substring(INLINE_DOCUMENT_PREFIX.length()),
> > + ""
> > + );
> > + }
> > + if(document.matches(URL_DOCUMENT_RE)) {
> > + final HTTPClient client = new DefaultHTTPClient();
> > + // TODO: anonymous config class also used in Any23. centralize.
> > + client.init(new HTTPClientConfiguration() {
> > + public String getUserAgent() {
> > + return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
> > + }
> > + public String getAcceptHeader() {
> > + return "";
> > + }
> > + public int getDefaultTimeout() {
> > + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
> > + }
> > + public int getMaxConnections() {
> > + return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
> > + }
> > + });
> > + return new HTTPDocumentSource(client, document);
> > + }
> > + throw new IllegalArgumentException("Unsupported protocol for document " + document);
> > + }
> > +
> > +}
> >
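One detail that may be worth a look in MimeDetector.createDocumentSource(): it strips the "file://" prefix with substring(), whereas Any23.createDocumentSource() goes through new File(new URI(documentURI)). For plain paths both yield the same string, but they can differ once percent-encoding is involved, because java.net.URI.getPath() returns the decoded path. A minimal sketch of the difference, assuming only the JDK; the class FilePrefixSketch and its helper methods are hypothetical and only mirror the constant name from the diff.

```java
import java.net.URI;

/** Hypothetical comparison of prefix stripping vs. URI parsing; not Any23 code. */
final class FilePrefixSketch {

    static final String FILE_DOCUMENT_PREFIX = "file://";

    /** Mirrors the substring() approach used in MimeDetector. */
    static String pathByPrefix(String document) {
        return document.substring(FILE_DOCUMENT_PREFIX.length());
    }

    /** Parses the document as a URI; getPath() decodes percent-escapes. */
    static String pathByUri(String document) {
        return URI.create(document).getPath();
    }

    public static void main(String[] args) {
        final String document = "file:///tmp/my%20file.txt";
        System.out.println(pathByPrefix(document)); // keeps the %20 escape
        System.out.println(pathByUri(document));    // escape decoded to a space
    }
}
```

Whether this matters depends on how the tool is expected to be invoked; if users pass raw file system paths after "file://", the substring() variant is arguably the more predictable one.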
> > Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
> > URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> > ==============================================================================
> > --- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
> > +++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
> > @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
> > import org.apache.commons.cli.Options;
> > import org.apache.commons.cli.PosixParser;
> > import org.deri.any23.Any23;
> > -import org.deri.any23.LogUtil;
> > +import org.deri.any23.util.LogUtils;
> > import org.deri.any23.configuration.Configuration;
> > import org.deri.any23.configuration.DefaultConfiguration;
> > import org.deri.any23.extractor.ExtractionException;
> > @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
> > import org.deri.any23.extractor.SingleDocumentExtraction;
> > import org.deri.any23.filter.IgnoreAccidentalRDFa;
> > import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
> > +import org.deri.any23.source.DocumentSource;
> > import org.deri.any23.writer.BenchmarkTripleHandler;
> > import org.deri.any23.writer.LoggingTripleHandler;
> > -import org.deri.any23.writer.NQuadsWriter;
> > -import org.deri.any23.writer.NTriplesWriter;
> > -import org.deri.any23.writer.RDFXMLWriter;
> > import org.deri.any23.writer.ReportingTripleHandler;
> > import org.deri.any23.writer.TripleHandler;
> > import org.deri.any23.writer.TripleHandlerException;
> > -import org.deri.any23.writer.TurtleWriter;
> > -import org.deri.any23.writer.URIListWriter;
> > +import org.deri.any23.writer.WriterRegistry;
> > import org.slf4j.Logger;
> > import org.slf4j.LoggerFactory;
> >
> > @@ -51,6 +48,7 @@ import java.io.OutputStream;
> > import java.io.PrintStream;
> > import java.io.PrintWriter;
> > import java.net.MalformedURLException;
> > +import java.net.URISyntaxException;
> > import java.net.URL;
> >
> > import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
> > @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
> > * A default rover implementation. Goes and fetches a URL using an hint
> > * as to what format should require, then tries to convert it to RDF.
> > *
> > - * @author Gabriele Renzi
> > - * @author Richard Cyganiak (richard@cyganiak.de)
> > * @author Michele Mostarda (mostarda@fbk.eu)
> > + * @author Richard Cyganiak (richard@cyganiak.de)
> > + * @author Gabriele Renzi
> > */
> > @ToolRunner.Description("Any23 Command Line Tool.")
> > public class Rover implements Tool {
> >
> > - // Supported formats.
> > - private static final String TURTLE_FORMAT = "turtle";
> > - private static final String NTRIPLE_FORMAT = "ntriples";
> > - private static final String RDFXML_FORMAT = "rdfxml";
> > - private static final String NQUADS_FORMAT = "nquads";
> > - private static final String URIS_FORMAT = "uris";
> > -
> > - private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
> > + private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
> > + private static final int DEFAULT_FORMAT_INDEX = 0;
> >
> > private static final Logger logger = LoggerFactory.getLogger(Rover.class);
> >
> > - private static Options options;
> > + private Options options;
> >
> > - public static void main(String[] args) {
> > - System.exit( new Rover().run(args) );
> > - }
> > + private CommandLine commandLine;
> >
> > - public int run(String[] args) {
> > - final CommandLineParser parser = new PosixParser();
> > - final CommandLine commandLine;
> > + private boolean verbose = false;
> >
> > - boolean verbose = false;
> > - try {
> > - options = createOptions();
> > - commandLine = parser.parse(options, args);
> > + private PrintStream outputStream;
> > + private TripleHandler tripleHandler;
> > + private ReportingTripleHandler reportingTripleHandler;
> > + private BenchmarkTripleHandler benchmarkTripleHandler;
> >
> > - if (commandLine.hasOption("h")) {
> > - printHelp();
> > - return 0;
> > - }
> > + private ExtractionParameters eps;
> > + private Any23 any23;
> >
> > - if (commandLine.hasOption('v')) {
> > - verbose = true;
> > - LogUtil.setVerboseLogging();
> > - } else {
> > - LogUtil.setDefaultLogging();
> > - }
> > -
> > - if (commandLine.getArgs().length < 1) {
> > - printHelp();
> > - throw new IllegalArgumentException("Expected at least 1 argument.");
> > - }
> > + protected boolean isVerbose() {
> > + return verbose;
> > + }
> >
> > - final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
> > - final String[] extractorNames = getExtractors(commandLine);
> > + public static void main(String[] args) {
> > + System.exit( new Rover().run(args) );
> > + }
> >
> > - PrintStream outputStream = null;
> > - TripleHandler tripleHandler = null;
> > - try {
> > - outputStream = getOutputStream(commandLine);
> > + public int run(String[] args) {
> > + try {
> > + final String[] uris = configure(args);
> > + performExtraction(uris);
> > + return 0;
> > + } catch (Exception e) {
> > + System.err.println( e.getMessage() );
> > + final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
> > + if(verbose) e.printStackTrace(System.err);
> > + return exitCode;
> > + }
> > + }
> >
> > - tripleHandler = getTripleHandler(commandLine, outputStream);
> > + protected CommandLine getCommandLine() {
> > + if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
> > + return commandLine;
> > + }
> >
> > - tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
> > + protected String[] configure(String[] args) throws Exception {
> > + final CommandLineParser parser = new PosixParser();
> > + options = createOptions();
> > + commandLine = parser.parse(options, args);
> >
> > - tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
> > - final BenchmarkTripleHandler benchmarkTripleHandler =
> > - tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
> > + if (commandLine.hasOption("h")) {
> > + printHelp();
> > + throw new ExitCodeException(0);
> > + }
> >
> > - tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
> > + if (commandLine.hasOption('v')) {
> > + verbose = true;
> > + LogUtils.setVerboseLogging();
> > + } else {
> > + LogUtils.setDefaultLogging();
> > + }
> >
> > - final ReportingTripleHandler reportingTripleHandler =
> new ReportingTripleHandler(tripleHandler);
> > + if (commandLine.getArgs().length < 1) {
> > + printHelp();
> > + throw new IllegalArgumentException("Expected at least 1 argument.");
> > + }
> >
> > - final ExtractionParameters eps =
> getExtractionParameters(commandLine);
> > + final String[] inputURIs =
> argumentsToURIs(commandLine.getArgs());
> > + final String[] extractorNames = getExtractors(commandLine);
> >
> > - final Any23 any23 = createAny23(extractorNames);
> > + try {
> > + outputStream = getOutputStream(commandLine);
> > + tripleHandler = getTripleHandler(commandLine, outputStream);
> > + tripleHandler = decorateWithLogHandler(commandLine,
> tripleHandler);
> > + tripleHandler = decorateWithStatisticsHandler(commandLine,
> tripleHandler);
> >
> > - final long start = System.currentTimeMillis();
> > - for(String inputURI : inputURIs) {
> > - performExtraction(any23, eps, inputURI,
> reportingTripleHandler);
> > - }
> > - final long elapsed = System.currentTimeMillis() - start;
> > + benchmarkTripleHandler =
> > + tripleHandler instanceof BenchmarkTripleHandler ?
> (BenchmarkTripleHandler) tripleHandler : null;
> >
> > - closeAll(tripleHandler, outputStream);
> > + tripleHandler =
> decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
> >
> > - if (benchmarkTripleHandler != null) {
> > - System.err.println( benchmarkTripleHandler.report()
> );
> > - }
> > + reportingTripleHandler = new
> ReportingTripleHandler(tripleHandler);
> > + eps = getExtractionParameters(commandLine);
> > + any23 = createAny23(extractorNames);
> >
> > - logger.info("Extractors used: " +
> reportingTripleHandler.getExtractorNames());
> > - logger.info(reportingTripleHandler.getTotalTriples() +
> " triples, " + elapsed + "ms");
> > - } finally {
> > - closeAll(tripleHandler, outputStream);
> > - }
> > + return inputURIs;
> > } catch (Exception e) {
> > - System.err.println(e.getMessage());
> > - final int exitCode = e instanceof SpecificExitException ?
> ((SpecificExitException) e).exitCode : 1;
> > - if(verbose) e.printStackTrace(System.err);
> > - return exitCode;
> > + closeStreams();
> > + throw e;
> > }
> > - return 0;
> > }
> >
> > - private Options createOptions() {
> > + protected Options createOptions() {
> > final Options options = new Options();
> > options.addOption(
> > new Option("v", "verbose", false, "Show debug and progress information.")
> > @@ -178,13 +175,7 @@ public class Rover implements Tool {
> > "f",
> > "Output format",
> > true,
> > - "[" +
> > - TURTLE_FORMAT + " (default), " +
> > - NTRIPLE_FORMAT + ", " +
> > - RDFXML_FORMAT + ", " +
> > - NQUADS_FORMAT + ", " +
> > - URIS_FORMAT +
> > - "]"
> > + "[" + printFormats(FORMATS,
> DEFAULT_FORMAT_INDEX) + "]"
> > )
> > );
> > options.addOption(
> > @@ -208,11 +199,51 @@ public class Rover implements Tool {
> > return options;
> > }
> >
> > + protected void performExtraction(DocumentSource documentSource) {
> > + performExtraction(any23, eps, documentSource,
> reportingTripleHandler);
> > + }
> > +
> > + protected void performExtraction(String[] inputURIs) throws
> URISyntaxException, IOException {
> > + try {
> > + final long start = System.currentTimeMillis();
> > + for (String inputURI : inputURIs) {
> > + performExtraction( any23.createDocumentSource(inputURI)
> );
> > + }
> > + final long elapsed = System.currentTimeMillis() - start;
> > +
> > + if (benchmarkTripleHandler != null) {
> > + System.err.println(benchmarkTripleHandler.report());
> > + }
> > +
> > + logger.info("Extractors used: " +
> reportingTripleHandler.getExtractorNames());
> > + logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
> > + } finally {
> > + closeStreams();
> > + }
> > + }
> > +
> > + protected String printReports() {
> > + final StringBuilder sb = new StringBuilder();
> > + if(benchmarkTripleHandler != null) sb.append(
> benchmarkTripleHandler.report() ).append('\n');
> > + if(reportingTripleHandler != null) sb.append(
> reportingTripleHandler.printReport() ).append('\n');
> > + return sb.toString();
> > + }
> > +
> > private void printHelp() {
> > HelpFormatter formatter = new HelpFormatter();
> > formatter.printHelp("[{<url>|<file>}]+", options, true);
> > }
> >
> > + private String printFormats(String[] formats, int defaultIndex) {
> > + final StringBuilder sb = new StringBuilder();
> > + for (int i = 0; i < formats.length; i++) {
> > + sb.append(formats[i]);
> > + if(i == defaultIndex) sb.append(" (default)");
> > + if(i < formats.length - 1) sb.append(", ");
> > + }
> > + return sb.toString();
> > + }
> > +
> > private String argumentToURI(String uri) {
> > uri = uri.trim();
> > if (uri.toLowerCase().startsWith("http:") ||
> uri.toLowerCase().startsWith("https:")) {
> > @@ -268,27 +299,17 @@ public class Rover implements Tool {
> >
> > private TripleHandler getTripleHandler(CommandLine cl, OutputStream
> os) {
> > final String FORMAT_OPTION = "f";
> > - String format = DEFAULT_FORMAT;
> > + String format = FORMATS[DEFAULT_FORMAT_INDEX];
> > if (cl.hasOption(FORMAT_OPTION)) {
> > - format = cl.getOptionValue(FORMAT_OPTION);
> > + format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
> > }
> > - final TripleHandler outputHandler;
> > - if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
> > - outputHandler = new TurtleWriter(os);
> > - } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
> > - outputHandler = new NTriplesWriter(os);
> > - } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
> > - outputHandler = new RDFXMLWriter(os);
> > - } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
> > - outputHandler = new NQuadsWriter(os);
> > - } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
> > - outputHandler = new URIListWriter(os);
> > - } else {
> > + try {
> > + return
> WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
> > + } catch (Exception e) {
> > throw new IllegalArgumentException(
> > String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
> > );
> > }
> > - return outputHandler;
> > }
> >
> > private TripleHandler
> decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
> > @@ -346,44 +367,54 @@ public class Rover implements Tool {
> > return any23;
> > }
> >
> > - private void performExtraction(Any23 any23, ExtractionParameters
> eps, String documentURI, TripleHandler th) {
> > + private void performExtraction(
> > + Any23 any23, ExtractionParameters eps, DocumentSource
> documentSource, TripleHandler th
> > + ) {
> > try {
> > - if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
> > - throw new SpecificExitException("No suitable extractors found.", 2);
> > + if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
> > + throw new ExitCodeException("No suitable extractors found.", 2);
> > }
> > } catch (ExtractionException ex) {
> > - throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
> > + throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
> > } catch (IOException ex) {
> > - throw new SpecificExitException("Exception while producing output.", ex, 4);
> > + throw new ExitCodeException("Exception while producing output.", ex, 4);
> > }
> > }
> >
> > - private void closeHandler(TripleHandler th) {
> > - if(th == null) return;
> > + private void closeHandler() {
> > + if(tripleHandler == null) return;
> > try {
> > - th.close();
> > + tripleHandler.close();
> > } catch (TripleHandlerException the) {
> > - throw new SpecificExitException("Error while closing TripleHandler", the, 5);
> > + throw new ExitCodeException("Error while closing TripleHandler", the, 5);
> > }
> > }
> >
> > - private void closeAll(TripleHandler th, PrintStream os) {
> > - closeHandler(th);
> > - if(os != null) os.close();
> > + private void closeStreams() {
> > + closeHandler();
> > + if(outputStream != null) outputStream.close();
> > }
> >
> > - private class SpecificExitException extends RuntimeException {
> > + protected class ExitCodeException extends RuntimeException {
> >
> > private final int exitCode;
> >
> > - public SpecificExitException(String message, Throwable cause,
> int exitCode) {
> > + public ExitCodeException(String message, Throwable cause, int
> exitCode) {
> > super(message, cause);
> > this.exitCode = exitCode;
> > }
> > - public SpecificExitException(String message, int exitCode) {
> > + public ExitCodeException(String message, int exitCode) {
> > super(message);
> > this.exitCode = exitCode;
> > }
> > + public ExitCodeException(int exitCode) {
> > + super();
> > + this.exitCode = exitCode;
> > + }
> > +
> > + protected int getExitCode() {
> > + return exitCode;
> > + }
> > }
> >
> > }
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
> Tue Jan 10 16:32:28 2012
> > @@ -29,6 +29,13 @@ import java.util.Collection;
> > public interface ExtractorFactory<T extends Extractor<?>> extends
> ExtractorDescription {
> >
> > /**
> > + * Returns the extractor type.
> > + *
> > + * @return the not <code>null</code> extractor class.
> > + */
> > + Class<T> getExtractorType();
> > +
> > + /**
> > * Creates an extractor instance.
> > *
> > * @return an instance of the extractor associated to this factory.
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
> Tue Jan 10 16:32:28 2012
> > @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
> > import org.deri.any23.extractor.rdf.NQuadsExtractor;
> > import org.deri.any23.extractor.rdf.NTriplesExtractor;
> > import org.deri.any23.extractor.rdf.RDFXMLExtractor;
> > +import org.deri.any23.extractor.rdf.TriXExtractor;
> > import org.deri.any23.extractor.rdf.TurtleExtractor;
> > import org.deri.any23.extractor.rdfa.RDFa11Extractor;
> > import org.deri.any23.extractor.rdfa.RDFaExtractor;
> > @@ -79,6 +80,7 @@ public class ExtractorRegistry {
> > instance.register(TurtleExtractor.factory);
> > instance.register(NTriplesExtractor.factory);
> > instance.register(NQuadsExtractor.factory);
> > + instance.register(TriXExtractor.factory);
> >
> if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
> > instance.register(RDFa11Extractor.factory);
> > } else {
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
> Tue Jan 10 16:32:28 2012
> > @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
> > return supportedMIMETypes;
> > }
> >
> > + @Override
> > + public Class<T> getExtractorType() {
> > + return extractorClass;
> > + }
> > +
> > /**
> > * @return an instance of type T concrete implementation of {@link
> org.deri.any23.extractor.Extractor}
> > */
> > + @Override
> > public T createExtractor() {
> > try {
> > return extractorClass.newInstance();
> > @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
> > /**
> > * @return an input example
> > */
> > + @Override
> > public String getExampleInput() {
> > return exampleInput;
> > }
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
> > Arrays.asList(
> > "text/csv;q=0.1"
> > ),
> > - null,
> > + "example-csv.csv",
> > CSVExtractor.class
> > );
> >
> > @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
> > }
> >
> > /**
> > + * Check whether a number is an integer.
> > + *
> > + * @param number
> > + * @return
> > + */
> > + private boolean isInteger(String number) {
> > + try {
> > + Integer.valueOf(number);
> > + return true;
> > + } catch (NumberFormatException e) {
> > + return false;
> > + }
> > + }
> > +
> > + /**
> > + * Check whether a number is a float.
> > + *
> > * @param number
> > * @return
> > */
> > - private boolean isNumber(String number) {
> > + private boolean isFloat(String number) {
> > try {
> > - Double.valueOf(number);
> > + Float.valueOf(number);
> > return true;
> > } catch (NumberFormatException e) {
> > return false;
> > @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
> > object = new URIImpl(cell);
> > } else {
> > URI datatype = XMLSchema.STRING;
> > - if (isNumber(cell)) {
> > + if (isInteger(cell)) {
> > datatype = XMLSchema.INTEGER;
> > + } else if(isFloat(cell)) {
> > + datatype = XMLSchema.FLOAT;
> > }
> > object = new LiteralImpl(cell, datatype);
> > }
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
> > "html-mf-adr",
> > PopularPrefixes.createSubset("rdf", "vcard"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-adr.html",
> > AdrExtractor.class
> > );
> > }
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
> > "html-mf-geo",
> > PopularPrefixes.createSubset("rdf", "vcard"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-geo.html",
> > GeoExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
> > "html-mf-hcalendar",
> > PopularPrefixes.createSubset("rdf", "ical"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-hcalendar.html",
> > HCalendarExtractor.class);
> >
> > private static final String[] Components = {"Vevent", "Vtodo",
> "Vjournal", "Vfreebusy"};
> > @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
> > private boolean extractComponent(Node node, Resource cal, String
> component) throws ExtractionException {
> > HTMLDocument compoNode = new HTMLDocument(node);
> > BNode evt = valueFactory.createBNode();
> > - addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
> > + addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
> > addTextProps(compoNode, evt);
> > addUrl(compoNode, evt);
> > addRRule(compoNode, evt);
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
> > "html-mf-hcard",
> > PopularPrefixes.createSubset("rdf", "vcard"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-hcard.html",
> > HCardExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -82,7 +82,7 @@ public class HListingExtractor extends E
> > "html-mf-hlisting",
> > PopularPrefixes.createSubset("rdf", "hlisting"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-hlisting.html",
> > HListingExtractor.class
> > );
> >
> > @@ -106,7 +106,7 @@ public class HListingExtractor extends E
> > out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
> >
> > for (String action : findActions(fragment)) {
> > - out.writeTriple(listing, hLISTING.action,
> hLISTING.getResource(action));
> > + out.writeTriple(listing, hLISTING.action,
> hLISTING.getClass(action));
> > }
> > out.writeTriple(listing, hLISTING.lister, addLister() );
> > addItem(listing);
> > @@ -154,7 +154,7 @@ public class HListingExtractor extends E
> > String value = node.getNodeValue();
> > // do not use conditionallyAdd, it won't work cause of evaluation rules
> > if (!(null == value || "".equals(value))) {
> > - URI property =
> hLISTING.getPropertyCamelized(klass);
> > + URI property =
> hLISTING.getPropertyCamelCase(klass);
> > conditionallyAddLiteralProperty(
> > node,
> > blankItem, property,
> valueFactory.createLiteral(value)
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
> > "html-mf-hrecipe",
> > PopularPrefixes.createSubset("rdf", "hrecipe"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-hrecipe.html",
> > HRecipeExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
> > "html-mf-hresume",
> > PopularPrefixes.createSubset("rdf", "doac", "foaf"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-hresume.html",
> > HResumeExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
> > "html-mf-hreview",
> > PopularPrefixes.createSubset("rdf", "vcard", "rev"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-hreview.html",
> > HReviewExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
> > "html-head-links",
> > PopularPrefixes.createSubset("xhtml", "dcterms"),
> > Arrays.asList("text/html;q=0.05",
> "application/xhtml+xml;q=0.05"),
> > - null,
> > + "example-head-link.html",
> > HeadLinkExtractor.class);
> > }
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
> > "html-head-icbm",
> > PopularPrefixes.createSubset("geo", "rdf"),
> > Arrays.asList("text/html;q=0.01",
> "application/xhtml+xml;q=0.01"),
> > - null,
> > + "example-icbm.html",
> > ICBMExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -51,7 +51,7 @@ public class LicenseExtractor implements
> > "html-mf-license",
> > PopularPrefixes.createSubset("xhtml"),
> > Arrays.asList("text/html;q=0.01",
> "application/xhtml+xml;q=0.01"),
> > - null,
> > + "example-mf-license.html",
> > LicenseExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
> > "html-mf-species",
> > PopularPrefixes.createSubset("rdf", "wo"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-species.html",
> > SpeciesExtractor.class
> > );
> >
> > @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
> >
> > private URI resolveClassName(String clazz) {
> > String upperCaseClass = clazz.substring(0, 1);
> > - return vWO.getResource(
> > + return vWO.getClass(
> > String.format("%s%s",
> > upperCaseClass.toUpperCase(),
> > clazz.substring(1)
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
> > NAME,
> > PopularPrefixes.get(),
> > Arrays.asList("text/html;q=0.02",
> "application/xhtml+xml;q=0.02"),
> > - null,
> > + "example-script-turtle.html",
> > TurtleHTMLExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
> > "html-mf-xfn",
> > PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-mf-xfn.html",
> > XFNExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
> Tue Jan 10 16:32:28 2012
> > @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
> > "html-microdata",
> > PopularPrefixes.createSubset("rdf", "doac", "foaf"),
> > Arrays.asList("text/html;q=0.1",
> "application/xhtml+xml;q=0.1"),
> > - null,
> > + "example-microdata.html",
> > MicrodataExtractor.class
> > );
> >
> >
> > Modified:
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> > URL:
> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
> >
> ==============================================================================
> > ---
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> (original)
> > +++
> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
> Tue Jan 10 16:32:28 2012
> > @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
> > import org.deri.any23.extractor.ErrorReporter;
> > import org.deri.any23.extractor.ExtractionContext;
> > import org.deri.any23.extractor.ExtractionResult;
> > -import org.deri.any23.parser.NQuadsParser;
> > +import org.deri.any23.io.nquads.NQuadsParser;
> > import org.deri.any23.rdf.Any23ValueFactoryWrapper;
> > import org.openrdf.model.impl.ValueFactoryImpl;
> > import org.openrdf.rio.ParseErrorListener;
> > @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
> > import org.openrdf.rio.RDFParser;
> > import org.openrdf.rio.ntriples.NTriplesParser;
> > import org.openrdf.rio.rdfxml.RDFXMLParser;
> > +import org.openrdf.rio.trix.TriXParser;
> > import org.openrdf.rio.turtle.TurtleParser;
> > import org.slf4j.Logger;
> > import org.slf4j.LoggerFactory;
> > @@ -38,7 +39,7 @@ import java.io.Reader;
> >
> > /**
> > * This factory provides a common logic for creating and configuring
> correctly
> > - * any RDF parser used within the library.
> > + * any <i>RDF</i> parser used within the library.
> > *
> > * @author Michele Mostarda (mostarda@fbk.eu)
> > */
> > @@ -119,7 +120,7 @@ public class RDFParserFactory {
> > }
> >
> > /**
> > - * Returns a new instance of a configured {@link
> org.deri.any23.parser.NQuadsParser}.
> > + * Returns a new instance of a configured {@link
> org.deri.any23.io.nquads.NQuadsParser}.
> > *
> > * @param verifyDataType data verification enable if
> <code>true</code>.
> > * @param stopAtFirstError the parser stops at first error if
> <code>true</code>.
> > @@ -139,6 +140,26 @@ public class RDFParserFactory {
> > }
> >
> > /**
> > + * Returns a new instance of a configured {@link TriXParser}.
> > + *
> > + * @param verifyDataType data verification enable if
> <code>true</code>.
> > + * @param stopAtFirstError the parser stops at first error if
> <code>true</code>.
> > + * @param extractionContext the extraction context where the parser
> is used.
> > + * @param extractionResult the output extraction result.
> > + * @return a new instance of a configured TriX parser.
> > + */
> > + public TriXParser getTriXParser(
> > + final boolean verifyDataType,
> > + final boolean stopAtFirstError,
> > + final ExtractionContext extractionContext,
> > + final ExtractionResult extractionResult
> > + ) {
> > + final TriXParser parser = new TriXParser();
> > + configureParser(parser, verifyDataType, stopAtFirstError,
> extractionContext, extractionResult);
> > + return parser;
> > + }
> > +
> > + /**
> > * Configures the given parser on the specified extraction result
> > * setting the policies for data verification and error handling.
> > *
> >
> >
>
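
A note on the CSVExtractor hunk quoted above: the cell datatype is now chosen by trying an integer parse first and falling back to a float parse, with plain string as the default. Below is a minimal standalone sketch of that ordering. The class name CellDatatypeSketch and the method datatypeFor are hypothetical, and the XSD datatypes are written as plain URI strings instead of the Sesame XMLSchema constants the real extractor uses, so this only illustrates the check order and is not the actual extractor code.

```java
// Hypothetical standalone sketch of the integer-before-float check.
// Names are illustrative only; the real code lives in CSVExtractor.
final class CellDatatypeSketch {

    // XML Schema datatype namespace, written out as a plain string.
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    // True if the cell parses as a 32-bit integer.
    static boolean isInteger(String cell) {
        try {
            Integer.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the cell parses as a float.
    static boolean isFloat(String cell) {
        try {
            Float.valueOf(cell);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Integer is tested first because every integer literal also parses as a float.
    static String datatypeFor(String cell) {
        if (isInteger(cell)) return XSD + "integer";
        if (isFloat(cell)) return XSD + "float";
        return XSD + "string";
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "hello"}) {
            System.out.println(cell + " -> " + datatypeFor(cell));
        }
    }
}
```

One consequence of this ordering is that a whole number too large for a 32-bit int (for example 99999999999) fails the integer check and is typed as a float.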
--
Michele Mostarda
Senior Software Engineer
skype: michele.mostarda
twitter: micmos
mail: me@michelemostarda.com
site : http://www.michelemostarda.com