Posted to commits@tika.apache.org by le...@apache.org on 2021/12/17 19:02:27 UTC
svn commit: r1896103 - in /tika/site: publish/2.2.0/detection.html src/site/apt/2.2.0/detection.apt
Author: lewismc
Date: Fri Dec 17 19:02:27 2021
New Revision: 1896103
URL: http://svn.apache.org/viewvc?rev=1896103&view=rev
Log:
TIKA-3620 Language detection documentation needs attention
Modified:
tika/site/publish/2.2.0/detection.html
tika/site/src/site/apt/2.2.0/detection.apt
Modified: tika/site/publish/2.2.0/detection.html
URL: http://svn.apache.org/viewvc/tika/site/publish/2.2.0/detection.html?rev=1896103&r1=1896102&r2=1896103&view=diff
==============================================================================
--- tika/site/publish/2.2.0/detection.html (original)
+++ tika/site/publish/2.2.0/detection.html Fri Dec 17 19:02:27 2021
@@ -164,7 +164,8 @@ for (InputStream is : myListOfStreams) {
<div class="section">
<h3><a name="Language_Detection">Language Detection</a></h3>
<p>Tika is able to help identify the language of a piece of text, which is useful when extracting text from document formats which do not include language information in their metadata.</p>
-<p>The language detection is provided by <a href="./api/org/apache/tika/language/LanguageIdentifier.html">org.apache.tika.language.LanguageIdentifier</a></p></div>
+<p>Language detection is provided by extensions of the <a href="./api/org/apache/tika/language/detect/LanguageDetector.html">org.apache.tika.language.detect.LanguageDetector</a> class. This lets developers choose among, and compare, different language detection implementations.</p>
+<p>Java code examples of language detection can be found in <a class="externalLink" href="https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java">LanguageDetectorExample.java</a>, <a class="externalLink" href="https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java">LanguageDetectingParser.java</a> and <a class="externalLink" href="https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java">Language.java</a>.</p></div>
<div class="section">
<h3><a name="More_Examples">More Examples</a></h3>
<p>For more examples of Detection using Apache Tika, please take a look at the <a href="./examples.html">Tika Examples page</a>.</p></div></div>
Modified: tika/site/src/site/apt/2.2.0/detection.apt
URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/2.2.0/detection.apt?rev=1896103&r1=1896102&r2=1896103&view=diff
==============================================================================
--- tika/site/src/site/apt/2.2.0/detection.apt (original)
+++ tika/site/src/site/apt/2.2.0/detection.apt Fri Dec 17 19:02:27 2021
@@ -208,8 +208,14 @@ for (InputStream is : myListOfStreams) {
is useful when extracting text from document formats which do not include
language information in their metadata.
- The language detection is provided by
- {{{./api/org/apache/tika/language/LanguageIdentifier.html}org.apache.tika.language.LanguageIdentifier}}
+ Language detection is provided by extensions of the
+ {{{./api/org/apache/tika/language/detect/LanguageDetector.html}org.apache.tika.language.detect.LanguageDetector}} class.
+ This lets developers choose among, and compare, different
+ language detection implementations.
+
+ Java code examples of language detection can be found in {{{https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectorExample.java}LanguageDetectorExample.java}},
+ {{{https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/LanguageDetectingParser.java}LanguageDetectingParser.java}}
+ and {{{https://github.com/apache/tika/blob/main/tika-example/src/main/java/org/apache/tika/example/Language.java}Language.java}}.
* {More Examples}