Posted to dev@any23.apache.org by GitBox <gi...@apache.org> on 2021/09/15 00:11:31 UTC

[GitHub] [any23] dependabot[bot] opened a new pull request #183: Bump tika.version from 1.27 to 2.1.0

dependabot[bot] opened a new pull request #183:
URL: https://github.com/apache/any23/pull/183


   Bumps `tika.version` from 1.27 to 2.1.0.
   Updates `tika-core` from 1.27 to 2.1.0
   <details>
   <summary>Changelog</summary>
   <p><em>Sourced from <a href="https://github.com/apache/tika/blob/main/CHANGES.txt">tika-core's changelog</a>.</em></p>
   <blockquote>
   <p>Release 2.1.1 - ???</p>
   <ul>
   <li>
   <p>Improve robustness and features of the httpfetcher (TIKA-3543)</p>
   </li>
   <li>
   <p>Add optional fetch ranges to FetchEmitTuple to allow range fetching from,
   e.g. http or s3 (TIKA-3542).</p>
   </li>
   <li>
   <p>Exclude dependencies on jsoup and ehcache in ucar grib/cdm (TIKA-3003).</p>
   </li>
   </ul>
   <p>Release 2.1.0 - 08/18/2021</p>
   <p>MAJOR CHANGES in 2.1.0:</p>
   <ul>
   <li>
   <p>Improved packaging for tika-parsers-extended. Use the tika-parser-scientific-package and
   tika-parser-sqlite3-package artifacts if you want fat jars with dependencies. (TIKA-3510)</p>
   </li>
   <li>
   <p>Tika app writes UTF-8 when an encoding is not specified; the legacy behavior
   was UTF-8 on Mac OS, but System default on other OSs (TIKA-3515).</p>
   </li>
   <li>
   <p>Change the default rendering strategy for PDFs from NO_TEXT to ALL (TIKA-3520).</p>
   </li>
   </ul>
   <p>Other changes:</p>
   <ul>
   <li>
   <p>Fixed bug that pointed to the wrong tessdata directory if the user specified
   a tesseract path but not also a tessdata path (TIKA-3518).</p>
   </li>
   <li>
   <p>Fixed bug in Icu4j's encoding detector where it would return non-standard
   names for charsets, e.g. IBM424_rtl is now returned as IBM424 (TIKA-3516).</p>
   </li>
   <li>
   <p>Add a simple UrlFetcher in tika-core as a basic alternative
   to tika-fetcher-http (TIKA-3527).</p>
   </li>
   <li>
   <p>Add tika-pipes support for Google Cloud Storage (TIKA-3524).</p>
   </li>
   <li>
   <p>Fix markup ordering errors in xhtml output for ODT files (TIKA-2242).</p>
   </li>
   <li>
   <p>Fix serialization of embedded docs in OpenSearch emitter
   and fix embedded documents not being indexed in some use
   cases in the Solr emitter (TIKA-3490).</p>
   </li>
   <li>
   <p>Add pipesClientId system property to PipesServer so that each
   forked process can log to its own logger (TIKA-3480).</p>
   </li>
   <li>
   <p>Add DateNormalizingMetadataFilter to let users ensure that all dates
   emitted to Solr/OpenSearch are in UTC. Users can configure which
   timezone they'd like to use in cases where the file format does
   not store a timezone (TIKA-3496).</p>
   </li>
   <li>
   <p>Breaking change in the Solr and OpenSearch emitters. To achieve</p>
   </li>
   </ul>
   </blockquote>
   <p>... (truncated)</p>
   </details>
   <details>
   <summary>Commits</summary>
   <ul>
   <li>See full diff in <a href="https://github.com/apache/tika/commits">compare view</a></li>
   </ul>
   </details>
   <br />
   
   Updates `tika-parsers` from 1.27 to 2.1.0
   <br />
   
   
   Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
   
   
   ---
   
   <details>
   <summary>Dependabot commands and options</summary>
   <br />
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
   - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
   
   
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@any23.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [any23] dependabot[bot] commented on pull request #183: Bump tika.version from 1.27 to 2.1.0

Posted by GitBox <gi...@apache.org>.
dependabot[bot] commented on pull request #183:
URL: https://github.com/apache/any23/pull/183#issuecomment-919715624


   OK, I won't notify you again about this release, but will get in touch when a new version is available. You can also ignore all major, minor, or patch releases for a dependency by adding an [`ignore` condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore) with the desired `update_types` to your config file.
   
   If you change your mind, just re-open this PR and I'll resolve any conflicts on it.
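
For reference, the `ignore` condition mentioned above could look roughly like the fragment below. This is a minimal sketch, not the any23 project's actual configuration: the `maven` ecosystem, the root directory, the weekly schedule, and the `org.apache.tika:*` name pattern are all assumptions made for illustration. Note that in `dependabot.yml` the key is spelled `update-types`, with a hyphen.

```yaml
# .github/dependabot.yml (hypothetical example, not taken from the any23 repository)
version: 2
updates:
  - package-ecosystem: "maven"      # assumed: any23 is built with Maven
    directory: "/"                  # assumed: pom.xml sits at the repository root
    schedule:
      interval: "weekly"            # assumed schedule
    ignore:
      # Skip major-version bumps (such as 1.27 to 2.1.0) for Tika artifacts,
      # while still allowing minor and patch updates.
      - dependency-name: "org.apache.tika:*"
        update-types: ["version-update:semver-major"]
```

With a rule like this in place, Dependabot would stop opening pull requests for new major versions of the matched dependencies, so a comment-based `@dependabot ignore this major version` would not be needed for each PR.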





[GitHub] [any23] lewismc closed pull request #183: Bump tika.version from 1.27 to 2.1.0

Posted by GitBox <gi...@apache.org>.
lewismc closed pull request #183:
URL: https://github.com/apache/any23/pull/183


   

