Posted to dev@lucene.apache.org by tballison <gi...@git.apache.org> on 2016/06/16 16:57:20 UTC
[GitHub] lucene-solr pull request #44: SOLR-8981
GitHub user tballison opened a pull request:
https://github.com/apache/lucene-solr/pull/44
SOLR-8981
SOLR-8981 upgrade to Tika 1.13
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tballison/lucene-solr SOLR-8981
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/44.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #44
----
commit ba0e71703464849198b384aa6e92962db8a04b51
Author: tballison <ta...@mitre.org>
Date: 2016-06-16T16:56:45Z
SOLR-8981 upgrade to Tika 1.13
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Hello,
Please also update all SHA1 hashes of files. Please run "ant precommit" from the root folder of Lucene/Solr. This will report everything that is missing.
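For reference, a SHA-1 checksum like the ones mentioned above can be recomputed by hand and compared with the checked-in value. This is only a minimal sketch in Python, not the build's own tooling; the function name and the jar path in the usage note are hypothetical.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large jars are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha1_of_file("tika-core-1.13.jar")` (a hypothetical local path) returns a 40-character hex string that can be compared against the hash stored for that jar.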
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
I merged everything successfully, but I get one test failure in solr/contrib/extraction:
[junit4] FAILURE 0.05s J0 | ExtractingRequestHandlerTest.testXPath <<<
[junit4] > Throwable #1: org.junit.ComparisonFailure: expected:<[News]> but was:<[]>
[junit4] > at __randomizedtesting.SeedInfo.seed([404BA07016F1FB57:3E1A6EE30E469911]:0)
I have the feeling I have seen this before. Weren't you running the extraction tests?
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Hi, I have applied some other fixes and will push soon. ASF currently has some problems with pushing:
git.exe push --progress "origin" master:master
Counting objects: 121, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (66/66), done.
Writing objects: 100% (121/121), 8.90 KiB | 0 bytes/s, done.
Total 121 (delta 55), reused 17 (delta 2)
remote: You are not authorized to edit this repository.
remote:
To https://git-wip-us.apache.org/repos/asf/lucene-solr.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://git-wip-us.apache.org/repos/asf/lucene-solr.git'
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
This is our bug, introduced in TIKA-995.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by lewismc <gi...@git.apache.org>.
Github user lewismc commented on the issue:
https://github.com/apache/lucene-solr/pull/44
@uschindler yep, we've seen this before. I have no idea what is going on here. I'll look into it again today. Can someone point out the exact code which does the XPath magic?
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
OK, I will merge again later. So I will revert my checkout once you have fixed that. Otherwise all looks fine.
BTW: Can you remove the assumeFalse on Java 9, now that PDFBox is fixed? It was added because, on Java 9, PDFBox failed in clinit (a version number parsing failure).
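To illustrate the kind of failure described here: code that assumes the legacy `1.x.y_z` layout of a Java version string (such as `1.8.0_91`) can break on Java 9 style strings such as `9` or `9-ea`. The sketch below is in Python and is not PDFBox's actual code; both function names are hypothetical.

```python
import re


def naive_major_version(java_version):
    """Fragile: assumes the '1.<major>.' layout used up to Java 8."""
    return int(java_version.split(".")[1])


def robust_major_version(java_version):
    """Handle both '1.8.0_91' (legacy) and '9', '9-ea', '9.0.1' (Java 9 style)."""
    match = re.match(r"(\d+)(?:\.(\d+))?", java_version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % java_version)
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: the real major version is the second component.
    if first == 1 and second is not None:
        return int(second)
    return first
```

With these definitions, `naive_major_version("9-ea")` raises an `IndexError`, while `robust_major_version` accepts both the old and the new layout.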
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Were you able to fix the test or should I look into it?
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Yes, I did run the extraction tests. That was the error we were getting initially, but it disappeared (without explanation) on my most recent integration attempt.
Re: Error parsing javascript with selenium (solr 6.0.0 & nutch 1.11 &
firefox 47.0)
Posted by Erick Erickson <er...@gmail.com>.
This isn't much help, but I'd advise asking on the Nutch users' list, as
this appears to be a Nutch issue, not a Solr one.
Best,
Erick
On Mon, Jun 20, 2016 at 1:41 AM, <li...@yahoo.com.invalid> wrote:
>
> ------------------------------
> * From: * liviuchristian@yahoo.com.INVALID
> <li...@yahoo.com.INVALID>;
> * To: * dev@lucene.apache.org <de...@lucene.apache.org>;
> dev@lucene.apache.org <de...@lucene.apache.org>; git@git.apache.org <
> git@git.apache.org>;
> * Subject: * Error parsing javascript with selenium (solr 6.0.0 & nutch
> 1.11 & firefox 47.0)
> * Sent: * Fri, Jun 17, 2016 9:38:53 PM
>
> *I have found some comments on this issue but nothing helpful:*
> Remote driver & Firefox: Unable to bind to locking port 7054 within 45000
> ms · Issue #7272 · SeleniumHQ/selenium-google-code-issue-archive
> <https://github.com/seleniumhq/selenium-google-code-issue-archive/issues/7272>
>
> Remote driver & Firefox: Unable to bind to locking port 7054 within 45...
> Originally reported on Google Code with ID 7272 Hi All, I'm experiencing
> some sporadic issues with Remote ...
>
> <https://github.com/seleniumhq/selenium-google-code-issue-archive/issues/7272>
> In Firefox Browser:Unable to bind to locking port 7054 within 45000ms ·
> Issue #6760 · SeleniumHQ/selenium-google-code-issue-archive
> <https://github.com/seleniumhq/selenium-google-code-issue-archive/issues/6760>
>
> In Firefox Browser:Unable to bind to locking port 7054 within 45000ms ·
> Iss...
> Originally reported on Google Code with ID 6760 selenium: 2.32.0,
> OS:Windows XP firefox version: 26.0. steps:...
>
> <https://github.com/seleniumhq/selenium-google-code-issue-archive/issues/6760>
>
> Unable to bind to locking port 7054 within 45000 ms : webdriver firefox
> <http://stackoverflow.com/questions/13992986/unable-to-bind-to-locking-port-7054-within-45000-ms-webdriver-firefox>
>
> Unable to bind to locking port 7054 within 45000 ms : webdriver firefox
> i'm new to selenium webdriver i'm trying to run a simple test : i'm using
> firefox 17.0.1 and seleni...
>
> <http://stackoverflow.com/questions/13992986/unable-to-bind-to-locking-port-7054-within-45000-ms-webdriver-firefox>
>
>
>
> *Please advice,*
>
>
> *Much obliged,*
>
> *Christian Fotache*
> Tel: 0728.297.207
>
>
>
>
Error parsing javascript with selenium (solr 6.0.0 & nutch 1.11 &
firefox 47.0)
Posted by li...@yahoo.com.INVALID.
Hi,
I'm trying to use Selenium (Solr 6.0.0 & Nutch 1.11 & Firefox 47.0) to parse JavaScript pages.
I'm using this configuration for nutch-site:
<property>
<name>plugin.includes</name>
<value>protocol-(httpclient|interactiveselenium|selenium)|urlfilter-(automaton|regex)|parse-(metatags|ext|html|js|swf|tika|zip)|index-(metadata|basic|anchor|geoip|dummy|links|more|replace|static)|scoring-opic|indexer-solr|urlnormalizer-(pass|regex|basic|ajax)|creativecommons|feed|headings|language-identifier|lib-nekohtml|lib-xml|microformats-reltag|mimetype-filter|nutch-extensionpoints|lib-selenium|subcollection|tld|parserfilter-naivebayes</value>
<description>...</description>
</property>
and this configuration for parse-plugins.xml
<parse-plugins>
<!-- by default if the mimeType is set to *, or
if it can't be determined, use parse-tika -->
<mimeType name="*">
<plugin id="parse-metatags"/>
<plugin id="protocol-interactiveselenium"/>
<plugin id="protocol-selenium"/>
<plugin id="lib-selenium"/>
<plugin id="nutch-extensionpoints"/>
<plugin id="parse-js"/>
<plugin id="parse-tika" />
<plugin id="feed"/>
<plugin id="parse-html"/>
<plugin id="parse-js"/>
<plugin id="parse-html" />
</mimeType>
<mimeType name="application/rss+xml">
<plugin id="parse-tika" />
<plugin id="feed" />
</mimeType>
<mimeType name="application/x-bzip2">
<!-- try and parse it with the zip parser -->
<plugin id="parse-zip" />
</mimeType>
<mimeType name="application/x-gzip">
<!-- try and parse it with the zip parser -->
<plugin id="parse-zip" />
</mimeType>
<mimeType name="application/x-javascript">
<plugin id="parse-js" />
<plugin id="protocol-interactiveselenium"/>
<plugin id="protocol-selenium"/>
<plugin id="lib-selenium"/>
<plugin id="nutch-extensionpoints"/>
<plugin id="parse-metatags"/>
<!--<plugin id="parse-ext"/>-->
<plugin id="parse-tika" />
</mimeType>
<mimeType name="application/x-shockwave-flash">
<plugin id="parse-swf" />
</mimeType>
<mimeType name="application/zip">
<plugin id="parse-zip" />
</mimeType>
<!--<mimeType name="text/html">
<plugin id="parse-html" />
</mimeType>-->
<mimeType name="text/html">
<plugin id="parse-metatags"/>
<plugin id="protocol-interactiveselenium"/>
<plugin id="protocol-selenium"/>
<plugin id="lib-selenium"/>
<plugin id="nutch-extensionpoints"/>
<!--<plugin id="parse-ext"/>-->
<!--<plugin id="parse-js"/>-->
<plugin id="parse-html" />
<plugin id="parse-tika" />
</mimeType>
<mimeType name="application/xhtml+xml">
<plugin id="parse-metatags"/>
<plugin id="protocol-interactiveselenium"/>
<plugin id="protocol-selenium"/>
<plugin id="lib-selenium"/>
<plugin id="nutch-extensionpoints"/>
<plugin id="parse-tika" />
<plugin id="feed" />
<plugin id="parse-html" />
</mimeType>
<mimeType name="text/xml">
<plugin id="parse-metatags"/>
<plugin id="protocol-interactiveselenium"/>
<plugin id="protocol-selenium"/>
<plugin id="lib-selenium"/>
<plugin id="parse-tika" />
<plugin id="feed" />
</mimeType>
A Firefox window pops up with a message about private browsing on it.
However, I then get the error below and the job crashes:
17 18:44:13,029 INFO api.HttpRobotRulesParser - Couldn't get robots.txt for http://findjobs.mashable.com/: java.lang.RuntimeException: org.openqa.selenium.WebDriverException: Unable to bind to locking port 7054 within 45000 ms
Build info: version: '2.48.2', revision: '41bccdd10cf2c0560f637404c2d96164b67d9d67', time: '2015-10-09 13:08:06'
System info: host: 'solr', ip: '127.0.1.1', os.name: 'Linux', os.arch: 'amd64', os.version: '3.19.0-39-generic', java.version: '1.8.0_91'
Driver info: driver.version: FirefoxDriver
2016-06-17 18:44:13,129 ERROR selenium.Http - Failed to get protocol output
java.lang.RuntimeException: org.openqa.selenium.WebDriverException: Failed to connect to binary FirefoxBinary(/usr/bin/firefox) on port 7055; process output follows:
ения Firefox для Ubuntu","creator":"Canonical Ltd.","homepageURL":null},{"locales":["sl"],"name":"Ubuntu Modifications","description":"Ubuntu razširitve za Firefox.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["sv-SE"],"name":"Ubuntu Modifications","description":"Ubuntu-paket för Firefox.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["uk"],"name":"Ubuntu Modifications","description":"Убунтівські доповнення до Firefox.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["zh-CN"],"name":"Ubuntu Modifications","description":"Ubuntu 火狐扩展包.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["zh-TW"],"name":"Ubuntu Modifications","description":"Ubuntu Firefox 擴充包。","creator":"Canonical Ltd.","homepageURL":null}],"targetApplications":[{"id":"{ec8030f7-c20a-464f-9b0e-13a3a9e97384}","minVersion":"9.0","maxVersion":"37.0a1"}],"targetPlatforms":[],"multiprocessCompatible":false,"signedState":2,"seen":true}
1466178208570 DeferredSave.extensions.json DEBUG Save changes
1466178208570 addons.xpi DEBUG Updating database with changes to installed add-ons
1466178208570 addons.xpi-utils DEBUG Updating add-on states
1466178208571 addons.xpi-utils DEBUG Writing add-ons list
1466178208575 addons.xpi DEBUG Registering manifest for /usr/lib/firefox/browser/features/firefox@getpocket.com.xpi
1466178208576 addons.xpi DEBUG Calling bootstrap method startup on firefox@getpocket.com version 1.0.2
1466178208578 addons.xpi DEBUG Registering manifest for /usr/lib/firefox/browser/features/e10srollout@mozilla.org.xpi
1466178208578 addons.xpi DEBUG Calling bootstrap method startup on e10srollout@mozilla.org version 1.0
1466178208578 addons.xpi DEBUG Registering manifest for /usr/lib/firefox/browser/features/loop@mozilla.org.xpi
1466178208579 addons.xpi DEBUG Calling bootstrap method startup on loop@mozilla.org version 1.3.2
1466178208610 addons.manager DEBUG Registering shutdown blocker for XPIProvider
1466178208610 addons.manager DEBUG Provider finished startup: XPIProvider
1466178208610 addons.manager DEBUG Starting provider: LightweightThemeManager
1466178208611 addons.manager DEBUG Registering shutdown blocker for LightweightThemeManager
1466178208612 addons.manager DEBUG Provider finished startup: LightweightThemeManager
1466178208613 addons.manager DEBUG Starting provider: GMPProvider
1466178208621 addons.manager DEBUG Registering shutdown blocker for GMPProvider
1466178208622 addons.manager DEBUG Provider finished startup: GMPProvider
1466178208622 addons.manager DEBUG Starting provider: PluginProvider
1466178208622 addons.manager DEBUG Registering shutdown blocker for PluginProvider
1466178208622 addons.manager DEBUG Provider finished startup: PluginProvider
1466178208623 addons.manager DEBUG Completed startup sequence
1466178209011 addons.manager DEBUG Starting provider: <unnamed-provider>
1466178209011 addons.manager DEBUG Registering shutdown blocker for <unnamed-provider>
1466178209012 addons.manager DEBUG Provider finished startup: <unnamed-provider>
1466178209202 DeferredSave.extensions.json DEBUG Write succeeded
1466178209202 addons.xpi-utils DEBUG XPI Database saved, setting schema version preference to 17
1466178209202 DeferredSave.extensions.json DEBUG Starting timer
1466178209229 DeferredSave.extensions.json DEBUG Starting write
1466178209237 addons.repository DEBUG No addons.json found.
1466178209238 DeferredSave.addons.json DEBUG Save changes
1466178209242 DeferredSave.addons.json DEBUG Starting timer
1466178209309 addons.manager DEBUG Starting provider: PreviousExperimentProvider
1466178209310 addons.manager DEBUG Registering shutdown blocker for PreviousExperimentProvider
1466178209310 addons.manager DEBUG Provider finished startup: PreviousExperimentProvider
1466178209317 DeferredSave.addons.json DEBUG Starting write
1466178209329 DeferredSave.extensions.json DEBUG Write succeeded
1466178209357 DeferredSave.addons.json DEBUG Write succeeded
(firefox:3352): Gtk-CRITICAL **: gtk_clipboard_set_with_data: assertion 'targets != NULL' failed
Build info: version: '2.48.2', revision: '41bccdd10cf2c0560f637404c2d96164b67d9d67', time: '2015-10-09 13:08:06'
System info: host: 'solr', ip: '127.0.1.1', os.name: 'Linux', os.arch: 'amd64', os.version: '3.19.0-39-generic', java.version: '1.8.0_91'
Driver info: driver.version: FirefoxDriver
at org.apache.nutch.protocol.selenium.HttpWebClient.getDriverForPage(HttpWebClient.java:118)
at org.apache.nutch.protocol.selenium.HttpWebClient.getHtmlPage(HttpWebClient.java:155)
at org.apache.nutch.protocol.selenium.HttpResponse.readPlainContent(HttpResponse.java:244)
at org.apache.nutch.protocol.selenium.HttpResponse.<init>(HttpResponse.java:168)
at org.apache.nutch.protocol.selenium.Http.getResponse(Http.java:56)
at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:261)
at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:290)
Caused by: org.openqa.selenium.WebDriverException: Failed to connect to binary FirefoxBinary(/usr/bin/firefox) on port 7055; process output follows:
ения Firefox для Ubuntu","creator":"Canonical Ltd.","homepageURL":null},{"locales":["sl"],"name":"Ubuntu Modifications","description":"Ubuntu razširitve za Firefox.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["sv-SE"],"name":"Ubuntu Modifications","description":"Ubuntu-paket för Firefox.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["uk"],"name":"Ubuntu Modifications","description":"Убунтівські доповнення до Firefox.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["zh-CN"],"name":"Ubuntu Modifications","description":"Ubuntu 火狐扩展包.","creator":"Canonical Ltd.","homepageURL":null},{"locales":["zh-TW"],"name":"Ubuntu Modifications","description":"Ubuntu Firefox 擴充包。","creator":"Canonical Ltd.","homepageURL":null}],"targetApplications":[{"id":"{ec8030f7-c20a-464f-9b0e-13a3a9e97384}","minVersion":"9.0","maxVersion":"37.0a1"}],"targetPlatforms":[],"multiprocessCompatible":false,"signedState":2,"seen":true}
1466178208570 DeferredSave.extensions.json DEBUG Save changes
1466178208570 addons.xpi DEBUG Updating database with changes to installed add-ons
1466178208570 addons.xpi-utils DEBUG Updating add-on states
1466178208571 addons.xpi-utils DEBUG Writing add-ons list
I have found some comments on this issue but nothing helpful:
Remote driver & Firefox: Unable to bind to locking port 7054 within 45000 ms · Issue #7272 · SeleniumHQ/selenium-google-code-issue-archive
Originally reported on Google Code with ID 7272 Hi All, I'm experiencing some sporadic issues with Remote ...
In Firefox Browser:Unable to bind to locking port 7054 within 45000ms · Issue #6760 · SeleniumHQ/selenium-google-code-issue-archive
Originally reported on Google Code with ID 6760 selenium: 2.32.0, OS:Windows XP firefox version: 26.0. steps:...
Unable to bind to locking port 7054 within 45000 ms : webdriver firefox
i'm new to selenium webdriver i'm trying to run a simple test : i'm using firefox 17.0.1 and seleni...
Please advise,
Much obliged,
Christian Fotache
Tel: 0728.297.207
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
There will likely be some conflicts with bouncy castle.
Tika 1.13:
bcmail-jdk15on 1.54
bcprov-jdk15on 1.54
vs. Solr:
org.bouncycastle.version = 1.45
/org.bouncycastle/bcmail-jdk15 = ${org.bouncycastle.version}
/org.bouncycastle/bcprov-jdk15 = ${org.bouncycastle.version}
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
OK, the tests pass for me successfully. Should I remove the jackcess-encrypt package from your PR after merging (you said you will be away this weekend)?
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
> will take a look. The test passed if you assumed that the html had two bodies, but that's crazy...
I hope this test does not download the internet? It should all run local! I have not looked into it.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
The XHTMLContentHandler adds <body> and </body>. In out-of-the-box Tika with the DefaultHtmlMapper, "body" tags are not in the list of "SAFE_ELEMENTS", which means that the html's "body" tag is never passed through...so we don't see the doubling in Tika.
The solution is to suppress the body tag in Solr's MostlyPassthroughHtmlMapper.
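A minimal sketch of that idea, not the actual Solr change: the interface below is a local, hypothetical stand-in that mirrors the three methods of Tika's HtmlMapper so the example compiles without Tika on the classpath, and BodySuppressingMapper is an invented name. It assumes that returning null from mapSafeElement means the tag is not emitted, while isDiscardElement returning false keeps the element's text content.

```java
import java.util.Locale;

// Hypothetical stand-in for Tika's HtmlMapper, declared locally so the
// sketch is self-contained. The method names follow the Tika interface.
interface HtmlMapperSketch {
    String mapSafeElement(String name);
    boolean isDiscardElement(String name);
    String mapSafeAttribute(String elementName, String attributeName);
}

// Passes most elements through, but suppresses <body> because the
// downstream XHTML handler is assumed to write its own <body> wrapper.
class BodySuppressingMapper implements HtmlMapperSketch {

    @Override
    public String mapSafeElement(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        // null = "not a safe element": the tag is dropped, its content is kept
        return "body".equals(lower) ? null : lower;
    }

    @Override
    public boolean isDiscardElement(String name) {
        // Never discard content, only the <body> tag itself
        return false;
    }

    @Override
    public String mapSafeAttribute(String elementName, String attributeName) {
        return attributeName.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        BodySuppressingMapper mapper = new BodySuppressingMapper();
        for (String tag : new String[] {"HTML", "BODY", "P"}) {
            System.out.println(tag + " -> " + mapper.mapSafeElement(tag));
        }
    }
}
```

With a mapper like this, only the handler's own body element would reach the output, which is the behavior the test expects.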
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
I also only have Windows :)
I would leave out the image format, but MS Access looks fine. Could we leave out updating bouncycastle then?
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
> I also only have Windows :)
How can you live with the failed builds?!? I wanted to help with [morphlines](https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201606.mbox/%3CCY1PR09MB1115F9A08E97879D959D3CDCC7570%40CY1PR09MB1115.namprd09.prod.outlook.com%3E), but I can't easily do much...
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Let's pick option 2 for now. Maybe update the rest of Solr after some review.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Ah OK, so no problem on my side. I'll wait a bit.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
I think I got it... ant precommit worked in Linux with these modifications. I kept getting hangs with ant jar-checksums in Windows.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Did you check with Java 9 or should I do it? I am not sure about removing the last assume, because there is another SOLR issue in the assume message, not just the PDFBOX one.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
For me it still happens. I just merged the PR.
[GitHub] lucene-solr pull request #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
GitHub user tballison reopened a pull request:
https://github.com/apache/lucene-solr/pull/44
SOLR-8981
SOLR-8981 upgrade to Tika 1.13
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tballison/lucene-solr SOLR-8981
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/44.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #44
----
commit ba0e71703464849198b384aa6e92962db8a04b51
Author: tballison <ta...@mitre.org>
Date: 2016-06-16T16:56:45Z
SOLR-8981 upgrade to Tika 1.13
commit 1706b92790011f3ec5a85915adad3834e87d8970
Author: tballison <ta...@mitre.org>
Date: 2016-06-16T19:36:52Z
SOLR-8981 clean up license and sha1 info
commit 31c091b4856081f2d1b302499a436e5953779e5e
Author: tballison <ta...@mitre.org>
Date: 2016-06-17T13:47:53Z
SOLR-8981 clean up new lines, upgrade isoparser, add notice in CHANGES.txt
----
[GitHub] lucene-solr pull request #44: SOLR-8981
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/lucene-solr/pull/44
[GitHub] lucene-solr pull request #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison closed the pull request at:
https://github.com/apache/lucene-solr/pull/44
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Grep for that one and remove them. Tests should pass then with latest Java 9:
`assumeFalse("This test fails with Java 9 (https://issues.apache.org/jira/browse/PDFBOX-3155)", Constants.JRE_IS_MINIMUM_JAVA9);`
[GitHub] lucene-solr pull request #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/44#discussion_r67575579
--- Diff: solr/contrib/morphlines-cell/src/test/org/apache/solr/morphlines/cell/SolrCellMorphlineTest.java ---
@@ -42,8 +42,6 @@
@BeforeClass
public static void beforeClass2() {
assumeFalse("FIXME: Morphlines currently has issues with Windows paths", Constants.WINDOWS);
- assumeFalse("This test fails with Java 9 (https://issues.apache.org/jira/browse/PDFBOX-3155, https://issues.apache.org/jira/browse/SOLR-8876)",
--- End diff --
This should stay, because Hadoop related stuff also fails with Java 9. Maybe only remove the PDFBOX issue number.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
What file formats are these? Documents? Otherwise please leave them out.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
> I think this should work... ant precommit worked in Linux with these modifications. I kept getting hangs with ant jar-checksums in Windows.
If you check out with git on Windows using auto-eol, it fails. The reason is that git treats sha1 files as text and converts their line endings.
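To illustrate why converted line endings matter, here is a minimal sketch. It assumes the checksum check compares the .sha1 file content byte-for-byte against the hex digest plus a single "\n"; that is an assumption for illustration, not a description of what the Lucene/Solr build actually does. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper showing how "\r\n" in a .sha1 file can break a
// strict comparison even though the digest itself is unchanged.
class Sha1LineEndingSketch {

    // Hex-encode the SHA-1 digest of the given bytes.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Strict check (assumed): file content must be exactly digest + "\n".
    static boolean strictMatch(String fileContent, String expectedHex) {
        return fileContent.equals(expectedHex + "\n");
    }

    // Tolerant check: ignore surrounding whitespace, including "\r".
    static boolean tolerantMatch(String fileContent, String expectedHex) {
        return fileContent.trim().equals(expectedHex);
    }

    public static void main(String[] args) {
        String hex = sha1Hex("example jar bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println("LF strict:     " + strictMatch(hex + "\n", hex));
        System.out.println("CRLF strict:   " + strictMatch(hex + "\r\n", hex));
        System.out.println("CRLF tolerant: " + tolerantMatch(hex + "\r\n", hex));
    }
}
```

Under that assumption, a checkout that rewrites "\n" to "\r\n" changes the checksum file without changing the jar, so a strict comparison would report a mismatch.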
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Just found it. Confirming that fix doesn't break anything else.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by lewismc <gi...@git.apache.org>.
Github user lewismc commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Yes the server is buggered. Good work folks.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Git (well, it was my fault, don't get me wrong) added the \r\n somehow. I had turned off autocrlf earlier.
> C:\...>git config --get core.autocrlf
input
I realized I forgot to update the isoparser, and I cleaned up the Jackcess notice.
Let me know how this looks now.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by uschindler <gi...@git.apache.org>.
Github user uschindler commented on the issue:
https://github.com/apache/lucene-solr/pull/44
LOL. So is this a bug in Solr or in TIKA? Because it did not happen previously.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
No, it is a self-contained test with a test file. +1 on local and _only_ local.
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
argh...
will take a look. The test passed if you assumed that the html had two bodies, but that's crazy...
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
WebP is an image format.
Jackcess encrypt is the library that allows users to decrypt MSAccess files.
Please give it a go with Java 9. I can't easily test the morphlines stuff on my main dev box (Windows ... :( ).
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
If we leave out updating bouncycastle, I'm fairly confident that users will run into problems at run time if they try to decrypt MSAccess and probably PDF and doc.
We had a binary incompatibility between 1.52 and 1.54 with Jackcess: https://sourceforge.net/p/jackcessencrypt/feature-requests/2/
IIRC, the exception was thrown on any encrypted MSAccess file, not just those for which the user had a password.
I see two options:
1) upgrade bouncycastle and hope we don't break other parts of Solr
2) announce decryption of Jackcess/POI/PDFBox as unsupported
[GitHub] lucene-solr issue #44: SOLR-8981
Posted by tballison <gi...@git.apache.org>.
Github user tballison commented on the issue:
https://github.com/apache/lucene-solr/pull/44
Not willing to point fingers... :)
I'd like to track down the change in our history between 1.7 and 1.13 so that I actually understand what happened.