Posted to general@lucene.apache.org by Uwe Schindler <us...@apache.org> on 2016/04/06 10:45:19 UTC
Apache Solr and Tika used to index Panama Papers
Hi all,
I just wanted to repost the following by Chris Mattmann on the TIKA list:
If you have been following the news you’ve seen the Panama papers and how the world’s rich and elite have been storing all their money offshore to hide it. Two of the ASF’s key technologies were used in uncovering that story and showing the world what was going on: Apache Tika and Apache Solr.
Solr was used for making the Terabytes of Panama Papers available to journalists. The preprocessing of the documents for indexing was done with Tika (maybe through the contrib/extraction module).
Here is the article by Forbes about that:
http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
Uwe
-----
Uwe Schindler
uschindler@apache.org
ASF Member, Apache Lucene PMC / Committer
Bremen, Germany
http://lucene.apache.org/
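Editorial sketch: the message above only guesses ("maybe") that the documents reached Solr through the contrib/extraction module (Solr Cell). Purely as an illustration of what that route typically looks like, and not as a description of what the journalists actually ran, the snippet below builds a request URL for Solr's ExtractingRequestHandler, which hands an uploaded file to Tika and indexes the extracted text. The host, the core name "leak", the document id, and the /update/extract path (the conventional mapping in Solr's sample configs) are all assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, core, doc_id, commit=True):
    """Build the URL for a Solr Cell (ExtractingRequestHandler) upload.

    literal.id sets the unique key of the indexed document;
    commit=true makes it searchable right away.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


def post_document(url, path, content_type="application/octet-stream"):
    """POST the raw file bytes to Solr; Tika detects the format server-side.

    Needs a running Solr with the extraction contrib enabled (assumed).
    """
    with open(path, "rb") as fh:
        request = Request(url, data=fh.read(),
                          headers={"Content-Type": content_type})
    with urlopen(request) as response:
        return response.status


# Hypothetical values, for illustration only.
url = build_extract_url("http://localhost:8983/solr", "leak", "doc-1")
print(url)
```

Calling post_document(url, "some.pdf") would then send one file; a collection of this size would need batching and far fewer commits than one per document.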
RE: Apache Solr and Tika used to index Panama Papers
Posted by Martin Gainty <mg...@hotmail.com>.
Last time I traveled to Panama City there were a TON of high rollers in silk suits and Gucci shoes on board the plane
(I of course thought it was for the annual Latin American Star Trek convention)
Thanks Uwe (and Dave)
Martin
______________________________________________
Date: Wed, 6 Apr 2016 09:00:00 -0400
Subject: Re: Apache Solr and Tika used to index Panama Papers
From: joelsolr@gmail.com
To: dev@lucene.apache.org
Wonder if they were on the users list!
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Apr 6, 2016 at 8:57 AM, David Smiley <da...@gmail.com> wrote:
😀 awesome
On Wed, Apr 6, 2016 at 4:45 AM Uwe Schindler <us...@apache.org> wrote:
Hi all,
I just wanted to repost the following by Chris Mattman on the TIKA list:
If you have been following the news you’ve seen the Panama papers and how the world’s rich and elite have been storing all their money offshore to hide it. Two of the ASF’s key technologies were used in uncovering that story and showing the world what was going on: Apache Tika and Apache Solr.
Solr was used for making the Terabytes of Panama Papers available to journalists. The preprocessing of the documents for indexing was done with Tika (maybe through the contrib/extraction module).
Here is the article by Forbes about that:
http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
Uwe
-----
Uwe Schindler
uschindler@apache.org
ASF Member, Apache Lucene PMC / Committer
Bremen, Germany
http://lucene.apache.org/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
Re: Apache Solr and Tika used to index Panama Papers
Posted by Joel Bernstein <jo...@gmail.com>.
Wonder if they were on the users list!
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Apr 6, 2016 at 8:57 AM, David Smiley <da...@gmail.com>
wrote:
> 😀 awesome
>
> On Wed, Apr 6, 2016 at 4:45 AM Uwe Schindler <us...@apache.org>
> wrote:
>
>> Hi all,
>>
>> I just wanted to repost the following by Chris Mattman on the TIKA list:
>>
>> If you have been following the news you’ve seen the Panama papers and how
>> the world’s rich and elite have been storing all their money offshore to
>> hide it. Two of the ASF’s key technologies were used in uncovering that
>> story and showing the world what was going on: Apache Tika and Apache Solr.
>>
>> Solr was used for making the Terabytes of Panama Papers available to
>> journalists. The preprocessing of the documents for indexing was done with
>> Tika (maybe through the contrib/extraction module).
>>
>> Here is the article by Forbes about that:
>>
>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> uschindler@apache.org
>> ASF Member, Apache Lucene PMC / Committer
>> Bremen, Germany
>> http://lucene.apache.org/
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
Re: Apache Solr and Tika used to index Panama Papers
Posted by SIDDHAST® Roshan <ro...@siddhast.com>.
Congrats to the Apache Solr team
roshan
On Thu, Apr 7, 2016 at 3:30 PM, Charlie Hull <ch...@flax.co.uk> wrote:
> This isn't the first time a global news organisation has used Solr to
> index leaked data, unsurprisingly - for creating something fast & quietly,
> open source is a natural choice.
>
> Charlie
>
>
> On 06/04/2016 13:57, David Smiley wrote:
>
>> 😀 awesome
>> On Wed, Apr 6, 2016 at 4:45 AM Uwe Schindler <us...@apache.org>
>> wrote:
>>
>> Hi all,
>>>
>>> I just wanted to repost the following by Chris Mattman on the TIKA list:
>>>
>>> If you have been following the news you’ve seen the Panama papers and how
>>> the world’s rich and elite have been storing all their money offshore to
>>> hide it. Two of the ASF’s key technologies were used in uncovering that
>>> story and showing the world what was going on: Apache Tika and Apache
>>> Solr.
>>>
>>> Solr was used for making the Terabytes of Panama Papers available to
>>> journalists. The preprocessing of the documents for indexing was done
>>> with
>>> Tika (maybe through the contrib/extraction module).
>>>
>>> Here is the article by Forbes about that:
>>>
>>>
>>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>>
>>> Uwe
>>>
>>> -----
>>> Uwe Schindler
>>> uschindler@apache.org
>>> ASF Member, Apache Lucene PMC / Committer
>>> Bremen, Germany
>>> http://lucene.apache.org/
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>> --
>>>
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>>
>
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828
> web: www.flax.co.uk
>
--
Roshan Agarwal
Director sales
Siddhast® Ip innovation (P) ltd
907 chandra vihar colony
Jhansi-284002
M:+917376314900
Re: Apache Solr and Tika used to index Panama Papers
Posted by Charlie Hull <ch...@flax.co.uk>.
This isn't the first time a global news organisation has used Solr to
index leaked data, unsurprisingly - for creating something fast &
quietly, open source is a natural choice.
Charlie
On 06/04/2016 13:57, David Smiley wrote:
> 😀 awesome
> On Wed, Apr 6, 2016 at 4:45 AM Uwe Schindler <us...@apache.org> wrote:
>
>> Hi all,
>>
>> I just wanted to repost the following by Chris Mattman on the TIKA list:
>>
>> If you have been following the news you’ve seen the Panama papers and how
>> the world’s rich and elite have been storing all their money offshore to
>> hide it. Two of the ASF’s key technologies were used in uncovering that
>> story and showing the world what was going on: Apache Tika and Apache Solr.
>>
>> Solr was used for making the Terabytes of Panama Papers available to
>> journalists. The preprocessing of the documents for indexing was done with
>> Tika (maybe through the contrib/extraction module).
>>
>> Here is the article by Forbes about that:
>>
>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> uschindler@apache.org
>> ASF Member, Apache Lucene PMC / Committer
>> Bremen, Germany
>> http://lucene.apache.org/
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
--
Charlie Hull
Flax - Open Source Enterprise Search
tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828
web: www.flax.co.uk
Re: Apache Solr and Tika used to index Panama Papers
Posted by David Smiley <da...@gmail.com>.
😀 awesome
On Wed, Apr 6, 2016 at 4:45 AM Uwe Schindler <us...@apache.org> wrote:
> Hi all,
>
> I just wanted to repost the following by Chris Mattman on the TIKA list:
>
> If you have been following the news you’ve seen the Panama papers and how
> the world’s rich and elite have been storing all their money offshore to
> hide it. Two of the ASF’s key technologies were used in uncovering that
> story and showing the world what was going on: Apache Tika and Apache Solr.
>
> Solr was used for making the Terabytes of Panama Papers available to
> journalists. The preprocessing of the documents for indexing was done with
> Tika (maybe through the contrib/extraction module).
>
> Here is the article by Forbes about that:
>
> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>
> Uwe
>
> -----
> Uwe Schindler
> uschindler@apache.org
> ASF Member, Apache Lucene PMC / Committer
> Bremen, Germany
> http://lucene.apache.org/
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
Re: Apache Solr and Tika used to index Panama Papers
Posted by Ted Dunning <te...@gmail.com>.
GPL != license
The GPL is only one of many open source licenses.
Presumably you knew that and this was just a slip of the keystroke.
On Thu, Apr 7, 2016 at 10:40 AM, Klaus Ramelow <Kl...@gmx.de> wrote:
> in my opinion,
> it is the "nature" of open source to be open to everybody who is
> interested in it
> and use it and / or modify it under the respective GPL
>
> Klaus
>
>
> On 07.04.2016 at 18:38, SIDDHAST® Roshan wrote:
>
>> It is not necessary that open source is available to you. open source mean
>> that code is open to client. Now it is on Client how he provide it or sell
>> it . If client further sells it he or she shall also open the code.
>> Hope you got it
>> Roshan
>> On Apr 7, 2016 8:55 PM, "Jack Krupansky" <ja...@gmail.com>
>> wrote:
>>
>> Hmmm... I seem to have missed it, but remind me where the link is for
>>> public access? I mean, if this is all open source, it should be available
>>> to me, right?
>>>
>>> -- Jack Krupansky
>>>
>>> On Thu, Apr 7, 2016 at 6:52 AM, Erik Hatcher <er...@gmail.com>
>>> wrote:
>>>
>>> Also of note, Blacklight was used for the Solr-based UI -
>>>> http://projectblacklight.org
>>>>
>>>> And another link about the data analysis process -
>>>>
>>>>
>>> https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
>>>
>>>> "Layered on top was the shiny interface, built using Blacklight, another
>>>> open source development."
>>>>
>>>>
>>>>
>>>> On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I just wanted to repost the following by Chris Mattman on the TIKA
>>>>>
>>>> list:
>>>
>>>> If you have been following the news you’ve seen the Panama papers and
>>>>>
>>>> how the world’s rich and elite have been storing all their money
>>>> offshore
>>>> to hide it. Two of the ASF’s key technologies were used in uncovering
>>>>
>>> that
>>>
>>>> story and showing the world what was going on: Apache Tika and Apache
>>>>
>>> Solr.
>>>
>>>> Solr was used for making the Terabytes of Panama Papers available to
>>>>>
>>>> journalists. The preprocessing of the documents for indexing was done
>>>>
>>> with
>>>
>>>> Tika (maybe through the contrib/extraction module).
>>>>
>>>>> Here is the article by Forbes about that:
>>>>>
>>>>>
>>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>>
>>>> Uwe
>>>>>
>>>>> -----
>>>>> Uwe Schindler
>>>>> uschindler@apache.org
>>>>> ASF Member, Apache Lucene PMC / Committer
>>>>> Bremen, Germany
>>>>> http://lucene.apache.org/
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>
>>>>>
> --
> Mail attachment - Denials
>
> /D e n i a l s/
>
> form the basis
>
> in politics ...
>
> (Klaus Ramelow 2015)
>
>
Re: Apache Solr and Tika used to index Panama Papers
Posted by SIDDHAST® Roshan <ro...@siddhast.com>.
Open source just means the code is opened. Whatever rights you have, you
transfer those rights to your client.
Roshan
On Apr 7, 2016 11:11 PM, "Klaus Ramelow" <Kl...@gmx.de> wrote:
> in my opinion,
> it is the "nature" of open source to be open to everybody who is
> interested in it
> and use it and / or modify it under the respective GPL
>
> Klaus
>
> On 07.04.2016 at 18:38, SIDDHAST® Roshan wrote:
>
>> It is not necessary that open source is available to you. open source mean
>> that code is open to client. Now it is on Client how he provide it or sell
>> it . If client further sells it he or she shall also open the code.
>> Hope you got it
>> Roshan
>> On Apr 7, 2016 8:55 PM, "Jack Krupansky" <ja...@gmail.com>
>> wrote:
>>
>> Hmmm... I seem to have missed it, but remind me where the link is for
>>> public access? I mean, if this is all open source, it should be available
>>> to me, right?
>>>
>>> -- Jack Krupansky
>>>
>>> On Thu, Apr 7, 2016 at 6:52 AM, Erik Hatcher <er...@gmail.com>
>>> wrote:
>>>
>>> Also of note, Blacklight was used for the Solr-based UI -
>>>> http://projectblacklight.org
>>>>
>>>> And another link about the data analysis process -
>>>>
>>>>
>>> https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
>>>
>>>> "Layered on top was the shiny interface, built using Blacklight, another
>>>> open source development."
>>>>
>>>>
>>>>
>>>> On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I just wanted to repost the following by Chris Mattman on the TIKA
>>>>>
>>>> list:
>>>
>>>> If you have been following the news you’ve seen the Panama papers and
>>>>>
>>>> how the world’s rich and elite have been storing all their money
>>>> offshore
>>>> to hide it. Two of the ASF’s key technologies were used in uncovering
>>>>
>>> that
>>>
>>>> story and showing the world what was going on: Apache Tika and Apache
>>>>
>>> Solr.
>>>
>>>> Solr was used for making the Terabytes of Panama Papers available to
>>>>>
>>>> journalists. The preprocessing of the documents for indexing was done
>>>>
>>> with
>>>
>>>> Tika (maybe through the contrib/extraction module).
>>>>
>>>>> Here is the article by Forbes about that:
>>>>>
>>>>>
>>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>>
>>>> Uwe
>>>>>
>>>>> -----
>>>>> Uwe Schindler
>>>>> uschindler@apache.org
>>>>> ASF Member, Apache Lucene PMC / Committer
>>>>> Bremen, Germany
>>>>> http://lucene.apache.org/
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>
>>>>>
> --
> Mail attachment - Denials
>
> /D e n i a l s/
>
> form the basis
>
> in politics ...
>
> (Klaus Ramelow 2015)
>
>
Re: Apache Solr and Tika used to index Panama Papers
Posted by Jack Krupansky <ja...@gmail.com>.
LOL...
"WikiLeaks criticizes lack of access to Panama Papers"
"Whistleblowing group WikiLeaks criticized the International Consortium of
Investigative Journalists' decision not to allow open access to documents
that show how wealthy people have links to offshore financial services. "If
you censor more than 99% of the documents you are engaged in 1% journalism
by definition," WikiLeaks said in a tweet Wednesday."
See:
http://www.usatoday.com/story/news/world/2016/04/07/wikileaks-criticizes-lack-access-panama-papers/82736064/
Interesting that I now find myself on Julian Assange's side of the fence!
-- Jack Krupansky
On Thu, Apr 7, 2016 at 1:40 PM, Klaus Ramelow <Kl...@gmx.de> wrote:
> in my opinion,
> it is the "nature" of open source to be open to everybody who is
> interested in it
> and use it and / or modify it under the respective GPL
>
> Klaus
>
>
> On 07.04.2016 at 18:38, SIDDHAST® Roshan wrote:
>
>> It is not necessary that open source is available to you. open source mean
>> that code is open to client. Now it is on Client how he provide it or sell
>> it . If client further sells it he or she shall also open the code.
>> Hope you got it
>> Roshan
>> On Apr 7, 2016 8:55 PM, "Jack Krupansky" <ja...@gmail.com>
>> wrote:
>>
>> Hmmm... I seem to have missed it, but remind me where the link is for
>>> public access? I mean, if this is all open source, it should be available
>>> to me, right?
>>>
>>> -- Jack Krupansky
>>>
>>> On Thu, Apr 7, 2016 at 6:52 AM, Erik Hatcher <er...@gmail.com>
>>> wrote:
>>>
>>> Also of note, Blacklight was used for the Solr-based UI -
>>>> http://projectblacklight.org
>>>>
>>>> And another link about the data analysis process -
>>>>
>>>>
>>> https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
>>>
>>>> "Layered on top was the shiny interface, built using Blacklight, another
>>>> open source development."
>>>>
>>>>
>>>>
>>>> On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I just wanted to repost the following by Chris Mattman on the TIKA
>>>>>
>>>> list:
>>>
>>>> If you have been following the news you’ve seen the Panama papers and
>>>>>
>>>> how the world’s rich and elite have been storing all their money
>>>> offshore
>>>> to hide it. Two of the ASF’s key technologies were used in uncovering
>>>>
>>> that
>>>
>>>> story and showing the world what was going on: Apache Tika and Apache
>>>>
>>> Solr.
>>>
>>>> Solr was used for making the Terabytes of Panama Papers available to
>>>>>
>>>> journalists. The preprocessing of the documents for indexing was done
>>>>
>>> with
>>>
>>>> Tika (maybe through the contrib/extraction module).
>>>>
>>>>> Here is the article by Forbes about that:
>>>>>
>>>>>
>>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>>
>>>> Uwe
>>>>>
>>>>> -----
>>>>> Uwe Schindler
>>>>> uschindler@apache.org
>>>>> ASF Member, Apache Lucene PMC / Committer
>>>>> Bremen, Germany
>>>>> http://lucene.apache.org/
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>
>>>>>
> --
> Mail attachment - Denials
>
> /D e n i a l s/
>
> form the basis
>
> in politics ...
>
> (Klaus Ramelow 2015)
>
>
Re: Apache Solr and Tika used to index Panama Papers
Posted by Klaus Ramelow <Kl...@gmx.de>.
In my opinion,
it is the "nature" of open source to be open to everybody who is
interested in it,
and to use and/or modify it under the respective GPL
Klaus
On 07.04.2016 at 18:38, SIDDHAST® Roshan wrote:
> It is not necessary that open source is available to you. open source mean
> that code is open to client. Now it is on Client how he provide it or sell
> it . If client further sells it he or she shall also open the code.
> Hope you got it
> Roshan
> On Apr 7, 2016 8:55 PM, "Jack Krupansky" <ja...@gmail.com> wrote:
>
>> Hmmm... I seem to have missed it, but remind me where the link is for
>> public access? I mean, if this is all open source, it should be available
>> to me, right?
>>
>> -- Jack Krupansky
>>
>> On Thu, Apr 7, 2016 at 6:52 AM, Erik Hatcher <er...@gmail.com>
>> wrote:
>>
>>> Also of note, Blacklight was used for the Solr-based UI -
>>> http://projectblacklight.org
>>>
>>> And another link about the data analysis process -
>>>
>> https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
>>> "Layered on top was the shiny interface, built using Blacklight, another
>>> open source development."
>>>
>>>
>>>
>>>> On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I just wanted to repost the following by Chris Mattman on the TIKA
>> list:
>>>> If you have been following the news you’ve seen the Panama papers and
>>> how the world’s rich and elite have been storing all their money offshore
>>> to hide it. Two of the ASF’s key technologies were used in uncovering
>> that
>>> story and showing the world what was going on: Apache Tika and Apache
>> Solr.
>>>> Solr was used for making the Terabytes of Panama Papers available to
>>> journalists. The preprocessing of the documents for indexing was done
>> with
>>> Tika (maybe through the contrib/extraction module).
>>>> Here is the article by Forbes about that:
>>>>
>> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>>>> Uwe
>>>>
>>>> -----
>>>> Uwe Schindler
>>>> uschindler@apache.org
>>>> ASF Member, Apache Lucene PMC / Committer
>>>> Bremen, Germany
>>>> http://lucene.apache.org/
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>
--
Mail attachment - Denials
/D e n i a l s/
form the basis
in politics ...
(Klaus Ramelow 2015)
Re: Apache Solr and Tika used to index Panama Papers
Posted by SIDDHAST® Roshan <ro...@siddhast.com>.
Open source does not necessarily mean it is available to you. Open source means
that the code is open to the client. It is then up to the client how he provides
or sells it. If the client sells it on, he or she shall also open the code.
Hope you got it
Roshan
On Apr 7, 2016 8:55 PM, "Jack Krupansky" <ja...@gmail.com> wrote:
> Hmmm... I seem to have missed it, but remind me where the link is for
> public access? I mean, if this is all open source, it should be available
> to me, right?
>
> -- Jack Krupansky
>
> On Thu, Apr 7, 2016 at 6:52 AM, Erik Hatcher <er...@gmail.com>
> wrote:
>
> > Also of note, Blacklight was used for the Solr-based UI -
> > http://projectblacklight.org
> >
> > And another link about the data analysis process -
> >
> https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
> >
> > "Layered on top was the shiny interface, built using Blacklight, another
> > open source development."
> >
> >
> >
> > > On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
> > >
> > > Hi all,
> > >
> > > I just wanted to repost the following by Chris Mattman on the TIKA
> list:
> > >
> > > If you have been following the news you’ve seen the Panama papers and
> > how the world’s rich and elite have been storing all their money offshore
> > to hide it. Two of the ASF’s key technologies were used in uncovering
> that
> > story and showing the world what was going on: Apache Tika and Apache
> Solr.
> > >
> > > Solr was used for making the Terabytes of Panama Papers available to
> > journalists. The preprocessing of the documents for indexing was done
> with
> > Tika (maybe through the contrib/extraction module).
> > >
> > > Here is the article by Forbes about that:
> > >
> >
> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
> > >
> > > Uwe
> > >
> > > -----
> > > Uwe Schindler
> > > uschindler@apache.org
> > > ASF Member, Apache Lucene PMC / Committer
> > > Bremen, Germany
> > > http://lucene.apache.org/
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: dev-help@lucene.apache.org
> > >
> >
>
Re: Apache Solr and Tika used to index Panama Papers
Posted by Jack Krupansky <ja...@gmail.com>.
Hmmm... I seem to have missed it, but remind me where the link is for
public access? I mean, if this is all open source, it should be available
to me, right?
-- Jack Krupansky
On Thu, Apr 7, 2016 at 6:52 AM, Erik Hatcher <er...@gmail.com> wrote:
> Also of note, Blacklight was used for the Solr-based UI -
> http://projectblacklight.org
>
> And another link about the data analysis process -
> https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
>
> "Layered on top was the shiny interface, built using Blacklight, another
> open source development."
>
>
>
> > On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
> >
> > Hi all,
> >
> > I just wanted to repost the following by Chris Mattman on the TIKA list:
> >
> > If you have been following the news you’ve seen the Panama papers and
> how the world’s rich and elite have been storing all their money offshore
> to hide it. Two of the ASF’s key technologies were used in uncovering that
> story and showing the world what was going on: Apache Tika and Apache Solr.
> >
> > Solr was used for making the Terabytes of Panama Papers available to
> journalists. The preprocessing of the documents for indexing was done with
> Tika (maybe through the contrib/extraction module).
> >
> > Here is the article by Forbes about that:
> >
> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > uschindler@apache.org
> > ASF Member, Apache Lucene PMC / Committer
> > Bremen, Germany
> > http://lucene.apache.org/
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
> >
>
Re: Apache Solr and Tika used to index Panama Papers
Posted by Erik Hatcher <er...@gmail.com>.
Also of note, Blacklight was used for the Solr-based UI - http://projectblacklight.org
And another link about the data analysis process - https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
"Layered on top was the shiny interface, built using Blacklight, another open source development."
> On Apr 6, 2016, at 04:45, Uwe Schindler <us...@apache.org> wrote:
>
> Hi all,
>
> I just wanted to repost the following by Chris Mattman on the TIKA list:
>
> If you have been following the news you’ve seen the Panama papers and how the world’s rich and elite have been storing all their money offshore to hide it. Two of the ASF’s key technologies were used in uncovering that story and showing the world what was going on: Apache Tika and Apache Solr.
>
> Solr was used for making the Terabytes of Panama Papers available to journalists. The preprocessing of the documents for indexing was done with Tika (maybe through the contrib/extraction module).
>
> Here is the article by Forbes about that:
> http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak
>
> Uwe
>
> -----
> Uwe Schindler
> uschindler@apache.org
> ASF Member, Apache Lucene PMC / Committer
> Bremen, Germany
> http://lucene.apache.org/
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>