Posted to solr-user@lucene.apache.org by Matthew Roth <ma...@yale.edu> on 2016/10/17 14:01:04 UTC

PDF writer

Hi Group,

Is there a documented or preferred path to have a PDF response writer? I am
using Solr 5.3.x for an internal project. I have an XSL-FO transformation
that I am able to return via the XSLT response writer. Is there a
documented way to produce a PDF via Solr? Alternatively, I was thinking of
passing the response through an eXist-db instance [0] we have running.
However, a PDF response writer would be ideal.

Best,
Matt

[0] http://exist-db.org/

Re: PDF writer

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.

It's not fun to build a .pdf this way, but this may help...

http://itextpdf.com/

Re: PDF writer

Posted by Matthew Roth <ma...@yale.edu>.

Thanks Erick. That is as anticipated. Scouring my other resources didn't
indicate the existence of a PDF writer. I thought I'd try the group before
embarking on a custom solution.


Matt

Re: PDF writer

Posted by Erick Erickson <er...@gmail.com>.

There's no PDF writer that I know of, and I doubt there's much
enthusiasm for creating one as part of Solr. ResponseWriters are
pluggable so this would certainly be possible.

At root, in a response writer you just have a map of key/value pairs
(it's a little more complicated than that, but not much) that you can
do whatever you want with, either on Solr or on a SolrJ client.

Not much help I know...

Best,
Erick
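
The "map of key/value pairs" point above can be illustrated on the client
side. This is only a minimal sketch, assuming the standard JSON response
layout (wt=json); the sample data and the field names "id" and "title" are
hypothetical and do not come from this thread.

```python
import json

# Hypothetical sample shaped like a standard Solr JSON (wt=json) response.
# The field names "id" and "title" are assumptions for illustration only.
SAMPLE = """
{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "doc-1", "title": "First report"},
      {"id": "doc-2", "title": "Second report"}
    ]
  }
}
"""


def report_lines(raw_json):
    """Turn a Solr JSON response into plain-text report lines."""
    body = json.loads(raw_json)["response"]
    lines = ["%d result(s)" % body["numFound"]]
    for doc in body["docs"]:
        # Each document is just a mapping of field name to value.
        fields = ", ".join("%s=%s" % (name, doc[name]) for name in sorted(doc))
        lines.append(fields)
    return lines


if __name__ == "__main__":
    for line in report_lines(SAMPLE):
        print(line)
```

The same walk over the documents is where a custom report format would be
produced, whether that is text, XSL-FO, or input to a PDF library.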

Re: PDF writer

Posted by Matthew Roth <mg...@gmail.com>.

> I think this is the best option.

I really do too once I think about it some more. Rubber Ducky strikes
again. Once I say it aloud--in this case type it out--it seems much clearer
what the answer is to this question.

Thanks again. I've really appreciated all the feedback on this question.

Matt


RE: PDF writer

Posted by "Davis, Daniel (NIH/NLM) [C]" <da...@nih.gov>.

If the PDF report is truly a report, I agree with this. We have a use case with IBM InfoSphere Watson Explorer where our users want a PDF report on the results for their query to be generated on the fly. They can then save the query and have the report emailed to them :) Not only is Solr middleware; search engines in general should be middleware, because these sorts of business requirements keep coming up. We've invested a lot in IBM InfoSphere Watson Explorer because it can create a GUI for us, but that often ends up biting you in the end.

This creates search UIs that are maintained by the "search team" while the corresponding application is maintained by the "developer team", and so look and feel can often be replicated while using different HTML, JavaScript, and CSS. So, updates can be hard, and achieving the same mobile responsive behavior can be nearly impossible.

Search engines *should* be middleware. I value having a back-office for crawling the web that allows a crawl to be defined entirely through a GUI, but question whether it really is much better than a FOSS architecture.

Re: PDF writer

Posted by Alexandre Rafalovitch <ar...@gmail.com>.

On 21 October 2016 at 09:58, Matthew Roth <ma...@yale.edu> wrote:
> . I could always process the upstream relational data to
> produce my PDF reports.

I think this is the best option. This allows you to
mangle/de-normalize your data stored in Solr to be the best fit for
search.

Regards,
   Alex.
----
Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/

Re: PDF writer

Posted by Matthew Roth <ma...@yale.edu>.

Hi Shawn,

Thanks for the thoughtful response on middleware and the solr philosophy.
You are correct and I intend to handle this outside of Solr. This inquiry
was me doing some forethought on a distant project. When I see an
XSLTResponseWriter the jump-to-conclusions part of my brain jumps to PDF.
The separation you are describing is very logical.

At this point I intend to make use of an XSLT response to produce formatting
objects that I will process at a later point in the application. Or maybe I
won't. Solr isn't my upstream source. The data is relational, but my
indexes are in solr. I could always process the upstream relational data to
produce my PDF reports.


Matt
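
For the "produce formatting objects and process them later" route, a client
could request the XSLT response writer's output and store it for a later
rendering step. A rough sketch, assuming the usual wt=xslt and tr request
parameters; the host, the core name "reports", and the stylesheet name
"fo.xsl" are hypothetical placeholders, not details from this thread.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def xslt_request_url(base_url, core, query, stylesheet):
    """Build a select URL that asks Solr to apply an XSLT stylesheet.

    'wt=xslt' selects the XSLT response writer and 'tr' names the
    stylesheet; all argument values here are caller-supplied assumptions.
    """
    params = urlencode({"q": query, "wt": "xslt", "tr": stylesheet})
    return "%s/%s/select?%s" % (base_url.rstrip("/"), core, params)


def fetch_fo(url, timeout=10):
    """Fetch the transformed response (expected to be XSL-FO) as bytes."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    # Hypothetical host, core, and stylesheet names.
    print(xslt_request_url("http://localhost:8983/solr", "reports",
                           "title:budget", "fo.xsl"))
```

The bytes returned by fetch_fo could be written to disk and handed to
whatever FO processor the application uses at a later point.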

Re: PDF writer

Posted by Shawn Heisey <ap...@elyograg.org>.

On 10/17/2016 8:01 AM, Matthew Roth wrote:
> Is there a documented or preferred path to have a PDF response writer?
> I am using solr 5.3.x for an internal project. I have an XSL-FO
> transformation that I am able to return via the XSLT response writer.
> Is there a documented way to produce a PDF via solr? Alternatively, I
> was thinking of passing the response through an eXist-db instance [0]
> we have running. However, a pdf response writer would be ideal.

Solr responses are designed to be processed by a program making a search
query, not read by an end user.  Solr is middleware.  There are multiple
formats (json, xml, javabin) because we do not know what kind of program
will consume the response.

https://en.wikipedia.org/wiki/Middleware

PDF is an end-user format for display and print, not a middleware
response format.  Creating content like that is best handled by other
pieces of software, not Solr.

For best results that fit your needs perfectly, that software is likely
to be something you write yourself.  The Solr project has absolutely no
idea how you will define your schema, or how you would like the data in
a Solr response transformed, integrated, and formatted in a PDF.

Designing the feature you want would be something best handled as a
software project separate from Solr.  The software would take a Solr
response and turn it into a PDF.  It doesn't fit into Solr's core usage,
so making it a part of Solr is not a good fit and unlikely to happen.

No matter where the development for a general feature like that happens,
it would likely take weeks or months of work just to reach alpha
quality.  After that, it would take weeks or months of additional work
to reach release quality ... and even then it probably wouldn't produce
the exact results you want without extensive and complicated
configuration.  Handling complicated configuration is itself very
complicated, which is one reason why development would take so long.

Thanks,
Shawn
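
As a rough illustration of software that sits outside Solr and takes a
response toward a PDF, the sketch below turns a list of result documents
into a minimal XSL-FO document. It is only a sketch under assumptions: the
"title" field is hypothetical, the layout is the bare minimum, and the final
FO-to-PDF rendering is left to a separate FO processor (Apache FOP is one
such tool), which nobody in this thread prescribes.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Qualify a tag name with the XSL-FO namespace."""
    return "{%s}%s" % (FO_NS, tag)


def docs_to_fo(docs, title_field="title"):
    """Build a minimal XSL-FO document with one block per Solr document.

    'title_field' is a hypothetical field name; adjust it to the schema.
    """
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"),
                         {"master-name": "page"})
    ET.SubElement(page, fo("region-body"))
    seq = ET.SubElement(root, fo("page-sequence"),
                        {"master-reference": "page"})
    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})
    for doc in docs:
        block = ET.SubElement(flow, fo("block"))
        block.text = str(doc.get(title_field, ""))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(docs_to_fo([{"title": "First report"}, {"title": "Second report"}]))
```

Keeping this step in the application, rather than in a response writer,
leaves the schema-specific layout decisions with the code that knows them.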