You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by Andreas Beeker <ki...@apache.org> on 2020/08/09 17:19:51 UTC

Factory clean up

Hi Devs,

I'm currently cleaning up the various X**F / H**F factories incl. extractors, with the following changes (= API breaks):


a) provide service locator pattern to conform with JigSaw - see WorkbookFactory / -Provider on how it would look like.
Although chances are that won't work with OSGi, we need that approach with JigSaw.
There are workarounds for this: http://www.basepatterns.org/java/2009/09/30/osgi-service-locator.html.
Btw. the integration tests are currently not executing the OOXML extractor, as I forgot to update a reflection class name string.


b) the above leads to service provider instances, therefore I replace static factory methods with instance methods.
Furthermore I rename all methods like "createXY" (createWorkbook) to "create(...)"
this is just more straight forward than e.g. POIXMLExtractorFactory a; a.createExtractor(...) - i.e. I already have the extractor factory handle, why would I need to repeat that I'd like to create an extractor ...
I haven't planed to provide a generic factory interface yet, but that would make things easier later on.


c) remove main() methods in the extractor. Those look like test methods and I don't want our source verification tools to complain about poor command line handling


d) use interfaces instead of abstract classes - ONLY POI*TextExtractor
those abstract classes only add minimal logic which can be handled by default methods.
on the other side I have problems with common classes like SlideShowExtractor and a new sub extractor for OOXML offering additional OOXML logic when keeping the abstract classes


e) removing a few obsolete constructs like PowerPointExtractor


f) rename factories
org.apache.poi.ooxml.extractor.ExtractorFactory -> org.apache.poi.ooxml.extractor.POIXMLExtractorFactory
org.apache.poi.extractor.OLE2ExtractorFactory -> org.apache.poi.extractor.ExtractorFactory
It's confusing that the specialized factory has a generic name and vice versa.


Andi



Re: Factory clean up

Posted by Yegor Kozlov <ye...@dinom.ru>.
Hi Andi,

"Provide-Capability"  and "Require-Capability"  are needed if we rely on
the Aries Spi-Fly mechanism to locate services in OSGi. The key point is
that these capabilities are provided by a third-party bundle (Spi-Fly) and
the POI bundle won't start if they are not installed.

I thought about (b) too. If ServiceLoader returned none it could be a
symptom of running in OSGi and the code can fallback to reflection. It
might work, but we will need to pass the class names which means we will
need to hardcode them somewhere...  It does not smell good to me and (a) is
a better option.

Regards,
Yegor


On Tue, Sep 22, 2020 at 1:18 AM Andreas Beeker <ki...@apache.org> wrote:

> Hi Yegor,
>
> I hoped there would be some OSGI magic with the "Provide-Capability"
> manifest entries [2], but I guess you've tried that all.
>
> WRT your approach ... I see two implementation options:
> a) register the providers by user code - as you sketched it
> b) or fallback in the WorkbookFactory and try to reflect the constructors
>
> Option b) has the drawback, that this fails in JMPS - I'm not sure if OSGI
> and JMPS are actually used together, but the combination looks troublesome
> [3] ... at least I don't understand it without getting my hands dirty ...
>
> Option a) seems to be straight forward, but also think about unregistering
> the providers in case the OSGi modules get unloaded.
> When thinking about the dynamic nature of OSGi, I don't like the
> modifications in the static singleton - but on the other hand, if the OSGI
> module, which uses POI, gets unloaded, the classloader referencing the
> static class becomes invalid anyway and so should be eligible for garbage
> collection - correct?
>
> So you have my +1 for a)
>
> Best wishes,
> Andi
>
>
> [2]
> https://stackoverflow.com/questions/48674868/in-osgi-serviceloader-load-fails-to-find-an-implementation
> [3] https://blog.osgi.org/2016/08/osgi-with-java-modules-all-way-down.html
>
> https://docs.google.com/presentation/d/15HMGH8_LMDIVng1lVqKtV8pp2IpgHR4TVsOwQMdPUTk/edit#slide=id.g1cb5367fdc_0_0
>
> On 21.09.20 10:49, Yegor Kozlov wrote:
> > Hi Andi,
> >
> > I'm working on the POI OSGi bundle and it's mostly working except for the
> > places using java.util.ServiceLoader which returns none in OSGi
> containers.
> > As a result, Workbook, SlideShow and Extractor factories are not
> working  .
> >
> > Apache Aries SPI Fly [1] solves this problem but it  seems like overkill
> to
> > me. This approach requires installing two additional bundles and I'd like
> > to avoid that and keep the POI bundle self-contained. I don't want POI to
> > depend on OSGi libraries either and do conditional initialization like
> > if( running in OSGI) {
> >   ...
> > } else {
> >   ServiceLoader.load(WorkbookProvider.class).forEach(....);
> > }
> > It will require the osgi api in the classpath which is not good.
> >
> > I'm thinking of a simpler solution: if ServiceLoader cannot locate the
> > services in OSGi, we can register them from the bundle Activator.
> > The idea is to extend the factory classes with a addProvider() method,
> > something like
> >
> >     // to be called *only* from bundle activator
> >     public static void addProvider(WorkbookProvider provider){
> >         // add to the list of registered providers.
> >     }
> >
> >
> >   and then the bundle would register the HSSF and XSSF implementations on
> > startup:
> >
> >     @Override
> >     public void start(BundleContext context) throws Exception{
> >         WorkbookFactory.addProvider(new HSSFWorkbookFactory());
> >         WorkbookFactory.addProvider(new XSSFWorkbookFactory());
> >         ...
> >   }
> >
> > This way POI remains OSGi-agnostic  and provides an option to register
> > services from OSGi.
> >
> > If there are no objections I'd like to check it in.
> >
> > [1] http://aries.apache.org/modules/spi-fly.html
> >
> > Regards,
> > Yegor
> >
> >
> > On Sun, Aug 9, 2020 at 8:19 PM Andreas Beeker <ki...@apache.org>
> wrote:
> >
> >> Hi Devs,
> >>
> >> I'm currently cleaning up the various X**F / H**F factories incl.
> >> extractors, with the following changes (= API breaks):
> >>
> >>
> >> a) provide service locator pattern to conform with JigSaw - see
> >> WorkbookFactory / -Provider on how it would look like.
> >> Although chances are that won't work with OSGi, we need that approach
> with
> >> JigSaw.
> >> There are workarounds for this:
> >> http://www.basepatterns.org/java/2009/09/30/osgi-service-locator.html.
> >> Btw. the integration tests are currently not executing the OOXML
> >> extractor, as I forgot to update a reflection class name string.
> >>
> >>
> >> b) the above leads to service provider instances, therefore I replace
> >> static factory methods with instance methods.
> >> Furthermore I rename all methods like "createXY" (createWorkbook) to
> >> "create(...)"
> >> this is just more straight forward than e.g. POIXMLExtractorFactory a;
> >> a.createExtractor(...) - i.e. I already have the extractor factory
> handle,
> >> why would I need to repeat that I'd like to create an extractor ...
> >> I haven't planed to provide a generic factory interface yet, but that
> >> would make things easier later on.
> >>
> >>
> >> c) remove main() methods in the extractor. Those look like test methods
> >> and I don't want our source verification tools to complain about poor
> >> command line handling
> >>
> >>
> >> d) use interfaces instead of abstract classes - ONLY POI*TextExtractor
> >> those abstract classes only add minimal logic which can be handled by
> >> default methods.
> >> on the other side I have problems with common classes like
> >> SlideShowExtractor and a new sub extractor for OOXML offering additional
> >> OOXML logic when keeping the abstract classes
> >>
> >>
> >> e) removing a few obsolete constructs like PowerPointExtractor
> >>
> >>
> >> f) rename factories
> >> org.apache.poi.ooxml.extractor.ExtractorFactory ->
> >> org.apache.poi.ooxml.extractor.POIXMLExtractorFactory
> >> org.apache.poi.extractor.OLE2ExtractorFactory ->
> >> org.apache.poi.extractor.ExtractorFactory
> >> It's confusing that the specialized factory has a generic name and vice
> >> versa.
> >>
> >>
> >> Andi
> >>
> >>
> >>
>
>
>

Re: Factory clean up

Posted by Andreas Beeker <ki...@apache.org>.
Hi Yegor,

I hoped there would be some OSGI magic with the "Provide-Capability" manifest entries [2], but I guess you've tried that all.

WRT your approach ... I see two implementation options:
a) register the providers by user code - as you sketched it
b) or fallback in the WorkbookFactory and try to reflect the constructors

Option b) has the drawback, that this fails in JMPS - I'm not sure if OSGI and JMPS are actually used together, but the combination looks troublesome [3] ... at least I don't understand it without getting my hands dirty ...

Option a) seems to be straight forward, but also think about unregistering the providers in case the OSGi modules get unloaded.
When thinking about the dynamic nature of OSGi, I don't like the modifications in the static singleton - but on the other hand, if the OSGI module, which uses POI, gets unloaded, the classloader referencing the static class becomes invalid anyway and so should be eligible for garbage collection - correct?

So you have my +1 for a)

Best wishes,
Andi


[2] https://stackoverflow.com/questions/48674868/in-osgi-serviceloader-load-fails-to-find-an-implementation
[3] https://blog.osgi.org/2016/08/osgi-with-java-modules-all-way-down.html
https://docs.google.com/presentation/d/15HMGH8_LMDIVng1lVqKtV8pp2IpgHR4TVsOwQMdPUTk/edit#slide=id.g1cb5367fdc_0_0

On 21.09.20 10:49, Yegor Kozlov wrote:
> Hi Andi,
>
> I'm working on the POI OSGi bundle and it's mostly working except for the
> places using java.util.ServiceLoader which returns none in OSGi containers.
> As a result, Workbook, SlideShow and Extractor factories are not working  .
>
> Apache Aries SPI Fly [1] solves this problem but it  seems like overkill to
> me. This approach requires installing two additional bundles and I'd like
> to avoid that and keep the POI bundle self-contained. I don't want POI to
> depend on OSGi libraries either and do conditional initialization like
> if( running in OSGI) {
>   ...
> } else {
>   ServiceLoader.load(WorkbookProvider.class).forEach(....);
> }
> It will require the osgi api in the classpath which is not good.
>
> I'm thinking of a simpler solution: if ServiceLoader cannot locate the
> services in OSGi, we can register them from the bundle Activator.
> The idea is to extend the factory classes with a addProvider() method,
> something like
>
>     // to be called *only* from bundle activator
>     public static void addProvider(WorkbookProvider provider){
>         // add to the list of registered providers.
>     }
>
>
>   and then the bundle would register the HSSF and XSSF implementations on
> startup:
>
>     @Override
>     public void start(BundleContext context) throws Exception{
>         WorkbookFactory.addProvider(new HSSFWorkbookFactory());
>         WorkbookFactory.addProvider(new XSSFWorkbookFactory());
>         ...
>   }
>
> This way POI remains OSGi-agnostic  and provides an option to register
> services from OSGi.
>
> If there are no objections I'd like to check it in.
>
> [1] http://aries.apache.org/modules/spi-fly.html
>
> Regards,
> Yegor
>
>
> On Sun, Aug 9, 2020 at 8:19 PM Andreas Beeker <ki...@apache.org> wrote:
>
>> Hi Devs,
>>
>> I'm currently cleaning up the various X**F / H**F factories incl.
>> extractors, with the following changes (= API breaks):
>>
>>
>> a) provide service locator pattern to conform with JigSaw - see
>> WorkbookFactory / -Provider on how it would look like.
>> Although chances are that won't work with OSGi, we need that approach with
>> JigSaw.
>> There are workarounds for this:
>> http://www.basepatterns.org/java/2009/09/30/osgi-service-locator.html.
>> Btw. the integration tests are currently not executing the OOXML
>> extractor, as I forgot to update a reflection class name string.
>>
>>
>> b) the above leads to service provider instances, therefore I replace
>> static factory methods with instance methods.
>> Furthermore I rename all methods like "createXY" (createWorkbook) to
>> "create(...)"
>> this is just more straight forward than e.g. POIXMLExtractorFactory a;
>> a.createExtractor(...) - i.e. I already have the extractor factory handle,
>> why would I need to repeat that I'd like to create an extractor ...
>> I haven't planed to provide a generic factory interface yet, but that
>> would make things easier later on.
>>
>>
>> c) remove main() methods in the extractor. Those look like test methods
>> and I don't want our source verification tools to complain about poor
>> command line handling
>>
>>
>> d) use interfaces instead of abstract classes - ONLY POI*TextExtractor
>> those abstract classes only add minimal logic which can be handled by
>> default methods.
>> on the other side I have problems with common classes like
>> SlideShowExtractor and a new sub extractor for OOXML offering additional
>> OOXML logic when keeping the abstract classes
>>
>>
>> e) removing a few obsolete constructs like PowerPointExtractor
>>
>>
>> f) rename factories
>> org.apache.poi.ooxml.extractor.ExtractorFactory ->
>> org.apache.poi.ooxml.extractor.POIXMLExtractorFactory
>> org.apache.poi.extractor.OLE2ExtractorFactory ->
>> org.apache.poi.extractor.ExtractorFactory
>> It's confusing that the specialized factory has a generic name and vice
>> versa.
>>
>>
>> Andi
>>
>>
>>



Re: Factory clean up

Posted by Yegor Kozlov <ye...@dinom.ru>.
Hi Andi,

I'm working on the POI OSGi bundle and it's mostly working except for the
places using java.util.ServiceLoader which returns none in OSGi containers.
As a result, Workbook, SlideShow and Extractor factories are not working  .

Apache Aries SPI Fly [1] solves this problem but it  seems like overkill to
me. This approach requires installing two additional bundles and I'd like
to avoid that and keep the POI bundle self-contained. I don't want POI to
depend on OSGi libraries either and do conditional initialization like
if( running in OSGI) {
  ...
} else {
  ServiceLoader.load(WorkbookProvider.class).forEach(....);
}
It will require the osgi api in the classpath which is not good.

I'm thinking of a simpler solution: if ServiceLoader cannot locate the
services in OSGi, we can register them from the bundle Activator.
The idea is to extend the factory classes with a addProvider() method,
something like

    // to be called *only* from bundle activator
    public static void addProvider(WorkbookProvider provider){
        // add to the list of registered providers.
    }


  and then the bundle would register the HSSF and XSSF implementations on
startup:

    @Override
    public void start(BundleContext context) throws Exception{
        WorkbookFactory.addProvider(new HSSFWorkbookFactory());
        WorkbookFactory.addProvider(new XSSFWorkbookFactory());
        ...
  }

This way POI remains OSGi-agnostic  and provides an option to register
services from OSGi.

If there are no objections I'd like to check it in.

[1] http://aries.apache.org/modules/spi-fly.html

Regards,
Yegor


On Sun, Aug 9, 2020 at 8:19 PM Andreas Beeker <ki...@apache.org> wrote:

> Hi Devs,
>
> I'm currently cleaning up the various X**F / H**F factories incl.
> extractors, with the following changes (= API breaks):
>
>
> a) provide service locator pattern to conform with JigSaw - see
> WorkbookFactory / -Provider on how it would look like.
> Although chances are that won't work with OSGi, we need that approach with
> JigSaw.
> There are workarounds for this:
> http://www.basepatterns.org/java/2009/09/30/osgi-service-locator.html.
> Btw. the integration tests are currently not executing the OOXML
> extractor, as I forgot to update a reflection class name string.
>
>
> b) the above leads to service provider instances, therefore I replace
> static factory methods with instance methods.
> Furthermore I rename all methods like "createXY" (createWorkbook) to
> "create(...)"
> this is just more straight forward than e.g. POIXMLExtractorFactory a;
> a.createExtractor(...) - i.e. I already have the extractor factory handle,
> why would I need to repeat that I'd like to create an extractor ...
> I haven't planed to provide a generic factory interface yet, but that
> would make things easier later on.
>
>
> c) remove main() methods in the extractor. Those look like test methods
> and I don't want our source verification tools to complain about poor
> command line handling
>
>
> d) use interfaces instead of abstract classes - ONLY POI*TextExtractor
> those abstract classes only add minimal logic which can be handled by
> default methods.
> on the other side I have problems with common classes like
> SlideShowExtractor and a new sub extractor for OOXML offering additional
> OOXML logic when keeping the abstract classes
>
>
> e) removing a few obsolete constructs like PowerPointExtractor
>
>
> f) rename factories
> org.apache.poi.ooxml.extractor.ExtractorFactory ->
> org.apache.poi.ooxml.extractor.POIXMLExtractorFactory
> org.apache.poi.extractor.OLE2ExtractorFactory ->
> org.apache.poi.extractor.ExtractorFactory
> It's confusing that the specialized factory has a generic name and vice
> versa.
>
>
> Andi
>
>
>