Posted to dev@openoffice.apache.org by Mathias Bauer <Ma...@gmx.net> on 2011/07/18 19:14:46 UTC

A first try to remove some copyleft components from the build

Hi,

I tried to get rid of some copyleft dependencies. As I will leave for
vacation on Wednesday, I am sending my first patch to the list now so
that others can already have a look or even continue. I have only done
it on Linux so far; of course we will need adaptations for other
platforms. I created the patch from the hg repository of OpenOffice.org,
but the differences from our not yet existing svn repository won't be
huge, so it should bring us a little bit closer to a "clean" build.

I also added some more todos to the wiki page.

Meanwhile the license information at

http://wiki.services.openoffice.org/wiki/ApacheMigration

about our external tarballs is nearly complete, so there might be some
more modules that could be worked on before we tackle the "internal"
copyleft dependencies. As I see it, I should already have covered most
dependencies on copyleft-licensed external tarballs, with the exception
of the modules for the svg import (Kai Ahrens announced that he will
work on it) and the linear solver (Niklas Nebel has already created a
first patch for it).

Now some words about the patch.

I wanted to be lazy ;-), so I just added a single configure switch
"--with-disable-copyleft" and did some quick and dirty changes to
configure.in. If someone wants to make the name or the implementation
nicer - please go for it. I never understood when to use "yes/no" or
"true/false" in our configure.in. :-)

With the changes in the patch the following modules won't be built
anymore when the switch is used:

dictionaries
epm
gstreamer
hunspell
hyphen
libwpd
mozilla
mythes
neon
nss
saxon

and some modules or module parts that depend on them. I didn't need to
remove any sources, as all copyleft parts that I removed are external
tarballs. The code using these parts is just "normal" OOo code that can
stay in the repository, though it should be left out of the standard
build at Apache.

The final installation set of the build on Linux is currently just
created as a tar.gz; native packages can't be created as epm is missing.
(I already posted a question about epm to this list, hopefully someone
will be able to answer it.) Thus the build currently finishes with an
error message; nevertheless, the tar.gz was created and the result
basically runs fine.

The patch contains some unfinished work in rhino that is still commented
out. If someone knows how to add conditional compilation to a build.xml
file (I don't): the module "scripting" has parts depending on rhino that
need to be removed from the build.xml in case DISABLE_RHINO is set.
Which parts these are can easily be "detected" (means: found by build
breakages) by removing the comment signs from my changes in rhino before
you start the build.

If you want to try it yourself: get the source from
http://hg.services.openoffice.org/OOO340, apply the patch, call autoconf
and then do the build as usual, following the build instructions in the
ooo wiki. On my Ubuntu 11.04 the new configure switch was the only one I
needed.

For those amongst us that are used to the OOo build system: I treated
configure no longer as part of the build tree, so autoconf is required
after applying the patch. Makes more sense that way, IMHO, and from what
I already learned on this list, it is the preferred way at Apache.

Regards,
Mathias

Re: A first try to remove some copyleft components from the build

Posted by Jürgen Schmidt <jo...@googlemail.com>.
On Mon, Jul 18, 2011 at 7:14 PM, Mathias Bauer <Ma...@gmx.net> wrote:

> Hi,
>
> I tried to get rid of some copyleft dependencies. As I will leave for
> vacation on Wednesday, I now send my first patch to the list so that
> others already can have a look or even continue. I only did it on Linux
> so far, of course we will need adaptions for other platforms.
> I created the patch from the hg repository of OpenOffice.org, but the
> differences to our still not existing svn repository won't be huge, so
> it should bring us a little bit closer to a "clean" build.
>
> I also added some more todos to the wiki page.
>
> Meanwhile the license information at
>
> http://wiki.services.openoffice.org/wiki/ApacheMigration
>
> about our external tarballs is nearly complete, so there might be some
> more modules that could be worked on before we will tackle the
> "internal" copyleft dependencies. As I see it, I already should have
> covered most dependencies on copyleft licensed external tarballs, with
> the exceptions of the modules for the svg import (Kai Ahrens announced
> to work on it) and linear solver (Niklas Nebel already created a first
> patch for it).
>
> Now some words about the patch.
>
> I wanted to have it lazy ;-), so I just added a single configure switch
> "--with-disable-copyleft" and did some quick and dirty changes to
> configure.in. If someone wants to make the name or the implementation
> nicer - please go for it. I never got when to use "yes/no" or
> "true/false" in our configure.in. :-)
>
> With the changes in the patch the following modules won't be built
> anymore when the switch is used:
>
> dictionaries
> epm
> gstreamer
> hunspell
> hyphen
> libwpd
> mozilla
> mythes
> neon
> nss
> saxon
>
> and some modules or module parts that depend on them. I didn't need to
> remove any sources, as all copyleft parts that I removed are external
> tar balls. The code using these parts is just "normal" OOo code that can
> stay in the repository, though should be left out in the standard build
> at Apache.
>
> The final installation set of the build on Linux currently ist just
> created as a tar.gz, native packages can't be created as epm is missing.
> (I already posted a question about epm to this list, hopefully someone
> will be able to answer it.) Thus the build currently finishes with an
> error message, nevertheless the tar.gz was created and the result
> basically runs fine.
>
> The patch contains some unfinished work in rhino that is still commented
> out. If someone knows how to add conditional compilation to a build.xml
> file (I don't): the module "scripting" has parts depending on rhino that
> need to be removed from the build.xml in case DISABLE_RHINO is set.
> Which parts these are can easily be "detected" (means: found by build
> breakages) by removing the comment signs from my changes in rhino before
> you start the build.
>
> If you want to try it yourself: get the source from
> http://hg.services.openoffice.org/OOO340, apply the patch, call autoconf
> and then do the build as usual, following the build instructions in the
> ooo wiki. On my Ubuntu 11.04 the new configure switch was the only one I
> needed.
>

I will try it under MacOS, but I will also take a short break over the
weekend.

By the way, well done Mathias, it brings us forward.

Juergen


>
> For those amongst us that are used to the OOo build system: I treated
> configure no longer as part of the build tree, so autoconf is required
> after applying the patch. Makes more sense that way, IMHO, and from what
> I already learned on this list, it is the preferred way at Apache.
>
> Regards,
> Mathias
>

Re: A first try to remove some copyleft components from the build

Posted by Mathias Bauer <Ma...@gmx.net>.
On 18.07.2011 19:29, Sam Ruby wrote:

> On Mon, Jul 18, 2011 at 1:14 PM, Mathias Bauer <Ma...@gmx.net> wrote:
>>
>> I wanted to have it lazy ;-), so I just added a single configure switch
>> "--with-disable-copyleft" and did some quick and dirty changes to
>> configure.in. If someone wants to make the name or the implementation
>> nicer - please go for it. I never got when to use "yes/no" or
>> "true/false" in our configure.in. :-)
> 
> If I read this correctly, you have this switch backwards.  Copyleft
> needs to be opt-in, not opt-out.
> 
> - Sam Ruby
> 

Yes, that's for practical reasons; otherwise I would have had to rewrite
larger parts of configure.in. Once the build works with this switch in
all places, it will be a better time to do this rewrite.

Thanks for the remark, I totally forgot to mention this.

Regards,
Mathias

Re: A first try to remove some copyleft components from the build

Posted by Sam Ruby <ru...@intertwingly.net>.
On Mon, Jul 18, 2011 at 1:14 PM, Mathias Bauer <Ma...@gmx.net> wrote:
>
> I wanted to have it lazy ;-), so I just added a single configure switch
> "--with-disable-copyleft" and did some quick and dirty changes to
> configure.in. If someone wants to make the name or the implementation
> nicer - please go for it. I never got when to use "yes/no" or
> "true/false" in our configure.in. :-)

If I read this correctly, you have this switch backwards.  Copyleft
needs to be opt-in, not opt-out.

- Sam Ruby

Re: A first try to remove some copyleft components from the build

Posted by Mathias Bauer <Ma...@gmx.net>.
On 20.07.2011 13:06, Eike Rathke wrote:
> Hi Mathias,
>
> On Tuesday, 2011-07-19 23:32:50 +0200, Mathias Bauer wrote:
>
>>> Builds fine on Debian Squeeze unxlngx6.pro, but I didn't get any install
>>> set, not even a .tar.gz, this helped:
>>>
>>> cd $SRC_ROOT/instsetoo_native/util
>>> dmake openoffice_en-US PKGFORMAT=installed
>>
>> Interesting, I got the tar.gz built on Ubuntu. Maybe your changes caused
>> the difference? It's important that EPM is not disabled as the stupid
>> instsetoo_native module does not do anything in that case though it
>> could create the tar.gz without problems.
>
> Um, probably because epm wasn't installed on my machine.. I installed
> and reconfigured, but still..
>
> A few minutes later: the patch sets enable_epm="no", removing that
> works, but that may be no real solution because later on BUILD_EPM may
> be set to YES if epm is enabled and no system epm was found, I didn't
> try that.

It's only a short-term solution to be able to test the result of the
build. Later on we have to find out whether it is OK to use EPM for
packaging and adjust configure.in accordingly (we would then make
installing EPM a requirement, just like installing a compiler or other
things). I hope that someone is able to answer my question regarding EPM.

Regards,
Mathias


Re: A first try to remove some copyleft components from the build

Posted by Eike Rathke <oo...@erack.de>.
Hi Mathias,

On Tuesday, 2011-07-19 23:32:50 +0200, Mathias Bauer wrote:

> > Builds fine on Debian Squeeze unxlngx6.pro, but I didn't get any install
> > set, not even a .tar.gz, this helped:
> > 
> > cd $SRC_ROOT/instsetoo_native/util
> > dmake openoffice_en-US PKGFORMAT=installed
> 
> Interesting, I got the tar.gz built on Ubuntu. Maybe your changes caused
> the difference? It's important that EPM is not disabled as the stupid
> instsetoo_native module does not do anything in that case though it
> could create the tar.gz without problems.

Um, probably because epm wasn't installed on my machine.. I installed
and reconfigured, but still..

A few minutes later: the patch sets enable_epm="no". Removing that
works, but it may be no real solution, because later on BUILD_EPM may
be set to YES if epm is enabled and no system epm was found. I didn't
try that.
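
The interaction described here can be sketched in plain shell. This is
only an illustration of the logic as described in this mail, not the
actual configure.in code: the function name decide_build_epm and its
second argument (the path of a system epm, empty if none was found) are
assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch, not taken from configure.in.
# $1: value of enable_epm ("yes" or "no")
# $2: path to a system epm binary, empty if none was found
decide_build_epm() {
    if test "$1" != "yes"; then
        echo "NO"        # epm disabled: nothing to build
    elif test -n "$2"; then
        echo "NO"        # epm enabled and a system epm exists: use it
    else
        echo "YES"       # epm enabled but no system epm: build the bundled one
    fi
}

echo "disabled:            BUILD_EPM=$(decide_build_epm no "")"
echo "enabled, system epm: BUILD_EPM=$(decide_build_epm yes /usr/bin/epm)"
echo "enabled, no epm:     BUILD_EPM=$(decide_build_epm yes "")"
```

The last case is the one that matters for a copyleft-free build, since
it would pull the epm tarball back into the build.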

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD

Re: A first try to remove some copyleft components from the build

Posted by Mathias Bauer <Ma...@gmx.net>.
Hi Eike,

thanks a lot for your improvements!

On 19.07.2011 23:21, Eike Rathke wrote:

> Hi Mathias,
> 
> On Monday, 2011-07-18 19:14:46 +0200, Mathias Bauer wrote:
> 
>> I wanted to have it lazy ;-), so I just added a single configure switch
>> "--with-disable-copyleft" and did some quick and dirty changes to
>> configure.in.
> 
> Well, that resulted in copyleft always disabled ;-) because in
> 
> if test "$with-disable-copyleft" != ""; then
> 
> the expression is always true as "-disable-copyleft" appended to $with
> is a non-empty string.. that would had to use "$with_disable_copyleft"
> instead.

Ah, yes. Good catch.

>> If someone wants to make the name or the implementation
>> nicer - please go for it. I never got when to use "yes/no" or
>> "true/false" in our configure.in. :-)
> 
> --with-... options take any argument, for example a path where a library
> can be found. --enable-.../--disable-... indeed sometimes evaluate both,
> yes/no and true/false, but many places only check for yes/no, to me
> that's sufficient.
> 
> I changed the option to --enable-copyleft with yes/no, defaulting to
> empty == no.

That's what Sam asked for. I thought that this makes configure switches
a little bit inconsistent, as now switches like "disable-mozilla" can be
used only in conjunction with enable-copyleft. I thought that then
configure.in should get a larger rewrite so that all stuff that is
disabled by the new switch should get "enable" switches and not
"disable" switches. But now that I see your changes I agree with them
and think that we can rewrite the configure logic later.

> Builds fine on Debian Squeeze unxlngx6.pro, but I didn't get any install
> set, not even a .tar.gz, this helped:
> 
> cd $SRC_ROOT/instsetoo_native/util
> dmake openoffice_en-US PKGFORMAT=installed

Interesting, I got the tar.gz built on Ubuntu. Maybe your changes caused
the difference? It's important that EPM is not disabled, as the stupid
instsetoo_native module does not do anything in that case, though it
could create the tar.gz without problems.

Regards,
Mathias

Re: A first try to remove some copyleft components from the build

Posted by Eike Rathke <oo...@erack.de>.
Hi Mathias,

On Monday, 2011-07-18 19:14:46 +0200, Mathias Bauer wrote:

> I wanted to have it lazy ;-), so I just added a single configure switch
> "--with-disable-copyleft" and did some quick and dirty changes to
> configure.in.

Well, that resulted in copyleft always disabled ;-) because in

if test "$with-disable-copyleft" != ""; then

the expression is always true, as "-disable-copyleft" appended to $with
is a non-empty string. It would have to use "$with_disable_copyleft"
instead.
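
A minimal shell sketch of the problem, independent of configure.in: the
shell reads "$with-disable-copyleft" as the variable $with followed by
the literal text "-disable-copyleft", because "-" cannot be part of a
variable name. The variable names broken and fixed exist only for this
illustration.

```shell
#!/bin/sh
# Illustration only, not part of the patch.
unset with                       # even with $with unset ...
with_disable_copyleft=""         # ... and the real option variable empty

broken="$with-disable-copyleft"  # expands to the text "-disable-copyleft"
fixed="$with_disable_copyleft"   # expands to the empty string

if test "$broken" != ""; then
    echo "hyphenated name: test is true, value is '$broken'"
fi
if test "$fixed" != ""; then
    echo "underscore name: test is true"
else
    echo "underscore name: test is false"
fi
```

With the hyphenated spelling the test succeeds whether or not the switch
was given, which is why copyleft ended up always disabled.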

> If someone wants to make the name or the implementation
> nicer - please go for it. I never got when to use "yes/no" or
> "true/false" in our configure.in. :-)

--with-... options take any argument, for example a path where a library
can be found. --enable-.../--disable-... indeed sometimes evaluate both
yes/no and true/false, but many places only check for yes/no; to me
that's sufficient.

I changed the option to --enable-copyleft with yes/no, defaulting to
empty == no.

I also set the variables to empty values, so that invoking configure
--enable-copyleft from an already set up environment produces the
desired results.
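
A minimal sketch of the opt-in behaviour described above, assuming the
usual autoconf convention that --enable-copyleft sets the shell variable
enable_copyleft to "yes" and --disable-copyleft sets it to "no". The
helper copyleft_enabled is hypothetical and not part of the attached
patches.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the actual patch.
# Only an explicit "yes" opts in; anything else, including an empty or
# unset value, is treated as "no".
copyleft_enabled() {
    case "$1" in
        yes) echo "YES" ;;
        *)   echo "NO" ;;
    esac
}

enable_copyleft=""               # configure called without the switch
echo "default:   $(copyleft_enabled "$enable_copyleft")"
enable_copyleft="yes"            # configure --enable-copyleft
echo "opted in:  $(copyleft_enabled "$enable_copyleft")"
enable_copyleft="no"             # configure --disable-copyleft
echo "opted out: $(copyleft_enabled "$enable_copyleft")"
```

Treating the empty value as "no" is what makes copyleft components
opt-in rather than opt-out.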

Attached are on_top_of_mba.diff with only the changes to the previous
version one can apply on top, and aooo_disable_copyleft.diff with the
merged patches.

Don't forget to run autoconf after applying the patch.

Builds fine on Debian Squeeze unxlngx6.pro, but I didn't get any install
set, not even a .tar.gz, this helped:

cd $SRC_ROOT/instsetoo_native/util
dmake openoffice_en-US PKGFORMAT=installed

Smoketest worked, except one message:
Gtk-Message: Failed to load module "gnomebreakpad": libgnomebreakpad.so: cannot open shared object file: No such file or directory

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD

Re: A first try to remove some copyleft components from the build

Posted by Mathias Bauer <Ma...@gmx.net>.
On 18.07.2011 21:28, Pedro F. Giffuni wrote:

> --- On Mon, 7/18/11, Christian Lohmaier <cl...@openoffice.org> wrote:
> 
>> On Mon, Jul 18, 2011 at 8:04 PM, Pedro F. Giffuni <gi...@tutopia.com>
>> wrote:
>> >
>> > 1) xpdf (GPL'd) is a run dependency, this is
>> > linux/unix specific. PDFBox may be a replacement.
>> 
>> Why would this be linux/unix specific? xpdf code is used
>> for the pdf-import extension, and that is available for
>> Windows as well.
>>
> 
> My bad, sorry. I thought the X referred to the windowing
> system. 
> 
>> > 2) The build requires GNU cp, which is inconvenient
>> > for the BSDs and MacOS X:
>> 
>> You don't need GNU cp to build on Mac.
>>
> 
> GNU cp appears as a build dependency for FreeBSD's
> OpenOffice 3.4 (devel) port. Should I try building
> without it?

Sorry, I don't remember exactly. Perhaps after my vacation. :-)

Regards,
Mathias

Re: A first try to remove some copyleft components from the build

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
--- On Mon, 7/18/11, Christian Lohmaier <cl...@openoffice.org> wrote:

> On Mon, Jul 18, 2011 at 8:04 PM, Pedro F. Giffuni <gi...@tutopia.com>
> wrote:
> >
> > 1) xpdf (GPL'd) is a run dependency, this is
> > linux/unix specific. PDFBox may be a replacement.
> 
> Why would this be linux/unix specific? xpdf code is used
> for the pdf-import extension, and that is available for
> Windows as well.
>

My bad, sorry. I thought the X referred to the windowing
system. 

> > 2) The build requires GNU cp, which is inconvenient
> > for the BSDs and MacOS X:
> 
> You don't need GNU cp to build on Mac.
>

GNU cp appears as a build dependency for FreeBSD's
OpenOffice 3.4 (devel) port. Should I try building
without it?

cheers,

Pedro.

Re: A first try to remove some copyleft components from the build

Posted by Christian Lohmaier <cl...@openoffice.org>.
Hi *,

On Mon, Jul 18, 2011 at 8:04 PM, Pedro F. Giffuni <gi...@tutopia.com> wrote:
>
> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
> specific. PDFBox may be a replacement.

Why would this be linux/unix specific? xpdf code is used for the
pdf-import extension, and that is available for Windows as well.

> 2) The build requires GNU cp, which is inconvenient for
> the BSDs and MacOS X:

You don't need GNU cp to build on Mac.

> http://lists.freedesktop.org/archives/libreoffice/2010-October/000586.html

You should have read the initial message, or the current situation.
If you use --disable-build-mozilla / --disable-mozilla, you only need
XCode on Mac, no other external stuff. (unless you're on 10.4, then
you also need gnu make >= 3.81)

When building mozilla (nowadays actually seamonkey, but the configure
switch remained the same), you need libIDL and its dependencies in
addition (glib, gettext, and pkg-config for convenience)

ciao
Christian

Re: A first try to remove some copyleft components from the build

Posted by Jürgen Schmidt <jo...@googlemail.com>.
On Tue, Jul 19, 2011 at 12:54 AM, Andy Brown <an...@the-martin-byrd.net> wrote:

> Marcus (OOo) wrote:
> > Am 07/18/2011 10:04 PM, schrieb David McKay:
> >>
> >> On 18/07/11 20:50, Andy Brown wrote:
> >>> Mathias Bauer wrote:
> >>>> On 18.07.2011 20:21, Mathias Bauer wrote:
> >>>>
> >>>>>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
> >>>>>> specific. PDFBox may be a replacement.
> >>>>> This component is used for the pdf import extension, not for OOo
> >>>>> itself.
> >>>>>
> >>>>> The pdf import extension is not built by default, there is a configure
> >>>>> switch to enable it in the build. In that case xpdf would be
> >>>>> required. I
> >>>>> think that this already fulfils the legal requirements that building
> >>>>> lgpl code must be "opt-in". So as far as I can see, this is not a
> >>>>> "to do".
> >>>> Giving it one more thought: it would be still a to do if we wanted to
> >>>> have a pdf import extension released by Apache. So perhaps a to do with
> >>>> minor priority.
> >>>>
> >>>> Regards,
> >>>> Mathias
> >>> If we do include the pdf import extension I would like to see it
> >>> rewritten to do a better job of importing. I have seen to many post in
> >>> the forums about the way that it works. My suggestion would be to drop
> >>> it completely.
> >>>
> >>> Andy
> >>>
> >> A lot of the issues I see on the forum regarding the PDF extension are
> >> to do with expectation. People seem to think this extension is going to
> >> give them a full-blown PDF editor with the capabilities of the Adobe
> >> tools. When they discover it is for tiny corrections and typo fixes they
> >> feel let down. That's not to say there aren't any bugs in it, there may
> >> well be. But I don;t think the PDF extension was positioned or described
> >> sufficiently to provide users with the correct expectations.
> >
> > The intension was to show what is possible. On the extension website is
> > a note that the Beta status was left due to the positive notes we got
> > about the extension. But this is no promiss that its quality is like the
> > import filter for the documents formats for MS Word & Co.
> >
> > The solution is not to remove the extension but to improve it's work.
> >
> > Marcus
> >
>
> If it can be improved then it maybe worth the effort.  I still think an
> OCR engine would do the work.
>

I think that if we can fix the legal issues with the dependencies, we
should keep it, and maybe somebody will work on it. It provides at least
a minimal import, which is not enough for some people but is good enough
for many others. As always, you hear more concerns than positive
feedback.

And if somebody develops something based on OCR, even better: we get a
further enhancement...

Juergen



>
> Andy
>

Re: A first try to remove some copyleft components from the build

Posted by Andy Brown <an...@the-martin-byrd.net>.
Marcus (OOo) wrote:
> Am 07/18/2011 10:04 PM, schrieb David McKay:
>>
>> On 18/07/11 20:50, Andy Brown wrote:
>>> Mathias Bauer wrote:
>>>> On 18.07.2011 20:21, Mathias Bauer wrote:
>>>>
>>>>>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>>>>>> specific. PDFBox may be a replacement.
>>>>> This component is used for the pdf import extension, not for OOo
>>>>> itself.
>>>>>
>>>>> The pdf import extension is not built by default, there is a configure
>>>>> switch to enable it in the build. In that case xpdf would be
>>>>> required. I
>>>>> think that this already fulfils the legal requirements that building
>>>>> lgpl code must be "opt-in". So as far as I can see, this is not a
>>>>> "to do".
>>>> Giving it one more thought: it would be still a to do if we wanted to
>>>> have a pdf import extension released by Apache. So perhaps a to do with
>>>> minor priority.
>>>>
>>>> Regards,
>>>> Mathias
>>> If we do include the pdf import extension I would like to see it
>>> rewritten to do a better job of importing. I have seen to many post in
>>> the forums about the way that it works. My suggestion would be to drop
>>> it completely.
>>>
>>> Andy
>>>
>> A lot of the issues I see on the forum regarding the PDF extension are
>> to do with expectation. People seem to think this extension is going to
>> give them a full-blown PDF editor with the capabilities of the Adobe
>> tools. When they discover it is for tiny corrections and typo fixes they
>> feel let down. That's not to say there aren't any bugs in it, there may
>> well be. But I don;t think the PDF extension was positioned or described
>> sufficiently to provide users with the correct expectations.
> 
> The intension was to show what is possible. On the extension website is
> a note that the Beta status was left due to the positive notes we got
> about the extension. But this is no promiss that its quality is like the
> import filter for the documents formats for MS Word & Co.
> 
> The solution is not to remove the extension but to improve it's work.
> 
> Marcus
> 

If it can be improved then it may be worth the effort.  I still think an
OCR engine would do the work.

Andy

Re: A first try to remove some copyleft components from the build

Posted by "Marcus (OOo)" <ma...@wtnet.de>.
Am 07/18/2011 10:04 PM, schrieb David McKay:
>
> On 18/07/11 20:50, Andy Brown wrote:
>> Mathias Bauer wrote:
>>> On 18.07.2011 20:21, Mathias Bauer wrote:
>>>
>>>>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>>>>> specific. PDFBox may be a replacement.
>>>> This component is used for the pdf import extension, not for OOo
>>>> itself.
>>>>
>>>> The pdf import extension is not built by default, there is a configure
>>>> switch to enable it in the build. In that case xpdf would be
>>>> required. I
>>>> think that this already fulfils the legal requirements that building
>>>> lgpl code must be "opt-in". So as far as I can see, this is not a
>>>> "to do".
>>> Giving it one more thought: it would be still a to do if we wanted to
>>> have a pdf import extension released by Apache. So perhaps a to do with
>>> minor priority.
>>>
>>> Regards,
>>> Mathias
>> If we do include the pdf import extension I would like to see it
>> rewritten to do a better job of importing. I have seen to many post in
>> the forums about the way that it works. My suggestion would be to drop
>> it completely.
>>
>> Andy
>>
> A lot of the issues I see on the forum regarding the PDF extension are
> to do with expectation. People seem to think this extension is going to
> give them a full-blown PDF editor with the capabilities of the Adobe
> tools. When they discover it is for tiny corrections and typo fixes they
> feel let down. That's not to say there aren't any bugs in it, there may
> well be. But I don;t think the PDF extension was positioned or described
> sufficiently to provide users with the correct expectations.

The intention was to show what is possible. The extension website has a
note that it left Beta status because of the positive feedback we got
about the extension. But this is no promise that its quality matches
that of the import filters for the document formats of MS Word & Co.

The solution is not to remove the extension but to improve how it works.

Marcus

Re: OCR (was Re: A first try to remove some copyleft components from the build)

Posted by Andy Brown <an...@the-martin-byrd.net>.
Pedro F. Giffuni wrote:
> FWIW;
> 
> --- On Mon, 7/18/11, Andy Brown wrote:
> ...
>>
>> I agree with you that it is the expectations that cause the
>> real problem.  For me it is a waste as it does not do as
>> most people expect. An OCR import would be a better option,
>> if we can find one.
>>
> 
> I haven't looked at this in a while, but ...
> 
> http://code.google.com/p/tesseract-ocr/ 
> 
> cheers,
> 
> Pedro. 
> 

Thanks.  It does look interesting from a quick glance.  Bookmarked for
reference.

Andy

OCR (was Re: A first try to remove some copyleft components from the build)

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
FWIW;

--- On Mon, 7/18/11, Andy Brown wrote:
...
> 
> I agree with you that it is the expectations that cause the
> real problem.  For me it is a waste as it does not do as
> most people expect. An OCR import would be a better option,
> if we can find one.
>

I haven't looked at this in a while, but ...

http://code.google.com/p/tesseract-ocr/ 

cheers,

Pedro. 

Re: A first try to remove some copyleft components from the build

Posted by Andy Brown <an...@the-martin-byrd.net>.
David McKay wrote:
> 
> On 18/07/11 20:50, Andy Brown wrote:
>> Mathias Bauer wrote:
>>> On 18.07.2011 20:21, Mathias Bauer wrote:
>>>
>>>>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>>>>> specific. PDFBox may be a replacement.
>>>> This component is used for the pdf import extension, not for OOo
>>>> itself.
>>>>
>>>> The pdf import extension is not built by default, there is a configure
>>>> switch to enable it in the build. In that case xpdf would be
>>>> required. I
>>>> think that this already fulfils the legal requirements that building
>>>> lgpl code must be "opt-in". So as far as I can see, this is not a
>>>> "to do".
>>> Giving it one more thought: it would be still a to do if we wanted to
>>> have a pdf import extension released by Apache. So perhaps a to do with
>>> minor priority.
>>>
>>> Regards,
>>> Mathias
>> If we do include the pdf import extension I would like to see it
>> rewritten to do a better job of importing.  I have seen to many post in
>> the forums about the way that it works.  My suggestion would be to drop
>> it completely.
>>
>> Andy
>>
> A lot of the issues I see on the forum regarding the PDF extension are
> to do with expectation. People seem to think this extension is going to
> give them a full-blown PDF editor with the capabilities of the Adobe
> tools. When they discover it is for tiny corrections and typo fixes they
> feel let down. That's not to say there aren't any bugs in it, there may
> well be. But I don;t think the PDF extension was positioned or described
> sufficiently to provide users with the correct expectations.
> 
> Dave.

I agree with you that it is the expectations that cause the real
problem.  For me it is a waste, as it does not do what most people
expect.  An OCR import would be a better option, if we can find one.

Andy

Re: A first try to remove some copyleft components from the build

Posted by David McKay <dm...@btconnect.com>.
On 18/07/11 20:50, Andy Brown wrote:
> Mathias Bauer wrote:
>> On 18.07.2011 20:21, Mathias Bauer wrote:
>>
>>>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>>>> specific. PDFBox may be a replacement.
>>> This component is used for the pdf import extension, not for OOo itself.
>>>
>>> The pdf import extension is not built by default, there is a configure
>>> switch to enable it in the build. In that case xpdf would be required. I
>>> think that this already fulfils the legal requirements that building
>>> lgpl code must be "opt-in". So as far as I can see, this is not a "to do".
>> Giving it one more thought: it would be still a to do if we wanted to
>> have a pdf import extension released by Apache. So perhaps a to do with
>> minor priority.
>>
>> Regards,
>> Mathias
> If we do include the pdf import extension I would like to see it
> rewritten to do a better job of importing.  I have seen to many post in
> the forums about the way that it works.  My suggestion would be to drop
> it completely.
>
> Andy
>
A lot of the issues I see on the forum regarding the PDF extension are 
to do with expectation. People seem to think this extension is going to 
give them a full-blown PDF editor with the capabilities of the Adobe 
tools. When they discover it is for tiny corrections and typo fixes they 
feel let down. That's not to say there aren't any bugs in it, there may 
well be. But I don't think the PDF extension was positioned or described
sufficiently to provide users with the correct expectations.

Dave.

Re: A first try to remove some copyleft components from the build

Posted by Andy Brown <an...@the-martin-byrd.net>.
Mathias Bauer wrote:
> On 18.07.2011 20:21, Mathias Bauer wrote:
> 
>>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>>> specific. PDFBox may be a replacement.
>>
>> This component is used for the pdf import extension, not for OOo itself.
>>
>> The pdf import extension is not built by default, there is a configure
>> switch to enable it in the build. In that case xpdf would be required. I
>> think that this already fulfils the legal requirements that building
>> lgpl code must be "opt-in". So as far as I can see, this is not a "to do".
> 
> Giving it one more thought: it would be still a to do if we wanted to
> have a pdf import extension released by Apache. So perhaps a to do with
> minor priority.
> 
> Regards,
> Mathias

If we do include the pdf import extension, I would like to see it
rewritten to do a better job of importing.  I have seen too many posts
in the forums about the way that it works.  My suggestion would be to
drop it completely.

Andy

Re: A first try to remove some copyleft components from the build

Posted by Mathias Bauer <Ma...@gmx.net>.
On 18.07.2011 20:21, Mathias Bauer wrote:

>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>> specific. PDFBox may be a replacement.
> 
> This component is used for the pdf import extension, not for OOo itself.
> 
> The pdf import extension is not built by default, there is a configure
> switch to enable it in the build. In that case xpdf would be required. I
> think that this already fulfils the legal requirements that building
> lgpl code must be "opt-in". So as far as I can see, this is not a "to do".

Giving it one more thought: it would be still a to do if we wanted to
have a pdf import extension released by Apache. So perhaps a to do with
minor priority.

Regards,
Mathias

Re: A first try to remove some copyleft components from the build

Posted by Malte Timmermann <ma...@gmx.com>.
Wrt PDF Import Extension, and similar extensions: Optional extensions 
should not be part of a regular OOo build or source tree, IMHO.

We should have separate source trees for the core product, and for 
optional extensions.

Malte.

On 18.07.2011 20:21, Mathias Bauer wrote:
> On 18.07.2011 20:04, Pedro F. Giffuni wrote:
>
>> Hello Mathias;
>>
>> --- On Mon, 7/18/11, Mathias Bauer<Ma...@gmx.net>  wrote:
>>> Hi,
>>>
>>> I tried to get rid of some copyleft dependencies. As I will
>>> leave for vacation on Wednesday,
>>
>> First of all, thanks so much for working on this!
>>
>>> I now send my first patch to the list so that others already
>>> can have a look or even continue. I only did it on Linux so
>>> far, of course we will need adaptions for other platforms.
>>
>> Not too many :-).
>>
>>> I created the patch from the hg repository of
>>> OpenOffice.org, but the
>>> differences to our still not existing svn repository won't
>>> be huge, so
>>> it should bring us a little bit closer to a "clean" build.
>>>
>>> I also added some more todos to the wiki page.
>>>
>>> Meanwhile the license information at
>>>
>>> http://wiki.services.openoffice.org/wiki/ApacheMigration
>>>
>>
>> I would like to add a couple more:
>>
>> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
>> specific. PDFBox may be a replacement.
>
> This component is used for the pdf import extension, not for OOo itself.
>
> The pdf import extension is not built by default, there is a configure
> switch to enable it in the build. In that case xpdf would be required. I
> think that this already fulfils the legal requirements that building
> lgpl code must be "opt-in". So as far as I can see, this is not a "to do".
>
>>
>> 2) The build requires GNU cp, which is inconvenient for
>> the BSDs and MacOS X:
>> http://lists.freedesktop.org/archives/libreoffice/2010-October/000586.html
>
> I remember a lot of discussions around different ways to copy files in
> the new build system of OOo. We can revive that and see where it brings
> us. Nevertheless, I think that discussing GNU cp will happen on
> usability grounds, not caused by legal requirements. I added this to the
> todo list.
>
> Regards,
> Mathias

Re: A first try to remove some copyleft components from the build

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
--- On Mon, 7/18/11, Mathias Bauer <Ma...@gmx.net> wrote:

...

> > 
> > I would like to add a couple more:
> > 
> > 1) xpdf (GPL'd) is a run dependency, this is
> > linux/unix specific. PDFBox may be a replacement.
> 
> This component is used for the pdf import extension, not
> for OOo itself.
>
> The pdf import extension is not built by default, there is
> a configure switch to enable it in the build. In that case
> xpdf would be required. I think that this already fulfils
> the legal requirements that building lgpl code must be
> "opt-in". So as far as I can see, this is not a "to do".
>

OK, that's fair. I do think it's convenient since I doubt
xpdf is portable beyond unix. FWIW, while looking at
PDFBox and its dependencies, I became acquainted with the
Legion of the Bouncy Castle:
     http://www.bouncycastle.org/
A possible replacement for NSS if you don't mind that it's Java.
 
> > 
> > 2) The build requires GNU cp, which is inconvenient
> > for the BSDs and MacOS X:
...
> Nevertheless, I think that discussing GNU cp will
> happen on usability grounds, not caused by legal
> requirements. I added this to the todo list.
>

Thanks. I agree it's a usability issue, but just to
put this in context, it involves building the entire
GNU coreutils package as a prerequisite for building
OpenOffice, which not only adds build time but also
leaves a lot of bloat:

basename, cat, chgrp, chmod, chown, chroot, cksum, comm,
cp, csplit, cut, date, dd, df, dir, dircolors, dirname,
du, echo, env, expand, expr, factor, false, fmt, fold,
groups, head, hostid, hostname, id, install, join, kill,
link, ln, logname, ls, md5sum, mkdir, mkfifo, mknod, mv,
nice, nl, nohup, od, paste, pathchk, pinky, pr, printenv,
printf, ptx, pwd, readlink, rm, rmdir, seq, sha1sum, shred,
sleep, sort, split, stat, stty, su, sum, sync, tac, tail,
tee, test, touch, tr, true, tsort, tty, uname, unexpand,
uniq, unlink, uptime, users, vdir, wc, who, whoami, yes

It is worse than bash!

(sorry for the rant .. I know it's not your fault :) )

Pedro.

Re: A first try to remove some copyleft components from the build

Posted by Mathias Bauer <Ma...@gmx.net>.
On 18.07.2011 20:04, Pedro F. Giffuni wrote:

> Hello Mathias;
> 
> --- On Mon, 7/18/11, Mathias Bauer <Ma...@gmx.net> wrote:
>> Hi,
>> 
>> I tried to get rid of some copyleft dependencies. As I will
>> leave for vacation on Wednesday,
> 
> First of all, thanks so much for working on this!
> 
>> I now send my first patch to the list so that others already
>> can have a look or even continue. I only did it on Linux so
>> far, of course we will need adaptions for other platforms.
> 
> Not too many :-).
> 
>> I created the patch from the hg repository of
>> OpenOffice.org, but the
>> differences to our still not existing svn repository won't
>> be huge, so
>> it should bring us a little bit closer to a "clean" build.
>> 
>> I also added some more todos to the wiki page.
>> 
>> Meanwhile the license information at
>> 
>> http://wiki.services.openoffice.org/wiki/ApacheMigration
>>
> 
> I would like to add a couple more:
> 
> 1) xpdf (GPL'd) is a run dependency, this is linux/unix
> specific. PDFBox may be a replacement.

This component is used for the pdf import extension, not for OOo itself.

The pdf import extension is not built by default, there is a configure
switch to enable it in the build. In that case xpdf would be required. I
think that this already fulfils the legal requirements that building
lgpl code must be "opt-in". So as far as I can see, this is not a "to do".

> 
> 2) The build requires GNU cp, which is inconvenient for
> the BSDs and MacOS X:
> http://lists.freedesktop.org/archives/libreoffice/2010-October/000586.html

I remember a lot of discussions around different ways to copy files in
the new build system of OOo. We can revive that and see where it brings
us. Nevertheless, I think that discussing GNU cp will happen on
usability grounds, not caused by legal requirements. I added this to the
todo list.

Regards,
Mathias

Re: A first try to remove some copyleft components from the build

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
Hello Mathias;

--- On Mon, 7/18/11, Mathias Bauer <Ma...@gmx.net> wrote:
> Hi,
> 
> I tried to get rid of some copyleft dependencies. As I will
> leave for vacation on Wednesday,

First of all, thanks so much for working on this!

> I now send my first patch to the list so that others already
> can have a look or even continue. I only did it on Linux so
> far, of course we will need adaptions for other platforms.

Not too many :-).

> I created the patch from the hg repository of
> OpenOffice.org, but the
> differences to our still not existing svn repository won't
> be huge, so
> it should bring us a little bit closer to a "clean" build.
> 
> I also added some more todos to the wiki page.
> 
> Meanwhile the license information at
> 
> http://wiki.services.openoffice.org/wiki/ApacheMigration
>

I would like to add a couple more:

1) xpdf (GPL'd) is a run dependency, this is linux/unix
specific. PDFBox may be a replacement.

2) The build requires GNU cp, which is inconvenient for
the BSDs and MacOS X:
http://lists.freedesktop.org/archives/libreoffice/2010-October/000586.html

cheers,

Pedro.