Posted to dev@manifoldcf.apache.org by Piergiorgio Lucidi <pi...@apache.org> on 2017/06/29 11:41:51 UTC

Proposal for adding an Alfresco BFSI Output connector

Hi,

Luis Cabaceira (Solutions Architect at Alfresco) contacted me because he
would like to contribute a new Output Connector built specifically for
migrating content to an Alfresco instance.

For massive content migrations into an Alfresco instance, the CMIS Output
Connector may not be fast enough.

A better way to migrate large volumes of content to an Alfresco repository
is to create an export on the file system laid out exactly as Alfresco
expects, and then import it using Alfresco's own Bulk File System Import
Tool with in-place mode enabled:
http://docs.alfresco.com/5.2/concepts/bulk-import-prepare-filesystem.html
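
For illustration only, here is a minimal Python sketch of what such an
export could look like on disk. It assumes the layout described on the page
linked above: each content file sits next to a shadow file named
<file>.metadata.properties.xml in Java XML properties format. The helper
name export_document, the target directory and the sample property values
are hypothetical, and this is not the proposed connector's implementation.

```python
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def export_document(target_dir, name, content, properties,
                    node_type="cm:content", aspects=()):
    """Write a content file plus its shadow metadata file (hypothetical helper)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # The binary itself, stored under its final file name.
    content_path = target / name
    content_path.write_bytes(content)

    # Entries of the shadow file: node type, optional aspects, then properties.
    entries = {"type": node_type}
    if aspects:
        entries["aspects"] = ",".join(aspects)
    entries.update(properties)

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">',
        "<properties>",
    ]
    for key, value in entries.items():
        lines.append(f"  <entry key={quoteattr(key)}>{escape(value)}</entry>")
    lines.append("</properties>")

    # Assumed naming convention: <file name>.metadata.properties.xml
    metadata_path = target / f"{name}.metadata.properties.xml"
    metadata_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return content_path, metadata_path
```

A call such as export_document("/tmp/export", "report.pdf", data,
{"cm:title": "Report"}) would then leave two files side by side in the
export directory. The exact set of supported keys should be checked against
the Alfresco documentation for the version in use.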

In-place mode injects content into the repository without processing or
uploading the binaries: it creates nodes with a property that associates
each node with the binary already present in the storage.
This avoids much of the time that would otherwise be spent processing
upload streams.

This Output Connector is specific to Alfresco, but it could solve many
migration problems for users in the community.
I would like to invest some time in this connector together with him and
bring it into the project very soon.

What do you think about this proposal?

Please let me know.
Thank you for any feedback.

Cheers,
PJ

-- 
Piergiorgio Lucidi
Technology Evangelist and Chief of Enterprise Information Management @
Sourcesense
Mentor / PMC Member / Committer @ Apache Software Foundation
Community Star / Wiki Gardener / Global Forum Moderator @ Alfresco
Author and Technical Reviewer @ Packt Publishing
Technical Advisory Group Member @ Microsoft
Top Community Contributor @ Crafter
Project Leader / Committer @ JBoss
http://www.open4dev.com

Re: Proposal for adding an Alfresco BFSI Output connector

Posted by Piergiorgio Lucidi <pi...@apache.org>.
Hi,

I have created a new component in our JIRA dedicated to the development of
this new connector [1].

You will also find a comment by Luis in the initial issue [2] about the
official involvement of Apache ManifoldCF in the Alfresco content migration
strategy.

Hope this helps.

PJ

[1] - https://issues.apache.org/jira/browse/CONNECTORS/component/12332945
[2] - https://issues.apache.org/jira/browse/CONNECTORS-1442

2017-06-29 14:55 GMT+02:00 Piergiorgio Lucidi <pi...@apache.org>:

> Hi Karl,
>
> 2017-06-29 13:48 GMT+02:00 Karl Wright <da...@gmail.com>:
>
>> Hi Piergiorgio,
>>
>> I think you are doing a great job expanding MCF's ability to port content
>> from system to system.  This can only help.
>>
>
> Thank you so much for your feedback, much appreciated :)
>
>
>>
>> The only concern I have is whether the Alfresco bulk importer can be used
>> for synchronization, or whether it would be a once-only arrangement.  As
>> you know, in order for synchronization to work, it must be possible to use
>> the URL that ManifoldCF creates for each document as an identifier for the
>> document.  In other words, it must be possible to use the URL to update the
>> document in the target repository.  Can the bulk importer be used in that
>> way?
>>
>
> I have just talked with Luis: we can store the ManifoldCF documentURL in a
> new property among the metadata fields and then implement a policy that
> gives us a correct synchronization mechanism.
>
> We can also fix any side effects that we find in the Alfresco Bulk File
> System Import tool, to improve incremental imports and to make it fully
> compatible with Apache ManifoldCF in an official way :)
>
> PJ
>
>
>>
>> Karl
>>
>>




Re: Proposal for adding an Alfresco BFSI Output connector

Posted by Piergiorgio Lucidi <pi...@apache.org>.
Hi Karl,

2017-06-29 13:48 GMT+02:00 Karl Wright <da...@gmail.com>:

> Hi Piergiorgio,
>
> I think you are doing a great job expanding MCF's ability to port content
> from system to system.  This can only help.
>

Thank you so much for your feedback, much appreciated :)


>
> The only concern I have is whether the Alfresco bulk importer can be used
> for synchronization, or whether it would be a once-only arrangement.  As
> you know, in order for synchronization to work, it must be possible to use
> the URL that ManifoldCF creates for each document as an identifier for the
> document.  In other words, it must be possible to use the URL to update the
> document in the target repository.  Can the bulk importer be used in that
> way?
>

I have just talked with Luis: we can store the ManifoldCF documentURL in a
new property among the metadata fields and then implement a policy that
gives us a correct synchronization mechanism.
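
One way this could work, sketched in Python under stated assumptions:
derive the export file name deterministically from the documentURL, so that
a later crawl of the same document rewrites the same files instead of
creating new ones, and record the URL in the shadow metadata file. The
property name mcf:documentURL, the helper names and the hashing scheme are
all hypothetical; neither ManifoldCF nor Alfresco defines them, and the
property would need to be declared in an Alfresco content model.

```python
import hashlib
from pathlib import Path
from xml.sax.saxutils import escape

# Hypothetical property name; it would have to exist in an Alfresco content model.
URL_PROPERTY = "mcf:documentURL"


def export_name(document_url: str) -> str:
    """Map a documentURL to a stable file name (same URL, same name)."""
    return hashlib.sha256(document_url.encode("utf-8")).hexdigest()


def add_or_replace(export_dir: str, document_url: str, content: bytes) -> Path:
    """Write (or overwrite) the exported binary and its shadow metadata file."""
    root = Path(export_dir)
    root.mkdir(parents=True, exist_ok=True)

    # Same URL always lands on the same path, so a re-export is an update.
    content_path = root / export_name(document_url)
    content_path.write_bytes(content)

    metadata = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">\n'
        "<properties>\n"
        '  <entry key="type">cm:content</entry>\n'
        f'  <entry key="{URL_PROPERTY}">{escape(document_url)}</entry>\n'
        "</properties>\n"
    )
    metadata_path = root / f"{content_path.name}.metadata.properties.xml"
    metadata_path.write_text(metadata, encoding="utf-8")
    return content_path
```

Calling add_or_replace twice with the same documentURL leaves a single
binary and a single metadata file in the export directory, holding the
latest content. Whether the importer then updates the existing node or
creates a new one depends on how the import itself is configured, which
this sketch does not cover.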

We can also fix any side effects that we find in the Alfresco Bulk File
System Import tool, to improve incremental imports and to make it fully
compatible with Apache ManifoldCF in an official way :)

PJ


>
> Karl
>
>




Re: Proposal for adding an Alfresco BFSI Output connector

Posted by Piergiorgio Lucidi <pi...@apache.org>.
2017-06-29 13:50 GMT+02:00 Karl Wright <da...@gmail.com>:

> I'd also like to ask the same question about the CMIS output connector.
> Does it work in that way?  And, is it finally ready to merge to trunk?
>

The CMIS Output Connector sends update/upload requests directly to the
repository API, without any intermediate step. So it is different from the
approach discussed for the Alfresco BFSI Output Connector.

Here we stream binaries through the CMIS API to upload and update content.

We use the documentURL generated by the repository connector: if we find
existing content with exactly the same documentURL, we update that content
(properties and stream).
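
The behaviour described here amounts to an add-or-replace keyed on the
documentURL. The Python sketch below illustrates only that logic, using an
in-memory stand-in for the repository; it makes no real CMIS calls, and the
class and method names are hypothetical rather than taken from the
connector.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    properties: dict
    stream: bytes


@dataclass
class InMemoryRepository:
    """Stand-in for a target repository; not a CMIS client."""
    documents: dict = field(default_factory=dict)

    def add_or_replace(self, document_url: str, properties: dict,
                       stream: bytes) -> str:
        """Create the document, or update it if the documentURL is known."""
        existing = self.documents.get(document_url)
        if existing is None:
            self.documents[document_url] = Document(dict(properties), stream)
            return "created"
        # Same documentURL seen before: refresh both properties and stream.
        existing.properties = dict(properties)
        existing.stream = stream
        return "updated"
```

Sending the same documentURL twice therefore never produces a second copy:
the first call creates the document and the second one overwrites its
properties and stream.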

This afternoon I'm working on the integration tests for the CMIS Output
Connector, and I hope to commit this final part very soon.

I'll let you know as soon as possible when we can merge to the trunk.

PJ


>
> Karl
>
>




Re: Proposal for adding an Alfresco BFSI Output connector

Posted by Karl Wright <da...@gmail.com>.
I'd also like to ask the same question about the CMIS output connector.
Does it work in that way?  And, is it finally ready to merge to trunk?

Karl



Re: Proposal for adding an Alfresco BFSI Output connector

Posted by Karl Wright <da...@gmail.com>.
Hi Piergiorgio,

I think you are doing a great job expanding MCF's ability to port content
from system to system.  This can only help.

The only concern I have is whether the Alfresco bulk importer can be used
for synchronization, or whether it would be a once-only arrangement.  As
you know, in order for synchronization to work, it must be possible to use
the URL that ManifoldCF creates for each document as an identifier for the
document.  In other words, it must be possible to use the URL to update the
document in the target repository.  Can the bulk importer be used in that
way?

Karl

