Posted to user@fineract.apache.org by Avik Ganguly <av...@gmail.com> on 2018/02/15 10:28:32 UTC

Re: IMP::Regarding Data Migration

Hi Hari,

The documented approach is the least risky of all the approaches if your
team does not have a decent amount of Fineract technical know-how.

If you do a DB-level migration from system X to Fineract, the onus of
data integrity falls on you, and the risk depends on how well the data is
maintained in system X and on any patterns in that data which are not
supported in Fineract. That said, I have seen DB-level migrations of
around 15 lakh (1.5 million) loans (I am not sure what the 70 million
records you mentioned consist of) from a moderately well maintained
system complete within 6 hours, with GL-account-level and
transaction-level verification. But around 4 months were then spent
discovering and fixing data integrity issues for approximately 1% of the
loans, and slowly folding that knowledge back into the verification steps
of the post-transformation process.
So the DB-to-DB approach is the most risky one, and you would need to
develop a lot of transformation scripts depending on your source data
structure.
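
To give a concrete flavour of the GL-level verification I mean, below is
a minimal sketch that pulls a trial balance out of Fineract so it can be
diffed against the same report from system X. The table and column names
(acc_gl_journal_entry, acc_gl_account, type_enum with 1 = credit and
2 = debit) are from the Fineract schema as I remember it, so please
verify them against your version before relying on this.

# Minimal post-migration check: trial balance per GL account from
# Fineract, to be diffed against the equivalent report from system X.
# Schema assumptions: acc_gl_journal_entry(account_id, amount, type_enum,
# reversed), acc_gl_account(id, gl_code, name); type_enum 1=credit, 2=debit.
import pymysql  # assumes the pymysql driver is installed

TRIAL_BALANCE_SQL = """
SELECT gl.gl_code,
       gl.name,
       SUM(CASE WHEN je.type_enum = 2 THEN je.amount ELSE 0 END) AS debits,
       SUM(CASE WHEN je.type_enum = 1 THEN je.amount ELSE 0 END) AS credits
  FROM acc_gl_journal_entry je
  JOIN acc_gl_account gl ON gl.id = je.account_id
 WHERE je.reversed = 0
 GROUP BY gl.gl_code, gl.name
 ORDER BY gl.gl_code
"""

# "mifostenant-default" is the default tenant schema name; adjust to yours.
conn = pymysql.connect(host="localhost", user="root",
                       password="mysql", database="mifostenant-default")
try:
    with conn.cursor() as cur:
        cur.execute(TRIAL_BALANCE_SQL)
        for gl_code, name, debits, credits in cur.fetchall():
            print(f"{gl_code:>10}  {name:<40} D {debits}  C {credits}")
finally:
    conn.close()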

Another, slightly risky, approach would be to use the latest import code
(still API based) available in the Fineract develop branch to import the
data, and then to fall back to a stable release before using the data
(some data version mismatch issues can occur here as well). The advantage
is that you can split your data staff-wise or branch-wise and upload the
workbooks in parallel. For example, if you are using a 32-core machine
for importing, you can start with 60 parallel workbook uploads and
gradually ramp up to find the optimal import throughput. The majority of
your post-migration verification is taken care of in this method.
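
As a rough sketch of what the parallel workbook uploads could look like,
see below. The uploadtemplate endpoint path, form fields, tenant header
and credentials are my assumptions based on the bulk import API in the
develop branch; adjust them to whatever your build actually exposes.

# Sketch: push branch-wise client workbooks to Fineract in parallel.
# Endpoint path, form fields and credentials are assumptions; check the
# bulk-import resources in the develop branch you deploy.
import glob
from concurrent.futures import ThreadPoolExecutor

import requests  # assumes the requests library is installed

BASE = "https://localhost:8443/fineract-provider/api/v1"
AUTH = ("mifos", "password")                  # default demo credentials
HEADERS = {"Fineract-Platform-TenantId": "default"}

def upload(path):
    """Upload one workbook; returns (path, HTTP status)."""
    with open(path, "rb") as fh:
        resp = requests.post(
            f"{BASE}/clients/uploadtemplate",  # assumed bulk-import endpoint
            files={"file": fh},
            data={"locale": "en", "dateFormat": "dd MMMM yyyy"},
            headers=HEADERS, auth=AUTH, verify=False)
    return path, resp.status_code

# Start with ~60 workers on a 32-core box and tune from there.
with ThreadPoolExecutor(max_workers=60) as pool:
    for path, status in pool.map(upload, glob.glob("workbooks/clients_*.xls")):
        print(status, path)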

Regards,
Avik.

On Thu, Feb 15, 2018 at 8:11 AM, Hari Haran <ha...@habile.in> wrote:

> Team,
>
> I am in the process of migrating 70 million records from an existing
> system to Mifos.
>
> I am facing the following challenges.
>
>    - As I am using the Data Import Tool, as suggested by the support
>    team, it takes a lot of time to load data via API calls (*it took
>    nearly 7 days for 17 lakh (1.7 million) records*).
>    - As we are using Excel-based templates, we can only load a limited
>    amount of data into each template, which also takes a lot of manual
>    work and results in data errors.
>
> I have gone through the community forums and there is no exact answer to
> this.
>
> One of the forum posts suggested the steps below. I have marked my doubt
> inline.
>
>    - Always disable all jobs before you start a migration!
>    - Create everything via the API.
>    - Update their status to approved via one SQL statement (to avoid
>    another round of API calls; no updates are made in other tables). -
>    *What do they mean here?*
>    - Disburse via API calls.
>    - Post loan transactions via API calls (make sure that they are
>    sorted by date, with the most recent payment coming last; MifosX will
>    reprocess all previous entries if you backdate before a previous
>    entry).
>
> Here too, they are suggesting that we use API calls.
>
> My requirements are given below.
>
>    - Need to migrate 70 million records faster
>    - Accounting needs to be enabled
>    - Verification needs to be done after data migration
>
>
>
> ------------------------------
> Regards,
> Hariharan.G
> Technical Consultant | Habile Technologies
> www.habiletechnologies.com | (O): +91 44 4202 0550 | (M): + 91 90030 25156
> Skype: hari.mitteam | https://www.facebook.com/HabileTechnologies
>

Re: IMP::Regarding Data Migration

Posted by Sundar <su...@gmail.com>.
Thanks a lot for your inputs. Yes, we want to use the Data Import Tool,
as we need the GL entries (even for closed loans).

Could anyone please explain the suggestion: "Update their status to
approved via 1 SQL (to avoid another round of API calls, and no updates
are made in other tables)"?
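
My own guess is that it means flipping the loan status directly on the
m_loan table after the create calls, something like the sketch below. I
am assuming the m_loan column names and the status codes here (100 =
submitted and pending approval, 200 = approved), so please correct me if
I have misread it.

# Sketch of the "approve via 1 SQL" step, assuming Fineract's m_loan table.
# Status codes assumed from LoanStatus: 100 = submitted/pending, 200 =
# approved. Column names (approvedon_date, approved_principal, ...) should
# be checked against your Fineract version before running anything like this.
import pymysql  # assumes the pymysql driver is installed

conn = pymysql.connect(host="localhost", user="root",
                       password="mysql", database="mifostenant-default")
try:
    with conn.cursor() as cur:
        cur.execute(
            """
            UPDATE m_loan
               SET loan_status_id     = 200,                      -- approved
                   approvedon_date    = submittedon_date,  -- backdate approval
                   approvedon_userid  = 1,                 -- migration user
                   approved_principal = principal_amount_proposed
             WHERE loan_status_id = 100                    -- pending approval
            """
        )
    conn.commit()
finally:
    conn.close()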

Cheers,
Sundar.


Re: IMP::Regarding Data Migration

Posted by Sander van der Heyden <sa...@musonisystem.com>.
+1 on Avik's response. We did loads of DB-to-DB migrations in the early
days and don't do them anymore; the reason is that too many integrity
issues crop up later. I think the post you quoted is from me, and it was
not describing the Data Import Tool, which is the only thing we use
nowadays.

On the migration, I think there are a few key business decisions to be
made before you even start:
- Do you want to import all loan history with individual payments, or is
it good enough to know that client A had 5 loans of amounts 1, 2, 3, 4, 5
and when they finished paying (without knowing the exact payment
behaviour)? This saves a LOT of data, journal entries etc. (see the
sketch after this list).
- Cloud resources are cheap and can be scaled very easily and quickly.
Upping a DB instance to 60 or 120 GB of memory and plenty of CPU is
cheap, and as long as you scale down when you are done importing, the
cost of resources will almost always be preferable to the cost of waiting
for the import to finish.
- The same goes for your Tomcat servers, and yes, this does mean I always
recommend splitting Tomcat and MySQL onto separate pieces of
infrastructure, especially when talking about a migration of this size.
- Indeed, look at the new version with the integrated import code and
trigger the uploads in parallel.
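
On the first point, collapsing the repayment history can be done entirely
outside Fineract by aggregating the legacy transactions per loan before
import. A pure sketch, with the input column names (loan_id, amount,
paid_on) invented for the example:

# Sketch: collapse per-payment history from the legacy export into one
# summary row per loan (total repaid + date of the last payment).
# The input columns (loan_id, amount, paid_on) are invented for the example.
import csv
from collections import defaultdict

totals = defaultdict(lambda: {"total": 0.0, "last_paid": ""})

with open("legacy_repayments.csv", newline="") as src:
    for row in csv.DictReader(src):
        t = totals[row["loan_id"]]
        t["total"] += float(row["amount"])
        t["last_paid"] = max(t["last_paid"], row["paid_on"])  # ISO dates sort

with open("loan_summaries.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["loan_id", "total_repaid", "closed_on"])
    for loan_id, t in sorted(totals.items()):
        writer.writerow([loan_id, f'{t["total"]:.2f}', t["last_paid"]])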

S



Sander van der Heyden

CTO Musoni Services




Mobile (NL): +31 (0)6 14239505
Skype: s.vdheyden
Website: musonisystem.com
Follow us on Twitter!  <https://twitter.com/musonimfi>
Postal address: Papiermolen 10, 3994DK Houten, Netherlands
