Posted to hdfs-user@hadoop.apache.org by "Leonard, Michael" <Mi...@opco.com> on 2015/10/26 16:01:40 UTC
Hadoop on premise versus cloud
Hi,
I work at a large financial institution. I'm exploring deploying Hadoop and I'm trying to understand why I would deploy on premise when the cloud is faster and easier. What are the pros/cons of each? How does pricing compare between on premise and cloud deployments?
Any color would be very helpful. Thank you in advance.
Sincerely,
Michael
This communication and any attached files may contain information that is confidential or privileged. If this communication has been received in error, please delete or destroy it immediately. Please go to http://www.opco.com/EmailDisclosures for important information and further disclosures pertaining to this transmission.
Re: Hadoop on premise versus cloud
Posted by daemeon reiydelle <da...@gmail.com>.
In addition to the data privacy concern described well below, there are a
couple of other areas you might consider (you can also respond privately,
where I can be a bit more candid). My experience with most banks (I work
with most of the players in the EU and US) is that (1) below drives
development heavily into the cloud, in spite of (2).
1. Physical plant processes?
   1. Your culture and processes may add months to a hardware delivery
      cycle for production systems.
   2. Does the data center under your employer's control even have the
      available racks, power, network ports and router software versions to
      support bonding/stacking/teaming multiple 10 Gbit ports?
   3. Big data gets much less interesting and viable when layers like
      SAN and heavy hypervisors (VMware, Citrix) get into the mix; ditto "full
      nightly backups" and other interesting confusions about the tech, not to
      mention organizations that are heavily committed to these and require
      full DR and zero-data-loss backups.
2. Data issues
   1. Are the data sources "inside" your employer's network, which would
      require extra authorizations to allow them to connect to a cloud provider?
   2. Are the consumers of the data going to be able to access the
      cluster? (Similar questions apply if an intermediating data manipulation
      tool is accessed by your employees/consumers.)
3. As to data privacy
   1. There are several data center providers who are legally and
      entirely based within either continental Europe (Netherlands, Germany)
      or, if you think that is an alternative, the UK. To my knowledge Amazon is
      not yet there but will be. I do not know if it is generally available, but
      Google Compute does have such ring-fenced facilities in the EU, etc.
4. Now the real motivations:
   1. Startup costs: as you figure out your data ingest complexity and as
      your user expectations get clarified, you seldom know what you will
      need in the form that management needs for their planning cycle.
   2. Capital (hardware) costs are zero and all costs can be written off
      in the current period. (Management has no hit to their capital expenses
      budget.)
   3. Changing (increasing ;{) costs can be directly tied to specific
      activities as they occur (customer wants more X, additional data
      sources Y, and data Z% dirtier than expected ... you know the drill, yes?)
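
The cost trade-off in point 4 can be roughed out with a small model. This is only a sketch under stated assumptions: the function names and every figure in the usage note are hypothetical placeholders, not real pricing, and the model ignores depreciation, staffing, data transfer and discounts. It compares an up-front hardware purchase plus a fixed monthly running cost against a cloud bill that follows the number of nodes actually used each month.

```python
def on_prem_cumulative(months, hardware_capex, monthly_opex):
    """Cumulative on-premise cost per month: hardware paid up front,
    then a fixed running cost (power, space, support) every month."""
    return [hardware_capex + monthly_opex * m for m in range(1, months + 1)]


def cloud_cumulative(monthly_node_counts, cost_per_node_month):
    """Cumulative cloud cost per month: no up-front spend, each month
    billed for the nodes actually used in that month."""
    total = 0.0
    cumulative = []
    for nodes in monthly_node_counts:
        total += nodes * cost_per_node_month
        cumulative.append(total)
    return cumulative


def break_even_month(on_prem, cloud):
    """First month (1-based) in which cumulative cloud spend reaches
    cumulative on-premise spend, or None if it never does in the window."""
    for month, (prem_cost, cloud_cost) in enumerate(zip(on_prem, cloud), start=1):
        if cloud_cost >= prem_cost:
            return month
    return None
```

With purely illustrative inputs, `break_even_month(on_prem_cumulative(3, 100, 10), cloud_cumulative([1, 2, 3], 50))` returns 2: the cloud is cheaper in the first month and overtakes the on-premise total in the second. Feeding in your own node-growth estimate shows how sensitive that crossover is to the uncertainty described in 4.1.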
“Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming ‘Wow! What a Ride!’” - Hunter Thompson
Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872
Re: Hadoop on premise versus cloud
Posted by Daniel Schulz <da...@hotmail.com>.
Hi Michael,
Thank you for your message.
To European customers, data privacy is a major concern, so they are quite reluctant to use public clouds, even if their data will be encrypted. This is one reason many organisations strongly prefer on-premise clouds to public ones. Some companies use public clouds, others plan to migrate to a private one, but most major companies do not want to rely on a third party when it comes to company data.
On premise, latency is also lower because all processes run in your local network, though for production data loads throughput may offset that advantage.
However, the major advantage of public clouds is the relatively small cost to get started. AWS and others are ready to go, whereas a private cloud needs to be installed and later maintained.
Hope this helps a little.
Kind regards, Daniel.