Posted to user@cassandra.apache.org by Dennis <ar...@yahoo.com.cn> on 2010/05/07 04:02:18 UTC

Virtualization vs. Cassandra and Hadoop

Please check out this PNG image from the attachment or from Google Docs:
http://docs.google.com/drawings/pub?id=1P3jdSddseG1oSYrtjREWcajizxmxoRIhUHCEw4sDi3k&w=771&h=624

So, what I want to build is something like a private cloud storage solution. I believe the HTTP servers and application servers should be set up on VMs, but what about the Cassandra and Hadoop servers: should they be set up on VMs or directly on physical machines? If they should be set up on VMs, should the Cassandra and Hadoop data be stored in local storage or in a Storage Repository?

Thanks,
Dennis


      

Re: Virtualization vs. Cassandra and Hadoop

Posted by Edward Capriolo <ed...@gmail.com>.
On Thu, May 6, 2010 at 10:02 PM, Dennis <ar...@yahoo.com.cn> wrote:

> Please check out this PNG image from the attachment or from Google Docs:
> http://docs.google.com/drawings/pub?id=1P3jdSddseG1oSYrtjREWcajizxmxoRIhUHCEw4sDi3k&w=771&h=624
> So, what I want to build is something like a private cloud storage solution.
> I believe the HTTP servers and application servers should be set up on VMs,
> but what about the Cassandra and Hadoop servers: should they be set up on
> VMs or directly on physical machines? If they should be set up on VMs,
> should the Cassandra and Hadoop data be stored in local storage or in a
> Storage Repository?
>
> Thanks,
> Dennis
>
>
Dennis,

Looks like fun :)

Either architecture is fine, depending on your usage patterns. I have tried
running Cassandra on non-dedicated nodes. I have noticed that if Cassandra
gets CPU starved, the gossip protocol detects nodes as down, and that is a bad
thing. So the danger is that very CPU-intensive Hadoop jobs could starve
Cassandra. From what I have seen, Cassandra's load is fairly even except when
it runs anti-compaction and repairs.
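
To illustrate why CPU starvation shows up as "node down": Cassandra's gossip relies on an accrual-style failure detector that grows more suspicious the longer a peer stays silent. The following is only a toy sketch of that idea, not Cassandra's actual code. The class name, the threshold of 8.0, and the one-second heartbeat interval are all assumptions chosen for the example.

```python
import math
from collections import deque


class SimpleFailureDetector:
    """Toy accrual-style detector; illustrative only, not Cassandra's code."""

    def __init__(self, threshold=8.0, window=100):
        self.threshold = threshold             # hypothetical conviction threshold
        self.intervals = deque(maxlen=window)  # recent heartbeat gaps, in seconds
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a gossip heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: grows the longer the peer stays silent."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        # Assuming exponentially distributed gaps,
        # P(gap > elapsed) = exp(-elapsed / mean); phi is -log10 of that.
        return (elapsed / mean) / math.log(10)

    def is_down(self, now):
        return self.phi(now) > self.threshold


detector = SimpleFailureDetector()
for t in range(10):                  # steady heartbeats, one per second
    detector.heartbeat(float(t))

healthy = detector.is_down(10.0)     # 1 s since the last heartbeat
starved = detector.is_down(30.0)     # 21 s of silence, e.g. no CPU time
print(healthy, starved)
```

The peer in this sketch never crashed; it simply failed to send heartbeats for a while, which is what a node starved of CPU by a heavy Hadoop job would look like from the outside.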

Re: Virtualization vs. Cassandra and Hadoop

Posted by Steve Loughran <st...@apache.org>.
Vijay wrote:
> Probably only me... but we have seen higher latencies when using VMware.
> I also think it depends on the H/W and VM configuration. I have to figure
> out why (you might also try mixing the applications that run on the
> hardware). I think there are people who run it on Amazon's EC2.

IO latency to virtual HDDs sucks as there is an extra layer there; the 
virtual OS thinks the blocks are laid out sequentially, but on the real 
disk they could be fragmented. Virtualised network IO can be slower too.

CPU performance (raw computation) can be fairly close to physical.
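
One way to check this on your own setup is to time small synchronous writes on a VM and on a physical machine and compare. This is a minimal sketch, not a proper benchmark: the function name `fsync_latencies`, the 4 KB block size, and the use of the system temp directory are assumptions for illustration. Point it at the directory where the data would actually live.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies(directory, writes=50, block_size=4096):
    """Return per-write latencies (seconds) for small synchronous writes."""
    block = b"\0" * block_size
    samples = []
    # The temporary file is deleted automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(writes):
            start = time.perf_counter()
            f.write(block)
            f.flush()                 # push Python's buffer to the OS
            os.fsync(f.fileno())      # ask the OS to push it to the device
            samples.append(time.perf_counter() - start)
    return samples


# Placeholder location; replace with the real data directory under test.
samples = fsync_latencies(tempfile.gettempdir())
print(f"median {statistics.median(samples) * 1e3:.3f} ms, "
      f"max {max(samples) * 1e3:.3f} ms")
```

Running the same script on both kinds of host, on otherwise idle machines, gives a rough feel for how much the extra virtual disk layer costs; a dedicated disk benchmarking tool would give more trustworthy numbers.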

Re: Virtualization vs. Cassandra and Hadoop

Posted by Vijay <vi...@gmail.com>.
Probably only me... but we have seen higher latencies when using VMware.
I also think it depends on the H/W and VM configuration. I have to figure
out why (you might also try mixing the applications that run on the
hardware). I think there are people who run it on Amazon's EC2.

Regards,
</VJ>



