Posted to common-user@hadoop.apache.org by Ankur Goel <an...@corp.aol.com> on 2009/04/28 12:46:51 UTC

Hadoop / MySQL

hello hadoop users, 
Recently I had a chance to lead a team building a log-processing system that uses Hadoop and MySQL. The system's goal was to process the incoming information as quickly as possible (real time or near real time), and make it available for querying in MySQL. I thought it would be good to share the experience and the challenges with the community. Couldn't think of a better place than these mailing lists as I am not much of a blogger :-) 

The information flow in the system looks something like 

[Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools] 

Transferring from Apache-Servers to Hadoop was quite easy as we just had to organize the data in timely buckets (directories). Once that was running smoothly we had to make sure that map-reduce jobs were fired at regular intervals and that they picked up the right data. The jobs would then process/aggregate the data and dump the results into MySQL shards from the reducers [we have our own DB partitioning set up]. This is where we hit major bottlenecks [any surprises? :-)]

The table engine used was InnoDB as there was a need for fast replication and writes but only moderate reads (it should eventually support high read rates). The data would take quite a while to load completely, far from being near real time. And so our optimization journey began.

1. We tried to optimize/tune InnoDB parameters like increasing the buffer pool size to 75% of available RAM. This helped, but only as long as the DBs were lightly loaded, i.e. while InnoDB had a buffer pool large enough to hold the data and indexes.

2. We also realized that InnoDB has considerable locking overhead, which makes write concurrency really bad when you have a large number of concurrent threads doing writes. The thread concurrency for us was set to no_of_cpu * 2 = 8, which is what the official documentation advises as the optimal limit. So we limited the number of reduce tasks, and consequently the number of concurrent writes, and boy, the performance improved 4x. We were almost there :-)
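
For illustration, these two knobs live in my.cnf and would look roughly like this (the values below are placeholders for a hypothetical 16 GB, 4-core box, not our production settings):

    # illustrative values only
    innodb_buffer_pool_size = 12G     # ~75% of RAM
    innodb_thread_concurrency = 8     # 2 * number of CPUs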

3. The next thing we tried was standard DB optimization techniques like de-normalizing the schema and dropping constraints. This gave only a minor performance improvement, nothing earth shattering. Note that we were already caching connections in reducers to each MySQL shard, and the partitioning logic was embedded into the reducers.

4. Falling still short of our performance objectives, we finally decided to get rid of JDBC writes from the reducers and work on an alternative that uses MySQL's LOAD utility.
- The processing would partition the data into MySQL shard specific files resident in HDFS. 
- A script would then spawn processes via ssh on different physical machines to download this data. 
- Each spawned process just downloads the data for the shard it should upload to. 
- All the processes then start uploading data in parallel into their respective MySQL shards using LOAD DATA INFILE (see the sketch after this list).
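
For reference, the per-shard load is just a plain LOAD DATA statement along these lines (the file and table names below are made up for illustration, not our real schema):

    LOAD DATA INFILE '/tmp/shard_01/log_aggregates.tsv'
    INTO TABLE log_aggregates
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n';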

This proved to be the fastest approach, even as data volumes kept increasing. The entire processing/loading would complete in less than 6 min. The system has been holding up quite well so far, even though we've had to limit the number of days for which we keep the data or else the MySQLs get overwhelmed.

Hope this is helpful to people. 

Regards 
-Ankur 

Re: Hadoop / MySQL

Posted by Yi-Kai Tsai <yi...@yahoo-inc.com>.
Hi Ankur

Nice share. BTW, what's your query behavior?
I'm asking because if the queries are simple or could be
transformed/normalized, you could try writing the output to HBase directly.

Yi-Kai

> hello hadoop users, 
> Recently I had a chance to lead a team building a log-processing system that uses Hadoop and MySQL. The system's goal was to process the incoming information as quickly as possible (real time or near real time), and make it available for querying in MySQL. I thought it would be good to share the experience and the challenges with the community. Couldn't think of a better place than these mailing lists as I am not much of a blogger :-) 
>
> The information flow in the system looks something like 
>
> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools] 
>
> Transferring from Apache-Servers to Hadoop was quite easy as we just had to organize the data in timely buckets (directories). Once that was running smooth we had to make sure that map-reduce jobs are fired at regular intervals and they pick up the right data. The jobs would then process/aggregate the date and dump the info into MySQL shards from the reducers [we have our own DB partioning set up]. This is where we hit major bottlenecks [any surprises? :-)] 
>
> The table engine used was InnoDB as there was a need for fast replication and writes but only moderate reads (should eventually support high read rates). The data would take up quite a while to load completely far away from being near-real time. And so our optimization journey begin. 
>
> 1. We tried to optimize/tune InnoDB parameters like increasing the buffer pool size to 75 % of available RAM. This helped but only till the time DBs were lightly loaded i.e. innoDB had sufficient buffer pool to host the data and indexes. 
>
> 2. We also realized that InnoDB has considerable locking overhead because of which write concurrency is really bad when you have a large number of concurrent threads doing writes. The default thread concurrency for us was set to no_of_cpu * 2 = 8 which is what the official documentation advises as the optimal limit. So we limited the number of reduce tasks and consequently the number of concurrent writes and boy the performance improved 4x. We were almost there :-) 
>
> 3. Next thing we tried is the standard DB optimzation techniques like de-normalizing the schema and dropping constraints. This gave only a minor performance improvement, nothing earth shattering. Note that we were already caching connections in reducers to each MySQL shard and partionining logic was embedded into reducers. 
>
> 4. Falling still short of our performance objectives, we finally we decided to get rid of JDBC writes from reducers and work on an alternative that uses MySQLs LOAD utility. 
> - The processing would partition the data into MySQL shard specific files resident in HDFS. 
> - A script would then spawn processes via ssh on different physical machines to download this data. 
> - Each spawned process just downloads the data for the shard it should upload to. 
> - All the processes then start uploading data in parallel into their respective MySQL shards using LOAD DATA infile. 
>
> This proved to be the fastest approach, even in the wake of increasing data loads. The enitre processing/loading would complete in less than 6 min. The system has been holding up quite well so far, even though we've had to limit the number of days for which we keep the data or else the MySQLs get overwhelmed. 
>
> Hope this is helpful to people. 
>
> Regards 
> -Ankur 
>   


-- 
Yi-Kai Tsai (cuma) <yi...@yahoo-inc.com>, Asia Search Engineering.


Re: Hadoop / MySQL

Posted by Peter Skomoroch <pe...@gmail.com>.
Thanks for sharing - sounds like a nice system. I always advise people to
avoid direct SQL inserts for batch jobs / large amounts of data and to use
MySQL's optimized LOAD utility like you did.  Same goes for Oracle...
Nothing brings a DB server to its knees like a ton of individual inserts on
indexed tables.

On Tue, Apr 28, 2009 at 6:46 AM, Ankur Goel <an...@corp.aol.com> wrote:

>
> hello hadoop users,
> Recently I had a chance to lead a team building a log-processing system
> that uses Hadoop and MySQL. The system's goal was to process the incoming
> information as quickly as possible (real time or near real time), and make
> it available for querying in MySQL. I thought it would be good to share the
> experience and the challenges with the community. Couldn't think of a better
> place than these mailing lists as I am not much of a blogger :-)
>
> The information flow in the system looks something like
>
> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools]
>
> Transferring from Apache-Servers to Hadoop was quite easy as we just had to
> organize the data in timely buckets (directories). Once that was running
> smooth we had to make sure that map-reduce jobs are fired at regular
> intervals and they pick up the right data. The jobs would then
> process/aggregate the date and dump the info into MySQL shards from the
> reducers [we have our own DB partioning set up]. This is where we hit major
> bottlenecks [any surprises? :-)]
>
> The table engine used was InnoDB as there was a need for fast replication
> and writes but only moderate reads (should eventually support high read
> rates). The data would take up quite a while to load completely far away
> from being near-real time. And so our optimization journey begin.
>
> 1. We tried to optimize/tune InnoDB parameters like increasing the buffer
> pool size to 75 % of available RAM. This helped but only till the time DBs
> were lightly loaded i.e. innoDB had sufficient buffer pool to host the data
> and indexes.
>
> 2. We also realized that InnoDB has considerable locking overhead because
> of which write concurrency is really bad when you have a large number of
> concurrent threads doing writes. The default thread concurrency for us was
> set to no_of_cpu * 2 = 8 which is what the official documentation advises as
> the optimal limit. So we limited the number of reduce tasks and consequently
> the number of concurrent writes and boy the performance improved 4x. We were
> almost there :-)
>
> 3. Next thing we tried is the standard DB optimzation techniques like
> de-normalizing the schema and dropping constraints. This gave only a minor
> performance improvement, nothing earth shattering. Note that we were already
> caching connections in reducers to each MySQL shard and partionining logic
> was embedded into reducers.
>
> 4. Falling still short of our performance objectives, we finally we decided
> to get rid of JDBC writes from reducers and work on an alternative that uses
> MySQLs LOAD utility.
> - The processing would partition the data into MySQL shard specific files
> resident in HDFS.
> - A script would then spawn processes via ssh on different physical
> machines to download this data.
> - Each spawned process just downloads the data for the shard it should
> upload to.
> - All the processes then start uploading data in parallel into their
> respective MySQL shards using LOAD DATA infile.
>
> This proved to be the fastest approach, even in the wake of increasing data
> loads. The enitre processing/loading would complete in less than 6 min. The
> system has been holding up quite well so far, even though we've had to limit
> the number of days for which we keep the data or else the MySQLs get
> overwhelmed.
>
> Hope this is helpful to people.
>
> Regards
> -Ankur
>



-- 
Peter N. Skomoroch
617.285.8348
http://www.datawrangling.com
http://delicious.com/pskomoroch
http://twitter.com/peteskomoroch

Re: Hadoop / MySQL

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Apr 29, 2009 at 10:19 AM, Stefan Podkowinski <sp...@gmail.com> wrote:
> If you have trouble loading your data into mysql using INSERTs or LOAD
> DATA, consider that MySQL supports CSV directly using the CSV storage
> engine. The only thing you have to do is to copy your hadoop produced
> csv file into the mysql data directory and issue a "flush tables"
> command to have mysql flush its caches and pickup the new file. Its
> very simple and you have the full set of sql commands available just
> as with innodb or myisam. What you don't get with the csv engine are
> indexes and foreign keys. Can't have it all, can you?
>
> Stefan
>
>
> On Tue, Apr 28, 2009 at 9:23 PM, Bill Habermaas <bi...@habermaas.us> wrote:
>> Excellent discussion. Thank you Todd.
>> You're forgiven for being off topic (at least by me).
>> :)
>> Bill
>>
>> -----Original Message-----
>> From: Todd Lipcon [mailto:todd@cloudera.com]
>> Sent: Tuesday, April 28, 2009 2:29 PM
>> To: core-user
>> Subject: Re: Hadoop / MySQL
>>
>> Warning: derailing a bit into MySQL discussion below, but I think enough
>> people have similar use cases that it's worth discussing this even though
>> it's gotten off-topic.
>>
>> 2009/4/28 tim robertson <ti...@gmail.com>
>>
>>>
>>> So we ended up with 2 DBs
>>> - DB1 we insert to, prepare and do batch processing
>>> - DB2 serving the read only web app
>>>
>>
>> This is a pretty reasonable and common architecture. Depending on your
>> specific setup, instead of flip-flopping between DB1 and DB2, you could
>> actually pull snapshots of MyISAM tables off DB1 and load them onto other
>> machines. As long as you've flushed the tables with a read lock, MyISAM
>> tables are transferrable between machines (eg via rsync). Obviously this can
>> get a bit hairy, but it's a nice trick to consider for this kind of
>> workflow.
>>
>>
>>> Why did we end up with this?  Because of locking on writes that kill
>>> reads as you say... basically you can't insert when a read is
>>> happening on myisam as it locks the whole table.
>>
>>
>> This is only true if you have binary logging enabled. Otherwise, myisam
>> supports concurrent inserts with reads. That said, binary logging is
>> necessary if you have any slaves. If you're loading bulk data from the
>> result of a mapreduce job, you might be better off not using replication and
>> simply loading the bulk data to each of the serving replicas individually.
>> Turning off the binary logging will also double your write speed (LOAD DATA
>> writes the entirety of the data to the binary log as well as to the table)
>>
>>
>>>  InnoDB has row level
>>> locking to get around this but in our experience (at the time we had
>>> 130million records) it just didn't work either.
>>
>>
>> You're quite likely to be hitting the InnoDB autoincrement lock if you have
>> an autoincrement primary key here. There are fixes for this in MySQL 5.1.
>> The best solution is to avoid autoincrement primary keys and use LOAD DATA
>> for these kind of bulk loads, as others have suggested.
>>
>>
>>>  We spent €10,000 for
>>> the supposed "european expert" on mysql from their professional
>>> services and were unfortunately very disappointed.  Seems such large
>>> tables are just problematic with mysql.  We are now very much looking
>>> into Lucene technologies for search and Hadoop for reporting and
>>> datamining type operations. SOLR does a lot of what our DB does for
>>> us.
>>>
>>
>> Yep - oftentimes MySQL is not the correct solution, but other times it can
>> be just what you need. If you already have competencies with MySQL and a
>> good access layer from your serving tier, it's often easier to stick with
>> MySQL than add a new technology into the mix.
>>
>>
>>>
>>> So with myisam... here is what we learnt:
>>>
>>> Only very latest mysql versions (beta still I think) support more than
>>> 4G memory for indexes (you really really need the index in memory, and
>>> where possible the FK for joins in the index too).
>>
>>
>> As far as I know, any 64-bit mysql instance will use more than 4G without
>> trouble.
>>
>>
>>>  Mysql has
>>> differing join strategies between innoDB and myisam, so be aware.
>>
>>
>> I don't think this is true. Joining happens at the MySQL execution layer,
>> which is above the storage engine API. The same join strategies are
>> available for both. For a particular query, InnoDB and MyISAM tables may end
>> up providing a different query plan based on the statistics that are
>> collected, but given properly analyzed tables, the strategies will be the
>> same. This is how MySQL allows inter-storage-engine joins. If one engine is
>> providing a better query plan, you can use query hints to enforce that plan
>> (see STRAIGHT_JOIN and FORCE INDEX for example)
>>
>>
>>> An undocumented feature of myisam is you can create memory buffers for
>>> single indexes:
>>> In the my.cnf:
>>>     taxon_concept_cache.key_buffer_size=3990M    -- for some reason
>>> you have to drop a little under 4G
>>>
>>> then in the DB run:
>>>    cache index taxon_concept in taxon_concept_cache;
>>>    load index into cache taxon_concept;
>>>
>>> This allows for making sure an index gets into memory for sure.
>>>
>>
>> But for most use cases and a properly configured machine you're better off
>> letting it use its own caching policies to keep hot indexes in RAM.
>>
>>
>>>
>>> And here is our main cnf changes:
>>>
>>> sort_buffer_size = 128M
>>
>>
>> This is likely to be causing huge slowdowns on ORDER BYs. See
>> http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/for
>> example. Try setting to 128K instead.
>>
>>
>>>
>>> read_buffer_size = 64M
>>> read_rnd_buffer_size = 300M
>>
>>
>> read_rnd_buffer_size seems too high to me.
>>
>>
>> Hope this is helpful for anyone else using MySQL as a sink for MapReduce
>> output.
>>
>> -Todd
>>
>>
>>
>>>
>>> On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong <wa...@gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > That's really cool. It seems that Hadoop could work with SQL DBs like
>>> > Mysql with bearable time. I thought when inserting data to Mysql, the
>>> > expense of communication was always a big problem. You got a method to
>>> > reduce the expense.
>>> >
>>> > Using Distribution Databases like HBase is another good choice. It is
>>> > based on Hadoop and HDFS, so there is no problem with communication
>>> > expense.
>>> >
>>> >
>>> >
>>> > On Tue, Apr 28, 2009 at 6:46 PM, Ankur Goel <an...@corp.aol.com>
>>> wrote:
>>> >>
>>> >> hello hadoop users,
>>> >> Recently I had a chance to lead a team building a log-processing system
>>> that uses Hadoop and MySQL. The system's goal was to process the incoming
>>> information as quickly as possible (real time or near real time), and make
>>> it available for querying in MySQL. I thought it would be good to share the
>>> experience and the challenges with the community. Couldn't think of a better
>>> place than these mailing lists as I am not much of a blogger :-)
>>> >>
>>> >> The information flow in the system looks something like
>>> >>
>>> >> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools]
>>> >>
>>> >> Transferring from Apache-Servers to Hadoop was quite easy as we just had
>>> to organize the data in timely buckets (directories). Once that was running
>>> smooth we had to make sure that map-reduce jobs are fired at regular
>>> intervals and they pick up the right data. The jobs would then
>>> process/aggregate the date and dump the info into MySQL shards from the
>>> reducers [we have our own DB partioning set up]. This is where we hit major
>>> bottlenecks [any surprises? :-)]
>>> >>
>>> >> The table engine used was InnoDB as there was a need for fast
>>> replication and writes but only moderate reads (should eventually support
>>> high read rates). The data would take up quite a while to load completely
>>> far away from being near-real time. And so our optimization journey begin.
>>> >>
>>> >> 1. We tried to optimize/tune InnoDB parameters like increasing the
>>> buffer pool size to 75 % of available RAM. This helped but only till the
>>> time DBs were lightly loaded i.e. innoDB had sufficient buffer pool to host
>>> the data and indexes.
>>> >>
>>> >> 2. We also realized that InnoDB has considerable locking overhead
>>> because of which write concurrency is really bad when you have a large
>>> number of concurrent threads doing writes. The default thread concurrency
>>> for us was set to no_of_cpu * 2 = 8 which is what the official documentation
>>> advises as the optimal limit. So we limited the number of reduce tasks and
>>> consequently the number of concurrent writes and boy the performance
>>> improved 4x. We were almost there :-)
>>> >>
>>> >> 3. Next thing we tried is the standard DB optimzation techniques like
>>> de-normalizing the schema and dropping constraints. This gave only a minor
>>> performance improvement, nothing earth shattering. Note that we were already
>>> caching connections in reducers to each MySQL shard and partionining logic
>>> was embedded into reducers.
>>> >>
>>> >> 4. Falling still short of our performance objectives, we finally we
>>> decided to get rid of JDBC writes from reducers and work on an alternative
>>> that uses MySQLs LOAD utility.
>>> >> - The processing would partition the data into MySQL shard specific
>>> files resident in HDFS.
>>> >> - A script would then spawn processes via ssh on different physical
>>> machines to download this data.
>>> >> - Each spawned process just downloads the data for the shard it should
>>> upload to.
>>> >> - All the processes then start uploading data in parallel into their
>>> respective MySQL shards using LOAD DATA infile.
>>> >>
>>> >> This proved to be the fastest approach, even in the wake of increasing
>>> data loads. The enitre processing/loading would complete in less than 6 min.
>>> The system has been holding up quite well so far, even though we've had to
>>> limit the number of days for which we keep the data or else the MySQLs get
>>> overwhelmed.
>>> >>
>>> >> Hope this is helpful to people.
>>> >>
>>> >> Regards
>>> >> -Ankur
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Wang Zhong
>>> >
>>>
>>
>>
>>
>

Here is a thought that just blew my mind. MySQL's CSV storage engine is
essentially one of the functions of hadoop-hive. Imagine a type of
"auto replication" that could store the same data in MySQL and in
hadoop-hive. You would be able to support short-term MySQL queries and
long-running Hive queries over the same dataset...
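
As a rough sketch of the Hive side of that idea (table, column, and path names here are invented for illustration), an external table could sit over the same delimited files that feed the CSV engine:

    CREATE EXTERNAL TABLE page_hits (hit_time INT, url STRING, hits INT)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/page_hits';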

Re: Hadoop / MySQL

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Apr 29, 2009 at 12:25 PM, Edward Capriolo <ed...@gmail.com>wrote:

>
>
> The .frm and .myd are binary platform dependent files. You can not
> even move them from 32bit-64bit. Generating them without native tools
> would be difficult. Moving then around with HDFS might have merit,
> although the RSYNC could accomplish the same thing.
>

Nope -- MyISAM data files are completely compatible between 32-bit and
64-bit as well as different architectures. From the MySQL manual:

"All data values are stored with the low byte first. This makes the data
machine and operating system independent. The only requirements for binary
portability are that the machine uses two's-complement signed integers and
IEEE floating-point format. These requirements are widely used among
mainstream machines. Binary compatibility might not be applicable to
embedded systems, which sometimes have peculiar processors."



> Derby-DB might be a better candidate for something like this since the
> underlying DB is cross platform.
>

Yep, but for shops that already have a big MySQL infrastructure, it makes
sense not to introduce more software.

-Todd

Re: Hadoop / MySQL

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Apr 29, 2009 at 2:48 PM, Todd Lipcon <to...@cloudera.com> wrote:
> On Wed, Apr 29, 2009 at 7:19 AM, Stefan Podkowinski <sp...@gmail.com>wrote:
>
>> If you have trouble loading your data into mysql using INSERTs or LOAD
>> DATA, consider that MySQL supports CSV directly using the CSV storage
>> engine. The only thing you have to do is to copy your hadoop produced
>> csv file into the mysql data directory and issue a "flush tables"
>> command to have mysql flush its caches and pickup the new file. Its
>> very simple and you have the full set of sql commands available just
>> as with innodb or myisam. What you don't get with the csv engine are
>> indexes and foreign keys. Can't have it all, can you?
>>
>
> The CSV storage engine is definitely an interesting option, but it has a
> couple downsides:
>
> - Like you mentioned, you don't get indexes. This seems like a huge deal to
> me - the reason you want to load data into MySQL instead of just keeping it
> in Hadoop is so you can service real-time queries. Not having any indexing
> kind of defeats the purpose there. This is especially true since MySQL only
> supports nested-loop joins, and there's no way of attaching metadata to a
> CSV table to say "hey look, this table is already in sorted order so you can
> use a merge join".
>
> - Since CSV is a text based format, it's likely to be a lot less compact
> than a proper table. For example, a unix timestamp is likely to be ~10
> characters vs 4 bytes in a packed table.
>
> - I'm not aware of many people actually using CSV for anything except
> tutorials and training. Since it's not in heavy use by big mysql users, I
> wouldn't build a production system around it.
>
> Here's a wacky idea that I might be interested in hacking up if anyone's
> interested:
>
> What if there were a MyISAMTableOutputFormat in hadoop? You could use this
> as a reducer output and have it actually output .frm and .myd files onto
> HDFS, then simply hdfs -get them onto DB servers for realtime serving.
> Sounds like a fun hack I might be interested in if people would find it
> useful. Building the .myi indexes in Hadoop would be pretty killer as well,
> but potentially more difficult.
>
> -Todd
>

The .frm and .myd are binary, platform-dependent files. You can not
even move them from 32-bit to 64-bit. Generating them without native tools
would be difficult. Moving them around with HDFS might have merit,
although rsync could accomplish the same thing.

Derby-DB might be a better candidate for something like this since the
underlying DB is cross platform.

Re: Hadoop / MySQL

Posted by Bhupesh Bansal <bb...@linkedin.com>.
Slightly off topic, as this is a non-MySQL solution.

We have the same problem computing about 100G of data daily and serving it
online with minimal impact during data refresh.

We are using our in-house clone of Amazon Dynamo, a key-value distributed
hash table store (Project-Voldemort), for the serving side. Project-Voldemort
supports a ReadOnlyStore which uses file-based data/index files. The
interesting part is that we compute the new data/index on Hadoop and just
hot-swap it on the Voldemort nodes. Total swap time is roughly the scp/rsync
time, with the actual service impact being very, very minimal (closing and
opening file descriptors).

Thanks a lot for the info on this thread; it has been very interesting.

Best
Bhupesh


On 4/29/09 11:48 AM, "Todd Lipcon" <to...@cloudera.com> wrote:

> On Wed, Apr 29, 2009 at 7:19 AM, Stefan Podkowinski <sp...@gmail.com>wrote:
> 
>> If you have trouble loading your data into mysql using INSERTs or LOAD
>> DATA, consider that MySQL supports CSV directly using the CSV storage
>> engine. The only thing you have to do is to copy your hadoop produced
>> csv file into the mysql data directory and issue a "flush tables"
>> command to have mysql flush its caches and pickup the new file. Its
>> very simple and you have the full set of sql commands available just
>> as with innodb or myisam. What you don't get with the csv engine are
>> indexes and foreign keys. Can't have it all, can you?
>> 
> 
> The CSV storage engine is definitely an interesting option, but it has a
> couple downsides:
> 
> - Like you mentioned, you don't get indexes. This seems like a huge deal to
> me - the reason you want to load data into MySQL instead of just keeping it
> in Hadoop is so you can service real-time queries. Not having any indexing
> kind of defeats the purpose there. This is especially true since MySQL only
> supports nested-loop joins, and there's no way of attaching metadata to a
> CSV table to say "hey look, this table is already in sorted order so you can
> use a merge join".
> 
> - Since CSV is a text based format, it's likely to be a lot less compact
> than a proper table. For example, a unix timestamp is likely to be ~10
> characters vs 4 bytes in a packed table.
> 
> - I'm not aware of many people actually using CSV for anything except
> tutorials and training. Since it's not in heavy use by big mysql users, I
> wouldn't build a production system around it.
> 
> Here's a wacky idea that I might be interested in hacking up if anyone's
> interested:
> 
> What if there were a MyISAMTableOutputFormat in hadoop? You could use this
> as a reducer output and have it actually output .frm and .myd files onto
> HDFS, then simply hdfs -get them onto DB servers for realtime serving.
> Sounds like a fun hack I might be interested in if people would find it
> useful. Building the .myi indexes in Hadoop would be pretty killer as well,
> but potentially more difficult.
> 
> -Todd


Re: Hadoop / MySQL

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Apr 29, 2009 at 7:19 AM, Stefan Podkowinski <sp...@gmail.com>wrote:

> If you have trouble loading your data into mysql using INSERTs or LOAD
> DATA, consider that MySQL supports CSV directly using the CSV storage
> engine. The only thing you have to do is to copy your hadoop produced
> csv file into the mysql data directory and issue a "flush tables"
> command to have mysql flush its caches and pickup the new file. Its
> very simple and you have the full set of sql commands available just
> as with innodb or myisam. What you don't get with the csv engine are
> indexes and foreign keys. Can't have it all, can you?
>

The CSV storage engine is definitely an interesting option, but it has a
couple downsides:

- Like you mentioned, you don't get indexes. This seems like a huge deal to
me - the reason you want to load data into MySQL instead of just keeping it
in Hadoop is so you can service real-time queries. Not having any indexing
kind of defeats the purpose there. This is especially true since MySQL only
supports nested-loop joins, and there's no way of attaching metadata to a
CSV table to say "hey look, this table is already in sorted order so you can
use a merge join".

- Since CSV is a text based format, it's likely to be a lot less compact
than a proper table. For example, a unix timestamp is likely to be ~10
characters vs 4 bytes in a packed table.

- I'm not aware of many people actually using CSV for anything except
tutorials and training. Since it's not in heavy use by big mysql users, I
wouldn't build a production system around it.

Here's a wacky idea that I might be interested in hacking up if anyone's
interested:

What if there were a MyISAMTableOutputFormat in hadoop? You could use this
as a reducer output and have it actually output .frm and .myd files onto
HDFS, then simply hdfs -get them onto DB servers for realtime serving.
Sounds like a fun hack I might be interested in if people would find it
useful. Building the .myi indexes in Hadoop would be pretty killer as well,
but potentially more difficult.

-Todd

Re: Hadoop / MySQL

Posted by Stefan Podkowinski <sp...@gmail.com>.
If you have trouble loading your data into MySQL using INSERTs or LOAD
DATA, consider that MySQL supports CSV directly using the CSV storage
engine. The only thing you have to do is to copy your Hadoop-produced
CSV file into the MySQL data directory and issue a "flush tables"
command to have MySQL flush its caches and pick up the new file. It's
very simple, and you have the full set of SQL commands available just
as with InnoDB or MyISAM. What you don't get with the CSV engine are
indexes and foreign keys. Can't have it all, can you?
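
A minimal sketch of that setup (table and column names are just examples; note the CSV engine requires NOT NULL columns):

    CREATE TABLE page_hits (
        hit_time  INT NOT NULL,
        url       VARCHAR(255) NOT NULL,
        hits      INT NOT NULL
    ) ENGINE=CSV;

    -- overwrite page_hits.CSV in the MySQL data directory with the
    -- Hadoop-produced file, then make MySQL re-read it:
    FLUSH TABLES;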

Stefan


On Tue, Apr 28, 2009 at 9:23 PM, Bill Habermaas <bi...@habermaas.us> wrote:
> Excellent discussion. Thank you Todd.
> You're forgiven for being off topic (at least by me).
> :)
> Bill
>
> -----Original Message-----
> From: Todd Lipcon [mailto:todd@cloudera.com]
> Sent: Tuesday, April 28, 2009 2:29 PM
> To: core-user
> Subject: Re: Hadoop / MySQL
>
> Warning: derailing a bit into MySQL discussion below, but I think enough
> people have similar use cases that it's worth discussing this even though
> it's gotten off-topic.
>
> 2009/4/28 tim robertson <ti...@gmail.com>
>
>>
>> So we ended up with 2 DBs
>> - DB1 we insert to, prepare and do batch processing
>> - DB2 serving the read only web app
>>
>
> This is a pretty reasonable and common architecture. Depending on your
> specific setup, instead of flip-flopping between DB1 and DB2, you could
> actually pull snapshots of MyISAM tables off DB1 and load them onto other
> machines. As long as you've flushed the tables with a read lock, MyISAM
> tables are transferrable between machines (eg via rsync). Obviously this can
> get a bit hairy, but it's a nice trick to consider for this kind of
> workflow.
>
>
>> Why did we end up with this?  Because of locking on writes that kill
>> reads as you say... basically you can't insert when a read is
>> happening on myisam as it locks the whole table.
>
>
> This is only true if you have binary logging enabled. Otherwise, myisam
> supports concurrent inserts with reads. That said, binary logging is
> necessary if you have any slaves. If you're loading bulk data from the
> result of a mapreduce job, you might be better off not using replication and
> simply loading the bulk data to each of the serving replicas individually.
> Turning off the binary logging will also double your write speed (LOAD DATA
> writes the entirety of the data to the binary log as well as to the table)
>
>
>>  InnoDB has row level
>> locking to get around this but in our experience (at the time we had
>> 130million records) it just didn't work either.
>
>
> You're quite likely to be hitting the InnoDB autoincrement lock if you have
> an autoincrement primary key here. There are fixes for this in MySQL 5.1.
> The best solution is to avoid autoincrement primary keys and use LOAD DATA
> for these kind of bulk loads, as others have suggested.
>
>
>>  We spent €10,000 for
>> the supposed "european expert" on mysql from their professional
>> services and were unfortunately very disappointed.  Seems such large
>> tables are just problematic with mysql.  We are now very much looking
>> into Lucene technologies for search and Hadoop for reporting and
>> datamining type operations. SOLR does a lot of what our DB does for
>> us.
>>
>
> Yep - oftentimes MySQL is not the correct solution, but other times it can
> be just what you need. If you already have competencies with MySQL and a
> good access layer from your serving tier, it's often easier to stick with
> MySQL than add a new technology into the mix.
>
>
>>
>> So with myisam... here is what we learnt:
>>
>> Only very latest mysql versions (beta still I think) support more than
>> 4G memory for indexes (you really really need the index in memory, and
>> where possible the FK for joins in the index too).
>
>
> As far as I know, any 64-bit mysql instance will use more than 4G without
> trouble.
>
>
>>  Mysql has
>> differing join strategies between innoDB and myisam, so be aware.
>
>
> I don't think this is true. Joining happens at the MySQL execution layer,
> which is above the storage engine API. The same join strategies are
> available for both. For a particular query, InnoDB and MyISAM tables may end
> up providing a different query plan based on the statistics that are
> collected, but given properly analyzed tables, the strategies will be the
> same. This is how MySQL allows inter-storage-engine joins. If one engine is
> providing a better query plan, you can use query hints to enforce that plan
> (see STRAIGHT_JOIN and FORCE INDEX for example)
>
>
>> An undocumented feature of myisam is you can create memory buffers for
>> single indexes:
>> In the my.cnf:
>>     taxon_concept_cache.key_buffer_size=3990M    -- for some reason
>> you have to drop a little under 4G
>>
>> then in the DB run:
>>    cache index taxon_concept in taxon_concept_cache;
>>    load index into cache taxon_concept;
>>
>> This allows for making sure an index gets into memory for sure.
>>
>
> But for most use cases and a properly configured machine you're better off
> letting it use its own caching policies to keep hot indexes in RAM.
>
>
>>
>> And here is our main cnf changes:
>>
>> sort_buffer_size = 128M
>
>
> This is likely to be causing huge slowdowns on ORDER BYs. See
> http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/for
> example. Try setting to 128K instead.
>
>
>>
>> read_buffer_size = 64M
>> read_rnd_buffer_size = 300M
>
>
> read_rnd_buffer_size seems too high to me.
>
>
> Hope this is helpful for anyone else using MySQL as a sink for MapReduce
> output.
>
> -Todd
>
>
>
>>
>> On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong <wa...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > That's really cool. It seems that Hadoop could work with SQL DBs like
>> > Mysql with bearable time. I thought when inserting data to Mysql, the
>> > expense of communication was always a big problem. You got a method to
>> > reduce the expense.
>> >
>> > Using Distribution Databases like HBase is another good choice. It is
>> > based on Hadoop and HDFS, so there is no problem with communication
>> > expense.
>> >
>> >
>> >
>> > On Tue, Apr 28, 2009 at 6:46 PM, Ankur Goel <an...@corp.aol.com>
>> wrote:
>> >>
>> >> hello hadoop users,
>> >> Recently I had a chance to lead a team building a log-processing system
>> that uses Hadoop and MySQL. The system's goal was to process the incoming
>> information as quickly as possible (real time or near real time), and make
>> it available for querying in MySQL. I thought it would be good to share the
>> experience and the challenges with the community. Couldn't think of a better
>> place than these mailing lists as I am not much of a blogger :-)
>> >>
>> >> The information flow in the system looks something like
>> >>
>> >> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools]
>> >>
>> >> Transferring from Apache-Servers to Hadoop was quite easy as we just had
>> to organize the data in timely buckets (directories). Once that was running
>> smooth we had to make sure that map-reduce jobs are fired at regular
>> intervals and they pick up the right data. The jobs would then
>> process/aggregate the date and dump the info into MySQL shards from the
>> reducers [we have our own DB partioning set up]. This is where we hit major
>> bottlenecks [any surprises? :-)]
>> >>
>> >> The table engine used was InnoDB as there was a need for fast
>> replication and writes but only moderate reads (should eventually support
>> high read rates). The data would take up quite a while to load completely
>> far away from being near-real time. And so our optimization journey begin.
>> >>
>> >> 1. We tried to optimize/tune InnoDB parameters like increasing the
>> buffer pool size to 75 % of available RAM. This helped but only till the
>> time DBs were lightly loaded i.e. innoDB had sufficient buffer pool to host
>> the data and indexes.
>> >>
>> >> 2. We also realized that InnoDB has considerable locking overhead
>> because of which write concurrency is really bad when you have a large
>> number of concurrent threads doing writes. The default thread concurrency
>> for us was set to no_of_cpu * 2 = 8 which is what the official documentation
>> advises as the optimal limit. So we limited the number of reduce tasks and
>> consequently the number of concurrent writes and boy the performance
>> improved 4x. We were almost there :-)
>> >>
>> >> 3. Next thing we tried is the standard DB optimzation techniques like
>> de-normalizing the schema and dropping constraints. This gave only a minor
>> performance improvement, nothing earth shattering. Note that we were already
>> caching connections in reducers to each MySQL shard and partionining logic
>> was embedded into reducers.
>> >>
>> >> 4. Falling still short of our performance objectives, we finally we
>> decided to get rid of JDBC writes from reducers and work on an alternative
>> that uses MySQLs LOAD utility.
>> >> - The processing would partition the data into MySQL shard specific
>> files resident in HDFS.
>> >> - A script would then spawn processes via ssh on different physical
>> machines to download this data.
>> >> - Each spawned process just downloads the data for the shard it should
>> upload to.
>> >> - All the processes then start uploading data in parallel into their
>> respective MySQL shards using LOAD DATA infile.
>> >>
>> >> This proved to be the fastest approach, even in the wake of increasing
>> data loads. The enitre processing/loading would complete in less than 6 min.
>> The system has been holding up quite well so far, even though we've had to
>> limit the number of days for which we keep the data or else the MySQLs get
>> overwhelmed.
>> >>
>> >> Hope this is helpful to people.
>> >>
>> >> Regards
>> >> -Ankur
>> >>
>> >
>> >
>> >
>> > --
>> > Wang Zhong
>> >
>>
>
>
>

RE: Hadoop / MySQL

Posted by Bill Habermaas <bi...@habermaas.us>.
Excellent discussion. Thank you Todd. 
You're forgiven for being off topic (at least by me). 
:)
Bill

-----Original Message-----
From: Todd Lipcon [mailto:todd@cloudera.com] 
Sent: Tuesday, April 28, 2009 2:29 PM
To: core-user
Subject: Re: Hadoop / MySQL

Warning: derailing a bit into MySQL discussion below, but I think enough
people have similar use cases that it's worth discussing this even though
it's gotten off-topic.

2009/4/28 tim robertson <ti...@gmail.com>

>
> So we ended up with 2 DBs
> - DB1 we insert to, prepare and do batch processing
> - DB2 serving the read only web app
>

This is a pretty reasonable and common architecture. Depending on your
specific setup, instead of flip-flopping between DB1 and DB2, you could
actually pull snapshots of MyISAM tables off DB1 and load them onto other
machines. As long as you've flushed the tables with a read lock, MyISAM
tables are transferrable between machines (eg via rsync). Obviously this can
get a bit hairy, but it's a nice trick to consider for this kind of
workflow.


> Why did we end up with this?  Because of locking on writes that kill
> reads as you say... basically you can't insert when a read is
> happening on myisam as it locks the whole table.


This is only true if you have binary logging enabled. Otherwise, myisam
supports concurrent inserts with reads. That said, binary logging is
necessary if you have any slaves. If you're loading bulk data from the
result of a mapreduce job, you might be better off not using replication and
simply loading the bulk data to each of the serving replicas individually.
Turning off the binary logging will also double your write speed (LOAD DATA
writes the entirety of the data to the binary log as well as to the table)


>  InnoDB has row level
> locking to get around this but in our experience (at the time we had
> 130million records) it just didn't work either.


You're quite likely to be hitting the InnoDB autoincrement lock if you have
an autoincrement primary key here. There are fixes for this in MySQL 5.1.
The best solution is to avoid autoincrement primary keys and use LOAD DATA
for these kind of bulk loads, as others have suggested.


>  We spent €10,000 for
> the supposed "european expert" on mysql from their professional
> services and were unfortunately very disappointed.  Seems such large
> tables are just problematic with mysql.  We are now very much looking
> into Lucene technologies for search and Hadoop for reporting and
> datamining type operations. SOLR does a lot of what our DB does for
> us.
>

Yep - oftentimes MySQL is not the correct solution, but other times it can
be just what you need. If you already have competencies with MySQL and a
good access layer from your serving tier, it's often easier to stick with
MySQL than add a new technology into the mix.


>
> So with myisam... here is what we learnt:
>
> Only very latest mysql versions (beta still I think) support more than
> 4G memory for indexes (you really really need the index in memory, and
> where possible the FK for joins in the index too).


As far as I know, any 64-bit mysql instance will use more than 4G without
trouble.


>  Mysql has
> differing join strategies between innoDB and myisam, so be aware.


I don't think this is true. Joining happens at the MySQL execution layer,
which is above the storage engine API. The same join strategies are
available for both. For a particular query, InnoDB and MyISAM tables may end
up providing a different query plan based on the statistics that are
collected, but given properly analyzed tables, the strategies will be the
same. This is how MySQL allows inter-storage-engine joins. If one engine is
providing a better query plan, you can use query hints to enforce that plan
(see STRAIGHT_JOIN and FORCE INDEX for example)


> An undocumented feature of myisam is you can create memory buffers for
> single indexes:
> In the my.cnf:
>     taxon_concept_cache.key_buffer_size=3990M    -- for some reason
> you have to drop a little under 4G
>
> then in the DB run:
>    cache index taxon_concept in taxon_concept_cache;
>    load index into cache taxon_concept;
>
> This allows for making sure an index gets into memory for sure.
>

But for most use cases and a properly configured machine you're better off
letting it use its own caching policies to keep hot indexes in RAM.


>
> And here is our main cnf changes:
>
> sort_buffer_size = 128M


This is likely to be causing huge slowdowns on ORDER BYs. See
http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/for
example. Try setting to 128K instead.


>
> read_buffer_size = 64M
> read_rnd_buffer_size = 300M


read_rnd_buffer_size seems too high to me.


Hope this is helpful for anyone else using MySQL as a sink for MapReduce
output.

-Todd



>
> On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong <wa...@gmail.com>
> wrote:
> > Hi,
> >
> > [...]



Re: Hadoop / MySQL

Posted by Todd Lipcon <to...@cloudera.com>.
Warning: derailing a bit into MySQL discussion below, but I think enough
people have similar use cases that it's worth discussing this even though
it's gotten off-topic.

2009/4/28 tim robertson <ti...@gmail.com>

>
> So we ended up with 2 DBs
> - DB1 we insert to, prepare and do batch processing
> - DB2 serving the read only web app
>

This is a pretty reasonable and common architecture. Depending on your
specific setup, instead of flip-flopping between DB1 and DB2, you could
actually pull snapshots of MyISAM tables off DB1 and load them onto other
machines. As long as you've flushed the tables with a read lock, MyISAM
tables are transferable between machines (e.g. via rsync). Obviously this can
get a bit hairy, but it's a nice trick to consider for this kind of
workflow.
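
A minimal sketch of that trick, assuming MyISAM tables under /var/lib/mysql
and a target host name that is obviously made up:

    -- on the batch DB, block writes and flush the MyISAM files to disk
    FLUSH TABLES WITH READ LOCK;
    -- while the lock is held, copy the table files across, e.g. from a shell:
    --   rsync -a /var/lib/mysql/logdb/ serving-host:/var/lib/mysql/logdb/
    -- release the lock so batch loading can continue
    UNLOCK TABLES;

On the serving side a FLUSH TABLES (or a restart) should make MySQL pick up
the freshly copied .frm/.MYD/.MYI files.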


> Why did we end up with this?  Because of locking on writes that kill
> reads as you say... basically you can't insert when a read is
> happening on myisam as it locks the whole table.


This is only true if you have binary logging enabled. Otherwise, myisam
supports concurrent inserts with reads. That said, binary logging is
necessary if you have any slaves. If you're loading bulk data from the
result of a mapreduce job, you might be better off not using replication and
simply loading the bulk data to each of the serving replicas individually.
Turning off the binary logging will also double your write speed (LOAD DATA
writes the entirety of the data to the binary log as well as to the table).
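
For what it's worth, a rough sketch of skipping the binary log for just the
bulk-load session rather than server-wide (the table name and file path are
invented, and SET sql_log_bin needs the SUPER privilege):

    -- is MyISAM allowed to do concurrent inserts at all?
    SHOW VARIABLES LIKE 'concurrent_insert';
    -- keep this session's bulk load out of the binary log, then load
    SET sql_log_bin = 0;
    LOAD DATA INFILE '/data/shard_03.tsv' INTO TABLE hits_daily;
    SET sql_log_bin = 1;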


>  InnoDB has row level
> locking to get around this but in our experience (at the time we had
> 130million records) it just didn't work either.


You're quite likely to be hitting the InnoDB autoincrement lock if you have
an autoincrement primary key here. There are fixes for this in MySQL 5.1.
The best solution is to avoid autoincrement primary keys and use LOAD DATA
for this kind of bulk load, as others have suggested.
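
For reference, the 5.1 behaviour is controlled by a startup variable; a
hedged sketch of checking and changing it:

    -- MySQL 5.1+: 0 = traditional table-level AUTO-INC lock,
    --             1 = consecutive (default), 2 = interleaved
    SHOW VARIABLES LIKE 'innodb_autoinc_lock_mode';
    -- it is read-only at runtime, so set it in my.cnf, e.g.:
    --   innodb_autoinc_lock_mode = 2
    -- note: interleaved mode is only safe with row-based binary logging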


>  We spent €10,000 for
> the supposed "european expert" on mysql from their professional
> services and were unfortunately very disappointed.  Seems such large
> tables are just problematic with mysql.  We are now very much looking
> into Lucene technologies for search and Hadoop for reporting and
> datamining type operations. SOLR does a lot of what our DB does for
> us.
>

Yep - oftentimes MySQL is not the correct solution, but other times it can
be just what you need. If you already have competencies with MySQL and a
good access layer from your serving tier, it's often easier to stick with
MySQL than add a new technology into the mix.


>
> So with myisam... here is what we learnt:
>
> Only very latest mysql versions (beta still I think) support more than
> 4G memory for indexes (you really really need the index in memory, and
> where possible the FK for joins in the index too).


As far as I know, any 64-bit mysql instance will use more than 4G without
trouble.
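
A quick way to sanity-check what the server actually accepted (the figure
below is just an illustration):

    SHOW VARIABLES LIKE 'key_buffer_size';
    -- on a 64-bit build this should be settable past 4G in my.cnf, e.g.
    --   key_buffer_size = 8192M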


>  Mysql has
> differing join strategies between innoDB and myisam, so be aware.


I don't think this is true. Joining happens at the MySQL execution layer,
which is above the storage engine API. The same join strategies are
available for both. For a particular query, InnoDB and MyISAM tables may end
up providing a different query plan based on the statistics that are
collected, but given properly analyzed tables, the strategies will be the
same. This is how MySQL allows inter-storage-engine joins. If one engine is
providing a better query plan, you can use query hints to enforce that plan
(see STRAIGHT_JOIN and FORCE INDEX, for example).
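
A hedged illustration of pinning a plan - every table, column and index name
below is invented:

    SELECT STRAIGHT_JOIN d.log_date, h.url, h.hit_count
    FROM dates d
    JOIN hits_daily h FORCE INDEX (idx_hits_log_date)
      ON h.log_date = d.log_date
    WHERE d.log_date >= '2009-04-01';
    -- STRAIGHT_JOIN joins the tables in the order they are listed;
    -- FORCE INDEX makes the optimizer use the named index for that table.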


> An undocumented feature of myisam is you can create memory buffers for
> single indexes:
> In the my.cnf:
>     taxon_concept_cache.key_buffer_size=3990M    -- for some reason
> you have to drop a little under 4G
>
> then in the DB run:
>    cache index taxon_concept in taxon_concept_cache;
>    load index into cache taxon_concept;
>
> This allows for making sure an index gets into memory for sure.
>

But for most use cases and a properly configured machine, you're better off
letting MySQL use its own caching policies to keep hot indexes in RAM.
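
If in doubt, the key cache counters give a rough idea of whether the hot
indexes are already living in RAM:

    SHOW GLOBAL STATUS LIKE 'Key_read%';
    -- Key_reads / Key_read_requests is roughly the index-read miss rate;
    -- if it stays very small, the cache is already doing its job.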


>
> And here is our main cnf changes:
>
> sort_buffer_size = 128M


This is likely to be causing huge slowdowns on ORDER BYs. See
http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/
for an example. Try setting it to 128K instead.
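
For instance (purely illustrative - hits_daily is the same made-up table as
above), the buffer is allocated per sort per connection, so it can stay
small globally and be raised only for a session that genuinely needs it:

    -- my.cnf keeps the global default small:  sort_buffer_size = 128K
    SET SESSION sort_buffer_size = 32 * 1024 * 1024;  -- 32M for this session
    SELECT url, SUM(hit_count) AS hits
      FROM hits_daily
     GROUP BY url
     ORDER BY hits DESC
     LIMIT 100;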


>
> read_buffer_size = 64M
> read_rnd_buffer_size = 300M


read_rnd_buffer_size seems too high to me.


Hope this is helpful for anyone else using MySQL as a sink for MapReduce
output.

-Todd



> [...]

Re: Hadoop / MySQL

Posted by Joerg Rieger <jo...@mni.fh-giessen.de>.
I remember reading an article last year about something similar done  
by Rackspace.

They went through various iterations of their logging system and  
encountered similar
scaling issues with MySQL.

In the end they started using Hadoop, Lucene and Solr:

"
How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
"


On 28.04.2009, at 15:29, tim robertson wrote:

> [...]

-- 




Re: Hadoop / MySQL

Posted by tim robertson <ti...@gmail.com>.
Sorry, that was not meant to be sent to the list...


On Tue, Apr 28, 2009 at 3:27 PM, tim robertson
<ti...@gmail.com> wrote:
> [...]

Re: Hadoop / MySQL

Posted by tim robertson <ti...@gmail.com>.
Hi,

[Ankur]: How can we make sure this happens?
-- show processlist is how we spot it... the offending queries literally take
hours on our setup, so they're easy to find.

So we ended up with 2 DBs
- DB1 we insert to, prepare and do batch processing
- DB2 serving the read only web app

Periodically we dump the DB1, point the read-only web app at DB1, load
the dump into DB2, and then DB2 becomes the write DB... and an endless
cycle.  We can afford data latency in our system though.

Why did we end up with this?  Because of locking on writes that kill
reads as you say... basically you can't insert when a read is
happening on myisam as it locks the whole table.  InnoDB has row level
locking to get around this but in our experience (at the time we had
130million records) it just didn't work either.  We spent €10,000 for
the supposed "european expert" on mysql from their professional
services and were unfortunately very disappointed.  Seems such large
tables are just problematic with mysql.  We are now very much looking
into Lucene technologies for search and Hadoop for reporting and
datamining type operations. SOLR does a lot of what our DB does for
us.

So with myisam... here is what we learnt:

Only very latest mysql versions (beta still I think) support more than
4G memory for indexes (you really really need the index in memory, and
where possible the FK for joins in the index too).  Mysql has
differing join strategies between innoDB and myisam, so be aware.
An undocumented feature of myisam is you can create memory buffers for
single indexes:
In the my.cnf:
     taxon_concept_cache.key_buffer_size=3990M    -- for some reason
you have to drop a little under 4G

then in the DB run:
    cache index taxon_concept in taxon_concept_cache;
    load index into cache taxon_concept;

This allows for making sure an index gets into memory for sure.

And here is our main cnf changes:

table_cache = 512
sort_buffer_size = 128M
read_buffer_size = 64M
read_rnd_buffer_size = 300M
thread_cache_size = 8
query_cache_type = 0
query_cache_size = 0
thread_concurrency = 8
# this is used for repair by sorting and is faster than keycache by far
myisam_max_sort_file_size = 200G
myisam_max_extra_sort_file_size = 200G
# this is used for repair by sorting and is limited to 4G
myisam_sort_buffer_size = 3990M
# maximum value for this is 4g for now but see section on specific
keys for large indexes
key_buffer_size=3990M
tmp_table_size=512M
max_heap_table_size=512M

-- for repair by sorting, here are the required values
myisam_max_sort_file_size = 200G
myisam_max_extra_sort_file_size = 200G
myisam_sort_buffer_size = 3990M


I guess just archive this mail off in case you ever do try myisam for
something... these params are the result of a lot of trial and error
and days of waiting!

Our DB has about 44G of index, which we can still keep in memory, but we
are hitting limits - hence we're on the Hadoop train.

Cheers

Tim





On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong <wa...@gmail.com> wrote:
> [...]

Re: Hadoop / MySQL

Posted by Wang Zhong <wa...@gmail.com>.
Hi,

That's really cool. It seems that Hadoop can work with SQL databases like
MySQL in acceptable time. I had thought the communication overhead of
inserting data into MySQL was always a big problem, but you found a way to
reduce it.

Using a distributed database like HBase is another good choice. It is built
on Hadoop and HDFS, so there is no communication overhead problem.



On Tue, Apr 28, 2009 at 6:46 PM, Ankur Goel <an...@corp.aol.com> wrote:
> [...]



-- 
Wang Zhong