Posted to common-user@hadoop.apache.org by Marcos Medrado Rubinelli <ma...@buscape-inc.com> on 2010/03/23 12:04:59 UTC

Hadoop for accounting?

Hi,

The wiki's "Powered By" page ( http://wiki.apache.org/hadoop/PoweredBy ) 
lists dozens of companies using Hadoop in production, some of them for 
mission-critical operations, but is anyone using it - or planning to - 
for anything that involves money, like calculating a bill from API 
usage, or microtransactions?

If not, what are your main concerns? What parts would you consider 
stable enough for this kind of use?

Thanks,
Marcos

Re: Hadoop for accounting?

Posted by Stephen Watt <sw...@us.ibm.com>.
Hi Marcos

People are doing very similar things, but it's not immediately clear from 
the way the use cases are described on the PoweredBy page (I believe the 
company Visible Measures may be one example).

If you are storing API usage information in a log, you can use Hadoop to 
examine the logs, group the usage information by user, pivot on user usage 
by specific API and then aggregate on the last collection to determine the 
charges. This kind of log analysis is a very common pattern in Hadoop and 
it works and scales very well (to answer your questions about viability 
and stability).
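
As a rough illustration of the group / pivot / aggregate steps described above, here is a minimal sketch in plain Python rather than an actual Hadoop job. The log format (`<timestamp> <user> <api>`), the API names, and the per-call prices in `RATES` are all assumptions made up for the example; none of them come from this thread.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical per-call prices; not from the thread.
RATES = {"search": Decimal("0.002"), "geocode": Decimal("0.010")}


def parse(line):
    """Split one log line. Assumed format: '<timestamp> <user> <api>'."""
    _timestamp, user, api = line.split()
    return user, api


def usage_by_user_and_api(lines):
    """Group calls by user, then pivot each user's calls by API name."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        user, api = parse(line)
        counts[user][api] += 1
    return counts


def charges(counts, rates=RATES):
    """Aggregate the pivoted counts into a single charge per user."""
    return {
        user: sum((rates[api] * n for api, n in per_api.items()), Decimal("0"))
        for user, per_api in counts.items()
    }


# Made-up sample log lines in the assumed format.
LOG = [
    "2010-03-23T12:00:01 alice search",
    "2010-03-23T12:00:02 bob geocode",
    "2010-03-23T12:00:03 alice search",
    "2010-03-23T12:00:04 alice geocode",
]

bill = charges(usage_by_user_and_api(LOG))
```

The sketch uses `Decimal` instead of floats so that the per-call prices add up exactly, which is usually what you want once the totals turn into money.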

You can do this by writing MapReduce jobs in Java, by using a higher-level 
language like Pig, or through a higher-level interface (like Apache Hive or 
IBM BigSheets).

Kind regards
Steve Watt




Re: Hadoop for accounting?

Posted by Eric Sammer <es...@cloudera.com>.
Marcos:

It is extremely common that data processed with Hadoop is eventually
used for billing purposes. Regarding your example, one could use map
reduce to calculate usage totals by customer ID and drive billing from
there.
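
To make the map and reduce phases concrete, here is a minimal sketch in plain Python that imitates them in memory; it is not Hadoop API code. The record shape `(customer_id, units)` and the sample values are assumptions for illustration only, and the sort stands in for the shuffle step that a real job would get from the framework.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records):
    """Emit one (customer_id, units) pair per usage record."""
    for customer_id, units in records:
        yield customer_id, units


def reduce_phase(pairs):
    """Sort by key (standing in for the shuffle), then sum units per customer."""
    ordered = sorted(pairs, key=itemgetter(0))
    for customer_id, group in groupby(ordered, key=itemgetter(0)):
        yield customer_id, sum(units for _, units in group)


# Made-up usage records: (customer_id, units consumed).
RECORDS = [("c42", 3), ("c7", 1), ("c42", 5)]

totals = dict(reduce_phase(map_phase(RECORDS)))
```

The per-customer totals in `totals` would then be the input to whatever system actually issues the bill.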

On Tue, Mar 23, 2010 at 7:04 AM, Marcos Medrado Rubinelli
<ma...@buscape-inc.com> wrote:
> Hi,
>
> The wiki's "Powered By" page ( http://wiki.apache.org/hadoop/PoweredBy )
> lists dozens of companies using Hadoop in production, some of them for
> mission-critical operations, but is anyone using it - or planning to - for
> anything that involves money, like calculating a bill from API usage, or
> microtransactions?
>
> If not, what are your main concerns? What parts would you consider stable
> enough for this kind of use?
>
> Thanks,
> Marcos
>



-- 
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com

Re: Hadoop for accounting?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Yeah, that's what struck me immediately too.

I believe this was one of the reasons for moving (quickly) on Kerberos security for Hadoop.  Even with that, there's still a relatively high barrier if someone says the words "FIPS" or "HIPAA".

I love HDFS.  The damn thing never breaks, no matter what hardware or user we throw at it.  Our scientists love it.  However, there's a damn good reason that transactions were invented, especially for accounting/billing matters...

Brian

On Mar 23, 2010, at 11:30 AM, Allen Wittenauer wrote:

> 
> 
> 
> On 3/23/10 4:04 AM, "Marcos Medrado Rubinelli" <ma...@buscape-inc.com>
> wrote:
>> If not, what are your main concerns? What parts would you consider
>> stable enough for this kind of use?
> 
> While we're not doing any sort of billing on Hadoop, my #1 concern would be
> the fact that Hadoop (today) has zero security.  No way it would pass any
> reasonable PCI audit.


Re: Hadoop for accounting?

Posted by Allen Wittenauer <aw...@linkedin.com>.


On 3/23/10 4:04 AM, "Marcos Medrado Rubinelli" <ma...@buscape-inc.com>
wrote:
> If not, what are your main concerns? What parts would you consider
> stable enough for this kind of use?

While we're not doing any sort of billing on Hadoop, my #1 concern would be
the fact that Hadoop (today) has zero security.  No way it would pass any
reasonable PCI audit.