Posted to user@spark.apache.org by Thomas Fuller <th...@coherentlogic.com> on 2018/05/23 19:06:38 UTC

CMR: An open-source data acquisition API for Spark is available

Hi Folks,

Today I've released my open-source CMR API, which is used to acquire data
from several data providers directly in Spark.

Currently the CMR API offers integration with the following:

- Federal Reserve Bank of St. Louis
- World Bank
- TreasuryDirect.gov
- OpenFIGI.com

*Of note*:

- The project page (https://coherentlogic.com/wordpress/cmr/) includes a
jar file built from the current source, along with a few demonstration
videos.
- Source code is available at https://bitbucket.org/CoherentLogic/cmrapi/
- A very simple example of the CMR API configured with the Infinispan
distributed cache (http://infinispan.org/) is available at
https://coherentlogic.com/wordpress/cmr-infinispan-lightning-fast-data-acquisition/

*Sample*:

Bring S&P 500 data directly into Spark from the Federal Reserve Bank of St.
Louis (https://fred.stlouisfed.org/ and
https://research.stlouisfed.org/docs/api/fred/) as follows:

    val observationsDS = cmr.fred.series.observations
      .withApiKey(FRED_API_KEY)
      .withSeriesId("SP500")
      .doGetAsObservationsDataset(spark)
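For anyone unfamiliar with the FRED web service: the builder chain above
presumably ends up as a request to FRED's documented series/observations
endpoint. The sketch below is not CMR code and makes no claim about how CMR
works internally. It only shows, in plain Scala, how such a request URL
could be put together. The names FredUrlSketch and observationsUrl are
hypothetical and exist only for this illustration.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper, NOT part of the CMR API: assembles the URL for
// FRED's documented series/observations endpoint.
object FredUrlSketch {
  private val BaseUrl = "https://api.stlouisfed.org/fred/series/observations"

  // Form-encode a query parameter value (a space becomes "+").
  private def enc(value: String): String =
    URLEncoder.encode(value, StandardCharsets.UTF_8.name())

  // Build the request URL for one series, asking for a JSON response.
  def observationsUrl(apiKey: String, seriesId: String): String =
    s"$BaseUrl?series_id=${enc(seriesId)}&api_key=${enc(apiKey)}&file_type=json"
}
```

Calling FredUrlSketch.observationsUrl(FRED_API_KEY, "SP500") yields the URL
that a plain HTTP client would fetch; the CMR call in the sample takes care
of the request and hands back a Dataset instead.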

Feedback is welcome, so please feel free to send me your comments.

Tom
Coherent Logic Limited <https://coherentlogic.com/> | LinkedIn
<https://www.linkedin.com/in/thomasfuller/>