Working with Big (lots) Data and Pentaho – Extreme Performance


Codeks Blog

OK, firstly, I’m not talking proper BigData here.  This is not Hadoop, or even an analytical database.  (Lets not get into whether an analytical database counts as bigdata though!). And it’s certainly not NoSQL.  Disk space we’re looking at 100’s of gigabytes, not terabytes.  Yet this project involves more data than the Hadoop projects I’ve done.

So tens of billions of records. Records that must be processed in a limited environment in extremely tight time windows.  And yes; I’m storing all of that in MySQL!

Hey, wake up, yes, I did say billions of records in MySQL, try not to lose consciousness again…  (It’s not the first time I’ve had billions of rows in MySQL either – Yet I know some of you will guffaw at the idea)

In fact, in this project we are moving away from a database cluster, to a single box. The database cluster has 64 nodes and 4TB…

View original post 495 more words

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s