Creating Pentaho Reports from MongoDB

So you’ve made the move and started using MongoDB to store unstructured data. Now your users want to create reports on the MongoDB databases and collections. One approach is to use a Kettle transformation that retrieves data from MongoDB for reports. This approach is documented on the Pentaho Wiki. However, I want to query the MongoDB database directly without dealing with Spoon and Kettle transformations. Fortunately, Pentaho Reporting has built-in support for scripting with Groovy. This tutorial will show you how to create a report against MongoDB data using the MongoDB Java driver and Groovy scripting.

You should already have MongoDB installed and accessible. I’m running MongoDB on the same machine with the default settings, so vary the code as needed for your configuration.
You also need to put the mongo-java-driver-2.7.2.jar file in the libraries for Report Designer and the BA Server:
$PENTAHO_HOME/design-tools/Pentaho Report
Restart Report Designer and the BA Server if they are running so that they pick up the new .jar file.
Setting Up
The first thing you need is some data.  I’ve created an input file of sales by region and year to use as an example.  Download and import the data using the mongoimport command:
> mongoimport -d pentaho -c sales data.json
Verify that the data has been successfully imported by opening the mongo shell and using the following commands:
> use pentaho
> db.sales.find();
You should see a list of documents that were added.
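If you don't have the input file at hand, the following is a minimal sketch that writes a stand-in data.json with one JSON document per line, a format mongoimport accepts. The field names (region, year, q1 to q4) are the ones the report script reads later; the class name SampleSalesData and all of the values are made up for illustration and are not the original data set.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SampleSalesData {

    // Formats one sales record as a single-line JSON document.
    // Assumes the region name contains no characters that need JSON escaping.
    static String toJsonLine(String region, int year, int q1, int q2, int q3, int q4) {
        return String.format(Locale.ROOT,
            "{\"region\": \"%s\", \"year\": %d, \"q1\": %d, \"q2\": %d, \"q3\": %d, \"q4\": %d}",
            region, year, q1, q2, q3, q4);
    }

    // Builds a few placeholder records; the numbers are invented, not real sales figures.
    static List<String> sampleLines() {
        List<String> lines = new ArrayList<>();
        lines.add(toJsonLine("East", 2011, 10, 10, 14, 21));
        lines.add(toJsonLine("West", 2011, 12, 9, 17, 20));
        lines.add(toJsonLine("East", 2012, 11, 13, 15, 24));
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write to the path given as the first argument, or to data.json by default.
        Path out = Paths.get(args.length > 0 ? args[0] : "data.json");
        List<String> lines = sampleLines();
        Files.write(out, lines, StandardCharsets.UTF_8);
        System.out.println("Wrote " + lines.size() + " documents to " + out);
    }
}
```

Running the class and then the mongoimport command from this section should load the placeholder documents into the sales collection of the pentaho database.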
Creating the Report
  1. Using Pentaho Report Designer, create a new report.
  2. Add a data source and choose Advanced -> Scriptable.
  3. Select groovy as the language and click the (+) for a new query.
  4. Enter the following code as the script (check that the server address used when creating the database connection matches your setup):

import com.mongodb.*
import org.pentaho.reporting.engine.classic.core.util.TypedTableModel

// Connect to MongoDB. Set the host to your server address; 27017 is the default port.
def mongo = new Mongo("", 27017)
def db = mongo.getDB("pentaho")
def sales = db.getCollection("sales")

// Column names and types of the table model handed to the reporting engine.
String[] columnNames = ["Region", "Year", "Q1", "Q2", "Q3", "Q4"]
Class[] columnTypes = [String, Integer, Integer, Integer, Integer, Integer]
TypedTableModel model = new TypedTableModel(columnNames, columnTypes)

// Hard-coded row for trying out the report layout without MongoDB data.
// It supplies only five of the six columns, so it is left disabled here.
// model.addRow([ new String("East"), new Integer(10), new Integer(10), new Integer(14), new Integer(21) ] as Object[])

// Add one row per document in the sales collection.
def docs = sales.find()
while (docs.hasNext()) {
  def doc = docs.next()
  model.addRow([ doc.get("region"), doc.get("year"), doc.get("q1"), doc.get("q2"), doc.get("q3"), doc.get("q4") ] as Object[])
}

// Release the connection, then return the table model as the script result.
mongo.close()
return model
This will read the data from MongoDB and return the table model needed by the reporting engine.
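Because the documents are unstructured, a field may be missing from some of them, and numeric values are not always stored as integers (values inserted from the mongo shell, for instance, are stored as doubles by default). The following is a rough sketch, in plain Java with no dependencies, of how each document could be turned into a row that matches the declared column types. It uses java.util.Map as a stand-in for the driver's document objects, and the class and method names (SalesRowMapper, toInteger, toRow, toRows) are hypothetical helpers, not part of the MongoDB driver or of Pentaho.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SalesRowMapper {

    // Document field names, in the same order as the report columns
    // Region, Year, Q1, Q2, Q3, Q4.
    static final String[] FIELD_NAMES = {"region", "year", "q1", "q2", "q3", "q4"};

    // Converts any numeric value (Integer, Long, Double, ...) to an Integer.
    // Returns null when the field is missing or not numeric.
    static Integer toInteger(Object value) {
        return (value instanceof Number) ? Integer.valueOf(((Number) value).intValue()) : null;
    }

    // Maps one document to a row: a String region followed by five Integer values.
    static Object[] toRow(Map<String, Object> doc) {
        Object[] row = new Object[FIELD_NAMES.length];
        Object region = doc.get(FIELD_NAMES[0]);
        row[0] = (region == null) ? null : region.toString();
        for (int i = 1; i < FIELD_NAMES.length; i++) {
            row[i] = toInteger(doc.get(FIELD_NAMES[i]));
        }
        return row;
    }

    // Maps every document to a row, preserving the input order.
    static List<Object[]> toRows(List<Map<String, Object>> docs) {
        List<Object[]> rows = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            rows.add(toRow(doc));
        }
        return rows;
    }

    public static void main(String[] args) {
        // A complete document whose numbers happen to be stored as doubles.
        Map<String, Object> complete = new LinkedHashMap<>();
        complete.put("region", "East");
        complete.put("year", 2011.0);
        complete.put("q1", 10.0);
        complete.put("q2", 10.0);
        complete.put("q3", 14.0);
        complete.put("q4", 21.0);

        // A document with the last two quarters missing.
        Map<String, Object> partial = new LinkedHashMap<>();
        partial.put("region", "West");
        partial.put("year", 2012);
        partial.put("q1", 12);
        partial.put("q2", 9);

        for (Object[] row : toRows(Arrays.asList(complete, partial))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In the report script, the same idea would amount to coercing each value before passing the row to model.addRow, so that every cell is either null or an instance of the declared column type.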
From here it’s just standard report generation and publishing, which is described in the Pentaho documentation.

via Creating Pentaho Reports from MongoDB.

Thanks to BillBack  @billbackbi


Pentaho and MongoDB

I’m at the MongoNYC conference in New York today, where Pentaho is a sponsor. 10gen has done a great job with this event, which has 1,000 attendees.

We just announced a strategic partnership between 10gen and Pentaho. From a technical perspective the integration between MongoDB and Pentaho means:

  • No Big Silos. Data silos are bad. Big ones are no better. Our MongoDB ETL connectors for reading and writing data mean you can integrate your MongoDB data store with the rest of your data architecture (relational databases, hosted applications, custom applications, etc.).
  • Live reporting. We can provide desktop and web-based reports directly on MongoDB data.
  • Staging. We can provide trending and historical analysis by staging snapshots of MongoDB aggregations in a column store.

I’m looking forward to working with 10gen to integrate some of their new aggregation capabilities into Pentaho.

Pentaho, 10gen Collaborate to Integrate MongoDB

Business analytics vendor Pentaho and 10gen, the company behind MongoDB, today announced a partnership to provide direct integration between Pentaho Business Analytics and MongoDB.

As enterprise data architectures continue to evolve, customers are looking to address rapidly changing multi-structured data and take advantage of cloud-like architectures. This alliance brings the data integration, data discovery and visualization capabilities of Pentaho to MongoDB.

The companies say that the native integration between Pentaho and MongoDB helps enterprises take advantage of the flexible, scalable data storage capabilities of MongoDB while ensuring compatibility and interoperability with existing data infrastructure.

Pentaho and 10gen have developed connectors to tightly integrate MongoDB and Pentaho Business Analytics. By adding MongoDB integration to its existing library of connectors for relational databases, analytic databases, data warehouses, enterprise applications, and standards-based information exchange formats, Pentaho says it can provide a more robust solution for enterprise architects, developers, data scientists and analysts working with both MongoDB and existing databases.

As a release this week stated, “Enterprise architects benefit from a scalable data integration framework functioning across MongoDB and other data stores, and developers gain access to familiar graphical interfaces for data integration and job management with full support for MongoDB. Data scientists and analysts can now visualize and explore data across multiple data sources, including MongoDB.”

Soon I will make a video explaining MongoDB integration with Pentaho Data Integration and Pentaho Reporting.