Pentaho Report Designer Tips Collection


Set encoding for PDF exports

Tired of character-set errors in PDF exports? Here is a little help.

File –> Configuration –> output-pageable-pdf –> enable the .Encoding checkbox and set a character set, for example ISO-8859-1, as its value.

Set the locale used in date message fields

A message field writes multiple data types (text, string field, date field, and numeric field) into one object. Suppose a message field includes a date, with the value $(report.date,date,EEEE, d MMMM yyyy HH:mm), and we want to change the locale used during development. Here is how to modify it.

File –> Configuration –> core-module –> enable .environment.designtime.Locale and set a value, for example es_ES, as the locale Report Designer uses at design time.

Set CSV export separator character

File –> Configuration –> output-table-csv –> enable .Separator and set a value, for example ; to override the default , character.
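Anything that later reads such an export has to be told about the non-default separator. The following is a minimal Python sketch of that, not part of Report Designer itself; the sample content and the function name read_semicolon_csv are made up for illustration.

```python
import csv
import io

# Hypothetical content of a report exported with ';' as the separator.
# Note the values containing commas, which is a common reason to switch.
exported = "id;customer;amount\n1;Acme, Inc.;10,50\n2;Globex;7,25\n"


def read_semicolon_csv(text):
    """Parse CSV text that uses ';' instead of the default ','."""
    return list(csv.reader(io.StringIO(text), delimiter=";"))


rows = read_semicolon_csv(exported)
```

With ; as the separator, values such as "Acme, Inc." or a decimal-comma amount stay in a single column.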

Packt Publishing celebrate their 2000th title with an exclusive offer #Packt2k


Known for their extensive range of pragmatic IT ebooks, Packt Publishing are celebrating their 2000th book title, ‘Learning Dart’, and they want their customers to celebrate too.

To mark this milestone Packt Publishing will launch a ‘Buy One Get One Free’ offer across all eBooks on March 18th – for a limited period only. 

‘Learning Dart’ was selected as a title and published by Packt earlier this year. As a project that aims to revolutionise a language as crucial as JavaScript, Dart is a great example of an emerging technology which aims to support the community and their requirement for constant improvement. The content itself explains how to develop apps using Dart and HTML5 in a model-driven and fast-paced approach, enabling developers to build more complex and high-performing web apps.

David Maclean, Managing Director, explains: ‘It’s not by chance that this book is our 2000th title. Our customers and community drive demand and it is our job to ensure that whatever they’re working on, Packt provides practical help and support.

At Packt we understand that sometimes our customers want to learn a new programming language pretty much from scratch, with little knowledge of similar language concepts. Other times our customers know a related language fairly well and therefore want a fast-paced primer that brings them up to a competent professional level quickly.

That’s what makes Packt different: all our books are specifically commissioned by category experts, based on intensive research of the technology and the key tasks.’

Since 2004, Packt Publishing has been providing practical IT-related information that enables everyone to learn and develop their IT knowledge, from novice to expert.  

Packt is one of the most prolific and fast-growing tech book publishers in the world. Originally focused on open source software, Packt contributes back into the community paying a royalty on relevant books directly to open source projects. These projects have received over $400,000 as part of Packt’s Open Source Royalty Scheme to date.

Their books focus on practicality, recognising that readers are ultimately concerned with getting the job done. Packt’s digitally-focused business model allows them to quickly publish up-to-date books in very specific areas across a range of key categories – web development, game development, big data, application development, and more. Their commitment to providing a comprehensive range of titles has seen Packt publish 1054% more titles in 2013 than in 2006.

Erol Staveley, Publisher, says: ‘Recent research shows that 88% of our customers are very satisfied with the service knowing that we offer a wide breadth of titles in a timely manner, and owing to the quality of service that they receive 94% of customers are willing to recommend Packt to friends and family. It’s great that we’ve hit such a significant milestone, and we want to continue delivering this fantastic content to our customers.’

Here are some of the best titles across Packt’s main categories – but Buy One, Get One Free will apply across all 2000 titles:

Web Development

Big Data & Cloud

Game Development

App Development

Open Business Analytics Training in London #BI #BigData #ETL #OLAP


Dates: 28th April to 1st May 2014

Duration: 24 hours over 4 days

Location: Executive offices group meeting rooms. London.

Address: Central Court, 25 Southampton Bldgs – WC2A 1AL.

Training contents:

DAY 1
Business Intelligence Open Source Introduction and BI Server User Console
a. Pentaho 5 Architecture and new features, Mondrian, Kettle, etc…
b. Users and roles in Pentaho 5.
c. Browsing the repository in the user console.
d. Design tools.
Pentaho Data Integration (Kettle) ETL tool
a. Best Practices of ETL Processes.
b. Functional Overview (Jobs, transformations, stream control)
c. Parameters and Variables in Kettle
• Environment variables and shared connections.
• ETL scheduling
d. Jobs
• Overview
• Step types (Mail, File Management, Scripting, etc…)
• Steps description
e. Transformations
• Overview
• Step types (Input, Output, Transform, Big Data, etc…)
• Steps description
f. Practical exercises
g. Data profiling with DataCleaner (pattern analysis, value distribution, date gap analysis …)
h. Talend Open Studio vs Kettle comparison
DAY 2
Data warehousing, OLAP and Mondrian
a. Data warehouse – Data mart.
b. Star database schemas.
c. Multidimensional/OLAP
d. Mondrian ROLAP engine.
e. JPivot and MDX.
f. Designing OLAP structures with Schema Workbench.
g. Tips to maximize Mondrian performance.
h. Alternatives to JPivot: STPivot, Saiku Analytics, OpenI
i. Practical Exercises
Social Intelligence
a. Introduction
b. Social KPIs (Facebook, Twitter …)
c. Samples
DAY 3
Reporting
a. AdHoc Reporting
• WAQR
• Pentaho Metadata Editor
• Creating a business model
b. Pentaho Reporting. Report Designer.
c. Practical Exercises
Big Data
a. Big Data Introduction
b. Pentaho Big Data components
c. Relational vs Columnar and Document Databases
DAY 4
Dashboards and Ctools
a. Introduction.
• Previous concepts.
• Best practices in dashboard design.
• Practical design.
b. Types of dashboards.
c. CDF
• Introduction.
• Samples.
d. CDE (Dashboard Editor)
• Introduction.
• Samples.
• Practical Exercise.
e. Ad hoc Dashboards.
• Introduction.
• Samples.
f. STDashboard
Plug-ins
a. SPARKL (Application designer)
b. Startup Rules (substitute for xactions)

Pentaho’s embedded analytics helps prepare German companies for ‘Industry 4.0’


Originally posted on Pentaho Business Analytics Blog:

In Germany, our greatest success last year was with customers across different industries using Pentaho to embed analytics into their commercial software. There are several reasons for this, some of which are unique to Germany and others that reflect a larger global industry trend.

Germany has a vibrant software industry with excellent growth rates and even better forecasts for this year than last. The most important driver for this growth is the large body of highly specialised, medium-sized software companies, which belong to what we in Germany call, the “Mittelstand”. Many of them build specific solutions for our established automotive, chemical and mechanical industries, to name a few.

These companies’ customers are increasingly starting to demand analytics and reporting capabilities to be embedded into their core software applications, not only to make industrial and business processes more efficient, but to lay the foundation for Industry 4.0. Industry 4.0 is a…

Removing Special Characters from a string field in Oracle


Today, while doing consultancy work, I ran into a problem loading a table from Oracle into PostgreSQL. When I checked the logs, I saw that some Oracle varchar fields had strange characters at the end, which caused the INSERT statements to fail. Initially I tried using Pentaho Data Integration to replace values in the string fields, targeting CR, LF and CRLF, since the characters looked like carriage returns when I copied the log files into Notepad++. All attempts were unsuccessful, so I decided to look for Oracle functions and soon found a proper solution.

REGEXP_REPLACE helped me, as you can see in the query below:

SELECT
  -- Remove every character that is not alphanumeric, a single quote, or a space
  REGEXP_REPLACE(customer_description, '[^[:alnum:]'' '']', NULL)
FROM dim_customer

Brief Explanation

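The [:alnum:] character class matches alphabetic and numeric characters, roughly the same as using [a-zA-Z0-9] in a regular expression. The pattern [^[:alnum:]' '] therefore matches any character that is not alphanumeric, a single quote, or a space (the doubled '' in the query is an escaped single quote inside the SQL string literal), and replacing each match with NULL removes it.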
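The same filter can be tried outside the database. This is a rough Python sketch, not the Oracle implementation: Python's re module has no POSIX classes, so [A-Za-z0-9] stands in for [:alnum:] under the assumption that the data is ASCII, and the name strip_special is made up for illustration.

```python
import re

# Approximation of the Oracle pattern [^[:alnum:]' ']: anything that is not an
# ASCII letter, an ASCII digit, a single quote, or a space.
# Assumption: ASCII-only data; Oracle's [:alnum:] can also match other letters.
NOT_ALLOWED = re.compile(r"[^A-Za-z0-9' ]")


def strip_special(value):
    """Remove disallowed characters; None stays None, like a NULL column."""
    if value is None:
        return None
    return NOT_ALLOWED.sub("", value)


# Trailing CR, LF and NUL characters are dropped, the apostrophe is kept.
cleaned = strip_special("O'Brien Ltd\r\n\x00")
```

Note that punctuation such as hyphens and underscores is removed as well, so the character list may need widening for real data.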

 

Hope you enjoyed it :-)

Increase MySQL output to 80K rows/second in Pentaho Data Integration


One of our clients has a MySQL table with around 40M records. Loading the table took around 2.5 hours. When I was watching the statistics of the transformation, I noticed that the bottleneck was the write to the database. I was stuck at around 2000 rows/second. You can imagine that it takes a long time to write 40M records at that speed.
I looked for ways to improve the speed. There were a couple of options:
  1. Tune MySQL for better performance on Inserts
  2. Use the MySQL Bulk loader step in PDI
  3. Write SQL statements to a file with PDI and read them with the mysql binary
When I discussed this with one of my contacts at Basis06, it turned out they had faced a similar issue a while ago. He mentioned that the speed can be boosted with some simple JDBC connection settings:

useServerPrepStmts=false
rewriteBatchedStatements=true
useCompression=true

These options should be entered on the connection in PDI. Double-click the connection, go to Options, and set these values.
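Outside the PDI dialog, MySQL Connector/J also accepts such properties as query parameters on the JDBC URL. A small Python sketch of what the resulting URL looks like follows; the host, port, database name and the helper mysql_jdbc_url are placeholders for illustration, not values from the original setup.

```python
from urllib.parse import urlencode

# The three connection options named in the post.
OPTIONS = {
    "useServerPrepStmts": "false",
    "rewriteBatchedStatements": "true",
    "useCompression": "true",
}


def mysql_jdbc_url(host, port, database, options=None):
    """Build a MySQL JDBC URL with the options appended as a query string."""
    query = urlencode(OPTIONS if options is None else options)
    return f"jdbc:mysql://{host}:{port}/{database}?{query}"


# Hypothetical server and schema names.
url = mysql_jdbc_url("localhost", 3306, "dwh")
```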

Used together, useServerPrepStmts=false and rewriteBatchedStatements=true will “fake” batch inserts on the client. Specifically, the insert statements:

INSERT INTO t (c1,c2) VALUES ('One',1);
INSERT INTO t (c1,c2) VALUES ('Two',2);
INSERT INTO t (c1,c2) VALUES ('Three',3);

will be rewritten into:

INSERT INTO t (c1,c2) VALUES ('One',1),('Two',2),('Three',3);
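The rewrite can be pictured with a short Python sketch. It only illustrates the idea of folding several single-row inserts into one statement and is not how the JDBC driver implements it; the function name rewrite_batch and the %s placeholder style are assumptions made for the example.

```python
def rewrite_batch(table, columns, rows):
    """Fold many single-row inserts into one multi-row INSERT statement.

    Returns the SQL text with %s placeholders plus the flattened parameters.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # One "(%s,%s,...)" group per row, one placeholder per column.
    one_row = "(" + ",".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ",".join(columns), ",".join([one_row] * len(rows))
    )
    # Flatten the row tuples so the values line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


sql, params = rewrite_batch("t", ["c1", "c2"], [("One", 1), ("Two", 2), ("Three", 3)])
```

One statement carrying three rows means one round trip to the server instead of three, which is where most of the gain comes from.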

The third option, useCompression=true, compresses the traffic between the client and the MySQL server.

Finally, I increased the number of copies of the output step to 2 so that there are two threads inserting into the database.

All together, this increased the speed to around 84,000 rows a second! WOW!