Increase MySQL output to 80K rows/second in Pentaho Data Integration


One of our clients has a MySQL table with around 40M records. To load the table it took around 2,5 hours. When i was watching the statistics of the transformation I noticed that the bottleneck was the write to the database. I was stuck at around 2000 rows/second. You can imagine that it will take a long time to write 40M records at that speed.
I was looking in what way I could improve the speed. There were a couple of options:
  1. Tune MySQL for better performance on Inserts
  2. Use the MySQL Bulk loader step in PDI
  3. Write SQL statements to file with PDI and  read them with mysql-binary
When i discussed this with one of my contacts of Basis06 they faced a similar issue a while ago. He mentioned that speed can be boosted by using some simple JDBC-connection setting. useServerPrepStmts=false
rewriteBatchedStatements=true
useCompression=true

These options should be entered in PDI at the connection. Double click the connection go to Options and set these values.

Used together, useServerPrepStmts=false and rewriteBatchedStatements=true will “fake” batch inserts on the client. Specifically, the insert statements:

INSERT INTO t (c1,c2) VALUES ('One',1);
INSERT INTO t (c1,c2) VALUES ('Two',2);
INSERT INTO t (c1,c2) VALUES ('Three',3);

will be rewritten into:

INSERT INTO t (c1,c2) VALUES ('One',1),('Two',2),('Three',3);

The third option useCompression=true compresses the traffic between the client and the MySQL server.

Finally I increased the number of copies of the output step to 2 so that there are two treads inserting into the database.

This all together increased the speed to around 84.000 rows a second! WOW!

Advertisements

Pentaho BI Server 5.0.1CE MySQL installation guide


How to install Pentaho BI Server 5.0.1 Community Edition with MySQL 5.x and Windows

Hi all this is David Fombella and I would like to share a way to deploy Pentaho Community last edition installing its security into a MySQL database server.

Help me keep the guides up to date and the posts flowing by donating, every small amount of money helps!

btn_donatecc_lg

Here is the download link to get the biserver 5.0.1 CE .

http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/5.0.1-stable/biserver-ce-5.0.1-stable.zip/download

Creation of Databases & Users

In our first stage we are going to create hibernate database and the user hibuser. To achieve this execute the SQL script  create_repository_mysql.sql included at biserver-ce\data\mysql5

SQL code

CREATE DATABASE IF NOT EXISTS `hibernate` DEFAULT CHARACTER SET latin1;
USE hibernate;
GRANT ALL ON hibernate.* TO 'hibuser'@'localhost' identified by 'password';
commit;

Next we create quartz database and the user pentaho_user. You only need to execute the SQL script  create_quartz_mysql.sql included at biserver-ce\data\mysql5

CREATE DATABASE IF NOT EXISTS `quartz` DEFAULT CHARACTER SET latin1;

grant all on quartz.* to 'pentaho_user'@'localhost' identified by 'password';

USE `quartz`;

DROP TABLE IF EXISTS QRTZ5_JOB_LISTENERS;
DROP TABLE IF EXISTS QRTZ5_TRIGGER_LISTENERS;
DROP TABLE IF EXISTS QRTZ5_FIRED_TRIGGERS;
DROP TABLE IF EXISTS QRTZ5_PAUSED_TRIGGER_GRPS;
DROP TABLE IF EXISTS QRTZ5_SCHEDULER_STATE;
DROP TABLE IF EXISTS QRTZ5_LOCKS;
DROP TABLE IF EXISTS QRTZ5_SIMPLE_TRIGGERS;
DROP TABLE IF EXISTS QRTZ5_CRON_TRIGGERS;
DROP TABLE IF EXISTS QRTZ5_BLOB_TRIGGERS;
DROP TABLE IF EXISTS QRTZ5_TRIGGERS;
DROP TABLE IF EXISTS QRTZ5_JOB_DETAILS;
DROP TABLE IF EXISTS QRTZ5_CALENDARS;

CREATE TABLE QRTZ5_JOB_DETAILS
  (
    JOB_NAME  VARCHAR(200) NOT NULL,
    JOB_GROUP VARCHAR(200) NOT NULL,
    DESCRIPTION VARCHAR(250) NULL,
    JOB_CLASS_NAME   VARCHAR(250) NOT NULL,
    IS_DURABLE VARCHAR(1) NOT NULL,
    IS_VOLATILE VARCHAR(1) NOT NULL,
    IS_STATEFUL VARCHAR(1) NOT NULL,
    REQUESTS_RECOVERY VARCHAR(1) NOT NULL,
    JOB_DATA BLOB NULL,
    PRIMARY KEY (JOB_NAME,JOB_GROUP)
);

CREATE TABLE QRTZ5_JOB_LISTENERS
  (
    JOB_NAME  VARCHAR(200) NOT NULL,
    JOB_GROUP VARCHAR(200) NOT NULL,
    JOB_LISTENER VARCHAR(200) NOT NULL,
    PRIMARY KEY (JOB_NAME,JOB_GROUP,JOB_LISTENER),
    FOREIGN KEY (JOB_NAME,JOB_GROUP)
        REFERENCES QRTZ5_JOB_DETAILS(JOB_NAME,JOB_GROUP)
);

CREATE TABLE QRTZ5_TRIGGERS
  (
    TRIGGER_NAME VARCHAR(200) NOT NULL,
    TRIGGER_GROUP VARCHAR(200) NOT NULL,
    JOB_NAME  VARCHAR(200) NOT NULL,
    JOB_GROUP VARCHAR(200) NOT NULL,
    IS_VOLATILE VARCHAR(1) NOT NULL,
    DESCRIPTION VARCHAR(250) NULL,
    NEXT_FIRE_TIME BIGINT(13) NULL,
    PREV_FIRE_TIME BIGINT(13) NULL,
    PRIORITY INTEGER NULL,
    TRIGGER_STATE VARCHAR(16) NOT NULL,
    TRIGGER_TYPE VARCHAR(8) NOT NULL,
    START_TIME BIGINT(13) NOT NULL,
    END_TIME BIGINT(13) NULL,
    CALENDAR_NAME VARCHAR(200) NULL,
    MISFIRE_INSTR SMALLINT(2) NULL,
    JOB_DATA BLOB NULL,
    PRIMARY KEY (TRIGGER_NAME,TRIGGER_GROUP),
    FOREIGN KEY (JOB_NAME,JOB_GROUP)
        REFERENCES QRTZ5_JOB_DETAILS(JOB_NAME,JOB_GROUP)
);

CREATE TABLE QRTZ5_SIMPLE_TRIGGERS
  (
    TRIGGER_NAME VARCHAR(200) NOT NULL,
    TRIGGER_GROUP VARCHAR(200) NOT NULL,
    REPEAT_COUNT BIGINT(7) NOT NULL,
    REPEAT_INTERVAL BIGINT(12) NOT NULL,
    TIMES_TRIGGERED BIGINT(10) NOT NULL,
    PRIMARY KEY (TRIGGER_NAME,TRIGGER_GROUP),
    FOREIGN KEY (TRIGGER_NAME,TRIGGER_GROUP)
        REFERENCES QRTZ5_TRIGGERS(TRIGGER_NAME,TRIGGER_GROUP)
);

CREATE TABLE QRTZ5_CRON_TRIGGERS
  (
    TRIGGER_NAME VARCHAR(200) NOT NULL,
    TRIGGER_GROUP VARCHAR(200) NOT NULL,
    CRON_EXPRESSION VARCHAR(200) NOT NULL,
    TIME_ZONE_ID VARCHAR(80),
    PRIMARY KEY (TRIGGER_NAME,TRIGGER_GROUP),
    FOREIGN KEY (TRIGGER_NAME,TRIGGER_GROUP)
        REFERENCES QRTZ5_TRIGGERS(TRIGGER_NAME,TRIGGER_GROUP)
);

CREATE TABLE QRTZ5_BLOB_TRIGGERS
  (
    TRIGGER_NAME VARCHAR(200) NOT NULL,
    TRIGGER_GROUP VARCHAR(200) NOT NULL,
    BLOB_DATA BLOB NULL,
    PRIMARY KEY (TRIGGER_NAME,TRIGGER_GROUP),
    FOREIGN KEY (TRIGGER_NAME,TRIGGER_GROUP)
        REFERENCES QRTZ5_TRIGGERS(TRIGGER_NAME,TRIGGER_GROUP)
);

CREATE TABLE QRTZ5_TRIGGER_LISTENERS
  (
    TRIGGER_NAME  VARCHAR(200) NOT NULL,
    TRIGGER_GROUP VARCHAR(200) NOT NULL,
    TRIGGER_LISTENER VARCHAR(200) NOT NULL,
    PRIMARY KEY (TRIGGER_NAME,TRIGGER_GROUP,TRIGGER_LISTENER),
    FOREIGN KEY (TRIGGER_NAME,TRIGGER_GROUP)
        REFERENCES QRTZ5_TRIGGERS(TRIGGER_NAME,TRIGGER_GROUP)
);

CREATE TABLE QRTZ5_CALENDARS
  (
    CALENDAR_NAME  VARCHAR(200) NOT NULL,
    CALENDAR BLOB NOT NULL,
    PRIMARY KEY (CALENDAR_NAME)
);

CREATE TABLE QRTZ5_PAUSED_TRIGGER_GRPS
  (
    TRIGGER_GROUP  VARCHAR(200) NOT NULL,
    PRIMARY KEY (TRIGGER_GROUP)
);

CREATE TABLE QRTZ5_FIRED_TRIGGERS
  (
    ENTRY_ID VARCHAR(95) NOT NULL,
    TRIGGER_NAME VARCHAR(200) NOT NULL,
    TRIGGER_GROUP VARCHAR(200) NOT NULL,
    IS_VOLATILE VARCHAR(1) NOT NULL,
    INSTANCE_NAME VARCHAR(200) NOT NULL,
    FIRED_TIME BIGINT(13) NOT NULL,
    PRIORITY INTEGER NOT NULL,
    STATE VARCHAR(16) NOT NULL,
    JOB_NAME VARCHAR(200) NULL,
    JOB_GROUP VARCHAR(200) NULL,
    IS_STATEFUL VARCHAR(1) NULL,
    REQUESTS_RECOVERY VARCHAR(1) NULL,
    PRIMARY KEY (ENTRY_ID)
);

CREATE TABLE QRTZ5_SCHEDULER_STATE
  (
    INSTANCE_NAME VARCHAR(200) NOT NULL,
    LAST_CHECKIN_TIME BIGINT(13) NOT NULL,
    CHECKIN_INTERVAL BIGINT(13) NOT NULL,
    PRIMARY KEY (INSTANCE_NAME)
);

CREATE TABLE QRTZ5_LOCKS
  (
    LOCK_NAME  VARCHAR(40) NOT NULL,
    PRIMARY KEY (LOCK_NAME)
);

INSERT INTO QRTZ5_LOCKS values('TRIGGER_ACCESS');
INSERT INTO QRTZ5_LOCKS values('JOB_ACCESS');
INSERT INTO QRTZ5_LOCKS values('CALENDAR_ACCESS');
INSERT INTO QRTZ5_LOCKS values('STATE_ACCESS');
INSERT INTO QRTZ5_LOCKS values('MISFIRE_ACCESS');
commit;

Finally we create jackrabbit database and the user jcr_user. You only need to execute the SQL script  create_jcr_mysql.sql included at biserver-ce\data\mysql5

CREATE DATABASE IF NOT EXISTS `jackrabbit` DEFAULT CHARACTER SET latin1;
grant all on jackrabbit.* to 'jcr_user'@'localhost' identified by 'password';
commit;

Configuring JDBC Security

This section describes how to configure the Pentaho BI Platform JDBC security to use a MySQL server, this means the Pentaho BI Platform will now point to the hibernate database on the MySQL server instead of the packaged HSQL in memory database.

1. applicationContext-spring-security-hibernate.properties.

Edit the file pentaho-solutions\system\applicationContext-spring-security-hibernate.properties.

Original code

jdbc.driver=org.hsqldb.jdbcDriver
jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.HSQLDialect
 

Make the changes necessary to get the snippet of code below

jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.MySQLDialect
 

    2. hibernate-settings.xml

Edit the file pentaho-solutions\system\hibernate\hibernate-settings.xml.

Original code

<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>
 

Make the changes necessary to get the snippet of code below

<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>
 

3. mysql5.hibernate.cfg.xml

Edit the file pentaho-solutions\system\hibernate\mysql5.hibernate.cfg.xml .

You do not need to make any changes to this file if you would like to use the default user hibuser. However, if you would like to specify your custom user, change connection.username and password  properties.

Original code

<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>
<property name="dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>
<property name="connection.username">hibuser</property>
<property name="connection.password">password</property>
 

4. quartz.properties

Edit the file pentaho-solutions\system\quartz\quartz.properties .

Original code

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
 

Make the changes necessary to get the snippet of code below

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
 

Configuring Hibernate and Quartz

Hibernate and Quartz need to specifically use the hibernate and quartz databases which were created on the MySQL server. To do so modifications need to be executed in context.xml file .

5. context.xml

Edit the file tomcat\webapps\pentaho\META-INF\context.xml.

Remember deleting tomcat\conf\Catalina\localhost\pentaho.xml , Pentaho creates  on startup pentaho.xml as a copy of context.xml.

Original code

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/hibernate"
validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/quartz"
validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES"/>
 

Make the changes necessary to get the snippet of code below

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hibernate"
validationQuery="select 1" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/quartz"
validationQuery="select 1"/>
 

6. repository.xml

Edit the file pentaho-solutions\system\jackrabbit\repository.xml.

One of the new features of pentaho 5 is the use of Jackrabbit content repository.

http://jackrabbit.apache.org/repository-server-howto.html

jlogo

Comment the original code in the FileSystem part

<!--
Replace the following "FileSystem" XML node to use supported databases as
the repository file system.  Change the url, user, password and other parameters
to suit your db installation.  The schemaObjectPrefix should
be a unique prefix that will be prepended to the table names.
NOTE: The database must be pre-created in and match the parameters.  See Jackrabbit
documentation for further explanation.
-->
<FileSystem>
<param name="path" value="${rep.home}/repository"/>
</FileSystem>
 

Make this code active on FileSystem part of the code

<FileSystem>
<param name="driver" value="com.mysql.jdbc.Driver"/>
<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
<param name="user" value="jcr_user"/>
<param name="password" value="password"/>
<param name="schema" value="mysql"/>
<param name="schemaObjectPrefix" value="fs_repos_"/>
</FileSystem>
 

Comment the original code in the DataStore part

<!--
data store configuration
-->
<!--
Replace the following "DataStore" XML node to use supported databases as the data
store for the repository.  Change the url, user, password and other parameters
to suit your db installation.  The schemaObjectPrefix should
be a unique prefix that will be prepended to the table names.
NOTE: The database must be pre-created in and match the parameters.  See Jackrabbit
documentation for further explanation.
-->
<DataStore class="org.apache.jackrabbit.core.data.FileDataStore"/>
 

Make this code active on DataStore part of the code

 <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
    <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
    <param name="user" value="jcr_user"/>
    <param name="password" value="password"/>
    <param name="databaseType" value="mysql"/>
    <param name="driver" value="com.mysql.jdbc.Driver"/>
    <param name="minRecordLength" value="1024"/>
    <param name="maxConnections" value="3"/>
    <param name="copyWhenReading" value="true"/>
    <param name="tablePrefix" value=""/>
    <param name="schemaObjectPrefix" value="ds_repos_"/>
  </DataStore>

Below the security part comment the original code in the FileSystem Workspace part

<!--
virtual file system of the workspace:
class: FQN of class implementing the FileSystem interface
-->
<!--
Replace the following "FileSystem" XML node to use supported databases as
the repository file system.  Change the url, user, password and other parameters
to suit your db installation.  The schemaObjectPrefix should
be a unique prefix that will be prepended to the table names.
NOTE: The database must be pre-created in and match the parameters.  See Jackrabbit
documentation for further explanation.
-->
<FileSystem>
<param name="path" value="${wsp.home}"/>
</FileSystem>
 

Make this code active on FileSystem WorkSpace part of the code

<FileSystem>
<param name="driver" value="com.mysql.jdbc.Driver"/>
<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
<param name="user" value="jcr_user"/>
<param name="password" value="password"/>
<param name="schema" value="mysql"/>
<param name="schemaObjectPrefix" value="fs_ws_"/>
</FileSystem>

Below  FileSystem Workspace part you will find the PersistenceManager part

<!--
persistence manager of the workspace:
class: FQN of class implementing the PersistenceManager interface
-->
<!--
Replace the following "PersistenceManager" XML node to use a supported database as the
persistenanceManager store.  Change the url, user, password and parameters
to suit your db installation.  The schemaObjectPrefix should
be a unique prefix that will be prepended to the table names.
NOTE: The database must be pre-created in and match the parameters.  See Jackrabbit
documentation for further explanation.
-->

<PersistenceManager>
<param name="url" value="jdbc:h2:${wsp.home}/db"/>
<param name="schemaObjectPrefix" value="${wsp.name}_"/>
</PersistenceManager>

 

Make this code active on PersistenceManager part of the code

<PersistenceManager>
<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
<param name="user" value="jcr_user" />
<param name="password" value="password" />
<param name="schema" value="mysql"/>
<param name="schemaObjectPrefix" value="${wsp.name}_pm_ws_"/>
</PersistenceManager>

Below you will find FileSystem Versioning part

<!--
Configures the filesystem to use for versioning for the respective
persistence manager
-->
<!--
Replace the following "FileSystem" XML node to use a supported database as
the repository file system.  Change the url, user, password and other parameters
to suit your db installation.  The schemaObjectPrefix should
be a unique prefix that will be prepended to the table names.
NOTE: The database must be pre-created in and match the parameters.  See Jackrabbit
documentation for further explanation.
-->
<FileSystem>
<param name="path" value="${rep.home}/version" />
</FileSystem>
 

Make this code active on FileSystem Versioning part

<FileSystem>
<param name="driver" value="com.mysql.jdbc.Driver"/>
<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
<param name="user" value="jcr_user"/>
<param name="password" value="password"/>
<param name="schema" value="mysql"/>
<param name="schemaObjectPrefix" value="fs_ver_"/>
</FileSystem>

Below you will find PersistenceManager Versioning part

<!--
Configures the persistence manager to be used for persisting version state.
Please note that the current versioning implementation is based on
a 'normal' persistence manager, but this could change in future
implementations.
-->
<!--
Replace the following "PersistenceManager" XML node to use a supported database as the
persistenanceManager store.  Change the url, user, password and parameters
to suit your db installation.  The schemaObjectPrefix should
be a unique prefix that will be prepended to the table names.
NOTE: The database must be pre-created in and match the parameters.  See Jackrabbit
documentation for further explanation.
-->
<PersistenceManager>
<param name="url" value="jdbc:h2:${rep.home}/version/db"/>
<param name="schemaObjectPrefix" value="version_"/>
</PersistenceManager>

Make this code active on PersistenceManager Versioning part

<PersistenceManager>
<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
<param name="user" value="jcr_user" />
<param name="password" value="password" />
<param name="schema" value="mysql"/>
<param name="schemaObjectPrefix" value="pm_ver_"/>
</PersistenceManager>

Quit HSQL Hypersonic automatic startup

By default  Hypersonic database starts up automatically – to avoid this  comment or delete locate the following snippets of code from web.xml:

<!-- [BEGIN HSQLDB DATABASES] -->
<context-param>
<param-name>hsqldb-databases</param-name>
<param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value>
</context-param>
<!-- [END HSQLDB DATABASES] -->
 

Second  section you need to comment or eliminate

<!-- [BEGIN HSQLDB STARTER] -->
<listener>
<listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class>
</listener>
<!-- [END HSQLDB STARTER] -->
 

Sign into the user console

Using this URL http://localhost:8080/pentaho you can reach pentaho login screen:

login biserver5Once you are into the user console, you will see  Home perspective with several buttons:

  • Browse Files: Navigate through Public and Home root directories and Cut, Copy, Share, Download, Schedule all contents included in Pentaho ( PRPT reports,  sequence actions, jobs, transformations, olap views, saiku views —)
  • Create New: (JPivot View , Add components via Marketplace)
  • Manage DataSources: Create, Edit and Delete  JDBC datasources, Mondrian metadata sources  and Metadata.xmi reporting ad hoc datasources
  • Documentation: Documentation for Pentaho 5.0

PUC biserver5

One of the main changes of Pentaho 5 is the fact that there isn’t an external Administration Console and administration capabilities are included in the User Console. In this version our “ancient”  friend joe has been replaced with the admin user. Besides the new roles available are the following:

  • Administrator
  • Power User
  • Report Author
  • Business Analyst

You can access in the administration perspective in the user console using the dropdown and selecting Administration:

administration biserver5Then you will reach the administration perspective

administration roles biserver5

Pentaho BI Server 4.5.0 & Pentaho BI Server 4.8.0- MySQL installation guide


How to install Pentaho BI Server 4.5.0 & 4.8.0 Community Edition with MySQL 5.x and Windows

Hi all this is David Fombella and I would like to share a correct way to deploy Pentaho Community last edition installing its security into a MySQL database server.

Help me keep the guides up to date and the posts flowing by donating, every small amount of money helps!

btn_donatecc_lg

Here is the download links to get the biserver 450 and 480 editions.

http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/4.5.0-stable/biserver-ce-4.5.0-stable.zip/download

http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/4.8.0-stable/biserver-ce-4.8.0-stable.zip/download

Remember to paste mysql_connector.jar into  /tomcat/lib  for biserver 480 edition since it is not included

Creation of Databases & Users

In our first stage we are going to create hibernate database and the user hibuser. To achieve this execute the SQL script  create_repository_mysql.sql included at biserver-ce\data\mysql5

SQL code

CREATE DATABASE IF NOT EXISTS `hibernate` DEFAULT CHARACTER SET latin1;
USE hibernate;
GRANT ALL ON hibernate.* TO 'hibuser'@'localhost' identified by 'password';
DROP TABLE IF EXISTS DATASOURCE;
--  Create Users Table
CREATE TABLE DATASOURCE(NAME VARCHAR(50) NOT NULL PRIMARY KEY,MAXACTCONN INTEGER NOT NULL,DRIVERCLASS VARCHAR(50) NOT NULL,IDLECONN INTEGER NOT NULL,USERNAME VARCHAR(50) NULL,PASSWORD VARCHAR(150) NULL,URL VARCHAR(512) NOT NULL,QUERY VARCHAR(100) NULL,WAIT INTEGER NOT NULL);
commit;
 

Next we create quartz database and the user pentaho_user. You only need to execute the SQL script  create_quartz_mysql.sql included at biserver-ce\data\mysql5


CREATE DATABASE IF NOT EXISTS `quartz` DEFAULT CHARACTER SET latin1;

grant all on quartz.* to 'pentaho_user'@'localhost' identified by 'password';

USE `quartz`;

DROP TABLE IF EXISTS QRTZ_JOB_LISTENERS;
DROP TABLE IF EXISTS QRTZ_TRIGGER_LISTENERS;
DROP TABLE IF EXISTS QRTZ_FIRED_TRIGGERS;
DROP TABLE IF EXISTS QRTZ_PAUSED_TRIGGER_GRPS;
DROP TABLE IF EXISTS QRTZ_SCHEDULER_STATE;
DROP TABLE IF EXISTS QRTZ_LOCKS;
DROP TABLE IF EXISTS QRTZ_SIMPLE_TRIGGERS;
DROP TABLE IF EXISTS QRTZ_CRON_TRIGGERS;
DROP TABLE IF EXISTS QRTZ_BLOB_TRIGGERS;
DROP TABLE IF EXISTS QRTZ_TRIGGERS;
DROP TABLE IF EXISTS QRTZ_JOB_DETAILS;
DROP TABLE IF EXISTS QRTZ_CALENDARS;
-- continues
 

Appart from that to fix an issue that occurs on BI Server 4.8.0 execute the following code below the error log


ERROR [org.quartz.core.ErrorLogger] An error occured while scanning for the next trigger to fire.org.quartz.JobPersistenceException: Couldn't acquire next trigger: Field 'PRIORITY' doesn't have a default value
 [See nested exception: java.sql.SQLException: Field 'PRIORITY' doesn't have a default value]
 at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:1778)
 at org.quartz.impl.jdbcjobstore.JobStoreTX.acquireNextTrigger(JobStoreTX.java:1218)
 at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:233)
 * Nested Exception (Underlying Cause) ---------------
 java.sql.SQLException: Field 'PRIORITY' doesn't have a default value

 

SQL code to fix PRIORITY column errors


ALTER TABLE  QRTZ_TRIGGERS DROP COLUMN PRIORITY;
ALTER TABLE  QRTZ_FIRED_TRIGGERS DROP COLUMN PRIORITY;

 

Finally you have to get a MySQL  dump of Sampledata  database.

Configuring JDBC Security

This section describes how to configure the Pentaho BI Platform JDBC security to use a MySQL server, this means the Pentaho BI Platform will now point to the hibernate database on the MySQL server instead of the packaged HSQL in memory database.

1. applicationContext-spring-security-jdbc.xml

Edit the file pentaho-solutions\system\applicationContext-spring-security-jdbc.xml.

Original code

<bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="org.hsqldb.jdbcDriver" />
<property name="url" value="jdbc:hsqldb:hsql://localhost:9001/hibernate" />
<property name="username" value="hibuser" />
<property name="password" value="password" />
</bean>
 

Make the changes necessary to get the snippet of code below

<bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver" />
<property name="url" value="jdbc:mysql://localhost:3306/hibernate" />
<property name="username" value="hibuser" />
<property name="password" value="password" />
</bean>
 

     2. applicationContext-spring-security-hibernate.properties.

Edit the file pentaho-solutions\system\applicationContext-spring-security-hibernate.properties.

Original code

jdbc.driver=org.hsqldb.jdbcDriver
jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.HSQLDialect
 

Make the changes necessary to get the snippet of code below

jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.MySQLDialect
 

     3. hibernate-settings.xml

Edit the file pentaho-solutions\system\hibernate\hibernate-settings.xml.

Original code

<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>
 

Make the changes necessary to get the snippet of code below

<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>
 

4. mysql5.hibernate.cfg.xml

Edit the file pentaho-solutions\system\hibernate\mysql5.hibernate.cfg.xml .

You do not need to make any changes to this file if you would like to use the default user hibuser. However, if you would like to specify your custom user, change connection.username and password  properties.

Original code

<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>
<property name="dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>
<property name="connection.username">hibuser</property>
<property name="connection.password">password</property>
 

Configuring Hibernate and Quartz

Hibernate and Quartz need to specifically use the hibernate and quartz databases which were created on the MySQL server. To do so modifications need to be executed in context.xml file .

     5. context.xml

Edit the file tomcat\webapps\pentaho\META-INF\context.xml.

Remember deleting tomcat\conf\Catalina\localhost\pentaho.xml , Pentaho creates  on startup pentaho.xml as a copy of context.xml.

Original code

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/hibernate"
validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/quartz"
validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES"/>
 

Make the changes necessary to get the snippet of code below

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hibernate"
validationQuery="select 1" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/quartz"
validationQuery="select 1"/>
 

Configuring Apache-Tomcat Web Server and Servlet Container

In order to configure the settings of the Apache-Tomcat server for your Pentaho BI Server 4.5.0 the great majority of the modifications should be made inside the web.xml file which is located under the \tomcat\webapps\pentaho\WEB_INF\ folder.

     6. web.xml

In this first modification we will indicate the web application the location of our solution folder , the place where reports, xactions, dashboards …  will be stored.

Edit the file tomcat\webapps\pentaho\WEB-INF\web.xml.

Original code with empty solution-path

<context-param>
<param-name>solution-path</param-name>
<param-value></param-value>
</context-param>
 

Make the changes necessary to indicate your pentaho-solutions path

<context-param>
<param-name>solution-path</param-name>
<param-value>C:/Pentaho/biserver-ce/pentaho-solutions</param-value>
</context-param>

 

fully-qualified-server-url

Now we are going to set the URL to access Pentaho’s BI User console, the default URL is the following http://localhost:8080/pentaho .If you would like others to access the site (remotely or on a network) you will need to make changes to this parameter.

Original code with default url

<context-param>
<param-name>fully-qualified-server-url</param-name>
<param-value>http://localhost:8080/pentaho/</param-value>
</context-param>
 

Make the changes necessary to indicate your server’s domain

 <context-param>
<param-name>fully-qualified-server-url</param-name>
<param-value>http://www.mydomain.com:8080/pentaho/</param-value>
</context-param>
 

Quit HSQL Hypersonic automatic startup

By default  Hypersonic database starts up automatically – to avoid this  comment or delete locate the following snippets of code from web.xml:

<!-- [BEGIN HSQLDB DATABASES] -->
<context-param>
<param-name>hsqldb-databases</param-name>
<param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value>
</context-param>
<!-- [END HSQLDB DATABASES] -->
 

Second  section you need to comment or eliminate

<!-- [BEGIN HSQLDB STARTER] -->
<listener>
<listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class>
</listener>
<!-- [END HSQLDB STARTER] -->
 

Pentaho User Console

Now it is time to run Pentaho BI Server, you only need to execute  start-pentaho.bat script stored in biserver-ce, another script named stop-pentaho.bat is included in the same directory.

If you haven’t changed the URL type on your browser the following: http://localhost:8080/pentaho . The default admin user is joe and joe’s password is password.

Pentaho Administrator Console

Now we are going to start the admin console  a web tool where we could manage users,roles and datasources. We execute start-pac.bat  and  if we open a browser and we typehttp://localhost:8099 we will be prompted for administrator console credentials user: admin and admin’s password:password.

We move to Database Connections and clicking on + we create SampleData datasource, pentaho_user should be previously granted access to this sample database

After creating the datasource restart BI server, stop-pentaho.bat …start-pentaho.bat and enjoy the SampleData database.

Hope you have enjoyed