ExcelTransformBean Part 1: Convert various Excel formats to simple XML easily

Update 25 Mar 2015: ExcelTransformBean has been refactored to be part of FormatConversionBean. Parameter conversionType replaced with converterClass. Parameter indentXML replaced by indentFactor
Update 11 Feb 2015: Updated source code links to shared module repository on GitHub. Previous GitHub repository will be deleted.
Update 27 Jan 2015: Updated Source code section to indicate EAR deployment file released for download at GitHub repository
Update 3 Nov 2014: Refactoring of source code to cater for SimpleExcel2XML and SimpleXML2Excel conversions.

Introduction

There are already a handful of blogs on SCN dealing with Excel to XML conversion - so why another one??

Here is my wish list for a comprehensive solution:-

Able to read all kinds of Excel format (XLS and XLSX)
Behaves in a similar way as MessageTransformBean
Highly configurable - develop/deploy once, use multiple times
Able to handle XML special characters

In the reference section below are some of the more popular approaches, however below are some of the (non-exhaustive) limitations/drawbacks

XLSX files stores string contents in a separate sharedStrings.xml file in the zipped XLSX file
JExcel API does not support Excel 2007 XLSX formats
Limitations in handling formulas, formatting and special characters in cells

ExcelTransformBean (ETB) aims to provide a generic adapter module solution (a la MessageTransformBean) that is highly configurable and reusable. It is based on the Apache POI API. Utilizing the combined SS interface of the API, it uses a single logic to read all kinds of Excel files (XLS and XLSX).

This first part covers Excel to simple XML conversion, while the second part covers simple XML to Excel conversion.

Source Code

The full source code can be found in the following public repository on GitHub.

GitHub repository for ExcelTransformBean

Update: EAR deployment file is available for download from following GitHub repository release. This EAR can be used for direct deployment in order to use the Adapter Module Bean. Note that this EAR is created based on EJB3.0 version on NWDS 7.31 SP13 Patch 0. For earlier versions of PI, the EJB and EAR file will have to be manually built and compiled using the source code provided.

This module is based on Apache POI 3.9 library. In order for the Java project to compile and build successfully, the following JAR files need to be referenced/imported into the project.

poi-3.9-20121203.jar
poi-ooxml-3.9-20121203.jar
poi-ooxml-schemas-3.9-20121203.jar
xmlbeans-2.3.0.jar
dom4j-1.6.1.jar

The library files can be downloaded from Apache's website, direct link to the ZIP file is provided below.

Apache POI 3.9 ZIP file

Module Parameter Reference

Below is a list of the parameters for configuration of the module for Excel to XML conversion (converterClass = 'com.equalize.xpi.af.modules.excel.Excel2XMLTransformer'). Certain parameters will automatically inherit the default values if it is not configured.

Parameter Name	Allowed values	Default value	Remarks
conversionType	SimpleExcel2XML		Required field. Determines conversion type.
converterClass	com.equalize.xpi.af.modules.excel.Excel2XMLTransformer		Required field. Determines conversion class.
sheetName			The name of the active Excel sheet to extract. Either sheetName or sheetIndex must be populated.
sheetIndex	Integer values beginning from 0		The index of the active Excel sheet to extract (starts from 0). Either sheetName or sheetIndex must be populated.
skipEmptyRows	Y, N	Y	Empty rows to be skipped or not
rowOffset	Integer values beginning from 1	0	Starting row to begin extracting content from (i.e. 0 = start from first row, 1 = start from second row). If processFieldNames = 'fromFile' and rowOffset = 0, first line is always skipped
processFieldNames	fromFile, fromConfiguration, notAvailable		Required field. Determines the naming of each column of the rows, and the number of columns to extract: fromFile = Column names and number of columns are determined from header line of the sheet fromConfiguration = Column names and number of columns are determined from parameter fieldNames notAvailable = Column names will be set as ColumnX, where X = 1,2,3,4. Number of columns will be determined from parameter columnCount
fieldNames			Name of columns. Required field whenprocessFieldNames = 'fromConfiguration'
columnCount	Integer values beginning from 1		Number of columns for extraction. Required field whenprocessFieldNames = 'notAvailable'
recordName		Record	XML element name for row of record in output
documentName			Required field. Document name of root element of XML output
documentNamespace			Required field. Namespace of root element of XML output
formatting	excel, raw	excel	Controls how the cell contents are formatted in XML output excel = Cells are displayed the same way as Excel formatting of corresponding cell raw = Raw value of cells are displayed
evaluateFormulas	Y, N	Y	Controls how cell contents with formulas are displayed in XML output YES = Cells are displayed with result of formula evalution NO = Cells are displayed with actual formula
emptyCellOutput	suppress, defaultValue	suppress	Controls how empty cells are displayed in XML output suppress = Empty cells are not displayed (no corresponding XML tags for empty cells) defaultValue = Empty cells will be displayed with default value
emptyCellDefaultValue		<blank>	If emptyCellOutput = 'defaultValue', all empty cells will be populated with value in this parameter
indentFactor	Integer values beginning from 1	0	Determines the number of indentation spaces for each level in the XML output
debug	Y, N	N	Displays contents in Audit Log of each cell extracted. WARNING: Use this only for debugging in non-productive systems

Example Scenarios

Here are some example scenarios of the behavior of the conversion based on different configuration options.

Scenario 1

Excel 2007 XSLX file format.

Extract Sheet1 with column names determined directly from header line of file.

Special character & automatically converted

Module parameters

Parameter Name	Parameter Value
converterClass	com.equalize.xpi.af.modules.excel.Excel2XMLTransformer
sheetName	Sheet1
processFieldNames	fromFile
documentName	MT_Order
documentNamespace	urn:equalize:com

Result

Input
Output

Scenario 2

Excel binary XLS file format.

Extract sheet at index 0. Column names are provided from configuration.

Row offset provided to skip first two lines.

No formatting of cells, so raw values displayed.

Module parameters

Parameter Name	Parameter Value
converterClass	com.equalize.xpi.af.modules.excel.Excel2XMLTransformer
sheetIndex	0
processFieldNames	fromConfiguration
fieldNames	Order,Date,Material,Quantity
rowOffset	2
recordName	Line
documentName	MT_CustomOrder
documentNamespace	urn:equalize:com
formatting	raw

Result

Input
Output

Scenario 3

Excel 2007 XSLX file format.

Extract sheet at index 0.

Column names are not available. Number of columns = 5.

Row offset provided to skip first line.

Empty rows are included.

Cells with formula are displayed with formula instead of result.

Empty cells are displayed with default value "space".

Module parameters

Parameter Name	Parameter Value
converterClass	com.equalize.xpi.af.modules.excel.Excel2XMLTransformer
sheetIndex	0
processFieldNames	notAvailable
columnCount	5
rowOffset	1
documentName	MT_CustomOrder
documentNamespace	urn:equalize:com
skipEmptyRows	N
evaluateFormulas	N
emptyCellOutput	defaultValue
emptyCellDefaultValue	space

Result

Input
Output

Reference

Part 2 - ExcelTransformBean Part 2: Convert simple XML to various Excel formats easily

This article does not cover the steps for creating a custom adapter module. This can be found easily via SCN search. It is also listed in the reference section of the article below regarding adapter module testing.

Standalone testing of Adapter Module in NWDS

Alternative methods for Excel conversion

PI/XI: Reading MS Excel's XLSX and XLSM files with standard PI modules - easily...

Excel Files - How to handle them in SAP XI/PI (The Alternatives)

A Simple approach in Reading Excel File

ExcelTransformBean Part 1: Convert various Excel formats to simple XML easily

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112