You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by luke devon <lu...@yahoo.com> on 2010/04/27 05:45:43 UTC

Re: Convert XLS into CSV - Removing a Work Sheet

Hi Mark

I modified the code to removing the Summary sheet from my Excel workbook. It is working. Thanks for your guide and the support.( Herewith I have attached the code )

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class SheetUtil {

   public static void main(String a[]){
       new SheetUtil().deleteSheet();
   }


   public void deleteSheet(){
       int sheetToDelete = -1;

       String path = "Book1.xls";
       FileInputStream fis  = null;
       Workbook workbook = null;
       File file = null;
       FileOutputStream fos = null;
       try{

           file = new File(path);

           fis = new FileInputStream(file);
           workbook = WorkbookFactory.create(fis);



           int numSheets = workbook.getNumberOfSheets();

           for(int i = 0; i < numSheets; i++) {
               org.apache.poi.ss.usermodel.Sheet sheet = workbook.getSheetAt(i);
               System.out.println(sheet.getSheetName());
               if(sheet.getSheetName().equalsIgnoreCase("Summary")) {
                   sheetToDelete = i;
                   break;

               }
           }


           if(sheetToDelete > -1) {
               workbook.removeSheetAt(sheetToDelete);
           }

           fos = new FileOutputStream(file);

           workbook.write(fos);

       }catch(Exception e){
           e.printStackTrace();
       }finally{
           try{
               if(fis != null){
                   fis.close();
                   fis = null;

               }
               if(fos != null){
                   fos.close();
                   fos = null;

               }
           }catch(Exception ee){
               ee.printStackTrace();

           }
       }
   }
}

=====================================================

I have few more modifications , to be applied for the workbook. Can you please help me further?

I will list down you my other requirements.

After I removed the "Summary" sheet , there will be only one Sheet will be remain which is "Detail". 

In the Detail Sheet,

1. In the Sheet , there is String called "Sum". This value also will be there more than once.

Once that value found , needed to be delete entire row respectively. Eg: If "Sum" is in , Cell H11, 11th row needed to be removed from the worksheet.

2. Some Cells are containing values with "1000 separator (,)" -- Eg: 1,256.00 . I wanted to remove all thousand separators from the values which are exist in the Detail excel sheet. Eg: 1256.00 --- after removing the "," value should be like this.(1256.00)

3. In B16 , There is a String called "Linux". This value i wanted to copy ( repeat ) until B25.Thats meant This value must be repeat from B17 to B25.( This is just like we copy a value along the column for more than one row )

Like that , in this worksheet , there will be lot of values to be copied. Hence I think we can not do it dynamically and we needed hard code those values in the code .

Thanks in advance
Luke


________________________________
From: MSB <ma...@tiscali.co.uk>
To: user@poi.apache.org
Sent: Friday, April 23, 2010 3:27:38
Subject: Re: Convert XLS into CSV


Luke

As you have only limited experience with Java, I have uploaded a
distribution that you should be able to
run from the command line. I know some companies restrict access to file
sharing sites bu if you can use Rapidshare from your location, this is the
link to the archive;

http://rapidshare.com/files/378927105/CSV_Conversion_Code.rar

and if you have any problems downloading the file, just let me know. It is
quite large as it contains all of
the POI archives that you will need to run the code but I am confident there
is another way I could send it
directly to you if Rapidshare is inaccessible.

If you are unfamiliar with Rapidshare, all you need to do is follow the
link, choose the free download option
and download the file to your machine. Once you have downloaded the archive
called 
CSV Conversion Code.rar, unzip it and you should have a folder with the name
CSV Conversion Code. This
folder contains a further folder called dist and a file by the name of Read
Me.rtf.
Open this rtf file (Word or WordPad will handle it successfully) and it will
tell you what to do next.

Any problems, just send me a message.

Yours

Mark B

PS How did you get on with removing the Summary sheet from your Excel
workbook?


Luke_Devon wrote:
> 
> Hi Mark
> 
> First of all I would like to thank you for all your explanation which you
> have explained very well regarding the code. It was really helpful to
> understand the flow of the code and I think its nice to have such audience
> and explanation from an expert / one of the best,  in the industry for a
> beginner like me. 
> 
> I was executing your final Code to convert XLS into CSV. Its compiled, but
> there was two messages , 
> uses unchecked or unsafe operations.
> Note: Recompile with -Xlint:unchecked for details.
> 
> I did google , for those messages and there was some directives , but as
> of some forum says , it wont affect to the execution.
> 
> Finally , when I tried to execute it , 
> 
> java ToCSV d:\excel  d:\csv , 
> 
> [java] Usage: java ToCSV "Source Folder" "Destination Folder" "CSV Element
> Separator"
> 
> BUILD FAILED
> Target "d:\excel" does not exist in the project "Inbox".
> 
> I tried to go through the code  , if there is any thing that i can
> modified to solve the problem.but no luck.
> 
> Thanks in advance
> Luke
> 
> 
> ________________________________
> From: MSB <ma...@tiscali.co.uk>
> To: user@poi.apache.org
> Sent: Tuesday, April 13, 2010 22:04:55
> Subject: Re: Convert XLS into CSV
> 
> 
> The code I have pasted into this message has been modified so that it is
> possible to specify the path to and name of a folder containing one or
> more
> Excel workbooks and the code will now iterate through each of the
> workbooks
> in the folder and convert them into separate CSV files. The name of the
> resulting CSV file will in each case be derived by taking the name of the
> source Excel workbook, removing the .xls or xlsx extension and suffixing
> the
> .csv extension. Before looking at the code in any detail - most of it is
> identical by the way - I need to explain how you can call and use this
> class. Forgive me if you know any of this but I thought it safer to assume
> you might not be totally clear on the details.
> 
> When you run a Java class with a command like this;
> 
> java ToCSV
> 
> the runtime environment will locate the .class file, load and compile the
> bytecode and then look inside the class for the main() method which, if
> that
> method is present and it contains any code at all, is where execution of
> the
> program will begin. If you look at the signature for the main() method,
> you
> will see that it looks something like this;
> 
> public static void main(String[] args)
> 
> ignoring the public, static and void keywords for the moment, take a look
> at
> what the brackets contain - String[] args. This is the declaration for a
> variable called args which is able to hold a reference to an array where
> each element of that array is an instance the java.lang.String class. This
> array will hold any parameters that were specified on the command line
> when
> the class was run. So, to modify the example above slightly, this;
> 
> java ToCSV C:/temp C:/temp/CSV
> 
> would result in a reference to an an array holding two Strings being
> passed
> to the args parameter; the first element would hold "C:/temp" and the
> second
> C:/temp/CSV. Well, more accurately, the elements would hold references to
> instances of the String class that encapsulated the values "C:/temp" and
> C:/temp/CSV.
> 
> The contents of the args parameter are accessible to any code within the
> args method and so it is possible to extract the contents of the two
> elements by simply referencing each using Java's standard notation; the
> first element in the args array would be args[0] and the second args[1].
> 
> With this information, you can see how to run the ToCSV class from the
> command line. All that you need do is specify the path to and name of
> either
> an individual Excel workbook or of a folder that contains Excel workbook,
> the path to and name of the folder that should contain the CSV files and,
> optionally, the character or string of characters that will separate each
> element on the line within the CSV file. The class can be invoked from the
> command line in this manner;
> 
> java ToCSV "C:/temp/Excel Workbooks" "C:/temp/CSV Files"
> 
> or
> 
> java ToCSV C:/temp/Excel C:/temp/CSV ,
> 
> Note that the speach marks enclosing the paths in the first example are
> only
> necessary if there are embedded spaces within the parameter.
> 
> If you now take a look at the code in the main() method of the ToCSV file,
> you can see that it simply recovers the values from the args array and
> uses
> them to call the convertExcelToCSV() method on an instance of the ToCSV
> class.
> 
> Equally, you can call this class from code within another Java class. If
> you
> choose to do this, it is an even simpler process; simply create an
> instance
> of the ToCSV class and then call either of the overloaded
> convertExcelToCSV() methods. One accepts two parameters, the name of and
> path to the source Excel workbook(s) and the name of and path to the
> folder
> the CSV files ought to be written to. If you call this method, it will be
> assumed that the separtor is a comma. The other method accepts a third
> parameter allowing you to define what the separator should be, so you
> could
> pass a colon for example. 
> 
> The process is a little like this;
> 
> ToCSV converter = new ToCSV();
> converter.convertExcelToCSV("C:/temp/Excel Workbooks", "C:/temp/CSV
> Files",
> "!");
> 
> The folder that contains the workbooks can hold other files as well, you
> do
> not have to worry about copying the Excel workbooks to a special file for
> processing. The ToCSV class is able to identify and select just Excel
> workbook files that have extensions of '.xls' or '.xlsx' and it does this
> by
> using an instance of an inner class called ExcelFilenameFilter that you
> can
> see defined at the bottom of the source code, immediately following the
> main() method. As you can see, it contains a single method - accept() -
> that
> contains a single line of code. This line of code will return the boolean
> value true if the name of a file ends with either '.xls' or '.xlsx' and an
> instance of ExcelFilenameFilter can, therefore, be used, when the contents
> of the folder are read, to ensure that only Excel workbooks are processed.
> I
> often use inner classes like this if they are very small and simple or if
> they only make sense in the context of the enclosing class. By this, I
> mean
> asking yourslef if it is likely any other class may need similar
> functionality.
> 
> Anayway, here is the code. Remember that it is only one possible solution
> and you may need to modify how it works to suit your specific requirement.
> You may not be worried, for example, if all of the rows in the CSV file
> contain different numbers of elements. If this is the case, the contents
> of
> the cells could be written out immediately after they were read and you
> could completely remove the additional step of storing them in ArrayLists
> to
> ensure the matrix is square.
> 
> package postprocessxlsx;
> 
> import org.apache.poi.ss.usermodel.WorkbookFactory;
> import org.apache.poi.ss.usermodel.Workbook;
> import org.apache.poi.ss.usermodel.Sheet;
> import org.apache.poi.ss.usermodel.Row;
> import org.apache.poi.ss.usermodel.Cell;
> import org.apache.poi.ss.usermodel.DataFormatter;
> import org.apache.poi.ss.usermodel.FormulaEvaluator;
> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
> 
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileWriter;
> import java.io.BufferedWriter;
> import java.io.FilenameFilter;
> import java.io.IOException;
> import java.io.FileNotFoundException;
> import java.util.ArrayList;
> 
> /**
> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
> This
> * class makes the following assumptions;
> *
> * 1. Where the Excel workbook contains more that one worksheet then a
> single
> *    CSV file will contain the data from all of the worksheets.
> * 2. The data matrix contained in the CSV file will be square. This means
> that
> *    the number of elements in each row of the CSV file will match the
> number
> *    of cells in the longest row found in the Excel workbook. Any short
> rows
> *    will be 'padded' with empty elements - an empty elements is
> represented in
> *    the CSV file in this way ,,.
> * 3. Empty elements will represent missing cells.
> * 4. A row consisting of empty elements will be used to represent an empty
> row
> *    in the Excel workbook.
> *
> * @author Mark B
> * @version 1.10 12th April 2010
> */
> public class ToCSV {
> 
>     private Workbook workbook = null;
>     private ArrayList<ArrayList> csvData = null;
>     private int maxRowWidth = 0;
>     private DataFormatter formatter = null;
>     private FormulaEvaluator evaluator = null;
>     private String separtor = null;
> 
>     public static final String CSV_FILE_EXTENSION = ".csv";
>     public static final String DEFAULT_SEPARATOR = ",";
> 
>     /**
>      * Process the contents of a folder, convert the contents of each
> Excel
>      * workbook into CSV format and save the resulting file to the
> specified
>      * folder using the same name as the original workbook with the .xls
> or
>      * .xlsx extension replaced by .csv
>      *
>      * @param source An instance of the String class that encapsulates the
>      *        name of and path to either a folder containing those Excel
>      *        workbook(s) or the name of and path to an individual Excel
> workbook
>      *        that is/are to be converted.
>      * @param destination An instance of the String class encapsulating
> the
> name
>      *        of and path to a folder that will contain the resulting CSV
> files.
>      * @throws java.io.FileNotFoundException Thrown if any file cannot be
> located
>      *         on the filesystem during processing.
>      * @throws java.io.IOException Thrown if the filesystem encounters any
>      *         problems during processing.
>      * @throws java.lang.IllegalArgumentException Thrown if the values
> passed
>      *         to the source parameters refers to a file or folder that
> does
> not
>      *         exist and if the value passed to the destination paramater
> refers
>      *         to a folder that does not exist or simply does not refer to
> a
>      *         folder that does not exist.
>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
> Thrown
>      *         if the xml markup encounetered whilst parsing a
> SpreadsheetML
>      *         file (.xlsx) is invalid.
>      */
>     public void convertExcelToCSV(String strSource, String strDestination)
>                        throws FileNotFoundException, IOException,
>                               IllegalArgumentException,
> InvalidFormatException {
> 
>         // Simply chain the call to the overloaded
> convertExcelToCSV(String,
>         // String, String) method and pass the default separator to ensure
> this
>         // String is used to separate elemnts on the line in the CSV file.
>         this.convertExcelToCSV(strSource, strDestination,
> ToCSV.DEFAULT_SEPARATOR);
>     }
> 
>     /**
>      * Process the contents of a folder, convert the contents of each
> Excel
>      * workbook into CSV format and save the resulting file to the
> specified
>      * folder using the same name as the original workbook with the .xls
> or
>      * .xlsx extension replaced by .csv
>      *
>      * @param source An instance of the String class that encapsulates the
>      *        name of and path to either a folder containing those Excel
>      *        workbook(s) or the name of and path to an individual Excel
> workbook
>      *        that is/are to be converted.
>      * @param destination An instance of the String class encapsulating
> the
> name
>      *        of and path to a folder that will contain the resulting CSV
> files.
>      * @param separator An instance of the String class encapsulating the
>      *        characters or characters that should be used to separate
> items
>      *        on a line within the CSV file.
>      * @throws java.io.FileNotFoundException Thrown if any file cannot be
> located
>      *         on the filesystem during processing.
>      * @throws java.io.IOException Thrown if the filesystem encounters any
>      *         problems during processing.
>      * @throws java.lang.IllegalArgumentException Thrown if the values
> passed
>      *         to the source parameters refers to a file or folder that
> does
> not
>      *         exist and if the value passed to the destination paramater
> refers
>      *         to a folder that does not exist or simply does not refer to
> a
>      *         folder that does not exist.
>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
> Thrown
>      *         if the xml markup encounetered whilst parsing a
> SpreadsheetML
>      *         file (.xlsx) is invalid.
>      */
>     public void convertExcelToCSV(String strSource,
>                                   String strDestination, String separator)
>                        throws FileNotFoundException, IOException,
>                               IllegalArgumentException,
> InvalidFormatException {
>         File source = new File(strSource);
>         File destination = new File(strDestination);
>         File[] filesList = null;
>         String destinationFilename = null;
> 
>         // Check that the source file/folder exists.
>         if(!source.exists()) {
>             throw new IllegalArgumentException("The source for the Excel "
> +
>                     "file(s) cannot be found.");
>         }
> 
>         // Ensure thaat the folder the user has chosen to save the CSV
> files
>         // away into firstly exists and secondly is a folder rather than,
> for
>         // instance, a data file.
>         if(!destination.exists()) {
>             throw new IllegalArgumentException("The folder/directory for
> the
> " +
>                     "converted CSV file(s) does not exist.");
>         }
>         if(!destination.isDirectory()) {
>             throw new IllegalArgumentException("The destination for the
> CSV
> " +
>                     "file(s) is not a directory/folder.");
>         }
> 
>         // Check to see if the sourceFolder variable holds a reference to
>         // a file or a folder full of files.
>         if(source.isDirectory()) {
>             // Get a list of all of the Excel spreadsheet files
> (workbooks)
> in
>             // the source folder/directory
>             filesList = source.listFiles(new ExcelFilenameFilter());
>         }
>         else {
>             // Assume that it must be a file handle - although there are
> other
>             // options the code should perhaps check - and store the
> reference
>             // into the filesList variable.
>             filesList = new File[]{source};
>         }
> 
>         // Step through each of the files in the source folder and for
> each
>         // open the workbook, convert it's contents to CSV format and then
>         // save the resulting file away into the folder specified by the
>         // contents of the destination variable. Note that the name of the
>         // csv file will be created by taking the name of the Excel file,
>         // removing the extension and replacing it with .csv. Note that
> there
>         // is one drawback with this approach; if the folder holding the
> files
>         // contains two workbooks whose names match but one is a binary
> file
>         // (.xls) and the other a SpreadsheetML file (.xlsx), then the
> names
>         // for both CSV files will be identical and one CSV file will,
>         // therefore, over-write the other.
>         for(File excelFile : filesList) {
>             // Open the workbook
>             this.openWorkbook(excelFile);
> 
>             // Convert it's contents into a CSV file
>             this.convertToCSV();
> 
>             // Build the name of the csv folder from that of the Excel
> workbook.
>             // Simply replace the .xls or .xlsx file extension with .csv
>             destinationFilename = excelFile.getName();
>             destinationFilename = destinationFilename.substring(
>                     0, destinationFilename.lastIndexOf(".")) + ".csv";
> 
>             // Save the CSV file away using the newly constricted file
> name
>             // and to the specified directory.
>             this.saveCSVFile(new File(destination, destinationFilename),
> separator);
>         }
>     }
> 
>     /**
>      * Open an Excel workbook ready for conversion.
>      *
>      * @param file An instance of the File class that encapsulates a
> handle
>      *        to a valid Excel workbook. Note that the workbook can be in
>      *        either binary (.xls) or SpreadsheetML (.xlsx) format.
>      *
>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
> located.
>      * @throws java.io.IOException Thrown if a problem occurs in the file
> system.
>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
> Thrown
>      *         if invalid xml is found whilst parsing an input
> SpreadsheetML
>      *         file.
>      */
>     private void openWorkbook(File file) throws FileNotFoundException,
>                                            IOException,
> InvalidFormatException {
>         FileInputStream fis = null;
>         try {
>             System.out.println("Opening workbook [" + file.getName() +
> "]");
> 
>             fis = new FileInputStream(file);
> 
>             // Open the workbook and then create the FormulaEvaluator and
>             // DataFormatter instances that will be needed to,
> respectively,
>             // force evaluation of forumlae found in cells and create a
>             // formatted String encapsulating the cells contents.
>             this.workbook = WorkbookFactory.create(fis);
>             this.evaluator =
> this.workbook.getCreationHelper().createFormulaEvaluator();
>             this.formatter = new DataFormatter();
>         }
>         finally {
>             if(fis != null) {
>                 fis.close();
>             }
>         }
>     }
> 
>     /**
>      * Called to convert the contents of the currently opened workbook
> into
>      * a CSV file.
>      */
>     private void convertToCSV() {
>         Sheet sheet = null;
>         Row row = null;
>         int lastRowNum = 0;
>         this.csvData = new ArrayList<ArrayList>();
> 
>         System.out.println("Converting files contents to CSV format.");
> 
>         // Discover how many sheets there are in the workbook....
>         int numSheets = this.workbook.getNumberOfSheets();
> 
>         // and then iterate through them.
>         for(int i = 0; i < numSheets; i++) {
> 
>             // Get a reference to a sheet and check to see if it contains
>             // any rows.
>             sheet = this.workbook.getSheetAt(i);
>             if(sheet.getPhysicalNumberOfRows() > 0) {
> 
>                 // Note down the index number of the bottom-most row and
>                 // then iterate through all of the rows on the sheet
> starting
>                 // from the very first row - number 1 - even if it is
> missing.
>                 // Recover a reference to the row and then call another
> method
>                 // which will strip the data from the cells and build
> lines
>                 // for inclusion in the resylting CSV file.
>                 lastRowNum = sheet.getLastRowNum();
>                 for(int j = 0; j <= lastRowNum; j++) {
>                     row = sheet.getRow(j);
>                     this.rowToCSV(row);
>                 }
>             }
>         }
>     }
> 
>     /**
>      * Called to actually save the data recovered from the Excel workbook
>      * as a CSV file.
>      *
>      * @param file An instance of the File class that encapsulates a
> handle
>      *             referring to the CSV file.
>      * @param separator An instance of the String class that encapsulates
> the
>      *                  character or character that ought to be used to
> delimit
>      *                  elements on the lines of the CSV file.
>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
> found.
>      * @throws java.io.IOException Thrown to indicate and error occurred
> in
> the
>      *                             underylying file system.
>      */
>     private void saveCSVFile(File file, String separator)
>                                      throws FileNotFoundException,
> IOException {
>         FileWriter fw = null;
>         BufferedWriter bw = null;
>         ArrayList<String> line = null;
>         StringBuffer buffer = null;
>         String csvLineElement = null;
>         try {
> 
>             System.out.println("Saving the CSV file [" + file.getName() +
> "]");
> 
>             // Open a writer onto the CSV file.
>             fw = new FileWriter(file);
>             bw = new BufferedWriter(fw);
> 
>             // Step through the elements of the ArrayList that was used to
> hold
>             // all of the data recovered from the Excel workbooks' sheets,
> rows
>             // and cells.
>             for(int i = 0; i < this.csvData.size(); i++) {
>                 buffer = new StringBuffer();
> 
>                 // Get an element from the ArrayList that contains the
> data
> for
>                 // the workbook. This element will itself be an ArrayList
>                 // containing Strings and each String will hold the data
> recovered
>                 // from a single cell. The for() loop is used to recover
> elements
>                 // from this 'row' ArrayList one at a time and to write
> the
> Strings
>                 // away to a StringBuffer thus assembling a single line
> for
> inclusion
>                 // in the CSV file. If a row was empty or if it was short,
> then
>                 // the ArrayList that contains it's data will also be
> shorter than
>                 // some of the others. Therefore, it is necessary to check
> within
>                 // the for loop to ensure that the ArrayList contains data
> to be
>                 // processed. If it does, then an element will be
> recovered
> and
>                 // appended to the StringBuffer.
>                 line = this.csvData.get(i);
>                 for(int j = 0; j < this.maxRowWidth; j++) {
>                     if(line.size() > j) {
>                         csvLineElement = line.get(j);
>                         if(csvLineElement != null) {
>                             buffer.append(csvLineElement);
>                         }
>                     }
>                     if(j < (this.maxRowWidth - 1)) {
>                         buffer.append(separator);
>                     }
>                 }
> 
>                 // Once the line is built, write it away to the CSV file.
>                 bw.write(buffer.toString().trim());
> 
>                 // Condition the inclusion of new line characters so as to
>                 // avoid an additional, superfluous, new line at the end
> of
>                 // the file.
>                 if(i < (this.csvData.size() - 1)) {
>                     bw.newLine();
>                 }
>             }
>         }
>         finally {
>             if(bw != null) {
>                 bw.flush();
>                 bw.close();
>             }
>         }
>     }
> 
>     /**
>      * Called to convert a row of cells into a line of data that can later
> be
>      * output to the CSV file.
>      *
>      * Note that no tests have yet been conducted with blank cells or
> those
>      * containing formulae. Such may require latereations to the way this
> code
>      * works.
>      *
>      * @param row An instance of either the HSSFRow or XSSFRo classes that
>      *            encapsulates information about a row of cells recovered
> from
>      *            an Excel workbook.
>      */
>     private void rowToCSV(Row row) {
>         Cell cell = null;
>         int lastCellNum = 0;
>         ArrayList<String> csvLine = new ArrayList<String>();
> 
>         // Check to ensure that a row was recovered from the sheet as it
> is
>         // possible that one or more rows between other populated rows
> could
> be
>         // missing - blank. If the row does contain cells then...
>         if(row != null) {
> 
>             // Get the index for the right most cell on the row and then
>             // step along the row from left to right recovering the
> contents
>             // of each cell, converting that into a formatted String and
>             // then storing the String into the csvLine ArrayList.
>             lastCellNum = row.getLastCellNum();
>             for(int i = 0; i <= lastCellNum; i++) {
>                 cell = row.getCell(i);
>                 if(cell == null) {
>                     csvLine.add("");
>                 }
>                 else {
>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>                         csvLine.add(this.formatter.formatCellValue(cell));
>                     }
>                     else {
>                         csvLine.add(this.formatter.formatCellValue(cell,
> this.evaluator));
>                     }
>                 }
>             }
>             // Make a note of the index number of the right most cell.
> This
> value
>             // will later be used to ensure that the matrix of data in the
> CSV file
>             // is square.
>             if(lastCellNum > this.maxRowWidth) {
>                 this.maxRowWidth = lastCellNum;
>             }
>         }
>         this.csvData.add(csvLine);
>     }
> 
>     /**
>      * The main() method contains code that demonstrates how to use the
> class.
>      * @param args
>      */
>     public static void main(String[] args) {
>         // Check the number of arguments passed to the main method. There
>         // must be two or three, the name of and path to either the folder
>         // containing the Excel files or an individula Excel workbook that
> is/are
>         // to be converted, the name of and path to the folder to which
> the
> CSV
>         // files should be written and then finally, optionally, the
> separator
>         // that should be used to separate individual items on the lines
> in
> the
>         // CSV file. Note that the names of the CSV files will be derived
> from
>         // those of the Excel file(s). Put simply the .xls or .xlsx
> extension
>         // will be replaced with .csv.
>         ToCSV converter = null;
>         try {
>             converter = new ToCSV();
>             if(args.length == 2) {
>                 converter.convertExcelToCSV(args[0], args[1]);
>             }
>             else if(args.length == 3){
>                 converter.convertExcelToCSV(args[0], args[1], args[2]);
>             }
>             else {
>                 System.out.println("Usage: java ToCSV \"Source Folder\" "
> +
>                         "\"Destination Folder\" \"CSV Element
> Separator\"");
>             }
>         }
>         catch(Exception ex) {
>             System.out.println("Caught an: " + ex.getClass().getName());
>             System.out.println("Message: " + ex.getMessage());
>             System.out.println("Stacktrace follows:.....");
>             ex.printStackTrace(System.out);
>         }
>     }
> 
>     /**
>      * An instance of this class can be used to control the files returned
>      * be a call to the listFiles() method when made on an instance of the
>      * File class and that object refers to a folder/directory
>      */
>     class ExcelFilenameFilter implements FilenameFilter {
> 
>         /**
>          * Determine those files that will be returned by a call to the
>          * listFiles() method. In this case, the name of the file must end
> with
>          * either of the following two extension; '.xls' or '.xlsx'
>          * @param file An instance of the File class that encapsulates a
> handle
>          *             referring to the folder/directory that contains the
> file.
>          * @param name An instance of the String class that encapsulates
> the
>          *             name of the file.
>          * @return A boolean value that indicates whether the file should
> be
>          *         included in the array retirned by the call to the
> listFiles()
>          *         method. In this case true will be returned if the name
> of
> the
>          *         file ends with either '.xls' or '.xlsx' and false will
> be
>          *         returned in all other instances.
>          */
>         public boolean accept(File file, String name) {
>             return(name.endsWith(".xls") || name.endsWith(".xlsx"));
>         }
>     }
> }
> 
> 
> Luke_Devon wrote:
>> 
>> Hi Mark
>> 
>> I tested the code. It was fine. I could manage to convert XLS into CSV
>> without any problem. But i have few more questions.
>> 
>> In this code , you have been hard coded the path where XLS and CSV
>> located. and file names also hard coded.
>> 
>> How it would be pointing to a single directory and convert all XLS into
>> CSV stored in the folder ? 
>> 
>> In the RAW XLS file , there might be some unwanted data to be converted
>> into CSV. or some times we need to complete the blank cells in the excel
>> sheet.
>> 
>> How can we do it , before convert into CSV ? 
>> 
>> Sorry about all those simple questions.( Since I'm not a programmer)
>> 
>> Thanks in advance
>> Luke
>> 
>> 
>> 
>> 
>> 
>> ________________________________
>> From: MSB <ma...@tiscali.co.uk>
>> To: user@poi.apache.org
>> Sent: Friday, April 9, 2010 22:18:43
>> Subject: Re: Convert XLS into CSV
>> 
>> 
>> Hello Luke,
>> 
>> As promised, a bit of code that uses the usermodel to create CSV files. I
>> have not had the opportunity to test it thoroughly and do expect there to
>> be
>> issues so do not use the code in a production environment until you have
>> put
>> it through the wringer so to speak. Also, you may find the performance a
>> little slower than you expect, especially if you are using the newer xml
>> based file format and have run the eventusermodel code that Nick wrote.
>> 
>> In essence, 'my' code simplt takes the contents of the workbook and
>> converts
>> it into an ArrayList of ArrayLists where each inner ArrayList contains
>> zero,
>> one or more Strings that describe the contents of a cell. I used this
>> approach becuase it allows me to ensure that every row in the finished
>> CVS
>> file is the same length - with regard to the number of elements it
>> contains
>> - even if the input workbook contains rows that have varying numbers of
>> cells on them. The code as it stands does evaluate any formulae that may
>> be
>> contained within cells and I hope will perform pretty much as you
>> require.
>> Take a look down into the main() method to see how it can be used; this
>> method only shows the class being used to process a single file but an
>> instance can be used to process more than one file in this manner;
>> 
>> ToCSV converter = new ToCSV();
>> converter.openWorkbook("C:/temp/To CSV.xls");
>> converter.convertToCSV();
>> converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>> 
>> converter.openWorkbook("C:/temp/Another To CSV.xlsx");
>> converter.convertToCSV();
>> converter.saveCSVFile("C:/temp/Second CSV.csv", ";");
>> 
>> 
>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>> import org.apache.poi.ss.usermodel.Workbook;
>> import org.apache.poi.ss.usermodel.Sheet;
>> import org.apache.poi.ss.usermodel.Row;
>> import org.apache.poi.ss.usermodel.Cell;
>> import org.apache.poi.ss.usermodel.DataFormatter;
>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>> 
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileWriter;
>> import java.io.BufferedWriter;
>> import java.io.IOException;
>> import java.io.FileNotFoundException;
>> import java.util.ArrayList;
>> 
>> /**
>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>> This
>> * class makes the following assumptions;
>> *
>> * 1. Where the Excel workbook contains more that one worksheet then a
>> single
>> *    CSV file will contain the data from all of the worksheets.
>> * 2. The data matrix contained in the CSV file will be square. This means
>> that
>> *    the number of elements in each row of the CSV file will match the
>> number
>> *    of cells in the longest row found in the Excel workbook. Any short
>> rows
>> *    will be 'padded' with empty elements - an empty elements is
>> represented in
>> *    the CSV file in this way ,,.
>> * 3. Empty elements will represent missing cells.
>> * 4. A row consisting of empty elements will be used to represent an
>> empty
>> row
>> *    in the Excel workbook.
>> *
>> * @author Mark B
>> * @version 1.00 9th April 2010
>> */
>> public class ToCSV {
>> 
>>     private Workbook workbook = null;
>>     private ArrayList<ArrayList> csvData = null;
>>     private int maxRowWidth = 0;
>>     private DataFormatter formatter = null;
>>     private FormulaEvaluator evaluator = null;
>> 
>>     /**
>>      * Open an Excel workbook readt for conversion.
>>      *
>>      * @param filename An instance of the String class that encapsulates
>> the
>>      *                 path to and name of a valid Excel workbook. Note
>> that
>> the
>>      *                 workbook can be either a binary (.xls) or
>> SpreadsheetML
>>      *                 (.xlsx) file.
>>      *
>>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
>> located.
>>      * @throws java.io.IOException Thrown if a problem occurs in the file
>> system.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *     if invalid xml is found whilst parsing an input SpreadsheetML
>> file.
>>      */
>>     public void openWorkbook(String filename) throws
>> FileNotFoundException,
>>                                            IOException,
>> InvalidFormatException {
>>         File file = null;
>>         FileInputStream fis = null;
>>         try {
>>             file = new File(filename);
>>             fis = new FileInputStream(file);
>>             this.workbook = WorkbookFactory.create(fis);
>>             this.evaluator =
>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>             this.formatter = new DataFormatter();
>>         }
>>         finally {
>>             if(fis != null) {
>>                 fis.close();
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to convert the contents of the currently opened workbook
>> into
>>      * a CSV file.
>>      */
>>     public void convertToCSV() {
>>         Sheet sheet = null;
>>         Row row = null;
>>         int lastRowNum = 0;
>>         this.csvData = new ArrayList<ArrayList>();
>> 
>>         // Discover how many sheets there are in the workbook....
>>         int numSheets = this.workbook.getNumberOfSheets();
>> 
>>         // and then iterate through them.
>>         for(int i = 0; i < numSheets; i++) {
>> 
>>             // Get a reference to a sheet and check to see if it contains
>>             // any rows.
>>             sheet = this.workbook.getSheetAt(i);
>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>> 
>>                 // Note down the index number of the bottom-most row and
>>                 // then iterate through all of the rows on the sheet
>> starting
>>                 // from the very first row - number 1 - even if it is
>> missing.
>>                 // Recover a reference to the row and then call another
>> method
>>                 // which will strip the data from the cells and build
>> lines
>>                 // for inclusion in the resylting CSV file.
>>                 lastRowNum = sheet.getLastRowNum();
>>                 for(int j = 0; j <= lastRowNum; j++) {
>>                     row = sheet.getRow(j);
>>                     this.rowToCSV(row);
>>                 }
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to actually save the data recovered from the Excel workbook
>>      * as a CSV file.
>>      *
>>      * @param filename An instance of the String class that encapsulates
>> the
>>      *                path to and name of the CSV file.
>>      * @param separator An instance of the String class that encapsulates
>> the
>>      *                  character or character that ought to be used to
>> delimit
>>      *                  elements on the lines of the CSV file.
>>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
>> found.
>>      * @throws java.io.IOException Thrown to indicate and error occurred
>> in
>> the
>>      *                             underylying file system.
>>      */
>>     public void saveCSVFile(String filename, String separator)
>>                                                    throws
>> FileNotFoundException,
>>                                                           IOException {
>>         File file = null;
>>         FileWriter fw = null;
>>         BufferedWriter bw = null;
>>         ArrayList<String> line = null;
>>         StringBuffer buffer = null;
>>         String csvLineElement = null;
>>         try {
>>             // Open a writer onto the CSV file.
>>             file = new File(filename);
>>             fw = new FileWriter(file);
>>             bw = new BufferedWriter(fw);
>> 
>>             // Step through the elements of the ArrayList that was used
>> to
>> hold
>>             // all of the data recovered from the Excel workbooks'
>> sheets,
>> rows
>>             // and cells.
>>             for(int i = 0; i < this.csvData.size(); i++) {
>>                 buffer = new StringBuffer();
>> 
>>                 // Get an element from the ArrayList that contains the
>> data
>> for
>>                 // the workbook. This element will itself be an ArrayList
>>                 // containing Strings and each String will hold the data
>> recovered
>>                 // from a single cell. The for() loop is used to recover
>> elements
>>                 // from this 'row' ArrayList one at a time and to write
>> the
>> Strings
>>                 // away to a StringBuffer thus assembling a single line
>> for
>> inclusion
>>                 // in the CSV file. If a row was empty or if it was
>> short,
>> then
>>                 // the ArrayList that contains it's data will also be
>> shorter than
>>                 // some of the others. Therefore, it is necessary to
>> check
>> within
>>                 // the for loop to ensure that the ArrayList contains
>> data
>> to be
>>                 // processed. If it does, then an element will be
>> recovered
>> and
>>                 // appended to the StringBuffer.
>>                 line = this.csvData.get(i);
>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>                     if(line.size() > j) {
>>                         csvLineElement = line.get(j);
>>                         if(csvLineElement != null) {
>>                             buffer.append(csvLineElement);
>>                         }
>>                     }
>>                     if(j < (this.maxRowWidth - 1)) {
>>                         buffer.append(separator);
>>                     }
>>                 }
>> 
>>                 // Once the line is built, write it away to the CSV file.
>>                 bw.write(buffer.toString().trim());
>> 
>>                 // Condition the inclusion of new line characters so as
>> to
>>                 // avoid an additional, superfluous, new line at the end
>> of
>>                 // the file.
>>                 if(i < (this.csvData.size() - 1)) {
>>                     bw.newLine();
>>                 }
>>             }
>>         }
>>         finally {
>>             if(bw != null) {
>>                 bw.flush();
>>                 bw.close();
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to convert a row of cells into a line of data that can
>> later
>> be
>>      * output to the CSV file.
>>      *
>>      * Note that no tests have yet been conducted with blank cells or
>> those
>>      * containing formulae. Such may require latereations to the way this
>> code
>>      * works.
>>      *
>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>> that
>>      *            encapsulates information about a row of cells recovered
>> from
>>      *            an Excel workbook.
>>      */
>>     private void rowToCSV(Row row) {
>>         Cell cell = null;
>>         int lastCellNum = 0;
>>         ArrayList<String> csvLine = new ArrayList<String>();
>> 
>>         // Check to ensure that a row was recovered from the sheet as it
>> is
>>         // possible that one or more rows between other populated rows
>> could
>> be
>>         // missing - blank. If the row does contain cells then...
>>         if(row != null) {
>> 
>>             // Get the index for the right most cell on the row and then
>>             // step along the row from left to right recovering the
>> contents
>>             // of each cell, converting that into a formatted String and
>>             // then storing the String into the csvLine ArrayList.
>>             lastCellNum = row.getLastCellNum();
>>             for(int i = 0; i <= lastCellNum; i++) {
>>                 cell = row.getCell(i);
>>                 if(cell == null) {
>>                     csvLine.add("");
>>                 }
>>                 else {
>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>                        
>> csvLine.add(this.formatter.formatCellValue(cell));
>>                     }
>>                     else {
>>                         csvLine.add(this.formatter.formatCellValue(cell,
>> this.evaluator));
>>                     }
>>                 }
>>             }
>>             // Make a note of the index number of the right most cell.
>> This
>> value
>>             // will later be used to ensure that the matrix of data in
>> the
>> CSV file
>>             // is square.
>>             if(lastCellNum > this.maxRowWidth) {
>>                 this.maxRowWidth = lastCellNum;
>>             }
>>         }
>>         this.csvData.add(csvLine);
>>     }
>> 
>>     /**
>>      * The main() method contains code that demonstrates how to use the
>> class.
>>      * @param args
>>      */
>>     public static void main(String[] args) {
>>         try {
>>             ToCSV converter = new ToCSV();
>>             converter.openWorkbook("C:/temp/To CSV.xls");
>>             converter.convertToCSV();
>>             converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>         }
>>         catch(Exception ex) {
>>             System.out.println("Caught an: " + ex.getClass().getName());
>>             System.out.println("Message: " + ex.getMessage());
>>             System.out.println("Stacktrace follows:.....");
>>             ex.printStackTrace(System.out);
>>         }
>> 
>>     }
>> }
>> 
>> Test it out, have a good look through it and if there is anything you
>> want
>> to know just post to the list.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Luke_Devon wrote:
>>> 
>>> Hi Mark,
>>> 
>>> First of all I would like to thank you for the reply. 
>>> 
>>> Actually , currently I am using MS Office 2002. But I would like to use
>>> the code for other latest versions also.
>>> In my case , I dont want to do any validations for the EXCEL file ,
>>> because I just wanted convert entire excel file into CSV.
>>> Is there any simple java code available for such a basic requirement ?
>>> 
>>> anyway I'll try to use the code in the link which you have given to me .
>>> 
>>> Thanks & Regards
>>> 
>>> Luke.
>>> 
>>> 
>>> 
>>> 
>>> ________________________________
>>> From: MSB <ma...@tiscali.co.uk>
>>> To: user@poi.apache.org
>>> Sent: Thursday, April 8, 2010 23:47:07
>>> Subject: Re: Convert XLS into CSV
>>> 
>>> 
>>> Hello Luke,
>>> 
>>> Which version of the Excel file format are you targetting, the older
>>> binary
>>> or newer xml based version? I ask because Nick wrote and contributed
>>> some
>>> code that can be used to convert the older binary files into csv. It
>>> uses
>>> the eventmodel and will seem quite complex on first acquaintance but
>>> here
>>> it
>>> is;
>>> 
>>> http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
>>> 
>>> I know that other users have modified the code to, for example, output
>>> diffenert worksheets to separate csv files.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Luke_Devon wrote:
>>>> 
>>>> Hi 
>>>> 
>>>> I wanted to convert some XLS files into CSV. I found that apache.poi is
>>>> the most perfect tool. Since I'm a beginner , i have no idea how to do
>>>> that . Can some body help me please ? Do you have sample code for
>>>> convert
>>>> xls into csv ?
>>>> 
>>>> Thanks in Advance
>>>> Luke
>>>> 
>>>> 
>>>> 
>>>>       Get your preferred Email name!
>>>> Now you can @ymail.com and @rocketmail.com. 
>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28180503.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>>       New Email names for you! 
>>> Get the Email name you've always wanted on the new @ymail and
>>> @rocketmail. 
>>> Hurry before someone else does!
>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28191046.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>>       Get your new Email address!
>> Grab the Email name you've always wanted before someone else does!
>> http://mail.promotions.yahoo.com/newdomains/aa/
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28230701.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28333557.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Modifying excel sheets

Posted by luke devon <lu...@yahoo.com>.
Hi Mark

Of course now I am also really interested about developments. But just to make a piece of a code takes long time. This is my concern. anyway as you directed me previously , I could come with the final code.Like that i will work on the other steps as well. But still i need your support .

Many Thanks
Luke


________________________________
From: MSB <ma...@tiscali.co.uk>
To: user@poi.apache.org
Sent: Tuesday, April 27, 2010 18:21:16
Subject: Re: Convert XLS into CSV - Removing a Work Sheet


Hello again Luke,

Just a quick visit - am enjoying that great British tradition, the
tea-break, and taking the chance to look at a few messages.

Knowing that the word is just in column H is a great help as it will make a
life a lot easier for us. Column H will be referenced - using POI - by the
index 7. Thus, I am currently thinking along the following lines;

To create an ArrayList of type Integer.
Read the worksheet line by line and get cell index 7.
If it's type is String and if it contains the word 'Sum' then add the index
number of the row to the ArrayList.
Once the sheet has been fully processed, start at the end of the ArrayList,
recover each row index in turn and delete it from the sheet. Working from
the bottom-up in this manner is slightly easier than working from the bottom
down.

Tonight, I will be able to put some code together to do this but why not
have a go for yourself? It can be esailly combined with the code you are
using to remove the Summary sheet - all you should need to do is process
those sheets that are not called Summary of course.

Then, once we have this bit working, the only hurdle is copying the missing
data. The command line is not the only way to get information into the
application, we can use Properties or even a file that the application can
parse; obviously, this will increase the complexity but add to the
flexibility. So I am thinking along the lines of having a simple text file
that contains lines like;

C10! C11:C20
D34! D67:D78
or
Linux! C10:C20

and is able to provide the instruction to the application. To change it's
behaviour, all you need to do is change the 'instructions' in the file.

Anyway, must go now, we have a fallen Willow tree to clear, a broken fence
to mend, a dipping platform to construct and there is a nice piece of
chocolate cake with my name on it.

Yours

Mark B


Luke_Devon wrote:
> 
> Hi Mark,
> 
> 1.    Value “Sum” always resides on Column H and there is no specific
> /constant space for number of rows between each occurrence of the line
> including that word “Sum”. That’s meant, first “Sum” stored in H, 11.
> Second “Sum” stored in H, 14, Third “Sum” stored in H, 26 ………….so on. Row
> numbers are not constant but Column is constant. That is H.  
> 
> 2.    Finally I understood that we don’t need to remove “comma” from the
> excel sheet since function of XLS convert into CSV is taking care of the
> matter. ( Thanks for directing me to the point )
> 
> 3.    Major part is copying other missing data in the columns. Once every
> thing developed, I don’t need to involve with enter values manually. Main
> idea is handover the tasks to a CRON job.
> 
> Then I have a doubt, how to enter -C "C10! C11:C20 D34! D67:D78" or -C
> "Linux! C10:C20" since program is running by a CRON job
> 
> Please help me to go further .
> 
> Thanks in advance
> Luke
> 
> 
> ________________________________
> From: MSB <ma...@tiscali.co.uk>
> To: user@poi.apache.org
> Sent: Tuesday, April 27, 2010 15:08:36
> Subject: Re: Convert XLS into CSV - Removing a Work Sheet
> 
> 
> Glad to hear that Luke, now onto the other tasks.
> 
> Firstly, can I ask you a question about the first task - removing a row if
> the word Sum appears on it - please? Does Sum always appear in a
> particular
> column? Failing that, are there a set number of rows between each
> occurrence
> of the line including that word? If so, this will speed up the entire
> process somewhat, if not, then we are faced with a process where we need
> to
> examine every cell on each row in the worksheet and this could be quite a
> time consuming process. If it is possible to identify a pattern, even a
> quite crude one that identifies that the word will only appear in columns
> D,
> G or H, then that will help.
> 
> Secondly, the commas in numeric values, this could pressent a bit of a
> problem. As you may know, styles are used to control how a value is
> displayed in a cell and it could be that the person who originally
> constructed the worksheet entered a value such as 12345 into the cell and
> then applied a style so that it appears to contain 12,345.00. The best way
> to resolve this issue is, in many ways, to simply remove the cells
> formatting - but there is yet another issue, what if the cell was not
> numeric at all and the user entered a String of text and then right
> aligned
> it? So, can you look at the worksheet using Excel, find a few of these
> cells
> and see what type they are and what format has been applied to them please
> because, without the file at hand, it is hard to tell you how to proceed
> on
> this issue. Finally, do you need to modify these cells in the workbook as
> it
> would be trivial task to remove the comma when writing the value into the
> CSV file.
> 
> Thirdly, the copying cell contents code, is it possible to say, in
> advance,
> copy the contents of cell B16 into the range B17 to B25 for example? If
> so,
> then it would be quite a straightforward task to automate the process
> completely and simply pass in a parameter format in a way the code would
> recognise. For example, and assuming running the code from the command
> line,
> we could have -C "C10!C11:C20 D34!D67:D78" or -C "Linux!C10:C20" where the
> former would take the contents of cell C10 and copy it into cells C11 to
> C20
> and the latter would write the String Linux into cell C11 to C20.
> 
> Yours
> 
> Mark B
> 
> 
> 
> Luke_Devon wrote:
>> 
>> Hi Mark
>> 
>> I modified the code to removing the Summary sheet from my Excel workbook.
>> It is working. Thanks for your guide and the support.( Herewith I have
>> attached the code )
>> 
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileOutputStream;
>> import org.apache.poi.ss.usermodel.Workbook;
>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>> 
>> public class SheetUtil {
>> 
>>    public static void main(String a[]){
>>        new SheetUtil().deleteSheet();
>>    }
>> 
>> 
>>    public void deleteSheet(){
>>        int sheetToDelete = -1;
>> 
>>        String path = "Book1.xls";
>>        FileInputStream fis  = null;
>>        Workbook workbook = null;
>>        File file = null;
>>        FileOutputStream fos = null;
>>        try{
>> 
>>            file = new File(path);
>> 
>>            fis = new FileInputStream(file);
>>            workbook = WorkbookFactory.create(fis);
>> 
>> 
>> 
>>            int numSheets = workbook.getNumberOfSheets();
>> 
>>            for(int i = 0; i < numSheets; i++) {
>>                org.apache.poi.ss.usermodel.Sheet sheet =
>> workbook.getSheetAt(i);
>>                System.out.println(sheet.getSheetName());
>>                if(sheet.getSheetName().equalsIgnoreCase("Summary")) {
>>                    sheetToDelete = i;
>>                    break;
>> 
>>                }
>>            }
>> 
>> 
>>            if(sheetToDelete > -1) {
>>                workbook.removeSheetAt(sheetToDelete);
>>            }
>> 
>>            fos = new FileOutputStream(file);
>> 
>>            workbook.write(fos);
>> 
>>        }catch(Exception e){
>>            e.printStackTrace();
>>        }finally{
>>            try{
>>                if(fis != null){
>>                    fis.close();
>>                    fis = null;
>> 
>>                }
>>                if(fos != null){
>>                    fos.close();
>>                    fos = null;
>> 
>>                }
>>            }catch(Exception ee){
>>                ee.printStackTrace();
>> 
>>            }
>>        }
>>    }
>> }
>> 
>> =====================================================
>> 
>> I have few more modifications , to be applied for the workbook. Can you
>> please help me further?
>> 
>> I will list down you my other requirements.
>> 
>> After I removed the "Summary" sheet , there will be only one Sheet will
>> be
>> remain which is "Detail". 
>> 
>> In the Detail Sheet,
>> 
>> 1. In the Sheet , there is String called "Sum". This value also will be
>> there more than once.
>> 
>> Once that value found , needed to be delete entire row respectively. Eg:
>> If "Sum" is in , Cell H11, 11th row needed to be removed from the
>> worksheet.
>> 
>> 2. Some Cells are containing values with "1000 separator (,)" -- Eg:
>> 1,256.00 . I wanted to remove all thousand separators from the values
>> which are exist in the Detail excel sheet. Eg: 1256.00 --- after removing
>> the "," value should be like this.(1256.00)
>> 
>> 3. In B16 , There is a String called "Linux". This value i wanted to copy
>> ( repeat ) until B25.Thats meant This value must be repeat from B17 to
>> B25.( This is just like we copy a value along the column for more than
>> one
>> row )
>> 
>> Like that , in this worksheet , there will be lot of values to be copied.
>> Hence I think we can not do it dynamically and we needed hard code those
>> values in the code .
>> 
>> Thanks in advance
>> Luke
>> 
>> 
>> ________________________________
>> From: MSB <ma...@tiscali.co.uk>
>> To: user@poi.apache.org
>> Sent: Friday, April 23, 2010 3:27:38
>> Subject: Re: Convert XLS into CSV
>> 
>> 
>> Luke
>> 
>> As you have only limited experience with Java, I have uploaded a
>> distribution that you should be able to
>> run from the command line. I know some companies restrict access to file
>> sharing sites bu if you can use Rapidshare from your location, this is
>> the
>> link to the archive;
>> 
>> http://rapidshare.com/files/378927105/CSV_Conversion_Code.rar
>> 
>> and if you have any problems downloading the file, just let me know. It
>> is
>> quite large as it contains all of
>> the POI archives that you will need to run the code but I am confident
>> there
>> is another way I could send it
>> directly to you if Rapidshare is inaccessible.
>> 
>> If you are unfamiliar with Rapidshare, all you need to do is follow the
>> link, choose the free download option
>> and download the file to your machine. Once you have downloaded the
>> archive
>> called 
>> CSV Conversion Code.rar, unzip it and you should have a folder with the
>> name
>> CSV Conversion Code. This
>> folder contains a further folder called dist and a file by the name of
>> Read
>> Me.rtf.
>> Open this rtf file (Word or WordPad will handle it successfully) and it
>> will
>> tell you what to do next.
>> 
>> Any problems, just send me a message.
>> 
>> Yours
>> 
>> Mark B
>> 
>> PS How did you get on with removing the Summary sheet from your Excel
>> workbook?
>> 
>> 
>> Luke_Devon wrote:
>>> 
>>> Hi Mark
>>> 
>>> First of all I would like to thank you for all your explanation which
>>> you
>>> have explained very well regarding the code. It was really helpful to
>>> understand the flow of the code and I think its nice to have such
>>> audience
>>> and explanation from an expert / one of the best,  in the industry for a
>>> beginner like me. 
>>> 
>>> I was executing your final Code to convert XLS into CSV. Its compiled,
>>> but
>>> there was two messages , 
>>> uses unchecked or unsafe operations.
>>> Note: Recompile with -Xlint:unchecked for details.
>>> 
>>> I did google , for those messages and there was some directives , but as
>>> of some forum says , it wont affect to the execution.
>>> 
>>> Finally , when I tried to execute it , 
>>> 
>>> java ToCSV d:\excel  d:\csv , 
>>> 
>>> [java] Usage: java ToCSV "Source Folder" "Destination Folder" "CSV
>>> Element
>>> Separator"
>>> 
>>> BUILD FAILED
>>> Target "d:\excel" does not exist in the project "Inbox".
>>> 
>>> I tried to go through the code  , if there is any thing that i can
>>> modified to solve the problem.but no luck.
>>> 
>>> Thanks in advance
>>> Luke
>>> 
>>> 
>>> ________________________________
>>> From: MSB <ma...@tiscali.co.uk>
>>> To: user@poi.apache.org
>>> Sent: Tuesday, April 13, 2010 22:04:55
>>> Subject: Re: Convert XLS into CSV
>>> 
>>> 
>>> The code I have pasted into this message has been modified so that it is
>>> possible to specify the path to and name of a folder containing one or
>>> more
>>> Excel workbooks and the code will now iterate through each of the
>>> workbooks
>>> in the folder and convert them into separate CSV files. The name of the
>>> resulting CSV file will in each case be derived by taking the name of
>>> the
>>> source Excel workbook, removing the .xls or xlsx extension and suffixing
>>> the
>>> .csv extension. Before looking at the code in any detail - most of it is
>>> identical by the way - I need to explain how you can call and use this
>>> class. Forgive me if you know any of this but I thought it safer to
>>> assume
>>> you might not be totally clear on the details.
>>> 
>>> When you run a Java class with a command like this;
>>> 
>>> java ToCSV
>>> 
>>> the runtime environment will locate the .class file, load and compile
>>> the
>>> bytecode and then look inside the class for the main() method which, if
>>> that
>>> method is present and it contains any code at all, is where execution of
>>> the
>>> program will begin. If you look at the signature for the main() method,
>>> you
>>> will see that it looks something like this;
>>> 
>>> public static void main(String[] args)
>>> 
>>> ignoring the public, static and void keywords for the moment, take a
>>> look
>>> at
>>> what the brackets contain - String[] args. This is the declaration for a
>>> variable called args which is able to hold a reference to an array where
>>> each element of that array is an instance the java.lang.String class.
>>> This
>>> array will hold any parameters that were specified on the command line
>>> when
>>> the class was run. So, to modify the example above slightly, this;
>>> 
>>> java ToCSV C:/temp C:/temp/CSV
>>> 
>>> would result in a reference to an an array holding two Strings being
>>> passed
>>> to the args parameter; the first element would hold "C:/temp" and the
>>> second
>>> C:/temp/CSV. Well, more accurately, the elements would hold references
>>> to
>>> instances of the String class that encapsulated the values "C:/temp" and
>>> C:/temp/CSV.
>>> 
>>> The contents of the args parameter are accessible to any code within the
>>> args method and so it is possible to extract the contents of the two
>>> elements by simply referencing each using Java's standard notation; the
>>> first element in the args array would be args[0] and the second args[1].
>>> 
>>> With this information, you can see how to run the ToCSV class from the
>>> command line. All that you need do is specify the path to and name of
>>> either
>>> an individual Excel workbook or of a folder that contains Excel
>>> workbook,
>>> the path to and name of the folder that should contain the CSV files
>>> and,
>>> optionally, the character or string of characters that will separate
>>> each
>>> element on the line within the CSV file. The class can be invoked from
>>> the
>>> command line in this manner;
>>> 
>>> java ToCSV "C:/temp/Excel Workbooks" "C:/temp/CSV Files"
>>> 
>>> or
>>> 
>>> java ToCSV C:/temp/Excel C:/temp/CSV ,
>>> 
>>> Note that the speach marks enclosing the paths in the first example are
>>> only
>>> necessary if there are embedded spaces within the parameter.
>>> 
>>> If you now take a look at the code in the main() method of the ToCSV
>>> file,
>>> you can see that it simply recovers the values from the args array and
>>> uses
>>> them to call the convertExcelToCSV() method on an instance of the ToCSV
>>> class.
>>> 
>>> Equally, you can call this class from code within another Java class. If
>>> you
>>> choose to do this, it is an even simpler process; simply create an
>>> instance
>>> of the ToCSV class and then call either of the overloaded
>>> convertExcelToCSV() methods. One accepts two parameters, the name of and
>>> path to the source Excel workbook(s) and the name of and path to the
>>> folder
>>> the CSV files ought to be written to. If you call this method, it will
>>> be
>>> assumed that the separtor is a comma. The other method accepts a third
>>> parameter allowing you to define what the separator should be, so you
>>> could
>>> pass a colon for example. 
>>> 
>>> The process is a little like this;
>>> 
>>> ToCSV converter = new ToCSV();
>>> converter.convertExcelToCSV("C:/temp/Excel Workbooks", "C:/temp/CSV
>>> Files",
>>> "!");
>>> 
>>> The folder that contains the workbooks can hold other files as well, you
>>> do
>>> not have to worry about copying the Excel workbooks to a special file
>>> for
>>> processing. The ToCSV class is able to identify and select just Excel
>>> workbook files that have extensions of '.xls' or '.xlsx' and it does
>>> this
>>> by
>>> using an instance of an inner class called ExcelFilenameFilter that you
>>> can
>>> see defined at the bottom of the source code, immediately following the
>>> main() method. As you can see, it contains a single method - accept() -
>>> that
>>> contains a single line of code. This line of code will return the
>>> boolean
>>> value true if the name of a file ends with either '.xls' or '.xlsx' and
>>> an
>>> instance of ExcelFilenameFilter can, therefore, be used, when the
>>> contents
>>> of the folder are read, to ensure that only Excel workbooks are
>>> processed.
>>> I
>>> often use inner classes like this if they are very small and simple or
>>> if
>>> they only make sense in the context of the enclosing class. By this, I
>>> mean
>>> asking yourslef if it is likely any other class may need similar
>>> functionality.
>>> 
>>> Anayway, here is the code. Remember that it is only one possible
>>> solution
>>> and you may need to modify how it works to suit your specific
>>> requirement.
>>> You may not be worried, for example, if all of the rows in the CSV file
>>> contain different numbers of elements. If this is the case, the contents
>>> of
>>> the cells could be written out immediately after they were read and you
>>> could completely remove the additional step of storing them in
>>> ArrayLists
>>> to
>>> ensure the matrix is square.
>>> 
>>> package postprocessxlsx;
>>> 
>>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>>> import org.apache.poi.ss.usermodel.Workbook;
>>> import org.apache.poi.ss.usermodel.Sheet;
>>> import org.apache.poi.ss.usermodel.Row;
>>> import org.apache.poi.ss.usermodel.Cell;
>>> import org.apache.poi.ss.usermodel.DataFormatter;
>>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>>> 
>>> import java.io.File;
>>> import java.io.FileInputStream;
>>> import java.io.FileWriter;
>>> import java.io.BufferedWriter;
>>> import java.io.FilenameFilter;
>>> import java.io.IOException;
>>> import java.io.FileNotFoundException;
>>> import java.util.ArrayList;
>>> 
>>> /**
>>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>>> This
>>> * class makes the following assumptions;
>>> *
>>> * 1. Where the Excel workbook contains more that one worksheet then a
>>> single
>>> *    CSV file will contain the data from all of the worksheets.
>>> * 2. The data matrix contained in the CSV file will be square. This
>>> means
>>> that
>>> *    the number of elements in each row of the CSV file will match the
>>> number
>>> *    of cells in the longest row found in the Excel workbook. Any short
>>> rows
>>> *    will be 'padded' with empty elements - an empty elements is
>>> represented in
>>> *    the CSV file in this way ,,.
>>> * 3. Empty elements will represent missing cells.
>>> * 4. A row consisting of empty elements will be used to represent an
>>> empty
>>> row
>>> *    in the Excel workbook.
>>> *
>>> * @author Mark B
>>> * @version 1.10 12th April 2010
>>> */
>>> public class ToCSV {
>>> 
>>>     private Workbook workbook = null;
>>>     private ArrayList<ArrayList> csvData = null;
>>>     private int maxRowWidth = 0;
>>>     private DataFormatter formatter = null;
>>>     private FormulaEvaluator evaluator = null;
>>>     private String separtor = null;
>>> 
>>>     public static final String CSV_FILE_EXTENSION = ".csv";
>>>     public static final String DEFAULT_SEPARATOR = ",";
>>> 
>>>     /**
>>>      * Process the contents of a folder, convert the contents of each
>>> Excel
>>>      * workbook into CSV format and save the resulting file to the
>>> specified
>>>      * folder using the same name as the original workbook with the .xls
>>> or
>>>      * .xlsx extension replaced by .csv
>>>      *
>>>      * @param source An instance of the String class that encapsulates
>>> the
>>>      *        name of and path to either a folder containing those Excel
>>>      *        workbook(s) or the name of and path to an individual Excel
>>> workbook
>>>      *        that is/are to be converted.
>>>      * @param destination An instance of the String class encapsulating
>>> the
>>> name
>>>      *        of and path to a folder that will contain the resulting
>>> CSV
>>> files.
>>>      * @throws java.io.FileNotFoundException Thrown if any file cannot
>>> be
>>> located
>>>      *         on the filesystem during processing.
>>>      * @throws java.io.IOException Thrown if the filesystem encounters
>>> any
>>>      *         problems during processing.
>>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>>> passed
>>>      *         to the source parameters refers to a file or folder that
>>> does
>>> not
>>>      *         exist and if the value passed to the destination
>>> paramater
>>> refers
>>>      *         to a folder that does not exist or simply does not refer
>>> to
>>> a
>>>      *         folder that does not exist.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *         if the xml markup encounetered whilst parsing a
>>> SpreadsheetML
>>>      *         file (.xlsx) is invalid.
>>>      */
>>>     public void convertExcelToCSV(String strSource, String
>>> strDestination)
>>>                        throws FileNotFoundException, IOException,
>>>                               IllegalArgumentException,
>>> InvalidFormatException {
>>> 
>>>         // Simply chain the call to the overloaded
>>> convertExcelToCSV(String,
>>>         // String, String) method and pass the default separator to
>>> ensure
>>> this
>>>         // String is used to separate elemnts on the line in the CSV
>>> file.
>>>         this.convertExcelToCSV(strSource, strDestination,
>>> ToCSV.DEFAULT_SEPARATOR);
>>>     }
>>> 
>>>     /**
>>>      * Process the contents of a folder, convert the contents of each
>>> Excel
>>>      * workbook into CSV format and save the resulting file to the
>>> specified
>>>      * folder using the same name as the original workbook with the .xls
>>> or
>>>      * .xlsx extension replaced by .csv
>>>      *
>>>      * @param source An instance of the String class that encapsulates
>>> the
>>>      *        name of and path to either a folder containing those Excel
>>>      *        workbook(s) or the name of and path to an individual Excel
>>> workbook
>>>      *        that is/are to be converted.
>>>      * @param destination An instance of the String class encapsulating
>>> the
>>> name
>>>      *        of and path to a folder that will contain the resulting
>>> CSV
>>> files.
>>>      * @param separator An instance of the String class encapsulating
>>> the
>>>      *        characters or characters that should be used to separate
>>> items
>>>      *        on a line within the CSV file.
>>>      * @throws java.io.FileNotFoundException Thrown if any file cannot
>>> be
>>> located
>>>      *         on the filesystem during processing.
>>>      * @throws java.io.IOException Thrown if the filesystem encounters
>>> any
>>>      *         problems during processing.
>>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>>> passed
>>>      *         to the source parameters refers to a file or folder that
>>> does
>>> not
>>>      *         exist and if the value passed to the destination
>>> paramater
>>> refers
>>>      *         to a folder that does not exist or simply does not refer
>>> to
>>> a
>>>      *         folder that does not exist.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *         if the xml markup encounetered whilst parsing a
>>> SpreadsheetML
>>>      *         file (.xlsx) is invalid.
>>>      */
>>>     public void convertExcelToCSV(String strSource,
>>>                                   String strDestination, String
>>> separator)
>>>                        throws FileNotFoundException, IOException,
>>>                               IllegalArgumentException,
>>> InvalidFormatException {
>>>         File source = new File(strSource);
>>>         File destination = new File(strDestination);
>>>         File[] filesList = null;
>>>         String destinationFilename = null;
>>> 
>>>         // Check that the source file/folder exists.
>>>         if(!source.exists()) {
>>>             throw new IllegalArgumentException("The source for the Excel
>>> "
>>> +
>>>                     "file(s) cannot be found.");
>>>         }
>>> 
>>>         // Ensure thaat the folder the user has chosen to save the CSV
>>> files
>>>         // away into firstly exists and secondly is a folder rather
>>> than,
>>> for
>>>         // instance, a data file.
>>>         if(!destination.exists()) {
>>>             throw new IllegalArgumentException("The folder/directory for
>>> the
>>> " +
>>>                     "converted CSV file(s) does not exist.");
>>>         }
>>>         if(!destination.isDirectory()) {
>>>             throw new IllegalArgumentException("The destination for the
>>> CSV
>>> " +
>>>                     "file(s) is not a directory/folder.");
>>>         }
>>> 
>>>         // Check to see if the sourceFolder variable holds a reference
>>> to
>>>         // a file or a folder full of files.
>>>         if(source.isDirectory()) {
>>>             // Get a list of all of the Excel spreadsheet files
>>> (workbooks)
>>> in
>>>             // the source folder/directory
>>>             filesList = source.listFiles(new ExcelFilenameFilter());
>>>         }
>>>         else {
>>>             // Assume that it must be a file handle - although there are
>>> other
>>>             // options the code should perhaps check - and store the
>>> reference
>>>             // into the filesList variable.
>>>             filesList = new File[]{source};
>>>         }
>>> 
>>>         // Step through each of the files in the source folder and for
>>> each
>>>         // open the workbook, convert it's contents to CSV format and
>>> then
>>>         // save the resulting file away into the folder specified by the
>>>         // contents of the destination variable. Note that the name of
>>> the
>>>         // csv file will be created by taking the name of the Excel
>>> file,
>>>         // removing the extension and replacing it with .csv. Note that
>>> there
>>>         // is one drawback with this approach; if the folder holding the
>>> files
>>>         // contains two workbooks whose names match but one is a binary
>>> file
>>>         // (.xls) and the other a SpreadsheetML file (.xlsx), then the
>>> names
>>>         // for both CSV files will be identical and one CSV file will,
>>>         // therefore, over-write the other.
>>>         for(File excelFile : filesList) {
>>>             // Open the workbook
>>>             this.openWorkbook(excelFile);
>>> 
>>>             // Convert it's contents into a CSV file
>>>             this.convertToCSV();
>>> 
>>>             // Build the name of the csv folder from that of the Excel
>>> workbook.
>>>             // Simply replace the .xls or .xlsx file extension with .csv
>>>             destinationFilename = excelFile.getName();
>>>             destinationFilename = destinationFilename.substring(
>>>                     0, destinationFilename.lastIndexOf(".")) + ".csv";
>>> 
>>>             // Save the CSV file away using the newly constricted file
>>> name
>>>             // and to the specified directory.
>>>             this.saveCSVFile(new File(destination, destinationFilename),
>>> separator);
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Open an Excel workbook ready for conversion.
>>>      *
>>>      * @param file An instance of the File class that encapsulates a
>>> handle
>>>      *        to a valid Excel workbook. Note that the workbook can be
>>> in
>>>      *        either binary (.xls) or SpreadsheetML (.xlsx) format.
>>>      *
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> located.
>>>      * @throws java.io.IOException Thrown if a problem occurs in the
>>> file
>>> system.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *         if invalid xml is found whilst parsing an input
>>> SpreadsheetML
>>>      *         file.
>>>      */
>>>     private void openWorkbook(File file) throws FileNotFoundException,
>>>                                            IOException,
>>> InvalidFormatException {
>>>         FileInputStream fis = null;
>>>         try {
>>>             System.out.println("Opening workbook [" + file.getName() +
>>> "]");
>>> 
>>>             fis = new FileInputStream(file);
>>> 
>>>             // Open the workbook and then create the FormulaEvaluator
>>> and
>>>             // DataFormatter instances that will be needed to,
>>> respectively,
>>>             // force evaluation of forumlae found in cells and create a
>>>             // formatted String encapsulating the cells contents.
>>>             this.workbook = WorkbookFactory.create(fis);
>>>             this.evaluator =
>>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>>             this.formatter = new DataFormatter();
>>>         }
>>>         finally {
>>>             if(fis != null) {
>>>                 fis.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert the contents of the currently opened workbook
>>> into
>>>      * a CSV file.
>>>      */
>>>     private void convertToCSV() {
>>>         Sheet sheet = null;
>>>         Row row = null;
>>>         int lastRowNum = 0;
>>>         this.csvData = new ArrayList<ArrayList>();
>>> 
>>>         System.out.println("Converting files contents to CSV format.");
>>> 
>>>         // Discover how many sheets there are in the workbook....
>>>         int numSheets = this.workbook.getNumberOfSheets();
>>> 
>>>         // and then iterate through them.
>>>         for(int i = 0; i < numSheets; i++) {
>>> 
>>>             // Get a reference to a sheet and check to see if it
>>> contains
>>>             // any rows.
>>>             sheet = this.workbook.getSheetAt(i);
>>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>>> 
>>>                 // Note down the index number of the bottom-most row and
>>>                 // then iterate through all of the rows on the sheet
>>> starting
>>>                 // from the very first row - number 1 - even if it is
>>> missing.
>>>                 // Recover a reference to the row and then call another
>>> method
>>>                 // which will strip the data from the cells and build
>>> lines
>>>                 // for inclusion in the resylting CSV file.
>>>                 lastRowNum = sheet.getLastRowNum();
>>>                 for(int j = 0; j <= lastRowNum; j++) {
>>>                     row = sheet.getRow(j);
>>>                     this.rowToCSV(row);
>>>                 }
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to actually save the data recovered from the Excel
>>> workbook
>>>      * as a CSV file.
>>>      *
>>>      * @param file An instance of the File class that encapsulates a
>>> handle
>>>      *             referring to the CSV file.
>>>      * @param separator An instance of the String class that
>>> encapsulates
>>> the
>>>      *                  character or character that ought to be used to
>>> delimit
>>>      *                  elements on the lines of the CSV file.
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> found.
>>>      * @throws java.io.IOException Thrown to indicate and error occurred
>>> in
>>> the
>>>      *                             underylying file system.
>>>      */
>>>     private void saveCSVFile(File file, String separator)
>>>                                      throws FileNotFoundException,
>>> IOException {
>>>         FileWriter fw = null;
>>>         BufferedWriter bw = null;
>>>         ArrayList<String> line = null;
>>>         StringBuffer buffer = null;
>>>         String csvLineElement = null;
>>>         try {
>>> 
>>>             System.out.println("Saving the CSV file [" + file.getName()
>>> +
>>> "]");
>>> 
>>>             // Open a writer onto the CSV file.
>>>             fw = new FileWriter(file);
>>>             bw = new BufferedWriter(fw);
>>> 
>>>             // Step through the elements of the ArrayList that was used
>>> to
>>> hold
>>>             // all of the data recovered from the Excel workbooks'
>>> sheets,
>>> rows
>>>             // and cells.
>>>             for(int i = 0; i < this.csvData.size(); i++) {
>>>                 buffer = new StringBuffer();
>>> 
>>>                 // Get an element from the ArrayList that contains the
>>> data
>>> for
>>>                 // the workbook. This element will itself be an
>>> ArrayList
>>>                 // containing Strings and each String will hold the data
>>> recovered
>>>                 // from a single cell. The for() loop is used to recover
>>> elements
>>>                 // from this 'row' ArrayList one at a time and to write
>>> the
>>> Strings
>>>                 // away to a StringBuffer thus assembling a single line
>>> for
>>> inclusion
>>>                 // in the CSV file. If a row was empty or if it was
>>> short,
>>> then
>>>                 // the ArrayList that contains it's data will also be
>>> shorter than
>>>                 // some of the others. Therefore, it is necessary to
>>> check
>>> within
>>>                 // the for loop to ensure that the ArrayList contains
>>> data
>>> to be
>>>                 // processed. If it does, then an element will be
>>> recovered
>>> and
>>>                 // appended to the StringBuffer.
>>>                 line = this.csvData.get(i);
>>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>>                     if(line.size() > j) {
>>>                         csvLineElement = line.get(j);
>>>                         if(csvLineElement != null) {
>>>                             buffer.append(csvLineElement);
>>>                         }
>>>                     }
>>>                     if(j < (this.maxRowWidth - 1)) {
>>>                         buffer.append(separator);
>>>                     }
>>>                 }
>>> 
>>>                 // Once the line is built, write it away to the CSV
>>> file.
>>>                 bw.write(buffer.toString().trim());
>>> 
>>>                 // Condition the inclusion of new line characters so as
>>> to
>>>                 // avoid an additional, superfluous, new line at the end
>>> of
>>>                 // the file.
>>>                 if(i < (this.csvData.size() - 1)) {
>>>                     bw.newLine();
>>>                 }
>>>             }
>>>         }
>>>         finally {
>>>             if(bw != null) {
>>>                 bw.flush();
>>>                 bw.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert a row of cells into a line of data that can
>>> later
>>> be
>>>      * output to the CSV file.
>>>      *
>>>      * Note that no tests have yet been conducted with blank cells or
>>> those
>>>      * containing formulae. Such may require latereations to the way
>>> this
>>> code
>>>      * works.
>>>      *
>>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>>> that
>>>      *            encapsulates information about a row of cells
>>> recovered
>>> from
>>>      *            an Excel workbook.
>>>      */
>>>     private void rowToCSV(Row row) {
>>>         Cell cell = null;
>>>         int lastCellNum = 0;
>>>         ArrayList<String> csvLine = new ArrayList<String>();
>>> 
>>>         // Check to ensure that a row was recovered from the sheet as it
>>> is
>>>         // possible that one or more rows between other populated rows
>>> could
>>> be
>>>         // missing - blank. If the row does contain cells then...
>>>         if(row != null) {
>>> 
>>>             // Get the index for the right most cell on the row and then
>>>             // step along the row from left to right recovering the
>>> contents
>>>             // of each cell, converting that into a formatted String and
>>>             // then storing the String into the csvLine ArrayList.
>>>             lastCellNum = row.getLastCellNum();
>>>             for(int i = 0; i <= lastCellNum; i++) {
>>>                 cell = row.getCell(i);
>>>                 if(cell == null) {
>>>                     csvLine.add("");
>>>                 }
>>>                 else {
>>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>>                        
>>> csvLine.add(this.formatter.formatCellValue(cell));
>>>                     }
>>>                     else {
>>>                         csvLine.add(this.formatter.formatCellValue(cell,
>>> this.evaluator));
>>>                     }
>>>                 }
>>>             }
>>>             // Make a note of the index number of the right most cell.
>>> This
>>> value
>>>             // will later be used to ensure that the matrix of data in
>>> the
>>> CSV file
>>>             // is square.
>>>             if(lastCellNum > this.maxRowWidth) {
>>>                 this.maxRowWidth = lastCellNum;
>>>             }
>>>         }
>>>         this.csvData.add(csvLine);
>>>     }
>>> 
>>>     /**
>>>      * The main() method contains code that demonstrates how to use the
>>> class.
>>>      * @param args
>>>      */
>>>     public static void main(String[] args) {
>>>         // Check the number of arguments passed to the main method.
>>> There
>>>         // must be two or three, the name of and path to either the
>>> folder
>>>         // containing the Excel files or an individula Excel workbook
>>> that
>>> is/are
>>>         // to be converted, the name of and path to the folder to which
>>> the
>>> CSV
>>>         // files should be written and then finally, optionally, the
>>> separator
>>>         // that should be used to separate individual items on the lines
>>> in
>>> the
>>>         // CSV file. Note that the names of the CSV files will be
>>> derived
>>> from
>>>         // those of the Excel file(s). Put simply the .xls or .xlsx
>>> extension
>>>         // will be replaced with .csv.
>>>         ToCSV converter = null;
>>>         try {
>>>             converter = new ToCSV();
>>>             if(args.length == 2) {
>>>                 converter.convertExcelToCSV(args[0], args[1]);
>>>             }
>>>             else if(args.length == 3){
>>>                 converter.convertExcelToCSV(args[0], args[1], args[2]);
>>>             }
>>>             else {
>>>                 System.out.println("Usage: java ToCSV \"Source Folder\"
>>> "
>>> +
>>>                         "\"Destination Folder\" \"CSV Element
>>> Separator\"");
>>>             }
>>>         }
>>>         catch(Exception ex) {
>>>             System.out.println("Caught an: " + ex.getClass().getName());
>>>             System.out.println("Message: " + ex.getMessage());
>>>             System.out.println("Stacktrace follows:.....");
>>>             ex.printStackTrace(System.out);
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * An instance of this class can be used to control the files
>>> returned
>>>      * be a call to the listFiles() method when made on an instance of
>>> the
>>>      * File class and that object refers to a folder/directory
>>>      */
>>>     class ExcelFilenameFilter implements FilenameFilter {
>>> 
>>>         /**
>>>          * Determine those files that will be returned by a call to the
>>>          * listFiles() method. In this case, the name of the file must
>>> end
>>> with
>>>          * either of the following two extension; '.xls' or '.xlsx'
>>>          * @param file An instance of the File class that encapsulates a
>>> handle
>>>          *             referring to the folder/directory that contains
>>> the
>>> file.
>>>          * @param name An instance of the String class that encapsulates
>>> the
>>>          *             name of the file.
>>>          * @return A boolean value that indicates whether the file
>>> should
>>> be
>>>          *         included in the array retirned by the call to the
>>> listFiles()
>>>          *         method. In this case true will be returned if the
>>> name
>>> of
>>> the
>>>          *         file ends with either '.xls' or '.xlsx' and false
>>> will
>>> be
>>>          *         returned in all other instances.
>>>          */
>>>         public boolean accept(File file, String name) {
>>>             return(name.endsWith(".xls") || name.endsWith(".xlsx"));
>>>         }
>>>     }
>>> }
>>> 
>>> 
>>> Luke_Devon wrote:
>>>> 
>>>> Hi Mark
>>>> 
>>>> I tested the code. It was fine. I could manage to convert XLS into CSV
>>>> without any problem. But i have few more questions.
>>>> 
>>>> In this code , you have been hard coded the path where XLS and CSV
>>>> located. and file names also hard coded.
>>>> 
>>>> How it would be pointing to a single directory and convert all XLS into
>>>> CSV stored in the folder ? 
>>>> 
>>>> In the RAW XLS file , there might be some unwanted data to be converted
>>>> into CSV. or some times we need to complete the blank cells in the
>>>> excel
>>>> sheet.
>>>> 
>>>> How can we do it , before convert into CSV ? 
>>>> 
>>>> Sorry about all those simple questions.( Since I'm not a programmer)
>>>> 
>>>> Thanks in advance
>>>> Luke
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ________________________________
>>>> From: MSB <ma...@tiscali.co.uk>
>>>> To: user@poi.apache.org
>>>> Sent: Friday, April 9, 2010 22:18:43
>>>> Subject: Re: Convert XLS into CSV
>>>> 
>>>> 
>>>> Hello Luke,
>>>> 
>>>> As promised, a bit of code that uses the usermodel to create CSV files.
>>>> I
>>>> have not had the opportunity to test it thoroughly and do expect there
>>>> to
>>>> be
>>>> issues so do not use the code in a production environment until you
>>>> have
>>>> put
>>>> it through the wringer so to speak. Also, you may find the performance
>>>> a
>>>> little slower than you expect, especially if you are using the newer
>>>> xml
>>>> based file format and have run the eventusermodel code that Nick wrote.
>>>> 
>>>> In essence, 'my' code simplt takes the contents of the workbook and
>>>> converts
>>>> it into an ArrayList of ArrayLists where each inner ArrayList contains
>>>> zero,
>>>> one or more Strings that describe the contents of a cell. I used this
>>>> approach becuase it allows me to ensure that every row in the finished
>>>> CVS
>>>> file is the same length - with regard to the number of elements it
>>>> contains
>>>> - even if the input workbook contains rows that have varying numbers of
>>>> cells on them. The code as it stands does evaluate any formulae that
>>>> may
>>>> be
>>>> contained within cells and I hope will perform pretty much as you
>>>> require.
>>>> Take a look down into the main() method to see how it can be used; this
>>>> method only shows the class being used to process a single file but an
>>>> instance can be used to process more than one file in this manner;
>>>> 
>>>> ToCSV converter = new ToCSV();
>>>> converter.openWorkbook("C:/temp/To CSV.xls");
>>>> converter.convertToCSV();
>>>> converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>>> 
>>>> converter.openWorkbook("C:/temp/Another To CSV.xlsx");
>>>> converter.convertToCSV();
>>>> converter.saveCSVFile("C:/temp/Second CSV.csv", ";");
>>>> 
>>>> 
>>>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>>>> import org.apache.poi.ss.usermodel.Workbook;
>>>> import org.apache.poi.ss.usermodel.Sheet;
>>>> import org.apache.poi.ss.usermodel.Row;
>>>> import org.apache.poi.ss.usermodel.Cell;
>>>> import org.apache.poi.ss.usermodel.DataFormatter;
>>>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>>>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>>>> 
>>>> import java.io.File;
>>>> import java.io.FileInputStream;
>>>> import java.io.FileWriter;
>>>> import java.io.BufferedWriter;
>>>> import java.io.IOException;
>>>> import java.io.FileNotFoundException;
>>>> import java.util.ArrayList;
>>>> 
>>>> /**
>>>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>>>> This
>>>> * class makes the following assumptions;
>>>> *
>>>> * 1. Where the Excel workbook contains more that one worksheet then a
>>>> single
>>>> *    CSV file will contain the data from all of the worksheets.
>>>> * 2. The data matrix contained in the CSV file will be square. This
>>>> means
>>>> that
>>>> *    the number of elements in each row of the CSV file will match the
>>>> number
>>>> *    of cells in the longest row found in the Excel workbook. Any short
>>>> rows
>>>> *    will be 'padded' with empty elements - an empty elements is
>>>> represented in
>>>> *    the CSV file in this way ,,.
>>>> * 3. Empty elements will represent missing cells.
>>>> * 4. A row consisting of empty elements will be used to represent an
>>>> empty
>>>> row
>>>> *    in the Excel workbook.
>>>> *
>>>> * @author Mark B
>>>> * @version 1.00 9th April 2010
>>>> */
>>>> public class ToCSV {
>>>> 
>>>>     private Workbook workbook = null;
>>>>     private ArrayList<ArrayList> csvData = null;
>>>>     private int maxRowWidth = 0;
>>>>     private DataFormatter formatter = null;
>>>>     private FormulaEvaluator evaluator = null;
>>>> 
>>>>     /**
>>>>      * Open an Excel workbook readt for conversion.
>>>>      *
>>>>      * @param filename An instance of the String class that
>>>> encapsulates
>>>> the
>>>>      *                 path to and name of a valid Excel workbook. Note
>>>> that
>>>> the
>>>>      *                 workbook can be either a binary (.xls) or
>>>> SpreadsheetML
>>>>      *                 (.xlsx) file.
>>>>      *
>>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>>> be
>>>> located.
>>>>      * @throws java.io.IOException Thrown if a problem occurs in the
>>>> file
>>>> system.
>>>>      * @throws
>>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>>> Thrown
>>>>      *     if invalid xml is found whilst parsing an input
>>>> SpreadsheetML
>>>> file.
>>>>      */
>>>>     public void openWorkbook(String filename) throws
>>>> FileNotFoundException,
>>>>                                            IOException,
>>>> InvalidFormatException {
>>>>         File file = null;
>>>>         FileInputStream fis = null;
>>>>         try {
>>>>             file = new File(filename);
>>>>             fis = new FileInputStream(file);
>>>>             this.workbook = WorkbookFactory.create(fis);
>>>>             this.evaluator =
>>>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>>>             this.formatter = new DataFormatter();
>>>>         }
>>>>         finally {
>>>>             if(fis != null) {
>>>>                 fis.close();
>>>>             }
>>>>         }
>>>>     }
>>>> 
>>>>     /**
>>>>      * Called to convert the contents of the currently opened workbook
>>>> into
>>>>      * a CSV file.
>>>>      */
>>>>     public void convertToCSV() {
>>>>         Sheet sheet = null;
>>>>         Row row = null;
>>>>         int lastRowNum = 0;
>>>>         this.csvData = new ArrayList<ArrayList>();
>>>> 
>>>>         // Discover how many sheets there are in the workbook....
>>>>         int numSheets = this.workbook.getNumberOfSheets();
>>>> 
>>>>         // and then iterate through them.
>>>>         for(int i = 0; i < numSheets; i++) {
>>>> 
>>>>             // Get a reference to a sheet and check to see if it
>>>> contains
>>>>             // any rows.
>>>>             sheet = this.workbook.getSheetAt(i);
>>>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>>>> 
>>>>                 // Note down the index number of the bottom-most row
>>>> and
>>>>                 // then iterate through all of the rows on the sheet
>>>> starting
>>>>                 // from the very first row - number 1 - even if it is
>>>> missing.
>>>>                 // Recover a reference to the row and then call another
>>>> method
>>>>                 // which will strip the data from the cells and build
>>>> lines
>>>>                 // for inclusion in the resylting CSV file.
>>>>                 lastRowNum = sheet.getLastRowNum();
>>>>                 for(int j = 0; j <= lastRowNum; j++) {
>>>>                     row = sheet.getRow(j);
>>>>                     this.rowToCSV(row);
>>>>                 }
>>>>             }
>>>>         }
>>>>     }
>>>> 
>>>>     /**
>>>>      * Called to actually save the data recovered from the Excel
>>>> workbook
>>>>      * as a CSV file.
>>>>      *
>>>>      * @param filename An instance of the String class that
>>>> encapsulates
>>>> the
>>>>      *                path to and name of the CSV file.
>>>>      * @param separator An instance of the String class that
>>>> encapsulates
>>>> the
>>>>      *                  character or character that ought to be used to
>>>> delimit
>>>>      *                  elements on the lines of the CSV file.
>>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>>> be
>>>> found.
>>>>      * @throws java.io.IOException Thrown to indicate and error
>>>> occurred
>>>> in
>>>> the
>>>>      *                             underylying file system.
>>>>      */
>>>>     public void saveCSVFile(String filename, String separator)
>>>>                                                    throws
>>>> FileNotFoundException,
>>>>                                                           IOException {
>>>>         File file = null;
>>>>         FileWriter fw = null;
>>>>         BufferedWriter bw = null;
>>>>         ArrayList<String> line = null;
>>>>         StringBuffer buffer = null;
>>>>         String csvLineElement = null;
>>>>         try {
>>>>             // Open a writer onto the CSV file.
>>>>             file = new File(filename);
>>>>             fw = new FileWriter(file);
>>>>             bw = new BufferedWriter(fw);
>>>> 
>>>>             // Step through the elements of the ArrayList that was used
>>>> to
>>>> hold
>>>>             // all of the data recovered from the Excel workbooks'
>>>> sheets,
>>>> rows
>>>>             // and cells.
>>>>             for(int i = 0; i < this.csvData.size(); i++) {
>>>>                 buffer = new StringBuffer();
>>>> 
>>>>                 // Get an element from the ArrayList that contains the
>>>> data
>>>> for
>>>>                 // the workbook. This element will itself be an
>>>> ArrayList
>>>>                 // containing Strings and each String will hold the
>>>> data
>>>> recovered
>>>>                 // from a single cell. The for() loop is used to
>>>> recover
>>>> elements
>>>>                 // from this 'row' ArrayList one at a time and to write
>>>> the
>>>> Strings
>>>>                 // away to a StringBuffer thus assembling a single line
>>>> for
>>>> inclusion
>>>>                 // in the CSV file. If a row was empty or if it was
>>>> short,
>>>> then
>>>>                 // the ArrayList that contains it's data will also be
>>>> shorter than
>>>>                 // some of the others. Therefore, it is necessary to
>>>> check
>>>> within
>>>>                 // the for loop to ensure that the ArrayList contains
>>>> data
>>>> to be
>>>>                 // processed. If it does, then an element will be
>>>> recovered
>>>> and
>>>>                 // appended to the StringBuffer.
>>>>                 line = this.csvData.get(i);
>>>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>>>                     if(line.size() > j) {
>>>>                         csvLineElement = line.get(j);
>>>>                         if(csvLineElement != null) {
>>>>                             buffer.append(csvLineElement);
>>>>                         }
>>>>                     }
>>>>                     if(j < (this.maxRowWidth - 1)) {
>>>>                         buffer.append(separator);
>>>>                     }
>>>>                 }
>>>> 
>>>>                 // Once the line is built, write it away to the CSV
>>>> file.
>>>>                 bw.write(buffer.toString().trim());
>>>> 
>>>>                 // Condition the inclusion of new line characters so as
>>>> to
>>>>                 // avoid an additional, superfluous, new line at the
>>>> end
>>>> of
>>>>                 // the file.
>>>>                 if(i < (this.csvData.size() - 1)) {
>>>>                     bw.newLine();
>>>>                 }
>>>>             }
>>>>         }
>>>>         finally {
>>>>             if(bw != null) {
>>>>                 bw.flush();
>>>>                 bw.close();
>>>>             }
>>>>         }
>>>>     }
>>>> 
>>>>     /**
>>>>      * Called to convert a row of cells into a line of data that can
>>>> later
>>>> be
>>>>      * output to the CSV file.
>>>>      *
>>>>      * Note that no tests have yet been conducted with blank cells or
>>>> those
>>>>      * containing formulae. Such may require latereations to the way
>>>> this
>>>> code
>>>>      * works.
>>>>      *
>>>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>>>> that
>>>>      *            encapsulates information about a row of cells
>>>> recovered
>>>> from
>>>>      *            an Excel workbook.
>>>>      */
>>>>     private void rowToCSV(Row row) {
>>>>         Cell cell = null;
>>>>         int lastCellNum = 0;
>>>>         ArrayList<String> csvLine = new ArrayList<String>();
>>>> 
>>>>         // Check to ensure that a row was recovered from the sheet as
>>>> it
>>>> is
>>>>         // possible that one or more rows between other populated rows
>>>> could
>>>> be
>>>>         // missing - blank. If the row does contain cells then...
>>>>         if(row != null) {
>>>> 
>>>>             // Get the index for the right most cell on the row and
>>>> then
>>>>             // step along the row from left to right recovering the
>>>> contents
>>>>             // of each cell, converting that into a formatted String
>>>> and
>>>>             // then storing the String into the csvLine ArrayList.
>>>>             lastCellNum = row.getLastCellNum();
>>>>             for(int i = 0; i <= lastCellNum; i++) {
>>>>                 cell = row.getCell(i);
>>>>                 if(cell == null) {
>>>>                     csvLine.add("");
>>>>                 }
>>>>                 else {
>>>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>>>                        
>>>> csvLine.add(this.formatter.formatCellValue(cell));
>>>>                     }
>>>>                     else {
>>>>                        
>>>> csvLine.add(this.formatter.formatCellValue(cell,
>>>> this.evaluator));
>>>>                     }
>>>>                 }
>>>>             }
>>>>             // Make a note of the index number of the right most cell.
>>>> This
>>>> value
>>>>             // will later be used to ensure that the matrix of data in
>>>> the
>>>> CSV file
>>>>             // is square.
>>>>             if(lastCellNum > this.maxRowWidth) {
>>>>                 this.maxRowWidth = lastCellNum;
>>>>             }
>>>>         }
>>>>         this.csvData.add(csvLine);
>>>>     }
>>>> 
>>>>     /**
>>>>      * The main() method contains code that demonstrates how to use the
>>>> class.
>>>>      * @param args
>>>>      */
>>>>     public static void main(String[] args) {
>>>>         try {
>>>>             ToCSV converter = new ToCSV();
>>>>             converter.openWorkbook("C:/temp/To CSV.xls");
>>>>             converter.convertToCSV();
>>>>             converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>>>         }
>>>>         catch(Exception ex) {
>>>>             System.out.println("Caught an: " +
>>>> ex.getClass().getName());
>>>>             System.out.println("Message: " + ex.getMessage());
>>>>             System.out.println("Stacktrace follows:.....");
>>>>             ex.printStackTrace(System.out);
>>>>         }
>>>> 
>>>>     }
>>>> }
>>>> 
>>>> Test it out, have a good look through it and if there is anything you
>>>> want
>>>> to know just post to the list.
>>>> 
>>>> Yours
>>>> 
>>>> Mark B
>>>> 
>>>> 
>>>> Luke_Devon wrote:
>>>>> 
>>>>> Hi Mark,
>>>>> 
>>>>> First of all I would like to thank you for the reply. 
>>>>> 
>>>>> Actually , currently I am using MS Office 2002. But I would like to
>>>>> use
>>>>> the code for other latest versions also.
>>>>> In my case , I dont want to do any validations for the EXCEL file ,
>>>>> because I just wanted convert entire excel file into CSV.
>>>>> Is there any simple java code available for such a basic requirement ?
>>>>> 
>>>>> anyway I'll try to use the code in the link which you have given to me
>>>>> .
>>>>> 
>>>>> Thanks & Regards
>>>>> 
>>>>> Luke.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ________________________________
>>>>> From: MSB <ma...@tiscali.co.uk>
>>>>> To: user@poi.apache.org
>>>>> Sent: Thursday, April 8, 2010 23:47:07
>>>>> Subject: Re: Convert XLS into CSV
>>>>> 
>>>>> 
>>>>> Hello Luke,
>>>>> 
>>>>> Which version of the Excel file format are you targetting, the older
>>>>> binary
>>>>> or newer xml based version? I ask because Nick wrote and contributed
>>>>> some
>>>>> code that can be used to convert the older binary files into csv. It
>>>>> uses
>>>>> the eventmodel and will seem quite complex on first acquaintance but
>>>>> here
>>>>> it
>>>>> is;
>>>>> 
>>>>> http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
>>>>> 
>>>>> I know that other users have modified the code to, for example, output
>>>>> diffenert worksheets to separate csv files.
>>>>> 
>>>>> Yours
>>>>> 
>>>>> Mark B
>>>>> 
>>>>> 
>>>>> Luke_Devon wrote:
>>>>>> 
>>>>>> Hi 
>>>>>> 
>>>>>> I wanted to convert some XLS files into CSV. I found that apache.poi
>>>>>> is
>>>>>> the most perfect tool. Since I'm a beginner , i have no idea how to
>>>>>> do
>>>>>> that . Can some body help me please ? Do you have sample code for
>>>>>> convert
>>>>>> xls into csv ?
>>>>>> 
>>>>>> Thanks in Advance
>>>>>> Luke
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>       Get your preferred Email name!
>>>>>> Now you can @ymail.com and @rocketmail.com. 
>>>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>>>> 
>>>>> 
>>>>> -- 
>>>>> View this message in context:
>>>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28180503.html
>>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>> 
>>>>> 
>>>>>       New Email names for you! 
>>>>> Get the Email name you've always wanted on the new @ymail and
>>>>> @rocketmail. 
>>>>> Hurry before someone else does!
>>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28191046.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>>       Get your new Email address!
>>>> Grab the Email name you've always wanted before someone else does!
>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28230701.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28333557.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28373635.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28375257.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Convert XLS into CSV - Removing a Work Sheet

Posted by MSB <ma...@tiscali.co.uk>.
Hello again Luke,

Just a quick visit - am enjoying that great British tradition, the
tea-break, and taking the chance to look at a few messages.

Knowing that the word is just in column H is a great help as it will make a
life a lot easier for us. Column H will be referenced - using POI - by the
index 7. Thus, I am currently thinking along the following lines;

To create an ArrayList of type Integer.
Read the worksheet line by line and get cell index 7.
If it's type is String and if it contains the word 'Sum' then add the index
number of the row to the ArrayList.
Once the sheet has been fully processed, start at the end of the ArrayList,
recover each row index in turn and delete it from the sheet. Working from
the bottom-up in this manner is slightly easier than working from the bottom
down.

Tonight, I will be able to put some code together to do this but why not
have a go for yourself? It can be esailly combined with the code you are
using to remove the Summary sheet - all you should need to do is process
those sheets that are not called Summary of course.

Then, once we have this bit working, the only hurdle is copying the missing
data. The command line is not the only way to get information into the
application, we can use Properties or even a file that the application can
parse; obviously, this will increase the complexity but add to the
flexibility. So I am thinking along the lines of having a simple text file
that contains lines like;

C10! C11:C20
D34! D67:D78
or
Linux! C10:C20

and is able to provide the instruction to the application. To change it's
behaviour, all you need to do is change the 'instructions' in the file.

Anyway, must go now, we have a fallen Willow tree to clear, a broken fence
to mend, a dipping platform to construct and there is a nice piece of
chocolate cake with my name on it.

Yours

Mark B


Luke_Devon wrote:
> 
> Hi Mark,
> 
> 1.    Value “Sum” always resides on Column H and there is no specific
> /constant space for number of rows between each occurrence of the line
> including that word “Sum”. That’s meant, first “Sum” stored in H, 11.
> Second “Sum” stored in H, 14, Third “Sum” stored in H, 26 ………….so on. Row
> numbers are not constant but Column is constant. That is H.  
> 
> 2.    Finally I understood that we don’t need to remove “comma” from the
> excel sheet since function of XLS convert into CSV is taking care of the
> matter. ( Thanks for directing me to the point )
> 
> 3.    Major part is copying other missing data in the columns. Once every
> thing developed, I don’t need to involve with enter values manually. Main
> idea is handover the tasks to a CRON job.
> 
> Then I have a doubt, how to enter -C "C10! C11:C20 D34! D67:D78" or -C
> "Linux! C10:C20" since program is running by a CRON job
> 
> Please help me to go further .
> 
> Thanks in advance
> Luke
> 
> 
> ________________________________
> From: MSB <ma...@tiscali.co.uk>
> To: user@poi.apache.org
> Sent: Tuesday, April 27, 2010 15:08:36
> Subject: Re: Convert XLS into CSV - Removing a Work Sheet
> 
> 
> Glad to hear that Luke, now onto the other tasks.
> 
> Firstly, can I ask you a question about the first task - removing a row if
> the word Sum appears on it - please? Does Sum always appear in a
> particular
> column? Failing that, are there a set number of rows between each
> occurrence
> of the line including that word? If so, this will speed up the entire
> process somewhat, if not, then we are faced with a process where we need
> to
> examine every cell on each row in the worksheet and this could be quite a
> time consuming process. If it is possible to identify a pattern, even a
> quite crude one that identifies that the word will only appear in columns
> D,
> G or H, then that will help.
> 
> Secondly, the commas in numeric values, this could pressent a bit of a
> problem. As you may know, styles are used to control how a value is
> displayed in a cell and it could be that the person who originally
> constructed the worksheet entered a value such as 12345 into the cell and
> then applied a style so that it appears to contain 12,345.00. The best way
> to resolve this issue is, in many ways, to simply remove the cells
> formatting - but there is yet another issue, what if the cell was not
> numeric at all and the user entered a String of text and then right
> aligned
> it? So, can you look at the worksheet using Excel, find a few of these
> cells
> and see what type they are and what format has been applied to them please
> because, without the file at hand, it is hard to tell you how to proceed
> on
> this issue. Finally, do you need to modify these cells in the workbook as
> it
> would be trivial task to remove the comma when writing the value into the
> CSV file.
> 
> Thirdly, the copying cell contents code, is it possible to say, in
> advance,
> copy the contents of cell B16 into the range B17 to B25 for example? If
> so,
> then it would be quite a straightforward task to automate the process
> completely and simply pass in a parameter format in a way the code would
> recognise. For example, and assuming running the code from the command
> line,
> we could have -C "C10!C11:C20 D34!D67:D78" or -C "Linux!C10:C20" where the
> former would take the contents of cell C10 and copy it into cells C11 to
> C20
> and the latter would write the String Linux into cell C11 to C20.
> 
> Yours
> 
> Mark B
> 
> 
> 
> Luke_Devon wrote:
>> 
>> Hi Mark
>> 
>> I modified the code to removing the Summary sheet from my Excel workbook.
>> It is working. Thanks for your guide and the support.( Herewith I have
>> attached the code )
>> 
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileOutputStream;
>> import org.apache.poi.ss.usermodel.Workbook;
>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>> 
>> public class SheetUtil {
>> 
>>    public static void main(String a[]){
>>        new SheetUtil().deleteSheet();
>>    }
>> 
>> 
>>    public void deleteSheet(){
>>        int sheetToDelete = -1;
>> 
>>        String path = "Book1.xls";
>>        FileInputStream fis  = null;
>>        Workbook workbook = null;
>>        File file = null;
>>        FileOutputStream fos = null;
>>        try{
>> 
>>            file = new File(path);
>> 
>>            fis = new FileInputStream(file);
>>            workbook = WorkbookFactory.create(fis);
>> 
>> 
>> 
>>            int numSheets = workbook.getNumberOfSheets();
>> 
>>            for(int i = 0; i < numSheets; i++) {
>>                org.apache.poi.ss.usermodel.Sheet sheet =
>> workbook.getSheetAt(i);
>>                System.out.println(sheet.getSheetName());
>>                if(sheet.getSheetName().equalsIgnoreCase("Summary")) {
>>                    sheetToDelete = i;
>>                    break;
>> 
>>                }
>>            }
>> 
>> 
>>            if(sheetToDelete > -1) {
>>                workbook.removeSheetAt(sheetToDelete);
>>            }
>> 
>>            fos = new FileOutputStream(file);
>> 
>>            workbook.write(fos);
>> 
>>        }catch(Exception e){
>>            e.printStackTrace();
>>        }finally{
>>            try{
>>                if(fis != null){
>>                    fis.close();
>>                    fis = null;
>> 
>>                }
>>                if(fos != null){
>>                    fos.close();
>>                    fos = null;
>> 
>>                }
>>            }catch(Exception ee){
>>                ee.printStackTrace();
>> 
>>            }
>>        }
>>    }
>> }
>> 
>> =====================================================
>> 
>> I have few more modifications , to be applied for the workbook. Can you
>> please help me further?
>> 
>> I will list down you my other requirements.
>> 
>> After I removed the "Summary" sheet , there will be only one Sheet will
>> be
>> remain which is "Detail". 
>> 
>> In the Detail Sheet,
>> 
>> 1. In the Sheet , there is String called "Sum". This value also will be
>> there more than once.
>> 
>> Once that value found , needed to be delete entire row respectively. Eg:
>> If "Sum" is in , Cell H11, 11th row needed to be removed from the
>> worksheet.
>> 
>> 2. Some Cells are containing values with "1000 separator (,)" -- Eg:
>> 1,256.00 . I wanted to remove all thousand separators from the values
>> which are exist in the Detail excel sheet. Eg: 1256.00 --- after removing
>> the "," value should be like this.(1256.00)
>> 
>> 3. In B16 , There is a String called "Linux". This value i wanted to copy
>> ( repeat ) until B25.Thats meant This value must be repeat from B17 to
>> B25.( This is just like we copy a value along the column for more than
>> one
>> row )
>> 
>> Like that , in this worksheet , there will be lot of values to be copied.
>> Hence I think we can not do it dynamically and we needed hard code those
>> values in the code .
>> 
>> Thanks in advance
>> Luke
>> 
>> 
>> ________________________________
>> From: MSB <ma...@tiscali.co.uk>
>> To: user@poi.apache.org
>> Sent: Friday, April 23, 2010 3:27:38
>> Subject: Re: Convert XLS into CSV
>> 
>> 
>> Luke
>> 
>> As you have only limited experience with Java, I have uploaded a
>> distribution that you should be able to
>> run from the command line. I know some companies restrict access to file
>> sharing sites bu if you can use Rapidshare from your location, this is
>> the
>> link to the archive;
>> 
>> http://rapidshare.com/files/378927105/CSV_Conversion_Code.rar
>> 
>> and if you have any problems downloading the file, just let me know. It
>> is
>> quite large as it contains all of
>> the POI archives that you will need to run the code but I am confident
>> there
>> is another way I could send it
>> directly to you if Rapidshare is inaccessible.
>> 
>> If you are unfamiliar with Rapidshare, all you need to do is follow the
>> link, choose the free download option
>> and download the file to your machine. Once you have downloaded the
>> archive
>> called 
>> CSV Conversion Code.rar, unzip it and you should have a folder with the
>> name
>> CSV Conversion Code. This
>> folder contains a further folder called dist and a file by the name of
>> Read
>> Me.rtf.
>> Open this rtf file (Word or WordPad will handle it successfully) and it
>> will
>> tell you what to do next.
>> 
>> Any problems, just send me a message.
>> 
>> Yours
>> 
>> Mark B
>> 
>> PS How did you get on with removing the Summary sheet from your Excel
>> workbook?
>> 
>> 
>> Luke_Devon wrote:
>>> 
>>> Hi Mark
>>> 
>>> First of all I would like to thank you for all your explanation which
>>> you
>>> have explained very well regarding the code. It was really helpful to
>>> understand the flow of the code and I think its nice to have such
>>> audience
>>> and explanation from an expert / one of the best,  in the industry for a
>>> beginner like me. 
>>> 
>>> I was executing your final Code to convert XLS into CSV. Its compiled,
>>> but
>>> there was two messages , 
>>> uses unchecked or unsafe operations.
>>> Note: Recompile with -Xlint:unchecked for details.
>>> 
>>> I did google , for those messages and there was some directives , but as
>>> of some forum says , it wont affect to the execution.
>>> 
>>> Finally , when I tried to execute it , 
>>> 
>>> java ToCSV d:\excel  d:\csv , 
>>> 
>>> [java] Usage: java ToCSV "Source Folder" "Destination Folder" "CSV
>>> Element
>>> Separator"
>>> 
>>> BUILD FAILED
>>> Target "d:\excel" does not exist in the project "Inbox".
>>> 
>>> I tried to go through the code  , if there is any thing that i can
>>> modified to solve the problem.but no luck.
>>> 
>>> Thanks in advance
>>> Luke
>>> 
>>> 
>>> ________________________________
>>> From: MSB <ma...@tiscali.co.uk>
>>> To: user@poi.apache.org
>>> Sent: Tuesday, April 13, 2010 22:04:55
>>> Subject: Re: Convert XLS into CSV
>>> 
>>> 
>>> The code I have pasted into this message has been modified so that it is
>>> possible to specify the path to and name of a folder containing one or
>>> more
>>> Excel workbooks and the code will now iterate through each of the
>>> workbooks
>>> in the folder and convert them into separate CSV files. The name of the
>>> resulting CSV file will in each case be derived by taking the name of
>>> the
>>> source Excel workbook, removing the .xls or xlsx extension and suffixing
>>> the
>>> .csv extension. Before looking at the code in any detail - most of it is
>>> identical by the way - I need to explain how you can call and use this
>>> class. Forgive me if you know any of this but I thought it safer to
>>> assume
>>> you might not be totally clear on the details.
>>> 
>>> When you run a Java class with a command like this;
>>> 
>>> java ToCSV
>>> 
>>> the runtime environment will locate the .class file, load and compile
>>> the
>>> bytecode and then look inside the class for the main() method which, if
>>> that
>>> method is present and it contains any code at all, is where execution of
>>> the
>>> program will begin. If you look at the signature for the main() method,
>>> you
>>> will see that it looks something like this;
>>> 
>>> public static void main(String[] args)
>>> 
>>> ignoring the public, static and void keywords for the moment, take a
>>> look
>>> at
>>> what the brackets contain - String[] args. This is the declaration for a
>>> variable called args which is able to hold a reference to an array where
>>> each element of that array is an instance the java.lang.String class.
>>> This
>>> array will hold any parameters that were specified on the command line
>>> when
>>> the class was run. So, to modify the example above slightly, this;
>>> 
>>> java ToCSV C:/temp C:/temp/CSV
>>> 
>>> would result in a reference to an an array holding two Strings being
>>> passed
>>> to the args parameter; the first element would hold "C:/temp" and the
>>> second
>>> C:/temp/CSV. Well, more accurately, the elements would hold references
>>> to
>>> instances of the String class that encapsulated the values "C:/temp" and
>>> C:/temp/CSV.
>>> 
>>> The contents of the args parameter are accessible to any code within the
>>> args method and so it is possible to extract the contents of the two
>>> elements by simply referencing each using Java's standard notation; the
>>> first element in the args array would be args[0] and the second args[1].
>>> 
>>> With this information, you can see how to run the ToCSV class from the
>>> command line. All that you need do is specify the path to and name of
>>> either
>>> an individual Excel workbook or of a folder that contains Excel
>>> workbook,
>>> the path to and name of the folder that should contain the CSV files
>>> and,
>>> optionally, the character or string of characters that will separate
>>> each
>>> element on the line within the CSV file. The class can be invoked from
>>> the
>>> command line in this manner;
>>> 
>>> java ToCSV "C:/temp/Excel Workbooks" "C:/temp/CSV Files"
>>> 
>>> or
>>> 
>>> java ToCSV C:/temp/Excel C:/temp/CSV ,
>>> 
>>> Note that the speach marks enclosing the paths in the first example are
>>> only
>>> necessary if there are embedded spaces within the parameter.
>>> 
>>> If you now take a look at the code in the main() method of the ToCSV
>>> file,
>>> you can see that it simply recovers the values from the args array and
>>> uses
>>> them to call the convertExcelToCSV() method on an instance of the ToCSV
>>> class.
>>> 
>>> Equally, you can call this class from code within another Java class. If
>>> you
>>> choose to do this, it is an even simpler process; simply create an
>>> instance
>>> of the ToCSV class and then call either of the overloaded
>>> convertExcelToCSV() methods. One accepts two parameters, the name of and
>>> path to the source Excel workbook(s) and the name of and path to the
>>> folder
>>> the CSV files ought to be written to. If you call this method, it will
>>> be
>>> assumed that the separtor is a comma. The other method accepts a third
>>> parameter allowing you to define what the separator should be, so you
>>> could
>>> pass a colon for example. 
>>> 
>>> The process is a little like this;
>>> 
>>> ToCSV converter = new ToCSV();
>>> converter.convertExcelToCSV("C:/temp/Excel Workbooks", "C:/temp/CSV
>>> Files",
>>> "!");
>>> 
>>> The folder that contains the workbooks can hold other files as well, you
>>> do
>>> not have to worry about copying the Excel workbooks to a special file
>>> for
>>> processing. The ToCSV class is able to identify and select just Excel
>>> workbook files that have extensions of '.xls' or '.xlsx' and it does
>>> this
>>> by
>>> using an instance of an inner class called ExcelFilenameFilter that you
>>> can
>>> see defined at the bottom of the source code, immediately following the
>>> main() method. As you can see, it contains a single method - accept() -
>>> that
>>> contains a single line of code. This line of code will return the
>>> boolean
>>> value true if the name of a file ends with either '.xls' or '.xlsx' and
>>> an
>>> instance of ExcelFilenameFilter can, therefore, be used, when the
>>> contents
>>> of the folder are read, to ensure that only Excel workbooks are
>>> processed.
>>> I
>>> often use inner classes like this if they are very small and simple or
>>> if
>>> they only make sense in the context of the enclosing class. By this, I
>>> mean
>>> asking yourslef if it is likely any other class may need similar
>>> functionality.
>>> 
>>> Anayway, here is the code. Remember that it is only one possible
>>> solution
>>> and you may need to modify how it works to suit your specific
>>> requirement.
>>> You may not be worried, for example, if all of the rows in the CSV file
>>> contain different numbers of elements. If this is the case, the contents
>>> of
>>> the cells could be written out immediately after they were read and you
>>> could completely remove the additional step of storing them in
>>> ArrayLists
>>> to
>>> ensure the matrix is square.
>>> 
>>> package postprocessxlsx;
>>> 
>>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>>> import org.apache.poi.ss.usermodel.Workbook;
>>> import org.apache.poi.ss.usermodel.Sheet;
>>> import org.apache.poi.ss.usermodel.Row;
>>> import org.apache.poi.ss.usermodel.Cell;
>>> import org.apache.poi.ss.usermodel.DataFormatter;
>>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>>> 
>>> import java.io.File;
>>> import java.io.FileInputStream;
>>> import java.io.FileWriter;
>>> import java.io.BufferedWriter;
>>> import java.io.FilenameFilter;
>>> import java.io.IOException;
>>> import java.io.FileNotFoundException;
>>> import java.util.ArrayList;
>>> 
>>> /**
>>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>>> This
>>> * class makes the following assumptions;
>>> *
>>> * 1. Where the Excel workbook contains more that one worksheet then a
>>> single
>>> *    CSV file will contain the data from all of the worksheets.
>>> * 2. The data matrix contained in the CSV file will be square. This
>>> means
>>> that
>>> *    the number of elements in each row of the CSV file will match the
>>> number
>>> *    of cells in the longest row found in the Excel workbook. Any short
>>> rows
>>> *    will be 'padded' with empty elements - an empty elements is
>>> represented in
>>> *    the CSV file in this way ,,.
>>> * 3. Empty elements will represent missing cells.
>>> * 4. A row consisting of empty elements will be used to represent an
>>> empty
>>> row
>>> *    in the Excel workbook.
>>> *
>>> * @author Mark B
>>> * @version 1.10 12th April 2010
>>> */
>>> public class ToCSV {
>>> 
>>>     private Workbook workbook = null;
>>>     private ArrayList<ArrayList> csvData = null;
>>>     private int maxRowWidth = 0;
>>>     private DataFormatter formatter = null;
>>>     private FormulaEvaluator evaluator = null;
>>>     private String separtor = null;
>>> 
>>>     public static final String CSV_FILE_EXTENSION = ".csv";
>>>     public static final String DEFAULT_SEPARATOR = ",";
>>> 
>>>     /**
>>>      * Process the contents of a folder, convert the contents of each
>>> Excel
>>>      * workbook into CSV format and save the resulting file to the
>>> specified
>>>      * folder using the same name as the original workbook with the .xls
>>> or
>>>      * .xlsx extension replaced by .csv
>>>      *
>>>      * @param source An instance of the String class that encapsulates
>>> the
>>>      *        name of and path to either a folder containing those Excel
>>>      *        workbook(s) or the name of and path to an individual Excel
>>> workbook
>>>      *        that is/are to be converted.
>>>      * @param destination An instance of the String class encapsulating
>>> the
>>> name
>>>      *        of and path to a folder that will contain the resulting
>>> CSV
>>> files.
>>>      * @throws java.io.FileNotFoundException Thrown if any file cannot
>>> be
>>> located
>>>      *         on the filesystem during processing.
>>>      * @throws java.io.IOException Thrown if the filesystem encounters
>>> any
>>>      *         problems during processing.
>>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>>> passed
>>>      *         to the source parameters refers to a file or folder that
>>> does
>>> not
>>>      *         exist and if the value passed to the destination
>>> paramater
>>> refers
>>>      *         to a folder that does not exist or simply does not refer
>>> to
>>> a
>>>      *         folder that does not exist.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *         if the xml markup encounetered whilst parsing a
>>> SpreadsheetML
>>>      *         file (.xlsx) is invalid.
>>>      */
>>>     public void convertExcelToCSV(String strSource, String
>>> strDestination)
>>>                        throws FileNotFoundException, IOException,
>>>                               IllegalArgumentException,
>>> InvalidFormatException {
>>> 
>>>         // Simply chain the call to the overloaded
>>> convertExcelToCSV(String,
>>>         // String, String) method and pass the default separator to
>>> ensure
>>> this
>>>         // String is used to separate elemnts on the line in the CSV
>>> file.
>>>         this.convertExcelToCSV(strSource, strDestination,
>>> ToCSV.DEFAULT_SEPARATOR);
>>>     }
>>> 
>>>     /**
>>>      * Process the contents of a folder, convert the contents of each
>>> Excel
>>>      * workbook into CSV format and save the resulting file to the
>>> specified
>>>      * folder using the same name as the original workbook with the .xls
>>> or
>>>      * .xlsx extension replaced by .csv
>>>      *
>>>      * @param source An instance of the String class that encapsulates
>>> the
>>>      *        name of and path to either a folder containing those Excel
>>>      *        workbook(s) or the name of and path to an individual Excel
>>> workbook
>>>      *        that is/are to be converted.
>>>      * @param destination An instance of the String class encapsulating
>>> the
>>> name
>>>      *        of and path to a folder that will contain the resulting
>>> CSV
>>> files.
>>>      * @param separator An instance of the String class encapsulating
>>> the
>>>      *        characters or characters that should be used to separate
>>> items
>>>      *        on a line within the CSV file.
>>>      * @throws java.io.FileNotFoundException Thrown if any file cannot
>>> be
>>> located
>>>      *         on the filesystem during processing.
>>>      * @throws java.io.IOException Thrown if the filesystem encounters
>>> any
>>>      *         problems during processing.
>>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>>> passed
>>>      *         to the source parameters refers to a file or folder that
>>> does
>>> not
>>>      *         exist and if the value passed to the destination
>>> paramater
>>> refers
>>>      *         to a folder that does not exist or simply does not refer
>>> to
>>> a
>>>      *         folder that does not exist.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *         if the xml markup encounetered whilst parsing a
>>> SpreadsheetML
>>>      *         file (.xlsx) is invalid.
>>>      */
>>>     public void convertExcelToCSV(String strSource,
>>>                                   String strDestination, String
>>> separator)
>>>                        throws FileNotFoundException, IOException,
>>>                               IllegalArgumentException,
>>> InvalidFormatException {
>>>         File source = new File(strSource);
>>>         File destination = new File(strDestination);
>>>         File[] filesList = null;
>>>         String destinationFilename = null;
>>> 
>>>         // Check that the source file/folder exists.
>>>         if(!source.exists()) {
>>>             throw new IllegalArgumentException("The source for the Excel
>>> "
>>> +
>>>                     "file(s) cannot be found.");
>>>         }
>>> 
>>>         // Ensure thaat the folder the user has chosen to save the CSV
>>> files
>>>         // away into firstly exists and secondly is a folder rather
>>> than,
>>> for
>>>         // instance, a data file.
>>>         if(!destination.exists()) {
>>>             throw new IllegalArgumentException("The folder/directory for
>>> the
>>> " +
>>>                     "converted CSV file(s) does not exist.");
>>>         }
>>>         if(!destination.isDirectory()) {
>>>             throw new IllegalArgumentException("The destination for the
>>> CSV
>>> " +
>>>                     "file(s) is not a directory/folder.");
>>>         }
>>> 
>>>         // Check to see if the sourceFolder variable holds a reference
>>> to
>>>         // a file or a folder full of files.
>>>         if(source.isDirectory()) {
>>>             // Get a list of all of the Excel spreadsheet files
>>> (workbooks)
>>> in
>>>             // the source folder/directory
>>>             filesList = source.listFiles(new ExcelFilenameFilter());
>>>         }
>>>         else {
>>>             // Assume that it must be a file handle - although there are
>>> other
>>>             // options the code should perhaps check - and store the
>>> reference
>>>             // into the filesList variable.
>>>             filesList = new File[]{source};
>>>         }
>>> 
>>>         // Step through each of the files in the source folder and for
>>> each
>>>         // open the workbook, convert it's contents to CSV format and
>>> then
>>>         // save the resulting file away into the folder specified by the
>>>         // contents of the destination variable. Note that the name of
>>> the
>>>         // csv file will be created by taking the name of the Excel
>>> file,
>>>         // removing the extension and replacing it with .csv. Note that
>>> there
>>>         // is one drawback with this approach; if the folder holding the
>>> files
>>>         // contains two workbooks whose names match but one is a binary
>>> file
>>>         // (.xls) and the other a SpreadsheetML file (.xlsx), then the
>>> names
>>>         // for both CSV files will be identical and one CSV file will,
>>>         // therefore, over-write the other.
>>>         for(File excelFile : filesList) {
>>>             // Open the workbook
>>>             this.openWorkbook(excelFile);
>>> 
>>>             // Convert it's contents into a CSV file
>>>             this.convertToCSV();
>>> 
>>>             // Build the name of the csv folder from that of the Excel
>>> workbook.
>>>             // Simply replace the .xls or .xlsx file extension with .csv
>>>             destinationFilename = excelFile.getName();
>>>             destinationFilename = destinationFilename.substring(
>>>                     0, destinationFilename.lastIndexOf(".")) + ".csv";
>>> 
>>>             // Save the CSV file away using the newly constricted file
>>> name
>>>             // and to the specified directory.
>>>             this.saveCSVFile(new File(destination, destinationFilename),
>>> separator);
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Open an Excel workbook ready for conversion.
>>>      *
>>>      * @param file An instance of the File class that encapsulates a
>>> handle
>>>      *        to a valid Excel workbook. Note that the workbook can be
>>> in
>>>      *        either binary (.xls) or SpreadsheetML (.xlsx) format.
>>>      *
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> located.
>>>      * @throws java.io.IOException Thrown if a problem occurs in the
>>> file
>>> system.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *         if invalid xml is found whilst parsing an input
>>> SpreadsheetML
>>>      *         file.
>>>      */
>>>     private void openWorkbook(File file) throws FileNotFoundException,
>>>                                            IOException,
>>> InvalidFormatException {
>>>         FileInputStream fis = null;
>>>         try {
>>>             System.out.println("Opening workbook [" + file.getName() +
>>> "]");
>>> 
>>>             fis = new FileInputStream(file);
>>> 
>>>             // Open the workbook and then create the FormulaEvaluator
>>> and
>>>             // DataFormatter instances that will be needed to,
>>> respectively,
>>>             // force evaluation of forumlae found in cells and create a
>>>             // formatted String encapsulating the cells contents.
>>>             this.workbook = WorkbookFactory.create(fis);
>>>             this.evaluator =
>>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>>             this.formatter = new DataFormatter();
>>>         }
>>>         finally {
>>>             if(fis != null) {
>>>                 fis.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert the contents of the currently opened workbook
>>> into
>>>      * a CSV file.
>>>      */
>>>     private void convertToCSV() {
>>>         Sheet sheet = null;
>>>         Row row = null;
>>>         int lastRowNum = 0;
>>>         this.csvData = new ArrayList<ArrayList>();
>>> 
>>>         System.out.println("Converting files contents to CSV format.");
>>> 
>>>         // Discover how many sheets there are in the workbook....
>>>         int numSheets = this.workbook.getNumberOfSheets();
>>> 
>>>         // and then iterate through them.
>>>         for(int i = 0; i < numSheets; i++) {
>>> 
>>>             // Get a reference to a sheet and check to see if it
>>> contains
>>>             // any rows.
>>>             sheet = this.workbook.getSheetAt(i);
>>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>>> 
>>>                 // Note down the index number of the bottom-most row and
>>>                 // then iterate through all of the rows on the sheet
>>> starting
>>>                 // from the very first row - number 1 - even if it is
>>> missing.
>>>                 // Recover a reference to the row and then call another
>>> method
>>>                 // which will strip the data from the cells and build
>>> lines
>>>                 // for inclusion in the resylting CSV file.
>>>                 lastRowNum = sheet.getLastRowNum();
>>>                 for(int j = 0; j <= lastRowNum; j++) {
>>>                     row = sheet.getRow(j);
>>>                     this.rowToCSV(row);
>>>                 }
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to actually save the data recovered from the Excel
>>> workbook
>>>      * as a CSV file.
>>>      *
>>>      * @param file An instance of the File class that encapsulates a
>>> handle
>>>      *             referring to the CSV file.
>>>      * @param separator An instance of the String class that
>>> encapsulates
>>> the
>>>      *                  character or character that ought to be used to
>>> delimit
>>>      *                  elements on the lines of the CSV file.
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> found.
>>>      * @throws java.io.IOException Thrown to indicate and error occurred
>>> in
>>> the
>>>      *                             underylying file system.
>>>      */
>>>     private void saveCSVFile(File file, String separator)
>>>                                      throws FileNotFoundException,
>>> IOException {
>>>         FileWriter fw = null;
>>>         BufferedWriter bw = null;
>>>         ArrayList<String> line = null;
>>>         StringBuffer buffer = null;
>>>         String csvLineElement = null;
>>>         try {
>>> 
>>>             System.out.println("Saving the CSV file [" + file.getName()
>>> +
>>> "]");
>>> 
>>>             // Open a writer onto the CSV file.
>>>             fw = new FileWriter(file);
>>>             bw = new BufferedWriter(fw);
>>> 
>>>             // Step through the elements of the ArrayList that was used
>>> to
>>> hold
>>>             // all of the data recovered from the Excel workbooks'
>>> sheets,
>>> rows
>>>             // and cells.
>>>             for(int i = 0; i < this.csvData.size(); i++) {
>>>                 buffer = new StringBuffer();
>>> 
>>>                 // Get an element from the ArrayList that contains the
>>> data
>>> for
>>>                 // the workbook. This element will itself be an
>>> ArrayList
>>>                 // containing Strings and each String will hold the data
>>> recovered
>>>                 // from a single cell. The for() loop is used to recover
>>> elements
>>>                 // from this 'row' ArrayList one at a time and to write
>>> the
>>> Strings
>>>                 // away to a StringBuffer thus assembling a single line
>>> for
>>> inclusion
>>>                 // in the CSV file. If a row was empty or if it was
>>> short,
>>> then
>>>                 // the ArrayList that contains it's data will also be
>>> shorter than
>>>                 // some of the others. Therefore, it is necessary to
>>> check
>>> within
>>>                 // the for loop to ensure that the ArrayList contains
>>> data
>>> to be
>>>                 // processed. If it does, then an element will be
>>> recovered
>>> and
>>>                 // appended to the StringBuffer.
>>>                 line = this.csvData.get(i);
>>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>>                     if(line.size() > j) {
>>>                         csvLineElement = line.get(j);
>>>                         if(csvLineElement != null) {
>>>                             buffer.append(csvLineElement);
>>>                         }
>>>                     }
>>>                     if(j < (this.maxRowWidth - 1)) {
>>>                         buffer.append(separator);
>>>                     }
>>>                 }
>>> 
>>>                 // Once the line is built, write it away to the CSV
>>> file.
>>>                 bw.write(buffer.toString().trim());
>>> 
>>>                 // Condition the inclusion of new line characters so as
>>> to
>>>                 // avoid an additional, superfluous, new line at the end
>>> of
>>>                 // the file.
>>>                 if(i < (this.csvData.size() - 1)) {
>>>                     bw.newLine();
>>>                 }
>>>             }
>>>         }
>>>         finally {
>>>             if(bw != null) {
>>>                 bw.flush();
>>>                 bw.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert a row of cells into a line of data that can
>>> later
>>> be
>>>      * output to the CSV file.
>>>      *
>>>      * Note that no tests have yet been conducted with blank cells or
>>> those
>>>      * containing formulae. Such may require latereations to the way
>>> this
>>> code
>>>      * works.
>>>      *
>>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>>> that
>>>      *            encapsulates information about a row of cells
>>> recovered
>>> from
>>>      *            an Excel workbook.
>>>      */
>>>     private void rowToCSV(Row row) {
>>>         Cell cell = null;
>>>         int lastCellNum = 0;
>>>         ArrayList<String> csvLine = new ArrayList<String>();
>>> 
>>>         // Check to ensure that a row was recovered from the sheet as it
>>> is
>>>         // possible that one or more rows between other populated rows
>>> could
>>> be
>>>         // missing - blank. If the row does contain cells then...
>>>         if(row != null) {
>>> 
>>>             // Get the index for the right most cell on the row and then
>>>             // step along the row from left to right recovering the
>>> contents
>>>             // of each cell, converting that into a formatted String and
>>>             // then storing the String into the csvLine ArrayList.
>>>             lastCellNum = row.getLastCellNum();
>>>             for(int i = 0; i <= lastCellNum; i++) {
>>>                 cell = row.getCell(i);
>>>                 if(cell == null) {
>>>                     csvLine.add("");
>>>                 }
>>>                 else {
>>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>>                        
>>> csvLine.add(this.formatter.formatCellValue(cell));
>>>                     }
>>>                     else {
>>>                         csvLine.add(this.formatter.formatCellValue(cell,
>>> this.evaluator));
>>>                     }
>>>                 }
>>>             }
>>>             // Make a note of the index number of the right most cell.
>>> This
>>> value
>>>             // will later be used to ensure that the matrix of data in
>>> the
>>> CSV file
>>>             // is square.
>>>             if(lastCellNum > this.maxRowWidth) {
>>>                 this.maxRowWidth = lastCellNum;
>>>             }
>>>         }
>>>         this.csvData.add(csvLine);
>>>     }
>>> 
>>>     /**
>>>      * The main() method contains code that demonstrates how to use the
>>> class.
>>>      * @param args
>>>      */
>>>     public static void main(String[] args) {
>>>         // Check the number of arguments passed to the main method.
>>> There
>>>         // must be two or three, the name of and path to either the
>>> folder
>>>         // containing the Excel files or an individula Excel workbook
>>> that
>>> is/are
>>>         // to be converted, the name of and path to the folder to which
>>> the
>>> CSV
>>>         // files should be written and then finally, optionally, the
>>> separator
>>>         // that should be used to separate individual items on the lines
>>> in
>>> the
>>>         // CSV file. Note that the names of the CSV files will be
>>> derived
>>> from
>>>         // those of the Excel file(s). Put simply the .xls or .xlsx
>>> extension
>>>         // will be replaced with .csv.
>>>         ToCSV converter = null;
>>>         try {
>>>             converter = new ToCSV();
>>>             if(args.length == 2) {
>>>                 converter.convertExcelToCSV(args[0], args[1]);
>>>             }
>>>             else if(args.length == 3){
>>>                 converter.convertExcelToCSV(args[0], args[1], args[2]);
>>>             }
>>>             else {
>>>                 System.out.println("Usage: java ToCSV \"Source Folder\"
>>> "
>>> +
>>>                         "\"Destination Folder\" \"CSV Element
>>> Separator\"");
>>>             }
>>>         }
>>>         catch(Exception ex) {
>>>             System.out.println("Caught an: " + ex.getClass().getName());
>>>             System.out.println("Message: " + ex.getMessage());
>>>             System.out.println("Stacktrace follows:.....");
>>>             ex.printStackTrace(System.out);
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * An instance of this class can be used to control the files
>>> returned
>>>      * be a call to the listFiles() method when made on an instance of
>>> the
>>>      * File class and that object refers to a folder/directory
>>>      */
>>>     class ExcelFilenameFilter implements FilenameFilter {
>>> 
>>>         /**
>>>          * Determine those files that will be returned by a call to the
>>>          * listFiles() method. In this case, the name of the file must
>>> end
>>> with
>>>          * either of the following two extension; '.xls' or '.xlsx'
>>>          * @param file An instance of the File class that encapsulates a
>>> handle
>>>          *             referring to the folder/directory that contains
>>> the
>>> file.
>>>          * @param name An instance of the String class that encapsulates
>>> the
>>>          *             name of the file.
>>>          * @return A boolean value that indicates whether the file
>>> should
>>> be
>>>          *         included in the array retirned by the call to the
>>> listFiles()
>>>          *         method. In this case true will be returned if the
>>> name
>>> of
>>> the
>>>          *         file ends with either '.xls' or '.xlsx' and false
>>> will
>>> be
>>>          *         returned in all other instances.
>>>          */
>>>         public boolean accept(File file, String name) {
>>>             return(name.endsWith(".xls") || name.endsWith(".xlsx"));
>>>         }
>>>     }
>>> }
>>> 
>>> 
>>> Luke_Devon wrote:
>>>> 
>>>> Hi Mark
>>>> 
>>>> I tested the code. It was fine. I could manage to convert XLS into CSV
>>>> without any problem. But i have few more questions.
>>>> 
>>>> In this code , you have been hard coded the path where XLS and CSV
>>>> located. and file names also hard coded.
>>>> 
>>>> How it would be pointing to a single directory and convert all XLS into
>>>> CSV stored in the folder ? 
>>>> 
>>>> In the RAW XLS file , there might be some unwanted data to be converted
>>>> into CSV. or some times we need to complete the blank cells in the
>>>> excel
>>>> sheet.
>>>> 
>>>> How can we do it , before convert into CSV ? 
>>>> 
>>>> Sorry about all those simple questions.( Since I'm not a programmer)
>>>> 
>>>> Thanks in advance
>>>> Luke
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ________________________________
>>>> From: MSB <ma...@tiscali.co.uk>
>>>> To: user@poi.apache.org
>>>> Sent: Friday, April 9, 2010 22:18:43
>>>> Subject: Re: Convert XLS into CSV
>>>> 
>>>> 
>>>> Hello Luke,
>>>> 
>>>> As promised, a bit of code that uses the usermodel to create CSV files.
>>>> I
>>>> have not had the opportunity to test it thoroughly and do expect there
>>>> to
>>>> be
>>>> issues so do not use the code in a production environment until you
>>>> have
>>>> put
>>>> it through the wringer so to speak. Also, you may find the performance
>>>> a
>>>> little slower than you expect, especially if you are using the newer
>>>> xml
>>>> based file format and have run the eventusermodel code that Nick wrote.
>>>> 
>>>> In essence, 'my' code simplt takes the contents of the workbook and
>>>> converts
>>>> it into an ArrayList of ArrayLists where each inner ArrayList contains
>>>> zero,
>>>> one or more Strings that describe the contents of a cell. I used this
>>>> approach becuase it allows me to ensure that every row in the finished
>>>> CVS
>>>> file is the same length - with regard to the number of elements it
>>>> contains
>>>> - even if the input workbook contains rows that have varying numbers of
>>>> cells on them. The code as it stands does evaluate any formulae that
>>>> may
>>>> be
>>>> contained within cells and I hope will perform pretty much as you
>>>> require.
>>>> Take a look down into the main() method to see how it can be used; this
>>>> method only shows the class being used to process a single file but an
>>>> instance can be used to process more than one file in this manner;
>>>> 
>>>> ToCSV converter = new ToCSV();
>>>> converter.openWorkbook("C:/temp/To CSV.xls");
>>>> converter.convertToCSV();
>>>> converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>>> 
>>>> converter.openWorkbook("C:/temp/Another To CSV.xlsx");
>>>> converter.convertToCSV();
>>>> converter.saveCSVFile("C:/temp/Second CSV.csv", ";");
>>>> 
>>>> 
>>>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>>>> import org.apache.poi.ss.usermodel.Workbook;
>>>> import org.apache.poi.ss.usermodel.Sheet;
>>>> import org.apache.poi.ss.usermodel.Row;
>>>> import org.apache.poi.ss.usermodel.Cell;
>>>> import org.apache.poi.ss.usermodel.DataFormatter;
>>>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>>>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>>>> 
>>>> import java.io.File;
>>>> import java.io.FileInputStream;
>>>> import java.io.FileWriter;
>>>> import java.io.BufferedWriter;
>>>> import java.io.IOException;
>>>> import java.io.FileNotFoundException;
>>>> import java.util.ArrayList;
>>>> 
>>>> /**
>>>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>>>> This
>>>> * class makes the following assumptions;
>>>> *
>>>> * 1. Where the Excel workbook contains more that one worksheet then a
>>>> single
>>>> *    CSV file will contain the data from all of the worksheets.
>>>> * 2. The data matrix contained in the CSV file will be square. This
>>>> means
>>>> that
>>>> *    the number of elements in each row of the CSV file will match the
>>>> number
>>>> *    of cells in the longest row found in the Excel workbook. Any short
>>>> rows
>>>> *    will be 'padded' with empty elements - an empty elements is
>>>> represented in
>>>> *    the CSV file in this way ,,.
>>>> * 3. Empty elements will represent missing cells.
>>>> * 4. A row consisting of empty elements will be used to represent an
>>>> empty
>>>> row
>>>> *    in the Excel workbook.
>>>> *
>>>> * @author Mark B
>>>> * @version 1.00 9th April 2010
>>>> */
>>>> public class ToCSV {
>>>> 
>>>>     private Workbook workbook = null;
>>>>     private ArrayList<ArrayList> csvData = null;
>>>>     private int maxRowWidth = 0;
>>>>     private DataFormatter formatter = null;
>>>>     private FormulaEvaluator evaluator = null;
>>>> 
>>>>     /**
>>>>      * Open an Excel workbook readt for conversion.
>>>>      *
>>>>      * @param filename An instance of the String class that
>>>> encapsulates
>>>> the
>>>>      *                 path to and name of a valid Excel workbook. Note
>>>> that
>>>> the
>>>>      *                 workbook can be either a binary (.xls) or
>>>> SpreadsheetML
>>>>      *                 (.xlsx) file.
>>>>      *
>>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>>> be
>>>> located.
>>>>      * @throws java.io.IOException Thrown if a problem occurs in the
>>>> file
>>>> system.
>>>>      * @throws
>>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>>> Thrown
>>>>      *     if invalid xml is found whilst parsing an input
>>>> SpreadsheetML
>>>> file.
>>>>      */
>>>>     public void openWorkbook(String filename) throws
>>>> FileNotFoundException,
>>>>                                            IOException,
>>>> InvalidFormatException {
>>>>         File file = null;
>>>>         FileInputStream fis = null;
>>>>         try {
>>>>             file = new File(filename);
>>>>             fis = new FileInputStream(file);
>>>>             this.workbook = WorkbookFactory.create(fis);
>>>>             this.evaluator =
>>>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>>>             this.formatter = new DataFormatter();
>>>>         }
>>>>         finally {
>>>>             if(fis != null) {
>>>>                 fis.close();
>>>>             }
>>>>         }
>>>>     }
>>>> 
>>>>     /**
>>>>      * Called to convert the contents of the currently opened workbook
>>>> into
>>>>      * a CSV file.
>>>>      */
>>>>     public void convertToCSV() {
>>>>         Sheet sheet = null;
>>>>         Row row = null;
>>>>         int lastRowNum = 0;
>>>>         this.csvData = new ArrayList<ArrayList>();
>>>> 
>>>>         // Discover how many sheets there are in the workbook....
>>>>         int numSheets = this.workbook.getNumberOfSheets();
>>>> 
>>>>         // and then iterate through them.
>>>>         for(int i = 0; i < numSheets; i++) {
>>>> 
>>>>             // Get a reference to a sheet and check to see if it
>>>> contains
>>>>             // any rows.
>>>>             sheet = this.workbook.getSheetAt(i);
>>>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>>>> 
>>>>                 // Note down the index number of the bottom-most row
>>>> and
>>>>                 // then iterate through all of the rows on the sheet
>>>> starting
>>>>                 // from the very first row - number 1 - even if it is
>>>> missing.
>>>>                 // Recover a reference to the row and then call another
>>>> method
>>>>                 // which will strip the data from the cells and build
>>>> lines
>>>>                 // for inclusion in the resylting CSV file.
>>>>                 lastRowNum = sheet.getLastRowNum();
>>>>                 for(int j = 0; j <= lastRowNum; j++) {
>>>>                     row = sheet.getRow(j);
>>>>                     this.rowToCSV(row);
>>>>                 }
>>>>             }
>>>>         }
>>>>     }
>>>> 
>>>>     /**
>>>>      * Called to actually save the data recovered from the Excel
>>>> workbook
>>>>      * as a CSV file.
>>>>      *
>>>>      * @param filename An instance of the String class that
>>>> encapsulates
>>>> the
>>>>      *                path to and name of the CSV file.
>>>>      * @param separator An instance of the String class that
>>>> encapsulates
>>>> the
>>>>      *                  character or character that ought to be used to
>>>> delimit
>>>>      *                  elements on the lines of the CSV file.
>>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>>> be
>>>> found.
>>>>      * @throws java.io.IOException Thrown to indicate and error
>>>> occurred
>>>> in
>>>> the
>>>>      *                             underylying file system.
>>>>      */
>>>>     public void saveCSVFile(String filename, String separator)
>>>>                                                    throws
>>>> FileNotFoundException,
>>>>                                                           IOException {
>>>>         File file = null;
>>>>         FileWriter fw = null;
>>>>         BufferedWriter bw = null;
>>>>         ArrayList<String> line = null;
>>>>         StringBuffer buffer = null;
>>>>         String csvLineElement = null;
>>>>         try {
>>>>             // Open a writer onto the CSV file.
>>>>             file = new File(filename);
>>>>             fw = new FileWriter(file);
>>>>             bw = new BufferedWriter(fw);
>>>> 
>>>>             // Step through the elements of the ArrayList that was used
>>>> to
>>>> hold
>>>>             // all of the data recovered from the Excel workbooks'
>>>> sheets,
>>>> rows
>>>>             // and cells.
>>>>             for(int i = 0; i < this.csvData.size(); i++) {
>>>>                 buffer = new StringBuffer();
>>>> 
>>>>                 // Get an element from the ArrayList that contains the
>>>> data
>>>> for
>>>>                 // the workbook. This element will itself be an
>>>> ArrayList
>>>>                 // containing Strings and each String will hold the
>>>> data
>>>> recovered
>>>>                 // from a single cell. The for() loop is used to
>>>> recover
>>>> elements
>>>>                 // from this 'row' ArrayList one at a time and to write
>>>> the
>>>> Strings
>>>>                 // away to a StringBuffer thus assembling a single line
>>>> for
>>>> inclusion
>>>>                 // in the CSV file. If a row was empty or if it was
>>>> short,
>>>> then
>>>>                 // the ArrayList that contains it's data will also be
>>>> shorter than
>>>>                 // some of the others. Therefore, it is necessary to
>>>> check
>>>> within
>>>>                 // the for loop to ensure that the ArrayList contains
>>>> data
>>>> to be
>>>>                 // processed. If it does, then an element will be
>>>> recovered
>>>> and
>>>>                 // appended to the StringBuffer.
>>>>                 line = this.csvData.get(i);
>>>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>>>                     if(line.size() > j) {
>>>>                         csvLineElement = line.get(j);
>>>>                         if(csvLineElement != null) {
>>>>                             buffer.append(csvLineElement);
>>>>                         }
>>>>                     }
>>>>                     if(j < (this.maxRowWidth - 1)) {
>>>>                         buffer.append(separator);
>>>>                     }
>>>>                 }
>>>> 
>>>>                 // Once the line is built, write it away to the CSV
>>>> file.
>>>>                 bw.write(buffer.toString().trim());
>>>> 
>>>>                 // Condition the inclusion of new line characters so as
>>>> to
>>>>                 // avoid an additional, superfluous, new line at the
>>>> end
>>>> of
>>>>                 // the file.
>>>>                 if(i < (this.csvData.size() - 1)) {
>>>>                     bw.newLine();
>>>>                 }
>>>>             }
>>>>         }
>>>>         finally {
>>>>             if(bw != null) {
>>>>                 bw.flush();
>>>>                 bw.close();
>>>>             }
>>>>         }
>>>>     }
>>>> 
>>>>     /**
>>>>      * Called to convert a row of cells into a line of data that can
>>>> later
>>>> be
>>>>      * output to the CSV file.
>>>>      *
>>>>      * Note that no tests have yet been conducted with blank cells or
>>>> those
>>>>      * containing formulae. Such may require latereations to the way
>>>> this
>>>> code
>>>>      * works.
>>>>      *
>>>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>>>> that
>>>>      *            encapsulates information about a row of cells
>>>> recovered
>>>> from
>>>>      *            an Excel workbook.
>>>>      */
>>>>     private void rowToCSV(Row row) {
>>>>         Cell cell = null;
>>>>         int lastCellNum = 0;
>>>>         ArrayList<String> csvLine = new ArrayList<String>();
>>>> 
>>>>         // Check to ensure that a row was recovered from the sheet as
>>>> it
>>>> is
>>>>         // possible that one or more rows between other populated rows
>>>> could
>>>> be
>>>>         // missing - blank. If the row does contain cells then...
>>>>         if(row != null) {
>>>> 
>>>>             // Get the index for the right most cell on the row and
>>>> then
>>>>             // step along the row from left to right recovering the
>>>> contents
>>>>             // of each cell, converting that into a formatted String
>>>> and
>>>>             // then storing the String into the csvLine ArrayList.
>>>>             lastCellNum = row.getLastCellNum();
>>>>             for(int i = 0; i <= lastCellNum; i++) {
>>>>                 cell = row.getCell(i);
>>>>                 if(cell == null) {
>>>>                     csvLine.add("");
>>>>                 }
>>>>                 else {
>>>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>>>                        
>>>> csvLine.add(this.formatter.formatCellValue(cell));
>>>>                     }
>>>>                     else {
>>>>                        
>>>> csvLine.add(this.formatter.formatCellValue(cell,
>>>> this.evaluator));
>>>>                     }
>>>>                 }
>>>>             }
>>>>             // Make a note of the index number of the right most cell.
>>>> This
>>>> value
>>>>             // will later be used to ensure that the matrix of data in
>>>> the
>>>> CSV file
>>>>             // is square.
>>>>             if(lastCellNum > this.maxRowWidth) {
>>>>                 this.maxRowWidth = lastCellNum;
>>>>             }
>>>>         }
>>>>         this.csvData.add(csvLine);
>>>>     }
>>>> 
>>>>     /**
>>>>      * The main() method contains code that demonstrates how to use the
>>>> class.
>>>>      * @param args
>>>>      */
>>>>     public static void main(String[] args) {
>>>>         try {
>>>>             ToCSV converter = new ToCSV();
>>>>             converter.openWorkbook("C:/temp/To CSV.xls");
>>>>             converter.convertToCSV();
>>>>             converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>>>         }
>>>>         catch(Exception ex) {
>>>>             System.out.println("Caught an: " +
>>>> ex.getClass().getName());
>>>>             System.out.println("Message: " + ex.getMessage());
>>>>             System.out.println("Stacktrace follows:.....");
>>>>             ex.printStackTrace(System.out);
>>>>         }
>>>> 
>>>>     }
>>>> }
>>>> 
>>>> Test it out, have a good look through it and if there is anything you
>>>> want
>>>> to know just post to the list.
>>>> 
>>>> Yours
>>>> 
>>>> Mark B
>>>> 
>>>> 
>>>> Luke_Devon wrote:
>>>>> 
>>>>> Hi Mark,
>>>>> 
>>>>> First of all I would like to thank you for the reply. 
>>>>> 
>>>>> Actually , currently I am using MS Office 2002. But I would like to
>>>>> use
>>>>> the code for other latest versions also.
>>>>> In my case , I dont want to do any validations for the EXCEL file ,
>>>>> because I just wanted convert entire excel file into CSV.
>>>>> Is there any simple java code available for such a basic requirement ?
>>>>> 
>>>>> anyway I'll try to use the code in the link which you have given to me
>>>>> .
>>>>> 
>>>>> Thanks & Regards
>>>>> 
>>>>> Luke.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ________________________________
>>>>> From: MSB <ma...@tiscali.co.uk>
>>>>> To: user@poi.apache.org
>>>>> Sent: Thursday, April 8, 2010 23:47:07
>>>>> Subject: Re: Convert XLS into CSV
>>>>> 
>>>>> 
>>>>> Hello Luke,
>>>>> 
>>>>> Which version of the Excel file format are you targetting, the older
>>>>> binary
>>>>> or newer xml based version? I ask because Nick wrote and contributed
>>>>> some
>>>>> code that can be used to convert the older binary files into csv. It
>>>>> uses
>>>>> the eventmodel and will seem quite complex on first acquaintance but
>>>>> here
>>>>> it
>>>>> is;
>>>>> 
>>>>> http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
>>>>> 
>>>>> I know that other users have modified the code to, for example, output
>>>>> diffenert worksheets to separate csv files.
>>>>> 
>>>>> Yours
>>>>> 
>>>>> Mark B
>>>>> 
>>>>> 
>>>>> Luke_Devon wrote:
>>>>>> 
>>>>>> Hi 
>>>>>> 
>>>>>> I wanted to convert some XLS files into CSV. I found that apache.poi
>>>>>> is
>>>>>> the most perfect tool. Since I'm a beginner , i have no idea how to
>>>>>> do
>>>>>> that . Can some body help me please ? Do you have sample code for
>>>>>> convert
>>>>>> xls into csv ?
>>>>>> 
>>>>>> Thanks in Advance
>>>>>> Luke
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>       Get your preferred Email name!
>>>>>> Now you can @ymail.com and @rocketmail.com. 
>>>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>>>> 
>>>>> 
>>>>> -- 
>>>>> View this message in context:
>>>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28180503.html
>>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>>> 
>>>>> 
>>>>>       New Email names for you! 
>>>>> Get the Email name you've always wanted on the new @ymail and
>>>>> @rocketmail. 
>>>>> Hurry before someone else does!
>>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28191046.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>>       Get your new Email address!
>>>> Grab the Email name you've always wanted before someone else does!
>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28230701.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28333557.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28373635.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28375257.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Convert XLS into CSV - Removing a Work Sheet

Posted by luke devon <lu...@yahoo.com>.
Hi Mark,

1.    Value “Sum” always resides on Column H and there is no specific /constant space for number of rows between each occurrence of the line including that word “Sum”. That’s meant, first “Sum” stored in H, 11. Second “Sum” stored in H, 14, Third “Sum” stored in H, 26 ………….so on. Row numbers are not constant but Column is constant. That is H.  

2.    Finally I understood that we don’t need to remove “comma” from the excel sheet since function of XLS convert into CSV is taking care of the matter. ( Thanks for directing me to the point )

3.    Major part is copying other missing data in the columns. Once every thing developed, I don’t need to involve with enter values manually. Main idea is handover the tasks to a CRON job.

Then I have a doubt, how to enter -C "C10! C11:C20 D34! D67:D78" or -C "Linux! C10:C20" since program is running by a CRON job

Please help me to go further .

Thanks in advance
Luke


________________________________
From: MSB <ma...@tiscali.co.uk>
To: user@poi.apache.org
Sent: Tuesday, April 27, 2010 15:08:36
Subject: Re: Convert XLS into CSV - Removing a Work Sheet


Glad to hear that Luke, now onto the other tasks.

Firstly, can I ask you a question about the first task - removing a row if
the word Sum appears on it - please? Does Sum always appear in a particular
column? Failing that, are there a set number of rows between each occurrence
of the line including that word? If so, this will speed up the entire
process somewhat, if not, then we are faced with a process where we need to
examine every cell on each row in the worksheet and this could be quite a
time consuming process. If it is possible to identify a pattern, even a
quite crude one that identifies that the word will only appear in columns D,
G or H, then that will help.

Secondly, the commas in numeric values, this could pressent a bit of a
problem. As you may know, styles are used to control how a value is
displayed in a cell and it could be that the person who originally
constructed the worksheet entered a value such as 12345 into the cell and
then applied a style so that it appears to contain 12,345.00. The best way
to resolve this issue is, in many ways, to simply remove the cells
formatting - but there is yet another issue, what if the cell was not
numeric at all and the user entered a String of text and then right aligned
it? So, can you look at the worksheet using Excel, find a few of these cells
and see what type they are and what format has been applied to them please
because, without the file at hand, it is hard to tell you how to proceed on
this issue. Finally, do you need to modify these cells in the workbook as it
would be trivial task to remove the comma when writing the value into the
CSV file.

Thirdly, the copying cell contents code, is it possible to say, in advance,
copy the contents of cell B16 into the range B17 to B25 for example? If so,
then it would be quite a straightforward task to automate the process
completely and simply pass in a parameter format in a way the code would
recognise. For example, and assuming running the code from the command line,
we could have -C "C10!C11:C20 D34!D67:D78" or -C "Linux!C10:C20" where the
former would take the contents of cell C10 and copy it into cells C11 to C20
and the latter would write the String Linux into cell C11 to C20.

Yours

Mark B



Luke_Devon wrote:
> 
> Hi Mark
> 
> I modified the code to removing the Summary sheet from my Excel workbook.
> It is working. Thanks for your guide and the support.( Herewith I have
> attached the code )
> 
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import org.apache.poi.ss.usermodel.Workbook;
> import org.apache.poi.ss.usermodel.WorkbookFactory;
> 
> public class SheetUtil {
> 
>    public static void main(String a[]){
>        new SheetUtil().deleteSheet();
>    }
> 
> 
>    public void deleteSheet(){
>        int sheetToDelete = -1;
> 
>        String path = "Book1.xls";
>        FileInputStream fis  = null;
>        Workbook workbook = null;
>        File file = null;
>        FileOutputStream fos = null;
>        try{
> 
>            file = new File(path);
> 
>            fis = new FileInputStream(file);
>            workbook = WorkbookFactory.create(fis);
> 
> 
> 
>            int numSheets = workbook.getNumberOfSheets();
> 
>            for(int i = 0; i < numSheets; i++) {
>                org.apache.poi.ss.usermodel.Sheet sheet =
> workbook.getSheetAt(i);
>                System.out.println(sheet.getSheetName());
>                if(sheet.getSheetName().equalsIgnoreCase("Summary")) {
>                    sheetToDelete = i;
>                    break;
> 
>                }
>            }
> 
> 
>            if(sheetToDelete > -1) {
>                workbook.removeSheetAt(sheetToDelete);
>            }
> 
>            fos = new FileOutputStream(file);
> 
>            workbook.write(fos);
> 
>        }catch(Exception e){
>            e.printStackTrace();
>        }finally{
>            try{
>                if(fis != null){
>                    fis.close();
>                    fis = null;
> 
>                }
>                if(fos != null){
>                    fos.close();
>                    fos = null;
> 
>                }
>            }catch(Exception ee){
>                ee.printStackTrace();
> 
>            }
>        }
>    }
> }
> 
> =====================================================
> 
> I have few more modifications , to be applied for the workbook. Can you
> please help me further?
> 
> I will list down you my other requirements.
> 
> After I removed the "Summary" sheet , there will be only one Sheet will be
> remain which is "Detail". 
> 
> In the Detail Sheet,
> 
> 1. In the Sheet , there is String called "Sum". This value also will be
> there more than once.
> 
> Once that value found , needed to be delete entire row respectively. Eg:
> If "Sum" is in , Cell H11, 11th row needed to be removed from the
> worksheet.
> 
> 2. Some Cells are containing values with "1000 separator (,)" -- Eg:
> 1,256.00 . I wanted to remove all thousand separators from the values
> which are exist in the Detail excel sheet. Eg: 1256.00 --- after removing
> the "," value should be like this.(1256.00)
> 
> 3. In B16 , There is a String called "Linux". This value i wanted to copy
> ( repeat ) until B25.Thats meant This value must be repeat from B17 to
> B25.( This is just like we copy a value along the column for more than one
> row )
> 
> Like that , in this worksheet , there will be lot of values to be copied.
> Hence I think we can not do it dynamically and we needed hard code those
> values in the code .
> 
> Thanks in advance
> Luke
> 
> 
> ________________________________
> From: MSB <ma...@tiscali.co.uk>
> To: user@poi.apache.org
> Sent: Friday, April 23, 2010 3:27:38
> Subject: Re: Convert XLS into CSV
> 
> 
> Luke
> 
> As you have only limited experience with Java, I have uploaded a
> distribution that you should be able to
> run from the command line. I know some companies restrict access to file
> sharing sites bu if you can use Rapidshare from your location, this is the
> link to the archive;
> 
> http://rapidshare.com/files/378927105/CSV_Conversion_Code.rar
> 
> and if you have any problems downloading the file, just let me know. It is
> quite large as it contains all of
> the POI archives that you will need to run the code but I am confident
> there
> is another way I could send it
> directly to you if Rapidshare is inaccessible.
> 
> If you are unfamiliar with Rapidshare, all you need to do is follow the
> link, choose the free download option
> and download the file to your machine. Once you have downloaded the
> archive
> called 
> CSV Conversion Code.rar, unzip it and you should have a folder with the
> name
> CSV Conversion Code. This
> folder contains a further folder called dist and a file by the name of
> Read
> Me.rtf.
> Open this rtf file (Word or WordPad will handle it successfully) and it
> will
> tell you what to do next.
> 
> Any problems, just send me a message.
> 
> Yours
> 
> Mark B
> 
> PS How did you get on with removing the Summary sheet from your Excel
> workbook?
> 
> 
> Luke_Devon wrote:
>> 
>> Hi Mark
>> 
>> First of all I would like to thank you for all your explanation which you
>> have explained very well regarding the code. It was really helpful to
>> understand the flow of the code and I think its nice to have such
>> audience
>> and explanation from an expert / one of the best,  in the industry for a
>> beginner like me. 
>> 
>> I was executing your final Code to convert XLS into CSV. Its compiled,
>> but
>> there was two messages , 
>> uses unchecked or unsafe operations.
>> Note: Recompile with -Xlint:unchecked for details.
>> 
>> I did google , for those messages and there was some directives , but as
>> of some forum says , it wont affect to the execution.
>> 
>> Finally , when I tried to execute it , 
>> 
>> java ToCSV d:\excel  d:\csv , 
>> 
>> [java] Usage: java ToCSV "Source Folder" "Destination Folder" "CSV
>> Element
>> Separator"
>> 
>> BUILD FAILED
>> Target "d:\excel" does not exist in the project "Inbox".
>> 
>> I tried to go through the code  , if there is any thing that i can
>> modified to solve the problem.but no luck.
>> 
>> Thanks in advance
>> Luke
>> 
>> 
>> ________________________________
>> From: MSB <ma...@tiscali.co.uk>
>> To: user@poi.apache.org
>> Sent: Tuesday, April 13, 2010 22:04:55
>> Subject: Re: Convert XLS into CSV
>> 
>> 
>> The code I have pasted into this message has been modified so that it is
>> possible to specify the path to and name of a folder containing one or
>> more
>> Excel workbooks and the code will now iterate through each of the
>> workbooks
>> in the folder and convert them into separate CSV files. The name of the
>> resulting CSV file will in each case be derived by taking the name of the
>> source Excel workbook, removing the .xls or xlsx extension and suffixing
>> the
>> .csv extension. Before looking at the code in any detail - most of it is
>> identical by the way - I need to explain how you can call and use this
>> class. Forgive me if you know any of this but I thought it safer to
>> assume
>> you might not be totally clear on the details.
>> 
>> When you run a Java class with a command like this;
>> 
>> java ToCSV
>> 
>> the runtime environment will locate the .class file, load and compile the
>> bytecode and then look inside the class for the main() method which, if
>> that
>> method is present and it contains any code at all, is where execution of
>> the
>> program will begin. If you look at the signature for the main() method,
>> you
>> will see that it looks something like this;
>> 
>> public static void main(String[] args)
>> 
>> ignoring the public, static and void keywords for the moment, take a look
>> at
>> what the brackets contain - String[] args. This is the declaration for a
>> variable called args which is able to hold a reference to an array where
>> each element of that array is an instance the java.lang.String class.
>> This
>> array will hold any parameters that were specified on the command line
>> when
>> the class was run. So, to modify the example above slightly, this;
>> 
>> java ToCSV C:/temp C:/temp/CSV
>> 
>> would result in a reference to an an array holding two Strings being
>> passed
>> to the args parameter; the first element would hold "C:/temp" and the
>> second
>> C:/temp/CSV. Well, more accurately, the elements would hold references to
>> instances of the String class that encapsulated the values "C:/temp" and
>> C:/temp/CSV.
>> 
>> The contents of the args parameter are accessible to any code within the
>> args method and so it is possible to extract the contents of the two
>> elements by simply referencing each using Java's standard notation; the
>> first element in the args array would be args[0] and the second args[1].
>> 
>> With this information, you can see how to run the ToCSV class from the
>> command line. All that you need do is specify the path to and name of
>> either
>> an individual Excel workbook or of a folder that contains Excel workbook,
>> the path to and name of the folder that should contain the CSV files and,
>> optionally, the character or string of characters that will separate each
>> element on the line within the CSV file. The class can be invoked from
>> the
>> command line in this manner;
>> 
>> java ToCSV "C:/temp/Excel Workbooks" "C:/temp/CSV Files"
>> 
>> or
>> 
>> java ToCSV C:/temp/Excel C:/temp/CSV ,
>> 
>> Note that the speach marks enclosing the paths in the first example are
>> only
>> necessary if there are embedded spaces within the parameter.
>> 
>> If you now take a look at the code in the main() method of the ToCSV
>> file,
>> you can see that it simply recovers the values from the args array and
>> uses
>> them to call the convertExcelToCSV() method on an instance of the ToCSV
>> class.
>> 
>> Equally, you can call this class from code within another Java class. If
>> you
>> choose to do this, it is an even simpler process; simply create an
>> instance
>> of the ToCSV class and then call either of the overloaded
>> convertExcelToCSV() methods. One accepts two parameters, the name of and
>> path to the source Excel workbook(s) and the name of and path to the
>> folder
>> the CSV files ought to be written to. If you call this method, it will be
>> assumed that the separtor is a comma. The other method accepts a third
>> parameter allowing you to define what the separator should be, so you
>> could
>> pass a colon for example. 
>> 
>> The process is a little like this;
>> 
>> ToCSV converter = new ToCSV();
>> converter.convertExcelToCSV("C:/temp/Excel Workbooks", "C:/temp/CSV
>> Files",
>> "!");
>> 
>> The folder that contains the workbooks can hold other files as well, you
>> do
>> not have to worry about copying the Excel workbooks to a special file for
>> processing. The ToCSV class is able to identify and select just Excel
>> workbook files that have extensions of '.xls' or '.xlsx' and it does this
>> by
>> using an instance of an inner class called ExcelFilenameFilter that you
>> can
>> see defined at the bottom of the source code, immediately following the
>> main() method. As you can see, it contains a single method - accept() -
>> that
>> contains a single line of code. This line of code will return the boolean
>> value true if the name of a file ends with either '.xls' or '.xlsx' and
>> an
>> instance of ExcelFilenameFilter can, therefore, be used, when the
>> contents
>> of the folder are read, to ensure that only Excel workbooks are
>> processed.
>> I
>> often use inner classes like this if they are very small and simple or if
>> they only make sense in the context of the enclosing class. By this, I
>> mean
>> asking yourslef if it is likely any other class may need similar
>> functionality.
>> 
>> Anayway, here is the code. Remember that it is only one possible solution
>> and you may need to modify how it works to suit your specific
>> requirement.
>> You may not be worried, for example, if all of the rows in the CSV file
>> contain different numbers of elements. If this is the case, the contents
>> of
>> the cells could be written out immediately after they were read and you
>> could completely remove the additional step of storing them in ArrayLists
>> to
>> ensure the matrix is square.
>> 
>> package postprocessxlsx;
>> 
>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>> import org.apache.poi.ss.usermodel.Workbook;
>> import org.apache.poi.ss.usermodel.Sheet;
>> import org.apache.poi.ss.usermodel.Row;
>> import org.apache.poi.ss.usermodel.Cell;
>> import org.apache.poi.ss.usermodel.DataFormatter;
>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>> 
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileWriter;
>> import java.io.BufferedWriter;
>> import java.io.FilenameFilter;
>> import java.io.IOException;
>> import java.io.FileNotFoundException;
>> import java.util.ArrayList;
>> 
>> /**
>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>> This
>> * class makes the following assumptions;
>> *
>> * 1. Where the Excel workbook contains more that one worksheet then a
>> single
>> *    CSV file will contain the data from all of the worksheets.
>> * 2. The data matrix contained in the CSV file will be square. This means
>> that
>> *    the number of elements in each row of the CSV file will match the
>> number
>> *    of cells in the longest row found in the Excel workbook. Any short
>> rows
>> *    will be 'padded' with empty elements - an empty elements is
>> represented in
>> *    the CSV file in this way ,,.
>> * 3. Empty elements will represent missing cells.
>> * 4. A row consisting of empty elements will be used to represent an
>> empty
>> row
>> *    in the Excel workbook.
>> *
>> * @author Mark B
>> * @version 1.10 12th April 2010
>> */
>> public class ToCSV {
>> 
>>     private Workbook workbook = null;
>>     private ArrayList<ArrayList> csvData = null;
>>     private int maxRowWidth = 0;
>>     private DataFormatter formatter = null;
>>     private FormulaEvaluator evaluator = null;
>>     private String separtor = null;
>> 
>>     public static final String CSV_FILE_EXTENSION = ".csv";
>>     public static final String DEFAULT_SEPARATOR = ",";
>> 
>>     /**
>>      * Process the contents of a folder, convert the contents of each
>> Excel
>>      * workbook into CSV format and save the resulting file to the
>> specified
>>      * folder using the same name as the original workbook with the .xls
>> or
>>      * .xlsx extension replaced by .csv
>>      *
>>      * @param source An instance of the String class that encapsulates
>> the
>>      *        name of and path to either a folder containing those Excel
>>      *        workbook(s) or the name of and path to an individual Excel
>> workbook
>>      *        that is/are to be converted.
>>      * @param destination An instance of the String class encapsulating
>> the
>> name
>>      *        of and path to a folder that will contain the resulting CSV
>> files.
>>      * @throws java.io.FileNotFoundException Thrown if any file cannot be
>> located
>>      *         on the filesystem during processing.
>>      * @throws java.io.IOException Thrown if the filesystem encounters
>> any
>>      *         problems during processing.
>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>> passed
>>      *         to the source parameters refers to a file or folder that
>> does
>> not
>>      *         exist and if the value passed to the destination paramater
>> refers
>>      *         to a folder that does not exist or simply does not refer
>> to
>> a
>>      *         folder that does not exist.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *         if the xml markup encounetered whilst parsing a
>> SpreadsheetML
>>      *         file (.xlsx) is invalid.
>>      */
>>     public void convertExcelToCSV(String strSource, String
>> strDestination)
>>                        throws FileNotFoundException, IOException,
>>                               IllegalArgumentException,
>> InvalidFormatException {
>> 
>>         // Simply chain the call to the overloaded
>> convertExcelToCSV(String,
>>         // String, String) method and pass the default separator to
>> ensure
>> this
>>         // String is used to separate elemnts on the line in the CSV
>> file.
>>         this.convertExcelToCSV(strSource, strDestination,
>> ToCSV.DEFAULT_SEPARATOR);
>>     }
>> 
>>     /**
>>      * Process the contents of a folder, convert the contents of each
>> Excel
>>      * workbook into CSV format and save the resulting file to the
>> specified
>>      * folder using the same name as the original workbook with the .xls
>> or
>>      * .xlsx extension replaced by .csv
>>      *
>>      * @param source An instance of the String class that encapsulates
>> the
>>      *        name of and path to either a folder containing those Excel
>>      *        workbook(s) or the name of and path to an individual Excel
>> workbook
>>      *        that is/are to be converted.
>>      * @param destination An instance of the String class encapsulating
>> the
>> name
>>      *        of and path to a folder that will contain the resulting CSV
>> files.
>>      * @param separator An instance of the String class encapsulating the
>>      *        characters or characters that should be used to separate
>> items
>>      *        on a line within the CSV file.
>>      * @throws java.io.FileNotFoundException Thrown if any file cannot be
>> located
>>      *         on the filesystem during processing.
>>      * @throws java.io.IOException Thrown if the filesystem encounters
>> any
>>      *         problems during processing.
>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>> passed
>>      *         to the source parameters refers to a file or folder that
>> does
>> not
>>      *         exist and if the value passed to the destination paramater
>> refers
>>      *         to a folder that does not exist or simply does not refer
>> to
>> a
>>      *         folder that does not exist.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *         if the xml markup encounetered whilst parsing a
>> SpreadsheetML
>>      *         file (.xlsx) is invalid.
>>      */
>>     public void convertExcelToCSV(String strSource,
>>                                   String strDestination, String
>> separator)
>>                        throws FileNotFoundException, IOException,
>>                               IllegalArgumentException,
>> InvalidFormatException {
>>         File source = new File(strSource);
>>         File destination = new File(strDestination);
>>         File[] filesList = null;
>>         String destinationFilename = null;
>> 
>>         // Check that the source file/folder exists.
>>         if(!source.exists()) {
>>             throw new IllegalArgumentException("The source for the Excel
>> "
>> +
>>                     "file(s) cannot be found.");
>>         }
>> 
>>         // Ensure thaat the folder the user has chosen to save the CSV
>> files
>>         // away into firstly exists and secondly is a folder rather than,
>> for
>>         // instance, a data file.
>>         if(!destination.exists()) {
>>             throw new IllegalArgumentException("The folder/directory for
>> the
>> " +
>>                     "converted CSV file(s) does not exist.");
>>         }
>>         if(!destination.isDirectory()) {
>>             throw new IllegalArgumentException("The destination for the
>> CSV
>> " +
>>                     "file(s) is not a directory/folder.");
>>         }
>> 
>>         // Check to see if the sourceFolder variable holds a reference to
>>         // a file or a folder full of files.
>>         if(source.isDirectory()) {
>>             // Get a list of all of the Excel spreadsheet files
>> (workbooks)
>> in
>>             // the source folder/directory
>>             filesList = source.listFiles(new ExcelFilenameFilter());
>>         }
>>         else {
>>             // Assume that it must be a file handle - although there are
>> other
>>             // options the code should perhaps check - and store the
>> reference
>>             // into the filesList variable.
>>             filesList = new File[]{source};
>>         }
>> 
>>         // Step through each of the files in the source folder and for
>> each
>>         // open the workbook, convert it's contents to CSV format and
>> then
>>         // save the resulting file away into the folder specified by the
>>         // contents of the destination variable. Note that the name of
>> the
>>         // csv file will be created by taking the name of the Excel file,
>>         // removing the extension and replacing it with .csv. Note that
>> there
>>         // is one drawback with this approach; if the folder holding the
>> files
>>         // contains two workbooks whose names match but one is a binary
>> file
>>         // (.xls) and the other a SpreadsheetML file (.xlsx), then the
>> names
>>         // for both CSV files will be identical and one CSV file will,
>>         // therefore, over-write the other.
>>         for(File excelFile : filesList) {
>>             // Open the workbook
>>             this.openWorkbook(excelFile);
>> 
>>             // Convert it's contents into a CSV file
>>             this.convertToCSV();
>> 
>>             // Build the name of the csv folder from that of the Excel
>> workbook.
>>             // Simply replace the .xls or .xlsx file extension with .csv
>>             destinationFilename = excelFile.getName();
>>             destinationFilename = destinationFilename.substring(
>>                     0, destinationFilename.lastIndexOf(".")) + ".csv";
>> 
>>             // Save the CSV file away using the newly constricted file
>> name
>>             // and to the specified directory.
>>             this.saveCSVFile(new File(destination, destinationFilename),
>> separator);
>>         }
>>     }
>> 
>>     /**
>>      * Open an Excel workbook ready for conversion.
>>      *
>>      * @param file An instance of the File class that encapsulates a
>> handle
>>      *        to a valid Excel workbook. Note that the workbook can be in
>>      *        either binary (.xls) or SpreadsheetML (.xlsx) format.
>>      *
>>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
>> located.
>>      * @throws java.io.IOException Thrown if a problem occurs in the file
>> system.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *         if invalid xml is found whilst parsing an input
>> SpreadsheetML
>>      *         file.
>>      */
>>     private void openWorkbook(File file) throws FileNotFoundException,
>>                                            IOException,
>> InvalidFormatException {
>>         FileInputStream fis = null;
>>         try {
>>             System.out.println("Opening workbook [" + file.getName() +
>> "]");
>> 
>>             fis = new FileInputStream(file);
>> 
>>             // Open the workbook and then create the FormulaEvaluator and
>>             // DataFormatter instances that will be needed to,
>> respectively,
>>             // force evaluation of forumlae found in cells and create a
>>             // formatted String encapsulating the cells contents.
>>             this.workbook = WorkbookFactory.create(fis);
>>             this.evaluator =
>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>             this.formatter = new DataFormatter();
>>         }
>>         finally {
>>             if(fis != null) {
>>                 fis.close();
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to convert the contents of the currently opened workbook
>> into
>>      * a CSV file.
>>      */
>>     private void convertToCSV() {
>>         Sheet sheet = null;
>>         Row row = null;
>>         int lastRowNum = 0;
>>         this.csvData = new ArrayList<ArrayList>();
>> 
>>         System.out.println("Converting files contents to CSV format.");
>> 
>>         // Discover how many sheets there are in the workbook....
>>         int numSheets = this.workbook.getNumberOfSheets();
>> 
>>         // and then iterate through them.
>>         for(int i = 0; i < numSheets; i++) {
>> 
>>             // Get a reference to a sheet and check to see if it contains
>>             // any rows.
>>             sheet = this.workbook.getSheetAt(i);
>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>> 
>>                 // Note down the index number of the bottom-most row and
>>                 // then iterate through all of the rows on the sheet
>> starting
>>                 // from the very first row - number 1 - even if it is
>> missing.
>>                 // Recover a reference to the row and then call another
>> method
>>                 // which will strip the data from the cells and build
>> lines
>>                 // for inclusion in the resylting CSV file.
>>                 lastRowNum = sheet.getLastRowNum();
>>                 for(int j = 0; j <= lastRowNum; j++) {
>>                     row = sheet.getRow(j);
>>                     this.rowToCSV(row);
>>                 }
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to actually save the data recovered from the Excel workbook
>>      * as a CSV file.
>>      *
>>      * @param file An instance of the File class that encapsulates a
>> handle
>>      *             referring to the CSV file.
>>      * @param separator An instance of the String class that encapsulates
>> the
>>      *                  character or character that ought to be used to
>> delimit
>>      *                  elements on the lines of the CSV file.
>>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
>> found.
>>      * @throws java.io.IOException Thrown to indicate and error occurred
>> in
>> the
>>      *                             underylying file system.
>>      */
>>     private void saveCSVFile(File file, String separator)
>>                                      throws FileNotFoundException,
>> IOException {
>>         FileWriter fw = null;
>>         BufferedWriter bw = null;
>>         ArrayList<String> line = null;
>>         StringBuffer buffer = null;
>>         String csvLineElement = null;
>>         try {
>> 
>>             System.out.println("Saving the CSV file [" + file.getName() +
>> "]");
>> 
>>             // Open a writer onto the CSV file.
>>             fw = new FileWriter(file);
>>             bw = new BufferedWriter(fw);
>> 
>>             // Step through the elements of the ArrayList that was used
>> to
>> hold
>>             // all of the data recovered from the Excel workbooks'
>> sheets,
>> rows
>>             // and cells.
>>             for(int i = 0; i < this.csvData.size(); i++) {
>>                 buffer = new StringBuffer();
>> 
>>                 // Get an element from the ArrayList that contains the
>> data
>> for
>>                 // the workbook. This element will itself be an ArrayList
>>                 // containing Strings and each String will hold the data
>> recovered
>>                 // from a single cell. The for() loop is used to recover
>> elements
>>                 // from this 'row' ArrayList one at a time and to write
>> the
>> Strings
>>                 // away to a StringBuffer thus assembling a single line
>> for
>> inclusion
>>                 // in the CSV file. If a row was empty or if it was
>> short,
>> then
>>                 // the ArrayList that contains it's data will also be
>> shorter than
>>                 // some of the others. Therefore, it is necessary to
>> check
>> within
>>                 // the for loop to ensure that the ArrayList contains
>> data
>> to be
>>                 // processed. If it does, then an element will be
>> recovered
>> and
>>                 // appended to the StringBuffer.
>>                 line = this.csvData.get(i);
>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>                     if(line.size() > j) {
>>                         csvLineElement = line.get(j);
>>                         if(csvLineElement != null) {
>>                             buffer.append(csvLineElement);
>>                         }
>>                     }
>>                     if(j < (this.maxRowWidth - 1)) {
>>                         buffer.append(separator);
>>                     }
>>                 }
>> 
>>                 // Once the line is built, write it away to the CSV file.
>>                 bw.write(buffer.toString().trim());
>> 
>>                 // Condition the inclusion of new line characters so as
>> to
>>                 // avoid an additional, superfluous, new line at the end
>> of
>>                 // the file.
>>                 if(i < (this.csvData.size() - 1)) {
>>                     bw.newLine();
>>                 }
>>             }
>>         }
>>         finally {
>>             if(bw != null) {
>>                 bw.flush();
>>                 bw.close();
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to convert a row of cells into a line of data that can
>> later
>> be
>>      * output to the CSV file.
>>      *
>>      * Note that no tests have yet been conducted with blank cells or
>> those
>>      * containing formulae. Such may require latereations to the way this
>> code
>>      * works.
>>      *
>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>> that
>>      *            encapsulates information about a row of cells recovered
>> from
>>      *            an Excel workbook.
>>      */
>>     private void rowToCSV(Row row) {
>>         Cell cell = null;
>>         int lastCellNum = 0;
>>         ArrayList<String> csvLine = new ArrayList<String>();
>> 
>>         // Check to ensure that a row was recovered from the sheet as it
>> is
>>         // possible that one or more rows between other populated rows
>> could
>> be
>>         // missing - blank. If the row does contain cells then...
>>         if(row != null) {
>> 
>>             // Get the index for the right most cell on the row and then
>>             // step along the row from left to right recovering the
>> contents
>>             // of each cell, converting that into a formatted String and
>>             // then storing the String into the csvLine ArrayList.
>>             lastCellNum = row.getLastCellNum();
>>             for(int i = 0; i <= lastCellNum; i++) {
>>                 cell = row.getCell(i);
>>                 if(cell == null) {
>>                     csvLine.add("");
>>                 }
>>                 else {
>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>                        
>> csvLine.add(this.formatter.formatCellValue(cell));
>>                     }
>>                     else {
>>                         csvLine.add(this.formatter.formatCellValue(cell,
>> this.evaluator));
>>                     }
>>                 }
>>             }
>>             // Make a note of the index number of the right most cell.
>> This
>> value
>>             // will later be used to ensure that the matrix of data in
>> the
>> CSV file
>>             // is square.
>>             if(lastCellNum > this.maxRowWidth) {
>>                 this.maxRowWidth = lastCellNum;
>>             }
>>         }
>>         this.csvData.add(csvLine);
>>     }
>> 
>>     /**
>>      * The main() method contains code that demonstrates how to use the
>> class.
>>      * @param args
>>      */
>>     public static void main(String[] args) {
>>         // Check the number of arguments passed to the main method. There
>>         // must be two or three, the name of and path to either the
>> folder
>>         // containing the Excel files or an individula Excel workbook
>> that
>> is/are
>>         // to be converted, the name of and path to the folder to which
>> the
>> CSV
>>         // files should be written and then finally, optionally, the
>> separator
>>         // that should be used to separate individual items on the lines
>> in
>> the
>>         // CSV file. Note that the names of the CSV files will be derived
>> from
>>         // those of the Excel file(s). Put simply the .xls or .xlsx
>> extension
>>         // will be replaced with .csv.
>>         ToCSV converter = null;
>>         try {
>>             converter = new ToCSV();
>>             if(args.length == 2) {
>>                 converter.convertExcelToCSV(args[0], args[1]);
>>             }
>>             else if(args.length == 3){
>>                 converter.convertExcelToCSV(args[0], args[1], args[2]);
>>             }
>>             else {
>>                 System.out.println("Usage: java ToCSV \"Source Folder\" "
>> +
>>                         "\"Destination Folder\" \"CSV Element
>> Separator\"");
>>             }
>>         }
>>         catch(Exception ex) {
>>             System.out.println("Caught an: " + ex.getClass().getName());
>>             System.out.println("Message: " + ex.getMessage());
>>             System.out.println("Stacktrace follows:.....");
>>             ex.printStackTrace(System.out);
>>         }
>>     }
>> 
>>     /**
>>      * An instance of this class can be used to control the files
>> returned
>>      * be a call to the listFiles() method when made on an instance of
>> the
>>      * File class and that object refers to a folder/directory
>>      */
>>     class ExcelFilenameFilter implements FilenameFilter {
>> 
>>         /**
>>          * Determine those files that will be returned by a call to the
>>          * listFiles() method. In this case, the name of the file must
>> end
>> with
>>          * either of the following two extension; '.xls' or '.xlsx'
>>          * @param file An instance of the File class that encapsulates a
>> handle
>>          *             referring to the folder/directory that contains
>> the
>> file.
>>          * @param name An instance of the String class that encapsulates
>> the
>>          *             name of the file.
>>          * @return A boolean value that indicates whether the file should
>> be
>>          *         included in the array retirned by the call to the
>> listFiles()
>>          *         method. In this case true will be returned if the name
>> of
>> the
>>          *         file ends with either '.xls' or '.xlsx' and false will
>> be
>>          *         returned in all other instances.
>>          */
>>         public boolean accept(File file, String name) {
>>             return(name.endsWith(".xls") || name.endsWith(".xlsx"));
>>         }
>>     }
>> }
>> 
>> 
>> Luke_Devon wrote:
>>> 
>>> Hi Mark
>>> 
>>> I tested the code. It was fine. I could manage to convert XLS into CSV
>>> without any problem. But i have few more questions.
>>> 
>>> In this code , you have been hard coded the path where XLS and CSV
>>> located. and file names also hard coded.
>>> 
>>> How it would be pointing to a single directory and convert all XLS into
>>> CSV stored in the folder ? 
>>> 
>>> In the RAW XLS file , there might be some unwanted data to be converted
>>> into CSV. or some times we need to complete the blank cells in the excel
>>> sheet.
>>> 
>>> How can we do it , before convert into CSV ? 
>>> 
>>> Sorry about all those simple questions.( Since I'm not a programmer)
>>> 
>>> Thanks in advance
>>> Luke
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ________________________________
>>> From: MSB <ma...@tiscali.co.uk>
>>> To: user@poi.apache.org
>>> Sent: Friday, April 9, 2010 22:18:43
>>> Subject: Re: Convert XLS into CSV
>>> 
>>> 
>>> Hello Luke,
>>> 
>>> As promised, a bit of code that uses the usermodel to create CSV files.
>>> I
>>> have not had the opportunity to test it thoroughly and do expect there
>>> to
>>> be
>>> issues so do not use the code in a production environment until you have
>>> put
>>> it through the wringer so to speak. Also, you may find the performance a
>>> little slower than you expect, especially if you are using the newer xml
>>> based file format and have run the eventusermodel code that Nick wrote.
>>> 
>>> In essence, 'my' code simplt takes the contents of the workbook and
>>> converts
>>> it into an ArrayList of ArrayLists where each inner ArrayList contains
>>> zero,
>>> one or more Strings that describe the contents of a cell. I used this
>>> approach becuase it allows me to ensure that every row in the finished
>>> CVS
>>> file is the same length - with regard to the number of elements it
>>> contains
>>> - even if the input workbook contains rows that have varying numbers of
>>> cells on them. The code as it stands does evaluate any formulae that may
>>> be
>>> contained within cells and I hope will perform pretty much as you
>>> require.
>>> Take a look down into the main() method to see how it can be used; this
>>> method only shows the class being used to process a single file but an
>>> instance can be used to process more than one file in this manner;
>>> 
>>> ToCSV converter = new ToCSV();
>>> converter.openWorkbook("C:/temp/To CSV.xls");
>>> converter.convertToCSV();
>>> converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>> 
>>> converter.openWorkbook("C:/temp/Another To CSV.xlsx");
>>> converter.convertToCSV();
>>> converter.saveCSVFile("C:/temp/Second CSV.csv", ";");
>>> 
>>> 
>>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>>> import org.apache.poi.ss.usermodel.Workbook;
>>> import org.apache.poi.ss.usermodel.Sheet;
>>> import org.apache.poi.ss.usermodel.Row;
>>> import org.apache.poi.ss.usermodel.Cell;
>>> import org.apache.poi.ss.usermodel.DataFormatter;
>>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>>> 
>>> import java.io.File;
>>> import java.io.FileInputStream;
>>> import java.io.FileWriter;
>>> import java.io.BufferedWriter;
>>> import java.io.IOException;
>>> import java.io.FileNotFoundException;
>>> import java.util.ArrayList;
>>> 
>>> /**
>>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>>> This
>>> * class makes the following assumptions;
>>> *
>>> * 1. Where the Excel workbook contains more that one worksheet then a
>>> single
>>> *    CSV file will contain the data from all of the worksheets.
>>> * 2. The data matrix contained in the CSV file will be square. This
>>> means
>>> that
>>> *    the number of elements in each row of the CSV file will match the
>>> number
>>> *    of cells in the longest row found in the Excel workbook. Any short
>>> rows
>>> *    will be 'padded' with empty elements - an empty elements is
>>> represented in
>>> *    the CSV file in this way ,,.
>>> * 3. Empty elements will represent missing cells.
>>> * 4. A row consisting of empty elements will be used to represent an
>>> empty
>>> row
>>> *    in the Excel workbook.
>>> *
>>> * @author Mark B
>>> * @version 1.00 9th April 2010
>>> */
>>> public class ToCSV {
>>> 
>>>     private Workbook workbook = null;
>>>     private ArrayList<ArrayList> csvData = null;
>>>     private int maxRowWidth = 0;
>>>     private DataFormatter formatter = null;
>>>     private FormulaEvaluator evaluator = null;
>>> 
>>>     /**
>>>      * Open an Excel workbook readt for conversion.
>>>      *
>>>      * @param filename An instance of the String class that encapsulates
>>> the
>>>      *                 path to and name of a valid Excel workbook. Note
>>> that
>>> the
>>>      *                 workbook can be either a binary (.xls) or
>>> SpreadsheetML
>>>      *                 (.xlsx) file.
>>>      *
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> located.
>>>      * @throws java.io.IOException Thrown if a problem occurs in the
>>> file
>>> system.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *     if invalid xml is found whilst parsing an input SpreadsheetML
>>> file.
>>>      */
>>>     public void openWorkbook(String filename) throws
>>> FileNotFoundException,
>>>                                            IOException,
>>> InvalidFormatException {
>>>         File file = null;
>>>         FileInputStream fis = null;
>>>         try {
>>>             file = new File(filename);
>>>             fis = new FileInputStream(file);
>>>             this.workbook = WorkbookFactory.create(fis);
>>>             this.evaluator =
>>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>>             this.formatter = new DataFormatter();
>>>         }
>>>         finally {
>>>             if(fis != null) {
>>>                 fis.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert the contents of the currently opened workbook
>>> into
>>>      * a CSV file.
>>>      */
>>>     public void convertToCSV() {
>>>         Sheet sheet = null;
>>>         Row row = null;
>>>         int lastRowNum = 0;
>>>         this.csvData = new ArrayList<ArrayList>();
>>> 
>>>         // Discover how many sheets there are in the workbook....
>>>         int numSheets = this.workbook.getNumberOfSheets();
>>> 
>>>         // and then iterate through them.
>>>         for(int i = 0; i < numSheets; i++) {
>>> 
>>>             // Get a reference to a sheet and check to see if it
>>> contains
>>>             // any rows.
>>>             sheet = this.workbook.getSheetAt(i);
>>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>>> 
>>>                 // Note down the index number of the bottom-most row and
>>>                 // then iterate through all of the rows on the sheet
>>> starting
>>>                 // from the very first row - number 1 - even if it is
>>> missing.
>>>                 // Recover a reference to the row and then call another
>>> method
>>>                 // which will strip the data from the cells and build
>>> lines
>>>                 // for inclusion in the resylting CSV file.
>>>                 lastRowNum = sheet.getLastRowNum();
>>>                 for(int j = 0; j <= lastRowNum; j++) {
>>>                     row = sheet.getRow(j);
>>>                     this.rowToCSV(row);
>>>                 }
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to actually save the data recovered from the Excel
>>> workbook
>>>      * as a CSV file.
>>>      *
>>>      * @param filename An instance of the String class that encapsulates
>>> the
>>>      *                path to and name of the CSV file.
>>>      * @param separator An instance of the String class that
>>> encapsulates
>>> the
>>>      *                  character or character that ought to be used to
>>> delimit
>>>      *                  elements on the lines of the CSV file.
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> found.
>>>      * @throws java.io.IOException Thrown to indicate and error occurred
>>> in
>>> the
>>>      *                             underylying file system.
>>>      */
>>>     public void saveCSVFile(String filename, String separator)
>>>                                                    throws
>>> FileNotFoundException,
>>>                                                           IOException {
>>>         File file = null;
>>>         FileWriter fw = null;
>>>         BufferedWriter bw = null;
>>>         ArrayList<String> line = null;
>>>         StringBuffer buffer = null;
>>>         String csvLineElement = null;
>>>         try {
>>>             // Open a writer onto the CSV file.
>>>             file = new File(filename);
>>>             fw = new FileWriter(file);
>>>             bw = new BufferedWriter(fw);
>>> 
>>>             // Step through the elements of the ArrayList that was used
>>> to
>>> hold
>>>             // all of the data recovered from the Excel workbooks'
>>> sheets,
>>> rows
>>>             // and cells.
>>>             for(int i = 0; i < this.csvData.size(); i++) {
>>>                 buffer = new StringBuffer();
>>> 
>>>                 // Get an element from the ArrayList that contains the
>>> data
>>> for
>>>                 // the workbook. This element will itself be an
>>> ArrayList
>>>                 // containing Strings and each String will hold the data
>>> recovered
>>>                 // from a single cell. The for() loop is used to recover
>>> elements
>>>                 // from this 'row' ArrayList one at a time and to write
>>> the
>>> Strings
>>>                 // away to a StringBuffer thus assembling a single line
>>> for
>>> inclusion
>>>                 // in the CSV file. If a row was empty or if it was
>>> short,
>>> then
>>>                 // the ArrayList that contains it's data will also be
>>> shorter than
>>>                 // some of the others. Therefore, it is necessary to
>>> check
>>> within
>>>                 // the for loop to ensure that the ArrayList contains
>>> data
>>> to be
>>>                 // processed. If it does, then an element will be
>>> recovered
>>> and
>>>                 // appended to the StringBuffer.
>>>                 line = this.csvData.get(i);
>>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>>                     if(line.size() > j) {
>>>                         csvLineElement = line.get(j);
>>>                         if(csvLineElement != null) {
>>>                             buffer.append(csvLineElement);
>>>                         }
>>>                     }
>>>                     if(j < (this.maxRowWidth - 1)) {
>>>                         buffer.append(separator);
>>>                     }
>>>                 }
>>> 
>>>                 // Once the line is built, write it away to the CSV
>>> file.
>>>                 bw.write(buffer.toString().trim());
>>> 
>>>                 // Condition the inclusion of new line characters so as
>>> to
>>>                 // avoid an additional, superfluous, new line at the end
>>> of
>>>                 // the file.
>>>                 if(i < (this.csvData.size() - 1)) {
>>>                     bw.newLine();
>>>                 }
>>>             }
>>>         }
>>>         finally {
>>>             if(bw != null) {
>>>                 bw.flush();
>>>                 bw.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert a row of cells into a line of data that can
>>> later
>>> be
>>>      * output to the CSV file.
>>>      *
>>>      * Note that no tests have yet been conducted with blank cells or
>>> those
>>>      * containing formulae. Such may require latereations to the way
>>> this
>>> code
>>>      * works.
>>>      *
>>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>>> that
>>>      *            encapsulates information about a row of cells
>>> recovered
>>> from
>>>      *            an Excel workbook.
>>>      */
>>>     private void rowToCSV(Row row) {
>>>         Cell cell = null;
>>>         int lastCellNum = 0;
>>>         ArrayList<String> csvLine = new ArrayList<String>();
>>> 
>>>         // Check to ensure that a row was recovered from the sheet as it
>>> is
>>>         // possible that one or more rows between other populated rows
>>> could
>>> be
>>>         // missing - blank. If the row does contain cells then...
>>>         if(row != null) {
>>> 
>>>             // Get the index for the right most cell on the row and then
>>>             // step along the row from left to right recovering the
>>> contents
>>>             // of each cell, converting that into a formatted String and
>>>             // then storing the String into the csvLine ArrayList.
>>>             lastCellNum = row.getLastCellNum();
>>>             for(int i = 0; i <= lastCellNum; i++) {
>>>                 cell = row.getCell(i);
>>>                 if(cell == null) {
>>>                     csvLine.add("");
>>>                 }
>>>                 else {
>>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>>                        
>>> csvLine.add(this.formatter.formatCellValue(cell));
>>>                     }
>>>                     else {
>>>                         csvLine.add(this.formatter.formatCellValue(cell,
>>> this.evaluator));
>>>                     }
>>>                 }
>>>             }
>>>             // Make a note of the index number of the right most cell.
>>> This
>>> value
>>>             // will later be used to ensure that the matrix of data in
>>> the
>>> CSV file
>>>             // is square.
>>>             if(lastCellNum > this.maxRowWidth) {
>>>                 this.maxRowWidth = lastCellNum;
>>>             }
>>>         }
>>>         this.csvData.add(csvLine);
>>>     }
>>> 
>>>     /**
>>>      * The main() method contains code that demonstrates how to use the
>>> class.
>>>      * @param args
>>>      */
>>>     public static void main(String[] args) {
>>>         try {
>>>             ToCSV converter = new ToCSV();
>>>             converter.openWorkbook("C:/temp/To CSV.xls");
>>>             converter.convertToCSV();
>>>             converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>>         }
>>>         catch(Exception ex) {
>>>             System.out.println("Caught an: " + ex.getClass().getName());
>>>             System.out.println("Message: " + ex.getMessage());
>>>             System.out.println("Stacktrace follows:.....");
>>>             ex.printStackTrace(System.out);
>>>         }
>>> 
>>>     }
>>> }
>>> 
>>> Test it out, have a good look through it and if there is anything you
>>> want
>>> to know just post to the list.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Luke_Devon wrote:
>>>> 
>>>> Hi Mark,
>>>> 
>>>> First of all I would like to thank you for the reply. 
>>>> 
>>>> Actually , currently I am using MS Office 2002. But I would like to use
>>>> the code for other latest versions also.
>>>> In my case , I dont want to do any validations for the EXCEL file ,
>>>> because I just wanted convert entire excel file into CSV.
>>>> Is there any simple java code available for such a basic requirement ?
>>>> 
>>>> anyway I'll try to use the code in the link which you have given to me
>>>> .
>>>> 
>>>> Thanks & Regards
>>>> 
>>>> Luke.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ________________________________
>>>> From: MSB <ma...@tiscali.co.uk>
>>>> To: user@poi.apache.org
>>>> Sent: Thursday, April 8, 2010 23:47:07
>>>> Subject: Re: Convert XLS into CSV
>>>> 
>>>> 
>>>> Hello Luke,
>>>> 
>>>> Which version of the Excel file format are you targetting, the older
>>>> binary
>>>> or newer xml based version? I ask because Nick wrote and contributed
>>>> some
>>>> code that can be used to convert the older binary files into csv. It
>>>> uses
>>>> the eventmodel and will seem quite complex on first acquaintance but
>>>> here
>>>> it
>>>> is;
>>>> 
>>>> http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
>>>> 
>>>> I know that other users have modified the code to, for example, output
>>>> diffenert worksheets to separate csv files.
>>>> 
>>>> Yours
>>>> 
>>>> Mark B
>>>> 
>>>> 
>>>> Luke_Devon wrote:
>>>>> 
>>>>> Hi 
>>>>> 
>>>>> I wanted to convert some XLS files into CSV. I found that apache.poi
>>>>> is
>>>>> the most perfect tool. Since I'm a beginner , i have no idea how to do
>>>>> that . Can some body help me please ? Do you have sample code for
>>>>> convert
>>>>> xls into csv ?
>>>>> 
>>>>> Thanks in Advance
>>>>> Luke
>>>>> 
>>>>> 
>>>>> 
>>>>>       Get your preferred Email name!
>>>>> Now you can @ymail.com and @rocketmail.com. 
>>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28180503.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>>       New Email names for you! 
>>>> Get the Email name you've always wanted on the new @ymail and
>>>> @rocketmail. 
>>>> Hurry before someone else does!
>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28191046.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>>       Get your new Email address!
>>> Grab the Email name you've always wanted before someone else does!
>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28230701.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28333557.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28373635.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Convert XLS into CSV - Removing a Work Sheet

Posted by MSB <ma...@tiscali.co.uk>.
Glad to hear that Luke, now onto the other tasks.

Firstly, can I ask you a question about the first task - removing a row if
the word Sum appears on it - please? Does Sum always appear in a particular
column? Failing that, are there a set number of rows between each occurrence
of the line including that word? If so, this will speed up the entire
process somewhat, if not, then we are faced with a process where we need to
examine every cell on each row in the worksheet and this could be quite a
time consuming process. If it is possible to identify a pattern, even a
quite crude one that identifies that the word will only appear in columns D,
G or H, then that will help.

Secondly, the commas in numeric values, this could pressent a bit of a
problem. As you may know, styles are used to control how a value is
displayed in a cell and it could be that the person who originally
constructed the worksheet entered a value such as 12345 into the cell and
then applied a style so that it appears to contain 12,345.00. The best way
to resolve this issue is, in many ways, to simply remove the cells
formatting - but there is yet another issue, what if the cell was not
numeric at all and the user entered a String of text and then right aligned
it? So, can you look at the worksheet using Excel, find a few of these cells
and see what type they are and what format has been applied to them please
because, without the file at hand, it is hard to tell you how to proceed on
this issue. Finally, do you need to modify these cells in the workbook as it
would be trivial task to remove the comma when writing the value into the
CSV file.

Thirdly, the copying cell contents code, is it possible to say, in advance,
copy the contents of cell B16 into the range B17 to B25 for example? If so,
then it would be quite a straightforward task to automate the process
completely and simply pass in a parameter format in a way the code would
recognise. For example, and assuming running the code from the command line,
we could have -C "C10!C11:C20 D34!D67:D78" or -C "Linux!C10:C20" where the
former would take the contents of cell C10 and copy it into cells C11 to C20
and the latter would write the String Linux into cell C11 to C20.

Yours

Mark B



Luke_Devon wrote:
> 
> Hi Mark
> 
> I modified the code to removing the Summary sheet from my Excel workbook.
> It is working. Thanks for your guide and the support.( Herewith I have
> attached the code )
> 
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import org.apache.poi.ss.usermodel.Workbook;
> import org.apache.poi.ss.usermodel.WorkbookFactory;
> 
> public class SheetUtil {
> 
>    public static void main(String a[]){
>        new SheetUtil().deleteSheet();
>    }
> 
> 
>    public void deleteSheet(){
>        int sheetToDelete = -1;
> 
>        String path = "Book1.xls";
>        FileInputStream fis  = null;
>        Workbook workbook = null;
>        File file = null;
>        FileOutputStream fos = null;
>        try{
> 
>            file = new File(path);
> 
>            fis = new FileInputStream(file);
>            workbook = WorkbookFactory.create(fis);
> 
> 
> 
>            int numSheets = workbook.getNumberOfSheets();
> 
>            for(int i = 0; i < numSheets; i++) {
>                org.apache.poi.ss.usermodel.Sheet sheet =
> workbook.getSheetAt(i);
>                System.out.println(sheet.getSheetName());
>                if(sheet.getSheetName().equalsIgnoreCase("Summary")) {
>                    sheetToDelete = i;
>                    break;
> 
>                }
>            }
> 
> 
>            if(sheetToDelete > -1) {
>                workbook.removeSheetAt(sheetToDelete);
>            }
> 
>            fos = new FileOutputStream(file);
> 
>            workbook.write(fos);
> 
>        }catch(Exception e){
>            e.printStackTrace();
>        }finally{
>            try{
>                if(fis != null){
>                    fis.close();
>                    fis = null;
> 
>                }
>                if(fos != null){
>                    fos.close();
>                    fos = null;
> 
>                }
>            }catch(Exception ee){
>                ee.printStackTrace();
> 
>            }
>        }
>    }
> }
> 
> =====================================================
> 
> I have few more modifications , to be applied for the workbook. Can you
> please help me further?
> 
> I will list down you my other requirements.
> 
> After I removed the "Summary" sheet , there will be only one Sheet will be
> remain which is "Detail". 
> 
> In the Detail Sheet,
> 
> 1. In the Sheet , there is String called "Sum". This value also will be
> there more than once.
> 
> Once that value found , needed to be delete entire row respectively. Eg:
> If "Sum" is in , Cell H11, 11th row needed to be removed from the
> worksheet.
> 
> 2. Some Cells are containing values with "1000 separator (,)" -- Eg:
> 1,256.00 . I wanted to remove all thousand separators from the values
> which are exist in the Detail excel sheet. Eg: 1256.00 --- after removing
> the "," value should be like this.(1256.00)
> 
> 3. In B16 , There is a String called "Linux". This value i wanted to copy
> ( repeat ) until B25.Thats meant This value must be repeat from B17 to
> B25.( This is just like we copy a value along the column for more than one
> row )
> 
> Like that , in this worksheet , there will be lot of values to be copied.
> Hence I think we can not do it dynamically and we needed hard code those
> values in the code .
> 
> Thanks in advance
> Luke
> 
> 
> ________________________________
> From: MSB <ma...@tiscali.co.uk>
> To: user@poi.apache.org
> Sent: Friday, April 23, 2010 3:27:38
> Subject: Re: Convert XLS into CSV
> 
> 
> Luke
> 
> As you have only limited experience with Java, I have uploaded a
> distribution that you should be able to
> run from the command line. I know some companies restrict access to file
> sharing sites bu if you can use Rapidshare from your location, this is the
> link to the archive;
> 
> http://rapidshare.com/files/378927105/CSV_Conversion_Code.rar
> 
> and if you have any problems downloading the file, just let me know. It is
> quite large as it contains all of
> the POI archives that you will need to run the code but I am confident
> there
> is another way I could send it
> directly to you if Rapidshare is inaccessible.
> 
> If you are unfamiliar with Rapidshare, all you need to do is follow the
> link, choose the free download option
> and download the file to your machine. Once you have downloaded the
> archive
> called 
> CSV Conversion Code.rar, unzip it and you should have a folder with the
> name
> CSV Conversion Code. This
> folder contains a further folder called dist and a file by the name of
> Read
> Me.rtf.
> Open this rtf file (Word or WordPad will handle it successfully) and it
> will
> tell you what to do next.
> 
> Any problems, just send me a message.
> 
> Yours
> 
> Mark B
> 
> PS How did you get on with removing the Summary sheet from your Excel
> workbook?
> 
> 
> Luke_Devon wrote:
>> 
>> Hi Mark
>> 
>> First of all I would like to thank you for all your explanation which you
>> have explained very well regarding the code. It was really helpful to
>> understand the flow of the code and I think its nice to have such
>> audience
>> and explanation from an expert / one of the best,  in the industry for a
>> beginner like me. 
>> 
>> I was executing your final Code to convert XLS into CSV. Its compiled,
>> but
>> there was two messages , 
>> uses unchecked or unsafe operations.
>> Note: Recompile with -Xlint:unchecked for details.
>> 
>> I did google , for those messages and there was some directives , but as
>> of some forum says , it wont affect to the execution.
>> 
>> Finally , when I tried to execute it , 
>> 
>> java ToCSV d:\excel  d:\csv , 
>> 
>> [java] Usage: java ToCSV "Source Folder" "Destination Folder" "CSV
>> Element
>> Separator"
>> 
>> BUILD FAILED
>> Target "d:\excel" does not exist in the project "Inbox".
>> 
>> I tried to go through the code  , if there is any thing that i can
>> modified to solve the problem.but no luck.
>> 
>> Thanks in advance
>> Luke
>> 
>> 
>> ________________________________
>> From: MSB <ma...@tiscali.co.uk>
>> To: user@poi.apache.org
>> Sent: Tuesday, April 13, 2010 22:04:55
>> Subject: Re: Convert XLS into CSV
>> 
>> 
>> The code I have pasted into this message has been modified so that it is
>> possible to specify the path to and name of a folder containing one or
>> more
>> Excel workbooks and the code will now iterate through each of the
>> workbooks
>> in the folder and convert them into separate CSV files. The name of the
>> resulting CSV file will in each case be derived by taking the name of the
>> source Excel workbook, removing the .xls or xlsx extension and suffixing
>> the
>> .csv extension. Before looking at the code in any detail - most of it is
>> identical by the way - I need to explain how you can call and use this
>> class. Forgive me if you know any of this but I thought it safer to
>> assume
>> you might not be totally clear on the details.
>> 
>> When you run a Java class with a command like this;
>> 
>> java ToCSV
>> 
>> the runtime environment will locate the .class file, load and compile the
>> bytecode and then look inside the class for the main() method which, if
>> that
>> method is present and it contains any code at all, is where execution of
>> the
>> program will begin. If you look at the signature for the main() method,
>> you
>> will see that it looks something like this;
>> 
>> public static void main(String[] args)
>> 
>> ignoring the public, static and void keywords for the moment, take a look
>> at
>> what the brackets contain - String[] args. This is the declaration for a
>> variable called args which is able to hold a reference to an array where
>> each element of that array is an instance the java.lang.String class.
>> This
>> array will hold any parameters that were specified on the command line
>> when
>> the class was run. So, to modify the example above slightly, this;
>> 
>> java ToCSV C:/temp C:/temp/CSV
>> 
>> would result in a reference to an an array holding two Strings being
>> passed
>> to the args parameter; the first element would hold "C:/temp" and the
>> second
>> C:/temp/CSV. Well, more accurately, the elements would hold references to
>> instances of the String class that encapsulated the values "C:/temp" and
>> C:/temp/CSV.
>> 
>> The contents of the args parameter are accessible to any code within the
>> args method and so it is possible to extract the contents of the two
>> elements by simply referencing each using Java's standard notation; the
>> first element in the args array would be args[0] and the second args[1].
>> 
>> With this information, you can see how to run the ToCSV class from the
>> command line. All that you need do is specify the path to and name of
>> either
>> an individual Excel workbook or of a folder that contains Excel workbook,
>> the path to and name of the folder that should contain the CSV files and,
>> optionally, the character or string of characters that will separate each
>> element on the line within the CSV file. The class can be invoked from
>> the
>> command line in this manner;
>> 
>> java ToCSV "C:/temp/Excel Workbooks" "C:/temp/CSV Files"
>> 
>> or
>> 
>> java ToCSV C:/temp/Excel C:/temp/CSV ,
>> 
>> Note that the speach marks enclosing the paths in the first example are
>> only
>> necessary if there are embedded spaces within the parameter.
>> 
>> If you now take a look at the code in the main() method of the ToCSV
>> file,
>> you can see that it simply recovers the values from the args array and
>> uses
>> them to call the convertExcelToCSV() method on an instance of the ToCSV
>> class.
>> 
>> Equally, you can call this class from code within another Java class. If
>> you
>> choose to do this, it is an even simpler process; simply create an
>> instance
>> of the ToCSV class and then call either of the overloaded
>> convertExcelToCSV() methods. One accepts two parameters, the name of and
>> path to the source Excel workbook(s) and the name of and path to the
>> folder
>> the CSV files ought to be written to. If you call this method, it will be
>> assumed that the separtor is a comma. The other method accepts a third
>> parameter allowing you to define what the separator should be, so you
>> could
>> pass a colon for example. 
>> 
>> The process is a little like this;
>> 
>> ToCSV converter = new ToCSV();
>> converter.convertExcelToCSV("C:/temp/Excel Workbooks", "C:/temp/CSV
>> Files",
>> "!");
>> 
>> The folder that contains the workbooks can hold other files as well, you
>> do
>> not have to worry about copying the Excel workbooks to a special file for
>> processing. The ToCSV class is able to identify and select just Excel
>> workbook files that have extensions of '.xls' or '.xlsx' and it does this
>> by
>> using an instance of an inner class called ExcelFilenameFilter that you
>> can
>> see defined at the bottom of the source code, immediately following the
>> main() method. As you can see, it contains a single method - accept() -
>> that
>> contains a single line of code. This line of code will return the boolean
>> value true if the name of a file ends with either '.xls' or '.xlsx' and
>> an
>> instance of ExcelFilenameFilter can, therefore, be used, when the
>> contents
>> of the folder are read, to ensure that only Excel workbooks are
>> processed.
>> I
>> often use inner classes like this if they are very small and simple or if
>> they only make sense in the context of the enclosing class. By this, I
>> mean
>> asking yourslef if it is likely any other class may need similar
>> functionality.
>> 
>> Anayway, here is the code. Remember that it is only one possible solution
>> and you may need to modify how it works to suit your specific
>> requirement.
>> You may not be worried, for example, if all of the rows in the CSV file
>> contain different numbers of elements. If this is the case, the contents
>> of
>> the cells could be written out immediately after they were read and you
>> could completely remove the additional step of storing them in ArrayLists
>> to
>> ensure the matrix is square.
>> 
>> package postprocessxlsx;
>> 
>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>> import org.apache.poi.ss.usermodel.Workbook;
>> import org.apache.poi.ss.usermodel.Sheet;
>> import org.apache.poi.ss.usermodel.Row;
>> import org.apache.poi.ss.usermodel.Cell;
>> import org.apache.poi.ss.usermodel.DataFormatter;
>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>> 
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileWriter;
>> import java.io.BufferedWriter;
>> import java.io.FilenameFilter;
>> import java.io.IOException;
>> import java.io.FileNotFoundException;
>> import java.util.ArrayList;
>> 
>> /**
>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>> This
>> * class makes the following assumptions;
>> *
>> * 1. Where the Excel workbook contains more that one worksheet then a
>> single
>> *    CSV file will contain the data from all of the worksheets.
>> * 2. The data matrix contained in the CSV file will be square. This means
>> that
>> *    the number of elements in each row of the CSV file will match the
>> number
>> *    of cells in the longest row found in the Excel workbook. Any short
>> rows
>> *    will be 'padded' with empty elements - an empty elements is
>> represented in
>> *    the CSV file in this way ,,.
>> * 3. Empty elements will represent missing cells.
>> * 4. A row consisting of empty elements will be used to represent an
>> empty
>> row
>> *    in the Excel workbook.
>> *
>> * @author Mark B
>> * @version 1.10 12th April 2010
>> */
>> public class ToCSV {
>> 
>>     private Workbook workbook = null;
>>     private ArrayList<ArrayList> csvData = null;
>>     private int maxRowWidth = 0;
>>     private DataFormatter formatter = null;
>>     private FormulaEvaluator evaluator = null;
>>     private String separtor = null;
>> 
>>     public static final String CSV_FILE_EXTENSION = ".csv";
>>     public static final String DEFAULT_SEPARATOR = ",";
>> 
>>     /**
>>      * Process the contents of a folder, convert the contents of each
>> Excel
>>      * workbook into CSV format and save the resulting file to the
>> specified
>>      * folder using the same name as the original workbook with the .xls
>> or
>>      * .xlsx extension replaced by .csv
>>      *
>>      * @param source An instance of the String class that encapsulates
>> the
>>      *        name of and path to either a folder containing those Excel
>>      *        workbook(s) or the name of and path to an individual Excel
>> workbook
>>      *        that is/are to be converted.
>>      * @param destination An instance of the String class encapsulating
>> the
>> name
>>      *        of and path to a folder that will contain the resulting CSV
>> files.
>>      * @throws java.io.FileNotFoundException Thrown if any file cannot be
>> located
>>      *         on the filesystem during processing.
>>      * @throws java.io.IOException Thrown if the filesystem encounters
>> any
>>      *         problems during processing.
>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>> passed
>>      *         to the source parameters refers to a file or folder that
>> does
>> not
>>      *         exist and if the value passed to the destination paramater
>> refers
>>      *         to a folder that does not exist or simply does not refer
>> to
>> a
>>      *         folder that does not exist.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *         if the xml markup encounetered whilst parsing a
>> SpreadsheetML
>>      *         file (.xlsx) is invalid.
>>      */
>>     public void convertExcelToCSV(String strSource, String
>> strDestination)
>>                        throws FileNotFoundException, IOException,
>>                               IllegalArgumentException,
>> InvalidFormatException {
>> 
>>         // Simply chain the call to the overloaded
>> convertExcelToCSV(String,
>>         // String, String) method and pass the default separator to
>> ensure
>> this
>>         // String is used to separate elemnts on the line in the CSV
>> file.
>>         this.convertExcelToCSV(strSource, strDestination,
>> ToCSV.DEFAULT_SEPARATOR);
>>     }
>> 
>>     /**
>>      * Process the contents of a folder, convert the contents of each
>> Excel
>>      * workbook into CSV format and save the resulting file to the
>> specified
>>      * folder using the same name as the original workbook with the .xls
>> or
>>      * .xlsx extension replaced by .csv
>>      *
>>      * @param source An instance of the String class that encapsulates
>> the
>>      *        name of and path to either a folder containing those Excel
>>      *        workbook(s) or the name of and path to an individual Excel
>> workbook
>>      *        that is/are to be converted.
>>      * @param destination An instance of the String class encapsulating
>> the
>> name
>>      *        of and path to a folder that will contain the resulting CSV
>> files.
>>      * @param separator An instance of the String class encapsulating the
>>      *        characters or characters that should be used to separate
>> items
>>      *        on a line within the CSV file.
>>      * @throws java.io.FileNotFoundException Thrown if any file cannot be
>> located
>>      *         on the filesystem during processing.
>>      * @throws java.io.IOException Thrown if the filesystem encounters
>> any
>>      *         problems during processing.
>>      * @throws java.lang.IllegalArgumentException Thrown if the values
>> passed
>>      *         to the source parameters refers to a file or folder that
>> does
>> not
>>      *         exist and if the value passed to the destination paramater
>> refers
>>      *         to a folder that does not exist or simply does not refer
>> to
>> a
>>      *         folder that does not exist.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *         if the xml markup encounetered whilst parsing a
>> SpreadsheetML
>>      *         file (.xlsx) is invalid.
>>      */
>>     public void convertExcelToCSV(String strSource,
>>                                   String strDestination, String
>> separator)
>>                        throws FileNotFoundException, IOException,
>>                               IllegalArgumentException,
>> InvalidFormatException {
>>         File source = new File(strSource);
>>         File destination = new File(strDestination);
>>         File[] filesList = null;
>>         String destinationFilename = null;
>> 
>>         // Check that the source file/folder exists.
>>         if(!source.exists()) {
>>             throw new IllegalArgumentException("The source for the Excel
>> "
>> +
>>                     "file(s) cannot be found.");
>>         }
>> 
>>         // Ensure thaat the folder the user has chosen to save the CSV
>> files
>>         // away into firstly exists and secondly is a folder rather than,
>> for
>>         // instance, a data file.
>>         if(!destination.exists()) {
>>             throw new IllegalArgumentException("The folder/directory for
>> the
>> " +
>>                     "converted CSV file(s) does not exist.");
>>         }
>>         if(!destination.isDirectory()) {
>>             throw new IllegalArgumentException("The destination for the
>> CSV
>> " +
>>                     "file(s) is not a directory/folder.");
>>         }
>> 
>>         // Check to see if the sourceFolder variable holds a reference to
>>         // a file or a folder full of files.
>>         if(source.isDirectory()) {
>>             // Get a list of all of the Excel spreadsheet files
>> (workbooks)
>> in
>>             // the source folder/directory
>>             filesList = source.listFiles(new ExcelFilenameFilter());
>>         }
>>         else {
>>             // Assume that it must be a file handle - although there are
>> other
>>             // options the code should perhaps check - and store the
>> reference
>>             // into the filesList variable.
>>             filesList = new File[]{source};
>>         }
>> 
>>         // Step through each of the files in the source folder and for
>> each
>>         // open the workbook, convert it's contents to CSV format and
>> then
>>         // save the resulting file away into the folder specified by the
>>         // contents of the destination variable. Note that the name of
>> the
>>         // csv file will be created by taking the name of the Excel file,
>>         // removing the extension and replacing it with .csv. Note that
>> there
>>         // is one drawback with this approach; if the folder holding the
>> files
>>         // contains two workbooks whose names match but one is a binary
>> file
>>         // (.xls) and the other a SpreadsheetML file (.xlsx), then the
>> names
>>         // for both CSV files will be identical and one CSV file will,
>>         // therefore, over-write the other.
>>         for(File excelFile : filesList) {
>>             // Open the workbook
>>             this.openWorkbook(excelFile);
>> 
>>             // Convert it's contents into a CSV file
>>             this.convertToCSV();
>> 
>>             // Build the name of the csv folder from that of the Excel
>> workbook.
>>             // Simply replace the .xls or .xlsx file extension with .csv
>>             destinationFilename = excelFile.getName();
>>             destinationFilename = destinationFilename.substring(
>>                     0, destinationFilename.lastIndexOf(".")) + ".csv";
>> 
>>             // Save the CSV file away using the newly constricted file
>> name
>>             // and to the specified directory.
>>             this.saveCSVFile(new File(destination, destinationFilename),
>> separator);
>>         }
>>     }
>> 
>>     /**
>>      * Open an Excel workbook ready for conversion.
>>      *
>>      * @param file An instance of the File class that encapsulates a
>> handle
>>      *        to a valid Excel workbook. Note that the workbook can be in
>>      *        either binary (.xls) or SpreadsheetML (.xlsx) format.
>>      *
>>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
>> located.
>>      * @throws java.io.IOException Thrown if a problem occurs in the file
>> system.
>>      * @throws org.apache.poi.openxml4j.exceptions.InvalidFormatException
>> Thrown
>>      *         if invalid xml is found whilst parsing an input
>> SpreadsheetML
>>      *         file.
>>      */
>>     private void openWorkbook(File file) throws FileNotFoundException,
>>                                            IOException,
>> InvalidFormatException {
>>         FileInputStream fis = null;
>>         try {
>>             System.out.println("Opening workbook [" + file.getName() +
>> "]");
>> 
>>             fis = new FileInputStream(file);
>> 
>>             // Open the workbook and then create the FormulaEvaluator and
>>             // DataFormatter instances that will be needed to,
>> respectively,
>>             // force evaluation of forumlae found in cells and create a
>>             // formatted String encapsulating the cells contents.
>>             this.workbook = WorkbookFactory.create(fis);
>>             this.evaluator =
>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>             this.formatter = new DataFormatter();
>>         }
>>         finally {
>>             if(fis != null) {
>>                 fis.close();
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to convert the contents of the currently opened workbook
>> into
>>      * a CSV file.
>>      */
>>     private void convertToCSV() {
>>         Sheet sheet = null;
>>         Row row = null;
>>         int lastRowNum = 0;
>>         this.csvData = new ArrayList<ArrayList>();
>> 
>>         System.out.println("Converting files contents to CSV format.");
>> 
>>         // Discover how many sheets there are in the workbook....
>>         int numSheets = this.workbook.getNumberOfSheets();
>> 
>>         // and then iterate through them.
>>         for(int i = 0; i < numSheets; i++) {
>> 
>>             // Get a reference to a sheet and check to see if it contains
>>             // any rows.
>>             sheet = this.workbook.getSheetAt(i);
>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>> 
>>                 // Note down the index number of the bottom-most row and
>>                 // then iterate through all of the rows on the sheet
>> starting
>>                 // from the very first row - number 1 - even if it is
>> missing.
>>                 // Recover a reference to the row and then call another
>> method
>>                 // which will strip the data from the cells and build
>> lines
>>                 // for inclusion in the resylting CSV file.
>>                 lastRowNum = sheet.getLastRowNum();
>>                 for(int j = 0; j <= lastRowNum; j++) {
>>                     row = sheet.getRow(j);
>>                     this.rowToCSV(row);
>>                 }
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to actually save the data recovered from the Excel workbook
>>      * as a CSV file.
>>      *
>>      * @param file An instance of the File class that encapsulates a
>> handle
>>      *             referring to the CSV file.
>>      * @param separator An instance of the String class that encapsulates
>> the
>>      *                  character or character that ought to be used to
>> delimit
>>      *                  elements on the lines of the CSV file.
>>      * @throws java.io.FileNotFoundException Thrown if the file cannot be
>> found.
>>      * @throws java.io.IOException Thrown to indicate and error occurred
>> in
>> the
>>      *                             underylying file system.
>>      */
>>     private void saveCSVFile(File file, String separator)
>>                                      throws FileNotFoundException,
>> IOException {
>>         FileWriter fw = null;
>>         BufferedWriter bw = null;
>>         ArrayList<String> line = null;
>>         StringBuffer buffer = null;
>>         String csvLineElement = null;
>>         try {
>> 
>>             System.out.println("Saving the CSV file [" + file.getName() +
>> "]");
>> 
>>             // Open a writer onto the CSV file.
>>             fw = new FileWriter(file);
>>             bw = new BufferedWriter(fw);
>> 
>>             // Step through the elements of the ArrayList that was used
>> to
>> hold
>>             // all of the data recovered from the Excel workbooks'
>> sheets,
>> rows
>>             // and cells.
>>             for(int i = 0; i < this.csvData.size(); i++) {
>>                 buffer = new StringBuffer();
>> 
>>                 // Get an element from the ArrayList that contains the
>> data
>> for
>>                 // the workbook. This element will itself be an ArrayList
>>                 // containing Strings and each String will hold the data
>> recovered
>>                 // from a single cell. The for() loop is used to recover
>> elements
>>                 // from this 'row' ArrayList one at a time and to write
>> the
>> Strings
>>                 // away to a StringBuffer thus assembling a single line
>> for
>> inclusion
>>                 // in the CSV file. If a row was empty or if it was
>> short,
>> then
>>                 // the ArrayList that contains it's data will also be
>> shorter than
>>                 // some of the others. Therefore, it is necessary to
>> check
>> within
>>                 // the for loop to ensure that the ArrayList contains
>> data
>> to be
>>                 // processed. If it does, then an element will be
>> recovered
>> and
>>                 // appended to the StringBuffer.
>>                 line = this.csvData.get(i);
>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>                     if(line.size() > j) {
>>                         csvLineElement = line.get(j);
>>                         if(csvLineElement != null) {
>>                             buffer.append(csvLineElement);
>>                         }
>>                     }
>>                     if(j < (this.maxRowWidth - 1)) {
>>                         buffer.append(separator);
>>                     }
>>                 }
>> 
>>                 // Once the line is built, write it away to the CSV file.
>>                 bw.write(buffer.toString().trim());
>> 
>>                 // Condition the inclusion of new line characters so as
>> to
>>                 // avoid an additional, superfluous, new line at the end
>> of
>>                 // the file.
>>                 if(i < (this.csvData.size() - 1)) {
>>                     bw.newLine();
>>                 }
>>             }
>>         }
>>         finally {
>>             if(bw != null) {
>>                 bw.flush();
>>                 bw.close();
>>             }
>>         }
>>     }
>> 
>>     /**
>>      * Called to convert a row of cells into a line of data that can
>> later
>> be
>>      * output to the CSV file.
>>      *
>>      * Note that no tests have yet been conducted with blank cells or
>> those
>>      * containing formulae. Such may require latereations to the way this
>> code
>>      * works.
>>      *
>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>> that
>>      *            encapsulates information about a row of cells recovered
>> from
>>      *            an Excel workbook.
>>      */
>>     private void rowToCSV(Row row) {
>>         Cell cell = null;
>>         int lastCellNum = 0;
>>         ArrayList<String> csvLine = new ArrayList<String>();
>> 
>>         // Check to ensure that a row was recovered from the sheet as it
>> is
>>         // possible that one or more rows between other populated rows
>> could
>> be
>>         // missing - blank. If the row does contain cells then...
>>         if(row != null) {
>> 
>>             // Get the index for the right most cell on the row and then
>>             // step along the row from left to right recovering the
>> contents
>>             // of each cell, converting that into a formatted String and
>>             // then storing the String into the csvLine ArrayList.
>>             lastCellNum = row.getLastCellNum();
>>             for(int i = 0; i <= lastCellNum; i++) {
>>                 cell = row.getCell(i);
>>                 if(cell == null) {
>>                     csvLine.add("");
>>                 }
>>                 else {
>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>                        
>> csvLine.add(this.formatter.formatCellValue(cell));
>>                     }
>>                     else {
>>                         csvLine.add(this.formatter.formatCellValue(cell,
>> this.evaluator));
>>                     }
>>                 }
>>             }
>>             // Make a note of the index number of the right most cell.
>> This
>> value
>>             // will later be used to ensure that the matrix of data in
>> the
>> CSV file
>>             // is square.
>>             if(lastCellNum > this.maxRowWidth) {
>>                 this.maxRowWidth = lastCellNum;
>>             }
>>         }
>>         this.csvData.add(csvLine);
>>     }
>> 
>>     /**
>>      * The main() method contains code that demonstrates how to use the
>> class.
>>      * @param args
>>      */
>>     public static void main(String[] args) {
>>         // Check the number of arguments passed to the main method. There
>>         // must be two or three, the name of and path to either the
>> folder
>>         // containing the Excel files or an individula Excel workbook
>> that
>> is/are
>>         // to be converted, the name of and path to the folder to which
>> the
>> CSV
>>         // files should be written and then finally, optionally, the
>> separator
>>         // that should be used to separate individual items on the lines
>> in
>> the
>>         // CSV file. Note that the names of the CSV files will be derived
>> from
>>         // those of the Excel file(s). Put simply the .xls or .xlsx
>> extension
>>         // will be replaced with .csv.
>>         ToCSV converter = null;
>>         try {
>>             converter = new ToCSV();
>>             if(args.length == 2) {
>>                 converter.convertExcelToCSV(args[0], args[1]);
>>             }
>>             else if(args.length == 3){
>>                 converter.convertExcelToCSV(args[0], args[1], args[2]);
>>             }
>>             else {
>>                 System.out.println("Usage: java ToCSV \"Source Folder\" "
>> +
>>                         "\"Destination Folder\" \"CSV Element
>> Separator\"");
>>             }
>>         }
>>         catch(Exception ex) {
>>             System.out.println("Caught an: " + ex.getClass().getName());
>>             System.out.println("Message: " + ex.getMessage());
>>             System.out.println("Stacktrace follows:.....");
>>             ex.printStackTrace(System.out);
>>         }
>>     }
>> 
>>     /**
>>      * An instance of this class can be used to control the files
>> returned
>>      * be a call to the listFiles() method when made on an instance of
>> the
>>      * File class and that object refers to a folder/directory
>>      */
>>     class ExcelFilenameFilter implements FilenameFilter {
>> 
>>         /**
>>          * Determine those files that will be returned by a call to the
>>          * listFiles() method. In this case, the name of the file must
>> end
>> with
>>          * either of the following two extension; '.xls' or '.xlsx'
>>          * @param file An instance of the File class that encapsulates a
>> handle
>>          *             referring to the folder/directory that contains
>> the
>> file.
>>          * @param name An instance of the String class that encapsulates
>> the
>>          *             name of the file.
>>          * @return A boolean value that indicates whether the file should
>> be
>>          *         included in the array retirned by the call to the
>> listFiles()
>>          *         method. In this case true will be returned if the name
>> of
>> the
>>          *         file ends with either '.xls' or '.xlsx' and false will
>> be
>>          *         returned in all other instances.
>>          */
>>         public boolean accept(File file, String name) {
>>             return(name.endsWith(".xls") || name.endsWith(".xlsx"));
>>         }
>>     }
>> }
>> 
>> 
>> Luke_Devon wrote:
>>> 
>>> Hi Mark
>>> 
>>> I tested the code. It was fine. I could manage to convert XLS into CSV
>>> without any problem. But i have few more questions.
>>> 
>>> In this code , you have been hard coded the path where XLS and CSV
>>> located. and file names also hard coded.
>>> 
>>> How it would be pointing to a single directory and convert all XLS into
>>> CSV stored in the folder ? 
>>> 
>>> In the RAW XLS file , there might be some unwanted data to be converted
>>> into CSV. or some times we need to complete the blank cells in the excel
>>> sheet.
>>> 
>>> How can we do it , before convert into CSV ? 
>>> 
>>> Sorry about all those simple questions.( Since I'm not a programmer)
>>> 
>>> Thanks in advance
>>> Luke
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ________________________________
>>> From: MSB <ma...@tiscali.co.uk>
>>> To: user@poi.apache.org
>>> Sent: Friday, April 9, 2010 22:18:43
>>> Subject: Re: Convert XLS into CSV
>>> 
>>> 
>>> Hello Luke,
>>> 
>>> As promised, a bit of code that uses the usermodel to create CSV files.
>>> I
>>> have not had the opportunity to test it thoroughly and do expect there
>>> to
>>> be
>>> issues so do not use the code in a production environment until you have
>>> put
>>> it through the wringer so to speak. Also, you may find the performance a
>>> little slower than you expect, especially if you are using the newer xml
>>> based file format and have run the eventusermodel code that Nick wrote.
>>> 
>>> In essence, 'my' code simplt takes the contents of the workbook and
>>> converts
>>> it into an ArrayList of ArrayLists where each inner ArrayList contains
>>> zero,
>>> one or more Strings that describe the contents of a cell. I used this
>>> approach becuase it allows me to ensure that every row in the finished
>>> CVS
>>> file is the same length - with regard to the number of elements it
>>> contains
>>> - even if the input workbook contains rows that have varying numbers of
>>> cells on them. The code as it stands does evaluate any formulae that may
>>> be
>>> contained within cells and I hope will perform pretty much as you
>>> require.
>>> Take a look down into the main() method to see how it can be used; this
>>> method only shows the class being used to process a single file but an
>>> instance can be used to process more than one file in this manner;
>>> 
>>> ToCSV converter = new ToCSV();
>>> converter.openWorkbook("C:/temp/To CSV.xls");
>>> converter.convertToCSV();
>>> converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>> 
>>> converter.openWorkbook("C:/temp/Another To CSV.xlsx");
>>> converter.convertToCSV();
>>> converter.saveCSVFile("C:/temp/Second CSV.csv", ";");
>>> 
>>> 
>>> import org.apache.poi.ss.usermodel.WorkbookFactory;
>>> import org.apache.poi.ss.usermodel.Workbook;
>>> import org.apache.poi.ss.usermodel.Sheet;
>>> import org.apache.poi.ss.usermodel.Row;
>>> import org.apache.poi.ss.usermodel.Cell;
>>> import org.apache.poi.ss.usermodel.DataFormatter;
>>> import org.apache.poi.ss.usermodel.FormulaEvaluator;
>>> import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
>>> 
>>> import java.io.File;
>>> import java.io.FileInputStream;
>>> import java.io.FileWriter;
>>> import java.io.BufferedWriter;
>>> import java.io.IOException;
>>> import java.io.FileNotFoundException;
>>> import java.util.ArrayList;
>>> 
>>> /**
>>> * Demonstrate one way to convert an Excel spreadsheet into a CSV file.
>>> This
>>> * class makes the following assumptions;
>>> *
>>> * 1. Where the Excel workbook contains more that one worksheet then a
>>> single
>>> *    CSV file will contain the data from all of the worksheets.
>>> * 2. The data matrix contained in the CSV file will be square. This
>>> means
>>> that
>>> *    the number of elements in each row of the CSV file will match the
>>> number
>>> *    of cells in the longest row found in the Excel workbook. Any short
>>> rows
>>> *    will be 'padded' with empty elements - an empty elements is
>>> represented in
>>> *    the CSV file in this way ,,.
>>> * 3. Empty elements will represent missing cells.
>>> * 4. A row consisting of empty elements will be used to represent an
>>> empty
>>> row
>>> *    in the Excel workbook.
>>> *
>>> * @author Mark B
>>> * @version 1.00 9th April 2010
>>> */
>>> public class ToCSV {
>>> 
>>>     private Workbook workbook = null;
>>>     private ArrayList<ArrayList> csvData = null;
>>>     private int maxRowWidth = 0;
>>>     private DataFormatter formatter = null;
>>>     private FormulaEvaluator evaluator = null;
>>> 
>>>     /**
>>>      * Open an Excel workbook readt for conversion.
>>>      *
>>>      * @param filename An instance of the String class that encapsulates
>>> the
>>>      *                 path to and name of a valid Excel workbook. Note
>>> that
>>> the
>>>      *                 workbook can be either a binary (.xls) or
>>> SpreadsheetML
>>>      *                 (.xlsx) file.
>>>      *
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> located.
>>>      * @throws java.io.IOException Thrown if a problem occurs in the
>>> file
>>> system.
>>>      * @throws
>>> org.apache.poi.openxml4j.exceptions.InvalidFormatException
>>> Thrown
>>>      *     if invalid xml is found whilst parsing an input SpreadsheetML
>>> file.
>>>      */
>>>     public void openWorkbook(String filename) throws
>>> FileNotFoundException,
>>>                                            IOException,
>>> InvalidFormatException {
>>>         File file = null;
>>>         FileInputStream fis = null;
>>>         try {
>>>             file = new File(filename);
>>>             fis = new FileInputStream(file);
>>>             this.workbook = WorkbookFactory.create(fis);
>>>             this.evaluator =
>>> this.workbook.getCreationHelper().createFormulaEvaluator();
>>>             this.formatter = new DataFormatter();
>>>         }
>>>         finally {
>>>             if(fis != null) {
>>>                 fis.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert the contents of the currently opened workbook
>>> into
>>>      * a CSV file.
>>>      */
>>>     public void convertToCSV() {
>>>         Sheet sheet = null;
>>>         Row row = null;
>>>         int lastRowNum = 0;
>>>         this.csvData = new ArrayList<ArrayList>();
>>> 
>>>         // Discover how many sheets there are in the workbook....
>>>         int numSheets = this.workbook.getNumberOfSheets();
>>> 
>>>         // and then iterate through them.
>>>         for(int i = 0; i < numSheets; i++) {
>>> 
>>>             // Get a reference to a sheet and check to see if it
>>> contains
>>>             // any rows.
>>>             sheet = this.workbook.getSheetAt(i);
>>>             if(sheet.getPhysicalNumberOfRows() > 0) {
>>> 
>>>                 // Note down the index number of the bottom-most row and
>>>                 // then iterate through all of the rows on the sheet
>>> starting
>>>                 // from the very first row - number 1 - even if it is
>>> missing.
>>>                 // Recover a reference to the row and then call another
>>> method
>>>                 // which will strip the data from the cells and build
>>> lines
>>>                 // for inclusion in the resylting CSV file.
>>>                 lastRowNum = sheet.getLastRowNum();
>>>                 for(int j = 0; j <= lastRowNum; j++) {
>>>                     row = sheet.getRow(j);
>>>                     this.rowToCSV(row);
>>>                 }
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to actually save the data recovered from the Excel
>>> workbook
>>>      * as a CSV file.
>>>      *
>>>      * @param filename An instance of the String class that encapsulates
>>> the
>>>      *                path to and name of the CSV file.
>>>      * @param separator An instance of the String class that
>>> encapsulates
>>> the
>>>      *                  character or character that ought to be used to
>>> delimit
>>>      *                  elements on the lines of the CSV file.
>>>      * @throws java.io.FileNotFoundException Thrown if the file cannot
>>> be
>>> found.
>>>      * @throws java.io.IOException Thrown to indicate and error occurred
>>> in
>>> the
>>>      *                             underylying file system.
>>>      */
>>>     public void saveCSVFile(String filename, String separator)
>>>                                                    throws
>>> FileNotFoundException,
>>>                                                           IOException {
>>>         File file = null;
>>>         FileWriter fw = null;
>>>         BufferedWriter bw = null;
>>>         ArrayList<String> line = null;
>>>         StringBuffer buffer = null;
>>>         String csvLineElement = null;
>>>         try {
>>>             // Open a writer onto the CSV file.
>>>             file = new File(filename);
>>>             fw = new FileWriter(file);
>>>             bw = new BufferedWriter(fw);
>>> 
>>>             // Step through the elements of the ArrayList that was used
>>> to
>>> hold
>>>             // all of the data recovered from the Excel workbooks'
>>> sheets,
>>> rows
>>>             // and cells.
>>>             for(int i = 0; i < this.csvData.size(); i++) {
>>>                 buffer = new StringBuffer();
>>> 
>>>                 // Get an element from the ArrayList that contains the
>>> data
>>> for
>>>                 // the workbook. This element will itself be an
>>> ArrayList
>>>                 // containing Strings and each String will hold the data
>>> recovered
>>>                 // from a single cell. The for() loop is used to recover
>>> elements
>>>                 // from this 'row' ArrayList one at a time and to write
>>> the
>>> Strings
>>>                 // away to a StringBuffer thus assembling a single line
>>> for
>>> inclusion
>>>                 // in the CSV file. If a row was empty or if it was
>>> short,
>>> then
>>>                 // the ArrayList that contains it's data will also be
>>> shorter than
>>>                 // some of the others. Therefore, it is necessary to
>>> check
>>> within
>>>                 // the for loop to ensure that the ArrayList contains
>>> data
>>> to be
>>>                 // processed. If it does, then an element will be
>>> recovered
>>> and
>>>                 // appended to the StringBuffer.
>>>                 line = this.csvData.get(i);
>>>                 for(int j = 0; j < this.maxRowWidth; j++) {
>>>                     if(line.size() > j) {
>>>                         csvLineElement = line.get(j);
>>>                         if(csvLineElement != null) {
>>>                             buffer.append(csvLineElement);
>>>                         }
>>>                     }
>>>                     if(j < (this.maxRowWidth - 1)) {
>>>                         buffer.append(separator);
>>>                     }
>>>                 }
>>> 
>>>                 // Once the line is built, write it away to the CSV
>>> file.
>>>                 bw.write(buffer.toString().trim());
>>> 
>>>                 // Condition the inclusion of new line characters so as
>>> to
>>>                 // avoid an additional, superfluous, new line at the end
>>> of
>>>                 // the file.
>>>                 if(i < (this.csvData.size() - 1)) {
>>>                     bw.newLine();
>>>                 }
>>>             }
>>>         }
>>>         finally {
>>>             if(bw != null) {
>>>                 bw.flush();
>>>                 bw.close();
>>>             }
>>>         }
>>>     }
>>> 
>>>     /**
>>>      * Called to convert a row of cells into a line of data that can
>>> later
>>> be
>>>      * output to the CSV file.
>>>      *
>>>      * Note that no tests have yet been conducted with blank cells or
>>> those
>>>      * containing formulae. Such may require latereations to the way
>>> this
>>> code
>>>      * works.
>>>      *
>>>      * @param row An instance of either the HSSFRow or XSSFRo classes
>>> that
>>>      *            encapsulates information about a row of cells
>>> recovered
>>> from
>>>      *            an Excel workbook.
>>>      */
>>>     private void rowToCSV(Row row) {
>>>         Cell cell = null;
>>>         int lastCellNum = 0;
>>>         ArrayList<String> csvLine = new ArrayList<String>();
>>> 
>>>         // Check to ensure that a row was recovered from the sheet as it
>>> is
>>>         // possible that one or more rows between other populated rows
>>> could
>>> be
>>>         // missing - blank. If the row does contain cells then...
>>>         if(row != null) {
>>> 
>>>             // Get the index for the right most cell on the row and then
>>>             // step along the row from left to right recovering the
>>> contents
>>>             // of each cell, converting that into a formatted String and
>>>             // then storing the String into the csvLine ArrayList.
>>>             lastCellNum = row.getLastCellNum();
>>>             for(int i = 0; i <= lastCellNum; i++) {
>>>                 cell = row.getCell(i);
>>>                 if(cell == null) {
>>>                     csvLine.add("");
>>>                 }
>>>                 else {
>>>                     if(cell.getCellType() != Cell.CELL_TYPE_FORMULA) {
>>>                        
>>> csvLine.add(this.formatter.formatCellValue(cell));
>>>                     }
>>>                     else {
>>>                         csvLine.add(this.formatter.formatCellValue(cell,
>>> this.evaluator));
>>>                     }
>>>                 }
>>>             }
>>>             // Make a note of the index number of the right most cell.
>>> This
>>> value
>>>             // will later be used to ensure that the matrix of data in
>>> the
>>> CSV file
>>>             // is square.
>>>             if(lastCellNum > this.maxRowWidth) {
>>>                 this.maxRowWidth = lastCellNum;
>>>             }
>>>         }
>>>         this.csvData.add(csvLine);
>>>     }
>>> 
>>>     /**
>>>      * The main() method contains code that demonstrates how to use the
>>> class.
>>>      * @param args
>>>      */
>>>     public static void main(String[] args) {
>>>         try {
>>>             ToCSV converter = new ToCSV();
>>>             converter.openWorkbook("C:/temp/To CSV.xls");
>>>             converter.convertToCSV();
>>>             converter.saveCSVFile("C:/temp/First CSV.csv", ";");
>>>         }
>>>         catch(Exception ex) {
>>>             System.out.println("Caught an: " + ex.getClass().getName());
>>>             System.out.println("Message: " + ex.getMessage());
>>>             System.out.println("Stacktrace follows:.....");
>>>             ex.printStackTrace(System.out);
>>>         }
>>> 
>>>     }
>>> }
>>> 
>>> Test it out, have a good look through it and if there is anything you
>>> want
>>> to know just post to the list.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Luke_Devon wrote:
>>>> 
>>>> Hi Mark,
>>>> 
>>>> First of all I would like to thank you for the reply. 
>>>> 
>>>> Actually , currently I am using MS Office 2002. But I would like to use
>>>> the code for other latest versions also.
>>>> In my case , I dont want to do any validations for the EXCEL file ,
>>>> because I just wanted convert entire excel file into CSV.
>>>> Is there any simple java code available for such a basic requirement ?
>>>> 
>>>> anyway I'll try to use the code in the link which you have given to me
>>>> .
>>>> 
>>>> Thanks & Regards
>>>> 
>>>> Luke.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ________________________________
>>>> From: MSB <ma...@tiscali.co.uk>
>>>> To: user@poi.apache.org
>>>> Sent: Thursday, April 8, 2010 23:47:07
>>>> Subject: Re: Convert XLS into CSV
>>>> 
>>>> 
>>>> Hello Luke,
>>>> 
>>>> Which version of the Excel file format are you targetting, the older
>>>> binary
>>>> or newer xml based version? I ask because Nick wrote and contributed
>>>> some
>>>> code that can be used to convert the older binary files into csv. It
>>>> uses
>>>> the eventmodel and will seem quite complex on first acquaintance but
>>>> here
>>>> it
>>>> is;
>>>> 
>>>> http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
>>>> 
>>>> I know that other users have modified the code to, for example, output
>>>> diffenert worksheets to separate csv files.
>>>> 
>>>> Yours
>>>> 
>>>> Mark B
>>>> 
>>>> 
>>>> Luke_Devon wrote:
>>>>> 
>>>>> Hi 
>>>>> 
>>>>> I wanted to convert some XLS files into CSV. I found that apache.poi
>>>>> is
>>>>> the most perfect tool. Since I'm a beginner , i have no idea how to do
>>>>> that . Can some body help me please ? Do you have sample code for
>>>>> convert
>>>>> xls into csv ?
>>>>> 
>>>>> Thanks in Advance
>>>>> Luke
>>>>> 
>>>>> 
>>>>> 
>>>>>       Get your preferred Email name!
>>>>> Now you can @ymail.com and @rocketmail.com. 
>>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28180503.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>>       New Email names for you! 
>>>> Get the Email name you've always wanted on the new @ymail and
>>>> @rocketmail. 
>>>> Hurry before someone else does!
>>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28191046.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>>       Get your new Email address!
>>> Grab the Email name you've always wanted before someone else does!
>>> http://mail.promotions.yahoo.com/newdomains/aa/
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28230701.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28333557.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Convert-XLS-into-CSV-tp28175999p28373635.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org