| <!DOCTYPE html> |
| <html lang="en"> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| http://www.apache.org/licenses/LICENSE-2.0 |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <head> |
| <meta charset="utf-8" /> |
| <title>ConvertExcelToCSVProcessor</title> |
| <style> |
| table { |
| border-collapse: collapse; |
| } |
| |
| table, th, td { |
| border: 1px solid #ccc; |
| } |
| |
| td.r { |
| text-align: right; |
| } |
| |
| td { |
| width: 50px; |
| padding: 5px; |
| } |
| </style> |
| <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css" /> |
| </head> |
| |
| <body> |
| <h2>How it extracts CSV data from a sheet</h2> |
| <p> |
| ConvertExcelToCSVProcessor extracts CSV data with following rules: |
| </p> |
| <ul> |
| <li>Find the fist cell which has a value in it (the FirstCell).</li> |
| <li>Scan cells in the first row, starting from the FirstCell, |
| until it reaches to a cell after which no cell with a value can not be found in the row (the FirstRowLastCell).</li> |
| <li>Process the 2nd row and later, from the column of FirstCell to the column of FirstRowLastCell.</li> |
| <li>If a row does not have any cell that has a value, then the row is ignored.</li> |
| </ul> |
| |
| <p> |
| As an example, the sheet shown below will be: |
| </p> |
| |
| <table> |
| <tbody> |
| <tr><th>row </th><th>A</th><th>B</th><th>C</th><th>D</th><th>E</th><th>F</th><th>G</th></tr> |
| <tr><td class="r"> 1</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr> |
| <tr><td class="r"> 2</td><td> </td><td> </td><td>x</td><td>y</td><td>z</td><td> </td><td> </td></tr> |
| <tr><td class="r"> 3</td><td> </td><td> </td><td>1</td><td> </td><td> </td><td> </td><td> </td></tr> |
| <tr><td class="r"> 4</td><td>2</td><td> </td><td> </td><td>3</td><td> </td><td> </td><td> </td></tr> |
| <tr><td class="r"> 5</td><td> </td><td> </td><td> </td><td> </td><td>4</td><td> </td><td> </td></tr> |
| <tr><td class="r"> 6</td><td> </td><td> </td><td>5</td><td>6</td><td>7</td><td> </td><td> </td></tr> |
| <tr><td class="r"> 7</td><td> </td><td> </td><td> </td><td> </td><td> </td><td>8</td><td> </td></tr> |
| <tr><td class="r"> 8</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr> |
| <tr><td class="r"> 9</td><td> </td><td> </td><td> </td><td> </td><td>9</td><td> </td><td> </td></tr> |
| <tr><td class="r"> 10</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr> |
| <tr><td class="r"> 11</td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td><td> </td></tr> |
| </tbody> |
| </table> |
| |
| <p> |
| converted to following CSV: |
| </p> |
| |
| <pre> |
| x,y,z |
| 1,, |
| ,3, |
| ,,4 |
| 5,6,7 |
| ,,9 |
| </pre> |
| |
| <ul> |
| <li>C2(x) is the FirstCell, and E2(z) is the FirstRowLastCell.</li> |
| <li>A4(2) is ignored because it is out of range. So is F7(8).</li> |
| <li>Row 7 and 8 are ignored because those do not have a valid cell.</li> |
| <li>It is important to have a header row as shown in the example to define data area, |
| especially when a sheet includes empty cells.</li> |
| </ul> |
| |
| </body> |
| </html> |