Group Abstract Group Abstract

Message Boards Message Boards

How to import an XML file otherwise intended for Microsoft Excel?

Posted 12 years ago

I am unfamiliar with XML except to the extent that I can read XML files into MS Excel and then output them as XLS files which I can then read into Mathematica using the usual Import command. Done this way I get nice tables of all the tabs of the file in an easy to use list format.

However, I would like to access these files (that are made available several times a day) automatically through Mathematica.

When trying to Import the XML format I get objects and elements that suggest a standard Excel.Sheet template that Mathematica doesn't automatically recognize. The Short version of the file is given below:

XMLObject[ "Document"][{XMLObject["Declaration"]["Version" -> "1.0", 
   "Encoding" -> "UTF-8"], 
  XMLObject["ProcessingInstruction"]["mso-application", 
   "progid=\"Excel.Sheet\""]}, 
 XMLElement[
  "Workbook", {OutputSizeLimit`Skeleton[
   1]}, {XMLElement[
    "Worksheet", {{"urn:schemas-microsoft-com:office:spreadsheet", 
       "Name"} -> "ALL_BIDS"}, {XMLElement[
      "Table", {}, {OutputSizeLimit`Skeleton[1]}]}], 
   OutputSizeLimit`Skeleton[6], 
   XMLElement[
    "Worksheet", {OutputSizeLimit`Skeleton[
     1]}, {XMLElement["Table", {}, {OutputSizeLimit`Skeleton[1]}]}]}],
  {}

Can anyone suggest a simple way to automatically read this file into Mathematica? I suppose I could create some AppleScript (I'm on OS X - Yosemite and use Mathematica 10.) that Mathematica could call and that would in tern call Excel and direct it to do the conversion, but i would prefer to learn how to manipulate XML code in Mathematica.

Many thanks in advance for your suggestions, Steve

POSTED BY: Stephen Wilkus
2 Replies

Hi ,

I did a test with this code :

StringSplit[StringReplace[
  xml, "<" ~~ Except[">"] .. ~~ ">" -> ""], "\n"]

where , xml =

"<?xml version=\"1.0\"?>
<ss:Workbook xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">
    <ss:Worksheet ss:Name=\"Sheet1\">
        <ss:Table>
            <ss:Column ss:Width=\"80\"/>
            <ss:Column ss:Width=\"80\"/>
            <ss:Column ss:Width=\"80\"/>
            <ss:Row>
                <ss:Cell>
                   <ss:Data ss:Type=\"String\">First Name</ss:Data>
                </ss:Cell>
                <ss:Cell>
                   <ss:Data ss:Type=\"String\">Last Name</ss:Data>
                </ss:Cell>
                <ss:Cell>
                   <ss:Data ss:Type=\"String\">Phone Number</ss:Data>
                </ss:Cell>
            </ss:Row>
            <ss:Row>
                <ss:Cell>
                   <ss:Data ss:Type=\"String\">Nancy</ss:Data>
                </ss:Cell>
                <ss:Cell>
                   <ss:Data ss:Type=\"String\">Davolio</ss:Data>
                </ss:Cell>
                <ss:Cell>
                   <ss:Data ss:Type=\"String\">(206)555 9857</ss:Data>
                </ss:Cell>
            </ss:Row>
        </ss:Table>
    </ss:Worksheet>

</ss:Workbook>"

And the result is:

{"    ", "        ", "            ", "            ", "            ", \
"            ", "                ", "                   First Name", \
"                ", "                ", "                   Last \
Name", "                ", "                ", "                   \
Phone Number", "                ", "            ", "            ", "  \
              ", "                   Nancy", "                ", "    \
            ", "                   Davolio", "                ", "    \
            ", "                   (206)555 9857", "                \
", "            ", "        ", "    "}

After that , you just need to clean the list easily.

POSTED BY: Marcelo De Cicco
Posted 12 years ago
POSTED BY: Gustavo Delfino
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard