Message Boards Message Boards

GROUPS:

Import large Excel sheets?

Posted 4 months ago
877 Views
|
12 Replies
|
5 Total Likes
|

Hi friends. I am having troubles with importing relatively large Excel sheets --not huge, but rather large (60 000+ rows, 150 columns).

When I try to import the database as a whole, I get the following error:

Import::fmterr: Cannot import data as XLS format.

I separated the Excel file into several files, one per column. If I try to import them manually, I can, but if I try to import them programmatically (Map[Import, names], where names are the names of each column file) I get the same error messages,

I increased the available memory, to no avail

<< JLink`;
 InstallJava[];
 ReinstallJava[JVMArguments -> "-Xmx1024m"]

Any tips or assistance are very much appreciated

Francisco

12 Replies

Francisco,

I suggest saving the excel spreadsheet as a CSV file and import it into Mathematica from the CSV. Version 11.3 has some big CSV import improvements and there are new options that will handle any data format (ie strings with or without quotes, etc)

Regards

Neil

Posted 4 months ago

Hi Neil,

11.3 got some big excel updates as well (it no longer uses Java in 11.3), with even more improvements to come in the next version!

-S

Many thanks Neil. Will try your suggestion and inform if it works.

Posted 4 months ago

Hi Francisco,

What version of Mathematica are you using? Also please let me know what operating system you are using.

-S

Hi Sean. I am using 11.3, Windows 10. The error message I get is: Import::fmterr: Cannot import data as XLS format. $Failed

Posted 4 months ago

What is the exact path the files resides on? Does it perhaps contain a non-ASCII character?

If so, what if you moved it/renamed it to an ASCII-only path?

Posted 4 months ago

Like Kyle mentioned there are some minor bugs with non-ASCII characters on filepaths and perhaps in the file in XLS in 11.3 (only on Windows) which are absolutely being addressed in the next version.

Remember that increasing Java won't help in 11.3 since XLS/XLSX import no longer uses Java!

I agree that CSV is the best option. But I ended up having to import 5GB's of excel files (200-300MB each) and the latest Mathematica Import did it without a problem - no Java issues anymore.

That said, I also tried an Excel streamer with some success - the way I used it, it was slower than Import: https://github.com/ExcelDataReader/ExcelDataReader

Posted 4 months ago

We're working hard to make XLS and XLSX as robust of an option as CSV, so use whichever format best suits your needs!

I think the work on Import["*.xlsx"] is really good. I did report a bug in 11.3 where Import failed if there was a chart sheet before a data sheet. I have to manually move chart sheets to the end of the workbook before importing. It's been very reproducible and didn't happen in 11.2.

Posted 2 months ago

Hi Eric, keep your eyes peeled for the next version of Wolfram Language, the bug involving sheets with charts in XLS and XLSX has been solved, among many other important bug fixes.

Friends: success! I saved the file as csv and I could import it without problems. The morale: it is still easier in csv...

As some asked, indeed the Excel file had some columns with extra formatting and non-ascii characters, which must be the source of the problems

Thanks to all Francisco

Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract