How is lower case "c" interpreted when importing CSV files?

Posted 3 years ago

If I have a simple CSV file containing digits followed by a lower case "c", the values I would expect to be imported as strings are imported as integers instead. Here is an example:

If "test.csv" contains:

9a, 9b, 9C, 9d

10a, 10b, 10c, 10d

11a, 11b, 11c, 11d

12a, 12b, 12c, 12d

then

Import["test.csv"]

gives

{{"9a", "9b", "9C", "9d"}, {"10a", "10b", 10, "10d"}, {"11a", "11b", 
11, "11d"}, {"12a", "12b", 12, "12d"}}

Why is the suffix "c" causing the import to interpret the string as an integer?

POSTED BY: Andrew
4 Replies
Posted 3 years ago

Well, that makes sense (no pun intended). Thanks for the explanation Sean.

POSTED BY: Andrew
Posted 3 years ago

The value is interpreted as the number 10 because Import treats "c" as the currency token for cents. Importing the "RawData" element, which does no processing at all, is similar to using the option "Numeric" -> False. That is fine if you only want strings, but if you want values like "10" to be interpreted as numbers while "10c" stays a string, you can use "CurrencyTokens" -> None.

Or, if you wanted a dataset, this should work for you:

`Import["myData.csv", "Dataset", HeaderLines -> 1, "CurrencyTokens" -> None]`

And again, if you just want strings, "Numeric" -> False can replace "CurrencyTokens" -> None.
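
To compare the options mentioned in this reply side by side, here is a minimal sketch using ImportString, so no file is needed. The sample string is a made-up stand-in for test.csv, and the option names are the ones given in this reply; the exact results may depend on your version, so no output is shown.

```wolfram
(* Hypothetical sample data standing in for test.csv *)
csv = "10a,10b,10c,10d\n10,20,30,40";

(* Default import: per the explanation above, "c" is read as a currency token *)
ImportString[csv, "CSV"]

(* Currency tokens disabled: "10c" is expected to stay a string,
   while plain numbers such as 10 are still interpreted as numbers *)
ImportString[csv, "CSV", "CurrencyTokens" -> None]

(* Numeric interpretation turned off: every field is expected to stay a string *)
ImportString[csv, "CSV", "Numeric" -> False]
```

Evaluating the three calls one after another should make it easy to see which fields change type under each option.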

POSTED BY: Sean Cheren
Posted 3 years ago

Thanks Rohit. Importing as "RawData" is definitely a workaround. I'm ultimately trying to import a CSV file with a header line as a "Dataset":

myData = Import["myData.csv", "Dataset", HeaderLines -> 1]

Casting the workaround as a dataset does not give the correct result:

myData = Dataset[Import["myData.csv", "RawData"]]

because the header line is treated as a data row. I'm sure the "RawData" import could be manipulated to create the necessary associations to produce the dataset I want, but that seems to be such unnecessary work for such a simple import.
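
For reference, the manual route described above could look roughly like the following sketch. It assumes the first row of myData.csv is the header and that every data row has as many fields as the header; the names raw, header, and rows are only illustrative.

```wolfram
(* Import every field as an unprocessed string *)
raw = Import["myData.csv", "RawData"];

(* Assumption: the first row holds the column names *)
header = First[raw];
rows = Rest[raw];

(* Pair each data row with the header to get one association per row *)
myData = Dataset[AssociationThread[header, #] & /@ rows]
```

All values stay strings with this approach, so any numeric columns would still need to be converted afterwards.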

POSTED BY: Andrew
Posted 3 years ago

Hi Andrew,

> Why is the suffix "c" causing the import to interpret the string as an integer?

I don't know the answer, but I have seen Import interpret CSV values and incorrectly convert them from strings to another type. Very annoying. A workaround is to import the "RawData" element:

Import["test.csv", "RawData"]
POSTED BY: Rohit Namjoshi