How can I stop Mathematica from treating strings as numbers in scientific notation?

Posted 2 months ago

Hello everyone,

I have a large CSV data file that includes a variable that is an alphanumeric ID (CUSIP) for financial securities. When I import the file, Mathematica reads a CUSIP such as 0452E105 as scientific notation, that is, as the number 452 × 10^105, which it is not. How can I stop Mathematica from doing this?

POSTED BY: Gregory Lypny
5 Replies
Posted 2 months ago
POSTED BY: David Keith

In Version 14.2, together with the Tabular object, we introduced the "Schema" element and option for CSV Import. By default, Import interprets 0452E105 as a number:

In[1]:= ImportString["1, -2.34, 0452E105", "CSV"]

Out[1]= {{1, -2.34, 4.52*10^107}}

We can get the TabularSchema object using:

In[2]:= schema = ImportString["1, -2.34, 0452E105", {"CSV", "Schema"}]

Out[2]= TabularSchema[<|"ColumnProperties" -> {<|"ElementType" -> "Integer64"|>, 
<|"ElementType" -> "Real64"|>, <|"ElementType" -> "Real64"|>}, 
  "KeyColumns" -> None, "Backend" -> "WolframKernel"|>]

Notice the type of the third column is "Real64". To change it to "String" we need to create a new TabularSchema and pass it to Import:

In[3]:= schema2 = TabularSchema[<|"ColumnProperties" -> {<|"ElementType" -> 
       "Integer64"|>, <|"ElementType" -> "Real64"|>, <|"ElementType" ->
        "String"|>}|>]

Out[3]= TabularSchema[<|"ColumnProperties" -> {<|"ElementType" -> "Integer64"|>,
 <|"ElementType" -> "Real64"|>, <|"ElementType" -> "String"|>}|>]

In[4]:= ImportString["1, -2.34, 0452E105", "CSV", "Schema" -> schema2]

Out[4]= {{1, -2.34, " 0452E105"}}

To work with large CSV data, I strongly suggest importing it as "Tabular" instead of "Data":

In[5]:= ImportString["1, -2.34, 0452E105", {"CSV", "Tabular"}, "Schema" -> schema2] // TabularQ

Out[5]= True
POSTED BY: Piotr Wendykier
Posted 2 months ago
POSTED BY: Gregory Lypny

Set the "Numeric" option to False.
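
A minimal sketch of this suggestion, using the sample row from earlier in the thread. It assumes the "Numeric" option is supported by the CSV importer in your version. Because the option applies to every field, the numeric columns come back as strings too, so the sketch converts them afterwards; the column positions (1 and 2 numeric, 3 the ID) are an assumption made for illustration.

```wolfram
(* Sample row from the thread; the third field is the CUSIP. *)
csv = "1, -2.34, 0452E105";

(* "Numeric" -> False asks the CSV importer not to convert any field to a number. *)
rows = ImportString[csv, "CSV", "Numeric" -> False];

(* Remove stray whitespace around each field. *)
clean = Map[StringTrim, rows, {2}];

(* Assumption: columns 1 and 2 are numeric and column 3 is the ID. *)
(* ToExpression evaluates its input, so use it only on trusted data. *)
converted = {ToExpression[#1], ToExpression[#2], #3} & @@@ clean;

(* The ID column should still be a string. *)
StringQ[converted[[1, 3]]]
```

For a file on disk, the same option can be passed to Import in place of ImportString.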

POSTED BY: Carl Verdon

Among the CSV import elements, I see "RawData", which imports the file as a "two-dimensional array of strings". Have you tried it?
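
A minimal sketch of how that element might be requested, assuming "RawData" is available for CSV in your version and returns strings as the documentation describes. The sample row is the one used earlier in the thread.

```wolfram
(* Sample row from the thread; the third field is the CUSIP. *)
csv = "1, -2.34, 0452E105";

(* Request the "RawData" element instead of the default interpreted data. *)
raw = ImportString[csv, {"CSV", "RawData"}];

(* Check that the result is a list of rows containing only strings. *)
MatchQ[raw, {{__String} ..}]
```

As with any all-strings import, the numeric columns would need to be converted separately afterwards.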

POSTED BY: Gianluca Gorni