When importing a text file, discard all lines starting with #

Posted 2 years ago

I have a text file that contains something like the following:

# blabla
x,y,z
x,y+1/8,z+7/8
x,y+1/4,z+3/4
# blabla
x,y+3/8,z+5/8
x,y+1/2,z+1/2
x,y+5/8,z+3/8
# blabla
x,y+3/4,z+1/4
x,y+7/8,z+1/8
x+1/8,y,z+7/8
x+1/8,y+1/8,z+3/4
x+1/8,y+1/4,z+5/8
x+1/8,y+3/8,z+1/2
x+1/8,y+1/2,z+3/8
x+1/8,y+5/8,z+1/4
x+1/8,y+3/4,z+1/8
x+1/8,y+7/8,z
# blabla
[...]
# blabla

When importing this file into Mathematica as follows, I would like to discard all lines starting with #:

Import["~/Desktop/docs/SG229me3", "Text"]

Is there a way to achieve this?

Regards,
Zhao

POSTED BY: Hongyi Zhao
4 Replies
Posted 2 years ago

"You need to be careful of precedence. Compare [...]"

Thank you for pointing this out. Yes, it does the trick:

In[282]:= (Select[stringLines, Not@*StringStartsQ["#"]]//DeleteCases[#, ""]&)===(Select[stringLines, Not@*StringStartsQ["#"]]/. "" -> Nothing)

Out[282]= True
POSTED BY: Hongyi Zhao
Posted 2 years ago
POSTED BY: Eric Rimbey
Posted 2 years ago

The following method will leave an empty item in the result:

Select[stringLines, Not@*StringStartsQ["#"]]

As shown below:

In[112]:= (* If you actually want the individual lines, *)
stringLines = Import[file, {"Text", "Lines"}];

(* you could use Select: *)
Select[stringLines, Not@*StringStartsQ["#"]][[1 ;; 5]]

Out[113]= {"", "x,y,z", "x,y+1/8,z+7/8", "x,y+1/4,z+3/4", "x,y+3/8,z+5/8"}

I tried the following to remove the empty item, but it did not work:

Select[stringLines, Not@*StringStartsQ["#"]&& #=!=""&]

In fact, the code above returns an empty list:

{}
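
A likely explanation for the empty list: inside the pure function, Not@*StringStartsQ["#"] is an operator that is never applied to the slot #, so the body never evaluates to True and Select keeps nothing. The sketch below, using a small hypothetical list (sampleLines) in place of the imported file, contrasts that predicate with one that applies the test to # explicitly:

```mathematica
(* Hypothetical sample standing in for the imported lines *)
sampleLines = {"", "# blabla", "x,y,z", "x,y+1/8,z+7/8", "# blabla"};

(* Predicate from the post: the operator form is not applied to #,
   so the body is never True and nothing is selected *)
brokenResult = Select[sampleLines, Not@*StringStartsQ["#"] && # =!= "" &]
(* {} *)

(* Apply StringStartsQ to the slot explicitly *)
fixedResult = Select[sampleLines, ! StringStartsQ[#, "#"] && # =!= "" &]
(* {"x,y,z", "x,y+1/8,z+7/8"} *)
```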

Even stranger, the two methods below appear to give different results:

In[249]:= Select[stringLines, Not@*StringStartsQ["#"]]//DeleteCases[#, ""]&;
%===Select[stringLines, Not@*StringStartsQ["#"]]/. "" -> Nothing

Out[250]= False
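
The False here is consistent with operator precedence: ReplaceAll (/.) binds more loosely than SameQ (===), so the comparison runs first and the rule is then applied to its result. A minimal sketch with hypothetical sample data (sampleLines, cleaned) shows the difference that parentheses make:

```mathematica
(* Hypothetical sample data *)
sampleLines = {"", "x,y,z", "x,y+1/8,z+7/8"};
cleaned = DeleteCases[sampleLines, ""];

(* Inspect how the unparenthesized form is parsed *)
FullForm[Hold[a === b /. c -> d]]
(* Hold[ReplaceAll[SameQ[a, b], Rule[c, d]]] *)

(* Parsed as (cleaned === sampleLines) /. ("" -> Nothing) *)
withoutParens = cleaned === sampleLines /. "" -> Nothing
(* False *)

(* With parentheses, the replacement happens before the comparison *)
withParens = cleaned === (sampleLines /. "" -> Nothing)
(* True *)
```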

But the check below confirms that they are exactly the same:

In[278]:= listA=Select[stringLines, Not@*StringStartsQ["#"]]//DeleteCases[#, ""]&; 
listB=Select[stringLines, Not@*StringStartsQ["#"]]/. "" -> Nothing;
ForAll[MapThread[SameQ, {listA, listB}],True]

Out[280]= True
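
Note that ForAll is a symbolic quantifier (ForAll[x, expr]) rather than a test over list elements, so it is not the usual tool for this check. A more conventional comparison, sketched here with hypothetical stand-ins for listA and listB, uses SameQ directly or AllTrue on the element-wise results:

```mathematica
(* Hypothetical stand-ins for the two filtered lists *)
listA = {"x,y,z", "x,y+1/8,z+7/8"};
listB = {"x,y,z", "x,y+1/8,z+7/8"};

(* Direct structural comparison of the whole lists *)
sameWhole = listA === listB
(* True *)

(* Element-wise comparison; MapThread needs lists of equal length *)
sameElements = Length[listA] == Length[listB] &&
  AllTrue[MapThread[SameQ, {listA, listB}], TrueQ]
(* True *)
```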

I've attached the test file for reference.

Regards, Zhao

POSTED BY: Hongyi Zhao
Posted 2 years ago

If you import as one big string,

string = Import[pathToFile, "Text"]

Then you could use StringDelete:

StringDelete[string, RegularExpression["(?m)^#.*\n"]]
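
One thing to watch with this pattern: it requires a newline after the comment, so a comment on the last line of a string with no trailing newline would be left in place. The sketch below uses a hypothetical sample string (sampleText) and a variant of the pattern with an optional newline; this is one possible adjustment, not the only one:

```mathematica
(* Hypothetical sample; the final comment line has no trailing newline *)
sampleText = "# blabla\nx,y,z\n# blabla\nx,y+1/8,z+7/8\n# blabla";

(* Pattern as above: the last comment survives *)
strictResult = StringDelete[sampleText, RegularExpression["(?m)^#.*\n"]]
(* "x,y,z\nx,y+1/8,z+7/8\n# blabla" *)

(* Variant with an optional newline: the last comment is removed too *)
lenientResult = StringDelete[sampleText, RegularExpression["(?m)^#.*\n?"]]
(* "x,y,z\nx,y+1/8,z+7/8\n" *)
```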

If you actually want the individual lines,

stringLines = Import[pathToFile, {"Text", "Lines"}]

You could use Select:

Select[stringLines, Not@*StringStartsQ["#"]]
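
If the file also contains blank lines, the Select above keeps them as empty strings. As an alternative sketch, DeleteCases can drop both comment lines and empty strings in one pass. The sample text and the names sampleText, sampleLines and dataLines are hypothetical, and StringSplit stands in for the import step:

```mathematica
(* Hypothetical sample with a blank line and comment lines *)
sampleText = "# blabla\n\nx,y,z\nx,y+1/8,z+7/8\n# blabla\nx,y+1/4,z+3/4";

(* Stand-in for Import[pathToFile, {"Text", "Lines"}] *)
sampleLines = StringSplit[sampleText, "\n"];

(* Remove empty strings and any string starting with "#" *)
dataLines = DeleteCases[sampleLines, "" | _String?(StringStartsQ["#"])]
(* {"x,y,z", "x,y+1/8,z+7/8", "x,y+1/4,z+3/4"} *)
```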
POSTED BY: Eric Rimbey