When importing a text file, discard all lines starting with #

Posted 1 year ago

I have a text file that contains something like the following:

# blabla
x,y,z
x,y+1/8,z+7/8
x,y+1/4,z+3/4
# blabla
x,y+3/8,z+5/8
x,y+1/2,z+1/2
x,y+5/8,z+3/8
# blabla
x,y+3/4,z+1/4
x,y+7/8,z+1/8
x+1/8,y,z+7/8
x+1/8,y+1/8,z+3/4
x+1/8,y+1/4,z+5/8
x+1/8,y+3/8,z+1/2
x+1/8,y+1/2,z+3/8
x+1/8,y+5/8,z+1/4
x+1/8,y+3/4,z+1/8
x+1/8,y+7/8,z
# blabla
[...]
# blabla

When importing this file into Mathematica as follows, I would like to discard all lines starting with #:

Import["~/Desktop/docs/SG229me3", "Text"]

Any tips on how to achieve this?

Regards,
Zhao

POSTED BY: Hongyi Zhao
4 Replies
Posted 1 year ago

> You need to be careful of precedence. Compare

Thank you for pointing this out. Yes, it does the trick:

In[282]:= (Select[stringLines, Not@*StringStartsQ["#"]]//DeleteCases[#, ""]&)===(Select[stringLines, Not@*StringStartsQ["#"]]/. "" -> Nothing)

Out[282]= True
POSTED BY: Hongyi Zhao
Posted 1 year ago

> This method will leave an empty entry in the result:

There is a blank line in the data, so, yes, there will be a blank line in the result. We removed all lines starting with "#". A blank line does not start with "#", thus a blank line won't be removed.

Select[stringLines, Not@*StringStartsQ["#"]&& #=!=""&]

> In fact, the above code will give you the following strange result:

Yes, because you've mixed up your semantics. You've created a function that conjoins a function and a condition.
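
A minimal sketch of one way to apply both tests to the slot #, so that the pure function returns True or False for each line. The list lines is a hypothetical stand-in for stringLines, not the attached file.

```mathematica
(* Hypothetical sample data standing in for stringLines *)
lines = {"# blabla", "", "x,y,z", "# blabla", "x,y+1/8,z+7/8"};

(* Original form: the body is (Not@*StringStartsQ["#"]) && (# =!= ""),
   so the composed function is never applied to # and the test never yields True *)
broken = Select[lines, Not@*StringStartsQ["#"] && # =!= "" &];

(* Both conditions applied to the slot # *)
fixed = Select[lines, ! StringStartsQ[#, "#"] && # =!= "" &];
```

With this sample, broken is an empty list, while fixed keeps only the non-empty, non-comment lines.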

> What's even stranger is that I tried the different methods below, and their results were different:

You need to be careful of precedence. Compare

% === Select[stringLines, Not@*StringStartsQ["#"]] /. "" -> Nothing

with

% === (Select[stringLines, Not@*StringStartsQ["#"]] /. "" -> Nothing)
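
The difference comes from === binding more tightly than /. in the first form. A small sketch with hypothetical lists a and b (illustrative values only, not taken from the attached file):

```mathematica
(* Hypothetical data: b differs from a only by a leading empty string *)
a = {"x", "y"};
b = {"", "x", "y"};

(* Parsed as (a === b) /. "" -> Nothing: the comparison runs first and gives False,
   and the replacement rule then has nothing to act on *)
unparenthesized = a === b /. "" -> Nothing;

(* The replacement runs first, turning b into {"x", "y"}, and the comparison runs second *)
parenthesized = a === (b /. "" -> Nothing);
```

Here unparenthesized is False and parenthesized is True, even though the two lines differ only by parentheses.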
POSTED BY: Eric Rimbey
Posted 1 year ago

The following method will leave an empty item in the result:

Select[stringLines, Not@*StringStartsQ["#"]]

As shown below:

In[112]:= (*If you actually want the individual lines,*)

stringLines = Import[file, {"Text", "Lines"}];

(*You could use Select:*)

Select[stringLines, Not@*StringStartsQ["#"]][[1;;5]]

Out[113]= {"", "x,y,z", "x,y+1/8,z+7/8", "x,y+1/4,z+3/4", \
"x,y+3/8,z+5/8"}

I tried the following to remove the empty item above, but it did not work:

Select[stringLines, Not@*StringStartsQ["#"]&& #=!=""&]

In fact, the above code will give you the following strange result:

{}

What's even stranger is that I tried the different methods below, and their results were different:

In[249]:= Select[stringLines, Not@*StringStartsQ["#"]]//DeleteCases[#, ""]&;
%===Select[stringLines, Not@*StringStartsQ["#"]]/. "" -> Nothing

Out[250]= False

But the check below confirms that they are exactly the same:

In[278]:= listA=Select[stringLines, Not@*StringStartsQ["#"]]//DeleteCases[#, ""]&; 
listB=Select[stringLines, Not@*StringStartsQ["#"]]/. "" -> Nothing;
AllTrue[MapThread[SameQ, {listA, listB}], TrueQ]

Out[280]= True

I've attached the test file for reference.

Regards, Zhao

Attachments:
POSTED BY: Hongyi Zhao
Posted 1 year ago

If you import as one big string,

string = Import[pathToFile, "Text"]

Then you could use StringDelete:

StringDelete[string, RegularExpression["(?m)^#.*\n"]]
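
One caveat: the pattern above requires a newline after each comment, so a final comment line with no trailing newline would be left in place. A sketch that makes the newline optional, using a hypothetical in-memory string rather than the imported file:

```mathematica
(* Hypothetical sample text; the final comment line has no trailing newline *)
string = "# blabla\nx,y,z\n# blabla\nx,y+1/8,z+7/8\n# blabla";

(* (?m) lets ^ match at the start of every line; \\n? makes the newline optional *)
cleaned = StringDelete[string, RegularExpression["(?m)^#.*\\n?"]];
```

Whether this matters depends on how the file ends, which is an assumption here.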

If you actually want the individual lines,

stringLines = Import[pathToFile, {"Text", "Lines"}]

You could use Select:

Select[stringLines, Not@*StringStartsQ["#"]]
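
For illustration, here is how the composed predicate behaves on its own. The name keepQ and the sample list lines are hypothetical, introduced only for this sketch:

```mathematica
(* Hypothetical sample data, including one empty line *)
lines = {"# blabla", "x,y,z", "", "x,y+1/8,z+7/8"};

(* Not@*StringStartsQ["#"] composes Not with the operator form of StringStartsQ;
   it behaves like ! StringStartsQ[#, "#"] & *)
keepQ = Not@*StringStartsQ["#"];

keepQ["# blabla"]  (* False *)
keepQ["x,y,z"]     (* True *)

(* Only lines starting with # are dropped; the empty line is kept *)
kept = Select[lines, keepQ];
```

Note that an empty line does not start with #, so it passes the predicate and stays in the result.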
POSTED BY: Eric Rimbey