FindList AND Rule to find lines with ALL (not just any) Specified Strings

Posted 11 years ago
POSTED BY: Bob Stephens
8 Replies
Posted 11 years ago

Yes, I like this approach; it is much closer to adding conditional logic to FindList.

thanks!

POSTED BY: Bob Stephens

An intermediate approach: use FindList as far as possible, then continue filtering with Select.

In[64]:= Select[
  StringSplit[#, ",", All] & /@ FindList["test_file.csv", {"Type3,"}],
   StringMatchQ[#[[3]], "1"] || StringMatchQ[#[[3]], "6"] &] // Length

Out[64]= 598
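
For the literal question in the thread title (lines containing ALL of several strings), the same filter-after-reading idea can be sketched as below. This is only an illustration: `containsAll` is a hypothetical helper name, and the sample lines other than the first are invented so the snippet runs without the attached file.

```wolfram
(* Hypothetical helper: True only if every substring in needles occurs in line *)
containsAll[line_String, needles_List] :=
  AllTrue[needles, StringContainsQ[line, #] &]

(* Sample lines in the format used in this thread; the FAIL row and the
   Type2 row are invented for illustration *)
lines = {
   "Type3,OK,1,-44.57,-40.68",
   "Type3,FAIL,1,-44.60,-41.83",
   "Type2,OK,6,-44.61,-39.72"};

(* Keep only the lines that contain "Type3," AND ",OK," *)
Select[lines, containsAll[#, {"Type3,", ",OK,"}] &]
(* {"Type3,OK,1,-44.57,-40.68"} *)
```

With a file, `lines` could be replaced by `FindList["test_file.csv", "Type3,"]`, so that FindList does the first cut and Select enforces the remaining conditions.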
POSTED BY: Udo Krause
Posted 11 years ago

OK, thank you very much. This solution would address scenarios with larger files (larger than the one I used in this example). I'm not sure I understand the 80's comment. In any case, my remark that it would be nice if this logic could be applied as a rule to FindList was about incremental reads, but also about getting a cleaner (and simpler) solution by adding some conditional logic to FindList.

Thanks again for your responses.

POSTED BY: Bob Stephens

OMG, you take me back to the eighties... The following works, but seems by no means faster than the CSV Import. Have fun with it nevertheless.

Clear[stephensReader]
stephensReader[s_String, a1_String, v1_Integer:1, v2_Integer:6] := 
 Module[{str, r, o = 0, oo = 0, n },
   If[!FileExistsQ[s],
    Print["Sorry, cannot find file \"", s, "\". Bye."];
    Return[$Failed]
   ];
   str = OpenRead[s, BinaryFormat -> True];
   While[True,
    r = ReadLine[str];
    If[r === EndOfFile, Break[], o++]; (* === always yields True or False, so no fourth branch is needed *)
    r = StringSplit[r, ",", All];
    n = ToExpression[r[[3]]];
    If[StringMatchQ[r[[1]], a1] && ((n == v1) || (n == v2)),
     oo++;
     Print[r]
    ]
   ];
   Close[str];
   Print["Lines read: ", o, " Lines selected: ", oo]
  ] /; StringLength[s] > 0 && StringLength[a1] > 0


stephensReader["test_file.csv", "Type3"]

{Type3,OK,1,-44.57,-40.68}
{Type3,OK,6,-44.6,-41.83}
{Type3,OK,1,-44.61,-39.72}
{Type3,OK,6,-44.53,-41.44}
{Type3,OK,1,-44.56,-39.92}
{Type3,OK,6,-44.58,-40.83}
{Type3,OK,1,-44.54,-41.47}
{Type3,OK,6,-44.51,-41.17}
{Type3,OK,1,-44.56,-39.89}
<snip>
{Type3,OK,6,-44.71,-41.36}
{Type3,OK,1,-44.76,-40.94}
{Type3,OK,6,-44.78,-41.68}
{Type3,OK,1,-44.79,-39.98}
{Type3,OK,6,-44.75,-42.75}
{Type3,OK,1,-44.8,-40.52}
{Type3,OK,6,-44.8,-41.7}

Lines read: 6327 Lines selected: 598

P.S.: Printing the rows is for demonstration only. To be useful, stephensReader should, as usual, return the list of matching rows:

Clear[stephensReader]
stephensReader[s_String, a1_String, v1_Integer: 1, v2_Integer: 6] := 
 Module[{str, r, o = 0, oo = 0, n, resL = {}},
   If[! FileExistsQ[s],
    Print["Sorry, cannot find file \"", s, "\". Bye."];
    Return[$Failed]
   ];
   str = OpenRead[s, BinaryFormat -> True];
   While[True,
    r = ReadLine[str];
    If[r === EndOfFile, Break[], o++]; (* === always yields True or False, so no fourth branch is needed *)
    r = StringSplit[r, ",", All];
    n = ToExpression[r[[3]]];
    If[StringMatchQ[r[[1]], a1] && ((n == v1) || (n == v2)),
     oo++;
     resL = Join[resL, {r}];
    ]
   ];
   Close[str];
   Print["Lines read: ", o, "| Lines selected: ", oo];
   resL
 ] /; StringLength[s] > 0 && StringLength[a1] > 0

In[65]:= stephensReader["test_file.csv", "Type3"] // Length
During evaluation of In[65]:= Lines read: 6327| Lines selected: 598
Out[65]= 598
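
A variation on the same line-by-line reader, offered only as a sketch: it collects rows with Sow/Reap instead of growing a list with Join, compares the index field as a string rather than calling ToExpression, and skips lines with fewer than three fields. The name `readMatchingRows` is hypothetical, and StringToStream with invented sample rows stands in for the file so the snippet is self-contained.

```wolfram
(* Hypothetical name; reads a stream line by line and returns the rows whose
   first field equals type and whose third field is one of indices *)
readMatchingRows[stream_InputStream, type_String, indices_List] :=
  Module[{line, fields},
   Flatten[
    Last[Reap[
      While[(line = ReadLine[stream]) =!= EndOfFile,
       fields = StringSplit[line, ","];
       (* Guard against short lines before looking at field 3 *)
       If[Length[fields] >= 3 && fields[[1]] === type &&
         MemberQ[indices, fields[[3]]],
        Sow[fields]]
       ]
      ]],
    1]
   ]

(* In-memory stand-in for the CSV file; the Type2 row is invented *)
stream = StringToStream[
   "Type3,OK,1,-44.57,-40.68\nType2,OK,6,-44.6,-41.83\nType3,OK,6,-44.6,-41.83\n"];
rows = readMatchingRows[stream, "Type3", {"1", "6"}];
Close[stream];
rows
(* {{"Type3","OK","1","-44.57","-40.68"}, {"Type3","OK","6","-44.6","-41.83"}} *)
```

For a real file, the stream would come from `OpenRead["test_file.csv"]` instead of StringToStream.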
POSTED BY: Udo Krause
Posted 11 years ago
POSTED BY: Bob Stephens

That's it

 In[1]:= SetDirectory[FileNameJoin[{NotebookDirectory[], "test"}]]
 Out[1]= "N:\\Udo\\Abt_N\\test"

 In[22]:= Select[
          Import["test_file.csv", "Data"], 
          (StringMatchQ[#[[1]], "Type3"] && ((#[[3]] == 1) || (#[[3]] == 6))) &] // Short[#, 17] &

 Out[22]//Short= 
 {{Type3,OK,1,-44.57,-40.68},{Type3,OK,6,-44.6,-41.83},
  {Type3,OK,1,-44.61,-39.72},{Type3,OK,6,-44.53,-41.44},
  {Type3,OK,1,-44.56,-39.92},{Type3,OK,6,-44.58,-40.83},
  {Type3,OK,1,-44.54,-41.47},{Type3,OK,6,-44.51,-41.17},
  {Type3,OK,1,-44.56,-39.89},{Type3,OK,6,-44.57,-41.47},
  {Type3,OK,1,-44.57,-40.62},{Type3,OK,6,-44.55,-41.61},
  {Type3,OK,1,-44.54,-40.41},{Type3,OK,6,-44.57,-41.26},
  {Type3,OK,1,-44.56,-39.98},<<568>>,{Type3,OK,6,-44.72,-41.84},
  {Type3,OK,1,-44.7,-40.06},{Type3,OK,6,-44.7,-40.89},
  {Type3,OK,1,-44.71,-40.85},{Type3,OK,6,-44.69,-42.13},
  {Type3,OK,1,-44.71,-40.68},{Type3,OK,6,-44.73,-41.68},
  {Type3,OK,1,-44.74,-40.77},{Type3,OK,6,-44.71,-41.36},
  {Type3,OK,1,-44.76,-40.94},{Type3,OK,6,-44.78,-41.68},
  {Type3,OK,1,-44.79,-39.98},{Type3,OK,6,-44.75,-42.75},
  {Type3,OK,1,-44.8,-40.52},{Type3,OK,6,-44.8,-41.7}}

Import recognizes the CSV format and builds one list per row, from which one simply selects what is needed.

The result could, of course, be exported back to the file system as a CSV file.
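
A minimal sketch of that round trip, using ImportString and ExportString so it runs without the attached file (the Type2 row is invented for illustration). With files, `Import["test_file.csv", "CSV"]` and `Export["selected.csv", selected, "CSV"]` would play the same roles; `selected.csv` is just a placeholder name.

```wolfram
(* In-memory stand-in for the CSV file *)
csvText = "Type3,OK,1,-44.57,-40.68\nType2,OK,6,-44.6,-41.83\nType3,OK,6,-44.6,-41.83";

(* The CSV importer turns numeric fields into numbers, so Index is 1 or 6 here *)
data = ImportString[csvText, "CSV"];

(* Same AND/OR condition as above: Type3 AND (Index 1 OR Index 6) *)
selected = Select[data, #[[1]] === "Type3" && MemberQ[{1, 6}, #[[3]]] &];

(* Convert the selected rows back to CSV text *)
ExportString[selected, "CSV"]
```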

POSTED BY: Udo Krause
Posted 11 years ago

Sure. In the enclosed example file I want to use FindList to select all lines containing "Type3" AND an Index of 1 OR 6. So I assume you are saying the FindList call would look something like the line below?

I noticed that the documentation for RegularExpression shows p1 | p2 (a string matching p1 OR p2), which may take care of selecting index 1 or 6 in my example, but there does not appear to be an option for AND.

Please see enclosed file.

thanks

FindList["D:\\test_file.csv", <RegularExpression selecting  "Type3" AND {"1" OR "6" in Index column}> ]
POSTED BY: Bob Stephens

This is an application area for regular expressions. If you provide an example file to parse, some more hints might follow.
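
As a rough illustration of the regular-expression idea: in a regex, AND is usually expressed by sequencing the required parts in one pattern, while OR uses `|`. As far as I know FindList itself searches for literal strings, so this sketch applies the pattern with StringMatchQ to lines that have already been read. The sample lines follow the Type,Status,Index,... layout seen in this thread; the Type2 row and the Index 3 row are invented.

```wolfram
(* Sample lines; the last two are invented counterexamples *)
lines = {
   "Type3,OK,1,-44.57,-40.68",
   "Type3,OK,6,-44.6,-41.83",
   "Type2,OK,1,-44.61,-39.72",
   "Type3,OK,3,-44.53,-41.44"};

(* Field 1 must be Type3, field 2 is anything without a comma,
   field 3 must be 1 or 6, and the rest of the line is unconstrained *)
pattern = RegularExpression["Type3,[^,]*,(1|6),.*"];

Select[lines, StringMatchQ[#, pattern] &]
(* {"Type3,OK,1,-44.57,-40.68", "Type3,OK,6,-44.6,-41.83"} *)
```

For a file, `lines` could be obtained with `ReadList["test_file.csv", String]`, or with `FindList["test_file.csv", "Type3,"]` to let FindList discard most lines first.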

POSTED BY: Udo Krause