Cases with string patterns and pure function

Posted 11 years ago

Hi,

I have a list of strings; let's call it "data". I want to extract the strings that match a particular pattern, "str1 str2 str3", where each part is a number string and the parts are separated by whitespace. I can accomplish that with

Cases[_?(StringMatchQ[#,NumberString~~Whitespace~~NumberString~~Whitespace~~NumberString]&)][data]

However, I would like to use the Cases feature to "format" the results in the following way (and this is why I am using Cases instead of Select):

Cases[x_?(StringMatchQ[#,NumberString~~Whitespace~~NumberString~~Whitespace~~NumberString]&)->StringSplit[x]][data]

which would yield a list structured as {{str1,str2,str3},...}. In this case StringSplit gives me the output I want, but can I form more complicated rules without resorting to that StringSplit trick? More generally, how can I name parts of the pattern inside StringMatchQ so that I can use them later on the right-hand side of ->?

Notice that if I run

Cases[x_?(StringMatchQ[#,f:NumberString~~Whitespace~~g:NumberString~~Whitespace~~h:NumberString]&)->{f,g,h}][data]

I will not get the answer I want.

Thanks,

POSTED BY: Miguel Olivo-V
7 Replies
Posted 11 years ago

I think this meets your requirement for naming parts. Works for me.

Flatten[Map[
  StringCases[#, 
    f : NumberString ~~ Whitespace ~~ g : NumberString ~~ Whitespace ~~
       h : NumberString -> {f, g, h}] &, data], 1]
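
One caveat worth checking: StringCases looks for matching substrings, while StringMatchQ tests the whole string, so the two can disagree on an input such as "1 2 3 4". A minimal sketch of one way to align them, assuming the sample data posted elsewhere in this thread plus one hypothetical four-number string, is to anchor the pattern with StartOfString and EndOfString (the name parts is only illustrative):

```wolfram
(* Sample strings from this thread, plus a hypothetical four-number string *)
data = {"1 2 3", "g 8 9", "5 3  7", "8 45", "213 452 9876", "1 2 3 4"};

(* The anchors force the whole string to match, as StringMatchQ would require. *)
(* StringCases accepts a list of strings and returns one list of matches per string. *)
parts = Flatten[
  StringCases[data,
   StartOfString ~~ f : NumberString ~~ Whitespace ~~ g : NumberString ~~
     Whitespace ~~ h : NumberString ~~ EndOfString :> {f, g, h}], 1]
```

Without the anchors, "1 2 3 4" would contribute {"1", "2", "3"} even though it is not a three-number string.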
POSTED BY: Douglas Kubler
Posted 11 years ago

Nice, I also thought that it could be accomplished by combining Cases and StringCases as you show. In any case, why doesn't Cases take string patterns directly as arguments? I was expecting Mathematica (now the Wolfram Language) to be able to compute

Cases[f:NumberString~~Whitespace~~g:NumberString~~Whitespace~~h:NumberString->{f,g,h}][data]

Also funny is that StringCases doesn't have an operator form.

Thanks a lot David

POSTED BY: Miguel Olivo-V

I meant to address that question. I don't think that's possible. The named patterns are "scoped" within the pure function; they are not part of the pattern that Cases itself matches, so Cases never binds them.
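
To make the scoping point concrete, here is a small sketch, assuming f, g and h have no global values. Only x is bound by Cases; the names inside the PatternTest function are private to StringMatchQ, so they should come back on the right-hand side as plain symbols (the name unbound is just illustrative):

```wolfram
data = {"1 2 3", "g 8 9"};

(* Cases binds x only. f, g, h are named inside the test function, *)
(* so nothing substitutes them on the right-hand side of the rule. *)
unbound = Cases[data,
  x_?(StringMatchQ[#,
       f : NumberString ~~ Whitespace ~~ g : NumberString ~~
        Whitespace ~~ h : NumberString] &) :> {x, f, g, h}]
```

The matching string is still selected correctly; it is only the sub-matches that are unavailable to the rule.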

Also, I meant to comment that in your replacement rules for Cases and StringCases you probably want to use a delayed rule (:>) rather than an immediate rule (->). With an immediate rule the right-hand side is evaluated before the pattern variables are bound.
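
A quick way to see the difference is to give the pattern variable a stray global value. The sketch below is illustrative only: the global assignment to x is a hypothetical accident, and immediate and delayed are made-up names.

```wolfram
x = "9 9 9";  (* hypothetical leftover global value *)

(* Immediate rule: StringSplit[x] is evaluated right away, using the global x *)
immediate = Cases[{"1 2 3"}, x_String -> StringSplit[x]]

(* Delayed rule: StringSplit[x] is evaluated only after x is bound to the match *)
delayed = Cases[{"1 2 3"}, x_String :> StringSplit[x]]

Clear[x]
```

Here immediate picks up the pieces of the global string, while delayed splits the string that actually matched.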

Also (again), the final expression in my "all string" approach should probably read

StringCases[newData, 
 "X" ~~ f : NumberString ~~ Whitespace ~~ g : NumberString ~~ 
   Whitespace ~~ h : NumberString ~~ "X" :> ToExpression[{f, g, h}]]

so that you actually get numbers in the lists rather than strings with number characters.

Finally here is an approach using Cases and then modifying the result using StringCases

Cases[x_?(StringMatchQ[#, 
       NumberString ~~ Whitespace ~~ NumberString ~~ Whitespace ~~ 
        NumberString] &) :> 
   StringCases[x, 
    f : NumberString ~~ Whitespace ~~ g : NumberString ~~ Whitespace ~~
       h : NumberString :> Sequence @@ ToExpression@{f, g, h}]][data]

or a slightly different approach:

Cases[x_?(StringMatchQ[#, 
       NumberString ~~ Whitespace ~~ NumberString ~~ Whitespace ~~ 
        NumberString] &) :> 
   StringReplace[x, 
    f : NumberString ~~ Whitespace ~~ g : NumberString ~~ Whitespace ~~
       h : NumberString :> 
     ToExpression["{" <> f <> "," <> g <> "," <> h <> "}"]]][data]
POSTED BY: David Reiss

You might take an "all string" approach like this, using "XX" as a boundary between items in the list. This assumes that there are no Xs in the strings in the data list. The boundary is doubled because each match consumes one "X" on either side:

In[1]:= data = {"1 2 3", "g 8 9", "5 3  7", "8 45", "213 452 9876"}

Out[1]= {"1 2 3", "g 8 9", "5 3  7", "8 45", "213 452 9876"}

In[2]:= newData = "XX" <> StringJoin@Riffle[data, "XX"] <> "XX"

Out[2]= "XX1 2 3XXg 8 9XX5 3  7XX8 45XX213 452 9876XX"

In[3]:= StringCases[newData, 
 "X" ~~ f : NumberString ~~ Whitespace ~~ g : NumberString ~~ 
   Whitespace ~~ h : NumberString ~~ "X" :> {f, g, h}]

Out[3]= {{"1", "2", "3"}, {"5", "3", "7"}, {"213", "452", "9876"}}

POSTED BY: David Reiss
Posted 11 years ago

Thanks, that works. But I guess the proper question is whether I can name arguments inside a pure function that modifies a pattern (using PatternTest).

POSTED BY: Miguel Olivo-V