
Missing["KeyAbsent", somekey] (How to extract somekey)?

Posted 11 years ago

Hello, I am using the KeyUnion function to combine a couple of lists of lists. In some cases, a key exists in one list of sublists, but not in the other.

Here is an example:

assoc1 = <|1 -> {1, a, aa},3 -> {3, c, cc}|>

assoc2 = <|1 -> {1, 10, 100, 1000}, 5 -> {5, 50, 500, 5000}|>

I do a KeyUnion....

assocList = KeyUnion[{assoc1, assoc2}]

...to get something like this...

assocList = { <|1 -> {1, a, aa},3 -> {3, c, cc}, 5 -> Missing["KeyAbsent", 5]|>, <|1 -> {1, 10, 100, 1000},  
     3 -> Missing["KeyAbsent", 3], 5 -> {5, 50, 500, 5000}|>}

Now I want to merge the two lists, based on the key, replacing each missing value with a {Key, Value} pair that includes a "dummy" value, such as 5 -> {5, -9999}. The following piece of code, for example, works nicely for an individual replacement (e.g. key 3):

Merge[{assocList[[1]], assocList[[2]]}, Identity]
merged = Values[%] /. _Missing -> {3, -9999};

To get...

{{{1, a, aa}, {1, 10, 100, 1000}}, {{3, c, cc}, {3, -9999}},...}

But you also get...

 {...,{{3,-9999}, {5, 50, 500, 5000}},...}

Which is not what we want. (The ultimate goal is to make a Union between the two lists associated with each key, where the first item in the list is also the same as the key.)

What I am looking for is a way to make a {Key, Value} pair replacement for each missing key. Is there a way to pick out the "index" part of Missing["KeyAbsent", index] so that I can write a replacement rule that iterates over the keys?

POSTED BY: Caitlin Ramsey
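
A minimal sketch of one way to pick out the key, assuming the two associations from the question: a named pattern such as k_ binds the second argument of Missing["KeyAbsent", k], and a delayed rule (:>) can reuse it on the right-hand side. The names merged, filled and unions, and the -9999 placeholder, are illustrative choices rather than anything prescribed by KeyUnion.

```wolfram
assoc1 = <|1 -> {1, a, aa}, 3 -> {3, c, cc}|>;
assoc2 = <|1 -> {1, 10, 100, 1000}, 5 -> {5, 50, 500, 5000}|>;

(* Collect the values for every key; absent keys show up as Missing["KeyAbsent", key] *)
merged = Merge[KeyUnion[{assoc1, assoc2}], Identity];

(* k_ captures the absent key, so each placeholder carries its own key *)
filled = Map[Replace[#, Missing["KeyAbsent", k_] :> {k, -9999}, {1}] &, merged];

(* Optional last step: the union of the lists stored under each key *)
unions = Map[Apply[Union], filled];
```

Because the rule is written once for any key, no per-key replacement such as _Missing -> {3, -9999} is needed. Note that Union sorts its result, so the key is not guaranteed to stay in first position.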
6 Replies
Posted 10 years ago

Yes, please do. The source is not confidential. I am attaching a text file.

Posted 10 years ago

I've mulled this one over, and unless I am missing something, you are handling the "missing" values by not generating "missing" in the first place. Perhaps that is my fault, for constructing a "toy example" that is not well suited to illustration. Usually, I would be starting with a dataset from an outside source, "from the wild," where I do not have any control over the code or system that actually generates the sample data.

Here, I will present a typical example I would encounter importing real-world data:

In my sample data, I have two kinds of Missing ...

Missing["Unrecognized", "21.799999"]

Missing["Empty"]

To give a little context, these are being imported as a Dataset (in the Mathematica 10 sense) using SemanticImport[]. I can extract values from a column called "Days to Start a Business" that would typically contain a numeric value:

In[1]:= $CurrentData[All, "Days to Start a Business"] // Normal
Out[1]= {5, 4, Missing["Unrecognized", 
  "21.799999"], 18, 32, 13, 5, 10, 101, 15, 22, 60, 15, 101, 8, 17, 8,
  Missing["Unrecognized", "16.5"], 84, 15, 27, 2, 14, 
 Missing["Unrecognized", "19.5"], 16, 9, 19, 97, 14, 
 Missing["Unrecognized", "26.1"], 
 Missing["Unrecognized", "75.5"], 32, 31, 8, 92, 29, 
 Missing["Unrecognized", 
  "4.5"], 8, 40, 11, 19, 11, 11, 13, 17, 36, 17, 
 Missing["Unrecognized", 
  "30.799999"], 19, 53, 35, 36, 7, 9, 5, 6, 12, 9, 
 Missing["Empty"], 11, 36, 38, 13, 33, 19, 32, 21, 
 Missing["Unrecognized", "7.5"], 34, 40, 
 Missing["Unrecognized", "7.5"], 90}

For those that are Missing["Unrecognized", "7.5"], one might choose to handle them by mapping some function onto the "unrecognized" values (e.g. applying a rounding strategy to convert to the nearest integer) and replacing them. I should note that I haven't yet explored an alternative possibility: specifying the type explicitly as a decimal in the SemanticImport call. This function and its options are new to me.
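
As a rough sketch of that rounding idea, assuming every unrecognized entry is a plain decimal string: the pattern s_String captures the raw text, which is then parsed and rounded. The list days is a short hypothetical excerpt of the column above, and ToExpression is used only because the strings are assumed to be trusted numerals (it evaluates whatever it is given).

```wolfram
(* Hypothetical excerpt of the "Days to Start a Business" column *)
days = {5, 4, Missing["Unrecognized", "21.799999"], 18,
   Missing["Unrecognized", "16.5"], Missing["Empty"], 11};

(* Parse the captured string and round it to the nearest integer; other Missing forms are left alone *)
rounded = days /. Missing["Unrecognized", s_String] :> Round[ToExpression[s]];
```

Round sends exact halves to the nearest even integer, so "16.5" becomes 16 here; Ceiling or Floor could be substituted if a different strategy is wanted.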

For the values that are simply Missing["Empty"] a different approach would be required. In practice, the empty or "missing" values might be replaced by either the Min or Max of the non-empty members of the set.
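
A small sketch of that Min/Max replacement, again on a hypothetical excerpt called days: DeleteMissing drops every Missing[...] element, so the extremes are computed from the recognized values only.

```wolfram
(* Hypothetical excerpt containing both kinds of Missing *)
days = {5, 4, 18, Missing["Empty"], 11, Missing["Unrecognized", "7.5"]};

(* Keep only the entries that are not Missing[...] *)
known = DeleteMissing[days];

(* Fill the empty entries with the smallest or the largest known value *)
filledMin = days /. Missing["Empty"] -> Min[known];
filledMax = days /. Missing["Empty"] -> Max[known];
```

Entries of the form Missing["Unrecognized", ...] are deliberately left untouched here, since they call for a different treatment.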

But in explaining this in more detail, I am prompted to look into the documentation on "Missing Values" under SemanticImport, and will follow up if/when I find a solution.

POSTED BY: Caitlin Ramsey

you are handling the "missing" values by not generating "missing" in the first place

Right.

a typical example I would encounter importing real-world data

If the source is not confidential, can you please attach it as it is for on-going experimentation on it?

POSTED BY: Udo Krause

This

    In[37]:= Clear [assoc1, assoc2]
    assoc1 = <|1 -> {1, a, aa}, 3 -> {3, c, cc}|>;
    assoc2 = <|1 -> {1, 10, 100, 1000}, 5 -> {5, 50, 500, 5000}|>;

    In[41]:= assocList = KeyUnion[{assoc1, assoc2}, <|1 -> {1}, 3 -> {3}, 5 -> {5}|>]
    Out[41]= {<|1 -> {1, a, aa}, 3 -> {3, c, cc}, 5 -> {5}|>,
              <|1 -> {1, 10, 100, 1000}, 3 -> {3}, 5 -> {5, 50, 500, 5000}|>}

    In[44]:= Merge[assocList, Identity]
    Out[44]= <|1 -> {{1, a, aa}, {1, 10, 100, 1000}}, 3 -> {{3, c, cc}, {3}}, 5 -> {{5}, {5, 50, 500, 5000}}|>

does not help? From the manual for KeyUnion:

The missing function can be an association:

The only thing yet to be done is creating the missing-function association automatically from all the present keys, which seems possible, but now I have to get the gifts out of the wrap ... Merry Christmas everyone!

POSTED BY: Udo Krause

the only thing yet to be done is creating the missing function association automatically

In[19]:= Clear [assoc1, assoc2, assoc3, assoK]
assoc1 = <|1 -> {1, a, aa}, 3 -> {3, c, cc}|>;
assoc2 = <|1 -> {1, 10, 100, 1000}, 5 -> {5, 50, 500, 5000}|>;
assoc3 = <|2 -> {4, 8, 16}, 4 -> {16, 64}, 7 -> {"The", "whole", "nine", "yards"}|>;
assoK = Association @@ (Rule[#, {#}] & /@ Union[Flatten[Keys /@ {assoc1, assoc2, assoc3}]])
Out[23]= <|1 -> {1}, 2 -> {2}, 3 -> {3}, 4 -> {4}, 5 -> {5}, 7 -> {7}|>

In[24]:= Clear[assocList]
assocList = KeyUnion[{assoc1, assoc2, assoc3}, assoK]
Out[25]= {<|1 -> {1, a, aa}, 3 -> {3, c, cc}, 5 -> {5}, 2 -> {2}, 4 -> {4}, 7 -> {7}|>, 
          <|1 -> {1, 10, 100, 1000}, 3 -> {3}, 5 -> {5, 50, 500, 5000}, 2 -> {2}, 4 -> {4}, 7 -> {7}|>, 
          <|1 -> {1}, 3 -> {3}, 5 -> {5}, 2 -> {4, 8, 16}, 4 -> {16, 64}, 7 -> {"The", "whole", "nine", "yards"}|>}

In[26]:= Merge[assocList, Identity]
Out[26]= <|1 -> {{1, a, aa}, {1, 10, 100, 1000}, {1}}, 
           3 -> {{3, c, cc}, {3}, {3}}, 
           5 -> {{5}, {5, 50, 500, 5000}, {5}}, 
           2 -> {{2}, {2}, {4, 8, 16}}, 
           4 -> {{4}, {4}, {16, 64}}, 
           7 -> {{7}, {7}, {"The", "whole", "nine", "yards"}}|>

Of course, in defining assoK one could give a value more reminiscent of a missing value, like the NULL of SQL ...

POSTED BY: Udo Krause
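
A possible shortcut, sketched under the assumption that the second argument of KeyUnion is applied to each absent key (which is how the association assoK behaves above): a pure function can stand in for the hand-built association, so the placeholder is generated from the key itself. The -9999 marker is an illustrative choice.

```wolfram
assoc1 = <|1 -> {1, a, aa}, 3 -> {3, c, cc}|>;
assoc2 = <|1 -> {1, 10, 100, 1000}, 5 -> {5, 50, 500, 5000}|>;
assoc3 = <|2 -> {4, 8, 16}, 4 -> {16, 64}, 7 -> {"The", "whole", "nine", "yards"}|>;

(* The function receives the absent key and returns its {key, dummy} placeholder *)
assocList = KeyUnion[{assoc1, assoc2, assoc3}, {#, -9999} &];

(* Group the values from all three associations under each key *)
merged = Merge[assocList, Identity];
```

This avoids building assoK at all, and the -9999 entry makes the filled-in values easy to tell apart from real data.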