Message Boards Message Boards

1 Reply
4 Total Likes
View groups...
Share this post:

Getting SQL Data Directly Into a Dataset

Posted 10 years ago

I don't seem to be able to find any documentation on the best practice for getting SQL data directly into a Dataset.

Can anyone suggest an efficient method?

At the same time, I am sure somewhere in the documentation I saw a fast way of creating a Dataset in which first the list of keys were specified, and then a list of all the values were specified. So rather than something like...

dataset = Dataset[{<| "a" -> 1, "b" -> 2|>, <| "a" -> 3, "b" -> 4|>}]

...which works, but is rather long-winded with lots of records, you'd do something like...

dataset = Dataset[<| {"a", "b" }|> -> {{1, 2}, {3, 4}}]

...which is admirably terse with lots of records, but has the very small disadvantage that it doesn't work at all.

My two questions are related, because if something like the second method of Dataset[] creation actually existed, it would be pretty simple to turn the results of a SQL data query into a functioning Dataset[] by using a SQLExecute[] option like "ShowColumnHeadings" -> True

Any thoughts, anyone?

POSTED BY: Brad Varey

The following wrapper will return a Dataset, but will otherwise behave like SQLSelect:

SQLSelectToDataset[args___, o : OptionsPattern[]] := Module[
  SQLSelect[args, "ShowColumnHeadings" -> True, "ColumnSymbols" -> Function[{cols, res},
       d = AssociationThread[cols, #] & /@ res

(The "ColumnSymbols" option lets you provide a callback to operate on the column names and resultset. You could accomplish the same thing less concisely munging the resultset and its column headings conventionally.)

An example using one of the built-in example databases (documentation):

conn = OpenSQLConnection["publisher"];
SQLSelectToDataset[conn, "AUTHORS", {"AU_ID", "AU_LNAME", "AU_FNAME"},
  "SortingColumns" -> {
     SQLColumn["AU_LNAME"] -> "Ascending",
     SQLColumn["AU_FNAME"] -> "Ascending"

dataset output

Now to your point about efficiency. Key repetition is indeed built in to the current Dataset design, in part to support hierarchical data. We are aware of the need for more efficient representations of tabular data and are working on tighter Dataset-SQL integration. Note that it is possible to construct a keyless Dataset:

SQLSelectToDataset[args___, o : OptionsPattern[]] := Module[
  SQLSelect[args, "ShowColumnHeadings" -> True, "ColumnSymbols" -> Function[{cols, res},
       d = res

However in this case subsequent operations are limited to using [[part]] rather than #key syntax. HTH

POSTED BY: Dillon Tracy
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
or Discard

Group Abstract Group Abstract