Methodology for semantic input of dirty data

Posted 4 years ago | 2843 Views | 0 Replies | 1 Total Likes

The attached files import typical (but fictional) commercial data with SemanticImport[] and apply very basic data cleansing to improve the fidelity of the resulting Dataset[]. Where possible, information is represented as Entity[] objects. Problems of this sort are common in commercial data processing. To run the example, store the files in the same folder and evaluate semanticImportTest.nb.
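
To illustrate the kind of basic cleansing described above, here is a minimal sketch. It is not the code from the attachments. The sample data, the column names, and the helpers `cleanField` and `parseDate` are all hypothetical. To keep every step visible, it reads raw CSV text with ImportString rather than SemanticImport[], maps "NA" and blank fields to Missing["NotAvailable"], and only then parses the date column.

```wolfram
(* Hypothetical sample data, invented for illustration only *)
csv = "customer,orderDate,amount
Acme,2020-01-15,100
Bolt,NA,250
Core,,75";

(* Read the raw rows; the first row holds the column names *)
rows = ImportString[csv, "CSV"];
header = First[rows];
body = Rest[rows];

(* Hypothetical helper: treat "NA" and blank fields as missing values *)
cleanField[x_] := If[MatchQ[x, "NA" | ""], Missing["NotAvailable"], x];

(* Hypothetical helper: parse a date only where a string is present; *)
(* Missing[...] values pass through unchanged *)
parseDate[x_String] := DateObject[{x, {"Year", "-", "Month", "-", "Day"}}];
parseDate[x_] := x;

(* One association per row, with cleaned fields *)
records = AssociationThread[header, cleanField /@ #] & /@ body;

(* Convert the date column and wrap the result as a Dataset *)
records = MapAt[parseDate, #, Key["orderDate"]] & /@ records;
ds = Dataset[records]
```

Marking unreadable fields as Missing[] before type conversion keeps every row in the Dataset instead of discarding files or rows with malformed entries. Later analysis can then filter or impute them explicitly.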

Justification: Applied mathematics is often of little use without a large volume of data. Much commercial data lives in Excel files or is exported from databases in .csv format. Moreover, most commercial data contains many malformed values, such as a date recorded as "NA" or left blank. If such files cannot be read, applied mathematics cannot be applied to their data.

Attachments: