
Import large data files for Machine Learning?

Posted 5 years ago
POSTED BY: Jürgen Kanz
11 Replies

Another possibility is to use MongoDB, as described in the same link I gave above.

POSTED BY: Wolfgang Hitzl

I am sorry for the late response. I have to admit that I am not a MongoDB expert. I have not been able to import the entire CSV file into Mongo; the import stops after about 1% (5.5 * 10^6 records). I am now trying to convert the CSV to JSON, and I hope the Mongo import will succeed with JSON.
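A streaming conversion along those lines could look like the following sketch, which assumes a plain comma-separated file with a header row and no quoted or escaped fields (the file names are placeholders):

```
(* Convert a large CSV to line-delimited JSON without loading it into RAM.
   All values are written as JSON strings; convert numeric columns first
   (e.g. with ToExpression) if typed fields are needed. *)
in = OpenRead["data.csv"];
out = OpenWrite["data.jsonl"];
header = StringSplit[ReadLine[in], ","];
While[(line = ReadLine[in]) =!= EndOfFile,
 WriteString[out,
  ExportString[AssociationThread[header -> StringSplit[line, ",", All]],
    "JSON", "Compact" -> True] <> "\n"]
 ];
Close[in]; Close[out];
```

mongoimport consumes line-delimited JSON like this by default, one document per line.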

Thanks again for your support.

POSTED BY: Jürgen Kanz

I suggest using a generator function for training on large data sets, as described here:

Training on large data sets
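For example, a minimal sketch of such a generator on synthetic data (a real generator would instead read successive batches from disk or a database):

```
(* NetTrain calls the generator once per batch, passing the requested
   batch size; each training example is a rule features -> label. *)
net = NetChain[{LinearLayer[16], Ramp, LinearLayer[1]},
   "Input" -> 3, "Output" -> "Scalar"];

generator = Function[spec,
   Table[With[{x = RandomReal[{-1, 1}, 3]}, x -> Total[x]^2],
    spec["BatchSize"]]];

trained = NetTrain[net, generator,
   BatchSize -> 64, MaxTrainingRounds -> 10];
```

Because only one batch exists in memory at a time, the full data set never has to fit in RAM.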

POSTED BY: Wolfgang Hitzl

Thank you for the hint.

POSTED BY: Jürgen Kanz
Posted 5 years ago
POSTED BY: Jojen Bourgain
Posted 5 years ago

Is it possible for Mathematica to build a ML model based on a stream of data?

See my answer to this question.

POSTED BY: Rohit Namjoshi

I don't know, but I know Mathematica can take advantage of Hadoop and MapReduce.

Good hint, thank you. I am currently trying to import the data into PostgreSQL. With this approach, the Predict function or a neural network would receive the data as a stream, so not all of the data would be available at any one moment (due to limited RAM).

Is it possible for Mathematica to build a ML model based on a stream of data?
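One way to wire such a stream into NetTrain would be through DatabaseLink, fetching one batch per query. A minimal sketch, assuming a hypothetical table samples with numeric feature columns x1, x2, x3, a label column y, and an id primary key for stable paging (connection details are placeholders):

```
Needs["DatabaseLink`"];
conn = OpenSQLConnection[
   JDBC["PostgreSQL", "localhost/mydb"],
   "Username" -> "user", "Password" -> "secret"];

rowCount = First@First@SQLExecute[conn, "SELECT COUNT(*) FROM samples"];
offset = 0;

sqlGenerator = Function[spec,
   Module[{n = spec["BatchSize"], rows},
    rows = SQLExecute[conn,
      "SELECT x1, x2, x3, y FROM samples ORDER BY id LIMIT " <>
       ToString[n] <> " OFFSET " <> ToString[offset]];
    offset = Mod[offset + n, rowCount]; (* wrap around at the end *)
    Map[Most[#] -> Last[#] &, rows]     (* {x1,x2,x3} -> y rules *)
    ]];

net = NetChain[{LinearLayer[16], Ramp, LinearLayer[1]},
   "Input" -> 3, "Output" -> "Scalar"];
trained = NetTrain[net, sqlGenerator,
   BatchSize -> 64, MaxTrainingRounds -> 5];
```

For very large tables, keyset pagination (WHERE id > lastSeenId) is much cheaper than a growing OFFSET. Note that this generator form belongs to NetTrain; as far as I know, Predict expects its training data in memory.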

POSTED BY: Jürgen Kanz