Import large data files for Machine Learning?

Posted 7 years ago
POSTED BY: Jürgen Kanz
11 Replies

Thank you again to all participants for your contributions. After some trials, I have found an efficient way to load the big CSV file into MongoDB via mongoimport.

The reference page mentioned above is an excellent source for everything needed to connect Mathematica to MongoDB and to work with data that does not fit into memory: https://reference.wolfram.com/language/tutorial/NeuralNetworksLargeDatasets.html

POSTED BY: Jürgen Kanz

Another possibility is to use a Mongo database, as described in the same link I gave above.

POSTED BY: Wolfgang Hitzl

I am sorry for the late response. I have to admit that I am not a MongoDB expert. So far I have not been able to import the entire CSV file into Mongo: the import stops after ca. 1% (5.5 * 10^6) of the data records. I am now trying to convert the CSV to JSON, and I hope the Mongo import will succeed with JSON.

Thanks again for the support.

POSTED BY: Jürgen Kanz
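
For readers attempting the same CSV-to-JSON conversion, here is a minimal sketch in Python (not part of the original post) that converts the file one row at a time, so the whole CSV never has to fit into memory. It writes newline-delimited JSON (one document per line), a format mongoimport accepts. The function name `csv_to_json_lines` and the sample columns `id` and `value` are hypothetical, and every value is written as a string because the `csv` module does no type conversion.

```python
import csv
import io
import json


def csv_to_json_lines(csv_file, json_file):
    """Convert CSV rows to newline-delimited JSON, one row at a time."""
    reader = csv.DictReader(csv_file)  # the first row is used as the header
    count = 0
    for row in reader:
        # Each row becomes one JSON document on its own line.
        json_file.write(json.dumps(row) + "\n")
        count += 1
    return count


# Small in-memory demonstration with made-up data. For a real file,
# pass file objects from open(path, newline="") instead of StringIO.
source = io.StringIO("id,value\n1,0.5\n2,0.7\n")
target = io.StringIO()
rows_written = csv_to_json_lines(source, target)
print(rows_written)
print(target.getvalue(), end="")
```

If numeric fields are needed in the database, they would have to be converted explicitly before `json.dumps` is called.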

I suggest using a generator function to train on large data sets, as described here:

Training on large data sets

POSTED BY: Wolfgang Hitzl

Thank you for the hint.

POSTED BY: Jürgen Kanz
Posted 6 years ago
POSTED BY: Jojen Bourgain
Posted 7 years ago
POSTED BY: Rohit Namjoshi

That is very useful! Thank you very much. It seems that I can solve the problem with a database-backed stream of data and a neural network. I will give it a try.

POSTED BY: Jürgen Kanz

I uploaded my data to MySQL and then just created views in MySQL and read them from Mathematica.

Good hint, thank you. I am currently trying to import the data into PostgreSQL. With this approach, the Predict function or a neural network would receive the data as a stream, so not all of the data would be available at any one time (due to limited RAM).

Is it possible for Mathematica to build an ML model based on a stream of data?

POSTED BY: Jürgen Kanz
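
To make the "stream from a database" idea concrete, the following Python sketch reads a table in fixed-size chunks and updates a running statistic after each chunk, so only one chunk is in memory at a time. It does not answer the question for Mathematica itself. It uses an in-memory SQLite database as a stand-in for PostgreSQL, a running mean as a stand-in for a model update, and a hypothetical table `samples` with made-up values.

```python
import sqlite3


def stream_mean(connection, chunk_size):
    """Compute the mean of samples.value while holding one chunk at a time."""
    cursor = connection.execute("SELECT value FROM samples")
    total, count = 0.0, 0
    while True:
        rows = cursor.fetchmany(chunk_size)  # at most chunk_size rows
        if not rows:
            break
        # Stand-in for an incremental model update on this chunk.
        total += sum(value for (value,) in rows)
        count += len(rows)
    return total / count if count else None


# Made-up data in an in-memory database (assumption: stands in for PostgreSQL).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE samples (value REAL)")
connection.executemany(
    "INSERT INTO samples VALUES (?)", [(1.0,), (2.0,), (3.0,), (4.0,)]
)
mean_value = stream_mean(connection, chunk_size=3)
print(mean_value)
```

Whether a given learning method can be trained this way depends on whether it supports incremental updates. Batch-wise neural network training does, which is why the chunked pattern fits that case.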