Import large data files for Machine Learning?

Posted 21 days ago
I want to run a machine learning task on my Windows 10 PC (16 GB RAM, Mathematica 11.3.0), but I am facing the following problem: the training set is a 10 GB CSV file with 700,000,000 × 2 data points, and Mathematica simply stops during import via the Import or ReadList function. My idea is to split the input file into several smaller files that can be imported individually, and to load the smaller files in batches to feed the Predict function or perhaps a neural network. Any idea how to make this happen? Or do you have a better approach?
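One way to avoid a single giant Import call is to open the file as a stream and read it in fixed-size chunks. The sketch below is a rough illustration of that idea; the file name `"data.csv"` and the chunk size are assumptions, and the per-chunk processing step is left as a placeholder:

```mathematica
(* Hedged sketch: read a large two-column CSV in chunks rather than all at once.
   "data.csv" and chunkSize are assumed values, not from the original post. *)
stream = OpenRead["data.csv"];
chunkSize = 1000000;  (* number of lines per chunk *)

While[
  (lines = ReadList[stream, String, chunkSize]) =!= {},
  (* parse each line "x,y" into a pair of numbers *)
  rows = ToExpression[StringSplit[#, ","]] & /@ lines;
  (* process or store each chunk here, e.g. write it to a
     smaller file or insert it into a database *)
];

Close[stream];
```

Because only `chunkSize` lines are in memory at any moment, peak RAM use stays bounded regardless of the total file size.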

Many thanks in advance for support!

5 Replies

I uploaded my data to MySQL, then created views in MySQL and read them from Mathematica.
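For reference, reading such a view from Mathematica can be done with the built-in DatabaseLink package. A minimal sketch, in which the connection string, credentials, view name, and batch size are all placeholder assumptions:

```mathematica
Needs["DatabaseLink`"]

(* Hypothetical connection details; replace with your own server/database *)
conn = OpenSQLConnection[
   JDBC["MySQL(Connector/J)", "localhost:3306/mydb"],
   "Username" -> "user", "Password" -> "password"];

(* Pull one batch of rows from the view instead of the whole table *)
batch = SQLExecute[conn,
   "SELECT x, y FROM training_view LIMIT 1000000"];

CloseSQLConnection[conn];
```

Repeating the query with an `OFFSET` (or a key-range condition) lets you walk through the data batch by batch.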

Posted 21 days ago

Good hint, thank you. I am currently trying to import the data into PostgreSQL. This approach means that the Predict function or a neural network would receive the data as a stream, with the consequence that not all of the data can be in memory at the same time (due to limited RAM).

Is it possible for Mathematica to build a ML model based on a stream of data?

I don't know, but I know Mathematica can take advantage of Hadoop and MapReduce.

Posted 21 days ago

Is it possible for Mathematica to build a ML model based on a stream of data?

See my answer to this question.
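One streaming approach worth sketching here: NetTrain can take a generator function in place of in-memory data, calling it once per batch. In the sketch below, the network shape and training options are illustrative, and `fetchBatch` is a hypothetical helper that would return the next batch of examples (e.g. from the database):

```mathematica
(* Hedged sketch: train a net on streamed batches via NetTrain's generator form.
   fetchBatch is a hypothetical helper returning a list of
   <|"Input" -> x, "Output" -> y|> associations for one batch. *)
net = NetChain[
   {LinearLayer[64], Ramp, LinearLayer[1]},
   "Input" -> 1, "Output" -> 1];

generator = Function[spec,
   (* spec carries keys such as "BatchSize" and "Round" *)
   fetchBatch[spec["BatchSize"]]
];

trained = NetTrain[net, generator,
   BatchSize -> 1024, MaxTrainingRounds -> 10];
```

Since the generator is invoked on demand, only one batch needs to exist in memory at a time, which matches the database-backed setup discussed above.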

Posted 21 days ago

That is very useful! Thank you very much. It seems that I can solve the problem with a database-backed stream of data and a neural network. I will give it a try.
