Can read files in the compressed file directly?

I have a large number of ZIP archives stored in different directories. I want to read all the CSV files inside these archives directly, without unzipping them first. So far, I can only read the list of files in each archive. What should I do next?

dataList = FileNames["*.ZIP", "//Data//", 3];
data = Flatten[#, 1] &@Import[#, {"ZIP", "FileNames"}] & /@ dataList
POSTED BY: Tsai Ming-Chou
5 Replies

If you're having trouble reading in Chinese characters, I recommend experimenting with the character encoding. I had a similar problem once that I solved by first setting:

$CharacterEncoding = "UTF8"

Give that a try before importing the files.
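If you would rather not change the setting for the whole session, a minimal sketch is to scope it with Block. The file name "testzip.zip" is only a placeholder, and whether the setting changes how entry names inside the archive are decoded is something to test, not something this sketch guarantees.

```wolfram
(* Placeholder path; substitute one of your own archives *)
zipFile = "testzip.zip";

(* $CharacterEncoding is "UTF8" only while the body of Block evaluates;
   the previous value is restored afterwards *)
names = Block[{$CharacterEncoding = "UTF8"},
  Import[zipFile, {"ZIP", "FileNames"}]
]
```

Comparing names with and without the Block shows quickly whether the encoding makes any difference for your archives.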

POSTED BY: Sjoerd Smit

Thanks, everyone, for the warm response!

I tested the ideas provided by Sjoerd Smit on two ZIP archives. In one archive, the data file names consist of English letters and numbers; in the other, the names contain Traditional Chinese characters, which show up garbled (I don't know why). It turns out that when the file names inside the archive are not garbled, the data can be read normally. So I think the real problem is the file names.

Because there is a large amount of this important historical data, it is impractical for me to rename the files inside the archives one by one. Therefore, I still need to find a solution to this problem.

POSTED BY: Tsai Ming-Chou
Posted 3 years ago

You can use vim to list the content of the zip/rar/tar archive:

vim archive.zip

POSTED BY: Yasmin Hussain

This should be able to import the files in a zip archive:

zipFile = "testzip.zip";
data = Map[
  Import[zipFile, {"ZIP", #}] &,
  Import[zipFile, {"ZIP", "FileNames"}]
];
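
If the archive also holds non-CSV files, a variation along these lines might help. It is only a sketch: csvNames and csvData are names made up for illustration, and it assumes the CSV entries can be recognized by their ".csv" extension.

```wolfram
zipFile = "testzip.zip";

(* All entry names in the archive *)
names = Import[zipFile, {"ZIP", "FileNames"}];

(* Keep only entries whose name ends in .csv, ignoring case *)
csvNames = Select[names, StringEndsQ[#, ".csv", IgnoreCase -> True] &];

(* Association: entry name -> imported contents *)
csvData = AssociationMap[Import[zipFile, {"ZIP", #}] &, csvNames];
```

Keeping the entry names as keys makes it easy to see which entry each table came from.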
POSTED BY: Sjoerd Smit
Posted 4 years ago

Hi Tsai,

What do you mean by

read all the CSV files in the compressed files directly, but do not unzip it

The zip file has to be unzipped to access its contents. Maybe this is what you are looking for?

data = Import[#, "*.csv"]& /@ dataList
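
With many archives it may also be useful to record which ones fail to import. The following is a rough sketch, not a tested solution: importCsvs, csvByArchive and failed are hypothetical names, and dataList is built exactly as in the question.

```wolfram
(* dataList as built in the question: paths of the ZIP archives *)
dataList = FileNames["*.ZIP", "//Data//", 3];

(* Import every CSV entry of one archive;
   return Missing[...] if Import issues any message *)
importCsvs[zip_String] :=
  Check[Import[zip, "*.csv"], Missing["ImportFailed", zip]];

(* Association: archive path -> imported CSV contents, or Missing[...] *)
csvByArchive = AssociationMap[importCsvs, dataList];

(* Paths of the archives that could not be read cleanly *)
failed = Keys[Select[csvByArchive, MissingQ]];
```

The failed list then tells you which archives need a closer look, for example the ones with garbled entry names.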
POSTED BY: Rohit Namjoshi