[Solved] Import plain text from > 300 PDF files & then export each file

Posted 3 years ago

I have about 300 PDF files in a directory, and each file contains both images and text. I want to import the plain text from each file and export it to a .txt file with the same name as the PDF.

I can do this one file at a time, but is there a way to batch process all 300 files?

POSTED BY: v m
2 Replies
Posted 3 years ago

Hi v m

This should do it. Please first test on a subset of the files in a different directory.

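(* Directory containing the PDF files *)
dir = "path/to/directory";
(* Full paths of all PDF files in dir *)
files = FileNames["*.pdf", dir];

(* Write each file's plain text to a .txt file with the same base name *)
Export[FileNameJoin[{dir, FileBaseName[#] <> ".txt"}], Import[#, "Plaintext"]] & /@ files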
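
If some of the PDFs cannot be imported, a slightly more defensive variant might help. This is only a sketch under a few assumptions: `pdfToText` is a hypothetical helper name (not a built-in), the placeholder directory is the same as above, and the subset size of 5 is arbitrary. It converts a small sample first and collects the files that failed instead of writing output for them.

```wolfram
(* Hypothetical helper: convert one PDF to a .txt file next to it *)
pdfToText[file_String] := Module[{text, out},
  out = FileNameJoin[{DirectoryName[file], FileBaseName[file] <> ".txt"}];
  (* Import returns $Failed when the file cannot be read *)
  text = Quiet[Import[file, "Plaintext"]];
  If[StringQ[text],
    Export[out, text, "Text"],
    (* Record the failure rather than writing a .txt file *)
    Missing["ImportFailed", file]
  ]
];

dir = "path/to/directory";
files = FileNames["*.pdf", dir];

(* Try a small subset before running on everything *)
results = pdfToText /@ Take[files, UpTo[5]];

(* Files that could not be converted *)
failed = Cases[results, _Missing]
```

Once the sample looks right, replacing `Take[files, UpTo[5]]` with `files` should process the whole directory.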
POSTED BY: Rohit Namjoshi
Posted 3 years ago

I tried it and it worked with no issues at all. This was exactly what I was looking for. Thank you!

POSTED BY: v m