
Can Mathematica save me hours of copy and paste?

Posted 4 years ago
POSTED BY: heidi kellner
3 Replies
Posted 4 years ago

You can use Mathematica's RegularExpression[] to find matches and replace them:

input =  "
 width=\"1234\"   height=\"3456\"
 width =\"1234\"   height=\"3456\"
 width =   \"1234\"   height    =\"3456\"
 width=\"1234\"              height=\"3456\"
 width=\"123s\"    height=\"34z6\"
 ";

StringReplace[input,
 RegularExpression[
   "width\\s*=\\s*\"\\s*\\d*\\s*\"\\s*height\\s*=\\s*\"\\s*\\d*\\s*\""] ->
  "class=\"responsive-image\""
]

This code returns the following string (note: the last line is malformed, with letters inside the numeric values, so no replacement occurs there):

class="responsive-image"
class="responsive-image"
class="responsive-image"
class="responsive-image"
width="123s"    height="34z6"

Breaking down the regular expression:

"width\\s*=\\s*\"\\s*\\d*\\s*\"\\s*height\\s*=\\s*\"\\s*\\d*\\s*\""

"width": the literal text "width"

"\\s*": zero or more whitespace characters (spaces, tabs, newlines, etc.)

"=": an equals sign

"\\s*": zero or more whitespace characters

"\"": a quotation mark

"\\s*": zero or more whitespace characters

"\\d*": zero or more digit characters (0, 1, 2, ... 9)

and so on for the closing quotation mark and the height attribute.

A word of caution, though: if the height field is given first, this won't match. Also, this will replace every match it finds, and there may be parts of your HTML document with width and height fields that you don't want changed to responsive-image.
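One possible way around both caveats is sketched below. It assumes the two attributes sit next to each other inside an <img> tag. The names attr and imgPattern and the sample HTML string are made up for illustration. It also requires at least one digit ("\\d+"), which is a change from the "\\d*" used above.

```mathematica
(* One attribute: width or height, "=", then a quoted run of one or more digits *)
attr = "(?:width|height)\\s*=\\s*\"\\s*\\d+\\s*\"";

(* Match only inside an <img ...> tag; group 1 keeps everything before the pair *)
imgPattern = RegularExpression["(<img\\b[^>]*?)" <> attr <> "\\s*" <> attr];

(* Made-up sample: height comes first in the img tag, and a td carries its own sizes *)
sample = "<img src=\"a.png\" height=\"20\" width=\"10\"> <td width=\"5\" height=\"6\">";

(* $1 re-inserts the captured start of the tag before the new class attribute *)
StringReplace[sample, imgPattern -> "$1class=\"responsive-image\""]
```

With this pattern the td in the sample should be left alone, since the match has to begin at "<img". Note that the alternation would also accept two width attributes or two height attributes in a row, which is probably harmless here.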

POSTED BY: Sam M

It would probably be easiest if the HTML documents were first imported as symbolic XML, then "fixed" with ReplaceAll and the appropriate set of replacement rules, and then exported back as HTML.
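A minimal sketch of that approach, under some assumptions: the tag and attribute names come through the import in lower case, the file names page.html and page-fixed.html are placeholders, and fixImg is just an illustrative name. The result is written with the "XML" exporter, so its formatting may differ from the original HTML.

```mathematica
(* Rule: in every img element, drop width/height and add a class attribute *)
fixImg = XMLElement["img", attrs_List, content_] :>
   XMLElement["img",
    Append[DeleteCases[attrs, ("width" | "height") -> _],
     "class" -> "responsive-image"],
    content];

(* Try the rule on a small hand-built fragment first *)
sample = XMLElement["p", {},
   {XMLElement["img",
     {"src" -> "a.png", "width" -> "1234", "height" -> "3456"}, {}]}];
sample /. fixImg

(* Then on a real file (placeholder names): import as symbolic XML, fix, export *)
xml = Import["page.html", "XMLObject"];
Export["page-fixed.html", xml /. fixImg, "XML"]
```

Because the rule works on the parsed tree, the order of the attributes and the spacing around "=" no longer matter. To run it over a whole folder, the same three steps could be mapped over the list returned by FileNames["*.html", dir].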

Could you please provide a couple of smallish files to try this out?

POSTED BY: Robert Nachbar

I've included five random HTML files that will need to be updated. I've targeted the most crucial and consistent part that needs updating (for starters; if this works, WL can take this much further). For now it would be amazing to point Mathematica to a directory and have it go to work, removing all the old td/tr HTML and replacing it with a new structure. I've outlined the first, easiest pass in the notebook. Can Wolfram Language Save Me?

POSTED BY: heidi kellner