# How can I remove the formatting from imported RTF (Rich Text Format) files?

Hi everyone,

How can I strip away the formatting and leave only the text when I import an RTF file? I've imported it as `Import[myFile, "RTF"]`.

Gregory
2 years ago
4 Replies
Tim Mayes 1 Vote

Gregory,

Maybe try something like this:

    nb = Last[Import["C:\\Users\\YourName\\Desktop\\This is an RTF file.rtf", "Rules"]]

That should open a new notebook that contains the text contents of the RTF file. From there, I think you should be able to programmatically do whatever you want with the text.
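As a rough illustration of "doing whatever you want with the text", here is a minimal sketch that pulls plain strings out of a notebook expression. The `nbExpr` below is a hand-built stand-in, since the exact cell structure an RTF import produces is an assumption here, and `cellText` is a hypothetical helper name. If `nb` turns out to be a notebook object rather than an expression, `NotebookGet[nb]` should give the underlying `Notebook[...]` expression to work on.

```mathematica
(* Hypothetical stand-in for the notebook expression behind an imported RTF file *)
nbExpr = Notebook[{
    Cell["First paragraph.", "Text"],
    Cell[TextData[{"Second ",
       StyleBox["paragraph", FontWeight -> "Bold"], "."}], "Text"]
    }];

(* Hypothetical helper: reduce cell content to a plain string, dropping styling *)
cellText[s_String] := s
cellText[TextData[parts_]] := StringJoin[cellText /@ Flatten[{parts}]]
cellText[StyleBox[content_, ___]] := cellText[content]
cellText[_] := ""  (* anything else (graphics, boxes) is ignored in this sketch *)

(* One plain string per cell *)
plain = Cases[nbExpr, Cell[content_, ___] :> cellText[content], Infinity]
```

Real imports may contain other box types or nested cell groups, so the catch-all `cellText[_]` rule would need extending for those cases.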
Hans Michel 3 Votes

Gregory:

As stated in the help document, "Import and Export support RTF format Version 1.3." According to the Wikipedia article on RTF (https://en.wikipedia.org/wiki/Rich_Text_Format), version 1.3 is from 1993. A quick Google search for RTF examples yielded this site, http://www.jafsoft.com/examples/rtf/testrtf.rtf, an example containing different elements, though no images.

Method 1.

    Needs["NETLink`"]
    InstallNET[];
    rtfz = NETNew["System.Windows.Forms.RichTextBox"]
    rtfz@Rtf = URLFetch["http://www.jafsoft.com/examples/rtf/testrtf.rtf"];
    rtfz@Text

Method 2.

    Needs["XML`"];
    Cases[ToSymbolicXML[
      Import["http://www.jafsoft.com/examples/rtf/testrtf.rtf", "RTF"]],
     XMLElement["String", _, {mtext_}] -> mtext, Infinity]

Method 3.

    rtfrules = ToExpression[Import["path of saved attached file rtfrules.txt on your system"]];
    StringReplace[
     URLFetch["http://www.jafsoft.com/examples/rtf/testrtf.rtf"],
     rtfrules, MetaCharacters -> Automatic]

Here rtfrules is the contents of the attached file. At some point in 2004 I made an initial set of replacement rules based on RTF 1.6 or 1.7. It is only a starting point; setting all of these RTF control tags to "" is not optimal.

Method 4. If you are in a Windows environment, install a Generic/Text Only printer driver whose output goes to a file, and set it as the default printer (before starting Mathematica).

    nb = CreateDocument[
       Import["http://www.jafsoft.com/examples/rtf/testrtf.rtf", "RTF"]];
    NotebookPrint[nb]

The print dialog should pop up to save the *.prn file. Set the paper size to "US Std Fanfold" for 120 characters wide or "Letter" for 80 characters wide. The resulting .prn file should contain ASCII (ANSI) text; depending on the layout, some text may be cut off. Open the saved .prn file in a text editor to see if the output is acceptable.

Method 5. Do something similar to the .NET method, but using Java; it would need to be a Swing object. I could not test this; it is late and my Java is rusty.

    Needs["JLink`"]
    InstallJava[];
    rtfx = JavaNew[rtfobject]  (* rtfobject: something like javax.swing.text.rtf.RTFEditorKit *)
    rtfx@Rtf = URLFetch["http://www.jafsoft.com/examples/rtf/testrtf.rtf"];
    rtfx@Text

All these methods are starting points, as some of them would require memory management if applied repeatedly. The replacement rules would require the most work.

RTF is a somewhat dangerous format, as it accepts embedding of external objects.
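Since the rule file from Method 3 is not reproduced in the thread, here is a minimal sketch of what a replacement-rule approach might look like. The sample string and the three rules in `miniRules` are assumptions made up for illustration; they are not the contents of rtfrules.txt, and they use regular expressions rather than the `MetaCharacters` option.

```mathematica
(* Tiny inline RTF sample (an assumption; real files are far richer) *)
rtf = "{\\rtf1\\ansi\\deff0 Hello, {\\b bold} world.\\par Second line.\\par}";

(* Hypothetical mini rule set, NOT the attached rtfrules.txt.
   StringReplace tries the rules in order at each position, so the
   paragraph rule is listed before the generic control-word rule. *)
miniRules = {
   RegularExpression["\\\\par\\b ?"] -> "\n",        (* paragraph break becomes a newline *)
   RegularExpression["\\\\[a-z]+-?[0-9]* ?"] -> "",  (* any other control word *)
   "{" | "}" -> ""                                   (* group delimiters *)
   };

(* Apply the rules, then trim leading and trailing whitespace *)
plain = StringTrim[StringReplace[rtf, miniRules]]
```

On this sample the rules should leave just the two lines of text. The sketch does not handle font or color tables, escaped braces and backslashes, hexadecimal character escapes, or embedded objects, which is where most of the work in a real rule set would go.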