
Transform a string in UTF-8 format into a string in ANSI format?

Posted 6 years ago

I have a string such as a = "abcdefg", which is in UTF-8 format. I want to convert a into a string in ANSI format. How can I do that?

POSTED BY: gearss zhang
6 Replies
Posted 6 years ago

Maybe this MMa.SE discussion can help you:

POSTED BY: Alexey Popkov

For characters that are part of ASCII, the UTF-8 encoding is identical, so nothing changes. For characters that are not part of ASCII, the UTF-8 codes have no representation in ASCII. Therefore, what you ask is either trivial or impossible.
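
As a small illustration of this point, here is a sketch using ToCharacterCode; the sample strings are made up for the example and do not come from the original question.

```mathematica
(* ASCII-only text: the UTF-8 byte values equal the plain character codes *)
ascii = "abcdefg";
ToCharacterCode[ascii, "UTF8"] === ToCharacterCode[ascii]   (* True *)

(* Non-ASCII text: UTF-8 needs bytes above 127, which ASCII cannot represent *)
accented = "caf\[EAcute]";
ToCharacterCode[accented, "UTF8"]                (* {99, 97, 102, 195, 169} *)
Max[ToCharacterCode[accented, "UTF8"]] > 127     (* True *)
```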

POSTED BY: John Doty

"The phrase ANSI character set has no well-defined meaning..." Do you mean Windows-1252? Or do you mean some other ANSI standard such as ASCII or its successor ISO-8859?

Perhaps

 http://reference.wolfram.com/language/ref/$CharacterEncodings.html 

would help you.
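
To check which encoding names a given installation accepts, something like the following could be used. The candidate names are only examples chosen for illustration; which of them appear in the list depends on the installed version.

```mathematica
(* Names of the character encodings this kernel knows about *)
$CharacterEncodings

(* Keep only the candidate names that are actually supported here;
   the candidates are hypothetical picks, not a statement of what is available *)
candidates = {"UTF8", "ISO8859-1", "WindowsANSI", "CP936", "GB18030"};
Select[candidates, MemberQ[$CharacterEncodings, #] &]
```

A name missing from the result cannot be passed as an encoding to functions such as FromCharacterCode or ExportString.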

POSTED BY: Michael Rogers
Posted 6 years ago

Let me explain what I want to do: I have a text file encoded in UTF-8, and I want to convert it to GB18030. But when I run the code:

FromCharacterCode[ToCharacterCode[string,  "UTF8"], targetEncoding]

The message says:

Message[Get::noopen, "/opt/Wolfram/WolframEngine/11.2/SystemFiles/CharacterEncodings/GB18030.m"]

How do I solve this problem?

POSTED BY: gearss zhang
Posted 6 years ago

Let me explain what I want to do: I have a text file encoded in UTF-8, and I want to convert it to ANSI format.

POSTED BY: gearss zhang

I'm not 100% sure I'm right, but I can learn by answering, so let's try:

The question is, what do you mean by a UTF-8 format string? Does it come from a UTF-8 encoded source and was it decoded during import ("Text", "JSON" format, etc.)? Or was it imported as raw bytes from such a source ("String", "Byte")?

Your question suggests the latter, while the former is more likely to be the case. Anyway, for the 'raw bytes' scenario you can use

FromCharacterCode[ToCharacterCode[string,  "UTF8"], targetEncoding]

and for the more likely decoded-string scenario:

ExportString[string, "String",  CharacterEncoding -> targetEncoding]

I haven't had coffee yet, so sorry if I made a mistake; strings and encodings can be confusing.
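
To see which direction each function converts in, here is a minimal sketch of the two scenarios. It relies on ToCharacterCode[string, encoding] producing the byte values of a string in that encoding and FromCharacterCode[codes, encoding] interpreting byte values as that encoding. The sample string is made up, and "ISO8859-1" is only an assumed stand-in for targetEncoding.

```mathematica
(* Assumption: "ISO8859-1" stands in for targetEncoding *)
targetEncoding = "ISO8859-1";

(* Decoded scenario: the string already holds the intended characters *)
decoded = "caf\[EAcute]";
utf8Bytes = ToCharacterCode[decoded, "UTF8"]             (* {99, 97, 102, 195, 169} *)
targetBytes = ToCharacterCode[decoded, targetEncoding]   (* {99, 97, 102, 233} *)

(* Raw-bytes scenario: each character of the string is one UTF-8 byte *)
raw = FromCharacterCode[utf8Bytes];

(* Interpreting those byte values as UTF-8 recovers the characters *)
recovered = FromCharacterCode[ToCharacterCode[raw], "UTF8"];
recovered === decoded   (* True *)
```

For files rather than strings, Import and Export accept a CharacterEncoding option, so the same idea would be to read the file with the source encoding and write it with the target one, provided the target name is in $CharacterEncodings.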

POSTED BY: Kuba Podkalicki