Difference between revisions of "Manual:Unicode"

From Mudlet

Revision as of 10:51, 8 July 2019

Unicode

Changing encoding

Mudlet is being developed to support displaying text in many languages, but how the characters of a language are conveyed varies between MUDs, as do the languages they support. Plain vanilla Telnet only supports the 128 characters of 7-bit ASCII by default, but other languages can be supported if both sides agree on how their characters are converted into 8-bit bytes. That agreement is called an encoding (see https://www.w3.org/International/questions/qa-what-is-encoding). Setting Mudlet (and the MUD server, if appropriate) to the correct encoding allows the correct display of characters like the Spanish ñ, the Russian я, and all other letters (or, more properly, graphemes).
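
To illustrate why both sides must agree on an encoding, here is a minimal sketch in plain Lua showing that the same letter is a different byte sequence in Latin 1 and in UTF-8. The helper name latin1_to_utf8 is made up for this example and is not part of the Mudlet API; Mudlet does this kind of conversion for you once the encoding is set in Preferences.

```lua
-- Hypothetical helper (not a Mudlet function): convert a Latin 1
-- (ISO 8859-1) byte string to UTF-8. Bytes 0-127 are plain ASCII and
-- stay unchanged; bytes 128-255 become a two-byte UTF-8 sequence.
local function latin1_to_utf8(s)
  return (s:gsub("[\128-\255]", function(c)
    local b = c:byte()
    return string.char(192 + math.floor(b / 64), 128 + b % 64)
  end))
end

-- The Spanish ñ is the single byte 241 in Latin 1,
-- but the two bytes 195 and 177 in UTF-8.
local converted = latin1_to_utf8("\241")
print(converted:byte(1, -1))
```

If Mudlet were set to UTF-8 while the game sent Latin 1 (or the other way round), these bytes would be misread and the letter would show up as the wrong symbol.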

Go to Preferences > General to set the encoding:

Prefer UTF-8 if your game supports it.

Note: choosing the encoding will help Mudlet display the letters correctly, but triggers will not currently work with non-English text, and neither will some Lua string. functions like string.len(). We'll be adding these features incrementally and it will all be ready in Mudlet 4.0. Before this work began in Mudlet 3.x, the only encodings that were understood were ASCII and Latin 1 (ISO 8859-1). The latter was covered because it is the default for the Qt libraries (it uses 8 bits, with character codes in the range 0 to 255) and, like all the other ISO 8859 encodings, it is a superset of ASCII that keeps the same codes in the range 0 to 127.

The list of encodings supported by Mudlet is:

Encoding        Mudlet version
ASCII           0.0.1
UTF-8           3.2.0
ISO 8859-1      0.0.1
CP850           3.2.0
CP866           3.2.0
CP874           3.2.0
ISO 8859-10     3.2.0
ISO 8859-11     3.2.0
ISO 8859-13     3.2.0
ISO 8859-14     3.2.0
ISO 8859-15     3.2.0
ISO 8859-16     3.2.0
ISO 8859-2      3.2.0
ISO 8859-3      3.2.0
ISO 8859-4      3.2.0
ISO 8859-5      3.2.0
ISO 8859-6      3.2.0
ISO 8859-7      3.2.0
ISO 8859-8      3.2.0
ISO 8859-9      3.2.0
KOI8-R          3.2.0
KOI8-U          3.2.0
MACINTOSH       3.2.0
WINDOWS-1250    3.2.0
WINDOWS-1251    3.2.0
WINDOWS-1252    3.2.0
WINDOWS-1253    3.2.0
WINDOWS-1254    3.2.0
WINDOWS-1255    3.2.0
WINDOWS-1256    3.2.0
WINDOWS-1257    3.2.0
WINDOWS-1258    3.2.0

Scripting with Unicode

Mudlet uses English throughout its Lua API so that scripts can be international: a script written on a computer with a German default locale will work on a computer with an English default locale, for example. This means you can expect all API functions and error messages to be in English, and the decimal separator is always a period (.). Mudlet sets os.setlocale("C") by default; see https://www.lua.org/pil/22.2.html for background.
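
As a small sketch of what the "C" locale means for scripts, the snippet below sets the locale explicitly (Mudlet already does this, so the call is only there to make the example run in plain Lua) and converts numbers to and from text.

```lua
-- Mudlet sets this by default; repeated here so the sketch is standalone.
os.setlocale("C")

-- Numbers are always written with a period as the decimal separator...
local text = tostring(1.5)
print(text)

-- ...and are only parsed when written that way.
print(tonumber("1.5"))
print(tonumber("1,5"))  -- a comma is not accepted, so this gives nil
```

A script that builds or parses numbers this way should behave the same regardless of the language settings of the computer it runs on.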

Not all Lua functions beginning with string. work with Unicode, because they operate on bytes rather than characters. Mudlet has utf8. equivalents for those; see "String functions in Mudlet" (Manual:String_Functions) for a complete list. For example:

print(string.len("слово"))
> 10 -- wrong!
print(utf8.len("слово"))
> 5  -- correct!
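
The difference comes from string.len() counting bytes while each Cyrillic letter takes two bytes in UTF-8. The following is a rough sketch in plain Lua of how characters can be counted instead; count_utf8_chars is a made-up name for illustration, and in Mudlet you would simply use utf8.len(). It assumes the input is valid UTF-8.

```lua
-- Hypothetical helper (use utf8.len() in Mudlet instead): count characters
-- by counting every byte that is NOT a UTF-8 continuation byte (128-191).
local function count_utf8_chars(s)
  local _, count = s:gsub("[^\128-\191]", "")
  return count
end

-- "слово" written as explicit UTF-8 bytes, so the example does not
-- depend on the encoding this file is saved in.
local word = "\209\129\208\187\208\190\208\178\208\190"

print(#word)                   -- number of bytes
print(count_utf8_chars(word))  -- number of characters
```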

Loading external Lua files

Mudlet uses Unicode (UTF-8) for the trigger engine and Lua subsystem. If you are loading an external file with Lua (see https://www.lua.org/pil/8.html), make sure it is saved in UTF-8 encoding.
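
If you are unsure how a file was saved, a check along the following lines can catch the most common mistake (a file saved in Latin 1 or a Windows code page). This is a simplified sketch, not a Mudlet feature: looks_like_utf8 and load_if_utf8 are hypothetical names, and the check only verifies the byte structure of UTF-8 without rejecting every invalid sequence (overlong forms, for example, are not detected).

```lua
-- Hypothetical helper: returns true if every byte sequence in s has the
-- shape of a UTF-8 character (lead byte followed by the right number of
-- continuation bytes in the range 128-191).
local function looks_like_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local b = s:byte(i)
    local extra
    if b < 128 then extra = 0                       -- plain ASCII
    elseif b >= 194 and b <= 223 then extra = 1     -- 2-byte character
    elseif b >= 224 and b <= 239 then extra = 2     -- 3-byte character
    elseif b >= 240 and b <= 244 then extra = 3     -- 4-byte character
    else return false end                           -- not a valid lead byte
    for j = i + 1, i + extra do
      local c = s:byte(j)
      if not c or c < 128 or c > 191 then return false end
    end
    i = i + extra + 1
  end
  return true
end

-- Hypothetical wrapper: only run the file if its content passes the check.
local function load_if_utf8(path)
  local f = assert(io.open(path, "rb"))
  local content = f:read("*a")
  f:close()
  if not looks_like_utf8(content) then
    error(path .. " does not look like UTF-8; re-save it as UTF-8")
  end
  return dofile(path)
end

print(looks_like_utf8("se\195\177or"))  -- "señor" in UTF-8
print(looks_like_utf8("se\241or"))      -- "señor" in Latin 1
```

Most text editors can show and change a file's encoding when saving, which is usually the simpler fix.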