Author Topic: Dumb Bitmap Font question regarding foreign characters...  (Read 952 times)
DrDerekDoctors
THE ARSEHAMMER
« on: May 18, 2017, 12:14:04 AM »

So, if you've made a game which uses bitmap fonts and which has localisation in it (i.e. about 62 extra characters for various languages) - how did you cope with that? D'you manually redirect the foreign characters to custom values in your bitmap font at load time? I ask because their character values are generally negative and so *something* needs to be done with them.

For reference, here's my font:

https://www.dropbox.com/s/v8zzoteql9wtr1o/DialogueBoxBodyFont.gif?dl=0
« Last Edit: May 18, 2017, 12:45:33 AM by DrDerekDoctors » Logged

Me, David Williamson and Mark Foster do an Indie Games podcast. Give it a listen. And then I'll send you an apology.
http://pigignorant.com/
oahda
« Reply #1 on: May 18, 2017, 01:27:32 AM »

Not an answer to your question, but in your font a lot of the diacritics look displaced and would look better one pixel to the left IMO :P
« Last Edit: May 18, 2017, 09:33:33 AM by Prinsessa » Logged

Schrompf

C++ professional, game dev sparetime


« Reply #2 on: May 18, 2017, 02:29:00 AM »

That's a deep cave you're looking at.

First: negative numbers are not negative by nature, it's just an interpretation applied when treating them. Some operations, such as printing, comparison or division, do differ. But many operations, addition or multiplication for example, do exactly the same for signed and unsigned types.

So just cast those numbers to an unsigned type and you can use them as array indices.
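
A minimal sketch of that cast in C++. The Glyph struct and the function names are made up for illustration; only the conversion through unsigned char is the point, and it assumes one byte per character and a 256-entry table.

```cpp
#include <cstddef>

// Hypothetical glyph record; the fields are placeholders for whatever
// a font loader stores per character.
struct Glyph {
    int x;      // left edge of the glyph in the bitmap, in pixels
    int width;  // glyph width in pixels
};

// Convert a char to a table index in the range 0..255.
// Where plain char is signed, a byte such as 0xE4 reads as a negative
// value; converting to unsigned char maps it back to 128..255.
constexpr std::size_t glyph_index(char c) {
    return static_cast<unsigned char>(c);
}

// Look up a glyph in a 256-entry table without ever indexing negatively.
constexpr const Glyph& lookup(const Glyph (&table)[256], char c) {
    return table[glyph_index(c)];
}
```

Casting straight from char to int or size_t would keep the negative value (or turn it into a huge index), so the cast has to go through unsigned char first.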

Second: the array to look into. This is where the dragons live. You have a number, and you want to know what graphics to put on screen. But there are waaayyy more characters than there are values in a char type. And over the years, people have come up with many solutions to this.

"Unicode" is the table of all characters that mankind has come up with. It's basically a huge table which assigns an index (a "code point") and a reference glyph to each and every letter, digit or thingy you can think of. As of Unicode 10.0 it has 136690 entries.

"Encoding" is the method by which you store the characters as bytes in memory. There are a lot of different encodings, mainly because back in the day people tried to get away with a single byte per character. But a byte (the char type in C/C++) can only distinguish 256 values while the number of characters is >100k, so most characters are simply not present in those encodings. Old Windows versions, for example, had a "table" of characters suitable only for certain world regions, and were simply unable to hold chars from other regions. Western Europe got Windows-1252, a close relative of ISO 8859-1 containing umlauts and the like such as äöüß, but it was literally impossible to store Chinese characters in there. So a Windows version for China, for example, used a different encoding. As did Eastern Europe. There were a lot of those tables. If you read a value of 214, for example, it meant Ö in the Western European table but Ц in the Cyrillic one.
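
To make the "same byte, different character" problem concrete, here is a small C++ sketch that maps one byte to a Unicode index under two single-byte tables. The function names are invented for this example, and the Windows-1251 mapping is deliberately partial: it only covers ASCII and the 0xC0..0xFF block.

```cpp
#include <cstdint>

// ISO 8859-1 (Latin-1): every byte value is its own Unicode code point.
constexpr std::uint32_t latin1_to_unicode(unsigned char b) {
    return b;
}

// Windows-1251 (Cyrillic), partial: only the ASCII range and the block
// 0xC0..0xFF (mapped to U+0410..U+044F) are handled in this sketch.
// All other bytes return U+FFFD, the replacement character.
constexpr std::uint32_t cp1251_to_unicode(unsigned char b) {
    return b < 0x80  ? b
         : b >= 0xC0 ? 0x0410u + (b - 0xC0u)
                     : 0xFFFDu;
}
```

Feeding byte 0xD6 to the first function gives U+00D6 (Ö), while the second gives U+0426 (Ц). The byte on disk is identical; only the table you pick decides what it means.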

So it was obvious that if you want to solve this once and for all, you'll have to use more than one byte per character. Nowadays, the default for everyone except a few Microsoft programs is UTF8. UTF8 is an interesting idea because for plain English it's compatible with ASCII byte by byte. But once you leave the realm of ASCII and go >127 (the "negative" numbers you mentioned), the first byte denotes how many additional bytes after it belong to the same character, and some bit fiddling gives you the actual character index.
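
The bit fiddling looks roughly like this in C++. This is a sketch, not a hardened decoder: the Decoded struct and decode_utf8 name are made up here, and overlong encodings and surrogate values are not rejected.

```cpp
#include <cstddef>
#include <cstdint>

// Result of decoding one character: the Unicode code point and the number
// of bytes it occupied. length == 0 signals malformed or truncated input.
struct Decoded {
    std::uint32_t code_point;
    std::size_t length;
};

// Decode the UTF-8 sequence starting at s[0]; n is the number of bytes
// available from that position.
constexpr Decoded decode_utf8(const char* s, std::size_t n) {
    if (n == 0) return {0, 0};
    const unsigned char lead = static_cast<unsigned char>(s[0]);

    std::size_t extra = 0;  // continuation bytes announced by the lead byte
    std::uint32_t cp = 0;   // payload bits collected so far
    if (lead < 0x80)                { extra = 0; cp = lead; }         // 0xxxxxxx
    else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }  // 110xxxxx
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }  // 1110xxxx
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }  // 11110xxx
    else return {0, 0};  // stray continuation byte or invalid lead byte

    if (extra >= n) return {0, 0};  // sequence is cut off

    for (std::size_t i = 1; i <= extra; ++i) {
        const unsigned char cont = static_cast<unsigned char>(s[i]);
        if ((cont & 0xC0) != 0x80) return {0, 0};  // must look like 10xxxxxx
        cp = (cp << 6) | (cont & 0x3F);            // append 6 payload bits
    }
    return {cp, extra + 1};
}
```

A text renderer would call this in a loop, advancing by the returned length each time and using code_point to pick the glyph. For example, the two bytes 0xC3 0xA4 decode to U+00E4 (ä).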

If you're going to write your own font renderer, a) use UTF8 everywhere and b) convert strings to and from UTF16 at the Windows API boundary using one of the dozens of conversion functions (MultiByteToWideChar and WideCharToMultiByte, for example). Microsoft decided to use UTF16 instead, and that's yet another method of storing characters in bytes.
Logged

Snake World, multiplayer worm eats stuff and grows DevLog
Polly
« Reply #3 on: May 18, 2017, 02:33:13 AM »

If that font contains all the characters you're going to need, you can simply encode your text using ANSI (8 bits per character).



Edit - Actually just noticed you have a capital Eszett in your font. That character only got added to Unicode in 2008 and isn't available in ANSI. But in case you don't need it, simply reorganize your bitmap like this.
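
The reorganised image isn't reproduced here, so the following C++ sketch only illustrates the general idea: if the glyphs sit in the bitmap in the same order as the 8-bit character values, the byte itself locates the glyph and no redirection table is needed. The FontSheet layout (fixed-size cells, a given number of columns, first cell = a given byte value) is an assumption, not the actual layout of the font in this thread.

```cpp
// Pixel rectangle of one glyph cell inside the font bitmap.
struct Cell {
    int x, y, w, h;
};

// Hypothetical sheet layout: fixed-size cells, `columns` cells per row,
// and the first cell holds the character with byte value `first`
// (32 would be the space character).
struct FontSheet {
    int cell_w;
    int cell_h;
    int columns;
    int first;
};

// Find the cell for an 8-bit character. The caller is expected to skip
// bytes below `first`; they would produce negative coordinates.
constexpr Cell glyph_cell(FontSheet f, char c) {
    // Go through unsigned char so bytes 128..255 do not become negative.
    return Cell{
        ((static_cast<unsigned char>(c) - f.first) % f.columns) * f.cell_w,
        ((static_cast<unsigned char>(c) - f.first) / f.columns) * f.cell_h,
        f.cell_w,
        f.cell_h};
}
```

With 8x12 cells, 16 columns and the sheet starting at space, 'A' (65) lands in column 1 of row 2. A proportional font would need a per-glyph width on top of this, which is left out here.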
« Last Edit: May 18, 2017, 05:30:44 AM by Polly » Logged
DrDerekDoctors
THE ARSEHAMMER
« Reply #4 on: May 18, 2017, 07:10:29 AM »

Thanks all. As for the diacritics, yeah, a few of them are in the wrong place, although a number of them are an odd number of pixels wide where the letter is an even number so there was bound to be a little wonk either way. I've changed them so that nearly all of them match odd/even width of the letter, but it does mean some are drawn differently from others now. Thoughts?

As for the capital eszett, I've nuked that now, they can make do with a double S instead. I think I'm gonna stick with ANSI as it covers all the characters I need (tbh, there's little chance of the game being converted to *any* other language but I'm just trying to do my due diligence in case it ends up getting the EFIGS treatment). Certainly I see no chance of there being anything beyond the characters which ANSI affords me, in part because a large part of the game is about translating a system of runes and so I can't really accommodate many more letters and letters with diacritics.
Logged

DrDerekDoctors
THE ARSEHAMMER
« Reply #5 on: May 18, 2017, 07:13:52 AM »

Quote from: Polly
Edit - Actually just noticed you have a capital Eszett in your font. That character only got added to Unicode in 2008 and isn't available in ANSI. But in case you don't need it, simply reorganize your bitmap like this.

Ah, cheers for the re-organised image - that's a lot simpler than some stupid redirection table. :)
Logged
