@JoshJers the problem is that there's no such thing as a Unicode character, especially now that you have joining characters for emoji that can join an arbitrary number of different emojis into a new one. Having researched this, the best idea I found is UTF-8 everywhere and then you can iterate over the code points. But really I think no language solves this because it's not strictly solvable.
@JoshJers I did write iterators over code points in C# (forward and backward) but that still doesn't give you actual characters so *farting noises*
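To make the code point vs. "actual character" gap concrete, here's a rough Python sketch (not the C# iterators mentioned above; `code_points` and `family` are made-up names for illustration). It assumes the usual family emoji sequence of three emoji glued together with zero-width joiners (U+200D):

```python
# Hypothetical illustration: one visible emoji, several code points.
# "family" is man + ZWJ + woman + ZWJ + girl, written with escapes
# so the source file's encoding doesn't matter.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"


def code_points(text):
    """Return the code points of text as 'U+XXXX' labels, in order."""
    # Iterating a Python str yields one code point at a time.
    return [f"U+{ord(ch):04X}" for ch in text]


forward = code_points(family)
# Walking backward over code points is just the reverse order.
backward = list(reversed(forward))

print(forward)                       # five code points for one glyph
print(len(family))                   # count of code points
print(len(family.encode("utf-8")))   # count of UTF-8 bytes
```

Forward or backward, you still only get code points; nothing here tells you the five of them render as a single thing on screen.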
@eniko yeah and there's not even a really great way to HANDLE that without, like, "here's an array of UTF-32 chars that represent the whole visible thing, lol have fun" at every step
At work I wrote the code that does canonical string comparison (i.e. if you have é written as e + accent or as a single character, they'll compare as identical) and it was a nightmare of tables and clever table compression
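For a sense of what canonical comparison does, here's a minimal Python sketch. It is not the table-based implementation described above; it just leans on the standard library's `unicodedata.normalize`, and `canonically_equal` is a hypothetical helper name. The idea, assuming NFC normalization is an acceptable way to compare, is to normalize both strings and then compare the results:

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after normalizing both to NFC.

    Hypothetical helper: canonically equivalent sequences normalize to
    the same string, so plain equality works after this step.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e9"   # é as a single code point
decomposed = "e\u0301"   # e followed by a combining acute accent

print(precomposed == decomposed)                   # raw comparison
print(canonically_equal(precomposed, decomposed))  # canonical comparison
```

The tables and compression live inside the library here, which is exactly the part that was a nightmare to write by hand.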