Rating: Summary: A great book if you want to understand Unicode Review: I find this book extremely useful! This is almost three books in one. The first part provides a very good introduction to Unicode in general. The middle is really useful for all sorts of people, from linguists to content authors who want to understand the scripts encompassed by Unicode. And the last part is extremely helpful for programmers who want to understand how to implement many text processing techniques using Unicode. Throughout, Rich's style is easy and enjoyable to read, and yet quickly gets to a wealth of useful information. Great job! Highly recommended.
Rating: Summary: Want to understand the Unicode standard? Start here! Review: The book has three main parts: (1) Unicode in essence: an architectural overview of the Unicode standard (six chapters), where you also get bits of terminology and history. (2) Unicode in depth: a guided tour of the character repertoire (six chapters), where you get a lot about the writing systems that can be represented in Unicode, and less about the Unicode characters themselves. (3) Unicode in action: implementing and using the Unicode standard (five chapters), where you get information aimed at computer programmers who wish to implement parts of the standard or write applications dealing with multilingual text. Though this book is very long (~800 pages), it is still shorter and much clearer than the Unicode standard itself (over 1000 pages). Code examples are in Java, but they are not meant to be complete solutions, so there is no accompanying website or CD. Professional programmers are the target audience of this book. The reader is faced with many topics in linguistics, history and data structures. Readers with a computer science background would probably appreciate how classic algorithms were adapted and how data structures are used for character sets with far more than 256 characters. The author states that the book is about "representing written language in a computer", which may be misleading to some readers. The book is about the Unicode standard. Obviously, there are many ways to represent written language other than the methods described in the book. As chapter 2 teaches, there are always more ways (sometimes better ways) to represent your data. Part 2 of the book does not cover every writing system of the world. A better book for that would be "The World's Writing Systems". Part 3 is probably the most interesting and useful part for programmers (though the first part is important, in my opinion, to those who want to UNDERSTAND Unicode).
You can learn about a lot of things and skip many too (depending on your interest and need). I believe that most readers will skip most of the topics. This is not a book to be read lightly, but it is a hell of a lot easier and more fun to read than the Unicode standard itself. It appears that once you read this book and get what you want from it, you will end up going to the Unicode standard only for updates and, hopefully, not for clarifications. I work in natural language processing and, being a Hebrew speaker, I also have a lot of text in Hebrew (almost always Hebrew mixed with other languages, e.g. documents that contain Hebrew with some English). This book helps you understand the difficulties and the current implementations, and gives you solid ground from which to start thinking about how to make things better. Current infrastructure for Hebrew is either poor or imperfect, and in most cases the better solutions are proprietary. There always seem to be problems representing 'plain' text in more than one language without falling into the trap of a soup of different ways to do it. Unicode is one way to do it (arguably not the best, yet it is alive and growing). I hope this book can help more people understand what they are up against, clear the fog and help people build better implementations.
Rating: Summary: Perfect Companion Volume to the Standard Itself. Review: This book is an outstanding companion volume to the Unicode standard. In fact, if you had to pick one, you'd quite possibly be better off owning this book INSTEAD of the standard. The author displays an impressive knowledge of the world's writing systems and of the inner workings of the Unicode standardization process. Part I of this book starts with the history of character encoding standards, from Morse code to today. It then presents a thorough review of the Unicode architecture and associated standards. The information presented was mostly excellent, although I found the section describing SCSU a little too sketchy (and the actual code in Part III not entirely satisfactory for filling in the gaps). Part II gives an overview of the various writing systems and character ranges represented in Unicode. Even for a nontechnical audience, this part would be fascinating, with all the typographical and historical trivia it presents. Part III discusses various algorithms applicable to text processing in a Unicode context. I must admit that I found this part a bit of a letdown. Many of the algorithms are only sketched out because discussing them in detail would be beyond the scope of the book. Quite possibly, the pages dedicated to these algorithms would have been better spent presenting examples of code using the various existing APIs for handling Unicode (Java, ICU, Perl, Windows, MacOS X). This does not take away from the fact that this is a great book that any programmer interested in Unicode should own.
Rating: Summary: A great manual for the practical use of Unicode Review: Unicode Demystified is a great manual and a good read. It earns a place on the bookshelf of programmers who deal with modern text processing, which is based on the Unicode standard. It is a great resource for anyone involved in software internationalization and localization. Gillam provides a lot of useful details, history and explanations of the structure of the character set, and shows how to use it. The book is a companion to the print and online resources of the Unicode standard itself, and provides the glue between many of the pieces: the how-to's and basic data structures. For example, the Unicode encodings UTF-8/16/32 (and the BOM) are explained very well, bidirectional text is discussed with a lot of insight, and the family of Indic scripts with their special features is presented with examples of how to encode Indic text.