BreakIterator also provides a C language API.
- Line boundaries
-- used for line-wrapping
-- correctly handles punctuation and hyphenated words
- Sentence boundaries
-- handles periods within numbers and abbreviations
-- handles trailing punctuation marks such as parentheses
- Word boundaries
-- for search and replace functions
-- for selecting words with a mouse double-click
- Character boundaries
-- handles combining characters
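
The four boundary types listed above can be tried out with a short sketch. It is only an illustration under stated assumptions: it uses the JDK's java.text.BreakIterator as a stand-in, because that class offers the same four kinds of iterator (line, sentence, word, character), and it does not show the ICU C or C++ API. The class name BoundaryDemo and the helper segments() are hypothetical names chosen for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative stand-in: this uses the JDK's java.text.BreakIterator,
// not the ICU C or C++ API. BoundaryDemo and segments() are hypothetical names.
class BoundaryDemo {

    // Returns the pieces of text lying between consecutive boundaries
    // reported by the given iterator.
    static List<String> segments(BreakIterator it, String text) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        Locale loc = Locale.US;

        // Line boundaries: candidate positions for line-wrapping.
        System.out.println(segments(BreakIterator.getLineInstance(loc),
                "A well-known example (with punctuation)."));

        // Sentence boundaries: the period inside "3.5" is not a sentence end.
        System.out.println(segments(BreakIterator.getSentenceInstance(loc),
                "It costs 3.5 dollars. Next one."));

        // Word boundaries: punctuation and spaces come back as separate pieces.
        System.out.println(segments(BreakIterator.getWordInstance(loc),
                "Hello, world"));

        // Character boundaries: 'e' plus a combining acute accent is one unit.
        System.out.println(segments(BreakIterator.getCharacterInstance(loc),
                "e\u0301x").size());
    }
}
```

The helper walks from first() through next() until DONE is returned, which is the usual iteration pattern for this kind of iterator. A word-selection feature would instead ask for the boundaries surrounding a given offset.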
ReadMe for IBM's International Classes for Unicode, API Overview