private const WORD_SEGMENTATION_REGEX =
'/([\xc0-\xff][\x80-\xbf]*)/';
segmentByWord( $string )
Eventually, this should perform real word segmentation; for now, it treats each character as a word.
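As a rough illustration of the "each character is a word" behaviour, the following minimal sketch applies the same pattern as WORD_SEGMENTATION_REGEX to a UTF-8 string. The pattern matches one UTF-8 lead byte (0xC0-0xFF) followed by any continuation bytes (0x80-0xBF), i.e. one multibyte character. The function name segmentByCharacterSketch() and the space-padding and space-collapsing steps are assumptions made for this example, not the class's confirmed implementation.

```php
<?php
// Hypothetical stand-alone helper; the name and the padding strategy are
// assumptions for illustration only.
function segmentByCharacterSketch( string $string ): string {
	// Same pattern as WORD_SEGMENTATION_REGEX: a UTF-8 lead byte followed
	// by zero or more continuation bytes (one multibyte character).
	$pattern = '/([\xc0-\xff][\x80-\xbf]*)/';

	// Surround every multibyte character with spaces so that each one
	// can be treated as its own "word".
	$padded = preg_replace( $pattern, ' $1 ', $string );

	// Collapse the runs of spaces produced by adjacent matches.
	return preg_replace( '/ +/', ' ', $padded );
}

echo segmentByCharacterSketch( '日本語' ), "\n"; // prints " 日 本 語 "
echo segmentByCharacterSketch( 'abc' ), "\n";    // prints "abc"
```

ASCII text is left untouched because none of its bytes fall in the 0xC0-0xFF range; only multibyte characters are separated.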
hasWordBreaks()
Most writing systems use whitespace to break up words.