  
  [1X6 String and Text Utilities[0X
  
  
  [1X6.1 Text Utilities[0X
  
  This section describes some utility functions for handling texts within [5XGAP[0m.
  They  are  used by the functions in the [5XGAPDoc[0m package but may be useful for
  other  purposes  as  well.  We  start  with some variables containing useful
  strings and go on with functions for parsing and reformatting text.
  
  [1X6.1-1 WHITESPACE[0m
  
  [2X> WHITESPACE_________________________________________________[0Xglobal variable
  [2X> CAPITALLETTERS_____________________________________________[0Xglobal variable
  [2X> SMALLLETTERS_______________________________________________[0Xglobal variable
  [2X> LETTERS____________________________________________________[0Xglobal variable
  [2X> DIGITS_____________________________________________________[0Xglobal variable
  [2X> HEXDIGITS__________________________________________________[0Xglobal variable
  
  These  variables  contain  sets  of  characters  which  are  useful for text
  processing. They are defined as follows.
  
  [8X[10XWHITESPACE[0m[8X[0m
        [10X" \n\t\r"[0m
  
  [8X[10XCAPITALLETTERS[0m[8X[0m
        [10X"ABCDEFGHIJKLMNOPQRSTUVWXYZ"[0m
  
  [8X[10XSMALLLETTERS[0m[8X[0m
        [10X"abcdefghijklmnopqrstuvwxyz"[0m
  
  [8X[10XLETTERS[0m[8X[0m
        concatenation of [10XCAPITALLETTERS[0m and [10XSMALLLETTERS[0m
  
  [8X[10XDIGITS[0m[8X[0m
        [10X"0123456789"[0m
  
  [8X[10XHEXDIGITS[0m[8X[0m
        [10X"0123456789ABCDEFabcdef"[0m
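
  As a small illustration (a sketch, not one of the package's own examples),
  these strings can be combined with the [10Xin[0m operator to classify or filter
  characters:

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> Filtered("Room 101, Floor 3", c -> c in DIGITS);[0X
    [4X"1013"[0X
    [4Xgap> ForAll("1A3F", c -> c in HEXDIGITS);[0X
    [4Xtrue[0X
    [4Xgap> ForAll("1A3G", c -> c in HEXDIGITS);[0X
    [4Xfalse[0X
  [4X------------------------------------------------------------------[0X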
  
  [1X6.1-2 TextAttr[0m
  
  [2X> TextAttr___________________________________________________[0Xglobal variable
  
  The  record  [2XTextAttr[0m  contains  strings  which can be printed to change the
  terminal  attribute  for  the  following  characters.  This  only works with
  terminals  which  understand  basic ANSI escape sequences. Try the following
  example  to see if this is the case for the terminal you are using. It shows
  the  effect  of  the  foreground  and background color attributes and of the
  [10X.bold[0m, [10X.blink[0m, [10X.normal[0m, [10X.reverse[0m and[10X.underscore[0m which can partly be mixed.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xextra := ["CSI", "reset", "delline", "home"];;[0X
    [4Xfor t in Difference(RecNames(TextAttr), extra) do[0X
    [4X  Print(TextAttr.(t), "TextAttr.", t, TextAttr.reset,"\n");[0X
    [4Xod;[0X
  [4X------------------------------------------------------------------[0X
  
  The  suggested  defaults for colors [10X0..7[0m are black, red, green, brown, blue,
  magenta,   cyan,  white.  But  this  may  be  different  for  your  terminal
  configuration.
  
  The  escape  sequence  [10X.delline[0m  deletes the content of the current line and
  [10X.home[0m moves the cursor to the beginning of the current line.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xfor i in [1..5] do [0X
    [4X  Print(TextAttr.home, TextAttr.delline, String(i,-6), "\c"); [0X
    [4X  Sleep(1); [0X
    [4Xod;[0X
  [4X------------------------------------------------------------------[0X
  
  Whenever you use these attributes in printing routines, make their use
  optional: emit them only when the variable [10XANSI_COLORS[0m has the value [9Xtrue[0m.
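
  One possible way to follow this advice is sketched below. The variable [10Xmsg[0m
  and the message text are only illustrative, and the [10XIsBound[0m test is an extra
  precaution in case [10XANSI_COLORS[0m is not defined in your session.

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> msg := "Warning: file not found";;[0X
    [4Xgap> # add markup only if the terminal is known to support it[0X
    [4Xgap> if IsBound(ANSI_COLORS) and ANSI_COLORS = true then[0X
    [4X>   msg := Concatenation(TextAttr.bold, msg, TextAttr.reset);[0X
    [4X> fi;[0X
  [4X------------------------------------------------------------------[0X

  Afterwards [10Xmsg[0m can be printed in the same way in both cases.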
  
  [1X6.1-3 WrapTextAttribute[0m
  
  [2X> WrapTextAttribute( [0X[3Xstr, attr[0X[2X ) ___________________________________[0Xfunction
  [6XReturns:[0X  a string with markup
  
  The  argument  [3Xstr[0m  must  be  a  text as [5XGAP[0m string, possibly with markup by
  escape  sequences  as  in  [2XTextAttr[0m  ([14X6.1-2[0m). This function returns a string
  which  is  wrapped by the escape sequences [3Xattr[0m and [10XTextAttr.reset[0m. It takes
  care  of  markup in the given string by appending [3Xattr[0m also after each given
  [10XTextAttr.reset[0m in [3Xstr[0m.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> str := Concatenation("XXX",TextAttr.2, "BLUB", TextAttr.reset,"YYY");[0X
    [4X"XXX\033[32mBLUB\033[0mYYY"[0X
    [4Xgap> str2 := WrapTextAttribute(str, TextAttr.1);[0X
    [4X"\033[31mXXX\033[32mBLUB\033[0m\033[31mYYY\033[0m"[0X
    [4Xgap> str3 := WrapTextAttribute(str, TextAttr.underscore);[0X
    [4X"\033[4mXXX\033[32mBLUB\033[0m\033[4mYYY\033[0m"[0X
    [4Xgap> # use Print(str); and so on to see what it looks like.[0X
  [4X------------------------------------------------------------------[0X
  
  [1X6.1-4 FormatParagraph[0m
  
  [2X> FormatParagraph( [0X[3Xstr[, len][, flush][, attr][, widthfun][0X[2X ) _______[0Xfunction
  [6XReturns:[0X  the formatted paragraph as string
  
  This  function  formats  a  text given in the string [3Xstr[0m as a paragraph. The
  optional arguments have the following meaning:
  
  [8X[3Xlen[0m[8X[0m
        the length of the lines of the resulting text (default is [10X78[0m)
  
  [8X[3Xflush[0m[8X[0m
        can  be [10X"left"[0m, [10X"right"[0m, [10X"center"[0m or [10X"both"[0m, telling that lines should
        be  flushed  left,  flushed  right,  centered or left-right justified,
        respectively (default is [10X"both"[0m)
  
  [8X[3Xattr[0m[8X[0m
        is  a  list  of  two  strings;  the  first is prepended and the second
        appended  to  each  line  of  the  result (can for example be used for
        indenting, [10X[" ", ""][0m, or some markup, [10X[TextAttr.bold, TextAttr.reset][0m,
        default is [10X["", ""][0m)
  
  [8X[3Xwidthfun[0m[8X[0m
        must be a function which returns the display width of text in [3Xstr[0m. The
        default  is  [10XLength[0m assuming that each byte corresponds to a character
        of  width  one.  If  [3Xstr[0m  is  given  in  [10XUTF-8[0m  encoding  one  can use
        [2XWidthUTF8String[0m ([14X6.2-3[0m) here.
  
  This  function tries to handle markup with the escape sequences explained in
  [2XTextAttr[0m ([14X6.1-2[0m) correctly.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> str := "One two three four five six seven eight nine ten eleven.";;[0X
    [4Xgap> Print(FormatParagraph(str, 25, "left", ["/* ", " */"]));           [0X
    [4X/* One two three four five */[0X
    [4X/* six seven eight nine ten */[0X
    [4X/* eleven. */[0X
  [4X------------------------------------------------------------------[0X
  
  [1X6.1-5 SubstitutionSublist[0m
  
  [2X> SubstitutionSublist( [0X[3Xlist, sublist, new[, flag][0X[2X ) ________________[0Xfunction
  [6XReturns:[0X  the changed list
  
  This  function  looks for (non-overlapping) occurrences of a sublist [3Xsublist[0m
  in  a  list  [3Xlist[0m (compare [2XPositionSublist[0m ([14XReference: PositionSublist[0m)) and
  returns a list where these are substituted with the list [3Xnew[0m.
  
  The  optional  argument [3Xflag[0m can either be [10X"all"[0m (this is the default if not
  given)  or [10X"one"[0m. In the second case only the first occurrence of [3Xsublist[0m is
  substituted.
  
  If  [3Xsublist[0m  does  not occur in [3Xlist[0m then [3Xlist[0m itself is returned (and not a
  [10XShallowCopy(list)[0m).
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> SubstitutionSublist("xababx", "ab", "a");[0X
    [4X"xaax"[0X
  [4X------------------------------------------------------------------[0X
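
  To illustrate the optional argument [3Xflag[0m, the same call with [10X"one"[0m should,
  according to the description above, replace only the first occurrence:

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> SubstitutionSublist("xababx", "ab", "a", "one");[0X
    [4X"xaabx"[0X
  [4X------------------------------------------------------------------[0X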
  
  [1X6.1-6 StripBeginEnd[0m
  
  [2X> StripBeginEnd( [0X[3Xlist, strip[0X[2X ) _____________________________________[0Xfunction
  [6XReturns:[0X  changed string
  
  Here [3Xlist[0m and [3Xstrip[0m must be lists. This function returns the sublist of list
  which does not contain the leading and trailing entries which are entries of
  [3Xstrip[0m. If the result is equal to [3Xlist[0m then [3Xlist[0m itself is returned.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> StripBeginEnd(" ,a, b,c,   ", ", ");[0X
    [4X"a, b,c"[0X
  [4X------------------------------------------------------------------[0X
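
  A common use, sketched here with the variable [2XWHITESPACE[0m ([14X6.1-1[0m), is to trim
  surrounding whitespace from a string; inner whitespace is left untouched:

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> StripBeginEnd("  \t some text \n", WHITESPACE);[0X
    [4X"some text"[0X
  [4X------------------------------------------------------------------[0X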
  
  [1X6.1-7 StripEscapeSequences[0m
  
  [2X> StripEscapeSequences( [0X[3Xstr[0X[2X ) ______________________________________[0Xfunction
  [6XReturns:[0X  string without escape sequences
  
  This  function  returns  the string one gets from the string [3Xstr[0m by removing
  all  escape  sequences  which are explained in [2XTextAttr[0m ([14X6.1-2[0m). If [3Xstr[0m does
  not contain such a sequence then [3Xstr[0m itself is returned.
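
  As a sketch, applying this function to the marked up string from the example
  in [2XWrapTextAttribute[0m ([14X6.1-3[0m) should give back the plain text:

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> str := Concatenation("XXX",TextAttr.2, "BLUB", TextAttr.reset,"YYY");;[0X
    [4Xgap> StripEscapeSequences(str);[0X
    [4X"XXXBLUBYYY"[0X
  [4X------------------------------------------------------------------[0X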
  
  [1X6.1-8 RepeatedString[0m
  
  [2X> RepeatedString( [0X[3Xc, len[0X[2X ) _________________________________________[0Xfunction
  
  Here  [3Xc[0m  must  be  either  a character or a string and [3Xlen[0m is a non-negative
  number.  Then  [2XRepeatedString[0m  returns  a string of length [3Xlen[0m consisting of
  copies of [3Xc[0m.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> RepeatedString('=',51);[0X
    [4X"==================================================="[0X
    [4Xgap> RepeatedString("*=",51);[0X
    [4X"*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*"[0X
  [4X------------------------------------------------------------------[0X
  
  [1X6.1-9 NumberDigits[0m
  
  [2X> NumberDigits( [0X[3Xstr, base[0X[2X ) ________________________________________[0Xfunction
  [6XReturns:[0X  integer
  
  [2X> DigitsNumber( [0X[3Xn, base[0X[2X ) __________________________________________[0Xfunction
  [6XReturns:[0X  string
  
  The  argument  [3Xstr[0m  of  [2XNumberDigits[0m  must be a string consisting only of an
  optional leading [10X'-'[0m and characters in [10X0123456789abcdefABCDEF[0m, describing an
  integer  in  base  [3Xbase[0m  with  2  <=  [3Xbase[0m  <= 16. This function returns the
  corresponding integer.
  
  The function [2XDigitsNumber[0m does the reverse.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> NumberDigits("1A3F",16);[0X
    [4X6719[0X
    [4Xgap> DigitsNumber(6719, 16);[0X
    [4X"1A3F"[0X
  [4X------------------------------------------------------------------[0X
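
  The same functions work for other bases in the allowed range. The following
  sketch uses bases 2 and 7 and checks that a round trip returns the original
  integer:

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> NumberDigits("101", 2);[0X
    [4X5[0X
    [4Xgap> DigitsNumber(255, 2);[0X
    [4X"11111111"[0X
    [4Xgap> NumberDigits(DigitsNumber(6719, 7), 7);[0X
    [4X6719[0X
  [4X------------------------------------------------------------------[0X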
  
  [1X6.1-10 PositionMatchingDelimiter[0m
  
  [2X> PositionMatchingDelimiter( [0X[3Xstr, delim, pos[0X[2X ) _____________________[0Xfunction
  [6XReturns:[0X  position as integer or [9Xfail[0m
  
  Here  [3Xstr[0m must be a string and [3Xdelim[0m a string with two different characters.
  This function searches for the smallest position [10Xr[0m of the character [10X[3Xdelim[0m[10X[2][0m in
  [3Xstr[0m such that the number of occurrences of [10X[3Xdelim[0m[10X[2][0m in [3Xstr[0m between positions
  [10X[3Xpos[0m[10X+1[0m and [10Xr[0m is one greater than the corresponding number of occurrences
  of [10X[3Xdelim[0m[10X[1][0m.
  
  If such an [10Xr[0m exists, it is returned. Otherwise [9Xfail[0m is returned.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> PositionMatchingDelimiter("{}x{ab{c}d}", "{}", 0);[0X
    [4Xfail[0X
    [4Xgap> PositionMatchingDelimiter("{}x{ab{c}d}", "{}", 1);[0X
    [4X2[0X
    [4Xgap> PositionMatchingDelimiter("{}x{ab{c}d}", "{}", 6);[0X
    [4X11[0X
  [4X------------------------------------------------------------------[0X
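
  A typical application, sketched below, is to extract the content of a
  delimited group. The variable names [10Xp[0m and [10Xq[0m are only illustrative; [10Xp[0m is
  the position of an opening delimiter.

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> str := "{}x{ab{c}d}";;[0X
    [4Xgap> p := 4;;   # position of an opening brace[0X
    [4Xgap> q := PositionMatchingDelimiter(str, "{}", p);[0X
    [4X11[0X
    [4Xgap> str{[p+1..q-1]};   # text between the matching braces[0X
    [4X"ab{c}d"[0X
  [4X------------------------------------------------------------------[0X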
  
  [1X6.1-11 WordsString[0m
  
  [2X> WordsString( [0X[3Xstr[0X[2X ) _______________________________________________[0Xfunction
  [6XReturns:[0X  list of strings containing the words
  
  This  returns  the  list  of  words  of a text stored in the string [3Xstr[0m. All
  non-letters are considered as word boundaries and are removed.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> WordsString("one_two \n    three!?");[0X
    [4X[ "one", "two", "three" ][0X
  [4X------------------------------------------------------------------[0X
  
  [1X6.1-12 Base64String[0m
  
  [2X> Base64String( [0X[3Xstr[0X[2X ) ______________________________________________[0Xfunction
  [2X> StringBase64( [0X[3Xbstr[0X[2X ) _____________________________________________[0Xfunction
  [6XReturns:[0X  a string
  
  The  first  function  translates arbitrary binary data given as a GAP string
  into  a  [13Xbase 64[0m encoded string. This encoded string contains only printable
  ASCII  characters  and  is  used  in  various  data transfer protocols ([10XMIME[0m
  encoded  emails, weak password encryption, ...). We use the specification in
  RFC 2045 ([7Xhttp://tools.ietf.org/html/rfc2045[0m).
  
  The  second  function has the reverse functionality. Here we also accept the
  characters [10X-_[0m instead of [10X+/[0m as the last two characters of the base 64
  alphabet. Whitespace is ignored.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> b := Base64String("This is a secret!");[0X
    [4X"VGhpcyBpcyBhIHNlY3JldCE="[0X
    [4Xgap> StringBase64(b);                       [0X
    [4X"This is a secret!"[0X
  [4X------------------------------------------------------------------[0X
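
  The alternative characters and the handling of whitespace can be tried with
  a string whose encoding contains [10X/[0m. This is a sketch based on the
  description above; the three bytes of [10X"???"[0m encode to [10X"Pz8/"[0m under RFC 2045.

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> Base64String("???");[0X
    [4X"Pz8/"[0X
    [4Xgap> StringBase64("Pz8_");    # '_' accepted instead of '/'[0X
    [4X"???"[0X
    [4Xgap> StringBase64("Pz 8/\n"); # whitespace is ignored[0X
    [4X"???"[0X
  [4X------------------------------------------------------------------[0X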
  
  
  [1X6.2 Unicode Strings[0X
  
  The  [5XGAPDoc[0m  package provides some tools to deal with unicode characters and
  strings.  These  can  be  used  for  recoding  text  strings between various
  encodings.
  
  
  [1X6.2-1 Unicode Strings and Characters[0X
  
  [2X> Unicode( [0X[3Xlist[, encoding][0X[2X ) _____________________________________[0Xoperation
  [2X> UChar( [0X[3Xnum[0X[2X ) ____________________________________________________[0Xoperation
  [2X> IsUnicodeString_____________________________________________________[0Xfilter
  [2X> IsUnicodeCharacter__________________________________________________[0Xfilter
  [2X> IntListUnicodeString( [0X[3Xustr[0X[2X ) _____________________________________[0Xfunction
  
  Unicode characters are described by their [13Xcodepoint[0m, an integer in the range
  from 0 to 2^21-1. For details about unicode, see [7Xhttp://www.unicode.org[0m.
  
  The  function  [2XUChar[0m  wraps  an  integer  [3Xnum[0m into a [5XGAP[0m object lying in the
  filter  [2XIsUnicodeCharacter[0m.  Use [10XInt[0m to get the codepoint back. The argument
  [3Xnum[0m  can  also be a [5XGAP[0m character which is then translated to an integer via
  [2XINT_CHAR[0m ([14XReference: INT_CHAR[0m).
  
  [2XUnicode[0m  produces  a  [5XGAP[0m  object  in  the filter [2XIsUnicodeString[0m. This is a
  wrapped  list  of  integers  for  the  unicode characters in the string. The
  function  [2XIntListUnicodeString[0m  gives access to this list of integers. Basic
  list  functionality  is  available for [2XIsUnicodeString[0m elements. The entries
  are in [2XIsUnicodeCharacter[0m. The argument [3Xlist[0m for [2XUnicode[0m is either a list of
  integers or a [5XGAP[0m string. In the latter case an [3Xencoding[0m can be specified as
  string, its default is [10X"UTF-8"[0m.
  
  Currently supported encodings can be found in
  [10XUNICODE_RECODE.NormalizedEncodings[0m  (ASCII,  ISO-8859-X, UTF-8 and aliases).
  The encoding [10X"XML"[0m means an ASCII encoding in which non-ASCII characters are
  specified  by  XML character entities. The encoding [10X"URL"[0m is for URL-encoded
  (also called percent-encoded) strings, as specified in RFC 3986
  ([7Xhttp://www.ietf.org/rfc/rfc3986.txt[0m). The listed encodings [10X"LaTeX"[0m and
  aliases  cannot  be  used with [2XUnicode[0m. See the operation [2XEncode[0m ([14X6.2-2[0m) for
  mapping a unicode string to a [5XGAP[0m string.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> ustr := Unicode("a and \366", "latin1");[0X
    [4XUnicode("a and ö")[0X
    [4Xgap> ustr = Unicode("a and &#246;", "XML");  [0X
    [4Xtrue[0X
    [4Xgap> IntListUnicodeString(ustr);[0X
    [4X[ 97, 32, 97, 110, 100, 32, 246 ][0X
    [4Xgap> ustr[7];[0X
    [4X'ö'[0X
  [4X------------------------------------------------------------------[0X
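
  The following sketch illustrates the remaining points of the description:
  recovering a codepoint with [10XInt[0m, passing a [5XGAP[0m character to [2XUChar[0m, and
  building a unicode string from a list of integers.

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> Int(UChar(246));[0X
    [4X246[0X
    [4Xgap> Int(UChar('a'));[0X
    [4X97[0X
    [4Xgap> IntListUnicodeString(Unicode([97, 246]));[0X
    [4X[ 97, 246 ][0X
  [4X------------------------------------------------------------------[0X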
  
  [1X6.2-2 Encode[0m
  
  [2X> Encode( [0X[3Xustr[, encoding][0X[2X ) ______________________________________[0Xoperation
  [6XReturns:[0X  a [5XGAP[0m string
  
  [2X> SimplifiedUnicodeString( [0X[3Xustr[, encoding][, "single"][0X[2X ) __________[0Xfunction
  [6XReturns:[0X  a unicode string
  
  [2X> LowercaseUnicodeString( [0X[3Xustr[0X[2X ) ___________________________________[0Xfunction
  [6XReturns:[0X  a unicode string
  
  [2X> UppercaseUnicodeString( [0X[3Xustr[0X[2X ) ___________________________________[0Xfunction
  [6XReturns:[0X  a unicode string
  
  [2X> LaTeXUnicodeTable__________________________________________[0Xglobal variable
  [2X> SimplifiedUnicodeTable_____________________________________[0Xglobal variable
  [2X> LowercaseUnicodeTable______________________________________[0Xglobal variable
  
  The  operation  [2XEncode[0m translates a unicode string [3Xustr[0m into a [5XGAP[0m string in
  some specified [3Xencoding[0m. The default encoding is [10X"UTF-8"[0m.
  
  Supported  encodings  can  be  found  in [10XUNICODE_RECODE.NormalizedEncodings[0m.
  Except for some cases mentioned below, characters which are not available in
  the target encoding are substituted by '?' characters.
  
  If  the  [3Xencoding[0m  is  [10X"URL"[0m (see [2XUnicode[0m ([14X6.2-1[0m)) then an optional argument
  [3Xencreserved[0m  can  be  given,  it must be a list of reserved characters which
  should be percent encoded; the default is to encode only the [10X%[0m character.
  
  The  encoding  [10X"LaTeX"[0m  substitutes  non-ASCII  characters and LaTeX special
  characters  by  LaTeX  code as given in an ordered list [10XLaTeXUnicodeTable[0m of
  pairs  [codepoint,  string].  If  you  have a unicode character for which no
  substitution  is  contained  in  that  list,  you will get a warning and the
  translation  is  [10XUnicode(nr)[0m.  In  this  case  find a substitution and add a
  corresponding  [codepoint,  string]  pair  to [10XLaTeXUnicodeTable[0m using [2XAddSet[0m
  ([14XReference:  AddSet[0m).  Also,  please,  tell  the  [5XGAPDoc[0m  authors about your
  addition,  such  that we can extend the list [10XLaTeXUnicodeTable[0m. (Most of the
  initial  entries  were  generated  from lists in the TeX projects encTeX and
  [10Xucs[0m.) There are some variants of this encoding:
  
  [10X"LaTeXleavemarkup"[0m  does  the same translations for non-ASCII characters but
  leaves the LaTeX special characters (e.g., any LaTeX commands) as they are.
  
  [10X"LaTeXUTF8"[0m  does  not  give  a  warning  about  unicode  characters without
  explicit  translation,  instead  it  translates  the  character to its [10XUTF-8[0m
  encoding.  Make  sure  to  set up your  LaTeX  document  such that all these
  characters are understood.
  
  [10X"LaTeXUTF8leavemarkup"[0m is a combination of the last two variants.
  
  Note  that the [10X"LaTeX"[0m encoding can only be used with [2XEncode[0m but not for the
  opposite  translation  with  [2XUnicode[0m  ([14X6.2-1[0m)  (which  would  need  far  too
  complicated heuristics).
  
  The   function  [2XSimplifiedUnicodeString[0m  can  be  used  to  substitute  many
  non-ASCII  characters  by  related  ASCII  characters or strings (e.g., by a
  corresponding  character  without accents). The argument [3Xustr[0m and the result
  are unicode strings. If [3Xencoding[0m is [10X"ASCII"[0m then all non-ASCII characters
  are translated, otherwise only the non-latin1 characters. If the string
  [10X"single"[0m is given as an argument then only substitutions are considered
  which don't make the result string longer. The translations are stored in a
  sorted list [10XSimplifiedUnicodeTable[0m. Its entries are of the form [10X[codepoint,
  trans1, trans2, ...][0m. Here [10Xtrans1[0m and so on is either an integer for the
  codepoint of a substitution character or a list of codepoint integers. If
  you are missing characters in this list and know a sensible ASCII
  approximation, then add an entry (with [2XAddSet[0m ([14XReference: AddSet[0m)) and tell
  the [5XGAPDoc[0m authors about it. (The initial content of [10XSimplifiedUnicodeTable[0m
  was mainly generated from the "[10Xtranstab[0m" tables by Markus Kuhn.)
  
  The  function  [2XLowercaseUnicodeString[0m  gets and returns a unicode string and
  translates  each uppercase character to its corresponding lowercase version.
  This  function  uses  a  list  [10XLowercaseUnicodeTable[0m  of  pairs of codepoint
  integers.  This  list  was generated using the file [11XUnicodeData.txt[0m from the
  unicode definition (field 14 in each row).
  
  The   function   [2XUppercaseUnicodeString[0m  does  the  similar  translation  to
  uppercase characters.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> ustr := Unicode("a and &#246;", "XML");[0X
    [4XUnicode("a and ö")[0X
    [4Xgap> SimplifiedUnicodeString(ustr, "ASCII");[0X
    [4XUnicode("a and oe")[0X
    [4Xgap> SimplifiedUnicodeString(ustr, "ASCII", "single");[0X
    [4XUnicode("a and o")[0X
    [4Xgap> ustr2 := UppercaseUnicodeString(ustr);;[0X
    [4Xgap> Print(Encode(ustr2, GAPInfo.TermEncoding), "\n");[0X
    [4XA AND Ö[0X
  [4X------------------------------------------------------------------[0X
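
  The effect of the target encoding on the resulting [5XGAP[0m string can be seen
  from its length in bytes. The sketch below also checks that lowercasing the
  uppercased string gives back the original for this particular input.

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> ustr := Unicode("a and &#246;", "XML");;[0X
    [4Xgap> Length(Encode(ustr, "UTF-8"));   # o umlaut needs two bytes[0X
    [4X8[0X
    [4Xgap> Length(Encode(ustr, "latin1"));  # one byte per character[0X
    [4X7[0X
    [4Xgap> LowercaseUnicodeString(UppercaseUnicodeString(ustr)) = ustr;[0X
    [4Xtrue[0X
  [4X------------------------------------------------------------------[0X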
  
  
  [1X6.2-3 Lengths of UTF-8 strings[0X
  
  [2X> WidthUTF8String( [0X[3Xstr[0X[2X ) ___________________________________________[0Xfunction
  [2X> NrCharsUTF8String( [0X[3Xstr[0X[2X ) _________________________________________[0Xfunction
  [6XReturns:[0X  an integer
  
  Let  [3Xstr[0m  be  a  [5XGAP[0m  string  with  text  in UTF-8 encoding. There are three
  "lengths" of such a string which must be distinguished. The operation [2XLength[0m
  ([14XReference:  Length[0m)  returns the number of bytes and so the memory occupied
  by  [3Xstr[0m.  The  function  [2XNrCharsUTF8String[0m  returns  the  number  of unicode
  characters in [3Xstr[0m, that is the length of [10XUnicode([3Xstr[0m[10X)[0m.
  
  In  many applications the function [2XWidthUTF8String[0m is more interesting; it
  returns the number of columns needed by the string if printed to a terminal.
  This   takes  into  account  that  some  unicode  characters  are  combining
  characters  and that there are wide characters which need two columns (e.g.,
  for  Chinese  or Japanese). (To be precise: This implementation assumes that
  there are no control characters in [3Xstr[0m and uses the character width returned
  by the [10Xwcwidth[0m function in the GNU C-library called with UTF-8 locale.)
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> # A, German umlaut u, B, zero width space, C, newline[0X
    [4Xgap> str := Encode( Unicode( "A&#xFC;B&#x200B;C\n", "XML" ) );;[0X
    [4Xgap> Print(str);[0X
    [4XAüB​C[0X
    [4Xgap> # umlaut u needs two bytes and the zero width space three[0X
    [4Xgap> Length(str);[0X
    [4X9[0X
    [4Xgap> NrCharsUTF8String(str);[0X
    [4X6[0X
    [4Xgap> # zero width space and newline don't contribute to width[0X
    [4Xgap> WidthUTF8String(str);[0X
    [4X4[0X
  [4X------------------------------------------------------------------[0X
  
  
  [1X6.3 Print Utilities[0X
  
  The  following  printing  utilities  turned out to be useful for interactive
  work  with  texts  in [5XGAP[0m. But they are more general and so we document them
  here.
  
  [1X6.3-1 PrintTo1[0m
  
  [2X> PrintTo1( [0X[3Xfilename, fun[0X[2X ) ________________________________________[0Xfunction
  [2X> AppendTo1( [0X[3Xfilename, fun[0X[2X ) _______________________________________[0Xfunction
  
  The  argument  [3Xfun[0m must be a function without arguments. Everything which is
  printed  by  a call [3Xfun()[0m is printed into the file [3Xfilename[0m. As with [2XPrintTo[0m
  ([14XReference:  PrintTo[0m)  and [2XAppendTo[0m ([14XReference: AppendTo[0m) this overwrites or
  appends to, respectively, a previous content of [3Xfilename[0m.
  
  These functions can be particularly efficient when many small pieces of text
  are to be written to a file, because the file does not have to be reopened
  for each piece.
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> f := function() local i; [0X
    [4X>   for i in [1..100000] do Print(i, "\n"); od; end;; [0X
    [4Xgap> PrintTo1("nonsense", f); # now check the local file `nonsense'[0X
  [4X------------------------------------------------------------------[0X
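
  The difference between overwriting and appending can be sketched as follows,
  reading the file back with [2XStringFile[0m ([14X6.3-5[0m). The file name and the printed
  text are only illustrative.

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> PrintTo1("nonsense", function() Print("first line\n"); end);[0X
    [4Xgap> AppendTo1("nonsense", function() Print("second line\n"); end);[0X
    [4Xgap> StringFile("nonsense");[0X
    [4X"first line\nsecond line\n"[0X
  [4X------------------------------------------------------------------[0X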
  
  [1X6.3-2 StringPrint[0m
  
  [2X> StringPrint( [0X[3Xobj1[, obj2[, ...]][0X[2X ) _______________________________[0Xfunction
  [2X> StringView( [0X[3Xobj[0X[2X ) ________________________________________________[0Xfunction
  
  These  functions return a string containing the output of a [10XPrint[0m or [10XViewObj[0m
  call with the same arguments.
  
  This should be considered as a (temporary?) hack. It would be better to have
  [2XString[0m ([14XReference: String[0m) methods for all [5XGAP[0m objects and to have a generic
  [2XPrint[0m ([14XReference: Print[0m)-function which just interprets these strings.
  
  [1X6.3-3 PrintFormattedString[0m
  
  [2X> PrintFormattedString( [0X[3Xstr[0X[2X ) ______________________________________[0Xfunction
  
  This function prints a string [3Xstr[0m. The difference from [10XPrint(str);[0m is that no
  additional  line breaks are introduced by [5XGAP[0m's standard printing mechanism.
  This  can  be  used  to print lines which are longer than the current screen
  width. In particular one can print text which contains escape sequences like
  those  explained  in  [2XTextAttr[0m ([14X6.1-2[0m), where lines may have more characters
  than [13Xvisible characters[0m.
  
  [1X6.3-4 Page[0m
  
  [2X> Page( [0X[3X...[0X[2X ) ______________________________________________________[0Xfunction
  [2X> PageDisplay( [0X[3Xobj[0X[2X ) _______________________________________________[0Xfunction
  
  These  functions  are  similar  to  [2XPrint[0m  ([14XReference:  Print[0m)  and  [2XDisplay[0m
  ([14XReference: Display[0m), respectively. The difference is that the output is not
  sent  directly to the screen, but is piped into the current pager; see [2XPAGER[0m
  ([14XReference: Pager[0m).
  
  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> Page([1..1421]+0);[0X
    [4Xgap> PageDisplay(CharacterTable("Symmetric", 14));[0X
  [4X------------------------------------------------------------------[0X
  
  [1X6.3-5 StringFile[0m
  
  [2X> StringFile( [0X[3Xfilename[0X[2X ) ___________________________________________[0Xfunction
  [2X> FileString( [0X[3Xfilename, str[, append][0X[2X ) ____________________________[0Xfunction
  
  The  function  [2XStringFile[0m  returns the content of file [3Xfilename[0m as a string.
  This  works  efficiently with arbitrary (binary or text) files. If something
  goes wrong, this function returns [9Xfail[0m.
  
  Conversely  the  function [2XFileString[0m writes the content of a string [3Xstr[0m into
  the file [3Xfilename[0m. If the optional third argument [3Xappend[0m is given and equals
  [9Xtrue[0m  then  the  content  of [3Xstr[0m is appended to the file. Otherwise previous
  content  of  the  file is deleted. This function returns the number of bytes
  written or [9Xfail[0m if something goes wrong.
  
  Both functions are quite efficient, even with large files.
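
  A minimal round trip, sketched with an illustrative file name, shows the
  return values and the effect of the argument [3Xappend[0m:

  [4X---------------------------  Example  ----------------------------[0X
    [4Xgap> FileString("nonsense", "abc\n");        # overwrite, 4 bytes written[0X
    [4X4[0X
    [4Xgap> FileString("nonsense", "def\n", true);  # append[0X
    [4X4[0X
    [4Xgap> StringFile("nonsense");[0X
    [4X"abc\ndef\n"[0X
  [4X------------------------------------------------------------------[0X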
  
