text_encoding

Encoding related utilities.

CEscape(text, as_utf8)

Escape a byte string for use in a text protocol buffer.

Parameters:

text (required): A byte string to be escaped.

as_utf8 (required): Specifies if result may contain non-ASCII characters. In Python 3 this allows unescaped non-ASCII Unicode characters. In Python 2 the return value will be valid UTF-8 rather than only ASCII.

Returns: Escaped string (str).
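
As a rough illustration of what as_utf8 means for str input, the sketch below escapes only a handful of special characters with str.translate and leaves non-ASCII characters as they are. The map contents and the names SYMBOL_MAP and escape_keep_unicode are assumptions made for this example; the library's own table may differ.

```python
# Hypothetical escape map keyed by code point. The real module-level table
# used by CEscape may contain more entries than this.
SYMBOL_MAP = {
    9: r'\t',    # tab
    10: r'\n',   # newline
    13: r'\r',   # carriage return
    34: r'\"',   # double quote
    39: r"\'",   # single quote
    92: r'\\',   # backslash
}


def escape_keep_unicode(text):
    """Escape only the mapped characters; other code points pass through."""
    return text.translate(SYMBOL_MAP)


# The accented character survives; the newline becomes a backslash and an 'n'.
print(escape_keep_unicode('caf\u00e9\n'))
```

With as_utf8 set to False, the same input would instead have its non-ASCII bytes written as escapes, so the result stays pure ASCII.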

Source code in client/ayon_hiero/vendor/google/protobuf/text_encoding.py
def CEscape(text, as_utf8):
  # type: (...) -> str
  """Escape a byte string for use in a text protocol buffer.

  Args:
    text: A byte string to be escaped.
    as_utf8: Specifies if result may contain non-ASCII characters.
        In Python 3 this allows unescaped non-ASCII Unicode characters.
        In Python 2 the return value will be valid UTF-8 rather than only ASCII.
  Returns:
    Escaped string (str).
  """
  # Python's text.encode() 'string_escape' or 'unicode_escape' codecs do not
  # satisfy our needs; they encode unprintable characters using two-digit hex
  # escapes whereas our C++ unescaping function allows hex escapes to be any
  # length.  So, "\0011".encode('string_escape') ends up being "\\x011", which
  # will be decoded in C++ as a single-character string with char code 0x11.
  text_is_unicode = isinstance(text, str)
  if as_utf8 and text_is_unicode:
    # We're already unicode, no processing beyond control char escapes.
    return text.translate(_cescape_chr_to_symbol_map)
  ord_ = ord if text_is_unicode else lambda x: x  # bytes iterate as ints.
  if as_utf8:
    return ''.join(_cescape_unicode_to_str[ord_(c)] for c in text)
  return ''.join(_cescape_byte_to_str[ord_(c)] for c in text)
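
The lookup tables used above are defined elsewhere in the module and are not shown here. The following is a minimal sketch of how a byte-to-escape table could be built, assuming non-printable bytes are written as fixed-width three-digit octal escapes. The names build_byte_table, BYTE_TO_STR and escape_bytes_sketch are hypothetical. Fixed-width octal sidesteps the ambiguity described in the comment: an escaped byte 0x01 followed by the digit "1" cannot be misread as one longer escape.

```python
def build_byte_table():
    """Build a 256-entry list mapping each byte value to its escaped form.

    Assumption for this sketch: printable ASCII is kept as-is, a few common
    characters get symbolic escapes, and everything else becomes \\ooo octal.
    """
    table = [r'\%03o' % i for i in range(256)]
    for i in range(32, 127):
        table[i] = chr(i)
    table[ord('\t')] = r'\t'
    table[ord('\n')] = r'\n'
    table[ord('\r')] = r'\r'
    table[ord('"')] = r'\"'
    table[ord("'")] = r"\'"
    table[ord('\\')] = r'\\'
    return table


BYTE_TO_STR = build_byte_table()


def escape_bytes_sketch(data):
    """Escape a bytes object; iterating over bytes yields ints in Python 3."""
    return ''.join(BYTE_TO_STR[b] for b in data)


print(escape_bytes_sketch(b'\x011'))               # prints \0011
print(escape_bytes_sketch('\u00e9'.encode('utf-8')))  # prints \303\251
```

Because every octal escape here is exactly three digits long, a decoder can tell that `\0011` is the byte 0x01 followed by the character "1".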

CUnescape(text)

Unescape a text string with C-style escape sequences to UTF-8 bytes.

Parameters:

Name Type Description Default
text

The data to parse in a str.

required

Returns: A byte string.

Source code in client/ayon_hiero/vendor/google/protobuf/text_encoding.py
def CUnescape(text):
  # type: (str) -> bytes
  """Unescape a text string with C-style escape sequences to UTF-8 bytes.

  Args:
    text: The data to parse in a str.
  Returns:
    A byte string.
  """

  def ReplaceHex(m):
    # Only replace the match if the number of leading backslashes is odd,
    # i.e. the slash itself is not escaped.
    if len(m.group(1)) & 1:
      return m.group(1) + 'x0' + m.group(2)
    return m.group(0)

  # This is required because the 'unicode_escape' codec used below doesn't
  # allow single-digit hex escapes (like '\xf').
  result = _CUNESCAPE_HEX.sub(ReplaceHex, text)

  return (result.encode('utf-8')  # Make it bytes to allow decode.
          .decode('unicode_escape')
          # Make it bytes again to return the proper type.
          .encode('raw_unicode_escape'))
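
The pattern _CUNESCAPE_HEX is defined elsewhere in the module and is not shown here. The sketch below assumes it matches a run of backslashes, an "x", and exactly one hex digit, and shows how zero-padding that digit lets the standard unicode_escape codec decode the result. HEX_SINGLE_DIGIT, pad_hex and unescape_sketch are hypothetical names used only for this example.

```python
import re

# Assumed pattern: one or more backslashes, 'x', then a single hex digit
# that is not followed by another hex digit.
HEX_SINGLE_DIGIT = re.compile(r'(\\+)x([0-9a-fA-F])(?![0-9a-fA-F])')


def pad_hex(match):
    """Pad a one-digit hex escape to two digits when the 'x' is really escaped."""
    # An odd number of backslashes means the last one introduces the escape.
    if len(match.group(1)) & 1:
        return match.group(1) + 'x0' + match.group(2)
    # An even number means the backslashes escape each other; leave it alone.
    return match.group(0)


def unescape_sketch(text):
    """Turn a str with C-style escapes into bytes, mirroring the steps above."""
    padded = HEX_SINGLE_DIGIT.sub(pad_hex, text)
    return (padded.encode('utf-8')
            .decode('unicode_escape')
            .encode('raw_unicode_escape'))


print(unescape_sketch(r'\xf'))    # prints b'\x0f'
print(unescape_sketch(r'\\xf'))   # prints b'\\xf'
```

In the second call the two backslashes form an escaped backslash, so the "xf" that follows is ordinary text and is left unpadded.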