add tests with UTF8/UTF16 non-ASCII text

PiperOrigin-RevId: 545424981
Evgenii Kliuchnikov 2023-07-04 13:01:14 +00:00 committed by Evgenii Kliuchnikov
parent 6ee96e291d
commit bc32ae12d5
6 changed files with 18 additions and 1 deletions


@ -34,9 +34,12 @@ jobs:
uses: github/codeql-action/init@v2
with:
languages: ${{ matrix.language }}
# CodeQL is currently crashing on files with large lists:
# https://github.com/github/codeql/issues/13656
config: |
paths-ignore:
paths-ignore:
- research
- js/test_data.*
- if: matrix.language == 'cpp'
name: Build CPP


@ -45,6 +45,8 @@ TESTDATA_FILES = [
'random_org_10k.bin', # Small data
'mapsdatazrh', # Large data
'ukkonooa', # Poem
'cp1251-utf16le', # Codepage 1251 table saved in UTF16-LE encoding
'cp852-utf8', # Codepage 852 table saved in UTF8 encoding
]
# Some files might be missing in a lightweight sources pack.
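The trailing comment suggests that the entries of `TESTDATA_FILES` are checked against what is actually on disk before tests run. A minimal sketch of that idea follows; the helper name `existing_testdata_paths` and its `testdata_dir` argument are hypothetical and do not appear in this diff.

```python
import os


def existing_testdata_paths(testdata_dir, names):
    """Return full paths for the entries of `names` found in `testdata_dir`.

    Hypothetical helper for illustration only; not taken from the commit.
    """
    candidates = [os.path.join(testdata_dir, name) for name in names]
    # Skip files that a lightweight sources pack may have left out.
    return [path for path in candidates if os.path.exists(path)]
```

With a filter of this kind, adding `'cp1251-utf16le'` and `'cp852-utf8'` to the list would be harmless even when the new binary files are absent from a trimmed checkout.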

BIN
tests/testdata/cp1251-utf16le vendored Normal file

Binary file not shown.

BIN
tests/testdata/cp1251-utf16le.compressed vendored Normal file

Binary file not shown.

12
tests/testdata/cp852-utf8 vendored Normal file

@ -0,0 +1,12 @@
The following table shows code page 852. Each character is shown with its equivalent Unicode code point. Only the second half of the table (128–255) is shown, the first half (0–127) being the same as code page 437.
Code page 852
0 1 2 3 4 5 6 7 8 9 A B C D E F
8x Ç ü é â ä ů ć ç ł ë Ő ő î Ź Ä Ć
9x É Ĺ ĺ ô ö Ľ ľ Ś ś Ö Ü Ť ť Ł × č
Ax á í ó ú Ą ą Ž ž Ę ę ¬ ź Č ş « »
Bx ░ ▒ ▓ │ ┤ Á Â Ě Ş ╣ ║ ╗ ╝ Ż ż ┐
Cx └ ┴ ┬ ├ ─ ┼ Ă ă ╚ ╔ ╩ ╦ ╠ ═ ╬ ¤
Dx đ Đ Ď Ë ď Ň Í Î ě ┘ ┌ █ ▄ Ţ Ů ▀
Ex Ó ß Ô Ń ń ň Š š Ŕ Ú ŕ Ű ý Ý ţ ´
Fx SHY ˝ ˛ ˇ ˘ § ÷ ¸ ° ¨ ˙ ű Ř ř ■ NBSP
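
The table above can be reproduced with Python's built-in `cp852` codec, which also shows why such text is useful test input: every character in the upper half lies outside ASCII, so it takes more than one byte in UTF-8 and exactly two bytes in UTF-16LE. This is a sketch for illustration, not the script used to produce the test file; the names `upper_half` and `row` are assumptions.

```python
# Decode bytes 0x80..0xFF with the standard-library cp852 codec.
upper_half = bytes(range(0x80, 0x100)).decode("cp852")

# Re-encode the same 128 characters in the two encodings named in the commit.
utf8 = upper_half.encode("utf-8")
utf16le = upper_half.encode("utf-16-le")


def row(high_nibble):
    """Return the 16 characters of table row `high_nibble` (0x8..0xF)."""
    start = (high_nibble - 0x8) * 16
    return upper_half[start:start + 16]


if __name__ == "__main__":
    # Print the rows in the same "8x ... Fx" layout as the table.
    for nibble in range(0x8, 0x10):
        print(f"{nibble:X}x", " ".join(row(nibble)))
    print(len(upper_half), "chars,", len(utf8), "UTF-8 bytes,",
          len(utf16le), "UTF-16LE bytes")
```

Note that the printed `Fx` row contains the raw soft hyphen and no-break space characters, where the table spells them out as `SHY` and `NBSP`.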

BIN
tests/testdata/cp852-utf8.compressed vendored Normal file

Binary file not shown.