ICU-20593 Trace instrumentation for data loading.

- Adds hooks to utrace.h to record when ICU reads from locale data.
- Adds userguide page to document the new hooks.
Shane Carr 2019-06-07 21:05:14 +00:00 committed by Shane F. Carr
parent d1688fd8f1
commit 2b611dbf6e
22 changed files with 751 additions and 82 deletions

@ -51,7 +51,7 @@ matrix:
before_script:
- mkdir build
- cd build
- ../icu4c/source/runConfigureICU --enable-debug --disable-release Linux --prefix="${PREFIX}"
- ../icu4c/source/runConfigureICU --enable-debug --disable-release Linux --prefix="${PREFIX}" --enable-tracing
- make -j2
script:
- make -j2 check
@ -105,7 +105,7 @@ matrix:
- clang-5.0
before_script:
- cd icu4c/source
- ./runConfigureICU --enable-debug --disable-release Linux --disable-renaming
- ./runConfigureICU --enable-debug --disable-release Linux --disable-renaming --enable-tracing
- make -j2
script:
- make -j2 check
@ -131,11 +131,11 @@ matrix:
packages:
- clang-5.0
script:
- cd icu4c/source &&
./runConfigureICU --enable-debug --disable-release Linux --disable-renaming &&
make -j2 &&
make -j2 -C test &&
make -j2 -C test/intltest check
- cd icu4c/source
- ./runConfigureICU --enable-debug --disable-release Linux --disable-renaming
- make -j2
- make -j2 -C test
- make -j2 -C test/intltest check
# copyright scan / future linter
- name: "lint"

@ -518,6 +518,10 @@ following command line run from *icu4c/source*.
**Install jsonschema:** Install the `jsonschema` pip package to get warnings
about problems with your filter file.
**See what data is being used:** ICU is instrumented to allow you to trace
which resources are used at runtime. This can help you determine what data you
need to include. For more information, see [tracing.md](tracing.md).
**Inspect data/rules.mk:** The Python script outputs the file *rules.mk*
inside *icu4c/source/data*. To see what is going to get built, you can inspect
that file. First build ICU normally, and copy *rules.mk* to

@ -0,0 +1,132 @@
<!--
© 2019 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->
Resource and Data Tracing
=========================
When building an [ICU data filter specification](buildtool.md), it is useful to
see what resources are being used by your application so that you can select
those resources and discard the others. This guide describes how to use
*utrace.h* to inspect resource access in real time in ICU4C.
**Note:** This feature is only available in ICU4C at this time. If you are
interested in ICU4J, please see
[ICU-20656](https://unicode-org.atlassian.net/browse/ICU-20656).
## Quick Start
First, you *must* have a copy of ICU4C configured with tracing enabled.
    $ ./runConfigureICU Linux --enable-tracing

The following program prints resource and data usage to standard output:
```cpp
#include "unicode/brkiter.h"
#include "unicode/errorcode.h"
#include "unicode/localpointer.h"
#include "unicode/utrace.h"

#include <cstdarg>
#include <iostream>

static void U_CALLCONV traceData(
        const void *context,
        int32_t fnNumber,
        int32_t level,
        const char *fmt,
        va_list args) {
    char buf[1000];
    const char *fnName;

    // Look up the name of the traced function and format its arguments.
    fnName = utrace_functionName(fnNumber);
    utrace_vformat(buf, sizeof(buf), 0, fmt, args);
    std::cout << fnName << " " << buf << std::endl;
}

int main() {
    icu::ErrorCode status;
    const void* context = nullptr;
    utrace_setFunctions(context, nullptr, nullptr, traceData);
    utrace_setLevel(UTRACE_VERBOSE);

    // Create a new BreakIterator
    icu::LocalPointer<icu::BreakIterator> brkitr(
        icu::BreakIterator::createWordInstance("zh-CN", status));
}
```
The following output is produced from this program:
    FileTracer::traceOpenResFile icudt64l-brkitr/zh_CN.res
    FileTracer::traceOpenResFile icudt64l-brkitr/zh.res
    FileTracer::traceOpenResFile icudt64l-brkitr/root.res
    ResourceTracer::trace (string) icudt64l-brkitr/root.res @ /boundaries/word
    FileTracer::traceOpenDataFile icudt64l-brkitr/word.brk
What this means:
1. The BreakIterator constructor opened three resource files in the locale
fallback chain for zh_CN.
2. One string was read from the resource bundle: the one at the resource path
"/boundaries/word" in brkitr/root.res.
3. In addition, the binary data file brkitr/word.brk was opened.
Based on that information, you can make a more informed decision when writing
resource filter rules for this simple program.
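
As a rough illustration of that decision step, the sketch below groups the resource paths found in trace output by the file they were read from. It is not part of ICU: `collectResourcePaths` and `ResourceUsage` are hypothetical names, and the parser assumes the exact line layout printed by the sample program above (`ResourceTracer::trace (<type>) <file> @ <path>`), with no spaces inside the file name or resource path.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper types, not part of ICU.
// Maps each resource file to the set of resource paths read from it.
using ResourceUsage = std::map<std::string, std::set<std::string>>;

// Parses text captured from the sample traceData callback above.
// Only "ResourceTracer::trace" lines are kept; FileTracer lines are skipped.
inline ResourceUsage collectResourcePaths(const std::string& traceOutput) {
    ResourceUsage usage;
    std::istringstream lines(traceOutput);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string fnName, type, file, at, resPath;
        // FileTracer lines have only two fields, so this extraction fails
        // for them and they are ignored.
        if (!(fields >> fnName >> type >> file >> at >> resPath)) {
            continue;
        }
        if (fnName != "ResourceTracer::trace" || at != "@") {
            continue;
        }
        usage[file].insert(resPath);
    }
    return usage;
}
```

Because the result is a set per file, repeated reads of the same resource collapse into one entry, which is usually what you want when drafting filter rules.
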
## Data Tracing API
The `traceData` function shown above takes five arguments. The following two
are most important for data tracing:
- `fnNumber` indicates what type of data access this is.
- `args` contains the details on which resources were accessed.
**Important:** When reading from `args`, the strings are valid only within the
scope of your `traceData` function. You should make copies of the strings if
you intend to save them for further processing.
### UTRACE_UDATA_RESOURCE
UTRACE_UDATA_RESOURCE is used to indicate that a value inside of a resource
bundle was read by ICU code.
When `fnNumber` is `UTRACE_UDATA_RESOURCE`, there are three C-style strings in
`args`:
1. Data type; not relevant for the purpose of resource filtering.
2. The internal path of the resource file from which the value was read.
3. The path to the value within that resource file.
To read each of these into different variables, you can write:
```cpp
const char* dataType = va_arg(args, const char*);
const char* filePath = va_arg(args, const char*);
const char* resPath = va_arg(args, const char*);
```
As stated above, you should copy the strings if you intend to save them. The
pointers will not be valid after the tracing function returns.
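
One possible way to make those copies is sketched below. It is an illustration only: `ResourceAccess`, `copyResourceAccess`, and `simulateResourceTrace` are hypothetical names, not ICU APIs, and the variadic driver merely stands in for ICU invoking your callback so that the sketch compiles without ICU headers.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical record type (not part of ICU) that owns copies of the strings.
struct ResourceAccess {
    std::string dataType;
    std::string filePath;
    std::string resPath;
};

// Converts a possibly-null C string into an owned std::string.
inline std::string copyOrEmpty(const char* s) {
    return s != nullptr ? std::string(s) : std::string();
}

// Reads the three C strings in the documented order and copies them.
// Intended to be called only when fnNumber is UTRACE_UDATA_RESOURCE.
// After this returns, the caller must not read from `args` again.
inline ResourceAccess copyResourceAccess(va_list args) {
    const char* dataType = va_arg(args, const char*);
    const char* filePath = va_arg(args, const char*);
    const char* resPath = va_arg(args, const char*);
    return ResourceAccess{
        copyOrEmpty(dataType), copyOrEmpty(filePath), copyOrEmpty(resPath)};
}

// Hypothetical stand-in for ICU calling the trace callback.
inline ResourceAccess simulateResourceTrace(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    ResourceAccess access = copyResourceAccess(args);
    va_end(args);
    return access;
}
```

If the same callback also formats the arguments (for example with `utrace_vformat`), take a `va_copy` first, since a `va_list` should not be consumed twice.
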
### UTRACE_UDATA_DATA_FILE
UTRACE_UDATA_DATA_FILE is used to indicate that a non-resource-bundle binary
data file was opened by ICU code. Such files are used for break iteration,
conversion, confusables, and a handful of other ICU services.
When `fnNumber` is `UTRACE_UDATA_DATA_FILE`, there is just one C string in
`args`: the internal path of the data file that was opened.
### UTRACE_UDATA_RES_FILE
UTRACE_UDATA_RES_FILE is used to indicate that a binary resource bundle file
was opened by ICU code. This can be helpful to debug locale fallbacks. For the
purposes of making your ICU data filter, the specific resource paths provided
by UTRACE_UDATA_RESOURCE are more precise and useful.
When `fnNumber` is `UTRACE_UDATA_RES_FILE`, there is just one C string in
`args`: the internal path of the resource file that was opened.
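
Putting the three cases together, a callback could dispatch on `fnNumber` roughly as sketched below. This is an assumption-laden illustration rather than ICU code: `TraceLog`, `recordTrace`, and `simulateCallback` are hypothetical, and the `k...` constants are stand-ins that mirror the values added to *utrace.h* in this change so the sketch compiles without ICU. Real code should include `unicode/utrace.h` and use the `UTRACE_UDATA_*` names directly.

```cpp
#include <cstdarg>
#include <cstdint>
#include <set>
#include <string>

// Stand-ins for the ICU enum constants (assumed values, see lead-in).
enum : int32_t {
    kUdataResource = 0x3000,  // UTRACE_UDATA_RESOURCE
    kUdataDataFile = 0x3001,  // UTRACE_UDATA_DATA_FILE
    kUdataResFile  = 0x3002   // UTRACE_UDATA_RES_FILE
};

// Hypothetical accumulator; stores copies, so entries outlive the callback.
struct TraceLog {
    std::set<std::string> resources;  // "<file> @ <resource path>"
    std::set<std::string> dataFiles;
    std::set<std::string> resFiles;
};

// Reads the arguments documented for each function number.
inline void recordTrace(TraceLog& seen, int32_t fnNumber, va_list args) {
    switch (fnNumber) {
    case kUdataResource: {
        (void)va_arg(args, const char*);  // data type, not needed for filtering
        std::string file = va_arg(args, const char*);
        std::string path = va_arg(args, const char*);
        seen.resources.insert(file + " @ " + path);
        break;
    }
    case kUdataDataFile: {
        std::string file = va_arg(args, const char*);
        seen.dataFiles.insert(file);
        break;
    }
    case kUdataResFile: {
        std::string file = va_arg(args, const char*);
        seen.resFiles.insert(file);
        break;
    }
    default:
        break;  // other trace locations are ignored in this sketch
    }
}

// Hypothetical stand-in for ICU invoking the trace callback.
inline void simulateCallback(TraceLog& seen, int32_t fnNumber, const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    recordTrace(seen, fnNumber, args);
    va_end(args);
}
```

In a real callback, a pointer to the accumulator could be supplied through the `context` argument of `utrace_setFunctions`.
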

@ -115,7 +115,8 @@ ulist.o uloc_tag.o icudataver.o icuplug.o \
sharedobject.o simpleformatter.o unifiedcache.o uloc_keytype.o \
ubiditransform.o \
pluralmap.o \
static_unicode_sets.o
static_unicode_sets.o \
restrace.o
## Header files to install
HEADERS = $(srcdir)/unicode/*.h

@ -339,6 +339,7 @@
<ClCompile Include="utext.cpp" />
<ClCompile Include="utf_impl.cpp" />
<ClCompile Include="static_unicode_sets.cpp" />
<ClCompile Include="restrace.cpp" />
<ClInclude Include="localsvc.h" />
<ClInclude Include="msvcres.h" />
<ClInclude Include="pluralmap.h" />
@ -448,6 +449,7 @@
<ClInclude Include="static_unicode_sets.h" />
<ClInclude Include="capi_helper.h" />
<ClInclude Include="unicode\localebuilder.h" />
<ClInclude Include="restrace.h" />
</ItemGroup>
<ItemGroup>
<ResourceCompile Include="common.rc" />

@ -619,6 +619,9 @@
<ClCompile Include="static_unicode_sets.cpp">
<Filter>formatting</Filter>
</ClCompile>
<ClCompile Include="restrace.cpp">
<Filter>data &amp; memory</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<ClInclude Include="ubidi_props.h">
@ -954,9 +957,12 @@
<ClInclude Include="static_unicode_sets.h">
<Filter>formatting</Filter>
</ClInclude>
<CustomBuild Include="capi_helper.h">
<ClInclude Include="capi_helper.h">
<Filter>data &amp; memory</Filter>
</CustomBuild>
</ClInclude>
<ClInclude Include="restrace.h">
<Filter>data &amp; memory</Filter>
</ClInclude>
</ItemGroup>
<ItemGroup>
<ResourceCompile Include="common.rc">

@ -465,6 +465,7 @@
<ClCompile Include="utext.cpp" />
<ClCompile Include="utf_impl.cpp" />
<ClCompile Include="static_unicode_sets.cpp" />
<ClCompile Include="restrace.cpp" />
</ItemGroup>
<ItemGroup>
<ClInclude Include="localsvc.h" />
@ -575,6 +576,7 @@
<ClInclude Include="static_unicode_sets.h" />
<ClInclude Include="capi_helper.h" />
<ClInclude Include="unicode\localebuilder.h" />
<ClInclude Include="restrace.h" />
</ItemGroup>
<ItemGroup>
<ResourceCompile Include="common.rc" />

@ -28,6 +28,7 @@
#include "unicode/utypes.h"
#include "unicode/unistr.h"
#include "unicode/ures.h"
#include "restrace.h"
struct ResourceData;
@ -47,8 +48,10 @@ public:
ResourceArray() : items16(NULL), items32(NULL), length(0) {}
/** Only for implementation use. @internal */
ResourceArray(const uint16_t *i16, const uint32_t *i32, int32_t len) :
items16(i16), items32(i32), length(len) {}
ResourceArray(const uint16_t *i16, const uint32_t *i32, int32_t len,
const ResourceTracer& traceInfo) :
items16(i16), items32(i32), length(len),
fTraceInfo(traceInfo) {}
/**
* @return The number of items in the array resource.
@ -68,6 +71,7 @@ private:
const uint16_t *items16;
const uint32_t *items32;
int32_t length;
ResourceTracer fTraceInfo;
};
/**
@ -80,8 +84,10 @@ public:
/** Only for implementation use. @internal */
ResourceTable(const uint16_t *k16, const int32_t *k32,
const uint16_t *i16, const uint32_t *i32, int32_t len) :
keys16(k16), keys32(k32), items16(i16), items32(i32), length(len) {}
const uint16_t *i16, const uint32_t *i32, int32_t len,
const ResourceTracer& traceInfo) :
keys16(k16), keys32(k32), items16(i16), items32(i32), length(len),
fTraceInfo(traceInfo) {}
/**
* @return The number of items in the array resource.
@ -101,6 +107,7 @@ private:
const uint16_t *items16;
const uint32_t *items32;
int32_t length;
ResourceTracer fTraceInfo;
};
/**

@ -0,0 +1,111 @@
// © 2019 and later: Unicode, Inc. and others.
// License & terms of use: http://www.unicode.org/copyright.html
#include "unicode/utypes.h"
#if U_ENABLE_TRACING
#include "restrace.h"
#include "charstr.h"
#include "cstring.h"
#include "utracimp.h"
#include "uresimp.h"
#include "uassert.h"
#include "util.h"
U_NAMESPACE_BEGIN
ResourceTracer::~ResourceTracer() = default;
void ResourceTracer::trace(const char* resType) const {
U_ASSERT(fResB || fParent);
UTRACE_ENTRY(UTRACE_UDATA_RESOURCE);
UErrorCode status = U_ZERO_ERROR;
icu::CharString filePath;
getFilePath(filePath, status);
icu::CharString resPath;
getResPath(resPath, status);
UTRACE_DATA3(UTRACE_VERBOSE, "(%s) %s @ %s",
resType,
filePath.data(),
resPath.data());
UTRACE_EXIT_STATUS(status);
}
void ResourceTracer::getFilePath(CharString& output, UErrorCode& status) const {
if (fResB) {
output.append(fResB->fData->fPath, status);
output.append('/', status);
output.append(fResB->fData->fName, status);
output.append(".res", status);
} else {
fParent->getFilePath(output, status);
}
}
void ResourceTracer::getResPath(CharString& output, UErrorCode& status) const {
if (fResB) {
output.append('/', status);
output.append(fResB->fResPath, status);
// removing the trailing /
U_ASSERT(output[output.length()-1] == '/');
output.truncate(output.length()-1);
} else {
fParent->getResPath(output, status);
}
if (fKey) {
output.append('/', status);
output.append(fKey, status);
}
if (fIndex != -1) {
output.append('[', status);
UnicodeString indexString;
ICU_Utility::appendNumber(indexString, fIndex);
output.appendInvariantChars(indexString, status);
output.append(']', status);
}
}
void FileTracer::traceOpen(const char* path, const char* type, const char* name) {
if (uprv_strcmp(type, "res") == 0) {
traceOpenResFile(path, name);
} else {
traceOpenDataFile(path, type, name);
}
}
void FileTracer::traceOpenDataFile(const char* path, const char* type, const char* name) {
UTRACE_ENTRY(UTRACE_UDATA_DATA_FILE);
UErrorCode status = U_ZERO_ERROR;
icu::CharString filePath;
filePath.append(path, status);
filePath.append('/', status);
filePath.append(name, status);
filePath.append('.', status);
filePath.append(type, status);
UTRACE_DATA1(UTRACE_VERBOSE, "%s", filePath.data());
UTRACE_EXIT_STATUS(status);
}
void FileTracer::traceOpenResFile(const char* path, const char* name) {
UTRACE_ENTRY(UTRACE_UDATA_RES_FILE);
UErrorCode status = U_ZERO_ERROR;
icu::CharString filePath;
filePath.append(path, status);
filePath.append('/', status);
filePath.append(name, status);
filePath.append(".res", status);
UTRACE_DATA1(UTRACE_VERBOSE, "%s", filePath.data());
UTRACE_EXIT_STATUS(status);
}
U_NAMESPACE_END
#endif // U_ENABLE_TRACING

@ -0,0 +1,132 @@
// © 2019 and later: Unicode, Inc. and others.
// License & terms of use: http://www.unicode.org/copyright.html
#ifndef __RESTRACE_H__
#define __RESTRACE_H__
#include "unicode/utypes.h"
#if U_ENABLE_TRACING
struct UResourceBundle;
U_NAMESPACE_BEGIN
class CharString;
/**
* Instances of this class store information used to trace reads from resource
* bundles when ICU is built with --enable-tracing.
*
* All arguments of type const UResourceBundle*, const char*, and
* const ResourceTracer& are stored as pointers. The caller must retain
* ownership for the lifetime of this ResourceTracer.
*
* Exported as U_COMMON_API for Windows because it is a value field
* in other exported types.
*/
class U_COMMON_API ResourceTracer {
public:
ResourceTracer() :
fResB(nullptr),
fParent(nullptr),
fKey(nullptr),
fIndex(-1) {}
ResourceTracer(const UResourceBundle* resB) :
fResB(resB),
fParent(nullptr),
fKey(nullptr),
fIndex(-1) {}
ResourceTracer(const UResourceBundle* resB, const char* key) :
fResB(resB),
fParent(nullptr),
fKey(key),
fIndex(-1) {}
ResourceTracer(const UResourceBundle* resB, int32_t index) :
fResB(resB),
fParent(nullptr),
fKey(nullptr),
fIndex(index) {}
ResourceTracer(const ResourceTracer& parent, const char* key) :
fResB(nullptr),
fParent(&parent),
fKey(key),
fIndex(-1) {}
ResourceTracer(const ResourceTracer& parent, int32_t index) :
fResB(nullptr),
fParent(&parent),
fKey(nullptr),
fIndex(index) {}
~ResourceTracer();
void trace(const char* type) const;
private:
const UResourceBundle* fResB;
const ResourceTracer* fParent;
const char* fKey;
int32_t fIndex;
void getFilePath(CharString& output, UErrorCode& status) const;
void getResPath(CharString& output, UErrorCode& status) const;
};
/**
* This class provides methods to trace data file reads when ICU is built
* with --enable-tracing.
*/
class FileTracer {
public:
static void traceOpen(const char* path, const char* type, const char* name);
private:
static void traceOpenDataFile(const char* path, const char* type, const char* name);
static void traceOpenResFile(const char* path, const char* name);
};
U_NAMESPACE_END
#else // U_ENABLE_TRACING
U_NAMESPACE_BEGIN
/**
* Default trivial implementation when --enable-tracing is not used.
*/
class U_COMMON_API ResourceTracer {
public:
ResourceTracer() {}
ResourceTracer(const void*) {}
ResourceTracer(const void*, const char*) {}
ResourceTracer(const void*, int32_t) {}
ResourceTracer(const ResourceTracer&, const char*) {}
ResourceTracer(const ResourceTracer&, int32_t) {}
void trace(const char*) const {}
};
/**
* Default trivial implementation when --enable-tracing is not used.
*/
class FileTracer {
public:
static void traceOpen(const char*, const char*, const char*) {}
};
U_NAMESPACE_END
#endif // U_ENABLE_TRACING
#endif //__RESTRACE_H__

@ -33,6 +33,7 @@ might have to #include some other header
#include "cstring.h"
#include "mutex.h"
#include "putilimp.h"
#include "restrace.h"
#include "uassert.h"
#include "ucln_cmn.h"
#include "ucmndata.h"
@ -1168,6 +1169,9 @@ doOpenChoice(const char *path, const char *type, const char *name,
UBool isICUData = FALSE;
FileTracer::traceOpen(path, type, name);
/* Is this path ICU data? */
if(path == NULL ||
!strcmp(path, U_ICUDATA_ALIAS) || /* "ICUDATA" */

@ -66,6 +66,7 @@ typedef enum UTraceFunctionNumber {
UTRACE_FUNCTION_START=0,
UTRACE_U_INIT=UTRACE_FUNCTION_START,
UTRACE_U_CLEANUP,
#ifndef U_HIDE_DEPRECATED_API
/**
* One more than the highest normal collation trace location.
@ -83,6 +84,7 @@ typedef enum UTraceFunctionNumber {
UTRACE_UCNV_FLUSH_CACHE,
UTRACE_UCNV_LOAD,
UTRACE_UCNV_UNLOAD,
#ifndef U_HIDE_DEPRECATED_API
/**
* One more than the highest normal collation trace location.
@ -101,13 +103,55 @@ typedef enum UTraceFunctionNumber {
UTRACE_UCOL_STRCOLLITER,
UTRACE_UCOL_OPEN_FROM_SHORT_STRING,
UTRACE_UCOL_STRCOLLUTF8, /**< @stable ICU 50 */
#ifndef U_HIDE_DEPRECATED_API
/**
* One more than the highest normal collation trace location.
* @deprecated ICU 58 The numeric value may change over time, see ICU ticket #12420.
*/
UTRACE_COLLATION_LIMIT
UTRACE_COLLATION_LIMIT,
#endif // U_HIDE_DEPRECATED_API
#ifndef U_HIDE_DRAFT_API
/**
* The lowest resource/data location.
* @draft ICU 65
*/
UTRACE_RES_DATA_START=0x3000,
/**
* Indicates that a value was read from a resource bundle. Provides three
* C-style strings to UTraceData: type, file name, and resource path. The
* type is "string", "binary", "intvector", "int", or "uint".
* @draft ICU 65
*/
UTRACE_UDATA_RESOURCE=UTRACE_RES_DATA_START,
/**
* Indicates that a data file was opened, but not *.res files. Provides one
* C-style string to UTraceData: file name.
* @draft ICU 65
*/
UTRACE_UDATA_DATA_FILE,
/**
* Indicates that a *.res file was opened. Provides one C-style string to
* UTraceData: file name.
* @draft ICU 65
*/
UTRACE_UDATA_RES_FILE,
#endif // U_HIDE_DRAFT_API
#ifndef U_HIDE_INTERNAL_API
/**
* One more than the highest normal resource/data trace location.
* @internal The numeric value may change over time, see ICU ticket #12420.
*/
UTRACE_RES_DATA_LIMIT,
#endif // U_HIDE_INTERNAL_API
} UTraceFunctionNumber;
/**

@ -392,7 +392,8 @@ static UResourceDataEntry *init_entry(const char *localeID, const char *path, UE
/* We'll try to get alias string from the bundle */
aliasres = res_getResource(&(r->fData), "%%ALIAS");
if (aliasres != RES_BOGUS) {
const UChar *alias = res_getString(&(r->fData), aliasres, &aliasLen);
// No tracing: called during initial data loading
const UChar *alias = res_getStringNoTrace(&(r->fData), aliasres, &aliasLen);
if(alias != NULL && aliasLen > 0) { /* if there is actual alias - unload and load new data */
u_UCharsToChars(alias, aliasName, aliasLen+1);
r->fAlias = init_entry(aliasName, path, status);
@ -533,7 +534,8 @@ loadParentsExceptRoot(UResourceDataEntry *&t1,
Resource parentRes = res_getResource(&t1->fData, "%%Parent");
if (parentRes != RES_BOGUS) { // An explicit parent was found.
int32_t parentLocaleLen = 0;
const UChar *parentLocaleName = res_getString(&(t1->fData), parentRes, &parentLocaleLen);
// No tracing: called during initial data loading
const UChar *parentLocaleName = res_getStringNoTrace(&(t1->fData), parentRes, &parentLocaleLen);
if(parentLocaleName != NULL && 0 < parentLocaleLen && parentLocaleLen < nameCapacity) {
u_UCharsToChars(parentLocaleName, name, parentLocaleLen + 1);
if (uprv_strcmp(name, kRootLocaleName) == 0) {
@ -1291,7 +1293,7 @@ U_CAPI const UChar* U_EXPORT2 ures_getString(const UResourceBundle* resB, int32_
*status = U_ILLEGAL_ARGUMENT_ERROR;
return NULL;
}
s = res_getString(&(resB->fResData), resB->fRes, len);
s = res_getString({resB}, &(resB->fResData), resB->fRes, len);
if (s == NULL) {
*status = U_RESOURCE_TYPE_MISMATCH;
}
@ -1380,7 +1382,7 @@ U_CAPI const uint8_t* U_EXPORT2 ures_getBinary(const UResourceBundle* resB, int3
*status = U_ILLEGAL_ARGUMENT_ERROR;
return NULL;
}
p = res_getBinary(&(resB->fResData), resB->fRes, len);
p = res_getBinary({resB}, &(resB->fResData), resB->fRes, len);
if (p == NULL) {
*status = U_RESOURCE_TYPE_MISMATCH;
}
@ -1397,7 +1399,7 @@ U_CAPI const int32_t* U_EXPORT2 ures_getIntVector(const UResourceBundle* resB, i
*status = U_ILLEGAL_ARGUMENT_ERROR;
return NULL;
}
p = res_getIntVector(&(resB->fResData), resB->fRes, len);
p = res_getIntVector({resB}, &(resB->fResData), resB->fRes, len);
if (p == NULL) {
*status = U_RESOURCE_TYPE_MISMATCH;
}
@ -1418,7 +1420,7 @@ U_CAPI int32_t U_EXPORT2 ures_getInt(const UResourceBundle* resB, UErrorCode *st
*status = U_RESOURCE_TYPE_MISMATCH;
return 0xffffffff;
}
return RES_GET_INT(resB->fRes);
return res_getInt({resB}, resB->fRes);
}
U_CAPI uint32_t U_EXPORT2 ures_getUInt(const UResourceBundle* resB, UErrorCode *status) {
@ -1433,7 +1435,7 @@ U_CAPI uint32_t U_EXPORT2 ures_getUInt(const UResourceBundle* resB, UErrorCode *
*status = U_RESOURCE_TYPE_MISMATCH;
return 0xffffffff;
}
return RES_GET_UINT(resB->fRes);
return res_getUInt({resB}, resB->fRes);
}
U_CAPI UResType U_EXPORT2 ures_getType(const UResourceBundle *resB) {
@ -1444,10 +1446,18 @@ U_CAPI UResType U_EXPORT2 ures_getType(const UResourceBundle *resB) {
}
U_CAPI const char * U_EXPORT2 ures_getKey(const UResourceBundle *resB) {
//
// TODO: Trace ures_getKey? I guess not usually.
//
// We usually get the key string to decide whether we want the value, or to
// make a key-value pair. Tracing the value should suffice.
//
// However, I believe we have some data (e.g., in res_index) where the key
// strings are the data. Tracing the enclosing table should suffice.
//
if(resB == NULL) {
return NULL;
}
return(resB->fKey);
}
@ -1467,7 +1477,7 @@ static const UChar* ures_getStringWithAlias(const UResourceBundle *resB, Resourc
ures_close(tempRes);
return result;
} else {
return res_getString(&(resB->fResData), r, len);
return res_getString({resB, sIndex}, &(resB->fResData), r, len);
}
}
@ -1503,7 +1513,7 @@ U_CAPI const UChar* U_EXPORT2 ures_getNextString(UResourceBundle *resB, int32_t*
switch(RES_GET_TYPE(resB->fRes)) {
case URES_STRING:
case URES_STRING_V2:
return res_getString(&(resB->fResData), resB->fRes, len);
return res_getString({resB}, &(resB->fResData), resB->fRes, len);
case URES_TABLE:
case URES_TABLE16:
case URES_TABLE32:
@ -1648,7 +1658,7 @@ U_CAPI const UChar* U_EXPORT2 ures_getStringByIndex(const UResourceBundle *resB,
switch(RES_GET_TYPE(resB->fRes)) {
case URES_STRING:
case URES_STRING_V2:
return res_getString(&(resB->fResData), resB->fRes, len);
return res_getString({resB}, &(resB->fResData), resB->fRes, len);
case URES_TABLE:
case URES_TABLE16:
case URES_TABLE32:
@ -1943,7 +1953,7 @@ void getAllItemsWithFallback(
value.pResData = &bundle->fResData;
UResourceDataEntry *parentEntry = bundle->fData->fParent;
UBool hasParent = parentEntry != NULL && U_SUCCESS(parentEntry->fBogus);
value.setResource(bundle->fRes);
value.setResource(bundle->fRes, ResourceTracer(bundle));
sink.put(bundle->fKey, value, !hasParent, errorCode);
if (hasParent) {
// We might try to query the sink whether
@ -2095,7 +2105,7 @@ U_CAPI const UChar* U_EXPORT2 ures_getStringByKey(const UResourceBundle *resB, c
switch (RES_GET_TYPE(res)) {
case URES_STRING:
case URES_STRING_V2:
return res_getString(rd, res, len);
return res_getString({resB, key}, rd, res, len);
case URES_ALIAS:
{
const UChar* result = 0;
@ -2117,7 +2127,7 @@ U_CAPI const UChar* U_EXPORT2 ures_getStringByKey(const UResourceBundle *resB, c
switch (RES_GET_TYPE(res)) {
case URES_STRING:
case URES_STRING_V2:
return res_getString(&(resB->fResData), res, len);
return res_getString({resB, key}, &(resB->fResData), res, len);
case URES_ALIAS:
{
const UChar* result = 0;
@ -2138,6 +2148,7 @@ U_CAPI const UChar* U_EXPORT2 ures_getStringByKey(const UResourceBundle *resB, c
/* here should go a first attempt to locate the key using index table */
const ResourceData *rd = getFallbackData(resB, &key, &realData, &res, status);
if(U_SUCCESS(*status)) {
// TODO: Tracing
return res_getString(rd, res, len);
} else {
*status = U_MISSING_RESOURCE_ERROR;

@ -33,6 +33,7 @@
#include "uinvchar.h"
#include "uresdata.h"
#include "uresimp.h"
#include "utracimp.h"
/*
* Resource access helpers
@ -307,7 +308,7 @@ res_getPublicType(Resource res) {
}
U_CAPI const UChar * U_EXPORT2
res_getString(const ResourceData *pResData, Resource res, int32_t *pLength) {
res_getStringNoTrace(const ResourceData *pResData, Resource res, int32_t *pLength) {
const UChar *p;
uint32_t offset=RES_GET_OFFSET(res);
int32_t length;
@ -402,7 +403,8 @@ int32_t getStringArray(const ResourceData *pResData, const icu::ResourceArray &a
}
for(int32_t i = 0; i < length; ++i) {
int32_t sLength;
const UChar *s = res_getString(pResData, array.internalGetResource(pResData, i), &sLength);
// No tracing: handled by the caller
const UChar *s = res_getStringNoTrace(pResData, array.internalGetResource(pResData, i), &sLength);
if(s == NULL) {
errorCode = U_RESOURCE_TYPE_MISMATCH;
return 0;
@ -434,7 +436,7 @@ res_getAlias(const ResourceData *pResData, Resource res, int32_t *pLength) {
}
U_CAPI const uint8_t * U_EXPORT2
res_getBinary(const ResourceData *pResData, Resource res, int32_t *pLength) {
res_getBinaryNoTrace(const ResourceData *pResData, Resource res, int32_t *pLength) {
const uint8_t *p;
uint32_t offset=RES_GET_OFFSET(res);
int32_t length;
@ -454,7 +456,7 @@ res_getBinary(const ResourceData *pResData, Resource res, int32_t *pLength) {
U_CAPI const int32_t * U_EXPORT2
res_getIntVector(const ResourceData *pResData, Resource res, int32_t *pLength) {
res_getIntVectorNoTrace(const ResourceData *pResData, Resource res, int32_t *pLength) {
const int32_t *p;
uint32_t offset=RES_GET_OFFSET(res);
int32_t length;
@ -507,7 +509,7 @@ const UChar *ResourceDataValue::getString(int32_t &length, UErrorCode &errorCode
if(U_FAILURE(errorCode)) {
return NULL;
}
const UChar *s = res_getString(pResData, res, &length);
const UChar *s = res_getString(fTraceInfo, pResData, res, &length);
if(s == NULL) {
errorCode = U_RESOURCE_TYPE_MISMATCH;
}
@ -532,7 +534,7 @@ int32_t ResourceDataValue::getInt(UErrorCode &errorCode) const {
if(RES_GET_TYPE(res) != URES_INT) {
errorCode = U_RESOURCE_TYPE_MISMATCH;
}
return RES_GET_INT(res);
return res_getInt(fTraceInfo, res);
}
uint32_t ResourceDataValue::getUInt(UErrorCode &errorCode) const {
@ -542,14 +544,14 @@ uint32_t ResourceDataValue::getUInt(UErrorCode &errorCode) const {
if(RES_GET_TYPE(res) != URES_INT) {
errorCode = U_RESOURCE_TYPE_MISMATCH;
}
return RES_GET_UINT(res);
return res_getUInt(fTraceInfo, res);
}
const int32_t *ResourceDataValue::getIntVector(int32_t &length, UErrorCode &errorCode) const {
if(U_FAILURE(errorCode)) {
return NULL;
}
const int32_t *iv = res_getIntVector(pResData, res, &length);
const int32_t *iv = res_getIntVector(fTraceInfo, pResData, res, &length);
if(iv == NULL) {
errorCode = U_RESOURCE_TYPE_MISMATCH;
}
@ -560,7 +562,7 @@ const uint8_t *ResourceDataValue::getBinary(int32_t &length, UErrorCode &errorCo
if(U_FAILURE(errorCode)) {
return NULL;
}
const uint8_t *b = res_getBinary(pResData, res, &length);
const uint8_t *b = res_getBinary(fTraceInfo, pResData, res, &length);
if(b == NULL) {
errorCode = U_RESOURCE_TYPE_MISMATCH;
}
@ -590,7 +592,7 @@ ResourceArray ResourceDataValue::getArray(UErrorCode &errorCode) const {
errorCode = U_RESOURCE_TYPE_MISMATCH;
return ResourceArray();
}
return ResourceArray(items16, items32, length);
return ResourceArray(items16, items32, length, fTraceInfo);
}
ResourceTable ResourceDataValue::getTable(UErrorCode &errorCode) const {
@ -627,7 +629,7 @@ ResourceTable ResourceDataValue::getTable(UErrorCode &errorCode) const {
errorCode = U_RESOURCE_TYPE_MISMATCH;
return ResourceTable();
}
return ResourceTable(keys16, keys32, items16, items32, length);
return ResourceTable(keys16, keys32, items16, items32, length, fTraceInfo);
}
UBool ResourceDataValue::isNoInheritanceMarker() const {
@ -656,7 +658,7 @@ int32_t ResourceDataValue::getStringArrayOrStringAsArray(UnicodeString *dest, in
return 1;
}
int32_t sLength;
const UChar *s = res_getString(pResData, res, &sLength);
const UChar *s = res_getString(fTraceInfo, pResData, res, &sLength);
if(s != NULL) {
dest[0].setTo(TRUE, s, sLength);
return 1;
@ -671,7 +673,7 @@ UnicodeString ResourceDataValue::getStringOrFirstOfArray(UErrorCode &errorCode)
return us;
}
int32_t sLength;
const UChar *s = res_getString(pResData, res, &sLength);
const UChar *s = res_getString(fTraceInfo, pResData, res, &sLength);
if(s != NULL) {
us.setTo(TRUE, s, sLength);
return us;
@ -681,7 +683,8 @@ UnicodeString ResourceDataValue::getStringOrFirstOfArray(UErrorCode &errorCode)
return us;
}
if(array.getSize() > 0) {
s = res_getString(pResData, array.internalGetResource(pResData, 0), &sLength);
// Tracing is already performed above (unimportant for trace that this is an array)
s = res_getStringNoTrace(pResData, array.internalGetResource(pResData, 0), &sLength);
if(s != NULL) {
us.setTo(TRUE, s, sLength);
return us;
@ -829,7 +832,11 @@ UBool icu::ResourceTable::getKeyAndValue(int32_t i,
} else {
res = items32[i];
}
rdValue.setResource(res);
// Note: the ResourceTracer keeps a reference to the field of this
// ResourceTable. This is OK because the ResourceTable should remain
// alive for the duration that fields are being read from it
// (including nested fields).
rdValue.setResource(res, ResourceTracer(fTraceInfo, key));
return TRUE;
}
return FALSE;
@ -875,7 +882,13 @@ uint32_t icu::ResourceArray::internalGetResource(const ResourceData *pResData, i
UBool icu::ResourceArray::getValue(int32_t i, icu::ResourceValue &value) const {
if(0 <= i && i < length) {
icu::ResourceDataValue &rdValue = static_cast<icu::ResourceDataValue &>(value);
rdValue.setResource(internalGetResource(rdValue.pResData, i));
// Note: the ResourceTracer keeps a reference to the field of this
// ResourceArray. This is OK because the ResourceArray should remain
// alive for the duration that fields are being read from it
// (including nested fields).
rdValue.setResource(
internalGetResource(rdValue.pResData, i),
ResourceTracer(fTraceInfo, i));
return TRUE;
}
return FALSE;

@ -69,14 +69,16 @@ typedef uint32_t Resource;
#define RES_GET_OFFSET(res) ((res)&0x0fffffff)
#define RES_GET_POINTER(pRoot, res) ((pRoot)+RES_GET_OFFSET(res))
/* get signed and unsigned integer values directly from the Resource handle */
/* get signed and unsigned integer values directly from the Resource handle
* NOTE: For proper logging, please use the inline res_getInt() function
*/
#if U_SIGNED_RIGHT_SHIFT_IS_ARITHMETIC
# define RES_GET_INT(res) (((int32_t)((res)<<4L))>>4L)
# define RES_GET_INT_NO_TRACE(res) (((int32_t)((res)<<4L))>>4L)
#else
# define RES_GET_INT(res) (int32_t)(((res)&0x08000000) ? (res)|0xf0000000 : (res)&0x07ffffff)
# define RES_GET_INT_NO_TRACE(res) (int32_t)(((res)&0x08000000) ? (res)|0xf0000000 : (res)&0x07ffffff)
#endif
#define RES_GET_UINT(res) ((res)&0x0fffffff)
#define RES_GET_UINT_NO_TRACE(res) ((res)&0x0fffffff)
#define URES_IS_ARRAY(type) ((int32_t)(type)==URES_ARRAY || (int32_t)(type)==URES_ARRAY16)
#define URES_IS_TABLE(type) ((int32_t)(type)==URES_TABLE || (int32_t)(type)==URES_TABLE16 || (int32_t)(type)==URES_TABLE32)
@ -423,23 +425,27 @@ res_unload(ResourceData *pResData);
U_INTERNAL UResType U_EXPORT2
res_getPublicType(Resource res);
///////////////////////////////////////////////////////////////////////////
// To enable tracing, use the inline versions of the res_get* functions. //
///////////////////////////////////////////////////////////////////////////
/*
* Return a pointer to a zero-terminated, const UChar* string
* and set its length in *pLength.
* Returns NULL if not found.
*/
U_INTERNAL const UChar * U_EXPORT2
res_getString(const ResourceData *pResData, Resource res, int32_t *pLength);
res_getStringNoTrace(const ResourceData *pResData, Resource res, int32_t *pLength);
U_INTERNAL const uint8_t * U_EXPORT2
res_getBinaryNoTrace(const ResourceData *pResData, Resource res, int32_t *pLength);
U_INTERNAL const int32_t * U_EXPORT2
res_getIntVectorNoTrace(const ResourceData *pResData, Resource res, int32_t *pLength);
U_INTERNAL const UChar * U_EXPORT2
res_getAlias(const ResourceData *pResData, Resource res, int32_t *pLength);
U_INTERNAL const uint8_t * U_EXPORT2
res_getBinary(const ResourceData *pResData, Resource res, int32_t *pLength);
U_INTERNAL const int32_t * U_EXPORT2
res_getIntVector(const ResourceData *pResData, Resource res, int32_t *pLength);
U_INTERNAL Resource U_EXPORT2
res_getResource(const ResourceData *pResData, const char *key);
@ -470,16 +476,54 @@ U_CFUNC Resource res_findResource(const ResourceData *pResData, Resource r,
#ifdef __cplusplus
#include "resource.h"
#include "restrace.h"
U_NAMESPACE_BEGIN
inline const UChar* res_getString(const ResourceTracer& traceInfo,
const ResourceData *pResData, Resource res, int32_t *pLength) {
traceInfo.trace("string");
return res_getStringNoTrace(pResData, res, pLength);
}
inline const uint8_t* res_getBinary(const ResourceTracer& traceInfo,
const ResourceData *pResData, Resource res, int32_t *pLength) {
traceInfo.trace("binary");
return res_getBinaryNoTrace(pResData, res, pLength);
}
inline const int32_t* res_getIntVector(const ResourceTracer& traceInfo,
const ResourceData *pResData, Resource res, int32_t *pLength) {
traceInfo.trace("intvector");
return res_getIntVectorNoTrace(pResData, res, pLength);
}
inline int32_t res_getInt(const ResourceTracer& traceInfo, Resource res) {
traceInfo.trace("int");
return RES_GET_INT_NO_TRACE(res);
}
inline uint32_t res_getUInt(const ResourceTracer& traceInfo, Resource res) {
traceInfo.trace("uint");
return RES_GET_UINT_NO_TRACE(res);
}
class ResourceDataValue : public ResourceValue {
public:
ResourceDataValue() : pResData(NULL), res(static_cast<Resource>(URES_NONE)) {}
ResourceDataValue() :
pResData(NULL),
res(static_cast<Resource>(URES_NONE)),
fTraceInfo() {}
virtual ~ResourceDataValue();
void setData(const ResourceData *data) { pResData = data; }
void setResource(Resource r) { res = r; }
void setData(const ResourceData *data) {
pResData = data;
}
void setResource(Resource r, ResourceTracer&& traceInfo) {
res = r;
fTraceInfo = traceInfo;
}
virtual UResType getType() const;
virtual const UChar *getString(int32_t &length, UErrorCode &errorCode) const;
@ -501,6 +545,7 @@ public:
private:
Resource res;
ResourceTracer fTraceInfo;
};
U_NAMESPACE_END
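
The inline overloads in this header share one pattern: record the kind of access on the tracer, then delegate to the corresponding `NoTrace` function or macro. The following is a minimal standalone sketch of that pattern, not ICU code. `SketchTracer`, `sketchGetIntNoTrace`, and `sketchGetInt` are hypothetical names, and the sign extension reuses the expression from the portable branch of `RES_GET_INT_NO_TRACE`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for ResourceTracer: it only records the access type.
struct SketchTracer {
    mutable std::vector<std::string> log;
    void trace(const char* type) const { log.push_back(type); }
};

// Untraced accessor: sign-extends the low 28 bits of a resource word,
// using the same expression as the portable RES_GET_INT_NO_TRACE branch.
inline int32_t sketchGetIntNoTrace(uint32_t res) {
    return static_cast<int32_t>(
        (res & 0x08000000) ? (res | 0xf0000000) : (res & 0x07ffffff));
}

// Traced accessor: record the access, then delegate to the untraced one.
inline int32_t sketchGetInt(const SketchTracer& tracer, uint32_t res) {
    tracer.trace("int");
    return sketchGetIntNoTrace(res);
}
```

Keeping the untraced variant separate lets callers such as build tools, which have no tracer to pass, read the same data without emitting trace events.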


@ -476,6 +476,15 @@ trCollNames[] = {
NULL
};
static const char* const
trResDataNames[] = {
"ResourceTracer::trace",
"FileTracer::traceOpenDataFile",
"FileTracer::traceOpenResFile",
NULL
};
U_CAPI const char * U_EXPORT2
utrace_functionName(int32_t fnNumber) {
@ -485,6 +494,8 @@ utrace_functionName(int32_t fnNumber) {
return trConvNames[fnNumber - UTRACE_CONVERSION_START];
} else if(UTRACE_COLLATION_START <= fnNumber && fnNumber < UTRACE_COLLATION_LIMIT){
return trCollNames[fnNumber - UTRACE_COLLATION_START];
} else if(UTRACE_RES_DATA_START <= fnNumber && fnNumber < UTRACE_RES_DATA_LIMIT){
return trResDataNames[fnNumber - UTRACE_RES_DATA_START];
} else {
return "[BOGUS Trace Function Number]";
}


@ -646,11 +646,12 @@ group: localebuilder
resourcebundle
group: udata
udata.o ucmndata.o udatamem.o
udata.o ucmndata.o udatamem.o restrace.o
umapfile.o
deps
uhash platform stubdata
file_io mmap_functions
icu_utility
group: unifiedcache
unifiedcache.o


@ -2105,6 +2105,42 @@ UBool IntlTest::assertEquals(const char* message,
}
#endif
std::string vectorToString(const std::vector<std::string>& strings) {
std::string result = "{";
bool first = true;
for (const auto& element : strings) {
if (first) {
first = false;
} else {
result += ", ";
}
result += "\"";
result += element;
result += "\"";
}
result += "}";
return result;
}
UBool IntlTest::assertEquals(const char* message,
const std::vector<std::string>& expected,
const std::vector<std::string>& actual) {
if (expected != actual) {
std::string expectedAsString = vectorToString(expected);
std::string actualAsString = vectorToString(actual);
errln((UnicodeString)"FAIL: " + message +
"; got " + actualAsString.c_str() +
"; expected " + expectedAsString.c_str());
return FALSE;
}
#ifdef VERBOSE_ASSERTIONS
else {
logln((UnicodeString)"Ok: " + message + "; got " + vectorToString(actual).c_str());
}
#endif
return TRUE;
}
static char ASSERT_BUF[256];
static const char* extractToAssertBuf(const UnicodeString& message) {
@ -2169,6 +2205,11 @@ UBool IntlTest::assertEquals(const UnicodeString& message,
const UnicodeSet& actual) {
return assertEquals(extractToAssertBuf(message), expected, actual);
}
UBool IntlTest::assertEquals(const UnicodeString& message,
const std::vector<std::string>& expected,
const std::vector<std::string>& actual) {
return assertEquals(extractToAssertBuf(message), expected, actual);
}
#if !UCONFIG_NO_FORMATTING
UBool IntlTest::assertEquals(const UnicodeString& message,


@ -18,6 +18,9 @@
#include "unicode/testlog.h"
#include "unicode/uniset.h"
#include <vector>
#include <string>
U_NAMESPACE_USE
#if U_PLATFORM == U_PF_OS390
@ -297,6 +300,8 @@ public:
UBool assertEquals(const char* message, double expected, double actual);
UBool assertEquals(const char* message, UErrorCode expected, UErrorCode actual);
UBool assertEquals(const char* message, const UnicodeSet& expected, const UnicodeSet& actual);
UBool assertEquals(const char* message,
const std::vector<std::string>& expected, const std::vector<std::string>& actual);
#if !UCONFIG_NO_FORMATTING
UBool assertEquals(const char* message, const Formattable& expected,
const Formattable& actual, UBool possibleDataError=FALSE);
@ -315,6 +320,8 @@ public:
UBool assertEquals(const UnicodeString& message, double expected, double actual);
UBool assertEquals(const UnicodeString& message, UErrorCode expected, UErrorCode actual);
UBool assertEquals(const UnicodeString& message, const UnicodeSet& expected, const UnicodeSet& actual);
UBool assertEquals(const UnicodeString& message,
const std::vector<std::string>& expected, const std::vector<std::string>& actual);
virtual void runIndexedTest( int32_t index, UBool exec, const char* &name, char* par = NULL ); // override !


@ -11,12 +11,42 @@
#include "cstring.h"
#include "unicode/unistr.h"
#include "unicode/resbund.h"
#include "unicode/brkiter.h"
#include "unicode/utrace.h"
#include "unicode/ucurr.h"
#include "restsnew.h"
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <limits.h>
#include <vector>
#include <string>
//***************************************************************************************
void NewResourceBundleTest::runIndexedTest( int32_t index, UBool exec, const char* &name, char* /*par*/ )
{
if (exec) logln("TestSuite ResourceBundleTest: ");
TESTCASE_AUTO_BEGIN;
#if !UCONFIG_NO_FILE_IO && !UCONFIG_NO_LEGACY_CONVERSION
TESTCASE_AUTO(TestResourceBundles);
TESTCASE_AUTO(TestConstruction);
TESTCASE_AUTO(TestIteration);
TESTCASE_AUTO(TestOtherAPI);
TESTCASE_AUTO(TestNewTypes);
#endif
TESTCASE_AUTO(TestGetByFallback);
TESTCASE_AUTO(TestFilter);
#if U_ENABLE_TRACING
TESTCASE_AUTO(TestTrace);
#endif
TESTCASE_AUTO_END;
}
//***************************************************************************************
@ -180,27 +210,6 @@ NewResourceBundleTest::~NewResourceBundleTest()
}
}
void NewResourceBundleTest::runIndexedTest( int32_t index, UBool exec, const char* &name, char* /*par*/ )
{
if (exec) logln("TestSuite ResourceBundleTest: ");
switch (index) {
#if !UCONFIG_NO_FILE_IO && !UCONFIG_NO_LEGACY_CONVERSION
case 0: name = "TestResourceBundles"; if (exec) TestResourceBundles(); break;
case 1: name = "TestConstruction"; if (exec) TestConstruction(); break;
case 2: name = "TestIteration"; if (exec) TestIteration(); break;
case 3: name = "TestOtherAPI"; if(exec) TestOtherAPI(); break;
case 4: name = "TestNewTypes"; if(exec) TestNewTypes(); break;
#else
case 0: case 1: case 2: case 3: case 4: name = "skip"; break;
#endif
case 5: name = "TestGetByFallback"; if(exec) TestGetByFallback(); break;
case 6: name = "TestFilter"; if(exec) TestFilter(); break;
default: name = ""; break; //needed to end loop
}
}
//***************************************************************************************
void
@ -1343,5 +1352,86 @@ void NewResourceBundleTest::TestFilter() {
}
}
#if U_ENABLE_TRACING
static std::vector<std::string> gResourcePathsTraced;
static std::vector<std::string> gDataFilesTraced;
static std::vector<std::string> gResFilesTraced;
static void U_CALLCONV traceData(
const void*,
int32_t fnNumber,
int32_t,
const char *,
va_list args) {
if (fnNumber == UTRACE_UDATA_RESOURCE) {
va_arg(args, const char*); // type
va_arg(args, const char*); // file
const char* resourcePath = va_arg(args, const char*);
gResourcePathsTraced.push_back(resourcePath);
} else if (fnNumber == UTRACE_UDATA_DATA_FILE) {
const char* filePath = va_arg(args, const char*);
gDataFilesTraced.push_back(filePath);
} else if (fnNumber == UTRACE_UDATA_RES_FILE) {
const char* filePath = va_arg(args, const char*);
gResFilesTraced.push_back(filePath);
}
}
void NewResourceBundleTest::TestTrace() {
IcuTestErrorCode status(*this, "TestTrace");
const void* context = nullptr;  // initialized: the value is passed to utrace_setFunctions below
utrace_setFunctions(context, nullptr, nullptr, traceData);
utrace_setLevel(UTRACE_VERBOSE);
{
LocalPointer<BreakIterator> brkitr(BreakIterator::createWordInstance("zh-CN", status));
assertEquals("Should touch expected resource paths",
{ "/boundaries/word" },
gResourcePathsTraced);
assertEquals("Should touch expected data files",
{ U_ICUDATA_NAME "-brkitr/word.brk" },
gDataFilesTraced);
// NOTE: The following passes only when this test is run in isolation.
// If run in "make check", these files were already open.
// assertEquals("Should touch expected resource files",
// {
// U_ICUDATA_NAME "-brkitr/zh_CN.res",
// U_ICUDATA_NAME "-brkitr/zh.res",
// U_ICUDATA_NAME "-brkitr/root.res"
// },
// gResFilesTraced);
gResourcePathsTraced.clear();
gDataFilesTraced.clear();
gResFilesTraced.clear();
}
{
ucurr_getDefaultFractionDigits(u"USD", status);
assertEquals("Should touch expected resource paths",
{ "/CurrencyMeta/DEFAULT" },
gResourcePathsTraced);
assertEquals("Should touch expected data files",
{ },
gDataFilesTraced);
// NOTE: The following passes only when this test is run in isolation.
// If run in "make check", these files were already open.
// assertEquals("Should touch expected resource files",
// { U_ICUDATA_NAME "-curr/supplementalData.res" },
// gResFilesTraced);
gResourcePathsTraced.clear();
gDataFilesTraced.clear();
gResFilesTraced.clear();
}
utrace_setFunctions(context, nullptr, nullptr, nullptr);
}
#endif
//eof
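
The `traceData` callback in this test depends on pulling its `va_list` arguments in the order the emitter pushed them, discarding the ones it does not need. The following is a minimal standalone sketch of that consumption order, not ICU code. `sketchEmit`, `sketchTraceData`, `gSketchPaths`, the function number `1`, and the format string are all hypothetical.

```cpp
#include <cstdarg>
#include <string>
#include <vector>

// Hypothetical sink, analogous to gResourcePathsTraced in the test.
static std::vector<std::string> gSketchPaths;

// Callback: arguments must be read in the order they were passed.
static void sketchTraceData(int fnNumber, const char* /*fmt*/, va_list args) {
    if (fnNumber == 1) {                   // hypothetical "resource" event
        va_arg(args, const char*);         // type (skipped)
        va_arg(args, const char*);         // file (skipped)
        const char* path = va_arg(args, const char*);
        gSketchPaths.push_back(path);
    }
    // Other function numbers are ignored in this sketch.
}

// Hypothetical emitter standing in for the library side of the hook.
static void sketchEmit(int fnNumber, const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    sketchTraceData(fnNumber, fmt, args);
    va_end(args);
}
```

A call such as `sketchEmit(1, "%s %s %s", "string", "some-file", "/boundaries/word")` would append `/boundaries/word` to `gSketchPaths`.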


@ -40,6 +40,10 @@ public:
void TestFilter(void);
#if U_ENABLE_TRACING
void TestTrace(void);
#endif
private:
/**
* The assignment operator has no real implementation.


@ -305,7 +305,8 @@ ures_enumDependencies(const char *itemName,
break;
}
int32_t length;
const UChar *alias=res_getString(pResData, res, &length);
// No tracing: build tool
const UChar *alias=res_getStringNoTrace(pResData, res, &length);
checkAlias(itemName, res, alias, length, useResSuffix, check, context, pErrorCode);
}
break;