The reason this test fails on ARM hardware but not on Intel hardware
(including the ARM simulator) is this:
'\xa0' is interpreted as a negative signed byte number. Casting it to
uc16 sign-extends it. The resulting string does not fit into a one-byte
string, thus a two-byte string is allocated.
For some reason the code compiled for ARM does not sign-extend, and 0xa0
fits into a one-byte string. Thus a one-byte string is allocated. Trying
to cast it to two-byte causes an assertion failure.
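A likely explanation for the difference, offered here as an assumption rather than something this change confirms: plain char is signed on typical Intel toolchains and unsigned under the usual ARM ABIs. The sketch below models both widenings explicitly so the two outcomes can be compared on any machine. The uc16 alias, the helper names and the 0xFF one-byte limit are illustrative stand-ins, not the real V8 definitions.

```cpp
#include <cassert>
#include <cstdint>

using uc16 = uint16_t;  // assumption: stand-in for V8's 16-bit character type

// Models a target where plain char is signed: the byte is reinterpreted as a
// signed value first, so widening sign-extends it (0xa0 becomes 0xffa0 on
// mainstream two's-complement compilers).
uc16 WidenAsSignedChar(unsigned char byte) {
  return static_cast<uc16>(static_cast<signed char>(byte));
}

// Models a target where plain char is unsigned: widening zero-extends.
uc16 WidenAsUnsignedChar(unsigned char byte) {
  return static_cast<uc16>(byte);
}

// Assumed one-byte limit (0xff); the real check lives in the string code.
bool FitsInOneByte(uc16 c) {
  return c <= 0xff;
}
```

With these helpers, the sign-extended value no longer fits into one byte while the zero-extended one does, which matches the two allocation paths described above.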
BUG=
Review URL: https://chromiumcodereview.appspot.com/12319111
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@13729 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
The CompilationInfo record now saves a Zone, and the compiler pipeline
allocates memory from the Zone in the CompilationInfo. Before
compiling a function, we create a Zone on the stack and save a pointer
to that Zone in the CompilationInfo; the pipeline then picks up that
Zone and allocates from it.
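The ownership pattern can be sketched roughly as follows. The Zone and CompilationInfo classes here are heavily simplified, hypothetical stand-ins (the real classes have different interfaces), and Node and NewNode are invented purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Hypothetical, simplified arena: every block is released together when the
// Zone itself is destroyed.
class Zone {
 public:
  void* New(size_t size) {
    blocks_.push_back(std::make_unique<unsigned char[]>(size));
    return blocks_.back().get();
  }
  size_t allocation_count() const { return blocks_.size(); }

 private:
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};

// Hypothetical stand-in: holds a pointer to a Zone owned by the caller.
class CompilationInfo {
 public:
  explicit CompilationInfo(Zone* zone) : zone_(zone) {
    assert(zone != nullptr);
  }
  Zone* zone() const { return zone_; }

 private:
  Zone* zone_;
};

// Illustrative pipeline data; trivially destructible, so dropping the Zone
// without running destructors is fine.
struct Node {
  int id;
  Node* next;
};

// A pipeline stage allocates from whatever Zone the CompilationInfo carries.
Node* NewNode(CompilationInfo* info, int id, Node* next) {
  void* memory = info->zone()->New(sizeof(Node));
  return new (memory) Node{id, next};
}
```

A caller would declare a Zone as a local variable, pass its address to a CompilationInfo, and let every NewNode call draw from it; all of the memory goes away when the local Zone leaves scope.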
BUG=
TEST=
Review URL: https://chromiumcodereview.appspot.com/10534139
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11877 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
regexp can match by using a Boyer-Moore-like table. This is done by identifying
non-greedy non-capturing loops in the nodes that eat any character one at a time.
For example in the middle of the regexp /foo[\s\S]*?bar/ we find such a loop.
There is also such a loop implicitly inserted at the start of any non-anchored
regexp.
When we have found such a loop we look ahead in the nodes to find the set of
characters that can come at given distances. For example for the regexp
/.?foo/ we know that there are at least 3 characters ahead of us, and the sets
of characters that can occur are [any, [f, o], [o]]. We find a range in the
lookahead info where the set of characters is reasonably constrained. In our
example this is from index 1 to 2 (0 is not constrained). We can now look 3
characters ahead and if we don't find one of [f, o] (the union of [f, o] and
[o]) then we can skip forwards by the range size (in this case 2).
For Unicode input strings we do the same, but modulo 128.
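The skip loop described above can be sketched as follows, under the assumption that the lookahead sets have already been computed. All names are illustrative and this is not the code the regexp compiler generates; it only shows how the union table, folded modulo 128, yields candidate positions.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr size_t kTableSize = 128;
constexpr size_t kNoCandidate = static_cast<size_t>(-1);
using CharSet = std::bitset<kTableSize>;

// Builds a set from a list of characters, folding each one modulo 128.
CharSet MakeSet(const std::u16string& chars) {
  CharSet set;
  for (char16_t c : chars) set.set(c % kTableSize);
  return set;
}

// Union of the lookahead sets at distances from..to (inclusive).
CharSet UnionOfRange(const std::vector<CharSet>& lookahead, size_t from,
                     size_t to) {
  assert(from <= to && to < lookahead.size());
  CharSet result;
  for (size_t i = from; i <= to; ++i) result |= lookahead[i];
  return result;
}

// Returns the first position at or after start where a match cannot be ruled
// out, or kNoCandidate. The character at distance `to` is tested against the
// union table; on a miss, no match can begin at any of the next
// (to - from + 1) positions, so the loop skips that far.
size_t NextCandidate(const std::u16string& subject, size_t start,
                     const CharSet& table, size_t from, size_t to) {
  assert(from <= to);
  const size_t skip = to - from + 1;
  for (size_t pos = start; pos + to < subject.size(); pos += skip) {
    if (table.test(subject[pos + to] % kTableSize)) return pos;
  }
  return kNoCandidate;
}
```

For /.?foo/ the sets are [any, [f, o], [o]] and the constrained range is 1 to 2. A returned position is only a candidate: the full matcher still has to run there, and folding modulo 128 can produce extra candidates for non-ASCII characters but never hides a real match.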
We also look at the first string fed to the regexp and use that to get a hint
of the character frequencies in the inputs. This affects the assessment of
whether the set of characters is 'reasonably constrained'.
We still have the old lookahead mechanism, which uses a wide load of multiple
characters followed by a mask and compare to determine whether a match is
possible at this point.
Review URL: http://codereview.chromium.org/9965010
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@11204 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
("narrowing conversion ... inside { } is ill-formed in C++11").
* src/mksnapshot.cc: Cast "char" to "unsigned char" when outputting snapshot.
* test/cctest/test-regexp.cc: Use static_cast to uc16 as the char
literal is signed.
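A minimal sketch of both fixes, with assumptions stated: uc16 is modelled as a plain 16-bit alias, and FormatByte with its decimal output format is a hypothetical helper, not the actual mksnapshot code.

```cpp
#include <cstdint>
#include <string>

using uc16 = uint16_t;  // assumption: stand-in for V8's uc16

// Where plain char is signed, '\xa0' is negative, so C++11 rejects it inside
// braces as a narrowing conversion:
//   const uc16 kBad[] = {'\xa0'};
// An explicit cast states the intent and is accepted everywhere.
const uc16 kGood[] = {static_cast<uc16>('\xa0'), static_cast<uc16>('a')};

// Hypothetical snapshot helper: going through unsigned char keeps the
// printed value in 0..255 regardless of whether plain char is signed.
std::string FormatByte(char c) {
  return std::to_string(
      static_cast<unsigned int>(static_cast<unsigned char>(c)));
}
```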
Review URL: http://codereview.chromium.org/8825003
Patch from Tobias Burnus <burnus@net-b.de>.
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@10241 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
architecture-independent.
jsregexp.h is itself included transitively quite a lot, and by getting rid of 19
of its dependencies (which even included things like src/cpu.h, the various
assemblers, etc.), the recompilation behaviour is a bit less funny than it was.
Review URL: http://codereview.chromium.org/7331014
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@8589 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
This commit adds current working versions of assembler, macro-assembler,
disassembler, and simulator.
All other mips arch files are replaced with stubbed-out versions that
will build.
Arch-independent files are updated as needed to support building and
running mips.
The only test is cctest/test-assembler-mips, and this passes on the
simulator and on mips hardware.
TEST=none
BUG=none
Patch by Paul Lind from MIPS.
Review URL: http://codereview.chromium.org/6730029/
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@7388 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
Added Nl category to letters predicate (as required for JS identifiers).
Changed/simplified representation of canonicalization ranges.
Truncated tables to code points in the BMP (all that is used by JS).
Reformatted tables to avoid excessively long lines.
Removed duplicate entries from multi-character mapping result tables.
Review URL: http://codereview.chromium.org/3030026
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@5155 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
As the start index is already passed, it is easy to calculate the "at start" boolean in generated code. Also, as direct entry has been implemented, this needs to be done in generated code anyway, and therefore might as well be moved to the generated code for RegExp. The "at start" value is now calculated as a local variable on the native RegExp frame based on the value of the start index argument.
The x64 version has been tested on both Linux and 64-bit Windows Vista.
For ARM I have tested cctest/test-regexp on ARM hardware, but the rest of the tests have only been run on the ARM simulator.
Review URL: http://codereview.chromium.org/554078
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3709 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
make standard regexps like \s and . case independent.
* Make use of the fact that the subject string is ASCII only
when making character classes case independent.
* Avoid spending time making large ideogram or punctuation
ranges case independent when there is no case mapping anyway.
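The two points above can be illustrated with a small sketch, assuming the subject string is known to be ASCII only. CharRange and AddAsciiCaseEquivalents are hypothetical names, and the real code works on its own range representation.

```cpp
#include <algorithm>
#include <vector>

struct CharRange {
  int from;  // inclusive
  int to;    // inclusive
};

// For an ASCII-only subject, only a-z and A-Z have case equivalents, and
// they differ by a fixed offset of 32. Ranges that do not overlap either
// block (for example large ideogram ranges) are copied through untouched,
// without any per-character work.
std::vector<CharRange> AddAsciiCaseEquivalents(
    const std::vector<CharRange>& ranges) {
  std::vector<CharRange> result = ranges;
  for (const CharRange& r : ranges) {
    // Part of the range inside a-z: add the matching upper-case range.
    int lo = std::max(r.from, static_cast<int>('a'));
    int hi = std::min(r.to, static_cast<int>('z'));
    if (lo <= hi) result.push_back({lo - 32, hi - 32});
    // Part of the range inside A-Z: add the matching lower-case range.
    lo = std::max(r.from, static_cast<int>('A'));
    hi = std::min(r.to, static_cast<int>('Z'));
    if (lo <= hi) result.push_back({lo + 32, hi + 32});
  }
  return result;
}
```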
Review URL: http://codereview.chromium.org/378024
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@3243 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
It is activated with the '--log-gc' flag.
JS object size is calculated as its size + size of 'properties' and 'elements' arrays, if they are non-empty. This doesn't take maps, strings, heap numbers, and other shared objects into account.
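As a rough sketch of that rule, using hypothetical record types in place of real heap objects:

```cpp
#include <cstddef>

// Hypothetical description of a backing array ('properties' or 'elements').
struct BackingArray {
  size_t length;         // number of entries; 0 means empty
  size_t size_in_bytes;  // size of the array object itself
};

// Hypothetical description of a JS object for logging purposes.
struct ObjectInfo {
  size_t own_size;
  BackingArray properties;
  BackingArray elements;
};

// Own size plus the size of each non-empty backing array. Maps, strings,
// heap numbers and other shared objects are deliberately not counted.
size_t LoggedSize(const ObjectInfo& object) {
  size_t total = object.own_size;
  if (object.properties.length > 0) total += object.properties.size_in_bytes;
  if (object.elements.length > 0) total += object.elements.size_in_bytes;
  return total;
}
```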
As Soeren suggested, I've moved ZoneSplayTree from jsregexp to zone, and removed the now-empty jsregexp-inl header file.
Review URL: http://codereview.chromium.org/159504
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@2570 ce2b1a6d-e550-0410-aec6-3dcde31c8c00
- TARGET, the architecture we will generate code for.
This is brought in from the build system.
- HOST, the architecture our C++ compiler is building for.
This is detected automatically based on compiler defines.
This adds macros for 32 or 64 bit, and cleans up some
include conditionals, etc.
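The split between the two can be sketched as below. The SKETCH_ macro names are made up for illustration and are not the macros this change introduces; the compiler defines tested (__x86_64__, _M_X64, __aarch64__, __i386__, _M_IX86, __arm__) are the usual predefined ones.

```cpp
#include <cstddef>

// HOST: detected automatically from compiler defines.
#if defined(__x86_64__) || defined(_M_X64) || defined(__aarch64__)
#define SKETCH_HOST_ARCH_64_BIT 1
#elif defined(__i386__) || defined(_M_IX86) || defined(__arm__)
#define SKETCH_HOST_ARCH_32_BIT 1
#endif

// TARGET: expected from the build system, for example via a flag such as
// -DSKETCH_TARGET_ARCH_ARM=1 (hypothetical name). Nothing is detected here.
#if defined(SKETCH_TARGET_ARCH_ARM)
constexpr bool kTargetIsArm = true;
#else
constexpr bool kTargetIsArm = false;
#endif

// Pointer width of the host as seen by this sketch; 0 when the host
// architecture is not one of those recognised above.
#if defined(SKETCH_HOST_ARCH_64_BIT)
constexpr int kHostPointerBits = 64;
#elif defined(SKETCH_HOST_ARCH_32_BIT)
constexpr int kHostPointerBits = 32;
#else
constexpr int kHostPointerBits = 0;
#endif
```

Keeping the two apart means code generation can key off the target macros while pointer-size and include decisions key off the host macros.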
Review URL: http://codereview.chromium.org/99355
git-svn-id: http://v8.googlecode.com/svn/branches/bleeding_edge@1864 ce2b1a6d-e550-0410-aec6-3dcde31c8c00