* Factored Conformance test messages into shared test schema.
* Updated benchmarks to use new proto3 message locations.
* Fixed include path.
* Conformance: fixed include of Python test messages.
* Make maven in Rakefile use --batch-mode.
* Revert changes to benchmarks.
On second thought I think a separate schema for
CPU benchmarking makes sense.
* Try regenerating C# protos for new test protos.
* Removed benchmark messages from test proto.
* Added Jon Skeet's fixes for C#.
* Removed duplicate/old test messages C# file.
* C# fixes for test schema move.
* Fixed C# to use the correct TestAllTypes message.
* Fixes for Objective C test schema move.
* Added missing EXTRA_DIST file.
At generation time, walk the file's dependencies to see what really contains
extensions so we can generate more minimal code that only links together the
roots that provided extensions. Gets a bunch of otherwise noop code out of
the call flow when the roots are +initialized.
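The dependency walk described above can be sketched roughly as follows. This is only an illustration in Python, not the generator's actual code: `FileInfo` and `roots_with_extensions` are hypothetical names, and the real generator works on protoc's file descriptors.

```python
from dataclasses import dataclass, field


@dataclass
class FileInfo:
    """Hypothetical stand-in for a .proto file descriptor."""
    name: str
    has_extensions: bool = False
    dependencies: list = field(default_factory=list)


def roots_with_extensions(file):
    """Return the names of transitive dependencies that declare extensions."""
    found = []
    seen = set()
    stack = list(file.dependencies)
    while stack:
        dep = stack.pop()
        if dep.name in seen:
            continue  # already visited via another import path
        seen.add(dep.name)
        if dep.has_extensions:
            found.append(dep.name)
        stack.extend(dep.dependencies)
    return sorted(found)


base = FileInfo("base.proto")
ext = FileInfo("ext.proto", has_extensions=True, dependencies=[base])
app = FileInfo("app.proto", dependencies=[ext, base])
print(roots_with_extensions(app))  # only ext.proto needs to be linked in
```

Only the files returned here would need their roots linked together; files such as `base.proto` that contribute no extensions drop out of the generated registry code.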
This takes the code that was sitting in benchmarks/
already and makes it easier for language-specific
benchmarks to consume. Future PRs will enhance this
so that the language-specific benchmarks can report
metrics back that will be tracked over time in PerfKit.
Overview of changes:
- A new C#-specific command-line option, legacy_enum_values to revert to the old behavior
- When legacy_enum_values isn't specified, we strip the enum name as a prefix, and PascalCase the value name
- A new attribute within the C# code so that we can always tell the original in-proto name
Regenerating the C# code with legacy_enum_values leads to code which still compiles and works - but
there's more still to do.
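The renaming rule can be illustrated with a simplified Python sketch. The function names are hypothetical, and the real C# generator likely handles more edge cases (for example, names that would start with a digit after stripping), so treat this as an approximation of the behaviour rather than the implementation.

```python
def strip_enum_prefix(enum_name, value_name):
    """Drop a leading enum name from a value name, ignoring case and underscores."""
    prefix = enum_name.replace("_", "").lower()
    i = 0  # position in value_name
    j = 0  # position in prefix
    while i < len(value_name) and j < len(prefix):
        if value_name[i] == "_":
            i += 1
            continue
        if value_name[i].lower() != prefix[j]:
            return value_name  # the value does not start with the enum name
        i += 1
        j += 1
    if j < len(prefix):
        return value_name  # value name is shorter than the enum name
    rest = value_name[i:].lstrip("_")
    return rest if rest else value_name  # never return an empty name


def to_pascal_case(name):
    """Convert SHOUTY_SNAKE_CASE to PascalCase."""
    return "".join(part.capitalize() for part in name.split("_") if part)


def csharp_enum_value_name(enum_name, value_name):
    return to_pascal_case(strip_enum_prefix(enum_name, value_name))


print(csharp_enum_value_name("Color", "COLOR_DARK_RED"))
print(csharp_enum_value_name("Color", "UNKNOWN"))
```

Under these assumptions, `COLOR_DARK_RED` in enum `Color` becomes `DarkRed`, while a value without the prefix is only PascalCased.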
NOTE: This is a binary breaking change, as structures have changed size
and/or order.
- Drop capturing field options; no other options were captured, and other
mobile-targeted languages don't try to capture this sort of information
(saves 8 bytes for every field defined, in static data and again in field
descriptor instance size data).
- No longer generate/compile in the messages/enums in descriptor.proto. If
developers need it, they should generate it and compile it in. Reduced the
overhead of the core library.
- Compute the number of has_bits actually needed to avoid over-reserving.
- Let singular boolean fields store their value in a has_bit to avoid separate
storage, which makes instance sizes smaller in the common cases.
- Reorder some flags and downsize the enums to contain only the bits needed.
- Reorder the items in the structures to manually ensure they are packed
better (especially when generating 64bit code - 8 bytes for every field,
16 bytes for every extension, instance sizes 8 bytes also).
- Split off the structure initialization so when the default is zero, the
generated static storage doesn't need to reserve the space. This is batched
at the message level, so all the fields for the message have to have zero
defaults to get the savings. By definition all proto3 syntax files fall into
this case, but it also saves space for proto2 files that use the standard
defaults (saves 8 bytes of static data for every field that had a zero
default).
- Don't track the enums defined by a message. Nothing in the runtime needs it
and it was just generation and runtime overhead. (saves 8 bytes per enum)
- Ensure EnumDescriptors are started up threadsafe in all cases.
- Split some of the Descriptor initialization into multiple methods so the
generated code isn't padded with lots of zero/nil args.
- Change how oneof info is fed to the runtime, enabling us to generate less
static data (8 bytes saved per oneof for 64bit).
- Change how enum value information is captured to pack the data and only
decode it if it ends up being needed. Avoids padding issues causing bloat on
64bit, and removes the need for extra pointers in addition to the data (just
the data and one pointer now).
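Two of the items above (sizing has_bits to what is needed, and keeping boolean values in a has_bit) can be illustrated with a small Python sketch. The 32-bit word size and the `HasBits` class are assumptions made for the example; they are not taken from the Objective-C runtime.

```python
BITS_PER_WORD = 32  # assumed word size for this sketch


def has_bit_words(num_bits):
    """Number of words needed to hold num_bits flags, with no over-reserving."""
    return (num_bits + BITS_PER_WORD - 1) // BITS_PER_WORD


class HasBits:
    """Bit storage where a boolean field's value lives directly in one bit."""

    def __init__(self, num_bits):
        self.words = [0] * has_bit_words(num_bits)

    def set(self, index, value):
        word, bit = divmod(index, BITS_PER_WORD)
        if value:
            self.words[word] |= 1 << bit
        else:
            self.words[word] &= ~(1 << bit)

    def get(self, index):
        word, bit = divmod(index, BITS_PER_WORD)
        return bool((self.words[word] >> bit) & 1)


bits = HasBits(33)   # 33 flags need two words rather than a fixed larger block
bits.set(32, True)   # a boolean field's value stored as a single bit
print(len(bits.words), bits.get(32), bits.get(0))
```

The point of the sketch is only that a boolean needs no storage slot of its own once a bit is already reserved for it.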
Recently, descriptor.proto gained a GeneratedCodeInfo message, which means the generated code conflicts with our type.
Unfortunately this affects codegen as well, although this is a part of the public API which is very unlikely to affect hand-written code.
Generated code changes in next commit.
Added instructions on what tools to install to compile protobuf from
source. Removed the INSTALL.txt file because it's just a simple copy of
the autoconf documentation and confuses users.
Change-Id: I6fd8aa13495f1238fe5c62451b95ad480b1c4bed
Apple engineers have pointed out that OSSpinLocks are vulnerable to live locking
on iOS in cases of priority inversion:
. http://mjtsai.com/blog/2015/12/16/osspinlock-is-unsafe/
. https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20151214/000372.html
- Use a dispatch_semaphore_t within the extension registry.
- Use a dispatch_semaphore_t for protecting autocreation within messages.
- Drop the custom/internal GPBString class since we don't have really good
numbers to judge the locking replacements and it isn't required. We can
always bring it back with real data in the future.
- Hopefully complete the deps for other languages for the generated conformance proto sources.
- List the generated sources for cleanup by make's clean rules.
- Make the toplevel nuke the pyc files that can get created in the ObjC dir.
This is only thrown directly by JsonTokenizer, but surfaces from JsonParser as well. I've added doc comments to hopefully make everything clear.
The exception is actually thrown by the reader within JsonTokenizer, in anticipation of keeping track of the location within the document, but that change is not within this PR.
This includes all the well-known types except Any.
Some aspects are likely to require further work when the details of the JSON parsing expectations are hammered out in more detail. Some of these have "ignored" tests already.
Note that the choice *not* to use Json.NET was made for two reasons:
- Going from 0 dependencies to 1 dependency is a big hit, and there's not much benefit here
- Json.NET parses more leniently than we'd want; accommodating that would be nearly as much work as writing the tokenizer
This only really affects the JsonTokenizer, which could be replaced by Json.NET. The JsonParser code would be about the same length with Json.NET... but I wouldn't be as confident in it.
NS_ENUM changes definition in Objective C++ based on the C++ spec being
compiled with, so special-case the one situation where it wouldn't support
doing a forward decl for the enum.
We still need the JSON representation, which relies on something like a DescriptorPool to fetch message types from based on the type URL. That will come a bit later.
(The DescriptorPool comment in this commit is just a note which will prove useful if we use DescriptorPool itself.)
Additionally, change it to return the value passed, and make it generic with a class constraint.
A separate method doesn't have the class constraint, for more unusual scenarios.
Now the build tool needs to define -DHAVE_ZLIB and -DHAVE_PTHREAD rather
than providing a config.h.
- Make pbconfig.h a manually written file to handle hash conditions
according to platform related macros.
- Remove #include "config.h" from source code.
- Changed the configure.ac and Makefile.am to pass down the macros.
- Change cmake to pass down the macros.
Change-Id: I537249d5df8fdeba189706aec436d1ab1104a4dc
- Add more to the ObjC dir readme.
- Merge the ExtensionField and ExtensionDescriptor to reduce overhead.
- Fix an initialization race.
- Clean up the Xcode schemes.
- Remove the class/enum filter.
- Remove some forced inlines that were bloating things without proof of performance wins.
- Rename some internal types to avoid conflicts with the well-known types protos.
- Drop the use of ApplyFunctions so the compiler/optimizer can do what it wants.
- Better document some possible future improvements.
- Add missing support for parsing repeated primitive fields in packed or unpacked forms.
- Improve -hash.
- Add *Count for repeated and map<> fields to avoid auto create when checking for them being set.
- Style fixups in the code.
- map<> serialization fixes and more tests.
- Autocreation of map<> fields (to match repeated fields).
- @@protoc_insertion_point(global_scope|imports).
- Fixup proto2 syntax extension support.
- Move all startup code to +initialize so it happens on class usage and not app startup.
- Have generated headers use forward declarations and move imports into generated code; this reduces what is needed at compile time, speeding up compiles and avoiding pointless rippling of rebuilds.
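The reason for the *Count accessors in the list above is easier to see with a sketch. The Python below is a hypothetical analogue, not the Objective-C API: reading an autocreated field allocates its container, so a separate count lets callers check for contents without triggering that allocation.

```python
class Message:
    """Hypothetical message with one lazily autocreated repeated field."""

    def __init__(self):
        self._items = None  # nothing allocated until first access

    @property
    def items(self):
        # Autocreation: the first read allocates the container.
        if self._items is None:
            self._items = []
        return self._items

    @property
    def items_count(self):
        # Answers "is anything set?" without autocreating the container.
        return 0 if self._items is None else len(self._items)


msg = Message()
print(msg.items_count, msg._items is None)  # checking the count allocates nothing
msg.items.append("a")
print(msg.items_count)
```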
* Rosy hack doesn't apply (that test should be removed
for the open-source release).
* Added our own copy of parameterized.py (the open-source
version of Google Apputils doesn't contain it).
* The C++ Descriptor object didn't implement extension_ranges.
* Had to implement a hack around returning EncodeError, to
work around the module-loading behavior of the test runner.
Changes automake to use tar-ustar as the tarball format, which supports
filenames exceeding 99 characters. Otherwise Nano source files cannot be
distributed.
Change-Id: I33e43148e317374cd46417bebb8559e40fac7299
This adds a Ruby extension in ruby/ that is based on the 'upb' library
(now included as a submodule), and adds support for Ruby code generation
to the protoc compiler.
nested autoconf package rather than as raw source. This way we can
trivially update it again in the future.
Actually, this change doesn't even include gtest in protobuf's SVN.
Instead, we auto-download it when autogen.sh is invoked. Note that
it will be included in release distributions, though.
TODO:
* Add a configure option to use the system's installed gtest rather
than the bundled copy. Apparently the gtest maintainers are working
on some general-purpose autoconf macros which will do this
automagically.
* Update MSVC project files.
All Languages
* Repeated fields of primitive types (types other than string, group, and
nested messages) may now use the option [packed = true] to get a more
efficient encoding. In the new encoding, the entire list is written
as a single byte blob using the "length-delimited" wire type. Within
this blob, the individual values are encoded the same way they would
be normally except without a tag before each value (thus, they are
tightly "packed").
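The difference between the two encodings can be shown with a short Python sketch for varint-typed fields. The helper names are made up for this example, it handles non-negative integers only, and it is not the library's encoder.

```python
WIRETYPE_VARINT = 0
WIRETYPE_LENGTH_DELIMITED = 2


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number, wire_type):
    return encode_varint((field_number << 3) | wire_type)


def encode_unpacked(field_number, values):
    """Normal encoding: one tag in front of every value."""
    tag = encode_tag(field_number, WIRETYPE_VARINT)
    return b"".join(tag + encode_varint(v) for v in values)


def encode_packed(field_number, values):
    """Packed encoding: one tag, a byte length, then the values with no tags."""
    payload = b"".join(encode_varint(v) for v in values)
    tag = encode_tag(field_number, WIRETYPE_LENGTH_DELIMITED)
    return tag + encode_varint(len(payload)) + payload


values = [3, 270, 86942]
print(encode_unpacked(4, values).hex())
print(encode_packed(4, values).hex())
```

For field number 4 and these three values, the packed form is one byte shorter than the unpacked form; the saving grows with the number of elements, since each additional value no longer carries its own tag.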
C++
* UnknownFieldSet now supports STL-like iteration.
* Message interface has method ParseFromBoundedZeroCopyStream() which parses
a limited number of bytes from an input stream rather than parsing until
EOF.
Java
* Fixed bug where Message.mergeFrom(Message) failed to merge extensions.
* Message interface has new method toBuilder() which is equivalent to
newBuilderForType().mergeFrom(this).
* All enums now implement the ProtocolMessageEnum interface.
* Setting a field to null now throws NullPointerException.
* Fixed tendency for TextFormat's parsing to overflow the stack when
parsing large string values. The underlying problem is with Java's
regex implementation (which unfortunately uses recursive backtracking
rather than building an NFA). Worked around by making use of possessive
quantifiers.
Python
* Updated RPC interfaces to allow for blocking operation. A client may
now pass None for a callback when making an RPC, in which case the
call will block until the response is received, and the response
object will be returned directly to the caller. This interface change
cannot be used in practice until RPC implementations are updated to
implement it.
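The calling convention described for the Python RPC interfaces can be sketched with a toy in-process channel. `EchoChannel` and `call_method` are hypothetical names for illustration; this does not use the real service classes, and a real implementation would perform network I/O.

```python
class EchoChannel:
    """Hypothetical channel that answers immediately, standing in for an RPC."""

    def call_method(self, request, done=None):
        response = {"echo": request}
        if done is None:
            # Blocking style: hand the response straight back to the caller.
            return response
        # Callback style: deliver the response through the callback instead.
        done(response)
        return None


channel = EchoChannel()

# Blocking call: pass None (the default) for the callback.
print(channel.call_method("ping"))

# Callback-based call.
received = []
channel.call_method("ping", done=received.append)
print(received)
```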
protoc
* Enum values may now have custom options, using syntax similar to field
options.
* Fixed bug where .proto files which use custom options but don't actually
define them (i.e. they import another .proto file defining the options)
had to explicitly import descriptor.proto.
* Adjacent string literals in .proto files will now be concatenated, like in
C.
C++
* Generated message classes now have a Swap() method which efficiently swaps
the contents of two objects.
* All message classes now have a SpaceUsed() method which returns an estimate
of the number of bytes of allocated memory currently owned by the object.
This is particularly useful when you are reusing a single message object
to improve performance but want to make sure it doesn't bloat up too large.
* New method Message::SerializeAsString() returns a string containing the
serialized data. May be more convenient than calling
SerializeToString(string*).
* In debug mode, log error messages when string-type fields are found to
contain bytes that are not valid UTF-8.
* Fixed bug where a message with multiple extension ranges couldn't parse
extensions.
* Fixed bug where MergeFrom(const Message&) didn't do anything if invoked on
a message that contained no fields (but possibly contained extensions).
* Fixed ShortDebugString() to not be O(n^2). Durr.
* Fixed crash in TextFormat parsing if the first token in the input caused a
tokenization error.
Java
* New overload of mergeFrom() which parses a slice of a byte array instead
of the whole thing.
* New method ByteString.asReadOnlyByteBuffer() does what it sounds like.
* Improved performance of isInitialized() when optimizing for code size.
Python
* Corrected ListFields() signature in Message base class to match what
subclasses actually implement.
* Some minor refactoring.