Overview of changes:
- A new C#-specific command-line option, legacy_enum_values to revert to the old behavior
- When legacy_enum_values isn't specified, we strip the enum name as a prefix, and PascalCase the value name
- A new attribute within the C# code so that we can always tell the original in-proto name
Regenerating the C# code with legacy_enum_values leads to code which still compiles and works - but
there's more still to do.
I've moved both protoc.exe and the proto files out of Google.Protobuf.
The .proto files aren't a slam-dunk, but it feels like they belong with protoc as you'd *use* them with protoc.
It's not clear to me whether we really need both an x86 and x64 version of protoc.exe, as x86 would work on 64-bit Windows anyway. Discuss :)
This makes no externally visible behavioral changes. Internally and non-behaviorally:
- We use a field (compiler-generated) to store the JsonName to avoid recomputing it repeatedly
- The documentation for JsonName is updated to reflect the meaning better
- Readonly autoprops and expression-bodied properties used where possible
This detects:
- An end-group tag with the wrong field number (doesn't match the start-group field)
- An end-group tag with no preceding start-group tag
Fixes issue #688.
This is a start to fixing issue #1212. It won't help for test protos,
conformance etc, but it will definitely be better than nothing, and
would have highlighted a change in descriptor.proto which broken C#
earlier.
Recently, descriptor.proto gained a GeneratedCodeInfo message, which means the generated code conflicts with our type.
Unfortunately this affects codegen as well, although this is a part of the public API which is very unlikely to affect hand-written code.
Generated code changes in next commit.
The usage of ICustomDiagnosticMessage here is non-essential - ToDiagnosticString
doesn't actually get called by ToString() in this case, due to JsonFormatter code. It was
intended to make it clearer that it *did* have a custom format... but then arguably I should
do the same for Value, Struct, Any etc.
Moving some of the code out of JsonFormatter and into Duration/Timestamp/FieldMask likewise
feels somewhat nice, somewhat nasty... basically there are JSON-specific bits of formatting, but
also domain-specific bits of computation. <sigh>
Thoughts welcome.
- Tighten up on Infinity/NaN handling in terms of whitespace handling (and test casing)
- Validate that values are genuinely integers when they've been parsed from a JSON number (ignoring the fact that 1.0000000000000000001 == 1 as a double...)
- Allow exponents and decimal points in string representations
The conformance tests now use types which are part of src/google/protobuf, so we need to include src in the proto path.
The notes around "fix-ups" have been out of date for some time now.
This addresses issue #1008, by creating a JsonFormatter which is private and only different
to JsonFormatter.Default in terms of reference equality.
Other plausible designs:
- The same, but expose the diagnostic-only formatter
- Add something to settings to say "I don't have a type registry at all"
- Change the behaviour of JsonFormatter.Default (bad idea IMO, as we really *don't* want the result of this used as regular JSON to be parsed)
Note that just trying to find a separate fix to issue #933 and using that to override Any.ToString() differently wouldn't work for messages that *contain* an Any.
Generated code changes follow in the next commit.
This required a rework of the tokenizer to allow for a "replaying" tokenizer, basically in case the @type value comes after the data itself. This rework is nice in some ways (all the pushback and object depth logic in one place) but is a little fragile in terms of token push-back when using the replay tokenizer. It'll be fine for the scenario we need it for, but we should be careful...
There are corner cases where MessageDescriptor.{ClrType,Parser} will return null, and these are now documented. However, normally they *should* be implemented, even for descriptors of for dynamic messages. Ditto FieldDescriptor.Accessor.
We'll still need a fair amount of work to implement dynamic messages, but this change means that the public API will be remain intact.
Additionally, this change starts making use of C# 6 features in the files that it touches. This is far from exhaustive, and later PRs will have more.
Generated code changes coming in the next commit.
Generated code coming in next commit - in a subsequent PR I want to do a bit of renaming and redocumenting around this, in anticipation of DynamicMessage.
This is only thrown directly by JsonTokenizer, but surfaces from JsonParser as well. I've added doc comments to hopefully make everything clear.
The exception is actually thrown by the reader within JsonTokenizer, in anticipation of keeping track of the location within the document, but that change is not within this PR.
This includes all the well-known types except Any.
Some aspects are likely to require further work when the details of the JSON parsing expectations are hammered out in more detail. Some of these have "ignored" tests already.
Note that the choice *not* to use Json.NET was made for two reasons:
- Going from 0 dependencies to 1 dependency is a big hit, and there's not much benefit here
- Json.NET parses more leniently than we'd want; accommodating that would be nearly as much work as writing the tokenizer
This only really affects the JsonTokenizer, which could be replaced by Json.NET. The JsonParser code would be about the same length with Json.NET... but I wouldn't be as confident in it.
This changes how we approach JSON formatting in general - instead of looking at the field a value came from, we just look at the type of the value. It's possible this *could* be slightly inefficient, but if we start caring about JSON performance deeply, we'll probably want to rewrite all of this anyway. It's definitely simpler this way.
When we support dynamic messages, we'll need to modify JsonFormatter to handle enum values, as they won't come be "real" .NET enums at that point. It shouldn't be hard to do though.
There are now summaries for:
- The Types nested class (which holds nested types)
- The file descriptor class for each proto
- The enum generated for each oneof
(Also fixed two typos.)
Generated code in next commit.
We still need the JSON representation, which relies on something like a DescriptorPool to fetch message types from based on the type URL. That will come a bit later.
(The DescriptorPool comment in this commit is just a note which will prove useful if we use DescriptorPool itself.)
This introduces a new C# option, base_namespace.
If the option is not specified, the behaviour is as before: no directories are generated.
If the option *is* specified, all C# namespaces must be relative to the base namespace, and the directories are generated relative to that namespace.
Example:
- Any.proto declares csharp_namespace = "Google.Protobuf.WellKnownTypes"
- We build with --csharp_out=Google.Protobuf --csharp_opt=base_namespace=Google.Protobuf
- The Any.cs file is generated in Google.Protobuf/WellKnownTypes (where it currently lives)
We need a change to descriptor.proto before this will all work (it wasn't in the right C# namespace) but that needs the other descriptors to be regenerated too. See next commit...
We now do this in protoc instead of the generation simpler.
Benefits:
- Generation script is simpler
- Detection is simpler as we now only need to care about one filename
- The embedded descriptor knows itself as "google/protobuf/descriptor.proto" avoiding dependency issues
This PR also makes the "invalid dependency" exception clearer in terms of expected and actual dependencies.
With this in place, generating APIs on github.com/google/googleapis works - previously annotations.proto failed.
Currently there's no access to the annotations (stored as extensions) but we could potentially expose those at a later date.
- Removed a TODO without change in DescriptorPool.LookupSymbol - the TODOs were around performance, and this is only used during descriptor initialization
- Make the CodedInputStream limits read-only, adding a static factory method for the rare cases when this is useful
- Extracted IDeepCloneable into its own file.
This is a bit of a grotty hack, as we need to sort of fake proto2 field presence, but with only a proto3 version of the descriptor messages (a bit like oneof detection).
Should be okay, but will need to be careful of this if we ever implement proto2.
Now the generated code doesn't need to check for end group tags, as it will skip whole groups at a time.
Currently it will ignore extraneous end group tags, which may or may not be a good thing.
Renamed ConsumeLastField to SkipLastField as it felt more natural.
Removed WireFormat.IsEndGroupTag as it's no longer useful.
This mostly fixes issue 688.
(Generated code changes coming in next commit.)
This is taking an approach of putting all the logic in JsonFormatter. That's helpful in terms of concealing the details of whether or not to wrap the value in quotes, but it does lack flexibility. I don't *think* we want to allow user-defined formatting of messages, so that much shouldn't be a problem.
While I've provided operators, I haven't yet provided the method equivalents. It's not clear to me that
they're actually a good idea, while we're really targeting C# developers who definitely *can* use the user-defined operators.