Marcos Sánchez

Parco: An Experiment in Go Serialization

At Atani, we built a real-time producer that connected to over 30 cryptocurrency exchanges and DEXes. The system streamed trades, orderbook updates, and market data through Redpanda and NATS to various consumers. Millions of messages per day, microseconds mattered, the usual distributed systems fun.

We used easyjson for serialization. It worked. Not perfect, but good enough. JSON is readable, debuggable, and easyjson generates specialized marshal code that’s significantly faster than the standard library. For our throughput and latency requirements, it was sufficient.

But I kept thinking about waste. Every message carried field names we didn’t need. A trade message might look like {"exchange":"binance","symbol":"BTC/USDT","price":45000.50,"volume":1.2}. Those field names—exchange, symbol, price, volume—were transmitted with every single message. The consumer already knew the structure. Why send it repeatedly?
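To put numbers on that waste, here's a rough sketch (plain encoding/json and a hand-rolled binary layout, not Parco itself) comparing the JSON encoding of that trade with a field-name-free binary one:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
)

type trade struct {
	Exchange string  `json:"exchange"`
	Symbol   string  `json:"symbol"`
	Price    float64 `json:"price"`
	Volume   float64 `json:"volume"`
}

// sizes returns the encoded length of the same trade as JSON and as a
// field-name-free binary layout.
func sizes() (jsonLen, binLen int) {
	t := trade{"binance", "BTC/USDT", 45000.50, 1.2}

	// JSON: field names and punctuation ride along with every message.
	j, _ := json.Marshal(t)

	// Binary: two length-prefixed strings plus two raw float64s; the
	// structure itself never crosses the wire.
	var b bytes.Buffer
	for _, s := range []string{t.Exchange, t.Symbol} {
		b.WriteByte(byte(len(s))) // 1-byte length prefix
		b.WriteString(s)
	}
	binary.Write(&b, binary.LittleEndian, t.Price)
	binary.Write(&b, binary.LittleEndian, t.Volume)

	return len(j), b.Len()
}

func main() {
	jl, bl := sizes()
	fmt.Printf("json: %d bytes, binary: %d bytes\n", jl, bl)
	// json: 71 bytes, binary: 33 bytes
}
```

More than half of the JSON payload is structure the consumer already knows.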

The thing about our NATS setup was that topics were hierarchical and explicit: trade.binance.btc.usdt, orderbook.coinbase.eth.usd, ticker.kraken.dot.eur. When you subscribed to trade.binance.btc.usdt, you knew exactly what message structure to expect. The topic itself was the schema definition. We didn’t need self-describing messages because the subscription already told you what was coming.

This seemed like the perfect use case for schema-aware binary serialization. Both sides know the structure. The topic defines what model you’re working with. Why waste bytes on metadata?

Protocol Buffers was the standard answer, but it felt heavy for our use case. We had maybe 3-4 message types that barely changed. Adding protoc to the build pipeline, maintaining .proto files, dealing with generated code—it seemed like overkill for such a simple problem. I wanted something lighter that stayed entirely in Go.

So I built Parco as an experiment.

The Idea

The core concept is simple: use a builder pattern to define your codec once, then serialize and deserialize with direct function calls. No reflection at runtime. No code generation. No external tools.

type Trade struct {
    Exchange  string
    Symbol    string
    Price     float64
    Volume    float64
    Timestamp time.Time
}

tradeParser, tradeCompiler := parco.Builder[Trade]().
    Varchar(func(t Trade) string { return t.Exchange }).
    Varchar(func(t Trade) string { return t.Symbol }).
    Float64LE(func(t Trade) float64 { return t.Price }).
    Float64LE(func(t Trade) float64 { return t.Volume }).
    TimeUTC(func(t Trade) time.Time { return t.Timestamp }).
    Parco()

The builder compiles into a codec that knows the field order and types. Serialization writes bytes sequentially. Deserialization reads them back in the same order. Strings get a length prefix. Numbers are written in little-endian. Times are Unix timestamps. No field names, no type tags, no schema versioning. Just data.

With NATS topics, you could initialize codecs based on subscription patterns. Subscribe to trade.binance.btc.usdt? Use the trade codec. Subscribe to orderbook.coinbase.eth.usd? Use the orderbook codec. The topic router already handles message type discrimination, so Parco just needs to handle serialization.
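A minimal version of that routing is a map from the subject's first token to a codec. The subject names come from the article; the Codec interface below is a stand-in for illustration, not Parco's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// Codec is a stand-in for a compiled parser/compiler pair.
type Codec interface {
	Name() string
}

type codec struct{ name string }

func (c codec) Name() string { return c.name }

// codecFor picks a codec from the first token of a NATS subject:
// "trade.binance.btc.usdt" resolves to the trade codec, and so on.
func codecFor(subject string, codecs map[string]Codec) (Codec, bool) {
	kind, _, _ := strings.Cut(subject, ".")
	c, ok := codecs[kind]
	return c, ok
}

func main() {
	codecs := map[string]Codec{
		"trade":     codec{"trade"},
		"orderbook": codec{"orderbook"},
		"ticker":    codec{"ticker"},
	}
	c, _ := codecFor("trade.binance.btc.usdt", codecs)
	fmt.Println(c.Name()) // trade
}
```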

The unexciting but common reality

We never deployed Parco to production. Not because it didn’t work—the benchmarks were promising, memory usage was better than easyjson, and the concept was sound. We didn’t deploy it because easyjson was already good enough and we had other priorities. Sometimes “good enough” really is good enough.

Trading systems have thin margins for experimentation. When something works, changing it needs strong justification. Parco would have been faster and more efficient, but the improvement wasn’t significant enough to justify the migration risk and effort when we had other bottlenecks to address.

That’s the reality of production systems. The technically interesting solution isn’t always the practical one.

Why open source it

Even though we didn’t use it, the experiment was valid. The technical reasoning was sound. For systems where both sides know the schema—especially with message brokers that use topic-based routing—schema-aware serialization makes sense.

Maybe someone else has the same problem. Maybe they’re streaming Go structs over NATS or Redpanda and thinking “why am I sending field names every time?” Maybe Protocol Buffers feels too heavy for their use case. Maybe they want something simple that stays in Go.

So I cleaned it up, wrote documentation, added tests, and released it. It’s at github.com/sonirico/parco.

When it makes sense

Parco works well when you have:

- Go on both ends of the pipeline
- A schema the producer and consumer already agree on
- Topic-based routing (as in NATS or Redpanda) that tells the consumer which codec to apply
- Enough throughput that payload size and allocations matter

It doesn’t work when you need self-describing messages, cross-language compatibility, or complex schema evolution. Use JSON for APIs and debugging. Use Protocol Buffers for multi-language systems. Use FlatBuffers if you need zero-copy access to huge messages.

But if you’re building an internal Go-to-Go streaming pipeline and you want something simpler than Protobuf, Parco might fit.

Give me numbers

I benchmarked it against JSON and MessagePack. For small messages around 90 bytes, Parco is about 25% faster than JSON. For medium messages around 750 bytes, about 80% faster. For larger payloads around 8KB, roughly twice as fast.

The bigger win is memory. Parco uses constant memory regardless of payload size—184 bytes and 3 allocations per operation. JSON scales linearly with data. When you’re processing millions of messages, lower allocations mean less garbage collection pressure and more consistent latency.

Payload sizes are 50-65% smaller than JSON and 20-40% smaller than MessagePack. Over terabytes of daily traffic, this adds up.

Full benchmarks and methodology are in PERFORMANCE.md if you want details.

Lessons

The lesson here isn’t “Parco is great, everyone should use it.” The lesson is that experiments are valuable even when they don’t reach production.

Building Parco taught me more about Go’s type system, memory management, and serialization trade-offs than reading documentation ever would. It forced me to think deeply about when schema flexibility matters and when it’s just overhead. It made me better at evaluating trade-offs between complexity and benefit.

And maybe, for someone out there with a similar problem, it’ll be useful. That’s what open source is for.

The name

The name comes from concatenating “par” from “parser” with “co” from “compiler”. Plus, “parco” means sparing or terse in Spanish. It kind of came up by itself.

Try it

If you think Parco fits your use case, try it:

go get github.com/sonirico/parco

Documentation and examples are in the repo. If you find bugs or have suggestions, issues and PRs are welcome. If it doesn’t fit what you’re building, that’s fine too! Use something else.