Codecs
A BinaryCodec[T] defines both the serializer and the deserializer for a given type:
trait BinaryCodec[T] extends BinarySerializer[T] with BinaryDeserializer[T]
Primitive types
The io.github.vigoo.desert package defines a lot of implicit binary codecs for common types.
The following code examples demonstrate this and also show what the binary representation looks like.
import io.github.vigoo.desert._
import io.github.vigoo.desert.shapeless._
import io.github.vigoo.desert.zioprelude._
import java.time._
import java.time.temporal.ChronoUnit
import scala.math._
val byte = serializeToArray(100.toByte)
val short = serializeToArray(100.toShort)
val int = serializeToArray(100)
val long = serializeToArray(100L)
val float = serializeToArray(3.14.toFloat)
val double = serializeToArray(3.14)
val bool = serializeToArray(true)
val unit = serializeToArray(())
val ch = serializeToArray('!')
val str = serializeToArray("Hello")
val uuid = serializeToArray(java.util.UUID.randomUUID())
-49 | -100 | 106 | -68 | 95 | 89 | 66 | 21 | -121 | 34 | 68 | 125 | -110 | 97 | -105 | 36 |
val bd = serializeToArray(BigDecimal(1234567890.1234567890))
36 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 48 | 46 | 49 | 50 | 51 | 52 | 53 | 54 | 55 |
val bi = serializeToArray(BigInt(1234567890))
val dow = serializeToArray(DayOfWeek.SATURDAY)
val month = serializeToArray(Month.FEBRUARY)
val year = serializeToArray(Year.of(2022))
val monthDay = serializeToArray(MonthDay.of(12, 1))
val yearMonth = serializeToArray(YearMonth.of(2022, 12))
val period = serializeToArray(Period.ofWeeks(3))
val zoneOffset = serializeToArray(ZoneOffset.UTC)
val duration = serializeToArray(Duration.of(123, ChronoUnit.SECONDS))
val instant = serializeToArray(Instant.parse("2022-12-01T11:11:00Z"))
val localDate = serializeToArray(LocalDate.of(2022, 12, 1))
val localTime = serializeToArray(LocalTime.of(11, 11))
val localDateTime = serializeToArray(LocalDateTime.of(2022, 12, 1, 11, 11, 0))
val offsetDateTime = serializeToArray(OffsetDateTime.of(2022, 12, 1, 11, 11, 0, 0, ZoneOffset.UTC))
val zonedDateTime = serializeToArray(ZonedDateTime.of(2022, 12, 1, 11, 11, 0, 0, ZoneOffset.UTC))
Option, Either, Try, Validation
Common types such as Option and Either are also supported out of the box. For Try, the library
also has a codec for arbitrary Throwable instances, although deserializing one does not recreate
the original throwable, only a PersistedThrowable instance. In practice this is a much safer approach
than trying to recreate the same exception via reflection.
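The idea of persisting a throwable as plain data instead of rebuilding it by reflection can be sketched with the standard library alone. The following is only an illustration: Snapshot and snapshot are hypothetical names, not part of desert, and the field names are chosen to mirror the PersistedThrowable output printed for failDeser in this section.

```scala
object PersistedThrowableSketch {
  // Plain-data view of a throwable (hypothetical type, not desert's PersistedThrowable).
  final case class Snapshot(
      className: String,
      message: String,
      stackTrace: Vector[StackTraceElement],
      cause: Option[Snapshot]
  )

  // Copies the observable parts of a throwable; no exception class is ever instantiated.
  def snapshot(t: Throwable): Snapshot =
    Snapshot(
      className = t.getClass.getName,
      message = Option(t.getMessage).getOrElse(""),             // getMessage may be null
      stackTrace = t.getStackTrace.toVector,
      cause = Option(t.getCause).filter(_ ne t).map(snapshot)   // guard against self-causation
    )
}
```

Because only strings and stack trace elements are stored, reading such a value back never needs to load or construct the original exception class.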
import scala.collection.immutable.SortedSet
import scala.util._
import zio.NonEmptyChunk
import zio.prelude.Validation
val none = serializeToArray[Option[Int]](None)
val some = serializeToArray[Option[Int]](Some(100))
val left = serializeToArray[Either[Boolean, Int]](Left(true))
val right = serializeToArray[Either[Boolean, Int]](Right(100))
val valid = serializeToArray[Validation[String, Int]](Validation.succeed(100))
val invalid = serializeToArray[Validation[String, Int]](Validation.failNonEmptyChunk(NonEmptyChunk("error")))
val fail = serializeToArray[Try[Int]](Failure(new RuntimeException("Test exception")))
val failDeser = fail.flatMap(data => deserializeFromArray[Try[Int]](data))
// failDeser: Either[DesertFailure, Try[Int]] = Right(
// value = Failure(
// exception = PersistedThrowable(
// className = "java.lang.RuntimeException",
// message = "Test exception",
// stackTrace = Array(
// repl.MdocSession$MdocApp.<init>(codecs.md:242),
// repl.MdocSession$.app(codecs.md:3),
// mdoc.internal.document.DocumentBuilder$$doc$.$anonfun$build$2(DocumentBuilder.scala:89),
// scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18),
// scala.util.DynamicVariable.withValue(DynamicVariable.scala:59),
// scala.Console$.withErr(Console.scala:193),
// mdoc.internal.document.DocumentBuilder$$doc$.$anonfun$build$1(DocumentBuilder.scala:89),
// scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18),
// scala.util.DynamicVariable.withValue(DynamicVariable.scala:59),
// scala.Console$.withOut(Console.scala:164),
// mdoc.internal.document.DocumentBuilder$$doc$.build(DocumentBuilder.scala:88),
// mdoc.internal.markdown.MarkdownBuilder$.$anonfun$buildDocument$2(MarkdownBuilder.scala:47),
// mdoc.internal.markdown.MarkdownBuilder$$anon$1.run(MarkdownBuilder.scala:104)
// ),
// cause = None
// )
// )
// )
val success = serializeToArray[Try[Int]](Success(100))
Collections
There is a generic iterableCodec that can be used to define implicit collection codecs based on
the Scala 2.13 collection API. For example, this is how the vectorCodec is defined:
implicit def vectorCodec[A: BinaryCodec]: BinaryCodec[Vector[A]] = iterableCodec[A, Vector[A]]
Every collection codec uses one of two possible representations. If the size is known in advance,
it is the number of elements followed by all the items in iteration order; otherwise it is a flat
list of all the elements, each wrapped in Option[T]. Vector and List are good examples of the two:
val vec = serializeToArray(Vector(1, 2, 3, 4))
val lst = serializeToArray(List(1, 2, 3, 4))
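The two layouts can be sketched with the standard library alone. This is not desert's implementation: the object and function names are hypothetical, and the sketch assumes a zigzag variable-length int for sizes, a -1 size marker for the unknown-size case and 4-byte big-endian ints. These assumptions were chosen because they reproduce the byte dump shown for List("Hello", "Hello") in the String deduplication section.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object CollectionLayoutSketch {
  // Zigzag maps small negative and positive ints to small unsigned values.
  private def zigzag(n: Int): Int = (n << 1) ^ (n >> 31)

  // Variable-length int: 7 data bits per byte, high bit set while more bytes follow (assumed scheme).
  def varInt(n: Int): Vector[Byte] = {
    var rest = zigzag(n)
    val out  = Vector.newBuilder[Byte]
    while ((rest & ~0x7f) != 0) {
      out += ((rest & 0x7f) | 0x80).toByte
      rest >>>= 7
    }
    out += rest.toByte
    out.result()
  }

  // Fixed-width big-endian int (assumed element encoding).
  def int(n: Int): Vector[Byte] = ByteBuffer.allocate(4).putInt(n).array().toVector

  // Length in bytes as a variable-length int, followed by the UTF-8 bytes.
  def string(s: String): Vector[Byte] = {
    val bytes = s.getBytes(UTF_8)
    varInt(bytes.length) ++ bytes
  }

  // Known size: element count first, then the items in iteration order.
  def knownSize[A](items: Iterable[A])(item: A => Vector[Byte]): Vector[Byte] =
    varInt(items.size) ++ items.flatMap(item)

  // Unknown size: -1 marker, every item written as Some(item) (tag 1), closed by None (tag 0).
  def unknownSize[A](items: Iterable[A])(item: A => Vector[Byte]): Vector[Byte] =
    varInt(-1) ++ items.flatMap(a => 1.toByte +: item(a)) ++ Vector(0.toByte)
}
```

The known-size form costs one size prefix in total, while the unknown-size form costs one tag byte per element plus a terminator, which is the trade-off between the two representations.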
Other supported collection types in the codecs package:
import zio.NonEmptyChunk
import zio.prelude.NonEmptyList
import zio.prelude.ZSet
val arr = serializeToArray(Array(1, 2, 3, 4))
val set = serializeToArray(Set(1, 2, 3, 4))
val sortedSet = serializeToArray(SortedSet(1, 2, 3, 4))
val nec = serializeToArray(NonEmptyChunk(1, 2, 3, 4))
val nel = serializeToArray(NonEmptyList(1, 2, 3, 4))
val nes = serializeToArray(ZSet(1, 2, 3, 4))
8 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 1 |
String deduplication
For strings, the library has a simple deduplication system that does not cost any extra
bytes when strings are not duplicated. In general, a string is encoded as a variable-length
int holding the length of the string in bytes, followed by its UTF-8 encoding.
When deduplication is enabled, each serialized string gets an ID, and if the same string is
serialized again in the same stream, a negative number in place of the length identifies it.
val twoStrings1 = serializeToArray(List("Hello", "Hello"))
1 | 1 | 10 | 72 | 101 | 108 | 108 | 111 | 1 | 10 | 72 | 101 | 108 | 108 | 111 | 0 |
val twoStrings2 = serializeToArray(List(DeduplicatedString("Hello"), DeduplicatedString("Hello")))
It is not turned on by default because it breaks backward compatibility when evolving data structures.
If a new string field is added, old versions of the application skip it, and if that field held the
first occurrence of a string, they never assign it the ID that later references rely on.
It is enabled internally in desert for some cases, and can be used in custom serializers freely.
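The ID scheme and the compatibility problem can be illustrated with a small standard-library sketch. It is not desert's wire format: the stream is modelled as a sequence of Int tokens, the Writer and Reader classes are hypothetical, and encoding a reference to ID n as -(n + 1) is an assumption made only so that references are always negative.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import scala.collection.mutable

object StringDedupSketch {
  // Tokens stand in for the byte stream: a non-negative length followed by
  // that many byte values, or a single negative back-reference.
  final class Writer {
    private val ids = mutable.Map.empty[String, Int]
    private val out = mutable.ArrayBuffer.empty[Int]

    def writeString(s: String): Unit = ids.get(s) match {
      case Some(id) => out += -(id + 1)        // seen before: write only the reference
      case None =>
        ids(s) = ids.size                      // first occurrence gets the next free ID
        val bytes = s.getBytes(UTF_8)
        out += bytes.length
        bytes.foreach(b => out += (b & 0xff))
    }

    def result(): Vector[Int] = out.toVector
  }

  final class Reader(tokens: Vector[Int]) {
    private val seen = mutable.ArrayBuffer.empty[String]
    private var pos  = 0

    // Jumps over n tokens without parsing them, like an old version skipping an unknown field.
    def skip(n: Int): Unit = pos += n

    def readString(): Either[String, String] = {
      val head = tokens(pos)
      pos += 1
      if (head >= 0) {
        val s = new String(tokens.slice(pos, pos + head).map(_.toByte).toArray, UTF_8)
        pos += head
        seen += s                              // IDs are assigned in reading order
        Right(s)
      } else {
        val id = -head - 1
        if (id < seen.length) Right(seen(id)) else Left(s"unknown string id $id")
      }
    }
  }
}
```

A reader that parses both strings resolves the reference, while a reader that skips the first one has no entry for ID 0 when it reaches the reference. This is the situation described above for old application versions.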
Tuples
The elements of a tuple are serialized flat and the whole tuple is prefixed by 0, which makes
tuples compatible with simple case classes:
val tup = serializeToArray((1, 2, 3))
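Why a flat layout makes tuples and simple case classes interchangeable can be shown with a standard-library sketch. The names below are hypothetical and the sketch assumes 4-byte big-endian ints, as the ZSet byte dump in the Collections section suggests. It is an illustration of the layout described above, not desert's code.

```scala
import java.nio.ByteBuffer

object TupleLayoutSketch {
  // Fixed-width big-endian int (assumed element encoding).
  def int(n: Int): Vector[Byte] = ByteBuffer.allocate(4).putInt(n).array().toVector

  // A leading 0 byte, then the fields back to back with no per-field framing.
  def flat(fields: Int*): Vector[Byte] = 0.toByte +: fields.toVector.flatMap(int)

  final case class Point(x: Int, y: Int, z: Int)

  def tuple(t: (Int, Int, Int)): Vector[Byte] = flat(t._1, t._2, t._3)
  def point(p: Point): Vector[Byte]           = flat(p.x, p.y, p.z)
}
```

Since neither encoder writes field names or type tags, a three-element tuple and a three-field case class with the same field types produce identical bytes in this sketch.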
Maps
Map, SortedMap and NonEmptyMap are just further uses of iterableCodec, built on top of the tuple
support to serialize an iteration of key-value pairs:
import scala.collection.immutable.SortedMap
val map = serializeToArray(Map(1 -> "x", 2 -> "y"))
val sortedmap = serializeToArray(SortedMap(1 -> "x", 2 -> "y"))
Generic codecs for ADTs
There is a generic derivable codec for algebraic data types, with support for evolving the type
during the lifecycle of the application.
For case classes the representation is the same as for tuples:
case class Point(x: Int, y: Int, z: Int)
object Point {
  implicit val codec: BinaryCodec[Point] = DerivedBinaryCodec.derive
}
val pt = serializeToArray(Point(1, 2, 3))
Note that no @evolutionSteps annotation is used for the type. In this case the only additional storage
cost is a single 0 byte at the beginning, just like with tuples. The evolution steps are explained in
a separate section.
For sum types, the Shapeless-based derivation does not automatically derive the codec for all the
constructors. This is mostly for historical reasons, as previous versions required passing the evolution
steps as parameters to the derive method. The new ZIO Schema based derivation does not have this limitation.
Other than that it works the same way, with derive:
sealed trait Drink
case class Beer(typ: String) extends Drink
case object Water extends Drink
object Drink {
  implicit val beerCodec: BinaryCodec[Beer] = DerivedBinaryCodec.derive
  implicit val waterCodec: BinaryCodec[Water.type] = DerivedBinaryCodec.derive
  implicit val codec: BinaryCodec[Drink] = DerivedBinaryCodec.derive
}
val a = serializeToArray[Drink](Beer("X"))
val b = serializeToArray[Drink](Water)
Transient fields in generic codecs
It is possible to mark some fields of a case class as transient:
case class Point2(x: Int, y: Int, z: Int, @transientField(None) cachedDistance: Option[Double])
object Point2 {
  implicit val codec: BinaryCodec[Point2] = DerivedBinaryCodec.derive
}
val serializedPt2 = serializeToArray(Point2(1, 2, 3, Some(3.7416)))
val pt2 = for {
  data <- serializedPt2
  result <- deserializeFromArray[Point2](data)
} yield result
// pt2: Either[DesertFailure, Point2] = Right(
// value = Point2(x = 1, y = 2, z = 3, cachedDistance = None)
// )
Transient fields are not serialized; during deserialization they get the default value given in the
annotation. Note that the default value is not type checked at compile time: if it does not match
the field type, it causes a runtime error.
Transient constructors in generic codecs
It is possible to mark whole constructors as transient:
sealed trait Cases
@transientConstructor case class Case1() extends Cases
case class Case2() extends Cases
object Cases {
  implicit val case2Codec: BinaryCodec[Case2] = DerivedBinaryCodec.derive
  implicit val codec: BinaryCodec[Cases] = DerivedBinaryCodec.derive
}
val cs1 = serializeToArray[Cases](Case1())
Left(SerializingTransientConstructor(Case1))
val cs2 = serializeToArray[Cases](Case2())
Transient constructors cannot be serialized. A common use case is remotely accessible actors where
some actor messages are known to be local only. Marking them as transient lets them hold non-serializable
data without breaking the serialization of the other, remote messages.
Generic codecs for value type wrappers
It is good practice to use zero-cost value type wrappers around primitive types to represent
the intention in the type system. desert can derive binary codecs for these too:
case class DocumentId(id: Long) // extends AnyVal
object DocumentId {
  implicit val codec: BinaryCodec[DocumentId] = DerivedBinaryCodec.deriveForWrapper
}
val id = serializeToArray(DocumentId(100))
Custom codecs
Serialization is a simple Scala function using an implicit serialization context:
def serialize(value: T)(implicit context: SerializationContext): Unit
while the deserialization is
def deserialize()(implicit ctx: DeserializationContext): T
The io.github.vigoo.desert.custom package contains a set of serialization and
deserialization functions, all requiring the implicit contexts, that can be used
to implement custom codecs.
By implementing the BinaryCodec trait it is possible to define a fully custom codec. In the following
example we define a data type capable of representing cyclic graphs via a mutable next field, and
a custom codec for serializing and deserializing it. It also shows the built-in support for tracking
object references, which is not used by the generic codecs but can be used in scenarios like this.
import cats.instances.either._
import io.github.vigoo.desert.custom._
final class Node(val label: String,
                 var next: Option[Node]) {
  override def toString: String =
    next match {
      case Some(n) => s"<$label -> ${n.label}>"
      case None    => s"<$label>"
    }
}
object Node {
  implicit lazy val codec: BinaryCodec[Node] =
    new BinaryCodec[Node] {
      override def serialize(value: Node)(implicit context: SerializationContext): Unit = {
        write(value.label) // write the label using the built-in string codec
        value.next match {
          case Some(next) =>
            write(true)            // next is defined (built-in boolean codec)
            storeRefOrObject(next) // store ref-id or serialize next
          case None =>
            write(false) // next is undefined (built-in boolean codec)
        }
      }
      override def deserialize()(implicit ctx: DeserializationContext): Node = {
        val label  = read[String]()        // read the label using the built-in string codec
        val result = new Node(label, None) // create the new node
        storeReadRef(result)               // store the node in the reference map
        val hasNext = read[Boolean]()      // read if 'next' is defined
        if (hasNext) {
          // Read next with reference-id support and mutate the result
          val next = readRefOrValue[Node](storeReadReference = false)
          result.next = Some(next)
        }
        result
      }
    }
}
case class Root(node: Node)
object Root {
  implicit val codec: BinaryCodec[Root] = new BinaryCodec[Root] {
    override def deserialize()(implicit ctx: DeserializationContext): Root =
      Root(readRefOrValue[Node](storeReadReference = false))
    override def serialize(value: Root)(implicit context: SerializationContext): Unit =
      storeRefOrObject(value.node)
  }
}
val nodeA = new Node("a", None)
// nodeA: Node = <a -> a>
val nodeB = new Node("a", None)
// nodeB: Node = <a -> a>
val nodeC = new Node("a", None)
// nodeC: Node = <a -> a>
nodeA.next = Some(nodeB)
nodeB.next = Some(nodeC)
nodeC.next = Some(nodeA)
val result = serializeToArray(Root(nodeA))
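Why reference tracking makes the cyclic graph above serializable can be seen in a standard-library sketch of the same idea. Everything in it is hypothetical: the token types stand in for the binary stream and the two functions only mimic the roles that storeRefOrObject, storeReadRef and readRefOrValue play in the codec above. It does not describe how desert implements them.

```scala
import java.util.IdentityHashMap
import scala.collection.mutable

object RefTrackingSketch {
  final class Node(val label: String, var next: Option[Node])

  // Token stream standing in for the binary output (not desert's wire format).
  sealed trait Token
  final case class Ref(id: Int)                         extends Token // node already written
  final case class Obj(label: String, hasNext: Boolean) extends Token // node seen for the first time

  def serialize(root: Node): Vector[Token] = {
    val ids = new IdentityHashMap[Node, Int]() // keyed by object identity, not equality
    val out = Vector.newBuilder[Token]
    def go(node: Node): Unit =
      if (ids.containsKey(node)) out += Ref(ids.get(node))
      else {
        ids.put(node, ids.size)                // register before descending, so cycles terminate
        out += Obj(node.label, node.next.isDefined)
        node.next.foreach(go)
      }
    go(root)
    out.result()
  }

  def deserialize(tokens: Vector[Token]): Node = {
    val seen = mutable.ArrayBuffer.empty[Node]
    val it   = tokens.iterator
    def go(): Node = it.next() match {
      case Ref(id) => seen(id)
      case Obj(label, hasNext) =>
        val node = new Node(label, None)
        seen += node                           // register before reading next, mirroring the write order
        if (hasNext) node.next = Some(go())
        node
    }
    go()
  }
}
```

The key design point is that a node is registered before its next field is processed, on both sides. Without that ordering, the cycle a -> b -> c -> a would recurse forever when writing and could not be tied back to the first node when reading.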
Monadic custom codecs
Previous versions of desert exposed a monadic serializer/deserializer API based on ZPure with the
following types:
type Ser[T] = ZPure[Nothing, SerializerState, SerializerState, SerializationEnv, DesertFailure, T]
type Deser[T] = ZPure[Nothing, SerializerState, SerializerState, DeserializationEnv, DesertFailure, T]
For compatibility, the library still defines the monadic version of the serialization functions in the
io.github.vigoo.desert.custom.pure package.
A monadic serializer or deserializer can be converted to a BinarySerializer or BinaryDeserializer
using the fromPure method.
To achieve higher performance, it is recommended to implement custom codecs using the low level serialization API.