// Our logic is defined with these
case class UserId(value: Long) extends AnyVal
case class UserData(name: String, age: Int)
case class User(id: UserId, data: UserData, updatedAt: Instant)
harder than it sounds
Mateusz Kubuszok
breaking things in Scala for 10 years
a little bit of open source - including co-authoring Chimney for over 8 years now
blog at Kubuszok.com
a niche ebook: Things you need to know about JVM (that matter in Scala)
What transformations are we talking about? Where are they used?
How can they be generated?
What do they have to do with certain mathematical concepts?
If we can use math to define the rules of transformation, does that make them easy to implement?
// Our logic is defined with these
case class UserId(value: Long) extends AnyVal
case class UserData(name: String, age: Int)
case class User(id: UserId, data: UserData, updatedAt: Instant)
// API is defined with these:
case class UserApi(name: String, age: Int)
def parseUser(json: String): Either[String, UserApi] = ...
// Database persistence is defined with these:
case class UserDB(id: Long, name: String, age: Int, updatedAt: Instant)
def persistUser(user: UserDB): Either[String, Unit] = ...
// Update endpoint logic of some POST /api/users/{userId}
def updateUser(userId: Long, userBody: String): Either[String, Unit] =
parseUser(userBody)
.map { (api: UserApi) =>
??? : User // 1. Use userId and api to create a User
}
.map { (user: User) =>
user // 2. Maybe some domain operations?
}
.map { (user: User) =>
??? : UserDB // 3. Use user to create a UserDB
}
.flatMap { (userDB: UserDB) => persistUser(userDB) }
def updateUser(userId: Long, body: String): Either[String, Unit] =
parseUser(body)
.map { (api: UserApi) =>
// UserApi -> User
User(
UserId(userId), UserData(api.name, api.age), Instant.now())
}
.map { (user: User) =>
user
}
.map { (user: User) =>
// User -> UserDB
UserDB(
user.id.value, user.data.name, user.data.age, user.updatedAt)
}
.flatMap { (userDB: UserDB) => persistUser(userDB) }
What if the number of fields grows?
What if they had similar, primitive types (e.g. String, Int, Boolean)?
What if we reorder the fields?
case class UserData(
firstName: String,
lastName: String,
age: Int,
email: String,
phone: String,
address: String,
city: String
)
(api: UserApi) =>
UserData(
api.lastName, api.firstName, api.age, //<-- lastName & firstName reordered
api.phone, api.email, api.address, api.city //<-- phone & email reordered
)
If the source value and the target type have a field with the same name, it will be used by default
If for every field in the target type a source value can be resolved, the transformation can be generated (conceptually expanding to the sketch below)
Anything extraordinary has to be explicitly enabled by the programmer in the code
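Conceptually, for a pair like UserApi and UserData (same field names, same types), the generated transformation is nothing more than this hand-written constructor call (a sketch of what the macro expands to; an unresolvable field would be a compile-time error, not a runtime one):
// Sketch of the code generated for UserApi -> UserData under the default rules
def userApiToUserData(api: UserApi): UserData =
  UserData(
    name = api.name, // resolved by matching names
    age  = api.age   // resolved by matching names
  )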
import io.scalaland.chimney.dsl._ // <-- some code-gen library
def updateUser(userId: Long, body: String): Either[String, Unit] =
parseUser(body)
.map { (api: UserApi) => // UserApi -> UserData is generated
// ↓ with default configuration
User(UserId(userId), api.into[UserData].transform, Instant.now())
}
.map { (user: User) =>
user
}
.map { (user: User) =>
// Fields from user: User
// AND field from user.data: UserData
// ↓ are used to generate UserDB
user.into[UserDB].withFallback(user.data).transform
}
.flatMap { (userDB: UserDB) => persistUser(userDB) }
case class RowFoo(a: Int, b: Int, c: Int, d: Int, e: Int)
case class RowBar(a: Int, b: Int, c: Int, d: Int, e: String,
g: String, h: String)
val foo: RowFoo = ...
val bar = RowBar( // Verbose, repetitive, "imperative",
a = foo.a, // a lot of "noise"...
b = foo.b,
c = foo.c,
d = foo.d,
e = "overridden",
g = "hardcoded",
h = ((foo.a + foo.b) * foo.c).toString
)
val foo: RowFoo = ...
val bar = foo.into[RowBar] // "Give me a transformation from Foo to Bar
.withFieldConst(_.e, "overridden") // WHERE e is mapped to [provided value]
.withFieldConst(_.g, "hardcoded") // AND g is mapped to [provided value]
.withFieldComputed(_.h, f => ((f.a + f.b) * f.c).toString) // AND h is ...
.transform // AND for everything else, use a field with corresponding name"
Scala - macros, metaprogramming (Chimney, Ducktape, Henkan)
Java - annotation processors, runtime reflection, bytecode generation (MapStruct, Dozer, ModelMapper, JMapper, …)
C# - Reflection.Emit, expression trees (AutoMapper, ExpressMapper, …)
Go - "struct to struct copying" Copier
Swift - runtime reflection (Converter)
But a moment ago we mentioned something about declarative programming.
Let’s digress and talk about SQL.
"second half of 19th century" - calculus of relations created by A. De Morgan, C.S. Pierce, E. Schröder
1940 - A. Tarski, proposes first axiomatization
1970 - A relational model of data for large shared data banks, E.F. Codd
1970s - SQL, Donald D. Chamberlin, Raymond F. Boyce
m - number of "rows"
n - number of "columns"
SELECT *
FROM S s
WHERE phi(s) -- <-- this is selection
SELECT *
FROM U u
WHERE u.age > 20
SELECT s.a1, s.a2, ..., s.an -- <-- this is projection
FROM S s
SELECT u.age, u.name
FROM U u
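If we squint and treat a table as a plain Scala collection of rows, these two operations have familiar counterparts (an informal analogy, not part of the SQL definitions above):
// Informal analogy: a "table" as a Scala collection of rows
case class U(name: String, age: Int)
val users: List[U] = List(U("Ann", 34), U("Bob", 19))
// selection ~ filter: keep only rows satisfying the predicate (WHERE u.age > 20)
val selected: List[U] = users.filter(u => u.age > 20)
// projection ~ map: keep only some of the "columns" (SELECT u.age, u.name)
val projected: List[(Int, String)] = users.map(u => (u.age, u.name))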
SELECT s.a1, s.a2 AS new_name, ..., s.an -- <-- this is rename
FROM S s
Provided that \(h_S\) and \(h_T\) are disjoint
SELECT *
FROM S s, T t -- or CROSS JOIN
If \(h_S\) and \(h_T\) have an overlap, then:
SELECT *
FROM S s
NATURAL JOIN T t
There are also:
\(\theta\)-join
inner join
left outer join
right outer join
full outer join
etc
(SELECT * FROM S s)
UNION
(SELECT * FROM T t)
Weren’t we supposed to talk about transformations?
SELECT src.e, src.d, src.c, src.b, src.a -- order we want in the result
FROM Source src -- order in Source table is irrelevant
case class Source(a: Int, b: Int, c: Int, d: Int, e: Int)
case class Target(e: Int, d: Int, c: Int, b: Int, a: Int)
val src: Source = ...
val tgt = src.into[Target].transform
// Target(e = src.e, d = src.d, c = src.c, b = src.b, a = src.a)
SELECT src.e, src.d, src.c, src.b, src.a -- extra fields are ignored
FROM Source src -- has additional columns f and g
case class Source(a: Int, b: Int, c: Int, d: Int, e: Int, f: Int, g: Int)
// ↑ extra fields
case class Target(e: Int, d: Int, c: Int, b: Int, a: Int)
val src: Source = ...
val tgt = src.into[Target].transform
// Target(e = src.e, d = src.d, c = src.c, b = src.b, a = src.a)
SELECT src.e, src.d, src.c AS z -- column c is renamed to z
FROM Source src -- has attribute c
case class Source(a: Int, b: Int, c: Int)
case class Target(a: Int, b: Int, z: Int)
// ↑ renamed field
val src: Source = ...
val tgt = src.into[Target]
.withFieldRenamed(_.c, _.z) // <-- c AS z
.transform
// Target(a = src.a, b = src.b, z = src.c)
SELECT foo.a, foo.b, foo.c, bar.d, bar.e
FROM Foo foo
CROSS JOIN Bar bar -- Cartesian product
case class Foo(a: Int, b: Int, c: Int)
case class Bar(d: Int, e: Int)
case class Baz(a: Int, b: Int, c: Int, d: Int, e: Int)
val foo: Foo = ...
val bar: Bar = ...
val baz = foo.into[Baz]
.withFallback(bar) // <-- CROSS JOIN
.transform
// Baz(a = foo.a, b = foo.b, c = foo.c, d = bar.d, e = bar.e)
SELECT foo.a, foo.b, foo.c, provided.d, provided.e
FROM Foo foo
CROSS JOIN (SELECT 1 as d, 2 as e) provided
case class Foo(a: Int, b: Int, c: Int)
case class Baz(a: Int, b: Int, c: Int, d: Int, e: Int)
val foo: Foo = ...
val baz = foo.into[Baz]
.withFieldConst(_.d, 1)
.withFieldConst(_.e, 2)
.transform
// Baz(a = foo.a, b = foo.b, c = foo.c, d = 1, e = 2)
SELECT foo.a, foo.b, foo.c, 1 AS d, 2 AS e -- use constants without a table
FROM Foo foo
case class Foo(a: Int, b: Int, c: Int)
// ↓ extra fields
case class Baz(a: Int, b: Int, c: Int, d: Int, e: Int)
val foo: Foo = ...
val baz = foo.into[Baz]
.withFieldConst(_.d, 1)
.withFieldConst(_.e, 2)
.transform
// Baz(a = foo.a, b = foo.b, c = foo.c, d = 1, e = 2)
SELECT foo.a, foo.b, foo.c, custom_function(foo.a, foo.b) AS d
FROM Foo foo
case class Foo(a: Int, b: Int, c: Int)
// ↓ extra field
case class Baz(a: Int, b: Int, c: Int, d: Int)
val foo: Foo = ...
val baz = foo.into[Baz]
.withFieldComputed(_.d, foo => custom_function(foo.a, foo.b))
.transform
// Baz(a = foo.a, b = foo.b, c = foo.c, d = custom_function(foo.a, foo.b))
These were quite easy.
Let’s try something harder.
enum UserType {
case Admin
case Moderator(roles: List[String])
case User(verified: Boolean)
}
val userType: UserType = ...
userType match {
case UserType.Admin => ...
case UserType.Moderator(roles) => ...
case UserType.User(verified) => ...
}
enum UserType {
case Admin
case Moderator(roles: List[String])
case User(verified: Boolean)
}
enum UserDBType {
case Admin
case Moderator(roles: List[String])
case User(verified: Boolean)
case Suspended // <-- extra case
}
val userType: UserType = ...
userType match {
case UserType.Admin => UserDBType.Admin
case UserType.Moderator(roles) => UserDBType.Moderator(roles)
case UserType.User(verified) => UserDBType.User(verified)
// nothing is matched into Suspended and that's ok
}
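The same total match can be generated, because every case of UserType has a counterpart with the same name (and compatible fields) in UserDBType; the extra Suspended case exists only on the target side, so nothing has to map into it. A minimal sketch with Chimney's defaults:
import io.scalaland.chimney.dsl._
val userDBType: UserDBType = userType.into[UserDBType].transform
// conceptually expands to the same match as above:
//   UserType.Admin            -> UserDBType.Admin
//   UserType.Moderator(roles) -> UserDBType.Moderator(roles)
//   UserType.User(verified)   -> UserDBType.User(verified)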
-- we can use unique attribute names to distinguish which case each row came from
SELECT ut.type, null as roles, null as verified FROM UserType ut WHERE ut.type = 'Admin'
UNION
SELECT ut.type, ut.roles, null as verified FROM UserType ut WHERE ut.type = 'Moderator'
UNION
SELECT ut.type, null as roles, ut.verified FROM UserType ut WHERE ut.type = 'User'
case class OuterFoo(inner: InnerFoo)
case class InnerFoo(a: Int, b: Int, c: Int)
case class OuterBar(inner: InnerBar)
case class InnerBar(a: Int, b: Int, c: Int)
val foo: OuterFoo = ...
val baz = foo.into[OuterBar].transform // <-- is this still principled?
OuterBar(
inner = foo.inner.into[InnerBar].transform
)
// or
OuterBar(
inner = {
val inner = foo.inner
InnerBar(a = inner.a, b = inner.b, c = inner.c)
}
)
// or
OuterBar(
inner = InnerBar(a = foo.inner.a, b = foo.inner.b, c = foo.inner.c)
)
SELECT foo.`inner.a` AS `inner.a`,
foo.`inner.b` AS `inner.b`,
foo.`inner.c` AS `inner.c`
FROM OuterFoo foo
It looks too easy, so let’s complicate it a bit.
case class OuterFoo(inner: InnerFoo)
case class InnerFoo(a: Int, b: Int)
case class OuterBar(inner: InnerBar)
case class InnerBar(c: Int, d: Int)
case class OuterBaz(inner: InnerBaz) // ↓ extra field
case class InnerBaz(a: Int, b: Int, c: Int, d: Int, e: Int)
val foo: OuterFoo = ...
val bar: OuterBar = ...
val baz = foo.into[OuterBaz]
.withFallback(bar) // <-- CROSS JOIN / Cartesian product
.withFieldConst(_.inner.e, 1) // <-- extra field
.transform
SELECT foo.`inner.a` AS `inner.a`,
foo.`inner.b` AS `inner.b`,
bar.`inner.c` AS `inner.c`,
bar.`inner.d` AS `inner.d`,
1 AS `inner.e`
FROM OuterFoo foo
CROSS JOIN OuterBar bar
case class Source(value: Int)
case class Target(value: Option[Int])
val source: Source = ...
val target = source.into[Target].transform
// Target(value = Some(source.value))
case class Source2(value: Option[Int])
case class Target2(value: Int)
val source2: Source2 = Source2(None) // <-- cannot be unwrapped!
val target2 = ???
// pseudo-code
class Empty
class ErrorMsg
class Exception
type E[A] = A | Empty | ErrorMsg | Exception
val result: E[Target2] = ...
result match {
case e: Empty => ...
case e: ErrorMsg => ...
case e: Exception => ...
case success: Target2 => ...
}
val source2: Source2 = Source2(None) // <-- cannot be unwrapped!
val result: partial.Result[Target2] = source2.intoPartial[Target2].transform
result match { // <-- actually, implemented as enum (sealed trait)
case partial.Result.Value(target2) => ... // <-- no runtime failures
case partial.Result.Errors(errors) => ... // <-- 1 or more errors
}
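To plug this back into the Either-based pipeline from the beginning, the partial.Result can be folded into more familiar types; a small sketch using its asOption/asEither conversions:
val maybeTarget: Option[Target2] = result.asOption // <-- drops the error details
val eitherTarget: Either[partial.Result.Errors, Target2] = result.asEither
// the Errors value keeps the path of every field that failed (here: "value"),
// so failures stay traceable even in deeply nested structures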
case class Source(someName: String, anotherName: String) // camel case
case class Target(some_name: String, another_name: String) // snake case
symmetric - \(a \equiv b \iff b \equiv a\)
reflexive - \(a \equiv a\)
transitive - \(a \equiv b \land b \equiv c \implies a \equiv c\)
If we make names written in different cases (e.g. camelCase and snake_case) equivalent, we can merge them into a single equivalence class.
// Define equivalence relation via a predicate and use it for mapping
case object DifferenceCasesAreEqual extends TransformedNamesComparison {
def namesMatch(fromName: String, toName: String): Boolean = ...
}
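One possible way to fill in the predicate (a sketch, assuming the only differences we care about are camelCase vs snake_case): drop the underscores, lowercase both names, and compare.
case object DifferenceCasesAreEqual extends TransformedNamesComparison {
  // "someName" and "some_name" both normalize to "somename"
  private def normalize(name: String): String = name.replace("_", "").toLowerCase
  def namesMatch(fromName: String, toName: String): Boolean =
    normalize(fromName) == normalize(toName)
}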
case class Source(someName: String, anotherName: String) // camel case
case class Target(some_name: String, another_name: String) // snake case
val source: Source = ...
val target = source.into[Target]
.enableCustomFieldNameComparison(DifferenceCasesAreEqual)
.transform
// Target(some_name = source.someName, another_name = source.anotherName)
So far:
resolving values for fields can follow some reasonable rules expressed declaratively
similarly for enums/coproducts
and nested data
unavoidable runtime errors can be handled in a type-safe way
we can provide missing values in a principled way as well
case class Source(a: Int, b: String, c: Double)
case class Target(a: Int, b: String, c: Double)
val source: Source = ...
val target = source.into[Target].transform
// Target(a = source.a, b = source.b, c = source.c)
abstract class Parent {
val a: Int = 1 // a is declared as val, but in parent!
}
case class Source(d: Int) extends Parent {
def b: String = "value" // b is declared as method, not field!
val c: Double = 1.0 // c is declared as val, but not in "fields" list!
}
case class Target(a: Int, b: String, c: Double)
val source: Source = ...
val target = source.into[Target].transform
// Target(a = source.a, b = source.b, c = source.c)
case class User(meta: UserMetadata, data: UserData)
case class UserMetadata(id: Long, createdAt: Instant)
case class UserData(name: String, age: Int)
case class UserDto(id: Long, name: String, age: Int, createdAt: Instant)
val user: User = ...
val userDto = ??? // <-- should we automatically flatten user?
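Chimney does not flatten this automatically, but the fallback mechanism shown earlier gives one possible explicit spelling (a sketch; whether such flattening should be implicit is exactly the open question):
val userDto: UserDto = user.data.into[UserDto]
  .withFallback(user.meta) // <-- id and createdAt are taken from the metadata
  .transform
// UserDto(id = user.meta.id, name = user.data.name,
//         age = user.data.age, createdAt = user.meta.createdAt)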
case class User(name: String, age: Int)
class UserBean {
private var name: String = uninitialized
def getName(): String = name
def setName(name: String): Unit = this.name = name
private var age: Int = uninitialized
def getAge(): Int = age
def setAge(age: Int): Unit = this.age = age
}
val user: User = ...
// Requires mapping:
// - setName(user.name)
// - setAge(user.age)
val userBean: UserBean = user.into[UserBean].enableBeanSetters.transform // <-- setters are opt-in
val userBean: UserBean = ...
// Requires mapping:
// - name = userBean.getName()
// - age = userBean.getAge()
val user: User = userBean.into[User].enableBeanGetters.transform // <-- getters are opt-in
trait NamedSet {
// name of the set value belongs to
def setName: String = "User"
}
case class User(name: String, age: Int) extends NamedSet {
// value belongs to User set
override def setName: String = "User"
}
class UserBean {
private var name: String = uninitialized
def getName(): String = name
def setName(name: String): Unit = this.name = name
private var age: Int = uninitialized
def getAge(): Int = age
def setAge(age: Int): Unit = this.age = age
}
case class Source(id: String, data: String) // Newtype-like ↓
case class SourceOptional(source: Option[Source]) extends AnyVal
case class Target(id: String, data: String)
val source: Source = ...
source.into[Target].transform
// Target(id = source.id, data = source.data)
source.into[Option[Target]].transform
// Some(Target(id = source.id, data = source.data))
Option(source).into[Option[Target]].transform
// Option(source).map(src => Target(id = src.id, data = src.data))
Option(source).intoPartial[Target].transform.asOption
// Some(Target(id = source.id, data = source.data))
Option.empty[Source].intoPartial[Target].transform.asOption
// None
case class Source(id: String, data: String) // Newtype-like ↓
case class SourceOptional(source: Option[Source]) extends AnyVal
case class Target(id: String, data: String)
val source: Source = ...
// What does it do?
SourceOptional(Some(source)).intoPartial[Option[Target]].transform.asOption
SourceOptional(None).intoPartial[Option[Target]].transform.asOption
// hint: type is Option[Option[Target]]
// Expected:
Some(Some(Target(id = source.id, data = source.data)))
Some(None)
// Actual:
Some(Some(Target(id = source.id, data = source.data)))
None
when we have schemas and the source and target are similar, we can define a principled way to map between them
but users might have expectations about the types, which might require some creativity when defining the rules
in real-life code there will be cases with no single obvious behavior
in real-life code there will be feature-interaction bugs