Assembler
Motivation - What problem does it solve?
Assembler helps “assemble” relational data into their corresponding hierarchical representation.
When querying a relational database, the query result are often not immediately usable for our business logic nor API response because our domain models are often hierarchical.
To use an example, DoobieRoll assemblers can help you transform results of a SQL JOIN query like this:
SELECT company_id, company_name, department_id, department_name, employee_id, employee_name
FROM company
INNER JOIN department ON department.company_id = company.id
INNER JOIN employee ON employee.department_id = department.id
company_id | company_name | department_id | department_name | employee_id | employee_name |
---|---|---|---|---|---|
comp_id_1 | Comp 1 | dep_id_1 | Dep 1 | emp_id_1 | Alice |
comp_id_1 | Comp 1 | dep_id_1 | Dep 1 | emp_id_2 | Bob |
comp_id_2 | Comp 2 | dep_id_2 | Dep 2 | emp_id_3 | John |
comp_id_2 | Comp 2 | dep_id_2 | Dep 3 | emp_id_4 | Nicole |
Into this:
List(
Company(
id = "comp_id_1",
name = "Comp 1",
departments = List(
Department(
id = "dep_id_1",
name = "Dep1",
employees = List(
Employee(
id = "emp_id_1",
name = "Alice",
),
Employee(
id = "emp_id_2",
name = "Bob"
)
)
)
)
),
id = "comp_id_2",
name = "Comp 2",
departments = List(
Department(
id = "dep_id_2",
name = "Dep2",
employees = List(
Employee(
id = "emp_id_3",
name = "John",
)
)
),
Department(
id = "dep_id_3",
name = "Dep3",
employees = List(
Employee(
id = "emp_id_4",
name = "Nicole",
)
)
)
)
With doobie, typically the columns in the result set are grouped into logical groups which roughly maps to domain models. Using the example query above, the type of a doobie query will be:
List[DbCompany :: DbDepartment :: DbEmployee :: HNil]
(The type above is isomorphic to List[(DbCompany, DbDepartment, DbEmployee))]
.
Assembler works directly shapeless’s HList instead of tuples, though it’s easy to convert between them.
I suggest reading about it
to get a basic understand of what they represent before you continue)
Each field in these DB model case classes map to a column in the query result:
case class DbCompany(
id: String,
name: String
)
case class DbDepartment(
id: String,
name: String
)
case class DbEmployee(
id: String,
name: String
)
To convert this to a List[Company]
, we often use .groupBy
s on the ids (in this case on company.id
and department.id
).
We’d group employees belonging to the same department/company together and then convert each group into their domain model.
There are however some downsides to this approach:
- Code is difficult to reuse - You’d have to write the same transformation for every different query you have
- The logic becomes more complex when
- The conversion from the database model (
DbDepartment
) to your domain model (Department
) can fail. - You object relationships are more complex, where a parent class can have multiple children types (which in turn can have their own children types)
- The conversion from the database model (
Assembler solves this problem by only requiring you to declare your data relationships, and it’ll take care of the assembling for you!
Using Assembler
Here are the main steps to use Assembler:
- Define the database-to-domain relationship - Define how to convert each database to their corresponding domain model.
- Create the Assembler using these definitions
- Assemble your query result - Feed the database query results into the assembler to get your domain objects
Let’s see how it’s actually done with an example:
- A Town can have many Schools (one-to-many relationship)
- A School can have many Students (one-to-many relationship)
import java.util.UUID
case class Town(
id: UUID,
name: String,
schools: Vector[School]
)
case class School(
id: UUID,
name: String,
students: Vector[Student]
)
case class Student(
id: UUID,
name: String
)
Let’s say we want to find towns, with the schools in those towns and the students in each of those schools. The code you’ll write with doobie to retrieve this information will probably look like
import doobie.implicits._
import shapeless.{::, HNil}
val townsQuery = fr"""
SELECT town.id, town.name, school.id, school.name, student.id, student.name
FROM town
INNER JOIN school ON school.town_id = town.id
INNER JOIN student ON student.school_id = school.id
WHERE town.name LIKE '%ville'
""".query[DbTown :: DbSchool :: DbStudent :: HNil]
// Our database models
case class DbTown(
id: UUID,
name: String
)
case class DbSchool(
id: UUID,
name: String
)
case class DbStudent(
id: UUID,
name: String
)
and after running the query against some data you’ll have a result list.
val queryResult: Vector[DbTown :: DbSchool :: DbStudent :: HNil] = // ...code to run the SQL query here omitted
1. Define the database-to-domain relationship
When assembling domain models, they can be split into two types:
- Leaf: Types without any children (e.g.
Student
) - Parent: Types with one or more children types, (e.g.
Town
hasSchool
as children, andSchool
hasStdudent
as children)
Let’s define our relationships
import cats.Id
import doobieroll._
import doobieroll.implicits._
val townDef: ParentDef.Aux[Id, Town, DbTown, School :: HNil] = ParentDef.make(
getId = (d: DbTown) => d.id,
constructWithChild = (db: DbTown, schools: Vector[School]) => Town(db.id, db.name, schools)
)
val schoolDef: ParentDef.Aux[Id, School, DbSchool, Student :: HNil] = ParentDef.make(
getId = (d: DbSchool) => d.id,
constructWithChild = (db: DbSchool, students: Vector[Student]) => School(db.id, db.name, students)
)
val studentDef: LeafDef[Id, Student, DbStudent] = LeafDef.make(
(db: DbStudent) => Student(db.id, db.name)
)
2. Create the Assembler
With our individual definitions, we can now build an Assembler
val assembler: ParentAssembler[cats.Id, Town, DbTown :: DbSchool :: DbStudent :: HNil] =
townDef.toAssembler(schoolDef.toAssembler(studentDef.toAssembler))
The signature of the assembler tells us that it knows how to construct some Town
s from a
list of query result rows! (DbTown :: DbSchool :: DbStudent :: HNil
).
(Note how the DB type of assembler and the query result from above matches!)
3. Assemble your query result
val towns: Vector[Town] = assembler.assemble(queryResult)
// towns: Vector[Town] = Vector(
// Town(
// bca65379-1e46-40ca-bff5-5ca96c5fd183,
// "Smallville",
// Vector(
// School(
// c60e0a80-cfcb-4578-b00d-14b3fe68e734,
// "Smallville High",
// Vector(
// Student(d948735b-1173-411c-ac00-e019ee89897a, "Clark"),
// Student(7158132d-fe59-4ef1-8693-1476a0359aa7, "Bob")
// )
// )
// )
// ),
// Town(
// ee58ad86-ce81-4650-b329-b701424fc23b,
// "Springfield",
// Vector(
// School(
// 51f02985-db87-4350-b557-6d9476e1367f,
// "Springfield Elementary School",
// Vector(
// Student(4386ad3e-a30c-4179-938f-a0ead5bd9b4a, "Lisa"),
// Student(997c0f4b-106d-4d80-82bd-02bae0a2c177, "Bart")
// )
// )
// )
// )
// )
That’s it!
4. Usage with Doobie (and any other sources of data)
The Assembler typeclass assemble
can take any input data that resembles a list of HList.
When querying with Doobie, you can query directly into an HList and then pipe the output straight through Assembler.
import cats.effect.IO
import doobie.{ConnectionIO, Transactor}
import doobie.implicits._
// Your SQL query to perform
// (...however you create your Doobie transactor)
val transactor: Transactor[IO] = ???
val query: ConnectionIO[Vector[DbTown :: DbSchool :: DbStudent :: HNil]] =
fr"""
|SELECT school.id, school.name, teacher.id, teacher.name, student.name
|FROM school
|LEFT JOIN teacher WHERE teacher.school_id = school.id
|LEFT JOIN student WHERE student.school_id = school.id
"""
.stripMargin
.query[DbTown :: DbSchool :: DbStudent :: HNil]
.to[Vector]
val result: IO[Vector[Town]] = query.transact(transactor).map { queryResult: Vector[DbTown :: DbSchool :: DbStudent :: HNil] =>
assembler.assemble(queryResult)
}
Usage notes
Row identity
When joining sibling tables, SQL engines will often duplicate data from previous rows when
there are no new data needed, thus Assembler
needs to deduplicate (using equals
/hashCode
)
when processing the data to avoid duplicates.
So it is important to write SQL (and database types) such that your data has fields which allows proper uniqueness detection.
Here’s an example:
SELECT school.id, school.name, teacher.id, teacher.name, student.name
FROM school
LEFT JOIN teacher WHERE teacher.school_id = school.id
LEFT JOIN student WHERE student.school_id = school.id
Note this problematic query only returns the name of the student but not the ID. If the school has 2 teachers but only one student, we will get a result like this:
school.id | school.name | teacher.id | teacher.name | student.name |
---|---|---|---|---|
sch_id_1 | School 1 | tch_id_1 | Einstein | Alice |
sch_id_1 | School 1 | tch_id_2 | Curie | Alice |
The problem here is that Alice has been duplicated by the database engine
to “fill in” the student columns when returning the second teacher
row.
We have no way of knowing whether there are one or two Alice in the school.
The simplest way to solve this is to have some sort of identifier (e.g. UUID)
for each entity type. In the above example, we should retrieve student.id
column.
(This doesn’t just apply when using Assembler
- if you’re using SQL then
this is something you need to think about)
Parent types with multiple children
Assembler supports having more than one child for parent entities. You can use make2
, make3
etc
depending on how many child entities there are.
For example, teacher
and student
can both be child entities of a school
.
case class SchoolMoreInfo(
id: UUID,
name: String,
teachers: Vector[Teacher],
students: Vector[Student]
)
val schoolMoreInfoDef: ParentDef.Aux[Id, SchoolMoreInfo, DbSchool, Teacher :: Student :: HNil] = ParentDef.make2(
getId = (d: DbSchool) => d.id,
constructWithChild = (db: DbSchool, teachers: Vector[Teacher], students: Vector[Student]) => SchoolMoreInfo(
id = db.id,
name = db.name,
teachers = teachers,
students = students
)
)
val schoolMoreInfoAssembler: ParentAssembler[cats.Id, SchoolMoreInfo, DbSchool :: DbTeacher :: DbStudent :: HNil] =
schoolMoreInfoDef.toAssembler(teacherDef.toAssembler, studentDef.toAssembler)
Validated conversion to domain types
It is common for domain types to have additional constraints representing some domain logic
(e.g. name
field cannot be empty). Assembler
allows you to handle failures with any failure context
as long as it has a cats.Applicative
instance. (e.g. , Either
, Validated
Ior
)
To create a fallible definition use makeF
instead of make
val stricterStudentDef: LeafDef[Either[MyError, *], Student, DbStudent] = LeafDef.makeF(
(db: DbStudent) => {
if (db.name.isEmpty) {
Left(MyError("Name is empty!"))
}
else Right(Student(id = db.id, name = db.name))
}
)
And our assemble result will be wrapped in our error
import cats.implicits._ // Provides Monad instance for Either // Provides Monad instance for Either
val schoolAssembler = schoolDef.forEither.toAssembler(stricterStudentDef.toAssembler)
// schoolAssembler: ParentAssembler[Either[MyError, B], School, DbSchool :: DbStudent :: HNil] = doobieroll.syntax.ToAssemblerSyntax$$anon$2@3bfe78f8
schoolAssembler.assemble(queryResultsWithBadData)
// res1: Vector[Either[MyError, School]] = Vector(
// Right(
// School(
// 51f02985-db87-4350-b557-6d9476e1367f,
// "Springfield Elementary School",
// Vector(
// Student(7158132d-fe59-4ef1-8693-1476a0359aa7, "Bob"),
// Student(4386ad3e-a30c-4179-938f-a0ead5bd9b4a, "Lisa")
// )
// )
// ),
// Left(MyError("Name is empty!"))
// )
Some notes for the code snippet:
schoolDef
(from previous section) is an unfallible definition. So we need to “lift” its error context toEither[MyError, *]
in order to combine it with the falliblestricterStudentDef
which can fail- The results are partial failures - Failures will only bubble up and fail all their parents. Other entities with valid data are still assembled and accessible. This is useful in many scenarios where you don’t want one corrupted entity to cause the whole result set to error.