Corey Sunwold
3 min read · Nov 20, 2021

This is an area I have a vested interest in.

I think you nailed it when you mentioned that data got left behind in the DevSecOps shift. Most organizations have this weird structure in place where teams own their domain services end to end for everything except data. Data ends up being something that gets shipped off to another team to deal with. That team (or teams) then ends up skewing the meaning of the data as it aggregates and redistributes it to end users, who are mostly analysts and data scientists. A central team can better centralize the management of the data, but because it is so disconnected from the meaning of the data, it has trouble ensuring quality and availability. This also impairs its ability to properly secure the data, since it can easily misunderstand what it is dealing with.

Moving data governance ownership to domain teams will have the same troubles that the cultural move to DevSecOps has faced. This is good because it gives us a pattern to learn from and build on.

First, you have a training problem. Data technologies (Spark, Hive, Trino, the Parquet and ORC file formats, etc.) are different from the web services and low-latency database systems (where the majority of queries return in < 1s) that domain service engineers usually work with, and those engineers tend to have less experience with them.

Second, you have a policy and compliance problem. Certain fields should only be exposed to certain roles, some data needs to be preserved but only exposed in tokenized form, access needs to be tracked, etc. This is not too different from many existing security policies that require encryption at rest, encryption in transit, proper API auth, etc.; they are just applied to a different problem set.
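To make those three requirements concrete, here is a minimal Python sketch of role-based field exposure, tokenization, and access tracking. Everything in it is an assumption for illustration: the field names, the roles, the `FIELD_POLICY` table, and the HMAC-based `tokenize` helper are hypothetical and do not describe any particular product.

```python
import hashlib
import hmac

# Hypothetical policy: for each field, which roles may read it and in what form.
FIELD_POLICY = {
    "order_id": {"analyst": "clear", "support": "clear"},
    "email": {"analyst": "tokenized", "support": "clear"},
    "card_number": {"analyst": "tokenized"},
}

# Assumption: a real system would load this from a managed secret store.
SECRET_KEY = b"demo-key-not-for-production"

# In-memory stand-in for an audit trail.
audit_log = []


def tokenize(value: str) -> str:
    """Keyed hash, so equal inputs map to equal tokens without exposing the value."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]


def read_record(record: dict, role: str, user: str) -> dict:
    """Return only what `role` may see and record the access."""
    visible = {}
    for field, value in record.items():
        mode = FIELD_POLICY.get(field, {}).get(role)
        if mode == "clear":
            visible[field] = value
        elif mode == "tokenized":
            visible[field] = tokenize(str(value))
        # Fields with no rule for this role are dropped entirely.
    audit_log.append({"user": user, "role": role, "fields": sorted(visible)})
    return visible
```

Calling `read_record(record, "analyst", "alice")` on a record with all three fields would return the order ID in the clear and the other two fields as tokens, and append one entry to `audit_log`.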

I think, just as we've seen how to ease the transition to DevSecOps, we can ease the transition to this new model for data. I guess we've collectively agreed to call it Data Mesh, though I think that's a really ambiguous term, but that's neither here nor there. The point is, domain owners should own these concerns, but shifting all of this burden onto them without any support is a recipe for disaster.

For improved DevOps, we have infrastructure teams who stand up Kubernetes clusters, monitoring tooling, and CI/CD systems. The domain teams just have to integrate with these. We need data infrastructure teams who can provide Spark/Hive/Trino as a service. Many big organizations have service registries to enable service discovery. In the same way, we need a catalog team to offer a catalog as a service that domain teams can integrate with. We need better security tooling so that domain teams can focus on the metadata by labeling "this field is PII, this field is a user token, this field is a customer identifier," etc. The security tooling can then take these labels and say "no one gets to see PII, and only certain people have access to query the order domain," etc. Shameless plug: Okera helps solve this security tooling problem.
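The split between labeling and enforcement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the group name `order-analysts`, and the `allowed_columns` function are hypothetical, and this is not how Okera or any other tool actually implements it.

```python
# Owned by the domain team: labels on the columns of the order domain.
# Column names and label strings are hypothetical.
ORDERS_LABELS = {
    "customer_id": "customer_identifier",
    "session_token": "user_token",
    "email": "pii",
    "total": None,  # unlabeled column
}

# Owned by the security tooling: policy written against labels, not columns.
DENIED_LABELS = {"pii"}                          # "no one gets to see PII"
DOMAIN_READERS = {"orders": {"order-analysts"}}  # groups allowed to query a domain


def allowed_columns(domain: str, labels: dict, user_groups: set) -> list:
    """Return the columns a user may query, or raise if the domain is off limits."""
    if not DOMAIN_READERS.get(domain, set()) & set(user_groups):
        raise PermissionError(f"no access to the {domain} domain")
    return [col for col, label in labels.items() if label not in DENIED_LABELS]
```

The point of the design is that the domain team only maintains `ORDERS_LABELS`. If a new PII column is added and labeled, the existing policy covers it without anyone touching the policy itself.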

Domain owners should be responsible for making their domain accessible in whatever form it is needed, but they need to be given the tools to succeed. Domain engineering teams are increasingly broad generalists, and specialist teams should be building products and services to make their lives as easy as possible. Domain owners exist to unlock business value. Specialists in DevOps, security, and data should be building tools so that these domain teams can fly.
