To Lump or to Split?
November 12, 2010
I used to be a splitter. I have recently begun the process of converting to a lumper. What are lumpers and splitters? Wikipedia defines lumping and splitting in software modeling as follows:
A lumper is always keen to generalize, and produces models with a small number of broadly defined objects. A splitter is reluctant to generalize, and produces models with a large number of narrowly defined objects. For example, according to the lumpers, a subcontractor could be basically the same as any other supplier, and is therefore the same class; meanwhile the splitters would probably argue that there are significant differences between different groups of suppliers, justifying separate classes in the model
Merriam-Webster defines lumper as a noun meaning:
1 : a laborer who handles freight or cargo
2 : one who classifies organisms into large often variable taxonomic groups based on major characters
and splitter as a noun meaning:
1 : one that splits
2 : one who classifies organisms into numerous named groups based on relatively minor variations or characters
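The supplier example from the Wikipedia definition can be sketched in code. This is only an illustration of the two modeling styles, not anyone's real system: the class names, the `kind` tag, and the `catalog_id` and `hourly_rate` fields are all hypothetical.

```python
from dataclasses import dataclass


# Lumper's model: one broadly defined class covers every kind of supplier.
@dataclass
class Supplier:
    name: str
    kind: str = "vendor"  # hypothetical tag, e.g. "vendor" or "subcontractor"


# Splitter's model: a separate, narrowly defined class for each group.
@dataclass
class Vendor:
    name: str
    catalog_id: str  # hypothetical vendor-only field


@dataclass
class Subcontractor:
    name: str
    hourly_rate: float  # hypothetical subcontractor-only field


# The same two suppliers, modeled both ways.
lumped = [Supplier("Acme"), Supplier("Bob's Plumbing", kind="subcontractor")]
split = [
    Vendor("Acme", catalog_id="A-1"),
    Subcontractor("Bob's Plumbing", hourly_rate=80.0),
]
```

The lumped list holds one type and tells its members apart with a field, while the split list needs a new class for every group it wants to distinguish.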
I believe each group criticizes the other more or less equally for being ridiculous. In my short existence as a software developer, I have found that I tend to complicate issues, which makes me a splitter. I had a manager once who told me there are two types of software developers in the world: those who over-simplify and those who over-complicate. He proceeded to tell me that I fell into the over-complicate category. That was four years ago. Time and time again I have seen myself end up in the over-complicate camp, both with my software design solutions and with my approaches to issues in my personal life, including something as simple as cooking! About a year ago I made the connection between the over-complication/over-simplification concepts and lumping/splitting.
Lumping is to over-simplifying as splitting is to over-complicating, and there are cons in both camps. Over-lumping results in architectures that are too generic and unable to scale well or adapt easily to future requirements. Over-splitting breaks a whole into entirely too many discrete functional parts. While this may allow for ultimate flexibility, pluggability, and whateverability, there are two rather significant consequences:
- things become complicated and difficult to maintain
- things slow down with lots of different processes running (if you have split across assemblies, that is)
I now see the wisdom in Einstein’s view that
Everything should be made as simple as possible, but not simpler
or, reworded in lumping/splitting language:
Lumping should be used whenever possible, but not more than necessary
In situations where I have influence over architecture and system design, I now keep things lumped together until there is good reason to split them out. There is a well-understood concept in the software development world called separation of concerns: splitting pieces of a system out into separate layers so that the system can change over time with less pain. If you do any amount of reading on design patterns, you will have heard about the three-tiered approach with presentation, business logic, and data layers. Lumping and splitting decisions come into play here when you need to decide on the structure of certain objects within any of these layers. Splitting within a layer should always require a good reason.
So what are you: a lumper, a splitter, or a mix of the two? To some degree I believe we all have tendencies to do both, depending on our current situation: deadlines, mood (have you had your caffeine yet this morning?), team environment, available mentors, influences, etc.
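As a rough sketch of what the three layers can look like while the objects inside them stay lumped, here is a tiny example. Everything in it is made up for illustration: the `Supplier` class, the `balance_owed` field, the function names, and the in-memory data standing in for a real data store.

```python
from dataclasses import dataclass


@dataclass
class Supplier:  # one lumped object shared by all three layers
    name: str
    balance_owed: float


# Data layer: knows only how to fetch records (an in-memory stand-in here).
def load_suppliers() -> list[Supplier]:
    return [Supplier("Acme", 120.0), Supplier("Bob's Plumbing", 0.0)]


# Business logic layer: knows the rules, not the storage or the display.
def suppliers_to_pay(suppliers: list[Supplier]) -> list[Supplier]:
    return [s for s in suppliers if s.balance_owed > 0]


# Presentation layer: knows only how to format results for display.
def render(suppliers: list[Supplier]) -> str:
    return "\n".join(f"{s.name}: {s.balance_owed:.2f}" for s in suppliers)


report = render(suppliers_to_pay(load_suppliers()))
print(report)
```

The layers are split because each one changes for a different reason, but a single `Supplier` class serves all of them. A separate class would only earn its place once some group of suppliers needed rules or data that the others do not.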