Documenting Scala functional chains
Scala (and functional programming, in general), advocates a style of programming where you produce functional "chains" of the form
where the operations are various combinations of map, filter, etc.
Where the equivalent Java code might require 50 lines, the Scala code can be done in 1 or 2 lines. The functional chain can change an input collection to something completely different.
The disadvantage of the Scala code is that 10 minutes later (never mind 6 months later), I can't figure out what I was thinking, because the notation is so compact, and lacks type information (because of implied types).
How do you document this? Do you put a large block comment before the chain, changing an elegant 1 line solution into a bulky 40 line solution consisting of 39 lines of comment? Do you intersperse your comments like this?
collection. // Select the items that meet condition X filter(predicate_function). // Change these items from A's to B's map(transformation_function). // etc.
Something else? No documentation? (Leave them guessing. They'll never "downsize" you then, because no one else can maintain the code. :-))
I don't write that code to begin with (unless it's a script for one-time use or playing around in the REPL).
If I can explain what the code does in one comment and the reads okay, then I keep it as a one liner:
// Find all real-valued square roots and group them in integer bins ds.filter(_ >= 0).map(math.sqrt).groupBy(_.toInt).map(_._2)
If I can't understand this by reading carefully through the chain of commands, then I should break it up more into functionally distinct units. For example, if I expected someone to not realize that the square root of a negative number is not real-valued, I would say:
// Only non-negative numbers have a real-valued square root val nonneg = ds.filter(_ >= 0) // Find square roots and group them in integer bins nonneg.map(math.sqrt).groupBy(_.toInt).map(_._2)
In particular, if someone doesn't know the Scala collections library well, and doesn't have the patience to spend five to ten minutes understanding one line of code, then either they shouldn't be working on my code (nor on anything else that accomplishes something nontrivial that they don't understand and don't have the patience to understand), or I should know in advance that I'm providing an e.g. language and mathematics tutorial in addition to writing working code, either by writing a paragraph explaining how the following line works, or breaking it out command by command, or including comments at the start of each anonymous function explaining what is going on (as appropriate).
Anyway, if you can't understand what it does, you probably need some intermediate values. They are very helpful for mental-resetting ("I can't see how to get from A to C!...but...okay, I can understand A to B. And I can understand B to C.")
If you find yourself writing comments at that detail level, you're just repeating what the code says.
For long functional chains, define new functions to replace parts of the chain. Give these meaningful names. Then you might be able to avoid comments. The names of these functions themselves should explain what they do.
The best comments are the ones that explain why the code does something. Well-written code should make the "how" obvious from the code itself.