Sunday, July 13, 2008

Functional Programming Has Warped Me

Look at this method that I recently wrote in a class for describing my database. The responsibility of this method is to commit the current session (which is encapsulated from the user) after x number of database events (adds, updates, deletes, etc). I need his functionality because I was bulk inserting objects that I had read from an XML feed. Here's what I first came up with:

commitEvery: aNumber during: aBlock
| count |
count := 1.
[self session beginUnitOfWork.
aBlock value:
[count := count + 1.
count > aNumber
ifTrue:
[self session commitUnitOfWork.
self session beginUnitOfWork.
count := 1]]].
self session commitUnitOfWork

The method takes a one argument block that will be called with a zero argument block that when called will increase the count and decide whether to commit and start a new transaction if it has reached the number given. At the end, I commit whatever active transaction there is. The thing I like about this is that I put the logic of counting and when to commit in one place. Here's how it would be used:

populate
| db |
db := self new.
[db
commitEvery: 1000
during:
[:afterAdd |
ITunesSAXHandler readUsing:
(ITunesTracksHandler onEachTrackDo:
[:each |
db addTrack: each.
afterAdd value])]]
ensure: [db disconnect]

Notice how "afterAdd" is used. It is passed into the block that will handle all of the transaction adds. It is called after an item has been added. I like this code because it doesn't worry about anything, but adding things to the database. It also makes it obvious how often we will commit. But, I'm unhappy with commitEvery:during:. The reason is because it just seems like I could make it more generic and at the same time easier to test without resorting to mocks. I abstracted out the counting and put it as a method on BlockContext (this is Squeak):

duringDoEvery: aNumber onBegin: beginBlock onEnd: endBlock
| count result |
count := 1.
beginBlock value.
result := self value:
[ count := count + 1.
count > aNumber ifTrue:
[ endBlock value.
beginBlock value.
count := 1 ] ].
endBlock value.
^result

This turns our first method into this:

commitEvery: aNumber during: aBlock
^aBlock
duringDoEvery: 1000
onBegin: [ self session beginUnitOfWork ]
onEnd: [ self session commitUnitOfWork ]

It's a mini-DSL! Let's write a simple test. Note, this is not an SUnit test, it simply writes to the transcript. But, I wanted to show it without all of the assertions:

[:afterAdd |
| count |
count := 0.
10 timesRepeat:
[Transcript show: (count := count + 1) printString; cr.
afterAdd value]]
duringDoEvery: 5
onBegin: [Transcript show: 'begin'; cr]
onEnd: [Transcript show: 'end'; cr]

Gives this output:

begin
1
2
3
4
5
end
begin
6
7
8
9
10
end
begin
end

Everything looks wonderful until we hit the last two lines. It seems pointless and unnecessary to do a begin and end with nothing in between. Our code will need to get a little bit more complex. Here's my next try:

duringDoEvery: aNumber onBegin: beginBlock onEnd: endBlock
| count result hasBegun |
count := 1.
hasBegun := false.
result := self value:
[ :eachBlock |
count := count + 1.
hasBegun ifFalse:
[ beginBlock value.
hasBegun := true ].
eachBlock value.
count > aNumber ifTrue:
[ endBlock value.
hasBegun := false.
count := 1 ] ].
hasBegun ifTrue: [ endBlock value ].
^ result

It's more complicated. And now, I'm passing another block around to get its value. The reason was because to properly remove empty begin/end combinations, I needed to know when the before was and when the after was. This means passing a block to the afterAdd from before. So, now the test code looks like this:

[:onEachAdd |
| count |
count := 0.
10 timesRepeat:
[onEachAdd value:
[Transcript show: (count := count + 1) printString; cr]]]
duringDoEvery: 5
onBegin: [Transcript show: 'begin'; cr]
onEnd: [Transcript show: 'end'; cr]

I renamed afterAdd to onEachAdd which is a better name and indicates its role better. Now, the reason I wrote this post is that I sat back and went, "Wow. I have something easily testable, but might hurt some brains." My next thought was, "Why would this hurt someone's brain?" The answer is all of the indirection of the blocks. I will be quite honest at this point, I would normally start turning all of these blocks into objects and have better names for what they were doing. Blocks worked perfectly in this context and if you think about it. This code is not much different than what Collection>>streamContents: does.

This is one of the things that learning functional programming has done to me. The above code looks and feels natural. Passing a block into a method that will accept another block doesn't bother me at all. I love the DSL-like nature of the implementation too, but I can see where it might confuse (Hell, inject:into: still does). This is where objects come in because now we can create our small objects with more meaningful names than just value or value:, this will at least make the indirection less painful. Thought this was fun exercise to show off something else that you can do with higher order functions. Squeak on.

2 comments:

tenuki said...

The post reminded me the message #do:separatedBy:. The problem looks analog to its behavior, but I my smalltalk is a bit rusty right now..

Blaine Buxton said...

It is similar. But, do:separatedBy: iterates through the collection and then calls the separatedBy after each one except the last. Here I wanted to call the separatedBy: after x members in the collection and make sure it called in the end as well.