Saturday, October 17, 2009

Learning F# - Any good books out there?

I've become quite the F# fan-boy. There are a number of reasons why F# has bubbled to the top of my language list ahead of Erlang and Clojure. Many of these reasons derive from the thundering inevitability of .NET in the day job. Said day job involves things like image processing, a traditional pig-pile for the fp crowd, but I'm more interested in things like robotics and process control, stuff we're currently building using C++/C# .NET.

F# seems like the most likely OCaml to sneak its nose under the edge of that tent, as it were. Hence, I've spent more than a few dinero on F# books and many hours trawling F# websites and blogs.

From my previous tinkering with Erlang, back in the old days when book-learning Erlang meant buying the orange Concurrent Programming in Erlang book from shady characters in the back of usenix BOFs, I was reasonably comfortable with pattern-matching for control flow and decomposing data. Those brain-cells were easily recycled to F#.

But other F# paraphernalia, like asynchronous workflows (monads? huh?), weren't sinking in.

Enter F# for Technical Computing (hereafter FS4TC). I stumbled across the announcement for Dr. Harrop's plan to publish a new F# book while prowling around his Flying Frog website. The announcement promised a book covering all the topics I wanted to see in an F# book. The free "intro to F#" article available on the web site was well done. So, I bit the bullet, prepared to duck the brickbats headed my way when my wife saw the credit card bill, threw down the $200, and pre-ordered the book.

Yes. $200. $202.64 at today's exchange rate, to be exact.

"It's for professional development, dear. Really. It will make me a more valuable employee in uncertain economic times. Hey! It's not my fault the dollar is so weak against the British pound. No, I'm not trying to change the subject."

The book was published at the end of September and my copy arrived October 1st.

Having spent the last two weeks hip-deep in its pages, I'll posit: even at $200, the book is the best deal going for learning F#, especially if you have some previous exposure to a functional language.

The book is self-published by Dr. Harrop. As such, it's not going to be something you'll want to buy just so the spine impresses the in-laws when they see it on your book shelf. It's a simple spiral-bound A4 sized volume which, at first sight, appears a bit flimsy (and proved to be, after two weeks of hard use and vigorous transport, um, "less than robust"). But the spiral binding is very desk friendly - stays open while working through examples; folds back nicely if necessary. Within the text, code is colored blue. This is surprisingly helpful.

OK, enough about the curtains. Let's talk about the furniture.

Firstly, don't be misled (and potentially put off) by the title. While Dr. Harrop's focus is clearly scientific and numerical computing, the book is an excellent general introduction to F# and its applications. It includes sections on text processing and regular expressions, graphics and GUI programming with WPF, concurrent programming and parallel programming.

The first three chapters are an introduction to the language and its fundamental semantics.

Chapter 2 covers functional programming. Anyone with even a passing familiarity with some functional language will find the exposition blessedly devoid of pedantry. Even if you have no previous exposure to a functional language, the concise descriptions of the functional programming tools are exactly what you need to get started writing programs. (That's the goal, right?)

Chapter 3 is a whirlwind tour of F# as an object-oriented language, focusing on the O-O aspects of the language as interfaces to .NET. Again, I found the concision and precision a blessing, allowing easy mental mapping of familiar C# concepts to F# clothing. If you're coming to F# from a purely imperative background and need more familiar O-O surroundings, there are extensive treatments of F#'s object model available elsewhere. For the functional programmer who needs to know enough to be dangerous with .NET libraries, the chapter is more than adequate.

The next two chapters focus on data structures. This is where the functional programming rubber meets the computational road. When first introduced to functional programming most people go through a phase I call: "Dude, where's my for-loop?" Understanding how functional data structures work and making the transition from for-loops to things like sequences and maps is the first real hurdle I encountered in learning functional programming (more about the second - performance - later). These chapters are what really sold me on the book, especially the sections on trees. The explanations, again, are tight and precise. The examples are excellent. Several of the examples are non-trivial (to put it mildly). I spent a couple of long afternoons working through the code in the "tree-based force computation" example, but loved every minute of it. If you buy the book, I encourage you to do the same, all the while thinking about how you'd solve the same problem in C{++,#}. It's a revelation.
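For anyone still in the "Dude, where's my for-loop?" phase, here is a tiny sketch of the transition I mean. It is my own toy, not an example from the book, and the function names are made up for illustration: the same sum-of-squares computed first with a mutable accumulator in a loop, then as a pipeline over a sequence.

```fsharp
// Imperative version: a mutable accumulator updated inside a for-loop.
let sumOfSquaresLoop (xs: int[]) =
    let mutable total = 0
    for x in xs do
        total <- total + x * x
    total

// Functional version: map each element to its square, then sum the sequence.
let sumOfSquaresSeq (xs: seq<int>) =
    xs |> Seq.map (fun x -> x * x) |> Seq.sum

let data = [| 1; 2; 3; 4 |]
printfn "%d %d" (sumOfSquaresLoop data) (sumOfSquaresSeq data)  // prints "30 30"
```

The second version has no loop counter and no mutable state to get wrong, which is most of the point.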

There are sections on numerics, on XML processing, on crawling the web, on compression, on interfaces to external libraries. All good stuff. But two topics deserve special note:

Chapter 10, on concurrent programming, describes the use of asynchronous workflows. I've been trying to grok the computational workflow - F# monad thing for quite some time, with limited success. FS4TC's treatment is very practical and effective. Again, I attribute this to the simple, straightforward explanations that focus on usage, along with the excellent examples. The chapter is only ten pages long, and uses examples similar to the ones found in Expert F#. Yet after reading the FS4TC chapter I felt I understood how to *use* the tool and was ready to write some code using asynchronous workflows. (That's the goal, right?)
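To give a flavor of what "using the tool" looks like, here is a minimal sketch of an asynchronous workflow. Again, this is my own toy and not taken from FS4TC or Expert F#; slowSquare is a hypothetical stand-in for real I/O such as a web request.

```fsharp
// Hypothetical stand-in for slow I/O: wait without blocking a thread, then return a value.
let slowSquare (n: int) : Async<int> =
    async {
        do! Async.Sleep 100   // non-blocking pause, in milliseconds
        return n * n
    }

// Build three workflows, compose them to run in parallel, and wait for all results.
let results =
    [ 1; 2; 3 ]
    |> List.map slowSquare
    |> Async.Parallel          // combines them into a single Async<int[]>
    |> Async.RunSynchronously  // blocks the caller until every workflow finishes

printfn "%A" results  // prints [|1; 4; 9|]
```

Nothing runs when the async { ... } block is defined; the work starts only when the composed workflow is handed to Async.RunSynchronously. Async.Parallel returns the results in the same order as the inputs.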

Chapter 12 is devoted to performance. As alluded to earlier, there comes a time when your functional religion gets severely tested by issues of performance. You may end up writing the world's most beautiful functional code, which radiates elegance and simplicity from every inch of your monitor, but it just doesn't run fast enough. Rather than tucking your recursive tail between your legs and swallowing the imperative blue pill, there are good ways to make functional code run faster.

FS4TC's approach to measuring and improving performance provides a very nice cookbook for tackling these issues. It covers all the usual topics like profiling to find the performance bottlenecks. There is also a short section on 'algorithmic optimizations' - i.e. instead of making the hot code less hot, find algorithms that use the expensive code less often. This certainly makes sense. The author notes that these are the "most important set of optimizations". Yet, unlike almost everything else in the book, there are no real examples given. A bit of a disappointment. At this point in the book I guess I was simply expecting great things. I'm sure Dr. Harrop won't lose too much sleep over my expectations, but he redeemed himself in the following sections on low level optimizations. The sections on benchmarking data structures and 'deforestation' are very well done, replete with the usual concise writing and good examples. The section exploring the limitations imposed on F# by a runtime designed mainly to support an imperative language makes for interesting reading. (I'm still trying to untangle the bits about type specialization.)

The last sections of the chapter enumerate some of the hardcore optimizations that might be needed in extreme circumstances, including inlining in the F# code (a cool trick I didn't know about) and a short section on using imperative data structures for large collections.
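For the curious, the language feature behind the inlining trick is the inline keyword. The snippet below is only my sketch of that feature, not the book's example, and I make no claims about how much speed it buys in any given program. The compiler expands an inline function at each call site, so the one definition is specialized separately for each argument type.

```fsharp
// 'inline' asks the compiler to expand the body at every call site,
// so this one definition works for any type that supports (*).
let inline square x = x * x

let a = square 3      // expanded for int
let b = square 1.5    // expanded for float

printfn "%d %g" a b   // prints "9 2.25"
```

Without inline, the compiler would pin square to a single type (int, by default) and the float call would not compile.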

I'm sure you've noticed I've used the word 'concise' quite a lot in the above paragraphs. Ditto 'good examples'. They certainly define the book for me and are what make it so valuable and - despite the sticker-shock of the cover price - a good value.

Armed with two weeks of FS4TC bootcamp and a small measure of hubris, I'm off to attempt to glue the Microsoft Concurrency and Coordination Runtime into F# somehow to see if I can start moving a SCARA robot through its picks and places. I'll let you know how it goes.

What's going on here?

I'm spending a good bit of time lately 'book-learning' about functional programming. I regularly harangue long-suffering friends and co-workers to check out this or that "cool book about {F#, Erlang, Clojure, Haskell}".

Recently, someone asked: "Why not blog about it?"

"Why not?" indeed.

There's a whole gaggle of recently or soon-to-be published F# books in my sights. I'll probably start there ...