Understanding List.fold_left and implementing insertion sort using it

I'm trying to learn OCaml, and I'm still struggling a bit with the fold functions. I've done a bit of research and found the following code snippet (written in Scala) that implements insertion sort using foldLeft. I think I get the idea, but I know absolutely nothing about Scala, so I still need a bit of clarification. So I guess my question is 2-fold (ha, get it?): how would this function be written in OCaml, and how does it actually work?

def insertionSort[A <% Ordered[A]](list: List[A]): List[A] =
  list.foldLeft(List[A]()) { (r, c) =>
    val (front, back) = r.span(_ < c)
    front ::: c :: back
  }

Thanks for the help!

As I commented above, this isn't a particularly nice way to sort (assuming I understand the Scala code, which I might not). Here is an OCaml translation of what it seems to be doing:

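let insertion_sort l =
  (* add1 inserts x into the already-sorted list sortedl *)
  let add1 sortedl x =
    (* le: elements <= x; gt: elements > x *)
    let le, gt = List.partition ((>=) x) sortedl
    in
    le @ [x] @ gt
  in
  List.fold_left add1 [] l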
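
To see how the fold builds up its answer, it can help to record the accumulator after each element. This is only a rough sketch for illustration: trace_sort is a hypothetical helper, and add1 is redefined here so the block stands on its own.

```ocaml
(* Insert x into an already-sorted list, keeping it sorted. *)
let add1 sortedl x =
  (* le: elements <= x; gt: elements > x *)
  let le, gt = List.partition ((>=) x) sortedl in
  le @ [x] @ gt

(* Hypothetical helper: returns the accumulator after each step,
   oldest first, so every stage of the fold is visible. *)
let trace_sort l =
  let step (sortedl, history) x =
    let sortedl' = add1 sortedl x in
    (sortedl', sortedl' :: history)
  in
  let _, history = List.fold_left step ([], []) l in
  List.rev history

(* Print one intermediate list per line. *)
let () =
  trace_sort [3; 1; 2]
  |> List.iter (fun s ->
         print_endline (String.concat " " (List.map string_of_int s)))
```

For the input [3; 1; 2] the accumulator goes from [3] to [1; 3] to [1; 2; 3]: each step takes the sorted list built so far and puts the next element in its place.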

After you write FP code for a while, folds become quite natural. The purpose is to process a series of elements from a data structure (e.g., a list) in a known order (e.g., leftmost first), while maintaining some state that gets passed along as you do your processing. In the simplest cases (as here), the state is also your final answer. In other cases, you need to carry some extra state along with the final answer. In those cases you extract the final answer at the end.
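
As a small illustration of the "extra state" case, here is a sketch of an average computed with a fold. The function name mean and the choice to return an option for the empty list are my own assumptions for the example.

```ocaml
(* The state is a pair (running sum, count). Neither component is
   the final answer, so the answer is extracted after the fold. *)
let mean xs =
  let sum, n =
    List.fold_left (fun (s, n) x -> (s +. x, n + 1)) (0.0, 0) xs
  in
  if n = 0 then None else Some (sum /. float_of_int n)
```

Here mean [1.0; 2.0; 3.0] evaluates to Some 2.0, and the empty list gives None.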

Maybe I shouldn't say this, but a fold is like the for(;;) statement in the C family, except that it encapsulates all the state that you plan to modify in your loop. In this way, it's a nicely controlled form of iteration.

The correspondence looks something like this:

for(state = S, x = first(D); more(D); x = next(D)) { state = E(state, x) }

fold_left (fun state x -> E state x) S D
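
A concrete instance of that correspondence might look like the following sketch, which sums an int array both ways. The names sum_loop and sum_fold are made up for the example.

```ocaml
(* Loop version: the state lives in a mutable reference that the
   loop body updates on every iteration. *)
let sum_loop (a : int array) =
  let state = ref 0 in
  for i = 0 to Array.length a - 1 do
    state := !state + a.(i)
  done;
  !state

(* Fold version: S is 0, E state x is state + x, D is the array.
   The state is threaded through the calls instead of being mutated. *)
let sum_fold (a : int array) =
  Array.fold_left (fun state x -> state + x) 0 a
```

Both functions return the same result for any input; the fold version just has nowhere to hide extra mutable state.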

The discipline of encapsulating the modified state usually turns out to be incredibly helpful after you get used to it. Very often the function that you fold (like add1 above) turns out to be useful in many other circumstances, because the structure of the fold requires a generally useful function in that spot.

Note that, in particular, the fold function contains all the knowledge of how to traverse the data structure. The folded function only needs to know about what to do with each element. So you are free to change the data structure (and the fold) without changing any of the other code. And you can also use the same folded function (add1 above) with different data structures.
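
As a sketch of that last point, the same add1 can be folded over a list or over an array without any change. add1 is redefined here so the block is self-contained; sort_list and sort_array are names I made up for the example, and note that sort_array still returns a list.

```ocaml
(* The folded function: knows how to insert one element, and
   nothing about the structure being traversed. *)
let add1 sortedl x =
  let le, gt = List.partition ((>=) x) sortedl in
  le @ [x] @ gt

(* The fold supplies the traversal; only the fold changes. *)
let sort_list l = List.fold_left add1 [] l
let sort_array a = Array.fold_left add1 [] a
```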