Lazy Cartesian product of several Seqs in Scala

I implemented a simple method to generate Cartesian product on several Seqs like this:

object RichSeq {
  implicit def toRichSeq[T](s: Seq[T]) = new RichSeq[T](s)
}

class RichSeq[T](s: Seq[T]) {

  import RichSeq._

  def cartesian(ss: Seq[Seq[T]]): Seq[Seq[T]] = {

    ss.toList match {
      case Nil        => Seq(s)
      case s2 :: Nil  => {
        for (e <- s) yield s2.map(e2 => Seq(e, e2))
      }.flatten
      case s2 :: tail => {
        for (e <- s) yield s2.cartesian(tail).map(seq => e +: seq)
      }.flatten
    }
  }
}

Obviously, this one is really slow, as it calculates the whole product at once. Did anyone implement a lazy solution for this problem in Scala?

UPD

OK, So I implemented a reeeeally stupid, but working version of an iterator over a Cartesian product. Posting here for future enthusiasts:

object RichSeq {
  implicit def toRichSeq[T](s: Seq[T]) = new RichSeq(s) 
}

class RichSeq[T](s: Seq[T]) {

  def lazyCartesian(ss: Seq[Seq[T]]): Iterator[Seq[T]] = new Iterator[Seq[T]] {

    private[this] val seqs = s +: ss

    private[this] var indexes = Array.fill(seqs.length)(0)

    private[this] val counts = Vector(seqs.map(_.length - 1): _*)

    private[this] var current = 0

    def next(): Seq[T] = {
      val buffer = ArrayBuffer.empty[T]
      if (current != 0) {
        throw new NoSuchElementException("no more elements to traverse")
      }
      val newIndexes = ArrayBuffer.empty[Int]
      var inside = 0
      for ((index, i) <- indexes.zipWithIndex) {
        buffer.append(seqs(i)(index))
        newIndexes.append(index)
        if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
          inside = inside + 1
        }
      }
      current = inside
      if (current < seqs.length) {
        for (i <- (0 to current).reverse) {
          if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
            newIndexes(i) = 0
          } else if (newIndexes(i) < counts(i)) {
            newIndexes(i) = newIndexes(i) + 1
          }
        }
        current = 0
        indexes = newIndexes.toArray
      }
      buffer.result()
    }

    def hasNext: Boolean = current != seqs.length
  }
}

Answers


Here's my solution to the given problem. Note that the laziness is simply caused by using .view on the "root collection" of the used for comprehension.

scala> def combine[A](xs: Traversable[Traversable[A]]): Seq[Seq[A]] =
     |  xs.foldLeft(Seq(Seq.empty[A])){
     |    (x, y) => for (a <- x.view; b <- y) yield a :+ b }
combine: [A](xs: Traversable[Traversable[A]])Seq[Seq[A]]
scala> combine(Set(Set("a","b","c"), Set("1","2"), Set("S","T"))) foreach (println(_))
List(a, 1, S)
List(a, 1, T)
List(a, 2, S)
List(a, 2, T)
List(b, 1, S)
List(b, 1, T)
List(b, 2, S)
List(b, 2, T)
List(c, 1, S)
List(c, 1, T)
List(c, 2, S)
List(c, 2, T)

To obtain this, I started from the function combine defined in http://stackoverflow.com/a/4515071/53974, passing it the function (a, b) => (a, b). However, that didn't quite work directly, since that code expects a function of type (A, A) => A. So I just adapted the code a bit.


Need Your Help

C++ syntax to call templated inner class static member function?

c++ osx templates gcc

I have some template code which compiles fine in VC9 (Microsoft Visual C++ 2008) but won't compile in GCC 4.2 (on Mac). I'm wondering if there's some syntactical magic that I'm missing.

How to use Windows network paths with Git Bash

windows path git-bash

Paths to network resources are denoted in Windows with the \\servername\share\path\to\folder syntax. How does one use such a folder within Git Bash, which uses Unix-style paths?

About UNIX Resources Network

Original, collect and organize Developers related documents, information and materials, contains jQuery, Html, CSS, MySQL, .NET, ASP.NET, SQL, objective-c, iPhone, Ruby on Rails, C, SQL Server, Ruby, Arrays, Regex, ASP.NET MVC, WPF, XML, Ajax, DataBase, and so on.