Fastest way to find out whether two ICollection<T> collections contain the same objects

What is the fastest way to find out whether two ICollection<T> collections contain precisely the same entries? The brute-force approach is clear; I was wondering whether there is a more elegant method.

We are using C# 2.0, so no extension methods if possible, please!

Edit: the answer would be interesting both for ordered and unordered collections, and would hopefully be different for each.

Answers


Use the C5 collection library:

http://www.itu.dk/research/c5/

ContainsAll

"Check if all items in a supplied collection is in this bag (counting multiplicities)."
Parameter items: "The items to look for."
Returns: "True if all items are found."

[Tested]

public virtual bool ContainsAll<U>(SCG.IEnumerable<U> items) where U : T
{
  HashBag<T> res = new HashBag<T>(itemequalityComparer);

  foreach (T item in items)
    if (res.ContainsCount(item) < ContainsCount(item))
      res.Add(item);
    else
      return false;

  return true;
}

First compare the Count of the two collections. If the counts match, do a brute-force comparison of the elements, position by position. The worst case is O(n). This covers the case where the order of the elements must be the same.

In the second case, where the order does not matter, you need a dictionary that stores how many times each element occurs in the collections. Here's a possible algorithm:

  • Compare the collections' Count: return false if they are different
  • Iterate the first collection
    • If the item doesn't exist in the dictionary, add an entry with Key = Item, Value = 1 (the count)
    • If the item exists, increment the count for the item in the dictionary
  • Iterate the second collection
    • If the item is not in the dictionary, return false
    • If the item is in the dictionary, decrement the count for the item
      • If count == 0, remove the item
  • return Dictionary.Count == 0;
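The steps above could be sketched roughly as follows. This is only an illustration that sticks to C# 2.0 constructs; the class and method names (UnorderedComparer, HaveSameItems) are made up for the example, and it assumes the items are non-null and have consistent Equals/GetHashCode implementations.

```csharp
using System.Collections.Generic;

// Hypothetical helper; the names are illustrative only.
public static class UnorderedComparer
{
    // True when both collections hold the same items with the same
    // multiplicities, regardless of order. Assumes items are non-null.
    public static bool HaveSameItems<T>(ICollection<T> first, ICollection<T> second)
    {
        if (first.Count != second.Count)
            return false;

        // Count the occurrences of each item in the first collection.
        Dictionary<T, int> counts = new Dictionary<T, int>();
        foreach (T item in first)
        {
            int n;
            counts.TryGetValue(item, out n); // n stays 0 when the key is absent
            counts[item] = n + 1;
        }

        // Cancel the counts out against the second collection.
        foreach (T item in second)
        {
            int n;
            if (!counts.TryGetValue(item, out n))
                return false;
            if (n == 1)
                counts.Remove(item);
            else
                counts[item] = n - 1;
        }

        return counts.Count == 0;
    }
}
```

With hashing that behaves well this runs in roughly O(n) time, at the cost of O(n) extra memory for the dictionary.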

For ordered collections, you can use the SequenceEqual() extension method defined by System.Linq.Enumerable (available from .NET 3.5 onward):

if (firstCollection.SequenceEqual(secondCollection))

You mean the same entries or the same entries in the same order?

Anyway, assuming you want to check whether they contain the same entries in the same order, "brute force" is really your only option in C# 2.0. I know what you mean by inelegant, but if the atomic comparison itself is O(1), the whole process is O(N), which is not that bad.


If the entries need to be in the same order (besides being the same), then I suggest, as an optimization, that you iterate both collections at the same time and compare the current entry in each collection. Otherwise, brute force is the way to go.
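As a rough sketch of iterating both collections at the same time, something like the following should work in C# 2.0. The names OrderedComparer and HaveSameSequence are invented for the example, and elements are compared with EqualityComparer<T>.Default, which is an assumption about what "the same" should mean here.

```csharp
using System.Collections.Generic;

// Hypothetical helper; the names are illustrative only.
public static class OrderedComparer
{
    // True when both collections yield equal elements in the same order.
    public static bool HaveSameSequence<T>(ICollection<T> first, ICollection<T> second)
    {
        // Cheap early exit: different sizes can never match.
        if (first.Count != second.Count)
            return false;

        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> a = first.GetEnumerator())
        using (IEnumerator<T> b = second.GetEnumerator())
        {
            // The counts are equal, so both enumerators finish together.
            while (a.MoveNext() && b.MoveNext())
            {
                if (!comparer.Equals(a.Current, b.Current))
                    return false;
            }
        }
        return true;
    }
}
```

The loop stops at the first mismatch, so the full O(n) pass only happens when the collections agree all the way through (or differ only near the end).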

Oh, and another suggestion: you could override Equals for the collection class and implement the equality logic in there (it depends on your project, though).
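One way the Equals override might look, assuming you control the collection class and want order-sensitive equality, is sketched below. ComparableCollection<T> is a hypothetical type built on System.Collections.ObjectModel.Collection<T>; GetHashCode is overridden as well so that it stays consistent with Equals.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical collection type whose equality is element-wise and order-sensitive.
public class ComparableCollection<T> : Collection<T>
{
    public override bool Equals(object obj)
    {
        ComparableCollection<T> other = obj as ComparableCollection<T>;
        if (other == null || other.Count != Count)
            return false;

        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        for (int i = 0; i < Count; i++)
        {
            if (!comparer.Equals(this[i], other[i]))
                return false;
        }
        return true;
    }

    // Combines the element hash codes in order, so equal collections hash equally.
    public override int GetHashCode()
    {
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        int hash = 17;
        foreach (T item in this)
            hash = unchecked(hash * 31 + comparer.GetHashCode(item));
        return hash;
    }
}
```

Keep in mind that the hash code changes whenever the contents change, so such a collection should not be modified while it is used as a dictionary key.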


Again, using the C5 library, having two sets, you could use:

// C5.HashSet<T> is one possible implementation; any C5.ICollection<T> works here.
C5.ICollection<T> set1 = new C5.HashSet<T>();
C5.ICollection<T> set2 = new C5.HashSet<T>();
if (set1.UnsequencedEquals (set2)) {
  // Do something
}

The C5 library includes a heuristic that actually tests the unsequenced hash codes of the two sets first (see C5.ICollection<T>.GetUnsequencedHashCode()) so that if the hash codes of the two sets are unequal, it doesn't need to iterate over every item to test for equality.

It is also worth noting that C5.ICollection<T> inherits from System.Collections.Generic.ICollection<T>, so you can use C5 implementations while still using the .NET interfaces (though you have access to less functionality through .NET's stingy interfaces).


Brute force takes O(n) - comparing all elements (assuming they are sorted), which I would think is the best you could do - unless there is some property of the data that makes it easier.

I guess for the unsorted case, it's O(n*n).

In which case, I would think a solution based around a merge sort would probably help.
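A sort-based variant could look roughly like the sketch below. It uses List<T>.Sort rather than a hand-written merge sort, and it assumes the items have a usable ordering through Comparer<T>.Default (for example, they implement IComparable<T>) and that items comparing as 0 count as the same. The names SortedComparer and HaveSameItems are made up for the example.

```csharp
using System.Collections.Generic;

// Hypothetical helper; the names are illustrative only.
public static class SortedComparer
{
    // Sorts copies of both collections, then compares them position by position.
    public static bool HaveSameItems<T>(ICollection<T> first, ICollection<T> second)
    {
        if (first.Count != second.Count)
            return false;

        Comparer<T> comparer = Comparer<T>.Default;

        // Work on copies so the callers' collections are left untouched.
        List<T> a = new List<T>(first);
        List<T> b = new List<T>(second);
        a.Sort(comparer);
        b.Sort(comparer);

        for (int i = 0; i < a.Count; i++)
        {
            if (comparer.Compare(a[i], b[i]) != 0)
                return false;
        }
        return true;
    }
}
```

This brings the unsorted case down from O(n*n) to about O(n log n), dominated by the two sorts.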

For example, could you re-model it so that there was only one collection? Or three collections: one for the items in A only, one for those in B only, and one for those in both, so that if the A-only and B-only collections are empty, then A and B are the same... I am probably going off on totally the wrong tangent here...

