I'm writing a simple IDictionary abstraction in C# that wraps a Dictionary<K, ICollection<V>>. Basically, it maps multiple values to one key. I can't decide whether to remove a key and its empty list when the last item in a values list is removed, or leave it (to avoid instantiating a new collection if the key is reused) and do checks on a key's values' Count when determining whether a key exists.
I would remove the collections so that your MultiMap has consistent behavior. If I used your MultiMap I would be very surprised (and unhappy) to find that a missing key behaves differently depending on whether a key was previously in the MultiMap or not.
Does Clear() remove the the Collections?
You may also create an unintended memory leak if you do not remove the collections. A developer may add many items, then remove them. Memory usage (after GC) should return to the same amount as before those items were added.
I would not worry about the cost of creating Collections. I would worry about the contract you create for your MultiMap. If after profiling your application you find that to be a concerned, you could modify or create a special MultiMap for that behavior. Don't fall into the trap of premature optimization.
In .NET 3.5, there is ILookup<TKey,TValue> and Lookup<TKey,TValue> that act as a multi-map. The in-built implementation (Lookup<TKey,TValue>) is immutable, but I have written an EditableLookup<TKey,TValue> in miscutil.
In that version; yes - I remove the key if the last item (with that key) is removed. This makes it easier to look at which keys exist (i.e. .Keys etc).
Why not treat the key as present even if all the values are removed, and provide explicit API for removing the key?
It depends on your usage pattern. If you're going to be adding and removing a lot of items, then those empty collections will be using up memory. My guess would be that you won't save that much time by keeping the collections around. As always, if it's that important to your performance, you should measure instead of guessing which way is better.
If you really think that creating those collections is expensive, then instead of creating new ones all the time, put the unused ones into a list and reuse them when new keys get added to your hashmap. I think this might be the flyweight pattern. You should probably keep the unused collection list less than half the size of the main hashmap (again, measure to see how the ratio affects performance).