c# - Find values that appear in all lists (or arrays or collections) -


given following:

list<list<int>> lists = new list<list<int>>();  lists.add(new list<int>() { 1,2,3,4,5,6,7 }); lists.add(new list<int>() { 1,2 }); lists.add(new list<int>() { 1,2,3,4 }); lists.add(new list<int>() { 1,2,5,6,7 }); 

what best/fastest way of identifying numbers appear in lists?

to 2 lists 1 use x.intersect(y).

to several want like:

var intersection = lists.aggregate((x, y) => x.intersect(y)); 

but won't work because result of lambda isn't list<int> , can't fed in. might tempt try:

var intersection = lists.aggregate((x, y) => x.intersect(y).tolist()); 

but makes n-1 needless calls tolist() relatively expensive. can around with:

var intersection = lists.aggregate(   (ienumerable<int> x, ienumerable<int> y) => x.intersect(y)); 

which applies same logic, in using explicit types in lambda, can feed result of intersect() in without wasting time , memory creating list each time, , gives faster results.

if came lot can further (slight) performance improvements rolling our own rather using linq:

public static ienumerable<t> intersectall<t>(this ienumerable<ienumerable<t>> source) {   using(var en = source.getenumerator())   {     if(!en.movenext()) return enumerable.empty<t>();     var set = new hashset<t>(en.current);     while(en.movenext())     {       var newset = new hashset<t>();       foreach(t item in en.current)         if(set.remove(item))           newset.add(item);       set = newset;     }     return set;   } } 

this assumes internal use only. if called assembly should have error checks, , perhaps should defined perform intersect operations on first movenext() of calling code:

public static ienumerable<t> intersectall<t>(this ienumerable<ienumerable<t>> source) {   if(source == null)     throw new argumentnullexception("source");   return intersectalliterator(source); } public static ienumerable<t> intersectalliterator<t>(ienumerable<ienumerable<t>> source) {   using(var en = source.getenumerator())   {     if(en.movenext())     {       var set = new hashset<t>(en.current);       while(en.movenext())       {         var newset = new hashset<t>();         foreach(t item in en.current)           if(set.remove(item))             newset.add(item);         set = newset;       }       foreach(t item in set)         yield return item;     }   } } 

(in these final 2 versions there's opportunity short-circuit if end emptying set, pays off if happens relatively often, otherwise it's nett loss).

conversely, if these aren't concerns, , if know we're ever going want lists, can optimise bit further use of count , indices:

public static ienumerable<t> intersectall<t>(this list<list<t>> source) {   if (source.count == 0) return enumerable.empty<t>();   if (source.count == 1) return source[0];   var set = new hashset<t>(source[0]);   for(int = 1; != source.count; ++i)   {     var newset = new hashset<t>();     var list = source[i];     for(int j = 0; j != list.count; ++j)     {       t item = list[j];       if(set.remove(item))         newset.add(item);     }     set = newset;   }   return set; } 

and further if know we're going want results in list, , know either won't mutate list, or won't matter if input list got mutated, can optimise case of there being 0 or 1 lists (but costs more if might ever not need output in list):

public static list<t> intersectall<t>(this list<list<t>> source) {   if (source.count == 0) return new list<t>(0);   if (source.count == 1) return source[0];   var set = new hashset<t>(source[0]);   for(int = 1; != source.count; ++i)   {     var newset = new hashset<t>();     var list = source[i];     for(int j = 0; j != list.count; ++j)     {       t item = list[j];       if(set.remove(item))         newset.add(item);     }     set = newset;   }   return new list<t>(set); } 

again though, making method less widely-applicable, has risks in terms of how used, appropriate internal code can know either won't change either input or output after fact, or won't matter.


Comments

Popular posts from this blog

java - Date formats difference between yyyy-MM-dd'T'HH:mm:ss and yyyy-MM-dd'T'HH:mm:ssXXX -

c# - Get rid of xmlns attribute when adding node to existing xml -