HashSet<T> was introduced in .NET Framework 3.5 as part of the System.Collections.Generic
namespace. HashSet is an unordered collection containing unique
elements and provides a set of standard set operators such as
intersection and union (plus many more). It has the standard collection
operations Add (although this method returns a Boolean indicating whether or not that element already existed in the collection), Remove, and Contains,
but because it uses a hash-based implementation for object identity,
these operations are immediately accessible without looping the entire
list as occurs with the List<T> collection for example (O(1) rather than O(n)).
Although the operators on HashSet would appear to overlap with the LINQ set operators, Intersect and Union, these HashSet implementations modify the set they were called on, rather than return a new IEnumerable<T> collection, as is the behavior of the LINQ operators. For this reason, the names on the HashSet operators are differentiated as IntersectWith and UnionWith, and the LINQ operators are also available with a HashSet collection as Intersect and Union. This naming distinction avoids naming clashes and also allows the desired behavior to be chosen in a specific case.
HashSet implements the IEnumerable<T> pattern and therefore can be used in a LINQ query. Listing 1 demonstrates a LINQ query to find even numbers in a HashSet made by unioning two collections. The Console output from this example is shown in Output 1.
Listing 1. LINQ query over a HashSet collection—see Output 6-6
int[] first = new int[] { 1, 2, 3 };
int[] second = new int[] { 3, 4, 5 };
// modify the current set by unioning with second.
HashSet<int> set = new HashSet<int>(first);
set.UnionWith(second);
// return only the even values from the set.
// the values in the HashSet are IEnumerable<T>.
var q = from i in set
where i % 2 == 0
select i;
foreach (var item in q)
Console.WriteLine(item);
|
Output 1.
The differences between the HashSet
and LINQ operator support are listed here (as documented in the Visual
Studio documentation), although LINQ-equivalent approximations are easy
to construct as documented in Table 1 and implemented in Listing 2.
Table 1. Comparison of the LINQ Set Operators and the HashTable Type Methods
HashSet<T> operator | LINQ equivalent operator | Description |
---|
UnionWith | Union | All distinct elements in both collections. |
IntersectWith | Intersect | Only elements that are in both collections. |
ExceptWith | Except | Elements from the first collection, except those listed in the second collection. |
Unnecessary because only unique elements can be added to a HashSet<T> | Distinct | Removes duplicate elements from a collection. HashSet doesn’t allow duplicates to be added to the collection, making it unnecessary for it to have a Distinct operator equivalent. |
SymmetricExcept-With | Not provided. To approximate the same result with LINQ, combine two reciprocal Except operators in the following fashion:
a.Except(b).Concat(b.Except(a));
| Returns elements that appear in either of the collections, but not both. This is an exclusive-or operation on two collections. |
Overlaps | Not
provided. To approximate the same result with LINQ, determine the
intersect with the second collection and return true if there are any
results in the following fashion:
b.Intersect(a).Distinct().Any();
| Returns the Boolean value true if any element in the given collection appears in the second collection. |
IsSubsetOf | Not
provided. To approximate the same result using LINQ, return the Boolean
of true if the intersect count with the second collection equals the
count of the distinct elements in the collection in the following
fashion:
var aCount =
a.Distinct().Count();
var intersectCount =
a.Intersect(b)
.Distinct()
.Count();
return
intersectCount == aCount;
| Returns
the Boolean value of true if all of the elements in the given
collection are present in the second collection. Will return the Boolean
value of true if the collections share the same elements. |
IsProperSubsetOf | Not
provided. To approximate the same result with LINQ, return the Boolean
of true if the intersect count with the second collection is less than
the count of the distinct elements in the second collection, and the
intersect count equals the count of the distinct elements in the
collection in the following fashion:
var aCount =
a.Distinct().Count();
var bCount =
b.Distinct().Count();
var intersectCount =
a.Intersect(b)
.Distinct()
.Count();
return
(intersectCount < bCount) &&
(intersectCount == aCount);
| Returns
the Boolean value of true if all of the elements in the given
collection are present in the second collection. Will return the Boolean
value of false if the collections share exactly the same elements; the
collection must be an actual subset with at least one element less than
that in the second collection. |
IsSupersetOf | Not
provided. To approximate the same result using LINQ, return the Boolean
of true if the intersect count equals the count of the distinct
elements in the second collection in the following fashion:
var bCount =
b.Distinct().Count();
var intersectCount =
b.Intersect(a)
.Distinct()
.Count();
return
intersectCount == bCount;
| Returns
the Boolean value of true if all of the elements in the second
collection are present in the given collection. Will return the Boolean
value of true if the collections share the same elements. |
IsProperSupersetOf | Not
provided. To approximate the same result with LINQ, return the Boolean
of true if the intersect count is less than the count of the distinct
elements in the collection and the intersect count equals the count of
the distinct elements in the second collection in the following fashion:
var aCount =
a.Distinct().Count();
var bCount =
b.Distinct().Count();
var intersectCount =
b.Intersect(a)
.Distinct()
.Count();
return
(intersectCount < aCount) &&
(intersectCount == bCount);
| Returns
the Boolean value of true if all of the elements in the second
collection are present in the given collection. Will return the Boolean
value of false if the collections share exactly the same elements. The
collection must be an actual superset with at least one element more
than that in the second collection. |
SetEquals | Not provided. To approximate the same result with LINQ, compare the distinct, ordered collections in the following fashion:
a.Distinct().OrderBy(x => x)
.SequenceEqual(
b.Distinct().OrderBy(y => y));
| Returns the Boolean value of true if both collections share the same distinct element values. |
Listing 2. Approximate LINQ implementations of the operators in the HashSet type
public static IEnumerable<T> SymmetricExcept<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
return first.Except(second)
.Concat(
second.Except(first));
}
public static bool Overlaps<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
return second.Intersect(first).Distinct().Any();
}
public static bool IsSupersetOf<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var secondCount = second.Distinct().Count();
var intersectCount = second
.Intersect(first)
.Distinct()
.Count();
return intersectCount == secondCount;
}
public static bool IsProperSupersetOf<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var firstCount = first.Distinct().Count();
var secondCount = second.Distinct().Count();
var intersectCount =
second
.Intersect(first)
.Distinct()
.Count();
return (intersectCount < firstCount) &&
(intersectCount == secondCount);
}
public static bool IsSubsetOf<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
// call the Superset operator and reverse the arguments
return IsSupersetOf(second, first);
}
public static bool IsProperSubsetOf<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
// call the Superset operator and reverse the arguments
return IsProperSupersetOf(second, first);
}
public static bool SetEquals<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
return first.Distinct().OrderBy(x => x)
.SequenceEqual(
second.Distinct().OrderBy(y => y));
}
|