ENTERPRISE

LINQ to Objects : Working with Set Data - The HashSet Class

11/27/2012 6:58:20 PM

HashSet<T> was introduced in .NET Framework 3.5 as part of the System.Collections.Generic namespace. HashSet is an unordered collection containing unique elements and provides a set of standard set operators such as intersection and union (plus many more). It has the standard collection operations Add (although this method returns a Boolean indicating whether or not that element already existed in the collection), Remove, and Contains, but because it uses a hash-based implementation for object identity, these operations are immediately accessible without looping the entire list as occurs with the List<T> collection for example (O(1) rather than O(n)).

Although the operators on HashSet would appear to overlap with the LINQ set operators, Intersect and Union, these HashSet implementations modify the set they were called on, rather than return a new IEnumerable<T> collection, as is the behavior of the LINQ operators. For this reason, the names on the HashSet operators are differentiated as IntersectWith and UnionWith, and the LINQ operators are also available with a HashSet collection as Intersect and Union. This naming distinction avoids naming clashes and also allows the desired behavior to be chosen in a specific case.

HashSet implements the IEnumerable<T> pattern and therefore can be used in a LINQ query. Listing 1 demonstrates a LINQ query to find even numbers in a HashSet made by unioning two collections. The Console output from this example is shown in Output 1.

Listing 1. LINQ query over a HashSet collection—see Output 6-6
int[] first = new int[] { 1, 2, 3 };
int[] second = new int[] { 3, 4, 5 };

// modify the current set by unioning with second.
HashSet<int> set = new HashSet<int>(first);
set.UnionWith(second);

// return only the even values from the set.
// the values in the HashSet are IEnumerable<T>.
var q = from i in set
        where i % 2 == 0
        select i;

foreach (var item in q)
    Console.WriteLine(item);

Output 1.
2
4

The differences between the HashSet and LINQ operator support are listed here (as documented in the Visual Studio documentation), although LINQ-equivalent approximations are easy to construct as documented in Table 1 and implemented in Listing 2.

Table 1. Comparison of the LINQ Set Operators and the HashTable Type Methods
HashSet<T> operatorLINQ equivalent operatorDescription
UnionWithUnionAll distinct elements in both collections.
IntersectWithIntersectOnly elements that are in both collections.
ExceptWithExceptElements from the first collection, except those listed in the second collection.
Unnecessary because only unique elements can be added to a HashSet<T>DistinctRemoves duplicate elements from a collection. HashSet doesn’t allow duplicates to be added to the collection, making it unnecessary for it to have a Distinct operator equivalent.
SymmetricExcept-WithNot provided. To approximate the same result with LINQ, combine two reciprocal Except operators in the following fashion:
a.Except(b).Concat(b.Except(a));

Returns elements that appear in either of the collections, but not both. This is an exclusive-or operation on two collections.
OverlapsNot provided. To approximate the same result with LINQ, determine the intersect with the second collection and return true if there are any results in the following fashion:
b.Intersect(a).Distinct().Any();

Returns the Boolean value true if any element in the given collection appears in the second collection.
IsSubsetOfNot provided. To approximate the same result using LINQ, return the Boolean of true if the intersect count with the second collection equals the count of the distinct elements in the collection in the following fashion:
 var aCount =
    a.Distinct().Count();
 var intersectCount =
    a.Intersect(b)
    .Distinct()
    .Count();
return
  intersectCount == aCount;

Returns the Boolean value of true if all of the elements in the given collection are present in the second collection. Will return the Boolean value of true if the collections share the same elements.
IsProperSubsetOfNot provided. To approximate the same result with LINQ, return the Boolean of true if the intersect count with the second collection is less than the count of the distinct elements in the second collection, and the intersect count equals the count of the distinct elements in the collection in the following fashion:
 var aCount =
     a.Distinct().Count();
 var bCount =
     b.Distinct().Count();
 var intersectCount =
     a.Intersect(b)
     .Distinct()
     .Count();
return
  (intersectCount < bCount) &&
  (intersectCount == aCount);

Returns the Boolean value of true if all of the elements in the given collection are present in the second collection. Will return the Boolean value of false if the collections share exactly the same elements; the collection must be an actual subset with at least one element less than that in the second collection.
IsSupersetOfNot provided. To approximate the same result using LINQ, return the Boolean of true if the intersect count equals the count of the distinct elements in the second collection in the following fashion:
var bCount =
   b.Distinct().Count();
var intersectCount =
   b.Intersect(a)
   .Distinct()
   .Count();
return
  intersectCount == bCount;

Returns the Boolean value of true if all of the elements in the second collection are present in the given collection. Will return the Boolean value of true if the collections share the same elements.
IsProperSupersetOfNot provided. To approximate the same result with LINQ, return the Boolean of true if the intersect count is less than the count of the distinct elements in the collection and the intersect count equals the count of the distinct elements in the second collection in the following fashion:
var aCount =
    a.Distinct().Count();
var bCount =
    b.Distinct().Count();
var intersectCount =
    b.Intersect(a)
    .Distinct()
    .Count();
return
  (intersectCount < aCount) &&
  (intersectCount == bCount);

Returns the Boolean value of true if all of the elements in the second collection are present in the given collection. Will return the Boolean value of false if the collections share exactly the same elements. The collection must be an actual superset with at least one element more than that in the second collection.
SetEqualsNot provided. To approximate the same result with LINQ, compare the distinct, ordered collections in the following fashion:
a.Distinct().OrderBy(x => x)
.SequenceEqual(
b.Distinct().OrderBy(y => y));

Returns the Boolean value of true if both collections share the same distinct element values.

 

Listing 2. Approximate LINQ implementations of the operators in the HashSet type
public static IEnumerable<T> SymmetricExcept<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    if (first == null)
        throw new ArgumentNullException("first");

    if (second == null)
        throw new ArgumentNullException("second");

    return  first.Except(second)
            .Concat(
            second.Except(first));
}

public static bool Overlaps<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    if (first == null)
        throw new ArgumentNullException("first");

    if (second == null)
        throw new ArgumentNullException("second");

    return second.Intersect(first).Distinct().Any();
}

public static bool IsSupersetOf<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    if (first == null)
        throw new ArgumentNullException("first");

    if (second == null)
        throw new ArgumentNullException("second");

    var secondCount = second.Distinct().Count();

    var intersectCount = second
        .Intersect(first)
        .Distinct()
        .Count();

    return intersectCount == secondCount;
}

public static bool IsProperSupersetOf<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    if (first == null)
       throw new ArgumentNullException("first");

    if (second == null)
        throw new ArgumentNullException("second");

    var firstCount = first.Distinct().Count();
    var secondCount = second.Distinct().Count();

    var intersectCount =
        second
        .Intersect(first)
        .Distinct()
        .Count();

    return (intersectCount < firstCount) &&
           (intersectCount == secondCount);
}

public static bool IsSubsetOf<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    // call the Superset operator and reverse the arguments
    return IsSupersetOf(second, first);
}

public static bool IsProperSubsetOf<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    // call the Superset operator and reverse the arguments
    return IsProperSupersetOf(second, first);
}

public static bool SetEquals<T>(
    this IEnumerable<T> first,
    IEnumerable<T> second)
{
    return first.Distinct().OrderBy(x => x)
        .SequenceEqual(
            second.Distinct().OrderBy(y => y));
}

					  

 
Other  
 
Top 10
Review : Sigma 24mm f/1.4 DG HSM Art
Review : Canon EF11-24mm f/4L USM
Review : Creative Sound Blaster Roar 2
Review : Philips Fidelio M2L
Review : Alienware 17 - Dell's Alienware laptops
Review Smartwatch : Wellograph
Review : Xiaomi Redmi 2
Extending LINQ to Objects : Writing a Single Element Operator (part 2) - Building the RandomElement Operator
Extending LINQ to Objects : Writing a Single Element Operator (part 1) - Building Our Own Last Operator
3 Tips for Maintaining Your Cell Phone Battery (part 2) - Discharge Smart, Use Smart
REVIEW
- First look: Apple Watch

- 3 Tips for Maintaining Your Cell Phone Battery (part 1)

- 3 Tips for Maintaining Your Cell Phone Battery (part 2)
VIDEO TUTORIAL
- How to create your first Swimlane Diagram or Cross-Functional Flowchart Diagram by using Microsoft Visio 2010 (Part 1)

- How to create your first Swimlane Diagram or Cross-Functional Flowchart Diagram by using Microsoft Visio 2010 (Part 2)

- How to create your first Swimlane Diagram or Cross-Functional Flowchart Diagram by using Microsoft Visio 2010 (Part 3)
Popular Tags
Microsoft Access Microsoft Excel Microsoft OneNote Microsoft PowerPoint Microsoft Project Microsoft Visio Microsoft Word Active Directory Biztalk Exchange Server Microsoft LynC Server Microsoft Dynamic Sharepoint Sql Server Windows Server 2008 Windows Server 2012 Windows 7 Windows 8 Adobe Indesign Adobe Flash Professional Dreamweaver Adobe Illustrator Adobe After Effects Adobe Photoshop Adobe Fireworks Adobe Flash Catalyst Corel Painter X CorelDRAW X5 CorelDraw 10 QuarkXPress 8 windows Phone 7 windows Phone 8
Visit movie_stars's profile on Pinterest.