The HashSet and LinkedHashSet Classes – Collections: Part II
By Jaime Williams / July 23, 2024 / No Comments / Composing Stream Pipelines, Executing Stream Pipelines, Introduction to Streams, Oracle Certification Exam, Replacing Elements in Collections, The NavigableSet Interface
The HashSet<E> and LinkedHashSet<E> Classes
The HashSet<E> class implements the Set<E> interface. Since this implementation uses a hash table, it offers near-constant-time performance for most operations. A HashSet<E> does not guarantee any ordering of the elements. However, the LinkedHashSet<E> subclass of HashSet<E> guarantees insertion order. It is also the implementation of choice if frequent iteration is necessary over the set. The sorted counterpart is TreeSet<E>, which implements the SortedSet<E> and the NavigableSet<E> interfaces and has logarithmic time complexity (p. 810).
A HashSet<E> relies on the implementation of the hashCode() and equals() methods of its elements (§14.2, p. 744, and §14.3, p. 753). The hashCode() method is used for hashing the elements, and the equals() method is needed for comparing elements for equality. In fact, the equality and the hash codes of HashSets are defined in terms of the equality and the hash codes of their elements.
HashSet()
Constructs a new, empty set.
HashSet(Collection<? extends E> c)
Constructs a new set containing the elements in the specified collection. The new set will not contain any duplicates. This offers a convenient way to remove duplicates from a collection.
HashSet(int initialCapacity)
Constructs a new, empty set with the specified initial capacity.
HashSet(int initialCapacity, float loadFactor)
Constructs a new, empty set with the specified initial capacity and the specified load factor.
As mentioned earlier, the LinkedHashSet<E> implementation is a subclass of the HashSet<E> class. It works similarly to a HashSet<E> except for one important detail. Unlike a HashSet<E>, a LinkedHashSet<E> guarantees that the iterator will access the elements in insertion order—that is, in the order in which the elements were inserted into the LinkedHashSet<E>.
The LinkedHashSet<E> class offers constructors analogous to the ones in the Hash-Set<E> class. The initial capacity (i.e., the number of buckets in the hash table) and its load factor (i.e., the ratio of number of elements stored to its current capacity) can be tuned when the set is created. The default values for these parameters will, under most circumstances, provide acceptable performance.
Example 15.4 demonstrates iterating over a HashSet<E> and a LinkedHashSet<E> created at (1) and (2), respectively. Regardless of the order in which elements are inserted into a HashSet<E>, we cannot depend on the order in which the for(:) loop will iterate over the elements in the set, as is evident from the program output. A LinkedHashSet<E>, on the other hand, is always iterated over in insertion order—that is, the first element inserted is the first element retrieved by the for(:) loop. The program output confirms this behavior, as the meal that was inserted last into the LinkedHashSet<E> is served first. The same behavior will be exhibited if an explicit iterator is used to iterate over the sets.
Example 15.4 Iterating Over Sets
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
public class IterationHashSetAndLinkedHashSet {
public static void main(String[] args) {
Set<String> set1 = new HashSet<>(); // (1)
set1.add(“breakfast”); set1.add(“lunch”); set1.add(“dinner”);
System.out.println(“Serving meals from a HashSet (order can vary):”);
for (String meal : set1) {
System.out.println(meal);
}
Set<String> set2 = new LinkedHashSet<>(); // (2)
set2.add(“breakfast”); set2.add(“lunch”); set2.add(“dinner”);
System.out.println(“Serving meals from a LinkedHashSet” +
” (always insertion order):”);
for (String meal : set2) {
System.out.println(meal);
}
}
}
Output from the program:
Serving meals from a HashSet (order can vary):
dinner
breakfast
lunch
Serving meals from a LinkedHashSet (always insertion order):
breakfast
lunch
dinner
Example 15.5 demonstrates set operations. It determines the following relationships between two sets of characters:
- Whether they are disjunct—that is, have no elements in common
- Whether they have the same elements—that is, are equivalent
- Whether one is a subset of the other
- Whether one is a superset of the other
- Whether they have a common subset
Given a list of words as program arguments, each argument is turned into a set of characters. This character set is compared with the set of all characters encountered so far in previous arguments.
The set encountered created at (1) accumulates characters as each argument is processed. For each argument, an empty set of characters is created at (2). This characters set is populated with the characters of the current argument at (3). The program first determines if there is a common subset between the two sets at (4)—that is, whether the current argument has any characters that were in previous arguments:
// Determine whether a common subset exists: (4)
Set<Character> commonSubset = new HashSet<>(encountered);
commonSubset.retainAll(characters);
boolean areDisjunct = commonSubset.size()==0;
Note that the retainAll() operation is destructive. The code at (4) does not affect the encountered and the characters sets. If the size of the common subset is zero, the sets are disjunct; otherwise, the relationship must be narrowed down. The subset and superset relations are determined at (5), using the containsAll() method.
// Determine superset and subset relations. (5)
boolean isSubset = encountered.containsAll(characters);
boolean isSuperset = characters.containsAll(encountered);
The sets are equivalent if both of the previous relations are true. If the relations are both false—that is, no subset or superset relationship exists, the sets only have the subset computed at (4) in common. The encountered set is updated with the contents of the characters set to accumulate all characters encountered so far. The addAll() method is used for this purpose at (6):
encountered.addAll(characters); // (6)
As we can see from the output, the program prints the contents of the sets in the standard text representation for collections.
Example 15.5 Using Sets
import java.util.HashSet;
import java.util.Set;
public class CharacterSets {
public static void main(String[] args) {
// A set for keeping track of all characters previously encountered.
Set<Character> encountered = new HashSet<>(); // (1)
// For each program argument in the command line …
for (String argument : args) {
// Convert the current argument to a set of characters.
Set<Character> characters = new HashSet<>(); // (2)
int size = argument.length();
// For each character in the argument…
for (int j = 0; j < size; j++)
// add character to the characters set.
characters.add(argument.charAt(j)); // (3)
// Determine whether a common subset exists: (4)
Set<Character> commonSubset = new HashSet<>(encountered);
commonSubset.retainAll(characters);
boolean areDisjunct = commonSubset.size()==0;
if (areDisjunct) {
System.out.println(characters + ” and ” + encountered + ” are disjunct.”);
} else {
// Determine superset and subset relations. (5)
boolean isSubset = encountered.containsAll(characters);
boolean isSuperset = characters.containsAll(encountered);
if (isSubset && isSuperset)
System.out.println(characters + ” is equivalent to ” + encountered);
else if (isSubset)
System.out.println(characters + ” is a subset of ” + encountered);
else if (isSuperset)
System.out.println(characters + ” is a superset of ” + encountered);
else
System.out.println(characters + ” and ” + encountered + ” have ” +
commonSubset + ” in common.”);
}
// Update the set of characters encountered so far.
encountered.addAll(characters); // (6)
}
}
}
Running the program with the following arguments:
>
java CharacterSets i said i am maids
results in the following output:
[i] and [] are disjunct.
[d, a, s, i] is a superset of [i]
[i] is a subset of [d, a, s, i]
[a, m] and [d, a, s, i] have [a] in common.
[d, a, s, m, i] is equivalent to [d, a, s, m, i]