C++标准模板库中的无序集

unordered_set 是一种无序关联容器，在哈希表中使用哈希函数将键散列到哈希表的索引中，因此插入始终是随机的。所有 unordered_set 上的操作在平均情况下都需要常数时间 O(1) ，在最坏情况下可能需要线性时间 O(n) ，这取决于内部使用的哈希函数，但实际上它们的表现非常好，通常提供一个常数时间的查找操作。

unordered_set可以包含任何类型的键-预定义或用户定义数据结构，但所有键必须是唯一的。它在< unordered_set >头文件中定义为 std::unordered_set 类。

语法：

std::unordered_set<data_type> name;

在std::unordered_set中，定义了许多函数，其中最常用的函数是：

size()和empty()用于容量。
find()用于搜索关键字。
insert()和erase()用于修改。

示例：

// C++程序演示了unordered_set的各种功能
#include <bits/stdc++.h>
using namespace std;
int main()
{
// 声明用于存储字符串数据类型的set
    unordered_set<string> stringSet;
// 将各种字符串插入，重复的字符串只会存储一次
    stringSet.insert("code");
    stringSet.insert("in");
    stringSet.insert("c++");
    stringSet.insert("is");
    stringSet.insert("fast");
    string key = "slow";
// 如果未找到关键字，find()则返回end迭代器；否则，返回到关键字的迭代器
    if (stringSet.find(key) == stringSet.end())
        cout << key << " not found" << endl << endl;
    else
        cout << "Found " << key << endl << endl;
    key = "c++";
    if (stringSet.find(key) == stringSet.end())
        cout << key << " not found\n";
    else
        cout << "Found " << key << endl;
    cout << "\nAll elements : ";
    unordered_set<string>::iterator itr;
    for (itr = stringSet.begin(); itr != stringSet.end();
            itr++)
        cout << (*itr) << endl;
}

输出

slow not found

Found c++

All elements : is
fast
c++
in
code

find()、insert()和erase()函数平均需要一个恒定的时间。如果在集合中没有关键字，find()函数会返回end()迭代器，否则返回到关键字的迭代器。迭代器作为指向关键字值的指针，因此我们可以通过使用*运算符来解除引用并获取关键字。

我们还可以看到输出的顺序混乱，因为没有特定的顺序来存储unordered_set中的数据。

注意： unordered_set只允许唯一的关键字，而对于重复的关键字应该使用unordered_multiset。

C++STL中的set vs unordered_set

Set	无序集合
Set是一组有序的唯一键序列。	无序集合是一种可以按任意顺序存储键的集合，因此是无序的。
Set是作为平衡树结构实现的，可以通过特定的树遍历方式来维护元素之间的顺序。	无序集合实现为哈希表，因为我们不必担心任何顺序。
Set操作的时间复杂度为O(log n)。	基本Set操作的时间复杂度是O(1)。

在无序集合上的实际问题

给定一个整数数组（列表），找出其中所有的重复项（假设重复项是一个由随机人猜测的相同的数字，在实验中）。

输入：

arr[] = {1, 5, 2, 1, 4, 3, 1, 7, 2, 8, 9, 5}

输出：

重复项为：5 2 1

例子：

//使用unordered_set从数组中找到重复
//C++程序
#include <bits/stdc++.h>
using namespace std;
 
//使用unordered_set在arr[0…n-1]中打印重复项
void printDuplicates(int arr[], int n)
{
    //声明unordered sets用于检查和存储
    //重复
    unordered_set<int> intSet;
    unordered_set<int> duplicate;
 
    //循环遍历数组元素
    for (int i = 0; i < n; i++) {
        //如果元素不存在则插入该元素
        if (intSet.find(arr[i]) == intSet.end())
            intSet.insert(arr[i]);
 
        //如果元素已经存在，则插入到重复集
        else
            duplicate.insert(arr[i]);
    }
 
    //打印结果
    cout << "重复项为：";
    unordered_set<int>::iterator itr;
 
    //Iterator itr从begin()到end()循环
    for (itr = duplicate.begin(); itr != duplicate.end();
         itr++)
        cout << *itr << " ";
}
 
//Driver code
int main()
{
    int arr[] = { 1, 5, 2, 1, 4, 3, 1, 7, 2, 8, 9, 5 };
    int n = sizeof(arr) / sizeof(int);
 
    printDuplicates(arr, n);
    return 0;
}

输出

重复项为：5 2 1

时间复杂度：O(n log n)
辅助空间：O(n)

std::unordered_set成员方法

以下表格包含std::unordered_set类的所有成员函数：

函数名称	函数描述
insert()	在无序集合容器中插入一个新的 {element}。
begin()	返回指向无序集合容器中第一个元素的迭代器。
end()	返回指向“过去-结束”的元素的迭代器。
count()	统计无序集合容器中特定元素的出现次数。
find()	在容器中查找一个元素。
clear()	删除无序集合中的所有元素并清空容器。
cbegin()	返回指向无序集合容器中第一个元素的const_iterator。
cend()	返回指向无序集合容器或其Buckets中“过去-结束”的元素的const_iterator。
bucket_size()	返回无序集合容器中特定Bucket中存在的元素总数。
erase()	删除单个元素或从开始（包括）到结束（不包括）范围内的一系列元素。
size()	返回无序集合容器中元素的数量。
swap()	交换两个无序集合容器的值。
emplace()	在无序集合容器中插入一个元素。
max_size()	返回无序集合容器可以容纳的最大元素数量。
empty()	检查无序集合容器是否为空。
equal_range	返回包括所有等于给定值的元素的范围。
operator=	将一个无序集合复制（或移动）到另一个无序集合中，unordered_set::operator=是相应的操作函数。
hash_function()	这个哈希函数是一个一元函数，只接受一个参数，并根据它返回一个唯一的size_t类型的值。
reserve()	用于请求无序集合的容量变化。
bucket()	返回特定元素的Bucket编号。
bucket_count()	返回无序集合容器中存在的Bucket总数。
load_factor()	返回无序集合容器中的当前负载因子。
rehash()	将无序集合的容器中的Bucket数设置为给定大小或更大。
max_load_factor()	返回（或设置）无序集合容器的当前最大负载因子。
emplace_hint()	如果要插入的值是唯一的，则仅使用给定提示将新元素插入无序集合中。
operator	‘’是C++ STL中的一个运算符，用于在两个无序集合之间执行相等比较操作，unordered_set::operator是相应的操作函数。
key_eq()	根据比较返回一个布尔值。返回无序集合使用的键等价比较谓词。
operator!=	!= 是 C++ STL 中的一个关系运算符，用于比较无序集合容器之间的相等和不相等。
max_bucket_count()	查找 unordered_set 可能具有的最大桶数。