在Python中查找不同子字符串的数量的程序,为不同的查询查找
假设我们有一个长度为n的字符串s。我们还有一个查询列表Q,其中Q[i]包含一对(l,r)。对于每个查询,我们必须计算在[l,r]之间包括s的不同子字符串的数量。
因此,如果输入如下s = “ppqpp” Q = [(1,1),(1,4),(1,1),(0,2)],则输出将是[1, 8, 1, 5],因为
- 查询(1, 1)的唯一子字符串是’p’,因此输出为1
- 查询(1, 4)的子字符串是’p’,’q’,’pq’,’qp’,’pp’,’pqp’,’qpp’和’pqpp’,因此输出为8
- 再次查询(1, 1)的唯一子字符串是’p’,因此输出为1
- 查询(0, 2)的子字符串是’p’,’q’,’pp’,’pq’,’ppq’,因此输出为8.
为了解决这个问题,我们将遵循以下步骤 –
- 定义一个函数kasai()。给出s、suff和n
- lcp :一个大小为n的数组且值为0
- inv :一个大小为n的数组,值为0
- for i in range 0 to n – 1,do
- inv[suff[i]] := i
- k :0
- for i in range 0 to n – 1,do
- 如果inv[i]与n-1相同,则
- k := 0
- 进入下一次迭代
- j := suff[inv[i] + 1]
- while i + k < n and j + k < n and s[i + k]与s[j + k]相同时,do
- k := k + 1
- lcp[inv[i]] := k
- if k >0,则
- k := k – 1
- 如果inv[i]与n-1相同,则
- return lcp
- 从主要方法中进行以下操作 –
- res :一个新列表
- for i in range 0 到 Q大小-1,do
- (left, right) = Q[i]
- sub = s从左到右索引的子字符串
- length = right-left +1
- suffix =一个元素为(i、从0到length-1的每个i的sub字符串)的列表
- 按照pair substring的第二个项对后缀进行排序
- (suff,suffix) =来自后缀的索引对和相应子字符串的对
- lcp = kasai(sub,suff,length)
- count = suffix[0]的大小
- for i in range 0 ,length-2,do
- count = count + suffix [i + 1]的大小 – lcp[i]
- insert count 到结果的末尾
- 返回结果
样例
让我们看一下以下实现,以便更好地理解
def kasai (s, suff, n):
lcp = [0]* n
inv = [0] * n
for i in range (n):
inv [suff [i]] = i
k = 0
for i in range (n):
if inv [i] == n-1:
k = 0
continue
j = suff [inv [i] + 1]
while i + k 0:
k -= 1
return lcp
def solve(s, Q):
res = []
for i in range (len(Q)):
left, right = Q[i]
sub = s [left: right + 1]
length = right-left + 1
suffix = [[i, sub [i:]] for i in range (length)]
suffix.sort (key = lambda x: x [1])
suff, suffix = [list (t) for t in zip (* suffix)]
lcp = kasai (sub, suff, length)
count = len (suffix [0])
for i in range (length-1):
count += len (suffix [i + 1]) - lcp [i]
res.append(count)
return res
s = "pptpp"
Q = [(1,1),(1,4),(1,1),(0,2)]
print(solve(s, Q))
输入
"pptpp",[(1,1),(1,4),(1,1),(0,2)]
输出
[1, 8, 1, 5]