Perl: Counting the Frequency of Words in Text

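Counting the frequency of every word in a string is a basic operation in any programming language. The frequency of each word in a text can be counted and stored in a hash for later use. In Perl, this is done by first splitting the string into an array of words. The split function with the pattern / / splits the string on a single space. However, two words may be separated by more than one whitespace character, so /\s+/ is used instead; here \s+ matches one or more whitespace characters. We then traverse the array produced by the split and increment the count of each word as we go.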

Example: counting the frequency of words in a string

```perl
# Perl program for counting words in a string
use strict;
use warnings;

my $actual_text = "GFG GeeksforGeeks GFG";

# Creating words array by splitting the string
my @words = split / /, $actual_text;

# Traversing the words array and
# increasing count of each word by 1
my %count;
foreach my $word (@words)
{
    $count{$word}++;
}

# Printing the word and its actual count
foreach my $word (sort keys %count)
{
    print $word, " ", $count{$word}, "\n";
}
```

Output:

GFG 2
GeeksforGeeks 1

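Difference between /\s+/ and / /: \s+ treats a run of one or more whitespace characters as a single separator, whereas / / splits on exactly one space. The code below shows the difference when the text has more than one space between two words.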

Example: demonstrating the difference between /\s+/ and / /.

```perl
# Perl program for counting words in a string using / /
use strict;
use warnings;

# A text with two spaces rather than one
my $actual_text = "GeeksforGeeks  welcomes you to GeeksforGeeks portal";

# Splitting the words with / /
my @words = split / /, $actual_text;

# Counting the occurrence of each word
my %count;
foreach my $word (@words)
{
    $count{$word}++;
}

# Printing each word and its count
foreach my $word (sort keys %count)
{
    print $word, " ", $count{$word}, "\n";
}
```

Output:

```
 1
GeeksforGeeks 2
portal 1
to 1
welcomes 1
you 1
```

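Note: the extra space produces an empty string between the two consecutive spaces, and that empty string is also counted as a word (the first line of the output).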
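To see where the extra entry comes from, it can help to print the fields that split returns. The following is a minimal sketch using a made-up sample string (an assumption for illustration, not taken from the example above) with two spaces between its first two words.

```perl
use strict;
use warnings;

# Hypothetical sample text: two spaces between "a" and "b".
my $text = "a  b c";

# Splitting on a single space leaves an empty field
# between the two consecutive spaces.
my @fields = split / /, $text;

# Wrap each field in brackets so the empty one is visible.
print join(",", map { "[$_]" } @fields), "\n";   # [a],[],[b],[c]
print "field count: ", scalar(@fields), "\n";    # field count: 4
```

The empty field in the second position is what ends up as the empty-string key in the count hash.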

Splitting the words with /\s+/: here the extra space is not counted as a separate word.

Example:

```perl
# Perl program for counting words in a string using /\s+/
use strict;
use warnings;

# Actual string with two spaces
my $actual_text = "GeeksforGeeks  welcomes you to GeeksforGeeks portal";

# Splitting the text using /\s+/
my @words = split /\s+/, $actual_text;

# Counting the occurrence of each word
my %count;
foreach my $word (@words)
{
    $count{$word}++;
}

# Printing each word and its count
foreach my $word (sort keys %count)
{
    print $word, " ", $count{$word}, "\n";
}
```

Output:

```
GeeksforGeeks 2
portal 1
to 1
welcomes 1
you 1
```

Note: the extra space is not counted as a word.
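The two behaviours can also be compared side by side. The sketch below wraps the counting loop in a helper named `word_freq`, which is a hypothetical name introduced here for illustration and not part of the original program. It takes the text and a compiled separator pattern. The last few lines illustrate a related property of split that the examples above do not cover: with /\s+/, leading whitespace in the text still yields one empty leading field, while the special pattern `' '` (a single-space string) skips it.

```perl
use strict;
use warnings;

# word_freq is a hypothetical helper (not from the original article).
# It splits $text on the regex $sep and returns a hash reference
# mapping each field to the number of times it occurs.
sub word_freq {
    my ($text, $sep) = @_;
    my %count;
    $count{$_}++ for split /$sep/, $text;
    return \%count;
}

# Same text as in the examples: two spaces after the first word.
my $text = "GeeksforGeeks  welcomes you to GeeksforGeeks portal";

my $single = word_freq($text, qr/ /);
my $multi  = word_freq($text, qr/\s+/);

print 'distinct keys with / /:   ', scalar(keys %$single), "\n";   # 6
print 'distinct keys with /\s+/: ', scalar(keys %$multi),  "\n";   # 5

# Leading whitespace: /\s+/ keeps one empty leading field,
# while split ' ' drops leading whitespace before splitting.
my @lead_regex = split /\s+/, "  GFG portal";
my @lead_awk   = split ' ',   "  GFG portal";
print scalar(@lead_regex), " vs ", scalar(@lead_awk), "\n";        # 3 vs 2
```

If the input may start with whitespace, `split ' '` is therefore likely the safer choice for word counting, though which one fits depends on how the text is produced.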