Python Chunks and Chinks

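Chunking is the process of grouping adjacent words into phrases based on their part-of-speech tags. In the example below, we define a grammar that the chunks must follow. The grammar specifies the order of tags, such as determiners, adjectives and nouns, that makes up a chunk. The program prints the resulting chunks and also draws them as a tree.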

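import nltk

# A sentence that is already part-of-speech tagged, as (word, tag) pairs.
sentence = [("The", "DT"), ("small", "JJ"), ("red", "JJ"), ("flower", "NN"),
            ("flew", "VBD"), ("through", "IN"), ("the", "DT"), ("window", "NN")]

# NP chunk: an optional determiner, any number of adjectives, then a noun.
grammar = "NP: {<DT>?<JJ>*<NN>}"
cp = nltk.RegexpParser(grammar)
result = cp.parse(sentence)
print(result)   # print the chunk tree as text
result.draw()   # open a window with the tree diagram (needs a graphical display)

Running the program prints the chunk tree and opens a window that draws it. The parser finds two NP chunks, "The small red flower" and "the window", while "flew" and "through" stay outside any chunk.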
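To make the idea of a tag pattern more concrete, here is a minimal sketch in plain Python of how such a rule could be matched. It assumes the NP rule is "an optional determiner, any number of adjectives, then a noun" and rewrites it by hand as an ordinary regular expression over a string of tags. The names `tag_string`, `np_pattern` and `tag_index` are made up for this illustration; this is a simplified picture, not NLTK's actual implementation.

```python
import re

sentence = [("The", "DT"), ("small", "JJ"), ("red", "JJ"), ("flower", "NN"),
            ("flew", "VBD"), ("through", "IN"), ("the", "DT"), ("window", "NN")]

# Encode the tag sequence as one string: "<DT><JJ><JJ><NN><VBD><IN><DT><NN>".
tag_string = "".join(f"<{tag}>" for _, tag in sentence)

# Hand-written regex for the tag pattern <DT>?<JJ>*<NN> (an assumption of this sketch).
np_pattern = re.compile(r"(?:<DT>)?(?:<JJ>)*<NN>")


def tag_index(char_offset):
    """Convert a character offset in tag_string into a token index."""
    return tag_string.count("<", 0, char_offset)


# Each regex match over the tag string corresponds to one chunk of words.
chunks = []
for match in np_pattern.finditer(tag_string):
    start, end = tag_index(match.start()), tag_index(match.end())
    chunks.append([word for word, _ in sentence[start:end]])

print(chunks)
```

The sketch only looks at the tags, never at the words themselves, which is also why a chunk grammar is written in terms of tags such as DT, JJ and NN.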

Chinking

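Chinking is the process of removing a sequence of tokens from a chunk. If the sequence lies in the middle of a chunk, those tokens are removed and two chunks remain, one on each side of the gap.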
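The splitting behaviour can be sketched in plain Python before turning to NLTK. The sketch assumes the chink is defined as any run of adjectives (JJ) and nouns (NN): it starts from one chunk covering the whole sentence and cuts those runs out. The function `chink` and the constant `CHINK_TAGS` are hypothetical names used only for illustration.

```python
from itertools import groupby

sentence = [("The", "DT"), ("small", "JJ"), ("red", "JJ"), ("flower", "NN"),
            ("flew", "VBD"), ("through", "IN"), ("the", "DT"), ("window", "NN")]

CHINK_TAGS = {"JJ", "NN"}  # tags to cut out of the chunk (assumed for this sketch)


def chink(tagged_tokens, chink_tags):
    """Treat the whole sentence as one chunk, then remove runs of chink tags."""
    chunks, outside = [], []
    # groupby yields alternating runs of kept tokens and chinked tokens.
    for is_chink, run in groupby(tagged_tokens, key=lambda tok: tok[1] in chink_tags):
        words = [word for word, _ in run]
        if is_chink:
            outside.extend(words)   # removed tokens end up outside any chunk
        else:
            chunks.append(words)    # each remaining run is a separate chunk
    return chunks, outside


chunks, outside = chink(sentence, CHINK_TAGS)
print(chunks)
print(outside)
```

Because "small red flower" sits in the middle of the sentence, removing it leaves one chunk before it and one after it.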

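import nltk

# A sentence that is already part-of-speech tagged, as (word, tag) pairs.
sentence = [("The", "DT"), ("small", "JJ"), ("red", "JJ"), ("flower", "NN"),
            ("flew", "VBD"), ("through", "IN"), ("the", "DT"), ("window", "NN")]

grammar = r"""
  NP:
    {<.*>+}         # Chunk everything
    }<JJ|NN>+{      # Chink sequences of JJ and NN
  """
chunkprofile = nltk.RegexpParser(grammar)
result = chunkprofile.parse(sentence)
print(result)   # print the chunk tree as text
result.draw()   # open a window with the tree diagram (needs a graphical display)

Running the program prints the chunk tree and draws it. Two NP chunks remain, "The" and "flew through the", while "small", "red", "flower" and "window" are left outside any chunk.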

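As you can see, the tokens that match the chink pattern (the adjectives and nouns) are removed from the noun phrase chunk and left outside it as individual tokens. This process of excluding text that should not be part of a chunk is called chinking.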
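When no graphical display is available, the result can be inspected in code instead of calling `draw()`. The following is a minimal sketch, assuming NLTK is installed and using the chink grammar that removes JJ and NN sequences. It walks the top level of the returned tree and separates chunked words from the words left outside. The variable names `chunks` and `outside` are chosen for illustration.

```python
import nltk

sentence = [("The", "DT"), ("small", "JJ"), ("red", "JJ"), ("flower", "NN"),
            ("flew", "VBD"), ("through", "IN"), ("the", "DT"), ("window", "NN")]

grammar = r"""
  NP:
    {<.*>+}         # Chunk everything
    }<JJ|NN>+{      # Chink sequences of JJ and NN
  """
tree = nltk.RegexpParser(grammar).parse(sentence)

chunks = []   # words of each NP chunk, in sentence order
outside = []  # words that were chinked out or never chunked

# Top-level children are either subtrees (chunks) or plain (word, tag) pairs.
for node in tree:
    if isinstance(node, nltk.Tree):
        chunks.append([word for word, _ in node.leaves()])
    else:
        word, _ = node
        outside.append(word)

print(chunks)
print(outside)
```

The same loop works for a chunk-only grammar, since the parser returns the same kind of tree in both cases.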
