Python NLTK | nltk.WhitespaceTokenizer
Original: https://www.geeksforgeeks.org/python-nltk-nltk-whitespacetokenizer/
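With the help of **nltk.tokenize.WhitespaceTokenizer()**, we can extract tokens from a string of words or sentences. Spaces, newlines, and tabs are treated as separators, so the returned tokens contain no whitespace.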
Syntax :
tokenize.WhitespaceTokenizer()
Return : a tokenizer object whose tokenize() method returns the list of tokens in a string
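To make the splitting rule concrete, here is a minimal standard-library sketch of the same idea. The helper name `whitespace_tokenize` is hypothetical and is not part of NLTK; it only assumes the input is a plain `str` and is meant as an illustration, not as NLTK's actual implementation.

```python
import re


def whitespace_tokenize(text):
    """Hypothetical helper (not part of NLTK): split text on runs of whitespace."""
    # \S+ matches each maximal run of non-whitespace characters,
    # so spaces, tabs, and newlines act purely as separators.
    return re.findall(r"\S+", text)


tokens = whitespace_tokenize("GeeksforGeeks \nis\t for geeks")
print(tokens)  # ['GeeksforGeeks', 'is', 'for', 'geeks']
```

For plain strings this gives the same result as calling `text.split()` with no arguments, which is often sufficient when only the tokens themselves are needed.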
Example #1 :
In this example, the tokenize() method of tokenize.WhitespaceTokenizer() splits a string containing spaces, a newline, and a tab into its individual words.
# Import the WhitespaceTokenizer class from nltk
from nltk.tokenize import WhitespaceTokenizer
# Create an instance of the WhitespaceTokenizer class
tk = WhitespaceTokenizer()
# Create a string input
gfg = "GeeksforGeeks \nis\t for geeks"
# Use tokenize method
geek = tk.tokenize(gfg)
print(geek)
Output :
['GeeksforGeeks', 'is', 'for', 'geeks']
Example #2 :
# Import the WhitespaceTokenizer class from nltk
from nltk.tokenize import WhitespaceTokenizer
# Create an instance of the WhitespaceTokenizer class
tk = WhitespaceTokenizer()
# Create a string input
gfg = "The price\t of burger \nin BurgerKing is Rs.36.\n"
# Use tokenize method
geek = tk.tokenize(gfg)
print(geek)
Output :
['The', 'price', 'of', 'burger', 'in', 'BurgerKing', 'is', 'Rs.36.']
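Note that punctuation stays attached to the neighbouring word ('Rs.36.' is a single token), because only whitespace separates tokens. When the position of each token in the original string also matters, character offsets can be recovered alongside the tokens. The following is a standard-library sketch under that assumption; `token_spans` is a hypothetical name and not an NLTK function.

```python
import re


def token_spans(text):
    """Hypothetical helper (not part of NLTK): (start, end) offsets of each token."""
    # Each match of \S+ is one whitespace-delimited token; span() gives
    # its start (inclusive) and end (exclusive) index in the original text.
    return [match.span() for match in re.finditer(r"\S+", text)]


text = "The price\t of burger \nin BurgerKing is Rs.36.\n"
for start, end in token_spans(text):
    # Slicing the original string with a span recovers the token itself.
    print(start, end, repr(text[start:end]))
```

Slicing the original string with each span yields exactly the tokens listed in the output above.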