Natural Language Processing With Python & NLTK Cheat Sheet: by Via
list(text) Split text into character tokens
set(text) Unique tokens
len(text) Number of characters
g=nltk.CFG.fromstring("""...""") Manually define grammar
parser=nltk.ChartParser(g) Create a parser out of the grammar
trees=parser.parse_all(text)
for tree in trees: print(tree)
from nltk.corpus import treebank
Accessing corpora and lexical resources
from nltk.corpus import brown Import CorpusReader object
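The grammar and parser entries can be combined into one small end-to-end run. The sketch below is illustrative only: the toy grammar and the sample sentence are assumptions made up for this example, not part of the cheat sheet, and the input is passed as a list of word tokens.

```python
import nltk

# Toy grammar invented for illustration (an assumption, not from the cheat sheet).
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

# Create a chart parser out of the grammar.
parser = nltk.ChartParser(grammar)

# The parser works on a list of tokens, so split the sentence first.
tokens = "the dog chased the cat".split()

# parse_all returns a list with every parse tree the grammar allows.
trees = parser.parse_all(tokens)
for tree in trees:
    print(tree)
```

Every word in the input must be covered by the grammar; otherwise NLTK raises a ValueError before parsing starts.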
df=pd.DataFrame(time_sents, columns=['text'])
df['text'].str.split().str.len()
df['text'].str.contains('word')
df['text'].str.count(r'\d')
df['text'].str.findall(r'\d')
df['text'].str.replace(r'\w+day\b', '???', regex=True)
df['text'].str.extract(r'(\d?\d):(\d\d)')
df['text'].str.extractall(r'((\d?\d):(\d\d) ?([ap]m))')
df['text'].str.extractall(r'(?P<digits>\d)')
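To show what a few of these string methods return, here is a minimal sketch. The two sentences in time_sents are placeholder data assumed for this example, since the cheat sheet does not define that variable.

```python
import pandas as pd

# Placeholder sentences (an assumption; time_sents is not defined in the cheat sheet).
time_sents = [
    "Monday: The doctor's appointment is at 2:45pm.",
    "Tuesday: The dentist's appointment is at 11:30 am.",
]
df = pd.DataFrame(time_sents, columns=['text'])

# Number of whitespace-separated tokens per row.
word_counts = df['text'].str.split().str.len()

# Number of digit characters per row.
digit_counts = df['text'].str.count(r'\d')

# Replace weekday names; regex=True is needed for the pattern to be
# interpreted as a regular expression in recent pandas versions.
masked = df['text'].str.replace(r'\w+day\b', '???', regex=True)

# First hour/minute match per row, one column per capture group.
times = df['text'].str.extract(r'(\d?\d):(\d\d)')

print(word_counts.tolist())
print(digit_counts.tolist())
print(masked.tolist())
print(times)
```

str.extract keeps only the first match in each row, while str.extractall returns one row per match, indexed by the original row and the match number.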