This word-frequency Python package is handy: it includes a nice little word tokenizer and supports a bunch of languages.
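
For a rough idea of what "tokenizer plus word frequencies" means in practice, here is a minimal standard-library sketch. It is not the package's API (the note doesn't name the package), and the names `tokenize` and `word_frequencies` are made up for illustration. It also assumes a simple regex tokenizer, which only really suits space-separated languages; a real multi-language package would need much more than this.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration only; not the package's actual API.
# A token is a run of word characters, optionally with internal apostrophes.
TOKEN_RE = re.compile(r"\w+(?:'\w+)*")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def word_frequencies(text):
    """Return each word's share of all tokens in the text."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


freqs = word_frequencies("The cat saw the other cat. Don't panic!")
print(freqs["the"])  # 2 of the 8 tokens
```

The frequencies here come from whatever text you pass in; a dedicated package would typically ship precomputed frequencies from large corpora instead.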

