grub.examples.pypi

Searching for available PyPI names, with word2vec query expansion

class grub.examples.pypi.Search(wordvec_zip_filepath, search_words, exclude_words='already_published', wordvec_name_in_zip='wiki-news-300d-1M-subword.vec', n_neighbors=37, verbose=False)

Example:

```
zip_filepath = '/D/Dropbox/_odata/misc/wiki-news-300d-1M-subword.vec.zip'

import pandas as pd
df = pd.read_excel('/Users/twhalen/Downloads/pypi package names.xlsx')
target_words = set(df.word)

from grub.examples.pypi import Search

s = Search(zip_filepath, search_words=target_words)
s.search('search for the right name')
```
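This page does not show how `Search` expands a query internally. As a rough illustration of what word-vector query expansion generally means, the sketch below ranks candidate words by cosine similarity to each query word and keeps the closest few. The names `cosine`, `expand_query`, `vectors`, and `candidates` are hypothetical and are not part of `grub`; the toy vectors are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_words, vectors, candidates, n_neighbors=3):
    """Map each known query word to its nearest candidates, closest first."""
    expanded = {}
    for word in query_words:
        if word not in vectors:
            continue  # no vector for this word, so nothing to compare against
        ranked = sorted(
            (c for c in candidates if c in vectors and c != word),
            key=lambda c: cosine(vectors[word], vectors[c]),
            reverse=True,
        )
        expanded[word] = ranked[:n_neighbors]
    return expanded


# Toy 2-d vectors, invented for illustration only
vectors = {
    "search": [1.0, 0.0],
    "find": [0.9, 0.1],
    "seek": [0.8, 0.3],
    "banana": [0.0, 1.0],
}
result = expand_query(
    ["search"], vectors, candidates={"find", "seek", "banana"}, n_neighbors=2
)
```

In this sketch `n_neighbors` plays a role similar in spirit to the `n_neighbors=37` default in the signature above, though whether the library uses it the same way is an assumption.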

tokenizer(pos=0, endpos=9223372036854775807)

Return a list of all non-overlapping matches of pattern in string.
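The signature and docstring listed for `tokenizer` match those of a bound `re.Pattern.findall` method, which suggests the attribute is the `findall` of a compiled regular expression. The sketch below shows how such a tokenizer behaves; the pattern `\w+` is an assumption, since the pattern actually compiled by the library is not shown here.

```python
import re

# Hypothetical pattern: runs of word characters.
tokenizer = re.compile(r"\w+").findall

# Called with a string, it returns every non-overlapping match.
tokens = tokenizer("search for the right name")

# pos (and endpos) restrict the region of the string that is scanned.
tail = tokenizer("search for the right name", pos=11)
```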

class grub.examples.pypi.StreamsOfZip(zip_file, prefix='', open_kws=None)

class grub.examples.pypi.WordVecStream(stream)
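Neither class is documented further here. To give a sense of the task their names suggest (reading word vectors from a file inside a zip archive without extracting it), here is a minimal standard-library sketch. It assumes the fastText text layout, where a header line `<n_words> <dim>` is followed by one `word v1 v2 ...` line per word. The function `iter_word_vectors` is hypothetical and is not the library's implementation.

```python
import io
import zipfile


def iter_word_vectors(zip_file, name_in_zip):
    """Yield (word, vector) pairs from a text-format vector file inside a zip."""
    with zipfile.ZipFile(zip_file) as zf:
        with zf.open(name_in_zip) as raw:
            lines = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
            header = next(lines).split()  # assumed header: "<n_words> <dim>"
            dim = int(header[1])
            for line in lines:
                parts = line.rstrip().split(" ")
                if len(parts) != dim + 1:
                    continue  # skip lines that do not match the declared dimension
                yield parts[0], [float(x) for x in parts[1:]]
```

Because the function is a generator, vectors are decoded one line at a time, which matters for a file with a million 300-dimensional rows. `zip_file` may be a path or any seekable binary file object, as accepted by `zipfile.ZipFile`.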