Metadata-Version: 2.1
Name: py4tfidf
Version: 0.0.4
Summary: Term Frequency – Inverse Document Frequency (TF-IDF) Python Library
Home-page: https://github.com/luthfi118/py4tfidf
Author: Mgs. M. Luthfi Ramadhan
Author-email: luthfir96@gmail.com
License: UNKNOWN
Description: # py4tfidf
        
        Term Frequency – Inverse Document Frequency (TF-IDF) Python Library
        
        ## Getting Started
        
        This project is a simple implementation of the TF-IDF algorithm in Python.
        
        ### Prerequisites
        
        NumPy
        
        
        ### Installing
        
        The easiest way to install py4tfidf is with pip:
        
        ```
        pip install py4tfidf
        ```
        
        ### Usage
        The ```Tfidf``` class has two public methods: ```vectorize_train``` and ```vectorize_test```. ```vectorize_train``` builds the corpus, calculates idf from the training text, and transforms that text into vectors by multiplying each term's tf by its idf. ```vectorize_test``` only transforms the test text into vectors, multiplying its tf by the idf obtained during training. Each method takes one argument, ```x_train``` and ```x_test``` respectively. Because tokenizing is usually done during text preprocessing, py4tfidf assumes you tokenize the text yourself, so the argument to both methods should be a list of tokenized texts.
        ```
        from py4tfidf.vectorizer import Tfidf
        vec = Tfidf()
        x_train = [['i','love', 'python'],['natural','language','processing','is','fun'],['python','is','fun']]
        x_test = [['python','language','is','fun'],['im','learning','natural','language','processing']]
        x_train = vec.vectorize_train(x_train)
        x_test = vec.vectorize_test(x_test)
        ```
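        
        To make the train/test split concrete, here is a minimal plain-Python sketch of the idea. It is not py4tfidf's actual implementation: the helper names ```fit_idf``` and ```transform``` are hypothetical, and the formulas (raw term counts for tf, ```ln(N / df)``` for idf, unseen test tokens ignored) are assumptions that may differ from what the library computes.
        
        ```python
        import math
        
        def fit_idf(docs):
            """Build a sorted vocabulary and idf weights from tokenized training docs."""
            vocab = sorted({tok for doc in docs for tok in doc})
            n_docs = len(docs)
            # Document frequency: number of training documents containing each term.
            df = {term: sum(term in doc for doc in docs) for term in vocab}
            # Assumed formula: idf = ln(N / df); the library may smooth or scale differently.
            idf = {term: math.log(n_docs / df[term]) for term in vocab}
            return vocab, idf
        
        def transform(docs, vocab, idf):
            """Map each tokenized doc to a tf * idf vector; tokens outside vocab are ignored."""
            return [[doc.count(term) * idf[term] for term in vocab] for doc in docs]
        
        train = [['i', 'love', 'python'], ['python', 'is', 'fun']]
        test = [['python', 'is', 'great']]
        
        # idf is learned once from the training text...
        vocab, idf = fit_idf(train)
        train_vecs = transform(train, vocab, idf)
        # ...and reused unchanged for the test text.
        test_vecs = transform(test, vocab, idf)
        ```
        
        In this sketch, a term that appears in every training document (here ```python```) gets an idf of 0 and so contributes nothing to any vector, while a test token that never appeared in training (here ```great```) has no column at all.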
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
