{% extends "layout.html" %} {% set title = 'skrub: Machine Learning with dataframes' %} {%- block extrahead %} {{ super() }} {# Add here landing-page specific stuff that goes in the header (eg css) #} {%- endblock extrahead %} {% block docs_navbar %} {{ super() }} {# We add the full-width banner below the navbar, as the div there is still full-width (unlike the article) #}
skrub is a Python library to
ease preprocessing and feature engineering for
tabular machine learning.
We directly connect database tables to machine learning.
Create strong scikit-learn pipeline baselines effortlessly with
TableVectorizer
and
tabular_pipeline.
Explore your dataframes interactively with
TableReport.
Encode text and high cardinality categorical data
(StringEncoder,
TextEncoder,
GapEncoder,
and
MinHashEncoder), or
extract features from dates with the
DatetimeEncoder.
Chain an arbitrary set of operations to prepare, transform, and assemble multiple tables for machine learning, then tune the full pipeline, inspect it, or apply it to new data.
Works with any computational or dataframe engine.
The skrub project is powered by the efforts of a worldwide community of contributors. Here we display a randomly selected group of 30 contributors.
Ready to write less code and get more insights? Dive into skrub now
and be part of an emerging community!