# DataInsightX-Raj

**A lightweight, beginner-friendly Python library for automated data quality checks and interactive visualizations.**

---

## Overview

`DataInsightX-Raj` helps **data engineers, analysts, and students** quickly assess and visualize the quality of their datasets.

With just one command or a few lines of code, you can:
- Detect **missing values**  
- Find **duplicate rows**  
- Validate **schema consistency**  
- Identify **data drift** between datasets  
- Generate an **interactive dashboard** (Plotly + HTML)
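
To make the first two checks concrete, here is a minimal sketch of what a missing-value report and a duplicate-row count can look like in plain pandas. It is an illustration only, not this library's API: the function names `missing_value_report` and `count_duplicate_rows` and the sample data are hypothetical.

```python
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.DataFrame:
    """Count missing values per column and give them as a percentage of rows."""
    is_missing = df.isna()
    return pd.DataFrame({
        "missing": is_missing.sum(),
        "percent": (is_missing.mean() * 100).round(2),
    })


def count_duplicate_rows(df: pd.DataFrame) -> int:
    """Count rows that exactly repeat an earlier row."""
    return int(df.duplicated().sum())


# Hypothetical sample data: one missing city, one repeated row.
df = pd.DataFrame({
    "id": [1, 2, 2, 4],
    "city": ["Pune", "Delhi", "Delhi", None],
})

report = missing_value_report(df)
dupes = count_duplicate_rows(df)
print(report)
print(f"duplicate rows: {dupes}")
```

The report is indexed by column name, so `report.loc["city", "percent"]` gives the share of missing values in that column.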

---

## Why This Project?

I built this library to simulate a **real-world data engineering task**: validating and profiling data before analysis or model training.

This project demonstrates:
- Python packaging and CLI development  
- Data validation and visualization skills  
- Open-source best practices (README, tests, PyPI readiness)  

Great to showcase on a **fresher data engineer resume**!

---

## Features

| Category | Feature | Description |
|-----------|----------|-------------|
| **Data Quality** | Missing Value Report | Identify missing values and their percentages |
|  | Duplicate Detection | Detect duplicate rows |
|  | Schema Validation | Check if data matches expected structure |
|  | Data Drift | Compare statistics between two datasets |
| **Visualization** | Automated Dashboard | Generate interactive HTML reports |
| **CLI Tool** | `datainsightx analyze file.csv` | Run full analysis from the terminal |
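
The schema validation and data drift rows in the table above can be pictured with a short pandas sketch. This is one simple way to do each check, under stated assumptions, and not necessarily how this library implements them: `validate_schema`, `mean_drift`, the expected-dtype mapping, and the sample data are all hypothetical, and drift is reduced here to a difference of column means.

```python
import pandas as pd


def validate_schema(df: pd.DataFrame, expected: dict) -> list:
    """Return a list of schema problems; an empty list means the data matches."""
    problems = []
    for column, dtype in expected.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif str(df[column].dtype) != dtype:
            problems.append(f"{column}: expected {dtype}, got {df[column].dtype}")
    for column in df.columns:
        if column not in expected:
            problems.append(f"unexpected column: {column}")
    return problems


def mean_drift(baseline: pd.DataFrame, current: pd.DataFrame) -> pd.DataFrame:
    """Compare the means of numeric columns that exist in both datasets."""
    shared = baseline.select_dtypes(include="number").columns.intersection(
        current.select_dtypes(include="number").columns
    )
    baseline_mean = baseline[shared].mean()
    current_mean = current[shared].mean()
    return pd.DataFrame({
        "baseline_mean": baseline_mean,
        "current_mean": current_mean,
        "difference": current_mean - baseline_mean,
    })


# Hypothetical datasets; dtypes are set explicitly so the check is platform-independent.
expected = {"age": "int64", "salary": "float64"}
baseline = pd.DataFrame({
    "age": pd.Series([20, 30, 40], dtype="int64"),
    "salary": pd.Series([100.0, 200.0, 300.0], dtype="float64"),
})
current = pd.DataFrame({
    "age": pd.Series([30, 40, 50], dtype="int64"),
    "salary": pd.Series([100, 200, 600], dtype="int64"),  # wrong dtype on purpose
})

baseline_problems = validate_schema(baseline, expected)
current_problems = validate_schema(current, expected)
drift = mean_drift(baseline, current)
print(current_problems)
print(drift)
```

Comparing means is the crudest possible drift signal. A fuller check would also look at spread, quantiles, or category frequencies.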

---

## Installation

Install the library from PyPI:

```bash
pip install datainsightx-raj
