# Your first analysis

```{eval-rst}
.. currentmodule:: hashquery
```

Your first query will depend on the data you have access to,
but this should serve as a simple guide to get you started.

:::{note}
Values in this code that will need to be changed based on your data
will appear in UPPERCASE in the Python snippets below.
:::

## Getting the count of items

Let's start simple. We'll import a connection and count the number of records
in a table.

First, let's import our connection. This does not actually connect to your data
warehouse. It is only a reference to the data connection that will be
used later on.

```{literalinclude} first_analysis_0.py
:end-at: connection =
:lineno-match:
```

:::{dropdown} Failing to load the connection?

Use the following to print all the names of valid connections:

```python
from hashquery import *
from hashquery.project_importer import ProjectImporter

connections = ProjectImporter().all_data_connections()
print("Data connections:")
print([dc.alias for dc in connections])
```

:::

Next, we'll define a simple {py:class}`Model` for a table. This model represents
all the records in the table named `TABLE_NAME`.

```{literalinclude} first_analysis_0.py
:start-at: table_model =
:end-at: table_model =
:lineno-match:
```

Let's now sub-model this to form a new model. While the previous model
represented _all records_ in the table, this new model will represent
the total count of the records in the table.

We can do so by calling {py:func}`Model.aggregate` with no groups, and a single
measure: {py:func}`count`.

```{literalinclude} first_analysis_0.py
:start-at: count_model =
:end-at: count_model =
:lineno-match:
```

Finally, we'll execute the model:

```{literalinclude} first_analysis_0.py
:start-at: results =
:lineno-match:
```

When you run this script, you should see a printed item containing a
value for the count of records in the table.
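If it helps to picture what the count model computes, the sketch below runs a roughly equivalent query against an in-memory SQLite table using only the Python standard library. The `events` table and its rows are made up for illustration, and the SQL is an assumption about the general shape of such a query, not the statement the library actually generates.

```python
import sqlite3

# Hypothetical in-memory table standing in for TABLE_NAME.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "click"), (2, "view"), (3, "click")],
)

# Aggregating with no groups and a single count measure collapses
# the whole table into one row holding one number.
(total,) = conn.execute("SELECT COUNT(*) FROM events").fetchone()
print(total)

conn.close()
```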

:::{dropdown} Code so far

```{literalinclude} first_analysis_0.py
:linenos:
```

:::

## Breaking out by groups

Let's revisit our analysis. Instead of gathering a count of all records,
let's group the records into buckets, and count the totals within each bucket.
We'll then take the top 3 largest buckets.

Pick a column whose values split the data into meaningful groups. If you choose
a date column, instead of specifying `column("COLUMN_NAME")`, use
`column("COLUMN_NAME").by_month` to truncate the timestamps into months.

```{literalinclude} first_analysis_1.py
:start-at: top_counts_model =
:lines: -8
:lineno-match:
```

This should now show you the top 3 buckets.
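As a point of comparison, grouping by a month-truncated date, counting each bucket, and keeping the 3 largest can be sketched in plain SQL. The example below uses the standard library's `sqlite3` module with a hypothetical `orders` table; the table, column names, and dates are illustrative assumptions, and the SQL is not necessarily what the library emits.

```python
import sqlite3

# Hypothetical table with one date column to bucket by month.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, created_at TEXT)")
dates = [
    "2024-01-05", "2024-01-20", "2024-01-28",
    "2024-02-02", "2024-02-14",
    "2024-03-09",
    "2024-04-01", "2024-04-11", "2024-04-19", "2024-04-30",
]
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    list(enumerate(dates, start=1)),
)

# Truncate each date to its month, count records per month,
# sort by the count (largest first), and keep three buckets.
top_buckets = conn.execute(
    """
    SELECT strftime('%Y-%m', created_at) AS month, COUNT(*) AS n
    FROM orders
    GROUP BY month
    ORDER BY n DESC, month ASC
    LIMIT 3
    """
).fetchall()

for month, n in top_buckets:
    print(month, n)

conn.close()
```

March has only one record in this sample, so it falls outside the top 3.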

:::{dropdown} Code so far

```{literalinclude} first_analysis_1.py
:linenos:
```

:::

## Modeling & referencing properties

Our queries work well, but they aren't very reusable. Any time somebody needs
to reference your columns, they need to find the physical name of the column
in the database table, which could change and force a cascade of updates.
Similarly, our measure may become more complex, for example to account for
business logic about double counting. If that logic were spread across many
queries, you would have to update it in many places.

What we want to do is have a layer between the raw expressions and the
semantics of the model. We'll use the model as a centralized, shared definition
of what's interesting about the table.

For this tutorial, we'll just attach attributes and measures.

- **Attributes** are expressions which are a property of _an individual record_.
- **Measures** are expressions which are a property of _a group of records_.
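
The distinction can be illustrated without any query library at all. In the plain-Python sketch below, `line_total` plays the role of an attribute and `revenue` the role of a measure; both names and the sample records are hypothetical and are not part of the library's API.

```python
# Hypothetical records standing in for rows of a table.
records = [
    {"price": 10.0, "quantity": 2},
    {"price": 4.0, "quantity": 5},
    {"price": 7.5, "quantity": 1},
]


def line_total(record):
    """Attribute-like: computed from one record at a time."""
    return record["price"] * record["quantity"]


def revenue(group):
    """Measure-like: computed from a whole group of records."""
    return sum(line_total(r) for r in group)


# An attribute yields one value per record...
line_totals = [line_total(r) for r in records]
# ...while a measure yields one value per group.
total_revenue = revenue(records)

print(line_totals)
print(total_revenue)
```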

You can attach attributes and measures onto a model using
{py:func}`Model.with_attribute` and {py:func}`Model.with_measure` respectively.
We'll attach these to our base model, so any analysis using this table can
reuse them.

```{literalinclude} first_analysis_2.py
:start-at: table_model =
:lines: -6
:lineno-match:
```

We can then update our references in our sub-model to use the new definitions.
In Hashquery, we reference properties on models using the `_` operator.
The `_` operator [is covered in more detail here](/concept_explanations/keypaths.md).
For now, imagine `_` as a magic reference to "the model I am querying".

```{literalinclude} first_analysis_2.py
:start-at: top_counts_model =
:lines: -8
:lineno-match:
```

Now our sub-model query will automatically adjust if we change the definition
for `my_attribute` or `my_measure`. In addition, `table_model` now has more
metadata about what's interesting about the table, which allows tools in
Hashboard to offer better UIs for non-technical consumers of your data.
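
To see why a layer of named definitions makes queries resilient, consider the toy sketch below. It is plain Python, not the library's implementation: the `definitions` dictionary and the `count_by` helper are hypothetical stand-ins for a model's attributes and for a query that references them by name.

```python
from collections import Counter

# Logical names mapped to expressions over a raw record.
definitions = {"status": lambda record: record["st"]}


def count_by(records, attribute_name):
    """Count records per value of a named attribute, resolved at call time."""
    attribute = definitions[attribute_name]
    return Counter(attribute(r) for r in records)


old_rows = [{"st": "open"}, {"st": "closed"}, {"st": "open"}]
before = count_by(old_rows, "status")

# The physical column is renamed; only the shared definition changes.
definitions["status"] = lambda record: record["status_code"]
new_rows = [
    {"status_code": "open"},
    {"status_code": "closed"},
    {"status_code": "open"},
]
after = count_by(new_rows, "status")

# The query itself (count_by(..., "status")) never had to change.
print(before == after)
```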

:::{dropdown} Final code

```{literalinclude} first_analysis_2.py
:linenos:
```

:::

## Next steps

You can learn about the core concepts and principles of Hashquery under the
**Concepts** sidebar. For further examples, check out **Common Patterns**. API
documentation can be found under **API Reference**.

Have fun querying! [Please let us know if you have any feedback, or
encounter any issues](/project_info/feedback.md).
