Metadata-Version: 2.1
Name: yahi
Version: 0.2.3
Summary: Versatile log parser
Home-page: https://github.com/jul/yahi
Author: Julien Tayon, Stephane Bard
Author-email: julien@tayon.net
License: PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
        --------------------------------------------
        
        1. This LICENSE AGREEMENT is between the Python Software Foundation
        ("PSF"), and the Individual or Organization ("Licensee") accessing and
        otherwise using this software ("Python") in source or binary form and
        its associated documentation.
        
        2. Subject to the terms and conditions of this License Agreement, PSF
        hereby grants Licensee a nonexclusive, royalty-free, world-wide
        license to reproduce, analyze, test, perform and/or display publicly,
        prepare derivative works, distribute, and otherwise use Python
        alone or in any derivative version, provided, however, that PSF's
        License Agreement and PSF's notice of copyright, i.e., "Copyright (c)
        2001, 2002, 2003, 2004, 2005, 2006 Python Software Foundation; All Rights
        Reserved" are retained in Python alone or in any derivative version
        prepared by Licensee.
        
        3. In the event Licensee prepares a derivative work that is based on
        or incorporates Python or any part thereof, and wants to make
        the derivative work available to others as provided herein, then
        Licensee hereby agrees to include in any such work a brief summary of
        the changes made to Python.
        
        4. PSF is making Python available to Licensee on an "AS IS"
        basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
        IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
        DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
        FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
        INFRINGE ANY THIRD PARTY RIGHTS.
        
        5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
        FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
        A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
        OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
        
        6. This License Agreement will automatically terminate upon a material
        breach of its terms and conditions.
        
        7. Nothing in this License Agreement shall be deemed to create any
        relationship of agency, partnership, or joint venture between PSF and
        Licensee. This License Agreement does not grant permission to use PSF
        trademarks or trade name in a trademark sense to endorse or promote
        products or services of Licensee, or any third party.
        
        8. By copying, installing or otherwise using Python, Licensee
        agrees to be bound by the terms and conditions of this License
        Agreement.
        
        BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0
        -------------------------------------------
        
        BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
        
        1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
        office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
        Individual or Organization ("Licensee") accessing and otherwise using
        this software in source or binary form and its associated
        documentation ("the Software").
        
        2. Subject to the terms and conditions of this BeOpen Python License
        Agreement, BeOpen hereby grants Licensee a non-exclusive,
        royalty-free, world-wide license to reproduce, analyze, test, perform
        and/or display publicly, prepare derivative works, distribute, and
        otherwise use the Software alone or in any derivative version,
        provided, however, that the BeOpen Python License is retained in the
        Software, alone or in any derivative version prepared by Licensee.
        
        3. BeOpen is making the Software available to Licensee on an "AS IS"
        basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
        IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
        DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
        FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
        INFRINGE ANY THIRD PARTY RIGHTS.
        
        4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
        SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
        AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
        DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
        
        5. This License Agreement will automatically terminate upon a material
        breach of its terms and conditions.
        
        6. This License Agreement shall be governed by and interpreted in all
        respects by the law of the State of California, excluding conflict of
        law provisions. Nothing in this License Agreement shall be deemed to
        create any relationship of agency, partnership, or joint venture
        between BeOpen and Licensee. This License Agreement does not grant
        permission to use BeOpen trademarks or trade names in a trademark
        sense to endorse or promote products or services of Licensee, or any
        third party. As an exception, the "BeOpen Python" logos available at
        http://www.pythonlabs.com/logos.html may be used according to the
        permissions granted on that web page.
        
        7. By copying, installing or otherwise using the software, Licensee
        agrees to be bound by the terms and conditions of this License
        Agreement.
        
        CNRI OPEN SOURCE LICENSE AGREEMENT (for Python 1.6b1)
        --------------------------------------------------
        
        IMPORTANT: PLEASE READ THE FOLLOWING AGREEMENT CAREFULLY.
        
        BY CLICKING ON "ACCEPT" WHERE INDICATED BELOW, OR BY COPYING,
        INSTALLING OR OTHERWISE USING PYTHON 1.6, beta 1 SOFTWARE, YOU ARE
        DEEMED TO HAVE AGREED TO THE TERMS AND CONDITIONS OF THIS LICENSE
        AGREEMENT.
        
        1. This LICENSE AGREEMENT is between the Corporation for National
        Research Initiatives, having an office at 1895 Preston White Drive,
        Reston, VA 20191 ("CNRI"), and the Individual or Organization
        ("Licensee") accessing and otherwise using Python 1.6, beta 1
        software in source or binary form and its associated documentation,
        as released at the www.python.org Internet site on August 4, 2000
        ("Python 1.6b1").
        
        2. Subject to the terms and conditions of this License Agreement, CNRI
        hereby grants Licensee a non-exclusive, royalty-free, world-wide
        license to reproduce, analyze, test, perform and/or display
        publicly, prepare derivative works, distribute, and otherwise use
        Python 1.6b1 alone or in any derivative version, provided, however,
        that CNRIs License Agreement is retained in Python 1.6b1, alone or
        in any derivative version prepared by Licensee.
        
        Alternately, in lieu of CNRIs License Agreement, Licensee may
        substitute the following text (omitting the quotes): "Python 1.6,
        beta 1, is made available subject to the terms and conditions in
        CNRIs License Agreement. This Agreement may be located on the
        Internet using the following unique, persistent identifier (known
        as a handle): 1895.22/1011. This Agreement may also be obtained
        from a proxy server on the Internet using the
        URL:http://hdl.handle.net/1895.22/1011".
        
        3. In the event Licensee prepares a derivative work that is based on
        or incorporates Python 1.6b1 or any part thereof, and wants to make
        the derivative work available to the public as provided herein,
        then Licensee hereby agrees to indicate in any such work the nature
        of the modifications made to Python 1.6b1.
        
        4. CNRI is making Python 1.6b1 available to Licensee on an "AS IS"
        basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
        IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
        DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR
        FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6b1
        WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
        
        5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
        SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR
        LOSS AS A RESULT OF USING, MODIFYING OR DISTRIBUTING PYTHON 1.6b1,
        OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY
        THEREOF.
        
        6. This License Agreement will automatically terminate upon a material
        breach of its terms and conditions.
        
        7. This License Agreement shall be governed by and interpreted in all
        respects by the law of the State of Virginia, excluding conflict of
        law provisions. Nothing in this License Agreement shall be deemed
        to create any relationship of agency, partnership, or joint venture
        between CNRI and Licensee. This License Agreement does not grant
        permission to use CNRI trademarks or trade name in a trademark
        sense to endorse or promote products or services of Licensee, or
        any third party.
        
        8. By clicking on the "ACCEPT" button where indicated, or by copying,
        installing or otherwise using Python 1.6b1, Licensee agrees to be
        bound by the terms and conditions of this License Agreement.
        
        ACCEPT
        
        CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2
        --------------------------------------------------
        
        Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
        The Netherlands. All rights reserved.
        
        Permission to use, copy, modify, and distribute this software and its
        documentation for any purpose and without fee is hereby granted,
        provided that the above copyright notice appear in all copies and that
        both that copyright notice and this permission notice appear in
        supporting documentation, and that the name of Stichting Mathematisch
        Centrum or CWI not be used in advertising or publicity pertaining to
        distribution of the software without specific, written prior
        permission.
        
        STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
        THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
        FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
        FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
        WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
        ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
        OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
        
Keywords: log,parsing
Classifier: License :: OSI Approved :: Python Software Foundation License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: archery>=0.1
Requires-Dist: pygeoip
Requires-Dist: httpagentparser
Requires-Dist: repoze.lru>=0.6

# Versatile log parser

- source: https://github.com/jul/yahi
- doc: http://yahi.readthedocs.org/
- ticketing: https://github.com/jul/yahi/issues


# Synopsis

Given a regexp describing a log line, yahi lets you quickly build
aggregation statistics with very little code, and generates an all-in-one web page holding every visualisation and the data (the page requires JavaScript and has some dependencies).
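To make the idea concrete, here is a minimal standard-library sketch of regexp-driven aggregation. It does not use yahi; the pattern, the group names and the `aggregate` helper are assumptions made for illustration and are not necessarily what yahi ships.

```python
import re
from collections import Counter

# Hypothetical pattern for the Apache/nginx "combined" format; the group
# names are illustrative and not necessarily the ones yahi uses.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<datetime>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def aggregate(lines):
    """Count hits per status code and sum the bytes sent per IP."""
    hits_by_status = Counter()
    bytes_by_ip = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the pattern
        fields = match.groupdict()
        hits_by_status[fields["status"]] += 1
        # A "-" in the size field means no body was sent.
        sent = 0 if fields["bytes"] == "-" else int(fields["bytes"])
        bytes_by_ip[fields["ip"]] += sent
    return hits_by_status, bytes_by_ip


sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 -',
    '198.51.100.2 - - [10/Oct/2023:14:01:02 +0000] "GET /index.html HTTP/1.1" 200 2326',
]
hits, volume = aggregate(sample)
print(dict(hits))
print(dict(volume))
```

Named groups turn each line into a dictionary of fields, which is the shape the aggregation code then works on.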


The library comes with *speed_shoot*, a script that aggregates various data from logs in
common formats (Apache, nginx).

It also comes with *yahi_all_in_one_maker*, a script that generates the all-in-one view.

A [demo is available here](https://jul.github.io/cv/demo.html?route=chrono#hour_hit).

# Installation


```
    pip install yahi
```

# Quickstart

First you need a GeoIP database in legacy format:
```
    mkdir ~/.yahi
    wget -O- https://mailfud.org/geoip-legacy/GeoIP.dat.gz | \
        zcat > ~/.yahi/GeoIP.dat
    wget -O- https://mailfud.org/geoip-legacy/GeoIPv6.dat.gz | \
        zcat > ~/.yahi/GeoIPv6.dat
```
Thanks to [mailfud](http://mailfud.org) for keeping these legacy databases available.


Simplest usage is:
```
    speed_shoot -g ~/.yahi /var/www/apache/access*log* > data.js
```

Gzipped log files are read automatically.
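One common way to get this behaviour is to look at the first two bytes of the file, since gzip streams start with the magic number `1f 8b`. The sketch below illustrates that technique with the standard library only; `open_log` is a hypothetical helper and this is not claimed to be how yahi does it.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"


def open_log(path):
    """Open a log file as text, decompressing it when it is gzipped."""
    with open(path, "rb") as probe:
        is_gzipped = probe.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzipped else open
    return opener(path, "rt", encoding="utf-8", errors="replace")


# Demonstration on two temporary files, one plain and one gzipped.
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "access.log")
gzipped_path = os.path.join(workdir, "access.log.1.gz")
with open(plain_path, "w", encoding="utf-8") as handle:
    handle.write("plain line\n")
with gzip.open(gzipped_path, "wt", encoding="utf-8") as handle:
    handle.write("gzipped line\n")

for path in (plain_path, gzipped_path):
    with open_log(path) as handle:
        print(handle.read().strip())
```

Checking the content rather than the file extension also copes with rotated logs whose names do not end in `.gz`.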

And then:
```
    yahi_all_in_one_maker data.js
```

This creates an *all-in-one* HTML page, with all JS, CSS and data included, that offers a multi-route view.
The page relies on several external libraries: D3.js (charting), jQuery and the Google JS API (geo chart).
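The principle of inlining data in a single page can be sketched as follows. This is a simplified illustration, not the template used by *yahi_all_in_one_maker*: the `PAGE` template and the `embed` function are assumptions. Since log fields such as URLs and referers are attacker-controlled, the sketch escapes `<` before inlining the JSON.

```python
import html
import json

PAGE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<script>var data = {payload};</script>
</body>
</html>
"""


def embed(data, title="report"):
    """Return a standalone HTML page with `data` inlined as a JS literal."""
    payload = json.dumps(data)
    # Escape "<" so that a value such as "</script>" coming from a log line
    # cannot close the script element early.
    payload = payload.replace("<", "\\u003c")
    return PAGE.format(title=html.escape(title), payload=payload)


page = embed({"by_url": {"/</script><b>x": 1}})
print(page)
```

The JavaScript engine decodes `\u003c` back to `<` inside the string, so the data is unchanged once loaded.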

# Screenshots

## Time series
<img src="https://raw.githubusercontent.com/jul/yahi/refs/heads/master/docs/source/img/chrono.png">

## Histograms

<img src="https://raw.githubusercontent.com/jul/yahi/refs/heads/master/docs/source/img/histo.png">

## Geographic map

<img src="https://raw.githubusercontent.com/jul/yahi/refs/heads/master/docs/source/img/geo.png">

## Raw data

<img src="https://raw.githubusercontent.com/jul/yahi/refs/heads/master/docs/source/img/raw.png">



# Use as a script

*speed_shoot* is in fact a template showing how to use yahi as a module:

```python
#!/usr/bin/env python
from archery import mdict
from yahi import notch, shoot

# notch prepares the aggregation context.
context = notch()


def date_formater(dt):
    # Zero-padded dates (YYYY-MM-DD) sort correctly as strings.
    return dt.strftime("%Y-%m-%d")


context.output(
    # shoot aggregates the mdict built from each parsed log line.
    shoot(
        context,
        lambda data: mdict({
            'by_country': mdict({data['_country']: 1}),
            'date_hit': mdict({date_formater(data['_datetime']): 1}),
            'date_bandwidth': mdict({date_formater(data['_datetime']): int(data["bytes"])}),
            'hour_hit': mdict({data['_datetime'].hour: 1}),
            'hour_bandwidth': mdict({data['_datetime'].hour: int(data["bytes"])}),
            'by_os': mdict({data['_platform_name']: 1}),
            'by_dist': mdict({data['_dist_name']: 1}),
            'by_browser': mdict({data['_browser_name']: 1}),
            'by_bandwidth_by_browser': mdict({data['_browser_name']: int(data["bytes"])}),
            'by_ip': mdict({data['ip']: 1}),
            'by_bandwidth_by_ip': mdict({data['ip']: int(data["bytes"])}),
            'by_status': mdict({data['status']: 1}),
            'by_url': mdict({data['uri']: 1}),
            'by_agent': mdict({data['agent']: 1}),
            'by_referer': mdict({data['referer']: 1}),
            # Nested mdicts give two-level breakdowns.
            'ip_by_url': mdict({data['uri']: mdict({data['ip']: 1})}),
            'bytes_by_ip': mdict({data['ip']: int(data['bytes'])}),
            'date_dayofweek_hit': mdict({data['_datetime'].weekday(): 1}),
            'weekday_browser': mdict({data['_datetime'].weekday():
                mdict({data["_browser_name"]: 1})}),
            'total_line': 1,
        }),
    ),
)
```
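The lambda above returns one small nested mapping per log line, and the per-line mappings are then added together. The plain-Python sketch below illustrates that kind of addition, on the assumption that adding two `mdict` objects merges their keys and sums the values recursively. `add_nested` is a hypothetical stand-in written for this example, not archery's implementation.

```python
def add_nested(left, right):
    """Return a new dict where values under the same key are added."""
    result = dict(left)
    for key, value in right.items():
        if key not in result:
            result[key] = value
        elif isinstance(value, dict):
            # Same key on both sides holding dicts: merge one level down.
            result[key] = add_nested(result[key], value)
        else:
            result[key] = result[key] + value
    return result


# Three records shaped like the output of the lambda above.
first = {"by_status": {"200": 1}, "hour_hit": {13: 1}, "total_line": 1}
second = {"by_status": {"200": 1}, "hour_hit": {14: 1}, "total_line": 1}
third = {"by_status": {"404": 1}, "hour_hit": {14: 1}, "total_line": 1}

total = {}
for record in (first, second, third):
    total = add_nested(total, record)
print(total)
```

Each key of the top-level mapping thus becomes an independent counter or histogram, which is why a new statistic only costs one extra line in the lambda.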

# Naming

Archery is a pun on trait.

[Yahi](https://en.wikipedia.org/wiki/Ishi) is named in remembrance of a Native American tribe that was skilled in
archery, so that somewhere on the net we remember the genocides committed in the
name of civilisation.

Yahi is thus a concrete application of archery to aggregation, based on two
functions:

- notch to prepare your log aggregations
- shoot to actually aggregate


Let's spare a thought for the Native Americans who are still second-class
citizens in their own lands.





# Changelog

## 0.2.3

* fix : missing dates

## 0.2.2

* fix #21 html injections through ref and uri

## 0.2.0

* adds a loader to the all-in-one web page, shown while JS is busy parsing the JSON
* the -g option now takes the DIRECTORY where both GeoIP.dat and GeoIPv6.dat
 are located

## 0.1.22

* fix #18 wrong date formatting resulting in bad date order
* fix #19 create ~/.yahi on startup if not exists
* fixing the template issue the nice way


## 0.1.21

* fix #16 no templates in the package
* fix #17 crashing of the HTML when JSON embedded is too big

## 0.1.19-0.1.20

* wording in README

## 0.1.8

* adding tests in the package so the package does not install if tests don't pass

## 0.1.7

* oopsies: removed needless pictures from the package

## 0.1.6

* adding yahi\_all\_in\_one\_maker to generate the all-in-one HTML file with
visualization from speed\_shoot

## 0.1.5

* preparing a new release that generates all in one html static pages

## 0.1.3

* adding an incomplete varnish regexp for log parsing (2 fields are missing)

## 0.1.1

* bad url for the demo  

## 0.1.0

* it is NEW, seen on TV, and is guaranteed to make you tenfold more desirable.



