# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.7.2] - 2025-11-11

### Added

  * Add sip.utils.is_bag function

### Changed

  * Bump magika to 1.0.1
  * Improve tests
  * Fix .gitlab-ci.yml
   
## [0.7.1] - 2025-11-10

### Changed

  * Add more file names, which should not be treated as datastreams.

## [0.7.0] - 2025-10-16

### Changed
  * Add sip submodule 
  * Add Makefile
  * Add reference (API documentation)
    
## [0.6.1] - 2025-07-23

### Changed

  * Code cleanup
  * Lines in datastreams.csv are again saved sorted by dsid
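
As a rough illustration of what "sorted by dsid" means for the written file, here is a minimal sketch using only the standard library. The `write_sorted` helper, the `title` column and the sample rows are assumptions made for this example, not the project's actual code.

```python
import csv
import io


def write_sorted(rows, out):
    """Write rows as CSV, ordered by their 'dsid' value (hypothetical helper)."""
    # Only two columns are used here; the real datastreams.csv has more.
    writer = csv.DictWriter(out, fieldnames=["dsid", "title"], lineterminator="\n")
    writer.writeheader()
    # Plain string sort: case-sensitive, so uppercase ids come first.
    for row in sorted(rows, key=lambda r: r["dsid"]):
        writer.writerow(row)


# Sample rows in arbitrary order (invented data).
rows = [
    {"dsid": "b.jpg", "title": "Image"},
    {"dsid": "a.xml", "title": "Text"},
    {"dsid": "DC.xml", "title": "Dublin Core"},
]

buffer = io.StringIO()
write_sorted(rows, buffer)
lines = buffer.getvalue().splitlines()
```

A stable row order keeps diffs of the CSV files small when they are kept under version control.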

## [0.6.0] - 2025-07-18

### Changed

  * Major refactoring.
  * CSV handling was heavily rewritten: objectcsv was replaced by two new
    classes: ObjectCSVManager and ObjectCollector
  * mainResource is set automatically for XML files
  * The title for datastreams now contains the type and subtype. This was
    requested by the front-end team.


## [0.5.0] - 2025-02-28

### Changed

   * Breaking change: The subtype property of FormatInfo classes
     is no longer a string, but a formatinfo.SubType enum value.
     This gives more control over supported subtypes and avoids
     mistakes.
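
To show why an enum helps here, the following is a minimal, self-contained sketch. The `SubType` members and the reduced `FormatInfo` dataclass are hypothetical stand-ins written for this example; the real `formatinfo.SubType` may define different members and the real classes have more fields.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SubType(Enum):
    """Hypothetical stand-in for formatinfo.SubType; member names are invented."""

    TEI = "TEI"
    LIDO = "LIDO"


@dataclass
class FormatInfo:
    """Hypothetical, reduced stand-in for a FormatInfo class."""

    mimetype: str
    subtype: Optional[SubType] = None


info = FormatInfo(mimetype="application/xml", subtype=SubType.TEI)

# Old-style string comparisons no longer match an enum value.
matches_string = info.subtype == "TEI"

# Compare against the enum member instead.
is_tei = info.subtype is SubType.TEI

# A misspelled subtype fails early instead of passing silently.
try:
    SubType("TEi")
    typo_accepted = True
except ValueError:
    typo_accepted = False
```

Code that previously compared `subtype` with a string literal would need to switch to the enum member, as in the `is_tei` line.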


## [0.4.3] - 2025-02-20

### Changed

   * get_languages() in create_csv.py no longer supports data from DC.xml,
     because the 'language' of an object says nothing about the language
     of a datastream.

## [0.4.2] - 2025-02-20

### Added

* Add detect_languages to create_csv.py
  * Currently language(s) are taken from DC.xml
  * NLP based detection can be added later

### Removed

* Remove the 'funder' field from datastreams.csv, which was introduced
  in 0.4.1, because it was highly redundant. 'funder' is now only
  supported in object.csv


## [0.4.1] - 2025-02-20

### Changed
* Add a 'funder' field to object.csv and datastreams.csv

## [0.4.0] - 2025-01-30

### Added

* A new subpackage formatdetect for file format identification 

### Changed

* project.toml has 2 new fields in the general table: format_detector and format_detector_url
* Add tags column to datastreams.csv
* Improve auto-detection for some csv columns
* The Configuration object in submodule projectconfiguration now must be obtained by calling
  the (cached) get_configuration() function
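
The "cached accessor" pattern mentioned in the last item can be sketched as follows. This is only an illustration of the general technique with `functools.lru_cache`: the reduced `Configuration` class, the placeholder values and the argument-free signature are assumptions, and the real `get_configuration()` may take parameters and read its values from project.toml.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Configuration:
    """Hypothetical, reduced stand-in for the project's Configuration object."""

    format_detector: str
    format_detector_url: str


@lru_cache(maxsize=1)
def get_configuration() -> Configuration:
    """Build the configuration once; later calls return the cached object."""
    # Placeholder values; a real implementation would load them from a file.
    return Configuration(
        format_detector="example-detector",
        format_detector_url="https://example.org/detect",
    )


first = get_configuration()
second = get_configuration()
same_object = first is second  # both names refer to the one cached instance
```

Callers obtain the object through the function instead of constructing it themselves, so all parts of a program share one configuration instance.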

## [0.3.1] - 2025-01-09

### Changed

* Additional filename patterns for autogenerating the title and description values
* Default titles for media files (images, audio, video): 'Image|Audio|Video: DS ID'



## [0.3.0] - 2024-12-13

### Changed

- manage_csv.py: The collected csv files are now by default created
  and expected in the current working directory and no longer in object_root
- manage_csv.collect_csv_data() now requires the output_dir parameter
- Unify/simplify API
- When creating the csv files, some fields like title and description are
  autogenerated


## [0.2.6] - 2024-12-06

### Added

- CHANGELOG.md
- More tests

### Changed

- Extend pyproject.toml
  
