Each line in index.tsv contains 4 tab-separated columns:
1. filename        : Audio file name (e.g., speaker_emotion_001.flac)
2. speaker         : Speaker name/identifier
3. emotion         : Emotion label (neutral, happy, sad, etc.)
4. text            : Transcription of the utterance

Example (<TAB> stands for a literal tab character):
speaker_happy_001.flac<TAB>speaker<TAB>happy<TAB>Hvað er klukkan?
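
The four-column layout above could be read with a short Python sketch like the one below. It is only an illustration: the names IndexEntry, parse_index_line and read_index are hypothetical, and the code assumes that index.tsv is UTF-8 encoded, has no header row, and never contains a tab inside the text column.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class IndexEntry:
    """One row of index.tsv (hypothetical helper type)."""
    filename: str
    speaker: str
    emotion: str
    text: str


def parse_index_line(line: str) -> IndexEntry:
    """Split one line into its four tab-separated columns."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 tab-separated columns, got {len(fields)}")
    return IndexEntry(*fields)


def read_index(path: str) -> Iterator[IndexEntry]:
    """Yield one IndexEntry per non-empty line of the file at `path`."""
    # Assumption: the file is UTF-8 encoded and has no header row.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_index_line(line)


if __name__ == "__main__":
    entry = parse_index_line(
        "speaker_happy_001.flac\tspeaker\thappy\tHvað er klukkan?\n"
    )
    print(entry.filename, entry.emotion, entry.text)
```

Splitting on the tab character directly, rather than on arbitrary whitespace, keeps spaces inside the transcription intact. A line with a missing or extra column raises ValueError instead of being silently misread.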