I missed a bunch of these reviews, so I will refrain from being elaborate this time. I am also adding a timestamp to every capture from now on so I can clearly tell which things happened in the last week and which happened earlier.
1 Experiments
- I have been thinking about a more programmer-friendly API for emacs-speech-input, as I don't really like the boring FSM-ish conversational framework. Looking at the whole process as a probabilistic parser running on audio streams could work out. Plus, you probably get to design flows using something like parser combinators.
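A minimal sketch of what the parser-combinator idea could look like, written in Python rather than Emacs Lisp for brevity. Everything here is hypothetical and not part of emacs-speech-input: the names `word`, `seq`, `alt` and `best`, and the assumption that the recognizer emits one slot per word with candidate words mapped to probabilities (and that slots are independent, so probabilities multiply).

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Assumed input: one dict per time slot, mapping candidate words to probabilities.
Stream = Sequence[Dict[str, float]]
# A parse result: (parsed values, next position, probability of this parse).
Result = Tuple[List[str], int, float]
Parser = Callable[[Stream, int], List[Result]]


def word(expected: str) -> Parser:
    """Match one slot if `expected` is among its candidates."""
    def parse(stream: Stream, pos: int) -> List[Result]:
        if pos < len(stream) and expected in stream[pos]:
            return [([expected], pos + 1, stream[pos][expected])]
        return []
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another, multiplying probabilities."""
    def parse(stream: Stream, pos: int) -> List[Result]:
        results: List[Result] = [([], pos, 1.0)]
        for parser in parsers:
            results = [
                (values + new_values, new_pos, prob * new_prob)
                for values, cur, prob in results
                for new_values, new_pos, new_prob in parser(stream, cur)
            ]
        return results
    return parse


def alt(*parsers: Parser) -> Parser:
    """Try every alternative and keep all resulting parses."""
    def parse(stream: Stream, pos: int) -> List[Result]:
        return [result for parser in parsers for result in parser(stream, pos)]
    return parse


def best(parser: Parser, stream: Stream) -> Optional[Result]:
    """Most probable parse that consumes the whole stream, or None."""
    complete = [r for r in parser(stream, 0) if r[1] == len(stream)]
    return max(complete, key=lambda r: r[2], default=None)


# A made-up two-slot stream and a tiny "flow" built from combinators.
stream = [{"set": 0.75, "said": 0.25}, {"timer": 0.5, "time": 0.5}]
command = alt(
    seq(word("set"), word("timer")),
    seq(word("said"), word("time")),
)
result = best(command, stream)
```

Here `result` holds the parse for "set timer", since its probability (0.75 × 0.5) beats that of "said time" (0.25 × 0.5). A real version would have to work on partial audio and rank hypotheses incrementally, which this sketch ignores.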
2 Readings/Explorations
- A few readings on speaker diarization and connected topics:
  - Speaker diarization with LSTM (wang2018speaker)
  - Fully Supervised Speaker Diarization (zhang2018fully)
  - Utterance-level Aggregation for Speaker Recognition in the Wild (xie2019utterance)
- How Much of a Genius-Level Move Was Using Binary Space Partitioning in Doom? The first post from Two-Bit History in a long time.
3 Programming
4 Media
- Finished Sandman.
- Small Things Considered - This American Life
- Making Sense with Sam Harris: #164 - Cause & Effect. I have been meaning to dive into Pearl's ideas but haven't really found the time. This specific episode wasn't that helpful in that sense.
Bibliography
- [wang2018speaker] Wang, Downey, Wan, Mansfield & Moreno. 2018. "Speaker diarization with LSTM", 5239-5243, in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
- [zhang2018fully] Zhang, Wang, Zhu, Paisley & Wang. 2018. "Fully Supervised Speaker Diarization", arXiv preprint arXiv:1810.04719.
- [xie2019utterance] Xie, Nagrani, Chung & Zisserman. 2019. "Utterance-level Aggregation for Speaker Recognition in the Wild", 5791-5795, in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).