- Took an item from my lists and wrote a post after a long time: Blurring Text. Will try to do this on a regular basis.
- BlinkDB: queries with bounded errors and bounded response times on very large data (agarwal2013blinkdb)
- Went through a few subword papers: BPEmb (heinzerling2017bpemb) and SentencePiece (kudo2018sentencepiece).
- Were Nazis Drug-Fueled Crankheads? | Stuff You Should Know. I kind of want to try Blitzed now.
- Kurt Vonnegut, Shape of Stories (subtitulos castellano)
- Walden by Henry David Thoreau
- [agarwal2013blinkdb] Agarwal, Mozafari, Panda, Milner, Madden & Stoica. 2013. "BlinkDB: queries with bounded errors and bounded response times on very large data", 29-42, in: Proceedings of the 8th ACM European Conference on Computer Systems.
- [heinzerling2017bpemb] Heinzerling & Strube. 2017. "BPEmb: Tokenization-free pre-trained subword embeddings in 275 languages." arXiv preprint arXiv:1710.02187.
- [kudo2018sentencepiece] Kudo & Richardson. 2018. "SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing." arXiv preprint arXiv:1808.06226.