Radim: please consider incorporating this into gensim. It really is superior to simpler classification models running on top of word/BPE/wordpiece embeddings and to classic machine learning algorithms used for text classification and topic modeling like HDP, LDA, LSI/LSA, etc. (You can see for yourself how well this works out-of-the-box with a simple exercise: grab a pretrained model from fast.ai, run a bunch of documents through it, saving the last hidden-layer representation of each document as you go, and then map these representations to a two-dimensional plot with, say, t-SNE.)
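For anyone who wants to try that exercise, here is a rough sketch of the second half of it (stacking one vector per document and projecting to 2-D with scikit-learn's t-SNE). Everything named here is my own assumption: `toy_encode` is a hypothetical stand-in (a hashed bag of words, not a language model) so the snippet runs on its own, and `embed_documents` / `project_2d` are just illustrative helper names. In practice you would swap `toy_encode` for a function that returns the pretrained model's last hidden-layer vector for a document.

```python
import zlib

import numpy as np
from sklearn.manifold import TSNE


def toy_encode(doc, dim=32):
    # Hypothetical stand-in encoder: hashed bag-of-words counts.
    # Replace with a function returning the model's last hidden state.
    vec = np.zeros(dim, dtype=np.float32)
    for tok in doc.lower().split():
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    return vec


def embed_documents(docs, encode):
    # One fixed-size vector per document, stacked into an (n_docs, dim) array.
    return np.vstack([np.asarray(encode(d), dtype=np.float32) for d in docs])


def project_2d(vectors, perplexity=5.0, seed=0):
    # t-SNE requires perplexity to be smaller than the number of samples.
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="random", random_state=seed)
    return tsne.fit_transform(vectors)


docs = [
    "the striker scored a late goal",
    "the goalkeeper saved the penalty",
    "the team won the league title",
    "the coach praised the defenders",
    "the central bank raised interest rates",
    "inflation slowed in the last quarter",
    "the stock market fell sharply today",
    "investors bought government bonds",
    "the telescope imaged a distant galaxy",
    "astronomers detected a new exoplanet",
    "the probe entered orbit around mars",
    "the rocket launched a weather satellite",
]

coords = project_2d(embed_documents(docs, toy_encode))
print(coords.shape)
```

From there, a scatter plot of `coords[:, 0]` against `coords[:, 1]`, colored by whatever labels you have, is the picture I mean. With the toy encoder the layout won't be meaningful; the point of the exercise is to see what it looks like with real hidden-layer representations.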
I realize that outside of Silicon Valley and other technology centers, most established companies are far -- far -- from adopting deep learning for any application of importance, due partly to the current unavailability of developers with AI expertise, and partly to deep learning's so-called "unexplainability" (i.e., the inability of many corporate executives and machine learning practitioners to reason about it, and their resulting discomfort with it). But it's only a matter of time before Corporate America starts following the lead of companies like Google and Facebook, which today are aggressively using state-of-the-art AI in lots of important applications.
Why not get ahead of this multi-decade trend?
PS. For those who don't know, Radim is the creator of gensim, a popular, friendly Python library for text classification and topic modeling.[a]
[a] https://radimrehurek.com/gensim | https://github.com/RaRe-Technologies/gensim