On the Power of Pre-Trained Text Representations: Models and Applications in Text Mining