Should you skip RNNs, LSTMs, and GRUs and go straight to learning attention and Transformers? And what about CNNs?