LSI-based semantic characterisation for automated text categorisation