Text representation using a canonical data model