Deep learning framework for hierarchical-based object identification and description