Feature Extraction from Text

nlp
feature_engineering
python
#1

Hi, we are trying to build a binary classification model. We have a payment reference column that comes from CAMT statements (Bank to Customer Statements), and we want to classify each transaction as INTERNAL (a transaction within the organisation) or EXTERNAL (an external customer is involved).

I would like advice or ideas on how to extract meaningful features from the payment reference.

Sample reference : /PT/DE/EI/NEFT IN UTR CITIN17 FROM ********** P HSB CN95TXN REF NO ACCNEFTRECSENDREF:F*****2

#2

Hi @hemashekarsantosh

I’d suggest you first tokenize the sample reference and then create bag-of-words features from the tokens. You can then use those features to train your classifier.
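
To make the suggestion concrete, here is a minimal sketch of tokenization followed by bag-of-words counting, using only the Python standard library. The function names (`tokenize`, `build_vocabulary`, `bag_of_words`), the choice to lowercase and split on non-alphanumeric characters, and the second reference string are all assumptions made for illustration; they are not from the original post. In practice a library vectorizer such as scikit-learn's `CountVectorizer` covers the same ground.

```python
import re
from collections import Counter


def tokenize(reference):
    """Lowercase the reference and split it into alphanumeric tokens.

    Separators such as '/', ':' and the masking '*' characters are dropped.
    """
    return re.findall(r"[a-z0-9]+", reference.lower())


def build_vocabulary(references):
    """Map every distinct token in the corpus to a column index (sorted order)."""
    tokens = sorted({tok for ref in references for tok in tokenize(ref)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bag_of_words(reference, vocabulary):
    """Return a count vector for one reference; unknown tokens are ignored."""
    counts = Counter(tokenize(reference))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# The first string is the sample from the question; the second is a
# made-up reference added only so the vocabulary has more than one row.
references = [
    "/PT/DE/EI/NEFT IN UTR CITIN17 FROM ********** P HSB CN95TXN "
    "REF NO ACCNEFTRECSENDREF:F*****2",
    "/PT/DE/EI/INTERNAL TRANSFER REF NO 12345",
]

vocabulary = build_vocabulary(references)
features = [bag_of_words(ref, vocabulary) for ref in references]
```

Each row of `features` is a fixed-length count vector that can be passed, together with the INTERNAL/EXTERNAL labels, to a classifier. Tokens that look like unique identifiers (for example transaction numbers) may add noise, so it can be worth experimenting with dropping or normalizing them.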

Refer to the article below to learn more about feature extraction from text.