What is the best way to vectorize a dataset that has both text and numeric features?

machine_learning
data_science
feature_engineering

#1

Let’s say I have a column with reviews, a column with categories, and a column with ratings.
How do I vectorize these three so that they are on the same scale and stay comparable to one another?
I understand I can use BoW or TF-IDF for the text features, but how do I combine them with the category and rating columns to form a training matrix?


#2

I am new to this domain, so if there is any scope for improvement in my answer, please let me know.
My answer is:
Use Euclidean distance for the ratings and normalize them accordingly before combining them with the other vectors (or dimensions of the same vector), that is, review and category.
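
A minimal sketch of the normalization step, assuming plain min-max scaling is what is wanted here (that choice, the helper name `min_max_scale`, and the sample ratings are all assumptions for illustration):

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical ratings on a 1-5 scale.
ratings = [1, 3, 4, 5]
scaled = min_max_scale(ratings)
print(scaled)  # [0.0, 0.5, 0.75, 1.0]
```

After this, the rating column lies in the same [0, 1] range as 0/1 category indicators, so no single column dominates just because of its units.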


#3

I did not get what you mean by using "Euclidean distance for ratings".
It should be computed over the entire row vector, right?


#4

Make a column for each category and give each record a value of 1 or 0 accordingly. If you don’t want these categories to bias your model too much, you can transform them to lower values, e.g. with a log transform such as log(1 + x) (a plain log would not work, since log(0) is undefined).
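
One way this could look in practice is sketched below, assuming pandas and scikit-learn are available. The toy data, the column names (`review`, `category`, `rating`), and the choice of TF-IDF plus min-max scaling are assumptions for illustration, not the only reasonable setup:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data; the column names are assumptions.
df = pd.DataFrame({
    "review": [
        "great phone, fast delivery",
        "poor battery life",
        "great battery, poor screen",
    ],
    "category": ["electronics", "electronics", "accessories"],
    "rating": [5, 1, 3],
})

features = ColumnTransformer([
    # TfidfVectorizer expects a 1-D sequence of strings, so the column
    # is selected with a plain string rather than a list.
    ("text", TfidfVectorizer(), "review"),
    # One 0/1 column per category value.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
    # Rescale ratings to [0, 1] so they sit on the same scale as the rest.
    ("num", MinMaxScaler(), ["rating"]),
])

# One row per record; the TF-IDF, one-hot, and rating columns are
# stacked side by side into a single training matrix.
X = features.fit_transform(df)
print(X.shape)
```

With default settings the TF-IDF rows are L2-normalized, so every column in `X` ends up between 0 and 1, which is what keeps the three kinds of features comparable.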