Recently, I started a short-term internship as a Data Scientist at a logistics company in Paris (it handles the shipment and transportation of mail from source to destination all over the world).
They gave me a data set in the form of Excel sheets, related to the company’s Distribution department. It consists of two Excel files (worksheets) with the following attributes (columns):
Excel file 1: Decision Criteria
Total number of rows: 2078
Excel file 2: Distribution Cost
Logistics Provider Id
Product Name Id
Cost Per Kg
Total number of rows: 6760
The task is to integrate or merge both files into one, so that anyone who wants to know the cost (which is in Excel file 2) of a certain shipment (shipment details are in Excel file 1) can obtain it from the merged result, i.e. the cost of the shipment.
The problems I am facing:
The two files are not the same size (2078 rows versus 6760 rows).
If I simply copy and paste the columns from Excel file 2 into Excel file 1, the costs would not necessarily line up with the right shipments, so the result would not be optimized.
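To make the problem concrete, here is a minimal sketch of the kind of key-based lookup I think is needed instead of copy and paste. It is only an illustration under assumptions: I am assuming that Excel file 1 also contains a Logistics Provider Id, a Product Name Id, and a weight for each shipment, and every name in the code (`shipment_id`, `provider_id`, `product_id`, `weight_kg`, `attach_costs`) as well as the sample rows are hypothetical, not taken from the real files.

```python
# Hypothetical sample rows standing in for the two Excel files.
# The columns assumed for file 1 (shipments) are a guess, not the real schema.
shipments = [
    {"shipment_id": 1, "provider_id": "P1", "product_id": "A", "weight_kg": 2.0},
    {"shipment_id": 2, "provider_id": "P2", "product_id": "A", "weight_kg": 5.0},
    {"shipment_id": 3, "provider_id": "P9", "product_id": "Z", "weight_kg": 1.0},
]
costs = [
    {"provider_id": "P1", "product_id": "A", "cost_per_kg": 3.0},
    {"provider_id": "P2", "product_id": "A", "cost_per_kg": 2.5},
]


def attach_costs(shipments, costs):
    """Left join: keep every shipment and look up its cost per kg
    by the (provider, product) key pair."""
    # Build a lookup table from the cost file, keyed on both id columns.
    cost_by_key = {
        (c["provider_id"], c["product_id"]): c["cost_per_kg"] for c in costs
    }
    merged = []
    for s in shipments:
        rate = cost_by_key.get((s["provider_id"], s["product_id"]))
        # A shipment with no matching cost row keeps None instead of a wrong value.
        total = None if rate is None else rate * s["weight_kg"]
        merged.append({**s, "cost_per_kg": rate, "shipment_cost": total})
    return merged


merged = attach_costs(shipments, costs)
for row in merged:
    print(row["shipment_id"], row["cost_per_kg"], row["shipment_cost"])
```

With a lookup like this the different row counts would not matter, because rows are matched on shared key columns rather than on their position in the sheet. As far as I understand, pandas offers the same idea through `DataFrame.merge` with `how="left"`, but I am not sure which columns in my files should serve as the keys.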
I am new to the field of Data Science.
Kindly help me by giving some guidelines so that I won't lose my confidence to become a Data Scientist like you. I would be very thankful to you.
Please feel free to ask me any questions related to this problem. I can personally provide all the information and resources that I have.
An aspiring Data Scientist