Merging two Excel files of different size into one

sheet
excel
data_wrangling
data_science

#1

Hello Everyone,

Recently, I started my short term internship in a Logistics company (it involves shipment and transportation of mails from source to destination all over the world) as a Data Scientist in Paris.

They gave me a Data set in the form of Excel sheet, related to company’s Distribution department. It has two Excel files (worksheets) with the following attributes (columns),

Excel file 1: Decision Criteria

Distribution Country
Network
IMPC Codes
Channel
Distribution Partner
Service Level
Transaction Product
Transaction Format
Product Name
Format

Total number of rows: 2078

Excel file 2: Distribution Cost
Destination Country
Logistics Provider Id
Product Name Id
FormatId
Currency
WeightFrom
WeightTo
CostsPerItem
Cost Per Kg

Total number of rows: 6760

The task is to integrate or merge both the files into one so that if someone want to know the Cost (which is in Excel file 2) of a certain shipment (shipments details in Excel file 1) can obtain an optimized result in the end, i;e the Cost of the shipment.

The problem I am facing;

  • The size of both the files are not same.

  • If I simply copy and paste the columns from Excel file 2 to Excel file 1 the result would not be optimized.

  • I am new in this field of Data Science.

Kindly help me with this by giving some guidelines so that I wont lose my confidence to become a Data Scientist like you. I would be thankful of you.

Please feel free to ask me any question related this problem. I can personally provide all the information and resources that I have.

Thank you.
An aspiring Data Scientist


#2

Hi @abhijeet_tiwari,

First of all, in order to merge the two datasets, you must have some column in common. For instance, If you have a column ‘Distribution Country’ in both the sheets, we would be able to merge the two csv files. Let me explain my point with a simple example -

If I have one file as -

ID  NAME  AGE   GENDER 
 1    A   15      M
 2    B   16      M
 3    C   15      F
 4    D   14      M

and another file-

ID  SUBJECT     MARKS 
 1    ENGLISH     45
 1    MATHS       48
 1    SCIENCE     40
 2    MATHS       39
 2    SCIENCE     35
 3    ENGLISH     44
 3    SCIENCE     30
 4    MATHS       47

I can merge the two files based on ID.


#3

Hello @AishwaryaSingh ,

Thank you so much for your reply and explaining in a very simple way with the help of example.

Here, I have two columns, Service Level and Network common in both sheets. I will surely consider this approach.
:slightly_smiling_face: