How can I remove duplicate values from a data set in SAS?

sas
data_wrangling
duplicates

#1

Hi,

My data set contains many duplicate values. I want to remove the duplicates, but keep a count of how many times each unique value occurred.

Example data Set:

ID
A001
A002
A001
A003
A002
A001
A002
A003
A001
A003

Required Output Data Set:

ID              Count
A001              4 
A002              3
A003              3

Please help me write code to do this using a data step.

Regards,
Ravi


#2

Hi Ravi,

Please find the code below -

/* Add a counter variable equal to 1 for every observation */
data x;
format count 10.;
set y;
count = 1;
run;

/* Sum the counter within each ID; drop the automatic _TYPE_ and _FREQ_ variables */
proc summary data=x nway noprint missing;
class id;
var count;
output out=z(drop=_type_ _freq_) sum=;
run;

Note that the variables in the drop= option are _type_ and _freq_, each with a leading and trailing underscore.
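Since the question asks for a data step, the same counts can probably also be produced with BY-group processing. This is only a sketch: it assumes the input data set is called y with a character variable ID, and the names y_sorted and id_counts are made up for illustration.

```sas
/* Sample data from the question (the data set name y is an assumption) */
data y;
  input ID $;
  datalines;
A001
A002
A001
A003
A002
A001
A002
A003
A001
A003
;
run;

/* BY-group processing requires the data to be sorted by ID */
proc sort data=y out=y_sorted;
  by id;
run;

data id_counts;
  set y_sorted;
  by id;
  if first.id then Count = 0;  /* reset the counter at the start of each ID group */
  Count + 1;                   /* sum statement: Count is retained across rows */
  if last.id then output;      /* write one row per unique ID */
run;
```

With the sample data this should leave one row per ID in id_counts, with Count holding the number of occurrences.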
Hope this helps.

Regards,
Aayush


#3

Ravi,
The easiest way to remove duplicate observations is to use the NODUPKEY option with PROC SORT.

proc sort data=x nodupkey; by y; run;  /* x = data set, y = key variable (ID in your example) */
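NODUPKEY on its own only discards the repeated rows, so it does not give the number of occurrences. If the counts are needed too, one option is PROC FREQ with an output data set. A minimal sketch, assuming the input data set is named y with a variable ID (the output name id_counts is hypothetical):

```sas
/* One row per distinct ID; the OUT= data set contains ID, COUNT and PERCENT */
proc freq data=y noprint;
  tables id / out=id_counts(drop=percent);
run;
```

The resulting data set has one observation per unique ID, with the variable COUNT holding how many times it appeared.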

Hope this helps.
Tavish


#4

Thanks @aayushmnit and @tavish_srivastava.

Tavish, I want to count the number of duplicate values as well.

Regards,
Ravi