'proc sort' and the 'BY' Statement


proc sort is the main tool for sorting a data set in SAS. The general format is as follows:

proc sort data=<name of data>;

by <name of variable>;

run;

Sorting by a Single Variable (default: ascending order)

data one;

input studyid name $ sex $ age weight height;

cards;

run;

 

proc sort data=one;

by weight;

run;

/* The proc sort step above sorts data set one by the variable weight in ascending order */

proc print data=one;

run;

 

• When sorted in ascending order (the default), observations with missing values are listed first, because SAS treats a numeric missing value as smaller than any number.

• A data set must be sorted by the relevant variable before that variable is used in a BY statement in a procedure, as shown below.
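
A minimal sketch of the missing-value behavior described above. The data set demo and its three values are made up purely for illustration and are not part of the study data:

/* demo is a hypothetical data set, not the study data */

data demo;

id = 1; weight = 150; output;

id = 2; weight = .; output;

id = 3; weight = 120; output;

run;

proc sort data=demo;

by weight;

run;

proc print data=demo;

run;

/* Printed order: id 2 (missing weight), then id 3 (120), then id 1 (150) */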

The 'BY' Statement

The 'BY' statement instructs SAS to apply a procedure separately to each subset of the data, where the subsets are defined by the distinct values of the variable named in the BY statement. It can be used in most SAS procedures. The general format is as follows:

 

proc <name of SAS Procedure> data=<name of data>;

<SAS Statements>

by <variable name>;

run;

IMPORTANT: The data set must be sorted by the BY variable before a BY statement is used in a procedure. If the data set is not sorted, an error message will appear in the Log File. Remember to always examine the Log File after running SAS data steps and procedures.

Example:

 

/* First sort the data */

proc sort data=one;

by sex;

run;

 

proc print data=one;

by sex;

run;

 

proc means data=one;

by sex;

var age weight height;

run;

 

 

Sorting in Descending Order by a Single Variable

Example:

 

proc sort data=one;

by descending height;

run;

 

proc print data=one;

id studyid;

var name age height;

run;
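
When a data set has been sorted in descending order and is then processed with a BY statement in a procedure, the BY statement in that procedure generally needs the same descending keyword, since SAS otherwise expects the BY variable to be in ascending order. A minimal sketch, assuming the data set one from the earlier examples:

proc sort data=one;

by descending sex;

run;

proc means data=one;

by descending sex; /* matches the sort order used in proc sort above */

var age weight height;

run;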