17 December 2014

How to combine values from different datasets in SAS using the MERGE statement

A SAS dataset is a structured file made up of columns and rows that store data. Data in the SAS environment is held in tabular form, and each such table is called a dataset. To manage several datasets together, a programmer sometimes has to join them into a single dataset.

The MERGE statement is used to do this. The basic idea of a merge is to combine the contents of two or more datasets that contain different variables and may have nearly the same number of observations.

There are two types of merging: one-to-one merging and match-merging. In one-to-one merging, the MERGE statement does not require a BY statement; observations are combined based on their positions in the input data sets. In match-merging, the MERGE statement needs a BY statement, and observations from the input data sets are combined based on common values of the BY variable.

Syntax to Merge

MERGE SAS-data-set-list;
BY variable-list;

SAS-data-set-list: the names of two or more SAS data sets to merge. The list may contain any number of data sets.

variable-list: one or more variables by which to merge the data sets. If a BY statement is used, the data sets must be sorted by the same BY variables before they are merged.

Different types of merging:


One-to-One Merging: 

In a one-to-one merge, the number of observations in the new data set is equal to the number of observations in the largest data set named in the MERGE statement.
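
As a minimal sketch of a one-to-one merge, the step below pairs observations purely by position. The data set names (names, ages, combined), the variables, and the sample values are hypothetical and chosen only for illustration.

```sas
/* Hypothetical sample data: three names and only two ages */
data names;
    input name $;
    datalines;
Asha
Ravi
Meena
;
run;

data ages;
    input age;
    datalines;
34
41
;
run;

/* One-to-one merge: no BY statement, observations are paired by position */
data combined;
    merge names ages;
run;

proc print data=combined;
run;
```

Because names is the larger input with three observations, combined should also contain three observations. The third observation has no partner in ages, so its age is expected to be missing.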

Match-Merging:
Merging with a BY statement makes it possible to match observations according to the values of the specified BY variables. Before performing a match-merge, all data sets must be sorted by the variables that are used for merging.

Understanding match-merging requires three key concepts:

BY variable - is a variable named in a BY statement.
BY value - is the value of a BY variable.
BY group - is the set of all observations with the same value for the BY variable (if there is only one BY variable).
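
The three concepts can be seen together in a small match-merge. This is only a sketch: the data sets (patients, doses, patient_doses), the variables, and the values are hypothetical. Here id is the BY variable, 101, 102 and 103 are its BY values, and all observations sharing one id form a BY group.

```sas
/* Hypothetical sample data, deliberately entered out of order */
data patients;
    input id name $;
    datalines;
103 Meena
101 Asha
102 Ravi
;
run;

data doses;
    input id dose;
    datalines;
101 50
103 75
101 100
;
run;

/* Both inputs must be sorted by the BY variable before the merge */
proc sort data=patients;
    by id;
run;

proc sort data=doses;
    by id;
run;

/* Match-merge: observations are combined within each BY group of id */
data patient_doses;
    merge patients doses;
    by id;
run;

proc print data=patient_doses;
run;
```

In this sketch the BY group for id 101 holds one observation from patients and two from doses, so the patient's name is carried across both dose records. Patient 102 has no match in doses, so dose is expected to be missing for that observation.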


Clinnovo is a clinical innovation company and a pioneer in the CRO industry in India. Clinnovo offers professional Clinical Research, Clinical Data Management, SAS, and Imaging training courses. Clinnovo has been serving bio-pharma industries across the world with excellence and high quality.

For more information on courses and training, contact: +91 9912868928, 040 64635501.
