Abstract:
|
Sample allocation for complex sample designs is usually implemented in multiple steps. For example, we may need to allocate the sample across strata to meet precision targets for analysis domains, and those domains may be defined hierarchically. In addition, higher-level domains may not be defined for every record in the frame, or we may choose not to target them. We may also want to repeat the allocation many times, adjusting the conditions and simulating the results under different allocation schemes and sample sizes. Allocating a sample under many restrictions is a challenging task in itself. When the allocation procedures must be applied to big data, such as U.S. household frames or Medicaid beneficiary files, we also face problems of elapsed time, CPU usage, and memory usage. In this paper, we compare the SAS Hash Object, traditional SAS DATA step processing, and PROC SQL for complex sample allocation tasks, and we present the advantages and tradeoffs of using the Hash Object.
|