Activity Number:
|
190
|
Type:
|
Invited
|
Date/Time:
|
Tuesday, August 13, 2002 : 10:30 AM to 12:20 PM
|
Sponsor:
|
Amstat Online
|
Abstract - #300060 |
Title:
|
Extensible Formats for Data Analysis, Documentation, and Simulation Control
|
Author(s):
|
Friedrich Leisch*+
|
Affiliation(s):
|
Technische Universität Wien
|
Address:
|
Wiedner Hauptstraße 8-10/1071, Wien, A-1040, Austria
|
Keywords:
|
R ; XML ; integrated documents
|
Abstract:
|
One of the distinctions between good statistical software and general-purpose numerical software is the ability to work easily with meta-information on raw numbers, such as scale information (nominal, ordinal, metric) or missing values. However, this is only the lowest level of meta-information that can be associated with a data set. Often it is desirable to turn a raw data set into a more general statistical object that consists of the raw data itself together with information on the source, documentation, and previous usages, up to a complete statistical analysis (including the code for it). We discuss various options for designing the syntax of such general statistical objects and show solutions that are already implemented in R. XML is a natural choice whenever content and meta-information about that content are to be stored in a file. Such formats also allow flexible control of even large-scale simulations that involve more than one engine for numerical computing and exchange data between several R and Octave processes in each simulation cycle.
|
- The address information is for the authors that have a + after their name.
- Authors who are presenting talks have a * after their name.
Back to the full JSM 2002 program |