Student Contest

ICES VI sponsored a student contest focusing on the analysis/visualization of economic statistical data. The contest was designed to create interest and innovation in the establishment survey field by inspiring students and the faculty they work with to create interesting and challenging applications that tested their technical skill and creativity.

The winning teams received a monetary award and had a recorded presentation of their work played during the conference.

Welcome & Introduction Video from the ICES VI Student Contest Committee

 

Student Contest General Information

As the sixth in the series of international conferences on establishment statistics, ICES VI is designed to look at key issues and challenges pertaining to establishment statistics.

For this conference, we are introducing a student contest focusing on the analysis/visualization of economic statistical data. The winners will have their research presented at ICES VI.

An award of up to $1,500 (USD) will be presented to the contest winner.

Eligibility

  • Participants in the contest must have held student status during the 2020-2021 academic year (i.e., those who completed their studies before September 1, 2020, are not eligible).
  • Students will perform their research independently or with a group of up to five students.
  • Students are to carry out this assignment autonomously. The report should make clear any contributions from faculty advisers, as well as the advisers' expertise.

Submission Instructions and Deadline Details

  • Submissions must include two components:
    • Presentation: a short video (five minutes or less) or graphical display showing the visualizations.
    • Report: a report summarizing the methods and results, limited to 6,000 words, excluding tables, references, and appendices.
  • Report Details:
    • The report must be submitted in English, the official conference language.
    • Supporting materials (e.g., tables, appendices) are limited to 10 pages.
    • The report should contain sufficient detail so as to enable reproducibility of results. As such, programming code must be included as an appendix. (Code will not count toward the 10-page limit on appendices.)
    • The report should be submitted as a PDF or Word document.
  • Presentation Details:
    • Uploading a video to YouTube and sharing the link is the preferred way to submit a video. Please do not email large video files.
  • Any software can be used.
  • The report and presentation must be submitted by March 31, 2021.
  • The report and presentation must be sent to ices@amstat.org with the subject line “Student Contest: Data Analysis/Visualization.”

Winners will be notified on April 30, 2021. You may not submit the paper to any other 2021 student/young investigator award competition until this decision is made public. If you have questions concerning this contest, send an email to ices@amstat.org and use the subject line “Student Contest: Inquiry.”

Judging Submissions

The reports will be reviewed by a panel of international experts in establishment survey design and analysis chosen by the ICES VI Program Committee.

Student Contest – Data Analysis and Visualization

Background and Research Question

A business is not the primary source of income for all business owners. “Running a side small business while working a full-time job is relatively common. Some people operate side businesses to supplement their incomes. Others simply keep their jobs for financial stability while trying to launch full-fledged companies.” https://smallbusinessbc.ca/article/how-run-your-business-while-working-a-job/

Logically, an owner is likely to dedicate fewer hours to the business when it is not the primary source of income than when it is. The total number of hours that all owners of a business dedicate to it could also vary across domains.

Research Question: How does the extent to which income derived from the small business is not the primary source of income for a business owner vary by owner’s sex, ethnicity, race, and veteran status and by business characteristics (e.g., size, sector, location)?

Using data from four states in the 2007 SBO, students should answer this question using appropriate analysis methods and creative data visualizations.

The Data Set: 2007 Survey of Business Owners

The Survey of Business Owners (SBO) is conducted every five years by the US Census Bureau. It is the only regularly collected source of information about US businesses and business owners, including owners' gender, race, ethnicity, and veteran status. Read details about the SBO.

The 2007 SBO is a large survey with a complex sampling design that collects data from all 50 states. The full SBO data set contains more than 2 million records; for the purposes of this student contest, data from only four states (California, Georgia, Massachusetts, and Ohio) are provided, producing a data set with 391,093 records. In addition to using only a subset of states for analysis, only a subset of variables is provided.

The data is available at the following link in CSV format:
https://ww2.amstat.org/meetings/ices/2021/studentcontest/track2sbo.csv

Descriptions of the variables and the codes for categorical variables are in the Data Dictionary. Note that some variables have been modified for ease of use in the contest (specifically, some character variables have been converted to numeric). Thus, the Data Dictionary provided should be used as the source of information (instead of the variable information at the end of the 2007 SBO User Guide).

Note that, as stated in the user guide, the data contains “rounded, noise-infused estimates of receipts, payroll, and employment” for confidentiality reasons and disclosure avoidance. No special treatment is needed for these “noisy” variables (i.e., treat them as if they were not noise-infused).

To produce valid population-level estimates, the sample design and unequal probabilities of selection must be properly used in estimation procedures. Read general information about sampling and estimation methods for complex samples.
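
As a rough illustration of why the weights matter, the sketch below compares unweighted and weighted shares within groups. The column names (`state`, `side_business`, `weight`) and the toy records are hypothetical and are not taken from the contest file; the actual variable names, including the weight variable, are defined in the Data Dictionary.

```python
from collections import defaultdict

def weighted_share_by_group(rows, group_key, flag_key, weight_key):
    """Return {group: (unweighted share, weighted share)} for a 0/1 flag."""
    # Per group: [record count, flagged count, weight sum, flagged weight sum]
    acc = defaultdict(lambda: [0, 0, 0.0, 0.0])
    for row in rows:
        a = acc[row[group_key]]
        a[0] += 1
        a[1] += row[flag_key]
        a[2] += row[weight_key]
        a[3] += row[weight_key] * row[flag_key]
    return {g: (a[1] / a[0], a[3] / a[2]) for g, a in acc.items()}

# Toy records (hypothetical): side_business = 1 means the business is
# not the owner's primary source of income.
rows = [
    {"state": "CA", "side_business": 1, "weight": 2.0},
    {"state": "CA", "side_business": 0, "weight": 18.0},
    {"state": "OH", "side_business": 1, "weight": 30.0},
    {"state": "OH", "side_business": 0, "weight": 10.0},
]

shares = weighted_share_by_group(rows, "state", "side_business", "weight")
for state, (unweighted, weighted) in sorted(shares.items()):
    print(f"{state}: unweighted={unweighted:.2f} weighted={weighted:.2f}")
```

In this toy example both groups have the same unweighted share, yet the weighted shares differ sharply, which is the kind of distortion that ignoring unequal selection probabilities can introduce.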

The specific sample design and estimation procedures used in the 2007 SBO are described in the 2007 SBO User Guide.

Note about variance estimation: As described in the 2007 SBO User Guide, variance estimation for the data was historically done using an older methodology called Random Groups. This approach is not available in standard statistical software packages (e.g., SAS, R, Stata). Thus, we recommend calculating standard errors as if the design were a simple random sample, but using the unequal sampling weights, and then multiplying the resulting standard errors by 1.5. Please see the instructional videos provided below for additional information on analysis of the SBO data.
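
One possible reading of this recommendation is sketched below for a weighted mean or proportion: a Taylor-linearization standard error that ignores strata and clusters but uses the weights, then multiplied by 1.5. The linearization formula is an assumption about what "as if the design were a simple random sample" means in practice, the function name is hypothetical, and the data are made up; the instructional videos remain the authoritative guide.

```python
import math

def weighted_mean_se(values, weights, adjustment=1.5):
    """Return (weighted mean, unadjusted SE, SE multiplied by `adjustment`)."""
    n = len(values)
    if n < 2 or n != len(weights):
        raise ValueError("need at least two observations and matching weights")
    total_w = sum(weights)
    mean = sum(w * y for w, y in zip(weights, values)) / total_w
    # Linearized residuals of the ratio estimator; they sum to zero.
    z = [w * (y - mean) / total_w for w, y in zip(weights, values)]
    # With-replacement variance with no strata or clusters (assumed form).
    variance = n / (n - 1) * sum(zi * zi for zi in z)
    se_srs = math.sqrt(variance)
    return mean, se_srs, adjustment * se_srs

# Toy data, not SBO records: 1 = business is not the primary income source.
not_primary = [1, 0, 0, 1, 0, 1]
weights = [10.0, 25.0, 5.0, 40.0, 15.0, 5.0]

mean, se_srs, se_adj = weighted_mean_se(not_primary, weights)
print(f"estimate={mean:.3f} unadjusted SE={se_srs:.3f} adjusted SE={se_adj:.3f}")
```

With equal weights this reduces to the familiar sample standard deviation divided by the square root of n, which is a quick sanity check on the formula.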

Judging Criteria

Entries will be judged based on the following criteria:

  • Do the data analyses and visualizations address/answer the research question?
  • Is the data analysis methodology appropriate (for the complex survey design), and is it explained clearly?
  • Is the data visualized in a creative way that provides insight into the research question?
  • Are the data visualizations explained and interpreted clearly? Are they appropriate given the data source (i.e., a complex sample survey), and are they effective at conveying information?
  • Are limitations of the analyses explained in view of limitations of the data source and challenges encountered?

Data Tips

  • Each business in the 2007 SBO can report up to four owners. Thus, each owner-level characteristic is described by a set of four variables. For example, ethnicity of the business owner(s) is in the variables {ETH1, ETH2, ETH3, ETH4} for the first through fourth owner.
  • Some of the data in the file comes from administrative records and is thus relatively complete. For example, total employment (number of employees), payroll, and receipts are nonmissing for all units. However, some business owner characteristics collected on the survey are incomplete (i.e., there is missing data). The focus of this contest is not on handling missing data, and thus sophisticated methods for handling any missing data are not required (though they can, of course, be used). However, as noted in the last judging criterion, limitations of the analyses due to missing data can and should be addressed in the report.
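
For owner-level analysis, the four-variable layout described in the first tip can be reshaped so that each reported owner becomes its own record. The sketch below is one way to do that with plain dictionaries. Only the names ETH1 through ETH4 come from the text above; the function name, the `owner_number` field, and the code values `"H"` and `"N"` are hypothetical. Treating a blank as "no owner in this slot" is also an assumption, since a blank could instead mean item nonresponse; the Data Dictionary should settle which applies.

```python
def owners_long(business, prefixes=("ETH",), max_owners=4, drop_empty=True):
    """Yield one dict per owner slot from a wide business record."""
    for k in range(1, max_owners + 1):
        # Collect this slot's value for each characteristic, e.g. ETH1 -> ETH.
        owner = {p: business.get(f"{p}{k}") for p in prefixes}
        is_empty = all(v in (None, "") for v in owner.values())
        if drop_empty and is_empty:
            continue  # assumed to mean that no k-th owner was reported
        owner["owner_number"] = k
        yield owner

# Toy business record with hypothetical codes, not a row from the contest file.
record = {"ETH1": "H", "ETH2": "N", "ETH3": "", "ETH4": ""}

for owner in owners_long(record):
    print(owner)
```

Passing `drop_empty=False` keeps all four slots, which may be preferable when blanks need to be counted as missing data rather than discarded.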

Instructional Videos

The videos below provide an overview of the SBO and of the analysis of complex survey data, including instructions specific to the 2007 SBO.

Survey Data Analysis (14:04)

Survey of Business Owners (SBO) Overview (11:38)

 

Key Dates

  • January 2, 2019
    Invited Session Proposal Submission Opens
  • June 13, 2019
    Invited Session Proposal Submission Closes
  • July 16, 2019
    Topic Contributed Session Proposal Submission Opens
  • August 15, 2019
    Topic Contributed Session Proposal Submission Closes
  • August 20, 2019
    Software Demonstration Proposal Submission Opens
  • October 16, 2019
    Contributed Abstract Submission Opens
  • December 3, 2019
    Contributed Abstract Submission Closes
  • December 12, 2019
    Software Demonstration Proposal Submission Closes
  • April 30, 2020
    Draft Manuscript Deadline
  • May 22, 2020
    Extended Draft Deadline
  • February 10, 2021
    Early Registration Opens
  • April 15, 2021
    Participant Registration Deadline
  • May 6, 2021
    Early Registration Deadline
  • May 7, 2021
    Regular Registration (increased fees apply)
  • June 14, 2021 – June 17, 2021
    ICES VI
  • March 1, 2023
    Invited Session Proposal Submission Opens
  • May 15, 2023
    Invited Session Proposal Submission Closes