
29 – SPEED: An Ensemble of Advances in Genomics and Genetics

Benford's Law Based Outliers Detection for Population Stratification in Genotype Data

Sponsor: Section on Statistics in Genomics and Genetics
Keywords: Benford's Law, GWAS, outliers, PCA, population structure, Case-control studies

Yuan Yuan

Auburn University

Nedret Billor

Auburn University

Asuman Turkmen

The Ohio State University

Population stratification remains a challenging problem in genome-wide association studies (GWAS). Genomic samples are often stratified and contaminated by outliers. Benford's law, also called the Newcomb-Benford law or the first-digit law, is an observation about the frequency distribution of leading digits in many real-life sets of numerical data. It has been applied to fraud detection in several types of datasets (e.g., tax and election survey data). When a dataset is free from error or fabrication, its first digits should follow the Benford distribution. When a dataset is artificially modified or contaminated by outliers, its digit distribution deviates from the Benford distribution. This study proposes an outlier detection method for genotype data based on Benford's law. We test the accuracy of the new method by applying it to datasets with genuine or simulated outliers. We also compare the performance of Benford's-law-based outlier detection against existing approaches (e.g., PCA methods). We believe the new approach is a promising contribution that helps detect population stratification more accurately.
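To make the first-digit idea concrete, the following is a minimal sketch of a generic Benford conformity check. It is not the authors' method: the abstract does not say how genotype data are mapped to numeric values or which test statistic is used. The function names (`benford_probs`, `leading_digit`, `chi_square_vs_benford`) and the choice of a Pearson chi-square statistic are assumptions made for illustration only. The expected frequency of leading digit d under Benford's law is log10(1 + 1/d).

```python
import math
from collections import Counter


def benford_probs():
    """Expected first-digit frequencies under Benford's law: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of x, or None for zero or non-finite input."""
    x = abs(x)
    if x == 0 or not math.isfinite(x):
        return None
    # Scientific notation always starts with the leading significant digit.
    return int(f"{x:.15e}"[0])


def chi_square_vs_benford(values):
    """Pearson chi-square distance between observed first digits and Benford's law.

    Hypothetical helper for illustration; larger values mean a worse fit.
    """
    digits = (leading_digit(v) for v in values)
    counts = Counter(d for d in digits if d is not None)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no usable values")
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_probs().items()
    )


if __name__ == "__main__":
    # Powers of two are a classic sequence that follows Benford's law closely;
    # consecutive integers 1..199 do not.
    powers = [2.0 ** k for k in range(1, 200)]
    integers = list(range(1, 200))
    print("powers of two :", chi_square_vs_benford(powers))
    print("integers 1-199:", chi_square_vs_benford(integers))
```

In a sketch like this, a statistic that is large relative to a chi-square distribution with 8 degrees of freedom would flag a dataset as deviating from Benford's law. How such a flag would translate into identifying individual outlying samples in genotype data is left open here, since the abstract does not specify it.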