Start Date: 07/05/2020
Course Type: Common Course
Course Link: https://www.coursera.org/learn/statistical-genomics
An introduction to the statistics behind the most popular genomic data science projects. This is the sixth course in the Genomic Big Data Science Specialization from Johns Hopkins University.
Statistics for Genomic Data Science

This course introduces the statistics behind the most powerful data science tools and shows how to use them to improve statistical modeling and inference for genomic data. We take a close look at fundamental concepts such as hypothesis testing, power analysis, and p-values, and demonstrate how to interpret the results of these tools. The course also covers the creation of hypothesis-correction maps, which show how to integrate two different analyses of the same data within a common framework.

The course requires some prior familiarity with basic data science. It is designed to take just a few hours per week, and you will work with a large number of datasets. The course is designed to be fun and engaging, and we've put a lot of effort into making it easy. Our goal was to make it as convenient as possible for you without sacrificing any essential content.

Please note that the free version of this class gives you access to all of the instructional videos and handouts. The peer-reviewed material is available only in the paid version.

Nomenclature and Priming Analysis
Hypothesis Testing
Hypothesis Testing (Part 1)
Hypothesis Testing (Part 2)

Understanding Your Coaching Success: From the Day-By-Week History
In this course you’ll learn the key elements of effective coaching. We’ll cover topics such as how
Article | Example |
---|---|
Data science | C.F. Jeff Wu initiated the modern, non-computer-science usage of the term "data science" and advocated that statistics be renamed data science and statisticians data scientists. |
Data science | In 2001, William S. Cleveland introduced data science as an independent discipline, extending the field of statistics to incorporate "advances in computing with data" in his article "Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics," which was published in Volume 69, No. 1, of the April 2001 edition of the International Statistical Review / Revue Internationale de Statistique. In his report, Cleveland established six technical areas that he believed encompassed the field of data science: multidisciplinary investigations, models and methods for data, computing with data, pedagogy, tool evaluation, and theory. |
Data science | In November 1997, C.F. Jeff Wu gave the inaugural lecture entitled "Statistics = Data Science?" for his appointment to the H. C. Carver Professorship at the University of Michigan. |
Data science | Data science is a "concept to unify statistics, data analysis and their related methods" in order to "understand and analyze actual phenomena" with data. |
Data science | Although use of the term "data science" has exploded in business environments, many academics and journalists see no distinction between data science and statistics. Writing in Forbes, Gil Press argues that data science is a buzzword without a clear definition and has simply replaced “business analytics” in contexts such as graduate degree programs. In the question-and-answer section of his keynote address at the Joint Statistical Meetings of the American Statistical Association, noted applied statistician Nate Silver said, “I think data-scientist is a sexed up term for a statistician...Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician.” |
Compression of Genomic Re-Sequencing Data | High-throughput sequencing technologies have led to a dramatic decline of genome sequencing costs and to an astonishingly rapid accumulation of genomic data. These technologies are enabling ambitious genome sequencing endeavours, such as the 1000 Genomes Project and 1001 ("Arabidopsis thaliana") Genomes Project. The storage and transfer of the tremendous amount of genomic data have become a mainstream problem, motivating the development of high-performance compression tools designed specifically for genomic data. A recent surge of interest in the development of novel algorithms and tools for storing and managing genomic re-sequencing data emphasizes the growing demand for efficient methods for genomic data compression. |
The Genomic HyperBrowser | The Genomic HyperBrowser is a web-based system for statistical analysis of genomic annotation data. |
Data science | It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, and visualization. |
Statistics | Statistics is a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data; it is also described as a branch of mathematics. Some consider statistics to be a distinct mathematical science rather than a branch of mathematics. While many scientific investigations make use of data, statistics is concerned with the use of data in the context of uncertainty and with decision making in the face of uncertainty. |
Data science | Later, Wu presented his lecture entitled "Statistics = Data Science?" as the first of his 1998 P.C. Mahalanobis Memorial Lectures. These lectures honor Prasanta Chandra Mahalanobis, an Indian scientist and statistician and founder of the Indian Statistical Institute. |
Data science | The term "data science" (originally used interchangeably with "datalogy") has existed for over thirty years and was used initially as a substitute for computer science by Peter Naur in 1960. In 1974, Naur published "Concise Survey of Computer Methods", which freely used the term data science in its survey of the contemporary data processing methods that are used in a wide range of applications. |
Data science | In 2013, the IEEE Task Force on Data Science and Advanced Analytics was launched, and the first IEEE International Conference on Data Science and Advanced Analytics was held in 2014. In 2014, the American Statistical Association section on Statistical Learning and Data Mining renamed its journal to "Statistical Analysis and Data Mining: The ASA Data Science Journal" and in 2016 changed its section name to "Statistical Learning and Data Science". In 2015, the International Journal on Data Science and Analytics was launched by Springer to publish original work on data science and big data analytics. In 2013, the first "European Conference on Data Analysis (ECDA)" was organised in Luxembourg, establishing the European Association for Data Science (EuADS) in August 2015. In September 2015, the Gesellschaft für Klassifikation (GfKl) added "Data Science Society" to the name of the society at the third ECDA conference at the University of Essex, Colchester, UK. |
Statistics | There are two applications for machine learning and data mining: data management and data analysis. Statistical tools are necessary for data analysis. |
Statistics (disambiguation) | Statistics is a mathematical science pertaining to the collection, analysis, interpretation, and presentation of data. |
ACE (genomic file format) | The ACE file format is a specification for storing data about genomic contigs. |
Data science | In April 2002, the International Council for Science: Committee on Data for Science and Technology (CODATA) started the "Data Science Journal", a publication focused on issues such as the description of data systems, their publication on the internet, applications and legal issues. Shortly thereafter, in January 2003, Columbia University began publishing "The Journal of Data Science", which provided a platform for all data workers to present their views and exchange ideas. The journal was largely devoted to the application of statistical methods and quantitative research. In 2005, the National Science Board published "Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century", defining data scientists as "the information and computer scientists, database and software engineers and programmers, disciplinary experts, curators and expert annotators, librarians, archivists, and others, who are crucial to the successful management of a digital data collection" whose primary activity is to "conduct creative inquiry and analysis." |
Genomic counseling | Genomic counseling is the process by which a person gets informed about his or her genome. In contrast to genetic counseling, which focuses on Mendelian diseases and typically involves person-to-person communication with a medical genetics expert, genomic counseling is not limited to currently clinically relevant information and includes other genomic information that is of interest for the informed person, such as increased risk for complex disease (for example diabetes or obesity), genetically determined non-disease related traits (for example baldness), or genetic genealogy data. Given the less sensitive nature of this information, genomic advice can be given impersonally, for example over the internet (virtual genomic counseling). |
Genomic convergence | Genomic convergence is a multifactor approach used in genetic research that combines different kinds of genetic data analysis to identify and prioritize susceptibility genes for a complex disease. |
Data science | When Harvard Business Review called it "The Sexiest Job of the 21st Century" the term became a buzzword, and is now often applied to business analytics, or even arbitrary use of data, or used as a sexed-up term for statistics. While many university programs now offer a data science degree, there exists no consensus on a definition or curriculum contents. Because of the current popularity of this term, there are many "advocacy efforts" surrounding it. |
Data science | Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge. |