Big Data are not only Big but Complex, Messy, Badly Sampled, and Creepy





While researching last week’s post (Is data science still the sexiest job of the 21st century?), I came across this great lecture by Professor Thomas Lumley.

Prof Lumley discusses the role that statisticians must play in data science – in an entertaining and understandable way.

It is a great piece for any aspiring data scientist or interested layperson.

Data Science: Will Computer Science and Informatics Eat Our Lunch? 

Mainstream statistics ignored computing for many years, so that students were taught to handle infinite N, but not N of a million. Practical estimation of conditional probabilities and conditional distributions in large data sets was often left to computer science and informatics.

Although statistics started behind, we are catching up: many individual statisticians and some statistics departments are taking computing seriously.

More importantly, applied statistics has a long tradition of understanding how to formulate questions: large-scale empirical data can tell you a lot of things, but not what your question is. Big Data are not only Big but Complex, Messy, Badly Sampled, and Creepy.

These are problems that statistics has thought about for some time, so we have the opportunity to take all the shiny computing technology that other people have developed and use it to re-establish statistics in data science.

