Big Data: methodological statistics
Fasano A.
2013-01-01
Abstract
What is big data, and how big is big, really? Big data analytics tells us how much can be done with big data, while the statistical paradigm tells us how to proceed. Classical statistics treats big as simply not small: big is the sample size beyond which the asymptotic properties of a method “kick in” and yield valid results. Contemporary statistics considers big in terms of “lifting observations” and “learning from the variables”. Big (and disparate) data sources are dynamically captured, updated in real time, and automatically inspected. Big data are big in the sense of high-volume, high-velocity, and/or high-variety information assets. There is also a need for big data validity (is the data correct and accurate for the intended usage?), big data veracity (are the results meaningful for the given problem space?), and big data volatility (how long does this data need to be stored?). What are the key issues and challenges for statistical offices in the big data “era”? Which sources and types of big data can be handled? How can big data be combined with administrative data and traditional surveys? Which quality framework should be satisfied? How can knowledge be built through incremental learning so as to exploit big data?