Big Data (大數據 / 巨量資料)
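Data that a conventional database system cannot store, compute on, or process within a reasonable amount of time is called big data.
In 2012, Doug Laney gave big data a new definition: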
“Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.”
4Vs
Data Volume: amount of data
- 1 Byte = 8 Bits
- 1 Kilobyte (KB) = 1024 Bytes
- 1 Megabyte (MB) = 1024 KB
- 1 Gigabyte (GB) = 1024 MB
- 1 Terabyte (TB) = 1024 GB
- 1 Petabyte (PB) = 1024 TB
- 1 Exabyte (EB) = 1024 PB
- 1 Zettabyte (ZB) = 1024 EB
- 1 Yottabyte (YB) = 1024 ZB
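The unit ladder above can be sketched as a small conversion helper. This is only an illustration: the names `UNITS`, `to_bytes`, and `humanize` are made up for this example, and it assumes the 1024-based convention used in the list (the SI convention would use 1000 instead).

```python
# Unit ladder from the list above: each step up is a factor of 1024.
UNITS = ["Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` to a number of bytes (1024-based)."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop once the value fits in this unit, or when no larger unit exists.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    print(to_bytes(1, "PB"))        # bytes in one petabyte
    print(humanize(5 * 1024 ** 4))  # a 5 TB dataset
```

For example, `to_bytes(1, "KB")` gives 1024 and `humanize(1536)` gives "1.50 KB".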
Data Velocity: speed of data in and out
- Data flows continuously and at high speed
- Can also be understood as "timeliness"
Data Variety: range of data types and sources
- Data comes from all kinds of sources
- Broadly divided into structured and unstructured data
Data Veracity: uncertainty of data
- Analyze and filter out the parts that are biased, falsified, or anomalous
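As a rough illustration of the "filter out anomalous parts" step, the sketch below drops numeric outliers using a modified z-score based on the median absolute deviation. This is one possible technique, not one prescribed by these notes. The function name `filter_outliers` and the sample readings are hypothetical, and the 3.5 threshold is a common rule of thumb rather than a fixed standard. Detecting biased or falsified data generally needs domain-specific checks that this sketch does not cover.

```python
from statistics import median


def filter_outliers(values, threshold=3.5):
    """Keep only values whose modified z-score is within `threshold`."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half the values equal the median; no usable scale, keep all.
        return list(values)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


if __name__ == "__main__":
    readings = [10.1, 9.8, 10.3, 10.0, 9.9, 250.0]  # made-up sensor readings
    print(filter_outliers(readings))
```

The median is used instead of the mean because a single extreme value, such as 250.0 in the sample, would otherwise distort the very statistic used to detect it.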
Big data in four words: "big, fast, varied, doubtful" (大、快、雜、疑)
Laney, Douglas. “The Importance of ‘Big Data’: A Definition”. Gartner. Retrieved 21 June 2012.