Abstract: Traditionally, data have been stored in securely protected databases for special purposes, such as satellite imagery data for earth science research or customer transaction data for business analytics. The usefulness of data lies in the fact that they can be examined and analyzed to unearth correlations among data items and to discover knowledge that yields deeper insight into trends. Data analytics has been a key research topic in data mining, knowledge discovery, and machine learning for decades. In recent years, the term "data" has experienced a major rejuvenation in many aspects of our lives. The rapid development of the Internet and web technologies allows ordinary users to generate vast amounts of data about their daily lives. On the Internet of Things, the number of connected devices has grown exponentially; each of these produces real-time or near real-time streaming data about our physical world. The resulting data, which are extremely difficult, if not impossible, to store, process, and analyze with conventional computing methodologies and resources, are referred to as "Big Data." In this chapter, we focus on two subsets of big data: digital data and analog data. These two major subsets are further divided by environmental and personal sources of data. We also highlight data types and formats as well as different input mechanisms. These classifications help in understanding active and passive data collection and production, with explicit human involvement or without it (i.e., implicit). This chapter intends to provide enough information to help the reader understand the role of digital and analog sources, and how data are acquired, transmitted, and preprocessed using today's growing variety of computing devices and sensors. © 2017 All rights reserved.