Q1. Which technology framework is commonly used for distributed storage and processing of Big Data?
Flink
Kafka
Spark
Hadoop
Q2. Which of the following is not a characteristic of a data warehouse?
Historical data
Real-time processing
Optimized for analytics
Integrated data
Q3. Which of the following is not a data type commonly encountered in Big Data?
XML
JSON
CSV
Binary
Q4. Which of the following is not a component of the Hadoop ecosystem?
Spark
HDFS
MapReduce
YARN
Q5. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Sharding
Data Redundancy
Data Partitioning
Data Replication
Q6. Which technology is commonly used for real-time data analytics and visualization?
Power BI
QlikView
Databricks
Tableau
Q7. What type of data is a bank's transaction data?
None of the above
Unstructured data
Structured data
Both structured and unstructured data
Q8. Which technology is commonly used for distributed messaging in Big Data systems?
Flink
Hadoop
Spark
Kafka
Q9. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Preparation
Data Cleansing
Data Staging
Data Scrubbing
Q10. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Warehousing
Data Visualization
Data Analysis
Data Mining
Q11. Which of the following is not a characteristic of Big Data?
Velocity
Velocity
Variety
Volume
Q12. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Big Data
Huge Data
Mega Data
Massive Data
Q13. Which technology is commonly used for distributed data storage in Big Data systems?
Cassandra
MongoDB
HDFS
SQL
Q14. Which type of database is optimized for handling transactional workloads and providing high availability?
NoSQL
OLAP
NewSQL
OLTP
Q15. How many V's are there in Big Data?
2
3
4
5
Q16. Which of the following is not a challenge associated with Big Data?
Security
Data Consistency
Privacy
Scalability
Q17. Hadoop is a framework that is used with several related tools. Which of these are its common companions?
MapReduce, Heron, and Trumpet
MapReduce, Hive, and HBase
MapReduce, Hummer, and Iguana
MapReduce, MySQL, and Google Apps
Q18. Which of the following is not a layer of the Big Data stack?
Storage Layer
Processing Layer
Application Layer
Presentation Layer
Q19. Which type of data refers to data that is generated in real-time or near real-time?
Unstructured Data
Semi-Structured Data
Structured Data
Streaming Data
Q20. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Aggregation
Data Normalization
Data Fusion
Data Integration
Q21. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Prism
Project Big
Project Prism
Project Data
Q22. Data of what size, in bytes, is known as Big Data?
Tera
Giga
Meta
Peta
Q23. Which programming model is used to develop Hadoop-based applications that can process massive amounts of data?
Oozie
Mahout
MapReduce
None of the above
Q24. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Stream
Data Lake
Data Pond
Data Reservoir
Q25. All the options given accurately describe Hadoop except one. Which one is it?
Distributed computing approach
Open-source
Real-time
Java-based
Q26. Which technology is commonly used for real-time stream processing in Big Data systems?
Hadoop
Kafka
Flink
Spark
Q27. Which of the following is not a key feature of Apache Spark?
Batch Processing
MapReduce Support
In-memory Computing
Real-time Processing
Q28. In how many forms can Big Data be found?
1
3
2
4
Q29. Which of these has the world's largest Hadoop cluster?
All of the above
Datamatics
Facebook
Apple
Q30. Which of these options is Hadoop named after?
Cutting's high school best friend
The toy elephant of creator Doug Cutting's son
Creator Doug Cutting's favourite circus act
A sound Cutting's laptop made during Hadoop development