Q1. Which of these options is Hadoop named after?
Cutting's high school best friend
The toy elephant of creator Doug Cutting's son
A sound Cutting's laptop made during Hadoop development
Creator Doug Cutting's favourite circus act
Q2. Which technology is commonly used for real-time data analytics and visualization?
Tableau
Power BI
Databricks
QlikView
Q3. Which of these has the world's largest Hadoop cluster?
Datamatics
Facebook
All of the above
Apple
Q4. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Stream
Data Lake
Data Pond
Data Reservoir
Q5. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Fusion
Data Aggregation
Data Integration
Data Normalization
Q6. What type of data is a bank's transaction data?
Both structured and unstructured data
None of the above
Structured data
Unstructured data
Q7. Which of the following is not a characteristic of a data warehouse?
Optimized for analytics
Integrated data
Real-time processing
Historical data
Q8. Which type of database is optimized for handling transactional workloads and providing high availability?
OLAP
OLTP
NoSQL
NewSQL
Q9. All the options given accurately describe Hadoop except one. Which one is it?
Open-source
Java-based
Distributed computing approach
Real-time
Q10. Which of the following is not a characteristic of Big Data?
Variety
Velocity
Volume
Velocity
Q11. Which of the following is not a challenge associated with Big Data?
Scalability
Data Consistency
Privacy
Security
Q12. Which of the following is not a key feature of Apache Spark?
Real-time Processing
Batch Processing
MapReduce Support
In-memory Computing
Q13. Which technology is commonly used for distributed messaging in Big Data systems?
Hadoop
Spark
Kafka
Flink
Q14. How many V's are there in Big Data?
5
3
2
4
Q15. Which technology is commonly used for distributed data storage in Big Data systems?
HDFS
MongoDB
SQL
Cassandra
Q16. Which technology is commonly used for real-time stream processing in Big Data systems?
Hadoop
Spark
Flink
Kafka
Q17. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Project Prism
Project Data
Prism
Project Big
Q18. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Warehousing
Data Analysis
Data Visualization
Data Mining
Q19. Which programming model is used to develop Hadoop-based applications that can process massive amounts of data?
Mahout
MapReduce
Oozie
None of the above
Q20. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Partitioning
Data Redundancy
Data Sharding
Data Replication
Q21. Data of what size in bytes is known as Big Data?
Tera
Peta
Meta
Giga
Q22. Which of the following is not a component of the Hadoop ecosystem?
Spark
HDFS
MapReduce
YARN
Q23. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Cleansing
Data Scrubbing
Data Preparation
Data Staging
Q24. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Flink
Hadoop
Spark
Q25. Hadoop is a framework used alongside several related tools. Which of these are its common companions?
MapReduce, Hive, and HBase
MapReduce, MySQL, and Google Apps
MapReduce, Heron, and Trumpet
MapReduce, Hummer, and Iguana
Q26. Which of the following is not a data type commonly encountered in Big Data?
JSON
XML
Binary
CSV
Q27. Which of the following is not a layer of the Big Data stack?
Processing Layer
Presentation Layer
Application Layer
Storage Layer
Q28. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Massive Data
Mega Data
Huge Data
Big Data
Q29. Which type of data refers to data that is generated in real-time or near real-time?
Structured Data
Unstructured Data
Streaming Data
Semi-Structured Data
Q30. In how many forms can Big Data be found?
3
2
4
1