Q1. Which of the following is not a component of the Hadoop ecosystem?
MapReduce
YARN
HDFS
Spark
Q2. All the options given accurately describe Hadoop except one. Which one is it?
Java-based
Distributed computing approach
Open-source
Real-time
Q3. Which technology is commonly used for distributed messaging in Big Data systems?
Hadoop
Spark
Flink
Kafka
Q4. Hadoop is a framework that is used with several types of related tools. Which of the following are its common cohorts?
MapReduce, Heron, and Trumpet
MapReduce, Hummer, and Iguana
MapReduce, Hive, and HBase
MapReduce, MySQL, and Google Apps
Q5. What type of data is a bank's transaction data?
Structured data
None of the above
Both structured and unstructured data
Unstructured data
Q6. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Visualization
Data Mining
Data Analysis
Data Warehousing
Q7. Which of the following is not a characteristic of a data warehouse?
Historical data
Optimized for analytics
Integrated data
Real-time processing
Q8. Which technology is commonly used for real-time stream processing in Big Data systems?
Spark
Kafka
Hadoop
Flink
Q9. Which of the following is not a data type commonly encountered in Big Data?
XML
CSV
JSON
Binary
Q10. In how many forms can Big Data be found?
1
3
4
2
Q11. Data of what size, in bytes, is known as Big Data?
Peta
Giga
Tera
Meta
Q12. Which type of data refers to data that is generated in real-time or near real-time?
Semi-Structured Data
Unstructured Data
Streaming Data
Structured Data
Q13. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Stream
Data Pond
Data Reservoir
Data Lake
Q14. Which of the following is not a challenge associated with Big Data?
Privacy
Scalability
Data Consistency
Security
Q15. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Partitioning
Data Replication
Data Redundancy
Data Sharding
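The options above name related but distinct ideas. As a rough illustration only, the sketch below contrasts splitting keys across nodes with keeping several copies of each key on different nodes. The function names, the CRC32-based placement, and the in-memory node lists are all hypothetical and do not reflect how HDFS or any particular database places data.

```python
import zlib


def node_for(key, num_nodes):
    """Pick a node index with a stable hash so a key always maps to the same node."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def place_split(keys, num_nodes):
    """Store each key on exactly one node (the data is divided, not copied)."""
    nodes = [[] for _ in range(num_nodes)]
    for key in keys:
        nodes[node_for(key, num_nodes)].append(key)
    return nodes


def place_with_copies(keys, num_nodes, copies):
    """Store each key on `copies` distinct nodes so losing one node loses no data."""
    if copies > num_nodes:
        raise ValueError("cannot keep more copies than there are nodes")
    nodes = [[] for _ in range(num_nodes)]
    for key in keys:
        first = node_for(key, num_nodes)
        # Walk to the following nodes so the copies land on different machines.
        for offset in range(copies):
            nodes[(first + offset) % num_nodes].append(key)
    return nodes


if __name__ == "__main__":
    keys = ["txn-1", "txn-2", "txn-3", "txn-4"]
    print(place_split(keys, num_nodes=3))
    print(place_with_copies(keys, num_nodes=3, copies=2))
```

In the second layout every key survives the failure of any single node, at the cost of storing the data more than once.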
Q16. Which of the following is not a layer of the Big Data stack?
Presentation Layer
Processing Layer
Storage Layer
Application Layer
Q17. Which type of database is optimized for handling transactional workloads and providing high availability?
NoSQL
OLTP
OLAP
NewSQL
Q18. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Massive Data
Big Data
Huge Data
Mega Data
Q19. Which technology is commonly used for real-time data analytics and visualization?
Tableau
QlikView
Power BI
Databricks
Q20. Which of these has the world's largest Hadoop cluster?
Facebook
All of the above
Datamatics
Apple
Q21. Which of the following is not a characteristic of Big Data?
Volume
Variety
Velocity
Velocity
Q22. How many V's are there in Big Data?
3
4
2
5
Q23. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Project Big
Project Data
Prism
Project Prism
Q24. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Spark
Hadoop
Flink
Q25. Which technology is commonly used for distributed data storage in Big Data systems?
SQL
HDFS
Cassandra
MongoDB
Q26. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Staging
Data Scrubbing
Data Preparation
Data Cleansing
Q27. Which of the following is not a key feature of Apache Spark?
MapReduce Support
Batch Processing
In-memory Computing
Real-time Processing
Q28. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Integration
Data Fusion
Data Normalization
Data Aggregation
Q29. Which of the following can be described as a programming model used to develop Hadoop-based applications that process massive amounts of data?
Oozie
MapReduce
Mahout
None of the above
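For readers unfamiliar with the map, shuffle, and reduce pattern, here is a minimal single-process sketch in plain Python. It is a toy word count under stated assumptions: the function names are hypothetical, nothing here uses the Hadoop API, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

The point of the pattern is that each map call and each reduce call depends only on its own input, which is what lets a framework spread the work over many nodes.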
Q30. Which of these options is Hadoop named after?
Creator Doug Cutting's favourite circus act
A sound Cutting's laptop made during Hadoop development
Cutting's high school best friend
The toy elephant of Creator Cutting's son