Q1. Which of the following does not accurately describe Hadoop?
Java-based
Real-time
Distributed computing approach
Open-source
Q2. Which of the following is not a component of the Hadoop ecosystem?
HDFS
YARN
Spark
MapReduce
Q3. Which of the following is not a characteristic of Big Data?
Velocity
Variety
Velocity
Volume
Q4. Which technology framework is commonly used for distributed storage and processing of Big Data?
Spark
Flink
Hadoop
Kafka
Q5. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Big Data
Mega Data
Massive Data
Huge Data
Q6. Which programming model is used to develop Hadoop-based applications that process massive amounts of data?
None of the above
Mahout
Oozie
MapReduce
Q7. How many V's are there in Big Data?
2
4
3
5
Q8. Which of these Hadoop-based projects is used by Facebook to tackle Big Data?
Project Prism
Prism
Project Big
Project Data
Q9. Which of the following is not a key feature of Apache Spark?
Real-time Processing
In-memory Computing
MapReduce Support
Batch Processing
Q10. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Staging
Data Scrubbing
Data Cleansing
Data Preparation
Q11. Which technology is commonly used for real-time stream processing in Big Data systems?
Hadoop
Spark
Kafka
Flink
Q12. Which of the following is not a characteristic of a data warehouse?
Historical data
Optimized for analytics
Real-time processing
Integrated data
Q13. Which of the following is not a data type commonly encountered in Big Data?
XML
Binary
JSON
CSV
Q14. Which of the following is not a challenge associated with Big Data?
Security
Data Consistency
Scalability
Privacy
Q15. Which type of database is optimized for handling transactional workloads and providing high availability?
OLAP
OLTP
NoSQL
NewSQL
Q16. In how many forms can Big Data be found?
2
3
4
1
Q17. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Redundancy
Data Sharding
Data Replication
Data Partitioning
Q18. Hadoop is a framework that is used with several related tools. Which of the following are its common companions?
MapReduce, MySQL, and Google Apps
MapReduce, Hive, and HBase
MapReduce, Hummer, and Iguana
MapReduce, Heron, and Trumpet
Q19. Data of what size, in bytes, is known as Big Data?
Giga
Meta
Peta
Tera
Q20. Which of the following is not a layer of the Big Data stack?
Presentation Layer
Storage Layer
Application Layer
Processing Layer
Q21. Which of these options is Hadoop named after?
Creator Doug Cutting's favourite circus act
A sound Cutting's laptop made during Hadoop development
Cutting's high school best friend
The toy elephant of creator Doug Cutting's son
Q22. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Integration
Data Fusion
Data Aggregation
Data Normalization
Q23. Which type of data refers to data that is generated in real-time or near real-time?
Unstructured Data
Structured Data
Semi-Structured Data
Streaming Data
Q24. Which technology is commonly used for distributed data storage in Big Data systems?
Cassandra
HDFS
SQL
MongoDB
Q25. Which of these has the world's largest Hadoop cluster?
Apple
Datamatics
Facebook
All of the above
Q26. Which technology is commonly used for distributed messaging in Big Data systems?
Spark
Hadoop
Flink
Kafka
Q27. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Pond
Data Stream
Data Lake
Data Reservoir
Q28. What type of data is a bank's transaction data?
None of the above
Unstructured data
Both structured and unstructured data
Structured data
Q29. Which technology is commonly used for real-time data analytics and visualization?
Databricks
QlikView
Tableau
Power BI
Q30. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Visualization
Data Mining
Data Analysis
Data Warehousing