Q1. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Pond
Data Reservoir
Data Lake
Data Stream
Q2. Which technology is commonly used for distributed messaging in Big Data systems?
Spark
Flink
Kafka
Hadoop
Q3. Which of the following is not a key feature of Apache Spark?
Batch Processing
In-memory Computing
Real-time Processing
MapReduce Support
Q4. Which of the following is not a component of the Hadoop ecosystem?
Spark
YARN
MapReduce
HDFS
Q5. Which of these Hadoop-based projects is used by Facebook to tackle Big Data?
Project Data
Project Prism
Prism
Project Big
Q6. Which technology is commonly used for real-time data analytics and visualization?
Tableau
Power BI
Databricks
QlikView
Q7. Which technology is commonly used for real-time stream processing in Big Data systems?
Kafka
Spark
Flink
Hadoop
Q8. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Staging
Data Preparation
Data Scrubbing
Data Cleansing
Q9. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Redundancy
Data Replication
Data Partitioning
Data Sharding
Q10. All the options given accurately describe Hadoop except one. Which one is it?
Real-time
Java-based
Distributed computing approach
Open-source
Q11. Which technology is commonly used for distributed data storage in Big Data systems?
SQL
HDFS
Cassandra
MongoDB
Q12. Which of the following is not a challenge associated with Big Data?
Security
Scalability
Data Consistency
Privacy
Q13. Which of the following is not a data type commonly encountered in Big Data?
XML
CSV
Binary
JSON
Q14. Which of the following is not a layer of the Big Data stack?
Presentation Layer
Application Layer
Storage Layer
Processing Layer
Q15. What type of data is a bank's transaction data?
Structured data
Unstructured data
Both structured and unstructured data
None of the above
Q16. In how many forms can Big Data be found?
2
3
4
1
Q17. Which type of database is optimized for handling transactional workloads and providing high availability?
OLTP
NewSQL
NoSQL
OLAP
Q18. Which of these has the world's largest Hadoop cluster?
Apple
All of these
Facebook
Datamatics
Q19. Which of these options is Hadoop named after?
Creator Doug Cutting's favourite circus act
The toy elephant of creator Doug Cutting's son
A sound Cutting's laptop made during Hadoop development
Cutting's high school best friend
Q20. Hadoop is a framework used with several related tools. Which of these are its common companions?
MapReduce, MySQL, and Google Apps
MapReduce, Hummer, and Iguana
MapReduce, Heron, and Trumpet
MapReduce, Hive, and HBase
Q21. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Mega Data
Big Data
Huge Data
Massive Data
Q22. Which type of data is generated in real time or near real time?
Structured Data
Semi-Structured Data
Unstructured Data
Streaming Data
Q23. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Flink
Spark
Hadoop
Q24. Data of what size, in bytes, is known as Big Data?
Tera
Giga
Meta
Peta
Q25. Which of the following is not a characteristic of Big Data?
Variety
Velocity
Velocity
Volume
Q26. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Warehousing
Data Mining
Data Analysis
Data Visualization
Q27. How many V's are there in Big Data?
4
2
3
5
Q28. Which of the following is not a characteristic of a data warehouse?
Real-time processing
Historical data
Optimized for analytics
Integrated data
Q29. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Normalization
Data Aggregation
Data Integration
Data Fusion
Q30. Which programming model is used to develop Hadoop-based applications that can process massive amounts of data?
Mahout
Oozie
MapReduce
None of the above