Q1. Which technology is commonly used for real-time stream processing in Big Data systems?
Kafka
Hadoop
Flink
Spark
Q2. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Project Prism
Project Big
Prism
Project Data
Q3. Data measured in which unit of bytes is generally considered Big Data?
Peta
Tera
Giga
Meta
Q4. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Cleansing
Data Preparation
Data Staging
Data Scrubbing
Q5. Which of these has the world's largest Hadoop cluster?
Apple
All of the above
Facebook
Datamatics
Q6. Which of the following is not a core component of Hadoop?
HDFS
YARN
MapReduce
Spark
Q7. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Normalization
Data Aggregation
Data Integration
Data Fusion
Q8. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Visualization
Data Analysis
Data Mining
Data Warehousing
Q9. What type of data is a bank's transaction data?
Both structured and unstructured data
Structured data
None of the above
Unstructured data
Q10. How many V's are there in Big Data?
3
4
5
2
Q11. Which technology is commonly used for distributed data storage in Big Data systems?
HDFS
Cassandra
SQL
MongoDB
Q12. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Flink
Spark
Hadoop
Q13. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Huge Data
Mega Data
Big Data
Massive Data
Q14. Which programming model is used to develop Hadoop-based applications that process massive amounts of data?
None of the above
MapReduce
Mahout
Oozie
Q15. Which of the following is not a data type commonly encountered in Big Data?
Binary
JSON
CSV
XML
Q16. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Stream
Data Lake
Data Pond
Data Reservoir
Q17. Which type of database is optimized for handling transactional workloads and providing high availability?
NoSQL
OLAP
NewSQL
OLTP
Q18. Which of the following is not a challenge associated with Big Data?
Data Consistency
Privacy
Scalability
Security
Q19. Which of the following is not a characteristic of a data warehouse?
Historical data
Optimized for analytics
Real-time processing
Integrated data
Q20. Which type of data refers to data that is generated in real-time or near real-time?
Streaming Data
Semi-Structured Data
Unstructured Data
Structured Data
Q21. Which technology is commonly used for real-time data analytics and visualization?
Power BI
QlikView
Tableau
Databricks
Q22. All the options given accurately describe Hadoop except one. Which one is it?
Java-based
Open-source
Real-time
Distributed computing approach
Q23. Which of the following is not a characteristic of Big Data?
Velocity
Volume
Velocity
Variety
Q24. Which of the following is not a key feature of Apache Spark?
Real-time Processing
Batch Processing
In-memory Computing
MapReduce Support
Q25. Which technology is commonly used for distributed messaging in Big Data systems?
Spark
Flink
Kafka
Hadoop
Q26. What is Hadoop named after?
The toy elephant of creator Doug Cutting's son
Creator Doug Cutting's favourite circus act
A sound Cutting's laptop made during Hadoop development
Cutting's high school best friend
Q27. In how many forms can Big Data be found?
4
3
2
1
Q28. Hadoop is a framework used alongside several related tools. Which of these are its common companions?
MapReduce, Hive, and HBase
MapReduce, Heron, and Trumpet
MapReduce, MySQL, and Google Apps
MapReduce, Hummer, and Iguana
Q29. Which of the following is not a layer of the Big Data stack?
Processing Layer
Presentation Layer
Application Layer
Storage Layer
Q30. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Redundancy
Data Replication
Data Sharding
Data Partitioning