1. Data collection in analytics is:
(A) Backup only
(B) Encrypting data only
(C) Compressing files only
(D) The process of gathering relevant and accurate data from various sources
2. Primary data sources include:
(A) Data warehouses only
(B) Databases only
(C) Surveys, interviews, experiments, and observations
(D) Backup only
3. Secondary data sources include:
(A) Experiments only
(B) Personal interviews only
(C) Existing datasets, reports, publications, and online databases
(D) Backup only
4. Data pre-processing is important because:
(A) It encrypts the data
(B) Raw data often contains noise, missing values, and inconsistencies
(C) It compresses the data
(D) Backup only
5. Data cleaning involves:
(A) Backup only
(B) Encrypting data
(C) Compressing data
(D) Removing duplicates, correcting errors, and handling missing values
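Illustration for question 5: a minimal sketch of two cleaning steps, correcting inconsistent text formatting and removing duplicate rows. The function name clean_records and the sample fields are hypothetical, and title-casing is only one assumed way of "correcting errors".

```python
def clean_records(records):
    """Normalize string fields, then drop rows that become exact duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        # Correct formatting errors: trim whitespace and unify capitalization.
        fixed = {
            key: value.strip().title() if isinstance(value, str) else value
            for key, value in row.items()
        }
        # A hashable fingerprint of the row, used to detect duplicates.
        fingerprint = tuple(sorted(fixed.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(fixed)
    return cleaned


raw = [
    {"name": " alice ", "age": 30},
    {"name": "Alice", "age": 30},  # duplicate once formatting is fixed
    {"name": "Bob", "age": 25},
]
cleaned_rows = clean_records(raw)
```

Here the first two rows collapse into one because they differ only in whitespace and capitalization.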
6. Data normalization is:
(A) Compressing values
(B) Encrypting numbers
(C) Scaling data to a specific range to improve model performance
(D) Backup only
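Illustration for question 6: one common form of normalization is min-max scaling, which maps values linearly onto a target range. The sketch below assumes the range [0, 1] by default; the function name min_max_scale is hypothetical.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


scaled = min_max_scale([10, 20, 30])
```

With the input above, the smallest value maps to 0.0, the midpoint to 0.5, and the largest to 1.0.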
7. Data transformation includes:
(A) Backup only
(B) Encrypting transformations
(C) Compressing transformations
(D) Converting data formats, encoding categorical variables, and aggregating values
8. Handling missing values can be done by:
(A) Backup only
(B) Encrypting missing values
(C) Compressing missing values
(D) Removing rows, filling with mean/median/mode, or using predictive imputation
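Illustration for question 8: a minimal sketch of two of these strategies, dropping missing entries and filling them with the mean, median, or mode. It assumes missing values are represented as None; the function names are hypothetical, and predictive imputation is not shown.

```python
from statistics import mean, median, mode


def drop_missing(values):
    """Remove entries that are missing (represented here as None)."""
    return [v for v in values if v is not None]


def fill_missing(values, strategy="mean"):
    """Replace None with the mean, median, or mode of the observed values."""
    observed = drop_missing(values)
    if not observed:
        raise ValueError("cannot impute when every value is missing")
    statistic = {"mean": mean, "median": median, "mode": mode}[strategy]
    fill_value = statistic(observed)
    return [fill_value if v is None else v for v in values]


filled_mean = fill_missing([1, None, 3], strategy="mean")
filled_median = fill_missing([1, 3, None, 10], strategy="median")
filled_mode = fill_missing([1, 1, None, 4], strategy="mode")
```

Dropping is simplest but discards data; filling keeps the dataset size at the cost of pulling values toward the centre of the distribution.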
9. Outlier detection in pre-processing helps to:
(A) Backup only
(B) Encrypt outliers
(C) Compress outliers
(D) Identify and handle data points that deviate significantly from the rest
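Illustration for question 9: one widely used rule flags points that fall more than 1.5 times the interquartile range (IQR) outside the quartiles. This is only one of several possible detection methods; the function name and the multiplier default are assumptions for the sketch.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


readings = [10, 12, 11, 13, 12, 11, 95]
flagged = iqr_outliers(readings)
```

In this sample, 95 sits far above the other readings and is the only value flagged.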
10. Feature selection is:
(A) Encrypting features
(B) Choosing relevant variables to reduce dimensionality and improve model accuracy
(C) Compressing features
(D) Backup only
11. Feature extraction is:
(A) Backup only
(B) Encrypting features
(C) Compressing features
(D) Creating new features from existing data to better represent patterns
12. Data integration involves:
(A) Encrypting integration
(B) Combining data from multiple sources into a unified dataset
(C) Compressing integration
(D) Backup only
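Illustration for question 12: a minimal sketch of integrating two sources by matching rows on a shared key (an inner join). The function name merge_on_key and the sample tables are hypothetical; real integration usually also has to reconcile schemas and conflicting values.

```python
def merge_on_key(left, right, key):
    """Inner-join two lists of dicts on a shared key field."""
    # Index the right-hand source by key; if a key repeats, the last row wins.
    right_index = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = right_index.get(row[key])
        if match is not None:
            # Combine fields from both sources into one unified record.
            merged.append({**row, **match})
    return merged


customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]
orders = [{"id": 1, "total": 40}]
unified = merge_on_key(customers, orders, "id")
```

Only the customer with a matching order appears in the result, which is the defining behaviour of an inner join.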
13. Data reduction techniques include:
(A) Dimensionality reduction, sampling, and aggregation
(B) Encrypting data
(C) Compressing data only
(D) Backup only
14. One-hot encoding is used to:
(A) Compress categories
(B) Encrypt categories
(C) Convert categorical variables into binary vectors
(D) Backup only
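Illustration for question 14: a minimal sketch of one-hot encoding in plain Python. The function name one_hot_encode is hypothetical, and ordering the categories alphabetically is an assumption made so the output is deterministic.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary vector with a single 1."""
    categories = sorted(set(values))  # fixed, deterministic column order
    position = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        vector = [0] * len(categories)
        vector[position[value]] = 1  # mark the column for this category
        vectors.append(vector)
    return categories, vectors


columns, encoded = one_hot_encode(["red", "green", "red"])
```

Each vector has exactly one 1, in the column belonging to that row's category.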
15. Z-score standardization is:
(A) Scaling data based on mean and standard deviation
(B) Encrypting z-scores
(C) Compressing z-scores
(D) Backup only
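Illustration for question 15: the z-score of a value x is (x - mean) / standard deviation. The sketch below assumes the population standard deviation; using the sample standard deviation instead is an equally common choice. The function name z_scores is hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # No spread: every value equals the mean, so every z-score is 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


standardized = z_scores([2, 4, 6])
```

After standardization the values are centred on 0, and the middle input (equal to the mean) maps to exactly 0.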
16. Data discretization is:
(A) Converting continuous data into intervals or categories
(B) Encrypting intervals
(C) Compressing categories
(D) Backup only
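Illustration for question 16: a minimal sketch that bins a continuous value into labeled intervals. The bin edges and labels (age groups) are made up for the example, and the function name discretize is hypothetical.

```python
from bisect import bisect_right


def discretize(value, edges, labels):
    """Return the label of the interval containing value.

    edges must be sorted; a value equal to an edge falls in the upper interval.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


age_edges = [18, 65]
age_labels = ["minor", "adult", "senior"]
groups = [discretize(age, age_edges, age_labels) for age in (17, 18, 64, 65)]
```

Two edges produce three intervals: below 18, from 18 up to (but not including) 65, and 65 or above.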
17. Noise in data refers to:
(A) Compressing noise
(B) Encrypting errors
(C) Random errors or irrelevant information in datasets
(D) Backup only
18. Data pre-processing improves:
(A) Encrypting models
(B) Accuracy, efficiency, and performance of AI and ML models
(C) Compressing models
(D) Backup only
19. Sampling in data pre-processing is used to:
(A) Backup only
(B) Encrypt samples
(C) Compress samples
(D) Reduce dataset size while maintaining representative information
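Illustration for question 19: a minimal sketch of simple random sampling without replacement. It assumes a non-empty list of rows; the function name, the fraction parameter, and the fixed seed are hypothetical choices. Whether a sample is truly representative depends on the data, and stratified sampling is often preferred when groups are imbalanced.

```python
import random


def simple_random_sample(rows, fraction, seed=0):
    """Draw a reproducible random subset containing roughly `fraction` of the rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    sample_size = max(1, round(len(rows) * fraction))
    rng = random.Random(seed)  # local generator, so the global state is untouched
    return rng.sample(rows, sample_size)


population = list(range(100))
subset = simple_random_sample(population, 0.1)
```

Fixing the seed makes the draw repeatable, which helps when results need to be reproduced.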
20. The main purpose of data collection and pre-processing is to:
(A) Compress files only
(B) Encrypt data only
(C) Obtain clean, accurate, and structured data ready for analysis and model building
(D) Backup only