
# AWS Machine Learning Certification: Exploratory Data Analysis Practice Problems

2023. 2. 3.
1. What is the purpose of exploratory data analysis in machine learning?
a. To build the model architecture
b. To gather insights and make informed decisions about features and model selection
c. To evaluate model performance
d. To optimize hyperparameters

2. What is the difference between univariate and bivariate analysis?
a. Univariate analysis focuses on one feature at a time, while bivariate analysis focuses on two features at a time.
b. Univariate analysis focuses on analyzing the entire data set, while bivariate analysis focuses on two features at a time.
c. Univariate analysis focuses on the relationship between the target and one feature, while bivariate analysis focuses on the relationship between the target and two features.
d. Univariate analysis focuses on the distribution of one feature, while bivariate analysis focuses on the distribution of two features.
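
To make the two terms concrete, here is a minimal sketch using only the Python standard library. The `rows` data and the `plan` and `spend` feature names are made up for illustration; they are not part of any AWS dataset or API.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records with one categorical and one numeric feature.
rows = [
    {"plan": "basic", "spend": 10.0},
    {"plan": "basic", "spend": 20.0},
    {"plan": "premium", "spend": 40.0},
    {"plan": "premium", "spend": 50.0},
]

# Univariate: summarize a single feature on its own.
spend = [row["spend"] for row in rows]
spend_summary = {"mean": mean(spend), "median": median(spend)}

# Bivariate: look at how one feature behaves across values of another.
spend_by_plan = defaultdict(list)
for row in rows:
    spend_by_plan[row["plan"]].append(row["spend"])
mean_spend_by_plan = {plan: mean(vals) for plan, vals in spend_by_plan.items()}

print(spend_summary)
print(mean_spend_by_plan)
```

The first summary says nothing about `plan`; the second only makes sense because two features are examined together.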

3. Which of the following describes a common way to impute missing values in a numeric feature during exploratory data analysis?
a. To fill in missing values with the mean of the data set
b. To fill in missing values with the median of the data set
c. To fill in missing values with the mode of the data set
d. To fill in missing values with the mean of the feature
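
As a rough sketch of per-feature mean imputation, assuming missing values are represented as `None` in a plain Python list (the `ages` column and the helper name are hypothetical):

```python
from statistics import mean


def impute_with_feature_mean(values):
    """Replace None entries with the mean of the observed values of this feature."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a feature with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


ages = [20, None, 30, 40, None]  # hypothetical feature column
imputed_ages = impute_with_feature_mean(ages)
print(imputed_ages)
```

The fill value is computed from this one feature only, not from the whole data set, so each column would get its own mean.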

4. What is the difference between normalization and standardization in machine learning?
a. Normalization scales the data between 0 and 1, while standardization transforms the data to have a mean of 0 and a standard deviation of 1.
b. Normalization scales the data between -1 and 1, while standardization transforms the data to have a mean of 0 and a standard deviation of 2.
c. Normalization scales the data between 0 and 100, while standardization transforms the data to have a mean of 50 and a standard deviation of 25.
d. Normalization scales the data between -100 and 100, while standardization transforms the data to have a mean of 0 and a standard deviation of 50.
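
The two transformations can be sketched in a few lines of standard-library Python. This assumes min-max scaling for normalization and z-scores (using the population standard deviation) for standardization; the `incomes` values are illustrative only.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly; assumes the feature is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


incomes = [10.0, 20.0, 30.0, 40.0, 50.0]  # hypothetical feature
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)

print(min(normalized), max(normalized))
print(mean(standardized), pstdev(standardized))
```

Printing the minimum and maximum of one result, and the mean and standard deviation of the other, shows which property each transformation guarantees.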

5. What is the purpose of feature scaling in machine learning?
a. To increase the speed of the model training process
b. To handle outliers in the data
c. To improve the interpretability of the model
d. To ensure that all features are on the same scale

6. What is the difference between a histogram and a bar plot in exploratory data analysis?
a. A histogram is used to display the distribution of a single feature, while a bar plot is used to display the distribution of two features.
b. A histogram is used to display the distribution of two features, while a bar plot is used to display the distribution of a single feature.
c. A histogram is used to display the distribution of the target variable, while a bar plot is used to display the distribution of features.
d. A histogram is used to display the distribution of continuous data, while a bar plot is used to display the distribution of categorical data.
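
The data behind the two chart types can be sketched without a plotting library. The example below assumes fixed-width bins for the numeric feature and simple value counts for the categorical one; `heights_cm` and `colors` are made-up samples.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per half-open bin [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


heights_cm = [151.2, 158.9, 160.0, 163.4, 171.8, 175.5]  # numeric feature
colors = ["red", "blue", "red", "green", "red"]           # categorical feature

height_bins = histogram_counts(heights_cm, 10)  # what a histogram would draw
color_counts = Counter(colors)                  # what a bar plot would draw

print(height_bins)
print(color_counts)
```

The numeric feature has to be binned before it can be counted, while the categorical feature is counted per distinct value.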

7. What is the purpose of a scatter plot in exploratory data analysis?
a. To display the distribution of a single feature
b. To display the distribution of two features
c. To display the relationship between two features
d. To display the relationship between a feature and the target variable

8. What is the difference between a heatmap and a pairplot in exploratory data analysis?
a. A heatmap is used to display the distribution of two features, while a pairplot is used to display the distribution of all features.
b. A heatmap is used to display the relationship between two features, while a pairplot is used to display the relationship between all features and the target variable.
c. A heatmap is used to display the correlation between two features, while a pairplot is used to display the correlation between all features.
d. A heatmap is used to display the relationship between two continuous features, while a pairplot is used to display the relationship between continuous and categorical features.

9. What is the purpose of a box plot in exploratory data analysis?
a. To display the distribution of a single feature
b. To display the distribution of two features
c. To display the relationship between two features
d. To display the distribution and outliers of a feature
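
A box plot is drawn from quartiles, and points beyond the whiskers are commonly flagged with the 1.5 × IQR rule. The sketch below assumes that rule and uses `statistics.quantiles` with its default method; the `latencies_ms` sample is hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the quartiles and the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q2, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return (q1, q2, q3), outliers


latencies_ms = [10, 12, 13, 14, 15, 16, 18, 95]  # hypothetical feature
quartiles, outliers = iqr_outliers(latencies_ms)

print(quartiles)
print(outliers)
```

Exact quartile values depend on the interpolation method, so different tools may draw slightly different boxes for the same data.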

10. What is the difference between a correlation matrix and a covariance matrix in exploratory data analysis?
a. A correlation matrix measures the strength and direction of the linear relationship between features on a unitless scale from -1 to 1, while a covariance matrix measures how features vary together in their original units.
b. A correlation matrix measures the strength of the relationship between features, while a covariance matrix measures the linear relationship between features.
c. A correlation matrix measures the relationship between two features, while a covariance matrix measures the relationship between all features.
d. A correlation matrix measures the relationship between continuous features, while a covariance matrix measures the relationship between continuous and categorical features.
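
One way to see how the two measures differ is to change the units of a feature and recompute both. The sketch below implements sample covariance and Pearson correlation by hand; the height and weight values are invented for illustration.

```python
from statistics import mean, stdev


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / (stdev(xs) * stdev(ys))


height_m = [1.5, 1.6, 1.7, 1.8]       # hypothetical feature in meters
weight_kg = [50.0, 58.0, 66.0, 80.0]  # hypothetical feature in kilograms
height_cm = [h * 100 for h in height_m]  # same feature, different unit

cov_m, cov_cm = covariance(height_m, weight_kg), covariance(height_cm, weight_kg)
corr_m, corr_cm = correlation(height_m, weight_kg), correlation(height_cm, weight_kg)

print(cov_m, cov_cm)    # covariance changes with the unit
print(corr_m, corr_cm)  # correlation does not
```

Because covariance scales with the units, its magnitude is hard to compare across feature pairs, which is why correlation matrices are usually the ones visualized.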

Answers:

1. b. Exploratory data analysis is used to gather insights and make informed decisions about features and model selection.
2. a. Univariate analysis focuses on one feature at a time, while bivariate analysis focuses on two features at a time.
3. d. Mean imputation fills in missing values with the mean of the feature. Other techniques include median, mode, and regression imputation.
4. a. Normalization scales the data between 0 and 1, while standardization transforms the data to have a mean of 0 and a standard deviation of 1.
5. d. Scaling puts all features on the same scale so that they can be compared.
6. d. A histogram displays the distribution of continuous data, while a bar plot displays the distribution of categorical data.
7. c. A scatter plot displays the relationship between two features.
8. c is the closest option. A heatmap typically displays the correlation for each pair of features as a color-coded grid, while a pairplot draws pairwise scatter plots across all features.
9. d. A box plot displays the distribution and outliers of a feature.
10. a. Correlation is covariance rescaled by the standard deviations of the features, so it is unitless and bounded between -1 and 1, while covariance depends on the units of the features.
