
Q91 What is the primary goal of Exploratory Data Analysis?
Predict outcomes
Summarize data characteristics
Visualize predictions
Build models
Q92 Which of the following is a common technique used during EDA?
Clustering
PCA
Descriptive statistics
Feature selection
Q93 What is the significance of identifying skewness in data during EDA?
It helps in feature scaling
It determines model type
It affects data distribution assumptions
It improves visualization
Q94 Which visualization is best suited for analyzing the relationship between two numerical variables?
Histogram
Boxplot
Scatter plot
Bar chart
Q95 Why is it critical to detect multicollinearity during EDA?
It improves model accuracy
It ensures independence among predictors
It removes missing values
It selects important features
Q96 Which Python library is used for creating basic visualizations such as line and bar charts?
NumPy
Pandas
Matplotlib
Seaborn
Q97 How do you compute the correlation matrix for a DataFrame in Python?
df.corr()
df.describe()
df.cov()
df.plot()
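Note on Q97: a minimal sketch of how the listed DataFrame methods differ, assuming pandas is installed. The DataFrame and its column names (hours, score, noise) are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data: "score" is an exact linear function of "hours".
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 54, 56, 58, 60],
    "noise": [3, 1, 4, 1, 5],
})

corr = df.corr()         # pairwise Pearson correlations, each in [-1, 1]
cov = df.cov()           # pairwise sample covariances, in the units of the data
summary = df.describe()  # count, mean, std, min, quartiles, max per column

print(corr.round(2))
```

Because score rises by exactly 2 for every extra hour, the hours/score entry of the correlation matrix comes out as 1 (up to floating-point error), which is also the situation described in Q100.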
Q98 Which visualization technique is useful for identifying clusters in data during EDA?
Scatter plot
Heatmap
Boxplot
Pairplot
Q99 If a column in a dataset contains missing values, what is the simplest way to visualize their extent?
Use a scatter plot
Use a heatmap
Drop the column
Fill missing values
Q100 A dataset shows a perfect correlation of +1 between two variables. What is the likely issue?
Multicollinearity
Outliers
No issue
Wrong visualization
Q101 During EDA, an outlier is identified in a boxplot. What is the best course of action?
Remove the outlier
Keep the outlier
Investigate the outlier
Ignore the outlier
Q102 What is the primary purpose of hypothesis testing in statistics?
To clean data
To test assumptions
To visualize trends
To encode features
Q103 Which statistical measure represents the spread of data values around the mean?
Variance
Mean
Median
Skewness
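Note on Q103: a small sketch of the measures named in the options, using only Python's standard statistics module. The sample values are made up for illustration.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean = statistics.mean(values)             # central tendency
median = statistics.median(values)         # middle value of the sorted data
pop_var = statistics.pvariance(values)     # mean squared deviation from the mean
pop_std = statistics.pstdev(values)        # square root of the population variance
sample_var = statistics.variance(values)   # same idea, dividing by n - 1 instead of n

print(mean, median, pop_var, pop_std, round(sample_var, 3))
```

The mean and median describe where the data is centered, while the variance and its square root, the standard deviation, describe how far values spread around that center.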
Q104 At the conventional 0.05 significance level, when is a p-value considered statistically significant?
When p > 0.05
When p < 0.05
When p = 0.1
When p > 1
Q105 What does the standard deviation indicate in a dataset?
The central tendency
The variability
The skewness
The correlation
Q106 What type of statistical analysis helps identify relationships between variables?
Correlation analysis
Variance analysis
Skewness analysis
Descriptive statistics
Q107 What assumption do many parametric statistical tests make about the data?
Data is categorical
Data follows a normal distribution
Data has no missing values
Data is continuous
Q108 Which Python library provides the ttest_ind function for hypothesis testing?
Pandas
NumPy
SciPy
Matplotlib
Q109 How can you calculate the mean of a column in a Pandas DataFrame?
df.column.mean()
df.mean(column)
mean(df.column)
df.column.calc_mean()
Q110 A dataset has a column with skewed numerical data. What is the best approach to reduce the skew?
Use log transformation
Drop outliers
Encode values
Use boxplot
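Note on Q110: a rough sketch of how a log transformation can pull in a long right tail, using only the standard library. The incomes list and the skewness helper are hypothetical, written for this illustration. The transformation assumes non-negative values; log1p is used so that zeros would not fail.

```python
import math
import statistics

def skewness(xs):
    """Population skewness: the mean of the cubed z-scores."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return sum(((x - mu) / sigma) ** 3 for x in xs) / len(xs)

# Made-up right-skewed data: most values are small, a few are very large.
incomes = [20, 22, 25, 27, 30, 35, 40, 60, 120, 400]

# log1p(x) = log(1 + x) compresses large values far more than small ones.
logged = [math.log1p(x) for x in incomes]

print(round(skewness(incomes), 2), round(skewness(logged), 2))
```

On this sample the skewness is smaller after the transformation but still positive, so a log transform reduces skew without guaranteeing a symmetric distribution.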
Q111 A statistical test on a dataset returns a p-value of 0.01. What does this imply?
Strong evidence against the null hypothesis
No evidence against the null hypothesis
Data is normally distributed
Data has no variance
Q112 After standardizing data, some z-scores in a column are very large in magnitude. What could be the issue?
Incorrect scaling
Outliers
Data is normalized
No issue
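Note on Q112: a minimal sketch of standardization with the standard library, showing how a single extreme value ends up with a z-score far from zero. The readings list is made up, and the 2.5 cutoff is an illustrative assumption rather than a fixed rule.

```python
import statistics

readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]  # made-up; 95 is an extreme value

mu = statistics.fmean(readings)
sigma = statistics.pstdev(readings)

# Standardize: subtract the mean, divide by the standard deviation.
z_scores = [(x - mu) / sigma for x in readings]

# Illustrative cutoff (assumption): flag values more than 2.5 standard deviations out.
flagged = [x for x, z in zip(readings, z_scores) if abs(z) > 2.5]

print(flagged)
```

The extreme value also inflates the mean and standard deviation themselves, which is why the remaining readings all get small negative z-scores here.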
Q113 What is the primary purpose of data visualization?
To analyze data
To predict outcomes
To represent data visually
To encode data
Q114 Which visualization is best suited for showing data distribution?
Line chart
Scatter plot
Histogram
Pie chart
Q115 Which chart is most effective for comparing parts of a whole?
Scatter plot
Pie chart
Boxplot
Line chart
Q116 What does a boxplot help identify in a dataset?
Outliers
Correlations
Clusters
Trends
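Note on Q116: boxplots commonly draw points as outliers when they fall more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule by hand with the standard library. The data is made up, and the quartile method ("inclusive") is an assumption, since plotting libraries can compute quartiles slightly differently.

```python
import statistics

data = [12, 14, 14, 15, 16, 17, 18, 19, 21, 48]  # made-up sample

# Quartiles: the three cut points that split the sorted data into four parts.
q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Whisker limits under the 1.5 * IQR rule.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower or x > upper]
print(q1, q3, outliers)
```

A point flagged this way is only a candidate for a closer look, in line with Q101, where investigating the outlier is one of the listed options.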
Q117 Which of the following is a common mistake in data visualization?
Using appropriate scales
Choosing the right chart type
Overloading charts with data
Labeling axes
Q118 Which Matplotlib function is used to create a simple line chart?
plt.scatter()
plt.line()
plt.plot()
plt.bar()
Q119 How do you create a bar chart in Matplotlib?
plt.bar(x, y)
plt.plot(x, y)
plt.hist(x)
plt.scatter(x, y)
Q120 Which Python library allows for creating highly interactive visualizations with minimal coding?
Seaborn
Matplotlib
Plotly
Pandas

