Exam code: DS-200
Exam name: Data Science Essentials Beta

Why should you stop an iterative machine learning algorithm as soon as the performance of the model on a test set stops improving?
A. To avoid the need for cross-validating the model
B. To prevent overfitting
C. To increase the VC (Vapnik-Chervonenkis) dimension for the model
D. To keep the number of terms in the model as small as possible
E. To maintain the highest VC (Vapnik-Chervonenkis) dimension for the model
Answer: B

What is the default delimiter for Hive tables?
A. ^A (Control-A)
B. , (comma)
C. \t (tab)
D. : (colon)
Answer: A
Reference: html (change the delimiter when exporting hive table)

Certain individuals are more susceptible to autism if they have particular combinations of genes expressed in their DNA. Given a sample of DNA from persons who have autism and a sample of DNA from persons who do not have autism, determine the best technique for predicting whether or not a given individual is susceptible to developing autism.
A. Naive Bayes
B. Linear regression
C. Survival analysis
D. Sequence alignment
Answer: A

You are working with a logistic regression model to predict the probability that a user will click on an ad. Your model has hundreds of features, and you're not sure if all of those features are helping your prediction. Which regularization technique should you use to prune features that aren't contributing to the model?
A. Convex
B. Uniform
C. L2
D. L1
Answer: D

Refer to the exhibit. Which point in the figure is the median?
A. A
B. B
C. C
Answer: A

Refer to the exhibit. Which point in the figure is the mode?
A. A
B. B
C. C
Answer: C
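
Note on the early-stopping question: the idea can be sketched in a few lines of Python. This is only an illustration, not part of the exam material. The function name, the `patience` parameter, and the loss values are all hypothetical. Training halts once the held-out loss has failed to improve for `patience` consecutive epochs, and the model from the best epoch is the one that would be kept.

```python
def train_with_early_stopping(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of held-out losses.

    val_losses stands in for the loss measured after each training epoch;
    in a real loop it would be computed on the fly (assumption for brevity).
    """
    best_loss = float("inf")
    best_epoch = -1
    stop_epoch = len(val_losses) - 1  # default: ran through every epoch
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Held-out performance improved: remember this epoch.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: further training would
            # only fit the training set more closely (overfitting).
            stop_epoch = epoch
            break
    return best_epoch, stop_epoch


# Made-up losses: they improve until epoch 3, then get worse.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best, stopped = train_with_early_stopping(losses, patience=2)
print(best, stopped)
```

With these made-up numbers the loop stops at epoch 5 and keeps the model from epoch 3.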

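
Note on the Hive delimiter question: a text file written with Hive's default field delimiter separates columns with the Control-A character (`\x01`). A minimal Python sketch of splitting such a row follows. The sample row and the helper name are invented for illustration.

```python
HIVE_DEFAULT_FIELD_DELIMITER = "\x01"  # Control-A, shown as ^A in many editors


def split_hive_row(line):
    """Split one line of a default-delimited Hive text file into fields."""
    return line.rstrip("\n").split(HIVE_DEFAULT_FIELD_DELIMITER)


# Hypothetical row with three columns: id, name, city.
row = "42\x01alice\x01Berlin\n"
fields = split_hive_row(row)
print(fields)
```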

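
Note on the regularization question: one way to see why an L1 penalty prunes features while an L2 penalty does not is to compare the shrinkage step each one applies to a single weight. The sketch below assumes the standard proximal updates: soft-thresholding for the penalty `lam * |w|`, and uniform scaling for the penalty `(lam / 2) * w**2`. The weights and `lam` are made up, and this is not a full logistic regression fit.

```python
def l1_shrink(w, lam):
    """Soft-thresholding: proximal step for the penalty lam * |w|."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0  # weights with |w| <= lam are set exactly to zero


def l2_shrink(w, lam):
    """Proximal step for the penalty (lam / 2) * w**2: scales, never zeroes."""
    return w / (1.0 + lam)


# Hypothetical weights: two strong features and two weak ones.
weights = [2.0, -0.3, 0.05, -1.5]
lam = 0.5

l1_weights = [l1_shrink(w, lam) for w in weights]
l2_weights = [l2_shrink(w, lam) for w in weights]

print(l1_weights)  # weak features become exactly 0.0
print(l2_weights)  # every weight shrinks but stays nonzero
```

The weak features drop out entirely under the L1 step, which is the pruning behaviour the question asks about. The L2 step only makes every weight smaller.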