Hi everyone.
Apologies if this isn’t the best place to post this; I thought it’d be better than r/learnpython since it’s a bit more advanced of a question.
I’m working through Introduction to Statistical Learning with Python and currently on Chapter 2, Exercise 9. This exercise uses the Auto data set which has the following predictors:
mpg, cylinders, displacement, horsepower, weight, acceleration, year, origin, name
Part (a) of this question asks: *Which of the predictors are quantitative, and which are qualitative?*
I sorted them as follows:
- quantitative: mpg, displacement, horsepower, weight, acceleration
- qualitative: cylinders, year, origin, name
I then consulted some other people’s solutions online (as well as some Google searches) and found the following results:
- Using `df.select_dtypes(include=['number']).columns` and `df.select_dtypes(exclude=['number']).columns` gave the answer that only “name” is qualitative; all others are quantitative.
- Only “name” and “origin” are qualitative; all others are quantitative.
- All variables except “horsepower” and “name” are quantitative.
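For reference, here is a minimal sketch of what the `select_dtypes` approach actually checks. It only looks at how pandas stored each column, not at what the column means. The tiny DataFrame below is a made-up stand-in for the Auto data (the values are illustrative, not the real rows), and the "?" placeholder in horsepower is an assumption about how missing values might be encoded in the file. That could be one reason a dtype-based answer ends up listing “horsepower” next to “name”.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Auto data; values are illustrative only.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 24.0],
    "cylinders": [8, 8, 4],
    # Assumption: a "?" placeholder for a missing value forces the column to load as text.
    "horsepower": ["130", "?", "95"],
    "origin": [1, 1, 3],
    "name": ["car a", "car b", "car c"],
})

# select_dtypes splits columns purely by storage type.
numeric_cols = df.select_dtypes(include=["number"]).columns.tolist()
non_numeric_cols = df.select_dtypes(exclude=["number"]).columns.tolist()
print(numeric_cols)      # stored as numbers
print(non_numeric_cols)  # stored as anything else

# Converting the placeholder to NaN turns horsepower into a numeric column.
cleaned = df.assign(horsepower=pd.to_numeric(df["horsepower"], errors="coerce"))
numeric_after = cleaned.select_dtypes(include=["number"]).columns.tolist()
print(numeric_after)
```

Note that “origin” is reported as numeric here simply because it is stored as integer codes, so the dtype alone can't say whether a column is a category label.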
And some Google searches stated that, for example, “year” is a quantitative predictor, not qualitative as I would expect.
Am I misunderstanding how to classify a predictor as either qualitative or quantitative?
In my mind, qualitative is more or less synonymous with categorical: there is a finite number of categories into which a value can be placed. It also helps me to think about whether the value is able/likely to change for a given observation. For example, ‘mpg’ is quantitative (in part) because it could easily change as the car is used; whereas a car’s model year or number of cylinders can’t change, so the cars can be sorted into discrete categories based on these characteristics.
By this understanding, I would think predictors such as cylinders (4-cyl, v6, v8) and year the car was manufactured (1970, 1971, 1972, etc.) would be qualitative/categorical.
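To make the "finite number of categories" idea concrete, here is a rough sketch that counts the distinct values per column and then explicitly marks some columns as categorical. The data is again a made-up stand-in for Auto, and the choice of which columns to cast (cylinders and origin) is only an illustration of the mechanics, not a claim about the correct answer to the exercise.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Auto data; values are illustrative only.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 24.0, 27.5, 31.0, 22.3],
    "cylinders": [8, 8, 4, 4, 4, 6],
    "year": [70, 70, 71, 72, 72, 71],
    "origin": [1, 1, 3, 2, 3, 1],
})

# Number of distinct values per column: a few distinct values hints at a
# column that *could* be treated as categories, but does not decide it.
unique_counts = df.nunique()
print(unique_counts)

# Explicitly mark columns as categorical (illustrative choice of columns).
as_category = df.astype({"cylinders": "category", "origin": "category"})

# After the cast, dtype-based selection follows the explicit choice.
quantitative = as_category.select_dtypes(include=["number"]).columns.tolist()
qualitative = as_category.select_dtypes(include=["category"]).columns.tolist()
print(quantitative)
print(qualitative)
```

The point of the sketch is that the same integer column can be handled either way; the split comes from the modelling decision, not from the stored type.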
Am I thinking about this wrong? Or is my solution a fairly accurate way of thinking?
submitted by /u/godshammer_86