NumPy Axis Confusion /u/godshammer_86 Python Education

Not terribly new to Python or programming in general, but I’m looking at some initial NumPy exercises for Introduction to Statistical Learning with Applications in Python and I’m rather confused about the axis parameter to some methods.

Here’s the relevant supporting code from the ISLP Lab Notebook:

```python3
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 3))

# array([[ 0.22578661, -0.35263079, -0.28128742],
#        [-0.66804635, -1.05515055, -0.39080098],
#        [ 0.48194539, -0.23855361,  0.9577587 ],
#        [-0.19980213,  0.02425957,  1.54582085],
#        [ 0.54510552, -0.50522874, -0.18283897],
#        [ 0.54052513,  1.93508803, -0.26962033],
#        [-0.24355868,  1.0023136 , -0.88645994],
#        [-0.29172023,  0.88253897,  0.58035002],
#        [ 0.0915167 ,  0.67010435, -2.82816231],
#        [ 1.02130682, -0.95964476, -1.66861984]])

X.mean(axis=0)
# array([ 0.15030588,  0.14030961, -0.34238602])
```

The accompanying MD text here says:

> The np.mean(), np.var(), and np.std() functions can also be applied to the rows and columns of a matrix. To see this, we construct a matrix of random variables, and consider computing its row sums.
>
> Since arrays are row-major ordered, the first axis, i.e. axis=0, refers to its rows. We pass this argument into the mean() method for the object X.

I know axis=0 is rows and axis=1 is columns (for this example, at least). Since the example is passing axis 0 (rows) to the mean method, I would expect an output array of length 10 that calculates the mean of 3 elements in each row.

But the example appears to be calculating the mean of 10 elements in each of 3 columns, given the output array of length 3.

Further, when I calculate the column mean using X.mean(axis=1), I get the result I expected for axis=0, so the outputs are being switched.
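To pin down which direction each call reduces over, the output shapes can be checked against a manual loop. This is only a sketch reusing the seed from the lab code; the names `col_means`, `row_means`, `manual_cols`, and `manual_rows` are mine, not from the notebook.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 3))

# axis=0 is the axis of length 10; reducing over it leaves 3 values
col_means = X.mean(axis=0)
# axis=1 is the axis of length 3; reducing over it leaves 10 values
row_means = X.mean(axis=1)

print(col_means.shape)  # (3,)
print(row_means.shape)  # (10,)

# Manual equivalents: one mean per column, then one mean per row
manual_cols = np.array([X[:, j].mean() for j in range(X.shape[1])])
manual_rows = np.array([X[i, :].mean() for i in range(X.shape[0])])

print(np.allclose(col_means, manual_cols))  # True
print(np.allclose(row_means, manual_rows))  # True
```

Read this way, the `axis` argument names the axis that disappears from the result, which would explain why `axis=0` returns one value per column.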

The NumPy documentation didn’t provide any additional clarity.

Hoping someone can provide an explanation for what’s going on here. Is there possibly a setting (perhaps embedded in the notebook, that I can’t see) that allows axes to be switched, where rows are axis=1 and columns are axis=0?
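One way to rule out a notebook-specific setting is to run the same reduction on a tiny hand-written array in a fresh interpreter. A minimal sketch follows; the array `A` and the variable names are made up for illustration.

```python
import numpy as np

# 2 rows, 3 columns, with values that are easy to average by hand
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

down_rows = A.mean(axis=0)    # averages over the 2 rows: one value per column
across_cols = A.mean(axis=1)  # averages over the 3 columns: one value per row

print(down_rows)    # [2.5 3.5 4.5]
print(across_cols)  # [2. 5.]
```

If a plain script shows the same pattern as the notebook, the behavior comes from NumPy itself rather than from anything hidden in the notebook.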

Thanks in advance for any help!

submitted by /u/godshammer_86

r/learnpython