Examples of how to transform (encode) a qualitative (categorical) variable into a quantitative variable with scikit-learn in Python.

### Input matrix

Let's consider the following input matrix X:

`from sklearn import preprocessing`

`import numpy as np`

`X = np.array(('A','C','B','A','C','D','A'))`

of shape

`print(X.shape)`

`(7,)`

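that can be reshaped into a single column, since scikit-learn encoders expect a 2D array of shape (n_samples, n_features):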

`X = X.reshape(-1,1)`

The new shape is

`print(X.shape)`

`(7, 1)`
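To illustrate why the reshape step matters, here is a minimal sketch. The names `X_1d`, `X_2d` and `failed` are hypothetical and used only for illustration. It assumes that fitting an encoder on a 1D array raises a `ValueError`, while the reshaped column is accepted:

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Same data as above, kept as a 1D array of shape (7,)
X_1d = np.array(['A', 'C', 'B', 'A', 'C', 'D', 'A'])

# Fitting on the 1D array is expected to fail, because the encoder
# wants a 2D array of shape (n_samples, n_features)
try:
    OrdinalEncoder().fit(X_1d)
    failed = False
except ValueError:
    failed = True
print(failed)

# After reshaping into one column (7 samples, 1 feature), fitting works
X_2d = X_1d.reshape(-1, 1)
OrdinalEncoder().fit(X_2d)
print(X_2d.shape)
```

Using `reshape(-1, 1)` lets NumPy infer the number of rows, so the same line works for any number of samples.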

### Encoding the elements of matrix X using the class OrdinalEncoder

To encode the elements of matrix X, a solution is to use OrdinalEncoder, which maps each category to an integer code following the sorted order of the categories:

`enc = preprocessing.OrdinalEncoder(categories='auto')`

`enc.fit(X)`

`print( enc.transform(X) )`

returns

`[[0.]`

`[2.]`

`[1.]`

`[0.]`

`[2.]`

`[3.]`

`[0.]]`
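To check which code corresponds to which category, one option is to look at the `categories_` attribute of the fitted encoder and to map the codes back with `inverse_transform`. The following is a minimal sketch; the variable names `codes`, `learned` and `restored` are hypothetical:

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

X = np.array(['A', 'C', 'B', 'A', 'C', 'D', 'A']).reshape(-1, 1)

enc = OrdinalEncoder(categories='auto')
codes = enc.fit_transform(X)  # fit and transform in one step

# categories_ holds one array of categories per input column;
# the position of a category in that array is its integer code
learned = enc.categories_[0]
print(learned)

# inverse_transform maps the integer codes back to the original labels
restored = enc.inverse_transform(codes)
print(restored.ravel())
```

Note that the integer codes imply an order (A < B < C < D), which may or may not be meaningful for a given variable.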

### Encoding the elements of matrix X using the class OneHotEncoder

Another solution is to encode the elements of matrix X using OneHotEncoder, which creates one binary column per category and returns a sparse matrix by default:

`enc = preprocessing.OneHotEncoder(categories='auto')`

`enc.fit(X)`

`print( enc.transform(X) )`

returns

`(0, 0) 1.0`

`(1, 2) 1.0`

`(2, 1) 1.0`

`(3, 0) 1.0`

`(4, 2) 1.0`

`(5, 3) 1.0`

`(6, 0) 1.0`

To get a dense matrix, just use toarray():

`print( enc.transform(X).toarray() )`

gives here

`[[1. 0. 0. 0.]`

`[0. 0. 1. 0.]`

`[0. 1. 0. 0.]`

`[1. 0. 0. 0.]`

`[0. 0. 1. 0.]`

`[0. 0. 0. 1.]`

`[1. 0. 0. 0.]]`
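Each column of the one-hot matrix corresponds to one entry of `categories_`. As a minimal sketch, the example below also sets `handle_unknown='ignore'`, an option not used above, so that a category not seen during fitting (here the hypothetical label `'E'`) is encoded as a row of zeros instead of raising an error. The variable names `onehot`, `labels`, `unseen` and `restored` are assumptions made for illustration:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array(['A', 'C', 'B', 'A', 'C', 'D', 'A']).reshape(-1, 1)

# handle_unknown='ignore' is an assumption of this sketch, not part of the example above
enc = OneHotEncoder(categories='auto', handle_unknown='ignore')
enc.fit(X)

# Dense one-hot matrix: one row per sample, one column per category
onehot = enc.transform(X).toarray()

# Column j of the one-hot matrix corresponds to labels[j]
labels = enc.categories_[0]
print(labels)

# A category unseen during fit is encoded as an all-zero row
unseen = enc.transform(np.array([['E']])).toarray()
print(unseen)

# inverse_transform recovers the original labels from the one-hot rows
restored = enc.inverse_transform(onehot)
print(restored.ravel())
```

Unlike the ordinal encoding, the one-hot encoding does not impose any order on the categories, at the cost of one column per category.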