# How to transform (encode) a qualitative (categorical) variable into a quantitative variable with scikit-learn in Python?

Published: August 25, 2020

Examples of how to transform (encode) a qualitative (categorical) variable into a quantitative variable with scikit-learn in Python.

### Input matrix

Let's consider the following input array X of categorical labels:

```
from sklearn import preprocessing
import numpy as np

X = np.array(('A','C','B','A','C','D','A'))
```

of shape

```
print(X.shape)

(7,)
```

The scikit-learn encoders expect a 2D array of shape (n_samples, n_features), so X is reshaped into a single column:

```
X = X.reshape(-1,1)
```

X now has the shape:

```
print(X.shape)

(7, 1)
```
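As a small sketch of the reshape step, the following shows two equivalent ways to turn the 1D array into a single-column 2D array. The names `X_flat`, `X_col` and `X_col_alt` are illustrative and do not appear in the example above.

```python
import numpy as np

X_flat = np.array(('A', 'C', 'B', 'A', 'C', 'D', 'A'))

# reshape(-1, 1): -1 lets NumPy infer the number of rows, 1 fixes one column
X_col = X_flat.reshape(-1, 1)

# Equivalent alternative: add a trailing axis with np.newaxis
X_col_alt = X_flat[:, np.newaxis]

print(X_col.shape)                       # (7, 1)
print(np.array_equal(X_col, X_col_alt))  # True
```

Either form works here; `reshape(-1, 1)` is the one used in the rest of this article.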

### Encoding the elements of matrix X using the class OrdinalEncoder

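To encode the elements of matrix X, one solution is to use OrdinalEncoder, which maps each category to an integer code (with categories='auto', the codes follow the sorted order of the labels found in X):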

```
enc = preprocessing.OrdinalEncoder(categories='auto')
enc.fit(X)

print(enc.transform(X))
```

returns

```
[[0.]
 [2.]
 [1.]
 [0.]
 [2.]
 [3.]
 [0.]]
```
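To see which label each integer code stands for, a minimal sketch is to inspect the fitted `categories_` attribute and to map the codes back with `inverse_transform`. The variable names `codes` and `X_back` are illustrative.

```python
import numpy as np
from sklearn import preprocessing

X = np.array(('A', 'C', 'B', 'A', 'C', 'D', 'A')).reshape(-1, 1)

enc = preprocessing.OrdinalEncoder(categories='auto')
codes = enc.fit_transform(X)  # fit and transform in a single call

# categories_ holds one array per input column; the position of a
# label in that array is the integer code assigned to it
print(enc.categories_[0])

# inverse_transform maps the codes back to the original labels
X_back = enc.inverse_transform(codes)
print(np.array_equal(X_back, X))
```

Because the codes are consecutive integers, a model may treat them as ordered (A < B < C < D). For labels without a natural order, one-hot encoding avoids that implication.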

### Encoding the elements of matrix X using the class OneHotEncoder

Another solution is to encode the elements of matrix X with OneHotEncoder, which creates one binary column per category:

```
enc = preprocessing.OneHotEncoder(categories='auto')
enc.fit(X)

print(enc.transform(X))
```

returns a SciPy sparse matrix, printed as (row, column) value entries:

```
  (0, 0)    1.0
  (1, 2)    1.0
  (2, 1)    1.0
  (3, 0)    1.0
  (4, 2)    1.0
  (5, 3)    1.0
  (6, 0)    1.0
```

To get a dense NumPy array, call toarray() on the result:

```
print(enc.transform(X).toarray())
```

which gives:

```
[[1. 0. 0. 0.]
 [0. 0. 1. 0.]
 [0. 1. 0. 0.]
 [1. 0. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]
 [1. 0. 0. 0.]]
```
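The sketch below shows how the one-hot columns relate to the labels and what can happen with a label that was not present during fitting. It assumes the option `handle_unknown='ignore'`, which is not used in the example above; `X_new`, `encoded`, `decoded` and the label 'E' are hypothetical additions for illustration.

```python
import numpy as np
from sklearn import preprocessing

X = np.array(('A', 'C', 'B', 'A', 'C', 'D', 'A')).reshape(-1, 1)

# handle_unknown='ignore' encodes unseen categories as an all-zero row
# instead of raising an error at transform time
enc = preprocessing.OneHotEncoder(categories='auto', handle_unknown='ignore')
enc.fit(X)

# Column j of the encoded matrix corresponds to enc.categories_[0][j]
print(enc.categories_[0])

# 'E' was not seen during fit
X_new = np.array(['B', 'E']).reshape(-1, 1)
encoded = enc.transform(X_new).toarray()
print(encoded)

# inverse_transform recovers known labels; the all-zero row maps to None
decoded = enc.inverse_transform(encoded)
print(decoded)
```

With the default `handle_unknown='error'`, transforming `X_new` would raise an error instead, which may be preferable when an unseen label indicates a data problem.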