Which function of the scipy.cluster.vq module is used to normalize observations on each feature dimension?
Before running the k-means algorithm, it is generally beneficial to rescale each feature dimension of the observation set. The function scipy.cluster.vq.whiten(obs, check_finite = True) is used for this purpose. It divides each feature dimension of the observation set by its standard deviation (SD) across all observations, which gives every feature unit variance.
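The rescaling described above can be reproduced by hand as a sanity check. The following is a minimal sketch, assuming a small made-up array (the values in obs are hypothetical and chosen only for illustration). It compares the result of whiten with a manual division by the per-column SD. Note that the columns are only divided by their SD; the mean is not subtracted.

```python
import numpy as np
from scipy.cluster.vq import whiten

# Hypothetical sample data: 4 observations (rows), 3 features (columns).
obs = np.array([[1.0, 10.0, 100.0],
                [2.0, 30.0, 150.0],
                [3.0, 20.0, 400.0],
                [4.0, 60.0, 250.0]])

# Manual rescaling: divide each column by its (population) standard deviation.
manual = obs / obs.std(axis=0)

scaled = whiten(obs)

# The library result should agree with the manual division.
print(np.allclose(scaled, manual))
# After rescaling, every column should have an SD of 1.
print(np.allclose(scaled.std(axis=0), 1.0))
```

Both checks should print True, since an SD of 1 is the same as unit variance.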
Parameters
The parameters of the function scipy.cluster.vq.whiten(obs, check_finite = True) are given below −
- obs − ndarray
The array to be rescaled. Each row is an observation, and the columns are the features seen during each observation. For example −
obs = [[ 1., 1., 1.],
       [ 2., 2., 2.],
       [ 3., 3., 3.],
       [ 4., 4., 4.]]
- check_finite − bool, optional
This parameter controls whether the input is checked to contain only finite numbers. Disabling the check may give a performance gain, but it may also cause problems such as crashes or non-termination if the observations contain infinities or NaNs. The default value is True.
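The effect of the default finiteness check can be illustrated with a small sketch. The array bad_obs below is hypothetical and contains one infinite entry. With check_finite left at True, whiten is expected to refuse the input by raising a ValueError rather than returning a meaningless result.

```python
import numpy as np
from scipy.cluster.vq import whiten

# Hypothetical data with one non-finite entry in the second column.
bad_obs = np.array([[1.0, 2.0],
                    [3.0, np.inf],
                    [5.0, 6.0]])

try:
    whiten(bad_obs)  # check_finite defaults to True
    rejected = False
except ValueError as err:
    rejected = True
    print("whiten rejected the input:", err)

# An explicit check like this one can be run upstream if the
# built-in check is going to be disabled for speed.
print(np.isfinite(bad_obs).all())
```

If the check is disabled with check_finite = False, it is up to the caller to make sure the data is finite, for example with np.isfinite as in the last line.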
Returns
It returns an array containing the values in obs, with each column divided by its SD.
Example
import numpy as np
from scipy.cluster.vq import whiten

observations = np.array([[2.9, 1.3, 1.9],
                         [1.7, 3.2, 1.1],
                         [1.0, 0.2, 1.7]])
whiten(observations)
Output
array([[3.69627581, 1.04908478, 5.58930985],
       [2.16678237, 2.58236253, 3.23591623],
       [1.27457787, 0.16139766, 5.00096145]])
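Since whitening is meant as a preprocessing step for k-means, the following sketch shows how the two might be combined. The data is made up, with two features on very different scales, and the initial centroids are picked by hand (one observation from each apparent group) so that the run is deterministic. Both choices are assumptions made for illustration, not a recommended recipe.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

# Hypothetical data: two features on very different scales,
# forming two clearly separated groups of three observations.
obs = np.array([[1.0, 1000.0],
                [1.2, 1100.0],
                [0.9,  900.0],
                [5.0, 5000.0],
                [5.2, 5100.0],
                [4.8, 4900.0]])

# Rescale so that both features have unit variance.
scaled = whiten(obs)

# Use one observation from each group as the initial centroids.
# Passing an integer k instead would choose them at random.
initial_guess = scaled[[0, 3]]
codebook, distortion = kmeans(scaled, initial_guess)

# Assign each whitened observation to its nearest centroid.
labels, _ = vq(scaled, codebook)
print(labels)  # expected: [0 0 0 1 1 1]
```

The centroids in codebook live in the whitened space, so any new observation has to be divided by the same per-column SD before it is compared against them.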