I have recently been looking into how to handle missing values for a project. Because the data being analyzed is diverse, a small part of it is missing. Two things are giving me a headache:
1. R has the mice package, which specializes in handling missing values. Is there anything similar in the all-purpose Python?
2. How can missing values of string type be filled in? Clustering and regression both work on numerical types, so are there good algorithms or well-packaged libraries for character types?
Any answers from the experts would be much appreciated.
PS: An example is hard to describe in words, so here is one:
name,password,age,address
Zhang San,123456,15.3,sichuang
李思,12,12.2, wuhan
王五,232,12,
钱六,,23,nanchang
haha,123456,,lal
拉拉,123123,,mmm
I would like to fill in the missing values quickly in Python, the way the mice package does in R. (The columns in this example are not very related to each other, but the data I actually need to process has more correlation.) The second problem is filling in a string-type column, such as address in the example, from the other attributes.
PyMICE is a Python® library for mice behavioral data analysis. Could you check whether it is what you are looking for?
https://neuroinflab.wordpress...
http://neuroinflab.github.io/...
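Since the question is about filling in missing values, here is a minimal sketch of one possible alternative, assuming pandas and scikit-learn are installed. scikit-learn's `IterativeImputer` is still marked experimental and, according to its documentation, was inspired by R's mice; by default it returns a single imputation rather than several. The toy data, the column names and the choice of a 1-nearest-neighbour classifier for the string column are all my own assumptions for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd
# IterativeImputer is experimental; this import must come before the next one.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.neighbors import KNeighborsClassifier

# Made-up toy data (hypothetical): age and income are correlated,
# and city depends on both of them.
df = pd.DataFrame({
    "age":    [15.0, 16.0, np.nan, 40.0, 41.0, 42.0],
    "income": [1.0, 1.1, 1.2, 5.0, np.nan, 5.2],
    "city":   ["wuhan", "wuhan", "wuhan", "nanchang", "nanchang", None],
})
num_cols = ["age", "income"]

# Step 1: chained-equation style imputation of the numeric columns.
# Each column with gaps is regressed on the others, round after round.
imputer = IterativeImputer(random_state=0)
df[num_cols] = imputer.fit_transform(df[num_cols])

# Step 2: treat the string column as a classification target.
# Train on rows where city is known, predict it where it is missing.
known = df["city"].notna()
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(df.loc[known, num_cols], df.loc[known, "city"].tolist())
df.loc[~known, "city"] = clf.predict(df.loc[~known, num_cols])

print(df)
```

Treating the string column as a label to predict is only one option. If the other columns carry little information about it, filling in the most frequent value (`df["city"].mode()`) may be just as good.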