
Imputing a Categorical Variable Using Python Machine Learning | Data Imputation


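The Python file data_imputation_categorical.py imputes one categorical variable: every missing (blank) value in the chosen column is replaced with that column's most frequent value (the mode).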


data_imputation_categorical.py

from collections import Counter

in_filename = 'weather_orginal.csv'                  # 1. original file
variable_index = 0                                   # 2. column to impute (0 = first column)
out_filename = 'weather_imputation_categorical.csv'  # 3. result file

# Read the header line and the data lines once.
with open(in_filename, 'r') as in_file:
    firstline = next(in_file)
    lines = in_file.readlines()

# Collect the non-missing values of the chosen column.
value_list = []
for line in lines:
    words = line.rstrip("\n").split(",")
    value = words[variable_index].strip()
    if value != "":
        value_list.append(value)

# The most frequent value (the mode) becomes the imputed value.
c = Counter(value_list)
imputed = c.most_common(1)[0][0]

# Write the header, then every row with blanks in the column filled in.
with open(out_filename, "w") as outF:
    outF.write(firstline)
    for line in lines:
        words = line.rstrip("\n").split(",")
        if words[variable_index].strip() == "":
            words[variable_index] = imputed
        outLine = ','.join(words) + "\n"
        print(outLine, end="")
        outF.write(outLine)

You can modify three places in this file to fit your use:

  1. The original file (the input CSV file name)

  2. The variable index of the categorical variable (variable index 0 is the first column, since Python counts from 0)

  3. The result file (the output CSV file name)
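The three settings above can also be passed in as function arguments. The following is only a minimal sketch, not part of the original script: the function name impute_categorical and its parameter names are hypothetical, it uses Python's standard csv module instead of splitting on commas by hand, and it assumes the file has a header row and at least one non-blank value in the chosen column.

```python
import csv
from collections import Counter


def impute_categorical(in_path, out_path, variable_index=0):
    """Fill blank cells in one column with that column's most frequent value.

    Hypothetical helper for illustration; returns the imputed value.
    """
    with open(in_path, newline="") as in_file:
        rows = list(csv.reader(in_file))
    header = rows[0]
    data = [row for row in rows[1:] if row]  # skip completely empty lines

    # Count only the non-blank values of the chosen column.
    counts = Counter(
        row[variable_index].strip()
        for row in data
        if row[variable_index].strip() != ""
    )
    imputed = counts.most_common(1)[0][0]

    # Replace blank cells, then write the header and rows back out.
    for row in data:
        if row[variable_index].strip() == "":
            row[variable_index] = imputed
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(header)
        writer.writerows(data)
    return imputed
```

With the file names used in this post, a call would look like impute_categorical('weather_orginal.csv', 'weather_imputation_categorical.csv', 0). Using csv.reader also copes with quoted fields that contain commas, which a plain split(",") would break apart.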


Before Imputation:


After Imputation:
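The effect can be seen on a small made-up column. The values below are hypothetical and are not taken from the weather file; an empty string stands for a missing entry, and each one is replaced by the most frequent non-missing value.

```python
from collections import Counter

# Hypothetical sample column; "" marks a missing value.
before = ["sunny", "", "rainy", "sunny", "overcast", ""]

# The mode of the non-missing values is used as the imputed value.
imputed = Counter(v for v in before if v != "").most_common(1)[0][0]
after = [v if v != "" else imputed for v in before]

print("Before:", before)
print("After: ", after)
```

Here "sunny" appears twice and every other value once, so both blanks become "sunny".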




If you have any issues with data imputation, don't worry. Realcode4you.com experts provide complete help to solve your issue at a reasonable price. We are a group of top-rated, experienced experts.


For more details, you can send your requirements to:


realcode4you@gmail.com

