Pandas add an empty row or column to a dataframe with index

Add empty row with or without name:

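# DataFrame.append was removed in pandas 2.0; assigning through .loc adds the row in place
df.loc['NameOfNewRow'] = np.nan # name the new row
df.loc[len(df)] = np.nan # not name the new row (assumes the default integer index)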

Add empty column:

df['new'] = np.nan # every row of the new column is NaN
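
A minimal sketch of adding an empty row and an empty column by plain assignment, which does not rely on DataFrame.append (removed in pandas 2.0). The frame and its labels "r1", "r2", "a" and "b" are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame, used only for this illustration.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}, index=["r1", "r2"])

# Assigning to a row label that does not exist yet enlarges the frame by one row.
df.loc["NameOfNewRow"] = np.nan

# Assigning a scalar to a new column name fills that column for every row.
df["new"] = np.nan

print(df)
```

Both assignments modify df in place, so there is no need to reassign the result.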

Source:
https://stackoverflow.com/questions/39998262/append-an-empty-row-in-dataframe-using-pandas
https://stackoverflow.com/questions/16327055/how-to-add-an-empty-column-to-a-dataframe

Python NumPy replace nan in array to 0 or a number

Replace nan in a NumPy array with zero or any other number:

import numpy as np

a = np.array([1,2,3,4,np.nan])

# copy=True (the default) returns a new array and leaves a untouched; nan becomes 0.0 by default
b = np.nan_to_num(a, copy=True)

# copy=False replaces in place; use nan= if you want another number, e.g. 10
np.nan_to_num(a, copy=False, nan=10)

Replace inf or -inf with the largest or most negative finite floating-point value, or with any number:

a = np.array([1,2,3,4,np.inf])

# by default inf becomes the largest finite floating-point value
b = np.nan_to_num(a, copy=True)

# if you want inf changed to another number, e.g. 10
b = np.nan_to_num(a, copy=True, posinf=10)

# the same goes for -inf, using neginf
b = np.nan_to_num(a, copy=True, posinf=10, neginf=-10)

The nan, posinf and neginf parameters are only available in NumPy 1.17 or later.
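
As a rough end-to-end check of the defaults and the custom replacements, here is a small sketch. The input values and the replacement numbers 10, 99 and -99 are arbitrary choices for illustration:

```python
import numpy as np

a = np.array([1.0, np.nan, np.inf, -np.inf])

# Defaults: nan -> 0.0, +inf -> largest finite float, -inf -> most negative finite float.
default = np.nan_to_num(a)

# Custom replacements (the nan, posinf and neginf keywords need NumPy >= 1.17).
custom = np.nan_to_num(a, nan=10, posinf=99, neginf=-99)

print(default)
print(custom)
```

Because copy defaults to True, a itself still contains nan and inf after both calls.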

Source:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.nan_to_num.html

Pickle UnicodeDecodeError incompatible between python2 and python3

When a pickle file saved with Python 2 is loaded in Python 3, you might get an error like this:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x90 in position X: ordinal not in range(128)

This is caused by an incompatibility between Python 2 and Python 3: Python 2 str objects are byte strings, and Python 3 tries to decode them as ASCII by default. The easiest fix is to add encoding='latin1':

pickle.load(file, encoding='latin1')

There are definitely other ways, but this is the simplest one.
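
To see the error and the fix without a real Python 2 file, the sketch below uses a hand-written byte string that stands in for what Python 2 writes for pickle.dumps('\x90') with protocol 0. The name py2_payload is made up for this example:

```python
import pickle

# Stand-in for a Python 2 protocol-0 pickle of the byte string '\x90'.
py2_payload = b"S'\\x90'\np0\n."

try:
    pickle.loads(py2_payload)  # default encoding is ASCII
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# latin1 maps every byte 0x00-0xFF to a code point, so decoding cannot fail.
value = pickle.loads(py2_payload, encoding="latin1")

print(default_failed, repr(value))
```

With a file on disk the same keyword applies to pickle.load, and the file should be opened in binary mode ('rb').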

Source:
https://stackoverflow.com/questions/11305790/pickle-incompatibility-of-numpy-arrays-between-python-2-and-3

Pandas remove rows or columns with null/nan/missing values

Remove rows with nan/null/missing values:

df = df.dropna(axis=0, how='any') # Remove if any value is na
df = df.dropna(axis=0, how='all') # Remove if all values are na

Remove columns with nan/null/missing values:

df = df.dropna(axis=1, how='any') # Remove if any value is na
df = df.dropna(axis=1, how='all') # Remove if all values are na

The default is inplace=False, which returns a new DataFrame. To modify df directly, add inplace=True; the call then returns None, so do not assign its result.
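
The difference between how='any' and how='all' is easiest to see on a small frame. This is only a sketch; the column names and values are invented:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column "c" is entirely NaN, row 2 is entirely NaN.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 5.0, np.nan],
    "c": [np.nan, np.nan, np.nan],
})

rows_any = df.dropna(axis=0, how="any")  # every row has a NaN in "c", so none survive
rows_all = df.dropna(axis=0, how="all")  # only row 2 is entirely NaN
cols_any = df.dropna(axis=1, how="any")  # every column has at least one NaN
cols_all = df.dropna(axis=1, how="all")  # only "c" is entirely NaN

print(rows_all)
print(cols_all)
```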

Source:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html

Pandas apply function to values in a DataFrame

Apply your function to all values in a dataframe:

df = df+1
df = df.apply(np.sqrt)
df = df.apply(lambda x: np.log2(x+1))
df = df.apply(lambda x: function(x)) # x is a whole column (a Series), so function must accept a Series

Apply function to a column or a row of the dataframe:

df.loc[:,'yourLabel'] = df.loc[:,'yourLabel'].map(lambda x: function(x))
df.loc['yourLabel',:] = df.loc['yourLabel',:].map(lambda x: function(x))

df.loc[:,'yourLabel'] = df.loc[:,'yourLabel'].apply(lambda x: function(x))
df.loc['yourLabel',:] = df.loc['yourLabel',:].apply(lambda x: function(x))
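
A small sketch contrasting the two cases, assuming a made-up frame and a hypothetical add_one_then_log2 helper that plays the role of "function" above:

```python
import numpy as np
import pandas as pd

# Hypothetical frame, used only for this illustration.
df = pd.DataFrame({"x": [0.0, 3.0], "y": [1.0, 7.0]}, index=["r0", "r1"])

def add_one_then_log2(value):
    # Works on a scalar or a whole Series because NumPy ufuncs are vectorized.
    return np.log2(value + 1)

# DataFrame.apply hands each column to the function as a Series.
whole = df.apply(add_one_then_log2)

# Series.map calls the function once per element of a single column.
df.loc[:, "y"] = df.loc[:, "y"].map(add_one_then_log2)

print(whole)
print(df)
```

Here whole is a new frame with every value transformed, while df only has its "y" column replaced.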

Source:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
https://stackoverflow.com/questions/34962104/pandas-how-can-i-use-the-apply-function-for-a-single-column