I am new to Python and I am struggling to code things that seem simple in PHP/SQL. I hope you can help me.
I have two pandas DataFrames, which I have simplified here for clarity.
The first DataFrame, df2015, holds the sales for 2015.
Note that, unfortunately, we do not have values for every store!
>>> df2015
   Store        Date  Sales
0      1  2015-01-15   6553
1      3  2015-01-15   7016
2      6  2015-01-15   8840
3      8  2015-01-15  10441
4      9  2015-01-15   7952
The second DataFrame, df2016, holds the sales forecast for 2016 and lists ALL the stores.
(As you might guess, SalesForecast is the column to fill.)
>>> df2016
   Store        Date SalesForecast
0      1  2016-01-15
1      2  2016-01-15
2      3  2016-01-15
3      4  2016-01-15
4      5  2016-01-15
I want to create a function that, for each row in df2016, retrieves the matching Sales value from df2015, increases it (by 5%, for example), and writes the result to the SalesForecast column of df2016.
Let's say forecast is the function I have created and want to apply:
def forecast(store_id, date):
    sales2015 = df2015['Sales'].loc[(df2015['Store'].values == store_id) & (df2015['Date'].values == date)].values
    forecast2016 = sales2015 * 1.05
    return forecast2016
I have tested this function with hardcoded arguments, as below, and it works:
>>> forecast(1,'2015-01-15')
array([ 6880.65])
But this is where my problem lies: how can I apply this function to the DataFrames?
It would be very easy in PHP: loop over each row in df2016 and retrieve the values (if they exist) from df2015 with a SELECT ... WHERE Store = store_id AND Date = date. The logic does not seem to be the same with pandas DataFrames and Python, though.
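The SELECT ... WHERE lookup described above resembles a SQL left join, which pandas exposes as merge. Below is a minimal sketch of that idea, not a definitive solution. It assumes the Date columns are parsed as datetimes and that each 2016 row should be matched to the same day one year earlier; the names lookup and merged are only illustrative.

```python
import pandas as pd

# Sample data copied from the two DataFrames shown in the question.
df2015 = pd.DataFrame({
    "Store": [1, 3, 6, 8, 9],
    "Date": pd.to_datetime(["2015-01-15"] * 5),
    "Sales": [6553, 7016, 8840, 10441, 7952],
})
df2016 = pd.DataFrame({
    "Store": [1, 2, 3, 4, 5],
    "Date": pd.to_datetime(["2016-01-15"] * 5),
})

# Assumption: a 2016 row corresponds to the same day in 2015,
# so shift the 2015 dates forward by one year before joining.
lookup = df2015.assign(Date=df2015["Date"] + pd.DateOffset(years=1))

# A left join keeps every 2016 row; stores without 2015 sales get NaN.
merged = df2016.merge(lookup, on=["Store", "Date"], how="left")

# Increase the matched 2015 sales by 5%.
df2016["SalesForecast"] = (merged["Sales"] * 1.05).to_numpy()
```

Stores that have no 2015 row (2, 4 and 5 in the sample) end up with NaN in SalesForecast rather than raising an error, which is probably the behaviour wanted when the 2015 data is incomplete.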
I have tried the apply function as follows:
df2016['SalesForecast'] = df2016.apply(df2016['Store'],df2016['Date'])
but I am unable to pass the arguments correctly, or there is something else I am doing wrong.
I think I am not using the right method, or maybe my approach is not suited to pandas and Python at all?
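For comparison, here is a hedged sketch of how a row-wise apply could be made to call a function like forecast. DataFrame.apply with axis=1 passes each row as a Series, so the arguments are read from the row inside a lambda. This version of forecast is a variation on the one above, not the original: it returns a scalar (NaN when the store has no 2015 data), and it assumes datetime Date columns and a one-year shift back to 2015.

```python
import numpy as np
import pandas as pd

# Sample data copied from the two DataFrames shown in the question.
df2015 = pd.DataFrame({
    "Store": [1, 3, 6, 8, 9],
    "Date": pd.to_datetime(["2015-01-15"] * 5),
    "Sales": [6553, 7016, 8840, 10441, 7952],
})
df2016 = pd.DataFrame({
    "Store": [1, 2, 3, 4, 5],
    "Date": pd.to_datetime(["2016-01-15"] * 5),
})

def forecast(store_id, date):
    """Return the 2015 sales for (store_id, date) plus 5%, or NaN if missing."""
    mask = (df2015["Store"] == store_id) & (df2015["Date"] == date)
    sales2015 = df2015.loc[mask, "Sales"]
    if sales2015.empty:
        return np.nan
    return sales2015.iloc[0] * 1.05

# axis=1 hands each row to the lambda; the 2016 date is shifted back
# one year (an assumption) to find the matching 2015 row.
df2016["SalesForecast"] = df2016.apply(
    lambda row: forecast(row["Store"], row["Date"] - pd.DateOffset(years=1)),
    axis=1,
)
```

This mirrors the PHP-style loop most closely, but it scans df2015 once per row, so it is likely to be slow on large frames compared with a join.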
Copyright notice: reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/33834136/python-function-to-add-values-in-a-pandas-dataframe-using-values-from-another-da