2016-09-27 5 views
pandas: counting some values in a column

I have a dataframe and need to count the number of days in each month for every ID:

ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home   
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home   
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home   

This is only part of the dataframe.

If I run df.groupby('ID')['used_at'].count() I get the number of visits. How can I extract the days and count them per month?

Answers


I think you need groupby by ID, month and day, aggregating with size:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month, df.used_at.dt.day]).size()

print (df1) 
ID                                used_at  used_at
574c4969b017ae6481db9a7c77328bc3  5        1          1
                                  6        1          1
                                  7        1          1
                                  8        1          1
                                           2          1
                                           3          1
                                           4          1
e990fae0f48b7daf52619b5ccbec61bc  5        1          2
                                           2          1
                                  6        1          3
dtype: int64
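
Note that the .dt accessor only works on a datetime column. If used_at was read from a CSV file as plain strings, it has to be converted first. A minimal self-contained sketch, assuming a few rows from the question inlined as text (the names csv_text, month and day are only for illustration; the keys are renamed here so the index levels do not both end up called used_at):

```python
import io
import pandas as pd

# A few rows from the question, inlined so the example runs on its own.
csv_text = '''ID,"url","app_name","used_at","active_seconds"
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86
'''

df = pd.read_csv(io.StringIO(csv_text))
# Convert the strings to datetime64 so that .dt.month / .dt.day are available.
df['used_at'] = pd.to_datetime(df['used_at'])

# Give the two derived keys distinct names (illustrative choice).
month = df['used_at'].dt.month.rename('month')
day = df['used_at'].dt.day.rename('day')

# Number of rows per (ID, month, day).
df1 = df['used_at'].groupby([df['ID'], month, day]).size()
print(df1)
```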

Or by date, which is the same as grouping by year, month and day:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size() 

print (df1) 
ID                                used_at
574c4969b017ae6481db9a7c77328bc3  2015-05-01    1
                                  2015-06-01    1
                                  2015-07-01    1
                                  2015-08-01    1
                                  2015-08-02    1
                                  2015-08-03    1
                                  2015-08-04    1
e990fae0f48b7daf52619b5ccbec61bc  2015-05-01    2
                                  2015-05-02    1
                                  2015-06-01    3
dtype: int64
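
Both results above count rows per day. If the goal is instead the number of distinct days with activity in each month for every ID, as the wording of the question suggests, one possible approach is to group by ID and month and count the unique dates with nunique. This is only a sketch under that assumption; the sample frame, the short IDs and the names month, day and days_per_month are illustrative and not part of the original answer:

```python
import pandas as pd

# Small made-up frame with the same two columns as in the question.
df = pd.DataFrame({
    'ID': ['a', 'a', 'a', 'a', 'b', 'b', 'b'],
    'used_at': pd.to_datetime([
        '2015-05-01 09:29:11', '2015-05-01 09:33:00', '2015-05-02 09:38:40',
        '2015-06-01 09:33:07',
        '2015-08-01 12:23:26', '2015-08-02 12:24:52', '2015-08-03 12:24:52',
    ]),
})

# Monthly period (e.g. 2015-05) keeps the year, so the same month
# in different years is not merged into one group.
month = df['used_at'].dt.to_period('M').rename('month')
# Timestamp truncated to midnight, i.e. the calendar day without the time.
day = df['used_at'].dt.normalize()

# Distinct active days per (ID, month).
days_per_month = day.groupby([df['ID'], month]).nunique()
print(days_per_month)
```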

Difference between count and size:

size counts NaN values, count does not.
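
A small sketch of that difference, using a made-up frame in which one used_at value is missing (the data and the names g, sizes and counts are only for illustration):

```python
import pandas as pd

# ID 'a' has one valid timestamp and one missing value (NaT).
df = pd.DataFrame({
    'ID': ['a', 'a', 'b'],
    'used_at': pd.to_datetime(['2015-05-01', None, '2015-06-01']),
})

g = df.groupby('ID')['used_at']
sizes = g.size()    # rows per group, missing values included
counts = g.count()  # non-missing values per group

print(sizes)
print(counts)
```

For ID 'a' the two results differ because of the missing value; for ID 'b' they agree.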