Can you filter pandas DataFrames by day/month date without year? Writing a generalised, year-agnostic "between dates" function is proving difficult.

The issue stems from using year-agnostic dates. If I want to return dates between June 15th and Sep 15th for every year, it seems I need to manually write:

    (df["date_col"].dt.month.eq(6) & df["date_col"].dt.day.ge(15))
    | df["date_col"].dt.month.between(7, 8)
    | (df["date_col"].dt.month.eq(9) & df["date_col"].dt.day.le(15))

This doesn’t generalise well. Writing a function that takes a min and max date has a lot of different edge cases, e.g. the dates are in the same month, the dates are in consecutive months, the dates cross a new year, etc.
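One way to sidestep most of these edge cases, offered as a minimal sketch rather than a definitive answer, is to fold month and day into a single sortable integer (month * 100 + day) and compare that. The function name between_day_month and its (month, day) tuple arguments are hypothetical, made up for illustration; the only pandas features used are the .dt accessors and Series.between.

```python
import pandas as pd


def between_day_month(dates: pd.Series, start: tuple, end: tuple) -> pd.Series:
    """Boolean mask for dates whose (month, day) lies in [start, end], ignoring year.

    start and end are (month, day) tuples; both ends are inclusive.
    """
    # Encode month and day as one sortable integer, e.g. 15 June -> 615.
    key = dates.dt.month * 100 + dates.dt.day
    lo = start[0] * 100 + start[1]
    hi = end[0] * 100 + end[1]
    if lo <= hi:
        # Ordinary window inside one calendar year (same month or not).
        return key.between(lo, hi)
    # Window wraps over New Year, e.g. 15 Dec to 15 Jan.
    return (key >= lo) | (key <= hi)


# Illustrative data only.
df = pd.DataFrame({"date_col": pd.to_datetime([
    "2020-06-14", "2020-06-15", "2021-08-01", "2022-09-15",
    "2022-09-16", "2023-12-31", "2024-01-10",
])})

summer = df[between_day_month(df["date_col"], (6, 15), (9, 15))]
winter = df[between_day_month(df["date_col"], (12, 15), (1, 15))]
```

With this sample data, summer keeps 15 June 2020, 1 August 2021 and 15 September 2022, while winter keeps 31 December 2023 and 10 January 2024. The same-month and consecutive-month cases fall out of the single comparison, and 29 February needs no special handling because it simply maps to 229.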

A big part of the problem is that pd.to_datetime("1506", format="%d%m") returns Timestamp('1900-06-15 00:00:00') and is therefore very much NOT year agnostic.
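A possible workaround, sketched here under assumptions, is to never keep the parsed timestamp at all and reduce the string to a plain (month, day) tuple. The helper name parse_day_month is hypothetical. It appends a fixed leap year (2000) before parsing, because the default year 1900 is not a leap year and "2902" would otherwise be rejected.

```python
from datetime import datetime


def parse_day_month(text: str) -> tuple:
    """Parse a 'DDMM' string into a (month, day) tuple with no year attached."""
    # Append a leap year so that 29 February parses; the year is discarded
    # immediately and never reaches the caller.
    parsed = datetime.strptime(text + "2000", "%d%m%Y")
    return (parsed.month, parsed.day)


start = parse_day_month("1506")  # 15 June
end = parse_day_month("1509")    # 15 September
leap_day = parse_day_month("2902")
```

Tuples compare element by element, so start < end works as a year-agnostic ordering, and start > end signals a window that crosses the new year.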

submitted by /u/Statnamara to r/learnpython
