I’m using the Ruptures package and not getting the changepoints I’d expected. I got this result:
https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2Fonitdxu7ylae1.png
Take a look at 2024-05-26. To a human eye, this is clearly a major changepoint: the line has been in a downward trend and, at that date, suddenly shifts into an upward climb. However, Ruptures is identifying the changepoint as 2024-06-09, two weeks after the obvious change.
What’s going on here? Why is Ruptures picking this date and how can I improve the model?
Here is my code:
import statistics

import numpy as np
import pandas as pd
import ruptures as rpt
import matplotlib.pyplot as plt

# Data import
data = pd.read_csv('my_data.csv')
# Ruptures works on NumPy arrays, so pull out the column being segmented
# (assumed here to be 'effect', the key metric used for the penalty below).
signal = data['effect'].to_numpy()

# Model calibration
algo = rpt.Pelt(model='l2', min_size=20, jump=1).fit(signal)
# Using the variance of the key metric as the penalty.
change_points = algo.predict(pen=statistics.variance(data['effect']))

# List change points (the last entry is the series length, so it is dropped):
print("Change points detected at indices:", data.index[change_points[:-1]])

# Visualize data:
rpt.display(signal, change_points, figsize=(10, 6))
plt.xlabel('Date')
plt.ylabel('Cumulative Effect')
plt.xticks(ticks=range(0, len(data), len(data)//10), labels=data.index[::len(data)//10])
plt.grid(True)
plt.show()
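One possible explanation, sketched on synthetic data rather than the data in the post: the `l2` cost used above scores a segment by how far its values sit from the segment mean, so it looks for shifts in level, not for changes in slope. The snippet below does not call Ruptures at all. It re-implements the single-breakpoint version of that cost in NumPy (the helper `best_l2_split` and the V-shaped test series are made up for illustration) and compares the best split on the raw values with the best split on the first differences.

```python
import numpy as np


def best_l2_split(signal, min_size=2):
    """Return the split index k that minimises the piecewise-constant (l2) cost.

    The cost of a split is the sum of squared deviations of signal[:k] and
    signal[k:] from their own means. Hypothetical helper, brute force only.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    best_index, best_cost = None, np.inf
    for k in range(min_size, n - min_size + 1):
        left, right = signal[:k], signal[k:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index


# Synthetic, noise-free stand-in for the series in the post:
# a steady decline for 80 samples, then a steady climb for 120 samples.
turning_point = 80
t = np.arange(200)
signal = np.abs(t - turning_point).astype(float)

# Best level-shift split on the raw values.
split_on_levels = best_l2_split(signal)

# Best level-shift split on the first differences, where a change in slope
# shows up as a change in mean (-1 before the turn, +1 after it).
split_on_slopes = best_l2_split(np.diff(signal))

print("turning point:", turning_point)
print("l2 split on raw values:", split_on_levels)
print("l2 split on first differences:", split_on_slopes)
```

On this toy series the split on the raw values lands well after the turning point, while the split on the differenced series lands on it. If the real data behaves similarly, fitting the same `rpt.Pelt(model='l2', ...)` on `np.diff(data['effect'])` may be worth trying, keeping in mind that differencing amplifies noise, that each index in the differenced series refers to the step between two consecutive samples, and that the penalty would probably need retuning.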
submitted by /u/takenorinvalid to r/learnpython