【Time Series】【Anomaly Detection】【Twitter】
Twitter Anomaly Detection Tool
Twitter has called upon the software application developer community to help in the global fight against hackers and spammers. The company has released its AnomalyDetection software tool as open source on GitHub.
Twitter hopes that this open release will a) allow the community to learn from the software and b) help evolve the tool further.
Spikes and surges
When Twitter talks about ‘anomalies’ it is referring to spikes and surges of traffic on the network that can be caused by both legitimate and malicious activity.
“Both last year and this year, we saw a spike in the number of photos uploaded to Twitter on Christmas Eve, Christmas and New Year’s Eve (in other words, an anomaly occurred in the corresponding time series),” said Twitter, on the firm’s technical blog.
So while spikes in Christmas photo uploads are a genuine, discrete event for Twitter, similar unusual traffic surges can also be caused by spam bots and hacking activity. With firms increasingly operating big data analytics databases and real-time network and cloud-based services, unwelcome (and unplanned) traffic surges can result in denial of service, website downtime and deeper offline problems.
Machine learning & algorithmic logic
Twitter’s AnomalyDetection is an open-source package for the R statistical computing language, designed to detect anomalies automatically. It is built around algorithmic logic that allows anomalies to be detected in the presence of seasonality and an underlying trend. Closely related to the discipline of machine learning, anomaly detection in this case employs ‘piecewise approximation’, a mathematical technique that lets the software extract the trend from a set of traffic data.
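To make the idea of detecting anomalies "in the presence of seasonality" more concrete, here is a minimal sketch in Python. It is not Twitter's algorithm and not the R package's API. It is a simplified illustration that assumes a fixed seasonal period and no underlying trend, and the function name `detect_anomalies`, its parameters and the synthetic traffic data are all hypothetical. It removes a per-phase median as the seasonal pattern, then flags points whose residual is far from the rest using the median absolute deviation (MAD), a spread estimate that a few spikes barely affect.

```python
import math
from statistics import median


def detect_anomalies(series, period, threshold=3.0):
    """Return indices of points that deviate strongly from the seasonal pattern.

    Hypothetical helper for illustration only; assumes a fixed period and no trend.
    """
    # Seasonal component: median of all values that share the same phase of the cycle.
    seasonal = [median(series[p::period]) for p in range(period)]
    # Residual: what is left once the seasonal pattern is removed.
    residuals = [x - seasonal[i % period] for i, x in enumerate(series)]
    center = median(residuals)
    # Median absolute deviation of the residuals around their median.
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        return []  # no measurable spread, so nothing can be scored
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation for normal data
    return [i for i, r in enumerate(residuals) if abs(r - center) / scale > threshold]


# Synthetic hourly traffic for 14 days: a daily cycle plus small deterministic noise.
traffic = [
    100 + 50 * math.sin(2 * math.pi * (i % 24) / 24) + ((i * 7) % 5 - 2)
    for i in range(14 * 24)
]
traffic[200] += 80  # inject one artificial surge

anomalies = detect_anomalies(traffic, period=24)
print(anomalies)  # prints [200]
```

The daily peaks in this data are larger than the injected surge in absolute terms, yet only the surge is flagged, because each point is compared with what is normal for its hour of the day rather than with the overall average.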
According to Twitter, “Early detection of anomalies plays a key role in ensuring high-fidelity data is available to our own product teams and those of our data partners. This package helps us monitor spikes in user engagement on the platform surrounding holidays, major sporting events or during breaking news. The package can be used to find such bots or spam, as well as detect anomalies in system metrics after a new software release. We’re open-sourcing AnomalyDetection because we’d like the public community to evolve the package and learn from it as we have.”