EBay Shares Its Real-Time Data Tool With The Public

Online auction giant eBay is now using a real-time analytics engine to deal with personalization, fraud detection and other high-traffic functions — and it’s giving the code away, the company said in a blog post on Monday (Feb. 23).

Specifically, eBay is offering an open-source license for its Pulsar “stream processing framework,” which the company says can handle a million events per second, uses a SQL-like event-processing language and can be easily integrated with other data systems such as the Druid real-time analytics database and the Cassandra NoSQL database. The code can be used under either the Apache 2.0 license or the GPL 2.0 (the license used for the Linux kernel).

What exactly do those features mean? “Pulsar can be used to collect and process user and business events in real time, providing key insights and enabling systems to react to user activities within seconds,” according to eBay engineers Sharad Murthy and Tony Ng. That contrasts with the slower results from batch processing on a big-data framework such as Hadoop, which eBay still uses. The real-time Pulsar system is already in production use at eBay and is processing “all user behavior events,” Murthy and Ng wrote.
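
To make the difference between stream and batch processing concrete, here is a minimal Python sketch of the general idea: each event is examined the moment it arrives, and a rolling count per user is kept over the last few seconds. This is not Pulsar's code or its SQL-like language. The names `SlidingWindowCounter` and `flag_bursts`, the 10-second window and the threshold of 3 are all hypothetical choices for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key over the last `window_seconds` seconds (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = defaultdict(deque)  # key -> timestamps still inside the window

    def add(self, key, timestamp):
        """Record one event and return that key's count within the window."""
        timestamps = self.events[key]
        timestamps.append(timestamp)
        # Drop timestamps that have aged out of the window.
        while timestamps and timestamps[0] <= timestamp - self.window_seconds:
            timestamps.popleft()
        return len(timestamps)


def flag_bursts(events, window_seconds=10, threshold=3):
    """Yield (user, timestamp) as soon as a user exceeds `threshold` events in the window."""
    counter = SlidingWindowCounter(window_seconds)
    for user, timestamp in events:
        if counter.add(user, timestamp) > threshold:
            yield user, timestamp


# Illustrative input: (user, timestamp in seconds), already ordered by time.
sample_events = [
    ("alice", 0), ("bob", 1), ("alice", 2),
    ("alice", 3), ("alice", 4), ("alice", 20),
]
print(list(flag_bursts(sample_events)))  # [('alice', 4)]
```

The fourth "alice" event is flagged when it arrives at second 4, without waiting for a later batch job to scan the full log. By second 20 her earlier events have aged out of the window, so nothing more is flagged.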

Google, Microsoft, Twitter and LinkedIn have all developed similar stream-processing systems for real-time analysis, VentureBeat reported. That could help explain why eBay is so eager to give its software away: The code itself isn’t a competitive advantage when competitors all have similar technology. On the other hand, the data and rules used to watch for fraud or generate personalization aren’t likely to become part of the giveaway package.

It’s also not clear how useful the system will be to anyone who doesn’t need to process a pipeline of on-the-fly information about very large numbers of customers or transactions, or which tasks it currently handles best. In their blog post, Murthy and Ng described some areas where the system might be used, including real-time reporting, business activity monitoring, marketing and advertising, along with fraud detection and personalization. But they were light on specifics of exactly what Pulsar is being used for inside eBay.

Still, given that tasks like payments fraud detection increasingly depend on scanning every transaction in real time, some organizations may find Pulsar worth a hard look. EBay has put up a Pulsar website with documentation and a FAQ. The open-sourced Pulsar code is at the Pulsar GitHub site.