Open source tool flags malicious domains with Support Vector Machine algorithm

Sharkcop anti-phishing tool detects malicious URL domains

A Google Chrome browser extension that identifies suspected phishing URLs with a machine learning algorithm was unveiled at Black Hat Asia last week.

Developed by a team of Vietnamese undergraduates, Sharkcop distinguishes between phishing URLs and legitimate domains based on analysis of SSL certificates, URL length, domain age, and the number of redirections.

These variables are evaluated by a Support Vector Machine (SVM), a comparatively simple machine learning algorithm that can perform both regression and classification tasks with impressive accuracy.
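
As a rough illustration of how signals like these could feed a linear SVM, the sketch below trains a tiny classifier by stochastic sub-gradient descent on the hinge loss, using only the Python standard library. The training rows, scaling factors, and function names are all made up for the example and are not taken from Sharkcop's code, which may well rely on a library implementation and a different kernel.

```python
import random


def to_features(ssl_valid, url_length, domain_age_years, redirects):
    """Scale the four signals named in the article to roughly [0, 1.5]."""
    return [
        1.0 if ssl_valid else 0.0,
        url_length / 100.0,
        domain_age_years / 10.0,
        redirects / 5.0,
    ]


# Made-up training rows (not real PhishTank data).
# Label +1 = phishing, -1 = legitimate.
TRAIN = [
    (to_features(True, 25, 8.0, 0), -1),
    (to_features(True, 30, 12.0, 1), -1),
    (to_features(True, 18, 5.0, 0), -1),
    (to_features(True, 40, 3.0, 1), -1),
    (to_features(False, 110, 0.1, 3), 1),
    (to_features(False, 95, 0.0, 4), 1),
    (to_features(True, 130, 0.2, 2), 1),
    (to_features(False, 80, 0.3, 5), 1),
]


def train_linear_svm(rows, epochs=500, lr=0.05, lam=0.001, seed=0):
    """Stochastic sub-gradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(rows[0][0])
    b = 0.0
    for _ in range(epochs):
        order = list(rows)
        rng.shuffle(order)
        for x, y in order:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin: step toward it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(model, x):
    """Return +1 (phishing) or -1 (legitimate) from the sign of the score."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1
```

Calling `predict(train_linear_svm(TRAIN), to_features(False, 95, 0.2, 4))` classifies a long, HTTPS-less URL on a weeks-old domain with four redirects. A production system would need far more data and a held-out test set before any accuracy claim could be made.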

A Flask web server interacts with a RESTful API to retrieve a URL’s information, and with a Chrome extension to flag malicious links.
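
To make the server side of that arrangement concrete, here is a minimal sketch of a JSON endpoint that a browser extension could query. It uses the standard library's WSGI interface rather than Flask so that it runs without extra packages. The `/check` path, the response fields, and the placeholder `score_url` rule are hypothetical and do not describe Sharkcop's actual API.

```python
import json
from urllib.parse import parse_qs


def score_url(url):
    """Hypothetical stand-in for a trained model: True means 'suspicious'."""
    # Placeholder rule so the sketch runs; a real service would call
    # its classifier here instead.
    return not url.startswith("https://") or len(url) > 75


def app(environ, start_response):
    """WSGI app answering GET /check?url=<URL> with a JSON verdict."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    urls = query.get("url", [])
    if environ.get("PATH_INFO") != "/check" or not urls:
        status = "400 Bad Request"
        body = {"error": "expected /check?url=..."}
    else:
        status = "200 OK"
        body = {"url": urls[0], "phishing": score_url(urls[0])}
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

To try it locally, `app` could be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()`. An extension would then request `/check?url=...` for each link and flag those where `phishing` is true.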


“We crawl the malicious URLs from PhishTank.com, and the normal URLs from the world wide web,” Hoang Tung Cao, one of the tool’s developers, told The Daily Swig in advance of his Arsenal presentation at Black Hat Asia.

The open source tool has also been incorporated into the Vietnam Authority of Information Security’s website as an API service. Anyone who wants to check whether a link is likely to be malicious or not can simply paste the URL into the tool and click a ‘check’ button.

Tung Cao said the Python-based application builds on previous research into SVM-based URL detection techniques by Indian and Chinese academics.

‘Simply as possible’

The application first emerged from the Hacksmith v3 Hackathon event that took place in Vietnam in 2019. Tung Cao and fellow students from the Hanoi University of Science & Technology won first prize and the chance to pitch the tool to Black Hat’s Arsenal board.

“As university students, having some basic knowledge, passion about machine learning and security, yet little to no real experience in the field, we tried to tackle the problem as [simply] as possible,” he said.

Tung Cao said the team were motivated to develop the tool because phishing attacks were so rampant.

“As 70% of breaches associated with a nation-state or state-affiliated actors involved phishing, according to the Verizon Data Breach Investigations Report (DBIR), human-factor is the key factor in many cyber-attacks,” he said.

Project relaunch

The team “had completed a bank logo detection feature” as part of an effort to optimize the tool for specific sectors before the Covid-19 pandemic brought the project to a halt.

“However, our team is planning to relaunch the project,” said Tung Cao.

“No detailed schedule has been made yet, but we hope the full version can be on air in 2021.”

The Black Hat presentation would, he hoped, serve to entice other developers to “join our team to improve Sharkcop”.

Tung Cao’s co-presenters were Anh Nguyen Ngoc and Aiden Pearce.

