dimitris kalamaras

math, social network analysis, web dev, free software…

SocNetV v1.6 released – a nice web crawler included!

The Social Network Visualizer project has just released version 1.6. This release brings back the web crawler feature, which had been disabled throughout the 1.x series, in a much improved form.

To start the web crawler, go to menu Network > Web Crawler or press Shift+C…

A dialog will appear, where you must enter the initial web page (the seed). You may also set the maximum number of nodes/pages (default 600) and which kind of links to crawl: internal, external or both. By default the spider will crawl both internal and external links.
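To make the internal/external distinction concrete, here is a minimal Python sketch. It is not SocNetV's own code: the function names, the `mode` parameter and the rule "same host as the seed means internal" are all assumptions made for illustration.

```python
from urllib.parse import urljoin, urlparse


def classify_link(seed_url, href):
    """Resolve href against the seed and label it 'internal' or 'external'.

    Assumption: a link is internal when it points to the same host as the seed.
    """
    absolute = urljoin(seed_url, href)
    same_host = urlparse(absolute).netloc == urlparse(seed_url).netloc
    return ("internal" if same_host else "external"), absolute


def should_crawl(seed_url, href, mode="both"):
    """mode is 'internal', 'external' or 'both' (hypothetical setting names)."""
    kind, _ = classify_link(seed_url, href)
    return mode == "both" or mode == kind
```

With a seed of `http://example.org/`, a relative link such as `/about` would count as internal, while `http://other.net/x` would count as external.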

The new web crawler is vastly improved over the one in the 0.x releases and consists of two parts, a ‘spider’ and a ‘parser’, each running in its own thread.

The spider visits a given initial URL (i.e. a website or a single webpage) and downloads its HTML code. The parser scans the downloaded code for ‘href’ links to other pages (internal or external) and adds them to a queue of URLs (called frontier).

As URLs are added to the queue, the spider visits them and downloads their HTML, which the parser scans for more links, and so on…

The process is multithreaded and completes in a matter of seconds, even for 1000 URLs.
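The spider/parser loop described above can be sketched roughly as follows. This is a simplified Python illustration, not the actual SocNetV implementation (which is part of the application itself). The `crawl` function, the `fetch` callable and the in-memory `FAKE_WEB` dictionary are hypothetical stand-ins, so the example runs without network access.

```python
import queue
import threading
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(seed, fetch, max_nodes=600):
    """Crawl from seed; fetch(url) is a hypothetical callable returning HTML."""
    frontier = queue.Queue()    # URLs waiting for the spider
    downloaded = queue.Queue()  # (url, html) pairs waiting for the parser
    seen, edges = {seed}, set()
    done = threading.Event()

    def spider():
        while True:
            url = frontier.get()
            if url is None:           # sentinel: tell the parser to stop too
                downloaded.put(None)
                return
            downloaded.put((url, fetch(url)))

    def parser():
        pending = 1                   # URLs queued but not yet parsed
        while True:
            item = downloaded.get()
            if item is None:
                return
            url, html = item
            collector = HrefCollector()
            collector.feed(html)
            for href in collector.hrefs:
                target = urljoin(url, href)
                if target not in seen:
                    if len(seen) >= max_nodes:
                        continue      # node limit reached: skip new pages
                    seen.add(target)
                    pending += 1
                    frontier.put(target)
                edges.add((url, target))
            pending -= 1
            if pending == 0:          # nothing left anywhere: crawl finished
                done.set()

    threads = [threading.Thread(target=spider), threading.Thread(target=parser)]
    for t in threads:
        t.start()
    frontier.put(seed)
    done.wait()
    frontier.put(None)
    for t in threads:
        t.join()
    return seen, edges


# A tiny fake "web" so the sketch needs no network access.
FAKE_WEB = {
    "http://a.test/": '<a href="/b">b</a><a href="http://c.test/">c</a>',
    "http://a.test/b": '<a href="/">home</a>',
    "http://c.test/": "",
}
nodes, links = crawl("http://a.test/", lambda url: FAKE_WEB.get(url, ""))
```

The two queues decouple downloading from parsing, so the spider can fetch the next page while the parser is still scanning the previous one.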

The end result is the ‘network’ of all visited webpages as nodes, with the actual links between them as edges. To help you spot patterns right away, node sizes reflect outDegree by default.
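As a rough illustration of that default view, the Python sketch below computes each node's outDegree from a list of edges and maps it to a display size. The linear scaling and the `min_size`/`max_size` values are assumptions chosen for the example; SocNetV's actual sizing rule may differ.

```python
from collections import Counter


def out_degrees(edges):
    """Count outgoing links per node; pages with no outgoing links get 0."""
    degrees = Counter()
    for source, target in edges:
        degrees[source] += 1
        degrees.setdefault(target, 0)
    return dict(degrees)


def node_sizes(degrees, min_size=8, max_size=40):
    """Scale sizes linearly between min_size and max_size (assumed rule)."""
    top = max(degrees.values(), default=0)
    if top == 0:
        return {node: min_size for node in degrees}
    span = max_size - min_size
    return {node: min_size + span * d / top for node, d in degrees.items()}


example_edges = [("a", "b"), ("a", "c"), ("b", "a")]
example_degrees = out_degrees(example_edges)
example_sizes = node_sizes(example_degrees)
```

Here page `a` links to two pages, so it gets the largest size, while `c` has no outgoing links and stays at the minimum.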

From there, you can analyze the network using the SNA tools provided by SocNetV.

Please note that the parser searches for ‘href’ links only in the body section of the HTML code.
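The effect of this restriction can be shown with a small Python sketch built on the standard `html.parser` module. It is an illustration of the idea only, not SocNetV's parser; the class name `BodyHrefParser` is made up for the example, and it collects only `<a href>` links.

```python
from html.parser import HTMLParser


class BodyHrefParser(HTMLParser):
    """Collects <a href> values, but only while inside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag == "a" and self.in_body:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False


page = (
    '<html><head><link href="style.css"><a href="/ignored">x</a></head>'
    '<body><p><a href="/kept">y</a></p></body></html>'
)
body_parser = BodyHrefParser()
body_parser.feed(page)
body_links = body_parser.hrefs
```

In this example the stylesheet reference and the stray link in the head section are skipped, and only `/kept` is collected.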

Binaries for Windows, Mac OS X and Linux are available from SocNetV’s Downloads area.

3 Comments

  1. Noor

    Hello,

    Thank you for this great tool! I would like to know how we can preview the actual links and not just the node numbers. I want the links to appear visually on the graph; is that possible?

    Best regards.

  2. Hello Noor,

    Are you referring to the web crawler? If so, every web page link (a href) found on each visited page appears on the graph as an “edge” connecting the respective node with other nodes (which depict those other web pages).

  3. muhtadin

    Hello, I have a final project (an undergraduate thesis) using SNA as the basic analysis. May I have your number to ask more? I have a problem with crawling data.
