Wild Spider

Web pages are crawled by loading them in the browser across multiple tabs in parallel.

What is Wild Spider?

Wild Spider is a Chrome extension developed by Xuan Wu. Its main feature is crawling web pages by loading them in the browser across multiple tabs in parallel.

Download Wild Spider Extension CRX File

Download the Wild Spider extension as a CRX file to install it manually in Chrome, or share the CRX file with others so they can install it easily.

Extension Usage Instructions

WATCH OUT: the more tabs you use, the more computer resources (CPU, memory) are consumed, and each crawled page takes a little disk space to store its content (in IndexedDB, accessible from Extensions -> Inspect views: background page).

The "spider" works as follows:
1) The current URL is used as the starting point, and it is loaded again in a new tab.
2) After the page has loaded, fetch all the links on it.
3) Collect every link found on the page, including relative URLs.
4) Open the extracted links in parallel across all the tabs in use (3 by default, set in eventPage.js).
5) Repeat steps 2-4.
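
The steps above can be sketched as a small scheduler. This is a minimal sketch under stated assumptions, not the extension's actual code: `loadPage` is a hypothetical stand-in for "load this URL in a tab and return the href values found on it", `maxPages` is an assumed safety limit that the description does not mention, and dropping fragments and non-HTTP links is a choice made only for this sketch.

```javascript
// Step 3: turn raw href values (absolute or relative) into absolute URLs.
function resolveLinks(pageUrl, hrefs) {
  const resolved = [];
  for (const href of hrefs) {
    try {
      const url = new URL(href, pageUrl); // resolves relative hrefs against the page
      url.hash = ""; // assumption: a fragment points into an already known page
      if (url.protocol === "http:" || url.protocol === "https:") {
        resolved.push(url.href);
      }
    } catch (e) {
      // Skip hrefs that cannot be parsed as URLs.
    }
  }
  return resolved;
}

// Steps 1-5: crawl from startUrl with at most maxTabs pages loading at once.
// loadPage(url) is hypothetical: it must return (a promise of) an array of hrefs.
function crawl(startUrl, loadPage, maxTabs = 3, maxPages = 50) {
  const seen = new Set([startUrl]); // URLs already queued or loaded
  const queue = [startUrl];         // URLs waiting for a free "tab"
  const visited = [];               // URLs whose load has finished
  let active = 0;                   // number of "tabs" currently loading

  return new Promise((resolve) => {
    const pump = () => {
      // Fill every free tab while work remains and the page limit allows it.
      while (active < maxTabs && queue.length > 0 &&
             visited.length + active < maxPages) {
        const url = queue.shift();
        active += 1;
        Promise.resolve()
          .then(() => loadPage(url))
          .then((hrefs) => {
            for (const link of resolveLinks(url, hrefs)) {
              if (!seen.has(link)) {
                seen.add(link);
                queue.push(link);
              }
            }
          })
          .catch(() => {}) // a page that fails to load contributes no links
          .then(() => {
            visited.push(url);
            active -= 1;
            pump(); // step 5: repeat with the newly freed tab
          });
      }
      if (active === 0) resolve(visited); // nothing loading, nothing startable
    };
    pump();
  });
}
```

In a real extension, `loadPage` would presumably open or reuse a tab and wait for the content script to report the page's links; here it is left as a parameter so the scheduling logic can be read, and run, on its own.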

All source code is available at: https://github.com/nobodxbodon/ChromeCrawlerWildSpider

Extension Basic Information

Name: Wild Spider
ID: aanpchnfojihjddlocpgoekffmjkhbbe
Official URL: https://chromewebstore.google.com/detail/wild-spider/aanpchnfojihjddlocpgoekffmjkhbbe
Description: web pages are crawled by being loaded into browser using multiple tabs parallelly
File Size: 121 KB
Installation Count: 44
Current Version: 0.0.3
Last Updated: 2019-03-08
Publish Date: 2019-03-08
Rating: 1.00/5 (1 rating in total)
Developer: Xuan Wu
Payment Type: Free
Extension Website: https://github.com/nobodxbodon/ChromeCrawlerWildSpider
Help Page URL: https://github.com/nobodxbodon/ChromeCrawlerWildSpider/issues
Supported Languages: en-US
manifest.json
{
    "update_url": "https://clients2.google.com/service/update2/crx",
    "name": "Wild Spider",
    "short_name": "demo web crawler that's still in experimenting",
    "description": "web pages are crawled by being loaded into browser using multiple tabs parallelly",
    "version": "0.0.3",
    "browser_action": {
        "default_icon": "icon.png"
    },
    "permissions": [
        "tabs",
        "activeTab",
        "webNavigation"
    ],
    "background": {
        "scripts": [
            "Dexie.js",
            "eventPage.js"
        ],
        "persistent": false
    },
    "content_scripts": [
        {
            "matches": [
                "*://*/*"
            ],
            "js": [
                "htmlparser2.js",
                "content.js"
            ]
        }
    ],
    "manifest_version": 2
}
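
The manifest declares a content script that is injected into every page and a non-persistent background page (eventPage.js). One common way for two such parts to cooperate is message passing, sketched below. This is an assumption about the wiring, not a description of the extension's actual code: the function names `reportLinks` and `listenForLinks` and the `{ type: "links", hrefs }` message shape are hypothetical, and `chromeApi` is passed in as a parameter (in an extension it would be the global `chrome` object) so the sketch can be exercised outside a browser.

```javascript
// Content-script side (hypothetical): report the hrefs found on the current page.
function reportLinks(chromeApi, hrefs) {
  chromeApi.runtime.sendMessage({ type: "links", hrefs: hrefs });
}

// Background (event page) side (hypothetical): receive the reports.
// onLinks is called with the id of the sending tab, if any, and the hrefs.
function listenForLinks(chromeApi, onLinks) {
  chromeApi.runtime.onMessage.addListener((message, sender) => {
    if (message && message.type === "links") {
      const tabId = sender && sender.tab ? sender.tab.id : undefined;
      onLinks(tabId, message.hrefs);
    }
  });
}
```

With this split, the background page can keep the queue of pending URLs and decide which tab loads which link, while the content script only reads the page it was injected into.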