Wild Spider

Crawls web pages by loading them in the browser, using multiple tabs in parallel.

What is Wild Spider?

Wild Spider is a Chrome extension developed by Xuan Wu. Its store description reads: "web pages are crawled by being loaded into browser using multiple tabs parallelly".

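Download the Wild Spider CRX file

Download the Wild Spider extension as a .crx file to install it in Chrome manually. The .crx file can also be shared with others so they can install the extension the same way.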

Usage instructions

WATCH OUT: the more tabs you use, the more computer resources (CPU, memory) are consumed, and each crawled page takes a little disk space to store its content (in IndexedDB, accessible from Extensions -> Inspect views: background page).

The "spider" works as follows:
1) The current URL is used as the starting point, and it is loaded again in a new tab.
2) After this page has loaded, fetch all the links on the page.
3) Collect all the links on the page, including relative URLs.
4) Open the extracted links in parallel across all the tabs in use (3 by default, set in eventPage).
5) Repeat steps 2-4.
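
The steps above can be sketched outside the browser as a small worker pool. This is only an illustration under stated assumptions, not the extension's actual code: the names extractLinks, crawl, loadPage, and maxPages are hypothetical, a regular expression stands in for the htmlparser2 parser that the extension bundles, and a caller-supplied async function stands in for loading a page in a tab.

```javascript
// Sketch only: extractLinks, crawl, loadPage and maxPages are hypothetical
// names, not taken from the extension's source.

// Steps 2-3: collect href values and resolve relative URLs against the page URL.
// The regex is a simplification; a real HTML parser is more robust.
function extractLinks(html, pageUrl) {
  const links = new Set();
  const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    try {
      const url = new URL(match[1], pageUrl); // resolves relative URLs
      if (url.protocol === "http:" || url.protocol === "https:") {
        url.hash = ""; // a fragment does not identify a different page
        links.add(url.href);
      }
    } catch (e) {
      // Ignore hrefs that cannot be parsed as URLs.
    }
  }
  return [...links];
}

// Steps 4-5: at most `tabs` pages are loaded at once, mirroring the tab limit.
// loadPage(url) must return a promise for the page's HTML.
function crawl(startUrl, loadPage, { tabs = 3, maxPages = 100 } = {}) {
  const start = new URL(startUrl).href; // normalize, e.g. add a trailing slash
  const seen = new Set([start]);        // URLs already queued or visited
  const queue = [start];
  const pages = new Map();              // url -> html
  let active = 0;                       // loads currently in flight

  return new Promise((resolve) => {
    const next = () => {
      // Done when nothing is queued and nothing is loading.
      if (queue.length === 0 && active === 0) {
        resolve(pages);
        return;
      }
      while (active < tabs && queue.length > 0) {
        const url = queue.shift();
        active++;
        Promise.resolve()
          .then(() => loadPage(url))
          .then((html) => {
            pages.set(url, html);
            for (const link of extractLinks(html, url)) {
              if (!seen.has(link) && seen.size < maxPages) {
                seen.add(link);
                queue.push(link);
              }
            }
          })
          .catch(() => { /* skip pages that fail to load */ })
          .finally(() => { active--; next(); });
      }
    };
    next();
  });
}
```

The maxPages cap is an addition for this sketch so that the loop terminates. The description above does not mention any such limit, which is one more reason to heed the resource warning.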

All source code is available at: https://github.com/nobodxbodon/ChromeCrawlerWildSpider

Basic extension information

Name: Wild Spider
ID: aanpchnfojihjddlocpgoekffmjkhbbe
Official URL: https://chromewebstore.google.com/detail/wild-spider/aanpchnfojihjddlocpgoekffmjkhbbe
Summary: web pages are crawled by being loaded into browser using multiple tabs parallelly
File size: 121 KB
Installs: 44
Current version: 0.0.3
Last updated: 2019-03-08
Published: 2019-03-08
Rating: 1.00/5 (1 rating)
Developer: Xuan Wu
Pricing: free
Extension website: https://github.com/nobodxbodon/ChromeCrawlerWildSpider
Support page URL: https://github.com/nobodxbodon/ChromeCrawlerWildSpider/issues
Supported languages: en-US
manifest.json
{
    "update_url": "https:\/\/clients2.google.com\/service\/update2\/crx",
    "name": "Wild Spider",
    "short_name": "demo web crawler that's still in experimenting",
    "description": "web pages are crawled by being loaded into browser using multiple tabs parallelly",
    "version": "0.0.3",
    "browser_action": {
        "default_icon": "icon.png"
    },
    "permissions": [
        "tabs",
        "activeTab",
        "webNavigation"
    ],
    "background": {
        "scripts": [
            "Dexie.js",
            "eventPage.js"
        ],
        "persistent": false
    },
    "content_scripts": [
        {
            "matches": [
                "*:\/\/*\/*"
            ],
            "js": [
                "htmlparser2.js",
                "content.js"
            ]
        }
    ],
    "manifest_version": 2
}
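
To show how the manifest entries above could fit together, here is a hedged sketch of event-page wiring. It is not the contents of eventPage.js: the names startCrawlOnClick and onPageLoaded are hypothetical, and the Chrome API object is passed in as a parameter (an assumption made so the wiring can be exercised outside a browser). The calls it relies on are the Manifest V2 APIs chrome.browserAction.onClicked, chrome.tabs.create, and chrome.webNavigation.onCompleted, which correspond to the "browser_action" entry and the "tabs" and "webNavigation" permissions.

```javascript
// Sketch only: startCrawlOnClick and onPageLoaded are hypothetical names.
// In an extension, chromeApi would be the global `chrome` object.
function startCrawlOnClick(chromeApi, onPageLoaded) {
  const crawlerTabs = new Set(); // ids of tabs opened by the crawler

  // browser_action declares no popup, so clicking the icon fires onClicked.
  chromeApi.browserAction.onClicked.addListener((currentTab) => {
    // Load the current URL again in a new background tab.
    chromeApi.tabs.create({ url: currentTab.url, active: false }, (newTab) => {
      crawlerTabs.add(newTab.id);
    });
  });

  // webNavigation reports when a document has finished loading.
  chromeApi.webNavigation.onCompleted.addListener((details) => {
    // frameId 0 is the top-level frame; ignore iframes and unrelated tabs.
    if (details.frameId === 0 && crawlerTabs.has(details.tabId)) {
      onPageLoaded(details.tabId, details.url);
    }
  });

  return crawlerTabs;
}
```

In a real event page the call would look like startCrawlOnClick(chrome, handler), where handler would typically ask the content script in that tab for the page's links.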