Wild Spider

Web pages are crawled by loading them into the browser in multiple tabs in parallel.

What is Wild Spider?

Wild Spider is a Chrome extension developed by Xuan Wu. Its store description reads: "web pages are crawled by being loaded into browser using multiple tabs parallelly".

Extension screenshots

screenshot

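Download the Wild Spider extension CRX file

Download the Wild Spider extension as a CRX file and install it in the browser manually. You can also share the CRX file with friends so they can install the extension easily.

Extension usage instructions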

WATCH OUT: the more tabs you use, the more computer resources (CPU, memory) are consumed, and each page takes a bit of disk space to save its content (in IndexedDB, accessible from Extensions -> Inspect views: background page).

The "spider" works in this way:
1) The current URL is used as the starting point and is loaded again in a new tab.
2) After the page has loaded, fetch all the links on it.
3) Collect every link on the page, including relative URLs.
4) Open the extracted links in parallel across all the tabs in use (3 by default, set in eventPage).
5) Repeat steps 2-4.
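
The steps above can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not the extension's actual code: `extractLinks`, `crawl`, `loadPage`, and `maxPages` are hypothetical names, and `loadPage` stands in for "load the URL in a tab and return the raw href values found on it". The http(s) filter, fragment stripping, and the `seen` set are also assumptions, added so the loop terminates. Only the default of 3 parallel tabs comes from the description.

```javascript
// Step 3: resolve every href (including relative ones) against the page URL.
// Assumption: only http(s) links are kept and fragments are ignored.
function extractLinks(pageUrl, hrefs) {
  const links = new Set();
  for (const href of hrefs) {
    try {
      const resolved = new URL(href, pageUrl); // resolves relative URLs
      if (resolved.protocol === 'http:' || resolved.protocol === 'https:') {
        resolved.hash = ''; // "#top" should not count as a new page
        links.add(resolved.href);
      }
    } catch (e) {
      // Skip hrefs that cannot be parsed as URLs.
    }
  }
  return [...links];
}

// Steps 1, 2, 4, 5: start from one URL and load up to `tabCount` pages at a time.
// `loadPage(url)` is a hypothetical async function returning the hrefs on the page.
// `maxPages` is an assumed safety limit, not something the extension documents.
async function crawl(startUrl, loadPage, { tabCount = 3, maxPages = 50 } = {}) {
  const seen = new Set([startUrl]); // URLs already queued or visited
  const queue = [startUrl];         // URLs waiting for a free "tab"
  const visited = [];

  while (queue.length > 0 && visited.length < maxPages) {
    const room = maxPages - visited.length;
    const batch = queue.splice(0, Math.min(tabCount, room));
    // One "tab" per URL in the batch, all loading in parallel.
    const results = await Promise.all(
      batch.map(async (url) => extractLinks(url, await loadPage(url)))
    );
    visited.push(...batch);
    for (const link of results.flat()) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return visited;
}

// Usage with an in-memory stand-in for real pages.
const fakeSite = {
  'https://example.com/': ['/a', 'b.html', '#top', 'mailto:x@example.com'],
  'https://example.com/a': ['/', '/c'],
  'https://example.com/b.html': ['/a'],
  'https://example.com/c': [],
};
crawl('https://example.com/', async (url) => fakeSite[url] || [])
  .then((pages) => console.log(pages));
```

Processing the queue in batches keeps the sketch short. A real multi-tab crawler would more likely refill each tab as soon as it finishes rather than wait for the whole batch.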

All source code is available at: https://github.com/nobodxbodon/ChromeCrawlerWildSpider

Basic extension information

Name: Wild Spider
ID: aanpchnfojihjddlocpgoekffmjkhbbe
Official URL: https://chromewebstore.google.com/detail/wild-spider/aanpchnfojihjddlocpgoekffmjkhbbe
Description: web pages are crawled by being loaded into browser using multiple tabs parallelly
File size: 121 KB
Installs: 44
Current version: 0.0.3
Last updated: 2019-03-08
Listed on: 2019-03-08
Rating: 1.00/5 (1 rating)
Developer: Xuan Wu
Pricing: free
Extension website: https://github.com/nobodxbodon/ChromeCrawlerWildSpider
Help page URL: https://github.com/nobodxbodon/ChromeCrawlerWildSpider/issues
Supported languages: en-US
manifest.json
{
    "update_url": "https:\/\/clients2.google.com\/service\/update2\/crx",
    "name": "Wild Spider",
    "short_name": "demo web crawler that's still in experimenting",
    "description": "web pages are crawled by being loaded into browser using multiple tabs parallelly",
    "version": "0.0.3",
    "browser_action": {
        "default_icon": "icon.png"
    },
    "permissions": [
        "tabs",
        "activeTab",
        "webNavigation"
    ],
    "background": {
        "scripts": [
            "Dexie.js",
            "eventPage.js"
        ],
        "persistent": false
    },
    "content_scripts": [
        {
            "matches": [
                "*:\/\/*\/*"
            ],
            "js": [
                "htmlparser2.js",
                "content.js"
            ]
        }
    ],
    "manifest_version": 2
}