企业🤖AI Agent构建引擎,智能编排和调试,一键部署,支持私有化部署方案 广告
# 快速开始 * * * * * 创建一个文件 start.php,包含以下内容 ~~~ <?php require_once __DIR__ . '/vendor/autoload.php'; use PHPCrawler\Crawler; $config = array( 'name' => 'meiriyiwen', 'url' => 'http://w.ihx.cc/', 'method' => 'depth', 'queue' => 'redis', 'db_config' => array( 'dbhost' => '127.0.0.1', 'port' => 3306, 'dbuser' => 'root', 'dbpsw' => 'root', 'dbname' => 'discussion' ), 'table' => array( array( 'column' => 'title', 'type' => 0 ),array( 'column' => 'author', 'type' => 0 ),array( 'column' => 'time', 'type' => 0 ), ), 'urlSelector' => array( array( 'selector' => '/http:\/\/w.ihx.cc\/.{1,20}\/\d+.html/ism', 'parser'=>'Reg', 'childer' => array( array( 'selector' => '//*[@id="the-post"]/div[2]/h1/span', 'repeat'=>true, 'column'=>'title', ),array( 'selector' => '//*[@id="the-post"]/div[2]/p/span[1]/a', 'repeat'=>true, 'column'=>'author', ),array( 'selector' => '//*[@id="the-post"]/div[2]/p/span[3]/text()', 'repeat'=>true, 'column'=>'time', ), ) ) ) ); $crawler = new Crawler($config); $crawler->run(); ?> ~~~ 在命令行中执行 ~~~ $ php start.php ~~~ 接下来就可以看到抓取的日志了。![](https://box.kancloud.cn/4451036561ee0c17b4b90188f8ee90b7_1011x612.png) // 下面为守护进程的界面(ubuntu 16.04) // ~~~ // $ php start.php start // ~~~ ### 更多栗子在demo文件夹中将它们复制到vendor外层直接运行即可。