取得页面所有不重复连接的函数_PHP

取得页面所有不重复连接的函数

发表于：2007-07-01来源：作者：点击数：标签：

由于需要我做了个函数，实现取得页面连接放到数组里思路：1，取得静态+不带参数的： htm html asp php jsp cgi a，包含绝对路径的处理：直接取得 preg_match_all （）？ b，包含相对路径的，路径得到处理参数: 根据情况（. 或 .. ）处理得到绝对路径 2，取

　　由于需要我做了个函数，实现取得页面连接放到数组里
　　思路：1，取得静态+不带参数的：
　　htm html asp php jsp cgi
　　a，包含绝对路径的处理：直接取得
　　preg_match_all （）？
　　b，包含相对路径的，路径得到处理参数:
　　根据情况（. 或 .. ）处理得到绝对路径
　　2，取得带参数的：
　　3,经过筛选：选择了一些可以读的后缀比如asp，php，html等
　　连接重复的进行删除。
　　4，直接运行代码就把落伍者论坛 » 网站建设专栏
　　第一页面的连接拿下来到数组＄e中，＄e[o][0]为第一个连接；＄e[o][1]为第2个
　　
　　

Code:

<?
　　＄e=clinchgeturl("http://im286.com/forumdisplay.php?fid=1");
　　
　　var_dump(＄e);
　　function clinchgeturl(＄url)
　　{
　　
　　//＄url="http://127.0.0.1/1.htm";
　　//＄rootpath="http://fsrootpathfsfsf/yyyyyy/";
　　//var_dump(＄rrr);
　　if(eregi(@#(.)*[\.](.)*@#,＄url)){
　　＄roopath=split("\/",＄url);
　　＄rootpath="http://".＄roopath[2]."/";
　　＄nnn=count(＄roopath)-1;for(＄yu=3;＄yu<＄nnn;＄yu++){＄rootpath.=＄roopath[＄yu]."/";}
　　 // var_dump(＄rootpath); //http: ,@#@#,127.0.0.1,xnml,index.php
　　 }
　　 else{＄rootpath=＄url;//var_dump(＄rootpath);
　　}
　　if(isset(＄url)){
　　echo "＄url 有下列裢接：<br>";
　　＄fcontents = file(＄url);
　　while(list(,＄line)=each(＄fcontents)){
　　while(eregi(@#(href[[:space:]]*=[[:space:]]*"?[[:alnum:]:@/._-]+[\?]?[^\"]*"?)@#,＄line,＄regs)){
　　//＄regs[1] = eregi_replace(@#(href[[:space:]]*=[[:space:]]*\"?)([[:alnum:]:@/._-]+)(\"?)@#,"\\2",＄regs[1]);
　　＄regs[1] = eregi_replace(@#(href[[:space:]]*=[[:space:]]*[\"]?)([[:alnum:]:@/._-]+[\?]?[^\"]*)(\.*)[^\"\/]*([\"]?)@#,"\\2",＄regs[1]);
　　
　　if(!eregi(@#^http:\/\/@#,＄regs[1])){
　　
　　 if(eregi(@#^\.\.@#,＄regs[1])){
　　 // ＄roopath=eregi_replace(@#(http:\/\/)?([[:alnum:]:@/._-]+)[[:alnum:]+](\.*)[[:alnum:]+]@#,"http:\/\/\\2",＄url);
　　
　　＄roopath=split("\/",＄rootpath);
　　＄rootpath="http://".＄roopath[2]."/";
　　 //echo "这是根本d ："."\n";
　　＄nnn=count(＄roopath)-1;for(＄yu=3;＄yu<＄nnn;＄yu++){＄rootpath.=＄roopath[＄yu]."/";}
　　 //var_dump(＄rootpath);
　　 if(eregi(@#^\.\.[\/[:alnum:]]@#,＄regs[1])){
　　 //echo "这是../目录/ ："."\n";
　　 //＄regs[1]="../xx/xxxxxx.xx";
　　 // ＄rr=split("\/",＄regs[1]);
　　 //for(＄oooi=1;＄oooi<count(＄rr);＄oooi++)
　　＄rrr=＄regs[1];
　　 // {＄rrr.="/".＄rr[＄oooi];
　　＄rrr = eregi_replace("^[\.][\.][\/]",@#@#,＄rrr); //}
　　
　　＄regs[1]=＄rootpath.＄rrr;
　　
　　
　　 }
　　
　　
　　 }else{
　　 if(eregi(@#^[[:alnum:]]@#,＄regs[1])){ ＄regs[1]=＄rootpath.＄regs[1]; }
　　
　　 else{ ＄regs[1] = eregi_replace("^[\/]",@#@#,＄regs[1]);＄regs[1]=＄rootpath.＄regs[1];}
　　
　　 }
　　
　　
　　 }
　　
　　
　　
　　
　　
　　＄line = ＄regs[2];
　　if(eregi(@#(.)*[\.](htm|shtm|html|asp|aspx|php|jsp|cgi)(.)*@#,＄regs[1])){
　　＄out[0][]=＄regs[1]; }
　　}
　　}
　　}for (＄ouou=0;＄ouou<count(＄out[0]);＄ouou++)
　　 {
　　 if(＄out[0][＄ouou]==＄out[0][＄ouou+1]){
　　＄sameurlsum=1;
　　//echo "sameurlsum=1:";
　　 for(＄sameurl=1;＄sameurl<count(＄out[0]);＄sameurl++){
　　 if(＄out[0][＄ouou+＄sameurl]==＄out[0][＄ouou+＄sameurl+1]){＄sameurlsum++;}
　　 else{break;}
　　 }
　　
　　
　　 for(＄p=＄ouou;＄p<count(＄out[0]);＄p++)
　　 { ＄out[0][＄p]=＄out[0][＄p+＄sameurlsum];}
　　 }
　　 }
　　
　　
　　＄i=0;
　　while(＄out[0][++＄i]) {
　　//echo ＄root.＄out[0][＄i]."\r\n";
　　＄outed[0][＄i]=＄out[0][＄i];
　　
　　}
　　unset(＄out);
　　＄out=＄outed; return ＄out;
　　}
　　?>

原文转自：http://www.ltesting.net

软件测试 > 测试开发技术 > 软件测试开发语言 > PHP >