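BeautifulSoup warning: BeautifulSoup([your markup], "html.parser")
Today I was reading the book <Python数据采集> (the Chinese edition of Web Scraping with Python) and typed in some of the code as I went. Here is the code: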
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())
print(bsObj.h1)
The book doesn't mention it, but running this code produces a warning:
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/jiangxiaohan/Desktop/PythonDemo/beautifulSoupDemo/beautifulSoupTest1.py
/Users/jiangxiaohan/Library/Python/3.6/lib/python/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 6 of the file /Users/jiangxiaohan/Desktop/PythonDemo/beautifulSoupDemo/beautifulSoupTest1.py. To get rid of this warning, change code that looks like this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "html.parser")
markup_type=markup_type))
<h1>An Interesting Title</h1>
Process finished with exit code 0
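This means you did not tell BeautifulSoup which parser to use, so it picked the best HTML parser available on this system, which here is html.parser. That is usually not a problem, but on another system or in a different virtual environment it may pick a different parser and produce different results. To get rid of the warning, pass the parser explicitly: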
BeautifulSoup(html.read(), "html.parser")
That's all it takes.
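As a quick way to check the fix without touching the network, the sketch below parses a small inline HTML string (a stand-in for the page fetched above, not the real response) with and without an explicit parser, and records whether the "No parser was explicitly specified" warning fires. The helper name parse_and_collect and the MARKUP constant are made up for this illustration.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the example runs offline.
MARKUP = "<html><body><h1>An Interesting Title</h1></body></html>"


def parse_and_collect(*parser_args):
    """Parse MARKUP and return (soup, list of 'no parser' warning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones Python would normally show only once.
        warnings.simplefilter("always")
        soup = BeautifulSoup(MARKUP, *parser_args)
    messages = [
        str(w.message)
        for w in caught
        if "No parser was explicitly specified" in str(w.message)
    ]
    return soup, messages


# Without a parser argument: BeautifulSoup guesses and warns.
soup_implicit, implicit_msgs = parse_and_collect()

# With an explicit parser: no guessing, so no warning.
soup_explicit, explicit_msgs = parse_and_collect("html.parser")

print("warnings without parser:", len(implicit_msgs))
print("warnings with parser:", len(explicit_msgs))
print(soup_explicit.h1)
```

In a real script the same change is just the second argument: BeautifulSoup(html.read(), "html.parser"). Naming the parser also makes the script behave the same on machines where another parser such as lxml happens to be installed.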
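Author: 蒋昉霖
Link: https://www.jianshu.com/p/e09403f4cd6a
Source: 简书 (Jianshu)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.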