BeautifulSoup warning: BeautifulSoup([your markup], "html.parser")

Posted on: May 12, 2020
Categories: python3, selenium, operations

Today I was reading the book "Python数据采集" (Web Scraping with Python) and typed in some code along the way. Here it is:

from urllib.request import urlopen
from bs4 import BeautifulSoup


html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())
print(bsObj.h1)

The book doesn't mention it, but running this code produces a warning:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/jiangxiaohan/Desktop/PythonDemo/beautifulSoupDemo/beautifulSoupTest1.py
/Users/jiangxiaohan/Library/Python/3.6/lib/python/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 6 of the file /Users/jiangxiaohan/Desktop/PythonDemo/beautifulSoupDemo/beautifulSoupTest1.py. To get rid of this warning, change code that looks like this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

  markup_type=markup_type))
<h1>An Interesting Title</h1>

Process finished with exit code 0

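The warning means that no parser was specified for BeautifulSoup, so it falls back to the best HTML parser available on this system, which here is html.parser. That is usually not a problem, but on another system or in a different virtual environment it may pick a different parser and produce different results. To get rid of the warning, pass the parser explicitly: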

BeautifulSoup(html.read(), "html.parser")

With that change the warning goes away.
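To check the fix without depending on the network, here is a minimal sketch. It assumes an inline HTML string (SAMPLE_HTML) standing in for the downloaded page, and a small helper, parse_heading, which is a made-up name for illustration and not part of BeautifulSoup. It records any warnings raised during parsing and looks for the "No parser was explicitly specified" message quoted above.

```python
import warnings

from bs4 import BeautifulSoup

# Assumption: inline markup standing in for the page fetched with urlopen,
# so the example runs without network access.
SAMPLE_HTML = "<html><body><h1>An Interesting Title</h1></body></html>"


def parse_heading(markup, parser="html.parser"):
    """Hypothetical helper: return the text of the first <h1>,
    always naming the parser explicitly."""
    soup = BeautifulSoup(markup, parser)
    return soup.h1.get_text()


# Record every warning raised while parsing instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    heading = parse_heading(SAMPLE_HTML)

# Keep only the "no parser specified" warning discussed in this post.
parser_warnings = [
    w for w in caught
    if "No parser was explicitly specified" in str(w.message)
]

print(heading)
print("parser warnings:", len(parser_warnings))
```

If other parsers such as lxml are installed, their names can be passed as the second argument in the same way. Naming the parser in the code is what keeps the behavior the same from one environment to the next.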

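Author: 蒋昉霖
Link: https://www.jianshu.com/p/e09403f4cd6a
Source: Jianshu (简书)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, please credit the source.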
