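This article collects typical usage examples of lxml.etree.ParserError in Python. ParserError is an exception class rather than a method, so every example shows code that catches it. If you are wondering how etree.ParserError is used in practice, the selected examples below may help. You can also look further into usage examples for its containing module, lxml.etree.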

Nine code examples that handle etree.ParserError are shown below, sorted by popularity by default.
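Before the collected examples, it may help to see where ParserError sits among lxml's exceptions. The sketch below is not taken from any of the projects listed here; `classify` and `parse_xml` are hypothetical helper names. It relies on the fact that ParserError derives from LxmlError, while syntax errors in a document are reported as XMLSyntaxError, a subclass of ParseError. That is why several of the examples catch both names.

```python
from lxml import etree


def classify(exc_type):
    """Describe where an lxml exception class sits in the hierarchy."""
    if issubclass(exc_type, etree.ParseError):
        return "syntax error (ParseError family)"
    if issubclass(exc_type, etree.LxmlError):
        return "other lxml error"
    return "not an lxml error"


def parse_xml(text):
    """Parse XML text; return (root, None) on success or (None, exception)."""
    try:
        return etree.fromstring(text), None
    except (etree.ParserError, etree.ParseError) as exc:
        return None, exc


root, err = parse_xml("<a><b/></a>")
print(root.tag)                      # a

_, err = parse_xml("<a><b></a>")     # mismatched tags
print(type(err).__name__)            # XMLSyntaxError

print(classify(etree.ParserError))   # other lxml error
```

Catching only ParserError would let the XMLSyntaxError in the second call propagate, so code that wants to treat every parsing failure alike usually names both classes.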

Example 1: feed

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

# Python 2 code: `unicode` is the Python 2 text type.
def feed(self, markup):
    if isinstance(markup, bytes):
        markup = BytesIO(markup)
    elif isinstance(markup, unicode):
        markup = StringIO(markup)

    # Call feed() at least once, even if the markup is empty,
    # or the parser won't be initialized.
    data = markup.read(self.CHUNK_SIZE)
    try:
        self.parser = self.parser_for(self.soup.original_encoding)
        self.parser.feed(data)
        while len(data) != 0:
            # Now call feed() on the rest of the data, chunk by chunk.
            data = markup.read(self.CHUNK_SIZE)
            if len(data) != 0:
                self.parser.feed(data)
        self.parser.close()
    except (UnicodeDecodeError, LookupError, etree.ParserError) as e:
        raise ParserRejectedMarkup(str(e))

Developer: MarcelloLins, project: ServerlessCrawler-VancouverRealState, lines of code: 22

Example 2: feed

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def feed(self, markup):
    if isinstance(markup, bytes):
        markup = BytesIO(markup)
    elif isinstance(markup, str):
        markup = StringIO(markup)

    # Call feed() at least once, even if the markup is empty,
    # or the parser won't be initialized.
    data = markup.read(self.CHUNK_SIZE)
    try:
        self.parser = self.parser_for(self.soup.original_encoding)
        self.parser.feed(data)
        while len(data) != 0:
            # Now call feed() on the rest of the data, chunk by chunk.
            data = markup.read(self.CHUNK_SIZE)
            if len(data) != 0:
                self.parser.feed(data)
        self.parser.close()
    except (UnicodeDecodeError, LookupError, etree.ParserError) as e:
        raise ParserRejectedMarkup(str(e))

Developer: the-ethan-hunt, project: B.E.N.J.I., lines of code: 22
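The chunk-by-chunk feeding used in the two feed examples can be tried in isolation with lxml's own feed interface (`XMLParser.feed()` and `close()`). The following is a minimal sketch under stated assumptions, not the code of either project: `feed_in_chunks`, `MarkupRejected`, and the small `CHUNK_SIZE` are hypothetical names chosen for illustration, and the parser is a plain `etree.XMLParser` rather than the builder-specific parser created by `parser_for`. Because a plain parser reports malformed input as XMLSyntaxError, the sketch also catches ParseError.

```python
from io import BytesIO

from lxml import etree

CHUNK_SIZE = 16  # assumption: deliberately small so the loop runs several times


class MarkupRejected(Exception):
    """Hypothetical stand-in for ParserRejectedMarkup."""


def feed_in_chunks(markup):
    """Feed bytes to an lxml XMLParser chunk by chunk and return the root."""
    stream = BytesIO(markup)
    parser = etree.XMLParser()
    try:
        data = stream.read(CHUNK_SIZE)
        while data:
            parser.feed(data)
            data = stream.read(CHUNK_SIZE)
        # close() finishes parsing and returns the root element.
        return parser.close()
    except (etree.ParserError, etree.ParseError) as exc:
        # Convert lxml's errors into one application-level exception.
        raise MarkupRejected(str(exc)) from exc


root = feed_in_chunks(b"<root><item>1</item><item>2</item></root>")
print(root.tag, len(root))  # root 2
```

Wrapping both `feed()` and `close()` in the same `try` matters, since a malformed document may be reported by either call depending on where the problem sits in the stream.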

Example 3: extract_html_content

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def extract_html_content(self, html_body, fix_html=True):
    """Ingestor implementation."""
    if html_body is None:
        return
    try:
        try:
            doc = html.fromstring(html_body)
        except ValueError:
            # Work around encoding declarations.
            # https://stackoverflow.com/questions/3402520
            html_body = self.RE_XML_ENCODING.sub("", html_body, count=1)
            doc = html.fromstring(html_body)
    except (ParserError, ParseError, ValueError):
        raise ProcessingException("HTML could not be parsed.")

    self.extract_html_header(doc)
    self.cleaner(doc)
    text = self.extract_html_text(doc)
    self.result.flag(self.result.FLAG_HTML)
    self.result.emit_html_body(html_body, text)

Developer: occrp-attic, project: ingestors, lines of code: 22

Example 4: ingest

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def ingest(self, file_path):
    """Ingestor implementation."""
    file_size = self.result.size or os.path.getsize(file_path)
    if file_size > self.MAX_SIZE:
        raise ProcessingException("XML file is too large.")

    try:
        doc = etree.parse(file_path)
    except (ParserError, ParseError):
        raise ProcessingException("XML could not be parsed.")

    text = self.extract_html_text(doc.getroot())
    transform = etree.XSLT(self.XSLT)
    html_doc = transform(doc)
    html_body = html.tostring(html_doc, encoding=str, pretty_print=True)
    self.result.flag(self.result.FLAG_HTML)
    self.result.emit_html_body(html_body, text)

Developer: occrp-attic, project: ingestors, lines of code: 19
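The size check, guarded parse, and XSLT step of the ingest example can be reproduced as a standalone sketch. Everything project-specific here is an assumption made for illustration: `ProcessingError` stands in for ProcessingException, `MAX_SIZE` and `STYLESHEET` stand in for the class attributes `self.MAX_SIZE` and `self.XSLT`, and `write_temp` is a hypothetical helper that only exists to create input files.

```python
import os
import tempfile

from lxml import etree

MAX_SIZE = 1024 * 1024  # assumption: 1 MiB limit; the original reads self.MAX_SIZE

# Assumption: a tiny stylesheet standing in for self.XSLT.
# It turns every <item> into an <li> inside a single <ul>.
STYLESHEET = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="//item">
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>""")


class ProcessingError(Exception):
    """Hypothetical stand-in for ProcessingException."""


def ingest(path):
    """Parse an XML file and return the XSLT result tree."""
    if os.path.getsize(path) > MAX_SIZE:
        raise ProcessingError("XML file is too large.")
    try:
        doc = etree.parse(path)
    except (etree.ParserError, etree.ParseError) as exc:
        raise ProcessingError("XML could not be parsed.") from exc
    transform = etree.XSLT(STYLESHEET)
    return transform(doc)


def write_temp(content):
    """Write bytes to a temporary .xml file and return its path."""
    with tempfile.NamedTemporaryFile("wb", suffix=".xml", delete=False) as handle:
        handle.write(content)
    return handle.name


path = write_temp(b"<list><item>a</item><item>b</item></list>")
result = ingest(path)
print([li.text for li in result.getroot()])  # ['a', 'b']
os.remove(path)
```

Checking the file size before calling `etree.parse` keeps an oversized input from being loaded at all, and raising one application-level exception lets callers handle both failure modes without knowing lxml's exception classes.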

Example 5: _retrieve_html_page

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def _retrieve_html_page(self):
    """
    Download the requested player's stats page.

    Download the requested page and strip all of the comment tags before
    returning a PyQuery object which will be used to parse the data.
    Oftentimes, important data is contained in tables which are hidden in
    HTML comments and not accessible via PyQuery.

    Returns
    -------
    PyQuery object
        The requested page is returned as a queriable PyQuery object with
        the comment tags removed.
    """
    url = self._build_url()
    try:
        url_data = pq(url)
    except (HTTPError, ParserError):
        return None
    # For NFL, a 404 page doesn't actually raise a 404 error, so it needs
    # to be manually checked.
    if "Page Not Found (404 error)" in str(url_data):
        return None
    return pq(utils._remove_html_comment_tags(url_data))

Developer: roclark, project: sportsreference, lines of code: 27
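The docstring above describes two ideas: returning None when the page cannot be fetched or parsed, and removing comment markers so that tables hidden inside HTML comments become queryable. A rough sketch of both follows, using lxml.html instead of PyQuery. The names are hypothetical: `remove_comment_markers` is only a guess at what a helper like `utils._remove_html_comment_tags` might do, `fetch` is an injected callable standing in for the network download, and OSError stands in for HTTPError.

```python
from lxml import etree, html


def remove_comment_markers(markup):
    """Drop '<!--' and '-->' so commented-out markup becomes part of the page."""
    return str(markup).replace("<!--", "").replace("-->", "")


def load_page(fetch):
    """Return the parsed page with comment markers removed, or None on failure.

    `fetch` is a hypothetical zero-argument callable that returns HTML text.
    """
    try:
        markup = fetch()
        return html.fromstring(remove_comment_markers(markup))
    except (OSError, etree.ParserError):
        # Mirror the original: any download or parser failure yields None.
        return None


page = load_page(lambda: "<div><!--<table><tr><td>42</td></tr></table>--></div>")
print(page.findtext(".//td"))  # 42
```

Returning None instead of raising keeps the calling code simple, at the cost of hiding why a page was unavailable; callers have to check the result before querying it.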

Example 6: _retrieve_html_page

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def _retrieve_html_page(self):
    """
    Download the requested player's stats page.

    Download the requested page and strip all of the comment tags before
    returning a pyquery object which will be used to parse the data.

    Returns
    -------
    PyQuery object
        The requested page is returned as a queriable PyQuery object with
        the comment tags removed.
    """
    url = self._build_url()
    try:
        url_data = pq(url)
    except (HTTPError, ParserError):
        return None
    return pq(utils._remove_html_comment_tags(url_data))

Developer: roclark, project: sportsreference, lines of code: 21

Example 7: _retrieve_html_page

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def _retrieve_html_page(self):
    """
    Download the requested player's stats page.

    Download the requested page and strip all of the comment tags before
    returning a pyquery object which will be used to parse the data.

    Returns
    -------
    PyQuery object
        The requested page is returned as a queriable PyQuery object with
        the comment tags removed.
    """
    url = PLAYER_URL % self._player_id
    try:
        url_data = pq(url)
    except (HTTPError, ParserError):
        return None
    return pq(utils._remove_html_comment_tags(url_data))

Developer: roclark, project: sportsreference, lines of code: 21

Example 8: _pull_conference_page

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def _pull_conference_page(self, conference_abbreviation, year):
    """
    Download the conference page.

    Download the conference page for the requested conference and season
    and create a PyQuery object.

    Parameters
    ----------
    conference_abbreviation : string
        A string of the requested conference's abbreviation, such as
        'big-12'.
    year : string
        A string of the requested year to pull conference information from.
    """
    try:
        return pq(CONFERENCE_URL % (conference_abbreviation, year))
    except (HTTPError, ParserError):
        return None

Developer: roclark, project: sportsreference, lines of code: 21

Example 9: feed

# Required import: from lxml import etree [as alias]
# Or: from lxml.etree import ParserError [as alias]

def feed(self, markup):
    if isinstance(markup, bytes):
        markup = BytesIO(markup)
    elif isinstance(markup, str):
        markup = StringIO(markup)

    # Call feed() at least once, even if the markup is empty,
    # or the parser won't be initialized.
    data = markup.read(self.CHUNK_SIZE)
    try:
        self.parser = self.parser_for(self.soup.original_encoding)
        self.parser.feed(data)
        while len(data) != 0:
            # Now call feed() on the rest of the data, chunk by chunk.
            data = markup.read(self.CHUNK_SIZE)
            if len(data) != 0:
                self.parser.feed(data)
        self.parser.close()
    except (UnicodeDecodeError, LookupError, etree.ParserError) as e:
        raise ParserRejectedMarkup(e)

Developer: Tautulli, project: Tautulli, lines of code: 22

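Note: The lxml.etree.ParserError examples in this article were collected from GitHub, MSDocs, and other source-code and documentation platforms. The snippets come from open-source projects contributed by their developers, and copyright remains with the original authors. Refer to each project's License before distributing or using the code. Do not reproduce without permission.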
