Original article: http://zephyrfalcon.org/labs/python_pitfalls.html


10 Python pitfalls

(or however many I'll find ;-)

These are not necessarily warts or flaws; rather, they are (side effects of) language features that often trip up newbies, and sometimes experienced programmers. Incomplete understanding of some core Python behavior may cause people to get bitten by these.

This document is meant as some sort of guideline to those who are new to Python. It's better to learn about the pitfalls early, than to encounter them in production code shortly before a deadline. :-} It is *not* meant to criticize the language; as said, most of these pitfalls are not due to language flaws.

1. Inconsistent indentation

OK, this is a cheesy one to start with. However, many newbies come from languages where whitespace "doesn't matter", and are in for a rude surprise when they find out the hard way that their inconsistent indentation practices are punished by Python.

Solution: Indent consistently. Use all spaces, or all tabs, but don't mix them. A decent editor helps.
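To see the punishment in action, here is a small sketch. It assumes Python 3 (this article's own examples are Python 2), where the compiler refuses indentation that mixes tabs and spaces ambiguously. The snippet in source is made up for illustration: its second line is indented with a tab, its third with eight spaces.

```python
# Hypothetical snippet: line 2 uses a tab, line 3 uses eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    result = "compiled"
except TabError as exc:
    # TabError is a subclass of IndentationError
    result = "TabError: %s" % exc

print(result)
```

Both lines look equally indented in an editor with 8-column tabs, which is exactly why this is hard to spot by eye.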

2. Assignment, aka names and objects

People coming from statically typed languages like Pascal and C often assume that Python variables and assignment work the same as in their language of choice. At first glance, it does indeed look the same:

a = b = 3
a = 4
print a, b  # prints: 4 3

However, then they run into trouble when using mutable objects. Often this goes hand in hand with a claim that Python treats mutable and immutable objects differently.

a = [1, 2, 3]
b = a
a.append(4)
print b
# b is now [1, 2, 3, 4] as well

What is going on is that a statement like a = [1, 2, 3] does two things: 1. it creates an object, in this case a list, with value [1, 2, 3]; 2. it binds name a to it in the local namespace. b = a then binds b to the same list (which is already referenced by a). Once you realize this, it is less difficult to understand what a.append(4) does... it changes the list referenced by both a and b.

The idea that mutable and immutable objects are treated differently when doing assignment, is incorrect. When doing a = 3 and b = a, the exact same thing happens as with the list. a and b now refer to the same object, an integer with value 3. However, because integers are immutable, you don't run into side effects.

Solution: Read this. To get rid of unwanted side effects, copy explicitly (using the copy module, a dict's copy() method, the slice operator, etc). Python never copies implicitly.
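A minimal sketch of the difference between binding a second name and making a copy. It is written for Python 3, and the names alias, sliced, copied, nested, shallow and deep are only illustrative.

```python
import copy

a = [1, 2, 3]
alias = a              # a second name for the same list
sliced = a[:]          # shallow copy made with the slice operator
copied = copy.copy(a)  # shallow copy made with the copy module

a.append(4)

print(alias)   # [1, 2, 3, 4]  (same object as a)
print(sliced)  # [1, 2, 3]
print(copied)  # [1, 2, 3]

# A shallow copy still shares nested objects; copy.deepcopy does not.
nested = [[1], [2]]
shallow = nested[:]
deep = copy.deepcopy(nested)
nested[0].append(99)

print(shallow[0])  # [1, 99]
print(deep[0])     # [1]
```

Note the last part: copying the outer list is not always enough when the items themselves are mutable.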

3. The += operator

In languages like C, augmented assignment operators like += are a shorthand for a longer expression. For example,

x += 42;

is syntactic sugar for

x = x + 42;

So, you might think that it's the same in Python. Sure enough, it seems that way at first:

a = 1
a = a + 42
# a is 43
a = 1
a += 42
# a is 43

However, for mutable objects, x += y is not necessarily the same as x = x + y. Consider lists:

>>> z = [1, 2, 3]
>>> id(z)
24213240
>>> z += [4]
>>> id(z)
24213240
>>> z = z + [5]
>>> id(z)
24226184

x += y changes the list in-place, having the same effect as the extend method. x = x + y creates a new list and rebinds it to x, which is something else. A subtle difference that can lead to subtle and hard-to-catch bugs.

Not only that, it also leads to surprising behavior when mixing mutable and immutable containers:

>>> t = ([],)
>>> t[0] += [2, 3]
Traceback (most recent call last):
  File "<input>", line 1, in ?
TypeError: object doesn't support item assignment
>>> t
([2, 3],)

Sure enough, tuples don't support item assignment -- but after applying the +=, the list inside the tuple *did* change! The reason is again that += changes in-place. The item assignment doesn't work, but when the exception occurs, the item has already been changed in place.

This is one pitfall that I personally consider a wart.

Solution: depending on your stance on this, you can: avoid += altogether; use it for integers only; or just live with it. :-)
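Where this tends to bite in practice is inside functions, because the in-place change is visible to the caller. The sketch below assumes Python 3; the function names extend_in_place and extend_rebinding are made up for this example.

```python
def extend_in_place(items):
    items += [99]         # mutates the caller's list in place


def extend_rebinding(items):
    items = items + [99]  # builds a new list, bound to the local name only


data = [1, 2]
extend_in_place(data)
after_iadd = list(data)   # [1, 2, 99]: the caller's list changed

extend_rebinding(data)
after_add = list(data)    # still [1, 2, 99]: the caller's list is untouched

# The tuple case from above: the exception is raised,
# but the list inside the tuple has already been extended.
t = ([],)
try:
    t[0] += [2, 3]
    raised = False
except TypeError:
    raised = True

print(after_iadd, after_add, raised, t)
```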

4. Class attributes vs instance attributes

At least two things can go wrong here. First of all, newbies regularly stick attributes in a class (rather than an instance), and are surprised when the attributes are shared between instances:

>>> class Foo:
...     bar = []
...     def __init__(self, x):
...         self.bar.append(x)
...
>>> f = Foo(42)
>>> g = Foo(100)
>>> f.bar, g.bar
([42, 100], [42, 100])

This is not a wart, though, but a nice feature that can be useful in many situations. The misunderstanding springs from the fact that class attributes have been used rather than instance attributes, possibly because instance attributes are created differently from other languages. In C++, Object Pascal, etc, you declare them in the class body.

Another (small) pitfall is that self.foo can refer to two things: the instance attribute foo, or, in absence of that, the class attribute foo. Compare:

>>> class Foo:
...     a = 42
...     def __init__(self):
...         self.a = 43
...
>>> f = Foo()
>>> f.a
43

and

>>> class Foo:
...     a = 42
...
>>> f = Foo()
>>> f.a
42

In the first example, f.a refers to the instance attribute, with value 43. It overrides the class attribute a with value 42. In the second example, there is no instance attribute a, so f.a refers to the class attribute.

The following code combines the two:

>>> class Foo:
...     bar = []
...     def __init__(self, x):
...         self.bar = self.bar + [x]
...
>>> f = Foo(42)
>>> g = Foo(100)
>>> f.bar
[42]
>>> g.bar
[100]

In self.bar = self.bar + [x], the self.bars are not the same... the second one refers to the class attribute bar, then the result of the expression is bound to the instance attribute.

Solution: This distinction can be confusing, but is not incomprehensible. Use class attributes when you want to share something between multiple class instances. To avoid ambiguity, you can refer to them as self.__class__.name rather than self.name, even if there is no instance attribute with that name. Use instance attributes for attributes unique to the instance, and refer to them as self.name.

Update: Several people noted that #3 and #4 can be combined for even more twisted fun:

>>> class Foo:
...     bar = []
...     def __init__(self, x):
...         self.bar += [x]
...
>>> f = Foo(42)
>>> g = Foo(100)
>>> f.bar
[42, 100]
>>> g.bar
[42, 100]

Again, the reason for this behavior is that self.bar += something is not the same as self.bar = self.bar + something. self.bar refers to Foo.bar here, so f and g update the same list.
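One way to keep the two kinds of attributes apart, sketched for Python 3: create per-instance state in __init__, and name the class explicitly when you really want shared state. The counter attribute instances is an assumption added for illustration; it is not part of the examples above.

```python
class Foo:
    instances = 0  # class attribute: deliberately shared by all instances

    def __init__(self, x):
        self.bar = [x]      # instance attribute: a fresh list per instance
        Foo.instances += 1  # naming the class makes the sharing explicit


f = Foo(42)
g = Foo(100)

print(f.bar)          # [42]
print(g.bar)          # [100]
print(Foo.instances)  # 2
print("instances" in f.__dict__)  # False: f has no instance attribute of that name
```

Writing Foo.instances (or self.__class__.instances) instead of self.instances avoids accidentally creating an instance attribute that shadows the class attribute.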

5. Mutable default arguments

This one bites beginners over and over again. It's really a variant of #2, combined with unexpected behavior of default arguments. Consider this function:

>>> def popo(x=[]):
...     x.append(666)
...     print x
...
>>> popo([1, 2, 3])
[1, 2, 3, 666]
>>> x = [1, 2]
>>> popo(x)
[1, 2, 666]
>>> x
[1, 2, 666]

This was expected. But now:

>>> popo()
[666]
>>> popo()
[666, 666]
>>> popo()
[666, 666, 666]

Maybe you expected that the output would be [666] in all cases... after all, when popo() is called without arguments, it takes [] as the default argument for x, right? Wrong. The default argument is bound *once*, when the function is *created*, not when it's called. (In other words, for a function f(x=[]), x is *not* bound whenever the function is called. x got bound to [] when we defined f, and that's it.) So if it's a mutable object, and it has changed, then the next function call will take this same list (which has different contents now) as its default argument.

Solution: This behavior can occasionally be useful. In general, just watch out for unwanted side effects.
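If you do want a fresh list on every call, a common idiom (not part of the original text, and sketched here for Python 3) is to use None as the default and create the list inside the function. The function shared is only there to show where the default actually lives.

```python
def shared(x=[]):
    x.append(666)
    return x


def popo(x=None):
    if x is None:
        x = []  # a new list is created on every call that omits x
    x.append(666)
    return x


shared()
shared()
stored_default = shared.__defaults__[0]  # the one list created at definition time

first = popo()
second = popo()

print(stored_default)  # [666, 666]
print(first, second)   # [666] [666]
```

The default of shared is stored on the function object itself, which is why it survives between calls.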

6. UnboundLocalError

According to the reference manual, this error occurs if a name "refers to a local variable that has not been bound". That sounds cryptic. It's best illustrated by a small example:

>>> def p():
...     x = x + 2
...
>>> p()
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "<input>", line 2, in p
UnboundLocalError: local variable 'x' referenced before assignment

Inside p, the statement x = x + 2 cannot be resolved, because the x in the expression x + 2 has no value yet. That seems reasonable; you can't refer to a name that hasn't been bound yet. But now consider:

>>> x = 2
>>> def q():
...     print x
...     x = 3
...     print x
...
>>> q()
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "<input>", line 2, in q
UnboundLocalError: local variable 'x' referenced before assignment

You'd think that this piece of code would be valid -- first it prints 2 (for the global variable x), then assigns the local variable x to 3, and prints it (3). This doesn't work though. This is because of scoping rules, explained by the reference manual:

"If a name is bound in a block, it is a local variable of that block. If a name is bound at the module level, it is a global variable. (The variables of the module code block are local and global.) If a variable is used in a code block but not defined there, it is a free variable.
 
When a name is not found at all, a NameError exception is raised. If the name refers to a local variable that has not been bound, a UnboundLocalError exception is raised."

In other words: a variable in a function can be local or global, but not both. (No matter if you rebind it later.) In the example above, Python determines that x is local (according to the rules). But upon execution it encounters print x, and x doesn't have a value yet... hence the error.

Note that a function body of just print x or x = 3; print x would have been perfectly valid.

Solution: Don't mix local and global variables like this.
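Two ways out, sketched for Python 3 with made-up function names: give the local value a different name, or (if you really mean to rebind the module-level name) say so with a global statement.

```python
x = 2


def broken():
    seen = x  # fails: the assignment below makes x local to this function
    x = 3
    return seen, x


def read_then_local():
    seen = x  # reads the global, because x is never assigned in this function
    y = 3     # a different name for the local value
    return seen, y


def rebind_global():
    global x  # declares that x means the module-level name
    seen = x
    x = 3
    return seen, x


try:
    broken()
    error = None
except UnboundLocalError as exc:
    error = exc

kept = read_then_local()
rebound = rebind_global()

print(type(error).__name__)  # UnboundLocalError
print(kept, rebound, x)      # (2, 3) (2, 3) 3
```

The global statement changes the module-level x as a side effect, so the first option is usually the less surprising one.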

7. Floating point rounding errors

When using floating point numbers, printing their values may have surprising results. To make matters more interesting, the str() and repr() representations may differ. An example says it all:

>>> c = 0.1
>>> c
0.10000000000000001
>>> repr(c)
'0.10000000000000001'
>>> str(c)
'0.1'

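Because many decimal fractions (0.1 among them) cannot be represented exactly in base 2 (which is what computer hardware uses), the value that is actually stored is the nearest binary approximation. repr() shows enough digits to expose that approximation, while str() rounds to fewer digits and hides it. (From Python 2.7 and 3.1 onward, repr() prints the shortest string that converts back to the same float, so it shows '0.1' there; the stored value is no different.)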

Solution: Read the tutorial for more information.
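A short Python 3 sketch of what is stored and of two common ways to cope: comparing with a tolerance via math.isclose, or using the decimal module when exact decimal arithmetic matters. Neither is mentioned in the text above; they are standard library tools chosen for illustration.

```python
import math
from decimal import Decimal

c = 0.1
seventeen = format(c, ".17g")  # '0.10000000000000001', the digits shown above
exact = Decimal(c)             # the exact binary value stored for 0.1

total = 0.1 + 0.2
naive_equal = (total == 0.3)             # False
close_enough = math.isclose(total, 0.3)  # True

# Decimal built from strings does exact decimal arithmetic.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(seventeen)
print(exact)
print(naive_equal, close_enough, decimal_sum)
```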

8. String concatenation

This is a different kind of pitfall. In many languages, concatenating strings with the + operator or something similar might be quite efficient. For example, in Pascal:

var S : String;
for I := 1 to 10000 do begin
S := S + Something(I);
end;

(This piece of code assumes a string type of more than 255 characters, which was the maximum in Turbo Pascal, aside... ;-)

Similar code in Python is likely to be highly inefficient. Since Python strings are immutable (as opposed to Pascal strings), a new string is created for every iteration (and old ones are thrown away). This may result in unexpected performance hits. Using string concatenation with + or += is OK for small changes, but it's usually not recommended in a loop.

Solution: If at all possible, create a list of values, then use string.join (or the join() method) to glue them together as one long string. Sometimes this can result in dramatic speedups.

To illustrate this, a simple benchmark. (timeit is a simple function that runs another function and returns how long it took to complete, in seconds.)

>>> def f():
...     s = ""
...     for i in range(100000):
...         s = s + "abcdefg"[i % 7]
...
>>> timeit(f)
23.7819999456
>>> def g():
...     z = []
...     for i in range(100000):
...         z.append("abcdefg"[i % 7])
...     return ''.join(z)
...
>>> timeit(g)
0.343000054359

Update: This was fixed in CPython 2.4. According to the What's New in Python 2.4 page: "String concatenations in statements of the form s = s + "abc" and s += "abc" are now performed more efficiently in certain circumstances. This optimization won't be present in other Python implementations such as Jython, so you shouldn't rely on it; using the join() method of strings is still recommended when you want to efficiently glue a large number of strings together."
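If you want to measure this on your own interpreter, here is a sketch that uses the standard library's timeit module in place of the home-grown timeit helper above. It assumes Python 3; the function names and the loop size are arbitrary, and the timings will vary by machine and implementation, so none are quoted here.

```python
import timeit


def build_with_plus(n=10000):
    s = ""
    for i in range(n):
        s = s + "abcdefg"[i % 7]
    return s


def build_with_join(n=10000):
    parts = []
    for i in range(n):
        parts.append("abcdefg"[i % 7])
    return "".join(parts)


# Run each function 10 times and report the total time in seconds.
plus_seconds = timeit.timeit(build_with_plus, number=10)
join_seconds = timeit.timeit(build_with_join, number=10)

print("plus: %.4fs, join: %.4fs" % (plus_seconds, join_seconds))
```

On CPython the gap may be small because of the optimization quoted above; the join version is the one that behaves predictably across implementations.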

9. Binary mode for files

Or rather, it's *not* using binary mode that can cause confusion. Some operating systems, like Windows, distinguish between binary files and text files. To illustrate this, files in Python can be opened in binary mode or text mode:

f1 = open(filename, "r")  # text
f2 = open(filename, "rb") # binary

In text mode, lines may be terminated by any of the newline/carriage return conventions (\n, \r, or \r\n). Binary mode does no such translation and returns the bytes as they are. Also, on Windows, when reading from a file in text mode, newlines are represented by Python as \n (universal); in binary mode, it's \r\n. Reading a piece of data may therefore yield very different results in these modes.

There are also systems that don't have the text/binary distinction. On Unix, for example, files are always opened in binary mode. Because of this, some code written on Unix may open a file in mode 'r', which has different results when run on Windows. Or, someone coming from Unix may use the 'r' flag on Windows, and be puzzled about the results.

Solution: Use the correct flags -- 'r' for text mode (even on Unix), 'rb' for binary mode.
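The effect is easy to reproduce. This sketch assumes Python 3, where text mode translates line endings on reading by default on every platform (so the Unix behavior described above for Python 2 no longer applies), and where binary mode returns bytes rather than str. The payload and the temporary file are made up for the example.

```python
import os
import tempfile

payload = b"first\r\nsecond\n"  # one Windows-style and one Unix-style line ending

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(payload)

    with open(path, "rb") as f:  # binary: bytes exactly as stored
        raw = f.read()

    with open(path, "r", encoding="ascii") as f:  # text: newlines translated to \n
        text = f.read()
finally:
    os.remove(path)

print(repr(raw))
print(repr(text))
```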

10. Catching multiple exceptions

Sometimes you want to catch multiple exceptions in one except clause. An obvious idiom seems to be:

try:
    ...something that raises an error...
except IndexError, ValueError:
    # expects to catch IndexError and ValueError
    # wrong!
    pass

This doesn't work though... the reason becomes clear when comparing this to:

>>> try:
...     1/0
... except ZeroDivisionError, e:
...     print e
...
integer division or modulo by zero

The first "argument" in the except clause is the exception class, the second one is an optional name, which will be used to bind the actual exception instance that has been raised. So, in the erroneous code above, the except clause catches an IndexError, and binds the name ValueError to the exception instance. Probably not what we want. ;-)

This works better:

try:
    ...something that raises an error...
except (IndexError, ValueError):
    # does catch IndexError and ValueError
    pass

Solution: When catching multiple exceptions in one except clause, use parentheses to create a tuple with exceptions.
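For comparison, a sketch of the tuple form in Python 3, where the exception instance is bound with the as keyword rather than a comma. The helper classify and the lambdas passed to it are made up for illustration.

```python
def classify(action):
    """Call action() and report which of the two exceptions it raised, if any."""
    try:
        action()
    except (IndexError, ValueError) as exc:
        # exc is the exception instance; the tuple lists the classes to catch
        return type(exc).__name__
    return "no error"


index_result = classify(lambda: [1, 2, 3][10])
value_result = classify(lambda: int("not a number"))
ok_result = classify(lambda: None)

print(index_result, value_result, ok_result)
```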

