Python-CheatSheet

Python 语法速览与实战清单

Python CheatSheet 是对于 Python 学习/实践过程中的语法与技巧进行盘点，其属于 Awesome CheatSheet 系列，致力于提升学习速度与研发效能，即可以将其当做速查手册，也可以作为轻量级的入门学习资料。本文参考了许多优秀的文章与代码示范，统一声明在了 Python Links；如果希望深入了解某方面的内容，可以继续阅读，或者前往 coding-snippets/python 查看使用 Python 解决常见的数据结构与算法、设计模式、业务功能方面的代码实现。

According to its creator, Guido van Rossum, Python is a:“high-level programming language, and its core design philosophy is all about code readability and a syntax which allows programmers to express concepts in a few lines of code.”

建议使用 pipenv 作为项目环境管理：

# 创建 Python 2/3 版本的项目
$ pipenv --two/--three

# 安装项目依赖，会在当前目录下生成 .venv 目录，包含 python 解释器
$ pipenv install
$ pipenv install --dev

# 弹出 Virtual Env 对应的脚本环境
$ pipenv shell

# 执行文件
$ pipenv run python

# 定位项目路径
$ pipenv --where
/Users/kennethreitz/Library/Mobile Documents/com~apple~CloudDocs/repos/kr/pipenv/test

# 定位虚拟环境路径
$ pipenv --venv
/Users/kennethreitz/.local/share/virtualenvs/test-Skyy4vre

# 定位 Python 解释器路径
$ pipenv --py
/Users/kennethreitz/.local/share/virtualenvs/test-Skyy4vre/bin/python

如果遇到编码问题，可以设置如下环境变量：

export LC_ALL=zh_CN.UTF-8
export LANG=zh_CN.UTF-8

如果遇到网络问题，可以尝试使用国内的镜像源：

[[source]]
url = "https://mirrors.ustc.edu.cn/pypi/web/simple"
verify_ssl = true
name = "pypi"

在这里，我们首先对于 Python 的常用语法有所了解，可以参考 python-snippets。

基础语法

Python 是一门高阶、动态类型的多范式编程语言；定义 Python 文件的时候我们往往会先声明文件编码方式

# 指定脚本调用方式
#!/usr/bin/env python
# 配置 utf-8 编码
# -*- coding: utf-8 -*-

# 配置其他编码
# -*- coding: <encoding-name> -*-

# Vim 中还可以使用如下方式
# vim:fileencoding=<encoding-name>

# Python 中的注释方式
# 这是一个注释

'''
这是多行注释，用三个单引号
'''

"""
这是多行注释，用三个双引号
"""

人生苦短，请用 Python，大量功能强大的语法糖的同时让很多时候 Python 代码看上去有点像伪代码。譬如我们用 Python 实现的简易的快排相较于 Java 会显得很短小精悍

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) / 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print quicksort([3,6,8,10,1,2,1])
# Prints "[1, 1, 2, 3, 6, 8, 10]"

控制台交互

可以根据 __name__ 关键字来判断是否是直接使用 python 命令执行某个脚本，还是外部引用；Google 开源的 fire 也是不错的快速将某个类封装为命令行工具的框架：

import fire

class Calculator(object):
  """A simple calculator class."""

  def double(self, number):
    return 2 * number

if __name__ == '__main__':
  fire.Fire(Calculator)

# python calculator.py double 10  # 20
# python calculator.py double --number=15  # 30

Python 2 中 print 是表达式，而 Python 3 中 print 是函数；如果希望在 Python 2 中将 print 以函数方式使用，则需要自定义引入

from __future__ import print_function

我们也可以使用 pprint 来美化控制台输出内容：

import pprint

stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']
pprint.pprint(stuff)

# 自定义参数
pp = pprint.PrettyPrinter(depth=6)
tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',('parrot', ('fresh fruit',))))))))
pp.pprint(tup)

模块

Python 中的模块(Module)即是 Python 源码文件，其可以导出类、函数与全局变量；当我们从某个模块导入变量时，函数名往往就是命名空间(Namespace)。

而 Python 中的包（Package）则是模块的文件夹，往往由 __init__.py 指明某个文件夹为包，Package 可以为某个目录下所有的文件设置统一入口

# 目录格式
someDir/
	main.py
	subModules/
		__init__.py
		subA.pys
		subSubModules/
			__init__.py
			subSubA.py

# subA.py
def subAFun():
	print('Hello from subAFun')

def subAFunTwo():
	print('Hello from subAFunTwo')

# subSubA.py
def subSubAFun():
	print('Hello from subSubAFun')

def subSubAFunTwo():
	print('Hello from subSubAFunTwo')

# __init__.py from subDir

# 将 'subAFun()' 与 'subAFunTwo()' 添加到 'subModules' 命名空间
from subA import *

# 假设 'subSubModules' 中的 '__init__.py' 为空
# 将 'subSubAFun()' 与 'subSubAFunTwo()' 添加到 'subModules' 命名空间
from subSubModules.subSubA import *

# 假设 'subSubModules' 中的 '__init__.py' 不为空，包含了 'from .subSubA import *'
# __init__.py，将 'subSubAFun()' 与 'subSubAFunTwo()' 添加到 'subSubModules' 命名空间
from subSubA import *

# 将 'subSubAFun()' 与 'subSubAFunTwo()' 添加到 'subModules' 命名空间
from subSubDir import *

# main.py
import subDir

subDir.subAFun() # Hello from subAFun
subDir.subAFunTwo() # Hello from subAFunTwo
subDir.subSubAFun() # Hello from subSubAFun
subDir.subSubAFunTwo() # Hello from subSubAFunTwo

__init__.py 中也支持使用 __all__ 变量来声明所有需要导出的子模块:

import submodule1
import submodule2

__all__ = ['submodule1', 'submodule2']

Python 中只能在 Package 中使用相对导入，不能在用户的应用程序中使用相对导入，因为不论是相对导入还是绝对导入，都是相当于当前模块来说的，对于用户的主应用程序，也就是入口文件，模块名总是 __main__, 所以用户的应用程序必须使用绝对导入，而 Package 中的导入可以使用相对导入。

# python -m 运行该文件
from .some_module import some_class
from ..some_package import some_function
from . import some_class

import foo.baz # absolute import, always OK
from . import .baz # explicit relative import, Python >= 2.5, Py3
import baz # implicit relative import, OK in Python < 2.5, deprecated in Python >= 2.5, error in Python 3

动态加载

Python 支持动态地加载模块文件，即从某个文件中手动初始化对象：

def get_factory_from_template(maintype):
    path = os.path.join(BASE_DIR, 'templates', maintype, FACTORY_FILENAME)
    if (python_version_gte(3, 5)):
        # Python 3.5 code in this block
        import importlib.util
        spec = importlib.util.spec_from_file_location(
            "{}.factory".format(maintype), path)
        foo = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(foo)
        return foo
    elif (python_version_gte(3, 0)):
        from importlib.machinery import SourceFileLoader
        foo = SourceFileLoader(
            "{}.factory".format(maintype), path).load_module()
        return foo
    else:
        # Python 2 code in this block
        import imp
        foo = imp.load_source("{}.factory".format(maintype), path)
        return foo

自定义模块

我们可以通过定义 setup.py 来创建自定义模块:

# pipenv install -e . 来安装本地目录的依赖
from setuptools import setup, find_packages

setup(
    name = "test",
    version = "1.0",
    keywords = ("test", "xxx"),
    description = "eds sdk",
    long_description = "eds sdk for python",
    license = "MIT Licence",

    url = "http://test.com",
    author = "test",
    author_email = "test@gmail.com",

    packages = find_packages(),
    include_package_data = True,
    platforms = "any",
    install_requires = [],

    scripts = [],
    entry_points = {
        'console_scripts': [
            'test = test.help:main'
        ]
    }
)

表达式与控制流

条件选择

Python 中使用 if、elif、els#e 来进行基础的条件选择操作：

if x < 0:
     x = 0
     print('Negative changed to zero')
 elif x == 0:
     print('Zero')
 else:
     print('More')

Python 同样支持 ternary conditional operator，并且提供了

a if condition else b

x = [True, True, False]
if any(x):
    print("At least one True")
if all(x):
    print("Not one False")
if any(x) and not all(x):
    print("At least one True and one False")

也可以使用 Tuple 来实现类似的效果：

# test 需要返回 True 或者 False
(falseValue, trueValue)[test]

# 更安全的做法是进行强制判断
(falseValue, trueValue)[test == True]

# 或者使用 bool 类型转换函数
(falseValue, trueValue)[bool(<expression>)]

循环遍历

for-in 可以用来遍历数组与字典：

words = ['cat', 'window', 'defenestrate']

for w in words:
    print(w, len(w))

# 使用数组访问操作符，能够迅速地生成数组的副本
for w in words[:]:
    if len(w) > 6:
        words.insert(0, w)

# words -> ['defenestrate', 'cat', 'window', 'defenestrate']

如果我们希望使用数字序列进行遍历，可以使用 Python 内置的 range 函数：

a = ['Mary', 'had', 'a', 'little', 'lamb']

for i in range(len(a)):
    print(i, a[i])

tqdm 是不错的命令行中的进度指示器：

from tqdm import tqdm
for i in tqdm(range(10000)):
    ...

基本数据类型

可以使用内建函数进行强制类型转换(Casting )

int(str)
float(str)
str(int)
str(float)

isinstance 方法用于判断某个对象是否源自某个类

ex = 10

# 判断是否为 int 类型
isinstance(ex,int)

# isinstance 也支持同时判断多个类型
# 如下代码判断是否为数组
def is_array(var):
    return isinstance(var, (list, tuple))

Number | 数值类型

x = 3
print type(x) # Prints "<type 'int'>"
print x       # Prints "3"
print x + 1   # Addition; prints "4"
print x - 1   # Subtraction; prints "2"
print x * 2   # Multiplication; prints "6"
print x ** 2  # Exponentiation; prints "9"
x += 1
print x  # Prints "4"
x *= 2
print x  # Prints "8"
y = 2.5
print type(y) # Prints "<type 'float'>"
print y, y + 1, y * 2, y ** 2 # Prints "2.5 3.5 5.0 6.25"

Python2 中默认使用传统除法，即自动四舍五入；而 Python 3 中默认使用精确除法：

# 传统除法 如果是整数除法则执行地板除，如果是浮点数除法则执行精确除法。
> 1/2
0
> 1.0/2.0
0.5

# 精确除法 除法总是会返回真实的商，不管操作数是整形还是浮点型。

# ‘//’无论是否整除返回的都是 int，而且是去尾整除
>>> 5//2
2

# 向上取整，返回值为 int
> math.ceil()

# 向下取整，返回值为 int
> math.floor()

# 返回值为 int
> round()

布尔类型

Python 提供了常见的逻辑操作符，不过需要注意的是 Python 中并没有使用 &&、|| 等，而是直接使用了英文单词。

t = True
f = False
print type(t) # Prints "<type 'bool'>"
print t and f # Logical AND; prints "False"
print t or f  # Logical OR; prints "True"
print not t   # Logical NOT; prints "False"
print t != f  # Logical XOR; prints "True"

String | 字符串

Python 2 中支持 Ascii 码的 str() 类型，独立的 unicode() 类型，没有 byte 类型；而 Python 3 中默认的字符串为 utf-8 类型，并且包含了 byte 与 bytearray 两个字节类型：

type("Guido") # string type is str in python2
# <type 'str'>

# 使用 __future__ 中提供的模块来降级使用 Unicode
from __future__ import unicode_literals
type("Guido") # string type become unicode
# <type 'unicode'>

Python 字符串支持分片、模板字符串等常见操作

var1 = 'Hello World!'
var2 = "Python Programming"

print "var1[0]: ", var1[0]
print "var2[1:5]: ", var2[1:5]
# var1[0]:  H
# var2[1:5]:  ytho

print "My name is %s and weight is %d kg!" % ('Zara', 21)
# My name is Zara and weight is 21 kg!

str[0:4]
len(str)

string.replace("-", " ")
",".join(list)
str.find(",")
str.index(",")   # same, but raises IndexError
str.count(",")
str.split(",")

str.lower()
str.upper()
str.title()

str.lstrip()
str.rstrip()
str.strip()

str.islower()

# 移除所有的特殊字符
re.sub('[^A-Za-z0-9]+', '', mystring)

如果需要判断是否包含某个子字符串，或者搜索某个字符串的下标

# in 操作符可以判断字符串
if "blah" not in somestring:
    continue

# find 可以搜索下标
s = "This be a string"
if s.find("is") == -1:
    print "No 'is' here!"
else:
    print "Found 'is' in the string."

模板字符串

import datetime

name = 'Fred'
age = 50
anniversary = datetime.date(1991, 10, 12)

x = a + b
x = '%s, %s!' % (name, age)
x = 'name: %s; age: %d' % (name, age)
x = '{}, {}!'.format(name, age)
x = 'name: {}; age: {}'.format(name, age)

f'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.'
# 'My name is Fred, my age next year is 51, my anniversary is Saturday, October 12, 1991.'

f'He said his name is {name!r}.'
# "He said his name is 'Fred'."

Regex | 正则表达式

Symbols

Term	Description
.	(period) Matches any single character, except for line breaks.
*	Matches the preceding expression 0 or more times.
+	Matches the preceding expression 1 or more times.
?	Preceding expression is optional (Matches 0 or 1 times).
^	Matches the beginning of the string.
$	Matches the end of the string.

Character groups

Term	Description
\d	Matches any single digit character.
\w	Matches any word character (alphanumeric & underscore).
[XYZ]	Character Set: Matches any single character from the character within the brackets. You can also do a range such as [A-Z]
[XYZ]+	Matches one or more of any of the characters in the set.
[^a-z]	Inside a character set, the ^ is used for negation. In this example, match anything that is NOT an uppercase letter.

Flags: There are five optional flags. They can be used separately or together and are placed after the closing slash. Example: /[A-Z]/g I’ll only be introducing 2 here.

Term	Description
g	Global search
i	case insensitive search

Advanced

Term	Description
(x)	Capturing Parenthesis: Matches x and remembers it so we can use it later.
(?:x)	Non-capturing Parenthesis: Matches x and does not remembers it.
x(?=y)	Lookahead: Matches x only if it is followed by y.

import re

# 判断是否匹配
re.match(r'^[aeiou]', str)

# 以第二个参数指定的字符替换原字符串中内容
re.sub(r'^[aeiou]', '?', str)
re.sub(r'(xyz)', r'\1', str)

# 编译生成独立的正则表达式对象
expr = re.compile(r'^...$')
expr.match(...)
expr.sub(...)

如果我们需要提取出正则表达式中的匹配组，则需要理由正则表达式的中括号与 group 方法:

title_search = re.search('<title>(.*)</title>', html, re.IGNORECASE)

if title_search:
    title = title_search.group(1)

下面列举了常见的表达式使用场景

# 检测是否为 HTML 标签
re.search('<[^/>][^>]*>', '<a href="#label">')

# 常见的用户名密码
re.match('^[a-zA-Z0-9-_]{3,16}$', 'Foo') is not None
re.match('^\w|[-_]{3,16}$', 'Foo') is not None

# Email
re.match('^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$', 'hello.world@example.com')

# Url
exp = re.compile(r'''^(https?:\/\/)? # match http or https
                ([\da-z\.-]+)            # match domain
                \.([a-z\.]{2,6})         # match domain
                ([\/\w \.-]*)\/?$        # match api or file
                ''', re.X)
exp.match('www.google.com')

# IP 地址
exp = re.compile(r'''^(?:(?:25[0-5]
                     |2[0-4][0-9]
                     |[1]?[0-9][0-9]?)\.){3}
                     (?:25[0-5]
                     |2[0-4][0-9]
                     |[1]?[0-9][0-9]?)$''', re.X)
exp.match('192.168.1.1')

集合类型

List | 列表

Operation | 创建增删

list 是基础的序列类型：

l = []
l = list()

# 使用字符串的 split 方法，可以将字符串转化为列表
str.split(".")

# 如果需要将数组拼装为字符串，则可以使用 join
list1 = ['1', '2', '3']
str1 = ''.join(list1)

# 如果是数值数组，则需要先进行转换
list1 = [1, 2, 3]
str1 = ''.join(str(e) for e in list1)

可以使用 append 与 extend 向数组中插入元素或者进行数组连接

x = [1, 2, 3]

x.append([4, 5]) # [1, 2, 3, [4, 5]]

x.extend([4, 5]) # [1, 2, 3, 4, 5]，注意 extend 返回值为 None

可以使用 pop、slices、del、remove 等移除列表中元素：

myList = [10,20,30,40,50]

# 弹出第二个元素
myList.pop(1) # 20
# myList: myList.pop(1)

# 如果不加任何参数，则默认弹出最后一个元素
myList.pop()

# 使用 slices 来删除某个元素
a = [1, 2, 3, 4, 5, 6 ]
index = 3 # Only Positive index
a = a[:index] + a[index+1 :]

# 根据下标删除元素
myList = [10,20,30,40,50]
rmovIndxNo = 3
del myList[rmovIndxNo] # myList: [10, 20, 30, 50]

# 使用 remove 方法，直接根据元素删除
letters = ["a", "b", "c", "d", "e"]
numbers.remove(numbers[1])
print(*letters) # used a * to make it unpack you don't have to

Iteration | 索引遍历

你可以使用基本的 for 循环来遍历数组中的元素，就像下面介个样纸

animals = ['cat', 'dog', 'monkey']
for animal in animals:
    print animal
# Prints "cat", "dog", "monkey", each on its own line.

如果你在循环的同时也希望能够获取到当前元素下标，可以使用 enumerate 函数

animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print '#%d: %s' % (idx + 1, animal)
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line

Python 也支持切片(Slices):

nums = range(5)    # range is a built-in function that creates a list of integers
print nums         # Prints "[0, 1, 2, 3, 4]"
print nums[2:4]    # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print nums[2:]     # Get a slice from index 2 to the end; prints "[2, 3, 4]"
print nums[:2]     # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print nums[:]      # Get a slice of the whole list; prints ["0, 1, 2, 3, 4]"
print nums[:-1]    # Slice indices can be negative; prints ["0, 1, 2, 3]"
nums[2:4] = [8, 9] # Assign a new sublist to a slice
print nums         # Prints "[0, 1, 8, 9, 4]"

Comprehensions | 变换

Python 中同样可以使用 map, reduce, filter，其中 map 用于变换数组

# 使用 map 对数组中的每个元素计算平方
items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))

# map 支持函数以数组方式连接使用
def multiply(x):
    return (x*x)

def add(x):
    return (x+x)

funcs = [multiply, add]

for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)

reduce 用于进行归纳计算

# reduce 将数组中的值进行归纳
from functools import reduce
product = reduce((lambda x, y: x * y), [1, 2, 3, 4])

# Output: 24

filter 则可以对数组进行过滤

number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)

# Output: [-5, -4, -3, -2, -1]

字典类型

创建增删

d = {'cat': 'cute', 'dog': 'furry'}  # 创建新的字典
print d['cat']       # 字典不支持点(Dot)运算符取值

如果需要合并两个或者多个字典类型：

# python 3.5
z = {**x, **y}

# python 2.7
def merge_dicts(*dict_args):
    """
    Given any number of dicts, shallow copy and merge into a new dict,
    precedence goes to key value pairs in latter dicts.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

索引遍历

可以根据键来直接进行元素访问

# Python 中对于访问不存在的键会抛出 KeyError 异常，需要先行判断或者使用 get
print 'cat' in d     # Check if a dictionary has a given key; prints "True"

# 如果直接使用 [] 来取值，需要先确定键的存在，否则会抛出异常
print d['monkey']  # KeyError: 'monkey' not a key of d

# 使用 get 函数则可以设置默认值
print d.get('monkey', 'N/A')  # Get an element with a default; prints "N/A"
print d.get('fish', 'N/A')    # Get an element with a default; prints "wet"

d.keys() # 使用 keys 方法可以获取所有的键

Python 还提供了部分特殊的字典类型：

from collections import OrderedDict, Counter
# Remembers the order the keys are added!
x = OrderedDict(a=1, b=2, c=3)
# Counts the frequency of each character
y = Counter("Hello World!")

可以使用 for-in 来遍历数组

# 遍历键
for key in d:

# 比前一种方式慢
for k in dict.keys(): ...

# 直接遍历值
for value in dict.itervalues(): ...

# Python 2.x 中遍历键值
for key, value in d.iteritems():

# Python 3.x 中遍历键值
for key, value in d.items():

其他序列类型

Set

# Same as {"a", "b","c"}
normal_set = set(["a", "b","c"])

# Adding an element to normal set is fine
normal_set.add("d")

# 判断 Set 大小
len(s)

# 判断元素是否存在于 Set 中
x in s
x not in s

# 集合间运算符
s.issubset(t)
s.issuperset(t)
s.union(t)
s.intersection(t)
s.difference(t)
s.symmetric_difference(t)

# A frozen set
frozen_set = frozenset(["e", "f", "g"])

print("Frozen Set")
print(frozen_set)

# Uncommenting below line would cause error as
# we are trying to add element to a frozen set
# frozen_set.add("h")

Tuple

# 修改 Tuple 中的值
t = ('275', '54000', '0.0', '5000.0', '0.0')
lst = list(t)
lst[0] = '300'
t = tuple(lst)

Enum | 枚举类型

class Enum(set):
    def __getattr__(self, name):
        if name in self:
            return name
        raise AttributeError

函数

函数定义

Python 中的函数使用 def 关键字进行定义，譬如

def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'


for x in [-1, 0, 1]:
    print sign(x)
# Prints "negative", "zero", "positive"

Python 支持运行时创建动态函数，也即是所谓的 lambda 函数：

def f(x): return x**2

# 等价于
g = lambda x: x**2

lambda 表达式是 Python 函数式开发的重要基石，其方便实现 Partial Function 等模式；典型的 lambda 函数声明如下：

lambda x: x**2 + 2*x - 5

典型的用法譬如自定义的 filter 函数中：

mult3 = filter(lambda x: x % 3 == 0, [1, 2, 3, 4, 5, 6, 7, 8, 9])

def filterfunc(x):
    return x % 3 == 0
mult3 = filter(filterfunc, [1, 2, 3, 4, 5, 6, 7, 8, 9])

mult3 = [x for x in [1, 2, 3, 4, 5, 6, 7, 8, 9] if x % 3 == 0]

range(3,10,3)

参数

默认参数

Option Arguments | 不定参数

def example(a, b=None, *args, **kwargs):
  print a, b
  print args
  print kwargs

example(1, "var", 2, 3, word="hello")
# 1 var
# (2, 3)
# {'word': 'hello'}

a_tuple = (1, 2, 3, 4, 5)
a_dict = {"1":1, "2":2, "3":3}
example(1, "var", *a_tuple, **a_dict)
# 1 var
# (1, 2, 3, 4, 5)
# {'1': 1, '2': 2, '3': 3}

对于不定参数的调用，同样可以使用 ** 运算符：

func(**{'type':'Event'})

# 等价于
func(type='Event')

生成器

def simple_generator_function():
    yield 1
    yield 2
    yield 3

for value in simple_generator_function():
    print(value)

# 输出结果
# 1
# 2
# 3
our_generator = simple_generator_function()
next(our_generator)
# 1
next(our_generator)
# 2
next(our_generator)
#3

# 生成器典型的使用场景譬如无限数组的迭代
def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1

装饰器

装饰器是非常有用的设计模式

# 简单装饰器

from functools import wraps
def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('wrap function')
        return func(*args, **kwargs)
    return wrapper

@decorator
def example(*a, **kw):
    pass

example.__name__  # attr of function preserve
# 'example'
# Decorator

# 带输入值的装饰器

from functools import wraps
def decorator_with_argument(val):
  def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      print "Val is {0}".format(val)
      return func(*args, **kwargs)
    return wrapper
  return decorator

@decorator_with_argument(10)
def example():
  print "This is example function."

example()
# Val is 10
# This is example function.

# 等价于

def example():
  print "This is example function."

example = decorator_with_argument(10)(example)
example()
# Val is 10
# This is example function.

类与对象

类定义

Python 中对于类的定义也很直接

class Greeter(object):

    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print 'HELLO, %s!' % self.name.upper()
        else:
            print 'Hello, %s' % self.name

g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello, Fred"
g.greet(loud=True)   # Call an instance method; prints "HELLO, FRED!"

Managed Attributes | 受控属性

# property、setter、deleter 可以用于复写点方法

class Example(object):
    def __init__(self, value):
       self._val = value

    @property
    def val(self):
        return self._val

    @val.setter
    def val(self, value):
        if not isintance(value, int):
            raise TypeError("Expected int")
        self._val = value

    @val.deleter
    def val(self):
        del self._val

    @property
    def square3(self):
        return 2**3

ex = Example(123)
ex.val = "str"
# Traceback (most recent call last):
#   File "", line 1, in
#   File "test.py", line 12, in val
#     raise TypeError("Expected int")
# TypeError: Expected int

类方法与静态方法

class example(object):

  @classmethod
  def clsmethod(cls):
    print "I am classmethod"

  @staticmethod
  def stmethod():
    print "I am staticmethod"

  def instmethod(self):
    print "I am instancemethod"

ex = example()
ex.clsmethod()
# I am classmethod
ex.stmethod()
# I am staticmethod
ex.instmethod()
# I am instancemethod
example.clsmethod()
# I am classmethod
example.stmethod()
# I am staticmethod
example.instmethod()
# Traceback (most recent call last):
#   File "", line 1, in
# TypeError: unbound method instmethod() ...

如果我们希望直接比较两个对象的，则需要覆写该类的比较相关方法：


class Thing:
    def __init__(self, value):
        self.__value = value
    def __gt__(self, other):
        return self.__value > other.__value
    def __lt__(self, other):
        return self.__value < other.__value
something = Thing(100)
nothing = Thing(0)
# True
something > nothing
# False
something < nothing
# Error
something + nothing

抽象类

Python 中支持定义抽象类，抽象类不可被直接实例化；抽象类必须包含一或多个抽象方法：

from abc import ABC, abstractmethod

class AbstractClassExample(ABC):

    def __init__(self, value):
        self.value = value
        super().__init__()

    @abstractmethod
    def do_something(self):
        pass

在 Python 3 中，也可以通过继承元类的方式来实现:

from abc import ABCMeta

class MyABC(metaclass=ABCMeta):
    pass

MetaClass | 元类

Python 中所谓的

类继承

# 父类必须继承 object 或者其他父类
class BaseClass(object):
    def __init__(self, *args, **kwargs):
        pass

class ChildClass(BaseClass):
    def __init__(self, *args, **kwargs):
        # 调用父类构造函数
        super(ChildClass, self).__init__(*args, **kwargs)

class Car(object):
    condition = "new"

    def __init__(self, model, color, mpg):
        self.model = model
        self.color = color
        self.mpg   = mpg

class ElectricCar(Car):
    def __init__(self, battery_type, model, color, mpg):
        self.battery_type=battery_type
        super(ElectricCar, self).__init__(model, color, mpg)

car = ElectricCar('battery', 'ford', 'golden', 10)
print car.__dict__

对象

实例化

class Foo(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def bar(self):
        pass

i = Foo(2, 3)

Python 中与类实例化相关的方法有 __new__ 与 __init__，__new__ 会在对象创建时候调用，覆写该方法能够自定义实例的创建过程；而 __init__ 方法则是在实例创建完毕后初始化实例:

class Foo(object):
    def __new__(cls, *args, **kwargs):
        print "Creating Instance"
        instance = super(Foo, cls).__new__(cls, *args, **kwargs)
        # 或者
        # object.__new__(cls, *args, **kwargs)
        return instance
    ...

>>> i = Foo(2, 3)
Creating Instance

属性操作

Python 中对象的属性不同于字典键，可以使用点运算符取值，直接使用 in 判断会存在问题

class A(object):
    @property
    def prop(self):
        return 3

a = A()
print "'prop' in a.__dict__ =", 'prop' in a.__dict__
print "hasattr(a, 'prop') =", hasattr(a, 'prop')
print "a.prop =", a.prop

# 'prop' in a.__dict__ = False
# hasattr(a, 'prop') = True
# a.prop = 3

建议使用 hasattr, getattr, setattr 这种方式对于对象属性进行操作

class Example(object):
  def __init__(self):
    self.name = "ex"
  def printex(self):
    print "This is an example"


# Check object has attributes
# hasattr(obj, 'attr')
ex = Example()
hasattr(ex,"name")
# True
hasattr(ex,"printex")
# True
hasattr(ex,"print")
# False

# Get object attribute
# getattr(obj, 'attr')
getattr(ex,'name')
# 'ex'

# Set object attribute
# setattr(obj, 'attr', value)
setattr(ex,'name','example')
ex.name
# 'example'

单例模式

我们可以通过覆写 __new__ 方法或者创建特殊的 MetaClass 来实现单例模式:

class Singleton(object):
    _instance = None  # Keep instance reference

    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = object.__new__(cls, *args, **kwargs)
        return cls._instance

# 指定数目的单例
class LimitedInstances(object):
    _instances = []  # Keep track of instance reference
    limit = 5

    def __new__(cls, *args, **kwargs):
        if not len(cls._instances) <= cls.limit:
            raise RuntimeError, "Count not create instance. Limit %s reached" % cls.limit
        instance = object.__new__(cls, *args, **kwargs)
        cls._instances.append(instance)
        return instance

    def __del__(self):
        # Remove instance from _instances
        self._instance.remove(self)

也可以通过继承元类的方式来实现:

class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton,                                      cls).__call__(*args, **kwargs)
        return cls._instances[cls]


class SingletonClass(metaclass=Singleton):
    pass
class RegularClass():
    pass
x = SingletonClass()
y = SingletonClass()
print(x == y)
x = RegularClass()
y = RegularClass()
print(x == y)

异常与测试

异常处理

try

import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except OSError as err:
    print("OS error: {0}".format(err))
except ValueError:
    print("Could not convert data to an integer.")
except:
    print("Unexpected error:", sys.exc_info()[0])
    raise

class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        print("D")
    except C:
        print("C")
    except B:
        print("B")

Context Manager - with

with 常用于打开或者关闭某些资源

host = 'localhost'
port = 5566
with Socket(host, port) as s:
    while True:
        conn, addr = s.accept()
        msg = conn.recv(1024)
        print msg
        conn.send(msg)
        conn.close()

单元测试

from __future__ import print_function

import unittest

def fib(n):
    return 1 if n<=2 else fib(n-1)+fib(n-2)

def setUpModule():
        print("setup module")
def tearDownModule():
        print("teardown module")

class TestFib(unittest.TestCase):

    def setUp(self):
        print("setUp")
        self.n = 10
    def tearDown(self):
        print("tearDown")
        del self.n
    @classmethod
    def setUpClass(cls):
        print("setUpClass")
    @classmethod
    def tearDownClass(cls):
        print("tearDownClass")
    def test_fib_assert_equal(self):
        self.assertEqual(fib(self.n), 55)
    def test_fib_assert_true(self):
        self.assertTrue(fib(self.n) == 55)

if __name__ == "__main__":
    unittest.main()

存储

文件读写

路径处理

Python 内置的 __file__ 关键字会指向当前文件的相对路径，可以根据它来构造绝对路径，或者索引其他文件

# 获取当前文件的相对目录
dir = os.path.dirname(__file__) # src\app

## once you're at the directory level you want, with the desired directory as the final path node:
dirname1 = os.path.basename(dir)
dirname2 = os.path.split(dir)[1] ## if you look at the documentation, this is exactly what os.path.basename does.

# 获取当前代码文件的绝对路径，abspath 会自动根据相对路径与当前工作空间进行路径补全
os.path.abspath(os.path.dirname(__file__)) # D:\WorkSpace\OWS\tool\ui-tool-svn\python\src\app

# 获取当前文件的真实路径
os.path.dirname(os.path.realpath(__file__)) # D:\WorkSpace\OWS\tool\ui-tool-svn\python\src\app

# 获取当前执行路径
os.getcwd()

可以使用 listdir、walk、glob 模块来进行文件枚举与检索：

# 仅列举所有的文件
from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

# 使用 walk 递归搜索
from os import walk

f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break

# 使用 glob 进行复杂模式匹配
import glob
print(glob.glob("/home/adam/*.txt"))
# ['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]

简单文件读写

# 可以根据文件是否存在选择写入模式
mode = 'a' if os.path.exists(writepath) else 'w'

# 使用 with 方法能够自动处理异常
with open("file.dat",mode) as f:
    f.write(...)
    ...
    # 操作完毕之后记得关闭文件
    f.close()

# 读取文件内容
message = f.read()

复杂格式文件

JSON

import json

# Writing JSON data
with open('data.json', 'w') as f:
     json.dump(data, f)

# Reading data back
with open('data.json', 'r') as f:
     data = json.load(f)

XML

我们可以使用 lxml 来解析与处理 XML 文件，本部分即对其常用操作进行介绍。lxml 支持从字符串或者文件中创建 Element 对象：

from lxml import etree

# 可以从字符串开始构造
xml = '<a xmlns="test"><b xmlns="test"/></a>'
root = etree.fromstring(xml)
etree.tostring(root)
# b'<a xmlns="test"><b xmlns="test"/></a>'

# 也可以从某个文件开始构造
tree = etree.parse("doc/test.xml")

# 或者指定某个 baseURL
root = etree.fromstring(xml, base_url="http://where.it/is/from.xml")

其提供了迭代器以对所有元素进行遍历：

# 遍历所有的节点
for tag in tree.iter():
    if not len(tag):
        print tag.keys() # 获取所有自定义属性
        print (tag.tag, tag.text) # text 即文本子元素值

# 获取 XPath
for e in root.iter():
    print tree.getpath(e)

lxml 支持以 XPath 查找元素，不过需要注意的是，XPath 查找的结果是数组，并且在包含命名空间的情况下，需要指定命名空间：

root.xpath('//page/text/text()',ns={prefix:url})

# 可以使用 getparent 递归查找父元素
el.getparent()

lxml 提供了 insert、append 等方法进行元素操作：

# append 方法默认追加到尾部
st = etree.Element("state", name="New Mexico")
co = etree.Element("county", name="Socorro")
st.append(co)

# insert 方法可以指定位置
node.insert(0, newKid)

Excel

可以使用 xlrd 来读取 Excel 文件，使用 xlsxwriter 来写入与操作 Excel 文件。

# 读取某个 Cell 的原始值
sh.cell(rx, col).value

# 创建新的文件
workbook = xlsxwriter.Workbook(outputFile)
worksheet = workbook.add_worksheet()

# 设置从第 0 行开始写入
row = 0

# 遍历二维数组，并且将其写入到 Excel 中
for rowData in array:
    for col, data in enumerate(rowData):
        worksheet.write(row, col, data)
    row = row + 1

workbook.close()

文件系统

对于高级的文件操作，我们可以使用 Python 内置的 shutil

# 递归删除 appName 下面的所有的文件夹
shutil.rmtree(appName)

网络交互

Requests

Requests 是优雅而易用的 Python 网络请求库

import requests

r = requests.get('https://api.github.com/events')
r = requests.get('https://api.github.com/user', auth=('user', 'pass'))

r.status_code
# 200
r.headers['content-type']
# 'application/json; charset=utf8'
r.encoding
# 'utf-8'
r.text
# u'{"type":"User"...'
r.json()
# {u'private_gists': 419, u'total_private_repos': 77, ...}

r = requests.put('http://httpbin.org/put', data = {'key':'value'})
r = requests.delete('http://httpbin.org/delete')
r = requests.head('http://httpbin.org/get')
r = requests.options('http://httpbin.org/get')

数据存储

MySQL

import pymysql.cursors

# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='user',
                             password='passwd',
                             db='db',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)

try:
    with connection.cursor() as cursor:
        # Create a new record
        sql = "INSERT INTO `users` (`email`, `password`) VALUES (%s, %s)"
        cursor.execute(sql, ('webmaster@python.org', 'very-secret'))

    # connection is not autocommit by default. So you must commit to save
    # your changes.
    connection.commit()

    with connection.cursor() as cursor:
        # Read a single record
        sql = "SELECT `id`, `password` FROM `users` WHERE `email`=%s"
        cursor.execute(sql, ('webmaster@python.org',))
        result = cursor.fetchone()
        print(result)
finally:
    connection.close()

并发编程

concurrent.futures 是标准库的一部分，自 3.2 版本之后引入；对于老版本则需要手动引入 futures 依赖库。我们可以使用 ProcessPoolExecutor 来处理 CPU 密集型任务，使用 ThreadPoolExecutor 来处理网络操作或者 IO 操作。ProcessPoolExecutor 内部使用了 multiprocessing 模块，其不会受到 GIL(Global Intercept Lock)的干扰。

from concurrent.futures import ThreadPoolExecutor
import time

import requests

def fetch(a):
    url = 'http://httpbin.org/get?a={0}'.format(a)
    r = requests.get(url)
    result = r.json()['args']
    return result

start = time.time()

# if max_workers is None or not given, it will default to the number of processors, multiplied by 5
with ThreadPoolExecutor(max_workers=None) as executor:
    for result in executor.map(fetch, range(30)):
        print('response: {0}'.format(result))

print('time: {0}'.format(time.time() - start))

executor.submit() 则可以返回 Future 对象，所谓的 Future 即是包含了某个异步执行函数，并且会在未来执行完毕或者抛出异常的对象。而 as_completed 函数与上文使用的 map 函数的区别在于，map 会按照传入迭代器的顺序返回结果，而 as_completed 则会返回首页执行完毕的 Future 对象：

from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

def fetch(url, timeout):
    r = requests.get(url, timeout=timeout)
    data = r.json()['args']
    return data

with ThreadPoolExecutor(max_workers=10) as executor:
    futures = {}
    for i in range(42):
        url = 'https://httpbin.org/get?i={0}'.format(i)
        future = executor.submit(fetch, url, 60)
        futures[future] = url

    for future in as_completed(futures):
        url = futures[future]
        try:
            data = future.result()
        except Exception as exc:
            print(exc)
        else:
            print('fetch {0}, get {1}'.format(url, data))

编程规范

def fetch_bigtable_rows(big_table, keys, other_silly_variable=None):
    """Fetches rows from a Bigtable.

    Retrieves rows pertaining to the given keys from the Table instance
    represented by big_table.  Silly things may happen if
    other_silly_variable is not None.

    Args:
        big_table: An open Bigtable Table instance.
        keys: A sequence of strings representing the key of each table row
            to fetch.
        other_silly_variable: Another optional variable, that has a much
            longer name than the other args, and which does nothing.

    Returns:
        A dict mapping keys to the corresponding table row data
        fetched. Each row is represented as a tuple of strings. For
        example:

        {'Serak': ('Rigel VII', 'Preparer'),
         'Zim': ('Irk', 'Invader'),
         'Lrrr': ('Omicron Persei 8', 'Emperor')}

        If a key from the keys argument is missing from the dictionary,
        then that row was not found in the table.

    Raises:
        IOError: An error occurred accessing the bigtable.Table object.
    """
    pass

所谓"内部(Internal)“表示仅模块内可用, 或者, 在类内是保护或私有的。用单下划线(_)开头表示模块变量或函数是 protected 的(使用 import * from 时不会包含)。用双下划线(__)开头的实例变量或方法表示类内私有。将相关的类和顶级函数放在同一个模块里. 不像 Java, 没必要限制一个类一个模块。对类名使用大写字母开头的单词(如 CapWords, 即 Pascal 风格), 但是模块名应该用小写加下划线的方式(如 lower_with_under.py). 尽管已经有很多现存的模块使用类似于 CapWords.py 这样的命名, 但现在已经不鼓励这样做, 因为如果模块名碰巧和类名一致, 这会让人困扰.

Type	Public	Internal
Modules	lower_with_under	`_lower_with_under`
Packages	lower_with_under
Classes	CapWords	`_CapWords`
Exceptions	CapWords
Functions	lower_with_under()	_lower_with_under()
Global/Class Constants	CAPS_WITH_UNDER	`_CAPS_WITH_UNDER`
Global/Class Variables	lower_with_under	`_lower_with_under`
Instance Variables	lower_with_under	_lower_with_under (protected) or __lower_with_under (private)
Method Names	lower_with_under()	_lower_with_under() (protected) or __lower_with_under() (private)
Function/Method Parameters	lower_with_under
Local Variables	lower_with_under