Assigning to the current loop variable in a Python for statement?

Question

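I have a two-dimensional list whose elements are lists of strings. Because these strings represent numbers, I want to convert them all to float. {code...} After running the program above, dataset is unchanged. Why can't i = float(i) change the value of i? Although...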

迷茫 · Answer

A variable is just a pointer to an object

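Understand that first, then read on.

First program: during the traversal, a name i is created and points to an element of data. When you execute i = float(i), you just create a new object float(i) and make i point to it, nothing more. The element in data is untouched.

The second program follows the same principle: dataset[i] = ... makes the list slot itself point to the new object, which is why the list changes.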
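
A minimal sketch of the point above, using a hypothetical sample list named row (not from the original question). It only illustrates that rebinding the loop name leaves the list alone:

```python
row = ["1.5", "2.0", "3.25"]  # hypothetical sample data

for i in row:
    original = i            # the string object that i currently names
    i = float(i)            # builds a new float and rebinds the name i to it
    assert original in row  # the string is still sitting in the list

print(row)  # ['1.5', '2.0', '3.25']
print(i)    # 3.25
```

The list still holds strings afterwards; only the name i ended up pointing at a float.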

ringa_lee · Answer

To be precise, i = float(i) does change the value of i, but i only holds a copy of the reference to the element in data, so data itself is not changed. It is like:

>>> a = ["ab", "cd"]
>>> i = a[0]
>>> i = 123
>>> a
['ab', 'cd']

The second program works because you operate directly on the original list:

>>> a[0] = 123
>>> a
[123, 'cd']

PHPz · Answer

In Python, when you traverse an object with for, you should not modify the traversed object itself. This is a general rule.
Generally speaking, each next element is obtained by calling next() on the iterator.
If the traversed object is modified during iteration, the order of its elements can be affected, making the result of each next() step unpredictable.
Consider using enumerate to modify it.

for i, v in enumerate(data):
    data[i] = float(v)
Here the loop iterates over enumerate(data) rather than data directly, and the assignment data[i] = float(v) replaces each element through its index, so the list is updated.
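
For the two-dimensional list in the question, the same idea can be applied on both levels. This is only a sketch; the sample rows are made up for illustration:

```python
dataset = [["1", "2.5"], ["3", "4.75"]]  # hypothetical sample rows

for r, row in enumerate(dataset):
    for c, value in enumerate(row):
        # Assign through the indexes, not through the loop variable.
        dataset[r][c] = float(value)

print(dataset)  # [[1.0, 2.5], [3.0, 4.75]]
```

Replacing elements one for one does not change the length of any list, so the iteration order is not disturbed.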

For additional information, please refer to the description of the for statement in the official Python manual:
https://docs.python.org/2/reference/compound_stmts.html#the-for-statement

伊谢尔伦 · Answer

When traversing with for in Python, it is usually not recommended (though not impossible) to modify the traversed object itself, because that causes problems like the following:

n = [1, 2, 3]
for i in n:
    n.append(i)
    print(i)

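The program above enters an infinite loop: the length of n grows on every iteration, so the for loop can never exhaust it. That is why we usually traverse a copy of the object instead: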

n = [1, 2, 3]
for i in n[:]:
    n.append(i)
    print(i)

This way n can be modified and the traversal still completes.
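
A small check of the copy-based loop, as a sketch: the slice n[:] is evaluated once before the loop starts, so the loop runs over the three original items only.

```python
n = [1, 2, 3]

for i in n[:]:   # n[:] is a shallow copy made once, before the first iteration
    n.append(i)  # grows n, but not the copy being traversed

print(n)  # [1, 2, 3, 1, 2, 3]
```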

Now back to the first program in your example:

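import csv

def loadCsv(filename):
    # Read the data (text mode with newline='', as the csv module expects in Python 3)
    with open(filename, newline='') as f:
        # Store it in dataset
        dataset = list(csv.reader(f))
    for data in dataset:
        for i in data:
            i = float(i)  # the i before and after this assignment refers to different objects
    return dataset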

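In fact, the two occurrences of i in i = float(i) do not refer to the same object. The i on the right is evaluated while the name still refers to an element of data; the i on the left is a local variable in the scope of loadCsv, which the assignment rebinds to a new float. This touches on an arguably unreasonable part of Python's language design. Consider this program: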

for i in range(3):
    pass

print(i)
# 2

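In other words, the identifier i used in the iteration still exists after the for loop exits, and it stays bound to the last value of the iteration. This affects any global variable with the same name, and errors like this come up often: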

i = 7

for i in range(3):
    pass

print(i)
# 2

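After a single for loop, the value of the global variable i has changed for no obvious reason. The cause is that i is not the object itself but an identifier for an object. In Python an identifier is not an attribute of an object but an entry in a namespace, and that entry can be reused.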
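
One way to keep a module-level name safe, sketched here with a hypothetical helper count_to_three, is to put the loop inside a function. The loop target then lives in the function's local namespace:

```python
i = 7  # module-level name

def count_to_three():
    # Inside a function the loop target is a local name,
    # so the module-level i is not rebound.
    for i in range(3):
        pass
    return i

last = count_to_three()
print(last)  # 2
print(i)     # 7
```

Choosing a different name for the loop variable has the same effect at module level.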

So when the identifier i is also assigned inside the for loop, the situation changes again:

for i in range(5):
    i = 3
    
print(i)
# 3

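Here the value of i is exactly what was assigned to it inside the for loop. Assignment in Python is the operation of binding a value object to an identifier. In each iteration, including the last one, the value 3 is bound to the identifier i, so i ends up bound to the new object. Your first program is the same case: i = float(i) binds the value object float(i) to the identifier i, so after the assignment i is no longer the element object of data at all, and dataset is not changed.
And your second program: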

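import csv

def loadCsv(filename):
    # Read the data (text mode with newline='', as the csv module expects in Python 3)
    with open(filename, newline='') as f:
        # Store it in dataset
        dataset = list(csv.reader(f))
    for i in range(len(dataset)):
        # Assign through the index so the list slot itself is replaced
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset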

The i inside the for loop is only an index. What you modify is not i but the object bound to dataset[i], and dataset[i] is an element of dataset, so the data can be changed.
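
As a hedged alternative, the conversion can also be written as a nested list comprehension that builds a new list instead of assigning in place. The helper name load_rows and the in-memory sample are assumptions made for this sketch; csv.reader accepts any iterable of text lines, so io.StringIO stands in for an opened file:

```python
import csv
import io

def load_rows(lines):
    """Parse CSV text lines and convert every field to float."""
    return [[float(x) for x in row] for row in csv.reader(lines)]

sample = io.StringIO("1,2.5\n3,4.75\n")  # stands in for an opened CSV file
print(load_rows(sample))  # [[1.0, 2.5], [3.0, 4.75]]
```

Because a new list is returned, no loop variable ever needs to be assigned to.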

For more on identifiers and namespaces, you can read this article: Python namespaces