#913 update storage

Merged
hanjr merged 3 commits from erpim_0409 into master 2 weeks ago
Erpim commented 3 weeks ago
Erpim reviewed 3 weeks ago
@@ -65,8 +65,6 @@ def is_floating_point(obj):
return obj.is_floating_point()


strided = False
Erpim commented 3 weeks ago
Move this definition to tensor.py.
Erpim changed title from [WIP]add data_ptr to add data_ptr 3 weeks ago
Erpim changed title from add data_ptr to [WIP]add data_ptr 3 weeks ago
Erpim commented 3 weeks ago
Adapts safetensors. Once this PR is merged, the features under the safetensors.torch directory will be fully supported, with no further adaptation changes required.
Erpim changed title from [WIP]add data_ptr to [WIP]update storage 3 weeks ago
Erpim reviewed 3 weeks ago
mindtorch/torch/common/dtype.py
@@ -207,3 +210,2 @@
else:
warning("If you want to obtain `smallest_normal` in finfo, " \
"NumPy version must be greater or equal 1.23.0.")
warning_only_once("If you want to obtain `smallest_normal` in finfo, " \
Erpim commented 3 weeks ago
The original warning flooded the screen with repeated prints, which made for a poor user experience; it is now printed only once per run.
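A minimal sketch of a print-once warning built on `functools.lru_cache`, matching the `@lru_cache(...)`-decorated `warning_only_once` visible in the logging.py diff below (the cache-size constant and logger name here are illustrative, not the actual mindtorch values):

```python
from functools import lru_cache
import logging

_GLOBAL_LRU_CACHE_SIZE = 1024  # illustrative cache bound

@lru_cache(_GLOBAL_LRU_CACHE_SIZE)
def warning_only_once(msg):
    # lru_cache memoizes on the message string, so the underlying
    # logger fires only the first time each distinct message appears.
    logging.getLogger("mindtorch").warning(msg)

warning_only_once("smallest_normal requires NumPy >= 1.23.0")
warning_only_once("smallest_normal requires NumPy >= 1.23.0")  # suppressed by the cache
```

Any later call with the same message string is a cache hit and never reaches the logger, which is what stops the screen-flooding described above.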
Erpim changed title from [WIP]update storage to update storage 3 weeks ago
zoulq reviewed 3 weeks ago
@@ -1478,2 +1479,3 @@
<span id="jump13">Unified Storage constraints:</span>
- The current Storage-related objects do not implement real storage-area control; only basic features such as size, nbytes, len, element_size, tolist, type, and __getitem__ are aligned with torch.
- Due to differences in the underlying storage mechanism, features that control the storage area, such as share_memory_ and pin_memory, cannot be supported; scenarios where two Tensors share the same Storage object are not supported for now.
- The current implementation is based on NumPy, so performance has considerable room for optimization; frequent calls are not recommended.
zoulq commented 3 weeks ago
Add a typical unsupported code usage example to the FAQ.
Erpim commented 2 weeks ago
done
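For illustration, the unsupported aliasing pattern can be sketched with a hypothetical numpy-backed `MockStorage` (illustrative names, not the actual mindtorch class): in PyTorch, two tensors can view one shared storage, but with an independent NumPy buffer per storage, writes do not propagate.

```python
import numpy as np

# Hypothetical stand-in for the numpy-backed Storage described above.
class MockStorage:
    def __init__(self, data):
        self.inner_data = np.asarray(data, dtype=np.float32)
    def tolist(self):
        return self.inner_data.tolist()

# In PyTorch, aliasing one storage from two tensors makes writes through
# one view visible through the other. With independent NumPy buffers the
# same pattern silently diverges instead:
a = MockStorage([1.0, 2.0, 3.0])
b = MockStorage(a.tolist())       # an independent copy, not a shared buffer
b.inner_data[0] = 99.0
assert a.inner_data[0] == 1.0     # unchanged: there is no shared storage
```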
zoulq reviewed 3 weeks ago
mindtorch/torch/logging.py
@@ -75,6 +77,11 @@ def warning(msg, *args, **kwargs):
_get_logger().warning(msg, *args, **kwargs)


@lru_cache(_GLOBAL_LRU_CACHE_SIZE_NN)
zoulq commented 3 weeks ago
Is this syntax supported by graph-mode compilation?
Erpim commented 2 weeks ago
Verified locally: it passes under strict graph mode and prints normally.
zoulq reviewed 3 weeks ago
mindtorch/torch/common/dtype.py
@@ -7,3 +8,3 @@
from mindspore.ops.primitive import _primexpr
from mindspore._c_expression import typing
from mindtorch.torch.logging import warning
from mindtorch.torch.logging import warning_only_once
zoulq commented 3 weeks ago
Are there scenarios that need the warning printed multiple times?
Erpim commented 2 weeks ago
The warning is issued inside the interface, so it prints on every call; some interfaces are called very frequently, which fills the whole screen with warnings. With this new interface, a warning prints only the first time it appears, and subsequent identical warnings are suppressed.
Erpim commented 2 weeks ago
Uniformly changed to print only once.
zoulq reviewed 3 weeks ago
mindtorch/torch/storage.py
@@ -54,3 +61,1 @@
return self.inner_data.flatten()[idx].item()
else:
return self.inner_data.flatten()[idx]
return self.inner_data[idx]
zoulq commented 3 weeks ago
The original code supported two cases: returning a single element and returning a slice of the data. Can this new expression cover both?
Erpim commented 2 weeks ago
It calls NumPy indexing directly: a single-element index returns a scalar, while a slice returns an ndarray. torch's corresponding interface does not return a tensor type either. An adaptation has been made so that a slice returns a Storage object.
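The semantics being relied on can be seen with plain NumPy indexing (a sketch of the behavior, not the mindtorch code itself):

```python
import numpy as np

data = np.arange(6, dtype=np.float32)

single = data.flatten()[2]    # integer index -> a NumPy scalar
chunk = data.flatten()[1:4]   # slice -> an ndarray (re-wrapped as a Storage upstream)

assert np.isscalar(single.item())      # .item() yields a plain Python float
assert isinstance(chunk, np.ndarray)
assert chunk.tolist() == [1.0, 2.0, 3.0]
```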
zoulq reviewed 3 weeks ago
@@ -1188,0 +1213,4 @@
target_shape = ori_shape[:-1] + target_shape
out = np.frombuffer(self.numpy(), _TypeDict.get(dtype, np.float32))
if dtype == mindtorch_dtype.bfloat16:
return tensor(out.astype(np.float32), dtype=dtype).reshape(target_shape)
zoulq commented 3 weeks ago
Why has this logic become more complex?
Erpim commented 2 weeks ago
Passing a dtype to view is not supposed to perform an astype; the previous wrapper got this wrong. It should take the underlying bytes of the data and reinterpret them as the given dtype.
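The distinction described here, reinterpreting bytes versus casting values, can be sketched with NumPy alone:

```python
import numpy as np

buf = np.array([1.0], dtype=np.float32).tobytes()  # the 4 raw bytes of 1.0f

# view(dtype)-style: read the same bytes as another dtype
reinterpreted = np.frombuffer(buf, dtype=np.int32)

# astype-style: convert the numeric value (what the old wrapper wrongly did)
converted = np.frombuffer(buf, dtype=np.float32).astype(np.int32)

assert reinterpreted[0] == 0x3F800000  # IEEE-754 bit pattern of 1.0f
assert converted[0] == 1               # numeric cast
```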
hanjr reviewed 3 weeks ago
mindtorch/torch/storage.py
@@ -123,0 +155,4 @@
else:
append_data = np.random.randint(0, 255, size=size - self.size(), dtype=np.uint8)
self.inner_data = np.concatenate((self.inner_data, append_data), axis=0)
if self.referenced_tensor is not None:
hanjr commented 3 weeks ago
Why can't _update_referenced_tensor be used here to update the tensor's value?
Erpim commented 2 weeks ago
If resize_ shrinks the storage, printing the tensor raises an error in torch as well. If resize_ grows it, the tensor still prints with its original data shape, so there is no need to update the tensor.
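A standalone sketch of the growth path shown in the diff above: new uninitialized (random) bytes are appended to the raw uint8 buffer while the original contents stay in place (illustrative, extracted from its storage.py context):

```python
import numpy as np

inner_data = np.zeros(4, dtype=np.uint8)   # existing storage bytes
new_size = 8

if new_size > inner_data.size:
    # grow with uninitialized (random) bytes, as the diff does
    append_data = np.random.randint(0, 255, size=new_size - inner_data.size,
                                    dtype=np.uint8)
    inner_data = np.concatenate((inner_data, append_data), axis=0)

assert inner_data.size == 8
assert (inner_data[:4] == 0).all()  # the original bytes are preserved
```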
hanjr reviewed 3 weeks ago
mindtorch/torch/storage.py
@@ -493,0 +554,4 @@
elif not isinstance(idx, int):
raise RuntimeError(f"can't index a {type(self)} with {type(idx)}")
idx_wrapped = self._maybe_wrap_index(idx)
tmp_np_data = np.frombuffer(self._storage.inner_data, _TypeDict.get(self.dtype, np.float32))
hanjr commented 3 weeks ago
In __getitem__ and __setitem__, why does frombuffer fall back to np.float32 when the key is missing from the dict, instead of keeping uint8?
Erpim commented 2 weeks ago
TypedStorage's dtype is kept consistent with the tensor's default dtype, hence fp32; UntypedStorage's dtype is uint8. That said, this code path should never actually fall through to the default value.
hanjr reviewed 3 weeks ago
mindtorch/torch/storage.py
@@ -621,2 +687,3 @@
raise TypeError(f"Argument 'dtype' must be torch.dtype, not {type(dtype)}")
storage = self._storage.inner_data.to(dtype).storage()
from mindtorch.torch.tensor import tensor # pylint: disable=R0401, C0415
np_data = np.frombuffer(self._storage.inner_data, _TypeDict.get(self.dtype))
hanjr commented 3 weeks ago
Are there data formats missing from _TypeDict?
Erpim commented 2 weeks ago
Added bool and complex types; no other missing formats were identified.
hanjr merged commit ab16c042a6 into master 2 weeks ago