MindSpore Pandas uses a distributed computing engine to accelerate pandas operations and integrates seamlessly with existing pandas code. Because it can use all CPU cores on the computer, MindSpore Pandas works especially well on larger datasets.
Native pandas is single-threaded, so it can use only one CPU core at a time.
MindSpore Pandas is built on a distributed implementation, so it can use multiple threads and cores on one machine, or all cores of an entire cluster.
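The contrast above follows a split, compute, combine pattern, which can be sketched with the Python standard library alone. This is only an illustration and not how MindSpore Pandas is implemented: the helper names `column_sums` and `partitioned_column_sums` are hypothetical, and the thread pool merely stands in for a distributed engine.

```python
from concurrent.futures import ThreadPoolExecutor


def column_sums(rows):
    """Sum each column of a list of equal-length rows."""
    return [sum(col) for col in zip(*rows)]


def partitioned_column_sums(rows, num_partitions=2):
    """Split rows into partitions, sum each in a worker thread, then combine."""
    size = max(1, -(-len(rows) // num_partitions))  # ceiling division
    partitions = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial = list(pool.map(column_sums, partitions))
    # Combining the per-partition results is itself a column sum.
    return column_sums(partial)


data = [[1, 2, 3], [4, 5, 6]]
print(partitioned_column_sums(data))  # [5, 7, 9]
```

Pure-Python threads will not speed up this arithmetic; the sketch only shows the structure that lets a real engine spread work across cores.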
For the detailed architecture design, please refer to the documentation on the official website.
The following table lists the environment required for installing, compiling and running MindSpore Pandas:
software | version |
---|---|
Linux-x86_64 | Ubuntu >=18.04, Euler >=2.9 |
Python | 3.8-3.9 |
glibc | >=2.25 |
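The Python and glibc rows of the table can be checked before installing. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `meets_requirements` are hypothetical, and the operating system row is not checked.

```python
import platform
import sys


def parse_version(text):
    """Turn '2.31' into (2, 31); non-numeric parts are ignored."""
    return tuple(int(p) for p in text.split(".") if p.isdigit())


def meets_requirements(python_version, libc_name, libc_version):
    """Check the Python (3.8-3.9) and glibc (>=2.25) rows of the table."""
    python_ok = (3, 8) <= tuple(python_version[:2]) <= (3, 9)
    glibc_ok = libc_name == "glibc" and parse_version(libc_version) >= (2, 25)
    return python_ok and glibc_ok


# platform.libc_ver() returns empty strings when it cannot detect the C library.
libc_name, libc_version = platform.libc_ver()
print("Python:", platform.python_version())
print("libc:", libc_name or "unknown", libc_version or "unknown")
print("Meets requirements:",
      meets_requirements(sys.version_info, libc_name, libc_version))
```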
To install with pip, download the whl package from the MindSpore Pandas page and install it.
When a network connection is available, installing the whl package automatically downloads the MindSpore Pandas dependencies (listed in requirements.txt). Any other dependencies must be installed manually.
Download the source code, then enter the mindpandas directory and run the build.sh script.
git clone https://gitee.com/mindspore/mindpandas.git
cd mindpandas
bash build.sh
After the build completes, the package is in the output directory and can be installed with pip.
pip install output/mindpandas-0.1.0-cp38-cp38-linux_x86_64.whl
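The wheel filename in the command above encodes which interpreter the package was built for. As a sketch based on the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag), the hypothetical helper `parse_wheel_filename` below splits it into its fields. Here `cp38` means CPython 3.8, so a build made under a different Python version will likely produce a different filename, and the command should name the file actually found in the output directory.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, _build, python_tag, abi_tag, platform_tag = parts
    elif len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename("mindpandas-0.1.0-cp38-cp38-linux_x86_64.whl")
for field, value in info.items():
    print(field, "=", value)
```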
Run the following command in a shell. If the error No module named 'mindpandas' is not reported, the installation was successful.
python -c "import mindpandas"
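The same check can be made from a script without actually importing the package. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("mindpandas"):
    print("mindpandas is installed")
else:
    print("mindpandas was not found in this Python environment")
```

Finding the module only confirms that it is on the import path. Running the import itself, as in the command above, remains the stronger check because it also loads the package's dependencies.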
First import MindSpore Pandas with the following command.
import mindpandas as pd
Set the running mode of MindSpore Pandas with the following command, which can speed up your MindSpore Pandas workflow.
pd.set_concurrency_mode('multithread')
The complete example is as follows:
>>> import mindpandas as pd
>>> pd.set_concurrency_mode('multithread')
>>> pd.set_partition_shape((16, 2))
>>> pd_df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
>>> col_sum = pd_df.sum()
>>> print(col_sum)
0    5
1    7
2    9
Name: sum, dtype: int64
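The tuple passed to `set_partition_shape` is not explained in this README. Assuming it means (row partitions, column partitions), which should be checked against the API documentation, the following standard-library sketch shows how such a grid could slice a table into blocks. The helpers `split_bounds` and `grid_blocks` are hypothetical and are not part of MindSpore Pandas. Capping the number of chunks at the number of rows or columns is also an assumption of this sketch.

```python
def split_bounds(length, parts):
    """Return (start, stop) pairs cutting range(length) into at most `parts` chunks."""
    parts = max(1, min(parts, length))  # cap the chunk count at the item count
    base, extra = divmod(length, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def grid_blocks(num_rows, num_cols, shape):
    """List the (row range, column range) of every block in the grid."""
    row_parts, col_parts = shape
    return [(rows, cols)
            for rows in split_bounds(num_rows, row_parts)
            for cols in split_bounds(num_cols, col_parts)]


# The 2 x 3 DataFrame from the example, with the requested shape (16, 2).
for rows, cols in grid_blocks(2, 3, (16, 2)):
    print("rows", rows, "cols", cols)
```

Under these assumptions a 2 x 3 table yields only four blocks, which suggests that a large partition shape pays off mainly on large datasets.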
For more details on installation, tutorials, and APIs, please see the User Documentation.
Contributions are welcome. See our Contributor Wiki for more details.
For release notes, see our RELEASE.
MindSpore Pandas is a data analysis framework, which is compatible with Pandas interfaces and provides distributed processing capabilities.
https://gitee.com/mindspore/mindpandas