PyODPS: ODPS Python SDK and data analysis framework
PyODPS is the Python SDK of MaxCompute. It supports basic actions on MaxCompute objects and the DataFrame framework for ease of data analysis on MaxCompute.
Installation
PyODPS supports Python 2.6 and later versions (including Python 3). Once pip is installed on your system, you only need to run:
pip install pyodps
The dependencies of PyODPS are installed automatically.
Note that for Linux and macOS users, installing Cython before installing PyODPS can accelerate data upload and download via MaxCompute Tunnel.
Windows users with appropriate versions of Visual C++ and Cython can also benefit from this acceleration.
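To see which of the packages mentioned above are present in the current environment, a small check like the following may help. It is only a sketch that uses the standard library on Python 3; the helper name `is_installed` is an illustrative assumption and not part of PyODPS.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "odps" is the import name of PyODPS; Cython is the optional accelerator.
for name in ("odps", "Cython"):
    print(name, "installed" if is_installed(name) else "missing")
```

If Cython is reported as missing, PyODPS still works; only the Tunnel acceleration described above is unavailable.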
Quick start
First, use an Alibaba Cloud account to initialize a MaxCompute (ODPS) entrance object, as shown in the following code. Replace the parameters of the `ODPS`
function with your own account and project information, and remove the asterisks.
from odps import ODPS
o = ODPS('**your-access-id**', '**your-secret-access-key**', '**your-default-project**',
         endpoint='**your-end-point**')
After initialization is complete, you can operate on tables, resources, and functions.
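Hard-coding credentials is convenient for a first try. One possible alternative is to read them from environment variables, sketched below. The variable names (`ODPS_ACCESS_ID` and so on) and both helper functions are hypothetical choices for this example, not something PyODPS defines; the keyword names passed to `ODPS` are assumed to match the parameters used in the snippet above.

```python
import os

# Hypothetical variable names chosen for this example; PyODPS does not define them.
ENV_TO_ARG = {
    "ODPS_ACCESS_ID": "access_id",
    "ODPS_SECRET_ACCESS_KEY": "secret_access_key",
    "ODPS_PROJECT": "project",
    "ODPS_ENDPOINT": "endpoint",
}


def odps_kwargs_from_env(environ=None):
    """Collect the ODPS constructor arguments from environment variables."""
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_TO_ARG if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(sorted(missing)))
    return {arg: environ[name] for name, arg in ENV_TO_ARG.items()}


def connect_from_env(environ=None):
    """Build the entrance object; odps is imported lazily, only when needed."""
    from odps import ODPS

    return ODPS(**odps_kwargs_from_env(environ))
```

With the four variables exported in the shell, `o = connect_from_env()` would then take the place of the explicit call shown earlier.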
If you need to access MaxCompute with an STS token, use the code below to create a MaxCompute (ODPS) entrance object.
from odps import ODPS
from odps.accounts import StsAccount
account = StsAccount('**your-access-id**', '**your-secret-access-key**', '**your-sts-token**')
o = ODPS(account=account, project='**your-default-project**', endpoint='**your-end-point**')
We provide elementary functions for major MaxCompute objects, including `list`, `get`, `exist`, `create` and `delete`.
We will elaborate on every object in the following chapters. Unless otherwise noted, the variable `o` represents the MaxCompute (ODPS) entrance object.
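As a rough illustration of how the elementary functions combine, the sketch below applies them to tables through the methods `exist_table`, `create_table`, `get_table`, `delete_table` and `list_tables` of the entrance object. The three helper functions are hypothetical wrappers written for this example, and the table name and schema in the usage note are placeholders.

```python
def ensure_table(o, name, schema):
    """Create the table only if it does not exist yet, then return it."""
    if not o.exist_table(name):
        o.create_table(name, schema)
    return o.get_table(name)


def drop_table(o, name):
    """Delete the table if it exists; return True when something was deleted."""
    if o.exist_table(name):
        o.delete_table(name)
        return True
    return False


def table_names(o):
    """Return the names of the tables listed in the default project."""
    return [table.name for table in o.list_tables()]
```

For instance, `ensure_table(o, 'pyodps_demo', 'id bigint, name string')` would create a two-column table on the first call and simply fetch it on later calls.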