PyODPS: ODPS Python SDK and data analysis framework

PyODPS is the Python SDK for MaxCompute. It supports basic operations on MaxCompute objects and provides a DataFrame framework for data analysis on MaxCompute.

Installation

PyODPS supports Python 2.7 and later, including Python 3. Once pip is installed on your system, run:

pip install pyodps

pip installs the dependencies of PyODPS automatically.
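
To confirm the installation, you can check that the `odps` package is importable. The following is a minimal sketch that assumes Python 3; the helper name `is_installed` is hypothetical and not part of PyODPS.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# PyODPS is imported under the name `odps`, as in `from odps import ODPS`.
print("PyODPS installed:", is_installed("odps"))
```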

Quick start

You can use the Access Key ID and Access Key Secret of an Alibaba Cloud account to initialize a MaxCompute (ODPS) entrance object, as shown in the following code. Replace the parameters of the `ODPS` function with your own account and project information, and remove the asterisks.

import os
from odps import ODPS
# Make sure the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID is set to your Access Key ID
# and the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET is set to your Access Key Secret.
# Hardcoding the Access Key ID or Access Key Secret in your code is not recommended.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='**your-project**',
    endpoint='**your-endpoint**',
)
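
Because `os.getenv` returns `None` when a variable is unset, a missing credential would be passed to `ODPS` as `None`. It can help to fail early with a clear message instead. The following is a minimal sketch; the helper name `require_env` is hypothetical and not part of PyODPS.

```python
import os


def require_env(*names):
    """Return the values of the named environment variables, in order.

    Raises RuntimeError naming every variable that is unset or empty.
    """
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return [os.environ[name] for name in names]


# Hypothetical usage, replacing the two os.getenv calls above:
# access_id, access_key = require_env(
#     'ALIBABA_CLOUD_ACCESS_KEY_ID',
#     'ALIBABA_CLOUD_ACCESS_KEY_SECRET',
# )
```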

After initialization, you can operate on tables, resources, and functions.

If you need to access MaxCompute with an STS token, you can use the code below to create a MaxCompute (ODPS) entrance object.

import os
from odps import ODPS
from odps.accounts import StsAccount
# Make sure the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID is set to the acquired Access Key ID,
# the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET is set to the acquired Access Key Secret,
# and the environment variable ALIBABA_CLOUD_STS_TOKEN is set to the acquired STS token.
# Hardcoding the Access Key ID or Access Key Secret in your code is not recommended.
account = StsAccount(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    os.getenv('ALIBABA_CLOUD_STS_TOKEN'),
)
o = ODPS(
    account=account,
    project='**your-default-project**',
    endpoint='**your-endpoint**',
)

PyODPS provides basic operations for the major MaxCompute objects: list, get, exist, create, and delete.
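
As an illustration, the five operations can be sketched for tables as follows. This assumes `o` is an initialized entrance object whose account may create and drop tables in the default project. The helper name `table_roundtrip`, the table name, and the schema are hypothetical. The methods called on `o` are the table methods of the PyODPS entrance object.

```python
def table_roundtrip(o, name, schema="num bigint, num2 double"):
    """Exercise create, exist, get, list, and delete on one table.

    `o` is the MaxCompute (ODPS) entrance object. Returns a dict
    summarizing what each step observed.
    """
    # create: the schema is given here as a "column type, ..." string
    o.create_table(name, schema, if_not_exists=True)

    # exist: the table should now be present
    existed = o.exist_table(name)

    # get: fetch the table object by name
    table = o.get_table(name)

    # list: iterate over the tables of the default project
    # (this can be slow in projects with many tables)
    listed = any(t.name == name for t in o.list_tables())

    # delete: drop the table again
    o.delete_table(name, if_exists=True)

    return {
        "existed": existed,
        "name": table.name,
        "listed": listed,
        "exists_after_delete": o.exist_table(name),
    }
```

A call such as `table_roundtrip(o, 'pyodps_demo_table')` creates and then drops the table, so it should only be run with a name that is not already in use.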

The following chapters describe each object in detail. Unless otherwise noted, the variable `o` represents the MaxCompute (ODPS) entrance object.