Schema

Note

Schema is a beta feature of MaxCompute. You need to apply for a trial of the new feature before you can use it. A PyODPS version above 0.11.3 is also required.

A schema is a level between a project and objects such as tables, resources, or functions. It groups these objects into categories.

Basic operations

Use exist_schema() to check whether a schema with the given name exists.

print(o.exist_schema("test_schema"))

Use create_schema() to create a schema object.

schema = o.create_schema("test_schema")
print(schema)

Use delete_schema() to delete a schema object.

o.delete_schema("test_schema")

Use get_schema() to obtain a schema object and print its owner.

schema = o.get_schema("test_schema")
print(schema.owner)

Use list_schema() to list all schemas in a project and print their names.

for schema in o.list_schema():
    print(schema.name)
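
The calls in this section can be combined into a small "get or create" helper. The sketch below is illustrative only: ensure_schema is a hypothetical name, not part of PyODPS, and it assumes that `o` is an ODPS entry object exposing exist_schema(), create_schema() and get_schema() as described above.

```python
def ensure_schema(o, name):
    """Return the schema called `name`, creating it first if it does not exist.

    `o` is assumed to be an ODPS entry object. Only exist_schema(),
    create_schema() and get_schema() are used. This helper is hypothetical
    and is not provided by PyODPS itself.
    """
    if o.exist_schema(name):
        # The schema is already there, so just fetch it.
        return o.get_schema(name)
    # create_schema() returns the new schema object, as in the example above.
    return o.create_schema(name)
```

Calling ensure_schema(o, "test_schema") twice should create the schema at most once. The check and the creation are two separate requests, so this sketch does not guard against another client creating the same schema in between.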

Handling objects in Schema

After schemas are enabled, calls on your MaxCompute entry object only affect objects in the schema named DEFAULT by default. To handle objects in other schemas, you need to provide the name of the schema. For instance,

import os
from odps import ODPS
# Make sure the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID is set to the user's Access Key ID
# and ALIBABA_CLOUD_ACCESS_KEY_SECRET is set to the user's Access Key Secret.
# Hardcoding the Access Key ID or Access Key Secret in your code is not recommended.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='**your-project**',
    endpoint='**your-endpoint**',
    schema='**your-schema-name**',
)

You can also specify names of schemas when handling MaxCompute objects. For instance, the code below lists all tables under the schema test_schema.

for table in o.list_tables(schema='test_schema'):
    print(table)

The code below gets a table named dual under the schema named test_schema and outputs its structure.

table = o.get_table('dual', schema='test_schema')
print(table.table_schema)
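
When several schemas need to be inspected, the schema argument of list_tables() can be applied in a loop. The helper below is a minimal sketch under stated assumptions: table_names_by_schema is a hypothetical name, `o` is assumed to be an ODPS entry object, and each listed table is assumed to expose a name attribute.

```python
def table_names_by_schema(o, schema_names):
    """Map each schema name to the names of the tables it contains.

    `o` is assumed to be an ODPS entry object. This helper is hypothetical
    and is not part of PyODPS.
    """
    result = {}
    for schema_name in schema_names:
        # list_tables(schema=...) restricts the listing to a single schema.
        tables = o.list_tables(schema=schema_name)
        # Assumption: every listed table object has a `name` attribute.
        result[schema_name] = [table.name for table in tables]
    return result
```

For example, table_names_by_schema(o, ["test_schema"]) would return a dictionary with one key, test_schema, whose value is the list of table names in that schema.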

You can also specify the name of the default schema when executing SQL statements.

o.execute_sql("SELECT * FROM dual", default_schema="test_schema")

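For tables, if schemas are not enabled in the project, get_table treats a table name of the form x.y as project.table by default. If the current tenant has turned on the tenant-level syntax switch, get_table treats x.y as schema.table; otherwise it still treats it as project.table. If this option is not configured on the tenant, you can set options.enable_schema = True, after which every name of the form x.y is treated as schema.table: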

from odps import options
options.enable_schema = True
print(o.get_table("myschema.mytable"))
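
The naming rule described above can be pictured with a small pure-Python function. This is only an illustration of the rule, not PyODPS's actual implementation: interpret_table_name and its schema_enabled parameter are hypothetical, and the sketch covers only unqualified names and names of the form x.y.

```python
def interpret_table_name(name, schema_enabled):
    """Split a table name according to the rule described above.

    Hypothetical illustration, not PyODPS code. A value of None means
    "use the default configured on the entry object".
    """
    parts = name.split(".")
    if len(parts) == 1:
        # Unqualified name: project and schema both come from the defaults.
        return {"project": None, "schema": None, "table": parts[0]}
    if len(parts) == 2:
        qualifier, table = parts
        if schema_enabled:
            # With schemas enabled, x.y is read as schema.table.
            return {"project": None, "schema": qualifier, "table": table}
        # Otherwise x.y is read as project.table.
        return {"project": qualifier, "schema": None, "table": table}
    raise ValueError("this sketch only handles 'table' and 'x.y' names")
```

With schema_enabled=True, the name "myschema.mytable" is split into the schema myschema and the table mytable. With schema_enabled=False, the same string is read as the table mytable in the project myschema.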

Note

options.enable_schema is supported from PyODPS 0.12.0 onward. In earlier versions, use options.always_enable_schema instead.