Schema

Note

Schema is a beta feature of MaxCompute. You need to apply for a trial of new features before you can use it. A PyODPS version above 0.11.3 is also required.

A schema is a layer between a project and objects such as tables, resources, and functions. It groups these objects into categories.

Basic operations

Use exist_schema to check whether a schema with a given name exists.

print(o.exist_schema("test_schema"))

Use create_schema to create a schema object.

schema = o.create_schema("test_schema")
print(schema)

Use delete_schema to delete a schema object.

o.delete_schema("test_schema")

Use get_schema to obtain a schema object and print its owner.

schema = o.get_schema("test_schema")
print(schema.owner)

Use list_schema to list all schemas in a project and print their names.

for schema in o.list_schema():
    print(schema.name)
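
The calls above can be combined into a small "get or create" helper. The sketch below is only an illustration: the function name ensure_schema is hypothetical and not part of PyODPS, and it assumes that o is an entry object exposing exist_schema, get_schema, and create_schema as described in this section.

```python
def ensure_schema(o, name):
    """Return the schema called `name`, creating it first if it is missing.

    `o` is assumed to be an ODPS entry object, or any object that exposes
    exist_schema, get_schema and create_schema as described above.
    ensure_schema itself is a hypothetical helper, not a PyODPS API.
    """
    if o.exist_schema(name):
        # The schema is already there, so just fetch it.
        return o.get_schema(name)
    # The schema is missing, so create it and return the new object.
    return o.create_schema(name)
```

The existence check and the creation are two separate calls, so this helper is not safe against another client creating the same schema in between.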

Handling objects in Schema

After schemas are enabled, calls on your MaxCompute entry object only affect objects in the schema named DEFAULT unless you specify otherwise. To handle objects in other schemas, you need to provide the name of the schema. For instance,

import os
from odps import ODPS
# Make sure the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID is set to the user's Access Key ID
# and ALIBABA_CLOUD_ACCESS_KEY_SECRET is set to the user's Access Key Secret.
# Hardcoding the Access Key ID or Access Key Secret in your code is not recommended.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='**your-project**',
    endpoint='**your-endpoint**',
    schema='**your-schema-name**',
)

You can also specify names of schemas when handling MaxCompute objects. For instance, the code below lists all tables under the schema test_schema.

for table in o.list_tables(schema='test_schema'):
    print(table)
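
The same schema argument can be used to collect table names from several schemas in one pass. This is a minimal sketch under stated assumptions: table_names_by_schema is a hypothetical helper, and it relies only on list_tables accepting a schema keyword and on each returned table having a name attribute.

```python
def table_names_by_schema(o, schema_names):
    """Map each schema name to the list of table names found in it.

    `o` is assumed to be an ODPS entry object whose list_tables method
    accepts a `schema` keyword, as in the example above.
    table_names_by_schema is a hypothetical helper, not a PyODPS API.
    """
    result = {}
    for schema_name in schema_names:
        # list_tables yields table objects; keep only their names.
        result[schema_name] = [t.name for t in o.list_tables(schema=schema_name)]
    return result
```

For example, table_names_by_schema(o, ["DEFAULT", "test_schema"]) would return a dictionary with one list of table names per schema.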

You can also specify the name of the default schema when executing SQL statements.

o.execute_sql("SELECT * FROM dual", default_schema="test_schema")

For tables, how get_table interprets a name of the form x.y depends on whether schemas are enabled. If schemas are not enabled in the project, get_table treats x.y as project.table. If odps.namespace.schema is enabled for the current tenant, get_table treats x.y as schema.table; otherwise it still treats it as project.table. If that option is not configured, you can set options.always_enable_schema = True in your Python code, after which all table names of the form x.y are treated as schema.table.

from odps import options
options.always_enable_schema = True
print(o.get_table("myschema.mytable"))
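
The interpretation rule for x.y can be restated as a few lines of plain Python. This is purely illustrative and does not call or reproduce PyODPS internals: interpret_dotted_name is a hypothetical function, and its schema_enabled flag stands in for "schemas are enabled for the tenant, or options.always_enable_schema is True".

```python
def interpret_dotted_name(name, schema_enabled):
    """Label the two parts of a name like ``x.y`` following the rule above.

    Hypothetical illustration only; this is not how PyODPS is implemented.
    `schema_enabled` stands in for the tenant-level setting or
    options.always_enable_schema being True.
    """
    parts = name.split(".")
    if len(parts) != 2:
        raise ValueError("expected a name of the form x.y, got %r" % name)
    first, table = parts
    # With schemas enabled the first part names a schema,
    # otherwise it names a project.
    qualifier = "schema" if schema_enabled else "project"
    return {qualifier: first, "table": table}
```

With schema_enabled=True, "myschema.mytable" is read as schema myschema and table mytable; with schema_enabled=False the first part is read as a project name instead.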