Configuration

PyODPS provides a set of configuration options, which are accessed through odps.options. Here is a simple example:

from odps import options
# configure lifecycle for all output tables (option lifecycle)
options.lifecycle = 30
# handle string type as bytes when downloading with Tunnel (option tunnel.string_as_binary)
options.tunnel.string_as_binary = True
# get more records when sorting the DataFrame with MaxCompute (option df.odps.sort.limit)
options.df.odps.sort.limit = 100000000
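
Because these options are global, a changed value stays in effect until it is reset. The sketch below shows one way to change an option only for the duration of a block. The helper name temporary_option is hypothetical and not part of PyODPS, and the example runs against a stand-in object so that it works without a MaxCompute account; it assumes the real odps.options object supports getattr and setattr at every level of the dotted name.

```python
from contextlib import contextmanager
from types import SimpleNamespace


def _resolve(root, dotted_name):
    """Return (parent_object, attribute_name) for a dotted option name."""
    *parents, leaf = dotted_name.split(".")
    node = root
    for part in parents:
        node = getattr(node, part)
    return node, leaf


@contextmanager
def temporary_option(root, dotted_name, value):
    """Set an option inside a with-block, then restore the previous value."""
    parent, leaf = _resolve(root, dotted_name)
    previous = getattr(parent, leaf)
    setattr(parent, leaf, value)
    try:
        yield root
    finally:
        setattr(parent, leaf, previous)


# Stand-in mirroring two options from the example above. Real code would
# pass odps.options instead (assumption: it allows getattr/setattr access).
demo_options = SimpleNamespace(
    lifecycle=None,
    tunnel=SimpleNamespace(string_as_binary=False),
)

with temporary_option(demo_options, "tunnel.string_as_binary", True):
    inside = demo_options.tunnel.string_as_binary  # value seen in the block
after = demo_options.tunnel.string_as_binary  # value after the block exits
```

Restoring the value in a finally clause means the previous setting comes back even if the code inside the block raises an exception.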

The configurable PyODPS options are listed below:

General configurations

| Option | Description | Default value |
| --- | --- | --- |
| endpoint | MaxCompute endpoint | None |
| default_project | Default project | None |
| log_view_host | LogView host name | None |
| log_view_hours | LogView retention time (hours) | 24 |
| local_timezone | Time zone to use. None means that PyODPS takes no action, True means local time, and False means UTC. A time zone from the pytz package can also be used. | None |
| lifecycle | Lifecycle of all tables | None |
| temp_lifecycle | Lifecycle of temporary tables | 1 |
| biz_id | User ID | None |
| verbose | Whether to print logs | False |
| verbose_log | Log receiver | None |
| chunk_size | Size of write buffer | 1496 |
| retry_times | Number of request retries | 4 |
| pool_connections | Number of cached connections in the connection pool | 10 |
| pool_maxsize | Maximum capacity of the connection pool | 10 |
| connect_timeout | Connection timeout | 5 |
| read_timeout | Read timeout | 120 |
| api_proxy | Proxy address for APIs | None |
| data_proxy | Proxy address for data transfer | None |
| completion_size | Limit on the number of items listed for object completion | 10 |
| display.notebook_widget | Use interactive widgets | True |
| sql.settings | Global hints for MaxCompute SQL | None |
| sql.use_odps2_extension | Enable MaxCompute 2.0 language extension | False |
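
The local_timezone option accepts four kinds of values, which can be easy to mix up. The following sketch spells out the mapping described in the table using only the standard library. The function resolve_timezone is a hypothetical helper written for illustration; it is not how PyODPS itself handles the option.

```python
from datetime import datetime, timezone, tzinfo


def resolve_timezone(setting):
    """Map a local_timezone-style value to a tzinfo object, or None.

    Hypothetical helper for illustration; not PyODPS's own implementation.
    """
    if setting is None:
        return None  # take no action: leave datetimes untouched
    if setting is True:
        # Time zone of the machine running this code
        return datetime.now().astimezone().tzinfo
    if setting is False:
        return timezone.utc
    if isinstance(setting, tzinfo):
        return setting  # for example, a pytz time zone object
    raise TypeError(f"unsupported local_timezone value: {setting!r}")


no_action = resolve_timezone(None)
local_zone = resolve_timezone(True)
utc_zone = resolve_timezone(False)
```

Note that None and False are different settings: None disables time zone handling altogether, while False selects UTC.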

Data upload/download configurations

| Option | Description | Default value |
| --- | --- | --- |
| tunnel.endpoint | Tunnel Endpoint | None |
| tunnel.use_instance_tunnel | Use Instance Tunnel to obtain the execution result | True |
| tunnel.limit_instance_tunnel | Limit the number of results obtained by Instance Tunnel | None |
| tunnel.string_as_binary | Use bytes instead of unicode in the string type | False |
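
To make the effect of tunnel.string_as_binary concrete, the sketch below shows the difference between receiving a string value as text and as raw bytes. The function decode_string_field is hypothetical, and the use of UTF-8 is an assumption made for this example; the actual decoding is done inside PyODPS.

```python
def decode_string_field(raw: bytes, string_as_binary: bool = False):
    """Return a string column value as text, or as raw bytes if requested.

    Hypothetical illustration; UTF-8 is assumed for the text case.
    """
    if string_as_binary:
        return raw  # hand the bytes back untouched
    return raw.decode("utf-8")


payload = "café".encode("utf-8")
as_text = decode_string_field(payload)
as_bytes = decode_string_field(payload, string_as_binary=True)
```

Keeping the raw bytes can be useful when a string column holds binary data that is not valid text.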

DataFrame configurations

| Option | Description | Default value |
| --- | --- | --- |
| interactive | Whether in an interactive environment | Detected automatically |
| df.analyze | Whether to enable non-MaxCompute built-in functions | True |
| df.optimize | Whether to enable DataFrame overall optimization | True |
| df.optimizes.pp | Whether to enable DataFrame predicate pushdown optimization | True |
| df.optimizes.cp | Whether to enable DataFrame column pruning optimization | True |
| df.optimizes.tunnel | Whether to enable DataFrame tunnel optimization | True |
| df.quote | Whether to use backticks (``) to quote field and table names in the MaxCompute SQL backend | True |
| df.libraries | Third-party library (resource name) that is used for DataFrame running | None |
| df.supersede_libraries | Use the numpy package resource to supersede the version provided by MaxCompute | False |
| df.odps.sort.limit | Maximum number of records returned when sorting | 10000 |
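
As a rough illustration of what df.quote and df.odps.sort.limit control, the sketch below builds a SQL string for a sorted query. Both functions and the table and column names are hypothetical, and the generated text is not the exact SQL that PyODPS compiles; it only shows where quoting and the sort limit would appear.

```python
def quote_name(name: str, quote: bool = True) -> str:
    """Wrap a field or table name in backticks when quoting is enabled."""
    return f"`{name}`" if quote else name


def build_sorted_query(table, column, quote=True, sort_limit=10000):
    """Build an illustrative sorted query (not PyODPS's real compiler output)."""
    table_sql = quote_name(table, quote)
    column_sql = quote_name(column, quote)
    return f"SELECT * FROM {table_sql} ORDER BY {column_sql} LIMIT {sort_limit}"


# Defaults mirror the table above: quoting on, sort limit 10000
default_sql = build_sorted_query("pyodps_iris", "sepal_length")
# Quoting off and a larger limit, as in the example at the top of this page
custom_sql = build_sorted_query(
    "pyodps_iris", "sepal_length", quote=False, sort_limit=100000000
)
```

With the default limit, a sort returns at most 10000 records, which is why the example at the top of this page raises df.odps.sort.limit to get more.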

PyODPS ML configurations

| Option | Description | Default value |
| --- | --- | --- |
| ml.xflow_settings | Global settings for Xflow Task | None |
| ml.xflow_project | Default Xflow project name | algo_public |
| ml.use_model_transfer | Whether to use ModelTransfer to obtain the model PMML | False |
| ml.model_volume | Volume name used when ModelTransfer is used | pyodps_volume |