Configuration
PyODPS provides a series of configuration options, which can be accessed through odps.options. Here is a simple example:
```python
from odps import options

# configure lifecycle for all output tables (option lifecycle)
options.lifecycle = 30
# handle string type as bytes when downloading with Tunnel (option tunnel.string_as_binary)
options.tunnel.string_as_binary = True
# fetch more records when sorting a DataFrame on MaxCompute (option df.odps.sort.limit)
options.df.odps.sort.limit = 100000000
```
The following tables list the configurable PyODPS options, each followed by a short usage example:
General configurations
Option | Description | Default value |
---|---|---|
endpoint | MaxCompute Endpoint | None |
default_project | Default project | None |
logview_host | LogView host name | None |
logview_hours | LogView holding time (hours) | 24 |
local_timezone | Time zone to use. None means PyODPS takes no action, True means local time, and False means UTC. A time zone from the pytz package can also be used. | None |
lifecycle | Lifecycle (in days) of all output tables | None |
verify_ssl | Verify SSL certificate of the server end | True |
temp_lifecycle | Lifecycle (in days) of temporary tables | 1 |
biz_id | User ID | None |
verbose | Whether to print logs | False |
verbose_log | Log receiver | None |
chunk_size | Size of the write buffer (bytes) | 1496 |
retry_times | Request retry times | 4 |
pool_connections | Number of cached connections in the connection pool | 10 |
pool_maxsize | Maximum capacity of the connection pool | 10 |
connect_timeout | Connection timeout (seconds) | 120 |
read_timeout | Read timeout (seconds) | 120 |
api_proxy | Proxy address for APIs | None |
data_proxy | Proxy address for data transfer | None |
completion_size | Limit on the number of items listed during object completion | 10 |
display.notebook_widget | Use interactive notebook widgets | True |
sql.settings | Global hints for MaxCompute SQL | None |
sql.use_odps2_extension | Enable MaxCompute 2.0 language extension | None |
sql.always_enable_schema | Enable schema-level support in all scenarios | None |
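As a usage sketch for some of the general options above, the snippet below sets project-level defaults and global SQL hints before any jobs are run. The project name and the hint value are placeholders, not values prescribed by PyODPS:

```python
from odps import options

# defaults applied to subsequently created tables and submitted requests
options.default_project = 'my_project'   # placeholder project name
options.lifecycle = 30                   # default table lifecycle in days
options.connect_timeout = 60             # connection timeout in seconds
options.retry_times = 5                  # retry failed requests a few more times

# global hints added to every MaxCompute SQL statement PyODPS submits
options.sql.settings = {'odps.sql.mapper.split.size': 16}  # example hint only
```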
Data upload/download configurations
Option | Description | Default value |
---|---|---|
tunnel.endpoint | Tunnel Endpoint | None |
tunnel.use_instance_tunnel | Use Instance Tunnel to obtain the execution result | True |
tunnel.limit_instance_tunnel | Limit the number of results obtained by Instance Tunnel | None |
tunnel.string_as_binary | Use bytes instead of unicode for string-type values | False |
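As a hedged illustration of how the tunnel options affect result downloading, the sketch below enables Instance Tunnel and lifts its record limit before reading a query result. The credentials, endpoint, project and table names are placeholders:

```python
from odps import ODPS, options

# read query results through Instance Tunnel rather than the size-limited result API
options.tunnel.use_instance_tunnel = True
options.tunnel.limit_instance_tunnel = False   # do not cap the number of downloaded records

# placeholder credentials, project and endpoint
o = ODPS('<access_id>', '<secret_access_key>', project='my_project',
         endpoint='<maxcompute_endpoint>')

instance = o.execute_sql('select * from my_table')  # placeholder table name
with instance.open_reader() as reader:              # honours the tunnel options above
    for record in reader:
        print(record.values)
```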
DataFrame configurations
Option | Description | Default value |
---|---|---|
interactive | Whether PyODPS is running in an interactive environment | Detected automatically |
df.analyze | Whether to enable non-MaxCompute built-in functions | True |
df.optimize | Whether to enable DataFrame overall optimization | True |
df.optimizes.pp | Whether to enable DataFrame predicate pushdown optimization | True |
df.optimizes.cp | Whether to enable DataFrame column pruning optimization | True |
df.optimizes.tunnel | Whether to enable DataFrame tunnel optimization | True |
df.quote | Whether to use backquotes (``) to quote field and table names in the MaxCompute SQL backend | True |
df.libraries | Third-party libraries (resource names) used when running DataFrames | None |
df.supersede_libraries | Prefer uploaded package resources over the versions provided by MaxCompute | True |
df.odps.sort.limit | Limit on the number of records when sorting is performed | 10000 |
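As a hedged sketch of how these DataFrame options are typically combined, the example below raises the sort limit and attaches an already-uploaded third-party library resource before evaluating a DataFrame. The credentials, table and resource names are placeholders:

```python
from odps import ODPS, options
from odps.df import DataFrame

# allow more records to be fetched when sorting on the MaxCompute backend
options.df.odps.sort.limit = 100000000
# third-party library resources (previously uploaded to MaxCompute) for DataFrame execution
options.df.libraries = ['python_dateutil.whl']   # placeholder resource name

# placeholder credentials, project and endpoint
o = ODPS('<access_id>', '<secret_access_key>', project='my_project',
         endpoint='<maxcompute_endpoint>')

df = DataFrame(o.get_table('my_table'))             # placeholder table name
print(df.sort('col_a', ascending=False).head(10))   # sorting honours df.odps.sort.limit
```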