Instance

Tasks such as SQLTask are the basic computing units in MaxCompute. When executed, a Task is instantiated as a MaxCompute instance.

Basic operations

You can call list_instances to iterate over all the instances in the project, exist_instance to check whether an instance with a given ID exists, and get_instance to retrieve an instance by its ID.

>>> for instance in o.list_instances():
>>>     print(instance.id)
>>> if o.exist_instance('<my_instance_id>'):
>>>     print("Instance <my_instance_id> exists!")

You can call stop_instance on an odps object to stop an instance, or call the stop method on an instance object.

>>> # Method 1: use stop_instance to stop an instance
>>> o.stop_instance('<my_instance_id>')
>>> # Method 2: use stop method of instance object to stop an instance
>>> instance = o.get_instance('<my_instance_id>')
>>> instance.stop()
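
The two approaches above can be combined into a guarded helper. The following is a minimal sketch, not part of PyODPS: the function name stop_if_running is hypothetical, and `o` is assumed to be an already initialized ODPS entry object. It relies only on exist_instance, get_instance, is_terminated and stop as described in this document.

```python
def stop_if_running(o, instance_id):
    """Stop the instance with the given id if it exists and has not finished.

    `o` is assumed to be an initialized ODPS entry object (hypothetical helper).
    Returns True if a stop request was sent, False otherwise.
    """
    # Guard against unknown ids before fetching the instance.
    if not o.exist_instance(instance_id):
        return False
    instance = o.get_instance(instance_id)
    # A terminated instance has nothing left to stop.
    if instance.is_terminated():
        return False
    instance.stop()
    return True
```

Checking for existence first avoids an error on an unknown ID, at the cost of one extra request.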

Retrieve LogView address

For a SQL task, you can call the get_logview_address method to retrieve the LogView address.

>>> # from an existing instance object
>>> instance = o.run_sql('desc pyodps_iris')
>>> print(instance.get_logview_address())
>>> # from an instance id
>>> instance = o.get_instance('2016042605520945g9k5pvyi2')
>>> print(instance.get_logview_address())

For an XFlow task, you need to enumerate its subtasks and retrieve the LogView address of each one as follows. For more details, see XFlow and models.

>>> instance = o.run_xflow('AppendID', 'algo_public',
                           {'inputTableName': 'input_table', 'outputTableName': 'output_table'})
>>> for sub_inst_name, sub_inst in o.get_xflow_sub_instances(instance).items():
>>>     print('%s: %s' % (sub_inst_name, sub_inst.get_logview_address()))

Instance status

The status of an instance can be Running, Suspended or Terminated, and is available through the status attribute. The is_terminated method returns whether the instance has finished executing. The is_successful method returns whether the instance has finished successfully; it returns False if the instance is still running or if the execution has failed.

>>> instance = o.get_instance('2016042605520945g9k5pvyi2')
>>> instance.status
<Status.TERMINATED: 'Terminated'>
>>> from odps.models import Instance
>>> instance.status == Instance.Status.TERMINATED
True
>>> instance.status.value
'Terminated'
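
Because is_successful returns False both while an instance is still running and after it has failed, the two methods usually need to be checked together. A minimal sketch, assuming only the is_terminated and is_successful methods described above; the helper name describe_state and its return labels are hypothetical:

```python
def describe_state(instance):
    """Return 'unfinished', 'succeeded' or 'failed' for an instance object.

    Hypothetical helper: 'unfinished' covers every state in which
    is_terminated() is False (for example Running or Suspended).
    """
    if not instance.is_terminated():
        return 'unfinished'
    # Only a terminated instance can be told apart as success or failure.
    return 'succeeded' if instance.is_successful() else 'failed'
```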

The wait_for_completion method blocks the calling thread until the instance has finished executing. The wait_for_success method blocks in the same way, but throws an exception if the instance does not finish successfully.
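
One way to use wait_for_success is to catch the failure and return the LogView address for diagnosis. This is a sketch under stated assumptions: the helper name wait_and_report is hypothetical, and the broad `except Exception` is a placeholder because this document does not name the exception type that is thrown.

```python
def wait_and_report(instance):
    """Block until the instance finishes.

    Returns (True, None) on success, or (False, logview_address) on failure.
    Hypothetical helper; narrow the except clause to the library's own
    error type in real code.
    """
    try:
        instance.wait_for_success()
    except Exception:  # assumption: the failure surfaces as a regular exception
        return False, instance.get_logview_address()
    return True, None
```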

Subtask operations

When an instance is running, it may contain one or several subtasks, which are called Tasks. Note that these Tasks are different from the computing units in MaxCompute.

You can call get_task_names to retrieve the names of all Tasks. This method returns the Task names as a list.

>>> instance.get_task_names()
['SQLDropTableTask']

After getting the Task names, you can use get_task_result to retrieve the execution result of a single Task by name. The get_task_results method returns the results of all Tasks as a dict keyed by Task name.

>>> instance = o.execute_sql('select * from pyodps_iris limit 1')
>>> instance.get_task_names()
['AnonymousSQLTask']
>>> instance.get_task_result('AnonymousSQLTask')
'"sepallength","sepalwidth","petallength","petalwidth","name"\n5.1,3.5,1.4,0.2,"Iris-setosa"\n'
>>> instance.get_task_results()
OrderedDict([('AnonymousSQLTask',
           '"sepallength","sepalwidth","petallength","petalwidth","name"\n5.1,3.5,1.4,0.2,"Iris-setosa"\n')])
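
The result string in the example above is CSV text, so it can be split into a header and rows with the standard library. A minimal sketch, assuming the result has the CSV form shown above; parse_task_result is a hypothetical helper name, and all values come back as strings:

```python
import csv
import io


def parse_task_result(text):
    """Split a CSV-formatted task result into (header, rows).

    Hypothetical helper; assumes the first line holds the column names.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    return rows[0], rows[1:]


# Sample taken from the get_task_result output shown above.
sample = ('"sepallength","sepalwidth","petallength","petalwidth","name"\n'
          '5.1,3.5,1.4,0.2,"Iris-setosa"\n')
header, rows = parse_task_result(sample)
print(header)  # ['sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'name']
print(rows)    # [['5.1', '3.5', '1.4', '0.2', 'Iris-setosa']]
```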

You can use get_task_progress to retrieve the running progress of a Task.

>>> import time
>>> while not instance.is_terminated():
>>>     for task_name in instance.get_task_names():
>>>         print(instance.id, instance.get_task_progress(task_name).get_stage_progress_formatted_string())
>>>     time.sleep(10)
20160519101349613gzbzufck2 2016-05-19 18:14:03 M1_Stg1_job0:0/1/1[100%]
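
The loop above runs until the instance terminates. A bounded variant can be sketched as follows; the helper name poll_progress and its interval, timeout and sleep parameters are hypothetical, and only the methods used in the loop above are assumed to exist.

```python
import time


def poll_progress(instance, interval=10, timeout=600, sleep=time.sleep):
    """Collect formatted progress lines until the instance terminates
    or roughly `timeout` seconds of sleeping have elapsed.

    Hypothetical helper; `sleep` is injectable so the loop can be tested
    without real waiting.
    """
    lines = []
    waited = 0
    while not instance.is_terminated() and waited < timeout:
        for task_name in instance.get_task_names():
            progress = instance.get_task_progress(task_name)
            lines.append(progress.get_stage_progress_formatted_string())
        sleep(interval)
        waited += interval
    return lines
```

Returning the lines instead of printing them leaves the choice of logging or display to the caller.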