Instance
Tasks such as SQLTask are the basic computing units in MaxCompute. When executed, a Task is instantiated as a MaxCompute instance.
Basic operations
You can call list_instances to retrieve all the instances in the project. You can use exist_instance to determine whether an instance exists, and get_instance to retrieve an instance by its ID.
>>> for instance in o.list_instances():
>>>     print(instance.id)
>>> if o.exist_instance('<my_instance_id>'):
>>>     print("Instance <my_instance_id> exists!")
You can call stop_instance on an odps object to stop an instance, or call the stop method on an instance object.
>>> # Method 1: use stop_instance to stop an instance
>>> o.stop_instance('<my_instance_id>')
>>> # Method 2: use stop method of instance object to stop an instance
>>> instance = o.get_instance('<my_instance_id>')
>>> instance.stop()
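The two calls above can be combined into a small guard that avoids sending a stop request for an ID the project does not contain. The following is a minimal sketch, not part of PyODPS: the helper name stop_if_exists is hypothetical, and o is assumed to be an ODPS entry object that provides the exist_instance and stop_instance methods described above.

```python
def stop_if_exists(o, instance_id):
    """Stop the instance with the given ID if the project contains it.

    `o` is assumed to be an ODPS entry object exposing exist_instance and
    stop_instance. Returns True when a stop request was sent, and False
    when no instance with that ID exists.
    """
    if not o.exist_instance(instance_id):
        return False
    o.stop_instance(instance_id)
    return True
```

With a configured entry object, stop_if_exists(o, '<my_instance_id>') returns False instead of issuing a request when the ID is unknown.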
Retrieve LogView address
For a SQL task, you can call the get_logview_address method to retrieve the LogView address.
>>> # from an existing instance object
>>> instance = o.run_sql('desc pyodps_iris')
>>> print(instance.get_logview_address())
>>> # from an instance id
>>> instance = o.get_instance('2016042605520945g9k5pvyi2')
>>> print(instance.get_logview_address())
For an XFlow task, you need to enumerate its subtasks and retrieve the LogView address of each one, as follows. For more details, see XFlow and models.
>>> instance = o.run_xflow('AppendID', 'algo_public',
                           {'inputTableName': 'input_table', 'outputTableName': 'output_table'})
>>> for sub_inst_name, sub_inst in o.get_xflow_sub_instances(instance).items():
>>>     print('%s: %s' % (sub_inst_name, sub_inst.get_logview_address()))
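If the addresses are needed later (for example, to write them to a log), the same loop can build a dict instead of printing. This is a sketch only: the helper name xflow_logview_addresses is hypothetical, and it relies solely on the get_xflow_sub_instances and get_logview_address calls shown above.

```python
def xflow_logview_addresses(o, instance):
    """Map each XFlow sub-instance name to its LogView address.

    `o` is assumed to be an ODPS entry object and `instance` the instance
    returned by run_xflow, as in the example above.
    """
    sub_instances = o.get_xflow_sub_instances(instance)
    return {
        name: sub_inst.get_logview_address()
        for name, sub_inst in sub_instances.items()
    }
```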
Instance status
The status of an instance can be Running, Suspended or Terminated. You can retrieve the status of an instance by using the status attribute. The is_terminated method returns whether the execution of the current instance has been completed. The is_successful method returns whether the execution of the current instance has been successful. It returns False if the instance is still running or if the execution has failed.
>>> instance = o.get_instance('2016042605520945g9k5pvyi2')
>>> instance.status
<Status.TERMINATED: 'Terminated'>
>>> from odps.models import Instance
>>> instance.status == Instance.Status.TERMINATED
True
>>> instance.status.value
'Terminated'
The wait_for_completion method blocks the calling thread until the execution of the current instance has been completed. The wait_for_success method also blocks until the execution has been completed, and throws an exception if the execution was not successful.
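When a caller wants to bound the waiting time itself, the is_terminated and is_successful methods described above can be polled in a loop. The following is a minimal sketch under that assumption. The helper name wait_with_timeout and its timeout and interval parameters are hypothetical and not part of PyODPS.

```python
import time


def wait_with_timeout(instance, timeout=60.0, interval=1.0):
    """Poll an instance until it terminates or `timeout` seconds elapse.

    Returns "successful" or "failed" once the instance has terminated,
    or "timeout" if it is still running when the deadline passes.
    The instance itself is not stopped on timeout.
    """
    deadline = time.monotonic() + timeout
    while not instance.is_terminated():
        if time.monotonic() >= deadline:
            return "timeout"
        time.sleep(interval)
    # is_successful returns False for a failed instance, so after
    # termination the two outcomes can be told apart here.
    return "successful" if instance.is_successful() else "failed"
```

A caller that receives "timeout" can decide whether to keep waiting or to call stop on the instance.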
Subtask operations
When an instance is running, it may contain one or several subtasks, which are called Tasks. Note that these Tasks are different from the computing units in MaxCompute.
You can call get_task_names to retrieve the names of all Tasks. This method returns the Task names as a list.
>>> instance.get_task_names()
['SQLDropTableTask']
After getting the Task names, you can use get_task_result to retrieve the execution result of a single Task by name. The get_task_results method returns the results of all Tasks as a dict keyed by Task name.
>>> instance = o.execute_sql('select * from pyodps_iris limit 1')
>>> instance.get_task_names()
['AnonymousSQLTask']
>>> instance.get_task_result('AnonymousSQLTask')
'"sepallength","sepalwidth","petallength","petalwidth","name"\n5.1,3.5,1.4,0.2,"Iris-setosa"\n'
>>> instance.get_task_results()
OrderedDict([('AnonymousSQLTask',
'"sepallength","sepalwidth","petallength","petalwidth","name"\n5.1,3.5,1.4,0.2,"Iris-setosa"\n')])
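In the example above, the result of the SQL task is a CSV-formatted string with a header row. Assuming the result has that shape, it can be parsed into rows with the standard csv module. This is a sketch only: the helper name parse_task_result is hypothetical, and all values remain strings because no type conversion is attempted.

```python
import csv
import io


def parse_task_result(result_text):
    """Parse CSV text (header row first) into a list of dicts.

    Assumes the text has the shape shown in the example above: a quoted
    header line followed by one line per record. Values stay strings.
    """
    reader = csv.DictReader(io.StringIO(result_text))
    return [dict(row) for row in reader]
```

For example, parse_task_result(instance.get_task_result('AnonymousSQLTask')) yields one dict per record, keyed by column name.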
You can use get_task_progress to retrieve the running progress of a Task.
>>> import time
>>> while not instance.is_terminated():
>>>     for task_name in instance.get_task_names():
>>>         print(instance.id, instance.get_task_progress(task_name).get_stage_progress_formatted_string())
>>>     time.sleep(10)
20160519101349613gzbzufck2 2016-05-19 18:14:03 M1_Stg1_job0:0/1/1[100%]