Data types
- class odps.types.Boolean(*args, **kwargs)[source]
Represents boolean type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.boolean) instead.
- class odps.types.Tinyint(*args, **kwargs)[source]
Represents tinyint type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.tinyint) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
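The extension flag named in the note above is a global PyODPS option. A minimal sketch of enabling it, assuming PyODPS is installed and the setting should apply to the whole process:

```python
from odps import options

# Turn on the extended type support referred to in the notes on this page.
# This is a process-wide setting, so set it before creating tables or
# schemas that use types such as tinyint, smallint, int or float.
options.sql.use_odps2_extension = True
```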
- class odps.types.Smallint(*args, **kwargs)[source]
Represents smallint type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.smallint) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Int(*args, **kwargs)[source]
Represents int type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.int_) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Bigint(*args, **kwargs)[source]
Represents bigint type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.bigint) instead.
- class odps.types.Decimal(*args, **kwargs)[source]
Represents decimal type with size limit in MaxCompute.
- Parameters:
precision (int) – The precision (or total digits) of decimal type.
scale (int) – The decimal scale (or decimal digits) of decimal type.
- Example:
>>> decimal_type = Decimal(18, 6)
>>> print(decimal_type)
decimal(18, 6)
>>> print(decimal_type.precision, decimal_type.scale)
18 6
- Note:
Set options.sql.use_odps2_extension = True to enable full functionality when setting precision or scale.
- precision
Precision (or total digits) of the decimal type.
- scale
Decimal scale (or decimal digits) of the decimal type.
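To make the meaning of precision and scale concrete, the sketch below checks whether a number fits decimal(precision, scale) using only Python's standard decimal module. The helper name fits_decimal is hypothetical and is not part of PyODPS. The rule it applies (at most scale digits after the point and at most precision - scale digits before it) is the usual SQL reading of these parameters, and is an assumption about MaxCompute rather than its documented validation logic.

```python
from decimal import Decimal as PyDecimal


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Return True if `value` fits decimal(precision, scale).

    Hypothetical helper for illustration only. Trailing zeros count as
    digits, so "1.2300000" is treated as having seven fractional digits.
    """
    _sign, digits, exponent = PyDecimal(value).as_tuple()
    if not isinstance(exponent, int):
        # NaN and infinities have non-integer exponents; reject them.
        return False
    fractional_digits = max(-exponent, 0)
    integer_digits = max(len(digits) + exponent, 0)
    return fractional_digits <= scale and integer_digits <= precision - scale


print(fits_decimal("123456.789", 18, 6))
print(fits_decimal("1.2345678", 18, 6))
```

The first call has six integer digits and three fractional digits, so it fits decimal(18, 6). The second has seven fractional digits and does not.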
- class odps.types.Float(*args, **kwargs)[source]
Represents float type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.float_) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Double(*args, **kwargs)[source]
Represents double type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.double) instead.
- class odps.types.Binary(*args, **kwargs)[source]
Represents binary type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.binary) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Char(*args, **kwargs)[source]
Represents char type with size limit in MaxCompute.
- Parameters:
size_limit (int) – The size limit of char type.
- Example:
>>> char_type = Char(65535)
>>> print(char_type)
char(65535)
>>> print(char_type.size_limit)
65535
- Note:
Set options.sql.use_odps2_extension = True to enable full functionality.
- size_limit
Size limit of the char type.
- class odps.types.String(*args, **kwargs)[source]
Represents string type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.string) instead.
- class odps.types.Varchar(*args, **kwargs)[source]
Represents varchar type with size limit in MaxCompute.
- Parameters:
size_limit (int) – The size limit of varchar type.
- Example:
>>> varchar_type = Varchar(65535)
>>> print(varchar_type)
varchar(65535)
>>> print(varchar_type.size_limit)
65535
- Note:
Set options.sql.use_odps2_extension = True to enable full functionality.
- size_limit
Size limit of the varchar type.
- class odps.types.Json(*args, **kwargs)[source]
Represents json type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.json) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Date(*args, **kwargs)[source]
Represents date type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.date) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Datetime(*args, **kwargs)[source]
Represents datetime type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.datetime) instead.
- class odps.types.Timestamp(*args, **kwargs)[source]
Represents timestamp type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.timestamp) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.TimestampNTZ(*args, **kwargs)[source]
Represents timestamp_ntz type in MaxCompute.
- Note:
Do not use this class directly. Use its singleton instance (odps.types.timestamp_ntz) instead.
Set options.sql.use_odps2_extension = True to enable full functionality.
- class odps.types.Array(*args, **kwargs)[source]
Represents array type in MaxCompute.
- Parameters:
value_type – type of elements in the array
- Example:
>>> from odps import types as odps_types
>>>
>>> array_type = odps_types.Array(odps_types.bigint)
>>> print(array_type)
array<bigint>
>>> print(array_type.value_type)
bigint
- Note:
Set options.sql.use_odps2_extension = True to enable full functionality.
- value_type
Type of elements in the array.
- class odps.types.Map(*args, **kwargs)[source]
Represents map type in MaxCompute.
- Parameters:
key_type – type of keys in the map
value_type – type of values in the map
- Example:
>>> from odps import types as odps_types
>>>
>>> map_type = odps_types.Map(odps_types.string, odps_types.Array(odps_types.bigint))
>>> print(map_type)
map<string, array<bigint>>
>>> print(map_type.key_type)
string
>>> print(map_type.value_type)
array<bigint>
- Note:
Set options.sql.use_odps2_extension = True to enable full functionality.
- key_type
Type of keys in the map.
- value_type
Type of values in the map.
- class odps.types.Struct(*args, **kwargs)[source]
Represents struct type in MaxCompute.
- Parameters:
field_types – types of every field. Can be a list of (field_name, field_type) tuples or a dict mapping field names to field types.
- Example:
>>> from odps import types as odps_types
>>>
>>> struct_type = odps_types.Struct([("a", "bigint"), ("b", "array<string>")])
>>> print(struct_type)
struct<`a`:bigint, `b`:array<string>>
>>> print(struct_type.field_types)
OrderedDict([("a", "bigint"), ("b", "array<string>")])
>>> print(struct_type.field_types["b"])
array<string>
- Note:
Set options.sql.use_odps2_extension = True to enable full functionality.
- field_types
Types of fields in the struct, as an OrderedDict.
- Example:
The example below extracts field types of a struct.
import odps.types as odps_types

# obtain field types of the Struct instance
struct_type = odps_types.Struct(
    {"a": odps_types.bigint, "b": odps_types.string}
)
for field_name, field_type in struct_type.field_types.items():
    print("field_name:", field_name, "field_type:", field_type)
- odps.types.validate_data_type(data_type)[source]
Parses a MaxCompute DDL type string into a data type instance.
- Example:
>>> field_type = validate_data_type("array<int>")
>>> print(field_type)
array<int>
>>> print(field_type.value_type)
int
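To illustrate the nesting that such type strings encode, here is a small standard-library sketch that splits an angle-bracket type string into a (name, arguments) tree. The function parse_type is hypothetical and is not how PyODPS implements validate_data_type. It handles only array<...> and map<...> style nesting, not struct field names or parenthesized parameters such as decimal(18, 6).

```python
def parse_type(text):
    """Parse e.g. 'map<string, array<bigint>>' into (name, [child types]).

    Hypothetical illustration only; not the PyODPS parser.
    """
    text = text.strip()
    if "<" not in text:
        # A plain type such as 'bigint' has no nested arguments.
        return (text.lower(), [])
    if not text.endswith(">"):
        raise ValueError("unbalanced type string: %r" % text)

    name, inner = text[:-1].split("<", 1)
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            # Only split on commas that are not inside a nested type.
            parts.append(inner[start:i])
            start = i + 1
    parts.append(inner[start:])
    return (name.strip().lower(), [parse_type(part) for part in parts])


print(parse_type("array<int>"))
print(parse_type("map<string, array<bigint>>"))
```

Tracking the bracket depth is what keeps the comma inside a nested type from being mistaken for the separator between a map's key and value types.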
- class odps.types.Column(name=None, typo=None, comment=None, label=None, nullable=True, generate_expression=None, **kw)[source]
Represents a column in a table schema.
- Parameters:
name (str) – column name
typo (str) – column type. The keyword type is also accepted.
comment (str) – comment of the column, None by default
nullable (bool) – whether the column is nullable, True by default
- Example:
>>> col = Column("col1", "bigint")
>>> print(col.name)
col1
>>> print(col.type)
bigint
- name
Name of the column.
- type
Type of the column.
- nullable
True if the column is nullable.
- class odps.types.Partition(name=None, typo=None, comment=None, label=None, nullable=True, generate_expression=None, **kw)[source]
Represents a partition column in a table schema.
- Parameters:
name (str) – column name
typo (str) – column type. The keyword type is also accepted.
comment (str) – comment of the column, None by default
nullable (bool) – whether the column is nullable, True by default
- Example:
>>> col = Partition("col1", "bigint")
>>> print(col.name)
col1
>>> print(col.type)
bigint
- name
Name of the column.
- type
Type of the column.
- nullable
True if the column is nullable.
- class odps.models.Record(columns=None, schema=None, values=None, max_field_size=None)[source]
A record generally holds the data of a single row in a table. It can be created from a schema, by odps.models.Table.new_record(), or by odps.tunnel.TableUploadSession.new_record().
Hints on getting or setting values of different data types are given elsewhere in the PyODPS documentation.
- Example:
>>> schema = TableSchema.from_lists(['name', 'id'], ['string', 'string'])
>>> record = Record(schema=schema, values=['test', 'test2'])
>>> record[0] = 'test'
>>> record[0]
'test'
>>> record['name']
'test'
>>> record[0:2]
('test', 'test2')
>>> record[0, 1]
('test', 'test2')
>>> record['name', 'id']
>>> for field in record:
...     print(field)
('name', 'test')
('id', 'test2')
>>> len(record)
2
>>> 'name' in record
True
- class odps.models.TableSchema(**kwargs)[source]
A schema holds the column and partition information of an odps.models.Table.
There are two ways to initialize a TableSchema object: provide the columns and partitions directly, or call the class method from_lists. See the examples below:
- Example:
>>> columns = [Column(name='num', type='bigint', comment='the column')]
>>> partitions = [Partition(name='pt', type='string', comment='the partition')]
>>> schema = TableSchema(columns=columns, partitions=partitions)
>>> schema.columns
[<column num, type bigint>, <partition pt, type string>]
>>>
>>> schema = TableSchema.from_lists(['num'], ['bigint'], ['pt'], ['string'])
>>> schema.columns
[<column num, type bigint>, <partition pt, type string>]
- classmethod from_lists(names, types, partition_names=None, partition_types=None)
Create a schema from lists of column names and types.
- Parameters:
names – List of column names.
types – List of column types.
partition_names – List of partition names.
partition_types – List of partition types.
- Example:
>>> schema = TableSchema.from_lists(['id', 'name'], ['bigint', 'string'])
>>> print(schema.columns)
[<column id, type bigint>, <column name, type string>]