ray.data.Datasource
class ray.data.Datasource
Bases: object

Interface for defining a custom Dataset datasource.

To read a datasource into a dataset, use read_datasource(). To write to a writable datasource, use write_datasource().

See RangeDatasource and DummyOutputDatasource for examples of how to implement readable and writable datasources.

For an example of subclassing Datasource, read Implementing a Custom Datasource.

Note
Datasource instances must be serializable, since create_reader() and write() are called in remote tasks.

PublicAPI: This API is stable across Ray releases.
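The read path described above (a datasource hands back a reader, and the reader splits the work into independent tasks) can be sketched in plain Python. This is only an illustration of the shape, not Ray code: SketchRangeDatasource, SketchRangeReader, and the ReadTask alias are hypothetical stand-ins that do not import Ray, and a real implementation would subclass Datasource and return Ray's own reader and read task objects.

```python
from typing import Callable, Iterator, List

# Stand-in type: NOT a Ray class. A read task here is just a zero-argument
# callable that yields blocks (lists of rows).
ReadTask = Callable[[], Iterator[List[int]]]


class SketchRangeReader:
    """Hypothetical reader that splits range(n) into independent read tasks."""

    def __init__(self, n: int) -> None:
        self._n = n

    def get_read_tasks(self, parallelism: int) -> List[ReadTask]:
        # Split [0, n) into at most `parallelism` contiguous chunks.
        parallelism = max(1, parallelism)
        chunk = max(1, -(-self._n // parallelism))  # ceiling division
        tasks: List[ReadTask] = []
        for start in range(0, self._n, chunk):
            end = min(start + chunk, self._n)

            # Bind start/end as defaults so each task keeps its own range.
            def task(start: int = start, end: int = end) -> Iterator[List[int]]:
                yield list(range(start, end))

            tasks.append(task)
        return tasks


class SketchRangeDatasource:
    """Hypothetical datasource that holds no state, so copying it is trivial."""

    def create_reader(self, n: int) -> SketchRangeReader:
        # Read arguments arrive as keyword arguments,
        # mirroring create_reader(**read_args).
        return SketchRangeReader(n)


reader = SketchRangeDatasource().create_reader(n=10)
blocks = [block for task in reader.get_read_tasks(parallelism=3) for block in task()]
print(blocks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Keeping the datasource free of open handles, locks, or connections is what makes the serialization requirement easy to meet; such resources are better opened inside each task.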
Methods
__init__()

create_reader(**read_args)
    Return a Reader for the given read arguments.

do_write(blocks, metadata, ray_remote_args, ...)
    Launch Ray tasks for writing blocks out to the datasource.

get_name()
    Return a human-readable name for this datasource.

on_write_complete(write_results, **kwargs)
    Callback for when a write job completes.

on_write_failed(write_results, error, **kwargs)
    Callback for when a write job fails.

on_write_start(**write_args)
    Callback for when a write job starts.

prepare_read(parallelism, **read_args)
    Deprecated: Please implement create_reader() instead.

write(blocks, ctx, **write_args)
    Write blocks out to the datasource.
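The write callbacks listed above suggest a lifecycle: a start hook, one write() call per task, then either a completion or a failure hook. The sketch below shows that ordering in plain Python. It is an assumption-laden illustration, not Ray's implementation: SketchListDatasource and run_write are hypothetical names, nothing here imports Ray, and ctx is accepted only to mirror the write(blocks, ctx, **write_args) signature.

```python
from typing import Any, Iterable, List


class SketchListDatasource:
    """Hypothetical writable datasource; NOT a Ray class."""

    def __init__(self) -> None:
        self.events: List[str] = []
        self.committed: List[Any] = []

    def on_write_start(self, **write_args: Any) -> None:
        self.events.append("start")

    def write(self, blocks: Iterable[List[Any]], ctx: Any = None, **write_args: Any) -> int:
        # In Ray, write() runs in remote tasks on a copy of the datasource,
        # so in-place state like self.committed would not reach the driver.
        # It works here only because everything runs in one process.
        rows = [row for block in blocks for row in block]
        if write_args.get("fail"):
            raise RuntimeError("simulated write failure")
        self.committed.extend(rows)
        return len(rows)  # small per-task result handed to the callbacks

    def on_write_complete(self, write_results: List[int], **kwargs: Any) -> None:
        self.events.append(f"complete:{sum(write_results)}")

    def on_write_failed(self, write_results: List[int], error: Exception, **kwargs: Any) -> None:
        self.events.append(f"failed:{type(error).__name__}")


def run_write(ds: SketchListDatasource, tasks: List[List[List[Any]]], **write_args: Any) -> List[int]:
    """Assumed driver order: start, write per task, then complete or failed."""
    ds.on_write_start(**write_args)
    results: List[int] = []
    try:
        for blocks in tasks:
            results.append(ds.write(blocks, None, **write_args))
    except Exception as error:
        ds.on_write_failed(results, error)
        raise
    ds.on_write_complete(results, **write_args)
    return results


ok = SketchListDatasource()
ok_results = run_write(ok, [[[1, 2], [3]], [[4]]])
print(ok.events)  # ['start', 'complete:4']

bad = SketchListDatasource()
try:
    run_write(bad, [[[1]]], fail=True)
except RuntimeError:
    pass
print(bad.events)  # ['start', 'failed:RuntimeError']
```

Returning a small result from each write and aggregating it in the completion hook keeps shared state on the driver side, which fits the requirement that datasource instances be serializable.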